TransFGU: A Top-down Approach to Fine-Grained Unsupervised Semantic Segmentation
Zhaoyun Yin, Pichao Wang, Fan Wang, Xianzhe Xu, Hanling Zhang, Hao Li, Rong Jin
[Preprint]
Getting Started
Create the environment
# create conda env
conda create -n TransFGU python=3.8
# activate conda env
conda activate TransFGU
# install pytorch
conda install pytorch=1.8 torchvision cudatoolkit=10.1
# install other dependencies
pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cu101/torch1.8.0/index.html
pip install -r requirements.txt
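After installation, it can help to confirm which versions actually ended up in the environment. The following is a minimal sketch, not part of the repository: the helper name `installed_versions` is hypothetical, and it assumes the packages register themselves under the distribution names `torch`, `torchvision` and `mmcv-full`.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Distribution names are assumptions; adjust them if your install differs.
    for name, version in installed_versions(["torch", "torchvision", "mmcv-full"]).items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

Run it inside the activated `TransFGU` environment; a `NOT INSTALLED` entry usually means the corresponding install command above failed or ran in a different environment.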
Dataset Preparation
- MS-COCO Dataset: Download the training set, validation set, annotations and the json files, then place the extracted files into `root/data/MSCOCO`.
- PascalVOC Dataset: Download the training/validation data, then place the extracted files into `root/data/PascalVOC`.
- Cityscapes Dataset: Download leftImg8bit_trainvaltest.zip and gtFine_trainvaltest.zip, then place the extracted files into `root/data/Cityscapes`.
- LIP Dataset: Download TrainVal_images.zip and TrainVal_parsing_annotations.zip, then place the extracted files into `root/data/LIP`.
The dataset folders should be structured as follows:
data/
├── MSCOCO/
│   ├── images/
│   │   ├── train2017/
│   │   └── val2017/
│   └── annotations/
│       ├── train2017/
│       ├── val2017/
│       ├── instances_train2017.json
│       └── instances_val2017.json
├── Cityscapes/
│   ├── leftImg8bit/
│   │   ├── train/
│   │   │   ├── aachen
│   │   │   └── ...
│   │   └── val/
│   │       ├── frankfurt
│   │       └── ...
│   └── gtFine/
│       ├── train/
│       │   ├── aachen
│       │   └── ...
│       └── val/
│           ├── frankfurt
│           └── ...
├── PascalVOC/
│   ├── JPEGImages/
│   ├── SegmentationClass/
│   └── ImageSets/
│       └── Segmentation/
│           ├── train.txt
│           └── val.txt
└── LIP/
    ├── train_images/
    ├── train_segmentations/
    ├── val_images/
    ├── val_segmentations/
    ├── train_id.txt
    └── val_id.txt
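Before training, the layout above can be checked with a short script. This is a sketch rather than a tool shipped with the repository: `EXPECTED` and `missing_paths` are hypothetical names, the relative paths are copied from the tree above, and the nesting of the MSCOCO `annotations/` entries is read from that tree.

```python
from pathlib import Path

# Expected entries per dataset, relative to data/<dataset>/ (taken from the tree above).
EXPECTED = {
    "MSCOCO": [
        "images/train2017",
        "images/val2017",
        "annotations/train2017",
        "annotations/val2017",
        "annotations/instances_train2017.json",
        "annotations/instances_val2017.json",
    ],
    "Cityscapes": [
        "leftImg8bit/train",
        "leftImg8bit/val",
        "gtFine/train",
        "gtFine/val",
    ],
    "PascalVOC": [
        "JPEGImages",
        "SegmentationClass",
        "ImageSets/Segmentation/train.txt",
        "ImageSets/Segmentation/val.txt",
    ],
    "LIP": [
        "train_images",
        "train_segmentations",
        "val_images",
        "val_segmentations",
        "train_id.txt",
        "val_id.txt",
    ],
}


def missing_paths(data_root, dataset):
    """Return the expected entries that do not exist under data_root/dataset."""
    root = Path(data_root) / dataset
    return [rel for rel in EXPECTED[dataset] if not (root / rel).exists()]


if __name__ == "__main__":
    # "data" assumes the script is run from the repository root.
    for name in EXPECTED:
        missing = missing_paths("data", name)
        print(f"{name}: {'ok' if not missing else 'missing ' + ', '.join(missing)}")
```

Only the datasets you intend to use need to pass; the check does not validate file contents, only that the folders and list files are present.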
Model download
- Download the pretrained DINO model (DeiT small 8x8), then place it into `root/weight/dino/`.
- Download the trained models from Google Drive or Baidu Netdisk (code: 1118), then place them into `root/weight/trained/`.
Name | mIoU | Pixel Accuracy | Model |
---|---|---|---|
COCOStuff-27 | 16.19 | 44.52 | Google Drive |
COCOStuff-171 | 11.93 | 34.32 | Google Drive |
COCO-80 | 12.69 | 64.31 | Google Drive |
Cityscapes | 16.83 | 77.92 | Google Drive |
Pascal-VOC | 37.15 | 83.59 | Google Drive |
LIP-5 | 25.16 | 65.76 | Google Drive |
LIP-16 | 15.49 | 60.08 | Google Drive |
LIP-19 | 12.24 | 42.52 | Google Drive |
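For readers unfamiliar with the two metrics in the table, the sketch below shows their standard definitions computed from a confusion matrix. It is illustrative only and is not the repository's evaluation code: the function names are hypothetical, it assumes predicted cluster IDs have already been mapped to ground-truth class IDs, and it returns fractions, whereas the table values appear to be percentages.

```python
def confusion_matrix(target, pred, num_classes):
    """Count pixels: rows are ground-truth classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(target, pred):
        matrix[t][p] += 1
    return matrix


def pixel_accuracy(matrix):
    """Fraction of pixels whose predicted class equals the ground-truth class."""
    correct = sum(matrix[c][c] for c in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def mean_iou(matrix):
    """Average of per-class intersection-over-union, skipping absent classes."""
    n = len(matrix)
    ious = []
    for c in range(n):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp
        fp = sum(matrix[r][c] for r in range(n)) - tp
        union = tp + fp + fn
        if union > 0:  # class appears in the target or the prediction
            ious.append(tp / union)
    return sum(ious) / len(ious)


# Toy example with six flattened pixels and three classes.
target = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
m = confusion_matrix(target, pred, num_classes=3)
print(pixel_accuracy(m))  # 4 of 6 pixels are correct
print(mean_iou(m))        # per-class IoUs are 1/3, 2/3 and 1/2
```

The gap between the two columns in the table is expected: pixel accuracy is dominated by large, frequent classes, while mIoU weights every class equally.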
Train and Evaluate Our Method
To train and evaluate our method on different datasets at the desired granularity level, please follow the instructions here.
Citation
If you find our work useful in your research, please consider citing:
@article{yin2021transfgu,
title={TransFGU: A Top-down Approach to Fine-Grained Unsupervised Semantic Segmentation},
author={Yin, Zhaoyun and Wang, Pichao and Wang, Fan and Xu, Xianzhe and Zhang, Hanling and Li, Hao and Jin, Rong},
journal={arXiv preprint arXiv:2112.01515},
year={2021}
}
LICENSE
The code is released under the MIT license.
Copyright
Copyright (C) 2010-2021 Alibaba Group Holding Limited.