Learning to Adapt Structured Output Space for Semantic Segmentation, CVPR 2018 (spotlight)

Overview

PyTorch implementation of our method for adapting semantic segmentation from a synthetic dataset (source domain) to a real dataset (target domain). Based on this implementation, our result ranked 3rd in the VisDA Challenge.

Contact: Yi-Hsuan Tsai (wasidennis at gmail dot com) and Wei-Chih Hung (whung8 at ucmerced dot edu)

Paper

Learning to Adapt Structured Output Space for Semantic Segmentation
Yi-Hsuan Tsai*, Wei-Chih Hung*, Samuel Schulter, Kihyuk Sohn, Ming-Hsuan Yang and Manmohan Chandraker
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018 (spotlight) (* indicates equal contribution).

Please cite our paper if you find it useful for your research.

@inproceedings{Tsai_adaptseg_2018,
  author = {Y.-H. Tsai and W.-C. Hung and S. Schulter and K. Sohn and M.-H. Yang and M. Chandraker},
  booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  title = {Learning to Adapt Structured Output Space for Semantic Segmentation},
  year = {2018}
}

Example Results

Quantitative Results

Installation

  • Install PyTorch from http://pytorch.org with Python 2 and CUDA 8.0

  • NEW Add the LS-GAN objective to improve the performance

    • Usage: add --gan LS option during training (see below for more details)
  • PyTorch 0.4 with Python 3 and CUDA 8.0

    • Usage: replace the training and evaluation code with the versions in the pytorch_0.4 folder
    • Update: TensorBoard logging is enabled by adding --tensorboard to the command
    • Note: the single-level model works as expected, while the multi-level model requires smaller weights, e.g., --lambda-adv-target1 0.00005 --lambda-adv-target2 0.0005. We will investigate this issue soon.
  • Clone this repo

git clone https://github.com/wasidennis/AdaptSegNet
cd AdaptSegNet

Dataset

  • Download the GTA5 Dataset as the source domain, and put it in the data/GTA5 folder

  • Download the Cityscapes Dataset as the target domain, and put it in the data/Cityscapes folder

Pre-trained Models

  • Our pre-trained ResNet-101 models for three benchmark settings are available here

  • They include baselines (without adaptation and with feature adaptation) and our models (single-level and multi-level)

Testing

  • NEW Update results using LS-GAN and using Synscapes as the source domain

  • Download the pre-trained multi-level GTA5-to-Cityscapes model and put it in the model folder

  • Test the model; the results will be saved in the result folder

python evaluate_cityscapes.py --restore-from ./model/GTA2Cityscapes_multi-ed35151c.pth
python evaluate_cityscapes.py --model DeeplabVGG --restore-from ./model/GTA2Cityscapes_vgg-ac4ac9f6.pth
python compute_iou.py ./data/Cityscapes/data/gtFine/val result/cityscapes
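
For readers unfamiliar with the metric, the sketch below shows the standard per-class intersection-over-union and its mean (mIoU) in plain Python. It is an illustration only, not the repository's compute_iou.py: the helper names (build_confusion, per_class_iou, mean_iou) and the ignore value of 255 are assumptions chosen for this example.

```python
import math

IGNORE_LABEL = 255  # assumed "ignore" value for this illustration


def build_confusion(gt, pred, num_classes, ignore=IGNORE_LABEL):
    """Count (ground truth, prediction) pairs, skipping ignored pixels."""
    confusion = [[0] * num_classes for _ in range(num_classes)]
    for g, p in zip(gt, pred):
        if g == ignore:
            continue
        confusion[g][p] += 1
    return confusion


def per_class_iou(confusion):
    """IoU per class: true positives / (ground truth + predicted - true positives)."""
    n = len(confusion)
    ious = []
    for c in range(n):
        tp = confusion[c][c]
        gt_count = sum(confusion[c])
        pred_count = sum(confusion[r][c] for r in range(n))
        union = gt_count + pred_count - tp
        ious.append(tp / union if union else math.nan)
    return ious


def mean_iou(ious):
    """Average over classes that occur at least once (NaN entries are skipped)."""
    valid = [v for v in ious if not math.isnan(v)]
    return sum(valid) / len(valid)


# Toy example: flattened label maps with two classes and one ignored pixel.
gt = [0, 0, 1, 1, 255]
pred = [0, 1, 1, 1, 0]
ious = per_class_iou(build_confusion(gt, pred, num_classes=2))
miou = mean_iou(ious)
print(ious, miou)
```

In the toy example, class 0 has one correct pixel out of a union of two, and class 1 has two correct pixels out of a union of three.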

Training Examples

  • NEW Train the GTA5-to-Cityscapes model (single-level with LS-GAN)
python train_gta2cityscapes_multi.py --snapshot-dir ./snapshots/GTA2Cityscapes_single_lsgan \
                                     --lambda-seg 0.0 \
                                     --lambda-adv-target1 0.0 --lambda-adv-target2 0.01 \
                                     --gan LS
  • Train the GTA5-to-Cityscapes model (multi-level)
python train_gta2cityscapes_multi.py --snapshot-dir ./snapshots/GTA2Cityscapes_multi \
                                     --lambda-seg 0.1 \
                                     --lambda-adv-target1 0.0002 --lambda-adv-target2 0.001
  • Train the GTA5-to-Cityscapes model (single-level)
python train_gta2cityscapes_multi.py --snapshot-dir ./snapshots/GTA2Cityscapes_single \
                                     --lambda-seg 0.0 \
                                     --lambda-adv-target1 0.0 --lambda-adv-target2 0.001
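
The sketch below illustrates how weights like these are typically combined, under the assumption that the training objective is a weighted sum of two segmentation losses and two adversarial losses, with --lambda-seg scaling the auxiliary segmentation term. It also contrasts a vanilla (cross-entropy) adversarial loss with the least-squares form that an option such as --gan LS would select. All function names and loss values here are hypothetical; consult train_gta2cityscapes_multi.py for the actual computation.

```python
import math


def adversarial_loss(scores, target, gan="Vanilla"):
    """Mean loss that pushes raw discriminator scores toward `target` (0.0 or 1.0)."""
    if gan == "LS":
        # Least-squares GAN: squared error on the raw scores.
        return sum((s - target) ** 2 for s in scores) / len(scores)

    def bce_with_logits(s):
        # Numerically stable binary cross-entropy on sigmoid(s).
        return max(s, 0.0) - s * target + math.log1p(math.exp(-abs(s)))

    return sum(bce_with_logits(s) for s in scores) / len(scores)


def generator_objective(seg1, seg2, adv1, adv2,
                        lambda_seg, lambda_adv1, lambda_adv2):
    """Assumed weighted sum; index 2 is the final output, index 1 the auxiliary one."""
    return seg2 + lambda_seg * seg1 + lambda_adv1 * adv1 + lambda_adv2 * adv2


# Hypothetical discriminator scores on target-domain predictions.
scores = [0.0, 2.0]
vanilla = adversarial_loss(scores, target=1.0)
least_squares = adversarial_loss(scores, target=1.0, gan="LS")

# Hypothetical loss values, combined with the weights from the commands above.
single_level = generator_objective(0.7, 0.5, 0.3, 0.4,
                                   lambda_seg=0.0, lambda_adv1=0.0, lambda_adv2=0.001)
multi_level = generator_objective(0.7, 0.5, 0.3, 0.4,
                                  lambda_seg=0.1, lambda_adv1=0.0002, lambda_adv2=0.001)
print(vanilla, least_squares, single_level, multi_level)
```

With the first-level weights set to 0.0, only the final-output terms contribute, which is how the single-level commands differ from the multi-level one in this sketch.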

Related Implementation and Dataset

  • Y.-H. Tsai, K. Sohn, S. Schulter, and M. Chandraker. Domain Adaptation for Structured Output via Discriminative Patch Representations. In ICCV, 2019. (Oral) [paper] [project] [Implementation Guidance]
  • W.-C. Hung, Y.-H. Tsai, Y.-T. Liou, Y.-Y. Lin, and M.-H. Yang. Adversarial Learning for Semi-supervised Semantic Segmentation. In BMVC, 2018. [paper] [code]
  • Y.-H. Chen, W.-Y. Chen, Y.-T. Chen, B.-C. Tsai, Y.-C. Frank Wang, and M. Sun. No More Discrimination: Cross City Adaptation of Road Scene Segmenters. In ICCV 2017. [paper] [project]

Acknowledgment

This code is heavily borrowed from Pytorch-Deeplab.

Note

The model and code are available for non-commercial research purposes only.

  • 10/2019: update performance and training/evaluation code for using LS-GAN and Synscapes (special thanks to Yan-Ting Liu for helping with the experiments)
  • 01/2019: update the training code for PyTorch 0.4
  • 07/23/2018: update evaluation code for PyTorch 0.4
  • 06/04/2018: update pretrained VGG-16 model
  • 02/2018: code released
Comments
  • Couldn't reproduce the result reported in the paper

    Hi, thanks for the code! I am trying to reproduce your result using your code, but I only get an mIoU of 40 after 45000 iterations. The result gets even worse after 65000 iterations. Is the result very sensitive to the number of training iterations? Are there other things to keep in mind when using your code?

    opened by YifeiAI 23
  • Bug on deeplab classifier model

    class Classifier_Module(nn.Module):
        def __init__(self, inplanes, dilation_series, padding_series, num_classes):
            super(Classifier_Module, self).__init__()
            self.conv2d_list = nn.ModuleList()
            for dilation, padding in zip(dilation_series, padding_series):
                self.conv2d_list.append(
                    nn.Conv2d(inplanes, num_classes, kernel_size=3, stride=1, padding=padding, dilation=dilation, bias=True))
    
            for m in self.conv2d_list:
                m.weight.data.normal_(0, 0.01)
    
        def forward(self, x):
            out = self.conv2d_list[0](x)
            for i in range(len(self.conv2d_list) - 1):
                out += self.conv2d_list[i + 1](x)
                return out
    

    Why is the "return out" statement inside the for loop? In my understanding, out is the sum of the pyramid layers, so the return should be outside the for loop.

    opened by lianqing11 15
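
To make the control flow in the question concrete, here is a framework-free sketch with plain callables standing in for the dilated convolutions. The function names are hypothetical. It shows only what the posted indentation does (the loop exits during its first iteration, so only the first two branches contribute) next to a version that sums every branch; it does not say which behaviour the released models were trained with.

```python
def forward_as_posted(branches, x):
    """Mirrors the posted forward(): the return sits inside the loop body."""
    out = branches[0](x)
    for i in range(len(branches) - 1):
        out += branches[i + 1](x)
        return out  # exits after adding only the second branch


def forward_summing_all(branches, x):
    """Variant with the return outside the loop, so every branch is added."""
    out = branches[0](x)
    for branch in branches[1:]:
        out += branch(x)
    return out


# Four stand-in branches that scale the input by 1, 2, 3 and 4.
branches = [lambda x, k=k: k * x for k in (1, 2, 3, 4)]
early = forward_as_posted(branches, 10)    # 10 + 20
full = forward_summing_all(branches, 10)   # 10 + 20 + 30 + 40
print(early, full)
```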
  • About baseline result with res101-deeplabv2

    Hi @wasidennis, I used the res101 backbone you provided and trained DeepLab v2 on GTA following your code. I then evaluated the model on Cityscapes but only got 31.1, which is far from your baseline result of 36.6. Did I miss anything? Looking forward to your reply.

    opened by shanyuhu 7
  • Training takes a long time?

    Hi,

    Been trying to replicate the gta -> cityscapes results on an AWS instance (p2.xlarge). I am using docker with cuda 8.0 and pytorch 0.4.1. Running the following:

    python train_gta2cityscapes_multi.py --snapshot-dir ./snapshots/GTA2Cityscapes_single_lsgan \
                                         --lambda-seg 0.0 \
                                         --lambda-adv-target1 0.0 --lambda-adv-target2 0.01 \
                                         --gan LS
    
    

    It takes about 5.7 seconds per iteration. Given that the model converges at 120k iterations, it's gonna take me more than a week to train it, which sounds insane. Is there something wrong here or are those the expected times?

    opened by defqoon 6
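
The estimate in the question can be checked with a few lines of arithmetic, using only the figures quoted above (5.7 seconds per iteration and 120k iterations). The variable names are illustrative.

```python
SECONDS_PER_ITERATION = 5.7   # figure reported in the question
ITERATIONS = 120_000          # iteration count mentioned in the question

total_seconds = SECONDS_PER_ITERATION * ITERATIONS
total_hours = total_seconds / 3600
total_days = total_hours / 24
print(f"{total_hours:.0f} hours, about {total_days:.1f} days")
```

At that rate the run takes roughly 190 hours, which is consistent with the "more than a week" estimate.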
  • RuntimeError: cuda runtime error (2) : out of memory

    Hello. I tried to execute your code, but it gives out of memory error. I am working with NVIDIA Titan Xp, which has 12GB memory capacity. It seems that the input sizes of the source and target images are quite big. ((1280,720) and (1024,512))

    Can I ask which GPU devices you use for training?

    opened by wgchang 6
  • Understanding the weights for multi-level adversarial learning

    Hello, this is great work. The README says that in the PyTorch 0.4 version, multi-level learning needs the adversarial-learning weights to be 0.0005 and 0.00005 respectively. How did you arrive at these values? I am asking because I am working on a different dataset, and it would be very helpful to understand how to choose reasonable values for adversarial learning.

    opened by AshStuff 5
  • Error when training using VGG-16 model

    @wasidennis @m3phisto @hfslyc, I am trying to run the code (VGG, source only). When I run it as you mentioned in a previous thread, with --lambda-adv-target1 0 --lambda-adv-target2, I get this error:

      File "train_gta2cityscapes_multi.py", line 311, in main
        pred1, pred2 = model(images)
    ValueError: not enough values to unpack (expected 2, got 1)

    Are there any modifications needed to train the source-only experiment (without adaptation)? If so, I would be grateful if you could share the code.

    opened by alphjheon 5
  • How is the trend of loss changing?

    I trained on my own data using this method. loss_adv1 increases, while loss_seg1 and loss_D1 decrease. In this situation, should I make LAMBDA_ADV_TARGET1 larger?

    opened by Sunting78 5
  • RuntimeError: there are no graph nodes that require computing gradients

    Hi, Thank you for sharing your code. I met a problem which is as follows:

    Traceback (most recent call last):
      File "train_gta2cityscapes_multi.py", line 412, in <module>
        main()
      File "train_gta2cityscapes_multi.py", line 304, in main
        loss.backward()
      File "/home/yaxing/anaconda2/envs/pytorch/lib/python2.7/site-packages/torch/autograd/variable.py", line 156, in backward
        torch.autograd.backward(self, gradient, retain_graph, create_graph, retain_variables)
      File "/home/yaxing/anaconda2/envs/pytorch/lib/python2.7/site-packages/torch/autograd/__init__.py", line 98, in backward
        variables, grad_variables, retain_graph)
    RuntimeError: there are no graph nodes that require computing gradients

    Note that I never changed the code. The error happens during training rather than at the beginning. It has happened three times (iter: 25281/25000, 8500/25000, 600/25000). I am using PyTorch 0.2.0_4 with Python 2.7.

    Sorry to bother you.

    opened by yaxingwang 5
  • feature space adaptation

    Hi, thanks for the great work and the code. In the original implementation I only found output space adaptation, since both outputs from deeplab_multi come from the classifier layers and are then upsampled for the later alignment. So if I want to implement actual feature space adaptation, I should take the feature maps right before the classifier layers, right? One more question: the output from layer5 is, in my opinion, not shallow enough. Would aligning features from very shallow layers help the domain adaptation?

    looking forward to your reply.

    opened by jianingwangind 4
  • Train the model on other dataset.

    Hello! I recently trained your model successfully on GTA5 and Cityscapes, and I want to try it on another dataset. I am new to this area and ran into a problem: the labels of the new dataset are black-and-white images, and I suspect there may be problems with them. What does "label2train" mean in the JSON file? I cannot understand these numbers: ([0, 255], [1, 255], [2, 255], [3, 255], [4, 255], [5, 255], [6, 255], [7, 0], [8, 1], [9, 255], [10, 255], [11, 2], [12, 3], [13, 4], [14, 255], [15, 255], [16, 255], [17, 5], [18, 255], [19, 6], [20, 7], [21, 8], [22, 9], [23, 10], [24, 11], [25, 12], [26, 13], [27, 14], [28, 15], [29, 255], [30, 255], [31, 16], [32, 17], [33, 18], [-1, 255]).

    Some of them are also used in __init__ in gta5_dataset.py.
    Sorry if this is a basic question; I am a newcomer who wants to learn more. Thank you very much!

    opened by JIANGbb95 4
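
One plausible reading of the pairs quoted in the question is (raw label ID, training ID), with 255 marking pixels that are excluded from training and evaluation. That reading is an assumption, not something the thread confirms. Under it, remapping a label map is a table lookup, as in the sketch below. The function name remap_labels and the tiny label map are hypothetical, and only a few of the quoted pairs are included.

```python
IGNORE = 255  # assumed meaning: pixels with this value are ignored

# A subset of the pairs quoted in the question, read as [raw ID, training ID].
LABEL2TRAIN = [[0, 255], [7, 0], [8, 1], [11, 2], [26, 13], [-1, 255]]


def remap_labels(label_rows, pairs, ignore=IGNORE):
    """Replace each raw ID with its training ID; unlisted IDs become `ignore`."""
    table = {raw: train for raw, train in pairs}
    return [[table.get(value, ignore) for value in row] for row in label_rows]


# Hypothetical 2x3 label map containing raw IDs (99 is not listed in the table).
raw_labels = [[7, 8, 0],
              [26, 99, 11]]
train_labels = remap_labels(raw_labels, LABEL2TRAIN)
print(train_labels)
```

For a new dataset with black-and-white label images, the same idea would apply with a table that maps that dataset's pixel values to consecutive training IDs.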
  • About the numeric scale of the discriminator loss

    opened by Muming-Zhao 0
  • data augmentation

    hello author,

    Did you use the mirror and scale augmentations in the dataloader? The dataloader function has arguments for them, but they do not appear to be used.

    thank you

    opened by soans1994 0
  • error when loading the pretrained model

    When I load the pretrained model downloaded from

    http://vllab.ucmerced.edu/ytsai/CVPR18/DeepLab_resnet_pretrained_init-f81d91e8.pth

    torch.load() reports an error:

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/data1/xxx/software/anaconda3/envs/torch18/lib/python3.8/site-packages/torch/serialization.py", line 593, in load
        return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
      File "/data1/xxx/software/anaconda3/envs/torch18/lib/python3.8/site-packages/torch/serialization.py", line 779, in _legacy_load
        deserialized_objects[key]._set_from_file(f, offset, f_should_read_directly)
    RuntimeError: unexpected EOF, expected 278891 more bytes. The file might be corrupted.
    

    Is the online model incomplete?

    opened by TomSheng21 0
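
An unexpected-EOF error from torch.load often points to an incomplete download. One way to check the local copy, assuming the eight hex characters before the extension (f81d91e8) follow the PyTorch model-zoo naming convention of embedding the start of the file's SHA-256 digest, is to hash the file and compare prefixes. Whether this particular file follows that convention is not confirmed by the thread, and the function name matches_hash_suffix is hypothetical. The demo uses a temporary stand-in file rather than the real checkpoint.

```python
import hashlib
import re
import tempfile
from pathlib import Path


def matches_hash_suffix(path):
    """Return True if the file's SHA-256 digest starts with the hex suffix in its name."""
    path = Path(path)
    match = re.search(r"-([0-9a-f]{8,})\.[^.]+$", path.name)
    if match is None:
        raise ValueError(f"no hash suffix found in {path.name!r}")
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in 1 MiB chunks so large checkpoints are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest().startswith(match.group(1))


# Demo with a stand-in file: first written completely, then truncated.
with tempfile.TemporaryDirectory() as tmp:
    payload = b"stand-in for checkpoint bytes"
    prefix = hashlib.sha256(payload).hexdigest()[:8]
    checkpoint = Path(tmp) / f"demo-{prefix}.pth"
    checkpoint.write_bytes(payload)
    complete_ok = matches_hash_suffix(checkpoint)
    checkpoint.write_bytes(payload[:-5])  # simulate an interrupted download
    truncated_ok = matches_hash_suffix(checkpoint)

print(complete_ok, truncated_ok)
```

If the check fails on a real download, fetching the file again is the obvious first step before suspecting the hosted copy.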
  • Can you share a log file?

    Thank you for sharing your amazing code!

    I am trying to use a deeper segmentation network, but after changing the segmentation backbone, the adversarial loss diverges after only a few steps.

    Can you share a training log file of single-level adaptation training for both VGG-16 and ResNet-101? Thank you!

    opened by drumyseong 0
  • How much NUM_STEPS can be reduced to train the model well?

    Hello, I want to train the model faster.
    Could I decrease NUM_STEPS to 25000? What are the effects of decreasing NUM_STEPS? I would be very grateful for an answer.

    opened by farahnazmalekzadeh 0