Semi-supervised semantic segmentation needs strong, varied perturbations

Overview

Semi-supervised semantic segmentation using CutMix and Colour Augmentation

Implementations of our papers:

Licensed under MIT license.

Colour augmentation

Please see our new paper for a full discussion, but a summary of our findings can be found in our [colour augmentation](Colour augmentation.ipynb) Jupyter notebook.

Requirements

We provide an environment.yml file that can be used to re-create a conda environment that provides the required packages:

conda env create -f environment.yml

Then activate with:

conda activate cutmix_semisup_seg

(note: this will not install the library needed to use the PSPNet architecture; see below)

In general we need:

  • Python >= 3.6
  • PyTorch >= 1.4
  • torchvision 0.5
  • OpenCV
  • Pillow
  • Scikit-image
  • Scikit-learn
  • click
  • tqdm
  • Jupyter notebook for the notebooks
  • numpy 1.18
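The list above can be checked against the active environment with a short script. This is a minimal sketch and not part of the repository. The distribution names in `REQUIRED` (for example `opencv-python`) are assumptions about how the packages are registered and may differ in a conda environment. It needs Python 3.8 or later for `importlib.metadata`.

```python
from importlib import metadata

# Assumed distribution names for the packages listed above;
# conda builds may register some of them under different names.
REQUIRED = ["torch", "torchvision", "opencv-python", "Pillow",
            "scikit-image", "scikit-learn", "click", "tqdm", "numpy"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(REQUIRED).items():
        print(f"{name:15s} {version or 'NOT FOUND'}")
```

A `NOT FOUND` entry only means that no distribution with that exact name is registered, so it is worth confirming with a direct `import` before reinstalling anything.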

Requirements for PSPNet

To use the PSPNet architecture (see Pyramid Scene Parsing Network by Zhao et al.), you will need to install the logits-from-models branch of https://github.com/Britefury/semantic-segmentation-pytorch:

pip install git+https://github.com/Britefury/semantic-segmentation-pytorch.git@logits-from-models

Datasets

You need to:

  1. Download/acquire the datasets
  2. Write the config file semantic_segmentation.cfg giving their paths
  3. Convert them if necessary; the CamVid, Cityscapes and ISIC 2017 datasets must be converted to a ZIP-based format prior to use. You must run the provided conversion utilities to create these ZIP files.

Dataset preparation instructions can be found here.

Running the experiments

We provide four programs for running experiments:

  • train_seg_semisup_mask_mt.py: mask driven consistency loss (the main experiment)
  • train_seg_semisup_aug_mt.py: augmentation driven consistency loss; used to attempt to replicate the ISIC 2017 baselines of Li et al.
  • train_seg_semisup_ict.py: Interpolation Consistency Training; a baseline for contrast with our main approach
  • train_seg_semisup_vat_mt.py: Virtual Adversarial Training adapted for semantic segmentation
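To give a feel for what a mask driven consistency loss computes, here is a minimal NumPy sketch of the CutMix variant. It is not the code in train_seg_semisup_mask_mt.py: the function names, the squared-error loss, the box sampling scheme and the student/teacher naming are all assumptions made for illustration.

```python
import numpy as np


def box_mask(height, width, rng, prop=0.5):
    """Binary mask containing one random box that covers about `prop` of the image."""
    split = rng.uniform(prop, 1.0)            # fraction of the height used by the box
    box_h = int(round(height * split))
    box_w = int(round(width * prop / split))  # width chosen so the area is about prop
    y0 = rng.integers(0, height - box_h + 1)
    x0 = rng.integers(0, width - box_w + 1)
    mask = np.zeros((height, width), dtype=np.float32)
    mask[y0:y0 + box_h, x0:x0 + box_w] = 1.0
    return mask


def cutmix(a, b, mask):
    """Take `a` where mask == 1 and `b` elsewhere; the mask broadcasts over channels."""
    return mask * a + (1.0 - mask) * b


def consistency_loss(student_pred, teacher_a, teacher_b, mask):
    """Mean squared error between the student's prediction on the mixed input
    and the same mix applied to the teacher's predictions on the two inputs."""
    target = cutmix(teacher_a, teacher_b, mask)
    return float(np.mean((student_pred - target) ** 2))


def pixelwise_model(x):
    """Stand-in for a network: a per-pixel sigmoid (hypothetical, for the demo only)."""
    return 1.0 / (1.0 + np.exp(-x))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img_a = rng.normal(size=(3, 8, 12))       # two unlabelled "images", shape (C, H, W)
    img_b = rng.normal(size=(3, 8, 12))
    mask = box_mask(8, 12, rng)
    mixed = cutmix(img_a, img_b, mask)
    loss = consistency_loss(pixelwise_model(mixed),
                            pixelwise_model(img_a), pixelwise_model(img_b), mask)
    print(f"consistency loss for a per-pixel model: {loss:.6f}")
```

A purely per-pixel model is trivially consistent under mixing, so the demo reports a loss of zero. A real segmentation network uses spatial context, which is why the mixed prediction can disagree with the mixed targets and the loss carries a training signal.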

They can be configured via command line arguments that are described here.

Shell scripts

To replicate our results, we provide shell scripts to run our experiments.

Cityscapes
> sh run_cityscapes_experiments.sh <run> <split_rng_seed>

where <run> is the name of the run and <split_rng_seed> is an integer RNG seed used to select the supervised samples. Please see the comments at the top of run_cityscapes_experiments.sh for further explanation.

To re-create the 5 runs we used for our experiments:

> sh run_cityscapes_experiments.sh 01 12345
> sh run_cityscapes_experiments.sh 02 23456
> sh run_cityscapes_experiments.sh 03 34567
> sh run_cityscapes_experiments.sh 04 45678
> sh run_cityscapes_experiments.sh 05 56789
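If you prefer to drive the five runs from Python, the commands above can be generated in a loop. This is a sketch only: `build_commands` and `launch` are hypothetical helpers, not part of the repository, and `launch` defaults to a dry run that just prints the commands.

```python
import subprocess

# Run names and split seeds copied from the five commands above.
CITYSCAPES_RUNS = [("01", 12345), ("02", 23456), ("03", 34567),
                   ("04", 45678), ("05", 56789)]


def build_commands(script, runs):
    """Return one argv list per (run_name, split_rng_seed) pair."""
    return [["sh", script, run, str(seed)] for run, seed in runs]


def launch(commands, dry_run=True):
    """Print each command; execute them one after another unless dry_run is set."""
    for argv in commands:
        print(" ".join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)  # stop at the first failing run


if __name__ == "__main__":
    launch(build_commands("run_cityscapes_experiments.sh", CITYSCAPES_RUNS))
```

Call `launch(..., dry_run=False)` from the repository root to actually start the experiments; they run sequentially.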

Pascal VOC 2012 (augmented)
> sh run_pascal_aug_experiments.sh <n_supervised> <n_supervised_txt>

where <n_supervised> is the number of supervised samples and <n_supervised_txt> is that number as text. Please see the comments at the top of run_pascal_aug_experiments.sh for further explanation.

We use the same data split as Mittal et al. It is stored in data/splits/pascal_aug/split_0.pkl, which is included in the repo.
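To see what the split file holds, you can unpickle it and print a summary. This sketch makes no assumption about the structure of the stored object, which this README does not describe; `load_split` and `describe` are hypothetical helper names.

```python
import pickle
from pathlib import Path


def load_split(path):
    """Unpickle a split file and return whatever object it contains."""
    with Path(path).open("rb") as f:
        return pickle.load(f)


def describe(obj):
    """One-line summary: the type name, plus the length when the object has one."""
    size = f", len={len(obj)}" if hasattr(obj, "__len__") else ""
    return f"{type(obj).__name__}{size}"


if __name__ == "__main__":
    split_path = Path("data/splits/pascal_aug/split_0.pkl")
    if split_path.exists():
        print(describe(load_split(split_path)))
    else:
        print(f"{split_path} not found; run this from the repository root")
```

Only unpickle files you trust, since `pickle.load` can execute arbitrary code. If the file stores NumPy arrays, NumPy must be importable when loading it.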

Pascal VOC 2012 (augmented) with DeepLab v3+
> sh run_pascal_aug_deeplab3plus_experiments.sh <n_supervised> <n_supervised_txt>

ISIC 2017 Segmentation
> sh run_isic2017_experiments.sh <run> <split_rng_seed>

where <run> is the name of the run and <split_rng_seed> is an integer RNG seed used to select the supervised samples. Please see the comments at the top of run_isic2017_experiments.sh for further explanation.

To re-create the 5 runs we used for our experiments:

> sh run_isic2017_experiments.sh 01 12345
> sh run_isic2017_experiments.sh 02 23456
> sh run_isic2017_experiments.sh 07 78901
> sh run_isic2017_experiments.sh 08 89012
> sh run_isic2017_experiments.sh 09 90123

In early experiments, we tested 10 seeds and selected the middle 5 when ranked by performance, hence the specific seed choice.
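The selection rule described above can be written in a few lines. This is an illustrative sketch: `middle_k` is a hypothetical helper, and the seeds and scores below are made up, not results from our experiments.

```python
def middle_k(scores_by_seed, k=5):
    """Rank seeds by score (ascending) and return the k seeds in the middle."""
    ranked = sorted(scores_by_seed, key=scores_by_seed.get)
    start = (len(ranked) - k) // 2
    return ranked[start:start + k]


# Made-up scores for ten made-up seeds, for illustration only.
example_scores = dict(zip(range(10),
                          [0.71, 0.64, 0.80, 0.69, 0.75,
                           0.66, 0.78, 0.73, 0.62, 0.77]))

if __name__ == "__main__":
    print(middle_k(example_scores))
```

With 10 seeds and k=5, the rule drops the two worst and the three best seeds, which limits the influence of outlier runs in either direction.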

Exploring the input data distribution present in semantic segmentation problems

Cluster assumption

First, we examine the input data distribution presented by semantic segmentation problems, with a view to determining whether the low density separation assumption holds, in the notebook Semantic segmentation input data distribution.ipynb. This notebook also contains the code used to generate the images in Figure 1 of the paper.

Inter-class and intra-class variance

Second, we examine the inter-class and intra-class distance (as a proxy for inter-class and intra-class variance) in the notebook Plot inter-class and intra-class distances from files.ipynb.

Note that running the second notebook requires that you generate some data files using the intra_inter_class_patch_dist.py program.
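As a rough illustration of the quantity being compared, the sketch below computes mean intra-class and inter-class Euclidean distances over all pairs of labelled points. It is a simplification under stated assumptions and not the procedure implemented in intra_inter_class_patch_dist.py: the function name, the all-pairs averaging and the toy data are hypothetical.

```python
import math
from itertools import combinations


def mean_class_distances(points, labels):
    """Return (mean intra-class, mean inter-class) Euclidean distance
    over all pairs of points. Needs at least one pair of each kind."""
    intra, inter = [], []
    for (p, label_p), (q, label_q) in combinations(zip(points, labels), 2):
        (intra if label_p == label_q else inter).append(math.dist(p, q))
    return sum(intra) / len(intra), sum(inter) / len(inter)


# Two tight, well separated toy clusters (made-up data).
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
labels = ["road", "road", "car", "car"]

if __name__ == "__main__":
    intra, inter = mean_class_distances(points, labels)
    print(f"intra={intra:.3f} inter={inter:.3f}")
```

In this toy case the inter-class distance is much larger than the intra-class distance, which is the situation in which low density regions separate the classes. The notebooks examine whether image patches from segmentation datasets behave this way.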

Toy 2D experiments

The toy 2D experiments used to produce Figure 3 in the paper can be run using the toy2d_train.py program, which is documented here.

You can re-create the toy 2D experiments by running the run_toy2d_experiments.sh shell script:

> sh run_toy2d_experiments.sh <run>

Comments
  • Confused on the "loss_mask"

    Hi, really nice codebase!

    We have tried to understand the details of your implementation and find the usage of the loss mask a little bit confusing (as shown below).

    https://github.com/Britefury/cutmix-semisup-seg/blob/3a8839c7145052ef1e5d0e5862500c6c95b4a042/train_seg_semisup_mask_mt.py#L294-L297

    First, I have checked the code carefully and find that you define the 'mask' for the unlabeled images at the following code snippet:

    https://github.com/Britefury/cutmix-semisup-seg/blob/3a8839c7145052ef1e5d0e5862500c6c95b4a042/datapipe/seg_data.py#L90-L96

    Therefore, we can see that the initial 'mask' for all images is set as an array of 255 with the same size as the input image. Then you assign the mask value as follows:

    https://github.com/Britefury/cutmix-semisup-seg/blob/3a8839c7145052ef1e5d0e5862500c6c95b4a042/datapipe/seg_transforms_cv.py#L198

    According to my understanding, only when we apply the cv2.warpAffine transform, some of the 'mask' values become zero (as shown below):

    https://github.com/Britefury/cutmix-semisup-seg/blob/3a8839c7145052ef1e5d0e5862500c6c95b4a042/datapipe/seg_transforms_cv.py#L371-L372

    If my understanding is correct, the batch_um0 and batch_um1 might not be necessary if we do not apply the warpAffine transformation.

    https://github.com/Britefury/cutmix-semisup-seg/blob/3a8839c7145052ef1e5d0e5862500c6c95b4a042/train_seg_semisup_mask_mt.py#L294-L297

    So my first question is about the influence of affine.cat_nx2x3 (shown below) on the final performance.

    https://github.com/Britefury/cutmix-semisup-seg/blob/3a8839c7145052ef1e5d0e5862500c6c95b4a042/datapipe/seg_transforms_cv.py#L350-L356

    So I guess the purpose of the loss mask is mainly to filter out the influence of the extra introduced pixels when applying the warpAffine augmentation. Is my understanding correct?

    It would be great if you could point out any of my misunderstandings!

    opened by PkuRainBow 5
  • Question about the cons weight.

    Hi. In Mean Teacher, the consistency weight is 100, but in this work every consistency weight is 1. Isn't this value too small? Could you share the details of how this parameter was set? I have seen other work (such as CPS, CVPR 2021) that uses a consistency weight of around 100 when reproducing the mean-teacher method.

    Looking forward to your help.

    best,

    opened by CuberrChen 4
  • Pretrained models

    Hi @Britefury,

    Thank you for making the code publicly available.

    Can you share the pretrained models for the deeplabv2 architecture (Cutmix, Cutout, VAT and ICT) please? We are having trouble reproducing the results.

    Thanks in advance.

    opened by wvangansbeke 4
  • What is the "mask_arr" in the code?

    Hi. Thanks for your interesting work!

    I have a question about the following code: https://github.com/Britefury/cutmix-semisup-seg/blob/44e81b3ae862d2da7d1c4df77fb274f8f1f0a861/datapipe/seg_data.py#L90-L96 What is the "mask_arr" here? All elements of this array are set to 255. Why should we define it here?

    In the file of the main experiment, "mask_arr" is converted to "mask" and called by: https://github.com/Britefury/cutmix-semisup-seg/blob/44e81b3ae862d2da7d1c4df77fb274f8f1f0a861/train_seg_semisup_mask_mt.py#L295

    https://github.com/Britefury/cutmix-semisup-seg/blob/44e81b3ae862d2da7d1c4df77fb274f8f1f0a861/train_seg_semisup_mask_mt.py#L297

    https://github.com/Britefury/cutmix-semisup-seg/blob/44e81b3ae862d2da7d1c4df77fb274f8f1f0a861/train_seg_semisup_mask_mt.py#L306

    https://github.com/Britefury/cutmix-semisup-seg/blob/44e81b3ae862d2da7d1c4df77fb274f8f1f0a861/train_seg_semisup_mask_mt.py#L324

    In the above code, a "loss_mask" is generated for unlabeled loss, i.e., the consistency loss. Are all elements of "loss_mask" equal to 1? Can you explain what it does?

    Thanks in advance.

    opened by ZHKKKe 2
  • How to run the jobs on multiple GPUs.

    Great job for the semi-supervised semantic segmentation problem!

    After checking the implementation, we could not find any code that supports running jobs on multiple GPUs.

    We only found that the job function in job_helper.py takes a parameter "num_jobs", as shown below:

    https://github.com/Britefury/cutmix-semisup-seg/blob/3a8839c7145052ef1e5d0e5862500c6c95b4a042/job_helper.py#L87-L116

    opened by PkuRainBow 1
  • Add hubconf.py

    This adds hubconf.py so that, for example, the following can work when "Ivan1248" is replaced with "Britefury":

    model = torch.hub.load('Ivan1248/cutmix-semisup-seg', 'resnet101_deeplab_imagenet', num_classes=19) 
    

    If you would like, you can accept this or make changes.

    opened by Ivan1248 0
  • Questions in converting cityscapes dataset

    Thanks for your excellent work and for kindly releasing the code. I have a problem when reproducing the cityscapes experiment.

    Following your tutorial, I have reproduced the experimental results, 44.37 vs. 51.73 with 6.79 improvement under the cutmix setting. When delving into the code, I noticed that you implement a downsample_label_img function for downsampling the ground truth in convert_cityscapes.py. However, it is more common to use the nearest-neighbour downsampling method in cv2 or PIL. I then replaced the downsample_label_img function with the following: y_img = cv2.resize(y_img, (1024, 512), interpolation=cv2.INTER_NEAREST) and re-ran the cityscapes experiment. The results degrade greatly, 43.79 vs. 47.02 with only 3.23 improvement.

    I wonder why a different downsampling method affects the result so much, and what the motivation was for reimplementing a downsampling method rather than using a common one.

    opened by jfzhuang 3
  • Improvement due to loss function or data augmentation?

    Hi, I went through the paper and had a tough time understanding the gist of it. Is it:

    1. Cut-Mix for segmentation images improving the accuracy metrics or
    2. The consistency loss function is improving your scores

    Put more naively, is the work about cut-mix for segmentation (the original is for classification), or is it about the consistency loss?

    opened by ramanathan831 0
  • Small concerns on the experiments in Table-4

    Really nice work and super impressive results on the low-data regime (shown in Table 4, also pasted as following)!!

    [image: Table 4 from the paper, with some results circled in red]

    We have some small concerns about the red circle marked results:

    1. On the 1/100 subset column, the baseline of DeepLabv3+/PSPNet is only 37.95%/36.69% while your method achieves 59.52%/67.20% respectively. We would really appreciate it if you could provide a detailed explanation of why your method achieves such large gains. One of our small concerns is that applying the CutMix scheme + Mean-Teacher scheme on the supervised baseline method w/o using unlabeled images might be a more reasonable baseline setting. It would be great if you could share some results for such settings.

    2. On the 1/8 subset column, the baseline performance of DeepLabv3+ is slightly worse than the PSPNet. According to our experience, the DeepLabv3+ should perform much better. Could you share with us some explanation on it?

    3. On the full set column, we observe that the performance of the proposed method is slightly worse than the baseline. Could you share your comments on the possible reasons?

    4. According to your code, all the experiments fix the BN statistics and apply crop size 321x321 with a single GPU. Do you have any plans to train, or have you ever trained, your method in a stronger setting such as crop size 512x512 + SyncBN + 8x V100 GPUs? MMSegmentation or openseg.pytorch might be a good candidate codebase.

    Great thanks for your valuable time and wait for your explanation!

    opened by PkuRainBow 2
  • Requesting pretrained models

    Hi, I found your paper really insightful, been trying to replicate the results. Thanks for the code and the elaborate explanations with training scripts. I was wondering if you could share the pretrained models for Cityscapes dataset (Cutout, Cutmix and baseline models) for comparison and evaluation.

    opened by prachigarg23 0