Semi-supervised semantic segmentation needs strong, varied perturbations

Last update: Dec 20, 2022

Overview

Semi-supervised semantic segmentation using CutMix and Colour Augmentation

Implementations of our papers:

Semi-supervised semantic segmentation needs strong, varied perturbations by Geoff French, Samuli Laine, Timo Aila, Michal Mackiewicz and Graham Finlayson
Colour augmentation for improved semi-supervised semantic segmentation by Geoff French and Michal Mackiewicz

Licensed under MIT license.

Colour augmentation

Please see our new paper for a full discussion, but a summary of our findings can be found in our [colour augmentation](Colour augmentation.ipynb) Jupyter notebook.

Requirements

We provide an environment.yml file that can be used to re-create a conda environment that provides the required packages:

conda env create -f environment.yml

Then activate with:

conda activate cutmix_semisup_seg

(note: this will not install the library needed to use the PSPNet architecture; see below)

In general we need:

Python >= 3.6
PyTorch >= 1.4
torchvision 0.5
OpenCV
Pillow
Scikit-image
Scikit-learn
click
tqdm
Jupyter notebook for the notebooks
numpy 1.18

Requirements for PSPNet

To use the PSPNet architecture (see Pyramid Scene Parsing Network by Zhao et al.), you will need to install the logits-from-models branch of https://github.com/Britefury/semantic-segmentation-pytorch:

pip install git+https://github.com/Britefury/semantic-segmentation-pytorch.git@logits-from-models

Datasets

You need to:

Download/acquire the datasets
Write the config file semantic_segmentation.cfg giving their paths
Convert them if necessary; the CamVid, Cityscapes and ISIC 2017 datasets must be converted to a ZIP-based format prior to use. You must run the provided conversion utilities to create these ZIP files.

Dataset preparation instructions can be found here.

Running the experiments

We provide four programs for running experiments:

train_seg_semisup_mask_mt.py: mask driven consistency loss (the main experiment)
train_seg_semisup_aug_mt.py: augmentation driven consistency loss; used to attempt to replicate the ISIC 2017 baselines of Li et al.
train_seg_semisup_ict.py: Interpolation Consistency Training; a baseline for contrast with our main approach
train_seg_semisup_vat_mt.py: Virtual Adversarial Training adapted for semantic segmentation
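
As a rough illustration of what a mask driven consistency loss computes, here is a minimal NumPy sketch of CutMix-style consistency. It is not the code from train_seg_semisup_mask_mt.py: the names `box_mask`, `cutmix` and `cutmix_consistency_loss` are hypothetical, the student and teacher are assumed to be plain callables that map a (C, H, W) image to (K, H, W) class probabilities, and a squared-error consistency term is assumed.

```python
import numpy as np


def box_mask(height, width, rng, prop=0.5):
    """Binary (1, H, W) mask holding one random box that covers roughly `prop` of the image."""
    box_h = int(round(height * np.sqrt(prop)))
    box_w = int(round(width * np.sqrt(prop)))
    top = rng.integers(0, height - box_h + 1)
    left = rng.integers(0, width - box_w + 1)
    mask = np.zeros((1, height, width), dtype=np.float32)
    mask[:, top:top + box_h, left:left + box_w] = 1.0
    return mask


def cutmix(a, b, mask):
    """Take pixels from `a` where mask == 1 and from `b` elsewhere."""
    return a * mask + b * (1.0 - mask)


def cutmix_consistency_loss(student, teacher, x0, x1, mask):
    """Squared error between the student's prediction on the mixed image
    and the same mix applied to the teacher's predictions on the originals."""
    target = cutmix(teacher(x0), teacher(x1), mask)
    pred = student(cutmix(x0, x1, mask))
    return float(np.mean((pred - target) ** 2))
```

In a mean teacher setup the teacher would normally be an exponential moving average of the student and would receive no gradient; that part is omitted here to keep the sketch short.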

They can be configured via command line arguments that are described here.

Shell scripts

To replicate our results, we provide shell scripts to run our experiments.

Cityscapes

> sh run_cityscapes_experiments.sh <run> <split_rng_seed>

where <run> is the name of the run and <split_rng_seed> is an integer RNG seed used to select the supervised samples. Please see the comments at the top of run_cityscapes_experiments.sh for further explanation.
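
To show how an integer seed can pin down which samples are treated as supervised, here is a small sketch. It is an assumption about the general idea, not the repository's actual split code; `split_supervised` and its arguments are hypothetical names.

```python
import numpy as np


def split_supervised(n_samples, n_supervised, seed):
    """Return (supervised_indices, unsupervised_indices) chosen with a seeded RNG."""
    rng = np.random.RandomState(seed)
    order = rng.permutation(n_samples)
    sup = np.sort(order[:n_supervised])
    unsup = np.sort(order[n_supervised:])
    return sup, unsup


# The same seed always selects the same supervised subset.
sup_a, _ = split_supervised(100, 10, seed=12345)
sup_b, _ = split_supervised(100, 10, seed=12345)
assert np.array_equal(sup_a, sup_b)
```

Because the subset depends only on the seed, runs that share a seed are comparable across methods.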

To re-create the 5 runs we used for our experiments:

> sh run_cityscapes_experiments.sh 01 12345
> sh run_cityscapes_experiments.sh 02 23456
> sh run_cityscapes_experiments.sh 03 34567
> sh run_cityscapes_experiments.sh 04 45678
> sh run_cityscapes_experiments.sh 05 56789

Pascal VOC 2012 (augmented)

> sh run_pascal_aug_experiments.sh <n_supervised> <n_supervised_txt>

where <n_supervised> is the number of supervised samples and <n_supervised_txt> is that number as text. Please see the comments at the top of run_pascal_aug_experiments.sh for further explanation.

We use the same data split as Mittal et al. It is stored in data/splits/pascal_aug/split_0.pkl that is included in the repo.

Pascal VOC 2012 (augmented) with DeepLab v3+

> sh run_pascal_aug_deeplab3plus_experiments.sh <n_supervised> <n_supervised_txt>

ISIC 2017 Segmentation

> sh run_isic2017_experiments.sh <run> <split_rng_seed>

where <run> is the name of the run and <split_rng_seed> is an integer RNG seed used to select the supervised samples. Please see the comments at the top of run_isic2017_experiments.sh for further explanation.

To re-create the 5 runs we used for our experiments:

> sh run_isic2017_experiments.sh 01 12345
> sh run_isic2017_experiments.sh 02 23456
> sh run_isic2017_experiments.sh 07 78901
> sh run_isic2017_experiments.sh 08 89012
> sh run_isic2017_experiments.sh 09 90123

In early experiments, we tested 10 seeds and selected the middle 5 when ranked in terms of performance, hence the specific seed choice.
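
The selection of the middle-ranked seeds could be sketched as follows. This is only an illustration: `middle_seeds` is a hypothetical helper, and since a window of 5 cannot be centred exactly within 10 ranked items, the sketch assumes that two seeds are dropped from the bottom and three from the top, which the text above does not specify.

```python
def middle_seeds(scores, keep=5):
    """Rank seeds by score (ascending) and return the `keep` seeds in the middle.

    `scores` maps a seed identifier to its performance. With an even number of
    seeds and an odd `keep`, one more seed is dropped from the top than from
    the bottom (an assumption made for this sketch).
    """
    ranked = sorted(scores, key=scores.get)
    start = (len(ranked) - keep) // 2
    return ranked[start:start + keep]
```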

Exploring the input data distribution present in semantic segmentation problems

Cluster assumption

First we examine the input data distribution presented by semantic segmentation problems, with a view to determining whether the low density separation assumption holds, in the notebook Semantic segmentation input data distribution.ipynb. This notebook also contains the code used to generate the images from Figure 1 in the paper.

Inter-class and intra-class variance

Secondly we examine the inter-class and intra-class distance (as a proxy for inter-class and intra-class variance) in the notebook Plot inter-class and intra-class distances from files.ipynb.

Note that running the second notebook requires that you generate some data files using the intra_inter_class_patch_dist.py program.
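
To make the idea of intra-class and inter-class distance concrete, here is a minimal sketch. It is not the code from intra_inter_class_patch_dist.py; it assumes each patch carries a single class label and measures, for every patch, the L2 distance to its nearest neighbour of the same class and to its nearest neighbour of a different class. The function name is hypothetical.

```python
import numpy as np


def nearest_class_distances(patches, labels):
    """For each patch, return (intra, inter): the L2 distance to the nearest
    other patch with the same label and to the nearest patch with a different label.

    patches: array of shape (N, ...); labels: integer array of shape (N,).
    A patch that is the only member of its class gets intra = inf.
    """
    flat = patches.reshape(len(patches), -1).astype(np.float64)
    sq_norm = (flat ** 2).sum(axis=1)
    # Pairwise squared distances via |a|^2 + |b|^2 - 2 a.b, clipped at zero.
    d2 = sq_norm[:, None] + sq_norm[None, :] - 2.0 * flat @ flat.T
    dist = np.sqrt(np.maximum(d2, 0.0))
    np.fill_diagonal(dist, np.inf)  # a patch is not its own neighbour
    same = labels[:, None] == labels[None, :]
    intra = np.where(same, dist, np.inf).min(axis=1)
    inter = np.where(~same, dist, np.inf).min(axis=1)
    return intra, inter
```

If the inter-class distances are not clearly larger than the intra-class ones, class boundaries do not sit in low density regions of the patch distribution.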

Toy 2D experiments

The toy 2D experiments used to produce Figure 3 in the paper can be run using the toy2d_train.py program, which is documented here.

You can re-create the toy 2D experiments by running the run_toy2d_experiments.sh shell script:

> sh run_toy2d_experiments.sh <run>

Comments

Confused on the "loss_mask"

Hi, really nice codebase!

We have tried to understand the details of your implementation and find the usage of the loss mask a little bit confusing (as shown below).

https://github.com/Britefury/cutmix-semisup-seg/blob/3a8839c7145052ef1e5d0e5862500c6c95b4a042/train_seg_semisup_mask_mt.py#L294-L297

First, I have checked the code carefully and find that you define the 'mask' for the unlabeled images at the following code snippet:

https://github.com/Britefury/cutmix-semisup-seg/blob/3a8839c7145052ef1e5d0e5862500c6c95b4a042/datapipe/seg_data.py#L90-L96

Therefore, we can see that the initial 'mask' for every image is an array of 255 with the same size as the input image. Then you assign the mask value as follows:

https://github.com/Britefury/cutmix-semisup-seg/blob/3a8839c7145052ef1e5d0e5862500c6c95b4a042/datapipe/seg_transforms_cv.py#L198

According to my understanding, only when we apply the cv2.warpAffine transform, some of the 'mask' values become zero (as shown below):

https://github.com/Britefury/cutmix-semisup-seg/blob/3a8839c7145052ef1e5d0e5862500c6c95b4a042/datapipe/seg_transforms_cv.py#L371-L372

If my understanding is correct, the batch_um0 and batch_um1 might not be necessary if we do not apply the warpAffine transformation.

https://github.com/Britefury/cutmix-semisup-seg/blob/3a8839c7145052ef1e5d0e5862500c6c95b4a042/train_seg_semisup_mask_mt.py#L294-L297

So my first question is about the influence of affine.cat_nx2x3 (shown below) on the final performance.

https://github.com/Britefury/cutmix-semisup-seg/blob/3a8839c7145052ef1e5d0e5862500c6c95b4a042/datapipe/seg_transforms_cv.py#L350-L356

So I guess the purpose of the loss mask is mainly to filter out the influence of the extra introduced pixels when applying the warpAffine augmentation. Is my understanding correct?

It would be great if you could point out any of my misunderstandings!
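
The reading described in this question can be sketched in NumPy. This is an illustration under the asker's assumption that the mask marks pixels which still contain real image content after a geometric transform; it is not the repository's code. A horizontal shift stands in for cv2.warpAffine, and both function names are hypothetical.

```python
import numpy as np


def translate_with_mask(img, dx):
    """Shift a (H, W) image right by dx >= 0 pixels, zero-filling the exposed
    columns, and return the shifted image plus a mask of valid pixels."""
    h, w = img.shape
    out = np.zeros_like(img)
    mask = np.zeros((h, w), dtype=np.float32)
    out[:, dx:] = img[:, :w - dx]
    mask[:, dx:] = 1.0
    return out, mask


def masked_consistency_loss(pred, target, loss_mask):
    """Mean squared error computed only over pixels where loss_mask == 1."""
    sq_err = (pred - target) ** 2
    return float((sq_err * loss_mask).sum() / max(loss_mask.sum(), 1.0))
```

With no geometric transform the mask stays all ones and the masked loss reduces to the ordinary mean squared error.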

opened by PkuRainBow 5
Question about the cons weight.

Hi. In Mean Teacher, the consistency weight is 100, but in this work every consistency weight is 1. Isn't this value too small? Can you tell me how this parameter was set? I have seen other work (such as CPS, CVPR 2021) that also uses a consistency weight of around 100 when reproducing the Mean Teacher method.

Looking forward to your help.

best,

opened by CuberrChen 4
Pretrained models

Hi @Britefury,

Thank you for making the code publicly available.

Can you share the pretrained models for the DeepLab v2 architecture (CutMix, Cutout, VAT and ICT) please? We are having trouble reproducing the results.

Thanks in advance.

opened by wvangansbeke 4
What is the "mask_arr" in the code?

Hi. Thanks for your interesting work!

I have a question about the following code: https://github.com/Britefury/cutmix-semisup-seg/blob/44e81b3ae862d2da7d1c4df77fb274f8f1f0a861/datapipe/seg_data.py#L90-L96 What is the "mask_arr" here? All elements of this array are set to 255. Why should we define it here?

In the file of the main experiment, "mask_arr" is converted to "mask" and called by: https://github.com/Britefury/cutmix-semisup-seg/blob/44e81b3ae862d2da7d1c4df77fb274f8f1f0a861/train_seg_semisup_mask_mt.py#L295

https://github.com/Britefury/cutmix-semisup-seg/blob/44e81b3ae862d2da7d1c4df77fb274f8f1f0a861/train_seg_semisup_mask_mt.py#L297

https://github.com/Britefury/cutmix-semisup-seg/blob/44e81b3ae862d2da7d1c4df77fb274f8f1f0a861/train_seg_semisup_mask_mt.py#L306

https://github.com/Britefury/cutmix-semisup-seg/blob/44e81b3ae862d2da7d1c4df77fb274f8f1f0a861/train_seg_semisup_mask_mt.py#L324

In the above code, a "loss_mask" is generated for unlabeled loss, i.e., the consistency loss. Are all elements of "loss_mask" equal to 1? Can you explain what it does?

Thanks in advance.

opened by ZHKKKe 2
How to run the jobs on multiple GPUs.

Great job for the semi-supervised semantic segmentation problem!

After checking the implementation, we could not find any code that supports running jobs on multiple GPUs.

We only found that the job function in job_helper.py takes a parameter "num_jobs", as shown below:

https://github.com/Britefury/cutmix-semisup-seg/blob/3a8839c7145052ef1e5d0e5862500c6c95b4a042/job_helper.py#L87-L116

opened by PkuRainBow 1
Add hubconf.py
This adds hubconf.py so that, for example, the following can work when "Ivan1248" is replaced with "Britefury":

model = torch.hub.load('Ivan1248/cutmix-semisup-seg', 'resnet101_deeplab_imagenet', num_classes=19)

If you would like, you can accept this or make changes.
opened by Ivan1248 0
Questions in converting cityscapes dataset

Thanks for your excellent work and for kindly releasing the code. I have a problem when reproducing the Cityscapes experiment.

Following your tutorial, I have reproduced the experimental results, 44.37 vs. 51.73 with 6.79 improvement under the CutMix setting. When delving into the code, I noticed that you implement a downsample_label_img function for downsampling the ground truth in convert_cityscapes.py. However, it is more common to use the nearest-neighbour downsampling method in cv2 or PIL. I then replaced the downsample_label_img function with the following: y_img = cv2.resize(y_img, (1024, 512), interpolation=cv2.INTER_NEAREST) and re-ran the Cityscapes experiment. The results degrade greatly, 43.79 vs. 47.02 with only 3.23 improvement.

I wonder why a different downsampling method affects the result so much, and what the motivation was for reimplementing a downsampling method rather than using a common one.

opened by jfzhuang 3
Improvement due to loss function or data augmentation?
Hi, I went through the paper and had a tough time understanding the gist of it. Is it:

CutMix for segmentation images improving the accuracy metrics, or

the consistency loss function improving your scores?

Put more naively, is the work about CutMix for segmentation (the original is for classification), or is it about the consistency loss?
opened by ramanathan831 0
Small concerns on the experiments in Table-4
Really nice work and super impressive results in the low-data regime (shown in Table 4, also pasted below)!!

We have some small concerns about the results marked with red circles:

In the 1/100 subset column, the baseline of DeepLabv3+/PSPNet is only 37.95%/36.69%, while your method achieves 59.52%/67.20% respectively. We would really appreciate it if you could explain in some detail why your method achieves such large gains. One of our small concerns is that applying the CutMix scheme + Mean Teacher scheme to the supervised baseline method without using unlabeled images might be a more reasonable baseline setting. It would be great if you could share some results for such a setting with us.

In the 1/8 subset column, the baseline performance of DeepLabv3+ is slightly worse than that of PSPNet. In our experience, DeepLabv3+ should perform much better. Could you share an explanation for this?

In the full set column, we observe that the performance of the proposed method is slightly worse than the baseline. Could you share your comments on the possible reasons?

According to your code, all the experiments fix the BN statistics and use a crop size of 321x321 on a single GPU. Do you have any plans to train, or have you ever trained, your method under a stronger setting, such as crop size 512x512 + SyncBN + 8x V100 GPUs? MMSegmentation or openseg.pytorch might be a good candidate codebase.

Many thanks for your valuable time; we look forward to your explanation!
opened by PkuRainBow 2
Requesting pretrained models

Hi, I found your paper really insightful and have been trying to replicate the results. Thanks for the code and the elaborate explanations with training scripts. I was wondering if you could share the pretrained models for the Cityscapes dataset (Cutout, CutMix and baseline models) for comparison and evaluation.

opened by prachigarg23 0

Shape-aware Semi-supervised 3D Semantic Segmentation for Medical Images

SASSnet Code for paper: Shape-aware Semi-supervised 3D Semantic Segmentation for Medical Images(MICCAI 2020) Our code is origin from UA-MT You can fin

125 Jan 3, 2023

Semi-supervised Semantic Segmentation with Directional Context-aware Consistency (CVPR 2021)

Semi-supervised Semantic Segmentation with Directional Context-aware Consistency (CAC) Xin Lai*, Zhuotao Tian*, Li Jiang, Shu Liu, Hengshuang Zhao, Li

137 Dec 14, 2022

Anti-Adversarially Manipulated Attributions for Weakly and Semi-Supervised Semantic Segmentation (CVPR 2021)

Anti-Adversarially Manipulated Attributions for Weakly and Semi-Supervised Semantic Segmentation Input Image Initial CAM Successive Maps with adversar

110 Dec 7, 2022

[CVPR 2021] Semi-Supervised Semantic Segmentation with Cross Pseudo Supervision

TorchSemiSeg [CVPR 2021] Semi-Supervised Semantic Segmentation with Cross Pseudo Supervision by Xiaokang Chen1, Yuhui Yuan2, Gang Zeng1, Jingdong Wang

387 Jan 8, 2023

ST++: Make Self-training Work Better for Semi-supervised Semantic Segmentation

ST++ This is the official PyTorch implementation of our paper: ST++: Make Self-training Work Better for Semi-supervised Semantic Segmentation. Lihe Ya

147 Jan 3, 2023

[CVPR 2022] Semi-Supervised Semantic Segmentation Using Unreliable Pseudo-Labels

Using Unreliable Pseudo Labels Official PyTorch implementation of Semi-Supervised Semantic Segmentation Using Unreliable Pseudo Labels, CVPR 2022. Ple

268 Dec 24, 2022

[cvpr22] Perturbed and Strict Mean Teachers for Semi-supervised Semantic Segmentation

PS-MT [cvpr22] Perturbed and Strict Mean Teachers for Semi-supervised Semantic Segmentation by Yuyuan Liu, Yu Tian, Yuanhong Chen, Fengbei Liu, Vasile

132 Jan 3, 2023

A Strong Baseline for Image Semantic Segmentation

A Strong Baseline for Image Semantic Segmentation Introduction This project is an open source semantic segmentation toolbox based on PyTorch. It is ba

49 Sep 20, 2022

Differentiable Optimizers with Perturbations in Pytorch

Differentiable Optimizers with Perturbations in PyTorch This contains a PyTorch implementation of Differentiable Optimizers with Perturbations in Tens

54 Jun 22, 2022

Official repository for "On Generating Transferable Targeted Perturbations" (ICCV 2021)

On Generating Transferable Targeted Perturbations (ICCV'21) Muzammal Naseer, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan, and Fatih Porikli Paper:

46 Nov 17, 2022

Learning Pixel-level Semantic Affinity with Image-level Supervision for Weakly Supervised Semantic Segmentation, CVPR 2018

Learning Pixel-level Semantic Affinity with Image-level Supervision This code is deprecated. Please see https://github.com/jiwoon-ahn/irn instead. Int

337 Dec 15, 2022

Recall Loss for Semantic Segmentation (This repo implements the paper: Recall Loss for Semantic Segmentation)

Recall Loss for Semantic Segmentation (This repo implements the paper: Recall Loss for Semantic Segmentation) Download Synthia dataset The model uses

32 Sep 21, 2022

UniMoCo: Unsupervised, Semi-Supervised and Full-Supervised Visual Representation Learning

UniMoCo: Unsupervised, Semi-Supervised and Full-Supervised Visual Representation Learning This is the official PyTorch implementation for UniMoCo pape

49 Jan 2, 2023

Project looking into use of autoencoder for semi-supervised learning and comparing data requirements compared to supervised learning.

2 Dec 17, 2021

Hybrid CenterNet - Hybrid-supervised object detection / Weakly semi-supervised object detection

Hybrid-Supervised Object Detection System Object detection system trained by hybrid-supervision/weakly semi-supervision (HSOD/WSSOD): This project is

5 Dec 10, 2022

[CVPR 2021] MiVOS - Mask Propagation module. Reproduced STM (and better) with training code :star2:. Semi-supervised video object segmentation evaluation.

MiVOS (CVPR 2021) - Mask Propagation Ho Kei Cheng, Yu-Wing Tai, Chi-Keung Tang [arXiv] [Paper PDF] [Project Page] [Papers with Code] This repo impleme

106 Jan 3, 2023

Semi Supervised Learning for Medical Image Segmentation, a collection of literature reviews and code implementations.

Semi-supervised-learning-for-medical-image-segmentation. Recently, semi-supervised image segmentation has become a hot topic in medical image computin

1.3k Jan 3, 2023

This repo holds code for TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation

TransUNet This repo holds code for TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation Usage

1.4k Jan 4, 2023