The implementation of "Bootstrapping Semantic Segmentation with Regional Contrast".

Shikun Liu

Last update: Dec 30, 2022

Related tags

Deep Learning semi-supervised-learning semantic-segmentation pascal-voc cityscapes contrastive-learning semi-supervised-methods

Overview

ReCo - Regional Contrast

This repository contains the source code of ReCo and baselines from the paper, Bootstrapping Semantic Segmentation with Regional Contrast, introduced by Shikun Liu, Shuaifeng Zhi, Edward Johns, and Andrew Davison.

Check out our project page for more qualitative results.

Datasets

ReCo is evaluated with three datasets: CityScapes, PASCAL VOC and SUN RGB-D in the full label mode, among which CityScapes and PASCAL VOC are additionally evaluated in the partial label mode.

For CityScapes, please download the original dataset from the official CityScapes site: leftImg8bit_trainvaltest.zip and gtFine_trainvaltest.zip. Create and extract them to the corresponding dataset/cityscapes folder.
For Pascal VOC, please download the original training images from the official PASCAL site: VOCtrainval_11-May-2012.tar and the augmented labels here: SegmentationClassAug.zip. Extract the folder JPEGImages and SegmentationClassAug into the corresponding dataset/pascal folder.
For SUN RGB-D, please download the train dataset here: SUNRGBD-train_images.tgz, test dataset here: SUNRGBD-test_images.tgz and labels here: sunrgbd_train_test_labels.tar.gz. Extract and place them into the corresponding dataset/sun folder.

After making sure all datasets have been downloaded and placed correctly, run python dataset/{DATASET}_preprocess.py for each dataset to prepare it for the experiments. The preprocessing scripts also generate partial labels for the CityScapes and Pascal VOC datasets with three random seeds. Feel free to modify the partial label size and random seed to suit your own research setting.

For the lazy ones: just download the off-the-shelf pre-processed datasets here: CityScapes, Pascal VOC and SUN RGB-D.

Training Supervised and Semi-supervised Models

In this paper, we introduce two novel training modes for semi-supervised learning.

Full Labels Partial Dataset: A sparse subset of training images has full ground-truth labels, with the remaining data unlabelled.
Partial Labels Full Dataset: All images have some labels, but covering only a sparse subset of pixels.
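
As a rough illustration of the Partial Labels Full Dataset mode, the sketch below keeps a fixed fraction of labelled pixels per class in one label map and marks the rest as unlabelled. The function name `sparsify_labels`, the `ignore_index` value of -1 and the rounding rule are assumptions made for this example; they are not taken from dataset/{DATASET}_preprocess.py.

```python
import random

def sparsify_labels(label_map, fraction, ignore_index=-1, seed=0):
    """Keep roughly `fraction` of the pixels of each class and hide the rest.

    label_map: 2D list of integer class ids.
    fraction:  share of pixels kept per class; 0 keeps exactly one pixel per class.
    """
    rng = random.Random(seed)

    # Group pixel coordinates by class id, skipping already unlabelled pixels.
    coords_by_class = {}
    for r, row in enumerate(label_map):
        for c, cls in enumerate(row):
            if cls != ignore_index:
                coords_by_class.setdefault(cls, []).append((r, c))

    # Start from a fully unlabelled map and restore the sampled pixels.
    sparse = [[ignore_index] * len(row) for row in label_map]
    for cls, coords in coords_by_class.items():
        keep = max(1, int(len(coords) * fraction))  # at least one pixel per class
        for r, c in rng.sample(coords, keep):
            sparse[r][c] = cls
    return sparse

labels = [[0, 0, 0, 0],
          [0, 1, 1, 0],
          [0, 1, 1, 0],
          [0, 0, 0, 0]]
one_pixel = sparsify_labels(labels, fraction=0.0)   # analogous to `p0`
quarter = sparsify_labels(labels, fraction=0.25)    # analogous to `p25`
```

Changing `seed` gives a different set of retained pixels, which is the role the three random seeds play in the preprocessing step.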

Running the following four scripts would train each mode with supervised or semi-supervised methods respectively:

python train_sup.py             # Supervised learning with full labels.
python train_semisup.py         # Semi-supervised learning with full labels.
python train_sup_partial.py     # Supervised learning with partial labels.
python train_semisup_partial.py # Semi-supervised learning with partial labels.

Important Flags

All supervised and semi-supervised methods can be trained with different flags (hyper-parameters) when running each training script. We briefly introduce some important flags for the experiments below.

Flag Name	Usage	Comments
`num_labels`	number of labelled images in the training set, choose `0` for training all labelled images	only available in the full label mode
`partial`	percentage of labeled pixels for each class in the training set, choose `p0, p1, p5, p25` for training 1, 1%, 5%, 25% labelled pixel(s) respectively	only available in the partial label mode
`num_negatives`	number of negative keys sampled for each class in each mini-batch	only applied when training with ReCo loss
`num_queries`	number of queries sampled for each class in each mini-batch	only applied when training with ReCo loss
`output_dim`	dimensionality for pixel-level representation	only applied when training with ReCo loss
`temp`	temperature used in contrastive learning	only applied when training with ReCo loss
`apply_aug`	semi-supervised methods with data augmentation, choose `cutout, cutmix, classmix`	only available in the semi-supervised methods; our implementations for CutOut, CutMix and ClassMix
`weak_threshold`	weak threshold `delta_w` in active sampling	only applied when training with ReCo loss
`strong_threshold`	strong threshold `delta_s` in active sampling	only applied when training with ReCo loss
`apply_reco`	apply our proposed ReCo loss	toggle on or off
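
To make `weak_threshold` and `strong_threshold` more concrete, here is a minimal sketch of threshold-based pixel selection. It assumes that pixels with prediction confidence above `delta_w` count as valid, and that valid pixels with confidence below `delta_s` count as hard queries. The function name and the threshold values in the example are illustrative; check the training scripts for the actual behaviour and defaults.

```python
def select_pixels(confidences, weak_threshold, strong_threshold):
    """Split pixel indices into valid pixels and hard queries.

    confidences: per-pixel maximum softmax probabilities in [0, 1].
    Assumption: valid means confidence > weak_threshold;
    a hard query is a valid pixel with confidence < strong_threshold.
    """
    valid = [i for i, p in enumerate(confidences) if p > weak_threshold]
    hard_queries = [i for i in valid if confidences[i] < strong_threshold]
    return valid, hard_queries

conf = [0.55, 0.72, 0.91, 0.99, 0.80]
valid, hard = select_pixels(conf, weak_threshold=0.7, strong_threshold=0.97)
```

In this example the pixel with confidence 0.55 is discarded as unreliable, and the pixel with confidence 0.99 is valid but too easy to serve as a hard query.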

Training ReCo + ClassMix with the fewest full label setting in each dataset (the rarest classes in each dataset appear in 5 training images):

python train_semisup.py --dataset pascal --num_labels 60 --apply_aug classmix --apply_reco
python train_semisup.py --dataset cityscapes --num_labels 20 --apply_aug classmix --apply_reco
python train_semisup.py --dataset sun --num_labels 50 --apply_aug classmix --apply_reco

Training ReCo + ClassMix with the fewest partial label setting in each dataset (each class in each training image only has 1 labelled pixel):

python train_semisup_partial.py --dataset pascal --partial p0 --apply_aug classmix --apply_reco
python train_semisup_partial.py --dataset cityscapes --partial p0 --apply_aug classmix --apply_reco
python train_semisup_partial.py --dataset sun --partial p0 --apply_aug classmix --apply_reco

Training ReCo + Supervised with all labelled data:

python train_sup.py --dataset {DATASET} --num_labels 0 --apply_reco

Training with ReCo is expected to require 12 - 16G of memory in a single GPU setting. All the other baselines can be trained under 12G in a single GPU setting.

Visualisation on Pre-trained Models

We additionally provide the pre-trained baselines and our method for 20 labelled Cityscapes and 60 labelled Pascal VOC, as examples for visualisation. The precise mIoU performance for each model is listed in the following table. The pre-trained models will produce the exact same qualitative results presented in the original paper.

	Supervised	ClassMix	ReCo + ClassMix
CityScapes (20 Labels)	38.10 [link]	45.13 [link]	50.14 [link]
Pascal VOC (60 Labels)	36.06 [link]	53.71 [link]	57.12 [link]

Download the pre-trained models with the links above, then create and place them into the folder model_weights in this repository. Run python visual.py to visualise the results.
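
For readers unfamiliar with the metric in the table, the following sketch computes mean intersection-over-union (mIoU) on a flat list of pixels. It is a simplified illustration: the function name and the `ignore_index` value of -1 are assumptions, and evaluation code typically accumulates counts over the whole validation set rather than per image.

```python
def mean_iou(pred, target, num_classes, ignore_index=-1):
    """Mean IoU over the classes that occur in the prediction or the target."""
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # unlabelled pixels do not count towards the score
        for cls in range(num_classes):
            in_pred, in_target = p == cls, t == cls
            intersection[cls] += in_pred and in_target
            union[cls] += in_pred or in_target
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0

pred   = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, -1]
score = mean_iou(pred, target, num_classes=3)  # per-class IoU: 1/2, 2/3, 1
```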

Other Notices

We observe that the performance for the full label semi-supervised setting on the CityScapes dataset is not stable across different machines: all methods may drop by 2-5%, though their ranking remains the same. Different GPUs in the same machine do not affect the performance. The performance for the other datasets in the full label mode, and the performance for all datasets in the partial label mode, is consistent.
Please use --seed 0, 1, 2 to reproduce our results, or to compare against them, with exactly the same labelled and unlabelled split that we used in our experiments.
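
The reason a fixed seed matters can be shown with a small sketch: the same seed always yields the same labelled and unlabelled split. The helper `split_labelled` is hypothetical and uses a plain shuffle; the repository's own split logic may differ (for example, it may enforce class coverage).

```python
import random

def split_labelled(image_ids, num_labels, seed):
    """Deterministically split image ids into labelled and unlabelled sets."""
    rng = random.Random(seed)       # private generator, independent of global state
    shuffled = list(image_ids)
    rng.shuffle(shuffled)
    return sorted(shuffled[:num_labels]), sorted(shuffled[num_labels:])

ids = [f"img_{i:04d}" for i in range(100)]
lab_a, unlab_a = split_labelled(ids, num_labels=20, seed=0)
lab_b, unlab_b = split_labelled(ids, num_labels=20, seed=0)  # same seed, same split
```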

Citation

If you found this code/work to be useful in your own research, please consider citing the following:

@article{liu2021reco,
    title={Bootstrapping Semantic Segmentation with Regional Contrast},
    author={Liu, Shikun and Zhi, Shuaifeng and Johns, Edward and Davison, Andrew J},
    journal={arXiv preprint arXiv:2104.04465},
    year={2021}
}

Contact

If you have any questions, please contact [email protected].

Comments

About the performance of training

I'm sorry to bother you. I see that your code works very well, so I wanted to try running it. However, when I run ReCo in the semi-supervised setting with 600 labels, the result is only 0.6719. Because of GPU memory limits, I changed the batch size from 10 to 6, and I haven't changed any other parameters. Why is the gap in results so large? I can't figure out the reason, so I'd like to ask for your advice. Thank you very much.

opened by swt199211 20
Some questions about Active Query Sampling

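Hello! I read this paper and learned a lot from it. I have a small question about Active Query Sampling, though. The strategy in the paper selects high-entropy pixels as queries, but high entropy means the model's prediction for that pixel is neither confident nor accurate. Pulling such a query towards its corresponding positive key is therefore not necessarily correct; that is, the query and the positive key may be wrongly pulled together, which could well lower the model's performance. What is your view on this?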

opened by chaochao42 15
Request for the code of DeepLab V2.

Hi, thanks for your sharing code, it's really an awesome work!

Can you release the code of DeepLab V2 which contains the projector? There's only code for DeepLab V3.

Currently I'm studying contrastive learning based on DeepLab V2. If you could provide the corresponding code for my reference, I would really appreciate it.

opened by super233 8
foreground & background class

I'm amazed by your work on the design of query and key sampling! But I'm confused about one part. I've tried the DeepLab series of models and, if I remember correctly, they have a class called background to remove the pixels we are not interested in. I couldn't find that part in your code, and I'm wondering how you run inference when you don't have the ground-truth data (such as in the code in visual.py). I'm also wondering how it influences training and model performance, since the model will have one more class even though it does not count towards the mIoU.

opened by HuangBugWei 7
RandomSampler when creating the dataloaders

Hello, I would like to ask why you are using the RandomSampler when the dataloaders are created for labeled and unlabeled data. For example, when there is more unlabeled data than labeled data, a specific number of samples is drawn from the unlabeled dataset in order to match the amount of labeled data. Let's say that we have 2000 labeled images and 9000 unlabeled images. From the set of 9000 we randomly select a subset of 2000 images at each training epoch, so we construct 2 dataloaders with the same length. Does this sampling make appropriate use of the whole unlabeled dataset?

Many thanks

opened by nysp78 7
question about Ablative Analysis

Hi, In Section 4.4 Ablative Analysis, the Effect of Active Sampling was verified. Could you please tell me how to conduct (Random Query,Random Key ) and other settings? Thanks!

opened by wwjwy 6
Number of Queries and keys

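Sorry to bother the author again. In the ablation study on the Number of Queries and Keys, does the number of keys refer to the number of negative samples (r_k^-)? Also, how was the query-key number study carried out? For example, when the number of queries is set to 32, 64, 128 and 256 in turn, is the corresponding number of keys fixed? If so, at what value? Thanks!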

opened by wwjwy 5
`Reflect' when filling the cropped images?

Hi, thanks for your great work in combining contrastive learning and semi-supervised semantic segmentation!

I'm a little bit confused about your code: https://github.com/lorenmt/reco/blob/main/build_data.py#L38 The images are filled in the reflect mode, including at the test stage. Is this fair when comparing with previous state-of-the-art methods, where the images are filled in the constant mode with a value of 0?
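
To illustrate the two fill modes being compared, the sketch below pads a single row of pixel values in both ways. It is a standalone illustration with a hypothetical helper `pad_row`, not the code in build_data.py; the reflect mode here mirrors the row without repeating the edge value.

```python
def pad_row(row, pad, mode="reflect", fill=0):
    """Pad a 1D row on both sides; reflect mode requires pad < len(row)."""
    if mode == "constant":
        return [fill] * pad + list(row) + [fill] * pad
    if mode == "reflect":
        # Mirror around the first and last element without repeating them.
        left = [row[i] for i in range(pad, 0, -1)]
        right = [row[-i - 1] for i in range(1, pad + 1)]
        return left + list(row) + right
    raise ValueError(f"unknown mode: {mode}")

row = [1, 2, 3, 4]
reflect = pad_row(row, 2, "reflect")    # [3, 2, 1, 2, 3, 4, 3, 2]
constant = pad_row(row, 2, "constant")  # [0, 0, 1, 2, 3, 4, 0, 0]
```

Reflect padding fills the border with plausible image content, whereas constant padding fills it with a flat value, which is why the two can lead to different predictions near the image edges.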

opened by Haochen-Wang409 5
understanding the reco loss

I am kinda new to contrast learning and although I completely follow the math, I have a confusion when I am looking at the code (by code I mean the general code, not only your code :) ).

So in the paper Eq.1 is the reco loss, which is the pixel wise contrastive loss, but in the code here I cannot understand why it is computed different from the way that Eq1 is written, is it a common thing in practice?

In more detail, in the code we have all_feat, which is 256x513x256, so for each sample the first one is the positive and the rest are the negatives. The seg_logits will then have dimension 256x513. We then compute the similarity via cosine_similarity (which I think is different from the contrastive loss in Eq. 1).

Then, in the next step, based on F.cross_entropy, we want all of them to have label zero, because they have to be the same as the positive one. My confusion is why we don't use Eq. 1 as it is written in the paper and why we use F.cross_entropy instead. I feel this is not exactly the same as Eq. 1. Can you please help me understand the relation?
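
The relation asked about here can be checked numerically. The sketch below uses a generic InfoNCE-style loss, not a transcription of Eq. 1 from the paper, and all values are made up. It shows that when the positive similarity is placed at index 0 of the logits and every similarity is divided by the temperature, softmax cross-entropy with target 0 gives the same value as the written-out contrastive loss.

```python
import math

def info_nce(pos_sim, neg_sims, temp):
    """Contrastive loss written out: -log(exp(pos/t) / (exp(pos/t) + sum exp(neg/t)))."""
    numerator = math.exp(pos_sim / temp)
    denominator = numerator + sum(math.exp(s / temp) for s in neg_sims)
    return -math.log(numerator / denominator)

def cross_entropy(logits, target):
    """Softmax cross-entropy for one sample, using a stable log-sum-exp."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]

temp = 0.5
pos_sim = 0.8                 # cosine similarity between query and positive key
neg_sims = [0.3, -0.1, 0.5]   # cosine similarities between query and negative keys

# Positive first, negatives after: the "correct class" is therefore index 0.
logits = [pos_sim / temp] + [s / temp for s in neg_sims]
loss_a = info_nce(pos_sim, neg_sims, temp)
loss_b = cross_entropy(logits, target=0)
```

Under these assumptions `loss_a` and `loss_b` agree up to floating-point error, so using a cross-entropy routine with all-zero labels is a common, numerically stable way to implement this kind of loss.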

opened by seyeeet 4
how can I learn the decision boundaries?

Hello,

Thank you very much for this interesting work. Can you please let me know how I can compute the decision boundary as you show in the last page of the paper ?

opened by seyeeet 4
About Random Sampler at the supervised setting

Hello,

When you are creating the dataloader for labeled data in the supervised setting, you are using a random sampler that samples a specific amount of labeled data from the total available labeled images.

num_samples = self.batch_size * 200 # for total 40k iterations with 200 epochs

train_l_loader = torch.utils.data.DataLoader(
    train_l_dataset,
    batch_size=self.batch_size,
    sampler=sampler.RandomSampler(data_source=train_l_dataset, replacement=True, num_samples=num_samples),
    drop_last=True,
)

Why are you using this sampler and forward pass a subset of labeled images instead of the whole amount of labeled data? Is it an effective training approach?
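
As a rough illustration of what such a sampler does, the sketch below imitates sampling with replacement using only the standard library. It is a stand-in for the behaviour of a random sampler with replacement=True and a fixed num_samples, not the repository's code; the dataset size of 60 and batch size of 10 are example values.

```python
import random

def sample_with_replacement(dataset_size, num_samples, seed=0):
    """Draw `num_samples` dataset indices uniformly, allowing repeats."""
    rng = random.Random(seed)
    return [rng.randrange(dataset_size) for _ in range(num_samples)]

batch_size = 10
num_samples = batch_size * 200   # 200 iterations per epoch, as in the quoted comment
indices = sample_with_replacement(dataset_size=60, num_samples=num_samples)
batches = [indices[i:i + batch_size] for i in range(0, len(indices), batch_size)]
```

With this scheme the number of iterations per epoch is fixed at 200 regardless of how many labelled images exist, so a small labelled set is revisited many times per epoch rather than defining a very short epoch.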

Thanks

opened by nysp78 3

Owner

Shikun Liu

Ph.D. Student, The Dyson Robotics Lab at Imperial College.

GitHub https://shikun.io/projects/regional-contrast
