This is the Python code for the paper "Contrastive Coding for Active Learning under Class Distribution Mismatch".

Last update: Dec 22, 2022

Overview

Contrastive Coding for Active Learning under Class Distribution Mismatch

Official PyTorch implementation of "Contrastive Coding for Active Learning under Class Distribution Mismatch" (ICCV 2021).

1. Requirements

Environments

The code currently requires the following packages:

CUDA 10.1+
python == 3.7.9
pytorch == 1.7.1
torchvision == 0.8.2
scikit-learn == 0.24.0
tensorboardx == 2.1
matplotlib == 3.3.3
numpy == 1.19.2
scipy == 1.5.3
apex == 0.1
diffdist == 0.1
pytorch-gradual-warmup-lr packages
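
As a quick sanity check before training, the pinned versions above can be compared against the active environment. The following is a minimal sketch, not part of the repository: the helper name find_mismatches is hypothetical, the PyPI distribution names in REQUIRED are assumptions, and apex, diffdist, and pytorch-gradual-warmup-lr are left out because they are usually installed from source.

```python
# Hypothetical helper, not shipped with CCAL: compare installed package
# versions against the versions pinned in this README.
try:
    from importlib.metadata import version, PackageNotFoundError  # Python 3.8+
except ImportError:
    # Python 3.7 needs the importlib_metadata backport to be installed.
    from importlib_metadata import version, PackageNotFoundError

# Distribution names are assumptions about how the packages are named on PyPI.
REQUIRED = {
    "torch": "1.7.1",
    "torchvision": "0.8.2",
    "scikit-learn": "0.24.0",
    "tensorboardX": "2.1",
    "matplotlib": "3.3.3",
    "numpy": "1.19.2",
    "scipy": "1.5.3",
}


def find_mismatches(required, lookup=version):
    """Return {package: (wanted, found)} for each package whose version differs."""
    problems = {}
    for name, wanted in required.items():
        try:
            found = lookup(name)
        except PackageNotFoundError:
            found = None  # package is not installed at all
        # Ignore local build tags such as "+cu101" when comparing.
        if found is None or found.split("+")[0] != wanted:
            problems[name] = (wanted, found)
    return problems


if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches(REQUIRED).items():
        print(f"{name}: expected {wanted}, found {found}")
```

An empty result means every listed package matches its pinned version.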

Datasets

For CIFAR10 and CIFAR100, we provide a function that automatically downloads and preprocesses the data. You can also download the datasets from the link; in that case, place them in ~/data.

2. Training

All code examples currently assume a distributed launch on 4 GPUs. To run the code on a single GPU, remove -m torch.distributed.launch --nproc_per_node=4.
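
The single-GPU versus multi-GPU rule can be captured in a small command builder. This is a sketch only: build_command and visible_devices are hypothetical helper names, and the assumption that the first n devices should be exposed through CUDA_VISIBLE_DEVICES is ours, not the repository's.

```python
def build_command(script, script_args, n_gpus=4):
    """Return the argv list for launching `script` on `n_gpus` GPUs."""
    if n_gpus < 1:
        raise ValueError("n_gpus must be at least 1")
    cmd = ["python"]
    if n_gpus > 1:
        # Multi-GPU: keep the distributed launcher used in this README.
        cmd += ["-m", "torch.distributed.launch", f"--nproc_per_node={n_gpus}"]
    # Single GPU: the launcher arguments are simply omitted.
    cmd.append(script)
    cmd += list(script_args)
    return cmd


def visible_devices(n_gpus):
    """Value for CUDA_VISIBLE_DEVICES that exposes the first n_gpus devices."""
    return ",".join(str(i) for i in range(n_gpus))
```

For example, build_command("contrast_main.py", ["--mismatch", "0.8"], n_gpus=1) yields a plain python contrast_main.py invocation, while n_gpus=4 reproduces the launcher prefix used in the commands in this section.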

Semantic feature extraction

To train the semantic feature extractor described in the paper, run this command:

CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch --nproc_per_node=4 contrast_main.py --mismatch 0.8 --dataset <DATASET> --model <NETWORK> --mode senmatic --shift_trans_type none --batch_size 32 --epoch <EPOCH> --logdir './model/semantic'

Option
For CIFAR10, set --dataset cifar10; otherwise set --dataset cifar100.
In our experiments, we set --epoch 700 for cifar10 and --epoch 2000 for cifar100.
We set --mismatch to 0.2, 0.4, 0.6, and 0.8.
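
To reproduce all four mismatch settings, the semantic-stage commands can be generated in a loop. The sketch below is illustrative only: semantic_commands is a hypothetical helper, and the per-setting log directory layout (a subfolder named after the dataset and mismatch value) is our assumption to keep runs from overwriting each other. If you use it, point --load_senmatic_path at the matching folder in the query step. The CUDA_VISIBLE_DEVICES prefix is omitted and the network name is passed through unchanged.

```python
import shlex

# Semantic-stage epochs reported in this README.
EPOCHS = {"cifar10": 700, "cifar100": 2000}
MISMATCHES = [0.2, 0.4, 0.6, 0.8]


def semantic_commands(dataset, network, logdir_root="./model/semantic"):
    """Build one semantic-stage training command per mismatch value."""
    commands = []
    for mismatch in MISMATCHES:
        args = [
            "python", "-m", "torch.distributed.launch", "--nproc_per_node=4",
            "contrast_main.py",
            "--mismatch", str(mismatch),
            "--dataset", dataset,
            "--model", network,
            "--mode", "senmatic",  # spelling as used by the repository's flag
            "--shift_trans_type", "none",
            "--batch_size", "32",
            "--epoch", str(EPOCHS[dataset]),
            # Hypothetical layout: one log directory per dataset and mismatch.
            "--logdir", f"{logdir_root}/{dataset}_{mismatch}",
        ]
        commands.append(" ".join(shlex.quote(a) for a in args))
    return commands


if __name__ == "__main__":
    for line in semantic_commands("cifar10", network="<NETWORK>"):
        print(line)
```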

Distinctive feature extraction

To train the distinctive feature extractor described in the paper, run this command:

CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch --nproc_per_node=4 contrast_main.py --mismatch 0.8 --dataset <DATASET> --model <NETWORK> --mode feature --shift_trans_type rotation --batch_size 32 --epoch 700 --logdir './model/distinctive'

Option
For CIFAR10, set --dataset cifar10; otherwise set --dataset cifar100.
In our experiments, we set --epoch 700 for both cifar10 and cifar100.
We set --mismatch to 0.2, 0.4, 0.6, and 0.8.

Joint query strategy

To select samples from the unlabeled dataset as described in the paper, run this command:

CUDA_VISIBLE_DEVICES=0 python active_main.py --mode eval --k 100.0 --t 0.9 --dataset <DATASET> --model <NETWORK> --mismatch <MISMATCH> --target <INT> --shift_trans_type rotation --print_score --ood_samples 10 --resize_factor 0.54 --resize_fix --load_feature_path './model/distinctive/last.model' --load_senmatic_path './model/semantic/last.model'  --load_path './model'

Option
For CIFAR10, set --dataset cifar10; otherwise set --dataset cifar100.
The value of --mismatch lies between 0 and 1. In our experiments, we set it to 0.2, 0.4, 0.6, and 0.8.
--target represents the number of queried samples in each category in each AL cycle.

This gives the indices of the samples queried in each active learning cycle. Taking mismatch=0.8 as an example, the indices should be added to CCAL_master/train_classifier/get_index_80.
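
The file name appears to encode the mismatch ratio as a percentage. The sketch below extrapolates that pattern from the single mismatch=0.8 example above, so treat the naming for other values as an assumption; index_file_for is a hypothetical helper, not a function in the repository.

```python
def index_file_for(mismatch, root="CCAL_master/train_classifier"):
    """Guess the index file path for a mismatch ratio (0.8 -> .../get_index_80)."""
    if not 0.0 <= mismatch <= 1.0:
        raise ValueError("mismatch must lie between 0 and 1")
    # round() avoids floating-point artifacts such as 0.29 * 100 == 28.999...
    percent = round(mismatch * 100)
    return f"{root}/get_index_{percent}"
```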

3. Evaluation

To evaluate the performance of CCAL, we provide a script that trains a classifier, located in CCAL_master/train_classifier. Run this command to train the classifier:

CUDA_VISIBLE_DEVICES=0 python main.py --cuda --split <CYCLES> --dataset <DATASET> --mismatch <MISMATCH> --number <NUMBER> --epoch 100

Option
For CIFAR10, set --dataset cifar10; otherwise set --dataset cifar100.
The value of --mismatch lies between 0 and 1. In our experiments, we set it to 0.2, 0.4, 0.6, and 0.8. Use the same mismatch value as in the training and query steps.
--number indicates the cycle of active learning.
--epoch is the number of training epochs in each active learning cycle. In our experiments, we set --epoch 100.
--split represents the cycles of active learning.

Then, we can get the average of the accuracies over 5 runs (random seed = 0,1,2,3,4,5).
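
Averaging across runs can be done per active learning cycle. The following is a minimal sketch under our own assumptions: each run is a list of accuracies with one entry per cycle, and average_over_runs is a hypothetical helper rather than part of train_classifier.

```python
from statistics import mean


def average_over_runs(runs):
    """Average accuracy per AL cycle across runs (one list of accuracies per seed)."""
    if not runs:
        raise ValueError("need at least one run")
    n_cycles = len(runs[0])
    if any(len(run) != n_cycles for run in runs):
        raise ValueError("every run must cover the same number of cycles")
    # zip(*runs) groups the accuracies of all seeds for the same cycle.
    return [mean(cycle_values) for cycle_values in zip(*runs)]
```

Passing one list per random seed returns a single list whose i-th entry is the mean accuracy at cycle i.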

4. Citation

@InProceedings{Du_2021_ICCV,
    author    = {Du, Pan and Zhao, Suyun and Chen, Hui and Chai, Shuwen and Chen, Hong and Li, Cuiping},
    title     = {Contrastive Coding for Active Learning Under Class Distribution Mismatch},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {8927-8936}
}

5. Reference

@inproceedings{tack2020csi,
  title={CSI: Novelty Detection via Contrastive Learning on Distributionally Shifted Instances},
  author={Jihoon Tack and Sangwoo Mo and Jongheon Jeong and Jinwoo Shin},
  booktitle={Advances in Neural Information Processing Systems},
  year={2020}
}

Comments

Joint Query Strategy

"--target represents the number of queried samples in each category in each AL cycle." To the best of my understanding, the variable target is mentioned neither in the code nor in the paper. I think it is a crucial variable for the experiments.

opened by zlaskar

Code Run Issues

I found the following errors:

1. https://github.com/RUC-DWBI-ML/CCAL/blob/main/contrast_main.py#L17 "from evals.total.eval import test_classifier": the module 'total' is not present. Perhaps it should be evals.eval.

2. https://github.com/RUC-DWBI-ML/CCAL/blob/main/active_main.py#L143 model_senmatic = C.get_shift_classifer(model_senmatic, args.K_shift).to(device). Here args.K_shift should be 1 for model_senmatic. In the code it is set to 4, which is correct only for model_feature.

3. https://github.com/RUC-DWBI-ML/CCAL/blob/main/evals/eval.py#L13 The issue in point 2 carries over to the function eval_unlabeled_detection(). Here P.K_shift should also be 1 for the lines connected with semantic features.

4. The train_classifier/main.py should generate the CCAL_master/train_classifier/get_index_80.py file, but it does not. The function that writes labeled indices to a file is not present.

All of the above issues can be fixed, but they will take newcomers some time to figure out. It would be great if you could fix them and share the pre-trained semantic and distinctive model encoders.
opened by zlaskar
