[NeurIPS 2020] Semi-Supervision (Unlabeled Data) & Self-Supervision Improve Class-Imbalanced / Long-Tailed Learning

Rethinking the Value of Labels for Improving Class-Imbalanced Learning

This repository contains the implementation code for paper:
Rethinking the Value of Labels for Improving Class-Imbalanced Learning
Yuzhe Yang and Zhi Xu
34th Conference on Neural Information Processing Systems (NeurIPS), 2020
[Website] [arXiv] [Paper] [Slides] [Video]

If you find this code or idea useful, please consider citing our work:

@inproceedings{yang2020rethinking,
  title={Rethinking the Value of Labels for Improving Class-Imbalanced Learning},
  author={Yang, Yuzhe and Xu, Zhi},
  booktitle={Conference on Neural Information Processing Systems (NeurIPS)},
  year={2020}
}

Overview

In this work, we show theoretically and empirically that both semi-supervised learning (using unlabeled data) and self-supervised pre-training (first pre-training the model with self-supervision) can substantially improve performance on imbalanced (long-tailed) datasets, regardless of the degree of imbalance in the labeled/unlabeled data and of the base training technique.

Semi-Supervised Imbalanced Learning: Using unlabeled data helps to shape clearer class boundaries and results in better class separation, especially for the tail classes.

Self-Supervised Imbalanced Learning: Self-supervised pre-training (SSP) helps mitigate tail-class leakage during testing, which results in better learned boundaries and representations.

Installation

Prerequisites

Dependencies

  • PyTorch (>= 1.2, tested on 1.4)
  • yaml
  • scikit-learn
  • TensorboardX

Code Overview

Main Files

Main Arguments

  • --dataset: name of chosen long-tailed dataset
  • --imb_factor: imbalance factor (inverse value of imbalance ratio \rho in the paper)
  • --imb_factor_unlabel: imbalance factor for unlabeled data (inverse value of unlabel imbalance ratio \rho_U)
  • --pretrained_model: path to self-supervised pre-trained models
  • --resume: path to resume checkpoint (also for evaluation)

Getting Started

Semi-Supervised Imbalanced Learning

Unlabeled data sourcing

CIFAR-10-LT: CIFAR-10 unlabeled data is prepared following this repo using the 80M TinyImages. In short, a data sourcing model is trained to distinguish CIFAR-10 classes and a "non-CIFAR" class. For each class, images are then ranked by prediction confidence, and unlabeled (imbalanced) datasets are constructed accordingly. Use the following link to download the prepared unlabeled data, and place it in your data_path:

SVHN-LT: The SVHN dataset itself includes an extra split with 531.1K additional (labeled) samples, so these are used directly to simulate the unlabeled dataset.

Note that the class imbalance in unlabeled data is also considered, which is controlled by --imb_factor_unlabel (\rho_U in the paper). See imbalance_cifar.py and imbalance_svhn.py for details.
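
The confidence-based ranking described above can be sketched roughly as follows. This is a minimal illustration, not the repository's sourcing pipeline: the function name, the (image_id, predicted_class, confidence) tuple layout, and the per-class quota dictionary are all hypothetical.

```python
def build_unlabeled_subset(predictions, counts_per_class):
    """Keep the most confident candidates for each target class.

    predictions: list of (image_id, predicted_class, confidence) tuples from a
        hypothetical data-sourcing model.
    counts_per_class: dict mapping class index -> number of images to keep,
        e.g. quotas derived from --imb_factor_unlabel.
    Candidates predicted as a class without a quota (such as "non-CIFAR")
    are ignored.
    """
    by_class = {}
    for image_id, cls, conf in predictions:
        by_class.setdefault(cls, []).append((conf, image_id))

    selected = {}
    for cls, keep in counts_per_class.items():
        ranked = sorted(by_class.get(cls, []), reverse=True)  # highest confidence first
        selected[cls] = [image_id for _, image_id in ranked[:keep]]
    return selected


if __name__ == "__main__":
    preds = [("a", 0, 0.9), ("b", 0, 0.6), ("c", 1, 0.8), ("d", 1, 0.95), ("e", 0, 0.7)]
    print(build_unlabeled_subset(preds, {0: 2, 1: 1}))
```

Giving each class a different quota is what makes the resulting unlabeled set imbalanced.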

Semi-supervised learning with pseudo-labeling

To perform pseudo-labeling (self-training), a base classifier is first trained on the original imbalanced dataset. With the trained base classifier, pseudo-labels can be generated using

python gen_pseudolabels.py --resume <ckpt-path> --data_dir <data_path> --output_dir <output_path> --output_filename <save_name>
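
Conceptually, pseudo-labeling assigns each unlabeled image the class the base classifier scores highest. The sketch below shows that step on plain Python lists; it assumes hard argmax labels with no confidence threshold, and the function name is hypothetical. How gen_pseudolabels.py handles this in detail is not stated in this README.

```python
def pseudo_labels(logits_per_sample):
    """Assign each unlabeled sample the index of its highest-scoring class."""
    labels = []
    for logits in logits_per_sample:
        labels.append(max(range(len(logits)), key=lambda k: logits[k]))
    return labels


if __name__ == "__main__":
    # Two unlabeled samples scored over three classes by a base classifier.
    scores = [[0.1, 2.0, -1.0], [3.0, 0.5, 0.2]]
    print(pseudo_labels(scores))
```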

We provide generated pseudo label files for CIFAR-10-LT & SVHN-LT with \rho=50, using base models trained with standard cross-entropy (CE) loss:

For example, to train with unlabeled data on CIFAR-10-LT with \rho=50 and \rho_U=50:

python train_semi.py --dataset cifar10 --imb_factor 0.02 --imb_factor_unlabel 0.02

Self-Supervised Imbalanced Learning

Self-supervised pre-training (SSP)

To perform Rotation SSP on CIFAR-10-LT with \rho=100

python pretrain_rot.py --dataset cifar10 --imb_factor 0.01
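
Rotation-based self-supervision is commonly set up as a four-way classification problem: each image is rotated by 0, 90, 180, and 270 degrees, and the network predicts which rotation was applied, so no class labels are needed. The sketch below builds those (image, rotation label) pairs on a nested-list image; it is an illustration of the pretext task under that assumption, not the code in pretrain_rot.py, and the helper names are hypothetical.

```python
def rotate90(image):
    """Rotate a 2-D image (list of rows) by 90 degrees counter-clockwise."""
    return [list(row) for row in zip(*image)][::-1]


def rotation_task_samples(image):
    """Return (rotated_image, label) pairs for 0, 90, 180 and 270 degrees."""
    samples = []
    current = [list(row) for row in image]
    for label in range(4):
        samples.append((current, label))
        current = rotate90(current)
    return samples


if __name__ == "__main__":
    for rotated, label in rotation_task_samples([[1, 2], [3, 4]]):
        print(label, rotated)
```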

To perform MoCo SSP on ImageNet-LT

python pretrain_moco.py --dataset imagenet --data <data_path>

Network training with SSP models

Train on CIFAR-10-LT with \rho=100

python train.py --dataset cifar10 --imb_factor 0.01 --pretrained_model <path_to_ssp_model>

Train on ImageNet-LT / iNaturalist 2018

python -m imagenet_inat.main --cfg <path_to_ssp_config> --model_dir <path_to_ssp_model>

Results and Models

All related data and checkpoints can be found via this link. Individual results and checkpoints are detailed as follows.

Semi-Supervised Imbalanced Learning

CIFAR-10-LT

Model | Top-1 Error | Download
CE + D_U@5x (\rho=50 and \rho_U=1) | 16.79 | ResNet-32
CE + D_U@5x (\rho=50 and \rho_U=25) | 16.88 | ResNet-32
CE + D_U@5x (\rho=50 and \rho_U=50) | 18.36 | ResNet-32
CE + D_U@5x (\rho=50 and \rho_U=100) | 19.94 | ResNet-32

SVHN-LT

Model | Top-1 Error | Download
CE + D_U@5x (\rho=50 and \rho_U=1) | 13.07 | ResNet-32
CE + D_U@5x (\rho=50 and \rho_U=25) | 13.36 | ResNet-32
CE + D_U@5x (\rho=50 and \rho_U=50) | 13.16 | ResNet-32
CE + D_U@5x (\rho=50 and \rho_U=100) | 14.54 | ResNet-32

Test a pretrained checkpoint

python train_semi.py --dataset cifar10 --resume <ckpt-path> -e

Self-Supervised Imbalanced Learning

CIFAR-10-LT

  • Self-supervised pre-trained models (Rotation)

    Dataset Setting | \rho=100 | \rho=50 | \rho=10
    Download | ResNet-32 | ResNet-32 | ResNet-32
  • Final models (200 epochs)

    Model | \rho | Top-1 Error | Download
    CE(Uniform) + SSP | 10 | 12.28 | ResNet-32
    CE(Uniform) + SSP | 50 | 21.80 | ResNet-32
    CE(Uniform) + SSP | 100 | 26.50 | ResNet-32
    CE(Balanced) + SSP | 10 | 11.57 | ResNet-32
    CE(Balanced) + SSP | 50 | 19.60 | ResNet-32
    CE(Balanced) + SSP | 100 | 23.47 | ResNet-32

CIFAR-100-LT

  • Self-supervised pre-trained models (Rotation)

    Dataset Setting | \rho=100 | \rho=50 | \rho=10
    Download | ResNet-32 | ResNet-32 | ResNet-32
  • Final models (200 epochs)

    Model | \rho | Top-1 Error | Download
    CE(Uniform) + SSP | 10 | 42.93 | ResNet-32
    CE(Uniform) + SSP | 50 | 54.96 | ResNet-32
    CE(Uniform) + SSP | 100 | 59.60 | ResNet-32
    CE(Balanced) + SSP | 10 | 41.94 | ResNet-32
    CE(Balanced) + SSP | 50 | 52.91 | ResNet-32
    CE(Balanced) + SSP | 100 | 56.94 | ResNet-32

ImageNet-LT

  • Self-supervised pre-trained models (MoCo)
    [ResNet-50]

  • Final models (90 epochs)

    Model | Top-1 Error | Download
    CE(Uniform) + SSP | 54.4 | ResNet-50
    CE(Balanced) + SSP | 52.4 | ResNet-50
    cRT + SSP | 48.7 | ResNet-50

iNaturalist 2018

  • Self-supervised pre-trained models (MoCo)
    [ResNet-50]

  • Final models (90 epochs)

    Model | Top-1 Error | Download
    CE(Uniform) + SSP | 35.6 | ResNet-50
    CE(Balanced) + SSP | 34.1 | ResNet-50
    cRT + SSP | 31.9 | ResNet-50

Test a pretrained checkpoint

# test on CIFAR-10 / CIFAR-100
python train.py --dataset cifar10 --resume <ckpt-path> -e

# test on ImageNet-LT / iNaturalist 2018
python -m imagenet_inat.main --cfg <path_to_ssp_config> --model_dir <path_to_model> --test

Acknowledgements

This code is partly based on the open-source implementations from the following sources: OpenLongTailRecognition, classifier-balancing, LDAM-DRW, MoCo, and semisup-adv.

Contact

If you have any questions, feel free to contact us through email ([email protected] & [email protected]) or GitHub issues. Enjoy!

Comments
  • Questions about self-supervised learning on cifar10

    Thanks for sharing the codes! This work is really interesting to me. My questions are as follows:

    I'm trying to reproduce the results in Table 2. Specifically, I trained the models with and without self-supervised pre-training (SSP). However, the baselines (without SSP) consistently outperform those with SSP under different training rules (including None, Resample, and Reweight). The best precisions are presented below. I ran each experimental setting twice to check whether the results are stable, so there are two numbers per cell.

    [Image: results table]

    For your reference, I used the following commands:

    • Train Rotation
    python pretrain_rot.py --dataset cifar10  --imb_factor 0.01 --arch resnet32
    
    • Train baseline
    python train.py --dataset cifar10 --imb_factor 0.01 --arch resnet32 --train_rule None 
    
    • Train baseline + SSP
    python train.py --dataset cifar10 --imb_factor 0.01 --arch resnet32 --train_rule None --pretrained_model xxx 
    
    opened by ZezhouCheng 7
  • error python pretrain_rot.py --dataset cifar10 --imb_factor 0.01

    When I run python pretrain_rot.py --dataset cifar10 --imb_factor 0.01, I get RuntimeError: Given input size: (2048x1x1). Calculated output size: (2048x0x0). Output size is too small. How should I modify the code?

    opened by madoka109 4
  • Can't achieve the given performance: ResNet-50 + SSP + CE(Uniform) for ImageNet-LT

    I downloaded the pre-trained model from the given path (Resnet-50-rot) and trained the model with the given config imagenet_inat/config/ImageNet_LT/feat_uniform.yaml. The training command is: python imb_cls/imagenet_inat/main.py --cfg 'imb_cls/imagenet_inat/config/ImageNet_LT/feat_uniform.yaml' --model_dir workdir/pretrain/moco_ckpt_0200.pth.tar. I only get 41.1 top-1 accuracy, but the given model achieves 45.6 [CE(Uniform) + SSP].

    Can you help me check where the problem is?

    opened by ChCh1999 3
  • Question about the Self-supervised pre-trained models (MoCo)

    Thanks for your excellent code! I reproduced the results of CE(Uniform) + SSP and cRT + SSP on ImageNet-LT based on the self-supervised pre-trained models (MoCo), and got the same results as reported in your paper. But I still have a question about the MoCo SSP checkpoint. I directly evaluated the performance of the MoCo checkpoint + cRT (without CE(Uniform) supervised training), and the accuracy is 0.118, which is not good. According to the original MoCo paper, the accuracy of MoCo on full ImageNet should be 0.60+, which is not far from supervised learning. So is the 0.118 accuracy reasonable? It is much lower than the supervised accuracy on ImageNet-LT.

    opened by seekingup 3
  • Have you ever tried "Semi-Supervised Imbalanced Learning on ImageNet-LT"?

    Hi, have you ever tried "Semi-Supervised Imbalanced Learning" on ImageNet-LT?

    According to the experiment result in the paper, the performance with Semi-Supervised Imbalanced Learning seems better than Self-Supervised Imbalanced Learning on CIFAR-10-LT.

    If I want to try this experiment, how can I modify dataset/imagenet.py into dataset/imbalance_imagenet.py (similar to imbalance_cifar.py)?

    opened by e96031413 2
  • Some problems about the assumption in the paper.

    Hi, I'm very interested in your paper, especially the proofs. However, I have some questions about understanding them.

    "We assume a properly designed black-box self-supervised task so that the learned representation is Z = k1 ||X||^{2} + k2, where k1, k2 > 0. Precisely, this means that we have access to the new features Zi for the i-th data after the black-box self-supervised step, without knowing explicitly what the transformation ψ is. "

    I'm confused about the following: (1) Why can a properly designed black-box self-supervised task obtain the learned representation Z = k1 ||X||^{2} + k2? Do MoCo or rotation-based self-supervised methods satisfy this assumption?

    (2) Why can't the supervised classification task obtain a similar representation, Z = k1 ||X||^{2} + k2?

    opened by jiequancui 2
  • What is the intended learning rate schedule?

    https://github.com/YyzHarry/imbalanced-semi-self/blob/16d8f02264d9e16602d1a47acc43053b6bb007c4/utils.py#L28-L39

    Hi, thanks for sharing your code!

    I have a question about the referenced code above. In the 'adjust_learning_rate' function, lines 34 and 35 will never be reached. May I ask which learning rate schedule you used for the experiments in the paper?

    According to the 'adjust_learning_rate' function, the learning rate may change as follows.

    epoch: lr
    0: args.lr * 1 / 5
    1: args.lr * 2 / 5
    2: args.lr * 3 / 5
    3: args.lr * 4 / 5
    4: args.lr * 5 / 5
    5 ~ 160: args.lr
    161~: args.lr * 0.01
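
    The schedule listed in this comment can be written as a small function. This sketch only reproduces the commenter's table (linear warm-up over the first five epochs, then a single drop after epoch 160); it is not confirmed to be the schedule used for the paper, and the function name is hypothetical.

```python
def lr_at_epoch(base_lr, epoch):
    """Learning rate per the table in this comment (not verified against the paper)."""
    if epoch < 5:
        return base_lr * (epoch + 1) / 5  # linear warm-up over epochs 0-4
    if epoch > 160:
        return base_lr * 0.01             # single decay step after epoch 160
    return base_lr                        # epochs 5-160 use the base rate


if __name__ == "__main__":
    for epoch in (0, 4, 5, 160, 161):
        print(epoch, lr_at_epoch(0.1, epoch))
```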

    opened by ChanghwaPark 1
  • Where can I set CE(Uniform) and CE(Balanced)?

    I see that self-supervised pre-training (SSP) is combined with several models:

    1. CE(Uniform) + SSP
    2. CE(Balanced) + SSP

    Where can I set CB in train.py? It seems that per_cls_weights decides between uniform and balanced weighting. Does the CB setting correspond to 'Reweight' in args.train_rule?

        if args.train_rule == 'Reweight':
            beta = 0.9999
            effective_num = 1.0 - np.power(beta, cls_num_list)
            per_cls_weights = (1.0 - beta) / np.array(effective_num)
            per_cls_weights = per_cls_weights / np.sum(per_cls_weights) * len(cls_num_list)
            per_cls_weights = torch.FloatTensor(per_cls_weights).cuda(args.gpu)
        elif args.train_rule == 'DRW':
            idx = epoch // 160
            betas = [0, 0.9999]
            effective_num = 1.0 - np.power(betas[idx], cls_num_list)
            per_cls_weights = (1.0 - betas[idx]) / np.array(effective_num)
            per_cls_weights = per_cls_weights / np.sum(per_cls_weights) * len(cls_num_list)
            per_cls_weights = torch.FloatTensor(per_cls_weights).cuda(args.gpu)
        else:
            per_cls_weights = None
    
        if args.loss_type == 'CE':
            criterion = nn.CrossEntropyLoss(weight=per_cls_weights).cuda(args.gpu)
        elif args.loss_type == 'LDAM':
            criterion = LDAMLoss(cls_num_list=cls_num_list, max_m=0.5, s=30, weight=per_cls_weights).cuda(args.gpu)
        elif args.loss_type == 'Focal':
            criterion = FocalLoss(weight=per_cls_weights, gamma=1).cuda(args.gpu)
        else:
            warnings.warn('Loss type is not listed')
            return
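
    To make the weighting in the quoted snippet concrete, the sketch below repeats the same effective-number computation without NumPy or PyTorch. The function name and the example class counts are made up for illustration. Note that beta = 0 yields a weight of 1 for every class, which is what the quoted 'DRW' branch uses before epoch 160.

```python
def class_balanced_weights(cls_num_list, beta):
    """Effective-number class weights, normalized to sum to the number of classes."""
    raw = [(1.0 - beta) / (1.0 - beta ** n) for n in cls_num_list]
    total = sum(raw)
    return [w / total * len(cls_num_list) for w in raw]


if __name__ == "__main__":
    counts = [5000, 50]  # hypothetical head and tail class sizes
    print(class_balanced_weights(counts, 0.9999))  # tail class gets the larger weight
    print(class_balanced_weights(counts, 0))       # uniform weights
```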
    
    opened by hooseok 1
  • What's the required hardware to reproduce the result?

    Thanks for sharing this code. It's interesting. May I know the required hardware to reproduce the result?

    I'm asking because I tried to run "pretrain_rot.py --dataset 'cifar10' --imb_factor 0.01", but the system does not respond for a long time when it reaches "output = model(inputs)".

    opened by zjamy 1
  • Error:  No module named 'dataset.resnet_cifar' when running

    When I run this command: python train_semi.py --dataset cifar10 --imb_factor 0.02 --imb_factor_unlabel 0.02

    I got this error:

    Traceback (most recent call last):
      File "train_semi.py", line 15, in <module>
        from dataset.imbalance_cifar import SemiSupervisedImbalanceCIFAR10
      File "/home/insights-user/imbalanced-semi-self/dataset/__init__.py", line 1, in <module>
        from .resnet_cifar import *
    ModuleNotFoundError: No module named 'dataset.resnet_cifar'

    opened by khuongnd 1
  • moco on cifar dataset

    Thanks for the great repo!

    I have a quick question: is there any specific reason for not adding the CIFAR and SVHN datasets to the MoCo training script? For example, is MoCo unsuitable for small datasets, or is the performance on them really bad?

    Thanks!

    opened by IssacCyj 1
  • Why use 5 times more unlabeled data?

    I read the paper and have a question about Appendix E3: Effect of Unlabeled Data Amount.

    The results of CE + D_U are 21.75, 20.35, 18.36, and 16.88 for {0.5x, 1x, 5x, 10x} unlabeled data. The 10x result is better than the 5x result, but the paper uses 5x.

    Is there a reason?

    opened by HeewonChung92 1
Owner
Yuzhe Yang
Ph.D. student at MIT CSAIL