Improving Calibration for Long-Tailed Recognition (CVPR2021)

Jia Research Lab

Last update: Dec 20, 2022

MiSLAS

Improving Calibration for Long-Tailed Recognition

Authors: Zhisheng Zhong, Jiequan Cui, Shu Liu, Jiaya Jia

[arXiv] [slide] [BibTeX]

Introduction: This repository provides an implementation of the CVPR 2021 paper "Improving Calibration for Long-Tailed Recognition", based on the LDAM-DRW and Decoupling models. Our study shows that, because the class composition is extremely imbalanced, networks trained on long-tailed datasets are more miscalibrated and over-confident. MiSLAS is a simple and efficient two-stage framework for long-tailed recognition that improves recognition accuracy and markedly relieves over-confidence at the same time.

Installation

Requirements

Python 3.7
torchvision 0.4.0
PyTorch 1.2.0
yacs 0.1.8

Virtual Environment

conda create -n MiSLAS python==3.7
source activate MiSLAS

Install MiSLAS

git clone https://github.com/Jia-Research-Lab/MiSLAS.git
cd MiSLAS
pip install -r requirements.txt

Dataset Preparation

Change the data_path in config/*/*.yaml accordingly.
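If several configs need the same change, the edit can be scripted. The sketch below is only an illustration: it assumes each YAML file holds a single top-level line of the form `data_path: <value>`, which the README does not spell out, and the helper names `set_data_path` and `update_configs` are hypothetical rather than part of the repository.

```python
import re
from pathlib import Path

# Assumption: every config has one top-level line `data_path: <value>`.
# Indented (nested) keys are deliberately left untouched.
DATA_PATH_LINE = re.compile(r"^(data_path[ \t]*:[ \t]*).*$", re.MULTILINE)


def set_data_path(yaml_text: str, new_path: str) -> str:
    """Return yaml_text with the value of the top-level data_path key replaced."""
    # A function replacement avoids backslash-escape surprises in new_path.
    return DATA_PATH_LINE.sub(lambda m: f"{m.group(1)}'{new_path}'", yaml_text)


def update_configs(config_root: str, dataset: str, new_path: str) -> int:
    """Rewrite data_path in config_root/<dataset>/*.yaml; return the number of files changed."""
    changed = 0
    for cfg in sorted(Path(config_root, dataset).glob("*.yaml")):
        old = cfg.read_text()
        new = set_data_path(old, new_path)
        if new != old:
            cfg.write_text(new)
            changed += 1
    return changed
```

For example, `update_configs("./config", "cifar10", "/datasets/cifar10")` would touch only the CIFAR-10 configs. Check one file by hand first to confirm the key layout matches the assumption.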

Training

Stage-1:

To train a model for Stage-1 with mixup, run:

(one GPU for CIFAR-10-LT & CIFAR-100-LT, four GPUs for ImageNet-LT, iNaturalist 2018, and Places-LT)

python train_stage1.py --cfg ./config/DATASETNAME/DATASETNAME_ARCH_stage1_mixup.yaml

DATASETNAME can be selected from cifar10, cifar100, imagenet, ina2018, and places.

ARCH can be resnet32 for cifar10 and cifar100; resnet50, resnet101, or resnet152 for imagenet; resnet50 for ina2018; and resnet152 for places.
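For readers unfamiliar with the mixup augmentation used in Stage-1, the idea can be sketched as follows. This is a framework-free illustration using only the standard library, not the repository's code (which works on PyTorch batches). The function names and the default `alpha` are assumptions made for the example.

```python
import random
from typing import List, Optional, Sequence, Tuple


def mixup_pair(
    x_a: Sequence[float],
    x_b: Sequence[float],
    alpha: float = 1.0,  # assumed value, not taken from the repository configs
    rng: Optional[random.Random] = None,
) -> Tuple[List[float], float]:
    """Blend two flattened inputs with a coefficient lam drawn from Beta(alpha, alpha)."""
    rng = rng or random.Random()
    # alpha <= 0 disables mixing and returns the first input unchanged.
    lam = rng.betavariate(alpha, alpha) if alpha > 0 else 1.0
    mixed = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    return mixed, lam


def mixup_loss(loss_a: float, loss_b: float, lam: float) -> float:
    """Combine the losses computed against each original label with the same lam."""
    return lam * loss_a + (1.0 - lam) * loss_b
```

The mixed input is fed to the network once, the loss is evaluated against both original labels, and the two losses are weighted by the same coefficient that mixed the inputs.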

Stage-2:

To train a model for Stage-2 with one GPU (all the above datasets), run:

python train_stage2.py --cfg ./config/DATASETNAME/DATASETNAME_ARCH_stage2_mislas.yaml resume /path/to/checkpoint/stage1

The saved folder (including logs and checkpoints) is organized as follows.

MiSLAS
├── saved
│   ├── modelname_date
│   │   ├── ckps
│   │   │   ├── current.pth.tar
│   │   │   └── model_best.pth.tar
│   │   └── logs
│   │       └── modelname.txt
│   ...
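Given this layout, the Stage-1 checkpoint to pass to Stage-2 can be located programmatically. The helper below is a small sketch under the assumption that every run directory follows the tree above. The name `latest_best_checkpoint` is hypothetical and not part of the repository.

```python
from pathlib import Path
from typing import Optional


def latest_best_checkpoint(saved_root: str) -> Optional[Path]:
    """Return the most recently modified <saved_root>/*/ckps/model_best.pth.tar, or None."""
    candidates = list(Path(saved_root).glob("*/ckps/model_best.pth.tar"))
    if not candidates:
        return None
    # Modification time is used because the exact `modelname_date` format is not documented.
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

Calling `latest_best_checkpoint("./saved")` returns a path that can be supplied after `resume` on the command line.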

Evaluation

To evaluate a trained model, run:

python eval.py --cfg ./config/DATASETNAME/DATASETNAME_ARCH_stage1_mixup.yaml  resume /path/to/checkpoint/stage1
python eval.py --cfg ./config/DATASETNAME/DATASETNAME_ARCH_stage2_mislas.yaml resume /path/to/checkpoint/stage2
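The training and evaluation commands share one naming pattern for config files, so they can be generated rather than typed. The following sketch encodes the dataset and architecture pairs listed in this README. The helper names are hypothetical, and it does not check that a given config file actually exists in the repository.

```python
from typing import Dict, List, Optional

# Dataset -> architectures, as listed in the Training section.
ARCHS: Dict[str, List[str]] = {
    "cifar10": ["resnet32"],
    "cifar100": ["resnet32"],
    "imagenet": ["resnet50", "resnet101", "resnet152"],
    "ina2018": ["resnet50"],
    "places": ["resnet152"],
}
SUFFIX = {1: "stage1_mixup", 2: "stage2_mislas"}


def config_path(dataset: str, arch: str, stage: int) -> str:
    """Build ./config/DATASETNAME/DATASETNAME_ARCH_<stage suffix>.yaml."""
    if dataset not in ARCHS:
        raise ValueError(f"unknown dataset: {dataset}")
    if arch not in ARCHS[dataset]:
        raise ValueError(f"{arch} is not listed for {dataset}")
    if stage not in SUFFIX:
        raise ValueError("stage must be 1 or 2")
    return f"./config/{dataset}/{dataset}_{arch}_{SUFFIX[stage]}.yaml"


def build_command(script: str, dataset: str, arch: str, stage: int,
                  resume: Optional[str] = None) -> List[str]:
    """Return the argument list for train_stage1.py, train_stage2.py, or eval.py."""
    cmd = ["python", script, "--cfg", config_path(dataset, arch, stage)]
    if resume is not None:
        # `resume <path>` is passed as a trailing key/value pair, as in the commands above.
        cmd += ["resume", resume]
    return cmd
```

For instance, `" ".join(build_command("eval.py", "places", "resnet152", 2, "/path/to/checkpoint/stage2"))` reproduces the second evaluation command for Places-LT.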

Results and Models

1) CIFAR-10-LT and CIFAR-100-LT

Stage-1 (mixup):

Dataset	Top-1 Accuracy	ECE (15 bins)	Model
CIFAR-10-LT IF=10	87.6%	11.9%	link
CIFAR-10-LT IF=50	78.1%	2.49%	link
CIFAR-10-LT IF=100	72.8%	2.14%	link
CIFAR-100-LT IF=10	59.1%	5.24%	link
CIFAR-100-LT IF=50	45.4%	4.33%	link
CIFAR-100-LT IF=100	39.5%	8.82%	link

Stage-2 (MiSLAS):

Dataset	Top-1 Accuracy	ECE (15 bins)	Model
CIFAR-10-LT IF=10	90.0%	1.20%	link
CIFAR-10-LT IF=50	85.7%	2.01%	link
CIFAR-10-LT IF=100	82.5%	3.66%	link
CIFAR-100-LT IF=10	63.2%	1.73%	link
CIFAR-100-LT IF=50	52.3%	2.47%	link
CIFAR-100-LT IF=100	47.0%	4.83%	link

Note: To obtain better performance on CIFAR-LT, we highly recommend changing the weight decay from 2e-4 to 5e-4.

2) Large-scale Datasets

Stage-1 (mixup):

Dataset	Arch	Top-1 Accuracy	ECE (15 bins)	Model
ImageNet-LT	ResNet-50	45.5%	7.98%	link
iNa'2018	ResNet-50	66.9%	5.37%	link
Places-LT	ResNet-152	29.4%	16.7%	link

Stage-2 (MiSLAS):

Dataset	Arch	Top-1 Accuracy	ECE (15 bins)	Model
ImageNet-LT	ResNet-50	52.7%	1.78%	link
iNa'2018	ResNet-50	71.6%	7.67%	link
Places-LT	ResNet-152	40.4%	3.41%	link
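For reference, the ECE (15 bins) columns above report expected calibration error. A minimal sketch of the commonly used definition follows: top-1 confidences are grouped into equal-width bins, and the gap between accuracy and mean confidence is averaged with weights proportional to bin size. The README does not confirm that `eval.py` computes it in exactly this way, so the binning details here are assumptions.

```python
import math
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    predictions: Sequence[int],
    labels: Sequence[int],
    n_bins: int = 15,
) -> float:
    """ECE with equal-width bins (lo, hi] over the top-1 confidence."""
    n = len(labels)
    if n == 0:
        raise ValueError("at least one sample is required")
    conf_sum = [0.0] * n_bins
    hits = [0] * n_bins
    count = [0] * n_bins
    for conf, pred, label in zip(confidences, predictions, labels):
        # Map a confidence in (0, 1] to a bin index in [0, n_bins - 1].
        idx = min(n_bins - 1, max(0, math.ceil(conf * n_bins) - 1))
        conf_sum[idx] += conf
        hits[idx] += int(pred == label)
        count[idx] += 1
    ece = 0.0
    for b in range(n_bins):
        if count[b]:
            accuracy = hits[b] / count[b]
            mean_conf = conf_sum[b] / count[b]
            ece += (count[b] / n) * abs(accuracy - mean_conf)
    return ece
```

A perfectly calibrated model gives 0. A model that always predicts with confidence 0.8 but is right only 75% of the time gives about 0.05, which corresponds to 5% in the tables.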

Citation

Please consider citing MiSLAS in your publications if it helps your research. :)

@inproceedings{zhong2021mislas,
    title={Improving Calibration for Long-Tailed Recognition},
    author={Zhisheng Zhong and Jiequan Cui and Shu Liu and Jiaya Jia},
    booktitle={IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
    year={2021},
}

Contact

If you have any questions about our work, feel free to contact us by email (Zhisheng Zhong: [email protected]) or through GitHub issues.

Comments

Regarding the test accuracy
I hope I am not wrong. In the code, I see that you calculate the test accuracy after every few training iterations and take the maximum. My questions are:

Are the results reported in the paper the maximum accuracy or the final accuracy after all iterations?

Is the validation set the same as the test set for CIFAR?
opened by KAISER1997 5
About the BN part

Hi, thanks for your great work. I am wondering about the BN part. It seems that methods like "cRT" and "DRW" do update the running means and variances, right? I cannot find the code segment that freezes this part.

opened by tzhxs 5
Shift Learning implementation

Hi, thanks for your work. However, the implementation of shift learning is not described in detail in your paper.

I guess that the BN parameters are re-trained in Stage-2, since the means and variances differ. Is that true?

opened by ZhiyuanDang 4
Question about effect of shifted BN

Hi! Thanks for the great work. In issue #2, you mentioned that LWS fixes the affine part (alpha and beta in the paper, as far as I understand) and updates the running means and variances in Stage-2. I take this to mean that LWS also uses shifted BN. However, Figure 4 shows differences in ACC and ECE between mixup+LWS and mixup+LWS+shifted BN.

What produces the improvement in that experiment? Is there anything wrong with my understanding?

opened by cieske 2
Access to models is limited in google drive

Hello Zhisheng,

The access to the models is restricted by Google Drive (picture below, in French, translated below the picture). Could you make the models accessible to everyone?

PS: I may have sent you access requests, sorry about that.

Robin

Authorization is required You need to request owner access or sign in with an account that has the necessary permissions. Find out more

opened by RobinVogel 2
Question about label-aware smoothing

Hello: In the paper, I think you mean that nll_loss applies only to the ground-truth label and smooth_loss applies to the remaining K-1 labels. But in the code https://github.com/Jia-Research-Lab/MiSLAS/blob/e8f91e59a910c5543ea1bcabb955ba368c606a00/methods.py#L62 I think the ground-truth label is still included in smooth_loss. I am confused about this.

opened by Phoebe-ovo 2
When will the code be released?

Hi! Thank you for such an inspiring work! Do you have any plan of releasing your code? I'm looking forward to that.

Also, I have a small question about the method. In the paper you mention that applying mixup in Stage-2 yields no obvious improvement, but I cannot find a description of your overall method. I'd like to know whether your final framework uses mixup in Stage-1 only or in both Stage-1 and Stage-2. Thanks again!

opened by Duconnor 2
Have you tried 90 epochs training with mixup on ImageNet or iNaturalist ?

Hi @zs-zhong ,

Have you tried 90 epochs training with mixup on ImageNet or iNaturalist ?

I have made some improvements based on your work, but due to the lack of computing resources, training a model for 180/200 epochs is too time-consuming for me, especially for iNaturalist.

In my reproduction, with 90 epochs of Stage-1 training with mixup (alpha 0.2) on ImageNet-LT and 10 epochs of Stage-2, the accuracies of the methods with ResNet-50 are as follows:

| | Stage-1 | mixup | Stage-2 | cRT | LWS |
| ---- | ---- | ---- | ---- | ---- | ---- |
| Reported in Decouple | 90 epochs | | 10 epochs | 47.3 | 47.7 |
| My Reproduce | 90 epochs | | 10 epochs | 48.7 | 49.3 |
| My Reproduce | 90 epochs | ✅ | 10 epochs | 47.6 | 47.4 |
| My Reproduce | 180 epochs | | 10 epochs | 51.0 | 51.8 |
| Reported in MiSLAS | 180 epochs | | 10 epochs | 50.3 | 51.2 |
| Reported in MiSLAS | 180 epochs | ✅ | 10 epochs | 51.7 | 52.0 |

The 90-epoch mixup results look much worse than those of the model trained for 180 epochs with mixup, and they show no improvement even over normal training.

I guess this is because mixup acts as a regularization method that requires longer training, so 90 epochs are not enough for the network to converge.

However, I cannot obtain the result of training for 90 epochs with mixup on iNaturalist, because the dataset is too large to fit in memory, so training R50 once takes me about a week.

If possible, could you please provide the pre-trained ResNet-50 model for training 90 epochs with mixup on iNaturalist? I believe this will also be beneficial for fair comparison of future work.

Thank you again for your contribution and look forward to your reply.

opened by mitming 2

Owner

Jia Research Lab

Research lab focusing on CV led by Prof. Jiaya Jia
