PyTorch implementation of TailCalibX : Feature Generation for Long-tail Classification

Rahul Vigneswaran

Last update: Jan 2, 2023


Overview

TailCalibX : Feature Generation for Long-tail Classification

by Rahul Vigneswaran, Marc T. Law, Vineeth N. Balasubramanian, Makarand Tapaswi

[arXiv] [Code] [pip Package] [Video]

🐣 Easy Usage (Recommended way to use our method)
- 💻 Installation
- 👨‍💻 Example Code
🧪 Advanced Usage
🏋️‍♂️ Trained weights
🪀 Results on a Toy Dataset
🌴 Directory Tree
📃 Citation
👁 Contributing
❤ About me
✨ Extras
📝 License

🐣 Easy Usage (Recommended way to use our method)

⚠ Caution: TailCalibX is simply TailCalib applied repeatedly. Specifically, we generate a new set of features once every epoch and use them to train the classifier. To mimic that, do the following three things at every epoch, in this order:

1. Collect all the features from your dataloader.
2. Use the tailcalib package to balance the features by generating samples.
3. Train the classifier.

Then repeat for the next epoch.
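
The per-epoch loop described above can be sketched roughly as follows. This is a minimal NumPy illustration, not the repository's training code. The `loader`, `backbone`, `balance_by_resampling` and `train_classifier_one_epoch` names are all hypothetical. In particular, `balance_by_resampling` is only a stand-in that duplicates existing tail features; in real use, step 2 would call the tailcalib package's `generate` method, as in the Example Code section.

```python
import numpy as np

def collect_features(loader, backbone):
    """Step 1: run the backbone over every batch and stack the results."""
    feats, labels = [], []
    for x, y in loader:
        feats.append(backbone(x))
        labels.append(y)
    return np.concatenate(feats), np.concatenate(labels)

def balance_by_resampling(X, y, rng):
    """Step 2 (hypothetical stand-in): duplicate tail-class features until
    every class matches the largest class. tailcalib generates new samples
    instead of duplicating existing ones."""
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    X_out, y_out = [X], [y]
    for c, n in zip(classes, counts):
        if n < target:
            idx = rng.choice(np.flatnonzero(y == c), size=target - n, replace=True)
            X_out.append(X[idx])
            y_out.append(y[idx])
    return np.concatenate(X_out), np.concatenate(y_out)

def train_classifier_one_epoch(W, X, y, lr=0.1):
    """Step 3: one full-batch gradient step of softmax regression."""
    logits = X @ W
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)
    probs[np.arange(len(y)), y] -= 1.0            # gradient w.r.t. logits
    return W - lr * (X.T @ probs) / len(y)

rng = np.random.default_rng(0)
X_raw = rng.normal(size=(100, 8))
y_raw = np.repeat([0, 1, 2], [60, 30, 10])        # imbalanced toy labels
loader = [(X_raw[i::4], y_raw[i::4]) for i in range(4)]  # four toy batches
backbone = lambda x: x                            # identity "feature extractor"

W = np.zeros((8, 3))
for epoch in range(3):
    feats, labels = collect_features(loader, backbone)                # step 1
    feats_bal, labels_bal = balance_by_resampling(feats, labels, rng) # step 2
    W = train_classifier_one_epoch(W, feats_bal, labels_bal)          # step 3

counts_after = np.unique(labels_bal, return_counts=True)[1]
print(counts_after)
```

The point of the sketch is the ordering: features are re-collected and re-balanced inside the epoch loop, so the classifier sees a freshly balanced set each epoch rather than one set generated up front.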

💻 Installation

Use the package manager pip to install tailcalib.

pip install tailcalib

👨‍💻 Example Code

See the instructions here for more detailed information about the Python package.

# Import
from tailcalib import tailcalib

# Initialize
a = tailcalib(base_engine="numpy")   # Options: "numpy", "pytorch"

# Imbalanced random fake data
import numpy as np
X = np.random.rand(200,100)
y = np.random.randint(0,10, (200,))

# Balancing the data using "tailcalib"
feat, lab, gen = a.generate(X=X, y=y)

# Output comparison
print(f"Before: {np.unique(y, return_counts=True)}")
print(f"After: {np.unique(lab, return_counts=True)}")

🧪 Advanced Usage

✔ Things to do before you run the code from this repo

Change the data_root for your dataset in main.py.
If you are using wandb logging (Weights & Biases), make sure to change the wandb.init in main.py accordingly.

📀 How to use?

For just the methods proposed in this paper :
- For CIFAR100-LT: run_TailCalibX_CIFAR100-LT.sh
- For mini-ImageNet-LT : run_TailCalibX_mini-ImageNet-LT.sh
For all the results shown in the paper :
- For CIFAR100-LT: run_all_CIFAR100-LT.sh
- For mini-ImageNet-LT : run_all_mini-ImageNet-LT.sh

📚 How to create the mini-ImageNet-LT dataset?

Check Notebooks/Create_mini-ImageNet-LT.ipynb for the script that generates the mini-ImageNet-LT dataset with varying imbalance ratios and train-test-val splits.
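
To give a feel for what "varying imbalance ratios" means in terms of per-class sample counts, here is a small sketch of an exponentially decaying class-size profile, a common way to build long-tailed variants of balanced datasets. This is an assumption for illustration only: the notebook may construct its splits differently, and the function name and the numbers used here are hypothetical.

```python
def long_tail_counts(n_max, num_classes, imb_ratio):
    """Per-class sample counts that decay exponentially from n_max
    (head class) down to roughly n_max / imb_ratio (tail class)."""
    return [
        int(n_max * (1.0 / imb_ratio) ** (i / (num_classes - 1)))
        for i in range(num_classes)
    ]

counts = long_tail_counts(n_max=500, num_classes=100, imb_ratio=100)
print(counts[:3], "...", counts[-3:])
```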

⚙ Arguments

--seed : Random seed to fix for reproducibility.
- Default : 1
--gpu : Select the GPUs to be used.
- Default : "0,1,2,3"
--experiment : Experiment number (Check 'libs/utils/experiments_maker.py').
- Default : 0.1
--dataset : Dataset number.
- Choices : 0 - CIFAR100, 1 - mini-imagenet
- Default : 0
--imbalance : Select Imbalance factor.
- Choices : 0: 1, 1: 100, 2: 50, 3: 10
- Default : 1
--type_of_val : Choose which dataset split to use.
- Choices: "vt": val_from_test, "vtr": val_from_train, "vit": val_is_test
- Default : "vit"
--cv1 to --cv9 : Custom variable to use in experiments - purpose changes according to the experiment.
- Default : "1"
--train : Run training sequence
- Default : False
--generate : Run generation sequence
- Default : False
--retraining : Run retraining sequence
- Default : False
--resume : Will resume from the 'latest_model_checkpoint.pth' and wandb if applicable.
- Default : False
--save_features : Collect feature representations.
- Default : False
--save_features_phase : Dataset split of representations to collect.
- Choices : "train", "val", "test"
- Default : "train"
--config : If you have a yaml file with appropriate config, provide the path here. Will override the 'experiment_maker'.
- Default : None
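
As a rough illustration of how a few of the flags above fit together, here is a minimal `argparse` sketch. It is not the repository's `main.py`: the parser below is hypothetical, covers only a subset of the arguments, and assumes the boolean flags are plain on/off switches. The choices and defaults are taken from the list above.

```python
import argparse

# Index-to-value mappings as documented in the argument list above.
DATASETS = {0: "CIFAR100", 1: "mini-imagenet"}
IMBALANCE_FACTORS = {0: 1, 1: 100, 2: 50, 3: 10}

def build_parser():
    """Hypothetical parser mirroring a subset of the documented flags."""
    p = argparse.ArgumentParser(description="Sketch of the documented flags")
    p.add_argument("--seed", type=int, default=1)
    p.add_argument("--gpu", type=str, default="0,1,2,3")
    p.add_argument("--experiment", type=float, default=0.1)
    p.add_argument("--dataset", type=int, choices=sorted(DATASETS), default=0)
    p.add_argument("--imbalance", type=int, choices=sorted(IMBALANCE_FACTORS), default=1)
    p.add_argument("--type_of_val", choices=["vt", "vtr", "vit"], default="vit")
    # Assumption: the sequence flags are simple switches that default to False.
    p.add_argument("--train", action="store_true")
    p.add_argument("--generate", action="store_true")
    p.add_argument("--retraining", action="store_true")
    return p

# Example: mini-ImageNet-LT with imbalance factor 50, training sequence only.
args = build_parser().parse_args(["--dataset", "1", "--imbalance", "2", "--train"])
print(DATASETS[args.dataset], IMBALANCE_FACTORS[args.imbalance], args.train)
# prints: mini-imagenet 50 True
```

Note that `--dataset` and `--imbalance` take an index rather than the value itself, so `--imbalance 2` selects an imbalance factor of 50, not 2.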

🏋️‍♂️ Trained weights

| Experiment | CIFAR100-LT (ResNet32, seed 1, Imb 100) | mini-ImageNet-LT (ResNeXt50) |
| --- | --- | --- |
| TailCalib | Git-LFS | Git-LFS |
| TailCalibX | Git-LFS | Git-LFS |
| CBD + TailCalibX | Git-LFS | Git-LFS |

🪀 Results on a Toy Dataset

The higher the Imb ratio, the more imbalanced the dataset is. Imb ratio = maximum_sample_count / minimum_sample_count.
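
The definition above can be computed directly from a list of labels. The helper name and the toy label counts below are made up for illustration.

```python
from collections import Counter

def imbalance_ratio(labels):
    """Imb ratio = maximum_sample_count / minimum_sample_count."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())

# Toy labels: 500 samples of class 0, 50 of class 1, 5 of class 2.
labels = [0] * 500 + [1] * 50 + [2] * 5
print(imbalance_ratio(labels))  # prints: 100.0
```

A perfectly balanced dataset has an Imb ratio of 1.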

Check the notebook Notebooks/toy_example.ipynb to play with the toy example from which the results plot was generated.

🌴 Directory Tree

TailCalibX
├── libs
│   ├── core
│   │   ├── ce.py
│   │   ├── core_base.py
│   │   ├── ecbd.py
│   │   ├── modals.py
│   │   ├── TailCalib.py
│   │   └── TailCalibX.py
│   ├── data
│   │   ├── dataloader.py
│   │   ├── ImbalanceCIFAR.py
│   │   └── mini-imagenet
│   │       ├── 0.01_test.txt
│   │       ├── 0.01_train.txt
│   │       └── 0.01_val.txt
│   ├── loss
│   │   ├── CosineDistill.py
│   │   └── SoftmaxLoss.py
│   ├── models
│   │   ├── CosineDotProductClassifier.py
│   │   ├── DotProductClassifier.py
│   │   ├── ecbd_converter.py
│   │   ├── ResNet32Feature.py
│   │   ├── ResNext50Feature.py
│   │   └── ResNextFeature.py
│   ├── samplers
│   │   └── ClassAwareSampler.py
│   └── utils
│       ├── Default_config.yaml
│       ├── experiments_maker.py
│       ├── globals.py
│       ├── logger.py
│       └── utils.py
├── LICENSE
├── main.py
├── Notebooks
│   ├── Create_mini-ImageNet-LT.ipynb
│   └── toy_example.ipynb
├── readme_assets
│   ├── method.svg
│   └── toy_example_output.svg
├── README.md
├── run_all_CIFAR100-LT.sh
├── run_all_mini-ImageNet-LT.sh
├── run_TailCalibX_CIFAR100-LT.sh
└── run_TailCalibX_mini-imagenet-LT.sh

The tailcalib_pip directory is left out of the tree above, as it only holds the tailcalib pip package.

📃 Citation

@inproceedings{rahul2021tailcalibX,
    title   = {{Feature Generation for Long-tail Classification}},
    author  = {Rahul Vigneswaran and Marc T. Law and Vineeth N. Balasubramanian and Makarand Tapaswi},
    booktitle = {ICVGIP},
    year = {2021}
}

👁 Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

❤ About me

Rahul Vigneswaran

✨ Extras

🐝 Long-tail buzz : If you are interested in deep learning research that involves long-tailed / imbalanced datasets, take a look at Long-tail buzz to learn about recent trending papers in this field.

📝 License

MIT


