# TailCalibX: Feature Generation for Long-tail Classification
by Rahul Vigneswaran, Marc T. Law, Vineeth N. Balasubramanian, Makarand Tapaswi
[arXiv] [Code] [pip Package] [Video]
## Table of contents

- Easy Usage (Recommended way to use our method)
- Advanced Usage
- Trained weights
- Results on a Toy Dataset
- Directory Tree
- Citation
- Contributing
- About me
- Extras
- License
## Easy Usage (Recommended way to use our method)
- Collect all the features from your dataloader.
- Use the `tailcalib` package to balance the features by generating samples.
- Train the classifier.
- Repeat.
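The first step above (collecting features) can be sketched as follows. This is a minimal illustration, not code from this repo: `collect_features`, the toy `loader`, and the linear `backbone` are hypothetical stand-ins for your own dataloader and feature extractor.

```python
import numpy as np

def collect_features(dataloader, backbone):
    """Run every batch through a (frozen) feature extractor and stack the results."""
    feats, labels = [], []
    for x, y in dataloader:
        feats.append(backbone(x))
        labels.append(y)
    return np.concatenate(feats), np.concatenate(labels)

# Hypothetical stand-ins: 4 batches of 32 samples with 8 raw dims each,
# and a linear "backbone" mapping 8 raw dims to 16 feature dims.
rng = np.random.default_rng(0)
loader = [(rng.random((32, 8)), rng.integers(0, 3, 32)) for _ in range(4)]
W = rng.random((8, 16))
backbone = lambda x: x @ W

X, y = collect_features(loader, backbone)  # X: (128, 16), y: (128,)
```

The stacked `X` and `y` are exactly what the `tailcalib` package below expects as input.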
### Installation

Use the package manager pip to install `tailcalib`.

```bash
pip install tailcalib
```
### Example Code

Check here for much more detailed information about the Python package.
```python
# Import
from tailcalib import tailcalib

# Initialize
a = tailcalib(base_engine="numpy")  # Options: "numpy", "pytorch"

# Imbalanced random fake data
import numpy as np
X = np.random.rand(200, 100)
y = np.random.randint(0, 10, (200,))

# Balancing the data using "tailcalib"
feat, lab, gen = a.generate(X=X, y=y)

# Output comparison
print(f"Before: {np.unique(y, return_counts=True)}")
print(f"After: {np.unique(lab, return_counts=True)}")
```
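The final step of the recipe, retraining a classifier on the balanced features, can be sketched with a simple nearest-class-mean classifier. The `feat`/`lab` arrays below are random stand-ins for the output of `a.generate`, and nearest-class-mean is only one illustrative choice of classifier:

```python
import numpy as np

rng = np.random.default_rng(0)
feat = rng.normal(size=(300, 100))   # stand-in for balanced features
lab = rng.integers(0, 10, size=300)  # stand-in for balanced labels

# Fit one centroid per class, then classify by nearest centroid.
classes = np.unique(lab)
centroids = np.stack([feat[lab == c].mean(axis=0) for c in classes])

def predict(x):
    return classes[np.argmin(np.linalg.norm(centroids - x, axis=1))]

preds = np.array([predict(f) for f in feat])
```

Because the generated features are balanced across classes, even such a simple classifier is no longer biased toward the head classes by sheer sample count.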
## Advanced Usage

### Things to do before you run the code from this repo

- Change the `data_root` for your dataset in `main.py`.
- If you are using wandb logging (Weights & Biases), make sure to change the `wandb.init` in `main.py` accordingly.
### How to use?

- For just the methods proposed in this paper:
    - For CIFAR100-LT: `run_TailCalibX_CIFAR100-LT.sh`
    - For mini-ImageNet-LT: `run_TailCalibX_mini-ImageNet-LT.sh`
- For all the results shown in the paper:
    - For CIFAR100-LT: `run_all_CIFAR100-LT.sh`
    - For mini-ImageNet-LT: `run_all_mini-ImageNet-LT.sh`
### How to create the mini-ImageNet-LT dataset?

Check `Notebooks/Create_mini-ImageNet-LT.ipynb` for the script that generates the mini-ImageNet-LT dataset with varying imbalance ratios and train-test-val splits.
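The notebook's exact recipe is its own, but the standard exponential profile used for CIFAR-LT-style datasets (class `i` keeps `n_max * imb_ratio**(-i/(C-1))` samples) gives a feel for how the imbalance ratio shapes a long-tail split. This is a generic sketch of that common technique, not the notebook's code:

```python
def lt_class_counts(n_max, num_classes, imb_ratio):
    """Exponential long-tail profile: the head class keeps n_max samples,
    the tail class keeps roughly n_max / imb_ratio."""
    return [int(n_max * imb_ratio ** (-i / (num_classes - 1)))
            for i in range(num_classes)]

counts = lt_class_counts(n_max=500, num_classes=10, imb_ratio=100)
# counts[0] == 500 (head class), counts[-1] == 5 (tail class)
```

Sweeping `imb_ratio` over values like 10, 50, and 100 reproduces the kind of varying-imbalance splits described above.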
### Arguments

- `--seed`: Select seed for fixing it.
    - Default: `1`
- `--gpu`: Select the GPUs to be used.
    - Default: `"0,1,2,3"`
- `--experiment`: Experiment number (check `libs/utils/experiment_maker.py`).
    - Default: `0.1`
- `--dataset`: Dataset number.
    - Choices: `0` - CIFAR100, `1` - mini-ImageNet
    - Default: `0`
- `--imbalance`: Select imbalance factor.
    - Choices: `0: 1, 1: 100, 2: 50, 3: 10`
    - Default: `1`
- `--type_of_val`: Choose which dataset split to use.
    - Choices: `"vt"`: val_from_test, `"vtr"`: val_from_train, `"vit"`: val_is_test
    - Default: `"vit"`
- `--cv1` to `--cv9`: Custom variables to use in experiments - purpose changes according to the experiment.
    - Default: `"1"`
- `--train`: Run training sequence.
    - Default: `False`
- `--generate`: Run generation sequence.
    - Default: `False`
- `--retraining`: Run retraining sequence.
    - Default: `False`
- `--resume`: Will resume from the `latest_model_checkpoint.pth` and wandb if applicable.
    - Default: `False`
- `--save_features`: Collect feature representations.
    - Default: `False`
- `--save_features_phase`: Dataset split of representations to collect.
    - Choices: `"train"`, `"val"`, `"test"`
    - Default: `"train"`
- `--config`: If you have a yaml file with appropriate config, provide the path here. Will override the `experiment_maker`.
    - Default: `None`
## Trained weights

| Experiment | CIFAR100-LT (ResNet32, seed 1, Imb 100) | mini-ImageNet-LT (ResNeXt50) |
|---|---|---|
| TailCalib | Git-LFS | Git-LFS |
| TailCalibX | Git-LFS | Git-LFS |
| CBD + TailCalibX | Git-LFS | Git-LFS |
## Results on a Toy Dataset

The higher the `Imb ratio`, the more imbalanced the dataset is. `Imb ratio = maximum_sample_count / minimum_sample_count`.

Check this notebook to play with the toy example from which the plot below was generated.
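As a concrete instance of the formula above, the imbalance ratio of any label vector can be computed directly from its class counts:

```python
import numpy as np

# Toy labels: 100 samples of class 0, 50 of class 1, 10 of class 2
y = np.concatenate([np.zeros(100, int), np.ones(50, int), np.full(10, 2)])

counts = np.unique(y, return_counts=True)[1]
imb_ratio = counts.max() / counts.min()  # 100 / 10 = 10.0
```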
## Directory Tree

```
TailCalibX
├── libs
│   ├── core
│   │   ├── ce.py
│   │   ├── core_base.py
│   │   ├── ecbd.py
│   │   ├── modals.py
│   │   ├── TailCalib.py
│   │   └── TailCalibX.py
│   ├── data
│   │   ├── dataloader.py
│   │   ├── ImbalanceCIFAR.py
│   │   └── mini-imagenet
│   │       ├── 0.01_test.txt
│   │       ├── 0.01_train.txt
│   │       └── 0.01_val.txt
│   ├── loss
│   │   ├── CosineDistill.py
│   │   └── SoftmaxLoss.py
│   ├── models
│   │   ├── CosineDotProductClassifier.py
│   │   ├── DotProductClassifier.py
│   │   ├── ecbd_converter.py
│   │   ├── ResNet32Feature.py
│   │   ├── ResNext50Feature.py
│   │   └── ResNextFeature.py
│   ├── samplers
│   │   └── ClassAwareSampler.py
│   └── utils
│       ├── Default_config.yaml
│       ├── experiments_maker.py
│       ├── globals.py
│       ├── logger.py
│       └── utils.py
├── LICENSE
├── main.py
├── Notebooks
│   ├── Create_mini-ImageNet-LT.ipynb
│   └── toy_example.ipynb
├── readme_assets
│   ├── method.svg
│   └── toy_example_output.svg
├── README.md
├── run_all_CIFAR100-LT.sh
├── run_all_mini-ImageNet-LT.sh
├── run_TailCalibX_CIFAR100-LT.sh
└── run_TailCalibX_mini-imagenet-LT.sh
```

`tailcalib_pip` is ignored here as it is for the `tailcalib` pip package.
## Citation

```bibtex
@inproceedings{rahul2021tailcalibX,
    title = {{Feature Generation for Long-tail Classification}},
    author = {Rahul Vigneswaran and Marc T. Law and Vineeth N. Balasubramanian and Makarand Tapaswi},
    booktitle = {ICVGIP},
    year = {2021}
}
```
## Contributing
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.
## About me

## Extras