Official PyTorch implementation of Test-Agnostic Long-Tailed Recognition by Test-Time Aggregating Diverse Experts with Self-Supervision.

vanint

Last update: Dec 30, 2022

Overview

Test-Agnostic Long-Tailed Recognition

This repository is the official PyTorch implementation of Test-Agnostic Long-Tailed Recognition by Test-Time Aggregating Diverse Experts with Self-Supervision.

TADE (our method) innovates the expert training scheme by introducing diversity-promoting, expertise-guided losses, which train different experts to handle distinct class distributions. As a result, the learned experts are more diverse than those of existing multi-expert methods, which leads to better ensemble performance, and together they simulate a wide spectrum of possible class distributions.
TADE also develops a new self-supervised method, prediction stability maximization, which adaptively aggregates these experts to better handle an unknown test distribution, using only unlabeled test data.
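
The aggregation idea can be illustrated with a small, dependency-free sketch. It assumes (none of this is stated above) that each expert outputs a logit vector, that the aggregation weights are softmax-normalized, and that "prediction stability" is scored as the cosine similarity between the aggregated predictions for two augmented views of the same test sample. All function names are hypothetical, and the repository's actual code may differ; in practice the weights would be optimized with gradient ascent on this score.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_logits(expert_logits, raw_weights):
    """Weighted sum of per-expert logits; raw weights are softmax-normalized."""
    weights = softmax(raw_weights)
    num_classes = len(expert_logits[0])
    return [
        sum(w * logits[c] for w, logits in zip(weights, expert_logits))
        for c in range(num_classes)
    ]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def stability_score(view1_logits, view2_logits, raw_weights):
    """Similarity of aggregated predictions for two views of one sample.

    Assumption: a higher score means the weighted ensemble is more stable.
    """
    p1 = softmax(aggregate_logits(view1_logits, raw_weights))
    p2 = softmax(aggregate_logits(view2_logits, raw_weights))
    return cosine(p1, p2)


# Three experts, three classes, two slightly different views (made-up numbers).
view1 = [[2.0, 0.5, -1.0], [0.1, 1.5, 0.3], [-0.5, 0.2, 2.2]]
view2 = [[1.8, 0.7, -0.9], [0.3, 1.2, 0.4], [-0.2, 0.1, 1.9]]
print(stability_score(view1, view2, [0.0, 0.0, 0.0]))
```

No labels appear anywhere in the score, which is why such an objective can be evaluated on unlabeled test data.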

Results

ImageNet-LT (ResNeXt-50)

Long-tailed recognition with uniform test class distribution:

Methods	MACs(G)	Top-1 acc.	Model
Softmax	4.26	48.0
RIDE	6.08	56.3
TADE (ours)	6.08	58.8	Download

Test-agnostic long-tailed recognition:

Methods	MACs(G)	Forward-50	Forward-10	Uniform	Backward-10	Backward-50
Softmax	4.26	66.1	60.3	48.0	34.9	27.6
RIDE	6.08	67.6	64.0	56.3	48.7	44.0
TADE (ours)	6.08	69.4	65.4	58.8	54.5	53.1
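
To compare methods across the five test distributions at a glance, the rows above can be summarized by their mean accuracy and by the gap between the best and worst distribution. The sketch below only re-uses the numbers from the table; the `summarize` helper is a hypothetical name introduced for illustration.

```python
from statistics import mean

# Top-1 accuracy (%) copied from the ImageNet-LT test-agnostic table above,
# ordered Forward-50, Forward-10, Uniform, Backward-10, Backward-50.
IMAGENET_LT = {
    "Softmax": [66.1, 60.3, 48.0, 34.9, 27.6],
    "RIDE": [67.6, 64.0, 56.3, 48.7, 44.0],
    "TADE": [69.4, 65.4, 58.8, 54.5, 53.1],
}


def summarize(accs):
    """Return the mean accuracy and the best-to-worst gap, rounded to 2 places."""
    return {
        "mean": round(mean(accs), 2),
        "spread": round(max(accs) - min(accs), 2),
    }


for method, accs in IMAGENET_LT.items():
    print(method, summarize(accs))
```

A smaller spread indicates that a method degrades less when the test class distribution moves away from the training distribution.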

CIFAR100-Imbalance ratio 100 (ResNet-32)

Long-tailed recognition with uniform test class distribution:

Methods	MACs(G)	Top-1 acc.
Softmax	0.07	41.4
RIDE	0.11	48.0
TADE (ours)	0.11	49.8

Test-agnostic long-tailed recognition:

Methods	MACs(G)	Forward-50	Forward-10	Uniform	Backward-10	Backward-50
Softmax	0.07	62.3	56.2	41.4	25.8	17.5
RIDE	0.11	63.0	57.0	48.0	35.4	29.3
TADE (ours)	0.11	65.9	58.3	49.8	43.9	42.4

Places-LT (ResNet-152)

Long-tailed recognition with uniform test class distribution:

Methods	MACs(G)	Top-1 acc.
Softmax	11.56	31.4
RIDE	13.18	40.3
TADE (ours)	13.18	40.9

Test-agnostic long-tailed recognition:

Methods	MACs(G)	Forward-50	Forward-10	Uniform	Backward-10	Backward-50
Softmax	11.56	45.6	40.2	31.4	23.4	19.4
RIDE	13.18	43.1	41.6	40.3	38.2	36.9
TADE (ours)	13.18	46.4	43.3	40.9	41.4	41.6

iNaturalist 2018 (ResNet-50)

Long-tailed recognition with uniform test class distribution:

Methods	MACs(G)	Top-1 acc.
Softmax	4.14	64.7
RIDE	5.80	71.8
TADE (ours)	5.80	72.9

Test-agnostic long-tailed recognition:

Methods	MACs(G)	Forward-3	Forward-2	Uniform	Backward-2	Backward-3
Softmax	4.14	65.4	65.5	64.7	64.0	63.4
RIDE	5.80	71.5	71.9	71.8	71.9	71.8
TADE (ours)	5.80	72.3	72.5	72.9	73.5	73.3

Requirements

To install requirements:

pip install -r requirements.txt

Hardware requirements

8 GPUs with at least 11 GB of memory each are recommended. Otherwise, models with more experts may not fit in memory, especially on datasets with many classes (where the FC layers are large). CPU training is not supported, but CPU inference could be supported with slight modifications.
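
The effect of the class count on the FC layers can be estimated with simple arithmetic. The sketch below is a rough estimate only: it assumes a 2048-dimensional feature vector (the standard ResNet-50 output size), three experts with one FC layer each, and 32-bit weights. The experts in this repository may use different feature sizes, and the helper names are hypothetical.

```python
def fc_params(feature_dim, num_classes, num_experts=1, bias=True):
    """Parameter count of one fully connected classifier per expert."""
    per_expert = feature_dim * num_classes + (num_classes if bias else 0)
    return per_expert * num_experts


def megabytes(num_params, bytes_per_param=4):
    """Memory for the weights alone (no gradients or optimizer state)."""
    return num_params * bytes_per_param / 1024 ** 2


# Class counts: ImageNet-LT has 1,000 classes, iNaturalist 2018 has 8,142.
for name, num_classes in [("ImageNet-LT", 1000), ("iNaturalist 2018", 8142)]:
    n = fc_params(feature_dim=2048, num_classes=num_classes, num_experts=3)
    print(f"{name}: {n:,} FC parameters, {megabytes(n):.1f} MB of weights")
```

Under these assumptions the iNaturalist 2018 classifiers hold about 50 million parameters, roughly eight times the ImageNet-LT figure, before counting gradients, optimizer state, and activations.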

Datasets

Four benchmark datasets

Please download these datasets and put them in the /data directory.
ImageNet-LT and Places-LT can be found here.
The iNaturalist data should be the 2018 version from here.
CIFAR-100 will be downloaded automatically by the dataloader.

data
├── ImageNet_LT
│   ├── test
│   ├── train
│   └── val
├── CIFAR100
│   └── cifar-100-python
├── Place365
│   ├── data_256
│   ├── test_256
│   └── val_256
└── iNaturalist 
    ├── test2018
    └── train_val2018
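
Before launching a long training run, it can help to confirm that the directory tree above is in place. The following is a minimal sketch, not part of the repository: `missing_dirs` and `EXPECTED_LAYOUT` are hypothetical names, and the layout is copied from the tree above.

```python
from pathlib import Path

# Directory names copied from the tree above.
EXPECTED_LAYOUT = {
    "ImageNet_LT": ["test", "train", "val"],
    "CIFAR100": ["cifar-100-python"],
    "Place365": ["data_256", "test_256", "val_256"],
    "iNaturalist": ["test2018", "train_val2018"],
}


def missing_dirs(data_root, layout=EXPECTED_LAYOUT):
    """Return the expected sub-directories that do not exist under data_root."""
    root = Path(data_root)
    return [
        f"{dataset}/{sub}"
        for dataset, subs in layout.items()
        for sub in subs
        if not (root / dataset / sub).is_dir()
    ]


if __name__ == "__main__":
    for path in missing_dirs("data"):
        print("missing:", path)
```

Since CIFAR-100 is downloaded automatically, a missing `CIFAR100/cifar-100-python` entry is expected before the first run.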

Txt files

We provide txt files for test-agnostic long-tailed recognition on ImageNet-LT, Places-LT and iNaturalist 2018. The CIFAR-100 splits are generated automatically by the code.
For iNaturalist 2018, please unzip iNaturalist_train.zip.

data_txt
├── ImageNet_LT
│   ├── ImageNet_LT_backward2.txt
│   ├── ImageNet_LT_backward5.txt
│   ├── ImageNet_LT_backward10.txt
│   ├── ImageNet_LT_backward25.txt
│   ├── ImageNet_LT_backward50.txt
│   ├── ImageNet_LT_forward2.txt
│   ├── ImageNet_LT_forward5.txt
│   ├── ImageNet_LT_forward10.txt
│   ├── ImageNet_LT_forward25.txt
│   ├── ImageNet_LT_forward50.txt
│   ├── ImageNet_LT_test.txt
│   ├── ImageNet_LT_train.txt
│   ├── ImageNet_LT_uniform.txt
│   └── ImageNet_LT_val.txt
├── Places_LT_v2
│   ├── Places_LT_backward2.txt
│   ├── Places_LT_backward5.txt
│   ├── Places_LT_backward10.txt
│   ├── Places_LT_backward25.txt
│   ├── Places_LT_backward50.txt
│   ├── Places_LT_forward2.txt
│   ├── Places_LT_forward5.txt
│   ├── Places_LT_forward10.txt
│   ├── Places_LT_forward25.txt
│   ├── Places_LT_forward50.txt
│   ├── Places_LT_test.txt
│   ├── Places_LT_train.txt
│   ├── Places_LT_uniform.txt
│   └── Places_LT_val.txt
└── iNaturalist18
    ├── iNaturalist18_backward2.txt
    ├── iNaturalist18_backward3.txt
    ├── iNaturalist18_forward2.txt
    ├── iNaturalist18_forward3.txt
    ├── iNaturalist18_train.txt
    ├── iNaturalist18_uniform.txt
    └── iNaturalist18_val.txt
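
To inspect how skewed a given split is, the per-class sample counts can be read from one of the txt files. This sketch assumes each non-empty line has the form `<relative image path> <integer label>`, which is the usual layout for these long-tailed benchmark lists but is not stated above; please check a file before relying on it. The function names are hypothetical.

```python
from collections import Counter


def class_counts(txt_path):
    """Count samples per class label in a '<path> <label>' list file."""
    counts = Counter()
    with open(txt_path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            # Split on the last whitespace so paths containing spaces survive.
            _image_path, label = line.rsplit(maxsplit=1)
            counts[int(label)] += 1
    return counts


def imbalance_ratio(counts):
    """Ratio between the most and least frequent class."""
    return max(counts.values()) / min(counts.values())


if __name__ == "__main__":
    counts = class_counts("data_txt/ImageNet_LT/ImageNet_LT_train.txt")
    print(len(counts), "classes, imbalance ratio", imbalance_ratio(counts))
```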

Pretrained models

For training on Places-LT, we follow previous methods and use a pre-trained model.
Please download the checkpoint, unzip it, and move the checkpoint files to /model/pretrained_model_places/.

Script

ImageNet-LT

Training

To train the expertise-diverse model, run this command:

python train.py -c configs/config_imagenet_lt_resnext50_tade.json

Evaluate

To evaluate the expertise-diverse model on the uniform test class distribution, run:

python test.py -r checkpoint_path

To evaluate the expertise-diverse model on agnostic test class distributions, run:

python test_all_imagenet.py -r checkpoint_path

Test-time training

To test-time train the expertise-diverse model for agnostic test class distributions, run:

python test_train_imagenet.py -c configs/test_time_imagenet_lt_resnext50_tade.json -r checkpoint_path

CIFAR100-LT

Training

To train the expertise-diverse model, run this command:

python train.py -c configs/config_cifar100_ir100_tade.json

To use an imbalance ratio of 10 or 50 instead of 100, change the config file.
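
For intuition about what the imbalance ratio controls, long-tailed CIFAR variants are commonly built with an exponentially decaying number of samples per class, where the ratio is the largest class size divided by the smallest. The sketch below uses that common profile; whether this repository uses exactly this formula is an assumption, and `long_tailed_counts` is a hypothetical helper. CIFAR-100 has 500 training images per class.

```python
def long_tailed_counts(num_classes, max_per_class, imbalance_ratio, reverse=False):
    """Per-class sample counts that decay exponentially from head to tail.

    Class 0 keeps max_per_class samples and the last class keeps about
    max_per_class / imbalance_ratio. reverse=True flips the order, which
    corresponds to a "backward" class distribution.
    """
    counts = [
        int(max_per_class * (1.0 / imbalance_ratio) ** (i / (num_classes - 1.0)))
        for i in range(num_classes)
    ]
    return counts[::-1] if reverse else counts


for ratio in (10, 50, 100):
    counts = long_tailed_counts(100, 500, ratio)
    print(f"ratio {ratio}: head={counts[0]}, tail={counts[-1]}, total={sum(counts)}")
```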

Evaluate

To evaluate the expertise-diverse model on the uniform test class distribution, run:

python test.py -r checkpoint_path

To evaluate the expertise-diverse model on agnostic test class distributions, run:

python test_all_cifar.py -r checkpoint_path

Test-time training

To test-time train the expertise-diverse model for agnostic test class distributions, run:

python test_train_cifar.py -c configs/test_time_cifar100_ir100_tade.json -r checkpoint_path

To use an imbalance ratio of 10 or 50 instead of 100, change the config file.

Places-LT

Training

To train the expertise-diverse model, run this command:

python train.py -c configs/config_places_lt_resnet152_tade.json

Evaluate

To evaluate the expertise-diverse model on the uniform test class distribution, run:

python test_places.py -r checkpoint_path

To evaluate the expertise-diverse model on agnostic test class distributions, run:

python test_all_places.py -r checkpoint_path

Test-time training

To test-time train the expertise-diverse model for agnostic test class distributions, run:

python test_train_places.py -c configs/test_time_places_lt_resnet152_tade.json -r checkpoint_path

iNaturalist 2018

Training

To train the expertise-diverse model, run this command:

python train.py -c configs/config_iNaturalist_resnet50_tade.json

Evaluate

To evaluate the expertise-diverse model on the uniform test class distribution, run:

python test.py -r checkpoint_path

To evaluate the expertise-diverse model on agnostic test class distributions, run:

python test_all_inat.py -r checkpoint_path

Test-time training

To test-time train the expertise-diverse model for agnostic test class distributions, run:

python test_train_inat.py -c configs/test_time_iNaturalist_resnet50_tade.json -r checkpoint_path
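
The four per-dataset recipes above follow the same pattern (train, evaluate on the uniform distribution, evaluate on agnostic distributions, test-time train). As a convenience, the commands can be generated from one table. This is a minimal sketch: the script and config names are copied from the commands above, while the dictionary keys and the `build_commands` helper are hypothetical. It only prints the commands and does not run them.

```python
import shlex

# (train config, uniform test script, agnostic test script,
#  test-time training script, test-time training config)
PIPELINES = {
    "imagenet_lt": (
        "configs/config_imagenet_lt_resnext50_tade.json", "test.py",
        "test_all_imagenet.py", "test_train_imagenet.py",
        "configs/test_time_imagenet_lt_resnext50_tade.json",
    ),
    "cifar100_lt": (
        "configs/config_cifar100_ir100_tade.json", "test.py",
        "test_all_cifar.py", "test_train_cifar.py",
        "configs/test_time_cifar100_ir100_tade.json",
    ),
    "places_lt": (
        "configs/config_places_lt_resnet152_tade.json", "test_places.py",
        "test_all_places.py", "test_train_places.py",
        "configs/test_time_places_lt_resnet152_tade.json",
    ),
    "inaturalist18": (
        "configs/config_iNaturalist_resnet50_tade.json", "test.py",
        "test_all_inat.py", "test_train_inat.py",
        "configs/test_time_iNaturalist_resnet50_tade.json",
    ),
}


def build_commands(dataset, checkpoint_path):
    """Return the four commands for one dataset as argument lists."""
    train_cfg, test_uniform, test_agnostic, tt_script, tt_cfg = PIPELINES[dataset]
    return [
        ["python", "train.py", "-c", train_cfg],
        ["python", test_uniform, "-r", checkpoint_path],
        ["python", test_agnostic, "-r", checkpoint_path],
        ["python", tt_script, "-c", tt_cfg, "-r", checkpoint_path],
    ]


# "checkpoint_path" is the same placeholder used in the commands above.
for cmd in build_commands("places_lt", "checkpoint_path"):
    print(shlex.join(cmd))
```

Each argument list can be passed to `subprocess.run` once `checkpoint_path` is replaced with the path of a real checkpoint.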

Citation

If you find our work inspiring or use our codebase in your research, please cite our work.

@article{zhang2021test,
  title={Test-Agnostic Long-Tailed Recognition by Test-Time Aggregating Diverse Experts with Self-Supervision},
  author={Zhang, Yifan and Hooi, Bryan and Hong, Lanqing and Feng, Jiashi},
  journal={arXiv},
  year={2021}
}

Acknowledgements

This project is based on this PyTorch template.

The multi-expert framework is based on RIDE. The generation of agnostic test class distributions follows LADE.

Comments

GPU

Hello. Very sorry for commenting here; my question is about the ppn portfolio project, whose issues are closed, so I had no other way to ask you. I tried to run that project on Google Colab, but it took more time than Colab allows. I figured out that TensorFlow 1.4.0 doesn't use the GPU. Is there any solution for that? I tried a lot but got no answer. Please help me. And if there is anything I should consider while using Colab for that project, please let me know.

regards

opened by m1996 14
Try TADE on custom dataset

Hi,

Your excellent work really caught my eye, and I want to try TADE on my own dataset to test whether it works for industry tasks, but the results don't look good compared with conventional methods like focal loss. The results are shown below:

TADE without test-time training: tr: 84 acc, val: 87 acc
TADE with test-time training, 1 epoch: val 82.25
TADE with test-time training, 5 epochs: val 38.41
TADE with test-time training, 8 epochs: val 30
Focal Loss: tr: 88 acc, val: 89 acc

It seems that the result of TADE without test-time training is slightly worse than Focal Loss, and as the number of test-time training epochs increases, the accuracy gets worse. The custom dataset is an industrial defect classification task, so most pictures have a similar background. The pictures are divided into three categories. Training set: cls_num_list = [2883,1019,56]

Question: I am not sure whether this is because there are only three categories in my dataset, so that it is hard for the output_logit vector to represent similarity, which makes the self-supervised aggregation perform worse. Do you have any idea about that?

opened by Lllllp93 12
About CIFAR10-LT's Implementation details

Hello, in your paper, the top-1 accuracy on CIFAR10-LT (imbalance ratio = 10, 100) is 90.8% and 83.8%, but when I run your source code, the top-1 accuracy on CIFAR10-LT (imbalance ratio = 10, 100) is 90.16% and 82.92%. What are the specific implementation details for CIFAR10-LT? Thank you~

opened by sunhappy6900 6
n_gpus vs batch_size
Are you offsetting the batch_size for the number n_gpus in the config itself?

CIFAR100:

n_gpus = 1

batch_size = 128 Effective batchsize => 128*1 = 128 ?

ImageNet-LT:

n_gpus = 2

batch_size = 64 Effective batchsize => 64*2 = 128 ?

iNaturalist18:

n_gpus = 4

batch_size = 512 Effective batchsize => 512*4 = 2048 ?
opened by rahulvigneswaran 5
Doubts regarding the experimental setup
For CIFAR100-LT:
a. Are there different val and test sets?
b. On which dataset split do you choose the best-trained model?
c. Which split do you use for hyperparameter tuning?

For iNaturalist18:
a. Are there different val and test sets?
b. On which dataset split do you choose the best-trained model?
c. Which split do you use for hyperparameter tuning?
d. Even though there is an officially available test set (https://github.com/visipedia/inat_comp/tree/master/2018#Data) for iNaturalist18, why don't you use that?

General doubts:
a. What seeds do you use?
b. Do you take a mean over multiple seeds?
opened by rahulvigneswaran 4
A question about performance.
A great job. Your work solves a wider range of LT problems.

But I'm confused by TADE's performance on the vanilla LT test set.

Actually, with the same backbone and training strategy, the following methods adopt almost the same loss, but the top-1 ACC varies, for example on CIFAR100-LT-IR-100:

ICLR'21 logit adjustment [43.89% cf. origin paper Tab.3 ]

CVPR'21 LADE without test prior [45.6% cf. this paper Tab.8(a)]

NeurIPS'20 Balanced Softmax which can be rewritten as Eq.3 in this paper [46.1% cf. this paper Tab.8(a)]

In such a situation, TADE should get the best performance when the expert E2 (Eq.3 in this paper) mainly works. If so, it should not outperform the above methods by a large margin, right?

However, the TADE's top-1 ACC is 49.8% (cf. this paper Tab.8(a)) and the weight of experts is [0.40 0.35 0.24] (cf. this paper Tab. 12). The E1 mainly works.

So I am just wondering how to explain the improvement of TADE on the vanilla test dataset?
opened by XuZhengzhuo 3
About Backbone

Hi, Thank you very much for your work. I would like to ask if you tried to use ResNeXt101-32x4d instead of ResNeXt50 in your experiments. After my experiments, ResNeXt101 is not as effective as ResNeXt50. Are there any other parameters that need to be changed besides the backbone?

Best,

opened by oldfemalepig 3
About a question of test_training_cifar.py

In lines 200 and 201 of test_training_cifar.py:

dataset = IMBALANCECIFAR100(data_dir, train=True, download=True, transform=train_trsfm, imb_type=imb_type, imb_factor=test_imb_factor, reverse=reverse)
train_dataset = IMBALANCECIFAR100(data_dir, train=True, download=True, transform=TwoCropsTransform(train_trsfm), imb_type=imb_type, imb_factor=test_imb_factor, reverse=reverse)

Why do you set train to True? I think it should be False to obtain the weighting parameters on the test set. Can you explain it? Thanks!

opened by lastonephy 2
About the setting of shared backbone and separate expert

Hi~ Thanks for your excellent work. I have two questions about the paper and the code. (1) I notice that, by default, the shared backbone contains only layer_1 and layer_2 of ResNet, while the other layers (layer_3 and layer_4) are all in the "expert". Could this setting be described as a "shared backbone"? I mean, on reading "shared backbone", readers will assume that only the classifier layer is treated as the "expert". (2) Have you tried the setting in which only the classifier layer is treated as the "expert"? How much does the performance decrease?

opened by zhiyuanyou 2
Where to find the Objective function in the code

In Section 4.3 of the paper, an objective function is used to calculate the weight of each expert. I guess it is in test_all.py, but I can't find it. Please tell me the specific position in the code, thank you!

opened by madoka109 2
Implementation detail about LDAM loss
Hi Vanint, I notice that in your LDAM loss implementation the scale is applied to the adjustment only

x_m = x - batch_m * self.s

which is different from the original LDAM loss

return F.cross_entropy(self.s*output, target, weight=self.weight)

basically equivalent to

x_m = (x - batch_m) * self.s

Could you explain more about this detail? Is the coefficient absorbed somewhere in the logit output?
opened by fliman 2
