PyTorch implementation of "Large-Scale Long-Tailed Recognition in an Open World" (CVPR 2019 Oral)

Large-Scale Long-Tailed Recognition in an Open World

[Project] [Paper] [Blog]

Overview

Open Long-Tailed Recognition (OLTR) is the author's re-implementation of the long-tail recognizer described in:
"Large-Scale Long-Tailed Recognition in an Open World"
Ziwei Liu*, Zhongqi Miao*, Xiaohang Zhan, Jiayun Wang, Boqing Gong, Stella X. Yu (CUHK & UC Berkeley / ICSI), in IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2019, Oral Presentation

For further information, please contact Zhongqi Miao and Ziwei Liu.

Update notifications

  • 03/04/2020: We renamed all variables named selfatt to modulatedatt so that the attention module can be properly trained in the second stage for Places-LT. ImageNet-LT does not have this problem since its weights are not frozen. We have updated the results using the fixed code, and they are still better than reported. The weights have also been updated. Thanks!
  • 02/11/2020: We updated the configuration files for the Places_LT dataset. The current results are slightly higher than reported, even with the updated F-measure calculation. One important thing to note is that we have unfrozen the model weights for the first-stage training of Places-LT, which means single-GPU training is not suitable in most cases (we used four 1080 Ti GPUs in our implementation). However, since the memory and center loss do not currently support multiple GPUs, please switch back to single-GPU training for the second stage. Thank you very much!
  • 01/29/2020: We updated the false-positive calculation in utils.py so that the numbers are normal again. The F-measure numbers reported in the paper might be slightly higher than the actual numbers for all baselines. We will update them as soon as possible. The new F-measure numbers are in the table below. Thanks.
  • 12/19/2019: Updated modules with 'clone()' methods and set use_fc to False in the ImageNet-LT stage-1 config. The results for ImageNet-LT are now comparable to (slightly better than) the numbers reported in the paper, and the reproduced results are updated below. We also found a bug in Places-LT and will update the code and reproduced results as soon as possible.
  • 08/05/2019: Fixed a bug in utils.py. Updated the re-implemented ImageNet-LT weights at the end of this page.
  • 05/02/2019: Fixed a bug in run_networks.py so that the models train properly. Updated the configuration file for ImageNet-LT stage-1 training so that the results from the paper can be reproduced.

Requirements

Data Preparation

NOTE: The Places-LT dataset has been updated since the first version. Please download it again if you have the first version.

  • First, please download ImageNet_2014 and Places_365 (the 256x256 version). Please also change data_root in main.py accordingly (a sketch is given after the directory layout below).

  • Next, please download ImageNet-LT and Places-LT from here. Please put the downloaded files into the data directory like this:

data
  |--ImageNet_LT
    |--ImageNet_LT_open
    |--ImageNet_LT_train.txt
    |--ImageNet_LT_test.txt
    |--ImageNet_LT_val.txt
    |--ImageNet_LT_open.txt
  |--Places_LT
    |--Places_LT_open
    |--Places_LT_train.txt
    |--Places_LT_test.txt
    |--Places_LT_val.txt
    |--Places_LT_open.txt
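
For reference, here is a minimal, hypothetical sketch of the data_root change mentioned above. The actual variable layout in main.py may differ, so adjust the keys and paths to whatever the script expects:

# Hypothetical sketch only -- point these paths at your local dataset copies.
data_root = {
    'ImageNet': '/path/to/ImageNet_2014',   # full ImageNet 2014 images
    'Places': '/path/to/Places_365',        # Places365, 256x256 version
}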

Download Caffe Pre-trained Models for Places_LT Stage_1 Training

  • Caffe-pretrained ResNet-152 weights can be downloaded from here; save the file to ./logs/caffe_resnet152.pth.
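
As a rough, hedged sketch (not necessarily the repository's exact loading code), the converted checkpoint can be loaded into a torchvision ResNet-152 roughly as follows; key names may need remapping depending on how the Caffe weights were converted:

import torch
from torchvision.models import resnet152

# Load the converted Caffe checkpoint and copy it into a torchvision ResNet-152.
# strict=False tolerates classifier-head or key-name mismatches.
weights = torch.load('./logs/caffe_resnet152.pth', map_location='cpu')
model = resnet152(num_classes=1000)
model.load_state_dict(weights, strict=False)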

Getting Started (Training & Testing)

ImageNet-LT

  • Stage 1 training:
python main.py --config ./config/ImageNet_LT/stage_1.py
  • Stage 2 training:
python main.py --config ./config/ImageNet_LT/stage_2_meta_embedding.py
  • Close-set testing:
python main.py --config ./config/ImageNet_LT/stage_2_meta_embedding.py --test
  • Open-set testing (thresholding)
python main.py --config ./config/ImageNet_LT/stage_2_meta_embedding.py --test_open
  • Test on stage 1 model
python main.py --config ./config/ImageNet_LT/stage_1.py --test

Places-LT

  • Stage 1 training (at this stage, multiple GPUs might be necessary since we are fine-tuning a ResNet-152):
python main.py --config ./config/Places_LT/stage_1.py
  • Stage 2 training (at this stage, only single-GPU training is supported, so please switch back to a single GPU):
python main.py --config ./config/Places_LT/stage_2_meta_embedding.py
  • Close-set testing:
python main.py --config ./config/Places_LT/stage_2_meta_embedding.py --test
  • Open-set testing (thresholding)
python main.py --config ./config/Places_LT/stage_2_meta_embedding.py --test_open
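
For context, the open-set ("thresholding") evaluation treats low-confidence predictions as the unknown class, labelled -1. Below is a rough sketch of that decision rule, assuming softmax probabilities and an illustrative threshold value; the repository's own evaluation code (utils.py) is the authoritative implementation:

import torch

def open_set_predict(logits, threshold=0.1):
    # Predictions whose top softmax probability falls below the threshold
    # are assigned label -1, i.e. treated as open/unknown samples.
    probs = torch.softmax(logits, dim=1)
    conf, preds = probs.max(dim=1)
    preds[conf < threshold] = -1
    return preds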

Reproduced Benchmarks and Model Zoo (Updated on 03/05/2020)

ImageNet-LT Open-Set Setting

Backbone    Many-Shot   Medium-Shot   Few-Shot   F-Measure   Download
ResNet-10   44.2        35.2          17.5       44.6        model

Places-LT Open-Set Setting

Backbone     Many-Shot   Medium-Shot   Few-Shot   F-Measure   Download
ResNet-152   43.7        40.2          28.0       50.0        model

CAUTION

The current code was prepared for single-GPU use. Using multiple GPUs can cause problems, except for the first-stage training of Places-LT.

License and Citation

This software is released under the BSD-3 license.

@inproceedings{openlongtailrecognition,
  title={Large-Scale Long-Tailed Recognition in an Open World},
  author={Liu, Ziwei and Miao, Zhongqi and Zhan, Xiaohang and Wang, Jiayun and Gong, Boqing and Yu, Stella X.},
  booktitle={IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2019}
}
Comments
  • Reproduce model results

    Thanks for the inspiring work and code :)

    I'm having trouble reproducing the results (for both the plain model and the final model on both datasets). I have used the default settings without any alterations. Can you shed some light on these results (perhaps they are caused by the hyper-parameters), and would it be OK for you to provide the trained models for both stage 1 and stage 2?

    The results I have reproduced are as follows:

    1. ImageNet-LT

    Stage1(close-setting): Evaluation_accuracy_micro_top1: 0.204 Averaged F-measure: 0.160 Many_shot_top1: 0.405; Median_shot_top1: 0.099; Low_shot_top1: 0.006

    Stage1(open-setting): Open-set Accuracy: 0.178 Evaluation_accuracy_micro_top1: 0.199 Averaged F-measure: 0.291 Many_shot_top1: 0.396; Median_shot_top1: 0.096; Low_shot_top1: 0.006

    Stage2(close-setting): Evaluation_accuracy_micro_top1: 0.339 Averaged F-measure: 0.322 Many_shot_top1: 0.411; Median_shot_top1: 0.330; Low_shot_top1: 0.167

    Stage2(open-setting): Open-set Accuracy: 0.245 Evaluation_accuracy_micro_top1: 0.327 Averaged F-measure: 0.455 Many_shot_top1: 0.398; Median_shot_top1: 0.318; Low_shot_top1: 0.159

    2. Places-LT

    Stage1(close-setting): Evaluation_accuracy_micro_top1: 0.268 Averaged F-measure: 0.248 Many_shot_top1: 0.442; Median_shot_top1: 0.221; Low_shot_top1: 0.058

    Stage1(open-setting): Open-set Accuracy: 0.018 Evaluation_accuracy_micro_top1: 0.267 Averaged F-measure: 0.373 Many_shot_top1: 0.441; Median_shot_top1: 0.219; Low_shot_top1: 0.057

    Stage2(close-setting): Evaluation_accuracy_micro_top1: 0.349 Averaged F-measure: 0.338 Many_shot_top1: 0.387; Median_shot_top1: 0.355; Low_shot_top1: 0.263

    Stage2(open-setting): Open-set Accuracy: 0.120 Evaluation_accuracy_micro_top1: 0.342 Averaged F-measure: 0.477 Many_shot_top1: 0.382; Median_shot_top1: 0.349; Low_shot_top1: 0.254

    bug 
    opened by JasAva 15
  • Code Error

    Hello, when I run python main.py --config ./config/Places_LT/stage_2_meta_embedding.py, I get the following error.

    File "./models/MetaEmbeddingClassifier.py", line 33, in forward dist_cur = torch.norm(x_expand - centroids_expand, 2, 2) RuntimeError: The size of tensor a (365) must match the size of tensor b (122) at non-singleton dimension 1

    Here, I printed the shapes of x_expand and centroids_expand:

    torch.Size([86, 365, 512]) torch.Size([86, 122, 512])

    Could you give some advice to solve this problem?

    enhancement question 
    opened by AmingWu 15
  • Reproducing Plain Model Baseline Accuracies

    Hi, thank you for releasing the code for your paper. Can you please clarify how to reproduce the accuracies for the plain-model baseline on ImageNet-LT in Table 3 of your paper? I'm running the following commands:
    python main.py --config ./config/ImageNet_LT/stage_1.py
    python main.py --config ./config/ImageNet_LT/stage_1.py --test
    which give me the following output: Evaluation_accuracy_micro_top1: 0.119 Averaged F-measure: 0.108 Many_shot_accuracy_top1: 0.148 Median_shot_accuracy_top1: 0.112 Low_shot_accuracy_top1: 0.062

    From the paper, the numbers should be : Evaluation_accuracy_micro_top1: 0.209 Many_shot_accuracy_top1: 0.409 Median_shot_accuracy_top1: 0.107 Low_shot_accuracy_top1: 0.004

    Stage 1 is simply the baseline ResNet-10 trained on the entire dataset, right? Or am I missing something?

    bug 
    opened by ssfootball04 9
  • Maybe wrong in F_measure calculation on openset

    https://github.com/zhmiao/OpenLongTailRecognition-OLTR/blob/master/utils.py#L88

    It seems it should be changed from:

    false_pos += 1 if preds[i] != labels[i] and labels[i] != -1 and preds[i] != -1 else 0

    to:

    false_pos += 1 if preds[i] != labels[i] and ((labels[i] != -1 and preds[i] != -1) or labels[i] == -1) else 0

    opened by tuobay 7
  • Convolutional layers in ResNet152 are frozen in stage 1

    @zhmiao After carefully reading the code, I find that the convolutional layers in ResNet-152 are frozen during stage-1 training, which makes that training meaningless. Please help check this. Thanks.

    question 
    opened by jchhuang 7
  • Question about centroids update

    Thank you for releasing the code for this awesome work. I have a question about the centroid update. I have read the code and found that the centroids are only calculated once, at model initialization in stage 2. Could you help me figure out how the centroids are updated? I also wonder whether these centroids are correct, because the attention parameters are only just initialized and the centroid features have not been learned yet.

    question 
    opened by xyy19920105 6
  • All the Backbone Net used by the compared methods are also frozen when training the datasets?

    I am curious whether the weights of the backbone networks used by the compared methods are also frozen when training on these datasets. If so, I think it is not fair to compare their performance, since those methods cannot reach their full performance if their weights are frozen.

    opened by jchhuang 5
  • The calculation of open-set F-measure

    Hi, I wonder whether true positives, false positives and false negatives are counted correctly. https://github.com/zhmiao/OpenLongTailRecognition-OLTR/blob/4a1f4009921b1c99029bfda151915058ff086a51/utils.py#L86-L89 Here are some examples according to the above code (pairs of prediction and label):

    • class_a, class_a (TP)
    • class_b, class_a (FP)
    • -1, class_a (?)
    • class_a, -1 (FN)
    • -1, -1 (?)

    I'm confused about

    • why is the 2nd example counted as FP rather than FN? (FP means the label is negative but the prediction is positive, so what counts as positive here?)
    • why is the 3rd example not counted as FN?
    • is the last example TN or TP?
    question 
    opened by drcege 5
  • CosNormClassifier#L24

    opened by sudabai666 5
  • why fix all parameters except self attention parameters?

    Hi, dear authors: I'm wondering why, in stage 2, you fix all parameters except the self-attention parameters. Do you mean that for the feature model we only use stage 1 to learn the features, and in stage 2 we only learn the self-attention? Could we learn all the parameters in stage 2 instead? Thanks very much for your explanation!

    opened by RainbowShirlley 4
  • Regarding the datasets

    Hi, thank you again for your code release. I am puzzled by the following issues, which I'm hoping you can help me with: Places-LT has 62.5K examples, which differs from the 184.5K images reported in the paper. Is the mistake in the paper or in the released dataset? Also, I am unable to reproduce the dataset statistics for ImageNet-LT and Places-LT using Zipf's law (discrete Pareto distribution: https://en.wikipedia.org/wiki/Pareto_distribution, https://en.wikipedia.org/wiki/Zipf%27s_law) with alpha=6 (which seems rather high). Moreover, the log-log plot is not completely linear in my opinion (log-log plots for ImageNet-LT and Places-LT attached in the issue).

    opened by ssfootball04 4
  • Could you please give me an example of arranging ILSVRC2014 dataset?

    I have downloaded ILSVRC2014 and untarred the training set (directory screenshots attached in the issue).
    So how should I arrange the files? Could you give me an example? (For example, the COCO dataset is usually arranged as annotations, train2017, and val2017.)
    I am in urgent need; many thanks to you~~

    opened by TalentedMUSE 2
  • Unable to reproduce baseline result on ImageNet-LT

    Hi, I would like to ask: for the plain ResNet-10 model on ImageNet-LT, how many epochs did you use? If I train for 90 epochs, the performance is much better than the reported baseline performance. Thank you!

    opened by Sonseca97 1
  • Error when running stage_1.py under Places_LT

    Hi, I am trying to train on my own dataset and have changed all the data paths accordingly. But when I try to fine-tune ResNet-152, I encounter the following error and don't know how to fix it. Any help would be appreciated!

    Traceback (most recent call last):
      File "main.py", line 57, in <module>
        training_model.train()
      File "/data/OLTR/OpenLongTailRecognition-OLTR/run_networks.py", line 192, in train
        for step, (inputs, labels, _) in enumerate(self.data['train']):
      File "/opt/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 819, in __next__
        return self._process_data(data)
      File "/opt/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 846, in _process_data
        data.reraise()
      File "/opt/anaconda3/lib/python3.6/site-packages/torch/_utils.py", line 369, in reraise
        raise self.exc_type(msg)
    TypeError: function takes exactly 5 arguments (1 given)

    opened by Zheweiqiu 0
  • Applications for face recognition

    Hi,

    I have read the paper and am wondering which features (meta-embedding or direct feature) you used as the face recognition embedding.

    Since MegaFace is an open set, each meta-embedding would be a near-zero vector, so it may not be right to use the meta-embedding as the face recognition embedding, am I right?

    opened by twmht 0