PyTorch implementation of "Large-Scale Long-Tailed Recognition in an Open World" (CVPR 2019 Oral)

Large-Scale Long-Tailed Recognition in an Open World

[Project] [Paper] [Blog]

Overview

Open Long-Tailed Recognition (OLTR) is the author's re-implementation of the long-tail recognizer described in:
"Large-Scale Long-Tailed Recognition in an Open World"
Ziwei Liu*, Zhongqi Miao*, Xiaohang Zhan, Jiayun Wang, Boqing Gong, Stella X. Yu (CUHK & UC Berkeley / ICSI), in IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2019, Oral Presentation

For further information, please contact Zhongqi Miao and Ziwei Liu.

Update notifications

  • 03/04/2020: We renamed all variables named selfatt to modulatedatt so that the attention module can be properly trained in the second stage for Places-LT. ImageNet-LT does not have this problem since its weights are not frozen. We have updated the results using the fixed code, and they are still better than reported. The weights have also been updated. Thanks!
  • 02/11/2020: We updated the configuration files for the Places_LT dataset. The current results are slightly higher than reported, even with the updated F-measure calculation. One important thing to note is that we have unfrozen the model weights for the first-stage training of Places-LT, which means single-GPU training is not suitable in most cases (we used four 1080 Ti GPUs in our implementation). However, since the memory and center loss do not currently support multiple GPUs, please switch back to single-GPU training for the second stage. Thank you very much!
  • 01/29/2020: We updated the false-positive calculation in utils.py so that the numbers are normal again. The F-measure numbers reported in the paper might be slightly higher than the actual numbers for all baselines. We will update them as soon as possible. The new F-measure numbers are in the table below. Thanks.
  • 12/19/2019: Updated modules with 'clone()' methods and set use_fc to False in the ImageNet-LT stage-1 config. The results for ImageNet-LT are now comparable to (slightly better than) the numbers reported in the paper, and the reproduced results are updated below. We also found a bug in Places-LT and will update the code and reproduced results as soon as possible.
  • 08/05/2019: Fixed a bug in utils.py. Updated the re-implemented ImageNet-LT weights at the end of this page.
  • 05/02/2019: Fixed a bug in run_networks.py so that the models train properly. Updated the configuration file for ImageNet-LT stage-1 training so that the results from the paper can be reproduced.

Requirements

Data Preparation

NOTE: The Places-LT dataset has been updated since the first version. Please download it again if you have the first version.

  • First, please download ImageNet_2014 and Places_365 (the 256x256 version). Please also change data_root in main.py accordingly (a sketch is given after the directory layout below).

  • Next, please download ImageNet-LT and Places-LT from here. Please put the downloaded files into the data directory like this:

data
  |--ImageNet_LT
    |--ImageNet_LT_open
    |--ImageNet_LT_train.txt
    |--ImageNet_LT_test.txt
    |--ImageNet_LT_val.txt
    |--ImageNet_LT_open.txt
  |--Places_LT
    |--Places_LT_open
    |--Places_LT_train.txt
    |--Places_LT_test.txt
    |--Places_LT_val.txt
    |--Places_LT_open.txt
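
For reference, here is a minimal, hypothetical sketch of the data_root change mentioned above. The actual variable layout in main.py may differ, so adjust the keys and paths to whatever the script expects:

# Hypothetical sketch only -- point these paths at your local dataset copies.
data_root = {
    'ImageNet': '/path/to/ImageNet_2014',   # full ImageNet 2014 images
    'Places': '/path/to/Places_365',        # Places365, 256x256 version
}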

Download Caffe Pre-trained Models for Places_LT Stage_1 Training

  • Caffe-pretrained ResNet-152 weights can be downloaded from here; save the file to ./logs/caffe_resnet152.pth.
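
As a rough, hedged sketch (not necessarily the repository's exact loading code), the converted checkpoint can be loaded into a torchvision ResNet-152 roughly as follows; key names may need remapping depending on how the Caffe weights were converted:

import torch
from torchvision.models import resnet152

# Load the converted Caffe checkpoint and copy it into a torchvision ResNet-152.
# strict=False tolerates classifier-head or key-name mismatches.
weights = torch.load('./logs/caffe_resnet152.pth', map_location='cpu')
model = resnet152(num_classes=1000)
model.load_state_dict(weights, strict=False)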

Getting Started (Training & Testing)

ImageNet-LT

  • Stage 1 training:
python main.py --config ./config/ImageNet_LT/stage_1.py
  • Stage 2 training:
python main.py --config ./config/ImageNet_LT/stage_2_meta_embedding.py
  • Close-set testing:
python main.py --config ./config/ImageNet_LT/stage_2_meta_embedding.py --test
  • Open-set testing (thresholding)
python main.py --config ./config/ImageNet_LT/stage_2_meta_embedding.py --test_open
  • Test on stage 1 model
python main.py --config ./config/ImageNet_LT/stage_1.py --test

Places-LT

  • Stage 1 training (at this stage, multiple GPUs might be necessary since we are fine-tuning a ResNet-152):
python main.py --config ./config/Places_LT/stage_1.py
  • Stage 2 training (at this stage, only single-GPU training is supported, so please switch back to a single GPU):
python main.py --config ./config/Places_LT/stage_2_meta_embedding.py
  • Close-set testing:
python main.py --config ./config/Places_LT/stage_2_meta_embedding.py --test
  • Open-set testing (thresholding)
python main.py --config ./config/Places_LT/stage_2_meta_embedding.py --test_open
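
For context, the open-set ("thresholding") evaluation treats low-confidence predictions as the unknown class, labelled -1. Below is a rough sketch of that decision rule, assuming softmax probabilities and an illustrative threshold value; the repository's own evaluation code (utils.py) is the authoritative implementation:

import torch

def open_set_predict(logits, threshold=0.1):
    # Predictions whose top softmax probability falls below the threshold
    # are assigned label -1, i.e. treated as open/unknown samples.
    probs = torch.softmax(logits, dim=1)
    conf, preds = probs.max(dim=1)
    preds[conf < threshold] = -1
    return preds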

Reproduced Benchmarks and Model Zoo (Updated on 03/05/2020)

ImageNet-LT Open-Set Setting

Backbone    Many-Shot   Medium-Shot   Few-Shot   F-Measure   Download
ResNet-10   44.2        35.2          17.5       44.6        model

Places-LT Open-Set Setting

Backbone     Many-Shot   Medium-Shot   Few-Shot   F-Measure   Download
ResNet-152   43.7        40.2          28.0       50.0        model

CAUTION

The current code was prepared for single-GPU use. Using multiple GPUs can cause problems, except for the first-stage training of Places-LT.

License and Citation

This software is released under the BSD-3 license.

@inproceedings{openlongtailrecognition,
  title={Large-Scale Long-Tailed Recognition in an Open World},
  author={Liu, Ziwei and Miao, Zhongqi and Zhan, Xiaohang and Wang, Jiayun and Gong, Boqing and Yu, Stella X.},
  booktitle={IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2019}
}
Comments
  • Reproduce model results

    Thanks for the inspiring work and code :)

    I'm having trouble reproducing the results (for both the plain model and the final model on both datasets). I have used the default settings without any alterations. Can you shed some light on these results (perhaps they are caused by the hyper-parameters), and would it be OK for you to provide the trained models for both stage 1 and stage 2?

    The results I have reproduced are as follows:

    1. ImageNet-LT

    Stage1(close-setting): Evaluation_accuracy_micro_top1: 0.204 Averaged F-measure: 0.160 Many_shot_top1: 0.405; Median_shot_top1: 0.099; Low_shot_top1: 0.006

    Stage1(open-setting): Open-set Accuracy: 0.178 Evaluation_accuracy_micro_top1: 0.199 Averaged F-measure: 0.291 Many_shot_top1: 0.396; Median_shot_top1: 0.096; Low_shot_top1: 0.006

    Stage2(close-setting): Evaluation_accuracy_micro_top1: 0.339 Averaged F-measure: 0.322 Many_shot_top1: 0.411; Median_shot_top1: 0.330; Low_shot_top1: 0.167

    Stage2(open-setting): Open-set Accuracy: 0.245 Evaluation_accuracy_micro_top1: 0.327 Averaged F-measure: 0.455 Many_shot_top1: 0.398; Median_shot_top1: 0.318; Low_shot_top1: 0.159

    2. Places-LT

    Stage1(close-setting): Evaluation_accuracy_micro_top1: 0.268 Averaged F-measure: 0.248 Many_shot_top1: 0.442; Median_shot_top1: 0.221; Low_shot_top1: 0.058

    Stage1(open-setting): Open-set Accuracy: 0.018 Evaluation_accuracy_micro_top1: 0.267 Averaged F-measure: 0.373 Many_shot_top1: 0.441; Median_shot_top1: 0.219; Low_shot_top1: 0.057

    Stage2(close-setting): Evaluation_accuracy_micro_top1: 0.349 Averaged F-measure: 0.338 Many_shot_top1: 0.387; Median_shot_top1: 0.355; Low_shot_top1: 0.263

    Stage2(open-setting): Open-set Accuracy: 0.120 Evaluation_accuracy_micro_top1: 0.342 Averaged F-measure: 0.477 Many_shot_top1: 0.382; Median_shot_top1: 0.349; Low_shot_top1: 0.254

    bug 
    opened by JasAva 15
  • Code Error

    Hello, when I run python main.py --config ./config/Places_LT/stage_2_meta_embedding.py, I get the following error.

    File "./models/MetaEmbeddingClassifier.py", line 33, in forward dist_cur = torch.norm(x_expand - centroids_expand, 2, 2) RuntimeError: The size of tensor a (365) must match the size of tensor b (122) at non-singleton dimension 1

    Here, I printed the shapes of x_expand and centroids_expand:

    torch.Size([86, 365, 512]) torch.Size([86, 122, 512])

    Could you give some advice to solve this problem?

    enhancement question 
    opened by AmingWu 15
  • Reproducing Plain Model Baseline Accuracies

    Hi, thank you for releasing the code for your paper. Can you please clarify how to reproduce the accuracies for the plain-model baseline on ImageNet-LT in Table 3 of your paper? I'm running the following commands:
    python main.py --config ./config/ImageNet_LT/stage_1.py
    python main.py --config ./config/ImageNet_LT/stage_1.py --test
    which give me the following output: Evaluation_accuracy_micro_top1: 0.119 Averaged F-measure: 0.108 Many_shot_accuracy_top1: 0.148 Median_shot_accuracy_top1: 0.112 Low_shot_accuracy_top1: 0.062

    From the paper, the numbers should be : Evaluation_accuracy_micro_top1: 0.209 Many_shot_accuracy_top1: 0.409 Median_shot_accuracy_top1: 0.107 Low_shot_accuracy_top1: 0.004

    Stage 1 is simply the baseline ResNet-10 trained on the entire dataset, right? Or am I missing something?

    bug 
    opened by ssfootball04 9
  • Maybe wrong in F_measure calculation on openset

    https://github.com/zhmiao/OpenLongTailRecognition-OLTR/blob/master/utils.py#L88

    It seems it should be changed from:

    false_pos += 1 if preds[i] != labels[i] and labels[i] != -1 and preds[i] != -1 else 0

    to:

    false_pos += 1 if preds[i] != labels[i] and ((labels[i] != -1 and preds[i] != -1) or labels[i] == -1) else 0

    opened by tuobay 7
  • Convolutional layers in ResNet152 are frozen in stage 1

    @zhmiao After carefully reading the code, I find that the convolutional layers in ResNet-152 are frozen during stage-1 training, which makes that training meaningless. Please help check this. Thanks.

    question 
    opened by jchhuang 7
  • Question about centroids update

    Thank you for releasing the code for this awesome work. I have a question about the centroid update. I have read the code and found that the centroids are only calculated once, at model initialization in stage 2. Could you help me figure out how the centroids are updated? I also wonder whether these centroids are correct, because the attention parameters are only just initialized and the centroid features have not been learned yet.

    question 
    opened by xyy19920105 6
  • All the Backbone Net used by the compared methods are also frozen when training the datasets?

    I am curious whether the weights of the backbone networks used by the compared methods are also frozen when training on these datasets. If so, I think it is not fair to compare their performance, since those methods cannot reach their full performance if their weights are frozen.

    opened by jchhuang 5
  • The calculation of open-set F-measure

    Hi, I wonder whether true positives, false positives and false negatives are counted correctly. https://github.com/zhmiao/OpenLongTailRecognition-OLTR/blob/4a1f4009921b1c99029bfda151915058ff086a51/utils.py#L86-L89 Here are some examples according to the above code (pairs of prediction and label):

    • class_a, class_a (TP)
    • class_b, class_a (FP)
    • -1, class_a (?)
    • class_a, -1 (FN)
    • -1, -1 (?)

    I'm confused about

    • why is the 2nd example counted as FP rather than FN? (FP means the label is negative but the prediction is positive, so what counts as positive here?)
    • why is the 3rd example not counted as FN?
    • is the last example TN or TP?
    question 
    opened by drcege 5
  • CosNormClassifier#L24

    opened by sudabai666 5
  • why fix all parameters except self attention parameters?

    Hi, dear authors: I'm wondering why, in stage 2, you fix all parameters except the self-attention parameters. Do you mean that for the feature model we only use stage 1 to learn the features, and in stage 2 we only learn the self-attention? Could we learn all the parameters in stage 2 instead? Thanks very much for your explanation!

    opened by RainbowShirlley 4
  • Regarding the datasets

    Hi, thank you again for your code release. I am puzzled by the following issues, which I'm hoping you can help me with: Places-LT has 62.5K examples, which differs from the 184.5K images reported in the paper. Is the mistake in the paper or in the released dataset? Also, I am unable to reproduce the dataset statistics for ImageNet-LT and Places-LT using Zipf's law (discrete Pareto distribution: https://en.wikipedia.org/wiki/Pareto_distribution, https://en.wikipedia.org/wiki/Zipf%27s_law) with alpha=6 (which seems rather high). Moreover, the log-log plot is not completely linear in my opinion (log-log plots for ImageNet-LT and Places-LT attached in the issue).

    opened by ssfootball04 4
  • Could you please give me an example of arranging ILSVRC2014 dataset?

    I have downloaded ILSVRC2014 and untarred the training set (directory screenshots attached in the issue).
    So how should I arrange the files? Could you give me an example? (For example, the COCO dataset is usually arranged as annotations, train2017, and val2017.)
    I am in urgent need; many thanks to you~~

    opened by TalentedMUSE 2
  • Unable to reproduce baseline result on ImageNet-LT

    Hi, I would like to ask: for the plain ResNet-10 model on ImageNet-LT, how many epochs did you use? If I train for 90 epochs, the performance is much better than the reported baseline performance. Thank you!

    opened by Sonseca97 1
  • Error when running stage_1.py under Places_LT

    Hi, I am trying to train on my own dataset and have changed all the data paths accordingly. But when I try to fine-tune ResNet-152, I encounter the following error and don't know how to fix it. Any help would be appreciated!

    Traceback (most recent call last):
      File "main.py", line 57, in <module>
        training_model.train()
      File "/data/OLTR/OpenLongTailRecognition-OLTR/run_networks.py", line 192, in train
        for step, (inputs, labels, _) in enumerate(self.data['train']):
      File "/opt/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 819, in __next__
        return self._process_data(data)
      File "/opt/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 846, in _process_data
        data.reraise()
      File "/opt/anaconda3/lib/python3.6/site-packages/torch/_utils.py", line 369, in reraise
        raise self.exc_type(msg)
    TypeError: function takes exactly 5 arguments (1 given)

    opened by Zheweiqiu 0
  • Applications for face recognition

    Hi,

    I have read the paper and am wondering which features (meta-embedding or direct feature) you used as the face recognition embedding.

    Since MegaFace is an open set, each meta-embedding would be a near-zero vector, so it may not be right to use the meta-embedding as the face recognition embedding, am I right?

    opened by twmht 0