Official implementation for CVPR 2021 paper: Adaptive Class Suppression Loss for Long-Tail Object Detection

CASIA-IVA-Lab

Last update: Dec 4, 2022

Overview

Adaptive Class Suppression Loss for Long-Tail Object Detection

This repo is the official implementation of the CVPR 2021 paper: Adaptive Class Suppression Loss for Long-Tail Object Detection. [Paper]

Requirements

1. Environment:

The requirements are exactly the same as those of BalancedGroupSoftmax. We tested with the following settings:

python 3.7
cuda 10.0
pytorch 1.2.0
torchvision 0.4.0
mmcv 0.2.14

conda create -n mmdet python=3.7 -y
conda activate mmdet

pip install cython
pip install numpy
# pin the tested versions listed above; an unpinned install pulls the latest release
pip install torch==1.2.0
pip install torchvision==0.4.0
pip install pycocotools
pip install matplotlib
pip install terminaltables

# download the source code of mmcv 0.2.14 from https://github.com/open-mmlab/mmcv/tree/v0.2.14
cd mmcv-0.2.14
pip install -v -e .
cd ../

git clone https://github.com/CASIA-IVA-Lab/ACSL.git

cd ACSL/lvis-api/
python setup.py develop

cd ../
python setup.py develop
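
After installation, it can help to confirm that the installed versions match the tested ones listed above. The following is a minimal sketch, not part of the repo: the helper names are hypothetical, and it assumes each package exposes a `__version__` attribute.

```python
import importlib

# Versions taken from the tested settings listed above.
TESTED = {"torch": "1.2.0", "torchvision": "0.4.0", "mmcv": "0.2.14"}


def installed_version(module_name):
    """Return the module's __version__ string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except Exception:
        # Any import failure is treated as "not installed" for this check.
        return None
    return getattr(module, "__version__", None)


def compare_with_tested(tested=TESTED):
    """Map each package to (found_version, tested_version, matches)."""
    report = {}
    for name, expected in tested.items():
        found = installed_version(name)
        # startswith tolerates local build suffixes appended to the version.
        matches = found is not None and found.startswith(expected)
        report[name] = (found, expected, matches)
    return report


if __name__ == "__main__":
    for name, (found, expected, ok) in compare_with_tested().items():
        status = "OK" if ok else "MISMATCH"
        print(f"{name}: found {found}, tested {expected} [{status}]")
```

A mismatch does not necessarily mean the code will fail; it only flags a difference from the configuration the authors tested.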

2. Data:

a. For dataset images:

# Make sure you are in dir ACSL

mkdir data
cd data
mkdir lvis
mkdir pretrained_models
mkdir download_models

If you already have the COCO2017 dataset, link its train2017 and val2017 folders under the lvis folder.
If you do not have the COCO2017 dataset, download the COCO train set and COCO val set, unzip them, and move them under the lvis folder.

b. For dataset annotations:

Download the LVIS annotations: lvis_train_ann and lvis_val_ann.
Unzip all the files and put them under lvis.

c. For pretrained models:

Download the corresponding pre-trained models below.

Baseline models are initialized from models trained on COCO. Please download the corresponding COCO models from the mmdetection model zoo.
Move these model files to ./data/pretrained_models/

d. For download_models:

Download the trained baseline models and ACSL models from BaiduYun (extraction code: 2jp3).

ACSL models are initialized from the corresponding baseline models trained on LVIS, and all parameters except those of the last FC layer are fixed during training.
Move these model files to ./data/download_models/
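
The idea of fixing all parameters except the last FC layer can be sketched as follows. This is an illustration only: the repo's configs may already handle the freezing, the function name is hypothetical, and the assumption that the classification layer's parameter names contain `fc_cls` should be checked against your model.

```python
def freeze_all_but_classifier(model, trainable_key="fc_cls"):
    """Freeze every parameter whose name does not contain `trainable_key`.

    `model` is any object exposing `named_parameters()`, such as a
    torch.nn.Module. `trainable_key` is an assumed naming convention for
    the last FC (classification) layer.
    Returns the names of the parameters that remain trainable.
    """
    trainable = []
    for name, param in model.named_parameters():
        # Only parameters of the classification layer keep their gradients.
        param.requires_grad = trainable_key in name
        if param.requires_grad:
            trainable.append(name)
    return trainable
```

Calling `freeze_all_but_classifier(model)` before building the optimizer and printing the returned names is a quick way to confirm that only the intended layer will be updated.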

After all these operations, the folder data should be like this:

    data
    ├── lvis
    │   ├── lvis_v0.5_train.json
    │   ├── lvis_v0.5_val.json
    │   ├── train2017
    │   │   ├── 000000100582.jpg
    │   │   ├── 000000102411.jpg
    │   │   ├── ......
    │   └── val2017
    │       ├── 000000062808.jpg
    │       ├── 000000119038.jpg
    │       ├── ......
    ├── pretrained_models
    │   ├── faster_rcnn_r50_fpn_2x_20181010-443129e1.pth
    │   ├── ......
    └── download_models
        ├── R50-baseline.pth
        ├── ......
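
The layout above can be verified before training with a short script. This is a minimal sketch rather than a tool shipped with the repo; the function name is hypothetical and it only checks the top-level entries shown in the tree, not individual images or model files.

```python
import os

# Entries expected under the data folder, relative to it (from the tree above).
EXPECTED = [
    "lvis/lvis_v0.5_train.json",
    "lvis/lvis_v0.5_val.json",
    "lvis/train2017",
    "lvis/val2017",
    "pretrained_models",
    "download_models",
]


def missing_entries(data_root="data"):
    """Return the expected entries that do not exist under data_root."""
    return [
        rel for rel in EXPECTED
        if not os.path.exists(os.path.join(data_root, rel))
    ]


if __name__ == "__main__":
    # Run from the ACSL directory so that "data" resolves correctly.
    missing = missing_entries()
    if missing:
        print("missing: " + ", ".join(missing))
    else:
        print("data layout OK")
```

Because `os.path.exists` follows symbolic links, linked train2017 and val2017 folders pass the check as long as the link targets exist.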

Training

Note: Please make sure that the pretrained_models and the download_models have been prepared and placed at the paths specified in ${CONFIG_FILE}.
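
As an illustration of what "the path specified in the config" might look like, here is a hypothetical excerpt written in the style of an mmdetection v1 config. It is not copied from the repo: `load_from` and `work_dir` are standard mmdetection config keys, but the exact values in the shipped config files may differ, so check the file you actually use.

```python
# Hypothetical excerpt in the style of an mmdetection v1 config (assumed values).

# Checkpoint used to initialize training; for ACSL this would be the
# LVIS baseline model placed under ./data/download_models/.
load_from = "./data/download_models/R50-baseline.pth"

# Directory where checkpoints such as epoch_12.pth are written.
work_dir = "./work_dirs/acsl/faster_rcnn_r50_fpn_1x_lvis_tunefc_acsl"
```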

Use the following commands to train a model.

# Single GPU
python tools/train.py ${CONFIG_FILE}

# Multi GPU distributed training
./tools/dist_train.sh ${CONFIG_FILE} ${GPU_NUM} [optional arguments]

All config files are under ./configs/.

./configs/baselines: all baseline models.
./configs/acsl: all ACSL models.

For example, to train an ACSL model with Faster R-CNN R50-FPN:

# Single GPU
python tools/train.py configs/acsl/faster_rcnn_r50_fpn_1x_lvis_tunefc_acsl.py

# Multi GPU distributed training (for 8 gpus)
./tools/dist_train.sh configs/acsl/faster_rcnn_r50_fpn_1x_lvis_tunefc_acsl.py 8

Important: The default learning rate in config files is for 8 GPUs and 2 img/gpu (batch size = 8*2 = 16). According to the Linear Scaling Rule, you need to set the learning rate proportional to the batch size if you use different GPUs or images per GPU, e.g., lr=0.01 for 4 GPUs * 2 img/gpu and lr=0.08 for 16 GPUs * 4 img/gpu. (Cited from mmdetection.)
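
The Linear Scaling Rule amounts to a one-line calculation. The sketch below assumes a base learning rate of 0.02 at batch size 16, which is the value implied by the two examples quoted above; the function name is hypothetical, and the base value should be read from the config file you actually use.

```python
def scaled_lr(num_gpus, imgs_per_gpu, base_lr=0.02, base_batch_size=16):
    """Scale the learning rate linearly with the total batch size.

    base_lr and base_batch_size describe the reference setting
    (assumed here: 0.02 for 8 GPUs * 2 img/gpu = 16 images).
    """
    batch_size = num_gpus * imgs_per_gpu
    return base_lr * batch_size / base_batch_size


if __name__ == "__main__":
    for gpus, imgs in [(8, 2), (4, 2), (16, 4)]:
        print(f"{gpus} GPUs * {imgs} img/gpu -> lr = {scaled_lr(gpus, imgs):g}")
```

With these defaults, 4 GPUs * 2 img/gpu gives 0.01 and 16 GPUs * 4 img/gpu gives 0.08, matching the quoted examples.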

Testing

Use the following commands to test a trained model.

# single gpu test
python tools/test_lvis.py \
 ${CONFIG_FILE} ${CHECKPOINT_FILE} [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}]

# multi-gpu testing
./tools/dist_test_lvis.sh \
 ${CONFIG_FILE} ${CHECKPOINT_FILE} ${GPU_NUM} [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}]

${RESULT_FILE}: Filename of the output results in pickle format. If not specified, the results will not be saved to a file.

${EVAL_METRICS}: Metrics to evaluate on the results. Use bbox for bounding box evaluation only, or bbox segm for bounding box and mask evaluation.
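
A saved result file can be inspected with the standard pickle module. The sketch below is an assumption-laden example: it supposes a bbox-only result stored as a list with one entry per image, each entry being a list with one sequence of detections per class (the convention used by mmdetection v1). The function names are hypothetical. Only load pickle files you produced yourself, since unpickling untrusted data can execute arbitrary code.

```python
import pickle


def load_results(path):
    """Load the object written by the --out option."""
    with open(path, "rb") as f:
        return pickle.load(f)


def count_detections(results):
    """Count detections, assuming results[i][c] holds the detections
    of class c in image i (bbox-only output)."""
    return sum(
        len(per_class)
        for per_image in results
        for per_class in per_image
    )


def summarize(path):
    """Return (number of images, total number of detections)."""
    results = load_results(path)
    return len(results), count_detections(results)
```

For example, `summarize("acsl_val_result.pkl")` would return the image count and detection count for a file saved with `--out acsl_val_result.pkl`.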

For example (assuming that you have finished training the ACSL models):

To evaluate the trained ACSL model with Faster R-CNN R50-FPN for object detection:

# single-gpu testing
python tools/test_lvis.py configs/acsl/faster_rcnn_r50_fpn_1x_lvis_tunefc_acsl.py \
 ./work_dirs/acsl/faster_rcnn_r50_fpn_1x_lvis_tunefc_acsl/epoch_12.pth \
  --out acsl_val_result.pkl --eval bbox

# multi-gpu testing (8 gpus)
./tools/dist_test_lvis.sh configs/acsl/faster_rcnn_r50_fpn_1x_lvis_tunefc_acsl.py \
./work_dirs/acsl/faster_rcnn_r50_fpn_1x_lvis_tunefc_acsl/epoch_12.pth 8 \
--out acsl_val_result.pkl --eval bbox

Results and models

Please refer to our paper for more details.

Method	Models	bbox mAP	Config file	Pretrained Model	Model
baseline	R50-FPN	21.18	file	COCO-R50	R50-baseline
ACSL	R50-FPN	26.36	file	R50-baseline	R50-acsl
baseline	R101-FPN	22.36	file	COCO-R101	R101-baseline
ACSL	R101-FPN	27.49	file	R101-baseline	R101-acsl
baseline	X101-FPN	24.70	file	COCO-X101	X101-baseline
ACSL	X101-FPN	28.93	file	X101-baseline	X101-acsl
baseline	Cascade-R101	25.14	file	COCO-Cas-R101	Cas-R101-baseline
ACSL	Cascade-R101	29.71	file	Cas-R101-baseline	Cas-R101-acsl
baseline	Cascade-X101	27.14	file	COCO-Cas-X101	Cas-X101-baseline
ACSL	Cascade-X101	31.47	file	Cas-X101-baseline	Cas-X101-acsl
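
To read the table as improvements over the baseline, the gain for each model can be computed directly from the bbox mAP values above. This is only arithmetic on the reported numbers; the variable names are illustrative.

```python
# bbox mAP values copied from the table above: (baseline, ACSL).
RESULTS = {
    "R50-FPN": (21.18, 26.36),
    "R101-FPN": (22.36, 27.49),
    "X101-FPN": (24.70, 28.93),
    "Cascade-R101": (25.14, 29.71),
    "Cascade-X101": (27.14, 31.47),
}

# Absolute gain of ACSL over the baseline, rounded to two decimals.
GAINS = {
    model: round(acsl - baseline, 2)
    for model, (baseline, acsl) in RESULTS.items()
}

if __name__ == "__main__":
    for model, gain in GAINS.items():
        print(f"{model}: +{gain:.2f} bbox mAP")
```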

Important: The BaiduYun extraction code is 2jp3.

Citation

@inproceedings{wang2021adaptive,
  title={Adaptive Class Suppression Loss for Long-Tail Object Detection},
  author={Wang, Tong and Zhu, Yousong and Zhao, Chaoyang and Zeng, Wei and Wang, Jinqiao and Tang, Ming},
  booktitle={CVPR},
  year={2021}
}

Credit

This code is largely based on BalancedGroupSoftmax, mmdetection v1.0.rc0, and the LVIS API.

Comments

question about wi

Dear authors: Thank you for your excellent work. I read the article today, but I have a question about Wi as shown in Equation (5): why does Wi = 1 mean generating negative gradients? I am looking forward to your reply.

opened by WilyZhao8 2
question about sigmoid and background category

Dear authors: Thank you for your excellent work. I have a question about sigmoid and softmax: why does this paper use sigmoid instead of softmax, and how are background categories handled? Looking forward to your reply! @changewt

opened by Never-Walk-Away 0
About the bg sample

Thanks for your excellent paper! I have a question about how background is processed. The RQL loss uses a parameter E(r) to judge whether the region proposal r is foreground or background, and then treats the two cases separately. So I would like to know how background and foreground are used in your work. I look forward to your reply.

opened by Never-Walk-Away 0
Some questions about the loss

Hello! Thank you for your great work; your idea is really excellent. But there are still some points I don't understand. In your paper, the BCE loss and its gradient are formulated as follows, respectively: [equation not preserved] However, the formulation of BCE that I learned is like this: [equation not preserved] So if i != k, the loss that yi brings would be zero. Thanks a lot! Best wishes.

opened by Kingofolk 0
some question

Hello, authors! I want to ask why you use the decoupled fine-tuning training method for the experiments. Can't you train for 24 epochs end-to-end? Or does end-to-end training perform worse? In addition, can fine-tuning be regarded as a kind of trick?

opened by sanmulab 0
