The code repository for "RCNet: Reverse Feature Pyramid and Cross-scale Shift Network for Object Detection" (ACM MM'21)

TempleX

Last update: Jul 30, 2022

Overview

RCNet: Reverse Feature Pyramid and Cross-scale Shift Network for Object Detection (ACM MM'21)

By Zhuofan Zong, Qianggang Cao, Biao Leng

Introduction

Feature pyramid networks (FPN) are widely used for multi-scale feature fusion in advanced object detection frameworks. Many previous works have developed structures for bidirectional feature fusion, all of which are shown to improve detection performance effectively. We observe that these complicated network structures require feature pyramids to be stacked in a fixed order, which lengthens the pipeline and reduces inference speed. Moreover, semantics from non-adjacent levels are diluted in the feature pyramid, since the local fusion operation merges only features at adjacent pyramid levels, one level at a time. To address these issues, we propose a novel architecture named RCNet, which consists of a Reverse Feature Pyramid (RevFP) and a Cross-scale Shift Network (CSN). RevFP uses local bidirectional feature fusion to simplify the bidirectional pyramid inference pipeline. CSN directly propagates representations to both adjacent and non-adjacent levels, making multi-scale features more correlated. Extensive experiments on the MS COCO dataset demonstrate that RCNet consistently brings significant improvements to both one-stage and two-stage detectors with little extra computational overhead. In particular, replacing FPN with our proposed model boosts RetinaNet to 40.2 AP, 3.7 points higher than the baseline. On COCO test-dev, RCNet achieves a very competitive 50.5 AP with a single model at a single scale.

Models

Pretrained models will be available.

Training and Testing

This project is based on mmdetection. Please follow the mmdetection documentation to install and use this repo. Config files can be found in configs/rcnet/.
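As a rough sketch of what following the mmdetection workflow might look like, the helper below lists whatever config files exist under configs/rcnet/ and builds the command lines for mmdetection's tools/train.py and tools/test.py scripts. The helper names (list_configs, train_command, eval_command) are hypothetical, the actual config file names are not given in this README, and the --eval bbox flag assumes an mmdetection 2.x checkout; adjust it if your version differs.

```python
import subprocess
import sys
from pathlib import Path


def list_configs(repo_root, subdir="configs/rcnet"):
    """Return the sorted .py config files found under repo_root/subdir."""
    return sorted((Path(repo_root) / subdir).glob("*.py"))


def train_command(config, work_dir=None):
    """Build the argv for mmdetection's single-GPU training script."""
    cmd = [sys.executable, "tools/train.py", str(config)]
    if work_dir is not None:
        cmd += ["--work-dir", str(work_dir)]
    return cmd


def eval_command(config, checkpoint):
    """Build the argv for COCO bbox evaluation (assumes mmdetection 2.x flags)."""
    return [sys.executable, "tools/test.py", str(config), str(checkpoint),
            "--eval", "bbox"]


if __name__ == "__main__":
    # Run from the repository root; this only prints the commands.
    for cfg in list_configs("."):
        print(" ".join(train_command(cfg)))
        # To actually launch training, uncomment the next line:
        # subprocess.run(train_command(cfg), check=True)
```

Building the commands as argument lists rather than shell strings keeps paths with spaces safe when they are passed to subprocess.run.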

Results on MS COCO

Detector	Backbone	Neck	LR schedule	mAP (val)	mAP (test-dev)
RetinaNet	R50	RCNet	1x	40.2	-
ATSS	R50	RCNet	1x	42.6	-
GFL	R50	RCNet	1x	43.1	-
GFL	R101	RCNet	2x	47.1	47.4
GFL	X101-64x4d	RCNet	2x	48.9	49.2
GFL	X101-64x4d-DCN	RCNet	2x	50.2	50.5
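The numbers in this table can be cross-checked with a little arithmetic. The sketch below copies the rows into Python, computes the gap between the test and validation columns where both are reported, and derives the RetinaNet baseline implied by the 3.7-point gain quoted in the introduction. The variable and function names are illustrative only, and the baseline is inferred from the stated gain rather than reported in this README.

```python
# Rows copied from the results table:
# (detector, backbone, schedule, val mAP, test mAP or None if not reported).
RESULTS = [
    ("RetinaNet", "R50", "1x", 40.2, None),
    ("ATSS", "R50", "1x", 42.6, None),
    ("GFL", "R50", "1x", 43.1, None),
    ("GFL", "R101", "2x", 47.1, 47.4),
    ("GFL", "X101-64x4d", "2x", 48.9, 49.2),
    ("GFL", "X101-64x4d-DCN", "2x", 50.2, 50.5),
]


def val_to_test_gap(rows):
    """Map 'detector backbone' to (test - val) for rows that report both."""
    return {
        f"{detector} {backbone}": round(test - val, 1)
        for detector, backbone, _schedule, val, test in rows
        if test is not None
    }


def implied_baseline(ap_with_rcnet, gain):
    """Baseline AP implied by a reported AP and its stated gain."""
    return round(ap_with_rcnet - gain, 1)


gaps = val_to_test_gap(RESULTS)
baseline = implied_baseline(40.2, 3.7)

print(gaps)      # each of the three GFL 2x rows shows a gap of 0.3
print(baseline)  # 36.5
```

For the three models with both numbers, the test result is consistently 0.3 mAP above the validation result.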

Citations

If you find RCNet useful in your research, please consider citing:

@inproceedings{zong2021rcnet,
author = {Zong, Zhuofan and Cao, Qianggang and Leng, Biao},
title = {RCNet: Reverse Feature Pyramid and Cross-scale Shift Network for Object Detection},
booktitle = {ACM MM},
pages = {5637--5645},
year = {2021}
}

License

This project is released under the Apache 2.0 license.

Comments

could you please share the code of rcfpn on faster r-cnn?

In FPN, P6 is obtained by max-pooling P5 with stride 2 after the top-down and lateral connections. In other words, we have C2, C3, C4 and C5 to build ReFPN and produce P2, P3, P4 and P5 correspondingly. Since there is a CSN module in RCFPN, P6 should be derived from the processed P5. So, could you explain how to reimplement CSN in this case? Thanks.

opened by HB-X 2
