[ICCV 2021] FaPN: Feature-aligned Pyramid Network for Dense Image Prediction

EMI-Group

Last update: Dec 30, 2022

Related tags

Deep Learning object-detection semantic-segmentation instance-segmentation panoptic-segmentation real-time-semantic-segmentation feature-alignment

FaPN: Feature-aligned Pyramid Network for Dense Image Prediction [arXiv] [Project Page]

@inproceedings{
  huang2021fapn,
  title={{FaPN}: Feature-aligned Pyramid Network for Dense Image Prediction},
  author={Shihua Huang and Zhichao Lu and Ran Cheng and Cheng He},
  booktitle={International Conference on Computer Vision (ICCV)},
  year={2021}
}

Overview

Figures: FaPN vs. FPN; before vs. after alignment.

This project provides the official implementation of our ICCV 2021 paper "FaPN: Feature-aligned Pyramid Network for Dense Image Prediction", based on Detectron2. FaPN is a simple yet effective top-down pyramidal architecture that generates multi-scale features for dense image prediction. Comprising a feature alignment module (FAM) and a feature selection module (FSM), FaPN addresses the feature misalignment issue in the original FPN, leading to substantial improvements on various dense prediction tasks, such as object detection and semantic, instance, and panoptic segmentation.
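As a rough sketch of how the two modules fit into one top-down step, the relations below use notation assumed for illustration only (it does not come from this README): $C_{i-1}$ is a bottom-up backbone feature, $P_i^{u}$ is the coarser pyramid feature after upsampling, $f_o$ is a convolution that predicts sampling offsets from the concatenated features, and $f_a$ is a deformable convolution that resamples $P_i^{u}$ with those offsets. The final sum is the usual FPN-style fusion and is likewise an assumption here.

```latex
\begin{aligned}
\hat{C}_{i-1}   &= \mathrm{FSM}(C_{i-1})                          && \text{select informative lateral channels} \\
\Delta_i        &= f_o\big([\hat{C}_{i-1},\, P_i^{u}]\big)         && \text{predict offsets from both features} \\
\hat{P}_i^{u}   &= f_a\big(P_i^{u},\, \Delta_i\big)                && \text{align the upsampled feature (FAM)} \\
P_{i-1}         &= \hat{P}_i^{u} + \hat{C}_{i-1}                   && \text{fuse aligned and lateral features}
\end{aligned}
```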

Installation

This project is based on Detectron2 and can be set up as follows.

Install Detectron2 following the instructions.
Set up the datasets following the structure.
Copy this project to /path/to/detectron2.
Install DCNv2 following Install DCNv2.md.

Training

To train a model with 8 GPUs, run:

cd /path/to/detectron2/tools
python3 train_net.py --config-file <config.yaml> --num-gpus 8

For example, to launch Faster R-CNN training (1x schedule) with a ResNet-50 backbone on 8 GPUs, run:

cd /path/to/detectron2/tools
python3 train_net.py --config-file ../configs/COCO-Detection/faster_rcnn_R_50_FAN_1x.yaml --num-gpus 8

Evaluation

To evaluate a pre-trained model with 8 GPUs, run:

cd /path/to/detectron2/tools
python3 train_net.py --config-file <config.yaml> --num-gpus 8 --eval-only MODEL.WEIGHTS /path/to/model_checkpoint
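When the training and evaluation commands are launched from a script, it can help to assemble them in one place. The sketch below is only an illustration: the helper name `build_train_net_command` is hypothetical, and it uses nothing beyond the flags shown above (`--config-file`, `--num-gpus`, `--eval-only`, `MODEL.WEIGHTS`).

```python
import shlex


def build_train_net_command(config_file, num_gpus=8, eval_only=False, weights=None):
    """Assemble the train_net.py argument list (hypothetical helper)."""
    cmd = [
        "python3", "train_net.py",
        "--config-file", config_file,
        "--num-gpus", str(num_gpus),
    ]
    if eval_only:
        # Evaluation needs a checkpoint to load via the MODEL.WEIGHTS override.
        if weights is None:
            raise ValueError("eval_only=True requires a checkpoint path for MODEL.WEIGHTS")
        cmd += ["--eval-only", "MODEL.WEIGHTS", weights]
    return cmd


config = "../configs/COCO-Detection/faster_rcnn_R_50_FAN_1x.yaml"
train_cmd = build_train_net_command(config)
eval_cmd = build_train_net_command(config, eval_only=True, weights="/path/to/model_checkpoint")

# Print shell-ready command lines for inspection.
print(shlex.join(train_cmd))
print(shlex.join(eval_cmd))
```

The returned list can be passed to `subprocess.run(cmd, cwd="/path/to/detectron2/tools")`, which matches the `cd` step in the commands above.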

Results

COCO Object Detection

Faster R-CNN + FaPN:

Name	lr sched	box AP	box APs	box APm	box APl	download
R50	1x	39.2	24.5	43.3	49.1	model | log
R101	3x	42.8	27.0	46.2	54.9	model | log

Cityscapes Semantic Segmentation

PointRend + FaPN:

Name	lr sched	mask mIoU	mask iIoU	mask IoU_sup	mask iIoU_sup	download
R50	1x	80.0	61.3	90.6	78.5	model | log
R101	1x	80.1	62.2	90.8	78.6	model | log

COCO Instance Segmentation

Mask R-CNN + FaPN:

Name	lr sched	mask AP	mask APs	box AP	box APs	download
R50	1x	36.4	18.1	39.8	24.3	model | log
R101	3x	39.4	20.9	43.8	27.4	model | log

PointRend + FaPN:

Name	lr sched	mask AP	mask APs	box AP	box APs	download
R50	1x	37.6	18.6	39.4	24.2	model | log

COCO Panoptic Segmentation

PanopticFPN + FaPN:

Name	lr sched	PQ	mask mIoU	St PQ	box AP	Th PQ	download
R50	1x	41.1	43.4	32.5	38.7	46.9	model | log
R101	3x	44.2	45.7	35.0	43.0	53.3	model | log

Comments

Multiplying upsampled features by 2

Could you please explain the reason for multiplying feat_up by 2 here: https://github.com/EMI-Group/FaPN/blob/main/detectron2/modeling/backbone/fan.py#L66 ?

opened by shkarupa-alex 3

Is it possible to replace the DCNv2 plugin with the torchvision DCN API?

Hi, I wonder:

Is it possible to replace the DCNv2 plugin with the torchvision DCN API?

Can DCNv2 be replaced with a normal conv, since DCN is not well supported in production? Also, will performance drop much with a normal conv?

opened by luohao123 1

Can you give me this model?

Hello, I am very interested in your FaPN model. I see that it reaches a very high FPS for semantic segmentation on the COCO dataset. Could you share this model? I would like to reproduce it.

opened by Icebinge 1

Error when training the model

Hi, I followed your instructions to set up the project, but this error still occurs. Could you help me with this issue?

KeyError: "No object named 'build_resnet_fan_backbone' found in 'BACKBONE' registry!"

opened by LeoniusChen 0
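
One plausible cause of the KeyError reported above, assuming the backbone builder is registered by a decorator that runs at import time (the pattern Detectron2 uses), is that the module defining `build_resnet_fan_backbone` was never imported, for example because the project files were not copied into the Detectron2 tree that Python actually loads. The toy `Registry` class below is a stand-in written for this illustration, not Detectron2's code; it only reproduces the lookup behaviour.

```python
class Registry:
    """Toy name-to-object registry (illustrative stand-in, not Detectron2's class)."""

    def __init__(self, name):
        self._name = name
        self._obj_map = {}

    def register(self):
        # Used as a decorator: the function is stored when its module is executed.
        def decorator(fn):
            self._obj_map[fn.__name__] = fn
            return fn
        return decorator

    def get(self, name):
        if name not in self._obj_map:
            raise KeyError(f"No object named '{name}' found in '{self._name}' registry!")
        return self._obj_map[name]


BACKBONE = Registry("BACKBONE")


def lookup_error(name):
    """Return the error message for a failed lookup, or None on success."""
    try:
        BACKBONE.get(name)
    except KeyError as err:
        return err.args[0]
    return None


# Before the defining module has been imported, the name is unknown.
before = lookup_error("build_resnet_fan_backbone")


# Stand-in for importing the module that defines and registers the backbone.
@BACKBONE.register()
def build_resnet_fan_backbone(cfg=None, input_shape=None):
    return "fan-backbone-placeholder"


# After registration the lookup succeeds.
after = lookup_error("build_resnet_fan_backbone")
print(before)
print(after)
```

If this is the cause, checking that the copied backbone file is present in the installed Detectron2 package and is imported by the package's backbone module would be the first thing to try.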

The output channels of FeatureAlign_V2

self.dcpack_L2 = dcn_v2(out_nc, out_nc, 3, stride=1, padding=1, dilation=1, deformable_groups=8, extra_offset_mask=True)

Why is out_nc 256 rather than 216 (3 x kernel_size x kernel_size x deformable_groups)?

opened by ChengYi1996 1
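
For the channel question above, it may help to separate two quantities. In a modulated deformable convolution (DCNv2), each deformable group predicts two offsets (x and y) and one mask value per kernel position, which is where 3 x kernel_size x kernel_size x deformable_groups comes from. That count describes the offset/mask tensor, not the width of the features being convolved. The sketch below only does this arithmetic; the helper name is hypothetical, and the suggestion that the `dcn_v2` wrapper derives those channels internally when `extra_offset_mask=True` is an assumption, not something verified against the repository.

```python
def dcn_v2_offset_mask_channels(kernel_size, deformable_groups):
    """Channel counts of the offset and mask tensors for a square DCNv2 kernel."""
    positions = kernel_size * kernel_size          # sampling points per kernel
    offsets = 2 * positions * deformable_groups    # (dx, dy) per point and group
    masks = positions * deformable_groups          # one modulation scalar per point and group
    return offsets, masks, offsets + masks


offsets, masks, total = dcn_v2_offset_mask_channels(kernel_size=3, deformable_groups=8)
print(offsets, masks, total)  # 144 72 216
```

Under that reading, the 256 passed as `out_nc` is simply the feature width going in and out of the convolution, while 216 is the size of the separate offset/mask prediction.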

Some errors

I want to use the FaPN structure in my code, but the following error appears:

File "/databank/home/DCNv2/dcn_v2.py", line 43, in forward ctx.deformable_groups, RuntimeError: expected scalar type Float but found Half

How can I solve it? My environment is CUDA 10.1, Python 3.7, PyTorch 1.7.0.

opened by Piplebobble 1

Code about FeatureAlign_V2

In module FeatureAlign_V2:

offset = self.offset(torch.cat([feat_arm, feat_up * 2], dim=1)) # concat for offset by compute the dif

Why multiply feat_up by 2?

opened by LightningChan 3

Ablation study on other FPN-like modules?

Hi, thank you for your contribution. It is good work.

I have a small question. Have you done any ablation studies on other FPN-like modules, e.g. NAS-FPN, PAN, BiFPN? The work would be much more convincing if you provided such results.

Thank you.

opened by markson14 1

Owner

EMI-Group

The Evolving Machine Intelligence (EMI) Group, established in 2018, is motivated to understand how evolution generates complexity, diversity and intelligence.

GitHub http://www.shihuahuang.cn/fapn/

Exploring Relational Context for Multi-Task Dense Prediction [ICCV 2021]

Adaptive Task-Relational Context (ATRC) This repository provides source code for the ICCV 2021 paper Exploring Relational Context for Multi-Task Dense

35 Dec 5, 2022

Learning RAW-to-sRGB Mappings with Inaccurately Aligned Supervision (ICCV 2021)

Learning RAW-to-sRGB Mappings with Inaccurately Aligned Supervision (ICCV 2021) PyTorch implementation of Learning RAW-to-sRGB Mappings with Inaccurat

53 Dec 20, 2022

Implementation for our ICCV 2021 paper: Dual-Camera Super-Resolution with Aligned Attention Modules

DCSR: Dual Camera Super-Resolution Implementation for our ICCV 2021 oral paper: Dual-Camera Super-Resolution with Aligned Attention Modules paper | pr

110 Dec 20, 2022

The code repository for "RCNet: Reverse Feature Pyramid and Cross-scale Shift Network for Object Detection" (ACM MM'21)

RCNet: Reverse Feature Pyramid and Cross-scale Shift Network for Object Detection (ACM MM'21) By Zhuofan Zong, Qianggang Cao, Biao Leng Introduction F

9 Jul 30, 2022

Pytorch implementation of Feature Pyramid Network (FPN) for Object Detection

fpn.pytorch Pytorch implementation of Feature Pyramid Network (FPN) for Object Detection Introduction This project inherits the property of our pytorc

912 Dec 21, 2022

Code for the ICCV 2021 Workshop paper: A Unified Efficient Pyramid Transformer for Semantic Segmentation.

Unified-EPT Code for the ICCV 2021 Workshop paper: A Unified Efficient Pyramid Transformer for Semantic Segmentation. Installation Linux, CUDA>=10.0,

29 Aug 23, 2022

Tensorflow implementation of the paper "HumanGPS: Geodesic PreServing Feature for Dense Human Correspondences", CVPR 2021.

HumanGPS: Geodesic PreServing Feature for Dense Human Correspondences Tensorflow implementation of the paper "HumanGPS: Geodesic PreServing Feature fo