FCOSR: A Simple Anchor-free Rotated Detector for Aerial Object Detection

Overview

arXiv preprint (arXiv:2111.10780).

This implementation is modified from mmdetection. We also referred to the code of ReDet, PIoU, and ProbIoU.

During implementation, we found that doing label assignment in pure Python code incurs a large memory overhead on NVIDIA devices. We therefore wrote the label assignment module proposed in the paper as a CUDA extension for PyTorch. The extension does not work correctly when migrated to CUDA 11; only CUDA 10 is supported. Using the CUDA extension improves memory utilization and removes many unnecessary calculations. We also trained FCOSR-M on an RTX 2080 Ti (4 images per device), which nearly fills the memory of the graphics card.

FCOSR TensorRT inference code is available at: https://github.com/lzh420202/TensorRT_Inference

We added a multiprocess version of DOTA2COCO to the DOTA_devkit package. Set USE_MULTI_PROCESS in prepare_dota.py to switch it on or off.
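A minimal sketch of how a module-level switch such as USE_MULTI_PROCESS could select between serial and multiprocess conversion. Only the flag name comes from the text above; `convert_one`, `convert_all`, and the record layout are hypothetical, and the actual prepare_dota.py may be organized differently.

```python
from multiprocessing import Pool

# Flag name taken from the README; everything else here is illustrative.
USE_MULTI_PROCESS = True


def convert_one(image_id):
    """Hypothetical per-image step standing in for the real DOTA-to-COCO conversion."""
    return {"id": image_id, "file_name": f"P{image_id:04d}.png"}


def convert_all(image_ids, use_multi_process=None, processes=4):
    """Convert every image, in parallel when the switch is on, serially otherwise."""
    if use_multi_process is None:
        use_multi_process = USE_MULTI_PROCESS
    image_ids = list(image_ids)
    if use_multi_process:
        # pool.map preserves input order, so both paths return the same list.
        with Pool(processes) as pool:
            return pool.map(convert_one, image_ids)
    return [convert_one(i) for i in image_ids]
```

On platforms that start worker processes with spawn (Windows, macOS), call `convert_all` from under an `if __name__ == "__main__":` guard.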

Install

Please refer to install.md for installation and dataset preparation.

Getting Started

Please see get_started.md for the basic usage.

Model Zoo

Speed vs Accuracy on DOTA 1.0 test set

benchmark

Details (Test device: nvidia RTX 2080ti)

| Methods | backbone | FPS | mAP(%) |
| --- | --- | --- | --- |
| ReDet | ReR50 | 8.8 | 76.25 |
| S2ANet | Mobilenet v2 | 18.9 | 67.46 |
| S2ANet | R50 | 14.4 | 74.14 |
| R3Det | R50 | 9.2 | 71.9 |
| Oriented-RCNN | Mobilenet v2 | 21.2 | 72.72 |
| Oriented-RCNN | R50 | 13.8 | 75.87 |
| Oriented-RCNN | R101 | 11.3 | 76.28 |
| RetinaNet-O | Mobilenet v2 | 22.4 | 67.95 |
| RetinaNet-O | R50 | 16.5 | 72.7 |
| RetinaNet-O | R101 | 13.3 | 73.7 |
| Faster-RCNN-O | Mobilenet v2 | 23 | 67.41 |
| Faster-RCNN-O | R50 | 14.4 | 72.29 |
| Faster-RCNN-O | R101 | 11.4 | 72.65 |
| FCOSR-S | Mobilenet v2 | 23.7 | 74.05 |
| FCOSR-M | Rx50 | 14.6 | 77.15 |
| FCOSR-L | Rx101 | 7.9 | 77.39 |

The Baidu Pan password is ABCD.

FCOSR series results on DOTA 1.0. FPS measured on an RTX 2080 Ti. Detail

| Model | backbone | MS | Sched. | Param. | Input | GFLOPs | FPS | mAP | download |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| FCOSR-S | Mobilenet v2 | - | 3x | 7.32M | 1024×1024 | 101.42 | 23.7 | 74.05 | model/cfg |
| FCOSR-S | Mobilenet v2 | ✓ | 3x | 7.32M | 1024×1024 | 101.42 | 23.7 | 76.11 | model/cfg |
| FCOSR-M | ResNext50-32x4 | - | 3x | 31.4M | 1024×1024 | 210.01 | 14.6 | 77.15 | model/cfg |
| FCOSR-M | ResNext50-32x4 | ✓ | 3x | 31.4M | 1024×1024 | 210.01 | 14.6 | 79.25 | model/cfg |
| FCOSR-L | ResNext101-64x4 | - | 3x | 89.64M | 1024×1024 | 445.75 | 7.9 | 77.39 | model/cfg |
| FCOSR-L | ResNext101-64x4 | ✓ | 3x | 89.64M | 1024×1024 | 445.75 | 7.9 | 78.80 | model/cfg |

FCOSR series results on DOTA 1.5. FPS measured on an RTX 2080 Ti. Detail

| Model | backbone | MS | Sched. | Param. | Input | GFLOPs | FPS | mAP | download |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| FCOSR-S | Mobilenet v2 | - | 3x | 7.32M | 1024×1024 | 101.42 | 23.7 | 66.37 | model/cfg |
| FCOSR-S | Mobilenet v2 | ✓ | 3x | 7.32M | 1024×1024 | 101.42 | 23.7 | 73.14 | model/cfg |
| FCOSR-M | ResNext50-32x4 | - | 3x | 31.4M | 1024×1024 | 210.01 | 14.6 | 68.74 | model/cfg |
| FCOSR-M | ResNext50-32x4 | ✓ | 3x | 31.4M | 1024×1024 | 210.01 | 14.6 | 73.79 | model/cfg |
| FCOSR-L | ResNext101-64x4 | - | 3x | 89.64M | 1024×1024 | 445.75 | 7.9 | 69.96 | model/cfg |
| FCOSR-L | ResNext101-64x4 | ✓ | 3x | 89.64M | 1024×1024 | 445.75 | 7.9 | 75.41 | model/cfg |

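FCOSR series results on HRSC2016. FPS measured on an RTX 2080 Ti.

| Model | backbone | Rot. | Sched. | Param. | Input | GFLOPs | FPS | AP50(07) | AP75(07) | AP50(12) | AP75(12) | download |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| FCOSR-S | Mobilenet v2 | | 40k iters | 7.29M | 800×800 | 61.57 | 35.3 | 90.08 | 76.75 | 92.67 | 75.73 | model/cfg |
| FCOSR-M | ResNext50-32x4 | | 40k iters | 31.37M | 800×800 | 127.87 | 26.9 | 90.15 | 78.58 | 94.84 | 81.38 | model/cfg |
| FCOSR-L | ResNext101-64x4 | | 40k iters | 89.61M | 800×800 | 271.75 | 15.1 | 90.14 | 77.98 | 95.74 | 80.94 | model/cfg |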

Lightweight FCOSR test result on Jetson Xavier NX (DOTA 1.0 single-scale). Detail

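| Model | backbone | Head channels | Sched. | Param | Size | Input | GFLOPs | FPS | mAP | onnx | TensorRT |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| FCOSR-lite | Mobilenet v2 | 256 | 3x | 6.9M | 51.63MB | 1024×1024 | 101.25 | 7.64 | 74.30 | onnx | trt |
| FCOSR-tiny | Mobilenet v2 | 128 | 3x | 3.52M | 23.2MB | 1024×1024 | 35.89 | 10.68 | 73.93 | onnx | trt |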

Lightweight FCOSR test result on Jetson AGX Xavier (DOTA 1.0 single-scale).

Results on part of the DOTA 1.0 dataset (whole-image mode). Code

| name | size | patch size | gap | patches | det objects | det time(s) |
| --- | --- | --- | --- | --- | --- | --- |
| P0031.png | 5343×3795 | 1024 | 200 | 35 | 1197 | 2.75 |
| P0051.png | 4672×5430 | 1024 | 200 | 42 | 309 | 2.38 |
| P0112.png | 6989×4516 | 1024 | 200 | 54 | 184 | 3.02 |
| P0137.png | 5276×4308 | 1024 | 200 | 35 | 66 | 1.95 |
| P1004.png | 7001×3907 | 1024 | 200 | 45 | 183 | 2.52 |
| P1125.png | 7582×4333 | 1024 | 200 | 54 | 28 | 2.95 |
| P1129.png | 4093×6529 | 1024 | 200 | 40 | 70 | 2.23 |
| P1146.png | 5231×4616 | 1024 | 200 | 42 | 64 | 2.29 |
| P1157.png | 7278×5286 | 1024 | 200 | 63 | 184 | 3.47 |
| P1378.png | 5445×4561 | 1024 | 200 | 42 | 83 | 2.32 |
| P1379.png | 4426×4182 | 1024 | 200 | 30 | 686 | 1.78 |
| P1393.png | 6072×6540 | 1024 | 200 | 64 | 893 | 3.63 |
| P1400.png | 6471×4479 | 1024 | 200 | 48 | 348 | 2.63 |
| P1402.png | 4112×4793 | 1024 | 200 | 30 | 293 | 1.68 |
| P1406.png | 6531×4182 | 1024 | 200 | 40 | 19 | 2.19 |
| P1415.png | 4894×4898 | 1024 | 200 | 36 | 190 | 1.99 |
| P1436.png | 5136×5156 | 1024 | 200 | 42 | 39 | 2.31 |
| P1448.png | 7242×5678 | 1024 | 200 | 63 | 51 | 3.41 |
| P1457.png | 5193×4658 | 1024 | 200 | 42 | 382 | 2.33 |
| P1461.png | 6661×6308 | 1024 | 200 | 64 | 27 | 3.45 |
| P1494.png | 4782×6677 | 1024 | 200 | 48 | 70 | 2.61 |
| P1500.png | 4769×4386 | 1024 | 200 | 36 | 92 | 1.96 |
| P1772.png | 5963×5553 | 1024 | 200 | 49 | 28 | 2.70 |
| P1774.png | 5352×4281 | 1024 | 200 | 35 | 291 | 1.95 |
| P1796.png | 5870×5822 | 1024 | 200 | 49 | 308 | 2.74 |
| P1870.png | 5942×6059 | 1024 | 200 | 56 | 135 | 3.04 |
| P2043.png | 4165×3438 | 1024 | 200 | 20 | 1479 | 1.49 |
| P2329.png | 7950×4334 | 1024 | 200 | 60 | 83 | 3.26 |
| P2641.png | 7574×5625 | 1024 | 200 | 63 | 269 | 3.41 |
| P2642.png | 7039×5551 | 1024 | 200 | 63 | 451 | 3.50 |
| P2643.png | 7568×5619 | 1024 | 200 | 63 | 249 | 3.40 |
| P2645.png | 4605×3442 | 1024 | 200 | 24 | 357 | 1.42 |
| P2762.png | 8074×4359 | 1024 | 200 | 60 | 127 | 3.23 |
| P2795.png | 4495×3981 | 1024 | 200 | 30 | 65 | 1.64 |

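The patch counts in the table are consistent with a sliding window whose stride is the patch size minus the gap (1024 − 200 = 824 pixels), with one extra window covering the remainder on each axis. The sketch below reproduces that arithmetic; `count_patches` is a hypothetical helper, and treating the gap as the overlap between neighboring patches is an assumption rather than something taken from the repository's code.

```python
import math


def count_patches(width, height, patch_size=1024, gap=200):
    """Number of sliding-window patches needed to cover a width x height image."""
    stride = patch_size - gap  # assumption: gap is the overlap between neighbors

    def along(length):
        # One patch covers the axis when the image is no larger than a patch.
        if length <= patch_size:
            return 1
        # Otherwise step by `stride` until the last window reaches the border.
        return math.ceil((length - patch_size) / stride) + 1

    return along(width) * along(height)


print(count_patches(5343, 3795))  # P0031.png → 35
print(count_patches(4165, 3438))  # P2043.png → 20
```

Detection time in the table grows roughly with this patch count, which is why the 63- and 64-patch images are the slowest.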
Comments
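  • Paper results

    Hello, great work! I have a few questions.

    1. Were the results in Tables 4-6 of the paper measured on the validation set or on the test set? In the code, validation is based on DOTA1_0_trainval1024.json and HRSC_L1_train.json.

    2. After running prepare_dota.py, I found that the number of instances still differs greatly between categories (Gaps is set to 200 here). Have you run into this problem? The prepare_dota.py settings are as follows: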

    Sub patch size: 1024
    Gaps: 200
    Data type: dota10
    Processor number: 16
    Multi scale: False
    ------------------------------
    padding: True
    

    The detailed per-category instance counts are: [(1, 19647), (2, 1136), (3, 4317), (4, 832), (5, 59578), (6, 43930), (7, 77605), (8, 6093), (9, 1258), (10, 13941), (11, 925), (12, 1002), (13, 16286), (14, 3925), (15, 1260)]

    opened by bestzsq 4
  • hrsc to coco format

    hello, thanks for your awesome work : ) I am trying to convert the HRSC dataset to coco format so that I can train with it. Is there any tool available in the repo to do that? I have found this tool:

    python tools/dataset_converters/pascal_voc.py ${DEVKIT_PATH} [-h] [-o ${OUT_DIR}]
    

    However, when I run it, it says:

    Traceback (most recent call last):
      File "tools/dataset_converters/pascal_voc.py", line 236, in <module>
        main()
      File "tools/dataset_converters/pascal_voc.py", line 210, in main
        raise IOError(f'The devkit path {devkit_path} contains neither '
    OSError: The devkit path data/HRSC2016/FullDataSet/ contains neither "VOC2007" nor "VOC2012" subfolder
    

    Do you have any tip for that?

    opened by geobao 3
  • Trying to replicate an experiment

    Hi there,

    I train by leaving the defaults. Specifically, I used:

    ./tools/dist_train.sh configs/fcosrbox/fcosr_rx50_32x4d_fpn_3x_dota10_single.py 0,1,2,3 --seed 129 --deterministic
    

    Then I measure mAP by executing:

    python tools/test.py configs/fcosrbox/fcosr_rx50_32x4d_fpn_3x_dota10_single.py work_dirs/DOTA10/FCOSR-M/FCOSR_rx50_32x4d_fpn_3x_dota10_single/latest.pth --eval mAP
    

    My result is:

    npos num: 0
    {'iou_50': {'mAp': 0.0, 'detail': {'plane': 0.0, 'baseball-diamond': 0.0, 'bridge': 0.0, 'ground-track-field': 0.0, 'small-vehicle': 0.0, 'large-vehicle': 0.0, 'ship': 0.0, 'tennis-court': 0.0, 'basketball-court': 0.0, 'storage-tank': 0.0, 'soccer-ball-field': 0.0, 'roundabout': 0.0, 'harbor': 0.0, 'swimming-pool': 0.0, 'helicopter': 0.0}}}
    

    What am I missing? I have tried 3 seeds (98, 8, and 129), all with the same result.

    opened by geobao 3
  • onnx

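    Hello, and thank you for your work. I am also interested in anchor-free detection for remote sensing and wanted to reproduce this work, but I ran into a problem. I created a virtual environment, installed pytorch and torchvision, and compiled FCOSR and DOTA_devkit. When I run python train.py config, I get an error saying the onnx attribute does not exist. Did I miss a step? Can I not run experiments directly in PyTorch? I am not using the TensorRT part; could that be the cause?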

    opened by yzk-lab 3
  • DOTA test results: what's the difference between '*' and '**'?

    Hi, nice work!

    In the paper, '*' indicates multi-scale training and testing, and '**' means rotation test mode during multi-scale testing. What is the difference between '*' and '**', and how do I set rotation test mode during multi-scale testing?

    Thanks!

    opened by bestzsq 1
  • error of DOTA_devkit/prepare_dota.py

    When I use DOTA_devkit, the following error occurs:

    Traceback (most recent call last):
      File "DOTA_devkit/prepare_dota.py", line 195, in <module>
        print_arg(args)
      File "DOTA_devkit/prepare_dota.py", line 34, in print_arg
        if scales.scales:
    NameError: name 'scales' is not defined

    Fix: replace 'scales' with 'args' on that line.

    opened by gys1287009045 1
  • Where is Multi-level sampling strategy (MLS)  in the code?

    Hello author, thank you for your work. I want to learn the Multi-level sampling strategy(MLS) part in detail, but I can't find it in the code. Please tell me where MLS is in the code, thanks!

    opened by youranran 1
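  • Questions about functions in the head

    Hello! I read your paper and found it very inspiring. Over the past couple of days I have been reading your FCOSRboxHead code. Why are the functions called on the following lines all empty?

    line 720: gt_bboxes = self.rotate2rect(gt_rboxes)
    line 729: ngds_score = self.get_ngds_score(xs, ys, gt_rboxes, mode='shrink', version='v2')
    line 730: gds_score = self.get_gds_score(xs, ys, gt_rboxes, mode='shrink', refined=True)
    line 733: inside_gt_rbox_mask, gt_rboxes_idx = self.get_rotate_inside_mask_with_gds(xs, ys, gt_rboxes, 0.23, ngds_score, True)

    One more question: during training the losses all look normal, but when validation runs after each epoch, every metric is 0. I split the DOTA dataset following the author's get_started guide. Looking forward to your reply. Best wishes!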

    opened by chentp-1183 0
  • Training hangs and cannot proceed

    2022-06-07 19:45:48,620 - mmdet - INFO - initialize MobileNetV2_N with init_cfg {'type': 'Pretrained', 'checkpoint': 'open-mmlab://mmdet/mobilenet_v2'}
    2022-06-07 19:45:48,620 - mmcv - INFO - load model from: open-mmlab://mmdet/mobilenet_v2
    2022-06-07 19:45:48,621 - mmcv - INFO - Use load_from_openmmlab loader
    2022-06-07 19:45:48,637 - mmcv - WARNING - The model and loaded state dict do not match exactly

    unexpected key in source state_dict: conv2.conv.weight, conv2.bn.weight, conv2.bn.bias, conv2.bn.running_mean, conv2.bn.running_var, conv2.bn.num_batches_tracked

    2022-06-07 19:45:48,642 - mmdet - INFO - initialize FPN with init_cfg {'type': 'Xavier', 'layer': 'Conv2d', 'distribution': 'uniform'}
    2022-06-07 19:45:48,645 - mmdet - INFO - initialize FCOSRboxHead with init_cfg {'type': 'Normal', 'layer': 'Conv2d', 'std': 0.01, 'override': {'type': 'Normal', 'name': 'fcos_cls', 'std': 0.01, 'bias_prob': 0.01}}
    loading annotations into memory...
    Done (t=0.98s)
    creating index...
    index created!

    opened by longzeyilang 3
  • what is the argument regress_weight for?

    Hello, I would like to know the meaning of this line:

    regress_weight=dict(type='iou')),
    

    It is in the config file. I find it confusing because the regression loss is already set to ProbiouLoss with mode l1 and loss_weight=1, as in:

    regress=[dict(type='ProbiouLoss', mode='l1', loss_weight=1.0)],
    

    So what is the argument regress_weight for?

    Thanks

    opened by geobao 0