Receptive Field Block Net for Accurate and Fast Object Detection, ECCV 2018

Overview

Receptive Field Block Net for Accurate and Fast Object Detection

By Songtao Liu, Di Huang, Yunhong Wang

Update (2021/07/23): YOLOX is here! A stronger YOLO with ONNX, TensorRT, NCNN, and OpenVINO support!

Update: we propose a new method that achieves 42.4 mAP at 45 FPS on COCO; the code is available here

Introduction

Inspired by the structure of Receptive Fields (RFs) in human visual systems, we propose a novel RF Block (RFB) module, which takes the relationship between the size and eccentricity of RFs into account, to enhance the discriminability and robustness of features. We further assemble the RFB module on top of SSD with a lightweight CNN backbone, constructing the RFB Net detector. You can use this code to train and evaluate the RFB Net for object detection. For more details, please refer to our ECCV paper.
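For intuition, here is a minimal sketch of an RFB-style block, assuming PyTorch; the branch widths, dilation rates, and shortcut are illustrative simplifications rather than the exact layers of this repository (see models/RFB_Net_vgg.py for the real definition):

import torch
import torch.nn as nn

class SimpleRFB(nn.Module):
    """Simplified RFB-style block: parallel branches pair 1x1 convs with
    3x3 convs of increasing dilation, mimicking receptive fields whose
    size grows with eccentricity, then fuse the branches with a 1x1 conv."""
    def __init__(self, in_planes, out_planes):
        super(SimpleRFB, self).__init__()
        inter = in_planes // 4
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(in_planes, inter, kernel_size=1),
                nn.Conv2d(inter, inter, kernel_size=3, padding=d, dilation=d))
            for d in (1, 3, 5)])
        self.fuse = nn.Conv2d(3 * inter, out_planes, kernel_size=1)
        self.shortcut = nn.Conv2d(in_planes, out_planes, kernel_size=1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = torch.cat([b(x) for b in self.branches], dim=1)
        return self.relu(self.fuse(out) + self.shortcut(x))

The key idea the sketch captures is pairing small kernels with progressively larger dilations, mirroring the size-eccentricity relationship of RFs described above.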

   

VOC2007 Test

System                 mAP    FPS (Titan X Maxwell)
Faster R-CNN (VGG16)   73.2   7
YOLOv2 (Darknet-19)    78.6   40
R-FCN (ResNet-101)     80.5   9
SSD300* (VGG16)        77.2   46
SSD512* (VGG16)        79.8   19
RFBNet300 (VGG16)      80.7   83
RFBNet512 (VGG16)      82.2   38

COCO

System                          test-dev mAP   Time (Titan X Maxwell)
Faster R-CNN++ (ResNet-101)     34.9           3.36 s
YOLOv2 (Darknet-19)             21.6           25 ms
SSD300* (VGG16)                 25.1           22 ms
SSD512* (VGG16)                 28.8           53 ms
RetinaNet500 (ResNet-101-FPN)   34.4           90 ms
RFBNet300 (VGG16)               30.3           15 ms
RFBNet512 (VGG16)               33.8           30 ms
RFBNet512-E (VGG16)             34.4           33 ms

MobileNet

System          COCO minival mAP   #parameters
SSD MobileNet   19.3               6.8M
RFB MobileNet   20.7               7.4M

Citing RFB Net

Please cite our paper in your publications if it helps your research:

@InProceedings{Liu_2018_ECCV,
author = {Liu, Songtao and Huang, Di and Wang, Yunhong},
title = {Receptive Field Block Net for Accurate and Fast Object Detection},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Contents

  1. Installation
  2. Datasets
  3. Training
  4. Evaluation
  5. Models

Installation

  • Install PyTorch-0.4.0 by selecting your environment on the website and running the appropriate command.
  • Clone this repository. This repository is mainly based on ssd.pytorch and Chainer-ssd; a huge thanks to them.
    • Note: We currently only support PyTorch-0.4.0 and Python 3+.
  • Compile the nms and coco tools:
./make.sh

Note: Check your GPU architecture support in utils/build.py, line 131. The default is:

'nvcc': ['-arch=sm_52',
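If your GPU has a different compute capability, change that flag to match; for example, for a Pascal-class card such as a GTX 1080 (compute capability 6.1), the line would become:

'nvcc': ['-arch=sm_61',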
  • Then download the dataset by following the instructions below and install opencv.
conda install opencv

Note: For training, we currently support VOC and COCO.

Datasets

To make things easy, we provide simple VOC and COCO dataset loaders that inherit torch.utils.data.Dataset, making them fully compatible with the torchvision.datasets API.
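As a minimal sketch of what such a loader looks like (the class name, file layout, and annotation stub below are illustrative assumptions, not the repository's actual voc0712.py / coco.py code):

import os
import cv2
import torch.utils.data as data

class ToyDetectionDataset(data.Dataset):
    """Illustrative detection dataset: yields (image, targets) pairs, where
    targets is an (N, 5) array of [xmin, ymin, xmax, ymax, class_label]."""
    def __init__(self, root, image_ids, preproc=None):
        self.root = root          # folder holding the .jpg images
        self.ids = image_ids      # list of image file stems
        self.preproc = preproc    # optional augmentation callable

    def __len__(self):
        return len(self.ids)

    def __getitem__(self, index):
        img = cv2.imread(os.path.join(self.root, self.ids[index] + '.jpg'))
        targets = self._load_annotations(self.ids[index])
        if self.preproc is not None:
            img, targets = self.preproc(img, targets)
        return img, targets

    def _load_annotations(self, stem):
        # stub: parse the matching VOC XML / COCO JSON annotation here
        raise NotImplementedError

Because it subclasses torch.utils.data.Dataset, an instance can be fed straight to torch.utils.data.DataLoader like any torchvision dataset.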

VOC Dataset

Download VOC2007 trainval & test
# specify a directory for dataset to be downloaded into, else default is ~/data/
sh data/scripts/VOC2007.sh # <directory>
Download VOC2012 trainval
# specify a directory for dataset to be downloaded into, else default is ~/data/
sh data/scripts/VOC2012.sh # <directory>

COCO Dataset

Install the MS COCO dataset at /path/to/coco from the official website; the default is ~/data/COCO. Follow the instructions to prepare the minival2014 and valminusminival2014 annotations. All label files (.json) should be under the COCO/annotations/ folder. The directory should have this basic structure:

$COCO/
$COCO/cache/
$COCO/annotations/
$COCO/images/
$COCO/images/test2015/
$COCO/images/train2014/
$COCO/images/val2014/

UPDATE: COCO has since released the new train2017 and val2017 sets, which are just new splits of the same images.

Training

First download the fc-reduced VGG-16 base network weights:

mkdir weights
cd weights
wget https://s3.amazonaws.com/amdegroot-models/vgg16_reducedfc.pth
  • To train RFBNet using the train script, simply specify the parameters listed in train_RFB.py as flags or change them manually.
python train_RFB.py -d VOC -v RFB_vgg -s 300 
  • Note:
    • -d: choose datasets, VOC or COCO.
    • -v: choose the backbone version: RFB_vgg, RFB_E_vgg or RFB_mobile.
    • -s: image size, 300 or 512.
    • You can resume training from a checkpoint by specifying its path as one of the training parameters (again, see train_RFB.py for options); see the example after this list.
    • To reproduce the results in the paper, the VOC model should be trained for about 240 epochs, while the COCO version needs about 130 epochs.
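For example, resuming VOC training from a saved checkpoint might look like this (the flag names --resume_net and --resume_epoch are our reading of train_RFB.py's argument list, and the checkpoint filename is a placeholder; check the script for the exact option names):

python train_RFB.py -d VOC -v RFB_vgg -s 300 --resume_net weights/RFB_vgg_VOC_epoches_100.pth --resume_epoch 100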

Evaluation

To evaluate a trained network:

python test_RFB.py -d VOC -v RFB_vgg -s 300 --trained_model /path/to/model/weights

By default, it directly outputs the mAP results on VOC2007 test or COCO minival2014. For VOC2012 test and COCO test-dev results, you can manually change the datasets in the test_RFB.py file, then save the detection results and submit them to the evaluation server.
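For example, evaluating a COCO-trained RFBNet512 might look like this (the weight path is a placeholder):

python test_RFB.py -d COCO -v RFB_vgg -s 512 --trained_model weights/RFB_vgg_COCO_512.pth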

Models

Comments
  • I trained on a new dataset with 10 classes; at test time I get this error:

    RuntimeError: Error(s) in loading state_dict for RFBNet:
    size mismatch for conf.0.weight: copying a param of torch.Size([126, 512, 3, 3]) from checkpoint, where the shape is torch.Size([66, 512, 3, 3]) in current model.
    size mismatch for conf.0.bias: copying a param of torch.Size([126]) from checkpoint, where the shape is torch.Size([66]) in current model.
    size mismatch for conf.1.weight: copying a param of torch.Size([126, 1024, 3, 3]) from checkpoint, where the shape is torch.Size([66, 1024, 3, 3]) in current model.
    size mismatch for conf.1.bias: copying a param of torch.Size([126]) from checkpoint, where the shape is torch.Size([66]) in current model.
    size mismatch for conf.2.weight: copying a param of torch.Size([126, 512, 3, 3]) from checkpoint, where the shape is torch.Size([66, 512, 3, 3]) in current model.
    size mismatch for conf.2.bias: copying a param of torch.Size([126]) from checkpoint, where the shape is torch.Size([66]) in current model.
    size mismatch for conf.3.weight: copying a param of torch.Size([126, 256, 3, 3]) from checkpoint, where the shape is torch.Size([66, 256, 3, 3]) in current model.
    size mismatch for conf.3.bias: copying a param of torch.Size([126]) from checkpoint, where the shape is torch.Size([66]) in current model.
    size mismatch for conf.4.weight: copying a param of torch.Size([84, 256, 3, 3]) from checkpoint, where the shape is torch.Size([44, 256, 3, 3]) in current model.
    size mismatch for conf.4.bias: copying a param of torch.Size([84]) from checkpoint, where the shape is torch.Size([44]) in current model.
    size mismatch for conf.5.weight: copying a param of torch.Size([84, 256, 3, 3]) from checkpoint, where the shape is torch.Size([44, 256, 3, 3]) in current model.
    size mismatch for conf.5.bias: copying a param of torch.Size([84]) from checkpoint, where the shape is torch.Size([44]) in current model.

    I don't understand where the 126/84 in the checkpoint's torch.Size comes from, or where to change it.
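    (Editorial note, not part of the original thread: 126 and 84 are the confidence-layer output channels of a 21-class VOC model, i.e. 6 or 4 anchors times 21 classes, while 66 and 44 correspond to 11 classes (10 plus background), so a VOC-pretrained checkpoint cannot be loaded directly into the 10-class model. A common workaround, sketched here assuming net is the new 11-class model and checkpoint.pth is a placeholder path, is to load only the shape-compatible weights:)

    import torch

    state = torch.load('checkpoint.pth')   # 21-class checkpoint (placeholder path)
    model_state = net.state_dict()         # `net` is the new 11-class model
    compatible = {k: v for k, v in state.items()
                  if k in model_state and v.shape == model_state[k].shape}
    model_state.update(compatible)         # mismatched conf layers keep their fresh init
    net.load_state_dict(model_state)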

    opened by chenchch94 11
  • error when running test_RFB.py without cuda

    test_RFB.py runs OK when cuda is set to True. However, when I set cuda to False in test_RFB.py, I get the following error:

    Traceback (most recent call last):
      File "test_RFB.py", line 193, in top_k, thresh=0.01)
      File "demo_RFB.py", line 91, in test_net out = net(x) # forward pass
      File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 357, in __call__ result = self.forward(*input, **kwargs)
      File "/home/topspinn/2TB/src/RFBNet/models/RFB_Net_mobile.py", line 185, in forward x = self.basek
      File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 357, in __call__ result = self.forward(*input, **kwargs)
      File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/container.py", line 72, in forward input = module(input)
      File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 357, in __call__ result = self.forward(*input, **kwargs)
      File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/conv.py", line 282, in forward self.padding, self.dilation, self.groups)
    RuntimeError: Expected object of type Variable[torch.FloatTensor] but found type Variable[torch.cuda.FloatTensor] for argument #1 'weight'

    opened by kaishijeng 11
  • I met the following problem:

    Traceback (most recent call last):
      File "test_RFB.py", line 49, in from models.RFB_Net_E_vgg import build_net
      File "/media/media_share/linkfile/RFBNet/models/RFB_Net_E_vgg.py", line 405
        return RFBNet(phase, size, *multibox(size, vgg(base[str(size)], 3), add_extras(size, extras[str(size)], 1024), mbox[str(size)], num_classes), num_classes)
    SyntaxError: only named arguments may follow *expression

    Could you help me?

    Thank you very much.
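    (Editorial note: this SyntaxError occurs on Python versions before 3.5, which do not allow positional arguments after a *expression in a call. A possible workaround, sketched against the call shown above, is to unpack first:)

    # build the (base, extras, head) parts first, then append num_classes
    parts = multibox(size, vgg(base[str(size)], 3), add_extras(size, extras[str(size)], 1024), mbox[str(size)], num_classes)
    return RFBNet(phase, size, *(tuple(parts) + (num_classes,)))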

    opened by 10183308 11
  • A problem about image channel order

    Hello, first of all, thanks for your amazing open-source algorithm. Maybe it's a minor problem, but I would appreciate it if you have time to answer. In your data/data_augment.py file, you use cv2 to read the image file. As we all know, cv2 reads images in BGR order, but you subtract the mean value (104, 117, 123), which is in RGB order as you write in the annotation. A few days ago, I tried using (123, 117, 104) instead and got a small improvement in both precision and recall, and I'm not sure whether that is because of the order of the mean values. Would you please help me with this problem? Thanks again for your great algorithm.
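    (Editorial note, not part of the original thread: cv2.imread returns channels in BGR order, and the classic Caffe VGG mean is roughly B=104, G=117, R=123, so subtracting (104, 117, 123) from a BGR image is already channel-consistent. A quick sanity check, with a placeholder image path:)

    import cv2
    import numpy as np

    img = cv2.imread('example.jpg')                         # H x W x 3, BGR order
    mean_bgr = np.array([104, 117, 123], dtype=np.float32)  # Caffe VGG mean, in BGR
    img_out = img.astype(np.float32) - mean_bgr             # broadcasts over channels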

    opened by left4back 7
  • How to deal with a bug in make.sh

    Hi @ruinmessi, I run ./make.sh but get:

    g++ -pthread -shared -B /home/liye/anaconda3/compiler_compat -L/home/liye/anaconda3/lib -Wl,-rpath=/home/liye/anaconda3/lib,--no-as-needed build/temp.linux-x86_64-3.6/nms/nms_kernel.o build/temp.linux-x86_64-3.6/nms/gpu_nms.o -L/usr/local/cuda-8.0/lib64 -L/home/liye/anaconda3/lib -R/usr/local/cuda-8.0/lib64 -lcudart -lpython3.6m -o /media/ubuntue/extdisk1/liye/RFBNet-master/utils/nms/gpu_nms.cpython-36m-x86_64-linux-gnu.so
    g++: error: unrecognized command line option '-R'
    error: command 'g++' failed with exit status 1

    Why does this happen? Can you help me?

    opened by liye228 6
  • error in training mobilenet

    @ruinmessi

    I followed your instructions to train VOC with MobileNet, but got an error:

    python3 train_RFB.py -d VOC -v RFB_mobile -s 300
    300 21
    Traceback (most recent call last):
      File "train_RFB.py", line 88, in net = build_net('train', img_dim, num_classes)
      File "/home/topspin/2TB/src/RFBNet/models/RFB_Net_mobile.py", line 348, in build_net mbox[str(size)], num_classes), num_classes)
    TypeError: __init__() missing 2 required positional arguments: 'head' and 'num_classes'

    Any idea why this happens?

    Thanks,

    opened by kaishijeng 6
  • High overhead transferring from GPU to CPU

    Converting the boxes (CUDA float tensors) returned by the detector's forward pass to CPU float tensors has extremely high overhead. (I ignored the conversion to a numpy array, which takes about a microsecond.)

    boxes = boxes.cpu().numpy()

    It takes approximately 22 milliseconds at a 512 input size (detection time is approximately 9 milliseconds).
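    (Editorial note, not part of the original thread: CUDA kernels launch asynchronously, so .cpu() blocks until all queued GPU work finishes and can absorb the real inference time. Timing with explicit synchronization separates the two; net and x below are stand-ins for the model and input:)

    import time
    import torch

    torch.cuda.synchronize()       # drain any pending kernels before timing
    t0 = time.time()
    boxes = net(x)                 # forward pass (assumed to return a CUDA tensor)
    torch.cuda.synchronize()       # wait until the forward pass really finishes
    t1 = time.time()
    arr = boxes.cpu().numpy()      # now the transfer is measured on its own
    t2 = time.time()
    print('forward: %.1f ms, transfer: %.1f ms' % ((t1 - t0) * 1e3, (t2 - t1) * 1e3))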

    opened by evroni 6
  • test trained model RFBNet300

    Hi, thank you for your great work.

    When I try to evaluate the trained model RFBNet300_VOC_80_7, which is one of the provided models, I get the following error:

    im_detect: 1/10 0.022s 0.001s
    Evaluating detections
    Writing aeroplane VOC results file
    Writing bicycle VOC results file
    Writing bird VOC results file
    Writing boat VOC results file
    Writing bottle VOC results file
    Writing bus VOC results file
    Writing car VOC results file
    Writing cat VOC results file
    Writing chair VOC results file
    Writing cow VOC results file
    Writing diningtable VOC results file
    Writing dog VOC results file
    Writing horse VOC results file
    Writing motorbike VOC results file
    Writing person VOC results file
    Writing pottedplant VOC results file
    Writing sheep VOC results file
    Writing sofa VOC results file
    Writing train VOC results file
    Writing tvmonitor VOC results file
    VOC07 metric? Yes
    Traceback (most recent call last):
      File "/home/johnsnore/Research/RFBNet/test_RFB.py", line 190, in top_k, thresh=0.01)
      File "/home/johnsnore/Research/RFBNet/test_RFB.py", line 143, in test_net testset.evaluate_detections(all_boxes, save_folder)
      File "/home/johnsnore/Research/RFBNet/data/voc0712.py", line 253, in evaluate_detections self._do_python_eval(output_dir)
      File "/home/johnsnore/Research/RFBNet/data/voc0712.py", line 310, in _do_python_eval use_07_metric=use_07_metric)
      File "/home/johnsnore/Research/RFBNet/data/voc_eval.py", line 152, in voc_eval BB = BB[sorted_ind, :]
    IndexError: too many indices for array

    And I found that BB = []. Is something wrong? Should I do something else before running test_RFB.py?

    opened by JingpengSun 5
  • How can I get the MS COCO 'train2014(or 2017)_gt_roidb.pkl' file in the cache folder?

    opened by seongkyun 5
  • Loss_l: inf in training?

    Hello, first of all, thanks for releasing your code. My training loss becomes inf (specifically, loss_l = inf). I use your original code (with only some bug fixes), but I don't know why I get inf. Parameters: lr 0.004, batch size 32, base model vgg_reducedfc.pth, GPU: 1080 Ti.

    Any comments will be appreciated. Thanks very much!

    opened by transcendentsky 5
  • Speed of RFBNet and SSD on COCO and VOC: have you considered the NMS time?

    Hi, thank you for sharing your code! When I test your code on VOC and COCO, I cannot reproduce the speed you reported. My configuration is a Titan Xp, PyTorch 0.4, and CUDA 9.0, and I get 20+ ms for VOC and 30+ ms for COCO, unlike your numbers (80+ FPS for VOC, and 15 ms for COCO). When testing speed, I use conf_thresh 0.01 and GPU NMS. I have also tried PyTorch 0.3 and still cannot reproduce the speed. For me, NMS takes ~20 ms on COCO and ~10 ms on VOC. So did you include the NMS time when reporting FPS? Any advice for measuring the correct speed? Thank you very much!

    opened by tjulyz 4
  • FileNotFoundError

    Loading base network...
    Initializing weights...
    Loading Dataset...
    Traceback (most recent call last):
      File "train_RFB.py", line 257, in train()
      File "train_RFB.py", line 165, in train dataset = VOCDetection(VOCroot, train_sets, preproc(
      File "/home/fsr/code/RFBNet/data/voc0712.py", line 173, in __init__ for line in open(os.path.join(rootpath, 'ImageSets', 'Main', name+'.txt')):
    FileNotFoundError: [Errno 2] No such file or directory: '/home/fsr/data/VOCdevkit/VOC2007/ImageSets/Main/trainval.txt'

    The file exists, but the error is still reported. How can I fix this?

    opened by Lixia1221 0
  • Is the convolution inside BasicSepConv a depthwise separable convolution?

    class BasicSepConv(nn.Module):

    def __init__(self, in_planes, kernel_size, stride=1, padding=0, dilation=1, groups=1, relu=True, bn=True, bias=False):
        super(BasicSepConv, self).__init__()
        self.out_channels = in_planes
        self.conv = nn.Conv2d(in_planes, in_planes, kernel_size=kernel_size, stride=stride, padding=padding, dilation=dilation, groups=in_planes, bias=bias)  # Is this a depthwise separable convolution? Can I replace it with Keras's depthwise convolution?
        self.bn = nn.BatchNorm2d(in_planes,eps=1e-5, momentum=0.01, affine=True) if bn else None
        self.relu = nn.ReLU(inplace=True) if relu else None
    
    def forward(self, x):
        x = self.conv(x)
        if self.bn is not None:
            x = self.bn(x)
        if self.relu is not None:
            x = self.relu(x)
        return x
    
    opened by getr1ch 0
  • Question about Random Crop transformation parameters

    Hi! First of all thanks for your great job and contribution!

    I was checking the code of the data augmentation transforms, specifically the _crop function, and I have found that for some images it is almost impossible to find a valid crop (as far as I understand) when the image has more than 3 or 4 bboxes.

    According to what I see in the code, you choose a minimum and maximum IoU randomly from this list:

    mode = random.choice((
        None,
        (0.1, None),
        (0.3, None),
        (0.5, None),
        (0.7, None),
        (0.9, None),
        (None, None),
    ))
    ...
    min_iou, max_iou = mode
    

    Later on you compute the crop (roi variable) and the IoU of the bboxes with the full crop:

    roi = np.array((l, t, l + w, t + h))
    iou = matrix_iou(boxes, roi[np.newaxis])
    

    and finally you check whether all the bboxes have an IoU with the whole crop bigger than the randomly selected minimum:

    if not (min_iou <= iou.min() and iou.max() <= max_iou):
        continue
    

    and if they do not, you try again, up to 50 trials.

    However, I think it's almost impossible to find a valid crop for images with more than a couple of bboxes, so they are never cropped unless we fall into the (None, None) case. E.g. https://cocodataset.org/#explore?id=363917.

    The reasons I find are:

    1. The IoU is computed against the whole crop area, so it is difficult to get high IoU values unless the crop is tight around the BBox.
    2. The minimum IoU is required for all the BBoxes. That is, it is always required that the crop takes all the BBoxes of the image.

    Am I right? Was this intended somehow?

    Thank you so much in advance!

    opened by mkmenta 0
  • fail to run ./make.sh

    ...; did you mean 'curexc_type'?
      tstate->exc_type = local_type;
    pycocotools/_mask.c:14327:13: error: 'PyThreadState {aka struct _ts}' has no member named 'exc_value'; did you mean 'curexc_value'?
      tstate->exc_value = local_value;
    pycocotools/_mask.c:14328:13: error: 'PyThreadState {aka struct _ts}' has no member named 'exc_traceback'; did you mean 'curexc_traceback'?
      tstate->exc_traceback = local_tb;
    error: command 'gcc' failed with exit status 1

    opened by KrystalCWT 4