Receptive Field Block Net for Accurate and Fast Object Detection, ECCV 2018

Liu Songtao

Last update: Dec 21, 2022

Overview

By Songtao Liu, Di Huang, Yunhong Wang

Updates (2021/07/23): YOLOX is here! A stronger YOLO with ONNX, TensorRT, ncnn, and OpenVINO supported!

Updates: we propose a new method to get 42.4 mAP at 45 FPS on COCO; code is available here

Introduction

Inspired by the structure of Receptive Fields (RFs) in human visual systems, we propose a novel RF Block (RFB) module, which takes the relationship between the size and eccentricity of RFs into account, to enhance the discriminability and robustness of features. We further assemble the RFB module to the top of SSD with a lightweight CNN model, constructing the RFB Net detector. You can use the code to train/evaluate the RFB Net for object detection. For more details, please refer to our ECCV paper.

VOC2007 Test

System	mAP	FPS (Titan X Maxwell)
Faster R-CNN (VGG16)	73.2	7
YOLOv2 (Darknet-19)	78.6	40
R-FCN (ResNet-101)	80.5	9
SSD300* (VGG16)	77.2	46
SSD512* (VGG16)	79.8	19
RFBNet300 (VGG16)	80.7	83
RFBNet512 (VGG16)	82.2	38

COCO

System	test-dev mAP	Time (Titan X Maxwell)
Faster R-CNN++ (ResNet-101)	34.9	3.36s
YOLOv2 (Darknet-19)	21.6	25ms
SSD300* (VGG16)	25.1	22ms
SSD512* (VGG16)	28.8	53ms
RetinaNet500 (ResNet-101-FPN)	34.4	90ms
RFBNet300 (VGG16)	30.3	15ms
RFBNet512 (VGG16)	33.8	30ms
RFBNet512-E (VGG16)	34.4	33ms

MobileNet

System	COCO minival mAP	#parameters
SSD MobileNet	19.3	6.8M
RFB MobileNet	20.7	7.4M

Citing RFB Net

Please cite our paper in your publications if it helps your research:

@InProceedings{Liu_2018_ECCV,
author = {Liu, Songtao and Huang, Di and Wang, Yunhong},
title = {Receptive Field Block Net for Accurate and Fast Object Detection},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Contents

Installation
Datasets
Training
Evaluation
Models

Installation

Install PyTorch-0.4.0 by selecting your environment on the website and running the appropriate command.
Clone this repository. This repository is mainly based on ssd.pytorch and Chainer-ssd; huge thanks to them.
- Note: We currently only support PyTorch-0.4.0 and Python 3+.
Compile the nms and coco tools:

./make.sh

Note: Check your GPU architecture support in utils/build.py, line 131. Default is:

'nvcc': ['-arch=sm_52',
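The -arch value follows the GPU's CUDA compute capability: sm_52 corresponds to compute capability 5.2. A minimal sketch of that mapping follows. The helper name nvcc_arch_flag is hypothetical and not part of utils/build.py, and the example GPUs are listed only for illustration.

```python
def nvcc_arch_flag(major, minor):
    """Build an nvcc -arch flag from a CUDA compute capability (major, minor)."""
    return "-arch=sm_{}{}".format(major, minor)


# Compute capabilities of a few example GPUs (illustrative; check your own card).
EXAMPLE_GPUS = {
    "Titan X (Maxwell)": (5, 2),
    "GTX 1080 Ti": (6, 1),
    "Tesla V100": (7, 0),
}

flags = {name: nvcc_arch_flag(*cc) for name, cc in EXAMPLE_GPUS.items()}
for name, flag in flags.items():
    print(name, flag)
```

If your card is not a compute capability 5.2 device, the flag in utils/build.py would likely need to be changed accordingly before running ./make.sh.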

Then download the dataset by following the instructions below and install opencv.

conda install opencv

Note: For training, we currently support VOC and COCO.

Datasets

To make things easy, we provide simple VOC and COCO dataset loaders that inherit torch.utils.data.Dataset, making them fully compatible with the torchvision.datasets API.

VOC Dataset

Download VOC2007 trainval & test

# specify a directory for dataset to be downloaded into, else default is ~/data/
sh data/scripts/VOC2007.sh # <directory>

Download VOC2012 trainval

# specify a directory for dataset to be downloaded into, else default is ~/data/
sh data/scripts/VOC2012.sh # <directory>

COCO Dataset

Install the MS COCO dataset at /path/to/coco from the official website; the default is ~/data/COCO. Follow the instructions to prepare the minival2014 and valminusminival2014 annotations. All label files (.json) should be under the COCO/annotations/ folder. It should have this basic structure:

$COCO/
$COCO/cache/
$COCO/annotations/
$COCO/images/
$COCO/images/test2015/
$COCO/images/train2014/
$COCO/images/val2014/
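A quick way to confirm the layout above before training is to check that each expected folder exists. This is only a sketch: the helper missing_coco_dirs is hypothetical and not part of this repository, and the demo runs against a temporary directory rather than a real dataset.

```python
import tempfile
from pathlib import Path

# Sub-folders listed in the structure above, relative to the COCO root.
EXPECTED_SUBDIRS = (
    "cache",
    "annotations",
    "images/test2015",
    "images/train2014",
    "images/val2014",
)


def missing_coco_dirs(coco_root):
    """Return the expected sub-folders that do not exist under coco_root."""
    root = Path(coco_root).expanduser()
    return [sub for sub in EXPECTED_SUBDIRS if not (root / sub).is_dir()]


# Demo on a throwaway directory: everything is missing, then nothing is.
with tempfile.TemporaryDirectory() as tmp:
    missing_before = missing_coco_dirs(tmp)
    for sub in EXPECTED_SUBDIRS:
        (Path(tmp) / sub).mkdir(parents=True, exist_ok=True)
    missing_after = missing_coco_dirs(tmp)

print(missing_before)
print(missing_after)
```

On a real setup you would call missing_coco_dirs("~/data/COCO") (or your own path) and expect an empty list.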

UPDATE: The current COCO dataset has released new train2017 and val2017 sets, which are just new splits of the same image sets.

Training

First download the fc-reduced VGG-16 PyTorch base network weights at: https://s3.amazonaws.com/amdegroot-models/vgg16_reducedfc.pth or from our BaiduYun Driver.
The MobileNet pre-trained basenet is ported from MobileNet-Caffe and achieves slightly better accuracy than the original one reported in the paper. The weight file is available at: https://drive.google.com/open?id=13aZSApybBDjzfGIdqN1INBlPsddxCK14 or BaiduYun Driver.
By default, we assume you have downloaded the file into the RFBNet/weights dir:

mkdir weights
cd weights
wget https://s3.amazonaws.com/amdegroot-models/vgg16_reducedfc.pth

To train RFBNet using the train script, simply specify the parameters listed in train_RFB.py as flags or change them manually.

python train_RFB.py -d VOC -v RFB_vgg -s 300

Note:
- -d: choose the dataset, VOC or COCO.
- -v: choose the backbone version, RFB_vgg, RFB_E_vgg or RFB_mobile.
- -s: image size, 300 or 512.
- You can pick up training from a checkpoint by specifying the path as one of the training parameters (again, see train_RFB.py for options).
- If you want to reproduce the results in the paper, the VOC model should be trained for about 240 epochs, while the COCO version needs 130 epochs.
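The three flags above can be pictured as a small argparse parser. This is an illustrative sketch, not the actual contents of train_RFB.py: the long option names, defaults and choices lists are assumptions made for the example.

```python
import argparse


def build_parser():
    """Illustrative parser for the -d / -v / -s flags described above."""
    parser = argparse.ArgumentParser(description="RFBNet training options (sketch)")
    # Long option names below are assumed for readability.
    parser.add_argument("-d", "--dataset", default="VOC", choices=["VOC", "COCO"])
    parser.add_argument("-v", "--version", default="RFB_vgg",
                        choices=["RFB_vgg", "RFB_E_vgg", "RFB_mobile"])
    parser.add_argument("-s", "--size", default=300, type=int, choices=[300, 512])
    return parser


# Same arguments as the training command shown earlier.
args = build_parser().parse_args(["-d", "VOC", "-v", "RFB_vgg", "-s", "300"])
print(args.dataset, args.version, args.size)
```

Restricting each flag with choices makes a mistyped dataset, backbone or size fail at startup instead of partway through model construction.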

Evaluation

To evaluate a trained network:

python test_RFB.py -d VOC -v RFB_vgg -s 300 --trained_model /path/to/model/weights

By default, it directly outputs the mAP results on VOC2007 test or COCO minival2014. For VOC2012 test and COCO test-dev results, you can manually change the datasets in the test_RFB.py file, then save the detection results and submit them to the server.

Models

Comments

A question: I trained on a new training set with 10 classes in total, and this error appears at test time:

RuntimeError: Error(s) in loading state_dict for RFBNet:
size mismatch for conf.0.weight: copying a param of torch.Size([126, 512, 3, 3]) from checkpoint, where the shape is torch.Size([66, 512, 3, 3]) in current model.
size mismatch for conf.0.bias: copying a param of torch.Size([126]) from checkpoint, where the shape is torch.Size([66]) in current model.
size mismatch for conf.1.weight: copying a param of torch.Size([126, 1024, 3, 3]) from checkpoint, where the shape is torch.Size([66, 1024, 3, 3]) in current model.
size mismatch for conf.1.bias: copying a param of torch.Size([126]) from checkpoint, where the shape is torch.Size([66]) in current model.
size mismatch for conf.2.weight: copying a param of torch.Size([126, 512, 3, 3]) from checkpoint, where the shape is torch.Size([66, 512, 3, 3]) in current model.
size mismatch for conf.2.bias: copying a param of torch.Size([126]) from checkpoint, where the shape is torch.Size([66]) in current model.
size mismatch for conf.3.weight: copying a param of torch.Size([126, 256, 3, 3]) from checkpoint, where the shape is torch.Size([66, 256, 3, 3]) in current model.
size mismatch for conf.3.bias: copying a param of torch.Size([126]) from checkpoint, where the shape is torch.Size([66]) in current model.
size mismatch for conf.4.weight: copying a param of torch.Size([84, 256, 3, 3]) from checkpoint, where the shape is torch.Size([44, 256, 3, 3]) in current model.
size mismatch for conf.4.bias: copying a param of torch.Size([84]) from checkpoint, where the shape is torch.Size([44]) in current model.
size mismatch for conf.5.weight: copying a param of torch.Size([84, 256, 3, 3]) from checkpoint, where the shape is torch.Size([44, 256, 3, 3]) in current model.
size mismatch for conf.5.bias: copying a param of torch.Size([84]) from checkpoint, where the shape is torch.Size([44]) in current model.

I am not sure where the 126/84 in the checkpoint's torch.Size comes from or where to change it.
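One way to read these shapes, sketched under two assumptions that are not taken from the log itself: each conf head outputs (anchors per location) x (number of classes) channels, as in SSD-style detectors, and the six heads use 6, 6, 6, 6, 4, 4 anchors per location.

```python
# Assumed anchors per location for the six conf heads (conf.0 .. conf.5).
ASSUMED_ANCHORS = [6, 6, 6, 6, 4, 4]

# First dimension of each conf weight, as reported on the two sides of the error.
larger_shapes = [126, 126, 126, 126, 84, 84]
smaller_shapes = [66, 66, 66, 66, 44, 44]


def implied_class_counts(out_channels, anchors):
    """Return the set of class counts implied by out_channels / anchors."""
    counts = set()
    for channels, n_anchors in zip(out_channels, anchors):
        if channels % n_anchors != 0:
            raise ValueError("channels not divisible by the assumed anchor count")
        counts.add(channels // n_anchors)
    return counts


larger_classes = implied_class_counts(larger_shapes, ASSUMED_ANCHORS)
smaller_classes = implied_class_counts(smaller_shapes, ASSUMED_ANCHORS)
print(larger_classes, smaller_classes)
```

Under these assumptions the 126/84 shapes correspond to 21 classes (for example 20 VOC classes plus background) and the 66/44 shapes to 11 classes (10 plus background). That suggests the weights being loaded and the model being built were created with different num_classes values.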

opened by chenchch94 11
error when running test_RFB.py without cuda

test_RFB.py runs OK when cuda is set to True. However, when I set cuda to False in test_RFB.py, I got the following error:

Traceback (most recent call last):
  File "test_RFB.py", line 193, in
    top_k, thresh=0.01)
  File "demo_RFB.py", line 91, in test_net
    out = net(x) # forward pass
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 357, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/topspinn/2TB/src/RFBNet/models/RFB_Net_mobile.py", line 185, in forward
    x = self.basek
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 357, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/container.py", line 72, in forward
    input = module(input)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 357, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/conv.py", line 282, in forward
    self.padding, self.dilation, self.groups)
RuntimeError: Expected object of type Variable[torch.FloatTensor] but found type Variable[torch.cuda.FloatTensor] for argument #1 'weight'

opened by kaishijeng 11
I ran into a problem.

Traceback (most recent call last):
  File "test_RFB.py", line 49, in
    from models.RFB_Net_E_vgg import build_net
  File "/media/media_share/linkfile/RFBNet/models/RFB_Net_E_vgg.py", line 405
    return RFBNet(phase, size, *multibox(size, vgg(base[str(size)], 3),add_extras(size, extras[str(size)], 1024),mbox[str(size)], num_classes), num_classes)
SyntaxError: only named arguments may follow *expression

Could you help me?

Thank you very much.
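For context, this SyntaxError is what Python versions before 3.5 report when a positional argument follows a *expression in a call. Python 3.5 and later accept that form (PEP 448). The sketch below uses hypothetical stand-ins (multibox_stub and build) rather than the real RFBNet and multibox, to show the call shape from the traceback next to an equivalent form that older interpreters also accept.

```python
def multibox_stub():
    """Hypothetical stand-in for multibox(...): returns (base, extras, head)."""
    return ("base", "extras", "head")


def build(phase, size, base, extras, head, num_classes):
    """Hypothetical stand-in for the RFBNet constructor."""
    return (phase, size, base, extras, head, num_classes)


# Call shape from the traceback: a positional argument after *expression.
# This parses on Python 3.5+ only.
new_style = build("test", 300, *multibox_stub(), 21)

# Equivalent form: append the trailing argument to the tuple first,
# so the unpacking comes last in the call.
call_args = multibox_stub() + (21,)
old_style = build("test", 300, *call_args)

print(new_style == old_style)
```

Running the scripts with a Python 3.5+ interpreter is the other way to avoid the error.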

opened by 10183308 11
A problem about img channel

Hello, first, thank you for your amazing open-source algorithm. This may be a minor problem, but I would appreciate it if you have time to answer. In your data/data_augment.py file, you use cv2 to read the image file. As we all know, cv2 reads images in BGR order, but you subtract the mean value (104, 117, 123), which is in RGB order according to your comment. A few days ago, I tried (123, 117, 104) instead and got a small improvement in both precision and recall, but I'm not sure whether that is because of the order of the mean values. Would you please help me with this problem? Thanks again for your great algorithm.
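A small sketch of the channel-order point, assuming the commonly quoted ImageNet per-channel means of roughly 123 (red), 117 (green) and 104 (blue). Under that assumption, (104, 117, 123) already lines up with the BGR order that cv2.imread returns. The helper names are hypothetical and plain tuples stand in for image pixels.

```python
# Mean tuple from the question, interpreted here as (blue, green, red).
BGR_MEAN = (104, 117, 123)


def subtract_mean(pixel, mean):
    """Subtract a per-channel mean from one pixel, channel by channel."""
    return tuple(p - m for p, m in zip(pixel, mean))


def reverse_channels(values):
    """Swap between BGR and RGB ordering."""
    return tuple(reversed(values))


# A pixel whose colour equals the assumed dataset mean, stored in BGR order.
mean_pixel_bgr = (104, 117, 123)

matched = subtract_mean(mean_pixel_bgr, BGR_MEAN)                       # same order
mismatched = subtract_mean(mean_pixel_bgr, reverse_channels(BGR_MEAN))  # RGB mean on BGR pixel

print(matched, mismatched)
```

With matching order the mean pixel maps to zero in every channel. With the reversed tuple the blue and red channels are each offset by 19. Whether that offset explains the accuracy change reported above cannot be told from this sketch.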

opened by left4back 7
how to deal with the bug of make.sh

Hi @ruinmessi! I ran ./make.sh but got:

g++ -pthread -shared -B /home/liye/anaconda3/compiler_compat -L/home/liye/anaconda3/lib -Wl,-rpath=/home/liye/anaconda3/lib,--no-as-needed build/temp.linux-x86_64-3.6/nms/nms_kernel.o build/temp.linux-x86_64-3.6/nms/gpu_nms.o -L/usr/local/cuda-8.0/lib64 -L/home/liye/anaconda3/lib -R/usr/local/cuda-8.0/lib64 -lcudart -lpython3.6m -o /media/ubuntue/extdisk1/liye/RFBNet-master/utils/nms/gpu_nms.cpython-36m-x86_64-linux-gnu.so
g++: error: unrecognized command line option ‘-R’
error: command 'g++' failed with exit status 1

Why? Can you help me?

opened by liye228 6
error in training mobilenet

@ruinmessi

I followed your instructions below to train VOC with MobileNet, but got an error:

python3 train_RFB.py -d VOC -v RFB_mobile -s 300
300 21
Traceback (most recent call last):
  File "train_RFB.py", line 88, in
    net = build_net('train', img_dim, num_classes)
  File "/home/topspin/2TB/src/RFBNet/models/RFB_Net_mobile.py", line 348, in build_net
    mbox[str(size)], num_classes), num_classes)
TypeError: __init__() missing 2 required positional arguments: 'head' and 'num_classes'

Any idea why this happens?

Thanks,

opened by kaishijeng 6
High overhead GPU to CPU

The conversion of boxes (cuda float tensors), which are returned from the detector forward pass, to cpu float tensors has extremely high overhead. (I ignored the conversion to a numpy array, which takes about a microsecond.)

boxes = boxes.cpu().numpy()

It takes approximately 22 milliseconds on a 512 input size (detection time is approximately 9 milliseconds)

opened by evroni 6
test trained model RFBNet300

Hi, thank you for your great work.

When I try to evaluate the trained model RFBNet300_VOC_80_7, which is one of your provided models, I get the following error:

im_detect: 1/10 0.022s 0.001s
Evaluating detections
Writing aeroplane VOC results file
Writing bicycle VOC results file
Writing bird VOC results file
Writing boat VOC results file
Writing bottle VOC results file
Writing bus VOC results file
Writing car VOC results file
Writing cat VOC results file
Writing chair VOC results file
Writing cow VOC results file
Writing diningtable VOC results file
Writing dog VOC results file
Writing horse VOC results file
Writing motorbike VOC results file
Writing person VOC results file
Writing pottedplant VOC results file
Writing sheep VOC results file
Writing sofa VOC results file
Writing train VOC results file
Writing tvmonitor VOC results file
VOC07 metric? Yes
Traceback (most recent call last):
  File "/home/johnsnore/Research/RFBNet/test_RFB.py", line 190, in
    top_k, thresh=0.01)
  File "/home/johnsnore/Research/RFBNet/test_RFB.py", line 143, in test_net
    testset.evaluate_detections(all_boxes, save_folder)
  File "/home/johnsnore/Research/RFBNet/data/voc0712.py", line 253, in evaluate_detections
    self._do_python_eval(output_dir)
  File "/home/johnsnore/Research/RFBNet/data/voc0712.py", line 310, in _do_python_eval
    use_07_metric=use_07_metric)
  File "/home/johnsnore/Research/RFBNet/data/voc_eval.py", line 152, in voc_eval
    BB = BB[sorted_ind, :]
IndexError: too many indices for array

And I found BB=[]. Is something wrong? Should I do something else before running test_RFB.py?

opened by JingpengSun 5
How can i get MS COCO 'train2014(or2017)_gt_roidb.pkl'file in cache folder?

I'm trying to train this code on my PC following your instructions, but there is a problem with the process of generating ~_gt_roidb.pkl. I've read the dataset instructions, but nothing is explained about generating the MS COCO cache folder and the ~_gt_roidb.pkl file, and I cannot find the method at https://github.com/rbgirshick/py-faster-rcnn/blob/77b773655505599b94fd8f3f9928dbf1a9a776c7/data/README.md either. Is there any way to get that file to train this model?

opened by seongkyun 5
Loss_l : inf in training ?

Hello, first of all, thanks for releasing your code. My training loss is inf; actually, loss_l = inf. I use your original code (with only some bugs fixed), but I don't know why I get inf. Parameters: lr: 0.004, batchsize: 32, base_model: vgg_reducedfc.pth, GPU: 1080ti

Any comments will be appreciated. Thanks very much!

opened by transcendentsky 5
Speed of RFBNet and SSD on COCO and VOC: have you considered the NMS time?

Hi, thank you for sharing your code! When I test your code on VOC and SSD, I cannot reproduce the speed you reported. My configuration is Titan XP, pytorch 0.4, cuda 9.0, and I get 20+ ms for VOC and 30+ ms for COCO, unlike your results (80+ FPS, and 15 ms for COCO). When testing the speed, I use conf_thresh 0.01 and gpunms. I have also tried pytorch 0.3 and still cannot reproduce the speed. The NMS time for me is ~20 ms for COCO and ~10 ms for VOC. So did you include the NMS time when reporting the FPS? Any advice for getting the correct speed? Thank you very much!

opened by tjulyz 4
FileNotFoundError

Loading base network...
Initializing weights...
Loading Dataset...
Traceback (most recent call last):
  File "train_RFB.py", line 257, in
    train()
  File "train_RFB.py", line 165, in train
    dataset = VOCDetection(VOCroot, train_sets, preproc(
  File "/home/fsr/code/RFBNet/data/voc0712.py", line 173, in __init__
    for line in open(os.path.join(rootpath, 'ImageSets', 'Main', name+'.txt')):
FileNotFoundError: [Errno 2] No such file or directory: '/home/fsr/data/VOCdevkit/VOC2007/ImageSets/Main/trainval.txt'

The file exists, but the error is still reported. How should I fix it?

opened by Lixia1221 0

Is the convolution in BasicSepConv a depthwise separable convolution?

import torch.nn as nn

class BasicSepConv(nn.Module):

    def __init__(self, in_planes, kernel_size, stride=1, padding=0, dilation=1, groups=1, relu=True, bn=True, bias=False):
        super(BasicSepConv, self).__init__()
        self.out_channels = in_planes
        # Is this a depthwise separable convolution? Could Keras's depthwise layer be used instead?
        # groups=in_planes gives every input channel its own filter.
        self.conv = nn.Conv2d(in_planes, in_planes, kernel_size=kernel_size, stride=stride, padding=padding, dilation=dilation, groups=in_planes, bias=bias)
        self.bn = nn.BatchNorm2d(in_planes, eps=1e-5, momentum=0.01, affine=True) if bn else None
        self.relu = nn.ReLU(inplace=True) if relu else None

    def forward(self, x):
        x = self.conv(x)
        if self.bn is not None:
            x = self.bn(x)
        if self.relu is not None:
            x = self.relu(x)
        return x
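A note on the question, with a small weight-count sketch. A Conv2d whose groups, input channels and output channels are all equal is a depthwise convolution. A depthwise separable convolution is usually described as that depthwise step followed by a 1x1 pointwise convolution, which the block above does not include. The function below only counts weights using the standard grouped-convolution formula; the channel count 256 and kernel size 3 are arbitrary example values.

```python
def conv2d_weight_count(in_ch, out_ch, kernel_size, groups=1):
    """Number of weights (no bias) in a 2D convolution with square kernels."""
    if in_ch % groups or out_ch % groups:
        raise ValueError("in_ch and out_ch must be divisible by groups")
    return out_ch * (in_ch // groups) * kernel_size * kernel_size


channels, k = 256, 3  # example values, not taken from the repository

standard = conv2d_weight_count(channels, channels, k)                    # dense 3x3
depthwise = conv2d_weight_count(channels, channels, k, groups=channels)  # as in BasicSepConv
pointwise = conv2d_weight_count(channels, channels, 1)                   # 1x1 mixing step

print(standard, depthwise, depthwise + pointwise)
```

In Keras terms, a depthwise layer would be the closer match to this block than a separable-convolution layer, since the latter also applies the pointwise step.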

opened by getr1ch 0

Question about Random Crop transformation parameters
Hi! First of all thanks for your great job and contribution!

I was checking the code of the data augmentation transforms, specifically the _crop function. But, I have found that in some images it is almost impossible to find a valid crop (according to what I understand), when the image has more than 3 or 4 bboxes.

According to what I see in the code, you choose a minimum and maximum IoU randomly from this list:

mode = random.choice((
    None,
    (0.1, None),
    (0.3, None),
    (0.5, None),
    (0.7, None),
    (0.9, None),
    (None, None),
))
...
min_iou, max_iou = mode

Later on you compute the crop (the roi variable) and the IoU of the bboxes with the full crop:

roi = np.array((l, t, l + w, t + h))
iou = matrix_iou(boxes, roi[np.newaxis])

and finally you check whether all the bboxes have an IoU with the whole crop that is at least the randomly selected minimum:

if not (min_iou <= iou.min() and iou.max() <= max_iou):
    continue

If they do not, you try again until you reach 50 trials.

However, I think it's almost impossible to find a valid crop for images with a couple of bboxes. So, they are never cropped unless we fall into the (None, None) case. E.g. https://cocodataset.org/#explore?id=363917.

The reasons I find are:

The IoU is computed against the whole crop area, so it is difficult to get high IoU values unless the crop is tight around the BBox.

The minimum IoU is required for all the BBoxes. That is, it is always required that the crop takes all the BBoxes of the image.
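The two reasons above can be illustrated with a few numbers. This sketch assumes that matrix_iou computes the standard intersection-over-union between each box and the crop; the iou helper and the box coordinates are made up for the example.

```python
def iou(box, roi):
    """Standard IoU of two (x1, y1, x2, y2) rectangles."""
    ix1, iy1 = max(box[0], roi[0]), max(box[1], roi[1])
    ix2, iy2 = min(box[2], roi[2]), min(box[3], roi[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_box = (box[2] - box[0]) * (box[3] - box[1])
    area_roi = (roi[2] - roi[0]) * (roi[3] - roi[1])
    return inter / (area_box + area_roi - inter)


crop = (0, 0, 300, 300)
# Two boxes that lie fully inside the crop.
boxes = [(50, 50, 150, 150), (200, 100, 280, 260)]

ious = [iou(b, crop) for b in boxes]
thresholds = (0.1, 0.3, 0.5, 0.7, 0.9)
passing = [t for t in thresholds if t <= min(ious)]

print(ious, passing)
```

For a box fully inside the crop, the IoU reduces to box area divided by crop area, so here both values stay below 0.15 and only the 0.1 threshold is satisfied even though the crop contains both boxes entirely.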

Am I right? Was this intended somehow?

Thank you so much in advance!
opened by mkmenta 0
fail to run ./make.sh

c_type’? tstate->exc_type = local_type; ^~~~~~~~ curexc_type pycocotools/_mask.c:14327:13: error: ‘PyThreadState {aka struct _ts’ has no member named ‘exc_value’; did you mean ‘curexc_value’? tstate->exc_value = local_value; ^~~~~~~~~ curexc_value pycocotools/_mask.c:14328:13: error: ‘PyThreadState {aka struct _ts’ has no member named ‘exc_traceback’; did you mean ‘curexc_tracebac’? tstate->exc_traceback = local_tb; ^~~~~~~~~~~~~ curexc_traceback error: command 'gcc' failed with exit status 1

opened by KrystalCWT 4

Owner

Liu Songtao

I, Xiao Feng, am an upstanding man~ Factos👍👀

GitHub

The Medical Detection Toolkit contains 2D + 3D implementations of prevalent object detectors such as Mask R-CNN, Retina Net, Retina U-Net, as well as a training and inference framework focused on dealing with medical images.

1.2k Jan 4, 2023

Realtime segmentation with ENet, the fast and accurate segmentation net.

Enet This is a realtime segmentation net with almost 22 fps on GTX1080 ti, and the model size is very small with only 28M. This repo contains the infe

14 Aug 30, 2022

Example-custom-ml-block-keras - Custom Keras ML block example for Edge Impulse

Custom Keras ML block example for Edge Impulse This repository is an example on

8 Nov 2, 2022

Code for ACM MM2021 paper "Complementary Trilateral Decoder for Fast and Accurate Salient Object Detection"

CTDNet The PyTorch code for ACM MM2021 paper "Complementary Trilateral Decoder for Fast and Accurate Salient Object Detection" Requirements Python 3.6

28 Oct 20, 2022

PyTorch code for our ECCV 2018 paper "Image Super-Resolution Using Very Deep Residual Channel Attention Networks"

1.2k Dec 26, 2022

Unofficial PyTorch implementation of "RTM3D: Real-time Monocular 3D Detection from Object Keypoints for Autonomous Driving" (ECCV 2020)

RTM3D-PyTorch The PyTorch Implementation of the paper: RTM3D: Real-time Monocular 3D Detection from Object Keypoints for Autonomous Driving (ECCV 2020

271 Nov 29, 2022

PN-Net a neural field-based framework for depth estimation from single-view RGB images.

PN-Net We present a neural field-based framework for depth estimation from single-view RGB images. Rather than representing a 2D depth map as a single

1 Oct 2, 2021

Neural networks applied in recognizing guitar chords using python, AutoML.NET with C# and .NET Core

Chord Recognition Demo application The demo application is written in C# with .NETCore. As of July 9, 2020, the only version available is for windows

24 Oct 22, 2022

U^2-Net - Portrait matting This repository explores possibilities of using the original u^2-net model for portrait matting.

104 Nov 25, 2022

MOT-Tracking-by-Detection-Pipeline - For Tracking-by-Detection format MOT (Multi Object Tracking), is it a framework that separates Detection and Tracking processes?

MOT-Tracking-by-Detection-Pipeline Tracking-by-Detection format MOT (Multi Object Trac

41 Nov 23, 2022

U-2-Net: U Square Net - Modified for paired image training of style transfer

RGBD-Net - This repository contains a pytorch lightning implementation for the 3DV 2021 RGBD-Net paper.

1st Place Solution to ECCV-TAO-2020: Detect and Represent Any Object for Tracking

CFC-Net: A Critical Feature Capturing Network for Arbitrary-Oriented Object Detection in Remote Sensing Images

U2-Net: Going Deeper with Nested U-Structure for Salient Object Detection

Hybrid CenterNet - Hybrid-supervised object detection / Weakly semi-supervised object detection

Yolo object detection - Yolo object detection with python

A lightweight deep network for fast and accurate optical flow estimation.

A Fast and Accurate One-Stage Approach to Visual Grounding, ICCV 2019 (Oral)