Recurrent Scale Approximation (RSA) for Object Detection

Overview

This is the codebase for "Recurrent Scale Approximation for Object Detection in CNN", published at ICCV 2017 [arXiv]. We provide training and test code for the two modules in the paper, the scale-forecast network and recurrent scale approximation (RSA). Face detection models trained on several open datasets are also provided.

Note: This project is still underway. Please stay tuned for more features soon!

Codebase at a Glance

train/: Training code for modules scale-forecast network and RSA

predict/: Test code for the whole detection pipeline

afw_gtmiss.mat: Revised face data annotation mentioned in Section 4.1 in the paper.

Grab and Go (Demo)

Caffe models for face detection trained on popular datasets.

  • Base RPN model: predict/output/ResNet_3b_s16/tot_wometa_1epoch, trained on Widerface (fg/bg), COCO (bg only) and ImageNet Det (bg only)
  • RSA model: predict/output/ResNet_3b_s16_fm2fm_pool2_deep/65w, trained on Widerface, COCO, and ImageNet Det

Steps to run the test code:

  1. Compile CaffeMex_v2 with the MATLAB interface.

  2. Add CaffeMex_v2/matlab/ to the MATLAB search path.

  3. Read the tips in predict/script_start.m, then run it.

  4. After a few minutes of processing, the detection and alignment results are shown in an image window. Click the image window to step through all results. If line 8 of script_start.m is left at its default value of false, you should see results like those shown above.

Train Your Own Model

This part is still in progress and will be released later.

FAQ

We will list common issues with this project here as they come up. Stay tuned! :)

Citation

Please kindly cite our work if it helps your research:

@inproceedings{liu_2017_rsa,
  Author = {Yu Liu and Hongyang Li and Junjie Yan and Fangyin Wei and Xiaogang Wang and Xiaoou Tang},
  Title = {Recurrent Scale Approximation for Object Detection in CNN},
  Booktitle = {IEEE International Conference on Computer Vision},
  Year = {2017}
}

Acknowledgment

We appreciate the contribution of the following researchers:

Dong Chen @Microsoft Research: some of the basic ideas were inspired by him while Yu Liu was an intern at MSR.

Jiongchao Jin @Beihang University: he provided some of the baseline results.

Comments
  • Some questions about the paper

    (1) In the RSA section of your paper, I do not understand this paragraph: "During inference, we first have the possible scales of the input from the scale-forecast network. The image is then resized accordingly to the extent that the smallest scale (corresponding to the largest feature map) is resized to the range of [64; 128]." Does "the smallest scale" refer to the faces or to the image? And do you mean that the resize operation ensures that "the smallest scale" falls in the range [64; 128]?

    (2) For the scale-forecast network in Figure 3 of your paper: when it outputs 28, which lies in [20, 30], the face size is in the range [128, 256] and the corresponding m is 2, so after RSA the face size is in the range [32, 64]. When it outputs 35, which lies in [30, 40], the face size is in the range [256, 512] and the corresponding m is 3, so after RSA the face size is again in the range [32, 64]. So is the purpose of RSA to ensure that the face size falls in the range [32, 64]?

    Looking forward to your answers, thanks!

    opened by liyuanyaun 9
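
The arithmetic in question (2) can be checked with a small sketch. It follows the question's own reading, which the authors have not confirmed here: each RSA step is assumed to halve the effective face size, and the target band is assumed to be [32, 64). The names `rsa_steps`, `size_after_rsa` and `TARGET_LOW` are hypothetical and do not come from the repository.

```python
import math

# Assumption (the question's reading, not confirmed by the authors):
# each RSA step halves the effective face size, and the target band is [32, 64).
TARGET_LOW = 32


def rsa_steps(face_size: float) -> int:
    """Number of halvings m needed to bring face_size into [32, 64)."""
    if face_size < TARGET_LOW:
        raise ValueError("face is already smaller than the target band")
    return int(math.floor(math.log2(face_size / TARGET_LOW)))


def size_after_rsa(face_size: float) -> float:
    """Effective face size after m halvings."""
    return face_size / (2 ** rsa_steps(face_size))


for size in (128, 200, 255, 256, 400, 511):
    m = rsa_steps(size)
    print(f"size={size:3d}  m={m}  after RSA={size_after_rsa(size):.1f}")
```

Under these assumptions, sizes in [128, 256) give m = 2 and sizes in [256, 512) give m = 3, and both land in the same band, which is the pattern the question describes.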
  • On my GTX1080, the time cost is 0.6~0.8s/pic. Is that normal?

    My speed:

    >> script_start
    Cleared 1 solvers and 0 stand-alone nets
    Step 1 fcn: 1/4...Elapsed time is 0.255855 seconds.
    Step 1 fcn: 2/4...Elapsed time is 0.317631 seconds.
    Step 1 fcn: 3/4...Elapsed time is 0.167352 seconds.
    Step 1 fcn: 4/4...Elapsed time is 0.322786 seconds.
    Cleared 1 solvers and 0 stand-alone nets
    Step 2 rsa:  1/4...Elapsed time is 0.064430 seconds.
    Step 2 rsa:  2/4...Elapsed time is 0.057944 seconds.
    Step 2 rsa:  3/4...Elapsed time is 0.039679 seconds.
    Step 2 rsa:  4/4...Elapsed time is 0.055250 seconds.
    Cleared 1 solvers and 0 stand-alone nets
    Step 3 rpn:  1/4...Elapsed time is 0.339639 seconds.
    Step 3 rpn:  2/4...Elapsed time is 0.431017 seconds.
    Step 3 rpn:  3/4...Elapsed time is 0.279860 seconds.
    Step 3 rpn:  4/4...Elapsed time is 0.278757 seconds.
    

    Speed in comments:

    script_gen_featmap; % GPU runtime: 5ms per pic on Titan Xp @2048px
    script_featmap_transfer; % GPU runtime: 0.3ms per pic on Titan Xp
    script_featmap_2_result; % GPU runtime: 3.2ms per pic on Titan Xp
    

    My speed is about 100x slower than the figures in the comments. My graphics card is a GTX 1080, and I do not think the difference should be this large. Could anyone help?

    opened by happynear 2
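
To put the two sets of numbers side by side, the per-image totals can be worked out from the log in this thread. This is only arithmetic on the figures quoted above; the variable names are made up for illustration, and the sketch says nothing about what causes the gap.

```python
# Per-image times in seconds, copied from the log in this thread (4 images).
fcn = [0.255855, 0.317631, 0.167352, 0.322786]
rsa = [0.064430, 0.057944, 0.039679, 0.055250]
rpn = [0.339639, 0.431017, 0.279860, 0.278757]

# Reference figures from the script comments, in milliseconds per image (Titan Xp).
reference_ms = 5.0 + 0.3 + 3.2

# Total time per image is the sum of the three stages for that image.
per_image = [a + b + c for a, b, c in zip(fcn, rsa, rpn)]
mean_s = sum(per_image) / len(per_image)
slowdown = mean_s * 1000.0 / reference_ms

print("per-image totals (s):", [round(t, 3) for t in per_image])
print(f"mean: {mean_s:.3f} s, reference: {reference_ms:.1f} ms, ratio: {slowdown:.0f}x")
```

On these numbers the mean comes to roughly 0.65 s per image against a reference of 8.5 ms, so the measured gap is somewhat under the 100x estimate but of the same order.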
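  • Some questions about the paper and the code

    1. The paper mentions using a half-channel ResNet-18 as the LRN net, but I do not see the half-channel version in the code. How much do the two differ in accuracy?
    2. Is the Widerface training set with keypoints an internal dataset of yours that has not been open-sourced?
    3. I train the LRN on the original images at multiple scales [2048~64], and then train RSA on top of the first part of the trained LRN network. Is this approach correct? Could you describe the end-to-end training method mentioned in the paper?
    4. If boxes are used as ground truth in place of keypoints, is the impact large?

    Hoping for an answer. Many thanks!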

    opened by ElegantGod 0
  • Training code is missing

    @sciencefans: Dear sir, your README says "train/: Training code for modules scale-forecast network and RSA",

    but the train folder itself is missing from your repository. Either you forgot to upload it or it is not public.

    opened by lucwan 0
  • code available date?

    Hi @sciencefans, thank you for your work and paper. Do you know an approximate date for releasing your code? I would like to look into it and try it.

    Regards, Anand

    opened by ananddb90 0
Owner
Yu Liu (Louis)
Researcher@SenseTime