Light-Head R-CNN


We release code for Light-Head R-CNN.

It reflects the practices I have found most effective in my own research.

This repo is organized as follows:

    |    |->user
    |    |    |->your_models

Main Results

  1. We train on COCO trainval, which includes 80k training and 35k validation images, and test on minival, a 5k subset of the validation set. Note that test-dev results should be slightly higher than minival results.
  2. We provide details of some crucial ablation experiments, so it is easy to diff the configurations and see what changed.
  3. We share our training logs in the GoogleDrive output folder, which contains dumped models, training loss, and the speed of each step. (Experiments were run on 8 Titan Xp GPUs with a batch size of 2 per GPU. Training should finish within one day.)
  4. Because of limited time, extra experiments are coming soon.
    Model Name                                      mAP@all  mAP@0.5  mAP@0.75  mAP@S  mAP@M  mAP@L
    R-FCN, ResNet-v1-101 (our reproduced baseline)  35.5     54.3     33.8      12.8   34.9   46.1
    Light-Head R-CNN                                38.2     60.9     41.0      20.9   42.2   52.8
    + align pooling                                 39.3     61.0     42.4      22.2   43.8   53.2
    + align pooling + nms0.5                        40.0     62.1     42.9      22.5   44.6   54.0

Experiments path related to model:



Requirements

  1. tensorflow-gpu==1.5.0 (We have only tested on TensorFlow 1.5.0. Earlier TensorFlow versions are not supported because of our GPU NMS implementation.)
  2. python3. We recommend using Anaconda, as it already includes many common packages. (python2 is not tested.)
  3. Some Python packages might be missing. Please install them according to the error messages.
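
The requirements above can be checked before compiling anything. The following is a minimal sketch, not part of the repo: the helper names `version_tuple` and `check_environment` are hypothetical, and the strict equality test on 1.5.0 is an assumption based on the statement that only that version was tested.

```python
import re
import sys

REQUIRED_TF = (1, 5, 0)  # the only TensorFlow version the README reports as tested


def version_tuple(version):
    """Turn a version string such as '1.5.0' into a tuple of ints: (1, 5, 0)."""
    parts = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)  # keep only the leading digits of each piece
        parts.append(int(match.group()) if match else 0)
    return tuple(parts)


def check_environment(tf_version, python_version=sys.version_info):
    """Return a list of human-readable problems; an empty list means no problem found."""
    problems = []
    if python_version[0] < 3:
        problems.append("python3 is required (python2 is not tested)")
    if version_tuple(tf_version) != REQUIRED_TF:
        problems.append("tensorflow-gpu==1.5.0 expected, found " + tf_version)
    return problems
```

In a real environment the first argument would typically be `tensorflow.__version__`; it is passed in as a string here so the sketch runs without TensorFlow installed.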

Installation, Prepare data, Testing, Training


  1. Clone the Light-Head R-CNN repository. We'll call the directory that you cloned Light-Head R-CNN into ${lighthead_ROOT}.
git clone
  1. Compiling
cd ${lighthead_ROOT}/lib;

Make sure everything compiles successfully. If errors occur, check the FAQ, which covers some common compile errors.

  1. Create log dump directory, data directory.
cd ${lighthead_ROOT};
mkdir output
mkdir data

Prepare data

data should be organized as follows:

    |    |->odformat
    |    |->instances_xxx.json
    |    |train2014
    |    |val2014

Download res101 basemodel:

wget -v
tar -xzvf resnet_v1_101_2016_08_28.tar.gz
mv resnet_v1_101.ckpt res101.ckpt

We convert instances_xxx.json to odformat (object detection format). Each line in odformat is a JSON annotation for one image. Our converted odformat files are shared in GoogleDrive.
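
To illustrate the one-JSON-record-per-image idea, here is a rough sketch of such a conversion. The input keys (`images`, `annotations`, `image_id`, `bbox`, `category_id`, `file_name`) follow the standard COCO instances layout. The output keys (`fpath`, `height`, `width`, `gtboxes`) and the function name are hypothetical, chosen only for illustration. Check the shared odformat files for the real schema.

```python
import json
from collections import defaultdict


def coco_to_odformat_lines(coco):
    """Group COCO annotations by image and emit one JSON string per image.

    The output keys used here are hypothetical, not the repo's actual schema.
    """
    boxes_by_image = defaultdict(list)
    for ann in coco["annotations"]:
        boxes_by_image[ann["image_id"]].append(
            {"box": ann["bbox"], "category_id": ann["category_id"]}
        )

    lines = []
    for img in coco["images"]:
        record = {
            "fpath": img["file_name"],
            "height": img["height"],
            "width": img["width"],
            # images without annotations get an empty list
            "gtboxes": boxes_by_image.get(img["id"], []),
        }
        lines.append(json.dumps(record))
    return lines
```

Writing the result with `"\n".join(lines)` gives a file that can be read back one image at a time, without loading the whole annotation set into memory.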


Testing

  1. Use -d to assign gpu_id for testing (e.g. -d 0,1,2,3 or -d 0-3).
  2. Use -s to visualize the results.
  3. Use -se to specify start_epoch for testing.
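
The two accepted spellings of the -d value (a comma list or a range) can be handled by one small parser. The sketch below is an assumption about how such flags might be parsed, not the repo's actual code: `parse_devices`, `build_parser`, and the long option names are hypothetical.

```python
import argparse


def parse_devices(spec):
    """Expand a GPU spec such as '0,1,2,3' or '0-3' into a list of ints."""
    devices = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            start, end = part.split("-")
            devices.extend(range(int(start), int(end) + 1))  # range end is inclusive
        elif part:
            devices.append(int(part))
    return devices


def build_parser():
    """Build a parser with the three flags described above (long names are made up)."""
    parser = argparse.ArgumentParser(description="hypothetical test CLI")
    parser.add_argument("-d", "--devices", default="0", type=parse_devices)
    parser.add_argument("-s", "--show", action="store_true")
    parser.add_argument("-se", "--start_epoch", type=int, default=0)
    return parser
```

For example, `build_parser().parse_args(["-d", "0-7", "-se", "26"])` yields eight device ids and a start epoch of 26.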

We share our experiments output (logs) folder in GoogleDrive. Download it and place it in ${lighthead_ROOT}, then test our released model.


cd experiments/lizeming/light_head_rcnn.ori_res101.coco.ps_roialign
python3 -d 0-7 -se 26


Training

We provide commonly used scripts in tools, which can be linked to the experiments folder.


cd experiments/lizeming/light_head_rcnn.ori_res101.coco.ps_roialign
python3 -tool
cp tools/ .
python3 -d 0-7


This repo is designed to be fast and simple for research. Some things can still be improved: the anchor_target and proposal_target layers are implemented with tf.py_func, which means they run on the CPU.


This is an implementation of Light-Head R-CNN. It is worth noting that:

  • The original implementation is based on our internal platform used at Megvii. There are slight differences in the final accuracy and running time because of the many details involved in switching platforms.
  • The code is tested on a server with 8 Pascal Titan Xp GPUs, 188.00 GB memory, and a 40-core CPU.
  • We wrote a faster NMS for our internal platform, while here we use tf.nms instead.
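
For readers unfamiliar with NMS (non-maximum suppression), the following is a minimal pure-Python sketch of the standard greedy algorithm. It is neither the internal fast version nor the TensorFlow op mentioned above. The default threshold of 0.5 is an assumption that mirrors the "nms0.5" row in the results table, and boxes are assumed to be (x1, y1, x2, y2) tuples.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: visit boxes by descending score, keep a box only if it
    does not overlap any already kept box by more than iou_threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep
```

This version is quadratic in the number of boxes, which is part of why a dedicated GPU kernel is attractive when there are thousands of proposals per image.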

Citing Light-Head R-CNN

If you find Light-Head R-CNN useful in your research, please consider citing:

  title={Light-Head R-CNN: In Defense of Two-Stage Object Detector},
  author={Li, Zeming and Peng, Chao and Yu, Gang and Zhang, Xiangyu and Deng, Yangdong and Sun, Jian},
  journal={arXiv preprint arXiv:1711.07264},


FAQ

  • fatal error: cuda/cuda_config.h: No such file or directory

First, find where cuda_config.h is located.


find /usr/local/lib/ | grep cuda_config.h

Then export your CPATH, for example:

export CPATH=$CPATH:/usr/local/lib/python3.5/dist-packages/external/local_config_cuda/cuda/
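
The same search can be scripted. Because the compiler looks for the relative path cuda/cuda_config.h, the directory to add to CPATH is the one that contains the cuda/ folder, not the folder holding the header itself. The sketch below assumes that reading of the error message. `include_roots` and `cpath_export_line` are hypothetical helper names.

```python
import os


def include_roots(root, relative="cuda/cuda_config.h"):
    """Return every directory D under `root` such that D/<relative> exists."""
    rel_parts = relative.split("/")
    name = rel_parts[-1]
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if name not in filenames:
            continue
        full = os.path.join(dirpath, name)
        # Walk up one level for each directory component of the include path.
        candidate = dirpath
        for _ in rel_parts[:-1]:
            candidate = os.path.dirname(candidate)
        # Keep the candidate only if the header really sits at candidate/<relative>.
        if os.path.join(candidate, *rel_parts) == full:
            hits.append(candidate)
    return sorted(hits)


def cpath_export_line(include_root):
    """Build the shell line that appends a directory to CPATH."""
    return "export CPATH=$CPATH:" + include_root
```

Calling `include_roots("/usr/local/lib/")` and passing one of the results to `cpath_export_line` prints a line that can be pasted into the shell before recompiling.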
  • fatal error: cuda/include/cuda.h: No such file or directory

    Hello, I'm running this in a Docker container with CUDA 9.0 and TensorFlow 1.5.0 installed from pip3. When I run it, it gets stuck while compiling. The exact error message is as follows:

    In file included from
    /usr/local/lib/python3.5/dist-packages/tensorflow/include/tensorflow/core/util/cuda_kernel_helper.h:24:31: fatal error: cuda/include/cuda.h: No such file or directory

    Do I have to compile tensorflow from source?

    opened by nguyeho7 9
  • Has anyone run this successfully on a single GTX 1080 GPU? I tried it and ran out of memory.

    I added tfconfig.gpu_options.per_process_gpu_memory_fraction = 0.05 to let it run, but I got error information like the following:

    ...
    2018-05-18 19:14:25.380430: W tensorflow/core/common_runtime/] Allocator (GPU_0_bfc) ran out of memory trying to allocate 58.69MiB. Current allocation summary follows.
    2018-05-18 19:14:25.380546: I tensorflow/core/common_runtime/] Bin (256): Total Chunks: 38, Chunks in use: 37. 9.5KiB allocated for chunks. 9.2KiB in use in bin. 7.6KiB client-requested in use in bin.
    ...
    4] 1 Chunks of size 91656192 totalling 87.41MiB
    2018-05-18 19:14:25.404137: I tensorflow/core/common_runtime/] Sum Total of in-use chunks: 374.93MiB
    2018-05-18 19:14:25.404163: I tensorflow/core/common_runtime/] Stats:
    Limit: 425407283
    InUse: 393138944
    MaxInUse: 393138944
    NumAllocs: 1096
    MaxAllocSize: 91656192

    2018-05-18 19:14:25.404278: W tensorflow/core/common_runtime/] **************************************************************_****____******************xxxxxxx
    2018-05-18 19:14:25.404328: W tensorflow/core/framework/] OP_REQUIRES failed at : Resource exhausted: OOM when allocating tensor with shape[1,64,400,601] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
    Traceback (most recent call last):
      File "", line 244, in
        eval_all(args)
      File "", line 137, in eval_all
        result_dict = inference(func, inputs, data_dict)
      File "", line 69, in inference
        _, scores, pred_boxes, rois = val_func(feed_dict=feed_dict)
      File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/", line 905, in run
        run_metadata_ptr)
      File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/", line 1140, in _run
        feed_dict_tensor, options, run_metadata)
      File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/", line 1321, in _do_run
        run_metadata)
      File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/", line 1340, in _do_call
        raise type(e)(node_def, op, message)
    tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[1,64,400,601] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
     [[Node: resnet_v1_101/conv1/Conv2D = Conv2D[T=DT_FLOAT, data_format="NCHW", dilations=[1, 1, 1, 1], padding="VALID", strides=[1, 1, 2, 2], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](resnet_v1_101/conv1/Conv2D-0-TransposeNHWCToNCHW-LayoutOptimizer, resnet_v1_101/conv1/weights/read)]]
    Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

     [[Node: resnet_v1_101_5/concat_3/_1133 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_2610_resnet_v1_101_5/concat_3", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

    Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

    opened by jiafeixiaoye 6
  • iteration stopped randomly

    After training for a varying number of iterations, the iteration stops and I cannot find the reason. I use one 1080 Ti and my own dataset converted to odformat. It looks like it stops inside 'socket.recv'. I have ruled out a stop inside the 'get_data_for_singlegpu' function, because I added a print at every possible exit and none of them was reached.

    opened by xtanitfy 6
  • Trying to create frozen inference graph. Multiple nodes with the names 'cls_fc' and 'cls_prob' & 'bbox_fc'. Any help?

    Hi everyone, I have a set of saved checkpoints but in order to do inference, I have to always load the file, modify it a little bit and accordingly make inference. But, it would be much easier if there was a way to convert them to a frozen graph. But in order to freeze the graph, we need nodes corresponding to the bounding box and class scores. It looks like 'cls_fc', 'cls_prob' and 'bbox_fc' are the nodes of interest but I do not see them in the nodes of the graph.

    Printing nodes using: [n for n in tf.get_default_graph().as_graph_def().node]

    If anybody could help me with identifying the respective output nodes, it will be great. Thanks!

    opened by karansomaiah 5
  • compile lib_kernel/lib_fast_nms/fast_nms using GPU V100

    Hi, I tried to compile this code using two GPUs V100 using sm_70 and I'm getting this warning during compiling and this error when I run the

    /usr/local/cuda-9.0/bin/../targets/x86_64-linux/include/sm_30_intrinsics.hpp(213): here was declared deprecated ("__shfl_down() is not valid on compute_70 and above, and should be replaced with __shfl_down_sync().To continue using __shfl_down(), specify virtual architecture compute_60 when targeting sm_70 and above, for example, using the pair of compiler options: -arch=compute_60 -code=sm_70.")
    NotFoundError: /home/edgar/light_head_rcnn/lib/lib_kernel/lib_fast_nms/ undefined symbol: _ZN10tensorflow7strings6StrCatERKNS0_8AlphaNumE

    Also, when I use -arch=compute_60 -code=sm_70, I got this warning during compiling and the same error when I run the

    /usr/local/cuda/bin/../targets/x86_64-linux/include/sm_30_intrinsics.hpp(213): here was declared deprecated ("__shfl_down() is deprecated in favor of __shfl_down_sync() and may be removed in a future release (Use -Wno-deprecated-declarations to suppress this warning).")

    The lines to be compiled are:

    nvcc -std=c++11 -c -o \
    	-I $TF_INC -D GOOGLE_CUDA=1 -x cu -Xcompiler -fPIC -arch=compute_60 -code=sm_70 --expt-relaxed-constexpr -Wno-deprecated-declarations
    opened by emedinac 5
  • Shufflenet as backbone

    Hey guys,

    Anyone tried to replace the backbone by something like a Shufflenet or Mobilenet ? Since the Xception model is not released maybe it could be a good alternative to improve the inference speed ! I'm trying to add the from to but during the training the rpn_cls_loss seems to be switching between 0.5, 0.6, 0.7, 0.8 and 0.9 without decreasing further....

    Thanks for your help !

    opened by YellowKyu 4
  • Iteration stop randomly again

    I tried to update msgpack-numpy and msgpack as you said, but it doesn't work. Can you tell us what system you are using? Ubuntu16.04 automatically loses IP over a period of time, but your code uses pip. I suspect that it is an iterative random stop caused by a system problem. Print log as follows:

    ch:0, iter:4399, rpn_loss_cls: 0.0677, rpn_loss_box: 0.0325, loss_cls: 0.3604, loss_box: 0.6708, tot_losses: 1.1314, lr: 0.0006, speed: 0.391s/iter: 32%|▎| 43
    epoch:0, iter:4400, rpn_loss_cls: 0.0281, rpn_loss_box: 0.0015, loss_cls: 0.0247, loss_box: 0.0002, tot_losses: 0.0544, lr: 0.0006, speed: 0.391s/iter: 32%|▎| 44
    epoch:0, iter:4401, rpn_loss_cls: 0.0373, rpn_loss_box: 0.0025, loss_cls: 0.0489, loss_box: 0.0004, tot_losses: 0.0892, lr: 0.0006, speed: 0.391s/iter: 32%|▎| 44
    epoch:0, iter:4402, rpn_loss_cls: 0.0223, rpn_loss_box: 0.0026, loss_cls: 0.0173, loss_box: 0.0130, tot_losses: 0.0551, lr: 0.0006, speed: 0.391s/iter: 32%|▎| 44
    epoch:0, iter:4403, rpn_loss_cls: 0.0571, rpn_loss_box: 0.0198, loss_cls: 0.2124, loss_box: 0.2481, tot_losses: 0.5374, lr: 0.0006, speed: 0.391s/iter: 32%|▎| 44
    epoch:0, iter:4404, rpn_loss_cls: 0.0359, rpn_loss_box: 0.0135, loss_cls: 0.1283, loss_box: 0.1383, tot_losses: 0.3160, lr: 0.0006, speed: 0.391s/iter: 32%|▎| 44
    epoch:0, iter:4405, rpn_loss_cls: 0.0455, rpn_loss_box: 0.0516, loss_cls: 0.1455, loss_box: 0.0754, tot_losses: 0.3181, lr: 0.0006, speed: 0.391s/iter: 32%|▎| 44
    epoch:0, iter:4406, rpn_loss_cls: 0.0611, rpn_loss_box: 0.0380, loss_cls: 0.0184, loss_box: 0.0022, tot_losses: 0.1198, lr: 0.0006, speed: 0.391s/iter: 32%|▎| 44
    epoch:0, iter:4407, rpn_loss_cls: 0.0297, rpn_loss_box: 0.0195, loss_cls: 0.0216, loss_box: 0.0106, tot_losses: 0.0814, lr: 0.0006, speed: 0.391s/iter: 32%|▎| 44
    epoch:0, iter:4408, rpn_loss_cls: 0.0397, rpn_loss_box: 0.0038, loss_cls: 0.0574, loss_box: 0.0496, tot_losses: 0.1505, lr: 0.0006, speed: 0.391s/iter: 32%|▎| 4408/13754 [28:43<57:22, 2.72it/s]

    opened by xtanitfy 4
  • error: constexpr function return is non-constant

    The following error occurs when I bash

    ~/light_head_rcnn/lib/utils/py_faster_rcnn_utils ~/light_head_rcnn/lib
    python3 build_ext --inplace
    running build_ext
    skipping 'bbox.c' Cython extension (up-to-date)
    skipping 'nms.c' Cython extension (up-to-date)
    rm -rf build
    ~/light_head_rcnn/lib/lib_kernel/lib_psroi_pooling ~/light_head_rcnn/lib
    /usr/local/lib/python3.5/dist-packages/tensorflow/include/absl/strings/string_view.h(501): error: constexpr function return is non-constant
    /usr/local/lib/python3.5/dist-packages/tensorflow/include/google/protobuf/arena_impl.h(55): warning: integer conversion resulted in a change of sign
    /usr/local/lib/python3.5/dist-packages/tensorflow/include/google/protobuf/arena_impl.h(309): warning: integer conversion resulted in a change of sign
    /usr/local/lib/python3.5/dist-packages/tensorflow/include/google/protobuf/arena_impl.h(310): warning: integer conversion resulted in a change of sign
    opened by yihui-he 3
  • Can someone release a code to test detection on any image?

    Hi everyone, I have some problems modifying the test code. I am trying to write a program that runs detection on any image.

    In this project, the author only released code for evaluation on COCO. Can someone help me use the trained model to run detection on my own image?

    Thanks a lot.

    opened by ChienLiu 3
  • g++ error

    when i run the bash, the following error occurs: ~/light_head_rcnn/lib ~/light_head_rcnn/lib/lib_kernel/lib_fast_nms ~/light_head_rcnn/lib 5: nvcc: not found g++: error: No such file or directory

    what is the problem

    opened by powermano 3
  • undefined symbol when import

    I have run the in lib directory and everything succeeded. When I run -d 0 -se 26, it reports this error. I have googled many times but found no useful solution for this problem. How can I figure this out?

    opened by hbsz123 2
  • Demo or inference from pretrained model

    Hi, I am still quite new to this image processing, neural network field. I would just like to know if there is a pretrained model available and if so, if you have the code available to simply just run it. We would like to use it as a starting point of our project and then build from there by training our own dataset. Thanks

    opened by yonglizhong 0
  • No op named PSAlignPool in defined operations

    Hi, I converted model.ckpt to .pb according to but when I start the inference function, it raises this error:

    Traceback (most recent call last):
      File "", line 36, in
        model_fn = DetectionModel()
      File "", line 16, in init
        tf.import_graph_def(od_graph_def, name='')
      File "/home/luban/miniconda3/lib/python3.6/site-packages/tensorflow/python/util/", line 316, in new_func
        return func(*args, **kwargs)
      File "/home/luban/miniconda3/lib/python3.6/site-packages/tensorflow/python/framework/", line 541, in import_graph_def
        raise ValueError('No op named %s in defined operations.' % node.op)
    ValueError: No op named PSAlignPool in defined operations.

    opened by YoungDav 2
  • undefined symbol: _ZN10tensorflow7strings6StrCatERKNS0_8AlphaNumE

    tensorflow.python.framework.errors_impl.NotFoundError: /home/zhex/work/light_head_rcnn/lib/lib_kernel/lib_psroi_pooling/ undefined symbol: _ZN10tensorflow7strings6StrCatERKNS0_8AlphaNumE

    opened by scutzhe 2
jemmy li
