Py-faster-rcnn - Faster R-CNN (Python implementation)

Overview

py-faster-rcnn has been deprecated. Please see Detectron, which includes an implementation of Mask R-CNN.

Disclaimer

The official Faster R-CNN code (written in MATLAB) is available here. If your goal is to reproduce the results in our NIPS 2015 paper, please use the official code.

This repository contains a Python reimplementation of the MATLAB code. This Python implementation is built on a fork of Fast R-CNN. There are slight differences between the two implementations. In particular, this Python port

  • is ~10% slower at test-time, because some operations execute on the CPU in Python layers (e.g., 220ms / image vs. 200ms / image for VGG16)
  • gives similar, but not exactly the same, mAP as the MATLAB version
  • is not compatible with models trained using the MATLAB code due to the minor implementation differences
  • includes approximate joint training that is 1.5x faster than alternating optimization (for VGG16) -- see these slides for more information
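
The test-time figure in the list above can be checked with a little arithmetic. This is only an illustrative calculation: the function name `relative_slowdown` is made up for this sketch, and the inputs are the VGG16 per-image times quoted above.

```python
def relative_slowdown(port_ms, reference_ms):
    """Return the fractional slowdown of the port relative to the reference."""
    return (port_ms - reference_ms) / reference_ms


# VGG16 test-time figures quoted above:
# 220 ms/image (Python port) vs. 200 ms/image (MATLAB).
slowdown = relative_slowdown(220.0, 200.0)
print("Python port is about {:.0%} slower per image".format(slowdown))
```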

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

By Shaoqing Ren, Kaiming He, Ross Girshick, Jian Sun (Microsoft Research)

This Python implementation contains contributions from Sean Bell (Cornell) written during an MSR internship.

Please see the official README.md for more details.

Faster R-CNN was initially described in an arXiv tech report and was subsequently published in NIPS 2015.

License

Faster R-CNN is released under the MIT License (refer to the LICENSE file for details).

Citing Faster R-CNN

If you find Faster R-CNN useful in your research, please consider citing:

@inproceedings{renNIPS15fasterrcnn,
    Author = {Shaoqing Ren and Kaiming He and Ross Girshick and Jian Sun},
    Title = {Faster {R-CNN}: Towards Real-Time Object Detection
             with Region Proposal Networks},
    Booktitle = {Advances in Neural Information Processing Systems ({NIPS})},
    Year = {2015}
}

Contents

  1. Requirements: software
  2. Requirements: hardware
  3. Basic installation
  4. Demo
  5. Beyond the demo: training and testing
  6. Usage

Requirements: software

NOTE: If you have trouble compiling with a recent version of CUDA/cuDNN, please consult this issue for a workaround.

  1. Requirements for Caffe and pycaffe (see: Caffe installation instructions)

Note: Caffe must be built with support for Python layers!

# In your Makefile.config, make sure to have this line uncommented
WITH_PYTHON_LAYER := 1
# Unrelatedly, it's also recommended that you use CUDNN
USE_CUDNN := 1

You can download my Makefile.config for reference.

  2. Python packages you might not have: cython, python-opencv, easydict
  3. [Optional] MATLAB is required for official PASCAL VOC evaluation only. The code now includes unofficial Python evaluation code.
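
One quick way to see which of the Python packages listed above are missing is to try importing them. The sketch below is not part of the repository. The import names (`Cython`, `cv2`, `easydict`) are the usual ones for these packages but are an assumption here, and `missing_packages` is a hypothetical helper. Run it with the same interpreter that pycaffe will use.

```python
import importlib

# Package name -> module name to import. The module names are assumptions:
# cython, python-opencv and easydict are usually imported under these names.
REQUIRED_MODULES = {
    "cython": "Cython",
    "python-opencv": "cv2",
    "easydict": "easydict",
}


def missing_packages(modules=REQUIRED_MODULES):
    """Return the sorted package names whose module fails to import."""
    missing = []
    for package, module in modules.items():
        try:
            importlib.import_module(module)
        except ImportError:
            missing.append(package)
    return sorted(missing)


if __name__ == "__main__":
    missing = missing_packages()
    print("Missing packages:", ", ".join(missing) if missing else "none")
```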

Requirements: hardware

  1. For training smaller networks (ZF, VGG_CNN_M_1024) a good GPU (e.g., Titan, K20, K40, ...) with at least 3G of memory suffices
  2. For training Fast R-CNN with VGG16, you'll need a K40 (~11G of memory)
  3. For training the end-to-end version of Faster R-CNN with VGG16, 3G of GPU memory is sufficient (using CUDNN)
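
The memory figures above can be turned into a small lookup to see which setups a given GPU is likely to handle. This is a rough sketch based only on the numbers in the list. The labels and the function `feasible_setups` are made up for illustration, and real memory use depends on image size, cuDNN version, and other settings.

```python
# Approximate GPU memory needs in GB, taken from the list above.
# The labels are invented for this sketch.
MIN_GPU_MEMORY_GB = {
    "small nets (ZF, VGG_CNN_M_1024)": 3,
    "Fast R-CNN with VGG16": 11,
    "end-to-end Faster R-CNN with VGG16 (cuDNN)": 3,
}


def feasible_setups(gpu_memory_gb):
    """Return the sorted labels of setups that fit in the given GPU memory."""
    return sorted(
        name
        for name, needed in MIN_GPU_MEMORY_GB.items()
        if gpu_memory_gb >= needed
    )


for setup in feasible_setups(4):
    print("A 4 GB GPU should be enough for:", setup)
```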

Installation (sufficient for the demo)

  1. Clone the Faster R-CNN repository

    # Make sure to clone with --recursive
    git clone --recursive https://github.com/rbgirshick/py-faster-rcnn.git

    We'll call the directory that you cloned Faster R-CNN into FRCN_ROOT.

    Ignore notes 1 and 2 if you followed step 1 above.

    Note 1: If you didn't clone Faster R-CNN with the --recursive flag, then you'll need to manually clone the caffe-fast-rcnn submodule:

    git submodule update --init --recursive

    Note 2: The caffe-fast-rcnn submodule needs to be on the faster-rcnn branch (or equivalent detached state). This will happen automatically if you followed step 1 instructions.

  2. Build the Cython modules

    cd $FRCN_ROOT/lib
    make
  3. Build Caffe and pycaffe

    cd $FRCN_ROOT/caffe-fast-rcnn
    # Now follow the Caffe installation instructions here:
    #   http://caffe.berkeleyvision.org/installation.html
    
    # If you're experienced with Caffe and have all of the requirements installed
    # and your Makefile.config in place, then simply do:
    make -j8 && make pycaffe
  4. Download pre-computed Faster R-CNN detectors

    cd $FRCN_ROOT
    ./data/scripts/fetch_faster_rcnn_models.sh

    This will populate the $FRCN_ROOT/data folder with faster_rcnn_models. See data/README.md for details. These models were trained on VOC 2007 trainval.
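
After the steps above, a few directories should exist under FRCN_ROOT. The following sketch checks for them. It is not part of the repository, `missing_paths` is a hypothetical helper, and the list of paths covers only the directories named in the steps above.

```python
import os

# Paths relative to FRCN_ROOT that the installation steps mention.
EXPECTED_PATHS = [
    "lib",
    "caffe-fast-rcnn",
    os.path.join("data", "faster_rcnn_models"),
]


def missing_paths(frcn_root, expected=EXPECTED_PATHS):
    """Return the expected relative paths that do not exist under frcn_root."""
    return [
        path
        for path in expected
        if not os.path.exists(os.path.join(frcn_root, path))
    ]


if __name__ == "__main__":
    # Falls back to the current directory if FRCN_ROOT is not set.
    root = os.environ.get("FRCN_ROOT", ".")
    print("Missing under {}: {}".format(root, missing_paths(root) or "nothing"))
```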

Demo

After successfully completing basic installation, you'll be ready to run the demo.

To run the demo

cd $FRCN_ROOT
./tools/demo.py

The demo performs detection using a VGG16 network trained for detection on PASCAL VOC 2007.

Beyond the demo: installation for training and testing models

  1. Download the training, validation, test data and VOCdevkit

    wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
    wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
    wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCdevkit_08-Jun-2007.tar
  2. Extract all of these tars into one directory named VOCdevkit

    tar xvf VOCtrainval_06-Nov-2007.tar
    tar xvf VOCtest_06-Nov-2007.tar
    tar xvf VOCdevkit_08-Jun-2007.tar
  3. It should have this basic structure

    $VOCdevkit/                           # development kit
    $VOCdevkit/VOCcode/                   # VOC utility code
    $VOCdevkit/VOC2007                    # image sets, annotations, etc.
    # ... and several other directories ...
  4. Create symlinks for the PASCAL VOC dataset

    cd $FRCN_ROOT/data
    ln -s $VOCdevkit VOCdevkit2007

    Using symlinks is a good idea because you will likely want to share the same PASCAL dataset installation between multiple projects.

  5. [Optional] follow similar steps to get PASCAL VOC 2010 and 2012

  6. [Optional] If you want to use COCO, please see some notes under data/README.md

  7. Follow the next sections to download pre-trained ImageNet models
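
A broken or mistargeted symlink is an easy mistake in the steps above. The sketch below resolves data/VOCdevkit2007 and looks for the two subdirectories named in the basic structure. `check_voc_symlink` is a hypothetical helper, not something shipped with the repository.

```python
import os


def check_voc_symlink(frcn_root):
    """Return a list of problems with data/VOCdevkit2007 (empty if none)."""
    link = os.path.join(frcn_root, "data", "VOCdevkit2007")
    if not os.path.exists(link):
        return ["{} does not exist or is a dangling symlink".format(link)]

    problems = []
    # Resolve the symlink so the report shows the real dataset location.
    target = os.path.realpath(link)
    for subdir in ("VOCcode", "VOC2007"):
        if not os.path.isdir(os.path.join(target, subdir)):
            problems.append("missing {} under {}".format(subdir, target))
    return problems


if __name__ == "__main__":
    # Falls back to the current directory if FRCN_ROOT is not set.
    for problem in check_voc_symlink(os.environ.get("FRCN_ROOT", ".")):
        print("Problem:", problem)
```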

Download pre-trained ImageNet models

Pre-trained ImageNet models can be downloaded for the networks described in the paper: ZF and VGG16.

cd $FRCN_ROOT
./data/scripts/fetch_imagenet_models.sh

VGG16 comes from the Caffe Model Zoo, but is provided here for your convenience. ZF was trained at MSRA.

Usage

To train and test a Faster R-CNN detector using the alternating optimization algorithm from our NIPS 2015 paper, use experiments/scripts/faster_rcnn_alt_opt.sh. Output is written underneath $FRCN_ROOT/output.

cd $FRCN_ROOT
./experiments/scripts/faster_rcnn_alt_opt.sh [GPU_ID] [NET] [--set ...]
# GPU_ID is the GPU you want to train on
# NET in {ZF, VGG_CNN_M_1024, VGG16} is the network arch to use
# --set ... allows you to specify fast_rcnn.config options, e.g.
#   --set EXP_DIR seed_rng1701 RNG_SEED 1701

("alt opt" refers to the alternating optimization training algorithm described in the NIPS paper.)

To train and test a Faster R-CNN detector using the approximate joint training method, use experiments/scripts/faster_rcnn_end2end.sh. Output is written underneath $FRCN_ROOT/output.

cd $FRCN_ROOT
./experiments/scripts/faster_rcnn_end2end.sh [GPU_ID] [NET] [--set ...]
# GPU_ID is the GPU you want to train on
# NET in {ZF, VGG_CNN_M_1024, VGG16} is the network arch to use
# --set ... allows you to specify fast_rcnn.config options, e.g.
#   --set EXP_DIR seed_rng1701 RNG_SEED 1701

This method trains the RPN module jointly with the Fast R-CNN network, rather than alternating between training the two. It results in faster (~ 1.5x speedup) training times and similar detection accuracy. See these slides for more details.
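
Both training scripts accept `--set KEY VALUE ...` overrides, as in the `EXP_DIR seed_rng1701 RNG_SEED 1701` example. The sketch below shows one way such a flat list could be turned into typed config overrides. It is an illustration only and not the repository's own parser. `parse_set_options` is a hypothetical name, and the use of `ast.literal_eval` to infer types is an assumption.

```python
import ast


def parse_set_options(tokens):
    """Turn a flat ['KEY', 'value', ...] list into a dict of typed overrides."""
    if len(tokens) % 2 != 0:
        raise ValueError("--set expects KEY VALUE pairs")

    overrides = {}
    for key, raw in zip(tokens[0::2], tokens[1::2]):
        try:
            # Interpret numbers, booleans, lists, and similar literals.
            value = ast.literal_eval(raw)
        except (ValueError, SyntaxError):
            # Anything that is not a Python literal stays a plain string.
            value = raw
        overrides[key] = value
    return overrides


print(parse_set_options(["EXP_DIR", "seed_rng1701", "RNG_SEED", "1701"]))
```

With the example values, `EXP_DIR` stays a string while `RNG_SEED` becomes the integer 1701.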

Artifacts generated by the scripts in tools are written to the output directory.

Trained Fast R-CNN networks are saved under:

output/<experiment directory>/<dataset name>/

Test outputs are saved under:

output/<experiment directory>/<dataset name>/<network snapshot name>/

Comments
  • ResNet Implementation for Faster-rcnn

    I have recently been trying to combine a ResNet network with Faster R-CNN. As a first step, I trained a model with a ResNet-34 network, which does not use the bottleneck architecture. Training runs without errors, but the detection results are very bad. I believe something is wrong in my implementation. Here is the prototxt I used for training. Can anybody suggest how I should modify it?

    name: "ResNet34"
    layer {
      name: 'input-data'
      type: 'Python'
      top: 'data'
      top: 'im_info'
      top: 'gt_boxes'
      python_param {
        module: 'roi_data_layer.layer'
        layer: 'RoIDataLayer'
        param_str: "'num_classes': 2"
      }
    }
    
    #conv1 7x7 64 /2
    layer {
      name: "conv1"
      type: "Convolution"
      bottom: "data"
      top: "conv1"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
         lr_mult: 2
         decay_mult: 0
      }
      convolution_param {
        num_output: 64
        kernel_size: 7
        pad: 1
        stride: 2
        weight_filler {
          type: "msra"
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }
    layer {
      name: "conv1_bn"
      type: "BatchNorm"
      bottom: "conv1"
      top: "conv1_bn"
      batch_norm_param {
      }
    }
    layer {
      name: "conv1_relu"
      type: "ReLU"
      bottom: "conv1_bn"
      top: "conv1_bn"
    }
    layer {
      name: "pool1"
      type: "Pooling"
      bottom: "conv1_bn"
      top: "pool1"
      pooling_param {
        pool: MAX
        kernel_size: 3
        stride: 2
      }
    }
    
    #conv2_1 3x3 64
    layer {
      name: "conv2_1_1"
      type: "Convolution"
      bottom: "pool1"
      top: "conv2_1_1"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
         lr_mult: 2
         decay_mult: 0
      }
      convolution_param {
        num_output: 64
        kernel_size: 3
        pad: 1
        stride: 1
        weight_filler {
          type: "msra"
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }
    layer {
      name: "conv2_1_1_bn"
      type: "BatchNorm"
      bottom: "conv2_1_1"
      top: "conv2_1_1_bn"
      batch_norm_param {
      }
    }
    layer {
      name: "conv2_1_1_relu"
      type: "ReLU"
      bottom: "conv2_1_1_bn"
      top: "conv2_1_1_bn"
    }
    layer {
      name: "conv2_1_2"
      type: "Convolution"
      bottom: "conv2_1_1_bn"
      top: "conv2_1_2"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
         lr_mult: 2
         decay_mult: 0
      }
      convolution_param {
        num_output: 64
        kernel_size: 3
        pad: 1
        stride: 1
        weight_filler {
          type: "msra"
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }
    layer {
      name: "conv2_1_2_bn"
      type: "BatchNorm"
      bottom: "conv2_1_2"
      top: "conv2_1_2_bn"
      batch_norm_param {
      }
    }
    layer {
      name: "conv2_1_sum"
      type: "Eltwise"
      bottom: "pool1"
      bottom: "conv2_1_2_bn"
      top: "conv2_1_sum"
      eltwise_param {
        operation: SUM
      }
    }
    layer {
      name: "conv2_1_sum_relu"
      type: "ReLU"
      bottom: "conv2_1_sum"
      top: "conv2_1_sum"
    }
    
    #conv2_2 3x3 64
    layer {
      name: "conv2_2_1"
      type: "Convolution"
      bottom: "conv2_1_sum"
      top: "conv2_2_1"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
         lr_mult: 2
         decay_mult: 0
      }
      convolution_param {
        num_output: 64
        kernel_size: 3
        pad: 1
        stride: 1
        weight_filler {
          type: "msra"
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }
    layer {
      name: "conv2_2_1_bn"
      type: "BatchNorm"
      bottom: "conv2_2_1"
      top: "conv2_2_1_bn"
      batch_norm_param {
      }
    }
    layer {
      name: "conv2_2_1_relu"
      type: "ReLU"
      bottom: "conv2_2_1_bn"
      top: "conv2_2_1_bn"
    }
    layer {
      name: "conv2_2_2"
      type: "Convolution"
      bottom: "conv2_2_1_bn"
      top: "conv2_2_2"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
         lr_mult: 2
         decay_mult: 0
      }
      convolution_param {
        num_output: 64
        kernel_size: 3
        pad: 1
        stride: 1
        weight_filler {
          type: "msra"
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }
    layer {
      name: "conv2_2_2_bn"
      type: "BatchNorm"
      bottom: "conv2_2_2"
      top: "conv2_2_2_bn"
      batch_norm_param {
      }
    }
    layer {
      name: "conv2_2_sum"
      type: "Eltwise"
      bottom: "conv2_1_sum"
      bottom: "conv2_2_2_bn"
      top: "conv2_2_sum"
      eltwise_param {
        operation: SUM
      }
    }
    layer {
      name: "conv2_2_sum_relu"
      type: "ReLU"
      bottom: "conv2_2_sum"
      top: "conv2_2_sum"
    }
    
    #conv2_3 3x3 64
    layer {
      name: "conv2_3_1"
      type: "Convolution"
      bottom: "conv2_2_sum"
      top: "conv2_3_1"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
         lr_mult: 2
         decay_mult: 0
      }
      convolution_param {
        num_output: 64
        kernel_size: 3
        pad: 1
        stride: 1
        weight_filler {
          type: "msra"
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }
    layer {
      name: "conv2_3_1_bn"
      type: "BatchNorm"
      bottom: "conv2_3_1"
      top: "conv2_3_1_bn"
      batch_norm_param {
      }
    }
    layer {
      name: "conv2_3_1_relu"
      type: "ReLU"
      bottom: "conv2_3_1_bn"
      top: "conv2_3_1_bn"
    }
    layer {
      name: "conv2_3_2"
      type: "Convolution"
      bottom: "conv2_3_1_bn"
      top: "conv2_3_2"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
         lr_mult: 2
         decay_mult: 0
      }
      convolution_param {
        num_output: 64
        kernel_size: 3
        pad: 1
        stride: 1
        weight_filler {
          type: "msra"
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }
    layer {
      name: "conv2_3_2_bn"
      type: "BatchNorm"
      bottom: "conv2_3_2"
      top: "conv2_3_2_bn"
      batch_norm_param {
      }
    }
    layer {
      name: "conv2_3_sum"
      type: "Eltwise"
      bottom: "conv2_2_sum"
      bottom: "conv2_3_2_bn"
      top: "conv2_3_sum"
      eltwise_param {
        operation: SUM
      }
    }
    layer {
      name: "conv2_3_sum_relu"
      type: "ReLU"
      bottom: "conv2_3_sum"
      top: "conv2_3_sum"
    }
    layer {
      name: "conv2_proj"
      type: "Convolution"
      bottom: "conv2_3_sum"
      top: "conv2_proj"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
         lr_mult: 2
         decay_mult: 0
      }
      convolution_param {
        num_output: 128
        kernel_size: 1
        pad: 0
        stride: 2
        weight_filler {
          type: "msra"
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }
    layer {
      name: "conv2_proj_bn"
      type: "BatchNorm"
      bottom: "conv2_proj"
      top: "conv2_proj_bn"
      batch_norm_param {
      }
    }
    
    #conv3_1 3x3 128
    layer {
      name: "conv3_1_1"
      type: "Convolution"
      bottom: "conv2_3_sum"
      top: "conv3_1_1"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
         lr_mult: 2
         decay_mult: 0
      }
      convolution_param {
        num_output: 128
        kernel_size: 3
        pad: 1
        stride: 2
        weight_filler {
          type: "msra"
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }
    layer {
      name: "conv3_1_1_bn"
      type: "BatchNorm"
      bottom: "conv3_1_1"
      top: "conv3_1_1_bn"
      batch_norm_param {
      }
    }
    layer {
      name: "conv3_1_1_relu"
      type: "ReLU"
      bottom: "conv3_1_1_bn"
      top: "conv3_1_1_bn"
    }
    layer {
      name: "conv3_1_2"
      type: "Convolution"
      bottom: "conv3_1_1_bn"
      top: "conv3_1_2"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
         lr_mult: 2
         decay_mult: 0
      }
      convolution_param {
        num_output: 128
        kernel_size: 3
        pad: 1
        stride: 1
        weight_filler {
          type: "msra"
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }
    layer {
      name: "conv3_1_2_bn"
      type: "BatchNorm"
      bottom: "conv3_1_2"
      top: "conv3_1_2_bn"
      batch_norm_param {
      }
    }
    layer {
      name: "conv3_1_sum"
      type: "Eltwise"
      bottom: "conv2_proj_bn"
      bottom: "conv3_1_2_bn"
      top: "conv3_1_sum"
      eltwise_param {
        operation: SUM
      }
    }
    layer {
      name: "conv3_1_sum_relu"
      type: "ReLU"
      bottom: "conv3_1_sum"
      top: "conv3_1_sum"
    }
    
    #conv3_2 3x3 128
    layer {
      name: "conv3_2_1"
      type: "Convolution"
      bottom: "conv3_1_sum"
      top: "conv3_2_1"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
         lr_mult: 2
         decay_mult: 0
      }
      convolution_param {
        num_output: 128
        kernel_size: 3
        pad: 1
        stride: 1
        weight_filler {
          type: "msra"
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }
    layer {
      name: "conv3_2_1_bn"
      type: "BatchNorm"
      bottom: "conv3_2_1"
      top: "conv3_2_1_bn"
      batch_norm_param {
      }
    }
    layer {
      name: "conv3_2_1_relu"
      type: "ReLU"
      bottom: "conv3_2_1_bn"
      top: "conv3_2_1_bn"
    }
    layer {
      name: "conv3_2_2"
      type: "Convolution"
      bottom: "conv3_2_1_bn"
      top: "conv3_2_2"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
         lr_mult: 2
         decay_mult: 0
      }
      convolution_param {
        num_output: 128
        kernel_size: 3
        pad: 1
        stride: 1
        weight_filler {
          type: "msra"
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }
    layer {
      name: "conv3_2_2_bn"
      type: "BatchNorm"
      bottom: "conv3_2_2"
      top: "conv3_2_2_bn"
      batch_norm_param {
      }
    }
    layer {
      name: "conv3_2_sum"
      type: "Eltwise"
      bottom: "conv3_1_sum"
      bottom: "conv3_2_2_bn"
      top: "conv3_2_sum"
      eltwise_param {
        operation: SUM
      }
    }
    layer {
      name: "conv3_2_sum_relu"
      type: "ReLU"
      bottom: "conv3_2_sum"
      top: "conv3_2_sum"
    }
    
    #conv3_3 3x3 128
    layer {
      name: "conv3_3_1"
      type: "Convolution"
      bottom: "conv3_2_sum"
      top: "conv3_3_1"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
         lr_mult: 2
         decay_mult: 0
      }
      convolution_param {
        num_output: 128
        kernel_size: 3
        pad: 1
        stride: 1
        weight_filler {
          type: "msra"
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }
    layer {
      name: "conv3_3_1_bn"
      type: "BatchNorm"
      bottom: "conv3_3_1"
      top: "conv3_3_1_bn"
      batch_norm_param {
      }
    }
    layer {
      name: "conv3_3_1_relu"
      type: "ReLU"
      bottom: "conv3_3_1_bn"
      top: "conv3_3_1_bn"
    }
    layer {
      name: "conv3_3_2"
      type: "Convolution"
      bottom: "conv3_3_1_bn"
      top: "conv3_3_2"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
         lr_mult: 2
         decay_mult: 0
      }
      convolution_param {
        num_output: 128
        kernel_size: 3
        pad: 1
        stride: 1
        weight_filler {
          type: "msra"
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }
    layer {
      name: "conv3_3_2_bn"
      type: "BatchNorm"
      bottom: "conv3_3_2"
      top: "conv3_3_2_bn"
      batch_norm_param {
      }
    }
    layer {
      name: "conv3_3_sum"
      type: "Eltwise"
      bottom: "conv3_2_sum"
      bottom: "conv3_3_2_bn"
      top: "conv3_3_sum"
      eltwise_param {
        operation: SUM
      }
    }
    layer {
      name: "conv3_3_sum_relu"
      type: "ReLU"
      bottom: "conv3_3_sum"
      top: "conv3_3_sum"
    }
    
    #conv3_4 3x3 128
    layer {
      name: "conv3_4_1"
      type: "Convolution"
      bottom: "conv3_3_sum"
      top: "conv3_4_1"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
         lr_mult: 2
         decay_mult: 0
      }
      convolution_param {
        num_output: 128
        kernel_size: 3
        pad: 1
        stride: 1
        weight_filler {
          type: "msra"
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }
    layer {
      name: "conv3_4_1_bn"
      type: "BatchNorm"
      bottom: "conv3_4_1"
      top: "conv3_4_1_bn"
      batch_norm_param {
      }
    }
    layer {
      name: "conv3_4_1_relu"
      type: "ReLU"
      bottom: "conv3_4_1_bn"
      top: "conv3_4_1_bn"
    }
    layer {
      name: "conv3_4_2"
      type: "Convolution"
      bottom: "conv3_4_1_bn"
      top: "conv3_4_2"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
         lr_mult: 2
         decay_mult: 0
      }
      convolution_param {
        num_output: 128
        kernel_size: 3
        pad: 1
        stride: 1
        weight_filler {
          type: "msra"
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }
    layer {
      name: "conv3_4_2_bn"
      type: "BatchNorm"
      bottom: "conv3_4_2"
      top: "conv3_4_2_bn"
      batch_norm_param {
      }
    }
    layer {
      name: "conv3_4_sum"
      type: "Eltwise"
      bottom: "conv3_3_sum"
      bottom: "conv3_4_2_bn"
      top: "conv3_4_sum"
      eltwise_param {
        operation: SUM
      }
    }
    layer {
      name: "conv3_4_sum_relu"
      type: "ReLU"
      bottom: "conv3_4_sum"
      top: "conv3_4_sum"
    }
    layer {
      name: "conv3_proj"
      type: "Convolution"
      bottom: "conv3_4_sum"
      top: "conv3_proj"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
         lr_mult: 2
         decay_mult: 0
      }
      convolution_param {
        num_output: 256
        kernel_size: 1
        pad: 0
        stride: 2
        weight_filler {
          type: "msra"
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }
    layer {
      name: "conv3_proj_bn"
      type: "BatchNorm"
      bottom: "conv3_proj"
      top: "conv3_proj_bn"
      batch_norm_param {
      }
    }
    
    #conv4_1 3x3 256
    layer {
      name: "conv4_1_1"
      type: "Convolution"
      bottom: "conv3_4_sum"
      top: "conv4_1_1"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
         lr_mult: 2
         decay_mult: 0
      }
      convolution_param {
        num_output: 256
        kernel_size: 3
        pad: 1
        stride: 2
        weight_filler {
          type: "msra"
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }
    layer {
      name: "conv4_1_1_bn"
      type: "BatchNorm"
      bottom: "conv4_1_1"
      top: "conv4_1_1_bn"
      batch_norm_param {
      }
    }
    layer {
      name: "conv4_1_1_relu"
      type: "ReLU"
      bottom: "conv4_1_1_bn"
      top: "conv4_1_1_bn"
    }
    layer {
      name: "conv4_1_2"
      type: "Convolution"
      bottom: "conv4_1_1_bn"
      top: "conv4_1_2"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
         lr_mult: 2
         decay_mult: 0
      }
      convolution_param {
        num_output: 256
        kernel_size: 3
        pad: 1
        stride: 1
        weight_filler {
          type: "msra"
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }
    layer {
      name: "conv4_1_2_bn"
      type: "BatchNorm"
      bottom: "conv4_1_2"
      top: "conv4_1_2_bn"
      batch_norm_param {
      }
    }
    layer {
      name: "conv4_1_sum"
      type: "Eltwise"
      bottom: "conv3_proj_bn"
      bottom: "conv4_1_2_bn"
      top: "conv4_1_sum"
      eltwise_param {
        operation: SUM
      }
    }
    layer {
      name: "conv4_1_sum_relu"
      type: "ReLU"
      bottom: "conv4_1_sum"
      top: "conv4_1_sum"
    }
    
    #conv4_2 3x3 256
    layer {
      name: "conv4_2_1"
      type: "Convolution"
      bottom: "conv4_1_sum"
      top: "conv4_2_1"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
         lr_mult: 2
         decay_mult: 0
      }
      convolution_param {
        num_output: 256
        kernel_size: 3
        pad: 1
        stride: 1
        weight_filler {
          type: "msra"
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }
    layer {
      name: "conv4_2_1_bn"
      type: "BatchNorm"
      bottom: "conv4_2_1"
      top: "conv4_2_1_bn"
      batch_norm_param {
      }
    }
    layer {
      name: "conv4_2_1_relu"
      type: "ReLU"
      bottom: "conv4_2_1_bn"
      top: "conv4_2_1_bn"
    }
    layer {
      name: "conv4_2_2"
      type: "Convolution"
      bottom: "conv4_2_1_bn"
      top: "conv4_2_2"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
         lr_mult: 2
         decay_mult: 0
      }
      convolution_param {
        num_output: 256
        kernel_size: 3
        pad: 1
        stride: 1
        weight_filler {
          type: "msra"
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }
    layer {
      name: "conv4_2_2_bn"
      type: "BatchNorm"
      bottom: "conv4_2_2"
      top: "conv4_2_2_bn"
      batch_norm_param {
      }
    }
    layer {
      name: "conv4_2_sum"
      type: "Eltwise"
      bottom: "conv4_1_sum"
      bottom: "conv4_2_2_bn"
      top: "conv4_2_sum"
      eltwise_param {
        operation: SUM
      }
    }
    layer {
      name: "conv4_2_sum_relu"
      type: "ReLU"
      bottom: "conv4_2_sum"
      top: "conv4_2_sum"
    }
    
    #conv4_3 3x3 256
    layer {
      name: "conv4_3_1"
      type: "Convolution"
      bottom: "conv4_2_sum"
      top: "conv4_3_1"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
         lr_mult: 2
         decay_mult: 0
      }
      convolution_param {
        num_output: 256
        kernel_size: 3
        pad: 1
        stride: 1
        weight_filler {
          type: "msra"
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }
    layer {
      name: "conv4_3_1_bn"
      type: "BatchNorm"
      bottom: "conv4_3_1"
      top: "conv4_3_1_bn"
      batch_norm_param {
      }
    }
    layer {
      name: "conv4_3_1_relu"
      type: "ReLU"
      bottom: "conv4_3_1_bn"
      top: "conv4_3_1_bn"
    }
    layer {
      name: "conv4_3_2"
      type: "Convolution"
      bottom: "conv4_3_1_bn"
      top: "conv4_3_2"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
         lr_mult: 2
         decay_mult: 0
      }
      convolution_param {
        num_output: 256
        kernel_size: 3
        pad: 1
        stride: 1
        weight_filler {
          type: "msra"
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }
    layer {
      name: "conv4_3_2_bn"
      type: "BatchNorm"
      bottom: "conv4_3_2"
      top: "conv4_3_2_bn"
      batch_norm_param {
      }
    }
    layer {
      name: "conv4_3_sum"
      type: "Eltwise"
      bottom: "conv4_2_sum"
      bottom: "conv4_3_2_bn"
      top: "conv4_3_sum"
      eltwise_param {
        operation: SUM
      }
    }
    layer {
      name: "conv4_3_sum_relu"
      type: "ReLU"
      bottom: "conv4_3_sum"
      top: "conv4_3_sum"
    }
    
    #conv4_4 3x3 256
    layer {
      name: "conv4_4_1"
      type: "Convolution"
      bottom: "conv4_3_sum"
      top: "conv4_4_1"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
         lr_mult: 2
         decay_mult: 0
      }
      convolution_param {
        num_output: 256
        kernel_size: 3
        pad: 1
        stride: 1
        weight_filler {
          type: "msra"
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }
    layer {
      name: "conv4_4_1_bn"
      type: "BatchNorm"
      bottom: "conv4_4_1"
      top: "conv4_4_1_bn"
      batch_norm_param {
      }
    }
    layer {
      name: "conv4_4_1_relu"
      type: "ReLU"
      bottom: "conv4_4_1_bn"
      top: "conv4_4_1_bn"
    }
    layer {
      name: "conv4_4_2"
      type: "Convolution"
      bottom: "conv4_4_1_bn"
      top: "conv4_4_2"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
         lr_mult: 2
         decay_mult: 0
      }
      convolution_param {
        num_output: 256
        kernel_size: 3
        pad: 1
        stride: 1
        weight_filler {
          type: "msra"
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }
    layer {
      name: "conv4_4_2_bn"
      type: "BatchNorm"
      bottom: "conv4_4_2"
      top: "conv4_4_2_bn"
      batch_norm_param {
      }
    }
    layer {
      name: "conv4_4_sum"
      type: "Eltwise"
      bottom: "conv4_3_sum"
      bottom: "conv4_4_2_bn"
      top: "conv4_4_sum"
      eltwise_param {
        operation: SUM
      }
    }
    layer {
      name: "conv4_4_sum_relu"
      type: "ReLU"
      bottom: "conv4_4_sum"
      top: "conv4_4_sum"
    }
    
    #conv4_5 3x3 256
    layer {
      name: "conv4_5_1"
      type: "Convolution"
      bottom: "conv4_4_sum"
      top: "conv4_5_1"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
         lr_mult: 2
         decay_mult: 0
      }
      convolution_param {
        num_output: 256
        kernel_size: 3
        pad: 1
        stride: 1
        weight_filler {
          type: "msra"
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }
    layer {
      name: "conv4_5_1_bn"
      type: "BatchNorm"
      bottom: "conv4_5_1"
      top: "conv4_5_1_bn"
      batch_norm_param {
      }
    }
    layer {
      name: "conv4_5_1_relu"
      type: "ReLU"
      bottom: "conv4_5_1_bn"
      top: "conv4_5_1_bn"
    }
    layer {
      name: "conv4_5_2"
      type: "Convolution"
      bottom: "conv4_5_1_bn"
      top: "conv4_5_2"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
         lr_mult: 2
         decay_mult: 0
      }
      convolution_param {
        num_output: 256
        kernel_size: 3
        pad: 1
        stride: 1
        weight_filler {
          type: "msra"
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }
    layer {
      name: "conv4_5_2_bn"
      type: "BatchNorm"
      bottom: "conv4_5_2"
      top: "conv4_5_2_bn"
      batch_norm_param {
      }
    }
    layer {
      name: "conv4_5_sum"
      type: "Eltwise"
      bottom: "conv4_4_sum"
      bottom: "conv4_5_2_bn"
      top: "conv4_5_sum"
      eltwise_param {
        operation: SUM
      }
    }
    layer {
      name: "conv4_5_sum_relu"
      type: "ReLU"
      bottom: "conv4_5_sum"
      top: "conv4_5_sum"
    }
    
    #conv4_6 3x3 256
    layer {
      name: "conv4_6_1"
      type: "Convolution"
      bottom: "conv4_5_sum"
      top: "conv4_6_1"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
         lr_mult: 2
         decay_mult: 0
      }
      convolution_param {
        num_output: 256
        kernel_size: 3
        pad: 1
        stride: 1
        weight_filler {
          type: "msra"
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }
    layer {
      name: "conv4_6_1_bn"
      type: "BatchNorm"
      bottom: "conv4_6_1"
      top: "conv4_6_1_bn"
      batch_norm_param {
      }
    }
    layer {
      name: "conv4_6_1_relu"
      type: "ReLU"
      bottom: "conv4_6_1_bn"
      top: "conv4_6_1_bn"
    }
    layer {
      name: "conv4_6_2"
      type: "Convolution"
      bottom: "conv4_6_1_bn"
      top: "conv4_6_2"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
         lr_mult: 2
         decay_mult: 0
      }
      convolution_param {
        num_output: 256
        kernel_size: 3
        pad: 1
        stride: 1
        weight_filler {
          type: "msra"
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }
    layer {
      name: "conv4_6_2_bn"
      type: "BatchNorm"
      bottom: "conv4_6_2"
      top: "conv4_6_2_bn"
      batch_norm_param {
      }
    }
    layer {
      name: "conv4_6_sum"
      type: "Eltwise"
      bottom: "conv4_5_sum"
      bottom: "conv4_6_2_bn"
      top: "conv4_6_sum"
      eltwise_param {
        operation: SUM
      }
    }
    layer {
      name: "conv4_6_sum_relu"
      type: "ReLU"
      bottom: "conv4_6_sum"
      top: "conv4_6_sum"
    }
    layer {
      name: "conv4_proj"
      type: "Convolution"
      bottom: "conv4_6_sum"
      top: "conv4_proj"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
         lr_mult: 2
         decay_mult: 0
      }
      convolution_param {
        num_output: 512
        kernel_size: 1
        pad: 0
        stride: 2
        weight_filler {
          type: "msra"
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }
    layer {
      name: "conv4_proj_bn"
      type: "BatchNorm"
      bottom: "conv4_proj"
      top: "conv4_proj_bn"
      batch_norm_param {
      }
    }
    
    #conv5_1 3x3 512
    layer {
      name: "conv5_1_1"
      type: "Convolution"
      bottom: "conv4_6_sum"
      top: "conv5_1_1"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
         lr_mult: 2
         decay_mult: 0
      }
      convolution_param {
        num_output: 512
        kernel_size: 3
        pad: 1
        stride: 2
        weight_filler {
          type: "msra"
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }
    layer {
      name: "conv5_1_1_bn"
      type: "BatchNorm"
      bottom: "conv5_1_1"
      top: "conv5_1_1_bn"
      batch_norm_param {
      }
    }
    layer {
      name: "conv5_1_1_relu"
      type: "ReLU"
      bottom: "conv5_1_1_bn"
      top: "conv5_1_1_bn"
    }
    layer {
      name: "conv5_1_2"
      type: "Convolution"
      bottom: "conv5_1_1_bn"
      top: "conv5_1_2"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
         lr_mult: 2
         decay_mult: 0
      }
      convolution_param {
        num_output: 512
        kernel_size: 3
        pad: 1
        stride: 1
        weight_filler {
          type: "msra"
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }
    layer {
      name: "conv5_1_2_bn"
      type: "BatchNorm"
      bottom: "conv5_1_2"
      top: "conv5_1_2_bn"
      batch_norm_param {
      }
    }
    layer {
      name: "conv5_1_sum"
      type: "Eltwise"
      bottom: "conv4_proj_bn"
      bottom: "conv5_1_2_bn"
      top: "conv5_1_sum"
      eltwise_param {
        operation: SUM
      }
    }
    layer {
      name: "conv5_1_sum_relu"
      type: "ReLU"
      bottom: "conv5_1_sum"
      top: "conv5_1_sum"
    }
    
    #conv5_2 3x3 512
    layer {
      name: "conv5_2_1"
      type: "Convolution"
      bottom: "conv5_1_sum"
      top: "conv5_2_1"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
         lr_mult: 2
         decay_mult: 0
      }
      convolution_param {
        num_output: 512
        kernel_size: 3
        pad: 1
        stride: 1
        weight_filler {
          type: "msra"
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }
    layer {
      name: "conv5_2_1_bn"
      type: "BatchNorm"
      bottom: "conv5_2_1"
      top: "conv5_2_1_bn"
      batch_norm_param {
      }
    }
    layer {
      name: "conv5_2_1_relu"
      type: "ReLU"
      bottom: "conv5_2_1_bn"
      top: "conv5_2_1_bn"
    }
    layer {
      name: "conv5_2_2"
      type: "Convolution"
      bottom: "conv5_2_1_bn"
      top: "conv5_2_2"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
         lr_mult: 2
         decay_mult: 0
      }
      convolution_param {
        num_output: 512
        kernel_size: 3
        pad: 1
        stride: 1
        weight_filler {
          type: "msra"
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }
    layer {
      name: "conv5_2_2_bn"
      type: "BatchNorm"
      bottom: "conv5_2_2"
      top: "conv5_2_2_bn"
      batch_norm_param {
      }
    }
    layer {
      name: "conv5_2_sum"
      type: "Eltwise"
      bottom: "conv5_1_sum"
      bottom: "conv5_2_2_bn"
      top: "conv5_2_sum"
      eltwise_param {
        operation: SUM
      }
    }
    layer {
      name: "conv5_2_sum_relu"
      type: "ReLU"
      bottom: "conv5_2_sum"
      top: "conv5_2_sum"
    }
    
    #conv5_3 3x3 512
    layer {
      name: "conv5_3_1"
      type: "Convolution"
      bottom: "conv5_2_sum"
      top: "conv5_3_1"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
         lr_mult: 2
         decay_mult: 0
      }
      convolution_param {
        num_output: 512
        kernel_size: 3
        pad: 1
        stride: 1
        weight_filler {
          type: "msra"
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }
    layer {
      name: "conv5_3_1_bn"
      type: "BatchNorm"
      bottom: "conv5_3_1"
      top: "conv5_3_1_bn"
      batch_norm_param {
      }
    }
    layer {
      name: "conv5_3_1_relu"
      type: "ReLU"
      bottom: "conv5_3_1_bn"
      top: "conv5_3_1_bn"
    }
    layer {
      name: "conv5_3_2"
      type: "Convolution"
      bottom: "conv5_3_1_bn"
      top: "conv5_3_2"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
         lr_mult: 2
         decay_mult: 0
      }
      convolution_param {
        num_output: 512
        kernel_size: 3
        pad: 1
        stride: 1
        weight_filler {
          type: "msra"
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }
    layer {
      name: "conv5_3_2_bn"
      type: "BatchNorm"
      bottom: "conv5_3_2"
      top: "conv5_3_2_bn"
      batch_norm_param {
      }
    }
    layer {
      name: "conv5_3_sum"
      type: "Eltwise"
      bottom: "conv5_2_sum"
      bottom: "conv5_3_2_bn"
      top: "conv5_3_sum"
      eltwise_param {
        operation: SUM
      }
    }
    layer {
      name: "conv5_3_sum_relu"
      type: "ReLU"
      bottom: "conv5_3_sum"
      top: "conv5_3_sum"
    }
    
    #========= RPN ============
    
    layer {
      name: "rpn_conv/3x3"
      type: "Convolution"
      bottom: "conv5_3_sum"
      top: "rpn/output"
      param { lr_mult: 1.0 }
      param { lr_mult: 2.0 }
      convolution_param {
        num_output: 512
        kernel_size: 3 pad: 1 stride: 1
        weight_filler { type: "gaussian" std: 0.01 }
        bias_filler { type: "constant" value: 0 }
      }
    }
    layer {
      name: "rpn_relu/3x3"
      type: "ReLU"
      bottom: "rpn/output"
      top: "rpn/output"
    }
    
    layer {
      name: "rpn_cls_score"
      type: "Convolution"
      bottom: "rpn/output"
      top: "rpn_cls_score"
      param { lr_mult: 1.0 }
      param { lr_mult: 2.0 }
      convolution_param {
        num_output: 18   # 2(bg/fg) * 9(anchors)
        kernel_size: 1 pad: 0 stride: 1
        weight_filler { type: "gaussian" std: 0.01 }
        bias_filler { type: "constant" value: 0 }
      }
    }
    
    layer {
      name: "rpn_bbox_pred"
      type: "Convolution"
      bottom: "rpn/output"
      top: "rpn_bbox_pred"
      param { lr_mult: 1.0 }
      param { lr_mult: 2.0 }
      convolution_param {
        num_output: 36   # 4 * 9(anchors)
        kernel_size: 1 pad: 0 stride: 1
        weight_filler { type: "gaussian" std: 0.01 }
        bias_filler { type: "constant" value: 0 }
      }
    }
    
    layer {
       bottom: "rpn_cls_score"
       top: "rpn_cls_score_reshape"
       name: "rpn_cls_score_reshape"
       type: "Reshape"
       reshape_param { shape { dim: 0 dim: 2 dim: -1 dim: 0 } }
    }
    
    layer {
      name: 'rpn-data'
      type: 'Python'
      bottom: 'rpn_cls_score'
      bottom: 'gt_boxes'
      bottom: 'im_info'
      bottom: 'data'
      top: 'rpn_labels'
      top: 'rpn_bbox_targets'
      top: 'rpn_bbox_inside_weights'
      top: 'rpn_bbox_outside_weights'
      python_param {
        module: 'rpn.anchor_target_layer'
        layer: 'AnchorTargetLayer'
        param_str: "'feat_stride': 16"
      }
    }
    
    layer {
      name: "rpn_loss_cls"
      type: "SoftmaxWithLoss"
      bottom: "rpn_cls_score_reshape"
      bottom: "rpn_labels"
      propagate_down: 1
      propagate_down: 0
      top: "rpn_cls_loss"
      loss_weight: 1
      loss_param {
        ignore_label: -1
        normalize: true
      }
    }
    
    layer {
      name: "rpn_loss_bbox"
      type: "SmoothL1Loss"
      bottom: "rpn_bbox_pred"
      bottom: "rpn_bbox_targets"
      bottom: 'rpn_bbox_inside_weights'
      bottom: 'rpn_bbox_outside_weights'
      top: "rpn_loss_bbox"
      loss_weight: 1
      smooth_l1_loss_param { sigma: 3.0 }
    }
    
    #========= RoI Proposal ============
    
    layer {
      name: "rpn_cls_prob"
      type: "Softmax"
      bottom: "rpn_cls_score_reshape"
      top: "rpn_cls_prob"
    }
    
    layer {
      name: 'rpn_cls_prob_reshape'
      type: 'Reshape'
      bottom: 'rpn_cls_prob'
      top: 'rpn_cls_prob_reshape'
      reshape_param { shape { dim: 0 dim: 18 dim: -1 dim: 0 } }
    }
    
    layer {
      name: 'proposal'
      type: 'Python'
      bottom: 'rpn_cls_prob_reshape'
      bottom: 'rpn_bbox_pred'
      bottom: 'im_info'
      top: 'rpn_rois'
    #  top: 'rpn_scores'
      python_param {
        module: 'rpn.proposal_layer'
        layer: 'ProposalLayer'
        param_str: "'feat_stride': 16"
      }
    }
    
    #layer {
    #  name: 'debug-data'
    #  type: 'Python'
    #  bottom: 'data'
    #  bottom: 'rpn_rois'
    #  bottom: 'rpn_scores'
    #  python_param {
    #    module: 'rpn.debug_layer'
    #    layer: 'RPNDebugLayer'
    #  }
    #}
    
    layer {
      name: 'roi-data'
      type: 'Python'
      bottom: 'rpn_rois'
      bottom: 'gt_boxes'
      top: 'rois'
      top: 'labels'
      top: 'bbox_targets'
      top: 'bbox_inside_weights'
      top: 'bbox_outside_weights'
      python_param {
        module: 'rpn.proposal_target_layer'
        layer: 'ProposalTargetLayer'
        param_str: "'num_classes': 2"
      }
    }
    
    #========= RCNN ============
    
    layer {
      name: "roi_pool5"
      type: "ROIPooling"
      bottom: "conv5_3_sum"
      bottom: "rois"
      top: "pool5"
      roi_pooling_param {
        pooled_w: 7
        pooled_h: 7
        spatial_scale: 0.0625 # 1/16
      }
    }
    layer {
      name: "cls_score"
      type: "InnerProduct"
      bottom: "pool5"
      top: "cls_score"
      param {
        lr_mult: 1
      }
      param {
        lr_mult: 2
      }
      inner_product_param {
        num_output: 2
        weight_filler {
          type: "gaussian"
          std: 0.01
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }
    layer {
      name: "bbox_pred"
      type: "InnerProduct"
      bottom: "pool5"
      top: "bbox_pred"
      param {
        lr_mult: 1
      }
      param {
        lr_mult: 2
      }
      inner_product_param {
        num_output: 8
        weight_filler {
          type: "gaussian"
          std: 0.001
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }
    layer {
      name: "loss_cls"
      type: "SoftmaxWithLoss"
      bottom: "cls_score"
      bottom: "labels"
      propagate_down: 1
      propagate_down: 0
      top: "loss_cls"
      loss_weight: 1
    }
    layer {
      name: "loss_bbox"
      type: "SmoothL1Loss"
      bottom: "bbox_pred"
      bottom: "bbox_targets"
      bottom: "bbox_inside_weights"
      bottom: "bbox_outside_weights"
      top: "loss_bbox"
      loss_weight: 1
    }
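The two Reshape layers around the RPN softmax in the prototxt above can be hard to follow from the layer definitions alone. The following is a minimal NumPy sketch, assuming Caffe's Reshape convention (dim: 0 copies the input dimension, dim: -1 is inferred) and a hypothetical 1 x 18 x 4 x 5 score map. The helpers `caffe_reshape` and `softmax` are illustrative names, not part of py-faster-rcnn.

```python
import numpy as np

def caffe_reshape(blob, dims):
    """Mimic Caffe's Reshape: 0 copies the input dim, -1 is inferred."""
    shape = [blob.shape[i] if d == 0 else d for i, d in enumerate(dims)]
    return blob.reshape(shape)

def softmax(x, axis):
    """Numerically stable softmax along one axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

num_anchors = 9
# Hypothetical stand-in for the rpn_cls_score blob (N, 2 * A, H, W)
scores = np.random.randn(1, 2 * num_anchors, 4, 5)

# rpn_cls_score_reshape: fold the anchors into the height axis so that
# axis 1 holds only the two classes (bg/fg)
reshaped = caffe_reshape(scores, (0, 2, -1, 0))
# rpn_cls_prob: softmax over the bg/fg axis
probs = softmax(reshaped, axis=1)
# rpn_cls_prob_reshape: restore the (N, 2 * A, H, W) layout
probs_back = caffe_reshape(probs, (0, 2 * num_anchors, -1, 0))

print(reshaped.shape)    # (1, 2, 36, 5)
print(probs_back.shape)  # (1, 18, 4, 5)
```

The hard-coded 18 in rpn_cls_prob_reshape is therefore tied to the anchor count: it has to equal 2 * A, just like num_output of rpn_cls_score.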
    
    
    opened by twtygqyy 108
  • Training of py-faster-rcnn on ImageNet

    I revised the code in lib/datasets to train on the ImageNet detection data: I created ilsvrc.py (modeled on pascal_voc.py) to handle the ImageNet data and modified the corresponding code in factory.py. I then ran the experiment script ./experiments/scripts/faster_rcnn_alt_opt.sh. Everything seemed correct at first, but training then got stuck after printing the following output and never moved forward:

    Loading pretrained model weights from data/imagenet_models/ZF.v2.caffemodel
    Solving...
    I1101 15:18:12.302584 40312 solver.cpp:242] Iteration 0, loss = 1.10525
    I1101 15:18:12.302647 40312 solver.cpp:258]     Train net output #0: rpn_cls_loss = 0.785905 (* 1 = 0.785905 loss)
    I1101 15:18:12.302659 40312 solver.cpp:258]     Train net output #1: rpn_loss_bbox = 0.319344 (* 1 = 0.319344 loss)
    I1101 15:18:12.302670 40312 solver.cpp:571] Iteration 0, lr = 0.001
    

    Could anyone give me some suggestions? Thanks!

    opened by ghost 66
  • Training FasterRCNN without pre-trained network?

    Hi all, I got the error "BB = BB[sorted_ind, :] IndexError: too many indices for array". It seems that the trained network has learned nothing.

    I followed the original steps in https://github.com/rbgirshick/py-faster-rcnn and only modified the script ./experiments/scripts/faster_rcnn_end2end.sh to remove the line " --weights data/imagenet_models/${NET}.v2.caffemodel ". Training finishes and the caffemodel file is produced.

    Has anyone else faced this error? Could you please suggest a solution? Thank you,
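One way this IndexError can arise is when the detection results are empty: an array built from zero detections is 1-D, so indexing it with two indices fails. The sketch below reproduces that situation and shows a guard. It assumes detections are stored as text lines of the form `image_id score x1 y1 x2 y2`; the function name `parse_detections` is hypothetical. It explains the error message only, not why a network trained from scratch produces no detections.

```python
import numpy as np

def parse_detections(lines):
    """Parse 'image_id score x1 y1 x2 y2' lines into ids, scores and boxes."""
    split = [line.strip().split(" ") for line in lines]
    image_ids = [x[0] for x in split]
    confidence = np.array([float(x[1]) for x in split])
    # reshape(-1, 4) keeps the array 2-D even when there are no detections
    boxes = np.array([[float(z) for z in x[2:]] for x in split]).reshape(-1, 4)
    return image_ids, confidence, boxes

# A model that detects nothing yields an empty result list
ids, conf, BB = parse_detections([])
sorted_ind = np.argsort(-conf)
BB = BB[sorted_ind, :]   # works because BB has shape (0, 4), not (0,)
print(BB.shape)
```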

    opened by tiepnh 56
  • Check failed: error == cudaSuccess (8 vs. 0)  invalid device function

    I have no problem running demo.py of fast-rcnn. However, I get the following error when I try to run demo.py of py-faster-rcnn after successfully running make -j8 & make pycaffe:

    Loaded network /home/ubuntu/py-faster-rcnn/data/faster_rcnn_models/ZF_faster_rcnn_final.caffemodel
    F1008 04:30:16.139123 5360 roi_pooling_layer.cu:91] Check failed: error == cudaSuccess (8 vs. 0) invalid device function
    *** Check failure stack trace: ***

    Does anyone have the same problem?

    opened by twtygqyy 38
  • bbox_transform.py:48: RuntimeWarning: overflow encountered in exp ...

    In the forward() function of proposal_layer.py, when I print bbox_deltas.min() and bbox_deltas.max(), the values suddenly become very large at some point and cause an overflow and a core dump. Here is the log:

    -0.843478 0.695785
    -1.53431 1.09048
    -2.39332 1.81395
    -2.74009 1.98957
    -0.368922 0.236118
    -0.707322 0.23115
    I0115 01:02:23.016412 31390 solver.cpp:242] Iteration 40, loss = 1.8799
    I0115 01:02:23.016444 31390 solver.cpp:258]     Train net output #0: loss_bbox = 0.040359 (* 1 = 0.040359 loss)
    I0115 01:02:23.016451 31390 solver.cpp:258]     Train net output #1: loss_cls = 0.240918 (* 1 = 0.240918 loss)
    I0115 01:02:23.016456 31390 solver.cpp:258]     Train net output #2: rpn_cls_loss = 0.585994 (* 1 = 0.585994 loss)
    I0115 01:02:23.016461 31390 solver.cpp:258]     Train net output #3: rpn_loss_bbox = 0.883768 (* 1 = 0.883768 loss)
    I0115 01:02:23.016472 31390 solver.cpp:571] Iteration 40, lr = 0.001
    -5.10243 4.21369
    -5.00325 3.99131
    -6.54417 1.98163
    -8.08618 2.3706
    -8.88115 1.3943
    -3.91337 0.64184
    -1.92415 2.32623
    -1.19933 1.09659
    -4.12942 3.17897
    -4.96536 3.46139
    -2.6374 1.79074
    -2.49792 1.71725
    -17.6217 14.487
    -22.151 18.5744
    -21.8844 17.797
    -16.2733 13.0881
    -1.50187 1.95901
    -1.43967 1.50989
    -14.1043 28.3903
    -9.91849 19.1581
    -31.6936 8.46099
    -27.262 7.1318
    -28.2349 125.76
    /home/zerry/Work/Libs/py-faster-rcnn/tools/../lib/fast_rcnn/bbox_transform.py:48: RuntimeWarning: overflow encountered in exp
      pred_w = np.exp(dw) * widths[:, np.newaxis]
    /home/zerry/Work/Libs/py-faster-rcnn/tools/../lib/fast_rcnn/bbox_transform.py:48: RuntimeWarning: overflow encountered in multiply
      pred_w = np.exp(dw) * widths[:, np.newaxis]
    -26.7032 118.203
    -741.881 505.883
    /home/zerry/Work/Libs/py-faster-rcnn/tools/../lib/fast_rcnn/bbox_transform.py:49: RuntimeWarning: overflow encountered in exp
      pred_h = np.exp(dh) * heights[:, np.newaxis]
    /home/zerry/Work/Libs/py-faster-rcnn/tools/../lib/fast_rcnn/bbox_transform.py:49: RuntimeWarning: overflow encountered in multiply
      pred_h = np.exp(dh) * heights[:, np.newaxis]
    -692.391 472.156
    -9.47346e+25 1.02213e+26
    -6.35599e+25 6.85811e+25
    nan nan
    /home/zerry/Work/Libs/py-faster-rcnn/tools/../lib/rpn/proposal_layer.py:176: RuntimeWarning: invalid value encountered in greater_equal
      keep = np.where((ws >= min_size) & (hs >= min_size))[0]
    ./experiments/scripts/faster_rcnn_end2end_handdet.sh: line 39: 31390 Floating point exception(core dumped
    

    Can anyone help figure out what the problem could be?
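The overflow warnings in the log come from np.exp(dw) and np.exp(dh) once the predicted deltas grow very large. A mitigation used in some later detection codebases is to clamp dw and dh before exponentiating. The sketch below is an assumption-laden illustration, not the repository's bbox_transform.py: the function name `bbox_transform_inv_clipped` and the clip value `MAX_DELTA` are hypothetical choices. Clamping only prevents the numeric overflow; it does not explain why the deltas diverge during training.

```python
import numpy as np

# Assumption: clip value chosen for illustration; not part of the original code
MAX_DELTA = np.log(1000.0 / 16.0)

def bbox_transform_inv_clipped(boxes, deltas, max_delta=MAX_DELTA):
    """Apply (dx, dy, dw, dh) deltas to boxes, clamping dw/dh before exp."""
    boxes = boxes.astype(np.float64, copy=False)
    widths = boxes[:, 2] - boxes[:, 0] + 1.0
    heights = boxes[:, 3] - boxes[:, 1] + 1.0
    ctr_x = boxes[:, 0] + 0.5 * widths
    ctr_y = boxes[:, 1] + 0.5 * heights

    dx = deltas[:, 0::4]
    dy = deltas[:, 1::4]
    # Clamp the log-space scale factors so np.exp cannot overflow
    dw = np.minimum(deltas[:, 2::4], max_delta)
    dh = np.minimum(deltas[:, 3::4], max_delta)

    pred_ctr_x = dx * widths[:, np.newaxis] + ctr_x[:, np.newaxis]
    pred_ctr_y = dy * heights[:, np.newaxis] + ctr_y[:, np.newaxis]
    pred_w = np.exp(dw) * widths[:, np.newaxis]
    pred_h = np.exp(dh) * heights[:, np.newaxis]

    pred_boxes = np.zeros(deltas.shape, dtype=np.float64)
    pred_boxes[:, 0::4] = pred_ctr_x - 0.5 * pred_w
    pred_boxes[:, 1::4] = pred_ctr_y - 0.5 * pred_h
    pred_boxes[:, 2::4] = pred_ctr_x + 0.5 * pred_w
    pred_boxes[:, 3::4] = pred_ctr_y + 0.5 * pred_h
    return pred_boxes

boxes = np.array([[0.0, 0.0, 15.0, 15.0]])
huge = np.array([[0.0, 0.0, 1e4, 1e4]])   # deltas far beyond the exp range
print(np.isfinite(bbox_transform_inv_clipped(boxes, huge)).all())
```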

    opened by ZhengRui 35
  • TypeError: 'numpy.float64' object cannot be interpreted as an index

    I am hitting an error when following the tutorial linked below, which uses the INRIA Person dataset as an example.

    https://github.com/deboc/py-faster-rcnn/tree/master/help

    Everything seems to be working, but after two hours I keep hitting the following error at Process 3:

    Solving...
    Process Process-3:
    Traceback (most recent call last):
      File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
        self.run()
      File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
        self._target(*self._args, **self._kwargs)
      File "./tools/train_faster_rcnn_alt_opt.py", line 236, in train_fast_rcnn
        max_iters=max_iters
      File "/home/scott/code/py-faster-rcnn/tools/../lib/fast_rcnn/train.py", line 185, in train_net
        model_paths = sw.train_model(max_iters)
      File "/home/scott/code/py-faster-rcnn/tools/../lib/fast_rcnn/train.py", line 112, in train_model
        self.solver.step(1)
      File "/home/scott/code/py-faster-rcnn/tools/../lib/roi_data_layer/layer.py", line 155, in forward
        blobs = self._get_next_minibatch()
      File "/home/scott/code/py-faster-rcnn/tools/../lib/roi_data_layer/layer.py", line 68, in _get_next_minibatch
        return get_minibatch(minibatch_db, self._num_classes)
      File "/home/scott/code/py-faster-rcnn/tools/../lib/roi_data_layer/minibatch.py", line 64, in get_minibatch
        num_classes)
      File "/home/scott/code/py-faster-rcnn/tools/../lib/roi_data_layer/minibatch.py", line 110, in _sample_rois
        fg_inds, size=fg_rois_per_this_image, replace=False
      File "mtrand.pyx", line 1176, in mtrand.RandomState.choice (numpy/random/mtrand/mtrand.c:18822)
    TypeError: 'numpy.float64' object cannot be interpreted as an index

    Does anyone know what might be causing this problem?
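The traceback ends in a call that passes fg_rois_per_this_image as the size argument of a random choice. A likely explanation, stated here as an assumption, is that this value is computed with floating-point arithmetic (for example via np.round) and newer NumPy versions refuse a float where an integer size is expected. The sketch below uses hypothetical batch values to show the pattern and an explicit int cast as one possible fix.

```python
import numpy as np

# Hypothetical values for illustration only
rois_per_image = 128 / 2
fg_fraction = 0.25
fg_inds = np.arange(40)

# np.round returns a numpy.float64, not an integer
fg_rois_per_image = np.round(fg_fraction * rois_per_image)
fg_rois_per_this_image = np.minimum(fg_rois_per_image, fg_inds.size)

# Cast to a plain int before using the value as a sample size
size = int(fg_rois_per_this_image)
sampled = np.random.choice(fg_inds, size=size, replace=False)
print(type(fg_rois_per_this_image).__name__, len(sampled))
```

The same cast applies wherever a float-valued count is later used as a size or an index.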

    opened by snsie 26
  • I got the error when running demo.py

    Hi. I have already successfully run the fast-rcnn code written by the same author. Now I would like to try faster-rcnn. However, when I run demo.py, I get this error:

    Traceback (most recent call last):
      File "./demo.py", line 18, in <module>
        from fast_rcnn.test import im_detect
      File ".../py-faster-rcnn-master/tools/../lib/fast_rcnn/test.py", line 17, in <module>
        from fast_rcnn.nms_wrapper import nms
      File ".../py-faster-rcnn-master/tools/../lib/fast_rcnn/nms_wrapper.py", line 11, in <module>
        from nms.gpu_nms import gpu_nms
    ImportError: No module named gpu_nms

    Does anyone have any idea about this error? Thank you.

    opened by jojotata 26
  • I want change anchor size?

    Hi guys, I already changed the ratios in lib/rpn/generate_anchors.py and the num_output values, as shown in these screenshots: snapshot6, snapshot7.

    However, I got the following error: snapshot9. Could you please help? Best
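When the anchor ratios or scales change, several numbers in the prototxt have to change together, because they all derive from the anchor count (the prototxt comments give 18 as 2 * 9 and 36 as 4 * 9). The following is a minimal sketch of that arithmetic; the function name `rpn_channel_counts` is hypothetical, and the Python layers that generate anchors would also need to use the same ratios and scales.

```python
def rpn_channel_counts(ratios, scales):
    """Derive the anchor-dependent prototxt values from ratios and scales."""
    num_anchors = len(ratios) * len(scales)
    return {
        "num_anchors": num_anchors,
        "rpn_cls_score": 2 * num_anchors,         # bg/fg score per anchor
        "rpn_bbox_pred": 4 * num_anchors,         # 4 box deltas per anchor
        "rpn_cls_prob_reshape": 2 * num_anchors,  # second dim of the Reshape
    }

default = rpn_channel_counts(ratios=[0.5, 1, 2], scales=[8, 16, 32])
custom = rpn_channel_counts(ratios=[0.5, 1, 2], scales=[2, 4, 8, 16, 32])
print(default)
print(custom)
```

A mismatch between these values and the actual anchor count typically shows up as a shape error at the reshape or loss layers.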

    opened by moyans 23
  • ImportError: No module named _caffe

    After following the steps, I tried to run demo.py and got the error: ImportError: No module named _caffe. But I have already installed caffe, and when I import caffe from a Python terminal there is no error.

    Where is the problem?

    opened by aquibjaved 20
  • Floating point exception

    After several thousand iterations, faster-rcnn throws the error "Floating point exception" in ./experiments/scripts/faster_rcnn_end2end.sh. Searching for the error suggests a division or modulo by zero (i/0 or i%0). Has anyone encountered this?

    opened by morusu 20
  • If I wana detect small object, which args should I modify?

    @rbgirshick Thanks for your good work! I want to detect very small objects, such as 10 * 10, but I don't know how to do this. I have searched the whole issue list but didn't find any complete instructions for this. Could you please show me how to do it, for example which arguments to modify?

    Thank you
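To see why very small objects are hard for the default configuration, it helps to compare the object size with the anchor sizes. The sketch below assumes the default anchor generator uses base_size=16 with scales 8, 16 and 32; the helper name `anchor_side_lengths` and the smaller scale set are hypothetical. Note that images are usually resized before entering the network, so the object size in the resized image is what matters.

```python
def anchor_side_lengths(base_size=16, scales=(8, 16, 32)):
    """Side length in pixels of the square (ratio 1) anchor for each scale."""
    return [base_size * s for s in scales]

default_sizes = anchor_side_lengths()
small_sizes = anchor_side_lengths(scales=(1, 2, 4))  # hypothetical smaller scales

print(default_sizes)  # [128, 256, 512]
print(small_sizes)    # [16, 32, 64]
```

With 'feat_stride': 16, an object of about 10 pixels also covers less than one feature-map cell, so smaller anchors alone may not be sufficient.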

    opened by albertyou2 19
  • continue training with .caffemodel

    Hi, I'm using this code with Python 3.8 (after modifying some code so that it runs on Python 3, since it was written for 2.7). Everything is mostly okay, but some errors appear during training. Fixing the code does not take much time, but training itself takes very long, maybe because I am using a single GPU. Moreover, the only command I know from the README starts training from the beginning (./experiments/scripts/faster_rcnn_end2end(or alt_opt).sh ~~~), so I have had to rerun that command again and again, even though the previous run had already reached 70,000 iterations.

    I can see the caffemodel file (I think it is saved every 10,000 iterations) in the directory mentioned in the README, so what I want to know is:

    Can I continue training from the iteration where it previously stopped, using the caffemodel file or other files? If so, what is the command?

    Thanks for reading, and sorry for my poor English.
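One possible approach is to initialize a new run from the most recent snapshot instead of the ImageNet model, by pointing the --weights argument used in the experiment scripts at that file. The sketch below only picks the newest snapshot from a list of paths. It assumes snapshot names end in `_iter_<N>.caffemodel`; the function name `latest_snapshot` and the example paths are hypothetical. Loading weights this way would restore the parameters only, so the iteration counter and learning-rate schedule would start over.

```python
import re

# Assumption: snapshot files are named like <prefix>_iter_<N>.caffemodel
_ITER_RE = re.compile(r"_iter_(\d+)\.caffemodel$")

def latest_snapshot(paths):
    """Return the path with the highest iteration number, or None."""
    best_path, best_iter = None, -1
    for path in paths:
        match = _ITER_RE.search(path)
        if match and int(match.group(1)) > best_iter:
            best_path, best_iter = path, int(match.group(1))
    return best_path

# Hypothetical snapshot names for illustration
snapshots = [
    "output/net_iter_10000.caffemodel",
    "output/net_iter_70000.caffemodel",
    "output/net_iter_20000.caffemodel",
]
print(latest_snapshot(snapshots))  # output/net_iter_70000.caffemodel
```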

    opened by whansk50 0
  • image invalid, skipping

    I keep getting a loop of exceptions saying "image invalid, skipping" and I cannot see why.

    2021-12-16 11:26:40.670417: W tensorflow/core/framework/op_kernel.cc:1639] Unknown: Exception: Traceback (most recent call last):
      File "C:\Anaconda3\envs\Sensor detection\lib\site-packages\tensorflow_core\python\ops\script_ops.py", line 235, in __call__
        ret = func(*args)
      File "D:\emn_pg (sensor detection project)\sensor detection\lib\layer_utils\proposal_target_layer.py", line 47, in proposal_target_layer
        rois_per_image, _num_classes)
      File "D:\emn_pg (sensor detection project)\sensor detection\lib\layer_utils\proposal_target_layer.py", line 135, in _sample_rois
        raise Exception()
    Exception

    image invalid, skipping

    Here is an example image (1) and its annotation:

    <?xml version="1.0"?>
    <annotation>
        <folder>annotations</folder>
        <filename>1.JPG</filename>
        <path/>
        <source>
            <database>Unknown</database>
        </source>
        <size>
            <width>3024</width>
            <height>4032</height>
            <depth>3</depth>
        </size>
        <segmented>0</segmented>
        <object>
            <name>sensor</name>
            <pose>Unspecified</pose>
            <truncated>0</truncated>
            <occluded>0</occluded>
            <difficult>0</difficult>
            <bndbox>
                <xmin>1189</xmin>
                <ymin>1433</ymin>
                <xmax>1709</xmax>
                <ymax>2013</ymax>
            </bndbox>
        </object>
    </annotation>
    

    Some images have more than one sensor, for example image 10:

    <?xml version="1.0"?>
    <annotation>
        <folder>annotations</folder>
        <filename>10.JPG</filename>
        <path/>
        <source>
            <database>Unknown</database>
        </source>
        <size>
            <width>3024</width>
            <height>4032</height>
            <depth>3</depth>
        </size>
        <segmented>0</segmented>
        <object>
            <name>sensor</name>
            <pose>Unspecified</pose>
            <truncated>0</truncated>
            <occluded>0</occluded>
            <difficult>0</difficult>
            <bndbox>
                <xmin>1643</xmin>
                <ymin>1528</ymin>
                <xmax>1748</xmax>
                <ymax>1620</ymax>
            </bndbox>
        </object>
        <object>
            <name>sensor</name>
            <pose>Unspecified</pose>
            <truncated>0</truncated>
            <occluded>0</occluded>
            <difficult>0</difficult>
            <bndbox>
                <xmin>2387</xmin>
                <ymin>2716</ymin>
                <xmax>2455</xmax>
                <ymax>2784</ymax>
            </bndbox>
        </object>
    </annotation>
    

    I can't see what image is causing this, or if it is a problem somewhere in the code. Can someone help me?
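One way to narrow this down is to check every annotation file for boxes that are empty, inverted, or outside the declared image size. The sketch below does this for a VOC-style annotation like the ones shown above. The function name `check_annotation` is hypothetical, and it is only an assumption that a bad annotation is involved; the exception could equally come from the sampling code itself.

```python
import xml.etree.ElementTree as ET

def check_annotation(xml_text):
    """Return a list of problems found in one VOC-style annotation."""
    root = ET.fromstring(xml_text)
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))
    objects = root.findall("object")
    problems = []
    if not objects:
        problems.append("no <object> entries")
    for idx, obj in enumerate(objects):
        xmin, ymin, xmax, ymax = (
            int(obj.findtext("bndbox/" + tag))
            for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        if xmin >= xmax or ymin >= ymax:
            problems.append("object %d: empty or inverted box" % idx)
        if xmin < 0 or ymin < 0 or xmax > width or ymax > height:
            problems.append("object %d: box outside the image" % idx)
    return problems

# Shortened version of the first annotation shown above
SAMPLE = (
    "<annotation><size><width>3024</width><height>4032</height></size>"
    "<object><name>sensor</name><bndbox><xmin>1189</xmin><ymin>1433</ymin>"
    "<xmax>1709</xmax><ymax>2013</ymax></bndbox></object></annotation>"
)
print(check_annotation(SAMPLE))  # []
```

Running such a check over all files, and printing the file name with each problem, would at least identify whether a specific image is responsible.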

    opened by pollyminatel 0
  • how to solve

    how to solve "raise EnvironmentError('The nvcc binary could not be ' OSError: The nvcc binary could not be located in your $PATH. Either add it to your path, or set $CUDAHOME make: *** [all] Error 1"

    I want to install py-faster-rcnn in order to use it from PyCharm. I have already cloned the repository as described in the installation instructions, but I get the following error when I run make in cmd:

    C:\Users\Yaman\py-faster-rcnn\lib>make
    python setup.py build_ext --inplace
    Traceback (most recent call last):
      File "setup.py", line 58, in <module>
        CUDA = locate_cuda()
      File "setup.py", line 46, in locate_cuda
        raise EnvironmentError('The nvcc binary could not be '
    OSError: The nvcc binary could not be located in your $PATH. Either add it to your path, or set $CUDAHOME
    make: *** [all] Error 1

    However, nvcc --version prints the following:

    nvcc: NVIDIA (R) Cuda compiler driver
    Copyright (c) 2005-2019 NVIDIA Corporation
    Built on Wed_Oct_23_19:32:27_Pacific_Daylight_Time_2019
    Cuda compilation tools, release 10.2, V10.2.89

    This means nvcc is installed. Am I doing anything wrong? Thank you in advance.
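A possible explanation, stated as an assumption, is that the lookup in setup.py searches each PATH entry for a file literally named nvcc, which would miss nvcc.exe on Windows even though the shell finds it. The sketch below shows a lookup based on the standard library's shutil.which, which honors executable extensions on Windows. The function name `locate_nvcc` is hypothetical, and the rest of setup.py may need further changes to build on Windows.

```python
import os
import shutil

def locate_nvcc(environ=None):
    """Find nvcc via CUDAHOME/bin first, then PATH; return None if absent."""
    environ = os.environ if environ is None else environ
    cuda_home = environ.get("CUDAHOME")
    if cuda_home:
        candidate = shutil.which("nvcc", path=os.path.join(cuda_home, "bin"))
        if candidate:
            return candidate
    # shutil.which also tries extensions such as .exe on Windows
    return shutil.which("nvcc", path=environ.get("PATH", ""))

print(locate_nvcc() or "nvcc not found")
```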

    opened by yaman-hub 0
  • so i have to compile cython with cuda or nvcc ??

    I just want to run inference in a virtual machine; I don't need to train. Right now I can't make any progress: I can't compile the Cython extensions, because setup.py requires nvcc and there is no CPU-only build option or anything like that.
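For the non-maximum suppression step specifically, a CPU-only alternative does not need nvcc. The following is a minimal NumPy sketch of greedy NMS, assuming detections are rows of (x1, y1, x2, y2, score). The function name `nms_numpy` is hypothetical, and this does not remove the need to build the rest of the project, including Caffe itself, in a CPU-only configuration.

```python
import numpy as np

def nms_numpy(dets, thresh):
    """Greedy NMS: keep the best box, drop boxes overlapping it above thresh."""
    x1, y1, x2, y2, scores = (dets[:, i] for i in range(5))
    areas = (x2 - x1 + 1) * (y2 - y1 + 1)
    order = scores.argsort()[::-1]   # indices sorted by descending score
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        # Intersection of the best box with all remaining boxes
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        w = np.maximum(0.0, xx2 - xx1 + 1)
        h = np.maximum(0.0, yy2 - yy1 + 1)
        inter = w * h
        iou = inter / (areas[i] + areas[order[1:]] - inter)
        # Keep only boxes whose overlap with the best box is small enough
        order = order[np.where(iou <= thresh)[0] + 1]
    return keep

dets = np.array([
    [0, 0, 10, 10, 0.9],
    [1, 1, 11, 11, 0.8],    # overlaps the first box heavily
    [50, 50, 60, 60, 0.7],
], dtype=np.float64)
print(nms_numpy(dets, thresh=0.3))  # [0, 2]
```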

    opened by xlb8 0
  • How to remove false detection (False Positives) in Faster RCNN

    I am using Faster RCNN with Inception V2 on a custom dataset. My model is working fine with good detection accuracy. However, I am facing a false positive problem: when I pass an image to the model I get the correct prediction, but I also get some wrong bounding boxes with high confidence scores. Is there any post-processing method that can remove these extra detections?

    opened by Syed05 0
Owner
Ross Girshick
Faster RCNN with PyTorch

Faster RCNN with PyTorch Note: I re-implemented faster rcnn in this project when I started learning PyTorch. Then I use PyTorch in all of my projects.

Long Chen 1.6k Dec 23, 2022
Faster RCNN pytorch windows

Faster-RCNN-pytorch-windows Faster RCNN implementation with pytorch for windows Open cmd, compile this comands: cd lib python setup.py build develop T

Hwa-Rang Kim 1 Nov 11, 2022
A PyTorch implementation of the architecture of Mask RCNN

EDIT (AS OF 4th NOVEMBER 2019): This implementation has multiple errors and as of the date 4th, November 2019 is insufficient to be utilized as a reso

Sai Himal Allu 975 Dec 30, 2022
A Pytorch Implementation of Source Data-free Domain Adaptation for a Faster R-CNN

A Pytorch Implementation of Source Data-free Domain Adaptation for a Faster R-CNN Please follow Faster R-CNN and DAF to complete the environment confi

null 2 Jan 12, 2022
3D cascade RCNN for object detection on point cloud

3D Cascade RCNN This is the implementation of 3D Cascade RCNN: High Quality Object Detection in Point Clouds. We designed a 3D object detection model

Qi Cai 22 Dec 2, 2022
This is an example of object detection on Micro bacterium tuberculosis using Mask-RCNN

Mask-RCNN on Mycobacterium tuberculosis This is an example of object detection on Mycobacterium Tuberculosis using Mask RCNN. Implement of Mask R-CNN

Jun-En Ding 1 Sep 16, 2021
NFT-Price-Prediction-CNN - Using visual feature extraction, prices of NFTs are predicted via CNN (Alexnet and Resnet) architectures.

NFT-Price-Prediction-CNN - Using visual feature extraction, prices of NFTs are predicted via CNN (Alexnet and Resnet) architectures.

null 5 Nov 3, 2022
This repository contains the implementation of the paper: "Towards Frequency-Based Explanation for Robust CNN"

RobustFreqCNN About This repository contains the implementation of the paper "Towards Frequency-Based Explanation for Robust CNN" arxiv. It primarly d

Sarosij Bose 2 Jan 23, 2022
This is a Keras implementation of a CNN for estimating age, gender and mask from a camera.

face-detector-age-gender This is a Keras implementation of a CNN for estimating age, gender and mask from a camera. Before run face detector app, expr

Devdreamsolution 2 Dec 4, 2021
A CNN implementation using only numpy. Supports multidimensional images, stride, etc.

A CNN implementation using only numpy. Supports multidimensional images, stride, etc. Speed up due to heavy use of slicing and mathematical simplification..

null 2 Nov 30, 2021
Official implementation of the paper "Lightweight Deep CNN for Natural Image Matting via Similarity Preserving Knowledge Distillation"

Lightweight-Deep-CNN-for-Natural-Image-Matting-via-Similarity-Preserving-Knowledge-Distillation Introduction Accepted at IEEE Signal Processing Letter

DongGeun-Yoon 19 Jun 7, 2022
An Ensemble of CNN (Python 3.5.1 Tensorflow 1.3 numpy 1.13)

An Ensemble of CNN (Python 3.5.1 Tensorflow 1.3 numpy 1.13)

null 0 May 6, 2022
PyTorch implementation of the Quasi-Recurrent Neural Network - up to 16 times faster than NVIDIA's cuDNN LSTM

Quasi-Recurrent Neural Network (QRNN) for PyTorch Updated to support multi-GPU environments via DataParallel - see the the multigpu_dataparallel.py ex

Salesforce 1.3k Dec 28, 2022
Tacotron 2 - PyTorch implementation with faster-than-realtime inference

Tacotron 2 (without wavenet) PyTorch implementation of Natural TTS Synthesis By Conditioning Wavenet On Mel Spectrogram Predictions. This implementati

NVIDIA Corporation 4.1k Jan 3, 2023
FAIR's research platform for object detection research, implementing popular algorithms like Mask R-CNN and RetinaNet.

Detectron is deprecated. Please see detectron2, a ground-up rewrite of Detectron in PyTorch. Detectron Detectron is Facebook AI Research's software sy

Facebook Research 25.5k Jan 7, 2023
CoTr: Efficiently Bridging CNN and Transformer for 3D Medical Image Segmentation

CoTr: Efficient 3D Medical Image Segmentation by bridging CNN and Transformer This is the official pytorch implementation of the CoTr: Paper: CoTr: Ef

null 218 Dec 25, 2022
git《Beta R-CNN: Looking into Pedestrian Detection from Another Perspective》(NeurIPS 2020) GitHub:[fig3]

Beta R-CNN: Looking into Pedestrian Detection from Another Perspective This is the pytorch implementation of our paper "[Beta R-CNN: Looking into Pede

null 35 Sep 8, 2021
A spherical CNN for weather forecasting

DeepSphere-Weather - Deep Learning on the sphere for weather/climate applications. The code in this repository provides a scalable and flexible framew

DeepSphere 47 Dec 25, 2022