A PyTorch implementation of Detectron. Both training from scratch and running inference directly from pretrained Detectron weights are supported.

Overview

Use this instead: https://github.com/facebookresearch/maskrcnn-benchmark

A Pytorch Implementation of Detectron


Example output of e2e_mask_rcnn-R-101-FPN_2x using Detectron pretrained weights.

Corresponding example output from Detectron.

Example output of e2e_keypoint_rcnn-R-50-FPN_s1x using Detectron pretrained weights.

This code follows the implementation architecture of Detectron. Only part of the functionality is supported. Check this section for more information.

With this code, you can...

  1. Train your model from scratch.
  2. Run inference using pretrained weight files (*.pkl) from Detectron.

This repository was originally built on jwyang/faster-rcnn.pytorch. However, after many modifications, the structure has changed a lot, and it is now more similar to Detectron. I deliberately made everything similar or identical to Detectron's implementation, so as to reproduce the results directly from the official pretrained weight files.

This implementation has the following features:

  • It is pure PyTorch code, aside from some CUDA code for certain operations.

  • It supports multi-image batch training.

  • It supports multi-GPU training.

  • It supports three pooling methods. Note that only RoIAlign has been revised to match the Caffe2 implementation, so use that one.

  • It is memory efficient. For data batching, there are two techniques available to reduce memory usage: 1) aspect grouping: group images with similar aspect ratios into the same batch; 2) aspect cropping: crop images that are too long. Aspect grouping is implemented in Detectron, so it is used by default. Aspect cropping is an idea from jwyang/faster-rcnn.pytorch, and it is not used by default (a sketch of aspect grouping follows this list).

    Besides that, I implement a customized nn.DataParallel module which enables different batch blob sizes on different GPUs. Check the My nn.DataParallel section for more details.
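
For illustration, here is a minimal sketch of the aspect grouping idea (the function name and the two-bucket wide/tall scheme are assumptions for this example, not the repo's actual sampler):

import numpy as np

def group_by_aspect_ratio(aspect_ratios, batch_size):
    # Put image indices into 'wide' (w >= h) and 'tall' (w < h) buckets,
    # then draw each batch from a single bucket so images in a batch have
    # similar aspect ratios and need less padding.
    aspect_ratios = np.asarray(aspect_ratios)
    wide = np.where(aspect_ratios >= 1)[0]
    tall = np.where(aspect_ratios < 1)[0]
    batches = []
    for bucket in (wide, tall):
        bucket = np.random.permutation(bucket)
        for i in range(0, len(bucket), batch_size):
            batches.append(bucket[i:i + batch_size].tolist())
    np.random.shuffle(batches)
    return batches

# aspect ratio = width / height for each image in the dataset
print(group_by_aspect_ratio([1.5, 0.7, 1.3, 0.8, 1.6, 0.75], batch_size=2))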

News

  • (2018/05/25) Support ResNeXt backbones.
  • (2018/05/22) Add group normalization baselines.
  • (2018/05/15) PyTorch 0.4 is now supported!

Getting Started

Clone the repo:

git clone https://github.com/roytseng-tw/mask-rcnn.pytorch.git

Requirements

Tested under Python 3.

  • python packages
    • pytorch>=0.3.1
    • torchvision>=0.2.0
    • cython
    • matplotlib
    • numpy
    • scipy
    • opencv
    • pyyaml
    • packaging
    • pycocotools — for COCO dataset, also available from pip.
    • tensorboardX — for logging the losses in Tensorboard
  • An NVIDIA GPU and CUDA 8.0 or higher. Some operations only have GPU implementations.
  • NOTICE: different versions of the PyTorch package have different memory usages.

Compilation

Compile the CUDA code:

cd lib  # please change to this directory
sh make.sh

If you are using Volta GPUs, uncomment this line in lib/make.sh and remember to append a backslash to the line above it. CUDA_PATH defaults to /usr/local/cuda. If you want to use a CUDA library at a different path, change this line accordingly.

It will compile all the modules you need, including NMS, ROI_Pooling, ROI_Crop and ROI_Align. (Actually, GPU NMS is never used ...)

Note that if you use CUDA_VISIBLE_DEVICES to set GPUs, make sure at least one GPU is visible when compiling the code.

Data Preparation

Create a data folder under the repo,

cd {repo_root}
mkdir data
  • COCO: Download the COCO images and annotations from the COCO website.

    And make sure to put the files as the following structure:

    coco
    ├── annotations
    │   ├── instances_minival2014.json
    │   ├── instances_train2014.json
    │   ├── instances_train2017.json
    │   ├── instances_val2014.json
    │   ├── instances_val2017.json
    │   ├── instances_valminusminival2014.json
    │   ├── ...
    │
    └── images
        ├── train2014
        ├── train2017
        ├── val2014
        ├── val2017
        ├── ...
    

    Download the COCO minival annotations from here. Please note that minival is exactly equivalent to the recently defined 2017 val set. Similarly, the union of valminusminival and the 2014 train set is exactly equivalent to the 2017 train set.

    Feel free to put the dataset at any place you want, and then soft link the dataset under the data/ folder:

    ln -s path/to/coco data/coco
    

    It is recommended to put the images on an SSD for better training performance.

Pretrained Model

I use ImageNet pretrained weights from Caffe for the backbone networks.

Download them and put them into {repo_root}/data/pretrained_model.

You can run the following command to download them all:

  • extra required packages: argparse_color_formater, colorama, requests
python tools/download_imagenet_weights.py

NOTE: Caffe pretrained weights have slightly better performance than PyTorch pretrained weights. We suggest using the Caffe pretrained models from the link above to reproduce the results. Incidentally, Detectron also uses pretrained weights from Caffe.

If you want to use PyTorch pretrained models, remember to convert images from BGR to RGB, and use the same data preprocessing (mean subtraction and normalization) as used for the PyTorch pretrained models.
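
For example, a minimal preprocessing sketch for torchvision's ImageNet-pretrained models, assuming an OpenCV-loaded BGR uint8 image (the mean/std values are torchvision's standard ImageNet statistics; the function name is made up):

import cv2
import numpy as np
import torch

def preprocess_for_torchvision(bgr_image):
    # Convert BGR (OpenCV order) to RGB, scale to [0, 1],
    # then normalize with torchvision's ImageNet mean/std.
    rgb = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2RGB).astype(np.float32) / 255.0
    mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
    std = np.array([0.229, 0.224, 0.225], dtype=np.float32)
    rgb = (rgb - mean) / std
    return torch.from_numpy(rgb.transpose(2, 0, 1))  # HWC -> CHW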

ImageNet Pretrained Model provided by Detectron

Besides using the pretrained ResNet weights above, you can also use the weights from Detectron by changing the corresponding line in the model config file as follows:

RESNETS:
  IMAGENET_PRETRAINED_WEIGHTS: 'data/pretrained_model/R-50.pkl'

R-50-GN.pkl and R-101-GN.pkl are required for gn_baselines.

X-101-32x8d.pkl, X-101-64x4d.pkl and X-152-32x8d-IN5k.pkl are required for ResNeXt backbones.

Training

DO NOT CHANGE anything in the provided config files (configs/**/xxxx.yml) unless you know what you are doing.

Use the environment variable CUDA_VISIBLE_DEVICES to control which GPUs to use.

Adaptive config adjustment

Let's define some terms first:

       batch_size: NUM_GPUS x TRAIN.IMS_PER_BATCH
       effective_batch_size: batch_size x iter_size
       change of something: new value of something / old value of something

The following config options will be adjusted automatically according to the actual training setup: 1) number of GPUs NUM_GPUS, 2) batch size per GPU TRAIN.IMS_PER_BATCH, 3) update period iter_size.

  • SOLVER.BASE_LR: adjusted in direct proportion to the change of batch_size.
  • SOLVER.STEPS, SOLVER.MAX_ITER: adjusted in inverse proportion to the change of effective_batch_size. (A sketch of this rule follows.)
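
As a rough illustration of the rule above (a minimal sketch assuming the original config used iter_size 1; the function name and signature are made up for this example, not the repo's exact code):

def adjust_solver(base_lr, steps, max_iter, old_batch_size,
                  new_batch_size, iter_size=1):
    # LR scales linearly with batch_size; the schedule scales inversely
    # with effective_batch_size = batch_size * iter_size.
    lr_scale = new_batch_size / old_batch_size
    eff_scale = (new_batch_size * iter_size) / old_batch_size
    return (base_lr * lr_scale,
            [int(s / eff_scale) for s in steps],
            int(max_iter / eff_scale))

# e.g. a config written for batch size 8, re-run with 4 images per batch
# and iter_size 2: LR halves, schedule stays the same
print(adjust_solver(0.01, [60000, 80000], 90000, old_batch_size=8,
                    new_batch_size=4, iter_size=2))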

Train from scratch

Take Mask R-CNN with a ResNet-50 backbone as an example.

python tools/train_net_step.py --dataset coco2017 --cfg configs/baselines/e2e_mask_rcnn_R-50-C4.yml --use_tfboard --bs {batch_size} --nw {num_workers}

Use --bs to override the default batch size with a value that fits into your GPUs. Similarly for --nw: the number of data loader threads defaults to 4 in config.py.

Specify --use_tfboard to log the losses to TensorBoard.

NOTE: use --dataset keypoints_coco2017 when training for keypoint-rcnn.

The use of --iter_size

As in Caffe, the network is updated once (optimizer.step()) every iter_size iterations, each of which is a forward + backward pass. This yields a larger effective batch size for training. Note that the step count is only increased after a network update.

python tools/train_net_step.py --dataset coco2017 --cfg configs/baselines/e2e_mask_rcnn_R-50-C4.yml --bs 4 --iter_size 4

iter_size defaults to 1.
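
A minimal sketch of this behavior (the 'total_loss' key, the function shape, and its arguments are assumptions for illustration, not the repo's actual training loop):

def train_one_step(model, optimizer, dataiterator, iter_size=1):
    # Accumulate gradients over iter_size forward/backward passes,
    # then update the network once; the step count advances only here.
    optimizer.zero_grad()
    for _ in range(iter_size):
        input_data = next(dataiterator)
        loss = model(**input_data)['total_loss']  # assumed loss key
        (loss / iter_size).backward()  # average over the accumulation window
    optimizer.step()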

Finetune from a pretrained checkpoint

python tools/train_net_step.py ... --load_ckpt {path/to/the/checkpoint}

or using Detectron's checkpoint file

python tools/train_net_step.py ... --load_detectron {path/to/the/checkpoint}

Resume training with the same dataset and batch size

python tools/train_net_step.py ... --load_ckpt {path/to/the/checkpoint} --resume

When resuming training, the step count and optimizer state are also restored from the checkpoint. For the SGD optimizer, the optimizer state contains the momentum buffer for each trainable parameter.

NOTE: --resume is not yet supported for --load_detectron
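
For reference, a sketch of what resuming restores (the checkpoint key names below are assumptions, not necessarily the repo's exact schema):

import torch

def resume_from_checkpoint(model, optimizer, ckpt_path):
    # Restore model weights, optimizer state (e.g. SGD momentum buffers),
    # and the step count saved at checkpoint time.
    checkpoint = torch.load(ckpt_path)
    model.load_state_dict(checkpoint['model'])
    optimizer.load_state_dict(checkpoint['optimizer'])
    return checkpoint['step']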

Set config options in command line

  python tools/train_net_step.py ... --no_save --set {config.name1} {value1} {config.name2} {value2} ...
  • For example, run in debug mode:
    python tools/train_net_step.py ... --no_save --set DEBUG True
    
    This loads fewer annotations to speed things up. Add --no_save to avoid saving any checkpoints or logs.

Show command line help messages

python tools/train_net_step.py --help

Two Training Scripts

In short, use train_net_step.py.

(Deprecated) In train_net.py, some config options have no effect and are worth noting:

  • SOLVER.LR_POLICY, SOLVER.MAX_ITER, SOLVER.STEPS, SOLVER.LRS: For now, the training policy is controlled by these command line arguments:

    • --epochs: How many epochs to train. One epoch means one pass through the whole training set. Defaults to 6.
    • --lr_decay_epochs: Epochs at which to decay the learning rate. Decay happens at the beginning of an epoch; epochs are 0-indexed. Defaults to [4, 5].

    For more command line arguments, please refer to python train_net.py --help

  • SOLVER.WARM_UP_ITERS, SOLVER.WARM_UP_FACTOR, SOLVER.WARM_UP_METHOD: Training warm up is not supported.

Inference

Evaluate the training results

For example, to test Mask R-CNN on the coco2017 val set:

python tools/test_net.py --dataset coco2017 --cfg configs/baselines/e2e_mask_rcnn_R-50-FPN_1x.yaml --load_ckpt {path/to/your/checkpoint}

Use --load_detectron to load Detectron's checkpoint instead. If multiple GPUs are available, add --multi-gpu-testing.

To specify a different output directory, use --output_dir {...}. It defaults to {the/parent/dir/of/checkpoint}/test.

Visualize the training results on images

python tools/infer_simple.py --dataset coco --cfg configs/baselines/e2e_mask_rcnn_R-50-C4.yml --load_ckpt {path/to/your/checkpoint} --image_dir {dir/of/input/images} --output_dir {dir/to/save/visualizations}

--output_dir defaults to infer_outputs.

Supported Network modules

  • Backbone:

    • ResNet: ResNet50_conv4_body, ResNet50_conv5_body, ResNet101_Conv4_Body, ResNet101_Conv5_Body, ResNet152_Conv5_Body
    • ResNeXt: [fpn_]ResNet101_Conv4_Body, [fpn_]ResNet101_Conv5_Body, [fpn_]ResNet152_Conv5_Body
    • FPN: fpn_ResNet50_conv5_body, fpn_ResNet50_conv5_P2only_body, fpn_ResNet101_conv5_body, fpn_ResNet101_conv5_P2only_body, fpn_ResNet152_conv5_body, fpn_ResNet152_conv5_P2only_body
  • Box head: ResNet_roi_conv5_head, roi_2mlp_head, roi_Xconv1fc_head, roi_Xconv1fc_gn_head

  • Mask head: mask_rcnn_fcn_head_v0upshare, mask_rcnn_fcn_head_v0up, mask_rcnn_fcn_head_v1up, mask_rcnn_fcn_head_v1up4convs, mask_rcnn_fcn_head_v1up4convs_gn

  • Keypoints head: roi_pose_head_v1convX

NOTE: the naming is similar to that used in Detectron. Just remove any leading add_.

Supported Datasets

Only COCO is supported for now. However, the whole dataset library implementation is almost identical to Detectron's, so it should be easy to add more of the datasets supported by Detectron.

Configuration Options

Architecture-specific configuration files are placed under configs. The general configuration file lib/core/config.py has almost all the options, with the same default values as Detectron's, so it's effortless to transform the architecture-specific configs from Detectron.

Some options from Detectron are not used because the corresponding functionality is not implemented yet, for example, data augmentation at test time.

Extra options

  • MODEL.LOAD_IMAGENET_PRETRAINED_WEIGHTS = True: Whether to load ImageNet pretrained weights.
    • RESNETS.IMAGENET_PRETRAINED_WEIGHTS = '': Path to pretrained residual network weights. If it starts with '/', it is treated as an absolute path; otherwise, it is treated as a path relative to ROOT_DIR.
  • TRAIN.ASPECT_CROPPING = False, TRAIN.ASPECT_HI = 2, TRAIN.ASPECT_LO = 0.5: Options for aspect cropping to restrict image aspect ratio range.
  • RPN.OUT_DIM_AS_IN_DIM = True, RPN.OUT_DIM = 512, RPN.CLS_ACTIVATION = 'sigmoid': The official implementation of RPN uses the same number of input and output feature channels and uses sigmoid as the activation function for fg/bg class prediction. jwyang's implementation fixes the output channel number to 512 and uses softmax as the activation function.

How to transform configuration files from Detectron

  1. Remove MODEL.NUM_CLASSES. It will be set according to the dataset specified by --dataset.
  2. Remove TRAIN.WEIGHTS, TRAIN.DATASETS and TEST.DATASETS.
  3. For module-type options (e.g. MODEL.CONV_BODY, FAST_RCNN.ROI_BOX_HEAD ...), remove the add_ prefix from the string if it exists.
  4. If you want to load ImageNet pretrained weights for the model, add RESNETS.IMAGENET_PRETRAINED_WEIGHTS pointing to the pretrained weight file. If not, set MODEL.LOAD_IMAGENET_PRETRAINED_WEIGHTS to False.
  5. [Optional] Delete OUTPUT_DIR: . on the last line.
  6. Do NOT change the option NUM_GPUS in the config file. It's used to infer the original batch size for training, and the learning rate will be scaled linearly according to the change in batch size. Proper learning rate adjustment is important when training with a different batch size.
  7. For group normalization baselines, add RESNETS.USE_GN: True.
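
A minimal PyYAML sketch of these steps applied programmatically; treat it as an illustration under the assumptions in the list above, not a vetted conversion tool:

import yaml

def convert_detectron_config(src_path, dst_path, imagenet_weights=None):
    with open(src_path) as f:
        cfg = yaml.safe_load(f)
    cfg.get('MODEL', {}).pop('NUM_CLASSES', None)              # step 1
    for section, key in (('TRAIN', 'WEIGHTS'),                 # step 2
                         ('TRAIN', 'DATASETS'),
                         ('TEST', 'DATASETS')):
        cfg.get(section, {}).pop(key, None)
    for section, key in (('MODEL', 'CONV_BODY'),               # step 3
                         ('FAST_RCNN', 'ROI_BOX_HEAD')):
        value = cfg.get(section, {}).get(key)
        if isinstance(value, str) and value.startswith('add_'):
            cfg[section][key] = value[len('add_'):]
    if imagenet_weights:                                       # step 4
        cfg.setdefault('RESNETS', {})['IMAGENET_PRETRAINED_WEIGHTS'] = imagenet_weights
    else:
        cfg.setdefault('MODEL', {})['LOAD_IMAGENET_PRETRAINED_WEIGHTS'] = False
    cfg.pop('OUTPUT_DIR', None)                                # step 5
    with open(dst_path, 'w') as f:                             # steps 6-7 are manual
        yaml.safe_dump(cfg, f)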

My nn.DataParallel

  • Keep certain keyword inputs on the CPU: the official DataParallel broadcasts all input Variables to the GPUs, but many RPN-related computations are done on the CPU, so it's unnecessary to put those inputs on the GPUs.
  • Allow different blob sizes on different GPUs: to save GPU memory, images are padded separately for each GPU.
  • Work with return values of dictionary type.
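
A minimal sketch of the first idea, keeping chosen keyword inputs on the CPU while everything else is scattered (an illustration of the approach, not the repo's actual lib/nn/parallel code; the class name and cpu_keywords argument are made up):

import torch.nn as nn

class CPUKeywordDataParallel(nn.DataParallel):
    # Like nn.DataParallel, but kwargs named in cpu_keywords are not
    # broadcast to the GPUs; every replica receives the same CPU object.
    def __init__(self, module, cpu_keywords=(), **kwargs):
        super().__init__(module, **kwargs)
        self.cpu_keywords = set(cpu_keywords)

    def scatter(self, inputs, kwargs, device_ids):
        cpu_kwargs = {k: v for k, v in kwargs.items() if k in self.cpu_keywords}
        gpu_kwargs = {k: v for k, v in kwargs.items() if k not in self.cpu_keywords}
        inputs, gpu_kwargs = super().scatter(inputs, gpu_kwargs, device_ids)
        for per_device in gpu_kwargs:
            per_device.update(cpu_kwargs)  # stays on CPU, shared by replicas
        return inputs, gpu_kwargs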

Benchmark

BENCHMARK.md

Comments
  • Poor training results

    Hi, I have trained R-101-FPN on coco2017 using 4 GPUs, but only got mmAP = 0.33 during testing, which is well below the Detectron result of 0.40.

    What can be the problem?

    I have used python tools/train_net.py --dataset coco2017 --cfg configs/e2e_mask_rcnn_R-101-FPN_2x.yaml --use-tfboard --nw 8 --b 8 for training and python tools/test_net.py --cfg configs/e2e_mask_rcnn_R-101-FPN_2x.yaml --load_ckpt Outputs/e2e_mask_rcnn_R-101-FPN_2x/Apr19-11-34-35_devbox/ckpt/model_7_29315.pth --dataset coco2017

    The loss at the end was about 0.6 which also seems a bit high.

    opened by Rizhiy 20
  • AssertionError: Range subprocess failed (exit code: 1)

    Hi @roytseng-tw, when I evaluate the training results, I face the problem below:

    INFO subprocess.py: 129: # ---------------------------------------------------------------------------- #
    INFO subprocess.py: 131: stdout of subprocess 0 with range [1, 1250]
    INFO subprocess.py: 133: # ---------------------------------------------------------------------------- #
    Traceback (most recent call last):
      File "/mnt/hdd/tung/aim_2018/try_model/mask-rcnn.pytorch/tools/test_net.py", line 4, in <module>
        import cv2
    ImportError: No module named cv2
    Traceback (most recent call last):
      File "tools/test_net.py", line 119, in <module>
        check_expected_results=True)
      File "/mnt/hdd/tung/aim_2018/try_model/mask-rcnn.pytorch/lib/core/test_engine.py", line 128, in run_inference
        all_results = result_getter()
      File "/mnt/hdd/tung/aim_2018/try_model/mask-rcnn.pytorch/lib/core/test_engine.py", line 108, in result_getter
        multi_gpu=multi_gpu_testing
      File "/mnt/hdd/tung/aim_2018/try_model/mask-rcnn.pytorch/lib/core/test_engine.py", line 155, in test_net_on_dataset
        args, dataset_name, proposal_file, num_images, output_dir
      File "/mnt/hdd/tung/aim_2018/try_model/mask-rcnn.pytorch/lib/core/test_engine.py", line 187, in multi_gpu_test_net_on_dataset
        args.load_ckpt, args.load_detectron, opts
      File "/mnt/hdd/tung/aim_2018/try_model/mask-rcnn.pytorch/lib/utils/subprocess.py", line 109, in process_in_parallel
        log_subprocess_output(i, p, output_dir, tag, start, end)
      File "/mnt/hdd/tung/aim_2018/try_model/mask-rcnn.pytorch/lib/utils/subprocess.py", line 147, in log_subprocess_output
        assert ret == 0, 'Range subprocess failed (exit code: {})'.format(ret)
    AssertionError: Range subprocess failed (exit code: 1)

    I have installed opencv and can successfully import cv2, but I don't know what causes this problem. I have tried the solution in https://github.com/facebookresearch/Detectron/issues/349 but it did not help. In the config file e2e_mask_rcnn_R-50-C4_1x.yaml, I just re-configured NUM_GPUS and kept everything else as the original. Can you tell me what this problem is?

    The command that I ran: python3 tools/test_net.py --dataset coco2017 --cfg configs/e2e_mask_rcnn_R-50-C4_1x.yaml --load_ckpt Outputs/e2e_mask_rcnn_R-50-C4_1x/May17-21-45-19_slspGPU6_step/ckpt/model_step89999.pth --multi-gpu-testing --output_dir Output_val

    System information

    • Operating system: Ubuntu 16.04.4 LTS
    • CUDA version: 8
    • cuDNN version: 5.1
    • GPU models (for all devices if they are not all the same): TITAN X (4 GPUS)
    • python version: 3.5
    • pytorch version: 0.3.1
    opened by tunglm2203 19
  • About resume

    Hello, I tried to resume training using this command:

     python tools/train_net_step.py --dataset coco2017 --cfg configs/e2e_faster_rcnn_R-101-FPN_1x.yaml --use_tfboard --load_ckpt  Outputs/e2e_faster_rcnn_R-101-FPN_1x/May02-12-15-12_faster_step/ckpt/model_step69999.pth --resume
    

    However, it throws a runtime error:

    Traceback (most recent call last):
      File "tools/train_net_step.py", line 367, in main
        optimizer.step()
      File "/home/philokey/.virtualenvs/py3/lib/python3.5/site-packages/torch/optim/sgd.py", line 94, in step
        buf.mul_(momentum).add_(1 - dampening, d_p)
    RuntimeError: invalid argument 3: sizes do not match at /pytorch/torch/lib/THC/generated/../generic/THCTensorMathPointwise.cu:271
    
    

    How can I solve this problem?

    opened by philokey 17
  • R101-FPN results unreproducible

    I tried to train R101-FPN-1x / R101-FPN-2x using the default settings. Unfortunately, neither of them can reproduce the results of Detectron on Caffe2 (~2%-3% lower).

    Any idea what happened? Why R50-FPN-1x is good but these are not?

    opened by jason718 12
  • Benchmark for deeper models

    Thanks for sharing the great code!

    I can also get similar AP for both box and segm with R-50-FPN model, as confirmed in Issue #24.

    I am wondering if there are benchmark results for deeper models like R-101-FPN. On my side, the results for R-101-FPN are not as good as those in Detectron. Did you reproduce the performance of Detectron (box AP 40, segm AP 35.9) for R-101-FPN @roytseng-tw @Rizhiy?

    opened by li-js 12
  • sh make.sh error cython incompatible?

    Expected results

    compiling successfully.

    Actual results

    failed.

    Detailed steps to reproduce

    sh make.sh
    
    running build_ext
    building 'utils.cython_bbox' extension
    creating build
    creating build/temp.linux-x86_64-3.7
    creating build/temp.linux-x86_64-3.7/utils
    gcc -pthread -B /home/ubuntu/anaconda3/envs/pytorch41/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/ubuntu/anaconda3/envs/pytorch41/lib/python3.7/site-packages/numpy/core/include -I/home/ubuntu/anaconda3/envs/pytorch41/include/python3.7m -c utils/cython_bbox.c -o build/temp.linux-x86_64-3.7/utils/cython_bbox.o -Wno-cpp
    utils/cython_bbox.c: In function ‘__Pyx__ExceptionSave’:
    utils/cython_bbox.c:6052:19: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_type’
         *type = tstate->exc_type;
                       ^
    utils/cython_bbox.c:6053:20: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_value’
         *value = tstate->exc_value;
                        ^
    utils/cython_bbox.c:6054:17: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_traceback’
         *tb = tstate->exc_traceback;
                     ^
    utils/cython_bbox.c: In function ‘__Pyx__ExceptionReset’:
    utils/cython_bbox.c:6061:22: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_type’
         tmp_type = tstate->exc_type;
                          ^
    utils/cython_bbox.c:6062:23: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_value’
         tmp_value = tstate->exc_value;
                           ^
    utils/cython_bbox.c:6063:20: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_traceback’
         tmp_tb = tstate->exc_traceback;
                        ^
    utils/cython_bbox.c:6064:11: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_type’
         tstate->exc_type = type;
               ^
    utils/cython_bbox.c:6065:11: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_value’
         tstate->exc_value = value;
               ^
    utils/cython_bbox.c:6066:11: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_traceback’
         tstate->exc_traceback = tb;
               ^
    utils/cython_bbox.c: In function ‘__Pyx__GetException’:
    utils/cython_bbox.c:6121:22: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_type’
         tmp_type = tstate->exc_type;
                          ^
    utils/cython_bbox.c:6122:23: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_value’
         tmp_value = tstate->exc_value;
                           ^
    utils/cython_bbox.c:6123:20: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_traceback’
         tmp_tb = tstate->exc_traceback;
                        ^
    utils/cython_bbox.c:6124:11: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_type’
         tstate->exc_type = local_type;
               ^
    utils/cython_bbox.c:6125:11: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_value’
         tstate->exc_value = local_value;
               ^
    utils/cython_bbox.c:6126:11: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_traceback’
         tstate->exc_traceback = local_tb;
               ^
    error: command 'gcc' failed with exit status 1
    Compiling nms kernels by nvcc...
    Including CUDA code.
    /home/ubuntu/Detectron.pytorch/lib/model/nms
    ['/home/ubuntu/Detectron.pytorch/lib/model/nms/src/nms_cuda_kernel.cu.o']
    generating /tmp/tmpsqbmpkvi/_nms.c
    setting the current directory to '/tmp/tmpsqbmpkvi'
    running build_ext
    building '_nms' extension
    creating home
    creating home/ubuntu
    creating home/ubuntu/Detectron.pytorch
    creating home/ubuntu/Detectron.pytorch/lib
    creating home/ubuntu/Detectron.pytorch/lib/model
    creating home/ubuntu/Detectron.pytorch/lib/model/nms
    creating home/ubuntu/Detectron.pytorch/lib/model/nms/src
    gcc -pthread -B /home/ubuntu/anaconda3/envs/pytorch41/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DWITH_CUDA -I/home/ubuntu/anaconda3/envs/pytorch41/lib/python3.7/site-packages/torch/utils/ffi/../../lib/include -I/home/ubuntu/anaconda3/envs/pytorch41/lib/python3.7/site-packages/torch/utils/ffi/../../lib/include/TH -I/home/ubuntu/anaconda3/envs/pytorch41/lib/python3.7/site-packages/torch/utils/ffi/../../lib/include/THC -I/usr/local/cuda/include -I/home/ubuntu/anaconda3/envs/pytorch41/include/python3.7m -c _nms.c -o ./_nms.o -std=c99
    gcc -pthread -B /home/ubuntu/anaconda3/envs/pytorch41/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DWITH_CUDA -I/home/ubuntu/anaconda3/envs/pytorch41/lib/python3.7/site-packages/torch/utils/ffi/../../lib/include -I/home/ubuntu/anaconda3/envs/pytorch41/lib/python3.7/site-packages/torch/utils/ffi/../../lib/include/TH -I/home/ubuntu/anaconda3/envs/pytorch41/lib/python3.7/site-packages/torch/utils/ffi/../../lib/include/THC -I/usr/local/cuda/include -I/home/ubuntu/anaconda3/envs/pytorch41/include/python3.7m -c /home/ubuntu/Detectron.pytorch/lib/model/nms/src/nms_cuda.c -o ./home/ubuntu/Detectron.pytorch/lib/model/nms/src/nms_cuda.o -std=c99
    gcc -pthread -shared -B /home/ubuntu/anaconda3/envs/pytorch41/compiler_compat -L/home/ubuntu/anaconda3/envs/pytorch41/lib -Wl,-rpath=/home/ubuntu/anaconda3/envs/pytorch41/lib -Wl,--no-as-needed -Wl,--sysroot=/ ./_nms.o ./home/ubuntu/Detectron.pytorch/lib/model/nms/src/nms_cuda.o /home/ubuntu/Detectron.pytorch/lib/model/nms/src/nms_cuda_kernel.cu.o -o ./_nms.so
    Compiling roi pooling kernels by nvcc...
    Including CUDA code.
    /home/ubuntu/Detectron.pytorch/lib/model/roi_pooling
    generating /tmp/tmptq2i62g6/_roi_pooling.c
    setting the current directory to '/tmp/tmptq2i62g6'
    running build_ext
    building '_roi_pooling' extension
    creating home
    creating home/ubuntu
    creating home/ubuntu/Detectron.pytorch
    creating home/ubuntu/Detectron.pytorch/lib
    creating home/ubuntu/Detectron.pytorch/lib/model
    creating home/ubuntu/Detectron.pytorch/lib/model/roi_pooling
    creating home/ubuntu/Detectron.pytorch/lib/model/roi_pooling/src
    gcc -pthread -B /home/ubuntu/anaconda3/envs/pytorch41/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DWITH_CUDA -I/home/ubuntu/anaconda3/envs/pytorch41/lib/python3.7/site-packages/torch/utils/ffi/../../lib/include -I/home/ubuntu/anaconda3/envs/pytorch41/lib/python3.7/site-packages/torch/utils/ffi/../../lib/include/TH -I/home/ubuntu/anaconda3/envs/pytorch41/lib/python3.7/site-packages/torch/utils/ffi/../../lib/include/THC -I/usr/local/cuda/include -I/home/ubuntu/anaconda3/envs/pytorch41/include/python3.7m -c _roi_pooling.c -o ./_roi_pooling.o -std=c99
    gcc -pthread -B /home/ubuntu/anaconda3/envs/pytorch41/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DWITH_CUDA -I/home/ubuntu/anaconda3/envs/pytorch41/lib/python3.7/site-packages/torch/utils/ffi/../../lib/include -I/home/ubuntu/anaconda3/envs/pytorch41/lib/python3.7/site-packages/torch/utils/ffi/../../lib/include/TH -I/home/ubuntu/anaconda3/envs/pytorch41/lib/python3.7/site-packages/torch/utils/ffi/../../lib/include/THC -I/usr/local/cuda/include -I/home/ubuntu/anaconda3/envs/pytorch41/include/python3.7m -c /home/ubuntu/Detectron.pytorch/lib/model/roi_pooling/src/roi_pooling.c -o ./home/ubuntu/Detectron.pytorch/lib/model/roi_pooling/src/roi_pooling.o -std=c99
    gcc -pthread -B /home/ubuntu/anaconda3/envs/pytorch41/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DWITH_CUDA -I/home/ubuntu/anaconda3/envs/pytorch41/lib/python3.7/site-packages/torch/utils/ffi/../../lib/include -I/home/ubuntu/anaconda3/envs/pytorch41/lib/python3.7/site-packages/torch/utils/ffi/../../lib/include/TH -I/home/ubuntu/anaconda3/envs/pytorch41/lib/python3.7/site-packages/torch/utils/ffi/../../lib/include/THC -I/usr/local/cuda/include -I/home/ubuntu/anaconda3/envs/pytorch41/include/python3.7m -c /home/ubuntu/Detectron.pytorch/lib/model/roi_pooling/src/roi_pooling_cuda.c -o ./home/ubuntu/Detectron.pytorch/lib/model/roi_pooling/src/roi_pooling_cuda.o -std=c99
    gcc -pthread -shared -B /home/ubuntu/anaconda3/envs/pytorch41/compiler_compat -L/home/ubuntu/anaconda3/envs/pytorch41/lib -Wl,-rpath=/home/ubuntu/anaconda3/envs/pytorch41/lib -Wl,--no-as-needed -Wl,--sysroot=/ ./_roi_pooling.o ./home/ubuntu/Detectron.pytorch/lib/model/roi_pooling/src/roi_pooling.o ./home/ubuntu/Detectron.pytorch/lib/model/roi_pooling/src/roi_pooling_cuda.o /home/ubuntu/Detectron.pytorch/lib/model/roi_pooling/src/roi_pooling.cu.o -o ./_roi_pooling.so
    Compiling roi crop kernels by nvcc...
    Including CUDA code.
    /home/ubuntu/Detectron.pytorch/lib/model/roi_crop
    generating /tmp/tmpcd1yg2_m/_roi_crop.c
    setting the current directory to '/tmp/tmpcd1yg2_m'
    running build_ext
    building '_roi_crop' extension
    creating home
    creating home/ubuntu
    creating home/ubuntu/Detectron.pytorch
    creating home/ubuntu/Detectron.pytorch/lib
    creating home/ubuntu/Detectron.pytorch/lib/model
    creating home/ubuntu/Detectron.pytorch/lib/model/roi_crop
    creating home/ubuntu/Detectron.pytorch/lib/model/roi_crop/src
    gcc -pthread -B /home/ubuntu/anaconda3/envs/pytorch41/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DWITH_CUDA -I/home/ubuntu/anaconda3/envs/pytorch41/lib/python3.7/site-packages/torch/utils/ffi/../../lib/include -I/home/ubuntu/anaconda3/envs/pytorch41/lib/python3.7/site-packages/torch/utils/ffi/../../lib/include/TH -I/home/ubuntu/anaconda3/envs/pytorch41/lib/python3.7/site-packages/torch/utils/ffi/../../lib/include/THC -I/usr/local/cuda/include -I/home/ubuntu/anaconda3/envs/pytorch41/include/python3.7m -c _roi_crop.c -o ./_roi_crop.o -std=c99
    gcc -pthread -B /home/ubuntu/anaconda3/envs/pytorch41/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DWITH_CUDA -I/home/ubuntu/anaconda3/envs/pytorch41/lib/python3.7/site-packages/torch/utils/ffi/../../lib/include -I/home/ubuntu/anaconda3/envs/pytorch41/lib/python3.7/site-packages/torch/utils/ffi/../../lib/include/TH -I/home/ubuntu/anaconda3/envs/pytorch41/lib/python3.7/site-packages/torch/utils/ffi/../../lib/include/THC -I/usr/local/cuda/include -I/home/ubuntu/anaconda3/envs/pytorch41/include/python3.7m -c /home/ubuntu/Detectron.pytorch/lib/model/roi_crop/src/roi_crop.c -o ./home/ubuntu/Detectron.pytorch/lib/model/roi_crop/src/roi_crop.o -std=c99
    /home/ubuntu/Detectron.pytorch/lib/model/roi_crop/src/roi_crop.c: In function ‘BilinearSamplerBHWD_updateGradInput’:
    /home/ubuntu/Detectron.pytorch/lib/model/roi_crop/src/roi_crop.c:190:14: warning: unused variable ‘inBottomRight’ [-Wunused-variable]
             real inBottomRight=0;
                  ^
    /home/ubuntu/Detectron.pytorch/lib/model/roi_crop/src/roi_crop.c:189:14: warning: unused variable ‘inBottomLeft’ [-Wunused-variable]
             real inBottomLeft=0;
                  ^
    /home/ubuntu/Detectron.pytorch/lib/model/roi_crop/src/roi_crop.c:188:14: warning: unused variable ‘inTopRight’ [-Wunused-variable]
             real inTopRight=0;
                  ^
    /home/ubuntu/Detectron.pytorch/lib/model/roi_crop/src/roi_crop.c:187:14: warning: unused variable ‘inTopLeft’ [-Wunused-variable]
             real inTopLeft=0;
                  ^
    /home/ubuntu/Detectron.pytorch/lib/model/roi_crop/src/roi_crop.c:186:14: warning: unused variable ‘v’ [-Wunused-variable]
             real v=0;
                  ^
    /home/ubuntu/Detectron.pytorch/lib/model/roi_crop/src/roi_crop.c: In function ‘BilinearSamplerBCHW_updateGradInput’:
    /home/ubuntu/Detectron.pytorch/lib/model/roi_crop/src/roi_crop.c:440:14: warning: unused variable ‘inBottomRight’ [-Wunused-variable]
             real inBottomRight=0;
                  ^
    /home/ubuntu/Detectron.pytorch/lib/model/roi_crop/src/roi_crop.c:439:14: warning: unused variable ‘inBottomLeft’ [-Wunused-variable]
             real inBottomLeft=0;
                  ^
    /home/ubuntu/Detectron.pytorch/lib/model/roi_crop/src/roi_crop.c:438:14: warning: unused variable ‘inTopRight’ [-Wunused-variable]
             real inTopRight=0;
                  ^
    /home/ubuntu/Detectron.pytorch/lib/model/roi_crop/src/roi_crop.c:437:14: warning: unused variable ‘inTopLeft’ [-Wunused-variable]
             real inTopLeft=0;
                  ^
    /home/ubuntu/Detectron.pytorch/lib/model/roi_crop/src/roi_crop.c:436:14: warning: unused variable ‘v’ [-Wunused-variable]
             real v=0;
                  ^
    gcc -pthread -B /home/ubuntu/anaconda3/envs/pytorch41/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DWITH_CUDA -I/home/ubuntu/anaconda3/envs/pytorch41/lib/python3.7/site-packages/torch/utils/ffi/../../lib/include -I/home/ubuntu/anaconda3/envs/pytorch41/lib/python3.7/site-packages/torch/utils/ffi/../../lib/include/TH -I/home/ubuntu/anaconda3/envs/pytorch41/lib/python3.7/site-packages/torch/utils/ffi/../../lib/include/THC -I/usr/local/cuda/include -I/home/ubuntu/anaconda3/envs/pytorch41/include/python3.7m -c /home/ubuntu/Detectron.pytorch/lib/model/roi_crop/src/roi_crop_cuda.c -o ./home/ubuntu/Detectron.pytorch/lib/model/roi_crop/src/roi_crop_cuda.o -std=c99
    gcc -pthread -shared -B /home/ubuntu/anaconda3/envs/pytorch41/compiler_compat -L/home/ubuntu/anaconda3/envs/pytorch41/lib -Wl,-rpath=/home/ubuntu/anaconda3/envs/pytorch41/lib -Wl,--no-as-needed -Wl,--sysroot=/ ./_roi_crop.o ./home/ubuntu/Detectron.pytorch/lib/model/roi_crop/src/roi_crop.o ./home/ubuntu/Detectron.pytorch/lib/model/roi_crop/src/roi_crop_cuda.o /home/ubuntu/Detectron.pytorch/lib/model/roi_crop/src/roi_crop_cuda_kernel.cu.o -o ./_roi_crop.so
    Compiling roi align kernels by nvcc...
    Including CUDA code.
    /home/ubuntu/Detectron.pytorch/lib/modeling/roi_xfrom/roi_align
    generating /tmp/tmpythzjg8o/_roi_align.c
    setting the current directory to '/tmp/tmpythzjg8o'
    running build_ext
    building '_roi_align' extension
    creating home
    creating home/ubuntu
    creating home/ubuntu/Detectron.pytorch
    creating home/ubuntu/Detectron.pytorch/lib
    creating home/ubuntu/Detectron.pytorch/lib/modeling
    creating home/ubuntu/Detectron.pytorch/lib/modeling/roi_xfrom
    creating home/ubuntu/Detectron.pytorch/lib/modeling/roi_xfrom/roi_align
    creating home/ubuntu/Detectron.pytorch/lib/modeling/roi_xfrom/roi_align/src
    gcc -pthread -B /home/ubuntu/anaconda3/envs/pytorch41/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DWITH_CUDA -I/home/ubuntu/anaconda3/envs/pytorch41/lib/python3.7/site-packages/torch/utils/ffi/../../lib/include -I/home/ubuntu/anaconda3/envs/pytorch41/lib/python3.7/site-packages/torch/utils/ffi/../../lib/include/TH -I/home/ubuntu/anaconda3/envs/pytorch41/lib/python3.7/site-packages/torch/utils/ffi/../../lib/include/THC -I/usr/local/cuda/include -I/home/ubuntu/anaconda3/envs/pytorch41/include/python3.7m -c _roi_align.c -o ./_roi_align.o -std=c99
    gcc -pthread -B /home/ubuntu/anaconda3/envs/pytorch41/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DWITH_CUDA -I/home/ubuntu/anaconda3/envs/pytorch41/lib/python3.7/site-packages/torch/utils/ffi/../../lib/include -I/home/ubuntu/anaconda3/envs/pytorch41/lib/python3.7/site-packages/torch/utils/ffi/../../lib/include/TH -I/home/ubuntu/anaconda3/envs/pytorch41/lib/python3.7/site-packages/torch/utils/ffi/../../lib/include/THC -I/usr/local/cuda/include -I/home/ubuntu/anaconda3/envs/pytorch41/include/python3.7m -c /home/ubuntu/Detectron.pytorch/lib/modeling/roi_xfrom/roi_align/src/roi_align_cuda.c -o ./home/ubuntu/Detectron.pytorch/lib/modeling/roi_xfrom/roi_align/src/roi_align_cuda.o -std=c99
    gcc -pthread -shared -B /home/ubuntu/anaconda3/envs/pytorch41/compiler_compat -L/home/ubuntu/anaconda3/envs/pytorch41/lib -Wl,-rpath=/home/ubuntu/anaconda3/envs/pytorch41/lib -Wl,--no-as-needed -Wl,--sysroot=/ ./_roi_align.o ./home/ubuntu/Detectron.pytorch/lib/modeling/roi_xfrom/roi_align/src/roi_align_cuda.o /home/ubuntu/Detectron.pytorch/lib/modeling/roi_xfrom/roi_align/src/roi_align_kernel.cu.o -o ./_roi_align.so
    

    System information

    • Operating system: ubuntu 16.04
    • CUDA version: 9.0
    • cuDNN version: 7
    • GPU models (for all devices if they are not all the same):
    • python version: 3.7
    • pytorch version: 0.4.1
    • Anything else that seems relevant: ?
    opened by szrlee 10
  • Training is stuck at some point, I'm not sure if it is a PyTorch problem

    Expected results

    There should be no problem for training.

    Actual results

    Training is stuck at [Step 553061 / 720000]. GPU utilization is 0% but memory is not released. I waited for 2 days but it didn't resume, so I killed the job. It seems the problem is caused by a dataloader deadlock; I got the following message when I killed the job:

    Exception ignored in: <bound method _DataLoaderIter.__del__ of <torch.utils.data.dataloader._DataLoaderIter object at 0x7f33d805aa20>>
    Traceback (most recent call last):
      File "/home/bcheng/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 349, in __del__
    Process Process-19:
        self._shutdown_workers()
      File "/home/bcheng/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 340, in _shutdown_workers
        self.worker_result_queue.put(None)
      File "/home/bcheng/anaconda3/lib/python3.6/multiprocessing/queues.py", line 346, in put
        with self._wlock:
      File "/home/bcheng/anaconda3/lib/python3.6/multiprocessing/synchronize.py", line 96, in __enter__
        return self._semlock.__enter__()
    KeyboardInterrupt:
    Traceback (most recent call last):
      File "tools/train_net_step.py", line 415, in main
        input_data = next(dataiterator)
      File "/home/bcheng/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 276, in __next__
        raise StopIteration
    StopIteration
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "/home/bcheng/anaconda3/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
        self.run()
      File "/home/bcheng/anaconda3/lib/python3.6/multiprocessing/process.py", line 93, in run
        self._target(*self._args, **self._kwargs)
      File "/home/bcheng/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 61, in _worker_loop
        data_queue.put((idx, samples))
      File "/home/bcheng/anaconda3/lib/python3.6/multiprocessing/queues.py", line 346, in put
        with self._wlock:
      File "/home/bcheng/anaconda3/lib/python3.6/multiprocessing/synchronize.py", line 96, in __enter__
        return self._semlock.__enter__()
    KeyboardInterrupt
    
    INFO train_net_step.py: 442: Save ckpt on exception ...
    INFO train_net_step.py: 135: save model: Outputs/e2e_faster_rcnn_R-50-FPN_1x/Aug20-13-57-33_ifp-gup-03_step/ckpt/model_step553070.pth
    INFO train_net_step.py: 444: Save ckpt done.
    Traceback (most recent call last):
      File "tools/train_net_step.py", line 424, in main
        net_outputs = maskRCNN(**input_data)
      File "/home/bcheng/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
        result = self.forward(*input, **kwargs)
      File "/home/bcheng/Codes/github/Detectron.pytorch/lib/nn/parallel/data_parallel.py", line 108, in forward
        outputs = [self.module(*inputs[0], **kwargs[0])]
      File "/home/bcheng/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
        result = self.forward(*input, **kwargs)
      File "/home/bcheng/Codes/github/Detectron.pytorch/lib/modeling/model_builder.py", line 144, in forward
        return self._forward(data, im_info, roidb, **rpn_kwargs)
      File "/home/bcheng/Codes/github/Detectron.pytorch/lib/modeling/model_builder.py", line 175, in _forward
        box_feat = self.Box_Head(blob_conv, rpn_ret)
      File "/home/bcheng/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
        result = self.forward(*input, **kwargs)
      File "/home/bcheng/Codes/github/Detectron.pytorch/lib/modeling/fast_rcnn_heads.py", line 110, in forward
        sampling_ratio=cfg.FAST_RCNN.ROI_XFORM_SAMPLING_RATIO
      File "/home/bcheng/Codes/github/Detectron.pytorch/lib/modeling/model_builder.py", line 291, in roi_feature_transform
        resolution, resolution, sc, sampling_ratio)(bl_in, rois)
      File "/home/bcheng/Codes/github/Detectron.pytorch/lib/modeling/roi_xfrom/roi_align/functions/roi_align.py", line 16, in forward
        def forward(self, features, rois):
      File "/home/bcheng/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 178, in handler
        _error_if_any_worker_fails()
    RuntimeError: DataLoader worker (pid 10494) is killed by signal: Killed.
    

    Detailed steps to reproduce

    E.g.:

    CUDA_VISIBLE_DEVICES=2 python3 tools/train_net_step.py --dataset coco2017 --cfg configs/baselines/e2e_faster_rcnn_R-50-FPN_1x.yaml --bs 2 --nw 4
    

    System information

    • Operating system: Ubuntu 16.04.4
    • CUDA version: 9.1
    • cuDNN version: 7
    • GPU models (for all devices if they are not all the same): 1080ti
    • python version: 3.6.5
    • pytorch version: 0.4.0
    • Anything else that seems relevant: I did not modify any code.
    opened by bowenc0221 9
  • hangs in training

    Thanks for your code! I was able to successfully train configs/e2e_mask_rcnn_R-50-FPN_1x.yaml with (batch_size, learning_rate) = (8, 0.01) up to a certain number of iterations (max = ~60K). So far, the losses look quite similar to your benchmark, and training speed is also quite comparable to Detectron. The issue I'm having is that training hangs randomly at a certain iteration, which is not consistent from run to run: sometimes after 5K, 1K, or 60K iterations. I'm using 4 V-100 GPUs.

    Any thoughts?

    opened by fitsumreda 8
  • cuda runtime error: out of memory after 15K iterations of 4-GPU training

    Hi @roytseng-tw, I have successfully run train_net.py with the ResNet-C4 model (using 4 GPUs), but after 15K training steps I got a CUDA runtime error: out of memory.

    (error screenshots omitted)

    As I suspected, this was caused by the increasing GPU memory of the dynamic graph, e.g. accumulating with loss += xxx. Can you give some advice on solving this issue?

    opened by JackHenry1992 8
  • Unpickling error while training from scratch e2e mask rcnn for Resnet-50-C4 (1x)

    Conda 4.5, Python 3.6, Pytorch 0.3.1

    Traceback (most recent call last):
      File "tools/train_net_step.py", line 391, in <module>
        main()
      File "tools/train_net_step.py", line 222, in main
        maskRCNN = Generalized_RCNN()
    mask-rcnn.pytorch/lib/modeling/model_builder.py", line 98, in __init__
        self._init_modules()
    mask-rcnn.pytorch/lib/modeling/model_builder.py", line 102, in _init_modules
        resnet_utils.load_pretrained_imagenet_weights(self)
    /mask-rcnn.pytorch/lib/utils/resnet_weights_helper.py", line 21, in load_pretrained_imagenet_weights
        pretrianed_state_dict = convert_state_dict(torch.load(weights_file))
    lib/python3.6/site-packages/torch/serialization.py", line 267, in load
        return _load(f, map_location, pickle_module)
    lib/python3.6/site-packages/torch/serialization.py", line 410, in _load
        magic_number = pickle_module.load(f)
    _pickle.UnpicklingError: invalid load key, '<'.
    

    What am I missing? Please help.

    opened by adityarp9 7
  • coco eval performance

    Excellent work! Have you trained from scratch, and how is the performance on COCO evaluation? BTW, could you share some pretrained weights to test with? Thanks a lot!

    opened by hao522 7
  • UnicodeDecodeError: 'ascii' codec can't decode byte 0xf5 in position 1: ordinal not in range(128)

    When I run the inference command, the error is:

    UnicodeDecodeError: 'ascii' codec can't decode byte 0xf5 in position 1: ordinal not in range(128)

    How can I solve this problem? BTW, I checked the default encoding with python -c "import sys; print(sys.getdefaultencoding())"; the output is ascii.

    opened by Yinhance 0
  • undefined symbol: __cudaRegisterFatBinaryEnd

    I am getting the following error and I don't understand how to get rid of it, since my CUDA version and symlinks are correct.

    ImportError: /home/anam/Codes/mask-rcnn.pytorch/lib/model/roi_pooling/_ext/roi_pooling/_roi_pooling.so: undefined symbol: __cudaRegisterFatBinaryEnd

    The command I ran was just an attempt to run inference:

    python tools/infer_simple.py
    
    

    System information

    • Operating system: Linux 20.04
    • CUDA version: 10.0
    • cuDNN version: 7.5
    • GPU models (for all devices if they are not all the same): RTX
    • python version: 3.6.6
    • pytorch version: 0.3.1
    • Anything else that seems relevant: ?
    opened by ZahraAnam 1
  • driver 418, CUDA 10.1: can they run this project?

    (No details were provided beyond the unfilled issue template.)
    opened by henbucuoshanghai 0
  • AssertionError: Range subprocess failed (exit code: 1)

    Hi, it works when I run test_net.py with one GPU, but that is too slow, so I ran it with "--multi-gpu-testing". By the way, I have 4 GPUs. Is there anyone who can help? Thanks.

    opened by azhuantou 0
  • ModuleNotFoundError: No module named 'model.roi_pooling._ext.roi_pooling._roi_pooling'

    (No details were provided beyond the unfilled issue template.)
    opened by rmxhhh 0