Deformable DETR is an efficient and fast-converging end-to-end object detector.

Overview

Deformable DETR

By Xizhou Zhu, Weijie Su, Lewei Lu, Bin Li, Xiaogang Wang, Jifeng Dai.

This repository is an official implementation of the paper Deformable DETR: Deformable Transformers for End-to-End Object Detection.

Introduction

TL;DR. Deformable DETR is an efficient and fast-converging end-to-end object detector. It mitigates the high complexity and slow convergence issues of DETR via a novel sampling-based efficient attention mechanism.

Abstract. DETR has been recently proposed to eliminate the need for many hand-designed components in object detection while demonstrating good performance. However, it suffers from slow convergence and limited feature spatial resolution, due to the limitation of Transformer attention modules in processing image feature maps. To mitigate these issues, we propose Deformable DETR, whose attention modules only attend to a small set of key sampling points around a reference. Deformable DETR can achieve better performance than DETR (especially on small objects) with 10× fewer training epochs. Extensive experiments on the COCO benchmark demonstrate the effectiveness of our approach.
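To make the idea of attending to "a small set of key sampling points around a reference" concrete, here is a minimal pure-Python sketch for a single query, a single head, and a single feature level. It is an illustration under simplifying assumptions, not the repository's CUDA operator: the function names are hypothetical, and the sampling offsets and attention scores are passed in directly, whereas in the model they are predicted from the query feature.

```python
import math

def bilinear_sample(feature, x, y):
    """Bilinearly interpolate a 2D grid at continuous (x, y); taps outside the grid count as 0."""
    h, w = len(feature), len(feature[0])
    x0, y0 = math.floor(x), math.floor(y)
    value = 0.0
    for yi in (y0, y0 + 1):
        for xi in (x0, x0 + 1):
            if 0 <= xi < w and 0 <= yi < h:
                weight = (1 - abs(x - xi)) * (1 - abs(y - yi))
                value += weight * feature[yi][xi]
    return value

def softmax(scores):
    """Normalize raw scores so the attention weights sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def deformable_attention(feature, reference, offsets, scores):
    """Weighted sum of K values sampled at reference + offset (hypothetical helper)."""
    weights = softmax(scores)
    rx, ry = reference
    return sum(
        w * bilinear_sample(feature, rx + dx, ry + dy)
        for w, (dx, dy) in zip(weights, offsets)
    )

# Toy 4x4 feature map whose value at (x, y) is 4*y + x.
feature = [[float(4 * r + c) for c in range(4)] for r in range(4)]
out = deformable_attention(
    feature,
    reference=(1.0, 1.0),
    offsets=[(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (0.5, 0.5)],
    scores=[0.0, 0.0, 0.0, 0.0],  # equal scores give uniform weights
)
print(out)  # 6.875
```

The query touches only K = 4 sampled locations here, regardless of the size of the feature map, which is the source of the efficiency gain over attending to every pixel.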

License

This project is released under the Apache 2.0 license.

Changelog

See changelog.md for detailed logs of major changes.

Citing Deformable DETR

If you find Deformable DETR useful in your research, please consider citing:

@article{zhu2020deformable,
  title={Deformable DETR: Deformable Transformers for End-to-End Object Detection},
  author={Zhu, Xizhou and Su, Weijie and Lu, Lewei and Li, Bin and Wang, Xiaogang and Dai, Jifeng},
  journal={arXiv preprint arXiv:2010.04159},
  year={2020}
}

Main Results

| Method | Epochs | AP | AP_S | AP_M | AP_L | Params (M) | FLOPs (G) | Total Train Time (GPU hours) | Train Speed (GPU hours/epoch) | Infer Speed (FPS) | Batch Infer Speed (FPS) | URL |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Faster R-CNN + FPN | 109 | 42.0 | 26.6 | 45.4 | 53.4 | 42 | 180 | 380 | 3.5 | 25.6 | 28.0 | - |
| DETR | 500 | 42.0 | 20.5 | 45.8 | 61.1 | 41 | 86 | 2000 | 4.0 | 27.0 | 38.3 | - |
| DETR-DC5 | 500 | 43.3 | 22.5 | 47.3 | 61.1 | 41 | 187 | 7000 | 14.0 | 11.4 | 12.4 | - |
| DETR-DC5 | 50 | 35.3 | 15.2 | 37.5 | 53.6 | 41 | 187 | 700 | 14.0 | 11.4 | 12.4 | - |
| DETR-DC5+ | 50 | 36.2 | 16.3 | 39.2 | 53.9 | 41 | 187 | 700 | 14.0 | 11.4 | 12.4 | - |
| Deformable DETR (single scale) | 50 | 39.4 | 20.6 | 43.0 | 55.5 | 34 | 78 | 160 | 3.2 | 27.0 | 42.4 | config, log, model |
| Deformable DETR (single scale, DC5) | 50 | 41.5 | 24.1 | 45.3 | 56.0 | 34 | 128 | 215 | 4.3 | 22.1 | 29.4 | config, log, model |
| Deformable DETR | 50 | 44.5 | 27.1 | 47.6 | 59.6 | 40 | 173 | 325 | 6.5 | 15.0 | 19.4 | config, log, model |
| + iterative bounding box refinement | 50 | 46.2 | 28.3 | 49.2 | 61.5 | 41 | 173 | 325 | 6.5 | 15.0 | 19.4 | config, log, model |
| ++ two-stage Deformable DETR | 50 | 46.9 | 29.6 | 50.1 | 61.6 | 41 | 173 | 340 | 6.8 | 14.5 | 18.8 | config, log, model |

Note:

  1. All Deformable DETR models are trained with a total batch size of 32.
  2. Training and inference speeds are measured on an NVIDIA Tesla V100 GPU.
  3. "Deformable DETR (single scale)" means that only the res5 feature map (stride 32) is used as the input feature map for the Deformable Transformer Encoder.
  4. "DC5" means removing the stride in the C5 stage of ResNet and adding a dilation of 2 instead.
  5. "DETR-DC5+" indicates DETR-DC5 with some modifications, including using Focal Loss for bounding box classification and increasing the number of object queries to 300.
  6. "Batch Infer Speed" refers to inference with batch size = 4 to maximize GPU utilization.
  7. The original implementation is based on our internal codebase. There are slight differences in the final accuracy and running time due to the many details involved in the platform switch.
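As a quick sanity check on the Main Results table, the "Total Train Time" column appears to be the number of epochs multiplied by the per-epoch train speed. The sketch below recomputes it from values copied from the table. The relationship is inferred from the numbers rather than stated by the authors, and the Faster R-CNN row (109 × 3.5 = 381.5 versus a reported 380) suggests the reported totals are rounded.

```python
# (method, epochs, GPU hours per epoch, reported total GPU hours), copied from the Main Results table
rows = [
    ("Faster R-CNN + FPN", 109, 3.5, 380),
    ("DETR", 500, 4.0, 2000),
    ("DETR-DC5", 500, 14.0, 7000),
    ("DETR-DC5 (50 epochs)", 50, 14.0, 700),
    ("Deformable DETR (single scale)", 50, 3.2, 160),
    ("Deformable DETR (single scale, DC5)", 50, 4.3, 215),
    ("Deformable DETR", 50, 6.5, 325),
    ("++ two-stage Deformable DETR", 50, 6.8, 340),
]

def estimated_total_hours(epochs, hours_per_epoch):
    """Assumed relationship: total GPU hours = epochs * GPU hours per epoch."""
    return epochs * hours_per_epoch

for name, epochs, per_epoch, reported in rows:
    estimate = estimated_total_hours(epochs, per_epoch)
    print(f"{name}: {estimate:.1f} estimated vs {reported} reported")
```

Read this way, the 50-epoch Deformable DETR schedule costs 325 GPU hours against 7000 for the 500-epoch DETR-DC5 schedule.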

Installation

Requirements

  • Linux, CUDA>=9.2, GCC>=5.4

  • Python>=3.7

    We recommend using Anaconda to create a conda environment:

    conda create -n deformable_detr python=3.7 pip

    Then, activate the environment:

    conda activate deformable_detr
  • PyTorch>=1.5.1, torchvision>=0.6.1 (following instructions here)

    For example, if your CUDA version is 9.2, you could install pytorch and torchvision as follows:

    conda install pytorch=1.5.1 torchvision=0.6.1 cudatoolkit=9.2 -c pytorch
  • Other requirements

    pip install -r requirements.txt

Compiling CUDA operators

cd ./models/ops
sh ./make.sh
# unit test (should see all checking is True)
python test.py
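If the build fails with messages such as "nvcc: not found" or a missing "cuda_runtime_api.h", the CUDA toolkit location is a likely suspect. The following is a small pre-build check, a sketch that assumes the operators are built through PyTorch's extension builder, which consults the CUDA_HOME environment variable. The helper name check_cuda_toolchain is hypothetical and not part of this repository.

```python
import os
import shutil

def check_cuda_toolchain(cuda_home, path_lookup=shutil.which):
    """Return a list of human-readable problems; an empty list means none were spotted."""
    problems = []
    if not cuda_home:
        problems.append("CUDA_HOME is not set")
        return problems
    # A stray ':' (for example from a malformed export line) corrupts every derived path.
    if os.pathsep in cuda_home:
        problems.append(f"CUDA_HOME contains a path separator: {cuda_home!r}")
    nvcc = os.path.join(cuda_home, "bin", "nvcc")
    if not os.path.isfile(nvcc):
        problems.append(f"nvcc not found at {nvcc}")
    if path_lookup("nvcc") is None:
        problems.append("nvcc is not on PATH")
    return problems

# Report anything suspicious about the current environment before running make.sh.
for problem in check_cuda_toolchain(os.environ.get("CUDA_HOME", "")):
    print("warning:", problem)
```

An empty result does not guarantee a successful build; it only rules out the most common path problems.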

Usage

Dataset preparation

Please download the COCO 2017 dataset and organize it as follows:

code_root/
└── data/
    └── coco/
        ├── train2017/
        ├── val2017/
        └── annotations/
            ├── instances_train2017.json
            └── instances_val2017.json
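Before launching a long training run, it can help to confirm that the layout above is in place. The snippet below is a minimal sketch that checks for the four expected entries. The helper name missing_coco_entries is hypothetical, and the default path assumes the script is run from code_root.

```python
from pathlib import Path

# Entries expected under data/coco/, taken from the directory tree above.
EXPECTED = [
    "train2017",
    "val2017",
    "annotations/instances_train2017.json",
    "annotations/instances_val2017.json",
]

def missing_coco_entries(coco_root):
    """Return the expected relative paths that do not exist under coco_root."""
    root = Path(coco_root)
    return [rel for rel in EXPECTED if not (root / rel).exists()]

missing = missing_coco_entries("data/coco")
print("layout OK" if not missing else f"missing: {missing}")
```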

Training

Training on single node

For example, the command for training Deformable DETR on 8 GPUs is as follows:

GPUS_PER_NODE=8 ./tools/run_dist_launch.sh 8 ./configs/r50_deformable_detr.sh

Training on multiple nodes

For example, the command for training Deformable DETR on 2 nodes with 8 GPUs each is as follows:

On node 1:

MASTER_ADDR=<IP address of node 1> NODE_RANK=0 GPUS_PER_NODE=8 ./tools/run_dist_launch.sh 16 ./configs/r50_deformable_detr.sh

On node 2:

MASTER_ADDR=<IP address of node 1> NODE_RANK=1 GPUS_PER_NODE=8 ./tools/run_dist_launch.sh 16 ./configs/r50_deformable_detr.sh
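Judging from the commands above, the first argument to run_dist_launch.sh is the total number of GPUs across all nodes (8 for one node, 16 for two nodes of 8), and NODE_RANK counts up from 0. Under that reading, the per-node commands can be generated as in the sketch below. The helper name launch_commands and the address 10.0.0.1 are placeholders for illustration only.

```python
def launch_commands(master_addr, nodes, gpus_per_node,
                    config="./configs/r50_deformable_detr.sh"):
    """Build one launch command per node; the total GPU count is nodes * gpus_per_node."""
    total_gpus = nodes * gpus_per_node
    return [
        f"MASTER_ADDR={master_addr} NODE_RANK={rank} GPUS_PER_NODE={gpus_per_node} "
        f"./tools/run_dist_launch.sh {total_gpus} {config}"
        for rank in range(nodes)
    ]

# 10.0.0.1 stands in for the IP address of node 1.
for command in launch_commands("10.0.0.1", nodes=2, gpus_per_node=8):
    print(command)
```

Every node uses the same MASTER_ADDR, which is the address of the node with NODE_RANK=0.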

Training on slurm cluster

If you are using a Slurm cluster, you can run the following command to train on 1 node with 8 GPUs:

GPUS_PER_NODE=8 ./tools/run_dist_slurm.sh <partition> deformable_detr 8 configs/r50_deformable_detr.sh

Or on 2 nodes with 8 GPUs each:

GPUS_PER_NODE=8 ./tools/run_dist_slurm.sh <partition> deformable_detr 16 configs/r50_deformable_detr.sh

Some tips to speed up training

  • If your file system is slow to read images, you may consider enabling the '--cache_mode' option to load the whole dataset into memory at the beginning of training.
  • You may increase the batch size to maximize GPU utilization, according to your GPU memory, e.g., set '--batch_size 3' or '--batch_size 4'.
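When changing '--batch_size', keep in mind that the reported results use a total batch size of 32. The sketch below relates the two, assuming that '--batch_size' is a per-GPU value (an assumption here, not something this section states). Both helper names are hypothetical.

```python
def total_batch_size(num_gpus, batch_size_per_gpu):
    """Total batch size if every GPU processes batch_size_per_gpu images per step."""
    return num_gpus * batch_size_per_gpu

def gpus_needed(target_total, batch_size_per_gpu):
    """Number of GPUs required to reach target_total exactly."""
    if target_total % batch_size_per_gpu:
        raise ValueError(
            f"{target_total} is not divisible by a per-GPU batch size of {batch_size_per_gpu}"
        )
    return target_total // batch_size_per_gpu

for per_gpu in (2, 4):
    print(f"--batch_size {per_gpu}: {gpus_needed(32, per_gpu)} GPUs for a total of 32")
```

A per-GPU batch size that does not divide 32, such as 3, cannot reproduce that total exactly, so accuracy may differ slightly from the table.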

Evaluation

You can get the config file and pretrained model of Deformable DETR (the links are in the "Main Results" section), then run the following command to evaluate it on the COCO 2017 validation set:

<path to config file> --resume <path to pre-trained model> --eval

You can also run distributed evaluation by using ./tools/run_dist_launch.sh or ./tools/run_dist_slurm.sh.

Comments
  • make.sh

    When I run sh ./make.sh, I get this message:

    Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N) [1/3] :/home1/hli/cuda10.2/bin/nvcc -DWITH_CUDA -I/home1/hli/Deformable-DETR/models/ops/src -I/home1/hli/anaconda3/envs/detr/lib/python3.7/site-packages/torch/include -I/home1/hli/anaconda3/envs/detr/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -I/home1/hli/anaconda3/envs/detr/lib/python3.7/site-packages/torch/include/TH -I/home1/hli/anaconda3/envs/detr/lib/python3.7/site-packages/torch/include/THC -I:/home1/hli/cuda10.2/include -I/home1/hli/anaconda3/envs/detr/include/python3.7m -c -c /home1/hli/Deformable-DETR/models/ops/src/cuda/ms_deform_attn_cuda.cu -o /home1/hli/Deformable-DETR/models/ops/build/temp.linux-x86_64-3.7/home1/hli/Deformable-DETR/models/ops/src/cuda/ms_deform_attn_cuda.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -DCUDA_HAS_FP16=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=MultiScaleDeformableAttention -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_75,code=sm_75 -std=c++14 FAILED: /home1/hli/Deformable-DETR/models/ops/build/temp.linux-x86_64-3.7/home1/hli/Deformable-DETR/models/ops/src/cuda/ms_deform_attn_cuda.o :/home1/hli/cuda10.2/bin/nvcc -DWITH_CUDA -I/home1/hli/Deformable-DETR/models/ops/src -I/home1/hli/anaconda3/envs/detr/lib/python3.7/site-packages/torch/include -I/home1/hli/anaconda3/envs/detr/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -I/home1/hli/anaconda3/envs/detr/lib/python3.7/site-packages/torch/include/TH -I/home1/hli/anaconda3/envs/detr/lib/python3.7/site-packages/torch/include/THC -I:/home1/hli/cuda10.2/include -I/home1/hli/anaconda3/envs/detr/include/python3.7m -c -c /home1/hli/Deformable-DETR/models/ops/src/cuda/ms_deform_attn_cuda.cu -o 
/home1/hli/Deformable-DETR/models/ops/build/temp.linux-x86_64-3.7/home1/hli/Deformable-DETR/models/ops/src/cuda/ms_deform_attn_cuda.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -DCUDA_HAS_FP16=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=MultiScaleDeformableAttention -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_75,code=sm_75 -std=c++14 /bin/sh: 1: :/home1/hli/cuda10.2/bin/nvcc: not found [2/3] c++ -MMD -MF /home1/hli/Deformable-DETR/models/ops/build/temp.linux-x86_64-3.7/home1/hli/Deformable-DETR/models/ops/src/cpu/ms_deform_attn_cpu.o.d -pthread -B /home1/hli/anaconda3/envs/detr/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DWITH_CUDA -I/home1/hli/Deformable-DETR/models/ops/src -I/home1/hli/anaconda3/envs/detr/lib/python3.7/site-packages/torch/include -I/home1/hli/anaconda3/envs/detr/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -I/home1/hli/anaconda3/envs/detr/lib/python3.7/site-packages/torch/include/TH -I/home1/hli/anaconda3/envs/detr/lib/python3.7/site-packages/torch/include/THC -I:/home1/hli/cuda10.2/include -I/home1/hli/anaconda3/envs/detr/include/python3.7m -c -c /home1/hli/Deformable-DETR/models/ops/src/cpu/ms_deform_attn_cpu.cpp -o /home1/hli/Deformable-DETR/models/ops/build/temp.linux-x86_64-3.7/home1/hli/Deformable-DETR/models/ops/src/cpu/ms_deform_attn_cpu.o -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=MultiScaleDeformableAttention -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++14 FAILED: /home1/hli/Deformable-DETR/models/ops/build/temp.linux-x86_64-3.7/home1/hli/Deformable-DETR/models/ops/src/cpu/ms_deform_attn_cpu.o c++ -MMD -MF /home1/hli/Deformable-DETR/models/ops/build/temp.linux-x86_64-3.7/home1/hli/Deformable-DETR/models/ops/src/cpu/ms_deform_attn_cpu.o.d 
-pthread -B /home1/hli/anaconda3/envs/detr/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DWITH_CUDA -I/home1/hli/Deformable-DETR/models/ops/src -I/home1/hli/anaconda3/envs/detr/lib/python3.7/site-packages/torch/include -I/home1/hli/anaconda3/envs/detr/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -I/home1/hli/anaconda3/envs/detr/lib/python3.7/site-packages/torch/include/TH -I/home1/hli/anaconda3/envs/detr/lib/python3.7/site-packages/torch/include/THC -I:/home1/hli/cuda10.2/include -I/home1/hli/anaconda3/envs/detr/include/python3.7m -c -c /home1/hli/Deformable-DETR/models/ops/src/cpu/ms_deform_attn_cpu.cpp -o /home1/hli/Deformable-DETR/models/ops/build/temp.linux-x86_64-3.7/home1/hli/Deformable-DETR/models/ops/src/cpu/ms_deform_attn_cpu.o -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=MultiScaleDeformableAttention -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++14 cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++ In file included from /home1/hli/Deformable-DETR/models/ops/src/cpu/ms_deform_attn_cpu.cpp:14:0: /home1/hli/anaconda3/envs/detr/lib/python3.7/site-packages/torch/include/ATen/cuda/CUDAContext.h:5:30: fatal error: cuda_runtime_api.h: No such file or directory compilation terminated.

    My CUDA version is 10.2.89; the other settings are the same as yours (torch 1.5.1, Python 3.7). Can you help me? Thanks.

    opened by lhlm1994 16
  • Mismatch in loading model

    Hi @jackroos, I am trying to run the code using the provided model weights. I used r50_deformable_detr_plus_iterative_bbox_refinement_plus_plus_two_stage-checkpoint.pth and the respective config. When loading the model, it reports a mismatch in the transformer shapes. Is the checkpoint correct?

    opened by munirfarzeen 10
  • AP=0

    The AP of the pre-trained model is normal, but the AP of the fine-tuned model is 0. The loss decreases normally, and class_error also decreases. Reducing the learning rate does not solve it, which is very frustrating. I suspect something is wrong with the code.

    opened by 1757525671 9
  • How to train on a single GPU ?

    Can someone explain how to train on a single GPU?

    I'm using the DETR base to train, but it gives an error after executing the line:

    !python main.py --dataset_file custom --data_path /content/coco --output_dir /content/Output --batch_size 2

    The error is as follows:

    `Not using distributed mode git: sha: 11169a60c33333af00a4849f1808023eba96a931, status: has uncommited changes, branch: main

    Namespace(aux_loss=True, backbone='resnet50', batch_size=2, bbox_loss_coef=5, cache_mode=False, clip_max_norm=0.1, cls_loss_coef=2, coco_panoptic_path=None, data_path='/content/coco', dataset_file='custom', dec_layers=6, dec_n_points=4, device='cuda', dice_loss_coef=1, dilation=False, dim_feedforward=1024, distributed=False, dropout=0.1, enc_layers=6, enc_n_points=4, epochs=50, eval=False, focal_alpha=0.25, frozen_weights=None, giou_loss_coef=2, hidden_dim=256, lr=0.0002, lr_backbone=2e-05, lr_backbone_names=['backbone.0'], lr_drop=40, lr_drop_epochs=None, lr_linear_proj_mult=0.1, lr_linear_proj_names=['reference_points', 'sampling_offsets'], mask_loss_coef=1, masks=False, nheads=8, num_feature_levels=4, num_queries=300, num_workers=2, output_dir='/content/Output', position_embedding='sine', position_embedding_scale=6.283185307179586, remove_difficult=False, resume='', seed=42, set_cost_bbox=5, set_cost_class=2, set_cost_giou=2, sgd=False, start_epoch=0, two_stage=False, weight_decay=0.0001, with_box_refine=False) number of params: 39824906 loading annotations into memory... Done (t=0.10s) creating index... index created! loading annotations into memory... Done (t=0.01s) creating index... index created! 
transformer.level_embed transformer.encoder.layers.0.self_attn.sampling_offsets.weight transformer.encoder.layers.0.self_attn.sampling_offsets.bias transformer.encoder.layers.0.self_attn.attention_weights.weight transformer.encoder.layers.0.self_attn.attention_weights.bias transformer.encoder.layers.0.self_attn.value_proj.weight transformer.encoder.layers.0.self_attn.value_proj.bias transformer.encoder.layers.0.self_attn.output_proj.weight transformer.encoder.layers.0.self_attn.output_proj.bias transformer.encoder.layers.0.norm1.weight transformer.encoder.layers.0.norm1.bias transformer.encoder.layers.0.linear1.weight transformer.encoder.layers.0.linear1.bias transformer.encoder.layers.0.linear2.weight transformer.encoder.layers.0.linear2.bias transformer.encoder.layers.0.norm2.weight transformer.encoder.layers.0.norm2.bias transformer.encoder.layers.1.self_attn.sampling_offsets.weight transformer.encoder.layers.1.self_attn.sampling_offsets.bias transformer.encoder.layers.1.self_attn.attention_weights.weight transformer.encoder.layers.1.self_attn.attention_weights.bias transformer.encoder.layers.1.self_attn.value_proj.weight transformer.encoder.layers.1.self_attn.value_proj.bias transformer.encoder.layers.1.self_attn.output_proj.weight transformer.encoder.layers.1.self_attn.output_proj.bias transformer.encoder.layers.1.norm1.weight transformer.encoder.layers.1.norm1.bias transformer.encoder.layers.1.linear1.weight transformer.encoder.layers.1.linear1.bias transformer.encoder.layers.1.linear2.weight transformer.encoder.layers.1.linear2.bias transformer.encoder.layers.1.norm2.weight transformer.encoder.layers.1.norm2.bias transformer.encoder.layers.2.self_attn.sampling_offsets.weight transformer.encoder.layers.2.self_attn.sampling_offsets.bias transformer.encoder.layers.2.self_attn.attention_weights.weight transformer.encoder.layers.2.self_attn.attention_weights.bias transformer.encoder.layers.2.self_attn.value_proj.weight 
transformer.encoder.layers.2.self_attn.value_proj.bias transformer.encoder.layers.2.self_attn.output_proj.weight transformer.encoder.layers.2.self_attn.output_proj.bias transformer.encoder.layers.2.norm1.weight transformer.encoder.layers.2.norm1.bias transformer.encoder.layers.2.linear1.weight transformer.encoder.layers.2.linear1.bias transformer.encoder.layers.2.linear2.weight transformer.encoder.layers.2.linear2.bias transformer.encoder.layers.2.norm2.weight transformer.encoder.layers.2.norm2.bias transformer.encoder.layers.3.self_attn.sampling_offsets.weight transformer.encoder.layers.3.self_attn.sampling_offsets.bias transformer.encoder.layers.3.self_attn.attention_weights.weight transformer.encoder.layers.3.self_attn.attention_weights.bias transformer.encoder.layers.3.self_attn.value_proj.weight transformer.encoder.layers.3.self_attn.value_proj.bias transformer.encoder.layers.3.self_attn.output_proj.weight transformer.encoder.layers.3.self_attn.output_proj.bias transformer.encoder.layers.3.norm1.weight transformer.encoder.layers.3.norm1.bias transformer.encoder.layers.3.linear1.weight transformer.encoder.layers.3.linear1.bias transformer.encoder.layers.3.linear2.weight transformer.encoder.layers.3.linear2.bias transformer.encoder.layers.3.norm2.weight transformer.encoder.layers.3.norm2.bias transformer.encoder.layers.4.self_attn.sampling_offsets.weight transformer.encoder.layers.4.self_attn.sampling_offsets.bias transformer.encoder.layers.4.self_attn.attention_weights.weight transformer.encoder.layers.4.self_attn.attention_weights.bias transformer.encoder.layers.4.self_attn.value_proj.weight transformer.encoder.layers.4.self_attn.value_proj.bias transformer.encoder.layers.4.self_attn.output_proj.weight transformer.encoder.layers.4.self_attn.output_proj.bias transformer.encoder.layers.4.norm1.weight transformer.encoder.layers.4.norm1.bias transformer.encoder.layers.4.linear1.weight transformer.encoder.layers.4.linear1.bias 
transformer.encoder.layers.4.linear2.weight transformer.encoder.layers.4.linear2.bias transformer.encoder.layers.4.norm2.weight transformer.encoder.layers.4.norm2.bias transformer.encoder.layers.5.self_attn.sampling_offsets.weight transformer.encoder.layers.5.self_attn.sampling_offsets.bias transformer.encoder.layers.5.self_attn.attention_weights.weight transformer.encoder.layers.5.self_attn.attention_weights.bias transformer.encoder.layers.5.self_attn.value_proj.weight transformer.encoder.layers.5.self_attn.value_proj.bias transformer.encoder.layers.5.self_attn.output_proj.weight transformer.encoder.layers.5.self_attn.output_proj.bias transformer.encoder.layers.5.norm1.weight transformer.encoder.layers.5.norm1.bias transformer.encoder.layers.5.linear1.weight transformer.encoder.layers.5.linear1.bias transformer.encoder.layers.5.linear2.weight transformer.encoder.layers.5.linear2.bias transformer.encoder.layers.5.norm2.weight transformer.encoder.layers.5.norm2.bias transformer.decoder.layers.0.cross_attn.sampling_offsets.weight transformer.decoder.layers.0.cross_attn.sampling_offsets.bias transformer.decoder.layers.0.cross_attn.attention_weights.weight transformer.decoder.layers.0.cross_attn.attention_weights.bias transformer.decoder.layers.0.cross_attn.value_proj.weight transformer.decoder.layers.0.cross_attn.value_proj.bias transformer.decoder.layers.0.cross_attn.output_proj.weight transformer.decoder.layers.0.cross_attn.output_proj.bias transformer.decoder.layers.0.norm1.weight transformer.decoder.layers.0.norm1.bias transformer.decoder.layers.0.self_attn.in_proj_weight transformer.decoder.layers.0.self_attn.in_proj_bias transformer.decoder.layers.0.self_attn.out_proj.weight transformer.decoder.layers.0.self_attn.out_proj.bias transformer.decoder.layers.0.norm2.weight transformer.decoder.layers.0.norm2.bias transformer.decoder.layers.0.linear1.weight transformer.decoder.layers.0.linear1.bias transformer.decoder.layers.0.linear2.weight 
transformer.decoder.layers.0.linear2.bias transformer.decoder.layers.0.norm3.weight transformer.decoder.layers.0.norm3.bias transformer.decoder.layers.1.cross_attn.sampling_offsets.weight transformer.decoder.layers.1.cross_attn.sampling_offsets.bias transformer.decoder.layers.1.cross_attn.attention_weights.weight transformer.decoder.layers.1.cross_attn.attention_weights.bias transformer.decoder.layers.1.cross_attn.value_proj.weight transformer.decoder.layers.1.cross_attn.value_proj.bias transformer.decoder.layers.1.cross_attn.output_proj.weight transformer.decoder.layers.1.cross_attn.output_proj.bias transformer.decoder.layers.1.norm1.weight transformer.decoder.layers.1.norm1.bias transformer.decoder.layers.1.self_attn.in_proj_weight transformer.decoder.layers.1.self_attn.in_proj_bias transformer.decoder.layers.1.self_attn.out_proj.weight transformer.decoder.layers.1.self_attn.out_proj.bias transformer.decoder.layers.1.norm2.weight transformer.decoder.layers.1.norm2.bias transformer.decoder.layers.1.linear1.weight transformer.decoder.layers.1.linear1.bias transformer.decoder.layers.1.linear2.weight transformer.decoder.layers.1.linear2.bias transformer.decoder.layers.1.norm3.weight transformer.decoder.layers.1.norm3.bias transformer.decoder.layers.2.cross_attn.sampling_offsets.weight transformer.decoder.layers.2.cross_attn.sampling_offsets.bias transformer.decoder.layers.2.cross_attn.attention_weights.weight transformer.decoder.layers.2.cross_attn.attention_weights.bias transformer.decoder.layers.2.cross_attn.value_proj.weight transformer.decoder.layers.2.cross_attn.value_proj.bias transformer.decoder.layers.2.cross_attn.output_proj.weight transformer.decoder.layers.2.cross_attn.output_proj.bias transformer.decoder.layers.2.norm1.weight transformer.decoder.layers.2.norm1.bias transformer.decoder.layers.2.self_attn.in_proj_weight transformer.decoder.layers.2.self_attn.in_proj_bias transformer.decoder.layers.2.self_attn.out_proj.weight 
transformer.decoder.layers.2.self_attn.out_proj.bias transformer.decoder.layers.2.norm2.weight transformer.decoder.layers.2.norm2.bias transformer.decoder.layers.2.linear1.weight transformer.decoder.layers.2.linear1.bias transformer.decoder.layers.2.linear2.weight transformer.decoder.layers.2.linear2.bias transformer.decoder.layers.2.norm3.weight transformer.decoder.layers.2.norm3.bias transformer.decoder.layers.3.cross_attn.sampling_offsets.weight transformer.decoder.layers.3.cross_attn.sampling_offsets.bias transformer.decoder.layers.3.cross_attn.attention_weights.weight transformer.decoder.layers.3.cross_attn.attention_weights.bias transformer.decoder.layers.3.cross_attn.value_proj.weight transformer.decoder.layers.3.cross_attn.value_proj.bias transformer.decoder.layers.3.cross_attn.output_proj.weight transformer.decoder.layers.3.cross_attn.output_proj.bias transformer.decoder.layers.3.norm1.weight transformer.decoder.layers.3.norm1.bias transformer.decoder.layers.3.self_attn.in_proj_weight transformer.decoder.layers.3.self_attn.in_proj_bias transformer.decoder.layers.3.self_attn.out_proj.weight transformer.decoder.layers.3.self_attn.out_proj.bias transformer.decoder.layers.3.norm2.weight transformer.decoder.layers.3.norm2.bias transformer.decoder.layers.3.linear1.weight transformer.decoder.layers.3.linear1.bias transformer.decoder.layers.3.linear2.weight transformer.decoder.layers.3.linear2.bias transformer.decoder.layers.3.norm3.weight transformer.decoder.layers.3.norm3.bias transformer.decoder.layers.4.cross_attn.sampling_offsets.weight transformer.decoder.layers.4.cross_attn.sampling_offsets.bias transformer.decoder.layers.4.cross_attn.attention_weights.weight transformer.decoder.layers.4.cross_attn.attention_weights.bias transformer.decoder.layers.4.cross_attn.value_proj.weight transformer.decoder.layers.4.cross_attn.value_proj.bias transformer.decoder.layers.4.cross_attn.output_proj.weight transformer.decoder.layers.4.cross_attn.output_proj.bias 
transformer.decoder.layers.4.norm1.weight transformer.decoder.layers.4.norm1.bias transformer.decoder.layers.4.self_attn.in_proj_weight transformer.decoder.layers.4.self_attn.in_proj_bias transformer.decoder.layers.4.self_attn.out_proj.weight transformer.decoder.layers.4.self_attn.out_proj.bias transformer.decoder.layers.4.norm2.weight transformer.decoder.layers.4.norm2.bias transformer.decoder.layers.4.linear1.weight transformer.decoder.layers.4.linear1.bias transformer.decoder.layers.4.linear2.weight transformer.decoder.layers.4.linear2.bias transformer.decoder.layers.4.norm3.weight transformer.decoder.layers.4.norm3.bias transformer.decoder.layers.5.cross_attn.sampling_offsets.weight transformer.decoder.layers.5.cross_attn.sampling_offsets.bias transformer.decoder.layers.5.cross_attn.attention_weights.weight transformer.decoder.layers.5.cross_attn.attention_weights.bias transformer.decoder.layers.5.cross_attn.value_proj.weight transformer.decoder.layers.5.cross_attn.value_proj.bias transformer.decoder.layers.5.cross_attn.output_proj.weight transformer.decoder.layers.5.cross_attn.output_proj.bias transformer.decoder.layers.5.norm1.weight transformer.decoder.layers.5.norm1.bias transformer.decoder.layers.5.self_attn.in_proj_weight transformer.decoder.layers.5.self_attn.in_proj_bias transformer.decoder.layers.5.self_attn.out_proj.weight transformer.decoder.layers.5.self_attn.out_proj.bias transformer.decoder.layers.5.norm2.weight transformer.decoder.layers.5.norm2.bias transformer.decoder.layers.5.linear1.weight transformer.decoder.layers.5.linear1.bias transformer.decoder.layers.5.linear2.weight transformer.decoder.layers.5.linear2.bias transformer.decoder.layers.5.norm3.weight transformer.decoder.layers.5.norm3.bias transformer.reference_points.weight transformer.reference_points.bias class_embed.0.weight class_embed.0.bias bbox_embed.0.layers.0.weight bbox_embed.0.layers.0.bias bbox_embed.0.layers.1.weight bbox_embed.0.layers.1.bias bbox_embed.0.layers.2.weight 
bbox_embed.0.layers.2.bias query_embed.weight input_proj.0.0.weight input_proj.0.0.bias input_proj.0.1.weight input_proj.0.1.bias input_proj.1.0.weight input_proj.1.0.bias input_proj.1.1.weight input_proj.1.1.bias input_proj.2.0.weight input_proj.2.0.bias input_proj.2.1.weight input_proj.2.1.bias input_proj.3.0.weight input_proj.3.0.bias input_proj.3.1.weight input_proj.3.1.bias backbone.0.body.conv1.weight backbone.0.body.layer1.0.conv1.weight backbone.0.body.layer1.0.conv2.weight backbone.0.body.layer1.0.conv3.weight backbone.0.body.layer1.0.downsample.0.weight backbone.0.body.layer1.1.conv1.weight backbone.0.body.layer1.1.conv2.weight backbone.0.body.layer1.1.conv3.weight backbone.0.body.layer1.2.conv1.weight backbone.0.body.layer1.2.conv2.weight backbone.0.body.layer1.2.conv3.weight backbone.0.body.layer2.0.conv1.weight backbone.0.body.layer2.0.conv2.weight backbone.0.body.layer2.0.conv3.weight backbone.0.body.layer2.0.downsample.0.weight backbone.0.body.layer2.1.conv1.weight backbone.0.body.layer2.1.conv2.weight backbone.0.body.layer2.1.conv3.weight backbone.0.body.layer2.2.conv1.weight backbone.0.body.layer2.2.conv2.weight backbone.0.body.layer2.2.conv3.weight backbone.0.body.layer2.3.conv1.weight backbone.0.body.layer2.3.conv2.weight backbone.0.body.layer2.3.conv3.weight backbone.0.body.layer3.0.conv1.weight backbone.0.body.layer3.0.conv2.weight backbone.0.body.layer3.0.conv3.weight backbone.0.body.layer3.0.downsample.0.weight backbone.0.body.layer3.1.conv1.weight backbone.0.body.layer3.1.conv2.weight backbone.0.body.layer3.1.conv3.weight backbone.0.body.layer3.2.conv1.weight backbone.0.body.layer3.2.conv2.weight backbone.0.body.layer3.2.conv3.weight backbone.0.body.layer3.3.conv1.weight backbone.0.body.layer3.3.conv2.weight backbone.0.body.layer3.3.conv3.weight backbone.0.body.layer3.4.conv1.weight backbone.0.body.layer3.4.conv2.weight backbone.0.body.layer3.4.conv3.weight backbone.0.body.layer3.5.conv1.weight backbone.0.body.layer3.5.conv2.weight 
backbone.0.body.layer3.5.conv3.weight backbone.0.body.layer4.0.conv1.weight backbone.0.body.layer4.0.conv2.weight backbone.0.body.layer4.0.conv3.weight backbone.0.body.layer4.0.downsample.0.weight backbone.0.body.layer4.1.conv1.weight backbone.0.body.layer4.1.conv2.weight backbone.0.body.layer4.1.conv3.weight backbone.0.body.layer4.2.conv1.weight backbone.0.body.layer4.2.conv2.weight backbone.0.body.layer4.2.conv3.weight Start training Traceback (most recent call last): File "main.py", line 333, in main(args) File "main.py", line 282, in main model, criterion, data_loader_train, optimizer, device, epoch, args.clip_max_norm) File "/content/Deformable-DETR1/engine.py", line 43, in train_one_epoch loss_dict = criterion(outputs, targets) File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 889, in _call_impl result = self.forward(*input, **kwargs) File "/content/Deformable-DETR1/models/deformable_detr.py", line 342, in forward indices = self.matcher(outputs_without_aux, targets) File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 889, in _call_impl result = self.forward(input, kwargs) File "/content/Deformable-DETR1/models/matcher.py", line 88, in forward cost_giou = -generalized_box_iou(box_cxcywh_to_xyxy(out_bbox), File "/content/Deformable-DETR1/util/box_ops.py", line 19, in box_cxcywh_to_xyxy b = [(x_c - 0.5 * w), (y_c - 0.5 * h), RuntimeError: CUDA error: device-side assert triggered /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [35,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [39,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. 
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [43,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [47,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [51,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [55,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [59,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [63,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [3,0,0], thread: [35,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [3,0,0], thread: [39,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [3,0,0], thread: [43,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [3,0,0], thread: [47,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. 
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [3,0,0], thread: [51,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [3,0,0], thread: [55,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [3,0,0], thread: [59,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [3,0,0], thread: [63,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [3,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [7,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [11,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [15,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [19,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [23,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [27,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. 
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [31,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [3,0,0], thread: [3,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [3,0,0], thread: [7,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [3,0,0], thread: [11,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [3,0,0], thread: [15,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [3,0,0], thread: [19,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [3,0,0], thread: [23,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [3,0,0], thread: [27,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [3,0,0], thread: [31,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [99,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. 
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [103,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [107,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [111,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [115,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [119,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [123,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [127,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [3,0,0], thread: [99,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [3,0,0], thread: [103,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [3,0,0], thread: [107,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. 
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [3,0,0], thread: [111,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [3,0,0], thread: [115,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [3,0,0], thread: [119,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [3,0,0], thread: [123,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [3,0,0], thread: [127,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [67,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [71,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [75,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [79,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [83,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. 
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [87,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [91,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [95,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [3,0,0], thread: [67,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [3,0,0], thread: [71,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [3,0,0], thread: [75,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [3,0,0], thread: [79,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [3,0,0], thread: [83,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [3,0,0], thread: [87,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [3,0,0], thread: [91,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. 
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [3,0,0], thread: [95,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [4,0,0], thread: [35,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [4,0,0], thread: [39,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [4,0,0], thread: [43,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [4,0,0], thread: [47,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [4,0,0], thread: [51,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [4,0,0], thread: [55,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [4,0,0], thread: [59,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [4,0,0], thread: [63,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [4,0,0], thread: [3,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [4,0,0], thread: [7,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. 
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [4,0,0], thread: [11,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [4,0,0], thread: [15,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [4,0,0], thread: [19,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [4,0,0], thread: [23,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [4,0,0], thread: [27,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [4,0,0], thread: [31,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [4,0,0], thread: [99,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [4,0,0], thread: [103,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [4,0,0], thread: [107,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [4,0,0], thread: [111,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. 
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [4,0,0], thread: [115,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [4,0,0], thread: [119,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [4,0,0], thread: [123,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [4,0,0], thread: [127,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [4,0,0], thread: [67,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [4,0,0], thread: [71,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [4,0,0], thread: [75,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [4,0,0], thread: [79,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [4,0,0], thread: [83,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [4,0,0], thread: [87,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. 
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [4,0,0], thread: [91,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [4,0,0], thread: [95,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [2,0,0], thread: [3,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [2,0,0], thread: [7,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [2,0,0], thread: [11,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [2,0,0], thread: [15,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [2,0,0], thread: [19,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [2,0,0], thread: [23,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [2,0,0], thread: [27,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [2,0,0], thread: [31,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [2,0,0], thread: [35,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. 
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [2,0,0], thread: [39,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [2,0,0], thread: [43,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [2,0,0], thread: [47,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [2,0,0], thread: [51,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [2,0,0], thread: [55,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [2,0,0], thread: [59,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [2,0,0], thread: [63,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [2,0,0], thread: [99,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [2,0,0], thread: [103,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [2,0,0], thread: [107,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. 
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [2,0,0], thread: [111,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [2,0,0], thread: [115,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [2,0,0], thread: [119,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [2,0,0], thread: [123,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [2,0,0], thread: [127,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [2,0,0], thread: [67,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [2,0,0], thread: [71,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [2,0,0], thread: [75,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [2,0,0], thread: [79,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [2,0,0], thread: [83,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. 
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [2,0,0], thread: [87,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [2,0,0], thread: [91,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [2,0,0], thread: [95,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [1,0,0], thread: [35,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [1,0,0], thread: [39,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [1,0,0], thread: [43,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [1,0,0], thread: [47,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [1,0,0], thread: [51,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [1,0,0], thread: [55,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [1,0,0], thread: [59,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. 
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [1,0,0], thread: [63,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [1,0,0], thread: [3,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [1,0,0], thread: [7,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [1,0,0], thread: [11,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [1,0,0], thread: [15,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [1,0,0], thread: [19,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [1,0,0], thread: [23,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [1,0,0], thread: [27,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [1,0,0], thread: [31,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [1,0,0], thread: [99,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. 
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [1,0,0], thread: [103,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [1,0,0], thread: [107,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [1,0,0], thread: [111,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [1,0,0], thread: [115,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [1,0,0], thread: [119,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [1,0,0], thread: [123,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [1,0,0], thread: [127,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [1,0,0], thread: [67,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [1,0,0], thread: [71,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [1,0,0], thread: [75,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. 
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [1,0,0], thread: [79,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [1,0,0], thread: [83,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [1,0,0], thread: [87,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [1,0,0], thread: [91,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [1,0,0], thread: [95,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
terminate called after throwing an instance of 'c10::Error'
what(): CUDA error: device-side assert triggered
Exception raised from create_event_internal at /pytorch/c10/cuda/CUDACachingAllocator.cpp:733 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x42 (0x7fe2de2312f2 in /usr/local/lib/python3.7/dist-packages/torch/lib/libc10.so)
frame #1: c10::detail::torchCheckFail(char const, char const, unsigned int, std::string const&) + 0x5b (0x7fe2de22e67b in /usr/local/lib/python3.7/dist-packages/torch/lib/libc10.so)
frame #2: c10::cuda::CUDACachingAllocator::raw_delete(void) + 0x809 (0x7fe2de4891f9 in /usr/local/lib/python3.7/dist-packages/torch/lib/libc10_cuda.so)
frame #3: c10::TensorImpl::release_resources() + 0x54 (0x7fe2de2193a4 in /usr/local/lib/python3.7/dist-packages/torch/lib/libc10.so)
frame #4: + 0x6e9f8a (0x7fe32b009f8a in /usr/local/lib/python3.7/dist-packages/torch/lib/libtorch_python.so)
frame #5: + 0x6ea031 (0x7fe32b00a031 in /usr/local/lib/python3.7/dist-packages/torch/lib/libtorch_python.so)
frame #19: __libc_start_main + 0xe7 (0x7fe33c831bf7 in /lib/x86_64-linux-gnu/libc.so.6)`

    opened by ver0z 8
  • have problem when install MultiScaleDeformableAttention

    I encountered a problem when running test.py after ./make.sh. The error is: ImportError: /envs/anaconda3/envs/venv/lib/python3.8/site-packages/MultiScaleDeformableAttention-1.0-py3.8-linux-x86_64.egg/MultiScaleDeformableAttention.cpython-38-x86_64-linux-gnu.so: undefined symbol: cudaSetupArgument

    What causes this problem?

    opened by whatsups 7
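An `undefined symbol: cudaSetupArgument` error at import time is often a sign of a CUDA toolkit mismatch: the extension was compiled with one `nvcc` release while PyTorch was built against another. The following is a minimal sketch for comparing the two, assuming `nvcc` is on the PATH. The helper names are hypothetical and not part of this repository.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(nvcc_output):
    """Extract the 'major.minor' release from `nvcc --version` output."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    return f"{match.group(1)}.{match.group(2)}" if match else None


def same_major_minor(a, b):
    """True when two version strings agree on major and minor numbers."""
    if a is None or b is None:
        return False
    return a.split(".")[:2] == b.split(".")[:2]


def check_cuda_versions():
    """Compare the nvcc on PATH with the CUDA version PyTorch was built for."""
    import torch  # imported lazily so the helpers above work without it

    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return "nvcc not found on PATH"
    out = subprocess.run([nvcc, "--version"], capture_output=True, text=True).stdout
    toolkit, built = parse_nvcc_release(out), torch.version.cuda
    status = "match" if same_major_minor(toolkit, built) else "MISMATCH"
    return f"nvcc {toolkit} vs torch CUDA {built}: {status}"
```

If `check_cuda_versions()` reports a mismatch, rebuilding the extension with a matching toolkit is usually the first thing to try.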
  • ImportError: MultiScaleDeformableAttention undefined symbol

    Hi, I'm having trouble importing the MultiScaleDeformableAttention module. I followed the instructions, and I have PyTorch 1.5.1 and CUDA 9.2. Thanks.

    ImportError: .conda/envs/deformable_detr/lib/python3.7/site-packages/MultiScaleDeformableAttention-1.0-py3.7-linux-x86_64.egg/MultiScaleDeformableAttention.cpython-37m-x86_64-linux-gnu.so: undefined symbol: _ZN6caffe28TypeMeta21_typeMetaDataInstanceIN3c107complexINS2_4HalfEEEEEPKNS_6detail12TypeMetaDataEv

    opened by MatteoStefanini 6
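The mangled name in this error refers to a PyTorch-internal symbol (the `caffe2` and `c10` namespaces are visible even in mangled form), which often suggests that the extension was compiled against a different PyTorch build than the one installed. A small sketch for making such symbols readable, assuming the binutils tool `c++filt` is installed; the `demangle` helper is hypothetical:

```python
import shutil
import subprocess


def demangle(symbol):
    """Return the demangled C++ name via c++filt, or the input if unavailable."""
    tool = shutil.which("c++filt")
    if tool is None:
        return symbol
    result = subprocess.run([tool, symbol], capture_output=True, text=True)
    return result.stdout.strip() or symbol


# The undefined symbol copied from the ImportError above.
symbol = ("_ZN6caffe28TypeMeta21_typeMetaDataInstanceIN3c107complexINS2_4HalfEEEEE"
          "PKNS_6detail12TypeMetaDataEv")
readable = demangle(symbol)
print(readable)
```

Knowing which library a missing symbol belongs to helps decide whether to rebuild the extension or reinstall PyTorch.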
  • CUDA memory issue

    Hi, thanks for your great work! However, I found some problems related to CUDA memory usage.

    • Memory consumption is not balanced across GPUs.

      The difference in memory consumption between GPUs can exceed 3 GB (I only have 11 GB of memory per card).

    • There seems to be a memory leak. As training goes on, CUDA memory consumption keeps growing, yet the memory allocated by PyTorch stays stable. Although more than 8 GB of CUDA memory is in use, the memory allocated by PyTorch is only 2757 MB, as reported here: https://github.com/fundamentalvision/Deformable-DETR/blob/11169a60c33333af00a4849f1808023eba96a931/util/misc.py#L270

    opened by cfzd 6
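Part of the gap between driver-reported usage and "memory allocated by PyTorch" is expected: PyTorch's caching allocator reserves more memory than live tensors occupy, and the CUDA context itself takes space. The sketch below splits the total into these parts. It assumes a recent PyTorch release that provides `torch.cuda.mem_get_info`; the helper names are hypothetical.

```python
MB = 1024.0 * 1024.0


def summarize_memory(allocated_bytes, reserved_bytes, driver_used_bytes):
    """Split driver-reported usage into tensors, allocator cache, and the rest."""
    return {
        "tensors_mb": allocated_bytes / MB,
        "allocator_cache_mb": (reserved_bytes - allocated_bytes) / MB,
        "outside_allocator_mb": (driver_used_bytes - reserved_bytes) / MB,
    }


def snapshot(device=0):
    """Collect the three numbers for one GPU (requires a CUDA build of PyTorch)."""
    import torch  # imported lazily so summarize_memory works without it

    free, total = torch.cuda.mem_get_info(device)
    # total - free also counts memory held by other processes on the same GPU.
    return summarize_memory(
        torch.cuda.memory_allocated(device),
        torch.cuda.memory_reserved(device),
        total - free,
    )
```

Logging `snapshot()` every few hundred iterations would show whether the growth sits in the allocator cache or outside PyTorch's allocator.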
  • AP is close to 0 for my trained model

    @jackroos

    Hi, about a month ago I downloaded the provided code and ran it with the command "GPUS_PER_NODE=2 ./tools/run_dist_launch.sh 2 ./configs/r50_deformable_detr.sh". The parameter settings are the defaults, except that the batch size is set to 1 because of memory limits (I changed the default batch size to 1 in main.py). However, when I evaluated the model saved at epoch 48 using the command "./configs/r50_deformable_detr.sh --resume exps/r50_deformable_detr/checkpoint.pth --eval", the AP was close to 0.

    I then downloaded the provided model "r50_deformable_detr-checkpoint.pth", and its evaluation result is correct.

    What is the reason for the abnormal AP of my trained model, and how can I fix it? Thanks very much!


    A student from USTC, six department (Jing Zhang)

    opened by zhangjing9701 5
  • what does samples.mask do?

    def forward(self, samples: NestedTensor):
        """ The forward expects a NestedTensor, which consists of:
               - samples.tensor: batched images, of shape [batch_size x 3 x H x W]
               - samples.mask: a binary mask of shape [batch_size x H x W], containing 1 on padded pixels

            It returns a dict with the following elements:
               - "pred_logits": the classification logits (including no-object) for all queries.
                                Shape= [batch_size x num_queries x (num_classes + 1)]
               - "pred_boxes": The normalized boxes coordinates for all queries, represented as
                               (center_x, center_y, width, height). These values are normalized in [0, 1],
                               relative to the size of each individual image (disregarding possible padding).
                               See PostProcess for information on how to retrieve the unnormalized bounding box.
               - "aux_outputs": Optional, only returned when auxiliary losses are activated. It is a list of
                                dictionaries containing the two above keys for each decoder layer.
        """
    

    As described in models/deformable_detr.py, what does samples.mask do here?

    opened by YJHMITWEB 5
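Going by the docstring quoted above, the mask marks padded pixels: images of different sizes are padded to a common shape when batched, and the mask lets later modules ignore the padding. The sketch below illustrates the idea in plain Python with nested lists. The repository works on tensors, and `pad_batch` is a hypothetical helper, not its API.

```python
def pad_batch(image_sizes):
    """Return the padded (H, W) and one boolean mask per image.

    image_sizes: list of (height, width) pairs.
    Mask entries are True on padded pixels and False on real image pixels.
    """
    max_h = max(h for h, _ in image_sizes)
    max_w = max(w for _, w in image_sizes)
    masks = []
    for h, w in image_sizes:
        mask = [[not (y < h and x < w) for x in range(max_w)]
                for y in range(max_h)]
        masks.append(mask)
    return (max_h, max_w), masks


# Two tiny "images": one 2x3 and one 3x2, padded to a common 3x3 canvas.
shape, masks = pad_batch([(2, 3), (3, 2)])
for mask in masks:
    for row in mask:
        print("".join("1" if padded else "0" for padded in row))
    print()
```

The first image gains a padded bottom row and the second a padded right column; positions marked 1 carry no image content.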
  • Attempt to Reproduce the Results

    Recently I attempted to train the Deformable DETR model on an 8-GPU machine (8x TITAN RTX) with the following command:

    GPUS_PER_NODE=8 ./tools/run_dist_launch.sh 8 ./configs/r50_deformable_detr.sh
    

    The results I got are shown below:

    IoU metric: bbox
     Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.435
     Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.625
     Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.472
     Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.256
     Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.467
     Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.575
     Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.346
     Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.578
     Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.620
     Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.394
     Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.665
     Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.808
    Training time 2 days, 6:18:48
    

    This is around 1 AP lower than the reported result:

    Method | Epochs | AP | APS | APM | APL
    -- | -- | -- | -- | -- | --
    Deformable DETR | 50 | 44.5 | 27.1 | 47.6 | 59.6

    Are the results I obtained reasonable? As shown in https://github.com/fundamentalvision/Deformable-DETR/issues/1, it seems the pre-trained checkpoints were trained with batch size 32 (i.e. 2 nodes, 8 GPUs per node, 2 images per GPU). Is the performance gap due to the batch size, or to the environment settings? Thanks for your great help!

    opened by xwjabc 5
  • Visualize model predictions (get scores and boxes)

    Visualize model predictions (get scores and boxes)

    Hello,

    thank you for your great work!

    I want to test the performance of my network on some test images. For this I visualize the predicted boxes and scores on the images. I got everything working as I can use my code from the original DETR, but I was wondering how to get the correct scores and labels.

    For DETR I did:

    # keep only predictions with 0.7+ confidence
    probas = outputs['pred_logits'].softmax(-1)[0, :, :-1]
    keep = probas.max(-1).values > 0.7
    
    # convert boxes from [0; 1] to image scales
    bboxes_scaled = rescale_bboxes(outputs['pred_boxes'][0, keep], im.size)
    
    scores, boxes = probas[keep], bboxes_scaled
    

    Since you use a sigmoid function for Deformable DETR, I replaced these lines with the following (heavily inspired by the PostProcess class from deformable_detr.py 😄):

     # unpack the model outputs (same keys as in the DETR snippet above)
     out_logits, out_bbox = outputs['pred_logits'], outputs['pred_boxes']

     prob = out_logits.sigmoid()
     # top-100 scores over all (query, class) pairs
     topk_values, topk_indexes = torch.topk(prob.view(out_logits.shape[0], -1), 100, dim=1)
     scores = topk_values
     topk_boxes = topk_indexes // out_logits.shape[2]  # query index
     labels = topk_indexes % out_logits.shape[2]       # class index
     boxes = box_ops.box_cxcywh_to_xyxy(out_bbox)
     boxes = torch.gather(boxes, 1, topk_boxes.unsqueeze(-1).repeat(1, 1, 4))

     # and from relative [0, 1] to absolute [0, width] / [0, height] coordinates
     img_w, img_h = im.size  # assuming im is a PIL image, whose size is (width, height)
     scale_fct = torch.tensor([img_w, img_h, img_w, img_h], dtype=boxes.dtype, device=boxes.device)
     boxes = boxes * scale_fct[None, None, :]
    

    With this I get a lot of false positives. The scores are pretty low compared to softmax scores, so which threshold would you recommend to get rid of the false positives?
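    The top-k selection over flattened (query, class) scores, combined with a score cut-off, can be sketched without PyTorch as below. This is a minimal illustration, not the repository's PostProcess code: the function name is hypothetical, and the 0.5 threshold is only a placeholder, not a recommended value.

```python
import math


def topk_detections(logits, k, score_threshold):
    """Return (query_index, class_index, score) for the k best sigmoid scores above a threshold.

    logits: list of per-query lists, each holding one logit per class.
    """
    num_classes = len(logits[0])
    # Flatten to one score per (query, class) pair, as prob.view(batch, -1) does for one image.
    flat = [1.0 / (1.0 + math.exp(-v)) for row in logits for v in row]
    order = sorted(range(len(flat)), key=lambda i: flat[i], reverse=True)[:k]
    results = []
    for idx in order:
        if flat[idx] < score_threshold:
            break  # scores are sorted, so everything after this is lower
        # Integer division recovers the query, the remainder recovers the class.
        results.append((idx // num_classes, idx % num_classes, flat[idx]))
    return results


example_logits = [[-4.0, 2.0, -1.0], [0.5, -3.0, -2.0]]  # 2 queries, 3 classes
detections = topk_detections(example_logits, k=3, score_threshold=0.5)
print([(q, c) for q, c, _ in detections])
```

    Because each class gets an independent sigmoid score, one query can appear several times in the top-k with different labels, which is one reason a score cut-off is usually applied on top of the top-k selection.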

    opened by krxxxxxxxanc 5
  • why is the last dimension of query_embed 2*hidden_num instead of hidden_num?

    why is the last dimension of query_embed 2*hidden_num instead of hidden_num?

    The paper mentions

    For each object query, the 2-d normalized coordinate of the reference point p_q is predicted from its object query embedding via a learnable linear projection followed by a sigmoid function.

    Based on this description, I guess the last dimension of query_embed is hidden_num. But line 58 shows it is 2*hidden_num.

    Could you share the interpretation? Many thanks.
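    One plausible reading, stated here as an assumption rather than a confirmed answer, is that the 2*hidden_num vector holds two hidden_num-sized halves: a positional part from which the reference point is projected, and an initial decoder input. The sketch below illustrates that reading with plain Python lists; the function names, weights, and sizes are all hypothetical.

```python
import math


def split_query_embed(query_embed, hidden_dim):
    """Split one 2*hidden_dim query vector into two hidden_dim halves."""
    if len(query_embed) != 2 * hidden_dim:
        raise ValueError("expected a vector of length 2 * hidden_dim")
    query_pos = query_embed[:hidden_dim]  # assumed: positional part
    tgt = query_embed[hidden_dim:]        # assumed: initial decoder input
    return query_pos, tgt


def reference_point(query_pos, weight, bias):
    """Linear projection to 2 values followed by a sigmoid, as the paper describes."""
    return [
        1.0 / (1.0 + math.exp(-(sum(w * x for w, x in zip(row, query_pos)) + b)))
        for row, b in zip(weight, bias)
    ]


hidden_dim = 4  # tiny size for illustration only
query_embed = [0.1, -0.2, 0.3, 0.4, 1.0, 2.0, 3.0, 4.0]
query_pos, tgt = split_query_embed(query_embed, hidden_dim)

weight = [[0.5, -0.5, 0.25, 0.0], [0.0, 1.0, -1.0, 0.5]]  # made-up 2 x hidden_dim projection
bias = [0.0, 0.1]
ref = reference_point(query_pos, weight, bias)
print(ref)
```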

    opened by Sampson-Lee 0
  • DeformableTransformerDecoder, always sets self.bbox_embed = None?

    DeformableTransformerDecoder, always sets self.bbox_embed = None?

    Hi! It appears that DeformableTransformerDecoder.__init__ always sets self.bbox_embed = None, so the branch guarded by if self.bbox_embed is not None in forward(...) is never triggered.

    Why is that? Am I missing anything?

    Thanks!

    https://github.com/fundamentalvision/Deformable-DETR/blob/main/models/deformable_transformer.py#L321-L324:

    # in DeformableTransformerDecoder.__init__:
    
    # hack implementation for iterative bounding box refinement and two-stage Deformable DETR
    self.bbox_embed = None
    self.class_embed = None
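    A pattern consistent with the "hack implementation" comment is that the attribute starts as None and is assigned from outside the decoder after construction. The sketch below shows that pattern with hypothetical toy classes; it is an illustration under that assumption, not the repository's code.

```python
class ToyDecoder:
    def __init__(self):
        # Placeholder; an owning model may overwrite this after construction.
        self.bbox_embed = None

    def forward(self, reference):
        if self.bbox_embed is not None:
            # Only reached when some outer object has assigned bbox_embed.
            reference = self.bbox_embed(reference)
        return reference


class ToyDetector:
    def __init__(self, with_box_refine):
        self.decoder = ToyDecoder()
        if with_box_refine:
            # Hypothetical refinement step injected into the decoder from outside.
            self.decoder.bbox_embed = lambda ref: ref + 1


plain = ToyDetector(with_box_refine=False)
refined = ToyDetector(with_box_refine=True)
print(plain.decoder.forward(5), refined.decoder.forward(5))
```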
    
    opened by vadimkantorov 0
  • Question about the model's complexity

    Question about the model's complexity

    As I understand from the paper, Deformable DETR doesn't suffer from quadratic complexity like DETR. The complexity is [2NqC^2+min(HWC^2, NqKC^2)], so as long as NqK < HW, the model's complexity should stay the same even if we change the feature map size H*W to a higher resolution?

    I checked the logs and models provided for Deformable DETR (single scale) and Deformable DETR; the n_parameters and sizes are pretty close: 33844193 (398MB) and 39847265 (468MB). I'm assuming the difference is because the parameters of the conv layers for multi-scale are included.

    So I tried training the model on a face detection dataset (WIDER_FACE). The standard model gave exactly the same n_parameters and size as the provided log, but when I change the backbone's feature map to a higher resolution (layer2 -> layer1), it runs out of memory during training (I'm on Colab, 12 GB RAM).

    So is my understanding correct, or does feature map res affect complexity?
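    The formula quoted above can be evaluated directly to see how it behaves as H*W grows. The sketch below is a rough operation count only; the channel width, number of sampled keys, query count, and feature-map sizes are illustrative assumptions. It treats the encoder as the case Nq = HW (every pixel is a query, as the paper describes) and the decoder as the case of a fixed number of object queries.

```python
def deformable_attention_cost(num_queries: int, hw: int, channels: int = 256, num_keys: int = 4) -> int:
    """Evaluate 2*Nq*C^2 + min(HW*C^2, Nq*K*C^2) from the issue text."""
    c2 = channels * channels
    return 2 * num_queries * c2 + min(hw * c2, num_queries * num_keys * c2)


low_res = 100 * 150   # hypothetical feature-map size H*W
high_res = 200 * 300  # 4x as many positions

# Decoder-style use: a fixed number of object queries (300 is an assumption).
decoder_low = deformable_attention_cost(num_queries=300, hw=low_res)
decoder_high = deformable_attention_cost(num_queries=300, hw=high_res)

# Encoder-style use: one query per feature-map position, so Nq = HW.
encoder_low = deformable_attention_cost(num_queries=low_res, hw=low_res)
encoder_high = deformable_attention_cost(num_queries=high_res, hw=high_res)

print(decoder_high / decoder_low, encoder_high / encoder_low)
```

    In this sketch the decoder term stays flat once HW exceeds NqK, while the encoder term grows linearly with the feature-map size. The parameter count does not depend on H*W in either case, and memory for backbone and encoder activations is not modeled here.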

    Also just curious, but why did you guys choose the lowest res feature map (layer4) as default for single scale in the code?

    opened by knn217 0
  • Why does the AP converge to zero after training

    Why does the AP converge to zero after training

    IoU metric: bbox
     Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.000
     Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.001
     Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.000
     Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
     Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
     Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.001
     Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.009
     Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.012
     Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.013
     Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
     Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.001
     Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.022

    My training environment is as follows:

    CUDA version = 10.2
    GeForce GTX 1080 Ti ×2

    My training parameters are as follows:

    batch_size = 1
    epochs = 50
    training on a single node with 2 GPUs

    Training command:

    GPUS_PER_NODE=2 ./tools/run_dist_launch.sh 2 ./configs/r50_deformable_detr.sh

    opened by XmuHjx 3
  • Questions about computing focal loss

    Questions about computing focal loss

    https://github.com/fundamentalvision/Deformable-DETR/blob/11169a60c33333af00a4849f1808023eba96a931/models/segmentation.py#L221

    The focal loss is computed on all object queries (i.e. suppose num_queries=300, all 300 queries have a focal loss). However, the focal loss is only divided by num_boxes, which is the number of all the ground truth boxes in this batch and this number is significantly smaller than the number of all object queries.
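    The computation described above can be sketched in plain Python as follows. This is a simplified illustration of a sigmoid focal loss summed over every (query, class) entry and divided by num_boxes, not the repository's implementation; the function name, the alpha and gamma defaults, and the toy logits are assumptions.

```python
import math


def focal_loss_over_queries(logits, targets, num_boxes, alpha=0.25, gamma=2.0):
    """Sum the sigmoid focal loss over all queries and classes, then divide by num_boxes."""
    total = 0.0
    for row_logits, row_targets in zip(logits, targets):
        for x, t in zip(row_logits, row_targets):
            p = 1.0 / (1.0 + math.exp(-x))
            ce = -(t * math.log(p) + (1 - t) * math.log(1.0 - p))  # binary cross-entropy
            p_t = p * t + (1.0 - p) * (1 - t)                      # probability of the true label
            alpha_t = alpha * t + (1.0 - alpha) * (1 - t)
            total += alpha_t * ce * (1.0 - p_t) ** gamma           # down-weights easy examples
    return total / num_boxes


# 300 queries, 2 classes; only the first query is matched to a ground-truth box.
toy_logits = [[3.0, -6.0]] + [[-6.0, -6.0]] * 299
toy_targets = [[1, 0]] + [[0, 0]] * 299

loss = focal_loss_over_queries(toy_logits, toy_targets, num_boxes=1)
matched_only = focal_loss_over_queries(toy_logits[:1], toy_targets[:1], num_boxes=1)
print(loss, matched_only)
```

    In this toy case the 299 unmatched queries are confidently classified as background, and the (1 - p_t)^gamma factor makes their combined contribution smaller than that of the single matched query.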

    Do you have any specific reasons for computing the focal loss in this way?

    Thanks

    opened by the-yanqi 0
Official Implementation of DE-DETR and DELA-DETR in "Towards Data-Efficient Detection Transformers"

DE-DETRs By Wen Wang, Jing Zhang, Yang Cao, Yongliang Shen, and Dacheng Tao This repository is an official implementation of DE-DETR and DELA-DETR in

Wen Wang 61 Dec 12, 2022
[ICLR 2022] DAB-DETR: Dynamic Anchor Boxes are Better Queries for DETR

DAB-DETR This is the official pytorch implementation of our ICLR 2022 paper DAB-DETR. Authors: Shilong Liu, Feng Li, Hao Zhang, Xiao Yang, Xianbiao Qi

null 336 Dec 25, 2022
Implementation of Deformable Attention in Pytorch from the paper "Vision Transformer with Deformable Attention"

Deformable Attention Implementation of Deformable Attention from this paper in Pytorch, which appears to be an improvement to what was proposed in DET

Phil Wang 128 Dec 24, 2022
A simple, fast, and efficient object detector without FPN

You Only Look One-level Feature (YOLOF), CVPR2021 A simple, fast, and efficient object detector without FPN. This repo provides an implementation for

null 789 Jan 9, 2023
Lane follower: Lane-detector (OpenCV) + Object-detector (YOLO5) + CAN-bus

Lane Follower This code is for the lane follower, including perception and control, as shown below. Environment Hardware Industrial Camera Intel-NUC(1

Siqi Fan 3 Jul 7, 2022
🐤 Nix-TTS: An Incredibly Lightweight End-to-End Text-to-Speech Model via Non End-to-End Distillation

🐤 Nix-TTS An Incredibly Lightweight End-to-End Text-to-Speech Model via Non End-to-End Distillation Rendi Chevi, Radityo Eko Prasojo, Alham Fikri Aji

Rendi Chevi 156 Jan 9, 2023
Official implementation of the ICCV 2021 paper "Conditional DETR for Fast Training Convergence".

The DETR approach applies the transformer encoder and decoder architecture to object detection and achieves promising performance. In this paper, we handle the critical issue, slow training convergence, and present a conditional cross-attention mechanism for fast DETR training. Our approach is motivated by that the cross-attention in DETR relies highly on the content embeddings and that the spatial embeddings make minor contributions, increasing the need for high-quality content embeddings and thus increasing the training difficulty.

null 281 Dec 30, 2022
[CVPR2021 Oral] UP-DETR: Unsupervised Pre-training for Object Detection with Transformers

UP-DETR: Unsupervised Pre-training for Object Detection with Transformers This is the official PyTorch implementation and models for UP-DETR paper: @a

dddzg 430 Dec 23, 2022
LiDAR R-CNN: An Efficient and Universal 3D Object Detector

LiDAR R-CNN: An Efficient and Universal 3D Object Detector Introduction This is the official code of LiDAR R-CNN: An Efficient and Universal 3D Object

TuSimple 295 Jan 5, 2023
ViDT: An Efficient and Effective Fully Transformer-based Object Detector

ViDT: An Efficient and Effective Fully Transformer-based Object Detector by Hwanjun Song1, Deqing Sun2, Sanghyuk Chun1, Varun Jampani2, Dongyoon Han1,

NAVER AI 262 Dec 27, 2022
HeartRate detector with ArduinoandPython - Use Arduino and Python to create a heart rate detector.

Syllabus of Contents Syllabus of Contents Introduction Of Project Features Develop With Python code introduction Installation License Developer Contac

null 1 Jan 5, 2022
Video lie detector using xgboost - A video lie detector using OpenFace and xgboost

video_lie_detector_using_xgboost a video lie detector using OpenFace and xgboost

null 2 Jan 11, 2022
A whale detector design for the Kaggle whale-detector challenge!

CNN (InceptionV1) + STFT based Whale Detection Algorithm So, this repository is my PyTorch solution for the Kaggle whale-detection challenge. The obje

Tarin Ziyaee 92 Sep 28, 2021
Imposter-detector-2022 - HackED 2022 Team 3IQ - 2022 Imposter Detector

HackED 2022 Team 3IQ - 2022 Imposter Detector By Aneeljyot Alagh, Curtis Kan, Jo

Joshua Ji 3 Aug 20, 2022
Official PyTorch implementation for Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers, a novel method to visualize any Transformer-based network. Including examples for DETR, VQA.

PyTorch Implementation of Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers 1 Using Colab Please notic

Hila Chefer 489 Jan 7, 2023
Moment-DETR code and QVHighlights dataset

Moment-DETR QVHighlights: Detecting Moments and Highlights in Videos via Natural Language Queries Jie Lei, Tamara L. Berg, Mohit Bansal For dataset de

Jie Lei 雷杰 133 Dec 22, 2022
PED: DETR for Crowd Pedestrian Detection

PED: DETR for Crowd Pedestrian Detection Code for PED: DETR For (Crowd) Pedestrian Detection Paper PED: DETR for Crowd Pedestrian Detection Installati

null 36 Sep 13, 2022
[CVPR 2022] Official Pytorch code for OW-DETR: Open-world Detection Transformer

OW-DETR: Open-world Detection Transformer (CVPR 2022) [Paper] Akshita Gupta*, Sanath Narayan*, K J Joseph, Salman Khan, Fahad Shahbaz Khan, Mubarak Sh

Akshita Gupta 127 Dec 27, 2022
FPGA: Fast Patch-Free Global Learning Framework for Fully End-to-End Hyperspectral Image Classification

FPGA & FreeNet Fast Patch-Free Global Learning Framework for Fully End-to-End Hyperspectral Image Classification by Zhuo Zheng, Yanfei Zhong, Ailong M

Zhuo Zheng 92 Jan 3, 2023