
Object DGCNN & DETR3D

This repo contains the implementations of Object DGCNN (https://arxiv.org/abs/2110.06923) and DETR3D (https://arxiv.org/abs/2110.06922). Our implementations are built on top of MMDetection3D.

Prerequisites

  1. mmcv (https://github.com/open-mmlab/mmcv)

  2. mmdet (https://github.com/open-mmlab/mmdetection)

  3. mmseg (https://github.com/open-mmlab/mmsegmentation)

  4. mmdet3d (https://github.com/open-mmlab/mmdetection3d)

Data

  1. Follow the mmdet3d data preparation instructions to process the data (see the example command below).
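     For nuScenes, the standard MMDetection3D preparation command is typically (paths are placeholders for your local layout):

     python tools/create_data.py nuscenes --root-path ./data/nuscenes --out-dir ./data/nuscenes --extra-tag nuscenes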

Train

  1. Download the pretrained backbone weights to pretrained/.

  2. For example, to train Object DGCNN with the pillar backbone on 8 GPUs, run

tools/dist_train.sh projects/configs/obj_dgcnn/pillar.py 8
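
  3. To train on a single GPU instead, the standard MMDetection3D entry point should also work (this assumes the repo ships the usual tools/train.py from mmdet3d):

python tools/train.py projects/configs/obj_dgcnn/pillar.py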

Evaluation using pretrained models

  1. Download the weights accordingly.
Backbone                                             mAP    NDS    Download
DETR3D, ResNet101 w/ DCN                             34.7   42.2   model | log
above, + CBGS                                        34.9   43.4   model | log
DETR3D, VoVNet on trainval, evaluation on test set   41.2   47.9   model | log

Backbone                                             mAP    NDS    Download
Object DGCNN, pillar                                 53.2   62.8   model | log
Object DGCNN, voxel                                  58.6   66.0   model | log
  2. To test, run
    tools/dist_test.sh projects/configs/obj_dgcnn/pillar_cosine.py /path/to/ckpt 8 --eval=bbox

If you find this repo useful for your research, please consider citing the papers:

@inproceedings{
   obj-dgcnn,
   title={Object DGCNN: 3D Object Detection using Dynamic Graphs},
   author={Wang, Yue and Solomon, Justin M.},
   booktitle={2021 Conference on Neural Information Processing Systems ({NeurIPS})},
   year={2021}
}
@inproceedings{
   detr3d,
   title={DETR3D: 3D Object Detection from Multi-view Images via 3D-to-2D Queries},
   author={Wang, Yue and Guizilini, Vitor and Zhang, Tianyuan and Wang, Yilun and Zhao, Hang and Solomon, Justin M.},
   booktitle={The Conference on Robot Learning ({CoRL})},
   year={2021}
}
Comments
  • mAOE metric abnormal

    When reproducing DETR3D, I find that whether I use the officially provided model or train it myself, the mAOE and mAOS metrics are always too large. Could someone give me any suggestions?

    opened by kaixinbear 7
  • A question about GPU-util and training time (DETR3D)

    Hi, I'm trying to train DETR3D on nuScenes, but I've run into a puzzling training procedure. Strangely, training is estimated to take several weeks, and the GPU utilization is odd too. I use 4 RTX A6000 GPUs to train the model, and sometimes one or two GPUs' utilization stays at 0 for a long time. After checking, I can train FCOS3D as usual. Could anyone please help me with this question? Many thanks!

    opened by Etah0409 4
  • How to change the batch_size of DETR3D when training?

    I have reproduced your excellent work DETR3D but ran into some trouble: I couldn't find the config parameter that changes the model's batch size. Should it be added to the config file (e.g. detr3d_res101_gridmask.py)? I would appreciate an answer to my question.
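    For reference, in MMDetection3D-style configs the per-GPU batch size is normally set by samples_per_gpu in the data dict, and the effective batch size is samples_per_gpu times the number of GPUs; a minimal sketch (field names follow the mmdet3d convention, values are illustrative):

        data = dict(
            samples_per_gpu=1,   # per-GPU batch size; raise this to enlarge the batch
            workers_per_gpu=4,   # dataloader workers per GPU
        )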

    opened by Bosszhe 4
  • feature_sampling problem

    Hi, thank you for your work. I have a question about the feature_sampling function: in the following code, reference_points_cam = (reference_points_cam - 0.5) * 2, why is the multiplication by 2 needed? Thank you for your reply!
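    For context: the features are sampled with torch.nn.functional.grid_sample, which expects sampling coordinates in [-1, 1], while the projected reference points are normalized to [0, 1]; (x - 0.5) * 2 rescales between the two ranges. A minimal sketch (shapes are illustrative):

        import torch
        import torch.nn.functional as F

        feat = torch.randn(1, 256, 32, 32)  # (N, C, H, W) camera feature map
        pts = torch.rand(1, 10, 1, 2)       # reference points normalized to [0, 1]
        grid = (pts - 0.5) * 2              # rescale to the [-1, 1] range grid_sample expects
        sampled = F.grid_sample(feat, grid, align_corners=False)  # (1, 256, 10, 1)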

    opened by Yvaine 3
  • Performance on val and test set

    Thanks for releasing the wonderful work.

    In the readme:

    Backbone | mAP | NDS | Download
    -- | -- | -- | --
    DETR3D, ResNet101 w/ DCN | 34.7 | 42.2 | model | log
    above, + CBGS | 34.9 | 43.4 | model | log
    DETR3D, VoVNet on trainval, evaluation on test set | 41.2 | 47.9 | model | log

    If I understand correctly, the numbers for "above, + CBGS" are for the validation set, while the numbers for "DETR3D, VoVNet on trainval" are for the test set. So I am wondering: have you evaluated "above, + CBGS" on the test set? I wonder how large the gap between the val-set and test-set results would be.

    opened by XuyangBai 2
  • Unable to reproduce results on ObjDGCNN

    Hi, thanks for your great work! I am trying to reproduce the performance of Object DGCNN. However, the final result of the voxel-based model is 5 points below the reported one. Here is the log:

    2021-12-06 10:15:40,069 - mmdet - INFO - Epoch(val) [20][753] ...pts_bbox_NuScenes/NDS: 0.6263, pts_bbox_NuScenes/mAP: 0.5352
    

    I notice that there is a shape mismatch between the pretrained backbone (provided in your Google Drive) and the initialized one in Object DGCNN:

    size mismatch for pts_backbone.blocks.0.0.weight: copying a param with shape torch.Size([64, 64, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 256, 3, 3]).
    size mismatch for pts_backbone.blocks.0.1.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
    size mismatch for pts_backbone.blocks.0.1.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
    size mismatch for pts_backbone.blocks.0.1.running_mean: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
    size mismatch for pts_backbone.blocks.0.1.running_var: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
    size mismatch for pts_backbone.blocks.0.3.weight: copying a param with shape torch.Size([64, 64, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 128, 3, 3]).
    size mismatch for pts_backbone.blocks.0.4.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
    size mismatch for pts_backbone.blocks.0.4.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
    size mismatch for pts_backbone.blocks.0.4.running_mean: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
    size mismatch for pts_backbone.blocks.0.4.running_var: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
    size mismatch for pts_backbone.blocks.0.6.weight: copying a param with shape torch.Size([64, 64, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 128, 3, 3]).
    size mismatch for pts_backbone.blocks.0.7.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
    size mismatch for pts_backbone.blocks.0.7.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
    size mismatch for pts_backbone.blocks.0.7.running_mean: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
    size mismatch for pts_backbone.blocks.0.7.running_var: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
    size mismatch for pts_backbone.blocks.0.9.weight: copying a param with shape torch.Size([64, 64, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 128, 3, 3]).
    size mismatch for pts_backbone.blocks.0.10.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
    size mismatch for pts_backbone.blocks.0.10.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
    size mismatch for pts_backbone.blocks.0.10.running_mean: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
    size mismatch for pts_backbone.blocks.0.10.running_var: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
    size mismatch for pts_backbone.blocks.1.0.weight: copying a param with shape torch.Size([128, 64, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 128, 3, 3]).
    size mismatch for pts_backbone.blocks.1.1.weight: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([256]).
    size mismatch for pts_backbone.blocks.1.1.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([256]).
    size mismatch for pts_backbone.blocks.1.1.running_mean: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([256]).
    size mismatch for pts_backbone.blocks.1.1.running_var: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([256]).
    size mismatch for pts_backbone.blocks.1.3.weight: copying a param with shape torch.Size([128, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 256, 3, 3]).
    size mismatch for pts_backbone.blocks.1.4.weight: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([256]).
    size mismatch for pts_backbone.blocks.1.4.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([256]).
    size mismatch for pts_backbone.blocks.1.4.running_mean: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([256]).
    size mismatch for pts_backbone.blocks.1.4.running_var: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([256]).
    size mismatch for pts_backbone.blocks.1.6.weight: copying a param with shape torch.Size([128, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 256, 3, 3]).
    size mismatch for pts_backbone.blocks.1.7.weight: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([256]).
    size mismatch for pts_backbone.blocks.1.7.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([256]).
    size mismatch for pts_backbone.blocks.1.7.running_mean: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([256]).
    size mismatch for pts_backbone.blocks.1.7.running_var: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([256]).
    size mismatch for pts_backbone.blocks.1.9.weight: copying a param with shape torch.Size([128, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 256, 3, 3]).
    size mismatch for pts_backbone.blocks.1.10.weight: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([256]).
    size mismatch for pts_backbone.blocks.1.10.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([256]).
    size mismatch for pts_backbone.blocks.1.10.running_mean: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([256]).
    size mismatch for pts_backbone.blocks.1.10.running_var: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([256]).
    size mismatch for pts_backbone.blocks.1.12.weight: copying a param with shape torch.Size([128, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 256, 3, 3]).
    size mismatch for pts_backbone.blocks.1.13.weight: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([256]).
    size mismatch for pts_backbone.blocks.1.13.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([256]).
    size mismatch for pts_backbone.blocks.1.13.running_mean: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([256]).
    size mismatch for pts_backbone.blocks.1.13.running_var: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([256]).
    size mismatch for pts_backbone.blocks.1.15.weight: copying a param with shape torch.Size([128, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 256, 3, 3]).
    size mismatch for pts_backbone.blocks.1.16.weight: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([256]).
    size mismatch for pts_backbone.blocks.1.16.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([256]).
    size mismatch for pts_backbone.blocks.1.16.running_mean: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([256]).
    size mismatch for pts_backbone.blocks.1.16.running_var: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([256]).
    

    Is this normal?

    opened by zehuichen123 2
  • Question about Training Time

    Hi, thanks for your great work! I'm trying to rerun the training of DETR3D on an 8-V100 machine using the default settings of detr3d_res101_gridmask_cbgs.py, and the estimated training time is 8 days (I notice the training schedule is 12 epochs in the paper but 24 in the code, so the time for you should be 36h with this config?), which is 5.4x longer than the time reported in the paper. Is this normal? I don't think a 3090 should be 5.4x faster than a V100...

    opened by zehuichen123 2
  • Details about the mmdet3d version?

    Hi, thanks for the great job. Can someone tell me the exact dependency versions this repo relies on? I tried to follow mmdetection3d but all attempts failed.

    pytorch==1.7.0 torchvision==0.8.0 torchaudio==0.7.0 cudatoolkit=11.0 mmcv=1.4.0 mmsegmentation=0.14.1 mmdet=2.14.0 does not work

    opened by zituka 1
  • Maybe a typo in detr3d_vovnet_gridmask_det_final_trainval_cbgs's config

    Hi, when I ran the experiment on NuScenesDataset, an error was reported for the configuration file detr3d_vovnet_gridmask_det_final_trainval_cbgs.py: the file referenced by ann_file=data_root + 'nuscenes_infos_trainval.pkl' (line 192) does not exist in the dataset. Maybe it should be ann_file=data_root + 'nuscenes_infos_train.pkl'?
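    For reference, the proposed one-line fix in that config would be the following (this assumes the standard mmdet3d nuScenes info files; whether a trainval pkl was intended is for the authors to confirm):

        ann_file=data_root + 'nuscenes_infos_train.pkl',  # was 'nuscenes_infos_trainval.pkl'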

    opened by wchstrife 1
  • Question about the time complexity of Object DGCNN

    From the results in your paper, the FPS of your model is smaller than that of other models with NMS. From my understanding, a bigger FPS means better performance, but you said your model is more efficient. Could you explain why your model is more efficient with a smaller FPS?

    opened by BoomSky0416 1
  • Any attempts on kitti?

    Hello! First of all, thanks so much for your great efforts; it's really amazing to have DETR3D here! I see that you didn't report any results on KITTI object detection in your paper, since nuScenes provides more sequential data that benefits transformer-style architectures. So I would like to ask whether you ran any tests on the KITTI dataset, or whether someone else has reproduced the method on KITTI, which would be really appreciated. Kind regards.

    opened by Muyiyunzi 1
  • Car detection performance of checkpoint on nuscenes mini is poor

    I used this checkpoint file to test on the nuScenes mini validation set; the evaluation metrics are shown in the attached screenshots.

    It seems that the car detection is much worse than the pedestrian detection. Is this a normal phenomenon, and what may be the cause?

    In the visualized figure, blue indicates that the confidence level in the prediction is greater than 0.2, and green indicates the ground truth. Please ignore the red Chinese mark in the figure~

    opened by hrz2000 0
  • Question on nan

    Hi there, this is great work and many thanks for releasing the code. When I looked into the code, I found some places where NaN is handled, like https://github.com/WangYueFt/detr3d/blob/34a47673011fe13593a3e594a376668acca8bddb/projects/mmdet3d_plugin/models/utils/detr3d_transformer.py#L366-L367 and https://github.com/WangYueFt/detr3d/blob/34a47673011fe13593a3e594a376668acca8bddb/projects/mmdet3d_plugin/models/dense_heads/detr3d_head.py#L336-L337

    I wonder in which cases nan would occur? It might help me better understand the model. Any advice would be very much appreciated!
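    For context (an educated guess, not an authoritative answer): in the projection step, the perspective division by a reference point's camera-frame depth can produce inf/NaN when that depth is zero, e.g. for points on or behind a camera's image plane, which is one reason the code masks such values. A minimal sketch:

        import torch

        pts_cam = torch.tensor([[2.0, 1.0, 0.0],   # point with zero depth
                                [4.0, 2.0, 2.0]])  # valid point in front of the camera
        uv = pts_cam[:, :2] / pts_cam[:, 2:3]      # perspective division -> inf for zero depth
        uv = torch.nan_to_num(uv)                  # replace nan/inf with finite values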

    opened by xyupeng 0
  • About the vov-99 pretrained checkpoint

    Hi, thanks for the great work!

    I wonder how you obtained dd3d_det_final.pth. Is it pretrained on DDAM-15M for depth prediction and then subsequently trained as the backbone in DD3D for monocular 3D detection? Or just pretrained on DDAM-15M?

    opened by zeyuwang615 0
  • Did you try temporal fusion?

    Hello, I have been researching temporal fusion in BEV object detection. Since temporal fusion works well in BEVFormer, did you try temporal fusion in DETR3D? The main difference between the two is that BEVFormer has a BEV feature while DETR3D does not. I wonder whether temporal fusion would work on the object queries in DETR3D. Thanks!

    opened by applezy8866 0