ODMD: Object Depth via Motion and Detection Dataset

ODMD is the first dataset for learning Object Depth via Motion and Detection. ODMD training data are configurable and extensible, with each training example consisting of a series of object detection bounding boxes, camera movement distances, and ground-truth object depth. As a benchmark evaluation, we provide four ODMD validation and test sets with 21,600 examples in multiple domains, and we also convert 15,650 examples from the ODMS benchmark for detection. In our paper, we use a single ODMD-trained network with object detection or segmentation to achieve state-of-the-art results on existing driving and robotics benchmarks, and we estimate object depth using a camera phone, demonstrating that ODMD is a viable tool for monocular depth estimation in a variety of mobile applications.

Contact: Brent Griffin (griffb at umich dot edu)

Depth results using a camera phone.

Using ODMD

Run ./demo/demo_datagen.py to generate random ODMD data to train or test your model.
Example data generation and camera configurations are provided in the ./config/ folder. demo_datagen.py has the option to save data into a static dataset for repeated use.
[native Python]
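
For intuition, here is a minimal sketch of what one ODMD-style training example contains, generated from pinhole-camera geometry. The field names, observation count, value ranges, and object size below are illustrative assumptions, not the actual output format of demo_datagen.py:

    # Hypothetical ODMD-style example: n bounding boxes observed while the
    # camera moves toward the object, plus the final ground-truth depth.
    import numpy as np

    rng = np.random.default_rng(0)

    n_obs = 10                                     # observations per example (assumed)
    Z0 = rng.uniform(1.0, 3.0)                     # initial object depth in meters (assumed range)
    moves = np.sort(rng.uniform(0.0, 0.5, n_obs))  # forward camera travel from the start (m)
    moves[0] = 0.0
    W, H = 0.2, 0.15                               # assumed physical object extent (m)
    f = 1.0                                        # focal length in normalized image units

    Z = Z0 - moves                                 # object depth at each observation
    boxes = np.stack([np.full(n_obs, 0.5),         # x-center (normalized, object centered)
                      np.full(n_obs, 0.5),         # y-center (normalized)
                      f * W / Z,                   # box width grows as depth shrinks
                      f * H / Z], axis=1)          # box height

    example = {"boxes": boxes, "camera_movement": moves, "depth": Z[-1]}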

Run ./demo/demo_dataset_eval.py to evaluate your model on the ODMD validation and test sets.
demo_dataset_eval.py has an example evaluation for the BoxLS baseline and instructions for using our detection-based version of ODMS. Results are saved in the ./results/ folder.
[native Python]
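
As a rough guide, evaluation reduces to comparing predicted and ground-truth depths per example. The sketch below assumes a mean percent depth error metric, which is common for this task; consult demo_dataset_eval.py and the paper for the exact metric and data loading:

    # Hypothetical evaluation helper: mean percent depth error (assumed metric).
    import numpy as np

    def mean_percent_error(pred, gt):
        pred, gt = np.asarray(pred, float), np.asarray(gt, float)
        return float(np.mean(np.abs(pred - gt) / gt) * 100.0)

    # e.g., mean_percent_error([1.05, 2.1], [1.0, 2.0]) -> 5.0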

Benchmark

Depth estimation error (lower is better):

Method      Normal   Perturb Camera   Perturb Detect   Robot     All
DBox          1.73             2.45             2.54   11.17    4.47
DBoxAbs       1.11             2.05             1.75   13.29    4.55
BoxLS         0.00             4.47            21.60   21.23   11.83

Is your technique missing even though it's published and the code is public? Let us know and we'll add it.
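
For reference, a BoxLS-style baseline follows from the pinhole relation w_i * Z_i = const: with known forward camera motion, the depth at the final observation is the solution of a small linear least-squares problem. The sketch below is my own formulation under that assumption, not the repository's exact implementation:

    # BoxLS-style least-squares depth from box-scale change (sketch).
    import numpy as np

    def boxls_depth(widths, forward_motion):
        # widths: apparent box widths w_i (normalized image units)
        # forward_motion: camera travel d_i toward the object from the first
        #   observation (d_0 = 0), same length as widths
        w = np.asarray(widths, dtype=float)
        d = np.asarray(forward_motion, dtype=float)
        t = d[-1] - d                    # distance still to travel at observation i
        # w_i * (Z_final + t_i) = c  =>  w_i * Z_final - c = -w_i * t_i
        A = np.stack([w, -np.ones_like(w)], axis=1)
        b = -w * t
        z_final, _c = np.linalg.lstsq(A, b, rcond=None)[0]
        return z_final

    # Noise-free check, consistent with BoxLS's 0.00 "Normal" error above:
    # boxls_depth([0.1333, 0.1667, 0.2], [0.0, 0.3, 0.5]) -> ~1.0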

Using the DBox Method

Run ./demo/demo_dataset_DBox_train.py to train your own DBox model using ODMD.
Run ./demo/demo_dataset_DBox_eval.py after training to evaluate your DBox model.
Example training and DBox model configurations are provided in the ./config/ folder. Models are saved in the ./results/model/ folder.
[native Python, has Torch dependency]
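
Architecturally, DBox is a learned regressor from detection-box sequences and camera movement to object depth. Below is a minimal PyTorch sketch of a fully connected network with that interface; the layer sizes and input layout are illustrative assumptions, not the paper's exact configuration:

    # Hypothetical DBox-style fully connected depth regressor (sketch).
    import torch
    import torch.nn as nn

    N_OBS = 10  # observations per example (assumed)

    class DBoxSketch(nn.Module):
        def __init__(self, n_obs=N_OBS, hidden=256):
            super().__init__()
            in_dim = n_obs * 4 + n_obs   # 4 box values + 1 movement per observation
            self.net = nn.Sequential(
                nn.Linear(in_dim, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden), nn.ReLU(),
                nn.Linear(hidden, 1),     # depth at the final observation
            )

        def forward(self, boxes, movement):
            # boxes: (B, n_obs, 4) normalized [x_c, y_c, w, h]
            # movement: (B, n_obs) camera travel toward the object
            x = torch.cat([boxes.flatten(1), movement], dim=1)
            return self.net(x).squeeze(1)

    model = DBoxSketch()
    pred = model(torch.rand(8, N_OBS, 4), torch.rand(8, N_OBS))  # shape (8,)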

Publication

Please cite our paper if you find it useful for your research.

@inproceedings{GrCoCVPR21,
  author    = {Griffin, Brent A. and Corso, Jason J.},
  title     = {Depth from Camera Motion and Object Detection},
  booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year      = {2021}
}

CVPR 2021 supplementary video: https://youtu.be/GruhbdJ2l7k


Use

This code is available for non-commercial research purposes only.

Comments
  • Data format

    Hi Brent,

    This is a very interesting publication and it is nice that you are making the code publicly available so early.

    I have a question about the input data format, in case one would like to run their own images with their own bounding box detections and camera poses. Looking at the datasets provided in the repository, I saw that the bounding box detections are all saved in the normalized format (xc_norm, yc_norm, w_norm, h_norm, Z), and I am not sure how to arrive at the normalized bounding box positions without a depth estimate or the bounding box width and height in 3D. Could you please provide more details on how to create input data for your method, and also for the normalized camera poses?

    Example input data could be as follows:

        image size: (640, 480)

        camera pose 0:
        [[1.0, 0.0, 0.0, 0.0],
         [0.0, 1.0, 0.0, 0.0],
         [0.0, 0.0, 1.0, 0.0],
         [0.0, 0.0, 0.0, 1.0]]
        bounding box 0: [320, 240, 100, 80]

        camera pose 1:
        [[1.0, 0.0, 0.0, 0.0],
         [0.0, 1.0, 0.0, 0.0],
         [0.0, 0.0, 1.0, 0.0],
         [0.0, 0.0, 0.2, 1.0]]
        bounding box 1: [320, 240, 120, 115]

        camera pose 2:
        [[1.0, 0.0, 0.0, 0.0],
         [0.0, 1.0, 0.0, 0.0],
         [0.0, 0.0, 1.0, 0.0],
         [0.0, 0.0, 0.5, 1.0]]
        bounding box 2: [320, 240, 150, 135]

    Thanks.

    opened by Frank-Michel 4
  • Inference part

    Suppose I am getting bounding boxes of an object from YOLOv5; how can we feed those bounding boxes to this model, or how can we integrate it with YOLOv5?

    opened by GaganTC 3
  • Handling arbitrary object motion

    Hello,

    Thanks for the great work, it is amazing! While reading the paper, I was wondering whether the model can still work under complex object/camera motions. As far as I can tell, the camera motion is currently limited to translation along the x, y, and z axes. Will it still work if there is some rotation of the camera or the object? For example, a pedestrian squatting, or the camera tilted by some angle?

    Best, Xianghui

    opened by xiexh20 2
  • Training Samples

    Hello sir, this is Kesavan. Can you provide training or testing image samples with the motion data? That would be a great help in understanding the ODMD model more deeply.

    Thanking you, Kesavan

    opened by Kesavan-Raman 1
  • Training DBox on driving

    Greetings! I am interested in using DBox on a driving dataset, but the default model config seems to be for indoor scenes (z_lim: [0.55, 1]). I'd appreciate it if you could release the training configs for the driving or camera-phone examples in the paper.

    opened by Liuuuu54 1
Owner: Brent Griffin