Official implementation of the network presented in the paper "M4Depth: A motion-based approach for monocular depth estimation on video sequences"

Michaël Fonder

Last update: Jan 3, 2023

Overview

M4Depth

This is the reference TensorFlow implementation for training and testing depth estimation models using the method described in

M4Depth: A motion-based approach for monocular depth estimation on video sequences

Michaël Fonder, Damien Ernst and Marc Van Droogenbroeck

arXiv pdf

Some samples produced by our method: the first row shows the RGB picture captured by the camera, the second the ground-truth depth map, and the last the results produced by our method.

If you find our work useful in your research please consider citing our paper:

@article{Fonder2021M4Depth,
  title     = {M4Depth: A motion-based approach for monocular depth estimation on video sequences},
  author    = {Michael Fonder and Damien Ernst and Marc Van Droogenbroeck},
  booktitle = {arXiv},
  month     = {May},
  year      = {2021}
}

If you use the Mid-Air dataset in your research, please consider citing the related paper:

@INPROCEEDINGS{Fonder2019MidAir,
  author    = {Michael Fonder and Marc Van Droogenbroeck},
  title     = {Mid-Air: A multi-modal dataset for extremely low altitude drone flights},
  booktitle = {Conference on Computer Vision and Pattern Recognition Workshop (CVPRW)},
  year      = {2019},
  month     = {June}
}

Dependencies

Assuming a fresh Anaconda distribution, you can install the dependencies with:

conda install tensorflow-gpu=1.15 h5py pyquaternion numpy
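
Before running the scripts, it can help to confirm that the packages are importable. The snippet below is a minimal sketch and not part of the repository: the helper name `missing_modules` is hypothetical, and it only checks that each module can be found, not that its version matches the one requested above.

```python
import importlib.util

# Import names of the packages installed by the conda command above.
REQUIRED = ["tensorflow", "h5py", "pyquaternion", "numpy"]


def missing_modules(names):
    """Return the names from `names` that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```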

Formatting data

Our code works with TensorFlow protobuf files. Data for training and testing therefore need to be encoded properly before being passed to the network.

Mid-Air dataset

To reproduce the results of our paper, you can use the Mid-Air dataset for training and testing our network. For this, you will first need to download the required data on your computer. The procedure to get them is the following:

Go to the download page of the Mid-Air dataset

Select the "Left RGB" and "Stereo Disparity" image types

Scroll to the end of the page and enter your email address to get the download links (the selected data should total 316.5 GB)

Follow the procedure given at the beginning of the download page to download and extract the dataset

Once the dataset is downloaded you can generate the required protobuffer files by running the following script:

python3 midair-protobuf_generation.py --db_path path/to/midair-root --output_dir desired/protobuf-location --write

This script generates trajectory sequences with a length of 8 frames and automatically creates the train and test splits for Mid-Air in separate subdirectories.
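
To illustrate what fixed-length trajectory sequences look like, here is a minimal sketch that cuts the frame indices of one trajectory into sequences of 8 frames. It assumes non-overlapping windows and drops an incomplete tail; both choices are assumptions made for illustration, and the actual script may split trajectories differently. The function name `split_into_sequences` is hypothetical.

```python
def split_into_sequences(num_frames, seq_len=8):
    """Split frame indices 0..num_frames-1 into consecutive chunks of `seq_len` frames.

    Assumption: windows do not overlap, and a trailing chunk shorter than
    `seq_len` is discarded so that every sequence has the same length.
    """
    if seq_len <= 0:
        raise ValueError("seq_len must be positive")
    sequences = []
    # The last valid start index is num_frames - seq_len.
    for start in range(0, num_frames - seq_len + 1, seq_len):
        sequences.append(list(range(start, start + seq_len)))
    return sequences
```

For a trajectory of 20 frames, this yields two sequences (frames 0 to 7 and 8 to 15) and leaves the last four frames unused.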

Custom data

You can also train or test our network on your own data. You can generate your own protobuf files by repurposing our midair-protobuf_generation.py script. When creating your own protobuf files, pay attention to two requirements: all sequences should have the same length, and each element of a sequence should come with the following data:

"image/color_i" : the binary data of the jpeg picture encoding the color data of the frame
"Image/depth_i" : the binary data of the 16-bit png file encoding the stereo disparity map
"data/omega_i" : a list of three float32 numbers corresponding to the angular rotation between two consecutive frames
"data/trans_i" : a list of three float32 numbers corresponding to the translation between two consecutive frames

The subscript i has to be replaced by the index of the data within the trajectory. Translations and rotations are expressed in the axis system of the standard camera frame of reference.
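
The layout listed above can be sketched as follows. This is a hedged illustration, not the code of midair-protobuf_generation.py: the helper names `sequence_feature_dict` and `to_tf_example`, the per-frame input dict, the 0-based index, and the choice of one `tf.train.Example` per sequence are all assumptions. The key spellings are copied from the list above as written.

```python
def sequence_feature_dict(frames):
    """Map a list of per-frame dicts to the flat key/value layout listed above.

    Each (hypothetical) frame dict is expected to hold:
      "color": bytes of the JPEG file,
      "depth": bytes of the 16-bit PNG disparity file,
      "omega": three floats (rotation between two consecutive frames),
      "trans": three floats (translation between two consecutive frames).
    """
    features = {}
    for i, frame in enumerate(frames):  # assumption: indices start at 0
        if len(frame["omega"]) != 3 or len(frame["trans"]) != 3:
            raise ValueError("omega and trans must each hold three values (frame %d)" % i)
        features["image/color_%d" % i] = bytes(frame["color"])
        # Key capitalisation copied from the list above; check it against
        # midair-protobuf_generation.py before relying on it.
        features["Image/depth_%d" % i] = bytes(frame["depth"])
        features["data/omega_%d" % i] = [float(v) for v in frame["omega"]]
        features["data/trans_%d" % i] = [float(v) for v in frame["trans"]]
    return features


def to_tf_example(features):
    """Wrap the flat dict in a tf.train.Example (requires TensorFlow)."""
    import tensorflow as tf  # imported lazily so the helper above works without it

    wrapped = {}
    for key, value in features.items():
        if isinstance(value, bytes):
            wrapped[key] = tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))
        else:
            # FloatList stores its values as float32.
            wrapped[key] = tf.train.Feature(float_list=tf.train.FloatList(value=value))
    return tf.train.Example(features=tf.train.Features(feature=wrapped))
```

Keeping the dictionary construction separate from the TensorFlow wrapping makes it possible to validate the keys and sequence lengths of custom data before any file is written.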

Training

You can launch a training run, or fine-tune an existing model if the log_dir already exists, by executing the following command line:

python3 m4depth_pipeline.py --train_datadir=path/to/protobuf/dir --log_dir=path/to/logdir --dataset=midair --arch_depth=6 --db_seq_len=8 --seq_len=6 --num_batches=200000 -b=3 -g=1 --summary_interval_secs=900 --save_interval_secs=1800

If needed, other options are available for the training phase; they are described in the pipeline_options.py and m4depth_options.py files. Please note that the code can run on multiple GPUs to speed up the training.
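
When launching several runs, the training command above can be assembled programmatically. The sketch below only reuses flags that appear in that command; the function name `training_command` and its parameter names are hypothetical, and reading `-b` as the batch size and `-g` as the number of GPUs is an assumption. The check that `seq_len` does not exceed `db_seq_len` is also an assumption, based on the idea that a training sequence cannot be longer than the sequences stored in the protobuf files.

```python
def training_command(train_datadir, log_dir, seq_len=6, db_seq_len=8,
                     num_batches=200000, batch_size=3, num_gpus=1):
    """Build the argument list for the training command shown above."""
    # Assumption: the network cannot be fed more frames than a stored sequence holds.
    if seq_len > db_seq_len:
        raise ValueError("seq_len should not exceed db_seq_len")
    return [
        "python3", "m4depth_pipeline.py",
        "--train_datadir=%s" % train_datadir,
        "--log_dir=%s" % log_dir,
        "--dataset=midair",
        "--arch_depth=6",
        "--db_seq_len=%d" % db_seq_len,
        "--seq_len=%d" % seq_len,
        "--num_batches=%d" % num_batches,
        "-b=%d" % batch_size,   # assumed to be the batch size
        "-g=%d" % num_gpus,     # assumed to be the number of GPUs
        "--summary_interval_secs=900",
        "--save_interval_secs=1800",
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the root of the repository.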

Testing/Evaluation

You can launch the evaluation of your test samples by executing the following command line:

python3 m4depth_pipeline.py --test_datadir=path/to/protobuf/dir --log_dir=path/to/logdir --dataset=midair --arch_depth=6 --db_seq_len=8 --seq_len=8 --b=3 -g=1

If needed, other options are available for the evaluation phase; they are described in the pipeline_options.py and m4depth_options.py files.

Pretrained model

We provide pretrained weights for our model in the "trained_weights" directory. Testing or evaluating a dataset with these weights can be done by executing the following command line:

python3 m4depth_pipeline.py --test_datadir=path/to/protobuf/dir --log_dir=trained_weights/M4Depth-d6 --dataset=midair --arch_depth=6 --db_seq_len=8 --seq_len=8 --b=3 -g=1

Comments

Bad results on MidAir data

Hi!

I downloaded part of the MidAir data (Kite_training/sunny/00-19) and made protobuf files from it. During testing I get very strange results: the predicted depth maps are completely gray or show a pattern of horizontal or vertical lines (examples)

Here are the error statistics, which are also unsatisfactory:

Average batch processing time : 35.745987
----------------------------------------------------------------
Test results for Abs Rel : b'3.18921' % +/- b'0.11278' (jitter = b'0.50106')
Test results for Sq Rel : b'193.020' % +/- b'8.26548' (jitter = b'37.4728')
Test results for RMSE : b'43.8260' % +/- b'0.89049' (jitter = b'2.44292')
Test results for RMSEl : b'1.41643' % +/- b'0.02271' (jitter = b'0.08594')
Test results for a1 : b'0.14217' % +/- b'0.00765' (jitter = b'0.04241')
Test results for a2 : b'0.26711' % +/- b'0.01180' (jitter = b'0.05237')
Test results for a3 : b'0.37170' % +/- b'0.01334' (jitter = b'0.06998')

Any idea why this is happening?

opened by dimaxano 6

Code freezes during validation step while training

I ran the following command:

python3 m4depth_pipeline.py --train_datadir=/home/dmitry/datasets/MidAir/pb/train/ --val_datadir='/home/dmitry/datasets/MidAir/pb/test/'  --log_dir=/home/dmitry/Documents/repos/M4Depth/logdir/ --dataset=midair --arch_depth=6 --db_seq_len=8 --seq_len=6 --num_batches=200000 -b=1 -g=1 --summary_interval_secs=120 --save_interval_secs=900 --validation_interval_secs=180 --eval_only_last_pic

After a little debugging, I found that the code gets stuck at that line.

Some info about setup:

tf 1.15
2080Ti
MidAir dataset (RGB + Stereo Disparities)

@michael-fonder Do you have any idea where I should look for the source of the freeze?

opened by dimaxano 4

About the license of this repository

Thanks for the great work!

I understand that the datasets used are licensed under CC4.0 and cannot be used commercially without special negotiation. What license do you assume for the source code or models committed to this repository of yours?

I'm sure you are busy, so I won't hurry to answer at all. I'm sure other researchers and engineers are wondering the same thing.

Thank you.

opened by PINTO0309 2

Missing information about datasets_locations.json

Hey Michaël,

Just a suggestion here, because it is somehow intuitive, but maybe you should add a sentence about the datasets_locations.json file in the README.

Best, Pascal

opened by PaLeroy 1

stereo disparity vs depth map

As I understand it, you used stereo disparities for training, but according to the paper, depth maps should be used.

The two modalities are very similar. Does it really make a difference which one is used as ground truth: depth map or disparity?

opened by dimaxano 1

missing module error

Hey Michael, thank you so much for this amazing piece of work. I was trying out your code and I repeatedly end up with an error saying "no module named protobuf_db". I was wondering how I can solve this error. I could not find a module directly related to the error.

Any help would be appreciated.

opened by dskuma 0

Owner

Michaël Fonder

PhD candidate in computer vision and deep learning. Interested in drone flight automation by using an on-board mounted monocular camera.

