DRO: Deep Recurrent Optimizer for Structure-from-Motion
This is the official PyTorch implementation of DRO-sfm. For technical details, please refer to:
DRO: Deep Recurrent Optimizer for Structure-from-Motion
Xiaodong Gu*, Weihao Yuan*, Zuozhuo Dai, Chengzhou Tang, Siyu Zhu, Ping Tan
[Paper]
Bibtex
If you find this code useful in your research, please cite:
@article{gu2021dro,
title={DRO: Deep Recurrent Optimizer for Structure-from-Motion},
author={Gu, Xiaodong and Yuan, Weihao and Dai, Zuozhuo and Tang, Chengzhou and Zhu, Siyu and Tan, Ping},
journal={arXiv preprint arXiv:2103.13201},
year={2021}
}
Contents
Install
- We recommend using nvidia-docker2 for a reproducible environment.
git clone https://github.com/aliyun/dro-sfm.git
cd dro-sfm
sudo make docker-build
sudo make docker-start-interactive
You can also download the pre-built Docker image directly from dro-sfm-image.tar and load it:
docker load < dro-sfm-image.tar
- If you do not use Docker, you can create an environment by following the steps in the Dockerfile:
# Environment variables
export PYTORCH_VERSION=1.4.0
export TORCHVISION_VERSION=0.5.0
export NCCL_VERSION=2.4.8-1+cuda10.1
export HOROVOD_VERSION=65de4c961d1e5ad2828f2f6c4329072834f27661
# Install NCCL
sudo apt-get install libnccl2=${NCCL_VERSION} libnccl-dev=${NCCL_VERSION}
# Install Open MPI
mkdir /tmp/openmpi && \
cd /tmp/openmpi && \
wget https://www.open-mpi.org/software/ompi/v4.0/downloads/openmpi-4.0.0.tar.gz && \
tar zxf openmpi-4.0.0.tar.gz && \
cd openmpi-4.0.0 && \
./configure --enable-orterun-prefix-by-default && \
make -j $(nproc) all && \
make install && \
ldconfig && \
rm -rf /tmp/openmpi
# Install PyTorch
pip install torch==${PYTORCH_VERSION} torchvision==${TORCHVISION_VERSION} && ldconfig
# Install horovod (for distributed training)
sudo ldconfig /usr/local/cuda/targets/x86_64-linux/lib/stubs && HOROVOD_GPU_ALLREDUCE=NCCL HOROVOD_GPU_BROADCAST=NCCL HOROVOD_WITH_PYTORCH=1 pip install --no-cache-dir git+https://github.com/horovod/horovod.git@${HOROVOD_VERSION} && sudo ldconfig
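Before running the full overfitting test below, a quick import check can catch a broken install early. The script below is not part of the repository, just a minimal sanity check of the dependencies installed above:

```python
# env_check.py -- minimal sanity check (illustrative, not part of dro-sfm)
import torch
import torchvision
import horovod.torch as hvd

print("torch:", torch.__version__)
print("torchvision:", torchvision.__version__)
print("CUDA available:", torch.cuda.is_available())

hvd.init()  # should succeed even on a single machine
print("horovod world size:", hvd.size())
```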
To verify that the environment is set up correctly, you can run a simple overfitting test:
# download a tiny subset of KITTI
cd dro-sfm
curl -s https://virutalbuy-public.oss-cn-hangzhou.aliyuncs.com/share/dro-sfm/datasets/KITTI_tiny.tar | tar xv -C /data/datasets/kitti/
# in docker
./run.sh "python scripts/train.py configs/overfit_kitti_mf_gt.yaml" log.txt
Datasets
Datasets are assumed to be downloaded into /data/datasets/, which can be a symbolic link (e.g. `ln -s /path/to/your/datasets /data/datasets`).
KITTI
The KITTI (raw) dataset used in our experiments can be downloaded from the KITTI website. For convenience, you can also download the data from packnet or here.
Tiny KITTI
For simple tests, you can download a "tiny" version of KITTI (the same subset used in the overfitting test above):
curl -s https://virutalbuy-public.oss-cn-hangzhou.aliyuncs.com/share/dro-sfm/datasets/KITTI_tiny.tar | tar xv -C /data/datasets/kitti/
ScanNet
The ScanNet (raw) dataset used in our experiments can be downloaded from the ScanNet website. For convenience, you can also download the data from here.
DeMoN
Download and prepare the DeMoN datasets:
bash download_traindata.sh
python ./dataset/preparation/preparedata_train.py
bash download_testdata.sh
python ./dataset/preparation/preparedata_test.py
Training
Any training, including fine-tuning, can be done by passing either a .yaml config file or a .ckpt model checkpoint to scripts/train.py; passing a .ckpt resumes or fine-tunes from that checkpoint:
# kitti, checkpoints will be saved in ./results/model/
./run.sh 'python scripts/train.py configs/train_kitti_mf_gt.yaml' logs/kitti_sup.txt
./run.sh 'python scripts/train.py configs/train_kitti_mf_selfsup.yaml' logs/kitti_selfsup.txt
# scannet
./run.sh 'python scripts/train.py configs/train_scannet_mf_gt_view3.yaml' logs/scannet_sup.txt
./run.sh 'python scripts/train.py configs/train_scannet_mf_selfsup_view3.yaml' logs/scannet_selfsup.txt
./run.sh 'python scripts/train.py configs/train_scannet_mf_gt_view5.yaml' logs/scannet_sup_view5.txt
# demon
./run.sh 'python scripts/train.py configs/train_demon_mf_gt.yaml' logs/demon_sup.txt
Evaluation
python scripts/eval.py --checkpoint <checkpoint.ckpt> [--config <config.yaml>]
# example: kitti, results will be saved in results/depth/
python scripts/eval.py --checkpoint ckpt/outdoor_kitti.ckpt --config configs/train_kitti_mf_gt.yaml
You can also directly run inference on a single image or video:
# video or folder
# indoor (scannet)
python scripts/infer_video.py --checkpoint ckpt/indoor_scannet.ckpt --input /path/to/video_or_folder --output /path/to/save_folder --sample_rate 1 --data_type scannet --ply_mode
# indoor (general)
python scripts/infer_video.py --checkpoint ckpt/indoor_scannet.ckpt --input /path/to/video_or_folder --output /path/to/save_folder --sample_rate 1 --data_type general --ply_mode
# outdoor (kitti)
python scripts/infer_video.py --checkpoint ckpt/outdoor_kitti.ckpt --input /path/to/video_or_folder --output /path/to/save_folder --sample_rate 1 --data_type kitti --ply_mode
# image
python scripts/infer.py --checkpoint <checkpoint.ckpt> --input <image or folder> --output <image or folder>
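With --ply_mode, infer_video.py additionally exports the predicted depth as a point cloud. For reference, the sketch below shows the standard pinhole backprojection that turns a depth map into 3D points; it is an illustration, not the repository's code, and the intrinsics fx, fy, cx, cy are placeholders to be replaced with your camera's calibration:

```python
import numpy as np

def backproject(depth, fx, fy, cx, cy):
    """Backproject an HxW depth map to an Nx3 point cloud (pinhole model)."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel coordinates
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    pts = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return pts[pts[:, 2] > 0]  # drop invalid (zero-depth) pixels

# placeholder depth map and intrinsics -- use real predictions/calibration in practice
points = backproject(np.random.rand(480, 640).astype(np.float32),
                     fx=577.87, fy=577.87, cx=319.5, cy=239.5)
print(points.shape)
```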
Models
Model | Abs.Rel | Sqr.Rel | RMSE | RMSElog | a1 | a2 | a3 | SILog | L1_inv | rot_ang | t_ang | t_cm |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Kitti_sup | 0.045 | 0.193 | 2.570 | 0.080 | 0.971 | 0.994 | 0.998 | 0.079 | 0.003 | - | - | - |
Kitti_selfsup | 0.053 | 0.346 | 3.037 | 0.102 | 0.962 | 0.990 | 0.996 | 0.101 | 0.004 | - | - | - |
scannet_sup | 0.053 | 0.017 | 0.165 | 0.080 | 0.967 | 0.994 | 0.998 | 0.078 | 0.033 | 0.472 | 9.297 | 1.160 |
scannet_sup(view5) | 0.047 | 0.014 | 0.151 | 0.072 | 0.976 | 0.996 | 0.999 | 0.071 | 0.030 | 0.456 | 8.502 | 1.163 |
scannet_selfsup | 0.143 | 0.345 | 0.656 | 0.274 | 0.896 | 0.954 | 0.969 | 0.272 | 0.106 | 0.609 | 10.779 | 1.393 |
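The depth columns follow the standard evaluation metrics from the monocular-depth literature (a1/a2/a3 are the δ < 1.25 / 1.25² / 1.25³ accuracies; rot_ang, t_ang, and t_cm are relative-pose errors). A minimal numpy sketch of the depth metrics under their usual definitions (assumed here, not taken from the repository's eval code):

```python
import numpy as np

def depth_metrics(gt, pred):
    """Standard depth metrics over valid pixels (gt > 0); usual definitions assumed."""
    mask = gt > 0
    gt, pred = gt[mask], pred[mask]
    thresh = np.maximum(gt / pred, pred / gt)   # per-pixel ratio for a1/a2/a3
    log_err = np.log(pred) - np.log(gt)
    return {
        "abs_rel":  np.mean(np.abs(gt - pred) / gt),
        "sqr_rel":  np.mean((gt - pred) ** 2 / gt),
        "rmse":     np.sqrt(np.mean((gt - pred) ** 2)),
        "rmse_log": np.sqrt(np.mean(log_err ** 2)),
        "a1":       np.mean(thresh < 1.25),
        "a2":       np.mean(thresh < 1.25 ** 2),
        "a3":       np.mean(thresh < 1.25 ** 3),
        "silog":    np.sqrt(np.mean(log_err ** 2) - np.mean(log_err) ** 2),
        "l1_inv":   np.mean(np.abs(1.0 / gt - 1.0 / pred)),
    }
```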
Acknowledgements
Thanks to Toyota Research Institute for open-sourcing their excellent work packnet-sfm, and to Zachary Teed for open-sourcing his excellent work RAFT.