Repository for "Toward Practical Monocular Indoor Depth Estimation" (CVPR 2022)

Overview

Toward Practical Monocular Indoor Depth Estimation

Cho-Ying Wu, Jialiang Wang, Michael Hall, Ulrich Neumann, Shuochen Su

[arXiv] [project site]

DistDepth

Our DistDepth is a highly robust monocular depth estimation approach for generic indoor scenes.

  • Trained on stereo sequences without ground-truth depth
  • Structured and metric-accurate
  • Runs at an interactive rate on a laptop GPU
  • Sim-to-real: trained on simulation, transfers to real scenes

Single Image Inference Demo

We test on Ubuntu 20.04 LTS with a laptop NVIDIA RTX 2080 GPU (only GPU mode is supported).

Install packages

  1. Use conda

    conda create --name distdepth python=3.8
    conda activate distdepth

  2. Install prerequisite packages. Go to https://pytorch.org/get-started/locally/ and install a PyTorch build compatible with your machine. We test on PyTorch v1.9.0 with cudatoolkit 11.3. (The code should work under other v1.0+ versions.)

    conda install pytorch==1.9.0 torchvision==0.10.0 torchaudio==0.9.0 cudatoolkit=11.3 -c pytorch -c conda-forge

  3. Install other dependencies: opencv-python and matplotlib.

    pip install opencv-python matplotlib
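
  4. (Optional) Verify that PyTorch can see your GPU, since only GPU mode is supported:

    python -c "import torch; print(torch.__version__, torch.cuda.is_available())"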

Download pretrained models

  1. Download pretrained models [here] (ResNet152, 246MB).

  2. Move the downloaded archive into this folder and unzip it. You should see a new folder 'ckpts' that contains the pretrained models.

  3. Run

    python demo.py

  4. Results will be stored under results/
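
For reference, below is a minimal sketch of what a single-image inference script like demo.py typically does in Monodepth2-style self-supervised depth codebases, which this repo appears to follow (the checkpoints ship as encoder.pth and depth.pth). The module names (ResnetEncoder, DepthDecoder), the ("disp", 0) output key, the 256x256 input size, and the 0.1 m / 10 m depth bounds are illustrative assumptions; consult demo.py for the exact API.

    import os

    import cv2
    import numpy as np
    import torch
    import torch.nn.functional as F

    # Hypothetical imports; the actual module layout may differ -- see demo.py.
    from networks import DepthDecoder, ResnetEncoder

    device = torch.device("cuda")  # only GPU mode is supported

    # Load the pretrained weights from the unzipped 'ckpts' folder.
    encoder = ResnetEncoder(152, False)
    encoder.load_state_dict(torch.load("ckpts/encoder.pth"), strict=False)
    decoder = DepthDecoder(num_ch_enc=encoder.num_ch_enc)
    decoder.load_state_dict(torch.load("ckpts/depth.pth"), strict=False)
    encoder.to(device).eval()
    decoder.to(device).eval()

    # Read an RGB image and resize it to the (assumed) network input size.
    img = cv2.cvtColor(cv2.imread("data/sample.png"), cv2.COLOR_BGR2RGB)
    x = torch.from_numpy(img).float().permute(2, 0, 1)[None] / 255.0
    x = F.interpolate(x, (256, 256), mode="bilinear", align_corners=False).to(device)

    with torch.no_grad():
        disp = decoder(encoder(x))[("disp", 0)]  # sigmoid disparity in [0, 1]
        # Assumed 0.1 m / 10 m depth bounds (see the Comments section below).
        min_disp, max_disp = 1.0 / 10.0, 1.0 / 0.1
        depth = 1.0 / (min_disp + (max_disp - min_disp) * disp)  # metric depth

    os.makedirs("results", exist_ok=True)
    np.save("results/depth.npy", depth.squeeze().cpu().numpy())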

Data

Download SimSIN [here]. For UniSIN and VA, please download them from the [project site].

Depth-aware AR effects

Virtual object insertion:

Dragging objects along a trajectory:

Citation

@inproceedings{wu2022toward,
  title={Toward Practical Monocular Indoor Depth Estimation},
  author={Wu, Cho-Ying and Wang, Jialiang and Hall, Michael and Neumann, Ulrich and Su, Shuochen},
  booktitle={CVPR},
  year={2022}
}

License

DistDepth is CC-BY-NC licensed, as found in the LICENSE file.

Comments
  • Google Drive data download fails

    File downloads fail because single Google Drive files are too large; for example, UniSIN is 457 GB (https://drive.google.com/file/d/15Pz2pr9u1nS809wXYGroEDFEprBVCTF0/view?usp=sharing).

    opened by aiforworlds 4
  • Camera intrinsics of the SimSIN dataset

    It is appreciated that you publish new indoor datasets; thanks for your work. I have some questions about the SimSIN dataset.

    1. Where can I find the camera intrinsics of the SimSIN dataset? The dataset zip does not include a readme or the camera intrinsics.
    2. Could a dataset.py file that reads the SimSIN dataset be published?

    Thanks for your work, and looking forward to your reply.

    opened by rainfall1998 4
  • Problems with evaluation results on the NYUv2 dataset

    Hi, thanks for the wonderful work! I ran an evaluation on the NYUv2 test set using code based on demo.py and the provided pretrained model, adding an h5py read function and the error-metric computation.

    However, I got results worse than those reported in the paper, so I wonder whether my evaluation code or the model is wrong. The results and evaluation code follow.

    Thanks for your work, and looking forward to your reply.

    Evaluation results:

        val_error/abs_rel | val_error/sq_rel | val_error/rmse | val_error/rmse_log | val_error/lg10 | val_acc/a1 | val_acc/a2 | val_acc/a3
        0.17270 | 0.14038 | 0.59941 | 0.21287 | 0.07320 | 0.73734 | 0.93800 | 0.98530

    Evaluation code (.py attachments are not supported, so it is renamed to .txt): nyutest_demo.py.txt

    opened by rainfall1998 3
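
    For readers reproducing these numbers: the metric names above follow the standard protocol of Eigen et al. A self-contained sketch of those formulas (not the repository's own evaluation code):

        import torch

        def compute_depth_errors(gt, pred):
            """Standard Eigen-style depth metrics over 1-D tensors of valid
            (masked) ground-truth and predicted depths in meters."""
            thresh = torch.max(gt / pred, pred / gt)
            a1 = (thresh < 1.25).float().mean()
            a2 = (thresh < 1.25 ** 2).float().mean()
            a3 = (thresh < 1.25 ** 3).float().mean()
            abs_rel = torch.mean(torch.abs(gt - pred) / gt)
            sq_rel = torch.mean((gt - pred) ** 2 / gt)
            rmse = torch.sqrt(torch.mean((gt - pred) ** 2))
            rmse_log = torch.sqrt(torch.mean((torch.log(gt) - torch.log(pred)) ** 2))
            lg10 = torch.mean(torch.abs(torch.log10(gt) - torch.log10(pred)))
            return abs_rel, sq_rel, rmse, rmse_log, lg10, a1, a2, a3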
  • About the UniSIN data collection method

    When you used the ZED 2i camera to collect data, did you just hold it in your hands or use a stabilizing device? I found that the video shakes when I hold the camera directly. Is there any way to stabilize the ZED 2i while collecting data? Thank you very much.

    opened by aiforworlds 2
  • Question about preparing the NYUv2 dataset for evaluation

    Hi, thank you so much for releasing your code. I am trying to run the evaluation script on the NYUv2 dataset. Based on the README, the data should be laid out like this:

        ├── NYUv2
        │   ├── img_val
        │   │   ├── 00001.png
        │   │   ......
        │   ├── depth_val
        │   │   ├── 00001.npy
        │   │   ......
        │   ├── NYUv2.txt

    I have downloaded the NYUv2 labeled dataset from here, which is a .mat file. Is this the dataset you used for evaluation? If yes, do you have any scripts to extract data from this .mat file into the format mentioned in the README?

    opened by afazel 2
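
    In case it helps others with the same question: nyu_depth_v2_labeled.mat is an HDF5 file, so it can be read with h5py. A sketch that writes out the layout above, assuming the standard 'images'/'depths' keys and transposed axis order of that file (the NYUv2.txt split-file format is a guess):

        import os

        import cv2
        import h5py
        import numpy as np

        f = h5py.File("nyu_depth_v2_labeled.mat", "r")
        os.makedirs("NYUv2/img_val", exist_ok=True)
        os.makedirs("NYUv2/depth_val", exist_ok=True)

        with open("NYUv2/NYUv2.txt", "w") as split:
            for i in range(f["images"].shape[0]):
                # Images are stored as (3, W, H); transpose to (H, W, 3).
                img = np.ascontiguousarray(np.transpose(f["images"][i], (2, 1, 0)))
                cv2.imwrite(f"NYUv2/img_val/{i + 1:05d}.png",
                            cv2.cvtColor(img, cv2.COLOR_RGB2BGR))
                # Depths are stored as (W, H) in meters; transpose to (H, W).
                np.save(f"NYUv2/depth_val/{i + 1:05d}.npy", np.array(f["depths"][i]).T)
                # Split-file format is a guess; adjust to what the eval script expects.
                split.write(f"{i + 1:05d}\n")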
  • Some questions about the demo (inferring on single images in general)

    What is the 1.312 constant in the output_to_depth function? And what are the 0.1 and 10 inputs to the function? Are they based on the training data, and should they change during inference? In the demo visualization you use 0.1 and 5 instead of 0.1 and 10. Is the output of output_to_depth in meters, and is the output of the net (the sigmoid) disparity? I didn't see any reference in the repo or the paper. Thanks!

    opened by gozi1123 1
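
    For context: output_to_depth appears to follow the common Monodepth2-style disparity-to-depth convention, in which the network's sigmoid output is linearly rescaled between assumed minimum and maximum depths (here 0.1 m and 10 m) and then inverted to metric depth. A sketch of that convention (an assumption; it does not account for the 1.312 constant):

        def output_to_depth(disp, min_depth=0.1, max_depth=10.0):
            """Map a sigmoid disparity in [0, 1] to metric depth in
            [min_depth, max_depth] (Monodepth2-style convention; the
            default bounds are assumptions, not confirmed by the repo)."""
            min_disp = 1.0 / max_depth
            max_disp = 1.0 / min_depth
            scaled_disp = min_disp + (max_disp - min_disp) * disp
            return 1.0 / scaled_disp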
  • Questions about the loss computation

    Hi, thanks for releasing the code!

    I have several questions after reading execute_func.py:

    • At L78, the DPT is initialized with invert=True, which means the output is already depth, yet at L253 a division is performed again to convert the DPT output to depth. Is there a comment error here? From what I understand, the code at L254 seems to be correct.
    • At L378, the pseudo-depth error is computed between outputs["fromMono_dep"] and outputs[('depth', 0, 0)], which have different scales. Should outputs["fromMono_dep"] be replaced by target_depth from L353, since its scale has been aligned?

    Looking forward to your reply.

    Best regards!

    opened by w2kun 3
  • How to generate SimSIN data?

    Hi,

    Thanks for your nice work. I want to generate data from HM3D. Could you share the script for generating stereo data with depth?

    Thank you!

    opened by HLinChen 2
  • Question about the scaling factor in the evaluation script

    I was wondering what the motivation is for using this scaling factor during evaluation:

        depth_pred *= torch.median(depth_gt) / torch.median(depth_pred)

    https://github.com/facebookresearch/DistDepth/blob/dde30a4cecd5457f3c4fa7ab7bf7d5c8ea95934a/execute_func.py

    If we remove this scaling, the results do not align with what is reported in the paper; for the NYUv2 dataset, the RMSE increases from 0.58 to 0.99.

    Can you please explain why you used this scaling during evaluation? Is it reasonable to scale up using ground-truth information?

    Also, in compute_depth_errors(), how did you decide on 1.25 as a threshold?

    opened by afazel 1
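
    For background: median scaling is the standard protocol for evaluating self-supervised monocular depth, whose predictions are inherently scale-ambiguous; it aligns the prediction's median with the ground truth's before computing errors. The 1.25 threshold likewise comes from the standard protocol of Eigen et al. A sketch of the step being discussed (the validity mask and clamp bounds are assumptions):

        import torch

        def median_scale(depth_pred, depth_gt, min_d=0.1, max_d=10.0):
            """Align the prediction's scale to the ground truth via the ratio
            of medians over valid pixels, then clamp to the (assumed) range."""
            mask = (depth_gt > min_d) & (depth_gt < max_d)
            pred, gt = depth_pred[mask], depth_gt[mask]
            pred = pred * (torch.median(gt) / torch.median(pred))
            return pred.clamp(min_d, max_d), gt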
  • FileNotFoundError: [Errno 2] No such file or directory: 'ckpts-distdepth-M-101-SimSIN-DPTLegacy/pose_encoder.pth'

    I ran bash eval.sh but got this error:

        FileNotFoundError: [Errno 2] No such file or directory: 'ckpts-distdepth-M-101-SimSIN-DPTLegacy/pose_encoder.pth'

    Where can I download pose_encoder.pth? Unzipping ckpts-distdepth-M-101-SimSIN-DPTLegacy only gives depth.pth and encoder.pth.

    opened by deepuav 1