Code release of paper "Deep Multi-View Stereo gone wild"

François Darmon

Last update: Dec 24, 2022

Overview

Deep MVS gone wild

PyTorch implementation of "Deep MVS gone wild" (Paper | website)

This repository provides the code to reproduce the experiments of the paper. It implements an extensive comparison of deep MVS architectures, training data, and supervision strategies.

If you find this repository useful for your research, please consider citing:

@article{
  author    = {Darmon, Fran{\c{c}}ois  and
               Bascle, B{\'{e}}n{\'{e}}dicte  and
               Devaux, Jean{-}Cl{\'{e}}ment  and
               Monasse, Pascal  and
               Aubry, Mathieu},
  title     = {Deep Multi-View Stereo gone wild},
  year      = {2021},
  url       = {https://arxiv.org/abs/2104.15119},
}

Installation

Python packages: see requirements.txt
Fusibile:

git clone https://github.com/YoYo000/fusibile 
cd fusibile
cmake .
make
ln -s EXE ./fusibile

COLMAP: see the GitHub repository for installation details, then link the colmap executable with ln -s COLMAP_DIR/build/src/exe/colmap colmap
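Once both links are in place, a quick sanity check can confirm that they resolve to executable files. The sketch below is not part of the repository; it assumes the links are named fusibile and colmap and live in the repository root, as the commands above suggest, and the helper name check_executables is hypothetical.

```python
import os
from pathlib import Path


def check_executables(repo_root=".", names=("fusibile", "colmap")):
    """Map each expected link name to True if it resolves to an executable file."""
    root = Path(repo_root)
    status = {}
    for name in names:
        path = root / name
        # is_file() follows symlinks, so a dangling link is reported as missing.
        status[name] = path.is_file() and os.access(path, os.X_OK)
    return status


if __name__ == "__main__":
    for name, ok in check_executables().items():
        print(f"{name}: {'ok' if ok else 'missing or not executable'}")
```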

Training

All pretrained models are available here (120 MB). Alternatively, you can train models yourself using the following instructions.

Data

Download the following data and extract it into the datasets folder:

DTU training (19 GB)
BlendedMVS (27.5 GB)
MegaDepth: MegaDepthV1 (199 GB), Geometry (8 GB)

The directory structure should be as follows:

datasets
├─ blended
├─ dtu_train
├─ MegaDepth_v1
├─ undistorted_md_geometry

The DTU and BlendedMVS data are already preprocessed. For MegaDepth, run python preprocess.py to generate the training data.
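Before launching preprocessing or training, it can help to verify that the layout matches the listing above. This is a minimal sketch, not code from the repository: the folder names are copied from the listing, while the function name missing_dataset_dirs and the default root datasets are assumptions.

```python
from pathlib import Path

# Folder names taken from the directory listing above.
EXPECTED_TRAIN_DIRS = (
    "blended",
    "dtu_train",
    "MegaDepth_v1",
    "undistorted_md_geometry",
)


def missing_dataset_dirs(datasets_root="datasets", expected=EXPECTED_TRAIN_DIRS):
    """Return the expected sub-folders that are absent under datasets_root."""
    root = Path(datasets_root)
    return [name for name in expected if not (root / name).is_dir()]
```

An empty list means every expected folder was found; any returned name points at a download that still needs to be extracted.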

Script

The training script is train.py; run python train.py --help to list all the options. For example:

python train.py --architecture vis_mvsnet --dataset md --supervised --logdir best_sup --world_size 4 --batch_size 4 for training the best performing setup for images in the wild.
python train.py --architecture mvsnet-s --dataset md --unsupervised --upsample --occ_masking --epochs 5 --lrepochs 4:10 --logdir best_unsup --world_size 3 for the best unsupervised model.

The models are saved in the trained_models folder.
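The --lrepochs 4:10 value in the unsupervised example is not explained here. The sketch below assumes it follows the convention used in MVSNet_pytorch, where the string lists comma-separated epoch indices, then a colon, then the factor by which the learning rate is divided at each of those epochs. Both helper names are hypothetical, and train.py may interpret the flag differently.

```python
def parse_lrepochs(spec):
    """Split a string such as "10,12,14:2" into ([10, 12, 14], 2.0)."""
    epochs_part, factor_part = spec.split(":")
    milestones = [int(e) for e in epochs_part.split(",") if e]
    return milestones, float(factor_part)


def lr_at_epoch(base_lr, epoch, spec):
    """Learning rate at a given epoch under the assumed step-decay convention."""
    milestones, factor = parse_lrepochs(spec)
    # Assumption: the rate is divided once for every milestone already reached.
    n_decays = sum(1 for m in milestones if epoch >= m)
    return base_lr / (factor ** n_decays)
```

Under this reading, --epochs 5 --lrepochs 4:10 would train four epochs at the base rate and the last one at a tenth of it.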

Evaluations

We provide code for both depthmap evaluation and 3D reconstruction evaluation

Data

Download the data from the following links and extract it to datasets:

BlendedMVS (27.5 GB), same link as the BlendedMVS training data
YFCC depth maps (1.1 GB)
DTU MVS benchmark: create the directory datasets/dtu_eval and extract the following files into it
- Images (500 MB); rename the extracted folder to images
- Ground truth (6.3 GB)
- Evaluation files (6.3 GB); the evaluation only needs the ObsMask folder
In the end, the folder structure should be
```
datasets
├─ dtu_eval
    ├─ ObsMask
    ├─ images
    ├─ Points
        ├─ stl
```
YFCC 3D reconstruction (1.5 GB)

Depthmap evaluation

python depthmap_eval.py --model MODEL --dataset DATA

MODEL is the name of a folder found in trained_models
DATA is the evaluation dataset, either yfcc or blended
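To evaluate every trained model on both datasets, the command above can be generated in a loop. This is a sketch under the assumption that each sub-folder of trained_models is a valid MODEL name; the helper depthmap_eval_commands is hypothetical and only builds the argument lists, leaving execution to the caller.

```python
from pathlib import Path

# Dataset names accepted by --dataset, as listed above.
DATASETS = ("yfcc", "blended")


def depthmap_eval_commands(models_dir="trained_models", datasets=DATASETS):
    """Build one depthmap_eval.py command per (model folder, dataset) pair."""
    models = sorted(p.name for p in Path(models_dir).iterdir() if p.is_dir())
    return [
        ["python", "depthmap_eval.py", "--model", model, "--dataset", data]
        for model in models
        for data in datasets
    ]
```

Each returned list can be passed to subprocess.run(cmd, check=True), or joined with spaces and printed for inspection first.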

3D reconstruction

See python reconstruction_pipeline.py --help for a complete list of 3D reconstruction parameters. To run the whole evaluation for a trained model with the parameters used in the paper, run:

scripts/eval3d_dtu.sh --model MODEL (--compute_metrics) for DTU evaluation
scripts/eval3d_yfcc.sh --model MODEL (--compute_metrics) for YFCC 3D evaluation

The reconstruction will be written to datasets/dtu_eval/Points or datasets/yfcc_data/Points.
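Point-cloud reconstructions are commonly scored with precision, recall, and F-score at a distance threshold. The sketch below illustrates that general definition only; it is not the code behind --compute_metrics, the function name fscore is hypothetical, and the threshold is left to the caller. It uses a brute-force nearest-neighbour search, so it is only practical for small point sets.

```python
import math


def fscore(reconstruction, ground_truth, threshold):
    """Precision, recall and F-score between two point lists at a distance threshold."""

    def fraction_within(src, dst):
        # Share of points in src whose nearest neighbour in dst is within threshold.
        if not src or not dst:
            return 0.0
        hits = sum(
            1 for p in src if min(math.dist(p, q) for q in dst) <= threshold
        )
        return hits / len(src)

    precision = fraction_within(reconstruction, ground_truth)
    recall = fraction_within(ground_truth, reconstruction)
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)
```

Precision measures how much of the reconstruction lies close to the ground truth, recall measures how much of the ground truth is covered, and the F-score is their harmonic mean.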

Acknowledgments

This repository is inspired by the MVSNet_pytorch and MVSNet repositories. We also adapt the official implementations of Vis_MVSNet and CVP_MVSNet.

Copyright

Deep MVS Gone Wild. All rights reserved to Thales LAS and ENPC.

This code is freely available for academic use only and provided “as is” without any warranty.

Modifications are allowed for academic research provided that the following conditions are met:
  * Redistributions of source code or any format must retain the above copyright notice and this list of conditions.
  * Neither the name of Thales LAS and ENPC nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

Comments

gt depth map fusion parameters

Hi, thanks for a good work. Could you provide your exact parameters for generating gt through fusing depth maps from IMC? I was fusing these scenes with parameters specified in paper "reprojection error below half a pixel and depth error below 1%", but the fusing result is not as good as the gt you provided. Thanks!

opened by Burningdust21 4

Urban datasets

Hi, thank you for sharing both the paper and the code. I'm working on something similar, so I was very happy to read your results.

I would like to ask if you ever considered urban datasets in your evaluation, especially multi-camera datasets such as nuScenes or DDAD. I'm asking this for two main reasons:

Internet data are surely "in the wild", but they mostly focus on a single giant object (i.e. a building) in the image, for which a lot of diverse views can be captured. On the other hand, urban data have no clear subject, a lot of dynamic objects and several textureless areas to deal with, which is definitely an even harder test for MVS networks.

MVS networks cannot be trained in a supervised way on urban data, therefore your insights on unsupervised methods might be interesting to be validated also on these kind of data.

What are your thoughts on this?
opened by morsingher 1

Re-computing f-scores reported in paper

Hi there,

Thanks for the great work!

I'm trying to re-compute the metrics reported in Table 4 of your paper (prec., rec., f-score on YFCC evaluation) but there doesn't appear to be code in your repo for doing this. I have my own function for computing the metrics, but am a bit confused about setting a threshold.

First, do you have the thresholds you used for each YFCC scene stored anywhere? I tried using the values stored in the text files in yfcc_data/gt_resolution, but using your provided trained models with this threshold doesn't give me the same results you report in Table 4.

Second, is Eq. 7 in the paper correct? I understand the goal with Eq. 7, but shouldn't the distance argument be ||K^-1 (D(p)p) - K^-1 (D(p')p')||. This way, you find the median distance in scene space between back-projected points 2 pixels away from each other. Is this perhaps what you used or did you use Eq. 7 as is?

Thanks! Alex

opened by alexrich021 0

Owner

François Darmon

PhD student in 3D computer vision at Imagine team ENPC and Thales LAS FRANCE
