A Confidence-based Iterative Solver of Depths and Surface Normals for Deep Multi-view Stereo



Paper | Project Page

This repository contains the code release of our ICCV 2021 paper:

A Confidence-based Iterative Solver of Depths and Surface Normals for Deep Multi-view Stereo

Wang Zhao*, Shaohui Liu*, Yi Wei, Hengkai Guo, Yong-Jin Liu


We recommend using conda to set up a dedicated environment. Run:

conda env create -f environment.yml

Test on a sequence

First, download the pretrained model from here and put it under the ./pretrain/ folder.

Prepare the sequence data with color images, camera poses (4x4 cam2world transformation), and intrinsics. The sequence data should be organized as follows:

  | color
      | 00000.jpg
  | pose
      | 00000.txt
  | K.txt
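Before running inference, it can help to confirm that a folder matches this layout. The sketch below is a hypothetical helper (`check_sequence` is not part of this repository). It assumes that every color image has a pose file with the same stem, and that each pose file holds 16 whitespace-separated numbers forming the 4x4 matrix; neither assumption is confirmed by the release.

```python
from pathlib import Path


def check_sequence(seq_dir):
    """Return the sorted frame ids that have both a color image and a pose.

    Hypothetical helper, not part of the released code. Assumes
    color/<id>.jpg pairs with pose/<id>.txt and K.txt sits at the root.
    """
    seq = Path(seq_dir)
    if not (seq / "K.txt").is_file():
        raise FileNotFoundError("missing K.txt")

    colors = {p.stem for p in (seq / "color").glob("*.jpg")}
    poses = {p.stem for p in (seq / "pose").glob("*.txt")}
    unmatched = colors ^ poses  # ids present in only one of the two folders
    if unmatched:
        raise ValueError(f"frames without a color/pose pair: {sorted(unmatched)}")

    for frame in sorted(poses):
        values = (seq / "pose" / f"{frame}.txt").read_text().split()
        if len(values) != 16:
            raise ValueError(f"pose {frame} is not a 4x4 matrix")
        for v in values:
            float(v)  # raises ValueError on a non-numeric entry
    return sorted(colors)
```

Calling `check_sequence("/path/to/the/sequence/data")` either returns the list of usable frame ids or raises with a message naming the problem.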

Run the following command to get the outputs:

python infer_folder.py --seq_dir /path/to/the/sequence/data --output_dir /path/to/save/outputs --config ./configs/test_folder.yaml

Tune the "reference_gap" parameter so that each image pair has sufficient overlap and camera translation. For ScanNet-like sequences, we recommend a reference_gap of 20.
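One rough way to judge a candidate gap is to look at the camera translation it produces. The sketch below is an illustration, not the repository's pairing logic: it assumes a pair is formed from frames `i` and `i + reference_gap`, and it measures only the distance between camera centers (the last column of each cam2world matrix), not image overlap. The function name `baselines` is hypothetical.

```python
import math


def baselines(cam2world_poses, reference_gap):
    """Distance between camera centers of frames i and i + reference_gap.

    Each pose is a 4x4 cam2world matrix given as nested lists; the first
    three entries of its last column are the camera center in world space.
    The (i, i + gap) pairing is an assumption made for illustration.
    """
    centers = [(p[0][3], p[1][3], p[2][3]) for p in cam2world_poses]
    result = []
    for i in range(len(centers) - reference_gap):
        a, b = centers[i], centers[i + reference_gap]
        result.append(math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b))))
    return result
```

If most baselines are close to zero, the gap is probably too small for the sequence; if they are very large, the paired views may no longer overlap.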

Test on ScanNet

Prepare ScanNet test split data

Download the ScanNet test split data from the official site and pre-process the data using:

python ./data/preprocess.py --data_dir /path/to/scannet/test/split/ --output_dir /path/to/save/pre-processed/scannet/test/data

This step (1) resizes the color images to 480x640 resolution and (2) samples the data at an interval of 20 frames.
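The two operations can be sketched as follows. This is not the code in `preprocess.py`: the function names are hypothetical, 480x640 is read as height x width, sampling is assumed to start at frame 0, and the intrinsics rescaling shows the usual adjustment that accompanies an image resize, which the text above does not confirm the script performs.

```python
def sample_frames(num_frames, interval=20):
    """Indices kept when taking every `interval`-th frame, starting at 0."""
    return list(range(0, num_frames, interval))


def scale_intrinsics(K, src_size, dst_size=(640, 480)):
    """Rescale a 3x3 pinhole intrinsic matrix for a resized image.

    Sizes are (width, height). The first row scales with the width ratio
    and the second row with the height ratio.
    """
    sx = dst_size[0] / src_size[0]
    sy = dst_size[1] / src_size[1]
    return [
        [K[0][0] * sx, K[0][1] * sx, K[0][2] * sx],
        [K[1][0] * sy, K[1][1] * sy, K[1][2] * sy],
        [K[2][0], K[2][1], K[2][2]],
    ]
```

For example, `sample_frames(45)` keeps frames 0, 20, and 40, and halving the image size halves the focal lengths and the principal point.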

Run evaluation

python eval_scannet.py --data_dir /path/to/processed/scannet/test/split/ --config ./configs/test_scannet.yaml


Prepare ScanNet training data

We use the pre-processed ScanNet data from NAS, which you can download using this link. The data is structured as follows:

  | scannet_nas
    | train
      | scene0000_00
          | color
            | 0000.jpg
          | pose
            | 0000.txt
          | depth
            | 0000.npy
          | intrinsic
          | normal
            | 0000_normal.npy
    | val
  | scans_test_sample (preprocessed ScanNet test split)
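After unpacking, a quick check that each scene is complete can save a failed training run. The sketch below is a hypothetical helper, not part of the release. It assumes the per-frame file names follow the pattern shown above (`<id>.jpg`, `<id>.txt`, `<id>.npy`, `<id>_normal.npy`) and skips the `intrinsic` folder because its contents are not listed here.

```python
from pathlib import Path


def frame_files(scene_dir, frame_id):
    """Expected per-frame files for one scene (naming assumed from the layout)."""
    scene = Path(scene_dir)
    return {
        "color": scene / "color" / f"{frame_id}.jpg",
        "pose": scene / "pose" / f"{frame_id}.txt",
        "depth": scene / "depth" / f"{frame_id}.npy",
        "normal": scene / "normal" / f"{frame_id}_normal.npy",
    }


def complete_frames(scene_dir):
    """Frame ids for which all four per-frame files exist."""
    scene = Path(scene_dir)
    ids = sorted(p.stem for p in (scene / "color").glob("*.jpg"))
    return [i for i in ids if all(f.is_file() for f in frame_files(scene, i).values())]
```

Comparing `len(complete_frames(scene))` with the number of color images in a scene shows whether extraction stopped partway through.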

Run training

Set the "dataset_path" variable in the config YAML to the location of your data.

The network is trained with a two-stage strategy. The whole training process takes ~6 days on 4 NVIDIA V100 GPUs.

python train.py ./configs/scannet_stage1.yaml
python train.py ./configs/scannet_stage2.yaml


If you find our work useful in your research, please consider citing:

    author    = {Zhao, Wang and Liu, Shaohui and Wei, Yi and Guo, Hengkai and Liu, Yong-Jin},
    title     = {A Confidence-Based Iterative Solver of Depths and Surface Normals for Deep Multi-View Stereo},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {6168-6177}


This project relies heavily on code from NAS, and we thank the authors for releasing their code.

We also thank Xiaoxiao Long for kindly helping with ScanNet evaluations.

  • Where should I get the testing scene?

    Hi zhaowang, thanks for your great work! I have a simple question: where should I get the test scenes? I downloaded the ScanNet data from this link. However, this dataset does not contain the test split, e.g. scene0707_00.

    opened by tyjiang1997 0
  • The size of the unzipped ScanNet training data?

    Hi, thanks for your great work!

    I downloaded the preprocessed ScanNet training data from the Google Drive link (about 76G). However, when I unzip the data, I get a message that there is not enough space (the remaining space on my disk is about 370G).

    Could you please tell me the size of the unzipped ScanNet training data? Thanks a lot!

    opened by LINA-lln 0
  • Test result not right

    I ran conda env create -f environment.yml successfully, but my results are not the same as yours. Which Python version did you use? Thanks. Below is my environment:

    1. conda list

    Name Version Build Channel

    _libgcc_mutex 0.1 main
    argparse 1.4.0
    blas 1.0 mkl
    blessings 1.7
    bzip2 1.0.8 h7b6447c_0
    ca-certificates 2021.10.26 h06a4308_2
    cached-property 1.5.2
    cairo 1.16.0 hf32fb01_1
    certifi 2021.10.8 py37h06a4308_0
    cffi 1.14.6 py37h400218f_0
    cudatoolkit 10.0.130 0
    cudnn 7.6.5 cuda10.0_0
    cycler 0.10.0
    ffmpeg 4.0 hcdf2ecd_0
    fontconfig 2.13.1 h6c09931_0
    freeglut 3.0.0 hf484d3e_5
    freetype 2.10.4 h5ab3b9f_0
    glib 2.69.1 h5202010_0
    graphite2 1.3.14 h23475e2_0
    h5py 3.6.0
    harfbuzz 1.8.8 hffaf4a1_0
    hdf5 1.10.2 hba1933b_1
    icu 58.2 he6710b0_3
    imageio 2.13.5
    intel-openmp 2021.3.0 h06a4308_3350
    jasper 2.0.14 hd8c5072_2
    jpeg 9d h7f8727e_0
    kiwisolver 1.3.2
    ld_impl_linux-64 2.35.1 h7274673_9
    libffi 3.3 he6710b0_2
    libgcc-ng 9.1.0 hdf63c60_0
    libgfortran-ng 7.3.0 hdf63c60_0
    libglu 9.0.0 hf484d3e_1
    libopencv 3.4.2 hb342d67_1
    libopus 1.3.1 h7b6447c_0
    libpng 1.6.37 hbc83047_0
    libstdcxx-ng 9.1.0 hdf63c60_0
    libtiff 4.2.0 h85742a9_0
    libuuid 1.0.3 h7f8727e_2
    libvpx 1.7.0 h439df22_0
    libwebp-base 1.2.0 h27cfd23_0
    libxcb 1.14 h7b6447c_0
    libxml2 2.9.10 hb55368b_3
    lz4-c 1.9.3 h295c915_1
    matplotlib 3.4.3
    mkl 2020.2 256
    mkl-service 2.3.0 py37he8ac12f_0
    mkl_fft 1.3.0 py37h54f3939_0
    mkl_random 1.1.1 py37h0573a6f_0
    ncurses 6.2 he6710b0_1
    networkx 2.6.3
    ninja 1.10.2 hff7bd54_1
    numpy 1.19.2 py37h54aff64_0
    numpy-base 1.19.2 py37hfa32c7d_0
    opencv 3.4.2 py37h6fd60c2_1
    openssl 1.1.1l h7f8727e_0
    packaging 21.3
    path 16.2.0
    path.py 12.5.0
    pcre 8.45 h295c915_0
    Pillow 8.4.0
    pip 21.0.1 py37h06a4308_0
    pixman 0.40.0 h7f8727e_1
    progressbar 2.5
    progressbar2 3.55.0
    protobuf 3.19.1
    py-opencv 3.4.2 py37hb342d67_1
    pycparser 2.20 py_2
    pyparsing 3.0.3
    python 3.7.11 h12debd9_0
    python-dateutil 2.8.2
    python-utils 2.7.0
    pytorch 1.1.0 cuda100py37he554f03_0
    PyWavelets 1.2.0
    PyYAML 6.0
    readline 8.1 h27cfd23_0
    scikit-image 0.18.3
    scipy 1.2.1
    setuptools 58.0.4 py37h06a4308_0
    six 1.16.0 pyhd3eb1b0_0
    sqlite 3.36.0 hc218d9a_0
    tensorboardX 2.4.1
    tifffile 2021.11.2
    tk 8.6.11 h1ccaba5_0
    wheel 0.37.0 pyhd3eb1b0_1
    xz 5.2.5 h7b6447c_0
    zlib 1.2.11 h7b6447c_3
    zstd 1.4.9 haebb681_0

    2. pip list

    Package Version

    blessings 1.7
    cached-property 1.5.2
    certifi 2021.10.8
    cffi 1.14.6
    cycler 0.10.0
    h5py 3.6.0
    imageio 2.13.5
    kiwisolver 1.3.2
    matplotlib 3.4.3
    mkl-fft 1.3.0
    mkl-random 1.1.1
    mkl-service 2.3.0
    networkx 2.6.3
    numpy 1.19.2
    packaging 21.3
    path 16.2.0
    path.py 12.5.0
    Pillow 8.4.0
    pip 21.0.1
    progressbar 2.5
    progressbar2 3.55.0
    protobuf 3.19.1
    pycparser 2.20
    pyparsing 3.0.3
    python-dateutil 2.8.2
    python-utils 2.7.0
    PyWavelets 1.2.0
    PyYAML 6.0
    scikit-image 0.18.3
    scipy 1.2.1
    setuptools 58.0.4
    six 1.16.0
    tensorboardX 2.4.1
    tifffile 2021.11.2
    torch 1.1.0
    wheel 0.37.0

    3. Python version: Python 3.7.11

    opened by aiforworlds 0
  • How to get the camera poses and intrinsics

    I am a student interested in your code, but I don't know how to get the camera poses and intrinsics. Do you get them with COLMAP? Sorry to bother you, but I really want to figure it out.

    opened by ThePrecoder 1
  • ScanNet evaluation doesn't match paper

    Hi, Thank you for sharing your code! I ran test on scannet following the instructions but I get much worse results. Do you know what might be happening?

    Total test num: 10434
    ['abs_rel', 'abs_diff', 'sq_rel', 'rms', 'log_rms', 'a1', 'a2', 'a3']
    [0.26725563624661075, 0.4768538641935877, 0.18805316813831266, 0.5750360812767931, 0.3188030923313161, 0.4764993420147505, 0.8155114483545208, 0.9528344624353279]

    Here's the first output for scene0707_00 image

    opened by mrharicot 6
