AA-RMVSNet

Code for AA-RMVSNet: Adaptive Aggregation Recurrent Multi-view Stereo Network (ICCV 2021) in PyTorch.

paper link: arXiv | CVF

Change Log

  • Jun 17, 2021: Initialize repo
  • Jun 27, 2021: Update code
  • Aug 10, 2021: Update paper link
  • Oct 14, 2021: Update bibtex

Data Preparation

How to run

  1. Install required dependencies:
    conda create -n drmvsnet python=3.6
    conda activate drmvsnet
    conda install pytorch==1.1.0 torchvision==0.3.0 cudatoolkit=10.0 -c pytorch
    conda install -c conda-forge py-opencv plyfile tensorboardx
  2. Set root of datasets as env variables in env.sh.
  3. Train AA-RMVSNet on DTU dataset (note that training requires a large amount of GPU memory):
    ./scripts/train_dtu.sh
  4. Predict depth maps and fuse them to get point clouds of DTU:
    ./scripts/eval_dtu.sh
    ./scripts/fusion_dtu.sh
  5. Predict depth maps and fuse them to get point clouds of Tanks and Temples:
    ./scripts/eval_tnt.sh
    ./scripts/fusion_tnt.sh

Note: if permission issues are encountered, try chmod +x <script_filename> to allow execution.
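
For setups where the permission fix needs to be scripted, the note above can be sketched in Python. This is a minimal sketch, not part of the repository: the helper names `make_executable` and `make_scripts_executable` are hypothetical, and the default directory `scripts` is taken from the paths used in the steps above. It behaves roughly like `chmod a+x` on each shell script.

```python
import os
import stat
from pathlib import Path


def make_executable(path):
    """Add the execute bit for user, group and others (roughly `chmod a+x`)."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)


def make_scripts_executable(script_dir="scripts"):
    """Mark every *.sh file directly under script_dir as executable.

    Returns the names of the files that were processed, in sorted order.
    """
    processed = []
    for script in sorted(Path(script_dir).glob("*.sh")):
        make_executable(script)
        processed.append(script.name)
    return processed
```

Calling `make_scripts_executable()` from the repository root would process the scripts in `scripts/`; an empty list means no `*.sh` files were found there.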

Citation

@inproceedings{wei2021aa,
  title={AA-RMVSNet: Adaptive Aggregation Recurrent Multi-view Stereo Network},
  author={Wei, Zizhuang and Zhu, Qingtian and Min, Chen and Chen, Yisong and Wang, Guoping},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={6187--6196},
  year={2021}
}

Acknowledgements

This repository is heavily based on Xiaoyang Guo's PyTorch implementation.

Comments
  • Settings for fine-tuning on BlendedMVS

    Hi! I used the script "train_blend.sh" with the default settings to fine-tune the released pre-trained model on BlendedMVS, hoping to reproduce the results on Tanks and Temples. I have one NVIDIA 3090 (24 GB), but I ran into an out-of-memory error. Do I need to change some settings, such as "--max_h" and "--max_w"?

    opened by VILmchen 5
  • Testing on the Tanks and Temples benchmark

    I successfully ran eval.sh and generated the "intermediate" point clouds with fusion.sh. Could you tell me how to submit these results to the Tanks and Temples website? I don't have the *.log files, so I guess I should use the *.log files from the Tanks and Temples benchmark data provided with MVSNet. I am trying to upload my results to Tanks and Temples; can you help me?

    opened by lilipopololo 4
  • Batch size and timing for single GPU setup

    First, let me add my thanks for your excellent work.

    With a single GPU setup, I infer that I should set 'batch=1' in 'env.sh'. Is that correct?

    Using the default parameters with a batch size of 1, computing depth for DTU models takes ~170 seconds per image on a single GPU (NVIDIA 1060).

    Does that seem correct?

    opened by KevinCain 4
  • NaN while using mvsnet_cls_loss in other frameworks

    Hi, thanks for your great work! I find mvsnet_cls_loss useful for improving the representational ability of the classification schedule in MVS networks. However, when I transfer mvsnet_cls_loss to other CasMVSNet-like frameworks, I sometimes get NaN loss values during training. Have you met this problem before? Hoping for your reply.

    BTW, does mvsnet_cls_loss correspond to the finer ground truth in the paper?

    opened by doubleZ0108 2
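
The cause of the NaN values reported above is not stated in the thread. Two common sources of NaN in a masked classification loss are taking the logarithm of a zero probability and dividing by zero when the valid-pixel mask is empty. The sketch below guards against both in plain Python. It is an illustration only and is not the repository's `mvsnet_cls_loss`: the function name `masked_cls_loss`, its arguments, and the `eps` value are all assumptions.

```python
import math


def masked_cls_loss(prob_volume, gt_index, mask, eps=1e-8):
    """Mean negative log-likelihood over valid pixels (illustrative only).

    prob_volume: one list of per-depth-bin probabilities for each pixel.
    gt_index:    index of the ground-truth depth bin for each pixel.
    mask:        True for pixels that have valid ground truth.
    eps:         lower bound applied before the log to avoid log(0).
    """
    total = 0.0
    valid = 0
    for probs, target, keep in zip(prob_volume, gt_index, mask):
        if not keep:
            continue
        # Clamp the probability so that log() never receives zero.
        total -= math.log(max(probs[target], eps))
        valid += 1
    # Guard against an empty mask, which would otherwise give 0 / 0.
    return total / max(valid, 1)
```

With an all-False mask the function returns 0.0 instead of NaN, and a zero probability at the ground-truth bin yields a large but finite loss.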
  • Question about sampling details

    Hello, thank you for your great work and generous contribution. There are two details in the code: (1) the script train_dtu.sh contains the comment "# make sure num_depth * interval = 203.45"; (2) for the T&T datasets, the depths are reversed. What is the purpose of these two operations?

    opened by DIVE128 2
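
The arithmetic behind the first point can be sketched as follows. The constant 203.45 is quoted from the question; everything else is an assumption for illustration: the helper names are hypothetical, and the example values `depth_min=425.0` and `base_interval=2.5` are placeholders rather than settings confirmed by the repository. The sketch shows that keeping the product fixed makes the total sampled depth range independent of the number of depth planes.

```python
# Product quoted in the question: num_depth * interval = 203.45.
DEPTH_RANGE_FACTOR = 203.45


def interval_scale_for(num_depth, factor=DEPTH_RANGE_FACTOR):
    """Interval scale that keeps num_depth * interval_scale equal to factor."""
    if num_depth <= 0:
        raise ValueError("num_depth must be positive")
    return factor / num_depth


def depth_hypotheses(depth_min, base_interval, num_depth):
    """Evenly spaced depth planes starting at depth_min (hypothetical helper)."""
    step = base_interval * interval_scale_for(num_depth)
    return [depth_min + i * step for i in range(num_depth)]


if __name__ == "__main__":
    for n in (150, 192):
        planes = depth_hypotheses(depth_min=425.0, base_interval=2.5, num_depth=n)
        step = planes[1] - planes[0]
        # n * step is the same for both settings: 2.5 * 203.45.
        print(n, round(interval_scale_for(n), 4), round(n * step, 3))
```

Under these assumptions, switching between 150 and 192 planes changes only the spacing between planes, not the range they cover.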
  • Error when running testing

    I tried to run eval.sh and got this error with both DTU and TNT:

    RuntimeError: Error(s) in loading state_dict for AARMVSNet: Missing key(s) in state_dict: "feature.init_conv.0.0.weight", "feature.init_conv.0.0.bias",..... Unexpected key(s) in state_dict: "module.feature.init_conv.0.0.weight"...

    When I trained, I changed view_num from 7 to 2 and the batch size to 1 because of memory and time constraints (I'm running in Colab).

    opened by BiaBibii 2
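
Missing keys such as "feature..." paired with unexpected keys such as "module.feature..." is the pattern that typically appears when a checkpoint saved from a model wrapped in `torch.nn.DataParallel` is loaded into an unwrapped model. Assuming that is what happened here, one possible workaround is to strip the prefix from the checkpoint keys before loading. The helper name `strip_module_prefix` is hypothetical and is not part of the repository.

```python
from collections import OrderedDict


def strip_module_prefix(state_dict, prefix="module."):
    """Return a copy of state_dict with a leading prefix removed from each key.

    Keys that do not start with the prefix are copied unchanged.
    """
    cleaned = OrderedDict()
    for key, value in state_dict.items():
        new_key = key[len(prefix):] if key.startswith(prefix) else key
        cleaned[new_key] = value
    return cleaned
```

The cleaned dictionary would then be passed to the model's `load_state_dict`. How the weights are nested inside the checkpoint file depends on how the training script saved them, so that part is left out of the sketch.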
  • Questions about testing on BlendedMVS

    Hello, I tested on BlendedMVS using the model you provided. My GPU is a 3090, but it runs out of memory when executing the test script. How did you set the test parameters to get the results?

    opened by pdw211 2
  • Question about the survey paper

    Hello, Dr. @QT-Zhu. I have recently been reading your survey paper, "Deep Learning for Multi-view Stereo via Plane Sweep: A Survey". I found Fig. 2 of the paper visually appealing and easy to understand. I would now like to make similar illustrations of plane sweeping using my own 3D scene and 2D images. How could I achieve this? Could you give me some suggestions? Looking forward to your reply.

    opened by XYZ-qiyh 2
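  • Problem with camera parameter estimation

    Sorry to bother you. While using your code for MVSNet depth estimation recently, I ran into a problem that I could not solve and would like to ask for advice. I used COLMAP (https://github.com/colmap/colmap/tree/3.5) to estimate the camera intrinsics and extrinsics, and then used colmap2mvsnet.py to convert the COLMAP results into the MVSNet input format. The problem is that the intrinsics and extrinsics estimated by COLMAP differ greatly from the ground truth: for DTU input data, the COLMAP estimates and the values provided by DTU are very different. Moreover, when the estimated intrinsics and extrinsics are used as network input, the predicted depth is completely wrong. Is there any way to solve this? Thank you!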

    opened by WeiCaike 2
  • Question about the training cost

    Thanks for your great work!

    I am retraining AA-RMVSNet on the DTU dataset with the default settings on one V100 32 GB GPU. It uses about 17 GB with batchsize=1, and batchsize=2 causes an OOM error.

    This is strange because, in the paper, batchsize=4 costs only 20.16 GB. Besides, depth_num is set to 192 in the paper, while it is just 150 in the default settings.

    Another issue is that training is very slow: one step takes about 4.6 s with batch=1.

    Can you provide any advice on it?

    opened by ewrfcas 2
  • Scripts to train/test/eval on custom data.

    Hi, thanks for the great work.
    Could you provide a script to train/test/eval on custom data? The README only suggests preparing data in MVSNet format, but doesn't say how to run on it.

    opened by Burningdust21 2
Owner
Qingtian Zhu
No one knows carrying pots better than I do.