
Overview

Neural RGB-D Surface Reconstruction

Paper | Project Page | Video

Neural RGB-D Surface Reconstruction
Dejan Azinović, Ricardo Martin-Brualla, Dan B Goldman, Matthias Nießner, Justus Thies
arXiv pre-print

This repository contains the code for the paper Neural RGB-D Surface Reconstruction, a novel approach for 3D reconstruction that combines implicit surface representations with neural radiance fields.

Installation

You can create a conda environment called neural_rgbd using:

conda env create -f environment.yaml
conda activate neural_rgbd

Make sure to clone the external Marching Cubes dependency and install it in the same environment:

cd external/NumpyMarchingCubes
python setup.py install

You can run an optimization using:

python optimize.py --config configs/<scene_config>.txt

Data

The data needs to be in the following format:

<dataset_dir>           # args.datadir in the config file
├── depth               # raw (real data) or ground truth (synthetic data) depth images (optional)
    ├── depth0.png     
    ├── depth1.png
    ├── depth2.png
    ...
├── depth_filtered      # filtered depth images
    ├── depth0.png     
    ├── depth1.png
    ├── depth2.png
    ...
├── depth_with_noise    # depth images with synthetic noise and artifacts (optional)
    ├── depth0.png     
    ├── depth1.png
    ├── depth2.png
    ...
├── images              # RGB images
    ├── img0.png     
    ├── img1.png
    ├── img2.png
    ...
├── focal.txt           # focal length
├── poses.txt           # ground truth poses (optional)
├── trainval_poses.txt  # camera poses used for optimization

The dataloader is hard-coded to load depth maps from the depth_filtered folder. These depth maps have been generated from the raw ones (or depth_with_noise in the case of synthetic data) using the same bilateral filter that was used by BundleFusion. The method also works with the raw depth maps, but the results are slightly degraded.
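
If you need to generate the depth_filtered folder for your own data, the sketch below may serve as a starting point; it only uses OpenCV's generic bilateral filter with guessed parameters and is not the exact BundleFusion filter or this repo's actual preprocessing.

import glob
import os

import cv2
import numpy as np

def filter_depth_maps(datadir, sigma_color=0.03, sigma_space=4.5):
    # Approximate depth_filtered by bilateral-filtering the raw depth PNGs.
    # Assumes 16-bit PNGs storing depth in millimeters; parameters are illustrative only.
    out_dir = os.path.join(datadir, 'depth_filtered')
    os.makedirs(out_dir, exist_ok=True)
    for path in sorted(glob.glob(os.path.join(datadir, 'depth', '*.png'))):
        depth_mm = cv2.imread(path, cv2.IMREAD_UNCHANGED)
        depth_m = depth_mm.astype(np.float32) / 1000.0
        filtered = cv2.bilateralFilter(depth_m, d=-1, sigmaColor=sigma_color, sigmaSpace=sigma_space)
        filtered[depth_mm == 0] = 0.0  # keep missing depth at zero
        cv2.imwrite(os.path.join(out_dir, os.path.basename(path)), (filtered * 1000.0).astype(np.uint16))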

The file focal.txt contains a single floating point value representing the focal length of the camera in pixels.
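
A common way to consume this value is to build pinhole intrinsics from it, assuming the principal point sits at the image center (my assumption here, not something this README specifies):

import numpy as np

def intrinsics_from_focal(focal, height, width):
    # Pinhole intrinsics K with the principal point assumed at the image center.
    return np.array([[focal, 0.0,   0.5 * width],
                     [0.0,   focal, 0.5 * height],
                     [0.0,   0.0,   1.0]])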

The files poses.txt and trainval_poses.txt contain the camera matrices in a 4N x 4 format, where N is the number of cameras in the trajectory. As in the NeRF paper, we use the OpenGL convention for the camera's coordinate system. If you run this code on ScanNet data, make sure to transform the poses to the OpenGL system, since ScanNet uses a different convention.
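
As a rough sketch (my reading of the format, not code from this repo), the pose files can be read back into N camera-to-world matrices, and the usual OpenCV-to-OpenGL convention change negates the camera's y and z axes; whether ScanNet needs additional transforms on top of this is discussed in the issues below.

import numpy as np

def load_poses(path):
    # poses.txt / trainval_poses.txt stack N 4x4 camera-to-world matrices into a 4N x 4 array.
    return np.loadtxt(path).reshape(-1, 4, 4)

def opencv_to_opengl(c2w):
    # Negate the y and z camera axes (OpenCV/COLMAP-style -> OpenGL-style).
    return c2w @ np.diag([1.0, -1.0, -1.0, 1.0])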

You can also write your own dataloader, using the existing load_scannet.py as a template and updating load_dataset.py accordingly.
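
A hypothetical skeleton for such a dataloader, only mirroring the directory layout documented above (the real load_scannet.py / load_dataset.py interfaces may differ):

import glob
import os

import imageio
import numpy as np

def frame_index(path):
    # Sort img0.png, img1.png, ..., img10.png by their numeric suffix rather than lexically.
    return int(''.join(c for c in os.path.basename(path) if c.isdigit()))

def load_custom_dataset(datadir):
    focal = float(open(os.path.join(datadir, 'focal.txt')).read())
    poses = np.loadtxt(os.path.join(datadir, 'trainval_poses.txt')).reshape(-1, 4, 4)
    image_paths = sorted(glob.glob(os.path.join(datadir, 'images', '*.png')), key=frame_index)
    depth_paths = sorted(glob.glob(os.path.join(datadir, 'depth_filtered', '*.png')), key=frame_index)
    images = np.stack([imageio.imread(p) / 255.0 for p in image_paths])
    depths = np.stack([imageio.imread(p).astype(np.float32) / 1000.0 for p in depth_paths])  # assumes millimeters
    return images, depths, poses, focal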

Citation

If you use this code in your research, please consider citing:

@misc{azinović2021neural,
      title={Neural RGB-D Surface Reconstruction}, 
      author={Dejan Azinović and Ricardo Martin-Brualla and Dan B Goldman and Matthias Nießner and Justus Thies},
      year={2021},
      eprint={2104.04532},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Further information

The code is largely based on the original NeRF code by Mildenhall et al. https://github.com/bmild/nerf

The Marching Cubes implementation was adapted from the SPSG code by Dai et al. https://github.com/angeladai/spsg

Comments
  • How about the result on real world datasets?

    @dazinovic How about the results on real-world datasets? I collected data with my own RGB-D camera and estimated the poses with COLMAP, but the results are a mess. Any advice?

    opened by endlesswho 16
  • confusion on the translation = [-4.44, 0, 2.31]

    I'm a little confused about the translation entry in the configs. For instance, configs/scene0050_00.txt contains a setting named translation, and each config contains a different value. What is the effect of this translation?

    opened by chensjtu 16
  • Concern about the meaning of sdf

    Hi @dazinovic, thanks for sharing your great work! I'm confused about the meaning of sdf used in your code and hope you can help me figure it out. In https://github.com/dazinovic/neural-rgbd-surface-reconstruction/blob/01782741271b5ace1083bfc4aadd787c97b31836/losses.py#L31-L38 it seems that sdf is a ratio between the actual distance and the truncation value. However, in the render function https://github.com/dazinovic/neural-rgbd-surface-reconstruction/blob/01782741271b5ace1083bfc4aadd787c97b31836/optimize.py#L172-L183 it seems that sdf represents a distance in normalized space. So, if we use the second meaning, i.e. sdf is a distance in normalized space, should the SDF loss function change to this?

    def get_sdf_loss(z_vals, target_d, predicted_sdf, truncation, loss_type):
    
        front_mask, sdf_mask, fs_weight, sdf_weight = get_masks(z_vals, target_d, truncation)
    
        fs_loss = compute_loss(predicted_sdf * front_mask, truncation * tf.ones_like(predicted_sdf) * front_mask, loss_type) * fs_weight
        sdf_loss = compute_loss((z_vals + predicted_sdf) * sdf_mask, target_d * sdf_mask, loss_type) * sdf_weight
    
        return fs_loss, sdf_loss
    
    opened by hirotong 9
  • evaluation on scene0005

    Thanks for your wonderful work! I have obtained the reconstructed mesh shown in the attached image.

    So I wonder: are the weird blocks around the scene expected? If so, how can I alleviate their effect during evaluation? I have tried the code from retrieval-fuse and get the following metrics: IoU: 0.5294117647058824, chamferL1: 0.21610536071424724, normals_correctness: 0.7708863087762083, f-score1: 0.8835909549382578, f-score-1.5: 0.9605475835329947.

    looking forward to your kind reply!

    opened by chensjtu 7
  • Can we get a colored mesh from the trained model?

    The original NeRF can simulate a virtual camera to extract vertex colors from the network. I did the same operation as in https://github.com/bmild/nerf/issues/44, but got bad results. Any ideas for extracting a colored mesh from the trained model?

    opened by endlesswho 6
  • How to Transform ScanNet Poses?

    Hi @dazinovic, thanks for your great work! I'm trying to train a ScanNet scene using your code. Could you tell me how you transform the ScanNet poses? I investigated the synthetic dataset you provide and guess that trainval_poses are:

    [[1, 0,  0, 0],
     [0, 0, -1, 0],
     [0, 1,  0, 0],
     [0, 0,  0, 1]] @ inv(T0) @ Tx

    where T0 is the first pose in poses.txt and Tx are the remaining poses. But they are not equal when I verify this on the "complete kitchen" scene.

    Thanks.

    opened by wtiandong 6
  • How to make own datasets?

    Hello, author.

    Thank you for completing and providing this great work.

    I ran the code and got perfect results on the dataset you provided, so I want to try making my own dataset and see how the code works. Could you describe in detail how to create one's own dataset, such as how to use the BundleFusion pipeline you mentioned in readme.md?

    opened by divingwolf 5
  • cannot download data

    Seems like the link : http://kaldir.vc.in.tum.de/neural_rgbd/neural_rgbd_data.zip is broken.

    Could you please upload the data somewhere else (maybe on Google Drive)?

    opened by kyj24182 4
  • Camera pose axes to OpenGL coordinate space alignment

    Hi, Dejan! Thanks for open-sourcing this work :muscle:! I cannot figure out how to transform poses from a custom format to the format used in this work. Maybe I did not properly understand the OpenGL coordinate system you mentioned, but it looks like the usual y-up coordinate system, so I expect the y axis of a camera pose to be the "up" direction. However, the poses for breakfast_room (and other provided scenes) are oriented differently: I rotated the MeshLab axes to match the OpenGL coordinate system and drew every 10th camera pose, and the y axis of the camera poses (in green) is aligned with the z axis of the coordinate system and vice versa (see the breakfast_room_poses_in_ogl screenshot). I have checked #2, #5, #9, #16, #21 and am still stuck. Can you please clarify why the camera axes are not aligned with the corresponding OpenGL coordinate axes?

    opened by Daniil-Osokin 4
  • Confusion about Coordinate Transformation

    Hi @dazinovic, thanks for the great work. I've investigated #2 and #4 and finally found the correct operation for aligning the point clouds from each frame in the given dataset, for instance the breakfast_room data. I got complete scene points using both trainval_poses.txt and poses.txt (although there is a rigid transformation between them). The key idea is to transform the point cloud recovered from each frame using T0 @ trainval_pose (or pose) @ T1 @ point_cloud, with

    T0 = [[1,  0, 0, 0],
          [0,  0, 1, 0],
          [0, -1, 0, 0],
          [0,  0, 0, 1]]

    T1 = [[1,  0,  0, 0],
          [0, -1,  0, 0],
          [0,  0, -1, 0],
          [0,  0,  0, 1]]

    But I'm still confused about the meaning of the given T0 and T1. I guess that T1 is used for converting from OpenCV coordinates to OpenGL coordinates, and T0 for converting to Blender, in which the datasets were created. Do I understand this right? Since you use trainval_poses in the code, are the transformations T0 and T1 implicitly included in your code? And did trainval_poses come directly from BundleFusion? If I want to use a Kinect to create custom data, what should I do with these transformations?

    opened by ZirongChan 4
  • about the metrics on scannet

    Hey! I just found that the latest paper does not contain any quantitative results of Neural RGB-D surface reconstruction on the ScanNet dataset. Can you share the results so that I won't need to re-evaluate? Many thanks for your wonderful work!

    opened by chensjtu 3
  • Question about coordinate system of official scannet pose.

    Hi @dazinovic, thanks for your great work! Based on the pose conversion you provided in #2, T0 @ scannet_pose @ T1, I want to make sure that the coordinate system of the official ScanNet poses is: positive x-axis right, positive y-axis out, and positive z-axis down. Am I right or not?

    Thanks.

    opened by HLinChen 0
  • error with cull_mesh script

    Hi, many thanks for your great work! I tried to run the transform_mesh.py and cull_mesh.py scripts on the breakfast_room data on a remote server, but I get an error with the cull_mesh script. Do you have any advice on how I can resolve it?

    /home/dl/kaiduo.zhang/LumiNet/cullmesh/neural_rgbd_data/ breakfast_room
    Traceback (most recent call last):
      File "/home/dl/kaiduo.zhang/LumiNet/cullmesh/frustum_culling/cull_mesh.py", line 172, in <module>
        cull_mesh(mesh_path, save_path, pose_file, training_poses, intrinsics_path, scene_bounds=scene_bounds)
      File "/home/dl/kaiduo.zhang/LumiNet/cullmesh/frustum_culling/cull_mesh.py", line 81, in cull_mesh
        proc1 = subprocess.Popen(['scontrol', 'show', 'job', os.environ['SLURM_JOBID'], '-d'], stdout=subprocess.PIPE)
      File "/home/dl/kaiduo.zhang/anaconda3/envs/cull/lib/python3.9/os.py", line 679, in __getitem__
        raise KeyError(key) from None
    KeyError: 'SLURM_JOBID'

    opened by UestcJay 1
  • About estimated camera poses.

    Hi, thanks for sharing your great work! I found that neural-rgbd performs well at estimating camera poses, so I wonder if you could share the estimated camera poses for the 10 synthetic scenes. I would like to use them as an initialization. Thanks for your time!

    opened by junshengzhou 0
  • Evaluation steps

    Hi, firstly thanks so much for this amazing work.

    I have now finished all the preparation steps and started running optimize.py.

    1. After training, will I get a reconstructed mesh that is exactly the same as the one in "meshes.zip"?
    2. If not, which steps do I need to take to get the same result as in the zip file?

    Thanks for your reply.

    opened by Yiiii19 1
  • Environment.yml

    Hi, I am trying to get this repo working. It seems like the environment.yml file is outdated; I get a bunch of dependency conflicts. Any ideas what the right .yml file should be?

    opened by tkbala 4
  • Train/eval script for compared methods in table 1

    Hi:

    thank you for sharing this amazing work!

    Is it possible to share the training/evaluation scripts of the compared methods (like RoutedFusion) on the ScanNet dataset?

    thank you!

    opened by fengziyue 0