Multiview Neural Surface Reconstruction by Disentangling Geometry and Appearance

Related tags

Deep Learning idr
Overview

Multiview Neural Surface Reconstruction
by Disentangling Geometry and Appearance

Project Page | Paper | Data

This repository contains an implementation for the NeurIPS 2020 paper Multiview Neural Surface Reconstruction by Disentangling Geometry and Appearance.

The paper introduces the Implicit Differentiable Renderer (IDR): a neural network architecture that simultaneously learns the 3D geometry, the appearance, and the cameras from a set of 2D images. By disentangling geometry and appearance, IDR produces high-fidelity 3D surface reconstructions, learned solely from masked 2D images and rough camera estimates.

Installation Requirements

The code is compatible with Python 3.7 and PyTorch 1.2. In addition, the following packages are required:
numpy, pyhocon, plotly, scikit-image, trimesh, imageio, opencv, torchvision.

You can create an anaconda environment called idr with the required dependencies by running:

conda env create -f environment.yml
conda activate idr

Usage

Multiview 3D reconstruction

Data

We apply our multiview surface reconstruction model to real 2D images from the DTU MVS repository. The data for the 15 scans, including the manually annotated masks and the noisy initializations for the trainable-cameras setup, can be downloaded using:

bash data/download_data.sh 

For more information on the data convention and how to run IDR on new data, please have a look at data convention.
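
As a quick sanity check on downloaded or self-made data, here is a minimal sketch (our illustration, not part of the repository) that lists the arrays stored in a cameras.npz file; the path and the world_mat_{i} / scale_mat_{i} / camera_mat_{i} key names follow the DTU files discussed in the comments below:

import numpy as np

cameras = np.load('../data/DTU/scan65/cameras.npz')  # illustrative path
for name in sorted(cameras.files):
    print(name, cameras[name].shape)  # expect world_mat_{i}, scale_mat_{i}, camera_mat_{i} per view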

We used our method to generate 3D reconstructions in two different setups:

Training with fixed ground truth cameras

For training IDR, run:

cd ./code
python training/exp_runner.py --conf ./confs/dtu_fixed_cameras.conf --scan_id SCAN_ID

where SCAN_ID is the id of the DTU scene to reconstruct.
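
For example, to reconstruct scan 65 with fixed ground truth cameras (assuming the data was downloaded as above):

cd ./code
python training/exp_runner.py --conf ./confs/dtu_fixed_cameras.conf --scan_id 65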

Then, to produce the meshed surface, run:

cd ./code
python evaluation/eval.py  --conf ./confs/dtu_fixed_cameras.conf --scan_id SCAN_ID --checkpoint CHECKPOINT [--eval_rendering]

where CHECKPOINT is the epoch you wish to evaluate, or 'latest' to take the most recent epoch. Turning on --eval_rendering will additionally render the training images and evaluate their PSNR.
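
For example, meshing the most recent checkpoint of scan 65 and also evaluating rendering:

cd ./code
python evaluation/eval.py --conf ./confs/dtu_fixed_cameras.conf --scan_id 65 --checkpoint latest --eval_rendering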

Training with trainable cameras with noisy initializations

For training IDR with camera optimization, run:

cd ./code
python training/exp_runner.py --train_cameras --conf ./confs/dtu_trained_cameras.conf --scan_id SCAN_ID

Then, to evaluate camera accuracy and produce the meshed surface, run:

cd ./code
python evaluation/eval.py  --eval_cameras --conf ./confs/dtu_trained_cameras.conf --scan_id SCAN_ID --checkpoint CHECKPOINT [--eval_rendering]

Evaluation on pretrained models

We have uploaded trained IDR models, and you can run the evaluation using:

cd ./code
python evaluation/eval.py --exps_folder trained_models --conf ./confs/dtu_fixed_cameras.conf --scan_id SCAN_ID  --checkpoint 2000 [--eval_rendering]

Or, for trained cameras:

python evaluation/eval.py --exps_folder trained_models --conf ./confs/dtu_trained_cameras.conf --scan_id SCAN_ID --checkpoint 2000 --eval_cameras [--eval_rendering]

Disentanglement of geometry and appearance

For transferring the appearance learned in one scene to the geometry of another, run:

cd ./code
python evaluation/eval_disentanglement.py --geometry_id GEOMETRY_ID --appearance_id APPEARANCE_ID

This script will produce novel views of the geometry from the GEOMETRY_ID trained model, rendered with the appearance from the APPEARANCE_ID trained model.
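
For example (the scan ids here are only illustrative), rendering the geometry learned on scan 65 with the appearance learned on scan 110:

cd ./code
python evaluation/eval_disentanglement.py --geometry_id 65 --appearance_id 110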

Citation

If you find our work useful in your research, please consider citing:

@article{yariv2020multiview,
title={Multiview Neural Surface Reconstruction by Disentangling Geometry and Appearance},
author={Yariv, Lior and Kasten, Yoni and Moran, Dror and Galun, Meirav and Atzmon, Matan and Basri, Ronen and Lipman, Yaron},
journal={Advances in Neural Information Processing Systems},
volume={33},
year={2020}
}

Related papers

Here are related works on implicit neural representations from our group:

Comments
  •  cameras.npz file

    In order to run IDR on new data [DIR PATH], you need to supply image and mask directories, as well as a cameras.npz file containing the appropriate camera projection matrices.

    Dear authors,

    Do we need to convert the camera projection matrices in the DTU dataset from txt to npy files?

    Thanks
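
    For what it's worth, here is a minimal sketch (an illustration, not the authors' code) of assembling cameras.npz from per-view txt projection matrices; load_projection_matrix is a hypothetical parser for your txt format, and whether the loader expects a raw 3x4 P or a padded 4x4 matrix should be checked against the data convention doc:

    import numpy as np

    def load_projection_matrix(path):
        # hypothetical helper: parse one 3x4 projection matrix from a txt file
        return np.loadtxt(path).reshape(3, 4)

    n_views = 49  # illustrative number of input images
    cameras = {}
    for i in range(n_views):
        P = load_projection_matrix('cam_%d.txt' % i)  # illustrative path
        cameras['world_mat_%d' % i] = P  # key names follow the DTU cameras.npz
    np.savez('cameras.npz', **cameras)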

    opened by derrick-xwp 6
  • Cuda OOM in plots.py during mesh extraction

    Hi,

    Thank you for sharing the code. I trained the model, but when I ran eval.py to extract the mesh using the latest checkpoint I got a CUDA OOM in File "../code/utils/plots.py", line 197, in get_surface_high_res_mesh, at the line grid_points = torch.cat(g, dim=0). I tried changing the following code snippet in plots.py, but it did not help. My GPU has 11 GB of memory. I also tried using 2 GPUs, but I think the eval code uses only a single GPU, and calling torch.cuda.empty_cache() did not remove the OOM error either. Could you please provide some guidance on how to fix the OOM problem? Thanks.

    g = []
    # transform the grid points in chunks of 100k to limit peak GPU memory
    for i, pnts in enumerate(torch.split(grid_points, 100000, dim=0)):
        g.append(torch.bmm(vecs.unsqueeze(0).repeat(pnts.shape[0], 1, 1).transpose(1, 2),
                           pnts.unsqueeze(-1)).squeeze() + s_mean)
    grid_points = torch.cat(g, dim=0)  # the OOM is reported at this concatenation
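
    One possible workaround (an untested sketch, not from the authors): since the OOM hits when all chunks are gathered on the GPU, each transformed chunk could be moved to the CPU before concatenating:

    g = []
    for pnts in torch.split(grid_points, 100000, dim=0):
        chunk = torch.bmm(vecs.unsqueeze(0).repeat(pnts.shape[0], 1, 1).transpose(1, 2),
                          pnts.unsqueeze(-1)).squeeze() + s_mean
        g.append(chunk.cpu())  # keep only the current chunk on the GPU
    grid_points = torch.cat(g, dim=0)  # concatenated on the CPU
    # downstream code may need grid_points moved back to the GPU in smaller pieces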
    
    opened by athena913 4
  • Cameras Coordinate System

    Hi there,

    First of all congrats on the project and on the code. It's really neat 🔥 .

    I'm trying to run the code with my own scene and, even though the losses print with somewhat meaningful values, they do not decrease, and the learned scene doesn't resemble what it should at all. I was wondering if the issue could be related to the coordinate system used.

    My coordinate system is this way:

    • Cameras look towards z.
    • Usually, if the object is at the origin, the camera positions are somewhere along -z.

    Could it be that you are assuming an OpenGL-like coordinate system, and that this is the reason it doesn't work?

    • Here cameras look towards -z.
    • Cameras are usually located around +z if the object is at the origin.

    Any thoughts will be very welcome.

    Thanks in advance!

    opened by eduardramon 4
  • How to get scale matrix

    Hello, I am a big fan of this project. I wonder how you got the scale matrix for the DTU dataset. If you don't mind, could you share the code you used to compute it?

    I think I solved it. Thank you!

    opened by yeong5366 3
  • GEOMETRY_ID and APPEARANCE_ID failed to get evaluated

    I tried to run the last instruction, passing for GEOMETRY_ID the trained surface model with surface_world_coordinates.ply, and the same for APPEARANCE_ID. I even moved the generated evals into code and then ran the script with the location path evals/dtu_trained_cameras/surface_world_coordinates.ply.

    But nothing worked for me. Could you be a bit clearer about the instructions for execution? Thank you.

    opened by RanjithKatta 3
  • Why not apply positional encoding to positions in the rendering network?

    Hi,

    I have a quick question regarding the design choice made in this work. In the rendering network, positional encoding is applied to view directions but not positions. This is quite different from NeRF, where they positional-encode the 3d positions. I just wonder if there's any special consideration behind this choice? Thank you.

    opened by Kai-46 3
  • Quantitative Evaluation on DTU

    Hi @lioryariv, this is great work!

    I noticed that the published code doesn't contain scripts for quantitatively evaluating the reconstructed surface against the DTU ground truth. Your packaged DTU dataset doesn't contain ground-truth shapes either.

    Could you please share your scripts for quantitative evaluation of DTU scenes? If you can also share the ground truth shapes (packaged with your DTU data) to use with your scripts, that would be amazing!

    Thanks and Regards, Shubham

    opened by shubham-goel 2
  • Additional data in the cameras.npz files

    I noticed that the cameras.npz file provided for the DTU dataset has some additional matrices called "camera_mat_{i}". Are these the 3x3 calibration matrices K from P = K[R | t]?

    I tried to extract K, R and t from P (which are "world_mat_{i}") using opencv: K, R, t, _, _, _ = cv2.decomposeProjectionMatrix(P2)

    The K that I get from this does not match the corresponding "camera_mat_{i}". The t values I get are also confusing: t has shape (4, 1), and usually the 4th entry is 1, but that is not the case here.

    UPDATE: The 't' values were actually okay. I forgot to divide it by the 4th entry
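
    For reference, here is a minimal sketch of the decomposition with the homogeneous normalization mentioned in the update (P stands for one 3x4 world_mat projection matrix; note that cv2.decomposeProjectionMatrix returns the camera center in homogeneous coordinates, not the extrinsic translation):

    import cv2
    import numpy as np

    out = cv2.decomposeProjectionMatrix(P)  # P: 3x4 projection matrix
    K, R, c = out[0], out[1], out[2]
    K = K / K[2, 2]        # scale K so its bottom-right entry is 1
    c = (c / c[3])[:3, 0]  # divide by the 4th entry to get the camera center
    t = -R @ c             # extrinsic translation for P = K [R | t]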

    opened by ahnaf1393 2
  • Why not normalize the normal that is input to the rendering network?

    In the code: https://github.com/lioryariv/idr/blob/44959e7aac267775e63552d8aac6c2e9f2918cca/code/model/implicit_differentiable_renderer.py#L249 I found that the normal predicted by the implicit network is directly input to the rendering network without normalization.

    I would like to know whether this is a deliberate design choice, and why.

    opened by iYuqinL 2
  • Softplus activation

    Hey there,

    This is not an issue, just a question. Using the softplus activation is significantly slower than using, for instance, ReLU. Is there any technical (theoretical or practical) reason why you chose softplus over other more efficient activations?

    Thank you!

    opened by eduardramon 2
  • Images preprocessing

    In the data_convention.md file, the preprocess script is said to generate the needed cameras.npz file. But the code loads an initial cameras.npz file, so how should we build this initial cameras.npz file to run the preprocessing code on new images?

    opened by ThibautLbl 2
  • Question about the spherical tracing

    Hi, thanks for your great work. I am reading your code and one thing confuses me. For the sphere-tracing part, according to the paper, it should first look for the first (nearer) intersection point, and if that does not converge, the algorithm should start from the other, farther intersection point. However, when I read the code, the start_points seem to be farther from the camera center than the end_points, because start_points take index 0 of "sphere_intersections_points" and end_points take index 1. According to the "get_sphere_intersection" function, index 0 stores (-under_sqrt - ray_cam_dot) and index 1 stores (under_sqrt - ray_cam_dot), which means the tensors at index 0 have larger absolute value. So it seems that start_points would be farther than end_points, which is confusing. Did I get anything wrong?
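
    For reference, here is a sketch (our illustration, not the repository's exact function) of the standard ray-sphere quadratic this code appears to implement; with a unit-norm ray direction, the minus root is always the smaller depth t, i.e. the point nearer to the camera:

    import torch

    def sphere_intersections(cam_loc, ray_dir, r=1.0):
        # Solve |cam_loc + t * ray_dir|^2 = r^2 with unit-norm ray_dir:
        # t^2 + 2 <cam_loc, ray_dir> t + (|cam_loc|^2 - r^2) = 0
        ray_cam_dot = (cam_loc * ray_dir).sum(-1)
        under_sqrt = ray_cam_dot ** 2 - (cam_loc.norm(dim=-1) ** 2 - r ** 2)
        sqrt_term = under_sqrt.clamp(min=0.0).sqrt()
        t_near = -ray_cam_dot - sqrt_term  # index 0: smaller t, nearer to the camera
        t_far = -ray_cam_dot + sqrt_term   # index 1: larger t, farther from the camera
        return torch.stack((t_near, t_far), dim=-1)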

    opened by Haian-Jin 1
  • How to render video?

    Hi folks,

    Thanks for this awesome work. I'm wondering how to render the "spiral" videos demonstrated on the project page?

    Any reply would be appreciated!

    Best, Peihao

    opened by peihaowang 1
  • Abnormal transferring appearance results

    Thanks for the great work. On my side, I am trying to train one appearance and transfer it to other unseen geometry objects, but the results are not correct. May I know if anyone else has also encountered this issue? Thanks in advance.

    opened by SherlockSunset 7
  • Running on Colmap Data

    Thanks for the great work. I was trying to run it on some images captured with my own camera. I used COLMAP to get the world matrices, but I get the following error when running preprocess_cameras.py:

    preprocess_cameras.py:87: DeprecationWarning: np.float is a deprecated alias for the builtin float. To silence this warning, use float by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use np.float64 here. Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
      mask_points_all.append(np.stack((xs,ys,np.ones_like(xs))).astype(np.float))
    preprocess_cameras.py:45: DeprecationWarning: np.float is a deprecated alias for the builtin float. To silence this warning, use float by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use np.float64 here. Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
      cur_l_1 = Fj0 @ np.array([curx,cury,1.0]).astype(np.float)
    Number of points: 0
    preprocess_cameras.py:169: RuntimeWarning: Mean of empty slice.
      centroid = np.array(all_Xs).mean(axis=0)
    /home/kshitiz/miniconda3/envs/idr/lib/python3.7/site-packages/numpy/core/_methods.py:189: RuntimeWarning: invalid value encountered in double_scalars
      ret = ret.dtype.type(ret / rcount)
    /home/kshitiz/miniconda3/envs/idr/lib/python3.7/site-packages/numpy/core/_methods.py:263: RuntimeWarning: Degrees of freedom <= 0 for slice
      keepdims=keepdims, where=where)
    /home/kshitiz/miniconda3/envs/idr/lib/python3.7/site-packages/numpy/core/_methods.py:223: RuntimeWarning: invalid value encountered in true_divide
      subok=False)
    /home/kshitiz/miniconda3/envs/idr/lib/python3.7/site-packages/numpy/core/_methods.py:254: RuntimeWarning: invalid value encountered in double_scalars
      ret = ret.dtype.type(ret / rcount)
    Traceback (most recent call last):
      File "preprocess_cameras.py", line 244, in <module>
        get_normalization(opt.source_dir, opt.use_linear_init)
      File "preprocess_cameras.py", line 207, in get_normalization
        normalization, all_Xs = get_normalization_function(Ps, mask_points_all, number_of_normalization_points, number_of_cameras, masks_all)
      File "preprocess_cameras.py", line 175, in get_normalization_function
        centroid, scale, all_Xs = refine_visual_hull(masks_all, Ps, scale, centroid)
      File "preprocess_cameras.py", line 101, in refine_visual_hull
        points = points + center[:, np.newaxis]
    IndexError: invalid index to scalar variable.

    opened by g-gaurav 7
  • rgb_loss stuck at zero

    Hi! Thank you for the great work. I'm trying to get the system to work on my own images, but am facing some difficulties. I obtain the position information using an ArUco board on which I place my object. The algorithm moves past the ray-tracing step properly, but then it outputs an rgb loss of 0.0. Did anyone encounter something similar? I am very grateful for any type of response :)

    opened by iernstig 2
Owner
Lior Yariv
Code to reproduce the results for Compositional Attention: Disentangling Search and Retrieval.

Compositional-Attention This repository contains the official implementation for the paper Compositional Attention: Disentangling Search and Retrieval

Sarthak Mittal 17 Oct 23, 2021
Poisson Surface Reconstruction for LiDAR Odometry and Mapping

Poisson Surface Reconstruction for LiDAR Odometry and Mapping Surfels TSDF Our Approach Table: Qualitative comparison between the different mapping te

Photogrammetry & Robotics Bonn 305 Dec 21, 2022
Implementation for the "Surface Reconstruction from 3D Line Segments" paper.

Surface Reconstruction from 3D Line Segments Surface reconstruction from 3d line segments. Langlois, P. A., Boulch, A., & Marlet, R. In 2019 Internati

null 85 Jan 4, 2023
[ICCV 2021 (oral)] Planar Surface Reconstruction from Sparse Views

Planar Surface Reconstruction From Sparse Views Linyi Jin, Shengyi Qian, Andrew Owens, David F. Fouhey University of Michigan ICCV 2021 (Oral) This re

Linyi Jin 89 Jan 5, 2023
The official implementation code of "PlantStereo: A Stereo Matching Benchmark for Plant Surface Dense Reconstruction."

PlantStereo This is the official implementation code for the paper "PlantStereo: A Stereo Matching Benchmark for Plant Surface Dense Reconstruction".

Wang Qingyu 14 Nov 28, 2022
ViSER: Video-Specific Surface Embeddings for Articulated 3D Shape Reconstruction

ViSER: Video-Specific Surface Embeddings for Articulated 3D Shape Reconstruction. NeurIPS 2021.

Gengshan Yang 59 Nov 25, 2022
Deep Surface Reconstruction from Point Clouds with Visibility Information

Data, code and pretrained models for the paper Deep Surface Reconstruction from Point Clouds with Visibility Information.

Raphael Sulzer 23 Jan 4, 2023
Implementation of CVPR'2022:Surface Reconstruction from Point Clouds by Learning Predictive Context Priors

Surface Reconstruction from Point Clouds by Learning Predictive Context Priors (CVPR 2022) Personal Web Pages | Paper | Project Page This repository c

null 136 Dec 12, 2022
Pytorch implementation for A-NeRF: Articulated Neural Radiance Fields for Learning Human Shape, Appearance, and Pose

A-NeRF: Articulated Neural Radiance Fields for Learning Human Shape, Appearance, and Pose Paper | Website | Data A-NeRF: Articulated Neural Radiance F

Shih-Yang Su 172 Dec 22, 2022
Self-Supervised Monocular 3D Face Reconstruction by Occlusion-Aware Multi-view Geometry Consistency[ECCV 2020]

Self-Supervised Monocular 3D Face Reconstruction by Occlusion-Aware Multi-view Geometry Consistency(ECCV 2020) This is an official python implementati

null 304 Jan 3, 2023
pytorch implementation of "Contrastive Multiview Coding", "Momentum Contrast for Unsupervised Visual Representation Learning", and "Unsupervised Feature Learning via Non-Parametric Instance-level Discrimination"

Unofficial implementation: MoCo: Momentum Contrast for Unsupervised Visual Representation Learning (Paper) InsDis: Unsupervised Feature Learning via N

Zhiqiang Shen 16 Nov 4, 2020
Generative Models as a Data Source for Multiview Representation Learning

GenRep Project Page | Paper Generative Models as a Data Source for Multiview Representation Learning Ali Jahanian, Xavier Puig, Yonglong Tian, Phillip

Ali 81 Dec 3, 2022
Multiview 3D object detection on MultiviewC dataset through moft3d.

Multiview Orthographic Feature Transformation for 3D Object Detection Multiview 3D object detection on MultiviewC dataset through moft3d. Introduction

Jiahao Ma 20 Dec 21, 2022
Joint Versus Independent Multiview Hashing for Cross-View Retrieval[J] (IEEE TCYB 2021, PyTorch Code)

Thanks to the low storage cost and high query speed, cross-view hashing (CVH) has been successfully used for similarity search in multimedia retrieval. However, most existing CVH methods use all views to learn a common Hamming space, thus making it difficult to handle the data with increasing views or a large number of views.

null 4 Nov 19, 2022
Deep Semisupervised Multiview Learning With Increasing Views (IEEE TCYB 2021, PyTorch Code)

Deep Semisupervised Multiview Learning With Increasing Views (ISVN, IEEE TCYB) Peng Hu, Xi Peng, Hongyuan Zhu, Liangli Zhen, Jie Lin, Huaibai Yan, Dez

null 3 Nov 19, 2022
A Planar RGB-D SLAM which utilizes Manhattan World structure to provide optimal camera pose trajectory while also providing a sparse reconstruction containing points, lines and planes, and a dense surfel-based reconstruction.

ManhattanSLAM Authors: Raza Yunus, Yanyan Li and Federico Tombari ManhattanSLAM is a real-time SLAM library for RGB-D cameras that computes the camera

null 117 Dec 28, 2022
SLAMP: Stochastic Latent Appearance and Motion Prediction

SLAMP: Stochastic Latent Appearance and Motion Prediction Official implementation of the paper SLAMP: Stochastic Latent Appearance and Motion Predicti

Kaan Akan 34 Dec 8, 2022
"MST++: Multi-stage Spectral-wise Transformer for Efficient Spectral Reconstruction" (CVPRW 2022) & (Winner of NTIRE 2022 Challenge on Spectral Reconstruction from RGB)

MST++: Multi-stage Spectral-wise Transformer for Efficient Spectral Reconstruction (CVPRW 2022) Yuanhao Cai, Jing Lin, Zudi Lin, Haoqian Wang, Yulun Z

Yuanhao Cai 274 Jan 5, 2023
[CVPR'21] DeepSurfels: Learning Online Appearance Fusion

DeepSurfels: Learning Online Appearance Fusion Paper | Video | Project Page This is the official implementation of the CVPR 2021 submission DeepSurfel

Online Reconstruction 52 Nov 14, 2022