Multi-view 3D reconstruction using neural rendering. Unofficial implementation of UNISURF, VolSDF, NeuS and more.

Overview

Volume rendering + 3D implicit surface = Neural 3D Reconstruction

Multi-view 3D reconstruction using neural rendering.

This repository holds ⚠️ unofficial ⚠️ PyTorch implementations of:

  • Unisurf: Unifying neural implicit surfaces and radiance fields for multi-view reconstruction

  • NeuS: Learning neural implicit surfaces by volume rendering for multi-view reconstruction

  • VolSDF: Volume rendering of neural implicit surfaces

and more...

Showcase

Trained with VolSDF for 200k iterations, with NeRF++ as background.

  • Above: 🚀 volume rendering of the scene (novel view synthesis)

  • Below: mesh extracted from the learned implicit shape

[Video frames: volume-rendered RGB (above) and extracted mesh (below), 576x768, archimedean-spiral camera path]
Full-res video (35 MiB, 15 s @ 576x768 @ 30 fps): [click here]

Trained with NeuS for 300k iterations, with NeRF++ as background.

  • Above: 🚀 volume rendering of the scene (novel view synthesis)
  • Middle: normals extracted from the learned implicit shape ($\nabla_{\mathbf{x}} s$)
  • Below: mesh extracted from the learned implicit shape
[Image grids: rendered RGB, extracted normals, and extracted meshes for six DTU scans (55, 37, 65, 97, 105, 24), trained without masks]

What?

The overall topic of the implemented papers is multi-view surface and appearance reconstruction from pure posed images.

  • studying and bridging [DeepSDF/OccupancyNet]-style implicit 3D surfaces and volume rendering (NeRF).
  • framework:

[Figure: framework overview]

What's known (ground truth / supervision)            What's learned
ONLY multi-view posed RGB images                     3D surface / shape
(no masks, no depths, no GT meshes or point clouds)  3D appearance

previous: surface rendering; now: volume rendering

From one perspective, the implemented papers introduce volume rendering to 3D implicit surfaces, so that views can be rendered differentiably and scenes reconstructed with a photometric reconstruction loss.

Rendering in previous surface reconstruction approaches    Rendering in this repo (during training)
Surface rendering                                          Volume rendering

The benefit of using volume rendering is that it diffuses gradients widely in space and can efficiently learn a roughly correct shape at the very beginning of training without mask supervision, avoiding the bad local minima often encountered when learning shapes with surface rendering, even with mask supervision.
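
To make this concrete, below is a minimal sketch of the NeRF-style quadrature behind this supervision (hypothetical tensor shapes, not this repo's exact API). Every sample along a ray receives gradient through its compositing weight, which is exactly what diffuses gradients widely in space:

```python
import torch

def volume_render_color(sigma, color, z_vals):
    """Composite per-sample densities and colors along each ray into a
    pixel color (standard NeRF-style quadrature).

    sigma:  (N_rays, N_samples)     per-sample volume density
    color:  (N_rays, N_samples, 3)  per-sample radiance
    z_vals: (N_rays, N_samples)     sample depths along each ray
    """
    # Distances between adjacent samples; pad the last interval.
    deltas = z_vals[..., 1:] - z_vals[..., :-1]
    deltas = torch.cat([deltas, 1e10 * torch.ones_like(deltas[..., :1])], dim=-1)
    # Per-interval opacity, and transmittance accumulated along the ray.
    alpha = 1.0 - torch.exp(-sigma * deltas)
    trans = torch.cumprod(
        torch.cat([torch.ones_like(alpha[..., :1]), 1.0 - alpha + 1e-10], dim=-1),
        dim=-1,
    )[..., :-1]
    weights = alpha * trans                         # (N_rays, N_samples)
    rgb = (weights[..., None] * color).sum(dim=-2)  # (N_rays, 3)
    return rgb, weights

# Training is then driven purely by a photometric loss on pixels, e.g.:
# loss = torch.nn.functional.mse_loss(rgb, gt_rgb)
```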

config: [click me]

[Table of images: mesh extracted from the learned shape (top row) and view rendered from the learned appearance (bottom row) at 0, 3k (~16 min), 10k (~1 h), and 200k (~18.5 h) iterations]

previous: NeRF's volume density; now: implicit surface

From another perspective, they replace the original NeRF's shape representation (volume density $\sigma$) with a 3D implicit surface model, whose iso-surface is defined to represent the scene's surfaces.

Shape representation in NeRF    Shape representation in this repo
Volume density                  Occupancy network (UNISURF)
                                SDF (VolSDF / NeuS)

The biggest disadvantage of NeRF's shape representation is that it treats objects as volumetric clouds, which does not actually guarantee an exact surface, since there is no constraint on the learned density.

Representing shapes with implicit surfaces forces the volume density to be associated with an exact surface.

What's more, this association (the mapping function from implicit surface value to volume density) can be controlled either manually or by learnable parameters, allowing the shape representation to be more surface-like or more volume-like, meeting the needs of different training stages.

[Figure: demonstration of controllable mappings from SDF value to volume density (VolSDF)]
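
For concreteness, here is a sketch of VolSDF's mapping as given in the paper (this repo's exact parameterization may differ): the density is $\sigma(\mathbf{x}) = \alpha \, \Psi_\beta(-d(\mathbf{x}))$, where $\Psi_\beta$ is the CDF of a zero-mean Laplace distribution with scale $\beta$, and $\beta$ controls how surface-like the density is.

```python
import torch

def sdf_to_sigma(sdf: torch.Tensor, alpha: float, beta: float) -> torch.Tensor:
    """VolSDF-style density: sigma(x) = alpha * Psi_beta(-sdf(x)), with
    Psi_beta the CDF of a zero-mean Laplace distribution of scale beta.
    Small beta concentrates density sharply at the zero level set
    (surface-like); large beta spreads it out (volume-like)."""
    # Laplace CDF evaluated at -sdf, written branch-free:
    #   sdf > 0 (outside): 0.5 * exp(-sdf / beta)
    #   sdf <= 0 (inside): 1 - 0.5 * exp(sdf / beta)
    return alpha * torch.where(
        sdf > 0,
        0.5 * torch.exp(-sdf / beta),
        1.0 - 0.5 * torch.exp(sdf / beta),
    )

# alpha is commonly tied to 1/beta, so the density saturates at 1/beta
# deep inside the object:
# sigma = sdf_to_sigma(sdf, alpha=1.0 / beta, beta=beta)
```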

Hence, the training scheme of the approaches in this repo can be roughly divided as follows (not discrete stages, but a continuous progression):

  • In the earlier stage of learning, the shape representation is more volume-like, taking more neighboring points along each ray into account when rendering colors. The network quickly learns a roughly correct shape and appearance.
  • In the later stage, the shape representation is more surface-like, taking into account almost only the exact point where the ray intersects the surface. The network slowly learns the fine, thin structures of the shape and the fine details of the appearance.

You can see that as the controlling parameter narrows the neighborhood of points considered during volume rendering, the rendered results become almost equivalent to surface rendering. This is proven in UNISURF, and also demonstrated by the results shown in [docs/usage.md#use surface rendering instead of volume rendering].
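
Below is a self-contained toy demonstration of that limiting behavior (a hypothetical ray and a flat surface, not this repo's code): as $\beta$ shrinks in a VolSDF-style density, the compositing weights collapse onto the ray's intersection with the surface, so volume rendering degenerates into surface rendering.

```python
import torch

def ray_weights(sigma, z_vals):
    """Compositing weights w_i = alpha_i * T_i along a single ray."""
    deltas = torch.cat([z_vals[1:] - z_vals[:-1], z_vals.new_tensor([1e10])])
    alpha = 1.0 - torch.exp(-sigma * deltas)
    trans = torch.cumprod(
        torch.cat([z_vals.new_ones(1), 1.0 - alpha + 1e-10]), dim=0)[:-1]
    return alpha * trans

z = torch.linspace(0.0, 2.0, 512)  # sample depths along one toy ray
sdf = 1.0 - z                      # flat surface at z = 1 (negative = inside)
for beta in (0.5, 0.1, 0.01):
    # VolSDF-style density with alpha = 1 / beta (see the sketch above).
    sigma = (1.0 / beta) * torch.where(
        sdf > 0, 0.5 * torch.exp(-sdf / beta), 1.0 - 0.5 * torch.exp(sdf / beta))
    w = ray_weights(sigma, z)
    # As beta shrinks, the weight mass collapses onto the intersection at z = 1.
    print(f"beta={beta:5.2f}  peak at z={z[w.argmax()].item():.3f}  "
          f"weight within +-0.05 of surface: {w[(z - 1.0).abs() < 0.05].sum().item():.3f}")
```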

[Figure: in the limit, volume rendering becomes equivalent to surface rendering]

How the implemented papers differ:

  • how to map an implicit surface value to a volume density, or how to (accurately) compute volume rendering's opacity from such an exact surface representation (see the NeuS-style sketch after this list);
  • how to efficiently sample points along camera rays by taking advantage of the exact surface;
  • You can find out more in my [personal notes] (in Chinese only).
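
For example, NeuS skips an explicit density and derives each ray interval's opacity directly from consecutive SDF samples. Below is a minimal sketch roughly following the paper's formula (the actual code in models/frameworks/neus.py adds mid-point evaluation and other details):

```python
import torch

def neus_alpha(sdf: torch.Tensor, s: float) -> torch.Tensor:
    """NeuS-style per-interval opacity from consecutive SDF samples:
        alpha_i = max((Phi_s(f_i) - Phi_s(f_{i+1})) / Phi_s(f_i), 0)
    where Phi_s(x) = sigmoid(s * x). A larger s gives a sharper, more
    surface-like opacity; s is trained jointly with the networks.

    sdf: (N_rays, N_samples) SDF values at consecutive ray samples.
    Returns (N_rays, N_samples - 1) per-interval opacities."""
    cdf = torch.sigmoid(s * sdf)
    alpha = (cdf[..., :-1] - cdf[..., 1:]) / (cdf[..., :-1] + 1e-10)
    return alpha.clamp(min=0.0)
```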

Future

Currently, the biggest problem with the methods in this repo is that view-dependent reflection effects are baked into the object's surface, as in IDR, NeRF, and so on. In other words, if you place the learned object into a new scene with different ambient lighting, the rendering process takes no account of the new scene's lighting and carries over the reflections of the original training scene.

However, now that implicit surfaces have been combined with NeRF, ambient light and material decomposition should become much easier for NeRF-based frameworks, since shapes are now represented by an underlying neural surface instead of volume densities.

Results and trained models

The trained models are stored in [GoogleDrive] / [Baidu, code: reco].

For more visualization of more trained results, see [docs/trained_models_results.md].

USAGE

See [docs/usage.md] for detailed usage documentation.

NOTES

TODO

  • NeuS

    • Compare with NeuS official repo.
    • Fix performance bug (camera inside surface after training) on some of the DTU instances.
  • VolSDF

    • improve VolSDF's sampling performance
    • release more results
  • UNISURF

    • Fix performance bug (huge artifact after training) on some of the DTU instances.
  • general

    • train camera
    • cluster training configs
    • DDP support
    • refine GPU memory usage and try to make training fit on at least a 2080 Ti.
    • surface rendering option.
    • eval script for RGB
    • eval script for mesh CD

CITATION

  • UNISURF
@article{oechsle2021unisurf,
  title={Unisurf: Unifying neural implicit surfaces and radiance fields for multi-view reconstruction},
  author={Oechsle, Michael and Peng, Songyou and Geiger, Andreas},
  journal={arXiv preprint arXiv:2104.10078},
  year={2021}
}
  • NeuS
@article{wang2021neus,
  title={NeuS: Learning Neural Implicit Surfaces by Volume Rendering for Multi-view Reconstruction},
  author={Wang, Peng and Liu, Lingjie and Liu, Yuan and Theobalt, Christian and Komura, Taku and Wang, Wenping},
  journal={arXiv preprint arXiv:2106.10689},
  year={2021}
}
  • VolSDF
@article{yariv2021volume,
  title={Volume Rendering of Neural Implicit Surfaces},
  author={Yariv, Lior and Gu, Jiatao and Kasten, Yoni and Lipman, Yaron},
  journal={arXiv preprint arXiv:2106.12052},
  year={2021}
}
  • NeRF++
@article{kaizhang2020,
  title={NeRF++: Analyzing and Improving Neural Radiance Fields},
  author={Zhang, Kai and Riegler, Gernot and Snavely, Noah and Koltun, Vladlen},
  journal={arXiv preprint arXiv:2010.07492},
  year={2020}
}
  • SIREN
@inproceedings{sitzmann2019siren,
    author = {Sitzmann, Vincent and Martel, Julien N.P. and Bergman, Alexander W. and Lindell, David B. and Wetzstein, Gordon},
    title = {Implicit Neural Representations with Periodic Activation Functions},
    booktitle = {Proc. NeurIPS},
    year={2020}
}

Acknowledgement

This repository modifies code from, or draws inspiration from, the following projects:

Contact

Feel free to submit issues or contact Jianfei Guo (郭建非) at guojianfei [at] pjlab.org.cn.
PRs are also very welcome 😃

We are hiring!

🎉 🎉 🎉

On behalf of the Intelligent Transportation and Autonomous Driving Group at Shanghai AI Lab, we are hiring researchers, engineers, and full-time interns for computer graphics and 3D rendering algorithms (based in Shanghai).

The Intelligent Transportation and Autonomous Driving team at Shanghai AI Lab is hiring for "Graphics Algorithm Researcher" and "3D Scene Generation Researcher" positions, with plenty of headcount for interns, new graduates, and experienced hires.

Graphics Algorithm Researcher

Responsibilities

  1. Conduct research combining computer graphics and deep learning, and explore applications of these techniques to autonomous driving data.
  2. Research high-quality 3D reconstruction based on neural rendering, differentiable rendering, and related techniques.
  3. Research urban-scene reconstruction for digital twins.
  4. Track cutting-edge academic and industrial work in graphics and computer vision, push individual algorithms beyond the industry state of the art with prototype validation, and continuously build technical competitiveness in 3D reconstruction.
  5. Design and implement an autonomous driving perception simulation platform built on high-quality 3D reconstruction data, improving the usability of simulated data for perception models.

Requirements

  1. Bachelor's degree or above in computer science and technology, electronic engineering, automation, artificial intelligence, applied mathematics, vehicle engineering, or a related field.
  2. Solid foundation in computer graphics: understanding of the basic rendering pipeline and familiarity with PBR lighting and material models.
  3. Familiarity with the fundamentals of machine learning and deep learning; proficiency in C++/C/Python; experience with at least one of PyTorch, TensorFlow, or Caffe.
  4. Bonus: expertise in one or more of the following: neural implicit 3D representations (e.g., NeRF), 3D scene reconstruction, 3D human reconstruction and generation.
  5. Bonus: understanding of GPU programming in the rendering pipeline, or experience with GPU/CUDA development and performance optimization.
  6. Bonus: development experience with game engines such as Unity or UE4, or with simulators such as Carla or AirSim, or familiarity with vehicle and pedestrian motion simulation.
  7. Bonus: project or research background in autonomous driving, computer vision, or image processing.
  8. Bonus: publications at top academic conferences or journals in related fields.

3D Scene Generation Researcher

Responsibilities

  1. Research applications of deep-learning-based generative models to autonomous driving data.
  2. Research data generation algorithms for autonomous driving perception scenes using generative techniques such as GANs and VAEs.
  3. Simulate autonomous driving perception data based on deep learning and generative models.
  4. Track the latest academic and industrial progress on generative models in computer vision, autonomous driving, smart cities, digital twins, and related fields.
  5. Design and implement a generative-model-based autonomous driving perception simulation platform, and use generated data to improve downstream perception tasks.

Requirements

  1. Bachelor's degree or above in computer science and technology, electronic engineering, automation, artificial intelligence, applied mathematics, vehicle engineering, or a related field.
  2. Solid foundation in deep learning; familiarity with the fundamentals of machine learning and deep learning; proficiency with at least one of PyTorch, TensorFlow, or Caffe.
  3. Strong coding skills and good C++/Python habits; able to quickly design and run experiments to validate ideas; able to learn whatever is needed to support full-stack development.
  4. Bonus: research experience in computer graphics, stereo vision, 3D reconstruction, neural rendering, or differentiable rendering.
  5. Bonus: expertise in one or more of the following: 3D generative models, 3D human reconstruction and generation, neural implicit 3D representations (e.g., NeRF).
  6. Bonus: project or research background in autonomous driving.
  7. Bonus: publications at top academic conferences or journals in related fields.

If you are interested in either position, please send your resume to shibotian [at] pjlab.org.cn or guojianfei [at] pjlab.org.cn. Please make sure the subject line contains the word 「应聘」 ("job application"). Thank you.


About Shanghai AI Lab

Shanghai AI Lab is a new type of research institution in China's artificial intelligence field. It was co-founded by world-renowned AI scholars including Xiaoou Tang, Andrew Chi-Chih Yao, and Jie Chen, and was officially unveiled at the World Artificial Intelligence Conference in July 2020.

The lab's research teams are assembled from first-class scientists under new institutional mechanisms. It conducts strategic, original, and forward-looking scientific research and technological development, aiming to break through important fundamental theories and key core technologies of AI, build a large comprehensive research base that is breakthrough-driven, field-leading, and platform-oriented, support the leapfrog development of China's AI industry, and become a world-class AI laboratory and a globally renowned source of original AI theories and technologies.

The lab has signed strategic cooperation agreements with well-known universities including Shanghai Jiao Tong University, Fudan University, Zhejiang University, the University of Science and Technology of China, The Chinese University of Hong Kong, Tongji University, and East China Normal University, establishing dual-appointment and mutual title-recognition mechanisms for researchers, pooling top domestic and international resources, and exploring innovative evaluation systems and internationally competitive compensation and support.

Shanghai AI Lab official website: https://www.shlab.org.cn/

Intelligent transportation and autonomous driving openings: https://www.shlab.org.cn/news/5443060

Comments
  • Positional encoding in `RadianceNet`

    Hi, and thanks a lot for the implementation!

    https://github.com/ventusff/neurecon/blob/972e810ec252cfd16f630b1de6d2802d1b8de59a/configs/volsdf_nerfpp_blended.yaml#L41-L42

    I was wondering why we are not using positional encoding and instead are feeding raw 3D coordinates and view directions here? Especially because IDR is not doing so and the defaults are 6 and 4... 🤔

    I tried changing these from -1 to 6 and/or 4, and training collapses or at least goes much slower... To me, this seems extremely weird!

    opened by shrubb 7
  • Why init beta needs sqrt?

    Hi,

    Thanks for your excellent repo. In the VolSDF rendering part, I noticed you apply an extra sqrt when initializing beta, which does not appear in the VolSDF paper's equations. Could you please explain why you use sqrt here?

    opened by fishfishson 2
  • Why pts_mid in NeuS

    Thanks for your awesome work! I am wondering why you used pts and pts_mid as the inputs to the surface and radiance fields, rather than aligning both inputs. https://github.com/ventusff/neurecon/blob/972e810ec252cfd16f630b1de6d2802d1b8de59a/models/frameworks/neus.py#L288

    The official implementation seems to take only pts_mid as input. https://github.com/Totoro97/NeuS/blob/2708e43ed71bcd18dc26b2a1a9a92ac15884111c/models/renderer.py#L213

    opened by YuLiHN 1
  • About the stop condition of 'sphere_tracing_surface_points'

    Hi, I guess the stop condition of SDF sphere tracing should be 'mask[surface_val < 0] = False'?

    Currently it is 'mask[d_preds < 0] = False'.

    https://github.com/ventusff/neurecon/blob/eb179624a6fb820abb3eca1582b70f71b542875b/models/ray_casting.py#L182

    opened by Tianhang-Cheng 1
  • How to run on inside-out data?

    Hi! Thank you for the excellent project! I am going to run VolSDF on inside-out data such as indoor scenes. Could you give me some advice on how to set up the parameters?

    opened by ghy0324 1
  • Slight loss of detail compared with official implementation

    First off, thanks for this useful implementation. With specific regard to your NeuS implementation, there is a slight drop in the level of detail in the final mesh compared with the official implementation, when using the same hyperparameters and marching cubes settings.

    Is this something that should be solvable by tweaking the marching cubes algorithm?

    opened by yyeboah 0
  • Why set alpha = 1/beta and sigma = alpha * psi?

    It's really great work! (≧▽≦) But I have a question about the parameter alpha. I saw you set alpha = 1/beta and sigma = alpha * psi in your code, which means sigma (the density) will be > 1 when beta < 1. Why not set sigma = psi, which would keep sigma <= 1?

    opened by ZhouWeikun 0
  • Coordinate System

    Hi, thanks for your great work! I am trying to apply your codebase to an official dataset, but I ran into a problem in the get_rays function. Do the camera extrinsics map from the world coordinate system to the camera coordinate system, or from the camera coordinate system to the world coordinate system?

    opened by xiliu8006 0
  • Wrong mesh for DTU37

    opened by ShaoTengLiu 0
  • The implementation principle of rend_util.sample_pdf() and sample_cdf()?

    Thank you for your work and summary. I have learned a lot, but I still have some doubts, and I sincerely hope you can help me resolve them.

    Q1: Although I can tell what sample_pdf() and sample_cdf() do from how they are used in the code, I still can't understand their internal details. If you have time, I hope you can explain them or recommend relevant reference material.

    Thank you very much! I look forward to your reply.

    opened by YinGuoX 0