Official code for the CVPR 2022 (oral) paper "Extracting Triangular 3D Models, Materials, and Lighting From Images".

Overview

nvdiffrec

Teaser image

Joint optimization of topology, materials and lighting from multi-view image observations as described in the paper Extracting Triangular 3D Models, Materials, and Lighting From Images.

For differentiable marching tetrahedrons, we have adapted code from NVIDIA's Kaolin: A PyTorch Library for Accelerating 3D Deep Learning Research.

Licenses

Copyright © 2022, NVIDIA Corporation. All rights reserved.

This work is made available under the Nvidia Source Code License.

For business inquiries, please contact [email protected]

Installation

Requires Python 3.6+, VS2019+, Cuda 11.3+ and PyTorch 1.10+

Tested in Anaconda3 with Python 3.9 and PyTorch 1.10

One time setup (Windows)

Install the Cuda toolkit (required to build the PyTorch extensions). We support Cuda 11.3 and above. Pick the appropriate version of PyTorch compatible with the installed Cuda toolkit. Below is an example with Cuda 11.3:

conda create -n dmodel python=3.9
activate dmodel
conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch
pip install ninja imageio PyOpenGL glfw xatlas gdown
pip install git+https://github.com/NVlabs/nvdiffrast/
pip install --global-option="--no-networks" git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch
imageio_download_bin freeimage

Every new command prompt

activate dmodel

Examples

Our approach is designed for high-end NVIDIA GPUs with large amounts of memory. To run on mid-range GPUs, reduce the batch size parameter in the .json files.
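
For reference, below is a minimal sketch of writing a reduced-memory copy of a config with Python. The "batch" key name follows the shipped example configs; verify the exact key names in the .json file you are editing.

import json

# Minimal sketch: write a lower-memory copy of an existing config.
# The "batch" key follows the shipped example configs; check the
# .json file you are editing for the exact key names.
with open("configs/bob.json") as f:
    cfg = json.load(f)

cfg["batch"] = 2  # smaller batch size for mid-range GPUs

with open("configs/bob_lowmem.json", "w") as f:
    json.dump(cfg, f, indent=4)

Training can then be launched with the reduced config, e.g. python train.py --config configs/bob_lowmem.json.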

Simple genus 1 reconstruction example:

python train.py --config configs/bob.json

Visualize training progress (only supported on Windows):

python train.py --config configs/bob.json --display-interval 20

Multi GPU example (Linux only. Experimental: all results in the paper were generated using a single GPU), using PyTorch DDP

torchrun --nproc_per_node=4 train.py --config configs/bob.json
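
For context, the snippet below is a generic illustration of the PyTorch DDP initialization pattern that a script launched with torchrun typically follows; it is not the actual train.py code, just a sketch of how torchrun's per-process environment variables are consumed.

import os
import torch
import torch.distributed as dist

# Generic PyTorch DDP launch pattern (illustrative, not the repository's code).
# torchrun starts one process per GPU and sets LOCAL_RANK for each of them.
local_rank = int(os.environ.get("LOCAL_RANK", 0))
torch.cuda.set_device(local_rank)
dist.init_process_group(backend="nccl")

print(f"rank {dist.get_rank()} / world size {dist.get_world_size()} on GPU {local_rank}")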

Below, we show the starting point and the final result. References to the right.

(Images: initial guess and our result.)

The results will be stored in the out folder. The Spot and Bob models were created and released into the public domain by Keenan Crane.

Included examples

  • spot.json - Extracting a 3D model of the spot model. Geometry, materials, and lighting from image observations.
  • spot_fixlight.json - Same as above but assuming known environment lighting.
  • spot_metal.json - Example of joint learning of materials and high frequency environment lighting to showcase split-sum.
  • bob.json - Simple example of a genus 1 model.

Datasets

We additionally include configs (nerf_*.json, nerd_*.json) to reproduce the main results of the paper. We rely on third party datasets, which are courtesy of their respective authors. Please note that individual licenses apply to each dataset. To automatically download and pre-process all datasets, run the download_datasets.py script:

activate dmodel
cd data
python download_datasets.py

Below follows more information and instructions on how to manually install the datasets (in case the automated script fails).

NeRF synthetic dataset Our view interpolation results use the synthetic dataset from the original NeRF paper. To manually install it, download the NeRF synthetic dataset archive and unzip it into the nvdiffrec/data folder. This is required for running any of the nerf_*.json configs.

NeRD dataset We use datasets from the NeRD paper, which features real-world photogrammetry and inaccurate (manually annotated) segmentation masks. Clone the NeRD datasets using git and rescale them to 512 x 512 pixels resolution using the script scale_images.py. This is required for running any of the nerd_*.json configs.

activate dmodel
cd nvdiffrec/data/nerd
git clone https://github.com/vork/ethiopianHead.git
git clone https://github.com/vork/moldGoldCape.git
python scale_images.py
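
If the provided script needs adapting, the rescaling step amounts to resizing every image to 512 x 512. Below is a rough sketch (not the repository's scale_images.py) using Pillow; the folder names are assumptions based on the cloned dataset layout.

from pathlib import Path
from PIL import Image  # Pillow is assumed to be available

# Rough sketch of the 512 x 512 rescaling step (not the repository's scale_images.py).
# Adjust src/dst to the actual folder layout of the cloned NeRD datasets.
def rescale_folder(src, dst, size=(512, 512)):
    dst.mkdir(parents=True, exist_ok=True)
    for img_path in sorted(src.glob("*")):
        if img_path.suffix.lower() in {".jpg", ".jpeg", ".png"}:
            Image.open(img_path).resize(size, Image.LANCZOS).save(dst / img_path.name)

rescale_folder(Path("ethiopianHead/images"), Path("ethiopianHead/images_512"))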

Server usage (through Docker)

  • Build docker image.
cd docker
./make_image.sh nvdiffrec:v1
  • Start an interactive docker container: docker run --gpus device=0 -it --rm -v /raid:/raid nvdiffrec:v1 bash

  • Detached docker: docker run --gpus device=1 -d -v /raid:/raid -w=[path to the code] nvdiffrec:v1 python train.py --config configs/bob.json

Comments
  • Custom dataset config options



    I've been able to run this on my own data (slowly, on a 3070) using U^2-Net matting and the colmap2nerf code from instant-ngp (if others are following this, remove the image extensions and it should run on Windows). The issue I'm having is the val and test data, or the lack of it. Are there config options to remove the requirement for those datasets? Copying and renaming the json lists to val and train works, but it is rather cumbersome, and I was wondering if I was missing a better option with the NeRD dataset preparation, which uses manual masks not applied to the image itself (this is possible with U^2-Net, and would help with colmap mapping since the background can still be used for feature tracking).

    I haven't looked into what data is required to be present in the config, nor into resolution/training options, but I'm just wondering what the generally intended arguments and method are for presenting your own data when no val set is available.

    opened by Sazoji 31
  • Coordinate frames and lighting


    I am trying to get nvdiffrec to do light estimation in a controlled setting. I suspect that there are some coordinate frame and lighting issues. Here is the procedure I am following: First, I generate a synthetic dataset using a custom environment map from the NeRF lego script. Then, I try to train the model from the generated dataset without giving an environment map, or a reference mesh. I end up with a mesh that is flipped 90 degrees on the x-axis and the lighting is badly estimated.

    Here are the result mesh and the lego before training (images: lego_after, lego_before).
    opened by Selozhd 10
  • Confusing mesh result on nerd_ehead


    Thank you for the great work! I followed the one-time setup on Ubuntu 18.04 with Python 3.9, CUDA 11.3, and PyTorch 1.10. Then I had a trial on nerd_ehead by running: python train.py --config configs/nerd_ehead.json. In the dmtet_mesh folder the mesh looks good, but in the mesh folder the mesh.obj looks confusing. I did not change any code in this repo, so I do not know why this happened. Could you please help me solve this problem? Thank you for your time!

    opened by YuhsiHu 10
  • MLP to parameterize SDF


    Hello, thank you for your great work. In #9 and in Section 8.5 of the paper, you mentioned that you used a DVR-based MLP to optimize the SDF, and I have several questions about it.

    1. Can you tell me how many iterations you ran to pre-train the sphere?
    2. Did you set the MLP output shape as (size, sdf), i.e. (size, 1), or as (size, sdf+deform), i.e. (size, 4)?
    3. Did you use a ModuleList in the DVR decoder to process the latent condition code c too?
    opened by JiyouSeo 9
  • Render the final result?


    Hello. First of all, thank you very much for your excellent work. I'm sorry, I'm not very familiar with graphics engines, but now I want to render the final output, such as hotdog's mesh. There are some questions below that I hope to get your answers to: 1. According to the description in the article, kd.png / ks.png / kn.png describe the material of the model, but the final output also has a .mtl file. Is the .mtl equivalent to the three PNG files? Currently, when I use mesh.load_mesh(mesh.obj), the API directly uses mesh.mtl instead of loading the three PNG files. 2. If the .mtl means the same as the three PNG files, can I render by loading the model and then loading the PNG files?

    opened by soolone 8
  • noise in rendered results (noise in diffuse/specular?)


    Thank you for the amazing work!!

    I did a quick trial on custom in-the-wild data and found that there is some noise in the fitting results. Did I mess something up? Have you observed this phenomenon before? Note that I fixed the vertices of the mesh and just optimized the lighting / material.

    (Images: rendered vs. reference comparisons for test_000002 and test_000007, and the kd / ks / normal texture maps.)

    Looking forward to your great help! Thanks a lot

    opened by wangjksjtu 8
  • How to get a model with higher accuracy and more details?


    Hello,

    I have run "python train.py --config configs/nerf_lego.json" with the default settings. The picture below shows my mesh output.

    1. The smooth surface is not flat (as shown in the red box). How should I improve it?
    2. Lots of details are lost due to the geometric simplification. How can I get a model with more details?


    Thank you in advance!

    opened by wuge1880 7
  • about making masks of custom dataset


    Hi, sorry to ask about this again. I know #3 and some other issues are about this problem, but it seems that only ways of getting poses for the images are discussed there, and the same goes for 'https://github.com/bmild/nerf#generating-poses-for-your-own-scenes' which you mentioned in another issue before. Those methods truly help a lot, thanks to you and those who contributed! Now I am confused about the mask-generation process. I find there are two different types of masks in your examples: in the NeRD dataset there is a 'masks' directory containing mask images corresponding to the ones in 'images', and for the NeRF synthetic dataset the 4th channel of the image contains the mask information in the image itself, as you said.

    So I guess my biggest question is how to get mask images like the first way, or how to get 32-bit images with an alpha mask. Besides, I want to know if it is true that separate mask images need to be used with a .npy file, while the alpha-mask type needs to be used with a .json file. Please correct me if I got something wrong about all this. Thanks for your work and your time.

    opened by sadexcavator 6
  • CUDA_ERROR_OUT_OF_MEMORY


    Hi,

    Is there any limitation on the training image resolution?

    I used just 12 images at a resolution of 4640x3472 and got this error, though I successfully generated a mesh for custom images at lower resolution. I am using an Nvidia RTX A6000 with 50GB, so memory shouldn't be an issue.

    Are there some parameters in the config that I can tweak to occupy less memory without compromising mesh quality?

    opened by riponazad 5
  • What are the inputs for the training and testing phases


    What are the inputs in the training phase — are they the multi-view images, camera poses, and masks corresponding to the views? Will a .obj file be generated after training to serve as the model for testing? What are the inputs in the testing phase? Thank you very much for your reply; it is urgent!

    opened by malinjie-hub 5
  • How to show mesh with textures?


    Thank you for the amazing work! I have trained on the NeRF chair dataset and I get a great result. I want to show the mesh with texture in MeshLab, but the chair mesh has no texture in MeshLab. What method should I use?

    opened by Doraemon167 5
  • Can't understand "mesh_scale"


    Hello, thank you for sharing your great work! In the previous issues, you said that the "mesh_scale" parameter represents the size of the tetrahedron. When I test on another dataset (a public human dataset), I run into some trouble: when I set "mesh_scale" to 3.0 I get the result shown in the first image, and when I set it to 10 I get the result shown in the second image (the results are from pass 1). I don't understand the influence of this parameter on reconstruction. Why is the result completely white when the parameter is small? And in the second case, do you have any good suggestions on how to prevent some areas from being cut off? I want to find out the influence of the hyperparameters (in the .json file) on training. Looking forward to your reply!

    opened by cjlunmh 0
  • Using xatlas instead of the uvs from marching_tet


    Thanks for this wonderful work! I have a question about the design choice here, which is using xatlas to create uv coordinates instead of taking the uv coordinates from marching_tet.

    opened by xk-huang 0
  • The color of Kd texture is very different from the photos.


    Hello, when I run your code with a fixed mesh, I found that the color of the final Kd texture is very different from the original photos, as shown below:

    (Images: original photo and the extracted Kd texture.)

    You can see that the color of the Kd texture is much lighter than that of the original photo. Is this a normal result? If not, how should I adjust it?

    Besides, when I rendered the final results in Blender using the script you provided, the material always looked like metal (the real material is plastic). But the rendering results in "img_mesh_pass" look perfect. How should I get a more realistic result using Blender (or other software)?

    Thank you in advance !

    (Images: rendering results in Blender and in "img_mesh_pass".)

    opened by wuge1880 6
  • Observe many seams


    After the mesh pass, I found there are many seams between paired UV patches, although the UV coordinates are all located within the patch borders. Can you provide some advice?

    opened by pean1128 6
  • Incompatible vertex normals used in rendering


    Hi, I found an inconsistency in the calculation of vertex normals. In the main pipeline, each explicit mesh is extracted and rendered with the following function, which ensures that the vertex normals of the mesh are recomputed by weighting the face normals.

    self.getMesh(opt_material)
    

    However, the above recomputation is missing in the data preparation and leads to inconsistent shading results. https://github.com/NVlabs/nvdiffrec/blob/main/dataset/dataset_mesh.py#L48 https://github.com/NVlabs/nvdiffrec/blob/main/dataset/dataset_mesh.py#L92

    self.ref_mesh = mesh.compute_tangents(ref_mesh)
    .......
    img = render.render_mesh(self.glctx, self.ref_mesh, mvp, campos, self.envlight, iter_res, spp=iter_spp,                                num_layers=self.FLAGS.layers, msaa=True, background=None)['shaded']
    
    opened by pean1128 2