Code release for NeRF (Neural Radiance Fields)

Overview

NeRF: Neural Radiance Fields

Project Page | Video | Paper | Data

Open Tiny-NeRF in Colab
Tensorflow implementation of optimizing a neural representation for a single scene and rendering new views.

NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
Ben Mildenhall*1, Pratul P. Srinivasan*1, Matthew Tancik*1, Jonathan T. Barron2, Ravi Ramamoorthi3, Ren Ng1
1UC Berkeley, 2Google Research, 3UC San Diego
*denotes equal contribution
in ECCV 2020 (Oral Presentation, Best Paper Honorable Mention)

TL;DR quickstart

To set up a conda environment, download the example training data, begin the training process, and launch TensorBoard:

conda env create -f environment.yml
conda activate nerf
bash download_example_data.sh
python run_nerf.py --config config_fern.txt
tensorboard --logdir=logs/summaries --port=6006

If everything works without errors, you can now go to localhost:6006 in your browser and watch the "Fern" scene train.

Setup

Python 3 dependencies:

  • Tensorflow 1.15
  • matplotlib
  • numpy
  • imageio
  • configargparse

The LLFF data loader requires ImageMagick.

We provide a conda environment setup file including all of the above dependencies. Create the conda environment nerf by running:

conda env create -f environment.yml

You will also need the LLFF code (and COLMAP) set up to compute poses if you want to run on your own real data.

What is a NeRF?

A neural radiance field is a simple fully connected network (weights are ~5MB) trained to reproduce input views of a single scene using a rendering loss. The network directly maps from spatial location and viewing direction (5D input) to color and opacity (4D output), acting as the "volume" so we can use volume rendering to differentiably render new views.
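
For intuition, here is a heavily simplified sketch of such a network (the actual model in run_nerf.py also applies positional encoding to the inputs and injects the viewing direction later in the network; this only illustrates the 5D-in, 4D-out idea):

    import tensorflow as tf

    def toy_nerf_mlp(depth=8, width=256):
        # 5D input: 3D position plus 2D viewing direction.
        inputs = tf.keras.Input(shape=(5,))
        x = inputs
        for _ in range(depth):
            x = tf.keras.layers.Dense(width, activation='relu')(x)
        # 4D output: RGB color plus volume density (opacity).
        outputs = tf.keras.layers.Dense(4)(x)
        return tf.keras.Model(inputs=inputs, outputs=outputs)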

Optimizing a NeRF takes between a few hours and a day or two (depending on resolution) and only requires a single GPU. Rendering an image from an optimized NeRF takes somewhere between less than a second and ~30 seconds, again depending on resolution.

Running code

Here we show how to run our code on two example scenes. You can download the rest of the synthetic and real data used in the paper here.

Optimizing a NeRF

Run

bash download_example_data.sh

to get our synthetic Lego dataset and the LLFF Fern dataset.

To optimize a low-res Fern NeRF:

python run_nerf.py --config config_fern.txt

After 200k iterations (about 15 hours), you should get a video like this at logs/fern_test/fern_test_spiral_200000_rgb.mp4:

ferngif

To optimize a low-res Lego NeRF:

python run_nerf.py --config config_lego.txt

After 200k iterations, you should get a video like this:

legogif

Rendering a NeRF

Run

bash download_example_weights.sh

to get a pretrained high-res NeRF for the Fern dataset. Now you can use render_demo.ipynb to render new views.

Replicating the paper results

The example config files run at lower resolutions than the quantitative/qualitative results in the paper and video. To replicate the results from the paper, start with the config files in paper_configs/. Our synthetic Blender data and LLFF scenes are hosted here and the DeepVoxels data is hosted by Vincent Sitzmann here.

Extracting geometry from a NeRF

Check out extract_mesh.ipynb for an example of running marching cubes to extract a triangle mesh from a trained NeRF network. You'll need to install the PyMCubes package for marching cubes, plus the trimesh and pyrender packages if you want to render the mesh inside the notebook:

pip install trimesh pyrender PyMCubes
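
As a rough sketch of the idea (the dummy query_density below stands in for evaluating the trained network's density output; the notebook queries the actual network and picks grid bounds and a threshold appropriate for the scene):

    import numpy as np
    import mcubes    # PyMCubes
    import trimesh

    # Stand-in density field: in practice you would evaluate the trained
    # network's sigma output at these points instead.
    def query_density(pts):
        return 100. * (0.5 - np.linalg.norm(pts, axis=-1))   # dummy sphere

    # Sample the density on a regular 3D grid covering the scene.
    N = 128
    t = np.linspace(-1.2, 1.2, N)
    pts = np.stack(np.meshgrid(t, t, t, indexing='ij'), -1).reshape(-1, 3)
    sigma = query_density(pts).reshape(N, N, N)

    # Run marching cubes at a chosen density threshold and save the mesh.
    vertices, triangles = mcubes.marching_cubes(sigma, 25.)
    mesh = trimesh.Trimesh(vertices, triangles)
    mesh.export('extracted_mesh.obj')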

Generating poses for your own scenes

Don't have poses?

We recommend using the imgs2poses.py script from the LLFF code. Then you can pass the base scene directory into our code using --datadir <myscene> along with --dataset_type llff. Take a look at the config_fern.txt config file for example settings for a forward-facing scene. For a spherically captured 360 scene, we recommend adding the --no_ndc --spherify --lindisp flags.

Already have poses!

In run_nerf.py and all other code, we use the same pose coordinate system as OpenGL: the local camera coordinate system of an image is defined such that the X axis points to the right, the Y axis upwards, and the Z axis backwards, as seen from the image.

Poses are stored as 3x4 numpy arrays that represent camera-to-world transformation matrices. The other data you will need is simple pinhole camera intrinsics (hwf = [height, width, focal length]) and near/far scene bounds. Take a look at our data loading code to see more.
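
For example, a minimal NumPy sketch of turning one such pose plus hwf into per-pixel ray origins and directions, similar to the get_rays_np helper in this repo:

    import numpy as np

    def get_rays(H, W, focal, c2w):
        """Per-pixel ray origins and directions from a 3x4 camera-to-world pose."""
        i, j = np.meshgrid(np.arange(W, dtype=np.float32),
                           np.arange(H, dtype=np.float32), indexing='xy')
        # Camera-frame directions: +x right, +y up, camera looking down -z,
        # assuming the principal point is at the image center.
        dirs = np.stack([(i - W * .5) / focal,
                         -(j - H * .5) / focal,
                         -np.ones_like(i)], -1)
        # Rotate directions into the world frame; the translation column is the ray origin.
        rays_d = np.sum(dirs[..., None, :] * c2w[:3, :3], -1)
        rays_o = np.broadcast_to(c2w[:3, -1], rays_d.shape)
        return rays_o, rays_d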

Citation

@inproceedings{mildenhall2020nerf,
  title={NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis},
  author={Ben Mildenhall and Pratul P. Srinivasan and Matthew Tancik and Jonathan T. Barron and Ravi Ramamoorthi and Ren Ng},
  year={2020},
  booktitle={ECCV},
}
Comments
  • What can we do with our own trained model?

    I've successfully trained my own NeRF model on 80~100 real images. It takes 8 hours for 50,000 iterations (probably because my GPU isn't very good), so I expect roughly four times that for 200k iterations, which is too long.

    And I have no idea how to use the trained model, because it's not like most general deep learning models. In my opinion, when we train a model for a specific object, the model can only be tested on images similar to the training set. So what can we do with our own trained model? All I know is that we can use it to extract a mesh.

    My goal is to build a 3D model from some 2D images, and I don't know whether this repo can achieve that. The biggest problem is how to use the depth map to add color to the extracted mesh. Can you give me some guidance? I'd really appreciate it.

    opened by SpongeGirl 13
  • LLFF data preprocessing

    From what I can decipher, poses_bounds.npy contains a 3x5 pose matrix and 2 depth bounds for each image. Each pose has [R T] as the left 3x4 block and [H W F] as the rightmost 3x1 column.

    However, I am confused by the functions poses_avg and recenter_poses. What do these functions do, and why?

    I checked the original code, but it doesn't do this averaging or recentering.

    opened by kwea123 12
  • How to translate depth in NDC to real depth?

    In NDC space, the predicted depth has range 0~1. How do we translate that into real depth? We know that predicted depth = 0 means the near plane, which is at real distance 1.0, but what about predicted depth = 1? In the formula it corresponds to infinity, but in practice do we map it to the farthest real depth provided by COLMAP? And what about the values between 0 and 1?

    I ask because I want to reconstruct the LLFF data in 3D. Using the predicted depth in 0~1 gives something visually plausible, but I think it is mathematically wrong; we need to convert it to real depth.

    opened by kwea123 11
  • tips for training real 360 inward-facing scene

    I tried to train on my own 360° inward-facing scenes, but there is a huge amount of noise in the output, and it doesn't disappear no matter how many steps I train. I followed the suggestion in the readme and added --no_ndc --spherify --lindisp; the other settings are the same as for the fern scene.

    I suspect the problem is that the input images contain a lot of arbitrary background, which hurts training. These arbitrary backgrounds are inevitable unless I have an infinite ground plane or an infinitely large table... What's your opinion on this problem? Is it really due to the background, or is there another reason?

    opened by kwea123 9
  • How to read the depth map?

    I read the depth maps in the synthetic test set as d[u][v] = (255 - depth[u][v]) / 255, but the results don't seem consistent across views. How can I get accurate depth maps?

    opened by SYSUGrain 8
  • Question about ray directions calculation in code.

    dirs = np.stack([(i-W*.5)/focal, -(j-H*.5)/focal, -np.ones_like(i)], -1)

    'i' and 'j' are the pixel coordinates, so why does this give the ray direction? Are there some assumptions here?

    Thanks a lot!

    opened by zdw-qingdao 8
  • about render_path_spiral and viewmatrix

    Hello, I'm a novice in this field. I'm confused about the two functions below; they seem to perform a transformation between coordinate systems. Could you give more details?

    def render_path_spiral(c2w, up, rads, focal, zdelta, zrate, rots, N):
        render_poses = []
        rads = np.array(list(rads) + [1.])
        hwf = c2w[:,4:5]
        for theta in np.linspace(0., 2. * np.pi * rots, N+1)[:-1]:
            c = np.dot(c2w[:3,:4], np.array([np.cos(theta), -np.sin(theta), -np.sin(theta*zrate), 1.]) * rads) 
            z = normalize(c - np.dot(c2w[:3,:4], np.array([0,0,-focal, 1.])))
            render_poses.append(np.concatenate([viewmatrix(z, up, c), hwf], 1))
        return render_poses
    
    def viewmatrix(z, up, pos):
        vec2 = normalize(z)
        vec1_avg = up
        vec0 = normalize(np.cross(vec1_avg, vec2))  # np.cross: cross product
        vec1 = normalize(np.cross(vec2, vec0))
        m = np.stack([vec0, vec1, vec2, pos], 1)
        return m
    

    Hope you can help me. Thanks!

    opened by ai1361720220000 7
  • original blender files

    I followed the instructions to obtain the lego scene from blendswap, and I parsed the json files to get the poses and FOV. The poses look correct, but two pieces of information seem to be missing. One is that the scene doesn't appear to be at the correct scale: without scaling by a factor of 25 or so, the cameras end up inside the object. The second is that the lego bulldozer scene has a controllable bucket, and the renderings in NeRF clearly use a setting other than the default in the blendswap scene.

    Do the authors still have the original blender files they used to render these scenes? The licenses seem fairly permissive. It would be great to use these as a starting point for new experiments and be able to modify them.

    opened by kmatzen 7
  • Opencv extrinsic instead of colmap

    Hi, thanks for your great work on NeRF! Using COLMAP to estimate camera extrinsics takes a lot of time (installation and running the reconstruction code...). I tried putting a chessboard in my own LLFF data and calibrating with OpenCV. However, the camera parameters could not be used correctly in NeRF. I think OpenCV and COLMAP may have different coordinate systems, which prevents the OpenCV extrinsics from being used in the code. If you know how to transform the extrinsics, please tell me. Thank you very much!

    opened by Duncan1115 5
  • Depth GT

    Hi, I just want to confirm: are the depth images in the test set of nerf_synthetic ground truth or not?

    If they are ground truth, could you please also release the depth images for the training and validation datasets?

    Many Thanks!

    opened by BingCS 5
  • Regarding the ray color calculation from the MLP output.

    Hi,

    I was trying to rewrite NeRF from scratch to understand the technique better, and during the implementation I ran into something I need help with. I am new to computer graphics, so there may be a misunderstanding on my end; please let me know if that's the case.

    If I understood the paper correctly, in NeRF we are supposed to calculate the pixel colors of an image by integrating the sample point colors (c) and volume density values (σ) given by the MLP along the ray, using the formula provided in the paper:

        C(r) = ∫_{t_n}^{t_f} T(t) σ(r(t)) c(r(t), d) dt,   where   T(t) = exp( −∫_{t_n}^{t} σ(r(s)) ds )

    Since we can't integrate exactly, we use the formula below to calculate the estimated ray color:

        Ĉ(r) = Σ_{i=1}^{N} T_i (1 − exp(−σ_i δ_i)) c_i,   where   T_i = exp( −Σ_{j=1}^{i−1} σ_j δ_j )

    In this repo's implementation, we first calculate the distances between sampled points (δ). Then we calculate a value alpha = 1.0 - tf.exp(-act_fn(raw) * dists), which I think is equivalent to the 1 − exp(−σ_i δ_i) part of the formula above.

    Then comes my problem.

    weights = alpha * tf.math.cumprod(1. - alpha + 1e-10, axis=-1, exclusive=True)
    rgb_map = tf.reduce_sum(weights[..., None] * rgb, axis=-2)  # [N_rays, 3]

    As I understand it, the weights variable corresponds to the T_i (1 − exp(−σ_i δ_i)) part of the formula (not exactly). But shouldn't it be implemented roughly like the code below, according to the paper?

    alpha = 1 - np.exp(-volume_density * delta)
    T_i = np.exp(-np.cumsum(volume_density * delta))
    rgb_map = np.sum(T_i*alpha*colors, axis=1)
    

    Why does this difference exist? Is my understanding wrong, or am I missing something?
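
    A quick NumPy sanity check (assuming the paper's definitions of α_i and T_i above) suggests that the exclusive cumulative product of (1 − α) is the same quantity as T_i, since Π_{j<i} (1 − α_j) = exp(−Σ_{j<i} σ_j δ_j):

        import numpy as np

        # With alpha_i = 1 - exp(-sigma_i * delta_i), the exclusive cumulative
        # product of (1 - alpha) equals T_i = exp(-sum_{j<i} sigma_j * delta_j).
        sigma = np.random.rand(8)
        delta = np.random.rand(8)
        alpha = 1. - np.exp(-sigma * delta)

        T_cumprod = np.cumprod(np.concatenate([[1.], 1. - alpha[:-1]]))
        T_exclusive_sum = np.exp(-np.concatenate([[0.], np.cumsum(sigma * delta)[:-1]]))

        print(np.allclose(T_cumprod, T_exclusive_sum))   # True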

    opened by DinushkaDDS 4
  •  Is it possible to generate a video from my own images on google colab demo?

    Hi nerf team. I have a question.

    I tried the demo below: https://colab.research.google.com/github/bmild/nerf/blob/master/tiny_nerf.ipynb

    I successfully created a demo video. Next, I want to generate a video from my own images. Is that possible? Please tell me if there is a way.

    opened by mule-engineer13 0
  • Fix tiny_nerf for colab compatibility

    Colab removed TensorFlow 1 support, so I added install commands to make it work. I also added a command to install a needed video module.

    If the diff is too messy, you can simply copy and paste this code:

    # %tensorflow_version 1.x
    #
    # TensorFlow 1 is deprecated in Colab
    # https://stackoverflow.com/questions/73215696/did-colab-suspend-tensorflow-1-x
    
    !echo y | pip uninstall tensorflow
    !pip install tensorflow-gpu==1.15
    !apt install --allow-change-held-packages libcudnn7=7.4.1.5-1+cuda10.0
    
    # required for video below
    !pip install imageio-ffmpeg
    
    opened by philipkd 0
  •  360° scene on real data

    Hi! Thank you so much for your work :)

    I'm now trying to use NeRF on my own 360° captured real scene, using the flags --no_ndc --spherify --lindisp and a config file similar to the fern.txt you provided, but I'm getting terrible results...

    To get a reasonable result, should I use your config for synthetic data like lego.txt?

    opened by Seagullflysxt 0
  • About change L1 loss to SSIM + L1 loss

    Hi, I tried to change the L1 supervision to SSIM + L1 supervision (0.85 * L_ssim + 0.15 * L_l1), but failed to get good results. My motivation is that the original per-pixel supervision only looks at individual pixels and discards the local relationships between pixels, so maybe better performance could be achieved by adding a structural loss like SSIM. I compute the SSIM loss by sampling a batch of patches instead of pixels during training (since memory consumption grows quickly with larger patches, I only tried a patch length of 3, i.e. 9 pixels). I am wondering why performance gets worse when the SSIM loss is added. Any discussion is welcome.

    Thanks!

    opened by Beniko95J 0
  • Why disparity

    From your supplementary material, I see that "Note that, as desired, t' = 0 when t = 0. Additionally, we see that t' → 1 as t → ∞." I think this considers depth rather than disparity.
    Isn't t' sampled linearly from 0 to 1? Why do we need the extra operation of inverting the depth?

    What does "using z dimension representing inverse depth" mean?

    opened by TwiceMao 0