Overview

InfoNeRF: Ray Entropy Minimization for Few-Shot Neural Volume Rendering

PyTorch implementation of our method for regularizing neural radiance fields for few-shot neural volume rendering.

Project | Paper

Mijeong Kim, Seonguk Seo, Bohyung Han

Seoul National University

arXiv preprint arXiv:2112.15399, 2021


We present an information-theoretic regularization technique for few-shot novel view synthesis based on neural implicit representation. The proposed approach minimizes potential reconstruction inconsistency that arises from insufficient viewpoints by imposing an entropy constraint on the density along each ray. In addition, to alleviate potential degeneracy when all training images are acquired from nearly redundant viewpoints, we further incorporate a spatial smoothness constraint into the estimated images by restricting the information gain from a pair of rays with slightly different viewpoints. The main idea of our algorithm is to make reconstructed scenes compact along individual rays and consistent across neighboring rays. The proposed regularizers can be plugged into most existing NeRF-based neural volume rendering techniques in a straightforward way. Despite its simplicity, our method consistently outperforms existing neural view synthesis methods by large margins on multiple standard benchmarks.
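
For intuition, the ray-entropy idea can be sketched in a few lines of PyTorch. This is an illustrative sketch, not the repository's exact implementation: the tensor names (sigma, dists) and the opacity mask are assumptions, while the log2 entropy and the 0.1 opacity threshold mirror the entropy_type and entropy_acc_threshold options that appear in the configs quoted below.

    import torch

    def ray_entropy_loss(sigma, dists, acc_threshold=0.1, eps=1e-10):
        """Sketch of a ray-entropy regularizer.

        sigma: (num_rays, num_samples) raw densities along each ray
        dists: (num_rays, num_samples) distances between adjacent samples
        """
        # Per-sample opacity, as in standard NeRF alpha compositing.
        alpha = 1.0 - torch.exp(-torch.relu(sigma) * dists)        # (R, S)

        # Normalize opacities along the ray into a discrete distribution.
        prob = alpha / (alpha.sum(dim=-1, keepdim=True) + eps)     # (R, S)

        # Entropy of that distribution: low entropy means the density is
        # concentrated at few samples, i.e. compact along the ray.
        entropy = -(prob * torch.log2(prob + eps)).sum(dim=-1)     # (R,)

        # Mask out nearly empty rays, using total sampled opacity as a proxy,
        # so background rays are not forced to become opaque.
        mask = (alpha.sum(dim=-1) > acc_threshold).float()
        return (entropy * mask).sum() / (mask.sum() + eps)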


Citation

If you find our work useful in your research, please cite:

@article{kim2021infonerf,
    title   = {InfoNeRF: Ray Entropy Minimization for Few-Shot Neural Volume Rendering},
    author  = {Mijeong Kim and Seonguk Seo and Bohyung Han},
    journal = {arXiv preprint arXiv:2112.15399},
    year    = {2021},
}

Acknowledgements

This code borrows heavily from nerf-pytorch.

Comments
  • About the ray for calculating the entropy loss

    In the paper you write, "where R_s denotes a set of rays from training images, R_u denotes a set of rays from randomly sampled unseen images, and ⊙ indicates element-wise multiplication", which implies that the entropy loss is computed over the union of seen and unseen rays. However, in the lego code the entropy loss is applied only to the unseen rays: acc = acc[self.N_samples:] and sigma = sigma[self.N_samples:]. Why is this loss computed only on the last 1024 rays?
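
    A plausible reading of the batch layout behind that slicing, sketched with assumed names (nothing here is taken from the repository) and the ray counts from the configs quoted in the next issue:

    import torch

    # Assumed layout: the first N_rand rays come from training views and drive
    # the RGB loss, while the next N_entropy rays are cast from randomly
    # sampled unseen poses. Dummy tensors stand in for rendered outputs.
    N_rand, N_entropy, N_samples = 1024, 1024, 64
    sigma = torch.rand(N_rand + N_entropy, N_samples)  # densities for the whole batch
    acc = torch.rand(N_rand + N_entropy)               # per-ray accumulated opacity

    # Slicing off the first block leaves only the unseen rays, which would
    # explain why only the last 1024 rays enter the entropy loss.
    sigma_unseen = sigma[N_rand:]
    acc_unseen = acc[N_rand:]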

    opened by chensjtu 2
  • Unable to reproduce results on Realistic Synthetic 360 scenes

    Hi,

    I am trying to reproduce the results on the Realistic Synthetic 360 lego scene. I tried the configs released in this GitHub repo as well as the configs mentioned in issue-2, but I was not able to get anywhere close to the reported performance with either of them. Any suggestions on how to figure out what is happening? I've attached both configs below for reference. Kindly let me know if you need any other information.

    Configs from this GitHub repo:

    datadir = ../Data/Databases/NeRF_Synthetic/Data1/lego
    expname = lego
    basedir = ../Runs/Training/Train0034
    dataset_type = blender
    train_scene = [26, 86, 2, 55]
    factor = 1
    
    N_rand = 1024
    N_samples = 64
    N_importance = 128
    use_viewdirs = True
    no_batching = True
    lrate_decay = 500
    white_bkgd = True
    
    precrop_iters = 500
    precrop_frac = 0.5
    
    entropy = True
    N_entropy = 1024
    entropy_ray_zvals_lambda = 0.001
    fewshot = 4
    
    wandb = False
    i_wandb = 10
    
    N_iters = 50000
    i_weights = 25000
    

    Configs from issue-2:

    datadir = ../Data/Databases/NeRF_Synthetic/Data1/lego
    expname = lego
    basedir = ../Runs/Training/Train0034
    dataset_type = blender
    train_scene = [26, 86, 2, 55]
    factor = 1
    
    N_rand = 1024
    N_samples = 64
    N_importance = 128
    use_viewdirs = True
    no_batching = True
    lrate_decay = 500
    white_bkgd = True
    
    precrop_iters = 500
    precrop_frac = 0.5
    
    entropy = True
    N_entropy = 1024
    entropy_ray_zvals_lambda = 0.001
    fewshot = 4
    
    smooth_sampling_method = near_pixel
    smooth_pixel_range = 1
    smoothing_activation = softmax
    smoothing_lambda = 0.00001
    smoothing_step = 2500
    
    wandb = False
    i_wandb = 10
    
    N_iters = 50000
    i_weights = 25000
    
    opened by NagabhushanSN95 0
  • Performance is very bad on LLFF dataset with 2 input views

    Hi,

    I tried your model on the LLFF dataset with 2 input views, but the reconstruction of novel views is very bad. I've attached the images below (train viewpoints are reconstructed reasonably well). Is this expected with so few views, or do you think something is going wrong?

    I'm trying the fern scene. I used images 6 and 8 as train frames and images 5, 7, and 9 as test frames.

    Image 5 (test frame)

    Image 6 (train frame)

    Image 7 (test frame)

    Image 8 (train frame)

    Image 9 (test frame)

    Configs: args.txt

    N_entropy = 1024
    N_importance = 128
    N_iters = 50000
    N_rand = 1024
    N_samples = 64
    alpha_model_path = None
    basedir = ../Runs/Training/Train0011
    chunk = 32768
    ckpt_render_iter = None
    computing_entropy_all = False
    config = None
    datadir = ../Data/Databases/NeRF_LLFF/Data
    dataset_type = NeRF_LLFF
    debug = False
    entropy = True
    entropy_acc_threshold = 0.1
    entropy_end_iter = None
    entropy_ignore_smoothing = False
    entropy_log_scaling = False
    entropy_ray_lambda = 1
    entropy_ray_zvals_lambda = 0.001
    entropy_type = log2
    eval_only = False
    expname = fern
    factor = 4
    fewshot = 2
    fewshot_seed = 0
    ft_path = None
    half_res = False
    i_embed = 0
    i_img = 500
    i_print = 100
    i_testset = 50000
    i_video = 50000
    i_wandb = 100
    i_weights = 25000
    lindisp = False
    llffhold = 8
    lrate = 0.0005
    lrate_decay = 500
    maskdir = None
    multires = 10
    multires_views = 4
    near_c2w_rot = 5
    near_c2w_trans = 0.1
    near_c2w_type = rot_from_origin
    netchunk = 65536
    netdepth = 8
    netdepth_fine = 8
    netwidth = 256
    netwidth_fine = 256
    no_batching = True
    no_coarse = False
    no_ndc = False
    no_reload = False
    num_interpolation_poses = 3
    perturb = 1.0
    precrop_frac = 0.5
    precrop_iters = 0
    raw_noise_std = 0.0
    render_factor = 0
    render_mypath = False
    render_only = False
    render_pass = False
    render_test = False
    render_test_full = False
    render_test_ray = False
    render_train = False
    shape = greek
    smooth_pixel_range = None
    smooth_sampling_method = near_pose
    smoothing = False
    smoothing_activation = norm
    smoothing_end_iter = None
    smoothing_lambda = 1
    smoothing_rate = 1
    smoothing_step_size = 5000
    spherify = False
    test_scene = None
    testskip = 8
    train_scene = None
    train_set_num = 1
    use_viewdirs = True
    wandb = False
    wandb_group = None
    white_bkgd = False
    
    opened by NagabhushanSN95 3
  • Use of test camera poses during training

    Hi,

    In the code here, you are using test camera poses during training. Instead of this, is it possible to somehow generate new poses from the train poses and use them?
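
    One possible direction, sketched under assumptions: the near_c2w_type = rot_from_origin and near_c2w_rot = 5 options in the configs above hint at jittering a training pose by a small random rotation. The helper below is hypothetical, assuming 4x4 camera-to-world matrices:

    import math
    import torch

    def perturb_pose(c2w, max_degrees=5.0):
        """Hypothetical helper: jitter a 4x4 camera-to-world pose by a small
        random rotation about the world origin, yielding a nearby unseen pose."""
        theta = math.radians(max_degrees) * (2.0 * torch.rand(()).item() - 1.0)
        c, s = math.cos(theta), math.sin(theta)
        # Rotation about the world z-axis; the other axes work analogously.
        rot = torch.tensor([[c, -s, 0.0, 0.0],
                            [s,  c, 0.0, 0.0],
                            [0.0, 0.0, 1.0, 0.0],
                            [0.0, 0.0, 0.0, 1.0]], dtype=c2w.dtype)
        return rot @ c2w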

    opened by NagabhushanSN95 0
  • How to implement DTU

    Hi, how can I use the DTU dataset? The official DTU dataset is not consistent with the load_dtu.py file. For example, how do I get the DTU masks, and how do I compute the camera.npz file? Thank you.

    opened by Wanggcong 1
  • Regarding the reproduction of the results

    Thanks for your great work.

    Regarding reproduction on the lego dataset: training does not converge when I use both the entropy and KL losses. I used the default config file at "config/infonerf/synthetic/lego.txt". Although I expected results similar to those reported in the paper, I failed to train the model. To turn on the KL loss, I just added "Smoothing=True" and used the default parameters for the rest:


    parser.add_argument("--smooth_sampling_method", type=str, default='near_pose', 
        help='how to sample the near rays, near_pose: modifying camera pose, near_pixel: sample near pixel', 
                    choices=['near_pose', 'near_pixel'])
    # 1) sampling by rotating camera pose
    parser.add_argument("--near_c2w_type", type=str, default='rot_from_origin', 
                        help='random augmentation method')
    parser.add_argument("--near_c2w_rot", type=float, default=5, 
                        help='random augmentation rotate: degree')
    parser.add_argument("--near_c2w_trans", type=float, default=0.1, 
                        help='random augmentation translation')
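
    For reference, a minimal sketch of the kind of KL smoothness term these flags control; this is illustrative, not the repository's exact code, and the smoothing_activation option suggests the normalization may alternatively be a softmax:

    import torch

    def kl_smoothness_loss(sigma, sigma_near, eps=1e-10):
        """Sketch: push the normalized density distribution of each ray toward
        that of a slightly perturbed neighboring ray. Assumes sigma holds
        non-negative densities of shape (num_rays, num_samples)."""
        p = sigma / (sigma.sum(dim=-1, keepdim=True) + eps)
        q = sigma_near / (sigma_near.sum(dim=-1, keepdim=True) + eps)
        # KL(p || q), summed over samples along each ray, averaged over rays.
        return (p * (torch.log(p + eps) - torch.log(q + eps))).sum(-1).mean()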
    

    Could you please let me know if some important hyper-parameters are missing from the default config file? Just for your information, with the default config file (w/o KL loss) I succeeded in training and the results seem reasonable.

    In addition, you mentioned the role of the KL loss in the appendix. I would like to reproduce the result on the narrow-baseline dataset as well. Could you please tell me which image indexes you used for the narrow-baseline 4-view setting?

    Thank you for your feedback in advance.

    Best regards,

    opened by ayclove 2