[ICCV 2021 Oral] Our method can estimate camera poses and neural radiance fields jointly when the cameras are initialized at random poses in complex scenarios (outside-in scenes, even with less texture or intense noise)

Quan Meng

Last update: Dec 26, 2022

Overview

GNeRF

This repository contains the official code for the ICCV 2021 paper: GNeRF: GAN-based Neural Radiance Field without Posed Camera. This implementation is written in PyTorch.

Abstract

We introduce GNeRF, a framework to marry Generative Adversarial Networks (GAN) with Neural Radiance Field (NeRF) reconstruction for complex scenarios with unknown and even randomly initialized camera poses. Recent NeRF-based advances have gained popularity for remarkably realistic novel view synthesis. However, most of them rely heavily on accurate camera pose estimation, while the few recent methods that optimize unknown camera poses only work in roughly forward-facing scenes with relatively short camera trajectories and require a rough camera pose initialization. In contrast, our GNeRF uses only randomly initialized poses for complex outside-in scenarios. We propose a novel two-phase end-to-end framework. The first phase brings GANs into a new realm by optimizing coarse camera poses and radiance fields jointly, while the second phase refines them with an additional photometric loss. We overcome local minima using a hybrid and iterative optimization scheme. Extensive experiments on a variety of synthetic and natural scenes demonstrate the effectiveness of GNeRF. More impressively, our approach outperforms the baselines in scenes with repeated patterns or even low textures, which were previously regarded as extremely challenging.

Installation

We recommend using Anaconda to set up the environment. Run the following commands:

# Create a conda environment named 'gnerf'
conda create --name gnerf python=3.7
# Activate the environment
conda activate gnerf
# Install requirements
pip install -r requirements.txt

Data

Blender

Download nerf_synthetic.zip from the official NeRF Google Drive and unzip it.

DTU

Download the preprocessed DTU training data from the original MVSNet repo and unzip it. We also provide a few DTU examples for fast testing.

Your own data

Here is some advice on preparing your own dataset and setting the related parameters:

Pose sampling space should be close to the data: Our method requires a reasonable prior pose distribution.
The training may fail to converge on symmetrical scenes: The inversion network cannot map an image to different poses.
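To make the idea of a pose sampling space concrete, here is a minimal sketch that draws random camera poses on a sphere around the object, each looking at the origin. It is not taken from this repository: the function name sample_poses, the azimuth/elevation parameterization, the default ranges, and the OpenGL-style camera axes are all assumptions for illustration. In practice the ranges should be narrowed until they roughly cover where your real cameras were.

```python
import numpy as np

def sample_poses(n, radius=4.0, azim_range=(0.0, 360.0), elev_range=(0.0, 80.0), seed=0):
    """Sample n camera-to-world matrices on a sphere, all looking at the origin.

    Hypothetical helper: angles are in degrees, the world z axis is "up",
    and the camera looks along its own -z axis (OpenGL-style convention).
    """
    rng = np.random.default_rng(seed)
    azim = np.deg2rad(rng.uniform(*azim_range, size=n))
    elev = np.deg2rad(rng.uniform(*elev_range, size=n))

    # Camera centres in world coordinates.
    centers = radius * np.stack(
        [np.cos(elev) * np.cos(azim), np.cos(elev) * np.sin(azim), np.sin(elev)],
        axis=-1,
    )

    poses = np.tile(np.eye(4), (n, 1, 1))
    world_up = np.array([0.0, 0.0, 1.0])
    for i, c in enumerate(centers):
        back = c / np.linalg.norm(c)       # camera +z points away from the scene
        right = np.cross(world_up, back)   # degenerate if elevation is exactly 90 degrees
        right /= np.linalg.norm(right)
        up = np.cross(back, right)
        poses[i, :3, :3] = np.stack([right, up, back], axis=1)
        poses[i, :3, 3] = c
    return poses
```

Note that sampling elevation uniformly is not uniform on the sphere; it is used here only because a pair of angle ranges is easy to reason about when matching a capture setup.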

Running

python train.py ./config/CONFIG.yaml --data_dir PATH/TO/DATASET

where you replace CONFIG.yaml with your config file (blender.yaml for the Blender dataset and dtu.yaml for the DTU dataset). You can optionally monitor the training process with TensorBoard by adding the --open_tensorboard argument. The default setting takes around 13 GB of GPU memory. After 40k iterations, you should get a video like these:

Evaluation

python eval.py --ckpt PATH/TO/CKPT.pt --gt PATH/TO/GT.json

where you replace PATH/TO/CKPT.pt with your trained model checkpoint, and PATH/TO/GT.json with the JSON file from the NeRF-Synthetic dataset. Then run the ATE toolbox on the evaluation directory.
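For readers unfamiliar with absolute trajectory error (ATE), the sketch below shows the usual idea: align the estimated camera centres to the ground truth with a least-squares similarity transform (Umeyama alignment), then report the RMSE of the remaining position differences. This is not the ATE toolbox's code and not part of this repository; the function names and the choice of a similarity (rather than rigid) alignment are assumptions for illustration.

```python
import numpy as np

def umeyama_alignment(src, dst):
    """Least-squares similarity transform (s, R, t) such that dst ~ s * R @ src + t.

    src, dst: arrays of shape (N, 3) holding corresponding camera centres.
    """
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - mu_s, dst - mu_d
    cov = dst_c.T @ src_c / len(src)
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0  # flip the last axis to avoid returning a reflection
    R = U @ S @ Vt
    var_s = (src_c ** 2).sum() / len(src)
    s = np.trace(np.diag(D) @ S) / var_s
    t = mu_d - s * R @ mu_s
    return s, R, t

def ate_rmse(est, gt):
    """RMSE of camera-centre errors after similarity alignment of est to gt."""
    s, R, t = umeyama_alignment(est, gt)
    aligned = (s * (R @ est.T)).T + t
    return float(np.sqrt(((aligned - gt) ** 2).sum(axis=1).mean()))
```

Because a radiance field trained without poses is only defined up to a global similarity transform, some alignment of this kind is needed before estimated and ground-truth poses can be compared.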

List of Possible Improvements

For future work, we recommend the following aspects to further improve the performance and stability:

Replace the single NeRF network with a mip-NeRF network: The use of separate MLPs in the original NeRF paper is a key detail for representing thin objects in the scene; if you retrain the original NeRF with only one MLP, you will see a decrease in performance. In our work, however, a single MLP network is necessary to keep the coarse image and fine image aligned. The cone casting and IPE features of mip-NeRF allow it to explicitly encode scale into the input features, and thereby enable a single MLP to learn a multiscale representation of the scene.
Combine with BARF to further overcome local minima: The BARF method shows that susceptibility to noise from positional encoding affects the basin of attraction for registration, and presents a coarse-to-fine registration strategy.
Combine with NeRF++ to represent the background in real scenes with complex backgrounds.
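As an illustration of the coarse-to-fine strategy mentioned for BARF, the sketch below weights each frequency band of the positional encoding by a smooth mask controlled by a progress parameter alpha, so that high frequencies are switched on gradually during training. This follows the weighting described in the BARF paper as we understand it; it is not code from this repository or from BARF, and the function names and the NumPy formulation are assumptions for illustration.

```python
import numpy as np

def barf_weights(alpha, num_freqs):
    """Per-band weights for k = 0..num_freqs-1, with alpha in [0, num_freqs].

    A band is off (0) while alpha < k, fully on (1) once alpha >= k + 1,
    and follows a raised-cosine ramp in between.
    """
    k = np.arange(num_freqs)
    return (1.0 - np.cos(np.clip(alpha - k, 0.0, 1.0) * np.pi)) / 2.0

def weighted_positional_encoding(x, num_freqs, alpha):
    """Encode x of shape (..., D) into shape (..., 2 * D * num_freqs)."""
    x = np.asarray(x, dtype=np.float64)
    freqs = (2.0 ** np.arange(num_freqs)) * np.pi   # shape (L,)
    angles = x[..., None] * freqs                   # shape (..., D, L)
    w = barf_weights(alpha, num_freqs)              # shape (L,), broadcast over bands
    enc = np.concatenate([w * np.sin(angles), w * np.cos(angles)], axis=-1)
    return enc.reshape(*x.shape[:-1], -1)
```

A typical schedule would raise alpha linearly from 0 to num_freqs over the early part of training; how that would interact with the GAN phase here is untested.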

Citation

If you find our code or paper useful, please consider citing

@InProceedings{meng2021gnerf,
    author = {Meng, Quan and Chen, Anpei and Luo, Haimin and Wu, Minye and Su, Hao and Xu, Lan and He, Xuming and Yu, Jingyi},
    title = {{G}{N}e{R}{F}: {G}{A}{N}-based {N}eural {R}adiance {F}ield without {P}osed {C}amera},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    year = {2021}
}

Some code snippets are borrowed from GRAF and nerf_pl. Thanks for these great projects.

Comments

GIF result on hotdog & lego dataset: all is white

Hello authors, I recently read the GNeRF paper and tried to reproduce the results on the Blender hotdog and lego datasets, using the released code in this repository with the default settings. However, the output GIF is all white after at least 30k training iterations.

I have also read the earlier issues here. The author says that the GAN training sometimes fails, which leads to all-white results. Is it normal for GNeRF to fail during GAN training?

opened by ZhangXiaoXuan2019 1

Experiments on DTU and blender dataset: blurry outputs, mode collapse

Thanks for sharing the implementation! This is a pretty interesting work!

I'm trying to reproduce the results on DTU and have several questions:

What is the size of images used for the experiments? In config/dtu.yaml, the image size is [500, 400] by default. However, in the datasets.py (https://github.com/quan-meng/gnerf/blob/a008c63dba3a0f7165e912987942c47972759879/dataset/datasets.py#L120), the size is enforced to be proportional to the original size which is 1600x1200 for DTU, thus [500, 400] does not work. Should the size be [400, 300] instead? I set the image size to be [400, 300] in my following experiments.

I got pretty blurry synthesis results on DTU, for example on scan63 and scan4, each after 30K iterations (result images not shown here).

What are the pose estimation scores (rotation and translation errors) on DTU dataset?

opened by ZezhouCheng 3

train result

I ran into a problem when running the code on my own Blender data, which has the same format as the nerf_synthetic data. None of the results from the ABAB phase (from 12k iterations) converged, and the renderings looked scattered; this continued through phase B. Do you know why this is happening? I trained with python train.py ./config/blender.yaml --data_dir PATH/TO/DATASET. Thank you for the code.

opened by yenncye 0

Output GIF is All NONE

Dear Authors, Thanks for the amazing work.

I have a problem with the network output. At the end of training, the generator and discriminator appear to have converged, but the result GIF is empty (the RGB GIF is all white and the depth GIF is all black). I am confused about this.

The only thing I changed before training was the batch size, from 12 to 6. Could this change cause the error? :)

PS: I used the Blender drums dataset for training, on one 2080 Ti GPU, for almost 4 days.

Thanks for your help.

opened by SimonCK666 6

problem on image size

Dear Authors, Thanks for the amazing work.

I have a problem when the image height h is not equal to the width w. When computing PSNR in phase B, the rendered image is [w, h] but the real image is [h, w]. The problem also shows up in the TensorBoard 'rgb' panel. The relevant line in similarity.py is: mse value = (image_pred - image_gt) ** 2

Thanks for your help.

opened by js-duan 2

Pose Distribution Prior

Dear Authors,

Thanks for the interesting work and releasing the code. I was wondering about the advice that you've put in the readme on training with our own data. Assuming that I only have an image dataset, how can I 1) find this suitable prior distribution, 2) train your model on it?

Thanks for your help in advance.

opened by hmdolatabadi 2

BARF: Bundle-Adjusting Neural Radiance Fields 🤮 (ICCV 2021 oral)

BARF 🤮: Bundle-Adjusting Neural Radiance Fields Chen-Hsuan Lin, Wei-Chiu Ma, Antonio Torralba, and Simon Lucey IEEE International Conference on Comp

539 Dec 28, 2022

[ICCV 2021 Oral] NerfingMVS: Guided Optimization of Neural Radiance Fields for Indoor Multi-view Stereo

NerfingMVS Project Page | Paper | Video | Data NerfingMVS: Guided Optimization of Neural Radiance Fields for Indoor Multi-view Stereo Yi Wei, Shaohui

369 Dec 24, 2022

D-NeRF: Neural Radiance Fields for Dynamic Scenes

D-NeRF: Neural Radiance Fields for Dynamic Scenes [Project] [Paper] D-NeRF is a method for synthesizing novel views, at an arbitrary point in time, of

291 Jan 2, 2023

A minimal TPU compatible Jax implementation of NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis

NeRF Minimal Jax implementation of NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis. Result of Tiny-NeRF RGB Depth

11 Jul 24, 2022

(Arxiv 2021) NeRF--: Neural Radiance Fields Without Known Camera Parameters

NeRF--: Neural Radiance Fields Without Known Camera Parameters Project Page | Arxiv | Colab Notebook | Data Zirui Wang¹, Shangzhe Wu², Weidi Xie², Min

411 Dec 26, 2022

Unofficial & improved implementation of NeRF--: Neural Radiance Fields Without Known Camera Parameters

[Unofficial code-base] NeRF--: Neural Radiance Fields Without Known Camera Parameters [ Project | Paper | Official code base ] ⬅️ Thanks the original

239 Dec 22, 2022

CAMPARI: Camera-Aware Decomposed Generative Neural Radiance Fields

CAMPARI: Camera-Aware Decomposed Generative Neural Radiance Fields Paper | Supplementary | Video | Poster If you find our code or paper useful, please

26 Nov 29, 2022

Stereo Radiance Fields (SRF): Learning View Synthesis for Sparse Views of Novel Scenes

111 Dec 29, 2022

Camera-caps - Examine the camera capabilities for V4l2 cameras

camera-caps This is a graphical user interface over the v4l2-ctl command line to

25 Dec 26, 2022

[ICCV'21] UNISURF: Unifying Neural Implicit Surfaces and Radiance Fields for Multi-View Reconstruction

UNISURF: Unifying Neural Implicit Surfaces and Radiance Fields for Multi-View Reconstruction Project Page | Paper | Supplementary | Video This reposit

331 Dec 28, 2022

Official implementation of "Open-set Label Noise Can Improve Robustness Against Inherent Label Noise" (NeurIPS 2021)

Open-set Label Noise Can Improve Robustness Against Inherent Label Noise NeurIPS 2021: This repository is the official implementation of ODNL. Require

12 Dec 7, 2022

Official Pytorch implementation of "Learning to Estimate Robust 3D Human Mesh from In-the-Wild Crowded Scenes", CVPR 2022

Learning to Estimate Robust 3D Human Mesh from In-the-Wild Crowded Scenes / 3DCrowdNet News ?? 3DCrowdNet achieves the state-of-the-art accuracy on 3D

113 Dec 21, 2022

PyTorch implementation of paper "Neural Scene Flow Fields for Space-Time View Synthesis of Dynamic Scenes", CVPR 2021

Neural Scene Flow Fields PyTorch implementation of paper "Neural Scene Flow Fields for Space-Time View Synthesis of Dynamic Scenes", CVPR 20

585 Jan 4, 2023

This repository contains the code for the CVPR 2021 paper "GIRAFFE: Representing Scenes as Compositional Generative Neural Feature Fields"

1.1k Dec 30, 2022

Complex-Valued Neural Networks (CVNN)

Complex-Valued Neural Networks (CVNN) Done by @NEGU93 - J. Agustin Barrachina Using this library, the only difference with a Tensorflow code is that y

1 Nov 12, 2021

Official PyTorch implementation of Less is More: Pay Less Attention in Vision Transformers.

Less is More: Pay Less Attention in Vision Transformers Official PyTorch implementation of Less is More: Pay Less Attention in Vision Transformers. By

73 Jan 1, 2023

GANmouflage: 3D Object Nondetection with Texture Fields

GANmouflage: 3D Object Nondetection with Texture Fields Rui Guo1 Jasmine Collins

29 Aug 10, 2022

Open source repository for the code accompanying the paper 'Non-Rigid Neural Radiance Fields Reconstruction and Novel View Synthesis of a Deforming Scene from Monocular Video'.

Non-Rigid Neural Radiance Fields This is the official repository for the project "Non-Rigid Neural Radiance Fields: Reconstruction and Novel View Synt

296 Dec 29, 2022

Pytorch implementation for A-NeRF: Articulated Neural Radiance Fields for Learning Human Shape, Appearance, and Pose

A-NeRF: Articulated Neural Radiance Fields for Learning Human Shape, Appearance, and Pose Paper | Website | Data A-NeRF: Articulated Neural Radiance F

172 Dec 22, 2022
