Unsupervised Discovery of Object Radiance Fields

by Hong-Xing Yu, Leonidas J. Guibas and Jiajun Wu from Stanford University.

[Teaser figure]

arXiv link: https://arxiv.org/abs/2107.07905

Project website: https://kovenyu.com/uorf

Environment

We recommend using Conda:

conda env create -f environment.yml
conda activate uorf-3090

or install the packages listed therein. Please make sure you have NVIDIA drivers supporting CUDA 11.0, or modify the version specifications in environment.yml.
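
As a quick sanity check (a minimal snippet, not part of the repo), you can verify that the installed PyTorch build can actually see your GPU:

import torch
print(torch.__version__, torch.version.cuda)  # expect a CUDA-11.0 build
print(torch.cuda.is_available())              # True if the driver matches the runtime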

Data and model

Please download datasets and models here.

Evaluation

We assume you have a GPU. If you have already downloaded and unzipped the datasets and models into the root directory, simply run

bash scripts/eval_nvs_seg_chair.sh

from the root directory. Replace the script filename with eval_nvs_seg_clevr.sh, eval_nvs_seg_diverse.sh, or eval_scene_manip.sh for the other evaluations. Results will be saved into ./results/. During evaluation, intermediate results are also sent to visdom in a nicer form, accessible at localhost:8077.
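
If no visdom server is already running, you can start one yourself (assuming the scripts use the port 8077 shown above):

python -m visdom.server -port 8077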

Training

We assume you have a GPU with at least 24GB of memory, e.g., a 3090 (evaluation does not require this much, as rendering can be done ray-wise, but some training losses are defined in image space). Then run

bash scripts/train_clevr_567.sh

or another training script. If you unzipped the datasets somewhere else, pass that location as the first parameter:

bash scripts/train_clevr_567.sh PATH_TO_DATASET

Training takes ~6 days on a 3090 for CLEVR-567 and Room-Chair, and ~9 days for Room-Diverse. It can take even longer on less powerful GPUs (e.g., ~10 days on a Titan RTX for CLEVR-567 and Room-Chair). During training, visualizations are sent to localhost:8077.

Bibtex

@article{yu2021unsupervised,
  author    = {Yu, Hong-Xing and Guibas, Leonidas J. and Wu, Jiajun},
  title     = {Unsupervised Discovery of Object Radiance Fields},
  journal   = {arXiv preprint arXiv:2107.07905},
  year      = {2021},
}

Acknowledgement

Our code framework is adapted from Jun-Yan Zhu's CycleGAN. Some code related to the adversarial loss is adapted from a PyTorch implementation of StyleGAN2. Some snippets are adapted from PyTorch implementations of Slot Attention and NeRF. If you find any problems, please don't hesitate to email me at [email protected] or open an issue.

Comments
  • Question about segmentation and manipulation

Hi, thanks for the effort; the work is interesting. I have two questions:

1. Is the right way to predict segmentations argmax(integrate(weight * density), slot_dim)? It seems this could go wrong when occlusion occurs. I show a rendered example below (the segmentation on top, the image on the bottom); I'm referring to the two cylinders on the right. A sketch of this readout follows the images.

[rendered segmentation mask and reconstructed image]
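
For reference, a minimal sketch of this kind of readout under standard NeRF alpha compositing (slot_density and z_vals are hypothetical tensor names; this is not necessarily the exact readout used in the repo):

import torch

def segment_from_slot_densities(slot_density, z_vals):
    # slot_density: (K, R, S) per-slot density at S samples along each of R rays
    # z_vals: (S,) sample depths shared by all rays
    total_density = slot_density.sum(dim=0)                # (R, S) composed scene density
    deltas = z_vals[1:] - z_vals[:-1]                      # (S-1,) inter-sample distances
    deltas = torch.cat([deltas, deltas[-1:]], dim=0)       # pad the last interval
    alpha = 1.0 - torch.exp(-total_density * deltas)       # (R, S) per-sample opacity
    trans = torch.cumprod(torch.cat([torch.ones_like(alpha[:, :1]),
                                     1.0 - alpha[:, :-1]], dim=1), dim=1)  # transmittance
    weights = alpha * trans                                # (R, S) rendering weights
    share = slot_density / total_density.clamp(min=1e-10)  # (K, R, S) slot share per sample
    slot_weight = (weights.unsqueeze(0) * share).sum(-1)   # (K, R) integrated slot weight
    return slot_weight.argmax(dim=0)                       # (R,) slot id per ray

Because the weights include transmittance, an occluded slot's contribution is attenuated rather than summed naively, which is exactly the case the question is probing.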

2. How can I manipulate CLEVR scenes? I get:

    FileNotFoundError: [Errno 2] No such file or directory: './clevr_567_test/00000_sc0000_az00_moved.png'

I'm asking because a background slot view contains object shadows, as shown below. What happens to these shadows during manipulation?

[background slot rendering showing object shadows]

    opened by KelestZ 2
  • Download Issues of Data and Pretrained Models.

Thanks for your great work. The download link provided in the README requires a Stanford SharePoint account to log in, which makes it inaccessible from outside. I tried both a personal account and my institution account, but neither worked.

Could you please fix this or upload the files to Google Drive? Thanks.

    opened by Hai-chao-Zhang 2
  • Setting of the Scene Design and Editing Experiments.

Hi, thanks for your great work. I'm trying to reproduce your scene editing baseline. How did you get the object-moving results in the paper? Is it done by editing slot features before sending them to the decoder, or by directly manipulating the NeRF volume before composition?
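
For illustration, a hedged sketch of the second option (decoder, pts, slot_feats, and offset are hypothetical names; the paper's actual editing procedure may differ):

import torch

def render_with_moved_object(decoder, pts, slot_feats, k, offset):
    # shift the query points of slot k by -offset so its decoded
    # object appears translated by +offset after composition
    sigmas, rgbs = [], []
    for i, feat in enumerate(slot_feats):
        query = pts - offset if i == k else pts  # move only object k
        sigma, rgb = decoder(query, feat)        # per-slot density and color
        sigmas.append(sigma)
        rgbs.append(rgb)
    return torch.stack(sigmas), torch.stack(rgbs)  # compose downstream as usual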

    opened by Hai-chao-Zhang 1
  • Information about dataloading

Hello, I am trying to apply uORF to my own dataset, which currently consists only of multi-view RGB images. I noticed that the dataloading code loads many additional files besides the images, and it is not clear from the paper which inputs are necessary and which are not.

Could you elaborate on the different files loaded in the dataloading code? For example, whether they are required for training, whether they are only for evaluation or debugging, and how useful they are for debugging. This would help me figure out whether I should generate the corresponding metadata for my custom dataset or skip it.

    https://github.com/KovenYu/uORF/blob/main/data/multiscenes_dataset.py

    opened by hueds 1
  • What is nss_scale ?

A kind request to clarify what nss_scale means. I want to generate my own data and train the model on it; how should I determine the nss_scale?

    opened by vanshilshah97 0
  • Perceptual loss is zero

While using ./train_clevr_567.sh to train the model on the CLEVR-567 scenes, the perceptual loss is zero. Is that the usual behaviour, or is it due to something else? Moreover, when I modify the shell script to train the uorf_gan model with train_with_gan.py, the reconstruction loss is non-zero but all other losses are zero for the first epoch. Is this fine?

    opened by vanshilshah97 0
  • Why are the normalised pixel co-ordinates subtracted from 2 ?

    https://github.com/KovenYu/uORF/blob/d5047198419a64932d47a7df29ea8662979e8b4b/models/model.py#L48

From the paper I gather that this is for getting information in 4 directions, but what are the 4 directions being talked about? Is it something like the front and back of the camera? Since we render foreground objects from the viewer's viewpoint, don't we only need information in front of the camera?
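
For context, Slot Attention-style soft position embeddings concatenate a coordinate grid with its complement, so the four channels encode distances to the four image borders rather than anything in front of or behind the camera. A minimal sketch with coordinates normalized to [0, 1] (the uORF code apparently uses a different normalization, hence the 2):

import torch

def build_position_grid(h, w):
    # four channels: (x, y, 1 - x, 1 - y), i.e. normalized distances
    # to the left, top, right, and bottom image borders
    ys = torch.linspace(0.0, 1.0, h).view(h, 1).expand(h, w)  # row coordinates
    xs = torch.linspace(0.0, 1.0, w).view(1, w).expand(h, w)  # column coordinates
    grid = torch.stack([xs, ys], dim=-1)           # (h, w, 2)
    return torch.cat([grid, 1.0 - grid], dim=-1)   # (h, w, 4)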

    opened by vanshilshah97 0
  • Question about the groundtruth segmentation mask

Hi Koven, I wonder whether ground-truth segmentation masks are provided in the dataset. Also, how can I generate segmentation labels when generating my own dataset?

    opened by Xuanmeng-Zhang 0
  • Replace '.cuda()' with '.to(self.device)' to enable training/ evaluation on cpu

Setting --gpu_ids -1 currently fails due to some .cuda() statements. Replacing these with .to(self.device) allows training/evaluation on a CPU. This can be useful for evaluation on machines without a GPU, or for downloading model weights on cluster login nodes without a GPU.
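
The pattern in a nutshell (self.device being the device resolved from --gpu_ids in the repo's BaseModel):

# before: hard-coded GPU placement breaks CPU-only runs (--gpu_ids -1)
x = x.cuda()
# after: follow the device resolved from --gpu_ids (cpu or cuda:N)
x = x.to(self.device)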

    opened by nepfaff 0
  • Fix 'depth_range' and 'z_cam' device mismatch error in 'construct_frus_coor'

    When running bash scripts/eval_nvs_seg_clevr.sh, I got the following device mismatch error:

    Traceback (most recent call last):
      File "test.py", line 32, in <module>
        model.test()           # run inference: forward + compute_visuals
      File "/home/.../uORF/models/base_model.py", line 104, in test
        self.forward()
      File "/home/.../uORF/models/uorf_eval_model.py", line 135, in forward
        frus_nss_coor, z_vals, ray_dir = self.projection.construct_sampling_coor(cam2world, partitioned=True)
      File "/home/.../uORF/models/projection.py", line 63, in construct_sampling_coor
        pixel_coor = self.construct_frus_coor()
      File "/home/.../uORF/models/projection.py", line 43, in construct_frus_coor
        z_cam = depth_range[z_frus].to(self.device)
    RuntimeError: indices should be either on cpu or on the same device as the indexed tensor (cpu)
    

    This PR fixes this by ensuring that depth_range is moved to the correct device.
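
The essence of the fix (a sketch; indexing a CPU tensor with CUDA indices is what raises the error, so the tensor is moved first):

# before: depth_range is on the CPU while z_frus is a CUDA tensor,
# so the advanced indexing itself fails before .to() is reached
z_cam = depth_range[z_frus].to(self.device)
# after: move depth_range to the target device, then index
z_cam = depth_range.to(self.device)[z_frus]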

    opened by nepfaff 0
  • Issue with Multiple GPUs

    Dear authors,

I saw that your config supports specifying multiple GPUs with "--gpu_ids", but the implementation doesn't seem to support it. Have you tested training with multiple GPUs? Thanks

    opened by gaobaoding 0
  • Any tips for real scene data?

Thanks for your nice work. Is it possible to apply this method to real scenes captured with my phone? E.g., I capture some photos, run COLMAP to get camera poses and other parameters, and then feed these into uORF. Would I get acceptable object radiance fields? Did you try this, or do you have any tips?

    opened by QinlongHuang 1