Unsupervised Discovery of Object Radiance Fields

by Hong-Xing Yu, Leonidas J. Guibas and Jiajun Wu from Stanford University.

[Teaser figure]

arXiv link: https://arxiv.org/abs/2107.07905

Project website: https://kovenyu.com/uorf

Environment

We recommend using Conda:

conda env create -f environment.yml
conda activate uorf-3090

or install the packages listed therein. Please make sure you have NVIDIA drivers supporting CUDA 11.0, or modify the version specifications in environment.yml.
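
As a quick sanity check (a minimal snippet, not part of the repo), you can verify that the installed PyTorch build can actually see your GPU:

import torch
print(torch.__version__, torch.version.cuda)  # expect a CUDA-11.0 build
print(torch.cuda.is_available())              # True if the driver matches the runtime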

Data and model

Please download datasets and models here.

Evaluation

We assume you have a GPU. If you have already downloaded and unzipped the datasets and models into the root directory, simply run

bash scripts/eval_nvs_seg_chair.sh

from the root directory. Replace the script filename with eval_nvs_seg_clevr.sh, eval_nvs_seg_diverse.sh, or eval_scene_manip.sh for the other evaluations. Results will be saved into ./results/. During evaluation, intermediate results are also sent to visdom in a nicer form, accessible at localhost:8077.
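
If no visdom server is already running, you can start one yourself (assuming the scripts use the port 8077 shown above):

python -m visdom.server -port 8077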

Training

We assume you have a GPU with at least 24GB of memory, e.g., a 3090 (evaluation does not require this much, as rendering can be done ray-wise, but some training losses are defined in image space). Then run

bash scripts/train_clevr_567.sh

or another training script. If you unzipped the datasets somewhere else, pass that location as the first parameter:

bash scripts/train_clevr_567.sh PATH_TO_DATASET

Training takes ~6 days on a 3090 for CLEVR-567 and Room-Chair, and ~9 days for Room-Diverse. It can take even longer on less powerful GPUs (e.g., ~10 days on a Titan RTX for CLEVR-567 and Room-Chair). During training, visualizations are sent to localhost:8077.

Bibtex

@article{yu2021unsupervised,
  author    = {Yu, Hong-Xing and Guibas, Leonidas J. and Wu, Jiajun},
  title     = {Unsupervised Discovery of Object Radiance Fields},
  journal   = {arXiv preprint arXiv:2107.07905},
  year      = {2021},
}

Acknowledgement

Our code framework is adapted from Jun-Yan Zhu's CycleGAN. Some code related to the adversarial loss is adapted from a PyTorch implementation of StyleGAN2. Some snippets are adapted from PyTorch implementations of Slot Attention and NeRF. If you find any problems, please don't hesitate to email me at [email protected] or open an issue.

Comments
  • Question about segmentation and manipulation

Hi, thanks for the effort; the work is interesting. I have two questions:

1. Is the right way to predict segmentations argmax(integrate(weight * density), slot_dim)? It seems this could go wrong when occlusion occurs. I show a rendered example below (the segmentation on top, the image on the bottom); I'm referring to the two cylinders on the right. A sketch of this readout follows the images.

[rendered segmentation mask and reconstructed image]
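
For reference, a minimal sketch of this kind of readout under standard NeRF alpha compositing (slot_density and z_vals are hypothetical tensor names; this is not necessarily the exact readout used in the repo):

import torch

def segment_from_slot_densities(slot_density, z_vals):
    # slot_density: (K, R, S) per-slot density at S samples along each of R rays
    # z_vals: (S,) sample depths shared by all rays
    total_density = slot_density.sum(dim=0)                # (R, S) composed scene density
    deltas = z_vals[1:] - z_vals[:-1]                      # (S-1,) inter-sample distances
    deltas = torch.cat([deltas, deltas[-1:]], dim=0)       # pad the last interval
    alpha = 1.0 - torch.exp(-total_density * deltas)       # (R, S) per-sample opacity
    trans = torch.cumprod(torch.cat([torch.ones_like(alpha[:, :1]),
                                     1.0 - alpha[:, :-1]], dim=1), dim=1)  # transmittance
    weights = alpha * trans                                # (R, S) rendering weights
    share = slot_density / total_density.clamp(min=1e-10)  # (K, R, S) slot share per sample
    slot_weight = (weights.unsqueeze(0) * share).sum(-1)   # (K, R) integrated slot weight
    return slot_weight.argmax(dim=0)                       # (R,) slot id per ray

Because the weights include transmittance, an occluded slot's contribution is attenuated rather than summed naively, which is exactly the case the question is probing.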

2. How can I manipulate CLEVR scenes? I get:

    FileNotFoundError: [Errno 2] No such file or directory: './clevr_567_test/00000_sc0000_az00_moved.png'

I'm asking because a background slot view contains object shadows, as shown below. What happens to these shadows during manipulation?

[background slot rendering showing object shadows]

    opened by KelestZ 2
  • Download Issues of Data and Pretrained Models.

Thanks for your great work. The download link provided in the README requires a Stanford SharePoint account to log in, which makes it inaccessible from outside. I tried both a personal account and my institution account, but neither worked.

Could you please fix this or upload the files to Google Drive? Thanks.

    opened by Hai-chao-Zhang 2
  • Setting of the Scene Design and Editing Experiments.

Hi, thanks for your great work. I'm trying to reproduce your scene editing baseline. How did you get the object-moving results in the paper? Is it done by editing slot features before sending them to the decoder, or by directly manipulating the NeRF volume before composition?
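
For illustration, a hedged sketch of the second option (decoder, pts, slot_feats, and offset are hypothetical names; the paper's actual editing procedure may differ):

import torch

def render_with_moved_object(decoder, pts, slot_feats, k, offset):
    # shift the query points of slot k by -offset so its decoded
    # object appears translated by +offset after composition
    sigmas, rgbs = [], []
    for i, feat in enumerate(slot_feats):
        query = pts - offset if i == k else pts  # move only object k
        sigma, rgb = decoder(query, feat)        # per-slot density and color
        sigmas.append(sigma)
        rgbs.append(rgb)
    return torch.stack(sigmas), torch.stack(rgbs)  # compose downstream as usual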

    opened by Hai-chao-Zhang 1
  • Information about dataloading

Hello, I am trying to apply uORF to my own dataset, which currently consists only of multi-view RGB images. I noticed that the dataloading code loads many additional files besides the images, and it is not clear from the paper which inputs are necessary and which are not.

Could you elaborate on the different files loaded in the dataloading code? For example, whether they are required for training, whether they are only for evaluation or debugging, and how useful they are for debugging. This would help me figure out whether I should generate the corresponding metadata for my custom dataset or skip it.

    https://github.com/KovenYu/uORF/blob/main/data/multiscenes_dataset.py

    opened by hueds 1
  • What is nss_scale ?

A kind request to clarify what nss_scale means. I want to generate my own data and train the model on it; how should I determine the nss_scale?

    opened by vanshilshah97 0
  • Perceptual loss is zero

While using ./train_clevr_567.sh to train the model on the CLEVR-567 scenes, the perceptual loss is zero. Is that the usual behaviour, or is it due to something else? Moreover, when I modify the shell script to train the uorf_gan model with train_with_gan.py, the reconstruction loss is non-zero but all other losses are zero for the first epoch. Is this fine?

    opened by vanshilshah97 0
  • Why are the normalised pixel co-ordinates subtracted from 2 ?

    https://github.com/KovenYu/uORF/blob/d5047198419a64932d47a7df29ea8662979e8b4b/models/model.py#L48

From the paper I gather that this is for getting information in 4 directions, but what are the 4 directions being talked about? Is it something like the front and back of the camera? Since we render foreground objects from the viewer's viewpoint, don't we only need information in front of the camera?
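
For context, Slot Attention-style soft position embeddings concatenate a coordinate grid with its complement, so the four channels encode distances to the four image borders rather than anything in front of or behind the camera. A minimal sketch with coordinates normalized to [0, 1] (the uORF code apparently uses a different normalization, hence the 2):

import torch

def build_position_grid(h, w):
    # four channels: (x, y, 1 - x, 1 - y), i.e. normalized distances
    # to the left, top, right, and bottom image borders
    ys = torch.linspace(0.0, 1.0, h).view(h, 1).expand(h, w)  # row coordinates
    xs = torch.linspace(0.0, 1.0, w).view(1, w).expand(h, w)  # column coordinates
    grid = torch.stack([xs, ys], dim=-1)           # (h, w, 2)
    return torch.cat([grid, 1.0 - grid], dim=-1)   # (h, w, 4)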

    opened by vanshilshah97 0
  • Question about the groundtruth segmentation mask

Hi Koven, I wonder whether ground-truth segmentation masks are provided in the dataset. Also, how can I generate segmentation labels when generating my own dataset?

    opened by Xuanmeng-Zhang 0
  • Replace '.cuda()' with '.to(self.device)' to enable training/ evaluation on cpu

Setting --gpu_ids -1 currently fails due to some .cuda() statements. Replacing these with .to(self.device) allows training/evaluation on a CPU. This can be useful for evaluation on machines without a GPU, or for downloading model weights on cluster login nodes without a GPU.
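
The pattern in a nutshell (self.device being the device resolved from --gpu_ids in the repo's BaseModel):

# before: hard-coded GPU placement breaks CPU-only runs (--gpu_ids -1)
x = x.cuda()
# after: follow the device resolved from --gpu_ids (cpu or cuda:N)
x = x.to(self.device)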

    opened by nepfaff 0
  • Fix 'depth_range' and 'z_cam' device mismatch error in 'construct_frus_coor'

    When running bash scripts/eval_nvs_seg_clevr.sh, I got the following device mismatch error:

    Traceback (most recent call last):
      File "test.py", line 32, in <module>
        model.test()           # run inference: forward + compute_visuals
      File "/home/.../uORF/models/base_model.py", line 104, in test
        self.forward()
      File "/home/.../uORF/models/uorf_eval_model.py", line 135, in forward
        frus_nss_coor, z_vals, ray_dir = self.projection.construct_sampling_coor(cam2world, partitioned=True)
      File "/home/.../uORF/models/projection.py", line 63, in construct_sampling_coor
        pixel_coor = self.construct_frus_coor()
      File "/home/.../uORF/models/projection.py", line 43, in construct_frus_coor
        z_cam = depth_range[z_frus].to(self.device)
    RuntimeError: indices should be either on cpu or on the same device as the indexed tensor (cpu)
    

    This PR fixes this by ensuring that depth_range is moved to the correct device.
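
The essence of the fix (a sketch; indexing a CPU tensor with CUDA indices is what raises the error, so the tensor is moved first):

# before: depth_range is on the CPU while z_frus is a CUDA tensor,
# so the advanced indexing itself fails before .to() is reached
z_cam = depth_range[z_frus].to(self.device)
# after: move depth_range to the target device, then index
z_cam = depth_range.to(self.device)[z_frus]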

    opened by nepfaff 0
  • Issue with Multiple GPUs

    Dear authors,

I saw that your config supports specifying multiple GPUs with "--gpu_ids", but the implementation doesn't seem to support it. Have you tested training with multiple GPUs? Thanks

    opened by gaobaoding 0
  • Any tips for real scene data?

Thanks for your nice work. Is it possible to apply this method to real scenes captured with my phone? E.g., I capture some photos, run COLMAP to get camera poses and other parameters, and then feed these into uORF. Would I get acceptable object radiance fields? Did you try this, or do you have any tips?

    opened by QinlongHuang 1