Putting NeRF on a Diet: Semantically Consistent Few-Shot View Synthesis

Overview

Website | ICCV paper | arXiv | Twitter

Figure: Diagram overviewing DietNeRF's training procedure.

This repository contains the official implementation of DietNeRF, a system that reconstructs 3D scenes from a few posed photos.

Setup

We use the following folder structure:

dietnerf/
  logs/ (images, videos, checkpoints)
  data/
    nerf_synthetic/
  configs/ (run configuration files)
CLIP/ (fork of OpenAI's CLIP repository with a wrapper)

Create a conda environment:

conda create -n dietnerf python=3.9
conda activate dietnerf

Set up requirements and our fork of CLIP:

pip install -r requirements.txt
cd CLIP
pip install -e .
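
After installation, it can help to confirm that the environment actually resolves the key packages. The following is a minimal sketch; the module names torch, clip, and wandb are assumptions, so check requirements.txt and the CLIP fork for the names that apply.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed module names; adjust to match requirements.txt and the CLIP fork.
    required = ["torch", "clip", "wandb"]
    missing = missing_modules(required)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules found.")
```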

Log in to Weights & Biases:

wandb login

Experiments on the Realistic Synthetic dataset

Realistic Synthetic experiments are implemented in the ./dietnerf subdirectory.

Download the Realistic Synthetic dataset from NeRF's Google Drive folder. This is the dataset used in the original NeRF paper by Mildenhall et al. For example, from the repository root:

mkdir dietnerf/logs/ dietnerf/data/
cd dietnerf/data
pip install gdown
gdown --id 18JxhpWD-4ZmuFKLzKlAw-w5PpzZxXOcG -O nerf_synthetic.zip
unzip nerf_synthetic.zip
rm -r __MACOSX
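
To check that the archive unpacked as expected, the PNG files can be counted per scene and split. This is a sketch only: it assumes the scene/split/*.png layout implied by the glob pattern used elsewhere in this README, and the function name is hypothetical. Run it from dietnerf/data.

```python
from pathlib import Path


def summarize_dataset(root):
    """Count PNG files per scene and split under root/<scene>/<split>/*.png."""
    summary = {}
    for png in sorted(Path(root).glob("*/*/*.png")):
        scene, split = png.parts[-3], png.parts[-2]
        summary.setdefault(scene, {}).setdefault(split, 0)
        summary[scene][split] += 1
    return summary


if __name__ == "__main__":
    # Prints one line per scene, e.g. the number of images in each split.
    for scene, splits in summarize_dataset("nerf_synthetic").items():
        print(scene, splits)
```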

Then, from the repository root, shrink the images to 400x400:

python dietnerf/scripts/bulk_shrink_images.py "dietnerf/data/nerf_synthetic/*/*/*.png" dietnerf/data/nerf_synthetic_400_rgb/ True

These images are used for FID/KID computation. The training and evaluation code in dietnerf/run_nerf.py shrinks images itself when run with the --half_res argument.
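
For context, here is a sketch of what half-resolution loading typically implies for the camera intrinsics. In the PyTorch NeRF implementation that this code is based on, the focal length is derived from the dataset's camera_angle_x field and halved together with the image size. Whether run_nerf.py here does exactly the same is an assumption, and the function name is hypothetical.

```python
import math


def intrinsics(width, height, camera_angle_x, half_res=False):
    """Return (H, W, focal) for a pinhole camera with horizontal FOV in radians."""
    focal = 0.5 * width / math.tan(0.5 * camera_angle_x)
    if half_res:
        # Halving the resolution also halves the focal length measured in pixels.
        height, width, focal = height // 2, width // 2, focal / 2.0
    return height, width, focal


if __name__ == "__main__":
    # Illustrative field of view, not a value read from the dataset.
    angle = math.radians(40.0)
    print(intrinsics(800, 800, angle))
    print(intrinsics(800, 800, angle, half_res=True))
```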

Each experiment has a config file stored in dietnerf/configs/. Scripts in dietnerf/scripts/ can be run to train and evaluate models. Run these scripts from ./dietnerf. The scripts assume you are running one script at a time on a server with 8 NVIDIA GPUs.

cd dietnerf
export WANDB_ENTITY=<your W&B username or team>

# NeRF baselines
sh scripts/run_synthetic_nerf_100v.sh
sh scripts/run_synthetic_nerf_8v.sh
sh scripts/run_synthetic_simplified_nerf_8v.sh

# DietNeRF with 8 observed views
sh scripts/run_synthetic_dietnerf_8v.sh
sh scripts/run_synthetic_dietnerf_ft_8v.sh

# NeRF and DietNeRF with partial observability
sh scripts/run_synthetic_unseen_side_14v.sh
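
Because the scripts assume one run at a time, a small driver can check WANDB_ENTITY and launch them sequentially. This is a sketch under stated assumptions: the helper name, the dry-run flag, and the selection of scripts are illustrative, and it should be run from ./dietnerf.

```python
import os
import subprocess

# Illustrative subset of the scripts listed above.
SCRIPTS = [
    "scripts/run_synthetic_nerf_8v.sh",
    "scripts/run_synthetic_dietnerf_8v.sh",
    "scripts/run_synthetic_dietnerf_ft_8v.sh",
]


def run_sequentially(scripts, dry_run=True, env=None):
    """Run each script with sh, one after another; return the commands issued."""
    env = dict(os.environ if env is None else env)
    if not env.get("WANDB_ENTITY"):
        raise RuntimeError("WANDB_ENTITY is not set")
    commands = [["sh", script] for script in scripts]
    for command in commands:
        if dry_run:
            print("would run:", " ".join(command))
        else:
            # check=True stops the sequence at the first failing script.
            subprocess.run(command, check=True, env=env)
    return commands


if __name__ == "__main__":
    # "example-entity" is a placeholder, not a real W&B entity.
    run_sequentially(SCRIPTS, dry_run=True, env={"WANDB_ENTITY": "example-entity"})
```

Set dry_run=False to actually launch the scripts once the printed commands look right.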


Experiments on the DTU dataset

Coming soon. Our paper also fine-tunes pixelNeRF on DTU scenes for 1-shot view synthesis.

Citation and acknowledgements

If DietNeRF is relevant to your project, please cite our associated paper:

@InProceedings{Jain_2021_ICCV,
    author    = {Jain, Ajay and Tancik, Matthew and Abbeel, Pieter},
    title     = {Putting NeRF on a Diet: Semantically Consistent Few-Shot View Synthesis},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {5885-5894}
}

This code is based on Yen-Chen Lin's PyTorch implementation of NeRF and the official pixelNeRF code.

Comments
  • The Hotdog result is terrible?

    Hi, your work is brilliant and I'm very interested in it. I followed your description and ran the code on the nerf_synthetic hotdog data, but the result I got was terrible: the PSNR in your paper is 25.250, but I only got 8.37. While training the model, I also found that the validation images were always split (see attached image). Is there any trick you didn't mention for training the model? Thanks!
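
For a sense of the scale of that gap: PSNR for images with values in [0, 1] is commonly computed as -10 * log10(MSE). That definition is assumed here and has not been verified against this repository's evaluation code. A short sketch converting between the two:

```python
import math


def mse_to_psnr(mse):
    """PSNR in dB for pixel values in [0, 1]."""
    return -10.0 * math.log10(mse)


def psnr_to_mse(psnr):
    """Inverse of mse_to_psnr."""
    return 10.0 ** (-psnr / 10.0)


if __name__ == "__main__":
    for psnr in (25.250, 8.37):
        print(f"PSNR {psnr:.2f} dB -> MSE {psnr_to_mse(psnr):.4f}")
```

Under that definition, 8.37 dB corresponds to a mean squared error near 0.146, versus about 0.003 at 25.25 dB.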

    opened by YangHaibo01 1
  • Consistency Loss - Question on the implementation

    Assuming

              args.N_importance == 0
              args.consistency_model_type.startswith('clip_vit') == True

    is it right to say that the consistency loss compares the first pixel row of the first image of the training batch with the first pixel row of a random image in the target batch?

        with torch.no_grad():
            targets_resize_model = F.interpolate(targets, (args.consistency_size, args.consistency_size), ....)
            target_embeddings = embed(targets_resize_model)  # [N_images , Width , Height]
        target_emb = target_embeddings[:, 0]  # from all images take the first rows
        target_i = np.random.randint(target_emb.shape[0])
        target_emb = target_emb[target_i]  # sample a random image

        rgbs_resize_c = F.interpolate(rgbs, size=(args.consistency_size, args.consistency_size), mode=args.pixel_interp_mode)
        rendered_embeddings = embed(rgbs_resize_c)  # [N_images , Width , Height]
        rendered_embedding = rendered_embeddings[0]  # get the first image
        rendered_emb = rendered_embedding[0]  # get the first row of pixels

        consistency_loss = -torch.cosine_similarity(target_emb, rendered_emb, dim=-1)
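
The final line of the snippet is a negative cosine similarity between two embedding vectors. Below is a framework-free sketch of that quantity, using made-up vectors rather than real CLIP embeddings. It does not settle what the indexing above selects.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def consistency_loss(target_emb, rendered_emb):
    """Negative cosine similarity: lowest (-1) when the embeddings are aligned."""
    return -cosine_similarity(target_emb, rendered_emb)


if __name__ == "__main__":
    target = [1.0, 0.0, 2.0]    # hypothetical target embedding
    rendered = [2.0, 0.0, 4.0]  # same direction, different magnitude
    print(consistency_loss(target, rendered))
```

Because cosine similarity ignores magnitude, only the direction of each embedding affects this loss.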
    
    
    opened by pbonazzi 0
  • Strange results on the Lego dataset

    Hi, thanks for sharing such great work! I followed the instructions to install the dependencies and place the datasets. Since I only wanted to see the results for Lego in the nerf_synthetic dataset, I modified the scripts so that only the lego line remains. Everything runs fine, but the results after the 8-view training and fine-tuning are strange, as in the attached images (000, 017, 023).

    Could you tell me where the problem is? Thank you very much!

    opened by honghd16 0
  • The meaning of "Diet"NeRF

    Hi! Thanks for your excellent work! I am wondering what "putting NeRF on a diet" means. Does it mean fewer input views, or something else?

    opened by ghy0324 0
  • Release checkpoints

    Hi, your work is excellent and I'm really interested in it :) Would you mind sharing the pretrained model so that I can compare to it directly?

    Thanks.

    opened by sj-li 1