"SinNeRF: Training Neural Radiance Fields on Complex Scenes from a Single Image", Dejia Xu, Yifan Jiang, Peihao Wang, Zhiwen Fan, Humphrey Shi, Zhangyang Wang

Overview

[Paper] [Website]

Pipeline

Code

Environment

pip install -r requirements.txt

Dataset Preparation

Please download the datasets from these links:

Please download the depth maps from here: https://drive.google.com/drive/folders/13Lc79Ox0k9Ih2o0Y9e_g_ky41Nx40eJw?usp=sharing

Training

If you run into out-of-memory (OOM) errors, try:

  1. enabling 16-bit precision (precision=16)
  2. reducing the patch size with --patch_size (or --patch_size_x, --patch_size_y) and enlarging the stride with --sH, --sW
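
The trade-off between patch size and stride can be made concrete with a little arithmetic. The sketch below assumes a patch is a regular grid of rays spaced `stride` pixels apart, anchored at its top-left corner; the repository's dataset classes may sample differently. `patch_footprint` and `strided_patch` are hypothetical helpers, not functions from the codebase.

```python
def patch_footprint(patch_size, stride):
    """Pixels spanned along one axis by patch_size rays spaced stride apart."""
    return (patch_size - 1) * stride + 1


def strided_patch(img_w, img_h, patch_size, stride, left=0, top=0):
    """Return the (x, y) pixel coordinates of one strided square patch.

    Assumed layout: a regular grid starting at (left, top). This is an
    illustration, not the repository's sampling code.
    """
    span = patch_footprint(patch_size, stride)
    if left < 0 or top < 0 or left + span > img_w or top + span > img_h:
        raise ValueError("patch does not fit inside the image")
    return [(left + i * stride, top + j * stride)
            for j in range(patch_size) for i in range(patch_size)]


# Settings from the lego step-1 command: 400x400 image, patch 64, stride 6.
rays = strided_patch(400, 400, patch_size=64, stride=6)
print(len(rays), patch_footprint(64, 6))          # 4096 379

# Halving the patch size and doubling the stride: 4x fewer rays, similar span.
rays_small = strided_patch(400, 400, patch_size=32, stride=12)
print(len(rays_small), patch_footprint(32, 12))   # 1024 373
```

Under this assumption the number of rays per patch grows with the square of the patch size, while the image area a patch covers depends on the product of patch size and stride. That is why shrinking the patch and enlarging the stride together can cut memory without shrinking the covered region much.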
NeRF synthetic
  • Step 1

    python train.py  --dataset_name blender_ray_patch_1image_rot3d  --root_dir  ../../dataset/nerf_synthetic/lego   --N_importance 64 --img_wh 400 400 --num_epochs 3000 --batch_size 1  --optimizer adam --lr 2e-4  --lr_scheduler steplr --decay_step 1000 2000 --decay_gamma 0.5  --exp_name lego_s6 --with_ref --patch_size 64 --sW 6 --sH 6 --proj_weight 1 --depth_smooth_weight 0  --dis_weight 0 --num_gpus 4 --load_depth --depth_type nerf --model sinnerf --depth_weight 8 --vit_weight 10 --scan 4
    
  • Step 2

    python train.py  --dataset_name blender_ray_patch_1image_rot3d  --root_dir  ../../dataset/nerf_synthetic/lego   --N_importance 64 --img_wh 400 400 --num_epochs 3000 --batch_size 1  --optimizer adam --lr 1e-4  --lr_scheduler steplr --decay_step 1000 2000 --decay_gamma 0.5  --exp_name lego_s6_4ft --with_ref --patch_size 64 --sW 4 --sH 4 --proj_weight 1 --depth_smooth_weight 0.1  --dis_weight 0.1 --num_gpus 4 --load_depth --depth_type nerf --model sinnerf --depth_weight 8 --vit_weight 0 --pt_model xxx.ckpt --nerf_only  --scan 4
    
LLFF
  • Step 1

    python train.py  --dataset_name llff_ray_patch_1image_proj  --root_dir  ../../dataset/nerf_llff_data/room   --N_importance 64 --img_wh 504 378 --num_epochs 3000 --batch_size 1  --optimizer adam --lr 2e-4  --lr_scheduler steplr --decay_step 1000 2000 --decay_gamma 0.5  --exp_name llff_room_s4 --with_ref --patch_size_x 63 --patch_size_y 84 --sW 4 --sH 4 --proj_weight 1 --depth_smooth_weight 0  --dis_weight 0 --num_gpus 4 --load_depth --depth_type nerf --model sinnerf --depth_weight 8 --vit_weight 10
    
  • Step 2

    python train.py  --dataset_name llff_ray_patch_1image_proj  --root_dir  ../../dataset/nerf_llff_data/room   --N_importance 64 --img_wh 504 378 --num_epochs 3000 --batch_size 1  --optimizer adam --lr 1e-4  --lr_scheduler steplr --decay_step 1000 2000 --decay_gamma 0.5  --exp_name llff_room_s4_2ft --with_ref --patch_size_x 63 --patch_size_y 84 --sW 2 --sH 2 --proj_weight 1 --depth_smooth_weight 0.1  --dis_weight 0.1 --num_gpus 4 --load_depth --depth_type nerf --model sinnerf --depth_weight 8 --vit_weight 0 --pt_model xxx.ckpt --nerf_only
    
DTU
  • Step 1

    python train.py  --dataset_name dtu_proj  --root_dir  ../../dataset/mvs_training/dtu   --N_importance 64 --img_wh 640 512 --num_epochs 3000 --batch_size 1  --optimizer adam --lr 2e-4  --lr_scheduler steplr --decay_step 1000 2000 --decay_gamma 0.5  --exp_name dtu_scan4_s8 --with_ref --patch_size_y 70 --patch_size_x 56 --sW 8 --sH 8 --proj_weight 1 --depth_smooth_weight 0  --dis_weight 0 --num_gpus 4 --load_depth --depth_type nerf --model sinnerf --depth_weight 8 --vit_weight 10 --scan 4
    
  • Step 2

    python train.py  --dataset_name dtu_proj  --root_dir  ../../dataset/mvs_training/dtu   --N_importance 64 --img_wh 640 512 --num_epochs 3000 --batch_size 1  --optimizer adam --lr 1e-4  --lr_scheduler steplr --decay_step 1000 2000 --decay_gamma 0.5  --exp_name dtu_scan4_s8_4ft --with_ref --patch_size_y 70 --patch_size_x 56 --sW 4 --sH 4 --proj_weight 1 --depth_smooth_weight 0.1  --dis_weight 0.1 --num_gpus 4 --load_depth --depth_type nerf --model sinnerf --depth_weight 8 --vit_weight 0 --pt_model xxx.ckpt --nerf_only  --scan 4
    

Further fine-tuning with smaller strides benefits reconstruction quality.
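
All of the training commands in this section pass --lr_scheduler steplr with --decay_step 1000 2000 and --decay_gamma 0.5. Here is a small sketch of how that schedule plays out. It assumes steplr follows the semantics of PyTorch's MultiStepLR (as in the nerf_pl codebase this repository builds on) and that the scheduler steps once per epoch. `step_lr` is a hypothetical helper, not a function from the repository.

```python
def step_lr(base_lr, epoch, decay_steps, gamma):
    """Learning rate at `epoch` under a multi-step decay schedule.

    Assumption: the rate is multiplied by `gamma` once for every milestone
    in `decay_steps` that has been reached (epoch >= milestone).
    """
    reached = sum(1 for milestone in decay_steps if epoch >= milestone)
    return base_lr * gamma ** reached


# Step 1 uses --lr 2e-4; step 2 restarts from --lr 1e-4.
for epoch in (0, 999, 1000, 1999, 2000, 2999):
    print(epoch, step_lr(2e-4, epoch, [1000, 2000], 0.5))
```

With these flags the step-1 rate would be 2e-4 for the first 1000 epochs, 1e-4 for the next 1000, and 5e-5 for the final 1000 of the 3000 epochs.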

Testing

python eval.py  --dataset_name llff  --root_dir /dataset/nerf_llff_data/room --N_importance 64 --img_wh 504 378 --model nerf --ckpt_path ckpts/room.ckpt --timestamp test

Acknowledgement

The codebase is based on https://github.com/kwea123/nerf_pl. Thanks for sharing!

Citation

If you find this repo helpful, please cite:


@InProceedings{Xu_2022_SinNeRF,
author = {Xu, Dejia and Jiang, Yifan and Wang, Peihao and Fan, Zhiwen and Shi, Humphrey and Wang, Zhangyang},
title = {SinNeRF: Training Neural Radiance Fields on Complex Scenes from a Single Image},
journal={arXiv preprint arXiv:2204.00928},
year={2022}
}

Comments
  • There is a problem in image warping on LLFF dataset.

    Thank you for sharing this great work. I have a question. I have been trying to reproduce your results, but on the LLFF dataset I cannot obtain a reliable warped image using the depth maps you provided on this website. The warping error looks severe when comparing the warped and target images. I matched my implementation to your warping code, but it still does not work. I also tested image warping on the Blender dataset, and it works well there. Is there an additional step, such as depth scaling? If so, please let me know.

    opened by skyir0n 10
  • problems in load vit

    Hi, thanks for your exciting work. When I tried to train on the room scene, I ran into the following problem when loading the ViT model. Could you give me some suggestions?

    ~/NeRFs/SinNeRF$ python train.py --dataset_name llff_ray_patch_1image_proj --root_dir data/nerf_llff_data/room --N_importance 64 --img_wh 504 378 --num_epochs 3000 --batch_size 1 --optimizer adam --lr 2e-4 --lr_scheduler steplr --decay_step 1000 2000 --decay_gamma 0.5 --exp_name llff_room_s4 --with_ref --patch_size_x 63 --patch_size_y 84 --sW 4 --sH 4 --proj_weight 1 --depth_smooth_weight 0 --dis_weight 0 --num_gpus 4 --load_depth --depth_type nerf --model sinnerf --depth_weight 8 --vit_weight 10
    Namespace(N_importance=64, N_samples=64, angle=30, batch_size=1, chunk=32768, ckpt_path=None, dataset_name='llff_ray_patch_1image_proj', decay_gamma=0.5, decay_step=[1000, 2000], depth_anneal=False, depth_smooth_weight=0.0, depth_type='nerf', depth_weight=8.0, dis_weight=0.0, dloss='hinge', exp_name='llff_room_s4', img_wh=[504, 378], load_depth=True, loss_type='mse', lr=0.0002, lr_scheduler='steplr', model='sinnerf', momentum=0.9, nH=32, nW=32, nerf_only=False, noise_std=1.0, num_epochs=3000, num_gpus=4, optimizer='adam', patch_loss='mse', patch_size=-1, patch_size_x=63, patch_size_y=84, perturb=1.0, poly_exp=0.9, prefixes_to_ignore=['loss'], proj_weight=1.0, pt_model=None, repeat=1, root_dir='data/nerf_llff_data/room', sH=4, sW=4, scan=4, spheric_poses=False, use_disp=False, vit_weight=10.0, warmup_epochs=0, warmup_multiplier=1.0, weight_decay=0, with_ref=True)
    Using cache found in /home/zhangzhongwei18/.cache/torch/hub/facebookresearch_dino_main
    Traceback (most recent call last):
      File "train.py", line 19, in
        system = SinNeRF(hparams)
      File "/home/zhangzhongwei18/NeRFs/SinNeRF/models/sinnerf.py", line 148, in init
        self.ext = VitExtractor(
      File "/home/zhangzhongwei18/NeRFs/SinNeRF/models/extractor.py", line 22, in init
        self.model = torch.hub.load(
      File "/home/zhangzhongwei18/.custom/cuda-10.2-cudnn8-devel-ubuntu18.04-pytorch1.8.0_full_tensorboard/envs/sinnerf/lib/python3.8/site-packages/torch/hub.py", line 404, in load
        model = _load_local(repo_or_dir, model, *args, **kwargs)
      File "/home/zhangzhongwei18/.custom/cuda-10.2-cudnn8-devel-ubuntu18.04-pytorch1.8.0_full_tensorboard/envs/sinnerf/lib/python3.8/site-packages/torch/hub.py", line 430, in _load_local
        hub_module = _import_module(MODULE_HUBCONF, hubconf_path)
      File "/home/zhangzhongwei18/.custom/cuda-10.2-cudnn8-devel-ubuntu18.04-pytorch1.8.0_full_tensorboard/envs/sinnerf/lib/python3.8/site-packages/torch/hub.py", line 76, in import_module
        spec.loader.exec_module(module)
      File "", line 783, in exec_module
      File "", line 219, in call_with_frames_removed
      File "/home/zhangzhongwei18/.cache/torch/hub/facebookresearch_dino_main/hubconf.py", line 17, in
        import vision_transformer as vits
      File "/home/zhangzhongwei18/.cache/torch/hub/facebookresearch_dino_main/vision_transformer.py", line 24, in
        from utils import trunc_normal
    ImportError: cannot import name 'trunc_normal' from 'utils' (/home/zhangzhongwei18/NeRFs/SinNeRF/utils/init.py)

    opened by dlutzzw 9
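
The traceback in this issue ends with the bare name `utils` resolving to a file under the SinNeRF checkout, not to a module inside the cached DINO repository. Here is a minimal diagnostic sketch for checking which file a top-level module name would resolve to in the current environment. `module_origin` is a hypothetical helper, and this only inspects the import path; it is not a fix from the repository.

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module name would be loaded from, or None.

    For top-level names, find_spec only searches the import path and does
    not import the module itself.
    """
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# Run from the directory where train.py is launched to see which `utils`
# would be picked up by a plain `from utils import ...`.
print(module_origin("utils"))
```

If the printed path points into the project directory, a `from utils import ...` statement executed by third-party code loaded through torch.hub would see the project's package first.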
  • Some minor issues and loss=nan problem

    Hi there! Sorry for replying so late; I have been reading your code and running experiments recently.

    I ran into a few small problems:

    1. When running the test code, the following command raises an error:

    python eval.py  \
           --dataset_name blender_ray_patch_1image_rot3d  \
           --root_dir ./synthetic_SinNeRF/lego/  \
           --N_importance 64 --img_wh 400 400 --model nerf \
           --ckpt_path ./ckpts/lego_s6_4ft/last.ckpt \
           --timestamp test
    

    When the default value of the --split is test, an error will be reported here. note that frame https://github.com/VITA-Group/SinNeRF/blob/6f101f924fe9ba7793df5a9bbc52b2c82423e251/datasets/blender_ray_patch_1image_rot3d.py#L540


    2. Is the DTU file you uploaded missing a part, or is the code wrong? https://github.com/VITA-Group/SinNeRF/blob/6f101f924fe9ba7793df5a9bbc52b2c82423e251/datasets/dtu_proj.py#L433-L434


    3. Loss NaN problem. I am running the latest code on an RTX 3090 (24 GB), with the environment created from environment.yaml, but I still hit OOM, so I adjusted patch_size, precision, and --sH, --sW according to the README.

    I set precision=16 and kept --sH, --sW unchanged (=6).

    I found that the loss=nan problem appears when the patch_size is too small, such as patch_size=8 or patch_size=16, or even patch_size=32.

    It works (without loss=nan) when patch_size=50, but that's not a good number, is it?

    I would be grateful if you could provide advice on how to deal with this. Thank you !

    opened by happysxpp 8
  • Confusing results after step-2 training.

    Hi! Thanks for your great work and implementation!

    I'm currently trying your code on nerf_synthetic (lego) and dtu (scan4), but the results after the two-stage training are confusing.

    Specifically, the evaluation psnr for lego is 20.6 by the end of step 1, while it drops to 14.9 after step 2 training. The same thing happens to dtu_scan_4, where the evaluation psnr is around 15.0 after step 1 and drops to 11.8 after step 2. The visualization results for lego are as below.

    I'd appreciate it if you could provide some idea on this phenomenon and how to fix this problem. Thank you!

    [Images: lego after step 1 (002); lego after step 2 (010)]

    opened by wutong16 7
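
To put the PSNR figures reported in this issue in perspective, the sketch below converts between PSNR and mean squared error. It assumes the standard definition with pixel values in [0, 1]; the source does not confirm exactly how the evaluation script computes PSNR. `psnr_from_mse` and `mse_from_psnr` are hypothetical helpers.

```python
import math


def psnr_from_mse(mse, max_val=1.0):
    """Standard PSNR in dB for a given mean squared error."""
    return 10.0 * math.log10(max_val ** 2 / mse)


def mse_from_psnr(psnr, max_val=1.0):
    """Inverse of psnr_from_mse."""
    return max_val ** 2 / (10.0 ** (psnr / 10.0))


# lego: 20.6 dB after step 1 versus 14.9 dB after step 2.
ratio = mse_from_psnr(14.9) / mse_from_psnr(20.6)
print(f"MSE grows by a factor of about {ratio:.1f}")
```

Under that assumption, a drop of 5.7 dB corresponds to the mean squared error growing by a factor of roughly 3.7.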
  • nan

    I'm sorry to bother you. I'm having some problems running your code on my own machine and I hope you can help me out. (I'm using 4 A5000 cards for training; all other parameters are default.)

    1. When training step 1 on the NeRF synthetic dataset, all the losses become 'nan' after more than 1500 steps. Is this normal?
    2. Once that happens, does it mean the model has finished training? Should I stop the training?
    3. There are several different pth files in the 'ckpts/lego_s6' folder. Which one should I use as the training weights for the second step?
    4. You mention supplementary material in your paper, but I couldn't find the link. Can you provide it?

    I am looking forward to your reply. Thank you very much.

    opened by xiaoyudanaa 6
  • Forward warping

    Dear author, thank you for your great work. I am trying to incorporate it into my pipeline. I noticed that you use forward warping to obtain the new RGB values and the new depth. As far as I understand, forward warping creates holes and loses accuracy. Please correct me if I am wrong. If you do use forward warping as I described, how do you handle the problems that come with it? Thank you so much!

    opened by ChaoyiZh 4
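
The hole problem raised in this issue can be reproduced with a few lines. The sketch below is a generic forward warp (nearest-pixel splatting with a z-buffer) written for illustration. It is not the repository's warping code, and `forward_warp` and its parameters are hypothetical. It assumes both views share the same pinhole intrinsics.

```python
def forward_warp(image, depth, fx, fy, cx, cy, R, t):
    """Splat each source pixel into the target view; return (warped, holes).

    image, depth: H x W nested lists. R (3x3) and t (length 3) map points
    from the source camera frame to the target camera frame.
    """
    h, w = len(image), len(image[0])
    warped = [[None] * w for _ in range(h)]
    zbuf = [[float("inf")] * w for _ in range(h)]
    for v in range(h):
        for u in range(w):
            z = depth[v][u]
            # Unproject the pixel to a 3D point in the source camera frame.
            p = ((u - cx) / fx * z, (v - cy) / fy * z, z)
            # Move the point into the target camera frame.
            q = [sum(R[i][k] * p[k] for k in range(3)) + t[i] for i in range(3)]
            if q[2] <= 0:
                continue  # behind the target camera
            # Project and round to the nearest target pixel.
            u2 = int(round(fx * q[0] / q[2] + cx))
            v2 = int(round(fy * q[1] / q[2] + cy))
            # Keep only the nearest surface when several pixels collide.
            if 0 <= u2 < w and 0 <= v2 < h and q[2] < zbuf[v2][u2]:
                zbuf[v2][u2] = q[2]
                warped[v2][u2] = image[v][u]
    holes = [(v, u) for v in range(h) for u in range(w) if warped[v][u] is None]
    return warped, holes


identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
image = [[10, 11, 12, 13, 14, 15, 16, 17]]
depth = [[2, 2, 2, 1, 1, 2, 2, 2]]  # a near object in front of a far background
warped, holes = forward_warp(image, depth, 10, 10, 0, 0, identity, (0.2, 0, 0))
print(warped)  # [[None, 10, 11, 12, None, 13, 14, 16]]
print(holes)   # [(0, 0), (0, 4)]
```

In this toy example the near pixels shift by two columns and the far ones by one, so one target pixel is left empty at the disocclusion boundary and another at the image border. Two source pixels also land on the same target pixel, where the z-buffer keeps the nearer one.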
  • Testing poses for the result videos

    Thanks again for your great work. Did you provide testing poses for the result videos somewhere in your code? If so, please let me know where I can find it. Thanks for your answer in advance.

    opened by ayclove 3
  • How do you get the depth maps of RGB images like LLFF

    Hi, thanks for the great work!

    I want to know how you got the depth maps in your datasets. As far as I know, some original datasets like LLFF do not provide depth information.

    Thanks!

    opened by tmpss93172 2
  • Where is the decay strategy of stride in the code ?

    Hi, thanks for releasing the code for this awesome work. I am curious about the decay strategy of the stride in Progressive Strided Ray Sampling, but I cannot find where it is implemented in the code. The stride seems to be controlled by sW, sH in the Dataset class, but sW, sH are not modified during training. The same applies to vit_weight and dis_weight, which are the weights of the global structure prior loss and the local texture guidance loss.

    The fragment in the paper about the decay strategy of the stride: [image]

    The fragment in the paper about the annealing of loss weights: [image]

    opened by 1612190130 2
  • Regarding results in the table 2

    Thank you so much for sharing your great work. Regarding the results in table 2, I have the below questions.

    Q1. Which scenes did you use for the evaluation and what is the reference camera ID? According to the files and codes, I am assuming that you evaluated on 19 scenes, such as below, [1, 3, 4, 5, 6, 8, 9, 14, 15, 30, 34, 40, 55, 60, 63, 82, 84, 103, 105] Please correct me if I am wrong.

    Q2. Did you use any object mask for the evaluation as RegNeRF did? If so, would you please let me know how you created the mask?

    Thank you very much for your kind reply in advance.

    opened by ayclove 1
  • Question about angle variable

    I have a question about the Blender dataset. The dataset definition sets angle=30 by default. What does this angle indicate?

    class Blender_ray_patch_1image_rot3d_camera_Dataset(Dataset):
        def __init__(self, root_dir, split='train', img_wh=(400, 400), patch_size=-1, factor=1, test_crop=False, with_ref=False, repeat=1, load_depth=False, depth_type='nerf', sH=1, sW=1, angle=30, **kwargs):
    
    opened by HannahHaensen 1
  • Pretrained models

    Thanks for sharing great work. You have provided one pretrained model, 'room.ckpt'. Do you have any plan to share other models as well? If possible please share other ckpts trained on synthetic 360 and DTU. Thank you so much in advance.

    opened by ayclove 0
Owner
VITA
Visual Informatics Group @ University of Texas at Austin
VITA
[Preprint] "Bag of Tricks for Training Deeper Graph Neural Networks: A Comprehensive Benchmark Study" by Tianlong Chen*, Kaixiong Zhou*, Keyu Duan, Wenqing Zheng, Peihao Wang, Xia Hu, Zhangyang Wang

Bag of Tricks for Training Deeper Graph Neural Networks: A Comprehensive Benchmark Study Codes for [Preprint] Bag of Tricks for Training Deeper Graph

VITA 99 Nov 29, 2022
[ICML 2021] "Self-Damaging Contrastive Learning", Ziyu Jiang, Tianlong Chen, Bobak Mortazavi, Zhangyang Wang

Self-Damaging Contrastive Learning Introduction The recent breakthrough achieved by contrastive learning accelerates the pace for deploying unsupervis

VITA 50 Sep 8, 2022
[ICLR 2021] "Neural Architecture Search on ImageNet in Four GPU Hours: A Theoretically Inspired Perspective" by Wuyang Chen, Xinyu Gong, Zhangyang Wang

Neural Architecture Search on ImageNet in Four GPU Hours: A Theoretically Inspired Perspective [PDF] Wuyang Chen, Xinyu Gong, Zhangyang Wang In ICLR 2

VITA 156 Nov 28, 2022
[ICLR 2021 Spotlight Oral] "Undistillable: Making A Nasty Teacher That CANNOT teach students", Haoyu Ma, Tianlong Chen, Ting-Kuei Hu, Chenyu You, Xiaohui Xie, Zhangyang Wang

Undistillable: Making A Nasty Teacher That CANNOT teach students "Undistillable: Making A Nasty Teacher That CANNOT teach students" Haoyu Ma, Tianlong

VITA 69 Nov 11, 2022
[CVPRW 21] "BNN - BN = ? Training Binary Neural Networks without Batch Normalization", Tianlong Chen, Zhenyu Zhang, Xu Ouyang, Zechun Liu, Zhiqiang Shen, Zhangyang Wang

BNN - BN = ? Training Binary Neural Networks without Batch Normalization Codes for this paper BNN - BN = ? Training Binary Neural Networks without Bat

VITA 39 Dec 7, 2022
[CVPR 2021] "The Lottery Tickets Hypothesis for Supervised and Self-supervised Pre-training in Computer Vision Models" Tianlong Chen, Jonathan Frankle, Shiyu Chang, Sijia Liu, Yang Zhang, Michael Carbin, Zhangyang Wang

The Lottery Tickets Hypothesis for Supervised and Self-supervised Pre-training in Computer Vision Models Codes for this paper The Lottery Tickets Hypo

VITA 57 Dec 1, 2022
[ICML 2021] "Graph Contrastive Learning Automated" by Yuning You, Tianlong Chen, Yang Shen, Zhangyang Wang

Graph Contrastive Learning Automated PyTorch implementation for Graph Contrastive Learning Automated [talk] [poster] [appendix] Yuning You, Tianlong C

Shen Lab at Texas A&M University 80 Nov 23, 2022
[Preprint] "Chasing Sparsity in Vision Transformers: An End-to-End Exploration" by Tianlong Chen, Yu Cheng, Zhe Gan, Lu Yuan, Lei Zhang, Zhangyang Wang

Chasing Sparsity in Vision Transformers: An End-to-End Exploration Codes for [Preprint] Chasing Sparsity in Vision Transformers: An End-to-End Explora

VITA 63 Dec 2, 2022
[CVPR 2022] "The Principle of Diversity: Training Stronger Vision Transformers Calls for Reducing All Levels of Redundancy" by Tianlong Chen, Zhenyu Zhang, Yu Cheng, Ahmed Awadallah, Zhangyang Wang

The Principle of Diversity: Training Stronger Vision Transformers Calls for Reducing All Levels of Redundancy Codes for this paper: [CVPR 2022] The Pr

VITA 16 Nov 26, 2022
PyTorch implementation of Super SloMo by Jiang et al.

Super-SloMo PyTorch implementation of "Super SloMo: High Quality Estimation of Multiple Intermediate Frames for Video Interpolation" by Jiang H., Sun

Avinash Paliwal 2.9k Dec 3, 2022
This is the official PyTorch implementation of the paper "TransFG: A Transformer Architecture for Fine-grained Recognition" (Ju He, Jie-Neng Chen, Shuai Liu, Adam Kortylewski, Cheng Yang, Yutong Bai, Changhu Wang, Alan Yuille).

TransFG: A Transformer Architecture for Fine-grained Recognition Official PyTorch code for the paper: TransFG: A Transformer Architecture for Fine-gra

Ju He 300 Nov 30, 2022
Code for the ICML 2021 paper "Bridging Multi-Task Learning and Meta-Learning: Towards Efficient Training and Effective Adaptation", Haoxiang Wang, Han Zhao, Bo Li.

Bridging Multi-Task Learning and Meta-Learning Code for the ICML 2021 paper "Bridging Multi-Task Learning and Meta-Learning: Towards Efficient Trainin

AI Secure 57 Dec 2, 2022
[ICCV'2021] "SSH: A Self-Supervised Framework for Image Harmonization", Yifan Jiang, He Zhang, Jianming Zhang, Yilin Wang, Zhe Lin, Kalyan Sunkavalli, Simon Chen, Sohrab Amirghodsi, Sarah Kong, Zhangyang Wang

SSH: A Self-Supervised Framework for Image Harmonization (ICCV 2021) code for SSH Representative Examples Main Pipeline RealHM DataSet Google Drive Pr

VITA 86 Dec 2, 2022