Dynamic View Synthesis from Dynamic Monocular Video

Overview

Project Website | Video | Paper

Dynamic View Synthesis from Dynamic Monocular Video
Chen Gao, Ayush Saraf, Johannes Kopf, Jia-Bin Huang
in ICCV 2021

Setup

The code is tested with

  • Linux (tested on CentOS Linux release 7.4.1708)
  • Anaconda 3
  • Python 3.7.11
  • CUDA 10.1
  • 1 V100 GPU

To get started, please create the conda environment dnerf by running

conda create --name dnerf
conda activate dnerf
conda install pytorch=1.6.0 torchvision=0.7.0 cudatoolkit=10.1 matplotlib tensorboard scipy opencv -c pytorch
pip install imageio configargparse timm lpips

and install COLMAP manually. Then download the MiDaS and RAFT weights:

ROOT_PATH=/path/to/the/DynamicNeRF/folder
cd $ROOT_PATH
wget --no-check-certificate https://filebox.ece.vt.edu/~chengao/free-view-video/weights.zip
unzip weights.zip
rm weights.zip
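
Optionally, here is a quick sanity check (not part of the repo) that PyTorch can see a CUDA device and that the downloaded weights landed where the preprocessing scripts further below expect them; it assumes weights.zip extracted into weights/.

import os
import torch

# Confirm the conda environment's PyTorch build can see a CUDA device.
print('PyTorch', torch.__version__, '| CUDA available:', torch.cuda.is_available())

# These filenames match the paths used by the preprocessing scripts below;
# they are assumed to live in the extracted weights/ folder.
for path in ['weights/midas_v21-f6b98070.pt', 'weights/raft-things.pth']:
    print(path, '->', 'found' if os.path.exists(path) else 'MISSING')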

Dynamic Scene Dataset

The Dynamic Scene Dataset is used to quantitatively evaluate our method. Please download the pre-processed data by running:

cd $ROOT_PATH
wget --no-check-certificate https://filebox.ece.vt.edu/~chengao/free-view-video/data.zip
unzip data.zip
rm data.zip

Training

You can train a model from scratch by running:

cd $ROOT_PATH/
python run_nerf.py --config configs/config_Balloon2.txt

Every 100k iterations, you should get rendered videos saved in the following locations.

The novel view-time synthesis results will be saved in $ROOT_PATH/logs/Balloon2_H270_DyNeRF/novelviewtime.

The reconstruction results will be saved in $ROOT_PATH/logs/Balloon2_H270_DyNeRF/testset.

The fix-view-change-time results will be saved in $ROOT_PATH/logs/Balloon2_H270_DyNeRF/testset_view000.

The fix-time-change-view results will be saved in $ROOT_PATH/logs/Balloon2_H270_DyNeRF/testset_time000.
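
If you want to stitch the saved frames into a video yourself (for example, if a periodic video dump was skipped), the minimal sketch below uses imageio. The folder path mirrors the example above, and the frame extensions are an assumption.

import os
import imageio

# Folder of saved frames from the example run above; adjust to the output you
# want to stitch (the frame naming/extension is an assumption).
frame_dir = 'logs/Balloon2_H270_DyNeRF/novelviewtime'

frames = [imageio.imread(os.path.join(frame_dir, name))
          for name in sorted(os.listdir(frame_dir))
          if name.endswith(('.png', '.jpg'))]

# Writing mp4 requires the imageio-ffmpeg backend (pip install imageio-ffmpeg).
imageio.mimwrite('novelviewtime.mp4', frames, fps=30, quality=8)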

Rendering from pre-trained models

We also provide pre-trained models. You can download them by running:

cd $ROOT_PATH/
wget --no-check-certificate https://filebox.ece.vt.edu/~chengao/free-view-video/logs.zip
unzip logs.zip
rm logs.zip

Then you can render the results directly by running:

python run_nerf.py --config configs/config_Balloon2.txt --render_only --ft_path $ROOT_PATH/logs/Balloon2_H270_DyNeRF_pretrain/300000.tar

Evaluating our method and others

Our goal is to make the evaluation as simple as possible for you. We have collected the fix-view-change-time results of the following methods:

  • NeRF
  • NeRF + t
  • Yoon et al.
  • Non-Rigid NeRF
  • NSFF
  • DynamicNeRF (ours)

Please download the results by running:

cd $ROOT_PATH/
wget --no-check-certificate https://filebox.ece.vt.edu/~chengao/free-view-video/results.zip
unzip results.zip
rm results.zip

Then you can calculate the PSNR/SSIM/LPIPS by running:

cd $ROOT_PATH/utils
python evaluation.py

PSNR / LPIPS | Jumping       | Skating       | Truck         | Umbrella      | Balloon1      | Balloon2      | Playground    | Average
NeRF         | 20.99 / 0.305 | 23.67 / 0.311 | 22.73 / 0.229 | 21.29 / 0.440 | 19.82 / 0.205 | 24.37 / 0.098 | 21.07 / 0.165 | 21.99 / 0.250
NeRF + t     | 18.04 / 0.455 | 20.32 / 0.512 | 18.33 / 0.382 | 17.69 / 0.728 | 18.54 / 0.275 | 20.69 / 0.216 | 14.68 / 0.421 | 18.33 / 0.427
NR NeRF      | 20.09 / 0.287 | 23.95 / 0.227 | 19.33 / 0.446 | 19.63 / 0.421 | 17.39 / 0.348 | 22.41 / 0.213 | 15.06 / 0.317 | 19.69 / 0.323
NSFF         | 24.65 / 0.151 | 29.29 / 0.129 | 25.96 / 0.167 | 22.97 / 0.295 | 21.96 / 0.215 | 24.27 / 0.222 | 21.22 / 0.212 | 24.33 / 0.199
Ours         | 24.68 / 0.090 | 32.66 / 0.035 | 28.56 / 0.082 | 23.26 / 0.137 | 22.36 / 0.104 | 27.06 / 0.049 | 24.15 / 0.080 | 26.10 / 0.082

Please note:

  1. The numbers reported in the paper were calculated using the TensorFlow code. The numbers here are calculated using this improved PyTorch version.
  2. In Yoon's results, the first and last frames are missing. To compare with Yoon's results, we have to omit the first and last frames; to do so, please uncomment line 72 and comment line 73 in evaluation.py.
  3. We obtained the results of NSFF and NR NeRF using the official implementations with default parameters.
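
For reference, here is a minimal, illustrative sketch of how PSNR, SSIM, and LPIPS can be computed for one pair of frames. It is not the repo's utils/evaluation.py; the file names are placeholders, and it additionally needs scikit-image on top of the packages installed above.

import numpy as np
import torch
import imageio
import lpips
from skimage.metrics import structural_similarity

def psnr(img0, img1):
    # Images as float arrays in [0, 1].
    return -10.0 * np.log10(np.mean((img0 - img1) ** 2))

def to_lpips_tensor(img):
    # HWC in [0, 1] -> NCHW in [-1, 1], the range the lpips package expects.
    return torch.from_numpy(img).permute(2, 0, 1).unsqueeze(0).float() * 2.0 - 1.0

# Hypothetical file names for one rendered frame and its ground truth.
pred = imageio.imread('pred_000.png').astype(np.float32) / 255.0
gt = imageio.imread('gt_000.png').astype(np.float32) / 255.0

loss_fn = lpips.LPIPS(net='alex')
print('PSNR :', psnr(pred, gt))
# multichannel=True matches scikit-image versions of this repo's era; newer
# releases use channel_axis=-1 instead.
print('SSIM :', structural_similarity(pred, gt, multichannel=True, data_range=1.0))
with torch.no_grad():
    print('LPIPS:', loss_fn(to_lpips_tensor(pred), to_lpips_tensor(gt)).item())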

Train a model on your sequence

  1. Set some paths.
ROOT_PATH=/path/to/the/DynamicNeRF/folder
DATASET_NAME=name_of_the_video_without_extension
DATASET_PATH=$ROOT_PATH/data/$DATASET_NAME
  2. Prepare training images and background masks from a video.
cd $ROOT_PATH/utils
python generate_data.py --videopath /path/to/the/video
  3. Use COLMAP to obtain camera poses.
colmap feature_extractor \
--database_path $DATASET_PATH/database.db \
--image_path $DATASET_PATH/images_colmap \
--ImageReader.mask_path $DATASET_PATH/background_mask \
--ImageReader.single_camera 1

colmap exhaustive_matcher \
--database_path $DATASET_PATH/database.db

mkdir $DATASET_PATH/sparse
colmap mapper \
    --database_path $DATASET_PATH/database.db \
    --image_path $DATASET_PATH/images_colmap \
    --output_path $DATASET_PATH/sparse \
    --Mapper.num_threads 16 \
    --Mapper.init_min_tri_angle 4 \
    --Mapper.multiple_models 0 \
    --Mapper.extract_colors 0
  4. Save camera poses into the format that NeRF reads.
cd $ROOT_PATH/utils
python generate_pose.py --dataset_path $DATASET_PATH
  5. Estimate monocular depth.
cd $ROOT_PATH/utils
python generate_depth.py --dataset_path $DATASET_PATH --model $ROOT_PATH/weights/midas_v21-f6b98070.pt
  6. Predict optical flows.
cd $ROOT_PATH/utils
python generate_flow.py --dataset_path $DATASET_PATH --model $ROOT_PATH/weights/raft-things.pth
  7. Obtain motion masks (code adapted from NSFF).
cd $ROOT_PATH/utils
python generate_motion_mask.py --dataset_path $DATASET_PATH
  8. Train a model. Please change expname and datadir in configs/config.txt. (A sketch that chains steps 2-7 into one script is shown right after this list.)
cd $ROOT_PATH/
python run_nerf.py --config configs/config.txt
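
If you would rather drive steps 2-7 from a single script, the sketch below shells out to the same commands listed above. It is not part of the repo; the paths and sequence name are placeholders, and COLMAP must be on your PATH.

import os
import subprocess

# Placeholders: adjust to your own folder layout and video name.
ROOT = '/path/to/the/DynamicNeRF/folder'
NAME = 'my_video'                       # name of the video without extension
DATA = os.path.join(ROOT, 'data', NAME)
UTILS = os.path.join(ROOT, 'utils')

def run(cmd, cwd=None):
    # Echo and execute one command, aborting on failure.
    print('>>', ' '.join(cmd))
    subprocess.run(cmd, cwd=cwd, check=True)

# Step 2: training images and background masks.
run(['python', 'generate_data.py', '--videopath', f'/path/to/{NAME}.mp4'], cwd=UTILS)

# Step 3: COLMAP camera poses.
run(['colmap', 'feature_extractor',
     '--database_path', f'{DATA}/database.db',
     '--image_path', f'{DATA}/images_colmap',
     '--ImageReader.mask_path', f'{DATA}/background_mask',
     '--ImageReader.single_camera', '1'])
run(['colmap', 'exhaustive_matcher', '--database_path', f'{DATA}/database.db'])
os.makedirs(f'{DATA}/sparse', exist_ok=True)
run(['colmap', 'mapper',
     '--database_path', f'{DATA}/database.db',
     '--image_path', f'{DATA}/images_colmap',
     '--output_path', f'{DATA}/sparse',
     '--Mapper.num_threads', '16',
     '--Mapper.init_min_tri_angle', '4',
     '--Mapper.multiple_models', '0',
     '--Mapper.extract_colors', '0'])

# Steps 4-7: poses, depth, optical flow, and motion masks.
run(['python', 'generate_pose.py', '--dataset_path', DATA], cwd=UTILS)
run(['python', 'generate_depth.py', '--dataset_path', DATA,
     '--model', f'{ROOT}/weights/midas_v21-f6b98070.pt'], cwd=UTILS)
run(['python', 'generate_flow.py', '--dataset_path', DATA,
     '--model', f'{ROOT}/weights/raft-things.pth'], cwd=UTILS)
run(['python', 'generate_motion_mask.py', '--dataset_path', DATA], cwd=UTILS)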

Explanation of each parameter:

  • expname: experiment name
  • basedir: where to store ckpts and logs
  • datadir: input data directory
  • factor: downsample factor for the input images
  • N_rand: number of random rays per gradient step
  • N_samples: number of samples per ray
  • netwidth: channels per layer
  • use_viewdirs: whether to enable view dependence for StaticNeRF
  • use_viewdirsDyn: whether to enable view dependence for DynamicNeRF
  • raw_noise_std: std dev of noise added to regularize sigma_a output
  • no_ndc: do not use normalized device coordinates
  • lindisp: sampling linearly in disparity rather than depth
  • i_video: frequency of novel view-time synthesis video saving
  • i_testset: frequency of testset video saving
  • N_iters: number of training iterations
  • i_img: frequency of tensorboard image logging
  • DyNeRF_blending: whether to use DynamicNeRF to predict the blending weight
  • pretrain: whether to pre-train StaticNeRF
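
For orientation, an illustrative config in the key = value form that configargparse accepts might look like the following. The actual, tested files ship in configs/, so treat every value here as a placeholder except those quoted elsewhere in this README.

expname = Balloon2_H270_DyNeRF
basedir = ./logs
datadir = ./data/Balloon2
factor = 2
N_rand = 1024
N_samples = 64
netwidth = 256
use_viewdirs = True
use_viewdirsDyn = False
raw_noise_std = 1e0
no_ndc = False
lindisp = False
N_iters = 300000
i_img = 2000
i_video = 100000
i_testset = 100000
DyNeRF_blending = True
pretrain = True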

License

This work is licensed under the MIT License. See LICENSE for details.

If you find this code useful for your research, please consider citing the following paper:

@inproceedings{Gao-ICCV-DynNeRF,
    author    = {Gao, Chen and Saraf, Ayush and Kopf, Johannes and Huang, Jia-Bin},
    title     = {Dynamic View Synthesis from Dynamic Monocular Video},
    booktitle = {Proceedings of the IEEE International Conference on Computer Vision},
    year      = {2021}
}

Acknowledgments

Our training code is built upon NeRF, NeRF-pytorch, and NSFF. Our flow prediction code is modified from RAFT. Our depth prediction code is modified from MiDaS.

Comments
  • Windows Support

    Hi all,

    Has anyone tried running the code on Windows? Does the framework use any Linux-specific libraries? Is there any information regarding the average time necessary to run the forward pass with the pretrained models?

    Best Regards -E

    opened by ernlavr 4
  • An issue about generating mask

    Hi, thanks for your great work.

    I am wondering why only people are considered as the foreground.

    As shown in the umbrella case, the umbrella is not a person, but it is also successfully segmented.

    opened by ShaoTengLiu 2
  • TypeError: expected Tensor as element 0 in argument 0, but got tuple

    Step: 49993, Loss: 0.013911506161093712, Time: 0.13519692420959473, chain_5frames: False, expname: Balloon2_H270_DyNeRF_pretrain
    Step: 49994, Loss: 0.012133732438087463, Time: 0.1316220760345459, chain_5frames: False, expname: Balloon2_H270_DyNeRF_pretrain
    Step: 49995, Loss: 0.01327237207442522, Time: 0.1362135410308838, chain_5frames: False, expname: Balloon2_H270_DyNeRF_pretrain
    Step: 49996, Loss: 0.01176927424967289, Time: 0.12955403327941895, chain_5frames: False, expname: Balloon2_H270_DyNeRF_pretrain
    Step: 49997, Loss: 0.012316515669226646, Time: 0.13320422172546387, chain_5frames: False, expname: Balloon2_H270_DyNeRF_pretrain
    Step: 49998, Loss: 0.012100623920559883, Time: 0.1324453353881836, chain_5frames: False, expname: Balloon2_H270_DyNeRF_pretrain
    Step: 49999, Loss: 0.011780014261603355, Time: 0.13549208641052246, chain_5frames: False, expname: Balloon2_H270_DyNeRF_pretrain
    Traceback (most recent call last):
      File "run_nerf.py", line 763, in <module>
        train()
      File "run_nerf.py", line 452, in train
        **render_kwargs_train)
      File "/DynamicNeRF/render_utils.py", line 98, in render
        rays, chunk, **kwargs)
      File "/DynamicNeRF/render_utils.py", line 26, in batchify_rays
        all_ret = {k: torch.cat(all_ret[k], 0) for k in all_ret}
      File "/DynamicNeRF/render_utils.py", line 26, in <dictcomp>
        all_ret = {k: torch.cat(all_ret[k], 0) for k in all_ret}
    TypeError: expected Tensor as element 0 in argument 0, but got tuple

    opened by zhywanna 2
  • no video output

    Dear gaochen, thanks for sharing your great work! I followed your lead and have trained for around 240k steps so far, but I got no video as mentioned in the readme ("Every 100k iterations, you should get videos"), and the filenames are not the same as yours.

    opened by zhywanna 2
  • Missing dependency

    Hi @gaochen315 ,

    Thanks a lot for releasing the code for DynamicNeRF! This is one of the smoothest experiences I have had so far when it comes to running a GitHub repo - great work!

    Just one small thing I noticed is that scikit-image seems to be missing in the dependencies. When I run the motion mask generation script, an import error is raised.

    opened by weders 1
  • Link broken for datasets

    Hey, I'm attempting to recreate the results from the paper and trying to download the dataset in order to test it out, and it seems the link https://www-users.cse.umn.edu/~jsyoon/dynamic_synth/ is broken.

    Is there any other source for the data?

    opened by Fortunanto 0
  • KeyError: 'network_fn_d_state_dict'

    Fixing random seed 1
    factor 2
    (270, 480, 3, 12) (270, 480, 12) (270, 480, 12) (270, 480, 2, 12) (270, 480, 12)
    Loaded ./data/Balloon2/ 45.061882503348485 71.07477444180932
    Loaded llff (12, 270, 480, 3) (60, 3, 5) [270. 480. 418.96216] ./data/Balloon2/
    DEFINING BOUNDS
    NEAR FAR 0.0 1.0
    Found ckpts ['./logs/Balloon2_H270_DyNeRF_pretrain/300000.tar', './logs/Balloon2_H270_DyNeRF_pretrain/Pretrained_S.tar']
    Reloading from ./logs/Balloon2_H270_DyNeRF_pretrain/Pretrained_S.tar
    Traceback (most recent call last):
      File "run_nerf.py", line 768, in <module>
        train()
      File "run_nerf.py", line 215, in train
        render_kwargs_train, render_kwargs_test, start, grad_vars, optimizer = create_nerf(args)
      File "/home/chenghuan/DynamicNeRF/folder/run_nerf_helpers.py", line 355, in create_nerf
        model_d.load_state_dict(ckpt['network_fn_d_state_dict'])
    KeyError: 'network_fn_d_state_dict'

    Hi, thanks for your great work.

    I wonder why there is no 'network_fn_d_state_dict' in ckpt.

    Your model's training time is also slow. Can you take advantage of Instant-NGP-related work to increase the speed?

    I am confused why there is no 'network_fn_d_state_dict' in the ckpt when I run the code, and the model trains slowly; could Instant-NGP-related work be used to speed it up? Looking forward to your reply.

    opened by AIBUWAN 2
  • run on my own data

    Hello, I tried your great work on my own video recorded by phone, but I failed on the following COLMAP command:

    colmap mapper \
        --database_path $DATASET_PATH/database.db \
        --image_path $DATASET_PATH/images_colmap \
        --output_path $DATASET_PATH/sparse \
        --Mapper.num_threads 16 \
        --Mapper.init_min_tri_angle 4 \
        --Mapper.multiple_models 0 \
        --Mapper.extract_colors 0

    Firstly, I have some difficulty understanding what Mapper.init_min_tri_angle, Mapper.multiple_models, and Mapper.extract_colors mean, and why you alter those parameters rather than using the defaults. Secondly, when I removed the last four parameters (Mapper.num_threads, Mapper.init_min_tri_angle, Mapper.multiple_models, Mapper.extract_colors), things went well and I successfully got my camera.bin. I wonder how those parameters affect the result, and whether the extrinsic camera poses I get without them are still accurate.

    Hoping for your reply, thanks!

    opened by zhywanna 2