Overview

Mega-NeRF

This repository contains the code needed to train Mega-NeRF models and generate the sparse voxel octrees used by the Mega-NeRF-Dynamic viewer.

The codebase for the Mega-NeRF-Dynamic viewer can be found here.

Note: This is a preliminary release and there may still be outstanding bugs.

Citation

@misc{turki2021meganerf,
      title={Mega-NeRF: Scalable Construction of Large-Scale NeRFs for Virtual Fly-Throughs}, 
      author={Haithem Turki and Deva Ramanan and Mahadev Satyanarayanan},
      year={2021},
      eprint={2112.10703},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Demo

Setup

conda env create -f environment.yml
conda activate mega-nerf

The codebase has been tested mainly against CUDA >= 11.1 and V100/2080 Ti/3090 Ti GPUs. 1080 Ti GPUs should also work, although training will be much slower.
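
As a quick sanity check before training, you can confirm that PyTorch sees your GPUs and the CUDA toolkit (a minimal sketch, not part of the repository's scripts):

import torch

# Report whether CUDA is usable and which devices are visible.
print('CUDA available:', torch.cuda.is_available())
print('CUDA version:', torch.version.cuda)
for i in range(torch.cuda.device_count()):
    print('GPU', i, torch.cuda.get_device_name(i))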

Data

Mill 19

  • The Building scene can be downloaded here.
  • The Rubble scene can be downloaded here.

UrbanScene 3D

  1. Download the raw photo collections from the UrbanScene3D dataset.
  2. Download the refined camera poses for one of the scenes.
  3. Run python scripts/copy_images.py --image_path $RAW_PHOTO_PATH --dataset_path $CAMERA_POSE_PATH

Quad 6k Dataset

  1. Download the raw photo collections from here.
  2. Download the refined camera poses.
  3. Run python scripts/copy_images.py --image_path $RAW_PHOTO_PATH --dataset_path $CAMERA_POSE_PATH

Custom Data

The expected directory structure is:

  • /coordinates.pt: Torch file that should contain the following keys:
    • 'origin_drb': Origin of scene in real-world units
    • 'pose_scale_factor': Scale factor mapping from real-world units (ie: meters) to the [-1, 1] range
  • '/{val|train}/rgbs/': JPEG or PNG images
  • '/{val|train}/metadata/': Per-image metadata saved as a torch file. Each image should have a corresponding metadata file named {rgb_stem}.pt. Each metadata file should contain the following keys:
    • 'W': Image width
    • 'H': Image height
    • 'intrinsics': Image intrinsics in the following form: [fx, fy, cx, cy]
    • 'c2w': Camera pose. 3x4 camera matrix with the convention used in the original NeRF repo, ie: x: down, y: right, z: backwards, followed by the following transformation: torch.cat([camera_in_drb[:, 1:2], -camera_in_drb[:, :1], camera_in_drb[:, 2:4]], -1)
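
To make the expected layout concrete, here is a minimal sketch of writing coordinates.pt and one per-image metadata file with torch.save. The key names follow the list above; the origin, scale, intrinsics, and pose values are placeholders you would derive from your own calibration, and the normalization comment reflects our reading of pose_scale_factor rather than a documented API:

import os
import torch

# Scene-level coordinates file. Poses are presumably stored normalized as
# (position - origin_drb) / pose_scale_factor so that the scene fits in [-1, 1].
torch.save({
    'origin_drb': torch.tensor([0.0, 0.0, 0.0]),  # placeholder scene origin (real-world units)
    'pose_scale_factor': 100.0,                   # placeholder meters -> [-1, 1] scale
}, 'coordinates.pt')

# Per-image metadata, saved as train/metadata/{rgb_stem}.pt.
# camera_in_drb is a hypothetical 3x4 camera-to-world matrix already expressed
# in the down/right/backwards (DRB) convention described above.
camera_in_drb = torch.eye(3, 4)
c2w = torch.cat([camera_in_drb[:, 1:2],    # the column permutation quoted above:
                 -camera_in_drb[:, :1],    # [col1, -col0, col2, col3]
                 camera_in_drb[:, 2:4]], -1)

os.makedirs('train/metadata', exist_ok=True)
torch.save({
    'W': 1920,
    'H': 1080,
    'intrinsics': torch.tensor([1000.0, 1000.0, 960.0, 540.0]),  # [fx, fy, cx, cy]
    'c2w': c2w,
}, 'train/metadata/example.pt')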

Training

  1. Generate the training partitions for each submodule: python scripts/create_cluster_masks.py --config configs/mega-nerf/${DATASET_NAME}.yaml --dataset_path $DATASET_PATH --output $MASK_PATH --grid_dim $GRID_X $GRID_Y
    • Note: this can be run across multiple GPUs by instead running python -m torch.distributed.run --standalone --nnodes=1 --nproc_per_node $NUM_GPUS --max_restarts 0 scripts/create_cluster_masks.py followed by the same arguments as above
  2. Train each submodule: python mega_nerf/train.py --config_file configs/mega-nerf/${DATASET_NAME}.yaml --exp_name $EXP_PATH --dataset_path $DATASET_PATH --chunk_paths $SCRATCH_PATH --cluster_mask_path ${MASK_PATH}/${SUBMODULE_INDEX}
    • Note: training against full-scale data will write hundreds of GBs / several TBs of shuffled data to disk. You can downsample the training data using the train_scale_factor option.
    • Note: we provide a utility script based on parscript to start multiple training jobs in parallel. It can be run with the following command: CONFIG_FILE=configs/mega-nerf/${DATASET_NAME}.yaml EXP_PREFIX=$EXP_PATH DATASET_PATH=$DATASET_PATH CHUNK_PREFIX=$SCRATCH_PATH MASK_PATH=$MASK_PATH python -m parscript.dispatcher parscripts/run_8.txt -g $NUM_GPUS
  3. Merge the trained submodules into a unified Mega-NeRF model: python scripts/merge_submodules.py --config_file configs/mega-nerf/${DATASET_NAME}.yaml --ckpt_prefix ${EXP_PREFIX}- --centroid_path ${MASK_PATH}/params.pt --output $MERGED_OUTPUT

Evaluation

Single-GPU evaluation: python mega_nerf/eval.py --config_file configs/mega-nerf/${DATASET_NAME}.yaml --exp_name $EXP_NAME --dataset_path $DATASET_PATH --container_path $MERGED_OUTPUT

Multi-GPU evaluation: python -m torch.distributed.run --standalone --nnodes=1 --nproc_per_node $NUM_GPUS mega_nerf/eval.py --config_file configs/mega-nerf/${DATASET_NAME}.yaml --exp_name $EXP_NAME --dataset_path $DATASET_PATH --container_path $MERGED_OUTPUT

Octree Extraction (for use by Mega-NeRF-Dynamic viewer)

python scripts/create_octree.py --config configs/mega-nerf/${DATASET_NAME}.yaml --dataset_path $DATASET_PATH --container_path $MERGED_OUTPUT --output $OCTREE_PATH
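
The resulting octree is serialized with svox (see Acknowledgements below). As a minimal sketch, assuming svox's N3Tree.load API and a hypothetical output file, you can load the extracted octree for inspection:

import svox

# Load the serialized sparse voxel octree produced by create_octree.py.
tree = svox.N3Tree.load('octree.npz', device='cpu')
print(tree)  # prints a summary of the tree (depth, capacity, data dimension)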

Acknowledgements

Large parts of this codebase are based on existing work in the nerf_pl, NeRF++, and PlenOctree repositories. We use svox to serialize our sparse voxel octrees, and the generated structures should be largely compatible with that codebase.

Comments
  • Custom dataset

    Congrats on your great work, and thank you for releasing the code. I am keen to train on my own custom dataset but am stuck in the dataset preparation process. I also noticed that you mention in the paper: "We refine the initial GPS-derived camera poses in the Mill 19 and UrbanScene3D datasets and the estimates provided in the Quad 6k dataset using PixSFM."

    I have the following questions:

    1. How are the initial camera poses derived from the drone GPS and IMU sensors, and how are the initial GPS-derived camera poses refined using PixSFM? Could you kindly provide the script that generates the camera poses?
    2. Can camera poses estimated by COLMAP work with Mega-NeRF, or should they also be refined by PixSFM?

    Thanks in advance and really looking forward to your reply! Xingyi

    opened by xingyi-li 12
  • arguments questions

    I read the paper about Mega-NeRF, which is amazing. Could I ask some questions about the arguments of the code? For example, what is the meaning of '--chunk_paths $SCRATCH_PATH'? I do not see any notes on this parameter.

    opened by city19992 9
  • Question about c2w

    Hi, just to confirm the format of the c2w matrix: does c2w[:3,:3] correspond to the rotation matrix in [x, y, z] order, and c2w[:3,-1] to the camera position in [z, x, y] order? I noticed that in _truncate_with_plane_intersection you use rays_o[:, :, 0] < altitude.

    opened by kam1107 7
  • The validation images are also used in training

    Hi, thank you for your great open-source release and quick responses to my previous questions. I am trying to train Mega-NeRF to reproduce the results and have a question about the validation images. As in the code below: https://github.com/cmusatyalab/mega-nerf/blob/306e06cc316dd4f5c84d0610308bcbc208228fc3/mega_nerf/runner.py#L620 It seems that the validation images are also used in training. To my understanding, this is because we need to learn the appearance embeddings of the validation images. I would like to confirm whether I understand this setting correctly. Thank you very much.

    Best regards, Zhenxing Mi

    opened by MiZhenxing 5
  • how to see results after training?

    Thank you for your excellent work! I have a question: I used the Building dataset, and after training I cannot see results for the whole scene. Is there an .mp4 or some images that show the training result? Also, I notice that render_images.py requires inputs including pose.txt/intrinsics.txt/embeddings.txt. How can I get these?

    opened by Furenchampion 5
  • pixsfm cameras of Campus dataset

    Hi, it seems the camera parameter link for Campus is broken (https://storage.cmusatyalab.org/mega-nerf-data/campus-pixsfm.tgz). Could you please update the link so we can fully test on the datasets? Thank you very much.

    opened by MiZhenxing 4
  • CUDA out of memory when creating mask

    Hi, I used the Quad 6k dataset and set grid_dim to 10, 10.

    However, I get the following CUDA out-of-memory error: RuntimeError: CUDA out of memory. Tried to allocate 18.31 GiB (GPU 0; 23.70 GiB total capacity; 18.57 GiB already allocated; 3.13 GiB free; 18.58 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

    I'm using 8 3090 cards. I don't understand why this causes the out-of-memory issue.

    Weide

    opened by weidezhang 3
  • Training speed almost irrelevant to the size of dataset

    I trained the Building dataset with 1000+ images, and also a 50-image sample of the dataset, on 4 3090 GPUs; it took almost 40 hours. However, the training time for the small dataset was almost the same as for the big one. I was wondering how to speed up this process. Thanks a lot.

    opened by h8c2 3
  • Error while training mill 19 building, centroid 0 in a distributed fashion

    I get this error after a while when trying to train a model (on multiple nodes with multiple GPUs each) on the Building dataset, for chunk/submodule 0 (using the provided cluster masks). Any tips on debugging this? It seems like there is an issue in a sample somewhere.

     28%|██▊       | 140533/500000 [15:32:21<45:35:06,  2.19it/s]
     28%|██▊       | 140534/500000 [15:32:22<50:19:14,  1.98it/s]
     28%|██▊       | 140535/500000 [15:32:23<53:45:04,  1.86it/s]
     28%|██▊       | 140536/500000 [15:32:23<44:03:00,  2.27it/s]
     28%|██▊       | 140537/500000 [15:32:23<49:43:21,  2.01it/s]
     28%|██▊       | 140538/500000 [15:32:24<55:23:35,  1.80it/s]
     28%|██▊       | 140539/500000 [15:32:24<46:40:38,  2.14it/s]
     28%|██▊       | 140540/500000 [15:32:25<46:16:18,  2.16it/s]
     28%|██▊       | 140541/500000 [15:32:25<38:48:16,  2.57it/s]
     28%|██▊       | 140542/500000 [15:32:25<34:51:35,  2.86it/s]
     28%|██▊       | 140543/500000 [15:32:26<38:54:21,  2.57it/s]
     28%|██▊       | 140544/500000 [15:32:26<46:17:54,  2.16it/s]
     28%|██▊       | 140545/500000 [15:32:27<40:27:56,  2.47it/s]
     28%|██▊       | 140546/500000 [15:32:27<46:56:04,  2.13it/s]
     28%|██▊       | 140547/500000 [15:32:28<52:37:22,  1.90it/s]ERROR:torch.distributed.elastic.multiprocessing.errors.error_handler:{
    [stderr]  "message": {
    [stderr]    "message": "ValueError: cannot reshape array of size 42028165 into shape (42107792,)",
    [stderr]    "extraInfo": {
    [stderr]      "py_callstack": "Traceback (most recent call last):\n  File \"/azureml-envs/azureml_42fe9568fca52a4c68567983a86d6e0f/lib/python3.9/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py\", line 345, in wrapper\n    return f(*args, **kwargs)\n  File \"/tmp/code/mega_nerf/train.py\", line 24, in main\n    Runner(hparams).train()\n  File \"/tmp/code/mega_nerf/runner.py\", line 226, in train\n    dataset.load_chunk()\n  File \"/tmp/code/mega_nerf/datasets/filesystem_dataset.py\", line 80, in load_chunk\n    chosen, self._loaded_rgbs, self._loaded_rays, self._loaded_image_indices = self._chunk_future.result()\n  File \"/azureml-envs/azureml_42fe9568fca52a4c68567983a86d6e0f/lib/python3.9/concurrent/futures/_base.py\", line 438, in result\n    return self.__get_result()\n  File \"/azureml-envs/azureml_42fe9568fca52a4c68567983a86d6e0f/lib/python3.9/concurrent/futures/_base.py\", line 390, in __get_result\n    raise self._exception\n  File \"/azureml-envs/azureml_42fe9568fca52a4c68567983a86d6e0f/lib/python3.9/concurrent/futures/thread.py\", line 52, in run\n    result = self.fn(*self.args, **self.kwargs)\n  File \"/tmp/code/mega_nerf/datasets/filesystem_dataset.py\", line 111, in _load_chunk_inner\n    loaded_pixel_indices = torch.IntTensor(np.load(self._ray_arrays[next_index]))\n  File \"/azureml-envs/azureml_42fe9568fca52a4c68567983a86d6e0f/lib/python3.9/site-packages/numpy/lib/npyio.py\", line 440, in load\n    return format.read_array(fid, allow_pickle=allow_pickle,\n  File \"/azureml-envs/azureml_42fe9568fca52a4c68567983a86d6e0f/lib/python3.9/site-packages/numpy/lib/format.py\", line 787, in read_array\n    array.shape = shape\nValueError: cannot reshape array of size 42028165 into shape (42107792,)\n",
    [stderr]      "timestamp": "1651655215"
    [stderr]    }
    [stderr]  }
    [stderr]}
    [stderr]Traceback (most recent call last):
    [stderr]  File "/tmp/code/mega_nerf/train.py", line 28, in <module>
    [stderr]    main(_get_train_opts())
    [stderr]  File "/azureml-envs/azureml_42fe9568fca52a4c68567983a86d6e0f/lib/python3.9/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 345, in wrapper
    [stderr]    return f(*args, **kwargs)
    [stderr]  File "/tmp/code/mega_nerf/train.py", line 24, in main
    [stderr]    Runner(hparams).train()
    [stderr]  File "/tmp/code/mega_nerf/runner.py", line 226, in train
    [stderr]    dataset.load_chunk()
    [stderr]  File "/tmp/code/mega_nerf/datasets/filesystem_dataset.py", line 80, in load_chunk
    [stderr]    chosen, self._loaded_rgbs, self._loaded_rays, self._loaded_image_indices = self._chunk_future.result()
    [stderr]  File "/azureml-envs/azureml_42fe9568fca52a4c68567983a86d6e0f/lib/python3.9/concurrent/futures/_base.py", line 438, in result
    [stderr]    return self.__get_result()
    [stderr]  File "/azureml-envs/azureml_42fe9568fca52a4c68567983a86d6e0f/lib/python3.9/concurrent/futures/_base.py", line 390, in __get_result
    [stderr]    raise self._exception
    [stderr]  File "/azureml-envs/azureml_42fe9568fca52a4c68567983a86d6e0f/lib/python3.9/concurrent/futures/thread.py", line 52, in run
    [stderr]    result = self.fn(*self.args, **self.kwargs)
    [stderr]  File "/tmp/code/mega_nerf/datasets/filesystem_dataset.py", line 111, in _load_chunk_inner
    [stderr]    loaded_pixel_indices = torch.IntTensor(np.load(self._ray_arrays[next_index]))
    [stderr]  File "/azureml-envs/azureml_42fe9568fca52a4c68567983a86d6e0f/lib/python3.9/site-packages/numpy/lib/npyio.py", line 440, in load
    [stderr]    return format.read_array(fid, allow_pickle=allow_pickle,
    [stderr]  File "/azureml-envs/azureml_42fe9568fca52a4c68567983a86d6e0f/lib/python3.9/site-packages/numpy/lib/format.py", line 787, in read_array
    [stderr]    array.shape = shape
    [stderr]ValueError: cannot reshape array of size 42028165 into shape (42107792,)
    
    opened by madratman 3
  • Question about running a demo

    What great work! May I ask a few questions? If I just want to run a demo, all I need to do is download one of the datasets and its corresponding pretrained model, run create_octree.py to extract the octree, and then view the result in the Mega-NeRF-Dynamic viewer. Is that correct? Moreover, what is the meaning of the --container_path argument, and what should I pass to it? Many thanks!

    opened by tonytonglt 3
  • forward-facing dataset

    Hi, thanks for the excellent work.

    I want to test the code on a relatively large-scale forward-facing dataset. How should I change the code, and in particular, how should I produce the centroids? My understanding is that the ray_altitude_range parameter is no longer needed, but I cannot get a reasonable cluster mask.

    opened by Harper714 2
  • not clear img results

    Hi, thanks for this great work on large scenes. I used it on my own dataset (500 pics); after training for 500k iterations, I got the images from 'eval_result_rgbs' via the script 'eval.py', but the resulting RGB images are not clear enough. Do you have any suggestions? Any response is helpful. Thanks!

    opened by Calsia 1
  • questions about render_image.py

    @hturki thanks for your great work! I'm following the tutorial to see its outcome. However, after the evaluation I can only see matrixs.txt; what I want is to see some rendered images or videos. The question is that I don't really understand how to create the 3 input files for render_image.py. Could you please offer some examples?

    opened by cezarbbb 3
  • How is a pixel assigned to a cell?

    Dear @jaharkes @hturki @therealsatya @teiszler

    Thanks for the great work! But I have a question about my understanding: how is a pixel assigned to a certain 3D cell? Using the depth of the pixel? There are many sample locations along a pixel's ray; how are those sample locations processed?

    Best, Yingjie CAI

    opened by yjcaimeow 0
  • How to get the pixel color if a training ray traverses multiple submodules?

    This work is excellent. But I have a question: in the paper, you said each submodule can be trained separately, but what if a training ray corresponds to multiple submodules? Do you merge the sample points from the different modules?

    opened by cwchenwang 0
  • How to visualize and save OBJ?

    Hi @hturki! I was trying to compile mega-nerf-viewer but got stuck at the final stage. Do you have any idea about this problem?

    OS: Ubuntu 20.04; gcc: 9.4.0 (Ubuntu 9.4.0-1ubuntu1~20.04.1); cmake: 3.24.1; libtorch; libx11-dev_1.6.9-2ubuntu1.3_amd64; CUDA: 11.6

    I have successfully compiled mega-nerf-viewer. Then I ran the following command with one NVIDIA Tesla A100: ./mega-nerf-viewer ../../OCTREE_PATH.npz --model_path ../../building-pixsfm-8.pt (screenshot attached). Thanks!

    opened by chl2 0