[arXiv'22] Panoptic NeRF: 3D-to-2D Label Transfer for Panoptic Urban Scene Segmentation

Overview

Panoptic NeRF

Project Page | Paper | Dataset


Panoptic NeRF: 3D-to-2D Label Transfer for Panoptic Urban Scene Segmentation
Xiao Fu*, Shangzhan zhang*, Tianrun Chen, Yichong Lu, Lanyun Zhu, Xiaowei Zhou, Andreas Geiger, Yiyi Liao
arXiv 2022

image

Installation

  1. Create a virtual environment via conda.
    conda create -n panopticnerf python=3.7
    conda activate panopticnerf
    
  2. Install torch and torchvision.
    conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch
    
  3. Install requirements.
    pip install -r requirements.txt
    

Data Preparation

  1. We evaluate our model on KITTI-360. Here we show the structure of a test dataset as follow. You can download it from here and then put it into $ROOT (RGBs should query the KITTI-360 website).

    ├── KITTI-360
      ├── 2013_05_28_drive_0000_sync
        ├── image_00
        ├── image_01
      ├── bbx_intersection
        ├── *_00.npz
        ├── *_01.npz
      ├── calibration
        ├── calib_cam_to_pose.txt
        ├── perspective.txt
      ├── data_3d_bboxes
      ├── data_poses
        ├── cam0_to_world.txt
        ├── poses.txt
      ├── pspnet
      ├── sgm
      ├── visible_id
    
    file Intro
    image_00/01 stereo RGB images
    pspnet 2D pseudo ground truth
    sgm weak stereo depth supervision
    visible_id per-frame bounding primitive IDs
    data_poses system poses in a global Euclidean coordinate
    calibration extrinsics and intrinsics of the perspective cameras
    bbx_intersection ray-mesh intersections, containing depths between hitting points and camera origin, semantic label IDs and bounding primitive IDs
  2. Generate ray-mesh intersections (bbx_intersection/*.npz). The red dots and blue dots indicate where the rays hit into and out of the meshes, respectively. For the given test scene, START=3353, NUM=64.

    # image_00
    python mesh_intersection.py intersection_start_frame ${START} intersection_frames ${NUM} use_stereo False
    # image_01
    python mesh_intersection.py intersection_start_frame ${START} intersection_frames ${NUM} use_stereo True
    

  1. Evaluate the origin of a scene (center_pose) and the distance from the origin to the furthest bounding primitive (dist_min). Then accordingly modify the .yaml file.
    python recenter_pose.py recenter_start_frame ${START} recenter_frames ${NUM}
    

Training and Visualization

  1. We provide the training code. Replace resume False with resume True to load the pretained model.

    python train_net.py --cfg_file configs/panopticnerf_test.yaml pretrain nerf gpus '1,' use_stereo True use_pspnet True use_depth True pseudo_filter True weight_th 0.05 resume False
    
  2. Render semantic map, panoptic map and depth map in a single forward pass, which takes around 10s per-frame on a single 3090 GPU. Please make sure to maximize the GPU memory utilization by increasing the size of the chunk to reduce inference time. Replace use_stereo False with use_stereo True to render the right views.

    python run.py --type visualize --cfg_file configs/panopticnerf_test.yaml use_stereo False
    
  3. Visualize novel view appearance & label synthesis. Before rendering, select a frame and generate corresponding ray-mesh intersections with respect to its novel spiral poses by enabling spiral poses==True in lib.datasets.kitti360.panopticnerf.py.

    monocular

Evaluation

├── KITTI-360
  ├── gt_2d_semantics
  ├── gt_2d_panoptics
  ├── lidar_depth
  1. Download the corresponding pretrained model and put it to $ROOT/data/trained_model/panopticnerf/panopticnerf_test/latest.pth.

  2. We provide some semantic & panoptic GTs and LiDAR point clouds for evaluation. The details of evaluation metrics can be found in the paper.

  3. Eval mean intersection-over-union (mIoU)

python run.py --type eval_miou --cfg_file configs/panopticnerf_test.yaml use_stereo False
  1. Eval panoptic quality (PQ)
sh eval_pq_test.sh
  1. Eval depth with 0-100m LiDAR point clouds, where the far depth can be adjusted to evaluate the closer scene.
python run.py --type eval_depth --cfg_file configs/panopticnerf_test.yaml use_stereo False max_depth 100.
  1. Eval Multi-view Consistency (MC)
python eval_consistency.py --cfg_file configs/panopticnerf_test.yaml use_stereo False consistency_thres 0.1

News

  • 12/04/2022 Code released.
  • 29/03/2022 Repo created. Code will come soon.

Citation

@article{fu2022panoptic,
  title={Panoptic NeRF: 3D-to-2D Label Transfer for Panoptic Urban Scene Segmentation},
  author={Fu, Xiao and Zhang, Shangzhan and Chen, Tianrun and Lu, Yichong and Zhu, Lanyun and Zhou, Xiaowei and Geiger, Andreas and Liao, Yiyi},
  journal={arXiv preprint arXiv:2203.15224},
  year={2022}
}

Copyright © 2022, Zhejiang University. All rights reserved. We favor any positive inquiry, please contact [email protected].

You might also like...
[CVPR 2021] Forecasting the panoptic segmentation of future video frames
[CVPR 2021] Forecasting the panoptic segmentation of future video frames

Panoptic Segmentation Forecasting Colin Graber, Grace Tsai, Michael Firman, Gabriel Brostow, Alexander Schwing - CVPR 2021 [Link to paper] We propose

An unofficial personal implementation of UM-Adapt, specifically to tackle joint estimation of panoptic segmentation and depth prediction for autonomous driving datasets.

Semisupervised Multitask Learning This repository is an unofficial and slightly modified implementation of UM-Adapt[1] using PyTorch. This code primar

PanopticBEV - Bird's-Eye-View Panoptic Segmentation Using Monocular Frontal View Images
PanopticBEV - Bird's-Eye-View Panoptic Segmentation Using Monocular Frontal View Images

Bird's-Eye-View Panoptic Segmentation Using Monocular Frontal View Images This r

Roach: End-to-End Urban Driving by Imitating a Reinforcement Learning Coach
Roach: End-to-End Urban Driving by Imitating a Reinforcement Learning Coach

CARLA-Roach This is the official code release of the paper End-to-End Urban Driving by Imitating a Reinforcement Learning Coach by Zhejun Zhang, Alexa

Urban mobility simulations with Python3, RLlib (Deep Reinforcement Learning) and Mesa (Agent-based modeling)
Urban mobility simulations with Python3, RLlib (Deep Reinforcement Learning) and Mesa (Agent-based modeling)

Deep Reinforcement Learning for Smart Cities Documentation RLlib: https://docs.ray.io/en/master/rllib.html Mesa: https://mesa.readthedocs.io/en/stable

MGFN: Multi-Graph Fusion Networks for Urban Region Embedding was accepted by IJCAI-2022.

Multi-Graph Fusion Networks for Urban Region Embedding (IJCAI-22) This is the implementation of Multi-Graph Fusion Networks for Urban Region Embedding

Official PyTorch Implementation of Embedding Transfer with Label Relaxation for Improved Metric Learning, CVPR 2021
Official PyTorch Implementation of Embedding Transfer with Label Relaxation for Improved Metric Learning, CVPR 2021

Embedding Transfer with Label Relaxation for Improved Metric Learning Official PyTorch implementation of CVPR 2021 paper Embedding Transfer with Label

Code for CVPR 2021 oral paper
Code for CVPR 2021 oral paper "Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts"

Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts The rapid progress in 3D scene understanding has come with growing dem

[TIP 2020] Multi-Temporal Scene Classification and Scene Change Detection with Correlation based Fusion
[TIP 2020] Multi-Temporal Scene Classification and Scene Change Detection with Correlation based Fusion

Multi-Temporal Scene Classification and Scene Change Detection with Correlation based Fusion Code for Multi-Temporal Scene Classification and Scene Ch

Comments
  • About sky depth without using NDC space

    About sky depth without using NDC space

    Thanks for your great work! I found that you used real world space coordinate instead of common NDC space. So I want to know how you supervise sky depth cause it is inf depth in real world. I found that you filted it out when training here

    opened by DRosemei 3
  • Intersections over road surface

    Intersections over road surface

    I find thiswill largely enforce NeRF to converge to proper depth. While intersections are from 3D bboxes, so how do you deal with road surface? I cannot see from here. Thanks in advance : )

    opened by DRosemei 2
  • How can I change batch_size?

    How can I change batch_size?

    When I change the "batch_size" under "panopticnerf_test.yaml"——"train", an error occured.

    Traceback (most recent call last): File "train_net.py", line 95, in main() File "train_net.py", line 92, in main train(cfg, network) File "train_net.py", line 47, in train trainer.train(epoch, train_loader, optimizer, recorder) File "/root/data1/PanopticNeRF/lib/train/trainers/trainer.py", line 58, in train for iteration, batch in enumerate(data_loader): File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 521, in next data = self._next_data() File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1203, in _next_data return self._process_data(data) File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1229, in _process_data data.reraise() File "/opt/conda/lib/python3.7/site-packages/torch/_utils.py", line 434, in reraise raise exception KeyError: Caught KeyError in DataLoader worker process 0. Original Traceback (most recent call last): File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop data = fetcher.fetch(index) File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 52, in fetch return self.collate_fn(data) File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/_utils/collate.py", line 74, in default_collate return {key: default_collate([d[key] for d in batch]) for key in elem} File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/_utils/collate.py", line 74, in return {key: default_collate([d[key] for d in batch]) for key in elem} File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/_utils/collate.py", line 74, in default_collate return {key: default_collate([d[key] for d in batch]) for key in elem} File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/_utils/collate.py", line 74, in return {key: default_collate([d[key] for d in batch]) for key in elem} File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/_utils/collate.py", line 74, in return {key: default_collate([d[key] for d in batch]) for key in elem} KeyError: 89

    Could you please look at this problem?

    opened by a4152684 0
  • How to generate gt_2d_semantics, gt_2d_panoptics, lidar_depth?

    How to generate gt_2d_semantics, gt_2d_panoptics, lidar_depth?

    Many thanks for this great work!

    When evaluating, there are three files needed, gt_2d_semantics, gt_2d_panoptics and lidar_depth.

    How could I prepare these files from the original Kitti360 data? Could there be a code release to generate these files?

    opened by Asparagus15 1
Owner
Xiao Fu
Don’t regret anything in life.
Xiao Fu
[arXiv22] Disentangled Representation Learning for Text-Video Retrieval

Disentangled Representation Learning for Text-Video Retrieval This is a PyTorch implementation of the paper Disentangled Representation Learning for T

Qiang Wang 42 Sep 7, 2022
Instant-nerf-pytorch - NeRF trained SUPER FAST in pytorch

instant-nerf-pytorch This is WORK IN PROGRESS, please feel free to contribute vi

null 85 Sep 2, 2022
Build upon neural radiance fields to create a scene-specific implicit 3D semantic representation, Semantic-NeRF

Semantic-NeRF: Semantic Neural Radiance Fields Project Page | Video | Paper | Data In-Place Scene Labelling and Understanding with Implicit Scene Repr

Shuaifeng Zhi 192 Oct 1, 2022
[ICRA2021] Reconstructing Interactive 3D Scene by Panoptic Mapping and CAD Model Alignment

Interactive Scene Reconstruction Project Page | Paper This repository contains the implementation of our ICRA2021 paper Reconstructing Interactive 3D

null 76 Sep 10, 2022
An efficient 3D semantic segmentation framework for Urban-scale point clouds like SensatUrban, Campus3D, etc.

An efficient 3D semantic segmentation framework for Urban-scale point clouds like SensatUrban, Campus3D, etc.

Zou 27 Jul 21, 2022
Label Mask for Multi-label Classification

LM-MLC 一种基于完型填空的多标签分类算法 1 前言 本文主要介绍本人在全球人工智能技术创新大赛【赛道一】设计的一种基于完型填空(模板)的多标签分类算法:LM-MLC,该算法拟合能力很强能感知标签关联性,在多个数据集上测试表明该算法与主流算法无显著性差异,在该比赛数据集上的dev效果很好,但是由

null 50 Aug 15, 2022
Official implementation of "Open-set Label Noise Can Improve Robustness Against Inherent Label Noise" (NeurIPS 2021)

Open-set Label Noise Can Improve Robustness Against Inherent Label Noise NeurIPS 2021: This repository is the official implementation of ODNL. Require

Hongxin Wei 10 Aug 28, 2022
A PyTorch implementation of ICLR 2022 Oral paper PiCO: Contrastive Label Disambiguation for Partial Label Learning

PiCO: Contrastive Label Disambiguation for Partial Label Learning This is a PyTorch implementation of ICLR 2022 Oral paper PiCO; also see our Project

王皓波 83 May 11, 2022
Pixel Consensus Voting for Panoptic Segmentation (CVPR 2020)

Implementation for Pixel Consensus Voting (CVPR 2020). This codebase contains the essential ingredients of PCV, including various spatial discretizati

Haochen 23 Mar 23, 2022
[CVPR 2021] Exemplar-Based Open-Set Panoptic Segmentation Network (EOPSN)

EOPSN: Exemplar-Based Open-Set Panoptic Segmentation Network (CVPR 2021) PyTorch implementation for EOPSN. We propose open-set panoptic segmentation t

Jaedong Hwang 47 Aug 12, 2022