PanopticBEV - Bird's-Eye-View Panoptic Segmentation Using Monocular Frontal View Images

Overview

This repository contains the PyTorch implementation of the PanopticBEV model proposed in our RA-L 2021 paper Bird's-Eye-View Panoptic Segmentation Using Monocular Frontal View Images.

PanopticBEV is the state-of-the-art approach for generating panoptic segmentation maps in the bird's-eye view using only monocular frontal view images.

PanopticBEV Teaser

If you find this code useful for your research, please consider citing our paper:

@article{gosala2021bev,
  title={Bird's-Eye-View Panoptic Segmentation Using Monocular Frontal View Images},
  author={Gosala, Nikhil and Valada, Abhinav},
  journal={arXiv preprint arXiv:2108.03227},
  year={2021}
}

System requirements

  • Linux (Tested on Ubuntu 18.04)
  • Python3 (Tested using Python 3.6.9)
  • PyTorch (Tested using PyTorch 1.8.1)
  • CUDA (Tested using CUDA 11.1)

Installation

a. Create a Python virtual environment and activate it.

python3 -m venv panoptic_bev
source panoptic_bev/bin/activate

b. Update pip to the latest version.

python3 -m pip install --upgrade pip

c. Install the required Python dependencies using the provided requirements.txt file.

pip3 install -r requirements.txt

d. Install the PanopticBEV code.

python3 setup.py develop
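
e. (Optional) Sanity-check the environment. This generic check, which is not part of the original instructions, confirms that PyTorch was installed with CUDA support.

python3 -c "import torch; print(torch.__version__, torch.cuda.is_available())"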

Obtaining the datasets

Please download the datasets from here and follow the instructions provided in the included readme file.

Code Execution

Configuration parameters

The configuration parameters of the model, such as the learning rate, batch size, and dataloader options, are stored in the experiments/config folder. If you intend to modify the model parameters, please do so there.
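
The configuration files are standard INI files (e.g., experiments/config/nuscenes.ini, referenced in the comments below). As a minimal sketch for inspecting one — assuming Python's built-in configparser rather than the repository's own loader — you can list every section and its parameters:

import configparser

# Path is illustrative; point this at any file in experiments/config.
config = configparser.ConfigParser()
config.read("experiments/config/nuscenes.ini")

# Print each section with its key/value pairs (values are read as strings).
for section in config.sections():
    print(f"[{section}]")
    for key, value in config[section].items():
        print(f"  {key} = {value}")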

Training and Evaluation

The training and evaluation Python scripts, along with the shell scripts to execute them, are provided in the scripts folder. Before running the shell scripts, please fill in the missing parameters with your machine-specific data paths and parameters.

To train the model, execute the following command after replacing * with either kitti or nuscenes.

bash train_panoptic_bev_*.sh

To evaluate the model, execute the following command after replacing * with either kitti or nuscenes.

bash eval_panoptic_bev_*.sh 
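
For example, to train and then evaluate on KITTI-360:

bash train_panoptic_bev_kitti.sh
bash eval_panoptic_bev_kitti.sh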

Acknowledgements

This work was supported by the Federal Ministry of Education and Research (BMBF) of Germany under ISA 4.0 and by the Eva Mayr-Stihl Stiftung.

This project contains code adapted from other open-source projects, and we especially thank their authors.

License

This code is released under the GPLv3 for academic usage. For commercial usage, please contact Nikhil Gosala.

Comments
  • Support multi-camera input?

    Really great work! A small question: the teaser.png (https://github.com/robot-learning-freiburg/PanopticBEV/blob/master/images/teaser.png) appears to be generated from 6 cameras (the nuScenes dataset?), but nuscenes.ini lists only cam_front:

    [cameras]
    intrinsics = {"fx": 1266.4172, "fy": 51276.4172, "px": 816.267, "py": 491.507}
    extrinsics = {"translation": (0.0, 0.6, 1.85), "rotation": (-90, 0, 180)}
    bev_params = {"f": 336, "cam_z": 26}

    So how can I enable multi-camera input for PanopticBEV, or could you post a multi-camera version?

    opened by lygbuaa 4
  • How to visualize the result

    Dear authors,

    Thanks for your great work. I have trained a model myself and want to see the visualized output (like Fig. 1 in your paper). I found a script at panoptic_bev/utils/visualization.py but don't know how to use it.

    Could you please tell me how to visualize the results?

    Bests,

    opened by ShangyinGao 2
  • 'list' object has no attribute 'cuda'

    Hi, have you seen this error before? I tried training from scratch and got it here:

    for it, sample in enumerate(dataloader):
        sample = {k: sample[k].cuda(device=varargs['device'], non_blocking=True) for k in NETWORK_INPUTS}

    (A defensive workaround is sketched after this comments section.)

    Thanks, Shubhankar

    opened by sborse3 1
  • Where is the dataset annotation tool mentioned in the paper supplementary?

    I downloaded nuscenes_panopticbev.zip and found only front-camera ground truth, but I need labels for all 6 cameras. Can you release the dataset annotation tool, or the fully annotated nuScenes labels for all 6 cameras? That would be helpful.

    opened by lygbuaa 1
  • What is the starting coordinate of the bev map?

    For the KITTI-360 data, what is the starting coordinate of the BEV map, i.e., the coordinate of the map's left border? The BEV maps encode real-world distance relationships, so the starting coordinate is very important, but it is not given in your paper. A related question: are the BEV maps in the LiDAR coordinate system or the camera coordinate system?

    opened by JunjieLiuSWU 0
  • Could you send out the picture of your configuration? Thank you.

    Which nuScenes dataset are you using, and how do you set these settings? [screenshot not recovered] I would appreciate it if you could give me a reply!

    Originally posted by @laozheng1 in https://github.com/robot-learning-freiburg/PanopticBEV/issues/10#issuecomment-1072342490

    opened by xiaowhite-22 0
  • Detection performs very badly after changing the input image resolution

    I changed "scale=1.0" to "scale=0.5" in nuscenes.ini and re-trained the model, expecting similar results, but object detection performs very badly at the smaller scale.

    The result from "scale=1.0" is fairly good [image], but with "scale=0.5" nothing is detected [image]!

    How should I tune the model after changing the scale? Any suggestions? (A note on scaling camera intrinsics is sketched after this comments section.)

    opened by lygbuaa 0
  • Question about Model Training

    Dear all, we trained the model on nuScenes and KITTI-360, respectively. nuScenes works fine, but for KITTI-360 the results do not look very good; we attach them at the end. I did not change any code or configuration. I also found that the FV image size for KITTI-360 is 768 x 1400 in the paper but 384 x 1400 in the configuration file in the code. Which one is correct?

    01:16:57 ---------------- Semantic mIoU ----------------
    01:16:57 road       : 0.66676
    01:16:57 sidewalk   : 0.29211
    01:16:57 building   : 0.26043
    01:16:57 wall       : 0.03577
    01:16:57 vegetation : 0.34960
    01:16:57 terrain    : 0.14224
    01:16:57 occlusion  : 0.40774
    01:16:57 person     : 0.00000
    01:16:57 rider      : 0.00789
    01:16:57 car        : 0.35977
    01:16:57 truck      : 0.07531
    01:16:57 ---------------- Panoptic Scores ----------------
    01:16:57 po_miou    : 0.23538
    01:16:57 sem_miou   : 0.23615
    01:16:57 pq         : 0.14057
    01:16:57 pq_stuff   : 0.17719
    01:16:57 pq_thing   : 0.07647
    01:16:57 sq         : 0.55955
    01:16:57 sq_stuff   : 0.60958
    01:16:57 sq_thing   : 0.47198
    01:16:57 rq         : 0.21159
    01:16:57 rq_stuff   : 0.26759

    opened by anslt 13
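
A note on the 'list' object has no attribute 'cuda' comment above: the error means at least one entry in sample is a Python list rather than a tensor, so .cuda() is not defined on it. Below is a minimal defensive sketch — it assumes PyTorch, and the names dataloader, varargs, and NETWORK_INPUTS are taken from the snippet in the comment, not from the repository's actual code.

import torch

def to_device(value, device):
    # Move tensors to the GPU; recurse into lists produced by the dataloader.
    if torch.is_tensor(value):
        return value.cuda(device=device, non_blocking=True)
    if isinstance(value, list):
        return [to_device(v, device) for v in value]
    return value

for it, sample in enumerate(dataloader):
    sample = {k: to_device(sample[k], varargs['device']) for k in NETWORK_INPUTS}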
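
A note on the resolution question above: when input images are rescaled by a factor s, pinhole camera intrinsics must be rescaled by the same factor, and a stale intrinsics entry in nuscenes.ini could plausibly degrade detection. A minimal sketch of the standard rule follows — the values mirror the nuscenes.ini excerpt quoted above (fy is set equal to fx here, since the quoted value 51276.4172 looks like a typo), and this is not a confirmed fix for the repository.

def scale_intrinsics(intrinsics, s):
    # Focal lengths and the principal point scale linearly with image size.
    return {"fx": intrinsics["fx"] * s, "fy": intrinsics["fy"] * s,
            "px": intrinsics["px"] * s, "py": intrinsics["py"] * s}

intrinsics = {"fx": 1266.4172, "fy": 1266.4172, "px": 816.267, "py": 491.507}
print(scale_intrinsics(intrinsics, 0.5))
# -> {'fx': 633.2086, 'fy': 633.2086, 'px': 408.1335, 'py': 245.7535}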