RTS3D: Real-time Stereo 3D Detection from 4D Feature-Consistency Embedding Space for Autonomous Driving

Overview

RTS3D: Real-time Stereo 3D Detection from 4D Feature-Consistency Embedding Space for Autonomous Driving (AAAI2021).

RTS3D is an efficient and accurate stereo 3D object detection method for autonomous driving.


Introduction

RTS3D is the first truly real-time system (FPS > 24) for stereo-image 3D detection, while achieving a 10% improvement in average precision over the previous state-of-the-art method. RTS3D requires only RGB images, with no synthetic data, instance segmentation, CAD models, or depth generator.

Highlights

  • Fast: 33 FPS single-image inference speed on the KITTI benchmark at 384×1280 resolution (see the timing sketch below).
  • Accurate: state-of-the-art results on the KITTI benchmark.
  • Anchor-free: no 2D or 3D anchors are required.
  • Easy to deploy: RTS3D uses only conventional convolution operations and MLPs, so it is easy to deploy and accelerate.
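
For readers who want to reproduce the FPS numbers, a minimal timing harness is sketched below. This is a generic sketch, not the repository's own benchmark script; the `model` callable, warm-up count, and stereo input shapes are assumptions (only the 384×1280 resolution comes from the highlights above).

```python
import time
import torch

def measure_fps(model, iters=100, height=384, width=1280, device="cuda"):
    """Rough FPS measurement for a stereo pair at 384x1280.

    `model` is any callable taking (left, right) image tensors;
    this is a hypothetical harness, not the RTS3D benchmark code.
    """
    left = torch.randn(1, 3, height, width, device=device)
    right = torch.randn(1, 3, height, width, device=device)
    model.eval()
    with torch.no_grad():
        for _ in range(10):           # warm-up so CUDA kernels are compiled
            model(left, right)
        torch.cuda.synchronize()      # finish all pending GPU work
        start = time.time()
        for _ in range(iters):
            model(left, right)
        torch.cuda.synchronize()      # wait for the GPU before stopping the clock
    return iters / (time.time() - start)
```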

RTS3D Baseline and Model Zoo

All experiments are tested on Ubuntu 16.04 with PyTorch 1.0.0, CUDA 9.0, Python 3.6, and a single NVIDIA 2080Ti.

IoU Setting 1: Car IoU > 0.5, Pedestrian IoU > 0.25, Cyclist IoU > 0.25

IoU Setting 2: Car IoU > 0.7, Pedestrian IoU > 0.5, Cyclist IoU > 0.5
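
Expressed as a plain configuration (illustrative only; the bundled kitti_eval code keeps its own threshold tables):

```python
# Per-class IoU thresholds for the two evaluation settings above.
IOU_THRESHOLDS = {
    1: {"Car": 0.5, "Pedestrian": 0.25, "Cyclist": 0.25},  # Setting 1
    2: {"Car": 0.7, "Pedestrian": 0.5,  "Cyclist": 0.5},   # Setting 2
}
```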

  • Training on KITTI train split and evaluation on val split.
| Class | Iteration | FPS | AP BEV, Setting 1 (Easy / Moderate / Hard) | AP 3D, Setting 1 (Easy / Moderate / Hard) | AP BEV, Setting 2 (Easy / Moderate / Hard) | AP 3D, Setting 2 (Easy / Moderate / Hard) |
|---|---|---|---|---|---|---|
| Car (Recall-11) | 1 | 90.9 | 89.83 / 77.05 / 68.28 | 89.27 / 70.12 / 61.17 | 73.20 / 53.62 / 46.44 | 60.87 / 42.38 / 36.44 |
| Car (Recall-40) | 1 | 90.9 | 92.92 / 76.17 / 66.62 | 90.35 / 71.37 / 63.52 | 78.12 / 54.75 / 47.09 | 60.34 / 39.32 / 32.97 |
| Car (Recall-11) | 2 | 45.5 | 90.41 / 78.70 / 70.03 | 90.26 / 77.23 / 68.28 | 76.56 / 56.46 / 48.20 | 63.65 / 44.50 / 37.48 |
| Car (Recall-40) | 2 | 45.5 | 95.75 / 79.61 / 69.69 | 93.57 / 76.64 / 66.72 | 78.12 / 54.75 / 47.09 | 63.99 / 41.78 / 34.96 |
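
The Recall-11 / Recall-40 labels refer to how many fixed recall positions the KITTI protocol samples when turning a precision-recall curve into a single AP value: the original benchmark used 11 points including recall 0, the revised protocol uses 40 points starting at 1/40. A minimal NumPy sketch of the difference, assuming `recalls` and `precisions` are already computed from matched detections:

```python
import numpy as np

def interpolated_ap(recalls, precisions, num_points):
    """AP as the mean of interpolated precision at fixed recall positions.

    Recall-11 samples recall at 0.0, 0.1, ..., 1.0; Recall-40 samples at
    1/40, 2/40, ..., 40/40. At each sample point, precision is taken as the
    maximum precision achieved at any recall >= that point.
    """
    if num_points == 11:
        sample_points = np.linspace(0.0, 1.0, 11)
    else:  # Recall-40 skips recall = 0
        sample_points = np.linspace(1.0 / 40, 1.0, 40)
    ap = 0.0
    for r in sample_points:
        mask = recalls >= r
        ap += precisions[mask].max() if mask.any() else 0.0
    return ap / num_points
```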
  • Training on KITTI train split and evaluation on val split.
    • FCE Space Resolution: 10 × 10 × 10 (see the sampling sketch after the table below)
    • Recall split: 11
    • Iteration: 2
    • Model: (Google Drive), (Baidu Cloud, extraction code: 4t4u)
| Class | AP BEV, Setting 1 (Easy / Moderate / Hard) | AP 3D, Setting 1 (Easy / Moderate / Hard) | AP BEV, Setting 2 (Easy / Moderate / Hard) | AP 3D, Setting 2 (Easy / Moderate / Hard) |
|---|---|---|---|---|
| Car | 90.18 / 78.46 / 69.76 | 89.88 / 76.64 / 67.86 | 74.95 / 54.07 / 46.78 | 58.50 / 39.74 / 34.83 |
| Pedestrian | 57.12 / 48.82 / 40.88 | 56.36 / 48.29 / 40.22 | 32.16 / 26.31 / 21.28 | 26.95 / 20.77 / 19.74 |
| Cyclist | 54.48 / 35.78 / 30.80 | 53.86 / 30.90 / 30.52 | 33.59 / 20.80 / 20.14 | 31.05 / 20.26 / 18.93 |
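
The FCE resolution above sets the sampling density of the 4D feature-consistency embedding space: each 3D proposal is discretized into a 10×10×10 grid of points, each point is projected into the left and right feature maps, and the paired samples form the embedding from which feature consistency is learned. The sketch below shows only that grid-sampling step under simplifying assumptions (axis-aligned box, hypothetical `P_left`/`P_right` projection matrices); the actual implementation also rotates the grid by the box orientation and feeds the embedding to an MLP.

```python
import torch
import torch.nn.functional as F

def fce_grid_features(box_center, box_size, feat_left, feat_right,
                      P_left, P_right, res=10):
    """Sample a res**3 grid inside a 3D box and gather left/right features.

    box_center, box_size: (3,) tensors in camera coordinates (axis-aligned
    here; the real method also rotates the grid by the box orientation).
    feat_*: (1, C, H, W) feature maps; P_*: (3, 4) projection matrices.
    """
    # Uniform grid of res**3 points spanning the box.
    lin = torch.linspace(-0.5, 0.5, res)
    gz, gy, gx = torch.meshgrid(lin, lin, lin, indexing="ij")
    pts = torch.stack([gx, gy, gz], dim=-1).reshape(-1, 3) * box_size + box_center

    feats = []
    for feat, P in ((feat_left, P_left), (feat_right, P_right)):
        # Project 3D points to pixels: [u, v, w]^T = P @ [x, y, z, 1]^T.
        homo = torch.cat([pts, torch.ones(pts.shape[0], 1)], dim=1)  # (N, 4)
        uvw = homo @ P.t()                                           # (N, 3)
        uv = uvw[:, :2] / uvw[:, 2:3]
        # Normalize pixel coordinates to [-1, 1] for grid_sample.
        h, w = feat.shape[-2:]
        uv = uv / torch.tensor([w - 1.0, h - 1.0]) * 2 - 1
        feats.append(F.grid_sample(feat, uv.view(1, 1, -1, 2),
                                   align_corners=True))
    # Stack left/right samples into a (1, 2C, res, res, res) volume; the
    # similarity of the two halves is what "feature consistency" measures.
    return torch.cat(feats, dim=1).view(1, -1, res, res, res)
```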

Installation

Please refer to INSTALL.md

Dataset preparation

Please download the official KITTI 3D object detection dataset and organize the downloaded files as follows:

KM3DNet
├── kitti_format
│   ├── data
│   │   ├── kitti
│   │   │   ├── annotations
│   │   │   ├── calib (000000.txt, ...)
│   │   │   ├── image (left: [0-7480], right: [7481-14961] for input augmentation)
│   │   │   ├── label (000000.txt, ...)
│   │   │   ├── train.txt, val.txt, trainval.txt
│   │   │   ├── mono_results (000000.txt, ...)
├── src
├── demo_kitti_format
├── readme
├── requirements.txt
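
Before training, the layout can be sanity-checked with a few lines of Python. The sketch below only tests that the directories and split files above exist; the root path is an assumption about where the repo is cloned.

```python
from pathlib import Path

# Hypothetical sanity check for the directory layout shown above.
ROOT = Path("KM3DNet/kitti_format/data/kitti")
expected_dirs = ["annotations", "calib", "image", "label", "mono_results"]
expected_files = ["train.txt", "val.txt", "trainval.txt"]

for name in expected_dirs:
    assert (ROOT / name).is_dir(), f"missing directory: {ROOT / name}"
for name in expected_files:
    assert (ROOT / name).is_file(), f"missing split file: {ROOT / name}"
print("KITTI layout looks correct.")
```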

Getting Started

Please refer to GETTING_STARTED.md to learn more usage about this project.

Acknowledgement

License

RTS3D is released under the MIT License (refer to the LICENSE file for details). Portions of the code are borrowed from CenterNet, iou3d, and kitti_eval (KITTI dataset evaluation). Please refer to the original licenses of these projects (see NOTICE).

Citation

If you find this project useful for your research, please use the following BibTeX entry.

@misc{2012.15072,
  Author = {Peixuan Li and Shun Su and Huaici Zhao},
  Title = {RTS3D: Real-time Stereo 3D Detection from 4D Feature-Consistency Embedding Space for Autonomous Driving},
  Year = {2020},
  Eprint = {arXiv:2012.15072},
}