Complete* list of autonomous driving related datasets

Overview

AD Datasets

Complete* and curated list of autonomous driving related datasets

Contributing

Contributions are very welcome! To add or update a dataset:

  • Update my-app/src/data.js

  • Make sure the dataset you add or edit has as many attributes as possible filled out:

    • Some attributes can only be found in associated papers
    • Some attributes can only be found in associated websites
    • Some attributes can only be found in the dataset itself
  • Send a pull request based on the created fork

Example Contribution

This is how the KITTI dataset is integrated into the website:

[...]
{
    id: "KITTI", // 07.08. done
    href: "http://www.cvlibs.net/datasets/kitti/",
    size_hours: "6",
    size_storage: "180",
    frames: "",
    numberOfScenes: '50',
    samplingRate: "10",
    lengthOfScenes: "",
    sensors: "camera, lidar, gps/imu",
    sensorDetail: "2 greyscale cameras 1.4 MP, 2 color cameras 1.4 MP, 1 lidar 64 beams 360° 10Hz, 1 inertial and " +
        "GPS navigation system",
    benchmark: "stereo, optical flow, visual odometry, slam, 3d object detection, 3d object tracking",
    annotations: "3d bounding boxes",
    licensing: "Creative Commons Attribution-NonCommercial-ShareAlike 3.0",
    relatedDatasets: 'Semantic KITTI, KITTI-360',
    publishDate: new Date("2012-03").toISOString().split('T')[0], // ISO "YYYY-MM" is parsed as UTC
    lastUpdate: new Date("2021-02").toISOString().split('T')[0],
    relatedPaper: "http://www.cvlibs.net/publications/Geiger2013IJRR.pdf",
    location: "Karlsruhe, Germany",
    rawData: "Yes"
},
[...]
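The request to fill out as many attributes as possible can be checked before sending a pull request. The following is a minimal sketch, not part of the repository: the attribute names are copied from the KITTI entry above, while `ATTRIBUTES`, `missingAttributes`, and `kittiExcerpt` are hypothetical names introduced for illustration.

```javascript
// Attribute names taken from the KITTI example entry above.
const ATTRIBUTES = [
  "id", "href", "size_hours", "size_storage", "frames", "numberOfScenes",
  "samplingRate", "lengthOfScenes", "sensors", "sensorDetail", "benchmark",
  "annotations", "licensing", "relatedDatasets", "publishDate", "lastUpdate",
  "relatedPaper", "location", "rawData",
];

// Hypothetical helper: returns the attributes that are absent or blank.
function missingAttributes(entry) {
  return ATTRIBUTES.filter((key) => {
    const value = entry[key];
    return value === undefined || value === null || String(value).trim() === "";
  });
}

// Shortened excerpt of the KITTI entry, for illustration only.
const kittiExcerpt = {
  id: "KITTI",
  href: "http://www.cvlibs.net/datasets/kitti/",
  size_hours: "6",
  frames: "",
  lengthOfScenes: "",
};

// Lists every attribute that still needs a value in the excerpt.
console.log(missingAttributes(kittiExcerpt));
```

An empty string is not necessarily an error. As the Metadata section explains, some fields are left empty on purpose, so the output is best read as a checklist rather than a validation failure.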

* You're missing a dataset? Simply create a pull request ;)

Metadata

This section explains how the entries for each property were determined.

Annotations

This property describes the types of annotations provided with the data sets.

Benchmark

If benchmark challenges are explicitly listed with the data sets, they are specified here.

Frames

Frames states the number of frames in the data set. This includes training, test and validation data.

Last Update

If information has been provided on updates and their dates, they can be found in this category.

Licensing

In order to give the users an impression of the licenses of the data sets, information on them is already included in the tool.

Location

This category lists the areas where the data sets have been recorded.

N° Scenes

N° Scenes shows the number of scenes contained in the data set and includes the training, testing and validation segments. In the case of video recordings, one recording corresponds to one scene. For data sets consisting of photos, one photo is equivalent to one scene.
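The counting rule can be sketched as a sum over the splits. This is an illustration only: `countScenes` is a hypothetical helper, and the split sizes are made-up numbers, not taken from any listed data set.

```javascript
// Hypothetical helper: each split maps to its number of recordings
// (video data) or photos (image data). One recording or one photo
// counts as one scene, and all splits are included in the total.
function countScenes(splits) {
  return Object.values(splits).reduce((total, n) => total + n, 0);
}

// Made-up split sizes for illustration.
const exampleSplits = { training: 700, validation: 150, testing: 150 };

console.log(countScenes(exampleSplits)); // prints 1000
```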

Publish Date

The initial publication date of the data set can be found under this category. If no explicit information on the date of publication of the data set could be found, the submission date of the paper related to the set was used at this point.

Related Data Sets

If data sets are related, the names of the related sets are listed here. Related data sets are, for example, those published by the same authors and building on one another.

Related Paper

This property solely consists of a link to the paper related to the data set.

Sampling Rate [Hz]

The Sampling Rate [Hz] property specifies the sampling rate in Hertz at which the sensors in the data set work. However, this declaration is only made if all sensors are working at the same rate or, alternatively, if the sensors are being synchronized. Otherwise the field remains empty.

Scene Length [s]

This property describes the length of the scenes in seconds in the data set, provided all scenes have the same length. Otherwise no information is given. For example, if a data set has scenes with lengths between 30 and 60 seconds, no entry is made. The reason is to keep the entries comparable and sortable.
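The "same length or no entry" rule can be expressed in a few lines. This is a sketch under assumptions, not code from the repository: `uniformValueOrEmpty` is a hypothetical helper, and the empty string follows the convention of the KITTI entry, where `lengthOfScenes` is `""`.

```javascript
// Hypothetical helper: returns the shared value as a string if every
// scene has the same length, otherwise an empty string (no entry).
function uniformValueOrEmpty(values) {
  if (values.length === 0) return "";
  const first = values[0];
  return values.every((v) => v === first) ? String(first) : "";
}

// All scenes are 20 s long, so the field gets a value.
const uniformLength = uniformValueOrEmpty([20, 20, 20]); // "20"

// Scene lengths vary between 30 s and 60 s, so the field stays empty.
const mixedLength = uniformValueOrEmpty([30, 45, 60]); // ""

console.log({ uniformLength, mixedLength });
```

Leaving the field empty, rather than storing a range such as "30-60", keeps the column numerically sortable.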

Sensor Types

This category contains a rough description of the sensor types used. Sensor types are, for example, lidar or radar.
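In the KITTI entry, the sensor types are stored as one comma-separated string (`"camera, lidar, gps/imu"`). A consumer of the data could split that string into individual types, for example to filter by sensor. The helpers below are a hypothetical sketch and are not claimed to be how the website processes the field.

```javascript
// Hypothetical helper: split the comma-separated `sensors` string
// into a list of trimmed, lower-case sensor types.
function parseSensorTypes(sensors) {
  return sensors
    .split(",")
    .map((s) => s.trim().toLowerCase())
    .filter((s) => s.length > 0);
}

// Hypothetical helper: check whether an entry lists a given sensor type.
function hasSensor(entry, type) {
  return parseSensorTypes(entry.sensors || "").includes(type.toLowerCase());
}

// `sensors` value copied from the KITTI example entry.
const kittiSensors = { id: "KITTI", sensors: "camera, lidar, gps/imu" };

console.log(parseSensorTypes(kittiSensors.sensors));
console.log(hasSensor(kittiSensors, "lidar"), hasSensor(kittiSensors, "radar"));
```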

Sensors - Details

The Sensors - Details category extends the Sensor Types category with a more detailed description of the sensors: their type and number, the frame rates they work with, their resolutions, and their horizontal field of view.

Size [GB]

The category Size [GB] describes the storage size of the data set in gigabytes.

Size [h]

The Size [h] property is the equivalent of the Size [GB] described above, but provides information on the size of the data set in hours.

Location

The place(s) where the data was recorded.

rawData

Denotes whether the dataset provides raw or processed data.

Citation

If you find this code useful for your research, please cite our paper:

@article{Bogdoll_addatasets_2022_VEHITS,
    author    = {Bogdoll, Daniel and Schreyer, Felix and Z\"{o}llner, J. Marius},
    title     = {{ad-datasets: a meta-collection of data sets for autonomous driving}},
    journal   = {arXiv preprint:2202.01909},
    year      = {2022},
}
Comments
  • Add missing argoverse entries and sort by # of citations

    Hi @daniel-bogdoll, thanks for compiling this list. I'm a co-author of the Argoverse datasets, and I added missing details about our datasets.

    I also figured the most appropriate way to sort the datasets would be by the influence/impact, as measured by # of citations to the corresponding dataset paper. As of Google Scholar today, the number of citations I found for many of these were:

    • KITTI: 4566
    • nuScenes: 837
    • Oxford Robot Car: 765
    • Waymo Open Dataset: 317
    • Argoverse: 302
    • Semantic KITTI: 295
    • Apolloscape: 275
    • BDD: 240
    • WildDash: 59
    • Lyft L5: 42
    • Cityscapes3D: 5

    opened by johnwlambert 5
  • sort datasets by citation impact, and add missing details about Argoverse Stereo

    Using the order mentioned here: https://github.com/daniel-bogdoll/ad-datasets/pull/11

    Resolves https://github.com/daniel-bogdoll/ad-datasets/issues/17

    The content of all entries is identical, except for Argoverse Stereo, which has some updated fields.

    opened by johnwlambert 3
  • BDD100K License incorrect

    https://github.com/daniel-bogdoll/ad-datasets/blob/b78e51d733c81b95579ab8322d907be1a0463767/my-app/src/data.json#L221

    According to https://doc.bdd100k.com/license.html, the BSD 3-Clause license applies to the code; the data is under a different license, available at that link.

    opened by miquelmarti 1
  • Sort datasets by citations

    I also figured the most appropriate way to sort the datasets would be by the influence/impact, as measured by # of citations to the corresponding dataset paper. As of Google Scholar today, the number of citations I found for many of these were:

    • KITTI: 4566
    • nuScenes: 837
    • Oxford Robot Car: 765
    • Waymo Open Dataset: 317
    • Argoverse: 302
    • Semantic KITTI: 295
    • Apolloscape: 275
    • BDD: 240
    • WildDash: 59
    • Lyft L5: 42
    • Cityscapes3D: 5

    • See PR -
    enhancement 
    opened by daniel-bogdoll 1
  • Enhance readme with paper information

    In the paper, it is explained why some fields are empty (e.g. if not all sensors work with the same frequency). Such key information should be added to the README or the FAQ

    enhancement 
    opened by daniel-bogdoll 1
  • Tags

    Maybe in a future version (depending on Material-UI?) it would be nice to add tags instead of raw text (e.g. for sensors: "radar" "lidar" etc.)

    enhancement 
    opened by daniel-bogdoll 0
  • Check if ChatGPT knows more than we do

    ChatGPT proposed the following lists of datasets:

    (50/50) KITTI Waymo Open Dataset BDD100K ApolloScape nuScenes Cityscapes Mapillary Vistas BOSCH Small Traffic Lights Oxford RobotCar Argoverse SYNTHIA Virtual KITTI Virtual KITTI 2 NVIDIA Self-driving Car Baidu Apollo Scape Autonomous Driving Dataset Lyft Level 5 nuTonomy Scenes Berlin Cityscapes DeepDrive BOSCH Small Traffic Lights Daimler Urban Segmentation Ford Campus Vision and Lidar Lyft Motion Prediction Waymo Open Dataset Motion Segmentation DeepMotion Drive360 KAIST Multispectral Pedestrian Detection KAIST Multispectral Pedestrian Tracking Mapillary Vistas Panoptic Segmentation Stanford 2D-3D-Semantics Cityscapes Instance Segmentation Cityscapes Panoptic Segmentation PASCAL VOC INRIA Person Caltech Pedestrian ETH Pedestrian TUD Pedestrian KITTI Object Detection KITTI Object Tracking KITTI Stereo KITTI Odometry KITTI Flow KITTI Raw Data KITTI Semantic Segmentation KITTI Object Detection 2D KITTI Object Detection 3D KITTI Tracking KITTI Motion Segmentation KITTI Visual Odometry

    (84/100) Argoverse Autonomous Driving Dataset BDD100K BOSCH Small Traffic Lights Baidu Apollo Scape Berlin Cityscapes Caltech Pedestrian Cityscapes Cityscapes Instance Segmentation Cityscapes Panoptic Segmentation DeepDrive DeepMotion Daimler Urban Segmentation ETH Pedestrian Ford Campus Vision and Lidar INRIA Person KAIST Multispectral Pedestrian Detection KAIST Multispectral Pedestrian Tracking KITTI Object Detection 2D KITTI Object Detection 3D KITTI Object Detection KITTI Object Tracking KITTI Odometry KITTI Raw Data KITTI Semantic Segmentation KITTI Stereo KITTI Tracking KITTI Visual Odometry Lyft Level 5 Lyft Motion Prediction Mapillary Vistas Mapillary Vistas Panoptic Segmentation NVIDIA Self-driving Car Oxford RobotCar PASCAL VOC Stanford 2D-3D-Semantics SYNTHIA TUD Pedestrian Virtual KITTI Virtual KITTI 2 Waymo Open Dataset Waymo Open Dataset Motion Segmentation nuScenes nuTonomy Scenes A2D2 BDD100K Detection BDD100K Segmentation BDD100K Tracking BDD100K Video DeepDrive Object Detection DeepDrive Object Tracking DeepDrive Video Drive360 EuroCity Persons KITTI Depth Completion KITTI Stereo Visual Odometry KITTI Tracking Benchmark Lyft Level 5 Motion Prediction Lyft Level 5 Object Detection Lyft Level 5 Tracking Lyft Level 5 Video Mapillary Vistas Instance Segmentation Mapillary Vistas Semantic Segmentation nuScenes Detection nuScenes Instance Segmentation nuScenes Lidar nuScenes Semantic Segmentation nuScenes Tracking nuTonomy Scenes Detection nuTonomy Scenes Instance Segmentation nuTonomy Scenes Lidar nuTonomy Scenes Semantic Segmentation nuTonomy Scenes Tracking Cityscapes Instance Segmentation Fine Cityscapes Panoptic Segmentation Fine DeepDrive Segmentation KITTI Object Detection 2D Hard KITTI Object Detection 3D Easy KITTI Object Detection 3D Moderate KITTI Object Detection 3D Hard KITTI Object Detection Bird's Eye View KITTI Object Detection Car KITTI Object Detection Cyclist KITTI Object Detection People

    opened by daniel-bogdoll 1
  • Add DOI and semantic scholar ID to all entries (if available)

    I want to retrieve all semantic scholar paperIds to facilitate writing API requests for my paper. I think, it would be a great benefit for future projects to add the paperIds (and while I am at it all available DOIs if some are still missing).

    opened by JonasHendl 0
  • #Citations for papers without arXiv or DOI

    Hey Felix,

    there are a few papers that were published without an arXiv entry or DOI. This one is an example:

    { "id": "Argoverse 2", "href": "https://www.argoverse.org/av2.html", "relatedPaper": "https://datasets-benchmarks-proceedings.neurips.cc/paper/2021/file/4734ba6f3de83d861c3176a6273cac6d-Paper-round2.pdf" },

    The Neurips Proceedings do not provide a DOI: https://datasets-benchmarks-proceedings.neurips.cc/paper/2021/hash/4734ba6f3de83d861c3176a6273cac6d-Abstract-round2.html

    However, SemanticScholar has the paper listed and provides the number of citations: https://www.semanticscholar.org/paper/Argoverse-2%3A-Next-Generation-Datasets-for-and-Wilson-Qi/048056c0321876c0d582c3bf40b1883cda9260d5#citing-papers

    Maybe we should check for the few remaining papers, if it would always be a solution to choose a third option, such as the Corpus ID (Corpus ID: 244906596) provided by SemanticScholar?

    opened by daniel-bogdoll 1
  • Search partially not working

    I added the "GTA5" dataset yesterday, but it does not show up in the search results.

    However, when I search for KITTI, it works. Did the build fail or something similar?

    opened by daniel-bogdoll 3
Owner
Daniel Bogdoll
PhD student at FZI and KIT with a focus on deep learning and autonomous driving.