(CVPR 2022) A minimalistic, mapless, end-to-end stack for joint perception, prediction, planning, and control for self-driving.

Overview

LAV

Learning from All Vehicles
Dian Chen, Philipp Krähenbühl
CVPR 2022 (also arXiv 2203.11934)

This repo contains code for the paper Learning from All Vehicles.

It distills a model that performs joint perception, multi-modal prediction, and planning, and we hope it serves as a great starter kit for end-to-end autonomous driving research.

Reference

If you find our repo, dataset, or paper useful, please cite us as:

@inproceedings{chen2022lav,
  title={Learning from all vehicles},
  author={Chen, Dian and Kr{\"a}henb{\"u}hl, Philipp},
  booktitle={CVPR},
  year={2022}
}

Demo Video

Also check out our website!

Getting Started

  • To run CARLA and train the models, make sure you are using a machine with at least a mid-range GPU.
  • Please follow INSTALL.md to setup the environment.

Training

We adopt an LBC-style staged privileged distillation framework. Please refer to TRAINING.md for more details.
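
For intuition, here is a minimal sketch of what one privileged-distillation step looks like. This is illustrative only and not the repo's actual training code: the teacher, student, optimizer, and batch layout are hypothetical placeholders, and the real stages and losses are described in TRAINING.md.

import torch
import torch.nn.functional as F

def distill_step(teacher, student, batch, optimizer):
    # Hypothetical batch layout: onboard sensor readings plus the privileged
    # ground-truth state that the teacher network consumes.
    sensors, privileged_state = batch
    with torch.no_grad():
        target_plan = teacher(privileged_state)   # teacher waypoints, no gradient
    pred_plan = student(sensors)                  # student sees only raw sensors
    loss = F.l1_loss(pred_plan, target_plan)      # imitate the privileged teacher
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()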

Evaluation

We additionally provide example trained weights in the weights folder if you would like to evaluate directly. They are trained on Town01, 03, 04, and 06. Make sure you launch CARLA with the -vulkan flag.

Note: these are just example weights for quickstart purposes. If you directly submit them to the leaderboard, you will not get 61 DS. The full leaderboard code will be released later.
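
Before launching an evaluation, it can help to sanity-check that the downloaded checkpoints actually deserialize (partial downloads or Git LFS pointer files will fail with an UnpicklingError). A rough sketch, assuming the checkpoint files simply live under weights/:

import glob
import torch

# Try to deserialize every file in the weights folder on the CPU.
for path in sorted(glob.glob("weights/*")):
    try:
        torch.load(path, map_location="cpu")
        print("ok     ", path)
    except Exception as exc:  # e.g. _pickle.UnpicklingError for corrupt/partial files
        print("BROKEN ", path, "->", exc)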

Inside the root LAV repo, run

ROUTES=[PATH TO ROUTES] ./leaderboard/scripts/run_evaluation.sh

Use ROUTES=assets/routes_lav_valid.xml to run our ablation routes, or ROUTES=leaderboard/data/routes_valid.xml for the validation routes provided by the leaderboard.

Dataset

We also release our LAV dataset. Download the dataset HERE.

See TRAINING.md for more details.
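
After downloading, it can also be worth verifying that every sequence folder is intact (see the issue below about missing data.mdb files). A rough sketch, assuming each sequence is an LMDB directory under a local lav_dataset/ folder (the path and layout are placeholders):

import glob
import os
import lmdb

# Open each LMDB sequence read-only and report how many entries it holds.
for seq_dir in sorted(glob.glob("lav_dataset/*")):
    if not os.path.isdir(seq_dir):
        continue
    try:
        with lmdb.open(seq_dir, readonly=True, lock=False) as env:
            print(seq_dir, env.stat()["entries"], "entries")
    except lmdb.Error as exc:
        print(seq_dir, "unreadable:", exc)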

Acknowledgements

We thank Tianwei Yin for the pillar generation code. The ERFNet code is taken from the official ERFNet repo.

License

This repo is released under the Apache 2.0 License (please refer to the LICENSE file for details).

Comments
  • [Question] Questions about dataset details

    Thanks for providing the code, it's amazing work. 🤩

    Here are some questions after I read the paper and code readme:

    1. Is the dataset provided in the repo the whole dataset from Table 7 of the paper, which has 399776 frames?
    2. Is the expert agent from the LBC repo (as linked here), or is it the CARLA behavior agent the paper mentions (also linked here)? I didn't see the code for collecting the dataset.
    3. The paper says the dataset is collected in all towns, but as far as I know the official leaderboard public routes only cover Town01-06, and I didn't find any additional route files in this repo. Would you mind releasing the route files you used for data collection? A fair comparison with your method requires training on the same routes. Also, does "all towns" include other towns like Town07 and Town10HD as in Table 7 of the paper, or did you also build another map for training?

    Looking forward to your reply, and thanks again for this paper and code.

    opened by Kin-Zhang 30
  • About data generation for perception training

    Thanks for your fantastic work! I have read the paper and noticed that the mapping prediction and detection in perception training are both supervised. Since I didn't find details about data generation in the paper, I am really curious how you generate the labelled data for such a large-scale dataset, or how you get the labels for the collected data.

    Looking forward to your reply. Thanks very much!

    opened by xingbw 25
  • Question about loss in the training phase

    Question about the training phase:

    1. The whole dataset has some missing data.mdb files in its folders, and the provided dataset frames are shown here:

      [image]

    2. At the end-to-end training phase, I downloaded the whole dataset and some of the losses look weird. I'd like to ask whether these kinds of loss curves are correct or normal:

      [image]

      The training setup is the same as the default config.yml, with all phases trained for 100 epochs and the BEV stage for 160 epochs, with a suitable batch size. The evaluation on the online leaderboard is really terrible: I just finished the first five routes, and the result file shows serious collisions. I'm wondering whether I missed any step that prevents reproducing the reported results, since only 1% of missing data should not affect the trained model that much.

    Thanks again for your work!

    opened by Kin-Zhang 11
  • Issues with evaluation

    Thank you for your amazing work!

    I have followed INSTALL.md to set up the environment. When I run ROUTES=assets/routes_lav_valid.xml ./leaderboard/scripts/run_evaluation.sh, the following error occurs.

    ========= Preparing RouteScenario_0 (repetition 0) =========

    Setting up the agent

    Could not set up the required agent:

    invalid load key, 'v'.

    Traceback (most recent call last):
      File "/home/abc/carla1/LAV-main/leaderboard/leaderboard/leaderboard_evaluator.py", line 262, in _load_and_run_scenario
        self.agent_instance = getattr(self.module_agent, agent_class_name)(args.agent_config)
      File "/home/abc/carla1/LAV-main/leaderboard/leaderboard/autoagents/autonomous_agent.py", line 45, in __init__
        self.setup(path_to_conf_file)
      File "/home/abc/carla1/LAV-main/team_code/lav_agent.py", line 89, in setup
        bev_planner.load_state_dict(torch.load(self.bev_model_dir))
      File "/home/abc/anaconda3/envs/LAV-env/lib/python3.7/site-packages/torch/serialization.py", line 713, in load
        return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
      File "/home/abc/anaconda3/envs/LAV-env/lib/python3.7/site-packages/torch/serialization.py", line 920, in _legacy_load
        magic_number = pickle_module.load(f, **pickle_load_args)
    _pickle.UnpicklingError: invalid load key, 'v'.

    Registering the route statistics

    PS: torch.__version__ is '1.11.0'.

    any help will be appreciated!

    opened by klaus99815 7
  • How can I visualize the results?

    Thanks to the author for the wonderful code! I would like to ask how to visualize the results of the network output, for example the RGB camera inputs, predicted road geometries, and the detection and motion predictions.

    Looking forward to your reply, thank you very much!

    opened by kevinchiu19 5
  • No module named 'models.lidar'

    Hi,

    I got an error when I ran ROUTES= ./leaderboard/scripts/run_evaluation.sh in the LAV-env conda environment. I am running Ubuntu 18.04 and followed the installation steps. Here is the error message:

    Traceback (most recent call last):
      File "/home/demo/Public/LAV/leaderboard/leaderboard/leaderboard_evaluator.py", line 457, in main
        leaderboard_evaluator = LeaderboardEvaluator(arguments, statistics_manager)
      File "/home/demo/Public/LAV/leaderboard/leaderboard/leaderboard_evaluator.py", line 91, in __init__
        self.module_agent = importlib.import_module(module_name)
      File "/home/demo/anaconda3/envs/LAV-env/lib/python3.7/importlib/__init__.py", line 127, in import_module
        return _bootstrap._gcd_import(name[level:], package, level)
      File "", line 1006, in _gcd_import
      File "", line 983, in _find_and_load
      File "", line 967, in _find_and_load_unlocked
      File "", line 677, in _load_unlocked
      File "", line 728, in exec_module
      File "", line 219, in _call_with_frames_removed
      File "/home/demo/Public/LAV/team_code/lav_agent.py", line 15, in <module>
        from models.lidar import LiDARModel
    ModuleNotFoundError: No module named 'models.lidar'

    How can I fix this? Thanks.

    opened by bobd988 4
  • Which scenario file was used for evaluation?

    Hi,

    I wanted to know which scenario.json file was used for the ablations in this paper; I couldn't find the information in the README. The leaderboard version linked in the repository is from Jan 2021 and has one scenario file (all_towns_traffic_scenarios_public.json), but there seem to be only scenarios 1, 3, and 4 specified within it. In the current master branch of the leaderboard there is also an all_towns_traffic_scenarios_public.json that additionally contains scenarios 2, 7, 8, 9, and 10.

    So, I am curious which scenario file was used for training and evaluation in this work (and which types of scenarios were considered).

    opened by Kait0 3
  • Issues with running

    Hello,

    There is a problem that puzzles me. When running the programs, the CPU usage looks strange: the number of running processes can reach 40! It makes the simulator no longer smooth and increases the system execution time.

    I think my CPU should normally be able to provide good support.

    CPU: Intel® Xeon(R) CPU E5-2630 v4 @ 2.20GHz × 40
    GPU: NVIDIA GeForce RTX 2080 T

    May I ask if anyone has encountered a situation similar to mine? Any help will be appreciated!

    opened by klaus99815 3
  • time cost during evaluation

    Hi, Chen. When I evaluate the method, it costs a lot of time: it took over 1 hour to finish route_0 in the routes_lav_valid.xml file, while the time ratio was under 0.2. In addition, when I collected data via the CARLA challenge (LBC), I found the frame rate is about 4 FPS, which is also quite time-consuming. Meanwhile, the CPU and GPU memory still have many available resources.

    1. Could I accelerate the evaluation or data collection process, for example by setting the FPS? Or does the ratio only depend on my device's performance?
    2. To collect a large dataset like the one LAV provides, did you start multiple processes to collect data simultaneously? How long did it take to collect the provided dataset?

    BTW, I just used one TITAN Xp for the above attempts. Looking forward to your reply.

    opened by Watson52 2
  • Cropped other bevs shift from locs

    Hello, Sorry to bother you.

    In bev_planner_v2.py, when plotting the cropped_other_bev, I find that some BEVs shift away from their original locs. As shown in Fig. (c), the locs are plotted as red dots, which shift away from the vehicle colored yellow. I have already set feature_x_jitter, feature_y_jitter, x_jitter, and a_jitter to 0. The crop_feature code can be found below:

    locs_jitter = (torch.rand((K,2))*2-1).float().to(locs.device) * self.feature_x_jitter
    locs_jitter[:,1] = 0
    oris_jitter = (torch.rand((K,))*2-1).float().to(oris.device) * self.feature_angle_jitter
    cropped_other_bev = self.crop_feature(flat_bev, flat_rel_loc0+locs_jitter, flat_rel_ori0+oris_jitter,
                                          pixels_per_meter=self.pixels_per_meter, crop_size=self.crop_size*2)

    How can I crop other BEVs as in Fig. (b)? Thank you for your time.

    [figure: cropped_other_bev]
    opened by KP-Zhang 1
  • Is locs[0] equivalent to ego_locs?

    Hello, Thank you for your wonderful work.

    When reading your code, I find that in temporal_lidar_painted_dataset.py, locs[0] is equal to ego_locs, while in temporal_bev_dataset.py, locs[0] is different from ego_locs. The disagreement is caused by:

    ego_locs = rotate_points(ego_locs, -angle, ego_locs[0]) + [offset/self.pixels_per_meter, 0]

    After rotating ego_locs, ego_locs[0] has already been changed, which affects the following line:

    locs = rotate_points(locs, -angle, ego_locs[0]) + [offset/self.pixels_per_meter, 0]

    Is locs[0] supposed to be equivalent to ego_locs?

    Thank you for your time.

    opened by KP-Zhang 1
  • Can semantic image distinguish between red and green lights?

    Hi Chen. I find that the semantic segmentation camera groups traffic lights together, but I'm not sure whether I can distinguish between different colors of lights. Looking forward to your reply.

    opened by Watson52 0
  • Porting LAV to BeamNG.tech

    Hello,

    I am a student working on a project involving testing ADAS/AV and scenario synthesis using BeamNG.tech, and I would love to run (test) your driving agent in that simulator. I know CARLA is kind of a de facto standard, but IMHO BeamNG.tech is superior when it comes to physics simulation, content, and flexibility. Further, BeamNG.tech is free for research, offers a Python API just like CARLA, and implements a wide range of sensors.

    So I wonder how technically difficult it would be to port LAV to BeamNG.tech and whether any of you could support me (and my colleagues) in doing so. Hope to hear from you soon.

    Thanks!

    -- Benedikt Steininger

    opened by Stoneymon 0
  • Data Collection Failed

    Thanks for your great work! It helped me a lot!

    But I still have a question: when I run python data_collect.py --num-runners=8 in the terminal, it gives me this error:

    (pid=raylet)   File "/home/jianli/anaconda3/envs/LAV-env2/lib/python3.7/socket.py", line 752, in getaddrinfo
    (pid=raylet)     for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
    (pid=raylet) socket.gaierror: [Errno -2] Name or service not known
    

    Has anyone met this problem before? Is there anything I can do to solve it? Thank you in advance!

    opened by JianLiMech 0
  • LAV ego_plan_locs time interval and the EKF module

    Hello @dotchen, impressive work! I have run your latest LAV agent with the CARLA Leaderboard. I am trying to understand the detailed implementation and have run across a couple of questions:

    1. The LAV agent plans 20 steps for the ego vehicle. Are the 20 points in ego_plan_locs distributed at an equal time interval? If so, what is the time interval, and does the object prediction use the same one? Judging from the coordinates in the generated ego_plan_locs, a reasonable interval would be between 0.3s and 0.4s, assuming the same interval for all 20 steps.

    2. In the new release, there's an EKF module. But I still observed quite some noise in the EKF output. The ego loc was still constantly jumping around in a small range. Is there any plan to improve this? Or am I missing something here?

    I'd appreciate your insights on these questions a lot! Thanks in advance!

    opened by yuting-fu 0
  • A suggestion on the threshold for brake prediction

    I found that in the previous V1 agent, the threshold for brake prediction was 0.3 in lav_agent.py:

    if float(pred_bra) > 0.3:
                throt, brake = 0, 1
    

    However, the threshold changed to 0.1 in the V2 agent:

    if float(pred_bra) > 0.1:
                throt, brake = 0, 1
    

    This makes the car stop in the middle of the road in many scenarios.

    I personally changed the code to use the throttle as the threshold to fix the problem:

    if float(pred_bra) > throt:
                throt, brake = 0, 1
    
    opened by CAS-LRJ 4