CARLA-Roach
This is the official code release of the paper
End-to-End Urban Driving by Imitating a Reinforcement Learning Coach
by Zhejun Zhang, Alexander Liniger, Dengxin Dai, Fisher Yu and Luc van Gool, accepted at ICCV 2021.
It contains the code for benchmarking, off-policy data collection, on-policy data collection, RL training, and IL training with DAGGER. It also contains the trained models of the RL experts and IL agents. The supplementary videos can be found on the paper's homepage.
Installation
Please refer to INSTALL.md for installation. We use AWS EC2, but you can also install and run all experiments on your computer or cluster.
Quick Start: Collect an expert dataset using Roach
Roach is an end-to-end trained agent that drives better and more naturally than hand-crafted CARLA experts. To collect a dataset from Roach, use `run/data_collect_bc.sh` and modify the following arguments:

- `save_to_wandb`: set to `False` if you don't want to upload the dataset to W&B.
- `dataset_root`: local directory for saving the dataset.
- `test_suites`: default is `eu_data`, which collects data in Town01 for the NoCrash-dense benchmark. Available configurations are found here. You can also create your own configuration.
- `n_episodes`: how many episodes to collect; each episode is saved to a separate h5 file.
- `agent/cilrs/obs_configs`: observation (i.e. sensor) configuration; default is `central_rgb_wide`. Available configurations are found here. You can also create your own configuration.
- `inject_noise`: default is `True`. As introduced in CILRS, triangular noise is injected into steering and throttle so that the ego-vehicle does not always follow the lane center. Very useful for imitation learning.
- `actors.hero.terminal.kwargs.max_time`: maximum duration of an episode, in seconds.
- The episode stops early if a traffic rule is violated, so that the collected dataset is error-free.
  - `actors.hero.terminal.kwargs.no_collision`: default is `True`.
  - `actors.hero.terminal.kwargs.no_run_rl`: default is `False`.
  - `actors.hero.terminal.kwargs.no_run_stop`: default is `False`.
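For example, the overrides above could be combined into a single run roughly as sketched below. This is only a sketch: the `agent=ppo` selection and the `dataset_root` path are assumptions, and the defaults baked into `run/data_collect_bc.sh` are not reproduced here.

```bash
# Sketch only: collect a Roach dataset on the NoCrash training suite.
# dataset_root is a placeholder; in practice, edit run/data_collect_bc.sh instead.
python -u data_collect.py \
  agent=ppo \
  agent.ppo.wb_run_path=iccv21-roach/trained-models/1929isj0 \
  save_to_wandb=False \
  dataset_root=/home/ubuntu/dataset/roach_eu_data \
  test_suites=eu_data \
  n_episodes=80 \
  inject_noise=True
```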
Benchmark
To benchmark checkpoints, use `run/benchmark.sh` and modify the arguments to select different settings. We recommend `g4dn.xlarge` with 50 GB of free disk space for video recording. Use `screen` if you want to run it in the background:

```bash
screen -L -Logfile ~/screen.log -d -m run/benchmark.sh
```
Trained Models
The trained models are hosted here on W&B. Given the corresponding W&B run path, our code will automatically download and load the checkpoint with the configuration yaml file.
The following checkpoints are used to produce the results reported in our paper.
- To benchmark the Autopilot, use `benchmark()` with `agent="roaming"`.
- To benchmark the RL experts, use `benchmark()` with `agent="ppo"` and set `agent.ppo.wb_run_path` to one of the following.
  - `iccv21-roach/trained-models/1929isj0`: Roach
  - `iccv21-roach/trained-models/1ch63m76`: PPO+beta
  - `iccv21-roach/trained-models/10pscpih`: PPO+exp
- To benchmark the IL agents, use `benchmark()` with `agent="cilrs"` and set `agent.cilrs.wb_run_path` to one of the following.
  - Checkpoints trained for the NoCrash benchmark, at DAGGER iteration 5:
    - `iccv21-roach/trained-models/39o1h862`: L_A(AP)
    - `iccv21-roach/trained-models/v5kqxe3i`: L_A
    - `iccv21-roach/trained-models/t3x557tv`: L_K
    - `iccv21-roach/trained-models/1w888p5d`: L_K+L_V
    - `iccv21-roach/trained-models/2tfhqohp`: L_K+L_F
    - `iccv21-roach/trained-models/3vudxj38`: L_K+L_V+L_F
    - `iccv21-roach/trained-models/31u9tki7`: L_K+L_F(c)
    - `iccv21-roach/trained-models/aovrm1fs`: L_K+L_V+L_F(c)
  - Checkpoints trained for the LeaderBoard benchmark, at DAGGER iteration 5:
    - `iccv21-roach/trained-models/1myvm4mw`: L_A(AP)
    - `iccv21-roach/trained-models/nw226h5h`: L_A
    - `iccv21-roach/trained-models/12uzu2lu`: L_K
    - `iccv21-roach/trained-models/3ar2gyqw`: L_K+L_V
    - `iccv21-roach/trained-models/9rcwt5fh`: L_K+L_F
    - `iccv21-roach/trained-models/2qq2rmr1`: L_K+L_V+L_F
    - `iccv21-roach/trained-models/zwadqx9z`: L_K+L_F(c)
    - `iccv21-roach/trained-models/21trg553`: L_K+L_V+L_F(c)
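For example, benchmarking the Roach RL expert amounts to setting the overrides above; a rough sketch is shown below. The `benchmark.py` entry-point name and the chosen `test_suites` value are assumptions; in practice you would set these inside `run/benchmark.sh`.

```bash
# Sketch only: benchmark the Roach checkpoint on NoCrash-busy, train town & train weather.
# The benchmark.py entry-point name is an assumption; adapt run/benchmark.sh instead.
python -u benchmark.py \
  agent=ppo \
  agent.ppo.wb_run_path=iccv21-roach/trained-models/1929isj0 \
  test_suites=eu_test_tt
```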
Available Test Suites
Set the argument `test_suites` to one of the following.

- NoCrash-busy
  - `eu_test_tt`: NoCrash, busy traffic, train town & train weather
  - `eu_test_tn`: NoCrash, busy traffic, train town & new weather
  - `eu_test_nt`: NoCrash, busy traffic, new town & train weather
  - `eu_test_nn`: NoCrash, busy traffic, new town & new weather
  - `eu_test`: eu_test_tt/tn/nt/nn, all 4 conditions in one file
- NoCrash-dense
  - `nocrash_dense`: NoCrash, dense traffic, all 4 conditions
- LeaderBoard
  - `lb_test_tt`: LeaderBoard, busy traffic, train town & train weather
  - `lb_test_tn`: LeaderBoard, busy traffic, train town & new weather
  - `lb_test_nt`: LeaderBoard, busy traffic, new town & train weather
  - `lb_test_nn`: LeaderBoard, busy traffic, new town & new weather
  - `lb_test`: lb_test_tt/tn/nt/nn, all 4 conditions in one file
- LeaderBoard-all
  - `cc_test`: LeaderBoard, busy traffic, all 76 routes, dynamic weather
Collect Datasets
We recommend `g4dn.xlarge` for dataset collection. Make sure you have enough disk space attached to the instance.
Collect Off-Policy Datasets
To collect off-policy datasets, use `run/data_collect_bc.sh` and modify the arguments to select different settings. You can use Roach (given a checkpoint) or the Autopilot to collect off-policy datasets. In our paper, before the DAGGER training the IL agents are initialized via behavior cloning (BC) using an off-policy dataset collected in this way.
Some arguments you may want to modify:
- Set `save_to_wandb=False` if you don't want to upload the dataset to W&B.
- Select the environment for collecting data by setting the argument `test_suites` to one of the following.
  - `eu_data`: NoCrash, train town & train weather. We collect `n_episodes=80` for the BC dataset on NoCrash, which is around 75 GB and 6 hours of data.
  - `lb_data`: LeaderBoard, train town & train weather. We collect `n_episodes=160` for the BC dataset on LeaderBoard, which is around 150 GB and 12 hours of data.
  - `cc_data`: CARLA Challenge, all six maps (Town1-6), dynamic weather. We collect `n_episodes=240` for the BC dataset on the CARLA Challenge, which is around 150 GB and 18 hours of data.
- For RL experts, the checkpoint to use is set via `agent.ppo.wb_run_path` and `agent.ppo.wb_ckpt_step`.
  - `agent.ppo.wb_run_path` is the W&B run path where the RL training is logged and the checkpoints are saved.
  - `agent.ppo.wb_ckpt_step` is the step of the checkpoint you want to use. If it's an integer, the script will find the checkpoint closest to that step. If it's `null`, the latest checkpoint will be used.
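For instance, collecting the LeaderBoard BC dataset with a specific RL-expert checkpoint might look roughly like the sketch below; the `dataset_root` path is a placeholder, and `run/data_collect_bc.sh` remains the intended wrapper.

```bash
# Sketch: off-policy (BC) collection on LeaderBoard with the Roach expert.
# wb_ckpt_step=null selects the latest checkpoint; dataset_root is a placeholder.
python -u data_collect.py \
  agent=ppo \
  agent.ppo.wb_run_path=iccv21-roach/trained-models/1929isj0 \
  agent.ppo.wb_ckpt_step=null \
  test_suites=lb_data \
  n_episodes=160 \
  dataset_root=/home/ubuntu/dataset/lb_bc
```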
Collect On-Policy Datasets
To collect on-policy datasets, use `run/data_collect_dagger.sh` and modify the arguments to select different settings. You can use Roach or the Autopilot to label on-policy (DAGGER) datasets generated by an IL agent (given a checkpoint). This is done by running `data_collect.py` with the IL agent as the driver and Roach/the Autopilot as the coach, so the expert supervision is generated and recorded on the fly.
Most things are the same as collecting off-policy BC datasets. Here are the differences:

- Set `agent.cilrs.wb_run_path` to the W&B run path where the IL training is logged and the checkpoints are saved.
- By adjusting `n_episodes` we make sure the size of the DAGGER dataset at each iteration is around 20% of the BC dataset size.
  - For RL experts we use an `n_episodes` that is half the `n_episodes` of the BC dataset.
  - For the Autopilot we use an `n_episodes` that equals the `n_episodes` of the BC dataset.
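Concretely, a DAGGER collection run with Roach as the coach could combine the documented overrides roughly as follows. This is only a sketch: the IL run path and `dataset_root` are placeholders, and the exact wiring of driver and coach is handled inside `run/data_collect_dagger.sh`.

```bash
# Sketch: on-policy (DAGGER) collection. The IL agent drives; the Roach coach labels.
# The IL run path and dataset_root are placeholders; n_episodes=40 is half of the
# NoCrash BC dataset's n_episodes=80, following the rule of thumb above.
python -u data_collect.py \
  agent.cilrs.wb_run_path=<your-entity>/<your-project>/<il-run-id> \
  agent.ppo.wb_run_path=iccv21-roach/trained-models/1929isj0 \
  test_suites=eu_data \
  n_episodes=40 \
  dataset_root=/home/ubuntu/dataset/dagger_iter_1
```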
Train RL Experts
To train RL experts, use `run/train_rl.sh` and modify the arguments to select different settings. We recommend `g4dn.4xlarge` for training the RL experts; you will need around 50 GB of free disk space for videos and checkpoints. We train RL experts on CARLA 0.9.10.1 because 0.9.11 crashes more often for unknown reasons.
Train IL Agents
To train IL agents, use `run/train_il.sh` and modify the arguments to select different settings. Training IL agents does not require CARLA, and it is a GPU-heavy task; we therefore recommend AWS p-instances or your own cluster for the IL training. Our implementation follows DA-RB (paper, repo), which trains a CILRS (paper, repo) agent using DAGGER.
The training starts with training the basic CILRS via behavior cloning (BC) using an off-policy dataset.

- Collect the off-policy BC dataset.
- Train the IL model.
- Benchmark the trained model.

Then repeat the following DAGGER steps until the model achieves decent results.

- Collect the on-policy DAGGER dataset.
- Train the IL model.
- Benchmark the trained model.
For the BC training, the following arguments have to be set (a combined example is sketched after this list).

- Datasets
  - `dagger_datasets`: a vector of strings; for BC training it should only contain the path (local or W&B) to the BC dataset.
- Measurement vector
  - `agent.cilrs.env_wrapper.kwargs.input_states` can be a subset of `[speed,vec,cmd]`.
    - `speed`: scalar ego-vehicle speed
    - `vec`: 2D vector pointing to the next GNSS waypoint
    - `cmd`: one-hot vector of the high-level command
- Branching
  - For 6 branches:
    - `agent.cilrs.policy.kwargs.number_of_branches=6`
    - `agent.cilrs.training.kwargs.branch_weights=[1.0,1.0,1.0,1.0,1.0,1.0]`
  - For 1 branch:
    - `agent.cilrs.policy.kwargs.number_of_branches=1`
    - `agent.cilrs.training.kwargs.branch_weights=[1.0]`
- Action Loss
  - L1 action loss:
    - `agent.cilrs.env_wrapper.kwargs.action_distribution=null`
    - `agent.cilrs.training.kwargs.action_kl=false`
  - KL loss:
    - `agent.cilrs.env_wrapper.kwargs.action_distribution="beta_shared"`
    - `agent.cilrs.training.kwargs.action_kl=true`
- Value Loss
  - Disable:
    - `agent.cilrs.env_wrapper.kwargs.value_as_supervision=false`
    - `agent.cilrs.training.kwargs.value_weight=0.0`
  - Enable:
    - `agent.cilrs.env_wrapper.kwargs.value_as_supervision=true`
    - `agent.cilrs.training.kwargs.value_weight=0.001`
- Pre-trained action/value head
  - `agent.cilrs.rl_run_path` and `agent.cilrs.rl_ckpt_step` are used to initialize the IL agent's action/value heads with Roach's action/value head.
- Feature Loss
  - Disable:
    - `agent.cilrs.env_wrapper.kwargs.dim_features_supervision=0`
    - `agent.cilrs.training.kwargs.features_weight=0.0`
  - Enable:
    - `agent.cilrs.env_wrapper.kwargs.dim_features_supervision=256`
    - `agent.cilrs.training.kwargs.features_weight=0.05`
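For instance, an L_K+L_V+L_F-style BC configuration (KL action loss, value loss, and feature loss) would combine the keys above roughly as sketched below. The `train_il.py` entry-point name and the dataset path are assumptions; `run/train_il.sh` is the actual wrapper.

```bash
# Sketch of BC-training overrides for an L_K+L_V+L_F setup.
# train_il.py and the dataset path are assumptions; the Roach run path optionally
# initializes the action/value heads as described above.
python -u train_il.py \
  dagger_datasets=[/home/ubuntu/dataset/roach_eu_data] \
  agent.cilrs.env_wrapper.kwargs.input_states=[speed,vec,cmd] \
  agent.cilrs.policy.kwargs.number_of_branches=6 \
  agent.cilrs.training.kwargs.branch_weights=[1.0,1.0,1.0,1.0,1.0,1.0] \
  agent.cilrs.env_wrapper.kwargs.action_distribution="beta_shared" \
  agent.cilrs.training.kwargs.action_kl=true \
  agent.cilrs.env_wrapper.kwargs.value_as_supervision=true \
  agent.cilrs.training.kwargs.value_weight=0.001 \
  agent.cilrs.env_wrapper.kwargs.dim_features_supervision=256 \
  agent.cilrs.training.kwargs.features_weight=0.05 \
  agent.cilrs.rl_run_path=iccv21-roach/trained-models/1929isj0
```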
During the DAGGER training, a trained IL agent is loaded and you cannot change the configuration any more. You will have to set the following (a combined example is sketched below).

- `agent.cilrs.wb_run_path`: the W&B run path where the previous IL training was logged and the checkpoints are saved.
- `agent.cilrs.wb_ckpt_step`: the step of the checkpoint you want to use. Leaving it as `null` will load the latest checkpoint.
- `dagger_datasets`: a vector of strings, W&B run paths or local paths to the DAGGER datasets and the BC dataset in time-reversed order, for example `[PATH_DAGGER_DATA_2, PATH_DAGGER_DATA_1, PATH_DAGGER_DATA_0, BC_DATA]`
- `train_epochs`: optionally change it if you want to train for more epochs.
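A DAGGER-iteration training run could then look roughly like this; all run paths, dataset paths, and the epoch count are placeholders, and `train_il.py` is again an assumed entry-point name wrapped by `run/train_il.sh`.

```bash
# Sketch: resume IL training on the accumulated datasets, newest first.
# All paths and the epoch count below are placeholders.
python -u train_il.py \
  agent.cilrs.wb_run_path=<your-entity>/<your-project>/<previous-il-run-id> \
  agent.cilrs.wb_ckpt_step=null \
  dagger_datasets=[/data/dagger_iter_1,/data/roach_eu_data] \
  train_epochs=25
```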
Citation
Please cite our work if you found it useful:
@inproceedings{zhang2021roach,
title = {End-to-End Urban Driving by Imitating a Reinforcement Learning Coach},
booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
author = {Zhang, Zhejun and Liniger, Alexander and Dai, Dengxin and Yu, Fisher and Van Gool, Luc},
year = {2021},
}
License
This software is released under a CC-BY-NC 4.0 license, which allows personal and research use only. For a commercial license, please contact the authors. You can view a license summary here.
Portions of source code taken from external sources are annotated with links to original files and their corresponding licenses.
Acknowledgements
This work was supported by Toyota Motor Europe and was carried out at the TRACE Lab at ETH Zurich (Toyota Research on Automated Cars in Europe - Zurich).