[NeurIPS 2021] Official implementation of the paper "Learning to Simulate Self-driven Particles System with Coordinated Policy Optimization".

Overview

Code for Coordinated Policy Optimization

Webpage | Code | Paper | Talk (English) | Talk (Chinese)

Hi there! This is the source code of the paper “Learning to Simulate Self-driven Particles System with Coordinated Policy Optimization”.

Please follow the tutorial below to kick off reproducing our results.

Installation

# Create virtual environment
conda create -n copo python=3.7
conda activate copo

# Install dependency
pip install metadrive-simulator==0.2.3
pip install torch  # Make sure your torch is successfully installed! Especially when using GPU!

# Install environment and algorithm.
cd code
pip install -e .
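
To verify the installation, you can run a quick sanity check like the one below (a minimal sketch: it only confirms that both packages import and reports whether PyTorch can see a GPU; the file name check_install.py is hypothetical). Save it and run python check_install.py inside the activated copo environment:

# check_install.py -- minimal installation sanity check
import metadrive  # noqa: F401  # importing is enough to confirm installation
import torch

print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())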

Training

As a quick start, you can begin training CoPO in the Intersection environment immediately after installation by running:

cd code/copo/
python inter/train_copo_dist.py --exp-name inter_copo_dist 

In general, training is launched as follows:

cd code/copo/
python ENV/train_ALGO.py --exp-name EXPNAME 

Here ENV refers to the shorthand of environments:

round  # Roundabout
inter  # Intersection
bottle  # Bottleneck
parking  # Parking Lot
tollgate  # Tollgate

and ALGO is the shorthand for algorithms:

ippo  # Individual Policy Optimization
ccppo  # Mean Field Policy Optimization
cl  # Curriculum Learning
copo_dist  # Coordinated Policy Optimization (Ours)
copo_dist_cc  # Coordinated Policy Optimization with Centralized Critics

Finally, EXPNAME is an arbitrary name for the experiment (which may contain multiple concurrent trials), such as roundabout_copo.
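
For example, combining the shorthands above, training IPPO in the Roundabout environment would be:

cd code/copo/
python round/train_ippo.py --exp-name round_ippo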

Visualization

We provide trained models for all algorithms in all environments. A single command brings up a visualization of the behaviors of the trained populations!

cd copo
python vis.py 

# By default, this visualizes the CoPO population in the Intersection environment.
# If you want to see others, try:
python vis.py --env round --algo ippo

# Or you can use the native renderer for 3D rendering:
# (Press H to show helper message)
python vis.py --env tollgate --algo cl --use_native_render

We hope you enjoy the interesting behaviors learned in this work! Please feel free to contact us if you have any questions. Thanks!

Citation

@misc{peng2021learning,
      title={Learning to Simulate Self-Driven Particles System with Coordinated Policy Optimization}, 
      author={Zhenghao Peng and Quanyi Li and Ka Ming Hui and Chunxiao Liu and Bolei Zhou},
      year={2021},
      eprint={2110.13827},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}
Comments
  • Some Visualization Issues

    Hello, I am very interested in the CoPO project! But at the moment I have some problems; I hope you can clear up my confusion, thanks!

    1. The following error is prompted when running vis_from_checkpoint.py; my path points to checkpoint-480. What is the cause of the error? Am I running the script the wrong way?

    2. I don't understand how the .npz files in the best_checkpoints folder are generated.

    3. You declare the checkpoint format as checkpoint-xxx in vis_from_checkpoint.py, but declare checkpoint_name as {ALGO}_{ENV}_{INDEX}.npz in get_policy_function.py. Which way should I follow? Do I need to convert checkpoint-xxx files to .npz files? If so, how?

    4. What does "Note that if you are restoring CoPO checkpoint, you need to implement appropriate wrapper to encode the LCF into the observation and feed them to the neural network." in vis_from_checkpoint.py mean? (See the sketch after this list.)

    5. How is the following visualization made? The vehicle trajectories and collision locations are visually displayed, which is great!
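
    Regarding question 4, here is a minimal sketch of what such a wrapper could look like (my own illustration, not the authors' implementation; the class name LCFWrapper and the scalar lcf are hypothetical). It appends the local coordination factor (LCF) to every agent's observation vector before the observation reaches the policy network:

    import numpy as np

    class LCFWrapper:
        """Append a fixed LCF scalar to every agent's observation vector."""

        def __init__(self, env, lcf):
            self.env = env
            self.lcf = float(lcf)  # e.g. the LCF value the checkpoint was trained with

        def _augment(self, obs_dict):
            # obs_dict maps agent IDs to 1-D numpy observation vectors.
            return {aid: np.concatenate([obs, [self.lcf]]).astype(np.float32)
                    for aid, obs in obs_dict.items()}

        def reset(self):
            return self._augment(self.env.reset())

        def step(self, actions):
            obs, rewards, dones, infos = self.env.step(actions)
            return self._augment(obs), rewards, dones, infos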

    Very much looking forward to your reply! Thank you for taking the time to answer these questions!

    opened by 6Lackiu 8
  • When training multi-agent (train_all_copo_dist.py), observation type is not a Dictionary but a Box

    When executing train_all_copo_dist.py, I watched the process of generating training data. Since this project uses multi-agent observations, I think the observation space should be a gym.Dict (as written in the annotation in metadrive/manager/agent_manager.py), but I found that the generated obs_space is of Box type, and training executed normally (no errors).

    How can I fix this problem? Or is this not a problem?
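
    For reference, a minimal sketch (my own illustration; the 270-dimensional shape is made up) of why the two observations above can coexist: in RLlib-style multi-agent training, the declared observation space is the per-agent space, a Box, while the environment's step() still returns a dict keyed by agent ID whose values are samples from that Box.

    import gym
    import numpy as np

    # The declared space describes ONE agent's observation.
    per_agent_space = gym.spaces.Box(low=-1.0, high=1.0, shape=(270,), dtype=np.float32)

    # The environment nevertheless returns a dict of per-agent observations.
    obs_dict = {"agent{}".format(i): per_agent_space.sample() for i in range(3)}
    assert all(per_agent_space.contains(o) for o in obs_dict.values())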

    opened by hyoonsoo 2
  • Running on a remote server?

    Hello, I want to run the visualization program python vis.py on a remote server, but an error occurs. Can the visualization program be started on a remote server?

    opened by 6Lackiu 2
  • help wanted: ray.exceptions.RayTaskError(KeyError)

    Description: I am in the process of running training.

    Operating System: Ubuntu 18.04

    Problem: When I run the command python inter/train_cl.py --exp-name inter_cl, ray.exceptions.RayTaskError(KeyError) occurs, ending with KeyError: 'step_reward'.
    I hope to get help. Thank you!

    The error:

    Failure # 1 (occurred at 2022-04-22_09-18-36)
    Traceback (most recent call last):
      File "/home/behazy/anaconda3/envs/copo/lib/python3.7/site-packages/ray/tune/trial_runner.py", line 586, in _process_trial
        results = self.trial_executor.fetch_result(trial)
      File "/home/behazy/anaconda3/envs/copo/lib/python3.7/site-packages/ray/tune/ray_trial_executor.py", line 609, in fetch_result
        result = ray.get(trial_future[0], timeout=DEFAULT_GET_TIMEOUT)
      File "/home/behazy/anaconda3/envs/copo/lib/python3.7/site-packages/ray/_private/client_mode_hook.py", line 47, in wrapper
        return func(*args, **kwargs)
      File "/home/behazy/anaconda3/envs/copo/lib/python3.7/site-packages/ray/worker.py", line 1456, in get
        raise value.as_instanceof_cause()
    ray.exceptions.RayTaskError(KeyError): ray::IPPOCL.train_buffered() (pid=55550, ip=192.168.79.142)
      File "python/ray/_raylet.pyx", line 480, in ray._raylet.execute_task
      File "python/ray/_raylet.pyx", line 432, in ray._raylet.execute_task.function_executor
      File "/home/behazy/anaconda3/envs/copo/lib/python3.7/site-packages/ray/tune/trainable.py", line 167, in train_buffered
        result = self.train()
      File "/home/behazy/anaconda3/envs/copo/lib/python3.7/site-packages/ray/rllib/agents/trainer.py", line 526, in train
        raise e
      File "/home/behazy/anaconda3/envs/copo/lib/python3.7/site-packages/ray/rllib/agents/trainer.py", line 515, in train
        result = Trainable.train(self)
      File "/home/behazy/anaconda3/envs/copo/lib/python3.7/site-packages/ray/tune/trainable.py", line 226, in train
        result = self.step()
      File "/home/behazy/anaconda3/envs/copo/lib/python3.7/site-packages/ray/rllib/agents/trainer_template.py", line 148, in step
        res = next(self.train_exec_impl)
      File "/home/behazy/anaconda3/envs/copo/lib/python3.7/site-packages/ray/util/iter.py", line 756, in __next__
        return next(self.built_iterator)
      File "/home/behazy/anaconda3/envs/copo/lib/python3.7/site-packages/ray/util/iter.py", line 783, in apply_foreach
        for item in it:
      File "/home/behazy/anaconda3/envs/copo/lib/python3.7/site-packages/ray/util/iter.py", line 783, in apply_foreach
        for item in it:
      File "/home/behazy/anaconda3/envs/copo/lib/python3.7/site-packages/ray/util/iter.py", line 843, in apply_filter
        for item in it:
      File "/home/behazy/anaconda3/envs/copo/lib/python3.7/site-packages/ray/util/iter.py", line 843, in apply_filter
        for item in it:
      File "/home/behazy/anaconda3/envs/copo/lib/python3.7/site-packages/ray/util/iter.py", line 783, in apply_foreach
        for item in it:
      File "/home/behazy/anaconda3/envs/copo/lib/python3.7/site-packages/ray/util/iter.py", line 783, in apply_foreach
        for item in it:
      File "/home/behazy/anaconda3/envs/copo/lib/python3.7/site-packages/ray/util/iter.py", line 783, in apply_foreach
        for item in it:
      [Previous line repeated 1 more time]
      File "/home/behazy/anaconda3/envs/copo/lib/python3.7/site-packages/ray/util/iter.py", line 876, in apply_flatten
        for item in it:
      File "/home/behazy/anaconda3/envs/copo/lib/python3.7/site-packages/ray/util/iter.py", line 828, in add_wait_hooks
        item = next(it)
      File "/home/behazy/anaconda3/envs/copo/lib/python3.7/site-packages/ray/util/iter.py", line 783, in apply_foreach
        for item in it:
      File "/home/behazy/anaconda3/envs/copo/lib/python3.7/site-packages/ray/util/iter.py", line 783, in apply_foreach
        for item in it:
      File "/home/behazy/anaconda3/envs/copo/lib/python3.7/site-packages/ray/util/iter.py", line 783, in apply_foreach
        for item in it:
      [Previous line repeated 1 more time]
      File "/home/behazy/anaconda3/envs/copo/lib/python3.7/site-packages/ray/util/iter.py", line 471, in base_iterator
        yield ray.get(futures, timeout=timeout)
      File "/home/behazy/anaconda3/envs/copo/lib/python3.7/site-packages/ray/_private/client_mode_hook.py", line 47, in wrapper
        return func(*args, **kwargs)
    ray.exceptions.RayTaskError(KeyError): ray::RolloutWorker.par_iter_next() (pid=55549, ip=192.168.79.142)
      File "python/ray/_raylet.pyx", line 480, in ray._raylet.execute_task
      File "python/ray/_raylet.pyx", line 432, in ray._raylet.execute_task.function_executor
      File "/home/behazy/anaconda3/envs/copo/lib/python3.7/site-packages/ray/util/iter.py", line 1152, in par_iter_next
        return next(self.local_it)
      File "/home/behazy/anaconda3/envs/copo/lib/python3.7/site-packages/ray/rllib/evaluation/rollout_worker.py", line 327, in gen_rollouts
        yield self.sample()
      File "/home/behazy/anaconda3/envs/copo/lib/python3.7/site-packages/ray/rllib/evaluation/rollout_worker.py", line 662, in sample
        batches = [self.input_reader.next()]
      File "/home/behazy/anaconda3/envs/copo/lib/python3.7/site-packages/ray/rllib/evaluation/sampler.py", line 95, in next
        batches = [self.get_data()]
      File "/home/behazy/anaconda3/envs/copo/lib/python3.7/site-packages/ray/rllib/evaluation/sampler.py", line 224, in get_data
        item = next(self.rollout_provider)
      File "/home/behazy/anaconda3/envs/copo/lib/python3.7/site-packages/ray/rllib/evaluation/sampler.py", line 620, in _env_runner
        sample_collector=sample_collector,
      File "/home/behazy/anaconda3/envs/copo/lib/python3.7/site-packages/ray/rllib/evaluation/sampler.py", line 1124, in _process_observations_w_trajectory_view_api
        env_index=env_id)
      File "/home/behazy/CoPO/copo_code/copo/callbacks.py", line 41, in on_episode_step
        episode.user_data["step_reward"][k].append(info["step_reward"])
    KeyError: 'step_reward'
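
    The final frame shows that copo/callbacks.py (line 41) indexes info["step_reward"] unconditionally, while on a newly spawned agent's first step the info dict may not contain that key yet. A minimal defensive sketch (my own illustration, not the repository's fix; the helper name append_step_reward is hypothetical):

    def append_step_reward(user_data, agent_id, info):
        # Guard for the missing key instead of indexing unconditionally.
        if "step_reward" in info:
            user_data.setdefault("step_reward", {}).setdefault(agent_id, []).append(info["step_reward"])

    data = {}
    append_step_reward(data, "agent0", {"velocity": 0.0})     # first step: skipped
    append_step_reward(data, "agent0", {"step_reward": 0.5})  # later step: recorded
    print(data)  # {'step_reward': {'agent0': [0.5]}}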

    opened by Behazy 2
  • TypeError: 'str' object is not callable

    (copo) user@user-virtual-machine:~/CoPO/copo_code/copo$ python vis.py
    Successfully registered the following environments: ['MetaDrive-validation-v0', 'MetaDrive-10env-v0', 'MetaDrive-100envs-v0', 'MetaDrive-1000envs-v0', 'SafeMetaDrive-validation-v0', 'SafeMetaDrive-10env-v0', 'SafeMetaDrive-100envs-v0', 'SafeMetaDrive-1000envs-v0', 'MARLTollgate-v0', 'MARLBottleneck-v0', 'MARLRoundabout-v0', 'MARLIntersection-v0', 'MARLParkingLot-v0', 'MARLMetaDrive-v0'].
    Traceback (most recent call last):
      File "vis.py", line 49, in <module>
        action = policy_function(o, d)
      File "/home/user/CoPO/copo_code/copo/eval/get_policy_function.py", line 153, in __call__
        actions = self.policy(obs_to_be_eval)
    TypeError: 'str' object is not callable
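
    The traceback indicates that self.policy in get_policy_function.py holds a string (for instance a checkpoint name) rather than a callable. A minimal illustration of this failure mode (a hypothetical class, not the repository's code):

    class PolicyFunction:
        def __init__(self, policy):
            self.policy = policy  # expected: a callable mapping observations to actions

        def __call__(self, obs):
            return self.policy(obs)

    pf = PolicyFunction(policy="checkpoint-480")  # bug: a str slipped in
    try:
        pf([0.0, 1.0])
    except TypeError as e:
        print(e)  # 'str' object is not callable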

    opened by Behazy 2
  • Fix info dict key error with latest MetaDrive

    In this PR, I will fix:

    • Wrong state observation. The path to StateObservation is wrong with the latest MetaDrive, so we have to change the import code.
    • Wrong info dict. In the latest MetaDrive, on the first step of a newly spawned agent the info dict already contains velocity entries like {velocity: 0.0}. Therefore, the criterion in the DrivingCallback for checking whether the info dict came from a newly spawned agent, if "velocity" in info, becomes invalid, since velocity is present even when the agent has just spawned. I changed this criterion to if "step_reward" in info (see the sketch below).
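
    A simplified sketch of the changed criterion (the function name is_newly_spawned is hypothetical):

    def is_newly_spawned(info):
        # Old criterion: "velocity" not in info -- invalid with the latest MetaDrive,
        # since velocity is reported even on the agent's first step.
        # New criterion: "step_reward" only appears after the first transition.
        return "step_reward" not in info

    print(is_newly_spawned({"velocity": 0.0}))                      # True: first step
    print(is_newly_spawned({"velocity": 3.2, "step_reward": 0.1}))  # False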

    Related to #14

    opened by pengzhenghao 1
  • Relationship between networks

    Could you please explain the relationship between the policy network, the individual value network, the neighborhood value network, and the global value network, and how parameters are transferred between them?

    opened by shushushulian 0
  • Visualizing when training

    Thanks for your code and contribution! It's so great! I have some questions about visualization and local mode.

    First, is it possible to visualize the training process? Although it might cost a lot of memory and slow training down, I prefer to check my scenes while training the model.

    Secondly, I'm not sure I understand what you mean by local mode. Does it address the first question? Would you mind explaining in more detail?

    Sorry to bother you. Thanks a lot!!

    opened by jhih-ching-yeh 7
  • Visualize PGMap

    Hello, I reproduced your code. In addition to the five scenarios in the paper, there is also a PGMap scenario. During training, PGMap's success rate is very high while the other scenarios' success rates are very low, so I want to visualize PGMap. Following the earlier requirements, I converted my trained model to .npz, and the five scenarios in the paper can be visualized normally apart from the low success rate (they may simply not be trained well). But the visualized PGMap has a success rate of 0! The vehicles collide halfway every time, which does not match the training success rate of 0.8.


    I added the PGMap scene to the vis.py file

    It also prompts that the variable meta_svo_lookup_table is required. Noticing that it consists of a mean and a std, I found them in progress.csv and added the two variables. I would like to ask which step is wrong, or what needs to be added to make the success rate normal?


    opened by shile1998 1
  • How can I reproduce experimental results?

    Hello, I am very impressed with the CoPO project. Thank you for sharing a great paper and code. I wanted to see the trained multi-agent behaviors, so I visualized them using the weights stored in copo_code/copo/best_checkpoint/ and copo_code/vis.py (without any modifications). However, unlike in the paper, the rendered agents showed lower performance (a lower success rate). How should I modify the code to see the higher performance of agents as in your paper? I look forward to your reply. Thank you.

    opened by hyoonsoo 4