MARL Tricks
Code for RIIT: Rethinking the Importance of Implementation Tricks in Multi-Agent Reinforcement Learning. We implemented state-of-the-art (SOTA) MARL algorithms and standardized their hyperparameters.
Python MARL framework
PyMARL is WhiRL's framework for deep multi-agent reinforcement learning and includes implementations of the following algorithms:
Value-based Methods:
- QMIX: QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning
- VDN: Value-Decomposition Networks For Cooperative Multi-Agent Learning
- IQL: Independent Q-Learning
- QTRAN: Learning to Factorize with Transformation for Cooperative Multi-Agent Reinforcement Learning
- MAVEN: MAVEN: Multi-Agent Variational Exploration
- Qatten: Qatten: A General Framework for Cooperative Multiagent Reinforcement Learning
- QPLEX: QPLEX: Duplex Dueling Multi-Agent Q-Learning
- WQMIX: Weighted QMIX: Expanding Monotonic Value Function Factorisation
Actor Critic Methods:
- COMA: Counterfactual Multi-Agent Policy Gradients
- VMIX: Value-Decomposition Multi-Agent Actor-Critics
- FacMADDPG: Deep Multi-Agent Reinforcement Learning for Decentralized Continuous Cooperative Control
- LICA: Learning Implicit Credit Assignment for Cooperative Multi-Agent Reinforcement Learning
- DOP: Off-Policy Multi-Agent Decomposed Policy Gradients
- RIIT: RIIT: Rethinking the Importance of Implementation Tricks in Multi-Agent Reinforcement Learning
PyMARL is written in PyTorch and uses SMAC as its environment.
Installation instructions
Install Python packages
# requires Anaconda 3 or Miniconda 3
bash install_dependecies.sh
Set up StarCraft II and SMAC:
bash install_sc2.sh
This will download SC2 into the 3rdparty folder and copy over the maps needed to run the experiments.
Run an experiment
# For SMAC
python3 src/main.py --config=qmix --env-config=sc2 with env_args.map_name=corridor
# For Cooperative Predator-Prey
python3 src/main.py --config=qmix_prey --env-config=stag_hunt with env_args.map_name=stag_hunt
The config files act as defaults for an algorithm or environment. They are all located in src/config. --config refers to the config files in src/config/algs, and --env-config refers to the config files in src/config/envs.
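As a rough illustration of what "act as defaults" means, the sketch below layers an algorithm config, an environment config, and command-line key=value overrides into one nested dictionary. It is not the framework's own loader: the helper names (recursive_update, parse_overrides), the default values, and the merge order are all assumptions made for the example, and override values are left as plain strings.

```python
import copy
from collections.abc import Mapping


def recursive_update(base, override):
    """Return a copy of `base` with `override` merged in, recursing into nested dicts."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, Mapping) and isinstance(merged.get(key), Mapping):
            merged[key] = recursive_update(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def parse_overrides(pairs):
    """Turn ['env_args.map_name=corridor'] into {'env_args': {'map_name': 'corridor'}}."""
    result = {}
    for pair in pairs:
        dotted_key, value = pair.split("=", 1)
        *parents, leaf = dotted_key.split(".")
        node = result
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value  # kept as a string; no type conversion in this sketch
    return result


# Hypothetical defaults standing in for files under src/config/algs and src/config/envs.
alg_defaults = {"name": "qmix", "epsilon_anneal_time": 50000}
env_defaults = {"env": "sc2", "env_args": {"map_name": "3m", "difficulty": "7"}}

# Assumed order: algorithm defaults, then environment defaults, then command-line overrides.
config = recursive_update(alg_defaults, env_defaults)
config = recursive_update(config, parse_overrides(["env_args.map_name=corridor"]))
```

In this sketch the override replaces only env_args.map_name; every other default, such as env_args.difficulty, is left in place.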
Run parallel experiments:
# bash run.sh config_name map_name_list (threads_num arg_list gpu_list experiments_num)
bash run.sh qmix corridor 2 epsilon_anneal_time=500000 0,1 5
The items of each xxx_list argument are separated by commas (,). All results will be stored in the Results folder and named after map_name.
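To make the comma-separated list arguments concrete, here is a minimal sketch of how such arguments could expand into individual runs. It is not a port of run.sh: the function plan_runs is hypothetical, and the round-robin GPU assignment and "one run per map per repetition" semantics are assumptions. The threads_num argument, which presumably limits concurrency, is not modeled.

```python
from itertools import product


def plan_runs(config_name, map_name_list, arg_list="", gpu_list="0", experiments_num=1):
    """Expand comma-separated list arguments into (gpu, command) pairs.

    Assumed semantics (not taken from run.sh): each map is run
    `experiments_num` times and GPUs are assigned round-robin.
    """
    maps = map_name_list.split(",")
    extra_args = [arg for arg in arg_list.split(",") if arg]
    gpus = gpu_list.split(",")

    runs = []
    for index, (map_name, _repetition) in enumerate(product(maps, range(experiments_num))):
        gpu = gpus[index % len(gpus)]
        command = [
            "python3", "src/main.py",
            f"--config={config_name}",
            "--env-config=sc2",
            "with",
            f"env_args.map_name={map_name}",
            *extra_args,
        ]
        runs.append((gpu, command))
    return runs


# Mirrors: bash run.sh qmix corridor 2 epsilon_anneal_time=500000 0,1 5
runs = plan_runs("qmix", "corridor", "epsilon_anneal_time=500000", "0,1", 5)
for gpu, command in runs:
    print(f"GPU {gpu}: {' '.join(command)}")
```

Under these assumptions the example yields five runs on the corridor map, alternating between GPUs 0 and 1.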
Force all processes to exit
# All Python and game processes of the current user will be terminated.
bash clean.sh
Some test results on Super Hard scenarios
Cite
@article{hu2021riit,
title={RIIT: Rethinking the Importance of Implementation Tricks in Multi-Agent Reinforcement Learning},
author={Jian Hu and Haibin Wu and Seth Austin Harding and Siyang Jiang and Shih-wei Liao},
year={2021},
eprint={2102.03479},
archivePrefix={arXiv},
primaryClass={cs.LG}
}