Weighted QMIX: Expanding Monotonic Value Function Factorisation

whirl

Last update: Dec 29, 2022

Overview

Weighted QMIX: Expanding Monotonic Value Function Factorisation (NeurIPS 2020)

Based on PyMARL (https://github.com/oxwhirl/pymarl/). Please refer to that repo for more documentation.

This repo contains the cleaned-up code that was used in "Weighted QMIX: Expanding Monotonic Value Function Factorisation" (https://arxiv.org/abs/2006.10800).

Included in this repo

In particular, this repo includes implementations of:

OW-QMIX
CW-QMIX
Versions of DDPG & SAC used in the paper

We thank the authors of "QPLEX: Duplex Dueling Multi-Agent Q-Learning" (https://arxiv.org/abs/2008.01062) for their implementation of QPLEX (https://github.com/wjh720/QPLEX/), which we used. The exact version we used is included in this repo.

Note that the naming of certain hyper-parameters and concepts in the repository differs slightly from the paper:

α in the paper is w in the code
Optimistic Weighting (OW) is referred to as hysteretic_qmix
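
As a rough illustration of where w (α in the paper) enters, the following is a minimal sketch of the two weighting rules as they are commonly summarised from the paper. It is not the repo's code: the function names, the scalar (non-batched) form and the argument names are all assumptions made for this example.

```python
# Hypothetical sketch, not taken from this repo: scalar versions of the
# OW and CW weighting rules, with w playing the role of alpha in the paper.

def ow_weight(q_tot, target, w):
    """Optimistic weighting (called hysteretic_qmix in the code).

    Full weight when the current estimate lies below the target,
    otherwise the down-weight w.
    """
    return 1.0 if q_tot < target else w


def cw_weight(target, q_star_greedy, action, greedy_action, w):
    """Centrally weighted rule.

    Full weight when the target exceeds the central estimate of the greedy
    joint action, or when the taken joint action is the greedy one.
    """
    if target > q_star_greedy or action == greedy_action:
        return 1.0
    return w


def weighted_squared_loss(q_tots, targets, weights):
    """Mean of w_i * (Q_tot_i - y_i)^2 over a small batch of scalars."""
    terms = [wt * (q - y) ** 2 for q, y, wt in zip(q_tots, targets, weights)]
    return sum(terms) / len(terms)


# Example values are made up for illustration only.
q_tots = [1.0, 3.0]
targets = [2.0, 2.0]
weights = [ow_weight(q, y, w=0.5) for q, y in zip(q_tots, targets)]
loss = weighted_squared_loss(q_tots, targets, weights)
```

With w=1 both rules give every sample full weight, so the loss reduces to an ordinary unweighted squared error.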

For all SMAC experiments we used SC2.4.6.2.69232 (not SC2.4.10). The underlying dynamics are sufficiently different that you cannot compare runs across the two versions!

The install_sc2.sh script will install SC2.4.6.2.69232.

Running experiments

The config files (src/config/algs/*.yaml) contain default hyper-parameters for the respective algorithms. These were changed for some of the experiments in the paper: for instance, epsilon_anneal_time = 1000000 for the robustness-to-exploration experiments, and w=0.1 for the predator-prey punishment experiments. Please see the Appendix of the paper for the exact hyper-parameters used.
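
To illustrate how key=value overrides sit on top of file defaults, here is a small self-contained sketch. It does not use or reproduce the repo's actual config loader; apply_overrides and parse_value are hypothetical helpers, and the default values shown are placeholders rather than the values in the YAML files.

```python
import copy


def parse_value(raw):
    """Turn a command-line string into an int or float where possible."""
    for cast in (int, float):
        try:
            return cast(raw)
        except ValueError:
            pass
    return raw


def apply_overrides(defaults, overrides):
    """Return a copy of defaults with 'key=value' overrides applied.

    Dotted keys such as 'env_args.map_name' address nested dictionaries.
    """
    config = copy.deepcopy(defaults)
    for item in overrides:
        key, sep, raw = item.partition("=")
        if not sep:
            raise ValueError("expected key=value, got %r" % item)
        *parents, leaf = key.split(".")
        node = config
        for name in parents:
            node = node.setdefault(name, {})
        node[leaf] = parse_value(raw)
    return config


# Placeholder defaults (NOT the repo's real defaults).
defaults = {"w": 0.5, "epsilon_anneal_time": 50000, "env_args": {"map_name": "3m"}}
cfg = apply_overrides(
    defaults,
    ["w=0.1", "epsilon_anneal_time=1000000", "env_args.map_name=3s5z"],
)
```

The defaults dictionary is left untouched, so the same base config can be reused for several differently overridden runs.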

Set central_mixer=atten to get the modified mixing network architecture that was used for the final experiment on corridor in the paper.

As an example, to run OW-QMIX on 3s5z with epsilon annealed over 1 million timesteps using Docker:

bash run.sh $GPU python3 src/main.py --config=ow_qmix --env-config=sc2 with env_args.map_name=3s5z w=0.5 epsilon_anneal_time=1000000
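
For intuition about what epsilon_anneal_time=1000000 means, the sketch below computes epsilon at a given timestep. It assumes a linear schedule, and the start and finish values (1.0 and 0.05) are illustrative assumptions rather than values read from this repo's configs; linear_epsilon is a hypothetical helper.

```python
def linear_epsilon(t, anneal_time, start=1.0, finish=0.05):
    """Linearly interpolate epsilon from start to finish over anneal_time steps.

    After anneal_time steps epsilon stays at finish. The start/finish
    defaults are assumptions made for this example.
    """
    clipped = min(max(t, 0), anneal_time)
    frac = clipped / anneal_time
    return start + frac * (finish - start)


# Epsilon at the start, halfway point and end of a 1 million step anneal.
eps_start = linear_epsilon(0, 1000000)
eps_half = linear_epsilon(500000, 1000000)
eps_end = linear_epsilon(2000000, 1000000)
```

Under these assumptions, a longer anneal time keeps the agents exploring for a larger share of training.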

Citing

Bibtex:

@inproceedings{rashid2020weighted,
  title={Weighted QMIX: Expanding Monotonic Value Function Factorisation},
  author={Rashid, Tabish and Farquhar, Gregory and Peng, Bei and Whiteson, Shimon},
  booktitle={Advances in Neural Information Processing Systems},
  year={2020}
}

Comments

simple question regarding stag-hunt rendering

Hello,

First, thank you for sharing your code and research, it is really helpful.

During my research, I became curious about how the agents act during training in the stag-hunt environment. However, it seems that visualization for stag-hunt is not implemented in the uploaded code. I was wondering if there is any update on the rendering part. If so, could you share it?

opened by HyunghoNa 0

A question about the proof

I am reading the WQMIX paper, but I am unsure about the proof in the appendix:

Why is $Q_{tot}^\prime \in Q^{mix}$? How can I check whether a given construction is in the QMIX family? Could you help me figure this out?

opened by Jarvis-K 0

TypeError: run() missing 1 required positional argument: 'pymongo_client'

Hi, when I run the example "" I get the following error with pymongo. Any idea how to fix this?

Thanks in advance

:/pymarl# python3 src/main.py --config=qmix --env-config=sc2 with env_args.map_name=2s3z
pygame 2.0.0 (SDL 2.0.12, python 3.5.2)
Hello from the pygame community. https://www.pygame.org/contribute.html
src/main.py:79: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
  config_dict = yaml.load(f)
src/main.py:49: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
  config_dict = yaml.load(f)
Traceback (most recent call last):
  File "src/main.py", line 85, in <module>
    alg_config = _get_config(params, "--config", "algs")
  File "src/main.py", line 47, in _get_config
    with open(os.path.join(os.path.dirname(__file__), "config", subfolder, "{}.yaml".format(config_name)), "r") as f:
FileNotFoundError: [Errno 2] No such file or directory: 'src/config/algs/qmix.yaml'

root@6c7269d7eab6:/pymarl# python3 src/main.py --config=ow_qmix --env-config=sc2 with env_args.map_name=3s5z w=0.5 epsilon_anneal_time=1000000
pygame 2.0.0 (SDL 2.0.12, python 3.5.2)
Hello from the pygame community. https://www.pygame.org/contribute.html
src/main.py:79: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
  config_dict = yaml.load(f)
src/main.py:49: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
  config_dict = yaml.load(f)
[INFO 18:08:50] root Saving to FileStorageObserver in results/sacred.
[DEBUG 18:08:50] pymarl Using capture mode "fd"
[INFO 18:08:50] pymarl Running command 'my_main'
[INFO 18:08:50] pymarl Started run with ID "1"
[DEBUG 18:08:50] my_main Started
[ERROR 18:08:50] pymarl Failed after 0:00:00!
Traceback (most recent calls WITHOUT Sacred internals):
  File "src/main.py", line 35, in my_main
    run(_run, config, _log)
TypeError: run() missing 1 required positional argument: 'pymongo_client'

opened by lcassano 10

Owner

whirl

Whiteson Research Lab

GitHub
