Overcooked-AI
We aim to apply traditional offline reinforcement learning techniques to multi-agent algorithms.
In this repository, we implement behavior cloning (BC), offline MADDPG, MADDPG+REM (MADDPG w/ REM), and MADDPG+BCQ (MADDPG w/ BCQ) in PyTorch. MADDPG+BCQ is still a work in progress and is not fully implemented.
We collected a 0.5M-step multi-agent offline RL dataset and evaluated each comparison method on it. The data was collected with online MADDPG agents and includes exploration trajectories generated with OU noise. All experiments are run on the Asymmetric Advantages layout of the Overcooked environment.
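The OU exploration noise mentioned above is typically added to the online actors' actions while collecting data. Below is a minimal, illustrative sketch of an Ornstein-Uhlenbeck noise process; the class name, parameter values, and usage line are assumptions, not the repository's exact implementation.

```python
import numpy as np

class OUNoise:
    """Ornstein-Uhlenbeck process, commonly used for exploration in DDPG-style agents.
    The parameters (theta, sigma, dt) here are illustrative defaults, not this repo's values."""
    def __init__(self, size, mu=0.0, theta=0.15, sigma=0.2, dt=1.0):
        self.mu = mu * np.ones(size)
        self.theta = theta
        self.sigma = sigma
        self.dt = dt
        self.reset()

    def reset(self):
        self.state = self.mu.copy()

    def sample(self):
        # dx = theta * (mu - x) * dt + sigma * sqrt(dt) * N(0, 1)
        dx = self.theta * (self.mu - self.state) * self.dt \
             + self.sigma * np.sqrt(self.dt) * np.random.randn(*self.state.shape)
        self.state = self.state + dx
        return self.state

# Hypothetical usage while collecting offline data with online MADDPG agents:
# noisy_action = actor(obs) + noise.sample()   # the noisy action is what gets stored in the buffer
```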
We are looking forward to your contributions!
How to Run
Collect Offline Data
python train_online.py agent=maddpg save_replay_buffer=true
While the agents train for 0.5M steps, the trajectory replay buffer is dumped into your experiment/{date}/{time}_maddpg_{exp_name}/buffer
folder.
Then point the path in config/data/local.yaml
to that experiment output directory.
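The on-disk format of the dumped buffer is determined by the repository's replay-buffer code. As a rough illustration only, the sketch below assumes the buffer folder contains NumPy .npz chunks with per-key transition arrays and shows how such chunks could be merged into one offline dataset; the file pattern, key names, and the load_offline_buffer helper are hypothetical.

```python
import glob
import os
import numpy as np

def load_offline_buffer(buffer_dir):
    """Concatenate transition chunks dumped during online training.
    The *.npz pattern and the keys below are assumptions for illustration;
    check the repository's replay-buffer implementation for the actual format."""
    chunks = sorted(glob.glob(os.path.join(buffer_dir, "*.npz")))
    data = {k: [] for k in ("obs", "action", "reward", "next_obs", "done")}
    for path in chunks:
        with np.load(path) as f:
            for k in data:
                data[k].append(f[k])
    return {k: np.concatenate(v, axis=0) for k, v in data.items()}

# dataset = load_offline_buffer("experiment/{date}/{time}_maddpg_{exp_name}/buffer")
```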
Download Dataset
Alternatively, if you want to use our pre-collected dataset, you can download it from this link.
We provide 0.5M trajectories on the Asymmetric Advantages
layout.
Download the dataset to your local machine and update the path in config/data/local.yaml accordingly.
Train Offline Models
Behavior Cloning
python train_bc.py agent=bc data=local
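Behavior cloning fits each agent's policy to the dataset actions with plain supervised learning. The sketch below shows one way a per-agent BC model and update step could look; the network sizes, discrete action head, and helper names are assumptions, not the repository's exact code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BCPolicy(nn.Module):
    """Per-agent behavior-cloning policy over a discrete action space."""
    def __init__(self, obs_dim, num_actions, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, num_actions),
        )

    def forward(self, obs):
        return self.net(obs)  # action logits

def bc_update(policy, optimizer, obs, actions):
    """One supervised step: cross-entropy between predicted logits and dataset actions."""
    logits = policy(obs)
    loss = F.cross_entropy(logits, actions)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```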
Offline MADDPG (Vanilla)
python train_offline.py agent=maddpg data=local
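Vanilla offline MADDPG runs the standard centralized-critic MADDPG update, except that batches come from the fixed dataset rather than a buffer filled by ongoing interaction. Below is a condensed sketch of one agent's critic and actor update; the attribute names (target_actor, critic_opt, etc.) and the batch layout are hypothetical, not the repository's API.

```python
import torch
import torch.nn.functional as F

def maddpg_update(agent_i, agents, batch, gamma=0.99):
    """One offline MADDPG update for agent_i using a batch from the fixed dataset.
    `batch` holds per-agent lists: obs[j], act[j], rew[j], next_obs[j], plus a shared done flag.
    Each agent is assumed to expose actor, critic, target networks, and their optimizers."""
    obs, act, rew, next_obs, done = batch

    # Centralized critic target: every agent's next action comes from its target actor.
    with torch.no_grad():
        next_act = [a.target_actor(next_obs[j]) for j, a in enumerate(agents)]
        target_q = agents[agent_i].target_critic(
            torch.cat(next_obs, dim=-1), torch.cat(next_act, dim=-1))
        y = rew[agent_i] + gamma * (1.0 - done) * target_q

    # Critic regression toward the bootstrapped target.
    q = agents[agent_i].critic(torch.cat(obs, dim=-1), torch.cat(act, dim=-1))
    critic_loss = F.mse_loss(q, y)
    agents[agent_i].critic_opt.zero_grad()
    critic_loss.backward()
    agents[agent_i].critic_opt.step()

    # Actor update: maximize the centralized Q w.r.t. agent_i's own action.
    act_i = list(act)
    act_i[agent_i] = agents[agent_i].actor(obs[agent_i])
    actor_loss = -agents[agent_i].critic(
        torch.cat(obs, dim=-1), torch.cat(act_i, dim=-1)).mean()
    agents[agent_i].actor_opt.zero_grad()
    actor_loss.backward()
    agents[agent_i].actor_opt.step()
    return critic_loss.item(), actor_loss.item()
```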
Offline MADDPG (w/ REM)
python train_offline.py agent=rem_maddpg data=local
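REM (Random Ensemble Mixture) gives the critic K Q-heads and trains on a fresh random convex combination of them at every update, which regularizes value estimates in the offline setting. A minimal sketch of such a multi-head centralized critic follows; the head count, layer sizes, and the REMCritic name are assumptions.

```python
import torch
import torch.nn as nn

class REMCritic(nn.Module):
    """Centralized critic with K Q-heads for Random Ensemble Mixture."""
    def __init__(self, joint_obs_dim, joint_act_dim, num_heads=4, hidden=256):
        super().__init__()
        self.num_heads = num_heads
        self.trunk = nn.Sequential(
            nn.Linear(joint_obs_dim + joint_act_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.heads = nn.Linear(hidden, num_heads)  # one Q estimate per head

    def forward(self, joint_obs, joint_act, alpha=None):
        q_heads = self.heads(self.trunk(torch.cat([joint_obs, joint_act], dim=-1)))
        if alpha is None:
            # Fresh random convex combination for every update (the REM trick).
            alpha = torch.rand(self.num_heads, device=q_heads.device)
            alpha = alpha / alpha.sum()
        return (q_heads * alpha).sum(dim=-1, keepdim=True), alpha

# During an update, the same alpha would typically be passed to both the online and
# target critics so the TD target and prediction mix their heads consistently.
```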
Offline MADDPG (w/ BCQ) (WIP)
python train_offline.py agent=bcq_maddpg data=local
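BCQ constrains the learned policy to actions that are sufficiently probable under a behavior model fit to the dataset. Since this variant is still a work in progress here, the snippet below only illustrates the core discrete-BCQ action filter, not the full agent; the threshold value and tensor interfaces are assumptions.

```python
import torch

def bcq_select_action(q_values, behavior_logits, threshold=0.3):
    """Discrete BCQ-style action selection.
    q_values:        [batch, num_actions] critic estimates
    behavior_logits: [batch, num_actions] logits from a BC model fit to the dataset
    Only actions whose behavior probability is at least `threshold` times that of the
    most likely action are eligible; among those, pick the max-Q action."""
    probs = torch.softmax(behavior_logits, dim=-1)
    eligible = (probs / probs.max(dim=-1, keepdim=True).values) >= threshold
    masked_q = torch.where(eligible, q_values, torch.full_like(q_values, float("-inf")))
    return masked_q.argmax(dim=-1)
```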
Results
Graph
| Online | Offline (0.5M Data) | Offline (0.25M Data) |
|---|---|---|
Video
| Online | BC | Offline w/ REM |
|---|---|---|