A Multi-Agent Reinforcement Learning (MARL) method for learning scalable control policies for multi-agent target tracking.

Overview

scalableMARL

Scalable Reinforcement Learning Policies for Multi-Agent Control

CD. Hsu, H. Jeong, GJ. Pappas, P. Chaudhari. "Scalable Reinforcement Learning Policies for Multi-Agent Control". IEEE International Conference on Intelligent Robots and Systems (IROS), Prague, Czech Republic, 2021.


  • Author: Christopher Hsu
  • Email: [email protected]
  • Affiliation:
    • Department of Electrical and Systems Engineering
    • GRASP Laboratory
    • @ University of Pennsylvania

Currently supports Python 3.8 and is developed on Ubuntu 20.04.

scalableMARL file structure

Within scalableMARL (highlighting the important files):

scalableMARL
    |___algos
        |___maTT                          #RL alg folder for the target tracking environment
            |___core                      #Self-Attention-based Model Architecture
            |___core_behavior             #Used for further evaluation (Ablation D.2.)
            |___dql                       #Soft Double Q-Learning
            |___evaluation                #Evaluation for Main Results
            |___evaluation_behavior       #Used for further evaluation (Ablation D.2.)
            |___modules                   #Self-Attention blocks
            |___replay_buffer             #RL replay buffer for sets
            |___run_script                #**Main run script to do training and evaluation
    |___envs
        |___maTTenv                       #multi-agent target tracking
            |___env
                |___setTracking_v0        #Standard environment (i.e. 4a4t tasks)
                |___setTracking_vGreedy   #Baseline Greedy Heuristic
                |___setTracking_vGru      #Experiment with GRU (Ablation D.3)
                |___setTracking_vkGreedy  #Experiment with Scalability and Heuristic Mask k=4 (Ablation D.1)
        |___run_ma_tracking               #Example script to run environment
    |___setup                             #set PYTHONPATH ($source setup)
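The `core` and `modules` files above implement the self-attention blocks that act on sets of target observations. As a rough illustration of the idea (not the repo's actual code; all names and dimensions are illustrative), a single self-attention head over a set is permutation-equivariant, which is what lets one policy handle varying numbers of targets:

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Single-head self-attention over a set of target features.
    X: (n_targets, d); row order is arbitrary, and the output rows
    permute together with the input rows (permutation equivariance)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[1])
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)          # softmax over the set
    return w @ V

rng = np.random.default_rng(0)
d = 4
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
X = rng.normal(size=(3, d))                    # a set of 3 targets

out = self_attention(X, Wq, Wk, Wv)
perm = np.array([2, 0, 1])                     # reorder the set
assert np.allclose(self_attention(X[perm], Wq, Wk, Wv), out[perm])
```

Because the attention weights are computed pairwise within the set, the same learned parameters apply whether there are 2 targets or 20, which is the property the paper exploits for scalability.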
  • To set up scalableMARL, follow the instructions below.

Set up python environment for the scalableMARL repository

Install Python 3.8 (if it is not already installed)

#to check python version
python3 -V

sudo apt-get update
sudo apt-get install python3.8-dev

Set up virtualenv

Python virtual environments are used to isolate package installation from the system.

Replace 'virtualenv name' with your choice of folder name.

sudo apt-get install python3-venv 

python3 -m venv --system-site-packages ./'virtualenv name'
# Activate the environment; any commands run now will use this venv
source ./'virtualenv name'/bin/activate

# To deactivate (exit the venv in the terminal); do not use during setup
deactivate

Now that the virtualenv is activated, you can install packages and run scripts isolated from your system.

Install isolated packages in your venv

sudo apt-get install -y eog python3-tk python3-yaml python3-pip ssh git

# This command installs the packages listed in requirements.txt
pip3 install --trusted-host pypi.python.org -r requirements.txt

Current workflow

Setup repos

# activate virtualenv
source ./'virtualenv name'/bin/activate
# change directory to scalableMARL
cd ./scalableMARL
# setup repo  ***important in order to set PYTHONPATH***
source setup

scalableMARL repo is ready to go

Running an algorithm (for example maPredPrey)

# it's best to run from the scalableMARL folder so that logging and saving are consistent
cd ./scalableMARL
# run the alg
python3 algos/maTT/run_script.py

# you can run the alg with different arguments; see run_script.py for more options
# for example
python3 algos/maTT/run_script.py --seed 0 --logdir ./results/maPredPrey --epochs 40
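Training uses the soft double Q-learning implemented in `algos/maTT/dql.py`. As a generic sketch of the soft double-Q backup (the standard entropy-regularized form; the repo's exact variant may differ):

```python
def soft_double_q_target(r, gamma, q1_next, q2_next, logp_next, alpha):
    """Soft (entropy-regularized) double-Q backup for one transition.
    Takes the min of two target critics to curb overestimation, and
    subtracts alpha * log-prob of the next action (the entropy bonus)."""
    return r + gamma * (min(q1_next, q2_next) - alpha * logp_next)

# toy numbers, purely illustrative
y = soft_double_q_target(r=1.0, gamma=0.99, q1_next=5.0, q2_next=4.5,
                         logp_next=-1.2, alpha=0.2)
```

Both critics are then regressed toward this target; the `alpha` entropy weight trades off exploration against exploitation.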

To test, evaluate, and render()

# for a general example 
python3 algos/maTT/run_script.py --mode test --render 1 --log_dir ./results/maTT/setTracking-v0_123456789/seed_0/ --nb_test_eps 50
# for a saved policy in saved_results
python3 algos/maTT/run_script.py --mode test --render 1 --log_dir ./saved_results/maTT/setTracking-v0_123456789/seed_0/

To see training curves

tensorboard --logdir ./results/maTT/setTracking-v0_123456789/

Citing scalableMARL

If you reference or use scalableMARL in your research, please cite:

@misc{hsu2021scalable,
      title={Scalable Reinforcement Learning Policies for Multi-Agent Control}, 
      author={Christopher D. Hsu and Heejin Jeong and George J. Pappas and Pratik Chaudhari},
      year={2021},
      eprint={2011.08055},
      archivePrefix={arXiv},
      primaryClass={cs.MA}
}


Comments
  • ERROR: Failed building wheel for mpi4py


    Thank you for your paper and your work.

    I am trying to set up this code. I am running a Docker container with Python 3.8 and Ubuntu 20.04. When I try to run pip3 install --trusted-host pypi.python.org -r requirements.txt, I get an error saying that the OpenCV version does not exist, so I changed it to opencv=4.1.2.30. When I run the command again, I get the error message shown below. When I try to run sudo apt install libopenmpi-dev as recommended here, I get another compounded error.

    I would appreciate any help to resolve this issue.

    ERROR: Command errored out with exit status 1:
    command: /usr/bin/python3.8 /tmp/tmpbe2eo2l7 build_wheel /tmp/tmpfnvwy74h
         cwd: /tmp/pip-install-3fm473kh/mpi4py
    Complete output (148 lines):
    running bdist_wheel
    running build
    running build_src
    running build_py
    creating build
    creating build/lib.linux-x86_64-3.8
    creating build/lib.linux-x86_64-3.8/mpi4py
    copying src/mpi4py/bench.py -> build/lib.linux-x86_64-3.8/mpi4py
    copying src/mpi4py/__main__.py -> build/lib.linux-x86_64-3.8/mpi4py
    copying src/mpi4py/run.py -> build/lib.linux-x86_64-3.8/mpi4py
    copying src/mpi4py/__init__.py -> build/lib.linux-x86_64-3.8/mpi4py
    creating build/lib.linux-x86_64-3.8/mpi4py/futures
    copying src/mpi4py/futures/_core.py -> build/lib.linux-x86_64-3.8/mpi4py/futures
    copying src/mpi4py/futures/__main__.py -> build/lib.linux-x86_64-3.8/mpi4py/futures
    copying src/mpi4py/futures/_base.py -> build/lib.linux-x86_64-3.8/mpi4py/futures
    copying src/mpi4py/futures/_lib.py -> build/lib.linux-x86_64-3.8/mpi4py/futures
    copying src/mpi4py/futures/aplus.py -> build/lib.linux-x86_64-3.8/mpi4py/futures
    copying src/mpi4py/futures/__init__.py -> build/lib.linux-x86_64-3.8/mpi4py/futures
    copying src/mpi4py/futures/server.py -> build/lib.linux-x86_64-3.8/mpi4py/futures
    copying src/mpi4py/futures/pool.py -> build/lib.linux-x86_64-3.8/mpi4py/futures
    creating build/lib.linux-x86_64-3.8/mpi4py/util
    copying src/mpi4py/util/pkl5.py -> build/lib.linux-x86_64-3.8/mpi4py/util
    copying src/mpi4py/util/dtlib.py -> build/lib.linux-x86_64-3.8/mpi4py/util
    copying src/mpi4py/util/__init__.py -> build/lib.linux-x86_64-3.8/mpi4py/util
    copying src/mpi4py/py.typed -> build/lib.linux-x86_64-3.8/mpi4py
    copying src/mpi4py/__main__.pyi -> build/lib.linux-x86_64-3.8/mpi4py
    copying src/mpi4py/MPI.pyi -> build/lib.linux-x86_64-3.8/mpi4py
    copying src/mpi4py/dl.pyi -> build/lib.linux-x86_64-3.8/mpi4py
    copying src/mpi4py/run.pyi -> build/lib.linux-x86_64-3.8/mpi4py
    copying src/mpi4py/bench.pyi -> build/lib.linux-x86_64-3.8/mpi4py
    copying src/mpi4py/__init__.pyi -> build/lib.linux-x86_64-3.8/mpi4py
    copying src/mpi4py/MPI.pxd -> build/lib.linux-x86_64-3.8/mpi4py
    copying src/mpi4py/__init__.pxd -> build/lib.linux-x86_64-3.8/mpi4py
    copying src/mpi4py/libmpi.pxd -> build/lib.linux-x86_64-3.8/mpi4py
    creating build/lib.linux-x86_64-3.8/mpi4py/include
    creating build/lib.linux-x86_64-3.8/mpi4py/include/mpi4py
    copying src/mpi4py/include/mpi4py/mpi4py.MPI_api.h -> build/lib.linux-x86_64-3.8/mpi4py/include/mpi4py
    copying src/mpi4py/include/mpi4py/mpi4py.h -> build/lib.linux-x86_64-3.8/mpi4py/include/mpi4py
    copying src/mpi4py/include/mpi4py/mpi4py.MPI.h -> build/lib.linux-x86_64-3.8/mpi4py/include/mpi4py
    copying src/mpi4py/include/mpi4py/mpi4py.i -> build/lib.linux-x86_64-3.8/mpi4py/include/mpi4py
    copying src/mpi4py/include/mpi4py/mpi.pxi -> build/lib.linux-x86_64-3.8/mpi4py/include/mpi4py
    copying src/mpi4py/futures/pool.pyi -> build/lib.linux-x86_64-3.8/mpi4py/futures
    copying src/mpi4py/futures/aplus.pyi -> build/lib.linux-x86_64-3.8/mpi4py/futures
    copying src/mpi4py/futures/_lib.pyi -> build/lib.linux-x86_64-3.8/mpi4py/futures
    copying src/mpi4py/futures/__main__.pyi -> build/lib.linux-x86_64-3.8/mpi4py/futures
    copying src/mpi4py/futures/server.pyi -> build/lib.linux-x86_64-3.8/mpi4py/futures
    copying src/mpi4py/futures/_core.pyi -> build/lib.linux-x86_64-3.8/mpi4py/futures
    copying src/mpi4py/futures/__init__.pyi -> build/lib.linux-x86_64-3.8/mpi4py/futures
    copying src/mpi4py/util/pkl5.pyi -> build/lib.linux-x86_64-3.8/mpi4py/util
    copying src/mpi4py/util/__init__.pyi -> build/lib.linux-x86_64-3.8/mpi4py/util
    copying src/mpi4py/util/dtlib.pyi -> build/lib.linux-x86_64-3.8/mpi4py/util
    running build_clib
    MPI configuration: [mpi] from 'mpi.cfg'
    checking for library 'lmpe' ...
    x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -fPIC -c _configtest.c -o _configtest.o
    x86_64-linux-gnu-gcc -pthread _configtest.o -llmpe -o _configtest
    /usr/bin/ld: cannot find -llmpe
    collect2: error: ld returned 1 exit status
    failure.
    removing: _configtest.c _configtest.o
    building 'mpe' dylib library
    creating build/temp.linux-x86_64-3.8
    creating build/temp.linux-x86_64-3.8/src
    creating build/temp.linux-x86_64-3.8/src/lib-pmpi
    x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -fPIC -c src/lib-pmpi/mpe.c -o build/temp.linux-x86_64-3.8/src/lib-pmpi/mpe.o
    creating build/lib.linux-x86_64-3.8/mpi4py/lib-pmpi
    x86_64-linux-gnu-gcc -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -Wl,-z,relro -g -fwrapv -O2 -Wl,--no-as-needed build/temp.linux-x86_64-3.8/src/lib-pmpi/mpe.o -o build/lib.linux-x86_64-3.8/mpi4py/lib-pmpi/libmpe.so
    checking for library 'vt-mpi' ...
    x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -fPIC -c _configtest.c -o _configtest.o
    x86_64-linux-gnu-gcc -pthread _configtest.o -lvt-mpi -o _configtest
    /usr/bin/ld: cannot find -lvt-mpi
    collect2: error: ld returned 1 exit status
    failure.
    removing: _configtest.c _configtest.o
    checking for library 'vt.mpi' ...
    x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -fPIC -c _configtest.c -o _configtest.o
    x86_64-linux-gnu-gcc -pthread _configtest.o -lvt.mpi -o _configtest
    /usr/bin/ld: cannot find -lvt.mpi
    collect2: error: ld returned 1 exit status
    failure.
    removing: _configtest.c _configtest.o
    building 'vt' dylib library
    x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -fPIC -c src/lib-pmpi/vt.c -o build/temp.linux-x86_64-3.8/src/lib-pmpi/vt.o
    x86_64-linux-gnu-gcc -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -Wl,-z,relro -g -fwrapv -O2 -Wl,--no-as-needed build/temp.linux-x86_64-3.8/src/lib-pmpi/vt.o -o build/lib.linux-x86_64-3.8/mpi4py/lib-pmpi/libvt.so
    checking for library 'vt-mpi' ...
    x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -fPIC -c _configtest.c -o _configtest.o
    x86_64-linux-gnu-gcc -pthread _configtest.o -lvt-mpi -o _configtest
    /usr/bin/ld: cannot find -lvt-mpi
    collect2: error: ld returned 1 exit status
    failure.
    removing: _configtest.c _configtest.o
    checking for library 'vt.mpi' ...
    x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -fPIC -c _configtest.c -o _configtest.o
    x86_64-linux-gnu-gcc -pthread _configtest.o -lvt.mpi -o _configtest
    /usr/bin/ld: cannot find -lvt.mpi
    collect2: error: ld returned 1 exit status
    failure.
    removing: _configtest.c _configtest.o
    building 'vt-mpi' dylib library
    x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -fPIC -c src/lib-pmpi/vt-mpi.c -o build/temp.linux-x86_64-3.8/src/lib-pmpi/vt-mpi.o
    x86_64-linux-gnu-gcc -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -Wl,-z,relro -g -fwrapv -O2 -Wl,--no-as-needed build/temp.linux-x86_64-3.8/src/lib-pmpi/vt-mpi.o -o build/lib.linux-x86_64-3.8/mpi4py/lib-pmpi/libvt-mpi.so
    checking for library 'vt-hyb' ...
    x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -fPIC -c _configtest.c -o _configtest.o
    x86_64-linux-gnu-gcc -pthread _configtest.o -lvt-hyb -o _configtest
    /usr/bin/ld: cannot find -lvt-hyb
    collect2: error: ld returned 1 exit status
    failure.
    removing: _configtest.c _configtest.o
    checking for library 'vt.ompi' ...
    x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -fPIC -c _configtest.c -o _configtest.o
    x86_64-linux-gnu-gcc -pthread _configtest.o -lvt.ompi -o _configtest
    /usr/bin/ld: cannot find -lvt.ompi
    collect2: error: ld returned 1 exit status
    failure.
    removing: _configtest.c _configtest.o
    building 'vt-hyb' dylib library
    x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -fPIC -c src/lib-pmpi/vt-hyb.c -o build/temp.linux-x86_64-3.8/src/lib-pmpi/vt-hyb.o
    x86_64-linux-gnu-gcc -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -Wl,-z,relro -g -fwrapv -O2 -Wl,--no-as-needed build/temp.linux-x86_64-3.8/src/lib-pmpi/vt-hyb.o -o build/lib.linux-x86_64-3.8/mpi4py/lib-pmpi/libvt-hyb.so
    running build_ext
    MPI configuration: [mpi] from 'mpi.cfg'
    checking for dlopen() availability ...
    checking for header 'dlfcn.h' ...
    x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -fPIC -I/usr/include/python3.8 -c _configtest.c -o _configtest.o
    success!
    removing: _configtest.c _configtest.o
    success!
    checking for library 'dl' ...
    x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -fPIC -I/usr/include/python3.8 -c _configtest.c -o _configtest.o
    x86_64-linux-gnu-gcc -pthread _configtest.o -Lbuild/temp.linux-x86_64-3.8 -ldl -o _configtest
    success!
    removing: _configtest.c _configtest.o _configtest
    checking for function 'dlopen' ...
    x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -fPIC -I/usr/include/python3.8 -c _configtest.c -o _configtest.o
    x86_64-linux-gnu-gcc -pthread _configtest.o -Lbuild/temp.linux-x86_64-3.8 -ldl -o _configtest
    success!
    removing: _configtest.c _configtest.o _configtest
    building 'mpi4py.dl' extension
    x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -fPIC -DHAVE_DLFCN_H=1 -DHAVE_DLOPEN=1 -I/usr/include/python3.8 -c src/dynload.c -o build/temp.linux-x86_64-3.8/src/dynload.o
    x86_64-linux-gnu-gcc -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -Wl,-z,relro -g -fwrapv -O2 build/temp.linux-x86_64-3.8/src/dynload.o -Lbuild/temp.linux-x86_64-3.8 -ldl -o build/lib.linux-x86_64-3.8/mpi4py/dl.cpython-38-x86_64-linux-gnu.so
    checking for MPI compile and link ...
    x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -fPIC -I/usr/include/python3.8 -c _configtest.c -o _configtest.o
    _configtest.c:2:10: fatal error: mpi.h: No such file or directory
        2 | #include <mpi.h>
          |          ^~~~~~~
    compilation terminated.
    failure.
    removing: _configtest.c _configtest.o
    error: Cannot compile MPI programs. Check your configuration!!!
    ----------------------------------------
    ERROR: Failed building wheel for mpi4py
    
    opened by gauravkuppa 5
  • Error in Beginner's Tutorial


    When I try the example in README.md:

    To test, evaluate, and render()

    # for a general example 
    python3 algos/maTT/run_script.py --mode test --render 1 --log_dir ./results/maTT/setTracking-v0_123456789/seed_0/ --nb_test_eps 50
    

    An error occurred in setTracking_v0.py:

    /home/lih/miniconda3/envs/scalableMARL/lib/python3.8/site-packages/torch/utils/tensorboard/__init__.py:4: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead.
      if not hasattr(tensorboard, "__version__") or LooseVersion(
    /home/lih/桌面/RL/scalableMARL-main/envs/maTTenv/maps/map_utils.py:24: UserWarning: loadtxt: Empty input file: "/home/lih/桌面/RL/scalableMARL-main/envs/maTTenv/maps/emptyMed.cfg"
      self.map = np.loadtxt(map_path+".cfg")
    /home/lih/miniconda3/envs/scalableMARL/lib/python3.8/site-packages/gym/spaces/box.py:128: UserWarning: WARN: Box bound precision lowered by casting to float32
      logger.warn(f"Box bound precision lowered by casting to {self.dtype}")
    /home/lih/miniconda3/envs/scalableMARL/lib/python3.8/site-packages/gym/core.py:329: DeprecationWarning: WARN: Initializing wrapper in old step API which returns one bool instead of two. It is recommended to set `new_step_api=True` to use new step API. This will be the default behaviour in future.
      deprecation(
    Traceback (most recent call last):
      File "algos/maTT/run_script.py", line 207, in <module>
        test(args.seed)
      File "algos/maTT/run_script.py", line 146, in test
        Eval.test(args, env, policy)
      File "/home/lih/桌面/RL/scalableMARL-main/algos/maTT/evaluation.py", line 82, in test
        obs = env.reset(**params)
      File "/home/lih/桌面/RL/scalableMARL-main/envs/maTTenv/display_wrapper.py", line 119, in reset
        return self.env.reset(**kwargs)
      File "/home/lih/桌面/RL/scalableMARL-main/envs/utilities/ma_time_limit.py", line 30, in reset
        return self.env.reset(**kwargs)
      File "/home/lih/桌面/RL/scalableMARL-main/envs/maTTenv/env/setTracking_v0.py", line 120, in reset
        self.agents[ii].reset(init_pose['agents'][ii])
    IndexError: list index out of range
    

    I did not make any changes to the original program.

    opened by lithiumhydride2 1
  • A mistake in README.md


    There is a passage in the README.md that reads:

    Set up with conda

    conda env -f create environment.yml
    
    

    This command should be:

    conda env create -f environment.yml
    
    opened by lithiumhydride2 1
  • A little problem


    Hello, I'm a first-year graduate student. I'm very interested in your article and code. I installed the environment according to your instructions and ran your code with the default settings, without changing anything. The following error occurred:

    Traceback (most recent call last):
      File "algos/maTT/run_script.py", line 181, in <module>
        results = train(seed, save_dir)
      File "algos/maTT/run_script.py", line 90, in train
        doubleQlearning(
      File "/home/zjh/scalableMARL/algos/maTT/dql.py", line 362, in doubleQlearning
        batch = replay_buffer.sample_batch(batch_size)#, env.num_targets)
      File "/home/zjh/scalableMARL/algos/maTT/replay_buffer.py", line 104, in sample_batch
        return self._encode_sample(batch_size, idxes, nb_targets)
      File "/home/zjh/scalableMARL/algos/maTT/replay_buffer.py", line 64, in _encode_sample
        batch['obs'][i] = obs_t
    ValueError: could not broadcast input array from shape (2,6) into shape (2,5)

    I will study your code carefully, but since you are more familiar with it, is this perhaps a small problem you can solve quickly? If it runs successfully, I believe it will help me understand the code better. Thank you.

    opened by JiahengZeng 1