Gym for multi-agent reinforcement learning

Overview

PettingZoo is a Python library for conducting research in multi-agent reinforcement learning, akin to a multi-agent version of Gym.

Our website, with comprehensive documentation, is pettingzoo.ml

Environments and Installation

PettingZoo includes the following families of environments: Atari, Butterfly, Classic, MAgent, MPE, and SISL.

To install the pettingzoo base library, use pip install pettingzoo

This does not include dependencies for all families of environments (there's a massive number, and some can be problematic to install on certain systems). You can install the dependencies for a single family with, e.g., pip install pettingzoo[atari], or use pip install pettingzoo[all] to install all dependencies.

We support Python 3.7, 3.8 and 3.9 on Linux and macOS. We will accept PRs related to Windows, but do not officially support it.

API

PettingZoo models environments as Agent Environment Cycle (AEC) games, in order to cleanly support all types of multi-agent RL environments under one API and to minimize the potential for certain classes of common bugs.

Using environments in PettingZoo is very similar to Gym, i.e. you initialize an environment via:

from pettingzoo.butterfly import pistonball_v5
env = pistonball_v5.env()

Environments can be interacted with in a manner very similar to Gym:

env.reset()
for agent in env.agent_iter():
    observation, reward, done, info = env.last()
    action = policy(observation) if not done else None  # a done agent must be stepped with None
    env.step(action)
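
A runnable version of this loop with a trivial random policy (illustrative only; it samples from each agent's action space and passes None once an agent is done) looks like:

from pettingzoo.butterfly import pistonball_v5

env = pistonball_v5.env()
env.reset()
for agent in env.agent_iter():
    observation, reward, done, info = env.last()
    # live agents sample a random action; done agents must receive None
    action = env.action_space(agent).sample() if not done else None
    env.step(action)
env.close()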

For the complete API documentation, please see https://www.pettingzoo.ml/api

Parallel API

In certain environments, it's valid to assume that agents take their actions at the same time. For these games, we offer a secondary API to allow for parallel actions, documented at https://www.pettingzoo.ml/api#parallel-api
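
A minimal parallel rollout with random actions (a sketch, again using pistonball_v5; exact return signatures may differ slightly across PettingZoo versions) looks like:

from pettingzoo.butterfly import pistonball_v5

parallel_env = pistonball_v5.parallel_env()
observations = parallel_env.reset()
while parallel_env.agents:
    # all live agents act at the same time via a dictionary of actions
    actions = {agent: parallel_env.action_space(agent).sample() for agent in parallel_env.agents}
    observations, rewards, dones, infos = parallel_env.step(actions)
parallel_env.close()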

SuperSuit

SuperSuit is a library that includes all commonly used wrappers in RL (frame stacking, observation normalization, etc.) for PettingZoo and Gym environments, with a nice API. We developed it in lieu of building wrappers into PettingZoo. https://github.com/Farama-Foundation/SuperSuit
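
For example, a typical preprocessing chain for a visual environment might look like the following (a sketch; the wrapper names and version suffixes reflect SuperSuit at the time of writing):

import supersuit as ss
from pettingzoo.butterfly import pistonball_v5

env = pistonball_v5.env()
env = ss.color_reduction_v0(env, mode="B")  # grayscale by keeping only the blue channel
env = ss.frame_stack_v1(env, 3)             # stack the last 3 observations into one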

Environment Versioning

PettingZoo keeps strict versioning for reproducibility reasons. All environments end in a suffix like "_v0". When changes are made to environments that might impact learning results, the number is increased by one to prevent potential confusion.
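
In practice, the version is part of the import path, so upgrading an environment is always an explicit change in your code:

# Each change that might affect learning results bumps the module suffix.
from pettingzoo.butterfly import pistonball_v5  # current suffix at the time of writing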

Citation

To cite this project in publication, please use

@article{terry2020pettingzoo,
  title = {PettingZoo: Gym for Multi-Agent Reinforcement Learning},
  author = {Terry, J. K and Black, Benjamin and Grammel, Nathaniel and Jayakumar, Mario and Hari, Ananth and Sullivan, Ryan and Santos, Luis and Perez, Rodrigo and Horsch, Caroline and Dieffendahl, Clemens and Williams, Niall L and Lokesh, Yashas and Ravi, Praveen},
  journal={arXiv preprint arXiv:2009.14471},
  year={2020}
}
Comments
  • Error with Tiger-Deer Env: Selecting Invalid Agent.


    Hi,

    I've been running the MAgent Tiger-Deer environment with 2 different algorithms: a RandomLearner and rllib's PPO. I'm also currently using rllib's PettingZooEnv. It seems both of the algorithms work for some number of iterations, but then error-out in this line https://github.com/ray-project/ray/blob/master/rllib/env/pettingzoo_env.py#L161.

    The issue is that the agent being selected, deer_92, is not in the action_dict. I checked the self.aec_env.dones dict, however, and the agent is there. I added a snippet of the output below. I printed out relevant info (shown after each == start step ==) when entering the step() function. Furthermore, it also appears that all steps prior to this error only select deer_0 as the agent. I've re-run the experiment several times and it always has the same result (e.g., deer_0 is always chosen and then it errors out once any other agent is chosen).

    I'm not sure if this is an issue with rllib, the Tiger-Deer env, or my own config.

    (pid=34152) =============== start step =====================
    (pid=34152) self.aec_env.agent_selection --> deer_0
    (pid=34152) stepped_agents --> set()
    (pid=34152) list(action_dict) --> ['deer_0', 'deer_1', 'deer_2', 'deer_3', 'deer_4', 'deer_5', 'deer_6', 'deer_7', 'deer_8', 'deer_9', 'deer_10', 'deer_11', 'deer_12', 'deer_13', 'deer_14', 'deer_15', 'deer_16', 'deer_17', 'deer_18', 'deer_19', 'deer_20', 'deer_21', 'deer_22', 'deer_23', 'deer_24', 'deer_25', 'deer_26', 'deer_27', 'deer_28', 'deer_29', 'deer_30', 'deer_31', 'deer_32', 'deer_33', 'deer_34', 'deer_35', 'deer_36', 'deer_37', 'deer_38', 'deer_39', 'deer_40', 'deer_41', 'deer_42', 'deer_43', 'deer_44', 'deer_45', 'deer_46', 'deer_47', 'deer_48', 'deer_49', 'deer_50', 'deer_51', 'deer_52', 'deer_53', 'deer_54', 'deer_55', 'deer_56', 'deer_57', 'deer_58', 'deer_59', 'deer_60', 'deer_61', 'deer_62', 'deer_63', 'deer_64', 'deer_65', 'deer_66', 'deer_67', 'deer_68', 'deer_69', 'deer_70', 'deer_71', 'deer_72', 'deer_73', 'deer_74', 'deer_75', 'deer_76', 'deer_77', 'deer_78', 'deer_79', 'deer_80', 'deer_81', 'deer_82', 'deer_83', 'deer_84', 'deer_85', 'deer_86', 'deer_87', 'deer_88', 'deer_89', 'deer_90', 'deer_91', 'deer_92', 'deer_93', 'deer_94', 'deer_95', 'deer_96', 'deer_97', 'deer_98', 'deer_99', 'deer_100', 'tiger_0', 'tiger_1', 'tiger_2', 'tiger_3', 'tiger_4', 'tiger_5', 'tiger_6', 'tiger_7', 'tiger_8', 'tiger_9', 'tiger_10', 'tiger_11', 'tiger_12', 'tiger_13', 'tiger_14', 'tiger_15', 'tiger_16', 'tiger_17', 'tiger_18', 'tiger_19']
    (pid=34152) agent in action_dict -->  True
    (pid=34152) agent in self.aec_env.dones --> False
    (pid=34152) =============== start step =====================
    (pid=34152) self.aec_env.agent_selection --> deer_0
    (pid=34152) stepped_agents --> set()
    (pid=34152) list(action_dict) --> ['deer_0', 'deer_1', 'deer_2', 'deer_3', 'deer_4', 'deer_5', 'deer_6', 'deer_7', 'deer_8', 'deer_9', 'deer_10', 'deer_11', 'deer_12', 'deer_13', 'deer_14', 'deer_15', 'deer_16', 'deer_17', 'deer_18', 'deer_19', 'deer_20', 'deer_21', 'deer_22', 'deer_23', 'deer_24', 'deer_25', 'deer_26', 'deer_27', 'deer_28', 'deer_29', 'deer_30', 'deer_31', 'deer_32', 'deer_33', 'deer_34', 'deer_35', 'deer_36', 'deer_37', 'deer_38', 'deer_39', 'deer_40', 'deer_41', 'deer_42', 'deer_43', 'deer_44', 'deer_45', 'deer_46', 'deer_47', 'deer_48', 'deer_49', 'deer_50', 'deer_51', 'deer_52', 'deer_53', 'deer_54', 'deer_55', 'deer_56', 'deer_57', 'deer_58', 'deer_59', 'deer_60', 'deer_61', 'deer_62', 'deer_63', 'deer_64', 'deer_65', 'deer_66', 'deer_67', 'deer_68', 'deer_69', 'deer_70', 'deer_71', 'deer_72', 'deer_73', 'deer_74', 'deer_75', 'deer_76', 'deer_77', 'deer_78', 'deer_79', 'deer_80', 'deer_81', 'deer_82', 'deer_83', 'deer_84', 'deer_85', 'deer_86', 'deer_87', 'deer_88', 'deer_89', 'deer_90', 'deer_91', 'deer_92', 'deer_93', 'deer_94', 'deer_95', 'deer_96', 'deer_97', 'deer_98', 'deer_99', 'deer_100', 'tiger_0', 'tiger_1', 'tiger_2', 'tiger_3', 'tiger_4', 'tiger_5', 'tiger_6', 'tiger_7', 'tiger_8', 'tiger_9', 'tiger_10', 'tiger_11', 'tiger_12', 'tiger_13', 'tiger_14', 'tiger_15', 'tiger_16', 'tiger_17', 'tiger_18', 'tiger_19']
    (pid=34152) agent in action_dict -->  True
    (pid=34152) agent in self.aec_env.dones -->  False
    (pid=34152) =============== start step =====================
    (pid=34152) self.aec_env.agent_selection --> deer_92
    (pid=34152) stepped_agents --> set()
    (pid=34152) list(action_dict) --> ['deer_0', 'deer_1', 'deer_2', 'deer_3', 'deer_4', 'deer_5', 'deer_6', 'deer_7', 'deer_8', 'deer_9', 'deer_10', 'deer_11', 'deer_12', 'deer_13', 'deer_14', 'deer_15', 'deer_16', 'deer_17', 'deer_18', 'deer_19', 'deer_20', 'deer_21', 'deer_22', 'deer_23', 'deer_24', 'deer_25', 'deer_26', 'deer_27', 'deer_28', 'deer_29', 'deer_30', 'deer_31', 'deer_32', 'deer_33', 'deer_34', 'deer_35', 'deer_36', 'deer_37', 'deer_38', 'deer_39', 'deer_40', 'deer_41', 'deer_42', 'deer_43', 'deer_44', 'deer_45', 'deer_46', 'deer_47', 'deer_48', 'deer_49', 'deer_50', 'deer_51', 'deer_52', 'deer_53', 'deer_54', 'deer_55', 'deer_56', 'deer_57', 'deer_58', 'deer_59', 'deer_60', 'deer_61', 'deer_62', 'deer_63', 'deer_64', 'deer_65', 'deer_66', 'deer_67', 'deer_68', 'deer_69', 'deer_70', 'deer_71', 'deer_72', 'deer_73', 'deer_74', 'deer_75', 'deer_76', 'deer_77', 'deer_78', 'deer_79', 'deer_80', 'deer_81', 'deer_82', 'deer_83', 'deer_84', 'deer_85', 'deer_86', 'deer_87', 'deer_88', 'deer_89', 'deer_90', 'deer_91', 'deer_93', 'deer_94', 'deer_95', 'deer_96', 'deer_97', 'deer_98', 'deer_99', 'deer_100', 'tiger_0', 'tiger_1', 'tiger_2', 'tiger_3', 'tiger_4', 'tiger_5', 'tiger_6', 'tiger_7', 'tiger_8', 'tiger_9', 'tiger_10', 'tiger_11', 'tiger_12', 'tiger_13', 'tiger_14', 'tiger_15', 'tiger_16', 'tiger_17', 'tiger_18', 'tiger_19']
    (pid=34152) agent in action_dict -->  False
    (pid=34152) agent in self.aec_env.dones -->  True
    == Status ==
    Memory usage on this node: 24.8/377.6 GiB
    Using FIFO scheduling algorithm.
    Resources requested: 0/80 CPUs, 0/2 GPUs, 0.0/252.88 GiB heap, 0.0/77.54 GiB objects (0/1.0 GPUType:V100)
    Result logdir: /home/ray_results/Campaign_Tiger-Deer-v1
    Number of trials: 1 (1 ERROR)
    +------------------------------------------+----------+-------+--------+------------------+------+----------+
    | Trial name                               | status   | loc   |   iter |   total time (s) |   ts |   reward |
    |------------------------------------------+----------+-------+--------+------------------+------+----------|
    | PS_PPO_Trainer_Tiger-Deer-v1_41b65_00000 | ERROR    |       |      3 |          3576.35 | 3672 | 0.166667 |
    +------------------------------------------+----------+-------+--------+------------------+------+----------+
    Number of errored trials: 1
    

    If I use the PettingZooEnv version in ray==0.8.7, the error is https://github.com/ray-project/ray/blob/releases/0.8.7/rllib/env/pettingzoo_env.py#L165.

    Lastly, I also applied the following SuperSuit wrappers: pad_observations_v0, pad_action_space_v0, agent_indicator_v0, and flatten_v0, and I'm running PettingZoo==1.3.3 and SuperSuit==2.1.0.

    Thanks.

    opened by jdpena 20
  • MPE Continuous Action support


    I've added support for continuous actions in MPE, as discussed in #249, through a continuous_actions argument (defaulting to False) in the environment config.

    I've tested my changes on all environments using RLLib MADDPG, and run ./release_test.sh. The latter needed some updates as well.

    All environments are working except simple_world_comm_v2, which seems to have a supersuit-related bug.

    opened by Rohan138 19
  • Rps merge


    • Made rps sequential with the max_cycles argument.
    • Merged rpsls into rps; you can choose between the two variants with the lizard_spock argument.

    I am not sure if the rps version needs to be bumped or not. Also, should the env still be called rps despite having merged rpsls into the same env?

    opened by rodrigodelazcano 17
  • Move Pyright to `pre-commit` + add `pydocstyle`


    Description

    This PR moves the Pyright checks from CI only to pre-commit (both local and CI), fixes some typing issues, and also adds pydocstyle to pre-commit.

    Type of change

    • Refactoring/maintenance

    Checklist:

    • [x] I have run the pre-commit checks with pre-commit run --all-files (see CONTRIBUTING.md instructions to set it up)
    • [x] I have run pytest -v and no errors are present.
    • [x] I have commented my code, particularly in hard-to-understand areas
    • [x] I have made corresponding changes to the documentation
    • [x] I solved any possible warnings that pytest -v has generated that are related to my code to the best of my knowledge.
    • [x] New and existing unit tests pass locally with my changes
    opened by kir0ul 16
  • Error running tutorial: 'ProcConcatVec' object has no attribute 'pipes'


    I'm running into an error with this long stack trace when I try to run the 13 line tutorial:

    /Users/erick/.local/share/virtualenvs/rl-0i49mzF7/lib/python3.9/site-packages/torch/utils/tensorboard/__init__.py:4: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead.
      if not hasattr(tensorboard, '__version__') or LooseVersion(tensorboard.__version__) < LooseVersion('1.15'):
    /Users/erick/.local/share/virtualenvs/rl-0i49mzF7/lib/python3.9/site-packages/torch/utils/tensorboard/__init__.py:4: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead.
      if not hasattr(tensorboard, '__version__') or LooseVersion(tensorboard.__version__) < LooseVersion('1.15'):
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/usr/local/Cellar/[email protected]/3.9.12/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/spawn.py", line 116, in spawn_main
        exitcode = _main(fd, parent_sentinel)
      File "/usr/local/Cellar/[email protected]/3.9.12/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/spawn.py", line 125, in _main
        prepare(preparation_data)
      File "/usr/local/Cellar/[email protected]/3.9.12/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/spawn.py", line 236, in prepare
        _fixup_main_from_path(data['init_main_from_path'])
      File "/usr/local/Cellar/[email protected]/3.9.12/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
        main_content = runpy.run_path(main_path,
      File "/usr/local/Cellar/[email protected]/3.9.12/Frameworks/Python.framework/Versions/3.9/lib/python3.9/runpy.py", line 268, in run_path
        return _run_module_code(code, init_globals, run_name,
      File "/usr/local/Cellar/[email protected]/3.9.12/Frameworks/Python.framework/Versions/3.9/lib/python3.9/runpy.py", line 97, in _run_module_code
        _run_code(code, mod_globals, init_globals,
      File "/usr/local/Cellar/[email protected]/3.9.12/Frameworks/Python.framework/Versions/3.9/lib/python3.9/runpy.py", line 87, in _run_code
        exec(code, run_globals)
      File "/Users/erick/dev/rl/main_pettingzoo.py", line 22, in <module>
        env = ss.concat_vec_envs_v1(env, 8, num_cpus=4, base_class="stable_baselines3")
      File "/Users/erick/.local/share/virtualenvs/rl-0i49mzF7/lib/python3.9/site-packages/supersuit/vector/vector_constructors.py", line 60, in concat_vec_envs_v1
        vec_env = MakeCPUAsyncConstructor(num_cpus)(*vec_env_args(vec_env, num_vec_envs))
      File "/Users/erick/.local/share/virtualenvs/rl-0i49mzF7/lib/python3.9/site-packages/supersuit/vector/constructors.py", line 38, in constructor
        return ProcConcatVec(
      File "/Users/erick/.local/share/virtualenvs/rl-0i49mzF7/lib/python3.9/site-packages/supersuit/vector/multiproc_vec.py", line 144, in __init__
        proc.start()
      File "/usr/local/Cellar/[email protected]/3.9.12/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/process.py", line 121, in start
        self._popen = self._Popen(self)
      File "/usr/local/Cellar/[email protected]/3.9.12/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/context.py", line 224, in _Popen
        return _default_context.get_context().Process._Popen(process_obj)
      File "/usr/local/Cellar/[email protected]/3.9.12/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/context.py", line 284, in _Popen
        return Popen(process_obj)
      File "/usr/local/Cellar/[email protected]/3.9.12/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/popen_spawn_posix.py", line 32, in __init__
        super().__init__(process_obj)
      File "/usr/local/Cellar/[email protected]/3.9.12/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/popen_fork.py", line 19, in __init__
        self._launch(process_obj)
      File "/usr/local/Cellar/[email protected]/3.9.12/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/popen_spawn_posix.py", line 42, in _launch
        prep_data = spawn.get_preparation_data(process_obj._name)
      File "/usr/local/Cellar/[email protected]/3.9.12/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/spawn.py", line 154, in get_preparation_data
        _check_not_importing_main()
      File "/usr/local/Cellar/[email protected]/3.9.12/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/spawn.py", line 134, in _check_not_importing_main
        raise RuntimeError('''
    RuntimeError: 
            An attempt has been made to start a new process before the
            current process has finished its bootstrapping phase.
    
            This probably means that you are not using fork to start your
            child processes and you have forgotten to use the proper idiom
            in the main module:
    
                if __name__ == '__main__':
                    freeze_support()
                    ...
    
            The "freeze_support()" line can be omitted if the program
            is not going to be frozen to produce an executable.
    Exception ignored in: <function ProcConcatVec.__del__ at 0x112cf1310>
    Traceback (most recent call last):
      File "/Users/erick/.local/share/virtualenvs/rl-0i49mzF7/lib/python3.9/site-packages/supersuit/vector/multiproc_vec.py", line 210, in __del__
        for pipe in self.pipes:
    AttributeError: 'ProcConcatVec' object has no attribute 'pipes'
    
    

    Maybe the issue here is some mismatch in library versioning, but I found no reference to which supersuit version is supposed to run with the tutorial (or with the rest of the code).

    I am running python 3.9 with supersuit 3.4 and pettingzoo 1.18.1
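
    For reference, the RuntimeError above is Python's standard spawn-start guard: on macOS, multiprocessing spawns worker processes, so anything that launches them (such as concat_vec_envs_v1 with num_cpus > 0) has to run under an if __name__ == "__main__": block. A hypothetical layout for main_pettingzoo.py would be:

    import supersuit as ss
    from pettingzoo.butterfly import pistonball_v6  # adjust the env version to your install


    def make_vec_env():
        env = pistonball_v6.parallel_env()
        env = ss.pettingzoo_env_to_vec_env_v1(env)
        # worker processes are only started from inside the __main__ guard below
        return ss.concat_vec_envs_v1(env, 8, num_cpus=4, base_class="stable_baselines3")


    if __name__ == "__main__":
        vec_env = make_vec_env()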

    opened by erickrf 16
  • Import env errors


    Hello,

    I am new to MARL and want to create a custom parallel env in PettingZoo to play with. I am getting the following import error when attempting to import the environment.

    from pettingzoo.butterfly import pistonball_v4
    env = pistonball_v4.env()

    ImportError: cannot import name 'pistonball_v4' from 'pettingzoo.butterfly'
    PS C:\Users\mvadr\Desktop\gym>

    bug 
    opened by mvadrev 13
  • Hanabi Integration


    • Hanabi, integrated from official repo as git submodule.

    • See hanabi/README.md for how to pull and set up the git submodule.

    • See documentation within class for full information.

    • Debugged and E2E tested with unittest, including api test as part of the unit tests.

    • CHANGE REQUEST: Add typing annotation in utils/env.py, see last commit.

    enhancement 
    opened by dissendahl 13
  • [Bug Report] cannot import name 'Renderer' from 'magent'


    If you are submitting a bug report, please fill in the following details and use the tag [bug].

    Describe the bug: I tried to change the environment based on this script. The original script used waterworld, which has a bug, so I changed it to magent.battle. The output is shown below:

    
    Traceback (most recent call last):
      File "rlpe.py", line 4, in <module>
        from pettingzoo.magent import battle_v4
      File "/anaconda3/envs/PettingZoo/lib/python3.7/site-packages/pettingzoo/magent/__init__.py", line 5, in __getattr__
        return deprecated_handler(env_name, __path__, __name__)
      File "/anaconda3/envs/PettingZoo/lib/python3.7/site-packages/pettingzoo/utils/deprecated_module.py", line 52, in deprecated_handler
        spec.loader.exec_module(module)
      File "/anaconda3/envs/PettingZoo/lib/python3.7/site-packages/pettingzoo/magent/battle_v4.py", line 1, in <module>
        from .battle.battle import env, parallel_env, raw_env  # noqa: F401
      File "/anaconda3/envs/PettingZoo/lib/python3.7/site-packages/pettingzoo/magent/battle/battle.py", line 7, in <module>
        from pettingzoo.magent.magent_env import magent_parallel_env, make_env
      File "/anaconda3/envs/PettingZoo/lib/python3.7/site-packages/pettingzoo/magent/magent_env.py", line 4, in <module>
        from magent import Renderer
    ImportError: cannot import name 'Renderer' from 'magent' (/anaconda3/envs/PettingZoo/lib/python3.7/site-packages/magent/__init__.py)
    

    Code example

    from ray import tune
    from ray.tune.registry import register_env
    from ray.rllib.env.wrappers.pettingzoo_env import PettingZooEnv
    from pettingzoo.magent import battle_v4
    
    # Based on code from github.com/parametersharingmadrl/parametersharingmadrl
    
    if __name__ == "__main__":
        # RDQN - Rainbow DQN
        # ADQN - Apex DQN
        def env_creator(args):
        return PettingZooEnv(battle_v4.env(map_size=45, minimap_mode=False, step_reward=-0.005,
                                           dead_penalty=-0.1, attack_penalty=-0.1, attack_opponent_reward=0.2,
                                           max_cycles=1000, extra_features=False))
    
        env = env_creator({})
        register_env("battle", env_creator)
    
        tune.run(
            "APEX_DDPG",
            stop={"episodes_total": 60000},
            checkpoint_freq=10,
            config={
            # Environment specific
                "env": "battle",
                # General
                "num_gpus": 1,
                "num_workers": 2,
                # Method specific
                "multiagent": {
                    "policies": set(env.agents),
                    "policy_mapping_fn": (lambda agent_id, episode, **kwargs: agent_id),
                },
            },
        )
    

    System Info

    • OS: Linux (conda and pip)
    • Python 3.7.9
    • PettingZoo 1.20.1
    • SuperSuit 3.5.0
    • gym 0.22.0

    Checklist

    • [x] I have checked that there is no similar issue in the repo (required)
    help wanted 
    opened by hellofinch 12
  • Refactor Pytest multiple calls to a single call


    Description

    This PR fixes issue https://github.com/Farama-Foundation/PettingZoo/issues/719. The main changes are the following:

    • Removed release_test.sh
    • Made corresponding changes to the docs
    • Bumped GH actions versions
    • Updated the tests section in setup.py

    Type of change

    • Refactoring
    • This change requires a documentation update

    Checklist:

    • [x] I have run the pre-commit checks with pre-commit run --all-files (see CONTRIBUTING.md instructions to set it up)
    • [x] I have run ./release_test.sh and no errors are present.
    • [x] I have commented my code, particularly in hard-to-understand areas
    • [x] I have made corresponding changes to the documentation
    • [x] I solved any possible warnings that ./release_test.sh has generated that are related to my code to the best of my knowledge.
    opened by kir0ul 12
  • How to set black_death to true


    Hi, I have a custom PettingZoo Parallel environment and want to train a shared policy on multiple agents. I want to be able to have a few of my agents inactive during a training episode under certain conditions, and I also want to bring agents back during an episode. I call pettingzoo_env_to_vec_env_v1 and concat_vec_envs_v1 on my environment to allow for a shared policy and training on copies of the environment.

    I also use the wrappers black_death_v3, pad_observations_v0 and pad_action_space_v0 to allow for agent death but I still get an error message "environment has agent death. Not allowed for pettingzoo_env_to_vec_env_v1 unless black_death is true". How do I set this? So in principle is it possible to have self.possible_agents as a sort of agent pool to take agents from and then have self.agents as the active agents during an episode?

    best regards

    Axel

    question 
    opened by Axelskool 12
  • Chess Environment Trailing History


    I was trying to read through and understand the chess environment before attempting to create my own environment for MARL problem. I was reading through the step function and think that the environment is archiving a trailing history.

    def step(self, action):
        if self.dones[self.agent_selection]:
            return self._was_done_step(action)
        current_agent = self.agent_selection
        current_index = self.agents.index(current_agent)
        next_board = chess_utils.get_observation(self.board, current_agent)
        self.board_history = np.dstack((next_board[:, :, 7:], self.board_history[:, :, :-13]))
        ....
    

    When the step function is first called by the white pieces, the starting board is added to the board_history as it is run prior to any move being made. When the observe function is then run by the black pieces, the observation would fail to include the first move made by white pieces. Is there anything I am missing?

    bug 
    opened by 01jongmin 12
  • [Bug Report] AttributeError: accessing private attribute '_cumulative_rewards' is prohibited


    Describe the bug

    When creating a wrapper around an environment and overriding only the seed method, calling the .last() method of the wrapped environment gives: AttributeError: accessing private attribute '_cumulative_rewards' is prohibited. It seems the error only happens if the environment has an observation space with spaces.Sequence in it, as this does not happen on other pettingzoo environments and spaces that I tested.

    I managed to re-create the issue on this colab notebook, I have also attached the same code below.

    Code example

    from typing import Optional, Any, Dict, List
    
    from gymnasium import Space, spaces
    from pettingzoo import AECEnv
    from pettingzoo.utils import BaseWrapper
    
    
    class TestEnv(AECEnv):
        metadata = {'render.modes': ['ansi'], 'name': 'test_env'}
    
        def __init__(self, seed: Optional[int] = None):
            super().__init__()
            self.n_cards = 52
            self.cards = []
    
            self.possible_agents: List[str] = ["player_0"]
            self.agents = self.possible_agents.copy()
    
            self.action_spaces = {agent: spaces.Discrete(self.n_cards) for agent in self.agents}
            # The bug seems to be when using spaces.Sequence
            self.observation_spaces = {agent: spaces.Sequence(spaces.Discrete(self.n_cards)) for agent in self.agents}
            self.infos = {i: {} for i in self.agents}
    
            self.reset(seed)
    
        def reset(self, seed: Optional[int] = None, return_info: bool = False, options: Optional[Dict] = None) -> None:
            self.cards = []
    
            self.agents = self.possible_agents.copy()
            self.rewards = {agent: 0 for agent in self.agents}
            self._cumulative_rewards = {agent: 0 for agent in self.agents}
            self.truncations = {i: False for i in self.agents}
    
        def seed(self, seed: Optional[int] = None) -> None:
            pass
    
        def observation_space(self, agent: str) -> Space:
            return self.observation_spaces[agent]
    
        def action_space(self, agent: str) -> Space:
            return self.action_spaces[agent]
    
        @property
        def agent_selection(self) -> str:
            return self.agents[0]
    
        @property
        def terminations(self) -> Dict[str, bool]:
            return dict([(agent, False) for agent in self.agents])
    
        def observe(self, agent: str) -> List[Any]:
            return self.cards
    
        def step(self, action: int) -> None:
            assert action in self.action_spaces[self.agent_selection]
            self.cards.append(action)
    
        def render(self) -> str:
            return self.cards.__repr__()
    
        def state(self):
            pass
    
        def close(self):
            super().close()
    
    
    class TestEnvWrapper(BaseWrapper):
        def seed(self, seed: Optional[int] = None) -> None:
            pass
    
    
    
    # The env works
    print(TestEnv().last())
    
    # The wrapper has an error
    TestEnvWrapper(TestEnv()).last()
    

    System info

    Pettingzoo version: 1.22.3

    Additional context

    No response

    Checklist

    • [X] I have checked that there is no similar issue in the repo
    bug 
    opened by LetteraUnica 0
  • Fix typo: `BaseParallelWraper` renamed to `BaseParallelWrapper`


    Description

    As title, fix typo of BaseParallelWrapper (previously BaseParallelWraper). Linked to its SuperSuit counterpart https://github.com/Farama-Foundation/SuperSuit/pull/204.

    Type of change

    • Bug fix (non-breaking change which fixes an issue)

    Checklist:

    • [x] I have run the pre-commit checks with pre-commit run --all-files (see CONTRIBUTING.md instructions to set it up)
    • [x] I have run pytest -v and no errors are present.
    • [x] I have commented my code, particularly in hard-to-understand areas
    • [ ] I have made corresponding changes to the documentation
    • [ ] I solved any possible warnings that pytest -v has generated that are related to my code to the best of my knowledge.
    • [ ] I have added tests that prove my fix is effective or that my feature works
    • [ ] New and existing unit tests pass locally with my changes
    opened by mikcnt 4
  • Fixed: black screen rendering for MPE env in rgb_array mode


    Description

    Fixes #864 by drawing on pygame screen before extracting the image for rgb_array mode. Only calls pygame.surfarray.pixels3d when needed.

    Not sure if a new unit test is needed; I can add one if necessary!

    Type of change

    • Bug fix (non-breaking change which fixes an issue)

    Screenshots

    Before and after screenshots of the rendering fix were attached to the PR (images not reproduced here).

    Checklist:

    • [x] I have run the pre-commit checks with pre-commit run --all-files (see CONTRIBUTING.md instructions to set it up)
    • [x] I have run pytest -v and no errors are present.
    • [ ] I have commented my code, particularly in hard-to-understand areas
    • [ ] I have made corresponding changes to the documentation
    • [ ] I solved any possible warnings that pytest -v has generated that are related to my code to the best of my knowledge.
    • [ ] I have added tests that prove my fix is effective or that my feature works
    • [ ] New and existing unit tests pass locally with my changes
    opened by cibeah 0
  • [Proposal] Support `python3.11`


    Proposal

    support python3.11

    Motivation

    3.11 is the latest stable Python release.

    Gymnasium already supports it.

    We need it for Gymnasium-Robotics/MaMuJoCo to support 3.11.

    Pitch

    No response

    Alternatives

    No response

    Additional context

    No response

    Checklist

    • [X] I have checked that there is no similar issue in the repo
    enhancement 
    opened by Kallinteris-Andreas 1
  • [Proposal] Improve AssertOutOfBoundsWrapper to support MultiDiscrete spaces


    Proposal

    When using this wrapper in a piece of code, I noticed that it only works for Discrete spaces from gym. However, extending it to MultiDiscrete spaces too is extremely straightforward: all that is needed is to add a case to the assert check on line 11 in assert_out_of_bounds.py:

    assert all(
        (isinstance(self.action_space(agent), Discrete) or isinstance(self.action_space(agent), MultiDiscrete))
        for agent in getattr(self, "possible_agents", [])
    ), "should only use AssertOutOfBoundsWrapper for Discrete spaces"

    MultiDiscrete spaces have the same contains() method as Discrete ones (used in the step() method), so no further adjustment would be needed.

    Motivation

    MultiDiscrete spaces are supported by many stable-baselines algorithms, and are useful in many situations.

    Pitch

    The wrapper is useful in many situations, and having it work with this kind of space as well would be really good.

    Alternatives

    No response

    Additional context

    No response

    Checklist

    • [X] I have checked that there is no similar issue in the repo
    enhancement 
    opened by opocaj92 0
Releases(1.22.3)
  • 1.22.3(Dec 28, 2022)

    What's Changed

    • Waterworld_v4: Fixed incorrect pursuer being selected when adding rewards by @TheMikeste1 in https://github.com/Farama-Foundation/PettingZoo/pull/855
    • Remove AEC diagrams from environments pages by @mgoulao in https://github.com/Farama-Foundation/PettingZoo/pull/856
    • Add information about MAgent2 standalone package by @dsctt in https://github.com/Farama-Foundation/PettingZoo/pull/857
    • Switch flake8 from gitlab to github by @RedTachyon in https://github.com/Farama-Foundation/PettingZoo/pull/858
    • workflow fix by @WillDudley in https://github.com/Farama-Foundation/PettingZoo/pull/867
    • Versioning by @mgoulao in https://github.com/Farama-Foundation/PettingZoo/pull/865
    • Bump pillow from 9.2.0 to 9.3.0 in /tutorials/Ray by @dependabot in https://github.com/Farama-Foundation/PettingZoo/pull/859
    • https://github.com/Farama-Foundation/PettingZoo/security/dependabot/3 by @WillDudley in https://github.com/Farama-Foundation/PettingZoo/pull/869
    • wrappers documentation: get to_parallel from utils.conversions by @AndrewRWilliams in https://github.com/Farama-Foundation/PettingZoo/pull/870
    • support Python 3.11 by @WillDudley in https://github.com/Farama-Foundation/PettingZoo/pull/872

    New Contributors

    • @TheMikeste1 made their first contribution in https://github.com/Farama-Foundation/PettingZoo/pull/855
    • @dependabot made their first contribution in https://github.com/Farama-Foundation/PettingZoo/pull/859
    • @AndrewRWilliams made their first contribution in https://github.com/Farama-Foundation/PettingZoo/pull/870

    Full Changelog: https://github.com/Farama-Foundation/PettingZoo/compare/1.22.2...1.22.3

  • 1.22.2(Nov 11, 2022)

    What's Changed

    • Fixing for Issue #840 by @BolunDai0216 in https://github.com/Farama-Foundation/PettingZoo/pull/841
    • fix #845 by @WillDudley in https://github.com/Farama-Foundation/PettingZoo/pull/846
    • added GitHub Issue Forms as proposed in #844 by @tobirohrer in https://github.com/Farama-Foundation/PettingZoo/pull/848
    • Changed reward logic in Waterworld by @BolunDai0216 in https://github.com/Farama-Foundation/PettingZoo/pull/843
    • ENH: add gui to chess by @younik in https://github.com/Farama-Foundation/PettingZoo/pull/842
    • change paradigm of parallel api loops by @WillDudley in https://github.com/Farama-Foundation/PettingZoo/pull/847
    • Remove magent docs artifacts by @dsctt in https://github.com/Farama-Foundation/PettingZoo/pull/850
    • tianshou tuts fixed by @WillDudley in https://github.com/Farama-Foundation/PettingZoo/pull/852
    • Overhaul env creation guide by @WillDudley in https://github.com/Farama-Foundation/PettingZoo/pull/838
    • Fix: set render_mode in tianshou tutorials (#853) by @RaffaeleGalliera in https://github.com/Farama-Foundation/PettingZoo/pull/854

    New Contributors

    • @tobirohrer made their first contribution in https://github.com/Farama-Foundation/PettingZoo/pull/848
    • @RaffaeleGalliera made their first contribution in https://github.com/Farama-Foundation/PettingZoo/pull/854

    Full Changelog: https://github.com/Farama-Foundation/PettingZoo/compare/1.22.1...1.22.2

  • 1.22.1(Oct 25, 2022)

    What's Changed

    • Docs Update 2 by @mgoulao in https://github.com/Farama-Foundation/PettingZoo/pull/817
    • Remove average total reward from environments pages by @mgoulao in https://github.com/Farama-Foundation/PettingZoo/pull/821
    • Rename core to AEC by @mgoulao in https://github.com/Farama-Foundation/PettingZoo/pull/822
    • Add Google Analytics tag by @mgoulao in https://github.com/Farama-Foundation/PettingZoo/pull/825
    • Remove MAgent content by @dsctt in https://github.com/Farama-Foundation/PettingZoo/pull/823
    • This updates the setup and other things by @jjshoots in https://github.com/Farama-Foundation/PettingZoo/pull/829
    • Wd/tutorials ci by @WillDudley in https://github.com/Farama-Foundation/PettingZoo/pull/831
    • update logo, favicon, and fix broken links by @mgoulao in https://github.com/Farama-Foundation/PettingZoo/pull/832
    • Wd/tutorials ci by @WillDudley in https://github.com/Farama-Foundation/PettingZoo/pull/833
    • Bug fixes for Issue #818 by @BolunDai0216 in https://github.com/Farama-Foundation/PettingZoo/pull/836
    • Update docs by @mgoulao in https://github.com/Farama-Foundation/PettingZoo/pull/837
    • Fix ref to Gymnasium by @dsctt in https://github.com/Farama-Foundation/PettingZoo/pull/839

    Full Changelog: https://github.com/Farama-Foundation/PettingZoo/compare/1.22.0...1.22.1

  • 1.22.0(Oct 7, 2022)

    Major API change: done -> termination and truncation, matching Gymnasium's new API. The dependency gym has been switched to gymnasium, which is maintained.

    What's Changed

    • replace 'done' with 'termination, truncation' new logic by @5cat in https://github.com/Farama-Foundation/PettingZoo/pull/802
    • Update new Render API by @younik in https://github.com/Farama-Foundation/PettingZoo/pull/800
    • fix wrapper unwrapped thing by @jjshoots in https://github.com/Farama-Foundation/PettingZoo/pull/808
    • Update Sphinx theme by @mgoulao in https://github.com/Farama-Foundation/PettingZoo/pull/804
    • Update security permissions for GitHub workflows by @andrewtanJS in https://github.com/Farama-Foundation/PettingZoo/pull/809
    • Updated Waterworld by @BolunDai0216 in https://github.com/Farama-Foundation/PettingZoo/pull/807
    • Docs API Update by @WillDudley in https://github.com/Farama-Foundation/PettingZoo/pull/810
    • Gymnasium dep by @WillDudley in https://github.com/Farama-Foundation/PettingZoo/pull/814

    New Contributors

    • @5cat made their first contribution in https://github.com/Farama-Foundation/PettingZoo/pull/802
    • @younik made their first contribution in https://github.com/Farama-Foundation/PettingZoo/pull/800

    Full Changelog: https://github.com/Farama-Foundation/PettingZoo/compare/1.21.0...1.22.0

  • 1.21.0(Sep 24, 2022)

    What's Changed

    1. As part of the Gym update to 0.26, the following change has been made:
      • done -> termination and truncation: The singular done signal has been changed to a termination and truncation signal, where termination dictates that the environment has ended due to meeting certain conditions, and truncation dictates that the environment has ended due to exceeding the time/frame limit.
    2. Butterfly/Prospector, Classic/Mahjong, Classic/Doudizhu, Classic/Backgammon, and Classic/Checkers have been pulled.
    3. Some QOL improvements for development, such as moving pyright to pre-commit and enforcing pydocstyle.
    4. Massive website upgrade.

    List of Changes

    • Fix concatvecenvs to work in proper process by @jjshoots in https://github.com/Farama-Foundation/PettingZoo/pull/763
    • Fix RPS render issue by @jjshoots in https://github.com/Farama-Foundation/PettingZoo/pull/776
    • Reduces the number of warnings by @pseudo-rnd-thoughts in https://github.com/Farama-Foundation/PettingZoo/pull/777
    • Pull Prospector by @jjshoots in https://github.com/Farama-Foundation/PettingZoo/pull/779
    • Remove a bunch of envs by @jjshoots in https://github.com/Farama-Foundation/PettingZoo/pull/781
    • Potentially fix 13 lines so it doesn't take >6 hours to run by @jjshoots in https://github.com/Farama-Foundation/PettingZoo/pull/782
    • Move Pyright to pre-commit + add pydocstyle by @kir0ul in https://github.com/Farama-Foundation/PettingZoo/pull/737
    • Update pre-commit config by @pseudo-rnd-thoughts in https://github.com/Farama-Foundation/PettingZoo/pull/787
    • Update docs website to Sphinx by @mgoulao in https://github.com/Farama-Foundation/PettingZoo/pull/780
    • Update links to the new domain by @mgoulao in https://github.com/Farama-Foundation/PettingZoo/pull/789
    • Truncation Update by @jjshoots in https://github.com/Farama-Foundation/PettingZoo/pull/767
    • Update Gym version by @WillDudley in https://github.com/Farama-Foundation/PettingZoo/pull/788
    • More truncation fixes by @jjshoots in https://github.com/Farama-Foundation/PettingZoo/pull/796
    • Update simple_env.py by @FilipinoGambino in https://github.com/Farama-Foundation/PettingZoo/pull/798
    • Docs automation by @mgoulao in https://github.com/Farama-Foundation/PettingZoo/pull/790

    New Contributors

    • @pseudo-rnd-thoughts made their first contribution in https://github.com/Farama-Foundation/PettingZoo/pull/777
    • @mgoulao made their first contribution in https://github.com/Farama-Foundation/PettingZoo/pull/780

    Full Changelog: https://github.com/Farama-Foundation/PettingZoo/compare/1.20.1...1.21.0

  • 1.20.1(Aug 7, 2022)

    Refer to the previous version (1.20.0) for the list of changes; this version simply exists due to technical problems with publishing to PyPI.

  • 1.20.0(Aug 7, 2022)

    What's Changed

    • Black to pass master CI by @jjshoots in https://github.com/Farama-Foundation/PettingZoo/pull/734
    • Refactor Pytest multiple calls to a single call by @kir0ul in https://github.com/Farama-Foundation/PettingZoo/pull/731
    • Change MPE rendering to use Pygame instead of pyglet #732 by @WillDudley in https://github.com/Farama-Foundation/PettingZoo/pull/738
    • Remove pyglet and buggy dependency in Multiwalker by @jjshoots in https://github.com/Farama-Foundation/PettingZoo/pull/739
    • Add EzPickle to MPE by @jjshoots in https://github.com/Farama-Foundation/PettingZoo/pull/741
    • Fix incorrect observation dict key in documentation by @bkrl in https://github.com/Farama-Foundation/PettingZoo/pull/743
    • Temporarily remove waterworld from tests and also disable environment. by @jjshoots in https://github.com/Farama-Foundation/PettingZoo/pull/749
    • Fix broken atari environment path by @manu-hoffmann in https://github.com/Farama-Foundation/PettingZoo/pull/747
    • Complete Atari Envs path fix by @jjshoots in https://github.com/Farama-Foundation/PettingZoo/pull/750
    • Correct MPE requirements to reflect PR #738 by @WillDudley in https://github.com/Farama-Foundation/PettingZoo/pull/748

    New Contributors

    • @WillDudley made their first contribution in https://github.com/Farama-Foundation/PettingZoo/pull/738
    • @bkrl made their first contribution in https://github.com/Farama-Foundation/PettingZoo/pull/743
    • @manu-hoffmann made their first contribution in https://github.com/Farama-Foundation/PettingZoo/pull/747

    Full Changelog: https://github.com/Farama-Foundation/PettingZoo/compare/1.19.1...1.20.0

  • 1.19.1(Jun 21, 2022)

  • 1.19.0(Jun 21, 2022)

  • 1.18.1(Apr 29, 2022)

    • Massive overhaul to Knight Archers Zombies, version bumped
    • Changed Atari games to use minimal observation space by default, all versions bumped
    • Large bug fix to all MAgent environments, versions bumped
    • MAgent environments now have Windows binaries
    • Removed Prison environment
    • Multiwalker bug fix, version bumped
    • Large number of test fixes
    • Removed manual_control in favor of a new manual_policy method
    • Converted the seed method to an argument of reset to match the new Gym API

    (The PettingZoo 1.18.0 release never existed due to technical issues)

  • 1.17.0(Mar 15, 2022)

    • Changed metadata naming scheme to match gym. In particular render.modes -> render_modes and video.frames_per_second -> render_fps
    • Fixed bad pettingzoo import error messages caused by auto-deprecation logic
  • 1.16.0(Mar 5, 2022)

    • KAZ: Code rewrite and numerous fixes, added manual control capability
    • Supports changes to seeding in gym 0.22.0
    • Fixed prison state space, bumped version
    • Fixed battlefield state space
    • Increased default cycles in api tests (making them catch more errors than before)
    • Added turn-based to parallel wrapper
    • Moved magent render logic to Magent repo
  • 1.15.0(Jan 28, 2022)

    • Bug fixes to KAZ, pistonball, multiwalker, cooperative pong. Versions bumped.
    • Removed logo from gather, version bumped.
    • Added FPS attribute to all environments to make rendering easier.
    • Multiwalker now uses pygame instead of pyglet for rendering
    • Renamed to_parallel and from_parallel to aec_to_parallel and parallel_to_aec
    • Added is_parallelizable metadata to ensure that the aec_to_parallel wrapper is not misused
    • Fixed the API tests to better support agent generation
  • 1.14.0(Dec 5, 2021)

    • Bug fixes and partial redesign to pursuit environment logic and rendering. Environment is now learnable, version bumped
    • Bug fixes and reward function redesign for cooperative pong environment, version bumped
    • Ball moving into the left column due to physics engine imprecision in pistonball no longer gives additional reward, version bumped
    • PyGame version bumped, no environment version bumps needed
    • Python 3.10 support
    • Fixed parallel API tests to allow environments without possible_agents

  • 1.13.1(Oct 19, 2021)

    • Fixed unnecessary warnings generated about observation and action spaces
    • Upstreamed new rlcard version with new texas holdem no limit implementation, bumped version to v6
    • Updated python chess dependency, bumped version to v5
    • Dropped support for python 3.6, added official support for 3.9
    • Various documentation fixes
  • 1.12.0(Oct 8, 2021)

    • API changes
      • new observation_space(agent), action_space(agent) methods that retrieve the static space for an agent
      • possible_agents, observation_spaces, action_spaces attributes made optional. Wrappers pass these attributes through if they exist.
      • parallel environment's agents list contains agents to take next step, instead of agents that took previous step.
      • Generated agents are now allowed; agents can be created at any time during an episode. Note that agents cannot resurrect: once they are done, they cannot be re-added to the environment.
    • Fixed unexpected behavior with close method in pursuit environment
    • Removed pygame loading messages
    • Fix pillow dependency issue
    • Removed local ratio arg from pistonball environment
    • Gym 0.21.0 support
    • Better code formatting (isort, etc.)
  • 1.11.1(Aug 19, 2021)

    • Fix scipy and pyglet dependencies for sisl environments
    • Fix pistonball rendering (no version bumps)
    • Update rlcard to v1.0.4 with a fix for texas hold'em no limit; bump version
  • 1.11.0(Aug 2, 2021)

    • Upgraded to RLCard 1.0.3, bumped all versions. Also added support for num_players in RLCard-based environments which can have variable numbers of players.
    • Fixed Go and Chess observation spaces, bumped versions
    • Minor Go rendering fix
    • Fix PyGame dependency in classic (used for rendering)
    • Fixed images being loaded into slow PyGame data structures, resulting in substantial speedups in certain Butterfly games (no version bump needed)
    • Fix odd cache problem using RGB rendering in cooperative pong
    • Misc fixes to tests and warning messages

  • 1.10.0(Jul 17, 2021)

    • Added continuous action support for MPE environments as an argument
    • Added pixel art rendering for Texas Hold'em No Limit, Rock Paper Scissors and Go
    • Fixed pixel art rendering in Connect Four
    • Fixed bug in order of black/white pieces in Go observation space, bumped version
    • Changed observation in cooperative pong to include entire screen, bumped version
  • 1.9.0(Jun 12, 2021)

    • Created no action timer for pong to encourage players to serve (before there was no penalty to stalling the game forever). Bumped version of all pong environments (pong, basketball_pong, volleyball_pong, foozpong, quadrapong)
    • Fixed Multiwalker collision bug, bumped version
    • Add state method to Magent and MPE
    • Merged rock paper scissors and rock paper scissors lizard spock into a single environment that takes the number of actions as an argument, and adds the n_cycles argument to allow for a single game to be sequential. Bumped version
    • Removed deprecated env_done method
    • Fixed order of channels in combined_arms observation
    • Added pixel art based RGB rendering to connect four. This will also be added to rock paper scissors, Go and Texas Holdem in upcoming releases
    • Moved pettingzoo CI test files outside of the repo
    • Changed max cycles test to be more robust under agent death
  • 1.8.2(May 14, 2021)

  • 1.8.1(Apr 16, 2021)

  • 1.8.0(Apr 4, 2021)

    • Fixed arbitrary calls to observe() in classic games (especially tictactoe and connect 4)
    • Fixed documentation for tictactoe and pistonball
  • 1.7.0(Mar 27, 2021)

  • 1.6.1(Mar 8, 2021)

    Minor miscellaneous fixes and small feature additions:

    • Added .unwrapped
    • Minor fix to from_parallel
    • removed warning from close()
    • fixed random demo
    • fixed prison manual control
  • 1.6.0(Feb 21, 2021)

    • Changed default values of max_cycles in pistonball, prison, prospector
    • Changed pistonball default mode to continuous and changed default value for local_ratio
    • Refactored externally facing tests and utils
    • Bumped pymunk version to 6.0.0 and bumped versions of all environments which depend on pymunk
    • Added state() and state_space to API, implemented methods in butterfly environments
    • Various small bug fixes in butterfly environments.
    • Documentation updates.
  • 1.5.2(Jan 29, 2021)

    Fix miscellaneous annoying loading messages for butterfly environments. Improvements to save_obs and related functionality. Fixes to KAZ.

  • 1.5.1(Jan 13, 2021)

    Fixes MPE rendering dependency, fixes minor left over dependencies on six, fixes issues when pickling Pistonball. No versions were bumped.

  • 1.5.0(Jan 5, 2021)

    • Refactored tests to be generally usable by third party environments
    • Added average reward calculating util, and made minor improvements to random_demo and save_obs utils
    • Removed black death argument from KAZ (it's now a wrapper in supersuit)
    • Redid how illegal actions are handled in classic, by making observations dictionaries where one element is the observation and the other is a proper illegal action mask
    • Pistonball was refactored for readability, to run faster and to allow the number of pistons to be varied via argument
    • Waterworld was completely refactored with various major fixes
    • RLCard version was bumped (and includes bug fixes impacting environments)
    • MAgent rendering looks much better now (versions not bumped)
    • Major bug in the observation space of pursuit is fixed
    • Added Python 3.9 support
    • Updated Gym version
    • Fixed multiwalker observation space, for good this time, and made large improvements to code quality
    • Removed NaN wrapper

  • 1.4.2(Nov 26, 2020)

    • Fixed Pistonball reward and miscellaneous problems
    • Fixed KAZ observation and rendering issues
    • Fixed Cooperative Pong issues with rendering
    • Fixed default parameters in Hanabi
    • Fixed multiwalker rewards, added arguments
    • Changed combined_arms observation and rewards, and tiger_deer rewards
    • Added more arguments to all MAgent environments

Owner
Farama Foundation
The Farama Foundation is a host organization for the development of open source reinforcement learning software