Python Multi-Agent Reinforcement Learning framework

Related tags

Deep Learning pymarl
Overview
- Please pay attention to the version of SC2 you are using for your experiments. 
- Performance is *not* always comparable between versions. 
- The results reported in the SMAC paper (https://arxiv.org/abs/1902.04043) use SC2.4.6.2.69232, not SC2.4.10.

Python MARL framework

PyMARL is WhiRL's framework for deep multi-agent reinforcement learning and includes implementations of the following algorithms:

- QMIX
- COMA
- VDN
- IQL
- QTRAN

PyMARL is written in PyTorch and uses SMAC as its environment.

Installation instructions

Build the Docker image from the Dockerfile using

cd docker
bash build.sh

Set up StarCraft II and SMAC:

bash install_sc2.sh

This will download SC2 into the 3rdparty folder and copy the maps necessary for the experiments.
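
If StarCraft II is installed somewhere other than the 3rdparty folder, SMAC locates the game through the SC2PATH environment variable, so point it at your install (a sketch assuming the standard SMAC convention; the path below is a placeholder):

export SC2PATH=/path/to/StarCraftII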

The requirements.txt file can be used to install the necessary packages into a virtual environment (not recommended).

Run an experiment

python3 src/main.py --config=qmix --env-config=sc2 with env_args.map_name=2s3z

The config files act as defaults for an algorithm or environment.

They are all located in src/config:

- --config refers to the config files in src/config/algs
- --env-config refers to the config files in src/config/envs
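
Any other configuration option can be overridden from the command line using the same with syntax. A sketch (t_max is assumed here to name the total-timestep budget in the default config; the value is purely illustrative):

python3 src/main.py --config=qmix --env-config=sc2 with env_args.map_name=3s5z t_max=2000000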

To run experiments using the Docker container:

bash run.sh $GPU python3 src/main.py --config=qmix --env-config=sc2 with env_args.map_name=2s3z

All results will be stored in the results folder.

The previous config files used for the SMAC Beta have the suffix _beta.

Saving and loading learnt models

Saving models

You can save the learnt models to disk by setting save_model = True, which is False by default. The frequency of saving can be adjusted using the save_model_interval configuration option. Models are saved in the results directory, under a folder called models. The directory corresponding to each run contains models saved throughout the experiment, each within a folder named after the number of timesteps elapsed since the start of learning.
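
For example, the following enables saving every 200,000 timesteps (a sketch; the interval value is illustrative):

python3 src/main.py --config=qmix --env-config=sc2 with env_args.map_name=2s3z save_model=True save_model_interval=200000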

Loading models

Learnt models can be loaded using the checkpoint_path parameter, after which learning proceeds from the corresponding timestep.
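
For example (a sketch; the checkpoint path is a placeholder for one of your own runs under results/models):

python3 src/main.py --config=qmix --env-config=sc2 with env_args.map_name=2s3z checkpoint_path=results/models/qmix__example_run/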

Watching StarCraft II replays

The save_replay option allows saving replays of models that are loaded using checkpoint_path. Once the model is successfully loaded, test_nepisode episodes are run in test mode and a .SC2Replay file is saved in the Replay directory of StarCraft II. Please make sure to use the episode runner if you wish to save a replay, i.e., runner=episode. The name of the saved replay file starts with the given env_args.save_replay_prefix (map_name if empty), followed by the current timestamp.
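
Putting these options together, a sketch that loads a checkpoint, runs test episodes, and saves a replay (again, the checkpoint path is a placeholder):

python3 src/main.py --config=qmix --env-config=sc2 with env_args.map_name=2s3z runner=episode checkpoint_path=results/models/qmix__example_run/ save_replay=True test_nepisode=5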

The saved replays can be watched by double-clicking on them or using the following command:

python -m pysc2.bin.play --norender --rgb_minimap_size 0 --replay NAME.SC2Replay

Note: Replays cannot be watched using the Linux version of StarCraft II. Please use either the Mac or Windows version of the StarCraft II client.

Documentation/Support

Documentation is a little sparse at the moment (but will improve!). Please raise an issue in this repo, or email Tabish.

Citing PyMARL

If you use PyMARL in your research, please cite the SMAC paper.

M. Samvelyan, T. Rashid, C. Schroeder de Witt, G. Farquhar, N. Nardelli, T.G.J. Rudner, C.-M. Hung, P.H.S. Torr, J. Foerster, S. Whiteson. The StarCraft Multi-Agent Challenge, CoRR abs/1902.04043, 2019.

In BibTeX format:

@article{samvelyan19smac,
  title = {{The} {StarCraft} {Multi}-{Agent} {Challenge}},
  author = {Mikayel Samvelyan and Tabish Rashid and Christian Schroeder de Witt and Gregory Farquhar and Nantas Nardelli and Tim G. J. Rudner and Chia-Man Hung and Philip H. S. Torr and Jakob Foerster and Shimon Whiteson},
  journal = {CoRR},
  volume = {abs/1902.04043},
  year = {2019},
}

License

Code licensed under the Apache License v2.0

Comments
  • Baselines used in COMA, such as Central-V, IAC-V.

    I tried to implement the baselines used in your paper, such as Central-V and IAC-V, under this project on the 3m map, but I cannot reproduce the results reported in the paper. The following is the training curve of Central-V on the 3m map from my code.

    [image: training curve of Central-V on the 3m map]

    I am not sure how to build the inputs for these baselines, and my training process is not stable.

    If possible, could you kindly share your implementation of these baselines, especially the implementation of the critic function and how the inputs are built? Thank you ;). @jakobnicolaus

    opened by yalidu 12
  • Hi, I have this error in running!

    The error is like this:

    Process Process-3:
    Traceback (most recent call last):
      File "/home/a/anaconda3/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
        self.run()
      File "/home/a/anaconda3/lib/python3.6/multiprocessing/process.py", line 93, in run
        self._target(*self._args, **self._kwargs)
      File "/home/a/COMA/pymarl/src/runners/parallel_runner.py", line 221, in env_worker
        reward, terminated, env_info = env.step(actions)
      File "/home/a/anaconda3/lib/python3.6/site-packages/smac/env/starcraft2/starcraft2.py", line 373, in step
        agent_action = self.get_agent_action(a_id, action)
      File "/home/a/anaconda3/lib/python3.6/site-packages/smac/env/starcraft2/starcraft2.py", line 439, in get_agent_action
        "Agent {} cannot perform action {}".format(a_id, action)
    AssertionError: Agent 0 cannot perform action 0

    Can you help me to solve this? Thanks!

    opened by BCWang93 12
  • fail to build docker image

    BREAK when running build.sh, step 23.

    CODE: RUN pip3 install git+https://github.com/oxwhirl/smac.git

    ERROR:

    Successfully built SMAC absl-py future mpyq portpicker s2protocol ordered-set
    ERROR: sacred 0.7.2 has requirement jsonpickle<1.0,>=0.7.2, but you'll have jsonpickle 1.2 which is incompatible.
    Installing collected packages: future, mpyq, ordered-set, jsonpickle, deepdiff, s2clientprotocol, portpicker, urllib3, certifi, idna, chardet, requests, enum34, websocket-client, sk-video, absl-py, s2protocol, whichcraft, mock, pysc2, SMAC
    Found existing installation: jsonpickle 0.9.6
      Uninstalling jsonpickle-0.9.6:
        Successfully uninstalled jsonpickle-0.9.6
    Successfully installed SMAC-0.1.0b1 absl-py-0.8.0 certifi-2019.9.11 chardet-3.0.4 deepdiff-4.0.7 enum34-1.1.6 future-0.17.1 idna-2.8 jsonpickle-1.2 mock-3.0.5 mpyq-0.2.5 ordered-set-3.1.1 portpicker-1.3.1 pysc2-3.0.0 requests-2.22.0 s2clientprotocol-4.10.3.76114.0 s2protocol-4.10.3.76114.0 sk-video-1.1.10 urllib3-1.25.6 websocket-client-0.56.0 whichcraft-0.6.1
    Traceback (most recent call last):
      File "/usr/local/bin/pip3", line 11, in <module>
        sys.exit(main())
      File "/usr/local/lib/python3.5/dist-packages/pip/_internal/__init__.py", line 77, in main
        return command.main(cmd_args)
      File "/usr/local/lib/python3.5/dist-packages/pip/_internal/cli/base_command.py", line 237, in main
        timeout=min(5, options.timeout)
      File "/usr/local/lib/python3.5/dist-packages/pip/_internal/cli/base_command.py", line 108, in _build_session
        index_urls=self._get_index_urls(options),
      File "/usr/local/lib/python3.5/dist-packages/pip/_internal/download.py", line 559, in __init__
        self.headers["User-Agent"] = user_agent()
      File "/usr/local/lib/python3.5/dist-packages/pip/_internal/download.py", line 170, in user_agent
        setuptools_version = get_installed_version("setuptools")
      File "/usr/local/lib/python3.5/dist-packages/pip/_internal/utils/misc.py", line 1044, in get_installed_version
        working_set = pkg_resources.WorkingSet()
      File "/usr/local/lib/python3.5/dist-packages/pip/_vendor/pkg_resources/__init__.py", line 567, in __init__
        self.add_entry(entry)
      File "/usr/local/lib/python3.5/dist-packages/pip/_vendor/pkg_resources/__init__.py", line 623, in add_entry
        for dist in find_distributions(entry, True):
      File "/usr/local/lib/python3.5/dist-packages/pip/_vendor/pkg_resources/__init__.py", line 1974, in find_eggs_in_zip
        if metadata.has_metadata('PKG-INFO'):
      File "/usr/local/lib/python3.5/dist-packages/pip/_vendor/pkg_resources/__init__.py", line 1414, in has_metadata
        return self._has(path)
      File "/usr/local/lib/python3.5/dist-packages/pip/_vendor/pkg_resources/__init__.py", line 1845, in _has
        return zip_path in self.zipinfo or zip_path in self._index()
      File "/usr/local/lib/python3.5/dist-packages/pip/_vendor/pkg_resources/__init__.py", line 1722, in zipinfo
        return self._zip_manifests.load(self.loader.archive)
      File "/usr/local/lib/python3.5/dist-packages/pip/_vendor/pkg_resources/__init__.py", line 1679, in load
        mtime = os.stat(path).st_mtime
    FileNotFoundError: [Errno 2] No such file or directory: '/usr/local/lib/python3.5/dist-packages/jsonpickle-0.9.6-py3.5.egg'
    The command '/bin/sh -c pip3 install git+https://github.com/oxwhirl/smac.git' returned a non-zero code: 1

    opened by YuiiCh 10
  • the problem of the module sacred

    I ran it on Windows 10, but got the following problem:

    (starcraft) E:\GitProject\pymarl>python src\main.py --config=qmix_smac --env-config=sc2 with env_args.map_name=2s3z
    [INFO 16:19:39] root Saving to FileStorageObserver in results/sacred.
    [DEBUG 16:19:41] pymarl Using capture mode "fd"
    [DEBUG 16:19:41] pymarl Stopping Heartbeat
    Exception originated from within Sacred.
    Traceback (most recent calls):
      File "D:\Downloads\Environment\Anaconda\envs\starcraft\lib\site-packages\sacred\stdout_capturing.py", line 131, in tee_output_fd
        ['tee', '-a', '/dev/stderr'], preexec_fn=os.setsid,
    AttributeError: module 'os' has no attribute 'setsid'

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "D:\Downloads\Environment\Anaconda\envs\starcraft\lib\site-packages\sacred\run.py", line 217, in __call__
        with capture_stdout() as self._output_file:
      File "D:\Downloads\Environment\Anaconda\envs\starcraft\lib\contextlib.py", line 81, in __enter__
        return next(self.gen)
      File "D:\Downloads\Environment\Anaconda\envs\starcraft\lib\site-packages\sacred\stdout_capturing.py", line 136, in tee_output_fd
        except (FileNotFoundError, (OSError, AttributeError)):
    TypeError: catching classes that do not inherit from BaseException is not allowed

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "D:\Downloads\Environment\Anaconda\envs\starcraft\lib\site-packages\sacred\run.py", line 344, in _stop_time
        seconds=round((self.stop_time - self.start_time).total_seconds()))
    TypeError: unsupported operand type(s) for -: 'datetime.datetime' and 'NoneType'

    opened by cjing9017 7
  • Has anyone encountered the problem that QMIX performs very poorly on 3s5z map and 2s3z map?

    I got very poor performance on the 3s5z map, with a 0 win rate the whole time. The win rate on 2s3z is not 0, but still very poor. Is it a parameter setting problem?

    opened by Judylalala 6
  • I don't find the implementation of the IQL

    Hello, thank you for your efforts on these implementations. I want to find the implementation of IQL, but unfortunately I can't find it (although I did find the iql_smac.yaml file in the config folder). Would you mind telling me where it is? Thank you!

    opened by sxwgit 6
  • I have some questions when i want to save replay

    python3 src/main.py --config=coma_smac --env-config=sc2 with env_args.map_name=2s3z checkpoint_path=results/models/coma_smac__2019-10-20_11-23-08/ save_replay=True

    When I run this command, no replay file appears in my StarCraft II directory. How can I fix it?

    opened by lml519 6
  • Had problem running your code

    I am new to Docker and I failed to run your experiments.
    OS: Ubuntu 18.04
    Arch: x86_64

    standard_init_linux.go:211: exec user process caused "exec format error" is the error I got when I tried to run the code posted in the README.md file.

    Can you tell me what is the problem if you have any ideas? Thank you.

    opened by JianyuSu 5
  • How can I debug the code? There is this problem

    I debug it with PyCharm, and I put the parameters "--config=qmix --env-config=sc2 with env_args.map_name=2s3z save_model=True save_model_interval=20000" in Edit Configurations. I get: FileNotFoundError: [Errno 2] No such file or directory: '/home/gezhixin/pymarl-master/src/3rdparty/StarCraftII/Versions'. I know the path is wrong, but how can I fix it?

    opened by Gezx 5
  • Anyone can run the COMA algorithm code?

    Can anyone run the COMA algorithm normally? I get an error like this:

    Process Process-3:
    Traceback (most recent call last):
      File "/home/a/anaconda3/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
        self.run()
      File "/home/a/anaconda3/lib/python3.6/multiprocessing/process.py", line 93, in run
        self._target(*self._args, **self._kwargs)
      File "/home/a/COMA/pymarl/src/runners/parallel_runner.py", line 221, in env_worker
        reward, terminated, env_info = env.step(actions)
      File "/home/a/anaconda3/lib/python3.6/site-packages/smac/env/starcraft2/starcraft2.py", line 373, in step
        agent_action = self.get_agent_action(a_id, action)
      File "/home/a/anaconda3/lib/python3.6/site-packages/smac/env/starcraft2/starcraft2.py", line 439, in get_agent_action
        "Agent {} cannot perform action {}".format(a_id, action)
    AssertionError: Agent 0 cannot perform action 0

    Can you help me to solve this? Thanks!

    opened by BCWang93 5
  • 1c_3s_5z map?

    Hi, I want to evaluate on the 1c_3s_5z map, which is used as a benchmark in the QMIX paper, but I cannot find it under this directory: smac/smac/env/starcraft2/maps/SMAC_Maps/. Can you upload this map, or is there an equivalent map I can use? Thanks!

    opened by saizhang0218 5
  • Bump certifi from 2018.8.24 to 2022.12.7

    Bumps certifi from 2018.8.24 to 2022.12.7.

    dependencies 
    opened by dependabot[bot] 0
  • Bump pillow from 6.2.0 to 9.3.0

    Bumps pillow from 6.2.0 to 9.3.0.

    Release notes

    Sourced from pillow's releases.

    9.3.0

    https://pillow.readthedocs.io/en/stable/releasenotes/9.3.0.html

    Changes

    ... (truncated)

    Changelog

    Sourced from pillow's changelog.

    9.3.0 (2022-10-29)

    • Limit SAMPLESPERPIXEL to avoid runtime DOS #6700 [wiredfool]

    • Initialize libtiff buffer when saving #6699 [radarhere]

    • Inline fname2char to fix memory leak #6329 [nulano]

    • Fix memory leaks related to text features #6330 [nulano]

    • Use double quotes for version check on old CPython on Windows #6695 [hugovk]

    • Remove backup implementation of Round for Windows platforms #6693 [cgohlke]

    • Fixed set_variation_by_name offset #6445 [radarhere]

    • Fix malloc in _imagingft.c:font_setvaraxes #6690 [cgohlke]

    • Release Python GIL when converting images using matrix operations #6418 [hmaarrfk]

    • Added ExifTags enums #6630 [radarhere]

    • Do not modify previous frame when calculating delta in PNG #6683 [radarhere]

    • Added support for reading BMP images with RLE4 compression #6674 [npjg, radarhere]

    • Decode JPEG compressed BLP1 data in original mode #6678 [radarhere]

    • Added GPS TIFF tag info #6661 [radarhere]

    • Added conversion between RGB/RGBA/RGBX and LAB #6647 [radarhere]

    • Do not attempt normalization if mode is already normal #6644 [radarhere]

    ... (truncated)

    dependencies 
    opened by dependabot[bot] 0
  • Questions about the printed output

    What are the meanings of some of the parameters in the result folder? For example, what do battle_won_mean and battle_won_mean_T mean? What does the suffix _T signify? Is the game win rate plotted using episodes and test_battle_won_mean?

    opened by 17713679014 0
  • Bump protobuf from 3.6.1 to 3.18.3

    Bumps protobuf from 3.6.1 to 3.18.3.

    Release notes

    Sourced from protobuf's releases.

    Protocol Buffers v3.18.3

    C++

    Protocol Buffers v3.16.1

    Java

    • Improve performance characteristics of UnknownFieldSet parsing (#9371)

    Protocol Buffers v3.18.2

    Java

    • Improve performance characteristics of UnknownFieldSet parsing (#9371)

    Protocol Buffers v3.18.1

    Python

    • Update setup.py to reflect that we now require at least Python 3.5 (#8989)
    • Performance fix for DynamicMessage: force GetRaw() to be inlined (#9023)

    Ruby

    • Update ruby_generator.cc to allow proto2 imports in proto3 (#9003)

    Protocol Buffers v3.18.0

    C++

    • Fix warnings raised by clang 11 (#8664)
    • Make StringPiece constructible from std::string_view (#8707)
    • Add missing capability attributes for LLVM 12 (#8714)
    • Stop using std::iterator (deprecated in C++17). (#8741)
    • Move field_access_listener from libprotobuf-lite to libprotobuf (#8775)
    • Fix #7047 Safely handle setlocale (#8735)
    • Remove deprecated version of SetTotalBytesLimit() (#8794)
    • Support arena allocation of google::protobuf::AnyMetadata (#8758)
    • Fix undefined symbol error around SharedCtor() (#8827)
    • Fix default value of enum(int) in json_util with proto2 (#8835)
    • Better Smaller ByteSizeLong
    • Introduce event filters for inject_field_listener_events
    • Reduce memory usage of DescriptorPool
    • For lazy fields copy serialized form when allowed.
    • Re-introduce the InlinedStringField class
    • v2 access listener
    • Reduce padding in the proto's ExtensionRegistry map.
    • GetExtension performance optimizations
    • Make tracker a static variable rather than call static functions
    • Support extensions in field access listener
    • Annotate MergeFrom for field access listener
    • Fix incomplete types for field access listener
    • Add map_entry/new_map_entry to SpecificField in MessageDifferencer. They record the map items which are different in MessageDifferencer's reporter.
    • Reduce binary size due to fieldless proto messages
    • TextFormat: ParseInfoTree supports getting field end location in addition to start.

    ... (truncated)

    dependencies 
    opened by dependabot[bot] 0
  • Bump numpy from 1.15.2 to 1.22.0

    Bumps numpy from 1.15.2 to 1.22.0.

    Release notes

    Sourced from numpy's releases.

    v1.22.0

    NumPy 1.22.0 Release Notes

    NumPy 1.22.0 is a big release featuring the work of 153 contributors spread over 609 pull requests. There have been many improvements, highlights are:

    • Annotations of the main namespace are essentially complete. Upstream is a moving target, so there will likely be further improvements, but the major work is done. This is probably the most user visible enhancement in this release.
    • A preliminary version of the proposed Array-API is provided. This is a step in creating a standard collection of functions that can be used across applications such as CuPy and JAX.
    • NumPy now has a DLPack backend. DLPack provides a common interchange format for array (tensor) data.
    • New methods for quantile, percentile, and related functions. The new methods provide a complete set of the methods commonly found in the literature.
    • A new configurable allocator for use by downstream projects.

    These are in addition to the ongoing work to provide SIMD support for commonly used functions, improvements to F2PY, and better documentation.

    The Python versions supported in this release are 3.8-3.10, Python 3.7 has been dropped. Note that 32 bit wheels are only provided for Python 3.8 and 3.9 on Windows, all other wheels are 64 bits on account of Ubuntu, Fedora, and other Linux distributions dropping 32 bit support. All 64 bit wheels are also linked with 64 bit integer OpenBLAS, which should fix the occasional problems encountered by folks using truly huge arrays.

    Expired deprecations

    Deprecated numeric style dtype strings have been removed

    Using the strings "Bytes0", "Datetime64", "Str0", "Uint32", and "Uint64" as a dtype will now raise a TypeError.

    (gh-19539)

    Expired deprecations for loads, ndfromtxt, and mafromtxt in npyio

    numpy.loads was deprecated in v1.15, with the recommendation that users use pickle.loads instead. ndfromtxt and mafromtxt were both deprecated in v1.17 - users should use numpy.genfromtxt instead with the appropriate value for the usemask parameter.

    (gh-19615)

    ... (truncated)

    dependencies 
    opened by dependabot[bot] 0
  • Evaluate results question

    Hi,

    I have a question about the evaluation of a model. I use the code as described

    python3 src/main.py --config=qmix --env-config=sc2 with env_args.map_name=Multi_task_6m1M_vs_12m1M checkpoint_path=results/models/qmix_best/ save_replay=True test_nepisode=5 evaluate=True

    So, I run the model for evaluation for 5 episodes, but the results for return_mean and the other metrics have only one value.

    [image: evaluation output showing a single value per metric]

    I tried some modifications to the config, but I get the same results.

    What I am trying to do is obtain as many results as there are episodes, that is to say, the return and the other metrics obtained in each of the episodes.

    Thanks!

    opened by AlRodA92 0
Owner
whirl
Whiteson Research Lab
Learning to Communicate with Deep Multi-Agent Reinforcement Learning in PyTorch

Learning to Communicate with Deep Multi-Agent Reinforcement Learning This is a PyTorch implementation of the original Lua code release. Overview This

Minqi 297 Dec 12, 2022
Rethinking the Importance of Implementation Tricks in Multi-Agent Reinforcement Learning

RIIT Our open-source code for RIIT: Rethinking the Importance of Implementation Tricks in Multi-Agent Reinforcement Learning. We implement and standard

null 405 Jan 6, 2023
A library of multi-agent reinforcement learning components and systems

Mava: a research framework for distributed multi-agent reinforcement learning Table of Contents Overview Getting Started Supported Environments System

InstaDeep Ltd 463 Dec 23, 2022
Pytorch implementations of popular off-policy multi-agent reinforcement learning algorithms, including QMix, VDN, MADDPG, and MATD3.

Off-Policy Multi-Agent Reinforcement Learning (MARL) Algorithms This repository contains implementations of various off-policy multi-agent reinforceme

null 183 Dec 28, 2022
WarpDrive: Extremely Fast End-to-End Deep Multi-Agent Reinforcement Learning on a GPU

WarpDrive is a flexible, lightweight, and easy-to-use open-source reinforcement learning (RL) framework that implements end-to-end multi-agent RL on a single GPU (Graphics Processing Unit).

Salesforce 334 Jan 6, 2023
Official Implementation of 'UPDeT: Universal Multi-agent Reinforcement Learning via Policy Decoupling with Transformers' ICLR 2021(spotlight)

UPDeT Official Implementation of UPDeT: Universal Multi-agent Reinforcement Learning via Policy Decoupling with Transformers (ICLR 2021 spotlight) The

hhhusiyi 96 Dec 22, 2022
Multi-Agent Reinforcement Learning for Active Voltage Control on Power Distribution Networks (MAPDN)

Multi-Agent Reinforcement Learning for Active Voltage Control on Power Distribution Networks (MAPDN) This is the implementation of the paper Multi-Age

Future Power Networks 83 Jan 6, 2023
CityLearn Challenge Multi-Agent Reinforcement Learning for Intelligent Energy Management, 2020, PikaPika team

Citylearn Challenge This is the PyTorch implementation for PikaPika team, CityLearn Challenge Multi-Agent Reinforcement Learning for Intelligent Energ

bigAIdream projects 10 Oct 10, 2022
Multi-agent reinforcement learning algorithm and environment

Multi-agent reinforcement learning algorithm and environment [en/cn] Pytorch implements multi-agent reinforcement learning algorithms including IQL, Q

万鲲鹏 7 Sep 20, 2022
Offline Multi-Agent Reinforcement Learning Implementations: Solving Overcooked Game with Data-Driven Method

Overcooked-AI We suppose to apply traditional offline reinforcement learning technique to multi-agent algorithm. In this repository, we implemented be

Baek In-Chang 14 Sep 16, 2022
A multi-entity Transformer for multi-agent spatiotemporal modeling.

baller2vec This is the repository for the paper: Michael A. Alcorn and Anh Nguyen. baller2vec: A Multi-Entity Transformer For Multi-Agent Spatiotempor

Michael A. Alcorn 56 Nov 15, 2022
Multi-task Multi-agent Soft Actor Critic for SMAC

Multi-task Multi-agent Soft Actor Critic for SMAC Overview The CARE for multi-task: Multi-Task Reinforcement Learning with Context-based Representation

RuanJingqing 8 Sep 30, 2022
Trading and Backtesting environment for training reinforcement learning agent or simple rule base algo.

TradingGym TradingGym is a toolkit for training and backtesting the reinforcement learning algorithms. This was inspired by OpenAI Gym and imitated th

Yvictor 1.1k Jan 2, 2023
Deep Reinforcement Learning based Trading Agent for Bitcoin

Deep Trading Agent Deep Reinforcement Learning based Trading Agent for Bitcoin using DeepSense Network for Q function approximation. For complete deta

Kartikay Garg 669 Dec 29, 2022
Urban mobility simulations with Python3, RLlib (Deep Reinforcement Learning) and Mesa (Agent-based modeling)

Deep Reinforcement Learning for Smart Cities Documentation RLlib: https://docs.ray.io/en/master/rllib.html Mesa: https://mesa.readthedocs.io/en/stable

null 1 May 15, 2022
Minecraft agent to farm resources using reinforcement learning

BarnyardBot CS 175 group project using Malmo download BarnyardBot.py into the python examples directory and run 'python BarnyardBot.py' in the console

null 0 Jul 26, 2022
COVINS -- A Framework for Collaborative Visual-Inertial SLAM and Multi-Agent 3D Mapping

COVINS -- A Framework for Collaborative Visual-Inertial SLAM and Multi-Agent 3D Mapping Version 1.0 COVINS is an accurate, scalable, and versatile vis

ETHZ V4RL 183 Dec 27, 2022
Conservative Q Learning for Offline Reinforcement Learning in JAX

CQL-JAX This repository implements Conservative Q Learning for Offline Reinforcement Learning in JAX (FLAX). Implementation is built on

Karush Suri 8 Nov 7, 2022
Reinforcement-learning - Repository of the class assignment questions for the course on reinforcement learning

DSE 314/614: Reinforcement Learning This repository containing reinforcement lea

Manav Mishra 4 Apr 15, 2022