Python Multi-Agent Reinforcement Learning framework

Related tags

Deep Learning pymarl
Overview
- Please pay attention to the version of SC2 you are using for your experiments. 
- Performance is *not* always comparable between versions. 
- The results reported in the SMAC paper (https://arxiv.org/abs/1902.04043) use SC2.4.6.2.69232, not SC2.4.10.

Python MARL framework

PyMARL is WhiRL's framework for deep multi-agent reinforcement learning and includes implementations of the following algorithms:

- QMIX
- COMA
- VDN
- IQL
- QTRAN

PyMARL is written in PyTorch and uses SMAC as its environment.

Installation instructions

Build the Docker image from the Dockerfile using

cd docker
bash build.sh

Set up StarCraft II and SMAC:

bash install_sc2.sh

This will download SC2 into the 3rdparty folder and copy the maps necessary for the experiments.
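
If StarCraft II is installed somewhere other than the 3rdparty folder, SMAC locates the game through the SC2PATH environment variable, so point it at your install (a sketch assuming the standard SMAC convention; the path below is a placeholder):

export SC2PATH=/path/to/StarCraftII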

The requirements.txt file can be used to install the necessary packages into a virtual environment (not recommended).

Run an experiment

python3 src/main.py --config=qmix --env-config=sc2 with env_args.map_name=2s3z

The config files act as defaults for an algorithm or environment.

They are all located in src/config:

- --config refers to the config files in src/config/algs
- --env-config refers to the config files in src/config/envs
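
Any other configuration option can be overridden from the command line using the same with syntax. A sketch (t_max is assumed here to name the total-timestep budget in the default config; the value is purely illustrative):

python3 src/main.py --config=qmix --env-config=sc2 with env_args.map_name=3s5z t_max=2000000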

To run experiments using the Docker container:

bash run.sh $GPU python3 src/main.py --config=qmix --env-config=sc2 with env_args.map_name=2s3z

All results will be stored in the results folder.

The previous config files used for the SMAC Beta have the suffix _beta.

Saving and loading learnt models

Saving models

You can save the learnt models to disk by setting save_model = True, which is False by default. The frequency of saving can be adjusted using the save_model_interval configuration option. Models are saved in the results directory, under a folder called models. The directory corresponding to each run contains models saved throughout the experiment, each within a folder named after the number of timesteps elapsed since the start of learning.
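
For example, the following enables saving every 200,000 timesteps (a sketch; the interval value is illustrative):

python3 src/main.py --config=qmix --env-config=sc2 with env_args.map_name=2s3z save_model=True save_model_interval=200000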

Loading models

Learnt models can be loaded using the checkpoint_path parameter, after which learning proceeds from the corresponding timestep.
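
For example (a sketch; the checkpoint path is a placeholder for one of your own runs under results/models):

python3 src/main.py --config=qmix --env-config=sc2 with env_args.map_name=2s3z checkpoint_path=results/models/qmix__example_run/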

Watching StarCraft II replays

The save_replay option allows saving replays of models that are loaded using checkpoint_path. Once the model is successfully loaded, test_nepisode episodes are run in test mode and a .SC2Replay file is saved in the Replay directory of StarCraft II. Please make sure to use the episode runner if you wish to save a replay, i.e., runner=episode. The name of the saved replay file starts with the given env_args.save_replay_prefix (map_name if empty), followed by the current timestamp.
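
Putting these options together, a sketch that loads a checkpoint, runs test episodes, and saves a replay (again, the checkpoint path is a placeholder):

python3 src/main.py --config=qmix --env-config=sc2 with env_args.map_name=2s3z runner=episode checkpoint_path=results/models/qmix__example_run/ save_replay=True test_nepisode=5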

The saved replays can be watched by double-clicking on them or using the following command:

python -m pysc2.bin.play --norender --rgb_minimap_size 0 --replay NAME.SC2Replay

Note: Replays cannot be watched using the Linux version of StarCraft II. Please use either the Mac or Windows version of the StarCraft II client.

Documentation/Support

Documentation is a little sparse at the moment (but will improve!). Please raise an issue in this repo, or email Tabish.

Citing PyMARL

If you use PyMARL in your research, please cite the SMAC paper.

M. Samvelyan, T. Rashid, C. Schroeder de Witt, G. Farquhar, N. Nardelli, T.G.J. Rudner, C.-M. Hung, P.H.S. Torr, J. Foerster, S. Whiteson. The StarCraft Multi-Agent Challenge, CoRR abs/1902.04043, 2019.

In BibTeX format:

@article{samvelyan19smac,
  title = {{The} {StarCraft} {Multi}-{Agent} {Challenge}},
  author = {Mikayel Samvelyan and Tabish Rashid and Christian Schroeder de Witt and Gregory Farquhar and Nantas Nardelli and Tim G. J. Rudner and Chia-Man Hung and Philip H. S. Torr and Jakob Foerster and Shimon Whiteson},
  journal = {CoRR},
  volume = {abs/1902.04043},
  year = {2019},
}

License

Code licensed under the Apache License v2.0

Comments
  • Baselines used in COMA, such as Central-V, IAC-V.

    I tried to implement the baselines used in your paper, such as Central-V and IAC-V, under this project on the 3m map, but I cannot reproduce the results reported in the paper. The following is the training curve of Central-V on the 3m map from my code.

    [image: training curve of Central-V on the 3m map]

    I am not sure how to build the inputs for these baselines, and my training process is not stable.

    If possible, could you kindly share your implementation of these baselines, especially the implementation of the critic function and how the inputs are built? Thank you ;). @jakobnicolaus

    opened by yalidu 12
  • Hi, I have this error in running!

    The error is like this:

    Process Process-3:
    Traceback (most recent call last):
      File "/home/a/anaconda3/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
        self.run()
      File "/home/a/anaconda3/lib/python3.6/multiprocessing/process.py", line 93, in run
        self._target(*self._args, **self._kwargs)
      File "/home/a/COMA/pymarl/src/runners/parallel_runner.py", line 221, in env_worker
        reward, terminated, env_info = env.step(actions)
      File "/home/a/anaconda3/lib/python3.6/site-packages/smac/env/starcraft2/starcraft2.py", line 373, in step
        agent_action = self.get_agent_action(a_id, action)
      File "/home/a/anaconda3/lib/python3.6/site-packages/smac/env/starcraft2/starcraft2.py", line 439, in get_agent_action
        "Agent {} cannot perform action {}".format(a_id, action)
    AssertionError: Agent 0 cannot perform action 0

    Can you help me to solve this? Thanks!

    opened by BCWang93 12
  • fail to build docker image

    BREAK when running build.sh, step 23.

    CODE: RUN pip3 install git+https://github.com/oxwhirl/smac.git

    ERROR:

    Successfully built SMAC absl-py future mpyq portpicker s2protocol ordered-set
    ERROR: sacred 0.7.2 has requirement jsonpickle<1.0,>=0.7.2, but you'll have jsonpickle 1.2 which is incompatible.
    Installing collected packages: future, mpyq, ordered-set, jsonpickle, deepdiff, s2clientprotocol, portpicker, urllib3, certifi, idna, chardet, requests, enum34, websocket-client, sk-video, absl-py, s2protocol, whichcraft, mock, pysc2, SMAC
    Found existing installation: jsonpickle 0.9.6
      Uninstalling jsonpickle-0.9.6:
        Successfully uninstalled jsonpickle-0.9.6
    Successfully installed SMAC-0.1.0b1 absl-py-0.8.0 certifi-2019.9.11 chardet-3.0.4 deepdiff-4.0.7 enum34-1.1.6 future-0.17.1 idna-2.8 jsonpickle-1.2 mock-3.0.5 mpyq-0.2.5 ordered-set-3.1.1 portpicker-1.3.1 pysc2-3.0.0 requests-2.22.0 s2clientprotocol-4.10.3.76114.0 s2protocol-4.10.3.76114.0 sk-video-1.1.10 urllib3-1.25.6 websocket-client-0.56.0 whichcraft-0.6.1
    Traceback (most recent call last):
      File "/usr/local/bin/pip3", line 11, in <module>
        sys.exit(main())
      File "/usr/local/lib/python3.5/dist-packages/pip/_internal/__init__.py", line 77, in main
        return command.main(cmd_args)
      File "/usr/local/lib/python3.5/dist-packages/pip/_internal/cli/base_command.py", line 237, in main
        timeout=min(5, options.timeout)
      File "/usr/local/lib/python3.5/dist-packages/pip/_internal/cli/base_command.py", line 108, in _build_session
        index_urls=self._get_index_urls(options),
      File "/usr/local/lib/python3.5/dist-packages/pip/_internal/download.py", line 559, in __init__
        self.headers["User-Agent"] = user_agent()
      File "/usr/local/lib/python3.5/dist-packages/pip/_internal/download.py", line 170, in user_agent
        setuptools_version = get_installed_version("setuptools")
      File "/usr/local/lib/python3.5/dist-packages/pip/_internal/utils/misc.py", line 1044, in get_installed_version
        working_set = pkg_resources.WorkingSet()
      File "/usr/local/lib/python3.5/dist-packages/pip/_vendor/pkg_resources/__init__.py", line 567, in __init__
        self.add_entry(entry)
      File "/usr/local/lib/python3.5/dist-packages/pip/_vendor/pkg_resources/__init__.py", line 623, in add_entry
        for dist in find_distributions(entry, True):
      File "/usr/local/lib/python3.5/dist-packages/pip/_vendor/pkg_resources/__init__.py", line 1974, in find_eggs_in_zip
        if metadata.has_metadata('PKG-INFO'):
      File "/usr/local/lib/python3.5/dist-packages/pip/_vendor/pkg_resources/__init__.py", line 1414, in has_metadata
        return self._has(path)
      File "/usr/local/lib/python3.5/dist-packages/pip/_vendor/pkg_resources/__init__.py", line 1845, in _has
        return zip_path in self.zipinfo or zip_path in self._index()
      File "/usr/local/lib/python3.5/dist-packages/pip/_vendor/pkg_resources/__init__.py", line 1722, in zipinfo
        return self._zip_manifests.load(self.loader.archive)
      File "/usr/local/lib/python3.5/dist-packages/pip/_vendor/pkg_resources/__init__.py", line 1679, in load
        mtime = os.stat(path).st_mtime
    FileNotFoundError: [Errno 2] No such file or directory: '/usr/local/lib/python3.5/dist-packages/jsonpickle-0.9.6-py3.5.egg'
    The command '/bin/sh -c pip3 install git+https://github.com/oxwhirl/smac.git' returned a non-zero code: 1

    opened by YuiiCh 10
  • the problem of the module sacred

    I ran it on Windows 10, but got the following problem:

    (starcraft) E:\GitProject\pymarl>python src\main.py --config=qmix_smac --env-config=sc2 with env_args.map_name=2s3z
    [INFO 16:19:39] root Saving to FileStorageObserver in results/sacred.
    [DEBUG 16:19:41] pymarl Using capture mode "fd"
    [DEBUG 16:19:41] pymarl Stopping Heartbeat
    Exception originated from within Sacred.
    Traceback (most recent calls):
      File "D:\Downloads\Environment\Anaconda\envs\starcraft\lib\site-packages\sacred\stdout_capturing.py", line 131, in tee_output_fd
        ['tee', '-a', '/dev/stderr'], preexec_fn=os.setsid,
    AttributeError: module 'os' has no attribute 'setsid'

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "D:\Downloads\Environment\Anaconda\envs\starcraft\lib\site-packages\sacred\run.py", line 217, in __call__
        with capture_stdout() as self._output_file:
      File "D:\Downloads\Environment\Anaconda\envs\starcraft\lib\contextlib.py", line 81, in __enter__
        return next(self.gen)
      File "D:\Downloads\Environment\Anaconda\envs\starcraft\lib\site-packages\sacred\stdout_capturing.py", line 136, in tee_output_fd
        except (FileNotFoundError, (OSError, AttributeError)):
    TypeError: catching classes that do not inherit from BaseException is not allowed

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "D:\Downloads\Environment\Anaconda\envs\starcraft\lib\site-packages\sacred\run.py", line 344, in _stop_time
        seconds=round((self.stop_time - self.start_time).total_seconds()))
    TypeError: unsupported operand type(s) for -: 'datetime.datetime' and 'NoneType'

    opened by cjing9017 7
  • Has anyone encountered the problem that QMIX performs very poorly on 3s5z map and 2s3z map?

    I got very poor performance on the 3s5z map, with a 0 win rate the whole time. The win rate on 2s3z is not 0, but still very poor. Is it a parameter setting problem?

    opened by Judylalala 6
  • I don't find the implementation of the IQL

    Hello, thank you for your efforts on these implementations. I want to find the implementation of IQL, but unfortunately I can't find it (although I did find the iql_smac.yaml file in the config folder). Would you mind telling me where it is? Thank you!

    opened by sxwgit 6
  • I have some questions when i want to save replay

    python3 src/main.py --config=coma_smac --env-config=sc2 with env_args.map_name=2s3z checkpoint_path=results/models/coma_smac__2019-10-20_11-23-08/ save_replay=True

    When I run this command, no replay file appears in my StarCraft II directory. How can I fix it?

    opened by lml519 6
  • Had problem running your code

    I am new to Docker and I failed to run your experiments.
    OS: Ubuntu 18.04
    Arch: x86_64

    standard_init_linux.go:211: exec user process caused "exec format error" is the error I got when I tried to run the code posted in the README.md file.

    Can you tell me what is the problem if you have any ideas? Thank you.

    opened by JianyuSu 5
  • How can I debug the code? There is this problem

    I debug it with PyCharm, and I put the parameters "--config=qmix --env-config=sc2 with env_args.map_name=2s3z save_model=True save_model_interval=20000" in Edit Configurations. I get: FileNotFoundError: [Errno 2] No such file or directory: '/home/gezhixin/pymarl-master/src/3rdparty/StarCraftII/Versions'. I know the path is wrong, but how can I fix it?

    opened by Gezx 5
  • Anyone can run the COMA algorithm code?

    Can anyone run the COMA algorithm normally? I get an error like this:

    Process Process-3:
    Traceback (most recent call last):
      File "/home/a/anaconda3/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
        self.run()
      File "/home/a/anaconda3/lib/python3.6/multiprocessing/process.py", line 93, in run
        self._target(*self._args, **self._kwargs)
      File "/home/a/COMA/pymarl/src/runners/parallel_runner.py", line 221, in env_worker
        reward, terminated, env_info = env.step(actions)
      File "/home/a/anaconda3/lib/python3.6/site-packages/smac/env/starcraft2/starcraft2.py", line 373, in step
        agent_action = self.get_agent_action(a_id, action)
      File "/home/a/anaconda3/lib/python3.6/site-packages/smac/env/starcraft2/starcraft2.py", line 439, in get_agent_action
        "Agent {} cannot perform action {}".format(a_id, action)
    AssertionError: Agent 0 cannot perform action 0

    Can you help me to solve this? Thanks!

    opened by BCWang93 5
  • 1c_3s_5z map?

    Hi, I want to evaluate on the 1c_3s_5z map, which is used as a benchmark in the QMIX paper, but I cannot find it under this directory: smac/smac/env/starcraft2/maps/SMAC_Maps/. Can you upload this map, or is there an equivalent map I can use? Thanks!

    opened by saizhang0218 5
  • Bump certifi from 2018.8.24 to 2022.12.7

    Bumps certifi from 2018.8.24 to 2022.12.7.

    dependencies 
    opened by dependabot[bot] 0
  • Bump pillow from 6.2.0 to 9.3.0

    Bumps pillow from 6.2.0 to 9.3.0.

    Release notes

    Sourced from pillow's releases.

    9.3.0

    https://pillow.readthedocs.io/en/stable/releasenotes/9.3.0.html

    Changes

    ... (truncated)

    Changelog

    Sourced from pillow's changelog.

    9.3.0 (2022-10-29)

    • Limit SAMPLESPERPIXEL to avoid runtime DOS #6700 [wiredfool]

    • Initialize libtiff buffer when saving #6699 [radarhere]

    • Inline fname2char to fix memory leak #6329 [nulano]

    • Fix memory leaks related to text features #6330 [nulano]

    • Use double quotes for version check on old CPython on Windows #6695 [hugovk]

    • Remove backup implementation of Round for Windows platforms #6693 [cgohlke]

    • Fixed set_variation_by_name offset #6445 [radarhere]

    • Fix malloc in _imagingft.c:font_setvaraxes #6690 [cgohlke]

    • Release Python GIL when converting images using matrix operations #6418 [hmaarrfk]

    • Added ExifTags enums #6630 [radarhere]

    • Do not modify previous frame when calculating delta in PNG #6683 [radarhere]

    • Added support for reading BMP images with RLE4 compression #6674 [npjg, radarhere]

    • Decode JPEG compressed BLP1 data in original mode #6678 [radarhere]

    • Added GPS TIFF tag info #6661 [radarhere]

    • Added conversion between RGB/RGBA/RGBX and LAB #6647 [radarhere]

    • Do not attempt normalization if mode is already normal #6644 [radarhere]

    ... (truncated)

    dependencies 
    opened by dependabot[bot] 0
  • Questions about the printed output

    What are the meanings of some of the parameters in the result folder? For example, what do battle_won_mean and battle_won_mean_T mean? What does the suffix _T signify? Is the game win rate plotted using episodes and test_battle_won_mean?

    opened by 17713679014 0
  • Bump protobuf from 3.6.1 to 3.18.3

    Bumps protobuf from 3.6.1 to 3.18.3.

    Release notes

    Sourced from protobuf's releases.

    Protocol Buffers v3.18.3

    C++

    Protocol Buffers v3.16.1

    Java

    • Improve performance characteristics of UnknownFieldSet parsing (#9371)

    Protocol Buffers v3.18.2

    Java

    • Improve performance characteristics of UnknownFieldSet parsing (#9371)

    Protocol Buffers v3.18.1

    Python

    • Update setup.py to reflect that we now require at least Python 3.5 (#8989)
    • Performance fix for DynamicMessage: force GetRaw() to be inlined (#9023)

    Ruby

    • Update ruby_generator.cc to allow proto2 imports in proto3 (#9003)

    Protocol Buffers v3.18.0

    C++

    • Fix warnings raised by clang 11 (#8664)
    • Make StringPiece constructible from std::string_view (#8707)
    • Add missing capability attributes for LLVM 12 (#8714)
    • Stop using std::iterator (deprecated in C++17). (#8741)
    • Move field_access_listener from libprotobuf-lite to libprotobuf (#8775)
    • Fix #7047 Safely handle setlocale (#8735)
    • Remove deprecated version of SetTotalBytesLimit() (#8794)
    • Support arena allocation of google::protobuf::AnyMetadata (#8758)
    • Fix undefined symbol error around SharedCtor() (#8827)
    • Fix default value of enum(int) in json_util with proto2 (#8835)
    • Better Smaller ByteSizeLong
    • Introduce event filters for inject_field_listener_events
    • Reduce memory usage of DescriptorPool
    • For lazy fields copy serialized form when allowed.
    • Re-introduce the InlinedStringField class
    • v2 access listener
    • Reduce padding in the proto's ExtensionRegistry map.
    • GetExtension performance optimizations
    • Make tracker a static variable rather than call static functions
    • Support extensions in field access listener
    • Annotate MergeFrom for field access listener
    • Fix incomplete types for field access listener
    • Add map_entry/new_map_entry to SpecificField in MessageDifferencer. They record the map items which are different in MessageDifferencer's reporter.
    • Reduce binary size due to fieldless proto messages
    • TextFormat: ParseInfoTree supports getting field end location in addition to start.

    ... (truncated)

    dependencies 
    opened by dependabot[bot] 0
  • Bump numpy from 1.15.2 to 1.22.0

    Bumps numpy from 1.15.2 to 1.22.0.

    Release notes

    Sourced from numpy's releases.

    v1.22.0

    NumPy 1.22.0 Release Notes

    NumPy 1.22.0 is a big release featuring the work of 153 contributors spread over 609 pull requests. There have been many improvements, highlights are:

    • Annotations of the main namespace are essentially complete. Upstream is a moving target, so there will likely be further improvements, but the major work is done. This is probably the most user visible enhancement in this release.
    • A preliminary version of the proposed Array-API is provided. This is a step in creating a standard collection of functions that can be used across applications such as CuPy and JAX.
    • NumPy now has a DLPack backend. DLPack provides a common interchange format for array (tensor) data.
    • New methods for quantile, percentile, and related functions. The new methods provide a complete set of the methods commonly found in the literature.
    • A new configurable allocator for use by downstream projects.

    These are in addition to the ongoing work to provide SIMD support for commonly used functions, improvements to F2PY, and better documentation.

    The Python versions supported in this release are 3.8-3.10, Python 3.7 has been dropped. Note that 32 bit wheels are only provided for Python 3.8 and 3.9 on Windows, all other wheels are 64 bits on account of Ubuntu, Fedora, and other Linux distributions dropping 32 bit support. All 64 bit wheels are also linked with 64 bit integer OpenBLAS, which should fix the occasional problems encountered by folks using truly huge arrays.

    Expired deprecations

    Deprecated numeric style dtype strings have been removed

    Using the strings "Bytes0", "Datetime64", "Str0", "Uint32", and "Uint64" as a dtype will now raise a TypeError.

    (gh-19539)

    Expired deprecations for loads, ndfromtxt, and mafromtxt in npyio

    numpy.loads was deprecated in v1.15, with the recommendation that users use pickle.loads instead. ndfromtxt and mafromtxt were both deprecated in v1.17 - users should use numpy.genfromtxt instead with the appropriate value for the usemask parameter.

    (gh-19615)

    ... (truncated)

    dependencies 
    opened by dependabot[bot] 0
  • Evaluate results question

    Hi,

    I have a question about the evaluation of a model. I use the code as described

    python3 src/main.py --config=qmix --env-config=sc2 with env_args.map_name=Multi_task_6m1M_vs_12m1M checkpoint_path=results/models/qmix_best/ save_replay=True test_nepisode=5 evaluate=True

    So, I run the model for evaluation for 5 episodes, but the results for return_mean and the other metrics have only one value.

    [image: evaluation output showing a single value per metric]

    I tried some modifications to the config, but I get the same results.

    What I am trying to do is obtain as many results as there are episodes, that is to say, the return and the other metrics obtained in each of the episodes.

    Thanks!

    opened by AlRodA92 0
Owner
whirl
Whiteson Research Lab
Learning to Communicate with Deep Multi-Agent Reinforcement Learning in PyTorch

Learning to Communicate with Deep Multi-Agent Reinforcement Learning This is a PyTorch implementation of the original Lua code release. Overview This

Minqi 297 Dec 12, 2022
Rethinking the Importance of Implementation Tricks in Multi-Agent Reinforcement Learning

RIIT Our open-source code for RIIT: Rethinking the Importance of Implementation Tricks in Multi-Agent Reinforcement Learning. We implement and standard

null 405 Jan 6, 2023
A library of multi-agent reinforcement learning components and systems

Mava: a research framework for distributed multi-agent reinforcement learning Table of Contents Overview Getting Started Supported Environments System

InstaDeep Ltd 463 Dec 23, 2022
Pytorch implementations of popular off-policy multi-agent reinforcement learning algorithms, including QMix, VDN, MADDPG, and MATD3.

Off-Policy Multi-Agent Reinforcement Learning (MARL) Algorithms This repository contains implementations of various off-policy multi-agent reinforceme

null 183 Dec 28, 2022
WarpDrive: Extremely Fast End-to-End Deep Multi-Agent Reinforcement Learning on a GPU

WarpDrive is a flexible, lightweight, and easy-to-use open-source reinforcement learning (RL) framework that implements end-to-end multi-agent RL on a single GPU (Graphics Processing Unit).

Salesforce 334 Jan 6, 2023
Official Implementation of 'UPDeT: Universal Multi-agent Reinforcement Learning via Policy Decoupling with Transformers' ICLR 2021(spotlight)

UPDeT Official Implementation of UPDeT: Universal Multi-agent Reinforcement Learning via Policy Decoupling with Transformers (ICLR 2021 spotlight) The

hhhusiyi 96 Dec 22, 2022
Multi-Agent Reinforcement Learning for Active Voltage Control on Power Distribution Networks (MAPDN)

Multi-Agent Reinforcement Learning for Active Voltage Control on Power Distribution Networks (MAPDN) This is the implementation of the paper Multi-Age

Future Power Networks 83 Jan 6, 2023
CityLearn Challenge Multi-Agent Reinforcement Learning for Intelligent Energy Management, 2020, PikaPika team

Citylearn Challenge This is the PyTorch implementation for PikaPika team, CityLearn Challenge Multi-Agent Reinforcement Learning for Intelligent Energ

bigAIdream projects 10 Oct 10, 2022
Multi-agent reinforcement learning algorithm and environment

Multi-agent reinforcement learning algorithm and environment [en/cn] Pytorch implements multi-agent reinforcement learning algorithms including IQL, Q

万鲲鹏 7 Sep 20, 2022
Offline Multi-Agent Reinforcement Learning Implementations: Solving Overcooked Game with Data-Driven Method

Overcooked-AI We suppose to apply traditional offline reinforcement learning technique to multi-agent algorithm. In this repository, we implemented be

Baek In-Chang 14 Sep 16, 2022
A multi-entity Transformer for multi-agent spatiotemporal modeling.

baller2vec This is the repository for the paper: Michael A. Alcorn and Anh Nguyen. baller2vec: A Multi-Entity Transformer For Multi-Agent Spatiotempor

Michael A. Alcorn 56 Nov 15, 2022
Multi-task Multi-agent Soft Actor Critic for SMAC

Multi-task Multi-agent Soft Actor Critic for SMAC Overview The CARE for multi-task: Multi-Task Reinforcement Learning with Context-based Representation

RuanJingqing 8 Sep 30, 2022
Trading and Backtesting environment for training reinforcement learning agent or simple rule base algo.

TradingGym TradingGym is a toolkit for training and backtesting the reinforcement learning algorithms. This was inspired by OpenAI Gym and imitated th

Yvictor 1.1k Jan 2, 2023
Deep Reinforcement Learning based Trading Agent for Bitcoin

Deep Trading Agent Deep Reinforcement Learning based Trading Agent for Bitcoin using DeepSense Network for Q function approximation. For complete deta

Kartikay Garg 669 Dec 29, 2022
Urban mobility simulations with Python3, RLlib (Deep Reinforcement Learning) and Mesa (Agent-based modeling)

Deep Reinforcement Learning for Smart Cities Documentation RLlib: https://docs.ray.io/en/master/rllib.html Mesa: https://mesa.readthedocs.io/en/stable

null 1 May 15, 2022
Minecraft agent to farm resources using reinforcement learning

BarnyardBot CS 175 group project using Malmo download BarnyardBot.py into the python examples directory and run 'python BarnyardBot.py' in the console

null 0 Jul 26, 2022
COVINS -- A Framework for Collaborative Visual-Inertial SLAM and Multi-Agent 3D Mapping

COVINS -- A Framework for Collaborative Visual-Inertial SLAM and Multi-Agent 3D Mapping Version 1.0 COVINS is an accurate, scalable, and versatile vis

ETHZ V4RL 183 Dec 27, 2022
Conservative Q Learning for Offline Reinforcement Learning in JAX

CQL-JAX This repository implements Conservative Q Learning for Offline Reinforcement Learning in JAX (FLAX). Implementation is built on

Karush Suri 8 Nov 7, 2022
Reinforcement-learning - Repository of the class assignment questions for the course on reinforcement learning

DSE 314/614: Reinforcement Learning This repository containing reinforcement lea

Manav Mishra 4 Apr 15, 2022