A Closer Look at Invalid Action Masking in Policy Gradient Algorithms

Overview

A Closer Look at Invalid Action Masking in Policy Gradient Algorithms

This repo contains the source code to reproduce the results in the paper A Closer Look at Invalid Action Masking in Policy Gradient Algorithms.

Steps to reproduce the experiments

Our experiments use docker containers to run and Weight and Biases (https://www.wandb.com/) to record the experiments, so the first step is to register a wandb account and get an API key, which we refer to as YOUR_WANDB_KEY

# build the docker container
docker build -t invalid_action_masking:latest -f sharedmemory.Dockerfile .
# build docker run commands. replace `{YOUR_WANDB_KEY}` with your own
WANDB_KEY={YOUR_WANDB_KEY} python docker.py > docker.sh
# run experiments (96 in total)
# if you have limited computational resources, consider not running all of them at a time.
# in addition, notice the commands have --cpuset-cpus="0", --cpuset-cpus="1" for different runs
# to make sure each container is only using one core. By default I assume your machine has 40 cores,
# but feel free to modify the `cores` variable in `docker.py`
bash docker.sh

Steps to reproduce the figures

Record your wandb username, which we will refer to as YOUR_WANDB_ENTITY

cd plots
WANDB_ENTITY={YOUR_WANDB_ENTITY} python episode_reward.py
WANDB_ENTITY={YOUR_WANDB_ENTITY} python approx_kl.py

These command should reproduce the PDFs in plots that are attached to the repo.

Reproduction without WANDB

Although it would be possible, it would require a significant amount of effort to properly log metrics and redo the plotting, so at this time we would not have intructions to do reproduction without WANDB. Note that it is possible to use wandb locally by following https://docs.wandb.com/self-hosted/local.

If you have an issue reproducing the results

We have tested these scripts to reproduce but it is possible that there is a bug and maybe we are assuming something specific regarding the environment. If you couldn't reproduce our results, please file an issue and we will address it as soon as the double-blind review is over.

Comments
  • Bump pillow from 8.3.1 to 9.0.1

    Bump pillow from 8.3.1 to 9.0.1

    Bumps pillow from 8.3.1 to 9.0.1.

    Release notes

    Sourced from pillow's releases.

    9.0.1

    https://pillow.readthedocs.io/en/stable/releasenotes/9.0.1.html

    Changes

    • In show_file, use os.remove to remove temporary images. CVE-2022-24303 #6010 [@​radarhere, @​hugovk]
    • Restrict builtins within lambdas for ImageMath.eval. CVE-2022-22817 #6009 [radarhere]

    9.0.0

    https://pillow.readthedocs.io/en/stable/releasenotes/9.0.0.html

    Changes

    ... (truncated)

    Changelog

    Sourced from pillow's changelog.

    9.0.1 (2022-02-03)

    • In show_file, use os.remove to remove temporary images. CVE-2022-24303 #6010 [radarhere, hugovk]

    • Restrict builtins within lambdas for ImageMath.eval. CVE-2022-22817 #6009 [radarhere]

    9.0.0 (2022-01-02)

    • Restrict builtins for ImageMath.eval(). CVE-2022-22817 #5923 [radarhere]

    • Ensure JpegImagePlugin stops at the end of a truncated file #5921 [radarhere]

    • Fixed ImagePath.Path array handling. CVE-2022-22815, CVE-2022-22816 #5920 [radarhere]

    • Remove consecutive duplicate tiles that only differ by their offset #5919 [radarhere]

    • Improved I;16 operations on big endian #5901 [radarhere]

    • Limit quantized palette to number of colors #5879 [radarhere]

    • Fixed palette index for zeroed color in FASTOCTREE quantize #5869 [radarhere]

    • When saving RGBA to GIF, make use of first transparent palette entry #5859 [radarhere]

    • Pass SAMPLEFORMAT to libtiff #5848 [radarhere]

    • Added rounding when converting P and PA #5824 [radarhere]

    • Improved putdata() documentation and data handling #5910 [radarhere]

    • Exclude carriage return in PDF regex to help prevent ReDoS #5912 [hugovk]

    • Fixed freeing pointer in ImageDraw.Outline.transform #5909 [radarhere]

    ... (truncated)

    Commits
    • 6deac9e 9.0.1 version bump
    • c04d812 Update CHANGES.rst [ci skip]
    • 4fabec3 Added release notes for 9.0.1
    • 02affaa Added delay after opening image with xdg-open
    • ca0b585 Updated formatting
    • 427221e In show_file, use os.remove to remove temporary images
    • c930be0 Restrict builtins within lambdas for ImageMath.eval
    • 75b69dd Dont need to pin for GHA
    • cd938a7 Autolink CWE numbers with sphinx-issues
    • 2e9c461 Add CVE IDs
    • Additional commits viewable in compare view

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

    dependencies 
    opened by dependabot[bot] 1
  • Bump pillow from 8.3.1 to 9.0.0

    Bump pillow from 8.3.1 to 9.0.0

    Bumps pillow from 8.3.1 to 9.0.0.

    Release notes

    Sourced from pillow's releases.

    9.0.0

    https://pillow.readthedocs.io/en/stable/releasenotes/9.0.0.html

    Changes

    ... (truncated)

    Changelog

    Sourced from pillow's changelog.

    9.0.0 (2022-01-02)

    • Restrict builtins for ImageMath.eval(). CVE-2022-22817 #5923 [radarhere]

    • Ensure JpegImagePlugin stops at the end of a truncated file #5921 [radarhere]

    • Fixed ImagePath.Path array handling. CVE-2022-22815, CVE-2022-22816 #5920 [radarhere]

    • Remove consecutive duplicate tiles that only differ by their offset #5919 [radarhere]

    • Improved I;16 operations on big endian #5901 [radarhere]

    • Limit quantized palette to number of colors #5879 [radarhere]

    • Fixed palette index for zeroed color in FASTOCTREE quantize #5869 [radarhere]

    • When saving RGBA to GIF, make use of first transparent palette entry #5859 [radarhere]

    • Pass SAMPLEFORMAT to libtiff #5848 [radarhere]

    • Added rounding when converting P and PA #5824 [radarhere]

    • Improved putdata() documentation and data handling #5910 [radarhere]

    • Exclude carriage return in PDF regex to help prevent ReDoS #5912 [hugovk]

    • Fixed freeing pointer in ImageDraw.Outline.transform #5909 [radarhere]

    • Added ImageShow support for xdg-open #5897 [m-shinder, radarhere]

    • Support 16-bit grayscale ImageQt conversion #5856 [cmbruns, radarhere]

    • Convert subsequent GIF frames to RGB or RGBA #5857 [radarhere]

    ... (truncated)

    Commits

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

    dependencies 
    opened by dependabot[bot] 1
  • Bump pillow from 8.3.1 to 8.3.2

    Bump pillow from 8.3.1 to 8.3.2

    Bumps pillow from 8.3.1 to 8.3.2.

    Release notes

    Sourced from pillow's releases.

    8.3.2

    https://pillow.readthedocs.io/en/stable/releasenotes/8.3.2.html

    Security

    • CVE-2021-23437 Raise ValueError if color specifier is too long [hugovk, radarhere]

    • Fix 6-byte OOB read in FliDecode [wiredfool]

    Python 3.10 wheels

    • Add support for Python 3.10 #5569, #5570 [hugovk, radarhere]

    Fixed regressions

    • Ensure TIFF RowsPerStrip is multiple of 8 for JPEG compression #5588 [kmilos, radarhere]

    • Updates for ImagePalette channel order #5599 [radarhere]

    • Hide FriBiDi shim symbols to avoid conflict with real FriBiDi library #5651 [nulano]

    Changelog

    Sourced from pillow's changelog.

    8.3.2 (2021-09-02)

    • CVE-2021-23437 Raise ValueError if color specifier is too long [hugovk, radarhere]

    • Fix 6-byte OOB read in FliDecode [wiredfool]

    • Add support for Python 3.10 #5569, #5570 [hugovk, radarhere]

    • Ensure TIFF RowsPerStrip is multiple of 8 for JPEG compression #5588 [kmilos, radarhere]

    • Updates for ImagePalette channel order #5599 [radarhere]

    • Hide FriBiDi shim symbols to avoid conflict with real FriBiDi library #5651 [nulano]

    Commits
    • 8013f13 8.3.2 version bump
    • 23c7ca8 Update CHANGES.rst
    • 8450366 Update release notes
    • a0afe89 Update test case
    • 9e08eb8 Raise ValueError if color specifier is too long
    • bd5cf7d FLI tests for Oss-fuzz crash.
    • 94a0cf1 Fix 6-byte OOB read in FliDecode
    • cece64f Add 8.3.2 (2021-09-02) [CI skip]
    • e422386 Add release notes for Pillow 8.3.2
    • 08dcbb8 Pillow 8.3.2 supports Python 3.10 [ci skip]
    • Additional commits viewable in compare view

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

    dependencies 
    opened by dependabot[bot] 1
  • Bump oauthlib from 3.1.1 to 3.2.1

    Bump oauthlib from 3.1.1 to 3.2.1

    Bumps oauthlib from 3.1.1 to 3.2.1.

    Release notes

    Sourced from oauthlib's releases.

    3.2.1

    In short

    OAuth2.0 Provider:

    • #803 : Metadata endpoint support of non-HTTPS
    • CVE-2022-36087

    OAuth1.0:

    • #818 : Allow IPv6 being parsed by signature

    General:

    • Improved and fixed documentation warnings.
    • Cosmetic changes based on isort

    What's Changed

    New Contributors

    Full Changelog: https://github.com/oauthlib/oauthlib/compare/v3.2.0...v3.2.1

    3.2.0

    Changelog

    OAuth2.0 Client:

    • #795: Add Device Authorization Flow for Web Application
    • #786: Add PKCE support for Client
    • #783: Fallback to none in case of wrong expires_at format.

    OAuth2.0 Provider:

    • #790: Add support for CORS to metadata endpoint.
    • #791: Add support for CORS to token endpoint.
    • #787: Remove comma after Bearer in WWW-Authenticate

    OAuth2.0 Provider - OIDC:

    • #755: Call save_token in Hybrid code flow
    • #751: OIDC add support of refreshing ID Tokens with refresh_id_token
    • #751: The RefreshTokenGrant modifiers now take the same arguments as the AuthorizationCodeGrant modifiers (token, token_handler, request).

    ... (truncated)

    Changelog

    Sourced from oauthlib's changelog.

    3.2.1 (2022-09-09)

    OAuth2.0 Provider:

    • #803: Metadata endpoint support of non-HTTPS
    • CVE-2022-36087

    OAuth1.0:

    • #818: Allow IPv6 being parsed by signature

    General:

    • Improved and fixed documentation warnings.
    • Cosmetic changes based on isort

    3.2.0 (2022-01-29)

    OAuth2.0 Client:

    • #795: Add Device Authorization Flow for Web Application
    • #786: Add PKCE support for Client
    • #783: Fallback to none in case of wrong expires_at format.

    OAuth2.0 Provider:

    • #790: Add support for CORS to metadata endpoint.
    • #791: Add support for CORS to token endpoint.
    • #787: Remove comma after Bearer in WWW-Authenticate

    OAuth2.0 Provider - OIDC:

    • #755: Call save_token in Hybrid code flow
    • #751: OIDC add support of refreshing ID Tokens with refresh_id_token
    • #751: The RefreshTokenGrant modifiers now take the same arguments as the AuthorizationCodeGrant modifiers (token, token_handler, request).

    General:

    • Added Python 3.9, 3.10, 3.11
    • Improve Travis & Coverage
    Commits
    • 88bb156 Updated date and authors
    • 1a45d97 Prepare 3.2.1 release
    • 0adbbe1 docs: fix typos
    • 6569ec3 docs: Fix a few typos
    • bdc486e Fixed isort imports
    • 7db45bd Fix typo in server.rst
    • b14ad85 chore: s/bode_code_verifier/body_code_verifier/g
    • b123283 Allow non-HTTPS issuer when OAUTHLIB_INSECURE_TRANSPORT. (#803)
    • 2f887b5 Docs: fix Sphinx warnings for better ReadTheDocs generation (#807)
    • d4bafd9 Merge pull request #797 from cclauss/patch-2
    • Additional commits viewable in compare view

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

    dependencies 
    opened by dependabot[bot] 0
  • [Bug?] An error when installing gym_microrts

    [Bug?] An error when installing gym_microrts

    Hi,

    Thanks for the great work. I am interested in this paper and attempt to reproduce the experiments to understand more.

    One error is that Multiple top-level packages discovered in a flat-layout: ['maps', 'docker', 'experiments', 'gym_microrts']. when installing gym_microrts. I guess this is caused by automatic discovery of the setuptools as not explicitly specified in setup.py.

    This workaround works for me: adding py_modules=['gym_microrts'] in the setup.py.

    Specifically, I modified .../envs/act_mask/src/gym-microrts/setup.py to:

    from setuptools import setup
    
    setup(name='gym_microrts',
          version='0.1.0',
          install_requires=['gym', 'dacite', 'jPype1', 'hilbertcurve'],
          py_modules=['gym_microrts']
    )
    

    Previously, it is:

    from setuptools import setup
    
    setup(name='gym_microrts',
          version='0.1.0',
          install_requires=['gym', 'dacite', 'jPype1', 'hilbertcurve']
    )
    

    The environment specifications:

    • anaconda: v4.11.0
    • Python: 3.8

    Error details:

    $ poetry install
    Installing dependencies from lock file
    
    Package operations: 0 installs, 32 updates, 1 removal
    
      • Removing py (1.10.0)
      • Updating certifi (2021.10.8 -> 2021.5.30)
      • Updating charset-normalizer (2.0.7 -> 2.0.1)
      • Updating idna (3.3 -> 3.2)
      • Updating pytz (2021.3 -> 2021.1)
      • Updating urllib3 (1.26.7 -> 1.26.6)
      • Updating cachetools (4.2.4 -> 4.2.2)
      • Updating click (8.0.3 -> 8.0.1)
      • Updating jinja2 (3.0.2 -> 3.0.1)
      • Updating numpy (1.21.4 -> 1.21.0)
      • Updating packaging (21.2 -> 21.0)
      • Updating smmap (5.0.0 -> 4.0.0)
      • Updating typing-extensions (3.10.0.2 -> 3.10.0.0)
      • Updating cloudpickle (2.0.0 -> 1.6.0)
      • Updating cycler (0.11.0 -> 0.10.0)
      • Updating gitdb (4.0.9 -> 4.0.7)
      • Updating google-auth (2.3.3 -> 1.32.1)
      • Updating kiwisolver (1.3.2 -> 1.3.1)
      • Updating pillow (8.4.0 -> 8.3.1)
      • Updating absl-py (0.15.0 -> 0.13.0)
      • Updating configparser (5.1.0 -> 5.0.2)
      • Updating google-auth-oauthlib (0.4.6 -> 0.4.4)
      • Updating grpcio (1.41.1 -> 1.38.1)
      • Updating gym (0.21.0 -> 0.17.3)
      • Updating matplotlib (3.4.3 -> 3.4.2)
      • Updating pandas (1.3.4 -> 1.3.0)
      • Updating protobuf (3.19.1 -> 3.17.3)
      • Updating sentry-sdk (1.4.3 -> 1.4.2)
      • Updating werkzeug (2.0.2 -> 2.0.1)
      • Updating tensorboard (2.7.0 -> 2.5.0)
      • Updating wandb (0.12.6 -> 0.12.2)
      • Updating cleanrl (0.0.1 35694d2 -> 0.0.1 35694d2)
      • Updating gym-microrts (0.0.0 /home/***/gym-microrts -> 0.1.0 b0cabba): Failed
    
      EnvCommandError
    
      Command ['/home/***/anaconda3/envs/act_mask/bin/pip', 'install', '--no-deps', '-U', '-e', '/home/***/anaconda3/envs/act_mask/src/gym-microrts'] errored with the following return code 1, and output: 
      Obtaining file:///home/***/anaconda3/envs/act_mask/src/gym-microrts
        Preparing metadata (setup.py): started
        Preparing metadata (setup.py): finished with status 'error'
        error: subprocess-exited-with-error
        
        × python setup.py egg_info did not run successfully.
        │ exit code: 1
        ╰─> [14 lines of output]
            error: Multiple top-level packages discovered in a flat-layout: ['maps', 'docker', 'experiments', 'gym_microrts'].
            
            To avoid accidental inclusion of unwanted files or directories,
            setuptools will not proceed with this build.
            
            If you are trying to create a single distribution with multiple packages
            on purpose, you should not rely on automatic discovery.
            Instead, consider the following options:
            
            1. set up custom discovery (`find` directive with `include` or `exclude`)
            2. use a `src-layout`
            3. explicitly set `py_modules` or `packages` with a list of names
            
            To find more information, look for "package discovery" on setuptools docs.
            [end of output]
        
        note: This error originates from a subprocess, and is likely not a problem with pip.
      error: metadata-generation-failed
      
      × Encountered error while generating package metadata.
      ╰─> See above for output.
      
      note: This is an issue with the package mentioned above, not pip.
      hint: See above for details.
      
    
      at ~/.poetry/lib/poetry/utils/env.py:1195 in _run
          1191│                 output = subprocess.check_output(
          1192│                     cmd, stderr=subprocess.STDOUT, **kwargs
          1193│                 )
          1194│         except CalledProcessError as e:
        → 1195│             raise EnvCommandError(e, input=input_)
          1196│ 
          1197│         return decode(output)
          1198│ 
          1199│     def execute(self, bin, *args, **kwargs):
    
    opened by cameron-chen 0
  • Masking removed still behaves to some extent

    Masking removed still behaves to some extent

    Hey,

    not sure if that is the proper way of asking you something but I'll give it a try. So according to you paper, you analysed that invalid action masking still works quite well even without using the mask after the training finished. As far as I understood, you compared against the method of giving a reward penalty for executing some invalid action.

    In my case I am experiencing exactly the opposite. The training with invalid action masking outputs not reasonable distributions, as opposed to the negative reward penalty. However, the negative rew. penalty is still not satisfying. In my case, it doesn't bother me too much, since I am having access to the mask even for inference time. Question: Do you have any other insides of what could be the issue. Because training with invalid actions masking accelerates training by a factor of 10. The training pipeline looks great and the number of discrete actions is also not "too big" (8 actions). Another disturbing factor for my issue is, that you prove that the policy gradient is a valid gradient and must therefore be correct for backpropagation.

    Thanks in advance if you have any ideas or hints. Greetings, Fabian

    opened by sycz00 0
  • Bump numpy from 1.21.0 to 1.22.0

    Bump numpy from 1.21.0 to 1.22.0

    Bumps numpy from 1.21.0 to 1.22.0.

    Release notes

    Sourced from numpy's releases.

    v1.22.0

    NumPy 1.22.0 Release Notes

    NumPy 1.22.0 is a big release featuring the work of 153 contributors spread over 609 pull requests. There have been many improvements, highlights are:

    • Annotations of the main namespace are essentially complete. Upstream is a moving target, so there will likely be further improvements, but the major work is done. This is probably the most user visible enhancement in this release.
    • A preliminary version of the proposed Array-API is provided. This is a step in creating a standard collection of functions that can be used across application such as CuPy and JAX.
    • NumPy now has a DLPack backend. DLPack provides a common interchange format for array (tensor) data.
    • New methods for quantile, percentile, and related functions. The new methods provide a complete set of the methods commonly found in the literature.
    • A new configurable allocator for use by downstream projects.

    These are in addition to the ongoing work to provide SIMD support for commonly used functions, improvements to F2PY, and better documentation.

    The Python versions supported in this release are 3.8-3.10, Python 3.7 has been dropped. Note that 32 bit wheels are only provided for Python 3.8 and 3.9 on Windows, all other wheels are 64 bits on account of Ubuntu, Fedora, and other Linux distributions dropping 32 bit support. All 64 bit wheels are also linked with 64 bit integer OpenBLAS, which should fix the occasional problems encountered by folks using truly huge arrays.

    Expired deprecations

    Deprecated numeric style dtype strings have been removed

    Using the strings "Bytes0", "Datetime64", "Str0", "Uint32", and "Uint64" as a dtype will now raise a TypeError.

    (gh-19539)

    Expired deprecations for loads, ndfromtxt, and mafromtxt in npyio

    numpy.loads was deprecated in v1.15, with the recommendation that users use pickle.loads instead. ndfromtxt and mafromtxt were both deprecated in v1.17 - users should use numpy.genfromtxt instead with the appropriate value for the usemask parameter.

    (gh-19615)

    ... (truncated)

    Commits

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

    dependencies 
    opened by dependabot[bot] 0
  • Bump ipython from 7.28.0 to 7.31.1

    Bump ipython from 7.28.0 to 7.31.1

    Bumps ipython from 7.28.0 to 7.31.1.

    Commits

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

    dependencies 
    opened by dependabot[bot] 0
  • Bump ujson from 4.2.0 to 5.1.0

    Bump ujson from 4.2.0 to 5.1.0

    Bumps ujson from 4.2.0 to 5.1.0.

    Release notes

    Sourced from ujson's releases.

    5.1.0

    Changed

    5.0.0

    Added

    Removed

    Fixed

    4.3.0

    Added

    Commits
    • 682c660 Merge pull request #493 from bwoodsend/strip-binaries
    • c1d5b6d [pre-commit.ci] auto fixes from pre-commit.com hooks
    • b9275f7 Strip debugging symbols from Linux binaries.
    • e3ccc5a Merge pull request #492 from hugovk/deploy-twine
    • 243d49b Install Twine to upload to PyPI
    • 269621b Merge pull request #490 from hugovk/rm-3.6
    • cccde3f Drop support for EOL Python 3.6
    • b55049f Merge pull request #491 from bwoodsend/switch-to-ci-build-wheels
    • 04286a6 Drop wheels for Python 3.6. (#490)
    • ab32d48 CI/CD: Ensure that sdists are uploaded last.
    • Additional commits viewable in compare view

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

    dependencies 
    opened by dependabot[bot] 0
Owner
Costa Huang
Computer Science Ph.D student at Drexel University researching Game Artificial Intelligence
Costa Huang
code for `Look Closer to Segment Better: Boundary Patch Refinement for Instance Segmentation`

Look Closer to Segment Better: Boundary Patch Refinement for Instance Segmentation (CVPR 2021) Introduction PBR is a conceptually simple yet effective

H.Chen 143 Jan 5, 2023
A Closer Look at Structured Pruning for Neural Network Compression

A Closer Look at Structured Pruning for Neural Network Compression Code used to reproduce experiments in https://arxiv.org/abs/1810.04622. To prune, w

Bayesian and Neural Systems Group 140 Dec 5, 2022
Look Closer: Bridging Egocentric and Third-Person Views with Transformers for Robotic Manipulation

Look Closer: Bridging Egocentric and Third-Person Views with Transformers for Robotic Manipulation Official PyTorch implementation for the paper Look

Rishabh Jangir 20 Nov 24, 2022
Deep Reinforcement Learning by using an on-policy adaptation of Maximum a Posteriori Policy Optimization (MPO)

V-MPO Simple code to demonstrate Deep Reinforcement Learning by using an on-policy adaptation of Maximum a Posteriori Policy Optimization (MPO) in Pyt

Nugroho Dewantoro 9 Jun 6, 2022
This project provides a stock market environment using OpenGym with Deep Q-learning and Policy Gradient.

Stock Trading Market OpenAI Gym Environment with Deep Reinforcement Learning using Keras Overview This project provides a general environment for stoc

Kim, Ki Hyun 769 Dec 25, 2022
PGPortfolio: Policy Gradient Portfolio, the source code of "A Deep Reinforcement Learning Framework for the Financial Portfolio Management Problem"(https://arxiv.org/pdf/1706.10059.pdf).

This is the original implementation of our paper, A Deep Reinforcement Learning Framework for the Financial Portfolio Management Problem (arXiv:1706.1

Zhengyao Jiang 1.5k Dec 29, 2022
Trains an agent with stochastic policy gradient ascent to solve the Lunar Lander challenge from OpenAI

Introduction This script trains an agent with stochastic policy gradient ascent to solve the Lunar Lander challenge from OpenAI. In order to run this

Momin Haider 0 Jan 2, 2022
A PyTorch implementation of Learning to learn by gradient descent by gradient descent

Intro PyTorch implementation of Learning to learn by gradient descent by gradient descent. Run python main.py TODO Initial implementation Toy data LST

Ilya Kostrikov 300 Dec 11, 2022
Official implementation of GraphMask as presented in our paper Interpreting Graph Neural Networks for NLP With Differentiable Edge Masking.

GraphMask This repository contains an implementation of GraphMask, the interpretability technique for graph neural networks presented in our ICLR 2021

Michael Schlichtkrull 29 Sep 2, 2022
Code & Data for the Paper "Time Masking for Temporal Language Models", WSDM 2022

Time Masking for Temporal Language Models This repository provides a reference implementation of the paper: Time Masking for Temporal Language Models

Guy Rosin 12 Jan 6, 2023
U-Net Implementation: Convolutional Networks for Biomedical Image Segmentation" using the Carvana Image Masking Dataset in PyTorch

U-Net Implementation By Christopher Ley This is my interpretation and implementation of the famous paper "U-Net: Convolutional Networks for Biomedical

Christopher Ley 1 Jan 6, 2022
Incremental Transformer Structure Enhanced Image Inpainting with Masking Positional Encoding (CVPR2022)

Incremental Transformer Structure Enhanced Image Inpainting with Masking Positional Encoding by Qiaole Dong*, Chenjie Cao*, Yanwei Fu Paper and Supple

Qiaole Dong 190 Dec 27, 2022
Allows including an action inside another action (by preprocessing the Yaml file). This is how composite actions should have worked.

actions-includes Allows including an action inside another action (by preprocessing the Yaml file). Instead of using uses or run in your action step,

Tim Ansell 70 Nov 4, 2022
Official implementation of ACTION-Net: Multipath Excitation for Action Recognition (CVPR'21).

ACTION-Net Official implementation of ACTION-Net: Multipath Excitation for Action Recognition (CVPR'21). Getting Started EgoGesture data folder struct

V-Sense 171 Dec 26, 2022
Official Pytorch Implementation of 'Learning Action Completeness from Points for Weakly-supervised Temporal Action Localization' (ICCV-21 Oral)

Learning-Action-Completeness-from-Points Official Pytorch Implementation of 'Learning Action Completeness from Points for Weakly-supervised Temporal A

Pilhyeon Lee 67 Jan 3, 2023
Human Action Controller - A human action controller running on different platforms.

Human Action Controller (HAC) Goal A human action controller running on different platforms. Fun Easy-to-use Accurate Anywhere Fun Examples Mouse Cont

null 27 Jul 20, 2022
The official TensorFlow implementation of the paper Action Transformer: A Self-Attention Model for Short-Time Pose-Based Human Action Recognition

Action Transformer A Self-Attention Model for Short-Time Human Action Recognition This repository contains the official TensorFlow implementation of t

PIC4SeRCentre 20 Jan 3, 2023
Pytorch implementations of popular off-policy multi-agent reinforcement learning algorithms, including QMix, VDN, MADDPG, and MATD3.

Off-Policy Multi-Agent Reinforcement Learning (MARL) Algorithms This repository contains implementations of various off-policy multi-agent reinforceme

null 183 Dec 28, 2022
A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.

Light Gradient Boosting Machine LightGBM is a gradient boosting framework that uses tree based learning algorithms. It is designed to be distributed a

Microsoft 14.5k Jan 8, 2023