A toolkit for developing and comparing reinforcement learning algorithms.

Overview

Status: Maintenance (expect bug fixes and minor updates)

OpenAI Gym

OpenAI Gym is a toolkit for developing and comparing reinforcement learning algorithms. This is the gym open-source library, which gives you access to a standardized set of environments.


See the What's New section below.

gym makes no assumptions about the structure of your agent, and is compatible with any numerical computation library, such as TensorFlow or Theano. You can use it from Python code, and soon from other languages.

If you're not sure where to start, we recommend beginning with the docs on our site. See also the FAQ.

A whitepaper for OpenAI Gym is available at http://arxiv.org/abs/1606.01540, and here's a BibTeX entry that you can use to cite it in a publication:

@misc{1606.01540,
  Author = {Greg Brockman and Vicki Cheung and Ludwig Pettersson and Jonas Schneider and John Schulman and Jie Tang and Wojciech Zaremba},
  Title = {OpenAI Gym},
  Year = {2016},
  Eprint = {arXiv:1606.01540},
}

Basics

There are two basic concepts in reinforcement learning: the environment (namely, the outside world) and the agent (namely, the algorithm you are writing). The agent sends actions to the environment, and the environment replies with observations and rewards (that is, a score).

The core gym interface is Env, which is the unified environment interface. There is no interface for agents; that part is left to you. The following are the Env methods you should know:

  • reset(self): Reset the environment's state. Returns observation.
  • step(self, action): Step the environment by one timestep. Returns observation, reward, done, info.
  • render(self, mode='human'): Render one frame of the environment. The default mode will do something human friendly, such as pop up a window.
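
To make this concrete, here is a minimal sketch of the agent-environment loop built from the methods above (CartPole-v0 and the random action choice are purely illustrative; any registered environment works the same way):

import gym

env = gym.make('CartPole-v0')
observation = env.reset()
done = False
while not done:
    env.render()                        # draw one frame
    action = env.action_space.sample()  # stand-in for your agent's policy
    observation, reward, done, info = env.step(action)
env.close()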

Supported systems

We currently support Linux and OS X running Python 3.5 -- 3.8. Windows support is experimental: algorithmic, toy_text, classic_control, and atari should work on Windows (see the next section for installation instructions); nevertheless, proceed at your own risk.

Installation

You can perform a minimal install of gym with:

git clone https://github.com/openai/gym.git
cd gym
pip install -e .

If you prefer, you can do a minimal install of the packaged version directly from PyPI:

pip install gym

You'll be able to run a few environments right away:

  • algorithmic
  • toy_text
  • classic_control (you'll need pyglet to render though)

We recommend playing with those environments at first, and then later installing the dependencies for the remaining environments.

You can also run gym on gitpod.io to play with the examples online. In the preview window you can click on the mp4 file you want to view; to watch a different one, press the back button and select another file.

Installing everything

To install the full set of environments, you'll need to have some system packages installed. We'll build out the list here over time; please let us know what you end up installing on your platform. Also, take a look at the docker files (py.Dockerfile) to see the composition of our CI-tested images.

On Ubuntu 16.04 and 18.04:

apt-get install -y libglu1-mesa-dev libgl1-mesa-dev libosmesa6-dev xvfb ffmpeg curl patchelf libglfw3 libglfw3-dev cmake zlib1g zlib1g-dev swig

MuJoCo has a proprietary dependency we can't set up for you. Follow the instructions in the mujoco-py package for help. Note that we currently do not support MuJoCo 2.0 and above, so you will need to install a version of mujoco-py which is built for a lower version of MuJoCo like MuJoCo 1.5 (example - mujoco-py-1.50.1.0). As an alternative to mujoco-py, consider PyBullet which uses the open source Bullet physics engine and has no license requirement.

Once you're ready to install everything, run pip install -e '.[all]' (or pip install 'gym[all]').

Pip version

To run pip install -e '.[all]', you'll need a semi-recent pip. Please make sure your pip is at least at version 1.5.0. You can upgrade using the following: pip install --ignore-installed pip. Alternatively, you can open setup.py and install the dependencies by hand.

Rendering on a server

If you're trying to render video on a server, you'll need to connect a fake display. The easiest way to do this is by running under xvfb-run (on Ubuntu, install the xvfb package):

xvfb-run -s "-screen 0 1400x900x24" bash
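
For example, to run your own training script headlessly under the virtual display (the script name is illustrative):

xvfb-run -s "-screen 0 1400x900x24" python your_agent.py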

Installing dependencies for specific environments

If you'd like to install the dependencies for only specific environments, see setup.py. We maintain the lists of dependencies on a per-environment group basis.
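
For example, to pull in the dependencies for a single group, you can install the corresponding extra (the extras names mirror the groups defined in setup.py; atari and box2d are shown only as examples):

pip install -e '.[atari]'
pip install 'gym[box2d]'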

Environments

See List of Environments and the gym site.

For information on creating your own environments, see Creating your own Environments.
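
To give a sense of what that involves, here is a minimal sketch of a custom environment that follows the Env interface from the Basics section (the name, spaces, and dynamics are purely illustrative):

import gym
from gym import spaces
import numpy as np

class CoinFlipEnv(gym.Env):
    """Toy example: guess the outcome of a coin flip."""

    def __init__(self):
        self.action_space = spaces.Discrete(2)       # guess: 0 = tails, 1 = heads
        self.observation_space = spaces.Discrete(2)  # result of the last flip
        self._last_flip = 0

    def reset(self):
        self._last_flip = np.random.randint(2)
        return self._last_flip

    def step(self, action):
        self._last_flip = np.random.randint(2)
        reward = 1.0 if action == self._last_flip else 0.0
        done = True  # one-step episodes
        return self._last_flip, reward, done, {}

    def render(self, mode='human'):
        print('last flip:', self._last_flip)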

Examples

See the examples directory.

Testing

We are using pytest for tests. You can run them via:

pytest

Resources

What's new

  • 2020-12-18 (v 0.18.0)
    • Add python 3.9 support
    • Remove python 3.5 support (thanks @justinkterry on both!)
    • TimeAwareObservationWrapper (thanks @zuoxingdong!)
    • Space-related fixes and tests (thanks @wmmc88!)
  • 2020-09-29 (v 0.17.3)
    • Allow custom spaces in VectorEnv (thanks @tristandeleu!)
    • CarRacing performance improvements (thanks @leocus!)
    • Dict spaces are now iterable (thanks @NotNANtoN!)
  • 2020-05-08 (v 0.17.2)
    • remove unnecessary precision warning when creating Box with scalar bounds - thanks @johannespitz!
    • remove six from the dependencies
    • FetchEnv sample goal range can be specified through kwargs - thanks @YangRui2015!
  • 2020-03-05 (v 0.17.1)
    • update cloudpickle dependency to be >=1.2.0,<1.4.0
  • 2020-02-21 (v 0.17.0)
    • Drop python 2 support
    • Add python 3.8 build
  • 2020-02-09 (v 0.16.0)
    • EnvSpec API change - remove tags field (retro-active version bump, the changes are actually already in the codebase since 0.15.5 - thanks @wookayin for keeping us in check!)
  • 2020-02-03 (v0.15.6)
    • pyglet 1.4 compatibility (this time for real :))
    • Fixed the bug in BipedalWalker and BipedalWalkerHardcore, bumped version to 3 (thanks @chozabu!)
  • 2020-01-24 (v0.15.5)
    • pyglet 1.4 compatibility
    • remove python-opencv from the requirements
  • 2019-11-08 (v0.15.4)
    • Added multiple env wrappers (thanks @zuoxingdong and @hartikainen!)
    • Removed mujoco >= 2.0 support due to lack of tests
  • 2019-10-09 (v0.15.3)
    • VectorEnv modifications - unified the VectorEnv api (added reset_async, reset_wait, step_async, step_wait methods to SyncVectorEnv); more flexibility in AsyncVectorEnv workers
  • 2019-08-23 (v0.15.2)
    • More Wrappers - AtariPreprocessing, FrameStack, GrayScaleObservation, FilterObservation, FlattenDictObservationsWrapper, PixelObservationWrapper, TransformReward (thanks @zuoxingdong, @hartikainen)
    • Remove rgb_rendering_tracking logic from mujoco environments (default behavior stays the same for the -v3 environments, rgb rendering returns a view from tracking camera)
    • Velocity goal constraint for MountainCar (thanks @abhinavsagar)
    • Taxi-v2 -> Taxi-v3 (add missing wall in the map to replicate env as described in the original paper, thanks @kobotics)
  • 2019-07-26 (v0.14.0)
    • Wrapper cleanup
    • Spec-related bug fixes
    • VectorEnv fixes
  • 2019-06-21 (v0.13.1)
    • Bug fix for ALE 0.6 difficulty modes
    • Use narrow range for pyglet versions
  • 2019-06-21 (v0.13.0)
    • Upgrade to ALE 0.6 (atari-py 0.2.0) (thanks @JesseFarebro!)
  • 2019-06-21 (v0.12.6)
    • Added vectorized environments (thanks @tristandeleu!). A vectorized environment runs multiple copies of an environment in parallel. To create a vectorized version of an environment, use gym.vector.make(env_id, num_envs, **kwargs), for instance gym.vector.make('Pong-v4', 16); see the sketch after this changelog.
  • 2019-05-28 (v0.12.5)
    • fixed Fetch-slide environment to be solvable.
  • 2019-05-24 (v0.12.4)
    • remove pyopengl dependency and use more narrow atari-py and box2d-py versions
  • 2019-03-25 (v0.12.1)
    • rgb rendering in MuJoCo locomotion -v3 environments now comes from tracking camera (so that agent does not run away from the field of view). The old behaviour can be restored by passing rgb_rendering_tracking=False kwarg. Also, a potentially breaking change!!! Wrapper class now forwards methods and attributes to wrapped env.
  • 2019-02-26 (v0.12.0)
    • release mujoco environments v3 with support for gym.make kwargs such as xml_file, ctrl_cost_weight, reset_noise_scale etc
  • 2019-02-06 (v0.11.0)
    • remove gym.spaces.np_random common PRNG; use per-instance PRNG instead.
    • support for kwargs in gym.make
    • lots of bugfixes
  • 2018-02-28: Release of a set of new robotics environments.

  • 2018-01-25: Made some aesthetic improvements and removed unmaintained parts of gym. This may seem like a downgrade in functionality, but it is actually a long-needed cleanup in preparation for some great new things that will be released in the next month.

    • Now your Env and Wrapper subclasses should define step, reset, render, close, seed rather than underscored method names.
    • Removed the board_game, debugging, safety, parameter_tuning environments since they're not being maintained by us at OpenAI. We encourage authors and users to create new repositories for these environments.
    • Changed MultiDiscrete action space to range from [0, ..., n-1] rather than [a, ..., b-1].
    • No more render(close=True), use env-specific methods to close the rendering.
    • Removed scoreboard directory, since site doesn't exist anymore.
    • Moved gym/monitoring to gym/wrappers/monitoring
    • Add dtype to Space.
    • Not using python's built-in logging module anymore, using gym.logger
  • 2018-01-24: All continuous control environments now use mujoco_py >= 1.50. Versions have been updated accordingly to -v2, e.g. HalfCheetah-v2. Performance should be similar (see https://github.com/openai/gym/pull/834) but there are likely some differences due to changes in MuJoCo.

  • 2017-06-16: Make env.spec into a property to fix a bug that occurs when you try to print out an unregistered Env.

  • 2017-05-13: BACKWARDS INCOMPATIBILITY: The Atari environments are now at v4. To keep using the old v3 environments, keep gym <= 0.8.2 and atari-py <= 0.0.21. Note that the v4 environments will not give identical results to existing v3 results, although differences are minor. The v4 environments incorporate the latest Arcade Learning Environment (ALE), including several ROM fixes, and now handle loading and saving of the emulator state. While seeds still ensure determinism, the effect of any given seed is not preserved across this upgrade because the random number generator in ALE has changed. The *NoFrameSkip-v4 environments should be considered the canonical Atari environments from now on.

  • 2017-03-05: BACKWARDS INCOMPATIBILITY: The configure method has been removed from Env. configure was not used by gym, but was used by some dependent libraries including universe. These libraries will migrate away from the configure method by using wrappers instead. This change is on master and will be released with 0.8.0.

  • 2016-12-27: BACKWARDS INCOMPATIBILITY: The gym monitor is now a wrapper. Rather than starting monitoring as env.monitor.start(directory), envs are now wrapped as follows: env = wrappers.Monitor(env, directory). This change is on master and will be released with 0.7.0.

  • 2016-11-1: Several experimental changes to how a running monitor interacts with environments. The monitor will now raise an error if reset() is called when the env has not returned done=True. The monitor will only record complete episodes where done=True. Finally, the monitor no longer calls seed() on the underlying env, nor does it record or upload seed information.

  • 2016-10-31: We're experimentally expanding the environment ID format to include an optional username.

  • 2016-09-21: Switch the Gym automated logger setup to configure the root logger rather than just the 'gym' logger.

  • 2016-08-17: Calling close on an env will also close the monitor and any rendering windows.

  • 2016-08-17: The monitor will no longer write manifest files in real-time, unless write_upon_reset=True is passed.

  • 2016-05-28: For controlled reproducibility, envs now support seeding (cf #91 and #135). The monitor records which seeds are used. We will soon add seed information to the display on the scoreboard.
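
As noted in the v0.12.6 entry above, environments can be run in vectorized form. A minimal sketch of creating and stepping a vectorized environment (the environment id and number of copies are illustrative):

import gym

envs = gym.vector.make('CartPole-v1', num_envs=4)  # 4 copies stepped in parallel
observations = envs.reset()                        # batch of observations
actions = envs.action_space.sample()               # batch of random actions
observations, rewards, dones, infos = envs.step(actions)
envs.close()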

Issues
  • Windows support

    Windows support

    It would be nice if you could add support for Windows.

    opened by salarian 101
  • Box2d won't find some RAND_LIMIT_swigconstant

    Box2d won't find some RAND_LIMIT_swigconstant

    Hello!

    It's probably some silly mistake on my side, but I wasn't able to fix it by random lever pulling, as usual.

    Installing Box2d as in the instructions (using pip install -e .[all]) throws an error when trying to use some of the Box2D examples.

    Code that reproduces the issue:

    import gym
    atari = gym.make('LunarLander-v0')
    atari.reset()
    
    [2016-05-16 02:14:25,430] Making new env: LunarLander-v0
    
    ---------------------------------------------------------------------------
    AttributeError                            Traceback (most recent call last)
    <ipython-input-1-f89e78f4410b> in <module>()
          1 import gym
    ----> 2 atari = gym.make('LunarLander-v0')
          3 atari.reset()
          4 #plt.imshow(atari.render('rgb_array'))
    
    /home/jheuristic/yozhik/gym/gym/envs/registration.pyc in make(self, id)
         77         logger.info('Making new env: %s', id)
         78         spec = self.spec(id)
    ---> 79         return spec.make()
         80 
         81     def all(self):
    
    /home/jheuristic/yozhik/gym/gym/envs/registration.pyc in make(self)
         52             raise error.Error('Attempting to make deprecated env {}. (HINT: is there a newer registered version of this env?)'.format(self.id))
         53 
    ---> 54         cls = load(self._entry_point)
         55         env = cls(**self._kwargs)
         56 
    
    /home/jheuristic/yozhik/gym/gym/envs/registration.pyc in load(name)
         11 def load(name):
         12     entry_point = pkg_resources.EntryPoint.parse('x={}'.format(name))
    ---> 13     result = entry_point.load(False)
         14     return result
         15 
    
    /home/jheuristic/thenv/local/lib/python2.7/site-packages/pkg_resources/__init__.pyc in load(self, require, *args, **kwargs)
       2378         if require:
       2379             self.require(*args, **kwargs)
    -> 2380         return self.resolve()
       2381 
       2382     def resolve(self):
    
    /home/jheuristic/thenv/local/lib/python2.7/site-packages/pkg_resources/__init__.pyc in resolve(self)
       2384         Resolve the entry point from its module and attrs.
       2385         """
    -> 2386         module = __import__(self.module_name, fromlist=['__name__'], level=0)
       2387         try:
       2388             return functools.reduce(getattr, self.attrs, module)
    
    /home/jheuristic/yozhik/gym/gym/envs/box2d/__init__.py in <module>()
    ----> 1 from gym.envs.box2d.lunar_lander import LunarLander
          2 from gym.envs.box2d.bipedal_walker import BipedalWalker, BipedalWalkerHardcore
    
    /home/jheuristic/yozhik/gym/gym/envs/box2d/lunar_lander.py in <module>()
          3 from six.moves import xrange
          4 
    ----> 5 import Box2D
          6 from Box2D.b2 import (edgeShape, circleShape, fixtureDef, polygonShape, revoluteJointDef, contactListener)
          7 
    
    /home/jheuristic/thenv/local/lib/python2.7/site-packages/Box2D/__init__.py in <module>()
         18 # 3. This notice may not be removed or altered from any source distribution.
         19 #
    ---> 20 from .Box2D import *
         21 __author__ = '$Date$'
         22 __version__ = '2.3.1'
    
    /home/jheuristic/thenv/local/lib/python2.7/site-packages/Box2D/Box2D.py in <module>()
        433     return _Box2D.b2CheckPolygon(shape, additional_checks)
        434 
    --> 435 _Box2D.RAND_LIMIT_swigconstant(_Box2D)
        436 RAND_LIMIT = _Box2D.RAND_LIMIT
        437 
    
    AttributeError: 'module' object has no attribute 'RAND_LIMIT_swigconstant'
    
    

    What didn't help:

    pip uninstall gym
    apt-get install -y python-numpy python-dev cmake zlib1g-dev libjpeg-dev xvfb libav-tools xorg-dev python-opengl
    git clone https://github.com/openai/gym
    cd gym
    pip install -e .[all] --upgrade
    

    The OS is Ubuntu 14.04 Server x64. It may be a clue that I am running the thing from inside a python2 virtualenv (with all numpys, etc. installed).

    opened by justheuristic 52
  • ImportError: sys.meta_path is None, Python is likely shutting down

    ImportError: sys.meta_path is None, Python is likely shutting down

    I'm using macOS. When the python script finishes, it prints errors like the following:

    This is a script that causes the problem:

    import gym
    env = gym.make('SpaceInvaders-v0')
    env.reset()
    env.render()
    

    And after executing it, the error occurs:

    ~/G/qlearning $ python atari.py
    Exception ignored in: <bound method SimpleImageViewer.__del__ of <gym.envs.classic_control.rendering.SimpleImageViewer object at 0x1059ab400>>
    Traceback (most recent call last):
      File "/Users/louchenyao/anaconda3/lib/python3.6/site-packages/gym/envs/classic_control/rendering.py", line 347, in __del__
      File "/Users/louchenyao/anaconda3/lib/python3.6/site-packages/gym/envs/classic_control/rendering.py", line 343, in close
      File "/Users/louchenyao/anaconda3/lib/python3.6/site-packages/pyglet/window/cocoa/__init__.py", line 281, in close
      File "/Users/louchenyao/anaconda3/lib/python3.6/site-packages/pyglet/window/__init__.py", line 770, in close
    ImportError: sys.meta_path is None, Python is likely shutting down
    

    It doesn't affect the environment while running; it's just a little annoying.

    opened by louchenyao 38
  • Game window hangs up

    Game window hangs up

    Hi,

    I am a beginner with gym. After I render CartPole:

    env = gym.make('CartPole-v0')
    env.reset()
    env.render()

    The window is launched from a Jupyter notebook but it hangs immediately, and then the notebook is dead. I am using Python 3.5.4 on OSX 10.11.6. What could be the problem here?

    Thanks.

    opened by vishrathi 36
  • Support MuJoCo 1.5

    Support MuJoCo 1.5

    In order to use MuJoCo on a recent Mac, you need to be using MuJoCo 1.5: https://github.com/openai/mujoco-py/issues/36. Otherwise you get:

    >>> import mujoco_py
    ERROR: Could not open disk
    

    This is because MuJoCo before 1.5 doesn't support NVMe disks.

    Gym depends on MuJoCo via mujoco-py, which just released support for MuJoCo 1.5.

    It looks like maybe this is something you're already working on? Or would it be useful for me to look into fixing it?

    opened by jeffkaufman 36
  • AttributeError: module 'gym' has no attribute 'make'

    AttributeError: module 'gym' has no attribute 'make'

    >>> import gym
    >>> env = gym.make('Copy-v0')
    Traceback (most recent call last):
      File "<pyshell#5>", line 1, in <module>
        env = gym.make('Copy-v0')
    AttributeError: module 'gym' has no attribute 'make'
    >>> 
    

    I wanna know why. @jonasschneider

    opened by thomastao0215 34
  • Environment working, but not render()

    Environment working, but not render()

    Configuration:

    Dell XPS 15, Anaconda 3.6, Python 3.5, NVIDIA GTX 1050

    I installed OpenAI Gym through pip. When I run the code below, I can execute steps in the environment, which return all information of the specific environment, but the render() method just gives me a blank screen. When I exit python the blank screen closes in a normal way.

    Code:

    import gym
    env = gym.make('CartPole-v0')
    env.reset()
    env.render()
    for i in range(1000):
        env.step(env.action_space.sample())
    

    After hours of google searching I think the issue might have something to do with pyglet, the package used for rendering, and possibly a conflict with my nvidia graphics card? All help is welcome. Thanks!

    opened by cpatyn 34
  • Make Tuple and Dicts be seedable with lists and dicts of seeds + make the seed in default initialization controllable

    Make Tuple and Dicts be seedable with lists and dicts of seeds + make the seed in default initialization controllable

    Since seed() is being called in default initialization of Space, it should be controllable for reproducibility.

    opened by RaghuSpaceRajan 31
  • Write more documentation about environments

    Write more documentation about environments

    We should write a more detailed explanation of every environment, in particular, how the reward function is computed.

    opened by joschu 27
  • Gym on Mac OS X Big Sur

    Gym on Mac OS X Big Sur

    After updating the OS all my scripts stopped working; most of the errors are related to pyglet and paths to some libraries. Some errors that I came across:

    Error occurred while running `from pyglet.gl import *`
    HINT: make sure you have OpenGL install. On Ubuntu, you can run 'apt-get install python-opengl'.
    If you're running on a server, you may need a virtual frame buffer; something like this should work:
    'xvfb-run -s "-screen 0 1400x900x24" python <your_script.py>'
    

    ImportError: Can't find framework /System/Library/Frameworks/OpenGL.framework.

    I researched and tried various possible solutions on the internet, but nothing solved it. Does anyone have an idea of how to fix this problem? I appreciate any help. I'm using Python 3.8, but I tried 3.7 too and was unsuccessful.

    opened by marcelovidigal 25
  • Update README.md

    Update README.md

    null

    opened by treyshotz 0
  • Add Python 3.10 testing and support

    Add Python 3.10 testing and support

    null

    opened by jkterry1 1
  • [Bug Report] Bug title

    [Bug Report] Bug title

    If you are submitting a bug report, please fill in the following details and use the tag [bug].

    Describe the bug: When trying to save a video of the Hopper-v2 environment, it stops after a few (14) steps with the error: Process finished with exit code 139 (interrupted by signal 11: SIGSEGV). I am able to render the environment perfectly (when not capturing), and the code does work for the Cartpole environment. In the Hopper case, the first 14 frames are captured and saved in the mp4 file, but I do not understand why the process finishes mid-episode.

    Code example

    import os
    import gym  # needed for gym.make below
    from gym.wrappers.monitoring.video_recorder import VideoRecorder
    
    path_project = os.path.abspath(os.path.join(__file__, ".."))
    path_of_video_with_name = os.path.join(path_project, "videotest.mp4")
    env = gym.make('Hopper-v2') # for making environment
    state = env.reset()
    video_recorder = None
    video_recorder = VideoRecorder(env, path_of_video_with_name, enabled=True)
    
    for _ in range(1000):
       env.render()
       video_recorder.capture_frame()
       env.step(env.action_space.sample()) # take a random action
    
    print("Saved video.")
    
    video_recorder.close()
    video_recorder.enabled = False
    env.close()
    

    System Info

    I am using Ubuntu 20.04 (Linux), Gym version 0.21.0 installed via pip install gym, and Python version 3.7.6.


    Checklist

    • [x] I have checked that there is no similar issue in the repo (required)
    mujoco 
    opened by GitJarl 0
  • [Bug Report] Uselessly calling `gym.Env`'s inexisting `__init__()` in `VectorEnv`

    [Bug Report] Uselessly calling `gym.Env`'s inexisting `__init__()` in `VectorEnv`

    Describe the bug: The VectorEnv class inherits from the gym.Env class (this makes sense). However, the gym.Env class does not have an explicitly defined __init__() method. So why do we bother calling super(VectorEnv, self).__init__() here? It actually leads to an unwanted behaviour in certain cases (see below).

    Code example: Everything is fine in this situation and the call to gym.Env.__init__() does nothing here.

    class MyCustomVecEnv(gym.vector.VectorEnv):
        def __init__(self):
            super(MyCustomVecEnv, self).__init__()   # Calling gym.vector.VectorEnv.__init__()
    

    Here, it gets shaky:

    class MyAbstractClass:
        def __init__(self, a):
            print("I'm in MyAbstractClass.__init__")
    
    
    class MyCustomVecEnv(gym.vector.VectorEnv, MyAbstractClass):
        def __init__(self):
            gym.vector.VectorEnv.__init__(self, 4, gym.spaces.Discrete(2), gym.spaces.Discrete(2))
    
    
    env = MyCustomVecEnv()
    
    

    Output:

    Traceback (most recent call last):
      File "/home/gaetan/downloads/test.py", line 57, in <module>
        env = MyCustomVecEnv()
      File "/home/gaetan/downloads/test.py", line 54, in __init__
        gym.vector.VectorEnv.__init__(self, 4, gym.spaces.Discrete(2), gym.spaces.Discrete(2))
      File "/opt/miniconda3/envs/rlenv/lib/python3.9/site-packages/gym/vector/vector_env.py", line 33, in __init__
        super(VectorEnv, self).__init__()
    TypeError: __init__() missing 1 required positional argument: 'a'
    

    This code is not designed to call MyAbstractClass.__init__; however, when executing the problematic line, Python finds nothing to call (because gym.Env has no __init__ method) and so decides by itself to call MyAbstractClass.__init__ instead.

    System Info: Describe the characteristics of your environment:

    • How Gym was installed (pip, docker, source, ...): conda environment.
    • Arch Linux. Kernel 5.14.11
    • Python version: 3.9

    Checklist

    • [x] I have checked that there is no similar issue in the repo (required)
    help wanted 
    opened by GaetanLepage 0
  • Seeding update

    Seeding update

    See Issue #1663

    This is a bit of a ride. The base change is getting rid of the custom seeding utils of gym, and instead using np.random.default_rng() as is recommended with modern versions of NumPy. I kept the gym.utils.seeding.np_random interface and changed it to be basically a synonym for default_rng (with some API differences, consistent with the old np_random).

    Because the API (then RandomState, now Generator) changed a bit, np_random.randint calls were replaced with np_random.integers, np_random.rand -> np_random.random, np_random.randn -> np_random.standard_normal. This is all in accordance with the recommended NumPy conversion.
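
    To illustrate the renaming, here is a minimal sketch assuming the post-PR behaviour described above (np_random now returns a NumPy Generator):

    from gym.utils import seeding

    np_random, seed = seeding.np_random(42)
    sample = np_random.integers(0, 10)   # was np_random.randint(0, 10) with RandomState
    noise = np_random.standard_normal()  # was np_random.randn()
    value = np_random.random()           # was np_random.rand()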

    My doubts in order of (subjective) importance

    Doubt number 1:

    In gym/utils/seeding.py#L18 I'm accessing a "protected" variable seed_seq. This serves to access the random seed that was automatically generated under the hood when the passed seed is None. (It also gives the correct value if the passed seed is an integer.) An alternative solution would be restoring the whole create_seed machinery which generates a random initial seed from os.urandom. I was unable to find another way to get the initial seed of a default Generator (default_rng(None)) instance.

    Doubt number 2:

    In gym/spaces/multi_discrete.py#L64. Turns out that a Generator doesn't directly support get_state and set_state. The same functionality seems to be achievable by accessing the bit generator and modifying its state directly (without getters/setters).

    Doubt number 3:

    As mentioned earlier, this version maintains the gym.utils.seeding file with just a single function. Functionally, I don't think that's a problem, but it might feel a bit redundant from a stylistic point of view. This could be replaced by changing something like 17 calls of this function that occur in the codebase, but at the moment I think it'd be a bad move. The reason is that the function passes through the seed that's generated if the passed seed is None (see Doubt 1), which has to be obtained through mildly sketchy means, so it's better to keep it contained within the function. I don't think passing the seed through is extremely necessary, but removing it would somewhat significantly change the actual external API, and I tend to be hesitant about changes like this if there's no good reason. Overall I think it's good to keep the seeding.np_random function to keep some consistency with previous versions. The alternative is just completely removing the concept of "gym seeding" and using NumPy directly. (Right now "gym seeding" is basically an interface for NumPy seeding.)

    Doubt number 4:

    Pinging @araffin as there's a possibility this will (again) break some old pickled spaces in certain cases, and I know this was an issue with SB3 and the model zoo. Specifically, if you create a Space with the current master branch, run space.sample() at least once, and then pickle it, it will be pickled with a RandomState instance, which is now considered a legacy generator in NumPy. If you unpickle it using new gym code (i.e. this PR), space.np_random will still point to a RandomState, but the rest of the code expects space.np_random to be a Generator instance, which has a few API changes (see the beginning of this post).

    Overall I don't know how important it is for the internal details of gym objects to remain the same - which is more or less necessary for old objects to be unpicklable in new gym versions. There's probably a way to use a custom unpickling protocol as a compatibility layer - I'm not sufficiently familiar with this to do it, but I imagine it should be doable on the user side? (i.e. not in gym)

    Doubt number 2137: (very low importance)

    This doesn't technically solve #2210. IMHO this is absolutely acceptable, because np.random.seed is part of the legacy seeding mechanism, and is officially discouraged by NumPy. Proper usage yields expected results:

    import numpy as np
    from gym.utils import seeding
    
    user_given_seed = None
    np_random, seed = seeding.np_random(user_given_seed)
    
    # since in some places np.random.randn or similar is used we want to seed the global numpy random generator as well
    rng = np.random.default_rng(seed)
    

    tl;dr

    This definitely needs another set of eyes on it because I'm not confident enough about the nitty-gritty details of NumPy RNG. There are a few things I'm not 100% happy with from a stylistic point of view, but as far as my understanding goes, the functionality is what it should be. There's also the question of supporting old pickled objects, which I think is a whole different topic that needs discussing now that gym is maintained again.

    opened by RedTachyon 42
  • [Bug Report] License terms for parts of the project unclear

    [Bug Report] License terms for parts of the project unclear

    The LICENSE.md file says that parts of this project are copyrighted by Roboti LLC. There is no license specified for the code this applies to. Seeing as that code is distributed together with, and is entangled with, OpenAI Gym, I believe this leaves the overall license state of the project in limbo. It certainly makes it very hard for any serious project to rely on Gym. IANAL, but it seems to me that the project even violates GitHub terms of service in its current state.

    Please clarify the license for all of the code, or perhaps expunge the parts that are unclearly licensed.

    Checklist

    • [✓] I have checked that there is no similar issue in the repo (required): There are in fact similar issues, but they have all been closed with no comments. I'm trying again, as this is a serious bug.
    PR Needed 
    opened by gspr 3
  • Add deprecation notices for v0/v4 versions of the Atari environments and eventually remove them

    Add deprecation notices for v0/v4 versions of the Atari environments and eventually remove them

    In the next release we need to add deprecation notices when importing all the v0/v4 versions of the Atari environments, in the near future (say 3 months) we should remove the v0 environments outright, and in about a year we should remove the v4 environments outright.

    @JesseFarebro

    PR Needed 
    opened by jkterry1 16
  • How to reset a mujoco environment to a random start state

    How to reset a mujoco environment to a random start state

    How can I reset the Mujoco environment to a random start state such that the random states cover the whole state space uniformly or near uniformly?

    mujoco 
    opened by nikhilrayaprolu 0
  • [Proposal] Official Conda Support

    [Proposal] Official Conda Support

    Eventually, I would like to have native Conda support for Gym. This will have to come after the merging of ALE-Py and the planned replacement of the box2d and MuJoCo physics engines discussed in other issues (because they're unmaintained and getting them into Conda would be challenging). This will also require very active development from a single maintainer to sort out due to package dependencies. I'm mostly putting this on GitHub so I don't forget in 9 months.

    PR Needed 
    opened by jkterry1 0
  • [Question] What should be done with the GoalEnv and DiscreteEnv classes?

    [Question] What should be done with the GoalEnv and DiscreteEnv classes?

    Right now, Gym has a GoalEnv class and Env class as base classes in core.py. The GoalEnv class was added as part of the robotics environments, and imposes special requirements on the observation space. From what I can tell, this class has not been used outside of Gym's robotics environments and is largely unnecessary. Unless I'm missing something here, removing this class sounds like the way to go.

    PR Needed 
    opened by jkterry1 7
Releases(v0.21.0)
  • v0.21.0(Oct 2, 2021)

    • The old Atari entry point that was broken with the last release and the upgrade to ALE-Py is fixed (@JesseFarebro)
    • Atari environments now give much clearer error messages and warnings (@JesseFarebro)
    • A new plugin system to enable an easier inclusion of third party environments has been added (@JesseFarebro)
    • Atari environments now use the new plugin system to prevent clobbered names and other issues (@JesseFarebro)
    • pip install gym[atari] no longer distributes Atari ROMs that the ALE (the Atari emulator used) needs to run the various games. The easiest way to install ROMs into the ALE has been to use AutoROM. Gym now has a hook to AutoROM for easier CI automation so that using pip install gym[accept-rom-license] calls AutoROM to add ROMs to the ALE. You can install the entire suite with the shorthand gym[atari, accept-rom-license]. Note that, as described in the name, by installing gym[accept-rom-license] you are confirming that you have the relevant license to install the ROMs. (@JesseFarebro)
    • An accidental breaking change when loading saved policies trained on old versions of Gym with environments using the Box2d action space has been fixed. (@RedTachyon)
    • Pendulum has had a minor fix to its physics logic and the version has been bumped to v1 (@RedTachyon)
    • Tests have been refactored into an orderly manner (@RedTachyon)
    • Dict spaces now have standard dict helper methods (@Rohan138)
    • Environment properties are now forwarded to the wrapper (@tristandeleu)
    • Gym now properly enforces calling reset before stepping for the first time (@ahmedo42)
    • Proper piping of error messages to stderr (@XuehaiPan)
    • Fix video saving issues (@zlig)

    Also, Gym is compiling a list of third party environments for the new documentation website we're working on. Please submit PRs for ones that are missing: https://github.com/openai/gym/blob/master/docs/third_party_environments.md

  • v0.20.0(Sep 14, 2021)

    Major Change:

    • Replaced Atari-Py dependency with ALE-Py and bumped all versions. This is a massive upgrade with many changes; please see the full explainer (@JesseFarebro)
    • Note that ALE-Py does not include ROMs. You can install ROMs in two lines of bash with AutoROM though (pip3 install autorom and then autorom), see https://github.com/PettingZoo-Team/AutoROM. This is the recommended approach for CI, etc.

    Breaking changes and new features:

    • Add RecordVideo wrapper, deprecate monitor wrapper in favor of it and RecordEpisodeStatistics wrapper (@vwxyzjn)
    • Dependencies used outside of environments (e.g. for wrappers) are now in the 'other' extra (@jkterry1)
    • Moved algorithmic and unused toytext envs (guessing game, hotter colder, nchain, roulette, kellycoinflip) to third party repos (@jkterry1, @Rohan138)
    • Fixed flatten utility and flatdim in MultiDiscrete space (@tristandeleu)
    • Add __setitem__ to dict space (@jfpettit)
    • Large fixes to .contains method for box space (@FirefoxMetzger)
    • Made blackjack environment properly comply with Barto and Sutton book standard, bumped to v1 (@RedTachyon)
    • Added NormalizeObservation and NormalizeReward wrappers (@vwxyzjn)
    • Add __getitem__ and __len__ to MultiDiscrete space (@XuehaiPan)
    • Changed .shape to be a property of box space to prevent unexpected behaviors (@RedTachyon)

    Bug fixes and upgrades:

    • Video recorder gracefully handles closing (@XuehaiPan)
    • Remaining unnecessary dependencies in setup.py are resolved (@jkterry1)
    • Minor acrobot performance improvements (@TuckerBMorgan)
    • Pendulum properly renders when 0 force is sent (@Olimoyo)
    • Make observations dtypes be consistent with observation space dtypes for all classic control envs and bipedalwalker (@RedTachyon)
    • Removed unused and long deprecated features in registration (@Rohan138)
    • Framestack wrapper now inherits from obswrapper (@jfpettit)
    • Seed methods for spaces.Tuple and spaces.Dict now function properly, are fully stochastic, are fully featured and behave in the expected manner (@XuehaiPan, @RaghuSpaceRajan)
    • Replace time() with perf_counter() for better measurements of short duration (@zuoxingdong)
  • 0.19.0(Aug 13, 2021)

    Gym 0.19.0 is a large maintenance release, and the first since @jkterry1 became the maintainer. There should be no breaking changes in this release.

    New features:

    • Added custom datatype argument to multidiscrete space (@m-orsini)
    • API compliance test added based on SB3 and PettingZoo tests (@amtamasi)
    • RecordEpisodeStatics works with VectorEnv (@vwxyzjn)

    Bug fixes:

    • Removed unused dependencies, removed unnecessary dependency version requirements that caused installation issues on newer machines, added full requirements.txt and moved general dependencies to extras. Notably, "toy_text" is not a used extra. atari-py is now pegged to a precise working version pending the switch to ale-py (@jkterry1)
    • Bug fixes to rewards in FrozenLake and FrozenLake8x8; versions bumped to v1 (@ZhiqingXiao)
    • Removed remaining numpy deprecation warnings (@super-pirata)
    • Fixes to video recording (@mahiuchun, @zlig)
    • EZ pickle argument fixes (@zzyunzhi, @Indoril007)
    • Other very minor (nonbreaking) fixes

    Other:

    • Removed small bits of dead code (@jkterry1)
    • Numerous typo, CI and documentation fixes (mostly @cclauss)
    • New readme and updated third party env list (@jkterry1)
    • Code is now all flake8 compliant through black (@cclauss)
  • v0.9.6(Feb 1, 2018)

    • Now your Env and Wrapper subclasses should define step, reset, render, close, seed rather than underscored method names.
    • Removed the board_game, debugging, safety, parameter_tuning environments since they're not being maintained by us at OpenAI. We encourage authors and users to create new repositories for these environments.
    • Changed MultiDiscrete action space to range from [0, ..., n-1] rather than [a, ..., b-1].
    • No more render(close=True), use env-specific methods to close the rendering.
    • Removed scoreboard directory, since site doesn't exist anymore.
    • Moved gym/monitoring to gym/wrappers/monitoring
    • Add dtype to Space.
    • Not using python's built-in logging module anymore, using gym.logger