A modular, primitive-first, python-first PyTorch library for Reinforcement Learning.

facebookresearch

TorchRL

Disclaimer

This library is not officially released yet and is subject to change.

The features are available before an official release so that users and collaborators can get early access and provide feedback. No guarantee of stability, robustness or backward compatibility is provided.


TorchRL is an open-source Reinforcement Learning (RL) library for PyTorch.

It provides pytorch and python-first, low and high level abstractions for RL that are intended to be efficient, documented and properly tested. The code is aimed at supporting research in RL. Most of it is written in python in a highly modular way, such that researchers can easily swap components, transform them or write new ones with little effort.

This repo attempts to align with the existing pytorch ecosystem libraries in that it has a dataset pillar (torchrl/envs), transforms, models, data utilities (e.g. collectors and containers), etc. TorchRL aims at having as few dependencies as possible (python standard library, numpy and pytorch). Common environment libraries (e.g. OpenAI gym) are only optional.

On the low-level end, torchrl comes with a set of highly re-usable functionals for cost functions, returns and data processing.

On the high-level end, torchrl provides:

A series of examples are provided with an illustrative purpose:

and many more to come!

Installation

Create a conda environment where the packages will be installed. Before installing anything, make sure you have the latest version of cmake and ninja libraries:

conda create --name torch_rl python=3.9
conda activate torch_rl
conda install cmake -c conda-forge
pip install ninja

Depending on how you intend to use functorch, you may want to install either the latest (nightly) pytorch release or the latest stable version of pytorch:

Stable

conda install pytorch torchvision cudatoolkit=10.2 -c pytorch  # refer to pytorch official website for cudatoolkit installation
pip install functorch

Nightly

# For CUDA 10.2
pip install --pre torch torchvision -f https://download.pytorch.org/whl/nightly/cu102/torch_nightly.html --upgrade
# For CUDA 11.1
pip install --pre torch torchvision -f https://download.pytorch.org/whl/nightly/cu111/torch_nightly.html --upgrade
# For CPU-only build
pip install --pre torch torchvision -f https://download.pytorch.org/whl/nightly/cpu/torch_nightly.html --upgrade

and functorch

pip install "git+https://github.com/pytorch/functorch.git"

Torchrl

Go to the directory where you have cloned the torchrl repo and install it

cd /path/to/torchrl/
python setup.py install

To run a quick sanity check, leave that directory and try to import the library.

python -c "import torchrl"

Optional dependencies

The following libraries can be installed depending on the usage one wants to make of torchrl:

# diverse
pip install tqdm pyyaml configargparse

# rendering
pip install moviepy

# deepmind control suite
pip install dm_control 

# gym, atari games
pip install gym gym[accept-rom-license] pygame gym_retro

# tests
pip install pytest

Alternatively, extra dependencies can be installed using

pip install ".[atari,dm_control,gym_continuous,rendering,tests,utils]"

or a selection of these.

Troubleshooting

If a ModuleNotFoundError: No module named 'torchrl._torchrl' error occurs, it means that the C++ extensions were not installed or not found. One common reason is that you are trying to import torchrl from within the git repo location. Indeed, the following code snippet should return an error if torchrl has not been installed in develop mode:

cd ~/path/to/rl/repo
python -c 'from torchrl.envs import GymEnv'

If this is the case, consider executing torchrl from another location.
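
One way to confirm that Python is picking up a package from the current source checkout rather than from the installed build is to ask the import system where it would resolve the name. The following is a minimal sketch using only the standard library; the helper name resolves_inside is hypothetical and is not part of torchrl.

```python
import importlib.util
import os


def resolves_inside(module_name, directory):
    """Return True if `module_name` would be imported from within `directory`.

    Returns None when the module cannot be found at all.
    """
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        return None
    # Packages list their directories in submodule_search_locations;
    # plain file-based modules expose a single path through origin.
    locations = list(spec.submodule_search_locations or [])
    if not locations and spec.has_location and spec.origin:
        locations = [spec.origin]
    directory = os.path.realpath(directory)
    for location in locations:
        location = os.path.realpath(location)
        try:
            if os.path.commonpath([directory, location]) == directory:
                return True
        except ValueError:
            # Paths on different drives (Windows) share no common path.
            continue
    return False


# True here would suggest that the source tree shadows the installed package.
print(resolves_inside("torchrl", os.getcwd()))
```

If this prints True when run from the repo root, moving to another directory before importing is the simplest thing to try.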

This may also be caused by several dependency issues: cmake, gcc or ninja versioning, or absence of the CuDNN library when working in a CUDA environment.

On macOS, we recommend installing Xcode first. With Apple Silicon M1 chips, make sure you are using the arm64-built python. Running the following lines of code

wget https://raw.githubusercontent.com/pytorch/pytorch/master/torch/utils/collect_env.py
python collect_env.py

should display

OS: macOS *** (arm64)

and not

OS: macOS **** (x86_64)
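
The same architecture check can be done without downloading collect_env.py. This is a small sketch using only the standard library; the function name is_arm64_python is an illustrative choice, not something provided by torchrl or pytorch.

```python
import platform


def is_arm64_python():
    """Return True when the running interpreter reports an arm64 machine type."""
    return platform.machine().lower() in ("arm64", "aarch64")


# An x86_64 interpreter running under emulation on an M1 machine
# reports "x86_64" here, which is the case to avoid.
print(f"machine: {platform.machine()}, arm64 build: {is_arm64_python()}")
```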

Running examples

Examples are coded in a very similar way but the configuration may change from one algorithm to another (e.g. async/sync data collection, hyperparameters, ratio of model updates / frame etc.). To train an algorithm, it is therefore advised to use the predefined configurations that are found in the configs sub-folder in each algorithm directory:

python examples/ppo/ppo.py --config=examples/ppo/configs/humanoid.txt

Note that using the config files requires the configargparse library.

One can also overwrite the config parameters using flags, e.g.

python examples/ppo/ppo.py --config=examples/ppo/configs/humanoid.txt --frame_skip=2 --collection_devices=cuda:1

Each example will write a tensorboard log in a dedicated folder, e.g. ppo_logging/....

Contributing

Internal collaborations to torchrl are welcome! Feel free to fork, submit issues and PRs.

Upcoming features

In the near future, we plan to:

  • provide tutorials on how to design new actors or environment wrappers;
  • implement IMPALA (as a distributed RL example) and Meta-RL algorithms;
  • improve the tests, documentation and nomenclature.

License

TorchRL is licensed under the MIT License. See LICENSE for details.

Comments
  • [BUG] Stacked tensordicts with nested keys crash `env.step()`

    Describe the bug

    Due to multiagent applications my env needs to return tensordict_out = torch.stack(agent_tds,dim=0) from env._step() with nested keys. This creates a LazyStackedTensorDict.

    The subsequent logic of env.step() performs certain operations on tensordict_out which crash if the latter has nested keys.

    This can be solved by calling to_tensordict() before the end of _step(), but this is possible only when the stack is homogeneous and not when it is heterogeneous, as in #766.

    To Reproduce

    Create LazyStackedTensorDict and return it in the _step implementation

    test/test_libs.py:525 (TestVmas.test_vmas_seeding[flocking])
    Traceback (most recent call last):
      File "/Users/Matteo/PycharmProjects/torchrl/test/test_libs.py", line 537, in test_vmas_seeding
        tdrollout.append(env.rollout(max_steps=10))
      File "/Users/Matteo/PycharmProjects/torchrl/torchrl/envs/common.py", line 605, in rollout
        tensordict = self.step(tensordict)
      File "/Users/Matteo/PycharmProjects/torchrl/torchrl/envs/common.py", line 343, in step
        tensordict_out_select = tensordict_out.select(*obs_keys)
      File "/Users/Matteo/PycharmProjects/tensordict/tensordict/tensordict.py", line 4325, in select
        raise TypeError(
    TypeError: All keys passed to LazyStackedTensorDict.select must be strings. Found ('info', 'velocity_rew') of type <class 'tuple'>. Note that LazyStackedTensorDict does not yet support nested keys.
    
    bug 
    opened by matteobettini 22
  • error: can't copy 'build/lib.linux-x86_64-3.9/torchrl/_torchrl.so': doesn't exist or not a regular file

    When trying to install the package using "pip install -e ." or "python setup.py develop", I encounter this error.

    OS: Ubuntu 22.04 LTS (x86_64)
    Python version: 3.9.12

    help wanted 
    opened by bkpcoding 13
  • [BUG] GymWrapper does not work with nested observation gym.spaces.Dict

    Describe the bug

    Hi All,

    First of all: thanks for the great work here!

    I think I have encountered a bug in the GymWrapper in torchrl.envs.libs.gym.GymWrapper. When I use a gym.Env with an observation space with nested gym.spaces.Dict, a KeyError is thrown because the GymLikeEnv.read_obs() function only adds "next_" to the first level of the Dict but not to nested sub Dicts:

    observations = {"next_" + key: value for key, value in observations.items()}
    

    Since _gym_to_torchrl_spec_transform() in torchrl.envs.libs.gym adds "next_" in a recursive call to all sub Dicts, the key is missing the necessary "next_". Nested Dict observation spaces are often used (https://www.gymlibrary.dev/api/spaces/#dict), so I guess this is required to work properly.

    To Reproduce

    #!/usr/bin/env python
    from torchrl.envs.libs.gym import GymWrapper
    from gym import spaces, Env
    import numpy as np
    
    
    class CustomGym(Env):
        def __init__(self):
            self.action_space = spaces.Discrete(5)
            self.observation_space = spaces.Dict(
                {
                    'sensor_1': spaces.Box(low=0, high=255, shape=(5, 5, 3), dtype=np.uint8),
                    'sensor_2': spaces.Box(low=0, high=255, shape=(5, 5, 3), dtype=np.uint8),
                    'sensor_3': spaces.Box(np.array([-2, -1, -5, 0]), np.array([2, 1, 30, 1]), dtype=np.float32),
                    'sensor_4': spaces.Dict({'sensor_41': spaces.Box(low=0, high=100, shape=(1,), dtype=np.float32),
                                             'sensor_42': spaces.Box(low=0, high=100, shape=(1,), dtype=np.float32),
                                             'sensor_43': spaces.Box(low=0, high=100, shape=(1,), dtype=np.float32)})
                }
            )
    
        def reset(self):
            return self.observation_space.sample()
    
    
    if __name__ == '__main__':
        env = CustomGym()
        env = GymWrapper(env)
    
    

    Reason and Possible fixes

    The issue can be fixed by adding a recursive function call to rename also nested observation space Dicts in GymLikeEnv.read_obs() correctly by adding "next_":

    
        def read_obs(
            self, observations: Union[Dict[str, Any], torch.Tensor, np.ndarray]
        ) -> Dict[str, Any]:
            """Reads an observation from the environment and returns an observation compatible with the output TensorDict.
    
            Args:
                observations (observation under a format dictated by the inner env): observation to be read.
    
            """
            if isinstance(observations, dict):
    
                def rename(obs):
                    return {
                        "next_" + key: rename(value) if isinstance(value, dict) else value
                        for key, value in obs.items()
                    }
    
                observations = rename(observations)
            if not isinstance(observations, (TensorDict, dict)):
                key = list(self.observation_spec.keys())[0]
                observations = {key: observations}
            observations = self.observation_spec.encode(observations)
            return observations
    

    The style checker disallows lambda functions; otherwise the fix could be as simple as

                 rename = lambda obs: {
                    "next_" + key: rename(value) if isinstance(value, dict) else value
                    for key, value in obs.items()
                 }
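
    To make the difference concrete, the two renaming strategies can be compared on a small nested dictionary. This is a standalone sketch in plain Python: it does not call GymWrapper or read_obs, and the function names and sample values are illustrative only.

```python
def rename_first_level(obs):
    # Same shape as the one-line comprehension quoted in the report:
    # only the top-level keys receive the "next_" prefix.
    return {"next_" + key: value for key, value in obs.items()}


def rename_recursive(obs):
    # Same shape as the proposed fix: nested dicts are renamed as well.
    return {
        "next_" + key: rename_recursive(value) if isinstance(value, dict) else value
        for key, value in obs.items()
    }


sample = {"sensor_1": 0, "sensor_4": {"sensor_41": 1, "sensor_42": 2}}
flat = rename_first_level(sample)
nested = rename_recursive(sample)

print(flat)    # inner keys keep their original names
print(nested)  # every level carries the "next_" prefix
```

    With the first-level version, the inner keys stay as sensor_41 and sensor_42, which is the mismatch described above.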
    

    Checklist

    • [x] I have checked that there is no similar issue in the repo (required)
    • [x] I have read the documentation (required)
    • [x] I have provided a minimal working example to reproduce the bug (required)
    bug 
    opened by raphajaner 12
  • [BugFix] SyncDataCollector init when device and env_device are different

    Description

    In the init method of the SyncDataCollector class, a small number of steps is taken with the policy to determine the relevant keys of the output TensorDict. When the policy device and the environment device are different, that can raise a RuntimeError since the input provided to the policy is located in the environment device.

    This PR only makes sure that the TensorDict provided to the policy is in the policy device, and then moves the output TensorDict to the environment device again.

    Types of changes

    What types of changes does your code introduce? Remove all that do not apply:

    • [x] Bug fix (non-breaking change which fixes an issue)
    • [ ] New feature (non-breaking change which adds core functionality)
    • [ ] Breaking change (fix or feature that would cause existing functionality to change)
    • [ ] Documentation (update in the documentation)
    • [ ] Example (update in the folder of examples)

    Checklist

    Go over all the following points, and put an x in all the boxes that apply. If you are unsure about any of these, don't hesitate to ask. We are here to help!

    • [x] I have read the CONTRIBUTION guide (required)
    • [ ] My change requires a change to the documentation.
    • [ ] I have updated the tests accordingly (required for a bug fix or a new feature).
    • [ ] I have updated the documentation accordingly.
    CLA Signed 
    opened by albertbou92 11
  • [BUG]TypeError: __init__() got an unexpected keyword argument 'disable_env_checker'

    I used your library on Google Colab and it works without a problem. However, after spending a day installing gym on Ubuntu using conda and running torchrl code, I am getting this error:

    >>> from torchrl.envs.libs.gym import _has_gym, GymEnv, GymWrapper
    >>> env_torchrl = GymEnv("InvertedPendulum-v2")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/home/anaconda3/lib/python3.9/site-packages/torchrl/envs/libs/gym.py", line 282, in __init__
        super().__init__(**kwargs)
      File "/home/anaconda3/lib/python3.9/site-packages/torchrl/envs/libs/gym.py", line 149, in __init__
        super().__init__(**kwargs)
      File "/home/anaconda3/lib/python3.9/site-packages/torchrl/envs/common.py", line 717, in __init__
        self._env = self._build_env(**kwargs)  # writes the self._env attribute
      File "/home/anaconda3/lib/python3.9/site-packages/torchrl/envs/libs/gym.py", line 327, in _build_env
        raise err
      File "/home/anaconda3/lib/python3.9/site-packages/torchrl/envs/libs/gym.py", line 313, in _build_env
        env = self.lib.make(env_name, **kwargs)
      File "/home/anaconda3/lib/python3.9/site-packages/gym/envs/registration.py", line 235, in make
        return registry.make(id, **kwargs)
      File "/home/anaconda3/lib/python3.9/site-packages/gym/envs/registration.py", line 129, in make
        env = spec.make(**kwargs)
      File "/home/anaconda3/lib/python3.9/site-packages/gym/envs/registration.py", line 90, in make
        env = cls(**_kwargs)
    TypeError: __init__() got an unexpected keyword argument 'disable_env_checker'
    
    

    Why did this error occur?

    bug 
    opened by neuronphysics 11
  • [Feature] RewardSum transform

    Description

    Adds a new Transform class, called RewardSum, which tracks the cumulative reward of all episodes in progress and adds it to the tensordict under a new key.

    Motivation and Context

    It can be informative to be able to access the training episode rewards.

    For example, it can be used as follows to track performance during training:

    for batch in collector:
        # Cumulative rewards of the episodes that ended in this batch
        train_episode_reward = batch["episode_reward"][batch["done"]]
        if train_episode_reward.numel() > 0:
            print(f"train_episode_rewards {train_episode_reward.mean()}")
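
    The bookkeeping behind such a cumulative-reward key can be sketched in plain Python. This is only an illustration of the idea under simple assumptions (one scalar reward and one done flag per environment and step); the class EpisodeRewardTracker is hypothetical and is not the RewardSum implementation.

```python
class EpisodeRewardTracker:
    """Keeps one running reward sum per environment."""

    def __init__(self, num_envs):
        self.running = [0.0] * num_envs

    def step(self, rewards, dones):
        """Add this step's rewards and return the per-env cumulative reward.

        The sum of an env whose episode just ended is reported first,
        then reset so that the next episode starts from zero.
        """
        out = []
        for i, (reward, done) in enumerate(zip(rewards, dones)):
            self.running[i] += reward
            out.append(self.running[i])
            if done:
                self.running[i] = 0.0
        return out


tracker = EpisodeRewardTracker(num_envs=2)
tracker.step([1.0, 0.5], [False, False])
totals = tracker.step([2.0, 0.5], [True, False])   # env 0 ends its episode here
after = tracker.step([4.0, 0.5], [False, False])   # env 0 starts again from zero
print(totals, after)
```

    Reading the values at the steps where done is true, as in the loop above, then gives the total reward of each finished episode.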
    

    Types of changes

    What types of changes does your code introduce? Remove all that do not apply:

    • [ ] Bug fix (non-breaking change which fixes an issue)
    • [x] New feature (non-breaking change which adds core functionality)
    • [ ] Breaking change (fix or feature that would cause existing functionality to change)
    • [ ] Documentation (update in the documentation)
    • [ ] Example (update in the folder of examples)

    Checklist

    Go over all the following points, and put an x in all the boxes that apply. If you are unsure about any of these, don't hesitate to ask. We are here to help!

    • [ ] I have read the CONTRIBUTION guide (required)
    • [ ] My change requires a change to the documentation.
    • [ ] I have updated the tests accordingly (required for a bug fix or a new feature).
    • [ ] I have updated the documentation accordingly.
    CLA Signed 
    opened by albertbou92 8
  • [Feature]: `TensorDictPrimer` transform

    Description

    This PR adds a ForceTensorReset transform.

    Motivation and Context

    ForceTensorReset makes it possible to set or reset given vectors to their default values.

    Types of changes

    What types of changes does your code introduce? Remove all that do not apply:

    • [x] New feature (non-breaking change which adds core functionality)

    Checklist

    Go over all the following points, and put an x in all the boxes that apply. If you are unsure about any of these, don't hesitate to ask. We are here to help!

    • [x] I have read the CONTRIBUTION guide (required)
    • [x] My change requires a change to the documentation.
    • [x] I have updated the tests accordingly (required for a bug fix or a new feature).
    • [x] I have updated the documentation accordingly.
    enhancement CLA Signed 
    opened by nicolas-dufour 8
  • [Logging]: implement MLFlow logging integration

    Description

    This diff enables torchrl to seamlessly use the MLFlow Tracking API through its internal Logger API. These changes are consistent with previous integrations such as the one with W&B.

    A test suite for the new integration has been included in the diff.

    Motivation and Context

    close #395

    Types of changes

    What types of changes does your code introduce? Remove all that do not apply:

    • [ ] Bug fix (non-breaking change which fixes an issue)
    • [x] New feature (non-breaking change which adds core functionality)
    • [ ] Breaking change (fix or feature that would cause existing functionality to change)
    • [ ] Documentation (update in the documentation)
    • [ ] Example (update in the folder of examples)

    Checklist

    Go over all the following points, and put an x in all the boxes that apply. If you are unsure about any of these, don't hesitate to ask. We are here to help!

    • [x] I have read the CONTRIBUTION guide (required)
    • [ ] My change requires a change to the documentation.
    • [x] I have updated the tests accordingly (required for a bug fix or a new feature).
    • [ ] I have updated the documentation accordingly.
    enhancement CLA Signed 
    opened by rayanht 8
  • [Doc] Added TensorDict tutorial

    Tutorial on TensorDict and TensorDictModule. Contains simple examples of TensorDict and TensorDictModule operations. Also contains an implementation of a Transformer model using TensorDict and TensorDictModule to showcase how these modules work.

    documentation CLA Signed 
    opened by nicolas-dufour 8
  • Add support for null `dim` argument in `TensorDict.squeeze`

    Description

    I began adding support for a null dim argument in TensorDict.squeeze. However, this code is still buggy as TensorDict.unsqueeze is called during the process of creating a SqueezedTensorDict, and unsqueeze should not support a null dim argument.

    Motivation and Context

    This resolves #592.

    • [x] I have raised an issue to propose this change (required for new features and bug fixes)

    Types of changes

    What types of changes does your code introduce? Remove all that do not apply:

    • [x] Bug fix (non-breaking change which fixes an issue)
    • [ ] New feature (non-breaking change which adds core functionality)
    • [ ] Breaking change (fix or feature that would cause existing functionality to change)
    • [ ] Documentation (update in the documentation)
    • [ ] Example (update in the folder of examples)

    Checklist

    Go over all the following points, and put an x in all the boxes that apply. If you are unsure about any of these, don't hesitate to ask. We are here to help!

    • [x] I have read the CONTRIBUTION guide (required)
    • [x] My change requires a change to the documentation.
    • [x] I have updated the tests accordingly (required for a bug fix or a new feature).
    • [x] I have updated the documentation accordingly.
    enhancement CLA Signed 
    opened by jgonik 7
  • [Feature] lock tensordict when calling `share_memory_()`

    Description

    Added a lock=True flag to both the share_memory_ and memmap_ methods of the TensorDict classes. When a tensordict is locked, calling the set method only updates a key if the key already exists and inplace is true.

    This required a couple of changes to existing code/tests:

    • Added lock=False to existing tests to let them continue to pass.
    • Removed is_locked checks from set_, set_at_, and _stack_onto_ methods due to the issue #125 stating that these methods should ignore the lock.
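
    The locking rule described above can be illustrated with a toy container. This is a sketch in plain Python under the stated rule only; LockableDict is a hypothetical class and does not reproduce the TensorDict API or its error types.

```python
class LockableDict:
    """Toy key-value container with a lock flag (not the TensorDict API)."""

    def __init__(self):
        self._data = {}
        self.is_locked = False

    def lock(self):
        self.is_locked = True
        return self

    def set(self, key, value, inplace=False):
        # Once locked, only in-place updates of existing keys are accepted.
        if self.is_locked and not (inplace and key in self._data):
            raise RuntimeError(f"cannot set {key!r}: container is locked")
        self._data[key] = value
        return self

    def set_(self, key, value):
        # In-place variant: ignores the lock but requires the key to exist.
        if key not in self._data:
            raise KeyError(key)
        self._data[key] = value
        return self

    def get(self, key):
        return self._data[key]


d = LockableDict()
d.set("obs", [0.0, 0.0])
d.lock()
d.set("obs", [1.0, 1.0], inplace=True)  # allowed: key exists and inplace=True
d.set_("obs", [2.0, 2.0])               # allowed: set_ ignores the lock
try:
    d.set("reward", 1.0)                # rejected: new key on a locked container
except RuntimeError as err:
    print(err)
```

    The point of the rule is that the set of keys cannot change once the data has been placed in shared memory, while existing entries can still be overwritten in place.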

    Motivation and Context

    Close #125

    Types of changes

    What types of changes does your code introduce? Remove all that do not apply:

    • [X] New feature (non-breaking change which adds core functionality)

    Checklist

    Go over all the following points, and put an x in all the boxes that apply. If you are unsure about any of these, don't hesitate to ask. We are here to help!

    • [X] I have read the CONTRIBUTION guide (required)
    • [ ] My change requires a change to the documentation.
    • [X] I have updated the tests accordingly (required for a bug fix or a new feature).
    • [ ] I have updated the documentation accordingly.
    enhancement CLA Signed 
    opened by fdabek1 7
  • [BUG] flaky tests

    A list of flaky tests to fix

    Describe the bug

    • FAILED test/test_env.py::TestParallel::test_parallel_env_transform_consistency[device1-1-Pendulum-v1] - AssertionError: key observation does not match, got mse = 0.0618
    bug Good first issue 
    opened by vmoens 0
  • [BugFix] [Feature] "_reset" flag for env reset

    Description

    This PR addresses issue #790.

    The changes replace the "reset_workers" flag (only designed for ParallelEnvs wrapping environments with an empty batch_size) with the "_reset" flag, which spans all batch_size dimensions.

    This makes it possible to tell the wrapped environments more precisely which dimensions need to be reset.

    Accordingly, the reset() methods of EnvBase and ParallelEnv now only check that the indices flagged for reset are not done, instead of asserting not done.any().

    CLA Signed 
    opened by matteobettini 1
  • [Feature Request] TensorSpec is_in methods should check the dtype of val

    Motivation

    As mentioned in #783, we should check that is_in always checks the dtype.

    Solution

    Add self.dtype == val.dtype condition in the is_in methods. Should be very simple to fix, but we need to make sure that all tests pass.

    Checklist

    • [X] I have checked that there is no similar issue in the repo (required)
    enhancement Good first issue 
    opened by riiswa 2
  • [Feature Request] `in_keys` and `out_keys` for loss modules

    Motivation

    Loss modules should have in_keys and out_keys attributes. To get this, we could simply use convert_to_functional: this method could stack together the in_keys and out_keys of all the input modules.

    Another option is to make this a property that looks for all the children of the loss module, and computes the in_keys and out_keys on the fly.

    The first solution will miss all the cases where the users don't use convert_to_functional, which is not mandatory. The second is slightly more expensive.
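
    The second option (a property computed on the fly from the children) could look roughly like the sketch below. It is written in plain Python with hypothetical Module and LossModule stand-ins rather than the real torchrl classes, and it assumes the aggregated keys are simply the order-preserving union of the children's keys.

```python
class Module:
    """Hypothetical stand-in for a module declaring in_keys and out_keys."""

    def __init__(self, in_keys=(), out_keys=(), children=()):
        self.in_keys = list(in_keys)
        self.out_keys = list(out_keys)
        self.children = list(children)


class LossModule:
    """Hypothetical loss that derives its keys from the modules it holds."""

    def __init__(self, *modules):
        self.modules = list(modules)

    def _walk(self):
        # Depth-first traversal over every registered module and its children.
        stack = list(reversed(self.modules))
        while stack:
            module = stack.pop()
            yield module
            stack.extend(reversed(module.children))

    def _collect(self, attr):
        # Order-preserving union of the requested attribute over all modules.
        keys = []
        for module in self._walk():
            for key in getattr(module, attr):
                if key not in keys:
                    keys.append(key)
        return keys

    @property
    def in_keys(self):
        return self._collect("in_keys")

    @property
    def out_keys(self):
        return self._collect("out_keys")


actor = Module(in_keys=["observation"], out_keys=["action"])
critic = Module(in_keys=["observation", "action"], out_keys=["state_action_value"])
loss = LossModule(actor, critic)
print(loss.in_keys)
print(loss.out_keys)
```

    The traversal runs on every access, which is the extra cost mentioned above; caching the result would be one way to reduce it, at the price of invalidating the cache whenever a child changes.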

    Going the extra mile

    We could also decorate the forward of the losses with tensordict.nn.dispatch_kwargs, allowing the users to use the losses without using tensordict.

    enhancement Good first issue 
    opened by vmoens 0
  • [BUG] Resetting environments with non-empty batch_size

    Describe the bug

    When resetting an environment with non-empty batch size, there is currently no unified way to specify which dimensions to reset.

    ParallelEnv has a reset key called "reset_workers" which is used to choose which workers to reset. The use of this unidimensional key makes ParallelEnv crash when env.batch_size is not empty.

    This is what happens:

    def _reset(self, tensordict: TensorDictBase, **kwargs) -> TensorDictBase:
        cmd_out = "reset"
        if tensordict is not None and "reset_workers" in tensordict.keys():
            self._assert_tensordict_shape(tensordict) # First assert that the key has the same batch size as the env, let's say [10,4,5,2]
            reset_workers = tensordict.get("reset_workers")
        else:
            reset_workers = torch.ones(self.num_workers, dtype=torch.bool) # If not create one (without respecting the batch size)
    
        for i, channel in enumerate(self.parent_channels):
            if not reset_workers[i]: # If run on dimension 0 !! This only works if the else branch is taken or env.batch_size is empty
                continue
            channel.send((cmd_out, kwargs)) # Do not even pass the tensordict to the env
    

    The problems just in this snippet are:

    1. This only works if the else branch is taken or env.batch_size is empty
    2. it does not pass the reset tensordict to the env

    This is paired with a series of problems in the various reset functions (ParallelEnv and EnvBase) where after reset done.any() is called. This is highly problematic as any spans over all dimensions.

    Proposed changes to the API

    1. Remove "reset_workers" (which already doesn't work)
    2. Introduce the possibility of having a "reset" key in the tensordict given as a parameter to the reset functions. This reset key has shape (in the more general case of ParallelEnv) (n_parallel_envs, *env.batch_size) and is a boolean mask telling precisely which dimensions to reset. ParallelEnv can then call reset[worker_id].any() to decide whether to pass the reset command and key to the worker
    3. In the reset function, do not check done.any() but check that at least the dimensions requested for reset have done=False. The absence of the "reset" key means resetting all dimensions
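
    The logic of points 2 and 3 can be sketched with nested Python lists standing in for boolean tensors. The helper names below are hypothetical and the sketch makes no claim about how ParallelEnv or EnvBase implement this.

```python
def any_nested(mask):
    """Equivalent of tensor.any() for arbitrarily nested lists of booleans."""
    if isinstance(mask, (list, tuple)):
        return any(any_nested(item) for item in mask)
    return bool(mask)


def workers_to_reset(reset):
    """Indices of workers that have at least one entry flagged for reset."""
    return [i for i, worker_mask in enumerate(reset) if any_nested(worker_mask)]


def reset_succeeded(reset, done):
    """True if no entry that was flagged for reset is still done."""
    if isinstance(reset, (list, tuple)):
        return all(reset_succeeded(r, d) for r, d in zip(reset, done))
    return not (reset and done)


# Three workers, each wrapping an env with batch_size (2,).
reset = [[True, False], [False, False], [False, True]]
print(workers_to_reset(reset))

# State after the partial reset: entries that were not reset may still be done.
done = [[False, True], [True, True], [False, False]]
print(reset_succeeded(reset, done))
print(any_nested(done))
```

    Here a global done.any() check is still true after a correct partial reset, while the masked check passes, which is the distinction point 3 draws.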

    To Reproduce

    env = MockBatchedLockedEnv(device="cpu", batch_size=torch.Size(env_batch_size))
    env.set_seed(1)
    parallel_env = ParallelEnv(num_parallel_env, lambda: env)
    parallel_env.start()
    reset_td = TensorDict(
        {"reset_workers": torch.full(parallel_env.batch_size, True, device=parallel_env.device)},
        batch_size=parallel_env.batch_size,
        device=parallel_env.device,
    )
    parallel_env.reset(reset_td)
    
    bug 
    opened by matteobettini 3
Releases (0.0.3)
  • 0.0.3 (Nov 21, 2022)

    The main changes introduced by this release are:

    • dependency on the standalone tensordict repo;
    • refactoring of the "next" API

    What's Changed

    • [Versioning] MacOs versioning and release bugfix by @vmoens in https://github.com/pytorch/rl/pull/247
    • [Versioning] Setup metadata by @vmoens in https://github.com/pytorch/rl/pull/248
    • [BugFix] Fix setup instructions by @vmoens in https://github.com/pytorch/rl/pull/250
    • [BugFix] Fix a bug when segment_tree size is exactly 2^N by @xiaomengy in https://github.com/pytorch/rl/pull/251
    • [Feature] Added test for RewardRescale transform by @nicolas-dufour in https://github.com/pytorch/rl/pull/252
    • [Feature] Empty TensorDict population in loops by @vmoens in https://github.com/pytorch/rl/pull/253
    • [BugFix] Memmap del bugfix by @vmoens in https://github.com/pytorch/rl/pull/254
    • [Feature] Implement padding for tensordicts by @ajhinsvark in https://github.com/pytorch/rl/pull/257
    • [BugFix]: recursion error when calling permute(...).to_tensordict() by @vmoens in https://github.com/pytorch/rl/pull/260
    • [Feature] Differentiable PPOLoss for IRL by @vmoens in https://github.com/pytorch/rl/pull/240
    • [BugFix]: avoid deleting true in_keys in TensorDictSequence by @vmoens in https://github.com/pytorch/rl/pull/261
    • [Feature] Add issue and pull request template by @Benjamin-eecs in https://github.com/pytorch/rl/pull/263
    • [Feature] Nested tensordicts by @vmoens in https://github.com/pytorch/rl/pull/256
    • [Feature]: Index nested tensordicts using tuples by @vmoens in https://github.com/pytorch/rl/pull/262
    • [Feature]: flatten nested tensordicts by @vmoens in https://github.com/pytorch/rl/pull/264
    • [Test]: test nested CompositeSpec by @vmoens in https://github.com/pytorch/rl/pull/265
    • [Test]: test squeezed TensorDict by @vmoens in https://github.com/pytorch/rl/pull/269
    • [Doc] Added TensorDict tutorial by @nicolas-dufour in https://github.com/pytorch/rl/pull/255
    • [Test]: TensorDict: test tensordict created on cuda and sub-tensordict indexed along 2nd dimension by @vmoens in https://github.com/pytorch/rl/pull/268
    • Refactor the torch.stack with destination by @khmigor in https://github.com/pytorch/rl/pull/245
    • [Feature]: faster meta-tensor API for TensorDict by @vmoens in https://github.com/pytorch/rl/pull/272
    • [Feature]: Refactored logging to be able to support other loggers easily by @nicolas-dufour in https://github.com/pytorch/rl/pull/270
    • Small tweaks to make the replay buffer code more consistent by @shagunsodhani in https://github.com/pytorch/rl/pull/275
    • [BugFix]: Minor bugs in docstrings by @vmoens in https://github.com/pytorch/rl/pull/276
    • [Doc]: TorchRL demo by @vmoens in https://github.com/pytorch/rl/pull/284
    • [BugFix]: update wrong links in issue and pull request template by @Benjamin-eecs in https://github.com/pytorch/rl/pull/286
    • [BugFix]: quickfix: force gym 0.24 installation until issue with rendering is resolved by @vmoens in https://github.com/pytorch/rl/pull/283
    • [Doc]: remove pip install from CONTRIBUTING.md by @vmoens in https://github.com/pytorch/rl/pull/288
    • [Feature]: faster safetanh transform via C++ bindings by @vmoens in https://github.com/pytorch/rl/pull/289
    • [BugFix]: fix GLFW3 error when installing dm_control by @vmoens in https://github.com/pytorch/rl/pull/291
    • [BugFix]: Fix examples by @vmoens in https://github.com/pytorch/rl/pull/290
    • [Doc] Simplify PR template by @vmoens in https://github.com/pytorch/rl/pull/292
    • [BugFix]: Replay buffer bugfixes by @vmoens in https://github.com/pytorch/rl/pull/294
    • [Doc] MacOs M1 troubleshooting by @ramonmedel in https://github.com/pytorch/rl/pull/296
    • [Feature]: Improving training efficiency by @vmoens in https://github.com/pytorch/rl/pull/293
    • [Feature] Wandb logger by @nicolas-dufour in https://github.com/pytorch/rl/pull/274
    • [QuickFix]: update issue and pr template by @Benjamin-eecs in https://github.com/pytorch/rl/pull/303
    • [Test] tests for BinarizeReward by @srikanthmg85 in https://github.com/pytorch/rl/pull/302
    • [BugFix]: L2-priority for PRB by @vmoens in https://github.com/pytorch/rl/pull/305
    • [Feature] Transforms: Compose.insert and TransformedEnv.insert_transform by @rmartimov in https://github.com/pytorch/rl/pull/304
    • [BugFix] Fix flaky test by waiting for procs instead of sleep by @nairbv in https://github.com/pytorch/rl/pull/306
    • [BugFix] Fix a build warning, setuptools/distutils import order by @nairbv in https://github.com/pytorch/rl/pull/307
    • ufmt issue if imports in order requested by distutils by @nairbv in https://github.com/pytorch/rl/pull/308
    • [BugFix]: Conda to pip for circleci by @vmoens in https://github.com/pytorch/rl/pull/310
    • [BugFix] Support list-based boolean masks for TensorDict by @benoitdescamps in https://github.com/pytorch/rl/pull/299
    • [Feature] Truly invertible tensordict permutation of dimensions by @ramonmedel in https://github.com/pytorch/rl/pull/295
    • [Doc] Tensordictmodule tutorial by @nicolas-dufour in https://github.com/pytorch/rl/pull/267
    • [Feature] Rename _TensorDict into TensorDictBase by @yoavnavon in https://github.com/pytorch/rl/pull/316
    • [Release]: v0.0.1b versioning by @vmoens in https://github.com/pytorch/rl/pull/317
    • [Feature] Adding additional checks to TensorDict.view to remove unnecessary ViewedTensorDict object creation by @bamaxw in https://github.com/pytorch/rl/pull/319
    • [BugFix]: Safe state normalization when std=0 by @vmoens in https://github.com/pytorch/rl/pull/323
    • [BugFix]: gradient propagation in advantage estimates by @vmoens in https://github.com/pytorch/rl/pull/322
    • [BugFix]: make training example gracefully exit by @vmoens in https://github.com/pytorch/rl/pull/326
    • [Setup]: Exclude tutorials from wheels by @vmoens in https://github.com/pytorch/rl/pull/325
    • [BugFix]: Tensor map for subtensordict.set_ by @vmoens in https://github.com/pytorch/rl/pull/324
    • [Versioning]: Wheels v0.0.1c by @vmoens in https://github.com/pytorch/rl/pull/327
    • [BugFix] Fixed compose which ignored inv_transforms of child by @nicolas-dufour in https://github.com/pytorch/rl/pull/328
    • [BugFix] functorch installation in CircleCI by @vmoens in https://github.com/pytorch/rl/pull/336
    • [Refactor] VecNorm inference API by @vmoens in https://github.com/pytorch/rl/pull/337
    • [BugFix] TransformedEnv sets added Transforms into eval mode by @alexanderlobov in https://github.com/pytorch/rl/pull/331
    • [Refactor] make to_tensordict() create a copy of the content by @nicolas-dufour in https://github.com/pytorch/rl/pull/334
    • [CircleCI] Fix dm_control rendering by @vmoens in https://github.com/pytorch/rl/pull/339
    • [BugFix]: joining processes when they're done by @vmoens in https://github.com/pytorch/rl/pull/311
    • [Test] pass the OS error in case the file isn't closed by @tongbaojia in https://github.com/pytorch/rl/pull/344
    • [Feature] Make default rollout tensordict contiguous by @vmoens in https://github.com/pytorch/rl/pull/343
    • [BugFix] Clone memmap tensors on regular tensors and other replay buffer improvements by @vmoens in https://github.com/pytorch/rl/pull/340
    • [CI] Using latest gym by @vmoens in https://github.com/pytorch/rl/pull/346
    • [Doc] Coding your first DDPG tutorial by @vmoens in https://github.com/pytorch/rl/pull/345
    • [Doc] Minor: typos in DDPG by @vmoens in https://github.com/pytorch/rl/pull/354
    • [Feature] Register lambda and gamma in buffers by @vmoens in https://github.com/pytorch/rl/pull/353
    • [Feature] Implement eq for TensorSpec by @omikad in https://github.com/pytorch/rl/pull/358
    • [Doc] Multi-tasking tutorial by @vmoens in https://github.com/pytorch/rl/pull/352
    • [Feature] Env refactoring for model based RL by @nicolas-dufour in https://github.com/pytorch/rl/pull/315
    • [Feature]: Added support for TensorDictSequence module subsampling by @nicolas-dufour in https://github.com/pytorch/rl/pull/332
    • [BugFix] Add lock to vec norm transform by @jaschmid-fb in https://github.com/pytorch/rl/pull/356
    • [Perf]: Improve PPO training performance by @vmoens in https://github.com/pytorch/rl/pull/297
    • [BugFix] Functorch-Tensordict bug fixes by @vmoens in https://github.com/pytorch/rl/pull/361
    • Revert "[BugFix] Functorch-Tensordict bug fixes" by @vmoens in https://github.com/pytorch/rl/pull/362
    • [BugFix] Functorch-Tensordict bug fixes by @vmoens in https://github.com/pytorch/rl/pull/363
    • [Feature] CSVLogger (ABANDONED) by @vmoens in https://github.com/pytorch/rl/pull/371
    • [Feature] Support tensor-based decay in TD-lambda by @tcbegley in https://github.com/pytorch/rl/pull/360
    • [Feature] CSVLogger by @vmoens in https://github.com/pytorch/rl/pull/372
    • [BugFix] Fewer env instantiations for better mujoco rendering by @vmoens in https://github.com/pytorch/rl/pull/378
    • [Feature] change imports of environment libraries (gym and dm_control) at lower levels by @guabao in https://github.com/pytorch/rl/pull/379
    • [BugFix] Representation of indexed nested tensordict by @vmoens in https://github.com/pytorch/rl/pull/370
    • [BugFix] In-place __setitem__ for SubTensorDict by @vmoens in https://github.com/pytorch/rl/pull/369
    • [Feature] Add ProbabilisticTensorDictModule dist key mapping support by @nicolas-dufour in https://github.com/pytorch/rl/pull/376
    • [Feature]: R3M integration by @vmoens in https://github.com/pytorch/rl/pull/321
    • [Feature] static_seed flag for envs, vectorized envs and collectors by @vmoens in https://github.com/pytorch/rl/pull/385
    • [Feature] AdditiveGaussian exploration strategy by @vmoens in https://github.com/pytorch/rl/pull/388
    • [Feature] Multi-images R3M by @vmoens in https://github.com/pytorch/rl/pull/389
    • [Feature] Flatten multi-images in R3M by @vmoens in https://github.com/pytorch/rl/pull/391
    • [Quality] Code cleanup for fbsync by @vmoens in https://github.com/pytorch/rl/pull/392
    • [Feature] In-house functional modules for TorchRL using TensorDict by @vmoens in https://github.com/pytorch/rl/pull/387
    • [Quality] Code cleanup for fbsync by @vmoens in https://github.com/pytorch/rl/pull/397
    • [Doc] Add charts to examples by @nicolas-dufour in https://github.com/pytorch/rl/pull/374
    • [Feature] Vectorized GAE by @vmoens in https://github.com/pytorch/rl/pull/365
    • [BugFix] Temporarily fix gym to 0.25.1 to fix CI by @vmoens in https://github.com/pytorch/rl/pull/411
    • [Feature] Create a Squeeze transform and update Unsqueeze transform by @reachsumit in https://github.com/pytorch/rl/pull/408
    • [Naming] Recurse kwarg to match pytorch by @matt-fff in https://github.com/pytorch/rl/pull/410
    • [Feature] Add all implemented loggers to the init of loggers by @flinder in https://github.com/pytorch/rl/pull/402
    • [BugFix] Fix gym 0.26 compatibility by @vmoens in https://github.com/pytorch/rl/pull/403
    • [BugFix] Remove submodules by @vmoens in https://github.com/pytorch/rl/pull/414
    • [Feature] lock tensordict when calling share_memory_() by @fdabek1 in https://github.com/pytorch/rl/pull/412
    • [BugFix] Updated TensorDict.expand to work as Tensor.expand by @AnshulSehgal in https://github.com/pytorch/rl/pull/409
    • [BugFix] Looser check for test_recorder assertion by @vmoens in https://github.com/pytorch/rl/pull/415
    • [Feature] Allow spec to be passed directly to exploration wrappers by @vmoens in https://github.com/pytorch/rl/pull/418
    • [BugFix] Collector revert to default exploration mode if empty string is passed by @vmoens in https://github.com/pytorch/rl/pull/421
    • [Naming] Rename _TargetNetUpdate to TargetNetUpdater, making it public by @yushiyangk in https://github.com/pytorch/rl/pull/422
    • [Doc] Re-run tutorials by @vmoens in https://github.com/pytorch/rl/pull/381
    • Revert "[Doc] Re-run tutorials" (colab links broken) by @vmoens in https://github.com/pytorch/rl/pull/423
    • [Feature] Switch back to latest gym by @vmoens in https://github.com/pytorch/rl/pull/425
    • [Feature] TensorDict without device by @tcbegley in https://github.com/pytorch/rl/pull/413
    • Updated the README.md file by @bashnick in https://github.com/pytorch/rl/pull/427
    • [Feature] Adding support for initialising TensorDicts from nested dicts by @zeenolife in https://github.com/pytorch/rl/pull/404
    • [Features] Make image_size a cfg param by @nicolas-dufour in https://github.com/pytorch/rl/pull/430
    • Make TensorDict.expand accept Sequence arguments by @nicolasgriffiths in https://github.com/pytorch/rl/pull/424
    • [Doc] Readme revamp for efficiency/modularity display by @vmoens in https://github.com/pytorch/rl/pull/382
    • [Feature] New biased_softplus semantic to allow for minimum scale setting by @nicolas-dufour in https://github.com/pytorch/rl/pull/428
    • [Tutorial] Re-run tutos by @vmoens in https://github.com/pytorch/rl/pull/434
    • [BugFix] mixed device_safe vs device by @vmoens in https://github.com/pytorch/rl/pull/429
    • [BugFix] Explicit params and buffers by @agrotov in https://github.com/pytorch/rl/pull/436
    • [BugFix] Fixed Additive noise by @nicolas-dufour in https://github.com/pytorch/rl/pull/441
    • [Tests] Test loggers video saving by @bashnick in https://github.com/pytorch/rl/pull/439
    • Revert "[BugFix] Fixed Additive noise" by @vmoens in https://github.com/pytorch/rl/pull/442
    • [Refactor] Rename TensorDictSequence to TensorDictSequential by @ronert in https://github.com/pytorch/rl/pull/440
    • [Refactor] Refactoring set*() methods for TensorDictBase class by @zeenolife in https://github.com/pytorch/rl/pull/438
    • [Cleanup] Removing gym-retro interface by @vmoens in https://github.com/pytorch/rl/pull/444
    • [BugFix]: Fix additive noise by @nicolas-dufour in https://github.com/pytorch/rl/pull/447
    • [BugFix] CatTensors: Prepended next_ to the out_key by @ggimler3 in https://github.com/pytorch/rl/pull/449
    • [BugFix] Fix AdditiveGaussian exploration tests by @vmoens in https://github.com/pytorch/rl/pull/450
    • [BugFix] Wrong call to device_safe in replay buffer code by @vmoens in https://github.com/pytorch/rl/pull/454
    • [BugFix] Add transform_observation_spec _R3MNet by @ymwdalex in https://github.com/pytorch/rl/pull/443
    • [Doc] Add a knowledge base by @shagunsodhani in https://github.com/pytorch/rl/pull/375
    • [Feature] Allow for actions and rewards to be in the reset tensordict by @vmoens in https://github.com/pytorch/rl/pull/458
    • [Doc] Readme for knowledge base by @vmoens in https://github.com/pytorch/rl/pull/459
    • [Feature] Added batch_lock attribute in EnvBase by @nicolas-dufour in https://github.com/pytorch/rl/pull/399
    • [BugFix] deepcopy specs before transforming by @vmoens in https://github.com/pytorch/rl/pull/461
    • [BugFix]: Fixed dm_control action type casting by @nicolas-dufour in https://github.com/pytorch/rl/pull/463
    • [Versioning] Version 0.0.2a0 by @vmoens in https://github.com/pytorch/rl/pull/465
    • [CI, Doc] Update functorch source installation command by @zou3519 in https://github.com/pytorch/rl/pull/446
    • [BugFix] TransformedEnv attributes inheritance by @vmoens in https://github.com/pytorch/rl/pull/467
    • [Feature] Cleanup mocking envs init and new by @vmoens in https://github.com/pytorch/rl/pull/469
    • [Tests] Adding tensordict __repr__ tests by @sladebot in https://github.com/pytorch/rl/pull/435
    • [Logging]: implement MLFlow logging integration by @rayanht in https://github.com/pytorch/rl/pull/432
    • [BugFix] MLFlow import fix by @vmoens in https://github.com/pytorch/rl/pull/473
    • [BugFix] Fixed pip install by @brandonsj in https://github.com/pytorch/rl/pull/475
    • [Features]: Changed _inplace_update cls parameter passing in __new__ by @nicolas-dufour in https://github.com/pytorch/rl/pull/464
    • [Feature]: ModelBased Envs by @nicolas-dufour in https://github.com/pytorch/rl/pull/333
    • [Feature] make ReplayBufferTrainer compatible with storing trajectories by @vmoens in https://github.com/pytorch/rl/pull/476
    • [Tutorial] DQN tutorial by @vmoens in https://github.com/pytorch/rl/pull/474
    • [Feature] reader hooks for GymLike by @vmoens in https://github.com/pytorch/rl/pull/478
    • [BugFix] TensorSpec.zero(None) failure fix by @vmoens in https://github.com/pytorch/rl/pull/483
    • [Feature]: Support for planners and CEM by @nicolas-dufour in https://github.com/pytorch/rl/pull/384
    • [Feature] Replaced device_safe() with device by @ordinskiy in https://github.com/pytorch/rl/pull/485
    • [Feature]: TensorDictPrimer transform by @nicolas-dufour in https://github.com/pytorch/rl/pull/456
    • [Feature]: erase() method for torchrl.timeit by @nicolas-dufour in https://github.com/pytorch/rl/pull/480
    • [Feature] Added support for single collector in sync_async_collector by @nicolas-dufour in https://github.com/pytorch/rl/pull/482
    • [BugFix] removing unwanted device_safe() by @vmoens in https://github.com/pytorch/rl/pull/486
    • [Refactoring] Refactored get_stats_random_rollout by @nicolas-dufour in https://github.com/pytorch/rl/pull/481
    • [Feature] VIP Integration by @JasonMa2016 in https://github.com/pytorch/rl/pull/487
    • [Refactoring] Minor tweaks to recorder and logger by @nicolas-dufour in https://github.com/pytorch/rl/pull/489
    • [Feature]: Deactivate typechecks in envs by @nicolas-dufour in https://github.com/pytorch/rl/pull/490
    • [BugFix] Vectorized td_lambda with gamma tensor does not match the serial version by @vmoens in https://github.com/pytorch/rl/pull/400
    • [BugFix] Fix TensorDictPrimer init by @vmoens in https://github.com/pytorch/rl/pull/491
    • [Feature] Optional auto-reset when done for collectors and batched envs by @vmoens in https://github.com/pytorch/rl/pull/492
    • [BugFix] Defaulting passing_devices to None by @himjohntang in https://github.com/pytorch/rl/pull/477
    • Revert "[BugFix] Defaulting passing_devices to None" by @vmoens in https://github.com/pytorch/rl/pull/494
    • [BugFix] Multi-agent fixes by @vmoens in https://github.com/pytorch/rl/pull/488
    • [BugFix] Defaulting passing_devices to None by @vmoens in https://github.com/pytorch/rl/pull/495
    • [Feature] Lazy initialization of CatTensors by @vmoens in https://github.com/pytorch/rl/pull/497
    • [Cleanup] Removing cuda 10.2 references by @vmoens in https://github.com/pytorch/rl/pull/498
    • [BugFix] Migration to pytorch org by @vmoens in https://github.com/pytorch/rl/pull/499
    • [Refactoring] Import at root to enable vmap monkey-patching by @vmoens in https://github.com/pytorch/rl/pull/500
    • [BugFix] python version for linting checks by @vmoens in https://github.com/pytorch/rl/pull/502
    • [Feature] Replay Buffers refactor by @bamaxw in https://github.com/pytorch/rl/pull/330
    • [Feature] Rename step_tensordict in step_mdp by @romainjln in https://github.com/pytorch/rl/pull/512
    • [Lint] re-instantiate F821 by @vmoens in https://github.com/pytorch/rl/pull/516
    • [BugFix] run_type_checks for TransformedEnvs by @vmoens in https://github.com/pytorch/rl/pull/513
    • [BugFix] making first_dim and last_dim negative in FlattenObservation when a parent is set by @vmoens in https://github.com/pytorch/rl/pull/511
    • [Feature] Add info dict key-spec pairs to observation_spec by @tcbegley in https://github.com/pytorch/rl/pull/504
    • [BugFix] Changing the dm_control import to fail if not installed by @zeenolife in https://github.com/pytorch/rl/pull/515
    • [CI] Add coverage with codecov by @silvestrebahi in https://github.com/pytorch/rl/pull/523
    • Revert "[CI] Add coverage with codecov" by @vmoens in https://github.com/pytorch/rl/pull/525
    • [Quality] Use relative imports for local c++ deps by @apbard in https://github.com/pytorch/rl/pull/526
    • [Feature] Nightly release by @vmoens in https://github.com/pytorch/rl/pull/519
    • [Feature] Add make_tensordict() function by @sicong-huang in https://github.com/pytorch/rl/pull/522
    • [Doc] Misc readme fixes by @GavinPHR in https://github.com/pytorch/rl/pull/532
    • [BugFix] Replacing inference_mode decorator with no_grad to fix state_dict loading error by @GavinPHR in https://github.com/pytorch/rl/pull/530
    • [BugFix] Transformed ParallelEnv meta data are broken when passing to device by @vmoens in https://github.com/pytorch/rl/pull/531
    • [Doc] Add coverage banner by @vmoens in https://github.com/pytorch/rl/pull/533
    • [BugFix] Fix colab link of coding_dqn.ipynb by @Benjamin-eecs in https://github.com/pytorch/rl/pull/543
    • [BugFix] Fix optional imports by @vmoens in https://github.com/pytorch/rl/pull/535
    • [BugFix] Restore missing keys in data collector output by @tcbegley in https://github.com/pytorch/rl/pull/521
    • [Lint] reorganize imports by @apbard in https://github.com/pytorch/rl/pull/545
    • [BugFix] Single-cpu compatibility by @vmoens in https://github.com/pytorch/rl/pull/548
    • [BugFix] vision install and other deps in optdeps by @vmoens in https://github.com/pytorch/rl/pull/552
    • [Feature] Implemented device argument for modules.models by @yushiyangk in https://github.com/pytorch/rl/pull/524
    • [BugFix] Fix ellipsis indexing of 2d TensorDicts by @vmoens in https://github.com/pytorch/rl/pull/559
    • [BugFix] Additive gaussian exploration spec fix by @vmoens in https://github.com/pytorch/rl/pull/560
    • [BugFix] Disabling video step for wandb by @vmoens in https://github.com/pytorch/rl/pull/561
    • [BugFix] Various device fix by @vmoens in https://github.com/pytorch/rl/pull/558
    • [Feature] Allow collectors to accept regular modules as policies by @tcbegley in https://github.com/pytorch/rl/pull/546
    • [BugFix] Fix push binary nightly action by @psolikov in https://github.com/pytorch/rl/pull/566
    • [BugFix] TensorDict comparison by @vmoens in https://github.com/pytorch/rl/pull/567
    • [BugFix] Fix SyncDataCollector reset by @jrobine in https://github.com/pytorch/rl/pull/571
    • [Doc] Banners on README.md by @vmoens in https://github.com/pytorch/rl/pull/572
    • [Feature] Log printing in alphabetical order when creating a replay buffer by @nikhlrao in https://github.com/pytorch/rl/pull/573
    • [BugFix] Add eps to reward normalization by @vmoens in https://github.com/pytorch/rl/pull/574
    • [BugFix] Fix argument for PPOLoss.get_entropy_bonus() by @vmoens in https://github.com/pytorch/rl/pull/578
    • [Feature] Restructure torchrl/objectives by @sgrigory in https://github.com/pytorch/rl/pull/580
    • [Docs] Documentation revamp by @vmoens in https://github.com/pytorch/rl/pull/581
    • [Doc] Publishing on pytorch.org by @vmoens in https://github.com/pytorch/rl/pull/582
    • Revert "[Doc] Publishing on pytorch.org" by @vmoens in https://github.com/pytorch/rl/pull/584
    • [Doc] Publishing on pytorch.org by @vmoens in https://github.com/pytorch/rl/pull/585
    • Revert "[Doc] Publishing on pytorch.org" by @vmoens in https://github.com/pytorch/rl/pull/586
    • [Doc] Publishing on pytorch.org by @vmoens in https://github.com/pytorch/rl/pull/587
    • [Feature] More restrictive tests on docstrings by @vmoens in https://github.com/pytorch/rl/pull/457
    • [BugFix] Wrong stack import in tests by @vmoens in https://github.com/pytorch/rl/pull/590
    • [Feature] Exclude "_" out_keys in tensordictmodel by @jlesuffleur in https://github.com/pytorch/rl/pull/589
    • [Feature]: Dreamer support by @nicolas-dufour in https://github.com/pytorch/rl/pull/341
    • [Doc] Missing doc for prototype RB by @vmoens in https://github.com/pytorch/rl/pull/595
    • [Feature] Update list of supported libraries by @vmoens in https://github.com/pytorch/rl/pull/594
    • [BugFix] Fix timeit count registration by @vmoens in https://github.com/pytorch/rl/pull/598
    • [Naming] Renaming ProbabilisticTensorDictModule keys by @vmoens in https://github.com/pytorch/rl/pull/603
    • [Feature] Categorical encoding for action space by @artkorenev in https://github.com/pytorch/rl/pull/593
    • [BugFix] ReplayBuffer's storage now signal back when changes happen by @paulomarciano in https://github.com/pytorch/rl/pull/614
    • [Doc] Typos in tensordict tutorial by @PaLeroy in https://github.com/pytorch/rl/pull/621
    • [Doc] Integrate knowledge base in docs by @hatala91 in https://github.com/pytorch/rl/pull/622
    • [Doc] Updating docs requirements by @vmoens in https://github.com/pytorch/rl/pull/624
    • [Feature] Make torchrl runnable without functorch and with gym==0.13 by @vmoens in https://github.com/pytorch/rl/pull/386
    • [Feature] Habitat integration by @vmoens in https://github.com/pytorch/rl/pull/514
    • [Feature] Checkpointing by @vmoens in https://github.com/pytorch/rl/pull/549
    • Add support for null dim argument in TensorDict.squeeze by @jgonik in https://github.com/pytorch/rl/pull/608
    • [Version] Updating to torch 1.13 by @vmoens in https://github.com/pytorch/rl/pull/627
    • [Feature] Sub-memmap tensors by @vmoens in https://github.com/pytorch/rl/pull/626
    • [BugFix] copy_ changes the index if the dest and source memmap tensors share the same file location by @vmoens in https://github.com/pytorch/rl/pull/631
    • [Feature] Unfold transforms for folded TransformedEnv by @alexanderlobov in https://github.com/pytorch/rl/pull/630
    • [BugFix] make TensorDictReplayBuffer.extend call super().extend with stacked_td by @vmoens in https://github.com/pytorch/rl/pull/634
    • [BugFix] correct the use of step_mdp method in data collector by @adityagandhamal in https://github.com/pytorch/rl/pull/637
    • [Feature] Added implement_for decorator by @ordinskiy in https://github.com/pytorch/rl/pull/618
    • [Feature] Make DQN compatible with nn.Module by @svarolgunes in https://github.com/pytorch/rl/pull/632
    • [Example] Distributed Replay Buffer Prototype Example Implementation by @adityagoel4512 in https://github.com/pytorch/rl/pull/615
    • [Feature] Benchmark storage types by @adityagoel4512 in https://github.com/pytorch/rl/pull/633
    • [Feature] Remove wild imports in the library by @sosmond in https://github.com/pytorch/rl/pull/642
    • [BugFix] Prevent transform parent from being reassigned by @jasonfkut in https://github.com/pytorch/rl/pull/641
    • [Feature] Too many deepcopy in transforms.py by @romainjln in https://github.com/pytorch/rl/pull/625
    • [Naming] Rename keys_in to in_keys in transforms.py and related modules by @sardaankita in https://github.com/pytorch/rl/pull/656
    • [Refactoring] Refactor dreamer helper in smaller pieces by @vmoens in https://github.com/pytorch/rl/pull/662
    • [Feature] VIPRewardTransform by @vmoens in https://github.com/pytorch/rl/pull/658
    • [BugFix] make_trainer possible bug for on-policy cases by @albertbou92 in https://github.com/pytorch/rl/pull/655
    • [Naming] Fixing key names by @vmoens in https://github.com/pytorch/rl/pull/668
    • [Test] Check dtypes of envs by @vmoens in https://github.com/pytorch/rl/pull/666
    • [Refactor] Relying on the standalone tensordict -- phase 1 by @vmoens in https://github.com/pytorch/rl/pull/650
    • [Doc] More doc on trainers by @vmoens in https://github.com/pytorch/rl/pull/663
    • [BugFix] PPO example GAE import by @albertbou92 in https://github.com/pytorch/rl/pull/671
    • [BugFix] Use GitHub for flake8 pre-commit hook by @vmoens in https://github.com/pytorch/rl/pull/679
    • [BugFix] Update to strict select by @vmoens in https://github.com/pytorch/rl/pull/675
    • [Feature] Auto-compute stats for ObservationNorm by @romainjln in https://github.com/pytorch/rl/pull/669
    • [Doc] _make_collector helper function by @albertbou92 in https://github.com/pytorch/rl/pull/678
    • [Doc] BatchSubSampler class docstrings example by @albertbou92 in https://github.com/pytorch/rl/pull/677
    • [BugFix] PPO objective crashes if advantage_module is None by @albertbou92 in https://github.com/pytorch/rl/pull/676
    • [Refactor] Refactor 'next_' into nested tensordicts by @vmoens in https://github.com/pytorch/rl/pull/649
    • [Doc] More doc about environments by @vmoens in https://github.com/pytorch/rl/pull/683
    • [Doc] Fix missing tensordict install for doc by @vmoens in https://github.com/pytorch/rl/pull/685
    • [CI] Added CircleCI pipeline to test compatibility across supported gym versions by @ordinskiy in https://github.com/pytorch/rl/pull/645
    • [BugFix] ConvNet forward method with tensors of more than 4 dimensions by @albertbou92 in https://github.com/pytorch/rl/pull/686
    • [Feature] add standard_normal for RewardScaling by @adityagandhamal in https://github.com/pytorch/rl/pull/682
    • [Feature] Jumanji envs by @yingchenlin in https://github.com/pytorch/rl/pull/674
    • [Feature] Default collate_fn by @vmoens in https://github.com/pytorch/rl/pull/688
    • [BugFix] Fix Examples by @vmoens in https://github.com/pytorch/rl/pull/687
    • [Refactoring] Replace direct gym version checks with decorated functions (#) by @ordinskiy in https://github.com/pytorch/rl/pull/691

    New Contributors

    • @ajhinsvark made their first contribution in https://github.com/pytorch/rl/pull/257
    • @ramonmedel made their first contribution in https://github.com/pytorch/rl/pull/296
    • @srikanthmg85 made their first contribution in https://github.com/pytorch/rl/pull/302
    • @rmartimov made their first contribution in https://github.com/pytorch/rl/pull/304
    • @nairbv made their first contribution in https://github.com/pytorch/rl/pull/306
    • @benoitdescamps made their first contribution in https://github.com/pytorch/rl/pull/299
    • @yoavnavon made their first contribution in https://github.com/pytorch/rl/pull/316
    • @bamaxw made their first contribution in https://github.com/pytorch/rl/pull/319
    • @alexanderlobov made their first contribution in https://github.com/pytorch/rl/pull/331
    • @tongbaojia made their first contribution in https://github.com/pytorch/rl/pull/344
    • @omikad made their first contribution in https://github.com/pytorch/rl/pull/358
    • @jaschmid-fb made their first contribution in https://github.com/pytorch/rl/pull/356
    • @guabao made their first contribution in https://github.com/pytorch/rl/pull/379
    • @reachsumit made their first contribution in https://github.com/pytorch/rl/pull/408
    • @matt-fff made their first contribution in https://github.com/pytorch/rl/pull/410
    • @flinder made their first contribution in https://github.com/pytorch/rl/pull/402
    • @fdabek1 made their first contribution in https://github.com/pytorch/rl/pull/412
    • @AnshulSehgal made their first contribution in https://github.com/pytorch/rl/pull/409
    • @yushiyangk made their first contribution in https://github.com/pytorch/rl/pull/422
    • @bashnick made their first contribution in https://github.com/pytorch/rl/pull/427
    • @zeenolife made their first contribution in https://github.com/pytorch/rl/pull/404
    • @nicolasgriffiths made their first contribution in https://github.com/pytorch/rl/pull/424
    • @agrotov made their first contribution in https://github.com/pytorch/rl/pull/436
    • @ronert made their first contribution in https://github.com/pytorch/rl/pull/440
    • @ggimler3 made their first contribution in https://github.com/pytorch/rl/pull/449
    • @ymwdalex made their first contribution in https://github.com/pytorch/rl/pull/443
    • @sladebot made their first contribution in https://github.com/pytorch/rl/pull/435
    • @rayanht made their first contribution in https://github.com/pytorch/rl/pull/432
    • @brandonsj made their first contribution in https://github.com/pytorch/rl/pull/475
    • @ordinskiy made their first contribution in https://github.com/pytorch/rl/pull/485
    • @JasonMa2016 made their first contribution in https://github.com/pytorch/rl/pull/487
    • @himjohntang made their first contribution in https://github.com/pytorch/rl/pull/477
    • @romainjln made their first contribution in https://github.com/pytorch/rl/pull/512
    • @apbard made their first contribution in https://github.com/pytorch/rl/pull/526
    • @sicong-huang made their first contribution in https://github.com/pytorch/rl/pull/522
    • @psolikov made their first contribution in https://github.com/pytorch/rl/pull/566
    • @jrobine made their first contribution in https://github.com/pytorch/rl/pull/571
    • @nikhlrao made their first contribution in https://github.com/pytorch/rl/pull/573
    • @sgrigory made their first contribution in https://github.com/pytorch/rl/pull/580
    • @jlesuffleur made their first contribution in https://github.com/pytorch/rl/pull/589
    • @artkorenev made their first contribution in https://github.com/pytorch/rl/pull/593
    • @paulomarciano made their first contribution in https://github.com/pytorch/rl/pull/614
    • @hatala91 made their first contribution in https://github.com/pytorch/rl/pull/622
    • @jgonik made their first contribution in https://github.com/pytorch/rl/pull/608
    • @adityagandhamal made their first contribution in https://github.com/pytorch/rl/pull/637
    • @svarolgunes made their first contribution in https://github.com/pytorch/rl/pull/632
    • @adityagoel4512 made their first contribution in https://github.com/pytorch/rl/pull/615
    • @jasonfkut made their first contribution in https://github.com/pytorch/rl/pull/641
    • @sardaankita made their first contribution in https://github.com/pytorch/rl/pull/656
    • @albertbou92 made their first contribution in https://github.com/pytorch/rl/pull/655
    • @yingchenlin made their first contribution in https://github.com/pytorch/rl/pull/674

    Full Changelog: https://github.com/pytorch/rl/compare/v0.0.1...0.0.3

  • v0.0.2a(Sep 17, 2022)

    What's Changed

    • [BugFix] Fixed compose which ignored inv_transforms of child by @nicolas-dufour in https://github.com/facebookresearch/rl/pull/328
    • [BugFix] functorch installation in CircleCI by @vmoens in https://github.com/facebookresearch/rl/pull/336
    • [Refactor] VecNorm inference API by @vmoens in https://github.com/facebookresearch/rl/pull/337
    • TransformedEnv sets added Transforms into eval mode by @alexanderlobov in https://github.com/facebookresearch/rl/pull/331
    • [Refactor] make to_tensordict() create a copy of the content by @nicolas-dufour in https://github.com/facebookresearch/rl/pull/334
    • [CircleCI] Fix dm_control rendering by @vmoens in https://github.com/facebookresearch/rl/pull/339
    • [BugFix]: joining processes when they're done by @vmoens in https://github.com/facebookresearch/rl/pull/311
    • [Test] pass the OS error in case the file isn't closed by @tongbaojia in https://github.com/facebookresearch/rl/pull/344
    • [Feature] Make default rollout tensordict contiguous by @vmoens in https://github.com/facebookresearch/rl/pull/343
    • [BugFix] Clone memmap tensors on regular tensors and other replay buffer improvements by @vmoens in https://github.com/facebookresearch/rl/pull/340
    • [CI] Using latest gym by @vmoens in https://github.com/facebookresearch/rl/pull/346
    • [Doc] Coding your first DDPG tutorial by @vmoens in https://github.com/facebookresearch/rl/pull/345
    • [Doc] Minor: typos in DDPG by @vmoens in https://github.com/facebookresearch/rl/pull/354
    • [Feature] Register lambda and gamma in buffers by @vmoens in https://github.com/facebookresearch/rl/pull/353
    • [Feature] Implement eq for TensorSpec by @omikad in https://github.com/facebookresearch/rl/pull/358
    • [Doc] Multi-tasking tutorial by @vmoens in https://github.com/facebookresearch/rl/pull/352
    • [Feature] Env refactoring for model based RL by @nicolas-dufour in https://github.com/facebookresearch/rl/pull/315
    • [Feature]: Added support for TensorDictSequence module subsampling by @nicolas-dufour in https://github.com/facebookresearch/rl/pull/332
    • [BugFix] Add lock to vec norm transform by @jaschmid-fb in https://github.com/facebookresearch/rl/pull/356
    • [Perf]: Improve PPO training performance by @vmoens in https://github.com/facebookresearch/rl/pull/297
    • [BugFix] Functorch-Tensordict bug fixes by @vmoens in https://github.com/facebookresearch/rl/pull/361
    • Revert "[BugFix] Functorch-Tensordict bug fixes" by @vmoens in https://github.com/facebookresearch/rl/pull/362
    • [BugFix] Functorch-Tensordict bug fixes by @vmoens in https://github.com/facebookresearch/rl/pull/363
    • [Feature] CSVLogger (ABANDONED) by @vmoens in https://github.com/facebookresearch/rl/pull/371
    • [Feature] Support tensor-based decay in TD-lambda by @tcbegley in https://github.com/facebookresearch/rl/pull/360
    • [Feature] CSVLogger by @vmoens in https://github.com/facebookresearch/rl/pull/372
    • [BugFix] Fewer env instantiations for better mujoco rendering by @vmoens in https://github.com/facebookresearch/rl/pull/378
    • [Feature] change imports of environment libraries (gym and dm_control) at lower levels by @guabao in https://github.com/facebookresearch/rl/pull/379
    • [BugFix] Representation of indexed nested tensordict by @vmoens in https://github.com/facebookresearch/rl/pull/370
    • [BugFix] In-place __setitem__ for SubTensorDict by @vmoens in https://github.com/facebookresearch/rl/pull/369
    • [Feature] Add ProbabilisticTensorDictModule dist key mapping support by @nicolas-dufour in https://github.com/facebookresearch/rl/pull/376
    • [Feature]: R3M integration by @vmoens in https://github.com/facebookresearch/rl/pull/321
    • [Feature] static_seed flag for envs, vectorized envs and collectors by @vmoens in https://github.com/facebookresearch/rl/pull/385
    • [Feature] AdditiveGaussian exploration strategy by @vmoens in https://github.com/facebookresearch/rl/pull/388
    • [Feature] Multi-images R3M by @vmoens in https://github.com/facebookresearch/rl/pull/389
    • [Feature] Flatten multi-images in R3M by @vmoens in https://github.com/facebookresearch/rl/pull/391
    • [Quality] Code cleanup for fbsync by @vmoens in https://github.com/facebookresearch/rl/pull/392
    • [Feature] In-house functional modules for TorchRL using TensorDict by @vmoens in https://github.com/facebookresearch/rl/pull/387
    • [Quality] Code cleanup for fbsync by @vmoens in https://github.com/facebookresearch/rl/pull/397
    • [Doc] Add charts to examples by @nicolas-dufour in https://github.com/facebookresearch/rl/pull/374
    • [Feature] Vectorized GAE by @vmoens in https://github.com/facebookresearch/rl/pull/365
    • [BugFix] Temporarily fix gym to 0.25.1 to fix CI by @vmoens in https://github.com/facebookresearch/rl/pull/411
    • [Feature] Create a Squeeze transform and update Unsqueeze transform by @reachsumit in https://github.com/facebookresearch/rl/pull/408
    • [Naming] Recurse kwarg to match pytorch by @matt-fff in https://github.com/facebookresearch/rl/pull/410
    • [Feature] Add all implemented loggers to the init of loggers by @flinder in https://github.com/facebookresearch/rl/pull/402
    • [BugFix] Fix gym 0.26 compatibility by @vmoens in https://github.com/facebookresearch/rl/pull/403
    • [BugFix] Remove submodules by @vmoens in https://github.com/facebookresearch/rl/pull/414
    • [Feature] lock tensordict when calling share_memory_() by @fdabek1 in https://github.com/facebookresearch/rl/pull/412
    • [BugFix] Updated TensorDict.expand to work as Tensor.expand by @AnshulSehgal in https://github.com/facebookresearch/rl/pull/409
    • [BugFix] Looser check for test_recorder assertion by @vmoens in https://github.com/facebookresearch/rl/pull/415
    • [Feature] Allow spec to be passed directly to exploration wrappers by @vmoens in https://github.com/facebookresearch/rl/pull/418
    • [BugFix] Collector revert to default exploration mode if empty string is passed by @vmoens in https://github.com/facebookresearch/rl/pull/421
    • [Naming] Rename _TargetNetUpdate to TargetNetUpdater, making it public by @yushiyangk in https://github.com/facebookresearch/rl/pull/422
    • [Doc] Re-run tutorials by @vmoens in https://github.com/facebookresearch/rl/pull/381
    • Revert "[Doc] Re-run tutorials" (colab links broken) by @vmoens in https://github.com/facebookresearch/rl/pull/423
    • [Feature] Switch back to latest gym by @vmoens in https://github.com/facebookresearch/rl/pull/425
    • [Feature] TensorDict without device by @tcbegley in https://github.com/facebookresearch/rl/pull/413
    • Updated the README.md file by @bashnick in https://github.com/facebookresearch/rl/pull/427
    • [Feature] Adding support for initialising TensorDicts from nested dicts by @zeenolife in https://github.com/facebookresearch/rl/pull/404
    • [Features] Make image_size a cfg param by @nicolas-dufour in https://github.com/facebookresearch/rl/pull/430
    • Make TensorDict.expand accept Sequence arguments by @nicolasgriffiths in https://github.com/facebookresearch/rl/pull/424
    • [Doc] Readme revamp for efficiency/modularity display by @vmoens in https://github.com/facebookresearch/rl/pull/382
    • [Feature] New biased_softplus semantic to allow for minimum scale setting by @nicolas-dufour in https://github.com/facebookresearch/rl/pull/428
    • [Tutorial] Re-run tutos by @vmoens in https://github.com/facebookresearch/rl/pull/434
    • [BugFix] mixed device_safe vs device by @vmoens in https://github.com/facebookresearch/rl/pull/429
    • [BugFix] Explicit params and buffers by @agrotov in https://github.com/facebookresearch/rl/pull/436
    • [BugFix] Fixed Additive noise by @nicolas-dufour in https://github.com/facebookresearch/rl/pull/441
    • [Tests] Test loggers video saving by @bashnick in https://github.com/facebookresearch/rl/pull/439
    • Revert "[BugFix] Fixed Additive noise" by @vmoens in https://github.com/facebookresearch/rl/pull/442
    • [Refactor] Rename TensorDictSequence to TensorDictSequential by @ronert in https://github.com/facebookresearch/rl/pull/440
    • [Refactor] Refactoring set*() methods for TensorDictBase class by @zeenolife in https://github.com/facebookresearch/rl/pull/438
    • [Cleanup] Removing gym-retro interface by @vmoens in https://github.com/facebookresearch/rl/pull/444
    • [BugFix]: Fix additive noise by @nicolas-dufour in https://github.com/facebookresearch/rl/pull/447
    • [BugFix] CatTensors: Prepended next_ to the out_key by @ggimler3 in https://github.com/facebookresearch/rl/pull/449
    • [BugFix] Fix AdditiveGaussian exploration tests by @vmoens in https://github.com/facebookresearch/rl/pull/450
    • [BugFix] Wrong call to device_safe in replay buffer code by @vmoens in https://github.com/facebookresearch/rl/pull/454
    • [BugFix] Add transform_observation_spec _R3MNet by @ymwdalex in https://github.com/facebookresearch/rl/pull/443
    • [Doc] Add a knowledge base by @shagunsodhani in https://github.com/facebookresearch/rl/pull/375
    • [Feature] Allow for actions and rewards to be in the reset tensordict by @vmoens in https://github.com/facebookresearch/rl/pull/458
    • [Doc] Readme for knowledge base by @vmoens in https://github.com/facebookresearch/rl/pull/459
    • [Feature] Added batch_lock attribute in EnvBase by @nicolas-dufour in https://github.com/facebookresearch/rl/pull/399
    • [BugFix] deepcopy specs before transforming by @vmoens in https://github.com/facebookresearch/rl/pull/461
    • [BugFix]: Fixed dm_control action type casting by @nicolas-dufour in https://github.com/facebookresearch/rl/pull/463
    • [Versioning] Version 0.0.2a0 by @vmoens in https://github.com/facebookresearch/rl/pull/465

    New Contributors

    • @alexanderlobov made their first contribution in https://github.com/facebookresearch/rl/pull/331
    • @tongbaojia made their first contribution in https://github.com/facebookresearch/rl/pull/344
    • @omikad made their first contribution in https://github.com/facebookresearch/rl/pull/358
    • @jaschmid-fb made their first contribution in https://github.com/facebookresearch/rl/pull/356
    • @tcbegley made their first contribution in https://github.com/facebookresearch/rl/pull/360
    • @guabao made their first contribution in https://github.com/facebookresearch/rl/pull/379
    • @reachsumit made their first contribution in https://github.com/facebookresearch/rl/pull/408
    • @matt-fff made their first contribution in https://github.com/facebookresearch/rl/pull/410
    • @flinder made their first contribution in https://github.com/facebookresearch/rl/pull/402
    • @fdabek1 made their first contribution in https://github.com/facebookresearch/rl/pull/412
    • @AnshulSehgal made their first contribution in https://github.com/facebookresearch/rl/pull/409
    • @yushiyangk made their first contribution in https://github.com/facebookresearch/rl/pull/422
    • @bashnick made their first contribution in https://github.com/facebookresearch/rl/pull/427
    • @zeenolife made their first contribution in https://github.com/facebookresearch/rl/pull/404
    • @nicolasgriffiths made their first contribution in https://github.com/facebookresearch/rl/pull/424
    • @agrotov made their first contribution in https://github.com/facebookresearch/rl/pull/436
    • @ronert made their first contribution in https://github.com/facebookresearch/rl/pull/440
    • @ggimler3 made their first contribution in https://github.com/facebookresearch/rl/pull/449
    • @ymwdalex made their first contribution in https://github.com/facebookresearch/rl/pull/443

    Full Changelog: https://github.com/facebookresearch/rl/compare/v0.0.1c...v0.0.2a

  • v0.0.1c(Jul 25, 2022)

    What's Changed

    • Adding additional checks to TensorDict.view to remove unnecessary ViewedTensorDict object creation by @bamaxw in https://github.com/facebookresearch/rl/pull/319
    • [BugFix]: Safe state normalization when std=0 by @vmoens in https://github.com/facebookresearch/rl/pull/323
    • [BugFix]: gradient propagation in advantage estimates by @vmoens in https://github.com/facebookresearch/rl/pull/322
    • [BugFix]: make training example gracefully exit by @vmoens in https://github.com/facebookresearch/rl/pull/326
    • [Setup]: Exclude tutorials from wheels by @vmoens in https://github.com/facebookresearch/rl/pull/325
    • [BugFix]: Tensor map for subtensordict.set_ by @vmoens in https://github.com/facebookresearch/rl/pull/324
    • [Release]: Wheels v0.0.1c by @vmoens in https://github.com/facebookresearch/rl/pull/327

    New Contributors

    • @bamaxw made their first contribution in https://github.com/facebookresearch/rl/pull/319

    Full Changelog: https://github.com/facebookresearch/rl/compare/v0.0.1b...v0.0.1c

  • v0.0.1b(Jul 25, 2022)

    Highlights

    Supports nested tensordicts:

    • [Feature] Nested tensordicts by @vmoens in https://github.com/facebookresearch/rl/pull/256
    • [Feature]: Index nested tensordicts using tuples by @vmoens in https://github.com/facebookresearch/rl/pull/262
    • [Feature]: flatten nested tensordicts by @vmoens in https://github.com/facebookresearch/rl/pull/264

    Padding for tensordicts:

    • [Feature] Implement padding for tensordicts by @ajhinsvark in https://github.com/facebookresearch/rl/pull/257

    Speed improvements:

    • [Feature]: faster meta-tensor API for TensorDict by @vmoens in https://github.com/facebookresearch/rl/pull/272
    • [Feature]: faster safetanh transform via C++ bindings by @vmoens in https://github.com/facebookresearch/rl/pull/289
    • [Feature]: Improving training efficiency by @vmoens in https://github.com/facebookresearch/rl/pull/293

    Logging capabilities:

    • [Feature]: Refactored logging to be able to support other loggers easily by @nicolas-dufour in https://github.com/facebookresearch/rl/pull/270
    • [Feature] Wandb logger by @nicolas-dufour in https://github.com/facebookresearch/rl/pull/274

    Doc

    • [Doc]: TorchRL demo by @vmoens in https://github.com/facebookresearch/rl/pull/284
    • [Doc] Added TensorDict tutorial by @nicolas-dufour in https://github.com/facebookresearch/rl/pull/255
    • [Doc] Tensordictmodule tutorial by @nicolas-dufour in https://github.com/facebookresearch/rl/pull/267

    What's Changed

    • MacOs versioning and release bugfix by @vmoens in https://github.com/facebookresearch/rl/pull/247
    • Setup metadata by @vmoens in https://github.com/facebookresearch/rl/pull/248
    • Fix setup instructions by @vmoens in https://github.com/facebookresearch/rl/pull/250
    • Fix a bug when segment_tree size is exactly 2^N by @xiaomengy in https://github.com/facebookresearch/rl/pull/251
    • Added test for RewardRescale transform by @nicolas-dufour in https://github.com/facebookresearch/rl/pull/252
    • Empty TensorDict population in loops by @vmoens in https://github.com/facebookresearch/rl/pull/253
    • Memmap del bugfix by @vmoens in https://github.com/facebookresearch/rl/pull/254
    • [BugFix]: recursion error when calling permute(...).to_tensordict() by @vmoens in https://github.com/facebookresearch/rl/pull/260
    • Differentiable PPOLoss for IRL by @vmoens in https://github.com/facebookresearch/rl/pull/240
    • [BugFix]: avoid deleting true in_keys in TensorDictSequence by @vmoens in https://github.com/facebookresearch/rl/pull/261
    • [Feature] Add issue and pull request template by @Benjamin-eecs in https://github.com/facebookresearch/rl/pull/263
    • [Test]: test nested CompositeSpec by @vmoens in https://github.com/facebookresearch/rl/pull/265
    • [Test]: test squeezed TensorDict by @vmoens in https://github.com/facebookresearch/rl/pull/269
    • [Test]: TensorDict: test tensordict created on cuda and sub-tensordict indexed along 2nd dimension by @vmoens in https://github.com/facebookresearch/rl/pull/268
    • Refactor the torch.stack with destination by @khmigor in https://github.com/facebookresearch/rl/pull/245
    • Small tweaks to make the replay buffer code more consistent by @shagunsodhani in https://github.com/facebookresearch/rl/pull/275
    • [BugFix]: Minor bugs in docstrings by @vmoens in https://github.com/facebookresearch/rl/pull/276
    • [BugFix]: update wrong links in issue and pull request template by @Benjamin-eecs in https://github.com/facebookresearch/rl/pull/286
    • [BugFix]: quickfix: force gym 0.24 installation until issue with rendering is resolved by @vmoens in https://github.com/facebookresearch/rl/pull/283
    • [Doc]: remove pip install from CONTRIBUTING.md by @vmoens in https://github.com/facebookresearch/rl/pull/288
    • [BugFix]: fix GLFW3 error when installing dm_control by @vmoens in https://github.com/facebookresearch/rl/pull/291
    • [BugFix]: Fix examples by @vmoens in https://github.com/facebookresearch/rl/pull/290
    • [Doc] Simplify PR template by @vmoens in https://github.com/facebookresearch/rl/pull/292
    • [BugFix]: Replay buffer bugfixes by @vmoens in https://github.com/facebookresearch/rl/pull/294
    • [Doc] MacOs M1 troubleshooting by @ramonmedel in https://github.com/facebookresearch/rl/pull/296
    • [QuickFix]: update issue and pr template by @Benjamin-eecs in https://github.com/facebookresearch/rl/pull/303
    • [Test] tests for BinarizeReward by @srikanthmg85 in https://github.com/facebookresearch/rl/pull/302
    • [BugFix]: L2-priority for PRB by @vmoens in https://github.com/facebookresearch/rl/pull/305
    • [Feature] Transforms: Compose.insert and TransformedEnv.insert_transform by @rmartimov in https://github.com/facebookresearch/rl/pull/304
    • [BugFix] Fix flaky test by waiting for procs instead of sleep by @nairbv in https://github.com/facebookresearch/rl/pull/306
    • [BugFix] Fix a build warning, setuptools/distutils import order by @nairbv in https://github.com/facebookresearch/rl/pull/307
    • ufmt issue if imports in order requested by distutils by @nairbv in https://github.com/facebookresearch/rl/pull/308
    • [BugFix]: Conda to pip for circleci by @vmoens in https://github.com/facebookresearch/rl/pull/310
    • [BugFix] Support list-based boolean masks for TensorDict by @benoitdescamps in https://github.com/facebookresearch/rl/pull/299
    • [Feature] Truly invertible tensordict permutation of dimensions by @ramonmedel in https://github.com/facebookresearch/rl/pull/295
    • [Feature] Rename _TensorDict into TensorDictBase by @yoavnavon in https://github.com/facebookresearch/rl/pull/316

    New Contributors

    • @nicolas-dufour made their first contribution in https://github.com/facebookresearch/rl/pull/252
    • @ajhinsvark made their first contribution in https://github.com/facebookresearch/rl/pull/257
    • @ramonmedel made their first contribution in https://github.com/facebookresearch/rl/pull/296
    • @srikanthmg85 made their first contribution in https://github.com/facebookresearch/rl/pull/302
    • @rmartimov made their first contribution in https://github.com/facebookresearch/rl/pull/304
    • @nairbv made their first contribution in https://github.com/facebookresearch/rl/pull/306
    • @benoitdescamps made their first contribution in https://github.com/facebookresearch/rl/pull/299
    • @yoavnavon made their first contribution in https://github.com/facebookresearch/rl/pull/316

    Full Changelog: https://github.com/facebookresearch/rl/compare/v0.0.1...v0.0.1b

  • v0.0.1(Jul 6, 2022)

    TorchRL Initial Alpha Release

    TorchRL is the soon-to-be official RL domain library for PyTorch. It contains primitives that are aimed at covering most of the modern RL research space.

    Getting started with the library

    Installation

    The library can be installed with pip:

    $ pip install torchrl
    

    Currently, torchrl wheels are provided for Linux and macOS (not M1) machines. For other architectures, or to get the latest features, refer to the README.md and CONTRIBUTING.md files for advanced installation instructions.

    Environments

    TorchRL currently supports gym and dm_control out-of-the-box. To create a gym-wrapped environment, use:

    import gym
    from torchrl.envs import GymEnv, GymWrapper

    # build the environment from its gym name
    env = GymEnv("Pendulum-v1")
    # or, equivalently, wrap an existing gym environment
    env = GymWrapper(gym.make("Pendulum-v1"))
    

    Environments can be transformed using the torchrl.envs.transforms module; see the environment tutorial for more information. ParallelEnv runs multiple environments in parallel.
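
    The idea behind a transformed environment can be sketched without TorchRL itself: a wrapper applies an ordered list of transforms to whatever the base environment returns. The snippet below is a pure-Python illustration only. ToyEnv, ToyTransformedEnv, scale_obs and clip_obs are hypothetical names and are not part of the torchrl.envs API.

```python
from typing import Callable, Dict, List

Obs = Dict[str, float]


class ToyEnv:
    """Hypothetical stand-in for an environment returning dict observations."""

    def reset(self) -> Obs:
        return {"observation": 10.0}


def scale_obs(data: Obs) -> Obs:
    # Rescale the observation by a fixed factor.
    return {**data, "observation": data["observation"] * 0.5}


def clip_obs(data: Obs) -> Obs:
    # Clamp the observation to the range [-1, 1].
    return {**data, "observation": max(-1.0, min(1.0, data["observation"]))}


class ToyTransformedEnv:
    """Applies each transform, in order, to the output of the wrapped env."""

    def __init__(self, env: ToyEnv, transforms: List[Callable[[Obs], Obs]]):
        self.env = env
        self.transforms = list(transforms)

    def reset(self) -> Obs:
        data = self.env.reset()
        for transform in self.transforms:
            data = transform(data)
        return data


env = ToyTransformedEnv(ToyEnv(), [scale_obs, clip_obs])
result = env.reset()
```

    Order matters here: the observation is scaled first and clipped second, so swapping the two transforms would generally give a different result.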

    Policy and modules

    TorchRL modules interact through TensorDict, a new data carrier class. Although using it is not mandatory and workarounds exist, we advise using the TensorDictModule class to read tensordicts:

    >>> from torch import nn
    >>> from torchrl.modules import TensorDictModule
    >>> n_obs, n_act = 3, 1  # Pendulum-v1 observation and action sizes
    >>> policy_module = nn.Linear(n_obs, n_act)
    >>> policy = TensorDictModule(
    ...     policy_module,
    ...     in_keys=["observation"],  # keys to be read for the module input
    ...     out_keys=["action"],  # keys to be written with the module output
    ... )
    >>> tensordict = env.reset()  # env as created in the Environments section
    >>> tensordict = policy(tensordict)
    >>> action = tensordict["action"]
    

    By using TensorDict and TensorDictModule, you can make sure that your algorithm is robust to changes in configuration (e.g. using an RNN for the policy, or swapping exploration strategies). TensorDict instances can be reshaped in several ways, cast to a device, updated, shared among processes, stacked, concatenated, etc.
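
    To see why key-based routing makes calling code robust to configuration changes, consider the following pure-Python sketch. It uses a plain dict in place of a TensorDict, and ToyKeyedModule, linear_policy, recurrent_policy and act are hypothetical names chosen for illustration; this is not how TorchRL implements TensorDictModule.

```python
from typing import Callable, Dict, List


class ToyKeyedModule:
    """Hypothetical wrapper: reads in_keys from a dict, writes out_keys back."""

    def __init__(self, fn: Callable, in_keys: List[str], out_keys: List[str]):
        self.fn = fn
        self.in_keys = in_keys
        self.out_keys = out_keys

    def __call__(self, data: Dict[str, object]) -> Dict[str, object]:
        inputs = [data[key] for key in self.in_keys]
        outputs = self.fn(*inputs)
        if len(self.out_keys) == 1:
            # A single output is wrapped so it can be zipped with out_keys.
            outputs = (outputs,)
        for key, value in zip(self.out_keys, outputs):
            data[key] = value
        return data


def linear_policy(observation):
    # Stateless policy: needs only the observation.
    return [-0.5 * x for x in observation]


def recurrent_policy(observation, hidden):
    # Stateful policy: also reads and updates a hidden state.
    new_hidden = [0.9 * h + 0.1 * x for h, x in zip(hidden, observation)]
    return [-h for h in new_hidden], new_hidden


policy_a = ToyKeyedModule(linear_policy, ["observation"], ["action"])
policy_b = ToyKeyedModule(
    recurrent_policy, ["observation", "hidden"], ["action", "hidden"]
)


def act(policy, data):
    # The calling code is identical for both policies.
    return policy(data)["action"]


action_a = act(policy_a, {"observation": [1.0, 2.0]})
action_b = act(policy_b, {"observation": [1.0, 2.0], "hidden": [0.0, 0.0]})
```

    The act function never needs to know which inputs a policy consumes; the extra hidden state simply travels along in the same container.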

    Some specialized TensorDictModule classes are implemented for convenience: Actor, ProbabilisticActor, ValueOperator, ActorCriticOperator, ActorCriticWrapper and QValueActor can be found in actors.py.

    Collecting data

    DataCollectors are TorchRL's family of data-loading classes. We provide single-process, synchronous and asynchronous multiprocess loaders. We also provide ReplayBuffers that can be stored in memory or on disk using the various storage options.
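
    As a rough illustration of what a replay buffer does, the sketch below implements a fixed-capacity circular buffer with uniform sampling in pure Python. ToyReplayBuffer is a hypothetical class written for this example; it does not reproduce TorchRL's ReplayBuffer API or its storage options.

```python
import random
from typing import Any, List


class ToyReplayBuffer:
    """Hypothetical fixed-capacity circular buffer with uniform sampling."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.storage: List[Any] = []
        self.cursor = 0  # index of the next slot to write

    def add(self, item: Any) -> None:
        if len(self.storage) < self.capacity:
            self.storage.append(item)
        else:
            # Overwrite the oldest entry once the buffer is full.
            self.storage[self.cursor] = item
        self.cursor = (self.cursor + 1) % self.capacity

    def sample(self, batch_size: int, rng: random.Random) -> List[Any]:
        # Uniform sampling with replacement.
        return [rng.choice(self.storage) for _ in range(batch_size)]

    def __len__(self) -> int:
        return len(self.storage)


buffer = ToyReplayBuffer(capacity=3)
for step in range(5):
    buffer.add({"step": step})

batch = buffer.sample(batch_size=4, rng=random.Random(0))
```

    After five insertions into a buffer of capacity three, only the three most recent transitions remain available for sampling.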

    Loss modules and advantage computation

    Loss modules are provided for each algorithm class independently. They are accompanied by efficient implementations of value and advantage computation functions. TorchRL aims to be fully compatible with functorch, the functional programming library for PyTorch.
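
    For readers unfamiliar with advantage computation, here is a minimal pure-Python sketch of Generalized Advantage Estimation (GAE), one common advantage estimator. The function name gae and its signature are assumptions made for this example; TorchRL's own functionals operate on tensors and may differ in interface and conventions.

```python
from typing import List, Sequence


def gae(
    rewards: Sequence[float],
    values: Sequence[float],
    next_values: Sequence[float],
    dones: Sequence[bool],
    gamma: float = 0.99,
    lmbda: float = 0.95,
) -> List[float]:
    """Compute GAE advantages for a single trajectory, iterating backwards.

    delta_t = r_t + gamma * V(s_{t+1}) * (1 - done_t) - V(s_t)
    A_t     = delta_t + gamma * lmbda * (1 - done_t) * A_{t+1}
    """
    advantages = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        not_done = 0.0 if dones[t] else 1.0
        delta = rewards[t] + gamma * next_values[t] * not_done - values[t]
        running = delta + gamma * lmbda * not_done * running
        advantages[t] = running
    return advantages


advantages = gae(
    rewards=[1.0, 1.0, 1.0],
    values=[0.0, 0.0, 0.0],
    next_values=[0.0, 0.0, 0.0],
    dones=[False, False, True],
    gamma=0.5,
    lmbda=1.0,
)
```

    With lmbda set to 1 and all value estimates at zero, the advantages reduce to discounted sums of future rewards, which makes the recursion easy to check by hand.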

    Examples

    A number of examples are provided as well. Check the examples directory to learn more about exploration strategies, loss modules, etc.

Owner
Meta Research