A modular, primitive-first, python-first PyTorch library for Reinforcement Learning.

Meta Research

Last update: Jan 7, 2023

Overview

TorchRL

Disclaimer

This library is not officially released yet and is subject to change.

The features are available before an official release so that users and collaborators can get early access and provide feedback. No guarantee of stability, robustness or backward compatibility is provided.

TorchRL is an open-source Reinforcement Learning (RL) library for PyTorch.

It provides pytorch and python-first, low and high level abstractions for RL that are intended to be efficient, documented and properly tested. The code is aimed at supporting research in RL. Most of it is written in python in a highly modular way, such that researchers can easily swap components, transform them or write new ones with little effort.

This repo attempts to align with the existing pytorch ecosystem libraries in that it has a dataset pillar (torchrl/envs), transforms, models, data utilities (e.g. collectors and containers), etc. TorchRL aims at having as few dependencies as possible (python standard library, numpy and pytorch). Common environment libraries (e.g. OpenAI gym) are only optional.

On the low-level end, torchrl comes with a set of highly re-usable functionals for cost functions, returns and data processing.

On the high-level end, torchrl provides:

multiprocess data collectors;
a generic agent class;
efficient and generic replay buffers;
TensorDict, a convenient data structure to pass data from one object to another without friction;
An associated TDModule class which is functorch-compatible!
interfaces for environments from common libraries (OpenAI gym, deepmind control suite, etc.) and wrappers for parallel execution, as well as a new pytorch-first family of tensor-specification classes;
environment transforms, which process and prepare the data coming out of the environments to be used by the agent;
various tools for distributed learning (e.g. memory mapped tensors);
various architectures and models (e.g. actor-critic);
exploration wrappers;
various recipes to build models that correspond to the environment being deployed.

A series of examples are provided with an illustrative purpose:

and many more to come!

Installation

Create a conda environment where the packages will be installed. Before installing anything, make sure you have the latest version of cmake and ninja libraries:

conda create --name torch_rl python=3.9
conda activate torch_rl
conda install cmake -c conda-forge
pip install ninja

Depending on how you intend to use functorch, you may want to install either the latest (nightly) pytorch release or the latest stable version of pytorch:

Stable

conda install pytorch torchvision cudatoolkit=10.2 -c pytorch  # refer to pytorch official website for cudatoolkit installation
pip install functorch

Nightly

# For CUDA 10.2
pip install --pre torch torchvision -f https://download.pytorch.org/whl/nightly/cu102/torch_nightly.html --upgrade
# For CUDA 11.1
pip install --pre torch torchvision -f https://download.pytorch.org/whl/nightly/cu111/torch_nightly.html --upgrade
# For CPU-only build
pip install --pre torch torchvision -f https://download.pytorch.org/whl/nightly/cpu/torch_nightly.html --upgrade

and functorch

pip install "git+https://github.com/pytorch/functorch.git"

Torchrl

Go to the directory where you have cloned the torchrl repo and install it

cd /path/to/torchrl/
python setup.py install

To run a quick sanity check, leave that directory and try to import the library.

python -c "import torchrl"

Optional dependencies

The following libraries can be installed depending on the usage one wants to make of torchrl:

# diverse
pip install tqdm pyyaml configargparse

# rendering
pip install moviepy

# deepmind control suite
pip install dm_control 

# gym, atari games
pip install gym gym[accept-rom-license] pygame gym_retro

# tests
pip install pytest

Alternatively, extra dependencies can be installed using

pip install ".[atari,dm_control,gym_continuous,rendering,tests,utils]"

or a selection of these.

Troubleshooting

If a ModuleNotFoundError: No module named 'torchrl._torchrl' error occurs, it means that the C++ extensions were not installed or not found. One common reason might be that you are trying to import torchrl from within the git repo location. Indeed, the following code snippet should return an error if torchrl has not been installed in develop mode:

cd ~/path/to/rl/repo
python -c 'from torchrl.envs import GymEnv'

If this is the case, consider executing torchrl from another location.

This may also be caused by several dependency issues: cmake, gcc or ninja versioning, or absence of the CuDNN library when working in a CUDA environment.
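To tell these causes apart, it can help to check where Python would load torchrl from. The following is a minimal sketch using only the standard library; the helper names locate_module and shadowed_by_cwd are made up for this illustration and are not part of torchrl.

```python
import importlib.util
import os


def locate_module(name):
    """Return the origin a module would be loaded from, or None if it is not found."""
    try:
        spec = importlib.util.find_spec(name)
    except ImportError:
        # Raised when a parent package of a dotted name is missing or fails to import.
        return None
    return None if spec is None else spec.origin


def shadowed_by_cwd(name):
    """Return True if the module resolves to a file inside the current directory."""
    origin = locate_module(name)
    if origin is None or not os.path.isabs(origin):
        return False
    cwd = os.path.realpath(os.getcwd())
    return os.path.realpath(origin).startswith(cwd + os.sep)


print("torchrl resolves to:", locate_module("torchrl"))
print("C++ extension resolves to:", locate_module("torchrl._torchrl"))
print("loaded from the current directory:", shadowed_by_cwd("torchrl"))
```

If the last line prints True while the extension resolves to None, the import is most likely picking up the source tree rather than the installed package.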

On macOS, we recommend installing Xcode first. With Apple Silicon M1 chips, make sure you are using the arm64-built python. Running the following lines of code

wget https://raw.githubusercontent.com/pytorch/pytorch/master/torch/utils/collect_env.py
python collect_env.py

should display

OS: macOS *** (arm64)

and not

OS: macOS **** (x86_64)
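A quicker check that does not require downloading a script is to ask the interpreter itself. This is a small sketch based on the standard platform module; is_arm64_build is a hypothetical helper name, and the assumption is that an x86_64 python running under emulation reports x86_64 here.

```python
import platform


def is_arm64_build(machine=None):
    """Return True when the reported machine type is an arm64 variant."""
    machine = platform.machine() if machine is None else machine
    return machine.lower() in ("arm64", "aarch64")


print("Python reports architecture:", platform.machine())
print("arm64 build:", is_arm64_build())
```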

Running examples

Examples are coded in a very similar way, but the configuration may change from one algorithm to another (e.g. async/sync data collection, hyperparameters, ratio of model updates per frame, etc.). To train an algorithm, it is therefore advised to use the predefined configurations found in the configs sub-folder of each algorithm directory:

python examples/ppo/ppo.py --config=examples/ppo/configs/humanoid.txt

Note that using the config files requires the configargparse library.

One can also overwrite the config parameters using flags, e.g.

python examples/ppo/ppo.py --config=examples/ppo/configs/humanoid.txt --frame_skip=2 --collection_devices=cuda:1
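The precedence at play (flags override config values, which override built-in defaults) can be sketched with the standard argparse module. This is not how the examples are implemented (they rely on configargparse); the default values and the config dictionary below are invented for illustration.

```python
import argparse


def parse_with_config(config, argv):
    """Parse argv, using `config` values as defaults that flags can override."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--frame_skip", type=int, default=1)
    parser.add_argument("--collection_devices", type=str, default="cpu")
    # Values read from the config replace the hard-coded defaults...
    parser.set_defaults(**config)
    # ...and flags given explicitly on the command line replace both.
    return parser.parse_args(argv)


# Hypothetical content of a config file, already parsed into a dict.
config = {"frame_skip": 4, "collection_devices": "cuda:0"}
args = parse_with_config(config, ["--frame_skip=2"])
print(args.frame_skip, args.collection_devices)  # 2 cuda:0
```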

Each example will write a tensorboard log in a dedicated folder, e.g. ppo_logging/....

Contributing

Internal collaborations to torchrl are welcome! Feel free to fork, submit issues and PRs.

Upcoming features

In the near future, we plan to:

provide tutorials on how to design new actors or environment wrappers;
implement IMPALA (as a distributed RL example) and Meta-RL algorithms;
improve the tests, documentation and nomenclature.

License

TorchRL is licensed under the MIT License. See LICENSE for details.

Comments

[BUG] Stacked tensordicts with nested keys crash `env.step()`

Describe the bug

Due to multi-agent applications, my env needs to return tensordict_out = torch.stack(agent_tds, dim=0) from env._step() with nested keys. This creates a LazyStackedTensorDict.

The subsequent logic of env.step() performs certain operations on tensordict_out, which crash if the latter has nested keys.

This can be solved by calling to_tensordict() before the end of _step(), but this is possible only when the stack is homogeneous and not when it is heterogeneous, as in #766.

To Reproduce

Create LazyStackedTensorDict and return it in the _step implementation

test/test_libs.py:525 (TestVmas.test_vmas_seeding[flocking])
Traceback (most recent call last):
  File "/Users/Matteo/PycharmProjects/torchrl/test/test_libs.py", line 537, in test_vmas_seeding
    tdrollout.append(env.rollout(max_steps=10))
  File "/Users/Matteo/PycharmProjects/torchrl/torchrl/envs/common.py", line 605, in rollout
    tensordict = self.step(tensordict)
  File "/Users/Matteo/PycharmProjects/torchrl/torchrl/envs/common.py", line 343, in step
    tensordict_out_select = tensordict_out.select(*obs_keys)
  File "/Users/Matteo/PycharmProjects/tensordict/tensordict/tensordict.py", line 4325, in select
    raise TypeError(
TypeError: All keys passed to LazyStackedTensorDict.select must be strings. Found ('info', 'velocity_rew') of type <class 'tuple'>. Note that LazyStackedTensorDict does not yet support nested keys.

bug

opened by matteobettini 22

error: can't copy 'build/lib.linux-x86_64-3.9/torchrl/_torchrl.so': doesn't exist or not a regular file

When trying to install the package using "pip install -e ." or "python setup.py develop", I encounter this error. OS: Ubuntu 22.04 LTS (x86_64) Python version: 3.9.12
help wanted

opened by bkpcoding 13

[BUG] GymWrapper does not work with nested observation gym.spaces.Dict

Describe the bug

Hi All,

First of all: thanks for the great work here!

I think I have encountered a bug in the GymWrapper in torchrl.envs.libs.gym.GymWrapper. When I use a gym.Env whose observation space contains nested gym.spaces.Dict, a KeyError is thrown because the GymLikeEnv.read_obs() function only adds "next_" to the first level of the Dict but not to nested sub-Dicts:

observations = {"next_" + key: value for key, value in observations.items()}

Since _gym_to_torchrl_spec_transform() in torchrl.envs.libs.gym adds "next_" in a recursive call to all sub-Dicts, the key is missing the necessary "next_". Nested Dict observation spaces are often used (https://www.gymlibrary.dev/api/spaces/#dict), so I guess this is required to work properly.

To Reproduce

#!/usr/bin/env python
from torchrl.envs.libs.gym import GymWrapper
from gym import spaces, Env
import numpy as np


class CustomGym(Env):
    def __init__(self):
        self.action_space = spaces.Discrete(5)
        self.observation_space = spaces.Dict(
            {
                'sensor_1': spaces.Box(low=0, high=255, shape=(5, 5, 3), dtype=np.uint8),
                'sensor_2': spaces.Box(low=0, high=255, shape=(5, 5, 3), dtype=np.uint8),
                'sensor_3': spaces.Box(np.array([-2, -1, -5, 0]), np.array([2, 1, 30, 1]), dtype=np.float32),
                'sensor_4': spaces.Dict({'sensor_41': spaces.Box(low=0, high=100, shape=(1,), dtype=np.float32),
                                         'sensor_42': spaces.Box(low=0, high=100, shape=(1,), dtype=np.float32),
                                         'sensor_43': spaces.Box(low=0, high=100, shape=(1,), dtype=np.float32)})
            }
        )

    def reset(self):
        return self.observation_space.sample()


if __name__ == '__main__':
    env = CustomGym()
    env = GymWrapper(env)

Reason and Possible fixes

The issue can be fixed by adding a recursive function call to rename also nested observation space Dicts in GymLikeEnv.read_obs() correctly by adding "next_":


    def read_obs(
        self, observations: Union[Dict[str, Any], torch.Tensor, np.ndarray]
    ) -> Dict[str, Any]:
        """Reads an observation from the environment and returns an observation compatible with the output TensorDict.

        Args:
            observations (observation under a format dictated by the inner env): observation to be read.

        """
        if isinstance(observations, dict):

            def rename(obs):
                return {
                    "next_" + key: rename(value) if isinstance(value, dict) else value
                    for key, value in obs.items()
                }

            observations = rename(observations)
        if not isinstance(observations, (TensorDict, dict)):
            key = list(self.observation_spec.keys())[0]
            observations = {key: observations}
        observations = self.observation_spec.encode(observations)
        return observations

The style checker forbids lambda functions; otherwise the fix could be as simple as

             rename = lambda obs: {
                "next_" + key: rename(value) if isinstance(value, dict) else value
                for key, value in obs.items()
             }
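To see the difference between the two renaming strategies in isolation, here is a standalone sketch on a plain nested dict. It does not touch GymWrapper or any spec; add_next_prefix is just the recursive helper from above under a hypothetical name, and the sample observation is made up.

```python
def add_next_prefix(obs):
    """Recursively prefix every key of a (possibly nested) dict with "next_"."""
    return {
        "next_" + key: add_next_prefix(value) if isinstance(value, dict) else value
        for key, value in obs.items()
    }


obs = {"sensor_1": 0, "sensor_4": {"sensor_41": 1.0, "sensor_42": 2.0}}

# First-level renaming only: nested keys keep their original names.
shallow = {"next_" + key: value for key, value in obs.items()}
print(shallow)

# Recursive renaming: nested keys are prefixed as well.
print(add_next_prefix(obs))
```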

Checklist

[x] I have checked that there is no similar issue in the repo (required)
[x] I have read the documentation (required)
[x] I have provided a minimal working example to reproduce the bug (required)

bug

opened by raphajaner 12

[BugFix] SyncDataCollector init when device and env_device are different
Description

In the init method of the SyncDataCollector class, a small number of steps is taken with the policy to determine the relevant keys of the output TensorDict. When the policy device and the environment device are different, that can raise a RuntimeError since the input provided to the policy is located in the environment device.

This PR only makes sure that the TensorDict provided to the policy is in the policy device, and then moves the output TensorDict to the environment device again.

Types of changes

What types of changes does your code introduce? Remove all that do not apply:

[x] Bug fix (non-breaking change which fixes an issue)

[ ] New feature (non-breaking change which adds core functionality)

[ ] Breaking change (fix or feature that would cause existing functionality to change)

[ ] Documentation (update in the documentation)

[ ] Example (update in the folder of examples)

Checklist

Go over all the following points, and put an x in all the boxes that apply. If you are unsure about any of these, don't hesitate to ask. We are here to help!

[x] I have read the CONTRIBUTION guide (required)

[ ] My change requires a change to the documentation.

[ ] I have updated the tests accordingly (required for a bug fix or a new feature).

[ ] I have updated the documentation accordingly.

CLA Signed
opened by albertbou92 11

[BUG] TypeError: __init__() got an unexpected keyword argument 'disable_env_checker'

I used your library on google colab and it works without a problem. However, after spending a day to install gym in ubuntu using conda and run torchrl code, I am getting this error:

>>> from torchrl.envs.libs.gym import _has_gym, GymEnv, GymWrapper
>>> env_torchrl = GymEnv("InvertedPendulum-v2")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/anaconda3/lib/python3.9/site-packages/torchrl/envs/libs/gym.py", line 282, in __init__
    super().__init__(**kwargs)
  File "/home/anaconda3/lib/python3.9/site-packages/torchrl/envs/libs/gym.py", line 149, in __init__
    super().__init__(**kwargs)
  File "/home/anaconda3/lib/python3.9/site-packages/torchrl/envs/common.py", line 717, in __init__
    self._env = self._build_env(**kwargs)  # writes the self._env attribute
  File "/home/anaconda3/lib/python3.9/site-packages/torchrl/envs/libs/gym.py", line 327, in _build_env
    raise err
  File "/home/anaconda3/lib/python3.9/site-packages/torchrl/envs/libs/gym.py", line 313, in _build_env
    env = self.lib.make(env_name, **kwargs)
  File "/home/anaconda3/lib/python3.9/site-packages/gym/envs/registration.py", line 235, in make
    return registry.make(id, **kwargs)
  File "/home/anaconda3/lib/python3.9/site-packages/gym/envs/registration.py", line 129, in make
    env = spec.make(**kwargs)
  File "/home/anaconda3/lib/python3.9/site-packages/gym/envs/registration.py", line 90, in make
    env = cls(**_kwargs)
TypeError: __init__() got an unexpected keyword argument 'disable_env_checker'

Why did this error occur?

bug

opened by neuronphysics 11

[Feature] RewardSum transform
Description

Adds a new Transform class, called RewardSum, which tracks the cumulative reward of all episodes in progress and adds the information to the tensordict as a new key.

Motivation and Context

It can be informative to be able to access the training episode rewards.

e.g. it can be used like this to track the performance during training

for batch in collector:
    train_episode_reward = batch["episode_reward"][batch["done"]]
    if train_episode_reward.numel() > 0:
        print(f"train_episode_rewards {train_episode_reward.mean()}")
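The bookkeeping behind a cumulative episode reward can be sketched without tensors. The class below is a plain-Python illustration under assumed names (EpisodeRewardTracker, update), not the RewardSum implementation: it keeps one running total per environment and clears it once that environment reports done.

```python
class EpisodeRewardTracker:
    """Keeps one running reward total per environment."""

    def __init__(self, num_envs):
        self.totals = [0.0] * num_envs

    def update(self, rewards, dones):
        """Add the step rewards and return the running totals for this step.

        A total is reset right after it has been reported for a finished episode.
        """
        reported = []
        for i, (reward, done) in enumerate(zip(rewards, dones)):
            self.totals[i] += reward
            reported.append(self.totals[i])
            if done:
                self.totals[i] = 0.0
        return reported


tracker = EpisodeRewardTracker(num_envs=2)
print(tracker.update([1.0, 2.0], [False, False]))  # [1.0, 2.0]
print(tracker.update([1.0, 1.0], [True, False]))   # [2.0, 3.0]
print(tracker.update([5.0, 1.0], [False, False]))  # [5.0, 4.0]
```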

Types of changes

What types of changes does your code introduce? Remove all that do not apply:

[ ] Bug fix (non-breaking change which fixes an issue)

[x] New feature (non-breaking change which adds core functionality)

[ ] Breaking change (fix or feature that would cause existing functionality to change)

[ ] Documentation (update in the documentation)

[ ] Example (update in the folder of examples)

Checklist

Go over all the following points, and put an x in all the boxes that apply. If you are unsure about any of these, don't hesitate to ask. We are here to help!

[ ] I have read the CONTRIBUTION guide (required)

[ ] My change requires a change to the documentation.

[ ] I have updated the tests accordingly (required for a bug fix or a new feature).

[ ] I have updated the documentation accordingly.

CLA Signed
opened by albertbou92 8
[Feature]: `TensorDictPrimer` transform
Description

This PR adds a ForceTensorReset transform.

Motivation and Context

ForceTensorReset allows setting or resetting given vectors to default values.

Types of changes

What types of changes does your code introduce? Remove all that do not apply:

[x] New feature (non-breaking change which adds core functionality)

Checklist

Go over all the following points, and put an x in all the boxes that apply. If you are unsure about any of these, don't hesitate to ask. We are here to help!

[x] I have read the CONTRIBUTION guide (required)

[x] My change requires a change to the documentation.

[x] I have updated the tests accordingly (required for a bug fix or a new feature).

[x] I have updated the documentation accordingly.

enhancement CLA Signed
opened by nicolas-dufour 8
[Logging]: implement MLFlow logging integration
Description

This diff enables torchrl to seamlessly use the MLFlow Tracking API through its internal Logger API. These changes are consistent with previous integrations such as the one with W&B.

A test suite for the new integration has been included in the diff.

Motivation and Context

close #395

Types of changes

What types of changes does your code introduce? Remove all that do not apply:

[ ] Bug fix (non-breaking change which fixes an issue)

[x] New feature (non-breaking change which adds core functionality)

[ ] Breaking change (fix or feature that would cause existing functionality to change)

[ ] Documentation (update in the documentation)

[ ] Example (update in the folder of examples)

Checklist

Go over all the following points, and put an x in all the boxes that apply. If you are unsure about any of these, don't hesitate to ask. We are here to help!

[x] I have read the CONTRIBUTION guide (required)

[ ] My change requires a change to the documentation.

[x] I have updated the tests accordingly (required for a bug fix or a new feature).

[ ] I have updated the documentation accordingly.

enhancement CLA Signed
opened by rayanht 8
[Doc] Added TensorDict tutorial

Tutorial on TensorDict and TensorDictModule. Contains simple examples of TensorDict and TensorDictModule operations. Also contains an implementation of a Transformer model using TensorDict and TensorDictModule to showcase how these modules work.
documentation CLA Signed

opened by nicolas-dufour 8
Add support for null `dim` argument in `TensorDict.squeeze`
Description

I began adding support for a null dim argument in TensorDict.squeeze. However, this code is still buggy as TensorDict.unsqueeze is called during the process of creating a SqueezedTensorDict, and unsqueeze should not support a null dim argument.
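The shape rule being targeted can be written down independently of TensorDict. The function below is a sketch of the expected batch-size arithmetic only, assuming the same semantics as torch.squeeze; squeezed_shape is a hypothetical name.

```python
def squeezed_shape(shape, dim=None):
    """Return the shape obtained by squeezing `shape`.

    With dim=None every size-1 dimension is removed; with an integer dim,
    only that dimension is removed, and only if its size is 1.
    """
    shape = tuple(shape)
    if dim is None:
        return tuple(size for size in shape if size != 1)
    if dim < 0:
        dim += len(shape)
    if not 0 <= dim < len(shape):
        raise IndexError("dim out of range for the given shape")
    if shape[dim] != 1:
        return shape
    return shape[:dim] + shape[dim + 1:]


print(squeezed_shape((3, 1, 4, 1)))      # (3, 4)
print(squeezed_shape((3, 1, 4, 1), 1))   # (3, 4, 1)
print(squeezed_shape((3, 1, 4, 1), 0))   # (3, 1, 4, 1)
```

The inverse operation has no such default: unsqueeze needs to know where to insert the new dimension, which is why a null dim cannot be forwarded to it.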

Motivation and Context

This resolves #592.

[x] I have raised an issue to propose this change (required for new features and bug fixes)

Types of changes

What types of changes does your code introduce? Remove all that do not apply:

[x] Bug fix (non-breaking change which fixes an issue)

[ ] New feature (non-breaking change which adds core functionality)

[ ] Breaking change (fix or feature that would cause existing functionality to change)

[ ] Documentation (update in the documentation)

[ ] Example (update in the folder of examples)

Checklist

Go over all the following points, and put an x in all the boxes that apply. If you are unsure about any of these, don't hesitate to ask. We are here to help!

[x] I have read the CONTRIBUTION guide (required)

[x] My change requires a change to the documentation.

[x] I have updated the tests accordingly (required for a bug fix or a new feature).

[x] I have updated the documentation accordingly.

enhancement CLA Signed
opened by jgonik 7
[Feature] lock tensordict when calling `share_memory_()`
Description

Added a lock=True flag to both the share_memory_ and memmap_ methods of the TensorDict classes. When a tensordict is locked, the set method only allows a key to be updated if the key already exists and inplace is true.

This required a couple changes to existing code/tests:

Added lock=False to existing tests to let them continue to pass.

Removed is_locked checks from set_, set_at_, and _stack_onto_ methods due to the issue #125 stating that these methods should ignore the lock.
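The locking rule can be illustrated on a plain dictionary wrapper. This is a sketch under assumed names (LockableDict, lock), not the TensorDict code: once locked, set only succeeds for an existing key with inplace=True.

```python
class LockableDict:
    """Dictionary wrapper whose key set can be frozen."""

    def __init__(self):
        self._data = {}
        self.is_locked = False

    def lock(self):
        self.is_locked = True
        return self

    def set(self, key, value, inplace=False):
        # When locked, only in-place updates of existing keys are allowed.
        if self.is_locked and not (inplace and key in self._data):
            raise RuntimeError(f"cannot set key {key!r} on a locked container")
        self._data[key] = value
        return self

    def get(self, key):
        return self._data[key]


d = LockableDict().set("a", 1).lock()
d.set("a", 2, inplace=True)   # allowed: existing key, in-place update
try:
    d.set("b", 3, inplace=True)
except RuntimeError as err:
    print(err)                 # new keys are rejected once locked
```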

Motivation and Context

Close #125

Types of changes

What types of changes does your code introduce? Remove all that do not apply:

[X] New feature (non-breaking change which adds core functionality)

Checklist

Go over all the following points, and put an x in all the boxes that apply. If you are unsure about any of these, don't hesitate to ask. We are here to help!

[X] I have read the CONTRIBUTION guide (required)

[ ] My change requires a change to the documentation.

[X] I have updated the tests accordingly (required for a bug fix or a new feature).

[ ] I have updated the documentation accordingly.

enhancement CLA Signed
opened by fdabek1 7
[BUG] flaky tests
A list of flaky tests to fix

Describe the bug

FAILED test/test_env.py::TestParallel::test_parallel_env_transform_consistency[device1-1-Pendulum-v1] - AssertionError: key observation does not match, got mse = 0.0618

bug Good first issue
opened by vmoens 0
[BugFix] [Feature] "_reset" flag for env reset

Description

This PR addresses issue #790.

The changes replace the "reset_workers" flag (only designed for ParallelEnvs wrapping environments with empty batch_size) with the "_reset" flag, which spans all batch_size dimensions.

This makes it possible to tell the wrapped environments more precisely which dimensions need to be reset.

Accordingly, the reset() methods of EnvBase and ParallelEnv now only check that the indices flagged for reset are not done, instead of asserting not done.any().
CLA Signed

opened by matteobettini 1
[Feature Request] TensorSpec is_in methods should check the dtype of val
Motivation

As mentioned in #783, we should check that is_in always checks the dtype.

Solution

Add self.dtype == val.dtype condition in the is_in methods. Should be very simple to fix, but we need to make sure that all tests pass.
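As a rough illustration of the proposed condition, the sketch below uses Python scalar types in place of tensor dtypes. BoundedSpecSketch is a made-up stand-in, not a TorchRL spec class; the point is only that membership requires both a matching dtype and a value within bounds.

```python
from dataclasses import dataclass


@dataclass
class BoundedSpecSketch:
    """Toy bounded spec: a value belongs to it if type and range both match."""

    low: float
    high: float
    dtype: type

    def is_in(self, val):
        # The dtype condition comes first, mirroring `self.dtype == val.dtype`.
        return type(val) is self.dtype and self.low <= val <= self.high


spec = BoundedSpecSketch(low=0, high=10, dtype=int)
print(spec.is_in(5))     # True: right type, within bounds
print(spec.is_in(5.0))   # False: within bounds but wrong type
print(spec.is_in(11))    # False: right type but out of bounds
```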

Checklist

[X] I have checked that there is no similar issue in the repo (required)

enhancement Good first issue
opened by riiswa 2
[Feature Request] `in_keys` and `out_keys` for loss modules

Motivation

Loss modules should have in_keys and out_keys attributes. To get this, we could simply use convert_to_functional: this method could stack together the in_keys and out_keys of all the input modules.

Another option is to make this a property that looks for all the children of the loss module, and computes the in_keys and out_keys on the fly.

The first solution will miss all the cases where the users don't use convert_to_functional, which is not mandatory. The second is slightly more expensive.
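The second option could look roughly like the sketch below, written with plain Python stand-ins rather than nn.Module. KeyedModule and LossSketch are hypothetical names; the properties recompute the key lists from the children on every access, which is where the extra cost comes from.

```python
class KeyedModule:
    """Stand-in for a module that reads `in_keys` and writes `out_keys`."""

    def __init__(self, in_keys, out_keys):
        self.in_keys = list(in_keys)
        self.out_keys = list(out_keys)


def _unique(keys):
    """Drop duplicates while keeping first-seen order."""
    return list(dict.fromkeys(keys))


class LossSketch:
    """Collects the keys of its child modules on the fly."""

    def __init__(self, *modules):
        self.modules = list(modules)

    @property
    def in_keys(self):
        return _unique(k for m in self.modules for k in m.in_keys)

    @property
    def out_keys(self):
        return _unique(k for m in self.modules for k in m.out_keys)


actor = KeyedModule(["observation"], ["action"])
critic = KeyedModule(["observation", "action"], ["state_action_value"])
loss = LossSketch(actor, critic)
print(loss.in_keys)   # ['observation', 'action']
print(loss.out_keys)  # ['action', 'state_action_value']
```

Because nothing is cached, a change to a child's keys is visible immediately, with no dependence on convert_to_functional having been called.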

Going the extra mile

We could also decorate the forward of the losses with tensordict.nn.dispatch_kwargs, allowing the users to use the losses without using tensordict.
enhancement Good first issue

opened by vmoens 0
[BUG] Resetting environments with non-empty batch_size
Describe the bug

When resetting an environment with non-empty batch size, there is currently no unified way to specify which dimensions to reset.

ParallelEnv has a reset key called "reset_workers" which is used to choose which workers to reset. The use of this unidimensional key makes ParallelEnv crash when env.batch_size is not empty.

This is what happens:

def _reset(self, tensordict: TensorDictBase, **kwargs) -> TensorDictBase:
    cmd_out = "reset"
    if tensordict is not None and "reset_workers" in tensordict.keys():
        # First assert that the key has the same batch size as the env, let's say [10,4,5,2]
        self._assert_tensordict_shape(tensordict)
        reset_workers = tensordict.get("reset_workers")
    else:
        # If not create one (without respecting the batch size)
        reset_workers = torch.ones(self.num_workers, dtype=torch.bool)
    for i, channel in enumerate(self.parent_channels):
        # If run on dimension 0 !! This only works if the else branch is taken or env.batch_size is empty
        if not reset_workers[i]:
            continue
        # Do not even pass the tensordict to the env
        channel.send((cmd_out, kwargs))

The problems just in this snippet are:

This only works if the else branch is taken or env.batch_size is empty

it does not pass the reset tensordict to the env

This is paired with a series of problems in the various reset functions (ParallelEnv and EnvBase) where after reset done.any() is called. This is highly problematic as any spans over all dimensions.

Proposed changes to the API

Remove "reset_workers" (which already doesn't work)

Introduce the possibility of having a "reset" key in the tensordict given as parameter to the reset functions. This reset key has shape (n_parallel_envs, *env.batch_size) in the more general case of ParallelEnv, and is a boolean tensor telling precisely which dimensions to reset. ParallelEnv can then call reset[worker_id].any() to decide whether to pass the reset command and key to the worker.

In the reset function, do not check done.any() but check that at least the dimensions requested for reset have done=False. The absence of the "reset" key means resetting all dimensions.
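The proposed logic can be sketched with nested Python lists standing in for boolean tensors. The helper names (any_true, workers_to_reset, reset_is_consistent) are invented for this illustration and do not exist in TorchRL.

```python
def any_true(mask):
    """Equivalent of mask.any() for arbitrarily nested lists of booleans."""
    if isinstance(mask, (list, tuple)):
        return any(any_true(item) for item in mask)
    return bool(mask)


def workers_to_reset(reset):
    """Indices of the workers that have at least one entry flagged for reset."""
    return [i for i, worker_mask in enumerate(reset) if any_true(worker_mask)]


def reset_is_consistent(done, reset):
    """True if no entry flagged for reset is still done after the reset."""
    if isinstance(reset, (list, tuple)):
        return all(reset_is_consistent(d, r) for d, r in zip(done, reset))
    return not (reset and done)


# Three workers, each with an env batch size of 2.
reset = [[True, False], [False, False], [False, True]]
done_after = [[False, True], [True, True], [True, False]]

print(workers_to_reset(reset))                  # [0, 2]
print(reset_is_consistent(done_after, reset))   # True
print(any_true(done_after))                     # True: done.any() would wrongly fail
```

Entries that were not flagged may legitimately remain done, which is why a global done.any() check is too strict.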

To Reproduce

env = MockBatchedLockedEnv(device="cpu", batch_size=torch.Size(env_batch_size))
env.set_seed(1)
parallel_env = ParallelEnv(num_parallel_env, lambda: env)
parallel_env.start()
reset_td = TensorDict(
    {"reset_workers": torch.full(parallel_env.batch_size, True, device=parallel_env.device)},
    batch_size=parallel_env.batch_size,
    device=parallel_env.device,
)
parallel_env.reset(reset_td)
bug
opened by matteobettini 3

Releases (0.0.3)

0.0.3 (Nov 21, 2022)
The main changes introduced by this release are:

dependency on the standalone tensordict repo;

refactoring of the "next" API

What's Changed

[Versioning] MacOs versioning and release bugfix by @vmoens in https://github.com/pytorch/rl/pull/247

[Versioning] Setup metadata by @vmoens in https://github.com/pytorch/rl/pull/248

[BugFix] Fix setup instructions by @vmoens in https://github.com/pytorch/rl/pull/250

[BugFix] Fix a bug when segment_tree size is exactly 2^N by @xiaomengy in https://github.com/pytorch/rl/pull/251

[Feature] Added test for RewardRescale transform by @nicolas-dufour in https://github.com/pytorch/rl/pull/252

[Feature] Empty TensorDict population in loops by @vmoens in https://github.com/pytorch/rl/pull/253

[BugFix] Memmap del bugfix by @vmoens in https://github.com/pytorch/rl/pull/254

[Feature] Implement padding for tensordicts by @ajhinsvark in https://github.com/pytorch/rl/pull/257

[BugFix]: recursion error when calling permute(...).to_tensordict() by @vmoens in https://github.com/pytorch/rl/pull/260

[Feature] Differentiable PPOLoss for IRL by @vmoens in https://github.com/pytorch/rl/pull/240

[BugFix]: avoid deleting true in_keys in TensorDictSequence by @vmoens in https://github.com/pytorch/rl/pull/261

[Feature] Add issue and pull request template by @Benjamin-eecs in https://github.com/pytorch/rl/pull/263

[Feature] Nested tensordicts by @vmoens in https://github.com/pytorch/rl/pull/256

[Feature]: Index nested tensordicts using tuples by @vmoens in https://github.com/pytorch/rl/pull/262

[Feature]: flatten nested tensordicts by @vmoens in https://github.com/pytorch/rl/pull/264

[Test]: test nested CompositeSpec by @vmoens in https://github.com/pytorch/rl/pull/265

[Test]: test squeezed TensorDict by @vmoens in https://github.com/pytorch/rl/pull/269

[Doc] Added TensorDict tutorial by @nicolas-dufour in https://github.com/pytorch/rl/pull/255

[Test]: TensorDict: test tensordict created on cuda and sub-tensordict indexed along 2nd dimension by @vmoens in https://github.com/pytorch/rl/pull/268

Refactor the torch.stack with destination by @khmigor in https://github.com/pytorch/rl/pull/245

[Feature]: faster meta-tensor API for TensorDict by @vmoens in https://github.com/pytorch/rl/pull/272

[Feature]: Refactored logging to be able to support other loggers easily by @nicolas-dufour in https://github.com/pytorch/rl/pull/270

Small tweaks to make the replay buffer code more consistent by @shagunsodhani in https://github.com/pytorch/rl/pull/275

[BugFix]: Minor bugs in docstrings by @vmoens in https://github.com/pytorch/rl/pull/276

[Doc]: TorchRL demo by @vmoens in https://github.com/pytorch/rl/pull/284

[BugFix]: update wrong links in issue and pull request template by @Benjamin-eecs in https://github.com/pytorch/rl/pull/286

[BugFix]: quickfix: force gym 0.24 installation until issue with rendering is resolved by @vmoens in https://github.com/pytorch/rl/pull/283

[Doc]: remove pip install from CONTRIBUTING.md by @vmoens in https://github.com/pytorch/rl/pull/288

[Feature]: faster safetanh transform via C++ bindings by @vmoens in https://github.com/pytorch/rl/pull/289

[BugFix]: fix GLFW3 error when installing dm_control by @vmoens in https://github.com/pytorch/rl/pull/291

[BugFix]: Fix examples by @vmoens in https://github.com/pytorch/rl/pull/290

[Doc] Simplify PR template by @vmoens in https://github.com/pytorch/rl/pull/292

[BugFix]: Replay buffer bugfixes by @vmoens in https://github.com/pytorch/rl/pull/294

[Doc] MacOs M1 troubleshooting by @ramonmedel in https://github.com/pytorch/rl/pull/296

[Feature]: Improving training efficiency by @vmoens in https://github.com/pytorch/rl/pull/293

[Feature] Wandb logger by @nicolas-dufour in https://github.com/pytorch/rl/pull/274

[QuickFix]: update issue and pr template by @Benjamin-eecs in https://github.com/pytorch/rl/pull/303

[Test] tests for BinarizeReward by @srikanthmg85 in https://github.com/pytorch/rl/pull/302

[BugFix]: L2-priority for PRB by @vmoens in https://github.com/pytorch/rl/pull/305

[Feature] Transforms: Compose.insert and TransformedEnv.insert_transform by @rmartimov in https://github.com/pytorch/rl/pull/304

[BugFix] Fix flaky test by waiting for procs instead of sleep by @nairbv in https://github.com/pytorch/rl/pull/306

[BugFix] Fix a build warning, setuptools/distutils import order by @nairbv in https://github.com/pytorch/rl/pull/307

ufmt issue if imports in order requested by distutils by @nairbv in https://github.com/pytorch/rl/pull/308

[BugFix]: Conda to pip for circleci by @vmoens in https://github.com/pytorch/rl/pull/310

[BugFix] Support list-based boolean masks for TensorDict by @benoitdescamps in https://github.com/pytorch/rl/pull/299

[Feature] Truly invertible tensordict permutation of dimensions by @ramonmedel in https://github.com/pytorch/rl/pull/295

[Doc] Tensordictmodule tutorial by @nicolas-dufour in https://github.com/pytorch/rl/pull/267

[Feature] Rename _TensorDict into TensorDictBase by @yoavnavon in https://github.com/pytorch/rl/pull/316

[Release]: v0.0.1b versioning by @vmoens in https://github.com/pytorch/rl/pull/317

[Feature] Adding additional checks to TensorDict.view to remove unnecessary ViewedTensorDict object creation by @bamaxw in https://github.com/pytorch/rl/pull/319

[BugFix]: Safe state normalization when std=0 by @vmoens in https://github.com/pytorch/rl/pull/323

[BugFix]: gradient propagation in advantage estimates by @vmoens in https://github.com/pytorch/rl/pull/322

[BugFix]: make training example gracefully exit by @vmoens in https://github.com/pytorch/rl/pull/326

[Setup]: Exclude tutorials from wheels by @vmoens in https://github.com/pytorch/rl/pull/325

[BugFix]: Tensor map for subtensordict.set_ by @vmoens in https://github.com/pytorch/rl/pull/324

[Versioning]: Wheels v0.0.1c by @vmoens in https://github.com/pytorch/rl/pull/327

[BugFix] Fixed compose which ignored inv_transforms of child by @nicolas-dufour in https://github.com/pytorch/rl/pull/328

[BugFix] functorch installation in CircleCI by @vmoens in https://github.com/pytorch/rl/pull/336

[Refactor] VecNorm inference API by @vmoens in https://github.com/pytorch/rl/pull/337

[BugFix] TransformedEnv sets added Transforms into eval mode by @alexanderlobov in https://github.com/pytorch/rl/pull/331

[Refactor] make to_tensordict() create a copy of the content by @nicolas-dufour in https://github.com/pytorch/rl/pull/334

[CircleCI] Fix dm_control rendering by @vmoens in https://github.com/pytorch/rl/pull/339

[BugFix]: joining processes when they're done by @vmoens in https://github.com/pytorch/rl/pull/311

[Test] pass the OS error in case the file isn't closed by @tongbaojia in https://github.com/pytorch/rl/pull/344

[Feature] Make default rollout tensordict contiguous by @vmoens in https://github.com/pytorch/rl/pull/343

[BugFix] Clone memmap tensors on regular tensors and other replay buffer improvements by @vmoens in https://github.com/pytorch/rl/pull/340

[CI] Using latest gym by @vmoens in https://github.com/pytorch/rl/pull/346

[Doc] Coding your first DDPG tutorial by @vmoens in https://github.com/pytorch/rl/pull/345

[Doc] Minor: typos in DDPG by @vmoens in https://github.com/pytorch/rl/pull/354

[Feature] Register lambda and gamma in buffers by @vmoens in https://github.com/pytorch/rl/pull/353

[Feature] Implement eq for TensorSpec by @omikad in https://github.com/pytorch/rl/pull/358

[Doc] Multi-tasking tutorial by @vmoens in https://github.com/pytorch/rl/pull/352

[Feature] Env refactoring for model based RL by @nicolas-dufour in https://github.com/pytorch/rl/pull/315

[Feature]: Added support for TensorDictSequence module subsampling by @nicolas-dufour in https://github.com/pytorch/rl/pull/332

[BugFix] Add lock to vec norm transform by @jaschmid-fb in https://github.com/pytorch/rl/pull/356

[Perf]: Improve PPO training performance by @vmoens in https://github.com/pytorch/rl/pull/297

[BugFix] Functorch-Tensordict bug fixes by @vmoens in https://github.com/pytorch/rl/pull/361

Revert "[BugFix] Functorch-Tensordict bug fixes" by @vmoens in https://github.com/pytorch/rl/pull/362

[BugFix] Functorch-Tensordict bug fixes by @vmoens in https://github.com/pytorch/rl/pull/363

[Feature] CSVLogger (ABANDONED) by @vmoens in https://github.com/pytorch/rl/pull/371

[Feature] Support tensor-based decay in TD-lambda by @tcbegley in https://github.com/pytorch/rl/pull/360

[Feature] CSVLogger by @vmoens in https://github.com/pytorch/rl/pull/372

[BugFix] Fewer env instantiations for better mujoco rendering by @vmoens in https://github.com/pytorch/rl/pull/378

[Feature] change imports of environment libraries (gym and dm_control) at lower levels by @guabao in https://github.com/pytorch/rl/pull/379

[BugFix] Representation of indexed nested tensordict by @vmoens in https://github.com/pytorch/rl/pull/370

[BugFix] In-place __setitem__ for SubTensorDict by @vmoens in https://github.com/pytorch/rl/pull/369

[Feature] Add ProbabilisticTensorDictModule dist key mapping support by @nicolas-dufour in https://github.com/pytorch/rl/pull/376

[Feature]: R3M integration by @vmoens in https://github.com/pytorch/rl/pull/321

[Feature] static_seed flag for envs, vectorized envs and collectors by @vmoens in https://github.com/pytorch/rl/pull/385

[Feature] AdditiveGaussian exploration strategy by @vmoens in https://github.com/pytorch/rl/pull/388

[Feature] Multi-images R3M by @vmoens in https://github.com/pytorch/rl/pull/389

[Feature] Flatten multi-images in R3M by @vmoens in https://github.com/pytorch/rl/pull/391

[Quality] Code cleanup for fbsync by @vmoens in https://github.com/pytorch/rl/pull/392

[Feature] In-house functional modules for TorchRL using TensorDict by @vmoens in https://github.com/pytorch/rl/pull/387

[Quality] Code cleanup for fbsync by @vmoens in https://github.com/pytorch/rl/pull/397

[Doc] Add charts to examples by @nicolas-dufour in https://github.com/pytorch/rl/pull/374

[Feature] Vectorized GAE by @vmoens in https://github.com/pytorch/rl/pull/365

[BugFix] Temporarily fix gym to 0.25.1 to fix CI by @vmoens in https://github.com/pytorch/rl/pull/411

[Feature] Create a Squeeze transform and update Unsqueeze transform by @reachsumit in https://github.com/pytorch/rl/pull/408

[Naming] Recurse kwarg to match pytorch by @matt-fff in https://github.com/pytorch/rl/pull/410

[Feature] Add all implemented loggers to the init of loggers by @flinder in https://github.com/pytorch/rl/pull/402

[BugFix] Fix gym 0.26 compatibility by @vmoens in https://github.com/pytorch/rl/pull/403

[BugFix] Remove submodules by @vmoens in https://github.com/pytorch/rl/pull/414

[Feature] lock tensordict when calling share_memory_() by @fdabek1 in https://github.com/pytorch/rl/pull/412

[BugFix] Updated TensorDict.expand to work as Tensor.expand by @AnshulSehgal in https://github.com/pytorch/rl/pull/409

[BugFix] Looser check for test_recorder assertion by @vmoens in https://github.com/pytorch/rl/pull/415

[Feature] Allow spec to be passed directly to exploration wrappers by @vmoens in https://github.com/pytorch/rl/pull/418

[BugFix] Collector revert to default exploration mode if empty string is passed by @vmoens in https://github.com/pytorch/rl/pull/421

[Naming] Rename _TargetNetUpdate to TargetNetUpdater, making it public by @yushiyangk in https://github.com/pytorch/rl/pull/422

[Doc] Re-run tutorials by @vmoens in https://github.com/pytorch/rl/pull/381

Revert "[Doc] Re-run tutorials" (colab links broken) by @vmoens in https://github.com/pytorch/rl/pull/423

[Feature] Switch back to latest gym by @vmoens in https://github.com/pytorch/rl/pull/425

[Feature] TensorDict without device by @tcbegley in https://github.com/pytorch/rl/pull/413

Updated the README.md file by @bashnick in https://github.com/pytorch/rl/pull/427

[Feature] Adding support for initialising TensorDicts from nested dicts by @zeenolife in https://github.com/pytorch/rl/pull/404

[Features] Make image_size a cfg param by @nicolas-dufour in https://github.com/pytorch/rl/pull/430

Make TensorDict.expand accept Sequence arguments by @nicolasgriffiths in https://github.com/pytorch/rl/pull/424

[Doc] Readme revamp for efficiency/modularity display by @vmoens in https://github.com/pytorch/rl/pull/382

[Feature] New biased_softplus semantic to allow for minimum scale setting by @nicolas-dufour in https://github.com/pytorch/rl/pull/428

[Tutorial] Re-run tutos by @vmoens in https://github.com/pytorch/rl/pull/434

[BugFix] mixed device_safe vs device by @vmoens in https://github.com/pytorch/rl/pull/429

[BugFix] Explicit params and buffers by @agrotov in https://github.com/pytorch/rl/pull/436

[BugFix] Fixed Additive noise by @nicolas-dufour in https://github.com/pytorch/rl/pull/441

[Tests] Test loggers video saving by @bashnick in https://github.com/pytorch/rl/pull/439

Revert "[BugFix] Fixed Additive noise" by @vmoens in https://github.com/pytorch/rl/pull/442

[Refactor] Rename TensorDictSequence to TensorDictSequential by @ronert in https://github.com/pytorch/rl/pull/440

[Refactor] Refactoring set*() methods for TensorDictBase class by @zeenolife in https://github.com/pytorch/rl/pull/438

[Cleanup] Removing gym-retro interface by @vmoens in https://github.com/pytorch/rl/pull/444

[BugFix]: Fix additive noise by @nicolas-dufour in https://github.com/pytorch/rl/pull/447

[BugFix] CatTensors: Prepended next_ to the out_key by @ggimler3 in https://github.com/pytorch/rl/pull/449

[BugFix] Fix AdditiveGaussian exploration tests by @vmoens in https://github.com/pytorch/rl/pull/450

[BugFix] Wrong call to device_safe in replay buffer code by @vmoens in https://github.com/pytorch/rl/pull/454

[BugFix] Add transform_observation_spec _R3MNet by @ymwdalex in https://github.com/pytorch/rl/pull/443

[Doc] Add a knowledge base by @shagunsodhani in https://github.com/pytorch/rl/pull/375

[Feature] Allow for actions and rewards to be in the reset tensordict by @vmoens in https://github.com/pytorch/rl/pull/458

[Doc] Readme for knowledge base by @vmoens in https://github.com/pytorch/rl/pull/459

[Feature] Added batch_lock attribute in EnvBase by @nicolas-dufour in https://github.com/pytorch/rl/pull/399

[BugFix] deepcopy specs before transforming by @vmoens in https://github.com/pytorch/rl/pull/461

[BugFix]: Fixed dm_control action type casting by @nicolas-dufour in https://github.com/pytorch/rl/pull/463

[Versioning] Version 0.0.2a0 by @vmoens in https://github.com/pytorch/rl/pull/465

[CI, Doc] Update functorch source installation command by @zou3519 in https://github.com/pytorch/rl/pull/446

[BugFix] TransformedEnv attributes inheritance by @vmoens in https://github.com/pytorch/rl/pull/467

[Feature] Cleanup mocking envs init and new by @vmoens in https://github.com/pytorch/rl/pull/469

[Tests] Adding tensordict __repr__ tests by @sladebot in https://github.com/pytorch/rl/pull/435

[Logging]: implement MLFlow logging integration by @rayanht in https://github.com/pytorch/rl/pull/432

[BugFix] MLFlow import fix by @vmoens in https://github.com/pytorch/rl/pull/473

[BugFix] Fixed pip install by @brandonsj in https://github.com/pytorch/rl/pull/475

[Features]: Changed _inplace_update cls parameter passing in __new__ by @nicolas-dufour in https://github.com/pytorch/rl/pull/464

[Feature]: ModelBased Envs by @nicolas-dufour in https://github.com/pytorch/rl/pull/333

[Feature] make ReplayBufferTrainer compatible with storing trajectories by @vmoens in https://github.com/pytorch/rl/pull/476

[Tutorial] DQN tutorial by @vmoens in https://github.com/pytorch/rl/pull/474

[Feature] reader hooks for GymLike by @vmoens in https://github.com/pytorch/rl/pull/478

[BugFix] TensorSpec.zero(None) failure fix by @vmoens in https://github.com/pytorch/rl/pull/483

[Feature]: Support for planners and CEM by @nicolas-dufour in https://github.com/pytorch/rl/pull/384

[Feature] Replaced device_safe() with device by @ordinskiy in https://github.com/pytorch/rl/pull/485

[Feature]: TensorDictPrimer transform by @nicolas-dufour in https://github.com/pytorch/rl/pull/456

[Feature]: erase() method for torchrl.timeit by @nicolas-dufour in https://github.com/pytorch/rl/pull/480

[Feature] Added support for single collector in sync_async_collector by @nicolas-dufour in https://github.com/pytorch/rl/pull/482

[BugFix] removing unwanted device_safe() by @vmoens in https://github.com/pytorch/rl/pull/486

[Refactoring] Refactored get_stats_random_rollout by @nicolas-dufour in https://github.com/pytorch/rl/pull/481

[Feature] VIP Integration by @JasonMa2016 in https://github.com/pytorch/rl/pull/487

[Refactoring] Minor tweaks to recorder and logger by @nicolas-dufour in https://github.com/pytorch/rl/pull/489

[Feature]: Deactivate typechecks in envs by @nicolas-dufour in https://github.com/pytorch/rl/pull/490

[BugFix] Vectorized td_lambda with gamma tensor does not match the serial version by @vmoens in https://github.com/pytorch/rl/pull/400

[BugFix] Fix TensorDictPrimer init by @vmoens in https://github.com/pytorch/rl/pull/491

[Feature] Optional auto-reset when done for collectors and batched envs by @vmoens in https://github.com/pytorch/rl/pull/492

[BugFix] Defaulting passing_devices to None by @himjohntang in https://github.com/pytorch/rl/pull/477

Revert "[BugFix] Defaulting passing_devices to None" by @vmoens in https://github.com/pytorch/rl/pull/494

[BugFix] Multi-agent fixes by @vmoens in https://github.com/pytorch/rl/pull/488

[BugFix] Defaulting passing_devices to None by @vmoens in https://github.com/pytorch/rl/pull/495

[Feature] Lazy initialization of CatTensors by @vmoens in https://github.com/pytorch/rl/pull/497

[Cleanup] Removing cuda 10.2 references by @vmoens in https://github.com/pytorch/rl/pull/498

[BugFix] Migration to pytorch org by @vmoens in https://github.com/pytorch/rl/pull/499

[Refactoring] Import at root to enable vmap monkey-patching by @vmoens in https://github.com/pytorch/rl/pull/500

[BugFix] python version for linting checks by @vmoens in https://github.com/pytorch/rl/pull/502

[Feature] Replay Buffers refactor by @bamaxw in https://github.com/pytorch/rl/pull/330

[Feature] Rename step_tensordict in step_mdp by @romainjln in https://github.com/pytorch/rl/pull/512

[Lint] re-instantiate F821 by @vmoens in https://github.com/pytorch/rl/pull/516

[BugFix] run_type_checks for TransformedEnvs by @vmoens in https://github.com/pytorch/rl/pull/513

[BugFix] making first_dim and last_dim negative in FlattenObservation when a parent is set by @vmoens in https://github.com/pytorch/rl/pull/511

[Feature] Add info dict key-spec pairs to observation_spec by @tcbegley in https://github.com/pytorch/rl/pull/504

[BugFix] Changing the dm_control import to fail if not installed by @zeenolife in https://github.com/pytorch/rl/pull/515

[CI] Add coverage with codecov by @silvestrebahi in https://github.com/pytorch/rl/pull/523

Revert "[CI] Add coverage with codecov" by @vmoens in https://github.com/pytorch/rl/pull/525

[Quality] Use relative imports for local c++ deps by @apbard in https://github.com/pytorch/rl/pull/526

[Feature] Nightly release by @vmoens in https://github.com/pytorch/rl/pull/519

[Feature] Add make_tensordict() function by @sicong-huang in https://github.com/pytorch/rl/pull/522

[Doc] Misc readme fixes by @GavinPHR in https://github.com/pytorch/rl/pull/532

[BugFix] Replacing inference_mode decorator with no_grad to fix state_dict loading error by @GavinPHR in https://github.com/pytorch/rl/pull/530

[BugFix] Transformed ParallelEnv meta data are broken when passing to device by @vmoens in https://github.com/pytorch/rl/pull/531

[Doc] Add coverage banner by @vmoens in https://github.com/pytorch/rl/pull/533

[BugFix] Fix colab link of coding_dqn.ipynb by @Benjamin-eecs in https://github.com/pytorch/rl/pull/543

[BugFix] Fix optional imports by @vmoens in https://github.com/pytorch/rl/pull/535

[BugFix] Restore missing keys in data collector output by @tcbegley in https://github.com/pytorch/rl/pull/521

[Lint] reorganize imports by @apbard in https://github.com/pytorch/rl/pull/545

[BugFix] Single-cpu compatibility by @vmoens in https://github.com/pytorch/rl/pull/548

[BugFix] vision install and other deps in optdeps by @vmoens in https://github.com/pytorch/rl/pull/552

[Feature] Implemented device argument for modules.models by @yushiyangk in https://github.com/pytorch/rl/pull/524

[BugFix] Fix ellipsis indexing of 2d TensorDicts by @vmoens in https://github.com/pytorch/rl/pull/559

[BugFix] Additive gaussian exploration spec fix by @vmoens in https://github.com/pytorch/rl/pull/560

[BugFix] Disabling video step for wandb by @vmoens in https://github.com/pytorch/rl/pull/561

[BugFix] Various device fix by @vmoens in https://github.com/pytorch/rl/pull/558

[Feature] Allow collectors to accept regular modules as policies by @tcbegley in https://github.com/pytorch/rl/pull/546

[BugFix] Fix push binary nightly action by @psolikov in https://github.com/pytorch/rl/pull/566

[BugFix] TensorDict comparison by @vmoens in https://github.com/pytorch/rl/pull/567

[BugFix] Fix SyncDataCollector reset by @jrobine in https://github.com/pytorch/rl/pull/571

[Doc] Banners on README.md by @vmoens in https://github.com/pytorch/rl/pull/572

[Feature] Log printing in alphabetical order when creating a replay buffer by @nikhlrao in https://github.com/pytorch/rl/pull/573

[BugFix] Add eps to reward normalization by @vmoens in https://github.com/pytorch/rl/pull/574

[BugFix] Fix argument for PPOLoss.get_entropy_bonus() by @vmoens in https://github.com/pytorch/rl/pull/578

[Feature] Restructure torchrl/objectives by @sgrigory in https://github.com/pytorch/rl/pull/580

[Docs] Documentation revamp by @vmoens in https://github.com/pytorch/rl/pull/581

[Doc] Publishing on pytorch.org by @vmoens in https://github.com/pytorch/rl/pull/582

Revert "[Doc] Publishing on pytorch.org" by @vmoens in https://github.com/pytorch/rl/pull/584

[Doc] Publishing on pytorch.org by @vmoens in https://github.com/pytorch/rl/pull/585

Revert "[Doc] Publishing on pytorch.org" by @vmoens in https://github.com/pytorch/rl/pull/586

[Doc] Publishing on pytorch.org by @vmoens in https://github.com/pytorch/rl/pull/587

[Feature] More restrictive tests on docstrings by @vmoens in https://github.com/pytorch/rl/pull/457

[BugFix] Wrong stack import in tests by @vmoens in https://github.com/pytorch/rl/pull/590

[Feature] Exclude "_" out_keys in tensordictmodel by @jlesuffleur in https://github.com/pytorch/rl/pull/589

[Feature]: Dreamer support by @nicolas-dufour in https://github.com/pytorch/rl/pull/341

[Doc] Missing doc for prototype RB by @vmoens in https://github.com/pytorch/rl/pull/595

[Feature] Update list of supported libraries by @vmoens in https://github.com/pytorch/rl/pull/594

[BugFix] Fix timeit count registration by @vmoens in https://github.com/pytorch/rl/pull/598

[Naming] Renaming ProbabilisticTensorDictModule keys by @vmoens in https://github.com/pytorch/rl/pull/603

[Feature] Categorical encoding for action space by @artkorenev in https://github.com/pytorch/rl/pull/593

[BugFix] ReplayBuffer's storage now signal back when changes happen by @paulomarciano in https://github.com/pytorch/rl/pull/614

[Doc] Typos in tensordict tutorial by @PaLeroy in https://github.com/pytorch/rl/pull/621

[Doc] Integrate knowledge base in docs by @hatala91 in https://github.com/pytorch/rl/pull/622

[Doc] Updating docs requirements by @vmoens in https://github.com/pytorch/rl/pull/624

[Feature] Make torchrl runnable without functorch and with gym==0.13 by @vmoens in https://github.com/pytorch/rl/pull/386

[Feature] Habitat integration by @vmoens in https://github.com/pytorch/rl/pull/514

[Feature] Checkpointing by @vmoens in https://github.com/pytorch/rl/pull/549

Add support for null dim argument in TensorDict.squeeze by @jgonik in https://github.com/pytorch/rl/pull/608

[Version] Updating to torch 1.13 by @vmoens in https://github.com/pytorch/rl/pull/627

[Feature] Sub-memmap tensors by @vmoens in https://github.com/pytorch/rl/pull/626

[BugFix] copy_ changes the index if the dest and source memmap tensors share the same file location by @vmoens in https://github.com/pytorch/rl/pull/631

[Feature] Unfold transforms for folded TransformedEnv by @alexanderlobov in https://github.com/pytorch/rl/pull/630

[BugFix] make TensorDictReplayBuffer.extend call super().extend with stacked_td by @vmoens in https://github.com/pytorch/rl/pull/634

[BugFix] correct the use of step_mdp method in data collector by @adityagandhamal in https://github.com/pytorch/rl/pull/637

[Feature] Added implement_for decorator by @ordinskiy in https://github.com/pytorch/rl/pull/618

[Feature] Make DQN compatible with nn.Module by @svarolgunes in https://github.com/pytorch/rl/pull/632

[Example] Distributed Replay Buffer Prototype Example Implementation by @adityagoel4512 in https://github.com/pytorch/rl/pull/615

[Feature] Benchmark storage types by @adityagoel4512 in https://github.com/pytorch/rl/pull/633

[Feature] Remove wild imports in the library by @sosmond in https://github.com/pytorch/rl/pull/642

[BugFix] Prevent transform parent from being reassigned by @jasonfkut in https://github.com/pytorch/rl/pull/641

[Feature] Too many deepcopy in transforms.py by @romainjln in https://github.com/pytorch/rl/pull/625

[Naming] Rename keys_in to in_keys in transforms.py and related modules by @sardaankita in https://github.com/pytorch/rl/pull/656

[Refactoring] Refactor dreamer helper in smaller pieces by @vmoens in https://github.com/pytorch/rl/pull/662

[Feature] VIPRewardTransform by @vmoens in https://github.com/pytorch/rl/pull/658

[BugFix] make_trainer possible bug for on-policy cases by @albertbou92 in https://github.com/pytorch/rl/pull/655

[Naming] Fixing key names by @vmoens in https://github.com/pytorch/rl/pull/668

[Test] Check dtypes of envs by @vmoens in https://github.com/pytorch/rl/pull/666

[Refactor] Relying on the standalone tensordict -- phase 1 by @vmoens in https://github.com/pytorch/rl/pull/650

[Doc] More doc on trainers by @vmoens in https://github.com/pytorch/rl/pull/663

[BugFix] PPO example GAE import by @albertbou92 in https://github.com/pytorch/rl/pull/671

[BugFix] Use GitHub for flake8 pre-commit hook by @vmoens in https://github.com/pytorch/rl/pull/679

[BugFix] Update to strict select by @vmoens in https://github.com/pytorch/rl/pull/675

[Feature] Auto-compute stats for ObservationNorm by @romainjln in https://github.com/pytorch/rl/pull/669

[Doc] _make_collector helper function by @albertbou92 in https://github.com/pytorch/rl/pull/678

[Doc] BatchSubSampler class docstrings example by @albertbou92 in https://github.com/pytorch/rl/pull/677

[BugFix] PPO objective crashes if advantage_module is None by @albertbou92 in https://github.com/pytorch/rl/pull/676

[Refactor] Refactor 'next_' into nested tensordicts by @vmoens in https://github.com/pytorch/rl/pull/649

[Doc] More doc about environments by @vmoens in https://github.com/pytorch/rl/pull/683

[Doc] Fix missing tensordict install for doc by @vmoens in https://github.com/pytorch/rl/pull/685

[CI] Added CircleCI pipeline to test compatibility across supported gym versions by @ordinskiy in https://github.com/pytorch/rl/pull/645

[BugFix] ConvNet forward method with tensors of more than 4 dimensions by @albertbou92 in https://github.com/pytorch/rl/pull/686

[Feature] add standard_normal for RewardScaling by @adityagandhamal in https://github.com/pytorch/rl/pull/682

[Feature] Jumanji envs by @yingchenlin in https://github.com/pytorch/rl/pull/674

[Feature] Default collate_fn by @vmoens in https://github.com/pytorch/rl/pull/688

[BugFix] Fix Examples by @vmoens in https://github.com/pytorch/rl/pull/687

[Refactoring] Replace direct gym version checks with decorated functions (#) by @ordinskiy in https://github.com/pytorch/rl/pull/691

New Contributors

@ajhinsvark made their first contribution in https://github.com/pytorch/rl/pull/257

@ramonmedel made their first contribution in https://github.com/pytorch/rl/pull/296

@srikanthmg85 made their first contribution in https://github.com/pytorch/rl/pull/302

@rmartimov made their first contribution in https://github.com/pytorch/rl/pull/304

@nairbv made their first contribution in https://github.com/pytorch/rl/pull/306

@benoitdescamps made their first contribution in https://github.com/pytorch/rl/pull/299

@yoavnavon made their first contribution in https://github.com/pytorch/rl/pull/316

@bamaxw made their first contribution in https://github.com/pytorch/rl/pull/319

@alexanderlobov made their first contribution in https://github.com/pytorch/rl/pull/331

@tongbaojia made their first contribution in https://github.com/pytorch/rl/pull/344

@omikad made their first contribution in https://github.com/pytorch/rl/pull/358

@jaschmid-fb made their first contribution in https://github.com/pytorch/rl/pull/356

@guabao made their first contribution in https://github.com/pytorch/rl/pull/379

@reachsumit made their first contribution in https://github.com/pytorch/rl/pull/408

@matt-fff made their first contribution in https://github.com/pytorch/rl/pull/410

@flinder made their first contribution in https://github.com/pytorch/rl/pull/402

@fdabek1 made their first contribution in https://github.com/pytorch/rl/pull/412

@AnshulSehgal made their first contribution in https://github.com/pytorch/rl/pull/409

@yushiyangk made their first contribution in https://github.com/pytorch/rl/pull/422

@bashnick made their first contribution in https://github.com/pytorch/rl/pull/427

@zeenolife made their first contribution in https://github.com/pytorch/rl/pull/404

@nicolasgriffiths made their first contribution in https://github.com/pytorch/rl/pull/424

@agrotov made their first contribution in https://github.com/pytorch/rl/pull/436

@ronert made their first contribution in https://github.com/pytorch/rl/pull/440

@ggimler3 made their first contribution in https://github.com/pytorch/rl/pull/449

@ymwdalex made their first contribution in https://github.com/pytorch/rl/pull/443

@sladebot made their first contribution in https://github.com/pytorch/rl/pull/435

@rayanht made their first contribution in https://github.com/pytorch/rl/pull/432

@brandonsj made their first contribution in https://github.com/pytorch/rl/pull/475

@ordinskiy made their first contribution in https://github.com/pytorch/rl/pull/485

@JasonMa2016 made their first contribution in https://github.com/pytorch/rl/pull/487

@himjohntang made their first contribution in https://github.com/pytorch/rl/pull/477

@romainjln made their first contribution in https://github.com/pytorch/rl/pull/512

@apbard made their first contribution in https://github.com/pytorch/rl/pull/526

@sicong-huang made their first contribution in https://github.com/pytorch/rl/pull/522

@psolikov made their first contribution in https://github.com/pytorch/rl/pull/566

@jrobine made their first contribution in https://github.com/pytorch/rl/pull/571

@nikhlrao made their first contribution in https://github.com/pytorch/rl/pull/573

@sgrigory made their first contribution in https://github.com/pytorch/rl/pull/580

@jlesuffleur made their first contribution in https://github.com/pytorch/rl/pull/589

@artkorenev made their first contribution in https://github.com/pytorch/rl/pull/593

@paulomarciano made their first contribution in https://github.com/pytorch/rl/pull/614

@hatala91 made their first contribution in https://github.com/pytorch/rl/pull/622

@jgonik made their first contribution in https://github.com/pytorch/rl/pull/608

@adityagandhamal made their first contribution in https://github.com/pytorch/rl/pull/637

@svarolgunes made their first contribution in https://github.com/pytorch/rl/pull/632

@adityagoel4512 made their first contribution in https://github.com/pytorch/rl/pull/615

@jasonfkut made their first contribution in https://github.com/pytorch/rl/pull/641

@sardaankita made their first contribution in https://github.com/pytorch/rl/pull/656

@albertbou92 made their first contribution in https://github.com/pytorch/rl/pull/655

@yingchenlin made their first contribution in https://github.com/pytorch/rl/pull/674

Full Changelog: https://github.com/pytorch/rl/compare/v0.0.1...0.0.3
v0.0.2a (Sep 17, 2022)
What's Changed

[BugFix] Fixed compose which ignored inv_transforms of child by @nicolas-dufour in https://github.com/facebookresearch/rl/pull/328

[BugFix] functorch installation in CircleCI by @vmoens in https://github.com/facebookresearch/rl/pull/336

[Refactor] VecNorm inference API by @vmoens in https://github.com/facebookresearch/rl/pull/337

TransformedEnv sets added Transforms into eval mode by @alexanderlobov in https://github.com/facebookresearch/rl/pull/331

[Refactor] make to_tensordict() create a copy of the content by @nicolas-dufour in https://github.com/facebookresearch/rl/pull/334

[CircleCI] Fix dm_control rendering by @vmoens in https://github.com/facebookresearch/rl/pull/339

[BugFix]: joining processes when they're done by @vmoens in https://github.com/facebookresearch/rl/pull/311

[Test] pass the OS error in case the file isn't closed by @tongbaojia in https://github.com/facebookresearch/rl/pull/344

[Feature] Make default rollout tensordict contiguous by @vmoens in https://github.com/facebookresearch/rl/pull/343

[BugFix] Clone memmap tensors on regular tensors and other replay buffer improvements by @vmoens in https://github.com/facebookresearch/rl/pull/340

[CI] Using latest gym by @vmoens in https://github.com/facebookresearch/rl/pull/346

[Doc] Coding your first DDPG tutorial by @vmoens in https://github.com/facebookresearch/rl/pull/345

[Doc] Minor: typos in DDPG by @vmoens in https://github.com/facebookresearch/rl/pull/354

[Feature] Register lambda and gamma in buffers by @vmoens in https://github.com/facebookresearch/rl/pull/353

[Feature] Implement eq for TensorSpec by @omikad in https://github.com/facebookresearch/rl/pull/358

[Doc] Multi-tasking tutorial by @vmoens in https://github.com/facebookresearch/rl/pull/352

[Feature] Env refactoring for model based RL by @nicolas-dufour in https://github.com/facebookresearch/rl/pull/315

[Feature]: Added support for TensorDictSequence module subsampling by @nicolas-dufour in https://github.com/facebookresearch/rl/pull/332

[BugFix] Add lock to vec norm transform by @jaschmid-fb in https://github.com/facebookresearch/rl/pull/356

[Perf]: Improve PPO training performance by @vmoens in https://github.com/facebookresearch/rl/pull/297

[BugFix] Functorch-Tensordict bug fixes by @vmoens in https://github.com/facebookresearch/rl/pull/361

Revert "[BugFix] Functorch-Tensordict bug fixes" by @vmoens in https://github.com/facebookresearch/rl/pull/362

[BugFix] Functorch-Tensordict bug fixes by @vmoens in https://github.com/facebookresearch/rl/pull/363

[Feature] CSVLogger (ABANDONED) by @vmoens in https://github.com/facebookresearch/rl/pull/371

[Feature] Support tensor-based decay in TD-lambda by @tcbegley in https://github.com/facebookresearch/rl/pull/360

[Feature] CSVLogger by @vmoens in https://github.com/facebookresearch/rl/pull/372

[BugFix] Fewer env instantiations for better mujoco rendering by @vmoens in https://github.com/facebookresearch/rl/pull/378

[Feature] change imports of environment libraries (gym and dm_control) at lower levels by @guabao in https://github.com/facebookresearch/rl/pull/379

[BugFix] Representation of indexed nested tensordict by @vmoens in https://github.com/facebookresearch/rl/pull/370

[BugFix] In-place __setitem__ for SubTensorDict by @vmoens in https://github.com/facebookresearch/rl/pull/369

[Feature] Add ProbabilisticTensorDictModule dist key mapping support by @nicolas-dufour in https://github.com/facebookresearch/rl/pull/376

[Feature]: R3M integration by @vmoens in https://github.com/facebookresearch/rl/pull/321

[Feature] static_seed flag for envs, vectorized envs and collectors by @vmoens in https://github.com/facebookresearch/rl/pull/385

[Feature] AdditiveGaussian exploration strategy by @vmoens in https://github.com/facebookresearch/rl/pull/388

[Feature] Multi-images R3M by @vmoens in https://github.com/facebookresearch/rl/pull/389

[Feature] Flatten multi-images in R3M by @vmoens in https://github.com/facebookresearch/rl/pull/391

[Quality] Code cleanup for fbsync by @vmoens in https://github.com/facebookresearch/rl/pull/392

[Feature] In-house functional modules for TorchRL using TensorDict by @vmoens in https://github.com/facebookresearch/rl/pull/387

[Quality] Code cleanup for fbsync by @vmoens in https://github.com/facebookresearch/rl/pull/397

[Doc] Add charts to examples by @nicolas-dufour in https://github.com/facebookresearch/rl/pull/374

[Feature] Vectorized GAE by @vmoens in https://github.com/facebookresearch/rl/pull/365

[BugFix] Temporarily fix gym to 0.25.1 to fix CI by @vmoens in https://github.com/facebookresearch/rl/pull/411

[Feature] Create a Squeeze transform and update Unsqueeze transform by @reachsumit in https://github.com/facebookresearch/rl/pull/408

[Naming] Recurse kwarg to match pytorch by @matt-fff in https://github.com/facebookresearch/rl/pull/410

[Feature] Add all implemented loggers to the init of loggers by @flinder in https://github.com/facebookresearch/rl/pull/402

[BugFix] Fix gym 0.26 compatibility by @vmoens in https://github.com/facebookresearch/rl/pull/403

[BugFix] Remove submodules by @vmoens in https://github.com/facebookresearch/rl/pull/414

[Feature] lock tensordict when calling share_memory_() by @fdabek1 in https://github.com/facebookresearch/rl/pull/412

[BugFix] Updated TensorDict.expand to work as Tensor.expand by @AnshulSehgal in https://github.com/facebookresearch/rl/pull/409

[BugFix] Looser check for test_recorder assertion by @vmoens in https://github.com/facebookresearch/rl/pull/415

[Feature] Allow spec to be passed directly to exploration wrappers by @vmoens in https://github.com/facebookresearch/rl/pull/418

[BugFix] Collector revert to default exploration mode if empty string is passed by @vmoens in https://github.com/facebookresearch/rl/pull/421

[Naming] Rename _TargetNetUpdate to TargetNetUpdater, making it public by @yushiyangk in https://github.com/facebookresearch/rl/pull/422

[Doc] Re-run tutorials by @vmoens in https://github.com/facebookresearch/rl/pull/381

Revert "[Doc] Re-run tutorials" (colab links broken) by @vmoens in https://github.com/facebookresearch/rl/pull/423

[Feature] Switch back to latest gym by @vmoens in https://github.com/facebookresearch/rl/pull/425

[Feature] TensorDict without device by @tcbegley in https://github.com/facebookresearch/rl/pull/413

Updated the README.md file by @bashnick in https://github.com/facebookresearch/rl/pull/427

[Feature] Adding support for initialising TensorDicts from nested dicts by @zeenolife in https://github.com/facebookresearch/rl/pull/404

[Features] Make image_size a cfg param by @nicolas-dufour in https://github.com/facebookresearch/rl/pull/430

Make TensorDict.expand accept Sequence arguments by @nicolasgriffiths in https://github.com/facebookresearch/rl/pull/424

[Doc] Readme revamp for efficiency/modularity display by @vmoens in https://github.com/facebookresearch/rl/pull/382

[Feature] New biased_softplus semantic to allow for minimum scale setting by @nicolas-dufour in https://github.com/facebookresearch/rl/pull/428

[Tutorial] Re-run tutos by @vmoens in https://github.com/facebookresearch/rl/pull/434

[BugFix] mixed device_safe vs device by @vmoens in https://github.com/facebookresearch/rl/pull/429

[BugFix] Explicit params and buffers by @agrotov in https://github.com/facebookresearch/rl/pull/436

[BugFix] Fixed Additive noise by @nicolas-dufour in https://github.com/facebookresearch/rl/pull/441

[Tests] Test loggers video saving by @bashnick in https://github.com/facebookresearch/rl/pull/439

Revert "[BugFix] Fixed Additive noise" by @vmoens in https://github.com/facebookresearch/rl/pull/442

[Refactor] Rename TensorDictSequence to TensorDictSequential by @ronert in https://github.com/facebookresearch/rl/pull/440

[Refactor] Refactoring set*() methods for TensorDictBase class by @zeenolife in https://github.com/facebookresearch/rl/pull/438

[Cleanup] Removing gym-retro interface by @vmoens in https://github.com/facebookresearch/rl/pull/444

[BugFix]: Fix additive noise by @nicolas-dufour in https://github.com/facebookresearch/rl/pull/447

[BugFix] CatTensors: Prepended next_ to the out_key by @ggimler3 in https://github.com/facebookresearch/rl/pull/449

[BugFix] Fix AdditiveGaussian exploration tests by @vmoens in https://github.com/facebookresearch/rl/pull/450

[BugFix] Wrong call to device_safe in replay buffer code by @vmoens in https://github.com/facebookresearch/rl/pull/454

[BugFix] Add transform_observation_spec _R3MNet by @ymwdalex in https://github.com/facebookresearch/rl/pull/443

[Doc] Add a knowledge base by @shagunsodhani in https://github.com/facebookresearch/rl/pull/375

[Feature] Allow for actions and rewards to be in the reset tensordict by @vmoens in https://github.com/facebookresearch/rl/pull/458

[Doc] Readme for knowledge base by @vmoens in https://github.com/facebookresearch/rl/pull/459

[Feature] Added batch_lock attribute in EnvBase by @nicolas-dufour in https://github.com/facebookresearch/rl/pull/399

[BugFix] deepcopy specs before transforming by @vmoens in https://github.com/facebookresearch/rl/pull/461

[BugFix]: Fixed dm_control action type casting by @nicolas-dufour in https://github.com/facebookresearch/rl/pull/463

[Versioning] Version 0.0.2a0 by @vmoens in https://github.com/facebookresearch/rl/pull/465

New Contributors

@alexanderlobov made their first contribution in https://github.com/facebookresearch/rl/pull/331

@tongbaojia made their first contribution in https://github.com/facebookresearch/rl/pull/344

@omikad made their first contribution in https://github.com/facebookresearch/rl/pull/358

@jaschmid-fb made their first contribution in https://github.com/facebookresearch/rl/pull/356

@tcbegley made their first contribution in https://github.com/facebookresearch/rl/pull/360

@guabao made their first contribution in https://github.com/facebookresearch/rl/pull/379

@reachsumit made their first contribution in https://github.com/facebookresearch/rl/pull/408

@matt-fff made their first contribution in https://github.com/facebookresearch/rl/pull/410

@flinder made their first contribution in https://github.com/facebookresearch/rl/pull/402

@fdabek1 made their first contribution in https://github.com/facebookresearch/rl/pull/412

@AnshulSehgal made their first contribution in https://github.com/facebookresearch/rl/pull/409

@yushiyangk made their first contribution in https://github.com/facebookresearch/rl/pull/422

@bashnick made their first contribution in https://github.com/facebookresearch/rl/pull/427

@zeenolife made their first contribution in https://github.com/facebookresearch/rl/pull/404

@nicolasgriffiths made their first contribution in https://github.com/facebookresearch/rl/pull/424

@agrotov made their first contribution in https://github.com/facebookresearch/rl/pull/436

@ronert made their first contribution in https://github.com/facebookresearch/rl/pull/440

@ggimler3 made their first contribution in https://github.com/facebookresearch/rl/pull/449

@ymwdalex made their first contribution in https://github.com/facebookresearch/rl/pull/443

Full Changelog: https://github.com/facebookresearch/rl/compare/v0.0.1c...v0.0.2a
v0.0.1c(Jul 25, 2022)
What's Changed

Adding additional checks to TensorDict.view to remove unnecessary ViewedTensorDict object creation by @bamaxw in https://github.com/facebookresearch/rl/pull/319

[BugFix]: Safe state normalization when std=0 by @vmoens in https://github.com/facebookresearch/rl/pull/323

[BugFix]: gradient propagation in advantage estimates by @vmoens in https://github.com/facebookresearch/rl/pull/322

[BugFix]: make training example gracefully exit by @vmoens in https://github.com/facebookresearch/rl/pull/326

[Setup]: Exclude tutorials from wheels by @vmoens in https://github.com/facebookresearch/rl/pull/325

[BugFix]: Tensor map for subtensordict.set_ by @vmoens in https://github.com/facebookresearch/rl/pull/324

[Release]: Wheels v0.0.1c by @vmoens in https://github.com/facebookresearch/rl/pull/327

New Contributors

@bamaxw made their first contribution in https://github.com/facebookresearch/rl/pull/319

Full Changelog: https://github.com/facebookresearch/rl/compare/v0.0.1b...v0.0.1c
v0.0.1b(Jul 25, 2022)
Highlights

Supports nested tensordicts:

[Feature] Nested tensordicts by @vmoens in https://github.com/facebookresearch/rl/pull/256

[Feature]: Index nested tensordicts using tuples by @vmoens in https://github.com/facebookresearch/rl/pull/262

[Feature]: flatten nested tensordicts by @vmoens in https://github.com/facebookresearch/rl/pull/264

Padding for tensordicts:

[Feature] Implement padding for tensordicts by @ajhinsvark in https://github.com/facebookresearch/rl/pull/257

Speed improvements:

[Feature]: faster meta-tensor API for TensorDict by @vmoens in https://github.com/facebookresearch/rl/pull/272

[Feature]: faster safetanh transform via C++ bindings by @vmoens in https://github.com/facebookresearch/rl/pull/289

[Feature]: Improving training efficiency by @vmoens in https://github.com/facebookresearch/rl/pull/293

Logging capabilities:

[Feature]: Refactored logging to be able to support other loggers easily by @nicolas-dufour in https://github.com/facebookresearch/rl/pull/270

[Feature] Wandb logger by @nicolas-dufour in https://github.com/facebookresearch/rl/pull/274

Doc

[Doc]: TorchRL demo by @vmoens in https://github.com/facebookresearch/rl/pull/284

[Doc] Added TensorDict tutorial by @nicolas-dufour in https://github.com/facebookresearch/rl/pull/255

[Doc] Tensordictmodule tutorial by @nicolas-dufour in https://github.com/facebookresearch/rl/pull/267

What's Changed

MacOs versioning and release bugfix by @vmoens in https://github.com/facebookresearch/rl/pull/247

Setup metadata by @vmoens in https://github.com/facebookresearch/rl/pull/248

Fix setup instructions by @vmoens in https://github.com/facebookresearch/rl/pull/250

Fix a bug when segment_tree size is exactly 2^N by @xiaomengy in https://github.com/facebookresearch/rl/pull/251

Added test for RewardRescale transform by @nicolas-dufour in https://github.com/facebookresearch/rl/pull/252

Empty TensorDict population in loops by @vmoens in https://github.com/facebookresearch/rl/pull/253

Memmap del bugfix by @vmoens in https://github.com/facebookresearch/rl/pull/254

[BugFix]: recursion error when calling permute(...).to_tensordict() by @vmoens in https://github.com/facebookresearch/rl/pull/260

Differentiable PPOLoss for IRL by @vmoens in https://github.com/facebookresearch/rl/pull/240

[BugFix]: avoid deleting true in_keys in TensorDictSequence by @vmoens in https://github.com/facebookresearch/rl/pull/261

[Feature] Add issue and pull request template by @Benjamin-eecs in https://github.com/facebookresearch/rl/pull/263

[Test]: test nested CompositeSpec by @vmoens in https://github.com/facebookresearch/rl/pull/265

[Test]: test squeezed TensorDict by @vmoens in https://github.com/facebookresearch/rl/pull/269

[Test]: TensorDict: test tensordict created on cuda and sub-tensordict indexed along 2nd dimension by @vmoens in https://github.com/facebookresearch/rl/pull/268

Refactor the torch.stack with destination by @khmigor in https://github.com/facebookresearch/rl/pull/245

Small tweaks to make the replay buffer code more consistent by @shagunsodhani in https://github.com/facebookresearch/rl/pull/275

[BugFix]: Minor bugs in docstrings by @vmoens in https://github.com/facebookresearch/rl/pull/276

[BugFix]: update wrong links in issue and pull request template by @Benjamin-eecs in https://github.com/facebookresearch/rl/pull/286

[BugFix]: quickfix: force gym 0.24 installation until issue with rendering is resolved by @vmoens in https://github.com/facebookresearch/rl/pull/283

[Doc]: remove pip install from CONTRIBUTING.md by @vmoens in https://github.com/facebookresearch/rl/pull/288

[BugFix]: fix GLFW3 error when installing dm_control by @vmoens in https://github.com/facebookresearch/rl/pull/291

[BugFix]: Fix examples by @vmoens in https://github.com/facebookresearch/rl/pull/290

[Doc] Simplify PR template by @vmoens in https://github.com/facebookresearch/rl/pull/292

[BugFix]: Replay buffer bugfixes by @vmoens in https://github.com/facebookresearch/rl/pull/294

[Doc] MacOs M1 troubleshooting by @ramonmedel in https://github.com/facebookresearch/rl/pull/296

[QuickFix]: update issue and pr template by @Benjamin-eecs in https://github.com/facebookresearch/rl/pull/303

[Test] tests for BinarizeReward by @srikanthmg85 in https://github.com/facebookresearch/rl/pull/302

[BugFix]: L2-priority for PRB by @vmoens in https://github.com/facebookresearch/rl/pull/305

[Feature] Transforms: Compose.insert and TransformedEnv.insert_transform by @rmartimov in https://github.com/facebookresearch/rl/pull/304

[BugFix] Fix flaky test by waiting for procs instead of sleep by @nairbv in https://github.com/facebookresearch/rl/pull/306

[BugFix] Fix a build warning, setuptools/distutils import order by @nairbv in https://github.com/facebookresearch/rl/pull/307

ufmt issue if imports in order requested by distutils by @nairbv in https://github.com/facebookresearch/rl/pull/308

[BugFix]: Conda to pip for circleci by @vmoens in https://github.com/facebookresearch/rl/pull/310

[BugFix] Support list-based boolean masks for TensorDict by @benoitdescamps in https://github.com/facebookresearch/rl/pull/299

[Feature] Truly invertible tensordict permutation of dimensions by @ramonmedel in https://github.com/facebookresearch/rl/pull/295

[Feature] Rename _TensorDict into TensorDictBase by @yoavnavon in https://github.com/facebookresearch/rl/pull/316

New Contributors

@nicolas-dufour made their first contribution in https://github.com/facebookresearch/rl/pull/252

@ajhinsvark made their first contribution in https://github.com/facebookresearch/rl/pull/257

@ramonmedel made their first contribution in https://github.com/facebookresearch/rl/pull/296

@srikanthmg85 made their first contribution in https://github.com/facebookresearch/rl/pull/302

@rmartimov made their first contribution in https://github.com/facebookresearch/rl/pull/304

@nairbv made their first contribution in https://github.com/facebookresearch/rl/pull/306

@benoitdescamps made their first contribution in https://github.com/facebookresearch/rl/pull/299

@yoavnavon made their first contribution in https://github.com/facebookresearch/rl/pull/316

Full Changelog: https://github.com/facebookresearch/rl/compare/v0.0.1...v0.0.1b
v0.0.1(Jul 6, 2022)
TorchRL Initial Alpha Release

TorchRL is the soon-to-be official RL domain library for PyTorch. It contains primitives that are aimed at covering most of the modern RL research space.

Getting started with the library

Installation

The library can be installed through

$ pip install torchrl

Currently, torchrl wheels are provided for Linux and macOS (not M1) machines. For other architectures or for the latest features, refer to the README.md and CONTRIBUTING.md files for advanced installation instructions.

Environments

TorchRL currently supports gym and dm_control out-of-the-box. To create a gym-wrapped environment, simply use:

import gym
from torchrl.envs import GymEnv, GymWrapper
env = GymEnv("Pendulum-v1")
# similarly
env = GymWrapper(gym.make("Pendulum-v1"))

Environments can be transformed using the torchrl.envs.transforms module. See the environment tutorial for more information. ParallelEnv runs multiple environments in parallel.
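To illustrate the idea of transforming what comes out of an environment, here is a minimal plain-Python sketch. It does not use the torchrl.envs.transforms API: ToyEnv, TransformedToyEnv, scale_reward and clip_observation are hypothetical names introduced only for this example, and plain dicts stand in for the data returned by the environment.

```python
class ToyEnv:
    """Hypothetical stand-in for an environment that returns dict-like data."""

    def __init__(self):
        self.t = 0

    def reset(self):
        self.t = 0
        return {"observation": [0.0, 0.0], "reward": 0.0, "done": False}

    def step(self, action):
        self.t += 1
        return {
            "observation": [2.0 * self.t, float(action)],
            "reward": float(self.t),
            "done": self.t >= 3,
        }


class TransformedToyEnv:
    """Wraps an env and applies each transform, in order, to its outputs."""

    def __init__(self, env, *transforms):
        self.env = env
        self.transforms = list(transforms)

    def _apply(self, data):
        for transform in self.transforms:
            data = transform(data)
        return data

    def reset(self):
        return self._apply(self.env.reset())

    def step(self, action):
        return self._apply(self.env.step(action))


def scale_reward(factor):
    """Build a transform that multiplies the reward by a constant."""
    def transform(data):
        out = dict(data)  # copy so the env's own data is left untouched
        out["reward"] = data["reward"] * factor
        return out
    return transform


def clip_observation(low, high):
    """Build a transform that clamps every observation entry to [low, high]."""
    def transform(data):
        out = dict(data)
        out["observation"] = [min(high, max(low, x)) for x in data["observation"]]
        return out
    return transform


env = TransformedToyEnv(ToyEnv(), scale_reward(0.5), clip_observation(-1.0, 1.0))
env.reset()
out = env.step(1.0)
print(out)
```

The point of the pattern is that the base environment stays unchanged while preprocessing steps can be added, removed or reordered independently.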

Policy and modules

TorchRL modules interact through TensorDict, a new data carrier class. Using it is not strictly necessary, and workarounds exist, but we advise using the TensorDictModule class to read from and write to tensordicts:

>>> from torch import nn
>>> from torchrl.modules import TensorDictModule
>>> # n_obs and n_act are the observation and action sizes of the environment
>>> policy_module = nn.Linear(n_obs, n_act)
>>> policy = TensorDictModule(policy_module,
...     in_keys=["observation"],  # keys to be read for the module input
...     out_keys=["action"],  # keys to be written with the module output
... )
>>> tensordict = env.reset()
>>> tensordict = policy(tensordict)
>>> action = tensordict["action"]

By using TensorDict and TensorDictModule, you can make sure that your algorithm is robust to changes in configuration (e.g. usage of an RNN for the policy, exploration strategies, etc.). TensorDict instances can be reshaped in several ways, cast to device, updated, shared among processes, stacked, concatenated, etc.

Some specialized TensorDictModule classes are implemented for convenience: Actor, ProbabilisticActor, ValueOperator, ActorCriticOperator, ActorCriticWrapper and QValueActor can be found in actors.py.
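The in_keys / out_keys mechanism described in this section can be pictured with a small plain-Python sketch. This is not the TensorDictModule implementation: KeyedModule and linear_policy are hypothetical names, a plain dict stands in for a TensorDict, and lists of floats stand in for tensors.

```python
class KeyedModule:
    """Hypothetical stand-in: reads in_keys from a dict, writes out_keys back."""

    def __init__(self, fn, in_keys, out_keys):
        self.fn = fn
        self.in_keys = list(in_keys)
        self.out_keys = list(out_keys)

    def __call__(self, data):
        # Gather the inputs by key, in the declared order.
        inputs = [data[key] for key in self.in_keys]
        outputs = self.fn(*inputs)
        # A single out_key means the function returned a single value.
        if len(self.out_keys) == 1:
            outputs = (outputs,)
        for key, value in zip(self.out_keys, outputs):
            data[key] = value
        return data


def linear_policy(observation):
    """Toy policy: a fixed dot product with the observation."""
    weights = [0.5, -0.25]
    return sum(w * x for w, x in zip(weights, observation))


policy = KeyedModule(linear_policy, in_keys=["observation"], out_keys=["action"])
clip = KeyedModule(lambda a: max(-0.5, min(0.5, a)), in_keys=["action"], out_keys=["action"])

data = {"observation": [2.0, 1.0]}
data = clip(policy(data))
print(data["action"])
```

Because each module only declares which keys it reads and writes, the calling code does not change when a module is swapped or another one is chained after it.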

Collecting data

DataCollectors are the TorchRL data loading class family. We provide single-process, sync and async multiprocess loaders. We also provide ReplayBuffers that can be stored in memory or on disk using the various storage options.
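As a rough picture of what a replay buffer does with collected data, here is a minimal in-memory sketch using only the Python standard library. It is not the TorchRL ReplayBuffer API: ToyReplayBuffer and its methods are hypothetical, and uniform sampling without replacement is an assumption made for the example.

```python
import random
from collections import deque


class ToyReplayBuffer:
    """Hypothetical fixed-capacity buffer with uniform random sampling."""

    def __init__(self, capacity, seed=None):
        # A bounded deque drops the oldest transition once capacity is reached.
        self._storage = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def add(self, transition):
        self._storage.append(transition)

    def sample(self, batch_size):
        # Uniform sampling without replacement among stored transitions.
        return self._rng.sample(list(self._storage), batch_size)

    def __len__(self):
        return len(self._storage)


buffer = ToyReplayBuffer(capacity=3, seed=0)
for step in range(5):
    # In practice these transitions would come from a data collector.
    buffer.add({"step": step, "reward": float(step)})

batch = buffer.sample(2)
print(len(buffer), batch)
```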

Loss modules and advantage computation

Loss modules are provided for each algorithm class independently. They are accompanied by efficient implementations of value and advantage computation functions. TorchRL aims to be fully compatible with functorch, the functional programming PyTorch library.
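As an example of what an advantage computation function does, the sketch below implements generalized advantage estimation (GAE) in plain Python. GAE is used here as one representative estimator; the function name, signature and default hyperparameters are assumptions for this illustration and are not taken from the TorchRL API, which operates on tensors.

```python
def toy_gae(rewards, values, next_value, dones, gamma=0.99, lmbda=0.95):
    """Hypothetical GAE over a single trajectory given as Python lists.

    rewards[t], values[t] and dones[t] describe step t; next_value is the
    value estimate of the state that follows the last step.
    """
    num_steps = len(rewards)
    advantages = [0.0] * num_steps
    running = 0.0
    # Walk backwards so each step can reuse the advantage of the next one.
    for t in reversed(range(num_steps)):
        not_done = 0.0 if dones[t] else 1.0
        v_next = next_value if t == num_steps - 1 else values[t + 1]
        # One-step temporal-difference error.
        delta = rewards[t] + gamma * v_next * not_done - values[t]
        # Discounted sum of TD errors, cut at episode ends.
        running = delta + gamma * lmbda * not_done * running
        advantages[t] = running
    # Value targets are the advantages added back onto the value estimates.
    returns = [adv + val for adv, val in zip(advantages, values)]
    return advantages, returns


advantages, returns = toy_gae(
    rewards=[1.0, 1.0, 1.0],
    values=[0.0, 0.0, 0.0],
    next_value=0.0,
    dones=[False, False, True],
    gamma=1.0,
    lmbda=1.0,
)
print(advantages, returns)
```

With gamma and lmbda both set to 1 and zero value estimates, the advantage at each step reduces to the sum of remaining rewards, which makes the result easy to check by hand.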

Examples

Several examples are provided as well. Check the examples directory to learn more about exploration strategies, loss modules, etc.