RLHive: a framework designed to facilitate research in reinforcement learning.

Overview


Installing | Tutorials | Contributing


RLHive

RLHive is a framework designed to facilitate research in reinforcement learning. It provides the components necessary to run a full RL experiment, for both single-agent and multi-agent environments. It is designed to be readable and easily extensible, so that users can quickly run and experiment with their own ideas.

The full documentation and tutorials are available at https://rlhive.readthedocs.io/.

Installing

RLHive is available through pip! For the basic RLHive package, simply run `pip install rlhive`.

You can also install the dependencies necessary for the environments that RLHive comes with by running `pip install rlhive[<extras>]`, where `<extras>` is a comma-separated list made up of the following (see the example below the list):

  • atari
  • gym_minigrid
  • pettingzoo
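
For example, installing the Atari and PettingZoo dependencies together would look like `pip install rlhive[atari,pettingzoo]`.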

In addition to these environments, MinAtar and Marlgrid are also supported, but need to be installed separately.

To install Minatar, run pip install MinAtar@git+https://github.com/kenjyoung/MinAtar.git@8b39a18a60248ede15ce70142b557f3897c4e1eb

To install Marlgrid, run pip install marlgrid@https://github.com/kandouss/marlgrid/archive/refs/heads/master.zip

Tutorials

Contributing

We'd love for you to contribute your own work to RLHive. Before doing so, please read our contributing guide.

Comments
  • Frame Stacking is still not working in `conv.py`

    ```python
    super().__init__()
    self._normalization_factor = normalization_factor

    if isinstance(kernel_sizes, int):
        kernel_sizes = [kernel_sizes] * len(channels)
    if isinstance(strides, int):
        strides = [strides] * len(channels)
    if isinstance(paddings, int):
        paddings = [paddings] * len(channels)

    assert len(channels) == len(kernel_sizes)
    assert len(channels) == len(strides)
    assert len(channels) == len(paddings)

    c, h, w = in_dim
    ```


    The `c` variable should be the number of channels from the environment multiplied by the frame-stacking factor.
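
    A minimal sketch of that change, with example values (the names here are illustrative, not necessarily RLHive's actual ones):

    ```python
    # Illustrative sketch of the suggested fix; `frame_stack` and `in_dim` are
    # example values and names, not necessarily RLHive's actual ones.
    in_dim = (1, 84, 84)   # (channels, height, width) reported by the environment
    frame_stack = 4        # number of stacked frames fed to the network

    c, h, w = in_dim
    c = c * frame_stack    # 1 grayscale channel * 4 stacked frames -> 4 input channels
    print(c, h, w)         # 4 84 84
    ```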

    opened by alirahkay 3
  • Added buffer integration

    This PR adds buffer integration. It changes the signature of the original buffer, updates the DQN interface to work with the buffer and the runner interface to work with the DQN, and updates the relevant tests.

    Efficient buffer cartpole run: https://wandb.ai/dapatil211/Hive-v1/reports/Snapshot-May-14-2021-2-13pm--Vmlldzo2OTEzNDY?accessToken=rr30zh1dje15nvv5nffxfmzjtkcjor849f0q0oydr2hrtcdnx3r9eqdjohxafau3

    feature request 
    opened by dapatil211 3
  • Bug with testing episodes

    I believe this should be < rather than <=, no?

    https://github.com/chandar-lab/RLHive/blob/c5b1b77a3daf87aeb877e10c4f65bcd7e2bc1ff8/hive/runners/base.py#L159
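
    A toy illustration of the suspected off-by-one (not RLHive's actual runner code):

    ```python
    # Toy illustration, not RLHive's actual runner code: when counting from zero,
    # a "<=" check runs one extra test episode beyond the configured limit.
    num_test_episodes = 3

    episodes_run = 0
    while episodes_run <= num_test_episodes:   # the issue suggests "<" instead
        episodes_run += 1

    print(episodes_run)  # 4, i.e. one more episode than requested
    ```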

    opened by hnekoeiq 2
  • Adding Switch env

    This PR adds the Switch env based on this paper (https://arxiv.org/abs/1706.05296). Defining MultiGridEnvHive was needed to add the full-observability feature to marlgrid; it also lets us add more features on top of MultiGridEnv in the future if we want.

    The structure of the environment is shown in the attached image (not reproduced here).

    Some initial experiments: https://wandb.ai/nekoeiha/Hive-v1/reports/Untitled-Report--Vmlldzo4ODY3OTM?accessToken=m17fmcgbnuw0ifxbr708iwtfykf3skksowbl2dghnriyvby6ni48ymip5mfgg96f

    opened by hnekoeiq 2
  • CI/CD dependency installation

    The current CI/CD installs the latest available version of torch to run the tests, so tests that rely on features unavailable in older torch versions still pass. See #280 #282

    opened by sriyash421 1
  • Add num_updates_per_train_step hyperparameter

    Previously we did not have this hyperparameter. It's needed in the Atari100K benchmark for the OverTrainedRainbow. The reason for adding it to the general algorithms is that I think it's an important hyperparameter for RL algorithms in general. I'll be glad to hear your comments on this.
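
    A minimal sketch of the update-to-data idea behind this hyperparameter (the buffer, `collect_transition`, and `update` below are stand-ins, not RLHive's actual API):

    ```python
    import random

    # Illustrative sketch only: perform several gradient updates per environment
    # step, controlled by num_updates_per_train_step. The helpers below are
    # stand-ins, not RLHive's actual interface.
    num_train_steps = 100
    num_updates_per_train_step = 4
    batch_size = 8
    replay_buffer = []

    def collect_transition(step):
        # Stand-in for one environment interaction.
        return {"obs": step, "reward": random.random()}

    def update(batch):
        # Stand-in for one gradient update on a sampled batch.
        pass

    for step in range(num_train_steps):
        replay_buffer.append(collect_transition(step))
        if len(replay_buffer) >= batch_size:
            for _ in range(num_updates_per_train_step):
                update(random.sample(replay_buffer, batch_size))
    ```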

    opened by alirahkay 1
  • Visualization: having a cmap_name in plot_results kwargs is restrictive.

    While working on a recent assignment for 8250, I noticed that the cmap_name in the plot_results function kwargs is overly restrictive.

    My use case: Imagine a user wants to use a custom colormap to visualize model runs, with blues corresponding to one set of seeded runs, reds to another, and so on.

    This is not possible in the current code, which only allows use of default matplotlib cmaps via their string name.
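
    A minimal sketch of a more flexible alternative (the `resolve_cmap` helper is hypothetical, not part of RLHive):

    ```python
    import matplotlib.pyplot as plt
    from matplotlib.colors import Colormap, ListedColormap

    # Hypothetical helper: accept either a colormap name or a Colormap object,
    # instead of restricting callers to named matplotlib colormaps.
    def resolve_cmap(cmap):
        if isinstance(cmap, Colormap):
            return cmap
        return plt.get_cmap(cmap)  # fall back to lookup by name

    # Usage: one custom color per group of seeded runs.
    custom = ListedColormap(["#1f77b4", "#d62728"])
    assert resolve_cmap(custom) is custom
    assert resolve_cmap("viridis").name == "viridis"
    ```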

    opened by JakeColor 0
  • Visualization: trying to overwrite previous images gives misleading error

    While working on a recent assignment for 8250, I noticed that running the same visualization code twice using the hive.utils visualization function throws a misleading NotADirectoryError if the image path already exists (my use case was overwriting the existing image with new results).

    opened by JakeColor 0
Related projects

  • LPN — PyTorch implementation of "Each Part Matters: Local Patterns Facilitate Cross-view Geo-localization" (TCSVT, https://arxiv.org/abs/2008.11646). 46 stars, updated Dec 14, 2022.
  • A web porting of NVlabs' StyleGAN2, to facilitate exploring all kinds of characteristics of StyleGAN networks. K.L., 150 stars, updated Dec 15, 2022.
  • StocksMA — a package to facilitate access to financial and economic data of Moroccan stocks. Salah Eddine LABIAD, 28 stars, updated Jan 4, 2023.
  • CQL-JAX — Conservative Q-Learning for offline reinforcement learning in JAX (FLAX). Karush Suri, 8 stars, updated Nov 7, 2022.
  • Reinforcement-learning — class assignment questions for the course DSE 314/614: Reinforcement Learning. Manav Mishra, 4 stars, updated Apr 15, 2022.
  • banditml — a lightweight contextual bandit and reinforcement learning library designed for production Python services, developed by Bandit ML and ex-authors of Facebook's applied reinforcement learning platform, ReAgent. Bandit ML, 51 stars, updated Dec 22, 2022.
  • Pacman-AI — an AI project designed by UC Berkeley, with reflex and minimax agents for the game Pacman. Jussi Doherty, 1 star, updated Jan 3, 2022.
  • Megaverse — a 3D simulation platform for reinforcement learning and embodied AI research. Aleksei Petrenko, 191 stars, updated Dec 23, 2022.
  • SenseNet — a sensorimotor and touch simulator for deep reinforcement learning research. 59 stars, updated Feb 25, 2022.
  • MiniHack the Planet — a sandbox for open-ended reinforcement learning research. Facebook Research, 338 stars, updated Dec 29, 2022.
  • Detectron — FAIR's research platform for object detection, implementing popular algorithms like Mask R-CNN and RetinaNet (deprecated in favor of detectron2). Facebook Research, 25.5k stars, updated Jan 7, 2023.
  • Trading Using Q-Learning — learning to trade a single stock under the reinforcement learning framework. Uirá Caiado, 470 stars, updated Nov 28, 2022.
  • Plato — a software framework to facilitate scalable federated learning research. System Group@Theory Lab, 192 stars, updated Jan 5, 2023.
  • MazeRL — an application-oriented deep reinforcement learning framework addressing real-world decision problems, covering the RL development life cycle from simulation engineering to agent development, training, and deployment. EnliteAI GmbH, 222 stars, updated Dec 24, 2022.
  • PGPortfolio — source code of "A Deep Reinforcement Learning Framework for the Financial Portfolio Management Problem" (https://arxiv.org/pdf/1706.10059.pdf). Zhengyao Jiang, 1.5k stars, updated Dec 29, 2022.
  • MALib — a parallel framework for population-based multi-agent reinforcement learning. MARL @ SJTU, 348 stars, updated Jan 8, 2023.
  • gym-anm — a framework for designing reinforcement learning (RL) environments that model Active Network Management (ANM) tasks in electricity distribution networks, built on top of the OpenAI Gym toolkit. Robin Henry, 99 stars, updated Dec 12, 2022.