RLHive: a framework designed to facilitate research in reinforcement learning.

Overview


Installing | Tutorials | Contributing


RLHive

RLHive is a framework designed to facilitate research in reinforcement learning. It provides the components necessary to run a full RL experiment, for both single-agent and multi-agent environments. It is designed to be readable and easily extensible, so that users can quickly run and experiment with their own ideas.

The full documentation and tutorials are available at https://rlhive.readthedocs.io/.

Installing

RLHive is available through pip! For the basic RLHive package, simply run `pip install rlhive`.

You can also install the dependencies necessary for the environments that RLHive comes with by running `pip install rlhive[<extras>]`, where `<extras>` is a comma-separated list made up of the following (see the example below the list):

  • atari
  • gym_minigrid
  • pettingzoo
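
For example, installing the Atari and PettingZoo dependencies together would look like `pip install rlhive[atari,pettingzoo]`.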

In addition to these environments, MinAtar and Marlgrid are also supported, but need to be installed separately.

To install Minatar, run pip install MinAtar@git+https://github.com/kenjyoung/MinAtar.git@8b39a18a60248ede15ce70142b557f3897c4e1eb

To install Marlgrid, run pip install marlgrid@https://github.com/kandouss/marlgrid/archive/refs/heads/master.zip

Tutorials

Contributing

We'd love for you to contribute your own work to RLHive. Before doing so, please read our contributing guide.

Comments
  • Frame Stacking is still not working in `conv.py`

    ```python
    super().__init__()
    self._normalization_factor = normalization_factor

    if isinstance(kernel_sizes, int):
        kernel_sizes = [kernel_sizes] * len(channels)
    if isinstance(strides, int):
        strides = [strides] * len(channels)
    if isinstance(paddings, int):
        paddings = [paddings] * len(channels)

    assert len(channels) == len(kernel_sizes)
    assert len(channels) == len(strides)
    assert len(channels) == len(paddings)

    c, h, w = in_dim
    ```


    The `c` variable should be the number of channels from the environment multiplied by the frame-stacking factor.
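
    A minimal sketch of that change, with example values (the names here are illustrative, not necessarily RLHive's actual ones):

    ```python
    # Illustrative sketch of the suggested fix; `frame_stack` and `in_dim` are
    # example values and names, not necessarily RLHive's actual ones.
    in_dim = (1, 84, 84)   # (channels, height, width) reported by the environment
    frame_stack = 4        # number of stacked frames fed to the network

    c, h, w = in_dim
    c = c * frame_stack    # 1 grayscale channel * 4 stacked frames -> 4 input channels
    print(c, h, w)         # 4 84 84
    ```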

    opened by alirahkay 3
  • Added buffer integration

    This PR adds buffer integration. It changes the signature of the original buffer, updates the DQN interface to work with the buffer and the runner interface to work with the DQN, and updates the relevant tests.

    Efficient buffer cartpole run: https://wandb.ai/dapatil211/Hive-v1/reports/Snapshot-May-14-2021-2-13pm--Vmlldzo2OTEzNDY?accessToken=rr30zh1dje15nvv5nffxfmzjtkcjor849f0q0oydr2hrtcdnx3r9eqdjohxafau3

    feature request 
    opened by dapatil211 3
  • Bug with testing episodes

    I believe this should be < rather than <=, no?

    https://github.com/chandar-lab/RLHive/blob/c5b1b77a3daf87aeb877e10c4f65bcd7e2bc1ff8/hive/runners/base.py#L159
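
    A toy illustration of the suspected off-by-one (not RLHive's actual runner code):

    ```python
    # Toy illustration, not RLHive's actual runner code: when counting from zero,
    # a "<=" check runs one extra test episode beyond the configured limit.
    num_test_episodes = 3

    episodes_run = 0
    while episodes_run <= num_test_episodes:   # the issue suggests "<" instead
        episodes_run += 1

    print(episodes_run)  # 4, i.e. one more episode than requested
    ```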

    opened by hnekoeiq 2
  • Adding Switch env

    This PR adds the Switch env based on this paper (https://arxiv.org/abs/1706.05296). Defining MultiGridEnvHive was needed to add the full-observability feature to marlgrid; it also lets us add more features on top of MultiGridEnv in the future if we want.

    The structure of the environment is shown in the attached image (not reproduced here).

    Some initial experiments: https://wandb.ai/nekoeiha/Hive-v1/reports/Untitled-Report--Vmlldzo4ODY3OTM?accessToken=m17fmcgbnuw0ifxbr708iwtfykf3skksowbl2dghnriyvby6ni48ymip5mfgg96f

    opened by hnekoeiq 2
  • CI/CD dependency installation

    The current CI/CD installs the latest available version of torch to run the tests, so tests that rely on features unavailable in older torch versions still pass. See #280 #282

    opened by sriyash421 1
  • Add num_updates_per_train_step hyperparameter

    Previously we did not have this hyperparameter. It's needed in the Atari100K benchmark for the OverTrainedRainbow. The reason for adding it to the general algorithms is that I think it's an important hyperparameter for RL algorithms in general. I'll be glad to hear your comments on this.
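
    A minimal sketch of the update-to-data idea behind this hyperparameter (the buffer, `collect_transition`, and `update` below are stand-ins, not RLHive's actual API):

    ```python
    import random

    # Illustrative sketch only: perform several gradient updates per environment
    # step, controlled by num_updates_per_train_step. The helpers below are
    # stand-ins, not RLHive's actual interface.
    num_train_steps = 100
    num_updates_per_train_step = 4
    batch_size = 8
    replay_buffer = []

    def collect_transition(step):
        # Stand-in for one environment interaction.
        return {"obs": step, "reward": random.random()}

    def update(batch):
        # Stand-in for one gradient update on a sampled batch.
        pass

    for step in range(num_train_steps):
        replay_buffer.append(collect_transition(step))
        if len(replay_buffer) >= batch_size:
            for _ in range(num_updates_per_train_step):
                update(random.sample(replay_buffer, batch_size))
    ```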

    opened by alirahkay 1
  • Visualization: having a cmap_name in plot_results kwargs is restrictive.

    While working on a recent assignment for 8250, I noticed that the cmap_name in the plot_results function kwargs is overly restrictive.

    My use case: Imagine a user wants to use a custom colormap to visualize model runs, with blues corresponding to one set of seeded runs, reds to another, and so on.

    This is not possible in the current code, which only allows use of default matplotlib cmaps via their string name.
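
    A minimal sketch of a more flexible alternative (the `resolve_cmap` helper is hypothetical, not part of RLHive):

    ```python
    import matplotlib.pyplot as plt
    from matplotlib.colors import Colormap, ListedColormap

    # Hypothetical helper: accept either a colormap name or a Colormap object,
    # instead of restricting callers to named matplotlib colormaps.
    def resolve_cmap(cmap):
        if isinstance(cmap, Colormap):
            return cmap
        return plt.get_cmap(cmap)  # fall back to lookup by name

    # Usage: one custom color per group of seeded runs.
    custom = ListedColormap(["#1f77b4", "#d62728"])
    assert resolve_cmap(custom) is custom
    assert resolve_cmap("viridis").name == "viridis"
    ```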

    opened by JakeColor 0
  • Visualization: trying to overwrite previous images gives misleading error

    While working on a recent assignment for 8250, I noticed that running the same visualization code twice using the hive.utils visualization function throws a misleading NotADirectoryError if the image path already exists (my use case was overwriting the existing image with new results).

    opened by JakeColor 0
Related projects

  • LPN — PyTorch implementation of "Each Part Matters: Local Patterns Facilitate Cross-view Geo-localization" (TCSVT, https://arxiv.org/abs/2008.11646). 46 stars, updated Dec 14, 2022.
  • A web porting of NVlabs' StyleGAN2, to facilitate exploring all kinds of characteristics of StyleGAN networks. K.L., 150 stars, updated Dec 15, 2022.
  • StocksMA — a package to facilitate access to financial and economic data of Moroccan stocks. Salah Eddine LABIAD, 28 stars, updated Jan 4, 2023.
  • CQL-JAX — Conservative Q-Learning for offline reinforcement learning in JAX (FLAX). Karush Suri, 8 stars, updated Nov 7, 2022.
  • Reinforcement-learning — class assignment questions for the course DSE 314/614: Reinforcement Learning. Manav Mishra, 4 stars, updated Apr 15, 2022.
  • banditml — a lightweight contextual bandit and reinforcement learning library designed for production Python services, developed by Bandit ML and ex-authors of Facebook's applied reinforcement learning platform, ReAgent. Bandit ML, 51 stars, updated Dec 22, 2022.
  • Pacman-AI — an AI project designed by UC Berkeley, with reflex and minimax agents for the game Pacman. Jussi Doherty, 1 star, updated Jan 3, 2022.
  • Megaverse — a 3D simulation platform for reinforcement learning and embodied AI research. Aleksei Petrenko, 191 stars, updated Dec 23, 2022.
  • SenseNet — a sensorimotor and touch simulator for deep reinforcement learning research. 59 stars, updated Feb 25, 2022.
  • MiniHack the Planet — a sandbox for open-ended reinforcement learning research. Facebook Research, 338 stars, updated Dec 29, 2022.
  • Detectron — FAIR's research platform for object detection, implementing popular algorithms like Mask R-CNN and RetinaNet (deprecated in favor of detectron2). Facebook Research, 25.5k stars, updated Jan 7, 2023.
  • Trading Using Q-Learning — learning to trade a single stock under the reinforcement learning framework. Uirá Caiado, 470 stars, updated Nov 28, 2022.
  • Plato — a software framework to facilitate scalable federated learning research. System Group@Theory Lab, 192 stars, updated Jan 5, 2023.
  • MazeRL — an application-oriented deep reinforcement learning framework addressing real-world decision problems, covering the RL development life cycle from simulation engineering to agent development, training, and deployment. EnliteAI GmbH, 222 stars, updated Dec 24, 2022.
  • PGPortfolio — source code of "A Deep Reinforcement Learning Framework for the Financial Portfolio Management Problem" (https://arxiv.org/pdf/1706.10059.pdf). Zhengyao Jiang, 1.5k stars, updated Dec 29, 2022.
  • MALib — a parallel framework for population-based multi-agent reinforcement learning. MARL @ SJTU, 348 stars, updated Jan 8, 2023.
  • gym-anm — a framework for designing reinforcement learning (RL) environments that model Active Network Management (ANM) tasks in electricity distribution networks, built on top of the OpenAI Gym toolkit. Robin Henry, 99 stars, updated Dec 12, 2022.