MiniHack the Planet: A Sandbox for Open-Ended Reinforcement Learning Research

Facebook Research

Last update: Dec 29, 2022

MiniHack is a sandbox framework for easily designing rich and diverse environments for Reinforcement Learning (RL). Based on the game of NetHack, arguably the hardest grid-based game in the world, MiniHack uses the NetHack Learning Environment (NLE) to communicate with the game and provide a convenient interface for custom-built RL testbeds.

MiniHack already comes with a large list of challenging tasks. However, it is primarily built for easily designing new ones. The motivation behind MiniHack is to be able to perform RL experiments in a controlled setting while being able to increasingly scale the complexity of the tasks.

To this end, MiniHack leverages the description files of NetHack. The description files (or des-files) are human-readable specifications of levels: distributions of grid layouts together with monsters, objects on the floor, dungeon features, etc. The des-files can be compiled into binary using the NetHack level compiler, and MiniHack maps them to Gym environments. We refer users to our brief overview, detailed tutorial, or interactive notebook for further information on des-files.
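
As a rough illustration of what a des-file contains, the sketch below assembles one as a Python string. The helper `build_des_file` and its parameters are hypothetical and not part of MiniHack; the statements it emits (MAZE, FLAGS, GEOMETRY, MAP/ENDMAP, STAIR, MONSTER) simply mirror the des-file quoted in one of the bug reports further down this page.

```python
def build_des_file(rows, stair=None, monsters=()):
    """Assemble a des-file string (hypothetical helper, not a MiniHack API).

    rows:     list of equal-length strings, one per map row
    stair:    optional (x, y) position of a down staircase
    monsters: iterable of (symbol, name, (x, y)) tuples
    """
    if len({len(row) for row in rows}) != 1:
        raise ValueError("all map rows must have the same width")

    lines = [
        'MAZE: "mylevel", \' \'',
        "FLAGS:premapped",
        "GEOMETRY:center,center",
        "MAP",
        *rows,
        "ENDMAP",
    ]
    if stair is not None:
        lines.append(f"STAIR:({stair[0]}, {stair[1]}),down")
    for symbol, name, (x, y) in monsters:
        lines.append(f"MONSTER:'{symbol}',\"{name}\",({x},{y})")
    return "\n".join(lines) + "\n"


des = build_des_file(
    rows=[".....", ".....", "L....", "..L..", "|...."],
    stair=(4, 4),
    monsters=[("v", "dust vortex", (0, 4))],
)
print(des)
```

Generating the text programmatically is only one option; des-files can equally be written by hand, and the tutorial mentioned above remains the reference for the full syntax.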

Our documentation will walk you through everything you need to know about MiniHack, step-by-step, including information on how to get started, configure environments or design new ones, train baseline agents, and much more.

Installation

MiniHack is available on PyPI and can be installed as follows:

pip install minihack

We advise using a conda environment for this:

conda create -n minihack python=3.8
conda activate minihack
pip install minihack

NOTE: NLE requires cmake>=3.15 to be installed when building the package. See here for how to install it on macOS and Ubuntu 18.04. Windows users should use Docker.

NOTE: Baseline agents have separate installation instructions. See here for more details.

Extending MiniHack

If you wish to extend MiniHack, please install the package as follows:

git clone https://github.com/facebookresearch/minihack
cd minihack
pip install -e ".[dev]"
pre-commit install

Docker

We have provided several Dockerfiles for building images with pre-installed MiniHack. Please follow the instructions described here.

Trying out MiniHack

MiniHack uses the popular Gym interface for the interactions between the agent and the environment. A pre-registered MiniHack environment can be used as follows:

import gym
import minihack
env = gym.make("MiniHack-River-v0")
env.reset() # each reset generates a new environment instance
env.step(1)  # move agent '@' north
env.render()
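
A typical next step is to run a whole episode rather than a single action. The following is a minimal sketch of a random-agent loop, assuming the classic Gym API in which `step` returns `(observation, reward, done, info)`, as in the user snippets quoted later on this page. The function name `random_rollout` is ours and is not part of MiniHack.

```python
def random_rollout(env, max_steps=100):
    """Run one episode with uniformly random actions.

    Returns (number of steps taken, sum of rewards). Works with any
    environment that follows the classic Gym reset/step interface.
    """
    env.reset()
    total_reward = 0.0
    steps = 0
    for _ in range(max_steps):
        action = env.action_space.sample()  # standard Gym action-space call
        _, reward, done, _ = env.step(action)
        total_reward += reward
        steps += 1
        if done:
            break
    return steps, total_reward
```

With MiniHack installed, this could be called as `random_rollout(gym.make("MiniHack-River-v0"))` after `import gym` and `import minihack`.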

To see the list of all MiniHack environments, run:

python -m minihack.scripts.env_list

The following scripts let you play MiniHack environments from the keyboard or watch a random agent:

# Play MiniHack in the terminal as a human
python -m minihack.scripts.play --env MiniHack-River-v0

# Use a random agent
python -m minihack.scripts.play --env MiniHack-River-v0 --mode random

# Play MiniHack with a graphical user interface (GUI)
python -m minihack.scripts.play_gui --env MiniHack-River-v0

NOTE: If the package is installed correctly, the scripts above are also available as the mh-envs, mh-play, and mh-guiplay commands.

Baseline Agents

To help you get started with MiniHack environments, we provide a variety of baseline agent integrations.

TorchBeast

A TorchBeast agent is bundled in minihack.agent.polybeast together with a simple model to provide a starting point for experiments. To install and train this agent, first install torchbeast by following the instructions here, then use the following commands:

pip install ".[polybeast]"
python -m minihack.agent.polybeast.polyhydra env=MiniHack-Room-5x5-v0 total_steps=100000

More information on running our TorchBeast agents, and instructions on how to reproduce the results of the paper, can be found here. The learning curves for all of our polybeast experiments can be accessed in our Weights&Biases repository.

RLlib

An RLlib agent is provided in minihack.agent.rllib, with a similar model to the torchbeast agent. This can be used to try out a variety of different RL algorithms. To install and train an RLlib agent, use the following commands:

pip install ".[rllib]"
python -m minihack.agent.rllib.train algo=dqn env=MiniHack-Room-5x5-v0 total_steps=1000000

More information on running RLlib agents can be found here.

Unsupervised Environment Design

MiniHack also enables research in Unsupervised Environment Design, whereby an adaptive task distribution is learned during training by dynamically adjusting free parameters of the task MDP. Check out the ucl-dark/paired repository for replicating the examples from the paper using PAIRED.

Citation

If you use MiniHack in your work, please cite:

@inproceedings{samvelyan2021minihack,
  title={MiniHack the Planet: A Sandbox for Open-Ended Reinforcement Learning Research},
  author={Mikayel Samvelyan and Robert Kirk and Vitaly Kurin and Jack Parker-Holder and Minqi Jiang and Eric Hambro and Fabio Petroni and Heinrich Kuttler and Edward Grefenstette and Tim Rockt{\"a}schel},
  booktitle={Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 1)},
  year={2021},
  url={https://openreview.net/forum?id=skFwlyefkWJ}
}

If you use our example ported environments, please cite the original papers: MiniGrid (see license, bib), Boxoban (see license, bib).

Contributions and Maintenance

We welcome contributions to MiniHack. If you are interested in contributing, please see this document. Our maintenance plan can be found here.

Papers using MiniHack

Powers et al. CORA: Benchmarks, Baselines, and a Platform for Continual Reinforcement Learning Agents (CMU, Georgia Tech, AI2, August 2021)
Samvelyan et al. MiniHack The Planet (FAIR, UCL, Oxford, NeurIPS 2021)

Open a pull request to add papers.

Comments

Manual pickup multiple items
🐛 Bug

When autopickup=True the agent will attempt to pick up all the objects at a location. If I set autopickup=False, I can use the Command.PICKUP (",") command to pick up an item, but if there are multiple items at that location nothing happens. I don't see any message or prompt either. If this isn't a bug, is there a workaround?

To Reproduce

Steps to reproduce the behavior:

Set autopickup=False

Spawn two different items at a single spot

Attempt to pickup using Command.PICKUP

Expected behavior

NetHack should return a prompt listing the objects available for pickup.

Environment

NLE version: 0.7.3
PyTorch version: 1.10.0+cu113
Is debug build: No
CUDA used to build PyTorch: 11.3

OS: Ubuntu 20.04.2 LTS
GCC version: (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
CMake version: version 3.21.3

Python version: 3.8
Is CUDA available: Yes
CUDA runtime version: Could not collect
GPU models and configuration: GPU 0: NVIDIA GeForce RTX 3090
Nvidia driver version: 495.29.05
cuDNN version: Could not collect

Versions of relevant libraries:
[pip3] numpy==1.21.2
[pip3] torch==1.10.0+cu113
[pip3] torchtext==0.11.0
[conda] Could not collect

Additional context

Could be a problem with nle? If this isn't a bug is there a work around?
bug
opened by kolbytn 6
[BUG] Error creating environment or running mh-play (Mac OSX 12.6)
🐛 Bug

Can't create environment or run play scripts in MacOSX 12.6

To Reproduce

Steps to reproduce the behavior:

Install NLE 0.8.1 following workaround at https://github.com/facebookresearch/nle/issues/340

pip install minihack

mh-play leads to error: AttributeError: 'MiniHackRoom5x5Random' object has no attribute 'env'

python -m minihack.scripts.play --env MiniHack-River-v0 --mode random leads to similar error: AttributeError: 'MiniHackRiver' object has no attribute 'env'

Expected behavior

Environment created successfully

Environment

MiniHack version: 0.1.2
NLE version: 0.8.1+103c667
Gym version: 0.21.0
PyTorch version: N/A
Is debug build: N/A
CUDA used to build PyTorch: N/A

OS: Mac OSX 12.6
GCC version: Could not collect
CMake version: version 3.24.2

Python version: 3.8
Is CUDA available: N/A
CUDA runtime version: Could not collect
GPU models and configuration: Could not collect
Nvidia driver version: Could not collect
cuDNN version: Could not collect

Versions of relevant libraries:
[pip3] numpy==1.23.3
[conda] Could not collect

Additional context

Used the workaround for the NLE install described here: https://github.com/facebookresearch/nle/issues/340
nle-play works as expected, and mh-envs returns the list of environments as expected. The same error occurs with Python 3.9 and 3.10, and with MiniHack version 0.1.3 (not all combinations tested).
bug
opened by tmssmith 5
#64 issue solved: Fix access from base.py to nethack _vardir

#64 issue solved: Fix access from base.py to nethack _vardir. I tested the modification on my local project. It seems that before this change the library could not run because base.py accessed self.env._vardir instead of self.nethack._vardir.
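
The attribute mismatch described here can also be tolerated defensively. The sketch below is a hypothetical helper, not the change made in this pull request: it looks for `_vardir` on `nethack` first and falls back to `env`, the two attribute names mentioned above. Stand-in objects are used so that it runs without NLE installed.

```python
from types import SimpleNamespace


def resolve_vardir(obj):
    """Return the NetHack variable directory from whichever attribute exists.

    Hypothetical helper: tries `obj.nethack._vardir` (the name used in the fix)
    and then `obj.env._vardir` (the name base.py accessed before the fix).
    """
    for holder_name in ("nethack", "env"):
        holder = getattr(obj, holder_name, None)
        vardir = getattr(holder, "_vardir", None)
        if vardir is not None:
            return vardir
    raise AttributeError("neither 'nethack' nor 'env' exposes '_vardir'")


# Stand-in objects that mimic the two layouts; the paths are made up.
new_style = SimpleNamespace(nethack=SimpleNamespace(_vardir="/tmp/new"))
old_style = SimpleNamespace(env=SimpleNamespace(_vardir="/tmp/old"))
print(resolve_vardir(new_style), resolve_vardir(old_style))
```

A fallback like this trades a little indirection for tolerance of both layouts; pinning compatible MiniHack and NLE versions is the simpler alternative.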
CLA Signed core

opened by GeremiaPompei 3
[BUG] minihack.scripts.play doesn't work on Debian 11
🐛 Bug

After installing minihack+nle to Debian 11 the following commands work:

import gym
import minihack
env = gym.make("MiniHack-River-v0")
env.reset() # each reset generates a new environment instance
env.step(1)  # move agent '@' north
env.render()

But when running

python3 -m minihack.scripts.play --env MiniHack-River-v0

the program fails with an error.

To Reproduce

Steps to reproduce the behavior:

Install Debian 11. Install minihack+nle (+deps) using pip install/apt-get commands.

python3 -m minihack.scripts.play --env MiniHack-River-v0

Error messages/traceback:

Traceback (most recent call last):
  File "/usr/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.9/dist-packages/minihack/scripts/play.py", line 334, in <module>
    main()
  File "/usr/local/lib/python3.9/dist-packages/minihack/scripts/play.py", line 330, in main
    play(**vars(flags))
  File "/usr/local/lib/python3.9/dist-packages/minihack/scripts/play.py", line 123, in play
    print("Available actions:", env._actions)
  File "/home/optimus/.local/lib/python3.9/site-packages/gym/core.py", line 235, in __getattr__
    raise AttributeError(
AttributeError: attempted to get missing private attribute '_actions'

Expected behavior

One should be able to play minihack using keyboard commands.

Environment

Collecting environment information...

MiniHack version: 0.1.1
NLE version: 0.7.3
PyTorch version: N/A
Is debug build: N/A
CUDA used to build PyTorch: N/A

OS: Debian GNU/Linux 11 (bullseye)
GCC version: (Debian 10.2.1-6) 10.2.1 20210110
CMake version: version 3.18.4

Python version: 3.9
Is CUDA available: N/A
CUDA runtime version: Could not collect
GPU models and configuration: Could not collect
Nvidia driver version: Could not collect
cuDNN version: Could not collect

Versions of relevant libraries:
[pip3] msgpack-numpy==0.4.7.1
[pip3] numpy==1.19.5
[conda] Could not collect

Additional context

No Anaconda installed.
bug
opened by cslr 3
Is it possible to generate pixel images (ideally cropped) in a desired resolution?
Right now, I use opencv as

env = gym.make(env, observation_keys=("pixel_crop",), penalty_step=0.0)
obs_dict, reward, done, info = env.step(action)
image = cv2.resize(obs_dict['pixel_crop'], dsize=(64, 64), interpolation=cv2.INTER_LINEAR)

I'm wondering if it's possible to avoid this resizing by just directly rendering in the desired resolution.
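
Whether MiniHack can render directly at a target resolution is not answered in this thread. Independently of that, the resize can at least be kept out of the training loop with a thin wrapper. The class below is a hypothetical sketch, not a MiniHack or Gym class: it assumes dictionary observations and the classic `(obs, reward, done, info)` step signature, and it applies a caller-supplied function to one observation key.

```python
class PixelTransform:
    """Apply `transform` to one observation key on every reset and step.

    Hypothetical wrapper for illustration only; `key` defaults to the
    "pixel_crop" observation used in the snippet above.
    """

    def __init__(self, env, transform, key="pixel_crop"):
        self.env = env
        self.transform = transform
        self.key = key

    def _apply(self, obs):
        obs = dict(obs)  # shallow copy so the wrapped env's dict is left untouched
        obs[self.key] = self.transform(obs[self.key])
        return obs

    def reset(self, **kwargs):
        return self._apply(self.env.reset(**kwargs))

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        return self._apply(obs), reward, done, info
```

With OpenCV, `transform` could be `lambda img: cv2.resize(img, dsize=(64, 64), interpolation=cv2.INTER_LINEAR)`, matching the call above. This still resizes after rendering; it only centralizes where that happens.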
enhancement
opened by wcarvalho 2
[BUG] Broken monster generation from des file
🐛 Bug

I'm trying to generate different levels using des-files. It works fine when I use just a map, but adding MONSTER breaks the env: instead of my map it returns different random levels.

To Reproduce

Steps to reproduce the behavior:

Generate env with MONSTER

Try to render it with get_des_file_rendering

this generates random levels:

from minihack.tiles.rendering import get_des_file_rendering
import IPython.display

def render_des_file(des_file, **kwargs):
    image = get_des_file_rendering(des_file, **kwargs)
    IPython.display.display(image)

des = """
MAZE: "mylevel", ' '
FLAGS:premapped
GEOMETRY:center,center
MAP
.....
.....
L....
..L..
|....
ENDMAP
STAIR:(4, 4),down
BRANCH: (0,0,0,0),(1,1,1,1)
MONSTER:'v',"dust vortex",(0,4)
"""
render_des_file(des, n_images=2, full_screen=False)

this works ok:

des = """
MAZE: "mylevel", ' '
FLAGS:premapped
GEOMETRY:center,center
MAP
.....
.....
L....
..L..
|....
ENDMAP
STAIR:(4, 4),down
BRANCH: (0,0,0,0),(1,1,1,1)
"""
render_des_file(des, n_images=2, full_screen=False)

Expected behavior

The env consists of the map described in the des-file.

Environment

Collecting environment information...

MiniHack version: 0.1.3
NLE version: 0.8.1
Gym version: 0.21.0
PyTorch version: 1.12.0+cu113
Is debug build: No
CUDA used to build PyTorch: 11.3

OS: Ubuntu 18.04.5 LTS
GCC version: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
CMake version: version 3.22.5

Python version: 3.7
Is CUDA available: Yes
CUDA runtime version: Could not collect
GPU models and configuration: GPU 0: Tesla T4
Nvidia driver version: 460.32.03
cuDNN version: Probably one of the following:
/usr/lib/x86_64-linux-gnu/libcudnn.so.7.6.5
/usr/lib/x86_64-linux-gnu/libcudnn.so.8.0.5
/usr/lib/x86_64-linux-gnu/libcudnn_adv_infer.so.8.0.5
/usr/lib/x86_64-linux-gnu/libcudnn_adv_train.so.8.0.5
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_infer.so.8.0.5
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_train.so.8.0.5
/usr/lib/x86_64-linux-gnu/libcudnn_ops_infer.so.8.0.5
/usr/lib/x86_64-linux-gnu/libcudnn_ops_train.so.8.0.5

Versions of relevant libraries:
[pip3] numpy==1.21.6
[pip3] torch==1.12.0+cu113
[pip3] torchaudio==0.12.0+cu113
[pip3] torchsummary==1.5.1
[pip3] torchtext==0.13.0
[pip3] torchvision==0.13.0+cu113
[conda] Could not collect
bug
opened by salamantos 2

[BUG] Inconsistent environment seeding

🐛 Bug

Seeding doesn't consistently generate the same environment.

To Reproduce

Steps to reproduce the behavior:

Run this snippet repeatedly:

import gym
import minihack  # importing registers the MiniHack environments

env = gym.make("MiniHack-KeyRoom-Fixed-S5-v0",
    observation_keys=("pixel", "colors", "chars", "glyphs", "tty_chars"),
    seeds=(42, 42, False))
env.seed(42, 42, False)
obs = env.reset()
env.render()
print(env.get_seeds())

Sometimes this prints

Hello Agent, welcome to NetHack!  You are a chaotic male human Rogue.           
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                       ----                                     
                                       |..|                                     
                                       +(.|                                     
                                    ----..|                                     
                                    |.....|                                     
                                    |...@.|                                     
                                    -------                                     
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
Agent the Footpad              St:18/02 Dx:18 Co:13 In:8 Wi:9 Ch:7 Chaotic S:0  
Dlvl:1 $:0 HP:12(12) Pw:2(2) AC:7 Xp:1/0                                        
(42, 42, False)

But also occasionally prints (note the printed seeds are (0, 0, False)):

Hello Agent, welcome to NetHack!  You are a chaotic male human Rogue.           
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                       ----                                     
                                       |@.|                                     
                                       +..|                                     
                                       -..|                                     
                                        ..|                                     
                                        ..|                                     
                                       ----                                     
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
Agent the Footpad              St:14 Dx:18 Co:14 In:11 Wi:11 Ch:8 Chaotic S:0   
Dlvl:1 $:0 HP:12(12) Pw:2(2) AC:7 Xp:1/0                                        
(0, 0, False)

Expected behavior

Same positions of agent/key, and same seeds being printed by env.get_seeds()
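
A check like the following makes this expectation testable. It is a sketch that uses only the calls appearing in the snippet above (`seed`, `reset`, `get_seeds`); the helper names `_fingerprint` and `check_seeding` and the choice of the `chars` observation for comparison are assumptions. Note that it compares resets within one process, whereas the report concerns repeated runs of the script, so a full check would also compare results across processes.

```python
def _fingerprint(value):
    """Turn an observation into something comparable with ==."""
    # NumPy arrays expose tobytes(); fall back to repr() for plain Python values.
    return value.tobytes() if hasattr(value, "tobytes") else repr(value)


def check_seeding(env, seeds=(42, 42, False), key="chars", trials=5):
    """Reseed and reset `trials` times; return True if every run matches the first."""
    reference = None
    for _ in range(trials):
        env.seed(*seeds)
        obs = env.reset()
        current = (_fingerprint(obs[key]), tuple(env.get_seeds()))
        if reference is None:
            reference = current
        elif current != reference:
            return False
    return True
```

Comparing both the observation and the reported seeds covers the two symptoms described above: a different layout and `get_seeds()` returning (0, 0, False).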

Environment


MiniHack version: 0.1.3+57ca418
NLE version: 0.8.1
Gym version: 0.21.0
PyTorch version: 1.11.0+cu102
Is debug build: No
CUDA used to build PyTorch: 10.2

OS: Ubuntu 20.04.3 LTS
GCC version: (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
CMake version: version 3.23.1

Python version: 3.8
Is CUDA available: Yes
CUDA runtime version: Could not collect
GPU models and configuration:
GPU 0: NVIDIA GeForce RTX 3080
GPU 1: NVIDIA GeForce RTX 3080

Nvidia driver version: 510.47.03
cuDNN version: Could not collect

Versions of relevant libraries:
[pip3] numpy==1.21.6
[pip3] torch==1.11.0
[conda] torch                     1.11.0                   pypi_0    pypi

bug

opened by jlin816 2

[FEATURE] Suggested MiniHack Editor Webpage tweaks

🚀 Feature

Could we populate the level with a standard level that demonstrates the individual building blocks (like the one used on the readme)? Could we also have a "clear level" button to get to the empty level that is currently shown when calling https://minihack-editor.github.io/
enhancement

opened by rockt 2
[BUG] No module named 'minihack.version' when importing

When installing via pip, importing fails with "ModuleNotFoundError: No module named 'minihack.version'". This can be worked around by adding a file "version.py" with the correct info to the install directory (version = '0.1.3+4c398d4' git_version = '4c398d480eac26883104e867280d1d3ddbcb9a20').
bug

opened by nesou2 2
With it as-is, I get 'can only concatenate list (not tuple) to list'

I can't currently run the fb-internal minihack due to this bug. Here's the obvious fix; if there's one that's more suitable, let me know.

Basically what happened here was that when I switched from the public minihack to the fb internal one, I started getting this concat issue. I'm not 100% sure what changed, but the basic issue is that before, a list was acceptable as inputs to the observation keys, but now it isn't. By casting to the consistent type, both should be acceptable.
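
The failure mode and the cast can be reproduced in isolation. In this sketch the names `REQUIRED_KEYS` and `merge_observation_keys` are hypothetical stand-ins rather than MiniHack's code; only the list/tuple behaviour shown is standard Python.

```python
REQUIRED_KEYS = ("glyphs",)  # hypothetical keys the environment always needs


def merge_observation_keys(observation_keys):
    """Accept a list or a tuple of keys and return a single tuple."""
    # list + tuple raises TypeError, so normalise to tuple before concatenating.
    return tuple(observation_keys) + REQUIRED_KEYS


try:
    ["chars"] + REQUIRED_KEYS
except TypeError as err:
    print("without the cast:", err)

print(merge_observation_keys(["chars"]))
print(merge_observation_keys(("chars", "pixel")))
```

Casting at the boundary means callers can pass either type without the rest of the code needing to care which one it received.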
CLA Signed

opened by SamNPowers 2

[BUG] Minihack does not work with NLE v0.9.0

🐛 Bug

Minihack does not work with NLE v0.9.0

To Reproduce

Follow the Trying Out MiniHack example

/usr/local/lib/python3.7/dist-packages/minihack/base.py in _patch_nhdat(self, des_file)
    366         """
    367         if not des_file.endswith(".des"):
--> 368             fpath = os.path.join(self.env._vardir, "mylevel.des")
    369             # If the des-file is passed as a string
    370             with open(fpath, "w") as f:

AttributeError: 'MiniHackRiver' object has no attribute 'env'

bug

opened by ngoodger 1

[FEATURE] monobeast baseline implementation

🚀 Feature

The current polybeast implementation has most of its code written in C++. Requesting a monobeast implementation for more clarity.

Motivation

readability/flexibility

Pitch

A monobeast implementation will offer more readability and flexibility.

Alternatives

N/A

Additional context

N/A
enhancement

opened by Andrewzh112 0

Releases(v0.1.4)

v0.1.4(Dec 9, 2022)
Installing MiniHack

Install with pip: pip install minihack==0.1.4.

See README.md for further instructions.

New in MiniHack v0.1.4

MiniHack version 0.1.4 (#67, @samvelyan)

Gym issue fix (#58, @samvelyan)

pushing the fix for more height in the logo (#49, @Bam4d)

[WIP] Bam4d/level editor (#46, @Bam4d)

📝 Documentation

Deleted level editor site code (#50, @samvelyan)

🔨 Maintenance

Fixing the seeding issue (#68, @samvelyan)

Fixing the NetHack variable renaming and _underscore access recently introduced in NLE==0.9.0 (#66, @samvelyan)

🎡 Environment

Fixing the NetHack variable renaming and _underscore access recently introduced in NLE==0.9.0 (#66, @samvelyan)

Fix forced actions (#55, @ian-cannon)

v0.1.3(Mar 14, 2022)
Installing MiniHack

Install with pip: pip install minihack==0.1.3.

See README.md for further instructions.

New in MiniHack v0.1.3

📝 Documentation

MiniHack Environment Zoo (#38, @samvelyan)

🔨 Maintenance

A flag for including pet to the game (#40, @samvelyan)

🎡 Environment

Turned autopickup off for ExploreMaze envs (#45, @samvelyan)

Fixing boxoban level data path (#42, @samvelyan)

v0.1.2(Nov 30, 2021)
Installing MiniHack

Install with pip: pip install minihack==0.1.2.

See README.md for further instructions.

New in MiniHack v0.1.2

Cached Environment Wrapper (#33, @samvelyan)

Printing the gym version in the collect_env script (#30, @samvelyan)

Update README.md (#24, @samvelyan)

Updating the PR labeler and Release Drafter (#23, @samvelyan)

📝 Documentation

Fixes to the documentation (#37, @samvelyan)

Updating docs (#25, @samvelyan)

Fixed Typo (#22, @mohamadmansourX)

🔨 Maintenance

Bump the MiniHack and NLE versions (#36, @samvelyan)

Supporting gym version 0.21.0 (#31, @samvelyan)

🎡 Environment

Fixing seeding in MiniGrid (#34, @samvelyan)

Supporting gym version 0.21.0 (#31, @samvelyan)

v0.1.1(Sep 30, 2021)
Installing MiniHack

Install with pip: pip install minihack==0.1.1.

See README.md for further instructions.

New in MiniHack v0.1.1

Added a workflow for testing and pushing the releases to PyPI (#21, @samvelyan)

Importing Pillow whenever needed. (#20, @samvelyan)

Release drafter GitHub workflow (#19, @samvelyan)

Being able to save gifs when evaluating pre-trained agents (#18, @samvelyan)

Updating README (#16, @samvelyan)

Update MANIFEST.in (#15, @samvelyan)

📝 Documentation

Updating REAMDE (#17, @samvelyan)


