Crafter

Open world survival environment for reinforcement learning.

[Image: Crafter Terrain]

Highlights

Crafter is a procedurally generated 2D world in which the agent finds food, avoids or defends against zombies, and collects materials to build tools, which in turn unlock new materials.

  • Generalization: New procedurally generated map for each episode.
  • Exploration: Materials unlock new tools which unlock new materials.
  • Memory: Input images show small part of the world centered at the agent.
  • No trivial behaviors: Must find food and avoid or defend against zombies.
  • Easy: Flat categorical action space with 12 actions.
  • Fast: Runs at 2000 FPS on a simple laptop.
  • Reproducible: All randomness is controlled by a seed.

Play Yourself

You can play the game yourself with an interactive window and keyboard input. The mapping from keys to actions, health level, and inventory state are printed to the terminal.

# Install with GUI
pip3 install 'crafter[gui]'

# Start the game
crafter

# Alternative way to start the game
python3 -m crafter.run_gui

[Image: Crafter Video]

The following optional command line flags are available:

Flag Default Description
--window <size> 800 Window size in pixels, used as width and height.
--fps <integer> 5 How many times to update the environment per second.
--record <filename>.mp4 None Record a video of the trajectory.
--area <width> <height> 64 64 The size of the world in cells.
--view <distance> 6 The view distance of the player in cells.
--length <integer> None Time limit for the episode.
--seed <integer> None Determines world generation and creatures.

Training Agents

Installation: pip3 install -U crafter

The environment follows the OpenAI Gym interface:

import crafter

env = crafter.Env(seed=0)
obs = env.reset()
assert obs['image'].shape == (64, 64, 3)

done = False
while not done:
  action = env.action_space.sample()
  obs, reward, done, info = env.step(action)
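
For logging, the environment can also be wrapped in crafter.Recorder, which appears in the issue reports further below; a minimal sketch of that pattern (keyword arguments taken from those reports) is:

import crafter

env = crafter.Env(seed=0)
# Wrap the environment to save statistics, videos, and episodes to ./logs.
env = crafter.Recorder(
  env, './logs',
  save_stats=True,
  save_video=True,
  save_episode=True,
)

obs = env.reset()
done = False
while not done:
  obs, reward, done, info = env.step(env.action_space.sample())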

Environment Details

Constructor

For comparability between papers, we recommend using the environment in its default configuration. Nonetheless, the environment can be configured via its constructor:

crafter.Env(area=(64, 64), view=4, size=64, length=10000, seed=None)

Parameter Default Description
area (64, 64) Size of the world in cells.
view 4 View distance of the player in cells.
size 64 Render size of the images, used for both width and height.
length 10000 Time limit for the episode, can be None.
health 10 Initial health level of the player.
seed None Integer that determines world generation and creatures.

Reward

The reward is sparse. It can either be given to the agent or used as a proxy metric for evaluating unsupervised agents. The reward is 1 when the agent unlocks a new achievement and 0 for all other time steps. The list of achievements is as follows:

  • find_food
  • defeat_zombie
  • collect_wood
  • place_table
  • make_wood_pickaxe
  • collect_stone
  • place_stone
  • make_stone_pickaxe
  • collect_coal
  • collect_iron
  • place_furnace
  • make_iron_pickaxe
  • collect_diamond

The set of unlocked achievements can also be accessed via the info dictionary.
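
As an example, here is a sketch of tracking newly unlocked achievements during a rollout, assuming the info dictionary exposes an 'achievements' mapping from achievement name to count (the exact key may differ between versions):

import crafter

env = crafter.Env(seed=0)
obs = env.reset()
unlocked = set()

done = False
while not done:
  obs, reward, done, info = env.step(env.action_space.sample())
  # The reward is 1 exactly when a new achievement is unlocked.
  for name, count in info.get('achievements', {}).items():
    if count > 0 and name not in unlocked:
      unlocked.add(name)
      print('Unlocked:', name)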

Termination

The episode terminates when the health points of the agent reach zero. Episodes also end when reaching the time limit, which is 10,000 steps by default.

Observation Space

Each observation is a dictionary that contains a local image centered at the agent and counters for player health and inventory. The following keys are available:

Key Space
image Box(0, 255, (64, 64, 3), np.uint8)
health Box(0, 255, (), np.uint8)
wood Box(0, 255, (), np.uint8)
stone Box(0, 255, (), np.uint8)
iron Box(0, 255, (), np.uint8)
diamond Box(0, 255, (), np.uint8)
wood_pickaxe Box(0, 255, (), np.uint8)
stone_pickaxe Box(0, 255, (), np.uint8)
iron_pickaxe Box(0, 255, (), np.uint8)
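
As a short sketch, the dictionary observation can be unpacked using the keys listed in the table above:

import numpy as np
import crafter

env = crafter.Env(seed=0)
obs = env.reset()

image = obs['image']          # (64, 64, 3) uint8 local view centered at the agent
health = int(obs['health'])   # scalar uint8 counter
inventory = {key: int(obs[key]) for key in (
    'wood', 'stone', 'iron', 'diamond',
    'wood_pickaxe', 'stone_pickaxe', 'iron_pickaxe')}
assert image.dtype == np.uint8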

Action Space

The action space is categorical. Each action is an integer index representing one of the 12 possible actions:

Integer Name Requirement
0 noop Always applicable.
1 left Flat ground left to the agent.
2 right Flat ground right to the agent.
3 up Flat ground above the agent.
4 down Flat ground below the agent.
5 grab_or_attack Facing a creature or material and carrying the necessary tool.
6 place_stone Stone in inventory.
7 place_table Wood in inventory.
8 place_furnace Stone in inventory.
9 make_wood_pickaxe Nearby table and wood in inventory.
10 make_stone_pickaxe Nearby table; wood and stone in inventory.
11 make_iron_pickaxe Nearby furnace; wood, coal, and iron in inventory.
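
The integer indices can be given readable names by restating the table above as a small mapping; this is only a convenience, and the environment itself expects the plain integer:

import crafter

# Name-to-index mapping copied from the table above.
ACTIONS = {
  'noop': 0, 'left': 1, 'right': 2, 'up': 3, 'down': 4,
  'grab_or_attack': 5, 'place_stone': 6, 'place_table': 7,
  'place_furnace': 8, 'make_wood_pickaxe': 9,
  'make_stone_pickaxe': 10, 'make_iron_pickaxe': 11,
}

env = crafter.Env(seed=0)
obs = env.reset()
obs, reward, done, info = env.step(ACTIONS['left'])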

Questions

Please open an issue on GitHub.

Comments
  • There is a functools has no "cache" error when I try to import crafter

    The installation seems to go well, but when I try to import crafter it doesn't work. I looked for ways to solve this problem, but the only solution I could find was to upgrade to Python 3.9, which I can't do since I plan on using an online service like Google Colab or Kaggle. What would you suggest I do?

    opened by DanielS684 5
  • How can I render when training an agent?

    Hi, danijar.

    Thanks for making a good environment.

    I could follow 'Play Yourself' and play the game myself. However, when I import crafter in a Python script I only get the (obs, reward, done, info) tuple; the GUI window does not appear, and env.render() is also not working.

    Could you tell me how I can render from a Python script?

    opened by LeejwUniverse 4
  • Potentially port to the Griddly RL Engine

    Hi @danijar!

    I'm the creator of Griddly https://griddly.readthedocs.io/en/latest/ and I'm pretty sure it would be possible to port most of the crafter environment to use the Griddly Engine.

    Using Griddly you get a bunch of features that work out of the box:

    • Multi-agent interfaces (just put more players in the level map and it just works)
      • Hardware-accelerated video recording, per-agent and globally.
    • Really easily modifiable mechanics (you just change the YAML a bit)
    • 5 different ways to produce observations:
      • Vector observations at about 70k observations per second (on a single thread)
      • 3 different GPU-accelerated observation types (probably around 4k obs per second, on a single thread)
      • ASCII observations (if you're one of those NetHack/Dwarf Fortress nerds)
    • RLlib support (if you want to do multiple policies/agents, competitive/collaborative, etc.)

    It would also be super simple to re-skin crafter like this for interesting demos (image attached).

    What do you think? There are definitely some things I'd have to add to Griddly (such as the chasing mechanics), but I've been looking to do that for a while anyway.

    I'll see if I can make a super simple example and add some images/gifs this weekend.

    opened by Bam4d 3
  • Rainbow Hyperparameters

    Hi. The paper mentions

    Rainbow (Hessel et al., 2018) is based on Q-Learning and combines several advances, including for exploration. The defaults for Atari did not work well, so we tuned the hyper parameters for Crafter and found a compromise between Atari defaults and the data-efficient version of the method (van Hasselt et al., 2019) to be ideal.

    Are these hyperparameters available anywhere, and if not, can you please share them? The crafter baselines for Rainbow have the same hyperparameters as Kaixhin's Rainbow (which are designed for ALE), and there are no hyperparameter values mentioned in the paper. Thanks for your time!

    opened by rfali 1
  • Clarification about Training and Evaluation Steps

    Hi. I am a little confused about how many steps the agent should be trained and then evaluated on. To add a little context, the paper mentions on page 2:

    Crafter evaluates many different abilities of an agent by training only on a single environment for 5M steps

    and this can be seen in crafter_baselines code as well (e.g. PPO, Rainbow)

    But in Sec 3.3 of the paper

    An agent is granted a budget of 1M environment steps to interact with the environment.

    elsewhere (pg 6, sec 4.1), multiple figures

    budget of 1M environment steps

    Table A.1

    It is computed across all training episodes within the budget of 1M environment steps

    I also see that you commented out evaluation code from the Rainbow code (here).

    What I can make of this is that I need to run the crafter agent for 1M steps (I saw the PPO example) and then use the saved stats (json file?) and the analysis code to calculate the success rate and score. Precisely, using the existing crafter code, how can I go from training to plotting meaningful results? Can you please clarify? Thanks

    opened by rfali 1
  • Global View of the map

    Hi! Is there any way to generate a global view of the map like the ones in the readme? I have seen a Global View class in engine.py but it is not implemented. Thank you!

    opened by roger-creus 1
  • Integration in Envpool

    Hi!

    I was wondering if there exists a plan to integrate the crafter learning environment in Envpool. I think it would be very useful since it would allow training agents in parallel to benefit from the high-performance framework of Envpool, and build on top of great baselines.

    There is this guide, which can be useful.

    Thank you very much!

    opened by roger-creus 1
  • Can't initialize the environment

    I followed the instructions but I got the error:

    "reset() missing 1 required positional argument: 'self'"

    When using:

    env = crafter.Recorder(
      env, './',
      save_stats=True,
      save_video=True,
      save_episode=True,
    )

    env.reset()

    Any solution? I really would like to try this environment :(

    opened by ImaGonEs 1
  • VideoRecorder saving path error

    First of all, thank you for this environment!

    I think VideoRecorder has a small bug that causes duplicate directories in the path, which causes an error when saving logs.

    Example of an error for directory="videos":

    FileNotFoundError: The directory '/home/project_path/videos/videos' does not exist
    

    Removing self._directory / from filename fixes this error for me. https://github.com/danijar/crafter/blob/e36271d45f282c991261ba18997d410c26a08bd0/crafter/recorder.py#L98

    opened by Howuhh 1
  • Saving videos

    Hi!,

    I am having problems with the recorder, as

    env = crafter.Recorder(
      env, './logs',
      save_stats=True,
      save_video=True,
      save_episode=True,
    )
    

    returns TypeError: render() takes 1 positional argument but 2 were given

    and

    env.render()

    returns AttributeError: 'NoneType' object has no attribute 'get'

    opened by roger-creus 0
  • Error when using env.render() along with stable-baselines3

    Hello,

    In the gym.Env class, the first argument of render() is defined as the mode, which is either "human" or "rgb_array". In Crafter, the render function takes only one argument, size. This causes errors when using stable-baselines3 to evaluate policies and create videos, since their helper functions assume that envs follow gym.Env and pass the mode as the first argument.

    A fix would be to just add a dummy first argument mode to the render function:

    https://github.com/danijar/crafter/blob/e955b1135797d8ca783d5706e8876e0b3b8babb3/crafter/env.py#L120-L122
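
    A minimal sketch of such a shim, written here as a hypothetical gym.Wrapper rather than a change to crafter itself, could look like this:

    import gym
    import crafter

    class RenderModeWrapper(gym.Wrapper):
      # Accept gym's `mode` argument, ignore it, and delegate to crafter's render().
      def render(self, mode='rgb_array', **kwargs):
        return self.env.render(**kwargs)

    env = RenderModeWrapper(crafter.Env(seed=0))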

    opened by shivakanthsujit 3