DeepMind Alchemy task environment: a meta-reinforcement learning benchmark

The DeepMind Alchemy environment is a meta-reinforcement learning benchmark that presents tasks sampled from a task distribution with deep underlying structure. It was created to test agents' ability to reason and plan via latent-state inference, as well as to explore and experiment usefully. The environment is Unity-based.

Overview

This environment is provided through pre-packaged Docker containers.

This package consists of support code to run these Docker containers. You interact with the task environment via a dm_env Python interface.

Please see the documentation for more detailed information on the available tasks, actions and observations.

Requirements

dm_alchemy requires Docker, Python 3.6.1 or later, and an x86-64 CPU with SSE4.2 support. We do not attempt to maintain a working version for Python 2.

Alchemy is intended to be run on Linux and is not officially supported on Mac or Windows. However, it can in principle be run on any platform, though installation may be more of a headache. In particular, on Windows you will need to install and run Alchemy with WSL.

Note: We recommend using a Python virtual environment to mitigate conflicts with your system's Python environment.

Download and install Docker, then ensure it is working correctly by running:

$ docker run -d gcr.io/deepmind-environments/alchemy:v1.0.0
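
If the container starts successfully, docker ps should list a running alchemy:v1.0.0 container, which you can stop afterwards with docker stop <container-id>.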

Installation

You can install dm_alchemy by cloning a local copy of our GitHub repository:

$ git clone https://github.com/deepmind/dm_alchemy.git
$ pip install wheel
$ pip install --upgrade setuptools
$ pip install ./dm_alchemy

To also install the dependencies for the examples/ directory, run:

$ pip install ./dm_alchemy[examples]
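
For example, with the extras installed you can launch the interactive human agent (the script path is assumed here from the repository layout; the flags match the usage shown in the issues below):

$ python examples/human_agent.py --seed 123 --level_name 'alchemy/perceptual_mapping_randomized_with_rotation_and_random_bottleneck'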

Usage

Once dm_alchemy is installed, run the following to instantiate a dm_env environment:

import dm_alchemy

LEVEL_NAME = ('alchemy/perceptual_mapping_'
              'randomized_with_rotation_and_random_bottleneck')
settings = dm_alchemy.EnvironmentSettings(seed=123, level_name=LEVEL_NAME)
env = dm_alchemy.load_from_docker(settings)
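
Once loaded, the environment follows the standard dm_env API. Continuing the snippet above, here is a minimal interaction sketch; it assumes a dictionary action spec (as with other dm_env_rpc-based environments), so inspect env.action_spec() and env.observation_spec() for the level you load, since specs vary by task.

# Sketch only: run one episode with placeholder actions.
timestep = env.reset()
while not timestep.last():
    # generate_value() returns an arbitrary value conforming to each
    # sub-spec; a real agent would select actions from its policy.
    action = {name: spec.generate_value()
              for name, spec in env.action_spec().items()}
    timestep = env.step(action)
env.close()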

For more details see the introductory colab.

Citing Alchemy

If you use Alchemy in your work, please cite the accompanying technical report:

@article{wang2021alchemy,
    title={Alchemy: A structured task distribution for meta-reinforcement learning},
    author={Jane Wang and Michael King and Nicolas Porcel and Zeb Kurth-Nelson
        and Tina Zhu and Charlie Deck and Peter Choy and Mary Cassin and
        Malcolm Reynolds and Francis Song and Gavin Buttimore and David Reichert
        and Neil Rabinowitz and Loic Matthey and Demis Hassabis and Alex Lerchner
        and Matthew Botvinick},
    year={2021},
    journal={arXiv preprint arXiv:2102.02926},
    url={https://arxiv.org/abs/2102.02926},
}

Notice

This is not an officially supported Google product.

Comments
  •     raise grpc.FutureTimeoutError() grpc.FutureTimeoutError

    I followed the instructions from the README below:

    $ git clone https://github.com/deepmind/dm_alchemy.git
    $ pip install wheel
    $ pip install --upgrade setuptools
    $ pip install ./dm_alchemy
    $ pip install pygame
    

    and then tried to run human_agent.py with the command below, which produced this error:

    python human_agent.py --seed 123 --level_name 'alchemy/perceptual_mapping_randomized_with_rotation_and_random_bottleneck'

    pygame 2.0.1 (SDL 2.0.14, Python 3.7.4)
    Hello from the pygame community. https://www.pygame.org/contribute.html
    I0212 16:02:42.783728 140172242384704 _load_environment.py:377] Downloading docker image "gcr.io/deepmind-environments/alchemy:v1.0.0"...
    I0212 16:03:59.853875 140172242384704 _load_environment.py:379] Download finished.
    Traceback (most recent call last):
      File "human_agent.py", line 186, in <module>
        app.run(main)
      File "/home/venv/lib/python3.7/site-packages/absl/app.py", line 303, in run
        _run_main(main, args)
      File "/home/venv/lib/python3.7/site-packages/absl/app.py", line 251, in _run_main
        sys.exit(main(argv))
      File "human_agent.py", line 117, in main
        environment_variables=environment_variables) as env:
      File "/home/venv/lib/python3.7/site-packages/dm_alchemy/_load_environment.py", line 389, in load_from_docker
        connection_details=_connect_to_environment(port, settings),
      File "/home/venv/lib/python3.7/site-packages/dm_alchemy/_load_environment.py", line 246, in _connect_to_environment
        channel, connection = _create_channel_and_connection(port)
      File "/home/venv/lib/python3.7/site-packages/dm_alchemy/_load_environment.py", line 204, in _create_channel_and_connection
        _check_grpc_channel_ready(channel)
      File "/home/venv/lib/python3.7/site-packages/dm_alchemy/_load_environment.py", line 183, in _check_grpc_channel_ready
        return grpc.channel_ready_future(channel).result(timeout=1)
      File "/home/venv/lib/python3.7/site-packages/grpc/_utilities.py", line 140, in result
        self._block(timeout)
      File "/home/venv/lib/python3.7/site-packages/grpc/_utilities.py", line 86, in _block
        raise grpc.FutureTimeoutError()
    grpc.FutureTimeoutError
    
    opened by vis7 · 5 comments
  • Add `frozendict` requirement

    frozendict seems to be used by the code in a few places [1] but is not declared as an explicit requirement; this just adds the missing line to fix the install.

    [1] https://github.com/deepmind/dm_alchemy/search?q=frozendict
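
    For illustration, here is a sketch of the kind of one-line fix being proposed; the surrounding setup() call is abbreviated and assumed, not the repository's actual setup.py:

    # setup.py (sketch): declare frozendict as an explicit dependency.
    from setuptools import setup

    setup(
        name='dm_alchemy',
        install_requires=[
            'frozendict',  # the missing requirement this change adds
        ],
    )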

    opened by tkukurin · 2 comments
  • grpc.FutureTimeoutError

    Unfortunately, on Windows I run into the same error trace as #1 when trying to run human_agent.py:

    python human_agent.py --seed 123 --level_name 'alchemy/perceptual_mapping_randomized_with_rotation_and_random_bottleneck' gives the error:

    pygame 2.0.1 (SDL 2.0.14, Python 3.7.9)
    Hello from the pygame community. https://www.pygame.org/contribute.html
    Traceback (most recent call last):
      File "human_agent.py", line 186, in <module>
        app.run(main)
      File "C:\Users\jonat\AppData\Local\Programs\Python\Python37\lib\site-packages\absl\app.py", line 303, in run
        _run_main(main, args)
      File "C:\Users\jonat\AppData\Local\Programs\Python\Python37\lib\site-packages\absl\app.py", line 251, in _run_main
        sys.exit(main(argv))
      File "human_agent.py", line 117, in main
        environment_variables=environment_variables) as env:
      File "C:\Users\jonat\AppData\Local\Programs\Python\Python37\lib\site-packages\dm_alchemy\_load_environment.py", line 389, in load_from_docker
        connection_details=_connect_to_environment(port, settings),
      File "C:\Users\jonat\AppData\Local\Programs\Python\Python37\lib\site-packages\dm_alchemy\_load_environment.py", line 246, in _connect_to_environment
        channel, connection = _create_channel_and_connection(port)
      File "C:\Users\jonat\AppData\Local\Programs\Python\Python37\lib\site-packages\dm_alchemy\_load_environment.py", line 204, in _create_channel_and_connection
        _check_grpc_channel_ready(channel)
      File "C:\Users\jonat\AppData\Local\Programs\Python\Python37\lib\site-packages\dm_alchemy\_load_environment.py", line 183, in _check_grpc_channel_ready
        return grpc.channel_ready_future(channel).result(timeout=1)
      File "C:\Users\jonat\AppData\Local\Programs\Python\Python37\lib\site-packages\grpc\_utilities.py", line 140, in result
        self._block(timeout)
      File "C:\Users\jonat\AppData\Local\Programs\Python\Python37\lib\site-packages\grpc\_utilities.py", line 86, in _block
        raise grpc.FutureTimeoutError()
    grpc.FutureTimeoutError
    

    The command mentioned in #1 was already updated in the installation instructions, but that does not seem to help. Running

    # width and height are defined earlier in the notebook
    level_name = 'alchemy/perceptual_mapping_randomized_with_rotation_and_random_bottleneck'
    seed = 1023
    settings = dm_alchemy.EnvironmentSettings(
        seed=seed, level_name=level_name, width=width, height=height)
    env = dm_alchemy.load_from_docker(settings)
    

    in AlchemyGettingStarted.ipynb seems to give the same error. The setup is as described in the guide and uses Docker Desktop (except that I don't use a Python virtual environment); docker run -d gcr.io/deepmind-environments/alchemy:v1.0.0 gives no error.

    While you don't officially support Windows, maybe you can still help?

    opened by jthomy · 2 comments
  • Running IdealObserverBot

    Hi, I am trying to replicate the evaluation of the Ideal Observer agent on Symbolic Alchemy to get the ~284.4 ± 1.6 score reported in the paper. To that end, I wrote the following function:

    Side note: I know you made the ideal observer's evaluation trajectory available in the agent_events folder, but my aim is to then use the IdealObserverBot on a different set of events.

    # Imports assumed by this snippet; the module paths are my best guess
    # from reading the codebase, so adjust them to match your checkout.
    import numpy as np
    from tqdm import tqdm

    from dm_alchemy import symbolic_alchemy, symbolic_alchemy_trackers
    from dm_alchemy.encode import chemistries_proto_conversion
    from dm_alchemy.ideal_observer import precomputed_maps
    from dm_alchemy.symbolic_alchemy_bots import IdealObserverBot
    from dm_alchemy.types.utils import RewardWeights

    def evaluate():
        level_name = "perceptual_mapping_randomized_with_random_bottleneck"
        chems = chemistries_proto_conversion.load_chemistries_and_items(
            f'chemistries/{level_name}/chemistries')

        precomputed = precomputed_maps.load_from_level_name(level_name)
        reward_weights = RewardWeights(coefficients=[1, 1, 1], offset=0, bonus=12)

        N_EPISODES = 1_000
        rewards = np.zeros(N_EPISODES)

        for ep in tqdm(range(N_EPISODES)):
            chem, items = chems[ep]
            env = symbolic_alchemy.get_symbolic_alchemy_fixed(
                chemistry=chem, episode_items=items)

            # Track per-trial scores and the bot's belief state.
            env.add_trackers({
                symbolic_alchemy_trackers.ScoreTracker.NAME:
                    symbolic_alchemy_trackers.ScoreTracker(env._reward_weights)})
            env.add_trackers({
                symbolic_alchemy_trackers.BeliefStateTracker.NAME:
                    symbolic_alchemy_trackers.BeliefStateTracker(precomputed, env)})

            bot = IdealObserverBot(
                reward_weights, precomputed, env, minimise_world_states=False)
            episode_results = bot.run_episode()

            rewards[ep] = np.sum(episode_results['score']['per_trial'])

        return rewards.mean(), rewards.std() / np.sqrt(N_EPISODES)
    
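    If those assumptions hold, the function is driven directly; the returned pair is the mean score and its standard error, matching the form of the paper's 284.4 ± 1.6:

    mean, sem = evaluate()
    print(f'ideal observer: {mean:.1f} ± {sem:.1f}')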

    Since each episode understandably takes a long time to run, I just want to confirm the correctness of my code, as it is based on a few assumptions I made while reading the codebase.

    Thanks!

    opened by BKHMSI · 2 comments
  • Live animation of the environment

    Is there an easy way to have a live-updating render of what's happening in the environment as actions are taken? I'm hoping for something similar to the GIF you have in the README.

    opened by ilia10000 · 0 comments