PAIRED in PyTorch 🔥


This codebase provides a PyTorch implementation of Protagonist Antagonist Induced Regret Environment Design (PAIRED), first introduced in "Emergent Complexity and Zero-Shot Transfer via Unsupervised Environment Design" (Dennis et al., 2020). The implementation comes integrated with custom adversarial maze environments based on the MiniGrid environment (Chevalier-Boisvert et al., 2018), as used in Dennis et al., 2020.

Unsupervised environment design (UED) methods propose a curriculum of tasks or environment instances (levels) that aims to foster more sample-efficient learning and more robust policies. PAIRED performs UED via a three-player game between two student agents, the protagonist and the antagonist, and an adversary. The adversary, which is allied with the antagonist, proposes new levels that aim to maximize the protagonist's regret, estimated as the antagonist's maximum return minus the protagonist's mean return over a batch of rollouts on the proposed levels.
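
As a minimal sketch (with illustrative variable names, not this codebase's actual API), the regret estimate over a batch of episode returns looks like:

import torch

def estimate_regret(antagonist_returns, protagonist_returns):
    # PAIRED's regret estimate (Dennis et al., 2020): the antagonist's
    # best episode return minus the protagonist's mean episode return.
    return antagonist_returns.max() - protagonist_returns.mean()

# Made-up returns from a batch of rollouts on one proposed level:
regret = estimate_regret(torch.tensor([0.2, 0.9, 0.5]),
                         torch.tensor([0.1, 0.3, 0.2]))
print(regret)  # tensor(0.7000)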

PAIRED carries a strong robustness guarantee: at Nash equilibrium, it provably induces a minimax regret policy for the protagonist, meaning the protagonist optimally trades off regret across all levels the adversary can propose.
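
In symbols (a LaTeX sketch with notation following Dennis et al., 2020, where U^theta(pi) is the expected return of policy pi on the level parameterized by theta):

\pi^P \in \arg\min_{\pi \in \Pi} \; \max_{\theta \in \Theta} \Big[ \max_{\pi^A \in \Pi} U^{\theta}(\pi^A) - U^{\theta}(\pi) \Big]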

UED algorithms included

  • PAIRED (Protagonist Antagonist Induced Regret Environment Design)
  • Minimax
  • Domain randomization

Set up

To install the necessary dependencies, run the following commands:

conda create --name paired python=3.8
conda activate paired
pip install -r requirements.txt

git clone https://github.com/openai/baselines.git
cd baselines
pip install -e .
cd ..
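
As a quick sanity check that the dependencies were installed correctly (the exact versions depend on what requirements.txt pins):

python -c "import torch, gym, baselines; print(torch.__version__, gym.__version__)"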

Configuration

Detailed descriptions of the various command-line arguments for the main training script, train.py, can be found in arguments.py.
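
Assuming train.py parses these arguments with argparse (an assumption based on arguments.py), you can also list every available flag from the command line:

python -m train --help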

Experiments

MiniGrid benchmark results

For convenience, configuration JSON files are provided to generate the commands for the specific experimental settings featured in Dennis et al., 2020. To generate the command that launches one run of the experiment codified by the configuration file config.json in the local folder train_scripts/configs, run the following and copy and paste the output into your command line.

python train_scripts/make_cmd.py --json config --num_trials 1

Alternatively, on macOS you can pipe the output to pbcopy to copy the command directly to your clipboard:

python train_scripts/make_cmd.py --json config --num_trials 1 | pbcopy

By default, each experiment run generates a folder in ~/logs/paired named after the --xpid argument passed into the train command. This folder contains log outputs in logs.csv and periodic screenshots of generated levels in the directory screenshots. Each screenshot uses the naming convention update_<number of PPO updates>.png. The latest model checkpoint is written to model.tar, and archived model checkpoints are saved according to the naming convention model_<number of PPO updates>.tar.
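
For example, a minimal sketch for inspecting a run's logged metrics (the xpid below is a placeholder, and the column names are whatever the logger writes, so we simply print them):

import os
import pandas as pd

log_path = os.path.expanduser('~/logs/paired/<xpid>/logs.csv')  # placeholder xpid
df = pd.read_csv(log_path)
print(df.columns.tolist())  # discover which metrics were logged
print(df.tail())            # last few logged updates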

The JSON config files for reproducing the MiniGrid experiments from Dennis et al., 2020 are listed below:

Method json config
PAIRED minigrid/paired.json
Minimax minigrid/minimax.json
DR minigrid/dr.json

Evaluation

You can use the following command to batch-evaluate all trained models whose output directories share the same <xpid_prefix> before the indexing _[0-9]+ suffix:

python -m eval \
--base_path "~/logs/paired" \
--prefix '<xpid_prefix>' \
--num_processes 2 \
--env_names 'MultiGrid-SixteenRooms-v0,MultiGrid-Labyrinth-v0,MultiGrid-Maze-v0' \
--num_episodes 100 \
--model_tar model
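
For reference, here is a minimal sketch of how run directories matching this prefix convention could be discovered (illustrative only; the matching logic in eval.py may differ):

import os
import re

base_path = os.path.expanduser('~/logs/paired')
prefix = 'paired-minigrid'  # hypothetical <xpid_prefix>
pattern = re.compile(re.escape(prefix) + r'_[0-9]+$')
runs = [d for d in os.listdir(base_path) if pattern.match(d)]
print(runs)  # e.g. ['paired-minigrid_0', 'paired-minigrid_1', ...]
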
Comments
  • Failed to run MiniHack Experiments (Environments not found)

    I am trying to run the MiniHack experiments with the following command:

    python -m train \
    --xpid=ued-MiniHack-GoalLastAdv-WallsLavaMonsterDoor-15x15-v0-paired-lstm256a-lr0.0001-epoch5-mb1-v0.5-henv0.0-ha0.0-tl_0 \
    --env_name=MiniHack-GoalLastAdv-WallsLavaMonsterDoor-15x15-v0 \
    --use_gae=True \
    --gamma=0.995 \
    --gae_lambda=0.95 \
    --seed=88 \
    --recurrent_arch=lstm \
    --recurrent_agent=True \
    --recurrent_adversary_env=False \
    --recurrent_hidden_size=256 \
    --lr=0.0001 \
    --num_steps=256 \
    --num_processes=4 \
    --num_env_steps=100000 \
    --ppo_epoch=5 \
    --num_mini_batch=1 \
    --entropy_coef=0.0 \
    --value_loss_coef=0.5 \
    --clip_param=0.2 \
    --clip_value_loss=True \
    --adv_entropy_coef=0.0 \
    --algo=ppo \
    --ued_algo=paired \
    --log_interval=10 \
    --screenshot_interval=1000 \
    --log_grad_norm=True \
    --handle_timelimits=True \
    --test_env_names=MiniHack-Room-15x15-v0,MiniHack-Room-Monster-15x15-v0,MiniHack-MazeWalk-9x9-v0,MiniHack-MazeWalk-15x15-v0,MiniHack-Labyrinth-Small-v0,MiniHack-LockedMultiRoom-N2-S4-v0,MiniHack-LavaMultiRoom-N2-S4-v0,MiniHack-MonsterMultiRoom-N2-S4-v0,MiniHack-ExtremeMultiRoom-N2-S4-v0,MiniHack-ExtremeMultiRoom-N4-S5-v0 \
    --log_dir=logs/paired/minihack \
    --log_action_complexity=True \
    --checkpoint=True
    

    I am receiving errors of the following kind:

    gym.error.UnregisteredEnv: No registered env with id: MiniHack-LockedMultiRoom-N2-S4-v0 
    

    I searched the codebase but did not find any matches with the string "MiniHack-LockedMultiRoom-N2-S4-v0". Any idea how to solve this?

    The error originates from line 133 of envs/register.py, because no MiniHack environments were registered via EnvRegistry.register().
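
    As a debugging aid (a sketch; gym.envs.registry.env_specs applies to older gym versions, and importing minihack registers its stock MiniHack-* environments as a side effect), you can list what is actually registered:

    import gym
    import minihack  # noqa: F401 -- importing registers the stock MiniHack-* envs

    print(sorted(eid for eid in gym.envs.registry.env_specs if 'MiniHack' in eid))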

    opened by azadsalam 3
  • Is the implementation of final rewards correct?

    As per the original implementation, the final rewards are supposed to replace the reward at the end of each episode in the replay buffer.

    https://github.com/google-research/google-research/blob/901524f4d4ab15ef9d2f5165148347d0f26b32c2/social_rl/adversarial_env/agent_train_package.py#L260-L264

    Whereas in this PyTorch implementation, the final reward only replaces the final return: https://github.com/ucl-dark/paired/blob/c836e868c6cb805012f93590e0ece1bc8461dbcf/algos/storage.py#L201-L202
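
    To make the difference concrete, here is a toy sketch of the two behaviors (illustrative only, with made-up rewards and gamma; this is not the repo's actual storage code):

    import torch

    gamma = 0.99
    rewards = torch.tensor([0.0, 0.0, 1.0])  # per-step rewards of one episode
    final_reward = 5.0                       # adjusted reward at episode end

    def discounted_returns(r):
        out = torch.zeros_like(r)
        acc = 0.0
        for t in reversed(range(len(r))):
            acc = float(r[t]) + gamma * acc
            out[t] = acc
        return out

    # (a) TF implementation: the final reward replaces the last per-step
    # reward, so it propagates back through every earlier return.
    rewards_tf = rewards.clone()
    rewards_tf[-1] = final_reward
    print(discounted_returns(rewards_tf))  # tensor([4.9005, 4.9500, 5.0000])

    # (b) The behavior questioned here: only the final return is
    # overwritten; earlier returns come from the original rewards.
    returns = discounted_returns(rewards)
    returns[-1] = final_reward
    print(returns)  # tensor([0.9801, 0.9900, 5.0000])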

    Did I misunderstand anything in the code?

    opened by nikhilrayaprolu 0
  • Pytorch Code takes longer time than Tensorflow

    On the same hardware and with the same parameters, this PyTorch code is almost 4 times slower than the original TensorFlow implementation. What could be the cause?

    opened by nikhilrayaprolu 1
  • minihack env

    git clone https://github.com/ucl-dark/blob/main/minihack no longer works; is it safe to assume that https://github.com/facebookresearch/minihack will work as well?

    opened by raymond2338 0
  • max_step in rollout and order of training

    Hi there, thank you for contributing this PyTorch version of PAIRED. I have two questions that I hope you can clarify. I really appreciate it.

    1. The num_steps for rolling out the Protagonist's and Antagonist's policies in the grid env is set to 256 by default (https://github.com/ucl-dark/paired/blob/fd49543811dca1177eb34cb846035470c141aac1/envs/runners/adversarial_runner.py#L373). I am not sure whether the env terminates when max_steps=256 is reached. If yes, then the two agents are rolled out in the env for only one episode, which is not enough to produce the max/mean return for the Antagonist/Protagonist. If no, then the two agents are rolled out for several episodes, depending on how many steps they take per episode, but then the Antagonist and Protagonist are not evaluated for the same number of episodes. So I am confused about this.

    2. As stated in the first paragraph of Section 4 of the paper, the env_adversary first generates the env given the Protagonist's fixed policy, and then the Antagonist is trained on this env to optimality. After training the Antagonist, the regret is computed from the trained Antagonist's policy and the pre-trained Protagonist's policy. However, in your implementation, in the run() function (https://github.com/ucl-dark/paired/blob/fd49543811dca1177eb34cb846035470c141aac1/envs/runners/adversarial_runner.py#L356), you run the env_adversary, the Protagonist, and the Antagonist in order. Could you also clarify this?

    Again, thank you so much for your effort.

    opened by wenjunli-0 0
Owner

UCL Deciding, Acting, and Reasoning with Knowledge (DARK) Lab