MISA
Abstract
Generalization across environments is critical to the successful application of reinforcement learning algorithms to real-world challenges. In this paper, we consider the problem of learning abstractions that generalize in block MDPs, families of environments with a shared latent state space and dynamics structure over that latent space, but varying observations. We leverage tools from causal inference to propose a method of invariant prediction to learn model-irrelevance state abstractions (MISA) that generalize to novel observations in the multi-environment setting. We prove that for certain classes of environments, this approach outputs with high probability a state abstraction corresponding to the causal feature set with respect to the return. We further provide more general bounds on model error and generalization error in the multi-environment setting, in the process showing a connection between causal variable selection and the state abstraction framework for MDPs. We give empirical evidence that our methods work in both linear and nonlinear settings, attaining improved generalization over single- and multi-task baselines.
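For orientation, a rough sketch of the setting in our own notation (see the paper for the precise definitions): each environment e in a block MDP shares a latent state space and latent dynamics but renders observations through its own emission map, and a state abstraction phi is model-irrelevant when it preserves rewards and abstract transition probabilities:

q_e : \mathcal{S} \to \mathcal{O}_e, \qquad \text{shared } p(s' \mid s, a) \text{ and } R(s, a)

\phi(o_1) = \phi(o_2) \;\Longrightarrow\; R(o_1, a) = R(o_2, a) \;\text{ and }\; \sum_{o' \in \phi^{-1}(z')} p(o' \mid o_1, a) \,=\, \sum_{o' \in \phi^{-1}(z')} p(o' \mid o_2, a) \quad \forall a, z'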
Citation
@inproceedings{zhang2020invariant,
  title={Invariant Causal Prediction for Block MDPs},
  author={Amy Zhang and Clare Lyle and Shagun Sodhani and Angelos Filos and Marta Kwiatkowska and Joelle Pineau and Yarin Gal and Doina Precup},
  year={2020},
  booktitle={International Conference on Machine Learning (ICML)},
}
Experiments
The three sets of experiments, on model learning, imitation learning, and reinforcement learning, can be found in their respective folders. To install the requirements, create a new conda environment and run
pip install -r requirements.txt
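For example, a minimal setup might look like the following; the environment name and Python version are our assumptions, not pinned by the repository (TensorFlow 1.14, used by the baselines below, requires Python 3.7 or earlier):
conda create -n misa python=3.7
conda activate misa
pip install -r requirements.txt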
In model learning, there are two sets of experiments: linear MISA and nonlinear MISA. The code is in the model_learning directory; first cd model_learning.
The main experiment with linear MISA can be run with
ICPAbstractMDP.ipynb
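For example, assuming Jupyter is installed in the environment (it is not listed above, so this is an assumption), the notebook can be opened with:
jupyter notebook ICPAbstractMDP.ipynb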
The main experiment with nonlinear MISA can be run with
python main.py
To run the imitation learning experiments, first cd imitation_learning. Then install the baselines by running
cd baselines && pip install tensorflow==1.14 && pip install -e .
The main experiments can be run in the imitation_learning directory with:
python train_expert.py --save_model --save_model_path models # Training the expert model
# Let's say the model was trained for 150K steps.
mkdir -p buffers/train/0 buffers/train/1 buffers/eval/0 # Directory to hold the buffer data
python collect_data_using_expert_policy.py --load_model_path models_150000 --save_buffer --save_buffer_path buffers # Collecting the trajectories using the expert model
python train.py --use_single_encoder_decoder --num_train_envs 1 --num_eval_envs 1 --load_buffer_path buffers # MISA One Env
python train.py --use_single_encoder_decoder --num_train_envs 2 --num_eval_envs 1 --load_buffer_path buffers # Baseline One Decoder
python train.py --use_discriminator --num_train_envs 2 --num_eval_envs 1 --load_buffer_path buffers # Proposed Approach
python train.py --use_irm_loss --num_train_envs 2 --num_eval_envs 1 --load_buffer_path buffers # IRM
In reinforcement learning, the main experiment can be run in the reinforcement_learning directory with
./run_local.sh