Causal Influence Detection for Improving Efficiency in Reinforcement Learning

Overview

Causal Influence Detection for Improving Efficiency in Reinforcement Learning

This repository contains the code release for the paper "Causal Influence Detection for Improving Efficiency in Reinforcement Learning", published at NeurIPS 2021.

This work was done by Maximilian Seitzer, Bernhard Schölkopf and Georg Martius at the Autonomous Learning Group, Max-Planck Institute for Intelligent Systems.

If you make use of our work, please use the citation information below.

Abstract

Many reinforcement learning (RL) environments consist of independent entities that interact sparsely. In such environments, RL agents have only limited influence over other entities in any particular situation. Our idea in this work is that learning can be efficiently guided by knowing when and what the agent can influence with its actions. To achieve this, we introduce a measure of situation-dependent causal influence based on conditional mutual information and show that it can reliably detect states of influence. We then propose several ways to integrate this measure into RL algorithms to improve exploration and off-policy learning. All modified algorithms show strong increases in data efficiency on robotic manipulation tasks.

Setup

Use make_conda_env.sh to create a Conda environment with minimal dependencies:

./make_conda_env.sh minimal cid_in_rl

or recreate the environment used to get the results (more dependencies than necessary):

conda env create -f orig_environment.yml

Activate the environment with conda activate cid_in_rl.

Experiments

Causal Influence Detection

To reproduce the causal influence detection experiment, you will need to download the used datasets here. Extract them into the folder data/. The most simple way to run all experiments is to use the included Makefile (this will take a long time):

make -C experiments/1-influence

The results will be in the folder ./data/experiments/1-influence/.

You can also train a single model, for example

python -m cid.influence_estimation.train_model \
        --log-dir logs/eval_fetchpickandplace 
        --no-logging-subdir --seed 0 \
        --memory-path data/fetchpickandplace/memory_5k_her_agent_v2.npy \
        --val-memory-path data/fetchpickandplace/val_memory_2kof5k_her_agent_v2.npy \
        experiments/1-influence/pickandplace_model_gaussian.gin

which will train a model on FetchPickPlace, and put the results in logs/eval_fetchpickandplace.

To evaluate the CAI score performance of the model on the validation set, use

python experiments/1-influence/pickandplace_cmi.py 
    --output-path logs/eval_fetchpickandplace 
    --model-path logs/eval_fetchpickandplace
    --settings-path logs/eval_fetchpickandplace/eval_settings.gin \
    --memory-path data/fetchpickandplace/val_memory_2kof5k_her_agent_v2.npy 
    --variants var_prod_approx

Reinforcement Learning

The RL experiments can be reproduced using the settings in experiments/2-prioritization, experiments/3-exploration, experiments/4-other.

To do so, run

python -m cid.train 
   

   

By default, the output will be in the folder ./logs.

Codebase Overview

  • cid/algorithms/ddpg_agent.py contains the DDPG agent
  • cid/envs contains new environments
    • cid/envs/one_d_slide.py implements the 1D-Slide dataset
    • cid/envs/robotics/pick_and_place_rot_table.py implements the RotatingTable environment
    • cid/envs/robotics/fetch_control_detection.py contains the code for deriving ground truth control labels for FetchPickAndPlace
  • cid/influence_estimation contains code for model training, evaluation and computing the causal influence score
    • cid/influence_estimation/train_model.py is the main model training script
    • cid/influence_estimation/eval_influence.py evaluates a trained model for its classification performance
    • cid/influence_estimation/transition_scorers contains code for computing the CAI score
  • cid/memory/ contains the replay buffers, which handle prioritization and exploration bonuses
    • cid/memory/mbp implements CAI (ours)
    • cid/memory/her implements Hindsight Experience Replay
    • cid/memory/ebp implements Energy-Based Hindsight Experience Prioritization
    • cid/memory/per implements Prioritized Experience Replay
  • cid/models contains Pytorch model implementations
    • cid/bnn.py contains the implementation of VIME
  • cid/play.py lets a trained RL agent run in an environment
  • cid/train.py is the main RL training script

Citation

Please use the following citation if you make use of our work:

@inproceedings{Seitzer2021CID,
  title = {Causal Influence Detection for Improving Efficiency in Reinforcement Learning},
  author = {Seitzer, Maximilian and Sch{\"o}lkopf, Bernhard and Martius, Georg},
  booktitle = {Advances in Neural Information Processing Systems (NeurIPS 2021)},
  month = dec,
  year = {2021},
  url = {https://arxiv.org/abs/2106.03443},
  month_numeric = {12}
}

License

This implementation is licensed under the MIT license.

The robotics environments were adapted from OpenAI Gym under MIT license. The VIME implementation was adapted from https://github.com/alec-tschantz/vime under MIT license.

You might also like...
Code for NeurIPS 2021 paper: Invariant Causal Imitation Learning for Generalizable Policies

Invariant Causal Imitation Learning for Generalizable Policies Ioana Bica, Daniel Jarrett, Mihaela van der Schaar Neural Information Processing System

Code for Quantifying Ignorance in Individual-Level Causal-Effect Estimates under Hidden Confounding
Code for Quantifying Ignorance in Individual-Level Causal-Effect Estimates under Hidden Confounding

🍐 quince Code for Quantifying Ignorance in Individual-Level Causal-Effect Estimates under Hidden Confounding 🍐 Installation $ git clone [email protected]

CausalNLP is a practical toolkit for causal inference with text as treatment, outcome, or "controlled-for" variable.

CausalNLP CausalNLP is a practical toolkit for causal inference with text as treatment, outcome, or "controlled-for" variable. Install pip install -U

CausaLM: Causal Model Explanation Through Counterfactual Language Models

CausaLM: Causal Model Explanation Through Counterfactual Language Models Authors: Amir Feder, Nadav Oved, Uri Shalit, Roi Reichart Abstract: Understan

Code for "Causal autoregressive flows" - AISTATS, 2021

Code for "Causal Autoregressive Flow" This repository contains code to run and reproduce experiments presented in Causal Autoregressive Flows, present

[ICCV 2021] Released code for Causal Attention for Unbiased Visual Recognition

CaaM This repo contains the codes of training our CaaM on NICO/ImageNet9 dataset. Due to my recent limited bandwidth, this codebase is still messy, wh

Causal estimators for use with WhyNot

WhyNot Estimators A collection of causal inference estimators implemented in Python and R to pair with the Python causal inference library whynot. For

JudeasRx - graphical app for doing personalized causal medicine using the methods invented by Judea Pearl et al.
JudeasRx - graphical app for doing personalized causal medicine using the methods invented by Judea Pearl et al.

JudeasRX Instructions Read the references given in the Theory and Notation section below Fire up the Jupyter Notebook judeas-rx.ipynb The notebook dra

Code for ACM MM 2020 paper
Code for ACM MM 2020 paper "NOH-NMS: Improving Pedestrian Detection by Nearby Objects Hallucination"

NOH-NMS: Improving Pedestrian Detection by Nearby Objects Hallucination The offical implementation for the "NOH-NMS: Improving Pedestrian Detection by

Owner
Autonomous Learning Group
Autonomous Learning Group
Official implementation of Influence-balanced Loss for Imbalanced Visual Classification in PyTorch.

Official implementation of Influence-balanced Loss for Imbalanced Visual Classification in PyTorch.

Seulki Park 70 Jan 3, 2023
constructing maps of intellectual influence from publication data

Influencemap Project @ ANU Influence in the academic communities has been an area of interest for researchers. This can be seen in the popularity of a

CS Metrics 13 Jun 18, 2022
[CVPR 2021] Monocular depth estimation using wavelets for efficiency

Single Image Depth Prediction with Wavelet Decomposition Michaël Ramamonjisoa, Michael Firman, Jamie Watson, Vincent Lepetit and Daniyar Turmukhambeto

Niantic Labs 205 Jan 2, 2023
Model Zoo for AI Model Efficiency Toolkit

We provide a collection of popular neural network models and compare their floating point and quantized performance.

Qualcomm Innovation Center 137 Jan 3, 2023
Code of the paper "Performance-Efficiency Trade-offs in Unsupervised Pre-training for Speech Recognition"

SEW (Squeezed and Efficient Wav2vec) The repo contains the code of the paper "Performance-Efficiency Trade-offs in Unsupervised Pre-training for Speec

ASAPP Research 67 Dec 1, 2022
Conservative Q Learning for Offline Reinforcement Reinforcement Learning in JAX

CQL-JAX This repository implements Conservative Q Learning for Offline Reinforcement Reinforcement Learning in JAX (FLAX). Implementation is built on

Karush Suri 8 Nov 7, 2022
Reinforcement-learning - Repository of the class assignment questions for the course on reinforcement learning

DSE 314/614: Reinforcement Learning This repository containing reinforcement lea

Manav Mishra 4 Apr 15, 2022
Python Library for learning (Structure and Parameter) and inference (Statistical and Causal) in Bayesian Networks.

pgmpy pgmpy is a python library for working with Probabilistic Graphical Models. Documentation and list of algorithms supported is at our official sit

pgmpy 2.2k Jan 3, 2023
Deep Learning Models for Causal Inference

Extensive tutorials for learning how to build deep learning models for causal inference using selection on observables in Tensorflow 2.

Bernard  J Koch 151 Dec 31, 2022
Multi-task Learning of Order-Consistent Causal Graphs (NeuRIPs 2021)

Multi-task Learning of Order-Consistent Causal Graphs (NeuRIPs 2021) Authors: Xinshi Chen, Haoran Sun, Caleb Ellington, Eric Xing, Le Song Link to pap

Xinshi Chen 2 Dec 20, 2021