PyTorch implementation of SMODICE: Versatile Offline Imitation Learning via State Occupancy Matching

Jason Ma

Last update: Aug 30, 2022

This is the official PyTorch implementation of SMODICE: Versatile Offline Imitation Learning via State Occupancy Matching.

SMODICE Demos

Tabular Experiments

Offline Imitation Learning from Mismatched Experts

python smodice_tabular/run_tabular_mismatched.py

Offline Imitation Learning from Examples

python smodice_tabular/run_tabular_example.py

Deep IL Experiments

Setup

Create the conda environment, activate it, and install the remaining dependencies (PyTorch and D4RL):

conda env create -f environment.yml
conda activate smodice
pip install --upgrade numpy
pip install torch==1.7.1+cu110 torchvision==0.8.2+cu110 torchaudio===0.7.2 -f https://download.pytorch.org/whl/torch_stable.html
git clone https://github.com/rail-berkeley/d4rl
cd d4rl
pip install -e .
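
Before launching experiments, it can help to confirm that the packages installed above are importable. The following is a minimal sketch and not part of the repository: the helper name missing_packages is hypothetical, and the package list simply mirrors the install commands above.

```python
import importlib.util


def missing_packages(names):
    """Return the subset of `names` that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Packages installed by the setup commands above (assumed import names).
    required = ["numpy", "torch", "torchvision", "torchaudio", "d4rl"]
    missing = missing_packages(required)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

The check only looks the packages up without importing them, so it does not verify versions or CUDA availability.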

Offline IL from Observations

Run the following command with the variable ENV set to one of hopper, walker2d, halfcheetah, ant, or kitchen.

python run_oil_observations.py --env_name $ENV
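
To run all five environments in sequence, the command above can be generated in a loop. This is a hedged sketch rather than a script shipped with the repository: the names ENVS and observation_commands are hypothetical, and the code only prints the commands unless you replace the print call as noted in the comment.

```python
import shlex

# Environment names listed above for run_oil_observations.py.
ENVS = ["hopper", "walker2d", "halfcheetah", "ant", "kitchen"]


def observation_commands(envs=ENVS):
    """Build one argv list per environment, without running anything."""
    return [["python", "run_oil_observations.py", "--env_name", env] for env in envs]


if __name__ == "__main__":
    for argv in observation_commands():
        # Dry run: print the command. To execute it instead, use
        # subprocess.run(argv, check=True) from the repository root.
        print(shlex.join(argv))
```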

For the AntMaze environment, first generate the random dataset:

cd envs
python generate_antmaze_random.py --noise

Then return to the directory containing the run scripts and run

python run_oil_antmaze.py

Offline IL from Mismatched Experts

For halfcheetah and ant, run

python run_oil_observations.py --env_name halfcheetah --dataset 0.5 --mismatch True

and

python run_oil_observations.py --env_name ant --dataset disabled --mismatch True

respectively.

For AntMaze, run

python run_oil_antmaze.py --mismatch True

Offline IL from Examples

For the PointMass-4Direction task, run

python run_oil_examples_pointmass.py

For the AntMaze task, run

python run_oil_antmaze.py --mismatch False --example True

For the Franka Kitchen based tasks, run

python run_oil_examples_kitchen.py --dataset $DATASET

where DATASET is either microwave or kettle.

Baselines

For any task, the BC baseline can be run by appending --disc_type bc to the above commands.

For the RCE-TD3-BC and ORIL baselines, append --algo_type $ALGO on the appropriate tasks, where ALGO is either rce or oril.
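
The baseline flags described above can be attached to any of the earlier commands programmatically. The sketch below is illustrative only: the helper with_baseline is hypothetical, and it encodes nothing beyond the two flags named in this section (it does not check whether a given baseline applies to a given task).

```python
def with_baseline(argv, baseline):
    """Return a copy of `argv` with the flag for the requested baseline appended."""
    if baseline == "bc":
        # BC is selected through the discriminator-type flag.
        return argv + ["--disc_type", "bc"]
    if baseline in ("rce", "oril"):
        # RCE-TD3-BC and ORIL are selected through the algorithm-type flag.
        return argv + ["--algo_type", baseline]
    raise ValueError(f"unknown baseline: {baseline!r}")


if __name__ == "__main__":
    base = ["python", "run_oil_examples_kitchen.py", "--dataset", "kettle"]
    for name in ("bc", "rce", "oril"):
        print(" ".join(with_baseline(base, name)))
```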

Citation

If you find this repository useful for your research, please cite

@article{ma2022smodice,
      title={SMODICE: Versatile Offline Imitation Learning via State Occupancy Matching}, 
      author={Yecheng Jason Ma and Andrew Shen and Dinesh Jayaraman and Osbert Bastani},
      year={2022},
      url={https://arxiv.org/abs/2202.02433}
}

Contact

If you have any questions regarding the code or paper, feel free to contact me at [email protected].

Acknowledgment

This codebase is partially adapted from optidice, rce, relay-policy-learning, and d4rl. We thank the authors and contributors for open-sourcing their code.
