Disagreement-Regularized Imitation Learning

Overview

Due to a normalization bug, the expert trajectories have lower performance than the experts reported in rl-baselines-zoo. Please see the following link in the codebase for where the bug was fixed: [link]

Code to train the models described in the paper "Disagreement-Regularized Imitation Learning" by Kianté Brantley, Wen Sun, and Mikael Henaff.

Usage:

Install the DRIL package using pip, from the top level of the repository:

pip install -e .
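If you are starting from a fresh checkout, a typical sequence (using the same placeholder style as the commands below) is:

git clone <url-of-dril-repository>
cd <location-of-dril-checkout>
pip install -e .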

Software Dependencies

"stable-baselines", "rl-baselines-zoo", "baselines", "gym", "pytorch", "pybullet"

Data

We provide a Python script to generate expert data from pre-trained models in the "rl-baselines-zoo" repository, which lists all of the available pre-trained agents and their respective performance. Replace <name-of-environment> with the name of the pre-trained agent environment you would like to collect expert data for.

python -u generate_demonstration_data.py --seed <seed-number> --env-name <name-of-environment> --rl_baseline_zoo_dir <location-to-top-level-directory>
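For example, to collect expert data for an Atari agent (the environment name, seed, and path here are illustrative; substitute your own):

python -u generate_demonstration_data.py --seed 0 --env-name BreakoutNoFrameskip-v4 --rl_baseline_zoo_dir ~/rl-baselines-zoo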

Training

DRIL requires a pre-trained ensemble model and a pre-trained behavior-cloning model.

Note that <location-to-rl-baseline-zoo-directory> is the full path to the top-level directory of the rl_baseline_zoo repository.

To train only a behavior-cloning model run:

python -u main.py --env-name <name-of-environment> --num-trajs <number-of-trajectories> --behavior_cloning --rl_baseline_zoo_dir <location-to-rl-baseline-zoo-directory> --seed <seed-number>

To train only an ensemble model run:

python -u main.py --env-name <name-of-environment> --num-trajs <number-of-trajectories> --pretrain_ensemble_only --rl_baseline_zoo_dir <location-to-rl-baseline-zoo-directory> --seed <seed-number>

To train a DRIL model run the command below. Note that the command first checks that both the behavior-cloning model and the ensemble model have been trained; if they have not, the script will automatically train both.

python -u main.py --env-name <name-of-environment> --default_experiment_params <type-of-env> --num-trajs <number-of-trajectories> --rl_baseline_zoo_dir <location-to-rl-baseline-zoo-directory> --seed <seed-number> --dril

--default_experiment_params sets the default parameters used in the DRIL experiments; it has two options: atari and continous-control
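As a concrete example, the following trains DRIL on an Atari game from a single expert trajectory (the environment name, seed, trajectory count, and path are illustrative):

python -u main.py --env-name BreakoutNoFrameskip-v4 --default_experiment_params atari --num-trajs 1 --rl_baseline_zoo_dir ~/rl-baselines-zoo --seed 0 --dril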

Visualization

After training the models, the results are stored in a folder called trained_results. Run the command below to reproduce the plots in our paper. If you change any of the hyperparameters, you will also need to update them in the plot script's file-naming convention.

python -u plot.py -env <name-of-environment>
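For example, to plot the Atari run above (the environment name is illustrative):

python -u plot.py -env BreakoutNoFrameskip-v4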

Empirical evaluation

Atari

Results on Atari environments. [figure: empirical evaluation on Atari]

Continuous Control

Results on continuous control tasks. [figure: empirical evaluation on continuous control]

Acknowledgement:

We would like to thank Ilya Kostrikov for creating the pytorch-a2c-ppo-acktr-gail repository that our codebase builds on.
