Code for "Hierarchical Skills for Efficient Exploration" HSD-3 Algorithm and Baselines

Facebook Research

Last update: Dec 6, 2022

Overview

Hierarchical Skills for Efficient Exploration

This is the source code release for the paper Hierarchical Skills for Efficient Exploration. It contains:

- Code for pre-training and hierarchical learning with HSD-3
- Code for the baselines we compare to in the paper

Additionally, we provide pre-trained skill policies for the Walker and Humanoid robots considered in the paper.

The benchmark suite is available in a standalone repository at facebookresearch/bipedal-skills.

Prerequisites

Install PyTorch according to the official instructions, for example in a new conda environment. This code-base was tested with PyTorch 1.8 and 1.9.

Then, install remaining requirements via

pip install -r requirements.txt

For optimal performance, we also recommend installing NVIDIA's PyTorch extensions.

Usage

We use Hydra to handle training configurations, with some defaults that might not make everyone happy. In particular, we disable Hydra's default job directory management. This is convenient for local development but not desirable for running full experiments. It can be changed by adapting the initial portion of config/common.yaml or by passing something like hydra.run.dir=./outputs/my-custom-string to the commands below.
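For full experiments, one option is to give every run its own output directory. The sketch below builds a hydra.run.dir override from a tag and a timestamp. The helper name run_dir_override and the naming scheme are illustrative assumptions and are not part of this repository; only the hydra.run.dir override itself comes from the text above.

```python
from datetime import datetime
from typing import Optional


def run_dir_override(
    tag: str, root: str = "./outputs", now: Optional[datetime] = None
) -> str:
    """Return a Hydra override that points the job directory at a unique path.

    Hypothetical helper: the tag/timestamp naming scheme is an assumption.
    """
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    return f"hydra.run.dir={root}/{tag}-{stamp}"


if __name__ == "__main__":
    # Prints an override that can be appended to a train.py or pretrain.py command.
    print(run_dir_override("walker-hsd3"))
```

The printed string is meant to be appended as an extra argument to any of the commands in the following sections.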

Pre-training Hierarchical Skills

For pre-training skill policies, use the pretrain.py script (note that this requires a machine with 2 GPUs):

# Walker robot
python pretrain.py -cn walker_pretrain
# Humanoid robot
python pretrain.py -cn humanoid_pretrain

Hierarchical Control

High-level policy training with HSD-3 is done as follows:

# Walker robot
python train.py -cn walker_hsd3
# Humanoid robot
python train.py -cn humanoid_hsd3

The default configuration assumes that a pre-trained skill policy is available at checkpoint-lo.pt. The location can be overridden by setting a new value for agent.lo.init_from (see below for an example). By default, a high-level agent will be trained on the "Hurdles" task. This can be changed by passing env.name=BiskStairs-v1, for example.

Pre-trained skill policies are available here. After unpacking the archive in the top-level directory of this repository, they can be used as follows:

# Walker robot
python train.py -cn walker_hsd3 agent.lo.init_from=$PWD/pretrained-skills/walker.pt
# Humanoid robot
python train.py -cn humanoid_hsd3 agent.lo.init_from=$PWD/pretrained-skills/humanoidpc.pt
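Because training expects the skill checkpoint to exist, it can help to verify the path before launching a long run. The following is a minimal sketch, assuming the archive was unpacked into pretrained-skills/ as described above; the helper init_from_override is hypothetical and not part of this repository.

```python
from pathlib import Path

# File names as they appear in the commands above.
PRETRAINED = {"walker": "walker.pt", "humanoid": "humanoidpc.pt"}


def init_from_override(robot: str, repo_root: str = ".") -> str:
    """Return an agent.lo.init_from override, failing early if the file is missing.

    Hypothetical helper; it only checks the path and formats the override.
    """
    ckpt = (Path(repo_root) / "pretrained-skills" / PRETRAINED[robot]).resolve()
    if not ckpt.is_file():
        raise FileNotFoundError(f"missing skill checkpoint: {ckpt}")
    return f"agent.lo.init_from={ckpt}"


if __name__ == "__main__":
    try:
        print(init_from_override("walker"))
    except FileNotFoundError as err:
        print(err)
```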

Baselines

Individual baselines can be run by passing the following as the -cn argument to train.py (for the Walker robot):

Baseline	Configuration name
Soft Actor-Critic	`walker_sac`
DIAYN-C pre-training	`walker_diaync_pretrain`
DIAYN-C HRL	`walker_diaync_hrl`
HIRO-SAC	`walker_hiro`
Switching Ensemble	`walker_se`
HSD-Bandit	`walker_hsdb`
SD	`walker_sd`
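To launch several baselines in sequence, the configuration names from the table can be turned into train.py commands. The sketch below only builds and prints the commands and does not execute anything. The helper baseline_command is an illustrative assumption, not part of this repository.

```python
import shlex
from typing import List, Sequence

# Configuration names copied from the table above (Walker robot).
WALKER_BASELINES = {
    "Soft Actor-Critic": "walker_sac",
    "DIAYN-C pre-training": "walker_diaync_pretrain",
    "DIAYN-C HRL": "walker_diaync_hrl",
    "HIRO-SAC": "walker_hiro",
    "Switching Ensemble": "walker_se",
    "HSD-Bandit": "walker_hsdb",
    "SD": "walker_sd",
}


def baseline_command(config_name: str, overrides: Sequence[str] = ()) -> List[str]:
    """Build the argument list for one baseline run (nothing is executed)."""
    return ["python", "train.py", "-cn", config_name, *overrides]


if __name__ == "__main__":
    for label, config_name in WALKER_BASELINES.items():
        print(f"# {label}")
        print(shlex.join(baseline_command(config_name)))
```

Extra Hydra overrides such as env.name=BiskStairs-v1 can be passed through the overrides argument.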

By default, walker_sd will select the full goal space. Other goal spaces can be selected by modifying the configuration, e.g., passing subsets=2-3+4 will limit high-level control to X translation (2) and the left foot (3+4).
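The subsets notation can be read as follows, judging from the example above: "-" separates goal subsets, and "+" joins the feature indices that belong to one subset. The sketch below parses a specification under that assumed reading. It is not the repository's own parser, and parse_subsets is a hypothetical name.

```python
from typing import List


def parse_subsets(spec: str) -> List[List[int]]:
    """Split a spec such as "2-3+4" into groups of feature indices.

    Assumed reading of the notation: "-" separates goal subsets and "+"
    joins the features inside one subset.
    """
    return [[int(feature) for feature in group.split("+")] for group in spec.split("-")]


if __name__ == "__main__":
    # X translation (2) as one subset, the left foot (3 and 4) as another.
    print(parse_subsets("2-3+4"))
```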

License

hsd3 is MIT licensed, as found in the LICENSE file.


Comments

Observations and Actions spaces

Hi! First of all - great repo, code, and paper. I am just starting my adventure with HRL and probably this is the best-structured repo I've seen so far.

The only issue I have is the interpretation of observation and action spaces - where can I find the documentation of the vector elements?

opened by MichalBortkiewicz 2
Can't run baseline algorithms.

Hi, I am trying to run some baseline codes, but I run into the same issue. I followed the requirements.txt and used gym==0.23.1. Below is the output of the problem. Thanks.

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
Exception ignored in: <function VectorEnv.__del__ at 0x7fe9eac66e50>
Traceback (most recent call last):
  File "hsd3/env/lib/python3.8/site-packages/gym/vector/vector_env.py", line 215, in __del__
    self.close()
  File "hsd3/env/lib/python3.8/site-packages/gym/vector/vector_env.py", line 193, in close
    self.close_extras(**kwargs)
  File "hsd3/hucc/envs/thmp_vector_env.py", line 329, in close_extras
    if self._state != AsyncState.DEFAULT:
AttributeError: 'TorchAsyncVectorEnv' object has no attribute '_state'

opened by XinyuWang2 2
