DROPO: Sim-to-Real Transfer with Offline Domain Randomization

Overview

DROPO: Sim-to-Real Transfer with Offline Domain Randomization

Gabriele Tiboni, Karol Arndt, Ville Kyrki.

This repository contains the code for the paper: "DROPO: Sim-to-Real Transfer with Offline Domain Randomization" submitted to the IEEE Robotics and Automation Letters (RAL) Journal, in December 2021.

Abstract: In recent years, domain randomization has gained a lot of traction as a method for sim-to-real transfer of reinforcement learning policies; however, coming up with optimal randomization ranges can be difficult. In this paper, we introduce DROPO, a novel method for estimating domain randomization ranges for a safe sim-to-real transfer. Unlike prior work, DROPO only requires a precollected offline dataset of trajectories, and does not converge to point estimates. We demonstrate that DROPO is capable of recovering dynamic parameter distributions in simulation and finding a distribution capable of compensating for an unmodelled phenomenon. We also evaluate the method on two zero-shot sim-to-real transfer scenarios, showing a successful domain transfer and improved performance over prior methods.

dropo_general_framework

Requirements

This repository makes use of the following external libraries:

How to launch DROPO

1. Dataset collection and formatting

Prior to running the code, an offline dataset of trajectories from the target (real) environment needs to be collected. This dataset can be generated either by rolling out any previously trained policy, or by kinesthetic guidance of the robot.

The dataset object must be formatted as follows:

n : int
      state space dimensionality
a : int
      action space dimensionality
t : int
      number of state transitions

dataset : dict,
      object containing offline-collected trajectories

dataset['observations'] : ndarray
      2D array (t, n) containing the current state information for each timestep

dataset['next_observations'] : ndarray
      2D array (t, n) containing the next-state information for each timestep

dataset['actions'] : ndarray
      2D array (t, a) containing the action commanded to the agent at the current timestep

dataset['terminals'] : ndarray
      1D array (t,) of booleans indicating whether or not the current state transition is terminal (ends the episode)

2. Add environment-specific methods

Augment the simulated environment with the following methods to allow Domain Randomization and its optimization:

  • env.set_task(*new_task) # Set new dynamics parameters

  • env.get_task() # Get current dynamics parameters

  • mjstate = env.get_sim_state() # Get current internal mujoco state

  • env.get_initial_mjstate(state) and env.get_full_mjstate # Get the internal mujoco state from given state

  • env.set_sim_state(mjstate) # Set the simulator to a specific mujoco state

  • env.set_task_search_bounds() # Set the search bound for the mean of the dynamics parameters

  • (optional) env.get_task_lower_bound(i) # Get lower bound for i-th dynamics parameter

  • (optional) env.get_task_upper_bound(i) # Get upper bound for i-th dynamics parameter

3. Run test_dropo.py

Sample file to launch DROPO.

Test DROPO on the Hopper environment

This repository contains a ready-to-use Hopper environment implementation (based on the code from OpenAI gym) and an associated offline dataset to run quick DROPO experiments on Hopper, with randomized link masses. The dataset consists of 20 trajectories collected on the ground truth hopper environment with mass values [3.53429174, 3.92699082, 2.71433605, 5.0893801].

E.g.:

  • Quick test (10 sparse transitions and 1000 obj. function evaluations only):

    python3 test_dropo.py --sparse-mode -n 10 -l 1 --budget 1000 -av --epsilon 1e-5 --seed 100 --dataset datasets/hopper10000 --normalize --logstdevs

  • Advanced test (2 trajectories are considered, with 5000 obj. function evaluations, and 10 parallel workers):

    python3 test_dropo.py -n 2 -l 1 --budget 5000 -av --epsilon 1e-5 --seed 100 --dataset datasets/hopper10000 --normalize --logstdevs --now 10

test_dropo.py will return the optimized domain randomization distribution, suitable for training a reinforcement learning policy on the same simulated environment.

Cite us

If you use this repository, please consider citing

    @misc{tiboni2022dropo,
          title={DROPO: Sim-to-Real Transfer with Offline Domain Randomization},
          author={Gabriele Tiboni and Karol Arndt and Ville Kyrki},
          year={2022},
          eprint={2201.08434},
          archivePrefix={arXiv},
          primaryClass={cs.RO}
    }
You might also like...
Official pytorch implementation of "Feature Stylization and Domain-aware Contrastive Loss for Domain Generalization" ACMMM 2021 (Oral)

Feature Stylization and Domain-aware Contrastive Loss for Domain Generalization This is an official implementation of "Feature Stylization and Domain-

Variational Attention: Propagating Domain-Specific Knowledge for Multi-Domain Learning in Crowd Counting (ICCV, 2021)

DKPNet ICCV 2021 Variational Attention: Propagating Domain-Specific Knowledge for Multi-Domain Learning in Crowd Counting Baseline of DKPNet is availa

CDTrans: Cross-domain Transformer for Unsupervised Domain Adaptation
CDTrans: Cross-domain Transformer for Unsupervised Domain Adaptation

CDTrans: Cross-domain Transformer for Unsupervised Domain Adaptation [arxiv] This is the official repository for CDTrans: Cross-domain Transformer for

CDTrans: Cross-domain Transformer for Unsupervised Domain Adaptation
CDTrans: Cross-domain Transformer for Unsupervised Domain Adaptation

[ICCV2021] TransReID: Transformer-based Object Re-Identification [pdf] The official repository for TransReID: Transformer-based Object Re-Identificati

Implementation for
Implementation for "Domain-Specific Bias Filtering for Single Labeled Domain Generalization"

DSBF Introduction This repository contains the implementation code for paper: Domain-Specific Bias Filtering for Single Labeled Domain Generalization

A Pytorch Implementation of [Source data‐free domain adaptation of object detector through domain

A Pytorch Implementation of Source data‐free domain adaptation of object detector through domain‐specific perturbation Please follow Faster R-CNN and

TCNN Temporal convolutional neural network for real-time speech enhancement in the time domain
TCNN Temporal convolutional neural network for real-time speech enhancement in the time domain

TCNN Pandey A, Wang D L. TCNN: Temporal convolutional neural network for real-time speech enhancement in the time domain[C]//ICASSP 2019-2019 IEEE Int

Official code of
Official code of "R2RNet: Low-light Image Enhancement via Real-low to Real-normal Network."

R2RNet Official code of "R2RNet: Low-light Image Enhancement via Real-low to Real-normal Network." Jiang Hai, Zhu Xuan, Ren Yang, Yutong Hao, Fengzhu

Real-ESRGAN: Training Real-World Blind Super-Resolution with Pure Synthetic Data

Real-ESRGAN Real-ESRGAN: Training Real-World Blind Super-Resolution with Pure Synthetic Data Ported from https://github.com/xinntao/Real-ESRGAN Depend

Owner
Gabriele Tiboni
First-year Ellis PhD student in Artificial Intelligence @ Politecnico di Torino.
Gabriele Tiboni
code for our paper "Source Data-absent Unsupervised Domain Adaptation through Hypothesis Transfer and Labeling Transfer"

SHOT++ Code for our TPAMI submission "Source Data-absent Unsupervised Domain Adaptation through Hypothesis Transfer and Labeling Transfer" that is ext

null 75 Dec 16, 2022
Code for CVPR2021 "Visualizing Adapted Knowledge in Domain Transfer". Visualization for domain adaptation. #explainable-ai

Visualizing Adapted Knowledge in Domain Transfer @inproceedings{hou2021visualizing, title={Visualizing Adapted Knowledge in Domain Transfer}, auth

Yunzhong Hou 80 Dec 25, 2022
Transfer-Learn is an open-source and well-documented library for Transfer Learning.

Transfer-Learn is an open-source and well-documented library for Transfer Learning. It is based on pure PyTorch with high performance and friendly API. Our code is pythonic, and the design is consistent with torchvision. You can easily develop new algorithms, or readily apply existing algorithms.

THUML @ Tsinghua University 2.2k Jan 3, 2023
Transfer style api - An API to use with Tranfer Style App, where you can use two image and transfer the style

Transfer Style API It's an API to use with Tranfer Style App, where you can use

Brian Alejandro 1 Feb 13, 2022
Official Implementation of Domain-Aware Universal Style Transfer

Domain Aware Universal Style Transfer Official Pytorch Implementation of 'Domain Aware Universal Style Transfer' (ICCV 2021) Domain Aware Universal St

KibeomHong 80 Dec 30, 2022
PyTorch Implementation of CycleGAN and SSGAN for Domain Transfer (Minimal)

MNIST-to-SVHN and SVHN-to-MNIST PyTorch Implementation of CycleGAN and Semi-Supervised GAN for Domain Transfer. Prerequites Python 3.5 PyTorch 0.1.12

Yunjey Choi 401 Dec 30, 2022
Personalized Transfer of User Preferences for Cross-domain Recommendation (PTUPCDR)

This is the official implementation of our paper Personalized Transfer of User Preferences for Cross-domain Recommendation (PTUPCDR), which has been accepted by WSDM2022.

Yongchun Zhu 81 Dec 29, 2022
Instant Real-Time Example-Based Style Transfer to Facial Videos

FaceBlit: Instant Real-Time Example-Based Style Transfer to Facial Videos The official implementation of FaceBlit: Instant Real-Time Example-Based Sty

Aneta Texler 131 Dec 19, 2022
Unofficial pytorch implementation of 'Arbitrary Style Transfer in Real-time with Adaptive Instance Normalization'

pytorch-AdaIN This is an unofficial pytorch implementation of a paper, Arbitrary Style Transfer in Real-time with Adaptive Instance Normalization [Hua

Naoto Inoue 873 Jan 6, 2023
[CVPR2021] Domain Consensus Clustering for Universal Domain Adaptation

[CVPR2021] Domain Consensus Clustering for Universal Domain Adaptation [Paper] Prerequisites To install requirements: pip install -r requirements.txt

Guangrui Li 84 Dec 26, 2022