Safe Model-Based Reinforcement Learning using Robust Control Barrier Functions

Overview

README

Repository containing the code for the paper "Safe Model-Based Reinforcement Learning using Robust Control Barrier Functions". Specifically, an implementation of SAC + Robust Control Barrier Functions (RCBFs) for safe reinforcement learning in two custom environments.

While exploring, an RL agent can take actions that lead the system to unsafe states. Here, we use a differentiable RCBF safety layer that minimially alters (in the least-squares sense) the actions taken by the RL agent to ensure the safety of the agent.

Robust control barrier functions

As explained in the paper, RCBFs are formulated with respect to differential inclusions that serve to represent disturbed dynamical system (x_dot \in f(x) + g(x)u + D(x)). The QP used to ensure the system's safety is given by:

u_star(x) = minimize_u ||u||^2 + l ||epsilon||^2
subject to min. h_dot(x, D(x), u, u_RL) > - gamma * h(x) + epsilon

In this work, the disturbance set D in the differential inclusion is learned via Gaussian Processes (GPs). The underlying library is GPyTorch.

Coupling RL & RCBFs to improve training performance

The above is sufficient to ensure the safety of the system, however, we would also like to improve the performance of the learning by letting the RCBF layer guide the training. This is achieved via:

  • Using a differentiable version of the safety layer that allows us to backpropagte through the RCBF based Quadratic Program (QP).
  • Using the GPs and the dynamics prior to generate synthetic data (model-based RL).

Other approaches

In addition, the approach is compared against two other frameworks (implementated here) in the experiments:

Running the experiments

The two environments are Unicycle and SimulatedCars. Unicycle involves a unicycle robot tasked with reaching a desired location while avoiding obstacles and SimulatedCars involves a chain of cars driving in a lane, the RL agent controls the 4th car and must try minimzing control effort while avoiding colliding with the other cars.

  • Running the proposed approach: python main.py --env SimulatedCars --cuda --updates_per_step 2 --batch_size 512 --seed 12345 --model_based

  • Running the baseline: python main.py --env SimulatedCars --cuda --updates_per_step 1 --batch_size 256 --seed 12345 --no_diff_qp

  • Running the modified approach from "End-to-End Safe Reinforcement Learning through Barrier Functions for Safety-Critical Continuous Control Tasks": python main.py --env SimulatedCars --cuda --updates_per_step 1 --batch_size 256 --seed 12345 --no_diff_qp --use_comp True

You might also like...
[ICML 2020] Prediction-Guided Multi-Objective Reinforcement Learning for Continuous Robot Control
[ICML 2020] Prediction-Guided Multi-Objective Reinforcement Learning for Continuous Robot Control

PG-MORL This repository contains the implementation for the paper Prediction-Guided Multi-Objective Reinforcement Learning for Continuous Robot Contro

Tensorflow implementation of Human-Level Control through Deep Reinforcement Learning
Tensorflow implementation of Human-Level Control through Deep Reinforcement Learning

Human-Level Control through Deep Reinforcement Learning Tensorflow implementation of Human-Level Control through Deep Reinforcement Learning. This imp

Doosan robotic arm, simulation, control, visualization in Gazebo and ROS2 for Reinforcement Learning.
Doosan robotic arm, simulation, control, visualization in Gazebo and ROS2 for Reinforcement Learning.

Robotic Arm Simulation in ROS2 and Gazebo General Overview This repository includes: First, how to simulate a 6DoF Robotic Arm from scratch using GAZE

 Multi-Agent Reinforcement Learning for Active Voltage Control on Power Distribution Networks (MAPDN)
Multi-Agent Reinforcement Learning for Active Voltage Control on Power Distribution Networks (MAPDN)

Multi-Agent Reinforcement Learning for Active Voltage Control on Power Distribution Networks (MAPDN) This is the implementation of the paper Multi-Age

Repository for reproducing `Model-Based Robust Deep Learning`

Model-Based Robust Deep Learning (MBRDL) In this repository, we include the code necessary for reproducing the code used in Model-Based Robust Deep Le

ROS-UGV-Control-Interface - Control interface which can be used in any UGV
ROS-UGV-Control-Interface - Control interface which can be used in any UGV

ROS-UGV-Control-Interface Cam Closed: Cam Opened:

The code for the NSDI'21 paper "BMC: Accelerating Memcached using Safe In-kernel Caching and Pre-stack Processing".

BMC The code for the NSDI'21 paper "BMC: Accelerating Memcached using Safe In-kernel Caching and Pre-stack Processing". BibTex entry available here. B

The implementation of the algorithm in the paper "Safe Deep Semi-Supervised Learning for Unseen-Class Unlabeled Data" published in ICML 2020.

DS3L This is the code for paper "Safe Deep Semi-Supervised Learning for Unseen-Class Unlabeled Data" published in ICML 2020. Setups The code is implem

SafePicking: Learning Safe Object Extraction via Object-Level Mapping, ICRA 2022
SafePicking: Learning Safe Object Extraction via Object-Level Mapping, ICRA 2022

SafePicking Learning Safe Object Extraction via Object-Level Mapping Kentaro Wad

Owner
Yousef Emam
Robotics PhD student at the Georgia Institute of Technology.
Yousef Emam
PyBullet CartPole and Quadrotor environments—with CasADi symbolic a priori dynamics—for learning-based control and reinforcement learning

safe-control-gym Physics-based CartPole and Quadrotor Gym environments (using PyBullet) with symbolic a priori dynamics (using CasADi) for learning-ba

Dynamic Systems Lab 300 Dec 28, 2022
Keeping it safe - AI Based COVID-19 Tracker using Deep Learning and facial recognition

Keeping it safe - AI Based COVID-19 Tracker using Deep Learning and facial recognition

Vansh Wassan 15 Jun 17, 2021
Official implementation of "Learning Forward Dynamics Model and Informed Trajectory Sampler for Safe Quadruped Navigation" (RSS 2022)

Intro Official implementation of "Learning Forward Dynamics Model and Informed Trajectory Sampler for Safe Quadruped Navigation" Robotics:Science and

Yunho Kim 21 Dec 7, 2022
Thermal Control of Laser Powder Bed Fusion using Deep Reinforcement Learning

This repository is the implementation of the paper "Thermal Control of Laser Powder Bed Fusion Using Deep Reinforcement Learning", linked here. The project makes use of the Deep Reinforcement Library stable-baselines3 to derive a control policy that maximizes melt pool depth consistency.

BaratiLab 11 Dec 27, 2022
Self-Correcting Quantum Many-Body Control using Reinforcement Learning with Tensor Networks

Self-Correcting Quantum Many-Body Control using Reinforcement Learning with Tensor Networks This repository contains the code and data for the corresp

Friederike Metz 7 Apr 23, 2022
RoMA: Robust Model Adaptation for Offline Model-based Optimization

RoMA: Robust Model Adaptation for Offline Model-based Optimization Implementation of RoMA: Robust Model Adaptation for Offline Model-based Optimizatio

null 9 Oct 31, 2022
Hand Gesture Volume Control is AIML based project which uses image processing to control the volume of your Computer.

Hand Gesture Volume Control Modules There are basically three modules Handtracking Program Handtracking Module Volume Control Program Handtracking Pro

VITTAL 1 Jan 12, 2022
Conservative Q Learning for Offline Reinforcement Reinforcement Learning in JAX

CQL-JAX This repository implements Conservative Q Learning for Offline Reinforcement Reinforcement Learning in JAX (FLAX). Implementation is built on

Karush Suri 8 Nov 7, 2022
Reinforcement-learning - Repository of the class assignment questions for the course on reinforcement learning

DSE 314/614: Reinforcement Learning This repository containing reinforcement lea

Manav Mishra 4 Apr 15, 2022
LBK 35 Dec 26, 2022