Repository containing the PhD Thesis "Formal Verification of Deep Reinforcement Learning Agents"

Overview

Getting Started

This repository contains the code used for the following publications:

  • Probabilistic Guarantees for Safe Deep Reinforcement Learning (FORMATS 2020)
  • Verifying Reinforcement Learning up to Infinity (IJCAI 2021)
  • Verified Probabilistic Policies for Deep Reinforcement Learning (NFM 2022)

These instructions will help with setting up the project

Prerequisites

Create a virtual environment with conda:

conda env create -f environment.yml
conda activate safedrl

This will take care of installing all the dependencies needed by python

In addition, download PRISM from the following link: https://github.com/phate09/prism

Ensure you have Gradle installed (https://gradle.org/install/)

Running the code

Before running any code, in a new terminal go to the PRISM project folder and run

gradle run

This will enable the communication channel between PRISM and the rest of the repository

Probabilistic Guarantees for Safe Deep Reinforcement Learning (FORMATS 2020)

Training

Run the train_pendulum.py inside agents/dqn to train the agent on the inverted pendulum problem and record the location of the saved agent

Analysis

Run the domain_analysis_sym.py inside runnables/symbolic/dqn changing paths to point to the saved network

Verifying Reinforcement Learning up to Infinity (IJCAI 2021)

####Paper results ## download and unzip experiment_collection_final.zip in the 'save' directory

run tensorboard --logdir=./save/experiment_collection_final

(results for the output range analysis experiments are in experiment_collection_ora_final.zip)

####Train neural networks from scratch ## run either:

  • training/tune_train_PPO_bouncing_ball.py
  • training/tune_train_PPO_car.py
  • training/tune_train_PPO_cartpole.py

####Check safety of pretrained agents ## download and unzip pretrained_agents.zip in the 'save' directory

run verification/run_tune_experiments.py

(to monitor the progress of the algorithm run tensorboard --logdir=./save/experiment_collection_final)

The results in tensorboard can be filtered using regular expressions (eg. "bouncing_ball.* template: 0") on the search bar on the left:

The name of the experiment contains the name of the problem (bouncing_ball, cartpole, stopping car), the amount of adversarial noise ("eps", only for stopping_car), the time steps length for the dynamics of the system ("tau", only for cartpole) and the choice of restriction in order of complexity (0 being box, 1 being the chosen template, and 2 being octagon).

The table in the paper is filled by using some of the metrics reported in tensorboard:

  • max_t: Avg timesteps
  • seen: Avg polyhedra
  • time_since_restore: Avg clock time (s)

alt text

Verified Probabilistic Policies for Deep Reinforcement Learning (NFM 2022)

You might also like...
Home repository for the Regularized Greedy Forest (RGF) library. It includes original implementation from the paper and multithreaded one written in C++, along with various language-specific wrappers.

Regularized Greedy Forest Regularized Greedy Forest (RGF) is a tree ensemble machine learning method described in this paper. RGF can deliver better r

Repository to run object detection on a model trained on an autonomous driving dataset.
Repository to run object detection on a model trained on an autonomous driving dataset.

Autonomous Driving Object Detection on the Raspberry Pi 4 Description of Repository This repository contains code and instructions to configure the ne

This repository is related to an Arabic tutorial, within the tutorial we discuss the common data structure and algorithms and their worst and best case for each, then implement the code using Python.

Data Structure and Algorithms with Python This repository is related to the Arabic tutorial here, within the tutorial we discuss the common data struc

data/code repository of "C2F-FWN: Coarse-to-Fine Flow Warping Network for Spatial-Temporal Consistent Motion Transfer"

C2F-FWN data/code repository of "C2F-FWN: Coarse-to-Fine Flow Warping Network for Spatial-Temporal Consistent Motion Transfer" (https://arxiv.org/abs/

This is a repository for a No-Code object detection inference API using the OpenVINO. It's supported on both Windows and Linux Operating systems.
This is a repository for a No-Code object detection inference API using the OpenVINO. It's supported on both Windows and Linux Operating systems.

OpenVINO Inference API This is a repository for an object detection inference API using the OpenVINO. It's supported on both Windows and Linux Operati

This repository is an unoffical PyTorch implementation of Medical segmentation in 3D and 2D.
This repository is an unoffical PyTorch implementation of Medical segmentation in 3D and 2D.

Pytorch Medical Segmentation Read Chinese Introduction:Here! Recent Updates 2021.1.8 The train and test codes are released. 2021.2.6 A bug in dice was

Open source repository for the code accompanying the paper 'Non-Rigid Neural Radiance Fields Reconstruction and Novel View Synthesis of a Deforming Scene from Monocular Video'.
Open source repository for the code accompanying the paper 'Non-Rigid Neural Radiance Fields Reconstruction and Novel View Synthesis of a Deforming Scene from Monocular Video'.

Non-Rigid Neural Radiance Fields This is the official repository for the project "Non-Rigid Neural Radiance Fields: Reconstruction and Novel View Synt

Implementation of self-attention mechanisms for general purpose. Focused on computer vision modules. Ongoing repository.
Implementation of self-attention mechanisms for general purpose. Focused on computer vision modules. Ongoing repository.

Self-attention building blocks for computer vision applications in PyTorch Implementation of self attention mechanisms for computer vision in PyTorch

This repository contains the code used for Predicting Patient Outcomes with Graph Representation Learning (https://arxiv.org/abs/2101.03940).
This repository contains the code used for Predicting Patient Outcomes with Graph Representation Learning (https://arxiv.org/abs/2101.03940).

Predicting Patient Outcomes with Graph Representation Learning This repository contains the code used for Predicting Patient Outcomes with Graph Repre

Comments
  • Error: Illegal instruction (core dumped)

    Error: Illegal instruction (core dumped)

    Hi I am a graduate student, and I am trying to run this code and getting into errors of "Illegal instruction (core dumped)". Here is my situation: My host machine is windows. I am running the code on a ubuntu virtual machine on Virtualbox. Right now, Prism is installed and the prism binary works. Gradle is installed and works. I am able to run "train_pendulum.py" and obtain the dqn model. But when running the "domain_analysis_sym.py", it runs into error "Illegal instruction (core dumped)".

    image

    By checking the diagnosis message, it reports "python domain_a[23732] trap invalid opcode ip:7f650a3094d0 sp:7ffc9c1a7ba8 error:0 in _pywrap_tensorflow_internal.so[7f64fa800000+2a2d8000]", attached below.

    image

    I searched for the error code online, and it might due to that the windows virtualization Hyper-V needs to be turned off in Virutalbox. I haven't succeeded with that yet. I would like to know what os you have run these code with. Is it a linux machine without virtualization?

    I was wondering if you could come up with any hint on this problem? Your help would be much appreciated.

    opened by taoxinkth 0
  • About using PRISM for validation in Python

    About using PRISM for validation in Python

    I am a graduate student who has just started researching the use of PRISM for some validation tasks. Instead of using a PRISM desktop application, I wanted to do this directly by building the PRISM model in Python and using the program interface.

    I notice that your work uses PRISM through Py4j, for example "MDP = gateway.entry_point.getmdpsimple ()". I'm curious if "getMdpSimple()" is in a Java program you wrote yourself, and how can I use PRISM's Java API?

    I would be honored if you could take note of my problem!

    opened by Serendipity953 4
  • meet error py4j.Py4JException: Method reset_mdp([]) does not exist

    meet error py4j.Py4JException: Method reset_mdp([]) does not exist

    when i try to run runnables/symbolic/dqn/domain_analysis_sym.py That error occurred:

    Resetting the StateStorage
    Loading complete
    Building the tree
    Resetting the tree
    Finished building the tree
    Iteration 0
    Traceback (most recent call last):
      File "runnables/symbolic/dqn/domain_analysis_sym.py", line 46, in <module>
        allow_assign_actions=True)
      File "/usr/local/SafeDRL-master/symbolic/unroll_methods.py", line 599, in probability_iteration
        storage.recreate_prism(horizon * 2)
      File "/usr/local/SafeDRL-master/prism/state_storage.py", line 102, in recreate_prism
        mdp = gateway.entry_point.reset_mdp()
      File "/root/miniconda3/envs/safedrl/lib/python3.7/site-packages/py4j/java_gateway.py", line 1310, in __call__
        answer, self.gateway_client, self.target_id, self.name)
      File "/root/miniconda3/envs/safedrl/lib/python3.7/site-packages/py4j/protocol.py", line 332, in get_return_value
        format(target_id, ".", name, value))
    py4j.protocol.Py4JError: An error occurred while calling t.reset_mdp. Trace:
    py4j.Py4JException: Method reset_mdp([]) does not exist
    	at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:318)
    	at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:326)
    	at py4j.Gateway.invoke(Gateway.java:274)
    	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
    	at py4j.commands.CallCommand.execute(CallCommand.java:79)
    	at py4j.GatewayConnection.run(GatewayConnection.java:238)
    	at java.lang.Thread.run(Thread.java:748)
    

    why come to that Is it due to spark version?

    opened by kirlaw 4
Owner
Edoardo Bacci
Edoardo Bacci
Implementation of the bachelor's thesis "Real-time stock predictions with deep learning and news scraping".

Real-time stock predictions with deep learning and news scraping This repository contains a partial implementation of my bachelor's thesis "Real-time

David Álvarez de la Torre 0 Feb 9, 2022
Implementation of the master's thesis "Temporal copying and local hallucination for video inpainting".

Temporal copying and local hallucination for video inpainting This repository contains the implementation of my master's thesis "Temporal copying and

David Álvarez de la Torre 1 Dec 2, 2022
Politecnico of Turin Thesis: "Implementation and Evaluation of an Educational Chatbot based on NLP Techniques"

THESIS_CAIRONE_FIORENTINO Politecnico of Turin Thesis: "Implementation and Evaluation of an Educational Chatbot based on NLP Techniques" GENERATE TOKE

cairone_fiorentino97 1 Dec 10, 2021
Python library containing BART query generation and BERT-based Siamese models for neural retrieval.

Neural Retrieval Embedding-based Zero-shot Retrieval through Query Generation leverages query synthesis over large corpuses of unlabeled text (such as

Amazon Web Services - Labs 35 Apr 14, 2022
PyTorch CZSL framework containing GQA, the open-world setting, and the CGE and CompCos methods.

Compositional Zero-Shot Learning This is the official PyTorch code of the CVPR 2021 works Learning Graph Embeddings for Compositional Zero-shot Learni

EML Tübingen 70 Dec 27, 2022
TorchX is a library containing standard DSLs for authoring and running PyTorch related components for an E2E production ML pipeline.

TorchX is a library containing standard DSLs for authoring and running PyTorch related components for an E2E production ML pipeline

null 193 Dec 22, 2022
A Planar RGB-D SLAM which utilizes Manhattan World structure to provide optimal camera pose trajectory while also providing a sparse reconstruction containing points, lines and planes, and a dense surfel-based reconstruction.

ManhattanSLAM Authors: Raza Yunus, Yanyan Li and Federico Tombari ManhattanSLAM is a real-time SLAM library for RGB-D cameras that computes the camera

null 117 Dec 28, 2022
A toolkit for document-level event extraction, containing some SOTA model implementations

❤️ A Toolkit for Document-level Event Extraction with & without Triggers Hi, there ?? . Thanks for your stay in this repo. This project aims at buildi

Tong Zhu(朱桐) 159 Dec 22, 2022
Google-drive-to-sqlite - Create a SQLite database containing metadata from Google Drive

google-drive-to-sqlite Create a SQLite database containing metadata from Google

Simon Willison 140 Dec 4, 2022