# Getting Started
This repository contains the code used for the following publications:
- Probabilistic Guarantees for Safe Deep Reinforcement Learning (FORMATS 2020)
- Verifying Reinforcement Learning up to Infinity (IJCAI 2021)
- Verified Probabilistic Policies for Deep Reinforcement Learning (NFM 2022)
These instructions will help you set up the project.
## Prerequisites
Create and activate a virtual environment with conda:

```bash
conda env create -f environment.yml
conda activate safedrl
```

This installs all the Python dependencies.
In addition, download PRISM from https://github.com/phate09/prism and ensure you have Gradle installed (https://gradle.org/install/).
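A quick sanity check that the prerequisites are in place (assuming the environment created above is named `safedrl`, as in `environment.yml`):

```bash
conda activate safedrl
python --version   # should report the Python version from the conda environment
gradle --version   # confirms Gradle is installed and on the PATH
```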
## Running the code
Before running any code, open a new terminal, go to the PRISM project folder, and run:

```bash
gradle run
```

This starts the communication channel between PRISM and the rest of the repository.
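As a minimal sketch of what this channel makes possible, assuming the Gradle task exposes a py4j gateway on py4j's default port (25333); the entry-point method shown in the comment is hypothetical, and the actual API is defined by the PRISM fork linked above:

```python
from py4j.java_gateway import JavaGateway

# Connect to the JVM started by `gradle run` (py4j listens on port 25333 by default).
gateway = JavaGateway()

# Proxy for the object the PRISM side registers as its gateway entry point.
prism = gateway.entry_point

# Hypothetical call, for illustration only:
# result = prism.check_property("model.prism", 'P=? [ F "unsafe" ]')
```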
## Probabilistic Guarantees for Safe Deep Reinforcement Learning (FORMATS 2020)
### Training
Run `train_pendulum.py` inside `agents/dqn` to train the agent on the inverted pendulum problem, and record the location of the saved agent.
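For example, from the repository root (assuming the script can be invoked directly):

```bash
python agents/dqn/train_pendulum.py
```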
### Analysis
Run `domain_analysis_sym.py` inside `runnables/symbolic/dqn`, changing the paths to point to the saved network.
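For example, after editing the path constants in the script, from the repository root:

```bash
python runnables/symbolic/dqn/domain_analysis_sym.py
```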
## Verifying Reinforcement Learning up to Infinity (IJCAI 2021)
### Paper results

Download and unzip `experiment_collection_final.zip` into the `save` directory, then run:

```bash
tensorboard --logdir=./save/experiment_collection_final
```

(Results for the output range analysis experiments are in `experiment_collection_ora_final.zip`.)
### Train neural networks from scratch

Run any of the following (see the example invocation after this list):

- `training/tune_train_PPO_bouncing_ball.py`
- `training/tune_train_PPO_car.py`
- `training/tune_train_PPO_cartpole.py`
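For example, to train the cartpole agent from the repository root:

```bash
python training/tune_train_PPO_cartpole.py
```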
### Check safety of pretrained agents

Download and unzip `pretrained_agents.zip` into the `save` directory, then run:

```bash
python verification/run_tune_experiments.py
```

(To monitor the progress of the algorithm, run `tensorboard --logdir=./save/experiment_collection_final`.)
The results in TensorBoard can be filtered using regular expressions (e.g. `bouncing_ball.* template: 0`) in the search bar on the left.

The name of each experiment contains the name of the problem (`bouncing_ball`, `cartpole`, `stopping_car`), the amount of adversarial noise (`eps`, only for `stopping_car`), the time-step length used for the dynamics of the system (`tau`, only for `cartpole`), and the choice of restriction in order of complexity (0 = box, 1 = the chosen template, 2 = octagon).
The table in the paper is filled in using some of the metrics reported in TensorBoard:

| TensorBoard metric | Paper column |
| --- | --- |
| `max_t` | Avg timesteps |
| `seen` | Avg polyhedra |
| `time_since_restore` | Avg clock time (s) |