Safe Policy Optimization with Local Features

Akifumi Wachi

Last update: Jun 5, 2022

Overview

Safe Policy Optimization with Local Feature (SPO-LF)

This is the source code for the algorithms in the paper "Safe Policy Optimization with Local Generalized Linear Function Approximations", which was presented at NeurIPS 2021.

Installation

This repository includes a requirements.txt. Apart from common modules (e.g., numpy, scipy), our source code depends on the following modules.

Mandatory
- Gym-MiniGrid (https://github.com/maximecb/gym-minigrid)
- Hydra (https://github.com/facebookresearch/hydra)
- pymdptoolbox (https://github.com/sawcordwell/pymdptoolbox)
Optional
- GPy (https://github.com/SheffieldML/GPy)

We also provide a Dockerfile in this repository, which can be used to reproduce our grid-world experiment.

Simulation configuration

We manage the simulation configuration using Hydra. The configuration options are listed in config.yaml. For example, the algorithm to run must be one of those we implemented:

sim_type: {safe_glm, unsafe_glm, random, oracle, safe_gp_state, safe_gp_feature, safe_glm_stepwise}
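As a rough illustration of how such a choice could be checked before launching a run, the sketch below parses key=value overrides of the kind passed on the command line and validates sim_type against the values listed above. The helpers parse_overrides and check_sim_type are hypothetical names introduced here; they are not part of this repository or of Hydra.

```python
# Hypothetical helpers for illustration; not part of the repository or Hydra.
SIM_TYPES = {
    "safe_glm", "unsafe_glm", "random", "oracle",
    "safe_gp_state", "safe_gp_feature", "safe_glm_stepwise",
}


def parse_overrides(args):
    """Turn ["sim_type=safe_glm", "env.reuse_env=False"] into a dict of strings."""
    overrides = {}
    for arg in args:
        key, sep, value = arg.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {arg!r}")
        overrides[key] = value
    return overrides


def check_sim_type(overrides):
    """Return the requested sim_type, or raise if it is not a listed choice."""
    sim_type = overrides.get("sim_type")
    if sim_type not in SIM_TYPES:
        raise ValueError(f"unknown sim_type: {sim_type!r}")
    return sim_type


overrides = parse_overrides(["sim_type=safe_glm", "env.reuse_env=False"])
print(check_sim_type(overrides))
```

A misspelled value such as sim_type=safe_gml would raise a ValueError here instead of failing later inside the simulation.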

Grid World Experiment

The source code for our grid-world experiment is in the /grid_world folder. For example, to run the simulation, use the following commands.

cd grid_world
python main.py sim_type=safe_glm env.reuse_env=False

To run the Monte Carlo simulation that compares our proposed method with the baselines, use the shell script run.sh.
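The contents of run.sh are not reproduced here. As a sketch under stated assumptions, a comparison of this kind could be driven from Python by building one command per algorithm and trial. The function build_commands and the n_trials parameter are hypothetical, and the sketch assumes that each run with env.reuse_env=False uses a freshly generated environment.

```python
import shlex

# sim_type values copied from the configuration section above.
SIM_TYPES = [
    "safe_glm", "unsafe_glm", "random", "oracle",
    "safe_gp_state", "safe_gp_feature", "safe_glm_stepwise",
]


def build_commands(sim_types, n_trials):
    """Return one argv list per (algorithm, trial) pair (hypothetical helper)."""
    commands = []
    for sim_type in sim_types:
        for _ in range(n_trials):
            commands.append(
                ["python", "main.py", f"sim_type={sim_type}", "env.reuse_env=False"]
            )
    return commands


# Print the commands instead of executing them.
for argv in build_commands(SIM_TYPES, n_trials=2):
    print(shlex.join(argv))
```

To actually execute the runs, each argv list could be passed to subprocess.run with cwd set to the grid_world folder.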

We also provide a script for visualization. If you want to render how the agent behaves, use the following command.

python main.py sim_type=safe_glm env.reuse_env=True

Safety-Gym Experiment

The source code for our Safety-Gym experiment is in the /safety_gym_discrete folder. The experiment is based on safety-gym. Our proposed method uses dynamic programming algorithms to solve the Bellman equation, so we modified engine.py to discretize the environment. The modified safety-gym source code is provided in /safety_gym_discrete/engine.py. To use the modified library, clone safety-gym, then replace safety-gym/safety_gym/envs/engine.py with /safety_gym_discrete/engine.py from our repo. Then use the following commands to install the modified library:

cd safety-gym
pip install -e .
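The file replacement described above, which has to happen before the pip install step, can also be scripted. The following is a minimal sketch: replace_engine is a hypothetical helper, and repo_root and safety_gym_root are assumed to point at a checkout of this repository and a clone of safety-gym, respectively. It keeps a backup of the original file, which the instructions above do not require.

```python
import shutil
from pathlib import Path


def replace_engine(repo_root, safety_gym_root):
    """Copy the modified engine.py over the one in a safety-gym clone."""
    src = Path(repo_root) / "safety_gym_discrete" / "engine.py"
    dst = Path(safety_gym_root) / "safety_gym" / "envs" / "engine.py"
    if not src.is_file():
        raise FileNotFoundError(f"modified engine.py not found: {src}")
    if not dst.is_file():
        raise FileNotFoundError(f"safety-gym engine.py not found: {dst}")
    # Keep the original once, so the change can be undone (an addition of this sketch).
    backup = dst.with_name(dst.name + ".orig")
    if not backup.exists():
        shutil.copy2(dst, backup)
    shutil.copy2(src, dst)
    return dst
```

The explicit existence checks make a missing source or target file fail with a clear message rather than a partial copy.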

Note that a MuJoCo license is needed to install Safety-Gym. To run the simulation, use the following commands.

cd safety_gym_discrete
python main.py sim_idx=0
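For context on why the environment is discretized: dynamic programming methods operate on a finite set of states and actions. The sketch below is plain value iteration on a tiny tabular MDP, written in pure Python for illustration. It is a generic textbook routine, not the SPO-LF algorithm, and it includes neither the safety constraints nor the function approximation from the paper. The names value_iteration, P, and R are assumptions of this sketch.

```python
def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Solve the Bellman optimality equation for a finite MDP.

    P[a][s][s2] is the probability of moving from s to s2 under action a,
    R[s][a] is the immediate reward. Returns (V, policy).
    """
    n_actions, n_states = len(P), len(P[0])
    V = [0.0] * n_states
    for _ in range(max_iter):
        # One Bellman backup: Q(s, a) = R(s, a) + gamma * sum_s2 P(s2 | s, a) V(s2)
        Q = [
            [
                R[s][a] + gamma * sum(P[a][s][s2] * V[s2] for s2 in range(n_states))
                for a in range(n_actions)
            ]
            for s in range(n_states)
        ]
        V_new = [max(q) for q in Q]
        delta = max(abs(v_new - v) for v_new, v in zip(V_new, V))
        V = V_new
        if delta < tol:
            break
    # Greedy policy with respect to the last computed Q values.
    policy = [max(range(n_actions), key=lambda a: Q[s][a]) for s in range(n_states)]
    return V, policy


# Two states; action 0 stays, action 1 moves to the other state.
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # action 0: stay
    [[0.0, 1.0], [1.0, 0.0]],  # action 1: move
]
R = [[0.0, 0.0], [1.0, 0.0]]  # reward 1 only for staying in state 1
V, policy = value_iteration(P, R)
print(policy)
```

In this toy problem the greedy policy moves from state 0 to state 1 and then stays there, since staying in state 1 is the only rewarded choice.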

We compare our proposed method with three notable baselines: CPO, PPO-Lagrangian, and TRPO-Lagrangian. The baseline implementation depends on safety-starter-agents. We modified run_agent.py in the repo source code.

To run a baseline, use the following commands.

cd safety_gym_discrete/baseline
python baseline_run.py sim_type=cpo

The environments that the agent runs in are generated using generate_env.py. We provide ten 50*50 environments. To generate other environments, change the world shape in safety_gym_discrete.py and run the following commands:

cd safety_gym_discrete
python generate_env.py

Citation

If you find this code useful in your research, please consider citing:

@inproceedings{wachi_yue_sui_neurips2021,
  Author = {Wachi, Akifumi and Wei, Yunyue and Sui, Yanan},
  Title = {Safe Policy Optimization with Local Generalized Linear Function Approximations},
  Booktitle  = {Neural Information Processing Systems (NeurIPS)},
  Year = {2021}
}

Comments

The missing engine.py file

Hello, you really did a good job, and I am interested in learning more details from your source code. You mentioned that replacing safety-gym/safety_gym/envs/engine.py using /safety_gym_discrete/engine.py, but I don't find the engine.py in the safety_gym_discrete folder. Maybe you forget to add and commit it to this repository? I would sincerely appreciate it if you can respond to my issue!

opened by molumitu 1
