Policy Gradient Algorithms From Scratch (NumPy)
This repository showcases two policy gradient algorithms, One-Step Actor-Critic and Proximal Policy Optimization (PPO), implemented from scratch in NumPy and applied to two MDPs: Gridworld and Mountain Car. Both algorithms approximate the value function with a linear model and the policy with a single-layer softmax.
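For orientation, the sketch below shows what a linear value function and a single-layer softmax policy can look like in plain NumPy. The class names mirror the ValueFunction and Policy classes mentioned under Files, but the constructor arguments, learning rates, and method names here are illustrative assumptions, not the repository's exact API.

```python
import numpy as np

# Minimal sketch only: models.py may structure these classes differently.
class ValueFunction:
    def __init__(self, n_features, lr=0.1):
        self.w = np.zeros(n_features)   # linear weights
        self.lr = lr

    def predict(self, features):
        return self.w @ features

    def update(self, features, td_error):
        # Semi-gradient update toward the TD target.
        self.w += self.lr * td_error * features

class Policy:
    def __init__(self, n_features, n_actions, lr=0.01):
        self.theta = np.zeros((n_actions, n_features))  # one weight row per action
        self.lr = lr

    def probs(self, features):
        prefs = self.theta @ features
        prefs -= prefs.max()            # numerical stability
        exp = np.exp(prefs)
        return exp / exp.sum()

    def sample(self, features):
        return np.random.choice(len(self.theta), p=self.probs(features))
```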
Run Instructions
Packages:
numpy and matplotlib
Create a virtual environment, install the requirements, and run the experiments (Windows instructions):
- Run
python -m venv venv
- Run
.\venv\Scripts\activate
- Run
pip install -r requirements.txt
- Run
python .\experiments.py
Be aware of long compute times; plots will pop up during the run and must be closed in order to continue.
Some Sample Plots
Files
experiments.py
- Runs pre-programmed experiments that output various plots, both in the terminal and saved to .png files.
mdp.py
- Contains the two MDP domains, Gridworld and Mountain Car, that the experiments are run on.
models.py
- Contains ValueFunction and Policy, the two linear models used for function approximation by the algorithms.
policy_gradient_algorithms.py
- Contains the policy gradient algorithms One-Step Actor-Critic and Proximal Policy Optimization (PPO); a sketch of the actor-critic update is shown below.
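As a point of reference, here is a minimal sketch of a one-step actor-critic update on linear features in NumPy. The function name, hyperparameters, and loop structure are assumptions for illustration; the actual implementation in policy_gradient_algorithms.py may organize the models and updates differently.

```python
import numpy as np

def softmax(prefs):
    prefs = prefs - prefs.max()        # numerical stability
    exp = np.exp(prefs)
    return exp / exp.sum()

def actor_critic_step(theta, w, x, a, r, x_next, done,
                      gamma=0.99, alpha_theta=0.01, alpha_w=0.1):
    """One-step actor-critic update for a linear softmax policy (theta)
    and a linear value function (w) on feature vectors x, x_next.
    All constants here are illustrative, not the repository's settings."""
    # TD error using the linear state-value estimate.
    v = w @ x
    v_next = 0.0 if done else w @ x_next
    delta = r + gamma * v_next - v

    # Critic: semi-gradient TD(0) update.
    w = w + alpha_w * delta * x

    # Actor: grad log pi(a|x) for a linear softmax policy is x on the
    # chosen action's row minus pi(b|x) * x on every row b.
    pi = softmax(theta @ x)
    grad_log_pi = -np.outer(pi, x)
    grad_log_pi[a] += x
    theta = theta + alpha_theta * delta * grad_log_pi

    return theta, w
```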