483 Repositories
Python reinforcement-agent Libraries
MetaDrive: Composing Diverse Scenarios for Generalizable Reinforcement Learning
MetaDrive: Composing Diverse Driving Scenarios for Generalizable RL [ Documentation | Demo Video ] MetaDrive is a driving simulator with the following
Tensorforce: a TensorFlow library for applied reinforcement learning
Tensorforce: a TensorFlow library for applied reinforcement learning Introduction Tensorforce is an open-source deep reinforcement learning framework,
Deep Learning and Reinforcement Learning Library for Scientists and Engineers 🔥
TensorLayer is a novel TensorFlow-based deep learning and reinforcement learning library designed for researchers and engineers. It provides an extens
Tensorflow implementation of Human-Level Control through Deep Reinforcement Learning
Human-Level Control through Deep Reinforcement Learning Tensorflow implementation of Human-Level Control through Deep Reinforcement Learning. This imp
SenseNet is a sensorimotor and touch simulator for deep reinforcement learning research
SenseNet is a sensorimotor and touch simulator for deep reinforcement learning research
PyTorch implementation of Advantage async actor-critic Algorithms (A3C) in PyTorch
Advantage async actor-critic Algorithms (A3C) in PyTorch @inproceedings{mnih2016asynchronous, title={Asynchronous methods for deep reinforcement lea
Deep Reinforcement Learning with pytorch & visdom
Deep Reinforcement Learning with pytorch & visdom Sample testings of trained agents (DQN on Breakout, A3C on Pong, DoubleDQN on CartPole, continuous A
Implementation of algorithms for continuous control (DDPG and NAF).
DEPRECATION This repository is deprecated and is no longer maintaned. Please see a more recent implementation of RL for continuous control at jax-sac.
Reinforcement learning models in ViZDoom environment
DoomNet DoomNet is a ViZDoom agent trained by reinforcement learning. The agent is a neural network that outputs a probability of actions given only p
Deep Q-Learning Network in pytorch (not actively maintained)
pytoch-dqn This project is pytorch implementation of Human-level control through deep reinforcement learning and I also plan to implement the followin
CompilerGym is a library of easy to use and performant reinforcement learning environments for compiler tasks
CompilerGym is a library of easy to use and performant reinforcement learning environments for compiler tasks
Efficient Sparse Attacks on Videos using Reinforcement Learning
EARL This repository provides a simple implementation of the work "Efficient Sparse Attacks on Videos using Reinforcement Learning" Example: Demo: Her
Abstractive opinion summarization system (SelSum) and the largest dataset of Amazon product summaries (AmaSum). EMNLP 2021 conference paper.
Learning Opinion Summarizers by Selecting Informative Reviews This repository contains the codebase and the dataset for the corresponding EMNLP 2021
Roach: End-to-End Urban Driving by Imitating a Reinforcement Learning Coach
CARLA-Roach This is the official code release of the paper End-to-End Urban Driving by Imitating a Reinforcement Learning Coach by Zhejun Zhang, Alexa
A Deep Reinforcement Learning Framework for Stock Market Trading
DQN-Trading This is a framework based on deep reinforcement learning for stock market trading. This project is the implementation code for the two pap
PPO is a very popular Reinforcement Learning algorithm at present.
PPO is a very popular Reinforcement Learning algorithm at present. OpenAI takes PPO as the current baseline algorithm. We use the PPO algorithm to train a policy to give the best action in any situation.
A general-purpose, flexible, and easy-to-use simulator alongside an OpenAI Gym trading environment for MetaTrader 5 trading platform (Approved by OpenAI Gym)
gym-mtsim: OpenAI Gym - MetaTrader 5 Simulator MtSim is a simulator for the MetaTrader 5 trading platform alongside an OpenAI Gym environment for rein
PyBullet CartPole and Quadrotor environments—with CasADi symbolic a priori dynamics—for learning-based control and reinforcement learning
safe-control-gym Physics-based CartPole and Quadrotor Gym environments (using PyBullet) with symbolic a priori dynamics (using CasADi) for learning-ba
Yet Another Reinforcement Learning Tutorial
This repo contains self-contained RL implementations
Decentralized Reinforcment Learning: Global Decision-Making via Local Economic Transactions (ICML 2020)
Decentralized Reinforcement Learning This is the code complementing the paper Decentralized Reinforcment Learning: Global Decision-Making via Local Ec
Code for the paper "Reinforcement Learning as One Big Sequence Modeling Problem"
Trajectory Transformer Code release for Reinforcement Learning as One Big Sequence Modeling Problem. Installation All python dependencies are in envir
An efficient framework for reinforcement learning.
rl: An efficient framework for reinforcement learning Requirements Introduction PPO Test Requirements name version Python =3.7 numpy =1.19 torch =1
WarpDrive: Extremely Fast End-to-End Deep Multi-Agent Reinforcement Learning on a GPU
WarpDrive is a flexible, lightweight, and easy-to-use open-source reinforcement learning (RL) framework that implements end-to-end multi-agent RL on a single GPU (Graphics Processing Unit).
[ICCV'21] Official implementation for the paper Social NCE: Contrastive Learning of Socially-aware Motion Representations
CrowdNav with Social-NCE This is an official implementation for the paper Social NCE: Contrastive Learning of Socially-aware Motion Representations by
A3C LSTM Atari with Pytorch plus A3G design
NEWLY ADDED A3G A NEW GPU/CPU ARCHITECTURE OF A3C FOR SUBSTANTIALLY ACCELERATED TRAINING!! RL A3C Pytorch NEWLY ADDED A3G!! New implementation of A3C
PyTorch implementation of Neural Combinatorial Optimization with Reinforcement Learning.
neural-combinatorial-rl-pytorch PyTorch implementation of Neural Combinatorial Optimization with Reinforcement Learning. I have implemented the basic
"Reinforcement Learning for Bandit Neural Machine Translation with Simulated Human Feedback"
This is code repo for our EMNLP 2017 paper "Reinforcement Learning for Bandit Neural Machine Translation with Simulated Human Feedback", which implements the A2C algorithm on top of a neural encoder-decoder model and benchmarks the combination under simulated noisy rewards.
PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR) and Generative Adversarial Imitation Learning (GAIL).
pytorch-a2c-ppo-acktr Update (April 12th, 2021) PPO is great, but Soft Actor Critic can be better for many continuous control tasks. Please check out
A working implementation of the Categorical DQN (Distributional RL).
Categorical DQN. Implementation of the Categorical DQN as described in A distributional Perspective on Reinforcement Learning. Thanks to @tudor-berari
Customizable RecSys Simulator for OpenAI Gym
gym-recsys: Customizable RecSys Simulator for OpenAI Gym Installation | How to use | Examples | Citation This package describes an OpenAI Gym interfac
MINERVA: An out-of-the-box GUI tool for offline deep reinforcement learning
MINERVA is an out-of-the-box GUI tool for offline deep reinforcement learning, designed for everyone including non-programmers to do reinforcement learning as a tool.
This is the official code of our paper "Diversity-based Trajectory and Goal Selection with Hindsight Experience Relay" (PRICAI 2021)
Diversity-based Trajectory and Goal Selection with Hindsight Experience Replay This is the official implementation of our paper "Diversity-based Traje
DockStream: A Docking Wrapper to Enhance De Novo Molecular Design
DockStream Description DockStream is a docking wrapper providing access to a collection of ligand embedders and docking backends. Docking execution an
Orthrus is a macOS agent that uses Apple's MDM to backdoor a device using a malicious profile.
Orthrus is a macOS agent that uses Apple's MDM to backdoor a device using a malicious profile. It effectively runs its own MDM server and allows the operator to interface with it using Mythic.
A TensorFlow implementation of SOFA, the Simulator for OFfline LeArning and evaluation.
SOFA This repository is the implementation of SOFA, the Simulator for OFfline leArning and evaluation. Keeping Dataset Biases out of the Simulation: A
[CVPR'21] Learning to Recommend Frame for Interactive Video Object Segmentation in the Wild
IVOS-W Paper Learning to Recommend Frame for Interactive Video Object Segmentation in the Wild Zhaoyun Yin, Jia Zheng, Weixin Luo, Shenhan Qian, Hanli
Implementation of EMNLP 2017 Paper "Natural Language Does Not Emerge 'Naturally' in Multi-Agent Dialog" using PyTorch and ParlAI
Language Emergence in Multi Agent Dialog Code for the Paper Natural Language Does Not Emerge 'Naturally' in Multi-Agent Dialog Satwik Kottur, José M.
Rainbow: Combining Improvements in Deep Reinforcement Learning
Rainbow Rainbow: Combining Improvements in Deep Reinforcement Learning [1]. Results and pretrained models can be found in the releases. DQN [2] Double
Pytorch implementations of popular off-policy multi-agent reinforcement learning algorithms, including QMix, VDN, MADDPG, and MATD3.
Off-Policy Multi-Agent Reinforcement Learning (MARL) Algorithms This repository contains implementations of various off-policy multi-agent reinforceme
Train an RL agent to execute natural language instructions in a 3D Environment (PyTorch)
Gated-Attention Architectures for Task-Oriented Language Grounding This is a PyTorch implementation of the AAAI-18 paper: Gated-Attention Architecture
PyTorch implementation of Trust Region Policy Optimization
PyTorch implementation of TRPO Try my implementation of PPO (aka newer better variant of TRPO), unless you need to you TRPO for some specific reasons.
YARR is Yet Another Robotics and Reinforcement learning framework for PyTorch.
Yet Another Robotics and Reinforcement (YARR) learning framework for PyTorch.
Yet Another Robotics and Reinforcement (YARR) learning framework for PyTorch.
Yet Another Robotics and Reinforcement (YARR) learning framework for PyTorch.
Code for Emergent Translation in Multi-Agent Communication
Emergent Translation in Multi-Agent Communication PyTorch implementation of the models described in the paper Emergent Translation in Multi-Agent Comm
Advanced Deep Learning with TensorFlow 2 and Keras (Updated for 2nd Edition)
Advanced Deep Learning with TensorFlow 2 and Keras (Updated for 2nd Edition)
A very short and easy implementation of Quantile Regression DQN
Quantile Regression DQN Quantile Regression DQN a Minimal Working Example, Distributional Reinforcement Learning with Quantile Regression (https://arx
Learning to Communicate with Deep Multi-Agent Reinforcement Learning in PyTorch
Learning to Communicate with Deep Multi-Agent Reinforcement Learning This is a PyTorch implementation of the original Lua code release. Overview This
PyTorch implementations of deep reinforcement learning algorithms and environments
Deep Reinforcement Learning Algorithms with PyTorch This repository contains PyTorch implementations of deep reinforcement learning algorithms and env
A PyTorch implementation of "Graph Classification Using Structural Attention" (KDD 2018).
GAM ⠀⠀ A PyTorch implementation of Graph Classification Using Structural Attention (KDD 2018). Abstract Graph classification is a problem with practic
DrQ-v2: Improved Data-Augmented Reinforcement Learning
DrQ-v2: Improved Data-Augmented RL Agent Method DrQ-v2 is a model-free off-policy algorithm for image-based continuous control. DrQ-v2 builds on DrQ,
Medusa is a cross-platform agent compatible with both Python 3.8 and Python 2.7.
Medusa Medusa is a cross-platform agent compatible with both Python 3.8 and Python 2.7. Installation To install Medusa, you'll need Mythic installed o
Pytorch code for ICRA'21 paper: "Hierarchical Cross-Modal Agent for Robotics Vision-and-Language Navigation"
Hierarchical Cross-Modal Agent for Robotics Vision-and-Language Navigation This repository is the pytorch implementation of our paper: Hierarchical Cr
Brax is a differentiable physics engine that simulates environments made up of rigid bodies, joints, and actuators
Brax is a differentiable physics engine that simulates environments made up of rigid bodies, joints, and actuators. It's also a suite of learning algorithms to train agents to operate in these environments (PPO, SAC, evolutionary strategy, and direct trajectory optimization are implemented).
Artificial Intelligence search algorithm base on Pacman
Pacman Search Artificial Intelligence search algorithm base on Pacman Source The Pacman Projects by the University of California, Berkeley. Layouts Di
Implement A3C for Mujoco gym envs
pytorch-a3c-mujoco Disclaimer: my implementation right now is unstable (you ca refer to the learning curve below), I'm not sure if it's my problems. A
Malmo Collaborative AI Challenge - Team Pig Catcher
The Malmo Collaborative AI Challenge - Team Pig Catcher Approach The challenge involves 2 agents who can either cooperate or defect. The optimal polic
An implementation of the AlphaZero algorithm for Gomoku (also called Gobang or Five in a Row)
AlphaZero-Gomoku This is an implementation of the AlphaZero algorithm for playing the simple board game Gomoku (also called Gobang or Five in a Row) f
Motion Planner Augmented Reinforcement Learning for Robot Manipulation in Obstructed Environments (CoRL 2020)
Motion Planner Augmented Reinforcement Learning for Robot Manipulation in Obstructed Environments [Project website] [Paper] This project is a PyTorch
A library of multi-agent reinforcement learning components and systems
Mava: a research framework for distributed multi-agent reinforcement learning Table of Contents Overview Getting Started Supported Environments System
DI-HPC is an acceleration operator component for general algorithm modules in reinforcement learning algorithms
DI-HPC: Decision Intelligence - High Performance Computation DI-HPC is an acceleration operator component for general algorithm modules in reinforceme
OpenDILab RL Kubernetes Custom Resource and Operator Lib
DI Orchestrator DI Orchestrator is designed to manage DI (Decision Intelligence) jobs using Kubernetes Custom Resource and Operator. Prerequisites A w
ElegantRL is featured with lightweight, efficient and stable, for researchers and practitioners.
Lightweight, efficient and stable implementations of deep reinforcement learning algorithms using PyTorch. 🔥
gym-anm is a framework for designing reinforcement learning (RL) environments that model Active Network Management (ANM) tasks in electricity distribution networks.
gym-anm is a framework for designing reinforcement learning (RL) environments that model Active Network Management (ANM) tasks in electricity distribution networks. It is built on top of the OpenAI Gym toolkit.
Weighing Counts: Sequential Crowd Counting by Reinforcement Learning
LibraNet This repository includes the official implementation of LibraNet for crowd counting, presented in our paper: Weighing Counts: Sequential Crow
Official repository for the paper "Going Beyond Linear Transformers with Recurrent Fast Weight Programmers"
Recurrent Fast Weight Programmers This is the official repository containing the code we used to produce the experimental results reported in the pape
Brax is a differentiable physics engine that simulates environments made up of rigid bodies, joints, and actuators
Brax is a differentiable physics engine that simulates environments made up of rigid bodies, joints, and actuators. It's also a suite of learning algorithms to train agents to operate in these environments (PPO, SAC, evolutionary strategy, and direct trajectory optimization are implemented).
Lyapunov-guided Deep Reinforcement Learning for Stable Online Computation Offloading in Mobile-Edge Computing Networks
PyTorch code to reproduce LyDROO algorithm [1], which is an online computation offloading algorithm to maximize the network data processing capability subject to the long-term data queue stability and average power constraints. It applies Lyapunov optimization to decouple the multi-stage stochastic MINLP into deterministic per-frame MINLP subproblems and solves each subproblem via DROO algorithm. It includes:
BridgeWalk is a partially-observed reinforcement learning environment with dynamics of varying stochasticity.
BridgeWalk is a partially-observed reinforcement learning environment with dynamics of varying stochasticity. The player needs to walk along a bridge to reach a goal location. When the player walks off the bridge into the water, the current will move it randomly until it gets washed back on the shore. A good agent in this environment avoids this stochastic trap
Pretraining Representations For Data-Efficient Reinforcement Learning
Pretraining Representations For Data-Efficient Reinforcement Learning Max Schwarzer, Nitarshan Rajkumar, Michael Noukhovitch, Ankesh Anand, Laurent Ch
Megaverse is a new 3D simulation platform for reinforcement learning and embodied AI research
Megaverse Megaverse is a new 3D simulation platform for reinforcement learning and embodied AI research. The efficient design of the engine enables ph
Thermal Control of Laser Powder Bed Fusion using Deep Reinforcement Learning
This repository is the implementation of the paper "Thermal Control of Laser Powder Bed Fusion Using Deep Reinforcement Learning", linked here. The project makes use of the Deep Reinforcement Library stable-baselines3 to derive a control policy that maximizes melt pool depth consistency.
DouZero is a reinforcement learning framework for DouDizhu - 斗地主AI
[ICML 2021] DouZero: Mastering DouDizhu with Self-Play Deep Reinforcement Learning | 斗地主AI
CowHerd is a partially-observed reinforcement learning environment
CowHerd is a partially-observed reinforcement learning environment, where the player walks around an area and is rewarded for milking cows. The cows try to escape and the player can place fences to help capture them. The implementation of CowHerd is based on the Crafter environment.
A parallel framework for population-based multi-agent reinforcement learning.
MALib: A parallel framework for population-based multi-agent reinforcement learning MALib is a parallel framework of population-based learning nested
Deep reinforcement learning library built on top of Neural Network Libraries
Deep Reinforcement Learning Library built on top of Neural Network Libraries NNablaRL is a deep reinforcement learning library built on top of Neural
[ICML 2021] DouZero: Mastering DouDizhu with Self-Play Deep Reinforcement Learning | 斗地主AI
[ICML 2021] DouZero: Mastering DouDizhu with Self-Play Deep Reinforcement Learning DouZero is a reinforcement learning framework for DouDizhu (斗地主), t
On the model-based stochastic value gradient for continuous reinforcement learning
On the model-based stochastic value gradient for continuous reinforcement learning This repository is by Brandon Amos, Samuel Stanton, Denis Yarats, a
JAX code for the paper "Control-Oriented Model-Based Reinforcement Learning with Implicit Differentiation"
Optimal Model Design for Reinforcement Learning This repository contains JAX code for the paper Control-Oriented Model-Based Reinforcement Learning wi
banditml is a lightweight contextual bandit & reinforcement learning library designed to be used in production Python services.
banditml is a lightweight contextual bandit & reinforcement learning library designed to be used in production Python services. This library is developed by Bandit ML and ex-authors of Facebook's applied reinforcement learning platform, Reagent.
Multi Agent Path Finding Algorithms
MATP-solver Simulator collision check path step random initial states or given states Traditional method Seperate A* algorithem Confict-based Search S
Codebase for the Summary Loop paper at ACL2020
Summary Loop This repository contains the code for ACL2020 paper: The Summary Loop: Learning to Write Abstractive Summaries Without Examples. Training
PyTorch implementation for ACL 2021 paper "Maria: A Visual Experience Powered Conversational Agent".
Maria: A Visual Experience Powered Conversational Agent This repository is the Pytorch implementation of our paper "Maria: A Visual Experience Powered
Disagreement-Regularized Imitation Learning
Due to a normalization bug the expert trajectories have lower performance than the rl_baseline_zoo reported experts. Please see the following link in
Official codebase for Decision Transformer: Reinforcement Learning via Sequence Modeling.
Decision Transformer Lili Chen*, Kevin Lu*, Aravind Rajeswaran, Kimin Lee, Aditya Grover, Michael Laskin, Pieter Abbeel, Aravind Srinivas†, and Igor M
🔬 A curated list of awesome machine learning strategies & tools in financial market.
🔬 A curated list of awesome machine learning strategies & tools in financial market.
Narya API allows you track soccer player from camera inputs, and evaluate them with an Expected Discounted Goal (EDG) Agent
Narya The Narya API allows you track soccer player from camera inputs, and evaluate them with an Expected Discounted Goal (EDG) Agent. This repository
Pytorch code for "State-only Imitation with Transition Dynamics Mismatch" (ICLR 2020)
This repo contains code for our paper State-only Imitation with Transition Dynamics Mismatch published at ICLR 2020. The code heavily uses the RL mach
AndroidEnv is a Python library that exposes an Android device as a Reinforcement Learning (RL) environment.
AndroidEnv is a Python library that exposes an Android device as a Reinforcement Learning (RL) environment.
AutoPentest-DRL: Automated Penetration Testing Using Deep Reinforcement Learning
AutoPentest-DRL: Automated Penetration Testing Using Deep Reinforcement Learning AutoPentest-DRL is an automated penetration testing framework based o
House-GAN++: Generative Adversarial Layout Refinement Network towards Intelligent Computational Agent for Professional Architects
House-GAN++ Code and instructions for our paper: House-GAN++: Generative Adversarial Layout Refinement Network towards Intelligent Computational Agent
Pytorch code for ICRA'21 paper: "Hierarchical Cross-Modal Agent for Robotics Vision-and-Language Navigation"
Hierarchical Cross-Modal Agent for Robotics Vision-and-Language Navigation This repository is the pytorch implementation of our paper: Hierarchical Cr
Official source code to CVPR'20 paper, "When2com: Multi-Agent Perception via Communication Graph Grouping"
When2com: Multi-Agent Perception via Communication Graph Grouping This is the PyTorch implementation of our paper: When2com: Multi-Agent Perception vi
Combining Reinforcement Learning and Constraint Programming for Combinatorial Optimization
Hybrid solving process for combinatorial optimization problems Combinatorial optimization has found applications in numerous fields, from aerospace to
Paddle-RLBooks is a reinforcement learning code study guide based on pure PaddlePaddle.
Paddle-RLBooks Welcome to Paddle-RLBooks which is a reinforcement learning code study guide based on pure PaddlePaddle. 欢迎来到Paddle-RLBooks,该仓库主要是针对强化学
CURL: Contrastive Unsupervised Representations for Reinforcement Learning
CURL Rainbow Status: Archive (code is provided as-is, no updates expected) This is an implementation of CURL: Contrastive Unsupervised Representations
Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques
Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques such as online, hashing, allreduce, reductions, learning2search, active, and interactive learning.
RoboDesk A Multi-Task Reinforcement Learning Benchmark
RoboDesk A Multi-Task Reinforcement Learning Benchmark If you find this open source release useful, please reference in your paper: @misc{kannan2021ro
A general-purpose multi-agent training framework.
MALib A general-purpose multi-agent training framework. Installation step1: build environment conda create -n malib python==3.7 -y conda activate mali
A collection of various RL algorithms like policy gradients, DQN and PPO. The goal of this repo will be to make it a go-to resource for learning about RL. How to visualize, debug and solve RL problems. I've additionally included playground.py for learning more about OpenAI gym, etc.
Reinforcement Learning (PyTorch) 🤖 + 🍰 = ❤️ This repo will contain PyTorch implementation of various fundamental RL algorithms. It's aimed at making
Ray provides a simple, universal API for building distributed applications.
An open source framework that provides a simple, universal API for building distributed applications. Ray is packaged with RLlib, a scalable reinforcement learning library, and Tune, a scalable hyperparameter tuning library.