# MO-Gym: Multi-Objective Reinforcement Learning Environments
Gym environments for multi-objective reinforcement learning (MORL). The environments follow the standard Gym API, but return vectorized rewards as numpy arrays.

For details on multi-objective MDPs (MOMDPs) and other MORL definitions, see *A practical guide to multi-objective reinforcement learning and planning*.
## Install

```bash
git clone https://github.com/LucasAlegre/mo-gym.git
cd mo-gym
pip install -e .
```
## Usage

```python
import gym
import numpy as np

import mo_gym

env = gym.make('minecart-v0')  # It follows the standard Gym API ...
obs = env.reset()
# ... but vector_reward is a numpy array, with one entry per objective!
next_obs, vector_reward, done, info = env.step(your_agent.act(obs))

# Optionally, you can scalarize the reward function with the LinearReward wrapper
env = mo_gym.LinearReward(env, weight=np.array([0.8, 0.2, 0.2]))
```
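Conceptually, linear scalarization just takes the weighted sum of the vector reward under the given weights. A minimal stand-alone sketch of that operation (the function name `scalarize` is ours for illustration, not part of mo-gym's API, and this is not mo-gym's actual implementation):

```python
def scalarize(vector_reward, weight):
    """Linear scalarization: weighted sum over the objectives.

    Illustration of what a wrapper like LinearReward is expected to do:
    collapse a vector reward into a scalar via a dot product.
    """
    return sum(w * r for w, r in zip(weight, vector_reward))

# e.g. weights [0.8, 0.2, 0.2] applied to a 3-objective reward:
scalarize([1.0, 2.0, 3.0], [0.8, 0.2, 0.2])  # 0.8 + 0.4 + 0.6 = 1.8
```

Any other scalarization (e.g. a nonlinear utility function) can be plugged in the same way, which is why the environments return the raw reward vector by default.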
## Environments

| Env | Obs/Action spaces | Objectives | Description |
|---|---|---|---|
| `deep-sea-treasure-v0` | Discrete / Discrete | `[treasure, time_penalty]` | Agent is a submarine that must collect a treasure while taking into account a time penalty. Treasure values taken from Yang et al. 2019. |
| `resource-gathering-v0` | Discrete / Discrete | `[enemy, gold, gem]` | Agent must collect gold or gems. Enemies have a 10% chance of killing the agent. From Barrett & Narayanan 2008. |
| `four-room-v0` | Discrete / Discrete | `[item1, item2, item3]` | Agent must collect three different types of items in the map and reach the goal. |
| `mo-mountaincar-v0` | Continuous / Discrete | `[time_penalty, reverse_penalty, forward_penalty]` | Classic Mountain Car env, but with extra penalties for the forward and reverse actions. From Vamplew et al. 2011. |
| `mo-reacher-v0` | Continuous / Discrete | `[target_1, target_2, target_3, target_4]` | Reacher robot from PyBullet, but with four different target positions. |
| `minecart-v0` | Continuous or Image / Discrete | `[ore1, ore2, fuel]` | Agent must collect two types of ore and minimize fuel consumption. From Abels et al. 2019. |
| `mo-supermario-v0` | Image / Discrete | `[x_pos, time, death, coin, enemy]` | Multi-objective version of SuperMarioBrosEnv. Objectives are defined similarly to Yang et al. 2019. |
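Because every environment above returns a reward vector at each step, episode returns are accumulated component-wise rather than as a single scalar. A minimal sketch of a discounted vector return, assuming numpy and a made-up two-step episode of 3-objective rewards:

```python
import numpy as np

def vector_return(rewards, gamma=0.99):
    """Component-wise discounted return over an episode of vector rewards."""
    ret = np.zeros(len(rewards[0]), dtype=float)
    for t, r in enumerate(rewards):
        ret += (gamma ** t) * np.asarray(r, dtype=float)
    return ret

# Hypothetical episode: two steps of a 3-objective reward vector.
vector_return([[1.0, 0.0, -1.0], [0.0, 2.0, -1.0]])
# -> approximately [1.0, 1.98, -1.99]
```

Multi-policy MORL algorithms typically compare such vector returns under different weightings or Pareto dominance, rather than collapsing them to a scalar up front.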
## Citing
If you use this repository in your work, please cite:
```bibtex
@misc{mo-gym,
  author = {Lucas N. Alegre},
  title = {MO-Gym: Multi-Objective Reinforcement Learning Environments},
  year = {2022},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/LucasAlegre/mo-gym}},
}
```
## Acknowledgments

- The `minecart-v0` env is a refactor of https://github.com/axelabels/DynMORL.
- The `deep-sea-treasure-v0` and `mo-supermario-v0` envs are based on https://github.com/RunzheYang/MORL.
- The `four-room-v0` env is based on https://github.com/mike-gimelfarb/deep-successor-features-for-transfer.