Customizable RecSys Simulator for OpenAI Gym

Overview

gym-recsys: Customizable RecSys Simulator for OpenAI Gym

Installation | How to use | Examples | Citation

This package provides an OpenAI Gym interface for building simulation environments for reinforcement learning-based recommender systems (RL-RecSys). The design aims for simple and flexible APIs to support novel research.

Installation

gym-recsys can be installed from PyPI using pip:

pip install gym-recsys

Note that we support Python 3.7+ only.

You can also install it directly from this GitHub repository using pip:

pip install git+https://github.com/zuoxingdong/gym-recsys.git

How to use

To use gym-recsys, you need to define the following components:

user_ids

This describes a list of available user IDs for the simulation. Normally, a user ID is an integer.

An example of three users: user_ids = [0, 1, 2]

Note that the user ID will be taken as an input to user_state_model_callback to generate observations of the user state.

item_category

This describes the category of each available item. The data type should be a list of strings. The indices of the list are assumed to correspond to item IDs.

An example of three items: item_category = ['sci-fi', 'romance', 'sci-fi']

The category information is mainly used for visualization via env.render().

item_popularity

This describes the popularity measure of each available item. The data type should be a list (or 1-dim array) of integers. The indices of the list are assumed to correspond to item IDs.

An example of three items: item_popularity = [5, 3, 1]

The popularity information is used for calculating Expected Popularity Complement (EPC) in the visualization.
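As a rough illustration of how a popularity list can feed a novelty metric, here is a minimal sketch of one simplified EPC formulation: the average of one minus the normalized popularity over the items in a slate. The function name expected_popularity_complement and the normalization by the maximum popularity are assumptions made for this example; the formula used inside gym-recsys may differ.

```python
import numpy as np

item_popularity = np.array([5, 3, 1])


def expected_popularity_complement(slate, popularity):
    """Simplified EPC: mean of (1 - normalized popularity) over the slate.

    Assumption: popularity is normalized by its maximum value.
    """
    normalized = popularity / popularity.max()
    return float(np.mean(1.0 - normalized[np.asarray(slate)]))


# Recommending the two least popular items gives a relatively high EPC.
epc = expected_popularity_complement([1, 2], item_popularity)
print(epc)  # approximately 0.6
```

Under this formulation, a slate containing only the most popular item scores 0, and the score grows as less popular items are recommended.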

hist_seq_len

This is an integer specifying how many of the user's most recently clicked items are encoded as the current state of the user.

An example of the historical sequence with length 3: hist_seq = [-1, 2, 0]. The item ID -1 indicates an empty event. In this case, the user clicked two items in the past, first item ID 2 followed by a second item ID 0.

The internal FIFO queue hist_seq will be taken as an input to both user_state_model_callback and reward_model_callback, which generate observations of the user state and reward values, respectively.
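The FIFO behavior described above can be sketched with a bounded deque. This is only an illustration of the idea, not the package's internal implementation; the names HIST_SEQ_LEN and EMPTY_ITEM are made up for this example.

```python
from collections import deque

HIST_SEQ_LEN = 3  # plays the role of hist_seq_len
EMPTY_ITEM = -1   # item ID marking an empty event

# Start with a history that contains only empty events.
hist_seq = deque([EMPTY_ITEM] * HIST_SEQ_LEN, maxlen=HIST_SEQ_LEN)

# The user clicks item 2, then item 0. Each append drops the oldest entry.
for clicked_item in (2, 0):
    hist_seq.append(clicked_item)

print(list(hist_seq))  # [-1, 2, 0]
```

Because the deque has a fixed maximum length, the history always holds exactly hist_seq_len entries, and older clicks fall out automatically.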

slate_size

This is an integer describing the size of the slate (display list of recommended items).

It induces a combinatorial action space for the RL agent.
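To get a feel for how quickly this action space grows, the sketch below enumerates all slates for five items and a slate size of two. It assumes that a slate contains no repeated items; whether the environment enforces this, and whether item order matters, is not stated here.

```python
from itertools import combinations, permutations

num_items = 5
slate_size = 2

# Slates where display position matters (ordered, no repeated items).
ordered_slates = list(permutations(range(num_items), slate_size))

# Slates where only the set of items matters (unordered, no repeated items).
unordered_slates = list(combinations(range(num_items), slate_size))

print(len(ordered_slates))    # 20
print(len(unordered_slates))  # 10
```

With realistic catalog sizes these counts become far too large to enumerate, which is why slate recommendation is usually treated as a combinatorial problem.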

user_state_model_callback

This is a Python callback function taking user_id and hist_seq as inputs to generate an observation of the current user state.

Note that it is generic. Either pre-defined heuristic computations or pre-trained neural network models using user/item embeddings can be wrapped as a callback function.
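As one possible heuristic, the following sketch builds the user state by concatenating a user embedding with the mean embedding of the clicked items in hist_seq, skipping empty events (item ID -1). The random embeddings and the concatenation scheme are assumptions chosen purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
user_features = rng.standard_normal((1, 10))  # one user, 10-dim embedding
item_features = rng.standard_normal((5, 10))  # five items, 10-dim embeddings


def user_state_model_callback(user_id, hist_seq):
    # Ignore empty events, which are marked with item ID -1.
    clicked = [item for item in hist_seq if item != -1]
    if clicked:
        hist_embedding = item_features[clicked].mean(axis=0)
    else:
        # No clicks yet: fall back to a zero vector.
        hist_embedding = np.zeros(item_features.shape[1])
    return np.concatenate([user_features[user_id], hist_embedding])


state = user_state_model_callback(0, [-1, 2, 0])
print(state.shape)  # (20,)
```

The same signature could wrap a pre-trained sequence model instead of the simple average used here.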

reward_model_callback

This is a Python callback function taking user_id, hist_seq and action as inputs to generate a reward value for each item in the slate (i.e. the action).

Note that it is generic. Either pre-defined heuristic computations or pre-trained neural network models using user/item embeddings can be wrapped as a callback function.

Examples

To illustrate the simple yet flexible design of gym-recsys, we provide a toy example to construct a simulation environment.

First, let us sample random embeddings for one user and five items:

import numpy as np

user_features = np.random.randn(1, 10)
item_features = np.random.randn(5, 10)

Now let us define the category and popularity score for each item:

item_category = ['sci-fi', 'romance', 'sci-fi', 'action', 'sci-fi']
item_popularity = [5, 3, 1, 2, 3]

Then, we define callback functions for user state and reward values:

def user_state_model_callback(user_id, hist_seq):
    return user_features[user_id]

def reward_model_callback(user_id, hist_seq, action):
    return np.inner(user_features[user_id], item_features[action])

Finally, we are ready to create a simulation environment with OpenAI Gym API:

import gym

env_kws = dict(
    user_ids=[0],
    item_category=item_category,
    item_popularity=item_popularity,
    hist_seq_len=3,
    slate_size=2,
    user_state_model_callback=user_state_model_callback,
    reward_model_callback=reward_model_callback
)
env = gym.make('gym_recsys:RecSys-t50-v0', **env_kws)

Note that we created the environment with a slate size of two items and a history of the three most recent interactions. The horizon is 50 time steps.

Now let us play with this environment.

Evaluating a random agent 100 times gives the following performance:

Agent     Episode Reward    CTR
random    73.54             68.23%
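An evaluation loop of this kind could look like the sketch below. It assumes the classic Gym API in which step() returns a 4-tuple (observation, reward, done, info). To keep the example self-contained, it uses a hypothetical stand-in class ToySlateEnv with placeholder rewards rather than the real environment, so its numbers are unrelated to the table above.

```python
import numpy as np


class ToySlateEnv:
    """Hypothetical stand-in environment with the classic 4-tuple step API."""

    def __init__(self, num_items=5, slate_size=2, horizon=50, seed=0):
        self.num_items = num_items
        self.slate_size = slate_size
        self.horizon = horizon
        self.rng = np.random.default_rng(seed)
        self.t = 0

    def sample_action(self):
        # Random slate of distinct item IDs.
        return self.rng.choice(self.num_items, size=self.slate_size, replace=False)

    def reset(self):
        self.t = 0
        return np.zeros(1)

    def step(self, action):
        self.t += 1
        reward = float(self.rng.random())  # placeholder reward in [0, 1)
        done = self.t >= self.horizon
        return np.zeros(1), reward, done, {}


def evaluate(env, policy, num_episodes=100):
    """Return the mean episode reward of policy over num_episodes episodes."""
    episode_rewards = []
    for _ in range(num_episodes):
        obs, done, total = env.reset(), False, 0.0
        while not done:
            obs, reward, done, _ = env.step(policy(obs))
            total += reward
        episode_rewards.append(total)
    return float(np.mean(episode_rewards))


env = ToySlateEnv()
mean_reward = evaluate(env, policy=lambda obs: env.sample_action())
print(mean_reward)
```

With an environment created through gym.make, the random policy would presumably call env.action_space.sample() in place of the stand-in's sample_action().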

Given the sampled embeddings, suppose items 1 and 3 yield the highest possible reward values. Let us see how a greedy policy performs by always recommending items 1 and 3:

Agent     Episode Reward    CTR
greedy    180.86            97.93%
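One way to find such a greedy slate is to score every item with the same inner product used in the toy reward callback and keep the top-scoring items. The sketch below assumes that each item's reward is independent of the other items in the slate; the seeded embeddings are for illustration only, so the selected items will not necessarily be 1 and 3.

```python
import numpy as np

rng = np.random.default_rng(0)
user_features = rng.standard_normal((1, 10))
item_features = rng.standard_normal((5, 10))
slate_size = 2

# Score each item by its inner product with the user embedding.
scores = item_features @ user_features[0]

# Indices of the highest-scoring items, best first.
greedy_slate = np.argsort(scores)[::-1][:slate_size]
print(greedy_slate)
```

The resulting array can then be passed as the same action at every step, since the toy reward does not depend on the interaction history.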

Last but not least, for the most fun part, let us generate animations of both policies for one episode via gym's Monitor wrapper, shown as GIFs below:

Random Agent

Greedy Agent

Citation

If you use gym-recsys in your work, please cite this repository:

@software{zuo2021recsys,
  author={Zuo, Xingdong},
  title={gym-recsys: Customizable RecSys Simulator for OpenAI Gym},
  url={https://github.com/zuoxingdong/gym-recsys},
  year={2021}
}