Official codebase for Can Wikipedia Help Offline Reinforcement Learning?

Machel Reid

Last update: Dec 19, 2022

Can Wikipedia Help Offline RL?

Machel Reid, Yutaro Yamada and Shixiang Shane Gu.

Our paper is up on arXiv.

Overview

Official codebase for Can Wikipedia Help Offline Reinforcement Learning? It contains scripts to reproduce the experiments and is based on the Decision Transformer codebase at https://github.com/kzl/decision-transformer.

Instructions

The code directory contains the code for our experiments.

Installation

Experiments require MuJoCo. Follow the instructions in the mujoco-py repo to install. Then, dependencies can be installed with the following command:

conda env create -f conda_env.yml
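
Once the environment is created, it can be useful to confirm that the packages the experiments rely on are visible to Python. The following is a minimal sketch and not part of the repository; it assumes the import names `mujoco_py` (from the mujoco-py repo) and `d4rl` (from the D4RL repo), and it only checks that they can be located, without importing them.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed import names; adjust them if your installation differs.
    required = ["mujoco_py", "d4rl"]
    missing = missing_packages(required)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

A package that is found can still fail at import time (for example, if the MuJoCo binaries are not set up), so this check is only a first pass.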

Downloading datasets

Datasets are stored in the data directory. LM co-training and vision experiments can be found in the lm_cotraining and vision directories, respectively. Install the D4RL repo, following the instructions there. Then run the following script to download the datasets and save them in our format:

python download_d4rl_datasets.py
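
After the download, a quick sanity check of the saved data may be worthwhile. The sketch below is hypothetical: it assumes that, as in the upstream Decision Transformer codebase, each dataset is a pickled list of trajectory dictionaries with a "rewards" entry. This README does not confirm that layout, so treat the key name and the file path in the usage note as assumptions. The helper names `load_trajectories` and `summarize_trajectories` are illustrative only.

```python
import pickle


def load_trajectories(path):
    """Load a pickled list of trajectory dicts (assumed file layout)."""
    with open(path, "rb") as f:
        return pickle.load(f)


def summarize_trajectories(trajectories):
    """Return the trajectory count, total timesteps, and mean return."""
    returns = [float(sum(traj["rewards"])) for traj in trajectories]
    lengths = [len(traj["rewards"]) for traj in trajectories]
    return {
        "num_trajectories": len(trajectories),
        "num_timesteps": sum(lengths),
        "mean_return": sum(returns) / len(returns) if returns else 0.0,
    }


# Toy data standing in for a real dataset, so the example runs on its own.
toy = [
    {"rewards": [1.0, 1.0, 1.0]},
    {"rewards": [0.5, 0.5]},
]
print(summarize_trajectories(toy))
```

For a real file, something like `summarize_trajectories(load_trajectories(path))` would apply, where `path` points at whatever file the download script wrote into the data directory.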

Downloading ChibiT

ChibiT can be downloaded with gdown as follows:

gdown --id $ID #we will add it soon!

Example usage

Experiments can be reproduced with the following:

# --pretrained_lm takes gpt2 or a path to ChibiT
python experiment.py --env hopper --dataset medium --model_type dt --pretrained_lm gpt2 \
--gpt_kmeans --gpt_kmeans-const 0.1
--

The run.sh file has example commands.

Adding -w True will log results to Weights and Biases.
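
To run the same configuration over several environments and datasets, the command can be assembled programmatically. The sketch below is not part of the repository. It uses only the flags shown in this section; the helper name `build_command` is hypothetical, and the values `walker2d` and `medium-expert` are assumed to be accepted by `--env` and `--dataset` based on D4RL naming. It prints the commands rather than executing them.

```python
import itertools
import shlex


def build_command(env, dataset, pretrained_lm="gpt2", log_to_wandb=False):
    """Build the experiment.py argument list for one environment/dataset pair."""
    cmd = [
        "python", "experiment.py",
        "--env", env,
        "--dataset", dataset,
        "--model_type", "dt",
        "--pretrained_lm", pretrained_lm,  # "gpt2" or a path to ChibiT
        "--gpt_kmeans",
        "--gpt_kmeans-const", "0.1",
    ]
    if log_to_wandb:
        # Same effect as appending "-w True" on the command line.
        cmd += ["-w", "True"]
    return cmd


envs = ["hopper", "walker2d"]          # assumed valid --env values
datasets = ["medium", "medium-expert"]  # assumed valid --dataset values

for env, dataset in itertools.product(envs, datasets):
    # Dry run: print each command instead of launching it.
    print(shlex.join(build_command(env, dataset)))
```

To launch the runs instead of printing them, each list can be passed to `subprocess.run(cmd, check=True)` from the directory that contains experiment.py.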

Citation

Please cite our paper as:

@misc{reid2022wikipedia,
      title={Can Wikipedia Help Offline Reinforcement Learning?}, 
      author={Machel Reid and Yutaro Yamada and Shixiang Shane Gu},
      year={2022},
      eprint={2201.12122},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

License

MIT

Comments

Typos in Table 2 for D4RL baselines numbers

Hi, I believe there are some typos in the numbers reported from https://arxiv.org/abs/2004.07219 in Table 2 of your paper. In particular, for hopper-medium-expert and walker2d-medium-expert, the BC numbers don't match those reported in https://arxiv.org/abs/2004.07219, and the CQL numbers have been swapped.

opened by MarcCote 1

Owner

Machel Reid
