# Decision Transformer

Clean and readable code for Decision Transformer: Reinforcement Learning via Sequence Modeling.
Notable differences from the official implementation are:
- Simple GPT implementation (causal transformer)
- Uses PyTorch's `Dataset` and `DataLoader` classes, and removes redundant computation of returns-to-go and state normalization for efficient training
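The "causal" part of the GPT above means each token may attend only to itself and earlier tokens. A minimal single-head sketch of that masking step in numpy (names are illustrative, not taken from this repo):

```python
import numpy as np

def causal_attention(q, k, v):
    """Single-head attention where position t can only attend to positions <= t."""
    T, d = q.shape
    scores = q @ k.T / np.sqrt(d)
    # Upper-triangular mask (above the diagonal) blocks attention to future positions
    mask = np.triu(np.ones((T, T), dtype=bool), k=1)
    scores = np.where(mask, -np.inf, scores)
    # Numerically stable softmax; exp(-inf) = 0, so masked positions get zero weight
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
out, w = causal_attention(x, x, x)
print(np.triu(w, k=1))  # all zeros: no attention to future tokens
```

In the full model this mask is applied identically in every attention head and layer.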
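The efficiency point above comes from computing returns-to-go and state statistics once per dataset instead of on every sampled batch. A sketch of both precomputations (function names are illustrative, not from this repo):

```python
import numpy as np

def returns_to_go(rewards: np.ndarray) -> np.ndarray:
    """R_t = sum of rewards from step t to the end of the trajectory."""
    # Reverse cumulative sum: rtg[t] = rewards[t] + rtg[t + 1]
    return np.cumsum(rewards[::-1])[::-1]

def normalize_states(states: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    """Standardize states with mean/std computed once over the whole dataset."""
    mean = states.mean(axis=0)
    std = states.std(axis=0) + eps
    return (states - mean) / std

rewards = np.array([1.0, 2.0, 3.0])
print(returns_to_go(rewards))  # [6. 5. 3.]
```

A `Dataset.__getitem__` can then just slice these precomputed arrays, keeping the training loop free of per-batch recomputation.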
## Instructions
## Results
Dataset | Environment | DT (this repo) | DT (official) |
---|---|---|---|
Medium | HalfCheetah | 42.18 ± 0.77 | 42.6 ± 0.1 |
Note that these results are the mean and variance over 3 random seeds, obtained after only 20k updates, while the official models are trained to saturation for 100k updates.
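The table reports D4RL-style normalized scores, which rescale raw episode returns so that a random policy scores 0 and an expert policy scores 100. A sketch of that normalization (the reference returns below are placeholders for illustration, not the actual D4RL constants):

```python
def d4rl_normalized_score(raw_return: float,
                          random_return: float,
                          expert_return: float) -> float:
    """Rescale a raw return so random policy -> 0 and expert policy -> 100."""
    return 100.0 * (raw_return - random_return) / (expert_return - random_return)

# Hypothetical reference returns, for illustration only
print(d4rl_normalized_score(raw_return=50.0,
                            random_return=0.0,
                            expert_return=100.0))  # 50.0
```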
## References
- official code and paper
- minimal GPT (causal transformer) tweet and colab notebook