Recurrent Fast Weight Programmers
This is the official repository containing the code we used to produce the experimental results reported in the paper:
Going Beyond Linear Transformers with Recurrent Fast Weight Programmers
Contents
- `algorithmic`: directory for code execution and ListOps
- `language_modeling`: directory for language modeling
- `reinforcement_learning`: directory for RL
Separate license files can be found under each directory.
General instructions
Please refer to the readme file in each directory for further instructions.
In all tasks, our custom CUDA kernels are compiled automatically. To avoid recompiling them multiple times, we recommend specifying a directory in which to store the compiled code via:
export TORCH_EXTENSIONS_DIR="/home/me/torch_extensions/lm"
Such a line is already included in the example scripts we provide. Please change the path to a safe directory of your choice.
Important: use a separate path for each task (here, one for language modeling, one for code execution, one for ListOps, and one for RL), as illustrated below.
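For example, one could set one path per task before launching the corresponding script (the directory names below are placeholders; any writable directories of your choice will do):

export TORCH_EXTENSIONS_DIR="/home/me/torch_extensions/lm"        # language modeling
export TORCH_EXTENSIONS_DIR="/home/me/torch_extensions/code_exec" # code execution
export TORCH_EXTENSIONS_DIR="/home/me/torch_extensions/listops"   # ListOps
export TORCH_EXTENSIONS_DIR="/home/me/torch_extensions/rl"        # RL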
BibTeX
@article{irie2021going,
title={Going Beyond Linear Transformers with Recurrent Fast Weight Programmers},
author={Kazuki Irie and Imanol Schlag and R\'obert Csord\'as and J\"urgen Schmidhuber},
journal={Preprint arXiv:2106.06295},
year={2021}
}
Links
- This is a follow-up to our previous work: Linear Transformers are Secretly Fast Weight Programmers (ICML 2021)
- Jürgen Schmidhuber's AI blog post on Fast Weight Programmers (March 26, 2021).