Official codebase for Pretrained Transformers as Universal Computation Engines.

Overview

universal-computation

Overview

Official codebase for Pretrained Transformers as Universal Computation Engines. Contains demo notebook and scripts to reproduce experiments.

Project Demo

For a minimal demonstration of frozen pretrained transformers, see demo.ipynb. You can run the notebook which reproduces the Bit XOR experiment in a couple minutes, and visualizes the learned attention maps.

Status

Project is released but will receive updates soon.

Currently the repo supports the following tasks:

['bit-memory', 'bit-xor', 'listops', 'mnist', 'cifar10', 'cifar10-gray']

Note that CIFAR-10 LRA is cifar10-gray with a patch size of 1.

Usage

Installation

  1. Install Anaconda environment:

    $ conda env create -f environment.yml
    
  2. Add universal-computation/ to your PYTHONPATH, i.e. add this line to your ~/.bashrc:

    export PYTHONPATH=~/universal-computation:$PYTHONPATH
    

Downloading datasets

Datasets are stored in data/. MNIST and CIFAR-10 are automatically downloaded by Pytorch upon starting experiment.

Listops

Download the files for Listops from Long Range Arena. Move the .tsv files into data/listops. There should be three files: basic_test, basic_train, basic_val. The script evaluates on the validation set by default.

Remote homology

Support coming soon.

Running experiments

You can run experiments with:

python scripts/run.py

Adding -w True will log results to Weights and Biases.

Citation

@article{lu2021fpt,
  title={Pretrained Transformers as Universal Computation Engines},
  author={Kevin Lu and Aditya Grover and Pieter Abbeel and Igor Mordatch},
  journal={arXiv preprint arXiv:2103.05247},
  year={2021}
}
You might also like...
Official codebase for "B-Pref: Benchmarking Preference-BasedReinforcement Learning" contains scripts to reproduce experiments.

B-Pref Official codebase for B-Pref: Benchmarking Preference-BasedReinforcement Learning contains scripts to reproduce experiments. Install conda env

Official codebase for running the small, filtered-data GLIDE model from GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models.

GLIDE This is the official codebase for running the small, filtered-data GLIDE model from GLIDE: Towards Photorealistic Image Generation and Editing w

Official codebase for ICLR oral paper Unsupervised Vision-Language Grammar Induction with Shared Structure Modeling

CLIORA This is the official codebase for ICLR oral paper: Unsupervised Vision-Language Grammar Induction with Shared Structure Modeling. We introduce

[CVPR22] Official codebase of Semantic Segmentation by Early Region Proxy.
[CVPR22] Official codebase of Semantic Segmentation by Early Region Proxy.

RegionProxy Figure 2. Performance vs. GFLOPs on ADE20K val split. Semantic Segmentation by Early Region Proxy Yifan Zhang, Bo Pang, Cewu Lu CVPR 2022

Official codebase used to develop Vision Transformer, MLP-Mixer, LiT and more.

Big Vision This codebase is designed for training large-scale vision models on Cloud TPU VMs. It is based on Jax/Flax libraries, and uses tf.data and

CVPR 2021 Official Pytorch Code for UC2: Universal Cross-lingual Cross-modal Vision-and-Language Pre-training

UC2 UC2: Universal Cross-lingual Cross-modal Vision-and-Language Pre-training Mingyang Zhou, Luowei Zhou, Shuohang Wang, Yu Cheng, Linjie Li, Zhou Yu,

Official Implementation of Domain-Aware Universal Style Transfer
Official Implementation of Domain-Aware Universal Style Transfer

Domain Aware Universal Style Transfer Official Pytorch Implementation of 'Domain Aware Universal Style Transfer' (ICCV 2021) Domain Aware Universal St

aka
aka "Bayesian Methods for Hackers": An introduction to Bayesian methods + probabilistic programming with a computation/understanding-first, mathematics-second point of view. All in pure Python ;)

Bayesian Methods for Hackers Using Python and PyMC The Bayesian method is the natural approach to inference, yet it is hidden from readers behind chap

A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.
A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.

Website | Documentation | Tutorials | Installation | Release Notes CatBoost is a machine learning method based on gradient boosting over decision tree

Comments
  • Not converging in xor calculation task

    Not converging in xor calculation task

    when i run demo.py notebook provided in this repository, calculating xor task converges. but when i create and run a custom notebook, it doesnot converge and it is almost 2x slower

    opened by saahiluppal 1
  • Vision pretraining

    Vision pretraining

    Thanks for sharing your code! I was wondering if you could share the settings for reproducing the results with ViT initialization mentioned in section 3.2

    opened by vmurahari3 1
  •  Remote Homology Dataset size

    Remote Homology Dataset size

    In the datasets/remote_homology.py notebook, it says that there are 236224 examples (out of 242560 -- 97.39%). The dataset size according to the paper is 12312. I am a bit confused about the number you have "236224", where it comes from.

    opened by BalloutAI 0
  • ListOps brackets are not tokenized

    ListOps brackets are not tokenized

    Hi,

    A ListOps input of "[MAX 4 3 [ MIN 2 3 ] 1 0 ])" will get encoded as "MAX 4 3 MIN 2 3 1 0" so all brackets are removed, which makes the task unsolvable. This is also described here https://github.com/google-research/long-range-arena/issues/20

    How I got aware of this: In the paper, page 3 under ListOps you write "models are fed 512 tokens of dimension 15". However there are 4 operations, 2 brackets and 10 numbers which would require dimension 16. Checking the dataset code, there is one unused UNK token, 10 numbers, 4 operations which equals to a vocabulary length of 15.

    Your code reproduces the ~38% accuracy of ListOps described in the paper correctly.

    Best

    opened by gingsi 0
Owner
Kevin Lu
Researcher interested in artificial intelligence. Undergrad at UC Berkeley.
Kevin Lu
Code for "MetaMorph: Learning Universal Controllers with Transformers", Gupta et al, ICLR 2022

MetaMorph: Learning Universal Controllers with Transformers This is the code for the paper MetaMorph: Learning Universal Controllers with Transformers

Agrim Gupta 50 Jan 3, 2023
Unified Interface for Constructing and Managing Workflows on different workflow engines, such as Argo Workflows, Tekton Pipelines, and Apache Airflow.

Couler What is Couler? Couler aims to provide a unified interface for constructing and managing workflows on different workflow engines, such as Argo

Couler Project 781 Jan 3, 2023
This repo contains the code and data used in the paper "Wizard of Search Engine: Access to Information Through Conversations with Search Engines"

Wizard of Search Engine: Access to Information Through Conversations with Search Engines by Pengjie Ren, Zhongkun Liu, Xiaomeng Song, Hongtao Tian, Zh

null 19 Oct 27, 2022
QueryFuzz implements a metamorphic testing approach to test Datalog engines.

Datalog is a popular query language with applications in several domains. Like any complex piece of software, Datalog engines may contain bugs. The mo

null 34 Sep 10, 2022
Fuzzing JavaScript Engines with Aspect-preserving Mutation

DIE Repository for "Fuzzing JavaScript Engines with Aspect-preserving Mutation" (in S&P'20). You can check the paper for technical details. Environmen

gts3.org (SSLab@Gatech) 190 Dec 11, 2022
Code for EMNLP 2021 paper Contrastive Out-of-Distribution Detection for Pretrained Transformers.

Contra-OOD Code for EMNLP 2021 paper Contrastive Out-of-Distribution Detection for Pretrained Transformers. Requirements PyTorch Transformers datasets

Wenxuan Zhou 27 Oct 28, 2022
Code and data to accompany the camera-ready version of "Cross-Attention is All You Need: Adapting Pretrained Transformers for Machine Translation" in EMNLP 2021

Code and data to accompany the camera-ready version of "Cross-Attention is All You Need: Adapting Pretrained Transformers for Machine Translation" in EMNLP 2021

Mozhdeh Gheini 16 Jul 16, 2022
Official codebase for Decision Transformer: Reinforcement Learning via Sequence Modeling.

Decision Transformer Lili Chen*, Kevin Lu*, Aravind Rajeswaran, Kimin Lee, Aditya Grover, Michael Laskin, Pieter Abbeel, Aravind Srinivas†, and Igor M

Kevin Lu 1.4k Jan 7, 2023
Official codebase for Legged Robots that Keep on Learning: Fine-Tuning Locomotion Policies in the Real World

Legged Robots that Keep on Learning Official codebase for Legged Robots that Keep on Learning: Fine-Tuning Locomotion Policies in the Real World, whic

Laura Smith 70 Dec 7, 2022
This codebase is the official implementation of Test-Time Classifier Adjustment Module for Model-Agnostic Domain Generalization (NeurIPS2021, Spotlight)

Test-Time Classifier Adjustment Module for Model-Agnostic Domain Generalization This codebase is the official implementation of Test-Time Classifier A

null 47 Dec 28, 2022