EMNLP 2021 paper The Devil is in the Detail: Simple Tricks Improve Systematic Generalization of Transformers.

Csordás Róbert

Last update: Nov 21, 2022

Related tags

Deep Learning machine-learning deep-learning paper transformers pytorch generalization systematic-generalization

Overview

Codebase for training transformers on systematic generalization datasets.

The official repository for our EMNLP 2021 paper The Devil is in the Detail: Simple Tricks Improve Systematic Generalization of Transformers.

Please note that this repository is a cleaned-up version of the internal research repository we use. In case you encounter any problems with it, please don't hesitate to contact me.

Setup

This project requires Python 3 (tested with Python 3.8 and 3.9) and PyTorch 1.8.

pip3 install -r requirements.txt

Create a Weights and Biases account and run

wandb login

More information on setting up Weights and Biases can be found at https://docs.wandb.com/quickstart.

For plotting, LaTeX is required (to avoid Type 3 fonts and to render symbols). Installation is OS specific.

Downloading data

All datasets are downloaded automatically except the Mathematics Dataset and CFQ. Both are hosted on Google Cloud, and you have to log in with your Google account to access them.

Math dataset

Download the .tar.gz file manually from here:

https://console.cloud.google.com/storage/browser/mathematics-dataset?pli=1

Copy it to the cache/dm_math/ folder. If you did everything correctly, you should have a cache/dm_math/mathematics_dataset-v1.0.tar.gz file in the project folder.

CFQ

Download the .tar.gz file manually from here:

https://storage.cloud.google.com/cfq_dataset/cfq1.1.tar.gz

Copy it to the cache/CFQ/ folder. If you did everything correctly, you should have a cache/CFQ/cfq1.1.tar.gz file in the project folder.
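To confirm that both manually downloaded archives ended up in the right place, a small check like the following can help. It is only a sketch and not part of the repository: the helper name missing_archives is made up for illustration, and the two paths are the ones given above, relative to the project root.

```python
from pathlib import Path

# Archive locations as stated in the download instructions,
# relative to the project root.
EXPECTED_ARCHIVES = [
    Path("cache/dm_math/mathematics_dataset-v1.0.tar.gz"),
    Path("cache/CFQ/cfq1.1.tar.gz"),
]


def missing_archives(project_root="."):
    """Return the expected archive paths that do not exist under project_root."""
    root = Path(project_root)
    return [p for p in EXPECTED_ARCHIVES if not (root / p).is_file()]


if __name__ == "__main__":
    missing = missing_archives()
    if missing:
        for path in missing:
            print(f"missing: {path}")
    else:
        print("All manually downloaded archives are in place.")
```

Run it from the project folder; it only checks that the files exist and does not verify their contents.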

Usage

Running the experiments from the paper on a cluster

The code makes use of Weights and Biases for experiment tracking. In the sweeps directory, we provide sweep configurations for all experiments we have performed. The sweeps are officially meant for hyperparameter optimization, but we use them to run multiple configurations and seeds.

To reproduce our results, start a sweep for each of the YAML files in the sweeps directory, then run wandb agent for each of them in the root directory of the project. This will run all the experiments, and they will be displayed on the W&B dashboard. The name of each sweep must match the name of its file in the sweeps directory, without the .yaml ending. More details on how to run W&B sweeps can be found at https://docs.wandb.com/sweeps/quickstart.

For example, if you want to run Math Dataset experiments, run wandb sweep --name dm_math sweeps/dm_math.yaml. This creates the sweep and prints out its ID. Then run wandb agent with that ID.
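Since the sweep name has to match the file name, it can be convenient to generate the wandb sweep commands instead of typing them. The following is a minimal sketch under that assumption; the helper sweep_commands is hypothetical and only prints commands of the form shown above, it does not start anything.

```python
from pathlib import Path


def sweep_commands(sweeps_dir="sweeps"):
    """Build one `wandb sweep` command per YAML file, named after the file stem."""
    commands = []
    for config in sorted(Path(sweeps_dir).glob("*.yaml")):
        # The sweep name must equal the file name without the .yaml ending.
        commands.append(f"wandb sweep --name {config.stem} {config.as_posix()}")
    return commands


if __name__ == "__main__":
    for command in sweep_commands():
        print(command)
```

Each printed command creates one sweep and reports its ID, which is then passed to wandb agent as described above.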

Re-creating plots from the paper

Edit config file paper/config.json. Enter your project name in the field "wandb_project" (e.g. "username/project").
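If you prefer not to edit the file by hand, the field can also be set from Python. This is a sketch, assuming paper/config.json holds a single JSON object; the helper set_wandb_project is hypothetical, and it rewrites the file with its own indentation.

```python
import json
from pathlib import Path


def set_wandb_project(project, config_path="paper/config.json"):
    """Set the "wandb_project" field and keep all other fields as they are."""
    path = Path(config_path)
    config = json.loads(path.read_text())
    config["wandb_project"] = project  # e.g. "username/project"
    path.write_text(json.dumps(config, indent=4) + "\n")
    return config
```

For example, set_wandb_project("username/project") called from the project root updates the field in place.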

Run the scripts in the paper directory. For example:

cd paper
./run_all.sh

The output will be generated in the paper/out/ directory. Tables will be printed to stdout in LaTeX format.

To reproduce individual plots, run the individual Python files in the paper directory.

Running experiments locally

It is possible to run single experiments with Tensorboard without using Weights and Biases. This is intended to be used for debugging the code locally.

If you want to run experiments locally, you can use run.py:

./run.py sweeps/tuple_rnn.yaml

If the sweep in question has multiple parameter choices, run.py will interactively prompt you to pick a value for each of them.

The experiment also starts a Tensorboard instance automatically on port 7000. If the port is already occupied, it will incrementally search for the next free port.
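The incremental port search can be pictured roughly as follows. This is an illustrative sketch, not the code used in the repository: the function name find_free_port and the max_tries and host parameters are assumptions, and only the starting port 7000 comes from the description above.

```python
import socket


def find_free_port(start=7000, max_tries=100, host="127.0.0.1"):
    """Return the first port >= start that can be bound on host."""
    for port in range(start, start + max_tries):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind((host, port))
            except OSError:
                continue  # Port is occupied; try the next one.
            return port
    raise RuntimeError(f"no free port in {start}..{start + max_tries - 1}")


if __name__ == "__main__":
    print(find_free_port())
```

Note that the port is released again when the function returns, so another process could still take it before Tensorboard starts.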

Note that the plotting scripts work only with Weights and Biases.

Reducing memory usage

If some tasks don't fit on your GPU, lower the "-max_length_per_batch" argument. It trades speed for memory by slicing batches and executing them in multiple passes. Reduce it until the model fits.
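The general idea of slicing a batch and processing it in several passes can be sketched in plain Python. Everything here is an assumption made for illustration: the names split_batch, mean_loss_in_passes and max_padded_tokens are hypothetical, and the budget rule (number of sequences times the longest sequence in the slice) is a guess at one reasonable definition, not the exact semantics of the argument in this codebase.

```python
def split_batch(batch, max_padded_tokens):
    """Split a list of sequences into slices whose padded size fits the budget.

    Padded size of a slice = number of sequences * length of its longest
    sequence. A single sequence that exceeds the budget gets its own slice.
    """
    slices, current = [], []
    for seq in batch:
        candidate = current + [seq]
        padded = len(candidate) * max(len(s) for s in candidate)
        if current and padded > max_padded_tokens:
            slices.append(current)
            current = [seq]
        else:
            current = candidate
    if current:
        slices.append(current)
    return slices


def mean_loss_in_passes(batch, loss_fn, max_padded_tokens):
    """Average a per-sequence loss over the whole batch, one slice at a time."""
    total = 0.0
    for piece in split_batch(batch, max_padded_tokens):
        total += sum(loss_fn(seq) for seq in piece)  # one smaller pass
    return total / len(batch)
```

Because the per-slice sums are added up and divided by the full batch size, the result matches a single pass over the whole batch; a smaller budget only means more, smaller passes.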

BibTeX

@inproceedings{csordas2021devil,
      title={The Devil is in the Detail: Simple Tricks Improve Systematic Generalization of Transformers}, 
      author={R\'obert Csord\'as and Kazuki Irie and J\"urgen Schmidhuber},
      booktitle={Proc. Conf. on Empirical Methods in Natural Language Processing (EMNLP)},
      year={2021},
      month={November},
      address={Punta Cana, Dominican Republic}
}

Comments

CVE-2007-4559 Patch

Patching CVE-2007-4559

Hi, we are security researchers from the Advanced Research Center at Trellix. We have begun a campaign to patch a widespread bug named CVE-2007-4559. CVE-2007-4559 is a 15-year-old bug in the Python tarfile package. By using extract() or extractall() on a tarfile object without sanitizing input, a maliciously crafted .tar file could perform a directory path traversal attack. We found at least one unsanitized extractall() in your codebase and are providing a patch for you via pull request. The patch essentially checks to see if all tarfile members will be extracted safely and throws an exception otherwise. We encourage you to use this patch or your own solution to secure against CVE-2007-4559. Further technical information about the vulnerability can be found in this blog.

If you have further questions, you may contact us through this project's lead researcher, Kasimir Schulz.

opened by TrellixVulnTeam 0
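The kind of check described in this comment can be sketched as follows. This is not the patch from the pull request, only a minimal illustration of the idea: before extracting, verify that every member path stays inside the target directory and raise an exception otherwise. The helper names is_within_directory and safe_extractall are chosen for this example.

```python
import os
import tarfile


def is_within_directory(directory, target):
    """Return True if target resolves to a location inside directory."""
    abs_directory = os.path.abspath(directory)
    abs_target = os.path.abspath(target)
    return os.path.commonpath([abs_directory, abs_target]) == abs_directory


def safe_extractall(tar, path="."):
    """Extract all members, refusing archives whose members escape path."""
    for member in tar.getmembers():
        member_path = os.path.join(path, member.name)
        if not is_within_directory(path, member_path):
            raise ValueError(f"unsafe path in tar file: {member.name}")
    tar.extractall(path)
```

A call such as safe_extractall(tarfile.open("cache/CFQ/cfq1.1.tar.gz"), "cache/CFQ") would then replace a bare extractall(). The sketch checks member names only and does not inspect symbolic or hard links; recent Python releases also offer extraction filters in the tarfile module, which are worth considering where available.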

