Invariant Language Modeling
Implementation of the training procedure for invariant language models.
Motivation
Modern pretrained language models are critical components of NLP pipelines. Yet, they suffer from spurious correlations, poor out-of-domain generalization, and biases. Inspired by recent progress in causal machine learning, we propose invariant language modeling, a framework to learn invariant representations that should generalize across training environments. In particular, we adapt IRM-games to language models, where the invariance emerges from a specific training schedule in which environments compete to optimize their environment-specific loss by updating subsets of the model in a round-robin fashion.
Model Description
The data is assumed to come from n distinct environments, and we aim to learn a language model that focuses on correlations that generalize across environments.
The model is decomposed into two components:
- ϕ: the main body of the transformer language model,
- w: the language modeling head that predicts the missing token.
In our implementation, there are now as many heads as environments: n in total. For each data point, all heads make their predictions, and these predictions are averaged. During training, we sample one batch from each environment in a round-robin fashion: when seeing a batch from environment e, only the head w_e and the main body ϕ receive a batch update.
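The sketch below illustrates this decomposition. It is a simplified illustration, not the code from this repository, and it assumes a generic encoder module that returns hidden states of shape `(batch, sequence, hidden)`.

```python
# Minimal sketch of the decomposition described above (an illustration, not the
# repository code). It assumes a generic encoder module `body` that returns hidden
# states of shape (batch, sequence, hidden).
from typing import Optional

import torch
import torch.nn as nn


class InvariantMaskedLM(nn.Module):
    def __init__(self, body: nn.Module, hidden_size: int, vocab_size: int, n_envs: int):
        super().__init__()
        self.body = body  # phi: the shared transformer body
        # w_1, ..., w_n: one language modeling head per environment
        self.heads = nn.ModuleList(
            [nn.Linear(hidden_size, vocab_size) for _ in range(n_envs)]
        )

    def forward(self, input_ids: torch.Tensor, attention_mask: Optional[torch.Tensor] = None):
        hidden = self.body(input_ids, attention_mask=attention_mask)  # token representations from phi
        logits = torch.stack([head(hidden) for head in self.heads])   # each head predicts the missing token
        return logits.mean(dim=0)                                     # predictions are averaged over heads
```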
Usage
To get started with the code:
pip install -r requirements.txt
PyTorch with a CUDA installation is required to run this framework. Please find all useful installation information here.
Then, to continue training a language model from a huggingface checkpoint:
python3 run_invariant_mlm.py \
--model_name_or_path roberta-base \
--validation_file data-folder/validation_file.txt \
--do_train \
--do_eval \
--nb_steps 5000 \
--learning_rate 1e-5 \
--output_dir folder-to-save-model \
--seed 123 \
--train_file data-folder/training-environments \
--overwrite_cache
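Note that `--train_file` is expected to point to a folder of training environments. The layout below is only an assumed example; the file names are hypothetical:

```
data-folder/
├── training-environments/
│   ├── environment_1.txt
│   ├── environment_2.txt
│   └── ...
└── validation_file.txt
```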
Currently, the supported base models are:
- `roberta`: checkpoints
- `distilbert`: checkpoints
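As a quick sanity check (an illustration, assuming the `transformers` library from the requirements is installed), the corresponding base checkpoints can be loaded directly:

```python
# Verify that a supported base checkpoint loads as a masked language model.
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForMaskedLM.from_pretrained("roberta-base")  # or "distilbert-base-uncased"
print(model.config.model_type)  # "roberta"
```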
Implementation
To train language models according to IRM-games, one needs to modify:
- the training schedule, so that batch updates are performed for each environment in a round-robin fashion. This logic is implemented by the `InvariantTrainer` in `invariant_trainer.py`, a class inherited from the `Trainer` of huggingface (a simplified sketch of this schedule is given after this list);
- the language modeling heads of the model, since one head per environment is needed. This is done by creating variations of the base model classes, implemented in `invariant_roberta.py` for `roberta` and in `invariant_distilbert.py` for `distilbert`.
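The snippet below is a simplified sketch of that round-robin schedule, not the `InvariantTrainer` code itself. It reuses the multi-head model sketch from above and assumes one AdamW optimizer per environment as well as huggingface-style batches with `input_ids`, `attention_mask`, and `labels`.

```python
# Simplified sketch of the round-robin schedule (not the InvariantTrainer code itself).
import itertools

import torch
import torch.nn.functional as F


def train_round_robin(model, env_loaders, nb_steps, lr=1e-5):
    # One optimizer per environment: it owns the shared body phi and that environment's head w_e.
    optimizers = [
        torch.optim.AdamW(list(model.body.parameters()) + list(head.parameters()), lr=lr)
        for head in model.heads
    ]
    iterators = [itertools.cycle(loader) for loader in env_loaders]

    for step in range(nb_steps):
        e = step % len(env_loaders)          # environments take turns (round-robin)
        batch = next(iterators[e])
        logits = model(batch["input_ids"], attention_mask=batch["attention_mask"])
        loss = F.cross_entropy(
            logits.view(-1, logits.size(-1)),
            batch["labels"].view(-1),
            ignore_index=-100,               # huggingface convention for non-masked tokens
        )
        model.zero_grad()
        loss.backward()                      # gradients reach all parameters...
        optimizers[e].step()                 # ...but only phi and w_e are updated
```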
Contact
Maxime Peyrard, [email protected]