# xTune

Code for the ACL 2021 paper *Consistency Regularization for Cross-Lingual Fine-Tuning*.
## Environment

- DockerFile: `dancingsoul/pytorch:xTune`
- Install the fine-tuning code: `pip install --user .`
## Data & Model Preparation

### XTREME Datasets
- Create a download folder with `mkdir -p download` in the root of this project.
- Manually download `panx_dataset` (for NER) from [here][2] (note that it will download as `AmazonPhotos.zip`) to the download directory.
- Run the following command to download the remaining datasets: `bash scripts/download_data.sh` (the full sequence is sketched below).
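Putting the steps together, a minimal sequence looks like this (a sketch; it assumes `AmazonPhotos.zip` has already been saved into `./download`):

```bash
# create the download folder in the project root
mkdir -p download
# panx_dataset (AmazonPhotos.zip) must already be placed here before running the script
ls download/AmazonPhotos.zip
# fetch and preprocess the remaining XTREME datasets
bash scripts/download_data.sh
```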
The code for downloading the datasets is adapted from the [XTREME official repo][1].
Note that we keep the labels of the test sets for easier evaluation. To prevent accidental evaluation on the test sets while running experiments, the [XTREME official repo][1] removes the labels of the test data during pre-processing and changes the order of the test sentences for cross-lingual sentence retrieval. If you use the official XTREME repo, replace `csv.writer(fout, delimiter='\t')` with `csv.writer(fout, delimiter='\t', quoting=csv.QUOTE_NONE, quotechar='')` in `utils_process.py`.
### Translations
XTREME provides translations for SQuAD v1.1 (only train and dev), MLQA, PAWS-X, TyDiQA-GoldP, XNLI, and XQuAD, which can be downloaded from [here][3]. The `xtreme_translations` folder should be moved to the download directory.

The target language translations for panx and udpos are not provided by XTREME, so we obtained them with Google Translate. Our processed version can be downloaded from [here][4]. It should be merged with the above `xtreme_translations` folder.
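For example, the two downloads can be combined like this (a minimal sketch; `panx_udpos_translations` is a hypothetical name for the folder extracted from our processed version):

```bash
# move the XTREME-provided translations into the download directory
mv xtreme_translations ./download/
# merge the processed panx/udpos translations into the same folder
cp -r panx_udpos_translations/* ./download/xtreme_translations/
```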
### Bi-lingual dictionaries
We obtain the bi-lingual dictionaries from the [MUSE][6] repo. For convenience, you can download them from [here][7] and move them to the download directory, i.e., `./download/dicts`.
### Models
XLM-RoBERTa is supported. We use the [huggingface][5] format, which can be downloaded with `bash scripts/download_model.sh`.
## Fine-tuning Usage
Our default settings use Nvidia V100-32GB GPU cards. If you run into out-of-memory errors, you can reduce `per_gpu_train_batch_size` while increasing `gradient_accumulation_steps`, or use multi-GPU training.
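When trading the two options off, keep the effective batch size roughly constant. A brief illustration (the option names are the ones mentioned above; see `scripts/train.sh` for how they are actually set):

```bash
# effective batch size = per_gpu_train_batch_size * gradient_accumulation_steps * num_gpus
# e.g. these two single-GPU configurations have the same effective batch size of 32:
#   per_gpu_train_batch_size=32  gradient_accumulation_steps=1
#   per_gpu_train_batch_size=8   gradient_accumulation_steps=4   # fits in less GPU memory
```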
xTune consists of a two-stage training process.

- Stage 1: fine-tuning with example consistency on the English training set.
- Stage 2: fine-tuning with example consistency on the augmented training set, while regularizing model consistency with the model from Stage 1.

It is recommended to use both Stage 1 and Stage 2 for token-level tasks, such as sequence labeling and question answering. For text classification, you can use only Stage 1 if the computation budget is limited.
```bash
bash ./scripts/train.sh [setting] [dataset] [model] [stage] [gpu] [data_dir] [output_dir]
```

where the options are described as follows (a fully spelled-out example follows the list):
- `[setting]`: `translate-train-all` (use the translated training data for languages other than English) or `cross-lingual-transfer` (use only the English training data for zero-shot cross-lingual transfer)
- `[dataset]`: dataset names in XTREME, i.e., `xnli`, `panx`, `pawsx`, `udpos`, `mlqa`, `tydiqa`, `xquad`
- `[model]`: `xlm-roberta-base`, `xlm-roberta-large`
- `[stage]`: `1` (first stage), `2` (second stage)
- `[gpu]`: used to set the environment variable `CUDA_VISIBLE_DEVICES`
- `[data_dir]`: folder of the training data
- `[output_dir]`: folder of the fine-tuning output
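For instance, a complete invocation with every argument filled in could look as follows (the GPU id and directory paths are illustrative placeholders, not required values):

```bash
bash ./scripts/train.sh translate-train-all panx xlm-roberta-base 1 0 ./download ./outputs/panx-stage1
```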
## Examples: XTREME Tasks

### XNLI fine-tuning on the English training set and translated training sets (`translate-train-all`)

```bash
# run stage 1 of xTune
bash ./scripts/train.sh translate-train-all xnli xlm-roberta-base 1
# run stage 2 of xTune (optional)
bash ./scripts/train.sh translate-train-all xnli xlm-roberta-base 2
```

### XNLI fine-tuning on the English training set (`cross-lingual-transfer`)

```bash
# run stage 1 of xTune
bash ./scripts/train.sh cross-lingual-transfer xnli xlm-roberta-base 1
# run stage 2 of xTune (optional)
bash ./scripts/train.sh cross-lingual-transfer xnli xlm-roberta-base 2
```
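The same pattern applies to the other XTREME datasets; for example, for a question-answering task such as `mlqa`, where running both stages is recommended:

```bash
# run stage 1 of xTune
bash ./scripts/train.sh translate-train-all mlqa xlm-roberta-base 1
# run stage 2 of xTune
bash ./scripts/train.sh translate-train-all mlqa xlm-roberta-base 2
```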
## Paper

Please cite our paper `\cite{bo2021xtune}` if you find the resources in this repository useful.
```bibtex
@inproceedings{bo2021xtune,
  author    = {Bo Zheng and Li Dong and Shaohan Huang and Wenhui Wang and Zewen Chi and Saksham Singhal and Wanxiang Che and Ting Liu and Xia Song and Furu Wei},
  booktitle = {Proceedings of ACL 2021},
  title     = {{Consistency Regularization for Cross-Lingual Fine-Tuning}},
  year      = {2021}
}
```
## Reference

[1]: https://github.com/google-research/xtreme
[2]: https://www.amazon.com/clouddrive/share/d3KGCRCIYwhKJF0H3eWA26hjg2ZCRhjpEQtDL70FSBN?_encoding=UTF8&%2AVersion%2A=1&%2Aentries%2A=0&mgh=1
[3]: https://console.cloud.google.com/storage/browser/xtreme_translations
[4]: https://drive.google.com/drive/folders/1Rdbc0Us_4I5MpRCwLASxBwqSW8_dlF87?usp=sharing
[5]: https://github.com/huggingface/transformers/
[6]: https://github.com/facebookresearch/MUSE
[7]: https://drive.google.com/drive/folders/1k9rQinwUXicglA5oyzo9xtgqiuUVDkjT?usp=sharing