AdapterBias: Parameter-efficient Token-dependent Representation Shift for Adapters in NLP Tasks
arXiv link: upcoming
To be published in Findings of NAACL 2022
Authors: Chin-Lun Fu*, Zih-Ching Chen*, Yun-Ru Lee, Hung-yi Lee
Overview
In this work, we propose AdapterBias, a surprisingly simple yet effective adapter architecture. AdapterBias adds a token-dependent shift to the hidden output of transformer layers, adapting to downstream tasks with only a vector and a linear layer.
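The architecture is small enough to sketch in a few lines. Below is a minimal, illustrative PyTorch sketch of the idea (class and variable names are assumptions, not the exact code in src/exp.py): a shared vector and a linear layer produce a per-token weight, so each token receives its own scaled copy of the shift vector.

# Minimal, illustrative sketch of the AdapterBias idea.
# Names and details are assumptions; see src/exp.py for the actual implementation.
import torch
import torch.nn as nn

class AdapterBiasSketch(nn.Module):
    def __init__(self, hidden_dim: int):
        super().__init__()
        # A single learnable vector shared by all tokens.
        self.v = nn.Parameter(torch.zeros(hidden_dim))
        # A linear layer that maps each token representation to a scalar weight,
        # making the shift token-dependent.
        self.alpha = nn.Linear(hidden_dim, 1)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # hidden_states: (batch, seq_len, hidden_dim)
        alpha = self.alpha(hidden_states)   # (batch, seq_len, 1)
        shift = alpha * self.v              # broadcasts to (batch, seq_len, hidden_dim)
        return hidden_states + shift        # token-dependent representation shift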
Dataset
We use the GLUE Benchmark as our dataset. You can download all of the datasets from the GLUE website.
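If you just want to inspect a task quickly, the Hugging Face datasets library also hosts GLUE. This is only a convenience sketch; the training script below still expects the downloaded GLUE files on disk under --GLUE_path.

# Optional: peek at a GLUE task via the Hugging Face `datasets` library.
# This does not replace downloading the benchmark files for --GLUE_path.
from datasets import load_dataset

cola = load_dataset("glue", "cola")   # splits: train / validation / test
print(cola["train"][0])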
Training
cd src
python exp.py \
--adapter True \
--GLUE_path <your_GLUE_path> \
--output_path <output_path> \
--model <model_name> \
--task <task_name> \
--epoch 100 \
--lr 0.0001 \
--max_len 512 \
--batch_size 32
- -s or --seed specifies the random seed.
- -g or --GLUE_path specifies the path of your GLUE dataset.
- -o or --output_path specifies the path where the trained model and prediction files are saved.
- -m or --model specifies the pre-trained language model (PLM) used in training.
  - Some examples: bert-base, bert-large, roberta-base, roberta-large
- -t or --task specifies the downstream task.
  - Some examples: cola, mnli, qnli, qqp, mrpc, rte, sst, sts
- -a or --adapter specifies whether to add our AdapterBias to the PLM.
- --share_alpha specifies whether to share the same alpha in AdapterBias across all transformer layers.
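For example, a hypothetical CoLA run with bert-base (the two paths are placeholders to replace with your own):

cd src
python exp.py \
--adapter True \
--GLUE_path ../GLUE \
--output_path ../output/cola_bert_base \
--model bert-base \
--task cola \
--epoch 100 \
--lr 0.0001 \
--max_len 512 \
--batch_size 32 \
--seed 42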
Inference
After training, the prediction files are automatically written to <output_path>/result/, and the saved model is stored in <output_path>/model/.
After running all nine tasks of the GLUE benchmark, you can submit the prediction files to the GLUE website.
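The GLUE leaderboard expects the predictions packaged as a single zip archive. A hypothetical packaging step is sketched below; the <output_path> placeholder is yours to fill in, and the exact .tsv file names required are defined by the GLUE site, not this repo.

# Hypothetical packaging step: zip the prediction files under <output_path>/result/
# into submission.zip for upload to the GLUE website.
import zipfile
from pathlib import Path

result_dir = Path("<output_path>") / "result"   # replace <output_path> with your own path
with zipfile.ZipFile("submission.zip", "w") as zf:
    for tsv in sorted(result_dir.glob("*.tsv")):
        zf.write(tsv, arcname=tsv.name)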