Code to reproduce the experiments in the paper "Transformer Based Multi-Source Domain Adaptation" (EMNLP 2020)

CopeNLU

Last update: Dec 5, 2022

Overview

Transformer Based Multi-Source Domain Adaptation

Dustin Wright and Isabelle Augenstein

To appear in EMNLP 2020. Read the preprint: https://arxiv.org/abs/2009.07806

In practical machine learning settings, the data on which a model must make predictions often come from a different distribution than the data it was trained on. Here, we investigate the problem of unsupervised multi-source domain adaptation, where a model is trained on labelled data from multiple source domains and must make predictions on a domain for which no labelled data has been seen. Prior work with CNNs and RNNs has demonstrated the benefit of mixture of experts, where the predictions of multiple domain expert classifiers are combined; as well as domain adversarial training, to induce a domain agnostic representation space. Inspired by this, we investigate how such methods can be effectively applied to large pretrained transformer models. We find that domain adversarial training has an effect on the learned representations of these models while having little effect on their performance, suggesting that large transformer-based models are already relatively robust across domains. Additionally, we show that mixture of experts leads to significant performance improvements by comparing several variants of mixing functions, including one novel mixture based on attention. Finally, we demonstrate that the predictions of large pretrained transformer based domain experts are highly homogenous, making it challenging to learn effective functions for mixing their predictions.
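To make the idea of mixing domain experts concrete, here is a minimal sketch of combining the class distributions of several experts, either by uniform averaging or by softmax-normalised scores. It is an illustration only and is not the implementation or the mixing functions used in the paper; the names `softmax` and `mix_experts`, and all numeric values, are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mix_experts(expert_probs, scores=None):
    """Combine per-expert class distributions into one distribution.

    expert_probs: one probability vector per domain expert.
    scores: optional unnormalised relevance score per expert; if omitted,
            the experts are averaged uniformly.
    """
    n = len(expert_probs)
    weights = softmax(scores) if scores is not None else [1.0 / n] * n
    n_classes = len(expert_probs[0])
    return [
        sum(w * p[c] for w, p in zip(weights, expert_probs))
        for c in range(n_classes)
    ]


if __name__ == "__main__":
    # Illustrative values only, not results from the paper.
    experts = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8]]
    print(mix_experts(experts))                    # uniform average
    print(mix_experts(experts, [2.0, 0.0, -1.0]))  # score-weighted mixture
```

Because the weights sum to one and each expert outputs a probability vector, the mixed output is itself a valid distribution. When the experts agree closely, as the abstract reports for transformer-based experts, the choice of weights changes the result very little.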

Citing

@inproceedings{wright2020transformer,
  title={{Transformer Based Multi-Source Domain Adaptation}},
  author={Dustin Wright and Isabelle Augenstein},
  booktitle = {Proceedings of EMNLP},
  publisher = {Association for Computational Linguistics},
  year = 2020
}

Recreating Results

To recreate our results, first download the Amazon Product Reviews and PHEME Rumour Detection datasets and place them in the 'data/' directory: put the sentiment data in 'data/sentiment-dataset' and the PHEME data in 'data/PHEME'.
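Before launching any runs, it can help to confirm that the data layout matches the description above. The following is a minimal sketch and is not part of the repository; the helper name `missing_data_dirs` is hypothetical, and only the two directory names come from the instructions.

```python
from pathlib import Path

# Directory names taken from the instructions above.
EXPECTED_DIRS = ("data/sentiment-dataset", "data/PHEME")


def missing_data_dirs(root="."):
    """Return the expected dataset directories that do not exist under root."""
    root = Path(root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    missing = missing_data_dirs()
    if missing:
        print("Missing dataset directories:", ", ".join(missing))
    else:
        print("Both dataset directories are in place.")
```

Run it from the repository root. It only checks that the directories exist, not that their contents are complete.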

Create a new conda environment:

$ conda create --name xformer-multisource-domain-adaptation python=3.7
$ conda activate xformer-multisource-domain-adaptation
$ pip install -r requirements.txt

Note that this project uses wandb; if you do not use wandb, set the following flag to store runs only locally:

export WANDB_MODE=dryrun
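If the experiments are launched from Python rather than a shell, the same setting can be applied through the process environment. This is a minimal sketch, assuming the variable is set before wandb is initialised; the helper name `use_local_wandb` is hypothetical and does not exist in the repository.

```python
import os


def use_local_wandb(env=os.environ):
    """Set WANDB_MODE=dryrun unless the caller already chose a mode."""
    env.setdefault("WANDB_MODE", "dryrun")
    return env["WANDB_MODE"]


if __name__ == "__main__":
    print("WANDB_MODE =", use_local_wandb())
```

Using `setdefault` leaves any mode already exported in the shell untouched, so the sketch never overrides an explicit choice.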

Running all experiments

All experiments are launched from run_sentiment_experiments.sh and run_claim_experiments.sh; look in these files for the command that runs a particular experiment. Running either file runs all 10 variants presented in the paper 5 times each. The individual scripts for each experiment are under emnlp_final_experiments/claim-detection and emnlp_final_experiments/sentiment-analysis.

Comments

Question on min-max objective of eq(9) in paper

Hi Dustin,

Thank you for your great work at EMNLP. While reading the paper I came across a point of confusion. In eq. (9), it seems the discriminator should minimize the domain adversarial loss, while the generator aims to maximize it. Does that mean it should be max_theta_g and min_theta_d?

opened by tonytan48 0
