Adversarial Adaptation with Distillation for BERT Unsupervised Domain Adaptation

Minho Ryu

Last update: Nov 30, 2022

Knowledge Distillation for BERT Unsupervised Domain Adaptation

Official PyTorch implementation | Paper

Abstract

The pre-trained language model BERT has brought significant performance improvements across a range of natural language processing tasks. Because the model is trained on a large corpus covering diverse topics, it is relatively robust to domain shift, in which the data distributions at training time (source data) and test time (target data) differ while sharing similarities. Despite these improvements over previous models, BERT still suffers from performance degradation under domain shift. To mitigate this problem, we propose a simple but effective unsupervised domain adaptation method, adversarial adaptation with distillation (AAD), which combines the adversarial discriminative domain adaptation (ADDA) framework with knowledge distillation. We evaluate our approach on the task of cross-domain sentiment classification across 30 domain pairs and advance the state of the art for unsupervised domain adaptation in text sentiment classification.
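The knowledge distillation term mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common formulation in which a student's temperature-softened outputs are pulled toward a teacher's via a KL divergence scaled by T^2; the exact loss, temperature, and weighting used in this repository may differ. The function names and logits below are hypothetical, and the code is written in plain Python so that it runs without PyTorch (in practice one would more likely use `torch.nn.functional.kl_div`).

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)  # teacher (source model) distribution
    q = softmax(student_logits, temperature)  # student (adapted model) distribution
    kl = sum(pi * (math.log(pi) - math.log(qi)) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl

# Hypothetical logits for one review: (negative, positive).
teacher = [2.0, -1.0]
drifted_student = [0.5, 0.2]

print(distillation_loss(teacher, teacher))          # identical outputs give zero loss
print(distillation_loss(teacher, drifted_student))  # positive once the student drifts
```

In this reading, the distillation term acts as a regularizer: it is zero while the adapted encoder still reproduces the source model's predictions and grows as adversarial training pushes it away from them.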

Requirements

pandas
pytorch
transformers

Run the test

$ python main.py --pretrain --adapt --src books --tgt dvd

How to cite

@article{ryu2020knowledge,
  title={Knowledge Distillation for BERT Unsupervised Domain Adaptation},
  author={Ryu, Minho and Lee, Kichun},
  journal={arXiv preprint arXiv:2010.11478},
  year={2020}
}

Comments

Adapted accuracies are worse than the no-adapt results?

Hi,

Thanks for your code; it helps a lot. When I tried to reproduce your results on the Amazon review datasets, I found that the BERT-AAD accuracies are worse than the no-adapt results. Have you encountered the same issue? A sample log line: Epoch [79/80] Step [195/200]: acc=0.5000 g_loss=0.6922 d_loss=0.6932 kd_loss=0.0000. I also noticed that when adaptation training converges, only kd_loss descends to 0, while g_loss and d_loss do not decrease at all. Is this normal, or is this where the problem lies? Could you also please release the hyperparameters for BERT-AAD?

Thanks,

opened by Originofamonia
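One way to read the log line quoted in the comment above, assuming g_loss and d_loss are binary cross-entropy losses on the domain discriminator's output (an assumption; the thread does not confirm it): a discriminator that outputs probability 0.5 for every example incurs a loss of ln 2, about 0.6931, which is close to the reported values and consistent with acc=0.5000. The helper below is a hypothetical illustration, not code from the repository.

```python
import math

def binary_cross_entropy(p, label):
    """BCE for a single predicted probability p and a 0/1 label."""
    return -(label * math.log(p) + (1 - label) * math.log(1 - p))

# Loss of a discriminator that cannot tell source from target (p = 0.5).
chance = binary_cross_entropy(0.5, 1)
print(round(chance, 4))  # 0.6931
```

Under that assumption, losses that stay near 0.693 mean the discriminator is at chance level. This is the equilibrium an adversarial setup aims for, but it is also what a discriminator that never learned anything would produce, so the values alone do not show whether adaptation succeeded.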

Owner

Minho Ryu
