Overview

Contra-OOD

Code for the EMNLP 2021 paper "Contrastive Out-of-Distribution Detection for Pretrained Transformers".

Requirements

Dataset

Most of the datasets used in the paper are downloaded automatically by the datasets package. Instructions for downloading sst2 and multi30k are provided in readme.txt under the data folder.
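
As an illustration of the automatic download (a minimal sketch using the Hugging Face datasets library and standard dataset IDs, not the repository's own loading code):

    from datasets import load_dataset

    # IMDb is fetched and cached automatically on first use; the other
    # benchmark corpora are loaded the same way under their own IDs.
    imdb = load_dataset("imdb")
    print(imdb)  # shows the available splits and features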

Training and Evaluation

Fine-tune the pretrained language model (PLM) with the following command:

>> python run.py --task_name task

The task_name parameter can take one of sst2, imdb, trec, or 20ng. The training loss and the evaluation results on the test set are synced to the wandb dashboard.
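
For example, to fine-tune on sst2:

>> python run.py --task_name sst2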

Comments
  • Problem of Table 2

    Table 2 seems confusing. Why is L_{margin} + MSP not highlighted in green on the SST2 dataset but highlighted in green on the IMDb dataset? You write 'Results where the contrastive loss improves OOD detection on both evaluation metrics are highlighted in green,' and lower FAR95 is better. (A sketch of how FAR95 is commonly computed is given after the comments.)

    opened by wjczf123 2
  • Mask is always zero in Margin-based Contrastive Loss

    Reference: https://github.com/wzhouad/Contra-OOD/blob/main/model.py#L40

    I have a question about the implementation of the margin-based contrastive loss.

    mask = (labels.unsqueeze(1) == labels.unsqueeze(0)).float()
    

    If the batch size is 64, the labels variable has shape [64]. When the code above runs, ([64, 1] == [1, 64]).float() produces a [64, 64] matrix, which is exactly a 2D diagonal matrix.

    mask = mask - torch.diag(torch.diag(mask))
    

    But the problem is in the second line of code. When torch.diag(mask) runs, the result has shape [64] and is a one-filled vector: $[1, 1, 1, \ldots]$. Therefore, the result of torch.diag(torch.diag(mask)) is exactly the same as mask, which is an exact 2D diagonal matrix. Furthermore, if you subtract this result from mask, the mask ends up as an all-zero matrix, so the mask variable has no effect on gradient descent.

    Is this really intentional?

    I thought the mask variable was meant to distinguish $P(i)$ and $N(i)$ in the equation. Is this right, or am I missing something? (A standalone sketch of this mask construction is given after the comments.)

    opened by OrigamiDream 1
  • A small question about the loss

    Very nice work! Is there any difference between the supervised contrastive learning loss implemented in this paper (https://github.com/wzhouad/Contra-OOD/blob/2a1d63a61c8b03efdc27ca08b22f5fab2bc6001d/model.py#L46) and the SCL code provided with the original SupContrast paper (https://github.com/HobbitLong/SupContrast/blob/master/losses.py)? The implementation in this paper looks more concise.

    opened by topDreamer 1
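
The first comment above refers to FAR95, i.e. the false alarm rate when 95% of in-distribution test examples are accepted. A minimal sketch of one common way to compute it, assuming higher scores mean more in-distribution (an illustration, not the repository's evaluation code):

    import numpy as np

    def far95(id_scores, ood_scores):
        # Pick the threshold that accepts 95% of in-distribution examples,
        # then report the fraction of OOD examples that still pass it.
        # Lower is better.
        threshold = np.percentile(id_scores, 5)
        return float(np.mean(np.asarray(ood_scores) >= threshold))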
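
The second comment discusses the pairwise mask in the margin-based contrastive loss. Below is a standalone sketch of that mask construction on toy labels, so the shapes and the diagonal subtraction can be inspected directly (an illustrative reproduction, not the repository's code):

    import torch

    # Toy batch: 6 examples drawn from 3 classes (illustrative labels only).
    labels = torch.tensor([0, 1, 0, 2, 1, 0])

    # 1.0 wherever two examples share a label (including each example with itself).
    mask = (labels.unsqueeze(1) == labels.unsqueeze(0)).float()

    # torch.diag(mask) extracts the all-ones diagonal; applying torch.diag again
    # turns that vector back into an identity matrix, so the subtraction removes
    # only the self-pairs and keeps the off-diagonal same-class pairs.
    mask = mask - torch.diag(torch.diag(mask))

    print(mask)  # nonzero off-diagonal entries remain wherever a label repeats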
Owner
Wenxuan Zhou
Ph.D. student at University of Southern California