Discretized Integrated Gradients for Explaining Language Models (EMNLP 2021)

INK Lab @ USC

Last update: Oct 27, 2022

Overview

Overview of the paths used in DIG and IG. w is the word being attributed, and the gray region is the neighborhood of w. The green line depicts the straight-line path from w to w' used by IG, and the green squares are the corresponding interpolation points. Left: In DIG-Greedy, we first monotonize each word in the neighborhood (red arrow). The word closest to its corresponding monotonic point is then selected as the anchor (blue line to w_5, since the red arrow of w_5 has the smallest magnitude). Right: In DIG-MaxCount, we first count the number of monotonic dimensions for each word in the neighborhood (shown in [.] above). The word with the highest number of monotonic dimensions is then selected as the anchor word (blue line to w_4), and its non-monotonic dimensions are changed (red line to c). Repeating this step gives the zigzag blue path. Finally, the red stars are the interpolated points used by our method. Please refer to the paper for more details.
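The two anchor-selection strategies described above can be illustrated with a small, dependency-free sketch. This is not the repository's code: the function names, the toy vectors, and in particular the replacement value used in monotonize are assumptions made for illustration, and the paper remains the reference for the exact rules. A dimension of a candidate point c is treated as monotonic when c[i] lies between w[i] and w_prime[i].

```python
from math import sqrt


def monotonic_mask(w, w_prime, c):
    """Per-dimension flags: True where c[i] lies between w[i] and w_prime[i]."""
    return [min(a, b) <= x <= max(a, b) for a, b, x in zip(w, w_prime, c)]


def monotonize(w, w_prime, c, steps=2):
    """Return a copy of c with its non-monotonic dimensions replaced.

    ASSUMPTION: the replacement value w[i] - (w[i] - w_prime[i]) / steps is
    an illustrative choice; see the paper for the exact rule.
    """
    mask = monotonic_mask(w, w_prime, c)
    return [
        x if ok else a - (a - b) / steps
        for a, b, x, ok in zip(w, w_prime, c, mask)
    ]


def distance(u, v):
    """Euclidean distance between two equal-length vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def pick_anchor_greedy(w, w_prime, neighbors):
    """Index of the neighbor closest to its own monotonized point."""
    return min(
        range(len(neighbors)),
        key=lambda i: distance(neighbors[i], monotonize(w, w_prime, neighbors[i])),
    )


def pick_anchor_maxcount(w, w_prime, neighbors):
    """Index of the neighbor with the most monotonic dimensions."""
    return max(
        range(len(neighbors)),
        key=lambda i: sum(monotonic_mask(w, w_prime, neighbors[i])),
    )


# Toy 3-dimensional example (made-up numbers, not real embeddings).
w = [1.0, 1.0, 1.0]
w_prime = [0.0, 0.0, 0.0]
neighbors = [[0.5, 0.5, 3.0], [1.05, 1.05, 1.05], [1.5, 1.5, 1.5]]

greedy_idx = pick_anchor_greedy(w, w_prime, neighbors)
maxcount_idx = pick_anchor_maxcount(w, w_prime, neighbors)
print(greedy_idx, maxcount_idx)
```

On this toy input the two strategies disagree: the greedy rule picks neighbor 1, which needs the smallest move to become monotonic, while the max-count rule picks neighbor 0, which already has two monotonic dimensions.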

Dependencies

Dependencies can be installed using requirements.txt.

Evaluating DIG:

Install all the requirements from requirements.txt.
Execute ./setup.sh to set up the folder hierarchy for experiments.

Commands for reproducing the reported results on DistilBERT fine-tuned on SST2:

# Generate the KNN graph
python knn.py -dataset sst2 -nn distilbert

# DIG (strategy: Greedy)
python main.py -dataset sst2 -nn distilbert -strategy greedy

# DIG (strategy: MaxCount)
python main.py -dataset sst2 -nn distilbert -strategy maxcount

Commands for other settings follow the same pattern, with different values for the -dataset, -nn, and -strategy arguments.
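To run the three steps in order, a small wrapper along these lines may be convenient. It is a hypothetical helper, not part of the repository: build_commands and run_pipeline are names introduced here, and only the script names and flags shown in the commands above are used. By default it only prints the commands; pass dry_run=False from the repository root to execute them.

```python
import shlex
import subprocess
import sys


def build_commands(dataset="sst2", model="distilbert", strategies=("greedy", "maxcount")):
    """Build the KNN-graph command followed by one DIG command per strategy."""
    commands = [[sys.executable, "knn.py", "-dataset", dataset, "-nn", model]]
    for strategy in strategies:
        commands.append(
            [sys.executable, "main.py", "-dataset", dataset, "-nn", model,
             "-strategy", strategy]
        )
    return commands


def run_pipeline(dry_run=True, **kwargs):
    """Print each command; also execute it when dry_run is False."""
    for cmd in build_commands(**kwargs):
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop at the first failing step


# Dry run: shows what would be executed without touching the repository.
run_pipeline(dry_run=True)
```

Building the KNN graph first matters because both DIG strategies are listed after it in the commands above; check=True makes the wrapper stop if that step fails.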

Please contact Soumya for any clarifications or suggestions.
