Explaining neural model decisions contrastively against alternative decisions.

Overview

Contrastive Explanations for Model Interpretability

This is the repository for the paper "Contrastive Explanations for Model Interpretability", which explains neural model decisions contrastively against alternative (foil) decisions.

Authors: Alon Jacovi, Swabha Swayamdipta, Shauli Ravfogel, Yanai Elazar, Yejin Choi, Yoav Goldberg

Getting Started

Setup

conda create -n contrastive python=3.8
conda activate contrastive
pip install allennlp==1.2.0rc1
pip install allennlp-models==1.2.0rc1.dev20201014
pip install jupyterlab
pip install pandas
bash scripts/download_data.sh

Contrastive projection

If you're here just to know how we implemented contrastive projection, here it is:

import numpy as np
u = classifier_w[fact_idx] - classifier_w[foil_idx]  # direction separating the fact and foil classes in classifier weight space
contrastive_projection = np.outer(u, u) / np.dot(u, u)  # rank-1 projection matrix onto that direction

Very simple :)

contrastive_projection is a projection matrix: applied to the model's latent representation h of some example, it keeps only the component of h along the direction that separates the fact and foil logits.
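
As a concrete (and purely illustrative) example, the sketch below applies the projection to a toy classifier head and latent representation; classifier_w, classifier_b, h, fact_idx, and foil_idx are made-up stand-ins here, not variable names from this repository:

import numpy as np

rng = np.random.default_rng(0)
num_classes, hidden_dim = 3, 8                              # toy sizes for illustration
classifier_w = rng.normal(size=(num_classes, hidden_dim))   # stand-in classifier weights
classifier_b = rng.normal(size=num_classes)                 # stand-in classifier bias
h = rng.normal(size=hidden_dim)                             # stand-in latent representation of one example
fact_idx, foil_idx = 0, 1                                   # predicted class vs. the contrasted alternative

u = classifier_w[fact_idx] - classifier_w[foil_idx]
contrastive_projection = np.outer(u, u) / np.dot(u, u)

h_contrastive = contrastive_projection @ h                  # keep only the fact-vs-foil component of h
contrastive_logits = classifier_w @ h_contrastive + classifier_b

# The fact-vs-foil logit difference is unchanged by the projection,
# while every direction orthogonal to u is removed:
assert np.isclose(u @ h_contrastive, u @ h)

The useful property is that decisions made from h_contrastive can only rely on information that distinguishes the fact from the foil.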

Training MNLI/BIOS models

bash scripts/train_sequence_classification.sh 

Highlight ranking (Sections 4.3, 5.3)

Run the notebooks/mnli-highlight-featurerank.ipynb or notebooks/bios-highlight-featurerank.ipynb Jupyter notebooks.

These notebooks load the respective models and then run the highlight ranking procedure.
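
The exact procedure is implemented in those notebooks; as a rough, hypothetical illustration of the general idea (not the notebooks' code), one could rank tokens by how much masking each of them lowers the fact-minus-foil logit difference of the contrastively projected representation. Here encode stands in for the model's encoder and mask_token for its mask symbol:

import numpy as np

def contrastive_score(h, classifier_w, fact_idx, foil_idx):
    # Fact-minus-foil logit difference after contrastive projection.
    u = classifier_w[fact_idx] - classifier_w[foil_idx]
    h_contrastive = (np.outer(u, u) / np.dot(u, u)) @ h
    return float(u @ h_contrastive)

def rank_highlights(tokens, encode, classifier_w, fact_idx, foil_idx, mask_token="[MASK]"):
    # Score the full input, then re-score with each token masked in turn.
    base = contrastive_score(encode(tokens), classifier_w, fact_idx, foil_idx)
    drops = [
        base - contrastive_score(encode(tokens[:i] + [mask_token] + tokens[i + 1:]),
                                 classifier_w, fact_idx, foil_idx)
        for i in range(len(tokens))
    ]
    # Tokens whose removal hurts the contrastive decision the most come first.
    return sorted(range(len(tokens)), key=lambda i: drops[i], reverse=True)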

Foil ranking (Section 4.1)

First, cache the model's encodings of the dev set examples:

bash scripts/cache_encodings_bios.sh

Then run the notebooks/bios-highlight-foilrank.ipynb notebook.

Contrastive decision making (Section 4.4)

First, cache the model's encodings of the dev set examples (skip if already executed):

bash scripts/cache_encodings_bios.sh

Then run the notebooks/bios-foilpower.ipynb notebook.

Foil ranking for BIOS concepts (Section 4.2)

First, generate concept labels as a numpy matrix from the BIOS dataset:

python scripts/bios_concepts.py --data-path data/bios/train.jsonl --concept-path experiments/models/bios/roberta-large/concepts/gender-male/train
python scripts/bios_concepts.py --data-path data/bios/dev.jsonl --concept-path experiments/models/bios/roberta-large/concepts/gender-male/dev
python scripts/bios_concepts.py --data-path data/bios/test.jsonl --concept-path experiments/models/bios/roberta-large/concepts/gender-male/test
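
If you want to sanity-check the output, a quick look at the generated labels might look like the following, assuming scripts/bios_concepts.py saves a NumPy array of binary concept labels under the given --concept-path prefix (the exact file name and extension are determined by the script):

import numpy as np

# Assumption: one 0/1 label per training example, saved as a .npy file.
labels = np.load("experiments/models/bios/roberta-large/concepts/gender-male/train.npy")
print(labels.shape, labels.mean())   # number of examples, fraction of positive labels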

Then, run Amnesic Probing.

Foil ranking for MNLI concepts (Section 5.2)

Overlap concept:

First, generate concept labels as a numpy matrix from the MNLI dataset:

python scripts/mnli_concepts.py --data-path data/mnli/train.jsonl --concept-path experiments/models/mnli/roberta-large/concepts/overlap/train
python scripts/mnli_concepts.py --data-path data/mnli/dev.jsonl --concept-path experiments/models/mnli/roberta-large/concepts/overlap/dev
python scripts/mnli_concepts.py --data-path data/mnli/test.jsonl --concept-path experiments/models/mnli/roberta-large/concepts/overlap/test

Then, run Amnesic Probing.

Negation concept:

The examples we used for the negation concept analysis are:

data/nli_negation_concept/entailment.jsonl  # entailment instances
data/nli_negation_concept/entailment_with_negation.jsonl  # the above entailment instances, paraphrased with negation words
data/nli_negation_concept/neutral.jsonl  # neutral instances
data/nli_negation_concept/neutral_with_negation.jsonl  # the above neutral instances, paraphrased with negation words

To analyze them with respect to the trained MultiNLI model, run the notebook notebooks/mnli-negation-foilrank.ipynb.
