Official code implementation of the paper: XAI for Transformers: Better Explanations through Conservative Propagation

Overview

For the SST-2 and IMDB experiments, follow these instructions:

- To reproduce the results on the SST-2 dataset, run:
$ python /sst2/run_sst.py
  • This produces the following files:
    • all_flips_pruning.p, all_flips_generate.p
    • These pickle files can be loaded in the notebook "paper_plots" to generate the perturbation plots reported in the paper.
- For the IMDB dataset, the starting code is at /imdb/run_imdb.py
  • Follow the same instructions as for the SST-2 dataset above.
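
As a minimal sketch of the loading step, assuming the two output files are ordinary Python pickles written to the current working directory: the README does not describe what they contain, so the helper name `load_flips` and the default directory below are illustrative only, and the loaded object is simply returned for inspection.

```python
import pickle
from pathlib import Path


def load_flips(filename, directory="."):
    """Load one of the pickle files written by run_sst.py or run_imdb.py.

    The layout of the stored object is not documented in the README,
    so it is returned unchanged (hypothetical helper, not part of the repo).
    """
    path = Path(directory) / filename
    with path.open("rb") as handle:
        return pickle.load(handle)


if __name__ == "__main__":
    for name in ("all_flips_pruning.p", "all_flips_generate.p"):
        if Path(name).exists():
            flips = load_flips(name)
            print(name, type(flips))
        else:
            print(name, "not found; run the experiment script first")
```

Pickle files can execute arbitrary code when loaded, so this should only be used on files produced by your own run of the scripts.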

Comments
  • Hi!

    How did you produce Figure 3 of the paper ("Conservation, or lack thereof, of GI and LRP attributions on two Transformer models")? I see a function plot_conservation in plot_utils.py, but I cannot figure out what to pass as conservation_dict.

    opened by Aakriti23 0
  • Relevance values explosion when applying the method on Vision Transformer

    Hello,

    First thank you for this interesting work!

    I'm trying to test your algorithm on a Vision Transformer. However, I run into a "relevance explosion" problem: when relevance is propagated from one block to the previous one, the total relevance jumps by more than 10x, and after 12 blocks the values are on the order of 10^10 or more.

    I rewrote the LayerNorm and Linear layers in a way similar to yours.

    For Linear, I wrote the gamma rule with gamma = 0.02:

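        def alternative_inference(self, input):
            # Build the gamma-modified copy of this layer once and cache it.
            # hasattr is used because nn.Module stores submodules in
            # self._modules, not in self.__dict__.
            if not hasattr(self, 'player'):
                out_size, in_size = self.weight.shape
                player = Linear(in_size, out_size)
                # Gamma rule (gamma = 0.02): boost the positive weights and biases.
                player.weight = torch.nn.Parameter(self.weight + 0.02 * self.weight.clamp(min=0))
                player.bias = torch.nn.Parameter(self.bias + 0.02 * self.bias.clamp(min=0))
                self.player = player

            z = self(input)
            zp = self.player(input)
            # The forward value equals z; the gradient flows only through zp.
            return zp * (z / zp).data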

    For LayerNorm, I only detached the std:

        def alternative_inference(self, input):
            mean = input.mean(dim=-1, keepdim=True)
            std = input.std(dim=-1, keepdim=True)
            # Detach the std so it is treated as a constant in the backward pass.
            std = std.detach()
            input_norm = (input - mean) / (std + self.eps)

            input_norm = input_norm * self.weight + self.bias
            return input_norm

    All relevance backpropagation is done via gradient × input, the same way you did it.

    This problem is also described in Chefer et al. 2021 (Transformer Interpretability Beyond Attention Visualization). They handle it explicitly by forcing a normalization at every add layer and by using only the LRP alpha-beta rule with alpha = 1 and beta = 0.

    In your paper, you describe results on DistilBERT from Hugging Face, so your method should be able to run on a full-scale transformer (12 blocks with all the bells and whistles). I'm wondering whether you had to apply other tricks to get it working.

    Thanks

    opened by zmy1116 0
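
The issue above turns on whether the total relevance stays constant from layer to layer. As a rough illustration, here is a pure-Python sketch of the textbook gamma rule for a single linear layer, followed by a comparison of the relevance totals before and after propagation. It is not the repository's code: the function name `lrp_gamma_linear` and the toy numbers are made up for this example, and the weight layout (one row per output) is only assumed to mirror `torch.nn.Linear.weight`.

```python
def lrp_gamma_linear(a, weights, relevance_out, gamma=0.02, bias=None):
    """Redistribute relevance through one linear layer with the gamma rule.

    a:             input activations (length J)
    weights:       K rows of J weights, one row per output unit
    relevance_out: relevance assigned to the K outputs
    """
    n_out, n_in = len(weights), len(a)
    if bias is None:
        bias = [0.0] * n_out
    relevance_in = [0.0] * n_in
    for k in range(n_out):
        # Gamma-modified parameters: positive parts are boosted by gamma.
        w_mod = [w + gamma * max(w, 0.0) for w in weights[k]]
        b_mod = bias[k] + gamma * max(bias[k], 0.0)
        z_k = sum(a_j * w for a_j, w in zip(a, w_mod)) + b_mod
        for j in range(n_in):
            # Each input receives its share of the contribution to output k.
            relevance_in[j] += a[j] * w_mod[j] / z_k * relevance_out[k]
    return relevance_in


# Toy values, chosen only for illustration.
a = [1.0, 2.0, 0.5]
weights = [[0.3, -0.1, 0.8], [0.5, 0.2, -0.4]]
relevance_out = [0.7, 0.3]

relevance_in = lrp_gamma_linear(a, weights, relevance_out)
print(sum(relevance_out), sum(relevance_in))
```

Without a bias term the two totals agree up to floating-point rounding; with a non-zero bias, part of the relevance is absorbed by the bias and the totals differ. Printing the same two sums around each block of a deeper model is one way to see where the total starts to drift.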
Owner
Ameen Ali
PhD student - Tel Aviv University