git《Self-Attention Attribution: Interpreting Information Interactions Inside Transformer》(AAAI 2021) GitHub:

Overview

Self-Attention Attribution

This repository contains the implementation for AAAI-2021 paper Self-Attention Attribution: Interpreting Information Interactions Inside Transformer. It includes the code for generating the self-attention attribution score, pruning attention heads with our method, constructing the attribution tree and extracting the adversarial triggers. All of our experiments are conducted on bert-base-cased model, our methods can also be easily transfered to other Transformer-based models.

Requirements

  • Python version >= 3.5
  • Pytorch version == 1.1.0
  • networkx == 2.3

We recommend you to run the code using the docker under Linux:

docker run -it --rm --runtime=nvidia --ipc=host --privileged pytorch/pytorch:1.1.0-cuda10.0-cudnn7.5-devel bash

Then install the following packages with pip:

pip install --user networkx==2.3
pip install --user matplotlib==3.1.0
pip install --user tensorboardX six numpy tqdm scikit-learn

You can install attattr from source:

git clone https://github.com/YRdddream/attattr
cd attattr
pip install --user --editable .

Download Pre-Finetuned Models and Datasets

Before running self-attention attribution, you can first fine-tune bert-base-cased model on a downstream task (such as MNLI) by running the file run_classifier_orig.py. We also provide the example datasets and the pre-finetuned checkpoints at Google Drive.

Get Self-Attention Attribution Scores

Run the following command to get the self-attention attribution score and the self-attention score.

python examples/generate_attrscore.py --task_name ${task_name} --data_dir ${data_dir} \
       --bert_model bert-base-cased --batch_size 16 --num_batch 4 \
       --model_file ${model_file} --example_index ${example_index} \
       --get_att_attr --get_att_score --output_dir ${output_dir}

Construction of Attribution Tree

When you get the self-attribution scores of a target example, you could construct the attribution tree. We recommend you to run the file get_tokens_and_pred.py to summarize the data, or you can manually change the value of tokens in attribution_tree.py.

python examples/attribution_tree.py --attr_file ${attr_file} --tokens_file ${tokens_file} \
       --task_name ${task_name} --example_index ${example_index} 

You can generate the attribution tree from the provided example.

python examples/attribution_tree.py --attr_file ${model_and_data}/mnli_example/attr_zero_base_exp16.json \
       --tokens_file ${model_and_data}/mnli_example/tokens_and_pred_100.json \
       --task_name mnli --example_index 16

Self-Attention Head Pruning

We provide the code of pruning attention heads with both our attribution method and the Taylor expansion method. Pruning heads with our method.

python examples/prune_head_with_attr.py --task_name ${task_name} --data_dir ${data_dir} \
       --bert_model bert-base-cased --model_file ${model_file}  --output_dir ${output_dir}

Pruning heads with Taylor expansion method.

python examples/prune_head_with_taylor.py --task_name ${task_name} --data_dir ${data_dir} \
       --bert_model bert-base-cased --model_file ${model_file}  --output_dir ${output_dir}

Adversarial Attack

First extract the most important connections from the training dataset.

python examples/run_adver_connection.py --task_name ${task_name} --data_dir ${data_dir} \
       --bert_model bert-base-cased --batch_size 16 --num_batch 4 --zero_baseline \
       --model_file ${model_file} --output_dir ${output_dir}

Then use these adversarial triggers to attack the original model.

python examples/run_adver_evaluate.py --task_name ${task_name} --data_dir ${data_dir} \
       --bert_model bert-base-cased --model_file ${model_file} \
       --output_dir ${output_dir} --pattern_file ${pattern_file}

Reference

If you find this repository useful for your work, you can cite the paper:

@inproceedings{attattr,
  author = {Yaru Hao and Li Dong and Furu Wei and Ke Xu},
  title = {Self-Attention Attribution: Interpreting Information Interactions Inside Transformer},
  booktitle = {The Thirty-Fifth {AAAI} Conference on Artificial Intelligence},
  publisher = {{AAAI} Press},
  year      = {2021},
  url       = {https://arxiv.org/pdf/2004.11207.pdf}
}
Comments
  • attention tree error

    attention tree error

    Hi YRddream, I am running your code with your examples and commands. But I found an error when running "attention tree" module. When I use the following command, a blank graph will be generated, and there is an error that the dividend cannot be 0. Could you help me? Thank u so much~~

    !python examples/attribution_tree.py --attr_file "/content/drive/MyDrive/AttentionAttr/attattr/data/model_and_data/model_and_data/mnli_example/attr_zero_base_exp16.json" 
           --tokens_file "/content/drive/MyDrive/AttentionAttr/attattr/data/model_and_data/model_and_data/mnli_example/tokens_and_pred_100.json" 
           --task_name mnli --example_index 16

    opened by liuyuanhao 5
  • IndexError: list index out of range

    IndexError: list index out of range

    Thanks for your great codes! I enjoy reading the paper.

    When reproducing results following the provided readme,

    python examples/attribution_tree.py --attr_file ${model_and_data}/mnli_example/attr_zero_base_exp16.json
    --tokens_file ${model_and_data}/mnli_example/tokens_and_pred_100.json
    --task_name mnli --example_index 16

    I encounter the bugs

    Traceback (most recent call last): File "examples/attribution_tree.py", line 145, in main() File "examples/attribution_tree.py", line 98, in main with open(args.tokens_file) as fin: IndexError: list index out of range

    I found that the provided attribution json is 22*22 matrix, but the tokens of index 16 in tokens_and_pred_100.json only have 21 tokens.

    ['[CLS]', 'the', 'emotions', 'are', 'raw', 'and', 'will', 'strike', 'a', 'nerve', 'with', 'anyone', 'who', "'", 's', 'ever', 'had', 'family', 'trauma', '.', '[SEP]']

    Thus, it incurs the error.

    Could you check the provided the files or codes? Or I missed something?

    opened by Jxu-Thu 3
  • Getting attribution scores for unlabelled data.

    Getting attribution scores for unlabelled data.

    Hi, In the code for attribution score calculation in the file generate_attrscore.py, if we wanted to get the score using say SST-2 fine-tuned model. If we generate the attribution score for an input text without gold labels, i.e. making the assumption that $\text{predicted labels} = \text{true label}$. What would be the implications of it, would this affect the attribute tree generation so much as to render this approach unusable?

    Referring to the file generate_attrscore.py:

    tar_prob, grad = model(input_ids, segment_ids, input_mask, label_ids, tar_layer, one_batch_att, pred_label=pred_label)
    

    for $\text{predicted labels} = \text{true label}$, changing it to:

    pred_tensor = torch.argmax(baseline_logits)
    tar_prob, grad = model(input_ids, segment_ids, input_mask, pred_tensor.unsqueeze(0), tar_layer, one_batch_att, pred_label=pred_label)
    
    opened by gauneg 1
  • attention tree threshold

    attention tree threshold

    Hi, great work. I was trying to understand the code for the attribute tree construction. When calculating $max(Attr(A^t))$, why do you leave out the values of the 0th row. In the following section:

    proportion_all = copy.deepcopy(att_all)
    for i in range(len(proportion_all)):
        proportion_all[i] /= abs(proportion_all[i][1:,:].max())
    
    

    Why not use all the parameters to calculate the max? Is it a part of the heuristics of removing the interaction with the [CLS] token?

    opened by gauneg 1
  • Applying to time series

    Applying to time series

    Hi, I would like to ask if the method in your example can be simply modified to apply to time series instead of NLP? Because I see that you use bert as the baseline, but if you use it for time series, you need another form of transformer. Thank u so much

    opened by liuyuanhao 1
  • Interval parameter m in paper

    Interval parameter m in paper

    Hi,

    Catching up with this repo together with your paper. Thanks for providing great attribution methodology. In 3rd/4th pages of your paper, found a description of approximation step parameter m, which I couldn't find the way to tune yet.

    Could you point out the line of the codes you're setting it, or could you guide me a bit so that I can implement on my own?

    Thanks!

    opened by HireTheHero 1
  • batch wise attribution

    batch wise attribution

    Amazing work! I had a quick question: As of right now, I see that the method works for 1 example input at a time. Can we process more than one example at a time using the generate_attrscore.py file? And if so, how do I go about doing so? Thank you!

    opened by srn284 1
  • Effectiveness Analysis Code

    Effectiveness Analysis Code

    Hello, It seems that there is no corresponding code for Effectiveness Analysis in Section 4.1. Would you be able to add it? Besides, I have a question about Section 4.1: The I_{h} selection principle is not clear in the paper, how do you calculate the importance of the head based on attr (max pool or avg pool)?

    image

    opened by zyccyz 1
  • CVE-2007-4559 Patch

    CVE-2007-4559 Patch

    Patching CVE-2007-4559

    Hi, we are security researchers from the Advanced Research Center at Trellix. We have began a campaign to patch a widespread bug named CVE-2007-4559. CVE-2007-4559 is a 15 year old bug in the Python tarfile package. By using extract() or extractall() on a tarfile object without sanitizing input, a maliciously crafted .tar file could perform a directory path traversal attack. We found at least one unsantized extractall() in your codebase and are providing a patch for you via pull request. The patch essentially checks to see if all tarfile members will be extracted safely and throws an exception otherwise. We encourage you to use this patch or your own solution to secure against CVE-2007-4559. Further technical information about the vulnerability can be found in this blog.

    If you have further questions you may contact us through this projects lead researcher Kasimir Schulz.

    opened by TrellixVulnTeam 0
Owner
null
Official PyTorch implementation for Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers, a novel method to visualize any Transformer-based network. Including examples for DETR, VQA.

PyTorch Implementation of Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers 1 Using Colab Please notic

Hila Chefer 489 Jan 7, 2023
git《Commonsense Knowledge Base Completion with Structural and Semantic Context》(AAAI 2020) GitHub: [fig1]

Commonsense Knowledge Base Completion with Structural and Semantic Context Code for the paper Commonsense Knowledge Base Completion with Structural an

AI2 96 Nov 5, 2022
《LightXML: Transformer with dynamic negative sampling for High-Performance Extreme Multi-label Text Classification》(AAAI 2021) GitHub:

LightXML: Transformer with dynamic negative sampling for High-Performance Extreme Multi-label Text Classification

null 76 Dec 5, 2022
Official code for "Focal Self-attention for Local-Global Interactions in Vision Transformers"

Focal Transformer This is the official implementation of our Focal Transformer -- "Focal Self-attention for Local-Global Interactions in Vision Transf

Microsoft 486 Dec 20, 2022
Locally Enhanced Self-Attention: Rethinking Self-Attention as Local and Context Terms

LESA Introduction This repository contains the official implementation of Locally Enhanced Self-Attention: Rethinking Self-Attention as Local and Cont

Chenglin Yang 20 Dec 31, 2021
Implementation of Transformer in Transformer, pixel level attention paired with patch level attention for image classification, in Pytorch

Transformer in Transformer Implementation of Transformer in Transformer, pixel level attention paired with patch level attention for image c

Phil Wang 272 Dec 23, 2022
[CVPR 2021] A Peek Into the Reasoning of Neural Networks: Interpreting with Structural Visual Concepts

Visual-Reasoning-eXplanation [CVPR 2021 A Peek Into the Reasoning of Neural Networks: Interpreting with Structural Visual Concepts] Project Page | Vid

Andy_Ge 54 Dec 21, 2022
git《Tangent Space Backpropogation for 3D Transformation Groups》(CVPR 2021) GitHub:1]

LieTorch: Tangent Space Backpropagation Introduction The LieTorch library generalizes PyTorch to 3D transformation groups. Just as torch.Tensor is a m

Princeton Vision & Learning Lab 482 Jan 6, 2023
git《FSCE: Few-Shot Object Detection via Contrastive Proposal Encoding》(CVPR 2021) GitHub: [fig8]

FSCE: Few-Shot Object Detection via Contrastive Proposal Encoding (CVPR 2021) This repo contains the implementation of our state-of-the-art fewshot ob

null 233 Dec 29, 2022
git《Pseudo-ISP: Learning Pseudo In-camera Signal Processing Pipeline from A Color Image Denoiser》(2021) GitHub: [fig5]

Pseudo-ISP: Learning Pseudo In-camera Signal Processing Pipeline from A Color Image Denoiser Abstract The success of deep denoisers on real-world colo

Yue Cao 51 Nov 22, 2022
Keep CALM and Improve Visual Feature Attribution

Keep CALM and Improve Visual Feature Attribution Jae Myung Kim1*, Junsuk Choe1*, Zeynep Akata2, Seong Joon Oh1† * Equal contribution † Corresponding a

NAVER AI 90 Dec 7, 2022
This is the pytorch implementation of the paper - Axiomatic Attribution for Deep Networks.

Integrated Gradients This is the pytorch implementation of "Axiomatic Attribution for Deep Networks". The original tensorflow version could be found h

Tianhong Dai 150 Dec 23, 2022
PyTorch implementation of VAGAN: Visual Feature Attribution Using Wasserstein GANs

PyTorch implementation of VAGAN: Visual Feature Attribution Using Wasserstein GANs This code aims to reproduce results obtained in the paper "Visual F

Orobix 93 Aug 17, 2022
This is the pytorch implementation for the paper: Generalizable Mixed-Precision Quantization via Attribution Rank Preservation, which is accepted to ICCV2021.

GMPQ: Generalizable Mixed-Precision Quantization via Attribution Rank Preservation This is the pytorch implementation for the paper: Generalizable Mix

null 18 Sep 2, 2022
Text mining project; Using distilBERT to predict authors in the classification task authorship attribution.

DistilBERT-Text-mining-authorship-attribution Dataset used: https://www.kaggle.com/azimulh/tweets-data-for-authorship-attribution-modelling/version/2

null 1 Jan 13, 2022
Official implementation for (Show, Attend and Distill: Knowledge Distillation via Attention-based Feature Matching, AAAI-2021)

Show, Attend and Distill: Knowledge Distillation via Attention-based Feature Matching Official pytorch implementation of "Show, Attend and Distill: Kn

Clova AI Research 80 Dec 16, 2022
Co-mining: Self-Supervised Learning for Sparsely Annotated Object Detection, AAAI 2021.

Co-mining: Self-Supervised Learning for Sparsely Annotated Object Detection This repository is an official implementation of the AAAI 2021 paper Co-mi

MEGVII Research 20 Dec 7, 2022
Official implementation of GraphMask as presented in our paper Interpreting Graph Neural Networks for NLP With Differentiable Edge Masking.

GraphMask This repository contains an implementation of GraphMask, the interpretability technique for graph neural networks presented in our ICLR 2021

Michael Schlichtkrull 29 Sep 2, 2022
[CVPR 2020] Interpreting the Latent Space of GANs for Semantic Face Editing

InterFaceGAN - Interpreting the Latent Space of GANs for Semantic Face Editing Figure: High-quality facial attributes editing results with InterFaceGA

GenForce: May Generative Force Be with You 1.3k Dec 29, 2022