Code for EMNLP2020 long paper: BERT-Attack: Adversarial Attack Against BERT Using BERT

Overview

BERT-ATTACK

Dependencies

Usage

To train a classification model, use the run_glue.py script from Hugging Face transformers==2.9.0.

To generate adversarial samples based on the masked-LM, run

python bertattack.py --data_path data_defense/imdb_1k.tsv --mlm_path bert-base-uncased --tgt_path models/imdbclassifier --use_sim_mat 1 --output_dir data_defense/imdb_logs.tsv --num_label 2 --use_bpe 1 --k 48 --start 0 --end 1000 --threshold_pred_score 0
  • --data_path: We use the IMDB dataset as an example. Datasets can be obtained from TextFooler.
  • --mlm_path: We use the BERT-base-uncased model as our masked-LM.
  • --tgt_path: We follow the official fine-tuning process in transformers to fine-tune BERT as the target model.
  • --k 48: The threshold k is the number of candidate substitutes considered.
  • --output_dir: The output file.
  • --start, --end: The range of samples to process. If the dataset is large, we provide a script for multi-thread processing.
  • --threshold_pred_score: A score threshold for cutting off predictions that may not be suitable (details in Section 5.1).
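To make the roles of --k and --threshold_pred_score concrete, here is a minimal sketch of how a ranked list of masked-LM predictions could be cut down to a candidate set. The function name select_candidates and its input format are hypothetical and not taken from bertattack.py. Treating a threshold of 0 as "no cut-off" is also an assumption.

```python
from typing import List, Tuple


def select_candidates(
    scored_tokens: List[Tuple[str, float]],
    k: int = 48,
    threshold_pred_score: float = 0.0,
) -> List[str]:
    """Keep at most k highest-scoring tokens, then drop low-scoring ones.

    scored_tokens holds (token, masked-LM score) pairs for one position.
    Hypothetical helper for illustration; not the repository's code.
    """
    # Rank predictions from most to least likely.
    ranked = sorted(scored_tokens, key=lambda pair: pair[1], reverse=True)
    top_k = ranked[:k]

    # Assumption: a threshold of 0 disables the score cut-off entirely.
    if threshold_pred_score == 0:
        return [token for token, _ in top_k]

    # Otherwise keep only candidates whose score reaches the threshold.
    return [token for token, score in top_k if score >= threshold_pred_score]


if __name__ == "__main__":
    predictions = [("good", 9.1), ("great", 8.7), ("fine", 3.2), ("bad", 1.0)]
    print(select_candidates(predictions, k=3))
    print(select_candidates(predictions, k=3, threshold_pred_score=5.0))
```

A larger k widens the search at the cost of more queries to the target model, while a higher threshold trades attack success for candidates the masked-LM is more confident about.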

Note

The datasets are re-formatted to the GLUE style.

Some configs are fixed; you can change them manually.

If you need to use the similar-words filter, you need to download and process the cosine similarity matrix following TextFooler. We only use the filter in sentiment classification tasks like IMDB and YELP.

If you need to evaluate the USE results, you need to create the corresponding TensorFlow environment for USE.
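As a rough illustration of what a similar-words filter does, the sketch below drops substitution candidates whose word vectors are not close enough to the original word. It is a pure-Python toy, not the repository's implementation: the helper names, the embeddings dictionary, and the min_sim value of 0.4 are all assumptions. A real setup would look up precomputed values in the cosine similarity matrix instead of recomputing them.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def filter_by_similarity(
    word: str,
    candidates: List[str],
    embeddings: Dict[str, Sequence[float]],
    min_sim: float = 0.4,  # hypothetical threshold, chosen for illustration
) -> List[str]:
    """Keep candidates that are similar enough to `word`.

    Words without a vector cannot be judged, so they are kept unchanged.
    """
    if word not in embeddings:
        return list(candidates)
    kept = []
    for candidate in candidates:
        if candidate not in embeddings:
            kept.append(candidate)
        elif cosine_similarity(embeddings[word], embeddings[candidate]) >= min_sim:
            kept.append(candidate)
    return kept


if __name__ == "__main__":
    toy_vectors = {
        "finest": [1.0, 0.0],
        "best": [0.9, 0.1],
        "worst": [-1.0, 0.0],
    }
    print(filter_by_similarity("finest", ["best", "worst", "unknown"], toy_vectors))
```

With vectors in which antonyms point in different directions, a candidate such as "worst" falls below the threshold for "finest" and is discarded, which is the behaviour the filter is meant to provide in sentiment tasks.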

For faster generation, you could turn off the BPE substitution.

As illustrated in the paper, we set thresholds to balance between the attack success rate and USE similarity score.

The multi-thread process uses the batchrun.py script.

You can run

cat cmd.txt | python batchrun.py --gpus 0,1,2,3 

to simultaneously generate adversarial samples of the given dataset for faster generation. We use the IMDB dataset as an example.
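One way to prepare the input for the pipeline above is to split the dataset into --start/--end ranges and emit one command per range. The sketch below does this under several assumptions: that cmd.txt holds one shell command per line, that --end is exclusive, and that each shard should write to its own output file. The helper names and the per-shard file naming are hypothetical.

```python
from typing import List


def shard_commands(total: int, shard_size: int, base_command: str) -> List[str]:
    """Build one command per [start, end) slice of the dataset.

    Assumes --end is exclusive and gives each shard its own output file
    (the file naming scheme here is made up for illustration).
    """
    commands = []
    for start in range(0, total, shard_size):
        end = min(start + shard_size, total)
        commands.append(
            f"{base_command} "
            f"--output_dir data_defense/imdb_logs_{start}_{end}.tsv "
            f"--start {start} --end {end}"
        )
    return commands


def write_commands(path: str, commands: List[str]) -> None:
    """Write one command per line, the layout assumed for cmd.txt."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(commands) + "\n")


if __name__ == "__main__":
    base = (
        "python bertattack.py --data_path data_defense/imdb_1k.tsv "
        "--mlm_path bert-base-uncased --tgt_path models/imdbclassifier "
        "--use_sim_mat 1 --num_label 2 --use_bpe 1 --k 48 "
        "--threshold_pred_score 0"
    )
    for line in shard_commands(total=1000, shard_size=250, base_command=base):
        print(line)
```

Calling write_commands("cmd.txt", ...) with the generated list would produce a file that can be piped into batchrun.py as shown above, provided batchrun.py reads commands in that form.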

Comments
  • About --tgt_path models/imdbclassifier

    When I operate according to the readme, I get the error requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://huggingface.co/api/models/models/imdbclassifier; I don’t know how to solve it. Thank you very much.

    opened by JieJayCao 3
  • Please provide 'data_defense/counter-fitted-vectors.txt' and 'data_defense/cos_sim_counter_fitting.npy'

    Hi Linyang Li!

    I want to ask if you could share the above two files: data_defense/counter-fitted-vectors.txt and data_defense/cos_sim_counter_fitting.npy? I see you use them in the code to filter out antonyms.

    Thank you so much.

    opened by dangne 2
  • Questions about antonyms

    Hi, thx for your sharing! Some antonyms were found in the crafted adversarial samples, such as finest->worst. How can I filter antonyms as described in section 3.2.1?

    opened by zedzx1uv 2
  • Issue of Truncation when running attack function

    Hi! I used your code in bertattack.py and set the arguments as below:

    output_dir = rootPath+'/data_attack/yelp.tsv'
    start = 0
    data_path = rootPath+'/data/yelp'
    mlm_path = 'bert-base-uncased'
    tgt_path = rootPath + '/classifier/yelp'
    num_label = 2
    k = 48

    where data and model of yelp come from TextFooler repository. However, it keeps reporting

    Truncation was not explicitely activated but max_length is provided a specific value, please use truncation=True to explicitely truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to truncation.

    when I ran

    feat = attack(feat, tgt_model, mlm_model, tokenizer_tgt, k, batch_size=32, max_length=512, cos_mat=cos_mat, w2i=w2i, i2w=i2w)

    I guess it should be somewhere inside the for loop of the attack function, but I cannot tell which code I should modify.

    Can you come up with how to fix the truncation issue? Or would you mind to share the data and classifier you used? (but I guess it is not the problem with data and classifier)

    opened by ShuangNYU 2
  • --tgt_path

    Model name 'models/imdbclassifier' was not found in tokenizers model name list (bert-base-uncased, bert-large-uncased, bert-base-cased, bert-large-cased, bert-base-multilingual-uncased, bert-base-multilingual-cased, bert-base-chinese, bert-base-german-cased, bert-large-uncased-whole-word-masking, bert-large-cased-whole-word-masking, bert-large-uncased-whole-word-masking-finetuned-squad, bert-large-cased-whole-word-masking-finetuned-squad, bert-base-cased-finetuned-mrpc, bert-base-german-dbmdz-cased, bert-base-german-dbmdz-uncased, bert-base-finnish-cased-v1, bert-base-finnish-uncased-v1, bert-base-dutch-cased). We assumed 'models/imdbclassifier-0.6.6' was a path, a model identifier, or url to a directory containing vocabulary files named ['vocab.txt'] but couldn't find such vocabulary files at this path or url.

    How can I fix this problem?

    opened by marsssser 1
  • Question about automatic metric (success rate)

    Hi! First of all, thank you for sharing your awesome work. I have a question about how the "success rate" in the paper is calculated.

    In detail, I'm not sure whether the original examples on which the victim model already makes wrong predictions are excluded from this metric.

    For instance, let's suppose that there are 100 examples and the model's accuracy is 90%. In this case, are the 10 examples with the wrong prediction not attacked? Also, are they also considered in the success rate or not?

    I'm sorry for the silly questions, and hope you have a nice day!

    opened by ddehun 1
  • Hi could you provide the model folder you are attacking

    I am fine-tuning the BertForSequenceClassification model with transformers 4.5.1 and passing the entire checkpoint to tgt_path, but it does not really work. Is that the model you are using?

    opened by YerongLi 1
  • bertattack.py run time error

    Dear Sir, when I execute the following script

    python bertattack.py --data_path data_defense/imdb_1k.tsv --mlm_path bert-base-uncased --tgt_path models/imdbclassifier --use_sim_mat 1 --output_dir data_defense/imdb_logs.tsv --num_label 2 --use_bpe 1 --k 48 --start 0 --end 1000 --threshold_pred_score 0

    and get the Exception Below :

    404 Client Error: Repository Not Found for url: https://huggingface.co/None/resolve/main/tokenizer_config.json
    Traceback (most recent call last):
      File "/home/soar/.local/lib/python3.8/site-packages/transformers/file_utils.py", line 2235, in get_file_from_repo
        resolved_file = cached_path(
      File "/home/soar/.local/lib/python3.8/site-packages/transformers/file_utils.py", line 1846, in cached_path
        output_path = get_from_cache(
      File "/home/soar/.local/lib/python3.8/site-packages/transformers/file_utils.py", line 2050, in get_from_cache
        _raise_for_status(r)
      File "/home/soar/.local/lib/python3.8/site-packages/transformers/file_utils.py", line 1971, in _raise_for_status
        raise RepositoryNotFoundError(f"404 Client Error: Repository Not Found for url: {request.url}")
    transformers.file_utils.RepositoryNotFoundError: 404 Client Error: Repository Not Found for url: https://huggingface.co/None/resolve/main/tokenizer_config.json

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "bertattack.py", line 506, in <module>
        run_attack()
      File "bertattack.py", line 464, in run_attack
        tokenizer_tgt = BertTokenizer.from_pretrained(tgt_path, do_lower_case=True)
      File "/home/soar/.local/lib/python3.8/site-packages/transformers/tokenization_utils_base.py", line 1655, in from_pretrained
        resolved_config_file = get_file_from_repo(
      File "/home/soar/.local/lib/python3.8/site-packages/transformers/file_utils.py", line 2247, in get_file_from_repo
        raise EnvironmentError(
    OSError: None is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'
    If this is a private repository, make sure to pass a token having permission to this repo with use_auth_token or log in with huggingface-cli login and pass use_auth_token=True.

    opened by maverickyu 2
Owner
Linyang Li
FudanNLP