BERT-ATTACK
Code for our EMNLP 2020 long paper:
BERT-ATTACK: Adversarial Attack Against BERT Using BERT
Dependencies
- Python 3.7
- PyTorch 1.4.0
- transformers 2.9.0
- TextFooler (used for the datasets and the cosine similarity matrix)
Usage
To train a classification model, use the run_glue.py script from huggingface transformers==2.9.0, for example:
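A possible fine-tuning command on the GLUE-style IMDB data might look like this (paths are hypothetical and flag names may differ slightly across transformers versions; check python run_glue.py --help):
python run_glue.py --model_name_or_path bert-base-uncased --task_name SST-2 --do_train --do_eval --data_dir data_defense/imdb --max_seq_length 256 --output_dir models/imdbclassifier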
To generate adversarial samples based on the masked-LM, run
python bertattack.py --data_path data_defense/imdb_1k.tsv --mlm_path bert-base-uncased --tgt_path models/imdbclassifier --use_sim_mat 1 --output_dir data_defense/imdb_logs.tsv --num_label 2 --use_bpe 1 --k 48 --start 0 --end 1000 --threshold_pred_score 0
- --data_path: We take the IMDB dataset as an example. Datasets can be obtained from TextFooler.
- --mlm_path: We use the BERT-base-uncased model as the masked-LM that generates substitution candidates.
- --tgt_path: We follow the official fine-tuning process in transformers to fine-tune BERT as the target model.
- --k: the number of candidate substitutes taken from the masked-LM for each position (48 in this example); see the sketch after this list.
- --output_dir: the output file for the generated adversarial samples.
- --start, --end: the slice of the dataset to process; in case the dataset is large, we provide a script for multi-thread processing (see below).
- --threshold_pred_score: a score threshold for cutting off masked-LM predictions that may not be suitable (details in Section 5.1 of the paper).
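The candidates come from the masked-LM's top-k predictions. The following is only a minimal sketch of that idea with transformers 2.9.0 (bertattack.py implements the full attack; the example sentence and k value here are illustrative):

    import torch
    from transformers import BertTokenizer, BertForMaskedLM

    # load the masked-LM that serves as the candidate generator (--mlm_path)
    tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
    mlm = BertForMaskedLM.from_pretrained('bert-base-uncased')
    mlm.eval()

    # mask one position and take the top-k predictions as substitution candidates
    ids = tokenizer.encode("the movie was [MASK] and boring", return_tensors='pt')
    mask_pos = (ids[0] == tokenizer.mask_token_id).nonzero()[0].item()
    with torch.no_grad():
        logits = mlm(ids)[0]                                  # transformers 2.9.0 models return tuples
    topk_ids = logits[0, mask_pos].topk(48).indices.tolist()  # k = 48 candidates
    print(tokenizer.convert_ids_to_tokens(topk_ids))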
Note
The datasets are re-formatted to the GLUE style.
Some configurations are hard-coded in the scripts; you can change them manually.
If you need the similar-words filter (--use_sim_mat 1), you need to download and process the cosine similarity matrix following TextFooler; a sketch is shown below. We only use this filter for sentiment classification tasks such as IMDB and Yelp.
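If you build the matrix yourself, the idea is simply row-normalized counter-fitted word vectors multiplied by their transpose. A rough sketch (file names are assumptions based on TextFooler; the full matrix is large):

    import numpy as np

    # load counter-fitted word vectors (see the TextFooler repo for the download link)
    words, vecs = [], []
    with open('counter-fitted-vectors.txt') as f:
        for line in f:
            parts = line.rstrip().split(' ')
            words.append(parts[0])
            vecs.append([float(x) for x in parts[1:]])

    emb = np.asarray(vecs, dtype=np.float32)
    emb /= np.linalg.norm(emb, axis=1, keepdims=True)  # unit-normalize each row
    cos_sim = emb @ emb.T                              # pairwise cosine similarities
    np.save('cos_sim_counter_fitting.npy', cos_sim)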
If you need to evaluate the USE results, you need to create the corresponding TensorFlow environment for USE (Universal Sentence Encoder).
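A quick way to check the USE setup (this uses the public TF2 hub module; the repo's own evaluation code may load USE differently):

    import tensorflow as tf
    import tensorflow_hub as hub

    # requires: pip install tensorflow tensorflow_hub
    use = hub.load("https://tfhub.dev/google/universal-sentence-encoder/4")
    a = tf.nn.l2_normalize(use(["the movie was great"]), axis=1)
    b = tf.nn.l2_normalize(use(["the film was great"]), axis=1)
    print(float(tf.reduce_sum(a * b, axis=1)[0]))  # cosine similarity of the two sentences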
For faster generation, you can turn off the BPE substitution (--use_bpe 0).
As described in the paper, we set thresholds to balance the attack success rate against the USE similarity score, for example:
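For instance, the same IMDB command with sub-word (BPE) substitution disabled and a non-zero prediction-score threshold (the value 3 here is only illustrative):
python bertattack.py --data_path data_defense/imdb_1k.tsv --mlm_path bert-base-uncased --tgt_path models/imdbclassifier --use_sim_mat 1 --output_dir data_defense/imdb_logs.tsv --num_label 2 --use_bpe 0 --k 48 --start 0 --end 1000 --threshold_pred_score 3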
The multi-thread processing uses the batchrun.py script.
You can run
cat cmd.txt | python batchrun.py --gpus 0,1,2,3
to simultaneously generate adversarial samples for the given dataset on several GPUs, for faster generation. We use the IMDB dataset as an example.
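cmd.txt holds one bertattack.py command per line, each covering a different --start/--end slice of the dataset and writing to its own output file; batchrun.py distributes these lines over the listed GPUs. A hypothetical two-line cmd.txt (output file names are illustrative):
python bertattack.py --data_path data_defense/imdb_1k.tsv --mlm_path bert-base-uncased --tgt_path models/imdbclassifier --use_sim_mat 1 --output_dir data_defense/imdb_logs_0.tsv --num_label 2 --use_bpe 1 --k 48 --start 0 --end 500 --threshold_pred_score 0
python bertattack.py --data_path data_defense/imdb_1k.tsv --mlm_path bert-base-uncased --tgt_path models/imdbclassifier --use_sim_mat 1 --output_dir data_defense/imdb_logs_1.tsv --num_label 2 --use_bpe 1 --k 48 --start 500 --end 1000 --threshold_pred_score 0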