Better Safe Than Sorry: Preventing Delusive Adversaries with Adversarial Training
Code for NeurIPS 2021 paper "Better Safe Than Sorry: Preventing Delusive Adversaries with Adversarial Training" by Lue Tao, Lei Feng, Jinfeng Yi, Sheng-Jun Huang, and Songcan Chen.
This repository contains an implementation of the attacks (P1~P5) and the defense (adversarial training) in the paper.
Requirements
Our code relies on PyTorch, which will be automatically installed when you follow the instructions below.
conda create -n delusion python=3.8
conda activate delusion
pip install -r requirements.txt
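If you want to sanity-check the environment before running anything, the short snippet below (not part of the repository) verifies that PyTorch and torchvision are importable and whether CUDA is visible:

```python
# Quick environment check; this helper is not part of the repository.
import torch
import torchvision

print("PyTorch version:", torch.__version__)
print("torchvision version:", torchvision.__version__)
print("CUDA available:", torch.cuda.is_available())
```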
Running Experiments
- Pre-train a standard model on CIFAR-10 (the dataset will be downloaded automatically).
python main.py --train_loss ST
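For orientation, standard (ST) pre-training is ordinary empirical risk minimization with cross-entropy on the clean CIFAR-10 training set. The sketch below illustrates this step; the architecture, optimizer, and schedule are illustrative assumptions rather than the exact settings of main.py.

```python
# Minimal sketch of standard (ST) training on clean CIFAR-10. The architecture,
# optimizer, and schedule are illustrative assumptions, not the exact settings
# of main.py (torchvision's ResNet-18 is used here only for brevity).
import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as T

device = "cuda" if torch.cuda.is_available() else "cpu"

transform = T.Compose([T.RandomCrop(32, padding=4), T.RandomHorizontalFlip(), T.ToTensor()])
train_set = torchvision.datasets.CIFAR10("./data", train=True, download=True, transform=transform)
train_loader = torch.utils.data.DataLoader(train_set, batch_size=128, shuffle=True, num_workers=2)

model = torchvision.models.resnet18(num_classes=10).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4)
criterion = nn.CrossEntropyLoss()

model.train()
for epoch in range(100):
    for x, y in train_loader:
        x, y = x.to(device), y.to(device)
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()
```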
- Generate perturbed training data.
python poison.py --poison_type P1
python poison.py --poison_type P2
python poison.py --poison_type P3
python poison.py --poison_type P4
python poison.py --poison_type P5
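The P1–P5 attacks are implemented in poison.py and use different objectives (see the paper for their definitions). Purely as an illustration of the general recipe, the sketch below crafts bounded, clean-label perturbations of training images with PGD against the pre-trained model; the generic error-maximizing objective and all names here are assumptions and do not reproduce any specific P1–P5 attack.

```python
# Illustrative sketch only: crafting bounded, clean-label perturbations of
# *training* images with PGD against a pre-trained model. The generic
# error-maximizing objective below does not reproduce any of P1-P5, whose
# objectives are defined in poison.py.
import torch
import torch.nn as nn

def perturb_batch(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """Return perturbed images x + delta with ||delta||_inf <= eps; labels y stay unchanged."""
    model.eval()
    criterion = nn.CrossEntropyLoss()
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        loss = criterion(model(x + delta), y)
        loss.backward()
        # Gradient step on the loss, projected back into the L_inf ball and the valid image range.
        delta.data = (delta.data + alpha * delta.grad.sign()).clamp(-eps, eps)
        delta.data = (x + delta.data).clamp(0, 1) - x
        delta.grad.zero_()
    return (x + delta).detach()
```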
- Visualize the perturbed training data (optional).
tensorboard --logdir ./results
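Pointing TensorBoard at ./results displays whatever summaries the scripts above have written there. If you instead want to write your own image grids for inspection, a minimal sketch with torch.utils.tensorboard is shown below; the log directory and the placeholder tensor are assumptions.

```python
# Minimal sketch: writing a grid of (perturbed) CIFAR-10 images to TensorBoard.
# The log directory and the random placeholder tensor are assumptions; replace
# `images` with the perturbed images you want to inspect.
import torch
import torchvision
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter(log_dir="./results/inspect")
images = torch.rand(64, 3, 32, 32)                      # placeholder batch in [0, 1]
grid = torchvision.utils.make_grid(images, nrow=8)
writer.add_image("perturbed_samples", grid, global_step=0)
writer.close()
```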
- Standard training on the perturbed data.
python main.py --train_loss ST --poison_type P1
python main.py --train_loss ST --poison_type P2
python main.py --train_loss ST --poison_type P3
python main.py --train_loss ST --poison_type P4
python main.py --train_loss ST --poison_type P5
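Standard training on the perturbed data is the same procedure as the clean pre-training step, only with the clean training set replaced by the perturbed one. The sketch below shows how a perturbed dataset stored as tensors could be swapped in; the file name and format are assumptions, as poison.py may store the data differently.

```python
# Sketch: swapping in a perturbed dataset stored as tensors. The file name
# "poison_P5.pt" and its format (a dict with "images" in [0, 1] and "labels")
# are hypothetical; poison.py may store the data differently.
import torch
from torch.utils.data import TensorDataset, DataLoader

data = torch.load("poison_P5.pt")
poisoned_set = TensorDataset(data["images"], data["labels"])
poison_loader = DataLoader(poisoned_set, batch_size=128, shuffle=True)
# ...then reuse the standard training loop from the pre-training sketch with poison_loader.
```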
- Adversarial training on the perturbed data.
python main.py --train_loss AT --poison_type P1
python main.py --train_loss AT --poison_type P2
python main.py --train_loss AT --poison_type P3
python main.py --train_loss AT --poison_type P4
python main.py --train_loss AT --poison_type P5
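Adversarial training (AT) follows the usual min-max recipe: an inner PGD attack perturbs each batch within an L∞ ball, and the model is then updated on the resulting adversarial examples. A minimal sketch of one AT step is below; the hyperparameters are common defaults, not necessarily those used in main.py.

```python
# Minimal sketch of one PGD adversarial training (AT) step on a batch (x, y).
# eps/alpha/steps are common L_inf defaults, not necessarily those of main.py.
import torch
import torch.nn as nn

def at_step(model, optimizer, x, y, eps=8/255, alpha=2/255, steps=10):
    criterion = nn.CrossEntropyLoss()
    # Inner maximization: PGD within the L_inf ball of radius eps.
    delta = torch.empty_like(x).uniform_(-eps, eps)
    delta.requires_grad_(True)
    for _ in range(steps):
        loss = criterion(model(x + delta), y)
        loss.backward()
        delta.data = (delta.data + alpha * delta.grad.sign()).clamp(-eps, eps)
        delta.data = (x + delta.data).clamp(0, 1) - x
        delta.grad.zero_()
    # Outer minimization: update the model on the adversarial examples.
    optimizer.zero_grad()
    adv_loss = criterion(model(x + delta.detach()), y)
    adv_loss.backward()
    optimizer.step()
    return adv_loss.item()
```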
Results
Figure 1: An illustration of delusive attacks and adversarial training. Left: random samples from the CIFAR-10 training set: the original training set D and the perturbed training set D_P5 generated by the P5 attack. Right: natural accuracy on the CIFAR-10 test set for models trained with: i) standard training on D; ii) adversarial training on D; iii) standard training on D_P5; iv) adversarial training on D_P5. While standard training on D_P5 leads to poor generalization on D, adversarial training recovers most of the lost accuracy.
Table 1: Mean and standard deviation of test accuracy on the CIFAR-10 dataset. The deviations of the defense (i.e., adversarial training) are small (< 0.50%) and hardly affect the results; in contrast, the results of standard training are relatively unstable.
Training method \ Training data | P1 | P2 | P3 | P4 | P5 |
---|---|---|---|---|---|
Standard training | 37.87±0.94 | 74.24±1.32 | 15.14±2.10 | 23.69±2.98 | 11.76±0.72 |
Adversarial training | 86.59±0.30 | 89.50±0.21 | 88.12±0.39 | 88.15±0.15 | 88.12±0.43 |
Key takeaways: Our theoretical justifications in the paper, along with the empirical results, suggest that adversarial training is a principled and promising defense against delusive attacks.
Citing this work
@inproceedings{tao2021better,
title={Better Safe Than Sorry: Preventing Delusive Adversaries with Adversarial Training},
author={Tao, Lue and Feng, Lei and Yi, Jinfeng and Huang, Sheng-Jun and Chen, Songcan},
booktitle={Advances in Neural Information Processing Systems (NeurIPS)},
year={2021}
}