Defense-GAN: Protecting Classifiers Against Adversarial Attacks Using Generative Models (published in ICLR2018)

Maya Kabkab

Last update: Dec 7, 2022

Overview

Defense-GAN: Protecting Classifiers Against Adversarial Attacks Using Generative Models

Pouya Samangouei*, Maya Kabkab*, Rama Chellappa

[*: authors contributed equally]

This repository contains the implementation of our ICLR-18 paper: Defense-GAN: Protecting Classifiers Against Adversarial Attacks Using Generative Models

If you find this code or the paper useful, please consider citing:

@inproceedings{defensegan,
  title={Defense-GAN: Protecting classifiers against adversarial attacks using generative models},
  author={Samangouei, Pouya and Kabkab, Maya and Chellappa, Rama},
  booktitle={International Conference on Learning Representations},
  year={2018}
}

Installation
Usage

Installation

Clone this repository:

git clone --recursive https://github.com/kabkabm/defensegan
cd defensegan
git submodule update --init --recursive

Install requirements:

pip install -r requirements.txt

Note: if you don't have a GPU, install the CPU version of TensorFlow 1.7.

Download the dataset and prepare data directory:

python download_dataset.py [mnist|f-mnist|celeba]

Create the output and debug directories:

mkdir output
mkdir debug

or link them to existing locations instead:

ln -s <path-to-output> output
ln -s <path-to-debug> debug

Usage

Train a GAN model

python train.py --cfg <path> --is_train <extra-args>

--cfg This can be set to either a .yml configuration file like the ones in experiments/cfgs, or an output directory path.
<extra-args> can be any parameter that is defined in the config file.

Training creates one directory per experiment inside the output directory, named after <config>, to save the model checkpoints. If <extra-args> differ from the values defined in <config>, the output directory name will reflect the difference.

A config file is saved into each experiment directory, so it can be loaded by setting <path> to that directory.
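
The override-and-naming behaviour described above can be pictured with a small Python sketch. It is not the repository's code: merge_overrides, experiment_dir, the default values, and the _<name>_<value> suffix scheme are all hypothetical and chosen only for illustration. The only assumption taken from this README is that a config under experiments/cfgs maps to a directory of the same relative name under output.

```python
import posixpath


def merge_overrides(config, overrides):
    """Apply command-line overrides on top of config-file defaults.

    Returns the merged config and the subset of overrides whose value
    actually differs from the default.
    """
    merged = dict(config)
    changed = {}
    for key, value in overrides.items():
        if key not in config:
            raise KeyError("parameter not defined in the config: %s" % key)
        if config[key] != value:
            changed[key] = value
        merged[key] = value
    return merged, changed


def experiment_dir(cfg_path, changed,
                   cfg_root="experiments/cfgs", output_root="output"):
    """Map a config path to an experiment directory.

    The suffix format is a hypothetical naming scheme, not the real one.
    """
    relative = posixpath.relpath(cfg_path, cfg_root)
    base = posixpath.splitext(relative)[0]
    suffix = "".join("_%s_%s" % (k, changed[k]) for k in sorted(changed))
    return posixpath.join(output_root, base + suffix)


# Illustrative defaults; the real config files define their own values.
defaults = {"batch_size": 50, "learning_rate": 0.0001}
merged, changed = merge_overrides(
    defaults, {"batch_size": 32, "learning_rate": 0.0001})

default_dir = experiment_dir("experiments/cfgs/gans/mnist.yml", {})
custom_dir = experiment_dir("experiments/cfgs/gans/mnist.yml", changed)
```

Only batch_size ends up in the directory name here, because learning_rate was passed with its default value.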

Example

After running

python train.py --cfg experiments/cfgs/gans/mnist.yml --is_train

output/gans/mnist will be created.

[optional] Save reconstructions and datasets into cache:

python train.py --cfg experiments/cfgs/<config> --save_recs
python train.py --cfg experiments/cfgs/<config> --save_ds

Example

After running the training code for mnist, the reconstructions and the dataset can be saved with:

python train.py --cfg output/gans/mnist --save_recs
python train.py --cfg output/gans/mnist --save_ds

As training goes on, sample outputs of the generator are written to debug/gans/<model_config>.

Black-box attacks

To perform black-box experiments, run blackbox.py [Tables 1 and 2 of the paper]:

python blackbox.py --cfg <path> \
    --results_dir <results_path> \
    --bb_model {A, B, C, D, E} \
    --sub_model {A, B, C, D, E} \
    --fgsm_eps <epsilon> \
    --defense_type {none|defense_gan|adv_tr} \
    [--train_on_recs or --online_training] \
    <optional-arguments>

--cfg is the path to the config file for training the iWGAN. This can also be the path to the output directory of the model.
--results_dir The path where the final results are saved in text files.
--bb_model The black-box model architectures that are used in Table 1 and Table 2.
--sub_model The substitute model architectures that are used in Table 1 and Table 2.
--defense_type specifies the type of defense to protect the classifier.
--train_on_recs or --online_training These parameters are optional. If they are set, the classifier will be trained on the reconstructions of Defense-GAN (e.g. in column Defense-GAN-Rec of Tables 1 and 2). Otherwise, the results are for Defense-GAN-Orig. Note that --online_training will take a while if --rec_iters, or L in the paper, is set to a large value.
<optional-arguments> A list of --<arg_name> <arg_val> pairs that match the hyperparameters defined in the config files (all lower case), plus the flags defined in blackbox.py. The most important ones are:
- --rec_iters The number of GD reconstruction iterations for Defense-GAN, or L in the paper.
- --rec_lr The learning rate of the reconstruction step.
- --rec_rr The number of random restarts for the reconstruction step, or R in the paper.
- --num_train The number of images to train the black-box model on. For debugging purposes set this to a small value.
- --num_test The number of images to test on. For debugging purposes set this to a small value.
- --debug This will save qualitative attack and reconstruction results in the debug directory and will not run the adversarial attack part of the code.
Refer to blackbox.py for more flag descriptions.
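
To make the roles of --rec_iters (L), --rec_lr and --rec_rr (R) concrete, here is a minimal NumPy sketch of a reconstruction step of this kind: for each of R random starting points, run L gradient-descent steps on the squared error between the generator output and the input, then keep the best result. This is an illustration under stated assumptions, not the code in this repository: the generator is a toy linear map instead of a trained GAN, and the function names and default values are hypothetical.

```python
import numpy as np


def reconstruct(x, generator, grad_fn, latent_dim,
                rec_iters=200, rec_lr=0.1, rec_rr=10, seed=0):
    """Approximately project x onto the range of the generator.

    rec_iters: gradient-descent steps per restart (L in the paper).
    rec_rr:    number of random restarts (R in the paper).
    rec_lr:    step size of each gradient-descent update.
    """
    rng = np.random.RandomState(seed)
    best_z, best_err = None, np.inf
    for _ in range(rec_rr):
        z = rng.randn(latent_dim)              # random starting point
        for _ in range(rec_iters):
            z = z - rec_lr * grad_fn(z, x)     # one gradient-descent step
        err = float(np.sum((generator(z) - x) ** 2))
        if err < best_err:                     # keep the best restart
            best_z, best_err = z, err
    return generator(best_z), best_err


# Toy stand-in for a trained generator: G(z) = W z (an assumption).
W = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])


def generator(z):
    return W.dot(z)


def grad_fn(z, x):
    # Gradient of ||G(z) - x||^2 with respect to z for the linear G above.
    return 2.0 * W.T.dot(W.dot(z) - x)


x_clean = generator(np.array([0.5, -1.0]))
# Perturbation chosen orthogonal to the range of G, so projection removes it.
x_perturbed = x_clean + 0.3 * np.array([1.0, 1.0, -1.0])
x_rec, rec_err = reconstruct(x_perturbed, generator, grad_fn, latent_dim=2)
```

In this toy setting x_rec lands back on x_clean. The cost grows with rec_iters times rec_rr generator evaluations per image, which is why large values of L slow down --online_training.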

Example

Row 1 of Table 1 Defense-GAN-Orig:

python blackbox.py --cfg output/gans/mnist \
    --results_dir defensegan \
    --bb_model A \
    --sub_model B \
    --fgsm_eps 0.3 \
    --defense_type defense_gan

For a quick look at how the script works, add --nb_epochs 1 --nb_epochs_s 1 --data_aug 1.
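
For readers unfamiliar with what --fgsm_eps controls, the following NumPy sketch shows a fast gradient sign method (FGSM) step on a toy logistic classifier. It is a stand-alone illustration, not the attack code used by blackbox.py; the classifier, the function names, and the clipping of pixel values to [0, 1] are assumptions.

```python
import numpy as np


def loss_gradient(x, y, w):
    """Gradient w.r.t. x of the logistic loss log(1 + exp(-y * w.x)), y in {-1, +1}."""
    margin = y * np.dot(w, x)
    return -y * w / (1.0 + np.exp(margin))


def fgsm(x, grad, eps, clip_min=0.0, clip_max=1.0):
    """Move every input coordinate by eps in the direction that increases the loss."""
    return np.clip(x + eps * np.sign(grad), clip_min, clip_max)


w = np.array([1.0, -2.0, 0.5])      # toy classifier weights (assumption)
x = np.array([0.5, 0.5, 0.5])       # toy input with values in [0, 1]
y = 1                               # true label

x_adv = fgsm(x, loss_gradient(x, y, w), eps=0.3)
```

Every coordinate moves by exactly eps here, so the perturbation has L-infinity norm 0.3, and the classifier score w.x for the true label drops from -0.25 to -1.3.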

White-box attacks

To test Defense-GAN against white-box attacks, run whitebox.py [Tables 4, 5, 12 of the paper]:

python whitebox.py --cfg <path> \
       --results_dir <results-dir> \
       --attack_type {fgsm, rand_fgsm, cw} \
       --defense_type {none|defense_gan|adv_tr} \
       --model {A, B, C, D} \
       [--train_on_recs or --online_training] \
       <optional-arguments>

--cfg is the path to the config file for training the iWGAN. This can also be the path to the output directory of the model.
--results_dir The path where the final results are saved in text files.
--defense_type specifies the type of defense to protect the classifier.
--train_on_recs or --online_training These parameters are optional. If they are set, the classifier will be trained on the reconstructions of Defense-GAN (e.g. in column Defense-GAN-Rec of Tables 1 and 2). Otherwise, the results are for Defense-GAN-Orig. Note that --online_training will take a while if --rec_iters, or L in the paper, is set to a large value.
<optional-arguments> A list of --<arg_name> <arg_val> pairs that match the hyperparameters defined in the config files (all lower case), plus the flags defined in whitebox.py. The most important ones are:
- --rec_iters The number of GD reconstruction iterations for Defense-GAN, or L in the paper.
- --rec_lr The learning rate of the reconstruction step.
- --rec_rr The number of random restarts for the reconstruction step, or R in the paper.
- --num_test The number of images to test on. For debugging purposes set this to a small value.
Refer to whitebox.py for more flag descriptions.
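
The rand_fgsm attack type is not described further in this README. As a rough guide, the sketch below assumes the usual RAND+FGSM formulation: take a small random sign step of size alpha first, then an FGSM step of size eps - alpha from the randomized point. The toy classifier, the function names, the alpha value, and the clipping to [0, 1] are assumptions, and whitebox.py may differ in its details.

```python
import numpy as np


def loss_gradient(x, y, w):
    """Gradient w.r.t. x of the logistic loss log(1 + exp(-y * w.x)), y in {-1, +1}."""
    margin = y * np.dot(w, x)
    return -y * w / (1.0 + np.exp(margin))


def rand_fgsm(x, grad_fn, eps, alpha, rng, clip_min=0.0, clip_max=1.0):
    """Random sign step of size alpha, then a gradient sign step of size eps - alpha."""
    if not 0.0 <= alpha < eps:
        raise ValueError("alpha must satisfy 0 <= alpha < eps")
    x_rand = np.clip(x + alpha * np.sign(rng.randn(*x.shape)),
                     clip_min, clip_max)
    x_adv = x_rand + (eps - alpha) * np.sign(grad_fn(x_rand))
    return np.clip(x_adv, clip_min, clip_max)


w = np.array([1.0, -2.0, 0.5])      # toy classifier weights (assumption)
x = np.array([0.5, 0.5, 0.5])       # toy input with values in [0, 1]
y = 1                               # true label

rng = np.random.RandomState(0)
x_adv = rand_fgsm(x, lambda v: loss_gradient(v, y, w),
                  eps=0.3, alpha=0.05, rng=rng)
```

The two steps together never move a coordinate by more than eps, so the total perturbation stays within the same L-infinity budget as plain FGSM.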

Example

First row of Table 4:

python whitebox.py --cfg <path> \
       --results_dir whitebox \
       --attack_type fgsm \
       --defense_type defense_gan \
       --model A

If you want to quickly see how the scripts work, add the following flags:

--nb_epochs 1 --num_tests 400

Comments

TypeError: load() got an unexpected keyword argument 'transform_type'

@po0ya @kabkabm, thank you very much for the work.

I am trying to deter physical adversarial attacks using defensegan and trained the GAN on the celebA dataset. However, when I tried to reproduce the blackbox or whitebox attack, I got stuck at this error: "TypeError: load() got an unexpected keyword argument 'transform_type'"
Any idea how to fix this? I am running the code on a Google Cloud VM with Cuda 7.05. I trained the GAN up to 135000 iterations.

2018-11-04 04:24:57.427800: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-11-04 04:24:59.468111: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:964] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-11-04 04:24:59.468505: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1411] Found device 0 with properties:
name: Tesla K80 major: 3 minor: 7 memoryClockRate(GHz): 0.8235
pciBusID: 0000:00:04.0
totalMemory: 11.17GiB freeMemory: 11.10GiB
2018-11-04 04:24:59.468537: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1490] Adding visible gpu devices: 0
2018-11-04 04:24:59.777777: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-11-04 04:24:59.777830: I tensorflow/core/common_runtime/gpu/gpu_device.cc:977] 0
2018-11-04 04:24:59.777839: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 0: N
2018-11-04 04:24:59.778142: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1103] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10758 MB memory) -> physical GPU (device: 0, name: Tesla K80, pci bus id: 0000:00:04.0, compute capability: 3.7)
[*] Checkpoint is read successfully from output/gans/celeba
2018-11-04 04:24:59.805142: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1490] Adding visible gpu devices: 0
2018-11-04 04:24:59.805183: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-11-04 04:24:59.805200: I tensorflow/core/common_runtime/gpu/gpu_device.cc:977] 0
2018-11-04 04:24:59.805210: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 0: N
2018-11-04 04:24:59.805428: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1103] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10758 MB memory) -> physical GPU (device: 0, name: Tesla K80, pci bus id: 0000:00:04.0, compute capability: 3.7)
Traceback (most recent call last):
  File "blackbox.py", line 762, in <module>
    tf.app.run(main=main_cfg)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 125, in run
    _sys.exit(main(argv))
  File "blackbox.py", line 761, in <lambda>
    main_cfg = lambda x: main(cfg, x)
  File "blackbox.py", line 685, in main
    defense_type=FLAGS.defense_type)
  File "blackbox.py", line 419, in blackbox
    get_cached_gan_data(gan, test_on_dev, orig_data_flag=True)
  File "blackbox.py", line 357, in get_cached_gan_data
    orig_data=orig_data_flag,
  File "blackbox.py", line 245, in get_celeba
    ds_test.load(split=dev_name, transform_type=1)
TypeError: load() got an unexpected keyword argument 'transform_type'

opened by bibin-sebastian 16

Whitebox attack not working?

Hi, I was running your code (so neat and nice), but the FGSM attack does not seem to work in the white-box setting.

I evaluated adv_x separately to see what it looks like, and it was very clean. I checked the gradient of model.get_preds(images_pl) with respect to images_pl and it was all zero.

Am I doing something wrong?

opened by kwonchungli 6

Errors in download_dataset.py file

Hi, there seems to be an issue with the download function for the f-mnist dataset: the dataset is not downloaded, and no log messages are printed.

opened by Vaibhavi24 3

White box attack not working

Hi, I'm getting an error after running whitebox.py with the configuration given in the README. The error occurs at line 210, "adv_x = attack_obj.generate(images_pl, **attack_params)". Please help me solve this issue. Note that blackbox.py is working fine. Here is the screenshot for the error.

Thank you.

opened by shubham-malaviya 2

matching numbers from the paper

I'm trying to reproduce the Table 4, Row 1 results of the paper.

The setup is white-box, Model A, FGSM attack (epsilon 0.3), no defense, 50 epochs, 1e-3 learning rate, MNIST, Adam optimizer. While the paper reports 99.7 classifier accuracy and 0.217 in the case of no defense, the code produces around 99.4 classifier accuracy and 0.16 in the case of no defense.

Can you please tell us what changes are needed to get close to the original numbers? The only difference I can notice is that the complete data (60K) is used in the code, without any validation data.

opened by krishnakanthnakka 1

absl.flags._exceptions.UnrecognizedFlagError: Unknown command line flag 'cfg'. Did you mean: cfg_path ?

I followed your instructions, but when I try to train I get: absl.flags._exceptions.UnrecognizedFlagError: Unknown command line flag 'cfg'. Did you mean: cfg_path ?

This problem has been troubling me. How can I solve it? I hope to get your answer. Thanks.

opened by 1993cathyzhao1993 1

Error while running in python 3.5

absl.flags._exceptions.UnrecognizedFlagError: Unknown command line flag 'cfg'. Did you mean: cfg_path ?

This error arises while running "python train.py --cfg experiments/cfgs/gans/mnist.yml --is_train". It arises at "C:\Users\ELCOT\defensegan\utils\config.py", line 77, in load_config, at "if hasattr(flags, k.lower())".

opened by naveeen684 3

The distortion between benign images and adversarial images

When I run FGSM to attack the model with Defense-GAN, I find that Defense-GAN does indeed defend against the attack. However, the distortion between benign images and adversarial images is zero, which means that the adversarial examples are the same as the benign examples. Is there something wrong? Or could FGSM not get the gradient because of Defense-GAN?

opened by xiaosen-wang 0

how to use my own picture dataset rather than MNIST or CelebA dataset?

I want to know how to use my own picture dataset to get a model that defends against adversarial attacks. How could I change the code to train on my own dataset?

opened by jerryhero 1

