Certified Patch Robustness via Smoothed Vision Transformers

Overview

Certified Patch Robustness via Smoothed Vision Transformers

This repository contains the code for replicating the results of our paper:

Certified Patch Robustness via Smoothed Vision Transformers
Hadi Salman*, Saachi Jain*, Eric Wong*, Aleksander Madry

Paper
Blog post Part I.
Blog post Part II.

    @article{salman2021certified,
        title={Certified Patch Robustness via Smoothed Vision Transformers},
        author={Hadi Salman and Saachi Jain and Eric Wong and Aleksander Madry},
        booktitle={ArXiv preprint arXiv:2110.07719},
        year={2021}
    }

Getting started

Our code relies on the MadryLab public robustness library, which will be automatically installed when you follow the instructions below.

  1. Clone our repo: git clone https://github.mit.edu/hady/smoothed-vit

  2. Install dependencies:

    conda create -n smoothvit python=3.8
    conda activate smoothvit
    pip install -r requirements.txt
    

Full pipeline for building smoothed ViTs.

Now, we will walk you through the steps to create a smoothed ViT on the CIFAR-10 dataset. Similar steps can be followed for other datasets.

The entry point of our code is main.py (see the file for a full description of arguments).

First we will train the base classifier with ablations as data augmentation. Then we will apply derandomizd smoothing to build a smoothed version of the model which is certifiably robust.

Training the base classifier

The first step is to train the base classifier (here a ViT-Tiny) with ablations.

python src/main.py \
      --dataset cifar10 \
      --data /tmp \
      --arch deit_tiny_patch16_224 \
      --pytorch-pretrained \
      --out-dir OUTDIR \
      --exp-name demo \
      --epochs 30 \
      --lr 0.01 \
      --step-lr 10 \
      --batch-size 128 \
      --weight-decay 5e-4 \
      --adv-train 0 \
      --freeze-level -1 \
      --drop-tokens \
      --cifar-preprocess-type simple224 \
      --ablate-input \
      --ablation-type col \
      --ablation-size 4

Once training is done, the mode is saved in OUTDIR/demo/.

Certifying the smoothed classifier

Now we are ready to apply derandomized smoothing to obtain certificates for each datapoint against adversarial patches. To do so, simply run:

python src/main.py \
      --dataset cifar10 \
      --data /tmp \
      --arch deit_tiny_patch16_224 \
      --out-dir OUTDIR \
      --exp-name demo \
      --batch-size 128 \
      --adv-train 0 \
      --freeze-level -1 \
      --drop-tokens \
      --cifar-preprocess-type simple224 \
      --resume \
      --eval-only 1 \
      --certify \
      --certify-out-dir OUTDIR_CERT \
      --certify-mode col \
      --certify-ablation-size 4 \
      --certify-patch-size 5

This will calculate the standard and certified accuracies of the smoothed model. The results will be dumped into OUTDIR_CERT/demo/.

That's it! Now you can replicate all the results of our paper.

Download our ImageNet models

If you find our pretrained models useful, please consider citing our work.

Models trained with column ablations

Model Ablation Size = 19
ResNet-18 LINK
ResNet-50 LINK
WRN-101-2 LINK
ViT-T LINK
ViT-S LINK
ViT-B LINK

We have uploaded the most important models. If you need any other model (for the sweeps for example) please let us know and we are happy to provide!

Maintainers

You might also like...
Patch SVDD for Image anomaly detection

Patch SVDD Patch SVDD for Image anomaly detection. Paper: https://arxiv.org/abs/2006.16067 (published in ACCV 2020). Original Code : https://github.co

Split your patch similarly to `git add -p` but supporting multiple buckets
Split your patch similarly to `git add -p` but supporting multiple buckets

split-patch.py This is git add -p on steroids for patches. Given a my.patch you can run ./split-patch.py my.patch You can choose in which bucket to p

Multivariate Time Series Forecasting with efficient Transformers. Code for the paper
Multivariate Time Series Forecasting with efficient Transformers. Code for the paper "Long-Range Transformers for Dynamic Spatiotemporal Forecasting."

Spacetimeformer Multivariate Forecasting This repository contains the code for the paper, "Long-Range Transformers for Dynamic Spatiotemporal Forecast

Explainability for Vision Transformers (in PyTorch)
Explainability for Vision Transformers (in PyTorch)

Explainability for Vision Transformers (in PyTorch) This repository implements methods for explainability in Vision Transformers

PyTorch Implementation of CvT: Introducing Convolutions to Vision Transformers
PyTorch Implementation of CvT: Introducing Convolutions to Vision Transformers

CvT: Introducing Convolutions to Vision Transformers Pytorch implementation of CvT: Introducing Convolutions to Vision Transformers Usage: img = torch

Implementation of various Vision Transformers I found interesting

Implementation of various Vision Transformers I found interesting

 Twins: Revisiting the Design of Spatial Attention in Vision Transformers
Twins: Revisiting the Design of Spatial Attention in Vision Transformers

Twins: Revisiting the Design of Spatial Attention in Vision Transformers Very recently, a variety of vision transformer architectures for dense predic

PyTorch code for Vision Transformers training with the Self-Supervised learning method DINO
PyTorch code for Vision Transformers training with the Self-Supervised learning method DINO

Self-Supervised Vision Transformers with DINO PyTorch implementation and pretrained models for DINO. For details, see Emerging Properties in Self-Supe

Exploring whether attention is necessary for vision transformers
Exploring whether attention is necessary for vision transformers

Do You Even Need Attention? A Stack of Feed-Forward Layers Does Surprisingly Well on ImageNet Paper/Report TL;DR We replace the attention layer in a v

Comments
  • A minor issue

    A minor issue

    Hi, very solid work. But I think there is a typo in the class "MuskProcessor" of /src/utils/custom_models/preprocess.py. "ones_mask = torch.where(ones_mask.view(-1) > 0)[0]" this line choose those patches which intersect with ablation columns or blocks, and they belong to [0, 195]. To keep the class token always, the code just adds a 0 at the beginning of ones_mask in the next line. But I think we should first change the range from [0, 195] to [1, 196], then add the 0 :)

    opened by ljb121002 2
Owner
Madry Lab
Towards a Principled Science of Deep Learning
Madry Lab
The AWS Certified SysOps Administrator

The AWS Certified SysOps Administrator – Associate (SOA-C02) exam is intended for system administrators in a cloud operations role who have at least 1 year of hands-on experience with deployment, management, networking, and security on AWS.

Aiden Pearce 32 Dec 11, 2022
STYLER: Style Factor Modeling with Rapidity and Robustness via Speech Decomposition for Expressive and Controllable Neural Text to Speech

STYLER: Style Factor Modeling with Rapidity and Robustness via Speech Decomposition for Expressive and Controllable Neural Text to Speech Keon Lee, Ky

Keon Lee 114 Dec 12, 2022
Implementation of Transformer in Transformer, pixel level attention paired with patch level attention for image classification, in Pytorch

Transformer in Transformer Implementation of Transformer in Transformer, pixel level attention paired with patch level attention for image c

Phil Wang 272 Dec 23, 2022
HyperSeg: Patch-wise Hypernetwork for Real-time Semantic Segmentation Official PyTorch Implementation

: We present a novel, real-time, semantic segmentation network in which the encoder both encodes and generates the parameters (weights) of the decoder. Furthermore, to allow maximal adaptivity, the weights at each decoder block vary spatially. For this purpose, we design a new type of hypernetwork, composed of a nested U-Net for drawing higher level context features

Yuval Nirkin 182 Dec 14, 2022
Code for Learning Manifold Patch-Based Representations of Man-Made Shapes, in ICLR 2021.

LearningPatches | Webpage | Paper | Video Learning Manifold Patch-Based Representations of Man-Made Shapes Dmitriy Smirnov, Mikhail Bessmeltsev, Justi

Dima Smirnov 22 Nov 14, 2022
code for `Look Closer to Segment Better: Boundary Patch Refinement for Instance Segmentation`

Look Closer to Segment Better: Boundary Patch Refinement for Instance Segmentation (CVPR 2021) Introduction PBR is a conceptually simple yet effective

H.Chen 143 Jan 5, 2023
Code for the CVPR2021 paper "Patch-NetVLAD: Multi-Scale Fusion of Locally-Global Descriptors for Place Recognition"

Patch-NetVLAD: Multi-Scale Fusion of Locally-Global Descriptors for Place Recognition This repository contains code for the CVPR2021 paper "Patch-NetV

QVPR 368 Jan 6, 2023
PyTorch implementation of adversarial patch

adversarial-patch PyTorch implementation of adversarial patch This is an implementation of the Adversarial Patch paper. Not official and likely to hav

Jamie Hayes 172 Nov 29, 2022
DPT: Deformable Patch-based Transformer for Visual Recognition (ACM MM2021)

DPT This repo is the official implementation of DPT: Deformable Patch-based Transformer for Visual Recognition (ACM MM2021). We provide code and model

CASIA-IVA-Lab 111 Dec 21, 2022
FPGA: Fast Patch-Free Global Learning Framework for Fully End-to-End Hyperspectral Image Classification

FPGA & FreeNet Fast Patch-Free Global Learning Framework for Fully End-to-End Hyperspectral Image Classification by Zhuo Zheng, Yanfei Zhong, Ailong M

Zhuo Zheng 92 Jan 3, 2023