Code for the paper: Learning Adversarially Robust Representations via Worst-Case Mutual Information Maximization (https://arxiv.org/abs/2002.11798)

Overview

Representation Robustness Evaluations

Our implementation is based on code from MadryLab's robustness package and Devon Hjelm's Deep InfoMax. For all the scripts, we assume the working directory to be the root folder of our code.

Get ready a pre-trained model

We have two methods to pre-train a model for evaluation. Method 1: Follow instructions from MadryLab's robustness package to train a standard model or a robust model with a given PGD setting. For example, to train a robust ResNet18 with l-inf constraint of eps 8/255

python -m robustness.main --dataset cifar \
--data /path/to/dataset \
--out-dir /path/to/output \
--arch resnet18 \
--epoch 150 \
--adv-train 1 \
--attack-lr=1e-2 --constraint inf --eps 8/255 \
--exp-name resnet18_adv

Method 2: Use our wrapped code and set task=train-model. Optional commands:

  • --classifier-loss = robust (adversarial training) / standard (standard training)
  • --arch = baseline_mlp (baseline-h with last two layer as mlp) / baseline_linear (baseline-h with last two layer as linear classifier) / vgg16 / ...

Our results presented in Figure 1 and 2 use model architecture: baseline_mlp, resnet18, vgg16, resnet50, DenseNet121. For example, to train a baseline-h model with l-inf constraint of eps 8/255

python main.py --dataset cifar \
--task train-model \
--data /path/to/dataset \
--out-dir /path/to/output \
--arch baseline_mlp \
--epoch 500 --lr 1e-4 --step-lr 10000 --workers 2 \
--attack-lr=1e-2 --constraint inf --eps 8/255 \
--classifier-loss robust \
--exp-name baseline_mlp_adv

To parse the store file, run

from cox import store
s = store.Store('/path/to/model/parent-folder', 'model-folder')
print(s['logs'].df)
s.close()

 

Evaluate the representation robustness (Figure 1, 2, 3)

Set task=estimate-mi to load a pre-trained model and test the mutual information between input and representation. By subtracting the normal-case and worst-case mutual information we have the representation vulnerability. Optional commands:

  • --estimator-loss = worst (worst-case mutual information estimation) / normal (normal-case mutual information estimation)

For example, to test the worst-case mutual information of ResNet18, run

python main.py --dataset cifar \
--data /path/to/dataset \
--out-dir /path/to/output \
--task estimate-mi \
--representation-type layer \
--estimator-loss worst \
--arch resnet18 \
--epoch 500 --lr 1e-4 --step-lr 10000 --workers 2 \
--attack-lr=1e-2 --constraint inf --eps 8/255 \
--resume /path/to/saved/model/checkpoint.pt.best \
--exp-name estimator_worst__resnet18_adv \
--no-store

or to test on the baseline-h, run

python main.py --dataset cifar \
--data /path/to/dataset \
--out-dir /path/to/output \
--task estimate-mi \
--representation-type layer \
--estimator-loss worst \
--arch baseline_mlp \
--epoch 500 --lr 1e-4 --step-lr 10000 --workers 2 \
--attack-lr=1e-2 --constraint inf --eps 8/255 \
--resume /path/to/saved/model/checkpoint.pt.best \
--exp-name estimator_worst__baseline_mlp_adv \
--no-store

 

Learn Representations

Set task=train-encoder to learn a representation using our training principle. For train by worst-case mutual information maximization, we can use other lower-bound of mutual information as surrogate for our target, which may have slightly better empirical performance (e.g. nce). Please refer to arxiv.org/abs/1808.06670 for more information. Optional commands:

  • --estimator-loss = worst (worst-case mutual information maximization) / normal (normal-case mutual information maximization)
  • --va-mode = dv (Donsker-Varadhan representation) / nce (Noise-Contrastive Estimation) / fd (fenchel dual representation)
  • --arch = basic_encoder (Hjelm et al.) / ...

Example:

python main.py --dataset cifar \
--task train-encoder \
--data /path/to/dataset \
--out-dir /path/to/output \
--arch basic_encoder \
--representation-type layer \
--estimator-loss worst \
--epoch 500 --lr 1e-4 --step-lr 10000 --workers 2 \
--attack-lr=1e-2 --constraint inf --eps 8/255 \
--exp-name learned_encoder

 

Test on Downstream Classifications (Figure 4, 5, 6; Table 1, 3)

Set task=train-classifier to test the classification accuracy of learned representations. Optional commands:

  • --classifier-loss = robust (adversarial classification) / standard (standard classification)
  • --classifier-arch = mlp (mlp as downstream classifier) / linear (linear classifier as downstream classifier)

Example:

python main.py --dataset cifar \
--task train-classifier \
--data /path/to/dataset \
--out-dir /path/to/output \
--arch basic_encoder \
--classifier-arch mlp \
--representation-type layer \
--classifier-loss robust \
--epoch 500 --lr 1e-4 --step-lr 10000 --workers 2 \
--attack-lr=1e-2 --constraint inf --eps 8/255 \
--resume /path/to/saved/model/checkpoint.pt.latest \
--exp-name test_learned_encoder
You might also like...
This repository contains the code used for Predicting Patient Outcomes with Graph Representation Learning (https://arxiv.org/abs/2101.03940).
This repository contains the code used for Predicting Patient Outcomes with Graph Representation Learning (https://arxiv.org/abs/2101.03940).

Predicting Patient Outcomes with Graph Representation Learning This repository contains the code used for Predicting Patient Outcomes with Graph Repre

This repository contains the official implementation code of the paper Improving Multimodal Fusion with Hierarchical Mutual Information Maximization for Multimodal Sentiment Analysis, accepted at EMNLP 2021.
This repository contains the official implementation code of the paper Improving Multimodal Fusion with Hierarchical Mutual Information Maximization for Multimodal Sentiment Analysis, accepted at EMNLP 2021.

MultiModal-InfoMax This repository contains the official implementation code of the paper Improving Multimodal Fusion with Hierarchical Mutual Informa

Official implementation of the paper Image Generators with Conditionally-Independent Pixel Synthesis https://arxiv.org/abs/2011.13775
Official implementation of the paper Image Generators with Conditionally-Independent Pixel Synthesis https://arxiv.org/abs/2011.13775

CIPS -- Official Pytorch Implementation of the paper Image Generators with Conditionally-Independent Pixel Synthesis Requirements pip install -r requi

Unofficial Tensorflow-Keras implementation of Fastformer based on paper [Fastformer: Additive Attention Can Be All You Need](https://arxiv.org/abs/2108.09084).
Unofficial Tensorflow-Keras implementation of Fastformer based on paper [Fastformer: Additive Attention Can Be All You Need](https://arxiv.org/abs/2108.09084).

Fastformer-Keras Unofficial Tensorflow-Keras implementation of Fastformer based on paper Fastformer: Additive Attention Can Be All You Need. Tensorflo

source code for https://arxiv.org/abs/2005.11248 "Accelerating Antimicrobial Discovery with Controllable Deep Generative Models and Molecular Dynamics"

Accelerating Antimicrobial Discovery with Controllable Deep Generative Models and Molecular Dynamics This work will be published in Nature Biomedical

Tensorflow implementation of Semi-supervised Sequence Learning (https://arxiv.org/abs/1511.01432)
Tensorflow implementation of Semi-supervised Sequence Learning (https://arxiv.org/abs/1511.01432)

Transfer Learning for Text Classification with Tensorflow Tensorflow implementation of Semi-supervised Sequence Learning(https://arxiv.org/abs/1511.01

Pytorch implementation of Each Part Matters: Local Patterns Facilitate Cross-view Geo-localization https://arxiv.org/abs/2008.11646
Pytorch implementation of Each Part Matters: Local Patterns Facilitate Cross-view Geo-localization https://arxiv.org/abs/2008.11646

[TCSVT] Each Part Matters: Local Patterns Facilitate Cross-view Geo-localization LPN [Paper] NEWs Prerequisites Python 3.6 GPU Memory = 8G Numpy 1.

https://arxiv.org/abs/2102.11005
https://arxiv.org/abs/2102.11005

LogME LogME: Practical Assessment of Pre-trained Models for Transfer Learning How to use Just feed the features f and labels y to the function, and yo

ISTR: End-to-End Instance Segmentation with Transformers (https://arxiv.org/abs/2105.00637)

This is the project page for the paper: ISTR: End-to-End Instance Segmentation via Transformers, Jie Hu, Liujuan Cao, Yao Lu, ShengChuan Zhang, Yan Wa

Comments
  • Bugs

    Bugs

    You have setting optimizer to -1 in your train_unsup.py file Line 292-299

    sd_info = {
                'model': atm.state_dict(),
                'optimizer': -1,
                'schedule': -1,
                #'optimizer': opt.state_dict(),
                #'schedule': (schedule and schedule.state_dict()),
                'epoch': epoch + 1
            }
    

    This will cause that models trained in your scripts:

    python main.py --dataset cifar \
        --task train-model \
        --data ./data \
        --out-dir ./log \
        --arch baseline_mlp \
        --epoch 50 --lr 1e-4 --step-lr 10000 --workers 2 \
        --attack-lr=1e-2 --constraint inf --eps 8/255 \
        --classifier-loss robust \
        --exp-name baseline_mlp_adv
    

    can not be test on your test script:

    python main.py --dataset cifar \
    --data ./data \
    --out-dir ./log \
    --task estimate-mi \
    --representation-type layer \
    --estimator-loss worst \
    --arch baseline_mlp \
    --epoch 500 --lr 1e-4 --step-lr 10000 --workers 2 \
    --attack-lr=1e-2 --constraint inf --eps 8/255 \
    --resume ./log/baseline_mlp_adv/checkpoint.pt.best \
    --exp-name estimator_worst__baseline_mlp_adv \
    --no-store
    

    Errors: TypeError: 'int' object is not subscriptable

    This bug should be fixed anywhy.

    opened by selous123 1
  • Evaluate the representation robustness is still a training process?

    Evaluate the representation robustness is still a training process?

    I test the command under Evaluate the representation robustness, it is still a training process, instead of the statement "load a pre-trained model and test the mutual information between input and representation." So this cannot be a evaluation measure for any pre-trained model, right? or is there something I understand or use in a wrong way?

    opened by zoeleeee 1
  • Potential problem in MINE formula

    Potential problem in MINE formula

    Hi,

    Thank you for the great work and code :) I would have a simple question concerning the MINE formula used: https://github.com/schzhu/learning-adversarially-robust-representations/blob/b642e771d3d828781dc39d3825a44981dc9d69aa/robust_representations/functions/dim_losses.py#L152

    To my understanding, there may be a problem with the masking part.

    Consider u = [u1, u2, u3], and for the negative term you want to mask out u2, i.e mask = [1, 0, 1]. Ideally, (leaving the constant out), you want to compute E_neg = log( exp(u1) + exp(u3)) = log( mask * exp(u)). However, what the current formula computes is E_neg = log(exp(u1) + exp(0) + exp(u3)) = log ( 1 + exp(u1) + exp(u2)) .

    This may slightly modify the dynamic of the gradient, as the denominator in the log derivative is now different. I don't know how much though. Am I missing something ?

    Best, Malik

    opened by mboudiaf 1
Owner
Sicheng
Sicheng
[PyTorch] Official implementation of CVPR2021 paper "PointDSC: Robust Point Cloud Registration using Deep Spatial Consistency". https://arxiv.org/abs/2103.05465

PointDSC repository PyTorch implementation of PointDSC for CVPR'2021 paper "PointDSC: Robust Point Cloud Registration using Deep Spatial Consistency",

null 153 Dec 14, 2022
This is an official implementation of our CVPR 2021 paper "Bottom-Up Human Pose Estimation Via Disentangled Keypoint Regression" (https://arxiv.org/abs/2104.02300)

Bottom-Up Human Pose Estimation Via Disentangled Keypoint Regression Introduction In this paper, we are interested in the bottom-up paradigm of estima

HRNet 367 Dec 27, 2022
The implement of papar "Enhanced Graph Learning for Collaborative Filtering via Mutual Information Maximization"

SIGIR2021-EGLN The implement of paper "Enhanced Graph Learning for Collaborative Filtering via Mutual Information Maximization" Neural graph based Col

null 15 Dec 27, 2022
Joint learning of images and text via maximization of mutual information

mutual_info_img_txt Joint learning of images and text via maximization of mutual information. This repository incorporates the algorithms presented in

Ruizhi Liao 10 Dec 22, 2022
Official Implementation for "ReStyle: A Residual-Based StyleGAN Encoder via Iterative Refinement" https://arxiv.org/abs/2104.02699

ReStyle: A Residual-Based StyleGAN Encoder via Iterative Refinement Recently, the power of unconditional image synthesis has significantly advanced th

null 967 Jan 4, 2023
Non-Official Pytorch implementation of "Face Identity Disentanglement via Latent Space Mapping" https://arxiv.org/abs/2005.07728 Using StyleGAN2 instead of StyleGAN

Face Identity Disentanglement via Latent Space Mapping - Implement in pytorch with StyleGAN 2 Description Pytorch implementation of the paper Face Ide

Daniel Roich 58 Dec 24, 2022
Supplementary code for the paper "Meta-Solver for Neural Ordinary Differential Equations" https://arxiv.org/abs/2103.08561

Meta-Solver for Neural Ordinary Differential Equations Towards robust neural ODEs using parametrized solvers. Main idea Each Runge-Kutta (RK) solver w

Julia Gusak 25 Aug 12, 2021
Code for paper "A Critical Assessment of State-of-the-Art in Entity Alignment" (https://arxiv.org/abs/2010.16314)

A Critical Assessment of State-of-the-Art in Entity Alignment This repository contains the source code for the paper A Critical Assessment of State-of

Max Berrendorf 16 Oct 14, 2022
Source code for models described in the paper "AudioCLIP: Extending CLIP to Image, Text and Audio" (https://arxiv.org/abs/2106.13043)

AudioCLIP Extending CLIP to Image, Text and Audio This repository contains implementation of the models described in the paper arXiv:2106.13043. This

null 458 Jan 2, 2023
Official repository with code and data accompanying the NAACL 2021 paper "Hurdles to Progress in Long-form Question Answering" (https://arxiv.org/abs/2103.06332).

Hurdles to Progress in Long-form Question Answering This repository contains the official scripts and datasets accompanying our NAACL 2021 paper, "Hur

Kalpesh Krishna 41 Nov 8, 2022