Representation Robustness Evaluations
Our implementation is based on code from MadryLab's robustness package and Devon Hjelm's Deep InfoMax. All scripts assume the working directory is the root folder of our code.
Prepare a pre-trained model
There are two methods to pre-train a model for evaluation. Method 1: follow the instructions of MadryLab's robustness package to train a standard model, or a robust model with a given PGD setting. For example, to train a robust ResNet18 under an l-inf constraint with eps 8/255:
python -m robustness.main --dataset cifar \
--data /path/to/dataset \
--out-dir /path/to/output \
--arch resnet18 \
--epoch 150 \
--adv-train 1 \
--attack-lr=1e-2 --constraint inf --eps 8/255 \
--exp-name resnet18_adv
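To load the resulting checkpoint later (e.g. for the evaluations below), the robustness package provides model_utils.make_and_restore_model. A minimal sketch, assuming the output layout produced by the command above:

from robustness import datasets, model_utils

# Dataset object supplies input shape and normalization for the loader
ds = datasets.CIFAR('/path/to/dataset')

# Restore the trained model; returns (model, checkpoint_dict)
model, _ = model_utils.make_and_restore_model(
    arch='resnet18',
    dataset=ds,
    resume_path='/path/to/output/resnet18_adv/checkpoint.pt.best',  # assumed path layout
)
model.eval()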
Method 2: use our wrapper code and set task=train-model. Optional arguments:
- --classifier-loss = robust (adversarial training) / standard (standard training)
- --arch = baseline_mlp (baseline-h with the last two layers as an MLP) / baseline_linear (baseline-h with the last two layers as a linear classifier) / vgg16 / ...
The results in Figures 1 and 2 use the architectures baseline_mlp, resnet18, vgg16, resnet50, and DenseNet121. For example, to train a baseline-h model under an l-inf constraint with eps 8/255:
python main.py --dataset cifar \
--task train-model \
--data /path/to/dataset \
--out-dir /path/to/output \
--arch baseline_mlp \
--epoch 500 --lr 1e-4 --step-lr 10000 --workers 2 \
--attack-lr=1e-2 --constraint inf --eps 8/255 \
--classifier-loss robust \
--exp-name baseline_mlp_adv
To parse the resulting store file, run:
from cox import store

# Open the store written during training: parent folder plus experiment folder name
s = store.Store('/path/to/model/parent-folder', 'model-folder')
print(s['logs'].df)  # the 'logs' table as a pandas DataFrame
s.close()
Evaluate representation robustness (Figures 1, 2, 3)
Set task=estimate-mi to load a pre-trained model and estimate the mutual information between the input and the representation. Subtracting the worst-case mutual information from the normal-case mutual information gives the representation vulnerability (see the sketch after the two examples below). Optional arguments:
- --estimator-loss = worst (worst-case mutual information estimation) / normal (normal-case mutual information estimation)
For example, to estimate the worst-case mutual information of ResNet18, run:
python main.py --dataset cifar \
--data /path/to/dataset \
--out-dir /path/to/output \
--task estimate-mi \
--representation-type layer \
--estimator-loss worst \
--arch resnet18 \
--epoch 500 --lr 1e-4 --step-lr 10000 --workers 2 \
--attack-lr=1e-2 --constraint inf --eps 8/255 \
--resume /path/to/saved/model/checkpoint.pt.best \
--exp-name estimator_worst__resnet18_adv \
--no-store
or, to test the baseline-h model, run:
python main.py --dataset cifar \
--data /path/to/dataset \
--out-dir /path/to/output \
--task estimate-mi \
--representation-type layer \
--estimator-loss worst \
--arch baseline_mlp \
--epoch 500 --lr 1e-4 --step-lr 10000 --workers 2 \
--attack-lr=1e-2 --constraint inf --eps 8/255 \
--resume /path/to/saved/model/checkpoint.pt.best \
--exp-name estimator_worst__baseline_mlp_adv \
--no-store
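Representation vulnerability is the difference between the two estimates. A minimal sketch, assuming you have read off the final normal-case and worst-case estimates from the two runs (the numbers below are placeholders, not real results):

# Final MI estimates reported by the two runs (placeholder values)
mi_normal = 5.21  # from --estimator-loss normal
mi_worst = 1.87   # from --estimator-loss worst

# Representation vulnerability: normal-case MI minus worst-case MI
print('representation vulnerability:', mi_normal - mi_worst)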
Learn Representations
Set task=train-encoder to learn a representation using our training principle. When training by worst-case mutual information maximization, other lower bounds on mutual information can be used as surrogates for our target, which may give slightly better empirical performance (e.g. nce); see arxiv.org/abs/1808.06670 for more information, and the sketch after this list for the dv bound. Optional arguments:
- --estimator-loss = worst (worst-case mutual information maximization) / normal (normal-case mutual information maximization)
- --va-mode = dv (Donsker-Varadhan representation) / nce (Noise-Contrastive Estimation) / fd (Fenchel dual representation)
- --arch = basic_encoder (Hjelm et al.) / ...
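For reference, the Donsker-Varadhan representation behind --va-mode dv lower-bounds mutual information as I(X;Z) >= E_P[T(x,z)] - log E_N[exp(T(x,z))], where T is a critic network scoring joint pairs against pairs with shuffled representations. A minimal PyTorch sketch (the critic interface is hypothetical, not the repo's architecture):

import math
import torch

def dv_lower_bound(critic, x, z):
    # critic(x, z) returns one score per (x, z) pair, shape (batch,)
    # Expectation over joint samples: pairs (x_i, z_i) produced by the encoder
    joint_term = critic(x, z).mean()
    # Shuffling z across the batch approximates samples from the product of marginals
    z_marginal = z[torch.randperm(z.size(0))]
    # log E[exp(T)] over marginal samples, computed stably via logsumexp
    marginal_term = torch.logsumexp(critic(x, z_marginal), dim=0) - math.log(z.size(0))
    return joint_term - marginal_term  # lower-bounds I(X; Z)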
Example:
python main.py --dataset cifar \
--task train-encoder \
--data /path/to/dataset \
--out-dir /path/to/output \
--arch basic_encoder \
--representation-type layer \
--estimator-loss worst \
--epoch 500 --lr 1e-4 --step-lr 10000 --workers 2 \
--attack-lr=1e-2 --constraint inf --eps 8/255 \
--exp-name learned_encoder
Test on Downstream Classification (Figures 4, 5, 6; Tables 1, 3)
Set task=train-classifier to test the classification accuracy of learned representations. Optional arguments (a sketch of the two downstream heads follows the list):
- --classifier-loss = robust (adversarial training of the classifier) / standard (standard training)
- --classifier-arch = mlp (MLP as the downstream classifier) / linear (linear model as the downstream classifier)
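To make the two --classifier-arch options concrete, here is a minimal PyTorch sketch of the downstream heads trained on top of the learned representation (layer sizes are illustrative, not the repo's exact configuration):

import torch.nn as nn

def make_downstream_classifier(classifier_arch, feat_dim, num_classes=10):
    if classifier_arch == 'linear':
        # --classifier-arch linear: a single linear layer on the representation
        return nn.Linear(feat_dim, num_classes)
    if classifier_arch == 'mlp':
        # --classifier-arch mlp: a small MLP head (hidden width is illustrative)
        return nn.Sequential(
            nn.Linear(feat_dim, 512),
            nn.ReLU(),
            nn.Linear(512, num_classes),
        )
    raise ValueError(classifier_arch)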
Example:
python main.py --dataset cifar \
--task train-classifier \
--data /path/to/dataset \
--out-dir /path/to/output \
--arch basic_encoder \
--classifier-arch mlp \
--representation-type layer \
--classifier-loss robust \
--epoch 500 --lr 1e-4 --step-lr 10000 --workers 2 \
--attack-lr=1e-2 --constraint inf --eps 8/255 \
--resume /path/to/saved/model/checkpoint.pt.latest \
--exp-name test_learned_encoder