AugMix: A Simple Data Processing Method to Improve Robustness and Uncertainty

AugMix

Introduction

We propose AugMix, a data processing technique that mixes augmented images and enforces consistent embeddings of the augmented images, which results in increased robustness and improved uncertainty calibration. Like random cropping or CutOut, AugMix does not require tuning to work correctly, and thus enables plug-and-play data augmentation. AugMix significantly improves robustness and uncertainty measures on challenging image classification benchmarks, closing the gap between previous methods and the best possible performance by more than half in some cases. With AugMix, we obtain state-of-the-art results on ImageNet-C and ImageNet-P, and in uncertainty estimation when the train and test distributions do not match.

For more details please see our ICLR 2020 paper.

Pseudocode
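
The mixing step can be sketched in plain Python as follows. This is a minimal illustration, not the reference implementation: the function name augment_and_mix_sketch and the three toy operations are assumptions made for this example, whereas augment_and_mix.py works on NumPy arrays with a PIL-based operation set. The sketch samples Dirichlet weights for the augmentation chains, applies one to three random operations per chain, and blends the weighted mixture with the original image using a Beta-distributed weight.

```python
import random

# Toy stand-in operations on a 2D list of floats (assumptions for illustration;
# the repository uses PIL-based operations instead).
def flip_horizontal(img):
    return [row[::-1] for row in img]

def flip_vertical(img):
    return [row[:] for row in img[::-1]]

def shift_columns(img):
    return [row[-1:] + row[:-1] for row in img]

TOY_OPS = [flip_horizontal, flip_vertical, shift_columns]

def augment_and_mix_sketch(image, width=3, depth=-1, alpha=1.0, rng=random):
    """Mix `width` randomly augmented copies of `image`, then blend with the original."""
    # Dirichlet(alpha, ..., alpha) weights, drawn as normalized Gamma samples.
    gammas = [rng.gammavariate(alpha, 1.0) for _ in range(width)]
    total = sum(gammas)
    ws = [g / total for g in gammas]
    # Beta-distributed weight that blends the original with the mixture.
    m = rng.betavariate(alpha, alpha)

    rows, cols = len(image), len(image[0])
    mix = [[0.0] * cols for _ in range(rows)]
    for w in ws:
        aug = [row[:] for row in image]
        # Sample the chain length for each chain unless a fixed depth is given.
        d = depth if depth > 0 else rng.randint(1, 3)
        for _ in range(d):
            aug = rng.choice(TOY_OPS)(aug)
        for r in range(rows):
            for c in range(cols):
                mix[r][c] += w * aug[r][c]

    return [[(1 - m) * image[r][c] + m * mix[r][c] for c in range(cols)]
            for r in range(rows)]

# Usage: mix a tiny 2x3 "image" with a seeded generator for repeatability.
example = augment_and_mix_sketch([[0.0, 0.25, 0.5], [0.75, 1.0, 0.1]],
                                 rng=random.Random(0))
```

Because every step is a convex combination, the output stays within the value range of the input image.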

Contents

This directory includes a reference NumPy implementation of the augmentation method used in AugMix, in augment_and_mix.py. The full AugMix method also adds a Jensen-Shannon Divergence consistency loss to enforce consistent predictions between two different augmentations of the input image and the clean image itself.
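
To illustrate the consistency term, here is a minimal pure-Python sketch of a three-way Jensen-Shannon divergence between the predictions for the clean image and its two augmented views. The function names and the eps clamp value are assumptions made for this example; the PyTorch scripts compute the equivalent quantity on batched tensors. During training, this term would be added, with a weighting factor, to the usual classification loss.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def kl_divergence(p, q, eps=1e-7):
    """KL(p || q); q is clamped from below to avoid division by zero."""
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)

def jsd_consistency(logits_clean, logits_aug1, logits_aug2):
    """Jensen-Shannon divergence among three predictive distributions."""
    p_clean = softmax(logits_clean)
    p_aug1 = softmax(logits_aug1)
    p_aug2 = softmax(logits_aug2)
    # The mixture is the average of the three distributions.
    mixture = [(a + b + c) / 3.0 for a, b, c in zip(p_clean, p_aug1, p_aug2)]
    return (kl_divergence(p_clean, mixture)
            + kl_divergence(p_aug1, mixture)
            + kl_divergence(p_aug2, mixture)) / 3.0

# Usage: identical predictions give a value near zero; disagreement raises it.
agree = jsd_consistency([1.0, 2.0, 3.0], [1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
disagree = jsd_consistency([5.0, 0.0, 0.0], [0.0, 5.0, 0.0], [0.0, 0.0, 5.0])
```

The value is bounded above by log(3), which is reached only when the three distributions have disjoint support.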

We also include PyTorch re-implementations of AugMix on both CIFAR-10/100 and ImageNet in cifar.py and imagenet.py respectively, which both support training and evaluation on CIFAR-10/100-C and ImageNet-C.

Requirements

  • numpy>=1.15.0
  • Pillow>=6.1.0
  • torch==1.2.0
  • torchvision==0.2.2

Setup

  1. Install PyTorch and other required Python libraries with:

    pip install -r requirements.txt
    
  2. Download CIFAR-10-C and CIFAR-100-C datasets with:

    mkdir -p ./data/cifar
    curl -O https://zenodo.org/record/2535967/files/CIFAR-10-C.tar
    curl -O https://zenodo.org/record/3555552/files/CIFAR-100-C.tar
    tar -xvf CIFAR-100-C.tar -C data/cifar/
    tar -xvf CIFAR-10-C.tar -C data/cifar/
    
  3. Download ImageNet-C with:

    mkdir -p ./data/imagenet/imagenet-c
    curl -O https://zenodo.org/record/2235448/files/blur.tar
    curl -O https://zenodo.org/record/2235448/files/digital.tar
    curl -O https://zenodo.org/record/2235448/files/noise.tar
    curl -O https://zenodo.org/record/2235448/files/weather.tar
    tar -xvf blur.tar -C data/imagenet/imagenet-c
    tar -xvf digital.tar -C data/imagenet/imagenet-c
    tar -xvf noise.tar -C data/imagenet/imagenet-c
    tar -xvf weather.tar -C data/imagenet/imagenet-c
    

Usage

The Jensen-Shannon Divergence loss term may be disabled for faster training at the cost of slightly lower performance by adding the flag --no-jsd.

Training recipes used in our paper:

WRN: python cifar.py

AllConv: python cifar.py -m allconv

ResNeXt: python cifar.py -m resnext -e 200

DenseNet: python cifar.py -m densenet -e 200 -wd 0.0001

ResNet-50: python imagenet.py <path/to/imagenet> <path/to/imagenet-c>

Pretrained weights

Weights for a ResNet-50 ImageNet classifier trained with AugMix for 180 epochs are available here.

This model has a 65.3 mean Corruption Error (mCE) and a 77.53% top-1 accuracy on clean ImageNet data.
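
For readers unfamiliar with the metric, the sketch below shows how mCE is commonly computed, assuming the ImageNet-C convention in which each corruption's errors, summed over severity levels, are divided by the corresponding errors of an AlexNet baseline, then averaged over corruptions and reported as a percentage. The function name and all numbers in the usage example are made up for illustration and are not results from this repository.

```python
def mean_corruption_error(model_errors, baseline_errors):
    """Compute mCE in percent.

    Both arguments map a corruption name to a list of error rates
    (fractions in [0, 1]), one entry per severity level.
    """
    corruption_errors = []
    for corruption, errors in model_errors.items():
        baseline = baseline_errors[corruption]
        # Normalize the summed errors by the baseline's summed errors.
        corruption_errors.append(sum(errors) / sum(baseline))
    return 100.0 * sum(corruption_errors) / len(corruption_errors)

# Usage with hypothetical error rates for two corruptions and two severities.
hypothetical_model = {"fog": [0.2, 0.4], "snow": [0.3, 0.3]}
hypothetical_baseline = {"fog": [0.4, 0.8], "snow": [0.6, 0.6]}
mce = mean_corruption_error(hypothetical_model, hypothetical_baseline)
```

Under this convention the baseline itself scores 100, so lower values indicate better corruption robustness.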

Citation

If you find this useful for your work, please consider citing:

@article{hendrycks2020augmix,
  title={{AugMix}: A Simple Data Processing Method to Improve Robustness and Uncertainty},
  author={Hendrycks, Dan and Mu, Norman and Cubuk, Ekin D. and Zoph, Barret and Gilmer, Justin and Lakshminarayanan, Balaji},
  journal={Proceedings of the International Conference on Learning Representations (ICLR)},
  year={2020}
}
Comments
  • Testing with the best model

    I found that cifar.py code does not test the performance with the best model. https://github.com/google-research/augmix/blob/7c84885fe064435b4b7a8b596b937f0a879e458c/cifar.py#L431

    In addition, why is the CIFAR validation set not used for model selection?

    opened by LeeDoYup 4
  • augmentations used in augmix

    Hi,

    I have a couple of questions about the augmentations used in augmix -

    1. The AugMix paper mentions that contrast augmentations were removed from augmix as that would overlap with one of the tested corruptions (Contrast) - but I see that AutoContrast is still used in the code: https://github.com/google-research/augmix/blob/master/augmentations.py#L141

    2. I am curious how or why the augmentations in augmix impact performance on these corruptions as the connection between them is not immediately clear. Do you have a take on this, perhaps through an ablation study of the augmentations in augmix?

    Thank you.

    opened by kiranchari 3
  • Issue loading pretrained weights

    When I download the checkpoint.pth.tar file for the pretrained ImageNet model and try to extract it, I get the error tar: Error opening archive: Unrecognized archive format. Would it be possible to release the weights in another format (a zip, perhaps)? Unfortunately, I don't have access to the ImageNet dataset, so I can't retrain the model myself.

    opened by humzaiqbal 3
  • Can you share any acc information per epoch?

    Hi Author, I am a college student studying deep learning in Korea. I found your paper very interesting. I am running AugMix on ImageNet, and training will take 6 days on my GPU machine, so I'm wondering whether it is going well. To check the trend partway through, can you share any accuracy information per epoch?

    opened by seominseok0429 3
  • ImageNet hparams

    I'm having trouble achieving decent ImageNet results with the mixing + JSD loss. Are the hparams in the imagenet.py script the ones used for the paper results, and was the same code used for the paper?

    Any details on hparams for the paper results for ImageNet would be appreciated. Are these correct?

    • Epochs = 90 or 180?
    • JSD loss lambda = 12
    • batch size = 256, which represents an effective batch size of 256*3 = 768; that is large for ResNet50 at FP32 and suggests 4+ GPUs?
    • LR = 0.1
    • AugMix severity = 1
    • AugMix prob coeff = 0.1

    Thanks

    opened by rwightman 3
  • Corruption acc. of a Resnet50 trained on cifar10 with augmix

    Hi there, I trained a Resnet50 on CIFAR10 using the cifar.py script in this repository.

    The clean acc. was about 95% but corruption accuracy was less than reported in the original paper. I have pasted below accuracies for CIFAR10-C at Severity 5. The mean corruption acc. across all corruptions and severity levels was 81%. I understand the architecture used in the original work was different, so is this an expected corruption acc. variation with the architecture?

    Thank you.

    Corruption severity: 5

    gaussian_noise      0.6097
    shot_noise          0.6647
    impulse_noise       0.6747
    speckle_noise       0.6906
    defocus_blur        0.79
    glass_blur          0.5383
    motion_blur         0.7412
    zoom_blur           0.7565
    gaussian_blur       0.738
    snow                0.7785
    frost               0.7448
    fog                 0.6972
    brightness          0.8831
    spatter             0.8641
    contrast            0.454
    elastic_transform   0.6711
    pixelate            0.5521
    jpeg_compression    0.753
    saturate            0.8935

    opened by kiranchari 2
  • Jensen–Shannon divergence

    Hello,

    I'm trying to understand Jensen–Shannon divergence. I still don't understand the math behind it, but someone asked me to look into it and AugMix because of this paragraph:

    Alternatively, we can view each set as an empirical distribution and measure the distance between them using Kullback-Leibler (KL) or Jensen-Shannon (JS) divergence. The challenge for learning with KL or JS divergence is that no useful gradient is provided when the two empirical distributions have disjoint supports or have a non-empty intersection contained in a set of measure zero.

    from here: https://arxiv.org/pdf/1907.10764.pdf

    Is this problem present in AugMix?

    opened by LamyaMohaned 2
  • mean and std for every channel is 0.5 instead of (0.5071, 0.4867, 0.4408) as mean and (0.2657, 0.2565, 0.2761) as std

    Hi, for normalizing the CIFAR-100 dataset, you used (0.5, 0.5, 0.5) for both mean and std instead of the more commonly used (0.5071, 0.4867, 0.4408) as mean and (0.2657, 0.2565, 0.2761) as std. Is this by design? If so, could you please briefly explain why?

    opened by shashankskagnihotri 2
  • There is a big gap between the results of the code and the results in the paper on CIFAR10-C

    I ran this script several times to test the performance of the proposed method on CIFAR10-C and CIFAR100-C: python cifar.py -m resnext -e 200

    The paper reports a 10.9% error rate on CIFAR10-C, but I got about 29%. I then tried different models, but there is still a big gap between the results of this code and the results in the paper on CIFAR10-C.

    The results on CIFAR100-C, however, are close to the paper. BTW, I did not modify the code.

    I would like to know:

    1. Is the hyperparameter setting of cifar10 incorrectly given?
    2. Are there more implementation details?

    or could you please give some explanation?

    THANKS A LOT!

    opened by LinusWu 2
  • `depth` is constant each `width`

    Hello, thank you for releasing an implementation.

    As I understand it, the number of operations (depth) should be sampled separately for each chain (width). However, this code appears to use the same depth for every chain: https://github.com/google-research/augmix/blob/7c84885fe064435b4b7a8b596b937f0a879e458c/augment_and_mix.py#L61

    So we should change the code as below, right?

    depth = np.random.randint(1, depth + 1 if depth > 0 else 4)
    

    Best regards.

    opened by yumion 2
  • CIFAR-10/ImageNet-P code

    Hi, thank you for releasing the official implementation.

    Do you plan to make the CIFAR-10-P and ImageNet-P codebase public? Or is there an official implementation elsewhere?

    Thank you in advance.

    opened by moskomule 2
  • Use combined with Object Detection

    Is there an easy way to integrate AugMix with an object detection model like DETR? I mean, if AugMix operates only on the image, then the bounding boxes won't be touched and no extra manipulation of the bounding boxes will be required. Could you give me any hints, please?

    opened by ver0z 0
  • command for imagenet evaluation

    Hi, do you have a command for ImageNet evaluation on the clean ImageNet dataset and the corrupted dataset? I have downloaded the pretrained weights but found it difficult to reproduce the results.

    Thank you!

    opened by Sarimuko 0
Owner
Google Research