Learning Confidence for Out-of-Distribution Detection in Neural Networks

Overview

Learning Confidence Estimates for Neural Networks

This repository contains the code for the paper Learning Confidence for Out-of-Distribution Detection in Neural Networks. In this work, we demonstrate how to augment neural networks with a confidence estimation branch, which can be used to identify misclassified and out-of-distribution examples.

To learn confidence estimates during training, we provide the neural network with "hints" towards the correct output whenever it exhibits low confidence in its predictions. Hints are given by interpolating the prediction towards the target distribution, with the amount of interpolation determined by the network's confidence that its prediction is correct: the less confident the network is, the closer its prediction is pushed to the target. To discourage the network from always asking for free hints, a small penalty is applied whenever it is not confident. As a result, the network learns to produce low confidence estimates only when it is likely to make an incorrect prediction.
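For concreteness, here is a minimal PyTorch sketch of this objective, assuming the model outputs class logits plus a single confidence logit per example; the variable names and penalty weight are illustrative, not the exact code from train.py.

```python
# Minimal sketch of the hinting objective (illustrative; not the exact train.py code).
import torch
import torch.nn.functional as F

def hinted_loss(logits, confidence_logit, target, lmbda=0.1):
    """Classification loss with confidence-based hints plus a confidence penalty."""
    pred = F.softmax(logits, dim=-1)                       # (batch, classes)
    confidence = torch.sigmoid(confidence_logit)           # (batch, 1), c in (0, 1)
    target_onehot = F.one_hot(target, num_classes=pred.size(-1)).float()

    # Interpolate towards the target in proportion to (1 - confidence):
    # low confidence -> strong hint, high confidence -> prediction left unchanged.
    pred_hinted = confidence * pred + (1.0 - confidence) * target_onehot

    task_loss = F.nll_loss(torch.log(pred_hinted + 1e-12), target)
    confidence_penalty = -torch.log(confidence + 1e-12).mean()  # cost of asking for hints
    return task_loss + lmbda * confidence_penalty
```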

Bibtex:

@article{devries2018learning,
  title={Learning Confidence for Out-of-Distribution Detection in Neural Networks},
  author={DeVries, Terrance and Taylor, Graham W},
  journal={arXiv preprint arXiv:1802.04865},
  year={2018}
}

Results and Usage

We evaluate our method on the task of out-of-distribution detection using three different neural network architectures: DenseNet, WideResNet, and VGG. CIFAR-10 and SVHN are used as the in-distribution datasets, while TinyImageNet, LSUN, iSUN, uniform noise, and Gaussian noise are used as the out-of-distribution datasets. Definitions of the evaluation metrics can be found in the paper.

Dependencies

PyTorch v0.3.0
tqdm
visdom
seaborn
Pillow
scikit-learn

Training

Train a model with a confidence estimator using train.py. During training you can use visdom to see a histogram of confidence estimates on the test set. Training logs are stored in the logs/ folder, while checkpoints are stored in the checkpoints/ folder.

| Args | Options | Description |
|------|---------|-------------|
| dataset | cifar10, svhn | Selects which dataset to train on. |
| model | densenet, wideresnet, vgg13 | Selects which model architecture to use. |
| batch_size | [int] | Number of samples per batch. |
| epochs | [int] | Number of epochs for training. |
| seed | [int] | Random seed. |
| learning_rate | [float] | Learning rate. |
| data_augmentation | | Train with standard data augmentation (random flipping and translation). |
| cutout | [int] | Patch size to use for Cutout. If 0, Cutout is not used. |
| budget | [float] | Controls how often the network can choose to have low confidence in its predictions. Increasing the budget biases the output towards low-confidence predictions, while decreasing it produces more high-confidence predictions. See the sketch after this table. |
| baseline | | Train the model without the confidence branch. |
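In the paper, the budget is enforced by adjusting the weight of the confidence penalty over the course of training. The sketch below shows one way such an adjustment can look; the function name and update factor are assumptions for illustration, not the exact logic in train.py.

```python
# Illustrative budget-based adjustment of the confidence penalty weight
# (an assumption based on the paper's description, not a copy of train.py).
def update_lambda(lmbda, confidence_loss, budget=0.3, factor=1.01):
    """Nudge lambda so the average confidence penalty tracks the budget."""
    if confidence_loss > budget:
        return lmbda * factor   # network asks for hints too often -> penalize more
    return lmbda / factor       # network is confident enough -> relax the penalty
```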

Use the following settings to replicate the experiments from the paper:

VGG13 on CIFAR-10

python train.py --dataset cifar10 --model vgg13 --budget 0.3 --data_augmentation --cutout 16

WideResNet on CIFAR-10

python train.py --dataset cifar10 --model wideresnet --budget 0.3 --data_augmentation --cutout 16

DenseNet on CIFAR-10

python train.py --dataset cifar10 --model densenet --budget 0.3 --epochs 300 --batch_size 64 --data_augmentation --cutout 16

VGG13 on SVHN

python train.py --dataset svhn --model vgg13 --budget 0.3 --learning_rate 0.01 --epochs 160 --data_augmentation --cutout 20

WideResNet on SVHN

python train.py --dataset svhn --model wideresnet --budget 0.3 --learning_rate 0.01 --epochs 160 --data_augmentation --cutout 20

DenseNet on SVHN

python train.py --dataset svhn --model densenet --budget 0.3 --learning_rate 0.01 --epochs 300 --batch_size 64  --data_augmentation --cutout 20

Out-of-distribution detection

Evaluate a trained model with out_of_distribution_detection.py. Before running it, you will need to download the out-of-distribution datasets from Shiyu Liang's ODIN GitHub repo and modify the data paths in the file to point to where you saved the datasets.

| Args | Options | Description |
|------|---------|-------------|
| ind_dataset | cifar10, svhn | Dataset to use as in-distribution. Should be the same one that the model was trained on. |
| ood_dataset | tinyImageNet_crop, tinyImageNet_resize, LSUN_crop, LSUN_resize, iSUN, Uniform, Gaussian, all | Dataset to use as out-of-distribution. |
| model | densenet, wideresnet, vgg13 | Selects which model architecture to use. Should be the same one that the model was trained on. |
| process | baseline, ODIN, confidence, confidence_scaling | Method to use for out-of-distribution detection. baseline uses the maximum softmax probability. ODIN applies temperature scaling and input pre-processing to the baseline method. confidence uses the learned confidence estimates. confidence_scaling applies input pre-processing to the confidence estimates. See the sketch after this table. |
| batch_size | [int] | Number of samples per batch. |
| T | [float] | Temperature to use for temperature scaling. |
| epsilon | [float] | Noise magnitude to use for input pre-processing. |
| checkpoint | [str] | Filename of the trained model checkpoint. Assumes the file is in the checkpoints/ folder; a .pt extension is automatically appended. |
| validation | | Use this flag when fine-tuning T and epsilon. If set, the script evaluates only on the first 1000 samples of the out-of-distribution dataset; otherwise the remaining samples are used. Based on the validation procedure from ODIN. |
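To make the process and epsilon options concrete, the sketch below shows confidence-based scoring with optional input pre-processing in the style of ODIN. It assumes the model returns both class logits and a confidence logit; it is an illustration of the idea, not the repository's evaluation script.

```python
# Illustrative confidence-based OOD scoring with optional input pre-processing
# (assumes model(x) returns (logits, confidence_logit); not the exact repo code).
import torch

def ood_score(model, x, epsilon=0.0):
    """Higher score = more likely in-distribution; threshold it to flag OOD inputs."""
    x = x.clone().requires_grad_(True)
    _, confidence_logit = model(x)
    confidence = torch.sigmoid(confidence_logit)

    if epsilon > 0:
        # Input pre-processing: perturb x in the direction that increases confidence.
        loss = -torch.log(confidence + 1e-12).sum()
        loss.backward()
        x_perturbed = x - epsilon * x.grad.sign()
        with torch.no_grad():
            _, confidence_logit = model(x_perturbed)
            confidence = torch.sigmoid(confidence_logit)

    return confidence.detach().squeeze(-1)
```

Thresholding this score separates in-distribution from out-of-distribution inputs; the metrics reported in the paper (e.g. FPR at 95% TPR, AUROC) are computed from such scores.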

Example commands for running the out-of-distribution detection script:

Baseline

python out_of_distribution_detection.py --ind_dataset svhn --ood_dataset all --model vgg13 --process baseline --checkpoint svhn_vgg13_budget_0.0_seed_0

ODIN

python out_of_distribution_detection.py --ind_dataset cifar10 --ood_dataset tinyImageNet_resize --model densenet --process ODIN --T 1000 --epsilon 0.001 --checkpoint cifar10_densenet_budget_0.0_seed_0

Confidence

python out_of_distribution_detection.py --ind_dataset cifar10 --ood_dataset LSUN_crop --model vgg13 --process confidence --checkpoint cifar10_vgg13_budget_0.3_seed_0

Confidence scaling

python out_of_distribution_detection.py --ind_dataset svhn --ood_dataset iSUN --model wideresnet --process confidence_scaling --epsilon 0.001 --checkpoint svhn_wideresnet_budget_0.3_seed_0
Comments
  • can't reproduce your results

    Hi, I'm running your code with the following command:

    python train.py --dataset cifar10 --model vgg13 --budget 0.3 --data_augmentation --cutout 16
    

    But the accuracy on CIFAR-10 only reaches 78%, while it should normally be around 95%. Is there anything that I missed?

    opened by wenhuchen 3
  • Implementation of section "Combating excessive regularization"

    In section 2.3.2 of the paper, it says:

    We find that giving hints with 50% probability addresses this
    issue, as the gradients from low confidence examples now
    have a chance of backpropagating unhindered and updating
    the decision boundary, but we still learn useful confidence
    estimates. One way we can implement this in practice is
    by applying Equation 2 to only half of the batch at each
    iteration.
    

    If I am not mistaken, the following lines are where this idea is implemented:

    https://github.com/uoguelph-mlrg/confidence_estimation/blob/c98f23d0c5e44b4310c14d6986e298ec81ecfcdc/train.py#L242-L244

    Does this really set half of the confidences to 1? Is applying a Bernoulli draw to uniform samples in [0, 1] really equivalent? Would applying a threshold of 0.5 to each uniform sample work as well?

    opened by hbredin 1
  • Can't learn the boundary between minimal confidence and maximum confidence

    Thank you for the interesting work. I reproduced the results in the paper, but when I added the confidence branch to my classifier in another project, the minimum and maximum confidence tend to become the same during training (all high or all low, depending on the learning rate). I would like to know how to figure out the reason and solve this problem. Do you have any suggestions? Thanks in advance :)

    opened by superkevingit 0
  • Problems with 2D XOR dataset experiments

    We reproduced the experiment on the XOR dataset from your paper, generating two sets of data for the experiment. However, the results were not ideal and deviated somewhat from the figures in your paper. Did you use any techniques not mentioned in the paper when conducting these experiments? Or is there something special about the distribution of the data you used? We use two kinds of data: XOR_data XOR_data_2

    The results are: confidence_1 confidence_2

    We look forward to hearing from you.

    opened by wusj0723 1