Deep Anomaly Detection with Outlier Exposure (ICLR 2019)

This repository contains the essential code for the paper Deep Anomaly Detection with Outlier Exposure (ICLR 2019).

Requires Python 3+ and PyTorch 0.4.1+.

Overview

Outlier Exposure (OE) is a method for improving anomaly detection performance in deep learning models. Using an auxiliary out-of-distribution dataset, we fine-tune a classifier so that it learns heuristics that distinguish anomalies from in-distribution samples. Crucially, these heuristics generalize to new, unseen distributions. Unlike ODIN, OE does not require a model per OOD dataset and does not require tuning on "validation" examples from the OOD dataset in order to work. This repository contains a subset of the calibration and multiclass classification experiments; please consult the paper for the full results and method descriptions.
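
For concreteness, here is a minimal sketch of the OE training objective, assuming a standard cross-entropy classification loss and the 0.5 outlier weight that appears in the repository's training scripts (function and variable names here are illustrative, not the repository's):

    import torch
    import torch.nn.functional as F

    def oe_objective(logits_in, targets_in, logits_out, lam=0.5):
        # Standard cross-entropy on in-distribution samples.
        ce = F.cross_entropy(logits_in, targets_in)
        # Cross-entropy between the outlier softmax distribution and the
        # uniform distribution: -(mean(logits) - logsumexp(logits)) is the
        # mean of -log_softmax over classes, i.e. CE against a uniform target.
        oe = -(logits_out.mean(1) - torch.logsumexp(logits_out, dim=1)).mean()
        return ce + lam * oe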

This repository contains code for the NLP experiments and for the multiclass and calibration experiments on SVHN, CIFAR-10, CIFAR-100, and Tiny ImageNet.

80 Million Tiny Images is available here (mirror link).

Citation

If you find this useful in your research, please consider citing:

@article{hendrycks2019oe,
  title={Deep Anomaly Detection with Outlier Exposure},
  author={Hendrycks, Dan and Mazeika, Mantas and Dietterich, Thomas},
  journal={Proceedings of the International Conference on Learning Representations},
  year={2019}
}

Outlier Datasets

These experiments make use of numerous outlier datasets. Links for the less common datasets are as follows: 80 Million Tiny Images (mirror link), Icons-50, Textures, Chars74K, and Places365.

Comments
  • Questions about the test phase and the training detail

    Hi, I'm new to out-of-distribution detection. After reading the paper and the code, I still cannot figure out how the out-of-distribution data is detected. I see there are two related lines of code, shown below.

    out_score = get_ood_scores(ood_loader)
    measures = get_measures(out_score, in_score)
    

    It seems that the detection process is related to the in_score. But how would prediction work in a real application scenario? I'm confused. You could point me to some references if that's easier than explaining.
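
    In practice, detection reduces to thresholding the score: a sample is flagged as OOD when its anomaly score exceeds a threshold chosen on held-out in-distribution data. A minimal sketch, assuming the in_score array from the snippet above and an illustrative 95th-percentile rule (not code from the repository):

        import numpy as np

        # Choosing the 95th percentile keeps 95% of held-out in-distribution
        # samples, i.e. the common "FPR at 95% TPR" operating point.
        threshold = np.percentile(in_score, 95)

        def is_ood(score):
            # With the score convention above, higher means more anomalous.
            return score > threshold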

    In the training script, the cross entropy between the softmax distribution and the uniform distribution is implemented with this line.

    loss += 0.5 * -(x[len(in_set[0]):].mean(1) - torch.logsumexp(x[len(in_set[0]):], dim=1)).mean()
    

    How does torch.logsumexp(x[len(in_set[0]):], dim=1) represent the uniform distribution?

    Thanks.
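
    As a quick check of the second point: averaging the logits and subtracting the log-sum-exp is exactly the mean of log_softmax over the classes, so its negation is the cross-entropy against a uniform target (the softmax is hidden inside the logsumexp). A self-contained verification, not code from the repository:

        import torch
        import torch.nn.functional as F

        x = torch.randn(4, 10)  # hypothetical logits: batch of 4, 10 classes
        repo_expr = -(x.mean(1) - torch.logsumexp(x, dim=1))
        # Cross-entropy to uniform: -(1/K) * sum_k log softmax(x)_k
        ce_uniform = -F.log_softmax(x, dim=1).mean(1)
        print(torch.allclose(repo_expr, ce_uniform))  # True
        # This differs from KL(uniform || softmax) only by the constant log(K).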

    opened by xfffrank 6
  • Using OE for pixel level predictions

    I was going through the paper, and in Section 3, where the OE minimization objective is defined, there is no constraint on the type of loss being used. Hence I was curious whether the methodology can be applied to tasks with pixel-level predictions (e.g., image translation) that use an L1 or L2 loss. Analogous to the classification case, if we use uniformly random ground-truth images as the labels for OOD images, will it work, or do pixel-level predictions require a completely different methodology?

    opened by shivamsaboo17 4
  • The understanding of the formula in the paper and the details of the training in the code

    Hello, I am a graduate student and I just read your paper. I want to ask about some things I don't understand. First, I don't follow the latter part of the formula in the paper. Is f(x') the predicted output of the model on OOD data? Is the second half computing a cross-entropy? Second, regarding loss += 0.5 * -(x[len(in_set[0]):].mean(1) - torch.logsumexp(x[len(in_set[0]):], dim=1)).mean(): I don't understand how this expression yields a cross-entropy. Where is the softmax used in the first half (x[len(in_set[0]):].mean(1)), and how does the second half (torch.logsumexp(x[len(in_set[0]):], dim=1)) represent the uniform distribution? Thanks!
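
    The identity can be written out directly. With logits $x \in \mathbb{R}^K$, the cross-entropy from the uniform distribution to the softmax expands as (a worked derivation for illustration, not copied from the paper):

        H(\mathcal{U}, \operatorname{softmax}(x))
          = -\sum_{k=1}^{K} \frac{1}{K}\,\log \frac{e^{x_k}}{\sum_{j=1}^{K} e^{x_j}}
          = -\frac{1}{K}\sum_{k=1}^{K} x_k + \log \sum_{j=1}^{K} e^{x_j}
          = -\bigl(\operatorname{mean}(x) - \operatorname{logsumexp}(x)\bigr)

    So the softmax enters through the logsumexp term, and the uniform weights 1/K produce the mean of the logits.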

    opened by yanchenyuxia 3
  • Removing duplicates

    Hi,

    In your paper, you mention that 'We remove all examples of 80 Million Tiny Images which appear in the CIFAR datasets'. I was trying to download 80 Million Tiny Images, but apparently it is not available anymore. Instead, I am trying to use ImageNet 32x32. Is there an effective way to remove duplicates of CIFAR-10 or CIFAR-100 from ImageNet 32x32?
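
    One common approach is a perceptual hash with a small Hamming-distance tolerance. Below is a sketch assuming 32x32 RGB uint8 arrays; the hash size and threshold are illustrative, and the paper's own deduplication procedure may differ:

        import numpy as np

        def average_hash(img, size=8):
            # img: 32x32x3 uint8 array; returns a 64-bit boolean hash.
            gray = img.mean(axis=2)
            h, w = gray.shape
            # Block-average down to size x size (assumes h, w divisible by size).
            small = gray.reshape(size, h // size, size, w // size).mean(axis=(1, 3))
            return (small > small.mean()).flatten()

        def is_near_duplicate(img_a, img_b, max_hamming=4):
            return np.count_nonzero(average_hash(img_a) != average_hash(img_b)) <= max_hamming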

    opened by sesmae 3
  • About the metric you use

    Hi @hendrycks, thanks for your code!

    I am new to out-of-distribution detection, and I am not sure about the OOD score you used in the code.

    _score.append(to_np((output.mean(1) - torch.logsumexp(output, dim=1))))
    

    Don't we usually use the maximum predicted softmax probability as the metric to discriminate in-distribution from out-of-distribution data? Where does this score originate?

    I would appreciate it if you could refer me to the papers that proposed this score.
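
    For context, the maximum softmax probability (MSP) baseline comes from Hendrycks & Gimpel, "A Baseline for Detecting Misclassified and Out-of-Distribution Examples in Neural Networks" (ICLR 2017), while the line above scores a sample by its negative cross-entropy to the uniform distribution. A side-by-side sketch, assuming output holds a batch of logits:

        import torch
        import torch.nn.functional as F

        output = torch.randn(8, 10)  # hypothetical logits
        # MSP baseline: higher max probability => more in-distribution.
        msp_score = F.softmax(output, dim=1).max(dim=1).values
        # Score from the snippet above: mean(logits) - logsumexp(logits) is the
        # negative cross-entropy to uniform; values near -log(K) indicate a
        # near-uniform softmax, i.e. a more anomalous input.
        oe_score = output.mean(1) - torch.logsumexp(output, dim=1)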

    Thanks,

    opened by d12306 3
  • Unable to understand the loss formulation for out-of-distribution samples

    Hi,

    I am not able to understand the loss function in the file oe_scratch.py in the MNIST folder. It does not look like it is minimising the KL divergence between the uniform distribution and the softmax distribution.

    The loss mentioned in the file is: loss += 0.5 * -(x[len(in_set[0]):].mean(1) - torch.logsumexp(x[len(in_set[0]):], dim=1)).mean()

    Can you please help me understand what it is trying to do?

    opened by aishgupta 3
  • Different results for newsgroup text classification

    Hi,

    Thank you very much for the response. After the modifications you suggested, I was able to reproduce the results of the paper for the SST dataset.

    However, there is a problem for the 20 Newsgroups dataset. When I evaluate your pretrained model, the results are way different from the ones mentioned in the paper. Actually, when we run the evaluation script for your model, the results we get are the following:

    OOD dataset mean FPR: 0.6828
    OOD dataset mean AUROC: 0.7115
    OOD dataset mean AUPR: 0.2773

    So, what could be the issue? If the evaluation datasets are the same as the ones you used for SST, I cannot find a reason why your pretrained model does not give the results mentioned in the paper. Which version of 20 Newsgroups did you use, and how did you do the train/test split? We use the 20 Newsgroups dataset available from the sklearn library.
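
    For reference, this is the sklearn loader in question, with its built-in train/test split (a usage sketch; whether this split matches the paper's is exactly the open question):

        from sklearn.datasets import fetch_20newsgroups

        # sklearn ships 20 Newsgroups with a standard train/test split;
        # headers, footers, and quoted replies can optionally be stripped.
        train = fetch_20newsgroups(subset='train', remove=('headers', 'footers', 'quotes'))
        test = fetch_20newsgroups(subset='test', remove=('headers', 'footers', 'quotes'))
        print(len(train.data), len(test.data))  # 11314 7532 with the standard split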

    Thank you,

    Originally posted by @AristotelisPap in https://github.com/hendrycks/outlier-exposure/issues/4#issuecomment-517967141

    opened by nazim1021 2
  • Different Results for Text Classification Experiments

    Hello,

    I am currently trying to replicate the results of the paper related to text classification. For example, say I want to replicate the results for the SST dataset with OE on the WikiText-2 dataset. If I load your oe_tuned model and run the script eval_OOD_sst, I get results similar to the ones mentioned in the paper.

    However, when I run the baseline script to train the baseline model, then run the oe script to fine-tune it, and finally evaluate the fine-tuned model using eval_OOD_sst, the results are way different from the ones mentioned in the paper. I guess something is going on with the training process, but what could be the issue?

    Thank you,

    Aris

    opened by AristotelisPap 2
  • Pushing the code

    Hi, is it possible to push the rest of the code, including the language modeling task? I'm doing some related work and it would be of great help. Thanks!

    opened by Mehrad0711 2
  • Standardisation of OOD samples

    Hi,

    I was looking to understand the pre-processing involved prior to generating predictions from the neural network on OOD samples for computer vision (image classification). Specifically, how are the OOD test samples standardised? Could you shed some light on it?
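
    The usual convention, sketched below under the assumption that CIFAR-10 is the in-distribution set: OOD test images pass through the same transform as in-distribution test images, standardised with the in-distribution training statistics (the mean/std values here are the commonly quoted CIFAR-10 statistics, not necessarily the exact ones in this repository):

        from torchvision import transforms

        cifar10_mean = [0.4914, 0.4822, 0.4465]  # per-channel training-set mean
        cifar10_std = [0.2470, 0.2435, 0.2616]   # per-channel training-set std

        ood_transform = transforms.Compose([
            transforms.Resize(32),        # bring OOD images to the ID resolution
            transforms.CenterCrop(32),
            transforms.ToTensor(),
            transforms.Normalize(cifar10_mean, cifar10_std),
        ])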

    Regards.

    opened by paganpasta 1
  • code for pixelcnn++

    Maybe I am not looking hard enough, but I can't seem to find the code for density estimation using PixelCNN++. Can you direct me to the relevant file?

    opened by DikshaMeghwal 1
  • After fine-tuning pre-trained WideResNet, the ID classification drops a lot?

    Hello! Thank you for your excellent work. I have some questions about the implementation. When I set shuffle=True for train_loader_ood, the in-distribution classification accuracy drops a lot after fine-tuning the pre-trained WideResNet with lr 0.001 and weight_decay 0.0005: the fine-tuned WideResNet reaches 88.59% test accuracy on CIFAR-10, compared with 94.39% before. You instead use a random offset to induce randomness. Could you please clarify the reason for this choice? Thanks again!

    opened by lygjwy 0
Owner

Dan Hendrycks
PhD student at UC Berkeley.
Related projects

  • Anomaly Transformer: Time Series Anomaly Detection with Association Discrepancy (ICLR 2022 Spotlight)
    THUML @ Tsinghua University, 221 stars, Dec 31, 2022
  • PyOD: A Python Toolbox for Scalable Outlier Detection (Anomaly Detection) (JMLR'19)
    Yue Zhao, 6.6k stars, Jan 3, 2023
  • PySAD: Streaming Anomaly Detection Framework in Python (Outlier Detection for Streaming Data)
    Selim Firat Yilmaz, 181 stars, Dec 18, 2022
  • PyGOD: A Python Library for Graph Outlier Detection (Anomaly Detection)
    PyGOD Team, 757 stars, Jan 4, 2023
  • SSD: A Unified Framework for Self-Supervised Outlier Detection (ICLR 2021)
    Princeton INSPIRE Research Group, 113 stars, Nov 27, 2022
  • (Py)TOD: Tensor-based Outlier Detection, a General GPU-Accelerated Framework
    Yue Zhao, 127 stars, Jan 5, 2023
  • Official implementation of "LUNAR: Unifying Local Outlier Detection Methods via Graph Neural Networks"
    Adam Goodge, 25 stars, Dec 28, 2022
  • U2Fusion: a unified unsupervised image fusion network for multiple image fusion tasks, including multi-modal, multi-exposure, and multi-focus image fusion
    Han Xu, 129 stars, Dec 11, 2022
  • Auto-exposure fusion for single-image shadow removal
    Qing Guo, 146 stars, Dec 31, 2022
  • NIRPS-ETC: Exposure Time Calculator (ETC) and radial velocity precision estimator for the Near InfraRed Planet Searcher (NIRPS) spectrograph
    Nolan Grieves, 2 stars, Sep 15, 2022
  • CapsGNN: a PyTorch implementation of "Capsule Graph Neural Network" (ICLR 2019)
    Benedek Rozemberczki, 1.2k stars, Jan 2, 2023
  • APPNP: a PyTorch implementation of "Predict then Propagate: Graph Neural Networks meet Personalized PageRank" (ICLR 2019)
    Benedek Rozemberczki, 329 stars, Dec 30, 2022
  • A PyTorch implementation of "Graph Wavelet Neural Network" (ICLR 2019)
    Benedek Rozemberczki, 490 stars, Dec 16, 2022
  • LightLog: an open-source, deep-learning-based lightweight log analysis tool for log anomaly detection
    25 stars, Dec 17, 2022
  • OpenHands: a gesture recognition system powered by OpenPose, k-nearest neighbours, and local outlier factor
    Paul Treanor, 12 stars, Jan 10, 2022
  • Certifiable Outlier-Robust Geometric Perception
    83 stars, Dec 31, 2022
  • VOS: Learning What You Don't Know by Virtual Outlier Synthesis
    248 stars, Dec 25, 2022
  • Real-world Anomaly Detection in Surveillance Videos (PyTorch re-implementation)
    seominseok, 62 stars, Dec 8, 2022
  • Paper list of log-based anomaly detection
    Weibin Meng, 411 stars, Dec 5, 2022