Open-Set Recognition: A Good Closed-Set Classifier is All You Need

Overview

Code for our paper: "Open-Set Recognition: A Good Closed-Set Classifier is All You Need"

Abstract: The ability to identify whether or not a test sample belongs to one of the semantic classes in a classifier's training set is critical to practical deployment of the model. This task is termed open-set recognition (OSR) and has received significant attention in recent years. In this paper, we first demonstrate that the ability of a classifier to make the 'none-of-above' decision is highly correlated with its accuracy on the closed-set classes. We find that this relationship holds across loss objectives and architectures, and further demonstrate the trend both on the standard OSR benchmarks as well as on a large-scale ImageNet evaluation. Second, we use this correlation to boost the performance of the cross-entropy OSR 'baseline' by improving its closed-set accuracy, and with this strong baseline achieve a new state-of-the-art on the most challenging OSR benchmark. Similarly, we boost the performance of the existing state-of-the-art method by improving its closed-set accuracy, but this does not surpass the strong baseline on the most challenging dataset. Our third contribution is to reappraise the datasets used for OSR evaluation, and construct new benchmarks which better respect the task of detecting semantic novelty, as opposed to low-level distributional shifts as tackled by neighbouring machine learning fields. In this new setting, we again demonstrate that there is negligible difference between the strong baseline and the existing state-of-the-art.

Running

Dependencies

pip install -r requirements.txt

Datasets

A number of datasets are used in this work, many of which can be downloaded directly through the PyTorch (torchvision) servers.
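
For instance, the standard benchmarks hosted by torchvision can be fetched with a snippet along these lines (the root paths below are placeholders; in this repo the datasets are actually constructed through data/open_set_datasets.py using the paths set in config.py):

    # Sketch (not repo code): downloading two of the standard benchmarks via torchvision.
    from torchvision import datasets

    cifar10_train = datasets.CIFAR10(root='/path/to/cifar10', train=True, download=True)
    svhn_train = datasets.SVHN(root='/path/to/svhn', split='train', download=True)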

FGVC Open-set Splits:

For the proposed FGVC open-set benchmarks, the directory data/open_set_splits contains the proposed class splits as .pkl files. The files also include information on which open-set classes are most similar to which closed-set classes.
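
These files can be inspected with a quick sketch like the one below (the file name and the assumption that the pickle holds a dictionary are for illustration only; check the actual .pkl contents for the exact structure):

    # Sketch: loading one of the proposed open-set split files.
    # File name and dictionary layout are assumptions, not guaranteed.
    import pickle

    with open('data/open_set_splits/cub_osr_splits.pkl', 'rb') as f:
        splits = pickle.load(f)

    print(splits.keys())  # e.g. closed-set classes and open-set classes per difficulty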

Config

Set paths to datasets and pre-trained models (for the fine-grained experiments) in config.py.

Set SAVE_DIR (logfile destination) and PYTHON (path to the Python interpreter) at the top of the scripts in bash_scripts.
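
As a rough illustration, the dataset entries in config.py are simple module-level path variables; the names below are hypothetical placeholders, so match them to the variables that already exist in the file:

    # config.py -- hypothetical sketch; variable names and paths are placeholders.
    cub_root = '/path/to/CUB_200_2011'                  # fine-grained dataset roots
    aircraft_root = '/path/to/fgvc-aircraft-2013b'
    tin_train_root_dir = '/path/to/tiny-imagenet-200/train'

    # Pre-trained backbone weights for the fine-grained experiments.
    places_moco_path = '/path/to/places_moco_resnet50.pth'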

Run

To recreate the results on TinyImageNet (Table 2), run the script below. Our runs give 82.60% AUROC for both (ARPL + CS)+ and Cross-Entropy+.

bash bash_scripts/osr_train_tinyimagenet.sh

Optimal Hyper-parameters:

We tuned label smoothing and RandAug hyper-parameters to optimise closed-set accuracy on a single random validation split for each dataset. For other hyper-parameters (image size, batch size, learning rate) we took values from the open-set literature for the standard datasets (specifically, the ARPL paper) and values from the FGVC literature for the proposed FGVC benchmarks.
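
Both of these knobs map onto standard PyTorch/torchvision components; the sketch below (values from the cross-entropy CUB row in the table that follows) is an illustration only, since the repo applies RandAugment through its own transform pipeline and the magnitude scales may not match torchvision's exactly:

    # Sketch: wiring up the two tuned hyper-parameters with stock PyTorch/torchvision.
    import torch
    import torchvision.transforms as T

    train_transform = T.Compose([
        T.Resize((448, 448)),                       # image size for CUB
        T.RandAugment(num_ops=2, magnitude=30),     # RandAug N=2, M=30
        T.ToTensor(),
    ])

    criterion = torch.nn.CrossEntropyLoss(label_smoothing=0.3)  # label smoothing for CUB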

Cross-Entropy optimal hyper-parameters:

Dataset       | Image Size | Learning Rate | RandAug N | RandAug M | Label Smoothing | Batch Size
MNIST         | 32         | 0.1           | 1         | 8         | 0.0             | 128
SVHN          | 32         | 0.1           | 1         | 18        | 0.0             | 128
CIFAR-10      | 32         | 0.1           | 1         | 6         | 0.0             | 128
CIFAR+N       | 32         | 0.1           | 1         | 6         | 0.0             | 128
TinyImageNet  | 64         | 0.01          | 1         | 9         | 0.9             | 128
CUB           | 448        | 0.001         | 2         | 30        | 0.3             | 32
FGVC-Aircraft | 448        | 0.001         | 2         | 15        | 0.2             | 32

ARPL + CS optimal hyper-parameters:

(Note the lower learning rate for TinyImageNet)

Dataset       | Image Size | Learning Rate | RandAug N | RandAug M | Label Smoothing | Batch Size
MNIST         | 32         | 0.1           | 1         | 8         | 0.0             | 128
SVHN          | 32         | 0.1           | 1         | 18        | 0.0             | 128
CIFAR-10      | 32         | 0.1           | 1         | 15        | 0.0             | 128
CIFAR+N       | 32         | 0.1           | 1         | 6         | 0.0             | 128
TinyImageNet  | 64         | 0.001         | 1         | 9         | 0.9             | 128
CUB           | 448        | 0.001         | 2         | 30        | 0.2             | 32
FGVC-Aircraft | 448        | 0.001         | 2         | 18        | 0.1             | 32

Other

This repo also contains other useful utilities, including:

  • utils/logfile_parser.py: To directly parse stdout outputs for Accuracy / AUROC metrics
  • data/open_set_datasets.py: A framework for splitting existing datasets into controllable open-set splits, with train, val, test_known and test_unknown subsets. Note: ImageNet has not yet been integrated here.
  • utils/schedulers.py: Implementation of cosine warm restarts with a linear warmup phase as a PyTorch learning rate scheduler (a rough sketch of the schedule shape is given below)
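
The sketch below shows the general shape of such a schedule, expressed as a LambdaLR multiplier; it is an illustration of the idea, not the exact implementation in utils/schedulers.py:

    # Sketch: cosine warm restarts with a linear warmup, as an epoch-wise LR multiplier.
    import math
    import torch

    def cosine_warm_restarts_warmup(warmup_epochs=10, cycle_epochs=200):
        def multiplier(epoch):
            if epoch < warmup_epochs:
                return (epoch + 1) / warmup_epochs          # linear ramp up to the base LR
            t = (epoch - warmup_epochs) % cycle_epochs      # position within the current cycle
            return 0.5 * (1 + math.cos(math.pi * t / cycle_epochs))
        return multiplier

    model = torch.nn.Linear(10, 2)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.001)
    scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=cosine_warm_restarts_warmup())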

Citation

If you use this code in your research, please consider citing our paper:

@article{vaze21openset,
    author  = {Sagar Vaze and Kai Han and Andrea Vedaldi and Andrew Zisserman},
    title   = {Open-Set Recognition: A Good Closed-Set Classifier is All You Need},
    journal = {arXiv preprint},
    year    = {2021}
}

Furthermore, please also consider citing Adversarial Reciprocal Points Learning for Open Set Recognition, on whose code this repo is built.

Comments
  • Clarification on ImageNet-21K

    Hi,

    Thanks for making the code public which helps a lot. I'm wondering if you were using the fall11 version of ImageNet-21K for the open splits. I'm now using the winter21 version and some of the classes in the 'Hard' split do not exist anymore (e.g., 'n10506915').

    If so, a problem is that the ImageNet website doesn't seem to host the fall11 version anymore (correct me if I'm wrong). In that case, would it be possible for you to update the 'Hard' split based on the available winter21 version?

    Thanks

    opened by zjysteven 7
  • about checkpoint

    Thanks for your wonderful work

    I want to know why the checkpoint is not saved when I only use the CE loss. I read the code but could not find what should be modified.

    opened by Hrren 4
  • Use of logits instead of softmax activations for OS scoring

    Hi again,

    I read from the paper that "[...] we propose the use of the maximum logit rather than softmax probability for the open-set scoring rule. Logits are the raw outputs of the final linear layer in a deep classifier, while the softmax operation involves a normalization such that the outputs can be interpreted as a probability vector summing to one. As the softmax operation normalizes out much of the feature magnitude information present in the logits, we find logits lead to better open-set detection results". Then Figure 6c shows AUROC on the test set(s) and how it evolves as training goes on, using both max-logit and max-softmax for scoring, suggesting that it might be better to use the max of the logits.
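
    To make sure we are talking about the same thing, here is my own small sketch of the two scoring rules (not code from the repo):

    # My understanding of the two open-set scoring rules (sketch, not repo code).
    import torch

    head = torch.nn.Linear(512, 20)       # stand-in classifier head (20 known classes)
    features = torch.randn(4, 512)        # stand-in penultimate features
    logits = head(features)               # raw outputs of the final linear layer

    max_logit_score = logits.max(dim=1).values                          # max-logit rule
    max_softmax_score = torch.softmax(logits, dim=1).max(dim=1).values  # MSP rule
    # Higher score => more confident the sample belongs to a known (closed-set) class.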

    However, the ARPL code for the Softmax loss (found here), which you are inheriting and using for testing, is a bit weird: it refers to the post-softmax activations as logits, see here.

    Since you are taking the (false) logits from calling the criterion (here) during testing, and then a few lines below you have the option of (re-)applying softmax to them if we are running with 'use_softmax_in_eval', I am wondering whether what you call "logits" in the experiments from the paper are actually softmax(logits), and what you call softmax activations are in fact softmax(softmax(logits))?

    Thanks!

    Adrian

    opened by agaldran 4
  • How to predict a single img with label known or unknown?

    Hello again! I'm working on a task with your code. I have seen that all methods are evaluated by AUROC, a threshold-independent metric, but now I need to predict the label using a threshold. Is there any function in the code to do that?
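
    For reference, my current workaround looks like the sketch below (my own code, not from the repo; the threshold value is just picked by hand on validation data):

    import torch

    def predict_with_threshold(logits, threshold):
        scores, preds = logits.max(dim=1)        # max-logit score and argmax class
        preds = preds.clone()
        preds[scores < threshold] = -1           # -1 stands for 'unknown'
        return preds

    logits = torch.randn(8, 20)                  # stand-in classifier outputs
    print(predict_with_threshold(logits, threshold=2.0))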

    opened by LP308210365 3
  • pretrained model don't match

    Thanks for your work

    When I train the model with the parameters "--model=timm_resnet50_pretrained --resnet50_pretrain=places", and I download the pretrained model from https://github.com/nanxuanzhao/Good_transfer, I get the error: "Missing key(s) in state_dict: "conv1.weight", "bn1.weight", "bn1.bias", "bn1.running_mean", ...".

    I think this error is caused by a mismatch between the weights and the model. Could you please tell me the right way to train the model with pretrained weights, or the right link to download the weights?

    opened by LP308210365 3
  • Training cub dataset low accuracy

    Hello, thank you for maintaining this repo!

    I have one question regarding a low accuracy using cub dataset.

    Here is the bash file that I am using:

    LOSS='ARPLoss'          # For TinyImageNet, ARPLoss and Softmax loss have the same
                            # RandAug and Label Smoothing hyper-parameters, but different learning rates
    
    # Fixed hyper params for both ARPLoss and Softmax
    DATASET='cub'
    AUG_M=30
    AUG_N=2
    LABEL_SMOOTHING=0.1
    
    # LR different for ARPLoss and Softmax
    if [ $LOSS = "Softmax" ]; then
       LR=0.01
    elif [ $LOSS = "ARPLoss" ]; then
       LR=0.001
    fi
    
    # GPU0-0 MIG-GPU-7391bfa5-fd39-632b-ac8a-cbd1359e940b/5/0
    
    
    # tinyimagenet
    for SPLIT_IDX in 0 1 2 3 4; do
    
      EXP_NUM=$(ls ${SAVE_DIR} | wc -l)
      EXP_NUM=$((${EXP_NUM}+1))
      echo $EXP_NUM
    
      ${PYTHON} -m methods.ARPL.osr --lr=0.001 --seed=0 \
                                 --transform='rand-augment' \
                                --rand_aug_m=${AUG_M} --rand_aug_n=${AUG_N} --loss=${LOSS} --label_smoothing=${LABEL_SMOOTHING} \
                                --dataset=${DATASET} --image_size=448 --cs --num_restarts=2 --gpus 0 --split_idx=${SPLIT_IDX} \
                                --scheduler='cosine_warm_restarts_warmup' --split_train_val='True' --batch_size=32 --num_workers=16 --max-epoch=600 \
    > ${SAVE_DIR}logfile_${DATASET}_cs_${LOSS}_${EXP_NUM}.out
    done
    
    Batch 150/150    Net 4.238 (4.178) G 19.857 (19.787) D 0.000 (0.000)
    

    I believe this is because the loss of D (the discriminator) becomes 0.

    However, I was not able to get an accuracy of more than 20%. Could you point out what I am doing wrong?

    Thank you!

    opened by JaeLee18 2
  • Any plans to include Imagenet?

    Hello again,

    I have been able to train models for all benchmarks, old and newly proposed, so thanks for that!

    I am just missing the ImageNet experiment, which appears in Table 3 of the paper (with an easy and a hard split). The ImageNet splits seem to be present in data/open_set_splits/imagenet_osr_splits.pkl, but there is no PyTorch dataset that implements them, nor do ImageNet hyperparameters appear in utils/paper_hyperparameters.csv. I was wondering if you have any plans to release the ImageNet experimental setup anytime soon? Thanks!!

    Also, may I email you with a separate question about your work that does not fit in a github issue? Thank you very much!!

    Adrian

    opened by agaldran 2
  • In CIFAR10-OSR, the first and second splits are the same

    https://github.com/sgvaze/osr_closed_set_all_you_need/blob/c3fee78818f83052cd8fed7826d5511a5a9d7165/data/open_set_splits/osr_splits.py#L19

    Hi, first of all, thank you for sharing the code.

    For the OSR-CIFAR10 experiment, the first and second splits are exactly the same. I am wondering if this is actually correct.

    opened by le4m 2
  • Reproducing results

    Hi! First, thank you very much for this work. It is very refreshing to see recent OSR methods put to the test and to find that they are mostly over-hyped, over-complex approaches, while cross-entropy alone is so competitive if you give some care to training the baselines properly. Congratulations :)

    I am trying to reproduce your results, but I am struggling to understand how to do it. I am starting with TinyImageNet, which I have been able to re-train successfully, after:

    1. running the create_val_img_folder function on the dataset folder, and
    2. correcting line 18 of methods/ARPL/core/train.py, as well as lines 25 and 42 of methods/ARPL/core/test.py, because options['use_gpu'] does not exist; those lines should probably be replaced by if not options['use_cpu'], which works OK.

    Now, after properly manipulating config.py and bash_scripts/osr_train_tinyimagenet.sh, I carry out the entire training and I end up with a directory called, in this case, methods/ARPL/log/(03.01.2022_|_32.677). Within this directory, one can find some tensorboard-related files and two directories, namely checkpoints/ and arpl_models/tinyimagenet/checkpoints/. The former is empty and I guess it is created by mistake, whereas the latter contains a bunch of checkpoints, as it seems that you are storing a model checkpoint (and a "criterion checkpoint", which, by the way, I don't know what it is) every twenty epochs.

    My question is, how exactly do I evaluate the final performance of this experiment? I.e.:

    • How do I know which checkpoint has the highest closed-set performance, which I should then use to compute accuracy on the closed set, AUROC on the open classes, and the OSCR score, as in Table 5 or Table 3?
    • Which piece of code should I use to evaluate the checkpoint, and how do I go about it?

    I suspect it might have something to do with methods/tests/openset_test.py, but I am not sure, since there seem to be some hard-coded experiment names in there, and it seems to be only useful for evaluating the performance of an ensemble of five models. Could you please provide some instructions on how to assess the final performance of a trained model?

    Thanks!!

    Adrian

    P.S.: In the next days or weeks I might be asking some more questions about your work, thanks for the patience!

    opened by agaldran 2
  • How to classify a sample as unknown?

    Hello! I was glad to find this work when encountering OSR problems in practice. I am wondering how to assign the unknown label to a sample using a classifier trained on the closed set.

    opened by FunnyDragonK 2
  • Question about fine-grained dataset CUB and Aircraft

    Hello. When I tried to reproduce the AUC results on the Easy and Medium splits of the CUB and Aircraft datasets, I couldn't match the results of the paper: for CUB, Easy and Medium were only 87.5 and 81.8; for Aircraft they were 89.0 and 85.4. I trained according to bash_scripts/osr_finegrained_train.sh, set the corresponding optimal hyper-parameters, and used the places_moco pre-trained model, but the results in the paper could not be reproduced. What could be the reason?

    opened by Hrren 1
  • About training results

    Hi,

    I found that the evaluation metrics are unstable across different epochs when I conducted experiments on a custom dataset. I would like to ask whether you report the results of the last epoch or of the best epoch in your training experiments.

    opened by LionRoarRoar 0
  • CVE-2007-4559 Patch

    Patching CVE-2007-4559

    Hi, we are security researchers from the Advanced Research Center at Trellix. We have begun a campaign to patch a widespread bug named CVE-2007-4559. CVE-2007-4559 is a 15-year-old bug in the Python tarfile package. By using extract() or extractall() on a tarfile object without sanitizing input, a maliciously crafted .tar file could perform a directory path traversal attack. We found at least one unsanitized extractall() in your codebase and are providing a patch for you via pull request. The patch essentially checks whether all tarfile members will be extracted safely and throws an exception otherwise. We encourage you to use this patch or your own solution to secure against CVE-2007-4559. Further technical information about the vulnerability can be found in this blog post.
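
    A simplified sketch of the kind of check involved (illustrative only; the archive name here is a placeholder):

    import os
    import tarfile

    def is_within_directory(directory, target):
        # True if 'target' resolves to a path inside 'directory'.
        abs_directory = os.path.abspath(directory)
        abs_target = os.path.abspath(target)
        return os.path.commonprefix([abs_directory, abs_target]) == abs_directory

    def safe_extractall(tar, path="."):
        # Refuse to extract if any member would escape the destination directory.
        for member in tar.getmembers():
            member_path = os.path.join(path, member.name)
            if not is_within_directory(path, member_path):
                raise Exception("Attempted path traversal in tar file")
        tar.extractall(path)

    with tarfile.open("archive.tar.gz") as tar:   # placeholder archive name
        safe_extractall(tar, "data/")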

    If you have further questions, you may contact us through this project's lead researcher, Kasimir Schulz.

    opened by TrellixVulnTeam 0
  • Longer training and stronger augmentation do not work for Cifar 100

    Hi,

    I am trying to use RandAug instead of RandomCrop, together with a cosine learning rate schedule, to train on the whole CIFAR-100 dataset, but it does not work. The baseline is RandomCrop with a step learning rate schedule (initial learning rate 0.1, divided by 2 at epochs [60, 120, 160], 300 epochs in total). The closed-set accuracy of the baseline is 0.75, and RandAug with the cosine learning rate schedule (restarting 0 or 2 times over 600 epochs) does not improve on it. Can you give me some advice?

    Best regards.

    opened by Jun-CEN 0