Attack on Confidence Estimation (ACE)
This repository is the official implementation of "Disrupting Deep Uncertainty Estimation Without Harming Accuracy": https://arxiv.org/abs/2110.13741
Overview
ACE is an algorithm for crafting adversarial examples that disrupt a model's uncertainty estimation performance without harming its accuracy. The figure above conceptually illustrates how ACE works. Consider a classifier for cats vs. dogs that uses its prediction's softmax score as its uncertainty estimation measurement. An end user asks the model to classify several images and output only the ones in which it has the most confidence. Since softmax quantifies the margin from an instance to the decision boundary, we visualize it on a 2D plane where each instance's distance to the decision boundary reflects its softmax score. In the example shown in the figure above, the classifier was mistaken about one image of a dog, classifying it as a cat, but fortunately its confidence in this prediction is the lowest among its predictions. A malicious attacker targeting the images in which the model has the most confidence would want to increase the confidence in the mislabeled instance by pushing it away from the decision boundary, and decrease the confidence in the correctly labeled instances by pushing them closer to the decision boundary.
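The sketch below illustrates this objective in code. It is not the paper's exact attack, only a minimal PGD-style approximation under assumed hyper-parameters (`epsilon`, `alpha`, `steps` are placeholders) and a simplified loss: it lowers the max-softmax confidence on correctly classified samples, raises it on misclassified ones, and keeps only updates that do not flip the predicted label, so accuracy is preserved. See `example.py` and the paper for the actual method.

```python
import torch
import torch.nn.functional as F

def ace_sketch(model, x, y_true, epsilon=8/255, alpha=1/255, steps=10):
    """PGD-style sketch of the ACE idea (placeholder hyper-parameters, simplified loss)."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        probs = F.softmax(model(x_adv), dim=1)
        conf, pred = probs.max(dim=1)            # max-softmax score and predicted class
        correct = pred.eq(y_true)
        # Minimize confidence where the model is right, maximize it where it is wrong.
        loss = torch.where(correct, conf, -conf).sum()
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            step = x_adv - alpha * grad.sign()                  # descend on the loss
            step = x + (step - x).clamp(-epsilon, epsilon)      # L_inf projection
            step = step.clamp(0, 1)                             # stay in valid image range
            # Keep the update only where it does not change the prediction (accuracy preserved).
            keep = model(step).argmax(dim=1).eq(pred).view(-1, 1, 1, 1)
            x_adv = torch.where(keep, step, x_adv).detach()
    return x_adv
```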
Example
example.py provides a simple demonstration of how ACE decreases an EfficientNetB0's confidence (measured by max softmax score) in a correct prediction (a tank image), and how it increases its confidence in an incorrect prediction (binoculars incorrectly labeled as a tank).
To use it, simply run:

```
python example.py
```
The EfficientNetB0 used in the example (and in the paper) was taken from the excellent timm repository.
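For reference, the snippet below shows one way to load a pretrained EfficientNetB0 from timm and read off its max-softmax confidence, which is the uncertainty measure attacked in the example. It is a sketch, not code from this repository: the image path `tank.jpg` is a placeholder, and example.py handles model loading and preprocessing itself.

```python
import timm
import torch
import torch.nn.functional as F
from PIL import Image

# Pretrained EfficientNetB0 from timm, with its matching input preprocessing.
model = timm.create_model('efficientnet_b0', pretrained=True).eval()
config = timm.data.resolve_data_config({}, model=model)
transform = timm.data.create_transform(**config)

# 'tank.jpg' is a placeholder path; substitute any image you want to classify.
x = transform(Image.open('tank.jpg').convert('RGB')).unsqueeze(0)

with torch.no_grad():
    probs = F.softmax(model(x), dim=1)
conf, pred = probs.max(dim=1)  # max-softmax confidence and predicted class index
print(f'predicted class {pred.item()} with confidence {conf.item():.3f}')
```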
Requirements
To install requirements:
```
pip install -r requirements.txt
```