Fisher Information Loss

Overview

This repository contains code that can be used to reproduce the experimental results presented in the paper:

Awni Hannun, Chuan Guo, and Laurens van der Maaten. Measuring Data Leakage in Machine-Learning Models with Fisher Information. arXiv:2102.11673, 2021.

Installation

The code requires Python 3.7+, PyTorch 1.7.1+, and torchvision 0.8.2+.

Create an Anaconda environment and install the dependencies:

conda create --name fil
conda activate fil
conda install -c pytorch pytorch torchvision
pip install gitpython 

Usage

The script fisher_information.py computes the per-example Fisher information loss (FIL) for a given dataset and model. An example run:

python fisher_information.py \
    --dataset mnist \
    --model least_squares

To see the usage options for the script, run:

python fisher_information.py --help
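
Conceptually, the per-example FIL the script reports is the spectral norm of the Jacobian of the trained model parameters with respect to a single example's features and label. The following is a minimal, self-contained sketch of that computation for a ridge-regression model; it is an illustration only, not the repository's implementation, all names in it are my own, and it uses torch.linalg, so it needs a recent PyTorch:

import torch

torch.manual_seed(0)
n, d, l2 = 100, 5, 0.1
X, y = torch.randn(n, d), torch.randn(n)

def train_on(record):
    # Swap `record` in for the first example so that autograd can
    # differentiate the trained weights w.r.t. that example alone.
    Xi = torch.cat([record[:d].unsqueeze(0), X[1:]])
    yi = torch.cat([record[d:], y[1:]])
    # Closed-form ridge solution; the n * l2 ridge term mirrors the
    # `XTXdiag += n * l2` line quoted in the comments below.
    A = Xi.T @ Xi + n * l2 * torch.eye(d)
    return torch.linalg.solve(A, Xi.T @ yi)

record = torch.cat([X[0], y[:1]])  # features and label of example 0
J = torch.autograd.functional.jacobian(train_on, record)  # shape (d, d + 1)
eta = torch.linalg.svdvals(J)[0]   # spectral norm = per-example FIL
print(f"FIL of example 0: {eta.item():.4f}")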

Other scripts in the repository are:

  • reweighted.py : Run the iteratively reweighted Fisher information loss (IRFIL) algorithm.
  • model_inversion.py : Attribute inversion experiments for a non-private model.
  • private_model_inversion.py : Attribute inversion experiments for a private model.
  • test_jacobians.py : Unit tests.

To run the full set of experiments in the accompanying paper:

cd scripts/ && ./run_experiments.sh

Citing this Repository

If you use the code in this repository, please cite the following paper:

@article{hannun2021fil,
  title={Measuring Data Leakage in Machine-Learning Models with Fisher
    Information},
  author={Hannun, Awni and Guo, Chuan and van der Maaten, Laurens},
  journal={arXiv preprint arXiv:2102.11673},
  year={2021}
}

License

This code is released under a CC-BY-NC 4.0 license. Please see the LICENSE file for more information.

Please also review the Facebook Open Source Terms of Use and Privacy Policy.


Comments
  • Is `XTXdiag` used in LeastSquares?

    Sorry to bother you again; we've been studying your paper recently :)

    In the `train` method of the `LeastSquares` class, I saw the variable `XTXdiag`:

      XTXdiag = torch.diagonal(XTX)
      XTXdiag += (n * l2)
    

    According to the paper, `XTXdiag` seems to be the Hessian and should be used to calculate the Jacobian. However, I don't see `XTXdiag` used anywhere in the code. Yet if I change the value of `l2` (only `XTXdiag` uses `l2`), the output changes, which means `XTXdiag` must be used somewhere. Maybe calculating `XTXdiag` modifies `XTX` and thus `self.A`? I'm really confused, and I'd appreciate an explanation :)

    opened by charlotte12l
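
A note on the `XTXdiag` question above (my own reading of the snippet, not a maintainer reply): `torch.diagonal` returns a view that shares storage with its input, so the in-place `+=` is exactly the aliasing the asker suspects. A minimal demonstration:

import torch

# torch.diagonal returns a view, so the in-place addition mutates XTX
# itself. XTXdiag is never read again, but XTX (and anything computed
# from it, like self.A) now includes the n * l2 ridge term, which is
# why changing l2 changes the results.
XTX = torch.zeros(3, 3)
XTXdiag = torch.diagonal(XTX)
XTXdiag += 2.0
print(XTX)  # the diagonal of XTX is now 2.0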
  • Where is `load_iwpc`?

    Thank you for making your code open source! I'm wondering where `load_iwpc` is defined. Line 219 of dataloading.py has:

    from main import load_iwpc
    

    However, there is no `main` module or `load_iwpc` function :(

    opened by charlotte12l
  • Divide quadratic loss by two

    To match the description in the paper, the quadratic loss in the least-squares model should be divided by two. This change does not alter the Fisher information loss values in any way.

    CLA Signed 
    opened by lvdmaaten
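
For intuition, a hedged sketch of what the corrected objective looks like; the variable names and the exact regularization convention are my assumptions, not taken from the repository:

import torch

def quadratic_loss(X, y, w, l2):
    # Halving the data term matches the paper's convention; the gradient
    # becomes X^T (X w - y) + n * l2 * w, so the Hessian is the
    # X^T X + n * l2 * I matrix seen in the issue above.
    n = X.shape[0]
    residual = X @ w - y
    return 0.5 * (residual @ residual) + 0.5 * n * l2 * (w @ w)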
  • Fix bug in how noise scale is computed

    Per Equation 15 in the paper, the relation between the noise parameter sigma (scale) and the spectral norm of the Jacobian (eta_max) should involve division by eta, not multiplication. Simply put: to guarantee a lower FIL, you need to add noise with a larger scale.

    This corrects that mistake in the code.

    CLA Signed 
    opened by lvdmaaten
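
To make the direction of that relationship concrete, a minimal sketch based only on the description above (the symbol names are mine):

# The scale divides by the target FIL eta rather than multiplying, so a
# smaller eta (a stronger privacy guarantee) requires a larger noise scale.
def noise_scale(eta_max: float, eta: float) -> float:
    return eta_max / eta

assert noise_scale(2.0, 0.5) > noise_scale(2.0, 1.0)  # lower FIL -> more noise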