Training PyTorch models with differential privacy

Last update: Dec 29, 2022

Related tags

Overview

Opacus is a library that enables training PyTorch models with differential privacy. It supports training with minimal code changes required on the client, has little impact on training performance, and allows the client to online track the privacy budget expended at any given moment.

Target audience

This code release is aimed at two target audiences:

ML practitioners will find this to be a gentle introduction to training a model with differential privacy as it requires minimal code changes.
Differential Privacy researchers will find this easy to experiment and tinker with, allowing them to focus on what matters.

Installation

The latest release of Opacus can be installed via pip:

pip install opacus

You can also install directly from the source for the latest features (along with its quirks and potentially ocassional bugs):

git clone https://github.com/pytorch/opacus.git
cd opacus
pip install -e .

Getting started

To train your model with differential privacy, all you need to do is to instantiate a PrivacyEngine and pass your model, data_loader, and optimizer to the engine's make_private() method to obtain their private counterparts.

# define your components as usual
model = Net()
optimizer = SGD(model.parameters(), lr=0.05)
data_loader = torch.utils.data.DataLoader(dataset, batch_size=1024)

# enter PrivacyEngine
privacy_engine = PrivacyEngine()
model, optimizer, data_loader = privacy_engine.make_private(
    module=model,
    optimizer=optimizer,
    data_loader=data_loader,
    noise_multiplier=1.1,
    max_grad_norm=1.0,
)
# Now it's business as usual

The MNIST example shows an end-to-end run using opacus. The examples folder contains more such examples.

Migrating to 1.0

Opacus 1.0 introduced many improvements to the library, but also some breaking changes. If you've been using Opacus 0.x and want to update to the latest release, please use this Migration Guide

Learn more

Interactive tutorials

We've built a series of IPython-based tutorials as a gentle introduction to training models with privacy and using various Opacus features.

Blogposts and talks

If you want to learn more about DP-SGD and related topics, check our our series of blogposts and talks:

FAQ

Checkout the FAQ page for answers to some of the most frequently asked questions about Differential Privacy and Opacus.

Contributing

See the CONTRIBUTING file for how to help out. Do also check out the README files inside the repo to learn how the code is organized.

Citation

To cite Opacus in your papers (much appreciated!), please use the following:

@article{opacus,
  title={Opacus: User-Friendly Differential Privacy Library in PyTorch},
  author={A. Yousefpour and I. Shilov and A. Sablayrolles and D. Testuggine and K. Prasad and M. Malek and J. Nguyen and S. Ghosh and A. Bharadwaj and J. Zhao and G. Cormode and I. Mironov},
  journal={arXiv preprint arXiv:2109.12298},
  year={2021}
}

License

This code is released under Apache 2.0, as found in the LICENSE file.

Comments

Added PackedSequence support in DPLSTM
Types of changes

[ ] Bug fix (non-breaking change which fixes an issue)

[x] New feature (non-breaking change which adds functionality)

[ ] Breaking change (fix or feature that would cause existing functionality to change)

[ ] Docs change / refactoring / dependency upgrade

Motivation and Context / Related issue

This is a pull request in response to the issue for adding support for PackedSequence in DPLSTM. An example usage is shown below:

from opacus.layers import DPLSTM import torch seq_batch = [torch.tensor([[1, 1], [2, 2], [3, 3], [4, 4], [5, 5]]), torch.tensor([[10, 10], [20, 20]])] seq_lens = [5, 2] padded_seq_batch = torch.nn.utils.rnn.pad_sequence(seq_batch, batch_first=True) packed_seq_batch = torch.nn.utils.rnn.pack_padded_sequence(padded_seq_batch, lengths=seq_lens, batch_first=True) dp_lstm = DPLSTM(input_size=2, hidden_size=3, batch_first=True) output, (hn, cn) = dp_lstm(packed_seq_batch.float())

How Has This Been Tested (if it applies)

This has been tested analogously to the way that DPLSTM (without PackedSequence) has been tested. The outputs and parameters of DPLSTM (with PackedSequence) and LSTM (with PackedSequence) have been compared in the tests.

Checklist

[x] The documentation is up-to-date with the changes I made.

[x] I have read the CONTRIBUTING document and completed the CLA (see CONTRIBUTING).

[x] All tests passed, and additional code has been covered with new tests.

CLA Signed Merged
opened by touqir14 58
Add benchmarks to CI
Types of changes

[ ] Bug fix (non-breaking change which fixes an issue)

[X] New feature (non-breaking change which adds functionality)

[ ] Breaking change (fix or feature that would cause existing functionality to change)

[ ] Docs change / refactoring / dependency upgrade

Issue: https://github.com/pytorch/opacus/issues/368

Motivation and Context / Related issue

There's a task #368 for committing benchmark code. In this change I add these benchmarks into CI integration tests. To choose thresholds I ran the benchmarks locally on all the layers with (batch size: 16, num_runs: 100, num_repeats: 20, forward_only: False), please check the comment below for more details.

Using the report and section 3 in the paper, I parameterised the runtime and memory thresholds for different layers.

How Has This Been Tested (if it applies)

I ran the jobs locally and generated reports.

Local CircleCI config validation circleci config process .circleci/config.yml

Local CircleCI job run: circleci local execute --job JOB_NAME

Checklist

[X] The documentation is up-to-date with the changes I made.

[X] I have read the CONTRIBUTING document and completed the CLA (see CONTRIBUTING).

[x] All tests passed, and additional code has been covered with new tests.

CLA Signed
opened by moaradwan 34
Add PRVAccountant
This PR implements a new accountant PRVAccountant based on the paper Numerical Composition of Differential Privacy.

Code inspired heavily by the code that accompanied the paper: https://github.com/microsoft/prv_accountant

Types of changes

[ ] Bug fix (non-breaking change which fixes an issue)

[x] New feature (non-breaking change which adds functionality)

[ ] Breaking change (fix or feature that would cause existing functionality to change)

[ ] Docs change / refactoring / dependency upgrade

Motivation and Context / Related issue

See #378

How Has This Been Tested (if it applies)

I have tested these changes with the following scripts, but would welcome suggestions on how to test further or write unit tests to cover these changes:

validate_gaussian.py recreates this notebook, which checks that we can recover upper and lower bounds on the privacy curve of a Gaussian mechanism correctly.

prv_accountant_cifar10.py runs this tutorial from the Opacus docs with the PRVAccountant instead of RDPAccountant.

Checklist

I have not yet written docstrings or tests for these changes both as it was slightly unclear to me how best to proceed, but also because I would like to validate the approach taken in this initial implementation before polishing.

[x] The documentation is up-to-date with the changes I made.

[x] I have read the CONTRIBUTING document and completed the CLA (see CONTRIBUTING).

[x] All tests passed, and additional code has been covered with new tests.

CLA Signed
opened by tcbegley 32
proposal to handle wasserestein loss (multiple loss.backward()) in pytorch-dp
Current implementation of Pytorch-dp does not support Wasserstein Loss in GAN (not support multiple loss.backward())

issue

We are working on integrating pytorch-dp to GAN model to generate differential private synthetic data. Currently, pytorch-dp can only support a single loss.backward() before calling optimizer.step(), this will not work for Wasserstein Loss in GAN.

why important

Wasserstein Loss with gradient penalty was approved to help alleviate the issues of mode collapse that KL divergence introduced and has been used by many different variants of GAN models.

possible solutions

One temporal work-around is to update _create_or_extend_grad_sample() in supported_layers_grad_samplers.py. When doing multiple loss.backward() and not virtual step mode, instead of doing torch.cat((param.grad_sample, grad_sample, batch_dim) , making it as accumulative sum such as "param.grad_sample = param.grad_sample + grad_sample" .

current implementation `def _create_or_extend_grad_sample( param: torch.Tensor, grad_sample: torch.Tensor, batch_dim: int ) -> None: """ Create a 'grad_sample' attribute in the given parameter, or append to it if the 'grad_sample' attribute already exists. """

if hasattr(param, "grad_sample"): # pyre-fixme[16]: `Tensor` has no attribute `grad_sample`. param.grad_sample = torch.cat((param.grad_sample, grad_sample), batch_dim) else: param.grad_sample = grad_sample`

suggested implementation when not in virtual step mode def _create_or_extend_grad_sample( param: torch.Tensor, grad_sample: torch.Tensor, batch_dim: int ) -> None: """ Create a 'grad_sample' attribute in the given parameter, or append to it if the 'grad_sample' attribute already exists. """

if hasattr(param, "grad_sample"): # pyre-fixme[16]: `Tensor` has no attribute `grad_sample`. param.grad_sample = param.grad_sample + grad_sample else: param.grad_sample = grad_sample
enhancement
opened by AprilXiaoyanLiu 29
1.0 API

We're introducing new Opacus API. See Readme and updated turorials for details.

For code reviews and discussions see PR history https://github.com/pytorch/opacus/pulls?q=is%3Apr+is%3Aclosed+base%3Aexperimental_v1.0
CLA Signed

opened by ffuuugor 24
Setup and test on Python 3.6.9

Summary: We recently changed the minimal supported version of Python to 3.6 to be Google Colab friendly. We should, therefore, also ensure our tests and baselines are run on this version.

We also run unittests on Py 3.7 and 3.8. The unit tests on nightly are only run on 3.8 as we only concern ourselves with the latest versions for that.

Differential Revision: D23796762
Merged fb-exported

opened by karthikprasad 23
Support checkpoints
Types of changes

[ ] Bug fix (non-breaking change which fixes an issue)

[x] New feature (non-breaking change which adds functionality)

[ ] Breaking change (fix or feature that would cause existing functionality to change)

[ ] Docs change / refactoring / dependency upgrade

Motivation and Context / Related issue

#373

How Has This Been Tested (if it applies)

Unit tests.

Checklist

[x] The documentation is up-to-date with the changes I made.

[x] I have read the CONTRIBUTING document and completed the CLA (see CONTRIBUTING).

[x] All tests passed, and additional code has been covered with new tests.

enhancement CLA Signed
opened by karthikprasad 22
Compute privacy
Types of changes

[ ] Bug fix (non-breaking change which fixes an issue)

[X] New feature (non-breaking change which adds functionality)

[ ] Breaking change (fix or feature that would cause existing functionality to change)

[ ] Docs change / refactoring / dependency upgrade

Motivation and Context / Related issue

Command-line script for computing privacy of a model trained with DP-SGD.

Based on Google's TF Privacy: https://github.com/tensorflow/privacy/blob/master/tensorflow_privacy/privacy/analysis/compute_dp_sgd_privacy.py

Checklist

[X] The documentation is up-to-date with the changes I made. Just one line added in README.md

[X] I have read the CONTRIBUTING document and completed the CLA (see CONTRIBUTING).

[x] All tests passed, and additional code has been covered with new tests. Only one "test": default values are taken from TF example and result is the same. (Not so surprising since the code is quite the same, except main() argument managing, in the only one new file added.)

Thank you for considering this request. Sincerely.
CLA Signed Merged
opened by jmg-74 21
Alternative grad sample algorithm for Conv

Implementation of convolution backward with a convolution. Original implementation due to @zou3519 (https://gist.github.com/zou3519/080f3a296f190ea1730d97396d5267d6).

The original code has been extended to handle the general case (i.e., groups, dilation and stride).

There is still one minor problem that I couldn't find a nice solution to: in some cases, the backward will produce a grad sample that is slightly bigger than the correct one (e.g. kernel of size 3 with stride 2 and input of size 6). The current solution is to just ignore the last dimensions (line 52 in grad_sample/conv.py)
CLA Signed

opened by alexandresablayrolles 19
DDP support for faster distributed training
This PR starts to add support for PyTorch DDP as proposed in https://github.com/pytorch/opacus/issues/191 (cc @thomasflynn918 @aik7). We ported our code to a more recent version of Opacus, but there are still some changes to discuss, in particular how to plug the DDP hook into GradSample module.

I also fixed some bugs related to distributed training (see below) in the same PR. I'm happy to open separate PRs to merge these smaller changes separately if you prefer.

Types of changes

[x] Bug fix (non-breaking change which fixes an issue)

[x] New feature (non-breaking change which adds functionality)

[ ] Breaking change (fix or feature that would cause existing functionality to change)

[ ] Docs change / refactoring / dependency upgrade

Motivation and Context / Related issue

More context in this issue: https://github.com/pytorch/opacus/issues/191.

This PR:

Added a prototype for the DDP hook in grad_sample_module.py

Added distributed Poisson sampling

Updated the CIFAR10 end-to-end example (with Slurm cluster support)

Fixed bugs in the tests and examples where all the GPUs were working on the same data

Fixed bugs in the naive distributed training implementation (missing initial synchronization, attempt to do allreduce on parameters without gradients)

Fixed bug where DPDDP was not detected, because the privacy engine was checking self.privacy_engine.module instead of looking inside the GradSample module self.privacy_engine.module._module.

How Has This Been Tested (if it applies)

Added tests to compare the weights with/without DDP hook

Checklist

[ ] The documentation is up-to-date with the changes I made. (this is still work in progress, I'll update the documentation once we agree on the API)

[x] I have read the CONTRIBUTING document and completed the CLA (see CONTRIBUTING).

[x] All tests passed, and additional code has been covered with new tests.

TODO

[x] Validate how the hook is plugged into the GradSample module

[x] Add support for empty Poisson batches in the hook

[x] Ensure that the privacy accounting is still accurate with DDP

[x] Generate the noise only on the GPU that needs it, to save some time (especially with CSPRNG)

[x] Support per-layer clipping with different thresholds for each layer

CLA Signed Merged
opened by tholop 18
Make DPMultiheadAttention drop-in compatible with nn.MultiheadAttention

Summary: This PR is target to resolve #123 on GitHub by having an additional re-naming mechanism to match the state_dict structure of nn.MultiheadAttention.

Differential Revision: D40671870
CLA Signed fb-exported

opened by Wei-1 17

cannot install opacus with torch(+gpu)

Upon calling pip install opacus pip is trying to delete my installed torch(GPU) version and install a CPU version instead.

To Reproduce

:warning: We cannot help you without you sharing reproducible code. Do not ignore this part :) Steps to reproduce the behavior:

install pytorch 1.13.1: pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu117
install opacus: pip install opacus

This is what I got:

Collecting opacus
  Downloading opacus-1.3.0-py3-none-any.whl (216 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 216.9/216.9 kB 1.5 MB/s eta 0:00:00
Collecting functorch
  Using cached functorch-1.13.0-py2.py3-none-any.whl (2.1 kB)
Requirement already satisfied: torch>=1.8 in /home/gilad/venv_py37_new/lib/python3.7/site-packages (from opacus) (1.13.1+cu117)
Requirement already satisfied: scipy>=1.2 in /home/gilad/venv_py37_new/lib/python3.7/site-packages (from opacus) (1.7.3)
Requirement already satisfied: numpy>=1.15 in /home/gilad/venv_py37_new/lib/python3.7/site-packages (from opacus) (1.19.5)
Requirement already satisfied: opt-einsum>=3.3.0 in /home/gilad/venv_py37_new/lib/python3.7/site-packages (from opacus) (3.3.0)
Requirement already satisfied: typing-extensions in /home/gilad/venv_py37_new/lib/python3.7/site-packages (from torch>=1.8->opacus) (4.0.1)
Collecting torch>=1.8
  Downloading torch-1.13.0-cp37-cp37m-manylinux1_x86_64.whl (890.2 MB)

It seems that functorch is forcing a non cuda pytorch version. I experienced the same behavior also when I tried to install opacus for torch==1.12.1+cu113.

I am using python vertion 3.7.9

Environment

(venv_py37_new) gilad@Yoshua:~/workspace$ python collect_env.py
Collecting environment information...
PyTorch version: 1.13.1+cu117
Is debug build: False
CUDA used to build PyTorch: 11.7
ROCM used to build PyTorch: N/A

OS: Ubuntu 18.04.5 LTS (x86_64)
GCC version: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Clang version: Could not collect
CMake version: Could not collect
Libc version: glibc-2.26

Python version: 3.7.9 (default, Aug 18 2020, 06:22:45)  [GCC 7.5.0] (64-bit runtime)
Python platform: Linux-5.4.0-65-generic-x86_64-with-Ubuntu-18.04-bionic
Is CUDA available: True
CUDA runtime version: 11.1.74
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: 
GPU 0: NVIDIA GeForce RTX 2080 Ti
GPU 1: NVIDIA GeForce RTX 2080 Ti
GPU 2: NVIDIA GeForce RTX 2080 Ti
GPU 3: NVIDIA GeForce RTX 2080 Ti
GPU 4: NVIDIA GeForce RTX 2080 Ti
GPU 5: NVIDIA GeForce RTX 2080 Ti
GPU 6: NVIDIA GeForce RTX 2080 Ti
GPU 7: NVIDIA GeForce RTX 2080 Ti

Nvidia driver version: 525.60.11
cuDNN version: Probably one of the following:
/usr/lib/x86_64-linux-gnu/libcudnn.so.7.6.5
/usr/lib/x86_64-linux-gnu/libcudnn.so.8.0.5
/usr/lib/x86_64-linux-gnu/libcudnn_adv_infer.so.8.0.5
/usr/lib/x86_64-linux-gnu/libcudnn_adv_train.so.8.0.5
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_infer.so.8.0.5
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_train.so.8.0.5
/usr/lib/x86_64-linux-gnu/libcudnn_ops_infer.so.8.0.5
/usr/lib/x86_64-linux-gnu/libcudnn_ops_train.so.8.0.5
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

Versions of relevant libraries:
[pip3] geotorch==0.3.0
[pip3] numpy==1.19.5
[pip3] pytorch-ignite==0.4.8
[pip3] torch==1.13.1+cu117
[pip3] torchaudio==0.13.1+cu117
[pip3] torchdiffeq==0.2.2
[pip3] torchsummary==1.5.1
[pip3] torchvision==0.14.1+cu117
[conda] Could not collect

opened by giladcohen 2

Update FAQ to meet current api
Update hint for fixing IncompatibleModuleException from deprecated opacus.utils.module_modification.convert_batchnorm_modules to ModuleValidator.fix()

Types of changes

[ ] Bug fix (non-breaking change which fixes an issue)

[ ] New feature (non-breaking change which adds functionality)

[ ] Breaking change (fix or feature that would cause existing functionality to change)

[x] Docs change / refactoring / dependency upgrade

Motivation and Context / Related issue

outdated FAQ

https://discuss.pytorch.org/t/convert-batchnorm-modules-does-not-exist/157156/3

How Has This Been Tested (if it applies)

Checklist

[x] The documentation is up-to-date with the changes I made.

[x] I have read the CONTRIBUTING document and completed the CLA (see CONTRIBUTING).

[ ] All tests passed, and additional code has been covered with new tests.

CLA Signed
opened by MarcinMisiurewicz 3
AttributeError: 'Parameter' object has no attribute 'grad_sample' in Projected GAN
🐛 Bug

When I applied opacus to the code of the Projected GAN, I had this problem: “AttributeError: 'Parameter' object has no attribute 'grad_sample'”.I've replaced batch_norm with group_norm for the discriminator module, but the error persists.this is the trace:

File "train.py", line 267, in main() # pylint: disable=no-value-for-parameter File "/mnt/LJH/wd/.conda/envs/new/lib/python3.8/site-packages/click/core.py", line 1128, in call return self.main(*args, **kwargs) File "/mnt/LJH/wd/.conda/envs/new/lib/python3.8/site-packages/click/core.py", line 1053, in main rv = self.invoke(ctx) File "/mnt/LJH/wd/.conda/envs/new/lib/python3.8/site-packages/click/core.py", line 1395, in invoke return ctx.invoke(self.callback, **ctx.params) File "/mnt/LJH/wd/.conda/envs/new/lib/python3.8/site-packages/click/core.py", line 754, in invoke return __callback(*args, **kwargs) File "train.py", line 253, in main launch_training(c=c, desc=desc, outdir=opts.outdir, dry_run=opts.dry_run) File "train.py", line 101, in launch_training subprocess_fn(rank=0, c=c, temp_dir=temp_dir) File "train.py", line 47, in subprocess_fn training_loop.training_loop(rank=rank, **c) File "/mnt/LJH/wd/test/training/training_loop.py", line 410, in training_loop loss.accumulate_gradients(phase=phase.name, real_img=real_img, real_c=real_c, gen_z=gen_z, gen_c=gen_c, gain=phase.interval, cur_nimg=cur_nimg) File "/mnt/LJH/wd/test/training/loss.py", line 86, in accumulate_gradients loss_Dgen.backward() File "/mnt/LJH/wd/.conda/envs/new/lib/python3.8/site-packages/torch/_tensor.py", line 396, in backward torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs) File "/mnt/LJH/wd/.conda/envs/new/lib/python3.8/site-packages/torch/autograd/init.py", line 173, in backward Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass File "/mnt/LJH/wd/.conda/envs/new/lib/python3.8/site-packages/opacus/privacy_engine.py", line 71, in forbid_accumulation_hook if p.grad_sample is not None: AttributeError: 'Parameter' object has no attribute 'grad_sample'

Since this code is a bit complicated, I will explain it here for your convenience. The original ProjectedGAN contained a generator and a Projected Discriminator. Here Projected Discriminator contains a feature_network (pre-trained network, CCM, and CSM) and four discriminators that provide image features to the discriminators, feature_network will not be updated by training, but only four discriminators. Instead of splitting the structure and loss function of the original multiple discriminator mergers, I directly processed each discriminator using opacus, and when trained on Projected Discriminator, optimized using four opts returned by opacus. These are all changed in the training loop file ... Because the original code of ProjectedGAN needs to be serialized, but the wrap_collate_with_empty() in the data_loader file of opacus contains another function：collate(batch),so I made some small changes to the code to change the case of the function closure.You need to replace data_loader files in the original opacus package

To Reproduce

1.Use data_loader file in github to replace the origin data_loader file of opacus 2.python dataset_tool.py --source=./data --dest=./data/beauty256.zip --resolution=256x256 --transform=center-crop 3.python train.py --outdir=./training-runs/ --cfg=fastgan --data=./data/beauty256.zip --gpus=1 --batch=32 --mirror=1 --snap=50 --batch-gpu=16 --kimg=600

github repository

Sorry, I tried to reproduce the code on colab, but probably because I haven't used colab before and got some errors. here is my code: https://github.com/sword-king1/ProjectedGAN

Environment

Python version: 3.8(64-bit runtime)

Is CUDA available: True

PyTorch Version (e.g., 1.0): 1.12.1

OS (e.g., Linux) : Linux

CUDA/cuDNN version : 11.6

GPU models and configuration:GPU 0：RTX 8000

opacus==1.3.0

Any other relevant information: The environment configuration file ：“environmentpg.yml”is placed in GitHub

Additional context

The environment configuration file ：environmentpg.yml is placed in GitHub
opened by sword-king1 1

Support empty batches for arbitrary dataset structures

For context see discussion in #530 (and thanks @joserapa98 for pointing out the issue)

At the moment (to be precise, after #530 will have been merged) Opacus can support empty batches only for datasets with a simple structure - every record should be a tuple of a simple type: either tensor or a primitive type.

For instance, datasets with records like this (Tensor, int) or this (Tensor, Tensor) are supported. However datasets like this (Tensor, (int, int)) are not.

Pytorch adresses similar problem with the following piece of code:

if isinstance(elem, collections.abc.Mapping):
    try:
        return elem_type({key: collate([d[key] for d in batch], collate_fn_map=collate_fn_map) for key in elem})
    except TypeError:
        # The mapping type may not support `__init__(iterable)`.
        return {key: collate([d[key] for d in batch], collate_fn_map=collate_fn_map) for key in elem}
elif isinstance(elem, tuple) and hasattr(elem, '_fields'):  # namedtuple
    return elem_type(*(collate(samples, collate_fn_map=collate_fn_map) for samples in zip(*batch)))
elif isinstance(elem, collections.abc.Sequence):
    # check to make sure that the elements in batch have consistent size
    it = iter(batch)
    elem_size = len(next(it))
    if not all(len(elem) == elem_size for elem in it):
        raise RuntimeError('each element in list of batch should be of equal size')
    transposed = list(zip(*batch))  # It may be accessed twice, so we use a list.


    if isinstance(elem, tuple):
        return [collate(samples, collate_fn_map=collate_fn_map) for samples in transposed]  # Backwards compatibility.
    else:
        try:
            return elem_type([collate(samples, collate_fn_map=collate_fn_map) for samples in transposed])
        except TypeError:
            # The sequence type may not support `__init__(iterable)` (e.g., `range`).
            return [collate(samples, collate_fn_map=collate_fn_map) for samples in transposed]

We need to adapt it to our needs and make sure DPDataLoader can handle datasets of arbitrary structure.

Relevant code pointer: https://github.com/pytorch/opacus/blob/7393ae47fdf824ad65d5035461dc391c0f4cc932/opacus/data_loader.py#L31

opened by ffuuugor 0

Per sample grad correctness util
Types of changes

[ ] Bug fix (non-breaking change which fixes an issue)

[x] New feature (non-breaking change which adds functionality)

[ ] Breaking change (fix or feature that would cause existing functionality to change)

[x] Docs change / refactoring / dependency upgrade

Motivation and Context / Related issue

Implementation of the utility described in https://github.com/pytorch/opacus/issues/484.

Refactored the code to avoid code duplicates.

How Has This Been Tested (if it applies)

Added the new utility as a test case for existing tests stored in tests.grad_samples.

Checklist

[x] The documentation is up-to-date with the changes I made.

[x] I have read the CONTRIBUTING document and completed the CLA (see CONTRIBUTING).

[x] All tests passed, and additional code has been covered with new tests.

CLA Signed
opened by psolikov 19

Opacus: I made a model-agnostic callback for PyTorch Lightning

🚀 Feature

I could get Opacus to work with PyTorch Lightning (pl) using a pl.Callback. Note that the callback is model-agnostic, and the model's pl.LightningModule class does not have anything related to Opacus.

Motivation

We need an easy way for PyTorch Lightning users to use Opacus without them having to refactor their LightningModule classes. See below.

Pitch

We need something as follows: (I could actually implement this for real but it only works only for models with one optimizer)

import pytorch_lightning as pl
from opacus import OpacusCallback
from pl_bolts.models import LitMNIST
from pl_bolts.datamodules import MNISTDataModule

trainer = pl.Trainer(
        callbacks=[
            OpacusCallback(...), # all that is needed for DP-training
        ],
    )
trainer.fit(model=LitMNIST(...), datamodule=MNISTDataModule(...))

Additional context

In my version for OpacusCallback, all I do is call .make_private in the on_train_epoch_start hook:

# --- pseudo code --- #

def on_train_epoch_start(
        self,
        trainer: pl.Trainer,
        pl_module: pl.LightningModule,
    ) -> None:
    optimizers: ty.List[Optimizer] = []
        # for loop: begin
        for i in range(len(trainer.optimizers)):
            optimizer = trainer.optimizers[i]
           # this works
            _, dp_optimizer, _ = self.privacy_engine.make_private( # or make_private_with_epsilon
                    module=pl_module,
                    optimizer=optimizer,
                    data_loader=trainer._data_connector._train_dataloader_source.dataloader(),
                    noise_multiplier=self.noise_multiplier,
                    max_grad_norm=self.max_grad_norm,
                    clipping=self.clipping,  # "flat" or "per_layer" or "adaptive"
                    poisson_sampling=self.poisson_sampling,
                )
            optimizers.append(dp_optimizer)
        ### this will fail
        #  if not hasattr(pl_module, "autograd_grad_sample_hooks"):
        #         pl_module = GradSampleModule(pl_module)
        # dp_optimizer = privacy_engine._prepare_optimizer(
        #         optimizer,
        #         noise_multiplier=self.noise_multiplier,
        #         max_grad_norm=self.max_grad_norm,
        #         expected_batch_size=expected_batch_size,
        #    )
        # for loop: end
    trainer.optimizers = optimizers

What's cool is that this is an EarlyStopping callback, so it will stop training when enough privacy budget has been spent.

opened by gianmarcoaversanoenx 0

Releases(v1.3)

v1.3(Nov 14, 2022)
New features

Implement the PRVAccountant based on the paper Numerical Composition of Differential Privacy (#493)

Support nn.EmbeddingBag (#519)

Bug fixes

Fix benchmarks (#503, #507, #508)

Align make_private_with_epsilon with make_private (#509, #526)

Test fixes (#513, #515, #527, #533)

Summed discriminator losses to perform one backprop step (#474)

Fixed issue with missing argument in MNIST example (#520)

Functorch gradients: investigation and fix (#510)

Support empty batches (#530)

Source code(tar.gz)
Source code(zip)
v1.2.0(Sep 9, 2022)
We're glad to present Opacus v1.2, which contains some major updates to per sample gradient computation mechanisms and includes all the good stuff from the recent PyTorch releases.

Highlights

Functorch - per sample gradients for all

With the recent release of functorch it's now easy to compute per sample gradients for any module, without any limitations we've had to set before.

Here's the new default behaviour:

First, we check if the input module contains any layers known to be incompatible with the DP-SGD (e.g. BatchNorm). Note, that these restrictions are fundamental to how DP-SGD works and will always be revelant

Then, for each layer we select a method of computing per sample gradients. For performance reasons, we still use old manually written grad samplers for the layers we support and fall back to the generic functorch-based grad sampler for all other layers.

You can also force functorch-based grad sampler for every layer by passing grad_sample_mode="functorch" to PrivacyEngine.make_private() or force_functorch=False to GradSampleModule's constructor.

If you're using functorch for your training pipeline already, consider using GradSampleModuleNoOp (grad_sample_mode="no_op") . As suggested by the name, is performs no action and expects client to compute per sample gradients themselves. See our CIFAR-10 example for code demonstration.

Note, that this functionality is still in beta and we haven't fully explored it's limitations. Please report any weird behaviour or inconsistencies you encounter to out github issues, we greatly appreciate the feedback.

ExpandedWeights - yet another way to compute per sample gradients

One more exciting feature now available in core PyTorch is ExpandedWeights. This feature uses old Opacus' approach of manually-written vectorized per sample gradient computations, but achieves much better performance.

To activate ExpandedWeights pass grad_sample_mode="ew" to PrivacyEngine.make_private() or use GradSampleModuleExpandedWeights

Summary: 3 different ways to compute per sample gradients

With the recent updates, Opacus now supports 3 different ways to compute per sample gradients. Below is the quick comparison. For more details refer to the grad sample README.md

TL;DR: If you want stable implementation, use GradSampleModule (grad_sample_mode="hooks"). If you want to experiment with the new functionality, you have two options. Try GradSampleModuleExpandedWeights(grad_sample_mode="ew") for better performance and grad_sample_mode=functorch if your model is not supported by GradSampleModule.

Please switch back to GradSampleModule(grad_sample_mode="hooks") if you encounter strange errors or unexpexted behaviour. We'd also appreciate it if you report these to us

| xxx | Hooks | Expanded Weights | Functorch | |:----------------------------:|:-------------------------------:|:----------------:|:------------:| | Required PyTorch version | 1.8+ | 1.13+ | 1.12 (to be updated) | | Development status | Underlying mechanism deprecated | Beta | Beta | | Runtime Performance† | baseline | ✅ ~25% faster | 🟨 0-50% slower | | Any DP-allowed†† layers | Not supported | Not supported | ✅ Supported | | Most popular nn.* layers | ✅ Supported | ✅ Supported | ✅ Supported | | torchscripted models | Not supported | ✅ Supported | Not supported | | Client-provided grad sampler | ✅ Supported | Not supported | ✅ Not needed | | batch_first=False | ✅ Supported | Not supported | ✅ Supported | | Recurrent networks | ✅ Supported | Not supported | ✅ Supported | | Padding same in Conv | ✅ Supported | Not supported | ✅ Supported |

† Note, that performance differences are unstable and can vary a lot depending on the exact model and batch size. Numbers above are averaged over benchmarks with small models consisting of convolutional and linear layers. Note, that performance differences are only observed on GPU training, CPU performance seem to be almost identical for all approaches.

†† Layers that produce joint computations on batch samples (e.g. BatchNorm) are not allowed under any approach

Other improvements

Fix utils.unfold2d with non-symmetric pad/dilation/kernel_size/stride (#443)

Add support for "same" and "valid" padding for hooks-based grad sampler for convolution layers

Improve model validation to support frozen layers and catch copied parameters (#489)

Remove annoying logging from set_to_none (#471)

Improved documentation (#480, #478, #482, #485, #486, #487, #488)

Imtegration test improvements (#407, #479, #481. #473)

Source code(tar.gz)
Source code(zip)
v1.1.3(Jul 13, 2022)
Improvements

Checkpoint support (#429)

Support for layers with mix of frozen and trainable params (#437)

Optimized einsum (#440)

Improved parameter sanity check (#439)

Bug Fixes

Fix unfold2d (#443)

Switch CI to latest PyTorch version (#434)

Typos and editing (#430, #438, #449)

Misc

Tutorials on distributed training (#428)

Source code(tar.gz)
Source code(zip)
v1.1.2(May 6, 2022)
Bug fixes

Support tied parameters (#417)

Fix callsite sensitiveness of zero_grad() (#422, #423)

Improve microbenchmark argument parsing and tests (#425)

Fix opacus nn.functional import (#426)

Miscellaneous

Add microbenchmarks (#412, #416)

Add more badges to readme (#424)

Source code(tar.gz)
Source code(zip)
v1.1.1(Apr 8, 2022)
Bug fixes

Fix accountant when using number of steps instead of epochs

Add params check when converting BatchNorm to GroupNorm (#390)

Fix typo in gdp accountant mechansim name (#386)

Fix linter errors (#392)

Add friendly and detailed message for unsupported layers (#401)

Run linter on nightly workflow (#399)

Add warning for Gaussian DP accounting (#400)

Clone replacement modules on the same device as original (#356)

Implementing 3D dilation (#408)

fix(batch_memory_manager): Ensures split_idxs use native python types (#410)

Miscellaneous

Migrate nightly CircleCI flows to scheduled pipelines (#402)

Migrate from ubuntu 16.04 to 20.04 on CircleCI (#403)

Source code(tar.gz)
Source code(zip)
v.1.1.0(Mar 15, 2022)
v1.1.0

New Feature

Add support for GDP accounting in get_noise_multiplier (#303)

Bug fixes

Conservative search for target epsilon in get_noise_multiplier (#348)

Warn and ignore "drop_last" when set in DPDataLoader (#357)

Fix per-layer clipping in distributed (#347)

Miscellaneous

Update code of conduct and file headers

Add "Support Ukraine" banner to opacus website homepage

Lint fixes

Source code(tar.gz)
Source code(zip)
v1.0.2(Feb 9, 2022)
Bug fixes

DPOptimizer

Passes through .defaults field to match pytorch Optimizer (#329)

Better exception message in .step() when p.grad_sample=None (#331)

Correct closure call after applying DP noise (#330)

Proper gradient scaling in DDP mode

Corrections of typos and errors in tutorials

Miscellaneous

Opacus can be installed with conda: added recipe in conda-forge (#326)

Formatting change in accordance with black-22.1.0

Source code(tar.gz)
Source code(zip)
v1.0.1(Jan 4, 2022)
Bug fixes

Hidden states of RNN is passed to device (#314)

Validate and fix trainable modules only (#316)

Miscellaneous

Minor corrections and typo fixes in links, documentation, and tutorials.

Source code(tar.gz)
Source code(zip)

v1.0.0(Dec 1, 2021)

We are excited to announce the release of Opacus 1.0. This release packs in lot of new features and bug fixes, and most importantly, brings forth new APIs that are simpler, more modular, and easily extensible.

We have bumped up the major version number from 0 to 1 and have introduced breaking changes; although, the major version bump also indicates a step-function upgrade in the capabilities.

What's new?

With this release we're introducing a slightly different approach to the user-facing library API. While heavily based on the old API, updated API better represents abstractions and algorithms used in DP in ML, enabling private training exactly as it's described in the papers, with no assumptions or simplifications. And in doing so we maintain our focus on high performance training.

Clearer semantics

Previously, PrivacyEngine accepted model as an argument, and then needed to be explicitly attached to optimizer. While simple, it wasn't very clear. The new syntax brings abundant clarity with an explicit make_private() method.

Opacus 0.x Opacus 1.0

Opacus 0.x	Opacus 1.0
`privacy_engine = PrivacyEngine( model, sample_rate=0.01, alphas=[10, 100], noise_multiplier=1.3, max_grad_norm=1.0, ) privacy_engine.attach(optimizer)`	`privacy_engine = PrivacyEngine() model, optimizer, data_loader = privacy_engine.make_private( module=model, optimizer=optimizer, data_loader=data_loader, noise_multiplier=1.1, max_grad_norm=1.0, )`

privacy_engine = PrivacyEngine(
    model,
    sample_rate=0.01,
    alphas=[10, 100],
    noise_multiplier=1.3,
    max_grad_norm=1.0,
)
privacy_engine.attach(optimizer)

privacy_engine = PrivacyEngine()
model, optimizer, data_loader = privacy_engine.make_private(
    module=model,
    optimizer=optimizer,
    data_loader=data_loader,
    noise_multiplier=1.1,
    max_grad_norm=1.0,
)

To avoid mutually exclusive method parameters, we're now providing separate method to initialize training loop if epsilon is to be provided instead of noise_multiplier

model, optimizer, data_loader = privacy_engine.make_private_with_epsilon(
    module=model,
    optimizer=optimizer,
    data_loader=data_loader,
    epochs=EPOCHS,
    target_epsilon=EPSILON,
    target_delta=DELTA,
    max_grad_norm=MAX_GRAD_NORM,
)

Increased focus on data handling

You might have noticed that we are now passing data loader to make_private in addition to module and optimizer. This is intentional. Batch sampling is an important component of DP-SGD (e.g. privacy accounting relies on amplification by sampling) and Poisson sampling is quite tricky to get right, so now Opacus takes control of three PyTorch training objects: model, optimizer, and data loader.

More modularised components

This release makes more functionalities modular, allowing for easy extensibility, while embracing cleaner semantics:

model is wrapped with GradSampleModule, which computes per sample gradients.
optimizer is wrapped with DPOptimizer, which does gradient clipping and noise addition.
data loader is transformed to a DPDataLoader, which performs uniform-with-replacement batch sampling, as required by privacy accountant.
Module validation and fix follows the same pattern as GradSampleModule resulting in compartmentalized validation code that is easily extensible and over-rideable.

Privacy analysis

Privacy analysis functions are now promoted into an Accounant class allowing for a more generic API. This has already allowed us to implement two accountants: RDP (default and recommended one) and Gaussian DP accountant; and will enable you to add more without having to worry about messing with the core library.

- eps, alpha = privacy_engine.get_privacy_spent(delta=target_delta)
+ eps = privacy_engine.get_epsilon(delta=target_delta)

Working around device memory

Training with Opacus consumes more memory as it needs to keep track of per-sample gradients. Opacus 0.x featured the concept of virtual steps - you could decouple the logical batch size (that defined how often model weights are updated and how much DP noise is added) and physical batch size (that defined the maximum physical batch size processed by the model at any one time). While the concept is extremely useful, it suffers from serious flaws when used with Poisson sampling. Opacus 1.0 introduces a BatchMemoryManager for your dataloader, which takes care of the logical and physical batch sizes internally.

Dynamic privacy parameters

Opacus now supports changes to the privacy parameters during training, and adjusts the privacy accounting accordingly. Use various schedulers provided in opacus.scheduler module to adjust the amount of noise during training (the implementation mimics the interface of lr_schedulers). For all the other parameters Opacus supports subsequent calls to make_private method, while maintaining consistent privacy accounting.

Designed to be extensible

Opacus 1.0 is designed to be flexible and extensible.

GradSampleModule supports user-provided grad samplers for custom modules.
DPOptimizer can easily be extended with additional or alternative functionality.
Support for user-provided privacy accountants via optimizer hooks.
Support for custom model validation rules.

PEP 3102

Almost all functions are now PEP 3102 compliant; meaning they only accept keyword arguments. You no longer have to memorize or be confused by the position of the arguments to be passed to a functions. This also makes the API future proof as adding non-default arguments becomes easier.

Lightning Support

Now you can add DP training to PyTorch Ligthning code. The lightning framework allows you to make the code cleaner and avoid boilerplate; simply add make_private call to configure_optimizers() method of your LightningModel. A Lightning version of MNIST task is available as a guide at examples/mnist_lightning.py.

Tutorials

We have updated all the existing tutorials and also added some new tutorials to aid migration. While the changes to the library has been significant, we expect user facing changes to be minimal and simple. Please feel free to reach out to us on our forum if you need help.

New features and bug fixes

We have also added new features and fixed some bugs along the way. Some of the notable ones are:

Robustness against floating point attacks (#260)
Fixing weird einsum behaviour (#242)
Revival of compute privacy script (#251)
Faster unfolding in Conv grad_sampler (#256)
batch_first support for SequenceBias layer (#274)

Source code(tar.gz)
Source code(zip)

v0.15.0(Nov 25, 2021)
New Features

DDP support for faster distributed training (#196)

Support of GRU and RNN. Refactored LSTM implementation. (#222)

PyTorch Lightning Demo (#244)

Bug fixes

Improve nn.Linear grad sampler memory consumption (#192)

Update Opacus to stop using deprecated torch.set_deterministic (#197)

Fix optimizer.step after engine.detach()

Test fixes

Miscellaneous

Better validation error reporting (#199)

grad sampler type checking (#241)

Source code(tar.gz)
Source code(zip)
v0.14.0(Jun 23, 2021)
New features

Major refactoring - per-sample gradient computation is separated into its own module - GradSampleModule (#175)

Improved RDP to (eps, delta)-DP conversion (#162)

Multi-GPU support (#166)

Bug fixes

Handle empty batches in Poisson sampling (#164)

Fixed memory leak from no_grad execution (#180)

Source code(tar.gz)
Source code(zip)
v0.13.0(Mar 10, 2021)
v0.13.0

New features

PackedSequence support for DPLSTM (#150) (thanks @touqir14 !)

Miscellaneous

Pytest moved to dev installation (#144)

Source code(tar.gz)
Source code(zip)
v0.12.0(Mar 3, 2021)
v0.12.0

This version introduces a mildly-breaking change: the privacy engine will now support sampling with variable batch size, just like in the Abadi et al. paper. To accommodate this feature, we have made batch_size a kwarg (no longer positional). We are also enforcing that all kwargs must not be specified positionally. If you had code that passed kwargs positionally, you will find an error (which will be very simple to fix).

New features

Enforce kwargs to Privacy Engine (#136).

Fix batch construction and privacy engine (#128). (thanks @ConstanceBeguier!)

Compute required sigma to reach (epsilon, delta) budget (#126)

Friendly user message for unused parameters (#118).

Print helpful message when models are not in train mode (#113)

Bug fixes

Now the Opacus package has a __version__ attribute.

Fix immer security issue, fix website errors

Updated setup.py version requirements to support 3.6.8 for Windows (#108) (thanks @madhavajay!)

Miscellaneous

Rewrote the grad_sample tests to use Hypothesis (#125). (thanks @touqir14!)

Source code(tar.gz)
Source code(zip)
v0.11.0(Dec 17, 2020)
v0.11.0

New features

Extend DPLSTM to support multilayer, dropout (#101)

Modifications to Char LSTM name classification example

Introduce issue templates for GitHub (#102)

Added support for Conv3D layers

Bug fixes

Linter fixes for Conv3D (#105)

Miscellaneous

Make TorchCSPRNG an optional dependency (#106)

Removed unnecessary calls to zero_grad from examples and tutorials (#96)

Source code(tar.gz)
Source code(zip)
v0.10.1(Nov 21, 2020)
v0.10.1

Bug fixes

Fix PyPI deployment (#91).

Miscellaneous

Refactor grad sample tests (#90).

Avoid storing activations in certain scenarios (#87)

Source code(tar.gz)
Source code(zip)
v0.10.0(Nov 6, 2020)
v0.10.0

New features

Reimplemented the Embedding layer, making it 9x faster with lower memory footprint (#73).

Reimplemented the DPLSTM layer, making it 2x faster with lower memory footprint.

Extended our Conv support to grouped convolutions (#78).

Bug fixes

Small fixes to clipping logic (#45).

Miscellaneous

Changed docstring style from numpy -> Google.

Throw an error if sample rate > 1 in privacy engine.

Migrated our IMDB example from TorchText -> HuggingFace (#85).

Added PRNG shuffling to our examples.

Source code(tar.gz)
Source code(zip)
v0.9.1(Sep 21, 2020)
Bug fixes

Compatibility with Python 3.6 (Minimum required version changed from 3.7 to 3.6.9).

Allow DP-LSTM to have null init.

Source code(tar.gz)
Source code(zip)
v0.9.0(Aug 31, 2020)

With this release, we get out of beta and rename from pytorch-dp to Opacus :)
Source code(tar.gz)
Source code(zip)
opacus-0.9.0-py3-none-any.whl(74.71 KB)
opacus-0.9.0.tar.gz(54.24 KB)
v0.1-beta.1(May 20, 2020)
New Features

Initial commit

Add per-sample gradient support for multiple ops

Expand unit test coverage

Add experimental clipping strategies

Add script for privacy computation

Source code(tar.gz)
Source code(zip)
pytorch-dp-0.1b1.tar.gz(36.17 KB)
pytorch_dp-0.1b1-py3-none-any.whl(49.30 KB)