Differential Privacy for Heterogeneous Federated Learning : Utility & Privacy tradeoffs

Overview

Differential Privacy for Heterogeneous Federated Learning : Utility & Privacy tradeoffs

In this work, we propose an algorithm DP-SCAFFOLD(-warm), which is a new version of the so-called SCAFFOLD algorithm ( warm version : wise initialisation of parameters), to tackle heterogeneity issues under mathematical privacy constraints known as Differential Privacy (DP) in a federated learning framework. Using fine results of DP theory, we have succeeded in establishing both privacy and utility guarantees, which show the superiority of DP-SCAFFOLD over the naive algorithm DP-FedAvg. We here provide numerical experiments that confirm our analysis and prove the significance of gains of DP-SCAFFOLD especially when the number of local updates or the level of heterogeneity between users grows.

Two datasets are studied:

  • a real-world dataset called Femnist (an extended version of EMNIST dataset for federated learning), which you see the Accuracy growing with the number of communication rounds (50 local updates first and then 100 local updates)

image_femnist image_femnist

  • synthetic data called Logistic for logistic regression models, which you see the train loss decreasing with the number of communication rounds (50 local updates first and then 100 local updates),

image_logistic image_logistic

Significant results are available for both of these datasets for logistic regression models.

Structure of the code

  • main.py: four global options are available.
    • generate: to generate data, introduce heterogeneity, split data between users for federated learning and preprocess data
    • optimum (after generate): to run a phase training with unsplitted data and save the "best" empirical model in a centralized setting to properly compare rates of convergence
    • simulation (after generate and optimum): to run several simulations of federated learning and save the results (accuracy, loss...)
    • plot (after simulation): to plot visuals

./data

Contains generators of synthetic (Logistic) and real-world (Femnist) data ( file data_generator.py), designed for a federated learning framework under some similarity parameter. Each folder contains a file data where the generated data (train and test) is stored.

./flearn

  • differential_privacy : contains code to apply Gaussian mechanism (designed to add differential privacy to mini-batch stochastic gradients)
  • optimizers : contains the optimization framework for each algorithm (adaptation of stochastic gradient descent)
  • servers : contains the super class Server (in server_base.py) which is adapted to FedAvg and SCAFFOLD (algorithm from the point of view of the server)
  • trainmodel : contains the learning model structures
  • users : contains the super class User (in user_base.py) which is adapted to FedAvg and SCAFFOLD ( algorithm from the point of view of any user)

./models

Stores the latest models over the training phase of federated learning.

./results

Stores several metrics of convergence for each simulation, each similarity/privacy setting and each algorithm.

Metrics (evaluated at each round of communication):

  • test accuracy over all users,
  • train loss over all users,
  • highest norm of parameter difference (server/user) over all selected users,
  • train gradient dissimilarity over all users.

Software requirements:

  • To download the dependencies: pip install -r requirements.txt

References

You might also like...
Official code for Score-Based Generative Modeling through Stochastic Differential Equations
Official code for Score-Based Generative Modeling through Stochastic Differential Equations

Score-Based Generative Modeling through Stochastic Differential Equations This repo contains the official implementation for the paper Score-Based Gen

Code for
Code for "Infinitely Deep Bayesian Neural Networks with Stochastic Differential Equations"

Infinitely Deep Bayesian Neural Networks with SDEs This library contains JAX and Pytorch implementations of neural ODEs and Bayesian layers for stocha

Supplementary code for the paper
Supplementary code for the paper "Meta-Solver for Neural Ordinary Differential Equations" https://arxiv.org/abs/2103.08561

Meta-Solver for Neural Ordinary Differential Equations Towards robust neural ODEs using parametrized solvers. Main idea Each Runge-Kutta (RK) solver w

PyTorch implementation for SDEdit: Image Synthesis and Editing with Stochastic Differential Equations
PyTorch implementation for SDEdit: Image Synthesis and Editing with Stochastic Differential Equations

SDEdit: Image Synthesis and Editing with Stochastic Differential Equations Project | Paper | Colab PyTorch implementation of SDEdit: Image Synthesis a

Partial implementation of ODE-GAN technique from the paper Training Generative Adversarial Networks by Solving Ordinary Differential Equations
Partial implementation of ODE-GAN technique from the paper Training Generative Adversarial Networks by Solving Ordinary Differential Equations

ODE GAN (Prototype) in PyTorch Partial implementation of ODE-GAN technique from the paper Training Generative Adversarial Networks by Solving Ordinary

Leibniz is a python package which provide facilities to express learnable partial differential equations with PyTorch
Leibniz is a python package which provide facilities to express learnable partial differential equations with PyTorch

Leibniz is a python package which provide facilities to express learnable partial differential equations with PyTorch

HyDiff: Hybrid Differential Software Analysis

HyDiff: Hybrid Differential Software Analysis This repository provides the tool and the evaluation subjects for the paper HyDiff: Hybrid Differential

Differential fuzzing for the masses!

NEZHA NEZHA is an efficient and domain-independent differential fuzzer developed at Columbia University. NEZHA exploits the behavioral asymmetries bet

PyTorch implementation for Score-Based Generative Modeling through Stochastic Differential Equations (ICLR 2021, Oral)
PyTorch implementation for Score-Based Generative Modeling through Stochastic Differential Equations (ICLR 2021, Oral)

Score-Based Generative Modeling through Stochastic Differential Equations This repo contains a PyTorch implementation for the paper Score-Based Genera

Owner
null
Bachelor's Thesis in Computer Science: Privacy-Preserving Federated Learning Applied to Decentralized Data

federated is the source code for the Bachelor's Thesis Privacy-Preserving Federated Learning Applied to Decentralized Data (Spring 2021, NTNU) Federat

Dilawar Mahmood 25 Nov 30, 2022
GradAttack is a Python library for easy evaluation of privacy risks in public gradients in Federated Learning

GradAttack is a Python library for easy evaluation of privacy risks in public gradients in Federated Learning, as well as corresponding mitigation strategies.

null 129 Dec 30, 2022
Breaching - Breaching privacy in federated learning scenarios for vision and text

Breaching - A Framework for Attacks against Privacy in Federated Learning This P

Jonas Geiping 139 Jan 3, 2023
Privacy as Code for DSAR Orchestration: Privacy Request automation to fulfill GDPR, CCPA, and LGPD data subject requests.

Meet Fidesops: Privacy as Code for DSAR Orchestration A part of the greater Fides ecosystem. ⚡ Overview Fidesops (fee-dez-äps, combination of the Lati

Ethyca 44 Dec 6, 2022
Deep learning library for solving differential equations and more

DeepXDE Voting on whether we should have a Slack channel for discussion. DeepXDE is a library for scientific machine learning. Use DeepXDE if you need

Lu Lu 1.4k Dec 29, 2022
MACE is a deep learning inference framework optimized for mobile heterogeneous computing platforms.

Documentation | FAQ | Release Notes | Roadmap | MACE Model Zoo | Demo | Join Us | 中文 Mobile AI Compute Engine (or MACE for short) is a deep learning i

Xiaomi 4.7k Dec 29, 2022
Radar-to-Lidar: Heterogeneous Place Recognition via Joint Learning

radar-to-lidar-place-recognition This page is the coder of a pre-print, implemented by PyTorch. If you have some questions on this project, please fee

Huan Yin 37 Oct 9, 2022
SurvITE: Learning Heterogeneous Treatment Effects from Time-to-Event Data

SurvITE: Learning Heterogeneous Treatment Effects from Time-to-Event Data SurvITE: Learning Heterogeneous Treatment Effects from Time-to-Event Data Au

null 14 Nov 28, 2022
Session-based Recommendation, CoHHN, price preferences, interest preferences, Heterogeneous Hypergraph, Co-guided Learning, SIGIR2022

This is our implementation for the paper: Price DOES Matter! Modeling Price and Interest Preferences in Session-based Recommendation Xiaokun Zhang, Bo

Xiaokun Zhang 27 Dec 2, 2022
Genetic Algorithm, Particle Swarm Optimization, Simulated Annealing, Ant Colony Optimization Algorithm,Immune Algorithm, Artificial Fish Swarm Algorithm, Differential Evolution and TSP(Traveling salesman)

scikit-opt Swarm Intelligence in Python (Genetic Algorithm, Particle Swarm Optimization, Simulated Annealing, Ant Colony Algorithm, Immune Algorithm,A

郭飞 3.7k Jan 3, 2023