SLIDE 1

CS573 Data Privacy and Security Differential Privacy – Machine Learning

Li Xiong

SLIDE 2

Big Data + Machine Learning

SLIDE 3

Machine Learning Under Adversarial Settings

  • Data privacy/confidentiality attacks
    • membership attacks, model inversion attacks
  • Model integrity attacks
    • Training time: data poisoning attacks
    • Inference time: adversarial examples

SLIDE 4

Differential Privacy for Machine Learning

  • Data privacy attacks
    • Model inversion attacks
    • Membership inference attacks
  • Differential privacy for deep learning
    • Noisy SGD
    • PATE

SLIDE 5

Neural Networks

SLIDE 6

SLIDE 7

Learning the parameters: Gradient Descent

SLIDE 8

Stochastic Gradient Descent

  • Gradient Descent (batch GD)
    • The cost gradient is based on the complete training set; each update can be costly and it can take longer to converge to the minimum
  • Stochastic Gradient Descent (SGD, iterative or online GD)
    • Update the weights after each training sample
    • The gradient based on a single training sample is a stochastic approximation of the true cost gradient
    • Converges faster, but the path towards the minimum may zig-zag
  • Mini-Batch Gradient Descent (MB-GD)
    • Update the weights based on a small group of training samples
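The three variants differ only in how much data each update sees. A minimal NumPy sketch, not from the slides, on a least-squares model; the data `X`, `y`, learning rate, and batch size are illustrative:

```python
import numpy as np

def grad(w, X, y):
    """Gradient of 0.5 * ||Xw - y||^2 / n with respect to w."""
    return X.T @ (X @ w - y) / len(y)

def batch_gd(X, y, lr=0.1, epochs=100):
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        w -= lr * grad(w, X, y)                 # gradient over the full training set
    return w

def sgd(X, y, lr=0.1, epochs=100):
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for i in np.random.permutation(len(y)):
            w -= lr * grad(w, X[i:i+1], y[i:i+1])   # one training sample per update
    return w

def minibatch_gd(X, y, lr=0.1, epochs=100, batch=32):
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        idx = np.random.permutation(len(y))
        for s in range(0, len(y), batch):
            b = idx[s:s+batch]
            w -= lr * grad(w, X[b], y[b])       # small group of training samples
    return w
```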

SLIDE 9

Training-Data Extraction (Model Inversion) Attacks

Fredrikson et al. (2015): a facial recognition model is trained on a private dataset (input: facial image; output: a label such as Philip, Jack, Monica, or unknown). Given access to the trained model, an adversary can reconstruct recognizable images of people in the training set.
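The core of this model-inversion idea is gradient ascent on the input: start from a blank image and adjust it to maximize the model's confidence in the target person's label. A minimal sketch, assuming white-box access to a softmax (multinomial logistic) classifier with weights `W` (classes x pixels) and bias `b`; the published attack adds denoising and image priors on top of this loop:

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def invert_class(W, b, target, steps=1000, lr=0.1):
    """Gradient ascent on the input to maximize the target-class confidence."""
    x = np.zeros(W.shape[1])                  # start from a blank image
    for _ in range(steps):
        p = softmax(W @ x + b)
        g = W[target] - p @ W                 # gradient of log p[target] w.r.t. x
        x += lr * g
        x = np.clip(x, 0.0, 1.0)              # keep pixels in a valid range
    return x
```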

SLIDE 10

Membership Inference Attacks against Machine Learning Models

Reza Shokri, Marco Stronati, Congzheng Song, Vitaly Shmatikov

SLIDE 11

Membership Inference Attack

[Figure: a model is trained on DATA; given the classification of some input (e.g., an image labeled among airplane, automobile, …, ship, truck), the attacker asks: was this specific data record part of the training set?]

SLIDE 12

Membership Inference Attack

  • Against summary statistics
    • Summary statistics (e.g., average) on each attribute
    • Underlying distribution of the data is known
    • [Homer et al. (2008)], [Dwork et al. (2015)], [Backes et al. (2016)]
  • Against machine learning models (black-box setting)
    • No knowledge of the model's parameters
    • No access to internal computations of the model
    • No knowledge of the underlying distribution of the data

SLIDE 13

[Figure: the target model is trained on private DATA through a training API; the attacker interacts only with the prediction API and exploits the model's predictions.]

Main insight: ML models overfit to their training data

SLIDE 14

[Figure: querying the prediction API with an input from the training set returns a classification.]

Main insight: ML models overfit to their training data

SLIDE 15

[Figure: the prediction API is queried with an input from the training set and with an input NOT from the training set; each returns a classification.]

Main insight: ML models overfit to their training data

SLIDE 16

[Figure: the attacker's goal is to recognize the difference between the model's classifications on inputs from the training set and on inputs not from the training set.]

SLIDE 17

Train an ML model to recognize the difference

[Figure: the attacker trains its own model on the target's classifications for member and non-member inputs.]

ML against ML

SLIDE 18

Train the Attack Model using Shadow Models

[Figure: k shadow models, each trained on its own split (Train 1/Test 1, …, Train k/Test k); their classifications on training inputs are labeled IN and on test inputs are labeled OUT.]

Train the attack model to predict whether an input was a member of the training set (in) or a non-member (out).

SLIDE 19

Obtaining Data for Training Shadow Models

  • Real: similar to the training data of the target model (i.e., drawn from the same distribution)
  • Synthetic: use a sampling algorithm to obtain data that is classified with high confidence by the target model
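A minimal sketch of the shadow-model pipeline from the preceding slides, with scikit-learn stand-ins; the real attack trains one attack model per class and queries the target's prediction API rather than a local model, and the helper `train_shadow` is hypothetical:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def build_attack_model(X_pool, y_pool, train_shadow, n_shadows=10):
    """Train shadow models on data we control, so membership is known,
    then fit an attack model on (prediction vector -> in/out) pairs."""
    attack_X, attack_y = [], []
    for _ in range(n_shadows):
        idx = np.random.permutation(len(y_pool))
        half = len(idx) // 2
        tr, te = idx[:half], idx[half:]
        shadow = train_shadow(X_pool[tr], y_pool[tr])   # mimic of the target model
        for rows, member in ((tr, 1), (te, 0)):
            attack_X.append(shadow.predict_proba(X_pool[rows]))   # output vectors
            attack_y.append(np.full(len(rows), member))           # 1 = in, 0 = out
    attack_X, attack_y = np.vstack(attack_X), np.concatenate(attack_y)
    # The attack model learns to tell members from non-members by the shape of
    # the prediction vector (overfit models are more confident on members).
    return RandomForestClassifier(n_estimators=100).fit(attack_X, attack_y)

# Usage sketch: membership probability of one record x under the target model.
# attack = build_attack_model(X_pool, y_pool, train_shadow)
# p_member = attack.predict_proba(target_model.predict_proba([x]))[0, 1]
```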

SLIDE 20

Constructing the Attack Model

[Figure: shadow models are trained on real or synthetic DATA; their predictions, with known membership labels, form the ATTACK training set used to fit the Attack Model.]

SLIDE 21

Using the Attack Model

[Figure: a single data record is sent to the target model's Prediction API; the returned classification vector is fed to the Attack Model, which outputs a membership probability.]

SLIDE 22

Purchase Dataset — Classify Customers (100 classes)

[Figure: cumulative fraction of classes vs. membership inference precision, comparing shadow models trained on real data, marginal-based synthetic data, and model-based synthetic data.]

  • Shadow models trained on real data: overall accuracy 0.93
  • Shadow models trained on synthetic data: overall accuracy 0.89

SLIDE 23

Privacy vs. Learning

[Figure: a training set is drawn from the data universe and used to train a Model.]

SLIDE 24

Privacy vs. Learning

[Figure: data universe → training set → Model.]

Does the model leak information about data in the training set?

SLIDE 25

Privacy vs. Learning

[Figure: data universe → training set → Model.]

Does the model leak information about data in the training set?
Does the model generalize to data outside the training set?

SLIDE 26

Privacy vs. Learning

Overfitting is the common enemy!

Does the model leak information about data in the training set?
Does the model generalize to data outside the training set?

SLIDE 27

Not in a Direct Conflict!

Privacy-preserving machine learning aims to provide both privacy and utility (prediction accuracy).

SLIDE 28

Differential Privacy for Machine Learning

  • Data privacy attacks
    • Model inversion attacks
    • Membership inference attacks
  • Differential privacy for deep learning
    • Noisy SGD
    • PATE

SLIDE 29

DEEP LEARNING WITH DIFFERENTIAL PRIVACY

Martin Abadi, Andy Chu, Ian Goodfellow*, Brendan McMahan, Ilya Mironov, Kunal Talwar, Li Zhang

Google

* OpenAI

SLIDE 30

Differential Privacy

(ε, δ)-Differential Privacy: the distribution of the output M(D) on database D is (nearly) the same as on a neighboring database D′:

    ∀S: Pr[M(D) ∈ S] ≤ exp(ε) · Pr[M(D′) ∈ S] + δ

ε quantifies the information leakage; δ allows for a small probability of failure.

SLIDE 31

Interpreting Differential Privacy

[Figure: two neighboring training datasets D and D′ are run through SGD; the resulting models should be (nearly) indistinguishable.]

SLIDE 32

Differential Privacy: Gaussian Mechanism

If the ℓ2-sensitivity of f: D → ℝⁿ satisfies max over neighboring D, D′ of ||f(D) − f(D′)||₂ < 1, then the Gaussian mechanism f(D) + Nⁿ(0, σ²) offers (ε, δ)-differential privacy, where δ ≈ exp(−(εσ)²/2).

Dwork, Kenthapadi, McSherry, Mironov, Naor, "Our Data, Ourselves", Eurocrypt 2006
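A small sketch of the mechanism, inverting the slide's relation δ ≈ exp(−(εσ)²/2) to get σ ≈ √(2 ln(1/δ))/ε for a sensitivity-1 function (tighter calibrations exist; this follows the slide's approximation):

```python
import numpy as np

def gaussian_mechanism(value, epsilon, delta, l2_sensitivity=1.0):
    """Release value + Gaussian noise calibrated from the slide's relation
    delta ~ exp(-(eps * sigma)^2 / 2), i.e. sigma ~ sqrt(2 ln(1/delta)) / eps,
    scaled by the L2 sensitivity of the query."""
    sigma = l2_sensitivity * np.sqrt(2.0 * np.log(1.0 / delta)) / epsilon
    return np.asarray(value, dtype=float) + np.random.normal(0.0, sigma, np.shape(value))

# Illustrative use: a count query has L2 sensitivity 1.
noisy_count = gaussian_mechanism(1234, epsilon=1.0, delta=1e-5)
```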

SLIDE 33

Basic Composition Theorem

If f is (ε₁, δ₁)-DP and g is (ε₂, δ₂)-DP, then the pair (f(D), g(D)) is (ε₁+ε₂, δ₁+δ₂)-DP.

SLIDE 34

Simple Recipe for Composite Functions

To compute a composite function f with differential privacy:

  • 1. Bound the sensitivity of f's components
  • 2. Apply the Gaussian mechanism to each component
  • 3. Compute the total privacy cost via the composition theorem
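A sketch of this recipe applied to a vector-valued f whose components are released separately; the clipping bound, per-component budget, and use of basic composition are illustrative choices:

```python
import numpy as np

def clip_l2(v, bound):
    """Step 1: bound each component's L2 sensitivity by clipping its norm."""
    norm = np.linalg.norm(v)
    return v if norm <= bound else v * (bound / norm)

def private_composite(components, eps_each, delta_each, clip_bound=1.0):
    """Steps 2-3: Gaussian mechanism per component, then basic composition."""
    sigma = clip_bound * np.sqrt(2.0 * np.log(1.0 / delta_each)) / eps_each
    released = [clip_l2(np.asarray(c, dtype=float), clip_bound)
                + np.random.normal(0.0, sigma, np.shape(c)) for c in components]
    k = len(components)
    return released, (k * eps_each, k * delta_each)   # total (epsilon, delta)
```
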
SLIDE 35

Deep Learning with Differential Privacy

SLIDE 36

Differentially Private Deep Learning

  • 1. Loss function: softmax loss
  • 2. Training / test data: MNIST and CIFAR-10
  • 3. Topology: PCA + neural network
  • 4. Training algorithm: differentially private SGD
  • 5. Hyperparameters: tune experimentally
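The training algorithm is the paper's noisy SGD: clip each per-example gradient, sum, add Gaussian noise, and step. A minimal sketch; `grad_fn` (a per-example gradient function) and the hyperparameter defaults are placeholders:

```python
import numpy as np

def dp_sgd_step(w, batch_X, batch_y, grad_fn, lr=0.1, clip_C=1.0, noise_sigma=4.0):
    """One noisy-SGD step: clip each per-example gradient to L2 norm <= clip_C,
    sum, add Gaussian noise with std clip_C * noise_sigma, average, and update."""
    total = np.zeros_like(w)
    for x, y in zip(batch_X, batch_y):
        g = grad_fn(w, x, y)                                   # per-example gradient
        scale = min(1.0, clip_C / max(np.linalg.norm(g), 1e-12))
        total += g * scale                                     # clipping bounds sensitivity
    total += np.random.normal(0.0, clip_C * noise_sigma, size=w.shape)
    return w - lr * total / len(batch_y)
```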

SLIDE 37

SLIDE 38

SLIDE 39

SLIDE 40

Naïve Privacy Analysis

  • 1. Choose σ = 4
  • 2. Each step is (ε, δ)-DP = (1.2, 10⁻⁵)-DP
  • 3. Number of steps T = 10,000
  • 4. Composition: (Tε, Tδ)-DP = (12,000, 0.1)-DP
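The arithmetic behind the last line, as a quick check:

```python
# sigma = 4 makes each SGD step (1.2, 1e-5)-DP; naive (basic) composition
# over T steps simply adds the parameters.
eps_step, delta_step, T = 1.2, 1e-5, 10_000
print(T * eps_step, T * delta_step)   # -> 12000.0 0.1
```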

SLIDE 41

Advanced Composition Theorems

SLIDE 42

Composition theorem

[Figure: privacy losses add up across queries: +ε for Blue, +.2ε for Blue, +ε for Red.]

SLIDE 43

Strong Composition Theorem

Dwork, Rothblum, Vadhan, "Boosting and Differential Privacy", FOCS 2010
Dwork, Rothblum, "Concentrated Differential Privacy", https://arxiv.org/abs/1603.0188

  • 1. Choose σ = 4
  • 2. Each step is (ε, δ)-DP = (1.2, 10⁻⁵)-DP
  • 3. Number of steps T = 10,000
  • 4. Strong composition: (ε′, Tδ)-DP ≈ (360, 0.1)-DP
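The ≈360 figure is in line with the concentrated-DP style analysis cited on this slide. A sketch using the zCDP formulation (Bun and Steinke's variant of Dwork–Rothblum's CDP), under the assumption of a sensitivity-1 Gaussian with σ = 4 per step:

```python
import math

sigma, T, delta = 4.0, 10_000, 0.1
rho_step = 1.0 / (2.0 * sigma ** 2)      # a sensitivity-1 Gaussian step is rho-zCDP
rho_total = T * rho_step                 # zCDP composes additively
eps = rho_total + 2.0 * math.sqrt(rho_total * math.log(1.0 / delta))
print(round(eps))                        # -> 366, close to the slide's ~360
```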

SLIDE 44

Amplification by Sampling

  • 1. Choose σ = 4
  • 2. Each batch is a q = 1% fraction of the data
  • 3. Each step is (2qε, qδ)-DP = (0.024, 10⁻⁷)-DP
  • 4. Number of steps T = 10,000
  • 5. Strong composition: (ε′, qTδ)-DP ≈ (10, 0.001)-DP

S. Kasiviswanathan, H. Lee, K. Nissim, S. Raskhodnikova, A. Smith, "What Can We Learn Privately?", SIAM J. Comp., 2011

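The per-step numbers follow directly from the amplification statement (for small ε, running an (ε, δ)-DP step on a random q-fraction batch is roughly (2qε, qδ)-DP):

```python
# Per-step guarantee on the full data and the batch sampling fraction.
eps, delta, q = 1.2, 1e-5, 0.01
eps_step, delta_step = 2 * q * eps, q * delta
print(eps_step, delta_step)   # -> about 0.024 and 1e-07; strong composition over
                              # T = 10,000 such steps then gives the slide's (10, 0.001)
```
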
SLIDE 45

Moments Accountant

  • 1. Choose σ = 4
  • 2. Each batch is a q = 1% fraction of the data
  • 3. Keep track of the moments of the privacy loss
  • 4. Number of steps T = 10,000
  • 5. Moments accountant: (ε, δ)-DP = (1.25, 10⁻⁵)-DP
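A simplified sketch of what the accountant computes: the log moments of the privacy loss for the subsampled Gaussian at integer orders (one direction of the divergence only), composed over T steps and then converted to (ε, δ). It is looser than the paper's accountant but already lands near the reported ≈1.25 for σ = 4, q = 1%, T = 10,000, δ = 10⁻⁵:

```python
import math

def log_moment(q, sigma, alpha):
    """log E_Q[(P/Q)^alpha] for P = (1-q)N(0,s^2) + qN(1,s^2), Q = N(0,s^2),
    expanded binomially at integer order alpha."""
    total = 0.0
    for k in range(alpha + 1):
        total += (math.comb(alpha, k) * (1 - q) ** (alpha - k) * q ** k
                  * math.exp(k * (k - 1) / (2 * sigma ** 2)))
    return math.log(total)

def accountant_epsilon(q, sigma, T, delta, max_order=64):
    """Compose the log moments over T steps, convert to (epsilon, delta),
    and keep the smallest epsilon over the orders considered."""
    best = float("inf")
    for alpha in range(2, max_order + 1):
        eps = (T * log_moment(q, sigma, alpha) + math.log(1.0 / delta)) / (alpha - 1)
        best = min(best, eps)
    return best

print(accountant_epsilon(q=0.01, sigma=4.0, T=10_000, delta=1e-5))  # -> about 1.3
```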

SLIDE 46

Results

SLIDE 47

Our Datasets: "Fruit Flies of Machine Learning"

  • MNIST dataset: 70,000 images, 28⨉28 pixels each
  • CIFAR-10 dataset: 60,000 color images, 32⨉32 pixels each

SLIDE 48

Summary of Results

  • Baseline (no privacy): MNIST 98.3%, CIFAR-10 80%

SLIDE 49

Summary of Results

  • Baseline (no privacy): MNIST 98.3%, CIFAR-10 80%
  • [SS15] (reports ε per parameter): MNIST 98%
  • [WKC+16] (ε = 2): MNIST 80%

SLIDE 50

Summary of Results

  • Baseline (no privacy): MNIST 98.3%, CIFAR-10 80%
  • [SS15] (reports ε per parameter): MNIST 98%
  • [WKC+16] (ε = 2): MNIST 80%
  • This work (ε = 8, δ = 10⁻⁵): MNIST 97%, CIFAR-10 73%
  • This work (ε = 2, δ = 10⁻⁵): MNIST 95%, CIFAR-10 67%
  • This work (ε = 0.5, δ = 10⁻⁵): MNIST 90%

SLIDE 51

Contributions

  • Differentially private deep learning applied to publicly available datasets and implemented in TensorFlow
    ○ https://github.com/tensorflow/models
  • Innovations
    ○ Bounding the sensitivity of updates
    ○ Moments accountant to keep track of the privacy loss
  • Lessons
    ○ Recommendations for the selection of hyperparameters
  • Full version: https://arxiv.org/abs/1607.00133

SLIDE 52

Differential Privacy for Machine Learning

  • Data privacy attacks
    • Model inversion attacks
    • Membership inference attacks
  • Differential privacy for deep learning
    • Noisy SGD
    • PATE

SLIDE 53

In the PATE work, the threat model assumes:

  • Adversary can make a potentially unbounded number of queries
  • Adversary has access to model internals
SLIDE 54

Private Aggregation of Teacher Ensembles (PATE)

Intuitive privacy analysis:
  • If most teachers agree on the label, it does not depend on the specific partitions, so the privacy cost is small.
  • If two classes have close vote counts, the disagreement may reveal private information.

Aggregation mechanism:
  • 1. Count votes
  • 2. Take maximum

SLIDE 55

Noisy aggregation
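A minimal sketch of this aggregation step, assuming Laplace noise on the per-class vote counts (the mechanism used in the PATE paper); the noise scale γ is illustrative:

```python
import numpy as np

def noisy_aggregate(teacher_labels, n_classes, gamma=0.05):
    """Count the teachers' votes per class, perturb each count with
    Laplace noise of scale 1/gamma, and return the noisy argmax label."""
    votes = np.bincount(teacher_labels, minlength=n_classes)
    noisy_votes = votes + np.random.laplace(0.0, 1.0 / gamma, size=n_classes)
    return int(np.argmax(noisy_votes))

# Example: 250 teachers voting on one student query over 10 classes.
label = noisy_aggregate(np.random.randint(0, 10, size=250), n_classes=10)
```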

SLIDE 56

Private Aggregation of Teacher Ensembles (PATE)

The aggregated teacher violates the threat model:

  • Each prediction increases the total privacy loss.
    Privacy budgets create a tension between the accuracy and the number of predictions.
  • Inspection of internals may reveal private data.
    Privacy guarantees should hold in the face of white-box adversaries.

SLIDE 57

Private Aggregation of Teacher Ensembles (PATE)

Privacy analysis:

  • The privacy loss is fixed once the student model is done training.
  • Even if a white-box adversary can inspect the student model's parameters, the only information the student can reveal comes from unlabeled public data and the labels provided by the aggregated teacher, which are protected with differential privacy.

SLIDE 58

GANs: two competing models

Generator:
  • Input: noise sampled from a random distribution
  • Output: synthetic input close to the expected training distribution

Discriminator:
  • Input: output from the generator OR an example from the real training distribution
  • Output: in distribution OR fake

[Figure: a Gaussian sample is fed to the generator to produce a fake sample; the discriminator assigns P(real) and P(fake) to samples.]

I. J. Goodfellow et al. (2014), Generative Adversarial Networks
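The two models are trained against each other on the minimax objective from the Goodfellow et al. (2014) paper cited above:

```latex
\min_G \max_D \; V(D, G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]
```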

SLIDE 59

Improved Training of GANs

Generator:
  • Input: noise sampled from a random distribution
  • Output: synthetic input close to the expected training distribution

Discriminator:
  • Input: output from the generator OR an example from the real training distribution
  • Output: in distribution (which class) OR fake

[Figure: the discriminator now outputs per-class probabilities P(real_0), …, P(real_N) plus P(fake).]

T. Salimans et al. (2016), Improved Techniques for Training GANs

SLIDE 60

Private Aggregation of Teacher Ensembles using GANs (PATE-G)

[Figure: a generator and discriminator are trained on public data, with label queries to the aggregated teacher; the teachers and private data are not available to the adversary, while the student side (generator, discriminator, public data queries) is.]

SLIDE 61

Aggregated Teacher Accuracy Before the Student Model is Trained

SLIDE 62

Evaluation

Comparison with noisy SGD (M. Abadi et al. (2016), Deep Learning with Differential Privacy) on MNIST: (8, 10⁻⁵)-DP gives 97%, (2, 10⁻⁵)-DP gives 95%, (0.5, 10⁻⁵)-DP gives 90%.

  • Increasing the number of teachers strengthens the privacy guarantee but decreases model accuracy.
  • The number of teachers is constrained by the task's complexity and the available data.

SLIDE 63

Differential Privacy for Machine Learning

  • Data privacy attacks
    • Model inversion attacks
    • Membership inference attacks
  • Differential privacy for deep learning
    • Noisy SGD
    • PATE