CS573 Data Privacy and Security Differential Privacy – Machine Learning
Li Xiong
○ Big Data + Machine Learning
○ Machine Learning Under Adversarial Settings
○ Differential Privacy for Machine Learning
Neural Networks
Gradient Descent (batch GD)
The cost gradient is computed over the complete training set, so each update can be costly and convergence to the minimum can be slow
Stochastic Gradient Descent (SGD, iterative or online-GD)
Update the weights after each training sample. The gradient based on a single training sample is a stochastic approximation of the true cost gradient
Converges faster, but the path towards the minimum may zig-zag
Mini-Batch Gradient Descent (MB-GD)
Update the weights based on a small group of training samples, balancing the stability of batch GD against the speed of SGD (see the sketch below)
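A minimal NumPy sketch contrasting the three update rules on a linear model with squared loss; the data, learning rate, and batch size are illustrative:

```python
# Batch GD, SGD, and mini-batch GD on a linear model with squared loss.
import numpy as np

rng = np.random.default_rng(0)
X, y = rng.normal(size=(1000, 10)), rng.normal(size=1000)

def grad(w, Xb, yb):
    # Gradient of mean squared error over the batch (Xb, yb)
    return 2 * Xb.T @ (Xb @ w - yb) / len(yb)

w = np.zeros(10)
lr = 0.01

# Batch GD: one update per pass, gradient over the full training set
w -= lr * grad(w, X, y)

# SGD: one update per sample; each gradient is a noisy estimate
for i in rng.permutation(len(y)):
    w -= lr * grad(w, X[i:i+1], y[i:i+1])

# Mini-batch GD: one update per small group of samples
batch_size = 32
for start in range(0, len(y), batch_size):
    Xb, yb = X[start:start+batch_size], y[start:start+batch_size]
    w -= lr * grad(w, Xb, yb)
```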
[Figure: a facial recognition model trained on a private dataset; input is a facial image, output is a label (Philip, Jack, Monica, …, unknown).]
Training-data extraction attacks: Fredrikson et al. (2015) showed that a model-inversion attacker with access to such a facial recognition model can reconstruct a recognizable image of a person in the training set, as sketched below.
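A minimal sketch of the gradient-ascent form of model inversion, assuming PyTorch and white-box access to a differentiable classifier; `model`, the image shape, and the hyperparameters are illustrative:

```python
# Model inversion as gradient ascent on the input, in the spirit of
# Fredrikson et al. (2015): find an input the model confidently labels
# as the target class.
import torch

def invert(model, target_class, shape=(1, 1, 64, 64), steps=500, lr=0.1):
    x = torch.zeros(shape, requires_grad=True)  # start from a blank image
    opt = torch.optim.SGD([x], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        # Maximize the model's confidence in the target label
        loss = -model(x).log_softmax(dim=1)[0, target_class]
        loss.backward()
        opt.step()
        x.data.clamp_(0, 1)  # keep pixels in a valid range
    return x.detach()
```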
Membership inference attacks: Reza Shokri, Marco Stronati, Congzheng Song, Vitaly Shmatikov, "Membership Inference Attacks Against Machine Learning Models" (IEEE S&P 2017)
[Figure: a model is trained on DATA; given its classification of an input record (classes: airplane, automobile, …, ship, truck), the attacker asks: was this specific data record part of the training set?]
[Homer et al. (2008)], [Dwork et al. (2015)], [Backes et al. (2016)]
Black-box setting:
The adversary interacts with the target model only through its Model Training API and Prediction API, with no access to the model's internals.

[Figure: a model is trained on DATA via a Model Training API and queried via a Prediction API; the attacker compares the classification output for an input from the training set with the output for an input NOT from the training set.]

Main insight: ML models overfit to their training data, so their outputs on members of the training set differ from their outputs on non-members. The attack trains an ML model to recognize the difference.
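That difference is visible even in a toy setting: an overfit classifier is typically more confident on its own training points than on held-out points. A minimal sketch, assuming scikit-learn; the dataset and model choice are illustrative:

```python
# The "overfitting gap" that membership inference exploits: mean top-class
# confidence on training members vs. non-members.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_in, X_out, y_in, y_out = train_test_split(X, y, test_size=0.5, random_state=0)

model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_in, y_in)

conf_in = model.predict_proba(X_in).max(axis=1).mean()    # members
conf_out = model.predict_proba(X_out).max(axis=1).mean()  # non-members
print(f"mean confidence on members: {conf_in:.3f}, non-members: {conf_out:.3f}")
```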
Shadow models: train k shadow models (Shadow Model 1, …, Shadow Model k), each on its own data split (Train 1/Test 1, …, Train k/Test k) drawn from the same distribution as the target's training data, or synthesized from records classified with high confidence by the target model. Each shadow model's classification outputs on its own Train split are labeled IN and on its Test split OUT. Train the attack model on these labeled outputs to predict whether an input was a member of the training set (in) or a non-member (out), as in the sketch below.
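A minimal sketch of this shadow-model training, assuming scikit-learn; the dataset, number of shadows, and single global attack model (the paper trains one per output class) are illustrative simplifications:

```python
# Shadow-model attack training in the spirit of Shokri et al.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
rng = np.random.default_rng(0)

attack_X, attack_y = [], []
for k in range(5):  # k shadow models
    idx = rng.permutation(len(X))[:600]
    Xs, ys = X[idx], y[idx]
    X_tr, X_te, y_tr, y_te = train_test_split(
        Xs, ys, test_size=0.5, random_state=k, stratify=ys)
    shadow = RandomForestClassifier(n_estimators=30, random_state=k).fit(X_tr, y_tr)
    # Prediction vectors on the shadow's own Train split are labeled IN (1),
    # on its held-out Test split OUT (0)
    attack_X.append(shadow.predict_proba(X_tr)); attack_y.append(np.ones(len(X_tr)))
    attack_X.append(shadow.predict_proba(X_te)); attack_y.append(np.zeros(len(X_te)))

# The attack model learns to tell members from non-members
# from the prediction vector alone
attack_model = LogisticRegression(max_iter=1000).fit(
    np.vstack(attack_X), np.concatenate(attack_y))
```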
[Figure: attack training pipeline. SYNTHETIC DATA mimicking the target's DATA is used to train the shadow models; their IN/OUT-labeled prediction outputs form the ATTACK training set, from which the Attack Model is trained.]

At attack time, the adversary submits a data record to the target model's Prediction API and feeds the resulting classification vector to the Attack Model, which outputs a membership probability.
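The attack-time step as a sketch; `target_predict` is a hypothetical stand-in for the target's Prediction API, and `attack_model` is the binary classifier trained in the previous sketch:

```python
# Attack-time membership inference: query the target, score its
# prediction vector with the attack model.
import numpy as np

def membership_probability(attack_model, target_predict, record):
    pred_vector = target_predict(np.asarray(record).reshape(1, -1))  # (1, n_classes)
    return attack_model.predict_proba(pred_vector)[0, 1]             # P(member)
```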
Purchase Dataset — Classify Customers (100 classes)

[Figure: cumulative fraction of classes vs. membership inference precision, comparing shadows trained on real data with marginal-based and model-based synthetic data.]

Membership inference precision: 0.93 with shadows trained on real data, 0.89 with shadows trained on synthetic data.
[Figure: the training set is a sample drawn from the data universe; the Model is trained on the training set.]

Two questions: Does the model leak information about data in the training set? Does the model generalize to data outside the training set?
Privacy-preserving machine learning must trade off privacy against utility (prediction accuracy).
Differential Privacy for Machine Learning
Martin Abadi, Andy Chu, Ian Goodfellow*, Brendan McMahan, Ilya Mironov, Kunal Talwar, Li Zhang, "Deep Learning with Differential Privacy" (CCS 2016)
* OpenAI
Dwork, Kenthapadi, McSherry, Mironov, Naor, "Our Data, Ourselves", Eurocrypt 2006
Composition theorem
Dwork, Rothblum, Vadhan, "Boosting and Differential Privacy", FOCS 2010
Dwork, Rothblum, "Concentrated Differential Privacy", https://arxiv.org/abs/1603.01887
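For reference, the guarantees these composition results give can be written out as follows (basic composition, and the advanced composition bound of the FOCS 2010 paper); a standard statement, not taken verbatim from the slides:

```latex
% Basic composition: k adaptively chosen mechanisms, each
% (eps, delta)-DP, together satisfy (k*eps, k*delta)-DP.
\[
  \underbrace{(\varepsilon,\delta)\circ\cdots\circ(\varepsilon,\delta)}_{k\ \text{mechanisms}}
  \;\Longrightarrow\; \bigl(k\varepsilon,\ k\delta\bigr)\text{-DP}
\]
% Advanced composition [DRV10]: for any delta' > 0, the same
% k-fold composition also satisfies
\[
  \Bigl(\varepsilon\sqrt{2k\ln(1/\delta')} + k\varepsilon\,(e^{\varepsilon}-1),\;
  k\delta+\delta'\Bigr)\text{-DP}
\]
```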
Comparison with prior work:
Baseline: no privacy
[SS15]: reports ε per parameter, ε = 2
[WKC+16]
this work: ε = 8, ε = 2, and ε = 0.5, each with δ = 10⁻⁵
○ https://github.com/tensorflow/models
○ Bounding sensitivity of updates
○ Moments accountant to keep track of privacy loss
○ Recommendations for selection of hyperparameters
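The first two ingredients can be sketched in a few lines; a minimal NumPy illustration of one DP-SGD step (clip each per-example gradient to norm C, add Gaussian noise scaled to σC), with the gradient function and all hyperparameters as illustrative placeholders and the moments accountant omitted:

```python
# One DP-SGD step (Abadi et al. 2016): per-example clipping + Gaussian noise.
import numpy as np

def dp_sgd_step(w, X_batch, y_batch, per_example_grad, lr=0.1, C=1.0,
                sigma=1.1, rng=np.random.default_rng(0)):
    clipped = []
    for x, y in zip(X_batch, y_batch):
        g = per_example_grad(w, x, y)
        # Clip: scale the gradient down if its L2 norm exceeds C
        g = g / max(1.0, np.linalg.norm(g) / C)
        clipped.append(g)
    # Noise calibrated to the clipping bound (sensitivity C)
    noise = rng.normal(scale=sigma * C, size=w.shape)
    g_tilde = (np.sum(clipped, axis=0) + noise) / len(X_batch)
    return w - lr * g_tilde

# Example with a squared-loss linear model as the per-example gradient
grad = lambda w, x, y: 2 * (x @ w - y) * x
w = dp_sgd_step(np.zeros(3), np.eye(3), np.ones(3), grad)
```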
Differential Privacy for Machine Learning
This second approach (the PATE-style teacher/student training of Papernot et al.) trains an ensemble of teacher models on disjoint partitions of the private data and uses their noisy aggregate to label public data for a student.

In their work, the threat model assumes the adversary can make an unbounded number of queries and may inspect the model's internals: privacy guarantees should hold in the face of white-box adversaries.

Intuitive privacy analysis: if most teachers agree on a label, the label does not depend on any single partition of the private data, so the privacy cost is small; if they disagree, releasing the label may leak information.

The aggregated teacher by itself violates the threat model: every answered query consumes privacy budget, so privacy budgets create a tension between the accuracy and number of predictions, and the teachers' internals depend directly on the private data.

The fix is to train a student model on public data labeled by the aggregated teacher. Privacy analysis: the only information that can be revealed from the student model is unlabeled public data and labels from the aggregated teacher, which is protected with differential privacy.
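The aggregation step can be sketched as a noisy argmax over teacher votes; a minimal NumPy sketch, with the Laplace scale parameter gamma as an illustrative choice:

```python
# Noisy-max aggregation of teacher votes: add Laplace noise to each
# class's vote count and release only the argmax.
import numpy as np

def noisy_aggregate(teacher_votes, n_classes, gamma=0.05,
                    rng=np.random.default_rng(0)):
    # teacher_votes: array of per-teacher predicted labels for one query
    counts = np.bincount(teacher_votes, minlength=n_classes).astype(float)
    counts += rng.laplace(scale=1.0 / gamma, size=n_classes)
    return int(np.argmax(counts))

# Example: 100 teachers voting over 10 classes
votes = np.random.default_rng(1).integers(0, 10, size=100)
label = noisy_aggregate(votes, n_classes=10)
```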
Generator:
Input: noise sampled from a random distribution. Output: a synthetic input close to the expected training distribution.
Discriminator:
Input: output from the generator OR an example from the real training distribution. Output: in distribution OR fake.

[Figure: a Gaussian sample is fed through the generator to produce a fake sample; the discriminator receives real and fake samples and outputs P(real) and P(fake).]

IJ Goodfellow et al. (2014), Generative Adversarial Networks
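A minimal training-loop sketch of this two-player game, assuming PyTorch; the 1-D data, network sizes, and hyperparameters are illustrative:

```python
# Minimal GAN on 1-D Gaussian data: D learns real vs. fake,
# G learns to fool D.
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))  # generator
D = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1))  # discriminator
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for step in range(1000):
    real = torch.randn(64, 1) * 2 + 3   # "real" training distribution
    fake = G(torch.randn(64, 8))        # generator output from noise

    # Discriminator: real -> 1, fake -> 0
    opt_d.zero_grad()
    loss_d = bce(D(real), torch.ones(64, 1)) + bce(D(fake.detach()), torch.zeros(64, 1))
    loss_d.backward(); opt_d.step()

    # Generator: fool the discriminator into scoring fakes as real
    opt_g.zero_grad()
    loss_g = bce(D(fake), torch.ones(64, 1))
    loss_g.backward(); opt_g.step()
```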
Two competing models:
Generator:
Input: noise sampled from a random distribution. Output: a synthetic input close to the expected training distribution.
Discriminator:
Input: output from the generator OR an example from the real training distribution. Output: in distribution (which class) OR fake.

[Figure: the multi-class discriminator outputs P(real_0), P(real_1), …, P(real_N), and P(fake).]
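In this multi-class variant the only structural change is the discriminator's output layer, which produces one logit per real class plus one for fake; a sketch assuming PyTorch, with N and the layer widths illustrative:

```python
# Discriminator head for the multi-class variant: N real classes
# plus one extra "fake" class.
import torch.nn as nn

N = 10
disc = nn.Sequential(
    nn.Linear(1, 32), nn.ReLU(),
    nn.Linear(32, N + 1),  # logits for real_0, ..., real_{N-1}, fake
)
```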
[Figure: Generator, Discriminator, Public Data, and Queries, partitioned into components not available to the adversary and components available to the adversary.]
Accuracy under differential privacy: 90% at (ε, δ) = (0.5, 10⁻⁵), 95% at (2, 10⁻⁵), 97% at (8, 10⁻⁵).
M Abadi et al. (2016), Deep Learning with Differential Privacy
Increasing the number of teachers strengthens the privacy guarantee but decreases model accuracy; the number of teachers is constrained by the task's complexity and the amount of available data.