Differential Privacy for Regularised Linear Regression (slides)

SLIDE 1

Introduction Functional Mechanism Experiments Discussion Conclusion

Differential Privacy for Regularised Linear Regression (slides)

Ashish Dandekar, Debabrota Basu, Stéphane Bressan
July 18, 2019

SLIDE 2

Differential Privacy

A randomised algorithm M with domain D is ε-differentially private if, for all S ⊆ Range(M) and all neighbouring datasets D, D′ ∈ D,

    | log ( Pr(M(D) ∈ S) / Pr(M(D′) ∈ S) ) | ≤ ε

Privacy preserving mechanisms that satisfy the definition of ε-differential privacy are called ε-differentially private mechanisms.

SLIDE 3

Linear Regression

Linear regression uses a linear function to map the predictor attributes, x_i ∈ R^d, of a data point t_i = (x_i, y_i) to its response attribute, y_i ∈ R.

SLIDE 4

Objective function for the regression

Linear Regression

The parameters θ ∈ R^d are estimated by minimising the mean squared loss over the training dataset T:

    θ* = arg min_θ (1/|T|) Σ_{t_i ∈ T} (x_i^T θ − y_i)^2
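This objective has the standard least-squares solution; a minimal NumPy sketch on synthetic data (our own illustration, not from the slides):

```python
import numpy as np

# Synthetic data with attributes in [-1, 1], as assumed later in the deck.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 3))                  # predictors x_i in R^d
theta_true = np.array([2.0, -1.0, 0.5])
y = X @ theta_true + 0.05 * rng.standard_normal(200)   # responses y_i

# Minimise (1/|T|) sum_i (x_i^T theta - y_i)^2 via least squares.
theta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
```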

SLIDE 5

Objective function for regularised linear regressions

Ridge

    θ* = arg min_θ (1/n) ||Xθ − Y||^2 + λ ||θ||_2^2

LASSO

    θ* = arg min_θ (1/n) ||Xθ − Y||^2 + λ ||θ||_1

Elastic net

    θ* = arg min_θ (1/n) ||Xθ − Y||^2 + λ ( α ||θ||_2^2 + (1 − α) ||θ||_1 )
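All three regularised objectives can be fitted off the shelf; a hedged sketch using scikit-learn on synthetic data (our own example, not from the slides — note scikit-learn scales the penalties slightly differently from the formulas above, e.g. Lasso minimises (1/2n)||Xθ − Y||^2 + α||θ||_1, but the estimators correspond to the same three regularisers):

```python
import numpy as np
from sklearn.linear_model import ElasticNet, Lasso, Ridge

# Hypothetical synthetic data, attributes normalised to [-1, 1].
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(500, 4))
theta_true = np.array([1.5, -2.0, 0.0, 0.5])
y = X @ theta_true + 0.1 * rng.standard_normal(500)

# One estimator per regulariser; small penalties so the fits stay
# close to the true coefficients on this easy problem.
fits = {
    "ridge": Ridge(alpha=0.1).fit(X, y),
    "lasso": Lasso(alpha=0.01).fit(X, y),
    "elastic_net": ElasticNet(alpha=0.01, l1_ratio=0.5).fit(X, y),
}
```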
SLIDE 6

Functional mechanism

The functional mechanism is a privacy preserving mechanism that introduces noise into the objective function that is minimised to estimate the parameters of a machine learning model. Zhang et al. [?] propose a functional mechanism that satisfies ε-differential privacy. It does so by adding a calibrated amount of Laplace noise to the coefficients of the Taylor series expansion of the loss function of the linear regression.

SLIDE 7

Satisfying ε-differential privacy

Sensitivity

Sensitivity of a function f, denoted as ∆f, is the maximum change in the output when any pair of neighbouring datasets is given to the function:

    ∆f = max_{x ∼ y} ||f(x) − f(y)||_1

Laplace Mechanism

Dwork et al. [?] propose a privacy preserving mechanism that adds Laplace random noise to the output of a function f. If the scale of the Laplace random variable is set to ∆f/ε, the mechanism satisfies ε-differential privacy.
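A minimal sketch of the Laplace mechanism (the helper name `laplace_mechanism` is ours, not from the slides):

```python
import numpy as np

def laplace_mechanism(f_output, sensitivity, epsilon, rng=None):
    """Add Laplace(0, sensitivity/epsilon) noise to a function's output."""
    if rng is None:
        rng = np.random.default_rng()
    scale = sensitivity / epsilon
    return f_output + rng.laplace(loc=0.0, scale=scale, size=np.shape(f_output))

# Example: a counting query has sensitivity 1 (changing one record moves
# the count by at most 1), so Laplace(0, 1/epsilon) noise suffices.
true_count = 42.0
noisy_count = laplace_mechanism(true_count, sensitivity=1.0, epsilon=0.5)
```

Averaged over many releases the noise has mean 0 and standard deviation (∆f/ε)·√2, which is the usual privacy–utility trade-off: smaller ε means more noise.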

SLIDE 8

Satisfying ε-differential privacy

Sensitivity of the mean squared loss

In [?], Zhang et al. show that the sensitivity of the loss function is at most two times the sum of the maximum absolute values of the coefficients in its Taylor expansion.

Taylor expansion of the mean squared loss

    loss_T(θ) = (1/|T|) Σ_{t_i ∈ T} (x_i^T θ − y_i)^2
              = (1/|T|) Σ_{t_i ∈ T} y_i^2
                − (1/|T|) Σ_{j=1}^{d} ( 2 Σ_{t_i ∈ T} y_i x_ij ) θ_j
                + (1/|T|) Σ_{1 ≤ j,l ≤ d} ( Σ_{t_i ∈ T} x_ij x_il ) θ_j θ_l

SLIDE 9

Satisfying ε-differential privacy

If we normalise all the attributes in any dataset D of size n such that they lie in [−1, 1], then

    ∆loss(θ) ≤ (2/n) max_{t_i ∈ D} ( y^2 + 2 Σ_{j=1}^{d} y x_ij + Σ_{1 ≤ j,l ≤ d} x_ij x_il ) = (2/n)(1 + 2d + d^2)
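Since 1 + 2d + d^2 = (1 + d)^2, the bound is easy to evaluate; a tiny helper (ours, not from the slides):

```python
def loss_sensitivity_bound(n, d):
    # Upper bound from the slide, valid when all attributes lie in [-1, 1]:
    # delta_loss <= (2/n)(1 + 2d + d^2) = (2/n)(1 + d)^2
    return 2.0 / n * (1 + d) ** 2

# Illustration only: n = 316,276 records as in the census experiment,
# with a hypothetical d = 5 predictors (one-hot encoding would enlarge d).
bound = loss_sensitivity_bound(316276, 5)
```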

SLIDE 10

Summing up...

For a training dataset T with n data points, the predictor attributes can be represented as a matrix X ∈ R^{n×d} and the response attribute as a vector Y ∈ R^n. In this notation, the loss function is written as:

    loss_T(θ) = (1/n) [ θ^T (X^T X) θ − 2 θ^T (X^T Y) + Y^T Y ]
              = (1/n) [ θ^T M θ − 2 θ^T N + O ]

The Laplace mechanism is used to add Laplace(0, ∆f/ε) noise to M, N and O. The output of the functional mechanism, θ*, is then calculated as:

    θ* = arg min_θ (1/n) [ θ^T M_noisy θ − 2 θ^T N_noisy + O_noisy ]
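Putting the pieces together, the functional mechanism for plain linear regression can be sketched as follows (function name and structure are our own illustration of this recipe, assuming attributes normalised to [−1, 1]; we minimise the noisy quadratic by solving its stationarity condition):

```python
import numpy as np

def functional_mechanism_linreg(X, Y, epsilon, rng=None):
    """Sketch: perturb the coefficients of the quadratic loss, then minimise."""
    if rng is None:
        rng = np.random.default_rng()
    n, d = X.shape
    sensitivity = 2.0 / n * (1 + d) ** 2        # (2/n)(1 + 2d + d^2)
    scale = sensitivity / epsilon

    # Coefficients of the loss (1/n)[theta^T M theta - 2 theta^T N + O].
    M = (X.T @ X) / n
    N = (X.T @ Y) / n

    # Perturb each coefficient with calibrated Laplace noise; the constant
    # term O = Y^T Y / n does not affect the minimiser, so we omit it.
    M_noisy = M + rng.laplace(scale=scale, size=(d, d))
    N_noisy = N + rng.laplace(scale=scale, size=d)
    M_noisy = (M_noisy + M_noisy.T) / 2         # keep the quadratic form symmetric

    # Gradient 2*M_noisy@theta - 2*N_noisy = 0  =>  M_noisy theta = N_noisy.
    # (If M_noisy is indefinite the objective is non-convex -- see the
    # Discussion slide; here we simply solve the stationarity condition.)
    return np.linalg.solve(M_noisy, N_noisy)
```

With a large ε (weak privacy) the noise vanishes and θ* approaches the ordinary least squares solution; as ε shrinks, the noisy coefficients drive the instability discussed later.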
SLIDE 11

Dataset

Canonical Census Dataset

We conduct experiments on a subset of the 2000 US census dataset provided by the Minnesota Population Center in its Integrated Public Use Microdata Series [?]. We consider 316,276 records of heads of households. Each record has 6 attributes, namely Age, Gender, Race, Marital Status, Education and Income, of which Age and Income are numerical attributes and the rest are categorical attributes. Regression analysis is performed using Income as the response attribute and the remaining attributes as predictor attributes.

SLIDE 12

Results

Boxplot of RMSE of elastic net regression with the functional mechanism for different values of ε on the census dataset.

SLIDE 13

Results


SLIDE 14

Discussion

Instability in the result

Addition of noise to the quadratic term in the Taylor expansion can destroy the convexity of the objective function. A non-convex objective function need not possess a unique global minimum; hence, the local optima give rise to unstable results.

Validity of the privacy guarantee

In [?], the privacy guarantee of the functional mechanism is proved for the loss function. The authors argue that the same proof naturally leads to a privacy guarantee for the parameters of the regression.

SLIDE 15

Conclusion
