On Robustness of Principal Component Regression
Anish Agarwal, Devavrat Shah, Dennis Shen, Dogyoon Song (MIT)
What is PCR?
Step 1: PCA (k components)
Step 2: Regression
minimize
Step 3: Prediction
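The three steps above can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation: the function names `pcr_fit` and `pcr_predict`, the synthetic data, and the choice to skip mean-centering are all assumptions made for brevity.

```python
import numpy as np

def pcr_fit(X, y, k):
    """Fit PCR with k components and return a p-dimensional coefficient vector."""
    # Step 1: PCA via SVD (no centering here, to keep the sketch minimal).
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    V_k = Vt[:k].T                       # p x k matrix of principal directions
    # Step 2: least-squares regression of y on the k-dimensional scores.
    Z = X @ V_k                          # n x k scores
    gamma, *_ = np.linalg.lstsq(Z, y, rcond=None)
    # Map the k coefficients back to the original p-dimensional space.
    return V_k @ gamma

def pcr_predict(X_new, beta):
    # Step 3: prediction with the fitted coefficient vector.
    return X_new @ beta

# Hypothetical data: rank-r covariates observed with additive noise.
rng = np.random.default_rng(0)
n, p, r = 200, 50, 3
A = rng.normal(size=(n, r)) @ rng.normal(size=(r, p))
beta_true = rng.normal(size=p)
y = A @ beta_true
X = A + 0.1 * rng.normal(size=(n, p))

beta_hat = pcr_fit(X, y, k=r)
train_mse = np.mean((pcr_predict(X, beta_hat) - y) ** 2)
```

On noiseless rank-r covariates, `pcr_fit(A, y, r)` reproduces `y` exactly, since the top r principal directions span the row space of `A`.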
“IF DATA IS (APPROXIMATELY) LOW-DIMENSIONAL, USE PCR!”
Anonymous Data Scientists
Representative of modern datasets
Causal Inference (Synthetic Control) Time Series Analysis Differentially-private Regression Mixed Valued Regression
(noise by design) (measurement noise) (structural noise) (measurement noise)
If principal components are chosen correctly (k = r)
PCR implicitly de-noises covariates!
Error bound annotations: OLS minimax error rate (low-dimensional, noiseless, fully observed covariates); fraction of observations; number of covariates
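One way to see the de-noising effect: PCR with k components yields the same coefficients as least squares run on the covariates after hard singular value thresholding (keeping only the top k singular values). The sketch below checks this numerically on synthetic data. The function names, matrix sizes, and noise level are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def pcr_coef(X, y, k):
    # Regress y on the top-k principal component scores, then map back to p dims.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    V_k = Vt[:k].T
    gamma, *_ = np.linalg.lstsq(X @ V_k, y, rcond=None)
    return V_k @ gamma

def hsvt(X, k):
    # Hard singular value thresholding: keep the top-k singular values, zero the rest.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k]

# Hypothetical data: rank-r covariates A observed as X = A + noise.
rng = np.random.default_rng(0)
n, p, r = 100, 30, 3
A = rng.normal(size=(n, r)) @ rng.normal(size=(r, p))
X = A + 0.1 * rng.normal(size=(n, p))
y = A @ rng.normal(size=p)

beta_pcr = pcr_coef(X, y, k=r)
# Minimum-norm least squares on the thresholded covariates; rcond discards
# the numerically-zero singular values left over after thresholding.
beta_hsvt, *_ = np.linalg.lstsq(hsvt(X, r), y, rcond=1e-10)

max_gap = np.max(np.abs(beta_pcr - beta_hsvt))   # difference between the two fits
err_before = np.linalg.norm(X - A)               # noise in the raw covariates
err_after = np.linalg.norm(hsvt(X, r) - A)       # noise left after thresholding
```

In this setup the thresholded matrix is closer to the noiseless covariates than the raw observations are, which is the sense in which the PCA step de-noises.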
If principal components are not chosen correctly (k ≠ r)
Test Error; Train Error with PCR(k)
PCR implicitly de-noises covariates
PCR implicitly performs ℓ0-regularization
Choose k that minimizes above
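The bound itself is not reproduced in this transcript, so the sketch below swaps in a plainly different but common stand-in: pick the k with the smallest error on held-out data. The names `pcr_coef` and `select_k`, the train/validation split, and the synthetic data are assumptions for illustration only.

```python
import numpy as np

def pcr_coef(X, y, k):
    # PCR coefficients: regress y on the top-k principal component scores.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    V_k = Vt[:k].T
    gamma, *_ = np.linalg.lstsq(X @ V_k, y, rcond=None)
    return V_k @ gamma

def select_k(X_train, y_train, X_val, y_val, k_max):
    # Held-out mean squared error for each k in 1..k_max (a stand-in criterion,
    # not the bound from the slide).
    errors = []
    for k in range(1, k_max + 1):
        beta = pcr_coef(X_train, y_train, k)
        errors.append(float(np.mean((X_val @ beta - y_val) ** 2)))
    return int(np.argmin(errors)) + 1, errors

# Hypothetical data: rank-3 covariates with noise in both covariates and responses.
rng = np.random.default_rng(0)
n, p, r = 300, 40, 3
A = rng.normal(size=(n, r)) @ rng.normal(size=(r, p))
y = A @ rng.normal(size=p) + 0.1 * rng.normal(size=n)
X = A + 0.1 * rng.normal(size=(n, p))

best_k, errors = select_k(X[:200], y[:200], X[200:], y[200:], k_max=10)
```

Recomputing the SVD for every k keeps the sketch short; a single SVD reused across all k would be the more efficient design.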
When To and Not to Use PCR? – Look at Spectrum
Figure: four singular value spectra (Case 1, Case 2, Case 3, Case 4); x-axis: singular values (ordered by magnitude); y-axis: magnitude of singular values
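Looking at the spectrum can be done in a few lines. The sketch below computes the singular values of a covariate matrix and reports how many components are needed to capture a given share of the squared spectrum. The function name `spectrum_summary` and the 95% threshold are assumptions chosen for illustration; the slides do not prescribe a specific cutoff.

```python
import numpy as np

def spectrum_summary(X, energy=0.95):
    # Singular values are returned in descending order.
    s = np.linalg.svd(X, compute_uv=False)
    # Cumulative share of the squared spectrum captured by the top components.
    cum = np.cumsum(s ** 2) / np.sum(s ** 2)
    # Smallest number of components whose cumulative share reaches `energy`.
    k = min(int(np.searchsorted(cum, energy)) + 1, len(s))
    return s, k

# Hypothetical data: a rank-3 matrix plus small noise.
rng = np.random.default_rng(0)
low_rank = rng.normal(size=(100, 3)) @ rng.normal(size=(3, 20))
s, k = spectrum_summary(low_rank + 0.01 * rng.normal(size=(100, 20)))
```

Plotting `s` against its index gives the kind of spectrum plot the slide refers to; a sharp drop after a few singular values suggests the data is approximately low-dimensional.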
Exponentially decaying spectrum is ubiquitous in real-world data
GDP Trajectories (Macroeconomics)
Avito Ad-Click Dataset (E-Commerce)
Cricket Trajectories (Sports)
Intuitively, an algorithm is ε-differentially private if the output of a statistical query on a database cannot change by more than ε due to the presence/absence of any user data record
Example of Statistical Query: “Average Income of all users between ages 25 and 30”
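A query like this one can be answered privately with the standard Laplace mechanism: add Laplace noise with scale equal to the query's sensitivity divided by ε. The sketch below is an illustration under stated assumptions: incomes are clipped to a known range `[lo, hi]`, the number of records is treated as fixed (neighboring databases differ by replacing one record), and the function name `private_mean` and all numbers are hypothetical.

```python
import numpy as np

def private_mean(values, lo, hi, epsilon, rng):
    # Clip to the assumed range so one record can shift the mean by at most
    # (hi - lo) / n; this is the sensitivity of the query.
    clipped = np.clip(values, lo, hi)
    sensitivity = (hi - lo) / len(clipped)
    # Laplace mechanism: noise scale = sensitivity / epsilon.
    noise = rng.laplace(loc=0.0, scale=sensitivity / epsilon)
    return float(clipped.mean() + noise)

# Hypothetical incomes of 1000 users aged 25 to 30.
rng = np.random.default_rng(0)
incomes = rng.uniform(20_000, 120_000, size=1000)
noisy_average = private_mean(incomes, lo=0, hi=200_000, epsilon=1.0, rng=rng)
```

Smaller ε means more noise and stronger privacy; larger ε returns an answer closer to the true average.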
Figure: Laplacian noise added to the database
Can we achieve good prediction error and still maintain privacy? Yes!
Step 1: Data Owner adds Laplacian Noise
Step 2: Analyst Performs PCR
Done!
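The two-step recipe can be sketched as below. This is an illustration only: the noise scale of 0.5 is an arbitrary placeholder (the slides do not give the scale needed for a particular ε in recoverable form), and the function names and synthetic data are assumptions.

```python
import numpy as np

def add_laplace_noise(X, scale, rng):
    # Step 1 (data owner): perturb every covariate entry with Laplace noise.
    return X + rng.laplace(loc=0.0, scale=scale, size=X.shape)

def pcr_fit(X, y, k):
    # Step 2 (analyst): PCR on the noisy covariates.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    V_k = Vt[:k].T
    gamma, *_ = np.linalg.lstsq(X @ V_k, y, rcond=None)
    return V_k @ gamma

# Hypothetical private data: rank-r covariates A and responses y.
rng = np.random.default_rng(1)
n, p, r = 500, 40, 3
A = rng.normal(size=(n, r)) @ rng.normal(size=(r, p))
y = A @ rng.normal(size=p)

X_private = add_laplace_noise(A, scale=0.5, rng=rng)   # what the analyst receives
beta_hat = pcr_fit(X_private, y, k=r)
# Prediction error measured against the noiseless covariates.
prediction_mse = float(np.mean((A @ beta_hat - y) ** 2))
```

The analyst never touches `A` when fitting; it appears here only to evaluate the fit in the simulation.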
Prediction Error
Figure: singular value spectra for Case 1 and Case 2, annotated "de-noises" and "regularizes"; x-axis: singular values (ordered by magnitude); y-axis: magnitude of singular values
Step 1: Dimension Reduction
Linear Case
Linear low-dimensional covariate pre-processing has many implicit benefits (e.g. de-noising, regularizing)
Non-Linear Case
Does non-linear covariate pre-processing (e.g. GANs) have similar benefits for unstructured data?