On Robustness of Principal Component Regression (Anish Agarwal) - PowerPoint PPT Presentation



SLIDE 1

On Robustness of Principal Component Regression

Anish Agarwal Devavrat Shah, Dennis Shen, Dogyoon Song MIT

SLIDE 2

What is PCR?


SLIDE 5

What is PCR?

Step 1: PCA (k components)

SLIDE 6

What is PCR?

Step 2: Regression (minimize the least-squares objective over the k-dimensional projections)

SLIDE 7

What is PCR?

Step 3: Prediction
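The three steps above can be sketched in a few lines. This is a minimal illustration with synthetic data; the function name and all variables are our own, not the authors' code:

```python
import numpy as np

def pcr_fit_predict(X_train, y_train, X_test, k):
    """Principal Component Regression: PCA on the covariates, least
    squares on the k-dimensional projections, then prediction."""
    # Step 1: PCA -- top-k right singular vectors of the covariate matrix.
    _, _, Vt = np.linalg.svd(X_train, full_matrices=False)
    V_k = Vt[:k].T                       # (p, k) principal directions

    # Step 2: Regression -- least squares in the reduced space.
    Z_train = X_train @ V_k              # (n, k) scores
    beta_k, *_ = np.linalg.lstsq(Z_train, y_train, rcond=None)

    # Step 3: Prediction -- project test covariates, apply fitted model.
    return (X_test @ V_k) @ beta_k

# Tiny example: exactly rank-1 covariates, linear response.
rng = np.random.default_rng(0)
u = rng.normal(size=(50, 1)); v = rng.normal(size=(1, 5))
X = u @ v                                # rank-1 covariate matrix
y = X @ np.ones(5)
pred = pcr_fit_predict(X, y, X, k=1)
print(np.allclose(pred, y))              # True: k=1 captures everything
```

With one component the fit is exact here because the response lies in the one-dimensional column space of the rank-1 covariate matrix.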

SLIDE 8

When & Why Use PCR

SLIDE 9

Data Science Folklore

“IF DATA IS (APPROXIMATELY) LOW-DIMENSIONAL, USE PCR!”

- Anonymous Data Scientists

When exactly should we be using PCR?

SLIDE 10

Key Questions We Answer

What are the theoretical properties of PCR? Is dimension reduction the only benefit of PCR?

SLIDE 11

Our theoretical analysis of PCR helps answer the following questions:

How many principal components should we pick? How low-rank do the covariates need to be? How well does PCR perform on test data (i.e., its generalization properties)?

SLIDE 12

Is Dimension-Reduction the Only Benefit?

NO!

SLIDE 13

PCR (as is) works for a wide variety of settings!

[Figure: covariate matrices that are missing entries, mixed-valued (e.g. 1, 3, 3.14), sensitive, or noisy]
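One of these settings, missing covariate entries, can be handled before the PCA step with the standard fill-with-zeros-and-rescale heuristic; a minimal sketch on synthetic data (the estimate `p_hat` and all names are our own illustration, not the authors' code):

```python
import numpy as np

# Sketch: PCR tolerates missing covariate entries by filling them with 0,
# rescaling by the observed fraction, and letting the low-rank SVD step
# denoise the result.
rng = np.random.default_rng(1)
u = rng.normal(size=(100, 1)); v = rng.normal(size=(1, 8))
X_full = u @ v                                  # rank-1 ground truth
mask = rng.random(X_full.shape) < 0.7           # ~70% entries observed
p_hat = mask.mean()                             # estimated observed fraction
X_obs = np.where(mask, X_full, 0.0) / p_hat     # fill with 0, then rescale

# Step 1 of PCR: keep only the top singular direction.
U, s, Vt = np.linalg.svd(X_obs, full_matrices=False)
X_denoised = s[0] * np.outer(U[:, 0], Vt[0])

# Relative recovery error of the rank-1 reconstruction.
rel_err = np.linalg.norm(X_denoised - X_full) / np.linalg.norm(X_full)
print(round(float(rel_err), 3))
```

The rescaling makes the filled matrix an unbiased estimate of the full one, and the rank-1 truncation discards most of the energy the mask injected.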

SLIDE 14

Main Contribution of this Work

We show PCR is surprisingly robust to problems that plague large-scale modern datasets.
SLIDE 15

Error-In-Variable Regression (Setting We Consider)

SLIDE 16

Classical (high-dimensional) Regression

SLIDE 17

Error-in-Variable (EIV) Regression

[Figure: covariate matrix with missing/corrupted entries]

Representative of modern datasets.
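A minimal simulation of the EIV setting (dimensions, noise level, and names are our choices) shows how the PCA truncation inside PCR pulls the observed covariates back toward the true ones:

```python
import numpy as np

# Error-in-variable sketch: we never see the true covariates X, only a
# noisy version Z = X + H. The PCA step of PCR truncates Z to its top-k
# singular directions, stripping most of H before any regression happens.
rng = np.random.default_rng(2)
n, p, k = 200, 10, 2
X = rng.normal(size=(n, k)) @ rng.normal(size=(k, p))  # true rank-k covariates
Z = X + 0.5 * rng.normal(size=(n, p))                  # observed noisy version

U, s, Vt = np.linalg.svd(Z, full_matrices=False)
Z_k = (U[:, :k] * s[:k]) @ Vt[:k]                      # rank-k truncation of Z

err_raw = np.linalg.norm(Z - X)         # how noisy the raw observations are
err_denoised = np.linalg.norm(Z_k - X)  # after the implicit denoising step
print(err_denoised < err_raw)           # truncation should move Z toward X
```

The noise spreads its energy over all ten directions, while the signal lives in two strong ones, so keeping only those two removes most of the corruption.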

SLIDE 18

EIV - Surprising Number of Applications

Applications: Causal Inference (Synthetic Control), Time Series Analysis, Differentially-private Regression, Mixed-Valued Regression
Noise types: noise by design, measurement noise, structural noise, measurement noise


SLIDE 20

Formal Results

SLIDE 21

Theorem (Informal): Training Error

If the principal components are chosen correctly (k equal to the rank of the covariates), PCR matches the OLS minimax error rate of the low-dimensional, noiseless, fully observed setting: PCR implicitly denoises the covariates! (The rate depends on the fraction of observed entries and the number of covariates.)

SLIDE 22

Theorem (Informal): Testing Error

If the principal components are not chosen correctly (k not equal to the rank):

Test Error ≤ Train Error with PCR(k) + additional terms

PCR implicitly de-noises covariates; PCR implicitly performs ℓ2-regularization.

Choose the k that minimizes the bound above.
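In practice one can pick k by minimizing held-out prediction error, in the spirit of minimizing such a bound; a sketch on synthetic data (split sizes, noise levels, and names are our own illustration):

```python
import numpy as np

# Sketch: choose the number of components k by minimizing validation error.
rng = np.random.default_rng(3)
n, p, r = 300, 12, 3
X = rng.normal(size=(n, r)) @ rng.normal(size=(r, p))   # true rank-r covariates
beta = rng.normal(size=p)
y = X @ beta + 0.1 * rng.normal(size=n)                 # noisy response
Z = X + 0.3 * rng.normal(size=(n, p))                   # noisy covariates

train, val = np.arange(0, 200), np.arange(200, 300)     # simple split

def pcr_val_error(k):
    """Fit PCR(k) on the training split, return validation MSE."""
    _, _, Vt = np.linalg.svd(Z[train], full_matrices=False)
    Vk = Vt[:k].T
    b, *_ = np.linalg.lstsq(Z[train] @ Vk, y[train], rcond=None)
    pred = (Z[val] @ Vk) @ b
    return np.mean((pred - y[val]) ** 2)

errors = {k: pcr_val_error(k) for k in range(1, p + 1)}
best_k = min(errors, key=errors.get)
```

Sweeping k and keeping the validation minimizer is a simple stand-in for minimizing the theoretical bound, since both trade the denoising benefit of small k against the approximation error of truncating too aggressively.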

SLIDE 23

When To (and Not To) Use PCR? Look at the Spectrum

[Figure: magnitude of singular values (ordered by magnitude) for four cases; sharply decaying spectra suggest "Use PCR!", flat spectra suggest "Don't Use PCR!"]
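"Looking at the spectrum" is a two-line computation: take the singular values of the covariate matrix and check how much spectral energy the top components carry. A sketch with our own synthetic examples and helper name:

```python
import numpy as np

# Sketch: inspect the singular-value spectrum of a covariate matrix to
# decide whether PCR is appropriate. A sharp decay means approximate
# low rank (use PCR); a flat spectrum does not.
rng = np.random.default_rng(4)
low_rank = rng.normal(size=(100, 2)) @ rng.normal(size=(2, 20))  # rank 2
flat = rng.normal(size=(100, 20))                                # full rank

def spectrum_decay(X, k=2):
    """Fraction of spectral energy in the top-k singular values."""
    s = np.linalg.svd(X, compute_uv=False)   # ordered by magnitude
    return (s[:k] ** 2).sum() / (s ** 2).sum()

print(spectrum_decay(low_rank))   # close to 1: energy concentrated on top
print(spectrum_decay(flat))       # well below 1: energy spread out
```

In practice one would plot the ordered singular values and look for the sharp drop rather than fix a threshold.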

SLIDE 24

An exponentially decaying spectrum is ubiquitous in real-world data

GDP Trajectories (Macroeconomics)

SLIDE 25

An exponentially decaying spectrum is ubiquitous in real-world data

Avito Ad-Click Dataset (E-Commerce)

SLIDE 26

An exponentially decaying spectrum is ubiquitous in real-world data

Cricket Trajectories (Sports)

SLIDE 27

Surprising Applications of PCR

SLIDE 28

Applications of Error-In-Variable Regression

Applications: Causal Inference (Synthetic Control), Time Series Analysis, Differentially-private Regression, Mixed-Valued Regression
Noise types: noise by design, measurement noise, structural noise, measurement noise

SLIDE 29

Data privacy is top-of-mind as we increasingly apply ML to sensitive user data (genetic data, purchase history, etc.)

SLIDE 30

Standard Notion of Privacy in ML: ε-Differential Privacy

Intuitively, an algorithm is ε-differentially private if the outcome of a statistical query on a database cannot change by more than ε due to the presence/absence of any user's data record.

Example of a Statistical Query: “Average income of all users between ages 25 and 30”

SLIDE 31

How to achieve ε-differential privacy? The Laplace Mechanism: add Laplacian noise, with scale set by the query's sensitivity divided by ε, to the output of each statistical query on the database.
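The Laplace mechanism itself is a one-liner. The sketch below privatizes the slide's example query (average income); the sensitivity bound, the numbers, and the function name are our own illustration, not a calibrated implementation:

```python
import numpy as np

# Sketch of the Laplace mechanism: release a statistic plus
# Laplace(sensitivity / epsilon) noise to get epsilon-differential privacy.
# `sensitivity` must bound how much the statistic can change when one
# user's record is added or removed.
def laplace_mechanism(value, sensitivity, epsilon, rng):
    return value + rng.laplace(scale=sensitivity / epsilon)

rng = np.random.default_rng(5)
incomes = np.array([40_000.0, 55_000.0, 62_000.0, 48_000.0])
true_avg = incomes.mean()

# Query: average income. If incomes are assumed bounded by 100_000, one
# record changes the average by at most 100_000 / n (a crude bound).
private_avg = laplace_mechanism(true_avg,
                                sensitivity=100_000 / len(incomes),
                                epsilon=1.0, rng=rng)
```

Smaller ε means a larger noise scale, which is exactly the accuracy-vs-privacy tradeoff discussed on the next slides.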

SLIDE 32

Predictive Accuracy vs. Privacy Tradeoff

Can we achieve good prediction error and still maintain privacy? Yes!

SLIDE 33

Predictive Accuracy vs. Privacy Tradeoff

Can we achieve good prediction error and still maintain privacy?

Step 1: Data owner adds Laplacian noise.
Step 2: Analyst performs PCR.
Done!
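The two-step recipe can be sketched end-to-end on synthetic data (the noise scale, dimensions, and names are our choices; this is a toy illustration, not a calibrated privacy implementation):

```python
import numpy as np

# Sketch of the two-step recipe: the data owner privatizes the covariates
# with Laplacian noise, then the analyst runs plain PCR on what is released.
rng = np.random.default_rng(6)
n, p, k = 300, 10, 2
X = rng.normal(size=(n, k)) @ rng.normal(size=(k, p))  # true covariates
y = X @ rng.normal(size=p)                             # response

# Step 1 (data owner): release covariates with entry-wise Laplacian noise.
Z = X + rng.laplace(scale=0.5, size=(n, p))

# Step 2 (analyst): ordinary PCR on the privatized covariates.
_, _, Vt = np.linalg.svd(Z, full_matrices=False)
Vk = Vt[:k].T
beta_k, *_ = np.linalg.lstsq(Z @ Vk, y, rcond=None)
pred = (Z @ Vk) @ beta_k

# Relative prediction error despite never seeing the clean covariates.
rel_err = np.linalg.norm(pred - y) / np.linalg.norm(y)
```

The analyst needs no special privacy-aware algorithm: the PCA step of PCR absorbs the injected Laplacian noise the same way it absorbs measurement noise.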

SLIDE 34

What is the sample complexity cost of ε-differential privacy?

[Figure: prediction error]

Does the de-noising step (PCA) break privacy? No: PCA only de-noises the covariates on average with respect to the norm.

SLIDE 35

Conclusion

SLIDE 36

Inspect the spectrum of your covariate matrix.

[Figure: magnitude of singular values (ordered by magnitude); a sharply decaying spectrum means PCR both de-noises and regularizes. Use PCR!]

SLIDE 37

Possible Implications for Modern ML

Step 1: Dimension Reduction

Linear Case (PCA): linear low-dimensional covariate pre-processing has many implicit benefits (e.g. de-noising, regularizing).

Non-Linear Case (GANs?): does non-linear covariate pre-processing have similar benefits for unstructured data?

SLIDE 38

Come Meet Us At Our Poster: Poster #3, East Exhibition Hall B + C, 5-7pm, Thursday

Shameless Plug :) PCR for Time Series Analysis: tspdb.mit.edu; PCR for Causal Inference: github.com/Romcos/SC_demo