Model Selection under Covariate Shift

Masashi Sugiyama (Tokyo Institute of Technology, Tokyo, Japan)
Klaus-Robert Müller (Fraunhofer FIRST, Berlin, Germany; University of Potsdam, Potsdam, Germany)

2 Standard Regression Problem
• Learning target function:
• Training examples:
• Test input:
• Goal: Obtain an approximation that minimizes the expected error for test inputs (the generalization error).

3 Training Input Distribution
• Common assumption: the training input follows the same distribution as the test input.
• Here, we suppose the distributions are different (covariate shift).

4 Covariate Shift
• Is covariate shift important to investigate?
• Yes! It often happens in reality:
  • Interpolation / extrapolation
  • Active learning (experimental design)
  • Classification from imbalanced data

5 Ordinary Least Squares under Covariate Shift
• Asymptotically unbiased if the model is correct.
• Asymptotically biased for misspecified models.
• Need to reduce bias.

6 Weighted Least Squares for Covariate Shift (Shimodaira, 2000)
• Input densities used in the weights: assumed known and strictly positive.
• Asymptotically unbiased for misspecified models.
• Can have large variance.
• Need to reduce variance.
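The weighted objective itself did not survive on this slide. As a sketch of the importance-weighted least squares usually attributed to Shimodaira (2000), assume training pairs $(x_i, y_i)$ for $i = 1, \dots, n$, a parametric model $\hat{f}(x; \alpha)$, and training and test input densities $p_{tr}$ and $p_{te}$. All of these symbol names are assumptions, not the slide's own notation.

```latex
\hat{\alpha}_{\mathrm{WLS}}
  = \operatorname*{argmin}_{\alpha}
    \sum_{i=1}^{n} \frac{p_{te}(x_i)}{p_{tr}(x_i)}
    \bigl( \hat{f}(x_i; \alpha) - y_i \bigr)^2
```

Setting every weight to 1 recovers ordinary least squares. The ratio is well defined only when $p_{tr}$ is strictly positive wherever training points can fall, which is consistent with the positivity assumption stated on the slide.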

7 λ-Weighted Least Squares (Shimodaira, 2000)
• One extreme: large bias, small variance.
• (Intermediate)
• Other extreme: small bias, large variance.
• λ should be chosen appropriately! (Model selection)

8 Generalization Error Estimation under Covariate Shift
• λ is determined so that the (estimated) generalization error is minimized.
• However, standard methods such as cross-validation are heavily biased.
• Goal: Derive a better estimator.
[Figure: true generalization error, cross-validation, and proposed estimator]

9 Setting
• I.i.d. noise with mean 0 and a common variance
• Linear regression model:
• λ-weighted least squares:
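The setting above can be sketched in code. This is a minimal illustration, assuming the weights are the test-to-training density ratio raised to the power λ, as in Shimodaira (2000). The toy target (`np.sinc`), the two Gaussian input densities, the noise level, and the helper name `lambda_wls` are all illustrative assumptions and do not come from the slides.

```python
import numpy as np
from scipy.stats import norm


def lambda_wls(Phi, y, w, lam):
    """Minimise sum_i w_i**lam * (Phi[i] @ alpha - y[i])**2 over alpha.

    Phi: (n, b) design matrix of the linear regression model.
    w:   (n,) importance weights p_test(x_i) / p_train(x_i), strictly positive.
    lam: flattening parameter; 0 gives ordinary LS, 1 gives fully weighted LS.
    """
    s = np.sqrt(w ** lam)  # scaling rows by sqrt(weight) turns WLS into plain LS
    alpha, *_ = np.linalg.lstsq(Phi * s[:, None], y * s, rcond=None)
    return alpha


rng = np.random.default_rng(0)
n = 200

# Assumed toy setup: training inputs ~ N(1, 0.5^2), test inputs ~ N(2, 0.25^2).
x_train = rng.normal(1.0, 0.5, n)
y_train = np.sinc(x_train) + rng.normal(0.0, 0.1, n)  # target plus i.i.d. noise

# Importance weights from the (here known) densities.
w = norm.pdf(x_train, 2.0, 0.25) / norm.pdf(x_train, 1.0, 0.5)

# A straight-line model, deliberately misspecified for this target.
Phi = np.column_stack([np.ones(n), x_train])

alphas = {lam: lambda_wls(Phi, y_train, w, lam) for lam in (0.0, 0.5, 1.0)}
```

Intermediate values of `lam` trade the bias of the unweighted fit against the variance of the fully weighted one, which is why the slides treat the choice of λ as a model selection problem.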

10 Decomposition of Generalization Error
• The generalization error splits into three terms: accessible, estimated, and constant (ignored).
• We estimate the second term.
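The formula behind this slide was lost in extraction. One decomposition consistent with the three labels, assuming the generalization error is the squared error integrated over a test input density $p_{te}$ (the symbols $G$, $\hat{f}$, $f$, and $p_{te}$ are assumed notation), is:

```latex
G = \int \bigl( \hat{f}(x) - f(x) \bigr)^2 p_{te}(x)\, dx
  = \underbrace{\int \hat{f}(x)^2\, p_{te}(x)\, dx}_{\text{accessible}}
  \;-\; 2 \underbrace{\int \hat{f}(x) f(x)\, p_{te}(x)\, dx}_{\text{estimated}}
  \;+\; \underbrace{\int f(x)^2\, p_{te}(x)\, dx}_{\text{constant (ignored)}}
```

Under this reading, the first term depends only on the learned function and the test density, the last term does not depend on the learned function at all, and only the cross term involves the unknown target and must be estimated. The mapping of labels to terms is an inference from the slide, not a quotation.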

11 Orthogonal Decomposition of Learning Target Function
• Notation: optimal parameter

12 Unbiased Estimation
• Notation: expectation over noise.
• Suppose we have a linear unbiased estimator of the optimal parameter and an unbiased estimator of the noise variance.
• Then we have an unbiased estimator of the term to be estimated.
• But these are not always available. Use approximations instead.

13 Approximations
• If the model is correct:
• If the model is misspecified:

14 New Generalization Error Estimator
• Bias:
  • If the model is correct:
  • If the model is almost correct:
  • If the model is misspecified:

15 Simulation (Toy)

16 Results
[Figure: true generalization error, 10-fold cross-validation, and proposed estimator]

17 Simulation (Abalone from DELVE)
• Estimate the age of abalones from 7 physical measurements.
• We add bias to the 4th attribute (weight of abalones).
• Training and test input densities are estimated by a standard kernel density estimator.
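The density estimation step can be sketched as follows. This is a minimal illustration using `scipy.stats.gaussian_kde` with its default bandwidth. The slides do not say which kernel or bandwidth was used, so the helper name `kde_importance_weights`, the `floor` guard, and the synthetic two-dimensional data are assumptions standing in for the 7-dimensional Abalone inputs.

```python
import numpy as np
from scipy.stats import gaussian_kde


def kde_importance_weights(X_train, X_test, floor=1e-12):
    """Estimate p_test(x) / p_train(x) at each training point with Gaussian KDEs.

    X_train, X_test: arrays of shape (n_samples, n_features).
    floor: small constant guarding against division by a vanishing density.
    """
    # gaussian_kde expects data with shape (n_features, n_samples).
    kde_train = gaussian_kde(X_train.T)
    kde_test = gaussian_kde(X_test.T)
    p_train = np.maximum(kde_train(X_train.T), floor)
    return kde_test(X_train.T) / p_train


rng = np.random.default_rng(0)
X_train = rng.normal(0.0, 1.0, size=(300, 2))
X_test = rng.normal(0.0, 1.0, size=(300, 2))
X_test[:, 0] += 1.0  # shift one attribute, loosely mimicking the added bias

weights = kde_importance_weights(X_train, X_test)
```

Training points lying in the direction of the shift receive larger weights on average. Kernel density estimates degrade as the input dimension grows, so weights obtained this way are themselves noisy.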

18 Generalization Error Estimation
Mean over 300 trials
[Figure: true generalization error, 10CV, and proposed estimator]

19 Test Error After Model Selection
T-test (5%)

Extrapolation in 4th attribute
n          50              200            800
OPT        9.86 ± 4.27     7.40 ± 1.77    6.54 ± 1.34
Proposed   11.67 ± 5.74    7.95 ± 2.15    6.77 ± 1.40
10CV       10.88 ± 5.05    8.06 ± 1.91    7.24 ± 1.37

Extrapolation in 6th attribute
n          50              200            800
OPT        9.04 ± 4.04     6.76 ± 1.68    6.05 ± 1.25
Proposed   10.67 ± 6.19    7.31 ± 2.24    6.20 ± 1.33
10CV       10.15 ± 4.95    7.42 ± 1.81    6.68 ± 1.25

20 Conclusions
• Covariate shift: training and test input distributions are different.
• Ordinary LS: biased.
• Weighted LS: unbiased but large variance.
• λ-WLS: model selection needed.
• Cross-validation: biased.
• Proposed generalization error estimator:
  • Exactly unbiased (correct models)
  • Asymptotically unbiased (misspecified models)
