Learning About Selection: An Improved Correction Procedure
Iain G. Snoddy, Ph.D. Candidate, Vancouver School of Economics
2018 Canadian Stata Conference, 27 July 2018

Motivation: Old Method, New Techniques
Question: How to estimate the returns to schooling when people select across locations?
Influential paper in economics to control for self-selection: Dahl (2002), Econometrica

Dahl’s Contribution
◦ Reduces dimension of problem
◦ Non-parametric implementation
◦ Control function approach

Set-up: Roy Model
Earnings equation:
$$y_{ic} = \alpha_c + \beta_{1c} s_i + \beta_{2c} x_i + u_{ic}, \quad c = 1, \dots, C$$
Utility equation:
$$V_{ijc} = y_{ic} + \pi_{ijc}, \quad c = 1, \dots, C$$
where
$$\pi_{ijc} = \gamma_{jc} z_i + \epsilon_{ijc}, \quad c = 1, \dots, C$$
Here $i$ indexes individuals, $c$ states, and $j$ the birth state.

The Selection Rule
We can re-write the utility function as:
$$V_{ijc} = E[y_{ic} \mid s_i, x_i] + E[\pi_{ijc} \mid z_i] + \epsilon_{ijc} + u_{ic} = \vartheta_{jc} + \omega_{ijc}$$
The selection rule:
$$y_{ic} \text{ observed} \iff \max_k \left( \vartheta_{jk} - \vartheta_{jc} + \omega_{ijk} - \omega_{ijc} \right) \le 0$$
Selection bias:
$$E[u_{ic} \mid y_{ic} \text{ observed}] = E[u_{ic} \mid \vartheta_{jc} - \vartheta_{jk} \ge \omega_{ijk} - \omega_{ijc}, \ \forall k \ne c] \ne 0$$
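The selection bias above can be seen in a small simulation. The sketch below is only illustrative and rests on assumptions that are not in the slides: three states, identical systematic utilities ($\vartheta_{jc} = 0$), and independent standard normal draws for $u_{ic}$ and $\epsilon_{ijc}$. The function name `simulate_selection` is hypothetical.

```python
import random

def simulate_selection(n=20000, n_states=3, seed=0):
    """Return the mean of u_ic among individuals observed in each state c.

    Assumptions (illustrative, not from the slides): theta_jc = 0 for all
    states, and u_ic, eps_ijc are independent standard normal draws.
    """
    rng = random.Random(seed)
    sums = [0.0] * n_states
    counts = [0] * n_states
    for _ in range(n):
        u = [rng.gauss(0, 1) for _ in range(n_states)]
        eps = [rng.gauss(0, 1) for _ in range(n_states)]
        # omega_ijc = eps_ijc + u_ic; with theta_jc = 0, utility V equals omega
        omega = [e + ui for e, ui in zip(eps, u)]
        chosen = max(range(n_states), key=lambda c: omega[c])
        # y_ic is observed only for the chosen state, so record u there
        sums[chosen] += u[chosen]
        counts[chosen] += 1
    return [s / k for s, k in zip(sums, counts)]

means = simulate_selection()
print([round(m, 2) for m in means])
```

Although $u_{ic}$ has mean zero in the population, its mean among those who choose state $c$ is clearly positive in this setup, which is the non-zero conditional expectation in the selection bias expression.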

Dahl’s Insight
Full set of migration probabilities summarise the selection problem: $(p_{ij1}, \dots, p_{ijN})$
Estimating equation:
$$y_{ic} = \alpha_c + \beta_{1c} s_i + \beta_{2c} x_i + \sum_j M_{ijc} \times \mu_{jc}(p_{ij1}, \dots, p_{ijN}) + v_{ic}$$

Dahl’s Assumption
Dahl makes the Single Index Sufficiency Assumption (SISA): all of the information in $(p_{ij1}, \dots, p_{ijN})$ is summarised by $p_{ijc}$.
Which implies:
$$\mathrm{cov}(u_{ic}, \omega_{ijm} - \omega_{ijc}) = K, \quad \forall m \ne c$$

Dahl’s Implementation
Estimating equation:
$$y_{ic} = \alpha_c + \beta_{1c} s_i + \beta_{2c} x_i + \sum_j M_{ijc} \times \hat{\mu}_{jc}(p_{ijc}) + v_{ic}$$
◦ Migration probabilities estimated by grouping individuals into cells
◦ selmlog13 Stata command by François Bourguignon, Martin Fournier, and Marc Gurgand
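As a rough illustration of the cell approach, the sketch below groups individuals by birth state and a cell label, then uses the share of each cell living in each state as the estimated migration probability. It is written in Python rather than Stata and is not the selmlog13 implementation; the function name `cell_probabilities`, the cell definition (an education label), and the toy records are all assumptions made for the example.

```python
from collections import Counter, defaultdict

def cell_probabilities(records):
    """Estimate migration probabilities as within-cell shares.

    records: iterable of (birth_state, cell, current_state) tuples.
    Returns {(birth_state, cell): {current_state: share}}.
    """
    counts = defaultdict(Counter)
    for birth_state, cell, current_state in records:
        counts[(birth_state, cell)][current_state] += 1
    probs = {}
    for key, counter in counts.items():
        total = sum(counter.values())
        probs[key] = {state: k / total for state, k in counter.items()}
    return probs

# Toy data (hypothetical): six individuals born in NY, two education cells
records = [
    ("NY", "college", "NY"),
    ("NY", "college", "NY"),
    ("NY", "college", "CA"),
    ("NY", "college", "FL"),
    ("NY", "no_college", "NY"),
    ("NY", "no_college", "NY"),
]
probs = cell_probabilities(records)
print(probs[("NY", "college")])  # {'NY': 0.5, 'CA': 0.25, 'FL': 0.25}
```

How finely to define the cells is a choice the researcher has to make, and small cells give noisy shares.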

Improvement 1: Better P Estimates
◦ Cell approach involves ad hoc choices
◦ Alternative: use a Neural Network, or Random Forest
◦ Ties researchers’ hands
◦ Reduces variance
◦ Reduces noise from poor predictors

Improvement 2: Better Variable Selection
The SISA is restrictive!
Start with full model:
$$y_{ic} = \alpha_c + \beta_{1c} s_i + \beta_{2c} x_i + \tilde{\mu}_c(\hat{p}_{i1}, \dots, \hat{p}_{iN}) + \tilde{v}_{ic}$$
Use Double-Post LASSO to select included terms!
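The slides do not say which terms of the estimated probabilities enter the full model. One plausible candidate set, assumed here purely for illustration, is the levels, squares, and pairwise interactions of the $\hat{p}_{ik}$. The helper name `candidate_terms` and the term labels are hypothetical.

```python
from itertools import combinations

def candidate_terms(p_hat):
    """Expand one individual's estimated probabilities into named terms.

    Assumed term set (not from the slides): levels, squares, and
    pairwise interactions of the estimated probabilities.
    """
    terms = {}
    for k, p in enumerate(p_hat, start=1):
        terms[f"p{k}"] = p
        terms[f"p{k}_sq"] = p * p
    indexed = list(enumerate(p_hat, start=1))
    for (k, pk), (m, pm) in combinations(indexed, 2):
        terms[f"p{k}_x_p{m}"] = pk * pm
    return terms

terms = candidate_terms([0.5, 0.25, 0.25])
print(len(terms))        # 9
print(terms["p1_x_p2"])  # 0.125
```

With N states this set grows quadratically in N, which is why a selection step is needed before estimation. Because the probabilities sum to one, one level term is redundant once a constant is included.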

Improvement 2: Double-Post LASSO
Belloni, Chernozhukov, and Hansen (2014)
LASSO:
$$\min_{\beta} \ (y - X\beta)^T (y - X\beta) \quad \text{subject to} \quad \|\beta\|_1 \le t$$
where $t$ is a free parameter that determines regularization.
Procedure:
1. Run LASSO of y on terms
2. Run LASSO of x on terms
3. Run y on x plus terms included in 1 & 2
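The three steps can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the procedure used in the talk: it solves the penalized (Lagrangian) form of the LASSO by coordinate descent with a hand-picked penalty `lam`, omits the intercept (the simulated data are mean zero), and uses made-up data. Here `d` plays the role of the regressor x in the steps above and `W` holds the candidate terms. All function names are hypothetical.

```python
import random

def lasso_cd(X, y, lam, n_iter=100):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam * ||b||_1."""
    n, p = len(X), len(X[0])
    b = [0.0] * p
    resid = list(y)
    col_ss = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            # covariance of column j with the partial residual
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_ss[j] * b[j]
            new = max(abs(rho) - lam, 0.0) / col_ss[j]  # soft threshold
            new = new if rho > 0 else -new
            delta = new - b[j]
            if delta:
                for i in range(n):
                    resid[i] -= delta * X[i][j]
                b[j] = new
    return b

def ols(X, y):
    """OLS via the normal equations and Gauss-Jordan elimination."""
    n, p = len(X), len(X[0])
    A = [[sum(X[i][a] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][a] * y[i] for i in range(n))] for a in range(p)]
    for col in range(p):
        piv = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(p):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [ar - f * ac for ar, ac in zip(A[r], A[col])]
    return [A[r][p] / A[r][r] for r in range(p)]

def double_post_lasso(y, d, W, lam):
    """Return (coefficient on d, indices of the selected terms)."""
    sel_y = {j for j, b in enumerate(lasso_cd(W, y, lam)) if b != 0.0}  # step 1
    sel_d = {j for j, b in enumerate(lasso_cd(W, d, lam)) if b != 0.0}  # step 2
    keep = sorted(sel_y | sel_d)
    X = [[d[i]] + [W[i][j] for j in keep] for i in range(len(y))]       # step 3
    return ols(X, y)[0], keep

# Made-up data: term 0 affects both y and d, term 1 affects d only
rng = random.Random(1)
n, p = 400, 8
W = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
d = [0.5 * w[0] + 0.8 * w[1] + rng.gauss(0, 1) for w in W]
y = [1.0 * di + 1.0 * w[0] + rng.gauss(0, 1) for di, w in zip(d, W)]
beta_d, keep = double_post_lasso(y, d, W, lam=0.1)
print(round(beta_d, 2), keep)
```

Taking the union of the two selected sets is the point of the "double" step: a term that matters for x but is missed in the y regression would otherwise be dropped and bias the final coefficient.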

Improvement 2: Does it Work???
Monte Carlo experiment: Use the Roy Model
The SISA: $u_{ic} = \tau_c a_i + b_{ic}$
Three cases:
◦ SISA holds
◦ SISA weak violation
◦ SISA strong violation
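The factor structure $u_{ic} = \tau_c a_i + b_{ic}$ can be drawn as below. This is a sketch only: the slides do not give the distributions or the values of $\tau_c$ used, so independent standard normal $a_i$ and $b_{ic}$ and the vector `tau` are assumptions, and `draw_errors` and `sample_cov` are hypothetical helper names.

```python
import random

def draw_errors(tau, n, seed=0):
    """Draw n rows of errors u_ic = tau_c * a_i + b_ic, one column per state.

    Assumption (not from the slides): a_i and b_ic are independent N(0, 1).
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        a = rng.gauss(0, 1)  # common individual component a_i
        rows.append([t * a + rng.gauss(0, 1) for t in tau])
    return rows

def sample_cov(xs, ys):
    """Sample covariance of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

u = draw_errors(tau=[1.0, 1.0, 1.0], n=20000)
cov_12 = sample_cov([r[0] for r in u], [r[1] for r in u])
print(round(cov_12, 2))
```

Under these assumptions the covariance between the errors of two states $c$ and $k$ is $\tau_c \tau_k$, so the loadings $\tau_c$ control how the correlation pattern varies across states.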

Lassopack
Implemented using Lassopack (Ahrens, Hansen, and Schaffer)
Use square-root LASSO:
rlasso y p*, sqrt partial(x)
rlasso s p*, sqrt partial(x)
Use loop over macro e(selected) to select terms

Improvement 2: Yes it Works!
Table 1: Monte Carlo Output: 5 Sectors
Estimators compared: LASSO, Full, OLS, Dahl P1
Cases: τ_c = 1, τ_c = β_c, τ_1 ≠ 1 (RMSE and Bias reported for each)

N=1000
          0.060  −0.046   0.112  −0.105   0.064  −0.051
Dahl P1   0.049  −0.027   0.087  −0.077   0.062  −0.048
         −0.024  −0.037   0.064   0.003   0.067   0.069
          0.056   0.010   0.060  −0.018   0.058  −0.029

N=10000
          0.048  −0.046   0.105  −0.105   0.052  −0.051
Dahl P1   0.019  −0.013   0.055  −0.054   0.045  −0.044
          0.037   0.014   0.034   0.004   0.035  −0.018
          0.034   0.018   0.032   0.014   0.027  −0.009

Empirical Example

The Returns to Schooling
Sample: white males, 25-54, using 1990 US Census.
Migration probabilities estimated using:
◦ Birth state
◦ 5 education categories
◦ Married
◦ # children 5-18, # children <5
◦ Divorced
◦ Live with roommate, family member, alone

Final Results
Table 2: Corrected Estimates versus OLS

                     NY        Calif.    Florida   Texas     Kansas    Illinois
OLS
  College            0.4291    0.4506    0.3689    0.3465    0.4399    0.5166
                     (0.0075)  (0.0098)  (0.0096)  (0.0192)  (0.0084)  (0.0086)
  Adv                0.5865    0.6618    0.5445    0.4970    0.6037    0.6840
                     (0.0105)  (0.0154)  (0.0138)  (0.0315)  (0.0113)  (0.0131)
Double-Post LASSO
  College            0.3727    0.3919    0.3779    0.3737    0.4192    0.5036
                     (0.0138)  (0.0145)  (0.0233)  (0.0345)  (0.0248)  (0.0167)
  Adv                0.4864    0.5344    0.4798    0.4807    0.5462    0.6727
                     (0.0205)  (0.0209)  (0.023)   (0.0447)  (0.0145)  (0.019)

Final Results
Table 3: Hausman Test of Difference

                     NY          Calif.      Florida     Texas     Kansas      Illinois
LASSO v OLS
  College            −5.586***   −5.823***   0.955       2.763     −2.032      −1.254
  Adv                −10.686***  −13.021***  −7.042***   −2.187    −6.185***   −1.5
LASSO v Dahl
  College            −5.146***   −4.489***   4.854**     2.809     7.366***    0.727
  Adv                −8.294***   −11.12***   −1.507      −1.648    4.893***    2.334
