generalized sample selection model

Generalized sample selection model. Małgorzata Wojtyś¹, Giampiero Marra². ¹ Plymouth University, ² University College London. XLII Konferencja "Statystyka Matematyczna", Będlewo, November 29, 2016.


  1. Generalized sample selection model
     Małgorzata Wojtyś¹, Giampiero Marra²
     ¹ Plymouth University, ² University College London
     XLII Konferencja "Statystyka Matematyczna", Będlewo, November 29, 2016

  2. Plan
     Sample selection problem: classical Heckman model; generalized model using GAM and copulae; estimation approach; real-life application example.

  3. Motivating example
     Example: HIV prevalence, modelled as P(HIV positive) ∼ socio-economic and health characteristics.
     Some individuals in the sample refused to say whether they are HIV positive. They may differ in important characteristics from individuals who did answer the question.
     If a link between the decision to provide an answer and being HIV positive exists, and it does not operate only through observables, then sample selection bias arises and a univariate equation model is not appropriate.

  6. Sample selection
     Regression of primary interest:
     $$Y_i^* \sim x_i^{(1)}, \quad i = 1, \dots, n,$$
     where $x_i^{(1)}$ is a row vector of predictors.
     But: observations on some $Y_i^*$ are missing, based on a combination of observed and unobserved characteristics.
     Observables: $Y_i = Y_i^* U_i$, where $U_i \in \{0, 1\}$ is a binary selection variable.
     Selection mechanism:
     $$P(U_i = 1) \sim x_i^{(2)},$$
     where $x_i^{(2)}$ is a vector of covariates.
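
The setup on this slide can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the authors' code: it is intercept-only, the errors are bivariate normal, and the names `simulate_selection`, `selected_mean`, `rho`, and the true latent mean 1.0 are hypothetical choices made for illustration. It builds $U_i$ and $Y_i = Y_i^* U_i$ and compares the mean over selected cases for dependent and independent selection.

```python
import random
from statistics import mean

def simulate_selection(n, rho, seed=0):
    """Draw latent outcomes Y*, selection indicators U, and observables Y = Y* U."""
    rng = random.Random(seed)
    y_star, u = [], []
    for _ in range(n):
        e2 = rng.gauss(0.0, 1.0)  # unobserved part of the selection equation
        # e1 has unit variance and correlation rho with e2
        e1 = rho * e2 + (1 - rho**2) ** 0.5 * rng.gauss(0.0, 1.0)
        y_star.append(1.0 + e1)       # latent outcome, true mean 1.0 (assumed)
        u.append(1 if e2 > 0 else 0)  # selected when the latent U* is positive
    y = [ys * ui for ys, ui in zip(y_star, u)]  # outcome is observed only if U = 1
    return y_star, u, y

def selected_mean(y, u):
    """Mean outcome among selected units only, as a naive analysis would compute."""
    return mean(yi for yi, ui in zip(y, u) if ui == 1)

_, u_dep, y_dep = simulate_selection(20000, rho=0.8)
naive_dependent = selected_mean(y_dep, u_dep)      # shifted away from 1.0

_, u_ind, y_ind = simulate_selection(20000, rho=0.0)
naive_independent = selected_mean(y_ind, u_ind)    # close to 1.0
```

When `rho` is zero the selected cases are a random subsample and the naive mean is close to the true value; with `rho = 0.8` the selected cases systematically have larger outcomes, which is the selection bias the slides describe.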

  10. Classical Heckman (1979) model
      For $i = 1, \dots, n$:
      $$Y_i^* = x_i^{(1)} \beta^{(1)} + \varepsilon_{1i},$$
      $$U_i^* = x_i^{(2)} \beta^{(2)} + \varepsilon_{2i},$$
      where
      $$\begin{pmatrix} \varepsilon_{1i} \\ \varepsilon_{2i} \end{pmatrix} \sim N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \sigma^2 & \rho\sigma \\ \rho\sigma & 1 \end{pmatrix} \right).$$
      Latent variables: $Y_i^*$, $U_i^*$.
      Observables: $U_i = I(U_i^* > 0)$ (which gives a probit regression) and $Y_i = Y_i^* U_i$.
      Modifications: e.g. the bivariate $t$-distribution (Marchenko & Genton, 2012), Archimedean copulas (Smith, 2003).
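
Under the bivariate normal errors of this slide, the mean of the outcome among selected units has a well-known closed form, $E(Y_i^* \mid U_i = 1) = \eta_1 + \rho\sigma\,\varphi(\eta_2)/\Phi(\eta_2)$, where $\varphi/\Phi$ is the inverse Mills ratio. The sketch below evaluates it; the shorthand $\eta_1 = x_i^{(1)}\beta^{(1)}$, $\eta_2 = x_i^{(2)}\beta^{(2)}$ and the function names are assumptions made here for illustration.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for pdf and cdf

def inverse_mills(eta2):
    """Inverse Mills ratio: standard normal pdf over cdf at eta2."""
    return _STD_NORMAL.pdf(eta2) / _STD_NORMAL.cdf(eta2)

def selected_outcome_mean(eta1, eta2, rho, sigma):
    """E(Y* | U = 1) when (eps1, eps2) is bivariate normal as in the Heckman model."""
    return eta1 + rho * sigma * inverse_mills(eta2)

# Bias of the naive mean among selected units for one hypothetical parameter setting
bias = selected_outcome_mean(eta1=1.0, eta2=0.0, rho=0.8, sigma=1.0) - 1.0
```

The bias term vanishes when `rho` is zero, which is why an ordinary single-equation regression is adequate only if the two error terms are uncorrelated.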

  11. Generalized sample selection model
      Random component: $Y^* \sim f_1$, where $f_1$ belongs to an exponential family of distributions:
      $$f_1(y \mid \eta_1, \phi) = \exp\left\{ \frac{y \eta_1 - b(\eta_1)}{\phi} + c(y, \phi) \right\}$$
      for some $b(\cdot)$ and $c(\cdot)$. It holds that $E(Y^*) = b'(\eta_1)$ and $\mathrm{Var}(Y^*) = \phi\, b''(\eta_1)$.
      Selection variable: $U = I(U^* > 0)$ with $U^* \sim f_2$, where
      $$f_2(u \mid \eta_2) = \frac{1}{\sqrt{2\pi}} \exp\left\{ -\frac{(u - \eta_2)^2}{2} \right\},$$
      implying the probit regression model for $U$.
      $F(y, u)$ is the joint cdf of $(Y^*, U^*)$; $F_1(y)$ and $F_2(u)$ are the marginal cdfs.
      $C_\theta$ is the copula such that
      $$F(y, u) = C_\theta(F_1(y), F_2(u)),$$
      where $\theta$ is the dependence parameter of the copula.
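
As a concrete instance of the exponential-family form, the Poisson distribution can be written with $b(\eta_1) = e^{\eta_1}$, $c(y, \phi) = -\log y!$ and $\phi = 1$. The sketch below is an illustrative check (the choice of Poisson and the function name are assumptions, not taken from the slides): it evaluates the density in that form and numerically recovers the mean $b'(\eta_1)$ and, since $\phi = 1$ here, the variance $b''(\eta_1)$.

```python
import math

def poisson_expfam_pmf(y, eta):
    """Poisson pmf written as exp{y*eta - b(eta) + c(y)}, with phi = 1."""
    b = math.exp(eta)         # b(eta) = exp(eta), so b'(eta) = b''(eta) = exp(eta)
    c = -math.lgamma(y + 1)   # c(y) = -log(y!)
    return math.exp(y * eta - b + c)

eta = math.log(3.0)           # natural parameter for a Poisson mean of 3
support = range(200)          # truncation point; the tail mass beyond it is negligible
probs = [poisson_expfam_pmf(y, eta) for y in support]

total = sum(probs)
mean_y = sum(y * p for y, p in zip(support, probs))
var_y = sum((y - mean_y) ** 2 * p for y, p in zip(support, probs))
```

Both `mean_y` and `var_y` should agree with `math.exp(eta)` up to floating-point error, matching the moment identities stated on the slide.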

  18. Likelihood
      Likelihood of an observed outcome $(Y, U)$:
      $$L = \begin{cases} P(U = 0) & \text{if } U = 0, \\ f_{Y|U}(Y \mid U = 1)\, P(U = 1) & \text{if } U = 1. \end{cases}$$
      It holds that
      $$f_{Y|U}(y \mid U = 1) = \frac{\partial}{\partial y} P(Y \le y \mid U = 1) = \frac{\partial}{\partial y} \frac{P(Y^* \le y, U^* > 0)}{P(U = 1)} = \frac{\partial}{\partial y} \frac{F_1(y) - F(y, 0)}{P(U = 1)} = \frac{1}{P(U = 1)} \left( f_1(y) - \frac{\partial}{\partial y} F(y, 0) \right).$$
      So
      $$L = \begin{cases} P(U = 0) = F_2(0) & \text{if } U = 0, \\ f_1(y) - \frac{\partial}{\partial y} F(y, 0) \big|_{y = Y} & \text{if } U = 1. \end{cases}$$
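
As a worked special case, the likelihood on this slide can be evaluated for the classical bivariate normal model. The block below is a sketch under that assumption, writing $\eta_1$ and $\eta_2$ for the two linear predictors, $\varphi$ and $\Phi$ for the standard normal pdf and cdf, and using the fact that $U^* \mid Y^* = y \sim N\big(\eta_2 + \rho (y - \eta_1)/\sigma,\; 1 - \rho^2\big)$.

```latex
\begin{align*}
\frac{\partial}{\partial y} F(y, 0)
  &= f_1(y)\, P(U^* \le 0 \mid Y^* = y)
   = f_1(y)\, \Phi\!\left( -\frac{\eta_2 + \rho (y - \eta_1)/\sigma}{\sqrt{1 - \rho^2}} \right), \\
L \big|_{U = 1}
  &= f_1(Y) \left[ 1 - \Phi\!\left( -\frac{\eta_2 + \rho (Y - \eta_1)/\sigma}{\sqrt{1 - \rho^2}} \right) \right]
   = \frac{1}{\sigma}\, \varphi\!\left( \frac{Y - \eta_1}{\sigma} \right)
     \Phi\!\left( \frac{\eta_2 + \rho (Y - \eta_1)/\sigma}{\sqrt{1 - \rho^2}} \right), \\
L \big|_{U = 0}
  &= F_2(0) = \Phi(-\eta_2).
\end{align*}
```

This is the familiar likelihood of the Heckman selection model, so the general expression reduces to the classical one when the margins and the dependence are Gaussian.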

  20. Log-likelihood
      So:
      $$L(Y, U) = F_2(0)^{1 - U} \times \left( f_1(y) - \frac{\partial}{\partial y} F(y, 0) \Big|_{y = Y} \right)^{U}.$$
      Using the copula representation, we obtain the log-likelihood
      $$\ell = (1 - U) \log F_2(0) + U \log\big( f_1(Y) (1 - z(Y, \eta_1, \eta_2)) \big),$$
      where
      $$z(y, \eta_1, \eta_2) = \frac{\partial}{\partial v} C_\theta(v, F_2(0)) \Big|_{v \to F_1(y)}.$$
      The function $z$ can also be expressed as
      $$z(y, \eta_1, \eta_2) = P(U = 0)\, f_{Y^*|U}(y \mid U = 0)\, (f_1(y \mid \eta_1))^{-1}.$$
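
The log-likelihood above can be coded directly once a copula and a margin are chosen. The following is a minimal sketch, not the authors' implementation: it assumes a Gaussian margin for $Y^*$ and the Farlie-Gumbel-Morgenstern (FGM) copula $C_\theta(v, w) = vw\,[1 + \theta(1 - v)(1 - w)]$ with $|\theta| \le 1$, picked only because its partial derivative has a simple closed form. The function names and the parameter `sigma` are hypothetical.

```python
import math
from statistics import NormalDist

def fgm_partial_v(v, w, theta):
    """d/dv of the FGM copula C(v, w) = v*w*(1 + theta*(1 - v)*(1 - w))."""
    return w * (1.0 + theta * (1.0 - w) * (1.0 - 2.0 * v))

def loglik_contribution(y, u, eta1, eta2, sigma, theta):
    """One observation's term: (1 - U) log F2(0) + U log(f1(Y) (1 - z))."""
    margin = NormalDist(eta1, sigma)      # assumed Gaussian margin F1 for Y*
    p0 = NormalDist(eta2, 1.0).cdf(0.0)   # F2(0) = P(U = 0) from the probit margin
    if u == 0:
        return math.log(p0)
    z = fgm_partial_v(margin.cdf(y), p0, theta)  # z(y, eta1, eta2) at v = F1(y)
    return math.log(margin.pdf(y) * (1.0 - z))

def loglik(ys, us, eta1, eta2, sigma, theta):
    """Sum of contributions over a sample with common linear predictors."""
    return sum(loglik_contribution(y, u, eta1, eta2, sigma, theta)
               for y, u in zip(ys, us))
```

With `theta = 0` the copula is the independence copula, `z` equals `F2(0)`, and the selected-case term splits into `log f1(Y) + log P(U = 1)`, so the two equations can be fitted separately.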

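  21. The fact that $E(Y^*) = b'(\eta_1)$, together with $\frac{\partial}{\partial \eta_1} \log(1 - z) = -\frac{\partial z / \partial \eta_1}{1 - z}$, implies
      $$\frac{\partial \ell}{\partial \eta_1} = U\, \frac{Y - \mu_1}{\phi} - U\, \frac{\frac{\partial}{\partial \eta_1} z(Y, \eta_1, \eta_2)}{1 - z(Y, \eta_1, \eta_2)},$$
      where $\mu_1 = E(Y^*)$. As $E\left( \frac{\partial}{\partial \eta_1} \ell \right) = 0$ and $E[U (Y - \mu_1)] = \mathrm{Cov}(U, Y^*)$,
      $$\mathrm{Cov}(U, Y^*) = \phi\, E\left[ U\, \frac{\frac{\partial}{\partial \eta_1} z(Y, \eta_1, \eta_2)}{1 - z(Y, \eta_1, \eta_2)} \right],$$
      which provides another interpretation for the function $z(Y, \eta_1, \eta_2)$.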
