
On the Behavior of Marginal and Conditional Akaike Information Criteria in Linear Mixed Models

Thomas Kneib, Department of Mathematics, Carl von Ossietzky University Oldenburg
Sonja Greven, Department of Biostatistics, Johns Hopkins University

Lübeck, 5.12.2009

Outline

• Akaike Information Criterion
• Linear Mixed Models
• Marginal Akaike Information Criterion
• Conditional Akaike Information Criterion
• Application: Childhood Malnutrition in Nigeria

Akaike Information Criterion

• Most commonly used model choice criterion for comparing parametric models.
• Definition:
  $$\mathrm{AIC} = -2\, l(\hat\psi) + 2k,$$
  where $l(\hat\psi)$ is the log-likelihood evaluated at the maximum likelihood estimate $\hat\psi$ of the unknown parameter vector $\psi$, and $k = \dim(\psi)$ is the number of parameters.
• Properties:
  – Compromise between model fit and model complexity.
  – Allows non-nested models to be compared.
  – In variable selection problems, tends to select too many rather than too few variables.
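The definition can be made concrete with a small sketch. The slides do not fix a particular model, so the example below assumes a Gaussian linear model fitted by maximum likelihood; the helper name `gaussian_aic` and the simulated data are illustrative only.

```python
import numpy as np

def gaussian_aic(y, X):
    """AIC = -2 l(psi_hat) + 2k for y ~ N(X beta, sigma^2 I), fitted by ML."""
    n, p = X.shape
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta_hat
    sigma2_hat = resid @ resid / n          # ML estimate (divides by n, not n - p)
    loglik = -0.5 * n * (np.log(2 * np.pi * sigma2_hat) + 1.0)
    k = p + 1                               # regression coefficients plus the variance
    return -2.0 * loglik + 2.0 * k

# Simulated data (assumption for illustration): the truth is linear in x.
rng = np.random.default_rng(0)
n = 200
x = rng.uniform(-1, 1, n)
y = 1.0 + 2.0 * x + rng.normal(0.0, 0.5, n)

X_const = np.ones((n, 1))                   # intercept-only model
X_lin = np.column_stack([np.ones(n), x])    # intercept plus linear effect
aic_const = gaussian_aic(y, X_const)
aic_lin = gaussian_aic(y, X_lin)
print(f"AIC constant: {aic_const:.1f}, AIC linear: {aic_lin:.1f}")
```

The model with the smaller AIC is preferred; here the extra parameter of the linear model is paid for by a much larger log-likelihood.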

• Data $y$ generated from a true underlying model described in terms of a density $g(\cdot)$.
• Approximate the true model by a parametric class of models $f_\psi(\cdot) = f(\cdot\,; \psi)$.
• Measure the discrepancy between a model $f_\psi(\cdot)$ and the truth $g(\cdot)$ by the Kullback-Leibler distance
  $$K(f_\psi, g) = \int [\log(g(z)) - \log(f_\psi(z))]\, g(z)\, dz = \mathrm{E}_z[\log(g(z)) - \log(f_\psi(z))],$$
  where $z$ is an independent replicate following the same distribution as $y$.
• Decision rule: Out of a sequence of models, choose the one that minimises $K(f_\psi, g)$.
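As a rough numerical illustration of the decision rule, the sketch below assumes the truth $g$ is a standard normal density and compares a few normal candidates $f$; neither choice comes from the slides. The integral is approximated by a plain Riemann sum and checked against the closed form for two normal densities.

```python
import numpy as np

def normal_logpdf(z, mean, sd):
    return -0.5 * np.log(2 * np.pi * sd**2) - 0.5 * ((z - mean) / sd) ** 2

def kl_distance(mean_f, sd_f, lo=-12.0, hi=12.0, num=200_001):
    """K(f, g) with truth g = N(0, 1) and model f = N(mean_f, sd_f^2), by a Riemann sum."""
    z = np.linspace(lo, hi, num)
    dz = z[1] - z[0]
    log_g = normal_logpdf(z, 0.0, 1.0)
    log_f = normal_logpdf(z, mean_f, sd_f)
    return np.sum((log_g - log_f) * np.exp(log_g)) * dz

def kl_closed_form(mean_f, sd_f):
    """Closed form of the same quantity for g = N(0, 1)."""
    return np.log(sd_f) + (1.0 + mean_f**2) / (2.0 * sd_f**2) - 0.5

# Hypothetical candidate models, given as (mean, standard deviation).
candidates = {"N(1, 1)": (1.0, 1.0), "N(0, 4)": (0.0, 2.0), "N(0.2, 1)": (0.2, 1.0)}
distances = {name: kl_distance(*pars) for name, pars in candidates.items()}
best = min(distances, key=distances.get)    # candidate closest to the truth
print(distances, best)
```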

• In practice, the parameter $\psi$ will have to be estimated as $\hat\psi(y)$ for the different models.
• To focus on average properties not depending on a specific data realisation, minimise the expected Kullback-Leibler distance
  $$\mathrm{E}_y[K(f_{\hat\psi(y)}, g)] = \mathrm{E}_y\big[\mathrm{E}_z\big[\log(g(z)) - \log(f_{\hat\psi(y)}(z))\big]\big].$$
• Since $g(\cdot)$ does not depend on the data, this is equivalent to minimising
  $$-2\, \mathrm{E}_y\big[\mathrm{E}_z\big[\log(f_{\hat\psi(y)}(z))\big]\big] \qquad (1)$$
  (the expected relative Kullback-Leibler distance).

• The best available estimate for (1) is given by
  $$-2 \log(f_{\hat\psi(y)}(y)).$$
• While (1) is a predictive quantity depending on both the data $y$ and an independent replication $z$, in this estimate the density and the parameter estimate are evaluated for the same data.
⇒ Introduce a correction term.

• Let $\tilde\psi$ denote the parameter vector minimising the Kullback-Leibler distance.
• Then
  $$\mathrm{AIC} = -2\log(f_{\hat\psi(y)}(y)) + 2\,\mathrm{E}_y\big[\log(f_{\hat\psi(y)}(y)) - \log(f_{\tilde\psi}(y))\big] + 2\,\mathrm{E}_y\big[\mathrm{E}_z\big[\log(f_{\tilde\psi}(z)) - \log(f_{\hat\psi(y)}(z))\big]\big]$$
  is unbiased for (1).
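The unbiasedness claim can be checked by taking the expectation over $y$ term by term. The short derivation below is a sketch that uses only the definitions above and the fact that $y$ and $z$ follow the same distribution while $\tilde\psi$ is a fixed (non-random) parameter value.

```latex
\begin{align*}
\mathrm{E}_y[\mathrm{AIC}]
 &= -2\,\mathrm{E}_y\big[\log f_{\hat\psi(y)}(y)\big]
    + 2\,\mathrm{E}_y\big[\log f_{\hat\psi(y)}(y)\big]
    - 2\,\mathrm{E}_y\big[\log f_{\tilde\psi}(y)\big] \\
 &\quad + 2\,\mathrm{E}_z\big[\log f_{\tilde\psi}(z)\big]
    - 2\,\mathrm{E}_y\big[\mathrm{E}_z\big[\log f_{\hat\psi(y)}(z)\big]\big] \\
 &= -2\,\mathrm{E}_y\big[\mathrm{E}_z\big[\log f_{\hat\psi(y)}(z)\big]\big].
\end{align*}
```

The first two terms cancel directly, and $\mathrm{E}_y[\log f_{\tilde\psi}(y)] = \mathrm{E}_z[\log f_{\tilde\psi}(z)]$ because $y$ and $z$ are identically distributed, which leaves exactly the target quantity (1).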

• Consider the regularity conditions
  – $\psi$ is a $k$-dimensional parameter with parameter space $\Psi = \mathbb{R}^k$ (possibly achieved by a change of coordinates).
  – $y$ consists of independent and identically distributed replications $y_1, \ldots, y_n$.
• In this case, the AIC simplifies since
  $$2\big[\log(f_{\hat\psi(y)}(y)) - \log(f_{\tilde\psi}(y))\big] \overset{a}{\sim} \chi^2_k,$$
  $$2\,\mathrm{E}_z\big[\log(f_{\tilde\psi}(z)) - \log(f_{\hat\psi(y)}(z))\big] \overset{a}{\sim} \chi^2_k,$$
  so that each of the two correction terms has asymptotic expectation $k$, and therefore an (asymptotically) unbiased estimate for (1) is given by
  $$\mathrm{AIC} = -2\log(f_{\hat\psi(y)}(y)) + 2k.$$
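A small simulation can illustrate why both correction terms contribute $k$. The sketch assumes the simplest possible setting, which is not taken from the slides: i.i.d. $\mathrm{N}(\mu, 1)$ data with known variance, so $k = 1$ and $\tilde\psi = \mu$. In this model both statistics equal $n(\bar y - \mu)^2$ and are exactly $\chi^2_1$ distributed.

```python
import numpy as np

def loglik(y, mu):
    """Log-likelihood of i.i.d. N(mu, 1) data; each row of y is one data set."""
    mu = np.asarray(mu, dtype=float)[..., None]
    return -0.5 * y.shape[-1] * np.log(2 * np.pi) - 0.5 * np.sum((y - mu) ** 2, axis=-1)

rng = np.random.default_rng(1)
n, reps, mu_true = 50, 20_000, 0.3          # illustrative settings
y = rng.normal(mu_true, 1.0, size=(reps, n))
mu_hat = y.mean(axis=1)                     # ML estimate for each simulated data set

# First correction term: in-sample log-likelihood gain of the MLE over the KL minimiser.
stat_in = 2.0 * (loglik(y, mu_hat) - loglik(y, mu_true))
# Second term: E_z of the out-of-sample loss, available in closed form for this model.
stat_out = n * (mu_hat - mu_true) ** 2

mean_in, mean_out = stat_in.mean(), stat_out.mean()
print(mean_in, mean_out)                    # both should be close to k = 1
```

Adding the two expectations gives the penalty $2k$ of the classical AIC.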

Linear Mixed Models

• Mixed models form a very useful class of regression models with general form
  $$y = X\beta + Zb + \varepsilon,$$
  where $\beta$ are the usual regression coefficients while $b$ are random effects with distributional assumption
  $$\begin{pmatrix} \varepsilon \\ b \end{pmatrix} \sim \mathrm{N}\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \sigma^2 I & 0 \\ 0 & D \end{pmatrix} \right).$$
• In the following, we will concentrate on mixed models with only one variance component, where
  $$b \sim \mathrm{N}(0, \tau^2 I) \quad\text{or}\quad b \sim \mathrm{N}(0, \tau^2 \Sigma)$$
  with $\Sigma$ known.

• Special case I: Random intercept model for longitudinal data
  $$y_{ij} = x_{ij}'\beta + b_i + \varepsilon_{ij}, \quad j = 1, \ldots, J_i, \quad i = 1, \ldots, I,$$
  where $i$ indexes individuals while $j$ indexes repeated observations on the same individual.
• The random intercept $b_i$ accounts for shifts in the individual level of response trajectories and therefore also for intra-subject correlations.
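The induced intra-subject correlation can be written out explicitly. The lines below are a sketch assuming independent $b_i \sim \mathrm{N}(0, \tau^2)$ and $\varepsilon_{ij} \sim \mathrm{N}(0, \sigma^2)$, i.e. the one-variance-component case with $\Sigma = I$.

```latex
\begin{align*}
\mathrm{Var}(y_{ij}) &= \mathrm{Var}(b_i) + \mathrm{Var}(\varepsilon_{ij}) = \tau^2 + \sigma^2, \\
\mathrm{Cov}(y_{ij}, y_{ij'}) &= \mathrm{Var}(b_i) = \tau^2 \qquad (j \neq j'), \\
\mathrm{Corr}(y_{ij}, y_{ij'}) &= \frac{\tau^2}{\tau^2 + \sigma^2} \qquad (j \neq j'), \\
\mathrm{Cov}(y_{ij}, y_{i'j'}) &= 0 \qquad (i \neq i').
\end{align*}
```

Observations on the same individual are therefore positively correlated whenever $\tau^2 > 0$, while observations on different individuals remain uncorrelated.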

• Special case II: Penalised spline smoothing for nonparametric function estimation
  $$y_i = m(x_i) + \varepsilon_i, \quad i = 1, \ldots, n,$$
  where $m(x)$ is a smooth, unspecified function.
• Approximating $m(x)$ in terms of a spline basis of degree $d$ leads (for example) to the truncated power series representation
  $$m(x) = \sum_{j=0}^{d} \beta_j x^j + \sum_{j=1}^{K} b_j (x - \kappa_j)_+^d,$$
  where $\kappa_1, \ldots, \kappa_K$ denotes a sequence of knots.
• Assume the random effects distribution $b \sim \mathrm{N}(0, \tau^2 I)$ for the basis coefficients of the truncated polynomials to enforce smoothness.
• Also works for other basis choices (e.g. B-splines) and other types of flexible modelling components (varying coefficients, surfaces, spatial effects, etc.).
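The truncated power series representation maps directly onto the mixed model matrices: the polynomial part gives $X$ and the truncated polynomials give $Z$. The following is a minimal sketch; the function name `truncated_power_design`, the knot positions and the degree are illustrative assumptions, and the degree is restricted to $d \geq 1$.

```python
import numpy as np

def truncated_power_design(x, knots, degree):
    """Return (X, Z): polynomial columns x^0..x^d and truncated powers (x - kappa_j)_+^d."""
    if degree < 1:
        raise ValueError("this sketch assumes degree >= 1")
    x = np.asarray(x, dtype=float)
    knots = np.asarray(knots, dtype=float)
    X = np.vander(x, degree + 1, increasing=True)                   # columns 1, x, ..., x^d
    Z = np.clip(x[:, None] - knots[None, :], 0.0, None) ** degree   # (x - kappa_j)_+^d
    return X, Z

x = np.linspace(0.0, 1.0, 11)
knots = np.array([0.25, 0.5, 0.75])     # illustrative, equally spaced knots
X, Z = truncated_power_design(x, knots, degree=1)
print(X.shape, Z.shape)                 # fixed-effects and random-effects design matrices
```

With these matrices, $m(x) = X\beta + Zb$ evaluated at the observed $x_i$, so the spline model has the general mixed model form $y = X\beta + Zb + \varepsilon$.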

• Additive mixed models consist of a combination of random effects and flexible modelling components such as penalised splines.
• Example: Childhood malnutrition in Zambia.
• Determine the nutritional status of a child in terms of a Z-score.
• We consider chronic malnutrition measured in terms of insufficient height for age (stunting), i.e.
  $$\mathrm{zscore}_i = \frac{\mathrm{cheight}_i - \mathrm{med}}{s},$$
  where med and $s$ are the median and standard deviation of (age-stratified) height in a reference population.

• Additive mixed model for stunting:
  $$\mathrm{zscore}_i = x_i'\beta + m_1(\mathrm{cage}_i) + m_2(\mathrm{cfeed}_i) + m_3(\mathrm{mage}_i) + m_4(\mathrm{mbmi}_i) + m_5(\mathrm{mheight}_i) + b_{s_i} + \varepsilon_i,$$
  with covariates

  csex      gender of the child (1 = male, 0 = female)
  cfeed     duration of breastfeeding (in months)
  cage      age of the child (in months)
  mage      age of the mother (at birth, in years)
  mheight   height of the mother (in cm)
  mbmi      body mass index of the mother
  medu      education of the mother (1 = no education, 2 = primary school, 3 = elementary school, 4 = higher)
  mwork     employment status of the mother (1 = employed, 0 = unemployed)
  s         residential district (54 districts in total)

• The random effect $b_{s_i}$ captures spatial variability induced by unobserved spatially varying covariates.

• Marginal perspective on a mixed model:
  $$y \sim \mathrm{N}(X\beta, V), \quad\text{where}\quad V = \sigma^2 I + \tau^2 Z \Sigma Z'.$$
• Interpretation: The random effects induce a correlation structure and therefore enable a proper statistical analysis of correlated data.
• Conditional perspective on a mixed model:
  $$y \mid b \sim \mathrm{N}(X\beta + Zb, \sigma^2 I).$$
• Interpretation: Random effects are additional regression coefficients (for example subject-specific effects in longitudinal data) that are estimated subject to a regularisation penalty.
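The two perspectives correspond to two different log-densities, which can be sketched as follows. The function names, the tiny random intercept layout (3 individuals with 2 observations each) and the parameter values are illustrative assumptions, not part of the slides.

```python
import numpy as np

def marginal_loglik(y, X, Z, beta, sigma2, tau2, Sigma=None):
    """Log-density of y ~ N(X beta, V) with V = sigma2 * I + tau2 * Z Sigma Z'."""
    n, q = Z.shape
    Sigma = np.eye(q) if Sigma is None else Sigma
    V = sigma2 * np.eye(n) + tau2 * Z @ Sigma @ Z.T
    r = y - X @ beta
    _, logdet = np.linalg.slogdet(V)
    return -0.5 * (n * np.log(2 * np.pi) + logdet + r @ np.linalg.solve(V, r))

def conditional_loglik(y, X, Z, beta, b, sigma2):
    """Log-density of y | b ~ N(X beta + Z b, sigma2 * I)."""
    n = y.shape[0]
    r = y - X @ beta - Z @ b
    return -0.5 * (n * np.log(2 * np.pi * sigma2) + r @ r / sigma2)

# Random intercept layout: each individual contributes one column of Z.
Z = np.kron(np.eye(3), np.ones((2, 1)))
X = np.ones((6, 1))
sigma2, tau2 = 1.0, 0.5
V = sigma2 * np.eye(6) + tau2 * Z @ Z.T     # block diagonal marginal covariance
print(V)
```

In the printed matrix, observations from the same individual share the covariance $\tau^2$, while observations from different individuals have covariance zero. For $\tau^2 = 0$ the marginal log-density coincides with the conditional one evaluated at $b = 0$.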

• Interest in the following is on the selection of random effects: Compare
  $$M_1: y = X\beta + Zb + \varepsilon, \quad b \sim \mathrm{N}(0, \tau^2 \Sigma), \qquad\text{and}\qquad M_2: y = X\beta + \varepsilon.$$
• Equivalent: Compare the model with random effects ($\tau^2 > 0$) and the model without random effects ($\tau^2 = 0$).
• Random intercept: $\tau^2 > 0$ versus $\tau^2 = 0$ corresponds to the inclusion and exclusion of the random intercept and therefore to the presence or absence of intra-individual correlations.
• Penalised splines: $\tau^2 > 0$ versus $\tau^2 = 0$ differentiates between a spline model and a simple polynomial model. In particular, we can compare linear versus nonlinear models.
