
SLIDE 1

A comparison of some criteria for state selection of the latent Markov model for longitudinal data

Silvia Bacci∗1, Francesco Bartolucci∗, Silvia Pandolfi∗, Fulvia Pennoni∗∗

∗Dipartimento di Economia, Finanza e Statistica - Università di Perugia ∗∗Dipartimento di Statistica - Università di Milano-Bicocca

Università di Catania, Catania, 6-7 September 2012

1silvia.bacci@stat.unipg.it

Bacci, Bartolucci, Pandolfi, Pennoni (unipg, unimib) MBC2 1 / 27

SLIDE 2

Outline

1. Introduction
2. Preliminaries: multivariate basic Latent Markov (LM) model
3. Model selection criteria
4. Monte Carlo study
5. References

SLIDE 3

Introduction

Background: Latent Markov (LM) models (Wiggins, 1973; Bartolucci et al., 2012) are successfully applied in the analysis of longitudinal data: they make it possible to account for several aspects, such as serial dependence between observations, measurement error, and unobservable heterogeneity.

LM models assume that one or more occasion-specific response variables depend only on a discrete latent variable, characterized by a given number of latent states, which in turn depends on the latent variables at the previous occasions according to a first-order Markov chain.

LM models are characterized by several types of parameters: the initial probabilities of belonging to a given latent state, the transition probabilities from one latent state to another, and the conditional response probabilities given the discrete latent variable.

SLIDE 4

Introduction

Problem: a crucial point with LM models is the selection of the number of latent states.

Aim: we compare the behavior of several model selection criteria for choosing the number of latent states. Special attention is devoted to classification-based criteria, which explicitly take into account the partition of the observations into the latent states through a specific measure of the quality of the classification, known as the entropy.

SLIDE 5

Preliminaries: multivariate basic Latent Markov (LM) model

Multivariate basic LM model: notation

• $Y^{(t)} = (Y^{(t)}_1, \dots, Y^{(t)}_r)$: vector of discrete categorical response variables $Y_j$ ($j = 1, \dots, r$) observed at time $t$ ($t = 1, \dots, T$), with $Y_j$ having $c_j$ categories

• $Y = (Y^{(1)}, \dots, Y^{(T)})$: vector of observed responses obtained by stacking the vectors $Y^{(t)}$; usually it refers to repeated measurements of the same variables $Y_j$ ($j = 1, \dots, r$) on the same individuals at different time points

• $U^{(t)}$: latent state at time $t$, with state space $\{1, \dots, k\}$

• $U = (U^{(1)}, \dots, U^{(T)})$: vector describing the latent process

SLIDE 6

Preliminaries: multivariate basic Latent Markov (LM) model

Multivariate basic LM model: main assumptions

• The vectors $Y^{(t)}$ ($t = 1, \dots, T$) are conditionally independent given the latent process $U$, and the response variables in each $Y^{(t)}$ are conditionally independent given $U^{(t)}$ (local independence); i.e., each occasion-specific observed variable $Y^{(t)}_j$ is independent of $Y^{(t-1)}_j, \dots, Y^{(1)}_j$ and of each $Y^{(t)}_h$, $h \neq j$ ($h, j = 1, \dots, r$), given $U^{(t)}$

• The latent process $U$ follows a first-order Markov chain with $k$ latent states; i.e., each latent variable $U^{(t)}$ is independent of $U^{(t-2)}, \dots, U^{(1)}$, given $U^{(t-1)}$

[Path diagram of the LM model: $U^{(1)} \to U^{(2)} \to \cdots \to U^{(T)}$, with each $U^{(t)}$ pointing to its occasion-specific responses $Y^{(t)}_1, \dots, Y^{(t)}_r$]

SLIDE 7

Preliminaries: multivariate basic Latent Markov (LM) model

Multivariate basic LM model: parameters

• $k \sum_{j=1}^{r} (c_j - 1)$ conditional response probabilities
$$\phi^{(t)}_{jy|u} = p(Y^{(t)}_j = y \mid U^{(t)} = u), \quad j = 1, \dots, r; \; t = 1, \dots, T; \; u = 1, \dots, k; \; y = 0, \dots, c_j - 1,$$
with
$$\phi^{(t)}_{y|u} = \prod_{j=1}^{r} \phi^{(t)}_{jy_j|u} = p(Y^{(t)}_1 = y_1, \dots, Y^{(t)}_r = y_r \mid U^{(t)} = u)$$

• $(k - 1)$ initial probabilities $\pi_u = p(U^{(1)} = u)$, $u = 1, \dots, k$

• $(T - 1)k(k - 1)$ transition probabilities $\pi^{(t|t-1)}_{u|v} = p(U^{(t)} = u \mid U^{(t-1)} = v)$, $t = 2, \dots, T$; $u, v = 1, \dots, k$

• $\#\mathrm{par} = k \sum_{j=1}^{r} (c_j - 1) + (k - 1) + (T - 1)k(k - 1)$
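The parameter count can be sketched as a small helper. This is a minimal Python rendering of the slide's formula (the function name `n_par` is ours); note that the count keeps the conditional response probabilities time-homogeneous, as in the formula above.

```python
def n_par(k, c, T):
    """Free parameters of the basic multivariate LM model.

    k: number of latent states
    c: list of category counts c_j, one per response variable
    T: number of time occasions
    """
    cond = k * sum(cj - 1 for cj in c)   # conditional response probabilities
    init = k - 1                          # initial probabilities
    trans = (T - 1) * k * (k - 1)         # transition probabilities
    return cond + init + trans

# e.g. k = 2 states, r = 3 binary responses, T = 5 occasions:
# 2*3 + 1 + 4*2*1 = 15 free parameters
print(n_par(2, [2, 2, 2], 5))
```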

SLIDE 8

Preliminaries: multivariate basic Latent Markov (LM) model

Multivariate basic LM model: probability distributions

$$p(U = u) = \pi_{u_1} \prod_{t=2}^{T} \pi^{(t|t-1)}_{u_t|u_{t-1}} = \pi_{u_1} \cdot \pi^{(2|1)}_{u_2|u_1} \cdots \pi^{(T|T-1)}_{u_T|u_{T-1}}$$

$$p(Y = y \mid U = u) = \prod_{t=1}^{T} \phi^{(t)}_{y^{(t)}|u_t} = \phi^{(1)}_{y^{(1)}|u_1} \cdot \phi^{(2)}_{y^{(2)}|u_2} \cdots \phi^{(T)}_{y^{(T)}|u_T}$$

Manifest distribution of $Y$:

$$p(Y = y) = \sum_{u} p(Y = y, U = u) = \sum_{u} p(U = u) \cdot p(Y = y \mid U = u) = \sum_{u_1} \pi_{u_1} \phi^{(1)}_{y^{(1)}|u_1} \sum_{u_2} \pi^{(2|1)}_{u_2|u_1} \phi^{(2)}_{y^{(2)}|u_2} \cdots \sum_{u_T} \pi^{(T|T-1)}_{u_T|u_{T-1}} \phi^{(T)}_{y^{(T)}|u_T}$$

$$= \sum_{u_1} \sum_{u_2} \cdots \sum_{u_T} \pi_{u_1} \prod_{t=2}^{T} \pi^{(t|t-1)}_{u_t|u_{t-1}} \prod_{t=1}^{T} \phi^{(t)}_{y^{(t)}|u_t}$$

Note that computing $p(Y = y)$ directly involves all the $k^T$ possible configurations of the vector $u$.
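To illustrate the cost just noted, the manifest probability can be evaluated by brute force, enumerating all $k^T$ latent paths. This is a minimal sketch for a single categorical response per occasion with time-homogeneous parameters (function and argument names are ours); the forward recursion presented later avoids this enumeration.

```python
from itertools import product

def manifest_prob(y, pi, Pi, Phi):
    """Brute-force p(Y = y): sum over all k^T latent paths u = (u_1, ..., u_T).

    y:   list of T observed responses (category indices)
    pi:  initial probabilities, length k
    Pi:  transition probabilities, Pi[v][u] = p(U^(t)=u | U^(t-1)=v)
    Phi: conditional response probabilities, Phi[u][c] = p(Y^(t)=c | U^(t)=u)
         (assumed time-homogeneous for simplicity)
    """
    k, T = len(pi), len(y)
    total = 0.0
    for u in product(range(k), repeat=T):        # all k^T configurations
        p = pi[u[0]] * Phi[u[0]][y[0]]
        for t in range(1, T):
            p *= Pi[u[t - 1]][u[t]] * Phi[u[t]][y[t]]
        total += p
    return total
```

Summing `manifest_prob` over every possible response sequence returns 1, which is a quick sanity check of the enumeration.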

SLIDE 9

Preliminaries: multivariate basic Latent Markov (LM) model

Multivariate basic LM model: maximum likelihood (ML) estimation

Log-likelihood of the model:

$$\ell(\theta) = \sum_{y} n(y) \log[p(Y = y)]$$

• $\theta$: vector of all model parameters ($\pi_u$, $\pi^{(t|t-1)}_{u|v}$, $\phi^{(t)}_{jy|u}$)

• $n(y)$: frequency of the response configuration $y$ in the sample

$\ell(\theta)$ may be maximized with respect to $\theta$ by an Expectation-Maximization (EM) algorithm (Dempster et al., 1977)

SLIDE 10

Preliminaries: multivariate basic Latent Markov (LM) model

EM algorithm

Complete data log-likelihood of the model:

$$\ell^*(\theta) = \sum_{j=1}^{r} \sum_{t=1}^{T} \sum_{u=1}^{k} \sum_{y=0}^{c_j - 1} a^{(t)}_{juy} \log \phi^{(t)}_{jy|u} + \sum_{u=1}^{k} b^{(1)}_u \log \pi_u + \sum_{t=2}^{T} \sum_{v=1}^{k} \sum_{u=1}^{k} b^{(t)}_{vu} \log \pi^{(t|t-1)}_{u|v}$$

• $a^{(t)}_{juy}$: frequency of subjects responding by $y$ to the $j$-th response variable and belonging to latent state $u$ at time $t$

• $b^{(1)}_u$: frequency of subjects in latent state $u$ at time 1

• $b^{(t)}_{vu}$: frequency of subjects who move from latent state $v$ to latent state $u$ at time $t$

SLIDE 11

Preliminaries: multivariate basic Latent Markov (LM) model

EM algorithm

The algorithm alternates two steps until convergence of $\ell(\theta)$:

• E: compute the expected values of the frequencies $a^{(t)}_{juy}$, $b^{(1)}_u$, and $b^{(t)}_{vu}$, given the observed data and the current value of $\theta$, so as to obtain the expected value of $\ell^*(\theta)$

• M: update $\theta$ by maximizing the expected value of $\ell^*(\theta)$ obtained above; explicit solutions for the parameter estimates are available

The E-step is performed by means of certain recursions which may be easily implemented in matrix notation (Bartolucci, 2006)
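The explicit M-step solutions are simple proportional allocations of the expected counts. The following is a deliberately simplified illustration for one occasion and one response variable (names and the flattened layout are ours), not the full model update:

```python
def normalize(counts):
    """Turn a vector of expected counts into probabilities."""
    tot = sum(counts)
    return [c / tot for c in counts]

def m_step(a_uy, b1_u, b_vu):
    """Explicit M-step updates from expected counts (illustration only):

    a_uy[u][y]: expected count of response y in state u   -> phi_{y|u}
    b1_u[u]:    expected count of state u at time 1       -> pi_u
    b_vu[v][u]: expected count of transitions v -> u      -> pi_{u|v}
    """
    phi = [normalize(row) for row in a_uy]   # conditional response probabilities
    pi = normalize(b1_u)                     # initial probabilities
    Pi = [normalize(row) for row in b_vu]    # transition probabilities
    return phi, pi, Pi
```

Each parameter block is updated by normalizing the corresponding expected counts, which is why the M-step needs no numerical optimization.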

SLIDE 12

Preliminaries: multivariate basic Latent Markov (LM) model

Forward and backward recursions

To efficiently compute the probability $p(Y = y)$ and the posterior probabilities $f^{(t)}_{u|y}$ and $f^{(t|t-1)}_{u|v,y}$, we can use forward and backward recursions to obtain the following intermediate quantities.

Forward recursions:

$$q^{(t)}_{u,y} = p(U^{(t)} = u, Y^{(1)} = y^{(1)}, \dots, Y^{(t)} = y^{(t)}) = \sum_{v=1}^{k} q^{(t-1)}_{v,y} \, \pi^{(t|t-1)}_{u|v} \, \phi^{(t)}_{y^{(t)}|u}, \quad u = 1, \dots, k,$$

starting with $q^{(1)}_{u,y} = \pi_u \phi^{(1)}_{y^{(1)}|u}$.

Backward recursions:

$$\bar q^{(t)}_{v,y} = p(Y^{(t+1)} = y^{(t+1)}, \dots, Y^{(T)} = y^{(T)} \mid U^{(t)} = v) = \sum_{u=1}^{k} \bar q^{(t+1)}_{u,y} \, \pi^{(t+1|t)}_{u|v} \, \phi^{(t+1)}_{y^{(t+1)}|u}, \quad v = 1, \dots, k,$$

starting with $\bar q^{(T)}_{v,y} = 1$.
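A minimal sketch of both recursions, for a single categorical response per occasion with time-homogeneous parameters (function and variable names are ours):

```python
def forward_backward(y, pi, Pi, Phi):
    """Forward-backward recursions for one response sequence y.

    pi[u]     : initial probabilities
    Pi[v][u]  : transition probabilities p(U^(t)=u | U^(t-1)=v)
    Phi[u][c] : conditional response probabilities p(Y^(t)=c | U^(t)=u)
    Returns (q, qbar, p_y) with q[t][u] = p(U^(t)=u, Y^(1..t)),
    qbar[t][u] = p(Y^(t+1..T) | U^(t)=u), and p_y = p(Y = y).
    """
    k, T = len(pi), len(y)
    # forward: q^(1)_u = pi_u * phi_{y^(1)|u}, then recurse over t
    q = [[pi[u] * Phi[u][y[0]] for u in range(k)]]
    for t in range(1, T):
        q.append([sum(q[t - 1][v] * Pi[v][u] for v in range(k)) * Phi[u][y[t]]
                  for u in range(k)])
    # backward: qbar^(T)_v = 1, then recurse down from t = T-1
    qbar = [[1.0] * k for _ in range(T)]
    for t in range(T - 2, -1, -1):
        qbar[t] = [sum(qbar[t + 1][u] * Pi[v][u] * Phi[u][y[t + 1]] for u in range(k))
                   for v in range(k)]
    p_y = sum(q[-1])   # p(Y = y) = sum_u q^(T)_{u,y}
    return q, qbar, p_y
```

The posterior probabilities of the next slides then follow directly, e.g. $f^{(t)}_{u|y}$ as `q[t][u] * qbar[t][u] / p_y`, at cost $O(Tk^2)$ instead of the $O(k^T)$ enumeration.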

SLIDE 13

Model selection criteria

A crucial point with LM models concerns the selection of $k$, the number of latent states. We may rely on the literature about finite mixture models and hidden Markov models. The most well-known criteria are

• Akaike's Information Criterion (AIC; Akaike, 1973): $\mathrm{AIC} = -2\ell(\hat\theta) + 2 \cdot \#\mathrm{par}$

and its variants:

• Consistent AIC (CAIC): $\mathrm{CAIC} = -2\ell(\hat\theta) + \#\mathrm{par} \cdot (\log(n) + 1)$

• AIC3: $\mathrm{AIC3} = -2\ell(\hat\theta) + 3 \cdot \#\mathrm{par}$

• Bayesian Information Criterion (BIC; Schwarz, 1978): $\mathrm{BIC} = -2\ell(\hat\theta) + \#\mathrm{par} \cdot \log(n)$
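Given the maximized log-likelihood, the number of free parameters, and the sample size, the four criteria reduce to one-liners (a sketch; the function name is ours, and smaller values indicate the preferred model):

```python
from math import log

def info_criteria(loglik, n_par, n):
    """AIC-family criteria for one fitted LM model (smaller is better).

    loglik: maximized log-likelihood l(theta-hat)
    n_par:  number of free parameters (#par)
    n:      sample size
    """
    return {
        "AIC":  -2 * loglik + 2 * n_par,
        "AIC3": -2 * loglik + 3 * n_par,
        "BIC":  -2 * loglik + n_par * log(n),
        "CAIC": -2 * loglik + n_par * (log(n) + 1),
    }
```

In practice one fits the model for each candidate $k$ and retains the $k$ minimizing the chosen criterion.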

SLIDE 14

Model selection criteria

Classification-based criteria

Some criteria are developed in the context of the classification likelihood approach, based on the relation $\ell^*(\theta) = \ell(\theta) - \mathrm{EN}$, where EN is the entropy: a penalization term which measures the quality of the partition, defined as (Hernando et al., 2005)

$$\mathrm{EN} = -\sum_{u_1} \cdots \sum_{u_T} f_{u_1,\dots,u_T|y} \log(f_{u_1,\dots,u_T|y})$$

$$= -\sum_{u_1} \cdots \sum_{u_T} f^{(1)}_{u_1|y} \cdot f^{(2|1)}_{u_2|u_1,y} \cdots f^{(t|t-1)}_{u_t|u_{t-1},y} \cdots f^{(T|T-1)}_{u_T|u_{T-1},y} \cdot \left[\log(f^{(1)}_{u_1|y}) + \log(f^{(2|1)}_{u_2|u_1,y}) + \cdots + \log(f^{(T|T-1)}_{u_T|u_{T-1},y})\right]$$

SLIDE 15

Model selection criteria

with

$$f^{(t)}_{u|y} = \frac{q^{(t)}_{u,y} \, \bar q^{(t)}_{u,y}}{p(Y = y)}$$

$$f^{(t|t-1)}_{u|v,y} = \frac{f^{(t-1,t)}_{v,u|y}}{f^{(t-1)}_{v|y}} = \frac{q^{(t-1)}_{v,y} \, \pi^{(t|t-1)}_{u|v} \, \phi^{(t)}_{y^{(t)}|u} \, \bar q^{(t)}_{u,y}}{p(Y = y)} \cdot \frac{p(Y = y)}{q^{(t-1)}_{v,y} \, \bar q^{(t-1)}_{v,y}} = \frac{\pi^{(t|t-1)}_{u|v} \, \phi^{(t)}_{y^{(t)}|u} \, \bar q^{(t)}_{u,y}}{\bar q^{(t-1)}_{v,y}}$$

SLIDE 16

Model selection criteria

We may also formulate an approximation of EN, under the assumption that the $U^{(t)}$ are independent given $Y$:

$$\mathrm{EN}_1 = -\sum_{t=1}^{T} \sum_{u=1}^{k} f^{(t)}_{u|y} \log(f^{(t)}_{u|y})$$

or a possible variant of $\mathrm{EN}_1$:

$$\mathrm{EN}_2 = \mathrm{EN}_1 / T$$

Example with $T = 3$:

$$\mathrm{EN} = -\sum_{u} \sum_{v} \sum_{z} f_{u,v,z|y} \log(f_{u,v,z|y}) = -\sum_{u} \sum_{v} \sum_{z} f^{(3|2)}_{z|v,y} \cdot f^{(2|1)}_{v|u,y} \cdot f^{(1)}_{u|y} \cdot \left[\log(f^{(3|2)}_{z|v,y}) + \log(f^{(2|1)}_{v|u,y}) + \log(f^{(1)}_{u|y})\right]$$

$$\mathrm{EN}_1 = -\sum_u f^{(1)}_{u|y} \log(f^{(1)}_{u|y}) - \sum_v f^{(2)}_{v|y} \log(f^{(2)}_{v|y}) - \sum_z f^{(3)}_{z|y} \log(f^{(3)}_{z|y})$$

$$\mathrm{EN}_2 = \tfrac{1}{3}\mathrm{EN}_1$$
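Given the marginal posteriors $f^{(t)}_{u|y}$ (e.g., from the forward-backward quantities), the two approximations are direct to compute. A sketch, where the argument `f` is a $T \times k$ table of posteriors and the function names are ours:

```python
from math import log

def en1(f):
    """EN1: entropy approximation assuming the U^(t) are independent given Y.

    f[t][u] = marginal posterior f^(t)_{u|y}; each row sums to 1.
    Zero probabilities contribute 0 (the p*log(p) -> 0 limit).
    """
    return -sum(p * log(p) for row in f for p in row if p > 0)

def en2(f):
    """EN2: EN1 averaged over the T time occasions."""
    return en1(f) / len(f)
```

For completely uninformative posteriors ($f^{(t)}_{u|y} = 1/k$ for all $t, u$), EN1 equals $T\log k$, its maximum; for a degenerate classification it equals 0.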

SLIDE 17

Model selection criteria

Some classification-based criteria are (McLachlan and Peel, 2000, Chap. 6):

• Classification Likelihood information Criterion (CLC): $\mathrm{CLC} = -2\ell(\hat\theta) + 2 \cdot \mathrm{EN}$

• Approximated Integrated Classification Likelihood criterion (ICL-BIC): $\mathrm{ICL\text{-}BIC} = \mathrm{BIC} + 2 \cdot \mathrm{EN}$

• Normalized Entropy Criterion (NEC): $\mathrm{NEC} = \dfrac{\mathrm{EN}}{\ell(\hat\theta) - \ell_1(\hat\theta)}$ for $k \geq 2$, where $\ell_1(\hat\theta)$ is the maximum log-likelihood for $k = 1$, and $\mathrm{NEC} = 1$ if $k = 1$

• Approximated NECs: $\mathrm{NEC}_1 = \dfrac{\mathrm{EN}_1}{\ell(\hat\theta) - \ell_1(\hat\theta)}$ and $\mathrm{NEC}_2 = \dfrac{\mathrm{EN}_2}{\ell(\hat\theta) - \ell_1(\hat\theta)}$, $k \geq 2$
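Assuming the entropy EN and the maximized log-likelihoods of the $k$-state and 1-state models are available, these criteria can be sketched as (the function name and argument layout are ours):

```python
def classification_criteria(loglik, loglik_k1, bic, en, k):
    """CLC, ICL-BIC, and NEC for one fitted LM model.

    loglik:    maximized log-likelihood of the k-state model
    loglik_k1: maximized log-likelihood of the 1-state model
    bic:       BIC of the k-state model
    en:        entropy EN of the fitted partition
    """
    return {
        "CLC": -2 * loglik + 2 * en,
        "ICL-BIC": bic + 2 * en,
        # NEC is defined as 1 for k = 1; for k >= 2 it normalizes EN
        # by the log-likelihood gain over the 1-state model
        "NEC": 1.0 if k == 1 else en / (loglik - loglik_k1),
    }
```

Substituting EN1 or EN2 for `en` in the NEC entry yields the approximated criteria NEC1 and NEC2.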

SLIDE 18

Monte Carlo study

Monte Carlo simulation study

We compare:

• AIC, CAIC, AIC3, BIC
• CLC, ICL-BIC, NEC, NEC1, NEC2

on 100 samples of a given size $n$, drawn from a multivariate LM model characterized by $r$ binary ($y = 0, 1$) response variables observed at $T$ time occasions, $k$ latent states, and given values of the initial probabilities $\pi_u$, transition probabilities $\pi^{(t|t-1)}_{u|v}$, and conditional response probabilities $\phi^{(t)}_{jy|u}$:

• $n = 250, 500, 1000$
• $r = 1, 3, 5$
• $T = 5, 10$
• $k = 2, 3$

All analyses are implemented in the R software.

SLIDE 19

Monte Carlo study

Main results

Scenario 1: $n = 250$, $T = 5$, $k = 2$

• $\phi^{(t)}_{j0|u=1} = 0.8 = \phi^{(t)}_{j1|u=2}$, $\phi^{(t)}_{j0|u=2} = 0.2 = \phi^{(t)}_{j1|u=1}$
• $\pi_1 = 0.5 = \pi_2$
• $\pi^{(t|t-1)}_{1|1} = 0.7 = \pi^{(t|t-1)}_{2|2}$, $\pi^{(t|t-1)}_{1|2} = 0.3 = \pi^{(t|t-1)}_{2|1}$ (time-homogeneity assumption)
• $r = 1, 3, 5$

SLIDE 20

Monte Carlo study

Main results

Scenario 1: relative frequencies of k chosen on the basis of the several criteria

k    BIC    AIC    AIC3   CAIC   NEC    NEC1   NEC2   CLC    ICL-BIC
r = 1
1    0.52   0.00   0.10   0.63   1.00   1.00   0.99   1.00   1.00
2    0.48   0.98   0.90   0.37   0.00   0.00   0.01   0.00   0.00
3    0.00   0.02   0.00   0.00   0.00   0.00   0.00   0.00   0.00
r = 3
1    0.00   0.00   0.00   0.00   0.88   0.92   0.00   0.88   0.95
2    1.00   0.83   0.98   1.00   0.10   0.07   0.96   0.10   0.04
3    0.00   0.16   0.02   0.00   0.01   0.01   0.04   0.01   0.01
4    0.00   0.01   0.00   0.00   0.01   0.00   0.00   0.01   0.00
r = 5
1    0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00
2    1.00   0.77   1.00   1.00   1.00   1.00   1.00   1.00   1.00
3    0.00   0.15   0.00   0.00   0.00   0.00   0.00   0.00   0.00
4    0.00   0.06   0.00   0.00   0.00   0.00   0.00   0.00   0.00
5    0.00   0.02   0.00   0.00   0.00   0.00   0.00   0.00   0.00

SLIDE 21

Monte Carlo study

Main results

Scenario 2: $n = 250$, $T = 5$, $k = 2$

• $\phi^{(t)}_{j0|u=1} = 0.7 = \phi^{(t)}_{j1|u=2}$, $\phi^{(t)}_{j0|u=2} = 0.3 = \phi^{(t)}_{j1|u=1}$
• $\pi_1 = 0.5 = \pi_2$
• $\pi^{(t|t-1)}_{1|1} = 0.9 = \pi^{(t|t-1)}_{2|2}$, $\pi^{(t|t-1)}_{1|2} = 0.1 = \pi^{(t|t-1)}_{2|1}$ (time-homogeneity assumption)
• $r = 1, 3, 5$

SLIDE 22

Monte Carlo study

Main results

Scenario 2: relative frequencies of k chosen on the basis of the several criteria

k    BIC    AIC    AIC3   CAIC   NEC    NEC1   NEC2   CLC    ICL-BIC
r = 1
1    0.35   0.01   0.02   0.53   1.00   1.00   1.00   1.00   1.00
2    0.65   0.98   0.97   0.47   0.00   0.00   0.00   0.00   0.00
3    0.00   0.01   0.01   0.00   0.00   0.00   0.00   0.00   0.00
r = 3
1    0.00   0.00   0.00   0.00   1.00   1.00   0.09   1.00   1.00
2    1.00   0.92   0.995  1.00   0.00   0.00   0.855  0.00   0.00
3    0.00   0.07   0.005  0.00   0.00   0.00   0.015  0.00   0.00
4    0.00   0.01   0.00   0.00   0.00   0.00   0.015  0.00   0.00
5    0.00   0.00   0.00   0.00   0.00   0.00   0.025  0.00   0.00
r = 5
1    0.00   0.00   0.00   0.00   0.285  0.77   0.00   0.285  0.55
2    1.00   0.78   0.995  1.00   0.59   0.22   0.98   0.59   0.445
3    0.00   0.205  0.005  0.00   0.03   0.005  0.015  0.035  0.005
4    0.00   0.01   0.00   0.00   0.07   0.005  0.005  0.070  0.00
5    0.00   0.005  0.00   0.00   0.025  0.00   0.000  0.025  0.00

SLIDE 23

Monte Carlo study

Main results

Scenario 3: $n = 500$, $T = 5$, $k = 3$

• $\phi^{(t)}_{j0|u=1} = 0.9 = \phi^{(t)}_{j1|u=2}$, $\phi^{(t)}_{j0|u=2} = 0.1 = \phi^{(t)}_{j1|u=1}$, $\phi^{(t)}_{j0|u=3} = 0.4$, $\phi^{(t)}_{j1|u=3} = 0.6$
• $\pi_1 = \pi_2 = \pi_3 = 0.33$
• $\pi^{(t|t-1)}_{1|1} = \pi^{(t|t-1)}_{2|2} = \pi^{(t|t-1)}_{3|3} = 0.80$, $\pi^{(t|t-1)}_{2|1} = 0.15 = \pi^{(t|t-1)}_{2|3}$, $\pi^{(t|t-1)}_{3|1} = 0.05 = \pi^{(t|t-1)}_{1|3}$, $\pi^{(t|t-1)}_{1|2} = 0.10 = \pi^{(t|t-1)}_{3|2}$ (time-homogeneity assumption)
• $r = 1, 3, 5$

SLIDE 24

Monte Carlo study

Main results

Scenario 3: relative frequencies of k chosen on the basis of the several criteria

k    BIC    AIC    AIC3   CAIC   NEC    NEC1   NEC2   CLC    ICL-BIC
r = 1
1    0.00   0.00   0.00   0.00   1.00   1.00   0.92   1.00   1.00
2    1.00   0.98   0.99   1.00   0.00   0.00   0.07   0.00   0.00
3    0.00   0.02   0.01   0.00   0.00   0.00   0.01   0.00   0.00
r = 3
1    0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00
2    0.03   0.00   0.00   0.10   1.00   1.00   1.00   1.00   1.00
3    0.97   0.81   1.00   0.90   0.00   0.00   0.00   0.00   0.00
4    0.00   0.19   0.00   0.00   0.00   0.00   0.00   0.00   0.00
r = 5
1    0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00
2    0.00   0.00   0.00   0.00   1.00   1.00   1.00   1.00   1.00
3    1.00   0.78   0.99   1.00   0.00   0.00   0.00   0.00   0.00
4    0.00   0.20   0.01   0.00   0.00   0.00   0.00   0.00   0.00
5    0.00   0.02   0.00   0.00   0.00   0.00   0.00   0.00   0.00

SLIDE 25

Monte Carlo study

Conclusions

We compared several criteria for the selection of the number of latent states in LM models. We observed that:

• AIC, BIC, and their variants show a better general behavior than the classification-based criteria
• classification-based criteria tend to underestimate the true number of latent states, mainly in the univariate case
• the behavior of the classification-based criteria improves as the number of observed response variables increases
• as the number $k$ of latent states increases, the performance of all the considered criteria worsens

As a further development of this work, we would like to study in depth extended versions of the entropy and of the classification-based criteria, in order to improve the selection of the number of latent states. We will refer to the most recent developments in the context of hidden Markov models: see Durand and Guédon (2012) for a discussion of the tendency of the entropy to overestimate the uncertainty and for a new proposal to decompose the global entropy into conditional entropies.

SLIDE 26

References

Main references

• Akaike, H. (1973). Information theory and an extension of the maximum likelihood principle. In Petrov, B. N. and Csaki, F., editors, Second International Symposium on Information Theory, pages 267–281, Budapest. Akademiai Kiado.
• Bartolucci, F. (2006). Likelihood inference for a class of latent Markov models under linear hypotheses on the transition probabilities. Journal of the Royal Statistical Society, Series B, 68:155–178.
• Bartolucci, F., Farcomeni, A., and Pennoni, F. (2012). Latent Markov Models for Longitudinal Data: Applications in Social Science and Economics. Chapman & Hall.
• Dempster, A. P., Laird, N. M., and Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm (with discussion). Journal of the Royal Statistical Society, Series B, 39:1–38.
• Durand, J.-B. and Guédon, Y. (2012). Localizing the latent structure canonical uncertainty: entropy profiles for hidden Markov models. Research Report 7896, Project-Teams Mistis and Virtual Plants.
• Hernando, D., Crespi, V., and Cybenko, G. (2005). Efficient computation of the hidden Markov model entropy for a given observation sequence. IEEE Transactions on Information Theory, 51(7):2681–2685.
• McLachlan, G. and Peel, D. (2000). Finite Mixture Models. Wiley.
• Schwarz, G. (1978). Estimating the dimension of a model. Annals of Statistics, 6:461–464.
• Wiggins, L. (1973). Panel Analysis: Latent Probability Models for Attitude and Behavior Processes. Elsevier, Amsterdam.
