Estimation of discretely observed Markov Jump Processes


  1. Estimation of discretely observed Markov Jump Processes with applications in survival analysis. Salim Serhan, Technical University of Denmark (DTU)

  2. Outline • Problem formulation • Complete-data problem • EM-algorithm • Extensions • Conclusion

  3. Problem formulation • Consider a Markov jump process, { X(t) }_{t ≥ 0}, of dimension k, with initial probability vector π and generator Q = C + D. • X(t) generates a Markovian arrival process (MAP): jumps through C are silent, while jumps through D produce arrivals. • We examine the following estimation problem: we observe the state of X(t) at certain discrete time points, as well as at the time of every arrival in the MAP. • It follows that the states have a physical interpretation. • We wish to estimate θ = (π, C, D).
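As a concrete reference point, the sampling scheme above can be sketched in a short simulation: each jump is drawn from the competing exponential rates in C and D, and jumps through D are flagged as arrivals. This is an illustrative sketch, not code from the deck; the function name `simulate_map` and its interface are assumptions.

```python
import numpy as np

def simulate_map(pi, C, D, T, rng=None):
    """Simulate the Markov jump process X(t) with generator Q = C + D up to
    time T. Jumps through C are silent; jumps through D are arrivals.
    Returns jump times, visited states, and arrival flags."""
    if rng is None:
        rng = np.random.default_rng(0)
    k = len(pi)
    x = rng.choice(k, p=pi)
    t, times, states, arrivals = 0.0, [0.0], [x], [False]
    while True:
        rates = C[x].copy()
        rates[x] = 0.0                          # diagonal is not a jump rate
        rates = np.concatenate([rates, D[x]])   # silent rates, then arrival rates
        total = rates.sum()
        if total <= 0:                          # absorbing state reached
            break
        t += rng.exponential(1.0 / total)
        if t > T:
            break
        idx = rng.choice(2 * k, p=rates / total)
        x = idx % k                             # destination state
        times.append(t)
        states.append(x)
        arrivals.append(bool(idx >= k))         # arrival iff drawn from D block
    return times, states, arrivals
```

Feeding the returned arrivals/states into a plot reproduces the star/cross picture of the next slide.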

  4. Problem formulation: Illustration. Figure: An illustration of the discrete observation sampling scheme (state on the vertical axis, observation times t_1, ..., t_7 on the horizontal axis). The stars are arrivals while the crosses are discrete observations. • Observations are labeled as discrete observations or arrivals.

  5. Problem formulation: An example from survival analysis

$$
\pi = (1, 0, 0, 0, 0), \qquad
C = \begin{pmatrix}
\cdot & c_{12} & c_{13} & 0 & 0 \\
0 & \cdot & 0 & c_{24} & 0 \\
0 & 0 & \cdot & c_{34} & 0 \\
0 & 0 & 0 & \cdot & 0 \\
0 & 0 & 0 & 0 & 0
\end{pmatrix}, \qquad
D = \begin{pmatrix}
0 & 0 & 0 & 0 & d_{15} \\
0 & 0 & 0 & 0 & d_{25} \\
0 & 0 & 0 & 0 & d_{35} \\
0 & 0 & 0 & 0 & d_{45} \\
0 & 0 & 0 & 0 & 0
\end{pmatrix},
$$

where the dots on the diagonal of C are such that the rows of Q = C + D sum to zero, and state 5 is absorbing.

  6. Complete-data problem: Illustration. Figure: A complete sample path of the Markov jump process generating the MAP.

  7. Complete-data problem: Likelihood function • The complete-data likelihood function is

$$
L(\theta) = \prod_{i=1}^{k} \pi_i^{b_i}
\cdot \prod_{i=1}^{k} \prod_{j \neq i} c_{ij}^{n_{ij}} \exp(-c_{ij} z_i)
\cdot \prod_{i=1}^{k} \prod_{j=1}^{k} d_{ij}^{\tilde{n}_{ij}} \exp(-d_{ij} z_i),
$$

where • b_i is the number of processes that start in state i, • z_i is the total time spent in state i, • n_{ij} is the total number of transitions from state i to state j not associated with an arrival, • \tilde{n}_{ij} is the total number of transitions from state i to state j associated with an arrival. • The maximum likelihood estimators are

$$
\hat{\pi}_i = \frac{b_i}{\sum_l b_l}, \qquad
\hat{c}_{ij} = \frac{n_{ij}}{z_i}, \qquad
\hat{d}_{ij} = \frac{\tilde{n}_{ij}}{z_i}. \tag{1}
$$
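The estimators in (1) follow directly from the sufficient statistics. A minimal numpy sketch, assuming the statistics are already collected as arrays; the function name and the normalisation of the initial-state counts are assumptions:

```python
import numpy as np

def complete_data_mle(b, z, n, n_tilde):
    """Closed-form MLEs from the complete-data sufficient statistics.

    b       : (k,)   number of processes starting in each state
    z       : (k,)   total time spent in each state
    n       : (k, k) transition counts not associated with an arrival
    n_tilde : (k, k) transition counts associated with an arrival
    """
    pi_hat = b / b.sum()                       # initial distribution
    with np.errstate(divide="ignore", invalid="ignore"):
        c_hat = np.where(z[:, None] > 0, n / z[:, None], 0.0)
        d_hat = np.where(z[:, None] > 0, n_tilde / z[:, None], 0.0)
    np.fill_diagonal(c_hat, 0.0)               # only off-diagonal rates in C
    return pi_hat, c_hat, d_hat
```

States that were never visited (z_i = 0) get zero rate estimates rather than a division error.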

  8. EM-algorithm • Now consider the case of incomplete data. • We observe a vector of states X = (x_{t_1}, x_{t_2}, ..., x_{t_n}), where t_1 < t_2 < ... < t_n. • We also observe a vector of indicators I = (i_{t_1}, i_{t_2}, ..., i_{t_n}); i_{t_h} equals 1 if the h'th observation is an arrival, 0 otherwise. • The pair (X, I) is the incomplete data. • For the E-step, we need expressions for E(Z_k | X, I), E(N_{ij} | X, I), E(\tilde{N}_{ij} | X, I) and E(B_i | X, I).

  9. EM-algorithm: Some notation • First, some notation. Put ∆_h = t_h − t_{h−1} for h = 2, ..., n, with ∆_1 = t_1. • M^k_{ij}(h) = E(Z_k | X(0) = i, X(∆_h) = j) is the expected sojourn time in state k, given that the process was initialised in state i and is in state j after time ∆_h. • f^{kl}_{ij}(h) = E(N_{kl} | X(0) = i, X(∆_h) = j) is the expected number of jumps from k to l not caused by an arrival, under the same conditioning. • \tilde{f}^{kl}_{ij}(h) = E(\tilde{N}_{kl} | X(0) = i, X(∆_h) = j) is the same as f^{kl}_{ij}(h), but for the number of jumps caused by an arrival.

  10. EM-algorithm: Some notation • Assuming homogeneity, we may then write

$$
E(Z_k \mid X, I) = M^k_{\pi x_{t_1}}(1) + \sum_{h=2}^{n} M^k_{x_{t_{h-1}} x_{t_h}}(h),
$$
$$
E(N_{ij} \mid X, I) = f^{ij}_{\pi x_{t_1}}(1) + \sum_{h=2}^{n} f^{ij}_{x_{t_{h-1}} x_{t_h}}(h),
$$
$$
E(\tilde{N}_{ij} \mid X, I) = \tilde{f}^{ij}_{\pi x_{t_1}}(1) + \sum_{h=2}^{n} \tilde{f}^{ij}_{x_{t_{h-1}} x_{t_h}}(h),
$$
$$
E(B_i \mid X, I) = E(B_i \mid X(t_1) = x_{t_1}, I(t_1) = i_{t_1}).
$$

• Thus, the problem is reduced to finding expressions for M, f, \tilde{f} and E(B_i | X_{t_1}, I_{t_1}).

  11. EM-algorithm: Integral calculation • Define the matrices

$$
M^{kk}(h) = \int_0^{\Delta_h} \exp(Cu)\, e_k e_k' \exp(C(\Delta_h - u))\, \mathrm{d}u,
$$
$$
M^{kl}(h) = \int_0^{\Delta_h} \exp(Cu)\, e_k e_l' \exp(C(\Delta_h - u))\, \mathrm{d}u,
$$

• where e_i is the i'th unit vector of appropriate dimension. • A way to calculate the integrals is

$$
M^{kl}(t) = \begin{pmatrix} I & 0 \end{pmatrix}
\exp\!\left( \begin{pmatrix} C & e_k e_l' \\ 0 & C \end{pmatrix} t \right)
\begin{pmatrix} 0 \\ I \end{pmatrix},
$$

• where I is the identity matrix of dimension k × k and 0 is a matrix of zeroes of dimension k × k.
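The block-matrix identity above (Van Loan's method) reduces the integral to a single matrix exponential: the upper-right block of the exponentiated block matrix is exactly M^{kl}(t). A minimal sketch, assuming `scipy` is available; the function name is illustrative:

```python
import numpy as np
from scipy.linalg import expm

def sojourn_integral(C, k, l, dt):
    """Evaluate M^{kl}(dt) = int_0^dt exp(Cu) e_k e_l' exp(C(dt-u)) du
    as the upper-right block of expm([[C, e_k e_l'], [0, C]] * dt)."""
    m = C.shape[0]
    E = np.zeros((m, m))
    E[k, l] = 1.0                      # the outer product e_k e_l'
    A = np.block([[C, E], [np.zeros((m, m)), C]])
    return expm(A * dt)[:m, m:]        # upper-right block
```

In the scalar case C = (−1), k = l = 1, the integral is ∫_0^1 e^{−u} e^{−(1−u)} du = e^{−1}, which the block computation reproduces.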

  12. EM-algorithm: E-step formulas • The E-step formulas are as follows, when h ≥ 2:

$$
M^k_{ij}(h) = \frac{e_i' M^{kk}(h) D_{i_{t_h}} e_j}{e_i' \exp(C\Delta_h) D_{i_{t_h}} e_j}, \qquad
f^{kl}_{ij}(h) = c_{kl}\, \frac{e_i' M^{kl}(h) D_{i_{t_h}} e_j}{e_i' \exp(C\Delta_h) D_{i_{t_h}} e_j},
$$
$$
\tilde{f}^{kl}_{ij}(h) = 0 \ \text{for } l \neq j, \qquad
\tilde{f}^{kl}_{ij}(h) = d_{kj}\, \frac{e_i' \exp(C\Delta_h)\, e_k}{e_i' \exp(C\Delta_h) D_{i_{t_h}} e_j} \ \text{for } l = j.
$$

• When h = 1, replace all the e_i' vectors by π. Also,

$$
E(B_i \mid X(t_1), I_{t_1}) = \frac{\pi_i\, e_i' \exp(C t_1) D_{i_{t_1}} e_{x_{t_1}}}{\pi \exp(C t_1) D_{i_{t_1}} e_{x_{t_1}}}.
$$

  13. Extensions: Covariates • We can parameterize the transition intensities using covariates. • Let Z denote the covariates. • A popular model in survival analysis is the Cox proportional hazards model λ(t | Z) = λ_0(t) exp(βZ). • This gives an inhomogeneous model, unless we put λ_0(t) = λ.
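The proportional-hazards parameterization amounts to scaling a baseline intensity by exp(βZ); with a constant baseline the model stays time-homogeneous, as the slide notes. A trivial sketch, with illustrative names:

```python
import numpy as np

def cox_intensity(baseline, beta, Z):
    """Cox proportional-hazards form of a transition intensity:
    lambda(Z) = lambda0 * exp(beta' Z), with a constant baseline lambda0
    so the resulting Markov jump process remains homogeneous."""
    return baseline * np.exp(np.dot(beta, Z))
```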

  14. Extensions: Phase-type sojourn times • Exponential sojourn times may be unrealistic. • Consider the Markov jump process Y(t) with the expanded state space { 1_1, ..., 1_{m_1} } ∪ { 2_1, ..., 2_{m_2} } ∪ ... ∪ { k_1, ..., k_{m_k} }, • where m_i, i = 1, 2, ..., k, is the number of sub-states for the i'th batch state. Let m = m_1 + m_2 + ... + m_k denote the dimension of the expanded state space. • Canonical representations should be used, that is, Coxian structures with increasing mean sojourn times. • The sub-states do not have a physical interpretation, i.e. we cannot observe them. • X(t) is then a semi-Markov jump process with the following relation to Y(t): P(X(t) = r | Y(t) = r_i) = 1. • This is a hidden Markov model with deterministic state-dependent distributions.
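A Coxian sub-generator for one batch state can be assembled as follows. This is an illustrative sketch: the function name and the (rates, exit probabilities) parameterization are assumptions, and the deck's canonical form additionally orders the phases by increasing mean sojourn time.

```python
import numpy as np

def coxian_block(rates, exit_probs):
    """Sub-generator of a Coxian phase-type sojourn: phase p leaves at
    rate rates[p]; with probability exit_probs[p] the sojourn ends,
    otherwise the process moves on to phase p + 1. The last phase
    always exits, so exit_probs[-1] should be 1."""
    m = len(rates)
    S = np.zeros((m, m))
    for p in range(m):
        S[p, p] = -rates[p]
        if p < m - 1:
            S[p, p + 1] = rates[p] * (1.0 - exit_probs[p])
    return S
```

Stacking one such block per batch state on the diagonal gives the m × m generator of Y(t); the exit rates −S·1 feed the transitions between batch states.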

  15. Extensions: Estimation with phase-type sojourn times • The likelihood function is

$$
L(\theta) = \pi \left( \prod_{h=1}^{n} \Gamma(h)\, P(x_{t_h}) \right) \mathbf{1},
$$

• where Γ(h) is an m × m matrix whose (i, j)'th element is P(X(∆_h) = j | X(0) = i, I_{t_h} = i_{t_h}). We find these by

$$
\frac{e_i' \exp(C\Delta_h) D_{i_{t_h}} e_j}{e_i' \exp(C\Delta_h) D_{i_{t_h}} \mathbf{1}},
$$

• where 1 is a vector of ones of appropriate dimension, • and P(x_{t_h}) is an m × m diagonal matrix whose i'th diagonal element is P(X(t_h) = x_{t_h} | Y(t_h) = i).
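The matrix-product likelihood can be evaluated with a forward pass over the observations. The sketch below is a simplification with illustrative names: it works in the simplest setting without sub-state expansion (states observed exactly, so the state-dependent matrix picks out one coordinate), and it keeps the interval factors unnormalized so that the product itself is the likelihood, rather than normalizing Γ(h) as the slide does.

```python
import numpy as np
from scipy.linalg import expm

def incomplete_likelihood(pi, C, D, times, states, arrivals):
    """Forward evaluation of the observed-data likelihood:
    alpha <- alpha @ exp(C*dt) [@ D at arrival epochs], then condition
    on the observed state. The final sum of alpha is L(theta)."""
    alpha = pi.copy()
    t_prev = 0.0
    for t, x, arr in zip(times, states, arrivals):
        T = expm(C * (t - t_prev))
        if arr:
            T = T @ D                 # the jump at time t carries an arrival
        alpha = alpha @ T
        mask = np.zeros_like(alpha)
        mask[x] = 1.0
        alpha = alpha * mask          # condition on the observed state
        t_prev = t
    return alpha.sum()
```

With hidden sub-states, the mask would instead be the diagonal of P(x_{t_h}) over all sub-states compatible with the observed batch state.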

  16. Extensions: Misclassification models • With a hidden Markov model defined, we can easily include the possibility of misclassification. • This can be the case when there is uncertainty in the state observations. • In survival analysis, this is known as a censored state. • Let e_{rs} denote the probability of wrongly classifying X(t) in batch-state s when the true batch-state is r. We can write this as P(X(t_h) = s | Y(t_h) = r_i) = e_{rs}. • This gives categorical state-dependent distributions, and we may use the previous likelihood function.

  17. Conclusion • We have extended some EM-algorithms from the literature to account for different observation types. • We have shown how these models may be applied to a particular model from survival analysis. • Covariates can be included, with certain limitations. • We can have phase-type sojourn times, at the cost of a harder estimation problem. • And finally, we can allow uncertainty in the state observations.

  18. Conclusion: Further work • Derive formulas for the Fisher information matrix. • Study the large-sample properties of the algorithm. • Develop estimators for non-homogeneous Markov processes.
