

  1. An Extended Random-effects Approach to Analysing Repeated, Overdispersed Count Data
     Clarice G. B. Demétrio, ESALQ/USP, Piracicaba, SP, Brasil, Clarice.demetrio@usp.br
     joint work with Geert Molenberghs, Hasselt University, Belgium, and
     Geert Verbeke, Katholieke Universiteit Leuven, Belgium
     VIII Encontro dos Alunos Pós-graduação em Estatística e Experimentação Agronômica
     Piracicaba, SP, November 20, 2018

  2. Outline
     • Motivating application: a clinical trial in epileptic patients
     • Generalized linear models
     • Poisson regression models
     • Overdispersion in GLM's
     • Univariate overdispersed count data
     • Longitudinal overdispersed count data
     • Estimation
     • Discussion of the example
     • Final remarks

  3. Motivation: A Clinical Trial in Epileptic Patients
     • a randomized, double-blind, parallel-group, multi-center study comparing placebo with a new anti-epileptic drug (AED)
     • after a 12-week baseline period, 45 epilepsy patients were assigned to the placebo group and 44 to the active (new) treatment group
     • patients were measured weekly for 16 weeks (double-blind), and some for up to 27 weeks in a long-term open-extension study
     • outcome of interest: the number of epileptic seizures experienced during the last week, i.e., since the last time the outcome was measured
     • key research question: whether or not the additional new treatment reduces the number of epileptic seizures

  4. Considerations about the data
     • a very skewed distribution, with the largest observed value equal to 73 seizures in one week

  5. • unstable behavior explained by:
        – presence of extreme values
        – very few observations available at some of the time points, especially past week 20
     • longitudinal count data:
        – discrete data
        – possible correlation between measurements for the same individual

  6. # Observations
        Week   Placebo   Treatment   Total
          1       45         44        89
          5       42         42        84
         10       41         40        81
         15       40         38        78
         16       40         37        77
         17       18         17        35
         20        2          8        10
         27        0          3         3
     • serious drop in the number of measurements past the end of the actual double-blind period, i.e., past week 16

  7. Generalized Linear Models (GLM's)
     – unifying framework for much statistical modelling (Nelder and Wedderburn, 1972)
     – an extension of the standard normal-theory linear model
     – three components:
     • independent random variables Y_i, i = 1, ..., n, from a linear exponential family distribution with means μ_i and constant scale parameter φ,
           f(y) ≡ f(y | θ, φ) = exp{ φ⁻¹ [yθ − ψ(θ)] + c(y, φ) },
       where μ = E[Y] = ψ′(θ) and Var(Y) = φ ψ″(θ);
     • a linear predictor vector η given by η = Xβ, where β is a vector of p unknown parameters and X = [x_1, ..., x_n]^T is the design matrix;
     • a link function g(·) relating the mean to the linear predictor, i.e. g(μ_i) = η_i = x_i^T β
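As a small sketch of the exponential-family machinery above, the Poisson distribution has θ = log(μ), ψ(θ) = exp(θ), φ = 1, and c(y, φ) = −log(y!). The snippet below (an illustration, not from the slides) checks numerically that ψ′(θ) and ψ″(θ) recover the Poisson mean and variance:

```python
import math

# Poisson as a linear exponential family: theta = log(mu), psi(theta) = exp(theta),
# phi = 1, c(y, phi) = -log(y!).  Then mu = psi'(theta) and Var(Y) = phi * psi''(theta).
def psi(theta):
    return math.exp(theta)

def poisson_pmf_ef(y, theta):
    """Poisson density written in exponential-family form."""
    return math.exp(y * theta - psi(theta) - math.lgamma(y + 1))

mu = 3.5
theta = math.log(mu)

# The exponential-family form reproduces the usual Poisson pmf
usual = math.exp(-mu) * mu**4 / math.factorial(4)
assert abs(poisson_pmf_ef(4, theta) - usual) < 1e-12

# Central-difference check: psi'(theta) = mu and psi''(theta) = Var(Y) = mu
h = 1e-5
psi1 = (psi(theta + h) - psi(theta - h)) / (2 * h)
psi2 = (psi(theta + h) - 2 * psi(theta) + psi(theta - h)) / h**2
assert abs(psi1 - mu) < 1e-6 and abs(psi2 - mu) < 1e-3
```

The same check works for any member of the family once ψ(·) is supplied.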

  8. Poisson regression models
     If Y_i, i = 1, ..., n, are counts with means μ_i, the standard Poisson model assumes that Y_i ∼ Pois(μ_i) with
           f(y_i) = e^{−μ_i} μ_i^{y_i} / y_i!
     so that E(Y_i) = μ_i and Var(Y_i) = μ_i (too restrictive!)
     The canonical link function is the log: g(μ_i) = log(μ_i) = η_i, with η_i = x_i^T β.
     For a well-fitting model (Hinde and Demétrio, 1998a,b):
           Residual Deviance ≈ Residual d.f.

  9. Overdispersion in GLM's
     What if Residual Deviance ≫ Residual d.f.?
     (i) badly fitting model:
        • omitted terms/variables
        • incorrect relationship (link)
        • outliers
     (ii) variation greater than predicted by the model ⇒ Overdispersion:
        • count data: Var(Y) > μ
        • counted proportion data: Var(Y) > mπ(1 − π)

 10. Univariate Overdispersed Count Data
     Y_i: counts with means λ_i (Hinde and Demétrio, 1998a,b)
     Negative binomial type variance:
           Y_i | λ_i ∼ Pois(λ_i)  with  log λ_i = x_i^T β,
           E(Y_i | λ_i) = Var(Y_i | λ_i) = λ_i
     • no particular distributional form: E(λ_i) = μ_i and Var(λ_i) = σ_i², giving
           E(Y_i) = μ_i,  Var(Y_i) = μ_i + σ_i²
     • λ_i ∼ Γ(α, β_i):  E[Y_i] = μ_i = αβ_i,  Var(Y_i) = αβ_i(1 + β_i) = μ_i + μ_i²/α  (NegBinII)
     • λ_i ∼ Γ(α_i, β):  E[Y_i] = μ_i = α_i β,  Var(Y_i) = μ_i(1 + β) = φμ_i  (NegBinI)
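The NegBinII moments above can be checked by simulation. The sketch below (illustrative parameter values, not from the epilepsy data) draws λ_i from a gamma distribution and then Y_i from a Poisson, and compares the empirical mean and variance to μ and μ + μ²/α:

```python
import math
import random

# Simulation sketch of the Poisson-gamma (NegBinII) mixture:
# lambda ~ Gamma(alpha, beta) gives E(Y) = alpha*beta = mu and Var(Y) = mu + mu^2/alpha.
# (alpha, beta and the sample size are illustrative choices.)
def poisson_draw(lam, rng):
    """Knuth's multiplication algorithm; fine for small lam."""
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

rng = random.Random(2018)
alpha, beta = 2.0, 1.5            # so mu = alpha*beta = 3.0
n = 200_000
ys = []
for _ in range(n):
    lam = rng.gammavariate(alpha, beta)   # Gamma with shape alpha, scale beta
    ys.append(poisson_draw(lam, rng))

mean = sum(ys) / n
var = sum((y - mean) ** 2 for y in ys) / n
mu = alpha * beta
# Theory: Var(Y) = mu + mu^2/alpha = 3 + 9/2 = 7.5
assert abs(mean - mu) < 0.05
assert abs(var - (mu + mu**2 / alpha)) < 0.3
```

Swapping the roles of shape and scale (λ_i ∼ Γ(α_i, β) with common β) reproduces the NegBinI variance φμ_i instead.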

 11. Poisson-normal model
     Individual-level random effect in the linear predictor:
           Y_i | b_i ∼ Pois(λ_i)  with  log λ_i = x_i^T β + b_i,  where  b_i ∼ N(0, d),
     which gives
           E[Y_i] = e^{x_i^T β + d/2} := μ_i
           Var(Y_i) = e^{x_i^T β + d/2} + e^{2 x_i^T β + d}(e^d − 1) = μ_i + μ_i²(e^d − 1),
     i.e. a variance function of the form Var(Y_i) = μ_i + kμ_i²
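These lognormal-mixture moments can also be verified empirically. A minimal sketch, with illustrative values of the linear predictor η = x^T β and of d:

```python
import math
import random

# Sketch verifying the Poisson-normal moments by simulation:
# log(lambda) = eta + b, b ~ N(0, d)  =>  E(Y) = exp(eta + d/2) = mu and
# Var(Y) = mu + mu^2 * (exp(d) - 1), i.e. the mu + k*mu^2 form on the slide.
# (eta, d and the sample size are illustrative choices.)
def poisson_draw(lam, rng):
    """Knuth's multiplication algorithm; fine for small lam."""
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

rng = random.Random(20)
eta, d, n = 0.5, 0.4, 200_000
ys = [poisson_draw(math.exp(eta + rng.gauss(0.0, math.sqrt(d))), rng)
      for _ in range(n)]

mean = sum(ys) / n
var = sum((y - mean) ** 2 for y in ys) / n
mu = math.exp(eta + d / 2)
assert abs(mean - mu) < 0.05
assert abs(var - (mu + mu**2 * (math.exp(d) - 1))) < 0.3
```

Note that k = e^d − 1 here, so the overdispersion grows with the random-effect variance d, as expected.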

 12. Longitudinal Overdispersed Count Data
     Y_ij: the j-th outcome for subject i, i = 1, ..., N, j = 1, ..., n_i
     Y_i = (Y_i1, ..., Y_in_i)′: the vector of measurements for subject i
     Negative binomial type variance extension:
           Y_ij | λ_ij ∼ Pois(λ_ij),  λ_i = (λ_i1, ..., λ_in_i)′,  with  E(λ_i) = μ_i  and  Var(λ_i) = Σ_i
     Unconditionally, E(Y_i) = μ_i and Var(Y_i) = M_i + Σ_i, where M_i is a diagonal matrix with the vector μ_i along the diagonal

 13. • the diagonal structure of M_i reflects the conditional independence assumption: all dependence between measurements on the same unit stems from the random effects
     • components of λ_i independent: a pure overdispersion model, without correlation between the repeated measures
     • λ_ij = λ_i ⇒ Var(Y_i) = M_i + σ_i² J_{n_i}: a Poisson version of compound symmetry
     • it is also possible to combine general correlation structures between the components of λ_i

 14. Poisson-normal model extension: a GLMM
           Y_ij | b_i ∼ Pois(λ_ij),  ln(λ_ij) = x_ij′ β + z_ij′ b_i,  b_i ∼ N(0, D)
     x_ij and z_ij: p- and q-dimensional vectors of known covariate values
     β: a p-dimensional vector of unknown fixed regression coefficients
     Then, unconditionally, μ_i = E(Y_i) has components
           μ_ij = exp(x_ij′ β + ½ z_ij′ D z_ij)
     and the variance-covariance matrix is
           Var(Y_i) = M_i + M_i (e^{Z_i D Z_i′} − J_{n_i}) M_i

 15. Models Combining Overdispersion With Normal Random Effects
           Y_ij | θ_ij, b_i ∼ Pois(λ_ij),  λ_ij = θ_ij exp(x_ij′ β + z_ij′ b_i),  b_i ∼ N(0, D)
           E(θ_i) = E[(θ_i1, ..., θ_in_i)′] = Φ_i,  Var(θ_i) = Σ_i
     Then μ_i = E(Y_i) has components
           μ_ij = φ_ij exp(x_ij′ β + ½ z_ij′ D z_ij)
     The variance-covariance matrix is
           Var(Y_i) = M_i + M_i (P_i − J_{n_i}) M_i,
     where the (j, k)-th element of P_i is
           p_i,jk = [(σ_i,jk + φ_ij φ_ik) / (φ_ij φ_ik)] exp(½ z_ij′ D z_ik) exp(½ z_ik′ D z_ij)

 16. Estimation for the Poisson-normal and Combined Models
     • random-effects models are fitted by maximizing the marginal likelihood, obtained by integrating the random effects out of the conditional densities
     • the likelihood contribution of subject i is
           f_i(y_i | β, D, φ) = ∫ ∏_{j=1}^{n_i} f_ij(y_ij | b_i, β, φ) f(b_i | D) db_i
     • the likelihood for β, D, and φ is
           L(β, D, φ) = ∏_{i=1}^{N} ∫ ∏_{j=1}^{n_i} f_ij(y_ij | b_i, β, φ) f(b_i | D) db_i
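To make the marginal integral concrete, the sketch below approximates one subject's contribution under the Poisson-normal model with a simple trapezoidal rule over the random intercept b, then checks that the resulting marginal probabilities sum to one. All parameter values are illustrative, not estimates from the epilepsy data, and real software (e.g. adaptive quadrature in NLMIXED) uses more efficient rules:

```python
import math

# Grid (trapezoidal) approximation of the marginal likelihood contribution
#   f_i(y_i) = int prod_j Pois(y_ij; exp(eta_j + b)) * N(b; 0, d) db
# for one subject with n_i = 2 measurements.  (eta values and d are illustrative.)
def pois_pmf(y, lam):
    return math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))

def norm_pdf(b, d):
    return math.exp(-b * b / (2 * d)) / math.sqrt(2 * math.pi * d)

def marginal(y, etas, d, grid=801, width=8.0):
    """Trapezoid rule over b in [-width*sd, width*sd]."""
    sd = math.sqrt(d)
    lo, hi = -width * sd, width * sd
    h = (hi - lo) / (grid - 1)
    total = 0.0
    for i in range(grid):
        b = lo + i * h
        w = 0.5 if i in (0, grid - 1) else 1.0
        val = norm_pdf(b, d)
        for yj, eta in zip(y, etas):
            val *= pois_pmf(yj, math.exp(eta + b))
        total += w * val
    return total * h

etas, d = [0.2, 0.5], 0.3
# Summing the approximate marginal pmf over (essentially) all outcome pairs gives 1
s = sum(marginal((y1, y2), etas, d) for y1 in range(40) for y2 in range(40))
assert abs(s - 1.0) < 1e-4
```

The N-fold product over subjects in L(β, D, φ) is then just a loop over calls like `marginal(...)`, one per subject.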

 17. • key problem: presence of N integrals; in general no closed-form solution exists (Verbeke and Molenberghs, 2000; Molenberghs and Verbeke, 2005). To solve the problem, use:
        – numerical integration: SAS procedure NLMIXED
        – series expansion methods (penalized quasi-likelihood, marginal quasi-likelihood), Laplace approximation, etc.: SAS procedure GLIMMIX
        – a hybrid between analytic and numerical integration
     • in some special cases (linear mixed-effects model, Poisson-normal model) these integrals can be worked out analytically; this is also true for the combined model
     • fully Bayesian inference

 18. Full Marginal Density for the Combined Model
     The joint probability of Y_i takes the form
           P(Y_i = y_i) = ∑_t ∏_{j=1}^{n_i} [ (−1)^{t_j} C(α_j + y_ij + t_j − 1, α_j − 1) C(y_ij + t_j, y_ij) β_j^{y_ij + t_j} ]
                          × exp[ ∑_{j=1}^{n_i} (y_ij + t_j) x_ij′ β ]
                          × exp[ ½ ( ∑_{j=1}^{n_i} (y_ij + t_j) z_ij )′ D ( ∑_{j=1}^{n_i} (y_ij + t_j) z_ij ) ],
     where t = (t_1, ..., t_n_i) ranges over all non-negative integer vectors and C(·, ·) denotes a binomial coefficient
     – special cases can be obtained very easily
     – can be used to implement maximum likelihood estimation, with numerical accuracy governed by the number of terms included in the series
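One special case is easy to check numerically: for a single measurement (n_i = 1) with no normal random effect (D = 0), the alternating series should collapse to the negative-binomial probability. The sketch below verifies this for illustrative values of α and of c = β·exp(x′β); the series converges for c < 1:

```python
import math

# Special-case check of the series expansion: for n_i = 1 and D = 0,
#   P(Y = y) = sum_t (-1)^t C(alpha+y+t-1, alpha-1) C(y+t, y) c^(y+t),  c = beta*kappa,
# should equal the negative-binomial pmf
#   C(alpha+y-1, alpha-1) (1/(1+c))^alpha (c/(1+c))^y.
# (alpha and c are illustrative; alpha is taken to be a positive integer here.)
def series_pmf(y, alpha, c, terms=200):
    s = 0.0
    for t in range(terms):
        s += ((-1) ** t * math.comb(alpha + y + t - 1, alpha - 1)
              * math.comb(y + t, y) * c ** (y + t))
    return s

def negbin_pmf(y, alpha, c):
    return (math.comb(alpha + y - 1, alpha - 1)
            * (1 / (1 + c)) ** alpha * (c / (1 + c)) ** y)

alpha, c = 2, 0.3
for y in range(8):
    assert abs(series_pmf(y, alpha, c) - negbin_pmf(y, alpha, c)) < 1e-10
```

Truncating the outer sum at a finite number of terms is exactly the accuracy trade-off mentioned on the slide.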

 19. Partial marginalization: integrate over the gamma random effects only, leaving the normal random effects untouched
     The corresponding probability is
           P(Y_ij = y_ij | b_i) = C(α_j + y_ij − 1, α_j − 1) [1 / (1 + κ_ij β_j)]^{α_j} [β_j κ_ij / (1 + κ_ij β_j)]^{y_ij},
     where κ_ij = exp(x_ij′ β + z_ij′ b_i)
     – we assume that the gamma random effects are independent within a subject; the correlation is induced by the normal random effects
     – it is easy to obtain the fully marginalized probability by numerically integrating the normal random effects out of P(Y_ij = y_ij | b_i), using the SAS procedure NLMIXED
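This partially marginalized probability is just a negative-binomial pmf for each fixed b_i. A quick sanity check, with all parameter values illustrative (α_j taken as a positive integer so `math.comb` applies):

```python
import math

# The partially marginalized probability for fixed b_i is a proper pmf:
#   P(Y_ij = y | b_i) = C(alpha + y - 1, alpha - 1)
#                       * (1/(1+kappa*beta))^alpha * (beta*kappa/(1+kappa*beta))^y,
# kappa = exp(x'beta_fixed + z'b_i).  (alpha, beta and kappa are illustrative.)
def nb_conditional(y, alpha, beta, kappa):
    p0 = 1.0 / (1.0 + kappa * beta)
    return math.comb(alpha + y - 1, alpha - 1) * p0 ** alpha * (kappa * beta * p0) ** y

alpha, beta = 3, 0.8
kappa = math.exp(0.4 + 0.15)      # illustrative x'beta_fixed = 0.4, z'b_i = 0.15
total = sum(nb_conditional(y, alpha, beta, kappa) for y in range(300))
assert abs(total - 1.0) < 1e-8

# Its conditional mean is alpha * beta * kappa, since E(theta) = alpha*beta
mean = sum(y * nb_conditional(y, alpha, beta, kappa) for y in range(300))
assert abs(mean - alpha * beta * kappa) < 1e-6
```

Numerically integrating this function over b_i (as in the quadrature sketch for the Poisson-normal model) then yields the fully marginalized probability for the combined model.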
