Analysis of variance and regression May 13, 2008

Repeated measurements over time
• Presentation of data
• Traditional ways of analysis
• Variance component model (the dogs revisited)
• Random regression
• Baseline considerations

Lene Theil Skovgaard, Dept. of Biostatistics, Institute of Public Health, University of Copenhagen e-mail: L.T.Skovgaard@biostat.ku.dk http://staff.pubhealth.ku.dk/~pd/regression08_1

Repeated measurements, May 2008

Traditional presentation of longitudinal data:
Ex: Aspirin absorption for healthy and ill subjects (Matthews et al., 1990)
Comparison of groups at each time point:
• mass significance problem (multiple testing)
• tests are not independent
• interpretation may be difficult

What is the purpose of the investigation?
• Description of time course
• Comparison of groups: in which respect?
  – level, trend, ...
  – overall pattern

Why is this difficult (or at least different from usual analyses)?
• We have several measurements on each individual
  – the traditional independence assumption is violated
  – repeated observations on the same individual are correlated (they look alike)
  – ignoring this correlation may lead to bias, wrong standard errors and therefore potentially misleading conclusions
• The time course may be quite irregular, with no obvious structure, and must then be treated as a class variable (using many parameters) in ANOVA-type models (variance component models)
• The time course may vary between individuals (random regression)

Notation from multi-level models:

level   unit                  covariate
1       single observations   time effects
2       individuals           treatment effects

If we fail to take the correlation between measurements on the same individual into account, we will experience:
• possible bias in the mean value structure
• low efficiency (type 2 error) for evaluation of level 1 covariates (time-related effects)
• too small standard errors (type 1 error) for estimates of level 2 effects (treatments)

Possible bias?
[Figure: individual time courses and the average curve]
Sometimes referred to as the healthy worker effect.

Missing values
• MCAR: Missing completely at random
• MAR: Missing at random (may depend on past observations)
• NR: Informative missing (non-random), depends on the missing value itself

Level 1 covariates (unit: single observations), e.g.
• Time itself
• Covariates varying with time: blood pressure, heart rate, age
If correlation is not taken into account, we ignore the paired situation, leading to low efficiency, i.e. too large P-values (type 2 error).
Effects may go undetected!
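A brief sketch of why pairing matters, assuming the random-level model that these slides introduce later for the dogs: an individual level $a_i$ with variance $\omega_B^2$ plus a residual $\varepsilon_{it}$ with variance $\sigma_W^2$. The indices $i$, $j$, $t$ and the symbol $\mu_t$ are notation chosen here for illustration.

```latex
\begin{align*}
Y_{it} &= \mu_t + a_i + \varepsilon_{it}, \\
\mathrm{Var}(Y_{i1} - Y_{i2}) &= \mathrm{Var}(\varepsilon_{i1} - \varepsilon_{i2}) = 2\sigma_W^2
  && \text{(paired: the level } a_i \text{ cancels)}, \\
\mathrm{Var}(Y_{i1} - Y_{j2}) &= 2(\omega_B^2 + \sigma_W^2)
  && \text{(what an unpaired analysis assumes, } i \neq j\text{)}.
\end{align*}
```

An analysis that ignores the pairing therefore works with a variance that is too large by $2\omega_B^2$, which is where the efficiency is lost.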

Level 2 covariates (unit: individuals), e.g.
• Treatment
• Gender, age
If correlation is ignored, we act as if we have (a lot) more information than we actually have, leading to too small P-values (type 1 error).
'Noise' may be taken to be real effects!
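How much "more information than we actually have" can be sketched with the usual design effect for equally correlated measurements, 1 + (n - 1)ρ. The function names and the numbers below (11 individuals, 4 measurements each, correlation 0.65) are illustrative assumptions, chosen to resemble the dog example later in these slides.

```python
def design_effect(n_per_individual, rho):
    """Variance inflation for a level 2 comparison when the n measurements
    on each individual are equally correlated with correlation rho."""
    if n_per_individual < 1:
        raise ValueError("need at least one measurement per individual")
    if not -1.0 <= rho <= 1.0:
        raise ValueError("rho must be a correlation")
    return 1.0 + (n_per_individual - 1) * rho


def effective_sample_size(n_individuals, n_per_individual, rho):
    """Number of independent observations carrying the same information
    about a level 2 effect as the correlated data set."""
    total = n_individuals * n_per_individual
    return total / design_effect(n_per_individual, rho)


# Illustrative values (assumptions, not results from the slides)
deff = design_effect(4, 0.65)
n_eff = effective_sample_size(11, 4, 0.65)
print(f"design effect: {deff:.2f}, effective sample size: {n_eff:.1f} of 44")
```

With these values the 44 measurements are worth roughly 15 independent ones for comparing treatments, so an analysis that treats all 44 as independent reports standard errors that are too small.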

Average curves may hide important structures!
• They give no indication of the variation in the time profiles
• Comparisons between groups should not be performed for each time point separately
• Comparisons between time points cannot be judged from the curves (they are paired)

The model must describe the characteristic differences between individuals, and the rest (noise, error) should be of an unsystematic, random nature.
• Do not average over individual profiles unless these have identical shapes, i.e. only shifts in level are seen between individuals.
• Alternative: Calculate individual characteristics

Individual time profiles (spaghettiogram), divided into groups
• Do we see time profiles of identical shape?
• Are the averages representative?

Commonly used characteristics
• The response at selected times, e.g. the endpoint
• Average over a specific period of time
• The slope, perhaps for a specific period
• Peak value
• Time to peak
• The area under the curve (AUC)
• A measure of cyclic behaviour
These are analysed as new observations.
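The characteristics listed above can be computed per individual before any group comparison. The following is a minimal sketch; the function name, the returned keys and the example profile are hypothetical, the AUC uses the trapezoidal rule, and the slope is an ordinary least-squares slope over all time points.

```python
def summary_measures(times, values):
    """Reduce one individual's time profile to a few summary characteristics."""
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need at least two (time, value) pairs of equal length")

    n = len(times)
    peak = max(values)
    time_to_peak = times[values.index(peak)]  # first time the peak is reached

    # Area under the curve by the trapezoidal rule
    auc = sum(
        (times[k + 1] - times[k]) * (values[k] + values[k + 1]) / 2
        for k in range(n - 1)
    )

    # Ordinary least-squares slope of value on time
    t_mean = sum(times) / n
    v_mean = sum(values) / n
    sxy = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values))
    sxx = sum((t - t_mean) ** 2 for t in times)
    slope = sxy / sxx

    return {
        "endpoint": values[-1],
        "mean": v_mean,  # simple average, not weighted by time spacing
        "slope": slope,
        "peak": peak,
        "time_to_peak": time_to_peak,
        "auc": auc,
    }


# Hypothetical profile for a single individual
result = summary_measures([0, 1, 2, 4], [0.0, 4.0, 6.0, 2.0])
print(result)
```

Each individual then contributes one number per characteristic, and the groups can be compared with an ordinary two-sample method, since observations from different individuals are independent.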

Ex: Aspirin
• time to peak
• peak value
Conclusion: P=0.02 for identity of peak values.
Quantifications!

Example: 2 groups of dogs (5 and 6 dogs, respectively).
Average profiles of osmolality, measured 4 times (including treatments along the way)

Do we have 'identical' repetitions (except for level)?

Model control
Residual plot for 2-way ANOVA in (dog, treatment)
We see a clear trumpet shape, because dogs with a high level also vary more than dogs with a low level.
Multiplicative structure
Solution: Make a logarithmic transformation!
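One way to read "multiplicative structure", sketched under the assumption that dog and treatment effects act as factors on the original scale and that the error factor has the same distribution for all observations. The symbols $\alpha_d$, $\beta_t$ and $e_{dt}$ are illustrative notation, not taken from the slides.

```latex
\begin{align*}
Y_{dt} &= \alpha_d \, \beta_t \, e_{dt}
  && \Rightarrow \quad \mathrm{SD}(Y_{dt}) \propto \alpha_d \beta_t
  \quad \text{(trumpet shape)}, \\
\log Y_{dt} &= \log \alpha_d + \log \beta_t + \log e_{dt}
  && \Rightarrow \quad \text{additive effects, constant residual variance}.
\end{align*}
```

On the logarithmic scale the usual additive ANOVA assumptions are then plausible again.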

Profiles on logarithmic scale, with corresponding residual plot:

Multilevel model structure:

level        1                                   2
unit         single measurements                 individuals
variation    within individuals, $\sigma_W^2$    between individuals, $\omega_B^2$
covariates   x: time, grp*time                   z: grp

Multilevel models are part of the broader class of variance component models (which are not necessarily hierarchical).

Two-level model:
• Observations $Y_{gdt}$ (group, dog, time)
• Random dog level, $\mathrm{Var}(a_{gd}) = \omega_B^2$
• Residual variation, within dogs, $\mathrm{Var}(\varepsilon_{gdt}) = \sigma_W^2$
• Systematic effect of time and grp

  proc mixed data=dog;
    class grp time_no dog;
    model losmol=grp time_no grp*time_no / ddfm=satterth;
    random dog(grp);
  run;
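Written out as an equation, the two-level model might look as follows. This is a sketch: the symbol $\mu_{gt}$ for the systematic part is an assumption made here, as are the independence and normality statements, which are the standard ones for this kind of model.

```latex
\begin{align*}
Y_{gdt} &= \mu_{gt} + a_{gd} + \varepsilon_{gdt}, \qquad
  a_{gd} \sim N(0, \omega_B^2), \quad
  \varepsilon_{gdt} \sim N(0, \sigma_W^2) \text{ independent}, \\
\mathrm{Var}(Y_{gdt}) &= \omega_B^2 + \sigma_W^2, \\
\mathrm{Cov}(Y_{gdt_1}, Y_{gdt_2}) &= \mathrm{Var}(a_{gd}) = \omega_B^2
  \qquad (t_1 \neq t_2).
\end{align*}
```

Here $\mu_{gt}$ collects the fixed effects grp, time_no and grp*time_no from the model statement, and $a_{gd}$ corresponds to the statement random dog(grp).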

This model assumes the so-called compound symmetry, i.e. that all measurements on the same individual are equally correlated:

\[ \mathrm{Corr}(Y_{gdt_1}, Y_{gdt_2}) = \rho = \frac{\omega_B^2}{\omega_B^2 + \sigma_W^2} \]

This means that the distance in time is not taken into account!

Two-level model with random dog level:

Class     Levels   Values
grp       2        1 2
time_no   4        1 2 3 4
dog       11       1 2 3 4 5 6 7 8 9 10 11

Covariance Parameter Estimates
                       Standard   Z
Cov Parm    Estimate   Error      Value   Pr Z
dog(grp)    0.06587    0.03532    1.86    0.0311
Residual    0.03554    0.009672   3.67    0.0001

Type 3 Tests of Fixed Effects
              Num   Den
Effect        DF    DF    F Value   Pr > F
grp           1     9     2.85      0.1257
time_no       3     27    21.35     <.0001
grp*time_no   3     27    2.50      0.0805

P=0.08 for the test of interaction, i.e. no convincing indication of this.

Factor diagram:

  [I] = [Dog*Time] → [Dog] → Grp
  [I] = [Dog*Time] → Grp*Time → Grp
  Grp*Time → Time

We have used the notation [ ] for the random effects, corresponding to variance components.
We may note the following:
• The effect of Grp*Time is evaluated against Dog*Time
• If Grp*Time is not considered significant, we thereafter evaluate
  – Time against Dog*Time
  – Grp against Dog(Grp)

The variance component model with random dog level specifies the covariance structure:

\[
\begin{pmatrix}
\omega_B^2 + \sigma_W^2 & \omega_B^2 & \omega_B^2 & \omega_B^2 \\
\omega_B^2 & \omega_B^2 + \sigma_W^2 & \omega_B^2 & \omega_B^2 \\
\omega_B^2 & \omega_B^2 & \omega_B^2 + \sigma_W^2 & \omega_B^2 \\
\omega_B^2 & \omega_B^2 & \omega_B^2 & \omega_B^2 + \sigma_W^2
\end{pmatrix}
= (\omega_B^2 + \sigma_W^2)
\begin{pmatrix}
1 & \rho & \rho & \rho \\
\rho & 1 & \rho & \rho \\
\rho & \rho & 1 & \rho \\
\rho & \rho & \rho & 1
\end{pmatrix}
\]

called the compound symmetry structure. The correlation $\rho$ is here estimated to

\[
\rho = \mathrm{Corr}(Y_{gdt_1}, Y_{gdt_2}) = \frac{\omega_B^2}{\omega_B^2 + \sigma_W^2}
\approx \frac{0.06587}{0.06587 + 0.03554} = 0.65
\]
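The arithmetic above can be reproduced directly from the two variance component estimates. This is a small sketch in plain Python; the function name is hypothetical, and the inputs are the rounded estimates printed in the covariance parameter table.

```python
def compound_symmetry(omega2_b, sigma2_w, n_times):
    """Build the compound symmetry covariance and correlation matrices
    from the between-individual and within-individual variances."""
    total = omega2_b + sigma2_w
    rho = omega2_b / total
    cov = [
        [total if i == j else omega2_b for j in range(n_times)]
        for i in range(n_times)
    ]
    corr = [
        [1.0 if i == j else rho for j in range(n_times)]
        for i in range(n_times)
    ]
    return cov, corr, rho


# Rounded estimates from the covariance parameter table (dog(grp), Residual)
cov, corr, rho = compound_symmetry(0.06587, 0.03554, 4)
print(f"rho = {rho:.4f}")
for row in corr:
    print("  ".join(f"{value:.4f}" for value in row))
```

Starting from the rounded estimates, the result agrees with the 0.65 quoted above; a small difference in the fourth decimal from software output is to be expected, presumably because the software works with unrounded estimates.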

Note that the specification 'random dog(grp);' can be written in two other ways:

  random intercept / subject=dog(grp);
  repeated time / type=CS subject=dog(grp);

In the following, we shall see generalisations of the constructions above.

Compound symmetry analysis

  proc mixed data=dog;
    class grp time dog;
    model losmol=grp time grp*time / ddfm=satterth;
    repeated time / type=cs subject=dog(grp) rcorr;
  run;

Covariance Parameter Estimates
Cov Parm   Subject    Estimate
CS         dog(grp)   0.06587
Residual              0.03554

Fit Statistics
-2 Res Log Likelihood      14.8
AIC (smaller is better)    18.8

Estimated R Correlation Matrix for dog(grp) 1 1
Row   Col1     Col2     Col3     Col4
1     1.0000   0.6496   0.6496   0.6496
2     0.6496   1.0000   0.6496   0.6496
3     0.6496   0.6496   1.0000   0.6496
4     0.6496   0.6496   0.6496   1.0000

Type 3 Tests of Fixed Effects
              Num   Den
Effect        DF    DF    F Value   Pr > F
grp           1     9     2.85      0.1257
time_no       3     27    21.35     <.0001
grp*time_no   3     27    2.50      0.0805

The option ddfm=satterth (or kenwardroger):
• When the distributions are exact (in balanced situations), it has no effect
• When approximations are necessary, these are considered best
  – in unbalanced situations, i.e. for almost all observational designs
  – in case of missing observations
• It may give rise to fractional degrees of freedom
• The computations may require a little more time, but in most cases this will not be noticeable
• When in doubt, use it!
