oversigt course 02429 analysis of correlated data mixed
play

Oversigt Course 02429 Analysis of correlated data: Mixed Linear - PowerPoint PPT Presentation

Overview Oversigt Course 02429 Analysis of correlated data: Mixed Linear Models Module 1: Introduction to mixed models Overview 1 Simple intro example 2 Per Bruun Brockhoff The mixed model 3 DTU Compute Building 324 - room 220 Missing


  1. Overview Oversigt Course 02429 Analysis of correlated data: Mixed Linear Models Module 1: Introduction to mixed models Overview 1 Simple intro example 2 Per Bruun Brockhoff The mixed model 3 DTU Compute Building 324 - room 220 Missing value example 4 Technical University of Denmark 2800 Lyngby – Denmark e-mail: perbb@dtu.dk Why use mixed models 5 Per Bruun Brockhoff (perbb@dtu.dk) Mixed Linear Models, Module 1 Fall 2014 1 / 20 Per Bruun Brockhoff (perbb@dtu.dk) Mixed Linear Models, Module 1 Fall 2014 2 / 20 Overview Overview Course Preface Modules of the course Module 1: General introduction to mixed models. Randomized blocks design. Applied statistics: Analysis of variance and regression analysis Module 2: Factor structure diagrams. Limited settings in basic courses Module 3: Drying of beech wood - a case study, part I. Restrictive assumptions Module 4: Mixed model theory, part I. This course: Module 5: Hierarchical random effects Wider settings Relax assumptions on independence and variance homogeneity Module 6: Model diagnostics Tool: Mixed Linear (Normal) Models Module 7: The analysis of split-plot design data Provides the basis for continuing to Module 8: Analysis of covariance Non-normal data (including binary, category and ordinal scales) Module 9: Random coefficient models Non-linear models Module 10: Mixed model theory, part II Module 11: Repeated measures, simple methods. Module 12: Repeated measures, advanced methods. Per Bruun Brockhoff (perbb@dtu.dk) Mixed Linear Models, Module 1 Fall 2014 3 / 20 Per Bruun Brockhoff (perbb@dtu.dk) Mixed Linear Models, Module 1 Fall 2014 4 / 20

  2. Overview Simple intro example Content of Module 1: Oversigt Overview 1 Course material preface Introductory example. Simple intro example 2 Randomized blocks design. Random versus fixed block effect. The mixed model 3 Comparison of fixed and mixed model. Example with missing values. Missing value example 4 Why use mixed models? R introduction. Why use mixed models 5 Per Bruun Brockhoff (perbb@dtu.dk) Mixed Linear Models, Module 1 Fall 2014 5 / 20 Per Bruun Brockhoff (perbb@dtu.dk) Mixed Linear Models, Module 1 Fall 2014 6 / 20 Simple intro example Simple intro example Introductory example: NIR predictions of HPLC Simple analysis by paired t-test measurements The uncertainty of the estimated difference: SE d = s d √ n = 0 . 2953 √ = 0 . 0934 Data: 10 HPLC NIR Difference Hypothesis test: Tablet 1 10.4 10.1 0.3 d = − 0 . 05 Tablet 2 10.6 10.8 -0.2 t = 0 . 0934 = − 0 . 535 SE d Tablet 3 10.2 10.2 0.0 Tablet 4 10.1 9.9 0.2 A 95%-confidence band: Tablet 5 10.3 11 -0.7 d ± t 0 . 975 (9) SE d ⇐ ⇒ − 0 . 05 ± 0 . 21 Tablet 6 10.7 10.5 0.2 Tablet 7 10.3 10.2 0.1 Result: No significant method difference. (P-value = 0 . 61 ) Tablet 8 10.9 10.9 0.0 The statistical model: Tablet 9 10.1 10.4 -0.3 d i = µ + ε i , ε ∼ N (0 , σ 2 ) , Tablet 10 9.8 9.9 -0.1 Aim: Study the method differences. Model parameter estimates: µ = d, ˆ σ = s d ˆ Per Bruun Brockhoff (perbb@dtu.dk) Mixed Linear Models, Module 1 Fall 2014 7 / 20 Per Bruun Brockhoff (perbb@dtu.dk) Mixed Linear Models, Module 1 Fall 2014 8 / 20

  3. Simple intro example Simple intro example Simple analysis by an ANOVA approach The problem of the ANOVA approach The uncertainty of the average NIR value within the model: Randomized Blocks setup with 10 "blocks" (the tablets) and two "treatments" (the methods): ˆ σ SE ( y 1 ) = √ 10 = 0 . 066 y ij = µ + α i + β j + ε ij , ε ij ∼ N (0 , σ 2 ) , This is "wrong"! Same analysis as the paired t-test: The tablet-to-tablet variation is ignored. Source of Degrees of Sums of Mean F P Using ONLY the 10 NIR-values gives: variation freedom squares squares s 1 s 1 = 0 . 4012 , SE ( y 1 ) = √ 10 = 0 . 127 Tablets 9 2.0005 0.2223 5.10 0.0118 Methods 1 0.0125 0.0125 0.29 0.6054 The ANOVA approach: Residual 9 0.3925 0.0436 Is valid only for statements about the 10 specific tablets in the The uncertainty of the average method difference: experiment. The 2nd approach: � 1 � 10 + 1 � Considers the 10 tablets as a random sample. (But ignores the SE ( y 2 − y 1 ) = σ 2 ˆ = 0 . 0934 information from the HPLC measurements) 10 Is valid for tablets in general. Per Bruun Brockhoff (perbb@dtu.dk) Mixed Linear Models, Module 1 Fall 2014 9 / 20 Per Bruun Brockhoff (perbb@dtu.dk) Mixed Linear Models, Module 1 Fall 2014 10 / 20 The mixed model The mixed model Oversigt The solution: Overview 1 Combine the random sample assumption with a model for the entire data set! Simple intro example 2 The mixed linear model: Considers the tablet differences as random effects : The mixed model 3 y ij = µ + a i + β j + ε ij , ε ij ∼ N (0 , σ 2 ) , a i ∼ N (0 , σ 2 T ) . Missing value example 4 Note: Statements about method DIFFERENCES are the same for the two kinds of validity. Why use mixed models 5 Per Bruun Brockhoff (perbb@dtu.dk) Mixed Linear Models, Module 1 Fall 2014 11 / 20 Per Bruun Brockhoff (perbb@dtu.dk) Mixed Linear Models, Module 1 Fall 2014 12 / 20

  4. The mixed model The mixed model Comparison of fixed and mixed model Analysis by mixed model ANOVA table as for the fixed model: A model can be characterized by three features: Source of Degrees of Sums of Mean E(MS) The expected value of the ij th observation y ij 1 variation freedom squares squares The variance of the ij th observation y ij 2 2 σ 2 T + σ 2 The relation between two different observations Tablets 9 2.0005 0.2223 3 σ 2 + 10 � β 2 (covariance/correlation) Methods 1 0.0125 0.0125 j σ 2 Residual 9 0.3925 0.0436 A comparison of the two models: Variance components estimated by: Fixed model Mixed model 1. E ( y ij ) µ + α i + β j µ + β j T = 0 . 2223 − 0 . 0436 σ 2 = 0 . 0436 , ˆ σ 2 σ 2 T + σ 2 σ 2 2. var ( y ij ) ˆ = 0 . 0894 2 ′ ) σ 2 3. cov ( y ij , y i ′ j ′ ) 0 T (if i = i ′ ) ′ ) The uncertainty of the average NIR-value in the mixed model is: ( j � = j 0 (if i � = i In summary: (for the mixed model) � √ 0 . 0894 + 0 . 0436 σ 2 σ 2 ˆ T + ˆ The tablet differences become a part of the variance structure. SE ( y 1 ) = √ = √ = 0 . 115 The observations are no longer independent 10 10 Per Bruun Brockhoff (perbb@dtu.dk) Mixed Linear Models, Module 1 Fall 2014 13 / 20 Per Bruun Brockhoff (perbb@dtu.dk) Mixed Linear Models, Module 1 Fall 2014 14 / 20 Missing value example Missing value example Oversigt Example with missing values Data: HPLC NIR Difference Tablet 1 10.4 10.1 0.3 Overview 1 Tablet 2 10.6 10.8 -0.2 Tablet 3 10.2 10.2 0.0 Tablet 4 10.1 9.9 0.2 Tablet 5 10.3 11 -0.7 Simple intro example 2 Tablet 6 10.7 10.5 0.2 Tablet 7 10.3 10.2 0.1 Tablet 8 10.9 10.9 0.0 The mixed model 3 Tablet 9 10.1 10.4 -0.3 Tablet 10 9.8 9.9 -0.1 Tablet 11 10.8 Tablet 12 9.8 Missing value example 4 Tablet 13 10.5 Tablet 14 10.3 Tablet 15 9.7 Why use mixed models 5 Tablet 16 10.3 Tablet 17 9.6 Tablet 18 10.0 Tablet 19 10.2 Tablet 20 9.9 Per Bruun Brockhoff (perbb@dtu.dk) Mixed Linear Models, Module 1 Fall 2014 15 / 20 Per Bruun Brockhoff (perbb@dtu.dk) Mixed Linear Models, Module 1 Fall 2014 16 / 20

  5. Missing value example Missing value example Analysis by fixed effects ANOVA Analysis by mixed model The results (as given by PROC MIXED in SAS): σ 2 = 0 . 0435 , ANOVA table: σ 2 ˆ ˆ T = 0 . 1019 Source of Degrees of Sums of Mean F P β 2 − ˆ ˆ SE (ˆ β 2 − ˆ β 1 = − 0 . 07211 , β 1 ) = 0 . 0870 variation freedom squares squares Tablets 19 3.7230 0.1959 4.49 0.0129 Information from all data is used! Methods 1 0.0125 0.0125 0.29 0.6054 Consider ONLY tablets 11-20: Two (independent) samples t-test setup Residual 9 0.3925 0.0436 with 5 tablets in each group: The results for the method differences EXACTLY as before: y 1 = 10 . 22 , s 1 = 0 . 4658 , y 2 = 10 . 00 , s 2 = 0 . 2739 � 1 � 10 + 1 � SE (ˆ β 2 − ˆ σ 2 β 1 ) = ˆ = 0 . 0934 The results from the two separate analyzes can be summarized as: 10 Tablets 1-10 Tablets 11-20 Difference -0.05 -0.22 The information in Tablets 11-20 are NOT used! SE 2 0.00872 0.0584 The mixed model analysis combines these two sets of information! Per Bruun Brockhoff (perbb@dtu.dk) Mixed Linear Models, Module 1 Fall 2014 17 / 20 Per Bruun Brockhoff (perbb@dtu.dk) Mixed Linear Models, Module 1 Fall 2014 18 / 20 Why use mixed models Why use mixed models Oversigt Why use mixed models? Overview 1 To avoid mistakes. Simple intro example 2 To broaden the statistical inference. To recover all relevant information in the data. The mixed model 3 To be able to handle correlation structures in the data. To be able to handle non-homogeneous variances. Missing value example 4 To use the only reasonable model for the data! Why use mixed models 5 Per Bruun Brockhoff (perbb@dtu.dk) Mixed Linear Models, Module 1 Fall 2014 19 / 20 Per Bruun Brockhoff (perbb@dtu.dk) Mixed Linear Models, Module 1 Fall 2014 20 / 20

Download Presentation
Download Policy: The content available on the website is offered to you 'AS IS' for your personal information and use only. It cannot be commercialized, licensed, or distributed on other websites without prior consent from the author. To download a presentation, simply click this link. If you encounter any difficulties during the download process, it's possible that the publisher has removed the file from their server.

Recommend


More recommend