Applied Statistics and Data Modeling
Part 3: Analysis of Variance - Two-way ANOVA

Luc Duchateau (1), Paul Janssen (2)
(1) Faculty of Veterinary Medicine, Ghent University, Belgium
(2) Center for Statistics, Hasselt University, Belgium

2020

Overview

- Introducing two-way data sets
- Why factorial experiments?
- Constructing models for two-way data
- ANOVA for two-way data
- Testing specific hypotheses for two-way data
- ANOVA for two-way data with exactly 1 replication

Introducing two-way data sets

Example 1: Treatment effect on PCV for Boran and Holstein cows with trypanosomosis

Cowid  Breed     Drug     PCV-before  PCV-after  PCV-difference
1      Boran     Berenil  18.4        26.3       7.9
2      Boran     Berenil  20.3        28.1       7.8
3      Boran     Berenil  22.2        27.8       5.6
4      Boran     Samorin  16.3        30.1       13.8
5      Boran     Samorin  15.4        27.3       11.9
6      Boran     Samorin  19.2        32.7       13.5
7      Holstein  Berenil  21.3        28.3       7.0
8      Holstein  Berenil  17.4        26.8       9.4
9      Holstein  Berenil  18.2        25.8       7.6
10     Holstein  Samorin  22.2        38.1       15.9
11     Holstein  Samorin  19.8        32.3       12.5
12     Holstein  Samorin  20.4        30.8       10.4
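A two-way data set like Example 1 is usually summarized first by the mean response in each treatment combination. The sketch below does this for the PCV-difference column; the values are copied from the table, while the helper name `cell_means` is only an illustrative choice and not part of the slides.

```python
from collections import defaultdict

# PCV-difference values copied from the Example 1 table: (breed, drug, difference).
records = [
    ("Boran", "Berenil", 7.9), ("Boran", "Berenil", 7.8), ("Boran", "Berenil", 5.6),
    ("Boran", "Samorin", 13.8), ("Boran", "Samorin", 11.9), ("Boran", "Samorin", 13.5),
    ("Holstein", "Berenil", 7.0), ("Holstein", "Berenil", 9.4), ("Holstein", "Berenil", 7.6),
    ("Holstein", "Samorin", 15.9), ("Holstein", "Samorin", 12.5), ("Holstein", "Samorin", 10.4),
]


def cell_means(rows):
    """Return the mean response for each (breed, drug) combination."""
    groups = defaultdict(list)
    for breed, drug, value in rows:
        groups[(breed, drug)].append(value)
    return {cell: sum(values) / len(values) for cell, values in groups.items()}


means = cell_means(records)
for (breed, drug), mean in sorted(means.items()):
    print(f"{breed:<9} {drug:<8} {mean:6.2f}")
```

Each cell mean is based on 3 cows, since the design has 3 replications per breed and drug combination.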

Example 2: Milk production as a function of parity and inoculation dose

Cowid  Parity       Inoculation dose  Milk0  Milk48  Reduction (%)
1      heifer       high              32.4   30.2    6.79
2      heifer       high              33.6   32.3    3.87
3      heifer       medium            29.3   20.5    30.03
4      heifer       medium            34.4   21.3    38.08
5      heifer       low               31.3   14.5    53.67
6      heifer       low               35.3   13.4    62.04
7      multiparous  high              42.4   39.5    6.84
8      multiparous  high              43.3   39.7    8.31
9      multiparous  medium            45.2   23.9    47.12
10     multiparous  medium            44.4   24.8    44.14
11     multiparous  low               41.5   6.7     83.86
12     multiparous  low               45.2   4.1     90.93
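The slide does not state how the Reduction (%) column is obtained. The sketch below assumes it is the drop from Milk0 to Milk48 expressed as a percentage of Milk0; under that assumption the computed values match the tabulated ones to two decimals. The function name `reduction_percent` is illustrative only.

```python
# (Milk0, Milk48, reported reduction in %) copied from the Example 2 table.
milk = [
    (32.4, 30.2, 6.79), (33.6, 32.3, 3.87),
    (29.3, 20.5, 30.03), (34.4, 21.3, 38.08),
    (31.3, 14.5, 53.67), (35.3, 13.4, 62.04),
    (42.4, 39.5, 6.84), (43.3, 39.7, 8.31),
    (45.2, 23.9, 47.12), (44.4, 24.8, 44.14),
    (41.5, 6.7, 83.86), (45.2, 4.1, 90.93),
]


def reduction_percent(milk0, milk48):
    """Assumed definition: relative drop in milk production, in percent of Milk0."""
    return 100.0 * (milk0 - milk48) / milk0


computed = [round(reduction_percent(m0, m48), 2) for m0, m48, _ in milk]
reported = [r for _, _, r in milk]
for cow_id, (c, r) in enumerate(zip(computed, reported), start=1):
    print(f"cow {cow_id:2d}: computed {c:6.2f}  reported {r:6.2f}")
```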

Why factorial experiments?

Factorial versus 'one at a time'

Investigate two factors separately, e.g.
→ Compare heifers with multiparous cows at high inoculation dose
→ Multiparous cows show a higher reduction, and are thus more appropriate as experimental model
→ Compare inoculation doses for multiparous cows

Investigate two factors jointly, e.g.
→ Take 6 heifers and 6 multiparous cows
→ Assign inoculation doses at random so that each inoculation dose has 2 heifers and 2 multiparous cows
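The random assignment described above can be sketched as a shuffle within each parity group. This is only one possible way to do it; the animal labels (`H1`, `M1`, ...), the function name `assign_doses` and the seed are hypothetical and not taken from the slides.

```python
import random
from collections import Counter


def assign_doses(animals_by_parity, doses, seed=None):
    """Randomly assign doses within each parity group, each dose equally often."""
    rng = random.Random(seed)
    assignment = {}
    for parity, animals in animals_by_parity.items():
        per_dose, remainder = divmod(len(animals), len(doses))
        if remainder:
            raise ValueError("group size must be a multiple of the number of doses")
        # One label per animal, every dose repeated per_dose times, then shuffled.
        labels = [dose for dose in doses for _ in range(per_dose)]
        rng.shuffle(labels)
        for animal, dose in zip(animals, labels):
            assignment[animal] = (parity, dose)
    return assignment


# Hypothetical animal identifiers: 6 heifers and 6 multiparous cows.
animals = {
    "heifer": [f"H{k}" for k in range(1, 7)],
    "multiparous": [f"M{k}" for k in range(1, 7)],
}
design = assign_doses(animals, ["low", "medium", "high"], seed=2020)
counts = Counter(design.values())
for cell, n in sorted(counts.items()):
    print(cell, n)
```

Shuffling within each parity group, rather than over all 12 animals at once, is what guarantees exactly 2 heifers and 2 multiparous cows per dose.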

Disadvantages of the 'one factor at a time' approach

- We do not evaluate all treatment combinations
- We cannot evaluate interaction between the 2 factors
- Treatment combinations (of the two factors) of the first experiment are not randomly assigned with respect to those of the second experiment
- Logistically more demanding because 2 experiments are needed

Advantages of the factorial approach

- More replications due to 'hidden' replication
  → In the case of no interaction, all subjects receiving a treatment level of one factor can be considered replications for that factor level
- Interaction can be evaluated
- More easily generalizable in the absence of interaction
  → In the case of no interaction, results of one factor are valid at all levels of the other factor
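As a small illustration of 'hidden' replication, the sketch below counts the animals per treatment combination and per factor level in the design of Example 2 (parity by inoculation dose). The design is copied from that table; the variable names are illustrative.

```python
from collections import Counter

# Design of Example 2: (parity, inoculation dose) for cows 1 to 12.
design = [
    ("heifer", "high"), ("heifer", "high"),
    ("heifer", "medium"), ("heifer", "medium"),
    ("heifer", "low"), ("heifer", "low"),
    ("multiparous", "high"), ("multiparous", "high"),
    ("multiparous", "medium"), ("multiparous", "medium"),
    ("multiparous", "low"), ("multiparous", "low"),
]

per_cell = Counter(design)                              # replication per treatment combination
per_parity = Counter(parity for parity, _ in design)    # replication per parity level
per_dose = Counter(dose for _, dose in design)          # replication per dose level

print("per cell:  ", dict(per_cell))
print("per parity:", dict(per_parity))
print("per dose:  ", dict(per_dose))
```

Each treatment combination has only 2 cows, but if there is no interaction a parity level is compared on 6 cows and a dose level on 4 cows.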

Constructing models for two-way data

Assume that all population means are known!

Population means of treatment combinations

$\mu_{ij}$ is the population mean for level $i$ of factor A and level $j$ of factor B

Factor A: parity. Factor B: inoculation dose.

                    j = 1, low       j = 2, medium    j = 3, high      Row mean
i = 1, heifer       75 ($\mu_{11}$)  42 ($\mu_{12}$)  3 ($\mu_{13}$)   40 ($\mu_{1.}$)
i = 2, multiparous  75 ($\mu_{21}$)  42 ($\mu_{22}$)  3 ($\mu_{23}$)   40 ($\mu_{2.}$)
Column mean         75 ($\mu_{.1}$)  42 ($\mu_{.2}$)  3 ($\mu_{.3}$)   40 ($\mu_{..}$)

Population means of factor levels

$\mu_{i.}$ is the population mean for observations of level $i$ of factor A:

$$\mu_{i.} = \frac{\sum_{j=1}^{b} \mu_{ij}}{b}$$

$\mu_{.j}$ is the population mean for observations of level $j$ of factor B:

$$\mu_{.j} = \frac{\sum_{i=1}^{a} \mu_{ij}}{a}$$

$\mu_{..}$ is the overall population mean:

$$\mu_{..} = \frac{\sum_{i=1}^{a} \sum_{j=1}^{b} \mu_{ij}}{ab} = \frac{\sum_{j=1}^{b} \mu_{.j}}{b} = \frac{\sum_{i=1}^{a} \mu_{i.}}{a}$$
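The three definitions can be checked numerically on the table of population means for parity and inoculation dose given earlier. This is a minimal sketch; the nested-list layout (rows for factor A, columns for factor B) and the variable names are illustrative choices.

```python
# Population means mu[i][j] from the slides: rows = parity (heifer, multiparous),
# columns = inoculation dose (low, medium, high).
mu = [
    [75, 42, 3],
    [75, 42, 3],
]
a = len(mu)       # number of levels of factor A
b = len(mu[0])    # number of levels of factor B

row_means = [sum(mu[i]) / b for i in range(a)]                       # mu_i.
col_means = [sum(mu[i][j] for i in range(a)) / a for j in range(b)]  # mu_.j
overall = sum(sum(row) for row in mu) / (a * b)                      # mu_..

print("row means:   ", row_means)
print("column means:", col_means)
print("overall mean:", overall)
```

The overall mean obtained from all cells equals both the average of the row means and the average of the column means, as the last formula states.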

Translation to factor effects

Main effect of level $i$ of factor A: $\alpha_i = \mu_{i.} - \mu_{..}$
Main effect of level $j$ of factor B: $\beta_j = \mu_{.j} - \mu_{..}$

As $\mu_{..} = \frac{\sum_{i=1}^{a} \mu_{i.}}{a} \Rightarrow \sum_{i=1}^{a} \alpha_i = 0$

As $\mu_{..} = \frac{\sum_{j=1}^{b} \mu_{.j}}{b} \Rightarrow \sum_{j=1}^{b} \beta_j = 0$

⇒ Sum restrictions
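The step from the definition of the main effects to the sum restrictions can be spelled out as follows. It uses only the definitions of $\alpha_i$, $\beta_j$ and $\mu_{..}$ given on the slides.

```latex
\begin{align*}
\sum_{i=1}^{a} \alpha_i
  &= \sum_{i=1}^{a} \left( \mu_{i.} - \mu_{..} \right)
   = \sum_{i=1}^{a} \mu_{i.} - a\,\mu_{..}
   = a\,\mu_{..} - a\,\mu_{..} = 0, \\
\sum_{j=1}^{b} \beta_j
  &= \sum_{j=1}^{b} \left( \mu_{.j} - \mu_{..} \right)
   = \sum_{j=1}^{b} \mu_{.j} - b\,\mu_{..}
   = b\,\mu_{..} - b\,\mu_{..} = 0.
\end{align*}
```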

What are $\alpha_1$, $\alpha_2$, $\beta_1$, $\beta_2$ and $\beta_3$ in the example below?

Factor A: parity. Factor B: inoculation dose.

                    j = 1, low       j = 2, medium    j = 3, high      Row mean
i = 1, heifer       75 ($\mu_{11}$)  42 ($\mu_{12}$)  3 ($\mu_{13}$)   40 ($\mu_{1.}$)
i = 2, multiparous  75 ($\mu_{21}$)  42 ($\mu_{22}$)  3 ($\mu_{23}$)   40 ($\mu_{2.}$)
Column mean         75 ($\mu_{.1}$)  42 ($\mu_{.2}$)  3 ($\mu_{.3}$)   40 ($\mu_{..}$)

$\alpha_1 = 0$, $\alpha_2 = 0$
$\beta_1 = \mu_{.1} - \mu_{..} = 75 - 40 = 35$
$\beta_2 = 2$, $\beta_3 = -37$
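The same main effects can be obtained numerically from the table of population means. The sketch below is one way to do it; the function name `factor_effects` and the nested-list layout are illustrative assumptions, not part of the slides.

```python
def factor_effects(mu):
    """Return (overall mean, alpha, beta) for a table of cell means mu[i][j]."""
    a, b = len(mu), len(mu[0])
    overall = sum(sum(row) for row in mu) / (a * b)
    row_means = [sum(row) / b for row in mu]
    col_means = [sum(mu[i][j] for i in range(a)) / a for j in range(b)]
    alpha = [m - overall for m in row_means]   # alpha_i = mu_i. - mu_..
    beta = [m - overall for m in col_means]    # beta_j  = mu_.j - mu_..
    return overall, alpha, beta


# Population means from the slides: rows = heifer, multiparous; columns = low, medium, high.
mu = [
    [75, 42, 3],
    [75, 42, 3],
]
overall, alpha, beta = factor_effects(mu)
print("overall:", overall)
print("alpha:  ", alpha)
print("beta:   ", beta)
```

Both sets of effects sum to zero, in line with the sum restrictions.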

Additive factor effects

We say that factor effects are additive if we only need the factor effects to obtain the population means, i.e.,

$$\mu_{ij} = \mu_{..} + \alpha_i + \beta_j$$

This corresponds to
→ Absence of interaction
→ Effect of one factor does not depend on the level of the other factor

Are the two factors additive in the example below?

Factor A: parity. Factor B: inoculation dose.

                    j = 1, low       j = 2, medium    j = 3, high      Row mean
i = 1, heifer       75 ($\mu_{11}$)  42 ($\mu_{12}$)  3 ($\mu_{13}$)   40 ($\mu_{1.}$)
i = 2, multiparous  75 ($\mu_{21}$)  42 ($\mu_{22}$)  3 ($\mu_{23}$)   40 ($\mu_{2.}$)
Column mean         75 ($\mu_{.1}$)  42 ($\mu_{.2}$)  3 ($\mu_{.3}$)   40 ($\mu_{..}$)
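One way to answer this kind of question numerically is to test whether every cell mean equals the overall mean plus the two main effects, i.e. $\mu_{ij} = \mu_{..} + \alpha_i + \beta_j$. The sketch below applies that check to the table above and, for contrast, to a second table that is purely hypothetical and not from the slides. The function name `is_additive` and the tolerance are illustrative choices.

```python
def is_additive(mu, tol=1e-9):
    """Check whether mu[i][j] == overall + alpha_i + beta_j holds in every cell."""
    a, b = len(mu), len(mu[0])
    overall = sum(sum(row) for row in mu) / (a * b)
    alpha = [sum(row) / b - overall for row in mu]
    beta = [sum(mu[i][j] for i in range(a)) / a - overall for j in range(b)]
    return all(
        abs(mu[i][j] - (overall + alpha[i] + beta[j])) <= tol
        for i in range(a)
        for j in range(b)
    )


# Table from the slides: rows = heifer, multiparous; columns = low, medium, high.
slide_table = [
    [75, 42, 3],
    [75, 42, 3],
]

# Hypothetical table (not from the slides): parity matters only at the low dose.
hypothetical_table = [
    [60, 40, 20],
    [90, 40, 20],
]

print("slide table additive:       ", is_additive(slide_table))
print("hypothetical table additive:", is_additive(hypothetical_table))
```

For the table from the slides the two rows are identical, so the check succeeds: the dose effect is the same for heifers and multiparous cows.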
