Operational Trials: Data Analysis Wendy Bergerud Research Branch - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Operational Trials:

Data Analysis

Wendy Bergerud

Research Branch BC Min. of Forests

May 2003

slide-2
SLIDE 2

WAB

Proposed Analysis

When designing the trial we should consider what form of analysis we expect to run on the data.

What design requirements will this proposed analysis have?

Can these design requirements be met?

Q12

slide-3
SLIDE 3

Proposed Analysis

Should additional variables be collected and/or are some unnecessary?

Does the analysis require some assumptions that the design can’t meet?

Are some treatment combinations necessary or not?

Should all treatment combinations be equally replicated?

slide-4
SLIDE 4

Types of Variables

Measured or continuous variables

Counts

Categorical variables

  Non-ordered
  Ordered
  Dichotomous (proportions)

Response vs. Explanatory Variables

slide-5
SLIDE 5

Response Variables

What variables measure the response we are interested in and how will we measure them?

Can the variables we are interested in be directly measured or must we use a proxy or surrogate measure?

Q10

slide-6
SLIDE 6

What is a model?

A model is simply a description that summarizes the groups and/or relationships that we think exist in the subject matter that we are studying. Statistical models often have a formal mathematical representation. But these models are developed from the ideas that we have about the subject matter we are studying.

slide-7
SLIDE 7

Statistical Models for Data Analysis

Type of Explanatory Variable                        General Linear Model (GLM)
Categorical: 2 levels                               t-test
Categorical: 2 or more levels                       ANOVA & contrasts
Continuous: 1 variable                              Simple Regression
Continuous: 2 or more variables                     Multiple Regression
Categorical & Continuous: 2 or more & 1 or more     ANCOVA

These are for continuous response variables, modelled using the normal distribution.

slide-8
SLIDE 8

Other Models for Data Analysis

The previous models can be adapted for other types of response variables:

Counts: Poisson distribution

Proportions: binomial distribution

Contingency tables can also be used to test counts and proportions obtained from simple study designs.

slide-9
SLIDE 9

Data Analysis Steps

1. Look for patterns, errors, and outliers.
2. Correct data as necessary.
3. Fit models and test terms in the model.
4. Examine selected model for adequate fit.
   Repeat any steps as necessary.
5. Summarize data analysis and the final fitted model.

slide-10
SLIDE 10

Data Analysis Example

D1

        BWBS                          SBS
DBH (cm)  Height (m)        DBH (cm)  Height (m)
  23.1       19.3             35.0       31.2
  26.6        4.1             34.2       25.5
  26.1       24.0             21.1       22.4
  35.1       24.6             22.3       19.3
  22.3       19.9             24.2       27.2
  27.2       21.5             27.5       24.1
  26.6       24.0             34.3       27.2
  25.9       24.7             26.6       24.9
  20.5       20.2             26.3       28.2
  22.4       20.4             26.8       26.5
  52.6       27.5             26.9       27.6
  22.2       22.7

slide-11
SLIDE 11

Step 1 -- Stem and Leaf Plots

D2

BWBS Diameter

Stem | Leaf     #
   5 | 3        1
   4 |
   4 |
   3 | 5        1
   3 |
   2 | 66777    5
   2 | 02223    5
Multiply Stem.Leaf by 10

SBS Diameter

Stem | Leaf     #
  34 | 230      3
  32 |
  30 |
  28 |
  26 | 36895    5
  24 | 2        1
  22 | 3        1
  20 | 1        1

One unusually high diameter in BWBS.

Two groups in SBS.
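A display of this kind can be sketched in a few lines of Python. This is only an illustration, not the SAS routine used for the slides: the function names are made up, values are rounded to whole centimetres, and each stem of ten gets a single row, whereas SAS splits each stem into two rows. The diameters are the BWBS values from slide D1.

```python
from collections import defaultdict

# BWBS diameters (cm) as listed on slide D1
bwbs_dbh = [23.1, 26.6, 26.1, 35.1, 22.3, 27.2,
            26.6, 25.9, 20.5, 22.4, 52.6, 22.2]

def stem_and_leaf(values):
    """Group rounded, non-negative values by tens digit (stem) and units digit (leaf)."""
    plot = defaultdict(list)
    for v in sorted(round(v) for v in values):
        stem, leaf = divmod(v, 10)
        plot[stem].append(leaf)
    return dict(plot)

def show(values):
    """Print stems from high to low, including empty stems, with a count column."""
    plot = stem_and_leaf(values)
    for stem in range(max(plot), min(plot) - 1, -1):
        leaves = plot.get(stem, [])
        print(f"{stem:>3} | {''.join(map(str, leaves)):<12} {len(leaves) or ''}")

show(bwbs_dbh)
```

The empty stem for 40 to 49 is printed on purpose: the gap is what makes the 52.6 cm tree stand out.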

slide-13
SLIDE 13

Step 1 -- Stem and Leaf Plots

D3

BWBS Height

Stem | Leaf       #
   2 | 558        3
   2 | 0002344    7
   1 | 9          1
   1 |
   0 |
   0 | 4          1
Multiply Stem.Leaf by 10

SBS Height

Stem | Leaf       #
  30 | 2          1
  28 | 2          1
  26 | 5226       4
  24 | 195        3
  22 | 4          1
  20 |
  18 | 3          1

Two unusually low heights, one in each zone.

The SBS value is actually 19.3 (each SBS stem covers two units, so the stem 18 spans 18.0 to 19.9).

slide-15
SLIDE 15

[Figure: side-by-side boxplots of Diameter and of Height, each by BEC-Zone (BWBS, SBS)]

Step 1 -- Boxplots

D4
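The points a boxplot draws as outliers can also be listed numerically. The sketch below assumes the common 1.5 x IQR fence rule and Python's default quartile definition; the software used for the slides may compute quartiles differently, so borderline points can differ. The function name is hypothetical, and the values are the uncorrected BWBS data from slide D1.

```python
from statistics import quantiles

# Uncorrected BWBS values from slide D1
bwbs_dbh = [23.1, 26.6, 26.1, 35.1, 22.3, 27.2,
            26.6, 25.9, 20.5, 22.4, 52.6, 22.2]
bwbs_height = [19.3, 4.1, 24.0, 24.6, 19.9, 21.5,
               24.0, 24.7, 20.2, 20.4, 27.5, 22.7]

def boxplot_outliers(values):
    """Return the values lying more than 1.5 * IQR outside the quartiles."""
    q1, _median, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]

print("DBH outliers:   ", boxplot_outliers(bwbs_dbh))
print("Height outliers:", boxplot_outliers(bwbs_height))
```

A flagged point is a prompt to go back and check the record, not a reason to delete it.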

slide-16
SLIDE 16

[Figure: scatterplot of Height against Diameter, with points marked by BEC-Zone (BWBS, SBS)]

Step 1 -- Scatterplot

D5

slide-18
SLIDE 18

Step 2 -- Correct data

We have found three unusual points:

Are these ‘real’ values? If not, what should the values be?

Sometimes the unusual data points have the most interesting story to tell - don’t casually throw them out!

Suppose that the height of 4.1 should be 24.7, but that the other values are okay.

D6
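One way to make such a correction without losing the original record is to keep the raw data untouched and apply a documented list of changes. This is a minimal sketch: the `Correction` class, the `apply_corrections` function, and the wording of the reason are all hypothetical; only the values 4.1 and 24.7 come from the slide.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Correction:
    zone: str
    dbh: float
    old_height: float
    new_height: float
    reason: str

# Raw BWBS (DBH cm, height m) pairs from slide D1
raw_bwbs = [(23.1, 19.3), (26.6, 4.1), (26.1, 24.0), (35.1, 24.6),
            (22.3, 19.9), (27.2, 21.5), (26.6, 24.0), (25.9, 24.7),
            (20.5, 20.2), (22.4, 20.4), (52.6, 27.5), (22.2, 22.7)]

corrections = [
    # The reason text is a placeholder; record whatever the field check showed.
    Correction("BWBS", 26.6, 4.1, 24.7, "supposed correct value (Step 2)"),
]

def apply_corrections(zone, trees, fixes):
    """Return a corrected copy of trees; the raw list is left unchanged."""
    fixed = []
    for dbh, height in trees:
        for c in fixes:
            if (c.zone, c.dbh, c.old_height) == (zone, dbh, height):
                height = c.new_height
        fixed.append((dbh, height))
    return fixed

bwbs = apply_corrections("BWBS", raw_bwbs, corrections)
```

Matching on both diameter and old height matters here because two BWBS trees share the diameter 26.6.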

slide-19
SLIDE 19

Step 3 -- Fit a model

Separate response variables:

  Analyse DBH and height separately with a t-test or ANOVA.

  Are the two groups different?

Relationship between response variables:

  Analyse the relationship between height and diameter.

  Is the relationship the same for both groups?

D7
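For the first approach, a pooled two-sample t-test on height can be sketched as follows. The function name is hypothetical, the heights are those on slide D1 with the BWBS value of 4.1 replaced by 24.7 as supposed in Step 2, and equal variances in the two zones are assumed. Only the t statistic and its degrees of freedom are computed; the p-value would come from a t table or a statistics package.

```python
import math
from statistics import mean

# Heights (m) by zone; BWBS 4.1 replaced by 24.7 as supposed in Step 2
bwbs_height = [19.3, 24.7, 24.0, 24.6, 19.9, 21.5,
               24.0, 24.7, 20.2, 20.4, 27.5, 22.7]
sbs_height = [31.2, 25.5, 22.4, 19.3, 27.2, 24.1,
              27.2, 24.9, 28.2, 26.5, 27.6]

def pooled_t(a, b):
    """Two-sample t statistic (mean of b minus mean of a) with pooled variance."""
    na, nb = len(a), len(b)
    ma, mb = mean(a), mean(b)
    ss_within = sum((v - ma) ** 2 for v in a) + sum((v - mb) ** 2 for v in b)
    df = na + nb - 2
    se = math.sqrt(ss_within / df * (1 / na + 1 / nb))
    return (mb - ma) / se, df

t_stat, df = pooled_t(bwbs_height, sbs_height)
print(f"t = {t_stat:.2f} on {df} df")
```

With two groups this t-test and a one-way ANOVA are the same test: the ANOVA F statistic is the square of t.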

slide-20
SLIDE 20

Step 3 -- Consider Five models

D8

Model  Name                Description
1      One group           Data belong to just one group.
2      Two groups          Data belong to two groups defined by BEC-Zone.
3      One line            There is a linear relationship between height and
                           diameter but it is the same for the two groups.
4      Two parallel lines  There is a linear relationship between height and
                           diameter but the line for one group is higher than
                           for the other group, while both have the same slope.
5      Two lines           There is a linear relationship between height and
                           diameter but the slope for one group is steeper than
                           for the other group.
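Written as equations, the five models might look like the following. The notation is an assumption introduced here, not taken from the slides: $H_{ij}$ and $D_{ij}$ are the height and diameter of tree $j$ in zone $i$ (BWBS or SBS), and $\varepsilon_{ij}$ is a normally distributed error.

```latex
\begin{align*}
\text{1. One group:}          \quad & H_{ij} = \mu + \varepsilon_{ij} \\
\text{2. Two groups:}         \quad & H_{ij} = \mu_i + \varepsilon_{ij} \\
\text{3. One line:}           \quad & H_{ij} = \alpha + \beta D_{ij} + \varepsilon_{ij} \\
\text{4. Two parallel lines:} \quad & H_{ij} = \alpha_i + \beta D_{ij} + \varepsilon_{ij} \\
\text{5. Two lines:}          \quad & H_{ij} = \alpha_i + \beta_i D_{ij} + \varepsilon_{ij}
\end{align*}
```

Each model is a special case of a later one (1 within 2 and 3, both of those within 4, and 4 within 5), which is what allows them to be compared by the change in residual sum of squares.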

slide-21
SLIDE 21

Step 3 -- Parallel Line Model

D9

[Figure: Height against Diameter by BEC-Zone (BWBS, SBS), with the fitted parallel lines]

slide-22
SLIDE 22

Step 3 -- Model Fit

D10

Model  Name                Degrees of     Residual Sum of   Residual Mean
                           Freedom (df)   Squares (SSR)     Square (MSR)
1      One group           22             223.3565          10.15
2      Two groups          21             170.4710           8.12
3      One line            21             147.8133           7.04
4      Two parallel lines  20              96.7094           4.84
5      Two lines           19              89.8038           4.73
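The residual sums of squares can be checked with ordinary least-squares formulas. The sketch below is not the SAS analysis behind the slides: it uses closed-form sums of squares and cross-products, with hypothetical variable names, on the (DBH, height) pairs from slide D1 after replacing the BWBS height of 4.1 with 24.7 as supposed in Step 2.

```python
# (DBH cm, height m) pairs from slide D1; BWBS height 4.1 replaced by 24.7
BWBS = [(23.1, 19.3), (26.6, 24.7), (26.1, 24.0), (35.1, 24.6),
        (22.3, 19.9), (27.2, 21.5), (26.6, 24.0), (25.9, 24.7),
        (20.5, 20.2), (22.4, 20.4), (52.6, 27.5), (22.2, 22.7)]
SBS = [(35.0, 31.2), (34.2, 25.5), (21.1, 22.4), (22.3, 19.3),
       (24.2, 27.2), (27.5, 24.1), (34.3, 27.2), (26.6, 24.9),
       (26.3, 28.2), (26.8, 26.5), (26.9, 27.6)]

def sums(pairs):
    """Return n and the mean-corrected sums Sxx, Sxy, Syy for (x, y) pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    return n, sxx, sxy, syy

n_all, sxx_all, sxy_all, syy_all = sums(BWBS + SBS)
(n_b, sxx_b, sxy_b, syy_b), (n_s, sxx_s, sxy_s, syy_s) = sums(BWBS), sums(SBS)
# Within-zone totals, used by the models with one intercept per zone
sxx_w, sxy_w, syy_w = sxx_b + sxx_s, sxy_b + sxy_s, syy_b + syy_s

# Model name -> (residual df, SSR)
fits = {
    "One group": (n_all - 1, syy_all),
    "Two groups": (n_all - 2, syy_w),
    "One line": (n_all - 2, syy_all - sxy_all ** 2 / sxx_all),
    "Two parallel lines": (n_all - 3, syy_w - sxy_w ** 2 / sxx_w),
    "Two lines": (n_all - 4,
                  syy_b - sxy_b ** 2 / sxx_b + syy_s - sxy_s ** 2 / sxx_s),
}

for name, (df, ssr) in fits.items():
    print(f"{name:<20} df={df:2d}  SSR={ssr:9.4f}  MSR={ssr / df:6.2f}")
```

Every added parameter costs one degree of freedom, so the residual mean square (SSR divided by df) only falls when the extra term explains enough variation to pay for itself.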

slide-23
SLIDE 23

Step 3 -- Comparing the models

D11

Testing                           Models   Difference   F-test (df)      p-value      Result
                                  Used     in SSR
1. Are the lines parallel?        4 & 5     6.91         1.5 (1, 19)     p = 0.24     Yes†
Given that lines are parallel:
2. Can we use just one line?      3 & 4    51.10        10.5 (1, 20)     p = 0.0040   No
3. Can we use just two groups?    2 & 4    73.76        15.25 (1, 20)    p = 0.0009   No

† Technically we can’t accept a null hypothesis, but in order to proceed we must make decisions, even if they might be wrong.

Parallel Lines model provides a good overall fit.

Significance is easy to see when p-values are presented with just two significant digits.
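Each test compares a reduced model with a fuller one: the drop in SSR, divided by the number of extra parameters, is set against the residual mean square of the fuller model. The sketch below recomputes the F statistics from the SSR values on slide D10. The function names are hypothetical, and the p-value routine is a simple numerical integration that only handles tests with one numerator degree of freedom, where F equals t squared; a statistics package would normally be used instead.

```python
import math

# Model number -> (residual df, SSR), from the model-fit table on slide D10
FITS = {2: (21, 170.4710), 3: (21, 147.8133), 4: (20, 96.7094), 5: (19, 89.8038)}

def f_test(reduced, full):
    """F statistic for dropping the extra terms of `full` to get `reduced`."""
    df_r, ssr_r = FITS[reduced]
    df_f, ssr_f = FITS[full]
    num_df = df_r - df_f
    f = ((ssr_r - ssr_f) / num_df) / (ssr_f / df_f)
    return f, num_df, df_f

def p_value_one_df(f, den_df, steps=2000):
    """P(F(1, den_df) > f), from Student's t density by Simpson's rule."""
    t = math.sqrt(f)
    c = math.gamma((den_df + 1) / 2) / (
        math.sqrt(den_df * math.pi) * math.gamma(den_df / 2))
    pdf = lambda x: c * (1 + x * x / den_df) ** (-(den_df + 1) / 2)
    h = t / steps
    total = pdf(0.0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 1 - 2 * (total * h / 3)  # both tails beyond |t|

for reduced, full in [(4, 5), (3, 4), (2, 4)]:
    f, num_df, den_df = f_test(reduced, full)
    p = p_value_one_df(f, den_df)
    print(f"models {reduced} & {full}: F = {f:.2f} ({num_df}, {den_df}), p = {p:.2g}")
```

The `.2g` format prints each p-value with two significant digits.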

slide-24
SLIDE 24

Step 4 - Checking the model

We do this by examining the residuals:

Are they normally distributed (or at least symmetric)?

Do they have any patterns with respect to the explanatory variables and the fitted values?

Does their variability look similar regardless of the explanatory variables?

D12
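The residuals themselves are straightforward to compute once the model is fitted. The sketch below fits the parallel-lines model by least squares (one intercept per zone, a common slope) and prints a few numerical summaries; the function name is hypothetical and the plots remain the main tool. The data are the (DBH, height) pairs from slide D1 with the BWBS height of 4.1 replaced by 24.7 as supposed in Step 2.

```python
from statistics import mean, pstdev

groups = {
    "BWBS": [(23.1, 19.3), (26.6, 24.7), (26.1, 24.0), (35.1, 24.6),
             (22.3, 19.9), (27.2, 21.5), (26.6, 24.0), (25.9, 24.7),
             (20.5, 20.2), (22.4, 20.4), (52.6, 27.5), (22.2, 22.7)],
    "SBS": [(35.0, 31.2), (34.2, 25.5), (21.1, 22.4), (22.3, 19.3),
            (24.2, 27.2), (27.5, 24.1), (34.3, 27.2), (26.6, 24.9),
            (26.3, 28.2), (26.8, 26.5), (26.9, 27.6)],
}

def fit_parallel_lines(data):
    """Least-squares fit with one intercept per group and a common slope."""
    sxx = sxy = 0.0
    centres = {}
    for name, pairs in data.items():
        mx = mean(x for x, _ in pairs)
        my = mean(y for _, y in pairs)
        centres[name] = (mx, my)
        sxx += sum((x - mx) ** 2 for x, _ in pairs)
        sxy += sum((x - mx) * (y - my) for x, y in pairs)
    slope = sxy / sxx
    intercepts = {g: my - slope * mx for g, (mx, my) in centres.items()}
    return intercepts, slope

intercepts, slope = fit_parallel_lines(groups)
residuals = {g: [y - (intercepts[g] + slope * x) for x, y in pairs]
             for g, pairs in groups.items()}

for g, res in residuals.items():
    # Similar spread in both zones supports the equal-variance assumption
    print(f"{g}: min={min(res):6.2f}  max={max(res):5.2f}  spread={pstdev(res):.2f}")
```

Residuals from a least-squares fit with a separate intercept per zone sum to zero within each zone, so the checks of interest are shape, spread, and patterns rather than the mean.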

slide-25
SLIDE 25

Step 4 - Checking the model

D13

Stem & Leaf Plot with Boxplot

(from SAS’s Proc Univariate)

Stem  Leaf      #   Boxplot
   3  5         1      |
   2  02338     5   +-----+
   1  356       3   |     |
   0  9         1   |  +  |
  -0  7632      4   *-----*
  -1  977520    6   +-----+
  -2  30        2      |
  -3                   |
  -4                   |
  -5  1         1      |

slide-27
SLIDE 27

Step 4 - Checking the model

Scatterplot against Diameter

D14

[Figure: residuals plotted against Diameter, with points marked by BEC-Zone (BWBS, SBS)]

slide-28
SLIDE 28

Step 4 - Checking the model

Good overall plots for examining residuals

D15

[Figure: residuals plotted against Predicted Height, with points marked by BEC-Zone (BWBS, SBS)]

slide-29
SLIDE 29

Step 5 - Summarizing the results

Final model has parameters:

D16

Group 1: BWBS   Height = 15.6 + 0.262 * Diameter
                  se:   (1.96)  (0.067)

Group 2: SBS    Height = 18.5 + 0.262 * Diameter
                  se:   (1.98)  (0.067)

The slope is the same for both groups.

Include the plot from slide D9 in the report, as well as the tests from slide D11.
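As a usage illustration, the fitted model can be wrapped in a small prediction function. The function and constant names are hypothetical, the coefficients are the rounded values given above, and predictions are only reasonable inside the range of diameters that was sampled (roughly 20 to 53 cm).

```python
# Rounded coefficients of the final parallel-lines model
INTERCEPT = {"BWBS": 15.6, "SBS": 18.5}  # metres
SLOPE = 0.262                            # metres of height per cm of diameter

def predict_height(zone, dbh_cm):
    """Predicted height (m) for a tree of the given diameter in the given BEC zone."""
    return INTERCEPT[zone] + SLOPE * dbh_cm

for zone in INTERCEPT:
    print(f"{zone}: predicted height at 30 cm DBH = {predict_height(zone, 30.0):.1f} m")
```

Because the slope is shared, the predicted difference between zones is the same at every diameter: 18.5 - 15.6 = 2.9 m.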

slide-30
SLIDE 30

Data Interpretation

This is your job! Discuss what the results mean in the context in which you are working. Discuss how the results confirm current practices or suggest changes.