Business Statistics CONTENTS Hypotheses on the median The sign - - PowerPoint PPT Presentation



slide-1
SLIDE 1

MEDIAN: NON-PARAMETRIC TESTS

Business Statistics

slide-2
SLIDE 2

CONTENTS

▪ Hypotheses on the median
▪ The sign test
▪ The Wilcoxon signed rank test
▪ Old exam question
▪ Further study

slide-3
SLIDE 3

HYPOTHESES ON THE MEDIAN

▪ The median is a central value that may be more suitable for strongly asymmetric distributions
  ▪ and for distributions with fat tails
▪ Can we test a population median?
  ▪ e.g., $H_0: M = 400$
▪ Note:
  ▪ for a more or less symmetric distribution, $M \approx \mu$, so a $t$-test of the mean is appropriate (if $n \ge 15$)
  ▪ although perhaps more sensitive to large positive or negative outliers in the sample

$M$ is here the population median. Think of it as a Greek letter ...

slide-4
SLIDE 4

HYPOTHESES ON THE MEDIAN

▪ What is the median of a sample?
  ▪ it is the middle value of the sorted sample, i.e. roughly $x_{(n/2)}$
▪ So, if $H_0: M = 400$ were true, approximately half of the data in the sample would be lower, and half would be higher
▪ Therefore, if we count the number of data points that are lower and compare it to the number of observations, we can develop a test statistic
▪ Two varieties of such non-parametric tests today:
  ▪ sign test
  ▪ Wilcoxon signed rank test

slide-5
SLIDE 5

THE SIGN TEST

The sign test
▪ involves simply counting the number of positive or negative signs in a sequence of $n$ signs
▪ is based on the binomial distribution
▪ can be applied without requirements on the population distribution

slide-6
SLIDE 6

THE SIGN TEST

Computational steps:
▪ for each data point $x_i$ compute the difference with the median ($M$) of the null hypothesis ($H_0$): $d_i = x_i - M$
▪ omit zero differences ($d_i = 0$); effective sample size is $n'$
▪ assign $+1$ to positive differences ($d_i > 0$) and $-1$ to negative differences ($d_i < 0$)
▪ test statistic $X$ is the number of $+1$ values (= number of observations above $M$)

slide-7
SLIDE 7

THE SIGN TEST

Example:
Context: battery life until failure (in hours)
▪ $H_0: M = 400$; $H_1: M \ne 400$
▪ use $\alpha = 0.05$
▪ sample of $n = 13$ observations ($x_1, \dots, x_{13}$)
▪ reject for large and for small numbers of positive signs

slide-8
SLIDE 8

THE SIGN TEST

Example ($H_0: M = 400$):
▪ data: $x_i$ ($i = 1, \dots, 13$)
▪ difference with $M$: $d_i = x_i - 400$
▪ no cases where $d_i = 0$, so $n' = n$
▪ $s_i = \begin{cases} 1 & \text{if } d_i > 0 \\ -1 & \text{if } d_i < 0 \end{cases}$
▪ $s_i^{+} = \begin{cases} 1 & \text{if } d_i > 0 \\ 0 & \text{if } d_i < 0 \end{cases}$
▪ $x = \sum_{i=1}^{n'} s_i^{+} = 8$

x_i     x_i - 400    s_i    s_i^(+)
342        -58       -1       0
426         26        1       1
317        -83       -1       0
545        145        1       1
264       -136       -1       0
451         51        1       1
1049       649        1       1
631        231        1       1
512        112        1       1
266       -134       -1       0
492         92        1       1
562        162        1       1
298       -102       -1       0
slide-9
SLIDE 9

THE SIGN TEST

Example (continued):
▪ $x = 8$
▪ under $H_0$: $X \sim \text{bin}(13, 0.5)$
▪ $P_{\text{bin}(13, 0.5)}(X \ge 8) = 0.291$
  ▪ why $\ge 8$?
  ▪ if we would reject for 8, we would also reject for 9
▪ $p$-value: $2 \times 0.291 = 0.581$
  ▪ why $2 \times$?
  ▪ because it's a two-sided test ($H_1: M \ne 400$)
▪ $0.581 > \alpha = 0.05$, so there is no reason to reject $H_0$
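The sign-test steps and the binomial tail from the example can be reproduced with a short Python sketch. The function name `sign_test` is an illustrative choice, not something from the slides. Doubling the smaller tail to get a two-sided p-value is an assumption that matches the "2 ×" rule used in the example.

```python
from math import comb

# Battery-life data (hours) from the worked example; m0 is the median under H0.
data = [342, 426, 317, 545, 264, 451, 1049, 631, 512, 266, 492, 562, 298]
m0 = 400


def sign_test(values, median0):
    """Return (x, n_eff, p_two_sided) for a two-sided sign test (illustrative helper)."""
    # Differences with the hypothesised median; zero differences are omitted.
    diffs = [v - median0 for v in values if v != median0]
    n_eff = len(diffs)
    # Test statistic: number of positive signs.
    x = sum(1 for d in diffs if d > 0)
    # Exact binomial tails under H0: X ~ bin(n_eff, 0.5).
    upper = sum(comb(n_eff, k) for k in range(x, n_eff + 1)) / 2 ** n_eff
    lower = sum(comb(n_eff, k) for k in range(0, x + 1)) / 2 ** n_eff
    # Assumption: two-sided p-value = twice the smaller tail, capped at 1.
    p_two_sided = min(1.0, 2 * min(upper, lower))
    return x, n_eff, p_two_sided


x, n_eff, p_value = sign_test(data, m0)
print(x, n_eff, round(p_value, 3))  # 8 13 0.581
```

The same helper can be applied to other data by changing `data` and `m0`. For large samples the exact sum is still cheap because it uses integer arithmetic.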

slide-10
SLIDE 10

EXERCISE 1

Suppose we have more observations ($n = 130$) and find $x = 80$. Can you look up $P_{\text{bin}(130, 0.5)}(X \ge 80)$?

slide-11
SLIDE 11

THE SIGN TEST

In the sign test, we replace the numerical values by signs (+ or −)

Advantage:
▪ we don't need any assumption on normality, symmetry, etc.
  ▪ that's why we say it's non-parametric: we don't have to assume a certain distribution with parameters

Disadvantage:
▪ we discard much information, so that the test is not very sensitive (has low "power"; see later)

Are there other non-parametric tests that are more powerful?
▪ is there a compromise between value and sign that still needs some assumptions, but not too many assumptions?

Yes, replacing data by their rank

slide-12
SLIDE 12

THE WILCOXON SIGNED RANK TEST

Wilcoxon signed rank test
▪ involves comparing the sum of ranks of the values larger than the test value with the sum of ranks of the values smaller than the test value

Computational steps:
▪ for each data point $x_i$ compute the difference with the median ($M$) of the null hypothesis, $d_i = x_i - M$, and its absolute value $|d_i|$
▪ omit zero differences ($d_i = 0$); effective sample size is $n'$
▪ assign ranks ($1, \dots, n'$) to the $|d_i|$, smallest first
▪ reassign the signs ($+$ and $-$) of the $d_i$ to the ranks
▪ test statistic ($W$) is the sum of the positive ranks

slide-13
SLIDE 13

THE WILCOXON SIGNED RANK TEST

Example ($H_0: M = 400$):
▪ data: $x_i$ ($i = 1, \dots, 13$)
▪ difference with $M$: $d_i = x_i - 400$
▪ no cases where $d_i = 0$, so $n' = n$
▪ $w = \sum_{i=1}^{n'} r_i^{+} = 61$
▪ under $H_0$: $W \sim\ ?$ (use table)
▪ $P_{H_0}(W \ge 61) =\ ?$

x_i    x_i - 400   |x_i - 400|   r_i   r_i^(+)
342      -58          58          -3      0
426       26          26           1      1
317      -83          83          -4      0
545      145         145          10     10
264     -136         136          -9      0
451       51          51           2      2
1049     649         649          13     13
631      231         231          12     12
512      112         112           7      7
266     -134         134          -8      0
492       92          92           5      5
562      162         162          11     11
298     -102         102          -6      0
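The ranking steps for this example can be sketched in a few lines of Python. The function name `signed_rank_statistic` is hypothetical. The sketch assumes there are no ties among the absolute differences, which holds for this data; tied values would need average ranks, which the sketch does not handle.

```python
# Battery-life data (hours) from the worked example; m0 is the median under H0.
data = [342, 426, 317, 545, 264, 451, 1049, 631, 512, 266, 492, 562, 298]
m0 = 400


def signed_rank_statistic(values, median0):
    """Return (w, n_eff): sum of positive ranks and effective sample size."""
    # Differences with the hypothesised median; zero differences are omitted.
    diffs = [v - median0 for v in values if v != median0]
    # Sort by absolute difference so that rank 1 goes to the smallest |d_i|.
    # Assumption: no ties among the |d_i| (ties would require average ranks).
    ordered = sorted(diffs, key=abs)
    # Sum the ranks that belong to positive differences.
    w = sum(rank for rank, d in enumerate(ordered, start=1) if d > 0)
    return w, len(diffs)


w, n_eff = signed_rank_statistic(data, m0)
print(w, n_eff)  # 61 13
```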
slide-14
SLIDE 14

THE WILCOXON SIGNED RANK TEST

Testing the median using the Wilcoxon $W$ statistic
▪ small samples: using a table of critical values
  ▪ included in tables at exam
▪ large samples: using a normal approximation of $W$
  ▪ valid when $n \ge 20$
▪ The test is only valid for symmetrically distributed populations
  ▪ if not, use sign test

slide-15
SLIDE 15

THE WILCOXON SIGNED RANK TEST

Small samples: critical values of Wilcoxon statistic
▪ two-sided, $\alpha = 0.05$, $n = 13$: $w_{\text{lower}} = 17$ and $w_{\text{upper}} = 74$
▪ $R_{\text{crit}} = [0, 17] \cup [74, 91]$
▪ $w_{\text{calc}} = 61$, so do not reject $H_0$ at $\alpha = 0.05$

Lower and Upper Critical Values W of Wilcoxon Signed-Ranks Test (lower , upper)

one-tail:   α = 0.05    α = 0.025   α = 0.01    α = 0.005
two-tail:   α = 0.10    α = 0.05    α = 0.02    α = 0.01
n
5            0 , 15     -- , ---    -- , ---    -- , ---
6            2 , 19      0 , 21     -- , ---    -- , ---
7            3 , 25      2 , 26      0 , 28     -- , ---
8            5 , 31      3 , 33      1 , 35      0 , 36
9            8 , 37      5 , 40      3 , 42      1 , 44
10          10 , 45      8 , 47      5 , 50      3 , 52
11          13 , 53     10 , 56      7 , 59      5 , 61
12          17 , 61     13 , 65     10 , 68      7 , 71
13          21 , 70     17 , 74     12 , 79     10 , 81

Table is available at the exam (and on the course website)
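The critical values used above can be checked against the exact null distribution of $W$. The following Python sketch assumes that under $H_0$ each rank is positive or negative with probability 0.5, independently, so every subset of the ranks $1, \dots, n$ is equally likely as the set of positive ranks. The function names are illustrative. The rule "largest value whose tail probability does not exceed α" is one possible table convention; printed tables sometimes round differently, so not every entry need coincide.

```python
def signed_rank_counts(n):
    """counts[s] = number of subsets of {1, ..., n} whose ranks sum to s."""
    max_sum = n * (n + 1) // 2
    counts = [0] * (max_sum + 1)
    counts[0] = 1  # the empty subset has sum 0
    for rank in range(1, n + 1):
        # Iterate downwards so that each rank is used at most once.
        for s in range(max_sum, rank - 1, -1):
            counts[s] += counts[s - rank]
    return counts


def lower_critical_value(n, tail_alpha):
    """Largest c with P(W <= c) <= tail_alpha under H0, or None if no such c exists."""
    counts = signed_rank_counts(n)
    total = 2 ** n  # number of equally likely sign patterns
    cumulative, crit = 0, None
    for s, c in enumerate(counts):
        cumulative += c
        if cumulative / total <= tail_alpha:
            crit = s
        else:
            break
    return crit


n = 13
lower = lower_critical_value(n, 0.025)  # two-sided alpha = 0.05
upper = n * (n + 1) // 2 - lower        # W is symmetric around n(n+1)/4
print(lower, upper)  # 17 74
```

For $n = 5$ and a tail probability of 0.025 the function returns `None`, in line with the `-- , ---` entry in the table.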

slide-16
SLIDE 16

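THE WILCOXON SIGNED RANK TEST

Large samples: under $H_0$, it can be shown that
▪ $E(W) = \dfrac{n(n+1)}{4}$
▪ $\operatorname{var}(W) = \dfrac{n(n+1)(2n+1)}{24}$

Further, for $n \ge 20$, approximately:
▪ $\dfrac{W - \frac{n(n+1)}{4}}{\sqrt{\frac{n(n+1)(2n+1)}{24}}} \sim N(0, 1)$
▪ so you can compute $z_{\text{calc}} = \dfrac{w_{\text{calc}} - \frac{n(n+1)}{4}}{\sqrt{\frac{n(n+1)(2n+1)}{24}}}$
▪ and compare it to $z_{\text{crit}}$ (e.g., $\pm 1.96$)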

slide-17
SLIDE 17

THE WILCOXON SIGNED RANK TEST

Example, continued:
▪ $w = \sum_{i=1}^{n'} r_i^{+} = 61$
▪ under $H_0$: $W \sim N\big(E(W), \operatorname{var}(W)\big)$
▪ so, under $H_0$: $\dfrac{W - E(W)}{\sqrt{\operatorname{var}(W)}} \sim N(0, 1)$
▪ $P_N(W \ge 61) = P\left(\dfrac{W - E(W)}{\sqrt{\operatorname{var}(W)}} \ge \dfrac{61 - 45.5}{14.31}\right) = P(Z \ge 1.08) = 0.1401$
▪ $p$-value: $2 \times 0.1401 = 0.2802$
▪ there is no reason to reject $H_0$

In fact, not a good idea because $n = 13 \ngeq 20$. We do it just to show how it works ...
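The normal approximation can be sketched in Python using only the standard library. The function name `wilcoxon_normal_approx` is hypothetical, and no continuity correction is applied, which follows the calculation on the slide. Because the code keeps $z$ unrounded, its tail probability comes out slightly below the table value of 0.1401 obtained for $z = 1.08$.

```python
from math import erfc, sqrt


def wilcoxon_normal_approx(w, n):
    """Return (z, p_upper) for the normal approximation to the W statistic."""
    mean = n * (n + 1) / 4                      # E(W) under H0
    var = n * (n + 1) * (2 * n + 1) / 24        # var(W) under H0
    z = (w - mean) / sqrt(var)
    # Upper tail of the standard normal: P(Z >= z) = erfc(z / sqrt(2)) / 2
    p_upper = 0.5 * erfc(z / sqrt(2))
    return z, p_upper


z, p_upper = wilcoxon_normal_approx(61, 13)
p_two_sided = 2 * p_upper  # valid here because z > 0, i.e. w lies in the upper tail
print(round(z, 2), round(p_upper, 4), round(p_two_sided, 4))
```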

slide-18
SLIDE 18

OLD EXAM QUESTION

23 March 2015, Q1l-m

slide-19
SLIDE 19

FURTHER STUDY

▪ Doane & Seward 5/E 16.1-16.3
▪ Tutorial exercises week 3
▪ Wilcoxon signed rank test, sign test