Business Statistics CONTENTS Hypotheses on the median The sign - PowerPoint PPT Presentation

MEDIAN: NON-PARAMETRIC TESTS Business Statistics

CONTENTS
▪ Hypotheses on the median
▪ The sign test
▪ The Wilcoxon signed ranks test
▪ Old exam question
▪ Further study

HYPOTHESES ON THE MEDIAN
▪ The median is a central value that may be more suitable for strongly asymmetric distributions and for distributions with fat tails
▪ Can we test a population median?
  ▪ e.g., $H_0: M = 400$
  ▪ $M$ is here the population median. Think of it as a Greek letter ...
▪ Note:
  ▪ for a more or less symmetric distribution, $M \approx \mu$, so a $t$-test of the mean is appropriate (if $n \geq 15$)
  ▪ although the $t$-test is perhaps more sensitive to large positive or negative outliers in the sample

HYPOTHESES ON THE MEDIAN
▪ What is the median of a sample?
  ▪ it is the middle value of the ordered data, i.e., roughly $x_{(n/2)}$
▪ So, if $H_0: M = 400$ were true, approximately half of the data in the sample would be lower than 400, and half would be higher
▪ Therefore, if we count the number of data points that are lower and compare it to the number of observations, we can develop a test statistic
▪ Two varieties of such non-parametric tests today:
  ▪ sign test
  ▪ Wilcoxon signed rank test

THE SIGN TEST
The sign test
▪ involves simply counting the number of positive or negative signs in a sequence of $n$ signs
▪ is based on the binomial distribution
▪ can be applied without requirements on the population distribution

THE SIGN TEST
Computational steps:
▪ for each data point $x_i$, compute the difference with the median ($M$) of the null hypothesis ($H_0$): $d_i = x_i - M$
▪ omit zero differences ($d_i = 0$); the effective sample size is $n'$
▪ assign $+1$ to positive differences ($d_i > 0$) and $-1$ to negative differences ($d_i < 0$)
▪ the test statistic $X$ is the sum of the positive signs (= number of positive differences)

THE SIGN TEST
Example:
Context: battery life until failure (in hours)
▪ $H_0: M = 400$; $H_1: M \neq 400$
▪ use $\alpha = 0.05$
▪ sample of $n = 13$ observations ($x_1, \dots, x_{13}$)
▪ reject for large and for small numbers of positive signs

THE SIGN TEST
Example ($H_0: M = 400$):
▪ data: $x_i$ ($i = 1, \dots, 13$)
▪ difference with $M$: $d_i = x_i - 400$
▪ no cases where $d_i = 0$, so $n' = n$
▪ $s_i = 1$ if $d_i > 0$, and $s_i = -1$ if $d_i < 0$
▪ $s_i^{+} = 1$ if $d_i > 0$, and $s_i^{+} = 0$ if $d_i < 0$
▪ $x = \sum_{i=1}^{n'} s_i^{+} = 8$

   x_i   x_i - 400   s_i   s_i^(+)
   342      -58       -1      0
   426       26        1      1
   317      -83       -1      0
   545      145        1      1
   264     -136       -1      0
   451       51        1      1
  1049      649        1      1
   631      231        1      1
   512      112        1      1
   266     -134       -1      0
   492       92        1      1
   562      162        1      1
   298     -102       -1      0
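The counting in this example can be reproduced with a few lines of Python. This is only a minimal sketch: the function name `sign_test_statistic` and the variable `battery_life` are illustrative choices, not part of the course material.

```python
def sign_test_statistic(data, m0):
    """Return (x, n_eff): the number of positive differences with the
    hypothesized median m0, and the effective sample size n'."""
    diffs = [value - m0 for value in data]
    nonzero = [d for d in diffs if d != 0]   # omit zero differences
    x = sum(1 for d in nonzero if d > 0)     # count the +1 signs
    return x, len(nonzero)


# Battery life data from the example (hours until failure)
battery_life = [342, 426, 317, 545, 264, 451, 1049, 631, 512, 266, 492, 562, 298]

x, n_eff = sign_test_statistic(battery_life, 400)
print(x, n_eff)  # 8 13
```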

THE SIGN TEST
Example (continued):
▪ $x = 8$
▪ under $H_0$: $X \sim \text{bin}(13, 0.5)$
▪ $P_{\text{bin}(13, 0.5)}(X \geq 8) = 0.291$
  ▪ why $\geq 8$?
  ▪ if we would reject for 8, we would also reject for 9, 10, ...
▪ $p$-value: $2 \times 0.291 = 0.581$
  ▪ why $2 \times$?
  ▪ because the alternative hypothesis is two-sided
▪ $p$-value $> \alpha = 0.05$, so there is no reason to reject $H_0$
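The binomial tail probability and the doubled $p$-value can be checked without a table. The sketch below uses only the standard library; the function names are illustrative. Doubling the smaller of the two tails, capped at 1, is one common convention for a two-sided $p$-value. In this example it matches the $2 \times$ rule above.

```python
from math import comb


def binom_upper_tail(x, n, p=0.5):
    """P(X >= x) for X ~ bin(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x, n + 1))


def sign_test_p_value_two_sided(x, n):
    """Two-sided p-value: twice the smaller tail probability, capped at 1."""
    upper = binom_upper_tail(x, n)           # P(X >= x)
    lower = 1 - binom_upper_tail(x + 1, n)   # P(X <= x)
    return min(1.0, 2 * min(upper, lower))


print(round(binom_upper_tail(8, 13), 3))             # 0.291
print(round(sign_test_p_value_two_sided(8, 13), 3))  # 0.581
```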

EXERCISE 1
Suppose we have more observations ($n = 130$) and find $x = 80$. Can you look up $P_{\text{bin}(130, 0.5)}(X \geq 80)$?
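Binomial tables rarely go up to $n = 130$, so one way to approach such a probability is either exact computation or a normal approximation of the binomial. The sketch below shows both. The continuity correction is an assumption of this sketch (a standard refinement), and the function names are illustrative.

```python
from math import comb, erfc, sqrt


def binom_upper_tail_exact(x, n):
    """P(X >= x) for X ~ bin(n, 0.5), using exact integer counts."""
    return sum(comb(n, k) for k in range(x, n + 1)) / 2**n


def binom_upper_tail_normal(x, n):
    """Normal approximation of P(X >= x) for X ~ bin(n, 0.5),
    with continuity correction (an assumption of this sketch)."""
    mean = n / 2
    sd = sqrt(n) / 2
    z = (x - 0.5 - mean) / sd
    return 0.5 * erfc(z / sqrt(2))   # P(Z >= z) for standard normal Z


exact = binom_upper_tail_exact(80, 130)
approx = binom_upper_tail_normal(80, 130)
print(exact, approx)
```

With a sample this large the two numbers should be close, which is why the normal approximation is the usual route when the table runs out.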

THE SIGN TEST
In the sign test, we replace the numerical values by signs ($+$ or $-$)
Advantage:
▪ we don't need any assumption on normality, symmetry, etc.
▪ that's why we say it's non-parametric: we don't have to assume a certain distribution with parameters
Disadvantage:
▪ we discard much information, so that the test is not very sensitive (has low "power"; see later)
Are there other non-parametric tests that are more powerful?
▪ is there a compromise between value and sign that still needs some assumptions, but not too many assumptions?
▪ yes: replacing data by their rank

THE WILCOXON SIGNED RANK TEST
Wilcoxon signed rank test
▪ involves comparing the sum of ranks of the values larger than the test value with the sum of ranks of the values smaller than the test value
Computational steps:
▪ for each data point $x_i$, compute the difference with the median ($M$) of the null hypothesis, $d_i = x_i - M$, and its absolute value $|d_i|$
▪ omit zero differences ($d_i = 0$); the effective sample size is $n'$
▪ assign ranks ($1, \dots, n'$) to the $|d_i|$
▪ reassign $+$ and $-$ to the ranks, according to the sign of $d_i$
▪ the test statistic ($W$) is the sum of the positive ranks

THE WILCOXON SIGNED RANK TEST
Example ($H_0: M = 400$):
▪ data: $x_i$ ($i = 1, \dots, 13$)
▪ difference with $M$: $d_i = x_i - 400$
▪ no cases where $d_i = 0$, so $n' = n$
▪ $w = \sum_{i=1}^{n'} r_i^{+} = 61$
▪ under $H_0$: $W \sim\ ?$ (use table)
▪ $P_{H_0}(W \geq 61) =\ ?$

   x_i   x_i - 400   |x_i - 400|   r_i   r_i^(+)
   342      -58           58        -3
   426       26           26         1      1
   317      -83           83        -4
   545      145          145        10     10
   264     -136          136        -9
   451       51           51         2      2
  1049      649          649        13     13
   631      231          231        12     12
   512      112          112         7      7
   266     -134          134        -8
   492       92           92         5      5
   562      162          162        11     11
   298     -102          102        -6

THE WILCOXON SIGNED RANK TEST
Testing the median using the Wilcoxon $W$ statistic
▪ small samples: using a table of critical values
  ▪ included in tables at exam
▪ large samples: using a normal approximation of $W$
  ▪ valid when $n \geq 20$
▪ The test is only valid for symmetrically distributed populations
  ▪ if not, use sign test

THE WILCOXON SIGNED RANK TEST
Small samples: critical values of Wilcoxon statistic

Lower and Upper Critical Values W of Wilcoxon Signed-Ranks Test (lower , upper)

  one-tail:   α = 0.05    α = 0.025   α = 0.01    α = 0.005
  two-tail:   α = 0.10    α = 0.05    α = 0.02    α = 0.01
  n
   5           0 , 15     --- , ---   --- , ---   --- , ---
   6           2 , 19      0 , 21     --- , ---   --- , ---
   7           3 , 25      2 , 26      0 , 28     --- , ---
   8           5 , 31      3 , 33      1 , 35      0 , 36
   9           8 , 37      5 , 40      3 , 42      1 , 44
  10          10 , 45      8 , 47      5 , 50      3 , 52
  11          13 , 53     10 , 56      7 , 59      5 , 61
  12          17 , 61     13 , 65     10 , 68      7 , 71
  13          21 , 70     17 , 74     12 , 79     10 , 81

Table is available at the exam (and on the course website)
▪ two-sided, $\alpha = 0.05$, $n = 13$: $w_{\text{lower}} = 17$ and $w_{\text{upper}} = 74$
▪ $R_{\text{crit}} = [0, 17] \cup [74, 91]$
▪ $w_{\text{calc}} = 61$, so do not reject $H_0$ at $\alpha = 0.05$
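Where such a table comes from can be illustrated with a short computation. Under $H_0$ (symmetric population, no ties), each rank $1, \dots, n$ is positive with probability $1/2$, independently of the others, so the exact distribution of $W$ follows from counting sign assignments. The sketch below is an illustration of that counting argument, not the method used to build the course table. The function names are illustrative.

```python
def wilcoxon_null_counts(n):
    """counts[w] = number of the 2**n sign assignments of ranks 1..n with W = w."""
    max_w = n * (n + 1) // 2
    counts = [0] * (max_w + 1)
    counts[0] = 1
    for rank in range(1, n + 1):
        # go downwards so that each rank is used at most once
        for w in range(max_w, rank - 1, -1):
            counts[w] += counts[w - rank]
    return counts


def wilcoxon_upper_tail(w, n):
    """Exact P(W >= w) under H0 for an integer w and sample size n."""
    counts = wilcoxon_null_counts(n)
    return sum(counts[w:]) / 2**n


# n = 13, two-sided alpha = 0.05: 74 is the smallest w with P(W >= w) <= 0.025
print(wilcoxon_upper_tail(74, 13) <= 0.025 < wilcoxon_upper_tail(73, 13))  # True
# exact upper-tail probability for the observed w = 61
print(wilcoxon_upper_tail(61, 13))
```

By symmetry of the null distribution around $n(n+1)/4$, the lower critical value follows as $91 - 74 = 17$.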

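THE WILCOXON SIGNED RANK TEST
Large samples: under $H_0$, it can be shown that
▪ $E(W) = \frac{n(n+1)}{4}$
▪ $\operatorname{var}(W) = \frac{n(n+1)(2n+1)}{24}$
Further, for $n \geq 20$, approximately:
▪ $\dfrac{W - \frac{n(n+1)}{4}}{\sqrt{\frac{n(n+1)(2n+1)}{24}}} \sim N(0, 1)$
▪ so you can compute $z_{\text{calc}} = \dfrac{w_{\text{calc}} - \frac{n(n+1)}{4}}{\sqrt{\frac{n(n+1)(2n+1)}{24}}}$
▪ and compare it to $z_{\text{crit}}$ (e.g., $\pm 1.96$)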

THE WILCOXON SIGNED RANK TEST
Example, continued:
(In fact, not a good idea because $n = 13 \ngeq 20$. We do it just to show how it works ...)
▪ $w = \sum_{i=1}^{n'} r_i^{+} = 61$
▪ under $H_0$: $W \sim N(E(W), \operatorname{var}(W))$
▪ so, under $H_0$: $\dfrac{W - E(W)}{\sqrt{\operatorname{var}(W)}} \sim N(0, 1)$
▪ $P_N(W \geq 61) = P\left(\dfrac{W - E(W)}{\sqrt{\operatorname{var}(W)}} \geq \dfrac{61 - 45.5}{14.31}\right) = P(Z \geq 1.08) = 0.1401$
▪ $p$-value: $2 \times 0.1401 = 0.2802$
▪ there is no reason to reject $H_0$
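The same calculation can be sketched in Python using the formulas for $E(W)$ and $\operatorname{var}(W)$. The function name is illustrative. Because the code does not round $z$ to two decimals as a $z$-table lookup does, its $p$-value comes out slightly below 0.2802.

```python
from math import erfc, sqrt


def wilcoxon_normal_approx(w, n):
    """Return (z, upper_tail) for the normal approximation of W under H0."""
    mean = n * (n + 1) / 4                    # E(W)
    variance = n * (n + 1) * (2 * n + 1) / 24  # var(W)
    z = (w - mean) / sqrt(variance)
    upper_tail = 0.5 * erfc(z / sqrt(2))      # P(Z >= z) for standard normal Z
    return z, upper_tail


z, upper_tail = wilcoxon_normal_approx(61, 13)
print(round(z, 2))     # 1.08
print(2 * upper_tail)  # two-sided p-value, roughly 0.28
```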

OLD EXAM QUESTION 23 March 2015, Q1l-m

FURTHER STUDY
▪ Doane & Seward 5/E 16.1-16.3
▪ Tutorial exercises week 3: Wilcoxon signed rank test, sign test
