Nonparametric hypothesis tests and permutation tests




1. Nonparametric hypothesis tests and permutation tests (1.7 & 2.3)
   Probability Generating Functions (3.8.3)
   Wilcoxon Signed Rank Test (3.8.2)
   Mann-Whitney Test
   Prof. Tesler, Math 283, Fall 2018

2. Probability Generating Functions (pgf)

Let Y be an integer-valued random variable with a lower bound (typically $Y \geq 0$). The probability generating function is defined as
$$P_Y(t) = E(t^Y) = \sum_y P_Y(y)\, t^y$$

Simple example: Suppose $P_X(x) = x/10$ for $x = 1, 2, 3, 4$ and $P_X(x) = 0$ otherwise. Then
$$P_X(t) = .1t + .2t^2 + .3t^3 + .4t^4$$

Poisson distribution: Let X be Poisson with mean $\mu$. Then
$$P_X(t) = \sum_{k=0}^{\infty} \frac{e^{-\mu}\mu^k}{k!}\, t^k = \sum_{k=0}^{\infty} \frac{e^{-\mu}(\mu t)^k}{k!} = e^{-\mu}\, e^{\mu t} = e^{\mu(t-1)}$$
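These manipulations are easy to check symbolically. Here is a minimal sketch in Python with sympy (assuming sympy is available; the variable names are mine):

```python
import sympy as sp

t, mu = sp.symbols('t mu', positive=True)
k = sp.symbols('k', integer=True, nonnegative=True)

# Simple example: P_X(x) = x/10 for x = 1..4
P_simple = sum(sp.Rational(x, 10) * t**x for x in range(1, 5))
print(P_simple)             # t/10 + t**2/5 + 3*t**3/10 + 2*t**4/5

# Poisson pgf: sum over k of e^{-mu} mu^k / k! * t^k
P_pois = sp.summation(sp.exp(-mu) * mu**k / sp.factorial(k) * t**k,
                      (k, 0, sp.oo))
print(sp.simplify(P_pois))  # exp(mu*t - mu), i.e. exp(mu*(t - 1))
```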

3. Properties of pgfs

Plugging in t = 1 gives total probability = 1:
$$P_Y(1) = \sum_y P_Y(y) = 1$$

Differentiating and plugging in t = 1 gives $E(Y)$:
$$P'_Y(t) = \sum_y P_Y(y) \cdot y\, t^{y-1} \qquad P'_Y(1) = \sum_y P_Y(y) \cdot y = E(Y)$$

Variance is $\mathrm{Var}(Y) = P''_Y(1) + P'_Y(1) - (P'_Y(1))^2$:
$$P''_Y(t) = \sum_y P_Y(y) \cdot y(y-1)\, t^{y-2}$$
$$P''_Y(1) = \sum_y P_Y(y) \cdot y(y-1) = E(Y(Y-1)) = E(Y^2) - E(Y)$$
$$\mathrm{Var}(Y) = E(Y^2) - (E(Y))^2 = P''_Y(1) + P'_Y(1) - (P'_Y(1))^2$$
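As a sanity check on the derivative tricks, here is a short sympy sketch (illustrative, not part of the slides) that recovers the mean and variance of the simple example $P_X(x) = x/10$ from its pgf:

```python
import sympy as sp

t = sp.symbols('t')
# pgf of the simple example P_X(x) = x/10, x = 1..4
P = sum(sp.Rational(x, 10) * t**x for x in range(1, 5))

mean = sp.diff(P, t).subs(t, 1)                     # P'(1) = E(X) = 3
var = sp.diff(P, t, 2).subs(t, 1) + mean - mean**2  # P''(1) + P'(1) - P'(1)^2
print(mean, var)                                    # 3, 1
```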

4. Example of pgf properties: Poisson

Properties:
$$P_Y(t) = \sum_y P_Y(y)\, t^y \qquad P_Y(1) = 1 \qquad E(Y) = P'_Y(1)$$
$$\mathrm{Var}(Y) = E(Y^2) - (E(Y))^2 = P''_Y(1) + P'_Y(1) - (P'_Y(1))^2$$

For X Poisson with mean $\mu$, we saw $P_X(t) = e^{\mu(t-1)}$.

$P_X(1) = e^{\mu(1-1)} = e^0 = 1$.

$P'_X(t) = \mu e^{\mu(t-1)}$ and $P'_X(1) = \mu e^{\mu(1-1)} = \mu$. Indeed, $E(X) = \mu$ for Poisson.

$P''_X(t) = \mu^2 e^{\mu(t-1)}$, so $P''_X(1) = \mu^2 e^{\mu(1-1)} = \mu^2$, and
$$\mathrm{Var}(X) = P''_X(1) + P'_X(1) - (P'_X(1))^2 = \mu^2 + \mu - \mu^2 = \mu.$$
Indeed, $\mathrm{Var}(X) = \mu$ for Poisson.
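The same derivative check works symbolically for the Poisson pgf (again a sketch assuming sympy):

```python
import sympy as sp

t, mu = sp.symbols('t mu', positive=True)
P = sp.exp(mu * (t - 1))      # Poisson pgf

mean = sp.diff(P, t).subs(t, 1)                     # mu
var = sp.diff(P, t, 2).subs(t, 1) + mean - mean**2  # mu^2 + mu - mu^2 = mu
print(mean, sp.simplify(var))                       # mu, mu
```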

5. Probability generating function of X + Y

Consider adding rolls of two biased dice together:
X = roll of a biased 3-sided die
Y = roll of a biased 5-sided die

$P(X+Y=2) = P_X(1)P_Y(1)$
$P(X+Y=3) = P_X(1)P_Y(2) + P_X(2)P_Y(1)$
$P(X+Y=4) = P_X(1)P_Y(3) + P_X(2)P_Y(2) + P_X(3)P_Y(1)$
$P(X+Y=5) = P_X(1)P_Y(4) + P_X(2)P_Y(3) + P_X(3)P_Y(2)$
$P(X+Y=6) = P_X(1)P_Y(5) + P_X(2)P_Y(4) + P_X(3)P_Y(3)$
$P(X+Y=7) = P_X(2)P_Y(5) + P_X(3)P_Y(4)$
$P(X+Y=8) = P_X(3)P_Y(5)$

6. Probability generating function of X + Y

$P_X(t) = P_X(1)\,t + P_X(2)\,t^2 + P_X(3)\,t^3$
$P_Y(t) = P_Y(1)\,t + P_Y(2)\,t^2 + P_Y(3)\,t^3 + P_Y(4)\,t^4 + P_Y(5)\,t^5$

$$\begin{aligned}
P_X(t)\,P_Y(t) &= P_X(1)P_Y(1)\, t^2 \\
&+ \big(P_X(1)P_Y(2) + P_X(2)P_Y(1)\big)\, t^3 \\
&+ \big(P_X(1)P_Y(3) + P_X(2)P_Y(2) + P_X(3)P_Y(1)\big)\, t^4 \\
&+ \big(P_X(1)P_Y(4) + P_X(2)P_Y(3) + P_X(3)P_Y(2)\big)\, t^5 \\
&+ \big(P_X(1)P_Y(5) + P_X(2)P_Y(4) + P_X(3)P_Y(3)\big)\, t^6 \\
&+ \big(P_X(2)P_Y(5) + P_X(3)P_Y(4)\big)\, t^7 \\
&+ P_X(3)P_Y(5)\, t^8 \\
&= P(X+Y=2)\, t^2 + \cdots + P(X+Y=8)\, t^8 = P_{X+Y}(t)
\end{aligned}$$
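Multiplying pgfs is polynomial multiplication, i.e. convolution of the coefficient arrays. A small numpy sketch; the specific die probabilities below are made-up values for illustration:

```python
import numpy as np

# Coefficient arrays indexed by face value; index 0 is the t^0 coefficient.
pX = np.array([0, 0.2, 0.3, 0.5])            # biased 3-sided die: faces 1..3
pY = np.array([0, 0.1, 0.2, 0.3, 0.2, 0.2])  # biased 5-sided die: faces 1..5

pXY = np.convolve(pX, pY)  # coefficients of P_X(t) * P_Y(t) = P_{X+Y}(t)
for s, prob in enumerate(pXY):
    if prob > 0:
        print(f"P(X+Y={s}) = {prob:.3f}")    # s runs over 2..8
```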

7. Probability generating function of X + Y

Suppose X and Y are independent random variables. Then
$$P_{X+Y}(t) = P_X(t) \cdot P_Y(t)$$

Proof: $P_{X+Y}(t) = E(t^{X+Y}) = E(t^X t^Y) = E(t^X)\,E(t^Y) = P_X(t)\,P_Y(t)$, where the middle step uses independence. ∎

Second proof:
$$P_X(t) \cdot P_Y(t) = \Big(\sum_x P(X=x)\, t^x\Big)\Big(\sum_y P(Y=y)\, t^y\Big)$$
Multiply that out and collect by powers of t. The coefficient of $t^w$ is $\sum_x P(X=x)\, P(Y=w-x)$. Since X, Y are independent, this simplifies to $P(X+Y=w)$, which is the coefficient of $t^w$ in $P_{X+Y}(t)$. ∎

8. Binomial distribution

Suppose $X_1, \ldots, X_n$ are i.i.d. with $P(X_i = 1) = p$, $P(X_i = 0) = 1-p$ (Bernoulli distribution).
$$P_{X_i}(t) = (1-p)t^0 + pt^1 = 1 - p + pt$$

The Binomial(n, p) distribution is $X = X_1 + \cdots + X_n$.
$$P_X(t) = P_{X_1}(t) \cdots P_{X_n}(t) = (1 - p + pt)^n$$

Check:
$$((1-p) + pt)^n = \sum_{k=0}^n \binom{n}{k} (1-p)^{n-k} p^k\, t^k = \sum_{k=0}^n P_Y(k)\, t^k$$
where Y is the Binomial(n, p) distribution.

Note: If X and Y have the same pgf, then they have the same distribution.
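A quick numerical check (a sketch; the choices n = 5, p = 0.3 are arbitrary) that convolving n copies of the Bernoulli coefficient array reproduces the Binomial(n, p) pmf:

```python
import numpy as np
from scipy.stats import binom

n, p = 5, 0.3
bern = np.array([1 - p, p])   # coefficients of 1 - p + p*t

coeffs = np.array([1.0])      # pgf of an empty sum is 1
for _ in range(n):
    coeffs = np.convolve(coeffs, bern)   # multiply pgfs

print(np.allclose(coeffs, binom.pmf(np.arange(n + 1), n, p)))  # True
```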

9. Moment generating function (mgf) in Chapters 1.1 & 2.3

Let Y be a continuous or discrete random variable. The moment generating function (mgf) is $M_Y(\theta) = E(e^{\theta Y})$.

Discrete: same as the pgf with $t = e^\theta$, and not just for integer-valued variables:
$$M_Y(\theta) = \sum_y P_Y(y)\, e^{\theta y}$$

Continuous: it's essentially the "2-sided Laplace transform" of $f_Y(y)$:
$$M_Y(\theta) = \int_{-\infty}^{\infty} f_Y(y)\, e^{\theta y}\, dy$$

The derivative tricks for the pgf have analogues for the mgf:
$$\frac{d^k}{d\theta^k} M_Y(\theta) = E(Y^k e^{\theta Y}) \qquad M_Y^{(k)}(0) = E(Y^k) = k\text{th moment of } Y$$
$M_Y(0) = E(1) = 1$ (total probability)
$M'_Y(0) = E(Y)$ (mean)
$M''_Y(0) = E(Y^2)$, so $\mathrm{Var}(Y) = M''_Y(0) - (M'_Y(0))^2$
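As an illustration of the mgf derivative tricks on a continuous variable, here is a sympy sketch for an Exponential(λ) variable (this distribution is my choice, not from the slides):

```python
import sympy as sp

y, theta = sp.symbols('y theta')
lam = sp.symbols('lambda', positive=True)

# Exponential(lambda) density on y >= 0
f = lam * sp.exp(-lam * y)

# mgf: integrate f(y) e^{theta y} over the support (converges for theta < lambda;
# conds='none' asks sympy for the formal result without convergence conditions)
M = sp.integrate(f * sp.exp(theta * y), (y, 0, sp.oo),
                 conds='none')                        # lambda/(lambda - theta)
mean = sp.diff(M, theta).subs(theta, 0)               # 1/lambda
var = sp.diff(M, theta, 2).subs(theta, 0) - mean**2   # 1/lambda^2
print(sp.simplify(M), mean, sp.simplify(var))
```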

10. Non-parametric hypothesis tests

Parametric hypothesis tests assume the random variable has a specific probability distribution (normal, binomial, geometric, ...). The competing hypotheses both assume the same type of distribution but with different parameters.

A distribution-free hypothesis test (a.k.a. non-parametric hypothesis test) doesn't assume any particular type of distribution, so it can be applied even if the distribution isn't known.

If the type of distribution is known, a parametric test that takes it into account can be more precise (smaller Type II error for the same Type I error) than a non-parametric test that doesn't.

11. Wilcoxon Signed Rank Test

Let X be a continuous random variable with a symmetric distribution. Let M be the median of X: $P(X > M) = P(X < M) = 1/2$, or $F_X(M) = .5$.

Note that if the pdf of X is symmetric, the median equals the mean. If it's not symmetric, they usually are not equal.

We will develop a test for $H_0: M = M_0$ vs. $H_1: M \neq M_0$ (or $M < M_0$ or $M > M_0$) based on analyzing a sample $x_1, \ldots, x_n$ of data.

Example: If U, V are independent with the same distribution, then X = U − V has a symmetric distribution centered around its median, 0.

[Figure: two density plots; left, the symmetric pdf of x = u − v, centered at 0; right, the asymmetric pdf of u.]

12. Computing the Wilcoxon test statistic

Is median $M_0 = 5$ plausible, given data 1.1, 8.2, 2.3, 4.4, 7.5, 9.6?

Get a sample $x_1, \ldots, x_n$: 1.1, 8.2, 2.3, 4.4, 7.5, 9.6. Compute the following:
- Compute each $x_i - M_0$.
- Order the $|x_i - M_0|$ from smallest to largest and assign ranks 1, 2, ..., n (1 = smallest, n = largest).
- Let $r_i$ be the rank of $|x_i - M_0|$, and let $z_i = 0$ if $x_i - M_0 < 0$ and $z_i = 1$ if $x_i - M_0 > 0$. (Note: since X is continuous, $P(X - M_0 = 0) = 0$.)
- Compute the test statistic $w = z_1 r_1 + \cdots + z_n r_n$ (the sum of the $r_i$'s with $x_i > M_0$).

Here n = 6 and the $|x_i - M_0|$ in order are 0.6, 2.5, 2.7, 3.2, 3.9, 4.6:

 i | x_i | x_i - M_0 | sign | r_i | z_i
---|-----|-----------|------|-----|----
 1 | 1.1 |   -3.9    |  -   |  5  |  0
 2 | 8.2 |    3.2    |  +   |  4  |  1
 3 | 2.3 |   -2.7    |  -   |  3  |  0
 4 | 4.4 |   -0.6    |  -   |  1  |  0
 5 | 7.5 |    2.5    |  +   |  2  |  1
 6 | 9.6 |    4.6    |  +   |  6  |  1

w = 4 + 2 + 6 = 12
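This recipe is easy to script. A minimal Python sketch using scipy's rankdata for the ranks (the helper name wilcoxon_w is mine):

```python
import numpy as np
from scipy.stats import rankdata

def wilcoxon_w(x, m0):
    """Wilcoxon signed-rank statistic: sum of the ranks of |x_i - m0|
    over the observations with x_i > m0 (assumes no x_i equals m0)."""
    d = np.asarray(x) - m0
    ranks = rankdata(np.abs(d))   # ranks 1..n of |x_i - m0|
    return ranks[d > 0].sum()

x = [1.1, 8.2, 2.3, 4.4, 7.5, 9.6]
print(wilcoxon_w(x, 5))  # 12.0
```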

13. Computing the pdf of W

The variable whose rank is i contributes either 0 or i to W. Under the null hypothesis, both of those have probability 1/2. Call this contribution $W_i$ (either 0 or i, each with probability 1/2). Then
$$W = W_1 + \cdots + W_n$$
The $W_i$'s are independent because the signs are independent.

The pgf of $W_i$ is
$$P_{W_i}(t) = E(t^{W_i}) = \tfrac{1}{2} t^0 + \tfrac{1}{2} t^i = \frac{1 + t^i}{2}$$

The pgf of W is
$$P_W(t) = P_{W_1 + \cdots + W_n}(t) = P_{W_1}(t) \cdots P_{W_n}(t) = 2^{-n} \prod_{i=1}^n (1 + t^i)$$

Expand the product. The coefficient of $t^w$ is $P(W = w)$, the pdf of W.
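Expanding the product is again repeated polynomial convolution. A sketch that tabulates the exact null distribution of W for n = 6 and, as an illustration, the upper-tail probability P(W ≥ 12) for the statistic computed above:

```python
import numpy as np

def w_null_pmf(n):
    """Exact null pmf of W: expand 2^{-n} * prod_{i=1}^{n} (1 + t^i)."""
    coeffs = np.array([1.0])
    for i in range(1, n + 1):
        factor = np.zeros(i + 1)
        factor[0] = factor[i] = 0.5   # coefficients of (1 + t^i) / 2
        coeffs = np.convolve(coeffs, factor)
    return coeffs                     # coeffs[w] = P(W = w), w = 0..n(n+1)/2

pmf = w_null_pmf(6)
print(pmf.sum())        # 1.0 (total probability)
print(pmf[12:].sum())   # P(W >= 12) under H0 = 27/64, about 0.42
```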

