Lecture 4: Permutation Methods. Applied Statistics 2014

Outline: Randomization Model, Population Model, Rank Tests, Assignment

Permutation Methods
Non-parametric methods for testing differences among samples (or groups).
These tests can serve as alternatives to some classical tests, such as two-sample t-tests and ANOVA tests.
First introduced in Fisher (1935) and Pitman (1937).
There are two typical settings:
  Randomization Model: randomization tests
  Population Model: permutation tests
They provide a unified framework for rank-based tests such as the Wilcoxon rank test.
They are computationally intensive.

Randomization Model
Basis: subjects are randomly assigned to different treatments (the usual practice in medicine).
The only random aspect of the model is the assignment of treatments.
Inference is limited to the subjects under study. There is no population.

Randomization Model - Example
Example (Ernst (2004)). A new treatment for post-surgical recovery is compared with a standard treatment.
Of the $n$ subjects available for the study, $n_1$ are randomly assigned to receive the new treatment, while the remaining $n_2 = n - n_1$ receive the standard treatment.
The corresponding recovery times (in days) are recorded: $X_1, \dots, X_{n_1}$ and $Y_1, \dots, Y_{n_2}$ for the new and standard treatments, respectively.
$H_0$: There is no difference between the treatments.
$H_a$: The new treatment decreases the recovery times.
Test statistic: $T_d = \bar{X} - \bar{Y}$.

Randomization Model - Example
Specifically, $n = 7$, $n_1 = 4$ and $n_2 = 3$, with
$(x_1, x_2, x_3, x_4) = (19, 22, 25, 26)$, $(y_1, y_2, y_3) = (23, 33, 40)$,
$t_d = \bar{x} - \bar{y} = 23 - 32 = -9$.
How do we compute the p-value?
The only random aspect is the random assignment of treatments. So if $H_0$ is true, the recovery time of each subject is the same regardless of which treatment is received.
Under $H_0$, the distribution of $T_d$ is obtained by permuting the values of the $x_i$'s and $y_i$'s.

Randomization Model - Example
There are in total $\binom{7}{4} = 35$ equally likely randomizations.

  i   X1  X2  X3  X4   Y1  Y2  Y3     t_i
  1   19  22  25  26   23  33  40   -9.00
  2   22  23  25  26   19  33  40   -6.67
  3   22  33  25  26   19  23  40   -0.83
  4   22  25  26  40   19  23  33    3.25
 ...
 35   19  23  33  40   22  25  26    4.42

The p-value is given by
$p = P_{H_0}(T_d \le t_d) = \frac{1}{35} \sum_{i=1}^{35} I(t_i \le t_d) = \frac{3}{35} \approx 0.0857$.
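The enumeration on this slide can be reproduced with a short script. The following is a minimal sketch, not the lecturer's code: the function names `mean_diff` and `randomization_p_value` are illustrative, and exact fractions are used so that ties with the observed value are counted reliably.

```python
from fractions import Fraction
from itertools import combinations

new = [19, 22, 25, 26]    # recovery times, new treatment (from the slide)
standard = [23, 33, 40]   # recovery times, standard treatment (from the slide)

def mean_diff(x, y):
    """Difference of sample means, kept exact with Fraction."""
    return Fraction(sum(x), len(x)) - Fraction(sum(y), len(y))

def randomization_p_value(x, y):
    """Lower-tail p-value P(T_d <= t_d) over all equally likely assignments."""
    pooled = x + y
    t_obs = mean_diff(x, y)
    count = total = 0
    # Each choice of len(x) positions is one possible treatment assignment.
    for idx in combinations(range(len(pooled)), len(x)):
        chosen = set(idx)
        xs = [pooled[i] for i in idx]
        ys = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        count += mean_diff(xs, ys) <= t_obs
    return Fraction(count, total), total

p, n_assignments = randomization_p_value(new, standard)
print(n_assignments, p, float(p))
```

Enumerating index positions rather than values keeps repeated observations distinct, which matters when the data contain ties.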

Randomization Model - Example
[Figure: Reference distribution. Randomisation distribution of t; horizontal axis t, vertical axis probability.]

Randomization Model - Remarks
Since the subjects are not randomly chosen, the conclusion cannot be generalized beyond the subjects under study.

Population Model
Suppose there are two independent random samples: $X_1, \dots, X_{n_1}$ and $Y_1, \dots, Y_{n_2}$.
$H_0: X_1 \overset{d}{=} Y_1$ versus $H_1: E(X_1) > E(Y_1)$.
Test statistic: $T = \bar{X}_{n_1} - \bar{Y}_{n_2}$. We reject $H_0$ for large values of $T$.
Under $H_0$, the reference distribution of $T$ is obtained in the same way as in the randomization model.

Population Model
Let $n = n_1 + n_2$. Define $Z_1 = X_1, \dots, Z_{n_1} = X_{n_1}, Z_{n_1+1} = Y_1, \dots, Z_n = Y_{n_2}$, and denote the observed values by $(z_1, \dots, z_n)$. Then
$T = \frac{1}{n_1} \sum_{i=1}^{n_1} Z_i - \frac{1}{n_2} \sum_{i=1}^{n_2} Z_{i+n_1}$.
Under $H_0$, the $Z_i$'s are iid. Define the event
$E = \{ (Z_1, \dots, Z_n) = (z_{pe(1)}, \dots, z_{pe(n)}) \text{ for some permutation } pe \}$.
Then for any permutation $\tilde{p}$,
$P_{H_0}\big( (Z_1, \dots, Z_n) = (z_{\tilde{p}(1)}, \dots, z_{\tilde{p}(n)}) \mid E \big) = \frac{1}{n!}$.
Let $t_i = \frac{1}{n_1} \sum_{j=1}^{n_1} z_{\tilde{p}(j)} - \frac{1}{n_2} \sum_{j=1}^{n_2} z_{\tilde{p}(j+n_1)}$ be the value of the statistic for the $i$-th split of the observations into groups of sizes $n_1$ and $n_2$. Each split corresponds to $n_1! \, n_2!$ permutations, so (when the values $t_i$ are distinct)
$P_{H_0}(T = t_i \mid E) = \frac{n_1! \, n_2!}{n!} = \frac{1}{\binom{n}{n_1}}$.

Population Model
We obtain the conditional sample of $T$: $\{ t_1, \dots, t_m \}$, where $m = \binom{n}{n_1}$.
Write $t = \bar{x}_{n_1} - \bar{y}_{n_2}$. The p-value is given by
$\frac{\#\{ i : t_i \ge t \}}{\binom{n}{n_1}}$.
Note that the p-value is at least $\frac{1}{\binom{n}{n_1}}$, because the observed value $t$ is itself one of the $t_i$.

Population Model
Lemma. Let $m = \binom{n}{n_1}$. Suppose the significance level is $\alpha = \frac{k}{m}$ and we take $[t_{(m-k+1)}, \infty)$ as the critical region, where $t_{(m-k+1)}$ is the $k$-th largest value of the $t_i$'s. Then, provided the $t_i$ are distinct, the permutation test is exact, that is,
$P_{H_0}(T \ge t_{(m-k+1)} \mid E) = \alpha$.
Note that the critical value $t_{(m-k+1)}$ is a random cut, as it depends on the data (the observations).
It is a conditional test, as it generates the permutation distribution of $T$ conditional on the observed values.
Conditional on the observed values, the permutation distribution of $T$ does not depend on the underlying population distributions $F$ and $G$. Hence, the test is distribution free.

Population Model - Remarks
The basic idea is to generate a reference distribution by recalculating a statistic for many permutations of the data.
Not all statistics can be used in permutation methods.
Suppose $X \sim N(\mu_1, \sigma_1^2)$ and $Y \sim N(\mu_2, \sigma_2^2)$. Based on two independent samples of sizes $m$ and $n$, we want to test $H_0: \mu_1 = \mu_2$. If the variances are unknown, and hence not necessarily equal, consider the t-statistic
$T = \frac{\bar{X}_m - \bar{Y}_n}{\sqrt{S_X^2 / m + S_Y^2 / n}}$.
The distribution of $T$ is not invariant under permutation: when $\sigma_1^2 \ne \sigma_2^2$, the $X$'s and $Y$'s are not identically distributed even under $H_0$, so the pooled observations are not exchangeable.

Population Model - Remarks
Exhaustively computing all permutations is infeasible for large values of $n_1$ and $n_2$. For instance, if $n_1 = n_2 = 15$,
$\binom{30}{15} > 155$ million.
We can use Monte Carlo methods to estimate the p-value.
Generate $B$ samples from the permutation distribution. The function boot in the R package boot can be useful for this purpose.
Approximate the p-value by its sample counterpart:
$\hat{p} = \frac{1 + \sum_{i=1}^{B} I(t_i \ge t)}{1 + B}$.
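The Monte Carlo estimate above can be sketched in a few lines. The slide points to the R function boot; the sketch below is a plain Python alternative using only the standard library, and the names `monte_carlo_p_value`, `B` and `seed` are illustrative assumptions rather than part of the lecture.

```python
import math
import random

def mean_diff(x, y):
    """Test statistic T: difference of the two sample means."""
    return sum(x) / len(x) - sum(y) / len(y)

def monte_carlo_p_value(x, y, B=9999, seed=0):
    """Estimate the upper-tail permutation p-value from B random shuffles."""
    rng = random.Random(seed)        # fixed seed so the sketch is repeatable
    pooled = list(x) + list(y)
    n1 = len(x)
    t_obs = mean_diff(x, y)
    hits = 0
    for _ in range(B):
        rng.shuffle(pooled)          # one draw from the permutation distribution
        if mean_diff(pooled[:n1], pooled[n1:]) >= t_obs:
            hits += 1
    # The "+1" terms count the observed arrangement itself.
    return (1 + hits) / (1 + B)

# Number of distinct splits when n1 = n2 = 15, as quoted on the slide.
print(math.comb(30, 15))

# Illustration on the small recovery-time data, where the exact answer is known.
print(monte_carlo_p_value([23, 33, 40], [19, 22, 25, 26], B=999, seed=1))
```

With the "+1" correction the estimate can never be zero, which mirrors the lower bound on the exact p-value.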

Population Model - Example
Byzantine coins. This is Example 15.6 in Kvam and Vidakovic (2007).
Researchers investigated the silver content (% Ag) of a number of Byzantine coins discovered in Cyprus. The coins are from the first and fourth coinage in the reign of King Manuel I Comnenus (1143-1180). Based on the following data, we want to test whether there is a significant difference between the two coinages in terms of silver content.
Coins from the first coinage ($X$): (5.9, 6.8, 6.4, 7.0, 6.6, 7.7, 7.2, 6.9, 6.2)
Coins from the fourth coinage ($Y$): (5.3, 5.6, 5.5, 5.1, 6.2, 5.8, 5.8)
$H_0: X \overset{d}{=} Y$ versus $H_1: E(X) \ne E(Y)$.
This is a two-sided alternative.

Population Model - Example
We choose the test statistic $T = \bar{X} - \bar{Y}$. Note that $n_1 = 9$ and $n_2 = 7$.
For each of the $\binom{16}{9} = 11440 =: m$ permutations, we calculate the value $t_i$.
[Figure: Permutation distribution, observed value in blue. Horizontal axis T, vertical axis density.]
The observed test statistic is $t = 1.13$. Let $\bar{t} = \frac{1}{m} \sum_{i=1}^{m} t_i$ be the mean of the permutation distribution. We define the two-sided p-value as
$p = \frac{1}{m} \sum_{i=1}^{m} I\big( |t_i - \bar{t}| \ge |t - \bar{t}| \big) = 0.000699$.
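The two-sided p-value for the coin data can be checked by enumerating all 11440 splits. This is a minimal sketch under the definitions on the slide; the function name `two_sided_permutation_p` is illustrative, and the data are entered as decimal strings so that `Fraction` keeps every comparison exact.

```python
from fractions import Fraction
from itertools import combinations

first = ["5.9", "6.8", "6.4", "7.0", "6.6", "7.7", "7.2", "6.9", "6.2"]
fourth = ["5.3", "5.6", "5.5", "5.1", "6.2", "5.8", "5.8"]

def two_sided_permutation_p(x, y):
    """Return (t_obs, number of extreme splits, total number of splits)."""
    x = [Fraction(v) for v in x]
    y = [Fraction(v) for v in y]
    pooled = x + y
    n1, n2 = len(x), len(y)
    total = sum(pooled)

    def stat(sum_x):
        # T = mean of the first group minus mean of the second group.
        return sum_x / n1 - (total - sum_x) / n2

    t_obs = stat(sum(x))
    # combinations() works by position, so tied values stay distinct.
    t_all = [stat(sum(c)) for c in combinations(pooled, n1)]
    t_bar = sum(t_all) / len(t_all)
    extreme = sum(abs(t - t_bar) >= abs(t_obs - t_bar) for t in t_all)
    return t_obs, extreme, len(t_all)

t_obs, extreme, m = two_sided_permutation_p(first, fourth)
print(float(t_obs), extreme, m, extreme / m)
```

Exact arithmetic is a deliberate choice here: with floating point, splits that tie with the observed value could fall on either side of the `>=` comparison.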

Wilcoxon/Mann-Whitney test
Suppose $X_1, \dots, X_{n_1} \overset{iid}{\sim} F_X$ and $Y_1, \dots, Y_{n_2} \overset{iid}{\sim} F_Y$. Both $F_X$ and $F_Y$ are continuous.
$H_0: F_X = F_Y$ versus $H_1: F_X < F_Y$.
Under $H_1$, $X_1$ is stochastically larger than $Y_1$.
Let $(R_1, \dots, R_{n_1+n_2})$ be the ranks of the pooled sample $(X_1, \dots, X_{n_1}, Y_1, \dots, Y_{n_2})$. So $R_1$ is the rank of $X_1$ among all $n = n_1 + n_2$ observations.
Wilcoxon's test statistic is $T = \sum_{i=1}^{n_1} R_i$. We reject $H_0$ for large values of $T$.
What is the reference distribution of $T$ under $H_0$?

Wilcoxon/Mann-Whitney test
Under $H_0$, we have:
$(R_1, \dots, R_{n_1})$ is a random sample without replacement from $\{ 1, 2, \dots, n_1 + n_2 \}$;
the distribution of $T$ is known and does NOT depend on $F_X$ ($= F_Y$).
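Because the ranks of the first sample behave under $H_0$ like a draw without replacement from $\{1, \dots, n\}$, the null distribution of $T$ can be tabulated by enumerating rank subsets. The sketch below assumes no ties (consistent with continuous $F_X$ and $F_Y$); the function names are illustrative, and the recovery-time data from the earlier example are reused only as an illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

def rank_sum_null_distribution(n1, n2):
    """Exact null distribution of T, the sum of the n1 ranks of the X sample."""
    n = n1 + n2
    # Every n1-subset of {1, ..., n} is equally likely under H0.
    counts = Counter(sum(c) for c in combinations(range(1, n + 1), n1))
    total = sum(counts.values())
    return {t: Fraction(k, total) for t, k in sorted(counts.items())}

def rank_sum(x, y):
    """Observed rank sum of x within the pooled sample (assumes no ties)."""
    pooled = sorted(x + y)
    rank = {v: i + 1 for i, v in enumerate(pooled)}
    return sum(rank[v] for v in x)

dist = rank_sum_null_distribution(4, 3)
t_obs = rank_sum([19, 22, 25, 26], [23, 33, 40])
lower_tail = sum(p for t, p in dist.items() if t <= t_obs)
print(t_obs, lower_tail)
```

The table depends only on $n_1$ and $n_2$, not on the data values, which is the distribution-free property stated on the slide. A lower tail is used in the illustration because shorter recovery times correspond to smaller ranks; for the alternative on the previous slide one would sum the upper tail instead.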
