SLIDE 1
P-values, Randomization Tests, and Nonparametric Combinations of Tests
Tonix Virtual Retreat
Philip B. Stark 22 October 2020
University of California, Berkeley
SLIDE 2 Randomized experiments
- Subjects recruited at one or more centers
- Criteria to ensure they have the condition
- Randomized to treatment/control or treatment level, sometimes w/ constraints or “bias” to get balance.
- Randomization algorithms often proprietary
SLIDE 3 Analyzing the data
- Common to use things like ANOVA, t-tests, regression, logistic regression
- Assumptions generally have nothing to do with the experiment
SLIDE 4 Small example
11 pairs of rats, each pair from the same litter. Randomly (by coin toss) put one of each pair into an “enriched” environment; the other sib gets a “normal” environment. After 65 days, measure cortical mass (mg):

  enriched:      689  656  668  660  679  663  664  647  694  633  653
  impoverished:  657  623  652  654  658  646  600  640  605  635  642
  difference:     32   33   16    6   21   17   64    7   89   -2   11

Cartoon of Rosenzweig, M.R., E.L. Bennett, and M.C. Diamond, 1972. Brain changes in response to experience, Scientific American, 226, 22–29.
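These differences are the raw material for all the tests that follow, so it is worth checking the summary statistics directly. A minimal sketch in plain Python, with the data transcribed from the slide:

```python
from statistics import mean, stdev

enriched     = [689, 656, 668, 660, 679, 663, 664, 647, 694, 633, 653]
impoverished = [657, 623, 652, 654, 658, 646, 600, 640, 605, 635, 642]

# paired difference for each littermate pair (note one negative difference)
diffs = [e - i for e, i in zip(enriched, impoverished)]
```

The mean and sample SD of `diffs` match the values quoted later on the Student t-test slide (26.73 mg and 27.33 mg).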
SLIDE 5
Informal hypotheses
Null hypothesis: treatment has “no effect.”
Alternative hypothesis: treatment increases cortical mass.
Suggests a 1-sided test for an increase.
SLIDE 8 Test contenders
- 2-sample Student t-test:
  (mean(treatment) − mean(control)) / (pooled estimate of the SD of the difference of means)
- 1-sample Student t-test on the differences:
  mean(differences) / (SD(differences)/√11)
- randomization test using the t-statistic of the differences: same statistic, calibrate the probability differently
SLIDE 12 The Neyman “ticket” model (1930)
- S subjects, T treatments
- subject s represented by a ticket with T numbers on it, xs1, . . . , xsT, set before treatment is assigned (but unknown to the experimenter):

  resp to tx 1   resp to tx 2   · · ·   resp to tx T
  4              9.2            · · ·   −3.33

- xst is the response subject s will have if assigned treatment t
- if subject s is assigned to treatment t, observe xst
- no necessary connection of the numbers across subjects
- no assumption about the distribution of the numbers
- “non-interference” implicit
SLIDE 13 Generalizations
- subject s represented by a ticket with T J-vectors on it, xs1, . . . , xsT
- if subject s is assigned treatment ts, observe the vector xsts

  item   resp to tx 1   resp to tx 2   · · ·   resp to tx T
  1      4              9.2            · · ·
  2      2              1              · · ·   17
  ⋮      ⋮              ⋮                      ⋮
  J      5              42             · · ·   9
SLIDE 14 More generalizations
- subject s represented by a ticket with T probability distributions on it, Fs1, . . . , FsT
- if subject s is assigned treatment t, observe a draw from Fst
- Fst could be a multivariate distribution

  resp to tx 1   resp to tx 2   · · ·   resp to tx T
  F11(·)         F12(·)         · · ·   F1T(·)
SLIDE 15
Generic notation
xst could be a scalar, a vector, or a realization of a random variable or random vector.
ψ(·) is a test statistic: it maps the data x to a scalar.
SLIDE 18 The strong null hypothesis
- “treatment doesn’t matter at all”
- subject s’s response would have been the same, no matter what treatment was assigned
- xs1 = xs2 = · · · = xsT
- (but xst is not necessarily equal to xrt for r ≠ s)

  resp to tx 1   resp to tx 2   · · ·   resp to tx T
  4              4              · · ·   4
SLIDE 19
- if the null is true, we know what would have been observed if the random assignment had been different: every subject would have had the same response
- induces a null distribution for any test statistic ψ
- completely determined by the randomization: no additional assumptions
SLIDE 20
The rats: strong null
Treatment has no effect, as if each rat’s cortical mass was determined before randomization. Then it is equally likely that the rat with the heavier cortex is assigned to treatment or to control, independently across littermate pairs. Gives 2^11 = 2048 equally likely possibilities:
±32 ±33 ±16 ±6 ±21 ±17 ±64 ±7 ±89 ±2 ±11
SLIDE 21 Alternative hypotheses
- 1. Individual’s response depends only on that individual’s assignment
- Special cases: shift, scale, etc.
- 2. Interactions/Interference: my response could depend on your treatment
SLIDE 22 Assumptions of the tests
- 1. 2-sample t-test:
  - masses are an iid sample from a normal distribution, same unknown variance, same unknown mean.
  - Tests the “weak” null hypothesis (plus normality, independence, non-interference, etc.).
- 2. 1-sample t-test on the differences:
  - mass differences are an iid sample from a normal distribution, unknown variance, zero mean.
  - Tests the “weak” null hypothesis (plus normality, independence, non-interference, etc.).
- 3. randomization test:
  - randomization performed as claimed.
  - tests the strong null hypothesis.
Assumptions of the randomization test are true by fiat.
SLIDE 23
Student t-test calculations
Mean of differences: 26.73 mg
Sample SD of differences: 27.33 mg
t-statistic: 3.244 ≡ t0
P-value for 2-sided t-test: 0.0088
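The t-statistic on the slide can be reproduced with the standard library alone; the two-sided P-value would then come from the t distribution with 10 degrees of freedom (e.g., via `scipy.stats.ttest_1samp`, if SciPy is available). A sketch:

```python
from math import sqrt
from statistics import mean, stdev

diffs = [32, 33, 16, 6, 21, 17, 64, 7, 89, -2, 11]

# 1-sample t-statistic on the differences: mean / (SD / sqrt(n))
t0 = mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))
```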
SLIDE 24
- Why do cortical weights have a normal distribution?
- Why is the variance of the difference between treatment and control the same for different litters?
- Treatment and control are dependent because assigning a rat to treatment excludes it from the control group, and vice versa.
- The P-value depends on assuming the differences are an iid sample from a normal distribution.
- If we reject the null, is that because there is a treatment effect, or because the other assumptions are wrong?
SLIDE 25
Randomization t-test calculations
Could enumerate all 2^11 = 2,048 equally likely possibilities. Calculate the t-statistic for each. The P-value is (# possibilities s.t. t ≥ t0)/2048 ≈ 0.0018.
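The enumeration is small enough to do exhaustively. A sketch of the sign-flip enumeration under the strong null (each pair's difference is ± its observed magnitude, all 2^11 assignments equally likely):

```python
from itertools import product
from math import sqrt

diffs = [32, 33, 16, 6, 21, 17, 64, 7, 89, -2, 11]
n = len(diffs)

def t_stat(xs):
    # 1-sample t-statistic: mean / (sample SD / sqrt(n))
    m = sum(xs) / len(xs)
    var = sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    return m / sqrt(var / len(xs))

t0 = t_stat(diffs)
mags = [abs(d) for d in diffs]
# count sign assignments whose t-statistic is at least the observed one
hits = sum(
    t_stat([s * a for s, a in zip(signs, mags)]) >= t0
    for signs in product((1, -1), repeat=n)
)
p = hits / 2 ** n
```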
SLIDE 26
SLIDE 27
“Statistical procedure and experimental design are only two different aspects of the same whole, and that whole is the logical requirements of the complete process of adding to natural knowledge by experimentation.”
SLIDE 28 “A Lady declares that by tasting a cup of tea made with milk she can discriminate whether the milk or the tea infusion was first added to the cup. We will consider the problem of designing an experiment by means of which this assertion can be tested. · · · Our experiment consists in mixing eight cups of tea, four in one way and four in the other, and presenting them to the subject for judgment in a random order. The subject has been told in advance of what the test will consist, namely, that she will be asked to taste eight cups, that these shall be four of each kind, and that they shall be presented to her in a random order, that is in an order not determined arbitrarily by human choice, but by the actual manipulation of the physical apparatus used in games of chance, dice, cards, roulettes, etc., or, more expeditiously, from a published collection of random sampling numbers purporting to give the actual results of such manipulation. Her task is to divide the 8 cups into two sets of 4, agreeing, if possible, with the treatments received.”
SLIDE 31 Test statistic: number of correct IDs
(8 choose 4) = 70 equally likely ways to divide the 8 cups into two sets of 4.
(4 choose 3)(4 choose 1) = 16 ways to get exactly 3 correct.
P(all 4 correct) = 1/70 ≈ 0.014; P(at least 3 correct) = (16 + 1)/70 ≈ 0.243.
“At best the subject can judge rightly with every cup and, knowing that 4 are of each kind, this amounts to choosing, out of the 70 sets of 4 which might be chosen, that particular one which is correct. A subject without any faculty of discrimination would in fact divide the 8 cups correctly into two sets of 4 in one trial out of 70, or, more properly, with a frequency which would approach 1 in 70 more and more nearly the more often the test were repeated.”
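The counting arguments above can be checked directly with `math.comb`:

```python
from math import comb

total = comb(8, 4)                   # ways to pick which 4 cups are "milk first"
exactly_3 = comb(4, 3) * comb(4, 1)  # 3 of the right 4, plus 1 of the wrong 4
p_all_4 = 1 / total
p_at_least_3 = (exactly_3 + 1) / total
```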
SLIDE 32 “No such selection [of a significance level] can eliminate the whole of the possible effects of chance coincidence, and if we accept this convenient convention, and agree that an event which would occur by chance only once in 70 trials is decidedly ‘significant,’ in the statistical sense, we thereby admit that no isolated experiment, however significant in itself, can suffice for the experimental demonstration of any natural phenomenon; for the ‘one chance in a million’ will undoubtedly occur, with no less and no more than its appropriate frequency, however surprised we may be that it should occur to us. In order to assert that a natural phenomenon is experimentally demonstrable we need, not an isolated record, but a reliable method of procedure. In relation to the test of significance, we may say that a phenomenon is experimentally demonstrable when we know how to conduct an experiment which will rarely fail to give us a statistically significant result.”
SLIDE 33 “Tests of significance are of many different kinds, which need not be considered here. Here we are only concerned with the fact that the easy calculation in permutations which we encountered, and which gave us our test of significance, stands for something present in every possible experimental arrangement; or, at least, for something required in its interpretation.”
SLIDE 34
SLIDE 35
SLIDE 36
What’s a P-value?
Suppose X is a random variable s.t. P0{X ≤ p} ≤ p for all p ∈ [0, 1]. Then the observed value of X is a P-value.
SLIDE 38 Example: Lady Tasting Tea
X = 1/70 if 4 correct; 17/70 if 3 correct; 53/70 if 2 correct; 69/70 if 1 correct; 1 if 0 correct.
Then P0{X ≤ p} ≤ p: X is a P-value.
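The defining property P0{X ≤ p} ≤ p can be verified exactly for this example with rational arithmetic. A sketch, with the pmf counts taken from the hypergeometric counting on the earlier slides:

```python
from fractions import Fraction as F

# null pmf of the number of correct IDs: C(4,k)*C(4,4-k)/C(8,4)
pmf = {4: F(1, 70), 3: F(16, 70), 2: F(36, 70), 1: F(16, 70), 0: F(1, 70)}
# X is the upper-tail probability of the observed count
X = {k: sum(p for j, p in pmf.items() if j >= k) for k in pmf}
# check P0{X <= x} <= x at every attainable value x of X
sub_uniform = all(
    sum(pmf[j] for j in pmf if X[j] <= X[k]) <= X[k] for k in pmf
)
```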
SLIDE 39
If the P-value is p, either the null hypothesis is false or the null hypothesis is true and an event occurred that had chance no greater than p.
SLIDE 40
Disconnect
The distribution used to “calibrate” P-values (i.e., to find P{X ≤ p}) for parametric tests typically used in RCTs has nothing to do with the experiment actually performed.
SLIDE 45 Permutation tests
- exploit invariance of the distribution of the data under the action of some group when the null hypothesis is true
- generically, the “permutation” group, but can be any group
- every dataset in the orbit of the observed data is equally likely
- in principle, can find the P-value by enumeration
- too many in practice: use a (pseudo-)random sample of N “permutations”
- think of hits/N as an approximation to the exact P-value
- can use sequential methods to make inferences about the exact P-value
- (hits + 1)/(N + 1) is an exact P-value for a randomized test
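A sketch of the sampled version, as a 2-sample permutation test on the rat masses. This deliberately ignores the pairing, purely to show the mechanics: under the null the group labels are exchangeable, so every relabeling of the 22 masses is equally likely, and (hits + 1)/(N + 1) is the P-value for the randomized test:

```python
import random

enriched     = [689, 656, 668, 660, 679, 663, 664, 647, 694, 633, 653]
impoverished = [657, 623, 652, 654, 658, 646, 600, 640, 605, 635, 642]

def mean_diff(pool, n1):
    # difference of group means when the first n1 entries are "treatment"
    return sum(pool[:n1]) / n1 - sum(pool[n1:]) / (len(pool) - n1)

rng = random.Random(0)
pool = enriched + impoverished
t0 = mean_diff(pool, len(enriched))
N = 10_000
hits = sum(
    mean_diff(rng.sample(pool, len(pool)), len(enriched)) >= t0
    for _ in range(N)
)
p = (hits + 1) / (N + 1)   # exact P-value for the randomized test
```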
SLIDE 51 Randomization tests
- exploit the random assignment of subjects to treatments
- null distribution of the test statistic flows from the method of random assignment
- generally not analytically tractable, esp. if random assignment includes balancing
- approximate by simulation: re-run the random assignment N times
- hits/N is an approximation of the P-value
- can use sequential testing to make inferences about the “true” P-value
SLIDE 52 Test functions in randomization and permutation tests
- Can use any test function you want, including functions that come from parametric methods such as regression, ANOVA, logistic regression, etc.
- Calibrate P-values using the permutation or randomization distribution
- Choose the test function to have power against the scientifically interesting alternative(s)
SLIDE 53 Generic sketch
- Pick test statistic
- Collect data
- Find/simulate null distribution of test statistic conditional on the observed data
- P-value is tail probability of the test statistic under the null
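The generic sketch above, as a helper function. The names (`randomization_p`, `flip`) and the sign-flip reallocation in the usage example are illustrative, not from the slides:

```python
import random

def randomization_p(data, statistic, reallocate, N=10_000, seed=0):
    # P-value: tail probability of the test statistic under re-randomization,
    # conditional on the observed responses; (hits+1)/(N+1) keeps it a valid
    # P-value for the randomized test
    rng = random.Random(seed)
    t0 = statistic(data)
    hits = sum(statistic(reallocate(data, rng)) >= t0 for _ in range(N))
    return (hits + 1) / (N + 1)

# usage: paired rat differences under the strong null (sign flips)
diffs = [32, 33, 16, 6, 21, 17, 64, 7, 89, -2, 11]
flip = lambda d, rng: [x * rng.choice((1, -1)) for x in d]
p = randomization_p(diffs, lambda d: sum(d) / len(d), flip)
```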
SLIDE 54
Multivariate tests and intersection tests
Generally measure more than one “response” per subject. E.g., CAPS-5 has J = 20 items. Null: treatment has no effect on any of the J dimensions of measurement.
SLIDE 55 Combining functions
Let λ be a J-vector of statistics such that the distribution of λj is known if hypothesis H0j is true. Assume smaller values of λj are stronger evidence against H0j; e.g., λj might be the P-value of H0j for some test.
φ : [0, 1]^J → ℜ; λ = (λ1, . . . , λJ) ↦ φ(λ), s.t.:
- φ is non-increasing in every argument, i.e., φ(. . . , λj, . . .) ≥ φ(. . . , λ′j, . . .) if λj ≤ λ′j, j = 1, . . . , J.
- φ attains its maximum if any of its arguments equals 0.
- φ attains its minimum if all of its arguments equal 1.
SLIDE 56 Examples of combining functions
- Fisher: φF ≡ −2 Σ_{j=1}^J ln(λj)
- Liptak: φL ≡ Σ_{j=1}^J Φ−1(1 − λj), where Φ−1 is the inverse standard normal CDF
- Tippett: φT ≡ max_{1≤j≤J}(1 − λj)
- Direct combination: φD ≡ Σ_{j=1}^J fj(λj), where {fj} are suitable decreasing functions. E.g., if λj is the P-value for H0j corresponding to some test statistic ψj for which larger values are stronger evidence against H0j, could use φD = Σj ψj.
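A sketch of these combining functions (the names follow the standard NPC literature; `lam` is a list of item-level P-values):

```python
from math import log
from statistics import NormalDist

def fisher(lam):
    # larger when the P-values are smaller
    return -2 * sum(log(p) for p in lam)

def liptak(lam):
    inv = NormalDist().inv_cdf   # inverse standard normal CDF
    return sum(inv(1 - p) for p in lam)

def tippett(lam):
    return max(1 - p for p in lam)
```

Each is non-increasing in every argument, as the previous slide requires.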
SLIDE 57
Nonparametric combination tests (NPC)
Reallocate subjects K times; K + 1 allocations in all (the original allocation is k = 0).
ψ(k): J-vector of test statistics applied to the kth allocation.
ψj(k): test statistic for dimension j for the kth allocation.
SLIDE 58
[ψj(k)], j = 1, . . . , J; k = 0, . . . , K, is a J × (K + 1) matrix.
Columns correspond to random allocations of subjects; rows correspond to dimensions of measurement.
Transform: Pj(k) ≡ #{ℓ ∈ {0, . . . , K} : ψj(ℓ) ≥ ψj(k)} / (K + 1),
the simulated upper-tail probability of the kth observed value of the jth test statistic under the null. Entries are between 1/(K + 1) and 1; smaller entries are stronger evidence against the null (smaller item-level P-values).
SLIDE 59
Apply the combining function to each column of J numbers. This yields K + 1 numbers, f(k), k = 0, . . . , K, one for each allocation. The overall “Non-Parametric Combination Test” (NPC) P-value is
PNPC ≈ #{k ∈ {0, . . . , K} : f(k) ≥ f(0)} / (K + 1).
The test is exact if all allocations are equally likely; otherwise, it is approximate but conservatively biased.
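The whole NPC recipe can be sketched end to end. The helper names, the two toy dimensions, and the sign-flip reallocation below are illustrative assumptions, not from the slides; the structure (statistic matrix, tail-probability transform, combining function, comparison of f(k) to f(0)) follows the slides:

```python
import random
from math import log

def npc_p(data, test_stats, reallocate, combine, K=1000, seed=0):
    # original allocation is k = 0; K random reallocations follow
    rng = random.Random(seed)
    allocs = [data] + [reallocate(data, rng) for _ in range(K)]
    # J x (K+1) matrix of test statistics
    psi = [[stat(a) for a in allocs] for stat in test_stats]
    # Pj(k): simulated upper-tail probability of psi_j(k) within its row
    P = [[sum(x >= row[k] for x in row) / (K + 1) for k in range(K + 1)]
         for row in psi]
    # combine each column of J tail probabilities
    f = [combine([P[j][k] for j in range(len(test_stats))])
         for k in range(K + 1)]
    return sum(fk >= f[0] for fk in f) / (K + 1)

# toy usage: two dimensions summarizing the paired rat differences,
# Fisher's combining function, sign-flip reallocation under the strong null
diffs = [32, 33, 16, 6, 21, 17, 64, 7, 89, -2, 11]
flip = lambda d, rng: [x * rng.choice((1, -1)) for x in d]
stats = [lambda d: sum(d) / len(d),                  # mean difference
         lambda d: sum(x > 0 for x in d) / len(d)]   # fraction positive
fisher = lambda lams: -2 * sum(log(l) for l in lams)
p = npc_p(diffs, stats, flip, fisher)
```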