18.650 Statistics for Applications
Chapter 5: Parametric hypothesis testing

Cherry Blossom run (1)
◮ The credit union Cherry Blossom Run is a 10 mile race that takes place every year in D.C.
◮ In 2009 there were 14974 participants.
◮ Average running time was 103.5 minutes.
Were runners faster in 2012?
To answer this question, select $n$ runners from the 2012 race at random and denote by $X_1, \ldots, X_n$ their running times.

Cherry Blossom run (2)
We can see from past data that the running time has a Gaussian distribution. The variance was 373.

Cherry Blossom run (3)
◮ We are given i.i.d. r.v. $X_1, \ldots, X_n$ and we want to know if $X_1 \sim \mathcal{N}(103.5, 373)$.
◮ This is a hypothesis testing problem.
◮ There are many ways this could be false:
  1. $\mathbb{E}[X_1] \neq 103.5$
  2. $\mathrm{var}[X_1] \neq 373$
  3. $X_1$ may not even be Gaussian.
◮ We are interested in a very specific question: is $\mathbb{E}[X_1] < 103.5$?

Cherry Blossom run (4)
◮ We make the following assumptions:
  1. $\mathrm{var}[X_1] = 373$ (variance is the same between 2009 and 2012)
  2. $X_1$ is Gaussian.
◮ The only thing that we did not fix is $\mathbb{E}[X_1] = \mu$.
◮ Now we want to test (only): "Is $\mu = 103.5$ or is $\mu < 103.5$?"
◮ By making modeling assumptions, we have reduced the number of ways the hypothesis $X_1 \sim \mathcal{N}(103.5, 373)$ may be rejected.
◮ The only way it can be rejected is if $X_1 \sim \mathcal{N}(\mu, 373)$ for some $\mu < 103.5$.
◮ We compare an expected value to a fixed reference number (103.5).

Cherry Blossom run (5)
Simple heuristic: "If $\bar{X}_n < 103.5$, then $\mu < 103.5$."
This could go wrong if I randomly pick only fast runners in my sample $X_1, \ldots, X_n$.
Better heuristic: "If $\bar{X}_n < 103.5 - (\text{something that} \to 0 \text{ as } n \to \infty)$, then $\mu < 103.5$."
To make this intuition more precise, we need to take the size of the random fluctuations of $\bar{X}_n$ into account!
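One way to make the "something that goes to 0" concrete, under the Gaussian model with known variance 373 assumed above, is to lower the reference value by a multiple of the standard deviation of the sample mean, $\sqrt{373/n}$. The sketch below is only illustrative: the helper name `rejection_threshold`, the 5% level and the sample sizes are assumptions, not taken from the slides.

```python
from math import sqrt
from statistics import NormalDist

MU_2009 = 103.5   # reference mean from the 2009 race (minutes)
VARIANCE = 373.0  # variance, assumed unchanged between 2009 and 2012

def rejection_threshold(n, alpha=0.05):
    """Hypothetical helper: value below which the sample mean of n runners
    suggests mu < 103.5 at level alpha (one-sided Gaussian test)."""
    q_alpha = NormalDist().inv_cdf(1 - alpha)       # (1 - alpha)-quantile of N(0, 1)
    return MU_2009 - q_alpha * sqrt(VARIANCE / n)   # correction shrinks like 1/sqrt(n)

# Illustrative sample sizes (assumed): the threshold approaches 103.5 as n grows.
for n in (10, 100, 1000):
    print(n, round(rejection_threshold(n), 2))
```

The correction term shrinks at rate $1/\sqrt{n}$, so with more runners a smaller gap below 103.5 is enough to conclude that $\mu < 103.5$.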

Clinical trials (1)
◮ Pharmaceutical companies use hypothesis testing to test if a new drug is effective.
◮ To do so, they administer a drug to a group of patients (test group) and a placebo to another group (control group).
◮ Assume that the drug is a cough syrup.
◮ Let $\mu_{\mathrm{control}}$ denote the expected number of expectorations per hour after a patient has used the placebo.
◮ Let $\mu_{\mathrm{drug}}$ denote the expected number of expectorations per hour after a patient has used the syrup.
◮ We want to know if $\mu_{\mathrm{drug}} < \mu_{\mathrm{control}}$.
◮ We compare two expected values. No reference number.

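Clinical trials (2)
◮ Let $X_1, \ldots, X_{n_{\mathrm{drug}}}$ denote $n_{\mathrm{drug}}$ i.i.d. r.v. with distribution $\mathrm{Poiss}(\mu_{\mathrm{drug}})$.
◮ Let $Y_1, \ldots, Y_{n_{\mathrm{control}}}$ denote $n_{\mathrm{control}}$ i.i.d. r.v. with distribution $\mathrm{Poiss}(\mu_{\mathrm{control}})$.
◮ We want to test if $\mu_{\mathrm{drug}} < \mu_{\mathrm{control}}$.
Heuristic: "If $\bar{X}_{\mathrm{drug}} < \bar{X}_{\mathrm{control}} - (\text{something that} \to 0 \text{ as } n_{\mathrm{drug}} \to \infty,\ n_{\mathrm{control}} \to \infty)$, then conclude that $\mu_{\mathrm{drug}} < \mu_{\mathrm{control}}$."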

Heuristics (1)
Example 1: A coin is tossed 80 times, and Heads are obtained 54 times. Can we conclude that the coin is significantly unfair?
◮ $n = 80$, $X_1, \ldots, X_n \overset{iid}{\sim} \mathrm{Ber}(p)$;
◮ $\bar{X}_n = 54/80 = .675$
◮ If it were true that $p = .5$: by CLT + Slutsky's theorem, $\sqrt{n}\,\frac{\bar{X}_n - .5}{\sqrt{.5(1 - .5)}} \approx \mathcal{N}(0, 1)$.
◮ Our data gives $\sqrt{n}\,\frac{\bar{X}_n - .5}{\sqrt{.5(1 - .5)}} \approx 3.13$
◮ Conclusion: It seems quite reasonable to reject the hypothesis $p = .5$.

Heuristics (2)
Example 2: A coin is tossed 30 times, and Heads are obtained 13 times. Can we conclude that the coin is significantly unfair?
◮ $n = 30$, $X_1, \ldots, X_n \overset{iid}{\sim} \mathrm{Ber}(p)$;
◮ $\bar{X}_n = 13/30 \approx .43$
◮ If it were true that $p = .5$: by CLT + Slutsky's theorem, $\sqrt{n}\,\frac{\bar{X}_n - .5}{\sqrt{.5(1 - .5)}} \approx \mathcal{N}(0, 1)$.
◮ Our data gives $\sqrt{n}\,\frac{\bar{X}_n - .5}{\sqrt{.5(1 - .5)}} \approx -.73$
◮ The number $-.73$ is a plausible realization of a random variable $Z \sim \mathcal{N}(0, 1)$.
◮ Conclusion: our data does not suggest that the coin is unfair.
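The standardized statistic used in both coin examples can be checked with a few lines of Python. This is a minimal sketch; the function name `coin_statistic` is an assumption introduced for illustration. Because it uses the unrounded proportions 54/80 and 13/30, the values can differ slightly from ones obtained after rounding $\bar{X}_n$ to two decimals.

```python
from math import sqrt

def coin_statistic(heads, n, p0=0.5):
    """sqrt(n) * (X_bar - p0) / sqrt(p0 * (1 - p0)) for n tosses with `heads` Heads."""
    x_bar = heads / n
    return sqrt(n) * (x_bar - p0) / sqrt(p0 * (1 - p0))

print(round(coin_statistic(54, 80), 2))  # Example 1: far in the tail of N(0, 1)
print(round(coin_statistic(13, 30), 2))  # Example 2: a typical value for N(0, 1)
```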

Statistical formulation (1)
◮ Consider a sample $X_1, \ldots, X_n$ of i.i.d. random variables and a statistical model $(E, (\mathbb{P}_\theta)_{\theta \in \Theta})$.
◮ Let $\Theta_0$ and $\Theta_1$ be disjoint subsets of $\Theta$.
◮ Consider the two hypotheses: $H_0: \theta \in \Theta_0$ and $H_1: \theta \in \Theta_1$.
◮ $H_0$ is the null hypothesis, $H_1$ is the alternative hypothesis.
◮ If we believe that the true $\theta$ is either in $\Theta_0$ or in $\Theta_1$, we may want to test $H_0$ against $H_1$.
◮ We want to decide whether to reject $H_0$ (look for evidence against $H_0$ in the data).

Statistical formulation (2)
◮ $H_0$ and $H_1$ do not play a symmetric role: the data is only used to try to disprove $H_0$.
◮ In particular, lack of evidence does not mean that $H_0$ is true ("innocent until proven guilty").
◮ A test is a statistic $\psi \in \{0, 1\}$ such that:
  ◮ If $\psi = 0$, $H_0$ is not rejected;
  ◮ If $\psi = 1$, $H_0$ is rejected.
◮ Coin example: $H_0: p = 1/2$ vs. $H_1: p \neq 1/2$.
◮ $\psi = \mathbb{1}\left\{ \left| \sqrt{n}\,\frac{\bar{X}_n - .5}{\sqrt{.5(1 - .5)}} \right| > C \right\}$, for some $C > 0$.
◮ How to choose the threshold $C$?

Statistical formulation (3)
◮ Rejection region of a test $\psi$: $R_\psi = \{ x \in E^n : \psi(x) = 1 \}$.
◮ Type 1 error of a test $\psi$ (rejecting $H_0$ when it is actually true): $\alpha_\psi : \Theta_0 \to \mathbb{R}$, $\theta \mapsto \mathbb{P}_\theta[\psi = 1]$.
◮ Type 2 error of a test $\psi$ (not rejecting $H_0$ although $H_1$ is actually true): $\beta_\psi : \Theta_1 \to \mathbb{R}$, $\theta \mapsto \mathbb{P}_\theta[\psi = 0]$.
◮ Power of a test $\psi$: $\pi_\psi = \inf_{\theta \in \Theta_1} (1 - \beta_\psi(\theta))$.
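For the coin test, both error functions can be evaluated exactly by summing binomial probabilities over the rejection region. The sketch below assumes the two-sided test with threshold 1.96, $n = 80$ tosses, and an alternative value $p = 0.7$; the function name and these values are chosen for illustration only.

```python
from math import comb, sqrt

def rejection_probability(p, n, c, p0=0.5):
    """P_p[psi = 1] for psi = 1{ |sqrt(n) (X_bar - p0) / sqrt(p0 (1 - p0))| > c }."""
    total = 0.0
    for k in range(n + 1):  # k = number of Heads
        t = sqrt(n) * (k / n - p0) / sqrt(p0 * (1 - p0))
        if abs(t) > c:      # k lies in the rejection region
            total += comb(n, k) * p**k * (1 - p) ** (n - k)
    return total

n, c = 80, 1.96                                 # assumed sample size and threshold
type1 = rejection_probability(0.5, n, c)        # alpha_psi at theta = 0.5
type2 = 1 - rejection_probability(0.7, n, c)    # beta_psi at theta = 0.7
print(type1, type2)
```

Note that the power $\pi_\psi$ is an infimum over all of $\Theta_1$: for alternatives $p$ very close to $1/2$, the rejection probability is close to the type 1 error, so the single value $p = 0.7$ only shows one point of the function $\beta_\psi$.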

Statistical formulation (4)
◮ A test $\psi$ has level $\alpha$ if $\alpha_\psi(\theta) \le \alpha, \ \forall \theta \in \Theta_0$.
◮ A test $\psi$ has asymptotic level $\alpha$ if $\lim_{n \to \infty} \alpha_\psi(\theta) \le \alpha, \ \forall \theta \in \Theta_0$.
◮ In general, a test has the form $\psi = \mathbb{1}\{ T_n > c \}$, for some statistic $T_n$ and threshold $c \in \mathbb{R}$.
◮ $T_n$ is called the test statistic. The rejection region is $R_\psi = \{ T_n > c \}$.

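Example (1)
◮ Let $X_1, \ldots, X_n \overset{iid}{\sim} \mathrm{Ber}(p)$, for some unknown $p \in (0, 1)$.
◮ We want to test: $H_0: p = 1/2$ vs. $H_1: p \neq 1/2$ with asymptotic level $\alpha \in (0, 1)$.
◮ Let $T_n = \left| \sqrt{n}\,\frac{\hat{p}_n - 0.5}{\sqrt{.5(1 - .5)}} \right|$, where $\hat{p}_n$ is the MLE.
◮ If $H_0$ is true, then by CLT and Slutsky's theorem, $\mathbb{P}[T_n > q_{\alpha/2}] \to \alpha$ as $n \to \infty$, where $q_{\alpha/2}$ is the $(1 - \alpha/2)$-quantile of $\mathcal{N}(0, 1)$.
◮ Let $\psi_\alpha = \mathbb{1}\{ T_n > q_{\alpha/2} \}$.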

Example (2)
Coming back to the two previous coin examples: For $\alpha = 5\%$, $q_{\alpha/2} = 1.96$, so:
◮ In Example 1, $H_0$ is rejected at the asymptotic level 5% by the test $\psi_{5\%}$;
◮ In Example 2, $H_0$ is not rejected at the asymptotic level 5% by the test $\psi_{5\%}$.
Question: In Example 1, for what level $\alpha$ would $\psi_\alpha$ not reject $H_0$? And in Example 2, at which level $\alpha$ would $\psi_\alpha$ reject $H_0$?
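The test $\psi_\alpha$ for the coin examples can be sketched as a small function, which also makes it easy to explore the question of how the decision changes with $\alpha$. The function name `psi` and its signature are assumptions made for illustration; the quantile comes from Python's standard `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist

def psi(heads, n, alpha=0.05, p0=0.5):
    """Return 1 if H0: p = p0 is rejected at asymptotic level alpha, else 0."""
    t_n = abs(sqrt(n) * (heads / n - p0) / sqrt(p0 * (1 - p0)))
    q = NormalDist().inv_cdf(1 - alpha / 2)  # q_{alpha/2}, about 1.96 for alpha = 5%
    return int(t_n > q)

print(psi(54, 80))               # Example 1 at level 5%
print(psi(13, 30))               # Example 2 at level 5%
print(psi(54, 80, alpha=0.001))  # Example 1 at a much stricter level
```

Lowering $\alpha$ raises the threshold $q_{\alpha/2}$, so a test that rejects at 5% may stop rejecting at a sufficiently small level.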

p-value
Definition: The (asymptotic) p-value of a test $\psi_\alpha$ is the smallest (asymptotic) level $\alpha$ at which $\psi_\alpha$ rejects $H_0$. It is random: it depends on the sample.
Golden rule: p-value $\le \alpha$ $\Leftrightarrow$ $H_0$ is rejected by $\psi_\alpha$, at the (asymptotic) level $\alpha$.
The smaller the p-value, the more confidently one can reject $H_0$.
◮ Example 1: p-value $= \mathbb{P}[|Z| > 3.13] \ll .01$.
◮ Example 2: p-value $= \mathbb{P}[|Z| > .73] \approx .47$.
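For the two-sided coin test, the asymptotic p-value is the probability that a standard Gaussian exceeds the observed statistic in absolute value. The following is a sketch under that model; the helper name `p_value` is an assumption, and it works with the unrounded proportions, so the results can differ in the second decimal from values computed after rounding $\bar{X}_n$.

```python
from math import sqrt
from statistics import NormalDist

def p_value(heads, n, p0=0.5):
    """Asymptotic two-sided p-value P[|Z| > t_n] with Z ~ N(0, 1)."""
    t_n = abs(sqrt(n) * (heads / n - p0) / sqrt(p0 * (1 - p0)))
    return 2 * (1 - NormalDist().cdf(t_n))

print(p_value(54, 80))  # Example 1: well below .01
print(p_value(13, 30))  # Example 2: far above the usual levels
```

By the golden rule, comparing these values to a chosen $\alpha$ gives the same decision as computing $\psi_\alpha$ directly.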

Neyman-Pearson's paradigm
Idea: For given hypotheses, among all tests of level/asymptotic level $\alpha$, is it possible to find one that has maximal power?
Example: The trivial test $\psi = 0$ that never rejects $H_0$ has a perfect level ($\alpha = 0$) but poor power ($\pi_\psi = 0$).
Neyman-Pearson's theory provides (the most) powerful tests with given level. In 18.650, we only study several cases.

The $\chi^2$ distributions
Definition: For a positive integer $d$, the $\chi^2$ (pronounced "Kai-squared") distribution with $d$ degrees of freedom is the law of the random variable $Z_1^2 + Z_2^2 + \ldots + Z_d^2$, where $Z_1, \ldots, Z_d \overset{iid}{\sim} \mathcal{N}(0, 1)$.
Examples:
◮ If $Z \sim \mathcal{N}_d(0, I_d)$, then $\|Z\|_2^2 \sim \chi^2_d$.
◮ Recall that the sample variance is given by $S_n = \frac{1}{n} \sum_{i=1}^n (X_i - \bar{X}_n)^2 = \frac{1}{n} \sum_{i=1}^n X_i^2 - (\bar{X}_n)^2$.
◮ Cochran's theorem implies that for $X_1, \ldots, X_n \overset{iid}{\sim} \mathcal{N}(\mu, \sigma^2)$, if $S_n$ is the sample variance, then $\frac{n S_n}{\sigma^2} \sim \chi^2_{n-1}$.
◮ $\chi^2_2 = \mathrm{Exp}(1/2)$.
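The consequence of Cochran's theorem stated above can be sanity-checked by simulation: since a $\chi^2_{n-1}$ variable is a sum of $n - 1$ squared standard Gaussians, its expectation is $n - 1$, so the average of many draws of $n S_n / \sigma^2$ should be close to $n - 1$. This is only a rough Monte Carlo sketch; the seed, sample size, parameters and number of repetitions are arbitrary assumptions.

```python
import random

def scaled_sample_variance(xs, sigma2):
    """n * S_n / sigma^2, with S_n = (1/n) * sum (x_i - x_bar)^2."""
    n = len(xs)
    x_bar = sum(xs) / n
    s_n = sum((x - x_bar) ** 2 for x in xs) / n
    return n * s_n / sigma2

rng = random.Random(0)         # fixed seed so the run is reproducible
n, mu, sigma = 10, 2.0, 3.0    # assumed illustration values
draws = [
    scaled_sample_variance([rng.gauss(mu, sigma) for _ in range(n)], sigma**2)
    for _ in range(20000)
]
average = sum(draws) / len(draws)
print(average)                 # should be close to n - 1 = 9
```

Matching the mean does not prove the distributional claim; it only checks one consequence of it.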

Student's T distributions
Definition: For a positive integer $d$, the Student's T distribution with $d$ degrees of freedom (denoted by $t_d$) is the law of the random variable $\frac{Z}{\sqrt{V/d}}$, where $Z \sim \mathcal{N}(0, 1)$, $V \sim \chi^2_d$ and $Z \perp\!\!\!\perp V$ ($Z$ is independent of $V$).
Example:
◮ Cochran's theorem implies that for $X_1, \ldots, X_n \overset{iid}{\sim} \mathcal{N}(\mu, \sigma^2)$, if $S_n$ is the sample variance, then $\sqrt{n - 1}\,\frac{\bar{X}_n - \mu}{\sqrt{S_n}} \sim t_{n-1}$.
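The statistic in the example above uses the sample variance with a $1/n$ normalization, which is why the factor is $\sqrt{n - 1}$ rather than $\sqrt{n}$. The sketch below computes it and checks that it coincides with the more common form $\sqrt{n}\,(\bar{X}_n - \mu)/\tilde{s}$, where $\tilde{s}$ is the standard deviation with $1/(n-1)$ normalization. The running times in `sample` are hypothetical values made up for illustration, not data from the race.

```python
from math import sqrt
from statistics import mean, pvariance, stdev

def t_statistic(xs, mu):
    """sqrt(n - 1) * (x_bar - mu) / sqrt(S_n), with S_n the 1/n sample variance."""
    n = len(xs)
    return sqrt(n - 1) * (mean(xs) - mu) / sqrt(pvariance(xs))

# Hypothetical running times in minutes (illustration only).
sample = [98.2, 105.1, 91.7, 110.4, 99.9, 87.3, 102.6, 95.0]

t_n = t_statistic(sample, 103.5)
# Same value written with the 1/(n - 1) standard deviation.
t_alt = sqrt(len(sample)) * (mean(sample) - 103.5) / stdev(sample)
print(t_n, t_alt)
```

Under the Gaussian assumption, `t_n` would be compared to quantiles of $t_{n-1}$; the Python standard library does not provide those, so they are left out of this sketch.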
