Data Analysis and Uncertainty Part 3: Hypothesis Testing/Sampling
Instructor: Sargur N. Srihari
University at Buffalo, The State University of New York
srihari@cedar.buffalo.edu
Topics
- 1. Hypothesis Testing
- 1. t-test for means
- 2. Chi-Square test of independence
- 3. Kolmogorov-Smirnov Test to compare distributions
- 2. Sampling Methods
- 1. Random Sampling
- 2. Stratified Sampling
- 3. Cluster Sampling
Motivation
- If a data mining algorithm generates a
potentially interesting hypothesis we want to explore it further
- Commonly concerns the value of a parameter
- Is a new treatment better than the standard one?
- Are two variables related in a population?
- Conclusions based on a sample of population
Classical Hypothesis Testing
- 1. Define two complementary hypotheses
- Null hypothesis and alternative hypothesis
- Often null hypothesis is a point value, e.g., draw
conclusions about a parameter θ
– Null Hypothesis is H0: θ = θ0 Alternative Hypothesis is H1: θ ≠ θ0
- 2. Using data calculate a statistic
- Which depends on nature of hypotheses
- Determine expected distribution of chosen statistic
- Observed value would be one point in this distribution
- 3. If in tail then either unlikely event or H0 false
- The more extreme the observed value, the less
confidence in the null hypothesis
Example Problem
- Hypotheses are mutually exclusive
- if one is true, other is false
- Determine whether a coin is fair
- H0: P = 0.5
Ha: P ≠ 0.5
- Flipped coin 50 times: 40 Heads and 10 Tails
- Inclined to reject H0 and accept Ha
- Accept/reject decision focuses on single test
statistic
Test for Comparing two Means
- Whether population mean differs from
hypothesized value
- Called one-sample t test
- Common problem
- Does Your Group Come from a Different
Population than the One Specified?
- One-sample t-test (given sample of one
population)
- Two-sample t-test (two populations)
One-sample t test
- Fix significance level in [0,1], e.g., 0.01, 0.05, 0.10
- Degrees of freedom, DF = n - 1
- n is no of observations in sample
- Compute test statistic (t-score)
t = (x̄ − µ0) / (s / √n)
x̄: sample mean, µ0: hypothesized mean (H0), s: std dev of sample
- Compute p-value from Student's t-distribution
- reject null hypothesis if p-value < significance level
- Used when population variances are equal /
unequal, and with large or small samples.
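The steps above can be sketched in plain Python. The sample values are hypothetical, and the critical value 2.262 (two-tailed, df = 9, α = 0.05) is taken from standard t tables rather than computed:

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """One-sample t statistic for H0: population mean = mu0."""
    n = len(sample)
    xbar = statistics.mean(sample)   # sample mean
    s = statistics.stdev(sample)     # sample std dev (n - 1 denominator)
    return (xbar - mu0) / (s / math.sqrt(n))

# Hypothetical sample of n = 10 measurements; H0: mean = 5.0
sample = [5.1, 4.9, 6.2, 5.8, 5.5, 5.3, 4.7, 6.0, 5.9, 5.6]
t = one_sample_t(sample, 5.0)

# Critical value for df = 9 at 0.05 significance (two-tailed), from t tables
reject = abs(t) > 2.262
```

Here |t| ≈ 3.2 exceeds the critical value, so H0 would be rejected at the 0.05 level.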
Rejection Region
- Test statistic
- mean score, proportion, t-score, z-score
- One and Two tail tests
Hyp Set   Null hyp   Alternative hyp   No. of tails
1         µ = M      µ ≠ M             2
2         µ ≥ M      µ < M             1
3         µ ≤ M      µ > M             1
- Values outside the region of acceptance form the region of
rejection
- Equivalent Approaches: p-value and region of acceptance
- Size of region is significance level
Power of a Test
- Compare different test
procedures
- Power of Test
- Probability it will correctly
reject a false null hypothesis (1-β)
- β is the False Negative (Type II error) rate
- Significance of Test
- Test's probability of
incorrectly rejecting a true null hypothesis (α)
- α is the False Positive (Type I error) rate

              Null is True           Null is False
Accept Null   1-α (True Negative)    β (False Negative)
Reject Null   α (False Positive)     1-β (True Positive)

Type I and Type II error rates are denoted α and β
Likelihood Ratio Statistic
- Good strategy to find statistic is to use the
Likelihood Ratio
- Likelihood Ratio Statistic to test hypothesis
H0:θ = θ0 H1: θ ≠ θ0 is defined as
where D ={x(1),..,x(n)}
i.e., Ratio of likelihood when θ=θ0 to the largest value of the likelihood when θ is unconstrained
- Null hypothesis rejected when λ is small
- Generalizable to when null is not single point
λ = L(θ0 | D) / sup_φ L(φ | D)
Testing for Mean of Normal
- Given a sample of n points drawn from Normal
with unknown mean and unit variance
- Likelihood under null hypothesis
- Maximum likelihood estimator is sample mean
- Ratio simplifies to
- Rejection region: {λ|λ< c} for a suitably chosen c
- Expression written as
– Compare sample mean with a constant
λ = exp(−n(x̄ − 0)²/2)

L(0 | x(1),..,x(n)) = ∏_{i=1..n} p(x(i) | 0) = (1/√(2π))ⁿ ∏_{i=1..n} exp(−(1/2)(x(i) − 0)²)

L(x̄ | x(1),..,x(n)) = ∏_{i=1..n} p(x(i) | x̄) = (1/√(2π))ⁿ ∏_{i=1..n} exp(−(1/2)(x(i) − x̄)²)

Rejection region: |x̄| ≥ √(−(2/n) ln c)
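As a small sketch, the simplified ratio λ = exp(−n x̄²/2) can be computed directly; the sample values below are hypothetical:

```python
import math
import statistics

def likelihood_ratio_lambda(sample):
    """Lambda = L(0 | D) / sup_mu L(mu | D) = exp(-n * xbar^2 / 2)
    for unit-variance normal data under null mean 0."""
    n = len(sample)
    xbar = statistics.mean(sample)
    return math.exp(-n * xbar ** 2 / 2)

# Reject H0 when lambda < c, equivalently when |xbar| >= sqrt(-(2/n) ln c)
sample = [0.3, -0.1, 0.4, 0.2, -0.2]   # hypothetical data, n = 5
lam = likelihood_ratio_lambda(sample)
```

A sample mean of exactly 0 gives λ = 1 (no evidence against H0); λ shrinks toward 0 as x̄ moves away from 0.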
Types of Tests used Frequently
- Differences between means
- Compare variances
- Compare observed distribution with a
hypothesized distribution
- Called goodness-of-fit test
- t-test for difference between means of
two independent groups
Two sample t-test
- Whether two means have the same value
x(1),..x(n) drawn from N(µx,σ²), y(1),..y(m) drawn from N(µy,σ²)
- HO: µx=µy
- Likelihood Ratio statistic
- t has t-distribution with n+m-2 degrees of freedom
- Test robust to departures from normal
- Test is widely used
t = (x̄ − ȳ) / √(s²(1/n + 1/m))

– with s² = s_x² (n−1)/(n+m−2) + s_y² (m−1)/(n+m−2)   (weighted sum of sample variances)

– where s_x² = Σ (x(i) − x̄)² / (n−1)

Difference between sample means adjusted by standard deviation of that difference
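The pooled-variance statistic above can be sketched in plain Python; the two groups are hypothetical data:

```python
import math

def two_sample_t(x, y):
    """Pooled two-sample t statistic for H0: mu_x = mu_y (equal variances)."""
    n, m = len(x), len(y)
    xbar = sum(x) / n
    ybar = sum(y) / m
    sx2 = sum((v - xbar) ** 2 for v in x) / (n - 1)   # sample variance of x
    sy2 = sum((v - ybar) ** 2 for v in y) / (m - 1)   # sample variance of y
    # Pooled variance: weighted sum of the two sample variances
    s2 = sx2 * (n - 1) / (n + m - 2) + sy2 * (m - 1) / (n + m - 2)
    return (xbar - ybar) / math.sqrt(s2 * (1 / n + 1 / m))

# Hypothetical groups; compare t against a t-distribution with n + m - 2 df
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 3.0, 4.0, 5.0, 6.0]
t = two_sample_t(x, y)   # negative here: x has the smaller mean
```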
Test for Relationship between Variables
- Whether distribution of value taken by one
variable is independent of value taken by another
- Chi-squared test
- Goodness-of-fit test with null hypothesis of
independence
- Two categorical variables
- x takes values xi, i=1,..,r with probabilities p(xi)
- y takes values yj j=1,..,s with probabilities p(yj)
Chi-squared Test for Independence
- If x and y are independent p(xi,yj)=p(xi)p(yj)
- n(xi)/n and n(yj)/n are estimates of the probabilities
of x taking value xi and y taking value yj
- If independent, estimate of p(xi,yj) is n(xi)n(yj)/n²
- We expect to find n(xi)n(yj)/n samples in the (xi,yj) cell
- Number the cells 1 to t where t=r.s
- Ek is expected number in kth cell, Ok is observed number
- Aggregation given by
- If null hyp holds X2 has χ2 distrib
- With (r-1)(s-1) degrees of freedom
- Found in tables or directly computed
X² = Σ_{k=1..t} (Ek − Ok)² / Ek
Chi-squared with k degrees of freedom:
- Distribution of a sum of squares of k values, each with a standard
normal distribution
- As k increases the density becomes flatter
- Special case of the Gamma distribution
Testing for Independence Example
Outcome \ Hospital      Referral   Non-referral
No improvement          43         47
Partial improvement     29         120
Complete improvement    10         118
- Medical data
- Whether outcome of
surgery is independent of hospital type
- Total for referral=82
- Total with no improvement=90
- Overall total=367
- Under independence, top left cell has expected no=82x90/367=20.11
- Observed number is 43. Contributes (20.11-43)2/20.11 to χ2
- Total value is χ2=49.8
- Comparing with χ2 distribution with (3-1)(2-1)=2 degrees of freedom
reveals very high degree of significance
- Suggests that outcome is dependent on hospital type
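The full computation on the slide's table can be reproduced in plain Python. The closed-form p-value uses the fact that for 2 degrees of freedom the chi-squared survival function is exp(−x/2):

```python
import math

# Surgery-outcome table from the slide: rows = outcome, cols = hospital type
observed = [[43, 47],    # no improvement
            [29, 120],   # partial improvement
            [10, 118]]   # complete improvement

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)      # 367

# Expected count under independence: row total * column total / n
chi2 = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / n
        chi2 += (e - o) ** 2 / e

df = (len(observed) - 1) * (len(observed[0]) - 1)   # (3-1)(2-1) = 2
# For exactly 2 degrees of freedom, P(X^2 >= x) = exp(-x/2)
p_value = math.exp(-chi2 / 2)
```

This gives X² ≈ 49.8 with a vanishingly small p-value, matching the slide's conclusion that outcome depends on hospital type.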
Chi-squared Goodness of Fit Test
- Chi-squared test is more versatile than t-test.
Used for categorical distributions
- Used for testing normal distribution as well
- E.g.
- 10 measurements: {x1, x2, ..., x10}. They are supposed to be normally
distributed with mean µ and standard deviation σ. You could calculate
- we expect the measurements to deviate from the mean by about the standard deviation, so |xi − µ| is
about the same size as σ
- Thus in calculating chi-square we add up 10 numbers, each of which would be near 1. We expect the total to
approximately equal k, the number of data points. If chi-square is "a lot" bigger than expected, something is wrong. Thus one purpose of chi-square is to compare observed results with expected results and see if the result is likely.
χ² = Σ_i (xi − µ)² / σ²
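The normality check described above is a one-liner; the measurements and the assumed µ and σ below are hypothetical:

```python
def chi_square_gof(xs, mu, sigma):
    """Sum of standardized squared deviations; each term should be near 1
    if the data really come from N(mu, sigma^2)."""
    return sum((x - mu) ** 2 / sigma ** 2 for x in xs)

# Hypothetical measurements with assumed mu = 5, sigma = 2
xs = [3.0, 7.0, 5.0, 6.0, 4.0, 5.5, 4.5, 7.5, 2.5, 5.0]
chi2 = chi_square_gof(xs, 5.0, 2.0)
# With 10 points we expect chi2 to be roughly 10; a value
# much larger than that suggests the assumed distribution is wrong
```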
Randomization/permutation tests
- Earlier tests assume random sample drawn, to:
- Make probability statement about a parameter
- Make inference about population from sample
- Consider medical example:
- Compare treatment and control group
- H0: no effect (distrib of those treated same as those not)
- Samples may not be drawn independently
- What difference in sample means would there be if the
difference is a consequence of imbalance between populations?
- Randomization tests:
- Allow us to make statements conditioned on input samples
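A randomization test can be sketched by repeatedly shuffling the group labels and seeing how often the shuffled mean difference is as extreme as the observed one. The patient outcomes below are hypothetical:

```python
import random

def permutation_test(treatment, control, n_perm=10000, seed=0):
    """Permutation test for H0: no treatment effect.
    Returns the fraction of label shufflings whose mean difference
    is at least as extreme as the observed difference."""
    rng = random.Random(seed)
    observed = sum(treatment) / len(treatment) - sum(control) / len(control)
    pooled = treatment + control
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        t = pooled[:len(treatment)]
        c = pooled[len(treatment):]
        diff = sum(t) / len(t) - sum(c) / len(c)
        if abs(diff) >= abs(observed):
            count += 1
    return count / n_perm

# Hypothetical outcomes for treated and control patients
p = permutation_test([5.2, 6.1, 5.8, 6.4], [4.1, 4.5, 3.9, 4.8])
```

Because every treated outcome exceeds every control outcome here, only a tiny fraction of shufflings are as extreme, giving a small p-value.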
Distribution-free Tests
- Other Tests assume form of distribution from
which samples are drawn
- Distribution-free tests replace values by ranks
- Examples
- If samples from same distribution: ranks well-mixed
- If mean is larger, ranks of one larger than other
- Test statistics used in these nonparametric tests include:
- Sign test statistic
- Kolmogorov- Smirnov Test Statistic
- Rank sum test statistic
- Wilcoxon test statistic
Kolmogorov Smirnov Test
- Determining whether two data sets have the same distribution
- To compare two empirical CDFs S_N1(x) and S_N2(x),
- Compute the statistic D, the max of absolute difference between them:

D = max_{−∞<x<∞} |S_N1(x) − S_N2(x)|

- which is mapped to a probability of similarity, P

P = Q_KS( [√Ne + 0.12 + 0.11/√Ne] · D )

- where the Q_KS function is defined as

Q_KS(λ) = 2 Σ_{j=1..∞} (−1)^{j−1} e^{−2j²λ²},   Q_KS(0) = 1, Q_KS(∞) = 0

- and Ne is the effective number of data points

Ne = N1·N2 / (N1 + N2)

- where N1 and N2 are the no of data points in each
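The statistic and its significance map can be sketched directly from the definitions above (the infinite Q_KS series is truncated, which is harmless since its terms decay extremely fast):

```python
import math

def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov D: max distance between the ECDFs.
    Both ECDFs are step functions, so the sup occurs at a data point."""
    def ecdf(sample, x):
        return sum(v <= x for v in sample) / len(sample)
    return max(abs(ecdf(a, x) - ecdf(b, x)) for x in a + b)

def q_ks(lam, terms=100):
    """Q_KS(lambda) = 2 * sum_{j>=1} (-1)^(j-1) exp(-2 j^2 lambda^2)."""
    if lam == 0:
        return 1.0
    return 2 * sum((-1) ** (j - 1) * math.exp(-2 * j * j * lam * lam)
                   for j in range(1, terms + 1))

def ks_p_value(a, b):
    d = ks_statistic(a, b)
    ne = len(a) * len(b) / (len(a) + len(b))   # effective number of points
    return q_ks((math.sqrt(ne) + 0.12 + 0.11 / math.sqrt(ne)) * d)
```

Identical samples give D = 0 and P = 1 (same distribution); completely separated samples give D = 1 and a small P.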
Comparing Hypotheses: Bayesian Perspective
- Done by comparing posterior probabilities
- Ratio leads to factorization in terms of prior
odds and likelihood ratio
- Some complications:
- Likelihoods are marginal likelihoods obtained by
integrating over unspecified parameters
- Prior prob is zero if a continuous parameter is fixed at a specific value
- Assign discrete non-zero prior mass to the given values of θ
p(Hi | x) ∝ p(x | Hi) p(Hi)

p(H0 | x) / p(H1 | x) = [p(H0) / p(H1)] · [p(x | H0) / p(x | H1)]
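The factorization is just arithmetic: posterior odds = prior odds × likelihood ratio. The numbers below are hypothetical:

```python
def posterior_odds(prior_h0, prior_h1, lik_h0, lik_h1):
    """Posterior odds of H0 vs H1: prior odds times the likelihood ratio."""
    return (prior_h0 / prior_h1) * (lik_h0 / lik_h1)

# Hypothetical values: equal priors, data 4x more likely under H1
odds = posterior_odds(0.5, 0.5, 0.02, 0.08)   # odds < 1 favour H1
```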
Hypothesis Testing in Context
- In Data mining things are more complicated
- 1. Data sets are large
- Expect to obtain statistical significance
- Slight departures from model will be seen as significant
- 2. Sequential model fitting is common
- 3. Results have various implications
- Many models will be examined
– e.g., discrete values for mean
- If m true independent hypotheses are each examined at 0.05
probability of incorrect rejection, then with 100 hypotheses we are almost
certain to incorrectly reject at least one
- If we set the family error rate at 0.05, we get α = 0.05/100 = 0.0005 per test
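The two numbers on this slide follow from elementary probability, as a quick check shows:

```python
# Family-wise error with m independent tests, all nulls true
alpha = 0.05
m = 100

# Probability of at least one incorrect rejection
p_any_false_rejection = 1 - (1 - alpha) ** m   # close to 1 for m = 100

# Bonferroni-style correction: test each hypothesis at alpha / m
alpha_per_test = alpha / m                     # 0.0005
```

For m = 100 the chance of at least one false rejection is about 0.994, which is why the per-test level must be tightened to 0.0005 to hold the family error rate at 0.05.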
Simultaneous Test Procedures
- Address difficulties of multiple hypotheses
- Bonferroni Inequality
- By including other terms in the expansion, we
can develop more accurate bounds, but these require knowledge of the dependence relationships
p(a1, a2, .., an) ≥ p(a1) + p(a2) + … + p(an) − (n − 1)
Sampling in Data Mining
- Differences between Statistical inference
and data mining
- Data set in data mining may consist of the entire
population
- Experimental design in statistics is concerned
with optimal ways of collecting data
- More often the database is a sample of the population
- Even when it contains records for every object in the
population, the analysis is often based on a sample
Systematic Sampling
- Strategy for Representativeness
- Taking one out of every two records
- Sampling fraction =0.5
- Can lead to unexpected problems when
there are regularities in database
- E.g., a data set where records of married
couples alternate, one spouse at a time: every second record selects only one spouse
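The married-couples pitfall is easy to demonstrate; the record layout below is a hypothetical illustration:

```python
# Records alternate husband/wife for each couple
records = [("couple%d" % i, role)
           for i in range(5)
           for role in ("husband", "wife")]

# Systematic sampling with fraction 0.5: take every second record
systematic_sample = records[::2]
roles = {role for _, role in systematic_sample}
# The regularity in the data makes the sample completely
# unrepresentative: only one spouse from every couple is selected
```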
Random Sampling
- Avoiding regularities
- Epsem Sampling:
- Equal probability of selection method
- Each record has same probability of being
chosen
- Simple random sampling
– n records are chosen from database of size N such that each set of n records has the same probability of being chosen
Results of Simple Random Sampling
- Population True mean =0.5
- Samples of size n= 10, 100
and 1000
- Repeat procedure 200 times
and plot histograms
- The larger the sample, the more closely
the values of the sample mean are distributed about the true mean
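The repeated-sampling experiment described above can be sketched with uniform(0,1) data (true mean 0.5); the seed is fixed so the run is repeatable:

```python
import random
import statistics

random.seed(42)   # fixed seed for a repeatable simulation

def sample_mean_spread(n, repeats=200):
    """Std dev of the sample mean over repeated uniform(0,1) samples of size n."""
    means = [statistics.mean(random.random() for _ in range(n))
             for _ in range(repeats)]
    return statistics.stdev(means)

# The spread of the sample mean shrinks roughly as 1/sqrt(n)
spread = {n: sample_mean_spread(n) for n in (10, 100, 1000)}
```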
Variance of Mean of Random Sample
- If variance of population of size N is σ2
variance of mean of a simple random sample of size n without replacement is
- Since second term is small, variance
decreases as sample size increases
- When sample size is doubled, standard
deviation is reduced by a factor of sqrt(2)
- Estimate σ2 from the sample using
Var(x̄) = (σ²/n)(1 − n/N)

σ̂² = Σ_i (x(i) − x̄)² / (n − 1)
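The variance formula, with its finite-population correction term, is a one-line computation; the σ², n, and N values below are hypothetical:

```python
def var_of_sample_mean(sigma2, n, N):
    """Variance of the mean of a simple random sample of size n,
    drawn without replacement from a population of size N.
    The (1 - n/N) factor is the finite-population correction."""
    return (sigma2 / n) * (1 - n / N)

# Hypothetical: population variance 4, sample 25 of 1000
v = var_of_sample_mean(4.0, 25, 1000)   # 0.16 * 0.975 = 0.156
```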
Stratified Random Sampling
- Split population into non-overlapping
subpopulations or strata
- Advantages:
- Enables making statements about subpopulations
- If strata are relatively homogeneous, most of
variability is accounted for by differences between strata
Mean of Stratified Sample
- Total size of population is N
- kth stratum has Nk elements in it
- nk are chosen for the sample from this stratum
- Sample mean within kth stratum is
- Estimate of Population Mean:
- Variance of estimator
x̄_st = (1/N) Σ_k N_k x̄_k

var(x̄_st) = (1/N²) Σ_k N_k² var(x̄_k)

(x̄_k denotes the sample mean within the kth stratum)
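The stratified estimator is a size-weighted average of the per-stratum sample means; the strata below are hypothetical:

```python
def stratified_mean(sizes, means):
    """Population-mean estimate: size-weighted average of stratum means."""
    N = sum(sizes)
    return sum(Nk * xbar_k for Nk, xbar_k in zip(sizes, means)) / N

def stratified_variance(sizes, variances):
    """Variance of the estimator: (1/N^2) * sum_k N_k^2 * var(xbar_k)."""
    N = sum(sizes)
    return sum(Nk ** 2 * v for Nk, v in zip(sizes, variances)) / N ** 2

# Hypothetical strata of 60 and 40 members with sample means 10 and 20
est = stratified_mean([60, 40], [10.0, 20.0])
var_est = stratified_variance([60, 40], [0.5, 0.5])
```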
Cluster Sampling
- Hierarchical Data
- Letters occur in words which lie in sentences
grouped into paragraphs occurring in chapters forming books sitting in libraries
- Simple random sample is difficult
- To draw a sample of elements
- Instead draw a sample of units that contain
several elements
Unequal Cluster Sizes
- Size of random sample from kth cluster is nk
- Sample mean r
- Overall sampling fraction is f
- (which is small and ignorable)
- Variance of r
r = Σ_k s_k / Σ_k n_k

var(r) = (1 − f) · a / ((Σ_k n_k)² (a − 1)) · (Σ_k s_k² + r² Σ_k n_k² − 2r Σ_k s_k n_k)

(s_k is the sum of the sampled values in the kth cluster; a is the number of clusters sampled)
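The ratio estimator and its variance can be sketched as follows; note that Σ(s_k − r·n_k)² expands to exactly the three-term sum in the variance formula, so the code uses the compact form. The cluster data are hypothetical:

```python
def ratio_estimate(cluster_sums, cluster_sizes):
    """r = sum of cluster totals / sum of cluster sizes."""
    return sum(cluster_sums) / sum(cluster_sizes)

def ratio_variance(cluster_sums, cluster_sizes, f=0.0):
    """Estimated variance of r over a sampled clusters; f is the overall
    sampling fraction (often small enough to ignore, hence default 0)."""
    a = len(cluster_sums)
    r = ratio_estimate(cluster_sums, cluster_sizes)
    n_total = sum(cluster_sizes)
    # sum_k (s_k - r n_k)^2 == sum s_k^2 + r^2 sum n_k^2 - 2r sum s_k n_k
    ss = sum((s - r * n) ** 2 for s, n in zip(cluster_sums, cluster_sizes))
    return (1 - f) * a / (n_total ** 2 * (a - 1)) * ss

# Hypothetical clusters: totals 10 and 30 over 2 elements each
r = ratio_estimate([10.0, 30.0], [2, 2])
v = ratio_variance([10.0, 30.0], [2, 2])
```

When every cluster has the same per-element mean, the variance estimate is zero, as expected for a perfectly homogeneous design.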
Conclusion
- Nothing is certain
- Data mining objective is to make
discoveries from data
- We want to be as confident as we can
that our conclusions are correct
- Fundamental tool is probability
- Universal language for handling uncertainty
- Allows us to obtain best estimates even
with data inadequacies and small samples