13. hypothesis testing



SLIDE 1
  • 13. hypothesis testing

CSE 312, Winter 2011, W.L.Ruzzo

SLIDE 2

competing hypotheses

SLIDE 3

competing hypotheses

SLIDE 4

competing hypotheses

SLIDE 5

competing hypotheses

SLIDE 6

hypothesis testing

By convention, the null hypothesis is usually the “simpler” hypothesis, or the “prevailing wisdom.” Following Occam’s Razor, you should prefer it unless there is good evidence to the contrary.

SLIDE 7

decision rules

SLIDE 8

error types

SLIDE 9

likelihood ratio tests

SLIDE 10

simple vs composite hypotheses

Note that the LRT is problematic for composite hypotheses: which value of the unknown parameter would you use to compute its likelihood?
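A minimal sketch of why this is a problem, assuming a hypothetical data set of five coin flips and a composite alternative "p(H) > 1/2". The function name `likelihood` and the candidate values are illustrative choices, not from the slides. Each candidate parameter value gives a different likelihood, so there is no single number to put in the ratio.

```python
def likelihood(flips: str, p_heads: float) -> float:
    """Probability of an exact H/T sequence of independent flips."""
    result = 1.0
    for flip in flips:
        result *= p_heads if flip == "H" else 1.0 - p_heads
    return result

data = "HHHTH"                     # assumed example data
candidates = [0.6, 0.7, 0.8, 0.9]  # a few values allowed by "p(H) > 1/2"

# One likelihood per candidate value of the unknown parameter.
likelihoods = {p: likelihood(data, p) for p in candidates}

for p, value in likelihoods.items():
    print(f"p(H) = {p}: likelihood = {value:.5f}")
```

The composite hypothesis does not say which of these values to use, which is exactly the difficulty the note raises.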

SLIDE 11

Neyman-Pearson lemma

SLIDE 12

example

SLIDE 13

another example

Given: A coin, either fair (p(H) = 1/2) or biased (p(H) = 2/3)
Decide: which
How? Flip it 5 times. Suppose outcome D = HHHTH

Null Model/Null Hypothesis M0: p(H) = 1/2
Alternative Model/Alt Hypothesis M1: p(H) = 2/3

Likelihoods:

P(D | M0) = (1/2) (1/2) (1/2) (1/2) (1/2) = 1/32
P(D | M1) = (2/3) (2/3) (2/3) (1/3) (2/3) = 16/243

Likelihood Ratio:

P(D | M1) / P(D | M0) = (16/243) / (1/32) = 512/243 ≈ 2.1

I.e., the data are ≈ 2.1x more likely under the alt model than under the null model.
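The arithmetic above can be checked with a short sketch using exact fractions. The function name `sequence_likelihood` is an illustrative choice, not from the slides.

```python
from fractions import Fraction

def sequence_likelihood(flips: str, p_heads: Fraction) -> Fraction:
    """Probability of an exact H/T sequence of independent flips."""
    likelihood = Fraction(1)
    for flip in flips:
        likelihood *= p_heads if flip == "H" else 1 - p_heads
    return likelihood

data = "HHHTH"
lik_null = sequence_likelihood(data, Fraction(1, 2))  # M0: fair coin
lik_alt = sequence_likelihood(data, Fraction(2, 3))   # M1: biased coin
ratio = lik_alt / lik_null

print(lik_null)      # 1/32
print(lik_alt)       # 16/243
print(ratio)         # 512/243
print(float(ratio))  # roughly 2.1
```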

SLIDE 14

some notes

  • Log of likelihood ratio is equivalent, often more convenient: add logs instead of multiplying…
  • “Likelihood Ratio Tests”: reject null if LLR > threshold. LLR > 0 disfavors null, but higher threshold gives stronger evidence against
  • Neyman-Pearson Theorem: For a given error rate, LRT is as good a test as any (subject to some fine print).
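A rough sketch of the log-likelihood-ratio (LLR) rule, assuming independent coin flips with a fair null (p(H) = 1/2) and a biased alternative (p(H) = 2/3). The function names and the threshold values are hypothetical, chosen only to show per-flip logs being added and the sum compared to a threshold.

```python
import math

def log_likelihood_ratio(flips: str, p_null: float, p_alt: float) -> float:
    """Sum of per-flip log ratios log(P(flip | alt) / P(flip | null))."""
    llr = 0.0
    for flip in flips:
        if flip == "H":
            llr += math.log(p_alt) - math.log(p_null)
        else:
            llr += math.log(1 - p_alt) - math.log(1 - p_null)
    return llr

def reject_null(flips: str, p_null: float, p_alt: float,
                threshold: float = 0.0) -> bool:
    """Likelihood ratio test: reject the null if LLR exceeds the threshold."""
    return log_likelihood_ratio(flips, p_null, p_alt) > threshold

llr = log_likelihood_ratio("HHHTH", 0.5, 2 / 3)
print(llr)                                               # log(512/243), about 0.745
print(reject_null("HHHTH", 0.5, 2 / 3))                  # True at threshold 0
print(reject_null("HHHTH", 0.5, 2 / 3, threshold=1.0))   # False at a stricter threshold
```

Raising the threshold makes the test demand stronger evidence before rejecting the null, at the cost of rejecting less often when the alternative is true.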

SLIDE 15

summary

Null/Alternative hypotheses - specify distributions from which data are assumed to have been sampled

Simple hypothesis - one distribution

E.g., “Normal, mean = 42, variance = 12”

Composite hypothesis - more than one distribution

E.g., “Normal, mean > 42, variance = 12”

Decision rule - “accept/reject null if sample data...”; many possible

Type 1 error: reject null when it is true
Type 2 error: accept null when it is false

α = P(type 1 error), β = P(type 2 error)

Likelihood ratio tests: for simple null vs simple alt, compare ratio of likelihoods under the 2 competing models to a fixed threshold.

Neyman-Pearson: LRT is best possible in this scenario.
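To make α and β concrete, here is a small sketch under stated assumptions: a coin flipped n = 5 times, a simple null p(H) = 1/2, a simple alternative p(H) = 2/3, and a hypothetical decision rule "reject null if at least 4 heads". For this pair of hypotheses the likelihood ratio grows with the number of heads, so a cutoff on heads corresponds to a cutoff on the ratio. The names `binom_pmf` and `error_rates` are illustrative.

```python
from math import comb

def binom_pmf(k: int, n: int, p: float) -> float:
    """P(exactly k heads in n independent flips with P(H) = p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def error_rates(n: int, min_heads_to_reject: int,
                p_null: float, p_alt: float) -> tuple[float, float]:
    """Return (alpha, beta) for the rule 'reject null if heads >= cutoff'."""
    # alpha: the null is true, but the rule rejects it.
    alpha = sum(binom_pmf(k, n, p_null)
                for k in range(min_heads_to_reject, n + 1))
    # beta: the null is false (alt is true), but the rule accepts it.
    beta = sum(binom_pmf(k, n, p_alt)
               for k in range(min_heads_to_reject))
    return alpha, beta

alpha, beta = error_rates(n=5, min_heads_to_reject=4, p_null=0.5, p_alt=2 / 3)
print(alpha)  # 6/32 = 0.1875
print(beta)   # 131/243, about 0.539
```

Moving the cutoff trades one error for the other: a higher cutoff lowers α and raises β.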

SLIDE 16

And One Last Bit of Probability Theory
