

SLIDE 1

HYPOTHESIS TESTING

INTRODUCTION TO DATA ANALYSIS PART III

SLIDE 2

LEARNING GOALS

▸ become able to interpret & apply some statistical tests
  ▸ Pearson's χ²-tests of independence
  ▸ z-test
  ▸ one-sample t-test
  ▸ two-sample t-test
  ▸ one-way ANOVA
▸ understand differences and commonalities of different approaches to frequentist testing
  ▸ Fisher
  ▸ Neyman/Pearson
  ▸ modern hybrid NHST

SLIDE 3

P-VALUE

p(D_obs) = P( T ⪰_{H0,a} t(D_obs) ∣ H0 )

(the probability, under H0, that the test statistic T takes a value at least as extreme — with respect to the ordering a — as the value t(D_obs) actually observed)
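The definition above can be made concrete with a minimal example (invented numbers, not from the slides): for a fair-coin null hypothesis, take T = number of heads in N flips and use "T at least as large as observed" as the ordering, so the one-sided p-value is just an upper binomial tail, assuming scipy is available.

```python
# Sketch (hypothetical example): one-sided p-value for a fair coin,
# following p(D_obs) = P(T >= t(D_obs) | H0) with T = number of heads.
from scipy import stats

n_flips, k_obs = 24, 17          # invented data: 17 heads in 24 flips
# P(T >= 17 | H0: theta = 0.5) = upper tail of Binomial(24, 0.5)
p_value = stats.binom.sf(k_obs - 1, n_flips, 0.5)
print(round(p_value, 4))
```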

SLIDE 4

Pearson's χ² test
[goodness of fit]

SLIDE 5

PEARSON χ² TESTS

▸ tests for categorical data (with more than two categories)
▸ two flavors:
  ▸ test of goodness of fit
  ▸ test of independence
▸ sampling distribution is a χ² distribution

SLIDE 6

χ² DISTRIBUTION

▸ standard normal random variables: X₁, …, Xₙ
▸ derived RV: Y = X₁² + … + Xₙ²
▸ it follows (by construction) that: Y ∼ χ²-distribution(n)
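This construction is easy to check by simulation: summing n squared standard normals should give draws whose mean is close to n and whose variance is close to 2n, the known moments of a χ²-distribution with n degrees of freedom. A minimal sketch, assuming numpy is available:

```python
# Sketch: simulate Y = X_1^2 + ... + X_n^2 for i.i.d. standard normal X_i
# and compare the sample moments to those of chi^2(n): mean n, variance 2n.
import numpy as np

rng = np.random.default_rng(2024)
n, reps = 5, 200_000
x = rng.standard_normal((reps, n))
y = (x ** 2).sum(axis=1)           # one draw of Y per row

print(round(y.mean(), 2))           # should be close to n = 5
print(round(y.var(), 1))            # should be close to 2n = 10
```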

SLIDE 7

PEARSON'S χ²-TEST [GOODNESS OF FIT]

Is it conceivable that each category (= pair of music + subject choice) has been selected with the same flat probability of 0.25?

SLIDE 8

FREQUENTIST MODEL FOR PEARSON'S χ²-TEST [GOODNESS OF FIT]

Model: n⃗ ∼ Multinomial(p⃗, N)

Test statistic: χ² = Σ_{i=1}^{k} (nᵢ − N·pᵢ)² / (N·pᵢ)

Sampling distribution: χ² ∼ χ²-distribution(k − 1)
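The model above can be run in a few lines with invented counts for the four music+subject categories (a sketch, assuming scipy; `stats.chisquare` defaults to equal expected counts, matching the flat null p⃗ = (0.25, 0.25, 0.25, 0.25)):

```python
# Sketch (hypothetical counts): Pearson chi^2 goodness-of-fit test
# against a flat null, N = 100 observations in k = 4 categories.
from scipy import stats

observed = [38, 28, 24, 10]            # invented cell counts
chi2, p = stats.chisquare(observed)    # default: equal expected counts (25 each)
print(round(chi2, 2), round(p, 4))     # chi^2 statistic and p-value, df = k - 1 = 3
```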


SLIDE 12

PEARSON'S χ²-TEST [GOODNESS OF FIT]

How to interpret / report the result:

What about the lecturer's conjecture that (colorfully speaking) logic + metal = 🥱?

SLIDE 13

Pearson's χ² test
[independence]

SLIDE 14

STOCHASTIC INDEPENDENCE

▸ events A and B are stochastically independent iff:
  ▸ intuitively: learning one does not change beliefs about the other
  ▸ formally: P(A ∣ B) = P(A)
▸ notice that P(A ∣ B) = P(A) entails that P(B ∣ A) = P(B) (see web-book)

SLIDE 15

STOCHASTIC INDEPENDENCE

SLIDE 16

PEARSON'S χ²-TEST [INDEPENDENCE]

Is it conceivable that the outcome in each cell is given by independent choices of row and column options? Hence: is the probability of a choice of cell the product of the probabilities of row and column choices?

SLIDE 17

FREQUENTIST MODEL FOR PEARSON'S χ²-TEST [INDEPENDENCE]

Model: p⃗ = vectorized outer product of r⃗ and c⃗;  n⃗ ∼ Multinomial(p⃗, N)

Test statistic: χ² = Σ_{i=1}^{k} (nᵢ − N·pᵢ)² / (N·pᵢ)

Sampling distribution: χ² ∼ χ²-distribution((k_r − 1) · (k_c − 1))
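A minimal sketch of this test with an invented 2×2 table (assuming scipy; `correction=False` switches off the Yates continuity correction so the statistic matches the plain formula above):

```python
# Sketch (hypothetical 2x2 counts): Pearson chi^2 test of independence,
# e.g. rows = music choice, columns = subject choice (invented numbers).
from scipy import stats

table = [[35, 15],
         [20, 30]]
chi2, p, dof, expected = stats.chi2_contingency(table, correction=False)
print(round(chi2, 3), round(p, 4), dof)   # dof = (k_r - 1) * (k_c - 1) = 1
```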


SLIDE 22

FREQUENTIST MODEL FOR PEARSON'S χ²-TEST [INDEPENDENCE]

How to interpret / report the result:

SLIDE 23

z-test

SLIDE 24

SCENARIO FOR A z-TEST [ONE-SAMPLE]

▸ metric variable x⃗ with samples from a normal distribution
▸ mean μ unknown
▸ standard deviation σ known [usually unrealistic!]

Is it plausible to maintain that this data was generated by a normal distribution with mean 100 (if we assume that the standard deviation is known to be 15)?

SLIDE 25

FREQUENTIST MODEL FOR A z-TEST [ONE-SAMPLE]

Model: xᵢ ∼ Normal(μ, σ)

Test statistic: z = (x̄ − μ) / (σ / √N)

Sampling distribution: z ∼ Normal(0, 1)

SLIDE 26

FREQUENTIST z-TEST [APPLICATION]

xᵢ ∼ Normal(μ, σ)   z = (x̄ − μ) / (σ / √N)   z ∼ Normal(0, 1)


SLIDE 28

one-sample t-test

SLIDE 29

FREQUENTIST t-TEST MODEL [ONE-SAMPLE]

Model: xᵢ ∼ Normal(μ, σ)

Test statistic: t = (x̄ − μ₀) / (σ̂ / √n),  with  σ̂ = √( 1/(n−1) · Σ_{i=1}^{n} (xᵢ − x̄)² )

Sampling distribution: t ∼ Student-t(ν = n − 1)
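When σ must be estimated from the data, the same invented sample as in the z-test example gives a t-test; a sketch assuming scipy (`ttest_1samp` implements exactly this one-sample model):

```python
# Sketch: one-sample t-test (sigma estimated from the data),
# hypothetical sample, H0: mu = 100, two-sided.
import numpy as np
from scipy import stats

x = np.array([98, 107, 111, 93, 104, 102, 110, 115, 99, 108])
res = stats.ttest_1samp(x, popmean=100)
print(round(res.statistic, 3), round(res.pvalue, 4))
```

Note that the p-value is larger than in the z-test on the same numbers: estimating σ adds uncertainty, which the heavier-tailed Student-t sampling distribution accounts for.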

SLIDE 30

t DISTRIBUTION

▸ two random variables: X ∼ Normal(0, 1), Y ∼ χ²-distribution(n)
▸ derived RV: Z = X / √(Y/n)
▸ it follows (by construction) that: Z ∼ Student-t(ν = n)

SLIDE 31

FREQUENTIST t-TEST [APPLICATION]

xᵢ ∼ Normal(μ, σ)   t = (x̄ − μ₀) / (σ̂ / √n)   t ∼ Student-t(ν = n − 1)
σ̂ = √( 1/(n−1) · Σ_{i=1}^{n} (xᵢ − x̄)² )

SLIDE 33

two-sample t-test
(unpaired data, equal variance & unequal sample size)

SLIDE 34

COMPARING TWO GROUPS OF METRIC MEASURES

Is it plausible to assume that the observed prices for conventional and organic avocados could have been generated by a single normal distribution?

SLIDE 35

FREQUENTIST t-TEST MODEL [TWO-SAMPLE, UNPAIRED, EQUAL VARIANCE, UNEQUAL SAMPLE SIZES]

Model: xᵢᴬ ∼ Normal(μ + δ, σ),  xᵢᴮ ∼ Normal(μ, σ)

Test statistic: t = ((x̄ᴬ − x̄ᴮ) − δ) / (σ̂ · √(1/nᴬ + 1/nᴮ)),
with pooled estimate σ̂ = √( ((nᴬ − 1)·σ̂ᴬ² + (nᴮ − 1)·σ̂ᴮ²) / (nᴬ + nᴮ − 2) )

Sampling distribution: t ∼ Student-t(ν = nᴬ + nᴮ − 2)
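A sketch of this pooled-variance test with invented avocado-price data for the two groups, assuming scipy (`equal_var=True` selects exactly the pooled estimator from the slide; `equal_var=False` would give Welch's test instead):

```python
# Sketch: two-sample t-test (unpaired, equal variance, unequal n),
# invented prices for conventional vs. organic avocados, H0: delta = 0.
import numpy as np
from scipy import stats

conventional = np.array([1.10, 0.95, 1.20, 1.05, 0.99, 1.15, 1.08])
organic      = np.array([1.55, 1.40, 1.62, 1.48, 1.70])
res = stats.ttest_ind(organic, conventional, equal_var=True)
print(round(res.statistic, 3), round(res.pvalue, 6))
```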

SLIDE 36

TWO-SAMPLE T-TEST EXAMPLE

xᵢᴬ ∼ Normal(μ + δ, σ),  xᵢᴮ ∼ Normal(μ, σ)
t = ((x̄ᴬ − x̄ᴮ) − δ) / (σ̂ · √(1/nᴬ + 1/nᴮ)),  σ̂ = √( ((nᴬ − 1)·σ̂ᴬ² + (nᴮ − 1)·σ̂ᴮ²) / (nᴬ + nᴮ − 2) )
t ∼ Student-t(ν = nᴬ + nᴮ − 2)


SLIDE 38

one-way ANOVA

SLIDE 39

COMPARING K ≥ 2 GROUPS OF METRIC MEASURES

Is it plausible to assume that these measures stem from the same normal distribution?

SLIDE 40

WHY NOT t-TESTS?

▸ we could run t-tests between different groups
▸ chance of an α error rises with each comparison
▸ common corrections apply
▸ gets tedious with large k
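The inflation is easy to quantify: with k groups there are k·(k−1)/2 pairwise comparisons, and if each test independently has false-positive rate α, the chance of at least one false positive is 1 − (1 − α)^m. A worked sketch (invented k; the α/m value is the per-test level under a Bonferroni-style correction):

```python
# Sketch: familywise error rate of m pairwise t-tests at level alpha,
# assuming independent comparisons (an idealization).
from math import comb

alpha, k = 0.05, 6
m = comb(k, 2)                       # number of pairwise comparisons
fwer = 1 - (1 - alpha) ** m          # P(at least one false positive)
print(m, round(fwer, 3), round(alpha / m, 4))
```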

SLIDE 41

FREQUENTIST MODEL FOR ANOVA [ONE-WAY]

Model: xᵢⱼ ∼ Normal(μ, σ)

Test statistic: F = σ̂_between / σ̂_within, with
σ̂_between = Σ_{j=1}^{k} nⱼ (x̄ⱼ − x̿)² / (k − 1)
σ̂_within = Σ_{j=1}^{k} Σ_{i=1}^{nⱼ} (xᵢⱼ − x̄ⱼ)² / Σ_{j=1}^{k} (nⱼ − 1)

Sampling distribution: F ∼ F-distribution(k − 1, Σ_{j=1}^{k} (nⱼ − 1))
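A sketch of a one-way ANOVA on three invented groups, assuming scipy (`f_oneway` computes exactly the between/within variance ratio above):

```python
# Sketch: one-way ANOVA, H0: all three (hypothetical) groups share one mean.
from scipy import stats

g1 = [4.1, 3.9, 4.5, 4.3, 4.0]
g2 = [4.8, 5.1, 4.9, 5.3, 5.0]
g3 = [4.2, 4.4, 4.1, 4.6, 4.3]
f, p = stats.f_oneway(g1, g2, g3)
print(round(f, 2), round(p, 5))      # F statistic and p-value, df = (2, 12)
```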

SLIDE 42

F-STATISTIC EXAMPLES

SLIDE 43

F DISTRIBUTION

▸ two χ²-distributed random variables: X ∼ χ²-distribution(m), Y ∼ χ²-distribution(n)
▸ derived RV: Z = (X/m) / (Y/n)
▸ it follows (by construction) that: Z ∼ F-distribution(m, n)

SLIDE 44

EXAMPLE

SLIDE 45

varieties of frequentist testing

SLIDE 46

THREE VARIETIES OF FREQUENTIST TESTING

                                    FISHER                         NEYMAN/PEARSON                  HYBRID NHST*
explicit & serious alternative Ha   ✗                              ✓                               ✗
when to set up statistical model    after data collection          before data collection          after data collection
goal of statistical analysis        quantify evidence against H0   decide action: adopt H0 or Ha   decide action: adopt H0 or ¬H0
power calculation                   ✗                              ✓                               ✗

* this is a worst-case portrait of modern NHST; this is not how it should be done

SLIDE 47

NEYMAN/PEARSON APPROACH [INFORMAL GIST]

▸ procedure in the N/P approach:
  ▸ fix H0 and Ha (based on prior research)
  ▸ determine desired α- and β-error levels
  ▸ calculate the sample size N necessary for β given α
  ▸ run the experiment
  ▸ determine significance based on the α-level
  ▸ make a dichotomous decision:
    ▸ accept Ha if the test is significant
    ▸ accept H0 otherwise
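The "calculate N for given α and β" step can be sketched for the simplest case, a two-sided one-sample z-test with known σ, using the textbook formula N = ((z_{1−α/2} + z_{1−β}) · σ / δ)² (the effect size δ and σ here are invented numbers; scipy assumed):

```python
# Sketch: Neyman/Pearson sample-size calculation for a two-sided
# one-sample z-test with known sigma and minimal relevant effect delta.
from math import ceil
from scipy import stats

alpha, beta = 0.05, 0.2             # desired error levels (power = 0.8)
delta, sigma = 5.0, 15.0            # hypothetical effect and known sd
z_a = stats.norm.ppf(1 - alpha / 2)
z_b = stats.norm.ppf(1 - beta)
n = ceil(((z_a + z_b) * sigma / delta) ** 2)
print(n)                            # required sample size
```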

SLIDE 48

LONG-TERM ERROR CONTROL IN THE NEYMAN/PEARSON APPROACH

[figure: sampling distributions of the mean under H0 (null hypothesis) and Ha (alternative hypothesis); α error = accept Ha when H0 is true; β error = accept H0 when Ha is true; more data = tighter curves = lower β]
SLIDE 49

EXAMPLES FROM TEXTBOOKS

Neither textbook talks about fixing Ha and/or calculating the power of a test.
SLIDE 50

THREE VARIETIES OF FREQUENTIST TESTING

                                    FISHER                         NEYMAN/PEARSON                  HYBRID NHST*
explicit & serious alternative Ha   ✗                              ✓                               ✗
when to set up statistical model    after data collection          before data collection          after data collection
goal of statistical analysis        quantify evidence against H0   decide action: adopt H0 or Ha   decide action: adopt H0 or ¬H0
power calculation                   ✗                              ✓                               ✗

* this is a worst-case portrait of modern NHST; this is not how it should be done

SLIDE 51