Hypothesis test for a proportion: Inference for Categorical Data in R

SLIDE 1

Hypothesis test for a proportion

INFERENCE FOR CATEGORICAL DATA IN R

Andrew Bray

Assistant Professor of Statistics at Reed College

SLIDE 8

Do half of Americans favor capital punishment?

gss2016 %>%
  ggplot(aes(x = cappun)) +
  geom_bar()

p_hat <- gss2016 %>%
  summarize(mean(cappun == "FAVOR")) %>%
  pull()

p_hat
# 0.5666667

SLIDE 9

Do half of Americans favor capital punishment?

null <- gss2016 %>%
  specify(
    response = cappun,
    success = "FAVOR"
  ) %>%
  hypothesize(
    null = "point",
    p = 0.5
  ) %>%
  generate(
    reps = 500,
    type = "simulate"
  ) %>%
  calculate(stat = "prop")

# A tibble: 500 x 2
   replicate  stat
   <fct>     <dbl>
 1 1         0.48
 2 2         0.447
 3 3         0.48
 4 4         0.44
 5 5         0.407
 6 6         0.52
 7 7         0.413
 8 8         0.553
 9 9         0.52
10 10        0.467
# … with 490 more rows
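As a rough picture of what the simulation step is doing, the sketch below draws the same kind of null proportions in base R. It is not infer's implementation. The sample size n = 150 is an assumption (the slide does not state it, though it is consistent with p_hat = 85/150), and the seed and variable names are illustrative.

```r
set.seed(2016)   # arbitrary seed, only for reproducibility

n    <- 150      # ASSUMED sample size, not stated on the slide
p0   <- 0.5      # null value, as in hypothesize(p = 0.5)
reps <- 500      # number of simulated samples, as in generate(reps = 500)

# Each replicate: count of "FAVOR" answers among n draws under H0,
# converted to a sample proportion
null_stats <- rbinom(reps, size = n, prob = p0) / n

head(null_stats)
```

Each element of `null_stats` plays the role of one `stat` value in the tibble above: a sample proportion that could plausibly arise if the true proportion were 0.5.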

SLIDE 10

Do half of Americans favor capital punishment?

ggplot(null, aes(x = stat)) +
  geom_density() +
  geom_vline(
    xintercept = p_hat,
    color = "red"
  )

null %>%
  summarize(mean(stat > p_hat)) %>%
  pull() * 2
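The last expression doubles a one-tailed proportion to get a two-sided p-value. Here is a base-R sketch of the same arithmetic, assuming n = 150 (not stated on the slide) and using the p_hat value printed earlier. The simulated `null_stats` vector is a stand-in for the `null` tibble, not the course's object.

```r
set.seed(42)             # arbitrary seed, only for reproducibility

n     <- 150             # ASSUMED sample size
p0    <- 0.5             # null value
p_hat <- 0.5666667       # observed proportion reported on the earlier slide

# Stand-in null distribution of sample proportions under H0
null_stats <- rbinom(500, size = n, prob = p0) / n

# Share of simulated proportions beyond p_hat in the upper tail
one_tail <- mean(null_stats > p_hat)

# Double it for a two-sided test; cap at 1 so the result stays a valid probability
p_value <- min(1, 2 * one_tail)
p_value
```

Doubling the upper tail is appropriate here because p_hat lies above the null value of 0.5. If the observed statistic fell below the null value, the lower tail (`stat < p_hat`) would be the one to double.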

SLIDE 11

Hypothesis test

Null hypothesis: theory about the state of the world.
Null distribution: distribution of test statistics assuming the null is true.
p-value: a measure of consistency between the null hypothesis and your observations.
  high p-value: consistent (p-value > alpha)
  low p-value: inconsistent (p-value < alpha)
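The comparison with alpha can be written as a small helper. This is only a sketch: `decide()` and the alpha of 0.05 are illustrative choices that do not appear on the slides, and the counts 85 out of 150 are assumed values chosen to match p_hat = 0.5667. `binom.test()` from R's stats package gives an exact p-value that can be set beside the simulation-based one.

```r
# Hypothetical helper: turn a p-value into the slide's "consistent / inconsistent" wording
decide <- function(p_value, alpha = 0.05) {
  if (p_value < alpha) {
    "inconsistent with the null hypothesis"
  } else {
    "consistent with the null hypothesis"
  }
}

# Exact binomial test of H0: p = 0.5, using ASSUMED counts (85 successes in 150 trials)
exact <- binom.test(x = 85, n = 150, p = 0.5, alternative = "two.sided")

decide(exact$p.value)
```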

SLIDE 12

Let's practice!

SLIDE 13

Intervals for differences

SLIDE 14

A question in two variables

Do women and men believe at different rates?
Let $p$ be the proportion that believe in life after death.

$H_0: p_{female} - p_{male} = 0$
$H_A: p_{female} - p_{male} \neq 0$

SLIDE 15

Do women and men have different opinions on life after death?

ggplot(gss2016, aes(x = sex, fill = postlife)) + geom_bar()

SLIDE 16

Do women and men have different opinions on life after death?

ggplot(gss2016, aes(x = sex, fill = postlife)) + geom_bar(position = "fill")

SLIDE 17

Do women and men have different opinions on life after death?

p_hats <- gss2016 %>%
  group_by(sex) %>%
  summarize(mean(postlife == "YES", na.rm = TRUE)) %>%
  pull()

d_hat <- diff(p_hats)
d_hat
# 0.1472851
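`diff()` subtracts the first element from the second, so the sign of d_hat depends on the order in which the groups come out of `group_by()`. A tiny illustration with made-up proportions (the values and names below are hypothetical, not taken from gss2016):

```r
# Hypothetical proportions, named in the order they are stored
p_hats_toy <- c(MALE = 0.70, FEMALE = 0.85)

# Second minus first: FEMALE - MALE
diff(p_hats_toy)

# Reversing the order flips the sign: MALE - FEMALE
diff(rev(p_hats_toy))
```

When a null distribution is later computed with an explicit group order, the observed difference should be taken in the same direction so that the two are comparable.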

SLIDE 18

Generating data from H0

$H_0: p_{female} - p_{male} = 0$

There is no association between belief in the afterlife and the sex of a subject.
The variable postlife is independent of the variable sex.
⇒ Generate data by permutation
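What a single permutation does can be sketched in base R. The data frame below is made up purely for illustration (the counts are hypothetical, not the gss2016 responses); only the shuffling idea mirrors the slides.

```r
set.seed(1)   # arbitrary seed, only for reproducibility

# HYPOTHETICAL toy data: 70 women and 67 men with made-up YES/NO answers
toy <- data.frame(
  sex      = rep(c("FEMALE", "MALE"), times = c(70, 67)),
  postlife = c(rep(c("YES", "NO"), times = c(60, 10)),
               rep(c("YES", "NO"), times = c(48, 19)))
)

# One permutation: shuffle the responses and leave sex untouched
permuted <- toy
permuted$postlife <- sample(toy$postlife)

# The overall YES/NO counts are unchanged ...
table(toy$postlife)
table(permuted$postlife)

# ... but the pairing of answers with sex is now random
table(permuted$sex, permuted$postlife)
```

Because the shuffle breaks any link between the two columns while keeping each column's totals, the permuted data behave like a sample from a world in which postlife and sex are independent.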

SLIDE 19

Do women and men have different opinions on life after death?

gss2016 %>%
  specify(
    response = postlife,
    explanatory = sex,
    success = "YES"
  ) %>%
  hypothesize(null = "independence") %>%
  generate(reps = 1, type = "permute")

SLIDE 20

Do women and men have different opinions on life after death?

gss2016 %>%
  specify(
    postlife ~ sex, # this line is new
    success = "YES"
  ) %>%
  hypothesize(null = "independence") %>%
  generate(reps = 1, type = "permute")

Response: postlife (factor)
Explanatory: sex (factor)
Null Hypothesis: independence
# A tibble: 137 x 3
# Groups:   replicate [1]
  postlife sex    replicate
  <fct>    <fct>      <int>
1 YES      FEMALE         1
2 YES      MALE           1
3 YES      FEMALE         1
4 YES      MALE           1
5 YES      MALE           1
6 YES      FEMALE         1
7 NO       FEMALE         1

SLIDE 21

Do women and men have different opinions on life after death?

gss2016 %>%
  specify(
    postlife ~ sex,
    success = "YES"
  ) %>%
  hypothesize(null = "independence") %>%
  generate(reps = 1, type = "permute")

Response: postlife (factor)
Explanatory: sex (factor)
Null Hypothesis: independence
# A tibble: 137 x 3
# Groups:   replicate [1]
  postlife sex    replicate
  <fct>    <fct>      <int>
1 YES      FEMALE         1
2 NO       MALE           1
3 NO       FEMALE         1
4 YES      MALE           1
5 YES      MALE           1
6 YES      FEMALE         1
7 YES      FEMALE         1

SLIDE 22

Do women and men have different opinions on life after death?

# stored as null so that the density plot on the following slide can use it
null <- gss2016 %>%
  specify(postlife ~ sex, success = "YES") %>%
  hypothesize(null = "independence") %>%
  generate(reps = 500, type = "permute") %>%
  calculate(stat = "diff in props", order = c("FEMALE", "MALE"))

Warning message:
Removed 13 rows containing missing values.

SLIDE 23

Do women and men have different opinions on life after death?

ggplot(null, aes(x = stat)) +
  geom_density() +
  geom_vline(xintercept = d_hat, color = "red")

These data suggest that there is a difference between sexes in the belief of life after death.
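The conclusion above is read off the plot; a p-value makes the same comparison numeric. The following is a self-contained base-R sketch of the whole permutation test on made-up data. The counts, the helper `diff_in_props()`, and the seed are hypothetical; only the procedure mirrors the slides.

```r
set.seed(1)   # arbitrary seed, only for reproducibility

# HYPOTHETICAL toy data (made-up counts, NOT the gss2016 responses)
toy <- data.frame(
  sex      = rep(c("FEMALE", "MALE"), times = c(70, 67)),
  postlife = c(rep(c("YES", "NO"), times = c(60, 10)),
               rep(c("YES", "NO"), times = c(48, 19)))
)

# Difference in the proportion of "YES": FEMALE minus MALE
diff_in_props <- function(response, group) {
  mean(response[group == "FEMALE"] == "YES") -
    mean(response[group == "MALE"] == "YES")
}

# Observed difference in the toy data
d_obs <- diff_in_props(toy$postlife, toy$sex)

# Null distribution: shuffle the responses 500 times, keeping sex fixed
perm_stats <- replicate(500, diff_in_props(sample(toy$postlife), toy$sex))

# Two-sided p-value: share of permuted differences at least as extreme as observed
p_value <- mean(abs(perm_stats) >= abs(d_obs))
p_value
```

Taking absolute values treats large differences in either direction as evidence against the null, which matches the two-sided alternative that the two proportions differ.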

SLIDE 24

Let's practice!

SLIDE 25

Statistical errors

SLIDE 44

Let's practice!
