Welcome to the course!
FOUNDATIONS OF INFERENCE
Jo Hardin
Instructor
What is statistical inference?
The process of making claims about a population based on information from a sample
Null hypothesis ($H_0$): The claim that is not that interesting
Alternative hypothesis ($H_A$): The claim corresponding to the research hypothesis
The "goal" is to disprove the null hypothesis
Compare the speed of two different subspecies of cheetah
$H_0$: Asian and African cheetahs run at the same speed, on average
$H_A$: African cheetahs are faster than Asian cheetahs, on average
From a sample, the researchers would like to claim that Candidate X will win
$H_0$: Candidate X will get half the votes
$H_A$: Candidate X will get more than half the votes
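The two examples can also be written symbolically. This is only a sketch: the symbols $\mu$ (true mean running speed of a subspecies) and $p$ (true proportion of votes for Candidate X) are notation assumed here, not taken from the slides.

```latex
% Cheetah example: \mu is the (assumed) true mean speed of each subspecies.
\[
H_0: \mu_{\mathrm{African}} = \mu_{\mathrm{Asian}}
\qquad
H_A: \mu_{\mathrm{African}} > \mu_{\mathrm{Asian}}
\]

% Election example: p is the (assumed) true proportion voting for Candidate X.
\[
H_0: p = 0.5
\qquad
H_A: p > 0.5
\]
```

In both cases the alternative is one-sided, matching the direction of the research claim.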
Generating a distribution of the statistic from the null population gives information about whether the observed data are inconsistent with the null hypothesis
Original data

Location  Cola  Orange
East      28    6
West      19    7

$\hat{p}_{east} = 28/(28 + 6) = 0.82$
$\hat{p}_{west} = 19/(19 + 7) = 0.73$
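As a quick check of the arithmetic, the two sample proportions and their difference can be recomputed directly from the table counts. This is a minimal base R sketch; the object names `counts`, `p_hat`, and `diff_obs` are illustrative and do not come from the slides.

```r
# Counts copied from the "Original data" table.
counts <- matrix(
  c(28, 6,
    19, 7),
  nrow = 2, byrow = TRUE,
  dimnames = list(location = c("East", "West"),
                  drink = c("Cola", "Orange"))
)

# Proportion of cola drinkers within each location (row-wise proportion).
p_hat <- counts[, "Cola"] / rowSums(counts)
print(round(p_hat, 2))

# Difference in proportions, West minus East.
diff_obs <- unname(p_hat["West"] - p_hat["East"])
print(diff_obs)
```

The West-minus-East ordering is chosen so that the sign agrees with the `diff(prop_cola)` value computed later in the deck.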

First shuffle, same as original

Location  Cola  Orange
East      28    6
West      19    7

Second shuffle

Location  Cola  Orange
East      27    7
West      20    6

Third shuffle

Location  Cola  Orange
East      26    8
West      21    5

Fourth shuffle

Location  Cola  Orange
East      25    9
West      22    4

Fifth shuffle

Location  Cola  Orange
East      29    5
West      18    8
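The shuffles above can be imitated in a few lines of base R. This is a hedged sketch, not the course's own code: the data frame `soda_sketch` is rebuilt here from the table counts, and the seed is arbitrary. The point it illustrates is that permuting the drink labels moves the cell counts around while leaving every row and column total unchanged.

```r
# Rebuild the 60 individual observations from the original table counts
# (hypothetical object; the course uses its own `soda` data frame).
soda_sketch <- data.frame(
  location = rep(c("East", "West"), times = c(34, 26)),
  drink = c(rep(c("cola", "orange"), times = c(28, 6)),
            rep(c("cola", "orange"), times = c(19, 7)))
)

set.seed(1)  # arbitrary seed, used only so the shuffle is reproducible

# One shuffle: permute the drink labels while keeping locations fixed.
shuffled <- transform(soda_sketch, drink = sample(drink))

# Cross-tabulate the shuffled data; margins match the original table.
shuffle_table <- table(shuffled$location, shuffled$drink)
print(shuffle_table)
```

Because the margins are fixed, each shuffled table is fully determined by a single cell, for example the number of East respondents who chose cola.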

# Observed difference in the proportion choosing cola
soda %>%
  group_by(location) %>%
  summarize(prop_cola = mean(drink == "cola")) %>%
  summarize(diff(prop_cola))

# A tibble: 1 x 1
  `diff(prop_cola)`
              <dbl>
1       -0.09276018

library(infer)

# One permuted difference under the null hypothesis of independence
soda %>%
  specify(drink ~ location, success = "cola") %>%
  hypothesize(null = "independence") %>%
  generate(reps = 1, type = "permute") %>%
  calculate(stat = "diff in props", order = c("west", "east"))

# A tibble: 1 x 2
  replicate        stat
      <int>       <dbl>
1         1 -0.02488688

# Five permuted differences
soda %>%
  specify(drink ~ location, success = "cola") %>%
  hypothesize(null = "independence") %>%
  generate(reps = 5, type = "permute") %>%
  calculate(stat = "diff in props", order = c("west", "east"))

# A tibble: 5 x 2
  replicate        stat
      <int>       <dbl>
1         1  0.04298643
2         2 -0.09276018
3         3  0.11085973
4         4  0.17873303
5         5 -0.16063348

table(soda)

        location
drink    East West
  cola     28   19
  orange    6    7

# Proportion choosing cola in each location
soda %>%
  group_by(location) %>%
  summarize(mean(drink == "cola"))

# A tibble: 2 × 2
  location `mean(drink == "cola")`
    <fctr>                   <dbl>
1     East               0.8235294
2     West               0.7307692
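
# Observed difference in proportions, stored as a single number
diff_orig <- soda %>%
  group_by(location) %>%
  summarize(prop_cola = mean(drink == "cola")) %>%
  summarize(diff(prop_cola)) %>%
  pull()

# 100 permuted differences under the null hypothesis of independence
soda_perm <- soda %>%
  specify(drink ~ location, success = "cola") %>%
  hypothesize(null = "independence") %>%
  generate(reps = 100, type = "permute") %>%
  calculate(stat = "diff in props", order = c("west", "east"))

# Proportion of permuted differences at or below the observed difference
soda_perm %>%
  summarize(proportion = mean(diff_orig >= stat))

# A tibble: 1 x 1
  proportion
       <dbl>
1      0.380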
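For readers without the infer package, the same idea can be sketched in base R. This is an illustration under stated assumptions, not the course's implementation: `soda_sketch` is rebuilt from the table counts, `diff_in_props()` is a hypothetical helper, and the seed is arbitrary, so the resulting proportion will generally differ from the 0.380 shown above.

```r
# Rebuild the individual observations from the original table counts
# (hypothetical object; the course uses its own `soda` data frame).
soda_sketch <- data.frame(
  location = rep(c("East", "West"), times = c(34, 26)),
  drink = c(rep(c("cola", "orange"), times = c(28, 6)),
            rep(c("cola", "orange"), times = c(19, 7)))
)

# Hypothetical helper: proportion of cola in West minus proportion in East.
diff_in_props <- function(drink, location) {
  mean(drink[location == "West"] == "cola") -
    mean(drink[location == "East"] == "cola")
}

# Observed statistic on the unshuffled data.
diff_obs <- diff_in_props(soda_sketch$drink, soda_sketch$location)

set.seed(1)  # arbitrary seed, used only for reproducibility

# 100 permuted statistics: shuffle the drink labels, keep locations fixed.
perm_stats <- replicate(
  100,
  diff_in_props(sample(soda_sketch$drink), soda_sketch$location)
)

# Proportion of permuted differences at or below the observed difference,
# mirroring mean(diff_orig >= stat).
p_value <- mean(perm_stats <= diff_obs)
print(p_value)
```

With only 100 shuffles the estimate is coarse; increasing the number of replicates makes the proportion more stable from run to run.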
We fail to reject the null hypothesis: There is no evidence that our data are inconsistent with the null hypothesis
Representative sample of US population
Conclusions from sample may apply to population
Nothing to report in this case