Welcome to the course! Mine Cetinkaya-Rundel

  1. DataCamp: Inference for Numerical Data in R. Welcome to the course! Mine Cetinkaya-Rundel, Associate Professor of the Practice, Duke University

  2. Rent in Manhattan: On a given day, twenty 1 BR apartments were randomly selected on Craigslist Manhattan from apartments listed as "by owner" (as opposed to by a rental agency). Is the mean or the median a better measure of typical rent in Manhattan?
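One way to explore the mean-versus-median question is to look at how each statistic reacts to a single very expensive listing. The sketch below uses made-up rents (not the Craigslist sample from the course), so the numbers only illustrate the behaviour.

```r
# Hypothetical rents in USD, invented for illustration; NOT the course data.
rents <- c(1800, 2000, 2100, 2200, 2400, 2600, 2900, 9000)

mean_rent   <- mean(rents)    # pulled upward by the single 9000 listing
median_rent <- median(rents)  # middle of the sorted values, barely affected

print(c(mean = mean_rent, median = median_rent))
```

With right-skewed data like this, the median stays close to where most listings sit, which is one reason it is often preferred as a measure of typical rent.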

  3. Bootstrapping techniques: Assume the data is representative. "Pulling oneself up by one's bootstraps."

  4. Observed sample: sample median = $2,350

  5. Bootstrap population

  6. Bootstrapping scheme
       1. Take a bootstrap sample: a random sample taken with replacement from the original sample, of the same size as the original sample.
       2. Calculate the bootstrap statistic: a statistic such as the mean, median, or proportion, computed on the bootstrap sample.
       3. Repeat steps (1) and (2) many times to create a bootstrap distribution: a distribution of bootstrap statistics.
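The three steps above can be sketched in base R. This is a minimal illustration, assuming a simulated sample of 20 rents and 1,000 repetitions; the data, the seed, and the names `rents`, `n_reps`, and `boot_medians` are all hypothetical and not taken from the course.

```r
set.seed(123)  # arbitrary seed so the sketch is reproducible

# Hypothetical sample of 20 rents (USD), simulated for illustration only.
rents <- round(rlnorm(20, meanlog = log(2400), sdlog = 0.4))

n_reps <- 1000  # "many times"; the exact count is an assumption

boot_medians <- replicate(n_reps, {
  # Step 1: resample with replacement, same size as the original sample
  boot_sample <- sample(rents, size = length(rents), replace = TRUE)
  # Step 2: compute the bootstrap statistic on that resample
  median(boot_sample)
})

# Step 3: the collected statistics form the bootstrap distribution
summary(boot_medians)
```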

  7. Bootstrapping scheme, in R

         library(infer)

         ___ %>%                                        # start with data frame
           specify(response = ___) %>%                  # specify the variable of interest

  8. Bootstrapping scheme, in R

         library(infer)

         ___ %>%                                        # start with data frame
           specify(response = ___) %>%                  # specify the variable of interest
           generate(reps = ___, type = "bootstrap") %>% # generate bootstrap samples

  9. Bootstrapping scheme, in R

         library(infer)

         ___ %>%                                        # start with data frame
           specify(response = ___) %>%                  # specify the variable of interest
           generate(reps = ___, type = "bootstrap") %>% # generate bootstrap samples
           calculate(stat = "___")                      # calculate bootstrap statistic
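For a concrete picture of the template, here is one possible way to fill in the blanks. The data frame `rent_df`, its column `rent`, the choice of 1,000 repetitions, and the median as the statistic are assumptions made for illustration; the slides leave these blank.

```r
library(infer)

set.seed(123)  # arbitrary seed for reproducibility

# Hypothetical data frame of 20 simulated rents (USD); not the course data.
rent_df <- data.frame(
  rent = round(rlnorm(20, meanlog = log(2400), sdlog = 0.4))
)

boot_dist <- rent_df %>%                        # start with data frame
  specify(response = rent) %>%                  # specify the variable of interest
  generate(reps = 1000, type = "bootstrap") %>% # generate bootstrap samples
  calculate(stat = "median")                    # calculate bootstrap statistic

head(boot_dist)  # one row per bootstrap replicate, statistic in column `stat`
```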

  10. Constructing the bootstrap interval

         library(infer)

         ___ %>%                                        # start with data frame
           specify(response = ___) %>%                  # specify the variable of interest
           generate(reps = ___, type = "bootstrap") %>% # generate bootstrap samples
           calculate(stat = "___")                      # calculate bootstrap statistic

  11. Constructing the bootstrap interval

         library(infer)

         ___ %>%                                        # start with data frame
           specify(response = ___) %>%                  # specify the variable of interest
           generate(reps = ___, type = "bootstrap") %>% # generate bootstrap samples
           calculate(stat = "___")                      # calculate bootstrap statistic

  12. Let's practice!

  13. Review: Percentile and standard error methods. Mine Cetinkaya-Rundel, Associate Professor of the Practice, Duke University

  14. Bootstrap distribution

  15. Percentile method

  16. Percentile method
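The percentile slides carry only plots in this transcript. As the method is commonly defined, a percentile interval takes the middle portion of the bootstrap distribution, for example the 2.5th and 97.5th percentiles for a 95% interval. The sketch below assumes simulated rents and a 95% confidence level; the data and all variable names are hypothetical.

```r
set.seed(123)  # arbitrary seed for reproducibility

# Hypothetical sample of 20 rents (USD), simulated for illustration only.
rents <- round(rlnorm(20, meanlog = log(2400), sdlog = 0.4))

# Bootstrap distribution of the median (1,000 resamples with replacement)
boot_medians <- replicate(1000, median(sample(rents, replace = TRUE)))

conf_level <- 0.95            # assumed confidence level
alpha <- 1 - conf_level

# Cut off alpha/2 in each tail of the bootstrap distribution
percentile_ci <- quantile(boot_medians, probs = c(alpha / 2, 1 - alpha / 2))
percentile_ci
```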

  17. Standard error method: $\text{sample statistic} \pm t^{*}_{df = n - 1} \times SE_{boot}$, where the df for $t^{*}$ is $n - 1$, $n$ is the sample size, and $SE_{boot}$ is the standard deviation of the bootstrap distribution.
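The formula on this slide can be worked through in base R. This is a sketch under stated assumptions: the rents are simulated, the statistic is the median, and the confidence level is 95%, none of which comes from the slide itself.

```r
set.seed(123)  # arbitrary seed for reproducibility

# Hypothetical sample of 20 rents (USD), simulated for illustration only.
rents <- round(rlnorm(20, meanlog = log(2400), sdlog = 0.4))
n <- length(rents)

# Bootstrap distribution of the median
boot_medians <- replicate(1000, median(sample(rents, replace = TRUE)))

obs_median <- median(rents)         # sample statistic
se_boot    <- sd(boot_medians)      # SE_boot: sd of the bootstrap distribution
t_star     <- qt(0.975, df = n - 1) # critical value for 95% confidence, df = n - 1

# sample statistic +/- t* x SE_boot
se_ci <- obs_median + c(-1, 1) * t_star * se_boot
se_ci
```

Unlike the percentile interval, this interval is symmetric around the observed statistic by construction.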

  18. Let's practice!

  19. Re-centering a bootstrap distribution for hypothesis testing. Mine Cetinkaya-Rundel, Associate Professor of the Practice, Duke University

  20. Re-centering a bootstrap distribution for hypothesis testing: Bootstrap distributions are by design centered at the observed sample statistic. However, since in a hypothesis test we assume that $H_0$ is true, we shift the bootstrap distribution to be centered at the null value. p-value = the proportion of simulations that yield a sample statistic at least as favorable to the alternative hypothesis as the observed sample statistic.
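The re-centering idea can be sketched in base R. Everything specific here is an assumption for illustration: the simulated rents, the null value of 2000, the one-sided alternative (median greater than the null value), and the choice to shift by the difference between the null value and the observed statistic.

```r
set.seed(123)  # arbitrary seed for reproducibility

# Hypothetical sample of 20 rents (USD), simulated for illustration only.
rents <- round(rlnorm(20, meanlog = log(2400), sdlog = 0.4))
obs_median <- median(rents)

# Bootstrap distribution, centered near the observed sample median
boot_medians <- replicate(1000, median(sample(rents, replace = TRUE)))

null_value <- 2000  # hypothetical H0: median rent = 2000

# Shift every bootstrap statistic so the distribution sits at the null value
shift     <- null_value - obs_median
null_dist <- boot_medians + shift

# HA (assumed): median rent > null_value, so "at least as favorable to HA"
# means at least as large as the observed statistic
p_value <- mean(null_dist >= obs_median)
p_value
```

For a two-sided alternative, the comparison would instead count simulated statistics at least as far from the null value as the observed one, in either direction.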

  21. Re-centering the bootstrap distribution - sketch

  22. Let's practice!
