CME/STATS 195 Lecture 7: Hypothesis Testing and Classification
Evan Rosenman
April 23, 2019

Contents
- Hypothesis testing
- Logistic Regression
- Random Forest

Hypothesis testing

Hypothesis testing answers explicit questions
- Is the measured quantity equal to, higher than, or lower than a given threshold? E.g., is the number of faulty items in an order statistically higher than the one guaranteed by a manufacturer?
- Is there a difference between two groups or observations? E.g., do treated patients have a higher survival rate than untreated ones?
- Is the level of one quantity related to the value of another quantity? E.g., is lung cancer associated with smoking?

To perform a hypothesis test you need to:
1. Define the null and alternative hypotheses.
2. Choose the level of significance $\alpha$.
3. Pick and compute the test statistic.
4. Compute the p-value.
5. Check whether to reject the null hypothesis by comparing the p-value to $\alpha$.
6. Draw a conclusion from the test.
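The six steps above can be sketched in R as follows. This is only an illustration: the built-in `mtcars` data, the null value of 20 mpg, and the choice $\alpha = 0.05$ are assumptions made for the example and are not taken from the lecture.

```r
# Step 1: H0: mean mpg = 20 versus H1: mean mpg != 20 (illustrative choice).
x <- mtcars$mpg
mu0 <- 20

# Step 2: choose the significance level.
alpha <- 0.05

# Steps 3 and 4: compute the test statistic and the p-value.
tt <- t.test(x, mu = mu0, alternative = "two.sided", conf.level = 1 - alpha)
t_stat <- unname(tt$statistic)
p_value <- tt$p.value

# Step 5: compare the p-value to alpha.
reject <- p_value < alpha

# Step 6: state the conclusion.
if (reject) {
  cat("Reject H0 at level", alpha, "\n")
} else {
  cat("Fail to reject H0 at level", alpha, "\n")
}
```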

Null and alternative hypotheses
The null hypothesis ($H_0$): A statement assumed to be true unless it can be shown to be incorrect beyond a reasonable doubt. This is something one usually attempts to disprove or discredit.
The alternative hypothesis ($H_1$): A claim that is contradictory to $H_0$ and what we conclude when we reject $H_0$.
$H_0$ and $H_1$ are set up to be contradictory, so that one can collect and examine data to decide whether there is enough evidence to reject the null hypothesis.

Student’s t-test
- Originated with William Gosset (1908), a chemist at the Guinness brewery.
- Published in Biometrika under the pseudonym "Student".
- Used to select the best-yielding varieties of barley.
- Now one of the standard methods for hypothesis testing.
Typical applications include:
- Comparing a population mean to a constant value
- Comparing the means of two populations
- Comparing the slope of a regression line to a constant
In general, it is used when the test statistic would follow a normal distribution if the standard deviation of the test statistic were known.

Distribution of the t-statistic
If $X_i \sim \mathcal{N}(\mu, \sigma^2)$, the empirical estimates for mean and variance are:
$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$ and $s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2$
The t-statistic is:
$T = \frac{\bar{X} - \mu}{s / \sqrt{n}} \sim t_{\nu = n - 1}$
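As a quick check of these formulas, the following sketch computes the sample mean, the sample variance, and the t-statistic by hand and compares them with what `t.test` reports. The simulated sample, its size, and the seed are assumptions chosen only for illustration.

```r
set.seed(195)                       # arbitrary seed, for reproducibility only
x <- rnorm(50, mean = 1, sd = 2)    # simulated sample (illustrative values)
n <- length(x)
mu0 <- 1                            # value of the mean under the null hypothesis

# Empirical estimates of the mean and the variance.
x_bar <- sum(x) / n
s2 <- sum((x - x_bar)^2) / (n - 1)

# t-statistic with nu = n - 1 degrees of freedom.
t_manual <- (x_bar - mu0) / sqrt(s2 / n)
p_manual <- 2 * pt(-abs(t_manual), df = n - 1)

# Compare with the built-in test.
tt <- t.test(x, mu = mu0)
all.equal(unname(tt$statistic), t_manual)
all.equal(unname(tt$parameter), n - 1)
all.equal(tt$p.value, p_manual)
```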

p-value
The p-value is the probability of obtaining an event the same as or “more extreme” than the one observed, assuming the null hypothesis is true. It is emphatically not the probability that the null hypothesis is true!
- A small p-value, typically < 0.05, indicates strong evidence against the null hypothesis; in this case you can reject the null hypothesis.
- A large p-value, > 0.05, indicates weak evidence against the null hypothesis; in this case you fail to reject it.
Note: 0.05 is a completely arbitrary cutoff that is nonetheless in common use.

p-value = P[observations ∣ hypothesis] ≠ P[hypothesis ∣ observations]
p-values should NOT be used as a “ranking”/“scoring” system for your hypotheses.
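One way to see what a p-value does and does not measure is to simulate many datasets for which the null hypothesis is true by construction. In the sketch below (the sample size, the number of simulations, and the seed are arbitrary assumptions), roughly 5% of the tests should still produce p < 0.05, even though the null hypothesis holds every time.

```r
set.seed(1)      # arbitrary seed
n_sim <- 2000    # number of simulated datasets (illustrative)

# Each dataset is drawn from N(0, 1), so H0: mu = 0 is true in every replicate.
p_values <- replicate(n_sim, t.test(rnorm(30), mu = 0)$p.value)

# Fraction of replicates in which a true null hypothesis is rejected at the 0.05 level.
false_positive_rate <- mean(p_values < 0.05)
false_positive_rate
```

A small p-value therefore describes how unusual the data are under the null hypothesis; it is not the probability that the null hypothesis is true.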

Two-sided test of the mean
Is the mean flight arrival delay statistically equal to 0? Test the null hypothesis:
$H_0: \mu = \mu_0 = 0$
$H_1: \mu \neq \mu_0 = 0$
where $\mu$ is the average arrival delay.

    library(tidyverse)
    library(nycflights13)
    mean(flights$arr_delay, na.rm = TRUE)
    ## [1] 6.895377
Is this statistically different from 0?
    (tt = t.test(x = flights$arr_delay, mu = 0, alternative = "two.sided"))
    ##
    ##  One Sample t-test
    ##
    ## data:  flights$arr_delay
    ## t = 88.39, df = 327340, p-value < 2.2e-16
    ## alternative hypothesis: true mean is not equal to 0
    ## 95 percent confidence interval:
    ##  6.742478 7.048276
    ## sample estimates:
    ## mean of x
    ##  6.895377

Is the mean arrival delay statistically different from 7?
    (tt = t.test(x = flights$arr_delay, mu = 7, alternative = "two.sided"))
    ##
    ##  One Sample t-test
    ##
    ## data:  flights$arr_delay
    ## t = -1.3411, df = 327340, p-value = 0.1799
    ## alternative hypothesis: true mean is not equal to 7
    ## 95 percent confidence interval:
    ##  6.742478 7.048276
    ## sample estimates:
    ## mean of x
    ##  6.895377

The function t.test returns an object containing the following components:
    names(tt)
    ## [1] "statistic"   "parameter"   "p.value"     "conf.int"    "estimate"
    ## [6] "null.value"  "alternative" "method"      "data.name"
    # The p-value:
    tt$p.value
    ## [1] 2.80067e-130
    # The 95% confidence interval for the mean:
    tt$conf.int
    ## [1] 6.742478 7.048276
    ## attr(,"conf.level")
    ## [1] 0.95

One-sided test of the mean
A one-sided test can be more powerful, but the interpretation is more difficult. Test the null hypothesis:
$H_0: \mu = \mu_0 = 0$
$H_1: \mu < \mu_0 = 0$
    t.test(x, mu = 0, alternative = "less")
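The relationship between the one-sided and two-sided tests can be checked numerically. The sketch below uses a simulated sample (its mean, size, and seed are assumptions for illustration) and relies only on the `alternative` argument of `t.test`.

```r
set.seed(7)                           # arbitrary seed
x <- rnorm(40, mean = -0.3, sd = 1)   # simulated sample (illustrative values)

two_sided <- t.test(x, mu = 0, alternative = "two.sided")
less      <- t.test(x, mu = 0, alternative = "less")
greater   <- t.test(x, mu = 0, alternative = "greater")

# All three tests share the same t-statistic; only the p-value changes.
c(two_sided$statistic, less$statistic, greater$statistic)

# The two one-sided p-values sum to 1, and the two-sided p-value
# is twice the smaller of the two.
less$p.value + greater$p.value
all.equal(two_sided$p.value, 2 * min(less$p.value, greater$p.value))
```

When the observed effect points in the direction of the alternative, the one-sided p-value is half the two-sided one, which is where the extra power comes from. The direction has to be chosen before looking at the data for this to be valid.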

Testing difference between groups
This test allows you to compare the means between two groups $a$ and $b$. Test the null hypothesis:
$H_0: \mu_a = \mu_b$
$H_1: \mu_a \neq \mu_b$

Testing differences in mean carat by diamond cut
    ggplot(diamonds %>% filter(cut %in% c("Ideal", "Very Good"))) +
      geom_boxplot(aes(x = cut, y = carat))

Testing differences in mean carat by diamond cut
    ideal.diamonds.carat <- diamonds$carat[diamonds$cut == "Ideal"]
    vg.diamonds.carat <- diamonds$carat[diamonds$cut == "Very Good"]
    t.test(ideal.diamonds.carat, vg.diamonds.carat)
    ##
    ##  Welch Two Sample t-test
    ##
    ## data:  ideal.diamonds.carat and vg.diamonds.carat
    ## t = -20.242, df = 23794, p-value < 2.2e-16
    ## alternative hypothesis: true difference in means is not equal to 0
    ## 95 percent confidence interval:
    ##  -0.11357056 -0.09351824
    ## sample estimates:
    ## mean of x mean of y
    ## 0.7028370 0.8063814
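The output above is labeled "Welch Two Sample t-test" because `t.test` does not assume equal variances in the two groups by default (`var.equal = FALSE`). As a sketch of what that test computes, the statistic and its approximate degrees of freedom can be reproduced by hand. The example below uses the built-in `mtcars` data split by transmission type, an assumption made so that it runs without extra packages; it is not the diamonds comparison from the slide.

```r
# Two groups: mpg of automatic (am == 0) and manual (am == 1) cars.
a <- mtcars$mpg[mtcars$am == 0]
b <- mtcars$mpg[mtcars$am == 1]

# Squared standard errors of the two group means.
se2_a <- var(a) / length(a)
se2_b <- var(b) / length(b)

# Welch t-statistic and Welch-Satterthwaite degrees of freedom.
t_welch <- (mean(a) - mean(b)) / sqrt(se2_a + se2_b)
df_welch <- (se2_a + se2_b)^2 /
  (se2_a^2 / (length(a) - 1) + se2_b^2 / (length(b) - 1))
p_welch <- 2 * pt(-abs(t_welch), df = df_welch)

# Compare with the built-in test.
tt <- t.test(a, b)
all.equal(unname(tt$statistic), t_welch)
all.equal(unname(tt$parameter), df_welch)
all.equal(tt$p.value, p_welch)
```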

Exercise
Similarly to the dataset mtcars, the dataset mpg from the ggplot2 package includes data on automobiles. However, mpg includes data for newer cars, from the years 1999 and 2008. The variables measured for each car are slightly different. Here we are interested in the variable hwy, the highway miles per gallon.
    # We first format the column trans to contain only info on transmission (auto/manual)
    mpg <- mpg %>%
      mutate(transmission = factor(gsub("\\((.*)", "", trans), levels = c("auto", "manual")))
    mpg

Exercise 1
1. Subset the mpg dataset to include only cars from year 2008.
2. Test whether cars from 2008 have mean highway miles per gallon, hwy, equal to 30 mpg.
3. Test whether cars from 2008 with 4 cylinders have mean hwy equal to 30 mpg.

Logistic Regression

What is classification?
Classification is a supervised method that deals with predicting outcomes or response variables that are qualitative, or categorical. The task is to classify or assign each observation to a category or a class.
Examples of classification problems include:
- predicting what medical condition or disease a patient has based on their symptoms,
- determining cell types based on their gene expression profiles (single cell RNA-seq data),
- detecting fraudulent transactions based on the transaction history.

Logistic Regression
Logistic regression is actually used for classification, and not regression tasks: $Y \in \{0, 1\}$.
The name regression comes from the fact that the method fits a linear function to a continuous quantity, the log odds of the response:
$p = P[Y = 1 \mid X = x]$
$\log\left(\frac{p}{1 - p}\right) = \beta^T x$
The method performs binary classification ($k = 2$), but can be generalized to handle $k > 2$ classes (multinomial logistic regression).
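A minimal sketch of this model in R, assuming simulated data: the coefficients `beta0` and `beta1`, the sample size, and the 0.5 classification cutoff are all hypothetical choices made for the illustration. The fit uses `glm` with `family = binomial`, whose default link is the logit.

```r
set.seed(42)                     # arbitrary seed
n <- 5000                        # illustrative sample size
x <- rnorm(n)

# Hypothetical "true" coefficients used only to simulate the data.
beta0 <- -0.5
beta1 <- 1.5

# The log odds are linear in x; plogis() maps them to probabilities.
p <- plogis(beta0 + beta1 * x)
y <- rbinom(n, size = 1, prob = p)

# Fit a logistic regression; the estimates should be close to beta0 and beta1.
fit <- glm(y ~ x, family = binomial)
coef(fit)

# Predicted probabilities, turned into class labels with a 0.5 cutoff (an assumption).
p_hat <- predict(fit, type = "response")
y_hat <- as.integer(p_hat > 0.5)
mean(y_hat == y)                 # training accuracy
```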

$g(p) = \log\left(\frac{p}{1 - p}\right)$ (logit link function)
$g^{-1}(\eta) = \frac{1}{1 + e^{-\eta}}$ (logistic function)
$\eta = \beta^T x$ (linear predictor)
$E[Y] = P[Y = 1 \mid X = x]$ (probability of outcome)
$= p = g^{-1}(\eta) = \frac{1}{1 + e^{-\beta^T x}}$
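The link function and its inverse can be written directly in R and checked against the base functions `qlogis` (logit) and `plogis` (logistic). The helper names `logit` and `logistic` below are defined only for this sketch.

```r
# Logit link g(p) and its inverse, the logistic function.
logit <- function(p) log(p / (1 - p))
logistic <- function(eta) 1 / (1 + exp(-eta))

p <- c(0.1, 0.25, 0.5, 0.9)
eta <- logit(p)

# Applying the inverse link recovers the original probabilities.
all.equal(logistic(eta), p)

# Both helpers agree with the base R implementations.
all.equal(logit(p), qlogis(p))
all.equal(logistic(eta), plogis(eta))

# A linear predictor of 0 corresponds to a probability of 0.5.
logistic(0)
```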

The logistic function

Grad School Admissions
Suppose we would like to predict students’ admission to graduate school based on GRE, GPA, and undergrad institution rank.
    admissions <- read_csv("https://stats.idre.ucla.edu/stat/data/binary.csv")
    ## Parsed with column specification:
    ## cols(
    ##   admit = col_integer(),
    ##   gre = col_integer(),
    ##   gpa = col_double(),
    ##   rank = col_integer()
    ## )
    admissions
    ## # A tibble: 400 x 4
    ##    admit   gre   gpa  rank
    ##    <int> <int> <dbl> <int>
    ##  1     0   380  3.61     3
    ##  2     1   660  3.67     3
    ##  3     1   800  4        1
    ##  4     1   640  3.19     4
    ##  5     0   520  2.93     4
    ##  6     1   760  3        2
    ##  7     1   560  2.98     1
    ##  8     0   400  3.08     2
    ##  9     1   540  3.39     3
    ## 10     0   700  3.92     2
    ## # ... with 390 more rows
