Type I errors
EXPERIMENTAL DESIGN IN PYTHON
Luke Hayden
Instructor
Ways of being wrong

When we run a test:

                                                  Real effect present   No real effect present
Effect found (positive: alternative hypothesis)   True Positive         False Positive
No effect found (negative: null hypothesis)       False Negative        True Negative

Type I error: find a difference where none exists
Type II error: fail to find a difference that does exist
Basis of tests
Statistical tests are probabilistic: they quantify the likelihood of the observed results under the null hypothesis.

Consider: significant results are improbable, not impossible, under the null hypothesis. Results can still occur by chance.
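This can be illustrated by simulation: even when no real effect exists, a test at alpha = 0.05 flags roughly 5% of comparisons as significant. A minimal sketch (sample sizes and the simple z-test with known sigma = 1 are illustrative choices, not from the slides):

```python
import math
import random

random.seed(42)

def z_test_p(sample1, sample2):
    """Two-sided z-test p-value for two samples, assuming known sigma = 1."""
    n1, n2 = len(sample1), len(sample2)
    z = (sum(sample1) / n1 - sum(sample2) / n2) / math.sqrt(1 / n1 + 1 / n2)
    # two-sided p-value from the standard normal CDF
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

# Both groups are drawn from the SAME distribution: the null hypothesis is true
trials = 1000
false_positives = 0
for _ in range(trials):
    a = [random.gauss(0, 1) for _ in range(50)]
    b = [random.gauss(0, 1) for _ in range(50)]
    if z_test_p(a, b) < 0.05:
        false_positives += 1

# Roughly 5% of tests come out "significant" purely by chance
print(false_positives / trials)
```

Each "significant" result here is a type I error, since no real effect is present.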
By design
Avoid "p-value fishing"

By correction
Correct p-values for the presence of multiple tests

Correction methods
Bonferroni and Šídák
Choose the method based on the independence of the tests
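The two corrected thresholds can be computed by hand: for m tests at a family-wise alpha of 0.05, Bonferroni uses alpha/m, while Šídák uses 1 - (1 - alpha)^(1/m). A minimal sketch for m = 3, matching the corrected alpha values that statsmodels' multipletests reports:

```python
alpha = 0.05
m = 3  # number of tests

# Bonferroni: divide alpha by the number of tests (more conservative)
bonferroni_alpha = alpha / m

# Šídák: exact family-wise alpha when the tests are independent
sidak_alpha = 1 - (1 - alpha) ** (1 / m)

print(bonferroni_alpha)  # 0.016666666666666666
print(sidak_alpha)       # 0.016952427508441503
```

These are the last two values in the tuple that multipletests returns.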
Bonferroni correction
Conservative method
Simple
Use when tests are not independent from each other
from scipy import stats
from statsmodels.stats.multitest import multipletests

# Three pairwise t-tests on overlapping samples (not independent)
t_1 = stats.ttest_ind(Array1, Array2)
t_2 = stats.ttest_ind(Array2, Array3)
t_3 = stats.ttest_ind(Array1, Array3)
pvals_array = [t_1[1], t_2[1], t_3[1]]

# method='b' applies the Bonferroni correction
adjustedvalues = multipletests(pvals_array, alpha=0.05, method='b')
Multiple non-independent t-tests
from scipy import stats
from statsmodels.stats.multitest import multipletests

t_result_1 = stats.ttest_ind(HighJumpVals, LongJumpVals)
t_result_2 = stats.ttest_ind(LongJumpVals, ShotPutVals)
t_result_3 = stats.ttest_ind(HighJumpVals, HighJumpVals)
pvals_array = [t_result_1[1], t_result_2[1], t_result_3[1]]
adjustedvalues = multipletests(pvals_array, alpha=0.05, method='b')
print(adjustedvalues)

(array([ True,  True, False]), array([6.72030836e-63, 3.46967459e-97, 1.00000000e+00]), 0.016952427508441503, 0.016666666666666666)
Šídák correction
Less conservative method
Use when tests are independent from each other
from scipy import stats
from statsmodels.stats.multitest import multipletests

# Three t-tests on disjoint samples (independent)
t_1 = stats.ttest_ind(Array1, Array2)
t_2 = stats.ttest_ind(Array3, Array4)
t_3 = stats.ttest_ind(Array5, Array6)
pvals_array = [t_1[1], t_2[1], t_3[1]]

# method='s' applies the Šídák correction
adjustedvalues = multipletests(pvals_array, alpha=0.05, method='s')
from scipy import stats
from statsmodels.stats.multitest import multipletests

t_result_1 = stats.ttest_ind(HighJumpVals, LongJumpVals)
t_result_2 = stats.ttest_ind(ShotPutVals, HammerVals)
t_result_3 = stats.ttest_ind(MarathonVals, PoleVals)
pvals_array = [t_result_1[1], t_result_2[1], t_result_3[1]]
adjustedvalues = multipletests(pvals_array, alpha=0.05, method='s')
print(adjustedvalues)

(array([ True,  True,  True]), array([0., 0., 0.]), 0.016952427508441503, 0.016666666666666666)
Definition
False negative: failing to detect an effect that exists

Caveat
We can never be sure that no effect is present

Sample size
Helps avoid false negatives
Larger sample size = more sensitive methods
Alpha
Critical value of p at which to reject the null hypothesis

Power
Probability that we correctly reject the null hypothesis when the alternative hypothesis is true

Effect size
Departure from the null hypothesis
Increasing sample size:
Increases statistical power
Allows a smaller usable alpha
Makes smaller effect sizes detectable

What sample size do we need with effect_size = x, power = y, alpha = z?

Functions
For the t-test: TTestIndPower()
Other functions exist for other tests
Initialize analysis
TTestIndPower() for ttest_ind()

Values
effect_size: standardized effect size
power: 0 - 1
alpha: 0.05 is standard
ratio: 1 if the experiment is balanced
nobs1: set to None
from statsmodels.stats import power as pwr

analysis = pwr.TTestIndPower()
ssresult = analysis.solve_power(
    effect_size=effect_size, power=power, alpha=alpha,
    ratio=1.0, nobs1=None)
print(ssresult)
Assumptions
effect_size: 0.8 (large)
power: 0.8 (80% chance of detection)
alpha: 0.05 (standard)
ratio: (group 2 samples / group 1 samples)

effect_size = 0.8
power = 0.8
alpha = 0.05
ratio = float(len(df[df.Fertilizer == "B"])) / len(df[df.Fertilizer == "A"])
from statsmodels.stats import power as pwr

analysis = pwr.TTestIndPower()
ssresult = analysis.solve_power(
    effect_size=effect_size, power=power, alpha=alpha,
    ratio=ratio, nobs1=None)
print(ssresult)

25.5245725005
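Since samples come in whole units, the fractional result from solve_power is typically rounded up. A minimal sketch, using the value printed above:

```python
import math

# Sample size returned by solve_power (value from the output above)
ssresult = 25.5245725005

# Round up: we need at least 26 samples in group 1
n_per_group = math.ceil(ssresult)
print(n_per_group)  # 26
```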
Significance
How sure we are that the effect exists
"We are X% confident that fertilizer A is better than fertilizer B"

Effect size
How much difference that effect makes
"Yields with fertilizer A are Y higher than yields with fertilizer B"
Cohen's d
For continuous variables in relation to discrete variables
Normalized difference between the means of two samples

Odds ratio
For discrete variables
How much one event is associated with another

Correlation coefficients
For continuous variables
Measure correlation
Cohen's d = (M2 - M1) / SDpooled
import math as ma

sampleA = df[df.Fertilizer == "A"].Production
sampleB = df[df.Fertilizer == "B"].Production
diff = abs(sampleA.mean() - sampleB.mean())
pooledstdev = ma.sqrt((sampleA.std()**2 + sampleB.std()**2) / 2)
cohend = diff / pooledstdev
print(cohend)

4.05052530265279
Assumptions
effect_size: None
power: 0.8 (80% chance of detection)
alpha: 0.05 (standard)
ratio: 1 (equal sample size per group)
nobs1: 100

from statsmodels.stats import power as pwr

analysis = pwr.TTestIndPower()
esresult = analysis.solve_power(
    effect_size=None, power=power, alpha=alpha,
    ratio=ratio, nobs1=nobs1)
print(esresult)

0.398139117391
Metric: Odds ratio
import pandas as pd

table = pd.crosstab(df.Coin, df.Flip)
print(table)

Flip  heads  tails
Coin
1        22      8
2        17     13

from scipy import stats

chi = stats.fisher_exact(table, alternative='two-sided')
print(round(chi[0], 1))

2.1
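The statistic reported by fisher_exact is the sample odds ratio, which can be checked by hand from the crosstab's cross-product. A minimal sketch using the counts shown above:

```python
# Counts from the crosstab: coin 1 -> 22 heads, 8 tails; coin 2 -> 17 heads, 13 tails
heads1, tails1 = 22, 8
heads2, tails2 = 17, 13

# Odds ratio = odds of heads for coin 1 divided by odds of heads for coin 2
odds_ratio = (heads1 * tails2) / (tails1 * heads2)
print(round(odds_ratio, 1))  # 2.1
```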
Example
Metric: Pearson correlation coefficient (r)
Perfect correlation at r = 1

from scipy import stats

pearson = stats.pearsonr(df.Weight, df.Height)
print(pearson[0])

0.7922545330545416
Power
Probability of detecting an effect
Increasing power decreases the chance of type II errors

Relationship to other factors
Larger effect size increases power
Larger sample size increases power
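These relationships can be sketched with a rough normal approximation to the power of a two-sample t-test (this is an illustrative approximation, not the exact computation statsmodels performs):

```python
import math

def approx_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample t-test via the normal
    approximation: power ~ Phi(d * sqrt(n/2) - z_crit)."""
    z_crit = 1.959963984540054  # critical z for alpha = 0.05, two-sided
    z = effect_size * math.sqrt(n_per_group / 2) - z_crit
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))  # standard normal CDF

# Larger sample size -> greater power (effect size held fixed at 0.8)
print(approx_power(0.8, 10))
print(approx_power(0.8, 100))
```

With 100 samples per group and a large effect (d = 0.8), this approximation agrees closely with the statsmodels result shown below it (about 0.9999).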
Assumptions
effect_size: 0.8 (large)
power: None
alpha: 0.05 (standard)
ratio: 1 (balanced design)
nobs1: 100

from statsmodels.stats import power as pwr

analysis = pwr.TTestIndPower()
pwresult = analysis.solve_power(
    effect_size=effect_size, power=None, alpha=alpha,
    ratio=ratio, nobs1=nobs1)
print(pwresult)

0.9998783661018764
0.9998783661018764
Interpretation
Almost certain to detect an effect of this size
Hypothesis tests
Estimate likelihoods
Can't give absolute certainty

Power analysis
Estimates the strength of our answers
Interpreting tests
Interpret results in the context of power analyses, considering the possibility of type II errors:
Negative test result & high power: true negative
Negative test result & low power: possible false negative
Find balance
More power may bring more risk of type I errors

Domain knowledge
Make reasonable assumptions