SLIDE 1

Type I errors

EXPERIMENTAL DESIGN IN PYTHON

Luke Hayden

Instructor

SLIDE 2

Ways of being wrong

When we run a test:

                                            Real effect present    No real effect present
Effect found (positive: alternative hyp.)   True Positive          False Positive
No effect found (negative: null hyp.)       False Negative         True Negative

Type I error: find a difference where none exists
Type II error: fail to find a difference that does exist
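The top-right cell of the table can be made concrete with a short simulation (not from the slides; the sample sizes and seed are arbitrary choices). When both samples come from the same distribution, the null hypothesis is true, so every "significant" t-test result is a Type I error, and the false-positive rate should sit near alpha.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Both samples come from the SAME distribution, so the null hypothesis
# is true: any "significant" result is a false positive (Type I error).
n_tests = 2000
false_positives = 0
for _ in range(n_tests):
    a = rng.normal(0, 1, 50)
    b = rng.normal(0, 1, 50)
    if stats.ttest_ind(a, b)[1] < 0.05:
        false_positives += 1

print(false_positives / n_tests)  # should land near alpha = 0.05
```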

SLIDE 3

Avoiding type I errors

Basis of tests
Statistical tests are probabilistic: they quantify the likelihood of the observed results under the null hypothesis.
Consider: significant results are improbable under the null hypothesis, not impossible. A significant result can still arise by chance.

SLIDE 4

Picking a single result can be misleading

Example

SLIDE 5

Accounting for multiple tests

By design: avoid "p-value fishing"
By correction: adjust p-values for the presence of multiple tests
Correction methods: Bonferroni and Šídák; choose the method based on whether the tests are independent
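The two corrections imply different per-test significance thresholds; a minimal sketch (the family-wise alpha of 0.05 and m = 3 tests are illustrative assumptions). These are the same two values that statsmodels' multipletests reports alongside the adjusted p-values.

```python
# Per-test significance thresholds for m tests at family-wise alpha = 0.05
m = 3
alpha = 0.05

alpha_bonferroni = alpha / m              # more conservative
alpha_sidak = 1 - (1 - alpha) ** (1 / m)  # slightly less conservative

print(alpha_bonferroni)  # 0.016666...
print(alpha_sidak)       # 0.016952...
```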

SLIDE 6

Bonferroni correction

A conservative, simple method
Use when tests are not independent of each other

from scipy import stats
from statsmodels.stats.multitest import multipletests

t_1 = stats.ttest_ind(Array1, Array2)
t_2 = stats.ttest_ind(Array2, Array3)
t_3 = stats.ttest_ind(Array1, Array3)
pvals_array = [t_1[1], t_2[1], t_3[1]]
adjustedvalues = multipletests(pvals_array, alpha=0.05, method='b')
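A self-contained version of the pattern above, using synthetic data so it runs as-is (the group names, sizes, and seed are invented for illustration):

```python
import numpy as np
from scipy import stats
from statsmodels.stats.multitest import multipletests

rng = np.random.default_rng(0)
group_a = rng.normal(0.0, 1.0, 100)
group_b = rng.normal(1.0, 1.0, 100)  # mean shifted by one standard deviation
group_c = rng.normal(0.0, 1.0, 100)  # same distribution as group_a

pvals = [
    stats.ttest_ind(group_a, group_b)[1],
    stats.ttest_ind(group_b, group_c)[1],
    stats.ttest_ind(group_a, group_c)[1],
]

# method='b' (Bonferroni) multiplies each p-value by the number of tests
reject, adjusted, _, _ = multipletests(pvals, alpha=0.05, method='b')
print(reject)
print(adjusted)
```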

SLIDE 7

Bonferroni correction example

Multiple non-independent t-tests

SLIDE 8

from scipy import stats
from statsmodels.stats.multitest import multipletests

t_result_1 = stats.ttest_ind(HighJumpVals, LongJumpVals)
t_result_2 = stats.ttest_ind(LongJumpVals, ShotPutVals)
t_result_3 = stats.ttest_ind(HighJumpVals, HighJumpVals)
pvals_array = [t_result_1[1], t_result_2[1], t_result_3[1]]
adjustedvalues = multipletests(pvals_array, alpha=0.05, method='b')
print(adjustedvalues)

(array([ True,  True, False]), array([6.72030836e-63, 3.46967459e-97, 1.00000000e+00]), 0.016952427508441503, 0.016666666666666666)

SLIDE 9

Šídák correction

A less conservative method
Use when tests are independent of each other

from scipy import stats
from statsmodels.stats.multitest import multipletests

t_1 = stats.ttest_ind(Array1, Array2)
t_2 = stats.ttest_ind(Array3, Array4)
t_3 = stats.ttest_ind(Array5, Array6)
pvals_array = [t_1[1], t_2[1], t_3[1]]
adjustedvalues = multipletests(pvals_array, alpha=0.05, method='s')
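With concrete (made-up) p-values, the Šídák adjustment 1 - (1 - p)^m is easy to verify by hand:

```python
from statsmodels.stats.multitest import multipletests

# Three hypothetical p-values from independent tests
pvals = [0.01, 0.04, 0.30]

# method='s' (Sidak) adjusts each p-value to 1 - (1 - p)**m
reject, adjusted, _, _ = multipletests(pvals, alpha=0.05, method='s')
print(reject)    # only the first test survives correction
print(adjusted)  # [0.029701, 0.115264, 0.657]
```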

SLIDE 10

Šídák correction example

SLIDE 11

from scipy import stats
from statsmodels.stats.multitest import multipletests

t_result_1 = stats.ttest_ind(HighJumpVals, LongJumpVals)
t_result_2 = stats.ttest_ind(ShotPutVals, HammerVals)
t_result_3 = stats.ttest_ind(MarathonVals, PoleVals)
pvals_array = [t_result_1[1], t_result_2[1], t_result_3[1]]
adjustedvalues = multipletests(pvals_array, alpha=0.05, method='s')
print(adjustedvalues)

(array([ True,  True,  True]), array([0., 0., 0.]), 0.016952427508441503, 0.016666666666666666)

SLIDE 12

Let's practice!

SLIDE 13

Sample size


Luke Hayden

Instructor

SLIDE 14

Type II errors & sample size

Definition: a false negative, i.e. failing to detect an effect that exists
Caveat: we can never be sure that no effect is present
Sample size helps avoid false negatives: a larger sample size makes a method more sensitive

SLIDE 15

Importance of sample size

SLIDE 16

Other factors that affect sample size

Alpha: critical value of p at which we reject the null hypothesis
Power: probability that we correctly reject the null hypothesis when the alternative hypothesis is true
Effect size: size of the departure from the null hypothesis

SLIDE 17

Effects of other factors

Increasing sample size: increases statistical power, allows a lower alpha, and makes smaller effect sizes detectable
What sample size do we need with effect_size = x, power = y, alpha = z?
Functions: TTestIndPower() for the t-test; other functions exist for other tests

SLIDE 18

Calculating sample size needed for t-test

Initialize analysis

TTestIndPower() for ttest_ind()

Values

effect_size: standardized effect size
power: between 0 and 1
alpha: 0.05 (standard)
ratio: 1 if the experiment is balanced
nobs1: set to None (the unknown we solve for)

from statsmodels.stats import power as pwr

analysis = pwr.TTestIndPower()
ssresult = analysis.solve_power(effect_size=effect_size, power=power,
                                alpha=alpha, ratio=1.0, nobs1=None)
print(ssresult)
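Filled in with concrete (illustrative) values, the call solves for sample size: with a large effect (d = 0.8), 80% power, and alpha = 0.05 in a balanced design, it asks for roughly 25-26 observations per group.

```python
from statsmodels.stats import power as pwr

analysis = pwr.TTestIndPower()

# nobs1=None marks sample size as the unknown to solve for
ssresult = analysis.solve_power(effect_size=0.8, power=0.8, alpha=0.05,
                                ratio=1.0, nobs1=None)
print(ssresult)  # about 25.5 per group
```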

SLIDE 19

Sample size calculation example

SLIDE 20

Sample size calculation example

Assumptions

effect_size: 0.8 (large)
power: 0.8 (80% chance of detection)
alpha: 0.05 (standard)
ratio: group 2 samples / group 1 samples

effect_size = 0.8
power = 0.8
alpha = 0.05
ratio = float(len(df[df.Fertilizer == "B"])) / len(df[df.Fertilizer == "A"])

SLIDE 21

Sample size calculation example

from statsmodels.stats import power as pwr

analysis = pwr.TTestIndPower()
ssresult = analysis.solve_power(effect_size=effect_size, power=power,
                                alpha=alpha, ratio=ratio, nobs1=None)
print(ssresult)

25.5245725005

SLIDE 22

Let's practice!

SLIDE 23

Effect size


Luke Hayden

Instructor

SLIDE 24

Defining effect size

SLIDE 25

Effect size vs. significance

Significance: how sure we are that the effect exists ("we are X% confident that fertilizer A is better than fertilizer B")
Effect size: how much difference the effect makes ("yields with fertilizer A are Y higher than yields with fertilizer B")

SLIDE 26

Measures of effect size

Cohen's d: for continuous variables in relation to discrete variables; the normalized difference between the means of two samples
Odds ratio: for discrete variables; how strongly one event is associated with another
Correlation coefficients: for continuous variables; measure the strength of correlation

SLIDE 27

Effect sizes for t-tests

SLIDE 28

Calculating Cohen's d

Cohen's d = (M2 - M1) / SDpooled

import math as ma

sampleA = df[df.Fertilizer == "A"].Production
sampleB = df[df.Fertilizer == "B"].Production
diff = abs(sampleA.mean() - sampleB.mean())
pooledstdev = ma.sqrt((sampleA.std()**2 + sampleB.std()**2) / 2)
cohend = diff / pooledstdev
print(cohend)

4.05052530265279
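The same calculation as a small standalone function (the sample values are made up for illustration; they give means differing by 2 with a pooled standard deviation of sqrt(2.5)):

```python
import math

def cohens_d(sample_a, sample_b):
    """Difference between sample means divided by the pooled standard deviation."""
    mean_a = sum(sample_a) / len(sample_a)
    mean_b = sum(sample_b) / len(sample_b)
    # Sample variances (ddof=1), matching pandas' .std() in the slide code
    var_a = sum((x - mean_a) ** 2 for x in sample_a) / (len(sample_a) - 1)
    var_b = sum((x - mean_b) ** 2 for x in sample_b) / (len(sample_b) - 1)
    pooled_sd = math.sqrt((var_a + var_b) / 2)
    return abs(mean_a - mean_b) / pooled_sd

print(cohens_d([1, 2, 3, 4, 5], [3, 4, 5, 6, 7]))  # 2 / sqrt(2.5), about 1.265
```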

SLIDE 29

Calculating minimum detectable effect size

Assumptions

effect_size: None (the unknown we solve for)
power: 0.8 (80% chance of detection)
alpha: 0.05 (standard)
ratio: 1 (equal sample size per group)
nobs1: 100

from statsmodels.stats import power as pwr

analysis = pwr.TTestIndPower()
esresult = analysis.solve_power(effect_size=None, power=power,
                                alpha=alpha, ratio=ratio, nobs1=nobs1)
print(esresult)

0.398139117391

SLIDE 30

Effect size for Fisher exact test

Metric: Odds ratio

import pandas as pd

print(pd.crosstab(df.Coin, df.Flip))

Flip  heads  tails
Coin
1        22      8
2        17     13

from scipy import stats

table = pd.crosstab(df.Coin, df.Flip)
chi = stats.fisher_exact(table, alternative='two-sided')
print(round(chi[0], 1))

2.1
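The same odds ratio can be reproduced directly from the raw counts in the contingency table, without the DataFrame:

```python
from scipy import stats

# Counts from the contingency table above: rows are coins, columns heads/tails
table = [[22, 8], [17, 13]]

oddsratio, pvalue = stats.fisher_exact(table, alternative='two-sided')
print(round(oddsratio, 1))  # 2.1, i.e. (22 * 13) / (8 * 17)
```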

SLIDE 31

Effect size for Pearson correlation

Metric: Pearson correlation coefficient (r)
Perfect correlation at r = 1

from scipy import stats

pearson = stats.pearsonr(df.Weight, df.Height)
print(pearson[0])

0.7922545330545416
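As a quick sanity check on the "perfect correlation at r = 1" point, perfectly proportional made-up data gives r = 1:

```python
from scipy import stats

# y is exactly 2 * x, so the correlation is perfect
r, p = stats.pearsonr([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
print(r)  # 1.0, up to floating-point error
```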

SLIDE 32

Let's practice!

SLIDE 33

Power


Luke Hayden

Instructor

SLIDE 34

Dening statistical power

Power is the probability of detecting an effect; increasing power decreases the chance of type II errors.
Relationship to other factors: a larger effect size increases power; a larger sample size increases power.

SLIDE 35

Calculating power

SLIDE 36

Calculating power

Assumptions

effect_size: 0.8 (large)
power: None (the unknown we solve for)
alpha: 0.05 (standard)
ratio: 1 (balanced design)
nobs1: 100

from statsmodels.stats import power as pwr

analysis = pwr.TTestIndPower()
pwresult = analysis.solve_power(effect_size=effect_size, power=None,
                                alpha=alpha, ratio=ratio, nobs1=nobs1)
print(pwresult)

0.9998783661018764

SLIDE 37

Calculating power

0.9998783661018764

Interpretation: we are almost certain to detect an effect of this size.

SLIDE 38

Dealing with uncertainty

Hypothesis tests estimate likelihoods; they cannot give absolute certainty.
Power analysis estimates the strength of our answers.

SLIDE 39

Drawing conclusions

Interpret tests in the context of power analyses and the possibility of type II errors:
Negative test result with high power: likely a true negative
Negative test result with low power: possibly a false negative

SLIDE 40

Type I & II errors in context

Find a balance: more power can mean more risk of type I errors.
Use domain knowledge to make reasonable assumptions.

SLIDE 41

Let's practice!
