Assumptions and normal distributions: Experimental Design in Python


slide-1
SLIDE 1

Assumptions and normal distributions

EXPERIMENTAL DESIGN IN PYTHON

Luke Hayden

Instructor

slide-2
SLIDE 2

EXPERIMENTAL DESIGN IN PYTHON

Summary stats

Mean: sum divided by count
Median: half of values fall above and below the median
Mode: value that occurs most often
Standard deviation: measure of variability
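As a minimal sketch of these four definitions, the snippet below uses Python's standard `statistics` module. The `life_exp` values are hypothetical and invented for illustration; they are not the course data.

```python
import statistics

# Hypothetical life-expectancy values, invented for illustration only
life_exp = [61.0, 70.5, 72.0, 76.0, 78.4, 78.4, 81.2]

mean = statistics.mean(life_exp)      # sum divided by count
median = statistics.median(life_exp)  # middle value of the sorted data
mode = statistics.mode(life_exp)      # value that occurs most often
stdev = statistics.stdev(life_exp)    # sample standard deviation (n - 1 denominator)

print(f"mean={mean:.2f} median={median} mode={mode} stdev={stdev:.2f}")
```

In skewed data like this, the mean sits below the median and mode, which is one informal hint that the values are not normally distributed.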

slide-3
SLIDE 3

EXPERIMENTAL DESIGN IN PYTHON

Normal distribution

slide-4
SLIDE 4

EXPERIMENTAL DESIGN IN PYTHON

Sample distribution

    import plotnine as p9

    # Density plot of life expectancy across countries
    print(p9.ggplot(countrydata) + p9.aes(x='Life_exp') + p9.geom_density(alpha=0.5))

slide-5
SLIDE 5

EXPERIMENTAL DESIGN IN PYTHON

Accessing summary stats

Mean

    print(countrydata.Life_exp.mean())
    73.68201058201058

Median

    print(countrydata.Life_exp.median())
    76.0

Mode

    print(countrydata.Life_exp.mode())
    78.4

slide-6
SLIDE 6

EXPERIMENTAL DESIGN IN PYTHON

Normal distribution

slide-7
SLIDE 7

EXPERIMENTAL DESIGN IN PYTHON

Q-Q (quantile-quantile) plot

Normal probability plot
Use
  Does the distribution fit the expected (normal) distribution?
  Graphical method to assess normality
Basis
  Compare quantiles of the data with the theoretical quantiles predicted under the distribution
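To make the basis concrete, here is a rough sketch of how the two axes of a Q-Q plot can be computed by hand with the standard library. The `sample` values are hypothetical, and the plotting positions `(i - 0.5) / n` are one common convention chosen for this sketch; `scipy.stats.probplot` uses a slightly different estimate.

```python
from statistics import NormalDist

# Hypothetical sample, invented for illustration only
sample = [4.1, 5.0, 5.3, 5.9, 6.2, 6.8, 7.4, 9.5]

# Ordered values: the observed quantiles of the data
ordered = sorted(sample)
n = len(ordered)

# Plotting positions (i - 0.5) / n: an assumption of this sketch
probs = [(i - 0.5) / n for i in range(1, n + 1)]

# Theoretical quantiles of the standard normal at those probabilities
theoretical = [NormalDist().inv_cdf(p) for p in probs]

for t, v in zip(theoretical, ordered):
    print(f"{t:7.3f}  {v:5.1f}")
```

If the data are close to normal, plotting `ordered` against `theoretical` gives points that fall near a straight line.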

slide-8
SLIDE 8

EXPERIMENTAL DESIGN IN PYTHON

Creating a Q-Q plot

    import pandas as pd
    import plotnine as p9
    from scipy import stats

    # probplot returns ((theoretical quantiles, ordered values), fit parameters)
    tq = stats.probplot(countrydata.Life_exp, dist="norm")

    df = pd.DataFrame(data={'Theoretical Quantiles': tq[0][0],
                            "Ordered Values": countrydata.Life_exp.sort_values()})

    print(p9.ggplot(df) + p9.aes('Theoretical Quantiles', "Ordered Values") + p9.geom_point())

slide-9
SLIDE 9

EXPERIMENTAL DESIGN IN PYTHON

Q-Q plot for sample

[Figure panels: Distribution, Q-Q plot]

slide-10
SLIDE 10

Let's practice!

EXPERIMENTAL DESIGN IN PYTHON

slide-11
SLIDE 11

Testing for normality

EXPERIMENTAL DESIGN IN PYTHON

Luke Hayden

Instructor

slide-12
SLIDE 12

EXPERIMENTAL DESIGN IN PYTHON

Testing for normality

Normal distribution
  Mean, median, and mode are equal
  Symmetrical
  Crucial assumption of certain tests
Approach
  Test for normality

slide-13
SLIDE 13

EXPERIMENTAL DESIGN IN PYTHON

Shapiro-Wilk test

Basis
  Test for normality
  Based on same logic as Q-Q plot
Use
  1) Test normality of each sample
  2) Choose test/approach
  3) Perform hypothesis test

    from scipy import stats

    shapiro = stats.shapiro(my_sample)
    print(shapiro)
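The three steps listed under Use can be sketched roughly as follows. Everything here is an assumption made for illustration: the two samples are simulated, the 0.05 threshold is a convention rather than a rule from the slides, and falling back to the Wilcoxon rank-sum test is one possible choice among several.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical samples, simulated for illustration only
sample_a = rng.normal(loc=50, scale=5, size=40)
sample_b = rng.exponential(scale=5, size=40)  # strongly skewed

alpha = 0.05  # conventional threshold, an assumption of this sketch

# 1) Test normality of each sample (shapiro returns statistic, p-value)
_, p_a = stats.shapiro(sample_a)
_, p_b = stats.shapiro(sample_b)

# 2) Choose the test: t-test only if neither sample rejects normality
both_normal = p_a > alpha and p_b > alpha

# 3) Perform the hypothesis test
if both_normal:
    result = stats.ttest_ind(sample_a, sample_b)
else:
    result = stats.ranksums(sample_a, sample_b)

print(f"Shapiro p-values: {p_a:.3g}, {p_b:.3g}")
print(result)
```

A low Shapiro-Wilk p-value is evidence against normality; a high one does not prove the sample is normal, so a Q-Q plot remains a useful companion check.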

slide-14
SLIDE 14

EXPERIMENTAL DESIGN IN PYTHON

Shapiro-Wilk test example

slide-15
SLIDE 15

EXPERIMENTAL DESIGN IN PYTHON

Implementing a Shapiro-Wilk test

    from scipy import stats

    shapiro = stats.shapiro(countrydata.Life_exp)
    print(shapiro)
    (0.39991819858551025, 6.270842690066813e-26)

slide-16
SLIDE 16

EXPERIMENTAL DESIGN IN PYTHON

Test assumptions

Tests based on assumption of normality
  Student's t-test (one and two-sample)
  Paired t-test
  ANOVA
Normality test
  Test by group

slide-17
SLIDE 17

EXPERIMENTAL DESIGN IN PYTHON

Normality and test choice

Sample size & sample mean
  Large sample size: sample mean approaches population mean
Small sample sizes
  Important that the normality assumption is not violated
Large sample sizes
  Importance of normality is relaxed
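A small simulation can illustrate why sample size matters. The sketch below draws repeated samples from a hypothetical skewed population (exponential with mean 1.0, an assumption chosen for illustration) and records how much the sample mean varies at each sample size.

```python
import random
import statistics

random.seed(42)

def sample_mean(n):
    # Hypothetical skewed population: exponential with mean 1.0
    return sum(random.expovariate(1.0) for _ in range(n)) / n

spread = {}
for n in (5, 50, 500):
    # 1000 repeated samples of size n
    means = [sample_mean(n) for _ in range(1000)]
    spread[n] = statistics.stdev(means)
    print(n, round(statistics.mean(means), 3), round(spread[n], 3))
```

The average of the sample means stays near the population mean of 1.0, while their spread shrinks as n grows. With large samples the distribution of the sample mean also becomes close to normal even though the population is skewed, which is why the normality requirement is relaxed.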

slide-18
SLIDE 18

Let's practice!

EXPERIMENTAL DESIGN IN PYTHON

slide-19
SLIDE 19

Non-parametric tests: Wilcoxon rank-sum test

EXPERIMENTAL DESIGN IN PYTHON

Luke Hayden

Instructor

slide-20
SLIDE 20

EXPERIMENTAL DESIGN IN PYTHON

When assumptions don't hold

Tests are based on assumptions about data
  Normality: assumption underlying t-test
Violation of assumptions
  Test no longer valid
Approach
  Non-parametric tests
  "Looser" constraints

slide-21
SLIDE 21

EXPERIMENTAL DESIGN IN PYTHON

Parametric vs non-parametric tests

Parametric tests
  Make many assumptions
  Population modeled by distribution with fixed parameters (e.g. normal)
  Sensitivity: higher
  Hypotheses: more specific
Non-parametric tests
  Make few assumptions
  No fixed population parameters
  Used when data doesn't fit these distributions
  Sensitivity: lower
  Hypotheses: less specific

slide-22
SLIDE 22

EXPERIMENTAL DESIGN IN PYTHON

Wilcoxon rank-sum vs t-test

Student's t-test
  Parametric
  Hypothesis: mean of sample A == mean of sample B?
  Assumptions: relies on normality
  Sensitivity: higher
Wilcoxon rank-sum test
  Non-parametric
  Hypothesis: random value from sample A > random value from sample B?
  Assumptions: not sensitive to distribution shape
  Sensitivity: slightly lower
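To show why the rank-sum test does not depend on distribution shape, here is a hand-rolled sketch that works only on ranks. It uses the normal approximation without tie or continuity corrections, which is an assumption of this sketch; `average_ranks` and `rank_sum_test` are hypothetical helper names, and the two samples are invented.

```python
from math import sqrt
from statistics import NormalDist

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def rank_sum_test(a, b):
    """Normal-approximation rank-sum test (no tie or continuity correction)."""
    n1, n2 = len(a), len(b)
    ranks = average_ranks(list(a) + list(b))
    s = sum(ranks[:n1])                     # rank sum of sample a
    expected = n1 * (n1 + n2 + 1) / 2       # expected rank sum under the null
    sd = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (s - expected) / sd
    p = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided p-value
    return z, p

# Hypothetical measurements for two groups, invented for illustration only
sample_a = [1, 2, 3, 4, 5]
sample_b = [6, 7, 8, 9, 10]
z, p = rank_sum_test(sample_a, sample_b)
print(z, p)
```

Because only the ordering of the pooled values enters the calculation, replacing 10 with 10,000 in `sample_b` would leave the result unchanged, whereas a t-test would be affected.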

slide-23
SLIDE 23

EXPERIMENTAL DESIGN IN PYTHON

Wilcoxon rank-sum test example

slide-24
SLIDE 24

EXPERIMENTAL DESIGN IN PYTHON

Implementing a Wilcoxon rank-sum test

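    from scipy import stats

    # Split the data by fertilizer. ranksums expects one numeric sample per
    # argument, so select the measurement column if df holds other columns too.
    Sample_A = df[df.Fertilizer == "A"]
    Sample_B = df[df.Fertilizer == "B"]

    # Two-sided Wilcoxon rank-sum test
    wilc = stats.ranksums(Sample_A, Sample_B)
    print(wilc)
    RanksumsResult(statistic=16.085203659039184, pvalue=3.239851573227159e-58)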

slide-25
SLIDE 25

EXPERIMENTAL DESIGN IN PYTHON

Wilcoxon signed-rank test

Non-parametric equivalent to paired t-test
Tests if ranks differ across pairs

2017 yield    2018 yield
60.2          63.2
12            15.6
13.8          14.8
91.8          96.7
50            53
45            47
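The idea of ranks differing across pairs can be sketched by hand using the six pairs listed on this slide. This is an illustrative calculation of the rank sums only, not a full test with a p-value; taking the smaller of the two rank sums as the statistic is a common convention and is assumed here. The `rank_of` helper is a hypothetical name.

```python
# Paired yields copied from the slide
yield_2017 = [60.2, 12, 13.8, 91.8, 50, 45]
yield_2018 = [63.2, 15.6, 14.8, 96.7, 53, 47]

# Signed differences, rounded to suppress floating-point noise
diffs = [round(b - a, 6) for a, b in zip(yield_2017, yield_2018)]
diffs = [d for d in diffs if d != 0]  # zero differences are dropped

# Rank the absolute differences, averaging the ranks of ties
ordered = sorted(abs(d) for d in diffs)

def rank_of(x):
    first = ordered.index(x) + 1
    last = first + ordered.count(x) - 1
    return (first + last) / 2

# Sum the ranks separately for positive and negative differences
w_plus = sum(rank_of(abs(d)) for d in diffs if d > 0)
w_minus = sum(rank_of(abs(d)) for d in diffs if d < 0)
statistic = min(w_plus, w_minus)
print(diffs, w_plus, w_minus, statistic)
```

Every 2018 value exceeds its 2017 partner, so all ranks land on the positive side. The more lopsided the split between the two rank sums, the stronger the evidence that the paired values differ.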

slide-26
SLIDE 26

EXPERIMENTAL DESIGN IN PYTHON

Wilcoxon signed-rank test example

    from scipy import stats

    yields2018 = [60.2, 12, 13.8, 91.8, 50, 45, 32, 87.5, 60.1, 88]
    yields2019 = [63.2, 15.6, 14.8, 96.7, 53, 47, 31.3, 89.8, 67.8, 90]

    # Paired, two-sided Wilcoxon signed-rank test
    wilcsr = stats.wilcoxon(yields2018, yields2019)
    print(wilcsr)
    WilcoxonResult(statistic=1.0, pvalue=0.00683774356850919)

slide-27
SLIDE 27

Let's practice!

EXPERIMENTAL DESIGN IN PYTHON

slide-28
SLIDE 28

More non-parametric tests: Spearman correlation

EXPERIMENTAL DESIGN IN PYTHON

Luke Hayden

Instructor

slide-29
SLIDE 29

EXPERIMENTAL DESIGN IN PYTHON

Correlation

Basis
  Relate one continuous or ordinal variable to another
  Will variation in one predict variation in the other?
Pearson correlation
  Based on a linear model

slide-30
SLIDE 30

EXPERIMENTAL DESIGN IN PYTHON

Pearson vs Spearman correlation

Pearson correlation
  Parametric
  Based on raw values
  Sensitive to outliers
  Assumes: linear, monotonic relationship
  Effect measure: Pearson's r
Spearman correlation
  Non-parametric
  Based on ranks
  Robust to outliers
  Assumes: monotonic relationship
  Effect measure: Spearman's rho
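The contrast between raw values and ranks can be illustrated with a short standard-library sketch. The data are hypothetical: a relationship that is perfectly monotonic but not linear (y = exp(x)), chosen so the two measures disagree. The `ranks` helper assumes there are no ties, which holds for this data.

```python
from math import exp, sqrt

def pearson(x, y):
    """Pearson's r computed from raw values."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))

def ranks(values):
    """Ranks 1..n; assumes no ties, which holds for the data below."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        r[idx] = rank
    return r

def spearman(x, y):
    # Spearman's rho is Pearson's r computed on the ranks
    return pearson(ranks(x), ranks(y))

# Hypothetical monotonic but non-linear relationship: y = exp(x)
x = [0, 1, 2, 3, 4, 5]
y = [exp(v) for v in x]

r = pearson(x, y)
rho = spearman(x, y)
print(round(r, 3), round(rho, 3))
```

Spearman's rho reaches 1 because the ordering of y follows the ordering of x exactly, while Pearson's r falls short of 1 because the points do not lie on a straight line.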

slide-31
SLIDE 31

EXPERIMENTAL DESIGN IN PYTHON

Pearson vs Spearman correlation

Pearson's r: 1, Spearman's rho = 1

slide-32
SLIDE 32

EXPERIMENTAL DESIGN IN PYTHON

Pearson vs Spearman correlation

Pearson's r: -1, Spearman's rho = -1

slide-33
SLIDE 33

EXPERIMENTAL DESIGN IN PYTHON

Pearson vs Spearman correlation

Pearson's r: 0.915, Spearman's rho = 1

slide-34
SLIDE 34

EXPERIMENTAL DESIGN IN PYTHON

Pearson vs Spearman correlation

Pearson's r: 0.0429, Spearman's rho = 0.0428

slide-35
SLIDE 35

EXPERIMENTAL DESIGN IN PYTHON

Spearman correlation example

slide-36
SLIDE 36

EXPERIMENTAL DESIGN IN PYTHON

Implementing a Spearman correlation

    from scipy import stats

    # Pearson correlation: returns (r, p-value)
    pearcorr = stats.pearsonr(oly.Height, oly.Weight)
    print(pearcorr)
    (0.6125605419882442, 7.0956520885987905e-190)

    # Spearman correlation: returns (rho, p-value)
    spearcorr = stats.spearmanr(oly.Height, oly.Weight)
    print(spearcorr)
    SpearmanrResult(correlation=0.728877815423366, pvalue=1.4307959767478955e-304)

slide-37
SLIDE 37

Let's practice!

EXPERIMENTAL DESIGN IN PYTHON

slide-38
SLIDE 38

Summary

EXPERIMENTAL DESIGN IN PYTHON

Luke Hayden

Instructor

slide-39
SLIDE 39

EXPERIMENTAL DESIGN IN PYTHON

What you've learned

Chapter 1: Exploratory data analysis & hypothesis testing
Chapter 2: Dealing with multiple factors
Chapter 3: Type I and II errors and the power-sample size-effect size relationship
Chapter 4: Dealing with assumptions of tests

slide-40
SLIDE 40

EXPERIMENTAL DESIGN IN PYTHON

Uncertainty is a theme of statistics

Uncertainty is always present
  We can't expect absolute certainty
Approach
  Quantify our uncertainty
  Assess likelihood of competing hypotheses
  Methods may rest on unproven assumptions

slide-41
SLIDE 41

Embrace uncertainty!

EXPERIMENTAL DESIGN IN PYTHON