Assumptions and normal distributions
EXPERIMENTAL DESIGN IN PYTHON
Luke Hayden, Instructor

Summary stats
  Mean: sum divided by count
  Median: half of the values fall above the median and half below
  Mode: value that occurs most often
  Standard deviation: measure of variability
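
The four summary statistics above can be sketched with Python's standard `statistics` module. The data values below are invented for illustration and are not taken from the course dataset.

```python
import statistics

# Hypothetical life-expectancy values, invented for illustration only
life_exp = [52.0, 61.5, 68.0, 71.2, 74.0, 76.0, 78.4, 78.4, 81.0, 83.5]

mean = statistics.mean(life_exp)      # sum divided by count
median = statistics.median(life_exp)  # middle of the sorted values
mode = statistics.mode(life_exp)      # most frequent value
stdev = statistics.stdev(life_exp)    # sample standard deviation (n - 1 denominator)

print(mean, median, mode, stdev)
```

When the mean sits below the median, as it does here, the distribution has a longer tail on the low side, which is a first hint that it may not be normal.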

Normal distribution [figure]

Sample distribution
    import plotnine as p9
    # countrydata: DataFrame with a Life_exp column
    print(p9.ggplot(countrydata)
          + p9.aes(x='Life_exp')
          + p9.geom_density(alpha=0.5))

Accessing summary stats
  Mean
    print(countrydata.Life_exp.mean())
    73.68201058201058
  Median
    print(countrydata.Life_exp.median())
    76.0
  Mode
    print(countrydata.Life_exp.mode())
    78.4

Normal distribution [figure]

Q-Q (quantile-quantile) plot
  Normal probability plot
  Use
    Does the distribution fit the expected (normal) distribution?
    Graphical method to assess normality
  Basis
    Compare quantiles of the data with the theoretical quantiles predicted under the distribution
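
The basis of the Q-Q plot can be sketched without any plotting library: sort the data, then pair each ordered value with the quantile a normal distribution would predict at the same position. The sample and the plotting positions `(i - 0.5) / n` are assumptions made for illustration; other tools may use slightly different positions.

```python
from statistics import NormalDist

# Hypothetical sample, invented for illustration only
sample = [4.1, 5.0, 5.6, 6.2, 6.9, 7.3, 8.0, 9.4]

ordered = sorted(sample)
n = len(ordered)

# Plotting positions: one common convention, assumed here
probs = [(i - 0.5) / n for i in range(1, n + 1)]

# Theoretical quantiles predicted under a standard normal distribution
theoretical = [NormalDist().inv_cdf(p) for p in probs]

# Points of the Q-Q plot: close to a straight line if the data are near normal
qq_points = list(zip(theoretical, ordered))
for t, v in qq_points:
    print(f"{t:6.3f}  {v}")
```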

Creating a Q-Q plot
    from scipy import stats
    import pandas as pd
    import plotnine as p9
    tq = stats.probplot(countrydata.Life_exp, dist="norm")
    df = pd.DataFrame(data={'Theoretical Quantiles': tq[0][0],
                            "Ordered Values": countrydata.Life_exp.sort_values()})
    print(p9.ggplot(df)
          + p9.aes('Theoretical Quantiles', "Ordered Values")
          + p9.geom_point())

Q-Q plot for sample [figure, two panels: Distribution, Q-Q plot]

Let's practice!

Testing for normality
  Normal distribution
    Mean, median, and mode are equal
    Symmetrical
    Crucial assumption of certain tests
  Approach
    Test for normality

Shapiro-Wilk test
  Basis
    Test for normality
    Based on same logic as Q-Q plot
  Use
    1) Test normality of each sample
    2) Choose test/approach
    3) Perform hypothesis test
    from scipy import stats
    shapiro = stats.shapiro(my_sample)
    print(shapiro)

Shapiro-Wilk test example [figure]

Implementing a Shapiro-Wilk test
    from scipy import stats
    shapiro = stats.shapiro(countrydata.Life_exp)
    print(shapiro)  # (W statistic, p-value)
    (0.39991819858551025, 6.270842690066813e-26)
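
To see how the result is read, here is a minimal sketch that runs `stats.shapiro` on two synthetic samples, one normal and one strongly skewed. The samples, the seed, and the 0.05 cutoff are assumptions chosen for illustration; they are not part of the course data.

```python
import random
from scipy import stats

random.seed(0)  # fixed seed so the sketch is repeatable

# Synthetic samples (invented for illustration): one drawn from a normal
# distribution, one from a right-skewed exponential distribution
normal_sample = [random.gauss(70, 8) for _ in range(200)]
skewed_sample = [random.expovariate(1 / 10) for _ in range(200)]

for name, sample in [("normal", normal_sample), ("skewed", skewed_sample)]:
    w, p = stats.shapiro(sample)  # W statistic and p-value
    verdict = "normality rejected" if p < 0.05 else "no evidence against normality"
    print(f"{name}: W={w:.3f}, p={p:.3g} -> {verdict}")
```

A small p-value, like the one in the slide above, means the sample is unlikely to have come from a normal distribution.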

Test assumptions
  Tests based on assumption of normality
    Student's t-test (one and two-sample)
    Paired t-test
    ANOVA
  Normality test
    Test by group

Normality and test choice
  Sample size & sample mean
    Large sample size: sample mean approaches population mean
  Small sample sizes
    Important that the normality assumption is not violated
  Large sample sizes
    Importance of normality is relaxed
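
The effect of sample size on the sample mean can be checked with a small simulation. This is a sketch under stated assumptions: the population is a skewed exponential distribution with mean 1.0, and the sample sizes (5 and 100) and repetition count are arbitrary choices.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

def sample_means(n, reps=2000):
    """Means of `reps` samples of size n from a skewed (exponential) population."""
    return [statistics.fmean([random.expovariate(1.0) for _ in range(n)])
            for _ in range(reps)]

# The population mean of expovariate(1.0) is 1.0
means_small = sample_means(5)
means_large = sample_means(100)

spread_small = statistics.stdev(means_small)
spread_large = statistics.stdev(means_large)

print(f"n=5:   average of means {statistics.fmean(means_small):.3f}, spread {spread_small:.3f}")
print(f"n=100: average of means {statistics.fmean(means_large):.3f}, spread {spread_large:.3f}")
```

With larger samples the means cluster far more tightly around the population mean, even though the underlying population is not normal.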

Non-parametric tests: Wilcoxon rank-sum test

When assumptions don't hold
  Tests are based on assumptions about data
    Normality: assumption underlying t-test
  Violation of assumptions
    Test no longer valid
  Approach
    Non-parametric tests
    "Looser" constraints
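
One way to put this into practice is to check normality for each group and fall back to a non-parametric test when the check fails. The sketch below is an illustration, not a prescribed procedure: `compare_two_groups` is a hypothetical helper, the 0.05 cutoff is an assumption, and the groups are synthetic.

```python
import random
from scipy import stats

def compare_two_groups(a, b, alpha=0.05):
    """Hypothetical helper: choose a two-sample test from Shapiro-Wilk results."""
    # 1) Test normality of each sample separately
    _, p_a = stats.shapiro(a)
    _, p_b = stats.shapiro(b)
    # 2) Choose the test and 3) perform the hypothesis test
    if p_a > alpha and p_b > alpha:
        _, p = stats.ttest_ind(a, b)
        return "t-test", p
    _, p = stats.ranksums(a, b)
    return "Wilcoxon rank-sum", p

random.seed(2)  # fixed seed so the sketch is repeatable
# Synthetic, right-skewed groups (invented for illustration)
group_a = [random.expovariate(1 / 5) for _ in range(100)]
group_b = [random.expovariate(1 / 8) for _ in range(100)]

test_name, p_value = compare_two_groups(group_a, group_b)
print(test_name, p_value)
```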

Parametric vs non-parametric tests
  Parametric tests
    Make many assumptions
    Population modeled by distribution with fixed parameters (e.g. normal)
    Sensitivity: higher
    Hypotheses: more specific
  Non-parametric tests
    Make few assumptions
    No fixed population parameters
    Used when data doesn't fit these distributions
    Sensitivity: lower
    Hypotheses: less specific

Wilcoxon rank-sum vs t-test
  Student's t-test
    Parametric
    Hypothesis: mean of sample A == mean of sample B?
    Assumptions: relies on normality
    Sensitivity: higher
  Wilcoxon rank-sum test
    Non-parametric
    Hypothesis: random value from sample A > random value from sample B?
    Assumptions: not sensitive to distribution shape
    Sensitivity: slightly lower

Wilcoxon rank-sum test example [figure]

Implementing a Wilcoxon rank-sum test
    from scipy import stats
    Sample_A = df[df.Fertilizer == "A"]
    Sample_B = df[df.Fertilizer == "B"]
    wilc = stats.ranksums(Sample_A, Sample_B)
    print(wilc)
    RanksumsResult(statistic=16.085203659039184, pvalue=3.239851573227159e-58)
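
To show where the statistic comes from, the sketch below runs `stats.ranksums` on two small lists and recomputes the z statistic by hand from the pooled ranks. The yield values are invented for illustration and contain no ties, which keeps the manual ranking simple.

```python
from math import sqrt
from scipy import stats

# Hypothetical yields for two fertilizers (invented for illustration, no ties)
sample_a = [20.1, 22.4, 19.8, 23.5, 21.7, 24.2]
sample_b = [25.3, 27.9, 26.4, 24.8, 28.6, 29.1]

statistic, pvalue = stats.ranksums(sample_a, sample_b)

# By hand: rank all values together, sum the ranks of sample A, and compare
# that sum with its expected value if both samples came from one distribution
pooled = sorted(sample_a + sample_b)
rank_sum_a = sum(pooled.index(v) + 1 for v in sample_a)  # valid because no ties
n1, n2 = len(sample_a), len(sample_b)
expected = n1 * (n1 + n2 + 1) / 2
z = (rank_sum_a - expected) / sqrt(n1 * n2 * (n1 + n2 + 1) / 12)

print(statistic, pvalue, z)
```

A negative statistic here means sample A tends to hold the lower ranks. Only the ordering of the values enters the calculation, which is why the test does not depend on the shape of the distribution.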

Wilcoxon signed-rank test
  Non-parametric equivalent to paired t-test
  Tests if ranks differ across pairs
  2017 yield    2018 yield
  60.2          63.2
  12            15.6
  13.8          14.8
  91.8          96.7
  50            53
  45            47

Wilcoxon signed-rank test example
    from scipy import stats
    yields2018 = [60.2, 12, 13.8, 91.8, 50, 45, 32, 87.5, 60.1, 88]
    yields2019 = [63.2, 15.6, 14.8, 96.7, 53, 47, 31.3, 89.8, 67.8, 90]
    wilcsr = stats.wilcoxon(yields2018, yields2019)
    print(wilcsr)
    WilcoxonResult(statistic=1.0, pvalue=0.00683774356850919)
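
The statistic of 1.0 can be reproduced by hand, which makes the "ranks differ across pairs" idea concrete. This is a sketch of the usual signed-rank calculation using the same two lists; rounding the differences to one decimal is an assumption added so that equal differences tie exactly despite floating-point error.

```python
yields2018 = [60.2, 12, 13.8, 91.8, 50, 45, 32, 87.5, 60.1, 88]
yields2019 = [63.2, 15.6, 14.8, 96.7, 53, 47, 31.3, 89.8, 67.8, 90]

# Paired differences, rounded to the data's one decimal place;
# zero differences carry no sign and are dropped
diffs = [round(b - a, 1) for a, b in zip(yields2018, yields2019)]
diffs = [d for d in diffs if d != 0]

# Rank the absolute differences, giving tied values their average rank
abs_sorted = sorted(abs(d) for d in diffs)

def average_rank(value):
    first = abs_sorted.index(value) + 1         # rank of first occurrence
    last = first + abs_sorted.count(value) - 1  # rank of last occurrence
    return (first + last) / 2

w_plus = sum(average_rank(abs(d)) for d in diffs if d > 0)
w_minus = sum(average_rank(abs(d)) for d in diffs if d < 0)
statistic = min(w_plus, w_minus)

print(w_plus, w_minus, statistic)  # 54.0 1.0 1.0
```

Only one pair shows a decrease, and it is the smallest change of all, so the negative ranks sum to just 1.0 out of a possible 55.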

More nonparametric tests: Spearman correlation

Correlation
  Basis
    Relate one continuous or ordinal variable to another
    Will variation in one predict variation in the other?
  Pearson correlation
    Based on a linear model

Pearson vs Spearman correlation
  Pearson correlation
    Parametric
    Based on raw values
    Sensitive to outliers
    Assumes: linear, monotonic relationship
    Effect measure: Pearson's r
  Spearman correlation
    Non-parametric
    Based on ranks
    Robust to outliers
    Assumes: monotonic relationship
    Effect measure: Spearman's rho
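
The difference between the two measures shows up clearly on data that rise steadily but not in a straight line. The sketch below uses a synthetic cubic relationship, chosen only for illustration.

```python
from scipy import stats

# Synthetic data (invented for illustration): y rises with x, but not linearly
x = list(range(1, 11))
y = [v ** 3 for v in x]

pearson_r, _ = stats.pearsonr(x, y)      # based on raw values
spearman_rho, _ = stats.spearmanr(x, y)  # based on ranks

print(f"Pearson's r:    {pearson_r:.3f}")
print(f"Spearman's rho: {spearman_rho:.3f}")
```

Because the ranks of `x` and `y` agree perfectly, Spearman's rho is 1, while Pearson's r falls short of 1 since the relationship is not linear.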

Pearson vs Spearman correlation [figure]
  Pearson's r = 1, Spearman's rho = 1

Pearson vs Spearman correlation [figure]
  Pearson's r = -1, Spearman's rho = -1

Pearson vs Spearman correlation [figure]
  Pearson's r = 0.915, Spearman's rho = 1

Pearson vs Spearman correlation [figure]
  Pearson's r = 0.0429, Spearman's rho = 0.0428

Spearman correlation example [figure]

Implementing a Spearman correlation
    from scipy import stats
    pearcorr = stats.pearsonr(oly.Height, oly.Weight)
    print(pearcorr)
    (0.6125605419882442, 7.0956520885987905e-190)
    spearcorr = stats.spearmanr(oly.Height, oly.Weight)
    print(spearcorr)
    SpearmanrResult(correlation=0.728877815423366, pvalue=1.4307959767478955e-304)

Summary

What you've learned
  Chapter 1: Exploratory data analysis & hypothesis testing
  Chapter 2: Dealing with multiple factors
  Chapter 3: Type I and II errors and the power-sample size-effect size relationship
  Chapter 4: Dealing with assumptions of tests

Uncertainty is a theme of statistics
  Uncertainty is always present
    We can't expect absolute certainty
    Methods may rest on unproven assumptions
  Approach
    Quantify our uncertainty
    Assess likelihood of competing hypotheses

Embrace uncertainty!
