Statistics and Hypothesis Testing

Statistics and Hypothesis Testing
NENS 230: Data Analysis for the Biosciences using MATLAB
Eddy Albarran
November 3, 2015

Analysis Methodology
Stages shown in the diagram: Data, Exploratory Data Analysis, Generate Hypotheses, Hypothesis Testing, with the outcomes "Reject null" or "Fail to reject null".
Exploratory Data Analysis:
– Summary statistics
– Dimensionality reduction/PCA
– Visualization: histograms, scatterplots, box plots, etc.
Hypothesis Testing:
– T-test
– Z-test
– Chi-square
– etc.

Outline Summary statistics functions Random Variables – Random variables, PDF, CDFs – Estimates of central tendency and dispersion – Standard error of the mean, confidence intervals Statistical Hypothesis Testing – Tests and significance – Student’s t test walkthrough – Other commonly used tests Analysis of Variance Homework

Summary Statistics Commonly used functions: – mean() – std() – var() – sum() – min() – max()

mean() function
mean() computes the average (sample mean) of a vector. With matrices, you need to specify which dimension to average along.
mean(X, 1) averages across the rows (along dimension 1) and returns the average row: one mean per column. This is the default for a matrix if you specify only one argument.
mean(X, 2) averages across the columns (along dimension 2) and returns the average column: one mean per row.

mean() function: worked example
X =
   26     0
   15    15
    1     1
    2.4   0
Dimension 1 runs down the rows; dimension 2 runs across the columns.
mean(X) or mean(X, 1) evaluates to [11.1 4]
mean(X, 2) evaluates to [13; 15; 1; 1.2]

mean() function
mean() operates on its first argument. When averaging two numbers together, be careful to pack them into a vector using [ ].
mean(1, 5) evaluates to 1: "take the mean of [1] along the 5th dimension".
mean([1 5]) evaluates to 3.
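As a minimal sketch, the behavior described on the mean() slides can be checked directly. The matrix X is the one from the worked example; the variable names are only illustrative.

```matlab
% Example matrix from the slide (4 rows, 2 columns).
X = [26   0;
     15  15;
      1   1;
      2.4 0];

colMeans     = mean(X, 1);   % average along dimension 1: one mean per column (1-by-2)
rowMeans     = mean(X, 2);   % average along dimension 2: one mean per row (4-by-1)
defaultMeans = mean(X);      % for a matrix, same result as mean(X, 1)

% Pitfall: the second argument is a dimension, not a second value.
notAnAverage = mean(1, 5);   % mean of the scalar 1 along dimension 5, so 1
average      = mean([1 5]);  % mean of the vector [1 5], so 3
```

Here colMeans is approximately [11.1 4] and rowMeans is approximately [13; 15; 1; 1.2], matching the slide.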

std() function
std() computes the standard deviation of a list of numbers.
– When dealing with matrices, you need to specify which dimension to operate along, as the third argument.
– The second argument should be 0 if you want the estimator that normalizes by n-1 (the square root of the unbiased variance estimate), where n is the number of samples. This is the default.
Example, with X = [26 0; 15 15; 1 1; 2.4 0]:
std(X) or std(X, 0, 1) evaluates to [11.7604 7.3485]
std(X, 0, 2) evaluates to [18.3848; 0; 0; 1.6971]

var() function
var() computes the sample variance of a list of numbers.
– When dealing with matrices, you need to specify which dimension to operate along, as the third argument.
– The second argument should be 0 if you want the unbiased estimator that normalizes by n-1, where n is the number of samples. (This is the default.)
Example, with X = [26 0; 15 15; 1 1; 2.4 0]:
var(X) or var(X, 0, 1) evaluates to [138.31 54]
var(X, 0, 2) evaluates to [338; 0; 0; 2.88]
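The std() and var() slides can be tied together with a short sketch. It uses the same example matrix and only illustrative variable names, and it also shows the alternative normalization by n (second argument 1) for comparison.

```matlab
X = [26 0; 15 15; 1 1; 2.4 0];
n = size(X, 1);                 % number of samples in each column

sCols = std(X, 0, 1);           % n-1 normalization, per column: about [11.7604 7.3485]
vCols = var(X, 0, 1);           % n-1 normalization, per column: about [138.31 54]
sRows = std(X, 0, 2);           % n-1 normalization, per row: about [18.3848; 0; 0; 1.6971]

% A second argument of 1 normalizes by n instead of n-1.
vColsByN = var(X, 1, 1);

% The variance is the square of the standard deviation, and the two
% normalizations differ only by the factor (n-1)/n.
gapVarStd = max(abs(vCols - sCols.^2));
gapNorm   = max(abs(vColsByN - vCols * (n - 1) / n));
```

Both gap values should be zero up to floating-point rounding.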

sum() function
sum() computes the sum of a vector. When dealing with matrices, you should specify which dimension to sum along.
sum(X, 1) sums over the rows within each column, returning one sum per column. This is the default for a matrix if you specify only one argument.
sum(X, 2) sums over the columns within each row, returning one sum per row.
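A brief sketch of sum() along each dimension, reusing the example matrix from the mean() slides (variable names are illustrative):

```matlab
X = [26 0; 15 15; 1 1; 2.4 0];

colSums = sum(X, 1);     % one sum per column: about [44.4 16]
rowSums = sum(X, 2);     % one sum per row: about [26; 30; 2; 2.4]
total   = sum(X(:));     % sum of every element: about 60.4

% The column means are the column sums divided by the number of rows.
colMeansFromSums = colSums / size(X, 1);
```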

min() function
min() computes the minimum of a vector. When dealing with matrices, you should specify which dimension to find the minimum along.
min(X, Y) returns an array the same size as X and Y consisting of the smaller of the elements in X and Y at each location.
min(X, [], 1) returns the minimum value in each column. This is the default for a matrix if you specify only one argument.
min(X, [], 2) returns the minimum value in each row.

max() function
max() computes the maximum of a vector. When dealing with matrices, you should specify which dimension to find the maximum along.
max(X, Y) returns an array the same size as X and Y consisting of the larger of the elements in X and Y at each location.
max(X, [], 1) returns the maximum value in each column. This is the default for a matrix if you specify only one argument.
max(X, [], 2) returns the maximum value in each row.
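The three calling forms of min() and max() are easy to mix up, so here is a small sketch using the example matrix from the mean() slides. The comparison matrix Y and the variable names are assumptions made for illustration.

```matlab
X = [26 0; 15 15; 1 1; 2.4 0];
Y = 2 * ones(size(X));       % hypothetical second array, same size as X

colMin  = min(X, [], 1);     % minimum of each column: [1 0]
rowMax  = max(X, [], 2);     % maximum of each row: [26; 15; 1; 2.4]
elemMin = min(X, Y);         % element-wise smaller value: [2 0; 2 2; 1 1; 2 0]

% Pitfall: without the empty [] placeholder, the second argument is treated
% as a value to compare against, not as a dimension.
comparedWithOne = min(X, 1); % element-wise min with the scalar 1: [1 0; 1 1; 1 1; 1 0]

% A second output returns the position of the extreme value.
[largest, position] = max(X(:, 1));   % largest = 26, position = 1
```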

Outline (next section: Random Variables)
– Random variables, PDF, CDFs
– Estimates of central tendency and dispersion
– Standard error of the mean, confidence intervals

Discrete random variables
Suppose we have a random variable X.
Discrete random variables take one value within a set of k possible values.
Probability mass function: for a given value $x_i$, returns the probability $p_i$ of X taking that value.
$\Pr[X = x_i] = p_i$
The sum of these probabilities must be 1.
$p_1 + p_2 + \cdots + p_k = 1$
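To make the probability mass function concrete, here is a minimal sketch for a fair six-sided die. The die is an assumed example, not one taken from the slides.

```matlab
x = 1:6;                 % possible values x_1 ... x_k, with k = 6
p = ones(1, 6) / 6;      % p_i = Pr[X = x_i] for a fair die

totalProb = sum(p);                  % the probabilities must sum to 1
pEven     = sum(p(mod(x, 2) == 0));  % Pr[X is even] = 1/2
pAtMost2  = sum(p(x <= 2));          % Pr[X <= 2] = 1/3
```

The probability of any event is the sum of the p_i over the values that belong to it.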

[Figure: probability mass function]

Continuous random variables
Suppose we have a random variable X.
Continuous random variables take values within some continuous range of values.
Probability density function (PDF): integrating this function over some interval gives you the probability that X lies in that interval.
$\Pr[a \le X \le b] = \int_a^b f(x)\,dx$
Therefore, the integral of this function over all values is 1.
$\int_{-\infty}^{\infty} f(x)\,dx = 1$
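The two integrals above can be evaluated numerically with MATLAB's integral(). The sketch below assumes an exponential density f(x) = exp(-x) for x >= 0, chosen only because its integrals are easy to verify by hand.

```matlab
% Assumed example density: exponential with rate 1, f(x) = exp(-x) for x >= 0
% (and zero for x < 0, so integration starts at 0).
f = @(x) exp(-x);

pInterval = integral(f, 1, 2);     % Pr[1 <= X <= 2]
pExact    = exp(-1) - exp(-2);     % the same probability in closed form
totalArea = integral(f, 0, Inf);   % area under the whole density, should be 1
```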

Normal distribution
Normal or Gaussian distributions describe many naturally occurring phenomena, due to the central limit theorem.
Specified by two parameters:
– Location parameter: the mean (μ)
– Scale parameter: the standard deviation (σ)
$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$
Source: wikipedia.org
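The density formula can be written as a one-line anonymous function that needs no toolbox. In this sketch the name normalPdf and the parameter values mu = 1 and sigma = 2 are assumptions for illustration.

```matlab
% Normal density written out from the formula (no toolbox required).
normalPdf = @(x, mu, sigma) exp(-(x - mu).^2 ./ (2 * sigma.^2)) ./ (sqrt(2 * pi) * sigma);

mu = 1;        % location parameter (assumed value)
sigma = 2;     % scale parameter (assumed value)

peak       = normalPdf(mu, mu, sigma);                                       % 1/(sqrt(2*pi)*sigma)
totalArea  = integral(@(x) normalPdf(x, mu, sigma), -Inf, Inf);              % should be 1
withinOneS = integral(@(x) normalPdf(x, mu, sigma), mu - sigma, mu + sigma); % about 0.6827
```

The last value is the familiar "about 68% of values lie within one standard deviation of the mean" rule.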

[Figure: PDF for the normal distribution]

Cumulative distribution function
Cumulative distribution function (CDF): how likely is X to be less than or equal to a particular value.
$\Pr[X \le x] = F(x)$
The CDF is the integral of the PDF: $F(x) = \int_{-\infty}^{x} f(t)\,dt$.
The PDF is the derivative of the CDF. Therefore, the parts of the CDF with the steepest slope are the highest points of the PDF, i.e. where most of the values lie.
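The integral and derivative relationship between PDF and CDF can be sketched numerically for the standard normal distribution. The grid and variable names are assumptions; the closed form of the normal CDF in terms of erf is a standard identity.

```matlab
x = linspace(-5, 5, 2001);                 % evaluation grid (assumed range)
f = exp(-x.^2 / 2) / sqrt(2 * pi);         % standard normal PDF

% CDF as the running integral of the PDF (trapezoid rule, starting at x = -5).
Fnumeric = cumtrapz(x, f);

% Closed form of the standard normal CDF via the error function.
Fexact = 0.5 * (1 + erf(x / sqrt(2)));
maxCdfError = max(abs(Fnumeric - Fexact)); % small; dominated by the ignored tail below -5

% PDF as the derivative of the CDF (finite differences at interval midpoints).
slope = diff(Fexact) ./ diff(x);
xMid  = (x(1:end-1) + x(2:end)) / 2;
fMid  = exp(-xMid.^2 / 2) / sqrt(2 * pi);
maxPdfError = max(abs(slope - fMid));

% The CDF is steepest where the PDF peaks, here at the mean (x = 0).
[~, iSteepest] = max(slope);
xSteepest = xMid(iSteepest);
```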

[Figure: CDF for the normal distribution]

Expected Value
The expected value of a random variable is its mean. You can calculate the expected value of a random variable X by taking the weighted average of all its possible values. The weights are the probabilities of X taking each value.
Discrete RV: $E[X] = x_1 p_1 + x_2 p_2 + \cdots + x_k p_k$
Continuous RV: $E[X] = \int_{-\infty}^{\infty} x f(x)\,dx$
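Both forms of the expected value can be computed in a few lines. This sketch assumes a fair six-sided die for the discrete case and a normal density with mu = 1 and sigma = 2 for the continuous case; neither example comes from the slides.

```matlab
% Discrete case: probability-weighted average of the possible values.
x = 1:6;
p = ones(1, 6) / 6;               % fair die (assumed example)
expectedDiscrete = sum(x .* p);   % 3.5

% Continuous case: integrate x * f(x) over all values.
mu = 1;
sigma = 2;                        % assumed parameters
f = @(x) exp(-(x - mu).^2 ./ (2 * sigma^2)) ./ (sqrt(2 * pi) * sigma);
expectedContinuous = integral(@(x) x .* f(x), -Inf, Inf);   % should be close to mu
```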

Sample mean
Sampling: When we measure some quantity in an experiment, we think of it as taking samples from a distribution.
Sample mean: By taking the average, we are estimating the mean or expected value of the underlying distribution which generated these quantities.
A central problem in statistics: How close is this estimate of the mean (the average of our samples) to the true, underlying mean?

Standard Error of the Mean
Suppose we make N measurements of X, sampling from a normal distribution with mean μ and standard deviation σ.
If we take the average of these N samples, our estimate of the mean is itself normally distributed.
The mean of this sampling distribution is μ.
The standard error is σ / sqrt(N).
This means that on average, our estimate will be correct (the sample mean is unbiased). The spread around the true mean shrinks as 1/sqrt(N).
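A simulation makes the σ / sqrt(N) claim tangible: repeat the same N-sample experiment many times and look at the spread of the resulting sample means. All parameter values below are assumptions chosen for illustration.

```matlab
rng(0);                  % fix the random seed so the run is repeatable

mu = 10;                 % true mean (assumed)
sigma = 2;               % true standard deviation (assumed)
N = 25;                  % measurements per experiment
nExperiments = 20000;    % number of repeated experiments

% Each column is one experiment of N normally distributed measurements.
samples = mu + sigma * randn(N, nExperiments);
sampleMeans = mean(samples, 1);          % one estimate of the mean per experiment

meanOfEstimates = mean(sampleMeans);     % close to mu: the estimate is right on average
empiricalSEM    = std(sampleMeans);      % spread of the estimates
theoreticalSEM  = sigma / sqrt(N);       % 2 / 5 = 0.4

% In practice sigma is unknown, so the SEM is estimated from one experiment.
estimatedSEM = std(samples(:, 1)) / sqrt(N);
```

Quadrupling N should roughly halve empiricalSEM, since the spread shrinks as 1/sqrt(N).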

Standard Error of the Mean
Suppose we make N measurements of X, which may or may not be normally distributed (but has a finite mean μ and standard deviation σ).
If we take the average of these N samples, the distribution of our estimate of the mean approaches a normal distribution as N gets larger (central limit theorem).
The mean of this sampling distribution is μ.
The standard error is σ / sqrt(N).
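The central limit theorem can be illustrated with a clearly non-normal source distribution. This sketch assumes exponentially distributed measurements (mean 1, standard deviation 1) and uses the skewness of the sample means as a rough indicator of how far from normal they still are; both choices are illustrative, not taken from the slides.

```matlab
rng(0);                           % fix the random seed so the run is repeatable

nExperiments = 20000;             % repeated experiments per sample size
sigma = 1;                        % exponential with rate 1: mean 1, std 1
sampleSizes = [1 5 50];           % values of N to compare (assumed)

empiricalSEM = zeros(size(sampleSizes));
skewness     = zeros(size(sampleSizes));

for k = 1:numel(sampleSizes)
    N = sampleSizes(k);
    samples = -log(rand(N, nExperiments));   % exponential draws via inverse transform
    m = mean(samples, 1);                    % one sample mean per experiment
    empiricalSEM(k) = std(m);
    % Sample skewness: 0 for a symmetric (e.g. normal) distribution.
    skewness(k) = mean((m - mean(m)).^3) / std(m, 1)^3;
end

theoreticalSEM = sigma ./ sqrt(sampleSizes);
```

The empirical and theoretical standard errors should agree for every N, while the skewness falls toward 0 as N grows, which is the sense in which the sample mean "approaches" a normal distribution.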
