Rcourse: Basic statistics with R
Sonja Grath, Noémie Becker & Dirk Metzler
Winter semester 2014-15

Contents
1. Theory of statistical tests
2. Test for a difference in means
3. Testing for dependence
   Nominal variables
   Continuous variables
   Ordinal variables
4. Power of a test
5. Degrees of freedom

Theory of statistical tests: A simple example
You want to show that a treatment is effective. You have data for two groups of patients, with and without treatment: 80% of the treated patients recovered, whereas only 30% of the untreated patients did. A pessimist would say that this just happened by chance. What do you do to convince the pessimist? You assume the pessimist is right and show that, under this hypothesis, the data would be very unlikely.

Theory of statistical tests: In statistical words
What you want to show is the alternative hypothesis $H_1$. The pessimist's claim (it happened by chance) is the null hypothesis $H_0$.
Show that the observation and everything more 'extreme' is sufficiently unlikely under this null hypothesis. By convention, it suffices that this probability is at most 5%. This refutes the pessimist.
Statistical language: we reject the null hypothesis at the 5% significance level.
$p = P(\text{observation and everything more 'extreme'} \mid H_0 \text{ is true})$
If the p-value is above 5%, you say that you cannot reject the null hypothesis.
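To make the definition of the p-value concrete, here is a minimal sketch for the treatment example. The slides give only percentages, so the group sizes (10 patients per group, i.e. 8 versus 3 recoveries) are an assumption made purely for illustration. Under the null hypothesis the number of recoveries among the treated patients follows a hypergeometric distribution, and the p-value sums the probabilities of the observed count and all more extreme counts.

```r
# Assumed (hypothetical) group sizes: 10 treated, 10 untreated patients.
recovered_treated   <- 8   # 80% of 10
recovered_untreated <- 3   # 30% of 10
n_treated   <- 10
n_untreated <- 10

total_recovered     <- recovered_treated + recovered_untreated          # 11
total_not_recovered <- n_treated + n_untreated - total_recovered        # 9

# P(8 or more recoveries among the treated | H0): observation and
# everything more extreme, from the hypergeometric distribution.
p_manual <- sum(dhyper(recovered_treated:n_treated,
                       m = total_recovered,
                       n = total_not_recovered,
                       k = n_treated))

# The same one-sided p-value from Fisher's exact test.
tab <- matrix(c(recovered_treated, recovered_untreated,
                n_treated - recovered_treated,
                n_untreated - recovered_untreated),
              nrow = 2,
              dimnames = list(group   = c("treated", "untreated"),
                              outcome = c("recovered", "not recovered")))
p_fisher <- fisher.test(tab, alternative = "greater")$p.value

print(p_manual)
print(p_fisher)
```

With these assumed group sizes the p-value is below 5%, so the pessimist's hypothesis would be rejected; with smaller groups and the same percentages it need not be.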

Theory of statistical tests: Statistical tests in R
There is a huge variety of statistical tests that you can perform in R. We will cover the most basic ones in this lecture, and you can find a non-exhaustive list in your lecture notes.

Test for a difference in means: Student's t-test (outline)
What is given? Independent observations $(x_1, \dots, x_n)$ and $(y_1, \dots, y_m)$.
Null hypothesis: x and y are samples from distributions having the same mean.
R command: t.test(x, y)
Idea of the test: if the sample means are too far apart, reject the null hypothesis.
The test is approximate but rather robust.
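A minimal sketch of the command in use. The two samples are simulated here (their sizes, means and standard deviations are arbitrary assumptions, not course data), simply to show what t.test() returns.

```r
set.seed(1)                          # make the simulated data reproducible
x <- rnorm(20, mean = 10, sd = 2)    # hypothetical sample 1
y <- rnorm(25, mean = 12, sd = 2)    # hypothetical sample 2

res <- t.test(x, y)                  # two-sided, unpaired; Welch version by default

print(res$estimate)                  # the two sample means
print(res$statistic)                 # the t statistic
print(res$p.value)                   # compare with the 5% level
```

The result is an object of class "htest"; printing res itself shows the full summary, including the confidence interval for the difference in means.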

Test for a difference in means: Martian example
Dataset containing the heights of Martians of different colours. See the code on the R console.
We cannot reject the null hypothesis. It was an unpaired test because the two samples are independent.
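The course dataset is not reproduced in these slides, so the following sketch uses a made-up data frame (the name martians, the column names colour and height, and all values are assumptions) to show how an unpaired test on grouped data might look with the formula interface of t.test().

```r
set.seed(42)
# Hypothetical stand-in for the Martian data: two colours, 15 individuals each.
martians <- data.frame(
  colour = rep(c("green", "red"), each = 15),
  height = rnorm(30, mean = 70, sd = 8)
)

# Unpaired test: height explained by the two-level grouping variable colour.
res <- t.test(height ~ colour, data = martians)

print(res$estimate)   # mean height per colour
print(res$p.value)
```

The formula interface requires the grouping variable to have exactly two levels; with more colours, the data would first need to be subset to the two groups being compared.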

Test for a difference in means: Shoe example
Dataset containing the wear of shoes made of two materials, A and B. The same persons have worn both types of shoes, and we have a measure of the wear of each shoe.
This calls for a paired test, because some persons will cause more damage to their shoes than others. See the code on the R console.
We can reject the null hypothesis.
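A sketch of why pairing matters, using simulated values rather than the course data (the variable names and all numbers are assumptions). Each person contributes a large individual effect to both measurements; the paired test works on the within-person differences and is equivalent to a one-sample t-test on those differences.

```r
set.seed(7)
n_persons <- 10
# Hypothetical data: each person has an individual level of wear ...
person_effect <- rnorm(n_persons, mean = 10, sd = 3)
# ... and material B wears slightly more than material A.
wear_A <- person_effect + rnorm(n_persons, sd = 0.3)
wear_B <- person_effect + 0.5 + rnorm(n_persons, sd = 0.3)

paired     <- t.test(wear_A, wear_B, paired = TRUE)
unpaired   <- t.test(wear_A, wear_B)          # ignores the pairing
one_sample <- t.test(wear_A - wear_B)         # same as the paired test

print(paired$p.value)
print(unpaired$p.value)
print(one_sample$p.value)
```

In data of this kind, the unpaired test has to compare the group means against the large between-person variation, while the paired test only sees the much smaller variation of the differences.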

Test for a difference in means: Test for (un)equality of variances
In t.test() the option var.equal= controls whether the variances of the two samples are assumed to be equal. The default value is FALSE. If you have a good biological reason, you can assume that the variances are equal. You can test for equality of variances with the command var.test(). Let's see an example on the R console.
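A minimal sketch with simulated samples (sizes and standard deviations are arbitrary assumptions) showing var.test() next to the two variants of t.test().

```r
set.seed(3)
x <- rnorm(30, mean = 5, sd = 1)   # hypothetical sample with small variance
y <- rnorm(30, mean = 5, sd = 3)   # hypothetical sample with large variance

# F test of the null hypothesis that both variances are equal.
vt <- var.test(x, y)
print(vt$p.value)

# Default: variances not assumed equal (Welch's t-test).
welch <- t.test(x, y)
# Variances assumed equal: classical pooled-variance t-test.
pooled <- t.test(x, y, var.equal = TRUE)

print(welch$parameter)    # Welch degrees of freedom, usually not an integer
print(pooled$parameter)   # n + m - 2 degrees of freedom
```

The two variants differ in the degrees of freedom they use: the pooled test has n + m - 2, while the Welch approximation never has more than that.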

Testing for dependence
The test depends on the data type:
Nominal variables: not ordered, like eye colour or gender
Ordinal variables: ordered but not continuous, like the result of a die roll
Continuous variables: like body height
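As a rough illustration of how these three data types are commonly represented in R (the variable names and values below are made up), nominal variables can be stored as factors, ordinal variables as ordered factors, and continuous variables as numeric vectors.

```r
# Nominal: unordered categories.
eye_colour <- factor(c("blue", "brown", "green", "brown"))

# Ordinal: ordered categories, here the six faces of a die.
dice <- factor(c(3, 6, 1, 3), levels = 1:6, ordered = TRUE)

# Continuous: plain numeric values (body height in cm).
height <- c(172.5, 181.0, 165.2, 190.3)

print(is.ordered(eye_colour))   # is the factor ordered?
print(is.ordered(dice))
print(dice[1] < dice[2])        # comparisons are defined for ordered factors
print(is.numeric(height))
```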

Testing for dependence: Nominal variables (outline)
What is given? Pairwise observations $(x_1, y_1), (x_2, y_2), \dots, (x_n, y_n)$
Null hypothesis: x and y are independent
Test: $\chi^2$ test
R command: chisq.test(x, y), or chisq.test() applied to a contingency table
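A minimal sketch of both ways to call the test. The two nominal variables are simulated independently of each other (names, categories and sample size are assumptions), so the null hypothesis is true by construction here.

```r
set.seed(11)
n <- 200
# Hypothetical pairwise observations of two nominal variables.
eye  <- sample(c("blue", "brown", "green"), n, replace = TRUE)
hair <- sample(c("dark", "fair"), n, replace = TRUE)

# Variant 1: pass the two vectors directly.
res_vectors <- chisq.test(eye, hair)

# Variant 2: build the contingency table first and pass it.
tab <- table(eye, hair)
res_table <- chisq.test(tab)

print(tab)
print(res_table$expected)    # expected counts under independence
print(res_table$p.value)
```

Both calls test the same contingency table and give the same p-value; for a 3 x 2 table the test has (3 - 1) * (2 - 1) = 2 degrees of freedom.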
