Descriptive Statistics Descriptive and Inferential Statistics - PowerPoint PPT Presentation

ST 380 Probability and Statistics for the Physical Sciences
Descriptive Statistics
Descriptive and Inferential Statistics
Recall that statistical methods are broadly divided into:
  Descriptive methods, which focus on the characteristics of a particular set of data;
  Inferential methods, which place the particular data set in a broader context, and allow us to relate what we see in the data to that broader context.
Descriptive statistical methods can also be broadly divided into:
  Graphical displays, such as bar charts;
  Numerical summaries, such as the mean and standard deviation.
Pictorial and Tabular Methods

Graphical Displays
Stem-and-Leaf Display
Introduced by John Tukey:
    FundRsng <- scan("Data/Example-01-01.txt")
    stem(FundRsng)
The numbers on the left are the “stems”, and the digits on the right are the “leaves”. So for instance the leaf “8” on the row with stem “4” stands for a data value of 48.
The original data are given to one decimal place, so the display contains almost the same information.
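The file Data/Example-01-01.txt is not reproduced here, so the following is only a sketch of a stem-and-leaf display on a small made-up vector. The name `demo` and its values are illustrative assumptions, not the course data. stem() prints a header line saying where the decimal point sits relative to the | separator, which is what tells you how to read each stem and leaf.

```r
# Hypothetical data, not the FundRsng values from the course file.
demo <- c(12, 15, 21, 24, 24, 37, 48, 48, 53)

# stem() prints the display to the console and returns NULL invisibly.
stem(demo)

# Capture the printed lines so the display can be inspected as text.
display <- capture.output(stem(demo))
```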

Histogram
The histogram is another graphical display that summarizes a set of data:
    hist(FundRsng)
The heights of the blocks in this histogram show the “frequency” of various ranges of the data:
  36 values between 0 and 10, inclusive;
  18 values between 10 and 20 (strictly, 10 < x ≤ 20);
  and so on.
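The interval convention (first bin closed on both ends, later bins open on the left and closed on the right) can be checked on a tiny example. This is a sketch with made-up values and hand-picked breaks, not the FundRsng data; `plot = FALSE` asks hist() to return the binning without drawing.

```r
# Hypothetical values chosen to sit on and near the bin edges.
x <- c(0, 5, 10, 10.5, 20, 20, 25)

# plot = FALSE returns the histogram object without drawing it.
h <- hist(x, breaks = c(0, 10, 20, 30), plot = FALSE)

# With the defaults right = TRUE and include.lowest = TRUE the bins are
# [0, 10], (10, 20], (20, 30], so the value 10 is counted in the first bin
# and both values of 20 are counted in the second.
counts <- h$counts
print(counts)   # 3 3 1
```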

Optionally, the histogram can display the “density” of the data:
    hist(FundRsng, freq = FALSE)
In this version, the areas of the blocks (width × height) are the fractions of the data that lie in the same ranges. The total area is therefore 1.
The heights of the blocks are
\[ \text{height} = \frac{\text{fraction}}{\text{width}} \]
which is called the “density” of the observations.
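As a rough check of the relation height = fraction / width, the density heights can be recomputed by hand and compared with what hist() stores. The data and breaks below are made up for illustration.

```r
# Hypothetical data and breaks, for illustration only.
x <- c(0, 5, 10, 10.5, 20, 20, 25)
h <- hist(x, breaks = c(0, 10, 20, 30), plot = FALSE)

widths    <- diff(h$breaks)          # width of each block
fractions <- h$counts / length(x)    # fraction of the data in each block
heights   <- fractions / widths      # density = fraction / width

# The block areas (width x height) add up to 1.
total_area <- sum(heights * widths)
```

The hand-computed `heights` should agree with `h$density`, the component that hist() draws when `freq = FALSE`.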

Numerical Summaries
Measures of Location
The average price of a new house sold in April, 2013, was $337,000 (http://www.fedprimerate.com/new%5Fhome%5Fsales%5Fprice%5Fhistory.htm). The median price in the same month was $279,300.
When we are looking at a large number of values, we may want to know only what is a “typical” value.
The mean and the median are two candidates for measuring the “typical” value.

Mean
Given n values \(x_1, x_2, \ldots, x_n\), the sample mean, or arithmetic average, is
\[ \bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{1}{n} \sum_{i=1}^{n} x_i. \]
    mean(FundRsng)
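The formula can be followed step by step on a short vector. The values below are made up for illustration; the point is only that summing and dividing by n gives the same number as mean().

```r
# Hypothetical values; n = 5 and the sum is 25.
x <- c(2, 4, 4, 5, 10)

n    <- length(x)
xbar <- sum(x) / n   # (x_1 + ... + x_n) / n

print(xbar)          # 5
```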

Median
The median is defined loosely as the value \(\tilde{x}\) such that half the x values lie below \(\tilde{x}\) and half lie above.
The simplest way, but not the only way, to find the median is to order the data as
\[ x_{(1)} \le x_{(2)} \le \cdots \le x_{(n)}. \]
If n is odd, n = 2m + 1, then the median \(\tilde{x}\) is the unique middle value in the ordered list:
\[ \tilde{x} = x_{(m+1)}. \]
If n is even, n = 2m, then \(\tilde{x}\) is the average of the two middle values:
\[ \tilde{x} = \frac{x_{(m)} + x_{(m+1)}}{2}. \]
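The odd and even cases translate directly into a few lines of R. The helper name `median_by_sorting` is hypothetical, introduced here only to mirror the definition; in practice the built-in median() does the same job.

```r
# Sketch of the sort-based definition; not a replacement for median().
median_by_sorting <- function(x) {
  s <- sort(x)        # x_(1) <= x_(2) <= ... <= x_(n)
  n <- length(s)
  m <- n %/% 2        # n = 2m + 1 (odd) or n = 2m (even)
  if (n %% 2 == 1) {
    s[m + 1]                  # unique middle value
  } else {
    (s[m] + s[m + 1]) / 2     # average of the two middle values
  }
}

odd_case  <- median_by_sorting(c(7, 1, 5))      # sorted: 1 5 7   -> 5
even_case <- median_by_sorting(c(7, 1, 5, 3))   # sorted: 1 3 5 7 -> 4
```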

Quantile
More generally, the p-th quantile divides the data into a fraction p that lie below the quantile, and a complementary fraction 1 − p that lie above it.
A quantile is usually calculated from the ordered values
\[ x_{(1)} \le x_{(2)} \le \cdots \le x_{(n)}, \]
but various specific rules have been proposed.
The quartiles correspond to p = 0.25, 0.75, and the deciles correspond to p = 0.1, 0.2, ..., 0.9.
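In R, the choice among these rules is exposed through the `type` argument of quantile(), which selects one of nine definitions (type 7 is the default). The sketch below uses the made-up data 1, 2, ..., 10 to show two rules giving different lower quartiles, and then computes quartiles and deciles with the default rule.

```r
# Hypothetical data: the integers 1 to 10.
x <- 1:10

# Two of the nine interpolation rules give different lower quartiles.
q7 <- quantile(x, probs = 0.25, type = 7, names = FALSE)   # 3.25
q6 <- quantile(x, probs = 0.25, type = 6, names = FALSE)   # 2.75

# Quartiles and deciles with the default rule.
quartiles <- quantile(x, probs = c(0.25, 0.75))
deciles   <- quantile(x, probs = seq(0.1, 0.9, by = 0.1))
```

The differences between rules shrink as n grows, so for large data sets the choice rarely matters much.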

Why Do We Need Both Mean and Median?
Suppose 5 houses sold for $260,000, $270,000, $280,000, $290,000 and $1,000,000.
    # Prices in thousands of dollars
    prices <- c(260, 270, 280, 290, 1000)
    mean(prices)    # 420
    median(prices)  # 280
The median is typical of the moderately priced houses, but the mean is not representative of either the moderate prices (all below $300,000) or the high-priced house.

But the mean is less affected by small variations:
    prices <- c(260, 270, 285, 290, 1000)
    mean(prices)    # 421
    median(prices)  # 285
The mean and median are two measures of location. Many others have been devised to meet various objectives.
The trimmed mean (drop the highest and lowest values, and average the rest) is used in some athletic events, and in finance.
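A trimmed mean of this kind can be sketched on the five house prices (in thousands of dollars), once by dropping the extremes by hand and once with the `trim` argument of mean(). With five values, `trim = 0.2` removes one observation from each end; that fraction is specific to this sample size.

```r
# House prices in thousands of dollars, as in the example above.
prices <- c(260, 270, 280, 290, 1000)

# Drop the lowest and highest values by hand, then average the rest.
s <- sort(prices)
trimmed_manual <- mean(s[-c(1, length(s))])   # mean of 270, 280, 290 = 280

# Same result via mean(): trim is the fraction removed from EACH end.
trimmed_builtin <- mean(prices, trim = 0.2)
```

Unlike the ordinary mean of 420, the trimmed mean of 280 is not pulled upward by the $1,000,000 sale.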

Measures of Variability
The prices (in thousands of dollars)
    260, 270, 280, 290, 300
and
    278, 279, 280, 281, 282
both have mean $280,000, but are very different in variability.
Once we know the typical value (a measure of location), the next most interesting aspect of a set of data is usually how much they vary around that typical value (a measure of variability).

Standard Deviation
It is natural to measure variability by looking at how much the individual data values \(x_i\) differ from a location measure such as the mean \(\bar{x}\).
The sample variance is
\[ s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} \]
and the sample standard deviation is \(s = \sqrt{s^2}\).
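The variance formula can be traced by hand on the prices 260, 270, 280, 290, 300 (in thousands of dollars). This sketch recomputes the quantities that var() and sd() return; note the divisor n − 1 rather than n.

```r
# Prices in thousands of dollars.
x <- c(260, 270, 280, 290, 300)

n    <- length(x)
xbar <- mean(x)                       # 280
s2   <- sum((x - xbar)^2) / (n - 1)   # (400 + 100 + 0 + 100 + 400) / 4 = 250
s    <- sqrt(s2)                      # about 15.8
```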

Examples
    prices <- c(260, 270, 280, 290, 1000)
    sd(prices)  # 324.4
    prices <- c(260, 270, 280, 290, 300)
    sd(prices)  # 15.8
    prices <- c(278, 279, 280, 281, 282)
    sd(prices)  # 1.58
As with measures of location, many other measures of variability have been devised to meet various objectives.
For example, the interquartile range (IQR) is the difference between the upper and lower quartiles.
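For comparison with the standard deviations, the IQR of the same three price vectors can be computed with R's IQR(). The variable names are illustrative, and the values depend on R's default quantile rule (type 7); other rules may give slightly different quartiles.

```r
# The three price vectors from the examples, in thousands of dollars.
with_outlier <- c(260, 270, 280, 290, 1000)
spread_out   <- c(260, 270, 280, 290, 300)
tight        <- c(278, 279, 280, 281, 282)

# IQR = upper quartile - lower quartile (default quantile rule).
iqr_values <- c(IQR(with_outlier), IQR(spread_out), IQR(tight))
print(iqr_values)   # 20 20 2
```

The first two vectors have the same IQR even though their standard deviations differ by a factor of about 20, because the single extreme price does not move the quartiles.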

The Boxplot
The boxplot (or box-and-whisker plot) is a graphical summary that displays both a measure of location (the median) and a measure of variability (the interquartile range):
    boxplot(FundRsng)
The central box extends from the lower quartile to the upper quartile, and the median is shown by the bold line within the box.
A data value that differs from the nearer quartile by more than 1.5 × IQR is shown individually as a possible “outlier”.
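The 1.5 × IQR rule can be inspected without drawing anything by calling boxplot.stats(), the helper that boxplot() relies on. The sketch below applies it to the five house prices rather than the FundRsng data. One caveat: boxplot.stats() uses Tukey's hinges, which can differ slightly from the quartiles returned by quantile().

```r
# House prices in thousands of dollars.
prices <- c(260, 270, 280, 290, 1000)

# coef = 1.5 is the default multiplier for the outlier rule.
stats <- boxplot.stats(prices, coef = 1.5)

# Hinges are 270 and 290, so the upper fence is 290 + 1.5 * 20 = 320;
# only the 1000 lies beyond it.
flagged <- stats$out
print(flagged)   # 1000
```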
