Measures of Variation Summary of Section 9.2 Range The difference - PowerPoint PPT Presentation

Measures of Variation: Summary of Section 9.2

Range: the difference Largest Data − Smallest Data in a sample.

Deviation from the Mean: $x_i - \bar{x}$.

Variance: $s^2 = \frac{\sum (x_i - \bar{x})^2}{n-1} = \frac{\sum x_i^2 - n\bar{x}^2}{n-1}$.

Standard Deviation: $s = \sqrt{s^2}$.

These are random variables called the Sample Variance and the Sample Standard Deviation. For a random variable $X$, $\mu = E(X)$ is called the mean. The variance $\mathrm{Var}(X)$ is $\sigma^2 = \mathrm{Var}(X) = E((X-\mu)^2)$.

Main property (the explanation for dividing by $n-1$): if the $X_i$ are i.i.d. with the distribution of $X$ and you set $S^2 = \frac{\sum (X_i - \bar{X})^2}{n-1}$, then its expected value is $E(S^2) = \sigma^2$. This is not true for the standard deviation: $E(S) \neq \sigma$.

Grouped Data: $s = \sqrt{\frac{\sum f_i x_{M,i}^2 - n\bar{x}^2}{n-1}}$, where $f_i$ is the frequency and $x_{M,i}$ the midpoint of the $i$-th class.

Dan Barbasch, Math 1105, Chapter 9, Week of October 2
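The claim that dividing by $n-1$ makes the expected value of $S^2$ equal to $\sigma^2$, while the expected value of $S$ falls short of $\sigma$, can be checked numerically. The following is only a rough simulation sketch: the normal population, the sample size 5, the value $\sigma = 2$, and the function names are all illustrative assumptions, not part of the slides.

```python
import random

def sample_variance(xs):
    """Sample variance with the n - 1 divisor (definition formula)."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)

def sample_variance_shortcut(xs):
    """Same quantity via the shortcut: (sum of squares - n * mean^2) / (n - 1)."""
    n = len(xs)
    mean = sum(xs) / n
    return (sum(x * x for x in xs) - n * mean * mean) / (n - 1)

def average_s2_and_s(n=5, trials=20000, sigma=2.0, seed=0):
    """Average S^2 and S over many samples of size n (assumed normal population)."""
    rng = random.Random(seed)
    s2_values = []
    for _ in range(trials):
        xs = [rng.gauss(0.0, sigma) for _ in range(n)]
        s2_values.append(sample_variance(xs))
    mean_s2 = sum(s2_values) / trials
    mean_s = sum(v ** 0.5 for v in s2_values) / trials
    return mean_s2, mean_s

mean_s2, mean_s = average_s2_and_s()
print("average S^2:", mean_s2, "(sigma^2 = 4)")
print("average S:  ", mean_s, "(sigma = 2)")
```

With these settings the average of $S^2$ should land near 4, while the average of $S$ should come out noticeably below 2.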

Examples I

Example (Range). Data: 15, −3, 4, 7, 18. The smallest is −3 and the largest is 18, so Range = 18 − (−3) = 21. The range is always a nonnegative number.

Example (Deviation from the Mean). In the previous example, $\bar{x} = \frac{15 - 3 + 4 + 7 + 18}{5} = 8.2$. So the deviations are
15 − 8.2 = 6.8, −3 − 8.2 = −11.2, 4 − 8.2 = −4.2, 7 − 8.2 = −1.2, 18 − 8.2 = 9.8.

Example (Variance and Standard Deviation).
$s^2 = \frac{6.8^2 + 11.2^2 + 4.2^2 + 1.2^2 + 9.8^2}{4} = \frac{15^2 + 3^2 + 4^2 + 7^2 + 18^2 - 5 \cdot 8.2^2}{4} = 71.7$, and $s = \sqrt{s^2} \approx 8.47$.
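The arithmetic in these three examples can be reproduced with Python's standard `statistics` module, whose `variance` and `stdev` functions use the $n-1$ divisor. This is a minimal sketch for checking the numbers by machine; the variable names are illustrative.

```python
import statistics

data = [15, -3, 4, 7, 18]

data_range = max(data) - min(data)         # largest minus smallest
mean = statistics.mean(data)               # sample mean
deviations = [x - mean for x in data]      # deviations from the mean
variance = statistics.variance(data)       # sample variance (divides by n - 1)
std_dev = statistics.stdev(data)           # square root of the sample variance

print("range:", data_range)
print("mean:", mean)
print("deviations:", [round(d, 1) for d in deviations])
print("variance:", round(variance, 4))
print("standard deviation:", round(std_dev, 4))
```

A useful sanity check is that the deviations from the mean always add up to zero (up to rounding).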

Examples II

Example (Binomial Distribution). $P(X = 1) = p$, $P(X = 0) = 1 - p$. Then $\mu = E(X) = p$, and
$\sigma^2 = E((X - p)^2) = (1 - p)^2 p + (0 - p)^2 (1 - p) = p(1 - p)$.
This is the same as $E(X^2 - p^2) = (1 - p^2)p + (-p^2)(1 - p) = (1 - p)p$.

Remark: Note that the formula for the sample variance and sample standard deviation only holds for $n \ge 2$. Otherwise, for $n = 1$, you would be dividing by 0.

For one random variable, the variance is defined as $\mathrm{Var}(X) = E((X - E(X))^2)$. For two independent random variables $X_1, X_2$, $\mathrm{Var}(X_1 + X_2) = \mathrm{Var}(X_1) + \mathrm{Var}(X_2)$.

Suppose $X$ is a random variable. We can write a table:

X     | a_1 | a_2 | ... | a_n
P(X)  | p_1 | p_2 | ... | p_n
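Both facts on this slide, the variance $p(1-p)$ and the additivity of variance for independent variables, can be verified exactly by enumerating a finite distribution. The sketch below represents a distribution as a dictionary from values to probabilities; the helper names and the choice $p = 3/10$ are assumptions made only for illustration.

```python
from fractions import Fraction
from itertools import product

def expectation(dist, g=lambda v: v):
    """E(g(X)) for a finite distribution given as {value: probability}."""
    return sum(g(v) * p for v, p in dist.items())

def variance(dist):
    """Var(X) = E((X - mu)^2)."""
    mu = expectation(dist)
    return expectation(dist, lambda v: (v - mu) ** 2)

def bernoulli(p):
    """P(X = 1) = p, P(X = 0) = 1 - p."""
    return {1: p, 0: 1 - p}

def sum_of_independent(d1, d2):
    """Distribution of X1 + X2 when X1 and X2 are independent."""
    out = {}
    for (v1, p1), (v2, p2) in product(d1.items(), d2.items()):
        out[v1 + v2] = out.get(v1 + v2, 0) + p1 * p2
    return out

p = Fraction(3, 10)  # illustrative value of p
x = bernoulli(p)
print(variance(x), p * (1 - p))                              # both equal p(1 - p)
print(variance(sum_of_independent(x, x)), 2 * variance(x))   # variances add
```

Using `Fraction` keeps the arithmetic exact, so the two sides of each identity agree exactly rather than up to rounding.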

Examples III

For the expected value $\mu = E(X)$, you multiply the two terms in each column and add:
$\sum_i a_i p_i = a_1 p_1 + \dots + a_n p_n$.

In a spreadsheet program, the data would be in columns and you would add the products from the rows. You use a command like SUMPRODUCT to perform the operation.

If you have some other variable like $(X - \mu)^2$, you would use the values $(a_i - \mu)^2$ and the same $p_i$.

Examples IV

Example.

X          | 2           | 3           | −1           | 1
X^2        | 4           | 9           | 1            | 1
(X − µ)^2  | (2 − 5/4)^2 | (3 − 5/4)^2 | (−1 − 5/4)^2 | (1 − 5/4)^2
P(X)       | 1/2         | 1/8         | 1/4          | 1/8

The expected values are computed as follows.

$\mu = E(X) = (2)(1/2) + (3)(1/8) + (-1)(1/4) + (1)(1/8) = 5/4$.

$\mathrm{Var}(X) = (2 - 5/4)^2 (1/2) + (3 - 5/4)^2 (1/8) + (-1 - 5/4)^2 (1/4) + (1 - 5/4)^2 (1/8) = 31/16$.
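The spreadsheet-style "multiply the columns and add" computation can be mimicked in a few lines. The sketch below defines a small `sumproduct` helper (a hypothetical stand-in for the spreadsheet command, not a library function) and applies it to the table in this example, using exact fractions.

```python
from fractions import Fraction

def sumproduct(column_a, column_b):
    """Add up the row-by-row products of two equally long columns."""
    return sum(a * b for a, b in zip(column_a, column_b))

values = [2, 3, -1, 1]
probs = [Fraction(1, 2), Fraction(1, 8), Fraction(1, 4), Fraction(1, 8)]

mu = sumproduct(values, probs)                                      # E(X)
var = sumproduct([(a - mu) ** 2 for a in values], probs)            # E((X - mu)^2)
var_shortcut = sumproduct([a * a for a in values], probs) - mu ** 2  # E(X^2) - mu^2

print("E(X)   =", mu)
print("Var(X) =", var)
print("E(X^2) - mu^2 =", var_shortcut)
```

The last line uses the X^2 row of the table, which gives the same variance by the shortcut $E(X^2) - \mu^2$.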

Normal Distribution I

Definition. Data are said to be normally distributed if the rate at which the frequencies fall off is proportional to the distance of the score from the mean, and to the frequencies themselves.

This definition requires Calculus. We don't assume or do Calculus in this course. We will, however, learn how to work with this distribution. It is very useful because many phenomena can be modeled by it. We will see how the binomial distribution is related to the normal distribution later in the chapter.

Suppose you have a random variable $X$, and you would like to know about its mean $\mu$. So you perform a large number $n$ of independent trials and draw a histogram. The larger the $n$, the more closely the outcome will resemble the curve $f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{(x-\mu)^2}{2\sigma^2}}$. The pictures in the text show what it looks like. The resulting probability distribution is called $N(\mu, \sigma^2)$, normal with mean $\mu$ and variance $\sigma^2$.

Normal Distribution II

There is a precise statement called the Central Limit Theorem, which says that for large $n$, $\sqrt{n}\,(S_n - \mu)$ "looks" like a normal distribution $N(0, \sigma^2)$, where $S_n$ is the average of the $n$ outcomes. It is used in practice to model large populations and "errors". There are many examples that can be approximated by normal distributions; heights of people and scores on tests are two of them.

This is not a finite distribution. For a random variable $X$ that is normally distributed $N(\mu, \sigma^2)$,
$P(X \le a)$ = the area under the normal curve from $-\infty$ to $a$.
This is tabulated for $\mu = 0$ and $\sigma = 1$. The rest is computed by simple formulas involving arithmetic.
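To make "the area under the normal curve from $-\infty$ to $a$" concrete, the sketch below evaluates the curve $f(x)$ and compares a crude numerical area with the closed-form value obtained from the error function `math.erf`. The function names, the midpoint rule, and the cutoff of ten standard deviations standing in for $-\infty$ are illustrative assumptions.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """The normal curve f(x) with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def normal_cdf(a, mu=0.0, sigma=1.0):
    """P(X <= a), written in terms of the error function."""
    return 0.5 * (1.0 + math.erf((a - mu) / (sigma * math.sqrt(2.0))))

def area_under_curve(a, mu=0.0, sigma=1.0, steps=20000):
    """Midpoint-rule area from mu - 10*sigma (standing in for -infinity) to a."""
    lo = mu - 10.0 * sigma
    width = (a - lo) / steps
    total = sum(normal_pdf(lo + (i + 0.5) * width, mu, sigma) for i in range(steps))
    return total * width

print(area_under_curve(1.0), normal_cdf(1.0))            # standard normal, a = 1
print(area_under_curve(70.0, 64.0, 3.0), normal_cdf(70.0, 64.0, 3.0))
```

The second line uses $\mu = 64$, $\sigma = 3$ only as an example of a non-standard normal; the same area is obtained from the standard table after converting 70 to $z = 2$.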

Height Example I

Example (from the practice prelim). 8. (14 points) Assume that the height in inches of American women follows a normal distribution with mean $\mu = 64''$ (5'4") and standard deviation $\sigma = 3''$.

(a) (3 points) How many standard deviations above or below the mean is a height of 73" (6'1")?
(b) (4 points) What fraction of women are taller than 73 inches?
(c) (4 points) In a room with 30 women, what is the probability that at least one of them is taller than 73"?
(d) (3 points) What assumptions did you make when answering part (c)? Are there circumstances under which those assumptions would not be justified?

Height Example II

Answer.

(a) $(73 - 64)/3 = 3$, so 73" is 3 standard deviations above the mean.

(b) $P(X \ge 73) = P(X - 64 \ge 73 - 64 = 9 = 3\sigma) = P\left(\frac{X - \mu}{\sigma} \ge 3\right) = 1 - P\left(\frac{X - \mu}{\sigma} \le 3\right) = 1 - 0.999 = 0.001$.
This is 1/1000. The random variable $X$ has probability distribution $N(64, 9)$, since $\sigma^2 = 3^2 = 9$. The probability $P(X \ge 73)$ comes from this normal distribution. To actually look it up in the tables, you rewrite it in terms of $Z = \frac{X - 64}{3}$, which has probability distribution $N(0, 1)$. This is the one in the tables.

(c) Treating the 30 heights as independent,
$P(\text{at least 1 of 30 is} \ge 73) = 1 - P(\text{all 30 are} \le 73) = 1 - P(X \le 73)^{30} = 1 - (0.999)^{30} \approx 0.03$.
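Parts (a) through (c) can be checked with `statistics.NormalDist` from the Python standard library. This is a sketch for verification only; it uses unrounded normal probabilities, so the results differ slightly from figures obtained with a table value rounded to three decimals, and part (c) assumes the 30 heights are independent.

```python
from statistics import NormalDist

height = NormalDist(mu=64, sigma=3)   # heights in inches

# (a) z-value of 73 inches
z = (73 - height.mean) / height.stdev

# (b) fraction of women taller than 73 inches
p_taller = 1 - height.cdf(73)

# (c) probability that at least one of 30 independent women is taller than 73 inches
p_at_least_one = 1 - height.cdf(73) ** 30

print("z-value:", z)
print("P(X >= 73):", round(p_taller, 5))
print("P(at least one of 30):", round(p_at_least_one, 4))
```

The independence assumption in part (c) is exactly what part (d) asks about; it would fail, for example, if the women in the room were selected because of their height.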

z-value

The principle is
$X$ normal $N(\mu, \sigma^2)$ $\iff$ $Z = \frac{X - \mu}{\sigma}$ normal $N(0, 1)$.
So $P(X \le a) = P\left(Z \le \frac{a - \mu}{\sigma}\right)$.
$z = \frac{a - \mu}{\sigma}$ is called the z-value. This is what you look up in the tables.

Example with Grades

Example. A professor (not this one!) of a course wants to give grades so that

A | top 8%
B | next 20% below A
C | the rest
D | next 20% above the F
F | bottom 8%

The mean is $\mu = 67$ and the standard deviation is $\sigma = 17$. Find the cutoffs.

Answer. From the tables:

P(≤ A) = 0.92 | z = 1.41  | a = µ + zσ = 67 + 17 · 1.41 = 91
P(≤ B) = 0.72 | z = 0.58  | a = µ + zσ = 67 + 17 · 0.58 = 77
P(≤ C) = 0.28 | z = −0.59 | a = µ + zσ = 67 + 17 · (−0.59) = 57
P(≤ D) = 0.08 | z = −1.39 | a = µ + zσ = 67 + 17 · (−1.39) = 43

In Excel or similar programs you can write norminv(0.92, 67, 17) ≅ 91.
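The table lookup can be replaced by an inverse normal function, in the same spirit as the spreadsheet `norminv` call. The sketch below uses `NormalDist.inv_cdf` from the Python standard library; the dictionary of cumulative fractions simply restates the four probabilities from the answer, and the z-values it produces are unrounded, so they differ a little from two-decimal table values.

```python
from statistics import NormalDist

mu, sigma = 67, 17
standard = NormalDist()   # N(0, 1), the distribution in the tables

# Fraction of the class at or below each cutoff
cumulative = {"A": 0.92, "B": 0.72, "C": 0.28, "D": 0.08}

cutoffs = {}
for letter, p in cumulative.items():
    z = standard.inv_cdf(p)            # z-value with P(Z <= z) = p
    cutoffs[letter] = mu + z * sigma   # a = mu + z * sigma
    print(letter, "z =", round(z, 2), "cutoff =", round(cutoffs[letter]))
```

After rounding to whole points, the cutoffs agree with the ones found from the tables.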

Approximate Binomial Distribution by the Normal Distribution I

Theorem. Let $B = X_1 + \dots + X_n$ be the binomial distribution, obtained by adding up i.i.d. copies $X_i$ of $X$ with $P(X = 1) = p$, $P(X = 0) = 1 - p$. Then
$E(B) = np$, $\mathrm{Var}(B) = np(1 - p)$.
The normal approximation of the binomial distribution is
$P(B \le a) \simeq P\left(Z \le \frac{a - np}{\sqrt{np(1 - p)}}\right)$,
where $Z$ has the normal distribution $N(0, 1)$. In other words, the binomial distribution is approximately the normal distribution with the same mean and variance, $N(np, np(1 - p))$.
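A quick way to see how good the approximation is: compute $P(B \le a)$ exactly from the binomial formula and compare it with the normal value. The sketch below does this for the illustrative choice $n = 100$, $p = 0.5$, $a = 55$ (these numbers are assumptions, not from the slides). The optional `correction` argument shifts $a$ by a half unit, the continuity correction commonly used with this approximation.

```python
import math
from statistics import NormalDist

def binomial_cdf(a, n, p):
    """Exact P(B <= a) for a binomial with n trials and success probability p."""
    return sum(math.comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(a + 1))

def normal_approx_cdf(a, n, p, correction=0.0):
    """Normal approximation P(Z <= (a + correction - np) / sqrt(np(1 - p)))."""
    mean = n * p
    std_dev = math.sqrt(n * p * (1 - p))
    return NormalDist().cdf((a + correction - mean) / std_dev)

n, p, a = 100, 0.5, 55   # illustrative values
exact = binomial_cdf(a, n, p)
plain = normal_approx_cdf(a, n, p)
corrected = normal_approx_cdf(a, n, p, correction=0.5)

print("exact:", round(exact, 4))
print("normal approximation:", round(plain, 4))
print("with half-unit correction:", round(corrected, 4))
```

For these values the half-unit shift brings the normal value much closer to the exact one, which is why whole numbers are usually boxed in by half-integers when the approximation is applied.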

Approximate Binomial Distribution by the Normal Distribution II

Remember the notation $N(\mu, \sigma^2)$ for the normal distribution: $\sigma^2$ is the variance, and its square root $\sigma$ is the standard deviation.

Example. Approximate $C(100, 50)$.

Approximate Binomial Distribution by the Normal Distribution III

Answer. Use the binomial distribution with $p = 0.5$ and $n = 100$:
$C(100, 50) \cdot (0.5)^{100} \simeq P(49.5 < B < 50.5)$.
The mean is $100 \cdot 0.5 = 50$. The standard deviation is $\sqrt{100 \cdot 0.5 \cdot 0.5} = 5$. So
$C(100, 50) \simeq 2^{100} \cdot P\left(\frac{49.5 - 50}{5} \le Z \le \frac{50.5 - 50}{5}\right) = 2^{100} \cdot (0.5398 - 0.4602) \approx 0.08 \cdot 2^{100}$.
You have to box in 50 by two numbers: 49.5 < 50 < 50.5 is a reasonable choice.
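Since Python can compute $C(100, 50)$ exactly with `math.comb`, the quality of this estimate is easy to inspect. The sketch below compares the exact probability $C(100, 50) \cdot (0.5)^{100}$ with the normal area between the z-values of 49.5 and 50.5; the variable names are illustrative.

```python
import math
from statistics import NormalDist

n, p = 100, 0.5
mean = n * p                           # 50
std_dev = math.sqrt(n * p * (1 - p))   # 5

# Normal area between the z-values of 49.5 and 50.5
standard = NormalDist()
approx_prob = standard.cdf((50.5 - mean) / std_dev) - standard.cdf((49.5 - mean) / std_dev)

# Exact probability of exactly 50 successes
exact_prob = math.comb(100, 50) / 2 ** 100

print("normal approximation of P(B = 50):", round(approx_prob, 4))
print("exact P(B = 50):                  ", round(exact_prob, 4))
print("estimate of C(100, 50):", approx_prob * 2 ** 100)
print("exact C(100, 50):      ", math.comb(100, 50))
```

Both probabilities round to 0.08, so the estimate $0.08 \cdot 2^{100}$ is within about one percent of the true binomial coefficient.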

Drug Effectiveness I

Example (Drug Effectiveness, Problem 24 in 9.4). A new drug cures 80% of the patients to whom it is administered. It is given to 25 patients. Find the probabilities that among these patients, the following results occur.

a. Exactly 20 are cured.
b. All are cured.
c. No one is cured.
d. Twelve or fewer are cured.
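One way to approach this problem is to treat the number of cured patients as a binomial random variable with $n = 25$ and $p = 0.8$ and evaluate the binomial formula directly. The sketch below does that; it assumes the 25 patients respond independently, and it is not necessarily the method the exercise intends, which may instead call for the normal approximation.

```python
import math

def binomial_pmf(k, n, p):
    """P(B = k) for a binomial with n trials and success probability p."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p = 25, 0.8

exactly_20 = binomial_pmf(20, n, p)                               # part a
all_cured = binomial_pmf(25, n, p)                                # part b
none_cured = binomial_pmf(0, n, p)                                # part c
twelve_or_fewer = sum(binomial_pmf(k, n, p) for k in range(13))   # part d: k = 0..12

print("a. exactly 20 cured:    ", exactly_20)
print("b. all cured:           ", all_cured)
print("c. no one cured:        ", none_cured)
print("d. twelve or fewer cured:", twelve_or_fewer)
```

Comparing these exact values with the normal approximation $N(20, 4)$ shows where the approximation works well (near the mean) and where it is rougher (far out in the tails).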


