CS 147: Computer Systems Performance Analysis
Review of Statistics
2015-06-15

Introduction to Statistics
◮ Concentration on applied statistics
◮ Especially those useful in measurement
◮ Today’s lecture will cover 15 basic concepts
◮ You should already be familiar with them

1. Independent Events
◮ Occurrence of one event doesn’t affect probability of other
◮ Examples:
  ◮ Coin flips
  ◮ Inputs from separate users
  ◮ “Unrelated” traffic accidents
◮ What about second basketball free throw after the player misses the first?
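The definition of independence can be checked by exhaustive enumeration. The sketch below assumes two fair coin flips with equally likely outcomes; the helper `prob` and the event names are illustrative, not from the slides. It tests whether P(A and B) = P(A) P(B).

```python
from fractions import Fraction
from itertools import product

# Sample space: two fair coin flips, every outcome equally likely (assumption).
outcomes = list(product("HT", repeat=2))

def prob(event):
    """Probability of an event (a predicate over outcomes) in a uniform sample space."""
    favorable = sum(1 for o in outcomes if event(o))
    return Fraction(favorable, len(outcomes))

def first_heads(o):
    return o[0] == "H"

def second_heads(o):
    return o[1] == "H"

def both_heads(o):
    return first_heads(o) and second_heads(o)

# Independent events satisfy P(A and B) == P(A) * P(B).
independent = prob(both_heads) == prob(first_heads) * prob(second_heads)
print(independent)
```

The free-throw question is harder precisely because no such symmetric sample space can be assumed; independence there has to be tested against measured data.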

2. Random Variable
◮ Variable that takes values probabilistically
◮ Variable usually denoted by capital letters, particular values by lowercase
◮ Examples:
  ◮ Number shown on dice
  ◮ Network delay
  ◮ CS 70 attendance
◮ What about disk seek time?

3. Cumulative Distribution Function (CDF)
◮ Maps a value a to probability that the outcome is less than or equal to a:
    $F_x(a) = P(x \le a)$
◮ Valid for discrete and continuous variables
◮ Monotonically non-decreasing
◮ Easy to specify, calculate, measure
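One reason the CDF is easy to measure is that an empirical version follows directly from sorted samples. The following is a minimal sketch; the function name `empirical_cdf` and the delay values are made up for illustration.

```python
from bisect import bisect_right

def empirical_cdf(samples):
    """Return F where F(a) is the fraction of samples less than or equal to a."""
    ordered = sorted(samples)
    n = len(ordered)

    def F(a):
        # bisect_right counts how many sorted values are <= a.
        return bisect_right(ordered, a) / n

    return F

# Hypothetical measured network delays in milliseconds.
delays_ms = [12.0, 15.5, 9.8, 15.5, 20.1]
F = empirical_cdf(delays_ms)
print(F(5.0), F(15.5), F(25.0))
```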

CDF Examples
◮ Coin flip (T = 0, H = 1): [plot: CDF, x-axis 0 to 2, y-axis 0.0 to 1.0]
◮ Exponential packet interarrival times: [plot: CDF, x-axis 0 to 4, y-axis 0.0 to 1.0]

4. Probability Density Function (pdf)
◮ Derivative of (continuous) CDF:
    $f(x) = \frac{dF(x)}{dx}$
◮ Usable to find probability of a range:
    $P(x_1 < x \le x_2) = F(x_2) - F(x_1) = \int_{x_1}^{x_2} f(x)\,dx$
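The range identity can be checked numerically. This sketch assumes an exponential distribution with rate 1 (the rate, the interval endpoints, and the midpoint-rule integrator are all illustrative choices) and compares F(x2) − F(x1) with the integral of the pdf.

```python
import math

RATE = 1.0  # assumed rate of the exponential distribution

def pdf(x):
    return RATE * math.exp(-RATE * x) if x >= 0 else 0.0

def cdf(x):
    return 1.0 - math.exp(-RATE * x) if x >= 0 else 0.0

def integrate(f, lo, hi, steps=10_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    width = (hi - lo) / steps
    return sum(f(lo + (i + 0.5) * width) for i in range(steps)) * width

x1, x2 = 0.5, 2.0
by_cdf = cdf(x2) - cdf(x1)        # F(x2) - F(x1)
by_pdf = integrate(pdf, x1, x2)   # integral of f(x) from x1 to x2
print(by_cdf, by_pdf)
```

The two values agree up to the numerical integration error.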

Examples of pdf
◮ Exponential interarrival times: [plot: pdf, x-axis 0 to 4, y-axis 0.0 to 1.0]
◮ Gaussian (normal) distribution: [plot: pdf, x-axis 0 to 6, y-axis 0.00 to 0.25]

5. Probability Mass Function (pmf)
◮ CDF not differentiable for discrete random variables
◮ pmf serves as replacement: $f(x_i) = p_i$, where $p_i$ is the probability that x will take on the value $x_i$:
    $P(x_1 < x \le x_2) = F(x_2) - F(x_1) = \sum_{x_1 < x_i \le x_2} p_i$
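For a discrete variable the same range probability is a sum over the pmf. A small sketch, assuming a fair six-sided die (the dictionary `pmf` and the helper names are illustrative):

```python
from fractions import Fraction

# pmf of a fair six-sided die (assumption): p_i = 1/6 for each face.
pmf = {face: Fraction(1, 6) for face in range(1, 7)}

def cdf(a):
    """F(a) = P(x <= a), accumulated from the pmf."""
    return sum(p for x, p in pmf.items() if x <= a)

def prob_range(x1, x2):
    """P(x1 < x <= x2) as a direct sum of p_i over the range."""
    return sum(p for x, p in pmf.items() if x1 < x <= x2)

print(prob_range(2, 5), cdf(5) - cdf(2))
```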

Examples of pmf
◮ Coin flip: [plot: pmf, x-axis 0 to 1, y-axis 0.0 to 1.0]
◮ Typical CS grad class size: [plot: pmf, x-axis 27 to 32, y-axis 0.0 to 0.3]

6. Expected Value (Mean)
◮ Mean:
    $\mu = E(x) = \sum_{i=1}^{n} p_i x_i = \int_{-\infty}^{\infty} x f(x)\,dx$
◮ Summation if discrete
◮ Integration if continuous
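Both forms of the mean can be sketched side by side. The example assumes a fair die for the discrete case and an exponential distribution with rate 2 for the continuous case; the integral is truncated at an assumed upper limit of 50, where the remaining tail is negligible.

```python
import math
from fractions import Fraction

# Discrete case: fair six-sided die (assumption), mean is the sum of p_i * x_i.
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}
discrete_mean = sum(p * x for x, p in die_pmf.items())

# Continuous case: exponential pdf with an assumed rate of 2, so the mean is 1/2.
RATE = 2.0

def exp_pdf(x):
    return RATE * math.exp(-RATE * x)

def integrate(f, lo, hi, steps=100_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    width = (hi - lo) / steps
    return sum(f(lo + (i + 0.5) * width) for i in range(steps)) * width

# The pdf is zero below 0; the upper limit 50 stands in for infinity.
continuous_mean = integrate(lambda x: x * exp_pdf(x), 0.0, 50.0)
print(discrete_mean, continuous_mean)
```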

7. Variance
◮ Variance:
    $\mathrm{Var}(x) = E[(x - \mu)^2] = \sum_{i=1}^{n} p_i (x_i - \mu)^2 = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx$
◮ Often easier to calculate equivalent $E(x^2) - E(x)^2$
◮ Usually denoted $\sigma^2$; square root $\sigma$ is called standard deviation
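The equivalence of the two variance formulas can be confirmed with exact arithmetic. This is a sketch for a fair six-sided die (an assumed example distribution), using `Fraction` so that the comparison is not affected by rounding.

```python
import math
from fractions import Fraction

# pmf of a fair six-sided die (assumption).
pmf = {face: Fraction(1, 6) for face in range(1, 7)}

mean = sum(p * x for x, p in pmf.items())                        # E(x)
var_definition = sum(p * (x - mean) ** 2 for x, p in pmf.items())  # E[(x - mu)^2]
var_shortcut = sum(p * x * x for x, p in pmf.items()) - mean ** 2  # E(x^2) - E(x)^2

std_dev = math.sqrt(var_definition)  # sigma, the standard deviation
print(var_definition, var_shortcut, std_dev)
```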

8. Coefficient of Variation (C.O.V. or C.V.)
◮ Ratio of standard deviation to mean:
    $\mathrm{C.V.} = \frac{\sigma}{\mu}$
◮ Indicates how well mean represents the variable
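A short sketch of why the C.V. says how well the mean represents the data: the two made-up samples below share a mean of 10, but the second is far more spread out, and its C.V. is correspondingly larger. The helper name and the data are illustrative, and the population standard deviation is used as σ.

```python
import statistics

def coefficient_of_variation(samples):
    """Ratio of the population standard deviation to the mean."""
    return statistics.pstdev(samples) / statistics.mean(samples)

# Two hypothetical samples with the same mean but different spread.
tight = [9, 10, 11]
spread = [1, 10, 19]

cv_tight = coefficient_of_variation(tight)
cv_spread = coefficient_of_variation(spread)
print(cv_tight, cv_spread)
```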

9. Covariance
◮ Given x, y with means $\mu_x$ and $\mu_y$, their covariance is:
    $\mathrm{Cov}(x, y) = \sigma^2_{xy} = E[(x - \mu_x)(y - \mu_y)] = E(xy) - E(x)E(y)$
◮ Two typos on p.181 of book
◮ High covariance implies y departs from mean whenever x does

Covariance (cont’d)
◮ For independent variables, $E(xy) = E(x)E(y)$ so $\mathrm{Cov}(x, y) = 0$
◮ Reverse isn’t true: $\mathrm{Cov}(x, y) = 0$ does NOT imply independence
◮ If y = x, covariance reduces to variance
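A standard counterexample shows zero covariance without independence. In this sketch (the specific distribution is an assumption chosen for illustration), x is uniform on {-1, 0, 1} and y = x², so y is completely determined by x, yet the covariance is exactly zero.

```python
from fractions import Fraction

# x is uniform on {-1, 0, 1}; y = x**2 depends entirely on x (assumed example).
xs = [-1, 0, 1]
p = Fraction(1, 3)
pairs = [(x, x * x) for x in xs]

e_x = sum(p * x for x, _ in pairs)
e_y = sum(p * y for _, y in pairs)
e_xy = sum(p * x * y for x, y in pairs)
covariance = e_xy - e_x * e_y  # E(xy) - E(x)E(y)

# Independence would require P(x=0, y=0) == P(x=0) * P(y=0).
p_joint = sum(p for x, y in pairs if x == 0 and y == 0)
p_x0 = sum(p for x, _ in pairs if x == 0)
p_y0 = sum(p for _, y in pairs if y == 0)
independent = p_joint == p_x0 * p_y0

print(covariance, independent)
```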

10. Correlation Coefficient
◮ Normalized covariance:
    $\mathrm{Correlation}(x, y) = \rho_{xy} = \frac{\sigma^2_{xy}}{\sigma_x \sigma_y}$
◮ Always lies between -1 and 1
◮ Correlation of 1 ⇒ x is an increasing linear function of y; -1 ⇒ x is a decreasing linear function of y
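The normalization can be sketched for paired samples. The function below is an illustrative implementation that uses population (divide-by-n) moments; the data are made up so that one series is an exact increasing linear function of x and the other an exact decreasing one.

```python
import math

def correlation(xs, ys):
    """Covariance of the paired samples divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (std_x * std_y)

xs = [1, 2, 3, 4, 5]
rising = [2 * x + 1 for x in xs]     # exact increasing linear relationship
falling = [-3 * x + 5 for x in xs]   # exact decreasing linear relationship

rho_up = correlation(xs, rising)
rho_down = correlation(xs, falling)
print(rho_up, rho_down)
```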

11. Mean and Variance of Sums
◮ For any random variables,
    $E(a_1 x_1 + \cdots + a_k x_k) = a_1 E(x_1) + \cdots + a_k E(x_k)$
◮ For independent variables,
    $\mathrm{Var}(a_1 x_1 + \cdots + a_k x_k) = a_1^2 \mathrm{Var}(x_1) + \cdots + a_k^2 \mathrm{Var}(x_k)$
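Both rules can be verified exactly on a small case. The sketch assumes two independent fair dice and the arbitrary weights a1 = 2 and a2 = 3; it enumerates the joint distribution and compares the results with the two formulas.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
A1, A2 = 2, 3  # arbitrary weights chosen for illustration

# Mean and variance of a single fair die (assumption).
die_mean = Fraction(sum(faces), 6)
die_var = Fraction(sum(f * f for f in faces), 6) - die_mean ** 2

# Independent dice: each of the 36 joint outcomes has probability 1/36.
p = Fraction(1, 36)
values = [A1 * x1 + A2 * x2 for x1, x2 in product(faces, repeat=2)]
sum_mean = sum(p * v for v in values)
sum_var = sum(p * v * v for v in values) - sum_mean ** 2

print(sum_mean == A1 * die_mean + A2 * die_mean)
print(sum_var == A1 ** 2 * die_var + A2 ** 2 * die_var)
```

The variance rule relies on independence; for dependent variables, covariance terms would have to be added.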
