CPSC 531: System Modeling and Simulation, Carey Williamson (PowerPoint PPT Presentation)


SLIDE 1

CPSC 531: System Modeling and Simulation

Carey Williamson
Department of Computer Science
University of Calgary
Fall 2017

SLIDE 2: Overview

▪ The world a model-builder sees is probabilistic rather than deterministic:
—Some probability model might well describe the variations

Goals:
▪ Review the fundamental concepts of probability
▪ Understand the difference between discrete and continuous random variables
▪ Review the most common probability models

SLIDE 3: Outline

▪ Probability and random variables
—Random experiment and random variable
—Probability mass/density functions
—Expectation, variance, covariance, correlation
▪ Probability distributions
—Discrete probability distributions
—Continuous probability distributions
—Empirical probability distributions

SLIDE 4: Outline

▪ Probability and random variables
—Random experiment and random variable
—Probability mass/density functions
—Expectation, variance, covariance, correlation
▪ Probability distributions
—Discrete probability distributions
—Continuous probability distributions
—Empirical probability distribution

SLIDE 5: Probability

▪ Probability is a measure of how likely it is for an event to happen
▪ We measure probability with a number between 0 and 1
▪ If an event is certain to happen, then the probability of the event is 1
▪ If an event is certain not to happen, then the probability of the event is 0
▪ Probability is widely used in mathematics, science, engineering, finance, and philosophy to draw conclusions about the likelihood of potential events and the underlying mechanics of complex systems

SLIDE 6: Random Experiment

▪ An experiment is called random if the outcome of the experiment is uncertain
▪ For a random experiment:
—The set of all possible outcomes is known before the experiment
—The outcome of the experiment is not known in advance
▪ The sample space Ω of an experiment is the set of all possible outcomes of the experiment
▪ Example: Consider the random experiment of tossing a coin twice. The sample space is:
Ω = {(H,H), (H,T), (T,H), (T,T)}

SLIDE 7: Probability of Events

▪ An event is a subset of the sample space
—Example 1: in tossing a coin twice, E = {(H,H)} is the event of having two heads
—Example 2: in tossing a coin twice, E = {(H,H), (H,T)} is the event of having a head in the first toss
▪ The probability of an event E is a numerical measure of the likelihood that event E will occur, expressed as a number between 0 and 1: 0 ≤ P(E) ≤ 1
—If all possible outcomes are equally likely: P(E) = |E| / |Ω|
—Probability of the sample space is 1: P(Ω) = 1

SLIDE 8: Joint Probability

▪ Probability that two events A and B occur in a single experiment:
P(A and B) = P(A ∩ B)
▪ Example: drawing a single card at random from a regular deck of cards, probability of getting a red king
—A: getting a red card
—B: getting a king
—P(A ∩ B) = 2/52

SLIDE 9: Independent Events

▪ Two events A and B are independent if the occurrence of one does not affect the occurrence of the other:
P(A ∩ B) = P(A) P(B)
▪ Example: drawing a single card at random from a regular deck of cards, probability of getting a red king
—A: getting a red card ⇒ P(A) = 26/52
—B: getting a king ⇒ P(B) = 4/52
—P(A ∩ B) = 2/52 = P(A) P(B) ⇒ A and B are independent

SLIDE 10: Mutually Exclusive Events

▪ Events A and B are mutually exclusive if the occurrence of one implies the non-occurrence of the other, i.e., A ∩ B = ∅:
P(A ∩ B) = 0
▪ Example: drawing a single card at random from a regular deck of cards, probability of getting a red club
—A: getting a red card
—B: getting a club
—P(A ∩ B) = 0
▪ The complementary event of event A is the event [not A], i.e., the event that A does not occur, denoted by Ā
—Events A and Ā are mutually exclusive
—P(Ā) = 1 − P(A)

SLIDE 11: Union Probability

▪ Union of events A and B:
P(A or B) = P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
▪ If A and B are mutually exclusive: P(A ∪ B) = P(A) + P(B)
▪ Example: drawing a single card at random from a regular deck of cards, probability of getting a red card or a king
—A: getting a red card ⇒ P(A) = 26/52
—B: getting a king ⇒ P(B) = 4/52
—P(A ∩ B) = 2/52
—P(A ∪ B) = P(A) + P(B) − P(A ∩ B) = 26/52 + 4/52 − 2/52 = 28/52

SLIDE 12: Conditional Probability

▪ Probability of event A given the occurrence of some event B:
P(A | B) = P(A ∩ B) / P(B)
▪ If events A and B are independent:
P(A | B) = P(A) P(B) / P(B) = P(A)
▪ Example: drawing a single card at random from a regular deck of cards, probability of getting a king given that the card is red
—A: getting a red card ⇒ P(A) = 26/52
—B: getting a king ⇒ P(B) = 4/52
—P(A ∩ B) = 2/52
—P(B | A) = P(B ∩ A) / P(A) = (2/52) / (26/52) = 2/26 = P(B)
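The card examples on the last few slides can be verified by brute-force enumeration of the deck. A minimal sketch (the rank/suit labels are illustrative, not from the slides):

```python
from itertools import product
from fractions import Fraction

# Enumerate a standard 52-card deck (labels are illustrative).
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]  # hearts and diamonds are red
deck = list(product(ranks, suits))

red = {c for c in deck if c[1] in ("hearts", "diamonds")}
kings = {c for c in deck if c[0] == "K"}

unit = Fraction(1, len(deck))
p_red = len(red) * unit                                  # 26/52
p_king = len(kings) * unit                               # 4/52
p_red_and_king = len(red & kings) * unit                 # 2/52
p_king_given_red = Fraction(len(red & kings), len(red))  # 2/26

print(p_red_and_king == p_red * p_king)  # independence: True
print(p_king_given_red == p_king)        # conditional equals marginal: True
```

Exact `Fraction` arithmetic avoids floating-point noise, so the independence checks are exact equalities.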

SLIDE 13: Random Variable

▪ A numerical value can be associated with each outcome of an experiment
▪ A random variable X is a function from the sample space Ω to the real line that assigns a real number X(s) to each element s of Ω:
X: Ω → R
▪ A random variable takes on its values with some probability

SLIDE 14: Random Variable

▪ Example: Consider the random experiment of tossing a coin twice. The sample space is:
Ω = {(H,H), (H,T), (T,H), (T,T)}
Define random variable X as the number of heads in the experiment:
X((T,T)) = 0, X((H,T)) = 1, X((T,H)) = 1, X((H,H)) = 2
▪ Example: Rolling a die. The sample space is Ω = {1,2,3,4,5,6}.
Define random variable X as the number rolled:
X(j) = j, 1 ≤ j ≤ 6

SLIDE 15: Random Variable

▪ Example: roll two fair dice and observe the outcome
Sample space Ω = {(i,j) | 1 ≤ i ≤ 6, 1 ≤ j ≤ 6}
i: integer from the first die
j: integer from the second die

Possible outcomes:
(1,1) (1,2) (1,3) (1,4) (1,5) (1,6)
(2,1) (2,2) (2,3) (2,4) (2,5) (2,6)
(3,1) (3,2) (3,3) (3,4) (3,5) (3,6)
(4,1) (4,2) (4,3) (4,4) (4,5) (4,6)
(5,1) (5,2) (5,3) (5,4) (5,5) (5,6)
(6,1) (6,2) (6,3) (6,4) (6,5) (6,6)

SLIDE 16: Random Variable

▪ Random variable X: sum of the two faces of the dice
X(i,j) = i + j
—P(X = 12) = P({(6,6)}) = 1/36
—P(X = 10) = P({(5,5), (4,6), (6,4)}) = 3/36
▪ Random variable Y: value of the first die
—P(Y = 1) = 1/6
—P(Y = i) = 1/6, 1 ≤ i ≤ 6

SLIDE 17: Types of Random Variables

▪ Discrete
—Random variables whose set of possible values can be written as a finite or infinite sequence
—Example: number of requests sent to a web server
▪ Continuous
—Random variables that take a continuum of possible values
—Example: time between requests sent to a web server

SLIDE 18: Probability Mass Function (PMF)

▪ X: discrete random variable
▪ p(x_i): probability mass function of X, where p(x_i) = P(X = x_i)
▪ Properties:
—0 ≤ p(x_i) ≤ 1
—Σ_i p(x_i) = 1

SLIDE 19: PMF Examples

▪ Number of heads in tossing three coins

x_i:    0    1    2    3
p(x_i): 1/8  3/8  3/8  1/8

Σ_i p(x_i) = 1/8 + 3/8 + 3/8 + 1/8 = 1

▪ Number rolled in rolling a fair die

p(x_i) = 1/6 for x_i = 1, 2, …, 6
Σ_i p(x_i) = 1/6 + 1/6 + 1/6 + 1/6 + 1/6 + 1/6 = 1
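The three-coin PMF above can be reproduced by enumerating all 8 outcomes; a small sketch:

```python
from itertools import product
from collections import Counter
from fractions import Fraction

# Enumerate all 2^3 outcomes of tossing three fair coins and tabulate
# the PMF of X = number of heads: 1/8, 3/8, 3/8, 1/8 for x = 0..3.
outcomes = list(product("HT", repeat=3))
counts = Counter(o.count("H") for o in outcomes)
pmf = {x: Fraction(c, len(outcomes)) for x, c in counts.items()}

for x in sorted(pmf):
    print(x, pmf[x])
print(sum(pmf.values()))  # 1
```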

SLIDE 20: Probability Density Function (PDF)

▪ X: continuous random variable
▪ f(x): probability density function of X
▪ Note:
—P(X = x) = 0 !!
—f(x) ≠ P(X = x)
—P(x ≤ X ≤ x + Δx) ≈ f(x) Δx
▪ Properties:
—P(a ≤ X ≤ b) = ∫_a^b f(x) dx
—∫_{−∞}^{+∞} f(x) dx = 1
▪ Relation to the CDF F(x) of X: f(x) = (d/dx) F(x)

SLIDE 21: Probability Density Function

▪ Example: The life of an inspection device is given by X, a continuous random variable with PDF:
f(x) = (1/2) e^(−x/2), for x ≥ 0
—X has an exponential distribution with mean 2 years
—Probability that the device’s life is between 2 and 3 years:
P(2 ≤ X ≤ 3) = (1/2) ∫_2^3 e^(−x/2) dx = 0.14

SLIDE 22: Cumulative Distribution Function (CDF)

▪ X: discrete or continuous random variable
▪ F(x): cumulative probability distribution function of X, or simply, probability distribution function of X:
F(x) = P(X ≤ x)
—If X is discrete, then F(x) = Σ_{x_i ≤ x} p(x_i)
—If X is continuous, then F(x) = ∫_{−∞}^{x} f(t) dt
▪ Properties
—F(x) is a non-decreasing function, i.e., if a < b, then F(a) ≤ F(b)
—lim_{x→+∞} F(x) = 1, and lim_{x→−∞} F(x) = 0
▪ All probability questions about X can be answered in terms of the CDF, e.g.:
P(a < X ≤ b) = F(b) − F(a), for all a ≤ b

SLIDE 23: Cumulative Distribution Function

▪ Discrete random variable example: rolling a die, X is the number rolled
—p(i) = P(X = i) = 1/6, 1 ≤ i ≤ 6
—F(i) = P(X ≤ i) = p(1) + … + p(i) = i/6

SLIDE 24: Cumulative Distribution Function

▪ Continuous random variable example: the inspection device has CDF:
F(x) = (1/2) ∫_0^x e^(−t/2) dt = 1 − e^(−x/2)
—The probability that the device lasts for less than 2 years:
P(X ≤ 2) = F(2) = 1 − e^(−1) = 0.632
—The probability that it lasts between 2 and 3 years:
P(2 ≤ X ≤ 3) = F(3) − F(2) = (1 − e^(−3/2)) − (1 − e^(−1)) = 0.145
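A quick numerical check of the inspection-device example, using the closed-form exponential CDF with mean 2:

```python
import math

# Inspection device: X ~ Exponential with mean 2 years, CDF F(x) = 1 - e^(-x/2).
def device_cdf(x, mean=2.0):
    return 1.0 - math.exp(-x / mean)

p_less_than_2 = device_cdf(2)                       # 1 - e^-1
p_between_2_and_3 = device_cdf(3) - device_cdf(2)   # e^-1 - e^(-3/2)

print(round(p_less_than_2, 3))      # 0.632
print(round(p_between_2_and_3, 3))  # 0.145
```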

SLIDE 25: Joint Probability Distribution

▪ The joint probability distribution of random variables X and Y is defined as
F(x, y) = P(X ≤ x, Y ≤ y)
▪ X and Y are independent random variables if
F(x, y) = F_X(x) ⋅ F_Y(y)
—Discrete: p(x, y) = p_X(x) ⋅ p_Y(y)
—Continuous: f(x, y) = f_X(x) ⋅ f_Y(y)

SLIDE 26: Expectation of a Random Variable

▪ Mean or Expected Value:
E[X] = Σ_{i=1}^{n} x_i p(x_i)   (discrete)
E[X] = ∫_{−∞}^{+∞} x f(x) dx   (continuous)
▪ Example: number of heads in tossing three coins
E[X] = 0⋅p(0) + 1⋅p(1) + 2⋅p(2) + 3⋅p(3)
     = 1 ⋅ 3/8 + 2 ⋅ 3/8 + 3 ⋅ 1/8
     = 12/8 = 1.5
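The expectation above is a one-line sum over the PMF; a sketch using the three-coin PMF from the earlier slide:

```python
from fractions import Fraction

# PMF of X = number of heads in three fair coin tosses.
pmf = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

# E[X] = sum over x of x * p(x)
mean = sum(x * p for x, p in pmf.items())
print(mean)  # 3/2
```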

SLIDE 27: Expectation of a Function

▪ h(X): a real-valued function of random variable X
▪ How to compute E[h(X)]?
—If X is discrete with PMF p(x):
E[h(X)] = Σ_x h(x) p(x)
—If X is continuous with PDF f(x):
E[h(X)] = ∫_{−∞}^{+∞} h(x) f(x) dx
▪ Example: X is the number rolled when rolling a die
—PMF: p(x) = 1/6, for x = 1, 2, …, 6
E[X²] = Σ_{x=1}^{6} x² p(x) = (1/6)(1 + 2² + ⋯ + 6²) = 91/6 ≈ 15.17
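The die example, computed directly from the E[h(X)] formula:

```python
from fractions import Fraction

# E[h(X)] with h(x) = x^2 for a fair die: sum of x^2 * p(x).
pmf = {x: Fraction(1, 6) for x in range(1, 7)}
e_x2 = sum(x**2 * p for x, p in pmf.items())

print(e_x2)         # 91/6
print(float(e_x2))  # ≈ 15.17
```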

SLIDE 28: Properties of Expectation

▪ X, Y: two random variables
▪ a, b: two constants
E[aX] = a E[X]
E[X + b] = E[X] + b
E[X + Y] = E[X] + E[Y]

SLIDE 29: Misuses of Expectations

▪ Multiplying means to get the mean of a product:
E[XY] ≠ E[X] E[Y]   (in general)
▪ Example: tossing three coins
—X: number of heads
—Y: number of tails
—E[X] = E[Y] = 3/2 ⇒ E[X]E[Y] = 9/4
—E[XY] = 3/2
⇒ E[XY] ≠ E[X]E[Y]
▪ Dividing means to get the mean of a ratio:
E[X/Y] ≠ E[X] / E[Y]   (in general)

SLIDE 30: Variance of a Random Variable

▪ The variance is a measure of the spread of a distribution around its mean value
▪ Variance is symbolized by V[X] or Var[X] or σ²:
—Mean is a way to describe the location of a distribution
—Variance is a way to capture its scale or degree of being spread out
—The unit of variance is the square of the unit of the original variable
▪ σ: standard deviation
—Defined as the square root of the variance V[X]
—Expressed in the same units as the mean

SLIDE 31: Variance of a Random Variable

▪ Variance: the expected value of the square of the distance between a random variable and its mean, where μ = E[X]:
V[X] = E[(X − μ)²] = Σ_{i=1}^{n} (x_i − μ)² p(x_i)   (discrete)
V[X] = E[(X − μ)²] = ∫_{−∞}^{+∞} (x − μ)² f(x) dx   (continuous)
▪ Equivalently:
σ² = E[X²] − (E[X])²

SLIDE 32: Variance of a Random Variable

▪ Example: number of heads in tossing three coins
E[X] = 1.5
σ² = (0 − 1.5)²⋅p(0) + (1 − 1.5)²⋅p(1) + (2 − 1.5)²⋅p(2) + (3 − 1.5)²⋅p(3)
   = 9/4 ⋅ 1/8 + 1/4 ⋅ 3/8 + 1/4 ⋅ 3/8 + 9/4 ⋅ 1/8
   = 24/32 = 3/4
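The variance of the three-coin example can be checked both from the definition and from the shortcut formula:

```python
from fractions import Fraction

# Variance of X = number of heads in three coin tosses, two ways.
pmf = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}
mean = sum(x * p for x, p in pmf.items())  # 3/2

var_definition = sum((x - mean) ** 2 * p for x, p in pmf.items())
var_shortcut = sum(x * x * p for x, p in pmf.items()) - mean**2  # E[X^2] - (E[X])^2

print(var_definition, var_shortcut)  # 3/4 3/4
```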

SLIDE 33: Variance of a Random Variable

▪ Example: The mean life of the previous inspection device is:
E[X] = (1/2) ∫_0^∞ x e^(−x/2) dx = 2
▪ To compute the variance of X, we first compute E[X²]:
E[X²] = (1/2) ∫_0^∞ x² e^(−x/2) dx = 8
▪ Hence, the variance and standard deviation of the device’s life are:
V[X] = 8 − 2² = 4
σ = √V[X] = 2

SLIDE 34: Properties of Variance

▪ X, Y: two random variables
▪ a, b: two constants
V[X] ≥ 0
V[aX] = a² V[X]
V[X + b] = V[X]
▪ If X and Y are independent:
V[X + Y] = V[X] + V[Y]

SLIDE 35: Coefficient of Variation

▪ Coefficient of Variation:
CV = Standard Deviation / Mean = σ/μ
▪ Example: number of heads in tossing three coins:
CV = √(3/4) / (3/2) = 1/√3
▪ Example: inspection device:
σ = E[X] = 2 ⇒ CV = 1

SLIDE 36: Covariance

▪ The covariance between random variables X and Y, denoted by Cov(X, Y) or σ²_{X,Y}, is a measure of how much X and Y change together:
σ²_{X,Y} = E[(X − E[X])(Y − E[Y])] = E[XY] − E[X]E[Y]
▪ For independent variables, the covariance is zero, since:
E[XY] = E[X]E[Y]
▪ Note: Although independence always implies zero covariance, the reverse is not true

SLIDE 37: Covariance

▪ Example: tossing three coins
—X: number of heads
—Y: number of tails
—E[X] = E[Y] = 3/2
▪ E[XY]?
—X and Y depend on each other: Y = 3 − X

x:    0    1    2    3
y:    3    2    1    0
xy:   0    2    2    0
p(x): 1/8  3/8  3/8  1/8

so P(XY = 0) = 2/8 and P(XY = 2) = 6/8
—E[XY] = 0⋅P(XY = 0) + 2⋅P(XY = 2) = 3/2
▪ σ²_{X,Y} = E[XY] − E[X]E[Y] = 3/2 − 3/2 × 3/2 = −3/4
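The covariance computation above, carried out exactly from the PMF of X (with Y = 3 − X):

```python
from fractions import Fraction

# Cov(X, Y) for X = heads, Y = 3 - X = tails in three fair coin tosses.
pmf = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

e_x = sum(x * p for x, p in pmf.items())             # 3/2 (and E[Y] = E[X])
e_xy = sum(x * (3 - x) * p for x, p in pmf.items())  # 3/2
cov = e_xy - e_x * e_x                               # E[XY] - E[X]E[Y]
print(cov)  # -3/4
```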

SLIDE 38: Correlation

▪ The correlation coefficient between random variables X and Y, denoted by ρ_{X,Y}, is the normalized value of their covariance:
ρ_{X,Y} = σ²_{X,Y} / (σ_X σ_Y)
▪ Indicates the strength and direction of a linear relationship between two random variables
▪ The correlation always lies between −1 and +1
—ρ = −1: negative linear correlation; ρ = 0: no correlation; ρ = +1: positive linear correlation
▪ Example: tossing three coins:
ρ_{X,Y} = (−3/4) / (√(3/4) √(3/4)) = −1

SLIDE 39: Outline

▪ Probability and random variables
—Random experiment and random variable
—Probability mass/density functions
—Expectation, variance, covariance, correlation
▪ Probability distributions
—Discrete probability distributions
—Continuous probability distributions
—Empirical probability distribution

SLIDE 40: Outline

▪ Probability and random variables
—Random experiment and random variable
—Probability mass/density functions
—Expectation, variance, covariance, correlation
▪ Probability distributions
—Discrete probability distributions
—Continuous probability distributions
—Empirical probability distribution

SLIDE 41: Discrete Uniform Distribution

▪ A random variable X has a discrete uniform distribution if each of the n values in its range, say x_1, x_2, …, x_n, has equal probability
▪ PMF: p(x_i) = P(X = x_i) = 1/n
[Figure: PMF with p(x_i) = 1/4 at x = 1, 2, 3, 4]

SLIDE 42: Discrete Uniform Distribution

▪ Consider a discrete uniform random variable X on the consecutive integers a, a+1, a+2, …, b, for a ≤ b. Then:
E[X] = (a + b) / 2
V[X] = ((b − a + 1)² − 1) / 12
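Both formulas can be checked by direct enumeration; the endpoints (a, b) = (3, 9) below are an arbitrary illustrative choice:

```python
# Check E[X] = (a+b)/2 and V[X] = ((b-a+1)^2 - 1)/12 for a discrete
# uniform variable on consecutive integers a..b, by direct enumeration.
a, b = 3, 9
xs = range(a, b + 1)
n = b - a + 1
mean = sum(xs) / n
var = sum((x - mean) ** 2 for x in xs) / n

print(mean, var)  # 6.0 4.0
```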

SLIDE 43: Bernoulli Trial

▪ Consider an experiment whose outcome can be a success with probability p or a failure with probability 1 − p:
—X = 1 if the outcome is a success
—X = 0 if the outcome is a failure
▪ X is a Bernoulli random variable with parameter p
—where 0 ≤ p ≤ 1 is the success probability
▪ PMF:
p(1) = P(X = 1) = p
p(0) = P(X = 0) = 1 − p
▪ Properties:
—E[X] = p and V[X] = p(1 − p)

SLIDE 44: Binomial Distribution

▪ X: number of successes in n (n = 1, 2, …) independent Bernoulli trials with success probability p
▪ X is a binomial random variable with parameters (n, p)
▪ PMF: probability of having k (k = 0, 1, 2, …, n) successes in n trials:
p(k) = P(X = k) = C(n,k) p^k (1 − p)^(n−k)
where C(n,k) = n! / (k! (n − k)!)
▪ Properties:
—E[X] = np and V[X] = np(1 − p)
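A sanity check of the binomial PMF: it should sum to 1, and the mean computed by direct summation should equal np. The values n = 10 and p = 0.3 are illustrative:

```python
from math import comb

# Binomial PMF p(k) = C(n,k) p^k (1-p)^(n-k).
def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 10, 0.3
total = sum(binom_pmf(k, n, p) for k in range(n + 1))
mean = sum(k * binom_pmf(k, n, p) for k in range(n + 1))

print(round(total, 9), round(mean, 9))  # 1.0 3.0
```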

SLIDE 45: Example: Binomial Distribution

[Figures: Binomial distribution PMF (n = 10); Binomial distribution CDF (n = 10)]

SLIDE 46: Geometric Distribution

▪ X: number of Bernoulli trials until achieving the first success
▪ X is a geometric random variable with success probability p
▪ PMF: probability of k (k = 1, 2, 3, …) trials until the first success:
p(k) = p (1 − p)^(k−1)
▪ CDF: F(k) = 1 − (1 − p)^k
▪ Properties: E[X] = 1/p, and V[X] = (1 − p)/p²
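The closed-form CDF follows from summing the geometric series of PMF terms; a quick numerical check (p = 0.25 is illustrative):

```python
# Geometric distribution: summing the PMF p(1-p)^(k-1) for k = 1..k0
# reproduces the closed-form CDF 1 - (1-p)^k0.
p = 0.25

def geom_pmf(k):
    return p * (1 - p) ** (k - 1)

k0 = 5
cdf_by_sum = sum(geom_pmf(k) for k in range(1, k0 + 1))
cdf_closed = 1 - (1 - p) ** k0

print(abs(cdf_by_sum - cdf_closed) < 1e-12)  # True
```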

SLIDE 47: Example: Geometric Distribution

[Figures: Geometric distribution PMF; Geometric distribution CDF]

SLIDE 48: Poisson Distribution

▪ Number of events occurring in a fixed time interval
—Events occur with a known rate and are independent
▪ The Poisson distribution is characterized by the rate λ
—Rate: the average number of event occurrences in a fixed time interval
▪ Examples
—The number of calls received by a switchboard per minute
—The number of packets arriving at a router per second
—The number of travelers arriving at the airport for flight registration per hour

SLIDE 49: Poisson Distribution

▪ Random variable X is Poisson distributed with rate parameter λ
▪ PMF: the probability that there are exactly k events in a time interval:
p(k) = P(X = k) = (λ^k / k!) e^(−λ), k = 0, 1, 2, …
▪ CDF: the probability of at most k events in a time interval:
F(k) = P(X ≤ k) = Σ_{i=0}^{k} (λ^i / i!) e^(−λ)
▪ Properties: E[X] = λ and V[X] = λ

SLIDE 50: Example: Poisson Distribution

[Figures: Poisson distribution PMF; Poisson distribution CDF]

SLIDE 51: Example: Poisson Distribution

▪ The number of cars that enter a parking lot follows a Poisson distribution with rate λ = 20 cars/hour
—The probability of having exactly 15 cars entering the parking lot in one hour:
p(15) = (20^15 / 15!) e^(−20) = 0.051649
—The probability of having more than 3 cars entering the parking lot in one hour:
P(X > 3) = 1 − P(X ≤ 3) = 1 − (p(0) + p(1) + p(2) + p(3)) = 0.9999967
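The parking-lot numbers can be reproduced directly from the Poisson PMF:

```python
import math

# Parking-lot example: X ~ Poisson(lam = 20 cars/hour).
def poisson_pmf(k, lam):
    return lam**k / math.factorial(k) * math.exp(-lam)

lam = 20
p15 = poisson_pmf(15, lam)
p_more_than_3 = 1 - sum(poisson_pmf(k, lam) for k in range(4))

print(round(p15, 6))   # 0.051649
print(p_more_than_3)   # ≈ 0.999997
```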

SLIDE 52: Outline

▪ Probability and random variables
—Random experiment and random variable
—Probability mass/density functions
—Expectation, variance, covariance, correlation
▪ Probability distributions
—Discrete probability distributions
—Continuous probability distributions
—Empirical probability distribution

SLIDE 53: Uniform Distribution

▪ A random variable X has a continuous uniform distribution on the interval [a, b] if its PDF and CDF are:
—PDF: f(x) = 1/(b − a), for a ≤ x ≤ b
—CDF: F(x) = 0 for x < a; (x − a)/(b − a) for a ≤ x ≤ b; 1 for x > b
▪ Properties: E[X] = (a + b)/2, and V[X] = (b − a)²/12
[Figures: PDF and CDF of the uniform distribution]

SLIDE 54: Uniform Distribution Properties

▪ P(x1 < X < x2) is proportional to the length of the interval x2 − x1:
P(x1 < X < x2) = (x2 − x1)/(b − a)
▪ Special case: the standard uniform distribution, denoted by X ~ U(0,1):
f(x) = 1 for 0 ≤ x ≤ 1, and 0 otherwise
—Very useful for random number generation in simulations

SLIDE 55: Exponential Distribution

▪ A random variable X is exponentially distributed with parameter λ if its PDF and CDF are:
—PDF: f(x) = λ e^(−λx), for x ≥ 0
—CDF: F(x) = 1 − e^(−λx), for x ≥ 0
▪ Properties:
E[X] = 1/λ, and V[X] = 1/λ²
▪ The exponential distribution describes the time between consecutive events in a Poisson process of rate λ

SLIDE 56: Example: Exponential Distribution

[Figures: Exponential distribution PDF; Exponential distribution CDF]

SLIDE 57: Memoryless Property

▪ Memorylessness is a property of certain probability distributions, such as the exponential distribution and the geometric distribution
—future events do not depend on past events, but only on the present
▪ Formally: random variable X has a memoryless distribution if
P(X > s + t | X > t) = P(X > s), for s, t ≥ 0
▪ Example: The probability that you will wait s more minutes given that you have already been waiting t minutes is the same as the probability that you wait for more than s minutes from the beginning!

SLIDE 58: Example: Exponential Distribution

▪ The time needed to repair the engine of a car is exponentially distributed with a mean time equal to 3 hours, so F(x) = 1 − e^(−x/3)
—The probability that the car spends more than the average wait time in repair:
P(X > 3) = 1 − F(3) = e^(−3/3) = 0.368
—The probability that the car repair time lasts between 2 and 3 hours:
P(2 ≤ X ≤ 3) = F(3) − F(2) = 0.145
—The probability that the repair time lasts for another hour given that it has already lasted for 2.5 hours, using the memoryless property of the exponential distribution:
P(X > 1 + 2.5 | X > 2.5) = P(X > 1) = 1 − F(1) = e^(−1/3) = 0.717
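The repair-time example can be checked numerically, including a direct verification of the memoryless property from the definition of conditional probability:

```python
import math

# Car-repair example: X ~ Exponential with mean 3 hours, F(x) = 1 - e^(-x/3).
def cdf(x, mean=3.0):
    return 1.0 - math.exp(-x / mean)

p_more_than_3 = 1 - cdf(3)   # e^-1 ≈ 0.368
p_2_to_3 = cdf(3) - cdf(2)   # ≈ 0.145

# Memoryless: P(X > 3.5 | X > 2.5) computed directly vs. via P(X > 1).
p_cond = (1 - cdf(3.5)) / (1 - cdf(2.5))
p_memless = 1 - cdf(1)       # e^(-1/3) ≈ 0.717

print(p_more_than_3, p_2_to_3, p_cond, p_memless)
```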

SLIDE 59: Normal Distribution

▪ The normal distribution, also called the Gaussian distribution, is an important continuous probability distribution applicable in many fields
▪ It is specified by two parameters: mean (μ) and variance (σ²)
▪ The importance of the normal distribution as a statistical model in natural and behavioral sciences is due in part to the Central Limit Theorem
▪ It is usually used to model system error (e.g., channel error), the distribution of natural phenomena, height, weight, etc.

SLIDE 60: Why Normal?

▪ There are two main reasons for the popularity of the normal distribution:
1. The sum of n independent normal variables is a normal variable. If
X_i ~ N(μ_i, σ_i²), for i = 1, …, n
then X = Σ_{i=1}^{n} a_i X_i has a normal distribution with mean Σ_{i=1}^{n} a_i μ_i and variance Σ_{i=1}^{n} a_i² σ_i²
2. The mean of a large number of independent observations from any distribution tends to have a normal distribution. This result, which is called the central limit theorem, is true for observations from all distributions
⇒ Experimental errors caused by many factors are normal

SLIDE 61: Central Limit Theorem

[Figure: histogram plot of the average proportion of heads in a fair coin toss, over a large number of sequences of coin tosses]

SLIDE 62: Normal Distribution

▪ Random variable X is normally distributed with parameters (μ, σ²), i.e., X ~ N(μ, σ²):
—PDF: f(x) = (1/(σ√(2π))) e^(−(1/2)((x−μ)/σ)²), for −∞ < x < +∞
—CDF: does not have any closed form!
—E[X] = μ, and V[X] = σ²
▪ Properties:
—lim_{x→±∞} f(x) = 0
—The normal PDF is a symmetrical, bell-shaped curve centered at its expected value μ
—The maximum value of the PDF occurs at x = μ

SLIDE 63: Standard Normal Distribution

▪ Random variable Z has a standard normal distribution if it is normally distributed with parameters (0, 1), i.e., Z ~ N(0, 1):
—PDF: f(x) = (1/√(2π)) e^(−x²/2), for −∞ < x < +∞
—CDF: commonly denoted by Φ(z):
Φ(z) = (1/√(2π)) ∫_{−∞}^{z} e^(−x²/2) dx

SLIDE 64: Normal Distribution

▪ Evaluating the distribution X ~ N(μ, σ²):
—F(x) = P(X ≤ x)?
Two techniques:
1. Use numerical methods (no closed form)
2. Use the standard normal distribution
—Φ(z) is widely tabulated
—Use the transformation Z = (X − μ)/σ
—If X ~ N(μ, σ²) then Z ~ N(0, 1), i.e., the standard normal distribution:
F(x) = P(X ≤ x) = P((X − μ)/σ ≤ (x − μ)/σ) = P(Z ≤ (x − μ)/σ) = Φ((x − μ)/σ)

SLIDE 65: Normal Distribution

▪ Example: The time required to load an oceangoing vessel, X, is distributed as N(12, 4)
—The probability that the vessel is loaded in less than 10 hours:
F(10) = Φ((10 − 12)/2) = Φ(−1) = 0.1587
—Using the symmetry property, Φ(1) is the complement of Φ(−1)
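The vessel-loading probability can be computed with the z-transformation above; the standard normal CDF Φ can be written via the error function:

```python
import math

# Standard normal CDF via math.erf: Phi(z) = (1 + erf(z / sqrt(2))) / 2.
def phi(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Vessel-loading example: X ~ N(12, 4), i.e., mu = 12 and sigma = 2.
mu, sigma = 12.0, 2.0
p_less_than_10 = phi((10 - mu) / sigma)  # Phi(-1)

print(round(p_less_than_10, 4))    # 0.1587
print(round(phi(1) + phi(-1), 4))  # symmetry: Phi(1) = 1 - Phi(-1), so 1.0
```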

SLIDE 66: Stochastic Process

▪ Stochastic process: a collection of random variables indexed over time
▪ Example:
—N(t): number of jobs at the CPU of a computer system over time
—Take several identical systems and observe N(t)
—The number N(t) at any time t is a random variable
—Can find the probability distribution functions for N(t) at each possible value of t
▪ Notation: {N(t): t ≥ 0}

SLIDE 67: Poisson Process

▪ Counting process: a stochastic process that represents the total number of events that have occurred in the time interval [0, t]
▪ Poisson process: the counting process {N(t), t ≥ 0} is a Poisson process with rate λ if:
—N(0) = 0
—The process has independent increments
—The number of events in any interval of length t is Poisson distributed with mean λt. That is, for all s, t ≥ 0:
P(N(t + s) − N(s) = n) = ((λt)^n / n!) e^(−λt)
▪ Property: equal mean and variance: E[N(t)] = V[N(t)] = λt

SLIDE 68: Interarrival Times

▪ Consider the interarrival times of a Poisson process with rate λ, denoted by A_1, A_2, …, where A_i is the elapsed time between arrival i and arrival i + 1
▪ The interarrival times A_1, A_2, … are independent identically distributed exponential random variables having the mean 1/λ
▪ Proof?
[Figure: arrival counts ~ Poisson(λt); interarrival times ~ Exponential(λ)]

SLIDE 69: Splitting and Pooling

▪ Pooling:
—N1(t): Poisson process with rate λ1
—N2(t): Poisson process with rate λ2
—N(t) = N1(t) + N2(t): Poisson process with rate λ1 + λ2
▪ Splitting:
—N(t): Poisson process with rate λ
—Each event is classified as Type I with probability p, or Type II with probability 1 − p
—N1(t): the number of Type I events is a Poisson process with rate pλ
—N2(t): the number of Type II events is a Poisson process with rate (1 − p)λ
—Note: N(t) = N1(t) + N2(t)

SLIDE 70: More on Poisson Distribution

▪ {N(t), t ≥ 0}: a Poisson process with arrival rate λ
▪ Probability of no arrivals in a small time interval h:
P(N(h) = 0) = e^(−λh) ≈ 1 − λh
▪ Probability of one arrival in a small time interval h:
P(N(h) = 1) = λh ⋅ e^(−λh) ≈ λh
▪ Probability of two or more arrivals in a small time interval h:
P(N(h) ≥ 2) = 1 − [P(N(h) = 0) + P(N(h) = 1)] ≈ 0

SLIDE 71: Outline

▪ Probability and random variables
—Random experiment and random variable
—Probability mass/density functions
—Expectation, variance, covariance, correlation
▪ Probability distributions
—Discrete probability distributions
—Continuous probability distributions
—Empirical probability distribution

SLIDE 72: Empirical Distribution

▪ A distribution whose parameters are the observed values in a sample of data:
—Could be used if no theoretical distribution fits the data adequately
—Advantage: no assumption beyond the observed values in the sample
—Disadvantage: the sample might not cover the entire range of possible values

SLIDE 73: Empirical Distribution

▪ “Piecewise linear” empirical distribution
—Used for continuous data
—Appropriate when a large data sample is available
—The empirical CDF is approximated by a piecewise linear function: the ‘jump points’ are connected by linear functions
[Figure: piecewise linear empirical CDF]

SLIDE 74: Empirical Distribution

▪ Piecewise linear empirical distribution
—Organize the x-axis into K intervals
—Interval i is from a_{i−1} to a_i, for i = 1, 2, …, K
—p_i: relative frequency of interval i
—c_i: relative cumulative frequency of interval i, i.e., c_i = p_1 + ⋯ + p_i
—Empirical CDF: if x is in interval i, i.e., a_{i−1} < x ≤ a_i, then:
F(x) = c_{i−1} + s_i (x − a_{i−1})
where the slope s_i is given by s_i = (c_i − c_{i−1}) / (a_i − a_{i−1})

SLIDE 75: Example: Empirical Distribution

▪ Suppose the data collected for 100 broken machine repair times are:

i | Interval (Hours) | Frequency | Relative Frequency | Cumulative Frequency | Slope
1 | 0.0 < x ≤ 0.5    | 31        | 0.31               | 0.31                 | 0.62
2 | 0.5 < x ≤ 1.0    | 10        | 0.10               | 0.41                 | 0.20
3 | 1.0 < x ≤ 1.5    | 25        | 0.25               | 0.66                 | 0.50
4 | 1.5 < x ≤ 2.0    | 34        | 0.34               | 1.00                 | 0.68

Example: slope s_3 = (c_3 − c_2) / (a_3 − a_2) = (0.66 − 0.41) / (1.5 − 1.0) = 0.5
[Figure: piecewise linear empirical CDF over the range 0 to 2 hours]
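The piecewise linear empirical CDF from the repair-time table can be sketched directly from the breakpoints and cumulative frequencies:

```python
import bisect

# Breakpoints a_0..a_4 and cumulative relative frequencies c_0..c_4
# taken from the repair-time table above.
a = [0.0, 0.5, 1.0, 1.5, 2.0]
c = [0.0, 0.31, 0.41, 0.66, 1.00]

def empirical_cdf(x):
    if x <= a[0]:
        return 0.0
    if x >= a[-1]:
        return 1.0
    i = bisect.bisect_left(a, x)  # interval i: a[i-1] < x <= a[i]
    slope = (c[i] - c[i - 1]) / (a[i] - a[i - 1])
    return c[i - 1] + slope * (x - a[i - 1])

print(round(empirical_cdf(1.25), 3))  # 0.535 (interval 3, slope 0.5)
```

Each call interpolates linearly between the cumulative frequency at the left endpoint of the interval containing x and the one at its right endpoint, exactly as in the F(x) formula on the previous slide.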