Telematics 2 & Performance Evaluation Chapter 9 Short - - PDF document


Telematics 2 & Performance Evaluation

Chapter 9

Short Probability Primer & Obtaining Results

(Acknowledgement: these slides have mostly been compiled from [Kar04, Rob01, BLK02])

Telematics 2 / Performance Evaluation (WS 17/18): 09 – Probability & Results

Motivation

q Recall:
  q Usually, a simulation program is built to help answer a series of (quantitative) questions about a system under study
  q For this, a model (the simulation program) is built
  q This model needs to represent the system under study with sufficient accuracy concerning the characteristics of the system that are relevant for the questions of interest
q Typically, this requires modeling of the following:
  q System
  q Load
  q Faults


System, Load, and Fault Models (1)

q System model:
  q Describes the composition of an entire system out of simpler subsystems
  q Represents the behavior, the way a system works
q In communication networks:
  q Entities communicate by exchanging messages over links
  q Protocols implemented in entities can have parameters like processing delays, limited queue length
  q Links can have parameters like bandwidth and delay

(compiled from [Kar04])

System, Load, and Fault Models (2)

q Load model:
  q Describes the pattern with which requests are made to the system to perform different kinds of activities
    ■ When do such requests occur?
    ■ What is the time between requests?
    ■ What are the parameters of such a request?
q In communication networks:
  q Load is the desire of a user to send a packet to another user
    ■ Example parameter: How big is the packet?
  q On a coarser abstraction level:
    ■ How often are connections established?
    ■ How much data is transmitted within a connection?
    ■ What is the mix between different types of connections (QoS) within a session?
    ■ ...


System, Load, and Fault Models (3)

q Fault model:
  q Describes which parts of a system can deviate from their prescribed/desired behavior, and in which form
    ■ When do such deviations occur?
    ■ Can they be repeated (are faulty entities repaired)?
    ■ How long do deviations last?
    ■ Which kind of faulty behavior occurs?
q In communication networks:
  q Entities can be faulty (e.g., a node can crash, often considered a permanent error)
  q Communication links can be faulty (e.g., some bits are not transmitted correctly, due to electromagnetic noise, usually considered a transient error)

System, Load, and Fault Models (4)

q Neither the arrival of requests nor the occurrence of faults can be described deterministically
q Random distributions are needed to model these events along with their parameters:
  q What distributions are available, appropriate and easy to use in simulations?
  q How can random numbers be generated such that the simulated events occur according to these distributions? (Based on general-purpose random number generators?)
  q How well do standard distributions match observed behavior of real (communication) systems? How to choose parameters for distributions to model real systems?
  q What to do if no simple standard distribution can be found that matches a real system's behavior?
  q Is it sufficient to just look at distributions?
  q By the way, what exactly is a distribution?
q We will first review some basics of probability...


Short Review of Some Probability Basics

q Probability is a numerical measure of the likelihood that an event will occur.
q Probability values are always assigned on a scale from 0 to 1.
q A probability near 0 indicates an event is very unlikely to occur.
q A probability near 1 indicates an event is almost certain to occur.
q A probability of 0.5 indicates the occurrence of the event is just as likely as it is unlikely.

[Figure: probability scale from 0 to 1 with increasing likelihood of occurrence; at 0.5 the occurrence of the event is just as likely as it is unlikely]

(compiled from [Rob01])

An Experiment and Its Sample Space

q An experiment is any process that generates well-defined outcomes.
q The sample space for an experiment is the set of all experimental outcomes.
q A sample point is an element of the sample space, any one particular experimental outcome.
q Examples:

Experiment                                Outcomes
Draw a card from a pack                   {Ace of hearts, 2 of hearts, …, King of spades}
Telephone sales call                      Sale or no sale
First number drawn in National Lottery    {1, 2, 3, …, 49}


Constructing Sample Spaces

q A good way to construct the sample space is to write down examples of typical outcomes and try to identify the complete set
q Example: Toss a coin four times
  q One typical outcome is four consecutive heads (H,H,H,H), another is a head, tail, head and head (H,T,H,H)
  q A little thought results in identifying the sample space as the set of all such 4-tuples:

S = { (H,H,H,H), (H,H,H,T), (H,H,T,H), (H,T,H,H), (T,H,H,H), (H,H,T,T), (H,T,H,T), (H,T,T,H), (T,H,T,H), (T,T,H,H), (T,H,H,T), (H,T,T,T), (T,H,T,T), (T,T,H,T), (T,T,T,H), (T,T,T,T) }

A Counting Rule for Multiple-Step Experiments

q If an experiment consists of a sequence of k steps in which

there are n1 possible results for the first step, n2 possible results for the second step, and so on, then the total number of experimental outcomes is given by (n1)(n2) . . . (nk)

q A helpful graphical representation of a multiple-step experiment

is a tree diagram
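
The counting rule can be checked by enumerating a multiple-step experiment directly. The following is a minimal sketch; the helper name `sample_space` is an illustrative assumption and not part of the slides.

```python
from itertools import product

def sample_space(*step_results):
    """Return every outcome of a multiple-step experiment as a tuple."""
    return list(product(*step_results))

coin = ("H", "T")
# Four tosses: the counting rule predicts 2 * 2 * 2 * 2 outcomes.
four_tosses = sample_space(coin, coin, coin, coin)
print(len(four_tosses))
print(four_tosses[:3])
```

Each tuple corresponds to one root-to-leaf path in the tree diagram of the experiment.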

Counting Rule for Combinations

Another useful counting rule enables us to count the number of experimental outcomes when n objects are to be selected from a set of N objects

q Number of combinations of N objects taken n at a time:

$$C_n^N = \binom{N}{n} = \frac{N!}{n!\,(N-n)!}$$

where N! = N(N - 1)(N - 2) . . . (2)(1), n! = n(n - 1)(n - 2) . . . (2)(1), and 0! = 1

Counting Rule for Permutations

A third useful counting rule enables us to count the number of experimental outcomes when n objects are to be selected from a set of N objects where the order of selection is important

q Number of permutations of N objects taken n at a time:

$$P_n^N = n!\binom{N}{n} = \frac{N!}{(N-n)!}$$
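
Both counting rules are available in the Python standard library (`math.comb` and `math.perm`, Python 3.8 or later). The values N = 5 and n = 2 below are arbitrary illustration values, not taken from the slides.

```python
import math

N, n = 5, 2  # illustrative values: select 2 objects out of 5

combinations = math.comb(N, n)   # N! / (n! (N - n)!), order irrelevant
permutations = math.perm(N, n)   # N! / (N - n)!, order matters

print(combinations, permutations)
# Every combination can be ordered in n! ways, which links the two rules.
print(permutations == math.factorial(n) * combinations)
```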


Assigning Probabilities

q Classical Method:
  q Assigning probabilities based on the assumption of equally likely outcomes
q Relative Frequency Method:
  q Assigning probabilities based on experimentation or historical data
q Subjective Method:
  q Assigning probabilities based on the assignor’s judgment
  q Applied in economics and related sciences

Classical Method

If an experiment has n possible outcomes, this method would assign a probability of 1/n to each outcome.

q Example:
  Experiment: Rolling a die
  Sample Space: S = {1, 2, 3, 4, 5, 6}
  Probabilities: Each sample point has a 1/6 chance of occurring.

Relative Frequency of an Outcome

q Suppose that, in a large number of repetitions, N, of the experiment, the outcome O occurs $n_O$ times. The relative frequency of O is $n_O / N$. We can think of the probability of O as the value to which the relative frequency settles down as N gets larger and larger.
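
The settling of the relative frequency can be observed with a small experiment. This is a sketch under assumptions: a fair six-sided die simulated with Python's `random` module, a fixed seed for repeatability, and a hypothetical helper `relative_frequency`.

```python
import random

def relative_frequency(outcome, repetitions, seed=1):
    """Roll a fair die `repetitions` times and return n_O / N for `outcome`."""
    rng = random.Random(seed)
    n_o = sum(1 for _ in range(repetitions) if rng.randint(1, 6) == outcome)
    return n_o / repetitions

# The relative frequency of a six should approach 1/6 as N grows.
for N in (100, 10_000, 200_000):
    print(N, relative_frequency(6, N))
```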

Events and Their Probability

q An event is a collection of sample points
q The probability of any event is equal to the sum of the probabilities of the sample points in the event
q If we can identify all the sample points of an experiment and assign a probability to each, we can compute the probability of an event
q There are some basic probability relationships that can be used to compute the probability of an event without knowledge of all the sample point probabilities:
  q Complement of an Event
  q Union of Two Events
  q Intersection of Two Events
  q Mutually Exclusive Events

Complement of an Event

q The complement of event A is defined to be the event consisting of all sample points that are not in A
q The complement of A is denoted by $A^c$

[Figure: Venn diagram of the sample space S, split into event A and its complement $A^c$]

Union of Two Events

q The union of events A and B is the event containing all sample points that are in A or B or both
q The union is denoted by A ∪ B

[Figure: Venn diagram of the sample space S with events A and B; the union covers both]

Intersection of Two Events

q The intersection of events A and B is the set of all sample points that are in both A and B
q The intersection is denoted by A ∩ B

[Figure: Venn diagram of the sample space S with events A and B; the intersection is the area of overlap]

Addition Law

q The addition law provides a way to compute the probability of event A, or B, or both A and B occurring
q It is written as:

$$P(A \cup B) = P(A) + P(B) - P(A \cap B)$$

Mutually Exclusive Events

q Two events are said to be mutually exclusive if the events have no sample points in common
q That is, two events are mutually exclusive if, when one event occurs, the other cannot occur

[Figure: Venn diagram of the sample space S with non-overlapping events A and B]

q Addition Law for Mutually Exclusive Events:

$$P(A \cup B) = P(A) + P(B)$$

Conditional Probability

q The probability of an event given that another event has occurred is called a conditional probability
q The conditional probability of A given B is denoted by P(A|B)
q A conditional probability is computed as follows:

$$P(A|B) = \frac{P(A \cap B)}{P(B)}$$

Multiplication Law

q The multiplication law provides a way to compute the probability of an intersection of two events
q The law is written as:

$$P(A \cap B) = P(B)\,P(A|B)$$

Independent Events

q Events A and B are independent if P(A|B) = P(A)
q Multiplication Law for Independent Events:

$$P(A \cap B) = P(A)\,P(B)$$

q The multiplication law also can be used as a test to see if two events are independent
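
The addition law, conditional probability and the independence test can be tried out on a small sample space. The sketch below assumes a single roll of a fair die with equally likely outcomes; the events A and B are illustrative choices, not taken from the slides.

```python
from fractions import Fraction

S = set(range(1, 7))   # sample space of one die roll
A = {2, 4, 6}          # assumed event: even number
B = {1, 2, 3, 4}       # assumed event: at most 4

def P(event):
    """Classical method: favourable sample points / all sample points."""
    return Fraction(len(event & S), len(S))

p_union = P(A) + P(B) - P(A & B)        # addition law
p_a_given_b = P(A & B) / P(B)           # conditional probability
independent = P(A & B) == P(A) * P(B)   # multiplication law as a test

print(p_union, p_a_given_b, independent)
```

Exact fractions are used so that the equality test for independence is not disturbed by floating-point rounding.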

Random Variables

q A random variable is a numerical expression of the outcome of an experiment
  q Example 1: toss a die twice; count the number of times the number 4 appears (0, 1 or 2 times)
  q Example 2: toss a coin; assign $10 to head and -$30 to a tail
q Discrete random variable:
  q Can only take discrete values
  q Obtained by counting (0, 1, 2, 3, etc.)
  q Usually a finite number of different values
  q E.g., toss a coin 5 times; count the number of tails (0, 1, 2, 3, 4, or 5 times)

(compiled from [BLK02])

Discrete Probability Distribution

q A discrete probability distribution is the list of all possible [X_j, P(X_j)] pairs, with:
  q X_j = Value of random variable
  q P(X_j) = Probability associated with value
  q This list can be visualized with a histogram
q Mutually Exclusive (Nothing in Common)
q Collectively Exhaustive (Nothing Left Out)

$$0 \le P(X_j) \le 1, \qquad \sum_j P(X_j) = 1$$

Basic Summary Measures (1)

q Expected Value (The Mean):
  q Weighted average of the probability distribution

$$\mu = E(X) = \sum_j X_j\, P(X_j)$$

  q E.g., toss 2 coins, count the number of tails, compute expected value:

$$\mu = \sum_j X_j\, P(X_j) = (0)(.25) + (1)(.5) + (2)(.25) = 1$$

Basic Summary Measures (2)

q Variance:
  q Weighted average squared deviation about the mean

$$\sigma^2 = E\left[(X-\mu)^2\right] = \sum_j (X_j-\mu)^2\, P(X_j)$$

  q E.g., Toss 2 coins, count number of tails, compute variance:

$$\sigma^2 = \sum_j (X_j-\mu)^2\, P(X_j) = (0-1)^2(.25) + (1-1)^2(.5) + (2-1)^2(.25) = .5$$

q The standard deviation is the square root of the variance:

$$\sigma = \sqrt{\sigma^2}$$
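
The two-coin example can be recomputed from the list of [X_j, P(X_j)] pairs. This is a minimal sketch; the dictionary layout and the function names `mean` and `variance` are assumptions made for illustration.

```python
import math

# Number of tails in two fair coin tosses: value X_j -> probability P(X_j)
distribution = {0: 0.25, 1: 0.5, 2: 0.25}

def mean(dist):
    """Expected value: sum of X_j * P(X_j)."""
    return sum(x * p for x, p in dist.items())

def variance(dist):
    """Weighted average squared deviation about the mean."""
    mu = mean(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())

mu = mean(distribution)
var = variance(distribution)
sigma = math.sqrt(var)
print(mu, var, sigma)
```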

Other Summary Measures

q Coefficient of Variation:
  q Normalizes standard deviation:

$$C(X) = \frac{\sigma}{E(X)}$$

q Median & Percentiles
  q Pro: Better protection against outliers
  q Con: Better protection against outliers

Binomial Probability Distribution

q n Identical Trials
  q E.g., 15 tosses of a coin; 10 light bulbs taken from a warehouse
q 2 Mutually Exclusive Outcomes on Each Trial
  q E.g., Heads or tails in each toss of a coin; defective or not defective light bulb
q Trials are Independent
  q The outcome of one trial does not affect the outcome of the other
q Constant Probability for Each Trial
  q E.g., Probability of getting a tail is the same each time we toss the coin
q 2 Sampling Methods
  q Infinite population without replacement
  q Finite population with replacement

Binomial Probability Distribution Function

$$P(X) = \frac{n!}{X!\,(n-X)!}\; p^X (1-p)^{n-X}$$

P(X): probability of X successes given n and p
X: number of "successes" in sample (X = 0, 1, …, n)
p: the probability of each "success"
n: sample size

Tails in 2 Tosses of Coin
X    P(X)
0    1/4 = .25
1    2/4 = .50
2    1/4 = .25
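
The distribution function translates directly into code. The sketch below uses `math.comb` for the binomial coefficient; the function name `binomial_pmf` is an illustrative assumption.

```python
import math

def binomial_pmf(x, n, p):
    """Probability of x successes in n independent trials with success probability p."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)

# Tails in 2 tosses of a fair coin, matching the table above.
tails_in_two_tosses = [binomial_pmf(x, 2, 0.5) for x in range(3)]
print(tails_in_two_tosses)
```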

Binomial Distribution Characteristics

q Mean:

$$\mu = E(X) = np$$

  q E.g., $\mu = np = 5(.1) = .5$

q Variance and Standard Deviation:

$$\sigma^2 = np(1-p), \qquad \sigma = \sqrt{np(1-p)}$$

  q E.g., $\sigma = \sqrt{np(1-p)} = \sqrt{5(.1)(1-.1)} = .6708$

[Figure: histogram of P(X) over X = 0, …, 5 for n = 5, p = 0.1]

Poisson Distribution

Siméon Poisson


q Discrete events (successes) occurring in a given area of opportunity (interval)
  q Interval can be time, length, surface area, etc.
q The probability of a success in a given interval is the same for all the intervals
q The number of successes in one interval is independent of the number of successes in other intervals
q The probability of two or more successes occurring in an interval approaches zero as the interval becomes smaller
  q E.g., # customers arriving in 15 minutes
  q E.g., # defects per case of light bulbs

Poisson Probability Distribution Function

$$P(X) = \frac{e^{-\lambda}\,\lambda^{X}}{X!}$$

P(X): probability of X "successes" given λ
X: number of "successes" per unit
λ: expected (average) number of "successes"
e: 2.71828 (base of natural logs)

E.g., Find the probability of 4 customers arriving in 3 minutes when the mean is 3.6.

$$P(X = 4) = \frac{e^{-3.6}\; 3.6^{4}}{4!} = .1912$$

Warning: Sometimes the definition is

$$P(X) = \frac{(\lambda t)^{X}}{X!}\, e^{-\lambda t}$$
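
The customer example can be recomputed in a few lines. A minimal sketch, assuming the first form of the definition (λ already refers to the whole interval); `poisson_pmf` is a hypothetical helper name.

```python
import math

def poisson_pmf(x, lam):
    """Probability of x successes when the expected number of successes is lam."""
    return math.exp(-lam) * lam**x / math.factorial(x)

# 4 customers arriving in 3 minutes when the mean is 3.6
p4 = poisson_pmf(4, 3.6)
print(round(p4, 4))
```

With the second form of the definition, the same function would be called with `lam = rate * t`.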

Poisson Distribution Characteristics

q Mean:

$$\mu = E(X) = \lambda = \sum_{i=1}^{N} X_i\, P(X_i)$$

q Standard Deviation and Variance:

$$\sigma^2 = \lambda, \qquad \sigma = \sqrt{\lambda}$$

[Figure: histograms of P(X) over X for λ = 0.5 and for λ = 6]

Continuous Random Variables

q Continuous random variable:
  q Values from interval of numbers (absence of gaps)
  q This implies that for every single value, the probability that the variable takes on this value is 0
q Continuous probability distribution:
  q Distribution of continuous random variable
  q Transition from a histogram to a continuous function
  q The probability that the random variable takes on a value in the interval [a, b] is the area under the function between the values a and b
q Probability density function:
  q The derivative of the (cumulative) probability distribution function
q Important continuous probability distributions:
  q The uniform distribution
  q The exponential distribution
  q The normal distribution

The Uniform Distribution

q Properties:
  q A value is equally likely to occur anywhere in the range between the smallest value a and the largest value b
  q Also called the rectangular distribution
  q Mean:

$$\mu = \frac{a+b}{2}$$

  q Variance:

$$\sigma^2 = \frac{(b-a)^2}{12}$$

The Uniform Distribution

q The probability density function of the uniform distribution:

$$f(X) = \frac{1}{b-a} \quad \text{if } a \le X \le b$$

q Application: Selection of random numbers
  q E.g., A wooden wheel is spun on a horizontal surface and allowed to come to rest. What is the probability that a mark on the wheel will point to somewhere between the North and the East?

$$P(0 < X < 90) = \frac{90}{360} = 0.25$$

Exponential Distributions

$$P(\text{arrival time} < X) = 1 - e^{-\lambda X}$$

X: any value of continuous random variable
λ: the population average number of arrivals per unit of time
1/λ: average time between arrivals
e = 2.71828

E.g., Drivers arriving at a toll bridge; customers arriving at an ATM

q Sometimes also called negative exponential distribution

Exponential Distributions

q Describe Time or Distance between Events
  q Used for queues
q Density Function:

$$f(x) = \frac{1}{\lambda}\, e^{-x/\lambda}$$

q Parameters:

$$\mu = \lambda, \qquad \sigma = \lambda$$

[Figure: density f(X) over X for λ = 0.5 and λ = 2.0]

Warning: Sometimes the definition is $f(t) = \lambda \cdot e^{-\lambda t}$. Then $E(T) = \frac{1}{\lambda}$.

Exponential Distributions: Example

q Customers arrive at the checkout line of a supermarket at the rate of 30 per hour.
q What is the probability that the arrival time between consecutive customers will be greater than 5 minutes?

$$\lambda = 30, \qquad X = 5/60 \text{ hours}$$

$$P(\text{arrival time} > X) = 1 - P(\text{arrival time} \le X) = 1 - \left(1 - e^{-30(5/60)}\right) = .0821$$
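
The supermarket example can be verified numerically. A minimal sketch, assuming the rate-based form P(arrival time < X) = 1 - e^{-λX} with λ in arrivals per hour; the function name is illustrative.

```python
import math

def prob_interarrival_greater(x, rate):
    """P(arrival time > x) = 1 - (1 - e^(-rate * x)) = e^(-rate * x)."""
    return math.exp(-rate * x)

# 30 customers per hour, gap longer than 5 minutes (5/60 hours)
p = prob_interarrival_greater(5 / 60, rate=30)
print(round(p, 4))
```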

The Beta distribution

q Often used in the absence of better knowledge
q Distributed between two distinct points
q PDF:

$$f(x) = \frac{x^{\alpha-1}\,(1-x)^{\beta-1}}{B(\alpha, \beta)}$$

q B(α, β) only normalizes the function
q For α = β = 1: uniform distribution

[Figure taken from Wikipedia]

The Pareto distribution

q Used to model systems with “preferential attachment” or “rich get richer” effects
q Distributed in $[x_{\min}, \infty)$
q PDF for $k > 0$, $x_{\min} > 0$:

$$f(x) = \begin{cases} k\, x_{\min}^{k}\, x^{-k-1} & x \ge x_{\min} \\ 0 & x < x_{\min} \end{cases}$$

q For low values of k, i.e. k ≤ 2, the distribution is heavy-tailed: Var(X) = ∞ (and for k ≤ 1 also E(X) = ∞)
q Used for traffic modelling
q Sizes of ASes etc.

The Pareto distribution

q Often plotted in log-log scale
  q Observed variables follow a straight line
  q Following the “power law”

[Figure: Pareto density plotted on log-log axes]

The Normal Distribution

q Properties:
  q Bell Shaped
  q Symmetrical
  q Mean, Median and Mode are Equal
  q Random Variable Has Infinite Range

[Figure: bell curve f(X) over X; mean, median and mode coincide at µ]

The Mathematical Model

$$f(X) = \frac{1}{\sqrt{2\pi}\,\sigma}\; e^{-(1/2)\left[(X-\mu)/\sigma\right]^2}$$

f(X): density of random variable X
π ≈ 3.14159; e ≈ 2.71828
µ: population mean
σ: population standard deviation
X: value of random variable (−∞ < X < ∞)

Many Normal Distributions

Varying the parameters σ and µ, we obtain different normal distributions

There are an infinite number of Normal distributions (as for all previous)

The Standardized Normal Distribution

When X is normally distributed with a mean µ and a standard deviation σ, $Z = \frac{X-\mu}{\sigma}$ follows a standardized (normalized) normal distribution with a mean 0 and a standard deviation 1.

[Figure: density f(X) with mean µ and standard deviation σ, mapped to density f(Z) with $\mu_Z = 0$ and $\sigma_Z = 1$]

Finding Probabilities

Probability is the area under the curve!

$$P(c \le X \le d) = \;?$$

[Figure: area under f(X) between c and d]

Which Table to Use?

Infinitely many Normal distributions would mean infinitely many tables to look up!

52 Telematics 2 / Performance Evaluation (WS 17/18): 09 – Probability & Results

Solution: The Cumulative Standardized Normal Distribution

Cumulative Standardized Normal Distribution Table (Portion), $\mu_Z = 0$, $\sigma_Z = 1$

Z      .00     .01     .02
0.0    .5000   .5040   .5080
0.1    .5398   .5438   .5478
0.2    .5793   .5832   .5871
0.3    .6179   .6217   .6255

Probabilities: for Z = 0.12, row 0.1 and column .02 give .5478

Only One Table is Needed

Standardizing Example

$$Z = \frac{X-\mu}{\sigma} = \frac{6.2 - 5}{10} = 0.12$$

Normal Distribution: µ = 5, σ = 10, X = 6.2
Standardized Normal Distribution: $\mu_Z = 0$, $\sigma_Z = 1$, Z = 0.12

Example:

$$P(2.9 \le X \le 7.1) = .1664$$

$$Z = \frac{X-\mu}{\sigma} = \frac{2.9 - 5}{10} = -.21, \qquad Z = \frac{X-\mu}{\sigma} = \frac{7.1 - 5}{10} = .21$$

Normal Distribution: µ = 5, σ = 10, X between 2.9 and 7.1
Standardized Normal Distribution: $\mu_Z = 0$, $\sigma_Z = 1$, Z between −0.21 and 0.21, with an area of .0832 on each side of the mean

Example: $P(2.9 \le X \le 7.1) = .1664$

Cumulative Standardized Normal Distribution Table (Portion), $\mu_Z = 0$, $\sigma_Z = 1$

Z      .00     .01     .02
0.0    .5000   .5040   .5080
0.1    .5398   .5438   .5478
0.2    .5793   .5832   .5871
0.3    .6179   .6217   .6255

For Z = 0.21, row 0.2 and column .01 give .5832

Example: $P(2.9 \le X \le 7.1) = .1664$

Cumulative Standardized Normal Distribution Table (Portion), $\mu_Z = 0$, $\sigma_Z = 1$

Z       .00     .01     .02
-0.3    .3821   .3783   .3745
-0.2    .4207   .4168   .4129
-0.1    .4602   .4562   .4522
0.0     .5000   .4960   .4920

For Z = -0.21, row -0.2 and column .01 give .4168
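
The table lookups can be cross-checked with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). The sketch below standardizes both bounds and subtracts the two cumulative probabilities; small differences from the slide value come from the four-digit table.

```python
from statistics import NormalDist

X = NormalDist(mu=5, sigma=10)   # the normal distribution of the example
Z = NormalDist()                 # standardized normal: mean 0, std. dev. 1

z_low = (2.9 - X.mean) / X.stdev    # about -0.21
z_high = (7.1 - X.mean) / X.stdev   # about  0.21

# P(2.9 <= X <= 7.1) = Phi(0.21) - Phi(-0.21)
p = Z.cdf(z_high) - Z.cdf(z_low)
print(round(p, 4))
```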

Back to Simulation...

q When performing a simulation study, we are usually interested in obtaining answers to some quantitative questions
q One very common question in this respect is:
  q What is the mean value of some defined metric of the system under study?
q Example:
  q Packets arrive with an average inter-arrival time of 1/λ at a router
  q The router has two outgoing links and an arriving packet joins link i with probability $f_i$
  q The service time on link i is $\mu_i$

[Figure: router with arrival stream λ and two outgoing links µ1, µ2]

(mostly compiled from [Tow04])

Gathering Performance Statistics

q Average delay at queue i:
  q Record $D_{ij}$: delay of customer j at queue i
  q Let $N_i$ be # customers passing through queue i

$$T_i = \frac{\sum_{j=1}^{N_i} D_{ij}}{N_i}$$

q Throughput at queue i:

$$\gamma_i = \frac{N_i}{\text{total simulated time}}$$

q Average queue length at i (Little's Law, treated later in detail):

$$\bar{N}_i = \gamma_i\, T_i$$
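
The three statistics can be gathered from the recorded delays of one queue. This is a sketch under assumptions: the delays and the total simulated time below are made-up illustration values, and `queue_statistics` is a hypothetical helper, not part of any simulator.

```python
def queue_statistics(delays, total_simulated_time):
    """Return (average delay, throughput, average queue length) for one queue."""
    n_i = len(delays)                            # customers that passed the queue
    avg_delay = sum(delays) / n_i                # T_i
    throughput = n_i / total_simulated_time      # gamma_i
    avg_queue_length = throughput * avg_delay    # Little's Law
    return avg_delay, throughput, avg_queue_length

# Hypothetical delays (seconds) recorded during 10 s of simulated time
delays = [0.5, 1.5, 1.0, 2.0]
T, gamma, N_bar = queue_statistics(delays, total_simulated_time=10.0)
print(T, gamma, N_bar)
```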

Analyzing Output Results (1)

q Each time we run a simulation (using different random number streams), we will get different output results!

[Figure: the distribution of random numbers to be used during simulation (interarrival, service times) yields random number sequences 1, 2, …, M; each sequence is the input of one simulation run, which produces output results 1, 2, …, M]

Analyzing Output Results (2)

q Example: Delay experienced by each customer in queue 2
q $W_{2,n}$: delay of nth departing customer from queue 2

[Figure: router with arrival stream λ and two outgoing links µ1, µ2]

Analyzing Output Results (3)

q Each run shows variation in customer delay
q One run different from next
q Statistical characterization of delay must be made
  q Expected delay of n-th customer
  q Behavior as n approaches infinity
  q Average of n customers

[Figure: router with arrival stream λ and two outgoing links µ1, µ2]

Transient Behavior

q Simulation outputs that depend on initial condition (i.e., output value changes when initial conditions change) are called transient characteristics
  q Early part of simulation
  q Example: The first customer entering a supermarket in the morning will always find an empty queue at the counter
q Later part of simulation less dependent on initial conditions

[Figure: router with arrival stream λ and two outgoing links µ1, µ2]

Effect of Initial Conditions

q Histogram of delay of 20th customer, given initially empty (1000 runs)
q Histogram of delay of 20th customer, given non-empty conditions

Steady state behavior

q Output results may converge to a limiting steady state value if the simulation runs long enough
q Discard statistics gathered during transient phase, e.g., ignore first k measurements of delay at queue 2:

$$T_i = \frac{\sum_{j=k+1}^{N_i} D_{ij}}{N_i - k}$$

q Pick k so the statistic is approximately the same for different random number streams and remains the same as n increases

[Figure: avg. delay of packets [n, n+10] plotted over n (avg. of 5 simulations), with the knee marked]

Estimating the Transient Phase (1)

q Perform m independent replications containing n observations each:

$$\{X_{i,j}\}, \quad i = 1, \ldots, m, \quad j = 1, \ldots, n$$

q Calculate mean across all replications:

$$\bar{X}_j = \frac{1}{m}\sum_{i=1}^{m} X_{i,j}, \quad j = 1, \ldots, n$$

q Calculate overall mean across all replications:

$$\bar{X} = \frac{1}{n}\sum_{j=1}^{n} \bar{X}_j$$

q Remove first k observations when calculating overall mean:

$$\bar{X}_k = \frac{1}{n-k}\sum_{j=k+1}^{n} \bar{X}_j, \quad k = 1, \ldots, n-1$$

Estimating the Transient Phase (2)

q Compute and plot as a function of k:

$$\frac{\bar{X}_k - \bar{X}}{\bar{X}}$$

q Identify the position k0 of the knee and discard the first k0 observations in subsequent runs
q In order to facilitate computation of $\bar{X}_k$, it is better to record the running sum of all prior values instead of recording the values themselves:

$$S_{i,k} = \sum_{j=1}^{k} X_{i,j}$$

q Individual values can still be easily obtained by simple subtraction
q With this we get:

$$\bar{X} = \frac{1}{n}\,\frac{1}{m}\sum_{i=1}^{m} S_{i,n} \qquad \text{and} \qquad \bar{X}_k = \frac{1}{n-k}\,\frac{1}{m}\sum_{i=1}^{m}\left(S_{i,n} - S_{i,k}\right)$$
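
The running-sum formulation can be sketched as follows. The function name `relative_change` and the two tiny replications are illustrative assumptions; real replications would contain far more observations.

```python
from itertools import accumulate

def relative_change(replications):
    """Relative change of the truncated mean for k = 1, ..., n-1.

    `replications` holds m lists with n observations each.
    """
    m = len(replications)
    n = len(replications[0])
    # S[i][k] is the running sum of the first k observations of replication i.
    S = [[0.0] + list(accumulate(rep)) for rep in replications]
    overall = sum(S[i][n] for i in range(m)) / (n * m)
    changes = []
    for k in range(1, n):
        truncated = sum(S[i][n] - S[i][k] for i in range(m)) / ((n - k) * m)
        changes.append((truncated - overall) / overall)
    return changes

# Two hypothetical replications whose first two observations are still "warming up"
reps = [[0, 3, 5, 3, 5, 3], [0, 1, 3, 5, 3, 5]]
print(relative_change(reps))
```

In this toy data the curve stops changing from k = 2 onwards, which is where the knee would be read off.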

Confidence Intervals (1)

q Run simulation: get estimate X1 as estimate of the performance metric of interest
q Repeat simulation M times (each with new set of random numbers), get X2, …, XM – all different!
q Which of X1, …, XM is right?
q Intuitively, average of M samples should be better than choosing any one of M samples

$$\bar{X} = \frac{\sum_{j=1}^{M} X_j}{M}$$

How confident are we in $\bar{X}$?

Confidence Intervals (2)

q We cannot get a perfect estimate of the true mean, µ, with a finite # of samples
q Instead, we look for bounds: find c1 and c2 such that:
  Probability(c1 < µ < c2) = 1 – α
  [c1, c2]: confidence interval
  100(1 – α)%: confidence level
q One approach for finding c1, c2 (suppose α = 0.1):
  q Take k samples (e.g., k independent simulation runs)
  q Sort
  q Find largest value in smallest 5%: c1
  q Find smallest value in largest 5%: c2

Confidence Intervals – The Central Limit Theorem

q Central Limit Theorem: If samples X1, …, XM are independent and from the same population (independent identically distributed, i.i.d.) with population mean µ and standard deviation σ (both finite!), then the sample mean

$$\bar{X} = \frac{\sum_{j=1}^{M} X_j}{M}$$

is approximately normally distributed with mean µ and standard deviation $\sigma / \sqrt{M}$

Confidence Intervals (3)

q However, we usually do not know the population's standard deviation
q So, we estimate it using the sample's (observed) standard deviation:

$$s^2 = \frac{1}{M-1}\sum_{m=1}^{M}\left(X_m - \bar{X}\right)^2$$

q Given $\bar{X}, s^2$ we can now find upper and lower tails of normal distributions containing α × 100% of the mass

Interpretation of Confidence Intervals

q If we calculate confidence intervals as in the above recipe for α = 0.05, 95% of the confidence intervals thus computed will contain the true (unknown) population mean

Computing a Sample's Variance Simplified

q Computing the sum of squared samples leads to simpler computation:

$$
\begin{aligned}
s^2 &= \frac{1}{M-1}\sum_{i=1}^{M}\left(X_i - \bar{X}\right)^2 \\
    &= \frac{1}{M-1}\sum_{i=1}^{M}\left(X_i^2 - 2X_i\bar{X} + \bar{X}^2\right) \\
    &= \frac{1}{M-1}\left(\sum_{i=1}^{M} X_i^2 - 2\bar{X}\sum_{i=1}^{M} X_i + \bar{X}^2\sum_{i=1}^{M} 1\right) \\
    &= \frac{1}{M-1}\left(\sum_{i=1}^{M} X_i^2 - 2\bar{X}M\bar{X} + M\bar{X}^2\right) \\
    &= \frac{\sum_{i=1}^{M} X_i^2 - M\bar{X}^2}{M-1}
\end{aligned}
$$

But: This may lead to numerical problems, if numbers get big and variance is small!

Computing a Sample's Variance (Numerically Stable)

q In 1962, B. P. Welford proposed a method in an article that has been incorporated into Donald Knuth's book "The Art of Computer Programming" (Volume 2, 3rd edition, page 232): for $2 \le k \le n$, the k-th estimate of the variance is obtained from

$$M_1 = x_1; \quad S_1 = 0$$

$$M_k = M_{k-1} + \frac{x_k - M_{k-1}}{k}$$

$$S_k = S_{k-1} + (x_k - M_{k-1}) \cdot (x_k - M_k)$$

$$s^2 = \frac{S_k}{k-1}$$

(see also: http://www.johndcook.com/standard_deviation.html)
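The recurrence translates almost literally into code. This is a minimal sketch; the function name `welford_variance` and the test data are assumptions made for illustration.

```python
def welford_variance(xs):
    """Sample variance via Welford's recurrence; needs at least two values."""
    it = iter(xs)
    mean = float(next(it))   # M_1 = x_1
    s = 0.0                  # S_1 = 0
    k = 1
    for x in it:
        k += 1
        delta = x - mean           # x_k - M_{k-1}
        mean += delta / k          # M_k = M_{k-1} + (x_k - M_{k-1}) / k
        s += delta * (x - mean)    # S_k = S_{k-1} + (x_k - M_{k-1}) * (x_k - M_k)
    if k < 2:
        raise ValueError("need at least two samples")
    return s / (k - 1)

# Hypothetical data with big values and small variance (exact variance: 30).
print(welford_variance([1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16]))
```

Because only differences to the running mean are squared, the intermediate values stay small, and the samples can be processed one at a time without storing them.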


Confidence Intervals: The Recipe (1)

q Given samples $X_1, \dots, X_M$ (e.g., having repeated the simulation M times), compute

$$\bar{X} = \frac{1}{M} \sum_{j=1}^{M} X_j$$

$$s^2 = \frac{1}{M-1} \sum_{m=1}^{M} (X_m - \bar{X})^2$$

q 95% confidence interval:

$$\bar{X} \pm 1.96 \frac{s}{\sqrt{M}}$$
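As a sketch, the recipe amounts to three lines of arithmetic. The function name `confidence_interval_95` and the sample values are hypothetical; note that the example uses only eight values to keep the numbers readable, whereas the normal approximation is intended for larger M.

```python
import math

def confidence_interval_95(samples):
    """Return (lower, upper) of mean +/- 1.96 * s / sqrt(M)."""
    m = len(samples)
    mean = sum(samples) / m
    s2 = sum((x - mean) ** 2 for x in samples) / (m - 1)   # sample variance
    half_width = 1.96 * math.sqrt(s2 / m)                  # 1.96 * s / sqrt(M)
    return mean - half_width, mean + half_width

# Hypothetical results of eight simulation runs.
low, high = confidence_interval_95([2, 4, 4, 4, 5, 5, 7, 9])
print(f"{low:.3f} .. {high:.3f}")
```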


Confidence Intervals: The Recipe (2)

q Why does this work?
q If the $X_i$ have been observed by running the same simulation program with different seed values for the random number generation (with enough distance between the seed values), then they are independent and identically distributed (i.i.d.) with some mean $\mu$ and variance $\sigma^2$
q With

$$\bar{X}_M = \frac{1}{M} \sum_{i=1}^{M} X_i$$

we have:

$$E[\bar{X}_M] = E\left[\frac{1}{M} \sum_{i=1}^{M} X_i\right] = \frac{1}{M} E\left[\sum_{i=1}^{M} X_i\right] = \frac{1}{M} \sum_{i=1}^{M} E[X_i] = \frac{1}{M} M \mu = \mu$$


Confidence Intervals: The Recipe (3)

q Furthermore,

$$VAR\left[\sum_{i=1}^{M} X_i\right] = M \times VAR[X_i] = M \sigma^2$$

and

$$VAR[\bar{X}_M] = VAR\left[\frac{1}{M} \sum_{i=1}^{M} X_i\right] = \frac{1}{M^2} M \sigma^2 = \frac{\sigma^2}{M}$$

q Because of the central limit theorem, for large M the $\bar{X}_M$ are normally distributed with expected value $\mu$ and standard deviation $\sigma / \sqrt{M}$
q In practice, depending on your setting, this gives a good estimate already for M>30


Confidence Intervals: The Recipe (4)

q Consider now the random variable

$$Z = \frac{\bar{X}_M - \mu}{\sigma / \sqrt{M}}$$

q Z is normally distributed with mean 0 and variance 1
q We now look for $-Z_{\alpha/2}$ and $Z_{\alpha/2}$ so that

$$P\left[-Z_{\alpha/2} \le Z \le Z_{\alpha/2}\right] = 1 - \alpha$$

q From the table of the N(0,1) normal distribution we obtain the value $Z_{\alpha/2}$ for a given $\alpha$ (~the amount of confidence we are aiming at)
q For $\alpha = 0.05$, we obtain $Z_{\alpha/2} = 1.96$
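Instead of a printed table, the critical value can also be computed. The sketch below uses `statistics.NormalDist` from the Python standard library (available since Python 3.8); the helper name `z_critical` is an assumption for illustration.

```python
from statistics import NormalDist

def z_critical(alpha):
    """Value z with P(-z <= Z <= z) = 1 - alpha for Z ~ N(0, 1)."""
    # The upper tail holds alpha/2 of the mass, so z is the (1 - alpha/2) quantile.
    return NormalDist().inv_cdf(1 - alpha / 2)

for alpha in (0.10, 0.05, 0.01):
    print(alpha, round(z_critical(alpha), 3))
```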


Confidence Intervals: The Recipe (5)

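q This leads to our formula for the confidence interval:

$$P\left[-Z_{\alpha/2} \le Z \le Z_{\alpha/2}\right] = 1 - \alpha$$

$$\Leftrightarrow P\left[-Z_{\alpha/2} \le \frac{\bar{X}_M - \mu}{\sigma / \sqrt{M}} \le Z_{\alpha/2}\right] = 1 - \alpha$$

$$\Leftrightarrow P\left[-\bar{X}_M - Z_{\alpha/2} \frac{\sigma}{\sqrt{M}} \le -\mu \le -\bar{X}_M + Z_{\alpha/2} \frac{\sigma}{\sqrt{M}}\right] = 1 - \alpha$$

$$\Leftrightarrow P\left[\bar{X}_M - Z_{\alpha/2} \frac{\sigma}{\sqrt{M}} \le \mu \le \bar{X}_M + Z_{\alpha/2} \frac{\sigma}{\sqrt{M}}\right] = 1 - \alpha$$

q For $\alpha = 0.05$:

$$\bar{X}_M \pm 1.96 \frac{\sigma}{\sqrt{M}}$$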


Generating Confidence Intervals for Steady State Measures

q In order to obtain enough i.i.d. observations, we have two simple alternatives (others exist):
q Independent replications: perform M independent runs with different seeds (also remember to delete k observations from the transient phase):
  ■ However, if we want to use the central limit theorem, we should perform at least 30 independent simulation runs...
  ■ Alternatively, we can use a Student-t distribution instead of the normal distribution if M<30 (see next slides)
q Batch means: take a single run, delete the first k observations, divide the remainder into n groups and obtain $X_i$ for the i-th group, i = 1,…,n
  ■ Follow the procedure for independent replications
  ■ Complication due to non-independence of the $X_i$
  ■ Potential efficiency gain, since only k observations are deleted (once, rather than in every run)
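The batch means procedure can be sketched as follows. The function name `batch_means`, the choice to ignore leftover observations that do not fill a complete batch, and the input data are assumptions for illustration; the returned batch means would then be fed into the same confidence interval computation as independent replications.

```python
def batch_means(observations, k, n):
    """Drop the first k observations, split the rest into n equal batches,
    and return the mean of each batch."""
    steady = list(observations)[k:]      # discard the transient phase
    size = len(steady) // n              # observations per batch
    if size == 0:
        raise ValueError("not enough observations for n batches")
    # Observations that do not fill a complete batch are ignored.
    return [sum(steady[i * size:(i + 1) * size]) / size for i in range(n)]

# Hypothetical single run: 10 observations, first 2 treated as transient.
print(batch_means(range(10), k=2, n=4))
```

Larger batches tend to reduce the correlation between neighbouring batch means, which is the complication noted above.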


Confidence Intervals with Student-T Distribution (1)

q When the sample size is relatively small, however, the approximation of the confidence interval with the normal distribution is poor
q In this case, one can make use of the Student-t distribution
q If $\bar{X}$ is the sample mean and S the sample standard deviation of a random sample of size M from a normally distributed population with mean $\mu$, then the random variable

$$T = \frac{\bar{X} - \mu}{S / \sqrt{M}}$$

has a Student-t distribution with M-1 degrees of freedom
q Thus, the $100(1-\alpha)\%$ confidence interval of $\mu$ can be obtained as

$$\bar{X} - t_{M-1;\alpha/2} \frac{S}{\sqrt{M}} < \mu < \bar{X} + t_{M-1;\alpha/2} \frac{S}{\sqrt{M}}$$

(source: [Tri02])


Confidence Intervals with Student-T Distribution (2)

q In the above formula, $t_{n-1;\alpha/2}$ is defined such that the area under the t probability density function to its right is equal to $\alpha/2$, that is

$$P(T > t_{n-1;\alpha/2}) = \frac{\alpha}{2}$$

q This value can be read from a table (see next slide)
q Example:
  q Estimate the average execution time of a program that was run six times with randomly chosen data sets, obtaining a sample mean $\bar{X}$ = 230 ms and a sample standard deviation of s = 14 ms
  q To obtain a 98% confidence interval of the true mean execution time $\mu$, we read $t_{5;0.01}$ from the table of the Student-t distribution with n-1 = 5 degrees of freedom to be 3.365, leading to the following confidence interval:

$$230 - 3.365 \times \frac{14}{\sqrt{6}} < \mu < 230 + 3.365 \times \frac{14}{\sqrt{6}}$$

or

$$210.767 < \mu < 249.233$$

(with 98% confidence)
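The arithmetic of this example can be reproduced with a short sketch. The critical value 3.365 is taken from the example above as a given input (the Python standard library offers no Student-t quantile function); the function name `t_confidence_interval` is hypothetical.

```python
import math

def t_confidence_interval(mean, s, n, t_crit):
    """Interval mean +/- t_crit * s / sqrt(n) for a given table value t_crit."""
    half_width = t_crit * s / math.sqrt(n)
    return mean - half_width, mean + half_width

# Values from the example: six runs, mean 230 ms, s = 14 ms, t_{5;0.01} = 3.365.
low, high = t_confidence_interval(mean=230.0, s=14.0, n=6, t_crit=3.365)
print(f"{low:.3f} < mu < {high:.3f}")
```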


Important Values of the Student-T Distribution

df = degrees of freedom


Thought experiment: Medians and confidence intervals

q Can we use confidence intervals to estimate the position of the true median of our samples?
q Theory: Why not?
  q Median and average value are equal in a normal distribution
  ➡ Same confidence intervals
q Practice: Why would we use the median then?
  q The median is usually used if the distribution is unknown!


Confidence intervals for median values (I)

q Given n independent samples of some experimental outcome (no assumption on distribution)
q Observation: The (n/2)th value of the sorted samples (sample median) is a good estimator for the true median of the distribution
q Confidence?
  q By definition, each independent sample is larger or smaller than the true median with probability 0.5
  q Probability of all n samples being bigger than the true median:

$$P(\text{median} < \min(X)) = 0.5^n$$

(Figure: samples X1, X0, X4, X3, X2 marked on an axis; the position of the true median is unknown)


Confidence intervals for median values (II)

q Probability of all n samples being smaller than the true median:

$$P(\text{median} > \max(X)) = 0.5^n$$

q Probability of the median being between min(X) and max(X):

$$P(\min(X) < \text{median} < \max(X)) = 1 - 2 \times 0.5^n$$

q This is a confidence interval!
q Example confidence for n = 5: 93.75%
q (even though the confidence level is not chosen in advance; it follows from n)


Confidence intervals for median values (III)

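q How large is the confidence for $P(X_4 < \text{median} < X_2)$?
q More general approach: binomial distribution!
q Calculate the probability of each possible outcome, i.e., of exactly k of the n = 6 samples lying below the true median; the number of ways in which this can happen is:

$$\binom{6}{2} = 15, \quad \binom{6}{3} = 20, \quad \binom{6}{4} = 15$$

(Figure: samples X1, X0, X4, X3, X2, X5 marked on an axis; the position of the true median is unknown)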


Confidence intervals for median values (IV)

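q Sum of probabilities, where $X_x$ and $X_y$ denote the x-th and y-th smallest of the n samples:

$$P(X_x < \text{median} < X_y) = \sum_{k=x}^{y-1} \binom{n}{k} 0.5^n$$

q In the example (n = 6; the samples labelled $X_4$ and $X_2$ are the 2nd and 5th smallest, so k runs from 2 to 4):

$$P(X_4 < \text{median} < X_2) = 0.5^6 \times \left( \binom{6}{2} + \binom{6}{3} + \binom{6}{4} \right) = 0.78125$$

q No assumptions on distribution required!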
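The binomial sum is straightforward to evaluate in code. In this sketch, the function name `median_confidence` is an assumption, and x and y are the ranks of the two bounding samples in sorted order (1 = smallest).

```python
from math import comb

def median_confidence(n, x, y):
    """P(x-th smallest sample < median < y-th smallest sample) for n i.i.d. samples."""
    if not 1 <= x < y <= n:
        raise ValueError("need 1 <= x < y <= n")
    # k counts how many samples lie below the true median.
    return sum(comb(n, k) for k in range(x, y)) * 0.5 ** n

print(median_confidence(6, 2, 5))   # the example: 2nd and 5th smallest of 6 samples
print(median_confidence(5, 1, 5))   # min(X) and max(X) of 5 samples
```

Choosing x = 1 and y = n reproduces the min/max interval with confidence $1 - 2 \times 0.5^n$.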


Additional References

[BLK02] M. L. Berenson, D. M. Levine, T. K. Krehbiel. Basic Business Statistics. Course slides to the ninth edition of the book. Prentice-Hall, 2002. http://myphlip.pearsoncmg.com/cw/mpbookhome.cfm?vbookid=462
[Jain91] R. Jain. The Art of Computer Systems Performance Analysis: Techniques for Experimental Design, Measurement, Simulation, and Modeling. Wiley, 1991.
[Kar04] H. Karl. Praxis der Simulation. Course slides, Technische Universität Berlin.
[Rob01] T. Robinson. Probability. Course slides, University of Bath. http://www.bath.ac.uk/~masar/UNIV0037/probability.ppt
[Tow04] D. Towsley. Network Simulation. Slides for the course "Advanced Foundations of Computer Networks", University of Massachusetts, USA, 2004. http://www-net.cs.umass.edu/cs653/
[Tri02] K. S. Trivedi. Probability and Statistics with Reliability, Queueing and Computer Science Applications. Wiley, 2002.
[Var04] A. Varga. OMNeT++: Object-Oriented Discrete Event Simulator. http://www.omnetpp.org/