Introduction to Simulation of
Telecommunication Networks
4hr-Seminar in the course «Switching and Routing» Massimo Tornatore
(Courtesy of Prof. Fabio Martignon)
Politecnico di Milano
Department of Electronics, Information and Bioengineering (DEIB)
Introduction to Simulation 2
What is simulation?
Systems, models and variables
Discrete-event simulation
Generation of pseudo-random numbers
– Synthesis of random variables
Statistical Analysis
– Statistical confidence of simulative results
Next Lecture: example of C/C++ simulator
Introduction to Simulation 3
Why, having studied queuing theory, do we also need simulation?
Isn’t queuing theory sufficient to determine the performance of telecommunication networks?
No. In fact:
Queuing theory can describe, and give results for, only a small ensemble of systems
How can we study a complex system, such as:
– Queues with non-Poisson arrivals: bursty arrivals, arrivals in groups, back-off of rejected requests, etc.
– Queues with complex queue-management mechanisms (PQ, WFQ, RED, etc.)
– Queue networks that do not meet Jackson’s assumptions (steady state)
– Analysis of the transient behavior of queuing systems
Introduction to Simulation 4
In addition, there are network systems that cannot be easily described by queuing models, e.g.:
– Access interface of wireless systems (GSM, WLAN, etc.) with errors due to channel characteristics and interference
– Dynamic routing mechanisms for IP networks
– Congestion control mechanisms (e.g., TCP)
– Admission control mechanisms
– Complex retransmission protocols such as, for example, protocols with piggybacking, selective reject, etc.
When you start from a real system, it is quite difficult to find a model that can be solved with queueing theory alone
Introduction to Simulation 5
What is SIMULATION: simulation seeks to build an experimental device
that behaves like the real system under study in some important respects
Examples:
– scale models of airplanes, cars or trains, used in
wind tunnels
– SimCity, Railroad Tycoon, and other videogames
based on reproducing how a system works
– flight simulators for pilot training
Introduction to Simulation 6
Other:
– predicting the development of ecosystems after artificial alteration
– verification of stock-exchange tactics
– weather forecasts
– verification of battle tactics
– etc.
Introduction to Simulation 7
System
– a very general concept that can be defined informally as a collection of parts, called components, that interact with each other
Model
– a representation of the system. This representation can take many forms (e.g., a physical replica), but here we focus on representation by means of mathematical or software/simulative models
State
– the system state describes the current state of all its components
– to each system state there corresponds a state of the system model, and the model represents the evolution of the system through the history of its state changes
The level of abstraction of a model indicates that some features of the system state are omitted
– The level of abstraction is tightly related to the measures the model is aimed at
– The best model is simply the simplest model that yields the measures (performance) you want
Introduction to Simulation 9
Variables
– the activities of the model are described as relationships or functions between variables
– a mathematical model is described using variables; the same holds for a simulative model
– State variables
  – state variables define the state of the model
  – their evolution defines the evolution of the system
– Input variables
  – input variables describe external stimuli acting on the system under consideration
Introduction to Simulation 10
– Output variables
  – are a function of the state variables and the input variables
  – they represent, therefore, the probes inserted in the model for measurement
  – solving the model means obtaining the values of the output variables
  – the analytical solution of a model involves, e.g., mathematical methods for solving the equations that describe the relationships between variables
  – the simulated solution of a model reproduces the evolution of the system by evolving the state variables and directly measuring the output variables
Introduction to Simulation 11
Simulation Properties:
– simulation is “descriptive”, not “prescriptive”
– simulation provides information on the behavior of the system, given the parameters
– simulation DOES NOT tell you how to set the parameters for the best system behavior, or how to test the limits of the system
– Example: M/M/1
  – from the analytical average delay $D = \frac{1}{\mu - \lambda}$ (arrival rate $\lambda$, service rate $\mu$) I immediately see the capacity limit
  – if I simulate the system, I have to run many simulations, increasing the load, until I find the limit
Introduction to Simulation 12
There are many ways to classify simulations, e.g.:
– Deterministic vs. Stochastic
– Continuous time vs. Discrete time
Deterministic vs. Stochastic simulations:
– deterministic simulations are completely defined by the model, and their evolution is deterministically determined by the input parameters
– stochastic simulations are based on models that include random variables or processes, so they require the generation of random variables; the evolution of the model depends on the input parameters and on the generated random variables
Introduction to Simulation 13
Examples
– Deterministic simulation:
  – Consider the motion of the billiard balls on a pool table: knowing the force of impact of the cue, we can simulate the outcome (deterministic model)
– Stochastic simulation:
  – Consider a GSM cell with N channels (i.e., it acts as a phone concentrator) to which connection requests arrive according to a Poisson process with rate $\lambda$
  – We want to determine the call blocking probability, knowing that, with probability p, rejected calls retry after a time equal to T
Introduction to Simulation 14
Stochastic simulations can be classified as
static or dynamic
Static Simulations
– also called Monte Carlo simulations
– the time variable plays no role
– the basic objective is to determine some statistical characteristics of one or more random variables
– in fact, Monte Carlo simulations typically evaluate statistical measures through independently repeated experiments
Introduction to Simulation 15
Dynamic simulations
– also called temporal simulations
– time becomes the main variable, tied to the evolution of the model
– the purpose is to collect statistics for random processes observed at different times
Introduction to Simulation 16
Simple Monte Carlo example (it can be
solved analytically):
– Consider a slotted random multiple access system with 10
users of type A, 10 users of type B and 13 users of type C.
– Users A, B and C have a packet ready for transmission in
each slot with probability 3p, 2p and p, respectively.
– Find the throughput of the system given p.
– Determine the value of p that maximizes the throughput.
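The slotted-access example can be sketched in C as follows. This is a minimal illustration, assuming that the throughput is the probability that exactly one user transmits in a slot (a collision-free slot); the function names and the small Park-Miller generator used for reproducibility are choices made for this sketch, not part of the original slides.

```c
#include <stdint.h>

/* Park-Miller generator X(n+1) = 16807 * X(n) mod (2^31 - 1); assumption of this sketch. */
static uint32_t pm_state = 1u;

static double uniform01(void)
{
    pm_state = (uint32_t)(((uint64_t)pm_state * 16807u) % 2147483647u);
    return (double)pm_state / 2147483647.0;
}

/* Analytical throughput: probability that exactly one of the 33 users transmits. */
double throughput_exact(double p)
{
    const double q[3] = { 3.0 * p, 2.0 * p, p };   /* types A, B, C */
    const int count[3] = { 10, 10, 13 };
    double total = 0.0;
    for (int t = 0; t < 3; t++) {
        /* one user of type t transmits, all the other users stay silent */
        double term = count[t] * q[t];
        for (int s = 0; s < 3; s++) {
            int silent = count[s] - (s == t ? 1 : 0);
            for (int k = 0; k < silent; k++)
                term *= 1.0 - q[s];
        }
        total += term;
    }
    return total;
}

/* Monte Carlo estimate: fraction of independent slots with exactly one transmission. */
double throughput_mc(double p, long slots)
{
    const double q[3] = { 3.0 * p, 2.0 * p, p };
    const int count[3] = { 10, 10, 13 };
    long successes = 0;
    for (long n = 0; n < slots; n++) {
        int transmitters = 0;
        for (int t = 0; t < 3; t++)
            for (int k = 0; k < count[t]; k++)
                if (uniform01() < q[t])
                    transmitters++;
        if (transmitters == 1)
            successes++;
    }
    return (double)successes / (double)slots;
}
```

Sweeping p over a grid and keeping the value with the largest estimate is one simple way to approach the second question; with the analytical function available, the Monte Carlo estimate can be checked against it.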
Introduction to Simulation 17
Another (more complex) Monte Carlo
simulation example
– Consider a cellular packet system in which, in each time interval, a packet is transmitted with probability G by a mobile user placed in a random position.
– The attenuation of the channel is a function of distance and of a random factor (fading), and the transmitted power is fixed. The packet is received correctly if the signal-to-interference ratio is greater than 6 dB.
– Determine the probability of a successful transmission.
– If we want to determine the value of G that maximizes the throughput, we have to repeat the experiment for various values of G
Introduction to Simulation 18
Example of dynamic simulation (1):
– Consider a queue system:
  – 1 server
  – queue size: K packets
  – interarrival times are uniformly distributed between two values a and b
  – each arrival is composed of a number of users x, where x is a binomial random variable with parameter p
– Determine the packet loss probability and the average packet delay
Introduction to Simulation 19
Example of dynamic simulation (2):
– Consider a slotted polling system with:
  – N queues
  – Packet length x randomly distributed according to a known pdf
  – Arrivals following a Poisson process with rate $\lambda$
– Assuming that:
  – The server adopts a policy of priority queuing (i.e., it continues to serve a queue until it is empty)
  – The server chooses the next queue according to a «longest-queue» policy
– Determine the average transfer time through the system
Introduction to Simulation 20
Discrete-event dynamic simulation
The system state changes in response to «events»
E.g., telecom network simulation
Continuous-time dynamic simulation
The system state evolves in response to the change of a continuous-time variable
E.g., weather forecast
Introduction to Simulation 21
Simulation of discrete events is of fundamental importance for telecommunication networks
In discrete-event simulation the state variables change value only at discrete instants of time
A change of the system state is called an event and is characterized by an instant of occurrence
– An event has no duration
After an event occurs, an activity starts in the system and persists for some time
– An activity is usually characterized by a start event and an end event
– For example, the beginning and the end of the transmission of a packet are events, while the transmission itself is an activity
Introduction to Simulation 22
In discrete-event simulation we should:
– define the types of events that can occur
– define the changes in the system state associated with each event
– define a time variable and an ordering of events in a calendar based on the instant of event occurrence
– define an initial state
– scroll through the calendar and, each time an event occurs, apply the state changes associated with that event
– take measurements on the output variables
Introduction to Simulation 23
Model:
– Queue system with a server and an infinite queue
– Input variables:
  – interarrival times of requests (packets)
  – service times of requests
– State variables:
  – number of requests in the system
– Initial state:
  – e.g., no user in the system
– Output variables:
  – average time spent by a packet in the system
Introduction to Simulation 24
Events
– 1. first arrival
  – Set service start
  – Set service end
– 2. from the second arrival on
  – In this case we should act differently on the basis of the state:
  – Arrival
    » in an empty system → immediate service start
    » in a non-empty system → add a packet to the queue
  – End (see next slides)
    » with empty queue
    » with non-empty queue
Introduction to Simulation 25
Filling the calendar of events
– PROBLEM: it is not possible to place the end
service events of each request without knowing the system state
– SOLUTION: the calendar can be filled with new
events while other events are pending
– Example: a packet is queued. When the transmission start time is reached, then the end time is scheduled
Introduction to Simulation 26
when we have an «arrival» event, we increase the number of users, then
– if the system was empty, a new end-of-service event is inserted in the calendar at a time equal to «CLOCK + service time»
– if the system was busy, the packet is added to the queue
when a service-end event is reached, we decrease the number of users, then
– if the queue is empty, no action is taken
– if the queue is not empty, a new end-of-service event is inserted in the calendar at a time equal to «CLOCK + service time»
Introduction to Simulation 27
Measurement of output variables
– User arrival: store the arrival time
– End of service: compute the time spent in the system (CLOCK - t_arrival)
– Weighted average of the number of users in the system over each activity interval («time slices»)
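The arrival and end-of-service rules for the single-server queue can be sketched as an event-driven loop. This is a minimal sketch under stated assumptions: interarrival and service times are passed in as arrays (in a stochastic simulation they would be synthesized from pseudo-random numbers), service is FIFO, and the calendar is reduced to the two pending events (next arrival, next end of service). The function name `simulate_queue` is hypothetical.

```c
#include <stddef.h>
#include <float.h>   /* DBL_MAX marks "no event scheduled" */

/* Returns the average time spent in the system by n packets. */
double simulate_queue(const double *interarrival, const double *service, size_t n)
{
    if (n == 0)
        return 0.0;

    double clock = 0.0;
    double next_arrival = interarrival[0];
    double next_departure = DBL_MAX;        /* server idle: no end event pending */
    double head_arrival = interarrival[0];  /* arrival time of the next packet to leave */
    double total_time = 0.0;
    size_t arrived = 0, departed = 0, in_system = 0;

    while (departed < n) {
        if (next_arrival <= next_departure) {
            /* arrival event */
            clock = next_arrival;
            in_system++;
            if (in_system == 1)             /* system was empty: schedule end of service */
                next_departure = clock + service[arrived];
            arrived++;
            next_arrival = (arrived < n) ? clock + interarrival[arrived] : DBL_MAX;
        } else {
            /* end-of-service event */
            clock = next_departure;
            in_system--;
            total_time += clock - head_arrival;   /* CLOCK - t_arrival */
            departed++;
            if (departed < n)
                head_arrival += interarrival[departed];
            /* queue not empty: start the next service, otherwise the server goes idle */
            next_departure = (in_system > 0) ? clock + service[departed] : DBL_MAX;
        }
    }
    return total_time / (double)n;
}
```

With interarrival times {1, 1, 1} and service times {2, 2, 2}, the packets spend 2, 3 and 4 time units in the system, so the function returns 3.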
Introduction to Simulation 28
Note: the correct inclusion of a new event in
the calendar is a critical operation if the calendar has many events
– Efficient techniques must be used for inclusion of
an element in an ordered list
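One common way to keep insertion cheap when the calendar holds many events is a binary min-heap ordered by event time, which gives logarithmic-time insertion and extraction. The sketch below is only one possible technique, not necessarily the one intended in the slides; the `Event` and `Calendar` types, the fixed capacity and the function names are assumptions made for illustration.

```c
#include <stddef.h>

#define CALENDAR_CAPACITY 1024   /* hypothetical fixed capacity for this sketch */

typedef struct {
    double time;   /* instant of occurrence */
    int    type;   /* event type, e.g. 0 = arrival, 1 = end of service */
} Event;

typedef struct {
    Event  heap[CALENDAR_CAPACITY];
    size_t size;
} Calendar;

/* Insert an event; returns 0 on success, -1 if the calendar is full. */
int calendar_push(Calendar *c, Event e)
{
    if (c->size >= CALENDAR_CAPACITY)
        return -1;
    size_t i = c->size++;
    /* sift up: move parents down while they occur later than e */
    while (i > 0) {
        size_t parent = (i - 1) / 2;
        if (c->heap[parent].time <= e.time)
            break;
        c->heap[i] = c->heap[parent];
        i = parent;
    }
    c->heap[i] = e;
    return 0;
}

/* Remove the earliest event into *out; returns 0 on success, -1 if empty. */
int calendar_pop(Calendar *c, Event *out)
{
    if (c->size == 0)
        return -1;
    *out = c->heap[0];
    Event last = c->heap[--c->size];
    size_t i = 0;
    /* sift down: promote the earlier child until `last` fits */
    for (;;) {
        size_t child = 2 * i + 1;
        if (child >= c->size)
            break;
        if (child + 1 < c->size && c->heap[child + 1].time < c->heap[child].time)
            child++;
        if (last.time <= c->heap[child].time)
            break;
        c->heap[i] = c->heap[child];
        i = child;
    }
    if (c->size > 0)
        c->heap[i] = last;
    return 0;
}
```

A sorted linked list would also work, but its insertion cost grows linearly with the number of pending events.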
Introduction to Simulation 29
Advancing the «CLOCK» variable
– «clock driven» simulations
  – the CLOCK variable is always increased by a fixed step
    » E.g., slotted systems
– «event driven» simulations
  – the CLOCK variable is increased by the time interval between the occurrence of an event and the occurrence of the following event
Note:
– from a computational-time point of view, either method may be more convenient, depending on the model
Introduction to Simulation 30
Considering the power/efficiency of modern programming languages and computing systems, simulation is today a powerful analysis tool for addressing complex problems
But simulation is also a tool that should be used with care, for the following reasons:
– it is not easy to validate the obtained results
– the computational time can easily become very high
– it is not easy to understand how different parameters affect the result
Introduction to Simulation 31
The simulation of a stochastic model involves the use of random input variables → we need the statistical distribution of the input variables
So, for computer simulation, pseudo-random number generation and synthesis of random variables are needed (the next topic ...)
– Example: traffic entering a queue system, described by the arrival process and the process of service times
Introduction to Simulation 32
What is simulation?
Systems, models and variables
Discrete-event simulation
Generation of pseudo-random numbers
– Synthesis of random variables
Statistical Analysis
– Statistical confidence of simulative results
Introduction to Simulation 33
When the model to be analyzed via simulation is stochastic, two important problems arise:
– the generation of pseudo-random numbers to be used for the generation of input variables
– the statistical analysis of the results obtained through the output variables
Basic statistics used in the following:
– Average (mean), variance
– Concept of random variable
– Probability density function (pdf) f(x)
– Cumulative distribution function (CDF) F(x)
– Central limit theorem
– Gaussian (normal) and t-student distributions
Introduction to Simulation 35
Strictly speaking, numbers generated by a computer cannot be random, due to the deterministic nature of a computer
We can however generate pseudo-random sequences that pass a series of statistical tests
Introduction to Simulation 36
The problem of generating pseudo-random numbers can be logically divided into two parts:
– generation of sequences of random numbers uniformly distributed between 0 and 1
– generation of sequences of random numbers distributed according to an arbitrary distribution
  – Poisson, Bernoulli, Weibull, Exponential, etc.
Introduction to Simulation 37
Pseudo-random sequences are obtained through the implementation of recursive formulas
Some history
The first method to generate random sequences was Von Neumann's «middle-square» method
The next number is obtained by squaring the previous number and taking the central digits
Introduction to Simulation 38
Example: $x_0 = 3456$, which squared gives $(x_0)^2 = 11943936$, so $x_1 = 9439$, and so on, obtaining a sequence of random numbers
This method was abandoned: it is difficult to analyze, relatively slow and statistically unsatisfactory
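The step from 3456 to 9439 can be reproduced with a few lines of C. This is a minimal sketch for 4-digit numbers only; the function name is hypothetical.

```c
#include <stdint.h>

/* One step of the method for 4-digit numbers: square the input (at most
   8 digits, since 9999^2 = 99980001) and keep the 4 central digits. */
uint32_t middle_square_next(uint32_t x)
{
    uint32_t square = x * x;
    return (square / 100u) % 10000u;   /* drop the 2 lowest digits, keep the next 4 */
}
```

Starting from 3456 the function returns 9439, as in the example above.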
Factors determining the quality of a method:
1) numbers must be uniformly distributed (i.e., they must have the same probability of occurring)
2) numbers must be statistically independent
3) we must be able to reproduce the sequence
4) the sequence must be of arbitrary length
5) the method should be quickly executable by the computer and must consume a small amount of memory
Introduction to Simulation 39
Introduction to Simulation 40
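Let’s recall some basic math operators
Modulo: $x \bmod m = x - m \lfloor x/m \rfloor$
Congruence: $x \equiv y \pmod{m}$ if and only if $x \bmod m = y \bmod m$
– Property of the modulo: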
Introduction to Simulation 41
Linear Congruential Method (Lehmer 1948)
$X_{n+1} = (a X_n + c) \bmod m$
That is:
$X_{n+1} \equiv a X_n + c \pmod{m}$
Introduction to Simulation 42
Example:
$X_0 = a = c = 7$, $m = 10$: $\{X_n\}_{n \in \mathbb{N}} = 7, 6, 9, 0, 7, 6, 9, \ldots$
Note: $0 \le X_i < m$, for all i
The method is called:
– multiplicative if $c = 0$
– mixed if $c \ne 0$
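The recursion $X_{n+1} = (a X_n + c) \bmod m$ translates directly into C. The sketch below is illustrative only: the `Lcg` struct and function names are assumptions, and the product `a * x` is assumed not to overflow 64 bits (true whenever a and m are below $2^{32}$).

```c
#include <stdint.h>

typedef struct {
    uint64_t a;   /* multiplier */
    uint64_t c;   /* increment  */
    uint64_t m;   /* modulus    */
    uint64_t x;   /* current state X_n */
} Lcg;

/* Advance the generator and return X_{n+1} = (a * X_n + c) mod m. */
uint64_t lcg_next(Lcg *g)
{
    g->x = (g->a * g->x + g->c) % g->m;
    return g->x;
}

/* Advance the generator and return X_{n+1} / m, a value in [0, 1). */
double lcg_uniform(Lcg *g)
{
    return (double)lcg_next(g) / (double)g->m;
}
```

With a = c = 7, m = 10 and initial state 7, successive calls to `lcg_next` give 6, 9, 0, 7, reproducing the short period of the example.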
Introduction to Simulation 43
Drawbacks of the linear congruential method
As soon as $X_p = X_0$, the sequence repeats periodically; p is the period of the sequence
Since $X_n < m$, the period is less than or equal to m
Note (1): if p = m, then every number between 0 and m-1 appears exactly once in each period
Note (2): to obtain a sequence in [0,1): $R_n = X_n / m$
Introduction to Simulation 44
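We can relate $X_n$ directly to $X_0$:
$X_1 = (a X_0 + c) \bmod m$
$X_2 = (a^2 X_0 + (a + 1)\,c) \bmod m$
$X_3 = (a^3 X_0 + (a^2 + a + 1)\,c) \bmod m$
$X_n = \left(a^n X_0 + \frac{a^n - 1}{a - 1}\,c\right) \bmod m$
This emphasizes even more the deterministic nature of the sequence!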
How to set the multiplier a and the increment c:
a and c strongly influence the period and the statistical properties of the sequence
there are rules for choosing a and c that give period p = m (full period)
Criteria to ensure optimality (full period):
1. The parameters c and m must be co-prime, i.e.: gcd(c, m) = 1
2. Every prime divisor of m must divide (a-1)
   Ex: if m = 10, its prime factors are 2 and 5, so (a-1) must be a multiple of both 2 and 5
3. If m is a multiple of 4, (a-1) must also be a multiple of 4
Introduction to Simulation 45
It is not easy to find values that satisfy (1), (2), (3)
Ex: m = 10, a = 21, c = 3 ($X_n$ = 3, 6, 9, 2, 5, 8, 1, 4, 7, 0, ...)
Some researchers have therefore identified the following values in accordance with these criteria:
– KNUTH: $m = 2^{31}$; $a = \mathrm{int}(\pi \cdot 10^8)$; $c = 453806245$
– GOODMAN/MILLER: $m = 2^{31} - 1$; $a = 7^5$; $c = 0$
– GORDON: $m = 2^{31}$; $a = 5^{13}$; $c = 0$
– LEORMONT/LEWIS: $m = 2^{31}$; $a = 2^{16} + 3$; $c = 0$
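The three criteria for a full period can be checked mechanically, and for a small modulus the period can also be measured by brute force. The sketch below is illustrative; the function names are hypothetical, and the arithmetic assumes a and m are below $2^{32}$ so that products fit in 64 bits.

```c
#include <stdint.h>

/* Greatest common divisor (Euclid). */
static uint64_t gcd_u64(uint64_t x, uint64_t y)
{
    while (y != 0) {
        uint64_t r = x % y;
        x = y;
        y = r;
    }
    return x;
}

/* Returns 1 if (a, c, m) satisfy the three full-period criteria, 0 otherwise. */
int lcg_meets_criteria(uint64_t a, uint64_t c, uint64_t m)
{
    if (m < 2 || a == 0)
        return 0;
    if (gcd_u64(c, m) != 1)                  /* criterion 1: c and m co-prime */
        return 0;
    uint64_t rest = m;
    for (uint64_t p = 2; p * p <= rest; p++) {
        if (rest % p == 0) {                 /* p is a prime divisor of m */
            if ((a - 1) % p != 0)            /* criterion 2 */
                return 0;
            while (rest % p == 0)
                rest /= p;
        }
    }
    if (rest > 1 && (a - 1) % rest != 0)     /* remaining prime factor, if any */
        return 0;
    if (m % 4 == 0 && (a - 1) % 4 != 0)      /* criterion 3 */
        return 0;
    return 1;
}

/* Brute force: smallest p > 0 with X_p == X_0, or 0 if X_0 does not recur within m steps. */
uint64_t lcg_period(uint64_t a, uint64_t c, uint64_t m, uint64_t x0)
{
    uint64_t start = x0 % m;
    uint64_t x = start;
    for (uint64_t p = 1; p <= m; p++) {
        x = (a * x + c) % m;
        if (x == start)
            return p;
    }
    return 0;
}
```

For m = 10, a = 21, c = 3 the criteria hold and the measured period is 10, while a = c = 7 with m = 10 fails criterion 2 and has period 4 when started from 7.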
Introduction to Simulation 46
Introduction to Simulation 47
A simpler condition:
– if the method is multiplicative (c = 0), it can be shown that if $m = 2^b$ then the maximum period is $p = 2^{b-2}$, for $b \ge 4$
Note: the equivalence of the multiplicative and mixed approaches has been proven
Introduction to Simulation 48
Notes on the choice of the modulus m
m influences the period, because $p \le m$
m also affects the computational speed:
– to compute the next number, we should generally perform a product, a sum and a division
– it is possible to do everything together if you choose as modulus the maximum integer representable by the computer plus one
– if b is the number of bits used by the computer, you will choose $m = 2^b$ (e.g., $2^{31}$ for registers with 31 bits)
– with this modulus, the modulo operation corresponds to a truncation
Introduction to Simulation 49
Other methods:
Quadratic congruential method:
– It is based on the generation of numbers congruent modulo m according to a relation of the form: $X_{n+1} = (d X_n^2 + a X_n + c) \bmod m$
Fibonacci or additive method:
– $X_{n+1} = (X_n + X_{n-k}) \bmod m$
Introduction to Simulation 50
Notes on tests for generators:
Tests on pseudo-random number generators are applied to verify that:
– generated numbers are uniformly distributed
– generated numbers are independent
However, classic tests of statistical hypotheses are designed for random variables, and in practice must be run on a finite set of samples
Generally, we assume a hypothesis to be verified if it passes a certain number of tests
Introduction to Simulation 51
Notes on tests for generators:
a typical test to verify a certain statistical distribution is the $\chi^2$ test
– We divide the set of possible values into k categories
– $\hat{Y}_i$ is the number of sample values in the i-th category and $Y_i = n p_i$ is its expected value, where n is the number of samples and $p_i$ is the probability of category i ($p_i = 1/k$ in the uniform case)
a quality index can be defined as
$V = \sum_{i=1}^{k} \frac{(\hat{Y}_i - Y_i)^2}{Y_i} = \sum_{i=1}^{k} \frac{(\hat{Y}_i - n p_i)^2}{n p_i} = \frac{1}{n} \sum_{i=1}^{k} \frac{\hat{Y}_i^2}{p_i} - n$
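The quality index V, the sum over the k categories of (observed count - expected count)² / expected count, can be computed as in the following C sketch. The function name and the choice of passing the observed counts and category probabilities as arrays are assumptions made for illustration.

```c
#include <stddef.h>

/* V = sum over the k categories of (observed_i - n * p_i)^2 / (n * p_i),
   where n is the total number of samples. */
double chi_square_statistic(const unsigned long *observed, const double *p, size_t k)
{
    unsigned long n = 0;
    for (size_t i = 0; i < k; i++)
        n += observed[i];

    double v = 0.0;
    for (size_t i = 0; i < k; i++) {
        double expected = (double)n * p[i];
        double diff = (double)observed[i] - expected;
        v += diff * diff / expected;
    }
    return v;
}
```

For 100 samples in four equiprobable categories with counts 30, 20, 25, 25, the expected count is 25 in each category and V = (25 + 25 + 0 + 0) / 25 = 2.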
Introduction to Simulation 52
Notes on tests for generators: $\chi^2$ test
the problem is that the value of V is itself a random variable, which also depends on the absolute values
it is therefore necessary to repeat the test several times on different samples and evaluate the probability that V takes high values
it can be proven that V has a $\chi^2$ distribution with $\nu = k-1$ degrees of freedom:
$f(x) = \frac{x^{\nu/2 - 1} e^{-x/2}}{2^{\nu/2}\,\Gamma(\nu/2)}, \quad x \ge 0$; with $\Gamma(\nu/2) = (\nu/2 - 1)!$ when $\nu/2$ is an integer
Introduction to Simulation 53
Notes on tests for generators: $\chi^2$ test
If $P_x$ indicates the x% percentile of the $\chi^2$ distribution, we can rank the observations of V according to the table:
– $P_0$-$P_1$, $P_{99}$-$P_{100}$: reject
– $P_1$-$P_5$, $P_{95}$-$P_{99}$: suspicious
– $P_5$-$P_{10}$, $P_{90}$-$P_{95}$: almost suspicious
Introduction to Simulation 54
Notes on tests for generators: gap test
There are several tests to verify independence
A frequently used test is the «gap» test
– we define an event on the observed distribution, such as exceeding a certain threshold
– we estimate the probability p associated with the event
– starting from the sequence of samples, we derive the sequence of (0,1) variables that indicates whether or not the event occurred
Notes on tests for generators: gap test
– Let us consider the lengths of the runs of 0s and of the runs of 1s
– since the distribution of these lengths is geometric, we verify the fit to the distribution using a test (e.g., the $\chi^2$ test)
– or, more simply, we estimate the average run lengths and compare them with those of geometric distributions with parameters 1-p and p, respectively
Example sequence: 000111110010010010010011100001001
Now we have:
A sequence of pseudo-random numbers
– Uniformly distributed between 0 and 1
– That satisfies the tests of randomness
What do we have to do?
We use them to obtain samples of variables distributed according to the distribution we need (exponential, Poisson, geometric, etc.)
Introduction to Simulation 56
«Inverse transform» method
Given
– r: variable uniformly distributed in [0,1]
– that is, f(r) = 1 or F(r) = r
To obtain a random variable x with pdf f(x), we have to:
– Determine F(x), $0 \le F(x) \le 1$
– Generate random samples r
– Set r = F(x)
– Calculate the inverse function $F^{-1}(\cdot)$ of F(x)
– Obtain $x = F^{-1}(r)$
It can be proven, but we skip the proof!
Introduction to Simulation 57
Inverse transformation example (1):
– We want to get x such that f(x) = 1/(b-a)
– From uniform [0,1] to uniform [a,b]
– F(x) = (x-a)/(b-a), $0 \le F(x) \le 1$, $a \le x \le b$
– r = F(x) = (x-a)/(b-a)
– x = r(b-a) + a
Introduction to Simulation 58
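Inverse transformation example (2):
– We want to get x such that $f(x) = \lambda e^{-\lambda x}$ ($x \ge 0$)
– From uniform [0,1] to negative exponential with rate $\lambda$ (average $1/\lambda$)
– $F(x) = 1 - e^{-\lambda x}$
– $r = F(x) = 1 - e^{-\lambda x}$
– $x = -\ln(1-r)/\lambda$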
Introduction to Simulation 59
We could show it more rigorously… – Next 4 slides
Introduction to Simulation 60
Introduction to Simulation 61
Generation of an arbitrary distribution:
Elements of probability:
The fundamental theorem of functions of random variables:
the p.d.f. $f_Y(y)$ of the r.v. $Y = g(X)$ is given by:
$f_Y(y) = \sum_i \frac{f_X(x_i)}{|g'(x_i)|}$
where the $x_i$ are the solutions of the equation $y = g(x)$, and are in turn functions of y: $x_i = x_i(y)$
Introduction to Simulation 62
Example:
X is a r.v. with C.D.F. $F_X(x)$. Consider $Y = F_X(X)$.
The fundamental theorem allows us to write:
$f_Y(y) = \frac{f_X(x_1)}{|F_X'(x_1)|} = \frac{F_X'(x_1)}{F_X'(x_1)} = 1, \quad 0 \le y \le 1$
where $x_1$ is the solution of the equation $y = F_X(x)$, which exists if $0 \le y \le 1$.
So Y is uniform in (0,1)!
Introduction to Simulation 63
Synthesis of a r.v. using the percentile method:
it is now easy to see that if you have:
– U, a r.v. uniform in (0,1)
– to obtain a r.v. X with CDF equal to F(.), it is enough to set:
$X = F^{-1}(U)$
Introduction to Simulation 64
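This can also be proven as follows:
U is a r.v. uniform in (0,1):
$f_U(x) = 1$ for $0 \le x \le 1$, and 0 elsewhere
$F_U(x) = 0$ for $x < 0$; $F_U(x) = x$ for $0 \le x \le 1$; $F_U(x) = 1$ for $x > 1$
Set: $X = F^{-1}(U)$
It results: $P(X \le t) = P(F^{-1}(U) \le t) = P(U \le F(t)) = F_U(F(t)) = F(t)$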
Introduction to Simulation 65
The variable U is obtained from the generation of pseudo-random numbers
There remains the problem of finding $F^{-1}$ for the variable that you want to synthesize
for some processes $F^{-1}$ cannot be obtained in closed form, and we must resort to other methods
moreover, for discrete random variables we need to slightly modify the approach
Introduction to Simulation 66
Example: Exponential
– If you want to generate an exponential random variable X with average $\mu > 0$:
$F_X(x) = 1 - e^{-x/\mu}, \quad x \ge 0$
$X = -\mu \ln(1 - U)$
Introduction to Simulation 67
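Example: Rayleigh
– if you want to generate a random variable X with Rayleigh pdf:
$f_X(x) = \frac{x}{\sigma^2} e^{-x^2/(2\sigma^2)}, \quad x \ge 0$
$F_X(x) = 1 - e^{-x^2/(2\sigma^2)}$
$X = \sigma \sqrt{-2 \ln(1 - U)}$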
Introduction to Simulation 68
Example: Gaussian
– If you want to generate a random variable x with Gaussian pdf, with $\mu = 0$ and $\sigma = 1$
– the pdf is $f_X(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$
– To have a variable with average $\mu$ and variance $\sigma^2$, it is enough to use the transformation $z = \sigma x + \mu$
– It is well known that the CDF of the Gaussian cannot be expressed in closed form, so it cannot be inverted explicitly
Introduction to Simulation 69
Example: Gaussian
– a first approach is to use an approximation
– the central limit theorem tells us that the sum of N independent r.v.’s tends to the normal distribution as N increases
– usually, for $N \ge 12$ we assume we get a good approximation
– so it is enough to extract 12 uniform variables $U_i$ in (0,1) and set:
$X = \sum_{i=1}^{12} U_i - 6$
Introduction to Simulation 70
Example: Gaussian
– a smarter approach gets two independent samples from two extractions
– it is based on the observation that:
  – a 2-dimensional vector with independent Gaussian Cartesian components has:
    » modulus with Rayleigh distribution
    » phase uniform in $(0, 2\pi)$
Introduction to Simulation 71
Example: Gaussian
– therefore:
– two variables are extracted: $U_1$ and $U_2$, uniform in (0,1)
– we compute X and Y:
$X = \sqrt{-2 \ln U_1}\,\cos(2\pi U_2)$
$Y = \sqrt{-2 \ln U_1}\,\sin(2\pi U_2)$
– which are independent normal random variables
Introduction to Simulation 72
Discrete random variables:
Consider a discrete random variable described by the probability distribution:
$P(X = a_k) = p_k, \quad k = 1, \ldots, m$
$F_X(x)$ is a staircase function:
$F_X(x) = \sum_{k:\, a_k \le x} p_k$
Introduction to Simulation 73
Discrete random variables
That, when inverted, becomes a staircase function of u taking the values $a_k$
The relation expressing the variable is therefore:
set $x = a_k$ if: $p_1 + p_2 + \ldots + p_{k-1} < u \le p_1 + \ldots + p_k$
NB: «u» is what we have been calling «r», i.e., the pseudo-random number between 0 and 1
Introduction to Simulation 74
Discrete random variables:
Example: Generate a random variable X that takes the value 1 with probability p and the value 0 with probability 1-p
It is enough to set:
$X = 1$ if $u \le p$; $X = 0$ if $u > p$
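Inversion for a discrete variable amounts to scanning the cumulative probabilities until they reach u. The sketch below returns the (zero-based) index k of the selected value, so the caller picks the value from its own table; the function names are hypothetical and m is assumed to be at least 1.

```c
#include <stddef.h>

/* Returns the index k such that p[0] + ... + p[k-1] < u <= p[0] + ... + p[k].
   The last index is returned as a fallback, which guards against rounding
   when the probabilities do not sum exactly to 1. */
size_t discrete_index(double u, const double *p, size_t m)
{
    double cumulative = 0.0;
    for (size_t k = 0; k + 1 < m; k++) {
        cumulative += p[k];
        if (u <= cumulative)
            return k;
    }
    return m - 1;
}

/* Special case of the example: 1 with probability p, 0 with probability 1 - p. */
int bernoulli_from_u(double u, double p)
{
    return (u <= p) ? 1 : 0;
}
```

The number of comparisons grows with the index of the selected value, so placing the most probable values first reduces the average cost.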
Introduction to Simulation 75
Discrete random variables:
it becomes extremely complicated for discrete distributions with infinite m
we must stop at a finite value of m
m determines the number of comparisons that must be done in the assignment routine, and thus the speed of the routine itself
in some cases it is possible to adopt some tricks
Introduction to Simulation 76
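Example: Geometric Distribution
We have: $P(X = k) = p\,(1-p)^{k-1}, \quad k = 1, 2, \ldots$
Consider an exponential r.v. Z with average $\mu$; we have:
$P(n \le Z < n+1) = (1 - e^{-(n+1)/\mu}) - (1 - e^{-n/\mu}) = e^{-n/\mu}\,(1 - e^{-1/\mu})$
this value matches P(X = n+1) if you require that:
$1 - p = e^{-1/\mu}$, i.e. $\mu = -\frac{1}{\ln(1-p)}$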
Introduction to Simulation 77
Example: Geometric Distribution
Therefore, to generate a geometric variable it is enough to:
– 1. Generate a uniform variable U in (0,1)
– 2. Get an exponential variable $Z = -\mu \ln(1 - U)$, with $\mu = -1/\ln(1-p)$
– 3. Set $X = \lfloor Z \rfloor + 1$
Introduction to Simulation 78
Example: Poisson’s Distribution
$P(X = k) = \frac{a^k}{k!} e^{-a}$
with Poisson’s distribution things get complicated, and there are no shortcuts to the problem:
– 1. set k := 0, A := $e^{-a}$, p := A
– 2. U := rand(seed)
– 3. While U > p
    » k := k+1
    » A := A*a/k
    » p := p+A
– 4. return k
Introduction to Simulation 79
Once we have built the simulation model and the software that implements it, we shall:
– decide what to measure (which output variables)
– decide the statistical metric (average, variance?)
  – Note that the output variables are r.v.’s!
– repeat the experiment multiple times!
– adopt the appropriate estimators for the parameters
– evaluate the accuracy (“confidence”) of the estimation
Introduction to Simulation 80
Estimation problem: estimation of average value
given a population whose distribution is f(x), with average $E[x] = \eta$ and variance $\sigma^2(x) = \sigma^2$
$[x_1, x_2, \ldots, x_n]$ are n independent observations
The average value of the samples is defined by:
$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$
Introduction to Simulation 81
Estimation problem: estimation of average value
The average of the samples is also a r.v., with:
$E[\bar{x}] = \eta, \quad \sigma^2(\bar{x}) = \frac{\sigma^2}{n}$
for large n, the average of the samples is a normal variable, and then the variable:
$z = \frac{\bar{x} - \eta}{\sigma / \sqrt{n}}$
can be assumed normal with zero average and unit variance, based on the central limit theorem
Introduction to Simulation 82
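Estimation problem: estimation of average value
the normal distribution F(z) is tabulated
$u_{1-\alpha/2}$ is the value such that $F(u_{1-\alpha/2}) = 1 - \alpha/2$; we have:
$P(-u_{1-\alpha/2} \le z \le u_{1-\alpha/2}) = 1 - \alpha$
$P\left(-u_{1-\alpha/2} \le \frac{\bar{x} - \eta}{\sigma/\sqrt{n}} \le u_{1-\alpha/2}\right) = 1 - \alpha$
[Figure: normal pdf f(z) and CDF F(z), with the point $u_{1-\alpha/2}$ where $F(z) = 1-\alpha/2$]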
Introduction to Simulation 83
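Estimation problem: estimation of average value
and therefore:
$P\left(\bar{x} - u_{1-\alpha/2} \frac{\sigma}{\sqrt{n}} \le \eta \le \bar{x} + u_{1-\alpha/2} \frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$
the constant $(1-\alpha)$ is usually expressed as a percentage and is called the confidence level
the interval $\left(\bar{x} - u_{1-\alpha/2} \frac{\sigma}{\sqrt{n}},\ \bar{x} + u_{1-\alpha/2} \frac{\sigma}{\sqrt{n}}\right)$ is called the confidence interval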
Introduction to Simulation 84
Estimation problem: estimation of average value
commonly we adopt a confidence level of 95%, for which we have:
$\alpha = 0.05, \quad u_{1-\alpha/2} = 1.96$
this means that $\eta$ falls in the range:
$\left(\bar{x} - 1.96 \frac{\sigma}{\sqrt{n}},\ \bar{x} + 1.96 \frac{\sigma}{\sqrt{n}}\right)$
with a probability of 95%
Introduction to Simulation 85
Estimation problem: estimation of average value
Unfortunately, the variance $\sigma^2$ is not known
$\sigma^2$ should be replaced by the sample variance, defined as:
$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$
In this way, however, the variable:
$t = \frac{\bar{x} - \eta}{s / \sqrt{n}}$
is no longer normal but has a t-student distribution with n-1 degrees of freedom
Introduction to Simulation 86
Estimation problem: estimation of average value
in cases with large n (>30) it is possible to approximate the t-student distribution with the normal distribution
but for smaller values of n it is necessary to use the t-student distribution with the corresponding number of degrees of freedom
Note: for Monte Carlo simulations values of n>30 are quite common, while for temporal simulations n is usually smaller
Introduction to Simulation 87
Table of t-student values
Warning!: $\beta = 1-\alpha/2$, $k = n-1$
Introduction to Simulation 88
Generation of t-student values
#include <math.h>   /* tan, sqrt, log, exp, HUGE_VAL */

// t-distribution: given p-value and degrees of freedom,
// return t-value; adapted from Peizer & Pratt JASA, vol63, p1416
double tval(double p, int df)
{
    double t;
    int positive = p >= 0.5;

    p = (positive) ? 1.0 - p : p;   // work on the lower tail
    if (p <= 0.0 || df <= 0)
        t = HUGE_VAL;
    else if (p == 0.5)
        t = 0.0;
    else if (df == 1)
        t = 1.0 / tan((p + p) * 1.57079633);
    else if (df == 2)
        t = sqrt(1.0 / ((p + p) * (1.0 - p)) - 2.0);
    else {
        double ddf = df;
        double a = sqrt(log(1.0 / (p * p)));
        double aa = a * a;
        a = a - ((2.515517 + 0.802853 * a + 0.010328 * aa) /
                 (1.0 + 1.432788 * a + 0.189269 * aa + 0.001308 * aa * a));
        t = ddf - 0.666666667 + 1.0 / (10.0 * ddf);
        t = sqrt(ddf * (exp(a * a * (ddf - 0.833333333) / (t * t)) - 1.0));
    }
    return (positive) ? t : -t;
}
Introduction to Simulation 89
Operations on confidence intervals
Let’s denote the confidence intervals of two variables as:
$l_x \le x \le u_x, \quad l_y \le y \le u_y$
it can be proven that:
$l_x + l_y \le x + y \le u_x + u_y$
$-u_x \le -x \le -l_x$
Introduction to Simulation 90
On the statistical confidence of variance estimation
A direct method for evaluating the confidence of the variance estimation is to use the expression:
$\sigma^2(x) = E[x^2] - E^2[x]$
having the populations $[x_1, x_2, \ldots, x_n]$ and $[(x_1)^2, (x_2)^2, \ldots, (x_n)^2]$, it is possible to estimate the confidence intervals of the averages of x and $x^2$
the two intervals can then be combined using the previous expressions
Introduction to Simulation 91
Estimation problem:
The results seen so far are based on the fundamental assumptions that:
– the observed variables are stationary
– the measurements are not affected by the initial state
– the observations are independent
the hypothesis of independence is the most difficult to obtain and verify in practical cases
the independence of the observations depends on variables that are not known
Introduction to Simulation 92
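Estimation problem: correlated observations
The estimator of the average continues to be an unbiased estimator
but its variance is now equal to:
$\sigma^2(\bar{x}) = \frac{\sigma^2}{n} \left[ 1 + 2 \sum_{k=1}^{n-1} \left(1 - \frac{k}{n}\right) \rho_k \right]$
where the correlation coefficient $\rho_k$ is:
$\rho_k = \frac{E[(x_i - \eta)(x_{i+k} - \eta)]}{\sigma^2}$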
Introduction to Simulation 93
Estimation problem: correlated observations
The estimation of the confidence interval thus requires knowledge of the autocorrelation function of the process, which is not generally known
We could use autocorrelation estimators, but the complexity and computational load would become excessive
In practice we use two different approaches to build independent sequences
Introduction to Simulation 94
Method of repeated tests for correlated observations
1) repeated tests
– N independent observations of the process are built by repeating the simulation N times with N different random number generators
– the N values estimated, one per simulation, are used as independent samples for the evaluation of the confidence
this approach in fact implements a generalization of the Monte Carlo simulation
It is useful in many practical situations, but in fact it is
Introduction to Simulation 95
Method of subdivision into intervals of observation
2) subdivision into intervals of observation (runs)
– the simulation is divided into N blocks, each consisting of a number K of
observations
– the average of the output variable is evaluated in each block
– it can be shown that, with sufficiently large K, the block averages
are independent
– the confidence interval is estimated on the basis of the estimates obtained
in each run
This approach is approximate
sometimes it may not be easy to check that the number K of
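A minimal sketch of the subdivision into blocks, under stated assumptions: the correlated sequence is a synthetic first-order autoregressive process (not a queue), the normal quantile is used, and `batch_means` and `ar1` are hypothetical names.

```python
import math
import random
from statistics import NormalDist, mean, stdev

def batch_means(observations, n_blocks, level=0.95):
    """Confidence interval for the mean from N block averages of K samples each."""
    k = len(observations) // n_blocks  # K observations per block
    block_avgs = [mean(observations[i * k:(i + 1) * k]) for i in range(n_blocks)]
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half = z * stdev(block_avgs) / math.sqrt(n_blocks)
    return mean(block_avgs), half

def ar1(n, phi=0.9, offset=5.0, seed=7):
    """Correlated test sequence: offset plus an AR(1) process (assumed model)."""
    rng = random.Random(seed)
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(offset + x)
    return out

data = ar1(100_000)
center, half = batch_means(data, n_blocks=20)
# half-width obtained by wrongly treating all samples as independent
naive_half = NormalDist().inv_cdf(0.975) * stdev(data) / math.sqrt(len(data))
print(center, half, naive_half)
```

The block-based half-width comes out several times larger than the naive one, which reflects the correlation that the naive computation ignores.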
Introduction to Simulation 96
Estimation problem: correlated observations
Example:
consider an mD/D/1 queue
m flows with deterministic inter-arrival time T are offered
to a server
the service time is also deterministic and equal to S
the relative phases of the flows are random (uniform
between 0 and T)
It can be shown that:
– delays are periodic and depend only on the initial phases
We must repeat the experiment a sufficiently large number
N of times with random phases to obtain a valid estimate of the average delay
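The mD/D/1 example can be sketched as below. This is an illustration under assumptions: FIFO service, a load m*S < T, a system that starts empty, and hypothetical function names and parameter values.

```python
import random
from statistics import mean

def mdd1_waits(phases, T, S, n_periods=3):
    """Waiting times, grouped by period, for m periodic flows on one FIFO server."""
    m = len(phases)
    # phases lie in [0, T), so sorting keeps the arrivals grouped by period
    arrivals = sorted(p + k * T for k in range(n_periods) for p in phases)
    waits, free_at = [], 0.0
    for t in arrivals:
        start = max(t, free_at)  # service starts when the server is free
        waits.append(start - t)
        free_at = start + S
    return [waits[k * m:(k + 1) * m] for k in range(n_periods)]

def average_delay(m, T, S, n_experiments, seed=0):
    """Repeat the experiment N times, each with new random phases."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_experiments):
        phases = [rng.random() * T for _ in range(m)]
        # use the last period, after the effect of the empty start has vanished
        results.append(mean(mdd1_waits(phases, T, S)[-1]))
    return mean(results)

rng = random.Random(3)
example = mdd1_waits([rng.random() for _ in range(5)], T=1.0, S=0.1, n_periods=4)
print(example[2], example[3])  # the same pattern repeats in every period
print(average_delay(m=5, T=1.0, S=0.1, n_experiments=200))
```

Within a single run all periods show the same waiting times, so extending the run adds no information; only new random phases give new independent samples.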
Introduction to Simulation 97
Estimation problem: correlated observations
in some cases the measurement process is a renewal
process, and we can exploit this to obtain independent observations
a renewal process is characterized by a series of renewal
instants [b1, b2, b3, ...]
at these instants, the process returns to the same state
the evolution of the process in the intervals [bn-1, bn] is
independent from interval to interval
measurements taken on the process in distinct intervals
are independent, and the formulas for the estimation of confidence can be applied
98
Estimation problem: correlated observations
Example 1:
– it is easy to convince yourself that for queuing systems
with general arrivals and general services, the instant
at which an arrival finds the system empty is a renewal instant of the entire
system
– the system state is the same
– a new inter-arrival period has just started and
therefore there is no memory
– a new service period has just started and therefore
there is no memory
– the system is empty and therefore there is no memory
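A minimal sketch of how renewal instants can be exploited in a single-server queue. The assumptions are explicit: an M/M/1 queue with hypothetical parameters, cycles that start whenever an arriving customer finds the system empty, and a confidence interval built with the classical ratio estimator of the regenerative method, which is a choice made here and is not taken from the slides.

```python
import math
import random
from statistics import NormalDist, mean

def waits_by_cycle(n_customers, lam=0.5, mu=1.0, seed=11):
    """Group waiting times into cycles starting when an arrival finds the system empty."""
    rng = random.Random(seed)
    cycles, current, wait = [], [], 0.0
    for _ in range(n_customers):
        if wait == 0.0 and current:
            cycles.append(current)  # renewal instant: close the previous cycle
            current = []
        current.append(wait)
        wait = max(0.0, wait + rng.expovariate(mu) - rng.expovariate(lam))
    return cycles  # the last, incomplete cycle is discarded

def regenerative_interval(cycles, level=0.95):
    """Ratio estimate of the mean wait, with a normal-approximation half-width."""
    n = len(cycles)
    sums = [sum(c) for c in cycles]
    counts = [len(c) for c in cycles]
    ratio = sum(sums) / sum(counts)
    # the residuals Y_i - ratio * N_i are computed per cycle, hence independent
    resid = [y - ratio * k for y, k in zip(sums, counts)]
    s = math.sqrt(sum(d * d for d in resid) / (n - 1))
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return ratio, z * s / (mean(counts) * math.sqrt(n))

cycles = waits_by_cycle(200_000)
ratio, half = regenerative_interval(cycles)
print(len(cycles), ratio, half)
```

Each cycle contributes one independent pair (sum of waits, number of customers), so no assumption about the correlation between consecutive customers is needed.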
99
Estimation problem: correlated observations
Example 2:
– consider an M/G/1 queue system
– conduct a simulation to measure the delay through the system
– take as samples the delays experienced by each user
– it is easy to convince yourself that these samples are correlated
– indeed, for example, if the first arrival finds the system empty, the
immediately following arrivals observe low delays, while consecutive arrivals at a heavily loaded system observe high delays
– if the simulation is divided into runs of equal duration, the number Ki of samples in each run is
not controlled, and long runs are needed to keep the dispersion of the Ki low.
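The correlation between the delays of consecutive users can be checked numerically. The sketch below assumes an M/G/1 queue with Poisson arrivals of rate 0.8 and service times uniform in [0.5, 1.5] (load 0.8); these values and the function names are hypothetical.

```python
import random
from statistics import mean

def mg1_delays(n_customers, lam=0.8, seed=5):
    """Delay through the system (wait plus service) of each customer."""
    rng = random.Random(seed)
    wait, delays = 0.0, []
    for _ in range(n_customers):
        service = rng.uniform(0.5, 1.5)  # assumed general service distribution
        delays.append(wait + service)
        wait = max(0.0, wait + service - rng.expovariate(lam))
    return delays

def autocorrelation(x, lag):
    """Sample correlation coefficient between x[i] and x[i + lag]."""
    m = mean(x)
    num = sum((a - m) * (b - m) for a, b in zip(x, x[lag:]))
    den = sum((a - m) ** 2 for a in x)
    return num / den

delays = mg1_delays(50_000)
for lag in (1, 10, 100):
    print(lag, round(autocorrelation(delays, lag), 3))
```

The coefficient at lag 1 is large and decreases with the lag, which is why the per-user delays cannot be used directly as independent samples.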
100
Estimation problem:
even assuming that we have solved the problem of
correlated observations, the problem of stationarity remains
although the process is stationary, we are forced
to start the simulation from an initial state
the initial state influences the statistics gathered
in the first part of the simulation, until the system reaches a stationary behavior
the simplest approach is to eliminate from the statistics
the results collected during the initial interval
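The effect of the initial state, and of discarding the initial interval, can be illustrated with a small sketch. Everything here is an assumption made for the example: an M/M/1 queue started with a large initial backlog, and a warm-up length of 1000 customers chosen by hand.

```python
import random
from statistics import mean

def mm1_waits(n_customers, initial_wait, lam=0.5, mu=1.0, seed=2):
    """Waiting times of an M/M/1 queue started from a given initial backlog."""
    rng = random.Random(seed)
    wait, waits = initial_wait, []
    for _ in range(n_customers):
        waits.append(wait)
        wait = max(0.0, wait + rng.expovariate(mu) - rng.expovariate(lam))
    return waits

WARM_UP = 1_000  # hypothetical length of the discarded initial interval

waits = mm1_waits(20_000, initial_wait=200.0)
biased = mean(waits)               # includes the transient
truncated = mean(waits[WARM_UP:])  # statistics collected after the warm-up only
print(biased, truncated)
```

For these parameters the theoretical mean wait is rho/(mu - lambda) = 1; the estimate that includes the transient is visibly larger, while the truncated one is close to the theoretical value.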
101
Estimation problem:
the problem is determining the length of the period
during which statistics should not be collected
Unfortunately there are no precise rules for
deciding when to start collecting data
theoretically, we should estimate the autocorrelation and treat as
transient the period of time until the autocorrelation becomes negligible
in fact, estimating the autocorrelation is itself complex,
so the choice is in practice "empirical"
102
Estimation problem:
Of course, if you know the regeneration points, the
problem is solved
– just delete the data up to the first regeneration point
– or start from the state at the regeneration points
and keep all the statistical data
otherwise you need to have an idea of the time
constants involved in determining the state of the process
you can then proceed by trial and error
103
Estimation problem:
Example: in queuing systems the time needed to
stabilize the system depends on the load
as ρ tends to 1, the system takes longer to
reach a stable state
in a sense, ρ close to 1 means an unstable system
Problem: how do you know whether the system is
stable if this cannot be inferred from the input parameters?
104
Donald E. Knuth, “The Art of Computer
Programming”, Second Edition, Addison-Wesley Publishing Company, Reading MA, 1981 (in particular, Volume 2: “Seminumerical Algorithms”)