1
CS626 Data Analysis and Simulation. Instructor: Peter Kemper, R 104A, phone 221-3462, email: kemper@cs.wm.edu
Today: Recap before midterm 1
2
Big Picture: Model-based Analysis of Systems
[Diagram: a real-world problem (a portion/facet of the real world) is transferred by perception and description into a formal model (probability model, stochastic process); formal/computer-aided analysis transforms it into solutions, rewards, and qualitative/quantitative properties, which are presented and lead to a decision, i.e., a solution to the real-world problem.]
Reminder
3
This is no pipe! ... and this is no serpentine accumulator in a production line!
System - Model - Study Model vs System
Model: a largely simplified formal/mathematical/stochastic model implemented in software in a fully controlled environment
System: a set of physical devices interacting in space-time in a largely uncontrolled, not fully understood environment
Model
includes some of the rules of how the system operates, excludes others
includes some aspects of the real world as random variables, ignores others or assumes them constant
is parameterized with respect to certain design variables
Study
has an objective, a clear question
delivers values that are probabilities, like R(0,t). Interpretation?
evaluates effects of different design choices
4
CS 626 Topics
From Data to Stochastic Input Models
Input Modeling: Probability, Distributions, Exploratory Data Analysis, Statistical tests
Stochastic processes, Markov Processes: DTMC, CTMC
Phase type distributions, MAPs, MAP Fitting
Tools
for data analysis: R; for MAP fitting: KPC toolbox
Simulation Modeling
Simulation Output Data Analysis
Verification, Validation, Debugging of simulation models
Trace driven simulation
Tools for simulation: Mobius, (+Traviando)
Applications
Reliability analysis, Dependability modeling of a LEO satellite
Modeling traffic in computer networks
Emulation: Testing, Debugging, Training in Automated Material Handling Systems
5
From Data to Stochastic Input Models Probability
Axiomatic Definition Frequentist Definition
6
7
Frequency Definition of Probability
If our experiment is repeated over and over again, the proportion of time that event E occurs will be P(E).
Frequency definition of probability: P(E) = lim_{m→∞} m(E) / m, where m(E) is the number of times event E occurs and m is the number of trials.
Note:
The random experiment can be repeated under identical conditions; if repeated indefinitely, the relative frequency of occurrence of an event converges to a constant.
The law of large numbers states that this limit exists. For small m, m(E) can show strong fluctuations.
8
Axiomatic Definition of Probability Definition
For each event E of the sample space S, we assume that a number P(E) is defined that satisfies Kolmogorov's axioms:
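The axioms themselves appeared as a formula image on the slide and did not survive extraction; the standard statement is:

```latex
\begin{itemize}
  \item $0 \le P(E) \le 1$ for every event $E$
  \item $P(S) = 1$
  \item For mutually exclusive events $E_1, E_2, \dots$:
        $P\bigl(\bigcup_{i} E_i\bigr) = \sum_{i} P(E_i)$
\end{itemize}
```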
9
Outline on Problem Solving (Goodman & Hedetniemi 77)
Identify sample space S
All elements must be mutually exclusive, collectively exhaustive. All possible outcomes of experiment should be listed separately.
(Root of “tricky” problems: often ambiguity or an inexact formulation of the model of a physical situation)
Assign probabilities
To all elements of S, consistent with Kolmogorov’s axioms.
(In practice: estimates based on experience, analysis or common assumptions)
Identify events of interest
Recast statements as subsets of S. Use laws (algebra of events) for simplification. Use visualizations for clarification.
Compute desired probabilities
Use axioms and laws; often helpful: express the event of interest as a union of mutually exclusive events and sum up probabilities.
10
More relations: What is the probability of a union of two events? What is the probability of a union of a set of events? Is there a better way to calculate this?
Sum of disjoint products (SDP) formula
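The probability of a union can be computed by inclusion-exclusion (the SDP formula is an alternative that avoids the alternating sum). A minimal sketch over a finite sample space, with a hypothetical four-outcome space:

```python
from itertools import combinations

def prob_union(events, prob):
    """P(E1 ∪ ... ∪ En) by inclusion-exclusion.
    events: list of sets of outcomes; prob: dict outcome -> probability."""
    n = len(events)
    total = 0.0
    for k in range(1, n + 1):
        for combo in combinations(events, k):
            inter = set.intersection(*combo)       # intersection of k events
            total += (-1) ** (k + 1) * sum(prob[o] for o in inter)
    return total

# Hypothetical uniform sample space with four outcomes:
p = {1: 0.25, 2: 0.25, 3: 0.25, 4: 0.25}
E, F = {1, 2}, {2, 3}
# P(E ∪ F) = P(E) + P(F) - P(EF) = 0.5 + 0.5 - 0.25
print(prob_union([E, F], p))  # 0.75
```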
11
Conditional Probabilities Definition
The conditional probability of E given F is P(E|F) = P(EF) / P(F) if P(F) > 0, and it is undefined otherwise.
Interpretation: Given F has happened, only outcomes in EF are still possible for E, so the original probability P(EF) is scaled by 1/P(F).
Multiplication rule: P(EF) = P(E|F) P(F).
[Venn diagram: given F happens, E shrinks to EF within F.]
12
Independent events
Definition
Two events E and F are independent if P(EF) = P(E) P(F).
This also means P(E|F) = P(E). In English: E and F are independent if knowledge that F has occurred does not affect the probability that E occurs.
Notes:
if E, F are independent, then so are E, Fᶜ and Eᶜ, F and Eᶜ, Fᶜ
generalizes from 2 to n events, e.g. for n = 3 every subset must be independent
Mutually exclusive vs independent: these are different notions
13
About independent events Venn diagrams Tree diagrams of sequential sample spaces
Throw coin twice
Joint sample space from the cross product of the individual sample spaces: {(H,H), (H,T), (T,H), (T,T)}; first and second throw are independent.
For independent events, consider A, B being not empty and not S:
1) if A ⊂ B, then A and B cannot be independent
2) if A ∩ B = ∅, then A and B cannot be independent
14
Joint and pairwise independence
A ball is drawn from an urn containing four balls numbered 1, 2, 3, 4.
Then we have events that are pairwise independent, but not jointly independent.
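The slide's event definitions were lost in extraction; the classical choice for this urn (a hypothetical reconstruction) is A = {1,2}, B = {1,3}, C = {1,4}, which can be checked by enumeration:

```python
from fractions import Fraction
from itertools import combinations

S = {1, 2, 3, 4}                        # urn with four equally likely balls
P = lambda E: Fraction(len(E), len(S))  # uniform probability on S

# One classical choice of events (the slide's own events were elided):
A, B, C = {1, 2}, {1, 3}, {1, 4}
events = [A, B, C]

# Every pair satisfies P(EF) = P(E)P(F) ...
pairwise = all(P(E & F) == P(E) * P(F) for E, F in combinations(events, 2))
# ... but the triple does not: P(ABC) = 1/4, while P(A)P(B)P(C) = 1/8.
joint = P(A & B & C) == P(A) * P(B) * P(C)
print(pairwise, joint)  # True False
```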
A sequence of experiments results in either a success or a failure, where Ei, i ≥ 1, denotes a success in the i-th experiment. If for all i1, i2, …, in: P(E_{i1} E_{i2} ⋯ E_{in}) = P(E_{i1}) P(E_{i2}) ⋯ P(E_{in}), we say the sequence of experiments consists of independent trials.
Independence is a very important property.
Independence simplifies calculations significantly => a very popular assumption for:
theoretical results
input modeling, workload modeling
statistical tests
output analysis of simulation models: confidence intervals for the estimate of the mean, ...
Independence need not be present in real data:
data traffic in networks: often correlated
output data of a (simulated) system, i.e. the response of a system to some workload
Ways to investigate independence:
graphics: correlation plot
tests: chi-square test for vectors, rank von Neumann test, runs test; see Law/Kelton Chap. 6.3 and Chap. 7.4.1
15
16
Bayes’ Formula
Let F1, F2, …, Fn be events of S, all mutually exclusive and collectively exhaustive.
Theorem of total probability (also Rule of Elimination): P(E) = Σ_{i=1}^{n} P(E|Fi) P(Fi)
Bayes' Formula: P(Fj|E) = P(E|Fj) P(Fj) / Σ_{i=1}^{n} P(E|Fi) P(Fi). It helps us determine which Fj happened given we observed E.
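Total probability and Bayes' formula can be sketched in a few lines; the machine/defect numbers below are hypothetical:

```python
def bayes(prior, likelihood, j):
    """P(Fj | E) = P(E|Fj) P(Fj) / sum_i P(E|Fi) P(Fi).
    prior[i] = P(Fi); likelihood[i] = P(E | Fi)."""
    total = sum(p, l = None) if False else sum(p * l for p, l in zip(prior, likelihood))
    return likelihood[j] * prior[j] / total

# Hypothetical: 3 machines produce 50/30/20% of parts, defect rates 1/2/3%.
prior = [0.5, 0.3, 0.2]
likelihood = [0.01, 0.02, 0.03]
# Given an observed defective part, which machine produced it?
print(bayes(prior, likelihood, 0))  # P(F1 | defect) = 0.005 / 0.017
```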
Random Variable RV Definition
A random variable X on a probability space (S,F,P) is a function X : S -> R
that assigns a real number X(s) to each sample point s ∈ S, such that for every real number x, the set of sample points {s|X(s) ≤ x} is an event, that is a member of F. RVs can be discrete or continuous
More concepts
cumulative distribution function, density, moments E[X^i], central moments, variance, skewness, kurtosis
Particular examples
Normal distribution Poisson distribution Exponential distribution Pareto distribution
17
Parameterization of distributions Parameters of 3 basic types Location
specifies an x-axis location point of a distribution’s range of values usually the midpoint (e.g. mean for normal distribution) or lower end
point for the distribution’s range
sometimes called shift parameter since changing its value shifts the
distribution to the left or right, e.g., for Y = X + γ
Scale
determines the scale (unit) of measurement of the values in the
range of the distribution (e.g. std deviation σ for normal distribution)
changing its value compresses/expands distribution but does not
alter its basic form, e.g., for Y = β X
Shape
determines basic form/shape of a distribution changing its values alters a distribution’s properties, e.g. skewness
more fundamentally than a change in location or scale
18
Properties of Mean, Variance and Covariance For any random variables X, Y, Z and constant c,
Let X be a random variable with density f_X(x).
Distribution function: F_X(x) = P(X ≤ x) = ∫_{−∞}^{x} f_X(y) dy
Expected value: E(X) = ∫ y f_X(y) dy, and E(cX) = c E(X)
E(X + Y) = E(X) + E(Y)
X, Y independent (P(X = x, Y = y) = P(X = x) P(Y = y)): E(XY) = E(X) E(Y)
var(X) = E((X − E(X))²) = E(X²) − (E(X))²
var(aX + b) = a² var(X)
var(X + Y) = var(X) + var(Y) + 2 cov(X, Y)
Covariance: cov(X, Y) = E((X − E(X)) (Y − E(Y)))
Correlation: ρ_{X,Y} = cov(X, Y) / (σ_X σ_Y)
X, Y independent: cov(X, Y) = 0
20
Proposition 2.4: If X1, …, Xn are independently and identically distributed with expected value µ and variance σ², then E(X̄) = µ and var(X̄) = σ²/n.
Confidence intervals for the estimate of the mean
The (1 − α) confidence interval about x̄ can be expressed as:
x̄ − t_{1−α/2, N−1} s/√N ≤ µ ≤ x̄ + t_{1−α/2, N−1} s/√N
where
t_{1−α/2, N−1} is the 100(1 − α/2)-th percentile of Student's t distribution with N − 1 degrees of freedom (values of this distribution can be found in tables),
s = √(s²) is the sample standard deviation, and
N is the number of observations.
What is input modeling? Input modeling
Deriving a representation of the uncertainty or randomness in a
stochastic simulation.
Common representations
Measurement data
Distributions derived from measurement data <-- focus of “Input modeling”
usually requires that samples are i.i.d. and the corresponding random variables in the simulation model are i.i.d.
(i.i.d. = independent and identically distributed)
theoretical distributions
empirical distributions
Time-dependent stochastic process Other stochastic processes like MAPs, MMPPs, ...
Examples include
time to failure for a machining process; demand per unit time for inventory of a product; number of defective items in a shipment of goods; times between arrivals of calls to a call center.
21
Distributions Many theoretical distributions with nice properties
experience with scenarios when to apply them (physical basis)
well-studied properties, parameters, characteristics
compact representation of data
software support for sampling in simulation runs
software support to perform parameter fitting
easy to vary by modification of parameters
some allow for closed-form analytical formulas for system analysis (queueing networks)
may allow for numbers beyond reasonable limits, e.g. negative values or very high values, such that truncation may be necessary
less sensitive to data irregularities than an empirical distribution
Compare to:
empirical distribution trace-driven simulation
22
Overview of fitting with data Select one or more candidate distributions
based on physical characteristics of the process and graphical examination of the data.
Fit the distribution to the data
determine values for its unknown parameters.
Check the fit to the data
via statistical tests and via graphical analysis.
If the distribution does not fit,
select another candidate and repeat the process, or use an empirical distribution.
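The "fit the distribution" step has a closed form for some families; a minimal sketch for the exponential distribution, whose maximum-likelihood estimate of the rate is the reciprocal of the sample mean (the interarrival data below is hypothetical):

```python
from statistics import mean

def fit_exponential(samples):
    """MLE for an exponential distribution: rate = 1 / sample mean.
    (For the exponential, the method of moments gives the same estimate.)"""
    return 1.0 / mean(samples)

# Hypothetical interarrival times (minutes):
data = [0.8, 1.3, 0.4, 2.1, 0.9, 1.6, 0.7, 1.2]
rate = fit_exponential(data)
print(rate)  # estimated events per minute
```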
23 from WSC 2010 Tutorial by Biller and Gunes, CMU, slides used with permission
What is a good fit? Goodness-of-fit tests:
Chi-squared test (χ2 ) Kolmogorov-Smirnov test (K-S) Anderson Darling test (AD)
Graphical Comparisons:
Histogram-based plots Probability plots
P-P plot Q-Q plot
Good parameter estimates
24 from WSC 2010 Tutorial by Biller and Gunes, CMU, slides used with permission
Goodness-of-fit tests
25
- Beware of goodness-of-fit tests: they are unlikely to reject any distribution when you have little data, and are likely to reject every distribution when you have lots of data.
- Avoid histogram-based summary measures, if possible, when asking the software for its recommendation!
K-S and A-D tests
Features:
- Comparison of an empirical distribution function
with the distribution function of the hypothesized distribution.
- Does not depend on the grouping of data.
- A-D detects discrepancies in the tails and has
higher power than K-S test
Chi-square test
Features:
- A formal comparison of a histogram or
line graph with the fitted density or mass function
- Sensitive to how we group the data.
from WSC 2010 Tutorial by Biller and Gunes, CMU, slides used with permission
Graphical comparisons
26
Frequency Comparisons
Features:
- Graphical comparison of a histogram of
the data with the density function of the fitted distribution.
- Sensitive to how we group the data.
Probability Plots
Features:
- Graphical comparison of an estimate of the
true distribution function of the data with the distribution function of the fit.
- Q-Q (P-P) plot amplifies differences
between the tails (middle) of the model and sample distribution functions.
- Use every graphical tool in the software to examine the fit.
- If histogram-based tool, then play with the widths of the cells.
- Q-Q plot is very highly recommended!
from WSC 2010 Tutorial by Biller and Gunes, CMU, slides used with permission
Check the fit to the data: Statistical tests
Define a measure X for the difference between the fitted distribution and the data. The test statistic X is an RV: a small X means a small difference, a large X means a huge difference.
If we find an argument for what distribution X has, we get a statistical test to see whether, in a concrete case, a value of X is significant or not.
Say P(X ≤ x) = 1 − α, and e.g. this holds for x = 10 and α = 0.05. Then we know that if data is sampled from the given distribution and this is done n times (n → ∞), the measure X will be below 10 in 95% of those cases.
If in our case the sample data yields x = 10.7, we can argue that it is too unlikely that the sample data is from the fitted distribution.
Concepts, Terminology
Hypothesis H0, Alternative H1
Power of a test: (1 − β), the probability to correctly reject H0
α / Type I error: rejecting a true hypothesis
β / Type II error: not rejecting a false hypothesis
P-value: probability of observing a result at least as extreme as the test statistic, assuming H0 is true
27
Sample test characteristic for Chi-Square test (all parameters known)
28
One-sided test.
Right side: critical region, region of rejection.
Left side: region of acceptance, where we fail to reject the hypothesis.
P-value of x: 1 − F(x)
Tests and p-values
In the typical test:
H0: the chosen distribution fits
H1: the chosen distribution does not fit
The p-value of a test:
is the probability of observing a result at least as extreme as the test statistic, assuming H0 is true (hence 1 − F(x) on the previous slide)
is the Type I error level (significance) at which we would just reject H0 for the given data.
Implications:
If the α level (common values: 0.01, 0.05, 0.1) < p-value, then we do not reject H0; otherwise, we reject H0.
If the p-value is large (> 0.10), then values more extreme than our current one are still reasonably likely, so we fail to reject H0; in this sense it supports H0 that the distribution fits (but not more than that!).
29
Chi-Square Test Histogram-based test
30
Test statistic: χ² = Σ_{i=1}^{k} (O_i − E_i)² / E_i, where O_i is the observed number of samples in the i-th interval and E_i = n p_i is the expected number, with p_i the theoretical probability of the i-th interval.
Sums the squared differences
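The statistic can be sketched in one line; the cell counts below are hypothetical:

```python
def chi_square_stat(observed, expected):
    """Chi-square GOF statistic: sum over intervals of (O_i - E_i)^2 / E_i."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical counts in 4 histogram cells vs. expected counts under the fit:
O = [18, 22, 31, 29]
E = [25, 25, 25, 25]
x2 = chi_square_stat(O, E)
print(x2)  # compare against a chi-square critical value with k-1 dof
```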
Kolmogorov-Smirnov Test
31
Test statistic: D_n = sup_x |F_n(x) − F(x)|, the maximum difference between the CDF of the hypothesized distribution and the CDF of the empirical distribution constructed from the data. With n observations x1, x2, …, xn, the empirical CDF F_n(x) is the fraction of the observations that are ≤ x.
The test looks at the maximum difference; the test is useful when the sample size is small.
[Figure: empirical step-function CDF vs. hypothesized CDF; the K-S test detects the largest vertical gap between the two.]
K-S Test Sometimes a bit tricky: geometric meaning of test statistic
32
For details, see Law/Kelton, Chap. 6.
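The geometric subtlety is that the supremum is attained at a sample point, either just before or just after the empirical CDF's jump, so both sides of each step must be checked. A minimal sketch (the data and the fitted Exp(1) model are hypothetical):

```python
from math import exp

def ks_statistic(samples, cdf):
    """K-S statistic D_n = sup_x |F_n(x) - F(x)| for a continuous F.
    At the i-th order statistic, F_n jumps from (i-1)/n to i/n, so the
    largest deviation is found by checking both values against F."""
    xs = sorted(samples)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        d = max(d, i / n - cdf(x), cdf(x) - (i - 1) / n)
    return d

# Hypothetical data checked against a fitted Exp(rate=1) CDF:
data = [0.1, 0.5, 0.9, 1.4, 2.2]
print(ks_statistic(data, lambda x: 1 - exp(-x)))
```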
Anderson-Darling test (AD test) Test statistic is a weighted average of the squared differences with weights such that weights are largest for F(x) close to 0 and 1.
33
Modified critical values for the adjusted A-D test statistic: reject H0 if A_n² exceeds the critical value.
P-P plots and Q-Q plots
35
Q-Q plot: sample quantiles x_(i) vs. model quantiles F⁻¹(q_i) for q1, …, qn; P-P plot: empirical probabilities vs. fitted probabilities F(x_(i)) for p1, …, pn. This intuitive definition needs an adjustment to handle ties (multiple samples of the same value).
Features of the Q-Q plot: It does not depend on how the data are grouped. It is much better than a density histogram when the number of data points is small. Deviations from a straight line show where the distribution does not match. A straight line implies that the family of distributions is correct; a 45° line implies that the parameters fit as well.
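Constructing the Q-Q plot coordinates is simple once the model's inverse CDF is available; a sketch against an exponential model (the data and rate below are hypothetical, and q_i = (i − 0.5)/n is one common plotting-position choice):

```python
from math import log

def qq_points(samples, inv_cdf):
    """Q-Q plot coordinates: model quantile vs. i-th order statistic.
    Uses q_i = (i - 0.5) / n so the last point avoids F^{-1}(1)."""
    xs = sorted(samples)
    n = len(xs)
    return [(inv_cdf((i - 0.5) / n), x) for i, x in enumerate(xs, start=1)]

# Against a fitted Exp(rate) model; its inverse CDF is -ln(1 - q) / rate:
rate = 0.5
pts = qq_points([1.1, 2.7, 0.4, 5.9, 3.2, 1.8], lambda q: -log(1 - q) / rate)
print(pts)  # points near the 45-degree line indicate a good fit
```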
36 from WSC 2010 Tutorial by Biller and Gunes, CMU, slides used with permission
[Q-Q plots of fitted quantile vs. input quantile, produced with @RISK:]
LogLogistic(-113.32, 156.71, 16.107): pretty good fit, but misses a bit on the right tail.
Exponential(44.468), Shift=-0.58: poor fit, misses badly in both tails.
Parameter estimates Common methods for parameter estimation are
maximum likelihood, method of moments, and least squares.
While the method matters, the variability in the data often overwhelms the differences in the estimators.
Decide what parameter estimates to use with goodness-of-fit tests and graphical comparisons.
Remember:
There is no “true distribution” just waiting to be found!
37 from WSC 2010 Tutorial by Biller and Gunes, CMU, slides used with permission
Summary
Use input models to represent uncertainty in simulation. The particular input model chosen matters! Selection of an input model is not an exact science:
no right answer, but the issues to consider are
theoretical vs. empirical data physical basis of the distribution assessment of the goodness of a fit independence of samples
Assess the sensitivity of simulation output results to the input models chosen. Use expert opinion whenever you can. Do not automatically trust a completely automated derivation of an input model.
38
Exploratory Data Analysis (EDA): Assumptions. Four typical assumptions for measurement processes: data from the process at hand "behave like":
1. random drawings; 2. from a fixed distribution; 3. with the distribution having fixed location; and 4. with the distribution having fixed variation.
Fixed location:
response = deterministic component + random component
univariate case: response = constant + error, so the fixed location is the unknown constant
can be extended to a function of many variables
effect: residuals (errors) between measurement and response should behave like a univariate process with the same assumed properties as above,
such that testing of the underlying assumptions becomes a tool for the validation and quality of fit of the chosen model
4 assumptions hold => probabilistic predictability, process is “in statistical control”, can do predictions
39
EDA: Four techniques for testing assumptions
- 1. run sequence plot (Yi versus i)
- 2. lag plot (Yi versus Yi-1)
- 3. histogram (counts versus subgroups of Y)
- 4. normal probability plot (ordered Y versus theoretical ordered Y)
40
Example: Process with
- fixed location
- fixed variation
- random
- distribution approx. normal
- no outliers
Interpretation of the 4-Plot
Fixed Location: If the fixed location assumption holds, then the run sequence plot will be flat and non-drifting.
Fixed Variation: If the fixed variation assumption holds, then the vertical spread in the run sequence plot will be approximately the same over the entire horizontal axis.
Randomness: If the randomness assumption holds, then the lag plot will be structureless and random.
Fixed Distribution: If the fixed distribution assumption holds, in particular if the fixed normal distribution holds, then the histogram will be bell-shaped, and the normal probability plot will be linear.
41
Autocorrelation Plot
Purpose: check randomness.
If random, autocorrelations should be near zero for any and all time-lag separations.
If non-random, then one or more of the autocorrelations will be significantly non-zero.
42
Observation:
rather high degree of correlation, hence not random; the horizontal lines around zero indicate thresholds for noise
Definition: r(h) vs h
vertical axis: autocorrelation coefficient r(h); horizontal axis: time lag h (h = 1, 2, 3, ...); note: range [-1, +1]
Memo: long-range dependency
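The coefficient plotted on the vertical axis can be sketched directly from its definition; the alternating series below is a hypothetical example with an obvious lag structure:

```python
def autocorr(xs, h):
    """Lag-h autocorrelation coefficient r(h), always in [-1, +1]."""
    n = len(xs)
    m = sum(xs) / n
    var = sum((x - m) ** 2 for x in xs)                          # lag-0 sum
    cov = sum((xs[i] - m) * (xs[i + h] - m) for i in range(n - h))
    return cov / var

# Alternating series: strong negative correlation at lag 1, positive at lag 2.
ys = [1, -1, 1, -1, 1, -1, 1, -1]
print(autocorr(ys, 1), autocorr(ys, 2))  # -0.875 0.75
```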
Autocorrelation plots
43
[Example plots: random data; moderate positive autocorrelation; strong autocorrelation and autoregressive model; sinusoidal model]
What if i.i.d is inappropriate? Time-dependent stochastic Process
Time-dependent, non-homogeneous, non-stationary Poisson Process
Markovian Arrival Process
MAPs, definition MAP fitting algorithms
44
45
Excursion: Reliability Analysis with Reliability Block Diagrams Reliability of series-parallel systems Motivation:
Illustrate how probabilities can be applied. Illustrate how powerful the independence assumption is.
We consider a set of components with index i=1,2,…
Event Ai = “component i is functioning properly” Reliability Ri of i is the probability P(Ai)
Series system:
Entire system fails if any of its components fails
Parallel system:
Entire system fails if all of its components fail
Key assumption:
Failure of components are mutually independent.
For now, R is a probability; later, R will be a function of time t.
46
Reliability Analysis (if component failures are independent)
Reliability of a series system: R_s = Π_{i=1}^{n} R_i (product law of reliabilities)
Based on the assumption of series connections. Note how quickly R_s degrades as n = 1, 2, … grows.
Reliability of a parallel system:
Let F_i = 1 − R_i be the unreliability of a component and F_p = 1 − R_p that of a parallel system. Then F_p = Π_{i=1}^{n} F_i, i.e., R_p = 1 − Π_{i=1}^{n} (1 − R_i) (product law of unreliabilities).
Note: also a law of diminishing returns (the rate of increase in reliability decreases rapidly as n increases).
Reliability of a series-parallel system:
Of n serial stages, stage i has n_i identical components in parallel: R = Π_{i=1}^{n} (1 − (1 − R_i)^{n_i})
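Under the independence assumption, the three laws reduce to a few lines; the component reliabilities below are hypothetical:

```python
from math import prod

def r_series(rs):
    """Product law of reliabilities: all components must work."""
    return prod(rs)

def r_parallel(rs):
    """Product law of unreliabilities: the system fails only if all fail."""
    return 1 - prod(1 - r for r in rs)

def r_series_parallel(stages):
    """n serial stages, stage i a list of parallel component reliabilities."""
    return prod(r_parallel(stage) for stage in stages)

# Hypothetical components with R = 0.9:
print(r_series([0.9, 0.9, 0.9]))    # 0.729: degrades quickly with n
print(r_parallel([0.9, 0.9]))       # 0.99: redundancy helps
print(r_series_parallel([[0.9, 0.9], [0.9], [0.9, 0.9, 0.9]]))
```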
47
Reliability Block Diagrams
Series parallel RBD of a network Other representations: Fault trees Limits: more general dependencies
Structure Function
Inclusion/exclusion formula (or SDP)
Approach with Binary decision diagrams (BDD), Zang 99 (in Trivedi Ch1) Factoring/Conditioning More techniques for more general settings
Fault Trees (as in Mobius)
48
Components are leaves in the tree.
A component fails = logical value of true, otherwise false.
The nodes in the tree are boolean AND, OR, and k-of-N gates.
The system fails if the root is true:
AND gates are true if all of their components are true (fail).
OR gates are true if any of their components are true (fail).
k-of-N gates are true if at least k of their components are true (fail).
[Figure: three example gates over components C1, C2, C3: an AND gate, an OR gate, and a 2-of-3 gate.]
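The gate semantics above can be sketched directly with boolean functions; the failure states assigned to C1..C3 below are hypothetical:

```python
def AND(*inputs):
    """True (failed) if all inputs have failed."""
    return all(inputs)

def OR(*inputs):
    """True (failed) if any input has failed."""
    return any(inputs)

def k_of_n(k, *inputs):
    """True (failed) if at least k inputs have failed."""
    return sum(inputs) >= k

# Hypothetical component states: C1 and C3 have failed, C2 works.
C1, C2, C3 = True, False, True
print(AND(C1, C2, C3))        # False: not all components failed
print(OR(C1, C2, C3))         # True: at least one failed
print(k_of_n(2, C1, C2, C3))  # True: 2 of 3 failed
```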
Fault trees (as in Mobius)
49
[Figure: an example fault tree combining an OR gate over C1, C2, C3, a 2-of-3 gate, and AND gates over H1, H2 and over L1, L2.]
Simulation Modeling Large models described in a compositional manner Atomic models
Variants of stochastic automata: SANs, PEPA, ...
Composition
shared variables vs action synchronization
Measurement
Rate and impulse rewards, measured for an instant/interval of time or in steady state
Exploration of a design space
Series of experiments over parameter sets Design of experiments
Analysis
Queueing network analysis (Highly constrained for non-simulative results) CTMC analysis (Exponential distributions & finite state spaces) Simulation (General, but only statistical estimates based on observed behavior,
rare events are problematic)
50
Types of simulation Continuous simulation vs discrete event simulation For discrete event simulation
Terminating vs steady-state simulation
Generation of pseudo-random variates
Generation of uniform [0,1] random variates
Linear congruential generators Tausworthe generator ...
Test of uniform [0,1] generators Generation of non-uniform random variates based on uniform generators
Inverse transform technique Convolution technique Composition technique Acceptance/Rejection technique
51
Output analysis Point estimates and confidence intervals How to obtain data for estimates?
In general:
Independent simulation runs, independent replications Data is i.i.d. which simplifies statistical analysis
Special (common) case:
Batch means on a single long simulation run
applies only for steady-state analysis of an ergodic system
data is correlated; batch means are used to estimate the variance (necessary to calculate confidence intervals)
requires a decision on the end of the transient phase and a decision on batch sizes
Confidence intervals
for estimate of mean and ci
uses estimate for variance
relies on assumption of a normal distribution
for estimate of variance and ci
jackknifing, bootstrapping
52
x̄ − t_{1−α/2, N−1} s/√N ≤ µ ≤ x̄ + t_{1−α/2, N−1} s/√N, with probability 1 − α
Define µ̂ = (1/N) Σ_{i=1}^{N} x_i and s² = (1/(N−1)) Σ_{i=1}^{N} (x_i − µ̂)².
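The batch-means idea described above can be sketched in a few lines; the output series and batch count below are hypothetical:

```python
def batch_means(xs, n_batches):
    """Split one long (post-transient) run into batches and average each.
    Batch means are approximately independent if the batches are large,
    so they can feed the usual confidence-interval formula."""
    b = len(xs) // n_batches                      # batch size
    return [sum(xs[i * b:(i + 1) * b]) / b for i in range(n_batches)]

# Hypothetical correlated output series of length 12, 3 batches of 4:
ys = [2.0, 2.2, 2.1, 2.3, 2.6, 2.4, 2.5, 2.3, 2.2, 2.1, 2.0, 2.2]
means = batch_means(ys, 3)
print(means)  # use these to estimate the variance for a confidence interval
```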
Verification and Validation Validation:
“substantiation that a computerized model within its domain of applicability possesses a satisfactory range of accuracy consistent with the intended application of the model”
Verification:
“ensuring that the computer program of the computerized model and its implementation are correct”
Sargent’s WSC Tutorial 2010, cites Schlesinger 79
“Verify (debug) the computer program.” Law’s WSC 09 Tutorial
Accreditation:
DoD: “official certification that a model, simulation, or federation of models and simulations and its associated data are acceptable for use for a specific application.”
Credibility:
“developing in users the confidence they require in order to use a model and in the information derived from that model.”
53
Variant 1: Simplified Version of Modeling Approach Conceptual model validation
determine that theories & assumptions are correct and the model representation is “reasonable” for the intended purpose
Computerized model verification
assure correct implementation
Operational validation
model’s output behavior has
sufficient accuracy
Data validity
ensure that the data necessary
for model building, evaluation, testing, and experimenting are adequate & correct.
Iterative process
also reflects underlying learning process
54
[Diagram (Sargent): the Problem Entity (System) is linked to the Conceptual Model via Analysis and Modeling (Conceptual Model Validation); the Conceptual Model is linked to the Computerized Model via Computer Programming and Implementation (Computerized Model Verification); the Computerized Model is linked back to the Problem Entity via Experimentation (Operational Validation); Data Validity underlies all links.]
Validation Techniques
55
Sanity checks
Degenerate Tests Event validity (relative to real events)
Extreme condition tests
Traces to follow individual entities
Historical methods
Rationalism (assumptions true/false) Empiricism Positive economics (predicts future correctly)
Variability
Internal validity to determine amount of internal variability with
several replication runs
Parameter Variability - Sensitivity Analysis
Computerized model verification Special case of verification in software engineering If a simulation framework is used
evaluate if framework works correctly test random number generation model-specific
existing functionality/libraries are used correctly
the conceptual model is completely and correctly encoded in the modeling notation of the employed framework
Means
structured walk-through
traces
testing, i.e., the simulation is executed and its dynamic behavior is checked against a given set of criteria
internal consistency checks (assertions)
input-output relationships
recalculate estimates for mean and variance of input probability distributions
56
Operational Validity Explore Model Behavior
Directions of behavior Reasonable / precise magnitudes Parameter variability-sensitivity analysis
Statistical approaches: Metamodeling, design of experiments
Comparisons of Output Behavior (System vs Model)
Most effective: trace driven simulation
feed measurement data into simulation to closely follow real behavior
Use graphs to make subjective decisions
Histograms, Box plots, Scatter plots Useful in model development process to evaluate level of detail and accuracy,
for face validity checks by subject matter experts, and in Turing tests
Use confidence intervals and/or hypothesis tests to make an
“objective” decision
Problems: underlying assumptions (independence, normality) and/or
insufficient system data
57
Documentation of the V&V effort: critical to build credibility and justify confidence. Detailed documentation on the specifics of tests etc. Separate tables for data validity, conceptual model validity, computer model verification, operational validity.
58
59
Application: LEO Satellite
Communication
Satellite - satellite:
if within communication range
Satellite - ground station:
if within footprint
We discretize orbits, identify matching periods
Intersatellite link (ISL), Gateway Link (GWL)
Elevation ε: angle wrt the center of the radiation cone and the earth surface
Footprint
60
More input data
Radiation dose, shielding and its mapping to failure rates,
0.007 failures per year for processor and CMOS components 0.0001 failures per year for discrete components
Scale factor r=1 for 1mm shielding for higher orbit Consider several model configurations!
Communication:
Data collection rate: 2 GB/yr while memory is available; all data lost at failure
Uplink communication considered negligible
Simple routing mechanism
ISL communication rate: 115 kbps with 50% overhead, 226665 MB/yr
GWL rate: double the ISL rate
ISL with commercial satellite networks
61
Where are the probabilities ?
Dependability study:
Ground station: rates of failure / repair actions Satellite subsystems: rates of failure / repair actions
are modeled with a random variable that follows a negative exponential distribution with a given rate. Rate 5.0 means on average 5 events per time unit.
Total Ionizing Dose: is taken into account by a scaling factor r towards failure
rates of components.
What is then analyzed
Reliability and availability For different levels of radiation shielding For different levels of redundancy of components
What type of analysis is used
Transient analysis of Markov chains
For single satellite design
Discrete event simulation of stochastic models
valid alternative, used for evaluation of overall network
62
Results wrt Performance and Dependability
Baseline results for r = 1. A set of simulation experiments is performed for:
- Different levels of r
- Different protocols
- Configurations
- Buffer capacities
- Data collection rates
- Communication with other commercial networks
Following slides cover material from project 1
63
Applications: Network Traffic
Failure of Poisson Modeling
Observation: Scale-Invariant Burstiness on Multiple Scales
Ways to describe the phenomenon:
Long-Range Dependence Heavy Tail Distributions Self-Similarity
64
Burstiness on Multiple Scales
[Figure: packets per unit time (y axis) over time intervals (x axis); time intervals increase by a factor of 10 per plot, resp. 7 in the last step]
Burstiness in packet traffic: traffic “spikes” ride on longer-term “ripples,” traffic “ripples” ride on longer-term “swells,” ad infinitum.
65
Self-similarity in the Continuous Case Consider Y(t) in a continuous setting, t ∈ R
66
Definition 1.4.4 (H-ss)
Y(t) is self-similar with self-similarity parameter, i.e., Hurst parameter, H (0 < H < 1), denoted H-ss, if for all a > 0 and t ≥ 0,
Y(t) =_d a^(-H) Y(at),   (1.4.5)
where =_d denotes equality in distribution. Thus Y(t) and its time-scaled version Y(at), after normalizing by a^H, must follow the same distribution. In the traffic modeling context, it is convenient to think of Y(t) as the cumulative or total traffic up to time t. For a > 1, time is stretched or dilated, and a contraction factor a^(-H) is applied to make the magnitude of Y(at) comparable to that of Y(t). For a < 1, the opposite holds true. As a varies, the scaling exponent H remains invariant. This is a most natural definition; however, it has an important drawback: unless Y(t) is degenerate, i.e., Y(t) = 0 for all t, Y(t) cannot be stationary due to the normalization factor a^(-H). Its increment process X(t) = Y(t) − Y(t−1), however, is another matter.
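A quick numerical illustration of (1.4.5), not from the lecture: standard Brownian motion is H-ss with H = 1/2, so Var[Y(at)] should equal a^(2H) · Var[Y(t)] = a · Var[Y(t)]. A sketch that checks this scaling law on simulated path endpoints (path counts and step sizes are arbitrary choices):

```python
import random

def brownian_endpoint(t: float, steps: int = 100) -> float:
    """Endpoint of a standard Brownian path on [0, t] via Gaussian increments."""
    dt = t / steps
    return sum(random.gauss(0.0, dt ** 0.5) for _ in range(steps))

random.seed(1)
t, a, n = 1.0, 4.0, 5_000
var_t  = sum(brownian_endpoint(t) ** 2 for _ in range(n)) / n      # ~ t = 1
var_at = sum(brownian_endpoint(a * t) ** 2 for _ in range(n)) / n  # ~ a*t = 4
ratio = var_at / var_t   # ~ a**(2*H) = 4 for H = 1/2
```

The ratio settles near a^(2H) = 4, i.e., stretching time by a = 4 inflates the magnitude by a^H = 2, exactly the contraction the definition compensates for.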
Long-range Dependence
X(t) is long-range dependent if its autocorrelation ρ(k) decays to zero so slowly that its sum does not converge, that is, ∑_{k=1}^{∞} |ρ(k)| = ∞.
For short-range dependent traffic, which is non-bursty, ρ(k) falls off quickly with the lag k, usually exponentially.
For long-range dependent traffic, it falls off much more slowly, usually obeying some type of power law.
Intuitively, memory is built into the process because the dependence among an LRD process's widely separated values is significant, even across large time shifts.
67
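To make the short-range case concrete (an illustrative sketch, not lecture code): an AR(1) process with coefficient φ has ρ(k) = φ^k, which decays exponentially, so ∑_{k=1}^{∞} |ρ(k)| = φ/(1 − φ) is finite. The sample autocorrelation reproduces this fast decay:

```python
import random

def sample_acf(x, k):
    """Lag-k sample autocorrelation of the sequence x."""
    n = len(x)
    m = sum(x) / n
    var = sum((v - m) ** 2 for v in x) / n
    cov = sum((x[i] - m) * (x[i + k] - m) for i in range(n - k)) / n
    return cov / var

random.seed(2)
phi, x = 0.5, [0.0]
for _ in range(50_000):                 # AR(1): x[t] = phi*x[t-1] + noise
    x.append(phi * x[-1] + random.gauss(0.0, 1.0))
rho1 = sample_acf(x, 1)                 # ~ phi = 0.5
rho5 = sample_acf(x, 5)                 # ~ phi**5 ≈ 0.03: nearly gone
```

An LRD trace would instead show ρ(k) shrinking like a power law, with correlations still visibly nonzero at large lags.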
Hurst Parameter
Some simple facts regarding H and its impact on r(k):
- if H = 1/2, then r(k) = 0 and X(t) is trivially short-range dependent, since it is uncorrelated
- if 0 < H < 1/2, the correlations are summable and the process is short-range dependent; this case is rarely encountered in practice and uninteresting here
- if H = 1, then r(k) = 1 for all k ≥ 1 (an artificial special case)
- H > 1 is prohibited by the stationarity condition on X(t)
So basically two cases remain: H = 1/2 and 1/2 < H < 1. To distinguish those two cases, reasonably accurate estimates for H are necessary.
68
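One standard way to obtain such an estimate is the aggregated-variance method: for the increment process of an H-ss process, the variance of the m-aggregated series scales like m^(2H−2), so H can be read off the slope of a log-log regression. A sketch (the course tools, e.g. R, ship more robust estimators); for i.i.d. noise the estimate should come out near H = 1/2:

```python
import math
import random

def hurst_aggvar(x, block_sizes):
    """Estimate H via the aggregated-variance method:
    Var[X^(m)] ~ m**(2H - 2), so the log-log slope is 2H - 2."""
    pts = []
    for m in block_sizes:
        means = [sum(x[i:i + m]) / m for i in range(0, len(x) - m + 1, m)]
        mu = sum(means) / len(means)
        var = sum((v - mu) ** 2 for v in means) / len(means)
        pts.append((math.log(m), math.log(var)))
    # least-squares slope through the (log m, log var) points
    n = len(pts)
    mx = sum(p[0] for p in pts) / n
    my = sum(p[1] for p in pts) / n
    slope = (sum((p[0] - mx) * (p[1] - my) for p in pts)
             / sum((p[0] - mx) ** 2 for p in pts))
    return 1 + slope / 2                # slope = 2H - 2

random.seed(3)
noise = [random.gauss(0.0, 1.0) for _ in range(100_000)]
H = hurst_aggvar(noise, [1, 2, 4, 8, 16, 32, 64])  # ~ 0.5 for i.i.d. noise
```

Applied to a bursty LRD packet trace, the same estimator would yield H clearly above 1/2.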
Heavy-tailed Distribution
A random variable Z has a heavy-tailed distribution if
P(Z > z) ~ c z^(-α), as z → ∞, with 0 < α < 2,   (1.4.10)
where α is called the tail index or shape parameter and c is a positive constant. That is, the tail of the distribution, asymptotically, decays hyperbolically. This is in contrast to light-tailed distributions, e.g., exponential and Gaussian, which possess an exponentially decreasing tail. A distinguishing mark of heavy-tailed distributions is that they have infinite variance for 0 < α < 2, and if 0 < α ≤ 1, they also have an unbounded mean. In the networking context, we will be primarily interested in the case 1 < α < 2. A frequently used heavy-tailed distribution is the Pareto distribution.
69
Example of a Heavy-Tailed Distribution
70
A frequently used heavy-tailed distribution is the Pareto distribution, whose distribution function is given by
P(Z ≤ z) = 1 − (b/z)^α, z ≥ b,
where α is the shape parameter and b is called the location parameter. The mean is given by E[Z] = αb/(α − 1) for α > 1. We remark that there are distributions, e.g., Weibull and log-normal, that have subexponentially decreasing tails but possess finite variance. The main characteristic of a random variable obeying a heavy-tailed distribution is that it exhibits extreme variability.
Also called: power-law distribution, double-exponential distribution.
- If α ≤ 2, then the distribution has an infinite variance.
- If α ≤ 1, then the distribution has an infinite mean.
Density: f(z) = α b^α z^(-α-1)
from Fowler ’99: Network Traffic Models
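The Pareto distribution is easy to sample by inverse transform: if U ~ Uniform(0,1), then Z = b / U^(1/α) satisfies P(Z ≤ z) = 1 − (b/z)^α. A sketch with illustrative parameter values that also checks the mean formula αb/(α − 1) empirically:

```python
import random

def pareto_sample(alpha: float, b: float) -> float:
    """Inverse-transform sample: P(Z <= z) = 1 - (b/z)**alpha for z >= b."""
    return b / random.random() ** (1.0 / alpha)

random.seed(4)
alpha, b = 2.5, 1.0                      # alpha > 2: mean and variance exist
n = 200_000
emp_mean = sum(pareto_sample(alpha, b) for _ in range(n)) / n
true_mean = alpha * b / (alpha - 1)      # = 5/3 for these parameters
```

Python's standard library offers `random.paretovariate(alpha)` for the b = 1 case. Repeating the experiment with alpha ≤ 1 shows the empirical mean failing to converge, the practical face of an unbounded mean.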
71
Invariant | Protocol Level | Distribution
Connection Size |  | Lognormal
Connection Duration |  | Lognormal
Requested File Popularity | Application | Zipf
Requested File Sizes (Overall) | Application | Hybrid: Lognormal body, Pareto tail (Heavy-tailed)
FTP Transfers | Application | Pareto tail (Heavy-tailed)
Number Of Page Requests/Site | Application | Inverse Gaussian (Heavy-tailed)
Reading Time/Page (Sec) | Application | Heavy-tailed
Sessions (Arrivals) | Session | Poisson
Session Duration | Session | Pareto (Heavy-tailed)
Session Size | Session | Pareto (Heavy-tailed)
WAN Traffic At TCP Level | Transport | Self-similar (fractal, multifractal)
TCP Connections/Web Session | Transport | Heavy-tailed
Interarrival Time Of Packets | Network | Heavy-tailed (LRD, fractal)
Interarrival (Generation) Time Of Packets Generated By User At Keyboard | Network | Pareto (body), Pareto (upper tail)
Interarrival Time Of Ethernet Frames | Data Link | Self-similar (fractal)