CS626 Data Analysis and Simulation. Instructor: Peter Kemper, R 104A.



SLIDE 1

CS626 Data Analysis and Simulation

Today: Recap before midterm 1

Instructor: Peter Kemper, R 104A, phone 221-3462, email: kemper@cs.wm.edu

SLIDE 2

Big Picture: Model-based Analysis of Systems

[Diagram: real world problem, perceived as a portion/facet of the real world; description yields a formal model; transformation yields a probability model / stochastic process; formal / computer-aided analysis yields solutions, rewards, qualitative and quantitative properties; presentation, decision, and transfer yield a solution to the real world problem.]

SLIDE 3

Reminder

This is no pipe! ... and this is no serpentine accumulator in a production line!

SLIDE 4

System - Model - Study

Model vs System

 Model: a largely simplified formal/mathematical/stochastic model, implemented in software in a fully controlled environment
 System: a set of physical devices interacting in space-time in a largely uncontrolled, not fully understood environment

Model

 includes some of the rules of how the system operates, excludes others
 includes some aspects of the real world as random variables, ignores others or assumes them constant

 is parameterized with respect to certain design variables

Study

 has an objective, a clear question  delivers values that are probabilities like R(0,t)

 Interpretation?

 evaluates effects of different design choices


SLIDE 5

CS 626 Topics

From Data to Stochastic Input Models

 Input Modeling  Probability, Distributions  Exploratory Data Analysis, Statistical tests  Stochastic processes, Markov Processes

 DTMC, CTMC  Phase type distributions, MAPs, MAP Fitting

 Tools

 for data analysis: R  for MAP fitting: KPC toolbox

Simulation Modeling

 Simulation  Output Data Analysis  Verification, Validation,

 Trace driven simulation  Debugging of simulation models

 Tools for simulation: Mobius, (+Traviando)

Applications

 Reliability analysis, Dependability modeling of a LEO satellite  Modeling traffic in computer networks  Emulation: Testing, Debugging, Training in Automated Material Handling Systems


SLIDE 6

From Data to Stochastic Input Models Probability

 Axiomatic Definition  Frequentist Definition


SLIDE 7

Frequency Definition of Probability

If our experiment is repeated over and over again then the proportion of time that event E occurs will just be P(E).

Frequency Definition of Probability: P(E) = lim_{m→∞} m(E)/m, where m(E) is the number of times event E occurs and m is the number of trials.

Note:

 The random experiment can be repeated under identical conditions.
 If repeated indefinitely, the relative frequency of occurrence of an event converges to a constant.
 The law of large numbers states that the limit does exist.
 For small m, m(E) can show strong fluctuations.
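The limit definition can be illustrated with a short simulation; a minimal sketch (the die event and the trial count are illustrative choices, not from the slides):

```python
import random

def relative_frequency(event, trials, seed=42):
    """Estimate P(E) as m(E)/m: repeat the experiment m times and count hits."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(trials) if event(rng))
    return hits / trials

# Event E: a fair die shows an even number, so P(E) = 1/2.
def even(rng):
    return rng.randint(1, 6) % 2 == 0

estimate = relative_frequency(even, 100_000)
# For small m, m(E)/m fluctuates; as m grows it converges toward 0.5.
```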

SLIDE 8

Axiomatic Definition of Probability

Definition

For each event E of the sample space S, we assume that a number P(E) is defined that satisfies Kolmogorov’s axioms:

1. 0 ≤ P(E) ≤ 1
2. P(S) = 1
3. For any sequence of mutually exclusive events E1, E2, …: P(E1 ∪ E2 ∪ …) = P(E1) + P(E2) + …

SLIDE 9

Outline on Problem Solving (Goodman & Hedetniemi 77)

Identify sample space S

 All elements must be mutually exclusive, collectively exhaustive.  All possible outcomes of experiment should be listed separately.

(Root of “tricky” problems: often ambiguity or an inexact formulation of the model of a physical situation)

Assign probabilities

 To all elements of S, consistent with Kolmogorov’s axioms.

(In practice: estimates based on experience, analysis or common assumptions)

Identify events of interest

 Recast statements as subsets of S.  Use laws (algebra of events) for simplifications  Use visualizations for clarification

Compute desired probabilities

 Use axioms, laws, often helpful: express event of interest as union of mutually

exclusive events and sum up probabilities

SLIDE 10

More relations

What is the probability of a union of two events? What is the probability of a union of a set of events? Is there a better way to calculate this?

Sum of disjoint products (SDP) formula
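Before reaching for the SDP formula, the probability of a union can be computed by inclusion-exclusion; a small sketch (the three events and their probabilities are hypothetical, and independence is assumed so each intersection probability is a product of marginals):

```python
from itertools import combinations

def prob_union(prob_inter, events):
    """Inclusion-exclusion over all nonempty subsets:
    P(A1 u ... u An) = sum_S (-1)^(|S|+1) P(intersection of S).
    (The SDP formula instead rewrites the union as disjoint products.)"""
    total = 0.0
    for k in range(1, len(events) + 1):
        for subset in combinations(events, k):
            total += (-1) ** (k + 1) * prob_inter(subset)
    return total

# Hypothetical: three independent events, each with probability 0.5.
p = {"A": 0.5, "B": 0.5, "C": 0.5}

def inter(subset):
    out = 1.0
    for e in subset:
        out *= p[e]
    return out

u = prob_union(inter, list(p))   # 1 - (1 - 0.5)**3 = 0.875
```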

SLIDE 11

Conditional Probabilities

Definition

The conditional probability of E given F is P(E|F) = P(EF)/P(F) if P(F) > 0, and it is undefined otherwise. Interpretation: given that F has happened, only outcomes in EF remain possible for E, so the original probability P(EF) is scaled by 1/P(F). Multiplication rule: P(EF) = P(E|F) P(F).

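A worked example of the definition, using two fair dice (the specific events are illustrative, not from the slide):

```python
from fractions import Fraction
from itertools import product

S = list(product(range(1, 7), repeat=2))   # two fair dice, 36 outcomes

def prob(E):
    return Fraction(len(E), len(S))        # equally likely outcomes

F = {s for s in S if s[0] + s[1] == 8}     # "the sum is 8" (5 outcomes)
E = {s for s in S if s[0] == 6}            # "the first die shows 6"

p_e_given_f = prob(E & F) / prob(F)        # P(EF)/P(F) = (1/36)/(5/36) = 1/5
```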

SLIDE 12

Independent events

Definition

 Two events E and F are independent if P(EF) = P(E) P(F).

This also means P(E|F) = P(E). In English, E and F are independent

 if knowledge that F has occurred does not affect the probability that E occurs.

Notes:

 if E, F are independent, then so are E, Fᶜ and Eᶜ, F and Eᶜ, Fᶜ
 Generalizes from 2 to n events; e.g. for n = 3, every subset must be independent
 Mutually exclusive vs. independent

SLIDE 13

About independent events

Venn diagrams; tree diagrams of sequential sample spaces

 Throw a coin twice

[Tree diagram: outcomes (H,H), (H,T), (T,H), (T,T); the joint sample space is the cross product of the individual sample spaces, and the first and second throw are independent.]

For independent events, consider A, B being not empty and not S:
1) if A ⊂ B, then A and B cannot be independent
2) if A ∩ B = ∅, then A and B cannot be independent
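The two-throw coin example can be checked mechanically by building the joint sample space as a cross product; a minimal sketch:

```python
from fractions import Fraction
from itertools import product

# Joint sample space of two fair coin throws: cross product of {H, T} with itself.
S = list(product("HT", repeat=2))
P = {s: Fraction(1, 4) for s in S}   # all four outcomes equally likely

def prob(E):
    return sum(P[s] for s in E)

A = {s for s in S if s[0] == "H"}    # first throw is heads
B = {s for s in S if s[1] == "H"}    # second throw is heads

independent = prob(A & B) == prob(A) * prob(B)   # P(AB) = 1/4 = (1/2)(1/2)
```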

SLIDE 14

Joint and pairwise independence

A ball is drawn from an urn containing four balls numbered 1, 2, 3, 4.

Then we have: They are pairwise independent, but not jointly independent

A sequence of experiments results in either a success or a failure, where Ei, i ≥ 1, denotes a success in experiment i. If for all i1, i2, …, in: P(E_{i1} E_{i2} ⋯ E_{in}) = P(E_{i1}) P(E_{i2}) ⋯ P(E_{in}), we say the sequence of experiments consists of independent trials.
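The slide leaves the three events for the four-ball urn implicit; a standard construction that is pairwise but not jointly independent uses A = {1,2}, B = {1,3}, C = {1,4} (hypothetical here, chosen to illustrate the point):

```python
from fractions import Fraction

S = {1, 2, 3, 4}                       # one ball drawn, all equally likely

def prob(E):
    return Fraction(len(E), len(S))

A, B, C = {1, 2}, {1, 3}, {1, 4}       # each has probability 1/2

pairwise = all(prob(X & Y) == prob(X) * prob(Y)
               for X, Y in [(A, B), (A, C), (B, C)])
jointly = prob(A & B & C) == prob(A) * prob(B) * prob(C)
# pairwise holds (each intersection is {1}, probability 1/4 = 1/2 * 1/2),
# but joint independence fails: P(ABC) = 1/4 while the product is 1/8.
```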

SLIDE 15

Independence is a very important property Independence

 simplifies calculations significantly => very popular assumption for

theoretical results

 input modeling, workload modeling  statistical tests  output analysis of simulation models: confidence intervals for estimate of mean  ...

 independence need not be present in real data

 data traffic in networks: often correlated  output data of a (simulated) system, i.e. response of a system to some

workload

 ways to investigate independence

 graphics: correlation plot  tests: chi-square test for vectors, rank von Neumann test, runs test  see Law/Kelton Chap 6.3 and Chap 7.4.1


SLIDE 16

Bayes’ Formula

Let F1, F2, …, Fn be events of S, all mutually exclusive and collectively exhaustive.

Theorem of total probability (also Rule of Elimination): P(E) = Σᵢ P(E|Fᵢ) P(Fᵢ)

Bayes’ Formula helps us determine which Fj happened, given we observed E: P(Fj|E) = P(E|Fj) P(Fj) / Σᵢ P(E|Fᵢ) P(Fᵢ)
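A numeric sketch of total probability and Bayes’ formula (the priors and likelihoods are made-up values for illustration):

```python
def posterior(priors, likelihoods):
    """Bayes: P(Fj | E) = P(E|Fj) P(Fj) / sum_i P(E|Fi) P(Fi).
    The denominator is P(E) by the theorem of total probability."""
    p_e = sum(p * l for p, l in zip(priors, likelihoods))
    return [p * l / p_e for p, l in zip(priors, likelihoods)]

# Hypothetical numbers: three mutually exclusive, exhaustive states F1..F3
# with priors P(Fj), and an observed event E with likelihoods P(E|Fj).
priors = [0.7, 0.2, 0.1]
likelihoods = [0.01, 0.10, 0.90]

post = posterior(priors, likelihoods)
# Observing E shifts most of the probability mass onto F3.
```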

SLIDE 17

Random Variable RV Definition

 A random variable X on a probability space (S,F,P) is a function X : S -> R

that assigns a real number X(s) to each sample point s ∈ S, such that for every real number x, the set of sample points {s|X(s) ≤ x} is an event, that is a member of F. RVs can be discrete or continuous

More concepts

 cumulative distribution function
 density
 moments E[X^i], centralized moments, variance, skewness, kurtosis

Particular examples

 Normal distribution  Poisson distribution  Exponential distribution  Pareto distribution


SLIDE 18

Parameterization of distributions Parameters of 3 basic types Location

 specifies an x-axis location point of a distribution’s range of values  usually the midpoint (e.g. mean for normal distribution) or lower end

point for the distribution’s range

 sometimes called shift parameter since changing its value shifts the

distribution to the left or right, e.g., for Y = X + γ

Scale

 determines the scale (unit) of measurement of the values in the

range of the distribution (e.g. std deviation σ for normal distribution)

 changing its value compresses/expands distribution but does not

alter its basic form, e.g., for Y = β X

Shape

 determines basic form/shape of a distribution  changing its values alters a distribution’s properties, e.g. skewness

more fundamentally than a change in location or scale


SLIDE 19

Properties of Mean, Variance and Covariance

For any random variables X, Y and constant c:

distribution function: F_X(x) = P(X ≤ x) = ∫_{-∞}^{x} f_X(y) dy, where f_X is the density of X
expected value: E(X) = ∫ y f_X(y) dy
E(cX) = c E(X); E(X + Y) = E(X) + E(Y)
independent: P(X ≤ x, Y ≤ y) = P(X ≤ x) P(Y ≤ y), and then E(XY) = E(X) E(Y)
variance: var(X) = E((X − E(X))²) = σ_X²
var(aX + b) = a² var(X)
covariance: cov(X, Y) = E((X − E(X)) (Y − E(Y)))
var(X + Y) = var(X) + var(Y) + 2 cov(X, Y)
correlation: cov(X, Y) / (σ_X σ_Y)
independent: cov(X, Y) = 0

SLIDE 20

Proposition 2.4

X1, …, Xn are independently and identically distributed with expected value µ and variance σ².

Confidence intervals for estimate of mean

Then the (1 − α) confidence interval about x̄ can be expressed as:

x̄ − t_{N−1, 1−α/2} · s/√N  ≤  µ  ≤  x̄ + t_{N−1, 1−α/2} · s/√N

where

 t_{N−1, 1−α/2} is the 100(1 − α/2)th percentile of the Student’s t distribution with N − 1 degrees of freedom (values of this distribution can be found in tables),
 s = √(s²) is the sample standard deviation, and
 N is the number of observations.
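A minimal numeric sketch of the interval; for simplicity it uses the standard normal quantile from the stdlib instead of the Student’s t quantile, which is a reasonable approximation for large N (the sample itself is synthetic):

```python
import random
from statistics import NormalDist, mean, stdev

rng = random.Random(1)
data = [rng.expovariate(1.0) for _ in range(2000)]   # synthetic sample, true mean 1.0

alpha = 0.05
z = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96; stands in for t_{N-1, 1-alpha/2}
n, xbar, s = len(data), mean(data), stdev(data)
half_width = z * s / n ** 0.5

lo, hi = xbar - half_width, xbar + half_width   # the (1 - alpha) interval about xbar
```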

SLIDE 21

What is input modeling? Input modeling

 Deriving a representation of the uncertainty or randomness in a

stochastic simulation.

 Common representations

 Measurement data  Distributions derived from measurement data <-- focus of “Input modeling”

 usually requires that samples are i.i.d and corresponding random

variables in the simulation model are i.i.d

 i.i.d. = independent and identically distributed  theoretical distributions  empirical distribution

 Time-dependent stochastic process  Other stochastic processes like MAPs, MMPPs, ...

Examples include

 time to failure for a machining process;  demand per unit time for inventory of a product;  number of defective items in a shipment of goods;  times between arrivals of calls to a call center.


SLIDE 22

Distributions Many theoretical distributions with nice properties

 experience with scenarios when to apply those (physical basis)  well-studied properties, parameters, characteristics  compact representation of data  software support for sampling in simulation runs  software support to perform parameter fitting  easy to vary by modification of parameters  some allow for closed-form analytical formulas for system analysis

(queueing networks)

 may allow for numbers beyond reasonable limits, e.g. negative

values, very high values such that truncation may be necessary

 less sensitive to data irregularities than an empirical distribution

Compare to:

 empirical distribution  trace-driven simulation


SLIDE 23

Overview of fitting with data Select one or more candidate distributions

 based on physical characteristics of the process and  graphical examination of the data.

Fit the distribution to the data

 determine values for its unknown parameters.

Check the fit to the data

 via statistical tests and  via graphical analysis.

If the distribution does not fit,

 select another candidate and repeat the process, or  use an empirical distribution.

from WSC 2010 Tutorial by Biller and Gunes, CMU, slides used with permission

SLIDE 24

What is a good fit? Goodness-of-fit tests:

 Chi-squared test (χ2 )  Kolmogorov-Smirnov test (K-S)  Anderson Darling test (AD)

Graphical Comparisons:

 Histogram-based plots  Probability plots

 P-P plot  Q-Q plot

Good parameter estimates

from WSC 2010 Tutorial by Biller and Gunes, CMU, slides used with permission

SLIDE 25

Goodness-of-fit tests

  • Beware of goodness-of-fit tests because they are unlikely to reject any distribution when you have little data, and are likely to reject every distribution when you have lots of data.
  • Avoid histogram-based summary measures, if possible, when asking the software for its recommendation!

K-S and A-D tests

Features:

  • Comparison of an empirical distribution function with the distribution function of the hypothesized distribution.
  • Does not depend on the grouping of data.
  • A-D detects discrepancies in the tails and has higher power than the K-S test.

Chi-square test

Features:

  • A formal comparison of a histogram or line graph with the fitted density or mass function.
  • Sensitive to how we group the data.

from WSC 2010 Tutorial by Biller and Gunes, CMU, slides used with permission

SLIDE 26

Graphical comparisons

Frequency Comparisons

Features:

  • Graphical comparison of a histogram of the data with the density function of the fitted distribution.
  • Sensitive to how we group the data.

Probability Plots

Features:

  • Graphical comparison of an estimate of the true distribution function of the data with the distribution function of the fit.
  • A Q-Q (P-P) plot amplifies differences between the tails (middle) of the model and sample distribution functions.
  • Use every graphical tool in the software to examine the fit.
  • If using a histogram-based tool, then play with the widths of the cells.
  • The Q-Q plot is very highly recommended!

from WSC 2010 Tutorial by Biller and Gunes, CMU, slides used with permission

SLIDE 27

Check the fit to the data: Statistical tests

 define a measure X for the difference between the fitted distribution and the data
 the test statistic X is an RV; say small X means a small difference, high X means a huge difference
 if we find an argument for what distribution X has, we get a statistical test to see whether, in a concrete case, a value of X is significant or not
 Say P(X ≤ x) = (1−α), and e.g. this holds for x = 10 and α = .05. Then we know that if data is sampled from the given distribution and this is done n times (n→∞), this measure X will be below 10 in 95% of those cases.
 If in our case the sample data yields x = 10.7, we can argue that it is too unlikely that the sample data is from the fitted distribution.

Concepts, Terminology

 Hypothesis H0, Alternative H1
 Power of a test: (1−β), the probability to correctly reject a false H0
 Alpha / Type I error: reject a true hypothesis
 Beta / Type II error: not rejecting a false hypothesis
 P-value: probability of observing a result at least as extreme as the test statistic, assuming H0 is true


SLIDE 28

Sample test characteristic for Chi-Square test (all parameters known)

One-sided test.

Right side:

  • critical region
  • region of rejection

Left side:

  • region of acceptance, where we fail to reject the hypothesis

P-value of x: 1 − F(x)

SLIDE 29

Tests and p-values

In the typical test...
H0: the chosen distribution fits
H1: the chosen distribution does not fit

The p-value of a test is:

 the probability of observing a result at least as extreme as test

statistic assuming H0 is true (hence 1-F(x) on previous slide)

 it is the Type I error level (significance) at which we would just reject H0 for the given data.

Implications

 If the α level (common values: 0.01, 0.05, 0.1) is less than the p-value, then we do not reject H0; otherwise, we reject H0.

 If the p-value is large (> 0.10)

 then more extreme values than our current one are still reasonably likely  so we fail to reject H0  in this sense it supports H0 that the distribution fits (but not more than that!)


SLIDE 30

Chi-Square Test: a histogram-based test

Test statistic:

χ² = Σᵢ₌₁ᵏ (Oᵢ − Eᵢ)² / Eᵢ

where Oᵢ is the observed frequency and Eᵢ the expected (theoretical) frequency of the i-th interval. The statistic sums the squared differences.
SLIDE 31

Kolmogorov-Smirnov Test

[Figure: CDF of the hypothesized distribution overlaid with the CDF of the empirical distribution constructed from the data.]

The test looks at the maximum difference between the CDF of the hypothesized distribution and the CDF of the empirical distribution constructed from the data. The test is useful when the sample size is small.

Test statistic: the K-S test detects the maximum difference D_n = max_x |F_n(x) − F(x)|, where F_n is the empirical CDF of the n samples and F is the hypothesized CDF.
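The statistic D_n can be computed exactly at the sample points, since the empirical CDF only jumps there; a minimal sketch against an assumed Exponential(rate = 2) hypothesis (the data is simulated):

```python
import math
import random

def ks_statistic(sample, cdf):
    """D_n = max_x |F_n(x) - F(x)| for a continuous hypothesized CDF F.
    With sorted samples x_(1) <= ... <= x_(n), the maximum is
    D_n = max_i max( i/n - F(x_(i)), F(x_(i)) - (i-1)/n )."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

rng = random.Random(3)
sample = [rng.expovariate(2.0) for _ in range(500)]      # data truly Exponential(2)

d_n = ks_statistic(sample, lambda x: 1.0 - math.exp(-2.0 * x))
# When the data really comes from the hypothesized distribution, d_n stays small.
```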

SLIDE 32

K-S Test

The geometric meaning of the test statistic is sometimes a bit tricky; for details, see Law/Kelton, Chap. 6.

SLIDE 33

Anderson-Darling test (A-D test)

The test statistic is a weighted average of the squared differences, with weights largest for F(x) close to 0 and 1.

Modified critical values exist for adjusted A-D test statistics; reject H0 if A_n² exceeds the critical value.

SLIDE 34

Goodness-of-fit tests

SLIDE 35

P-P plots and Q-Q plots

Q-Q plot: sample quantile x_(i) vs model quantile F⁻¹(qᵢ) for q1,...,qn. P-P plot: empirical Fₙ(xᵢ) vs model F(xᵢ) for p1,...,pn. This intuitive definition needs an adjustment to handle ties (multiple samples of the same value).
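A sketch of the Q-Q construction for an exponential model, using the common plotting position q_i = (i − 0.5)/n (one of several conventions; the sample values are made up):

```python
import math

def qq_points(sample, inv_cdf):
    """Q-Q plot data: the i-th order statistic x_(i) paired with the model
    quantile at q_i = (i - 0.5)/n."""
    xs = sorted(sample)
    n = len(xs)
    return [(inv_cdf((i - 0.5) / n), x) for i, x in enumerate(xs, start=1)]

# Exponential(rate = 1) model: F^{-1}(q) = -ln(1 - q).
def inv_exp(q):
    return -math.log(1.0 - q)

pts = qq_points([0.1, 0.4, 0.8, 1.7, 2.5], inv_exp)
# Points lying near the 45-degree line indicate that family and parameters fit.
```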

SLIDE 36

Features of the Q-Q plot

It does not depend on how the data are grouped. It is much better than a density histogram when the number of data points is small. Deviations from a straight line show where the distribution does not match. A straight line implies that the family of distributions is correct; a 45° line implies that the parameters fit as well.

from WSC 2010 Tutorial by Biller and Gunes, CMU, slides used with permission

[Q-Q plots (@RISK): LogLogistic(-113.32, 156.71, 16.107), fitted quantile vs input quantile (20-120): pretty good fit, but misses a bit on the right tail. Exponential(44.468), Shift=-0.58, fitted quantile vs input quantile (20-120): poor fit, misses badly in both tails.]

SLIDE 37

Parameter estimates

Common methods for parameter estimation are

 maximum likelihood,
 method of moments, and
 least squares.

While the method matters, the variability in the data often overwhelms the differences in the estimators. Decide what parameter estimates to use with goodness-of-fit tests and graphical comparisons.

Remember:

 There is no “true distribution” just waiting to be found!

from WSC 2010 Tutorial by Biller and Gunes, CMU, slides used with permission

SLIDE 38

Summary

Use input models to represent uncertainty in simulation. The particular input model chosen matters! Selection of an input model is not an exact science:

 there is no right answer, but the issues to consider are
 theoretical vs. empirical data
 physical basis of the distribution
 assessment of the goodness of a fit
 independence of samples

Assess the sensitivity of simulation output results to the input models chosen. Use expert opinion whenever you can. Do not automatically trust a completely automated derivation of an input model.

SLIDE 39

Exploratory Data Analysis (EDA): Assumptions

Four typical assumptions for measurement processes: data from the process at hand "behave like":

1. random drawings;
2. from a fixed distribution;
3. with the distribution having fixed location; and
4. with the distribution having fixed variation.

Fixed location:

 response = deterministic component + random component  univariate case: response = constant + error  so fixed location is the unknown constant  can be extended to a function of many variables  effect: residuals (error) between measurement and response should behave like

a univariate process with same assumed properties above

 such that testing of underlying assumptions becomes a tool for the validation

and quality of fit of the chosen model

4 assumptions hold => probabilistic predictability, process is “in statistical control”, can do predictions


SLIDE 40

EDA: Four techniques for testing assumptions

  • 1. run sequence plot (Yi versus i)
  • 2. lag plot (Yi versus Yi-1)
  • 3. histogram (counts versus subgroups of Y)
  • 4. normal probability plot (ordered Y versus theoretical ordered Y)


Example: a process with

  • fixed location
  • fixed variation
  • random
  • distribution approx. normal
  • no outliers
SLIDE 41

Interpretation of the 4-Plot

Fixed Location: If the fixed location assumption holds, then the run sequence plot will be flat and non-drifting.
Fixed Variation: If the fixed variation assumption holds, then the vertical spread in the run sequence plot will be approximately the same over the entire horizontal axis.
Randomness: If the randomness assumption holds, then the lag plot will be structureless and random.
Fixed Distribution: If the fixed distribution assumption holds, in particular if the fixed normal distribution holds, then

 the histogram will be bell-shaped, and  the normal probability plot will be linear.


SLIDE 42

Autocorrelation Plot

Purpose: check randomness

 If random, autocorrelations should be near zero for any and all time-lag separations.
 If non-random, then one or more of the autocorrelations will be significantly non-zero.


Observation

 rather high degree of correlations, hence not random  horizontal lines around zero indicate thresholds for noise

Definition: r(h) vs h

 vertical axis: autocorrelation coefficient  horizontal axis: time lag h (h = 1,2,3, ...)  Note: range [-1,+1]  Memo: long-range dependency
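The autocorrelation coefficient r(h) plotted here can be sketched directly (the two test series are illustrative):

```python
def autocorr(y, h):
    """Lag-h autocorrelation coefficient r(h); always in [-1, +1]."""
    n = len(y)
    m = sum(y) / n
    var = sum((v - m) ** 2 for v in y)
    cov = sum((y[i] - m) * (y[i + h] - m) for i in range(n - h))
    return cov / var

trend = list(range(100))                        # strongly positively correlated
alternating = [(-1) ** i for i in range(100)]   # strongly negatively correlated

r_trend = autocorr(trend, 1)        # close to +1
r_alt = autocorr(alternating, 1)    # close to -1
```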

SLIDE 43

Autocorrelation plots


 Random data
 Moderate positive autocorrelation
 Strong autocorrelation and autoregressive model
 Sinusoidal model

SLIDE 44

What if i.i.d. is inappropriate?

Time-dependent stochastic process

 Time-dependent, non-homogeneous (non-stationary) Poisson process

Markovian Arrival Process

 MAPs, definition  MAP fitting algorithms


SLIDE 45

Excursion: Reliability Analysis with Reliability Block Diagrams Reliability of series-parallel systems Motivation:

 Illustrate how probabilities can be applied  Illustrate how powerful independence assumption is

We consider a set of components with index i=1,2,…

 Event Ai = “component i is functioning properly”  Reliability Ri of i is the probability P(Ai)

Series system:

 Entire system fails if any of its components fails

Parallel system:

 Entire system fails if all of its components fail

Key assumption:

 Failure of components are mutually independent.

For now, R is a probability; later R will be a function of time t.

SLIDE 46

Reliability Analysis (if component failures are independent)

Reliability of a series system

(Product law of reliabilities)

 Based on the assumption of series connections.  Note how quickly Rs degrades for n = 1,2,…

Reliability of a parallel system

Let Fi = 1 − Ri be the unreliability of a component and Fp = 1 − Rp that of a parallel system. Then Fp = F1 F2 ⋯ Fn, i.e., Rp = 1 − ∏ᵢ (1 − Ri).

 Note: also law of diminishing returns (rate of increase in reliability decreases

rapidly as n increases)

Reliability of a series-parallel system

 Of n serial stages, at stage i have ni identical components (in parallel)

(Product law of unreliabilities)
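Both product laws in a short sketch (the component reliabilities are example values):

```python
def series_reliability(rs):
    """Product law of reliabilities: R_s = R1 * R2 * ... * Rn."""
    out = 1.0
    for r in rs:
        out *= r
    return out

def parallel_reliability(rs):
    """Product law of unreliabilities: R_p = 1 - (1-R1)(1-R2)...(1-Rn)."""
    f = 1.0
    for r in rs:
        f *= 1.0 - r
    return 1.0 - f

# R_s degrades quickly as the number of serial stages grows ...
r_series = series_reliability([0.99] * 5)     # 0.99**5, about 0.951
# ... while each added parallel unit gives diminishing returns:
r_par2 = parallel_reliability([0.9, 0.9])     # 0.99
r_par3 = parallel_reliability([0.9] * 3)      # 0.999
```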

SLIDE 47

Reliability Block Diagrams

Series-parallel RBD of a network. Other representations: fault trees. Limits: more general dependencies

 Structure Function

 Inclusion/exclusion formula (or SDP)

 Approach with Binary decision diagrams (BDD), Zang 99 (in Trivedi Ch1)  Factoring/Conditioning  More techniques for more general settings

[RBD figure with blocks R1, R2, R3, R3, R3, R4, R4, R5]

SLIDE 48

Fault Trees (as in Mobius)

  • Components are leaves in the tree.
  • A component fails = logical value of true, otherwise false.
  • The nodes in the tree are boolean AND, OR, and k-of-N gates.
  • The system fails if the root is true. AND gates are true if all the components are true (fail). OR gates are true if any of the components are true (fail). k-of-N gates are true if at least k of the components are true (fail).

[Gate examples: AND over C1, C2, C3; OR over C1, C2, C3; 2-of-3 over C1, C2, C3]
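The three gate types reduce to simple boolean reductions; a minimal sketch (the component states are example values):

```python
def and_gate(inputs):
    """True (fails) only if all children are true (fail)."""
    return all(inputs)

def or_gate(inputs):
    """True (fails) if any child is true (fails)."""
    return any(inputs)

def k_of_n_gate(k, inputs):
    """True (fails) if at least k of the children are true (fail)."""
    return sum(inputs) >= k

# Component state: True means "failed". Hypothetical states for illustration.
c1, c2, c3 = True, False, True

system_failed = k_of_n_gate(2, [c1, c2, c3])   # 2-of-3: two failures suffice
```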

SLIDE 49

Fault trees (as in Mobius)

[Example fault tree with an OR gate, a 2-of-3 gate over C1, C2, C3, an AND gate over H1, H2, and an AND gate over L1, L2]

SLIDE 50

Simulation Modeling

Large models described in a compositional manner

Atomic models

 Variants of stochastic automata: SANs, PEPA, ...

Composition

 shared variables vs. action synchronization

Measurement

 Rate and impulse rewards, measured for an instant or interval of time, or in steady state

Exploration of a design space

 Series of experiments over parameter sets  Design of experiments

Analysis

 Queueing network analysis (Highly constrained for non-simulative results)  CTMC analysis (Exponential distributions & finite state spaces)  Simulation (General, but only statistical estimates based on observed behavior,

rare events are problematic)


SLIDE 51

Types of simulation Continuous simulation vs discrete event simulation For discrete event simulation

 Terminating vs. steady-state simulation  Generation of pseudo-random variables

 Generation of uniform [0,1] random variates

 Linear congruential generators  Tausworthe generator  ...

 Test of uniform [0,1] generators  Generation of non-uniform random variates based on uniform generators

 Inverse transform technique  Convolution technique  Composition technique  Acceptance/Rejection technique
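A linear congruential generator combined with the inverse transform technique, as a sketch (the multiplier and increment are the well-known Numerical Recipes constants, chosen here purely for illustration, not taken from the slides):

```python
import math

def lcg(seed, a=1664525, c=1013904223, m=2 ** 32):
    """Linear congruential generator: x_{k+1} = (a*x_k + c) mod m,
    yielding uniform [0,1) variates u = x/m."""
    x = seed
    while True:
        x = (a * x + c) % m
        yield x / m

def exp_inverse_transform(u, rate):
    """Inverse transform: if U ~ uniform[0,1), then X = -ln(1-U)/rate
    has an Exponential(rate) distribution."""
    return -math.log(1.0 - u) / rate

gen = lcg(seed=12345)
sample = [exp_inverse_transform(next(gen), rate=2.0) for _ in range(10_000)]
sample_mean = sum(sample) / len(sample)   # should be near 1/rate = 0.5
```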


SLIDE 52

Output analysis

Point estimates and confidence intervals. How to obtain data for estimates?

 In general:

 Independent simulation runs, independent replications  Data is i.i.d. which simplifies statistical analysis

 Special (common) case:

 Batch means on a single long simulation run  Applies only for steady state analysis of an ergodic system  Data is correlated and batch means considered to estimate variance

(necessary to calculate confidence intervals)

 Requires decision on end of transient phase, decision on batch sizes
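The batch-means step itself is mechanical; a minimal sketch (the batch size and data are illustrative, and the transient phase is assumed already removed):

```python
def batch_means(data, batch_size):
    """Split one long (warmed-up) run into consecutive batches and return
    the batch means; their sample variance feeds the confidence-interval
    formula in place of i.i.d. observations."""
    n_batches = len(data) // batch_size
    return [sum(data[i * batch_size:(i + 1) * batch_size]) / batch_size
            for i in range(n_batches)]

means = batch_means(list(range(100)), batch_size=20)   # 5 batch means
```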

Confidence intervals

 for estimate of mean and ci

 uses estimate for variance

 relies on assumption of a normal distribution

 for estimate of variance and ci

 jackknifing, bootstrapping


x̄ − t_{N−1, 1−α/2} · s/√N  ≤  µ  ≤  x̄ + t_{N−1, 1−α/2} · s/√N

Define

σ̂²_N = (1/(N−1)) · Σᵢ₌₁ᴺ (xᵢ − µ̂_N)²

SLIDE 53

Verification and Validation Validation:

“substantiation that a computerized model within its domain of applicability possesses a satisfactory range of accuracy consistent with the intended application of the model”

Verification:

“ensuring that the computer program of the computerized model and its implementation are correct”

Sargent’s WSC Tutorial 2010, cites Schlesinger 79

“Verify (debug) the computer program.” Law’s WSC 09 Tutorial

Accreditation:

DoD: “official certification that a model, simulation, or federation of models and simulations and its associated data are acceptable for use for a specific application.”

Credibility:

“developing in users the confidence they require in order to use a model and in the information derived from that model.”


SLIDE 54

Variant 1: Simplified Version of the Modeling Approach

Conceptual model validation

 determine that theories & assumptions are correct  model representation “reasonable” for intended purpose

Computerized model verification

 assure correct implementation

Operational validation

 model’s output behavior has

sufficient accuracy

Data validity

 ensure that the data necessary

for model building, evaluation, testing, and experimenting are adequate & correct.

Iterative process

 also reflects underlying learning process


[Diagram (after Sargent): Problem Entity (System), Conceptual Model, and Computerized Model, connected by Analysis and Modeling, Computer Programming and Implementation, and Experimentation; with Conceptual Model Validation, Computerized Model Verification, Operational Validation, and Data Validity as checks]

SLIDE 55

Validation Techniques


Sanity checks

 Degenerate Tests  Event validity (relative to real events)

 Extreme condition tests

 Traces to follow individual entities

Historical methods

 Rationalism (assumptions true/false)  Empiricism  Positive economics (predicts future correctly)

Variability

 Internal validity to determine amount of internal variability with

several replication runs

 Parameter Variability - Sensitivity Analysis

SLIDE 56

Computerized model verification

A special case of verification in software engineering. If a simulation framework is used:

 evaluate if framework works correctly  test random number generation  model-specific

 existing functionality/libraries are used correctly
 the conceptual model is completely and correctly encoded in the modeling notation of the employed framework

Means

 structured walk through  traces  testing, i.e., simulation is executed and dynamic behavior is checked

against a given set of criteria,

 internal consistency checks (assertions)  input-output relationships  recalculate estimates for mean and variance of input probability distributions


SLIDE 57

Operational Validity

Explore Model Behavior

• Directions of behavior
• Reasonable / precise magnitudes
• Parameter variability - sensitivity analysis
• Statistical approaches: metamodeling, design of experiments

Comparisons of Output Behavior (System vs Model)

• Most effective: trace-driven simulation
  • feed measurement data into the simulation to closely follow real behavior
• Use graphs to make subjective decisions
  • histograms, box plots, scatter plots
  • useful in the model development process to evaluate level of detail and accuracy, for face validity checks by subject matter experts, and in Turing tests
• Use confidence intervals and/or hypothesis tests to make an "objective" decision
  • problems: underlying assumptions (independence, normality) and/or insufficient system data

57
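The "objective" comparison above can be sketched with a paired confidence interval for the mean difference between system measurements and model output. All observations below are made-up placeholders; the t-quantile 2.262 is the 95% two-sided value for 9 degrees of freedom.

```python
import math
import statistics

# Hypothetical paired observations: measured system response times
# and the model's output for the same scenarios.
system = [4.1, 3.8, 4.4, 4.0, 3.9, 4.3, 4.2, 3.7, 4.1, 4.0]
model  = [4.0, 3.9, 4.2, 4.1, 3.8, 4.4, 4.0, 3.8, 4.2, 3.9]

diffs = [s - m for s, m in zip(system, model)]
n = len(diffs)
mean_d = statistics.mean(diffs)
sd_d = statistics.stdev(diffs)
t = 2.262                      # 95% t-quantile, n - 1 = 9 degrees of freedom
half = t * sd_d / math.sqrt(n)
lo, hi = mean_d - half, mean_d + half
print(f"95% CI for mean difference: [{lo:.3f}, {hi:.3f}]")
# If 0 lies inside the interval, the data do not contradict validity.
print("consistent with valid model" if lo <= 0 <= hi else "difference detected")
```

Note the caveat from the slide: this procedure assumes independent, roughly normal differences, and with little system data the interval may be too wide to be informative.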

slide-58
SLIDE 58

Documentation of VV effort

• Critical to build credibility and justify confidence
• Detailed documentation on the specifics of tests etc.
• Separate tables for data validity, conceptual model validity, computer model verification, operational validity
• Confidence in each result rated on a scale: Low / Medium / High

58

slide-59
SLIDE 59

59

Application: LEO Satellite

Communication

• Satellite - satellite: intersatellite link (ISL), if within communication range
• Satellite - ground station: gateway link (GWL), if within footprint

We discretize orbits and identify matching periods.

Elevation ε: angle with respect to the center of the radiation cone and the earth surface; it determines the footprint.

slide-60
SLIDE 60

60

More input data

Radiation dose, shielding, and its mapping to failure rates:

• 0.007 failures per year for processor and CMOS components
• 0.0001 failures per year for discrete components
• Scale factor r = 1 for 1 mm shielding for the higher orbit

Consider several model configurations!

Communication:

• Data collection rate: 2 GB/yr while memory is available; all data lost at failure
• Uplink communication considered negligible
• Simple routing mechanism
• ISL communication rate: 115 kbps with 50% overhead, i.e., 226665 MB/yr
• GWL rate: double the ISL rate
• ISL with commercial satellite networks
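The 226665 MB/yr figure follows directly from the link rate and overhead; a quick arithmetic check (using decimal megabytes):

```python
# Effective ISL throughput: 115 kbps link with 50% protocol overhead.
link_bps = 115_000               # 115 kbps
effective_bps = link_bps * 0.5   # 50% overhead leaves half for payload
seconds_per_year = 365 * 24 * 3600
bytes_per_year = effective_bps / 8 * seconds_per_year
mb_per_year = bytes_per_year / 1_000_000   # decimal MB
print(f"{mb_per_year:.0f} MB/yr")          # matches the figure on the slide
```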

slide-61
SLIDE 61

61

Where are the probabilities?

Dependability study:

• Ground station: rates of failure / repair actions
• Satellite subsystems: rates of failure / repair actions

Both are modeled with random variables that follow a negative exponential distribution with a given rate. Rate 5.0 means on average 5 events per time unit.

• Total ionizing dose is taken into account by a scaling factor r on the failure rates of components.

What is then analyzed:

• Reliability and availability
• For different levels of radiation shielding
• For different levels of redundancy of components

What type of analysis is used:

• Transient analysis of Markov chains, for the single satellite design
• Discrete event simulation of stochastic models, a valid alternative, used for evaluation of the overall network
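As a minimal sketch of transient analysis for one component with exponential failure/repair rates (the rates below are hypothetical, not from the study): the two-state failure/repair Markov chain has the closed-form point availability A(t) = μ/(λ+μ) + λ/(λ+μ)·e^(-(λ+μ)t), which we can cross-check by numerically integrating the chain's differential equation.

```python
import math

lam, mu = 0.5, 5.0   # hypothetical failure and repair rates (events per time unit)

def availability_closed_form(t):
    """Point availability of a two-state failure/repair Markov chain."""
    s = lam + mu
    return mu / s + (lam / s) * math.exp(-s * t)

def availability_numeric(t, steps=200_000):
    """Transient analysis by Euler integration of
    dP_up/dt = -lam * P_up + mu * (1 - P_up), starting in the up state."""
    p_up, dt = 1.0, t / steps
    for _ in range(steps):
        p_up += (-lam * p_up + mu * (1.0 - p_up)) * dt
    return p_up

t = 1.0
print(availability_closed_form(t), availability_numeric(t))
```

For larger models (redundant components, shared repair) the same idea applies, but the state space grows and one integrates a system of such equations, which is what transient Markov chain solvers automate.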

slide-62
SLIDE 62

62

Results wrt Performance and Dependability

Baseline results for r = 1. A set of simulation experiments is performed for:

  • different levels of r
  • different protocols
  • configurations
  • buffer capacities
  • data collection rates
  • communication with other commercial networks
slide-63
SLIDE 63

Following slides cover material from project 1

63

slide-64
SLIDE 64

Applications: Network Traffic

Failure of Poisson modeling. Observation: scale-invariant burstiness on multiple scales. Ways to describe the phenomenon:

• Long-range dependence
• Heavy-tailed distributions
• Self-similarity

64

slide-65
SLIDE 65

Burstiness on Multiple Scales

X axis: time intervals. Y axis: packets per unit time for a given time interval. Time intervals increased by a factor of 10, resp. 7 in the last step.

Burstiness in packet traffic:

• traffic "spikes" ride on longer term "ripples,"
• traffic "ripples" ride on longer term "swells,"
• ad infinitum.

65
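The contrast with Poisson traffic can be made concrete: if packet counts per unit time were Poisson, aggregating over 10x longer intervals would shrink the relative variability (coefficient of variation) by about sqrt(10), i.e., the traffic smooths out, whereas measured traffic stays bursty across scales. A small sketch, with a hypothetical rate of 10 packets per unit time:

```python
import math
import random
import statistics

def poisson(rng, lam):
    """Knuth's method for a Poisson variate (adequate for small lam)."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

rng = random.Random(1)
# Per-unit-time packet counts under a Poisson model.
counts = [poisson(rng, 10.0) for _ in range(100_000)]

def cov(series):
    """Coefficient of variation: a simple burstiness measure."""
    return statistics.stdev(series) / statistics.mean(series)

# Aggregate by a factor of 10, mimicking the slide's successive time scales.
agg = [sum(counts[i:i + 10]) for i in range(0, len(counts), 10)]
print(f"CoV at fine scale:   {cov(counts):.3f}")
print(f"CoV at coarse scale: {cov(agg):.3f}")  # drops by about sqrt(10)
```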

slide-66
SLIDE 66

Self-similarity in the Continuous Case

Consider Y(t) in a continuous setting, t ∈ R.

66

Definition 1.4.4 (H-ss): Y(t) is self-similar with self-similarity parameter, i.e., Hurst parameter, H (0 < H < 1), denoted H-ss, if for all a > 0 and t ≥ 0,

    Y(at) =d a^H Y(t)    (1.4.5)

where =d denotes equality in distribution. Thus Y(t) and its time-scaled version Y(at), after normalizing by a^H, must follow the same distribution. In the traffic modeling context, it is convenient to think of Y(t) as the cumulative or total traffic up to time t. For a > 1, time is stretched or dilated, and a contraction factor a^-H is applied to make the magnitude of Y(at) comparable to that of Y(t). For a < 1, the opposite holds true. As a varies, the scaling exponent H remains invariant.

This is a most natural definition; however, it has an important drawback: unless Y(t) is degenerate, i.e., Y(t) = 0 for all t, Y(t) cannot be stationary due to the normalization factor a^H. Its increment process X(t) = Y(t) - Y(t-1), however, is another matter.

slide-67
SLIDE 67

Long-range Dependence

X(t) is long-range dependent if its autocorrelation ρ(k) decays to zero so slowly that its sum does not converge, that is,

    Σ_{k=1..∞} |ρ(k)| = ∞.

• For short-range dependent traffic, which is non-bursty, ρ(k) falls off quickly with lag, usually exponentially.
• For long-range dependent traffic, it falls off much more slowly, usually obeying some type of power law.

Intuitively:

• memory is built in to the process, because the dependence among an LRD process's widely separated values is significant, even across large time shifts.

67
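The convergence contrast can be seen numerically: partial sums of an exponentially decaying ρ(k) settle to a limit, while partial sums of a power-law ρ(k) = k^(-0.4) (exponent chosen for illustration) keep growing without bound.

```python
# Short-range example: rho(k) = 0.5**k, whose sum converges (to 1).
# Long-range example:  rho(k) = k**-0.4, whose partial sums diverge.
def partial_sum(rho, n):
    return sum(rho(k) for k in range(1, n + 1))

srd = lambda k: 0.5 ** k
lrd = lambda k: k ** -0.4

for n in (10_000, 100_000):
    print(n, round(partial_sum(srd, n), 6), round(partial_sum(lrd, n), 1))
```

Increasing n by a factor of 10 leaves the short-range sum unchanged but keeps inflating the long-range one, which is exactly the non-summability in the definition.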

slide-68
SLIDE 68

Hurst Parameter

Some simple facts regarding H and its impact on r(k):

• if H = 1/2, then r(k) = 0 and X(t) is trivially short-range dependent since it is uncorrelated
• if 0 < H < 1/2, then the correlations sum to zero; this case, by virtue of being uncommon in applications, is uninteresting
• if H = 1, then r(k) = 1 for all k ≥ 1 (an artificial special case)
• H > 1 is prohibited by the stationarity condition on X(t)

So basically two cases remain:

• H = 1/2
• 1/2 < H < 1

To distinguish those two cases, reasonably accurate estimates for H are necessary.

68
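One standard way to estimate H is the aggregated-variance method: for the aggregated series X^(m) (block means over blocks of size m), Var(X^(m)) scales like m^(2H-2), so H can be read off the slope of log Var(X^(m)) versus log m. A minimal sketch on uncorrelated noise, whose true H is 1/2 (block sizes and sample length are arbitrary choices):

```python
import math
import random
import statistics

rng = random.Random(3)
x = [rng.gauss(0.0, 1.0) for _ in range(100_000)]   # white noise: true H = 1/2

def aggregated_variance(x, m):
    """Variance of the block-mean (aggregated) series X^(m)."""
    blocks = [statistics.fmean(x[i:i + m]) for i in range(0, len(x) - m + 1, m)]
    return statistics.variance(blocks)

ms = [1, 2, 4, 8, 16, 32, 64]
logs = [(math.log(m), math.log(aggregated_variance(x, m))) for m in ms]

# Least-squares slope of log-variance against log-m; H = 1 + slope/2.
mx = statistics.fmean(u for u, _ in logs)
my = statistics.fmean(v for _, v in logs)
slope = sum((u - mx) * (v - my) for u, v in logs) / sum((u - mx) ** 2 for u, _ in logs)
H = 1 + slope / 2
print(f"estimated H = {H:.3f}")   # close to 0.5 for uncorrelated noise
```

For genuinely long-range dependent data the estimate lands in (1/2, 1); in practice one combines several estimators (R/S, periodogram, wavelets), since each has known biases.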

slide-69
SLIDE 69

Heavy-tailed Distribution

69

A random variable Z has a heavy-tailed distribution if

    P(Z > z) ~ c z^-α  as z → ∞,    (1.4.10)

where 0 < α < 2 is called the tail index or shape parameter and c is a positive constant. That is, the tail of the distribution, asymptotically, decays hyperbolically. This is in contrast to light-tailed distributions, e.g., exponential and Gaussian, which possess an exponentially decreasing tail. A distinguishing mark of heavy-tailed distributions is that they have infinite variance for α < 2, and if α ≤ 1, they also have an unbounded mean. In the networking context, we will be primarily interested in the case 1 < α < 2. A frequently used heavy-tailed distribution is the Pareto distribution.

slide-70
SLIDE 70

Example of a Heavy-Tailed Distribution

70

Pareto distribution, whose distribution function is given by

    F(z) = P(Z ≤ z) = 1 - (b/z)^α,  z ≥ b,

where α is the shape parameter and b is called the location parameter. The mean is given by αb/(α-1) for α > 1. We remark that there are distributions, e.g., Weibull and log-normal, that have subexponentially decreasing tails but possess finite variance.

Also: power-law distribution, double-exponential distribution.

• If α ≤ 2, then the distribution has an infinite variance.
• If α ≤ 1, then the distribution has an infinite mean.
• Density: f(x) = α b^α x^(-α-1)
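Sampling from this distribution function is a one-liner via inverse-transform sampling: solving u = 1 - (b/z)^α gives z = b (1 - u)^(-1/α). A small check against the mean formula, with arbitrarily chosen parameters (α > 2, so mean and variance are finite):

```python
import random
import statistics

# Pareto(alpha, b): F(z) = 1 - (b/z)**alpha for z >= b.
# Inverse transform: z = b * (1 - u)**(-1/alpha) for u uniform on [0, 1).
alpha, b = 2.5, 1.0
rng = random.Random(11)
sample = [b * (1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(200_000)]

theoretical_mean = alpha * b / (alpha - 1)
print(f"sample mean {statistics.fmean(sample):.3f} vs theory {theoretical_mean:.3f}")
```

Python's standard library also offers random.paretovariate(alpha) for the b = 1 case. For α ≤ 1 the sample mean would never stabilize, no matter how many draws are taken, which is the practical face of the unbounded mean.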

slide-71
SLIDE 71

from Fowler ’99: Network Traffic Models

71

Invariant (Protocol Level): Distribution

• Connection Size: Lognormal
• Connection Duration: Lognormal
• Requested File Popularity (Application): Zipf
• Requested File Sizes, overall (Application): Hybrid: Lognormal body, Pareto tail (heavy-tailed)
• FTP Transfers (Application): Pareto tail (heavy-tailed)
• Number of Page Requests/Site (Application): Inverse Gaussian (heavy-tailed)
• Reading Time/Page, sec (Application): Heavy-tailed
• Sessions, arrivals (Session): Poisson
• Session Duration (Session): Pareto (heavy-tailed)
• Session Size (Session): Pareto (heavy-tailed)
• WAN Traffic at TCP Level (Transport): Self-similar (fractal, multifractal)
• TCP Connections/Web Session (Transport): Heavy-tailed
• Interarrival Time of Packets (Network): Heavy-tailed (LRD, fractal)
• Interarrival (Generation) Time of Packets Generated by User at Keyboard (Network): Pareto (body), Pareto (upper tail)
• Interarrival Time of Ethernet Frames (Data Link): Self-similar (fractal)