1
CS626 Data Analysis and Simulation. Instructor: Peter Kemper, R 104A, phone 221-3462, email: kemper@cs.wm.edu
Today: Recap before midterm 1
2
Big Picture: Model-based Analysis of Systems
[Diagram: a real-world problem (a portion/facet of the real world) is transferred by perception and description into a formal model (probability model, stochastic process); formal/computer-aided analysis transforms it into solutions, rewards, and qualitative/quantitative properties, which are presented and lead to a decision, i.e., a solution to the real-world problem.]
Reminder
3
This is no pipe! ... and this is no serpentine accumulator in a production line!
System - Model - Study Model vs System
Model: a largely simplified formal/mathematical/stochastic model implemented in software in a fully controlled environment
System: a set of physical devices interacting in space-time in a largely uncontrolled, not fully understood environment
Model
includes some of the rules of how the system operates, excludes others
includes some aspects of the real world as random variables, ignores others or assumes them constant
is parameterized with respect to certain design variables
Study
has an objective, a clear question
delivers values that are probabilities, like R(0,t). Interpretation?
evaluates effects of different design choices
4
CS 626 Topics
From Data to Stochastic Input Models
Input Modeling: Probability, Distributions, Exploratory Data Analysis, Statistical tests
Stochastic processes, Markov Processes: DTMC, CTMC
Phase type distributions, MAPs, MAP Fitting
Tools
for data analysis: R; for MAP fitting: KPC toolbox
Simulation Modeling
Simulation Output Data Analysis
Verification, Validation, Debugging of simulation models
Trace driven simulation
Tools for simulation: Mobius, (+Traviando)
Applications
Reliability analysis, Dependability modeling of a LEO satellite
Modeling traffic in computer networks
Emulation: Testing, Debugging, Training in Automated Material Handling Systems
5
From Data to Stochastic Input Models Probability
Axiomatic Definition Frequentist Definition
6
7
Frequency Definition of Probability
If our experiment is repeated over and over again, the proportion of time that event E occurs will be P(E).
Frequency definition of probability: P(E) = lim_{m→∞} m(E) / m, where m(E) is the number of times event E occurs and m is the number of trials.
Note:
The random experiment can be repeated under identical conditions; if repeated indefinitely, the relative frequency of occurrence of an event converges to a constant.
The law of large numbers states that this limit exists. For small m, m(E) can show strong fluctuations.
8
Axiomatic Definition of Probability Definition
For each event E of the sample space S, we assume that a number P(E) is defined that satisfies Kolmogorov's axioms:
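The axioms themselves appeared as a formula image on the slide and did not survive extraction; the standard statement is:

```latex
\begin{itemize}
  \item $0 \le P(E) \le 1$ for every event $E$
  \item $P(S) = 1$
  \item For mutually exclusive events $E_1, E_2, \dots$:
        $P\bigl(\bigcup_{i} E_i\bigr) = \sum_{i} P(E_i)$
\end{itemize}
```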
9
Outline on Problem Solving (Goodman & Hedetniemi 77)
Identify sample space S
All elements must be mutually exclusive, collectively exhaustive. All possible outcomes of experiment should be listed separately.
(Root of “tricky” problems: often ambiguity or an inexact formulation of the model of a physical situation)
Assign probabilities
To all elements of S, consistent with Kolmogorov’s axioms.
(In practice: estimates based on experience, analysis or common assumptions)
Identify events of interest
Recast statements as subsets of S. Use laws (algebra of events) for simplification. Use visualizations for clarification.
Compute desired probabilities
Use axioms and laws; often helpful: express the event of interest as a union of mutually exclusive events and sum up probabilities.
10
More relations: What is the probability of a union of two events? What is the probability of a union of a set of events? Is there a better way to calculate this?
Sum of disjoint products (SDP) formula
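The probability of a union can be computed by inclusion-exclusion (the SDP formula is an alternative that avoids the alternating sum). A minimal sketch over a finite sample space, with a hypothetical four-outcome space:

```python
from itertools import combinations

def prob_union(events, prob):
    """P(E1 ∪ ... ∪ En) by inclusion-exclusion.
    events: list of sets of outcomes; prob: dict outcome -> probability."""
    n = len(events)
    total = 0.0
    for k in range(1, n + 1):
        for combo in combinations(events, k):
            inter = set.intersection(*combo)       # intersection of k events
            total += (-1) ** (k + 1) * sum(prob[o] for o in inter)
    return total

# Hypothetical uniform sample space with four outcomes:
p = {1: 0.25, 2: 0.25, 3: 0.25, 4: 0.25}
E, F = {1, 2}, {2, 3}
# P(E ∪ F) = P(E) + P(F) - P(EF) = 0.5 + 0.5 - 0.25
print(prob_union([E, F], p))  # 0.75
```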
11
Conditional Probabilities Definition
The conditional probability of E given F is P(E|F) = P(EF) / P(F) if P(F) > 0, and it is undefined otherwise.
Interpretation: Given F has happened, only outcomes in EF are still possible for E, so the original probability P(EF) is scaled by 1/P(F).
Multiplication rule: P(EF) = P(E|F) P(F).
[Venn diagram: given F happens, E shrinks to EF within F.]
12
Independent events
Definition
Two events E and F are independent if P(EF) = P(E) P(F).
This also means P(E|F) = P(E). In English: E and F are independent if knowledge that F has occurred does not affect the probability that E occurs.
Notes:
if E, F are independent, then so are E, Fᶜ and Eᶜ, F and Eᶜ, Fᶜ
generalizes from 2 to n events, e.g. for n = 3 every subset must be independent
Mutually exclusive vs independent: these are different notions
13
About independent events Venn diagrams Tree diagrams of sequential sample spaces
Throw coin twice
Joint sample space from the cross product of the individual sample spaces: {(H,H), (H,T), (T,H), (T,T)}; first and second throw are independent.
For independent events, consider A, B being not empty and not S:
1) if A ⊂ B, then A and B cannot be independent
2) if A ∩ B = ∅, then A and B cannot be independent
14
Joint and pairwise independence
A ball is drawn from an urn containing four balls numbered 1, 2, 3, 4.
Then we have events that are pairwise independent, but not jointly independent.
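The slide's event definitions were lost in extraction; the classical choice for this urn (a hypothetical reconstruction) is A = {1,2}, B = {1,3}, C = {1,4}, which can be checked by enumeration:

```python
from fractions import Fraction
from itertools import combinations

S = {1, 2, 3, 4}                        # urn with four equally likely balls
P = lambda E: Fraction(len(E), len(S))  # uniform probability on S

# One classical choice of events (the slide's own events were elided):
A, B, C = {1, 2}, {1, 3}, {1, 4}
events = [A, B, C]

# Every pair satisfies P(EF) = P(E)P(F) ...
pairwise = all(P(E & F) == P(E) * P(F) for E, F in combinations(events, 2))
# ... but the triple does not: P(ABC) = 1/4, while P(A)P(B)P(C) = 1/8.
joint = P(A & B & C) == P(A) * P(B) * P(C)
print(pairwise, joint)  # True False
```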
A sequence of experiments results in either a success or a failure, where Ei, i ≥ 1, denotes a success in the i-th experiment. If for all i1, i2, …, in: P(E_{i1} E_{i2} ⋯ E_{in}) = P(E_{i1}) P(E_{i2}) ⋯ P(E_{in}), we say the sequence of experiments consists of independent trials.
Independence is a very important property.
Independence simplifies calculations significantly => a very popular assumption for:
theoretical results
input modeling, workload modeling
statistical tests
output analysis of simulation models: confidence intervals for the estimate of the mean, ...
Independence need not be present in real data:
data traffic in networks: often correlated
output data of a (simulated) system, i.e. the response of a system to some workload
Ways to investigate independence:
graphics: correlation plot
tests: chi-square test for vectors, rank von Neumann test, runs test; see Law/Kelton Chap. 6.3 and Chap. 7.4.1
15
16
Bayes’ Formula
Let F1, F2, …, Fn be events of S, all mutually exclusive and collectively exhaustive.
Theorem of total probability (also Rule of Elimination): P(E) = Σ_{i=1}^{n} P(E|Fi) P(Fi)
Bayes' Formula: P(Fj|E) = P(E|Fj) P(Fj) / Σ_{i=1}^{n} P(E|Fi) P(Fi). It helps us determine which Fj happened given we observed E.
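Total probability and Bayes' formula can be sketched in a few lines; the machine/defect numbers below are hypothetical:

```python
def bayes(prior, likelihood, j):
    """P(Fj | E) = P(E|Fj) P(Fj) / sum_i P(E|Fi) P(Fi).
    prior[i] = P(Fi); likelihood[i] = P(E | Fi)."""
    total = sum(p, l = None) if False else sum(p * l for p, l in zip(prior, likelihood))
    return likelihood[j] * prior[j] / total

# Hypothetical: 3 machines produce 50/30/20% of parts, defect rates 1/2/3%.
prior = [0.5, 0.3, 0.2]
likelihood = [0.01, 0.02, 0.03]
# Given an observed defective part, which machine produced it?
print(bayes(prior, likelihood, 0))  # P(F1 | defect) = 0.005 / 0.017
```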
Random Variable RV Definition
A random variable X on a probability space (S,F,P) is a function X : S -> R
that assigns a real number X(s) to each sample point s ∈ S, such that for every real number x, the set of sample points {s|X(s) ≤ x} is an event, that is a member of F. RVs can be discrete or continuous
More concepts
cumulative distribution function, density, moments E[X^i], central moments, variance, skewness, kurtosis
Particular examples
Normal distribution Poisson distribution Exponential distribution Pareto distribution
17
Parameterization of distributions Parameters of 3 basic types Location
specifies an x-axis location point of a distribution’s range of values usually the midpoint (e.g. mean for normal distribution) or lower end
point for the distribution’s range
sometimes called shift parameter since changing its value shifts the
distribution to the left or right, e.g., for Y = X + γ
Scale
determines the scale (unit) of measurement of the values in the
range of the distribution (e.g. std deviation σ for normal distribution)
changing its value compresses/expands distribution but does not
alter its basic form, e.g., for Y = β X
Shape
determines basic form/shape of a distribution changing its values alters a distribution’s properties, e.g. skewness
more fundamentally than a change in location or scale
18
Properties of Mean, Variance and Covariance For any random variables X, Y, Z and constant c,
Let X be a random variable with density f_X(x).
Distribution function: F_X(x) = P(X ≤ x) = ∫_{−∞}^{x} f_X(y) dy
Expected value: E(X) = ∫ y f_X(y) dy, and E(cX) = c E(X)
E(X + Y) = E(X) + E(Y)
X, Y independent (P(X = x, Y = y) = P(X = x) P(Y = y)): E(XY) = E(X) E(Y)
var(X) = E((X − E(X))²) = E(X²) − (E(X))²
var(aX + b) = a² var(X)
var(X + Y) = var(X) + var(Y) + 2 cov(X, Y)
Covariance: cov(X, Y) = E((X − E(X)) (Y − E(Y)))
Correlation: ρ_{X,Y} = cov(X, Y) / (σ_X σ_Y)
X, Y independent: cov(X, Y) = 0
20
Proposition 2.4: If X1, …, Xn are independently and identically distributed with expected value µ and variance σ², then E(X̄) = µ and var(X̄) = σ²/n.
Confidence intervals for the estimate of the mean
The (1 − α) confidence interval about x̄ can be expressed as:
x̄ − t_{1−α/2, N−1} s/√N ≤ µ ≤ x̄ + t_{1−α/2, N−1} s/√N
where
t_{1−α/2, N−1} is the 100(1 − α/2)-th percentile of Student's t distribution with N − 1 degrees of freedom (values of this distribution can be found in tables),
s = √(s²) is the sample standard deviation, and
N is the number of observations.
What is input modeling? Input modeling
Deriving a representation of the uncertainty or randomness in a
stochastic simulation.
Common representations
Measurement data
Distributions derived from measurement data <-- focus of “Input modeling”
usually requires that samples are i.i.d. and the corresponding random variables in the simulation model are i.i.d.
(i.i.d. = independent and identically distributed)
theoretical distributions
empirical distributions
Time-dependent stochastic process Other stochastic processes like MAPs, MMPPs, ...
Examples include
time to failure for a machining process; demand per unit time for inventory of a product; number of defective items in a shipment of goods; times between arrivals of calls to a call center.
21
Distributions Many theoretical distributions with nice properties
experience with scenarios when to apply them (physical basis)
well-studied properties, parameters, characteristics
compact representation of data
software support for sampling in simulation runs
software support to perform parameter fitting
easy to vary by modification of parameters
some allow for closed-form analytical formulas for system analysis (queueing networks)
may allow for numbers beyond reasonable limits, e.g. negative values or very high values, such that truncation may be necessary
less sensitive to data irregularities than an empirical distribution
Compare to:
empirical distribution trace-driven simulation
22
Overview of fitting with data Select one or more candidate distributions
based on physical characteristics of the process and graphical examination of the data.
Fit the distribution to the data
determine values for its unknown parameters.
Check the fit to the data
via statistical tests and via graphical analysis.
If the distribution does not fit,
select another candidate and repeat the process, or use an empirical distribution.
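The "fit the distribution" step has a closed form for some families; a minimal sketch for the exponential distribution, whose maximum-likelihood estimate of the rate is the reciprocal of the sample mean (the interarrival data below is hypothetical):

```python
from statistics import mean

def fit_exponential(samples):
    """MLE for an exponential distribution: rate = 1 / sample mean.
    (For the exponential, the method of moments gives the same estimate.)"""
    return 1.0 / mean(samples)

# Hypothetical interarrival times (minutes):
data = [0.8, 1.3, 0.4, 2.1, 0.9, 1.6, 0.7, 1.2]
rate = fit_exponential(data)
print(rate)  # estimated events per minute
```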
23 from WSC 2010 Tutorial by Biller and Gunes, CMU, slides used with permission
What is a good fit? Goodness-of-fit tests:
Chi-squared test (χ2 ) Kolmogorov-Smirnov test (K-S) Anderson Darling test (AD)
Graphical Comparisons:
Histogram-based plots Probability plots
P-P plot Q-Q plot
Good parameter estimates
24 from WSC 2010 Tutorial by Biller and Gunes, CMU, slides used with permission
Goodness-of-fit tests
25
- Beware of goodness-of-fit tests: they are unlikely to reject any distribution when you have little data, and are likely to reject every distribution when you have lots of data.
- Avoid histogram-based summary measures, if possible, when asking the software for its recommendation!
K-S and A-D tests
Features:
- Comparison of an empirical distribution function
with the distribution function of the hypothesized distribution.
- Does not depend on the grouping of data.
- A-D detects discrepancies in the tails and has
higher power than K-S test
Chi-square test
Features:
- A formal comparison of a histogram or
line graph with the fitted density or mass function
- Sensitive to how we group the data.
from WSC 2010 Tutorial by Biller and Gunes, CMU, slides used with permission
Graphical comparisons
26
Frequency Comparisons
Features:
- Graphical comparison of a histogram of
the data with the density function of the fitted distribution.
- Sensitive to how we group the data.
Probability Plots
Features:
- Graphical comparison of an estimate of the
true distribution function of the data with the distribution function of the fit.
- Q-Q (P-P) plot amplifies differences
between the tails (middle) of the model and sample distribution functions.
- Use every graphical tool in the software to examine the fit.
- If histogram-based tool, then play with the widths of the cells.
- Q-Q plot is very highly recommended!
from WSC 2010 Tutorial by Biller and Gunes, CMU, slides used with permission
Check the fit to the data: Statistical tests
Define a measure X for the difference between the fitted distribution and the data. The test statistic X is an RV: a small X means a small difference, a large X means a huge difference.
If we find an argument for what distribution X has, we get a statistical test to see whether, in a concrete case, a value of X is significant or not.
Say P(X ≤ x) = 1 − α, and e.g. this holds for x = 10 and α = 0.05. Then we know that if data is sampled from the given distribution and this is done n times (n → ∞), the measure X will be below 10 in 95% of those cases.
If in our case the sample data yields x = 10.7, we can argue that it is too unlikely that the sample data is from the fitted distribution.
Concepts, Terminology
Hypothesis H0, Alternative H1
Power of a test: (1 − β), the probability to correctly reject H0
α / Type I error: rejecting a true hypothesis
β / Type II error: not rejecting a false hypothesis
P-value: probability of observing a result at least as extreme as the test statistic, assuming H0 is true
27
Sample test characteristic for Chi-Square test (all parameters known)
28
One-sided test.
Right side: critical region, region of rejection.
Left side: region of acceptance, where we fail to reject the hypothesis.
P-value of x: 1 − F(x)
Tests and p-values
In the typical test:
H0: the chosen distribution fits
H1: the chosen distribution does not fit
The p-value of a test:
is the probability of observing a result at least as extreme as the test statistic, assuming H0 is true (hence 1 − F(x) on the previous slide)
is the Type I error level (significance) at which we would just reject H0 for the given data.
Implications:
If the α level (common values: 0.01, 0.05, 0.1) < p-value, then we do not reject H0; otherwise, we reject H0.
If the p-value is large (> 0.10), then values more extreme than our current one are still reasonably likely, so we fail to reject H0; in this sense it supports H0 that the distribution fits (but not more than that!).
29
Chi-Square Test Histogram-based test
30
Test statistic: χ² = Σ_{i=1}^{k} (O_i − E_i)² / E_i, where O_i is the observed number of samples in the i-th interval and E_i = n p_i is the expected number, with p_i the theoretical probability of the i-th interval.
Sums the squared differences
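The statistic can be sketched in one line; the cell counts below are hypothetical:

```python
def chi_square_stat(observed, expected):
    """Chi-square GOF statistic: sum over intervals of (O_i - E_i)^2 / E_i."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical counts in 4 histogram cells vs. expected counts under the fit:
O = [18, 22, 31, 29]
E = [25, 25, 25, 25]
x2 = chi_square_stat(O, E)
print(x2)  # compare against a chi-square critical value with k-1 dof
```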
Kolmogorov-Smirnov Test
31
Test statistic: D_n = sup_x |F_n(x) − F(x)|, the maximum difference between the CDF of the hypothesized distribution and the CDF of the empirical distribution constructed from the data. With n observations x1, x2, …, xn, the empirical CDF F_n(x) is the fraction of the observations that are ≤ x.
The test looks at the maximum difference; the test is useful when the sample size is small.
[Figure: empirical step-function CDF vs. hypothesized CDF; the K-S test detects the largest vertical gap between the two.]
K-S Test Sometimes a bit tricky: geometric meaning of test statistic
32
For details, see Law/Kelton, Chap. 6.
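The geometric subtlety is that the supremum is attained at a sample point, either just before or just after the empirical CDF's jump, so both sides of each step must be checked. A minimal sketch (the data and the fitted Exp(1) model are hypothetical):

```python
from math import exp

def ks_statistic(samples, cdf):
    """K-S statistic D_n = sup_x |F_n(x) - F(x)| for a continuous F.
    At the i-th order statistic, F_n jumps from (i-1)/n to i/n, so the
    largest deviation is found by checking both values against F."""
    xs = sorted(samples)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        d = max(d, i / n - cdf(x), cdf(x) - (i - 1) / n)
    return d

# Hypothetical data checked against a fitted Exp(rate=1) CDF:
data = [0.1, 0.5, 0.9, 1.4, 2.2]
print(ks_statistic(data, lambda x: 1 - exp(-x)))
```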
Anderson-Darling test (AD test) Test statistic is a weighted average of the squared differences with weights such that weights are largest for F(x) close to 0 and 1.
33
Modified critical values for the adjusted A-D test statistic: reject H0 if A_n² exceeds the critical value.
P-P plots and Q-Q plots
35
Q-Q plot: sample quantiles x_(i) vs. model quantiles F⁻¹(q_i) for q1, …, qn; P-P plot: empirical probabilities vs. fitted probabilities F(x_(i)) for p1, …, pn. This intuitive definition needs an adjustment to handle ties (multiple samples of the same value).
Features of the Q-Q plot: It does not depend on how the data are grouped. It is much better than a density histogram when the number of data points is small. Deviations from a straight line show where the distribution does not match. A straight line implies that the family of distributions is correct; a 45° line implies that the parameters fit as well.
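Constructing the Q-Q plot coordinates is simple once the model's inverse CDF is available; a sketch against an exponential model (the data and rate below are hypothetical, and q_i = (i − 0.5)/n is one common plotting-position choice):

```python
from math import log

def qq_points(samples, inv_cdf):
    """Q-Q plot coordinates: model quantile vs. i-th order statistic.
    Uses q_i = (i - 0.5) / n so the last point avoids F^{-1}(1)."""
    xs = sorted(samples)
    n = len(xs)
    return [(inv_cdf((i - 0.5) / n), x) for i, x in enumerate(xs, start=1)]

# Against a fitted Exp(rate) model; its inverse CDF is -ln(1 - q) / rate:
rate = 0.5
pts = qq_points([1.1, 2.7, 0.4, 5.9, 3.2, 1.8], lambda q: -log(1 - q) / rate)
print(pts)  # points near the 45-degree line indicate a good fit
```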
36 from WSC 2010 Tutorial by Biller and Gunes, CMU, slides used with permission
[Q-Q plots of fitted quantile vs. input quantile, produced with @RISK:]
LogLogistic(-113.32, 156.71, 16.107): pretty good fit, but misses a bit on the right tail.
Exponential(44.468), Shift=-0.58: poor fit, misses badly in both tails.
Parameter estimates Common methods for parameter estimation are
maximum likelihood, method of moments, and least squares.
While the method matters, the variability in the data often overwhelms the differences in the estimators.
Decide what parameter estimates to use with goodness-of-fit tests and graphical comparisons.
Remember:
There is no “true distribution” just waiting to be found!
37 from WSC 2010 Tutorial by Biller and Gunes, CMU, slides used with permission
Summary
Use input models to represent uncertainty in simulation. The particular input model chosen matters! Selection of an input model is not an exact science:
no right answer, but the issues to consider are
theoretical vs. empirical data physical basis of the distribution assessment of the goodness of a fit independence of samples
Assess the sensitivity of simulation output results to the input models chosen. Use expert opinion whenever you can. Do not automatically trust a completely automated derivation of an input model.
38
Exploratory Data Analysis (EDA): Assumptions. Four typical assumptions for measurement processes: data from the process at hand "behave like":
1. random drawings; 2. from a fixed distribution; 3. with the distribution having fixed location; and 4. with the distribution having fixed variation.
Fixed location:
response = deterministic component + random component
univariate case: response = constant + error, so the fixed location is the unknown constant
can be extended to a function of many variables
effect: residuals (errors) between measurement and response should behave like a univariate process with the same assumed properties as above,
such that testing of the underlying assumptions becomes a tool for the validation and quality of fit of the chosen model
4 assumptions hold => probabilistic predictability, process is “in statistical control”, can do predictions
39
EDA: Four techniques for testing assumptions
- 1. run sequence plot (Yi versus i)
- 2. lag plot (Yi versus Yi-1)
- 3. histogram (counts versus subgroups of Y)
- 4. normal probability plot (ordered Y versus theoretical ordered Y)
40
Example: Process with
- fixed location
- fixed variation
- random
- distribution approx. normal
- no outliers
Interpretation of the 4-Plot
Fixed Location: If the fixed location assumption holds, then the run sequence plot will be flat and non-drifting.
Fixed Variation: If the fixed variation assumption holds, then the vertical spread in the run sequence plot will be approximately the same over the entire horizontal axis.
Randomness: If the randomness assumption holds, then the lag plot will be structureless and random.
Fixed Distribution: If the fixed distribution assumption holds, in particular if the fixed normal distribution holds, then the histogram will be bell-shaped, and the normal probability plot will be linear.
41
Autocorrelation Plot
Purpose: check randomness.
If random, autocorrelations should be near zero for any and all time-lag separations.
If non-random, then one or more of the autocorrelations will be significantly non-zero.
42
Observation:
rather high degree of correlation, hence not random; the horizontal lines around zero indicate thresholds for noise
Definition: r(h) vs h
vertical axis: autocorrelation coefficient r(h); horizontal axis: time lag h (h = 1, 2, 3, ...); note: range [-1, +1]
Memo: long-range dependency
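The coefficient plotted on the vertical axis can be sketched directly from its definition; the alternating series below is a hypothetical example with an obvious lag structure:

```python
def autocorr(xs, h):
    """Lag-h autocorrelation coefficient r(h), always in [-1, +1]."""
    n = len(xs)
    m = sum(xs) / n
    var = sum((x - m) ** 2 for x in xs)                          # lag-0 sum
    cov = sum((xs[i] - m) * (xs[i + h] - m) for i in range(n - h))
    return cov / var

# Alternating series: strong negative correlation at lag 1, positive at lag 2.
ys = [1, -1, 1, -1, 1, -1, 1, -1]
print(autocorr(ys, 1), autocorr(ys, 2))  # -0.875 0.75
```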
Autocorrelation plots
43
[Example plots: random data; moderate positive autocorrelation; strong autocorrelation and autoregressive model; sinusoidal model]
What if i.i.d is inappropriate? Time-dependent stochastic Process
Time-dependent, non-homogeneous, non-stationary Poisson Process
Markovian Arrival Process
MAPs, definition MAP fitting algorithms
44
45
Excursion: Reliability Analysis with Reliability Block Diagrams Reliability of series-parallel systems Motivation:
Illustrate how probabilities can be applied. Illustrate how powerful the independence assumption is.
We consider a set of components with index i=1,2,…
Event Ai = “component i is functioning properly” Reliability Ri of i is the probability P(Ai)
Series system:
Entire system fails if any of its components fails
Parallel system:
Entire system fails if all of its components fail
Key assumption:
Failure of components are mutually independent.
For now, R is a probability; later, R will be a function of time t.
46
Reliability Analysis (if component failures are independent)
Reliability of a series system: R_s = Π_{i=1}^{n} R_i (product law of reliabilities)
Based on the assumption of series connections. Note how quickly R_s degrades as n = 1, 2, … grows.
Reliability of a parallel system:
Let F_i = 1 − R_i be the unreliability of a component and F_p = 1 − R_p that of a parallel system. Then F_p = Π_{i=1}^{n} F_i, i.e., R_p = 1 − Π_{i=1}^{n} (1 − R_i) (product law of unreliabilities).
Note: also a law of diminishing returns (the rate of increase in reliability decreases rapidly as n increases).
Reliability of a series-parallel system:
Of n serial stages, stage i has n_i identical components in parallel: R = Π_{i=1}^{n} (1 − (1 − R_i)^{n_i})
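Under the independence assumption, the three laws reduce to a few lines; the component reliabilities below are hypothetical:

```python
from math import prod

def r_series(rs):
    """Product law of reliabilities: all components must work."""
    return prod(rs)

def r_parallel(rs):
    """Product law of unreliabilities: the system fails only if all fail."""
    return 1 - prod(1 - r for r in rs)

def r_series_parallel(stages):
    """n serial stages, stage i a list of parallel component reliabilities."""
    return prod(r_parallel(stage) for stage in stages)

# Hypothetical components with R = 0.9:
print(r_series([0.9, 0.9, 0.9]))    # 0.729: degrades quickly with n
print(r_parallel([0.9, 0.9]))       # 0.99: redundancy helps
print(r_series_parallel([[0.9, 0.9], [0.9], [0.9, 0.9, 0.9]]))
```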
47
Reliability Block Diagrams
Series parallel RBD of a network Other representations: Fault trees Limits: more general dependencies
Structure Function
Inclusion/exclusion formula (or SDP)
Approach with Binary decision diagrams (BDD), Zang 99 (in Trivedi Ch1) Factoring/Conditioning More techniques for more general settings
Fault Trees (as in Mobius)
48
Components are leaves in the tree.
A component fails = logical value of true, otherwise false.
The nodes in the tree are boolean AND, OR, and k-of-N gates.
The system fails if the root is true:
AND gates are true if all of their components are true (fail).
OR gates are true if any of their components are true (fail).
k-of-N gates are true if at least k of their components are true (fail).
[Figure: three example gates over components C1, C2, C3: an AND gate, an OR gate, and a 2-of-3 gate.]
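The gate semantics above can be sketched directly with boolean functions; the failure states assigned to C1..C3 below are hypothetical:

```python
def AND(*inputs):
    """True (failed) if all inputs have failed."""
    return all(inputs)

def OR(*inputs):
    """True (failed) if any input has failed."""
    return any(inputs)

def k_of_n(k, *inputs):
    """True (failed) if at least k inputs have failed."""
    return sum(inputs) >= k

# Hypothetical component states: C1 and C3 have failed, C2 works.
C1, C2, C3 = True, False, True
print(AND(C1, C2, C3))        # False: not all components failed
print(OR(C1, C2, C3))         # True: at least one failed
print(k_of_n(2, C1, C2, C3))  # True: 2 of 3 failed
```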
Fault trees (as in Mobius)
49
[Figure: an example fault tree combining an OR gate over C1, C2, C3, a 2-of-3 gate, and AND gates over H1, H2 and over L1, L2.]
Simulation Modeling Large models described in a compositional manner Atomic models
Variants of stochastic automata: SANs, PEPA, ...
Composition
shared variables vs action synchronization
Measurement
Rate and impulse rewards, measured for an instant/interval of time or in steady state
Exploration of a design space
Series of experiments over parameter sets Design of experiments
Analysis
Queueing network analysis (Highly constrained for non-simulative results) CTMC analysis (Exponential distributions & finite state spaces) Simulation (General, but only statistical estimates based on observed behavior,
rare events are problematic)
50
Types of simulation Continuous simulation vs discrete event simulation For discrete event simulation
Terminating vs steady-state simulation
Generation of pseudo-random variates
Generation of uniform [0,1] random variates
Linear congruential generators Tausworthe generator ...
Test of uniform [0,1] generators Generation of non-uniform random variates based on uniform generators
Inverse transform technique Convolution technique Composition technique Acceptance/Rejection technique
51
Output analysis Point estimates and confidence intervals How to obtain data for estimates?
In general:
Independent simulation runs, independent replications Data is i.i.d. which simplifies statistical analysis
Special (common) case:
Batch means on a single long simulation run
applies only for steady-state analysis of an ergodic system
data is correlated; batch means are used to estimate the variance (necessary to calculate confidence intervals)
requires a decision on the end of the transient phase and a decision on batch sizes
Confidence intervals
for estimate of mean and ci
uses estimate for variance
relies on assumption of a normal distribution
for estimate of variance and ci
jackknifing, bootstrapping
52
x̄ − t_{1−α/2, N−1} s/√N ≤ µ ≤ x̄ + t_{1−α/2, N−1} s/√N, with probability 1 − α
Define µ̂ = (1/N) Σ_{i=1}^{N} x_i and s² = (1/(N−1)) Σ_{i=1}^{N} (x_i − µ̂)².
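The batch-means idea described above can be sketched in a few lines; the output series and batch count below are hypothetical:

```python
def batch_means(xs, n_batches):
    """Split one long (post-transient) run into batches and average each.
    Batch means are approximately independent if the batches are large,
    so they can feed the usual confidence-interval formula."""
    b = len(xs) // n_batches                      # batch size
    return [sum(xs[i * b:(i + 1) * b]) / b for i in range(n_batches)]

# Hypothetical correlated output series of length 12, 3 batches of 4:
ys = [2.0, 2.2, 2.1, 2.3, 2.6, 2.4, 2.5, 2.3, 2.2, 2.1, 2.0, 2.2]
means = batch_means(ys, 3)
print(means)  # use these to estimate the variance for a confidence interval
```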
Verification and Validation Validation:
“substantiation that a computerized model within its domain of applicability possesses a satisfactory range of accuracy consistent with the intended application of the model”
Verification:
“ensuring that the computer program of the computerized model and its implementation are correct”
Sargent’s WSC Tutorial 2010, cites Schlesinger 79
“Verify (debug) the computer program.” Law’s WSC 09 Tutorial
Accreditation:
DoD: “official certification that a model, simulation, or federation of models and simulations and its associated data are acceptable for use for a specific application.”
Credibility:
“developing in users the confidence they require in order to use a model and in the information derived from that model.”
53
Variant 1: Simplified Version of Modeling Approach Conceptual model validation
determine that theories & assumptions are correct and the model representation is “reasonable” for the intended purpose
Computerized model verification
assure correct implementation
Operational validation
model’s output behavior has
sufficient accuracy
Data validity
ensure that the data necessary
for model building, evaluation, testing, and experimenting are adequate & correct.
Iterative process
also reflects underlying learning process
54
[Diagram (Sargent): the Problem Entity (System) is linked to the Conceptual Model via Analysis and Modeling (Conceptual Model Validation); the Conceptual Model is linked to the Computerized Model via Computer Programming and Implementation (Computerized Model Verification); the Computerized Model is linked back to the Problem Entity via Experimentation (Operational Validation); Data Validity underlies all links.]
Validation Techniques
55
Sanity checks
Degenerate Tests Event validity (relative to real events)
Extreme condition tests
Traces to follow individual entities
Historical methods
Rationalism (assumptions true/false) Empiricism Positive economics (predicts future correctly)
Variability
Internal validity to determine amount of internal variability with
several replication runs
Parameter Variability - Sensitivity Analysis
Computerized model verification Special case of verification in software engineering If a simulation framework is used
evaluate if framework works correctly test random number generation model-specific
existing functionality/libraries are used correctly
the conceptual model is completely and correctly encoded in the modeling notation of the employed framework
Means
structured walk-through
traces
testing, i.e., the simulation is executed and its dynamic behavior is checked against a given set of criteria
internal consistency checks (assertions)
input-output relationships
recalculate estimates for mean and variance of input probability distributions
56
Operational Validity Explore Model Behavior
Directions of behavior Reasonable / precise magnitudes Parameter variability-sensitivity analysis
Statistical approaches: Metamodeling, design of experiments
Comparisons of Output Behavior (System vs Model)
Most effective: trace driven simulation
feed measurement data into simulation to closely follow real behavior
Use graphs to make subjective decisions
Histograms, Box plots, Scatter plots Useful in model development process to evaluate level of detail and accuracy,
for face validity checks by subject matter experts, and in Turing tests
Use confidence intervals and/or hypothesis tests to make an
“objective” decision
Problems: underlying assumptions (independence, normality) and/or
insufficient system data
57
Documentation of the V&V effort: critical to build credibility and justify confidence. Detailed documentation on the specifics of tests etc. Separate tables for data validity, conceptual model validity, computer model verification, operational validity.
58
59
Application: LEO Satellite
Communication
Satellite - satellite:
if within communication range
Satellite - ground station:
if within footprint
We discretize orbits, identify matching periods
Intersatellite link (ISL), Gateway Link (GWL)
Elevation ε: angle wrt the center of the radiation cone and the earth surface
Footprint
60
More input data
Radiation dose, shielding and its mapping to failure rates,
0.007 failures per year for processor and CMOS components 0.0001 failures per year for discrete components
Scale factor r=1 for 1mm shielding for higher orbit Consider several model configurations!
Communication:
Data collection rate: 2 GB/yr while memory is available; all data lost at failure
Uplink communication considered negligible
Simple routing mechanism
ISL communication rate: 115 kbps with 50% overhead, 226665 MB/yr
GWL rate: double the ISL rate
ISL with commercial satellite networks
61
Where are the probabilities ?
Dependability study:
Ground station: rates of failure / repair actions Satellite subsystems: rates of failure / repair actions
are modeled with a random variable that follows a negative exponential distribution with a given rate. Rate 5.0 means on average 5 events per time unit.
Total Ionizing Dose: is taken into account by a scaling factor r towards failure
rates of components.
What is then analyzed
Reliability and availability For different levels of radiation shielding For different levels of redundancy of components
What type of analysis is used
Transient analysis of Markov chains
For single satellite design
Discrete event simulation of stochastic models
valid alternative, used for evaluation of overall network
62
Results wrt Performance and Dependability
Baseline results for r = 1. A set of simulation experiments is performed for:
- Different levels of r
- Different protocols
- Configurations
- Buffer capacities
- Data collection rates
- Communication with other commercial networks
Following slides cover material from project 1
63
Applications: Network Traffic
Failure of Poisson Modeling
Observation: Scale-Invariant Burstiness on Multiple Scales
Ways to describe the phenomenon:
Long-Range Dependence Heavy Tail Distributions Self-Similarity
64
Burstiness on Multiple Scales
[Figure: packets per unit time (y axis) over time intervals (x axis); time intervals increase by a factor of 10 per plot, resp. 7 in the last step]
Burstiness in packet traffic: traffic “spikes” ride on longer-term “ripples,” traffic “ripples” ride on longer-term “swells,” ad infinitum.
65
Self-similarity in the Continuous Case Consider Y(t) in a continuous setting, t ∈ R
66
Definition 1.4.4 (H-ss)
Y(t) is self-similar with self-similarity parameter, i.e., Hurst parameter, H (0 < H < 1), denoted H-ss, if for all a > 0 and t ≥ 0,
Y(t) =_d a^(-H) Y(at),   (1.4.5)
where =_d denotes equality in distribution. Thus Y(t) and its time-scaled version Y(at), after normalizing by a^H, must follow the same distribution. In the traffic modeling context, it is convenient to think of Y(t) as the cumulative or total traffic up to time t. For a > 1, time is stretched or dilated, and a contraction factor a^(-H) is applied to make the magnitude of Y(at) comparable to that of Y(t). For a < 1, the opposite holds true. As a varies, the scaling exponent H remains invariant. This is a most natural definition; however, it has an important drawback: unless Y(t) is degenerate, i.e., Y(t) = 0 for all t, Y(t) cannot be stationary due to the normalization factor a^(-H). Its increment process X(t) = Y(t) − Y(t−1), however, is another matter.
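A quick numerical illustration of (1.4.5), not from the lecture: standard Brownian motion is H-ss with H = 1/2, so Var[Y(at)] should equal a^(2H) · Var[Y(t)] = a · Var[Y(t)]. A sketch that checks this scaling law on simulated path endpoints (path counts and step sizes are arbitrary choices):

```python
import random

def brownian_endpoint(t: float, steps: int = 100) -> float:
    """Endpoint of a standard Brownian path on [0, t] via Gaussian increments."""
    dt = t / steps
    return sum(random.gauss(0.0, dt ** 0.5) for _ in range(steps))

random.seed(1)
t, a, n = 1.0, 4.0, 5_000
var_t  = sum(brownian_endpoint(t) ** 2 for _ in range(n)) / n      # ~ t = 1
var_at = sum(brownian_endpoint(a * t) ** 2 for _ in range(n)) / n  # ~ a*t = 4
ratio = var_at / var_t   # ~ a**(2*H) = 4 for H = 1/2
```

The ratio settles near a^(2H) = 4, i.e., stretching time by a = 4 inflates the magnitude by a^H = 2, exactly the contraction the definition compensates for.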
Long-range Dependence
X(t) is long-range dependent if its autocorrelation ρ(k) decays to zero so slowly that its sum does not converge, that is, ∑_{k=1}^{∞} |ρ(k)| = ∞.
For short-range dependent traffic, which is non-bursty, ρ(k) falls off quickly with the lag k, usually exponentially.
For long-range dependent traffic, it falls off much more slowly, usually obeying some type of power law.
Intuitively, memory is built into the process because the dependence among an LRD process's widely separated values is significant, even across large time shifts.
67
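To make the short-range case concrete (an illustrative sketch, not lecture code): an AR(1) process with coefficient φ has ρ(k) = φ^k, which decays exponentially, so ∑_{k=1}^{∞} |ρ(k)| = φ/(1 − φ) is finite. The sample autocorrelation reproduces this fast decay:

```python
import random

def sample_acf(x, k):
    """Lag-k sample autocorrelation of the sequence x."""
    n = len(x)
    m = sum(x) / n
    var = sum((v - m) ** 2 for v in x) / n
    cov = sum((x[i] - m) * (x[i + k] - m) for i in range(n - k)) / n
    return cov / var

random.seed(2)
phi, x = 0.5, [0.0]
for _ in range(50_000):                 # AR(1): x[t] = phi*x[t-1] + noise
    x.append(phi * x[-1] + random.gauss(0.0, 1.0))
rho1 = sample_acf(x, 1)                 # ~ phi = 0.5
rho5 = sample_acf(x, 5)                 # ~ phi**5 ≈ 0.03: nearly gone
```

An LRD trace would instead show ρ(k) shrinking like a power law, with correlations still visibly nonzero at large lags.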
Hurst Parameter
Some simple facts regarding H and its impact on r(k):
- if H = 1/2, then r(k) = 0 and X(t) is trivially short-range dependent, since it is uncorrelated
- if 0 < H < 1/2, the correlations are summable and the process is short-range dependent; this case is rarely encountered in practice and uninteresting here
- if H = 1, then r(k) = 1 for all k ≥ 1 (an artificial special case)
- H > 1 is prohibited by the stationarity condition on X(t)
So basically two cases remain: H = 1/2 and 1/2 < H < 1. To distinguish those two cases, reasonably accurate estimates for H are necessary.
68
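One standard way to obtain such an estimate is the aggregated-variance method: for the increment process of an H-ss process, the variance of the m-aggregated series scales like m^(2H−2), so H can be read off the slope of a log-log regression. A sketch (the course tools, e.g. R, ship more robust estimators); for i.i.d. noise the estimate should come out near H = 1/2:

```python
import math
import random

def hurst_aggvar(x, block_sizes):
    """Estimate H via the aggregated-variance method:
    Var[X^(m)] ~ m**(2H - 2), so the log-log slope is 2H - 2."""
    pts = []
    for m in block_sizes:
        means = [sum(x[i:i + m]) / m for i in range(0, len(x) - m + 1, m)]
        mu = sum(means) / len(means)
        var = sum((v - mu) ** 2 for v in means) / len(means)
        pts.append((math.log(m), math.log(var)))
    # least-squares slope through the (log m, log var) points
    n = len(pts)
    mx = sum(p[0] for p in pts) / n
    my = sum(p[1] for p in pts) / n
    slope = (sum((p[0] - mx) * (p[1] - my) for p in pts)
             / sum((p[0] - mx) ** 2 for p in pts))
    return 1 + slope / 2                # slope = 2H - 2

random.seed(3)
noise = [random.gauss(0.0, 1.0) for _ in range(100_000)]
H = hurst_aggvar(noise, [1, 2, 4, 8, 16, 32, 64])  # ~ 0.5 for i.i.d. noise
```

Applied to a bursty LRD packet trace, the same estimator would yield H clearly above 1/2.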
Heavy-tailed Distribution
A random variable Z has a heavy-tailed distribution if
P(Z > z) ~ c z^(-α), as z → ∞, with 0 < α < 2,   (1.4.10)
where α is called the tail index or shape parameter and c is a positive constant. That is, the tail of the distribution, asymptotically, decays hyperbolically. This is in contrast to light-tailed distributions, e.g., exponential and Gaussian, which possess an exponentially decreasing tail. A distinguishing mark of heavy-tailed distributions is that they have infinite variance for 0 < α < 2, and if 0 < α ≤ 1, they also have an unbounded mean. In the networking context, we will be primarily interested in the case 1 < α < 2. A frequently used heavy-tailed distribution is the Pareto distribution.
69
Example of a Heavy-Tailed Distribution
70
A frequently used heavy-tailed distribution is the Pareto distribution, whose distribution function is given by
P(Z ≤ z) = 1 − (b/z)^α, z ≥ b,
where α is the shape parameter and b is called the location parameter. The mean is given by E[Z] = αb/(α − 1) for α > 1. We remark that there are distributions, e.g., Weibull and log-normal, that have subexponentially decreasing tails but possess finite variance. The main characteristic of a random variable obeying a heavy-tailed distribution is that it exhibits extreme variability.
Also called: power-law distribution, double-exponential distribution.
- If α ≤ 2, then the distribution has an infinite variance.
- If α ≤ 1, then the distribution has an infinite mean.
Density: f(z) = α b^α z^(-α-1)
from Fowler ’99: Network Traffic Models
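The Pareto distribution is easy to sample by inverse transform: if U ~ Uniform(0,1), then Z = b / U^(1/α) satisfies P(Z ≤ z) = 1 − (b/z)^α. A sketch with illustrative parameter values that also checks the mean formula αb/(α − 1) empirically:

```python
import random

def pareto_sample(alpha: float, b: float) -> float:
    """Inverse-transform sample: P(Z <= z) = 1 - (b/z)**alpha for z >= b."""
    return b / random.random() ** (1.0 / alpha)

random.seed(4)
alpha, b = 2.5, 1.0                      # alpha > 2: mean and variance exist
n = 200_000
emp_mean = sum(pareto_sample(alpha, b) for _ in range(n)) / n
true_mean = alpha * b / (alpha - 1)      # = 5/3 for these parameters
```

Python's standard library offers `random.paretovariate(alpha)` for the b = 1 case. Repeating the experiment with alpha ≤ 1 shows the empirical mean failing to converge, the practical face of an unbounded mean.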
71
Invariant | Protocol Level | Distribution
Connection Size |  | Lognormal
Connection Duration |  | Lognormal
Requested File Popularity | Application | Zipf
Requested File Sizes (Overall) | Application | Hybrid: Lognormal body, Pareto tail (Heavy-tailed)
FTP Transfers | Application | Pareto tail (Heavy-tailed)
Number Of Page Requests/Site | Application | Inverse Gaussian (Heavy-tailed)
Reading Time/Page (Sec) | Application | Heavy-tailed
Sessions (Arrivals) | Session | Poisson
Session Duration | Session | Pareto (Heavy-tailed)
Session Size | Session | Pareto (Heavy-tailed)
WAN Traffic At TCP Level | Transport | Self-similar (fractal, multifractal)
TCP Connections/Web Session | Transport | Heavy-tailed
Interarrival Time Of Packets | Network | Heavy-tailed (LRD, fractal)
Interarrival (Generation) Time Of Packets Generated By User At Keyboard | Network | Pareto (body), Pareto (upper tail)
Interarrival Time Of Ethernet Frames | Data Link | Self-similar (fractal)