1

CMPE 252A: Computer Networks

Review of Probability Theory and Random Processes
2

Probability Theory

• A mathematical model used to quantify the likelihood of events taking place in an experiment in which events are random.
• It consists of:
  • A sample space: the set of all possible outcomes of a random experiment.
  • The set of events: subsets of the sample space.
  • The probability measure: defined according to a probability law for all the events of the sample space.

3

Probability Law

• Let E be a random experiment (r.e.)
• Let A be an event in E
• The probability of A is denoted by P(A)
• A probability law for E is a rule that assigns P(A) to A in a way that the following conditions, taken from our daily experience, are satisfied:
  • A may or may not take place; it has some likelihood (which may be 0 if it never occurs).
  • Something must occur in our experiment.
  • If one event excludes the other, then the likelihood that either occurs is the sum of their individual likelihoods.


4

Probability Law

• More formally, we state the same as follows.
• A probability law for E is a rule that assigns a number P(A) to each event A in E satisfying the following axioms:
  • AI: 0 ≤ P(A)
  • AII: P(S) = 1
  • AIII: A ∩ B = ∅ ⇒ P(A ∪ B) = P(A) + P(B)

Everything else is derived from these axioms!

5

Important Corollaries

1. If A1, A2, ..., An are mutually exclusive (Ai ∩ Aj = ∅ for all i ≠ j), then P(A1 ∪ A2 ∪ ... ∪ An) = Σ_k P(A_k)
2. P(A^c) = 1 − P(A)
3. P(A) ≤ 1
4. P(∅) = 0
5. P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
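These corollaries can be checked mechanically on a small finite sample space. A minimal sketch in Python (a hypothetical fair-die experiment, not from the slides), using exact fractions:

```python
from fractions import Fraction

# Hypothetical finite experiment (not from the slides): one roll of a fair die.
sample_space = {1, 2, 3, 4, 5, 6}

def P(event):
    # Uniform probability law: |event| / |sample space|.
    return Fraction(len(event & sample_space), len(sample_space))

A = {2, 4, 6}   # "roll is even"
B = {4, 5, 6}   # "roll is greater than 3"

# Corollary 5 (inclusion-exclusion): P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
assert P(A | B) == P(A) + P(B) - P(A & B)

# Corollary 2: P(A^c) = 1 − P(A)
assert P(sample_space - A) == 1 - P(A)

# Corollary 4: P(∅) = 0
assert P(set()) == 0
```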

6

Probability Law

• All probabilities must be in [0, 1]
• The sum of probabilities must be at most 1

[Venn diagram: sample space S containing event A and outcome x, with P(S) = 1, P(∅) = 0, and probabilities P(A), P(x) assigned by the law]


7

Conditional Probability

Events of interest occur within some context, and that context changes their likelihood.

[Timeline sketch: packet collisions over time; interarrival times]

We are interested in events occurring given that others take place!

8

Conditional Probability

• The likelihood of an event A occurring given that another event B occurs generally differs from the likelihood that A occurs at all.
• We define the conditional probability of A given B as follows:

P(A | B) = P(A ∩ B) / P(B),  for P(B) > 0

We require P(B) > 0 because we know B occurs!
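The definition is easy to exercise by enumeration. A sketch (a hypothetical two-dice experiment, not from the slides):

```python
from fractions import Fraction
from itertools import product

# Hypothetical experiment (not from the slides): roll two fair dice.
sample_space = set(product(range(1, 7), repeat=2))

def P(event):
    return Fraction(len(event & sample_space), len(sample_space))

def P_cond(A, B):
    # P(A | B) = P(A ∩ B) / P(B), defined only for P(B) > 0.
    assert P(B) > 0
    return P(A & B) / P(B)

A = {s for s in sample_space if s[0] + s[1] == 7}  # "sum is 7"
B = {s for s in sample_space if s[0] == 3}         # "first die shows 3"

print(P(A))          # 1/6
print(P_cond(A, B))  # 1/6 -- here the extra knowledge happens not to change it
```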

9

Theorem of Total Probability

• Purpose is to divide and conquer
• We can describe how likely an event is by partitioning it into mutually exclusive pieces.

[Timeline sketch: success/failure events over alternating busy and idle periods, partitioned into intervals B1, B2, B3]


10

Theorem of Total Probability

[Venn diagram: sample space partitioned into B1, B2, B3, ..., Bn, with event A overlapping several pieces, e.g. A ∩ B2]

P(A) = Σ_i P(A | B_i) P(B_i) = Σ_i P(A ∩ B_i)

Intersections of A with the B_i's are mutually exclusive.
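The theorem can be verified on a small partition. A sketch (a hypothetical two-dice setup, not from the slides), partitioning by the first die:

```python
from fractions import Fraction
from itertools import product

# Hypothetical example (not from the slides): roll two fair dice and partition
# the sample space by the value of the first die.
sample_space = set(product(range(1, 7), repeat=2))

def P(event):
    return Fraction(len(event & sample_space), len(sample_space))

A = {s for s in sample_space if s[0] + s[1] == 7}  # "sum is 7"
partition = [{s for s in sample_space if s[0] == i} for i in range(1, 7)]

# Total probability: P(A) = Σ_i P(A | B_i) P(B_i) = Σ_i P(A ∩ B_i)
via_conditionals = sum(P(A & B) / P(B) * P(B) for B in partition)
via_intersections = sum(P(A & B) for B in partition)
assert via_conditionals == via_intersections == P(A)
```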

11

Independence of Events

• In many cases, an event does not depend on prior events, or we want that to be the case.
• Example: Our probability model should not have to account for the entire history of a LAN.
• Independence of an event with respect to another means that its likelihood does not depend on that other event.
  • A is independent of B if P(A | B) = P(A)
  • B is independent of A if P(B | A) = P(B)
• So the likelihood of A does not change by knowing about B, and vice versa!
• This also means:

P(A ∩ B) = P(A) P(B)
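The product rule is a sharp test of independence. A sketch (hypothetical two-dice events, not from the slides) showing it hold for independent events and fail for dependent ones:

```python
from fractions import Fraction
from itertools import product

# Hypothetical check (not from the slides): two fair dice rolled together.
sample_space = set(product(range(1, 7), repeat=2))

def P(event):
    return Fraction(len(event & sample_space), len(sample_space))

A = {s for s in sample_space if s[0] % 2 == 0}  # "first die is even"
B = {s for s in sample_space if s[1] > 4}       # "second die is greater than 4"

# Independent events: the product rule holds...
assert P(A & B) == P(A) * P(B)

# ...but for dependent events it does not.
C = {s for s in sample_space if s[0] + s[1] == 12}  # "sum is 12" depends on A
assert P(A & C) != P(A) * P(C)
```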

12

Random Variables

• We are not interested in describing the likelihood of every outcome of an experiment explicitly.
• We are interested in quantitative properties associated with the outcomes of the experiment.
• Example:
  • What is the probability with which each packet sent in an experiment is received correctly? We don't really care!
  • What is the probability of receiving no packets correctly within a period of T sec.? We care! This may make a router delete a neighbor!


13

Random Variables

• We implicitly use a measurement that assigns a numerical value to each outcome of the experiment.
• The measurement is [in theory] deterministic; it is based on deterministic rules.
• The randomness of the observed values of the measurement is completely determined by the randomness of the experiment itself.
• A random variable X is a rule that assigns a numerical value to each outcome of a random experiment. Yes, it is really a function!

14

Random Variables

• Definition: A random variable X on a sample space S is a function X: S → ℝ that assigns a real number X(s) to each sample point s in S.

[Figure: sample points s1, ..., s5 in S mapped onto the real line (−∞, +∞), with values X(s1), X(s2), ..., and X(s4) = X(s5)]

S_X = { X(s_i) | s_i ∈ S } ⊂ ℝ, which is called the event space

15

Random Variables

• Purpose is to simplify the description of the problem.
• We will not have to define the sample space!

[Figure: a sample point s ∈ S mapped to a value X(s) = x on the real line (−∞, +∞); we work directly with P(X(s) = x) over the possible values of X]


16

Types of Random Variables

• Discrete and continuous: typically used for counting packets and for measuring time intervals, respectively.

[Figure: a timeline with packets numbered 1, 2, 3, 4 (a discrete count) and a busy period followed by "next packet time?" (a continuous interval)]

17

Discrete Random Variables

• We are interested in the probability that a discrete random variable X (our measurement!) equals a certain value or range of values.
• Example: We measure the delay experienced by each packet sent from one host to another over the Internet; say we sent 1M packets (we have 1M delay measurements). We want to know the likelihood with which any one packet experiences a delay of 5 ms or less.
• Probability Mass Function (pmf) of a random variable X: the probability that X assumes a given value x

P(X = x) = p_X(x)

18

Discrete Random Variables

Cumulative Distribution Function (cdf) of a random variable X: the probability that X takes on any value in the interval (−∞, x]

F_X(x) = P(X ≤ x)

The cdf and pmf of a random variable are just probabilities and obey the same axioms AI to AIII. Therefore,

0 ≤ F_X(x) ≤ 1;  lim_{x→+∞} F_X(x) = 1;  lim_{x→−∞} F_X(x) = 0;
F_X(a) ≤ F_X(b) for a ≤ b;  P(a < X ≤ b) = F_X(b) − F_X(a)
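These cdf properties can be checked on any discrete pmf. A sketch (a hypothetical fair-die pmf, not from the slides):

```python
from fractions import Fraction

# Hypothetical discrete r.v. (not from the slides): X = value of one fair die.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def F(x):
    # cdf: F_X(x) = P(X <= x), a running sum of the pmf.
    return sum(p for v, p in pmf.items() if v <= x)

assert F(0) == 0          # behaves like lim_{x -> -inf} F_X(x) = 0
assert F(6) == 1          # behaves like lim_{x -> +inf} F_X(x) = 1
assert F(2) <= F(5)       # the cdf is non-decreasing

# P(a < X <= b) = F_X(b) - F_X(a)
a, b = 2, 5
assert sum(p for v, p in pmf.items() if a < v <= b) == F(b) - F(a)
```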


19

Continuous Random Variables

• The probability that a continuous r.v. X assumes a given value is 0.
• Therefore, we use the probability that X assumes a range of values, and make the length of that range tend to 0.
• The probability density function (pdf) of X, if it exists, is defined in terms of the cdf of X as

f_X(x) = dF_X(x) / dx

F_X(x) = ∫_{−∞}^{x} f_X(t) dt;  P(a ≤ X ≤ b) = ∫_{a}^{b} f_X(x) dx

∫_{−∞}^{+∞} f_X(t) dt = lim_{x→+∞} F_X(x) = 1, because the pdf and cdf are probabilities.

20

What We Will Use

• We are interested in:
  • Using the definitions of well-known r.v.s to compute probabilities
  • Computing average values and deviations from those values for well-known r.v.s

21

Mean and Variance

• What is the average queue length at each router?
• What is our worst-case queue?

[Figure: a network of routers A, B, C, D with numbered links 1–7 and virtual circuits VC1, VC2, ..., VCn routed over them, e.g. "For VC1 use 3", "For VC2 use 2"]


22

Mean and Variance

Expected value or mean of X:

E(X) = Σ_k x_k p_X(x_k)   for a d.r.v.
E(X) = ∫_{−∞}^{+∞} t f_X(t) dt   for a c.r.v.

The mean is always defined for us!

[Figure: queue size fluctuating over time around its mean; "too much?"]

23

Variance

Describes how much a r.v. deviates from its average value, i.e., D = X − E(X). We are only interested in the magnitude of the deviation, so we use:

D^2 = (X − E(X))^2

The variance of a r.v. is defined as the mean squared variation

Var(X) = σ^2 = E(D^2) = E[(X − E(X))^2]

Important relation:

Var(X) = E(X^2) − [E(X)]^2
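The important relation can be confirmed numerically. A sketch (a hypothetical fair-die pmf, not from the slides), computing the variance both ways:

```python
from fractions import Fraction

# Hypothetical discrete example (not from the slides): X = one fair die roll.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def E(g=lambda x: x):
    # Expected value of g(X) for a discrete r.v.: sum of g(x_k) * p_X(x_k).
    return sum(g(x) * p for x, p in pmf.items())

mean = E()
var_def = E(lambda x: (x - mean) ** 2)      # Var(X) = E[(X - E(X))^2]
var_rel = E(lambda x: x ** 2) - mean ** 2   # Var(X) = E(X^2) - E(X)^2

assert var_def == var_rel
```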

24

Properties of Mean and Variance

Useful when we discuss amplifying random variables or adding constant biases.

E(aX + b) = a E(X) + b;  E(b) = b
Var(b) = 0;  Var(X + c) = Var(X);  Var(cX) = c^2 Var(X)
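These linearity properties are easy to verify exactly. A sketch (a hypothetical die-based check with assumed constants a = 3, b = 10, not from the slides):

```python
from fractions import Fraction

# Hypothetical check (not from the slides): X is one fair die roll, scaled and
# shifted as Y = aX + b with a = 3, b = 10.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def E(g):
    return sum(g(x) * p for x, p in pmf.items())

def Var(g):
    m = E(g)
    return E(lambda x: (g(x) - m) ** 2)

a, b = 3, 10
X = lambda x: x
assert E(lambda x: a * X(x) + b) == a * E(X) + b       # E(aX + b) = aE(X) + b
assert Var(lambda x: a * X(x) + b) == a ** 2 * Var(X)  # Var(aX + b) = a^2 Var(X)
assert Var(lambda x: b) == 0                           # Var(b) = 0
```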


25

Examples of Random Variables

• We are interested in those r.v.s that permit us to model system behavior based on the present state alone.
• We need to:
  • count arrivals in a time interval
  • count the number of times we need to repeat something to succeed
  • count the number of successes and failures
  • measure the time between consecutive arrivals
• The trick is to map our performance questions into the above four types of experiments.

26

Bernoulli Random Variable

• Let A be an event related to the outcomes of a random experiment.
• X = 1 if A occurs and 0 otherwise.
• This is the Bernoulli r.v.; it has two possible outcomes: success (1) or failure (0)

P_X(0) = P(X = 0) = q = 1 − p
P_X(1) = P(X = 1) = p = 1 − q

We use it as a building block for other types of counting.

27

Geometric Random Variable

• Used to count the number of attempts needed to succeed at doing something.
• Example: How many times do we have to transmit a packet over a broadcast radio channel before it is sent w/o interference?

[Timeline: failure, failure, failure, success! Assume each attempt is independent of any prior attempt and that each attempt has the same probability of success (p)]


28

Geometric Random Variable

• We want to count the number k of trials needed for the first success in a sequence of Bernoulli trials!

p = probability of success in a trial
p_X(k) = P(X = k) = (1 − p)^(k−1) p,  k = 1, 2, ...

Why this is the case is a direct consequence of assuming independent Bernoulli trials, each with the same probability of success: k − 1 failures are needed before the successful trial.

Memoryless property: The probability of needing k additional trials after having experienced n failures is the same as the probability of needing k trials to begin with:

P(X = n + k | X > n) = P(X = k)
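Both the pmf and the memoryless property can be checked with exact arithmetic. A sketch (assumed success probability p = 1/3, a hypothetical value not from the slides):

```python
from fractions import Fraction

# Hypothetical numeric check (not from the slides) of the geometric pmf and
# the memoryless property, with success probability p = 1/3.
p = Fraction(1, 3)

def pmf(k):
    # p_X(k) = (1 - p)^(k-1) * p, k = 1, 2, ...
    return (1 - p) ** (k - 1) * p

def P_greater(n):
    # P(X > n) = (1 - p)^n: the first n trials all fail.
    return (1 - p) ** n

n, k = 4, 3
# Memoryless: P(X = n + k | X > n) = P(X = k)
assert pmf(n + k) / P_greater(n) == pmf(k)
```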

29

Binomial Random Variable

• X denotes the number of times success occurs in n independent Bernoulli trials.
• Consider the specific outcome s ∈ S such that s = (1, 1, ..., 1, 0, 0, ..., 0), where 1 = success; we have k 1's and n − k 0's.
• Let A_i = success in trial i; then s = A_1 ∩ A_2 ∩ ... ∩ A_k ∩ A^c_{k+1} ∩ ... ∩ A^c_n, with P(A_i) = p and P(A^c_i) = 1 − P(A_i) = 1 − p = q.
• Because the trials are independent:

P(s) = P(A_1) P(A_2) ... P(A_k) P(A^c_{k+1}) ... P(A^c_n) = p^k (1 − p)^(n−k)

30

Binomial Random Variable

P(s) = P(A_1) P(A_2) ... P(A_k) P(A^c_{k+1}) ... P(A^c_n) = p^k (1 − p)^(n−k)

There are (n choose k) = n! / (k!(n − k)!) different ways to choose positions for the k 1's in the n slots. Because each such outcome is mutually exclusive of the others:

P(k) = (n choose k) p^k (1 − p)^(n−k) = [n! / (k!(n − k)!)] p^k (1 − p)^(n−k),  0 ≤ k ≤ n
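The pmf above can be evaluated directly and checked against the axioms and the moment formulas. A sketch (assumed n = 10, p = 1/4, hypothetical values not from the slides):

```python
from fractions import Fraction
from math import comb

# Hypothetical numeric check (not from the slides): binomial pmf for
# n = 10 trials with success probability p = 1/4.
n, p = 10, Fraction(1, 4)

def pmf(k):
    # P(k) = C(n, k) p^k (1 - p)^(n - k)
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

# The pmf is a probability law: it sums to 1 over 0 <= k <= n.
assert sum(pmf(k) for k in range(n + 1)) == 1

# Mean and variance match E(X) = np and Var(X) = npq.
mean = sum(k * pmf(k) for k in range(n + 1))
var = sum(k ** 2 * pmf(k) for k in range(n + 1)) - mean ** 2
assert mean == n * p
assert var == n * p * (1 - p)
```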


31

Poisson Random Variable

• We use this r.v. in cases where we need to count event occurrences in a time period.
• An event occurrence will typically be a packet arrival.
• Arrivals are assumed to occur at random over a time interval.
• The time interval is (0, t].
• We divide the time interval into n small subintervals of length Δt = t/n.
• The probability of a new arrival in a given subinterval is defined to be λΔt.
• λ is constant, independent of the subinterval.

32

Poisson Random Variable

A sequence of n independent Bernoulli trials, with X being the number of arrivals in (0, t].

Make n → ∞ and Δt → 0: the probability of more than one arrival in a subinterval is negligible compared to the probability of having 0 or 1 arrivals.

By assumption, whether or not an event occurs in a subinterval is independent of the outcomes in other subintervals. We have:

[Figure: the interval (0, t] divided into n subintervals of length Δt, with arrivals marked 1, 2, 3, ..., k]

33

Poisson Random Variable

P(k) = (n choose k) p^k (1 − p)^(n−k);  k is the number of arrivals

p = P{arrival in Δt = t/n} = λΔt = λt/n

Make n → ∞ and Δt → 0; then:

P_k = P{k arrivals in [0, t]} = P(X = k) = [n! / (k!(n − k)!)] (λt/n)^k (1 − λt/n)^(n−k)

Rearrange to get:

P_k = [n(n − 1)(n − 2)...(n − k + 1) / n^k] [(λt)^k / k!] (1 − λt/n)^n (1 − λt/n)^(−k)

and also use lim_{n→∞} (1 + x/n)^n = e^x, with x = −λt.
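The limit argument can be watched numerically: the binomial probability with p = λt/n approaches the Poisson value as n grows. A sketch (assumed λ = 2.0, t = 1.5, k = 4, hypothetical values not from the slides):

```python
from math import comb, exp, factorial

# Hypothetical numeric illustration (not from the slides) of the limit: a
# binomial with n trials and p = λt/n approaches the Poisson pmf as n grows.
lam, t, k = 2.0, 1.5, 4   # rate λ, interval length t, arrival count k

def binomial_pk(n):
    p = lam * t / n
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

poisson_pk = (lam * t) ** k / factorial(k) * exp(-lam * t)

for n in (10, 100, 10000):
    print(n, abs(binomial_pk(n) - poisson_pk))  # the gap shrinks with n
```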


34

Poisson Random Variable

• We see that the Poisson r.v. is the result of an approximation of i.i.d. arrivals in a time interval.
• We will say "arrivals are Poisson" meaning we can use the above formulas to describe packet arrivals.
• The probability of 0 arrivals in [0, t] plays a key role in our treatment of interarrival times.

P_k = P{k arrivals in (0, t]} = [(λt)^k / k!] e^(−λt)

P_0 = P{0 arrivals in (0, t]} = e^(−λt)
P{some arrivals in (0, t]} = 1 − P_0 = 1 − e^(−λt)

35

Important Properties of Poisson Arrivals

• The aggregation of Poisson sources is a Poisson source.
• If packets from a Poisson source are routed such that a path is chosen independently with probability p, that stream is also Poisson, with rate p times the original rate.
• It turns out that the time of a given Poisson arrival is uniformly distributed in a time interval.
• The parameter λ is called the arrival rate.

36

Exponential Random Variable

• Consider the Poisson r.v.; then P_0 = e^(−λt)
• Let Y be the time interval from a time origin (chosen arbitrarily) to the first arrival after that origin.
• Let X(τ) = number of arrivals in (0, τ]; then

P(Y > τ) = P(X(τ) = 0) = e^(−λτ)

• We can now obtain a c.d.f. for Y as follows:

F_Y(τ) = P(Y ≤ τ) = 1 − P(Y > τ) = 1 − e^(−λτ),  τ ≥ 0

and the p.d.f. of Y is then

f_Y(τ) = d/dτ P(Y ≤ τ) = λ e^(−λτ),  τ ≥ 0
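The cdf/pdf pair can be sanity-checked numerically: the pdf should match a finite-difference derivative of the cdf. A sketch (assumed rate λ = 2.0, a hypothetical value not from the slides):

```python
from math import exp

# Hypothetical numeric check (not from the slides): the exponential cdf
# F_Y(τ) = 1 - e^(-λτ) and pdf f_Y(τ) = λ e^(-λτ), for an assumed λ = 2.0.
lam = 2.0

def cdf(tau):
    return 1.0 - exp(-lam * tau)

def pdf(tau):
    return lam * exp(-lam * tau)

# The pdf is (numerically) the derivative of the cdf.
tau, h = 0.7, 1e-6
numeric_derivative = (cdf(tau + h) - cdf(tau - h)) / (2 * h)
assert abs(numeric_derivative - pdf(tau)) < 1e-6
assert cdf(0.0) == 0.0
```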


37

Memoryless Property of Exponential Random Variable

• After waiting h sec. for the first arrival, the probability that it occurs more than t sec. later equals the probability that the first arrival occurs after t sec.
• Knowing that we have waited any amount of time does not improve our knowledge of how much longer we'll have to wait!

P(Y > t + h | Y > h) = P(Y > t)

[Timeline sketch: the question "time > t?" is the same no matter how long we have already waited]
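The property follows directly from P(Y > τ) = e^(−λτ). A sketch checking it numerically (assumed λ = 1.5, t = 2.0, h = 0.8, hypothetical values not from the slides):

```python
from math import exp, isclose

# Hypothetical check (not from the slides) of the memoryless property for an
# exponential r.v. with rate λ = 1.5.
lam = 1.5

def survival(tau):
    # P(Y > τ) = e^(-λτ)
    return exp(-lam * tau)

t, h = 2.0, 0.8
# P(Y > t + h | Y > h) = P(Y > t + h) / P(Y > h) = P(Y > t)
conditional = survival(t + h) / survival(h)
assert isclose(conditional, survival(t))
```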

38

Mean and Variance Results

You have to memorize these! [In theory, you should be able to derive any of them.]

Exponential:  E(Y) = 1/λ;  σ^2 = 1/λ^2
Poisson:      E(X) = σ^2 = λt
Geometric:    E(X) = 1/p;  σ^2 = q/p^2
Binomial:     E(X) = np;   σ^2 = npq
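Two of these tabulated results can be spot-checked by summing the pmfs directly. A sketch (assumed parameters p = 0.25 and λt = 3, hypothetical values not from the slides):

```python
from math import exp, factorial, isclose

# Hypothetical numeric spot-check (not from the slides) of two tabulated
# results, by summing the pmfs directly.

# Geometric with p = 0.25: E(X) = 1/p, Var(X) = q/p^2.
p, q = 0.25, 0.75
terms = range(1, 2000)
mean_g = sum(k * q ** (k - 1) * p for k in terms)
var_g = sum(k ** 2 * q ** (k - 1) * p for k in terms) - mean_g ** 2
assert isclose(mean_g, 1 / p, rel_tol=1e-9)
assert isclose(var_g, q / p ** 2, rel_tol=1e-9)

# Poisson with λt = 3: E(X) = Var(X) = λt.
lt = 3.0
pmf = [lt ** k / factorial(k) * exp(-lt) for k in range(120)]
mean_p = sum(k * pk for k, pk in enumerate(pmf))
var_p = sum(k ** 2 * pk for k, pk in enumerate(pmf)) - mean_p ** 2
assert isclose(mean_p, lt, rel_tol=1e-6)
assert isclose(var_p, lt, rel_tol=1e-6)
```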

39

Random Processes

• Definition: A random process is a family of random variables {X(t) | t ∈ T} defined on a given sample space and indexed by a parameter t, where t varies over an index set T.

[Figure: sample-path values x plotted at times t1, t2, t3, t4, t5; each value is a state of X(t); X(t5) is a r.v.; the set of possible values is the state space of the random process]


40

Counting Process

• A counting process N(t) starts at some time 0 and counts the occurrences of events.
• Events are called arrivals.
• Every sample function of N(t) equals 0 for all t ≤ 0.
• The number of arrivals up to any time t > 0 is an integer that cannot decrease with time.

Definition: A random process N(t) is a counting process if for every sample function, n(t, s) = 0 for t < 0 and n(t, s) is integer-valued and non-decreasing with time.

Counting Process

• Jumps in the sample function of N(t) mark the arrivals.
• The number of arrivals in (ti, tj] is N(tj) − N(ti).
• A simple counting process is derived from a Bernoulli process (a sequence of i.i.d. Bernoulli random variables).

42

Counting Process

• Consider a Bernoulli process (a sequence of i.i.d. Bernoulli random variables) with a small time step of size Δt seconds.
• There is an arrival in interval (nΔt, (n + 1)Δt) if and only if X_n = 1.
• For an average rate λ > 0 arrivals/second, we can choose Δt such that λΔt << 1.
• We let the success probability of X_n be equal to λΔt.
• The number of arrivals N_m before time T = mΔt (i.e., in the time interval [0, mΔt)) has a binomial PMF (with λΔt = λT/m):

P_Nm(k) = P[N_m = k] = (m choose k) (λT/m)^k (1 − λT/m)^(m−k)   (1)

• We have shown that, as m tends to infinity (or Δt tends to 0), Eq. (1) becomes the Poisson random variable with the PMF:

P_N(T)(k) = [(λT)^k / k!] e^(−λT),  k = 0, 1, 2, ...;  0 otherwise


43

Counting Process

IMPORTANT:

• We can generalize our argument to any interval (ti, tj].
• The number of arrivals would have a Poisson PMF with T = tj − ti.
• The number of arrivals in T depends only on the Bernoulli trials corresponding to that interval.
• The numbers of arrivals in non-overlapping intervals are independent.
• In the limit, as Δt tends to 0, the number of arrivals in any time interval is a Poisson random variable independent of the arrivals in any other interval.
• This defines the Poisson random process!

Poisson Process

Definition: A counting process N(t) is a Poisson process of rate λ if

(a) The number of arrivals in any interval (ti, tj], N(tj) − N(ti), is a Poisson random variable with expected value λ(tj − ti).

(b) For any pair of non-overlapping time intervals (ti, tj] and (tp, tq], the numbers of arrivals in (ti, tj] and (tp, tq] are independent random variables.

• λ is called the rate of the Poisson process, because the expected number of arrivals per unit time is E[N(t)] / t = λ.

45

Poisson Process

• We assume that events occur at random instants of time at an average rate of λ events per second.
• Let N(t) be the number of event occurrences (e.g., packet arrivals) in (0, t].
• N(t) is a non-decreasing, integer-valued, continuous-time random process.

[Figure: a staircase sample path of N(t) stepping to 1, 2, 3, 4, 5 at arrival instants a1, ..., a5; N(t1) and N(t2) are each random variables]

We have a Poisson random variable for each time instant!


46

Poisson Process

• As we did for the case of the Poisson r.v., we divide the time axis into very small subintervals, such that
  • The probability of more than one arrival in a subinterval is much smaller than the probability of 0 or 1 arrivals in the subinterval.
  • Whether or not an arrival occurs in a subinterval is independent of what takes place in any other subinterval.
• We end up with the same binomial counting process, and with the length of the subintervals going to 0 we can approximate:

P[N(t) = k] = P{k arrivals in (0, t]} = [(λt)^k / k!] e^(−λt),  k = 0, 1, ...

Again, interarrival times are exponentially distributed with parameter λ.

47

Poisson Process

• Why do we say that arrivals occur "at random" with a Poisson process?
• Suppose that we know that only one arrival occurs in [0, t], and x is the arrival time of that single arrival.
• Let N(x) be the number of arrivals up to time x, 0 ≤ x ≤ t.
• Then, since N(t) − N(x) is the increment of arrivals in (x, t]:

P[X ≤ x] = P{N(x) = 1 | N(t) = 1} = P{N(x) = 1 and N(t) − N(x) = 0} / P{N(t) = 1}

48

Poisson Process

• Arrivals in different intervals are independent. Therefore:

P[X ≤ x] = P{N(x) = 1} P{N(t) − N(x) = 0} / P{N(t) = 1}
         = [(λx) e^(−λx)] [e^(−λ(t−x))] / [(λt) e^(−λt)] = x / t

• The above probability corresponds to the uniform distribution! In the average case, a packet arrives in the middle of a fixed time interval with Poisson arrivals.
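This conditional uniformity can be seen in simulation: generate Poisson arrivals from exponential interarrival times, keep only runs with exactly one arrival in [0, t], and look at where that arrival falls. A sketch (assumed λ = 1.0 and t = 2.0, hypothetical values not from the slides):

```python
import random

# Hypothetical simulation (not from the slides): generate Poisson arrivals on
# [0, t] from exponential interarrival times, keep the runs with exactly one
# arrival, and check that its arrival time looks uniform on [0, t].
random.seed(1)
lam, t, runs = 1.0, 2.0, 100_000

single_arrival_times = []
for _ in range(runs):
    arrivals, clock = [], 0.0
    while True:
        clock += random.expovariate(lam)  # exponential interarrival time
        if clock > t:
            break
        arrivals.append(clock)
    if len(arrivals) == 1:
        single_arrival_times.append(arrivals[0])

mean_time = sum(single_arrival_times) / len(single_arrival_times)
assert abs(mean_time - t / 2) < 0.05  # sample mean near t/2, as uniformity predicts
```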