Probability & Stochastic Processes: Introduction to Probability

SLIDE 1

Probability & Stochastic Processes

Introduction to Probability Theory
Sample Spaces
Event Spaces
Probability Measure
Probability Functions
Random Variables
Moments of Random Variables
Introduction to Stochastic Processes

Dr Conor McArdle EE414 - Probability & Stochastic Processes 1/60

SLIDE 2

Introduction to Probability Theory

Probability theory is concerned with the description and calculation of the properties of random phenomena, as occur in games of chance, computer and telecommunications systems, financial markets, electronic and optical circuits and many other random systems. Although such systems are random, in the sense that it is difficult or impossible to predict exactly how the system will behave in the future, probability theory can provide a characterisation of the type of randomness involved and yield useful measures, such as average values of system parameters or the likelihood of certain events occurring in the future.

To develop a rigorous mathematical theory of probability, the starting point is the notion of a random experiment and an abstract probability space. A random experiment E is an experiment satisfying the following conditions:

all possible distinct outcomes are known a priori
the outcome is not known a priori for any particular trial of the experiment
the experiment is repeatable under identical conditions

SLIDE 3

Introduction to Probability Theory

Many random phenomena can be modelled by the notion of a random experiment, for example:

Recording the output voltage of a noise generator
Observing the daily closing price of crude oil
Measuring the number of packets queueing at the input port of a network router

Each different random experiment E defines its own particular sample space, event space and probability measure, which collectively form an abstract probability space for the random experiment. A probability space is the collection (Ω, F, P) where:

Ω, the sample space, is the set of all possible outcomes of a random experiment E
F, the event space, is a collection of events, where each event is a subset of the sample space and the collection forms a σ-field
P, the probability measure, is an assignment of a real number in the interval [0,1] to each event in the event space.

SLIDE 4

Introduction to Probability Theory

Example: Random experiment of tossing a fair coin

Sample Space Ω = {H, T}, Event Space F = {{H}, {T}, {H, T}, {}}
Probability Measure P defined by P({H}) = 1/2, P({T}) = 1/2, P({H, T}) = 1, P({}) = 0

Considering this example, where the sample space is discrete (countable), it may appear unnecessary to define events to which probabilities are assigned. Why not simply assign probabilities directly to outcomes in the sample space? Consider instead an experiment where a random selection of a real number between 0 and 10 is made (an uncountable sample space); then the probability of any particular outcome must be zero, since there is an infinity of such outcomes in the sample space.

However, if events are defined as intervals of the real line (e.g. [0,5]), the events can have non-zero probability values (e.g. the probability of an outcome occurring within the interval [0,5] will be non-zero).
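The discrete coin-toss space above can be sketched directly in code. This is a minimal illustration under my own naming choices (the helper `power_set` and the measure function `P` are not from the notes):

```python
from itertools import chain, combinations

# A minimal sketch of the fair-coin probability space (Omega, F, P).
# All names and the helper power_set are my own, for illustration.
Omega = frozenset({"H", "T"})

def power_set(s):
    """All subsets of s; for a countable space this is a valid event space."""
    s = list(s)
    return [frozenset(c) for c in
            chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))]

F = power_set(Omega)          # {}, {H}, {T}, {H, T}

def P(event):
    """Probability measure: each outcome carries mass 1/2."""
    return len(event) / 2

assert P(frozenset()) == 0    # P({}) = 0
assert P(frozenset({"H"})) == 0.5
assert P(Omega) == 1          # P({H, T}) = 1
```

Note that `P` is a set function: it takes events (subsets), not individual outcomes, matching the distinction drawn above.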

SLIDE 5

Introduction to Probability Theory

So that we can form a useful theory for all random experiments (particularly those with uncountable sample spaces), the probability measure is only defined on specified subsets of the sample space (the events) rather than on individual outcomes in the sample space. Note that this stipulation does not preclude us from defining events consisting of a single outcome, but we draw the distinction between an outcome ω ∈ Ω (an element of Ω) and an event {ω} ⊂ Ω (a subset of Ω).

The definition of the event space as a σ-field further specifies which subsets of Ω can belong to the same event space. That is, there is a certain relationship between the subsets of the sample space Ω that are chosen as events in the event space. The properties of a σ-field (and so of any event space) ensure that if events A and B have probabilities defined, then logical combinations of these events (e.g. the outcome is in either A or B) are also events in the event space and so also have probabilities defined. Any subset of Ω that does not belong to the event space of a random experiment will simply not have a defined probability.

We next look at the sample space, event space and probability measure in some detail.

SLIDE 6

Probability & Stochastic Processes

Introduction to Probability Theory
Sample Spaces
Event Spaces
Probability Measure
Probability Functions
Random Variables
Moments of Random Variables
Introduction to Stochastic Processes

SLIDE 7

Sample Spaces

A sample space Ω is the non-empty set of all outcomes (also known as sample points, elementary outcomes or elementary events) of a random experiment E. The sample space takes different forms depending on the random experiment in question. We have seen an example of a finite sample space {H, T}, in the case of the coin-tossing random experiment, and also an uncountable sample space (an interval of the real line [0, 10]) in the case of the random number experiment. What follows are some examples of more general sample spaces:

Example 1
A finite sample space Ω = {ak : k = 1, 2, ..., K}. Specific examples are:

A binary space {0, 1}
A finite space of integers {0, 1, 2, ..., k − 1} (also denoted Zk).

SLIDE 8

Sample Spaces

Example 2
A countably infinite space Ω = {ak : k = 1, 2, ...}. Specific examples are:

All non-negative integers {0, 1, 2, ...}, denoted Z+
All integers {..., −2, −1, 0, 1, 2, ...}, denoted Z

Example 3
An uncountably infinite space. Examples are the real line R or intervals of R such as (a, b), [a, b), (a, b], [a, ∞), (−∞, ∞).

Example 4
A space consisting of k-dimensional vectors with coordinates taking values in one of the previously described spaces. The usual name for such a vector space is a product space. For example, let A denote one of the abstract spaces previously considered. Define the cartesian product Ak as:

Ak = {(a0, a1, ..., ak−1) : ai ∈ A}

SLIDE 9

Sample Spaces

Specific examples of this type of space are: Rk, {0, 1}k, [a, b]k

Example 5
Let A be one of the sample spaces in examples 1-3. Form a new sample space consisting of all waveforms (or functions of time) with values in A (e.g. all real-valued time functions). This space is a product space of infinite dimension. For example:

At = {all waveforms {x(t) : t ∈ [0, ∞)} : x(t) ∈ A, ∀t}

Exercise 1
Specify appropriate sample spaces that model the outcomes of the following random systems: (i) tossing a coin where a head is assigned a value of 1 and a tail a value of 0 (ii) rolling a die (iii) rolling three dice simultaneously (iv) choosing a random coordinate within a cube (v) an infinite random binary waveform.

SLIDE 10

Probability & Stochastic Processes

Introduction to Probability Theory
Sample Spaces
Event Spaces
Probability Measure
Probability Functions
Random Variables
Moments of Random Variables
Introduction to Stochastic Processes

SLIDE 11

Event Spaces

The event space F of a sample space Ω is a non-empty collection of subsets of Ω, which has the following properties:

1 If F ∈ F then also F^c ∈ F
2 If, for some finite n, Fi ∈ F, i = 1, 2, ..., n, then also ⋃_{i=1}^{n} Fi ∈ F
3 If Fi ∈ F, i = 1, 2, ..., then also ⋃_{i=1}^{∞} Fi ∈ F

These properties specify that an event space is a σ-field (or σ-algebra) over Ω. Note that the definition of the σ-field, as above, specifies only that the collection be closed under complementation and countable unions. However, these requirements immediately yield additional closure properties. The countably infinite version of De Morgan's Laws of elementary set theory requires that if Fi, i = 1, 2, ... are all members of a σ-field then so is:

⋂_{i=1}^{∞} Fi = ( ⋃_{i=1}^{∞} Fi^c )^c

SLIDE 12

Event Spaces

Thus the σ-field properties imply that the collection of events in an event space is closed under all set-theoretic operations (union, intersection, complementation, difference, etc.) so that performing set operations on events must result in other events inside the event space. This closure requirement ensures that if we know the probability of an event A occurring and the probability of an event B occurring, then we can also find the probability of logical combinations, such as the probability of both A and B occurring (intersection of events), the probability of either A or B occurring (union of events), etc.

It follows by similar set-theoretic arguments that any countable sequence of any of the set-theoretic operations (union, intersection, complementation, difference, symmetric difference, etc.) performed on events in an event space must yield other events in the event space. We next turn to the question of how such event spaces may be constructed.

SLIDE 13

Event Spaces: The Power Set P

Given a countable sample space Ω, the collection of all subsets of Ω is a σ-field (and thus a valid event space). This is true since any countable sequence of set-theoretic operations on subsets of Ω must yield another subset of Ω. Such a collection of all possible subsets of a sample space is called the Power Set P of the space.

The power set is the largest possible event space since it contains all subsets of Ω. Note that a finite sample space with n elements has a power set with 2^n elements. For example, the power set of the binary sample space Ω = {0, 1} is P = {{0}, {1}, {0, 1}, Ø} with 2^2 = 4 elements.

SLIDE 14

Event Spaces: σ-Fields Generated by a Family of Events

Although the power set of the sample space automatically yields a valid event space, it is possible to find a smaller event space, given some set of events of interest. For example, consider the experiment of tossing two coins together in a game where we are only interested in the event of tossing one head and one tail. Denoting a head as 1 and a tail as 0, the appropriate sample space is:

Ω = {0, 1}² = {(0, 0), (0, 1), (1, 0), (1, 1)}

The event space for the experiment can be defined as the power set of Ω:

P = {{(0, 0)}, {(0, 1)}, {(1, 0)}, {(1, 1)}, {(0, 0), (0, 1)}, {(0, 0), (1, 0)}, {(0, 0), (1, 1)}, {(0, 1), (1, 0)}, {(0, 1), (1, 1)}, {(1, 0), (1, 1)}, {(0, 0), (0, 1), (1, 0)}, {(0, 0), (0, 1), (1, 1)}, {(0, 0), (1, 0), (1, 1)}, {(0, 1), (1, 0), (1, 1)}, Ω, Ø}

Can we find a smaller event space for this random experiment containing the event of interest A = {(0, 1), (1, 0)}?

SLIDE 15

Event Spaces: σ-Fields Generated by a Family of Events

We can in fact generate the smallest event space (σ-field) G that contains A. For our example, if we start with the event of interest A = {(0, 1), (1, 0)} and apply the rules of the σ-field (all complements and countable unions are also in the field) iteratively, we arrive at the event space:

G = {A, A^c, A ∪ A^c, A ∩ A^c} = {{(0, 1), (1, 0)}, {(0, 0), (1, 1)}, {(0, 1), (1, 0), (0, 0), (1, 1)}, Ø}

We note that in this instance the chosen family of events of interest consisted of a single event A. In general, the family may contain many events. To give a more precise definition of a generated field we say that, given a family of events A of interest, we may find the σ-field G generated by A by taking the intersection of all σ-fields on Ω that contain A, that is:

G = ⋂ {F : F is a σ-field with A ⊂ F}

By this definition, G must be the smallest σ-field containing A.
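The iterative closure described above can be sketched in code. This illustrative helper (the function name and the iteration strategy are my own, and it applies only to finite sample spaces) closes a family of events under complement and pairwise union until nothing new appears:

```python
from itertools import combinations

def generate_sigma_field(omega, family):
    """Illustrative helper (name and strategy are mine): on a finite
    sample space, close a family of events under complement and pairwise
    union until nothing new appears, yielding the smallest sigma-field
    containing the family."""
    omega = frozenset(omega)
    field = {frozenset(e) for e in family} | {frozenset(), omega}
    changed = True
    while changed:
        changed = False
        current = list(field)
        for a in current:
            if omega - a not in field:       # close under complement
                field.add(omega - a)
                changed = True
        for a, b in combinations(current, 2):
            if a | b not in field:           # close under union
                field.add(a | b)
                changed = True
    return field

# The slide's example: A = {(0,1), (1,0)} inside Omega = {0,1}^2.
omega = {(0, 0), (0, 1), (1, 0), (1, 1)}
A = {(0, 1), (1, 0)}
G = generate_sigma_field(omega, [A])
assert G == {frozenset(), frozenset(omega), frozenset(A),
             frozenset({(0, 0), (1, 1)})}
```

On a finite space, closure under complement and pairwise union suffices, since intersections follow by De Morgan's Laws.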

SLIDE 16

Event Spaces

Exercise 2
What is the power set of Ω = {1, 2, 3, 4}? Given Ω = {1, 2, 3, 4}, find the σ-field (event space) generated by the family of events A = {{1}, {3, 4}}.

Although the notion of a generated σ-field has been introduced in the context of a countable sample space, it is more usual to take the power set as the de facto event space for countable sample spaces. Generated fields are most useful when defining event spaces on uncountable sample spaces (for example the real line). In the uncountable case, a mathematical technicality arises with some subsets of the sample space (i.e. some elements of the power set). There can exist some subsets which, due to their complicated structure, cannot be assigned a meaningful probability measure and thus are not valid events. The approach, instead, is to start with a set of simple subsets of the sample space which are known to be measurable and generate a measurable event space from these. This leads us to the notion of a Borel field.

SLIDE 17

Event Spaces: The Borel Field B

Consider the problem of defining an event space on the real line R. Given a family of events S = {(−∞, x] : x ∈ R}, we may generate from these events a σ-field B(R), called the Borel Field on R. Although this set of subsets of the real line, B(R), is a smaller set than the power set of the real line, it is large enough not to restrict a useful theory of probability for real sample spaces.

We note that any such family of intervals (e.g. S′ = {(y, ∞) : y ∈ R}) will generate the same Borel Field. To illustrate this point, consider the intervals (a, ∞) ∈ S′ and (−∞, ∞) ∈ S′; then the set (−∞, ∞) − (a, ∞) = (−∞, a], in the generated σ-field, is also in S. Similar to our previous definition of a generated field, the Borel Field B may be concisely defined as the σ-algebra generated by the set of all intervals:

B = ⋂ {F : F is a σ-field containing all intervals}

SLIDE 18

Event Spaces: The Borel Field B

Ω = R is often a natural choice of sample space for many random systems and the Borel field B(R) on the real line is the usual choice of event space in this case. The structure of the Borel field, being generated from intervals, makes it easier to specify a probability measure on the set of events. By specifying probabilities on the intervals, we are assured that all events in the event space will have probabilities defined. We note that it is also possible to form a Borel field on a subset of the real line (e.g. R+). It is also possible to form a Borel field on real product spaces.

SLIDE 19

Probability & Stochastic Processes

Introduction to Probability Theory
Sample Spaces
Event Spaces
Probability Measure
Probability Functions
Random Variables
Moments of Random Variables
Introduction to Stochastic Processes

SLIDE 20

Probability Measure P

The probability measure P of a probability space (Ω, F, P) is a set function with domain F and range [0,1] which obeys the following axioms:

Axiom 1 P(F) ≥ 0 for all F ∈ F
Axiom 2 P(Ω) = 1
Axiom 3 If Fi ∈ F, i = 1, 2, ..., n are disjoint, then P(⋃_{i=1}^{n} Fi) = Σ_{i=1}^{n} P(Fi)
Axiom 4 If Fi ∈ F, i = 1, 2, ... are disjoint, then P(⋃_{i=1}^{∞} Fi) = Σ_{i=1}^{∞} P(Fi)

We can see a relationship between the definition of the event space and the definition of the probability measure. The structure of the event space ensures that any countable series of set operations on a set of events is also in the event space. The probability axioms ensure that, knowing the probability of the original set of events, the probability of the resulting set can be calculated.

SLIDE 21

Probability Measure P

Examples of useful properties of the probability measure that can be derived from these axioms:

(a) P(F^c) = 1 − P(F)
(b) P(F) ≤ 1
(c) P(Ø) = 0

Other concepts related to the probability measure are reviewed below.

Conditional Probability
Given a probability space (Ω, F, P) and two events A, B ∈ F, the conditional probability of A given B is defined by:

P(A|B) = P(A ∩ B) / P(B), P(B) ≠ 0

Conditional probability can be interpreted as forming a new probability space (Ω′ = B, F′ = {∀ (F ∩ B) : F ∈ F}, P′), where the new probability measure P′ is P normalised with respect to P(B).

SLIDE 22

Probability Measure P

Independence
Two events A and B are said to be independent if and only if:

P(A ∩ B) = P(A)P(B)

Note that, when P(A) ≠ 0 and P(B) ≠ 0, this condition implies that:

P(A|B) = P(A) and P(B|A) = P(B)

Law of Total Probability
Let {B1, ..., Bn} be events that form a partition of the sample space, that is:

⋃ {∀ Bi} = Ω and Bi ∩ Bj = Ø, ∀ i ≠ j

Then

P(A) = Σ_{i=1}^{n} P(A ∩ Bi) = Σ_{i=1}^{n} P(A|Bi)P(Bi)
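The Law of Total Probability can be checked on a small worked example. The fair-die setup and all names below are my own illustrative choices, not from the notes:

```python
from fractions import Fraction

# Illustrative check: partition the sample space {1..6} of a fair die
# into odd and even outcomes and verify, for A = {1, 2}, that
# P(A) = sum_i P(A ∩ Bi) = sum_i P(A|Bi) P(Bi).
P = {k: Fraction(1, 6) for k in range(1, 7)}

def prob(event):
    return sum(P[w] for w in event)

A = {1, 2}
partition = [{1, 3, 5}, {2, 4, 6}]           # Bi disjoint, union is Omega

total = sum(prob(A & B) for B in partition)             # Σ P(A ∩ Bi)
via_cond = sum(prob(A & B) / prob(B) * prob(B)          # Σ P(A|Bi) P(Bi)
               for B in partition)
assert total == via_cond == prob(A) == Fraction(1, 3)
```

Exact rational arithmetic (`Fraction`) keeps the identity an equality rather than a floating-point approximation.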

SLIDE 23

Probability & Stochastic Processes

Introduction to Probability Theory
Sample Spaces
Event Spaces
Probability Measure
Probability Functions
Random Variables
Moments of Random Variables
Introduction to Stochastic Processes

SLIDE 24

Probability Functions

We have seen an example of a simple random experiment (tossing a fair coin) where the value of the probability measure P can easily be specified explicitly for every event in the event space. For more complex probability spaces it is difficult to specify the set function P directly. The notion of a probability function becomes useful for specifying P in an indirect way.

Consider the probability space (Ω, F, P) where Ω is a countable space (e.g. {1,2,3,4,5,6} or Z) and F is the power set of Ω. Now consider a function p(ω) that assigns a real number to each sample point ω ∈ Ω such that:

p(ω) ≥ 0, for all ω ∈ Ω, and Σ_{ω∈Ω} p(ω) = 1

Now define the set function P as:

P(F) = Σ_{ω∈F} p(ω), for all F ∈ F

SLIDE 25

Probability Functions

This set function P is a valid probability measure for the probability space (Ω, F, P) as it satisfies the axioms and specifies a probability for all events in the event space F. A function p(ω), with the properties specified above, is called a probability mass function (pmf). It is a more easily specified point function from which the set function P is induced.

Examples of pmfs on finite sample spaces:

The Binary pmf: Ω = {0, 1}; p(0) = 1 − ρ, p(1) = ρ, where ρ ∈ (0, 1) is a parameter.
The Uniform pmf: Ω = Zn = {0, 1, ..., n − 1} and p(k) = 1/n; k ∈ Zn
The Binomial pmf: Ω = Zn+1 = {0, 1, ..., n} and p(k) = (n choose k) ρ^k (1 − ρ)^(n−k); k ∈ Zn+1, where (n choose k) = n! / (k!(n − k)!) is the binomial coefficient.

SLIDE 26

Probability Functions

Common examples of pmfs on countably infinite sample spaces are:

The Geometric pmf: Ω = {1, 2, 3, ...} and p(k) = (1 − ρ)^(k−1) ρ; k = 1, 2, 3, ..., where ρ ∈ (0, 1) is a parameter.
The Poisson pmf: Ω = Z+ = {0, 1, 2, ...} and p(k) = λ^k e^(−λ) / k!, where λ is a parameter in (0, ∞).

Exercise 3
Show that the function p(k) = (1 − ρ)^(k−1) ρ; k = 1, 2, 3, ..., where ρ ∈ (0, 1) is a parameter, satisfies the properties of a probability mass function (pmf). Given a probability space (Ω = {1, 2, 3, ...}, P(Ω), P) where P is induced by the pmf p(k), what is the probability of the event F = {1, 2, 3, 4}?
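The geometric pmf can be checked numerically. This is a sketch under my own choices (the parameter value ρ = 0.3, truncation point and tolerances are mine); the closed form 1 − (1 − ρ)^4 used below is the standard geometric-series partial sum:

```python
# Numerical sketch: the geometric pmf p(k) = (1 - rho)^(k-1) * rho
# should sum to 1 over {1, 2, 3, ...}, and the probability of
# F = {1, 2, 3, 4} should match the closed form 1 - (1 - rho)^4.
rho = 0.3

def p(k):
    return (1 - rho) ** (k - 1) * rho

# Partial sum to k = 200; the remaining tail is negligible for rho = 0.3.
assert abs(sum(p(k) for k in range(1, 201)) - 1.0) < 1e-12

pF = sum(p(k) for k in [1, 2, 3, 4])
assert abs(pF - (1 - (1 - rho) ** 4)) < 1e-12
```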

SLIDE 27

Probability Functions

In the case of a probability space (Ω, F, P) with an uncountably infinite sample space (e.g. R), can we make a similar simplification to the specification of the probability measure P? For example, considering the probability space (R, B(R), P), can we find a function that induces P?

Consider the real-valued function f satisfying:

f(r) ≥ 0, for all r ∈ R, and ∫_R f(r) dr = 1

Now define the set function P as:

P(F) = ∫_F f(r) dr, F ∈ B(R)

We now have an expression for the probability measure P, a difficult-to-specify set function, in terms of a more easily specified point function f(r). The function f, as defined above, is called a probability density function or pdf.

SLIDE 28

Probability Functions

Like a pmf, a pdf is defined only for points in Ω and not for sets (events). The pmf relates to a countable sample space and is summed over all points in an event to produce its probability. The pdf relates to an uncountable sample space and is integrated over all points in an event to produce its probability. The pdf of a given probability measure does not always exist. If it does exist, then it is unique. We will discuss probability measures further in the next section on random variables.

Some common examples of pdfs are:

The Uniform pdf: Given b > a, f(r) = 1/(b − a), for r ∈ [a, b]
The Exponential pdf: f(r) = λ e^(−λr); r ∈ [0, ∞); λ > 0
The Gaussian pdf: f(r) = (2πσ²)^(−1/2) e^(−(r−m)²/2σ²); r ∈ R
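The defining pdf properties can be checked numerically for a concrete case. The sketch below (the midpoint-rule helper, truncation point and tolerances are my own choices) uses the exponential pdf with λ = 2:

```python
import math

# Check the defining pdf properties for f(r) = 2 exp(-2r) on [0, inf).
lam = 2.0

def f(r):
    return lam * math.exp(-lam * r)

def integrate(g, a, b, n=100_000):
    """Midpoint-rule numerical integration (illustrative helper)."""
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

# Total mass is 1: truncate the integral at r = 20, where the
# remaining tail exp(-40) is negligible.
assert abs(integrate(f, 0.0, 20.0) - 1.0) < 1e-6

# Probability of the event [0, 1] is 1 - exp(-2).
assert abs(integrate(f, 0.0, 1.0) - (1.0 - math.exp(-2.0))) < 1e-6
```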

SLIDE 29

Probability Functions

Exercise 4
Show that the exponential function f(r) = 2e^(−2r); r ∈ [0, ∞) satisfies the properties of a probability density function. Given the probability space (R+, B(R+), P) where P is induced by the pdf f(r), find the probability of the event [0, 1].

SLIDE 30

Probability & Stochastic Processes

Introduction to Probability Theory
Sample Spaces
Event Spaces
Probability Measure
Probability Functions
Random Variables
Moments of Random Variables
Introduction to Stochastic Processes

SLIDE 31

Random Variables: Introduction

Consider our example random experiment of tossing two coins simultaneously. The probability space for the experiment is given as (Ω, F = P(Ω), P), where Ω = {(T, T), (T, H), (H, T), (H, H)} and P(Ω) is the power set of Ω. Suppose we are most interested in the probabilities of the number of heads turning up. Define a mapping X(ω) that maps the individual outcomes ω ∈ Ω to the number of heads occurring:

(T, T) → 0
(T, H) → 1
(H, T) → 1
(H, H) → 2

The function X(ω), mapping points in Ω to numerical values, is called a random variable. The notion of a random variable is generally useful in probability theory as it provides a means of presenting the randomness in a complex underlying abstract probability space simply as random numerical values.

SLIDE 32

Random Variables: Introduction

To continue our example, we can calculate the probabilities of X(ω) taking on different values by summing the probabilities of the original outcomes which cause X to take a particular value in its range, that is:

Pr(X = 0) = P({(T, T)}) = 1/4
Pr(X = 1) = P({(T, H)}) + P({(H, T)}) = 1/2
Pr(X = 2) = P({(H, H)}) = 1/4

We can also find the probabilities of combinations of values of the random variable. For example, the probability of X(ω) > 0 is:

Pr(X ∈ {1, 2}) = P({(T, H)}) + P({(H, T)}) + P({(H, H)}) = 3/4

It appears from this that the range of X has an associated event space of its own, with each event corresponding to an event (and thus a probability) in the original event space F. In fact X has an associated probability space (ΩX, FX, PX), where ΩX is the set of range values of X, FX is an event space over ΩX and PX gives the probabilities of events in FX. We note that PX is not arbitrarily defined but is determined by P in the underlying probability space.
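This push-forward of P through X can be computed mechanically. A small sketch (all names here are my own illustrative choices) that recovers the probabilities above:

```python
from collections import defaultdict
from fractions import Fraction

# Push the measure P of the two-coin space forward through
# X = "number of heads".
outcomes = [("T", "T"), ("T", "H"), ("H", "T"), ("H", "H")]
P = {w: Fraction(1, 4) for w in outcomes}   # equally likely outcomes

def X(w):
    return w.count("H")                     # number of heads in the outcome

pX = defaultdict(Fraction)
for w, pw in P.items():
    pX[X(w)] += pw     # sum the masses of outcomes mapping to the same value

assert pX[0] == Fraction(1, 4)
assert pX[1] == Fraction(1, 2)
assert pX[2] == Fraction(1, 4)
assert pX[1] + pX[2] == Fraction(3, 4)      # Pr(X > 0)
```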

SLIDE 33

Random Variables: Introduction

So, we can view a random variable X as being a mapping from the original probability space to an output probability space:

(Ω, F, P) → (ΩX, FX, PX)

under the condition that for every event in FX there must be a corresponding event in the original domain event space F. In other words, the inverse mapping of any event in the range event space of X must be an event in the original event space F. In the case of our example, we can see that this requirement holds (each arrow denotes the inverse image under X⁻¹):

{0} → {(T, T)}
{1} → {(T, H), (H, T)}
{2} → {(H, H)}
{0, 1} → {(T, T), (T, H), (H, T)}
{0, 2} → {(T, T), (H, H)}
{1, 2} → {(T, H), (H, T), (H, H)}
{0, 1, 2} → {(T, T), (T, H), (H, T), (H, H)}
{} → {}

SLIDE 34

Random Variables

Exercise 5
Consider the probability space (Ω, F, P) where Ω = {0, 1} and F = {Ø, Ω}. Is the function X(ω) = ω a valid random variable? Explain your answer.

We have thus far considered the case where the original sample space is discrete and so the random variable's range is also discrete. When the sample space is continuous, we have a continuous random variable X whose range is ΩX = R (or a subset of R). We have seen previously that a suitable event space for the real sample space is the Borel field over the reals, and so the range event space becomes FX = B(R); the probability measure on this range event space is denoted PX. This gives the real-valued random variable as the mapping:

(Ω, F, P) → (R, B(R), PX)

again with the requirement that the inverse mapping of all events B ∈ B(R) must be events in F. This leads us to the formal definition of a (real-valued) random variable.

SLIDE 35

Random Variable: Definition

Given a probability space (Ω, F, P), a real-valued random variable is a function X : Ω → R which satisfies the condition that, for each B ∈ B(R), the set X⁻¹(B) = {ω : X(ω) ∈ B} is an event in F.

We have noted earlier that the probability of an event in the range event space of the random variable must be the same as the probability of the inverse mapping of the event. Thus, given the probability measure of the original space P, the probability measure PX of the random variable can be derived, or in mathematical terms:

The probability measure PX(B), B ∈ B(R), of the real-valued random variable X is equal to P(X⁻¹(B)) = P({ω : X(ω) ∈ B}).

We now look at probability functions as they relate to random variables. As the range space (R, B(R), PX) is nothing other than a probability space, the concept of probability functions must also apply to this space. An interesting question is how a probability function defined on the original space determines the probability function in the range space.

SLIDE 36

Discrete Random Variables and Probability Functions

We have seen previously that, given a probability space (Ω, F, P) where Ω is discrete, we can more easily describe P in terms of a probability mass function p(ω), where:

p(ω) ≥ 0, for all ω ∈ Ω, and Σ_{ω∈Ω} p(ω) = 1

giving an expression for the probability measure in terms of the pmf p(ω) as:

P(F) = Σ_{ω∈F} p(ω), for all F ∈ F

In a similar way, for a discrete random variable, we can describe PX for the random variable X in terms of a pmf pX(x), x ∈ R, where pX(x) is derived from p(ω) as:

pX(x) = PX({x}) = P(X⁻¹({x})) = P({ω : X(ω) = x}) = Σ_{ω : X(ω)=x} p(ω)

SLIDE 37

Discrete Random Variables and Probability Functions

Example: Derived distribution of a discrete random variable
Let (Ω, F, P) be a discrete probability space with Ω = {1, 2, 3, ...}, F the power set of Ω and P the probability measure induced by the geometric pmf:

p(ω) = (1 − ρ)^(ω−1) ρ, ∀ ω ∈ Ω, where ρ ∈ (0, 1)

Define a random variable X on this space as:

X(ω) = 1 if ω even, 0 if ω odd

Thus we have a random variable X : {1, 2, 3, ...} → {0, 1}. Derive the pmf for the random variable X from p(ω).

SLIDE 38

Discrete Random Variables and Probability Functions

Solution

pX(x) = Σ_{ω : X(ω)=x} p(ω)

⇒ pX(1) = Σ_{ω even} p(ω) = Σ_{ω=2,4,...} (1 − ρ)^(ω−1) ρ
        = (ρ / (1 − ρ)) Σ_{ω=1}^{∞} ((1 − ρ)²)^ω
        = ρ(1 − ρ) Σ_{ω=0}^{∞} ((1 − ρ)²)^ω
        = ρ(1 − ρ) / (1 − (1 − ρ)²)
        = (1 − ρ) / (2 − ρ)

⇒ pX(0) = 1 − (1 − ρ)/(2 − ρ)
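The derivation can be sanity-checked numerically. In this sketch ρ = 0.4 is an arbitrary choice of mine; the sum over even ω should reproduce the closed form:

```python
# Numerical check: summing the geometric pmf (1 - rho)^(w-1) * rho over
# even w should reproduce the closed form (1 - rho)/(2 - rho); the tail
# beyond w = 398 is negligible for rho = 0.4.
rho = 0.4
pX1 = sum((1 - rho) ** (w - 1) * rho for w in range(2, 400, 2))

assert abs(pX1 - (1 - rho) / (2 - rho)) < 1e-12
# pX(0) is the complementary mass
assert abs((1 - pX1) - (1 - (1 - rho) / (2 - rho))) < 1e-12
```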

SLIDE 39

Continuous Random Variables and Probability Functions

Recall that a continuous random variable X is defined on a probability space (R, B(R), PX) and that the event space B(R) is generated from the set of intervals S = {(−∞, x] : x ∈ R}. If we specify the probability of all intervals in S then the probability of any event (any set combination of the intervals) can be determined. This prompts the definition of the cumulative distribution function:

The cumulative distribution function (cdf) FX of the random variable X is defined as FX(x) = PX((−∞, x]) = Pr(X ≤ x), for all x ∈ R.

Given the cdf of X, probabilities of any event can be determined, for example:

Pr(a < X ≤ b) = PX((−∞, b] − (−∞, a]) = FX(b) − FX(a), where a ≤ b

We note some properties of the cdf FX(x):

FX(−∞) = 0
FX(∞) = 1
FX is non-decreasing and continuous from the right
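As a small illustration (the function name and test values are my own), the earlier random-number experiment, uniform on [0, 10], has cdf F(x) = x/10 on [0, 10], and interval probabilities follow from differences of the cdf:

```python
# Cdf of the uniform distribution on [0, 10], clamped outside [0, 10].
def F(x):
    return min(max(x / 10.0, 0.0), 1.0)

# Pr(a < X <= b) = F(b) - F(a)
assert F(5) - F(0) == 0.5        # the interval (0, 5]; same probability
                                 # as [0, 5] here since Pr(X = 0) = 0
assert F(10) - F(2.5) == 0.75
assert F(-1) == 0.0 and F(11) == 1.0   # F(-inf) = 0, F(inf) = 1 behaviour
```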

SLIDE 40

Continuous Random Variables and Probability Functions

We have seen earlier that the probability measure P can also be expressed in terms of a probability density function (pdf) when the sample space is real-valued. Thus we also have the notion of a pdf of a random variable, that is, the pdf inducing PX. We define the probability density function (pdf) of a random variable X as the non-negative real-valued function fX(x), with a well-defined integral over the real line, such that:

PX(F) = ∫_F fX(x) dx, ∀ F ∈ B(R)

where fX has the properties:

fX(x) ≥ 0, ∀x, and ∫_R fX(x) dx = 1

We note the significance of the wording 'well-defined integral' in the above definition. Although the cdf always exists, the pdf may not.

SLIDE 41

Continuous Random Variables and Probability Functions

Considering events of the form (−∞, α], the pdf gives probabilities:

PX((−∞, α]) = ∫_{−∞}^{α} fX(x) dx, ∀ α ∈ R

We now have two ways of expressing the probability of an event of the form (−∞, α]: the cdf and the pdf. Thus they can be related as follows:

FX(α) = PX((−∞, α]) = ∫_{−∞}^{α} fX(x) dx; α ∈ R

and also

fX(α) = d FX(α) / dα; α ∈ R

Also note that:

Pr(a < X ≤ b) = FX(b) − FX(a) = ∫_{a}^{b} fX(x) dx
SLIDE 42

Continuous Random Variables and Probability Functions

We have previously derived the pmf of a discrete random variable from the pmf in the original (domain) probability space. Can we also derive the pdf of a continuous random variable X, given a pdf f for the original space?

(Ω = R, F = B(R), P) −X→ (ΩX = R, FX = B(R), PX)
f given, fX ?

Method:

FX(x) = Pr(X ≤ x) = P({r ∈ Ω : X(r) ≤ x}) = ∫_{r∈Ω : X(r)≤x} f(r) dr

Assuming we can find the limits of integration (which requires evaluating X⁻¹), the pdf of X may then be calculated as:

fX(x) = d/dx ∫_{r∈Ω : X(r)≤x} f(r) dr


SLIDE 43

Continuous Random Variables and Probability Functions

Example: Derived distribution of a continuous random variable

Consider the random variable X such that

(R, B(R), P) −X(r)=r²→ (R⁺, B(R⁺), PX)

Find the probability density function (pdf) that induces PX, given that P is induced by the uniform pdf on [0, 1] (that is, f(r) = 1, ∀r ∈ [0, 1], and is 0 otherwise).

Solution

First find the cdf FX of X:

FX(x) = Pr(X ≤ x) = P({r ∈ Ω : X(r) ≤ x})
      = Pr(r ∈ [0, x^(1/2)])
      = ∫_0^{x^(1/2)} f(r) dr = x^(1/2)

SLIDE 44

Continuous Random Variables and Probability Functions

Solution continued ...

Now finding the pdf:

fX(x) = d/dx FX(x) = d/dx x^(1/2)

fX(x) = (1/2) x^(−1/2), 0 ≤ x ≤ 1, and 0 otherwise

We may check our answer by verifying that fX(x) is a pdf:

∫_0^1 (1/2) x^(−1/2) dx = [x^(1/2)]_0^1 = 1   ok
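A quick Monte Carlo sanity check (an illustrative sketch, not from the notes, with a fixed seed for repeatability): sample U uniform on [0, 1], set X = U², and compare the empirical cdf with the derived FX(x) = x^(1/2):

```python
import random

random.seed(42)  # fixed seed so the check is repeatable
n = 200_000
samples = [random.random() ** 2 for _ in range(n)]  # X = U^2

def empirical_cdf(x):
    """Fraction of samples with X <= x; estimates F_X(x)."""
    return sum(1 for s in samples if s <= x) / n

# empirical_cdf(x) should be close to x ** 0.5 for x in [0, 1]
```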


SLIDE 45

Probability & Stochastic Processes

Introduction to Probability Theory Sample Spaces Event Spaces Probability Measure Probability Functions Random Variables Moments of Random Variables Introduction to Stochastic Processes


SLIDE 46

Moments of Random Variables: Expectation

The mathematical expectation of a discrete random variable X, denoted E[X], is the probability-weighted average of the values taken on by X. For example, if the range of a discrete random variable X is x1, x2, ..., xn and Pr(X = xi) = p(i) (p the pmf of X), then we weight each possible value xi by the corresponding probability p(i) and sum to obtain the expectation (or expected value) of X:

E[X] = Σ_{i=1}^{n} xi p(i)

We note that the expected value of a random variable X may also be referred to as the mean value of X or the first moment of the random variable X. The expected value is normally written in shorthand form as X̄.
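For instance (a standard illustration, not from the notes), the expectation of a fair six-sided die follows directly from this weighted sum:

```python
# pmf of a fair six-sided die: each face 1..6 has probability 1/6
pmf = {x: 1 / 6 for x in range(1, 7)}

# E[X] = sum over x of x * p(x)
expectation = sum(x * p for x, p in pmf.items())
# expectation is 3.5, the probability-weighted average of the faces
```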


SLIDE 47

Moments of Random Variables: Expectation

Example

Find the expected value of the discrete random variable X with range space Z⁺ and pmf given by pX(k) = (1 − ρ)ρᵏ, 0 ≤ ρ < 1.

Solution

E[X] = Σ_{k=0}^{∞} k pX(k) = (1 − ρ) Σ_{k=0}^{∞} k ρᵏ
     = (1 − ρ)ρ d/dρ Σ_{k=0}^{∞} ρᵏ
     = (1 − ρ)ρ d/dρ (1/(1 − ρ))
     = ρ/(1 − ρ)
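The closed form ρ/(1 − ρ) is easy to confirm numerically (a sketch with an arbitrary illustrative ρ = 0.6, truncating the infinite sum at a large k):

```python
rho = 0.6  # illustrative value, 0 <= rho < 1

# truncated version of E[X] = sum over k of k * (1 - rho) * rho**k
mean = sum(k * (1 - rho) * rho ** k for k in range(1, 1000))

closed_form = rho / (1 - rho)
# the truncated sum agrees with the closed form (here 0.6 / 0.4 = 1.5)
```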


SLIDE 48

Moments of Random Variables: Expectation

We can extend the idea of expectation to continuous random variables. Consider a continuous random variable X whose range is the interval [a, b] ⊂ R. We can partition this interval into small subintervals [b_{i−1}, b_i] and write:

E[X] ≈ Σ_i xi Pr(b_{i−1} ≤ X < b_i) = Σ_i xi [FX(b_i) − FX(b_{i−1})]

where FX(x) = Pr(X ≤ x) is the cdf of X. This approximation becomes exact as the subintervals tend toward length 0, so we have, for a general random variable X:

E[X] = ∫ x dFX(x)

and, given that the pdf is the derivative of the cdf:

E[X] = ∫ x fX(x) dx


SLIDE 49

Moments of Random Variables: Expectation

Example

Find the expected value of the continuous random variable X with range space R⁺ and exponential pdf given by f(r) = λe^(−λr), λ > 0.

Solution

E[X] = ∫_0^∞ r λe^(−λr) dr = [−r e^(−λr)]_{r=0}^∞ + ∫_0^∞ e^(−λr) dr
     = [−(1/λ) e^(−λr)]_{r=0}^∞
     = 1/λ
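A midpoint Riemann sum reproduces E[X] = 1/λ (an illustrative sketch with λ = 2; the integral is truncated at x = 20, where the exponential tail is negligible):

```python
import math

lam = 2.0   # illustrative rate
dx = 0.001  # subinterval width

# midpoint Riemann sum for the integral of x * lam * exp(-lam * x) over [0, 20]
mean = sum((i + 0.5) * dx * lam * math.exp(-lam * (i + 0.5) * dx) * dx
           for i in range(int(20 / dx)))
# mean should be close to 1 / lam = 0.5
```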


SLIDE 50

Moments of Random Variables: Variance

The expected value gives limited information about the distribution of a random variable, as quite dissimilar random variables may have the same mean value. To further describe a random variable's distribution, some measure of the spread of probability mass about the mean value is required. One possibility for this measure would be:

E[|X − E[X]|]

However, to weight large excursions from the mean value more heavily, we define:

Var(X) = E[(X − E[X])²]

This is termed the variance of the random variable X. We note that Var(X) = E[X²] − E²[X] and that E[X²] is referred to as the second moment of X. The second moment is often written in shorthand form as X̄². The variance can thus be expressed as X̄² − (X̄)².


SLIDE 51

Moments of Random Variables: Variance

Example

Find the variance of the continuous random variable X with range space R⁺ and exponential pdf given by f(r) = λe^(−λr), λ > 0.

Solution

We have previously calculated E[X] = 1/λ, and

E[X²] = ∫_0^∞ r² λe^(−λr) dr = 2/λ²

so

Var(X) = E[X²] − E²[X] = 2/λ² − 1/λ² = 1/λ²
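A Monte Carlo sketch (illustrative, with a fixed seed and λ = 2) confirms both E[X] = 1/λ and Var(X) = 1/λ²:

```python
import random

random.seed(1)  # fixed seed so the estimates are repeatable
lam = 2.0
n = 100_000
samples = [random.expovariate(lam) for _ in range(n)]

mean = sum(samples) / n                          # estimate of E[X] = 1/lam
second_moment = sum(s * s for s in samples) / n  # estimate of E[X^2] = 2/lam^2
variance = second_moment - mean ** 2             # Var(X) = E[X^2] - (E[X])^2
# variance should be close to 1 / lam**2 = 0.25
```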


SLIDE 52

Probability & Stochastic Processes

Introduction to Probability Theory Sample Spaces Event Spaces Probability Measure Probability Functions Random Variables Moments of Random Variables Introduction to Stochastic Processes


SLIDE 53

Stochastic Processes

Many random systems of interest display some form of time-dependent change, evolving from one state to another as time passes, for instance:

Stock market prices
Population sizes
The utilisation level of a communications link
The number of packets queued in a router's buffer

To model such systems, the notion of a stochastic process (or random process) is useful. A stochastic process is a family of random variables {X(t, ω) : t ∈ I and ω ∈ Ω}, indexed by the set I and defined on a common (domain) probability space (Ω, F, P). X(t, ω) is often abbreviated to X(t) or Xt.


SLIDE 54

Stochastic Processes

The index set I may be discrete (e.g. I = Z⁺) or continuous (e.g. I = [0, ∞)). I is usually interpreted as being time (either discrete or continuous). We can view a stochastic process as being a mapping from each sample point ω ∈ Ω to a function of time and note that:

For a given value of ω, X(t, ω) is a function of time,
For a given value of t, X(t, ω) is a random variable and
For a given value of both ω and t, X(t, ω) is a fixed sample value.

X(t, ω) for a given ω is also called a trajectory or sample path of the random process. We observe that the probability distribution governing the likelihood of different ω's dictates the likelihood of the different trajectories that the output value of the stochastic process will take over time. We also observe that, at a given time t, X(t, ω) describes the likelihood of different values (states) of the process. That is, at a given point in time (e.g. t = t1), the random variable X(t1, ω) has a cdf (or pmf if discrete) describing the likelihood of the process being in different states at that time.


SLIDE 55

Stochastic Processes

Example of a Stochastic Process

Consider a game where a coin is tossed repeatedly (ad infinitum) and the player's score is accumulated by adding 1 point when a head turns up and deducting 1 point when a tail turns up. Let us describe this process as a stochastic process defined on a common probability space (Ω, F, P).

A single outcome of the experiment is some infinite sequence of equally likely 1's and −1's, that is, the sample space is a product space:

Ω = {−1, 1}^∞ = {all vectors ω = (a1, a2, ..., ai, ...) : ai ∈ {−1, 1}}

We can then describe the player's score as the stochastic process:

X(t, ω) = Σ_{i=1}^{t} ai, t ∈ Z⁺, ai the i'th component of ω ∈ Ω

We note that at any fixed value of t ∈ Z⁺, we have a random variable. For example, X(2, ω) is a random variable with associated pmf:

Pr(X(2, ω) = −2) = 1/4, Pr(X(2, ω) = 0) = 1/2, Pr(X(2, ω) = 2) = 1/4
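This pmf is easy to verify by simulation (an illustrative sketch with a fixed seed; the helper name empirical_pmf is invented for this example):

```python
import random

random.seed(0)  # fixed seed so the estimates are repeatable

def empirical_pmf(t, n_trials=100_000):
    """Estimate the pmf of X(t): the cumulative score after t fair +1/-1 tosses."""
    counts = {}
    for _ in range(n_trials):
        score = sum(random.choice((-1, 1)) for _ in range(t))
        counts[score] = counts.get(score, 0) + 1
    return {k: v / n_trials for k, v in counts.items()}

pmf = empirical_pmf(2)  # should be close to {-2: 0.25, 0: 0.5, 2: 0.25}
```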


SLIDE 56

Stochastic Processes: Classifications

Stochastic processes may be classified according to the nature of:

1 The State Space, the set of possible values (or states) that X(t, ω) can take on.

The state space can either be (i) discrete (finite or countable set of states) or (ii) continuous (values over continuous intervals).

2 The Parameter Space, the permitted times at which changes in state may occur.

The parameter space can either be (i) discrete (discrete time process) or (ii) continuous (continuous time process).

3 The Statistical Dependencies among the family of random variables X(t, ω), for different values of t.

Classifications of statistical dependencies are discussed below.


SLIDE 57

Stochastic Processes: Classifications

Statistical Dependencies

Firstly, let us consider possible probabilistic relationships between two random variables X and Y. Consider the events (X ≤ x) and (Y ≤ y). The events are independent if

Pr((X ≤ x) and (Y ≤ y)) = Pr(X ≤ x) · Pr(Y ≤ y)

Where this is not the case, there is a statistical dependency between the events. The random variables X and Y are said to be independent if

Pr((X ≤ x) and (Y ≤ y)) = Pr(X ≤ x) · Pr(Y ≤ y) for all such events (X ≤ x) and (Y ≤ y)

Where this is not the case, the probabilistic dependencies between X and Y can be described in terms of joint probability functions.


SLIDE 58

Stochastic Processes: Statistical Dependencies

The joint distribution function of random variables X and Y is defined as

FX,Y(x, y) = Pr(X ≤ x, Y ≤ y)

The joint probability density function of random variables X and Y is defined as

fX,Y(x, y) = ∂²FX,Y(x, y)/∂x∂y

We observe an alternative definition of independence. X and Y are independent if

FX,Y(x, y) = FX(x) FY(y)

or, equivalently, X and Y are independent if

fX,Y(x, y) = fX(x) fY(y)
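The factorisation FX,Y(x, y) = FX(x)FY(y) can be illustrated empirically for two independent uniform(0, 1) random variables (a sketch; the test point (x, y) = (0.3, 0.6) is arbitrary):

```python
import random

random.seed(7)  # fixed seed so the estimates are repeatable
n = 100_000
# two independent uniform(0, 1) random variables, sampled jointly
pairs = [(random.random(), random.random()) for _ in range(n)]

x, y = 0.3, 0.6
joint = sum(1 for u, v in pairs if u <= x and v <= y) / n  # estimates F_{X,Y}(x, y)
marg_x = sum(1 for u, v in pairs if u <= x) / n            # estimates F_X(x)
marg_y = sum(1 for u, v in pairs if v <= y) / n            # estimates F_Y(y)
# joint should be close to marg_x * marg_y (here 0.3 * 0.6 = 0.18)
```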


SLIDE 59

Stochastic Processes: Statistical Dependencies

The notion of joint distribution and joint density functions can be extended to a group of any number of random variables. Consider the stochastic process X(t, ω) as an infinite series of random variables X(ti, ω), where i ∈ I, an infinite index set. The joint distribution function of these random variables can be denoted:

FX(t1),X(t2),...(x1, x2, ...) = Pr(X(t1) ≤ x1, X(t2) ≤ x2, ...)

We may then define an Independent Process as a stochastic process with the property

FX(t1),X(t2),...(x1, x2, ...) = FX(t1)(x1) · FX(t2)(x2) · ...

We note that independent processes are somewhat trivial, given that the state of the process does not evolve from (depend on) previous states. For (more interesting) processes that are not independent, the statistical dependence between states at different times is expressed in the joint distribution function; however, in general, this function is complex, and so simpler mechanisms of specification are more useful. We will see an example of such a mechanism when we meet Markov Processes.


SLIDE 60

Stochastic Processes: Statistical Dependencies

Other classifications of stochastic processes relating to statistical dependencies can be made:

A Stationary Process is a stochastic process whose joint distribution function FX(t1),X(t2),... does not change with shifts in time, that is, for a constant τ,

FX(t1+τ),X(t2+τ),... = FX(t1),X(t2),...

An Ergodic Process is a stochastic process where a full description of the process can be determined from a single (infinitely long) sample path of the process. This implies that the behaviour of the process, after a long period of evolution, becomes independent of the starting point of the process.

Exercise 6

Give a classification of the stochastic process described in the previous example (the infinite coin tossing game).
