

  1. Markov Chains
Gonzalo Mateos, Dept. of ECE and Goergen Institute for Data Science, University of Rochester
gmateosb@ece.rochester.edu, http://www.ece.rochester.edu/~gmateosb/
October 5, 2020

  2. Limiting distributions
Outline:
◮ Limiting distributions
◮ Ergodicity
◮ Queues in communication networks: Limit probabilities

  3. Limiting distributions
◮ MCs have one-step memory. Eventually they forget the initial state
◮ Q: What can we say about the probabilities for large n?

      π_j := lim_{n→∞} P(X_n = j | X_0 = i) = lim_{n→∞} P^n_ij

  ⇒ Assumed that the limit is independent of the initial state X_0 = i
◮ We've seen that this problem is related to the matrix power P^n

      P   = [0.8 0.2; 0.3 0.7],      P^7  = [0.6031 0.3969; 0.5953 0.4047]
      P^2 = [0.7 0.3; 0.45 0.55],    P^30 = [0.6000 0.4000; 0.6000 0.4000]

◮ Matrix product converges ⇒ probs. independent of time (large n)
◮ All rows are equal ⇒ probs. independent of initial condition
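
A minimal NumPy sketch of this convergence, using the two-state chain from this slide:

```python
# Sketch: matrix powers of the two-state chain shown above.
import numpy as np

P = np.array([[0.8, 0.2],
              [0.3, 0.7]])

for n in [2, 7, 30]:
    print(f"P^{n} =")
    print(np.round(np.linalg.matrix_power(P, n), 4))

# By n = 30 both rows equal (0.6, 0.4): the limit probabilities
# no longer depend on the initial state.
```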

  4. Periodicity
◮ Def: The period d of a state i is (gcd means greatest common divisor)

      d = gcd{n : P^n_ii ≠ 0}

◮ State i is periodic with period d if and only if
  ⇒ P^n_ii ≠ 0 only if n is a multiple of d
  ⇒ d is the largest number with this property
◮ Positive probability of returning to i only every d time steps
  ⇒ If the period is d = 1, the state is aperiodic (most often the case)
  ⇒ Periodicity is a class property
◮ Ex: chain on states 0, 1, 2, where state 1 moves to 0 w.p. 1 − p and to 2 w.p. p, while states 0 and 2 move back to 1 w.p. 1
◮ State 1 has period 2. So do 0 and 2 (class property)
◮ Ex: The one-dimensional random walk also has period 2

  5. Periodicity example
Example

      P = [0 1; 0.5 0.5],   P^2 = [0.50 0.50; 0.25 0.75],   P^3 = [0.250 0.750; 0.375 0.625]

◮ P_11 = 0, but P^2_11, P^3_11 ≠ 0, so gcd{2, 3, ...} = 1. State 1 is aperiodic
◮ P_22 ≠ 0. State 2 is aperiodic (had to be, since 1 ↔ 2)

Example

      P = [0 1; 1 0],   P^2 = [1 0; 0 1],   P^3 = [0 1; 1 0], ...

◮ P^{2n+1}_11 = 0, but P^{2n}_11 ≠ 0, so gcd{2, 4, ...} = 2. State 1 has period 2
◮ The same is true for state 2 (since 1 ↔ 2)
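
A sketch that estimates the period numerically for both examples. The truncation horizon N is an assumption (the true gcd runs over all n), and states are 0-indexed here while the slide counts from 1:

```python
# Sketch: period of state i as gcd{n <= N : P^n_ii > 0}, truncated at N.
import numpy as np
from math import gcd
from functools import reduce

def period(P, i, N=50):
    hits = [n for n in range(1, N + 1)
            if np.linalg.matrix_power(P, n)[i, i] > 0]
    return reduce(gcd, hits) if hits else 0

P1 = np.array([[0.0, 1.0], [0.5, 0.5]])  # first example
P2 = np.array([[0.0, 1.0], [1.0, 0.0]])  # second example

print(period(P1, 0))  # 1 -> the slide's state 1 is aperiodic
print(period(P2, 0))  # 2 -> the slide's state 1 has period 2
```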

  6. Positive recurrence and ergodicity
◮ Recall: state i is recurrent if the MC returns to i with probability 1
  ⇒ Define the return time to state i as

      T_i = min{n > 0 : X_n = i | X_0 = i}

◮ Def: State i is positive recurrent when the expected value of T_i is finite

      E[T_i | X_0 = i] = Σ_{n=1}^∞ n P(T_i = n | X_0 = i) < ∞

◮ Def: State i is null recurrent if recurrent but E[T_i | X_0 = i] = ∞
  ⇒ Positive and null recurrence are class properties
  ⇒ Recurrent states in a finite-state MC are positive recurrent
◮ Def: Jointly positive recurrent and aperiodic states are ergodic
  ⇒ An irreducible MC with ergodic states is said to be an ergodic MC

  7. Null recurrent Markov chain example
◮ Chain on states 0, 1, 2, 3, ...: state 0 moves to 1 w.p. 1; state k ≥ 1 returns to 0 w.p. 1/(k + 1) and moves on to k + 1 w.p. k/(k + 1)

      P(T_0 = 2 | X_0 = 0) = 1/2
      P(T_0 = 3 | X_0 = 0) = (1/2) × (1/3) = 1/6
      P(T_0 = 4 | X_0 = 0) = (1/2) × (2/3) × (1/4) = 1/12
      ...
      P(T_0 = n | X_0 = 0) = 1/((n − 1) × n)

◮ State 0 is recurrent because the probability of never returning is 0

      P(T_0 = ∞ | X_0 = 0) = lim_{n→∞} 1/((n − 1) × n) = 0

◮ Also null recurrent because the expected return time is infinite

      E[T_0 | X_0 = 0] = Σ_{n=2}^∞ n P(T_0 = n | X_0 = 0) = Σ_{n=2}^∞ 1/(n − 1) = ∞
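
A simulation sketch of this chain, with the step logic as described above: every excursion returns to 0 eventually, yet the sample mean of T_0 keeps growing as more samples are drawn, consistent with an infinite expectation:

```python
# Sketch: simulate return times to state 0 for the null recurrent chain.
import random

def return_time():
    k, n = 0, 0
    while True:
        n += 1
        if k == 0:
            k = 1                               # 0 -> 1 w.p. 1
        elif random.random() < 1.0 / (k + 1):
            return n                            # k -> 0 w.p. 1/(k+1)
        else:
            k += 1                              # k -> k+1 w.p. k/(k+1)

samples = [return_time() for _ in range(100_000)]
print(sum(samples) / len(samples))  # grows with the number of samples
```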

  8. Limit distribution of ergodic Markov chains
Theorem: For an ergodic (i.e., irreducible, aperiodic and positive recurrent) MC, lim_{n→∞} P^n_ij exists and is independent of the initial state i, i.e.,

      π_j = lim_{n→∞} P^n_ij

Furthermore, the steady-state probabilities π_j ≥ 0 are the unique nonnegative solution of the system of linear equations

      π_j = Σ_{i=0}^∞ π_i P_ij,      Σ_{j=0}^∞ π_j = 1

◮ Limit probs. independent of the initial condition exist for an ergodic MC
  ⇒ Simple algebraic equations can be solved to find the π_j
◮ No periodic, transient, or null recurrent states, and no multiple classes

  9. Algebraic relation to determine limit probabilities
◮ The difficult part of the theorem is to prove that π_j = lim_{n→∞} P^n_ij exists
◮ To see that the algebraic relation is true, use total probability

      P^{n+1}_kj = Σ_{i=0}^∞ P(X_{n+1} = j | X_n = i, X_0 = k) P^n_ki = Σ_{i=0}^∞ P_ij P^n_ki

◮ If the limit exists, P^{n+1}_kj ≈ π_j and P^n_ki ≈ π_i (for sufficiently large n)

      π_j = Σ_{i=0}^∞ π_i P_ij

◮ The other equation is true because the π_j are probabilities

  10. Vector/matrix notation: Matrix limit
◮ More compact and illuminating using vector/matrix notation
  ⇒ Finite MC with J states
◮ The first part of the theorem says that lim_{n→∞} P^n exists and equals a matrix with identical rows

      lim_{n→∞} P^n = [π_1 π_2 ... π_J; π_1 π_2 ... π_J; ...; π_1 π_2 ... π_J]

◮ Same probabilities for all rows ⇒ independent of the initial state
◮ Probability distribution for large n

      lim_{n→∞} p(n) = lim_{n→∞} (P^T)^n p(0) = [π_1, ..., π_J]^T

  ⇒ Independent of the initial condition p(0)
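
A quick sketch of this independence from p(0), propagating two different initial distributions through the two-state chain of slide 3:

```python
# Sketch: p(n) = (P^T)^n p(0) approaches pi regardless of p(0).
import numpy as np

P = np.array([[0.8, 0.2],
              [0.3, 0.7]])

for p0 in (np.array([1.0, 0.0]), np.array([0.0, 1.0])):
    p30 = np.linalg.matrix_power(P.T, 30) @ p0
    print(np.round(p30, 4))  # [0.6 0.4] for both initial conditions
```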

  11. Vector/matrix notation: Eigenvector
◮ Def: The vector limit (steady-state) distribution is π := [π_1, ..., π_J]^T
◮ The limit distribution is the unique solution of (with 1 := [1, 1, ...]^T)

      π = P^T π,      π^T 1 = 1

◮ π is an eigenvector associated with eigenvalue 1 of P^T
  ⇒ Eigenvectors are defined up to a scaling factor
  ⇒ Normalize so the entries sum to 1
◮ All other eigenvalues of P^T have modulus smaller than 1
  ⇒ If not, P^n would diverge, but we know P^n contains n-step transition probs.
◮ π is the eigenvector associated with the largest eigenvalue of P^T
◮ Computing π as an eigenvector is often computationally efficient
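
A sketch of the eigenvector computation for the two-state chain of slide 3; picking the eigenvalue closest to 1 and renormalizing handles the arbitrary scaling:

```python
# Sketch: pi as the eigenvector of P^T for eigenvalue 1, normalized to sum 1.
import numpy as np

P = np.array([[0.8, 0.2],
              [0.3, 0.7]])

eigvals, eigvecs = np.linalg.eig(P.T)
k = np.argmin(np.abs(eigvals - 1.0))  # index of the eigenvalue closest to 1
pi = np.real(eigvecs[:, k])
pi = pi / pi.sum()                    # normalize so the entries sum to 1
print(pi)                             # [0.6 0.4]
```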

  12. Vector/matrix notation: Rank
◮ Can also write as (I is the identity matrix, 0 = [0, 0, ...]^T)

      (I − P^T) π = 0,      π^T 1 = 1

◮ π has J elements, but there are J + 1 equations ⇒ overdetermined
◮ If 1 is an eigenvalue of P^T, then 0 is an eigenvalue of I − P^T
  ⇒ I − P^T is rank deficient, in fact rank(I − P^T) = J − 1
  ⇒ Then, there are in fact only J linearly independent equations
◮ π is the eigenvector associated with eigenvalue 0 of I − P^T
◮ π spans the null space of I − P^T (not much significance)
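
The rank deficiency is easy to check numerically; a one-line sketch for the two-state chain (J = 2):

```python
# Sketch: rank(I - P^T) = J - 1 for an ergodic chain.
import numpy as np

P = np.array([[0.8, 0.2],
              [0.3, 0.7]])
J = P.shape[0]
print(np.linalg.matrix_rank(np.eye(J) - P.T))  # 1 = J - 1
```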

  13. Ergodic Markov chain example
◮ MC with transition probability matrix

      P = [0 0.3 0.7; 0.1 0.5 0.4; 0.1 0.2 0.7]

◮ Q: Does P correspond to an ergodic MC?
  ⇒ Irreducible: all states communicate with state 2 ✓
  ⇒ Positive recurrent: irreducible and finite ✓
  ⇒ Aperiodic: period of state 2 is 1 ✓
◮ Then, there exist π_1, π_2 and π_3 such that π_j = lim_{n→∞} P^n_ij
  ⇒ The limit is independent of i

  14. Ergodic Markov chain example (continued)
◮ Q: How do we determine the limit probabilities π_j?
◮ Solve the system of linear equations π_j = Σ_{i=1}^3 π_i P_ij and Σ_{j=1}^3 π_j = 1

      [0 0.1 0.1; 0.3 0.5 0.2; 0.7 0.4 0.7; 1 1 1] [π_1; π_2; π_3] = [π_1; π_2; π_3; 1]

  ⇒ The top 3 × 3 block of the matrix above is P^T
◮ There are three variables and four equations
  ⇒ Some equations might be linearly dependent
◮ Indeed, summing the first three equations gives π_1 + π_2 + π_3 = π_1 + π_2 + π_3
  ⇒ Always true, because the probabilities in the rows of P sum up to 1
  ⇒ A manifestation of the rank deficiency of I − P^T
◮ The solution yields π_1 = 0.0909, π_2 = 0.2987 and π_3 = 0.6104
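
A sketch that solves this exact system by stacking I − P^T with the normalization row and using least squares (the system is overdetermined but consistent, so the residual is zero):

```python
# Sketch: solve (I - P^T) pi = 0 together with 1^T pi = 1 via least squares.
import numpy as np

P = np.array([[0.0, 0.3, 0.7],
              [0.1, 0.5, 0.4],
              [0.1, 0.2, 0.7]])
J = P.shape[0]

A = np.vstack([np.eye(J) - P.T, np.ones((1, J))])  # 4 equations, 3 unknowns
b = np.array([0.0, 0.0, 0.0, 1.0])

pi, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.round(pi, 4))  # [0.0909 0.2987 0.6104]
```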

  15. Stationary distribution
◮ Limit distributions are sometimes called stationary distributions
  ⇒ Select the initial distribution P(X_0 = i) = π_i for all i
◮ The probabilities at time n = 1 follow from the law of total probability

      P(X_1 = j) = Σ_{i=1}^∞ P(X_1 = j | X_0 = i) P(X_0 = i)

◮ Using the definition of P_ij, the choice P(X_0 = i) = π_i, and the algebraic property of π_j

      P(X_1 = j) = Σ_{i=1}^∞ P_ij π_i = π_j

  ⇒ The probability distribution is unchanged
◮ Proceeding recursively, for a system initialized with P(X_0 = i) = π_i
  ⇒ The probability distribution is invariant: P(X_n = i) = π_i for all n
◮ The MC is stationary in a probabilistic sense (states change, probs. do not)
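
A sketch checking invariance for the three-state example of slide 14: one step of the chain started from π returns the same distribution:

```python
# Sketch: if p(0) = pi, then p(1) = P^T pi = pi (and so on for all n).
import numpy as np

P = np.array([[0.0, 0.3, 0.7],
              [0.1, 0.5, 0.4],
              [0.1, 0.2, 0.7]])
pi = np.array([0.0909, 0.2987, 0.6104])

print(np.round(P.T @ pi, 4))  # [0.0909 0.2987 0.6104], unchanged
```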

  16. Ergodicity
Outline:
◮ Limiting distributions
◮ Ergodicity
◮ Queues in communication networks: Limit probabilities
