  1. Chapter 4 Entropy Rates of a Stochastic Process. Peng-Hua Wang, Graduate Inst. of Comm. Engineering, National Taipei University.

  2. Chapter Outline
     Chap. 4 Entropy Rates of a Stochastic Process
     ■ 4.1 Markov Chains
     ■ 4.2 Entropy Rate
     ■ 4.3 Example: Entropy Rate of a Random Walk on a Weighted Graph
     ■ 4.4 Second Law of Thermodynamics
     ■ 4.5 Functions of Markov Chains

  3. 4.1 Markov Chains

  4. Stationary
     Definition (Stationary) A stochastic process is said to be stationary if
     $$\Pr\{X_1 = x_1, X_2 = x_2, \dots, X_n = x_n\} = \Pr\{X_{1+\ell} = x_1, X_{2+\ell} = x_2, \dots, X_{n+\ell} = x_n\}$$
     for every n and every shift ℓ.
     ■ The joint distribution of any subset of the sequence of random variables is invariant with respect to shifts in the time index.
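As a quick numerical illustration of this definition (not from the slides; the matrix and the shift below are arbitrary choices), the Python sketch starts a two-state Markov chain in its stationary distribution and checks that the joint pmf of the pair (X_k, X_{k+1}) is unchanged by a time shift ℓ:

```python
# Minimal sketch: check shift invariance for a chain started in its
# stationary distribution. P and ell are illustrative assumptions.
import numpy as np

P = np.array([[0.7, 0.3],
              [0.4, 0.6]])          # Pr{X_{n+1} = j | X_n = i}

# Stationary distribution: left eigenvector of P with eigenvalue 1.
w, v = np.linalg.eig(P.T)
mu = np.real(v[:, np.argmin(np.abs(w - 1))])
mu = mu / mu.sum()

def pair_pmf(start, k):
    """Joint pmf of (X_k, X_{k+1}) when X_1 ~ start (1-indexed time)."""
    pk = start @ np.linalg.matrix_power(P, k - 1)   # distribution of X_k
    return pk[:, None] * P                          # Pr{X_k = i, X_{k+1} = j}

ell = 5
print(np.allclose(pair_pmf(mu, 1), pair_pmf(mu, 1 + ell)))  # True: shift-invariant
```

Starting the chain in its stationary distribution is what makes every window of the process have the same joint law; from a different initial pmf the check would generally fail for small k.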

  5. Markov chain
     Definition (Markov chain) A discrete stochastic process X_1, X_2, ... is said to be a Markov chain or a Markov process if, for n = 1, 2, ...,
     $$\Pr\{X_{n+1} = x_{n+1} \mid X_n = x_n, X_{n-1} = x_{n-1}, \dots, X_1 = x_1\} = \Pr\{X_{n+1} = x_{n+1} \mid X_n = x_n\}.$$
     ■ The joint pmf can be written as
     $$p(x_1, x_2, \dots, x_n) = p(x_1)\,p(x_2 \mid x_1)\,p(x_3 \mid x_2) \cdots p(x_n \mid x_{n-1}).$$
     Definition (Time invariant) The Markov chain is said to be time invariant if the transition probability p(x_{n+1} | x_n) does not depend on n, that is,
     $$\Pr\{X_{n+1} = b \mid X_n = a\} = \Pr\{X_2 = b \mid X_1 = a\} \quad \text{for all } a, b \in \mathcal{X}.$$
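To make the factorization concrete, here is a minimal Python sketch; the initial pmf p1 and the transition matrix P are made-up, illustrative values:

```python
# Minimal sketch: the joint pmf of a sample path factors as
# p(x1) p(x2|x1) ... p(xn|x_{n-1}) for a time-invariant Markov chain.
import numpy as np

p1 = np.array([0.5, 0.5])            # initial distribution p(x_1)
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])           # p(x_{n+1} | x_n), time invariant

def path_prob(path):
    """Probability of observing the state sequence `path` (states are 0/1)."""
    prob = p1[path[0]]
    for a, b in zip(path, path[1:]):
        prob *= P[a, b]
    return prob

print(path_prob([0, 0, 1, 1, 0]))    # p(x1) p(x2|x1) ... p(x5|x4)
```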

  6. Markov chain
     ■ We will assume that the Markov chain is time invariant.
     ■ X_n is called the state at time n.
     ■ A time invariant Markov chain is characterized by its initial state and a probability transition matrix P = [P_{ij}], i, j ∈ {1, 2, ..., m}, where P_{ij} = Pr{X_{n+1} = j | X_n = i}.
     ■ The pmf at time n + 1 is
     $$p(x_{n+1}) = \sum_{x_n} p(x_n)\, P_{x_n x_{n+1}}.$$
     ■ A distribution on the states such that the distribution at time n + 1 is the same as the distribution at time n is called a stationary distribution.
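The update p_{n+1} = p_n P can be iterated numerically. The sketch below (an arbitrary illustrative 3-state chain, not from the slides) shows the pmf settling into a distribution that is invariant under P, i.e. a stationary distribution:

```python
# Minimal sketch: propagate the state pmf with p_{n+1} = p_n P and iterate
# until it stops changing; the fixed point is a stationary distribution.
import numpy as np

P = np.array([[0.5, 0.5, 0.0],
              [0.1, 0.6, 0.3],
              [0.0, 0.4, 0.6]])      # illustrative transition matrix

p = np.array([1.0, 0.0, 0.0])        # start deterministically in state 1
for _ in range(200):
    p = p @ P                        # one-step update of the pmf

print(p)                             # approximately the stationary distribution
print(np.allclose(p, p @ P))         # True: invariant under P
```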

  7. Example 4.1.1
     Consider a two-state Markov chain with probability transition matrix
     $$P = \begin{pmatrix} 1-\alpha & \alpha \\ \beta & 1-\beta \end{pmatrix}.$$
     Find its stationary distribution and entropy rate.
     Solution. Let μ_1, μ_2 be the stationary distribution. Then
     $$\mu_1 = \mu_1(1-\alpha) + \mu_2\beta, \qquad \mu_2 = \mu_1\alpha + \mu_2(1-\beta),$$
     and μ_1 + μ_2 = 1. Solving gives μ_1 = β/(α+β) and μ_2 = α/(α+β), so the entropy rate is H(X_2 | X_1) = μ_1 H(α) + μ_2 H(β), where H(·) is the binary entropy function.
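A numerical sanity check of this result, assuming the illustrative values α = 0.2 and β = 0.4 (chosen only for the example): it verifies that μ = (β, α)/(α+β) is invariant under P and evaluates the entropy rate in bits:

```python
# Minimal sketch for Example 4.1.1 with made-up alpha, beta.
import numpy as np

def Hb(p):
    """Binary entropy in bits."""
    return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

alpha, beta = 0.2, 0.4
P = np.array([[1 - alpha, alpha],
              [beta, 1 - beta]])

mu = np.array([beta, alpha]) / (alpha + beta)     # stationary distribution
print(np.allclose(mu, mu @ P))                    # True: mu P = mu
print(mu[0] * Hb(alpha) + mu[1] * Hb(beta))       # entropy rate in bits
```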

  8. 4.2 Entropy Rate

  9. Entropy Rate
     Definition (Entropy Rate) The entropy rate of a stochastic process {X_i} is defined by
     $$H(X) = \lim_{n\to\infty} \frac{1}{n} H(X_1, X_2, \dots, X_n)$$
     when the limit exists.
     Definition (Conditional Entropy Rate) The conditional entropy rate of a stochastic process {X_i} is defined by
     $$H'(X) = \lim_{n\to\infty} H(X_n \mid X_{n-1}, X_{n-2}, \dots, X_1)$$
     when the limit exists.
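For a concrete finite-n feel for these two limits, a minimal sketch (an illustrative two-state chain started in its stationary distribution, not from the slides) computes the n = 2 instances of both quantities from the joint pmf:

```python
# Minimal sketch: (1/2) H(X_1, X_2) and H(X_2 | X_1) for a small stationary chain,
# i.e. the n = 2 versions of the quantities whose limits define H(X) and H'(X).
import numpy as np

P = np.array([[0.8, 0.2],
              [0.4, 0.6]])
mu = np.array([2/3, 1/3])                 # stationary distribution of P

joint = mu[:, None] * P                   # Pr{X_1 = i, X_2 = j}
H12 = -(joint * np.log2(joint)).sum()     # H(X_1, X_2)
H1 = -(mu * np.log2(mu)).sum()            # H(X_1)

print(H12 / 2)                            # finite-n version of H(X)
print(H12 - H1)                           # finite-n version of H'(X): H(X_2 | X_1)
```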

  10. Entropy Rate
     ■ If X_1, X_2, ... are i.i.d. random variables, then
     $$H(X) = \lim_{n\to\infty} \frac{H(X_1, X_2, \dots, X_n)}{n} = \lim_{n\to\infty} \frac{n H(X_1)}{n} = H(X_1).$$
     ■ If X_1, X_2, ... are independent but not identically distributed, then
     $$H(X) = \lim_{n\to\infty} \frac{1}{n} \sum_{i=1}^{n} H(X_i).$$
     ■ We can choose a sequence of distributions on X_1, X_2, ... such that this limit does not exist.
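A tiny check of the i.i.d. case with an arbitrary three-symbol pmf (an assumption for illustration): the joint entropy of n independent copies is exactly n H(X_1), so the per-symbol entropy equals H(X_1) for every n:

```python
# Minimal sketch: for i.i.d. X_1, ..., X_n, H(X_1,...,X_n) = n * H(X_1).
import itertools
import numpy as np

p = np.array([0.5, 0.3, 0.2])                  # pmf of each X_i (illustrative)
H1 = -(p * np.log2(p)).sum()                   # H(X_1)

n = 4
Hn = 0.0
for path in itertools.product(range(len(p)), repeat=n):
    q = np.prod(p[list(path)])                 # independence: product of marginals
    Hn -= q * np.log2(q)

print(Hn / n, H1)                              # equal: the entropy rate is H(X_1)
```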

  11. Entropy Rate
     Theorem 4.2.2 For a stationary stochastic process, H(X_n | X_{n-1}, ..., X_1) is nonincreasing in n and has a limit H'(X).
     Proof.
     $$H(X_{n+1} \mid X_1, X_2, \dots, X_n) \le H(X_{n+1} \mid X_2, \dots, X_n) \quad \text{(conditioning reduces entropy)}$$
     $$= H(X_n \mid X_1, \dots, X_{n-1}) \quad \text{(stationarity)}.$$
     Since H(X_n | X_{n-1}, ..., X_1) is nonnegative and nonincreasing in n, it has a limit H'(X). ∎
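The monotonicity can be observed numerically, since H(X_n | X_{n-1}, ..., X_1) = H(X_1, ..., X_n) - H(X_1, ..., X_{n-1}). The sketch below (an illustrative two-state chain started in its stationary distribution, not from the slides) prints this difference for increasing n; for this Markov example it drops from H(X_1) to the limit and then stays constant, consistent with the theorem:

```python
# Minimal sketch: the conditional entropies H(X_n | X_{n-1}, ..., X_1),
# computed as differences of block entropies, form a nonincreasing sequence.
import itertools
import numpy as np

P = np.array([[0.8, 0.2],
              [0.4, 0.6]])
mu = np.array([2/3, 1/3])                        # stationary distribution of P

def block_entropy(n):
    """H(X_1, ..., X_n) in bits, summing over all 2^n state sequences."""
    H = 0.0
    for path in itertools.product([0, 1], repeat=n):
        p = mu[path[0]]
        for a, b in zip(path, path[1:]):
            p *= P[a, b]
        H -= p * np.log2(p)
    return H

prev = 0.0
for n in range(1, 7):
    Hn = block_entropy(n)
    print(n, Hn - prev)    # H(X_n | X_{n-1}, ..., X_1): nonincreasing in n
    prev = Hn
```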

  12. Entropy Rate
     Theorem 4.2.1 For a stationary stochastic process, both H(X) and H'(X) exist and are equal: H(X) = H'(X).
     Proof. By the chain rule,
     $$\frac{1}{n} H(X_1, X_2, \dots, X_n) = \frac{1}{n} \sum_{i=1}^{n} H(X_i \mid X_{i-1}, \dots, X_1),$$
     that is, the per-symbol entropy is the time average of the conditional entropies. Since the conditional entropies have a limit H'(X), the theorem on Cesàro means (next slide) shows that their time average, and hence the entropy rate, converges to the same limit. ∎

  13. Cesàro mean
     Theorem (Cesàro mean) If a_n → a and b_n = (1/n) Σ_{i=1}^{n} a_i, then b_n → a.
     Proof. Let ε > 0. Since a_n → a, there exists a number N such that |a_n - a| ≤ ε for all n > N. Hence, for n > N,
     $$|b_n - a| = \left| \frac{1}{n} \sum_{i=1}^{n} (a_i - a) \right| \le \frac{1}{n} \sum_{i=1}^{n} |a_i - a| \le \frac{1}{n} \sum_{i=1}^{N} |a_i - a| + \frac{n-N}{n}\,\varepsilon \le \frac{1}{n} \sum_{i=1}^{N} |a_i - a| + \varepsilon \le 2\varepsilon$$
     when n is large enough, since the remaining finite sum divided by n tends to 0. As ε is arbitrary, b_n → a. ∎
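A short numerical illustration of the theorem with an arbitrary convergent sequence a_n = 2 + 1/n (chosen only for the example):

```python
# Minimal sketch: if a_n -> a, the running averages b_n = (1/n) * sum_{i<=n} a_i
# converge to the same limit.
import numpy as np

n = np.arange(1, 10001)
a = 2.0 + 1.0 / n            # a_n -> 2
b = np.cumsum(a) / n         # Cesaro means b_n

print(a[-1], b[-1])          # both close to 2; b_n lags behind a_n but converges
```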
