

  1. Markov Decision Processes
     Philipp Koehn
     Artificial Intelligence, 7 April 2020

  2. Outline
     ● Hidden Markov models
     ● Inference: filtering, smoothing, best sequence
     ● Dynamic Bayesian networks
     ● Speech recognition

  3. Time and Uncertainty
     ● The world changes; we need to track and predict it
     ● Examples: diabetes management, vehicle diagnosis
     ● Basic idea: sequence of state and evidence variables
     ● X_t = set of unobservable state variables at time t
       e.g., BloodSugar_t, StomachContents_t, etc.
     ● E_t = set of observable evidence variables at time t
       e.g., MeasuredBloodSugar_t, PulseRate_t, FoodEaten_t
     ● This assumes discrete time; the step size depends on the problem
     ● Notation: X_{a:b} = X_a, X_{a+1}, ..., X_{b-1}, X_b

  4. Markov Processes (Markov Chains)
     ● Construct a Bayes net from these variables: what are the parents?
     ● Markov assumption: X_t depends on a bounded subset of X_{0:t-1}
     ● First-order Markov process: P(X_t | X_{0:t-1}) = P(X_t | X_{t-1})
       Second-order Markov process: P(X_t | X_{0:t-1}) = P(X_t | X_{t-2}, X_{t-1})
     ● Sensor Markov assumption: P(E_t | X_{0:t}, E_{0:t-1}) = P(E_t | X_t)
     ● Stationary process: transition model P(X_t | X_{t-1}) and
       sensor model P(E_t | X_t) fixed for all t
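A stationary first-order Markov process can be sketched in a few lines of Python. The weather states and transition probabilities below are hypothetical illustrations, not taken from the slides; the point is that the same transition table P is reused at every step, and the next state depends only on the current one.

```python
import random

random.seed(0)

# P[x0][x1] = P(X_t = x1 | X_{t-1} = x0); the same table at every t
# (stationary), and only the previous state matters (first-order).
P = {"sun": {"sun": 0.8, "rain": 0.2},
     "rain": {"sun": 0.4, "rain": 0.6}}

def sample_chain(x0, steps):
    """Sample a state sequence from the first-order Markov chain."""
    x, path = x0, [x0]
    for _ in range(steps):
        r, acc = random.random(), 0.0
        for nxt, p in P[x].items():     # inverse-CDF sampling over successors
            acc += p
            if r <= acc:
                x = nxt
                break
        path.append(x)
    return path

path = sample_chain("sun", 5)
print(path)
```

A second-order process would need a table indexed by the two previous states; augmenting the state (as the next slide suggests) is the standard way to keep the first-order form.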

  5. Example
     ● First-order Markov assumption not exactly true in the real world!
     ● Possible fixes:
       1. Increase the order of the Markov process
       2. Augment the state, e.g., add Temp_t, Pressure_t

  6. Inference

  7. Inference Tasks
     ● Filtering: P(X_t | e_{1:t})
       belief state, the input to the decision process of a rational agent
     ● Smoothing: P(X_k | e_{1:t}) for 0 ≤ k < t
       better estimate of past states, essential for learning
     ● Most likely explanation: arg max_{x_{1:t}} P(x_{1:t} | e_{1:t})
       speech recognition, decoding with a noisy channel

  8. Filtering
     ● Aim: devise a recursive state estimation algorithm
       P(X_{t+1} | e_{1:t+1}) = P(X_{t+1} | e_{1:t}, e_{t+1})
         = α P(e_{t+1} | X_{t+1}, e_{1:t}) P(X_{t+1} | e_{1:t})                          (Bayes rule)
         ≃ α P(e_{t+1} | X_{t+1}) P(X_{t+1} | e_{1:t})                                  (sensor Markov assumption)
         = α P(e_{t+1} | X_{t+1}) Σ_{x_t} P(X_{t+1} | x_t, e_{1:t}) P(x_t | e_{1:t})    (multiplying out)
         ≃ α P(e_{t+1} | X_{t+1}) Σ_{x_t} P(X_{t+1} | x_t) P(x_t | e_{1:t})             (first-order Markov model)
     ● Summary:
       P(X_{t+1} | e_{1:t+1}) ≃ α P(e_{t+1} | X_{t+1}) Σ_{x_t} P(X_{t+1} | x_t) P(x_t | e_{1:t})
       where P(e_{t+1} | X_{t+1}) is the emission, P(X_{t+1} | x_t) the transition,
       and P(x_t | e_{1:t}) the recursive call
     ● f_{1:t+1} = FORWARD(f_{1:t}, e_{t+1}) where f_{1:t} = P(X_t | e_{1:t})
       Time and space per update are constant (independent of t)
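The recursive update can be made concrete on the classic umbrella world (an assumed example, common in AI textbooks, not shown on these slides): the hidden state is whether it rains, the evidence is whether an umbrella is seen, with P(Rain_t | Rain_{t-1}) = 0.7, P(Rain_t | ¬Rain_{t-1}) = 0.3, P(umbrella | Rain) = 0.9, P(umbrella | ¬Rain) = 0.2.

```python
# Forward filtering sketch on the assumed umbrella world.
T = {True: {True: 0.7, False: 0.3},    # P(X_{t+1} = x1 | X_t = x0)
     False: {True: 0.3, False: 0.7}}
O = {True: 0.9, False: 0.2}            # P(umbrella | Rain = x)

def forward(f, umbrella):
    """One step: f_{1:t+1} = α P(e_{t+1}|X_{t+1}) Σ_{x_t} P(X_{t+1}|x_t) f_{1:t}."""
    f_new = {}
    for x1 in (True, False):
        predict = sum(T[x0][x1] * f[x0] for x0 in (True, False))
        emission = O[x1] if umbrella else 1 - O[x1]
        f_new[x1] = emission * predict
    z = sum(f_new.values())            # normalization constant 1/α
    return {x: p / z for x, p in f_new.items()}

f = {True: 0.5, False: 0.5}            # uniform prior over Rain_0
for e in [True, True]:                 # umbrella observed on days 1 and 2
    f = forward(f, e)
print(round(f[True], 3))               # P(Rain_2 | u_1, u_2) ≈ 0.883
```

Only the current message f is kept between steps, which is why time and space per update are independent of t.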

  9. Filtering Example
     [figure: two filtering steps, alternating transition and emission updates]

  10. Smoothing
      ● If the full sequence is known: what is the state probability P(X_k | e_{1:t}),
        including future evidence?
      ● Smoothing: sum over all paths

  11. Smoothing
      ● Divide evidence e_{1:t} into e_{1:k}, e_{k+1:t}:
        P(X_k | e_{1:t}) = P(X_k | e_{1:k}, e_{k+1:t})
          = α P(X_k | e_{1:k}) P(e_{k+1:t} | X_k, e_{1:k})
          ≃ α P(X_k | e_{1:k}) P(e_{k+1:t} | X_k)
          = α f_{1:k} b_{k+1:t}
      ● Backward message b_{k+1:t} computed by a backwards recursion:
        P(e_{k+1:t} | X_k) = Σ_{x_{k+1}} P(e_{k+1:t} | X_k, x_{k+1}) P(x_{k+1} | X_k)
          ≃ Σ_{x_{k+1}} P(e_{k+1:t} | x_{k+1}) P(x_{k+1} | X_k)
          = Σ_{x_{k+1}} P(e_{k+1} | x_{k+1}) P(e_{k+2:t} | x_{k+1}) P(x_{k+1} | X_k)
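The backward recursion can be sketched on the same assumed umbrella world used above (not an example from the slides): smooth P(Rain_1 | u_1, u_2) by combining the forward message after day 1 with a backward message that folds in day 2's evidence.

```python
# Smoothing sketch: P(X_k | e_{1:t}) = α f_{1:k} b_{k+1:t} on the
# assumed umbrella world.
T = {True: {True: 0.7, False: 0.3}, False: {True: 0.3, False: 0.7}}
O = {True: 0.9, False: 0.2}

def backward(b, umbrella):
    """One step: b_{k+1:t}(X_k) = Σ_{x_{k+1}} P(e_{k+1}|x_{k+1}) b_{k+2:t}(x_{k+1}) P(x_{k+1}|X_k)."""
    out = {}
    for x0 in (True, False):
        out[x0] = sum((O[x1] if umbrella else 1 - O[x1]) * b[x1] * T[x0][x1]
                      for x1 in (True, False))
    return out

f1 = {True: 0.818, False: 0.182}   # forward message after u_1 (rounded)
b = backward({True: 1.0, False: 1.0}, umbrella=True)   # fold in u_2
s = {x: f1[x] * b[x] for x in f1}
z = sum(s.values())
smoothed = {x: p / z for x, p in s.items()}
print(round(smoothed[True], 3))    # P(Rain_1 | u_1, u_2) ≈ 0.883
```

The recursion is initialized with an all-ones message b_{t+1:t}, since there is no evidence after time t.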

  12. Smoothing Example
      ● Forward-backward algorithm: cache forward messages along the way
      ● Time linear in t (polytree inference), space O(t |f|)

  13. Most Likely Explanation
      ● Most likely sequence ≠ sequence of most likely states!
      ● Most likely path to each x_{t+1} = most likely path to some x_t plus one more step:
        max_{x_1...x_t} P(x_1, ..., x_t, X_{t+1} | e_{1:t+1})
          = P(e_{t+1} | X_{t+1}) max_{x_t} ( P(X_{t+1} | x_t) max_{x_1...x_{t-1}} P(x_1, ..., x_{t-1}, x_t | e_{1:t}) )
      ● Identical to filtering, except f_{1:t} is replaced by
        m_{1:t} = max_{x_1...x_{t-1}} P(x_1, ..., x_{t-1}, X_t | e_{1:t})
        i.e., m_{1:t}(i) gives the probability of the most likely path to state i
      ● Update has the sum replaced by max, giving the Viterbi algorithm:
        m_{1:t+1} = P(e_{t+1} | X_{t+1}) max_{x_t} ( P(X_{t+1} | x_t) m_{1:t} )
      ● Also requires back-pointers for the backward pass that retrieves the best sequence:
        b_{X_{t+1}, t+1} = arg max_{x_t} ( P(X_{t+1} | x_t) m_{1:t} )
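A minimal Viterbi sketch, again on the assumed umbrella world rather than any example from the slides: the forward pass keeps the max-message m and a back-pointer per state, the backward pass follows the pointers from the most probable final state.

```python
# Viterbi sketch on the assumed umbrella world.
T = {True: {True: 0.7, False: 0.3}, False: {True: 0.3, False: 0.7}}
O = {True: 0.9, False: 0.2}

def viterbi(evidence, prior=0.5):
    states = (True, False)
    # m[x] = probability of the most likely path ending in state x
    m = {x: (O[x] if evidence[0] else 1 - O[x]) * prior for x in states}
    pointers = []                       # back-pointers for the backward pass
    for e in evidence[1:]:
        back, m_new = {}, {}
        for x1 in states:
            best = max(states, key=lambda x0: T[x0][x1] * m[x0])
            emission = O[x1] if e else 1 - O[x1]
            m_new[x1] = emission * T[best][x1] * m[best]   # max replaces sum
            back[x1] = best
        pointers.append(back)
        m = m_new
    # follow back-pointers from the most probable final state
    path = [max(states, key=lambda x: m[x])]
    for back in reversed(pointers):
        path.append(back[path[-1]])
    return list(reversed(path))

path = viterbi([True, True, False, True, True])
print(path)   # [True, True, False, True, True]: rain every day except day 3
```

Note that the most likely sequence here agrees with the evidence, but in general it need not equal the sequence of individually most likely states, which is the slide's opening point.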

  14. Viterbi Example
      [figure: Viterbi trellis with back-pointers]

  15. Hidden Markov Models
      ● X_t is a single, discrete variable (usually E_t is too)
        Domain of X_t is {1, ..., S}
      ● Transition matrix T_ij = P(X_t = j | X_{t-1} = i), e.g.,
        T = ( 0.7  0.3 )
            ( 0.3  0.7 )
      ● Sensor matrix O_t for each time step, diagonal elements P(e_t | X_t = i)
        e.g., with U_1 = true:
        O_1 = ( 0.9  0.0 )
              ( 0.0  0.2 )
      ● Forward and backward messages as column vectors:
        f_{1:t+1} = α O_{t+1} T^T f_{1:t}
        b_{k+1:t} = T O_{k+1} b_{k+2:t}
      ● Forward-backward algorithm needs time O(S^2 t) and space O(S t)
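The matrix form of the forward update can be checked directly. This sketch uses plain nested lists (no linear-algebra library) and the transition and sensor values of the assumed umbrella example, reproducing one filtering step as f_{1:t+1} = α O_{t+1} T^T f_{1:t}:

```python
# Matrix-form forward update, written with plain lists.
T = [[0.7, 0.3],
     [0.3, 0.7]]             # T[i][j] = P(X_t = j | X_{t-1} = i)
O_u = [[0.9, 0.0],
       [0.0, 0.2]]           # diagonal sensor matrix for umbrella = true

def matvec(M, v):
    """Matrix-vector product M v."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def transpose(M):
    return [list(row) for row in zip(*M)]

f = [0.5, 0.5]                                # prior as a column vector
f = matvec(O_u, matvec(transpose(T), f))      # unnormalized O T^T f
z = sum(f)                                    # 1/α
f = [p / z for p in f]
print([round(p, 3) for p in f])               # [0.818, 0.182] after u_1
```

Each update is one S×S matrix-vector product, which is where the O(S^2 t) total time on the slide comes from.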

  16. Dynamic Bayesian Networks

  17. Dynamic Bayesian Networks
      ● X_t, E_t contain arbitrarily many variables in a sequentialized Bayes net

  18. DBNs vs. HMMs
      ● Every HMM is a single-variable DBN; every discrete DBN is an HMM
      ● Sparse dependencies ⇒ exponentially fewer parameters;
        e.g., with 20 boolean state variables, three parents each:
        the DBN has 20 × 2^3 = 160 parameters, the HMM has 2^20 × 2^20 ≈ 10^12
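The parameter counts on the slide are easy to verify: each boolean DBN variable needs one conditional probability per configuration of its parents, while the equivalent flat HMM needs a full transition matrix over the 2^20-value joint state.

```python
# Parameter-count comparison from the slide: 20 boolean state variables
# with 3 boolean parents each, versus a flat HMM over the joint state.
n_vars, n_parents = 20, 3
dbn = n_vars * 2 ** n_parents          # one CPT entry per parent configuration
hmm = (2 ** n_vars) * (2 ** n_vars)    # full joint transition matrix
print(dbn, hmm)                        # 160 vs. about 1.1 × 10^12
```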

  19. Speech Recognition

  20. Speech as Probabilistic Inference
      "It's not easy to wreck a nice beach"
      ● Speech signals are noisy, variable, ambiguous
      ● What is the most likely word sequence, given the speech signal?
        I.e., choose Words to maximize P(Words | signal)
      ● Use Bayes' rule: P(Words | signal) = α P(signal | Words) P(Words)
        i.e., decomposes into acoustic model + language model
      ● Words are the hidden state sequence, signal is the observation sequence
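The noisy-channel decomposition can be illustrated with two competing transcriptions of the slide's pun. The candidate sequences and all probabilities below are made-up illustrative numbers, not output of any real recognizer: the acoustically slightly better hypothesis loses because the language model P(Words) strongly prefers the other.

```python
# Noisy-channel sketch: choose Words maximizing P(signal|Words) P(Words).
# All numbers are invented for illustration.
candidates = {
    "recognize speech":   {"acoustic": 0.40, "language": 0.010},
    "wreck a nice beach": {"acoustic": 0.45, "language": 0.001},
}
best = max(candidates,
           key=lambda w: candidates[w]["acoustic"] * candidates[w]["language"])
print(best)   # "recognize speech": the language model outweighs the acoustics
```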

  21. Phones
      ● All human speech is composed from 40-50 phones, determined by the
        configuration of articulators (lips, teeth, tongue, vocal cords, air flow)
      ● Phones form an intermediate level of hidden states between words and signal
        ⇒ acoustic model = pronunciation model + phone model
      ● ARPAbet designed for American English:
        [iy]  beat       [b]   bet        [p]   pet
        [ih]  bit        [ch]  Chet       [r]   rat
        [ey]  bait       [d]   debt       [s]   set
        [ao]  bought     [hh]  hat        [th]  thick
        [ow]  boat       [hv]  high       [dh]  that
        [er]  Bert       [l]   let        [w]   wet
        [ix]  roses      [ng]  sing       [en]  button
        ⋮
        e.g., "ceiling" is [s iy l ih ng] / [s iy l ix ng] / [s iy l en]

  22. Speech Sounds
      ● Raw signal is the microphone displacement as a function of time;
        processed into overlapping 30ms frames, each described by features
      ● Frame features are typically formants, i.e., peaks in the power spectrum

  23. Speech Spectrogram
      [figure: spectrogram of a speech sample]
