Markov Decision Processes
Philipp Koehn 7 April 2020
Philipp Koehn Artificial Intelligence: Markov Decision Processes 7 April 2020
Outline
Hidden Markov models
Inference: filtering, smoothing, best sequence
Dynamic Bayesian networks
e.g., BloodSugarₜ, StomachContentsₜ, etc.
e.g., MeasuredBloodSugarₜ, PulseRateₜ, FoodEatenₜ
Second-order Markov process: P(Xt∣X0∶t−1) ≃ P(Xt∣Xt−2,Xt−1)
sensor model P(Et∣Xt) fixed for all t
Filtering: the belief state, input to the decision process of a rational agent
Smoothing: a better estimate of past states, essential for learning
Most likely explanation: e.g., speech recognition, decoding with a noisy channel
P(Xt+1 ∣ e1∶t+1) = P(Xt+1 ∣ e1∶t, et+1)
  = α P(et+1 ∣ Xt+1, e1∶t) P(Xt+1 ∣ e1∶t)                        (Bayes rule)
  ≃ α P(et+1 ∣ Xt+1) P(Xt+1 ∣ e1∶t)                              (sensor Markov assumption)
  = α P(et+1 ∣ Xt+1) ∑xt P(Xt+1 ∣ xt, e1∶t) P(xt ∣ e1∶t)         (multiplying out)
  ≃ α P(et+1 ∣ Xt+1) ∑xt P(Xt+1 ∣ xt) P(xt ∣ e1∶t)               (first-order Markov assumption)
P(Xt+1 ∣ e1∶t+1) ≃ α P(et+1 ∣ Xt+1) ∑xt P(Xt+1 ∣ xt) P(xt ∣ e1∶t)
with P(et+1 ∣ Xt+1) the emission probability, P(Xt+1 ∣ xt) the transition probability, and P(xt ∣ e1∶t) the recursive call
Time and space constant (independent of t)
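A minimal sketch of this recursive update in Python, using the textbook umbrella world; the two states and the transition/sensor numbers are assumptions for illustration, not given on this slide:

```python
# Filtering (forward) update: new belief = alpha * emission * predicted belief.
# Umbrella-world numbers (assumed): P(rain_t+1 | rain_t) = 0.7,
# P(umbrella | rain) = 0.9, P(umbrella | sun) = 0.2.

STATES = ('rain', 'sun')
T = {('rain', 'rain'): 0.7, ('rain', 'sun'): 0.3,   # P(X_t+1 | x_t)
     ('sun', 'rain'): 0.3, ('sun', 'sun'): 0.7}
E = {('rain', True): 0.9, ('rain', False): 0.1,     # P(e_t | x_t)
     ('sun', True): 0.2, ('sun', False): 0.8}

def forward_step(f, evidence):
    """One step: emission * sum over x_t of (transition * recursive call)."""
    new_f = {}
    for x1 in STATES:
        predicted = sum(T[(x0, x1)] * f[x0] for x0 in STATES)   # transition
        new_f[x1] = E[(x1, evidence)] * predicted               # emission
    alpha = 1.0 / sum(new_f.values())                           # normalization
    return {x: alpha * p for x, p in new_f.items()}

belief = {'rain': 0.5, 'sun': 0.5}        # uniform prior
for umbrella in (True, True):             # two days of umbrella sightings
    belief = forward_step(belief, umbrella)
```

Each step only touches the previous message, which is why time and space per step are constant in t.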
(Figure: filtering example unrolled over time, with emission and transition steps labeled)
⇒ what is the state probability P(Xk ∣ e1∶t) for k < t, i.e., including future evidence?
P(Xk ∣ e1∶t) = P(Xk ∣ e1∶k, ek+1∶t)
  = α P(Xk ∣ e1∶k) P(ek+1∶t ∣ Xk, e1∶k)
  ≃ α P(Xk ∣ e1∶k) P(ek+1∶t ∣ Xk)
  = α f1∶k bk+1∶t

The backward message is computed recursively:

P(ek+1∶t ∣ Xk) = ∑xk+1 P(ek+1∶t ∣ Xk, xk+1) P(xk+1 ∣ Xk)
  ≃ ∑xk+1 P(ek+1∶t ∣ xk+1) P(xk+1 ∣ Xk)
  = ∑xk+1 P(ek+1 ∣ xk+1) P(ek+2∶t ∣ xk+1) P(xk+1 ∣ Xk)
Forward–backward algorithm: cache forward messages along the way
Time linear in t (polytree inference), space O(t∣f∣)
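The forward–backward pass can be sketched as follows; the two-state umbrella-world numbers are assumptions for illustration:

```python
# Forward-backward smoothing: cache the forward messages f_1:k on the way
# up, then combine each with a backward message b_k+1:t swept from the end.
# Umbrella-world numbers are assumptions for illustration.

STATES = ('rain', 'sun')
T = {('rain', 'rain'): 0.7, ('rain', 'sun'): 0.3,   # P(X_t+1 | x_t)
     ('sun', 'rain'): 0.3, ('sun', 'sun'): 0.7}
E = {('rain', True): 0.9, ('rain', False): 0.1,     # P(e_t | x_t)
     ('sun', True): 0.2, ('sun', False): 0.8}

def forward_step(f, e):
    new_f = {x1: E[(x1, e)] * sum(T[(x0, x1)] * f[x0] for x0 in STATES)
             for x1 in STATES}
    alpha = 1.0 / sum(new_f.values())
    return {x: alpha * p for x, p in new_f.items()}

def backward_step(b, e):
    """b_k+1:t(x_k) = sum over x_k+1 of P(e_k+1|x_k+1) b_k+2:t(x_k+1) P(x_k+1|x_k)."""
    return {x0: sum(E[(x1, e)] * b[x1] * T[(x0, x1)] for x1 in STATES)
            for x0 in STATES}

def smooth(prior, evidence):
    fs, f = [], dict(prior)
    for e in evidence:                    # forward pass, caching messages
        f = forward_step(f, e)
        fs.append(f)
    b = {x: 1.0 for x in STATES}          # b_t+1:t is all ones
    smoothed = [None] * len(evidence)
    for k in range(len(evidence) - 1, -1, -1):
        unnorm = {x: fs[k][x] * b[x] for x in STATES}   # f_1:k * b_k+1:t
        alpha = 1.0 / sum(unnorm.values())
        smoothed[k] = {x: alpha * p for x, p in unnorm.items()}
        b = backward_step(b, evidence[k])
    return smoothed

smoothed = smooth({'rain': 0.5, 'sun': 0.5}, [True, True])
```

Caching the forward messages is what makes the whole smoothing pass linear in t.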
The most likely path to each state Xt+1 is the most likely path to some xt plus one more step:

max x1...xt P(x1, ..., xt, Xt+1 ∣ e1∶t+1)
  = P(et+1 ∣ Xt+1) max xt ( P(Xt+1 ∣ xt) max x1...xt−1 P(x1, ..., xt−1, xt ∣ e1∶t) )

Identify the message m1∶t = max x1...xt−1 P(x1, ..., xt−1, Xt ∣ e1∶t),
i.e., m1∶t(i) gives the probability of the most likely path to state i.

Update rule: m1∶t+1 = P(et+1 ∣ Xt+1) max xt ( P(Xt+1 ∣ xt) m1∶t )

Also requires back-pointers for the backward pass to retrieve the best sequence:
bXt+1,t+1 = argmax xt ( P(Xt+1 ∣ xt) m1∶t )
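This update rule with back-pointers is the Viterbi algorithm; a sketch, again with assumed umbrella-world numbers:

```python
# Viterbi sketch: m_1:t+1 = P(e_t+1|X_t+1) * max over x_t of P(X_t+1|x_t) m_1:t,
# with back-pointers to recover the most likely state sequence.
# Umbrella-world numbers are assumptions for illustration.

STATES = ('rain', 'sun')
T = {('rain', 'rain'): 0.7, ('rain', 'sun'): 0.3,
     ('sun', 'rain'): 0.3, ('sun', 'sun'): 0.7}
E = {('rain', True): 0.9, ('rain', False): 0.1,
     ('sun', True): 0.2, ('sun', False): 0.8}

def viterbi(prior, evidence):
    # m holds the probability of the most likely path ending in each state
    m = {x: prior[x] * E[(x, evidence[0])] for x in STATES}
    backptrs = []
    for e in evidence[1:]:
        new_m, bp = {}, {}
        for x1 in STATES:
            best = max(STATES, key=lambda x0: T[(x0, x1)] * m[x0])
            bp[x1] = best                                  # back-pointer
            new_m[x1] = E[(x1, e)] * T[(best, x1)] * m[best]
        backptrs.append(bp)
        m = new_m
    # backward pass: follow back-pointers from the best final state
    state = max(STATES, key=lambda x: m[x])
    path = [state]
    for bp in reversed(backptrs):
        path.append(bp[path[-1]])
    path.reverse()
    return path

path = viterbi({'rain': 0.5, 'sun': 0.5}, [True, True, False, True, True])
```

Note the max replaces the sum of the filtering update; everything else is the same recursion.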
Domain of Xt is {1,...,S}
e.g., transition matrix T = ( 0.7 0.3 ; 0.3 0.7 )
e.g., with U1 = true, O1 = ( 0.9 0.1 ; 0.8 0.2 )
f1∶t+1 = α Ot+1 T⊺ f1∶t
bk+1∶t = T Ok+1 bk+2∶t
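The matrix updates can be sketched with plain Python lists; here Ot is taken to be diagonal with the sensor probabilities P(et ∣ Xt = i) on the diagonal, and the umbrella-world numbers are assumptions:

```python
# Matrix form: f_1:t+1 = alpha * O_t+1 * T^T * f_1:t,  b_k+1:t = T O_k+1 b_k+2:t
# T[i][j] = P(X_t+1 = j | X_t = i); O_t is diagonal with P(e_t | X_t = i).
# Umbrella-world numbers are assumptions for illustration.

T = [[0.7, 0.3],
     [0.3, 0.7]]
O_umbrella = [[0.9, 0.0],     # state 1 = rain, state 2 = sun
              [0.0, 0.2]]

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def transpose(M):
    return [list(row) for row in zip(*M)]

def forward(f, O):
    unnorm = matvec(O, matvec(transpose(T), f))
    alpha = 1.0 / sum(unnorm)
    return [alpha * x for x in unnorm]

def backward(b, O):
    # backward message needs no normalization
    return matvec(T, matvec(O, b))

f = forward([0.5, 0.5], O_umbrella)   # one filtering step from a uniform prior
```

With the updates in matrix form, one step of filtering or smoothing is just two matrix-vector products.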
e.g., 20 Boolean state variables, three parents each: the DBN has 20 × 2³ = 160 parameters, the HMM has 2²⁰ × 2²⁰ ≈ 10¹²
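The count works out as follows (a quick check, assuming all variables are Boolean):

```python
# DBN: each of n Boolean variables has a CPT with 2^k rows (k Boolean parents).
# Equivalent HMM: one transition matrix over the 2^n joint states.
n, k = 20, 3
dbn_params = n * 2**k         # 20 * 8 = 160
hmm_params = 2**n * 2**n      # 2^20 * 2^20 = 2^40, about 10^12
```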
It’s not easy to wreck a nice beach
I.e., choose Words to maximize P(Words∣signal)
P(Words ∣ signal) = α P(signal ∣ Words) P(Words)
i.e., decomposes into acoustic model + language model
configuration of articulators (lips, teeth, tongue, vocal cords, air flow)
⇒ acoustic model = pronunciation model + phone model
[iy] beat     [b] bet      [p] pet
[ih] bit      [ch] Chet    [r] rat
[ey] bait     [d] debt     [s] set
[ao] bought   [hh] hat     [th] thick
[ow] boat     [hv] high    [dh] that
[er] Bert     [l] let      [w] wet
[ix] roses    [ng] sing    [en] button
⋮             ⋮            ⋮
e.g., “ceiling” is [s iy l ih ng] / [s iy l ix ng] / [s iy l en]
processed into overlapping 30ms frames, each described by features
– an integer in [0...255] (using vector quantization); or
– the parameters of a mixture of Gaussians
E.g., [t] has silent Onset, explosive Mid, hissing End ⇒ P(features∣phone,phase)
– a phone sounds different depending on the phones to its left and right; e.g., [t] in “star” is written [t(s,aa)] (different from “tar”!)
– articulators move continuously and cannot switch instantaneously between positions; e.g., [t] in “eighth” has tongue against front teeth
P([towmeytow] ∣ “tomato”) = P([towmaatow] ∣ “tomato”) = 0.1
P([tahmeytow] ∣ “tomato”) = P([tahmaatow] ∣ “tomato”) = 0.4
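As a data structure, the pronunciation model is simply a distribution over phone sequences per word; a sketch using the variant probabilities above:

```python
# Pronunciation model: P(phone sequence | word).
# Variant probabilities are the ones given on the slide.
pronunciation = {
    'tomato': {
        ('t', 'ow', 'm', 'ey', 't', 'ow'): 0.1,
        ('t', 'ow', 'm', 'aa', 't', 'ow'): 0.1,
        ('t', 'ah', 'm', 'ey', 't', 'ow'): 0.4,
        ('t', 'ah', 'm', 'aa', 't', 'ow'): 0.4,
    },
}
total = sum(pronunciation['tomato'].values())   # each distribution sums to 1
```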
P(word ∣ e1∶t) = α P(e1∶t ∣ word) P(word)
P(e1∶t ∣ word) can be computed recursively: define ℓ1∶t = P(Xt, e1∶t) and use the recursive update ℓ1∶t+1 = FORWARD(ℓ1∶t, et+1); then P(e1∶t ∣ word) = ∑xt ℓ1∶t(xt)
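The unnormalized forward recursion for ℓ can be sketched as follows; the two-state left-to-right word HMM and all its numbers are made up for illustration:

```python
# P(e_1:t | word): run the forward update without normalization, then sum
# over the final state. The toy two-state word HMM below is an assumption.

STATES = ('s0', 's1')
T = {('s0', 's0'): 0.6, ('s0', 's1'): 0.4,   # P(X_t+1 | x_t)
     ('s1', 's0'): 0.0, ('s1', 's1'): 1.0}   # left-to-right model
E = {('s0', 'a'): 0.7, ('s0', 'b'): 0.3,     # P(e_t | x_t)
     ('s1', 'a'): 0.2, ('s1', 'b'): 0.8}

def forward_unnormalized(ell, e):
    """ell_1:t+1(x') = P(e | x') * sum over x of P(x' | x) ell_1:t(x); no alpha."""
    return {x1: E[(x1, e)] * sum(T[(x0, x1)] * ell[x0] for x0 in STATES)
            for x1 in STATES}

ell = {'s0': 1.0, 's1': 0.0}      # word HMM starts in its first state
for e in ['a', 'b', 'b']:
    ell = forward_unnormalized(ell, e)
likelihood = sum(ell.values())    # P(e_1:t | word), then weight by P(word)
```

Skipping the normalization constant is what keeps ℓ equal to the joint P(Xt, e1∶t) rather than the conditional.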
– adjacent words highly correlated
– sequence of most likely words ≠ most likely sequence of words
– segmentation: there are few gaps in speech
– cross-word coarticulation, e.g., “next thing”
– mismatch between speaker in training and test
– noise
– crosstalk
– bad microphone position
P(w1 ⋯ wn) = ∏ᵢ₌₁ⁿ P(wi ∣ w1 ⋯ wi−1)
Bigram approximation: P(wi ∣ w1 ⋯ wi−1) ≈ P(wi ∣ wi−1)
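A bigram model is just a lookup table of conditional probabilities, usually handled in log space; the table entries here are assumptions for illustration:

```python
# Bigram language model: P(w_1..w_n) ~= product over i of P(w_i | w_i-1),
# computed as a sum of logs. The tiny probability table is an assumption.
import math

bigram = {('<s>', 'recognize'): 0.01,
          ('recognize', 'speech'): 0.3,
          ('speech', '</s>'): 0.2}

def sentence_logprob(words):
    """Sum log P(w_i | w_i-1), padding with sentence-boundary markers."""
    padded = ['<s>'] + words + ['</s>']
    return sum(math.log(bigram[(w0, w1)]) for w0, w1 in zip(padded, padded[1:]))

lp = sentence_logprob(['recognize', 'speech'])
```

The negated log probabilities −log P(wi ∣ wi−1) then serve directly as step costs in a best-path search.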
the word we’re in + the phone in that word + the phone state in that phone
each word sequence is the sum over many state sequences
where “step cost” is −log P(wi∣wi−1)
– transition model P(Xt ∣ Xt−1)
– sensor model P(Et ∣ Xt)
all done recursively with constant cost per time step
hidden Markov models: a key tool for speech recognition