Markov Models and Hidden Markov Models Robert Platt Northeastern - PowerPoint PPT Presentation

Markov Models and Hidden Markov Models
Robert Platt, Northeastern University
Some images and slides are used from: 1. CS188, UC Berkeley 2. Russell and Norvig, AIMA

Markov Models
We have already seen that an MDP provides a useful framework for modeling stochastic control problems.
Markov models: model any kind of temporally dynamic system.

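Probability recap
– Conditional probability: P(x | y) = P(x, y) / P(y)
– Product rule: P(x, y) = P(x | y) P(y)
– Chain rule: P(x_1, ..., x_n) = P(x_1) P(x_2 | x_1) P(x_3 | x_1, x_2) ... = \prod_{i=1}^{n} P(x_i | x_1, ..., x_{i-1})
– X, Y independent if and only if: P(x, y) = P(x) P(y) for all x, y
– X and Y are conditionally independent given Z if and only if: P(x, y | z) = P(x | z) P(y | z) for all x, y, z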

Probability again: Independence
Two random variables, x and y, are independent when: P(x, y) = P(x) P(y)
The outcomes of two different coin flips are usually independent of each other.

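Probability again: Independence
If: P(x, y) = P(x) P(y)
Then: P(x | y) = P(x)
Why? For P(y) > 0, P(x | y) = P(x, y) / P(y) = P(x) P(y) / P(y) = P(x)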

Example: Independence
Joint distribution P(snow, winter):

          winter   !winter
snow      0.1      0.1
!snow     0.3      0.5

Are snow and winter independent variables?
P(snow) = 0.2
P(winter) = 0.4
What would the distribution look like if snow, winter were independent?
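The independence question above can be checked numerically. The following is a minimal sketch, assuming the joint table from the slide; the function names (`marginals`, `independent_joint`, `is_independent`) are illustrative and not part of the original deck. It computes the marginals, builds the joint that independence would require, and compares the two.

```python
# Joint distribution from the slide, keyed by (snow, winter) truth values.
joint = {
    (True, True): 0.1,    # snow, winter
    (True, False): 0.1,   # snow, !winter
    (False, True): 0.3,   # !snow, winter
    (False, False): 0.5,  # !snow, !winter
}


def marginals(joint):
    """Return (P(snow), P(winter)) by summing out the other variable."""
    p_snow = sum(p for (snow, _), p in joint.items() if snow)
    p_winter = sum(p for (_, winter), p in joint.items() if winter)
    return p_snow, p_winter


def independent_joint(p_snow, p_winter):
    """Joint table that would hold if snow and winter were independent."""
    return {
        (s, w): (p_snow if s else 1 - p_snow) * (p_winter if w else 1 - p_winter)
        for s in (True, False)
        for w in (True, False)
    }


def is_independent(joint, tol=1e-9):
    """True when every joint entry equals the product of its marginals."""
    product = independent_joint(*marginals(joint))
    return all(abs(joint[k] - product[k]) <= tol for k in joint)


p_snow, p_winter = marginals(joint)
product = independent_joint(p_snow, p_winter)
print("P(snow) =", p_snow, " P(winter) =", p_winter)
print("independent?", is_independent(joint))
for key in joint:
    print(key, "actual:", joint[key], "if independent:", round(product[key], 4))
```

Under independence the snow/winter entry would be 0.2 × 0.4 = 0.08 rather than 0.1, so the variables in the table are not independent.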

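Conditional independence
Independence: P(x, y) = P(x) P(y)
Conditional independence: P(x, y | z) = P(x | z) P(y | z)
Equivalent statements of conditional independence: P(x | y, z) = P(x | z) and P(y | x, z) = P(y | z)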

Conditional independence: example
[Diagram: cavity is the common parent of toothache and catch]
P(toothache, catch | cavity) = P(toothache | cavity) P(catch | cavity)
Toothache and catch are conditionally independent given cavity. This is the “common cause” scenario covered in Bayes Nets...

Examples of conditional independence
What are the conditional independence relationships in the following?
– traffic, raining, late for work
– snow, cloudy, crash
– fire, smoke, alarm

Markov Processes
[Diagram: a chain of states (state at time=1, state at time=2, ...) linked by transitions]
A Markov model can be used to model any sequential time process:
– the weather
– traffic
– stock market
– news cycle
...

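Markov Processes
[Diagram: a chain of states (state at time=1, state at time=2, ...) linked by transitions]
Since this is a Markov process, we assume transitions are Markov:
Process model: P(X_t | X_{t-1})
Markov assumption: P(X_t | X_1, ..., X_{t-1}) = P(X_t | X_{t-1})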

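Markov Processes
How do we calculate the joint distribution, e.g. P(X_1, X_2, X_3)?
By the chain rule: P(X_1, X_2, X_3) = P(X_1) P(X_2 | X_1) P(X_3 | X_1, X_2)
Can we simplify this expression? By the Markov assumption: P(X_1, X_2, X_3) = P(X_1) P(X_2 | X_1) P(X_3 | X_2)
In general: P(X_1, ..., X_T) = P(X_1) \prod_{t=2}^{T} P(X_t | X_{t-1}), where each factor P(X_t | X_{t-1}) is the process model.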

Markov Processes: example
Two states: cloudy, sunny

X_{t-1}   X_t      P(X_t | X_{t-1})
sun       sun      0.8
sun       cloudy   0.2
cloudy    sun      0.3
cloudy    cloudy   0.7

[State diagram: sun → sun 0.8, sun → cloudy 0.2, cloudy → sun 0.3, cloudy → cloudy 0.7]
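With the transition table in hand, the probability of any particular weather sequence is the probability of the first state multiplied by one transition probability per step. The sketch below illustrates this under one assumption that the slide does not state: a uniform distribution over the first day's weather. The names `TRANSITION` and `sequence_probability` are illustrative.

```python
# Transition probabilities P(X_t | X_{t-1}) from the table above.
TRANSITION = {
    "sun": {"sun": 0.8, "cloudy": 0.2},
    "cloudy": {"sun": 0.3, "cloudy": 0.7},
}


def sequence_probability(states, initial, transition=TRANSITION):
    """P(x_1, ..., x_T) = P(x_1) * prod_t P(x_t | x_{t-1})."""
    if not states:
        raise ValueError("states must contain at least one state")
    prob = initial[states[0]]
    # Multiply in one transition probability per consecutive pair.
    for prev, cur in zip(states, states[1:]):
        prob *= transition[prev][cur]
    return prob


# Assumption: uniform distribution over the first day's weather.
initial = {"sun": 0.5, "cloudy": 0.5}
p = sequence_probability(["sun", "sun", "cloudy", "cloudy"], initial)
print(p)  # 0.5 * 0.8 * 0.2 * 0.7
```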

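Simulating dynamics forward
Joint distribution: P(X_1, ..., X_T) = P(X_1) \prod_{t=2}^{T} P(X_t | X_{t-1})
But, suppose we want to predict the state at time T, given a prior distribution at time 1?
P(X_2) = \sum_{x_1} P(X_2 | x_1) P(x_1)
P(X_3) = \sum_{x_2} P(X_3 | x_2) P(x_2)
...
P(X_T) = \sum_{x_{T-1}} P(X_T | x_{T-1}) P(x_{T-1})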

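Simulating dynamics forward
Suppose it is sunny on Mon... (values computed from the sun/cloudy transition table)
– Prob sunny Tues: 0.8
– Prob sunny Weds: 0.8(0.8) + 0.2(0.3) = 0.7
– Prob sunny Thurs: 0.7(0.8) + 0.3(0.3) = 0.65
– Prob sunny Fri: 0.65(0.8) + 0.35(0.3) = 0.625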

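Simulating dynamics forward
Suppose it is cloudy on Mon... (values computed from the sun/cloudy transition table)
– Prob sunny Tues: 0.3
– Prob sunny Weds: 0.3(0.8) + 0.7(0.3) = 0.45
– Prob sunny Thurs: 0.45(0.8) + 0.55(0.3) = 0.525
– Prob sunny Fri: 0.525(0.8) + 0.475(0.3) = 0.5625
The predictions converge to the same distribution regardless of starting point. This is called the “stationary distribution”.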
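Forward simulation can be sketched in a few lines. The code below is an illustrative sketch, not the author's implementation: it assumes the sun/cloudy transition table from the weather example, and the names `step` and `simulate` are made up for this example. It propagates a distribution one day at a time from each starting point.

```python
# Transition probabilities P(X_t | X_{t-1}) from the weather example.
TRANSITION = {
    "sun": {"sun": 0.8, "cloudy": 0.2},
    "cloudy": {"sun": 0.3, "cloudy": 0.7},
}


def step(belief, transition=TRANSITION):
    """One step forward: P(X_{t+1} = s) = sum_x P(s | x) P(X_t = x)."""
    return {
        nxt: sum(belief[prev] * transition[prev][nxt] for prev in belief)
        for nxt in transition
    }


def simulate(belief, n_steps, transition=TRANSITION):
    """Return the list of distributions for days 0..n_steps."""
    history = [belief]
    for _ in range(n_steps):
        belief = step(belief, transition)
        history.append(belief)
    return history


from_sun = simulate({"sun": 1.0, "cloudy": 0.0}, 50)
from_cloudy = simulate({"sun": 0.0, "cloudy": 1.0}, 50)

days = ["mon", "tues", "weds", "thurs", "fri"]
for day, a, b in zip(days, from_sun, from_cloudy):
    print(day, "sunny start:", round(a["sun"], 4), "cloudy start:", round(b["sun"], 4))
print("after 50 days:", round(from_sun[-1]["sun"], 6), round(from_cloudy[-1]["sun"], 6))
```

Starting from sun, the probability of sun on Friday is 0.625; starting from cloudy it is 0.5625. After many steps both runs settle at P(sun) = 0.6.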

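An aside: the stationary distribution
How might you calculate the stationary distribution?
Let: p_t be the vector of state probabilities at time t, and A the matrix with entries A_{ij} = P(X_{t+1} = i | X_t = j)
Then: p_{t+1} = A p_t
Stationary distribution is the value for p such that: p = A p
How do we calculate a p that satisfies this equation? p is an eigenvector of A with eigenvalue 1, scaled so that its entries sum to 1.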
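For the two-state weather chain, the stationarity condition can be solved by hand: writing p for P(sun), p = 0.8 p + 0.3 (1 - p), which gives p = 0.6. The sketch below shows two ways to compute this, assuming the transition table from the weather example; the function names are illustrative. Power iteration is used here as a dependency-free stand-in for an eigenvector solver.

```python
TRANSITION = {
    "sun": {"sun": 0.8, "cloudy": 0.2},
    "cloudy": {"sun": 0.3, "cloudy": 0.7},
}


def stationary_two_state(p_stay_a, p_b_to_a):
    """Closed form for two states a, b.

    Solves pi_a = pi_a * p_stay_a + (1 - pi_a) * p_b_to_a for pi_a.
    """
    p_a_to_b = 1.0 - p_stay_a
    return p_b_to_a / (p_a_to_b + p_b_to_a)


def stationary_power_iteration(transition, tol=1e-12, max_iter=10_000):
    """Apply the transition model repeatedly until the distribution stops changing."""
    states = list(transition)
    belief = {s: 1.0 / len(states) for s in states}
    for _ in range(max_iter):
        nxt = {
            j: sum(belief[i] * transition[i][j] for i in states) for j in states
        }
        if max(abs(nxt[s] - belief[s]) for s in states) < tol:
            return nxt
        belief = nxt
    raise RuntimeError("power iteration did not converge")


pi_sun = stationary_two_state(p_stay_a=0.8, p_b_to_a=0.3)
pi = stationary_power_iteration(TRANSITION)
print(round(pi_sun, 6), {s: round(p, 6) for s, p in pi.items()})
```

Power iteration converges here because the chain can reach every state from every other state and is not periodic; for chains without those properties it may fail to converge or may depend on the starting distribution.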

Hidden Markov Models (HMMs)
Hidden Markov Models:
– extension of the Markov model
– state is assumed to be “hidden”

Hidden Markov Models (HMMs)
The state, X_t, is assumed to be unobserved.
However, you get to make one observation, E_t, on each timestep. The observation is called an “emission”.
Examples:
– speech to text
– tracking in computer vision
– robot localization

Hidden Markov Models (HMMs)
Sensor Markov Assumption: the current observation depends only on current state: P(E_t | X_1, ..., X_t, E_1, ..., E_{t-1}) = P(E_t | X_t)

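HMM example
You live underground... Every day, your boss comes in either wearing sunglasses or not.
Can you infer whether it's sunny out based on whether you see the glasses over a sequence of days?
– e.g. what's the prob it's sunny out today if you've seen your boss wear glasses three days in a row?
[Diagram: hidden states sun and cloudy (state is unobserved), with transitions sun → sun 0.8, sun → cloudy 0.2, cloudy → sun 0.3, cloudy → cloudy 0.7; observations glasses and no glasses (only observations are observed), with P(glasses | sun) = 0.7, P(no glasses | sun) = 0.3, P(glasses | cloudy) = 0.4, P(no glasses | cloudy) = 0.6]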

HMM Filtering
Given a prior distribution, P(X_1), and a series of observations, e_1, ..., e_t (written e_{1:t}), calculate the posterior distribution: P(X_t | e_{1:t})
These posterior distributions are called “beliefs”.
Two steps:
– Process update: from P(X_t | e_{1:t}) to P(X_{t+1} | e_{1:t})
– Observation update: from P(X_{t+1} | e_{1:t}) to P(X_{t+1} | e_{1:t+1})
The Kalman filter is perhaps the most famous instance of this idea.

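Process update
P(X_{t+1} | e_{1:t}) = \sum_{x_t} P(X_{t+1} | x_t) P(x_t | e_{1:t}), where e_{1:t} denotes the observations e_1, ..., e_t
This is just forward simulation of the Markov Model.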

Process update: example
– T = 1: Completely certain about ghost position at T=1
– T = 2: A little less certain on the next time step...
– T = 5: By now, we've almost completely lost track of the ghost...
If we only do the process update, then we typically lose information over time – when might this not be true?

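Observation update
P(X_{t+1} | e_{1:t+1}) = \eta P(e_{t+1} | X_{t+1}) P(X_{t+1} | e_{1:t}), where e_{1:t} denotes the observations e_1, ..., e_t
Where \eta is a normalization factor: \eta = 1 / \sum_{x_{t+1}} P(e_{t+1} | x_{t+1}) P(x_{t+1} | e_{1:t})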
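The observation update is a single application of Bayes' rule: weight each state by how likely it makes the observation, then renormalize. The following is a minimal sketch, assuming the glasses emission probabilities from the weather HMM example in this deck; the names `EMISSION` and `observation_update` are illustrative.

```python
# Emission probabilities P(observation | state) from the weather HMM example.
EMISSION = {
    "sun": {"glasses": 0.7, "no glasses": 0.3},
    "cloudy": {"glasses": 0.4, "no glasses": 0.6},
}


def observation_update(predicted, observation, emission=EMISSION):
    """Return the posterior belief after seeing one observation.

    predicted: belief over states before the observation is incorporated.
    """
    # Weight each state by the likelihood of the observation in that state.
    unnormalized = {s: emission[s][observation] * predicted[s] for s in predicted}
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    eta = 1.0 / total  # normalization factor
    return {s: eta * p for s, p in unnormalized.items()}


posterior = observation_update({"sun": 0.55, "cloudy": 0.45}, "glasses")
print({s: round(p, 2) for s, p in posterior.items()})
```

With a predicted belief of 0.55 sun and 0.45 cloudy, seeing glasses gives unnormalized weights 0.385 and 0.18, so the belief in sun rises to 0.385 / 0.565, roughly 0.68.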

Observation update
[Figures: belief before observation and belief after observation]
Observations enable the system to gain information
– a single observation may not determine system state exactly
– but, the more observations, the better

Robot localization example
[Figure: belief over robot position, with a probability scale from 0 to 1]

Weather HMM example
[Diagram: states sun and cloudy with transitions sun → sun 0.8, sun → cloudy 0.2, cloudy → sun 0.3, cloudy → cloudy 0.7; emissions P(glasses | sun) = 0.7, P(no glasses | sun) = 0.3, P(glasses | cloudy) = 0.4, P(no glasses | cloudy) = 0.6]

Process model:

X_{t-1}   X_t      P(X_t | X_{t-1})
sun       sun      0.8
sun       cloudy   0.2
cloudy    sun      0.3
cloudy    cloudy   0.7

Observation model (g_t: glasses seen at time t):

X_t      P(g_t | X_t)
sun      0.7
cloudy   0.4

Beliefs P(w_t), step by step. The numbers correspond to observing glasses at each of the two observation updates:

Step                              sun    cloudy
Prior                             0.5    0.5
After process update              0.55   0.45
After observing glasses           0.68   0.32
After process update              0.64   0.36
After observing glasses           0.76   0.24
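The sequence of beliefs in this example can be reproduced by alternating the two filtering steps. The code below is a sketch under the assumptions visible in the slides: a 0.5/0.5 prior, a process update before each observation, and glasses observed twice. The function names (`process_update`, `observation_update`, `forward_filter`) are illustrative rather than taken from the deck.

```python
TRANSITION = {
    "sun": {"sun": 0.8, "cloudy": 0.2},
    "cloudy": {"sun": 0.3, "cloudy": 0.7},
}
EMISSION = {
    "sun": {"glasses": 0.7, "no glasses": 0.3},
    "cloudy": {"glasses": 0.4, "no glasses": 0.6},
}


def process_update(belief):
    """Predict the next state: sum over the previous state."""
    return {
        nxt: sum(belief[prev] * TRANSITION[prev][nxt] for prev in belief)
        for nxt in TRANSITION
    }


def observation_update(belief, observation):
    """Reweight by the observation likelihood and renormalize."""
    weighted = {s: EMISSION[s][observation] * belief[s] for s in belief}
    total = sum(weighted.values())
    return {s: p / total for s, p in weighted.items()}


def forward_filter(prior, observations):
    """Return (predicted, posterior) belief pairs, one pair per observation."""
    belief = dict(prior)
    history = []
    for obs in observations:
        predicted = process_update(belief)
        belief = observation_update(predicted, obs)
        history.append((predicted, belief))
    return history


history = forward_filter({"sun": 0.5, "cloudy": 0.5}, ["glasses", "glasses"])
for predicted, posterior in history:
    print("process update:    ", {s: round(p, 2) for s, p in predicted.items()})
    print("observation update:", {s: round(p, 2) for s, p in posterior.items()})
```

Passing a longer observation list, such as three glasses observations, extends the same loop to the question posed in the HMM example.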
