SLIDE 1

Temporal probability models

Chapter 15, Sections 1–5

Chapter 15, Sections 1–5 1

SLIDE 2

Outline

♦ Time and uncertainty
♦ Inference: filtering, prediction, smoothing
♦ Hidden Markov models
♦ Dynamic Bayesian networks

SLIDE 3

Time and uncertainty

The world changes; we need to track and predict it
Diabetes management vs vehicle diagnosis
Basic idea: copy state and evidence variables for each time step
Xt = set of unobservable state variables at time t
    e.g., BloodSugart, StomachContentst, etc.
Et = set of observable evidence variables at time t
    e.g., MeasuredBloodSugart, PulseRatet, FoodEatent
This assumes discrete time; step size depends on problem
Notation: Xa:b = Xa, Xa+1, . . . , Xb−1, Xb

SLIDE 4

Markov processes (Markov chains)

Construct a Bayes net from these variables: parents? CPTs?

SLIDE 5

Markov processes (Markov chains)

Construct a Bayes net from these variables: parents? CPTs?
Markov assumption: Xt depends on bounded subset of X0:t−1
First-order Markov process: P(Xt|X0:t−1) = P(Xt|Xt−1)
Second-order Markov process: P(Xt|X0:t−1) = P(Xt|Xt−2, Xt−1)

[Figure: first-order chain Xt−2 → Xt−1 → Xt → Xt+1 → Xt+2; second-order chain over the same nodes with additional arcs skipping one step]

Stationary process: transition model P(Xt|Xt−1) fixed for all t

SLIDE 6

Hidden Markov Model (HMM)

Sensor Markov assumption: P(Et|X0:t, E1:t−1) = P(Et|Xt)
Stationary process: transition model P(Xt|Xt−1) and sensor model P(Et|Xt) fixed for all t
An HMM is a special type of Bayes net in which Xt is a single discrete random variable, with joint probability distribution P(X0:t, E1:t) = ?

SLIDE 7

Hidden Markov Model (HMM)

Sensor Markov assumption: P(Et|X0:t, E1:t−1) = P(Et|Xt)
Stationary process: transition model P(Xt|Xt−1) and sensor model P(Et|Xt) fixed for all t
An HMM is a special type of Bayes net in which Xt is a single discrete random variable, with joint probability distribution

P(X0:t, E1:t) = P(X0) Π_{i=1}^{t} P(Xi|Xi−1) P(Ei|Xi)
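For concreteness, the factored joint can be evaluated by a direct product. A minimal Python sketch, using the umbrella CPTs from the example later in the deck (the names `prior`, `transition`, `sensor`, and `joint` are illustrative, not from the slides):

```python
# P(x0:t, e1:t) = P(x0) * prod_{i=1..t} P(x_i | x_{i-1}) P(e_i | x_i)
# Umbrella model: state = Rain (True/False), evidence = Umbrella (True/False).
prior = {True: 0.5, False: 0.5}                  # P(Rain0)
transition = {True: {True: 0.7, False: 0.3},     # P(Rain_i | Rain_{i-1})
              False: {True: 0.3, False: 0.7}}
sensor = {True: {True: 0.9, False: 0.1},         # P(Umbrella_i | Rain_i)
          False: {True: 0.2, False: 0.8}}

def joint(rains, umbrellas):
    """Probability of one complete assignment x0..xt, e1..et."""
    p = prior[rains[0]]
    for i in range(1, len(rains)):
        p *= transition[rains[i - 1]][rains[i]] * sensor[rains[i]][umbrellas[i - 1]]
    return p

print(joint([True, True, True], [True, True]))  # 0.5 * 0.7*0.9 * 0.7*0.9 = 0.19845
```

Note the product has one transition and one sensor factor per time step, which is exactly why the probability shrinks geometrically with t.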

SLIDE 8

Example

[Figure: umbrella network Raint−1 → Raint → Raint+1 with sensor nodes Umbrellat−1, Umbrellat, Umbrellat+1]

Rt−1   P(Rt)
  t     0.7
  f     0.3

Rt   P(Ut)
  t    0.9
  f    0.2
First-order Markov assumption not exactly true in real world! Possible fixes:

  1. Increase order of Markov process
  2. Augment state, e.g., add Tempt, Pressuret

Example: robot motion. Augment position and velocity with Batteryt

SLIDE 9

Inference tasks

Filtering: P(Xt|e1:t)
    belief state—input to the decision process of a rational agent
Prediction: P(Xt+k|e1:t) for k > 0
    evaluation of possible action sequences; like filtering without the evidence
Smoothing: P(Xk|e1:t) for 0 ≤ k < t
    better estimate of past states, essential for learning
Most likely explanation: arg max_{x1:t} P(x1:t|e1:t)
    speech recognition, decoding with a noisy channel

SLIDE 10

Filtering

Aim: devise a recursive state estimation algorithm: P(Xt+1|e1:t+1) = f(et+1, P(Xt|e1:t))

SLIDE 11

Filtering

Aim: devise a recursive state estimation algorithm: P(Xt+1|e1:t+1) = f(et+1, P(Xt|e1:t))

P(Xt+1|e1:t+1) = P(Xt+1|e1:t, et+1)
    = αP(et+1|Xt+1, e1:t) P(Xt+1|e1:t)
    = αP(et+1|Xt+1) P(Xt+1|e1:t)

SLIDE 12

Filtering

Aim: devise a recursive state estimation algorithm: P(Xt+1|e1:t+1) = f(et+1, P(Xt|e1:t))

P(Xt+1|e1:t+1) = P(Xt+1|e1:t, et+1)
    = αP(et+1|Xt+1, e1:t) P(Xt+1|e1:t)
    = αP(et+1|Xt+1) P(Xt+1|e1:t)

I.e., prediction + estimation. Prediction by summing out Xt:

P(Xt+1|e1:t+1) = αP(et+1|Xt+1) Σ_{xt} P(Xt+1, xt|e1:t)
    = αP(et+1|Xt+1) Σ_{xt} P(Xt+1|xt, e1:t) P(xt|e1:t)
    = αP(et+1|Xt+1) Σ_{xt} P(Xt+1|xt) P(xt|e1:t)

SLIDE 13

Filtering

Aim: devise a recursive state estimation algorithm: P(Xt+1|e1:t+1) = f(et+1, P(Xt|e1:t))

P(Xt+1|e1:t+1) = P(Xt+1|e1:t, et+1)
    = αP(et+1|Xt+1, e1:t) P(Xt+1|e1:t)
    = αP(et+1|Xt+1) P(Xt+1|e1:t)

I.e., prediction + estimation. Prediction by summing out Xt:

P(Xt+1|e1:t+1) = αP(et+1|Xt+1) Σ_{xt} P(Xt+1, xt|e1:t)
    = αP(et+1|Xt+1) Σ_{xt} P(Xt+1|xt, e1:t) P(xt|e1:t)
    = αP(et+1|Xt+1) Σ_{xt} P(Xt+1|xt) P(xt|e1:t)

f1:t+1 = Forward(f1:t, et+1) where f1:t = P(Xt|e1:t)
Time and space constant (independent of t)
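The update above fits in a few lines of code. A minimal sketch, with `forward` and its arguments as illustrative names and the CPTs passed as nested dicts:

```python
def forward(f, evidence, transition, sensor):
    """One step of recursive filtering: f1:t -> f1:t+1 given evidence e_{t+1}."""
    states = list(f)
    # prediction: P(Xt+1 | e1:t) = sum_xt P(Xt+1 | xt) P(xt | e1:t)
    predicted = {s1: sum(transition[s0][s1] * f[s0] for s0 in states)
                 for s1 in states}
    # estimation: weight by sensor model P(et+1 | Xt+1), then normalize (the alpha)
    unnorm = {s: sensor[s][evidence] * predicted[s] for s in states}
    z = sum(unnorm.values())
    return {s: p / z for s, p in unnorm.items()}

# Umbrella model, one step from the uniform prior with u1 = true:
T = {True: {True: 0.7, False: 0.3}, False: {True: 0.3, False: 0.7}}
O = {True: {True: 0.9, False: 0.1}, False: {True: 0.2, False: 0.8}}
print(forward({True: 0.5, False: 0.5}, True, T, O))  # True: ~0.818
```

Each call touches only the previous message and one evidence value, which is the constant time-and-space property claimed on the slide.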

SLIDE 14

Filtering example

[Figure: unrolled network Rain0 → Rain1 → Rain2 with evidence Umbrella1, Umbrella2; filtered distributions ⟨true, false⟩: ⟨0.500, 0.500⟩ for Rain0, ⟨0.818, 0.182⟩ after u1, ⟨0.883, 0.117⟩ after u2; the one-step prediction for Rain2 is ⟨0.627, 0.373⟩]

P(Xt+1|e1:t+1) = αP(et+1|Xt+1) Σ_{xt} P(Xt+1|xt) P(xt|e1:t)

Rt−1   P(Rt)
  t     0.7
  f     0.3

Rt   P(Ut)
  t    0.9
  f    0.2
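The numbers in this example can be reproduced mechanically. A small sketch (variable names are illustrative):

```python
# Filtering on the umbrella model with evidence u1 = true, u2 = true.
T = {True: {True: 0.7, False: 0.3}, False: {True: 0.3, False: 0.7}}  # P(Rt|Rt-1)
O = {True: {True: 0.9, False: 0.1}, False: {True: 0.2, False: 0.8}}  # P(Ut|Rt)

f = {True: 0.5, False: 0.5}  # P(R0)
for u in (True, True):
    pred = {r1: sum(T[r0][r1] * f[r0] for r0 in f) for r1 in (True, False)}
    z = sum(O[r][u] * pred[r] for r in pred)
    f = {r: O[r][u] * pred[r] / z for r in pred}
    print(round(f[True], 3))  # prints 0.818, then 0.883
```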

SLIDE 15

Most likely explanation

SLIDE 16

Most likely explanation

Most likely sequence ≠ sequence of most likely states!!!!
Most likely path to each xt+1 = most likely path to some xt plus one more step:

max_{x1...xt} P(x1, . . . , xt, Xt+1|e1:t+1)
    = P(et+1|Xt+1) max_{xt} [ P(Xt+1|xt) max_{x1...xt−1} P(x1, . . . , xt−1, xt|e1:t) ]

Identical to filtering, except f1:t replaced by

m1:t = max_{x1...xt−1} P(x1, . . . , xt−1, Xt|e1:t),

i.e., m1:t(i) gives the probability of the most likely path to state i.
Update has sum replaced by max, giving the Viterbi algorithm:

m1:t+1 = P(et+1|Xt+1) max_{xt} (P(Xt+1|xt) m1:t)
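A compact sketch of the Viterbi recursion with back-pointers (function and variable names are illustrative; the slides show the m messages normalized, which does not change the arg max):

```python
def viterbi(prior, transition, sensor, evidence):
    """Most likely state sequence x1..xt given evidence e1..et."""
    states = list(prior)
    # m1:1(s) is proportional to P(e1|s) * sum_x0 P(s|x0) P(x0)
    m = {s: sensor[s][evidence[0]] *
            sum(transition[s0][s] * prior[s0] for s0 in states)
         for s in states}
    back = []  # back[i][s] = best predecessor of state s at step i+2
    for e in evidence[1:]:
        prev, m, ptr = m, {}, {}
        for s in states:
            best = max(states, key=lambda s0: transition[s0][s] * prev[s0])
            ptr[s] = best
            m[s] = sensor[s][e] * transition[best][s] * prev[best]
        back.append(ptr)
    path = [max(m, key=m.get)]          # best final state
    for ptr in reversed(back):          # follow back-pointers
        path.append(ptr[path[-1]])
    return list(reversed(path))
```

For the five-step umbrella evidence true, true, false, true, true this recovers the path true, true, false, true, true, matching the most likely path in the trellis on the next slide.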

SLIDE 17

Viterbi example

[Figure: Viterbi trellis over Rain1 . . . Rain5, showing the state-space paths and, in bold, the most likely path to each state]

         m1:1    m1:2    m1:3    m1:4    m1:5
true    .8182   .5155   .0361   .0334   .0210
false   .1818   .0491   .1237   .0173   .0024

umbrella:  true   true   false   true   true

SLIDE 18

Implementation Issues

Viterbi message: m1:t+1 = P(et+1|Xt+1) max_{xt} (P(Xt+1|xt) m1:t)

cf. filtering update: P(Xt+1|e1:t+1) = αP(et+1|Xt+1) Σ_{xt} P(Xt+1|xt) P(xt|e1:t)

What is 10⁻⁶ · 10⁻⁶ · 10⁻⁶?

SLIDE 19

Implementation Issues

Viterbi message: m1:t+1 = P(et+1|Xt+1) max_{xt} (P(Xt+1|xt) m1:t)

cf. filtering update: P(Xt+1|e1:t+1) = αP(et+1|Xt+1) Σ_{xt} P(Xt+1|xt) P(xt|e1:t)

What is 10⁻⁶ · 10⁻⁶ · 10⁻⁶? What is floating-point arithmetic precision?

SLIDE 20

Implementation Issues

Viterbi message: m1:t+1 = P(et+1|Xt+1) max_{xt} (P(Xt+1|xt) m1:t)

cf. filtering update: P(Xt+1|e1:t+1) = αP(et+1|Xt+1) Σ_{xt} P(Xt+1|xt) P(xt|e1:t)

What is 10⁻⁶ · 10⁻⁶ · 10⁻⁶? What is floating-point arithmetic precision?
With enough such factors, the product underflows: 10⁻⁶ · 10⁻⁶ · · · 10⁻⁶ = 0
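The point is numeric underflow: multiplying many small probabilities in IEEE-754 double precision eventually yields exactly 0.0. A quick demonstration (a sketch, not from the slides):

```python
# Repeatedly multiply by a per-step probability of 1e-6 until the double
# underflows past the smallest subnormal (~4.9e-324) and becomes exactly 0.0.
p = 1.0
steps = 0
while p > 0.0:
    p *= 1e-6
    steps += 1
print("product became exactly 0.0 after", steps, "multiplications")
```

A few dozen time steps are enough; long evidence sequences make unscaled messages useless.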

SLIDE 21

Answer?

Use either:
– Rescaling: multiply values by a (large) constant
– log-sum trick (Assignment 5)

log is monotone increasing, so: arg max f(x) = arg max log f(x)
Also, log(a · b) = log a + log b
Therefore, work with sums of logarithms of probabilities, rather than products of probabilities:

m1:t+1 = P(et+1|Xt+1) max_{xt} (P(Xt+1|xt) m1:t)
→ log m1:t+1 = log P(et+1|Xt+1) + max_{xt} (log P(Xt+1|xt) + log m1:t)
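In log space the same Viterbi update becomes a max over sums. A one-step sketch (`log_viterbi_step` is an illustrative name):

```python
import math

def log_viterbi_step(log_m, e, transition, sensor):
    """log m1:t -> log m1:t+1: products of probabilities become sums of logs."""
    states = list(log_m)
    return {s: math.log(sensor[s][e]) +
               max(math.log(transition[s0][s]) + log_m[s0] for s0 in states)
            for s in states}
```

Because log is monotone, taking the arg max over the log messages recovers the same most likely path, with no underflow however long the sequence.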

SLIDE 22

Hidden Markov models

Xt is a single, discrete variable (usually Et is too)
Domain of Xt is {1, . . . , S}

Transition matrix Tij = P(Xt = j|Xt−1 = i), e.g.,

T = ( 0.7  0.3 )
    ( 0.3  0.7 )

Sensor matrix Ot for each time step, diagonal elements P(et|Xt = i), e.g., with U1 = true:

O1 = ( 0.9   0  )
     (  0   0.2 )

Forward messages as column vectors: f1:t+1 = α O_{t+1} T⊤ f1:t
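The matrix form maps directly onto code. A sketch for the two-state umbrella model using plain lists (a numpy matrix-vector product would do the same):

```python
# f1:t+1 = alpha * O_{t+1} * T^T * f1:t, with state 0 = rain, state 1 = no rain.
T = [[0.7, 0.3],     # T[i][j] = P(Xt = j | Xt-1 = i)
     [0.3, 0.7]]
O_true = [0.9, 0.2]  # diagonal of O_t when Umbrella_t = true

def forward_step(f, O, T):
    n = len(f)
    pred = [sum(T[i][j] * f[i] for i in range(n)) for j in range(n)]  # T^T f
    unnorm = [O[j] * pred[j] for j in range(n)]                       # O T^T f
    z = sum(unnorm)                                                   # 1/alpha
    return [x / z for x in unnorm]

print(forward_step([0.5, 0.5], O_true, T))  # [0.818..., 0.181...]
```

Since O_t is diagonal, multiplying by it is just an elementwise scaling of the predicted vector.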

SLIDE 23

Dynamic Bayesian networks

Xt, Et contain arbitrarily many variables in a replicated Bayes net

[Figure: two DBNs. Left: umbrella DBN Rain0 → Rain1 → Umbrella1 with P(R0) = 0.7, P(R1|R0): t → 0.7, f → 0.3, and P(U1|R1): t → 0.9, f → 0.2. Right: robot DBN with state X0 → X1, Battery0 → Battery1, and sensor nodes BMeter1, Z1]
SLIDE 24

Summary

Temporal models use state and sensor variables replicated over time
Markov assumptions and stationarity assumption, so we need only
– transition model P(Xt|Xt−1)
– sensor model P(Et|Xt)
Tasks are filtering, prediction, smoothing, most likely sequence; all done recursively with constant cost per time step
Hidden Markov models have a single discrete state variable; used for speech recognition
Dynamic Bayes nets subsume HMMs; exact update intractable

SLIDE 25

Example Umbrella Problems

Filtering: f1:t+1 = P(Xt+1|e1:t+1) = αP(et+1|Xt+1) Σ_{xt} P(Xt+1|xt) P(xt|e1:t)
Viterbi: m1:t+1 = P(et+1|Xt+1) max_{xt} (P(Xt+1|xt) m1:t)

Rt−1   P(Rt = t)   P(Rt = f)
  t       0.7         0.3
  f       0.3         0.7

Rt   P(Ut = t)   P(Ut = f)
  t      0.9        0.1
  f      0.2        0.8

P(R3|¬u1, u2, ¬u3) = ?

arg max_{R1:3} P(R1:3|¬u1, u2, ¬u3) = ?
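Neither answer is stated on the slide; as a sketch, the filtering query can be computed with the forward update from the earlier slides (names are illustrative), and the arg max query by running the analogous Viterbi recursion on the same evidence:

```python
# P(R3 | ¬u1, u2, ¬u3) via three forward steps on the umbrella model.
T = {True: {True: 0.7, False: 0.3}, False: {True: 0.3, False: 0.7}}  # P(Rt|Rt-1)
O = {True: {True: 0.9, False: 0.1}, False: {True: 0.2, False: 0.8}}  # P(Ut|Rt)

f = {True: 0.5, False: 0.5}  # P(R0)
for u in [False, True, False]:  # evidence ¬u1, u2, ¬u3
    pred = {r1: sum(T[r0][r1] * f[r0] for r0 in f) for r1 in (True, False)}
    z = sum(O[r][u] * pred[r] for r in pred)
    f = {r: O[r][u] * pred[r] / z for r in pred}
print(f)  # roughly {True: 0.148, False: 0.852}
```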
