

SLIDE 1

Temporal Inference

16-385 Computer Vision (Kris Kitani)

Carnegie Mellon University

SLIDE 2

Basic Inference Tasks

Filtering

P(Xt|e1:t)

Posterior probability over the current state, given all evidence up to present

Prediction

Posterior probability over a future state, given all evidence up to present

P(Xt+k|e1:t)

Smoothing

Posterior probability over a past state, given all evidence up to present

P(Xk|e1:t)

Best Sequence

Best state sequence given all evidence up to present

argmax_{X1:t} P(X1:t|e1:t)

SLIDE 3

Filtering

P(Xt|e1:t)

Posterior probability over the current state, given all evidence up to present

Where am I now?

SLIDE 4

Filtering

P(Xt+1|e1:t+1) ∝ P(et+1|Xt+1) Σ_{Xt} P(Xt+1|Xt) P(Xt|e1:t)

Can be computed with recursion (Dynamic Programming)

observation model: P(et+1|Xt+1); motion model: P(Xt+1|Xt); prior: P(Xt|e1:t); posterior: P(Xt+1|e1:t+1)

SLIDE 5

Filtering

P(Xt+1|e1:t+1) ∝ P(et+1|Xt+1) Σ_{Xt} P(Xt+1|Xt) P(Xt|e1:t)

Can be computed with recursion (Dynamic Programming)

observation model: P(et+1|Xt+1); motion model: P(Xt+1|Xt)

What is this last term, P(Xt|e1:t)?

SLIDE 6

Filtering

P(Xt+1|e1:t+1) ∝ P(et+1|Xt+1) Σ_{Xt} P(Xt+1|Xt) P(Xt|e1:t)

Can be computed with recursion (Dynamic Programming): P(Xt|e1:t) and P(Xt+1|e1:t+1) are the same type of 'message'.

SLIDE 7

Filtering

P(Xt+1|e1:t+1) ∝ P(et+1|Xt+1) Σ_{Xt} P(Xt+1|Xt) P(Xt|e1:t)

Can be computed with recursion (Dynamic Programming): the same type of 'message', called a belief distribution. A belief is a reflection of the system's (robot, tracker) knowledge about the state X.

Sometimes people use this annoying notation instead: Bel(xt)

SLIDE 8

Filtering

P(Xt+1|e1:t+1) ∝ P(et+1|Xt+1) Σ_{Xt} P(Xt+1|Xt) P(Xt|e1:t)

Can be computed with recursion (Dynamic Programming)

Where does this equation come from?

(scary math to follow…)

SLIDE 9

Filtering

P(Xt+1|e1:t+1) = P(Xt+1|et+1, e1:t)

P(Xt+1|e1:t+1) ∝ P(et+1|Xt+1) Σ_{Xt} P(Xt+1|Xt) P(Xt|e1:t)

Can be computed with recursion (Dynamic Programming)

just splitting up the notation here

SLIDE 10

Filtering

P(Xt+1|e1:t+1) = P(Xt+1|et+1, e1:t)

P(Xt+1|e1:t+1) ∝ P(et+1|Xt+1) Σ_{Xt} P(Xt+1|Xt) P(Xt|e1:t)

Can be computed with recursion (Dynamic Programming)

Apply Bayes' rule (with evidence)

SLIDE 11

Filtering

P(Xt+1|e1:t+1) = P(Xt+1|et+1, e1:t) = P(et+1|Xt+1, e1:t) P(Xt+1|e1:t) / P(et+1|e1:t)

P(Xt+1|e1:t+1) ∝ P(et+1|Xt+1) Σ_{Xt} P(Xt+1|Xt) P(Xt|e1:t)

Can be computed with recursion (Dynamic Programming)

Apply Markov assumption on observation model

SLIDE 12

Filtering

P(Xt+1|e1:t+1) = P(Xt+1|et+1, e1:t)
             = P(et+1|Xt+1, e1:t) P(Xt+1|e1:t) / P(et+1|e1:t)
             = α P(et+1|Xt+1) P(Xt+1|e1:t)

P(Xt+1|e1:t+1) ∝ P(et+1|Xt+1) Σ_{Xt} P(Xt+1|Xt) P(Xt|e1:t)

Can be computed with recursion (Dynamic Programming)

Condition on the previous state Xt

SLIDE 13

Filtering

P(Xt+1|e1:t+1) = P(Xt+1|et+1, e1:t)
             = P(et+1|Xt+1, e1:t) P(Xt+1|e1:t) / P(et+1|e1:t)
             = α P(et+1|Xt+1) P(Xt+1|e1:t)
             = α P(et+1|Xt+1) Σ_{Xt} P(Xt+1|Xt, e1:t) P(Xt|e1:t)

P(Xt+1|e1:t+1) ∝ P(et+1|Xt+1) Σ_{Xt} P(Xt+1|Xt) P(Xt|e1:t)

Can be computed with recursion (Dynamic Programming)

Apply Markov assumption on motion model

SLIDE 14

Filtering

P(Xt+1|e1:t+1) = P(Xt+1|et+1, e1:t)
             = P(et+1|Xt+1, e1:t) P(Xt+1|e1:t) / P(et+1|e1:t)
             = α P(et+1|Xt+1) P(Xt+1|e1:t)
             = α P(et+1|Xt+1) Σ_{Xt} P(Xt+1|Xt, e1:t) P(Xt|e1:t)
             = α P(et+1|Xt+1) Σ_{Xt} P(Xt+1|Xt) P(Xt|e1:t)

P(Xt+1|e1:t+1) ∝ P(et+1|Xt+1) Σ_{Xt} P(Xt+1|Xt) P(Xt|e1:t)

Can be computed with recursion (Dynamic Programming)
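
To make the recursion concrete, here is a minimal numpy sketch of one filtering step (not from the slides; the function and argument names are mine, and T[i, j] is taken to mean P(Xt+1 = j | Xt = i)):

import numpy as np

def filter_step(belief, T, obs_lik):
    # belief:  P(Xt | e1:t), shape (S,)
    # T:       motion model, T[i, j] = P(Xt+1 = j | Xt = i), shape (S, S)
    # obs_lik: observation model at the new evidence, obs_lik[j] = P(et+1 | Xt+1 = j)
    predicted = T.T @ belief          # Σ_{Xt} P(Xt+1|Xt) P(Xt|e1:t)
    updated = obs_lik * predicted     # multiply in P(et+1|Xt+1)
    return updated / updated.sum()    # the α: normalize so the belief sums to one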

SLIDE 15

‘In the trunk of a car of a sleepy driver’ model

binary random variable (left lane or right lane)

x0 → x1 → x2 → x3 → x4 (a Markov chain of hidden states); x ∈ {xleft, xright}

Hidden Markov Model example

SLIDE 16

From a hole in the car you can see the ground: hidden states x0, x1, x2, x3, x4 with observations e1, e2, e3, e4; e ∈ {egray, eyellow}

binary random variable (center lane is yellow or road is gray)

SLIDE 17

[HMM diagram: hidden states x0 → x1 → x2 → x3 → x4 with observations e1, e2, e3, e4]

P(x0):
                xleft    xright
                0.5      0.5

P(xt|xt−1):
                xt = xleft    xt = xright
xt−1 = xleft    0.7           0.3
xt−1 = xright   0.3           0.7

P(et|xt):
                xt = xleft    xt = xright
et = eyellow    0.9           0.2
et = egray      0.1           0.8

What needs to sum to one?

What’s the probability of being in the left lane at t=4?

This is filtering!
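
As a sketch, these tables can be written down as numpy arrays (array names are mine). The assertions spell out the answer to the normalization question: every conditional distribution must sum to one over its outcomes:

import numpy as np

prior = np.array([0.5, 0.5])               # P(x0): [left, right]
T = np.array([[0.7, 0.3],                  # P(xt | xt-1 = left)
              [0.3, 0.7]])                 # P(xt | xt-1 = right)
E = np.array([[0.9, 0.2],                  # P(eyellow | left), P(eyellow | right)
              [0.1, 0.8]])                 # P(egray | left),   P(egray | right)

assert np.isclose(prior.sum(), 1.0)        # the prior sums to one
assert np.allclose(T.sum(axis=1), 1.0)     # each row of P(xt|xt-1) sums to one
assert np.allclose(E.sum(axis=0), 1.0)     # each column of P(et|xt) sums to one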

SLIDE 18

(sidebar repeated: model tables from SLIDE 17 and the filtering equation)

Filtering: What is the belief distribution if I see yellow at t=1? p(x1|e1 = eyellow) = ?

Prediction step: p(x1) = Σ_{x0} p(x1|x0) p(x0)
Update step: p(x1|e1) = α p(e1|x1) p(x1)

SLIDE 19

p(x1) = Σ_{x0} p(x1|x0) p(x0)
      = (0.5) [0.7, 0.3] + (0.5) [0.3, 0.7]
      = [0.7 0.3; 0.3 0.7] [0.5; 0.5]
      = [0.5, 0.5]

(sidebar repeated: model tables from SLIDE 17 and the filtering equation)

Filtering, prediction step: What is the belief distribution if I see yellow at t=1? p(x1|e1 = eyellow) = ?

SLIDE 20

(sidebar repeated: model tables from SLIDE 17 and the filtering equation)

Filtering, update step: What is the belief distribution if I see yellow at t=1? p(x1|e1 = eyellow) = ?

p(x1|e1) = α p(e1|x1) p(x1)

SLIDE 21

p(x1|e1) = α p(e1|x1) p(x1)
         = α [0.9, 0.2] .* [0.5, 0.5]      (elementwise product)
         = α [0.9 0; 0 0.2] [0.5; 0.5]
         = α [0.45, 0.1]
         ≈ [0.818, 0.182]

(sidebar repeated: model tables from SLIDE 17 and the filtering equation)

Filtering, update step: we observed yellow at t=1, so p(x1|e1 = eyellow) ≈ [0.818, 0.182]. Which lane are we now more likely to be in?

SLIDE 22

(sidebar repeated: model tables from SLIDE 17 and the filtering equation)

Summary: What is the belief distribution if I see yellow at t=1? p(x1|e1 = eyellow) = ?

Prediction step: p(x1) = Σ_{x0} p(x1|x0) p(x0) = [0.5, 0.5]
Update step: p(x1|e1) = α p(e1|x1) p(x1) ≈ [0.818, 0.182]
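
A quick numeric check of this summary (a self-contained numpy sketch; array names are mine):

import numpy as np

prior = np.array([0.5, 0.5])               # P(x0): [left, right]
T = np.array([[0.7, 0.3], [0.3, 0.7]])     # T[i, j] = P(xt = j | xt-1 = i)
p_yellow = np.array([0.9, 0.2])            # P(e = yellow | x) for [left, right]

p_x1 = T.T @ prior                         # prediction step -> [0.5, 0.5]
p_x1_given_e1 = p_yellow * p_x1            # update, unnormalized -> [0.45, 0.1]
p_x1_given_e1 /= p_x1_given_e1.sum()       # normalize (the α) -> [0.818, 0.182]
print(p_x1, p_x1_given_e1)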

SLIDE 23

(sidebar repeated: model tables from SLIDE 17 and the filtering equation)

Filtering: What if you see yellow again at t=2? p(x2|e1, e2) = ?

SLIDE 24

Prediction step: p(x2|e1) = Σ_{x1} p(x2|x1) p(x1|e1)
Update step: p(x2|e1, e2) = α p(e2|x2) p(x2|e1)

(sidebar repeated: model tables from SLIDE 17 and the filtering equation)

Filtering: What if you see yellow again at t=2? p(x2|e1, e2) = ?

SLIDE 25

p(x2|e1) = Σ_{x1} p(x2|x1) p(x1|e1)
         = [0.7 0.3; 0.3 0.7] [0.818; 0.182]
         ≈ [0.627, 0.373]

(sidebar repeated: model tables from SLIDE 17 and the filtering equation)

Filtering, prediction step: What if you see yellow again at t=2? p(x2|e1, e2) = ?

Why does the probability of being in the left lane go down? (With no new evidence yet, the motion model mixes the lanes and pulls the belief back toward 50/50.)

SLIDE 26

(sidebar repeated: model tables from SLIDE 17 and the filtering equation)

Filtering, update step: What if you see yellow again at t=2?

p(x2|e1, e2) = α p(e2|x2) p(x2|e1)
             = α [0.9 0; 0 0.2] [0.627; 0.373]
             ≈ [0.883, 0.117]

Why does the probability of being in the left lane go up? (Seeing yellow a second time is strong evidence for the left lane.)
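
The full recursion as a loop (a numpy sketch; the model is redefined so the snippet is self-contained). Running it on two yellow observations reproduces both beliefs above:

import numpy as np

T = np.array([[0.7, 0.3], [0.3, 0.7]])     # motion model, T[i, j] = P(xt = j | xt-1 = i)
E = {'yellow': np.array([0.9, 0.2]),       # P(e | x) for x = [left, right]
     'gray':   np.array([0.1, 0.8])}

belief = np.array([0.5, 0.5])              # P(x0)
for e in ['yellow', 'yellow']:
    belief = T.T @ belief                  # prediction step
    belief = E[e] * belief                 # update step (unnormalized)
    belief /= belief.sum()                 # normalize (the α)
    print(belief)                          # [0.818, 0.182], then [0.883, 0.117]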

SLIDE 27

Basic Inference Tasks

Filtering

P(Xt|e1:t)

Posterior probability over the current state, given all evidence up to present

Prediction

Posterior probability over a future state, given all evidence up to present

P(Xt+k|e1:t)

Smoothing

Posterior probability over a past state, given all evidence up to present

P(Xk|e1:t)

Best Sequence

Best state sequence given all evidence up to present

argmax_{X1:t} P(X1:t|e1:t)

SLIDE 28

Prediction

Where am I going?

P(Xt+k|e1:t)

Posterior probability over a future state, given all evidence up to present

SLIDE 29

Prediction

P(Xt+k+1|e1:t) = Σ_{xt+k} P(Xt+k+1|xt+k) P(xt+k|e1:t)

no new evidence!

What happens as you try to predict further into the future?

same recursive form as filtering but…

SLIDE 30

Prediction

P(Xt+k+1|e1:t) = Σ_{xt+k} P(Xt+k+1|xt+k) P(xt+k|e1:t)

no new evidence

What happens as you try to predict further into the future? The belief approaches the chain's 'stationary distribution'.
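
A sketch of pure prediction (numpy, names mine): with no update step, repeatedly applying the motion model drives any belief toward the stationary distribution, which is [0.5, 0.5] for this symmetric lane model:

import numpy as np

T = np.array([[0.7, 0.3], [0.3, 0.7]])   # motion model
belief = np.array([0.883, 0.117])        # filtered belief from the example
for k in range(10):
    belief = T.T @ belief                # prediction only, no new evidence
    print(k + 1, belief)                 # converges toward [0.5, 0.5]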

SLIDE 31

Basic Inference Tasks

Filtering

P(Xt|e1:t)

Posterior probability over the current state, given all evidence up to present

Prediction

Posterior probability over a future state, given all evidence up to present

P(Xt+k|e1:t)

Smoothing

Posterior probability over a past state, given all evidence up to present

P(Xk|e1:t)

Best Sequence

Best state sequence given all evidence up to present

argmax_{X1:t} P(X1:t|e1:t)

SLIDE 32

Smoothing

Wait, what did I do yesterday?

P(Xk|e1:t)

Posterior probability over a past state, given all evidence up to present

SLIDE 33

Smoothing

P(Xk|e1:t), for some time k in the past, 1 ≤ k < t

P(Xk|e1:t) = P(Xk|e1:k, ek+1:t)
          = α P(Xk|e1:k) P(ek+1:t|Xk, e1:k)
          = α P(Xk|e1:k) P(ek+1:t|Xk)

P(Xk|e1:k) is the 'forward' message (this is just filtering); P(ek+1:t|Xk) is the 'backward' message (this is backwards filtering). Let me explain…

SLIDE 34

Backward message

P(ek+1:t|Xk) = Σ_{xk+1} P(ek+1:t|Xk, xk+1) P(xk+1|Xk)          (conditioning on xk+1)
            = Σ_{xk+1} P(ek+1:t|xk+1) P(xk+1|Xk)                (Markov assumption)
            = Σ_{xk+1} P(ek+1, ek+2:t|xk+1) P(xk+1|Xk)          (split the evidence)
            = Σ_{xk+1} P(ek+1|xk+1) P(ek+2:t|xk+1) P(xk+1|Xk)

In the last line, P(ek+1|xk+1) is the observation model, P(ek+2:t|xk+1) is the recursive message, and P(xk+1|Xk) is the motion model. This is just a 'backwards' version of filtering, where the initial message is P(et+1:t|Xt) = 1 (an empty evidence sequence).
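
A minimal forward-backward sketch for the lane model (a numpy illustration with my own names, not the lecture's code): the forward pass is filtering, the backward message starts at all ones and runs the recursion above in reverse, and the smoothed posterior is the normalized product of the two messages:

import numpy as np

T = np.array([[0.7, 0.3], [0.3, 0.7]])     # motion model, T[i, j] = P(x' = j | x = i)
E = {'yellow': np.array([0.9, 0.2]),       # observation model P(e | x)
     'gray':   np.array([0.1, 0.8])}
obs = ['yellow', 'yellow']                 # e1, e2

# forward pass (filtering): fwd[k] = P(x_{k+1} | e_{1:k+1})
fwd, belief = [], np.array([0.5, 0.5])
for e in obs:
    belief = E[e] * (T.T @ belief)
    belief = belief / belief.sum()
    fwd.append(belief)

# backward pass: bwd = P(e_{k+2:t} | x_{k+1}), initialized to all ones
bwd = np.ones(2)
smoothed = [None] * len(obs)
for k in range(len(obs) - 1, -1, -1):
    s = fwd[k] * bwd                       # combine forward and backward messages
    smoothed[k] = s / s.sum()
    bwd = T @ (E[obs[k]] * bwd)            # backward recursion, one step earlier

print(smoothed)                            # smoothed[k] = P(x_{k+1} | e_{1:t})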

SLIDE 35

Basic Inference Tasks

Filtering

P(Xt|e1:t)

Posterior probability over the current state, given all evidence up to present

Prediction

Posterior probability over a future state, given all evidence up to present

P(Xt+k|e1:t)

Smoothing

Posterior probability over a past state, given all evidence up to present

P(Xk|e1:t)

Best Sequence

Best state sequence given all evidence up to present

argmax_{X1:t} P(X1:t|e1:t)

SLIDE 36

Best Sequence

I must have done something right, right?

arg max

X1:t

P(X1:t|e1:t)

Best state sequence given all evidence up to present

SLIDE 37

Best Sequence

Identical to filtering but with a max operator in place of the sum.

Recall the filtering equation:

P(Xt+1|e1:t+1) ∝ P(et+1|Xt+1) Σ_{Xt} P(Xt+1|Xt) P(Xt|e1:t)

max_{x1,…,xt} P(x1, …, xt, Xt+1|e1:t+1) = α P(et+1|Xt+1) max_{xt} [ P(Xt+1|xt) max_{x1,…,xt−1} P(x1, …, xt−1, xt|e1:t) ]

Both sides have the same form: the inner max_{x1,…,xt−1} term is the recursive message. This is the 'Viterbi Algorithm'.
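
A minimal Viterbi sketch on the lane model (numpy; the names and the made-up observation sequence are mine). The loop is the filtering loop with max in place of the sum, plus backpointers so the best sequence can be read off at the end:

import numpy as np

T = np.array([[0.7, 0.3], [0.3, 0.7]])        # motion model, T[i, j] = P(x' = j | x = i)
E = {'yellow': np.array([0.9, 0.2]),          # observation model P(e | x)
     'gray':   np.array([0.1, 0.8])}
obs = ['yellow', 'yellow', 'gray']            # a made-up evidence sequence

m = np.array([0.5, 0.5])                      # message, initialized from P(x0)
backptr = []
for e in obs:
    scores = T * m[:, None]                   # scores[i, j] = m[i] * P(x' = j | x = i)
    backptr.append(scores.argmax(axis=0))     # best predecessor of each state j
    m = E[e] * scores.max(axis=0)             # max over the previous state, then update

best = [int(m.argmax())]                      # best final state
for bp in reversed(backptr):
    best.append(int(bp[best[-1]]))            # follow the backpointers
best.reverse()
print(best)                                   # [x0, x1, x2, x3]; 0 = left, 1 = right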

SLIDE 38

Now you know how to answer all the important questions in life:

Where am I now? Where am I going? Wait, what did I do yesterday? I must have done something right, right?