Graphical Models Kalman Filter DBN ML 701 Undirected Models - - PowerPoint PPT Presentation

SLIDE 1

Graphical Models

ML 701 Anna Goldenberg

Outline

Dynamic Models

Gaussian Linear Models

  • Kalman Filter

DBN

Undirected Models

Unification

Summary

HMMs

$q_t$: hidden states
$O_t$: observations

[Figure: chain q_0 → q_1 → ... → q_T, with each q_t emitting O_t]

$$P(Q, O) = p(q_0) \prod_{t=0}^{T-1} p(q_{t+1} \mid q_t) \prod_{t=0}^{T} p(O_t \mid q_t)$$

HMM in short

  • is a Bayes Net
  • satisfies the Markov property (future states are independent of past states given the present)
  • with discrete states (time steps are discrete)
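The HMM factorization can be evaluated directly for a small discrete model. The sketch below is only illustrative: the two-state weather model, its probability tables, and the function name `hmm_joint` are assumptions and do not come from the slides.

```python
def hmm_joint(states, observations, prior, trans, emit):
    """Return P(Q, O) = p(q_0) * prod_t p(q_{t+1} | q_t) * prod_t p(O_t | q_t)."""
    prob = prior[states[0]]
    # One transition factor per consecutive pair of hidden states.
    for prev, nxt in zip(states, states[1:]):
        prob *= trans[prev][nxt]
    # One emission factor per time step.
    for q, o in zip(states, observations):
        prob *= emit[q][o]
    return prob


# Hypothetical two-state model; all numbers are made up for illustration.
prior = {"rain": 0.5, "sun": 0.5}
trans = {"rain": {"rain": 0.7, "sun": 0.3},
         "sun": {"rain": 0.2, "sun": 0.8}}
emit = {"rain": {"umbrella": 0.9, "none": 0.1},
        "sun": {"umbrella": 0.2, "none": 0.8}}

p = hmm_joint(["rain", "rain", "sun"],
              ["umbrella", "umbrella", "none"],
              prior, trans, emit)
```

Each factor in the product corresponds to one edge of the Bayes net, so the cost of evaluating the joint is linear in the sequence length.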

What about continuous HMMs?

SLIDE 2

What about continuous HMMs? Gaussian linear state space models!

Example of use

SLAM - Simultaneous Localization and Mapping

Drawback: the belief state and the update time grow quadratically in the number of landmarks

http://www.stanford.edu/~paskin/slam/

State Space Models

$q_t$: hidden states
$O_t$: observations

[Figure: chain q_0 → q_1 → ... → q_T, with each q_t emitting O_t]

$$P(Q, O) = p(q_0) \prod_{t=0}^{T-1} p(q_{t+1} \mid q_t) \prod_{t=0}^{T} p(O_t \mid q_t)$$

  • $q_t$ is a real-valued K-dimensional hidden state variable
  • $O_t$ is a D-dimensional real-valued observation vector

SLIDE 3

State Space Models

$$q_t = f(q_{t-1}) + w_t$$

  • f determines the mean of $q_t$ given the mean of $q_{t-1}$
  • $w_t$ is a zero-mean random noise vector

$$O_t = g(q_t) + v_t$$

  • similarly, g determines the mean of $O_t$ given $q_t$, and $v_t$ is a zero-mean random noise vector

Gaussian Linear State Space Models

[Figure: chain q_0 → q_1 → ... → q_T, each q_t emitting O_t; transition edges labeled A, emission edges labeled B]

  • $O_t$ and $q_t$ are Gaussian
  • f and g are linear and time-invariant
  • A: transition matrix
  • B: observation matrix

$$q_t = A q_{t-1} + w_t, \qquad w_t \sim N(0, R)$$
$$O_t = B q_t + v_t, \qquad v_t \sim N(0, S)$$
$$q_0 \sim N(0, \Sigma_0)$$

correction: previously R and S were reversed
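To make the generative model concrete, here is a minimal sampling sketch for the scalar case (K = D = 1), so that A, B, R, S and Σ0 are plain numbers. The function name `simulate_lgssm`, the parameter values, and the restriction to scalars are assumptions made for illustration.

```python
import random


def simulate_lgssm(a, b, r, s, sigma0, steps, seed=0):
    """Sample q_0..q_T and O_0..O_T from a scalar linear Gaussian model.

    r, s and sigma0 are variances: transition noise, observation noise,
    and the prior on q_0, respectively.
    """
    rng = random.Random(seed)
    q = rng.gauss(0.0, sigma0 ** 0.5)          # q_0 ~ N(0, sigma0)
    states, observations = [], []
    for t in range(steps + 1):
        if t > 0:
            q = a * q + rng.gauss(0.0, r ** 0.5)   # q_t = A q_{t-1} + w_t
        states.append(q)
        observations.append(b * q + rng.gauss(0.0, s ** 0.5))  # O_t = B q_t + v_t
    return states, observations


# Hypothetical parameter values.
states, observations = simulate_lgssm(a=0.9, b=1.0, r=0.1, s=0.5,
                                      sigma0=1.0, steps=5)
```

Only `observations` would be available to a filter; `states` is what inference tries to recover.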

Inference

  • forward step (filtering): $p(q_t \mid O_0, \ldots, O_t)$, computed by the Kalman Filter
  • backward step (smoothing): $p(q_t \mid O_t, O_{t+1}, \ldots, O_T)$

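Kalman Filter (1960)

Time update: $P(q_{t-1} \mid O_0, \ldots, O_{t-1}) \rightarrow P(q_t \mid O_0, \ldots, O_{t-1})$

$$E(q_{t|t-1}) = A \, E(q_{t-1|t-1})$$
$$V(q_{t|t-1}) = A \, V(q_{t-1|t-1}) \, A^T + R$$

Measurement update: $P(q_t \mid O_0, \ldots, O_{t-1}) \rightarrow P(q_t \mid O_0, \ldots, O_t)$

  • 1. Form the joint $P(q_t, O_t \mid O_0, \ldots, O_{t-1})$, a Gaussian with mean and covariance

$$\begin{pmatrix} E(q_{t|t-1}) \\ B \, E(q_{t|t-1}) \end{pmatrix}, \qquad \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix} = \begin{pmatrix} V(q_{t|t-1}) & V(q_{t|t-1}) B^T \\ B \, V(q_{t|t-1}) & B \, V(q_{t|t-1}) B^T + S \end{pmatrix}$$

  • 2. Condition on $O_t$ to get $P(q_t \mid O_0, \ldots, O_t)$

$$E(q_{t|t}) = E(q_{t|t-1}) + \Sigma_{12} \Sigma_{22}^{-1} \left( O_t - E(O_{t|t-1}) \right), \qquad E(O_{t|t-1}) = B \, E(q_{t|t-1})$$
$$V(q_{t|t}) = V(q_{t|t-1}) - \Sigma_{12} \Sigma_{22}^{-1} \Sigma_{21}$$

[Figure: network fragment q_{t-1}, O_{t-1}, q_t, O_t, shown at each stage of the update]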
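The time update and measurement update can be sketched for the scalar case, where the matrix inverse becomes a division. This is a hedged illustration rather than a reference implementation: the function names `predict`, `update` and `kalman_filter` are assumptions, R is taken as the transition-noise variance and S as the observation-noise variance (as in the model definition), and the first observation O_0 is assumed to be combined with the prior N(0, Σ0) without a preceding time update.

```python
def predict(mean, var, a, r):
    """Time update: E and V of q_t given O_0..O_{t-1}."""
    return a * mean, a * var * a + r


def update(mean, var, obs, b, s):
    """Measurement update: condition the joint Gaussian of (q_t, O_t) on O_t."""
    sigma12 = var * b              # Cov(q_t, O_t)
    sigma22 = b * var * b + s      # Var(O_t)
    gain = sigma12 / sigma22       # Sigma12 * Sigma22^{-1}
    new_mean = mean + gain * (obs - b * mean)
    new_var = var - gain * sigma12
    return new_mean, new_var


def kalman_filter(observations, a, b, r, s, sigma0):
    """Return [(E(q_t|t), V(q_t|t))] for t = 0..T in a scalar model."""
    mean, var = 0.0, sigma0        # prior q_0 ~ N(0, sigma0)
    estimates = []
    for t, obs in enumerate(observations):
        if t > 0:
            mean, var = predict(mean, var, a, r)
        mean, var = update(mean, var, obs, b, s)
        estimates.append((mean, var))
    return estimates


# Hypothetical example: a constant state observed twice with unit noise.
estimates = kalman_filter([2.0, 2.0], a=1.0, b=1.0, r=0.0, s=1.0, sigma0=1.0)
```

Note that the posterior variance does not depend on the observed values, only on the model parameters, so it can be computed ahead of time.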

SLIDE 4

Example of use

Reported by Welch and Bishop, SIGGRAPH 2001

Kalman Filter Usage

  • Tracking motion
      • Missiles
      • Hand motion
      • Lip motion from videos
  • Signal Processing
  • Navigation
  • Economics (for prediction)

Dynamic Bayes Nets

So far:

[Figure: chain q_0 → q_1 → ... → q_T, with each q_t emitting O_t]

But are there more appealing models?

[Figure (Koller and Friedman): dynamic Bayes net over time slices 0, 1, 2, each with nodes Weather, Velocity, Location, Failure, and Obs]

SLIDE 5

Dynamic Bayes Nets

It’s just a Bayes Net!

Approach to the dynamics:

  • 1. Start with some prior for the initial state
  • 2. Predict the next state using only the observations up to the previous time step
  • 3. Incorporate the new observation and re-estimate the current state

[Figure: dynamic Bayes net over time slices 0, 1, 2, each with nodes Weather, Velocity, Location, Failure, and Obs]

Most importantly: use the structure of the Bayes Net. Use the independencies!
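The three-step approach (prior, predict, incorporate the observation) can be sketched for a single discrete variable. Everything specific here is an assumption for illustration: the two-state failure model, its probabilities, and the function names `predict_belief` and `update_belief`. A real DBN would factor the state into several variables and exploit their independencies instead of using one flat table.

```python
def predict_belief(belief, trans):
    """Step 2: push the current belief through the transition model."""
    return {j: sum(belief[i] * trans[i][j] for i in belief) for j in belief}


def update_belief(belief, likelihood):
    """Step 3: weight by p(observation | state) and renormalize."""
    unnormalized = {q: belief[q] * likelihood[q] for q in belief}
    total = sum(unnormalized.values())
    return {q: value / total for q, value in unnormalized.items()}


# Step 1: a prior over the initial state (hypothetical numbers throughout).
prior = {"ok": 0.9, "failed": 0.1}
trans = {"ok": {"ok": 0.95, "failed": 0.05},
         "failed": {"ok": 0.0, "failed": 1.0}}
alarm_likelihood = {"ok": 0.1, "failed": 0.8}   # p(Obs = alarm | state)

predicted = predict_belief(prior, trans)
posterior = update_belief(predicted, alarm_likelihood)
```

This is the same predict-then-correct pattern as the Kalman filter, with sums over discrete states in place of Gaussian algebra.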

Other graphical models

but first... Any questions so far?

Are all graphical models directed?

There are Undirected Graphical Models!

[Figure: undirected graph over nodes A, B, C, D, E]

Undirected models

$$p(X) = \frac{1}{Z} \prod_{C} \psi(X_C)$$

[Figure: undirected graph over nodes A, B, C, D, E]

  • $\psi(X_C)$: non-negative potential function

What are C?

SLIDE 6

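Cliques

$$p(X) = \frac{1}{Z} \prod_{C} \psi(X_C)$$

[Figure: undirected graph over nodes A, B, C, D, E]

  • A clique C is a subset $C \subseteq V$ such that $(i, j) \in E$ for all distinct $i, j \in C$
  • C is maximal if it is not contained in any other clique
  • $\psi(X_C)$: non-negative potential function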

Cliques

[Figure: undirected graph over nodes A, B, C, D, E]

  • i) B: a clique?
  • ii) BC: a maximal clique?
  • iii) ABCD: a clique?
  • iv) ABC: a maximal clique?
  • v) BCDE: a clique?

Decomposition

[Figure: undirected graph over nodes A, B, C, D, E]

Note to resolve the confusion: the most common machine learning notation is the decomposition over maximal cliques

$$p(A, B, C, D, E) = \frac{1}{Z} \, \psi(A, B, C) \, \psi(B, D) \, \psi(C, E) \, \psi(D, E)$$
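The clique definitions can be checked mechanically on this example. Since the figure itself is not in the text, the edge list below is an assumption reconstructed from the four factors of the decomposition (A-B, A-C, B-C, B-D, C-E, D-E); the function names are also hypothetical. The enumeration is brute force and only suitable for tiny graphs.

```python
from itertools import combinations

# Assumed edges, inferred from the factors over ABC, BD, CE and DE.
EDGES = {frozenset(edge) for edge in ["AB", "AC", "BC", "BD", "CE", "DE"]}
NODES = sorted({node for edge in EDGES for node in edge})


def is_clique(nodes):
    """True if every pair of distinct nodes in `nodes` is joined by an edge."""
    return all(frozenset(pair) in EDGES for pair in combinations(nodes, 2))


def maximal_cliques():
    """Enumerate all cliques, then keep those not contained in a larger one."""
    cliques = [set(subset)
               for size in range(1, len(NODES) + 1)
               for subset in combinations(NODES, size)
               if is_clique(subset)]
    return [c for c in cliques if not any(c < other for other in cliques)]


maximal = sorted("".join(sorted(c)) for c in maximal_cliques())
```

Under the assumed edges, `maximal` lists exactly the four variable sets that appear as factors in the decomposition.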

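Independence

[Figure: undirected graph over nodes A, B, C, D, E]

Rule: V1 is independent of V2 given a cutset S, i.e. a set of nodes that every path from V1 to V2 must pass through.
For a single node, the cutset that separates it from all other nodes is called its Markov Blanket (MB), e.g. MB(B) = {A, C, D}, i.e. the set of neighbors.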
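The separation rule amounts to a reachability check: remove the cutset and see whether V2 can still be reached from V1. The sketch below assumes the same edge list as the decomposition suggests (A-B, A-C, B-C, B-D, C-E, D-E), which is a reconstruction because the figure is not in the text; `markov_blanket` and `separated` are hypothetical names.

```python
from collections import deque

# Assumed edges of the example graph.
EDGES = ["AB", "AC", "BC", "BD", "CE", "DE"]
NEIGHBORS = {}
for u, v in EDGES:
    NEIGHBORS.setdefault(u, set()).add(v)
    NEIGHBORS.setdefault(v, set()).add(u)


def markov_blanket(node):
    """In an undirected model the Markov blanket is the set of neighbors."""
    return NEIGHBORS[node]


def separated(v1, v2, cutset):
    """True if every path from a node in v1 to a node in v2 hits the cutset."""
    blocked = set(cutset)
    seen = set(v1) - blocked
    frontier = deque(seen)
    while frontier:                      # breadth-first search avoiding the cutset
        node = frontier.popleft()
        if node in v2:
            return False
        for nxt in NEIGHBORS[node] - blocked - seen:
            seen.add(nxt)
            frontier.append(nxt)
    return True


blanket_of_b = markov_blanket("B")
b_vs_e = separated({"B"}, {"E"}, blanket_of_b)
```

With the assumed edges, conditioning on C alone would not separate B from E, because the path through D stays open.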

SLIDE 7

Are undirected models useful?

Yes!

  • Used a lot in physics (Ising model, Boltzmann machine)
  • In vision (every pixel is a node)
  • Bioinformatics

Why not more popular?

The ZZZZZZ! It’s the partition function:

$$p(X) = \frac{1}{Z} \prod_{C} \psi(X_C)$$

What’s Z and ways to fight it

$$Z = \sum_{x} \prod_{C} \psi(x_C)$$

Approximations:

  • Sampling (MCMC sampling is common)
  • Pseudo-Likelihood
  • Mean-field approximation
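The sum defining Z can be written out by brute force for a tiny model, which also shows why it becomes a problem: the number of terms doubles with every binary variable. The sketch assumes five binary variables, the maximal cliques ABC, BD, CE, DE from the decomposition example, and a made-up potential that rewards agreement within a clique; none of these specifics come from the slides.

```python
from itertools import product

VARIABLES = "ABCDE"
CLIQUES = ["ABC", "BD", "CE", "DE"]   # assumed maximal cliques


def potential(values):
    """Hypothetical non-negative potential: 2 if all values agree, else 1."""
    return 2.0 if len(set(values)) == 1 else 1.0


def unnormalized(assignment):
    """Product of clique potentials for one full assignment."""
    score = 1.0
    for clique in CLIQUES:
        score *= potential(tuple(assignment[v] for v in clique))
    return score


def all_assignments():
    """Yield every joint assignment of the binary variables (2^5 = 32 here)."""
    for values in product([0, 1], repeat=len(VARIABLES)):
        yield dict(zip(VARIABLES, values))


Z = sum(unnormalized(x) for x in all_assignments())
p_all_zero = unnormalized(dict.fromkeys(VARIABLES, 0)) / Z
```

With n binary variables the loop visits 2^n assignments, which is what sampling, pseudo-likelihood and mean-field methods try to avoid.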

SLIDE 8

Chain Graphs

  • Generalization of MRFs and Bayes Nets
  • Structured as blocks
      • Undirected edges within a block
      • Directed edges between blocks
  • quite intractable
  • not very popular
  • used in BioMedical Engineering (text)

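Graphical Models

[Figure: diagram relating Chain Graphs, Undirected, and Directed models]

[Figure: graph over A, B, C. Directed: given. Undirected: ?]

[Figure: graph over A, B, C. Directed? Undirected?]

[Figure: two graphs over A, B, C, D]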

SLIDE 9

Summary

[Figure: diagram relating Chain Graphs, Undirected, and Directed models]

  • Graphical Models is a huge, evolving field
  • There are many other variations that haven’t been discussed
  • Used extensively in a variety of domains
  • Tractability issues
  • More work to be done!

Questions?