
Graphical Models: Kalman Filter, DBN, Undirected Models

ML 701, Anna Goldenberg


  1. Outline
     - Dynamic Models
     - Gaussian Linear Models
     - Kalman Filter
     - DBN
     - Undirected Models
     - Unification
     - Summary

     HMMs
     An HMM in short:
     - is a Bayes Net
     - satisfies the Markov property (future states are independent of past states given the present)
     - has discrete states (time steps are discrete)

     [Figure: chain of hidden states q_0, q_1, ..., q_t, ..., q_T, each emitting one observation O_0, O_1, ..., O_t, ..., O_T]

     $$P(Q, O) = p(q_0) \prod_{t=0}^{T-1} p(q_{t+1} \mid q_t) \prod_{t=0}^{T} p(O_t \mid q_t)$$

     What about continuous HMMs?
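As a rough illustration of the HMM factorization on this slide, the sketch below multiplies the prior, every transition term, and every emission term for one fixed state sequence. The two-state weather model, its state and observation names, and all probabilities are made up for the example and do not come from the lecture.

```python
def hmm_joint(states, observations, prior, transition, emission):
    """Joint probability P(Q, O) of one state path and one observation sequence."""
    if len(states) != len(observations):
        raise ValueError("need exactly one observation per hidden state")
    prob = prior[states[0]]                      # p(q_0)
    for prev, nxt in zip(states, states[1:]):    # p(q_{t+1} | q_t) for each transition
        prob *= transition[prev][nxt]
    for q, o in zip(states, observations):       # p(O_t | q_t) for each time step
        prob *= emission[q][o]
    return prob


# Hypothetical toy model (illustrative numbers only).
prior = {"rain": 0.5, "sun": 0.5}
transition = {
    "rain": {"rain": 0.7, "sun": 0.3},
    "sun": {"rain": 0.2, "sun": 0.8},
}
emission = {
    "rain": {"umbrella": 0.9, "none": 0.1},
    "sun": {"umbrella": 0.2, "none": 0.8},
}

joint = hmm_joint(
    ["rain", "rain", "sun"],
    ["umbrella", "umbrella", "none"],
    prior,
    transition,
    emission,
)
print(joint)
```

Because the states are discrete, every factor is a table lookup. The continuous case replaces these tables with densities, which is where the state space models on the following slides come in.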

  2. Example of use
     SLAM (Simultaneous Localization and Mapping): http://www.stanford.edu/~paskin/slam/
     Drawback: belief state and time grow quadratically in the number of landmarks.

     What about continuous HMMs?
     Gaussian Linear State Space models!

     State Space Models
     [Figure: chain of hidden states q_0, q_1, ..., q_t, ..., q_T, each emitting one observation O_0, O_1, ..., O_t, ..., O_T]
     - $q_t$ is a real-valued K-dimensional hidden state variable
     - $O_t$ is a D-dimensional real-valued observation vector

     $$P(Q, O) = p(q_0) \prod_{t=0}^{T-1} p(q_{t+1} \mid q_t) \prod_{t=0}^{T} p(O_t \mid q_t)$$

  3. State Space Models
     $$q_t = f(q_{t-1}) + w_t$$
     $$O_t = g(q_t) + v_t$$
     - $f$ determines the mean of $q_t$ given the mean of $q_{t-1}$
     - $w_t$ is a zero-mean random noise vector
     - $g$ and $v_t$ play the same roles for the observation $O_t$

     Gaussian Linear State Space Models
     - $O_t$ and $q_t$ are Gaussian
     - $f$ and $g$ are linear and time-invariant

     $$q_t = A q_{t-1} + w_t, \quad w_t \sim N(0, R)$$
     $$O_t = B q_t + v_t, \quad v_t \sim N(0, S)$$
     $$q_0 \sim N(0, \Sigma_0)$$
     - $A$ is the transition matrix
     - $B$ is the observation matrix
     Correction: R and S were previously reversed.

     Inference: Kalman Filter (1960)
     - forward step (filtering): $p(q_t \mid O_0, \ldots, O_t)$
     - backward step (smoothing): $p(q_t \mid O_t, O_{t+1}, \ldots, O_T)$
     [Figure: two-slice network with q_{t-1}, q_t and their observations O_{t-1}, O_t]

     Kalman Filter
     Below, $E(q_t \mid t-1)$ abbreviates $E(q_t \mid O_0, \ldots, O_{t-1})$, and likewise for $V$.

     Time update: $P(q_{t-1} \mid O_0, \ldots, O_{t-1}) \to P(q_t \mid O_0, \ldots, O_{t-1})$
     $$E(q_t \mid t-1) = A \, E(q_{t-1} \mid t-1)$$
     $$V(q_t \mid t-1) = A \, V(q_{t-1} \mid t-1) \, A^T + R$$

     Measurement update: $P(q_t \mid O_0, \ldots, O_{t-1}) \to P(q_t \mid O_0, \ldots, O_t)$
     1. Form the joint $P(q_t, O_t \mid O_0, \ldots, O_{t-1})$, which is Gaussian with
     $$\text{mean} = \begin{pmatrix} E(q_t \mid t-1) \\ B \, E(q_t \mid t-1) \end{pmatrix}, \quad \text{covariance} = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix} = \begin{pmatrix} V(q_t \mid t-1) & V(q_t \mid t-1) B^T \\ B \, V(q_t \mid t-1) & B \, V(q_t \mid t-1) B^T + S \end{pmatrix}$$
     2. Condition on $O_t$ to get $P(q_t \mid O_0, \ldots, O_t)$:
     $$E(q_t \mid t) = E(q_t \mid t-1) + \Sigma_{12} \Sigma_{22}^{-1} \left( O_t - E(O_t \mid t-1) \right), \quad E(O_t \mid t-1) = B \, E(q_t \mid t-1)$$
     $$V(q_t \mid t) = V(q_t \mid t-1) - \Sigma_{12} \Sigma_{22}^{-1} \Sigma_{21}$$
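The time update and measurement update can be sketched in a few lines for the scalar case, assuming the hidden state and the observation are both one-dimensional so that A, B, R and S are plain numbers. Here R is the transition noise variance (w_t ~ N(0, R)) and S is the observation noise variance (v_t ~ N(0, S)), as in the model definition. The function name `kalman_step` and the sample observations are invented for this example.

```python
def kalman_step(mean, var, obs, A, B, R, S):
    """One filtering step for a scalar linear-Gaussian state space model.

    mean, var: E(q_{t-1} | t-1) and V(q_{t-1} | t-1)
    obs:       the new observation O_t
    Returns E(q_t | t) and V(q_t | t).
    """
    # Time update: push the previous belief through the dynamics.
    pred_mean = A * mean                 # E(q_t | t-1)
    pred_var = A * var * A + R           # V(q_t | t-1)

    # Blocks of the joint Gaussian over (q_t, O_t) given past observations.
    sigma12 = pred_var * B               # scalar case: Sigma_21 equals Sigma_12
    sigma22 = B * pred_var * B + S

    # Measurement update: condition the joint Gaussian on O_t.
    gain = sigma12 / sigma22             # Sigma_12 * Sigma_22^{-1}
    new_mean = pred_mean + gain * (obs - B * pred_mean)
    new_var = pred_var - gain * sigma12
    return new_mean, new_var


# Hypothetical usage: filter a short, made-up observation sequence.
mean, var = 0.0, 1.0                     # assumed prior over the state
for obs in [1.2, 0.9, 1.1]:
    mean, var = kalman_step(mean, var, obs, A=1.0, B=1.0, R=0.01, S=0.5)
print(mean, var)
```

In the general K-dimensional case the divisions become matrix inverses and the products need transposes, exactly as in the slide's formulas. The scalar version is only meant to make the order of operations visible.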

  4. Example of use: Kalman Filter Usage
     - Tracking motion
       - Missiles
       - Hand motion
       - Lip motion from videos
     - Signal Processing
     - Navigation
     - Economics (for prediction)
     Reported by Welch and Bishop, SIGGRAPH 2001

     Dynamic Bayes Nets
     - So far: [Figure: chain of hidden states q_0, q_1, ..., q_T with observations O_0, O_1, ..., O_T]
     - But are there more appealing models?
     [Figure: DBN with variables Weather, Velocity, Location, Failure and Obs at time slices 0, 1, 2 (Koller and Friedman)]

     It's just a Bayes Net!
     Approach to the dynamics:
     1. Start with some prior for the initial state
     2. Predict the next state using only the observations up to the previous time step
     3. Incorporate the new observation and re-estimate the current state
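The three-step approach to the dynamics (prior, predict, incorporate the observation) can be sketched for a single discrete state variable. This is a deliberate simplification: a real DBN slice has several variables, which are collapsed here into one hypothetical "ok"/"failed" variable, and every probability below is invented for illustration.

```python
def predict(belief, transition):
    """Step 2: propagate the belief one time step through the transition model."""
    return {
        nxt: sum(belief[prev] * transition[prev][nxt] for prev in belief)
        for nxt in belief
    }


def update(predicted, likelihood):
    """Step 3: weight by the likelihood of the new observation and renormalize."""
    unnormalized = {s: predicted[s] * likelihood[s] for s in predicted}
    total = sum(unnormalized.values())
    return {s: w / total for s, w in unnormalized.items()}


# Step 1: a prior for the initial state (hypothetical numbers).
belief = {"ok": 0.9, "failed": 0.1}
transition = {
    "ok": {"ok": 0.95, "failed": 0.05},
    "failed": {"ok": 0.0, "failed": 1.0},
}
# Assumed probability of observing an alarm in each state.
alarm_likelihood = {"ok": 0.1, "failed": 0.8}

predicted = predict(belief, transition)
posterior = update(predicted, alarm_likelihood)
print(predicted, posterior)
```

Repeating `predict` and `update` for each new observation gives the filtering recursion. The Kalman filter follows the same pattern with Gaussians in place of tables.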

  5. Dynamic Bayes Nets (continued)
     [Figure: DBN with variables Weather, Velocity, Location, Failure and Obs at time slices 0, 1, 2]
     It's just a Bayes Net! Most importantly:
     - Use the structure of the Bayes Net.
     - Use the independencies!

     Other graphical models
     But first... any questions so far?

     Are all GMs directed?
     There are Undirected Graphical Models!

     Undirected models
     [Figure: undirected graph over nodes A, B, C, D, E]
     $$p(X) = \frac{1}{Z} \prod_{C} \psi(X_C)$$
     - $\psi(X_C)$ is a non-negative potential function
     - What are the sets $C$?

  6. Cliques
     [Figure: undirected graph over nodes A, B, C, D, E]
     $$p(X) = \frac{1}{Z} \prod_{C} \psi(X_C)$$
     - $\psi(X_C)$ is a non-negative potential function
     - A clique $C$ is a subset $C \subseteq V$ such that $(i, j) \in E$ for all distinct $i, j \in C$
     - $C$ is maximal if it is not contained in any other clique

     Questions:
     i) B: a clique?
     ii) BC: a maximal clique?
     iii) ABCD: a clique?
     iv) ABC: a maximal clique?
     v) BCDE: a clique?

     Decomposition
     Note to resolve the confusion: the most common machine learning notation is the decomposition over maximal cliques:
     $$p(A, B, C, D, E) = \frac{1}{Z} \, \psi(A, B, C) \, \psi(B, D) \, \psi(C, E) \, \psi(D, E)$$

     Independence
     Rule: $V_1$ is independent of $V_2$ given a cutset $S$.
     $S$ is called the Markov Blanket (MB). For a single node it is the set of its neighbors, e.g. MB(B) = {A, C, D}.
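The clique definitions can be checked mechanically on a small graph. The sketch below assumes the edge set A-B, A-C, B-C, B-D, C-E, D-E, which is inferred from the maximal-clique decomposition and from MB(B) = {A, C, D} on this slide, not read off the original figure. The helper names are invented for the example.

```python
from itertools import combinations

# Assumed edge set, inferred from the decomposition over maximal cliques.
NODES = {"A", "B", "C", "D", "E"}
EDGES = {("A", "B"), ("A", "C"), ("B", "C"), ("B", "D"), ("C", "E"), ("D", "E")}


def adjacent(u, v):
    """True if u and v share an undirected edge."""
    return (u, v) in EDGES or (v, u) in EDGES


def is_clique(nodes):
    """Every pair of distinct nodes in the set must be connected."""
    return all(adjacent(u, v) for u, v in combinations(sorted(nodes), 2))


def is_maximal_clique(nodes):
    """A clique that cannot be extended by any other node of the graph."""
    nodes = set(nodes)
    if not is_clique(nodes):
        return False
    return not any(is_clique(nodes | {v}) for v in NODES - nodes)


def markov_blanket(node):
    """In an undirected model the Markov blanket is the set of neighbors."""
    return {u for u in NODES if u != node and adjacent(u, node)}


for label in ["B", "BC", "ABCD", "ABC", "BCDE"]:
    print(label, is_clique(label), is_maximal_clique(label))
print(sorted(markov_blanket("B")))
```

Running the loop gives one way to check answers to questions i) through v) under the assumed edge set. A different edge set would of course change them.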

  7. Are undirected models useful?
     - Yes!
     - Used a lot in Physics (Ising model, Boltzmann machine)
     - In vision (every pixel is a node)
     - Bioinformatics
     - Why not more popular? Because of Z, the partition function:
     $$p(X) = \frac{1}{Z} \prod_{C} \psi(X_C)$$

     What's Z and ways to fight it
     $$Z = \sum_{\forall x} \prod_{C} \psi(x_C)$$
     - Approximations
       - Sampling (MCMC sampling is common)
       - Pseudo-Likelihood
       - Mean-field approximation

     Chain Graphs
     - Generalization of MRFs and Bayes Nets
     - Structured as blocks
     - Undirected edges within a block
     - Directed edges between blocks
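To see why Z is the bottleneck, the sketch below computes the partition function by brute force: it enumerates every joint assignment and sums the product of potentials. The three-node binary chain A-B-C and the `agree` potential are hypothetical and chosen only to keep the enumeration small.

```python
from itertools import product


def partition_function(variables, factors, domain=(0, 1)):
    """Sum the product of all potentials over every joint assignment."""
    z = 0.0
    for values in product(domain, repeat=len(variables)):
        assignment = dict(zip(variables, values))
        weight = 1.0
        for scope, psi in factors:
            weight *= psi(*(assignment[v] for v in scope))
        z += weight
    return z


def agree(a, b):
    """Hypothetical pairwise potential that favors equal neighbors."""
    return 2.0 if a == b else 1.0


variables = ["A", "B", "C"]
factors = [(("A", "B"), agree), (("B", "C"), agree)]

Z = partition_function(variables, factors)
# Probability of one configuration: its unnormalized weight divided by Z.
p_all_zero = agree(0, 0) * agree(0, 0) / Z
print(Z, p_all_zero)
```

The loop visits `len(domain) ** len(variables)` assignments, so exact enumeration is only feasible for tiny models. That exponential cost is what motivates sampling, pseudo-likelihood and mean-field approximations.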

  8. Chain Graphs (continued)
     - Quite intractable
     - Not very popular
     - Used in BioMedical Engineering (text)

     Graphical Models
     [Figure: diagram labeled Chain Graphs, Undirected and Directed, with example graphs over nodes A, B, C, D captioned "Undirected?" and "Directed?"]

  9. Summary
     - Graphical Models is a huge, evolving field
     - There are many other variations that haven't been discussed
     - Used extensively in a variety of domains
     - Tractability issues
     - More work to be done!
     [Figure: diagram labeled Chain Graphs, Undirected and Directed]

     Questions?
