Probabilistic Graphical Models


SLIDE 1

Introduction to Machine Learning

Probabilistic Graphical Models

Yifeng Tao
School of Computer Science, Carnegie Mellon University

Slides adapted from Eric Xing, Matt Gormley

SLIDE 2

Recap of Basic Probability Concepts

  • Representation: how do we represent the joint probability distribution on multiple binary variables?
  • State configurations in total: 2^8 = 256
  • Do they all need to be represented?
  • Do we get any scientific/medical insight?
  • Learning: where do we get all these probabilities?
  • Maximum-likelihood estimation?
  • Inference: if not all variables are observable, how do we compute the conditional distribution of latent variables given evidence?
  • Computing p(H|A) would require summing over all 2^6 = 64 configurations of the unobserved variables


[Slide from Eric Xing.]

SLIDE 3

Graphical Model: Structure Simplifies Representation

  • Dependencies among variables


[Slide from Eric Xing.]

SLIDE 4

Probabilistic Graphical Models

  • If the Xi’s are conditionally independent (as described by a PGM), the joint can be factored into a product of simpler terms, e.g.,

P(X1, ..., X8) = P(X1) P(X2) P(X3|X1) P(X4|X2) P(X5|X2) P(X6|X3,X4) P(X7|X6) P(X8|X5,X6)

  • Why might we favor a PGM?
  • Incorporation of domain knowledge and causal (logical) structures
  • 2 + 2 + 4 + 4 + 4 + 8 + 4 + 8 = 36, about a 7-fold reduction from 2^8 = 256 in representation cost! (See the sketch below.)


[Slide from Eric Xing.]
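
As a quick check of the counting above, here is a minimal Python sketch (the factor list mirrors the eight-variable factorization on this slide; the variable names are illustrative) comparing the factored representation cost against the full joint table:

```python
# Table sizes for the factored joint of 8 binary variables:
# P(X1) P(X2) P(X3|X1) P(X4|X2) P(X5|X2) P(X6|X3,X4) P(X7|X6) P(X8|X5,X6)
# A factor over a node with k parents needs a table of 2**(1 + k) entries.

factors = [("X1", 0), ("X2", 0), ("X3", 1), ("X4", 1),
           ("X5", 1), ("X6", 2), ("X7", 1), ("X8", 2)]  # (node, #parents)

factored_cost = sum(2 ** (1 + n_parents) for _, n_parents in factors)
full_joint_cost = 2 ** 8

print(factored_cost)                    # 36
print(full_joint_cost)                  # 256
print(full_joint_cost / factored_cost)  # ~7.1-fold reduction
```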

SLIDE 5

Two types of GMs

  • Directed edges give causality relationships (Bayesian Network or Directed Graphical Model)
  • Undirected edges simply give correlations between variables (Markov Random Field or Undirected Graphical Model)


[Slide from Eric Xing.]

SLIDE 6

Bayesian Network

  • Definition:
  • It consists of a graph G and the conditional probabilities P
  • These two parts fully specify the distribution:
  • Qualitative Specification: G
  • Quantitative Specification: P


[Slide from Eric Xing.]

SLIDE 7

Where does the qualitative specification come from?

  • Prior knowledge of causal relationships
  • Learning from data (i.e. structure learning)
  • We simply prefer a certain architecture (e.g. a layered graph)


[Slide from Matt Gormley.]

SLIDE 8

Quantitative Specification

  • Example: Conditional probability tables (CPTs) for discrete random variables


[Slide from Eric Xing.]

SLIDE 9

Quantitative Specification

  • Example: Conditional probability density functions (CPDs) for continuous random variables


[Slide from Eric Xing.]

SLIDE 10

Observed Variables

  • In a graphical model, shaded nodes are “observed”, i.e. their values are given


[Slide from Matt Gormley.]

SLIDE 11

GMs are your old friends

  • Density estimation
  • Parametric and nonparametric methods
  • Regression
  • Linear, conditional mixture, nonparametric
  • Classification
  • Generative and discriminative approach
  • Clustering


[Slide from Eric Xing.]

SLIDE 12

What Independencies does a Bayes Net Model?

  • Independence of X and Z given Y?

P(X|Y) P(Z|Y) = P(X, Z|Y)

  • Three cases of interest...
  • Proof?


[Slide from Matt Gormley.]
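
To see the equality P(X|Y) P(Z|Y) = P(X, Z|Y) concretely, here is a minimal sketch that checks it by brute-force enumeration on a chain X → Y → Z (one of the three cases of interest); the CPT numbers below are made-up assumptions:

```python
import itertools

# Hypothetical CPTs for a binary chain X -> Y -> Z (numbers are made up).
p_x = {0: 0.6, 1: 0.4}
p_y_given_x = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}  # p_y_given_x[x][y]
p_z_given_y = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.4, 1: 0.6}}  # p_z_given_y[y][z]

def joint(x, y, z):
    # Joint factorizes along the chain: P(x) P(y|x) P(z|y).
    return p_x[x] * p_y_given_x[x][y] * p_z_given_y[y][z]

for y in (0, 1):
    p_y = sum(joint(x, y, z) for x, z in itertools.product((0, 1), repeat=2))
    for x, z in itertools.product((0, 1), repeat=2):
        p_xz_given_y = joint(x, y, z) / p_y
        p_x_given_y = sum(joint(x, y, zz) for zz in (0, 1)) / p_y
        p_z_given_y_val = sum(joint(xx, y, z) for xx in (0, 1)) / p_y
        assert abs(p_xz_given_y - p_x_given_y * p_z_given_y_val) < 1e-12

print("P(X,Z|Y) = P(X|Y) P(Z|Y) holds for the chain")
```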

SLIDE 13

The “Burglar Alarm” example

  • Your house has a twitchy burglar alarm that is also sometimes triggered by earthquakes.
  • Earth arguably doesn’t care whether your house is currently being burgled.
  • While you are on vacation, one of your neighbors calls and tells you your home’s burglar alarm is ringing.


[Slide from Matt Gormley.]
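
To make the story concrete, here is a small sketch that computes the posterior probability of a burglary given that the alarm is ringing, by summing the unobserved earthquake variable out of the joint. All CPT numbers are illustrative assumptions, not from the lecture:

```python
# Hypothetical CPTs for the burglar-alarm network B -> A <- E.
p_b, p_e = 0.001, 0.002              # P(Burglary = 1), P(Earthquake = 1)
p_a = {(0, 0): 0.001, (0, 1): 0.29,
       (1, 0): 0.94,  (1, 1): 0.95}  # P(Alarm = 1 | B, E)

def pr_b(b): return p_b if b else 1 - p_b
def pr_e(e): return p_e if e else 1 - p_e

# P(B = 1 | A = 1): Bayes' rule, marginalizing the unobserved E.
num = sum(pr_b(1) * pr_e(e) * p_a[(1, e)] for e in (0, 1))
den = sum(pr_b(b) * pr_e(e) * p_a[(b, e)] for b in (0, 1) for e in (0, 1))
print(num / den)  # ~0.374 with these made-up numbers
```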

SLIDE 14

Markov Blanket

  • Def: the co-parents of a node are the parents of its children
  • Def: the Markov blanket of a node is the set containing the node’s parents, children, and co-parents.
  • Thm: a node is conditionally independent of every other node in the graph given its Markov blanket
  • Example: the Markov blanket of X6 is {X3, X4, X5, X8, X9, X10}


[Slide from Matt Gormley.]
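
Here is a minimal sketch of the definition; the edge list is a hypothetical DAG chosen only so that the blanket of X6 matches the example above:

```python
# Markov blanket = parents ∪ children ∪ co-parents (other parents of children).
def markov_blanket(node, edges):
    parents = {u for (u, v) in edges if v == node}
    children = {v for (u, v) in edges if u == node}
    coparents = {u for (u, v) in edges if v in children and u != node}
    return parents | children | coparents

# Hypothetical DAG consistent with the slide's example (edges are assumptions).
edges = [("X3", "X6"), ("X4", "X6"),   # parents of X6
         ("X6", "X8"), ("X6", "X9"),   # children of X6
         ("X5", "X8"), ("X10", "X9"),  # co-parents via X8 and X9
         ("X1", "X3"), ("X2", "X4")]   # unrelated structure

print(sorted(markov_blanket("X6", edges)))
# ['X10', 'X3', 'X4', 'X5', 'X8', 'X9']
```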

SLIDE 15

Markov Blanket

  • Example: the Markov blanket of X6 is {X3, X4, X5, X8, X9, X10}


[Slide from Matt Gormley.]

SLIDE 16

D-Separation

  • Thm: if variables X and Z are d-separated given a set of variables E, then X and Z are conditionally independent given the set E
  • Definition: variables X and Z are d-separated given a set of evidence variables E iff every path from X to Z is “blocked”.


[Slide from Matt Gormley.]

SLIDE 17

D-Separation

  • Variables X and Z are d-separated given a set of evidence variables E iff every path from X to Z is “blocked”.


[Slide from Eric Xing.]

SLIDE 18

Machine Learning


[Slide from Matt Gormley.]

SLIDE 19

Recipe for Closed-form MLE


[Slide from Matt Gormley.]

SLIDE 20

Learning Fully Observed BNs

  • How do we learn these conditional and marginal distributions for a Bayes net?


[Slide from Matt Gormley.]

SLIDE 21

Learning Fully Observed BNs

  • Learning this fully observed Bayesian network is equivalent to learning five (small/simple) independent networks from the same data


[Slide from Matt Gormley.]
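
For a fully observed network, the closed-form MLE of each CPT is just normalized counts, estimated independently per node. A minimal sketch (the toy samples and the X1 → X3 fragment are assumptions for illustration):

```python
from collections import Counter

# MLE for one CPT P(child | parents) from fully observed samples.
# Each sample is a dict mapping variable name -> observed value.
def mle_cpt(samples, child, parents):
    joint_counts = Counter((tuple(s[p] for p in parents), s[child]) for s in samples)
    parent_counts = Counter(tuple(s[p] for p in parents) for s in samples)
    return {(pa, c): n / parent_counts[pa] for (pa, c), n in joint_counts.items()}

# Toy fully observed data for a network fragment X1 -> X3 (values made up).
samples = [{"X1": 0, "X3": 0}, {"X1": 0, "X3": 1},
           {"X1": 1, "X3": 1}, {"X1": 1, "X3": 1}]
print(mle_cpt(samples, "X3", ["X1"]))
# {((0,), 0): 0.5, ((0,), 1): 0.5, ((1,), 1): 1.0}
```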

SLIDE 22

Learning Fully Observed BNs


[Slide from Matt Gormley.]

SLIDE 23

Learning Partially Observed BNs

  • Partially Observed Bayesian Network:
  • Maximum-likelihood estimation → incomplete log-likelihood
  • The log-likelihood contains unobserved latent variables
  • Solve with EM algorithm
  • Example: Gaussian Mixture Models (GMMs)


[Slide from Eric Xing.]
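
A bare-bones EM sketch for a one-dimensional, two-component GMM, showing where the E-step (responsibilities) and M-step (weighted MLE updates) live; the data and initializations are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D data from two Gaussian clusters (generated here for illustration).
x = np.concatenate([rng.normal(-2.0, 1.0, 200), rng.normal(3.0, 1.0, 200)])

# Initial parameters: mixing weights, means, variances (assumed starting values).
weights = np.array([0.5, 0.5])
mu = np.array([-1.0, 1.0])
var = np.array([1.0, 1.0])

for _ in range(50):
    # E-step: responsibilities = posterior P(component | x_i) under current params.
    dens = weights * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
    resp = dens / dens.sum(axis=1, keepdims=True)
    # M-step: weighted maximum-likelihood updates given the responsibilities.
    nk = resp.sum(axis=0)
    weights = nk / len(x)
    mu = (resp * x[:, None]).sum(axis=0) / nk
    var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk

print(weights, mu, var)  # should land near 0.5/0.5, with means near -2 and 3
```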

SLIDE 24

Inference of BNs

  • Suppose we already have the parameters of a Bayesian Network...


[Slide from Matt Gormley.]

SLIDE 25

Approaches to inference

  • Exact inference algorithms
  • The elimination algorithm → message passing
  • Belief propagation
  • The junction tree algorithms
  • Approximate inference techniques
  • Variational algorithms
  • Stochastic simulation / sampling methods
  • Markov chain Monte Carlo methods


[Slide from Eric Xing.]

SLIDE 26

Marginalization and Elimination


[Slide from Eric Xing.]

SLIDE 27

Marginalization and Elimination


[Slide from Eric Xing.]

SLIDE 28


[Slide from Eric Xing.]

SLIDE 29
  • Step 8: Wrap-up


[Slide from Eric Xing.]

SLIDE 30

Elimination algorithm

  • Elimination on trees is equivalent to message passing on branches
  • Message-passing is consistent in trees
  • Application: HMM


[Slide from Eric Xing.]
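
To ground the equivalence, here is a minimal sketch of elimination on a binary chain X1 → X2 → X3 → X4: summing out each variable in order is exactly a message (vector) multiplied into the next CPT (matrix). The transition tables are made-up assumptions, and a brute-force enumeration confirms the result:

```python
import itertools
import numpy as np

# Chain X1 -> X2 -> X3 -> X4 over binary variables.
# p0 is P(X1); each T[i] is the CPT P(X_{i+1} | X_i) as a row-stochastic matrix.
p0 = np.array([0.7, 0.3])
T = [np.array([[0.9, 0.1], [0.4, 0.6]]),
     np.array([[0.8, 0.2], [0.3, 0.7]]),
     np.array([[0.6, 0.4], [0.5, 0.5]])]

# Eliminate X1, then X2, then X3: each step sums out one variable,
# i.e., a message (vector) times a CPT (matrix).
message = p0
for cpt in T:
    message = message @ cpt  # sum_x message[x] * P(next | x)
print(message)  # marginal P(X4)

# Brute-force check: sum the full joint over all upstream configurations.
p4 = np.zeros(2)
for x1, x2, x3, x4 in itertools.product((0, 1), repeat=4):
    p4[x4] += p0[x1] * T[0][x1, x2] * T[1][x2, x3] * T[2][x3, x4]
print(p4)       # matches the elimination result
```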

SLIDE 31

Gibbs Sampling


[Slide from Matt Gormley.]

SLIDE 32

Gibbs Sampling


[Slide from Matt Gormley.]

SLIDE 33

Gibbs Sampling


[Slide from Matt Gormley.]

SLIDE 34

Gibbs Sampling

  • Full conditionals only need to condition on the Markov blanket
  • Must be “easy” to sample from the conditionals
  • Many conditionals are log-concave and are amenable to adaptive rejection sampling


[Slide from Matt Gormley.]
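
A minimal Gibbs sampler for the burglar-alarm network sketched earlier (same hypothetical CPTs): with the alarm clamped on, each full conditional involves only that variable's Markov blanket, and the sample average estimates the posterior probability of burglary:

```python
import random

random.seed(0)

# Same hypothetical CPTs as in the burglar-alarm sketch above.
p_b, p_e = 0.001, 0.002
p_a = {(0, 0): 0.001, (0, 1): 0.29, (1, 0): 0.94, (1, 1): 0.95}

def bernoulli(p):
    return 1 if random.random() < p else 0

# Estimate P(B = 1 | A = 1) by Gibbs sampling over B and E with A clamped to 1.
b, e = 0, 0
count_b, n_iters, burn_in = 0, 50_000, 1_000
for t in range(n_iters):
    # Full conditional of B given its Markov blanket {A, E}:
    # P(B = 1 | A = 1, e) ∝ P(B = 1) * P(A = 1 | B = 1, e)
    w1 = p_b * p_a[(1, e)]
    w0 = (1 - p_b) * p_a[(0, e)]
    b = bernoulli(w1 / (w1 + w0))
    # Full conditional of E given its Markov blanket {A, B}:
    w1 = p_e * p_a[(b, 1)]
    w0 = (1 - p_e) * p_a[(b, 0)]
    e = bernoulli(w1 / (w1 + w0))
    if t >= burn_in:
        count_b += b

print(count_b / (n_iters - burn_in))  # ~ the exact posterior from the enumeration sketch
```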

SLIDE 35

Take home message

  • Graphical models portray the sparse dependencies among variables
  • Two types of graphical models: Bayesian networks and Markov random fields

  • Conditional independence, Markov blanket, and d-separation
  • Learning fully observed and partially observed Bayesian networks
  • Exact inference and approximate inference of Bayesian networks


SLIDE 36

References

  • Eric Xing, Ziv Bar-Joseph. 10701 Introduction to Machine Learning: http://www.cs.cmu.edu/~epxing/Class/10701/
  • Matt Gormley. 10601 Introduction to Machine Learning: http://www.cs.cmu.edu/~mgormley/courses/10601/index.html
