slide-1
SLIDE 1

Belief Propagation

10-418 / 10-618 Machine Learning for Structured Data
Machine Learning Department
School of Computer Science
Carnegie Mellon University

Matt Gormley
Lecture 9
Sep. 25, 2019

slide-2
SLIDE 2

Q&A

Q: What if I already answered a homework question using different assumptions than what was clarified in a Piazza note?

A: Just write down the assumptions you made. We will usually give credit so long as your assumptions are clear in the writeup and your answer is correct under those assumptions. (Obviously, this only applies to underspecified / ambiguous questions. You can't just add arbitrary assumptions!)
slide-3
SLIDE 3

Reminders

  • Homework 1: DAgger for seq2seq
    – Out: Thu, Sep. 12
    – Due: Thu, Sep. 26 at 11:59pm
  • Homework 2: Labeling Syntax Trees
    – Out: Thu, Sep. 26
    – Due: Thu, Oct. 10 at 11:59pm

slide-4
SLIDE 4

Variable Elimination Complexity

In-Class Exercise: Fill in the blanks.

Brute-force (naïve) inference is O(____). Variable elimination is O(____),

where:
  • n = # of variables
  • k = max # of values a variable can take
  • r = # of variables participating in the largest "intermediate" table

Instead of brute force, capitalize on the factorization of p(x).

[Figure: factor graph over X1–X5 with factors ψ12, ψ13, ψ23, ψ234, ψ45, ψ5]
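To make the gap concrete, here is a minimal sketch (not from the slides) comparing the two approaches on a small chain model with made-up pairwise potentials: brute force touches all k^n joint assignments, while variable elimination does n sums over tables of size k^2.

```python
import itertools
import numpy as np

# Unnormalized chain model p(x) ∝ ∏ psi[i](x_i, x_{i+1}),
# n variables, each taking k values (potentials are made up).
rng = np.random.default_rng(0)
n, k = 6, 3
psi = [rng.uniform(0.5, 2.0, size=(k, k)) for _ in range(n - 1)]

# Brute force: O(k^n) terms -- enumerate every joint assignment.
Z_brute = 0.0
for x in itertools.product(range(k), repeat=n):
    p = 1.0
    for i in range(n - 1):
        p *= psi[i][x[i], x[i + 1]]
    Z_brute += p

# Variable elimination: O(n * k^2) -- sum out one variable at a time.
# tau[x_{i+1}] = sum_{x_i} tau[x_i] * psi_i(x_i, x_{i+1})
tau = np.ones(k)
for i in range(n - 1):
    tau = tau @ psi[i]      # k-vector times k x k table
Z_ve = tau.sum()

print(Z_brute, Z_ve)        # identical up to floating point
```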

slide-5
SLIDE 5

Exact Inference

Variable Elimination

  • Uses
    – Computes the partition function of any factor graph
    – Computes the marginal probability of a query variable in any factor graph
  • Limitations
    – Only computes the marginal for one variable at a time (i.e. you need to re-run variable elimination for each variable if you need them all)
    – Elimination order affects runtime

Belief Propagation

  • Uses
    – Computes the partition function of any acyclic factor graph
    – Computes all marginal probabilities of factors and variables at once, for any acyclic factor graph
  • Limitations
    – Only exact on acyclic factor graphs (though we'll consider its "loopy" variant later)
    – Message passing order affects runtime (but the obvious topological ordering always works best)

slide-6
SLIDE 6

MESSAGE PASSING


slide-7
SLIDE 7

Great Ideas in ML: Message Passing

Count the soldiers

[Figure: six soldiers in single file; each passes "N behind you" messages up the line ("1 behind you" … "5 behind you") and "N before you" messages down the line ("1 before you" … "5 before you"), plus "there's 1 of me"]

adapted from MacKay (2003) textbook

slide-8
SLIDE 8

Great Ideas in ML: Message Passing

Count the soldiers

[Figure: one soldier hears "3 behind you" and "2 before you", and knows "there's 1 of me"]

"I only see my incoming messages (2, 1, 3)."

Belief: Must be 2 + 1 + 3 = 6 of us.

adapted from MacKay (2003) textbook

slide-9
SLIDE 9

Great Ideas in ML: Message Passing

Count the soldiers

[Figure: the next soldier hears "4 behind you" and "1 before you", and knows "there's 1 of me"]

"I only see my incoming messages (1, 1, 4)."

Belief: Must be 1 + 1 + 4 = 6 of us.

adapted from MacKay (2003) textbook

slide-10
SLIDE 10

Great Ideas in ML: Message Passing

Each soldier receives reports from all branches of the tree.

[Figure: tree of soldiers; reports "7 here" and "3 here" combine with "1 of me" into "11 here (= 7 + 3 + 1)"]

adapted from MacKay (2003) textbook

slide-11
SLIDE 11

Great Ideas in ML: Message Passing

Each soldier receives reports from all branches of the tree.

[Figure: a soldier combines "3 here" and "3 here" with himself into "7 here (= 3 + 3 + 1)"]

adapted from MacKay (2003) textbook

slide-12
SLIDE 12

Great Ideas in ML: Message Passing

Each soldier receives reports from all branches of the tree.

[Figure: the tree again; another soldier combines "7 here" and "3 here" with himself into "11 here (= 7 + 3 + 1)"]

adapted from MacKay (2003) textbook

slide-13
SLIDE 13

Great Ideas in ML: Message Passing

Each soldier receives reports from all branches of the tree.

[Figure: a soldier combines reports "7 here", "3 here", and "3 here" with himself. Belief: Must be 14 of us (= 7 + 3 + 3 + 1)]

adapted from MacKay (2003) textbook

slide-14
SLIDE 14

Great Ideas in ML: Message Passing

Each soldier receives reports from all branches of the tree.

[Figure: the same tree. Belief: Must be 14 of us]

This wouldn't work correctly with a "loopy" (cyclic) graph.

adapted from MacKay (2003) textbook

slide-15
SLIDE 15

SUM-PRODUCT BELIEF PROPAGATION

Exact marginal inference for factor trees


slide-16
SLIDE 16

Message Passing in Belief Propagation

[Figure: a variable X and a factor Ψ exchanging messages along a chain]

"My other factors think I'm a noun." "But my other variables and I think you're a verb."

Messages over tags (v, n, a): (1, 6, 3) and (6, 1, 3); their pointwise product is (6, 6, 9).

Both of these messages judge the possible values of variable X. Their product = belief at X = product of all 3 messages to X.

slide-17
SLIDE 17

Sum-Product Belief Propagation

The four quantities of sum-product BP, for variables i and factors α (N(·) denotes neighbors in the factor graph):

Variable belief:   b_i(x_i) = ∏_{α ∈ N(i)} μ_{α→i}(x_i)
Variable message:  μ_{i→α}(x_i) = ∏_{β ∈ N(i) \ {α}} μ_{β→i}(x_i)
Factor belief:     b_α(x_α) = ψ_α(x_α) · ∏_{i ∈ N(α)} μ_{i→α}(x_i)
Factor message:    μ_{α→i}(x_i) = Σ_{x_α : x_α[i] = x_i} ψ_α(x_α) · ∏_{j ∈ N(α) \ {i}} μ_{j→α}(x_j)

[Figure: example graphs illustrating each quantity]

slide-18
SLIDE 18

Sum-Product Belief Propagation: Variable Belief

[Figure: variable X1 with neighboring factors ψ1, ψ2, ψ3]

Incoming messages to X1, over tags (v, n, p): (0.1, 3, 1), (1, 2, 2), and (4, 1, …).

Belief at X1 = pointwise product of the incoming messages: (0.4, 6, …).

slide-19
SLIDE 19

Sum-Product Belief Propagation: Variable Message

[Figure: X1 sends a message to one of its neighboring factors]

Incoming messages to X1 from its other factors, over (v, n, p): (0.1, 3, 1) and (1, 2, 2).

Outgoing message from X1 = pointwise product of those messages: (0.1, 6, 2).
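A numeric check of the two slides above, as a minimal numpy sketch. Which factor sent which message is not recoverable from the transcript, so the labels below are illustrative, and the p-entry of the third message (illegible above) is a made-up placeholder:

```python
import numpy as np

# Messages over tags (v, n, p), read off the slides where legible.
msg_a = np.array([0.1, 3.0, 1.0])   # one incoming message to X1
msg_b = np.array([1.0, 2.0, 2.0])   # another incoming message to X1
msg_c = np.array([4.0, 1.0, 2.0])   # third message; p-entry 2.0 is assumed

# Variable belief: pointwise product of ALL incoming messages.
belief = msg_a * msg_b * msg_c
print(belief)       # [0.4, 6.0, ...]: v and n entries match the slide

# Variable-to-factor message: product of all incoming messages EXCEPT
# the one from the recipient factor (here, the factor that sent msg_c).
msg_out = msg_a * msg_b
print(msg_out)      # [0.1, 6.0, 2.0]: matches the slide
```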

slide-20
SLIDE 20

Sum-Product Belief Propagation: Factor Belief

[Figure: factor ψ1 joining X1 (values v, n) and X3 (values p, d, n)]

Incoming messages: μ_{X1→ψ1} = (v 8, n 0.2); μ_{X3→ψ1} = (p 4, d 1, n …).

Potential table ψ1(x1, x3), rows x3 ∈ (p, d, n), columns x1 ∈ (v, n):
  p: (0.1, 8)
  d: (3, …)
  n: (1, 1)

Factor belief b(x1, x3) = ψ1(x1, x3) · μ_{X1→ψ1}(x1) · μ_{X3→ψ1}(x3), e.g.:
  b(v, p) = 0.1 · 8 · 4 = 3.2
  b(n, p) = 8 · 0.2 · 4 = 6.4
  b(v, d) = 3 · 8 · 1 = 24

slide-21
SLIDE 21

Sum-Product Belief Propagation: Factor Belief

[Figure: the resulting belief table at ψ1: b(v, p) = 3.2, b(n, p) = 6.4, b(v, d) = 24; the remaining entries are not legible]

slide-22
SLIDE 22

Sum-Product Belief Propagation: Factor Message

[Figure: ψ1 sends a message to X3]

With μ_{X1→ψ1} = (v 8, n 0.2), the message from ψ1 to X3 sums out X1:
  p: 0.1 · 8 + 8 · 0.2 = 0.8 + 1.6
  d: 3 · 8 + 0 · 0.2 = 24 + 0
  n: 1 · 8 + 1 · 0.2 = 8 + 0.2

slide-23
SLIDE 23

Sum-Product Belief Propagation: Factor Message

[Figure: ψ1 between X1 and X3]

μ_{ψ1→X3}(x3) = Σ_{x1} ψ1(x1, x3) · μ_{X1→ψ1}(x1)

This is a matrix-vector product (for a binary factor).
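The factor-belief and factor-message slides above, reproduced as a hedged numpy sketch. ψ1(n, d) = 0 is inferred from the "24 + 0" term on the message slide, and the illegible n-entry of X3's incoming message is a made-up placeholder:

```python
import numpy as np

# Potential table ψ1(x1, x3): rows index X1 in (v, n), columns X3 in (p, d, n).
psi1 = np.array([[0.1, 3.0, 1.0],     # x1 = v
                 [8.0, 0.0, 1.0]])    # x1 = n  (ψ1(n, d) = 0 inferred)

msg_x1 = np.array([8.0, 0.2])         # message X1 -> ψ1 over (v, n)
# The n-entry of X3's message is illegible; 1.0 is a made-up placeholder.
msg_x3 = np.array([4.0, 1.0, 1.0])    # message X3 -> ψ1 over (p, d, n)

# Factor belief: ψ times the outer product of all incoming messages.
belief = psi1 * np.outer(msg_x1, msg_x3)
print(belief)     # belief[v, p] = 3.2, belief[n, p] = 6.4, belief[v, d] = 24

# Factor-to-variable message: sum out every other variable.
# For a binary factor this is exactly a matrix-vector product.
msg_to_x3 = msg_x1 @ psi1
print(msg_to_x3)  # [2.4, 24.0, 8.2]
```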

slide-24
SLIDE 24

Sum-Product Belief Propagation

Input: a factor graph with no cycles
Output: exact marginals for each variable and factor

Algorithm (a code sketch follows below):
1. Initialize the messages to the uniform distribution.
2. Choose a root node.
3. Send messages from the leaves to the root.
4. Send messages from the root to the leaves.
5. Compute the beliefs (unnormalized marginals).
6. Normalize the beliefs and return the exact marginals.
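A minimal end-to-end sketch of these six steps on a made-up three-variable tree (X1 - f12 - X2 - f23 - X3, plus a unary factor f2 on X2); all names, potentials, and the hand-rolled schedule are illustrative, not from the lecture:

```python
import numpy as np
from itertools import product

k = 2  # each variable takes k values
rng = np.random.default_rng(1)
factors = {                   # factor name -> (scope, potential table)
    "f12": (("X1", "X2"), rng.uniform(0.5, 2.0, (k, k))),
    "f23": (("X2", "X3"), rng.uniform(0.5, 2.0, (k, k))),
    "f2":  (("X2",),      rng.uniform(0.5, 2.0, (k,))),
}

def factor_to_var(f, target, var_msgs):
    """Multiply factor f by its incoming messages, sum out all but `target`."""
    scope, table = factors[f]
    axes = {v: i for i, v in enumerate(scope)}
    out = np.zeros(k)
    for assign in product(range(k), repeat=len(scope)):
        w = table[assign]
        for v in scope:
            if v != target:
                w *= var_msgs[(v, f)][assign[axes[v]]]
        out[assign[axes[target]]] += w
    return out

# Step 1: initialize variable-to-factor messages to (unnormalized) uniform.
vm = {(v, f): np.ones(k) for f, (scope, _) in factors.items() for v in scope}
fm = {}
# Steps 2-3: choose X2 as root; leaves send messages toward it.
fm[("f12", "X2")] = factor_to_var("f12", "X2", vm)
fm[("f23", "X2")] = factor_to_var("f23", "X2", vm)
fm[("f2", "X2")]  = factors["f2"][1]
# Step 4: root sends messages back toward the leaves.
vm[("X2", "f12")] = fm[("f23", "X2")] * fm[("f2", "X2")]
vm[("X2", "f23")] = fm[("f12", "X2")] * fm[("f2", "X2")]
fm[("f12", "X1")] = factor_to_var("f12", "X1", vm)
fm[("f23", "X3")] = factor_to_var("f23", "X3", vm)
# Steps 5-6: beliefs are products of incoming messages; normalize.
incoming = {"X1": [fm[("f12", "X1")]],
            "X2": [fm[("f12", "X2")], fm[("f23", "X2")], fm[("f2", "X2")]],
            "X3": [fm[("f23", "X3")]]}
for v, msgs in incoming.items():
    b = np.prod(msgs, axis=0)
    print(v, b / b.sum())     # exact marginals
```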


slide-27
SLIDE 27

FORWARD-BACKWARD AS SUM-PRODUCT BP

slide-28
SLIDE 28

CRF Tagging Model

[Figure: chain CRF with variables X1, X2, X3 over the sentence "find preferred tags"; "find" could be verb or noun, "preferred" could be adjective or verb, "tags" could be noun or verb]

slide-29
SLIDE 29

CRF Tagging by Belief Propagation

  • Forward-backward is a message passing algorithm.
  • It's the simplest case of belief propagation.

Forward algorithm = message passing (matrix-vector products)
Backward algorithm = message passing (matrix-vector products)

[Figure: at one position of the chain over "find preferred tags", forward (α) and backward (β) messages such as (v 2, n 1, a 7), (v 3, n 1, a 6), and (v 7, n 2, a 1) combine into beliefs such as (v 1.8, n 0, a 4.2); the transition potential over tags (v, n, a) is ψ(v, ·) = (0, 2, 1), ψ(n, ·) = (2, 1, 0), ψ(a, ·) = (0, 3, 1)]
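A sketch of forward-backward as matrix-vector message passing on the three-word chain. The transition table is the one shown on the slide; the unary potentials are made up (and START/END factors are folded into the boundary conditions), since they are not legible in the transcript:

```python
import numpy as np

tags = ["v", "n", "a"]
trans = np.array([[0.0, 2.0, 1.0],    # ψ(v, ·)   from the slide
                  [2.0, 1.0, 0.0],    # ψ(n, ·)
                  [0.0, 3.0, 1.0]])   # ψ(a, ·)
unary = np.array([[3.0, 1.0, 2.0],    # ψ_1(·) for "find"      (assumed)
                  [1.0, 4.0, 2.0],    # ψ_2(·) for "preferred" (assumed)
                  [2.0, 5.0, 1.0]])   # ψ_3(·) for "tags"      (assumed)
n = len(unary)

# Forward pass: alpha[i] = total weight of path prefixes ending at each tag.
alpha = np.zeros((n, 3))
alpha[0] = 1.0
for i in range(1, n):
    alpha[i] = (alpha[i - 1] * unary[i - 1]) @ trans

# Backward pass: beta[i] = total weight of path suffixes from each tag.
beta = np.zeros((n, 3))
beta[-1] = 1.0
for i in range(n - 2, -1, -1):
    beta[i] = trans @ (unary[i + 1] * beta[i + 1])

# Belief at position i: alpha * unary * beta; Z is the same at every i.
belief = alpha * unary * beta
Z = belief[0].sum()
print(belief / Z)   # each row sums to 1: the marginals p(X_i = tag)
```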

slide-30
SLIDE 30

So Let's Review Forward-Backward …

[Figure: the chain CRF over "find preferred tags" again, with the same tag ambiguities]

slide-31
SLIDE 31

So Let's Review Forward-Backward …

  • Show the possible values for each variable.

[Figure: lattice with a column of tags (v, n, a) for each of X1, X2, X3 over "find preferred tags", plus START and END states]

slide-32
SLIDE 32

So Let's Review Forward-Backward …

  • Let's show the possible values for each variable.
  • One possible assignment.

[Figure: the same lattice, with one path (one value per variable) highlighted]

slide-33
SLIDE 33

So Let's Review Forward-Backward …

  • Let's show the possible values for each variable.
  • One possible assignment.
  • And what the 7 factors think of it …

[Figure: the same lattice; the 7 factors along the highlighted path are shown]

slide-34
SLIDE 34

Viterbi Algorithm: Most Probable Assignment

[Figure: the lattice, with the path START → v → a → n → END highlighted]

  • So p(v a n) = (1/Z) * product of 7 numbers
  • Numbers associated with edges and nodes of the path
  • Most probable assignment = path with highest product

p(v a n) = (1/Z) · ψ{0,1}(START, v) · ψ{1}(v) · ψ{1,2}(v, a) · ψ{2}(a) · ψ{2,3}(a, n) · ψ{3}(n) · ψ{3,4}(n, END)

slide-35
SLIDE 35

Viterbi Algorithm: Most Probable Assignment

[Figure: the same lattice and path]

  • So p(v a n) = (1/Z) * product weight of one path

p(v a n) = (1/Z) · ψ{0,1}(START, v) · ψ{1}(v) · ψ{1,2}(v, a) · ψ{2}(a) · ψ{2,3}(a, n) · ψ{3}(n) · ψ{3,4}(n, END)

slide-36
SLIDE 36

Forward-Backward Algorithm: Finds Marginals

[Figure: the lattice over "find preferred tags"]

  • So p(v a n) = (1/Z) * product weight of one path
  • Marginal probability p(X2 = a) = (1/Z) * total weight of all paths through a

slide-37
SLIDE 37

Forward-Backward Algorithm: Finds Marginals

[Figure: the lattice]

  • So p(v a n) = (1/Z) * product weight of one path
  • Marginal probability p(X2 = n) = (1/Z) * total weight of all paths through n

slide-38
SLIDE 38

Forward-Backward Algorithm: Finds Marginals

[Figure: the lattice]

  • So p(v a n) = (1/Z) * product weight of one path
  • Marginal probability p(X2 = v) = (1/Z) * total weight of all paths through v

slide-39
SLIDE 39

Forward-Backward Algorithm: Finds Marginals

[Figure: the lattice]

  • So p(v a n) = (1/Z) * product weight of one path
  • Marginal probability p(X2 = n) = (1/Z) * total weight of all paths through n

slide-40
SLIDE 40

Forward-Backward Algorithm: Finds Marginals

[Figure: the lattice]

α2(n) = total weight of these path prefixes (found by dynamic programming: matrix-vector products)

slide-41
SLIDE 41

Forward-Backward Algorithm: Finds Marginals

[Figure: the lattice]

β2(n) = total weight of these path suffixes (found by dynamic programming: matrix-vector products)

slide-42
SLIDE 42

Forward-Backward Algorithm: Finds Marginals

[Figure: the lattice]

α2(n) = total weight of these path prefixes
β2(n) = total weight of these path suffixes

(a + b + c)(x + y + z) = ax + ay + az + bx + by + bz + cx + cy + cz

The product of the two sums gives the total weight of all paths.

slide-43
SLIDE 43

Forward-Backward Algorithm: Finds Marginals

[Figure: the lattice]

total weight of all paths through n = α2(n) × ψ{2}(n) × β2(n) = "belief that X2 = n"

Oops! The weight of a path through a state also includes a weight at that state. So α2(n) · β2(n) isn't enough. The extra weight is the opinion of the unigram factor at this variable.

slide-44
SLIDE 44

Forward-Backward Algorithm: Finds Marginals

[Figure: the lattice]

total weight of all paths through n = α2(n) × ψ{2}(n) × β2(n) = "belief that X2 = n"
total weight of all paths through v = α2(v) × ψ{2}(v) × β2(v) = "belief that X2 = v"

slide-45
SLIDE 45

Forward-Backward Algorithm: Finds Marginals

[Figure: the lattice]

total weight of all paths through a = α2(a) × ψ{2}(a) × β2(a) = "belief that X2 = a"

Collecting the beliefs for X2: (v 1.8, n 0, a 4.2). Their sum = Z (the total probability of all paths) = 6. Divide by Z = 6 to get the marginal probabilities: (v 0.3, n 0, a 0.7).

slide-46
SLIDE 46

BP AS DYNAMIC PROGRAMMING


slide-47
SLIDE 47

(Acyclic) Belief Propagation

In a factor graph with no cycles:
  1. Pick any node to serve as the root.
  2. Send messages from the leaves to the root.
  3. Send messages from the root to the leaves.

A node computes an outgoing message along an edge only after it has received incoming messages along all its other edges.

[Figure: factor graph for the sentence "time flies like an arrow", with variables X1–X9 and factors ψ1–ψ13]

slide-49
SLIDE 49

Acyclic BP as Dynamic Programming

[Figure: the factor graph for "time flies like an arrow", partitioned into subgraphs F, G, and H around a variable Xi; adapted from Burkett & Klein (2012)]

Subproblem: Inference using just the factors in subgraph H.

slide-50
SLIDE 50

Acyclic BP as Dynamic Programming

[Figure: subgraph H only]

Subproblem: Inference using just the factors in subgraph H. The marginal of Xi in that smaller model is the message sent to Xi from subgraph H.

Message to a variable.

slide-51
SLIDE 51

Acyclic BP as Dynamic Programming

[Figure: subgraph G only]

Subproblem: Inference using just the factors in subgraph G. The marginal of Xi in that smaller model is the message sent to Xi from subgraph G.

Message to a variable.

slide-52
SLIDE 52

Acyclic BP as Dynamic Programming

[Figure: subgraph F only]

Subproblem: Inference using just the factors in subgraph F. The marginal of Xi in that smaller model is the message sent to Xi from subgraph F.

Message to a variable.

slide-53
SLIDE 53

Acyclic BP as Dynamic Programming

[Figure: subgraphs F and H together]

Subproblem: Inference using just the factors in subgraph F ∪ H. The marginal of Xi in that smaller model is the message sent by Xi out of subgraph F ∪ H.

Message from a variable.

slide-54
SLIDE 54
Acyclic BP as Dynamic Programming

  • If you want the marginal p_i(x_i) where Xi has degree k, you can think of that summation as a product of k marginals computed on smaller subgraphs.
  • Each subgraph is obtained by cutting some edge of the tree.
  • The message-passing algorithm uses dynamic programming to compute the marginals on all such subgraphs, working from smaller to bigger. So you can compute all the marginals.

[Figure: the full factor graph for "time flies like an arrow"]


slide-61
SLIDE 61

MAX-PRODUCT BELIEF PROPAGATION

Exact MAP inference for factor trees


slide-62
SLIDE 62

Max-product Belief Propagation

  • Sum-product BP can be used to:
    – compute the marginals, p_i(X_i)
    – compute the partition function, Z
  • Max-product BP can be used to:
    – compute the most likely assignment, X* = argmax_X p(X)

slide-63
SLIDE 63

Max-product Belief Propagation

  • Change the sum to a max:

    μ_{α→i}(x_i) = max_{x_α : x_α[i] = x_i} ψ_α(x_α) · ∏_{j ∈ N(α) \ {i}} μ_{j→α}(x_j)

  • Max-product BP computes max-marginals:
    – The max-marginal b_i(x_i) is the (unnormalized) probability of the MAP assignment under the constraint X_i = x_i.
    – For an acyclic graph, the MAP assignment (assuming there are no ties) is given by x_i* = argmax_{x_i} b_i(x_i).
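A sketch of max-product on the same chain as the forward-backward example above (same assumed potentials), keeping backpointers so the MAP assignment can be read off:

```python
import numpy as np

trans = np.array([[0.0, 2.0, 1.0],    # transition table from the slide
                  [2.0, 1.0, 0.0],
                  [0.0, 3.0, 1.0]])
unary = np.array([[3.0, 1.0, 2.0],    # unary potentials (assumed)
                  [1.0, 4.0, 2.0],
                  [2.0, 5.0, 1.0]])
n, k = unary.shape

# viterbi[i][t] = max weight of any prefix ending with tag t at position i.
viterbi = np.zeros((n, k))
backptr = np.zeros((n, k), dtype=int)
viterbi[0] = unary[0]
for i in range(1, n):
    scores = viterbi[i - 1][:, None] * trans   # k x k: (prev tag, tag)
    backptr[i] = scores.argmax(axis=0)
    viterbi[i] = scores.max(axis=0) * unary[i]

# Follow backpointers from the best final tag to recover the MAP path.
path = [int(viterbi[-1].argmax())]
for i in range(n - 1, 0, -1):
    path.append(int(backptr[i][path[-1]]))
path.reverse()
print(path, viterbi[-1].max())   # MAP tag sequence and its weight
```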


slide-65
SLIDE 65

Deterministic Annealing

Motivation: smoothly transition from sum-product to max-product.

1. Incorporate an inverse temperature parameter into each factor: ψ_α(x_α) → ψ_α(x_α)^{1/T}
2. Send messages as usual for sum-product BP.
3. Anneal T from 1 to 0.
4. Take the resulting beliefs to the power T.

Annealed joint distribution: p_T(x) ∝ ∏_α ψ_α(x_α)^{1/T}
T = 1: sum-product. T → 0: max-product.
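A tiny illustration of the schedule on one made-up potential: raise it to the power 1/T, normalize (the sum-product step), then take the result to the power T. At T = 1 this is the ordinary belief; as T → 0 it approaches the max-normalized potential, i.e. max-product behavior:

```python
import numpy as np

psi = np.array([1.0, 2.0, 4.0])       # a single made-up potential

for T in [1.0, 0.5, 0.1, 0.01]:
    b = psi ** (1.0 / T)              # step 1: annealed potential
    b = b / b.sum()                   # sum-product belief at temperature T
    print(T, np.round(b ** T, 3))     # step 4: beliefs to the power T
# As T -> 0 this converges to psi / max(psi) = [0.25, 0.5, 1.0],
# which peaks exactly at the argmax, as max-product would.
```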

slide-66
SLIDE 66

Semirings

  • Sum-product (+, *) and max-product (max, *) are commutative semirings.
  • We can run BP with any such commutative semiring.
  • In practice, multiplying many small numbers together can yield underflow:
    – instead of (+, *), we use (log-add, +)
    – instead of (max, *), we use (max, +)
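The factor-to-variable message from the earlier example, recomputed in the (log-add, +) semiring; a tiny constant stands in for the zero potential entry, since log 0 = -∞:

```python
import numpy as np
from scipy.special import logsumexp

log_psi = np.log(np.array([[0.1, 3.0, 1.0],
                           [8.0, 1e-300, 1.0]]))  # tiny stand-in for 0
log_msg = np.log(np.array([8.0, 0.2]))

# Real domain:  mu(x3) = sum_{x1} psi(x1, x3) * msg(x1)
# Log domain:   log mu(x3) = logsumexp_{x1} [log psi(x1, x3) + log msg(x1)]
log_mu = logsumexp(log_psi + log_msg[:, None], axis=0)
print(np.exp(log_mu))    # [2.4, 24.0, 8.2]: matches the earlier message

# Max-product in log space uses the (max, +) semiring instead:
print((log_psi + log_msg[:, None]).max(axis=0))
```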

slide-67
SLIDE 67

FORWARD-BACKWARD AND VITERBI ALGORITHMS

Exact inference for linear chain models


slide-68
SLIDE 68

Forward-Backward Algorithm

  • Sum-product BP on an HMM is called the forward-backward algorithm.
  • Max-product BP on an HMM is called the Viterbi algorithm.

slide-69
SLIDE 69

Forward-Backward Algorithm

Trigram HMM is not a tree, even when converted to a factor graph.

[Figure: trigram HMM over "time flies like an arrow" with tag variables X1–X5 and word variables W1–W5]

slide-70
SLIDE 70

Forward-Backward Algorithm

Trigram HMM is not a tree, even when converted to a factor graph.

[Figure: the same sentence as a factor graph, with factors ψ1–ψ12 linking adjacent and skip-adjacent tag variables]

slide-71
SLIDE 71

Forward-Backward Algorithm

Trigram HMM is not a tree, even when converted to a factor graph.

[Figure: the factor graph again]

Trick (see also Sha & Pereira (2003)); a code sketch follows below:

  • Replace each variable domain with its cross product, e.g. {B, I, O} → {BB, BI, BO, IB, II, IO, OB, OI, OO}
  • Replace each pair of variables with a single one: for all i, y_{i,i+1} = (x_i, x_{i+1})
  • Add features with weight -∞ that disallow illegal configurations between pairs of the new variables, e.g. the adjacent pair (BI, IO) is legal (both agree that the shared tag is I), while (II, OO) is illegal.
  • This is effectively a special case of the junction tree algorithm.
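A minimal sketch of the cross-product construction for BIO tags; the pair_potential helper and the uniform trigram scorer are made up for illustration:

```python
from itertools import product

# Pair up adjacent variables so trigram factors become pairwise.
tags = ["B", "I", "O"]
pair_domain = ["".join(p) for p in product(tags, tags)]
print(pair_domain)  # ['BB', 'BI', 'BO', 'IB', 'II', 'IO', 'OB', 'OI', 'OO']

# A pairwise potential over (y_{i,i+1}, y_{i+1,i+2}) is legal only when the
# two pair-variables agree on the shared tag x_{i+1}; illegal configurations
# get weight 0 (log-weight -inf).
def pair_potential(left, right, trigram_score):
    if left[1] != right[0]:            # e.g. ("II", "OO") disagree on x_{i+1}
        return 0.0
    return trigram_score(left[0], left[1], right[1])

uniform = lambda a, b, c: 1.0          # made-up trigram scorer
print(pair_potential("BI", "IO", uniform))   # 1.0: legal
print(pair_potential("II", "OO", uniform))   # 0.0: illegal
```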
slide-72
SLIDE 72

Summary

1. Factor Graphs
   – Alternative representation of directed / undirected graphical models
   – Make the cliques of an undirected GM explicit
2. Variable Elimination
   – Simple and general approach to exact inference
   – Just a matter of being clever when computing sum-products
3. Sum-product Belief Propagation
   – Computes all the marginals and the partition function in only twice the work of variable elimination
4. Max-product Belief Propagation
   – Identical to sum-product BP, but changes the semiring
   – Computes: max-marginals, the probability of the MAP assignment, and (with backpointers) the MAP assignment itself