Exact Inference: Variable Elimination
Probabilistic Graphical Models
Sharif University of Technology, Spring 2018
Soleymani
Probabilistic Inference and Learning

We now have compact representations of probability distributions (graphical models).
A graphical model M describes a unique probability distribution P.
Typical tasks:
Task 1: How do we answer queries about P_M, e.g., P_M(X|Y)?
  We use "inference" as a name for the process of computing answers to such queries.
Task 2: How do we estimate a plausible model M from data D?
  i. We use "learning" as a name for the process of obtaining a point estimate of M.
  ii. For the Bayesian approach, however, we seek p(M|D), which is actually an inference problem.
  iii. When not all variables are observable, even computing a point estimate of M requires inference to impute the missing data.
This slide has been adopted from Eric Xing, PGM 10-708, CMU.
Why we need inference

If we know the graphical model, we use inference to find marginal or conditional distributions efficiently.
We also need inference during learning, when we try to find a model from incomplete data or when the learning approach is Bayesian (as we will see in the next lectures).
Inference query

Notation: nodes 𝒳 = {X₁, …, Xₙ}; 𝒆 denotes evidence on a set of variables 𝑬; 𝑿 = 𝒳 − 𝑬. We may query a subset 𝒀 of the non-evidence variables 𝑿 = {𝒀, 𝒁} and "don't care" about the remaining 𝒁.

Likelihood (probability of evidence):
  P(𝒆) = Σ_𝑿 P(𝑿, 𝒆)
Marginal probability distribution:
  P(𝑿) = Σ_{𝒳−𝑿} P(𝒳)
Conditional probability distribution (a posteriori belief):
  P(𝑿|𝒆) = P(𝑿, 𝒆) / Σ_𝑿 P(𝑿, 𝒆)
Marginalized conditional probability distribution (𝑿 = 𝒀 ∪ 𝒁):
  P(𝒀|𝒆) = Σ_𝒁 P(𝒀, 𝒁, 𝒆) / Σ_𝒀 Σ_𝒁 P(𝒀, 𝒁, 𝒆)
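As a concrete illustration, here is a minimal brute-force sketch of these queries on a made-up three-variable joint table (all numbers and names are hypothetical); variable elimination will later compute the same quantities without materializing the full joint.

    import numpy as np

    # Hypothetical joint P(X1, X2, X3) over binary variables, as a 3-d table
    P = np.array([[[0.04, 0.06], [0.10, 0.05]],
                  [[0.15, 0.10], [0.20, 0.30]]])

    # Likelihood of the evidence X3 = 1: P(e) = sum over all non-evidence vars
    likelihood = P[:, :, 1].sum()

    # Marginal P(X1): sum out all other variables
    marginal_x1 = P.sum(axis=(1, 2))

    # Conditional P(X1 | X3 = 1) = P(X1, e) / sum_X1 P(X1, e)
    joint_x1_e = P[:, :, 1].sum(axis=1)
    conditional = joint_x1_e / joint_x1_e.sum()

    print(likelihood, marginal_x1, conditional)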
Most Probable Assignment (MPA)

Most probable assignment for some variables of interest, given the evidence 𝑬 = 𝒆:
  𝒀* = argmax_𝒀 P(𝒀|𝒆)
This is the maximum a posteriori (MAP) configuration of 𝒀.
Applications of MPA:
  Classification: find the most likely label, given the evidence.
  Explanation: what is the most likely scenario, given the evidence?
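Continuing the enumeration sketch above (same hypothetical table P), the MPA of (X1, X2) given X3 = 1 is a single argmax over the evidence-reduced table:

    # MPA of (X1, X2) given X3 = 1, by enumeration over the reduced table
    reduced = P[:, :, 1]                          # entries P(X1, X2, X3=1)
    x1, x2 = np.unravel_index(reduced.argmax(), reduced.shape)
    print(f"argmax P(X1, X2 | X3=1) = (X1={x1}, X2={x2})")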
MPA: Example
This slide has been adopted from Eric Xing, PGM 10-708, CMU.
Marginal probability: Enumeration

P(𝒀|𝒆) ∝ P(𝒀, 𝒆), where P(𝒀, 𝒆) = Σ_𝒁 P(𝒀, 𝒁, 𝒆).
Computing marginal probabilities requires exponential computation in general:
  It is a #P-complete problem (enumeration is intractable).
  Even for graphs of polynomial size, the computation can be exponential.
  We cannot find a general procedure that works efficiently for arbitrary GMs.
Hardness of Inference

Hardness does not mean we cannot solve inference.
It implies that we cannot find a general procedure that works efficiently for arbitrary GMs.
For particular families of GMs, we can have provably efficient procedures:
  For special graph structures, provably efficient algorithms (avoiding exponential cost) are available.
Exact inference

Exact inference algorithms:
  Variable elimination algorithm: general graph, one query.
  Belief propagation (sum-product on factor graphs): tree, marginal probabilities on all nodes.
  Junction tree algorithm: general graph, marginal probabilities on all clique nodes.
Inference on a chain

Chain: A → B → C → D
P(d) = Σ_a Σ_b Σ_c P(a, b, c, d)
     = Σ_a Σ_b Σ_c P(a) P(b|a) P(c|b) P(d|c)
A naïve summation needs to enumerate an exponential number of terms.
Inference on a chain: marginalization and elimination

P(d) = Σ_a Σ_b Σ_c P(a) P(b|a) P(c|b) P(d|c)
     = Σ_c P(d|c) Σ_b P(c|b) Σ_a P(a) P(b|a)
Eliminating A yields P(b); eliminating B then yields P(c); eliminating C yields P(d).
In a chain of n nodes, each having k values: O(nk²) instead of O(kⁿ).
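A minimal sketch of this computation for the chain A → B → C → D with k states per node: each elimination is one vector-matrix product (k² work), so the whole marginal costs O(nk²). The CPT values below are random and purely illustrative.

    import numpy as np

    k = 3
    rng = np.random.default_rng(0)

    # Hypothetical chain CPTs: p_a[i] = P(A=i), t_ba[i, j] = P(B=j | A=i), etc.
    def random_cpd(shape):
        t = rng.random(shape)
        return t / t.sum(axis=-1, keepdims=True)

    p_a  = random_cpd((k,))
    t_ba = random_cpd((k, k))
    t_cb = random_cpd((k, k))
    t_dc = random_cpd((k, k))

    # Eliminate A, then B, then C: each step is a k x k vector-matrix product
    p_b = p_a @ t_ba        # P(b) = sum_a P(a) P(b|a)
    p_c = p_b @ t_cb        # P(c) = sum_b P(b) P(c|b)
    p_d = p_c @ t_dc        # P(d) = sum_c P(c) P(d|c)
    print(p_d, p_d.sum())   # a valid distribution: sums to 1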
Inference on a chain

In both directed and undirected graphical models, the joint probability is a factored expression over subsets of the variables. For an undirected chain over X₁, …, Xₙ:
  P(𝒙) = (1/Z) 𝜙₁,₂(x₁, x₂) 𝜙₂,₃(x₂, x₃) ⋯ 𝜙ₙ₋₁,ₙ(xₙ₋₁, xₙ)
  P(xᵢ) = (1/Z) Σ_{x₁} ⋯ Σ_{xᵢ₋₁} Σ_{xᵢ₊₁} ⋯ Σ_{xₙ} 𝜙₁,₂(x₁, x₂) ⋯ 𝜙ₙ₋₁,ₙ(xₙ₋₁, xₙ)
Pushing each sum inside as far as it will go:
  P(xᵢ) ∝ [Σ_{xᵢ₋₁} 𝜙(xᵢ₋₁, xᵢ) Σ_{xᵢ₋₂} 𝜙(xᵢ₋₂, xᵢ₋₁) ⋯ Σ_{x₁} 𝜙(x₁, x₂)] × [Σ_{xᵢ₊₁} 𝜙(xᵢ, xᵢ₊₁) Σ_{xᵢ₊₂} 𝜙(xᵢ₊₁, xᵢ₊₂) ⋯ Σ_{xₙ} 𝜙(xₙ₋₁, xₙ)]
Each elimination step takes O(|Val(Xᵢ)| × |Val(Xᵢ₊₁)|) operations.
[Figure: an undirected chain over X₁, X₂, …, Xₙ₋₁, Xₙ.]
Inference on a chain: improvement reasons

Computing an expression of the form (sum-product inference):
  Σ_𝒁 ∏_{𝜙∈𝚽} 𝜙        (𝚽: the set of factors)
We used the structure of the BN to factorize the joint distribution, so the scopes of the resulting factors are limited.
Distributive law: if X ∉ Scope(𝜙₁), then Σ_X 𝜙₁ · 𝜙₂ = 𝜙₁ · Σ_X 𝜙₂.
  Each summation is thus performed over the product of only a subset of the factors.
We find sub-expressions that can be computed once, saved, and reused in later computations, instead of being computed exponentially many times.
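A quick numeric check of the distributive law on two hypothetical factors 𝜙₁(b) and 𝜙₂(a, b) (names and sizes made up): summing over a before or after the product gives the same result because a ∉ Scope(𝜙₁).

    import numpy as np

    rng = np.random.default_rng(1)
    phi1 = rng.random(4)          # phi1(b): no dependence on a
    phi2 = rng.random((3, 4))     # phi2(a, b)

    lhs = (phi1[None, :] * phi2).sum(axis=0)   # sum_a phi1(b) * phi2(a, b)
    rhs = phi1 * phi2.sum(axis=0)              # phi1(b) * sum_a phi2(a, b)
    assert np.allclose(lhs, rhs)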
Variable elimination algorithm for sum-product inference

Sum out each variable, one at a time:
  All factors containing that variable are removed from the set of factors and multiplied to generate a product factor.
  The variable is summed out from the generated product factor, giving a new factor.
  The new factor is added to the set of available factors.
The resulting factors do not necessarily correspond to any probability or conditional probability in the network.
Procedure Sum-Product-VE(G, 𝒁)
  // 𝒁: the variables to be eliminated
  𝚽 ← all factors of G
  Select an elimination order Z₁, …, Z_k for 𝒁
  for i = 1, …, k:
    𝚽 ← Sum-Product-Elim-Var(𝚽, Zᵢ)
  𝜙* ← ∏_{𝜙∈𝚽} 𝜙
  Return 𝜙*

Procedure Sum-Product-Elim-Var(𝚽, Z)
  𝚽′ ← {𝜙 ∈ 𝚽 : Z ∈ Scope(𝜙)}     // move factors irrelevant to Z outside the summation
  𝚽′′ ← 𝚽 − 𝚽′
  τ ← Σ_Z ∏_{𝜙∈𝚽′} 𝜙               // perform the sum, getting a new term
  return 𝚽′′ ∪ {τ}                  // insert the new term into the product

Note: for a directed graph with no evidence, the result needs no normalization.
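The following is a minimal Python sketch of these two procedures (not taken from any library); a factor is stored as an ordered variable list plus a numpy table, and the names Factor, multiply, sum_product_elim_var, and sum_product_ve are my own.

    import numpy as np
    from functools import reduce

    class Factor:
        """A factor: an ordered list of variable names plus a value table
        whose i-th axis corresponds to the i-th variable in the scope."""
        def __init__(self, scope, table):
            self.scope, self.table = list(scope), np.asarray(table)

    def multiply(f1, f2):
        """Product factor over the union of the two scopes (via broadcasting)."""
        scope = f1.scope + [v for v in f2.scope if v not in f1.scope]
        def expand(f):
            axes = [f.scope.index(v) for v in scope if v in f.scope]
            shape = [f.table.shape[f.scope.index(v)] if v in f.scope else 1
                     for v in scope]
            return np.transpose(f.table, axes).reshape(shape)
        return Factor(scope, expand(f1) * expand(f2))

    def sum_product_elim_var(factors, var):
        """Multiply all factors containing var, sum var out, return the
        updated factor set (assumes var occurs in at least one factor)."""
        related = [f for f in factors if var in f.scope]
        rest = [f for f in factors if var not in f.scope]
        tau = reduce(multiply, related)
        tau = Factor([v for v in tau.scope if v != var],
                     tau.table.sum(axis=tau.scope.index(var)))
        return rest + [tau]

    def sum_product_ve(factors, elim_order):
        """Eliminate the variables one at a time, then multiply what is left."""
        for var in elim_order:
            factors = sum_product_elim_var(factors, var)
        return reduce(multiply, factors)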
Procedure Cond-Prob-VE(
  G,        // the network over 𝒳
  𝒀,        // set of query variables
  𝑬 = 𝒆     // evidence
)
  𝚽 ← the factors parameterizing G
  Replace each 𝜙 ∈ 𝚽 by the reduced factor 𝜙[𝑬 = 𝒆]
  Select an elimination order Z₁, …, Z_k for 𝒁 = 𝒳 − 𝒀 − 𝑬
  for i = 1, …, k:
    𝚽 ← Sum-Product-Elim-Var(𝚽, Zᵢ)
  𝜙* ← ∏_{𝜙∈𝚽} 𝜙
  α ← Σ_{𝒚∈Val(𝒀)} 𝜙*(𝒚)
  Return α, 𝜙*
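A matching sketch of Cond-Prob-VE on top of the code above: reduce every factor by the evidence, eliminate the non-query variables, then renormalize (again, the helper names are my own).

    def restrict(f, var, value):
        """Reduce a factor by the evidence var = value (drop that axis)."""
        if var not in f.scope:
            return f
        axis = f.scope.index(var)
        return Factor([v for v in f.scope if v != var],
                      np.take(f.table, value, axis=axis))

    def cond_prob_ve(factors, elim_order, evidence):
        """P(query | evidence): restrict, eliminate, then renormalize."""
        for var, value in evidence.items():
            factors = [restrict(f, var, value) for f in factors]
        phi = sum_product_ve(factors, elim_order)   # unnormalized P(query, e)
        alpha = phi.table.sum()                     # = P(e), the likelihood
        return Factor(phi.scope, phi.table / alpha)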
Directed example

Query: P(X₂ | X₇ = x₇)
P(X₂ | x₇) ∝ P(X₂, x₇)
P(x₂, x₇) = Σ_{x₁} Σ_{x₃} Σ_{x₄} Σ_{x₅} Σ_{x₆} Σ_{x₈} P(x₁, x₂, x₃, x₄, x₅, x₆, x₇, x₈)
Consider the elimination order X₁, X₃, X₄, X₅, X₆, X₈:
P(x₂, x₇) = Σ_{x₈} Σ_{x₆} Σ_{x₅} Σ_{x₄} Σ_{x₃} Σ_{x₁} P(x₁) P(x₂) P(x₃|x₁,x₂) P(x₄|x₃) P(x₅|x₂) P(x₆|x₃,x₇) P(x₇|x₄,x₅) P(x₈|x₇)
[Figure: the Bayesian network over X₁, …, X₈.]
P(x₂, x₇)
= Σ_{x₈} Σ_{x₆} Σ_{x₅} Σ_{x₄} Σ_{x₃} P(x₂) P(x₄|x₃) P(x₅|x₂) P(x₆|x₃,x₇) P(x₇|x₄,x₅) P(x₈|x₇) Σ_{x₁} P(x₁) P(x₃|x₁,x₂)
= Σ_{x₈} Σ_{x₆} Σ_{x₅} Σ_{x₄} Σ_{x₃} P(x₂) P(x₄|x₃) P(x₅|x₂) P(x₆|x₃,x₇) P(x₇|x₄,x₅) P(x₈|x₇) m₁(x₂, x₃)
= Σ_{x₈} Σ_{x₆} Σ_{x₅} Σ_{x₄} P(x₂) P(x₅|x₂) P(x₇|x₄,x₅) P(x₈|x₇) Σ_{x₃} P(x₄|x₃) P(x₆|x₃,x₇) m₁(x₂, x₃)
= Σ_{x₈} Σ_{x₆} Σ_{x₅} Σ_{x₄} P(x₂) P(x₅|x₂) P(x₇|x₄,x₅) P(x₈|x₇) m₃(x₂, x₄, x₆)
= Σ_{x₈} Σ_{x₆} Σ_{x₅} P(x₂) P(x₅|x₂) P(x₈|x₇) Σ_{x₄} P(x₇|x₄,x₅) m₃(x₂, x₄, x₆)
= Σ_{x₈} Σ_{x₆} Σ_{x₅} P(x₂) P(x₅|x₂) P(x₈|x₇) m₄(x₂, x₅, x₆)
= Σ_{x₈} Σ_{x₆} P(x₂) P(x₈|x₇) Σ_{x₅} P(x₅|x₂) m₄(x₂, x₅, x₆)
= Σ_{x₈} Σ_{x₆} P(x₂) P(x₈|x₇) m₅(x₂, x₆)
= Σ_{x₈} P(x₂) P(x₈|x₇) Σ_{x₆} m₅(x₂, x₆)
= Σ_{x₈} P(x₂) P(x₈|x₇) m₆(x₂)
= m₈(x₂) m₆(x₂),   where m₈(x₂) = P(x₂) Σ_{x₈} P(x₈|x₇)
Conditional probability

P(x₂ | x₇) = m₈(x₂) m₆(x₂) / Σ_{x₂} m₈(x₂) m₆(x₂)
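Running the sketch above on this network requires concrete CPTs; the ones below are random and purely hypothetical, but the call mirrors the derivation: eliminate X1, X3, X4, X5, X6, X8 with evidence on X7 and read off P(X2 | x7). This reuses the Factor machinery and cond_prob_ve from the earlier sketches.

    rng = np.random.default_rng(0)

    def random_cpt(scope):
        """Hypothetical CPT P(last variable in scope | the others)."""
        t = rng.random([2] * len(scope))
        return Factor(scope, t / t.sum(axis=-1, keepdims=True))

    factors = [random_cpt(s) for s in
               (["X1"], ["X2"], ["X1", "X2", "X3"], ["X3", "X4"],
                ["X2", "X5"], ["X3", "X7", "X6"], ["X4", "X5", "X7"],
                ["X7", "X8"])]
    posterior = cond_prob_ve(factors, ["X1", "X3", "X4", "X5", "X6", "X8"],
                             evidence={"X7": 1})
    print(posterior.scope, posterior.table)   # P(X2 | X7 = 1)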
Undirected example

Query: P(X₂ | X₇ = x₇)
P(X₂ | x₇) ∝ P(X₂, x₇)
P(x₂, x₇) = Σ_{x₁} Σ_{x₃} Σ_{x₄} Σ_{x₅} Σ_{x₆} Σ_{x₈} P(x₁, x₂, x₃, x₄, x₅, x₆, x₇, x₈)
Consider the elimination order X₁, X₃, X₄, X₅, X₆, X₈:
P(x₂, x₇) ∝ Σ_{x₈} Σ_{x₆} Σ_{x₅} Σ_{x₄} Σ_{x₃} Σ_{x₁} 𝜙(x₁,x₂,x₃) 𝜙(x₃,x₄) 𝜙(x₂,x₅) 𝜙(x₃,x₆,x₇) 𝜙(x₄,x₅,x₇) 𝜙(x₇,x₈)
[Figure: the Markov network over X₁, …, X₈.]
P(x₂, x₇)
∝ Σ_{x₈} Σ_{x₆} Σ_{x₅} Σ_{x₄} Σ_{x₃} 𝜙(x₃,x₄) 𝜙(x₂,x₅) 𝜙(x₃,x₆,x₇) 𝜙(x₄,x₅,x₇) 𝜙(x₇,x₈) Σ_{x₁} 𝜙(x₁,x₂,x₃)
= Σ_{x₈} Σ_{x₆} Σ_{x₅} Σ_{x₄} Σ_{x₃} 𝜙(x₃,x₄) 𝜙(x₂,x₅) 𝜙(x₃,x₆,x₇) 𝜙(x₄,x₅,x₇) 𝜙(x₇,x₈) m₁(x₂, x₃)
= Σ_{x₈} Σ_{x₆} Σ_{x₅} Σ_{x₄} 𝜙(x₂,x₅) 𝜙(x₄,x₅,x₇) 𝜙(x₇,x₈) Σ_{x₃} 𝜙(x₃,x₄) 𝜙(x₃,x₆,x₇) m₁(x₂, x₃)
= Σ_{x₈} Σ_{x₆} Σ_{x₅} Σ_{x₄} 𝜙(x₂,x₅) 𝜙(x₄,x₅,x₇) 𝜙(x₇,x₈) m₃(x₂, x₄, x₆)
= Σ_{x₈} Σ_{x₆} Σ_{x₅} 𝜙(x₂,x₅) 𝜙(x₇,x₈) Σ_{x₄} 𝜙(x₄,x₅,x₇) m₃(x₂, x₄, x₆)
= Σ_{x₈} Σ_{x₆} Σ_{x₅} 𝜙(x₂,x₅) 𝜙(x₇,x₈) m₄(x₂, x₅, x₆)
= Σ_{x₈} Σ_{x₆} 𝜙(x₇,x₈) Σ_{x₅} 𝜙(x₂,x₅) m₄(x₂, x₅, x₆)
= Σ_{x₈} Σ_{x₆} 𝜙(x₇,x₈) m₅(x₂, x₆)
= Σ_{x₈} 𝜙(x₇,x₈) Σ_{x₆} m₅(x₂, x₆)
= Σ_{x₈} 𝜙(x₇,x₈) m₆(x₂)
= m₈ m₆(x₂),   where m₈ = Σ_{x₈} 𝜙(x₇,x₈) is a constant for the fixed x₇
[Figure: the Markov network over X₁, …, X₈.]
Complexity of variable elimination algorithm

In each elimination step, the following computations are required:
  𝜓(x, x₁, …, x_k) = ∏_{i=1..N} 𝜙ᵢ(x, 𝒙_{cᵢ})
  τ(x₁, …, x_k) = Σ_x 𝜓(x, x₁, …, x_k)
where N factors contain the variable X being eliminated and each 𝒙_{cᵢ} ⊆ {x₁, …, x_k}. We need:
  (N − 1) × |Val(X)| × ∏_{i=1..k} |Val(Xᵢ)| multiplications:
    for each tuple (x, x₁, …, x_k), we need N − 1 multiplications.
  |Val(X)| × ∏_{i=1..k} |Val(Xᵢ)| additions:
    for each tuple (x₁, …, x_k), we need |Val(X)| additions.
Complexity is exponential in the number of variables in the intermediate factor.
The size of the created factors is the dominant quantity in the complexity of VE.
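For instance, with binary variables, eliminating a variable X that appears in N = 4 factors whose combined scope adds k = 3 other variables takes (4 − 1) × 2 × 2³ = 48 multiplications and 2 × 2³ = 16 additions, and the intermediate factor 𝜓 has 2⁴ = 16 entries; it is this table size that blows up under a bad elimination ordering.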
Graph elimination

Graph elimination gives a simple unified treatment of inference algorithms in both directed and undirected models:
  Convert directed models to undirected ones (moralization).
Graph-theoretic property: the factors created during variable elimination are captured by recording the elimination cliques.
The computational complexity of the Eliminate algorithm can thus be reduced to purely graph-theoretic considerations.
Graph elimination

Begin with the undirected GM or moralized BN.
Choose an elimination ordering (query nodes should come last).
Eliminate a node from the graph and add edges (called fill edges) between all pairs of its neighbors.
Iterate until all non-query nodes are eliminated.
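A small Python sketch of this procedure (the helper names are my own; the adjacency structure is a dict of neighbor sets), recording the elimination clique created at each step; the usage example below is the moralized graph of the running 8-node example.

    def graph_eliminate(adj, order):
        """Eliminate nodes in `order`, adding fill edges between the
        neighbors of each eliminated node; return the elimination cliques."""
        adj = {v: set(ns) for v, ns in adj.items()}   # work on a copy
        cliques = []
        for v in order:
            nbrs = adj[v]
            cliques.append({v} | nbrs)                # elimination clique
            for u in nbrs:                            # connect remaining
                adj[u] |= nbrs - {u}                  # neighbors: fill edges
                adj[u].discard(v)
            del adj[v]
        return cliques

    # Moralized graph of the 8-node example network
    edges = [("X1","X2"), ("X1","X3"), ("X2","X3"), ("X3","X4"), ("X2","X5"),
             ("X3","X6"), ("X3","X7"), ("X6","X7"), ("X4","X5"), ("X4","X7"),
             ("X5","X7"), ("X7","X8")]
    adj = {f"X{i}": set() for i in range(1, 9)}
    for u, v in edges:
        adj[u].add(v); adj[v].add(u)
    for clique in graph_eliminate(adj, ["X1","X3","X4","X5","X6","X8"]):
        print(sorted(clique))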
Graph elimination

[Figure: starting from the moralized graph over X₁, …, X₈, the nodes X₁, X₃, X₄, X₅, X₆, X₈ are eliminated in turn; each step removes a node from the graph and connects its remaining neighbors, with the added fill edges highlighted.]
Summation ⇔ elimination
Intermediate term ⇔ elimination clique
Graph elimination: elimination cliques

The dependencies induced during marginalization are captured by the elimination cliques.
There is a correspondence between maximal cliques in the induced graph and maximal factors generated in the VE algorithm:
  The complexity depends on the number of variables in the largest elimination clique.
The size of the maximal elimination clique in the induced graph depends on the elimination ordering.
Elimination order: example

Ordering ≺: X₁, X₃, X₄, X₅, X₆        Ordering ≺: X₄, X₃, X₅, X₆, X₁
[Figure: the induced graphs over X₁, …, X₆ under the two orderings.]
Elimination order

Finding the best elimination ordering is NP-hard:
  It is equivalent to finding the treewidth of the graph, which is NP-hard.
  Treewidth: one less than the smallest achievable size of the largest elimination clique, ranging over all possible elimination orderings.
Good elimination orderings lead to small cliques and thus reduce complexity.
What is the optimal order for trees?
Heuristics for finding an ordering

How can we find an ordering that induces a "small" graph?
Some heuristics to select the next node for elimination (sketched below):
  Min-neighbors: the cost of a vertex is the number of neighbors it has in the current graph.
  Min-fill: the cost of a vertex is the number of edges that need to be added to the graph due to its elimination.
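These heuristics are easy to state as a greedy loop over the graph-elimination sketch above (hypothetical helper names; `adj` is the adjacency dict built in the earlier example):

    def greedy_order(adj, cost):
        """Greedily pick the next node to eliminate by a heuristic cost."""
        adj = {v: set(ns) for v, ns in adj.items()}   # work on a copy
        order = []
        while adj:
            v = min(adj, key=lambda u: cost(u, adj))
            order.append(v)
            nbrs = adj[v]
            for u in nbrs:                # eliminate v, adding fill edges
                adj[u] |= nbrs - {u}
                adj[u].discard(v)
            del adj[v]
        return order

    def min_neighbors(u, adj):
        return len(adj[u])

    def min_fill(u, adj):
        nbrs = list(adj[u])
        # count the missing edges among the neighbors of u
        return sum(1 for i, a in enumerate(nbrs)
                   for b in nbrs[i+1:] if b not in adj[a])

    print(greedy_order(adj, min_fill))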
Elimination algorithm: summary

The elimination algorithm computes the marginal probability for one query.
It uses the factorization properties and the distributive law to compute marginal probabilities more efficiently:
  reorder computations;
  save intermediate terms.
The elimination order affects the computational complexity; however, finding the best order is in general NP-hard.