Graphical Models: MAP inference (Siamak Ravanbakhsh)

  1. Graphical Models: MAP inference
     Siamak Ravanbakhsh, Winter 2018

  2. Learning objectives
     MAP inference and its complexity
     exact & approximate MAP inference
     max-product and max-sum message passing
     relationship to LP relaxation
     graph-cuts for MAP inference

  4. Definition & complexity
     MAP: $\arg\max_x p(x)$
       decision problem: given a Bayes-net, deciding whether $p(x) > c$ for some $x$ is NP-complete!
       example: side-chain prediction as MAP inference (Yanover & Weiss)
     Marginal MAP: $\arg\max_x \sum_y p(x, y)$
       decision problem: given a Bayes-net for $p(x, y)$, deciding whether $p(x) > c$ for some $x$ is complete for $\mathrm{NP}^{\mathrm{PP}}$
       PP: a non-deterministic Turing machine that accepts if the majority of paths accept
       NP: a non-deterministic Turing machine that accepts if a single path accepts (here, with access to a PP oracle)
       is NP-hard even for trees
       cannot use the distributive law

  7. Problem & terminology
     MAP inference: $\arg\max_x p(x) = \arg\max_x \frac{1}{Z} \prod_I \phi_I(x_I) \equiv \arg\max_x \tilde{p}(x) = \arg\max_x \prod_I \phi_I(x_I)$
       ignore the normalization constant $Z$
       aka max-product inference
     with evidence: $\arg\max_x p(x \mid e) = \arg\max_x \frac{p(x, e)}{p(e)} \equiv \arg\max_x p(x, e)$
     log domain: $\arg\max_x p(x) \equiv \arg\max_x \sum_I \ln \phi_I(x_I) \equiv \arg\min_x -\ln \tilde{p}(x)$
       aka max-sum inference
       aka min-sum inference (energy minimization)
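The equivalences on this slide can be checked by brute force on a tiny model. The sketch below is illustrative only: the two pairwise factors over three binary variables and the names `p_tilde` and `energy` are assumptions, not part of the slides.

```python
import itertools
import math

# Hypothetical pairwise factors over three binary variables x0, x1, x2.
# Each factor is (scope, table) with strictly positive entries.
factors = [
    ((0, 1), {(0, 0): 2.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 3.0}),
    ((1, 2), {(0, 0): 1.0, (0, 1): 4.0, (1, 0): 2.0, (1, 1): 1.0}),
]

def p_tilde(x):
    """Unnormalized probability: product of all factor values at x."""
    value = 1.0
    for scope, table in factors:
        value *= table[tuple(x[i] for i in scope)]
    return value

def energy(x):
    """Negative log of p_tilde, written as a sum of -ln(phi_I(x_I))."""
    return -sum(math.log(table[tuple(x[i] for i in scope)])
                for scope, table in factors)

assignments = list(itertools.product([0, 1], repeat=3))
Z = sum(p_tilde(x) for x in assignments)  # normalization constant

# Three ways to pose the same MAP problem.
map_normalized = max(assignments, key=lambda x: p_tilde(x) / Z)
map_max_product = max(assignments, key=p_tilde)
map_min_sum = min(assignments, key=energy)
print(map_normalized, map_max_product, map_min_sum)
```

All three searches return the same assignment because dividing by a positive constant and taking a logarithm are both monotone transformations.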

  8. Max-marginals
     The marginal used in sum-product inference, $\sum_{x \in Val(x)} \phi(x, y)$,
     is replaced with the max-marginal $\max_{x \in Val(x)} \phi(x, y)$.
     Example: from a table $\phi(a, b, c)$, the max-marginal over $b$ is $\phi'(a, c) = \max_b \phi(a, b, c)$.
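As a small illustration of the max-marginal, the sketch below computes $\phi'(a, c) = \max_b \phi(a, b, c)$ next to the ordinary sum-marginal. The table values are made up for the example.

```python
# Hypothetical factor phi(a, b, c) over three binary variables,
# stored as a dict keyed by (a, b, c).
phi = {
    (0, 0, 0): 3, (0, 0, 1): 5, (0, 1, 0): 7, (0, 1, 1): 2,
    (1, 0, 0): 4, (1, 0, 1): 1, (1, 1, 0): 6, (1, 1, 1): 8,
}

max_marginal = {}  # phi'(a, c) = max_b phi(a, b, c)
sum_marginal = {}  # sum_b phi(a, b, c), as used in sum-product inference
for (a, b, c), value in phi.items():
    key = (a, c)
    max_marginal[key] = max(value, max_marginal.get(key, float("-inf")))
    sum_marginal[key] = value + sum_marginal.get(key, 0)

print(max_marginal)
print(sum_marginal)
```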

  10. Distributive law for MAP inference
     max-product inference: $\max(ab, ac) = a \max(b, c)$ (for $a \geq 0$)
     max-sum inference: $\max(a + b, a + c) = a + \max(b, c)$
     min-max inference: $\min(\max(a, b), \max(a, c)) = \max(a, \min(b, c))$
     sum-product inference: $ab + ac = a(b + c)$ (3 operations on the left, 2 operations on the right)
     Save computation by factoring the operations.
     In disguise: $\max_{x, y} f(x, y)\, g(y, z) = \max_y g(y, z) \max_x f(x, y)$
     Assuming $|Val(X)| = |Val(Y)| = |Val(Z)| = d$, the complexity goes from $O(d^3)$ to $O(d^2)$.
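The saving from factoring can be seen by counting multiplications. The sketch below evaluates $\max_{x,y} f(x,y)\,g(y,z)$ for every $z$ in two ways; the tables `f` and `g` are arbitrary positive integers chosen for illustration, and `d = 4` is an assumed domain size.

```python
d = 4  # assumed common domain size |Val(X)| = |Val(Y)| = |Val(Z)|

# Hypothetical non-negative tables f(x, y) and g(y, z).
f = [[((3 * x + 5 * y) % 7) + 1 for y in range(d)] for x in range(d)]
g = [[((2 * y + 3 * z) % 5) + 1 for z in range(d)] for y in range(d)]

def naive(f, g):
    """max over (x, y) of f(x, y) * g(y, z), for each z: d^3 multiplications."""
    mults, out = 0, []
    for z in range(d):
        best = float("-inf")
        for x in range(d):
            for y in range(d):
                best = max(best, f[x][y] * g[y][z])
                mults += 1
        out.append(best)
    return out, mults

def factored(f, g):
    """Push max over x inside: max_y g(y, z) * (max_x f(x, y)): d^2 multiplications."""
    m = [max(f[x][y] for x in range(d)) for y in range(d)]  # max_x f(x, y)
    mults, out = 0, []
    for z in range(d):
        best = float("-inf")
        for y in range(d):
            best = max(best, g[y][z] * m[y])
            mults += 1
        out.append(best)
    return out, mults

naive_out, naive_mults = naive(f, g)
fact_out, fact_mults = factored(f, g)
print(naive_out == fact_out, naive_mults, fact_mults)
```

The two results agree only because the entries of `g` are non-negative, which is the condition under which $\max(ab, ac) = a \max(b, c)$ holds.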

  11. Max-product variable elimination
     The procedure is similar to VE for sum-product inference; here we eliminate all the variables.
     input: a set of factors (e.g. CPDs) $\Phi^{t=0} = \{\phi_1, \ldots, \phi_K\}$
     output: $\max_x \tilde{p}(x) = \max_x \prod_I \phi_I(x_I)$
     go over $x_{i_1}, \ldots, x_{i_n}$ in some order:
       collect all the relevant factors: $\Psi^t = \{\phi \in \Phi^{t-1} \mid x_{i_t} \in \mathrm{Scope}[\phi]\}$
       calculate their product: $\psi_t = \prod_{\phi \in \Psi^t} \phi$
       max-marginalize out $x_{i_t}$: $\psi'_t = \max_{x_{i_t}} \psi_t$
       update the set of factors: $\Phi^t = \Phi^{t-1} - \Psi^t + \{\psi'_t\}$
     return the scalar in $\Phi^{t=n}$ as $\max_x \tilde{p}(x)$, the maximizing value
     This is similar to computing the partition function $Z = \sum_x \tilde{p}(x)$, with max in place of sum.
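The procedure above can be sketched in a few lines of plain Python. This is a minimal illustration, not the course's reference implementation: factors are represented as `(scope, table)` pairs, `card` maps each variable to its domain size, and the helper names are hypothetical. It returns the product of the scalars left after all variables are eliminated.

```python
import itertools

def multiply(factors, card):
    """Product of factors; each factor is (scope tuple, dict assignment -> value)."""
    scope = tuple(sorted({v for s, _ in factors for v in s}))
    table = {}
    for assignment in itertools.product(*(range(card[v]) for v in scope)):
        full = dict(zip(scope, assignment))
        value = 1.0
        for s, t in factors:
            value *= t[tuple(full[v] for v in s)]
        table[assignment] = value
    return scope, table

def max_out(factor, var):
    """Max-marginalize `var` out of a factor."""
    scope, table = factor
    idx = scope.index(var)
    new_table = {}
    for assignment, value in table.items():
        key = assignment[:idx] + assignment[idx + 1:]
        new_table[key] = max(value, new_table.get(key, float("-inf")))
    return scope[:idx] + scope[idx + 1:], new_table

def max_product_ve(factors, order, card):
    """Return max_x of the product of all factors, eliminating in `order`."""
    factors = list(factors)
    for var in order:
        relevant = [f for f in factors if var in f[0]]      # Psi^t
        if not relevant:
            continue
        rest = [f for f in factors if var not in f[0]]
        psi = multiply(relevant, card)                       # psi_t
        factors = rest + [max_out(psi, var)]                 # Phi^t
    result = 1.0
    for _, table in factors:                                 # only scalars remain
        result *= table[()]
    return result

# Hypothetical chain x0 - x1 - x2 with binary variables.
chain = [
    ((0, 1), {(0, 0): 2.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 3.0}),
    ((1, 2), {(0, 0): 1.0, (0, 1): 4.0, (1, 0): 2.0, (1, 1): 1.0}),
]
card = {0: 2, 1: 2, 2: 2}
print(max_product_ve(chain, [0, 1, 2], card))
```

The elimination order changes the cost of the intermediate products but not the returned value.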

  12. Decoding the max-value
     We need to recover the maximizing assignment $x^*$.
     Keep $\{\psi_{t=1}, \ldots, \psi_{t=n}\}$, produced during inference.
     (The slide repeats the max-product VE procedure of slide 11 alongside.)

  13. Decoding the max-value
     Start from the last eliminated variable:
     $\psi_{t=n}$ should have been a function of $x_{i_n}$ alone, so $x^*_{i_n} \leftarrow \arg\max_{x_{i_n}} \psi_n$.
     (The slide repeats the max-product VE procedure of slide 11 alongside.)

  14. Decoding the max-value
     Start from the last eliminated variable; at this point we have $x^*_{i_n}$.
     $\psi_{t=n-1}$ can only have $x_{i_{n-1}}, x_{i_n}$ in its domain:
     $x^*_{i_{n-1}} \leftarrow \arg\max_{x_{i_{n-1}}} \psi_{n-1}(x_{i_{n-1}}, x^*_{i_n})$
     and so on...
     (The slide repeats the max-product VE procedure of slide 11, now returning the product of scalars in the final factor set as $\max_x \tilde{p}(x)$.)
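The decoding pass can be sketched as follows: run max-product VE while keeping each $\psi_t$, then walk the elimination order backwards and pick each variable's best value given the values already decoded. This is an illustrative sketch under assumptions: factors are `(scope, table)` pairs, every variable appears in at least one factor, and the function names are hypothetical.

```python
import itertools

def product_factor(factors, card):
    """Product of (scope, table) factors over the union of their scopes."""
    scope = tuple(sorted({v for s, _ in factors for v in s}))
    table = {}
    for assignment in itertools.product(*(range(card[v]) for v in scope)):
        full = dict(zip(scope, assignment))
        value = 1.0
        for s, t in factors:
            value *= t[tuple(full[v] for v in s)]
        table[assignment] = value
    return scope, table

def eliminate_max(factor, var):
    """Max-marginalize `var` out of a factor."""
    scope, table = factor
    idx = scope.index(var)
    out = {}
    for assignment, value in table.items():
        key = assignment[:idx] + assignment[idx + 1:]
        out[key] = max(value, out.get(key, float("-inf")))
    return scope[:idx] + scope[idx + 1:], out

def map_decode(factors, order, card):
    """Return the maximizing assignment x* as a dict variable -> value."""
    factors = list(factors)
    kept = []  # (variable, psi_t) pairs, stored before max-marginalization
    for var in order:
        relevant = [f for f in factors if var in f[0]]
        rest = [f for f in factors if var not in f[0]]
        psi = product_factor(relevant, card)
        kept.append((var, psi))
        factors = rest + [eliminate_max(psi, var)]
    x_star = {}
    # Backward pass: psi_t only involves x_{i_t} and variables eliminated later,
    # all of which have already been decoded at this point.
    for var, (scope, table) in reversed(kept):
        def score(value):
            full = {**x_star, var: value}
            return table[tuple(full[v] for v in scope)]
        x_star[var] = max(range(card[var]), key=score)
    return x_star

# Hypothetical chain x0 - x1 - x2 with binary variables.
chain = [
    ((0, 1), {(0, 0): 2.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 3.0}),
    ((1, 2), {(0, 0): 1.0, (0, 1): 4.0, (1, 0): 2.0, (1, 1): 1.0}),
]
card = {0: 2, 1: 2, 2: 2}
print(map_decode(chain, [0, 1, 2], card))
```

When several assignments tie for the maximum, this sketch returns one of them; which one depends on how `max` breaks ties.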

  19. Marginal-MAP variable elimination
     The procedure remains similar for $\max_{y_1, \ldots, y_m} \sum_{x_1, \ldots, x_n} \prod_I \phi_I(x_I)$.
     max and sum do not commute: $\max_x \sum_y \phi(x, y) \neq \sum_y \max_x \phi(x, y)$
     so we cannot use an arbitrary elimination order:
       first, eliminate $\{x_1, \ldots, x_n\}$ (sum-prod VE)
       then eliminate $\{y_1, \ldots, y_m\}$ (max-prod VE)
       decode the maximizing value
     Example (figure): exponential complexity despite low tree-width.
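A two-by-two table is enough to see that max and sum do not commute. The values below are made up for illustration; `phi[x][y]` stands for $\phi(x, y)$.

```python
# Hypothetical factor phi(x, y) over two binary variables, indexed phi[x][y].
phi = [
    [1, 4],
    [3, 1],
]

# max_x sum_y phi(x, y): sum out y first, then maximize over x.
max_of_sums = max(sum(phi[x][y] for y in range(2)) for x in range(2))

# sum_y max_x phi(x, y): maximize over x first, then sum over y.
sum_of_maxes = sum(max(phi[x][y] for x in range(2)) for y in range(2))

print(max_of_sums, sum_of_maxes)
```

Swapping the order lets each term of the sum pick its own maximizer, so the second quantity is never smaller than the first, and here it is strictly larger.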

  21. Max-product BP
     In clique-trees, cluster-graphs and factor-graphs, the following remain the same:
       building the chordal graph
       building the clique-tree
       tree-width (complexity of inference)
     Main differences:
       replacing sum with max
       decoding the maximizing assignment
       variational interpretation

  23. Max-product BP
     Example factor-graph: $p(x) = \frac{1}{Z} \prod_I \psi_I(x_I)$
     (figure: variables $x_1, \ldots, x_5$ with factors including $\psi_{\{1,2,4\}}$ and $\psi_{\{3,5\}}$)
     variable-to-factor message: $\delta_{i \to I}(x_i) \propto \prod_{J \mid i \in J,\, J \neq I} \delta_{J \to i}(x_i)$
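The variable-to-factor message is an elementwise product of the incoming factor-to-variable messages from every neighbouring factor except the recipient. The sketch below assumes messages are stored as plain lists over the values of $x_i$; the factor names and message values are hypothetical, and normalizing to sum to one is just one convenient choice for the proportionality constant.

```python
def variable_to_factor(incoming, target):
    """Message from variable i to factor `target`.

    incoming: dict mapping each neighbouring factor name J to its
              message delta_{J -> i}, a list over the values of x_i.
    """
    size = len(next(iter(incoming.values())))
    message = [1.0] * size
    for name, values in incoming.items():
        if name == target:
            continue  # exclude the recipient factor (J != I)
        message = [m * v for m, v in zip(message, values)]
    total = sum(message)
    return [m / total for m in message]  # normalization is an assumed convention

# Hypothetical incoming messages to a binary variable with three neighbouring factors.
incoming = {
    "J1": [1.0, 3.0],
    "J2": [2.0, 1.0],
    "I": [5.0, 1.0],
}
msg = variable_to_factor(incoming, "I")
print(msg)
```

This step is the same in sum-product and max-product BP; the two differ in the factor-to-variable message, where the sum over the factor's other variables is replaced with a max.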
