  1. Anytime Approximate Inference in Graphical Models. Qi Lou, Final Defense, Dec. 5, 2018. Committee: Alexander Ihler (Chair), Rina Dechter, Sameer Singh

  2. Core of This Thesis

  3. Graphical Models • Describe structure in large problems – Large complex system – Made of “smaller”, “local” interactions – Complexity emerges through interdependence • More formally, a graphical model consists of: variables (we’ll assume discrete), domains, and (non-negative) functions or “factors” • Example (binary variables A, B, C with pairwise factors):
     A B | f(A,B)      B C | f(B,C)
     0 0 | 0.24        0 0 | 0.12
     0 1 | 0.56        0 1 | 0.36
     1 0 | 1.1         1 0 | 0.3
     1 1 | 1.2         1 1 | 1.8
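To make the example concrete, a minimal sketch of the model above: the joint (unnormalized) probability is the product of the local factors, and the partition function Z sums that product over all states. The factor values are taken from the example tables; everything else (names, structure) is our own choice for illustration.

```python
from itertools import product

# Toy model from the slide: binary variables A, B, C with pairwise
# factors f(A,B) and f(B,C). Values copied from the example tables.
f_AB = {(0, 0): 0.24, (0, 1): 0.56, (1, 0): 1.1, (1, 1): 1.2}
f_BC = {(0, 0): 0.12, (0, 1): 0.36, (1, 0): 0.3, (1, 1): 1.8}

def joint(a, b, c):
    """Unnormalized probability: the product of all local factors."""
    return f_AB[(a, b)] * f_BC[(b, c)]

# Partition function Z: sum the factor product over all 2^3 states.
Z = sum(joint(a, b, c) for a, b, c in product([0, 1], repeat=3))
print(Z)  # 4.3392 (up to floating point)
```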

  4. Graphical Models • Describe structure in large problems – Large complex system – Made of “smaller”, “local” interactions – Complexity emerges through interdependence • Examples & Tasks – Maximization (MAP): compute the most probable configuration [Yanover & Weiss 2002]

  5. Graphical Models • Describe structure in large problems – Large complex system – Made of “smaller”, “local” interactions – Complexity emerges through interdependence • Examples & Tasks – Summation & marginalization: the “partition function” Z, and marginals p(x_i | y) given an observation y (illustrated with image labeling: sky, cow, plane, grass), e.g., [Plath et al. 2009]

  6. Graphical Models • Describe structure in large problems – Large complex system – Made of “smaller”, “local” interactions – Complexity emerges through interdependence • Examples & Tasks – Mixed inference (marginal MAP, MEU, …): influence diagrams & optimal decision-making, illustrated by the “oil wildcatter” problem: given a seismic test result, choose a test/drill policy trading off test cost, drill cost, and oil sales, e.g., [Raiffa 1968; Shachter 1986]

  7. Inference Queries/Tasks • Maximum A Posteriori (MAP): NP-hard in general • The Partition Function Z: #P-complete [Valiant 1979] • Marginal MAP (MMAP): NP^PP-complete (decision version) [Park 2002]
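The three queries can be sketched by brute force on the toy model from the earlier example slide (pairing those tables with these queries is our own illustration, not the talk's): MAP maximizes over all variables, Z sums over all of them, and marginal MAP maximizes over one variable after summing out the rest.

```python
from itertools import product

# Toy factors from the example slide (binary variables A, B, C).
f_AB = {(0, 0): 0.24, (0, 1): 0.56, (1, 0): 1.1, (1, 1): 1.2}
f_BC = {(0, 0): 0.12, (0, 1): 0.36, (1, 0): 0.3, (1, 1): 1.8}

def f(a, b, c):
    return f_AB[(a, b)] * f_BC[(b, c)]

states = list(product([0, 1], repeat=3))

# MAP: the single most probable configuration (max over all states).
map_state = max(states, key=lambda s: f(*s))

# Partition function: sum over all states.
Z = sum(f(*s) for s in states)

# Marginal MAP: maximize over A after summing out B and C.
mmap_a = max([0, 1], key=lambda a: sum(f(a, b, c)
                                       for b, c in product([0, 1], repeat=2)))

print(map_state, Z, mmap_a)  # (1, 1, 1) 4.3392 1
```

On real models these exhaustive enumerations are exactly what is intractable, which is what motivates the approximate methods that follow.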

  8. Desired Properties: Guarantee, Anytime, Anyspace • Guarantee – bounded error at any time • Anytime – valid solution at any point – solution quality improves with additional computation • Anyspace – run with limited memory resources

  9. Approximate Inference • Three major paradigms – Variational methods: reason over small subsets of variables at a time – Sampling: use randomization to estimate averages over the state space – Search: structured enumeration over all possible states

  10. Approximate Inference • Three major paradigms – Variational methods, e.g., tree-reweighted belief propagation [Wainwright et al. 2003], mini-bucket elimination [Dechter & Rish 2001]
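A minimal sketch of mini-bucket elimination on the toy chain A-B-C, assuming the factor tables from the example slide and an elimination order C, B, A of our choosing. With an ibound of 1, the bucket of B is split into two mini-buckets; one is eliminated by sum and the other by max, which yields an upper bound on Z.

```python
# Toy factors from the example slide.
f_AB = {(0, 0): 0.24, (0, 1): 0.56, (1, 0): 1.1, (1, 1): 1.2}
f_BC = {(0, 0): 0.12, (0, 1): 0.36, (1, 0): 0.3, (1, 1): 1.8}

# Bucket C: sum C out of f(B,C), producing the message lambda_C(B).
lam_C = {b: f_BC[(b, 0)] + f_BC[(b, 1)] for b in (0, 1)}

# Bucket B, split into mini-buckets {f(A,B)} and {lambda_C(B)}:
# sum-eliminate B from the first, max-eliminate B from the second.
lam1 = {a: f_AB[(a, 0)] + f_AB[(a, 1)] for a in (0, 1)}  # lambda_B(A)
lam2 = max(lam_C.values())                               # scalar bound

# Bucket A: sum out the last variable.
Z_upper = (lam1[0] + lam1[1]) * lam2
print(Z_upper)  # 6.51, an upper bound on the exact Z = 4.3392
```

Exact elimination would keep f(A,B) and lambda_C(B) in one bucket and recover Z exactly; the split trades accuracy for bounded bucket size.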

  11. Approximate Inference • Three major paradigms – (Monte Carlo) Sampling, e.g., importance sampling (e.g., [Bidyuk & Dechter 2007]), approximate hash-based counting (e.g., [Chakraborty et al. 2016])
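An importance-sampling sketch for Z on the toy model, with a uniform proposal q(x) = 1/8 over the 2^3 states (the proposal and sample size are our choices, not from the slides). Since Z = E_q[f(x)/q(x)], the sample mean of the importance weights is an unbiased estimate of Z.

```python
import random
from itertools import product

# Toy factors from the example slide.
f_AB = {(0, 0): 0.24, (0, 1): 0.56, (1, 0): 1.1, (1, 1): 1.2}
f_BC = {(0, 0): 0.12, (0, 1): 0.36, (1, 0): 0.3, (1, 1): 1.8}

def f(a, b, c):
    return f_AB[(a, b)] * f_BC[(b, c)]

random.seed(0)
n = 100_000
total = 0.0
for _ in range(n):
    a, b, c = (random.randint(0, 1) for _ in range(3))
    total += f(a, b, c) / (1 / 8)   # importance weight f(x)/q(x)
Z_hat = total / n
print(Z_hat)  # close to the exact Z = 4.3392
```

With a better-matched proposal the weights have lower variance and the estimate converges faster; the uniform proposal is just the simplest valid choice.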

  12. Approximate Inference • Three major paradigms – (Heuristic) Search, e.g., [Lou et al. 2017], [Viricel et al. 2016], [Henrion 1991]

  13. Main Contributions of This Thesis

  14. Chapter 3: Best-first Search Aided by Variational Heuristics • Variational methods provide pre-compiled heuristics for search • AND/OR best-first search (AOBFS) • Unified best-first search (UBFS)

  15. Search Trees and Summation • Organize / structure the state space – Leaf nodes = model configurations – “Value” of a node = sum of configurations below (illustrated on a binary search tree over variables A through F)

  16. Search Trees and Summation • Heuristic search for summation – Heuristic function upper bounds the value (sum below) at any node – Expand the tree and compute updated bounds
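The idea above can be sketched as an anytime best-first search for Z over an OR tree on the toy model. The heuristic here is a crude stand-in of our own (maximize each factor over its unassigned arguments, times the number of completions), not the WMB heuristic the talk uses; it is still a valid upper bound on the sum below each node, so the global bound tightens monotonically as nodes are expanded and equals Z at the end.

```python
import heapq

# Toy factors from the example slide; fixed variable order A, B, C.
f_AB = {(0, 0): 0.24, (0, 1): 0.56, (1, 0): 1.1, (1, 1): 1.2}
f_BC = {(0, 0): 0.12, (0, 1): 0.36, (1, 0): 0.3, (1, 1): 1.8}
ORDER = "ABC"

def upper(assign):
    """Upper bound on the sum of factor products over all completions
    of a partial assignment (dict: variable name -> value)."""
    free = sum(1 for v in ORDER if v not in assign)
    def fmax(tab, x, y):
        return max(tab[(a, b)]
                   for a in ([assign[x]] if x in assign else (0, 1))
                   for b in ([assign[y]] if y in assign else (0, 1)))
    return fmax(f_AB, "A", "B") * fmax(f_BC, "B", "C") * 2 ** free

solved = 0.0                      # exact mass of fully evaluated leaves
frontier = [(-upper({}), ())]     # max-heap on each node's upper bound
U_trace = []                      # global bound after each step
while frontier:
    U_trace.append(solved - sum(p for p, _ in frontier))
    neg, vals = heapq.heappop(frontier)
    if len(vals) == len(ORDER):   # leaf: its bound is the exact value
        solved += -neg
        continue
    for v in (0, 1):              # expand: replace node by its children
        child = vals + (v,)
        heapq.heappush(frontier, (-upper(dict(zip(ORDER, child))), child))
U_trace.append(solved)
print(U_trace[0], U_trace[-1])    # 17.28 shrinking to Z = 4.3392
```

Stopping at any point gives a valid upper bound on Z, which is exactly the anytime behavior the slides are after.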

  17. AND/OR Best-first Search (AOBFS) • Search space: AND/OR search tree • Heuristic: weighted mini-bucket bound • Priority: expand the node that potentially reduces the bound gap U − L on Z most

  18. AND/OR Search Trees [Nilsson 1980; Dechter and Mateescu 2007] • A (full) solution tree corresponds to a complete configuration of all variables (illustrated on an example model over variables A through G)

  19. Weighted Mini-Bucket (WMB) Heuristics [Liu and Ihler, ICML’11] • Formed by intermediately generated factors (called messages, e.g., λ_B(A), λ_C(B)) • Upper (or lower) bound on the node value • Monotonic: resolving relaxations during search makes the heuristics more (no less) accurate • Quality can be roughly controlled by the ibound

  20. Priority • Intuition: expand the frontier node that potentially reduces the bound gap U − L (L ≤ Z ≤ U) most (diagram contrasts a “gap priority” and an “upper priority” on an example tree)

  21. Overcome the Memory Limit • Main strategy (SMA*-like [Russell 1992]) – Keep track of the lowest-priority node as well – When the memory limit is reached, delete the lowest-priority nodes and keep expanding the top-priority ones

  22. Anytime Behavior of AOBFS (plots on two benchmark instances: (a) PIC’11/queen5_5_4, (b) Protein/1g6x)

  23. Aggregated Results • Number of instances solved to a “tight” tolerance interval; the best (most solved) for each setting is bolded

  24. Best-first Search Aided by Variational Heuristics • Variational weighted mini-bucket (WMB) methods [Liu and Ihler, ICML’11] provide optimized heuristics • AND/OR best-first search (AOBFS) for Z • Unified best-first search (UBFS) for marginal MAP

  25. Unified Best-first Search (UBFS) • Idea: unify max- and sum-inference in one search framework – avoids some unnecessary exact evaluation of conditional summation problems • Principle: focus on reducing the upper bound of MMAP as quickly as possible • How it works: – Track the current most promising (partial) MAP configuration, i.e., the one with the highest upper bound – Expand the most “influential” frontier node of that (partial) MAP configuration: the frontier node that contributes most to its upper bound, identified by a specially designed “double-priority” system
