probability recap cs 188 artificial intelligence
play

Probability recap CS 188: Artificial Intelligence Conditional - PDF document

Probability recap CS 188: Artificial Intelligence Conditional probability Product rule Review of Probability, Bayes nets Chain rule X, Y independent iff: DISCLAIMER: It is insufficient to simply study these slides,


  1. Probability recap CS 188: Artificial Intelligence § Conditional probability § Product rule Review of Probability, Bayes’ nets § Chain rule § X, Y independent iff: DISCLAIMER: It is insufficient to simply study these slides, equivalently, iff: ∀ x, y : P ( x | y ) = P ( x ) they are merely meant as a quick refresher of the high-level ideas covered. You need to study all materials covered in equivalently, iff: ∀ x, y : P ( y | x ) = P ( y ) lecture, section, assignments and projects ! § X and Y are conditionally independent given Z iff: equivalently, iff: Pieter Abbeel – UC Berkeley ∀ x, y, z : P ( x | y, z ) = P ( x | z ) Many slides adapted from Dan Klein equivalently, iff: ∀ x, y, z : P ( y | x, z ) = P ( y | z ) 2 Inference by Enumeration Bayes’ Nets Recap § Representation § P(sun)? S T W P § Chain rule -> Bayes’ net = DAG + CPTs summer hot sun 0.30 § Conditional Independences summer hot rain 0.05 § P(sun | winter)? summer cold sun 0.10 § D-separation summer cold rain 0.05 winter hot sun 0.10 § Probabilistic Inference winter hot rain 0.05 § Enumeration (exact, exponential complexity) winter cold sun 0.15 § P(sun | winter, hot)? § Variable elimination (exact, worst-case winter cold rain 0.20 exponential complexity, often better) § Probabilistic inference is NP-complete 3 4 § Sampling (approximate) Chain Rule à Bayes net Probabilities in BNs § Chain rule: can always write any joint distribution as an § Bayes ’ nets implicitly encode joint distributions incremental product of conditional distributions § As a product of local conditional distributions § To see what probability a BN gives to a full assignment, multiply all the relevant conditionals together: § Example: § Bayes nets: make conditional independence assumptions of the form: B E P ( x i | x 1 · · · x i − 1 ) = P ( x i | parents ( X i )) A giving us: § This lets us reconstruct any entry of the full joint J M § Not every BN can represent every joint distribution § The topology enforces certain conditional independencies 5 6 1

  2. Example: Alarm Network Size of a Bayes’ Net for E P(E) B P(B) § How big is a joint distribution over N Boolean variables? B urglary E arthqk +e 0.002 +b 0.001 2 N ¬ e 0.998 ¬ b 0.999 § Size of representation if we use the chain rule A larm 2 N B E A P(A|B,E) § How big is an N-node net if nodes have up to k parents? +b +e +a 0.95 J ohn M ary O(N * 2 k+1 ) +b +e ¬ a 0.05 calls calls +b ¬ e +a 0.94 A J P(J|A) A M P(M|A) +b ¬ e ¬ a 0.06 § Both give you the power to calculate +a +j 0.9 +a +m 0.7 ¬ b +e +a 0.29 § BNs: ¬ b +e ¬ a 0.71 +a ¬ j 0.1 +a ¬ m 0.3 § Huge space savings! ¬ a +j 0.05 ¬ a +m 0.01 ¬ b ¬ e +a 0.001 § Easier to elicit local CPTs ¬ a ¬ j 0.95 ¬ a ¬ m 0.99 ¬ b ¬ e ¬ a 0.999 § Faster to answer queries 8 Bayes Nets: Assumptions D-Separation § Assumptions made by specifying the graph: § Question: Are X and Y Active Triples Inactive Triples conditionally independent given evidence vars {Z}? P ( x i | x 1 · · · x i − 1 ) = P ( x i | parents ( X i )) § Yes, if X and Y “ separated ” by Z § Consider all (undirected) paths § Given a Bayes net graph additional conditional from X to Y independences can be read off directly from the graph § No active paths = independence! § Question: Are two nodes guaranteed to be independent given § A path is active if each triple certain evidence? is active: § If no, can prove with a counter example § Causal chain A → B → C where B is unobserved (either direction) § I.e., pick a set of CPT’s, and show that the independence § Common cause A ← B → C assumption is violated by the resulting distribution where B is unobserved § If yes, can prove with § Common effect (aka v-structure) A → B ← C where B or one of its § Algebra (tedious) descendents is observed § D-separation (analyzes graph) § All it takes to block a path is 9 a single inactive segment D-Separation Example ? § Given query ⊥ X j |{ X k 1 , ..., X k n } X i ⊥ L § Shade all evidence nodes Yes § For all (undirected!) paths between and R B Yes § Check whether path is active § If active return ⊥ X j |{ X k 1 , ..., X k n } X i ⊥ D T § (If reaching this point all paths have been checked and shown inactive) Yes T ’ X i ⊥ ⊥ X j |{ X k 1 , ..., X k n } § Return 11 12 2

  3. All Conditional Independences Topology Limits Distributions Y Y X Z § Given a Bayes net structure, can run d- § Given some graph { X ⊥ ⊥ Y, X ⊥ ⊥ Z, Y ⊥ ⊥ Z, topology G, only certain X Z X ⊥ ⊥ Z | Y, X ⊥ ⊥ Y | Z, Y ⊥ ⊥ Z | X } separation to build a complete list of joint distributions can Y { X ⊥ ⊥ Z | Y } be encoded conditional independences that are X Z § The graph structure necessarily true of the form guarantees certain Y (conditional) independences X Z ⊥ X j |{ X k 1 , ..., X k n } X i ⊥ § (There might be more {} independence) § This list determines the set of probability § Adding arcs increases Y Y Y the set of distributions, distributions that can be represented by X Z X Z X Z but has several costs Bayes’ nets with this graph structure § Full conditioning can Y Y Y encode any distribution 13 14 X Z X Z X Z Inference by Enumeration Example: Enumeration § In this simple method, we only need the BN to § Given unlimited time, inference in BNs is easy synthesize the joint entries § Recipe: § State the marginal probabilities you need § Figure out ALL the atomic probabilities you need § Calculate and combine them § Example: B E A J M 15 16 Variable Elimination Variable Elimination Outline § Track objects called factors § Why is inference by enumeration so slow? § Initial factors are local CPTs (one per node) R § You join up the whole joint distribution before you sum out the hidden variables +r ¡ 0.1 ¡ +r ¡ +t ¡ 0.8 ¡ +t ¡ +l ¡ 0.3 ¡ § You end up repeating a lot of work! -­‑r ¡ 0.9 ¡ +r ¡ -­‑t ¡ 0.2 ¡ +t ¡ -­‑l ¡ 0.7 ¡ T -­‑r ¡ +t ¡ 0.1 ¡ -­‑t ¡ +l ¡ 0.1 ¡ -­‑r ¡ -­‑t ¡ 0.9 ¡ -­‑t ¡ -­‑l ¡ 0.9 ¡ § Idea: interleave joining and marginalizing! § Any known values are selected L § Called “ Variable Elimination ” § E.g. if we know , the initial factors are § Still NP-hard, but usually much faster than inference by enumeration +r ¡ 0.1 ¡ +r ¡ +t ¡ 0.8 ¡ +t ¡ +l ¡ 0.3 ¡ -­‑r ¡ 0.9 ¡ +r ¡ -­‑t ¡ 0.2 ¡ -­‑t ¡ +l ¡ 0.1 ¡ -­‑r ¡ +t ¡ 0.1 ¡ -­‑r ¡ -­‑t ¡ 0.9 ¡ 17 § VE: Alternately join factors and eliminate variables 18 3

  4. Variable Elimination Example Variable Elimination Example T +r ¡ 0.1 ¡ T, L L Sum out R Join R Join T Sum out T -­‑r ¡ 0.9 ¡ L +r ¡ +t ¡ 0.08 ¡ R +t ¡ 0.17 ¡ +r ¡ +t ¡ 0.8 ¡ +r ¡ -­‑t ¡ 0.02 ¡ -­‑t ¡ 0.83 ¡ +r ¡ -­‑t ¡ 0.2 ¡ -­‑r ¡ +t ¡ 0.09 ¡ +t ¡ 0.17 ¡ -­‑r ¡ +t ¡ 0.1 ¡ -­‑r ¡ -­‑t ¡ 0.81 ¡ -­‑t ¡ 0.83 ¡ T T +t ¡ +l ¡ 0.051 ¡ -­‑r ¡ -­‑t ¡ 0.9 ¡ R, T +l ¡ 0.134 ¡ +t ¡ -­‑l ¡ 0.119 ¡ -­‑l ¡ 0.886 ¡ -­‑t ¡ +l ¡ 0.083 ¡ +t ¡ +l ¡ 0.3 ¡ L L -­‑t ¡ -­‑l ¡ 0.747 ¡ +t ¡ +l ¡ 0.3 ¡ +t ¡ +l ¡ 0.3 ¡ +t ¡ +l ¡ 0.3 ¡ +t ¡ -­‑l ¡ 0.7 ¡ L +t ¡ -­‑l ¡ 0.7 ¡ +t ¡ -­‑l ¡ 0.7 ¡ +t ¡ -­‑l ¡ 0.7 ¡ -­‑t ¡ +l ¡ 0.1 ¡ -­‑t ¡ +l ¡ 0.1 ¡ -­‑t ¡ +l ¡ 0.1 ¡ -­‑t ¡ +l ¡ 0.1 ¡ -­‑t ¡ -­‑l ¡ 0.9 ¡ -­‑t ¡ -­‑l ¡ 0.9 ¡ -­‑t ¡ -­‑l ¡ 0.9 ¡ -­‑t ¡ -­‑l ¡ 0.9 ¡ 19 * VE is variable elimination Example Example Choose E Choose A Finish with B Normalize 21 22 Another (bit more abstractly worked General Variable Elimination out) Variable Elimination Example § Query: § Start with initial factors: § Local CPTs (but instantiated by evidence) § While there are still hidden variables (not Q or evidence): § Pick a hidden variable H § Join all factors mentioning H § Eliminate (sum out) H § Join all remaining factors and normalize Computational complexity critically depends on the largest factor being generated in this process. Size of factor = number of entries in table. In example above (assuming binary) all factors generated are of size 2 --- as 23 24 they all only have one variable (Z, Z, and X3 respectively). 4

Download Presentation
Download Policy: The content available on the website is offered to you 'AS IS' for your personal information and use only. It cannot be commercialized, licensed, or distributed on other websites without prior consent from the author. To download a presentation, simply click this link. If you encounter any difficulties during the download process, it's possible that the publisher has removed the file from their server.

Recommend


More recommend