Bayesian Belief Network 14.4: Inference
[RN, Chapter 14]
Decision Theoretic Agents
- Introduction to Probability [Ch13]
- Belief networks [Ch14]
  - Introduction [Ch14.1-14.2]
  - Bayesian Net Inference [Ch14.4] (Bucket Elimination)
- Dynamic Belief Networks [Ch15]
- Single Decision [Ch16]
- Sequential Decisions [Ch17]
  - Game Theory [Ch17.6 – 17.7]
Types of Reasoning
Typical case: P( QueryVar | EvidenceVars = vals )
Eg: P( +Burglary | +JohnCalls, ¬MaryCalls )

Diagnostic: from effect to (possible) causes
- P( +Burglary | +JohnCalls ) = 0.016
Causal: from cause to effects
- P( +JohnCalls | +Burglary ) = 0.86
InterCausal: between causes of a common effect
- P( +Burglary | +Alarm ) = 0.376
- P( +Burglary | +Alarm, +Earthquake ) = 0.003
  Earthquake EXPLAINS the alarm, and so Earthquake EXPLAINS AWAY burglary
Mixed: combinations of the above
- P( +Alarm | +JohnCalls, ¬Earthquake ) = 0.03
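These numbers can be reproduced by brute-force enumeration over the full joint. A minimal Python sketch, assuming the standard alarm-network CPTs (which match the factor tables shown later in these slides); the names `joint` and `query` are illustrative, not from the slides:

```python
import itertools

# Alarm-network CPTs, as in the factor tables later in these slides.
P_b = {True: 0.001, False: 0.999}
P_e = {True: 0.002, False: 0.998}
P_a = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}   # P(+a | B, E)
P_j = {True: 0.90, False: 0.05}                      # P(+j | A)
P_m = {True: 0.70, False: 0.01}                      # P(+m | A)

def joint(b, e, a, j, m):
    """Full joint via the chain rule along the network structure."""
    p = P_b[b] * P_e[e]
    p *= P_a[(b, e)] if a else 1 - P_a[(b, e)]
    p *= P_j[a] if j else 1 - P_j[a]
    p *= P_m[a] if m else 1 - P_m[a]
    return p

def query(b_val, evidence):
    """P(B = b_val | evidence), by brute-force enumeration."""
    num = den = 0.0
    for b, e, a, j, m in itertools.product([True, False], repeat=5):
        world = {'b': b, 'e': e, 'a': a, 'j': j, 'm': m}
        if any(world[k] != v for k, v in evidence.items()):
            continue                       # inconsistent with the evidence
        p = joint(b, e, a, j, m)
        den += p
        if b == b_val:
            num += p
    return num / den

print(query(True, {'j': True}))              # diagnostic: ~0.016
print(query(True, {'a': True}))              # intercausal: ~0.37
print(query(True, {'a': True, 'e': True}))   # explaining away: ~0.003
```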
Approaches to Belief Assessment
Exact, Guaranteed:
- PolyTree Algorithm
- (inherent complexity …)
- Clustering Approach
- Bucket Elimination
- CutSet Approach
Approximate, Guaranteed:
- Algorithm Modification
- Value Merging
- Node Merging
- Arc Removal
Approximate, Probabilistic:
- Logic Sampling
- Likelihood Sampling
Inherent Complexity
Worst case:
- NP-hard to get the exact answer (indeed, #P-complete)
- NP-hard to get the answer within 0.5
- Cannot get relative error within 2^(n^(1−ε)), unless P = NP
- Cannot stochastically approximate (to even 1 bit), unless P = RP

Efficient algorithms exist …
- for "PolyTrees" (≤ 1 path between any two nodes): poly time
- if the CPtables are "bounded" wrt λ = M/m
  (M = largest CPtable entry; m = smallest): sub-exponential time

(The hardness proofs reduce from 3-CNF formulas, eg:
  1. A ∨ B ∨ C
  2. C ∨ D ∨ ¬A
  3. B ∨ C ∨ ¬D )
Exact Inference: Re-arrange Sums
P( A=a ) = ∑b P( A=a, B=b )

P( +b, +j, +m )
  = ∑e ∑a P( +b, E=e, A=a, +j, +m )
  = ∑e ∑a P(+b) P(e) P(a|+b,e) P(+j|a) P(+m|a)
  = P(+b) ∑e P(e) ∑a P(a|+b,e) P(+j|a) P(+m|a)
Still Duplicated Computation!
P( +b, +j, +m ) = P(+b) ∑e P(e) ∑a P(a|+b,e) P(+j|a) P(+m|a)

Enumeration is inefficient, as it repeats computation:
it recomputes P(+j|a) P(+m|a) for each value of E ∈ { +e, −e }.
Better to view the expression as a DAG and re-use COMMON SUBEXPRESSIONS!
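A small numeric check of the re-arranged form, reusing the CPT dictionaries from the enumeration sketch above; note the inner product P(+j|a) P(+m|a) is computed once per value of A, not re-derived for every value of E:

```python
# Evaluate P(+b, +j, +m) with the sums pushed inward.
inner = {a: P_j[a] * P_m[a] for a in (True, False)}   # shared subexpression
total = 0.0
for e in (True, False):
    s = 0.0
    for a in (True, False):
        p_a = P_a[(True, e)] if a else 1 - P_a[(True, e)]   # P(a | +b, e)
        s += p_a * inner[a]
    total += P_e[e] * s
print(P_b[True] * total)   # P(+b, +j, +m) ≈ 0.00059
```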
Bucket-Elimination: Set-up

Given:
- specific structure: A → B, A → C, and B, C → D
- specific CPtable entries (below)
- a fixed ordering over the variables: π0 = 〈A, B, C, D〉
Create |Vars| + 1 buckets: b{ }, bA, bB, bC, bD

  θA=1 = 0.4    θA=0 = 0.6

  a | θB=1|A=a  θB=0|A=a
  1 |   0.325     0.675
  0 |   0.440     0.550

  a | θC=1|A=a  θC=0|A=a
  1 |   0.200     0.800
  0 |   0.367     0.633

  b c | θD=1|B=b,C=c  θD=0|B=b,C=c
  1 1 |     0.300         0.700
  1 0 |     0.333         0.667
  0 1 |     0.250         0.750
  0 0 |     0.450         0.550
The CPtables, written as factors (alarm network; the evidence will be −b, +j, +m):

  fB(b) = λ〈B〉:  b=1: 0.001,  b=0: 0.999
  fE(e) = λ〈E〉:  e=1: 0.002,  e=0: 0.998
  fJ(j,a) = λ〈J,A〉:  (1,1): 0.90,  (1,0): 0.05,  (0,1): 0.10,  (0,0): 0.95
  fM(m,a) = λ〈M,A〉:  (1,1): 0.70,  (1,0): 0.01,  (0,1): 0.30,  (0,0): 0.99
  fA(a,e,b) = λ〈A,E,B〉:  (1,1,1): 0.95,  (1,1,0): 0.29,  …,  (0,0,1): 0.06,  (0,0,0): 0.999
Instantiating the evidence −b, +j, +m restricts each factor to the evidence values,
so the evidence variables drop out of the argument lists:

  f−b() = λ〈〉:  0.999
  fE(e) = λ〈E〉:  e=1: 0.002,  e=0: 0.998   (unchanged: E is not evidence)
  f+j(a) = λ〈A〉:  a=1: 0.90,  a=0: 0.05
  f+m(a) = λ〈A〉:  a=1: 0.70,  a=0: 0.01
  fA,−b(a,e) = λ〈A,E〉:  (1,1): 0.29,  (1,0): 0.001,  (0,1): 0.71,  (0,0): 0.999
Each instantiated factor is then stored in the bucket of its highest-ordered variable
(buckets b{ }, bB, bE, bA, bJ, bM):

  b{ }:  f{ },1() = θ−b
  bB:    (empty: B is evidence)
  bE:    fE,1(e) = θe
  bA:    fA,1(a,e) = θa|−b,e,   fA,2(a) = θ+j|a,   fA,3(a) = θ+m|a
  bJ, bM:  (empty: J, M are evidence)
“Variable Elimination”: Factors
P( -b, +j, +m ) = P(-b) ∑e P(e) ∑a P(a|-b,e) P(+j|a) P(+m|a)

(Network: B, E → A;  A → J;  A → M)

Store intermediate results (factors) to avoid recomputation:
- Factor for J, after evidence +j: f+j(a) ≡ a 2-element vector
- Factor for M, after evidence +m: f+m(a) ≡ a 2-element vector
- Factor for A, after evidence -b: fA,-b(a,e) ≡ a 4-element table
BE Alg, con't

Process buckets, from highest to lowest:
- gX := elimX[ fX,1 ⋈ fX,2 ⋈ … ⋈ fX,k ]
- gX is a function of ∪i Vars( fX,i ) − { X }
- Let the highest remaining index be "Y"; store gX into bY

Process bA:
- gA(e) = elimA[ fA,1 ⋈ fA,2 ⋈ fA,3 ]
- add it to bE as fE,2(e) = elimA[ fA,1 ⋈ fA,2 ⋈ fA,3 ]

Buckets now:  b{ }: f{ },1() = θ−b;   bE: fE,1(e) = θe,  fE,2(e) = gA(e)
BE Alg, con't

Process bE:
- gE() = elimE[ fE,1 ⋈ fE,2 ]
- add it to b{ } as f{ },2() = elimE[ fE,1 ⋈ fE,2 ]

Buckets now:  b{ }: f{ },1() = θ−b,  f{ },2() = gE()
BE Alg, con't

Process b{ }:
- g{ }() = [ f{ },1 ⋈ f{ },2 ]
- Return g{ }() = P( -b, +j, +m )
Bucket Elimination Algorithm
Given:
- Belief Net BN = 〈 N, A, C 〉 (nodes, arcs, CPtables)
- an order of the nodes π = 〈 X1, …, X|N| 〉
- evidence (nodes { Ei } ⊂ N, values { ei })
- a (single) query node X ∈ N
Compute: P( X | E1 = e1, … ), by computing P( X = x, E1 = e1, … ) for each x.

Step# 1: Initialize |N| + 1 "buckets": … bucket bi for variable Xi.
  Each "instantiated form of a CPtable" is a function of some set of variables;
  store it in the bucket with the highest index among those variables.
Step# 2: Process each bucket, from the highest index down,
  to eliminate the associated variable.
Step# 3: Read off the answer in the "top" bucket, b{ }.
A minimal implementation sketch follows.
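A minimal Python sketch of the three steps, assuming boolean variables; `Factor`, `join`, and `eliminate` are illustrative names for the tables, the ⋈ operation, and elimX:

```python
from itertools import product

class Factor:
    """A table mapping assignments of `names` (a tuple of variables) to reals."""
    def __init__(self, names, table):
        self.names, self.table = tuple(names), table

def join(f, g):
    """Pointwise product of two factors (the ⋈ on the slides)."""
    names = f.names + tuple(v for v in g.names if v not in f.names)
    table = {}
    for vals in product([True, False], repeat=len(names)):
        a = dict(zip(names, vals))
        table[vals] = (f.table[tuple(a[v] for v in f.names)]
                       * g.table[tuple(a[v] for v in g.names)])
    return Factor(names, table)

def eliminate(f, x):
    """Sum out variable x (elimX on the slides)."""
    keep = tuple(v for v in f.names if v != x)
    table = {}
    for vals, p in f.table.items():
        key = tuple(v for n, v in zip(f.names, vals) if n != x)
        table[key] = table.get(key, 0.0) + p
    return Factor(keep, table)

def bucket_elimination(factors, order):
    """Steps 1-3: fill the buckets, process them highest-first, read off b{}."""
    buckets = {x: [] for x in order}
    top = []                                   # the b{ } bucket
    def place(f):
        idx = [order.index(v) for v in f.names]
        (buckets[order[max(idx)]] if idx else top).append(f)
    for f in factors:                          # Step 1
        place(f)
    for x in reversed(order):                  # Step 2
        if buckets[x]:
            g = buckets[x][0]
            for f in buckets[x][1:]:
                g = join(g, f)
            place(eliminate(g, x))
    g = Factor((), {(): 1.0})
    for f in top:                              # Step 3
        g = join(g, f)
    return g.table[()]

# Evidence -b, +j, +m already instantiated into the factors, as on the slides:
f_b = Factor((), {(): 0.999})                             # θ-b
f_e = Factor(('E',), {(True,): 0.002, (False,): 0.998})   # θe
f_a = Factor(('A', 'E'), {(True, True): 0.29, (True, False): 0.001,
                          (False, True): 0.71, (False, False): 0.999})  # θa|-b,e
f_j = Factor(('A',), {(True,): 0.90, (False,): 0.05})     # θ+j|a
f_m = Factor(('A',), {(True,): 0.70, (False,): 0.01})     # θ+m|a
print(bucket_elimination([f_b, f_e, f_a, f_j, f_m], ['E', 'A']))  # ≈ 0.00149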
Remove “Dead Variables”
Note: for any A=a, ∑m P( M=m | a ) = 1
⇒ can remove this node!

In general: need only keep the nodes ABOVE the query and evidence nodes
(remove any nodes below them).

P( +b, +j )
  = ∑e ∑a ∑m P( +b, E=e, A=a, +j, M=m )
  = ∑e ∑a ∑m P(+b) P(e) P(a|+b,e) P(+j|a) P(m|a)
  = P(+b) ∑e P(e) ∑a P(a|+b,e) P(+j|a) ∑m P(m|a)
  = P(+b) ∑e P(e) ∑a P(a|+b,e) P(+j|a)
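A quick check of the barren-node claim, reusing the CPT dictionaries from the enumeration sketch above: summing M out contributes a factor of 1, so dropping M leaves P(+b, +j) unchanged.

```python
# Pruning the barren node M: ∑m P(m|a) = 1 for every a, so P(+b, +j) is the
# same whether M is summed out or simply dropped.
with_m = without_m = 0.0
for e in (True, False):
    for a in (True, False):
        p_a = P_a[(True, e)] if a else 1 - P_a[(True, e)]   # P(a | +b, e)
        base = P_b[True] * P_e[e] * p_a * P_j[a]
        without_m += base
        with_m += base * (P_m[a] + (1 - P_m[a]))            # ∑m P(m|a) = 1
print(with_m, without_m)   # both ≈ 0.000849
```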
Approaches to Belief Assessment
Exact, Guaranteed:
- PolyTree Algorithm
- (inherent complexity …)
- Clustering Approach
- Bucket Elimination
- CutSet Approach
Approximate, Guaranteed:
- Algorithm Modification
- Value Merging
- Node Merging
- Arc Removal
Approximate, Probabilistic:
- Logic Sampling
- Likelihood Sampling
Logic Sampling
What is P( WG = + )?
- Get a DataSample.
- Of its 5 tuples, 2 have WG = +.
- So set P( WG = + ) = 2/5.

But … how to generate the examples? Uniformly?? No!
Eg, in the two-node network A → B with P( +b | +a ) = 1.0 and P( +b | -a ) = 0.0,
what is P( +a, -b )? It is 0, so that tuple should never be generated.
Generate tuples based on the distribution!!
Example of Logic Sampling
- To get the value of "Cloudy": flip a 0.5-coin.
  Assume "Cloudy = True".
- To get the value of "Sprinkler": flip a 0.1-coin
  (as Cloudy = True, P( +s | +c ) = 0.10).
  Assume "Sprinkler = False".
- To get the value of "Rain": flip a 0.8-coin
  (as Cloudy = True, P( +r | +c ) = 0.8).
  Assume "Rain = True".
- To get the value of "WetGrass": flip a 0.9-coin
  (as Sprinkler = F, Rain = T, P( +w | ¬s, +r ) = 0.9).
  Assume "WetGrass = True".
- This trial yields the tuple 〈 C=T, S=F, R=T, W=T 〉.
- On other trials, get other results, as the coin-flips come out differently.
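A minimal sketch of this sampler plus rejection, for estimating P( +w | +r ). The CPT entries not quoted on these slides (P(+s|-c) = 0.5, P(+r|-c) = 0.2, P(+w|+s,+r) = 0.99, P(+w|+s,-r) = 0.90, P(+w|¬s,¬r) = 0) are the standard textbook sprinkler values and should be read as assumptions:

```python
import random

def prior_sample():
    """One tuple from the sprinkler network, sampled in topological order."""
    c = random.random() < 0.5
    s = random.random() < (0.10 if c else 0.50)
    r = random.random() < (0.80 if c else 0.20)
    pw = {(True, True): 0.99, (True, False): 0.90,
          (False, True): 0.90, (False, False): 0.00}[(s, r)]
    w = random.random() < pw
    return c, s, r, w

samples = [prior_sample() for _ in range(100_000)]
# Estimate P(+w | +r): keep only samples consistent with the evidence +r.
relevant = [t for t in samples if t[2]]
print(sum(t[3] for t in relevant) / len(relevant))   # roughly 0.92
```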
Stochastic Approximation 1: Logic Sampling
To estimate P( X | E = e ): produce random instances from the BN, via PriorSample.
Note: if an instance has E ≠ e, just ignore that instance.
Aside: Flipping A Coin
Consider flipping a (fair) coin m times.
… expect to observe ≈ 0.5 m heads.
Could have a "bad run" … suggesting the coin is not fair.
How (un)likely is it to observe ≥ 55% heads (10% more than expected),
as a function of m? What's the probability of:
(1) m = 100:  ≥ 55 heads
(2) m = 500:  ≥ 275 heads
(3) m = 1,000:  ≥ 550 heads
(4) m = 10,000:  ≥ 5,500 heads?
Using Chernoff Bounds
The Xi's are iid … for now, with μ = 0.5.
Pr[ Sm > 0.55 ] < e^(−2 m (0.05)^2)

  m = 100     ⇒  < 0.61
  m = 500     ⇒  < 0.09
  m = 1,000   ⇒  < 0.007
  m = 10,000  ⇒  < 2 × 10^-22
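The four bounds can be reproduced directly from the formula (with range Γ = 1 for a coin):

```python
import math

# Hoeffding/Chernoff bound Pr[S_m > mu + lam] < exp(-2 m lam^2), lam = 0.05.
for m in (100, 500, 1_000, 10_000):
    print(m, math.exp(-2 * m * 0.05**2))
# -> 0.607, 0.082, 0.0067, 1.9e-22
```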
Bad Runs are Rare
Pr[ Sm > μ + λ ] < e^(−2m (λ/Γ)^2)
Pr[ Sm < μ − λ ] < e^(−2m (λ/Γ)^2)
⇒ Pr[ |Sm − μ| < λ ] ≥ 1 − 2 e^(−2m (λ/Γ)^2)

(Here Γ is the range of the bounded variables; Γ = 1 for a coin.)
Holds ∀ (bounded) distributions!!!
… not just μ = 0.5 … not just Bernoulli …

Unrepresentative runs are exponentially unlikely in large samples!
⇒ Can get good results with a small ("polynomial") number of examples.

Aside: this is the secret behind randomized algorithms,
eg estimating integrals, Monte Carlo simulation, …
One can almost get "certainty" from a probabilistic phenomenon.
Use of DataSample (Logic Sampling)
DataSample seen: 5 tuples, including 2 with WG = +.

What about P( +c | +wg )?
- A tuple is IRRELEVANT unless it has +wg, so only 2 tuples are relevant.
- Of these, 1 has +c ⇒ P( +c | +wg ) = 1/2.

But conditioning quickly exhausts the sample:
- P( +r | +wg, +c ) = 0/1 ??
- P( +c | +r, +wg ) = 0/0 ?!

With k conditioning variables, expect only ~ (1/2)^k of the tuples to be relevant …
Still, the estimate is consistent: in the limit, it produces the correct answer.
Stochastic Approximation 2: Likelihood Weighted Sampling
Logic sampling is VERY SLOW if P( E ) is low …
as it ignores most tuples!

INSTEAD … when generating tuples:
- insist that each Ei = ei,
- but give the tuple a "weight" of P( Ei = ei | U = u ),
  where U are Ei's parents and u is the current assignment to U.
This is a form of Importance Sampling. Note the weight is ≠ 1 in general!
Example of Likelihood Weighted Sampling
Want P( WetGrass | +Rain ):
- To get the value of Cloudy: flip a 0.5-coin.
  Assume Cloudy = False.
- To get the value of Sprinkler: flip a 0.5-coin
  (as Cloudy = False, P( +s | -c ) = 0.5).
  Assume Sprinkler = True.
- Now for "+Rain": an evidence variable, so set it to True!
  As Cloudy = False, P( +r | -c ) = 0.2, so this run counts with weight 0.2.
- To get the value of WetGrass: flip a 0.99-coin
  (as Sprinkler = T, Rain = T, P( +w | +s, +r ) = 0.99).
  Assume WetGrass = True.
- So increment the total weight W by 0.2,
  and increment w+WG (the weight accumulated for +WetGrass) by 0.2.
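A minimal sketch of likelihood weighting for P( WetGrass | +Rain ), with the same assumed sprinkler CPTs as the logic-sampling sketch above:

```python
import random

def weighted_sample():
    """One likelihood-weighted sample for P(WetGrass | +Rain)."""
    w = 1.0
    c = random.random() < 0.5
    s = random.random() < (0.10 if c else 0.50)
    r = True                       # evidence variable: fixed, not sampled ...
    w *= 0.80 if c else 0.20       # ... but weighted by P(+r | c)
    pw = {(True, True): 0.99, (True, False): 0.90,
          (False, True): 0.90, (False, False): 0.00}[(s, r)]
    wg = random.random() < pw
    return wg, w

total = wg_weight = 0.0
for _ in range(100_000):
    wg, w = weighted_sample()
    total += w                     # the running total W
    if wg:
        wg_weight += w             # the running total w+WG
print(wg_weight / total)           # estimate of P(+wg | +r), roughly 0.92
```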
Use of DataSample (Logic Sampling, revisited)

DataSample seen … for Logic Sampling:
- Out of 100 tuples, only 5 are relevant (have +r).
- Of these 5, only 3 also have +wg.
⇒ P( +wg | +r ) = 3/5
Use of DataSample (Likelihood Weighted Sampling)

DataSample seen:
- All 5 tuples now have +r.
- Total "weight", summing over ALL tuples: 1.6.
- Weight, summing only the tuples with +wg: 1.0.
⇒ P( +wg | +r ) = 1.0 / 1.6
Other Techniques
MCMC [Markov Chain Monte Carlo]:
- Move about in the space of instances:
  fix the evidence variables; guess values for the other variables.
- Repeatedly guess new values for each non-evidence variable,
  sampling from its conditional distribution given its Markov blanket.
- Collect the instances … then take the average.
A minimal Gibbs-sampling sketch appears below.

Variational Methods
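A minimal Gibbs-sampling sketch for P( +w | +r ) on the same assumed sprinkler network. Resampling each variable from its conditional given all the others is equivalent to sampling from its Markov-blanket distribution; here that conditional is computed by evaluating the full joint at both values:

```python
import random

def joint_sprinkler(c, s, r, w):
    """Full joint for the sprinkler network (same CPTs as the sketches above)."""
    p = 0.5                                                  # P(c) = P(-c) = 0.5
    p *= (0.10 if c else 0.50) if s else (0.90 if c else 0.50)
    p *= (0.80 if c else 0.20) if r else (0.20 if c else 0.80)
    pw = {(True, True): 0.99, (True, False): 0.90,
          (False, True): 0.90, (False, False): 0.00}[(s, r)]
    p *= pw if w else 1 - pw
    return p

# Rain is clamped to the evidence; C, S, W are repeatedly resampled.
state = {'c': True, 's': False, 'r': True, 'w': True}
count = total = 0
for step in range(50_000):
    for var in ('c', 's', 'w'):
        hi = joint_sprinkler(**{**state, var: True})
        lo = joint_sprinkler(**{**state, var: False})
        state[var] = random.random() < hi / (hi + lo)
    if step >= 5_000:              # discard burn-in instances
        total += 1
        count += state['w']
print(count / total)               # estimate of P(+w | +r), roughly 0.92
```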
Other BN Tasks
MPE (Most Probable Explanation):
- Given evidence E = e (E1 = e1, …, Em = em),
  find the assignment x to all non-evidence variables that maximizes P( x | E = e )
  = arg maxx ∏i P( xi | e, pai )
- Alg ≈ BucketElim for BeliefAssessment, but with ∑ replaced by max
  (see the sketch below).

MAP (Maximum A Posteriori):
- Given evidence E = e and a set of hypothesis variables H1, …, Hk,
  find the assignment h to just the HYPOTHESIS variables that maximizes
  P( h | E = e ).
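A sketch of the ∑ → max swap, reusing the illustrative `Factor` class from the bucket-elimination sketch above; to recover the maximizing assignment itself, one would also record the arg max at each elimination step:

```python
def eliminate_max(f, x):
    """Like eliminate(), but with max in place of sum: bucket elimination
    then computes max_x of the product, ie the MPE value."""
    keep = tuple(v for v in f.names if v != x)
    table = {}
    for vals, p in f.table.items():
        key = tuple(v for n, v in zip(f.names, vals) if n != x)
        table[key] = max(table.get(key, 0.0), p)
    return Factor(keep, table)
```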
Probabilistic Inference Tasks, in Gen’l
Simple queries: compute the posterior marginal P( X | E = e )
- eg, P( NoGas | Gauge = empty, Lights = on, Starts = false )
Conjunctive queries:
- P( X, Y | E = e ) = P( X | E = e ) P( Y | X, E = e )
Optimal decisions:
- decision networks include utility information;
  probabilistic inference is required for P( outcome | action, evidence )
Value of information: which evidence to seek next?
Sensitivity analysis: which probability values are most critical?
Explanation: why do I need a new starter motor?
Summary
Belief Net Inference is intractable
- in theory, and in practice
- … unless TREE-structured: then fast O(n) algorithms exist
Exact algorithms:
- many "reduce" the problem to the tree algorithm (cut-set, clustering)
- others factor out common redundancies (bucket elimination)
Stochastic algorithms are effective
- but need to worry about rare conditioning events