Introduction to Bayesian Belief Nets
Russ Greiner
Dept of Computing Science
Alberta Ingenuity Centre for Machine Learning
University of Alberta
http://www.cs.ualberta.ca/~greiner/bn.html

Motivation
Gates says [LATimes, 28/Oct/96]:
Microsoft's competitive advantage is its expertise in "Bayesian networks"
Current Products
Microsoft Pregnancy and Child Care (MSN)
Answer Wizard (Office, …)
Print Troubleshooter
Excel Workbook Troubleshooter
Office 95 Setup Media Troubleshooter
Windows NT 4.0 Video Troubleshooter
Word Mail Merge Troubleshooter
4
Motivation (II)
US Army: SAIP (Battalion Detection from SAR, IR… Gulf War)
NASA: Vista (DSS for Space Shuttle)
GE: Gems (real-time monitor for utility generators)
Intel: infer possible processing problems from end-of-line tests on semiconductor chips
KIC:
  medical: sleep disorders, pathology, trauma care, hand and wrist evaluations, dermatology, home-based health evaluations
  DSS for capital equipment: locomotives, gas-turbine engines, office equipment
5
Motivation (III)
Lymph-node pathology diagnosis
Manufacturing control
Software diagnosis
Information retrieval

Types of tasks:
Classification/Regression
Sensor Fusion
Prediction/Forecasting
6
Outline
Existing uses of Belief Nets (BNs)
What is a BN?
Specific Examples of BNs
Contrast with Rules, Neural Nets, …
Possible applications of BNs
Challenges:
  How to reason efficiently
  How to learn BNs
7
Blah blah ouch yak
ouch blah ouch blah
blah ouch blah
Symptoms: chief complaint, history, …
Signs: physical exam, test results, …
Diagnosis
Plan: treatment, …
8
Objectives: Decision Support System
Determine
  which tests to perform
  which repair to suggest
based on costs, sensitivity/specificity, …
Use all sources of information:
  symbolic (discrete observations, history, …)
  signal (from sensors)
Handle partial information
Adapt to track the fault distribution
9
Underlying Task
Situation: Given observations { O1= v1, … Ok= vk}
(symptoms, history, test results, …)
what is best DIAGNOSIS Dxi for patient?
Approach 1: Use a set of "obs1 & … & obsm → Dxi" rules
but… need a rule for each situation:
  for each diagnosis Dxr
  for each set of possible values vj for Oj
  for each subset of obs. {Ox1, Ox2, … } ⊂ {Oj}
Can't use "If Temp>100 & BP = High & Cough = Yes → DiseaseX" if we only know Temp and BP
Seldom completely certain
10
Underlying Task
Situation: Given observations { O1= v1, … Ok= vk}
(symptoms, history, test results, …)
what is best DIAGNOSIS Dxi for patient?
Approach 2: Compute probabilities of Dxi given the observations {obsj}:
P( Dx = u | O1 = v1, …, Ok = vk )
Challenge: How to express Probabilities?
11
How to Deal with Probabilities

Sufficient: "atomic events":
P( Dx = u, O1 = v1, …, Ok = vk, …, ON = vN )
for all 2^(1+N) values u ∈ {T, F}, vj ∈ {T, F}, e.g.:
P( Dx=T, O1=T, O2=T, …, ON=T ) = 0.03
P( Dx=T, O1=T, O2=T, …, ON=F ) = 0.4
…
P( Dx=T, O1=F, O2=F, …, ON=T ) = 0
…
P( Dx=F, O1=F, O2=F, …, ON=F ) = 0.01

Then marginalize:
P( Dx=u, O1=v1, …, O7=v7 ) = Σ_{v8,…,vN} P( Dx=u, O1=v1, …, O7=v7, O8=v8, …, ON=vN )
and conditionalize:
P( Dx=u | O1=v1, …, O7=v7 ) = P( Dx=u, O1=v1, …, O7=v7 ) / P( O1=v1, …, O7=v7 )

But… even with a binary Dx and 20 binary obs.'s ⇒ >2,097,000 numbers!
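The marginalize/conditionalize steps above can be sketched in Python over a small joint table. The probabilities below are invented for illustration; only the mechanics match the slide:

```python
# Hypothetical joint distribution over Dx and two binary observations O1, O2.
# Keys are (dx, o1, o2); the eight entries sum to 1.
joint = {
    (1, 1, 1): 0.03, (1, 1, 0): 0.10, (1, 0, 1): 0.02, (1, 0, 0): 0.05,
    (0, 1, 1): 0.05, (0, 1, 0): 0.15, (0, 0, 1): 0.10, (0, 0, 0): 0.50,
}

def marginal(dx, o1):
    """Marginalize: sum the joint over the unobserved variable O2."""
    return sum(joint[(dx, o1, o2)] for o2 in (0, 1))

def conditional(dx, o1):
    """Conditionalize: P(Dx=dx | O1=o1) = P(Dx=dx, O1=o1) / P(O1=o1)."""
    return marginal(dx, o1) / sum(marginal(d, o1) for d in (0, 1))

print(conditional(1, 1))  # P(Dx=1 | O1=1) ≈ 0.394
```

With N observations the table needs 2^(1+N) entries, which is exactly the blow-up the slide complains about.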
12
Problems with “Atomic Events”
- Representation is not intuitive
⇒ Should make “connections” explicit
use “local information”
P(Jaundice | Hepatitis), P(LightDim | BadBattery),…
Too many numbers – O(2N)
Hard to store Hard to use
[Must add 2r values to marginalize r variables]
Hard to learn
[Takes O(2N) samples to learn 2N parameters]
⇒ Include only necessary "connections" ⇒ Belief Nets
13
? Hepatitis?  ? Hepatitis, not Jaundiced but +BloodTest?  Jaundiced, BloodTest
14
Encoding Causal Links

Simple Belief Net:  H → B,  H → J,  B → J

P(H=0) = 0.95   P(H=1) = 0.05

h | P(B=0 | H=h) | P(B=1 | H=h)
1 |     0.05     |     0.95
0 |     0.97     |     0.03

h b | P(J=0 | h, b) | P(J=1 | h, b)
1 1 |      0.2      |      0.8
1 0 |      0.2      |      0.8
0 1 |      0.7      |      0.3
0 0 |      0.7      |      0.3

Node ~ Variable
Link ~ "Causal dependency"
"CPTable" ~ P(child | parents)
15
Encoding Causal Links

H  B  J

P(J | H, B=0) = P(J | H, B=1)  ∀ J, H!
⇒ P(J | H, B) = P(J | H)
J is INDEPENDENT of B, once we know H; don't need the B → J arc!

P(H=1) = 0.05
h | P(B=1 | H=h)
1 |    0.95
0 |    0.03
h b | P(J=1 | h, b)
1 1 |    0.8
1 0 |    0.8
0 1 |    0.3
0 0 |    0.3
16
Encoding Causal Links

H  B  J

P(J | H, B=0) = P(J | H, B=1)  ∀ J, H!
⇒ P(J | H, B) = P(J | H)
J is INDEPENDENT of B, once we know H; don't need the B → J arc!

P(H=1) = 0.05
h | P(B=1 | H=h)
1 |    0.95
0 |    0.03
h | P(J=1 | h)
1 |    0.8
0 |    0.3
17
Encoding Causal Links

H  B  J

P(J | H, B=0) = P(J | H, B=1)  ∀ J, H!
⇒ P(J | H, B) = P(J | H)
J is INDEPENDENT of B, once we know H; don't need the B → J arc!

P(H=1) = 0.05
h | P(B=1 | H=h)
1 |    0.95
0 |    0.03
h | P(J=1 | h)
1 |    0.8
0 |    0.3
18
Sufficient Belief Net
H B J
P(H=1) = 0.05
h | P(B=1 | H=h)
1 |    0.95
0 |    0.03
h | P(J=1 | h)
1 |    0.8
0 |    0.3

Requires: P(H=1), P(B=1 | H=h), and P(J=1 | H=h) known
(Only 5 parameters, not 7)

Hence:
P(H=1 | B=1, J=0) = α · P(H=1) · P(B=1 | H=1) · P(J=0 | B=1, H=1)
                  = α · P(H=1) · P(B=1 | H=1) · P(J=0 | H=1)
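A minimal sketch of this posterior computation, using the slide's five parameters (hand-rolled Python rather than any BN library):

```python
# The net's 5 parameters: H -> B and H -> J.
p_H = 0.05                   # P(H=1)
p_B = {1: 0.95, 0: 0.03}     # P(B=1 | H=h)
p_J = {1: 0.80, 0: 0.30}     # P(J=1 | H=h)

def posterior_H(b, j):
    """P(H=1 | B=b, J=j), via the factorization P(h) P(b|h) P(j|h)."""
    def score(h):
        prior = p_H if h == 1 else 1 - p_H
        pb = p_B[h] if b == 1 else 1 - p_B[h]
        pj = p_J[h] if j == 1 else 1 - p_J[h]
        return prior * pb * pj
    return score(1) / (score(1) + score(0))   # alpha normalizes over h

print(round(posterior_H(b=1, j=0), 3))  # 0.323
```

The same five numbers answer any query over H, B, J; that is the payoff of factoring.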
19
“Factoring”
B does depend on J: if J=1, then it is likely that H=1, hence B=1
but… ONLY THROUGH H: if we know H=1, then B=1 is likely
… regardless of whether J=1 or J=0!
⇒
P(J=0 | B=1, H=1) = P(J=0 | H=1)
H B J
N.b., B and J ARE correlated a priori: P(J | B) ≠ P(J)
GIVEN H, they become uncorrelated: P(J | B, H) = P(J | H)
20
Factored Distribution
Symptoms independent, given Disease
H = Hepatitis, J = Jaundice, B = (positive) Blood test:
P( B | J ) ≠ P( B )  but  P( B | J, H ) = P( B | H )
ReadingAbility and ShoeSize are dependent,
P(ReadAbility | ShoeSize ) ≠ P(ReadAbility )
but become independent, given Age
P(ReadAbility | ShoeSize, Age ) = P(ReadAbility | Age)
Age ShoeSize Reading
21
“Naïve Bayes”
Classification Task:
Given { O1 = v1, …, On = vn }
Find hi that maximizes P( H = hi | O1 = v1, …, On = vn )
Given:
  P( H = hi )
  P( Oj = vj | H = hi )
  Independence: P( Oj | H, Ok, … ) = P( Oj | H )

Structure: H → O1, H → O2, …, H → On

P( H = hi | O1 = v1, …, On = vn ) = α · P( H = hi ) · Π_j P( Oj = vj | H = hi )

Find argmax over {hi}.
22
Naïve Bayes (con’t)
P( H = hi | O1 = v1, …, On = vn ) = α · P( H = hi ) · Π_j P( Oj = vj | H = hi )

Structure: H → O1, O2, …, On

Normalizing term:
1/α = P( O1 = v1, …, On = vn ) = Σ_i P( H = hi ) · Π_j P( Oj = vj | H = hi )
(No need to compute it, as it is the same for all hi)
Easy to use for Classification
Can use even if some vj's are not specified
If k Dx's and n Oi's, requires only k priors and n·k pairwise conditionals
(not ~2^(n+k); relatively easy to learn)

n  | 1 + 2n | 2^(n+1) − 1
10 |   21   | 2,047
30 |   61   | 2,147,483,647
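The Naïve Bayes formula above, as a short Python sketch (the disease/observation names and numbers are invented for illustration):

```python
priors = {"flu": 0.1, "cold": 0.9}          # P(H = h)
cond = {                                     # P(O_j = 1 | H = h)
    "flu":  {"fever": 0.9, "cough": 0.8},
    "cold": {"fever": 0.2, "cough": 0.7},
}

def classify(obs):
    """obs maps observation name -> 0/1; unspecified obs are simply omitted."""
    scores = {}
    for h, prior in priors.items():
        s = prior
        for name, v in obs.items():
            p1 = cond[h][name]
            s *= p1 if v == 1 else 1 - p1    # P(H=h) * prod_j P(O_j=v_j | H=h)
        scores[h] = s
    z = sum(scores.values())                 # the normalizer 1/alpha
    return {h: s / z for h, s in scores.items()}

print(classify({"fever": 1, "cough": 1}))    # "cold" wins, ≈ 0.636
```

Leaving an observation out of `obs` is exactly the "can use even if some vj's not specified" point: its factor is simply not multiplied in.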
23
Bigger Networks
- Intuition: Show CAUSAL connections:
GeneticPH CAUSES Hepatitis; Hepatitis CAUSES Jaundice
But only via Hepatitis: GeneticPH without Hepatitis does not lead to Jaundice
P( J | G ) ≠ P( J ) but P( J | G,H ) = P( J | H)
h | P(J=1 | h)      h | P(B=1 | h)
1 |    0.8          1 |    0.98
0 |    0.3          0 |    0.01

g  lt | P(H=1 | g, lt)
1  1  |    0.82
1  0  |    0.10
0  1  |    0.45
0  0  |    0.04

Nodes: GeneticPH, LiverTrauma → Hepatitis → Jaundice, Bloodtest

P(I=1) = 0.20   P(H=1) = 0.32
If GeneticPH, then expect Jaundice:
GeneticPH ⇒ Hepatitis ⇒ Jaundice
24
Belief Nets
DAG structure
Each node ≡ a Variable v
v depends (only) on its parents
  + conditional prob: P(vi | parenti = 〈0,1,…〉)
v is INDEPENDENT of non-descendants, given assignments to its parents
Given H = 1,
- D has no influence on J
- J has no influence on B
- etc.
D I H J B
25
Less Trivial Situations
- N.b., obs1 is not always independent of obs2 given H
- Eg, FamilyHistoryDepression ‘causes’ MotherSuicide and Depression
MotherSuicide causes Depression (w/ or w/o F.H.Depression)
- Here, P( D | MS, FHD ) ≠ P( D | FHD ) !
FHD MS D
P(FHD=1) = 0.10

f | P(MS=1 | FHD=f)
1 |    0.03
0 |    0.001

f  m | P(D=1 | FHD=f, MS=m)
1  1 |    0.97
0  1 |    0.90
1  0 |    0.08
0  0 |    0.04
Can be done using Belief Network,
but need to specify:
P( FHD ): 1 parameter;  P( MS | FHD ): 2;  P( D | MS, FHD ): 4
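To see that D is NOT independent of MS given FHD here, one can marginalize the net directly. This Python sketch uses one plausible reading of the slide's flattened tables (the exact row assignments are an assumption):

```python
# CPTs (one reading of the slide's flattened tables).
p_MS = {1: 0.03, 0: 0.001}                  # P(MS=1 | FHD=f)
p_D  = {(1, 1): 0.97, (0, 1): 0.90,         # P(D=1 | FHD=f, MS=m)
        (1, 0): 0.08, (0, 0): 0.04}

# P(D=1 | FHD=1): marginalize MS out of P(D, MS | FHD=1).
p_d_given_f = sum(
    (p_MS[1] if m else 1 - p_MS[1]) * p_D[(1, m)] for m in (0, 1)
)

print(round(p_d_given_f, 4))   # 0.1067
print(p_D[(1, 1)])             # 0.97: P(D=1 | FHD=1, MS=1) differs, as claimed
```

Conditioning on MS=1 moves the answer from roughly 0.11 to 0.97, so P(D | MS, FHD) ≠ P(D | FHD).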
26
Example: Car Diagnosis
27
MammoNet
28
ALARM
A Logical Alarm Reduction Mechanism
- 8 diagnoses, 16 findings, …
29
Troop Detection
30
ARCO1: Forecasting Oil Prices
31
ARCO1: Forecasting Oil Prices
32
Forecasting Potato Production
33
Warning System
34
Extensions
- Find best values (posterior distr.) for
SEVERAL (> 1) “output” variables
Partial specification of "input" values
- only a subset of the variables
- only a "distribution" for each input variable
General Variables
Discrete, but domain size > 2
Continuous (Gaussian: x = Σ_i b_i y_i for parents {Yi})
Decision Theory ⇒ Decision Nets (Influence Diagrams)
Making Decisions, not just assigning prob’s
Storing P( v | p1, p2,…,pk)
General “CP Tables”: O(2^k)
Noisy-Or, Noisy-And, Noisy-Max
“Decision Trees”
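The Noisy-Or option mentioned above stores only one parameter per parent instead of a full 2^k table. A sketch of the standard noisy-OR rule (not code from the talk):

```python
def noisy_or(cause_probs, causes):
    """P(effect=1): each present cause i independently 'fires' with prob p_i;
    the effect occurs unless every present cause fails to fire."""
    fail = 1.0
    for p, present in zip(cause_probs, causes):
        if present:
            fail *= 1.0 - p
    return 1.0 - fail

# k parameters stand in for the whole 2^k-row CP table:
print(noisy_or([0.9, 0.7], [1, 1]))   # 1 - (0.1 * 0.3) = 0.97
print(noisy_or([0.9, 0.7], [0, 0]))   # 0.0: no active cause, no effect
```

With k parents this needs k numbers rather than 2^k rows, which is the point of the slide's list.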
35
Outline
Existing uses of Belief Nets (BNs)
- What is a BN ?
- Specific Examples of BNs
Contrast with Rules, Neural Nets, …
- Possible applications of BNs
- Challenges
How to reason efficiently
How to learn BNs
36
Belief Nets vs Rules
Both have “Locality”
Specific clusters (rules / connected nodes)
WHY?: Easier for people to reason CAUSALLY even if use is DIAGNOSTIC
BN provide OPTIMAL way to deal with
+ Uncertainty + Vagueness (var not given, or only dist) + Error …Signals meeting Symbols …
BN permits different “direction”s of inference
Often same nodes (representing Propositions), but:
  BN: Cause ⇒ Effect  ("Hep ⇒ Jaundice",  P(J | H))
  Rule: Effect ⇒ Cause  ("Jaundice ⇒ Hep")
37
Belief Nets vs Neural Nets
Both have “graph structure” but
BN: Nodes have SEMANTICS; combination rules: sound probability
NN: Nodes arbitrary; combination rules: arbitrary
So harder to
Initialize NN Explain NN
(But perhaps easier to learn NN from examples only?)
BNs can deal with
Partial Information Different “direction”s of inference
38
Belief Nets vs Markov Nets
Each uses “graph structure”
to FACTOR a distribution … explicitly specify dependencies, implicitly independencies…
but subtle differences…
BNs capture "causality", "hierarchies"
MNs capture "temporality"
Technical: BNs use DIRECTED arcs (e.g., A → C ← B)
⇒ allow "induced dependencies":
  I(A, {}, B):  "A independent of B, given {}"
  ¬I(A, C, B):  "A dependent on B, given C"
MNs use UNDIRECTED arcs (e.g., the ring A–B–D–C–A)
⇒ allow other independencies:
  I(A, {B,C}, D):  A independent of D, given B, C
  I(B, {A,D}, C):  B independent of C, given A, D
39
Uses of Belief Nets # 1
Medical Diagnosis: “Assist/Critique” MD
identify diseases not yet ruled out
specify additional tests to perform
suggest appropriate/cost-effective treatments
react to MD's proposed treatment
Decision Support: Find/repair faults in complex machines
[Device, or Manufacturing Plant, or …] … based on sensors, recorded info, history,…
Preventative Maintenance: Anticipate problems in complex machines
[Device, or Manufacturing Plant, or …] …based on sensors, statistics, recorded info, device history,…
40
Uses (con’t)
Logistics Support: Stock warehouses appropriately
…based on (estimated) freq. of needs, costs,
Diagnose Software:
Find most probable bugs, given program behavior, core dump, source code, …
Part Inspection/Classification:
… based on multiple sensors, background, model of production,…
Information Retrieval:
Combine information from various sources, based on info from various “agents”,…
General: Partial Info, Sensor fusion
- Classification
- Interpretation
- Prediction
- …
41
Challenge # 1 Computational Efficiency
For a given BN, the general problem is:
  Given O1 = v1, …, On = vn,
  Compute P(H | O1 = v1, …, On = vn)

+ If BN is a "polytree", ∃ efficient algorithm
- If BN is a general DAG (>1 path from X to Y):
  NP-hard in theory, slow in practice

Tricks: Get approximate answer (quickly)
+ Use abstraction of BN
+ Use "abstraction" of query (range)
42
Why Reasoning is Hard
BN reasoning may look easy:
Just "propagate" information from node to node?

Net: Z → A, Z → B, {A, B} → C

P(Z=t) = 0.5

z | P(A=t | Z=z)      z | P(B=t | Z=z)
t |     1.0           t |     0.0
f |     0.0           f |     1.0

a b | P(C=t | a, b)
t t |      1.0
t f |      0.0
f t |      0.0
f f |      0.0

Challenge: What is P(C=t)?
A = Z = ¬B.  P(A=t) = P(B=f) = ½
So… P(C=t) = P(A=t, B=t) = P(A=t) × P(B=t) = ½ × ½ = ¼?
Wrong: P(C=t) = 0!
Need to maintain dependencies!
P(A=t, B=t) = P(A=t) · P(B=t | A=t)
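Enumerating Z makes the point concrete; a small Python check, with the deterministic CPTs from the tables above:

```python
# Z -> A, Z -> B, {A, B} -> C, with A = Z, B = not Z, C = A and B.
p_Z = 0.5

p_C = 0.0
for z, w in ((1, p_Z), (0, 1.0 - p_Z)):   # enumerate the root's two cases
    a = z          # P(A=t | Z=t) = 1.0, P(A=t | Z=f) = 0.0
    b = 1 - z      # P(B=t | Z=t) = 0.0, P(B=t | Z=f) = 1.0
    c = a * b      # C = A AND B (deterministic CPT)
    p_C += w * c

print(p_C)   # 0.0 -- not P(A=t) * P(B=t) = 0.25
```

Naive node-to-node propagation would multiply the two marginals and get ¼; summing over the shared ancestor Z keeps the dependency and gives 0.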
43
# 2a: Obtaining an Accurate BN

BN encodes a distribution over n variables:
not O(2^n) values, but "only" Σ_i 2^(k_i)  (node n_i binary, with k_i parents)
Still lots of values!
⇒ Qualitative Information Structure: “What depends on what?”
- Easy for people (background knowledge)
- But NP-hard to learn from samples…
⇒ Quantitative Information Actual CP-tables
- Easy to learn, given lots of examples.
- But people have hard time…
⇒ Knowledge acquisition from human experts + a simple learning algorithm
44
Notes on Learning
Mixed Sources: Person provides structure;
Algorithm fills-in numbers.
Just Learning Algorithm: ∃ algorithms that learn both structure and values from samples
Just Human Expert: People produce CP-table, as well as structure
Relatively few values really required
- Esp. if NoisyOr, NoisyAnd, NaiveBayes, …
Actual values not that important …Sensitivity studies
45
# 2b: Maintaining Accurate BN
The world changes.
Information in a BN may be perfect at time t, sub-optimal at time t + 20, worthless at time t + 200
Need to MAINTAIN a BN over time
using on-going human consultant
Adaptive BN
- Dirichlet distribution (variables)
- Priors over BNs
46
My Results Related to Belief Nets
Quantifying Uncertainty in BN Response
Pr_Θ( C=true | D=false ) = 0.3 ± 0.05
Uses: Good Decision, Bad Outcome; Bias² + Variance; Mixture using Variance
Learning Structure – Generatively
BDe, 2-foldCV work well (not MDL)
Learning Structure – Discriminatively
Bias² + Variance works well (not MDL)
Learning Parameters – Discriminately
NaïveBayes : Logistic Regression :: Belief Nets : ELR
47
Conclusions
Belief Nets are PROVEN TECHNOLOGY
Medical Diagnosis
DSS for complex machines
Forecasting, Modeling, InfoRetrieval, …
Provide effective way to
Represent complicated, inter-related events Reason about such situations
- Diagnosis, Explanation, ValueOfInfo
- Explain conclusions
- Mix Symbolic and Numeric observations
Challenges
Efficient ways to use BNs
How to create accurate/effective BNs
How to maintain BNs
Reasoning about time…