Some applications of Bayesian networks
Jiří Vomlel
Institute of Information Theory and Automation, Academy of Sciences of the Czech Republic
This presentation is available at http://www.utia.cas.cz/vomlel/

Contents
• Brief introduction to Bayesian networks
• Typical tasks that can be solved using Bayesian networks
• 1: Medical diagnosis (a very simple example)
• 2: Decision making maximizing expected utility (another simple example)
• 3: Adaptive testing (a case study)
• 4: Decision-theoretic troubleshooting (a commercial product)

Bayesian network
• a directed acyclic graph (DAG) $G = (V, E)$
• each node $i \in V$ corresponds to a random variable $X_i$ with a finite set $\mathcal{X}_i$ of mutually exclusive states
• $pa(i)$ denotes the set of parents of node $i$ in graph $G$
• to each node $i \in V$ corresponds a conditional probability table $P(X_i \mid (X_j)_{j \in pa(i)})$
• the DAG implies conditional independence (CI) relations between $(X_i)_{i \in V}$
• d-separation (Pearl, 1986) can be used to read the CI relations from the DAG

Using the chain rule we have that:
$$P((X_i)_{i \in V}) = \prod_{i \in V} P(X_i \mid X_{i-1}, \ldots, X_1)$$
Assume an ordering of $X_i$, $i \in V$, such that if $j \in pa(i)$ then $j < i$. From the DAG we can read conditional independence relations
$$X_i \perp\!\!\!\perp X_k \mid (X_j)_{j \in pa(i)} \quad \text{for } i \in V,\ k < i \text{ and } k \notin pa(i).$$
Using the conditional independence relations from the DAG we get
$$P((X_i)_{i \in V}) = \prod_{i \in V} P(X_i \mid (X_j)_{j \in pa(i)}).$$
It is the joint probability distribution represented by the Bayesian network.
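The factorization over parent sets can be illustrated with a minimal Python sketch. The three-node network (S, B, D), its structure and all probabilities are made up for illustration and do not come from the presentation.

```python
from itertools import product

# Hypothetical binary network S -> B -> D (not from the slides).
# parents[i] plays the role of pa(i).
parents = {"S": (), "B": ("S",), "D": ("B",)}

# cpt[node][parent values] = P(node = True | parents); numbers are invented.
cpt = {
    "S": {(): 0.5},
    "B": {(True,): 0.6, (False,): 0.3},
    "D": {(True,): 0.8, (False,): 0.1},
}

def joint(assignment):
    """Joint probability as the product of P(X_i | parents of X_i)."""
    p = 1.0
    for node, pa in parents.items():
        p_true = cpt[node][tuple(assignment[j] for j in pa)]
        p *= p_true if assignment[node] else 1.0 - p_true
    return p

nodes = list(parents)
# Summing the joint over all configurations should give 1.
total = sum(joint(dict(zip(nodes, values)))
            for values in product([True, False], repeat=len(nodes)))
```

Storing one table per node instead of one table over all configurations is what makes the representation compact: here three small tables replace a table with eight entries.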

Typical use of Bayesian networks
• to model and explain a domain
• to update beliefs about states of certain variables when some other variables were observed, i.e., computing conditional probability distributions, e.g., $P(X_{23} \mid X_{17} = \text{yes}, X_{54} = \text{no})$
• to find most probable configurations of variables
• to support decision making under uncertainty
• to find good strategies for solving tasks in a domain with uncertainty
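Belief updating, the second item in the list, can be sketched by brute-force enumeration of the joint distribution. This is only an illustration under assumed numbers: the network S -> B -> D and its probabilities are hypothetical, and tools such as Hugin use far more efficient inference than enumeration.

```python
from itertools import product

# Hypothetical binary network S -> B -> D; all numbers are invented.
P_S = {True: 0.5, False: 0.5}
P_B_GIVEN_S = {True: 0.6, False: 0.3}   # P(B = True | S)
P_D_GIVEN_B = {True: 0.8, False: 0.1}   # P(D = True | B)

def joint(s, b, d):
    """P(S = s, B = b, D = d) from the factorized form."""
    pb = P_B_GIVEN_S[s] if b else 1.0 - P_B_GIVEN_S[s]
    pd = P_D_GIVEN_B[b] if d else 1.0 - P_D_GIVEN_B[b]
    return P_S[s] * pb * pd

def posterior(query, evidence):
    """P(query = True | evidence), summing the joint over all worlds."""
    names = ("s", "b", "d")
    weight = {True: 0.0, False: 0.0}
    for values in product([True, False], repeat=len(names)):
        world = dict(zip(names, values))
        if all(world[k] == v for k, v in evidence.items()):
            weight[world[query]] += joint(**world)
    return weight[True] / (weight[True] + weight[False])

p_b_prior = posterior("b", {})               # belief before any observation
p_b_given_d = posterior("b", {"d": True})    # belief after observing D = True
```

Observing D = True raises the belief in B in this toy model, which mirrors how each new finding in the diagnostic example shifts the probabilities of the diagnoses.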

Simplified diagnostic example
We have a patient. Possible diagnoses: tuberculosis, lung cancer, bronchitis.

We don’t know anything about the patient.
Patient is a smoker.

Patient is a smoker.
... and he complains about dyspnoea.

Patient is a smoker and complains about dyspnoea.
... and his X-ray is positive.

Patient is a smoker and complains about dyspnoea and his X-ray is positive.
... and he visited Asia recently.

Application 2: Decision making
The goal: maximize expected utility
Hugin example: mildew4.net

Fixed and Adaptive Test Strategies
[Figure: test strategies over questions Q1 to Q10, drawn as a sequence and as a tree whose branches are labelled "wrong" and "correct".]

For all nodes $n$ of a strategy $s$ we have defined:
• evidence $e_n$, i.e. outcomes of steps performed to get to node $n$,
• probability $P(e_n)$ of getting to node $n$, and
• utility $f(e_n)$ being a real number.
Let $\mathcal{L}(s)$ be the set of terminal nodes of strategy $s$.
Expected utility of strategy is
$$E_f(s) = \sum_{\ell \in \mathcal{L}(s)} P(e_\ell) \cdot f(e_\ell).$$
[Figure: strategy tree over steps X1, X2, X3.]
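The expected utility of a strategy can be computed by walking the strategy tree and accumulating the probability of reaching each terminal node. The sketch below assumes a hypothetical `Node` class and invented branch probabilities and utilities; none of these appear in the slides.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """Hypothetical strategy node; utility is f(e_n), read at terminal nodes only."""
    utility: float = 0.0
    # Each child is a pair (probability of this outcome given the node, child node).
    children: list = field(default_factory=list)

def expected_utility(node, reach_prob=1.0):
    """Sum of P(e_l) * f(e_l) over the terminal nodes l below `node`."""
    if not node.children:
        return reach_prob * node.utility
    return sum(expected_utility(child, reach_prob * p)
               for p, child in node.children)

# Invented example: the root step has two outcomes, one of which leads to a second step.
strategy = Node(children=[
    (0.4, Node(children=[(0.5, Node(utility=1.0)),
                         (0.5, Node(utility=0.6))])),
    (0.6, Node(utility=0.2)),
])
value = expected_utility(strategy)
```

Here $P(e_\ell)$ is obtained as the product of the outcome probabilities along the path to the terminal node, which is one simple way to represent it.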

Strategy $s^\star$ is optimal iff it maximizes its expected utility.
Strategy $s$ is myopically optimal iff each step of strategy $s$ is selected so that it maximizes expected utility after the selected step is performed (one step look ahead).
[Figure: strategy tree over steps X1, X2, X3.]

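Application 3: Adaptive test of basic operations with fractions
Examples of tasks:
$T_1$: $\left(\frac{3}{4} \cdot \frac{5}{6}\right) - \frac{1}{8} = \frac{15}{24} - \frac{1}{8} = \frac{5}{8} - \frac{1}{8} = \frac{4}{8} = \frac{1}{2}$
$T_2$: $\frac{1}{6} + \frac{1}{12} = \frac{2}{12} + \frac{1}{12} = \frac{3}{12} = \frac{1}{4}$
$T_3$: $\frac{1}{4} \cdot 1\frac{1}{2} = \frac{1}{4} \cdot \frac{3}{2} = \frac{3}{8}$
$T_4$: $\left(\frac{1}{2} \cdot \frac{1}{2}\right) \cdot \left(\frac{1}{3} + \frac{1}{3}\right) = \frac{1}{4} \cdot \frac{2}{3} = \frac{2}{12} = \frac{1}{6}$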

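Elementary and operational skills
Label   Skill                                            Example
CP      Comparison (common numerator or denominator)     $\frac{1}{2} > \frac{1}{3}$, $\frac{2}{3} > \frac{1}{3}$
AD      Addition (comm. denom.)                          $\frac{1}{7} + \frac{2}{7} = \frac{1+2}{7} = \frac{3}{7}$
SB      Subtract. (comm. denom.)                         $\frac{2}{5} - \frac{1}{5} = \frac{2-1}{5} = \frac{1}{5}$
MT      Multiplication                                   $\frac{1}{2} \cdot \frac{3}{5} = \frac{3}{10}$
CD      Common denominator                               $\left(\frac{1}{2}, \frac{2}{3}\right) = \left(\frac{3}{6}, \frac{4}{6}\right)$
CL      Cancelling out                                   $\frac{4}{6} = \frac{2 \cdot 2}{2 \cdot 3} = \frac{2}{3}$
CIM     Conv. to mixed numbers                           $\frac{7}{2} = \frac{3 \cdot 2 + 1}{2} = 3\frac{1}{2}$
CMI     Conv. to improp. fractions                       $3\frac{1}{2} = \frac{3 \cdot 2 + 1}{2} = \frac{7}{2}$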

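Misconceptions
Label   Description                                              Occurrence
MAD     $\frac{a}{b} + \frac{c}{d} = \frac{a+c}{b+d}$             14.8%
MSB     $\frac{a}{b} - \frac{c}{d} = \frac{a-c}{b-d}$             9.4%
MMT1    $\frac{a}{b} \cdot \frac{c}{b} = \frac{a \cdot c}{b}$     14.1%
MMT2    $\frac{a}{b} \cdot \frac{c}{b} = \frac{a+c}{b \cdot b}$   8.1%
MMT3    $\frac{a}{b} \cdot \frac{c}{d} = \frac{a \cdot d}{b \cdot c}$   15.4%
MMT4    $\frac{a}{b} \cdot \frac{c}{d} = \frac{a \cdot c}{b+d}$   8.1%
MC      $a\frac{b}{c} = \frac{a \cdot b}{c}$                      4.0%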

Student model
[Figure: Bayesian network of the student model with nodes HV1, HV2, ACMI, ACIM, ACL, ACD, AD, SB, CMI, CIM, CL, CD, MT, CP, MAD, MSB, MC, MMT1, MMT2, MMT3, MMT4.]

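Evidence model for task $T_1$
$$\left(\frac{3}{4} \cdot \frac{5}{6}\right) - \frac{1}{8} = \frac{15}{24} - \frac{1}{8} = \frac{5}{8} - \frac{1}{8} = \frac{4}{8} = \frac{1}{2}$$
$$\Leftrightarrow\ MT \,\&\, CL \,\&\, ACL \,\&\, SB \,\&\, \neg MMT3 \,\&\, \neg MMT4 \,\&\, \neg MSB$$
[Figure: nodes ACL, CL, MT, SB, MMT3, MMT4 and MSB are parents of T1; T1 is the parent of X1 with the table $P(X1 \mid T1)$.]
Hugin: model-hv-2.net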
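The evidence model for $T_1$ combines a deterministic conjunction with a probabilistic observation. A minimal sketch, assuming invented values for $P(X1 \mid T1)$ (the slide does not give the table) and hypothetical function names:

```python
# Skills required and misconceptions excluded for T1, as listed on the slide.
REQUIRED = ("MT", "CL", "ACL", "SB")
FORBIDDEN = ("MMT3", "MMT4", "MSB")

def t1(state):
    """Deterministic node T1: every required skill present, no listed misconception."""
    return (all(state[s] for s in REQUIRED)
            and not any(state[m] for m in FORBIDDEN))

# Assumed P(X1 = correct | T1); the values are illustrative only and
# leave room for slips (T1 true) and lucky guesses (T1 false).
P_CORRECT_GIVEN_T1 = {True: 0.9, False: 0.05}

def p_correct(state):
    """Probability of observing a correct answer X1 for a fully known student state."""
    return P_CORRECT_GIVEN_T1[t1(state)]

able = {"MT": True, "CL": True, "ACL": True, "SB": True,
        "MMT3": False, "MMT4": False, "MSB": False}
confused = dict(able, MMT3=True)   # same skills, but with misconception MMT3
```

Separating T1 from X1 keeps the logical condition exact while still allowing the observed answer to be noisy.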

Using information gain as the utility function
“The lower the entropy of a probability distribution the more we know.”
$$H(P(X)) = -\sum_{x} P(X = x) \cdot \log P(X = x)$$
[Figure: entropy (y-axis, 0 to 1) plotted against probability (x-axis, 0 to 1).]
Information gain in a node $n$ of a strategy:
$$IG(e_n) = H(P(S)) - H(P(S \mid e_n))$$
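Entropy and information gain are straightforward to compute, and they can drive a one-step look-ahead choice of the next question. The sketch below is an illustration under assumptions: it uses base-2 logarithms (the slide writes only "log"), a single binary skill variable S, and two invented questions "A" and "B" with made-up answer probabilities and posteriors.

```python
import math

def entropy(dist):
    """H(P) = -sum p log2 p, with the convention 0 * log 0 = 0."""
    return -sum(p * math.log2(p) for p in dist if p > 0)

def information_gain(prior, posterior):
    """IG(e_n) = H(P(S)) - H(P(S | e_n))."""
    return entropy(prior) - entropy(posterior)

def expected_information_gain(prior, outcomes):
    """Average IG over the outcomes of one step.

    outcomes: list of (P(outcome), P(S | outcome)) pairs.
    """
    return sum(p * information_gain(prior, post) for p, post in outcomes)

# Invented example: prior over S and two candidate questions.
prior = [0.5, 0.5]
questions = {
    "A": [(0.5, [0.9, 0.1]), (0.5, [0.1, 0.9])],   # informative question
    "B": [(0.5, [0.6, 0.4]), (0.5, [0.4, 0.6])],   # weakly informative question
}
# Myopic selection: pick the question with the largest expected gain.
best = max(questions,
           key=lambda q: expected_information_gain(prior, questions[q]))
```

In this toy setting question "A" is selected because its answers leave much less uncertainty about S than those of "B".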

Skill Prediction Quality
[Figure: quality of skill predictions (y-axis, ticks from 74 to 92) against the number of answered questions (x-axis, 0 to 20), with curves labelled adaptive, average, descending and ascending.]

Application 4: Troubleshooting
Dezide Advisor customized to a specific portal, seen from the user’s perspective through a web browser.

Application 4: Troubleshooting - Light print problem
[Figure: network with the problem node F, fault nodes F1 to F4, action nodes A1 to A3 and question node Q1.]
• Faults: F1 Distribution problem, F2 Defective toner, F3 Corrupted dataflow, and F4 Wrong driver setting.
• Actions: A1 Remove, shake and reseat toner, A2 Try another toner, and A3 Cycle power.
• Questions: Q1 Is the configuration page printed light?

Troubleshooting strategy
[Figure: strategy tree rooted at question Q1. After Q1 = no, action A1 is tried and, if A1 = no, action A2. After Q1 = yes, action A2 is tried and, if A2 = no, action A1.]
The task is to find a strategy $s \in S$ minimising expected cost of repair
$$E_{CR}(s) = \sum_{\ell \in \mathcal{L}(s)} P(e_\ell) \cdot \left( t(e_\ell) + c(e_\ell) \right).$$

Expected cost of repair for a given strategy
[Figure: the strategy tree rooted at Q1, with branches Q1 = no followed by A1 then A2, and Q1 = yes followed by A2 then A1.]
$$\begin{aligned}
E_{CR}(s) ={} & P(Q_1 = \text{no}, A_1 = \text{yes}) \cdot \left( c_{Q_1} + c_{A_1} \right) \\
& + P(Q_1 = \text{no}, A_1 = \text{no}, A_2 = \text{yes}) \cdot \left( c_{Q_1} + c_{A_1} + c_{A_2} \right) \\
& + P(Q_1 = \text{no}, A_1 = \text{no}, A_2 = \text{no}) \cdot \left( c_{Q_1} + c_{A_1} + c_{A_2} + c_{CS} \right) \\
& + P(Q_1 = \text{yes}, A_2 = \text{yes}) \cdot \left( c_{Q_1} + c_{A_2} \right) \\
& + P(Q_1 = \text{yes}, A_2 = \text{no}, A_1 = \text{yes}) \cdot \left( c_{Q_1} + c_{A_2} + c_{A_1} \right) \\
& + P(Q_1 = \text{yes}, A_2 = \text{no}, A_1 = \text{no}) \cdot \left( c_{Q_1} + c_{A_2} + c_{A_1} + c_{CS} \right)
\end{aligned}$$
Demo: www.dezide.com Products/Demo/“Try out expert mode”
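The six-term sum can be evaluated mechanically once each terminal node is described by its probability and the steps on its path. In the sketch below all costs and probabilities are invented, and $c_{CS}$ is treated as an extra cost charged at the leaves where neither action helped, which is how it appears in the expansion; the slide itself does not define it.

```python
# Hypothetical step costs; "CS" stands for the extra term c_CS in the expansion.
cost = {"Q1": 1.0, "A1": 2.0, "A2": 5.0, "CS": 50.0}

# One entry per terminal node: (P(e_l), steps whose costs are paid on that path).
# The probabilities are made up and sum to 1.
leaves = [
    (0.30, ["Q1", "A1"]),               # Q1 = no,  A1 = yes
    (0.15, ["Q1", "A1", "A2"]),         # Q1 = no,  A1 = no,  A2 = yes
    (0.05, ["Q1", "A1", "A2", "CS"]),   # Q1 = no,  A1 = no,  A2 = no
    (0.25, ["Q1", "A2"]),               # Q1 = yes, A2 = yes
    (0.20, ["Q1", "A2", "A1"]),         # Q1 = yes, A2 = no,  A1 = yes
    (0.05, ["Q1", "A2", "A1", "CS"]),   # Q1 = yes, A2 = no,  A1 = no
]

def expected_cost_of_repair(leaves, cost):
    """Sum over terminal nodes of P(e_l) times the total cost paid on the path."""
    return sum(p * sum(cost[step] for step in steps) for p, steps in leaves)

ecr = expected_cost_of_repair(leaves, cost)
```

Comparing this value across candidate strategies (for example, with the order of A1 and A2 swapped) is the basic operation behind searching for a strategy with minimal expected cost of repair.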

Commercial applications of Bayesian networks in educational testing and troubleshooting
• Hugin Expert A/S. Software product: Hugin, a Bayesian network tool. http://www.hugin.com/
• Educational Testing Service (ETS), the world’s largest private educational testing organization. Research unit doing research on adaptive tests using Bayesian networks: http://www.ets.org/research/
• SACSO Project (Systems for Automatic Customer Support Operations), a research project of Hewlett Packard and Aalborg University. The troubleshooter is offered as DezisionWorks by Dezide Ltd. http://www.dezide.com/
