An Introduction to Bayesian Network Inference using Variable Elimination - PowerPoint PPT Presentation



SLIDE 1

An Introduction to Bayesian Network Inference using Variable Elimination

Jhonatan Oliveira, Department of Computer Science, University of Regina

SLIDE 2

Outline

  • Introduction
  • Background
  • Bayesian networks
  • Variable Elimination
  • Repeated Computation
  • Conclusions


SLIDE 3

Introduction

Bayesian networks are probabilistic graphical models used when reasoning under uncertainty.

SLIDE 4

Uncertainty

  • Conflicting information
  • Missing information

[Diagram: family out and bowel problem each point to dog out]


SLIDE 7

Real World Applications

SLIDE 8

Real World Applications

TrueSkill™

SLIDE 9

Real World Applications

Turbo Codes

SLIDE 10

Real World Applications

Mars Exploration Rover

SLIDE 11

Background

Probability theory: introducing joint probability distribution, chain rule, and conditional independence

SLIDE 12

Joint Probability Distribution

  • A multivariate function over a finite set of variables
  • Assigns a real number between 0 and 1 to each configuration (combination of the variables' values)
  • Summing all assigned real numbers yields 1
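These three properties can be checked mechanically. The sketch below (illustrative: the five variable names match the deck, but the probabilities are random numbers, not from the slides) builds a normalized joint table over five binary variables:

```python
from itertools import product
import random

# A joint probability distribution over 5 binary variables (L, F, D, B, H):
# one entry per configuration of values.
configs = list(product([0, 1], repeat=5))   # 2**5 = 32 configurations

# Illustrative random table, normalized so all entries sum to 1.
random.seed(0)
weights = [random.random() for _ in configs]
total = sum(weights)
joint = {c: w / total for c, w in zip(configs, weights)}

# Each entry is a real number between 0 and 1, and the entries sum to 1.
assert all(0.0 <= p <= 1.0 for p in joint.values())
assert abs(sum(joint.values()) - 1.0) < 1e-9
```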
SLIDE 13

Joint Probability Distribution

[Table: P(L,F,D,B,H), one probability per configuration of Lights On, Family Out, Dog Out, Bowel Problem, Hear Bark (e.g. 0.01, 0.25, 0.08, 0.19, ...)]

SLIDE 14

Joint Probability Distribution

[Table: the same P(L,F,D,B,H) table, with a 1st Query and a 2nd Query each answered by selecting and summing rows]

SLIDE 15

Joint Probability Distribution

The size issue: 2^5 = 32 probabilities for five binary variables

SLIDE 16

Chain Rule

Conditional Probability Tables

P(L,F,D,B,H) = P(L) P(F|L) P(D|L,F) P(B|L,F,D) P(H|L,F,D,B)

SLIDE 17

Chain Rule

The size issue: 62 probabilities (2 + 4 + 8 + 16 + 32 entries across the five conditional probability tables)

SLIDE 18

Conditional Independence

family out → dog out → hear bark

Given: dog out

SLIDE 19

Conditional Independence

family out → dog out → hear bark

Given: dog out

Independence I(family out, dog out, hear bark): family out is independent of hear bark given dog out

SLIDE 20

Conditional Independence

  • Given I(X,Y,Z): P(X|Y,Z) = P(X|Y)
  • Given I(L,F,D): P(D|L,F) = P(D|F)
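This identity can be checked numerically. The CPT numbers below are hypothetical, chosen only so that P(L,F,D) factorizes as P(L) P(F|L) P(D|F), which encodes I(L,F,D):

```python
from itertools import product

# Hypothetical CPTs in which D depends on F but not directly on L.
pL = {0: 0.7, 1: 0.3}                               # P(L)
pF = {(0, 0): 0.9, (1, 0): 0.1,                     # P(F|L): keys (F, L)
      (0, 1): 0.4, (1, 1): 0.6}
pD = {(0, 0): 0.8, (1, 0): 0.2,                     # P(D|F): keys (D, F)
      (0, 1): 0.25, (1, 1): 0.75}

# Joint built as P(L,F,D) = P(L) P(F|L) P(D|F).
joint = {(l, f, d): pL[l] * pF[(f, l)] * pD[(d, f)]
         for l, f, d in product([0, 1], repeat=3)}

def p_d_given(d, l, f):
    """P(D=d | L=l, F=f) computed directly from the joint table."""
    return joint[(l, f, d)] / (joint[(l, f, 0)] + joint[(l, f, 1)])

# Once F is known, conditioning on L changes nothing: P(D|L,F) = P(D|F).
for d, f in product([0, 1], repeat=2):
    assert abs(p_d_given(d, 0, f) - p_d_given(d, 1, f)) < 1e-9
    assert abs(p_d_given(d, 0, f) - pD[(d, f)]) < 1e-9
```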

SLIDE 21

Chain Rule & Conditional Independence

P(L,F,D,B,H) = P(L) P(F|L) P(D|L,F) P(B|L,F,D) P(H|L,F,D,B)   (Chain Rule)
             = P(L) P(F|L) P(D|F) P(B|L,F,D) P(H|L,F,D,B)   (by I(D,F,L))
             = P(L) P(F|L) P(D|F) P(B|L,D) P(H|L,F,D,B)   (by I(B,LD,F))

SLIDE 22

Bayesian network

A graphical interpretation of probability theory

SLIDE 23

Directed Acyclic Graph

[DAG: Family out → Lights on; Family out → Dog out; Bowel problem → Dog out; Dog out → Hear bark]

SLIDE 24

Testing Independences

L F D B H

A set of variables X is d-separated from a set of variables Y given a set of variables Z in the DAG if all paths from X to Y are blocked by Z
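One standard way to test d-separation (the moral-graph construction, not an algorithm from this deck) is: restrict the DAG to the ancestors of the variables involved, marry co-parents, drop edge directions, delete the evidence nodes, and check connectivity. A minimal sketch over this deck's network:

```python
def d_separated(parents, x, y, given):
    """Moral-graph d-separation test: x and y are d-separated by `given`
    iff they are disconnected in the moralized ancestral graph after the
    `given` nodes are removed."""
    # 1. Restrict to the ancestral set of {x, y} union given.
    nodes, stack = set(), [x, y, *given]
    while stack:
        v = stack.pop()
        if v not in nodes:
            nodes.add(v)
            stack.extend(parents.get(v, []))
    # 2. Moralize: link each node to its parents and marry co-parents.
    adj = {v: set() for v in nodes}
    for v in nodes:
        ps = [p for p in parents.get(v, []) if p in nodes]
        for p in ps:
            adj[v].add(p)
            adj[p].add(v)
        for i in range(len(ps)):
            for j in range(i + 1, len(ps)):
                adj[ps[i]].add(ps[j])
                adj[ps[j]].add(ps[i])
    # 3. Drop the given nodes and look for a surviving path from x to y.
    seen, stack = set(), [x]
    while stack:
        v = stack.pop()
        if v in given or v in seen:
            continue
        if v == y:
            return False        # a path survives: not d-separated
        seen.add(v)
        stack.extend(adj[v])
    return True

# This deck's network: F -> L, F -> D, B -> D, D -> H.
parents = {'L': ['F'], 'D': ['B', 'F'], 'H': ['D']}

# Is F d-separated from H given D? Yes: I(F,D,H) holds.
assert d_separated(parents, 'F', 'H', {'D'})
# Without evidence, the path F -> D -> H is open.
assert not d_separated(parents, 'F', 'H', set())
```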

SLIDE 25

Testing Independences

L F D B H

Is F d-separated from H given D? Yes; that is, I(F,D,H) holds in P(L,F,D,B,H)

SLIDE 26

Testing Independences

L F D B H

P(F) P(B) P(D|B,F) P(L|F) P(H|D)

The size issue: 20 probabilities (2 + 2 + 8 + 4 + 4 entries across the five CPTs)

SLIDE 27

Bayesian Network

L F D B H

P(H|D) P(F) P(B) P(D|B,F) P(L|F)

A directed acyclic graph B and a set of conditional probability tables P(v | Pa(v)), one for each variable v in B, where Pa(v) denotes the parents of v; together they define P(U) as the product of all P(v | Pa(v))

SLIDE 28

Bayesian Network

L F D B H

P(L,F,D,B,H) = P(H|D) P(F) P(B) P(D|B,F) P(L|F)

SLIDE 29

Inference

P(L,F,D,B,H) = P(H|D) P(F) P(B) P(D|B,F) P(L|F)

Computing a query such as P(L) uses only part of the joint P(L,F,D,B,H).

SLIDE 30

Inference

From the factorization P(H|D) P(F) P(B) P(D|B,F) P(L|F), computing P(L) needs only P(L|F) and P(F): multiply them to obtain P(L,F), then sum out F.

SLIDE 31

Inference

Multiplication: P(L,F) = P(L|F) × P(F)

  L F | P(L|F)        F | P(F)        L F | P(L,F)
  0 0 | 0.8           0 | 0.8         0 0 | 0.64
  0 1 | 0.3     ×     1 | 0.3    =    0 1 | 0.09
  1 0 | 0.2                           1 0 | 0.16
  1 1 | 0.7                           1 1 | 0.21

SLIDE 32

Inference

Marginalization: P(L) = sum over F of P(L,F)

  L F | P(L,F)               L | P(L)
  0 0 | 0.2                  0 | 0.5
  0 1 | 0.3     Σ_F   =      1 | 0.5
  1 0 | 0.4
  1 1 | 0.1
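Both operations are direct table manipulations. A minimal sketch using the two slides' numbers, with tables stored as Python dictionaries:

```python
# The multiplication slide's tables, keyed by variable assignments.
pL_given_F = {(0, 0): 0.8, (0, 1): 0.3,     # keys are (L, F)
              (1, 0): 0.2, (1, 1): 0.7}
pF = {0: 0.8, 1: 0.3}

# Multiplication: P(L,F) = P(L|F) * P(F), matching the slide's result.
pLF = {(l, f): pL_given_F[(l, f)] * pF[f] for l in (0, 1) for f in (0, 1)}
assert abs(pLF[(0, 0)] - 0.64) < 1e-9
assert abs(pLF[(1, 1)] - 0.21) < 1e-9

# Marginalization: sum F out of the other slide's joint table to get P(L).
joint = {(0, 0): 0.2, (0, 1): 0.3, (1, 0): 0.4, (1, 1): 0.1}
pL = {l: joint[(l, 0)] + joint[(l, 1)] for l in (0, 1)}
assert abs(pL[0] - 0.5) < 1e-9 and abs(pL[1] - 0.5) < 1e-9
```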

SLIDE 33

Inference Algorithms

P(H|D) P(F) P(B) P(D|B,F) P(L|F) P(L)

  • Shafer-Shenoy
  • Lauritzen-Spiegelhalter
  • Hugin
  • Lazy Propagation
  • Variable Elimination

SLIDE 34

Variable Elimination

Eliminates all variables that are not in the query

SLIDE 35

Variable Elimination Algorithm

Input: factorization F, elimination ordering L, query X, evidence Y
Output: P(X|Y)

For each variable v in L:
    multiply all CPTs in F involving v, yielding CPT P1
    marginalize v out of P1
    remove all CPTs involving v from F
    append P1 to F
Multiply all remaining CPTs in F, yielding P(X,Y)
Return P(X|Y) = P(X,Y) / P(Y)
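The algorithm can be sketched directly on tables of probabilities. The factor representation and the CPT numbers below are my own illustrative choices; only the network structure (F → L, F → D, B → D, D → H) and the elimination ordering B, F, D come from the slides:

```python
from itertools import product

# Factors are (vars, table) pairs: vars is a tuple of variable names, and
# table maps each 0/1 assignment (in vars order) to a probability.

def multiply(f1, f2):
    """Pointwise product over the union of the two scopes."""
    (v1, t1), (v2, t2) = f1, f2
    vars = v1 + tuple(v for v in v2 if v not in v1)
    table = {}
    for asg in product([0, 1], repeat=len(vars)):
        env = dict(zip(vars, asg))
        table[asg] = (t1[tuple(env[v] for v in v1)] *
                      t2[tuple(env[v] for v in v2)])
    return vars, table

def sum_out(factor, x):
    """Marginalize variable x out of a factor."""
    vars, table = factor
    i = vars.index(x)
    out = {}
    for asg, p in table.items():
        key = asg[:i] + asg[i + 1:]
        out[key] = out.get(key, 0.0) + p
    return vars[:i] + vars[i + 1:], out

def variable_elimination(factors, ordering):
    """Eliminate each variable in `ordering`, then multiply what remains."""
    for x in ordering:
        related = [f for f in factors if x in f[0]]
        factors = [f for f in factors if x not in f[0]]
        prod = related[0]
        for f in related[1:]:
            prod = multiply(prod, f)
        factors.append(sum_out(prod, x))
    result = factors[0]
    for f in factors[1:]:
        result = multiply(result, f)
    return result

# The deck's network with made-up CPT numbers (for illustration only).
fF = (('F',), {(0,): 0.85, (1,): 0.15})
fB = (('B',), {(0,): 0.99, (1,): 0.01})
fL = (('L', 'F'), {(0, 0): 0.95, (1, 0): 0.05, (0, 1): 0.4, (1, 1): 0.6})
fD = (('D', 'B', 'F'), {(0, 0, 0): 0.9, (1, 0, 0): 0.1,
                        (0, 0, 1): 0.1, (1, 0, 1): 0.9,
                        (0, 1, 0): 0.03, (1, 1, 0): 0.97,
                        (0, 1, 1): 0.01, (1, 1, 1): 0.99})
fH = (('H', 'D'), {(0, 0): 0.99, (1, 0): 0.01, (0, 1): 0.3, (1, 1): 0.7})

# Eliminate B, F, D to obtain P(H, L); conditioning on evidence would then
# select the matching rows and normalize.
vars_out, pHL = variable_elimination([fF, fB, fL, fD, fH], ['B', 'F', 'D'])
```

Evidence handling is omitted for brevity: with evidence L=1, one would restrict the final table to the rows with L=1 and divide by their sum.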

SLIDE 36

Variable Elimination Algorithm

L F D B H

P(L,F,D,B,H) = P(H|D) P(F) P(B) P(D|B,F) P(L|F)

Query: P(H | L)?

SLIDE 37

Variable Elimination Algorithm

Input:
  Factorization: P(H|D) P(F) P(B) P(D|B,F) P(L|F)
  Query variable: H
  Evidence variable: L=1
  Elimination ordering: B, F, D

SLIDE 38

Variable Elimination Algorithm

Eliminating B:
  P(B,D|F) = P(B) P(D|B,F)
  P(D|F) = marginalize B from P(B,D|F)
  Factorization: P(H|D) P(F) P(L|F) P(D|F)

Eliminating F:
  P(D,F,L) = P(L|F) P(F) P(D|F)
  P(D,L) = marginalize F from P(D,F,L)
  Factorization: P(H|D) P(D,L)

SLIDE 39

Variable Elimination Algorithm

Eliminating D:
  P(D,H,L) = P(H|D) P(D,L)
  P(H,L) = marginalize D from P(D,H,L)
  Factorization: P(H,L)

Output:
  P(L) = marginalize H from P(H,L)
  P(H|L) = P(H,L) / P(L)

SLIDE 40

Repeated Computation

Variable Elimination can perform repeated computation

SLIDE 41

Variable Elimination Algorithm

L F D B H

P(L,F,D,B,H) = P(H|D) P(F) P(B) P(D|B,F) P(L|F)

Query: P(H | F)?

SLIDE 42

Variable Elimination Algorithm

Input:
  Factorization: P(H|D) P(F) P(B) P(D|B,F) P(L|F)
  Query variable: H
  Evidence variable: F=1
  Elimination ordering: L, B, D

SLIDE 43

Variable Elimination Algorithm

Eliminating L:
  1(F) = marginalize L from P(L|F)
  Factorization: P(H|D) P(F) P(B) P(D|B,F)

Eliminating B:
  P(B,D|F) = P(B) P(D|B,F)
  P(D|F) = marginalize B from P(B,D|F)
  Factorization: P(H|D) P(F) P(D|F)

SLIDE 44

Variable Elimination Algorithm

Eliminating D:
  P(D,H|F) = P(H|D) P(D|F)
  P(H|F) = marginalize D from P(D,H|F)
  Factorization: P(F) P(H|F)

Output:
  Multiply all: P(F,H) = P(F) P(H|F)
  P(F) = marginalize H from P(F,H)
  P(H|F) = P(F,H) / P(F)

SLIDE 45

Repeated Computation

Answering P(H|L), eliminating B:
  P(B,D|F) = P(B) P(D|B,F)
  P(D|F) = marginalize B from P(B,D|F)
  Factorization: P(H|D) P(F) P(L|F) P(D|F)

Answering P(H|F), eliminating B:
  P(B,D|F) = P(B) P(D|B,F)
  P(D|F) = marginalize B from P(B,D|F)
  Factorization: P(H|D) P(F) P(D|F)

The same work is performed in both queries.

SLIDE 46

Repeated Computation

  • Store past computation
  • Find relevant computation for new query
  • Retrieve computation that can be reused
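One possible realization of these three steps (the cache key and the `eliminate` helper below are hypothetical illustrations, not a mechanism proposed in the talk) is to memoize each elimination step on the set of CPTs it consumes and the variable being eliminated:

```python
# Memoize elimination steps: key on the CPTs consumed and the variable
# eliminated, so a later query can retrieve the stored result.
cache = {}

def eliminate(cpt_names, var, compute):
    """Return the cached result of eliminating `var` from the named CPTs,
    calling `compute` only on a cache miss."""
    key = (frozenset(cpt_names), var)
    if key not in cache:
        cache[key] = compute()
    return cache[key]

# Both P(H|L) and P(H|F) begin by eliminating B from {P(B), P(D|B,F)},
# so the second query reuses the stored P(D|F) instead of recomputing it.
first = eliminate({'P(B)', 'P(D|B,F)'}, 'B', lambda: 'P(D|F)')
second = eliminate({'P(B)', 'P(D|B,F)'}, 'B', lambda: 'P(D|F)')
assert first == second and len(cache) == 1
```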
SLIDE 47

Variable Elimination as a Join Tree

Join tree for the elimination ordering B, F, D:

  [D,B,F] --P(D|F)--> [D,F,L] --P(D,L)--> [D,H,L] --P(H,L)--> [H,L]

Assigned CPTs: [D,B,F]: P(B), P(D|B,F); [D,F,L]: P(L|F), P(F); [D,H,L]: P(H|D)

Answering P(H|L)

SLIDE 48

Variable Elimination as a Join Tree

Join tree for the elimination ordering B, F, D:

  [D,B,F] --P(D|F)--> [D,F,L] --P(D,L)--> [D,H,L] --P(H,L)--> [H,L]

Assigned CPTs: [D,B,F]: P(B), P(D|B,F); [D,F,L]: P(L|F), P(F); [D,H,L]: P(H|D)

Answering P(H|F)

SLIDE 49

Conclusions

  • Bayesian networks are useful probabilistic graphical models
  • Inference can be performed by Variable Elimination
  • Future work will investigate how to avoid repeated computation during Variable Elimination

SLIDE 50

References

  • Bonaparte Project: http://www.bonaparte-dvi.com/
  • McEliece, R. J., MacKay, D. J. C., & Cheng, J.-F. (1998). Turbo decoding as an instance of Pearl's "belief propagation" algorithm. IEEE Journal on Selected Areas in Communications, 16(2), 140-152. doi:10.1109/49.661103. ISSN 0733-8716.
  • Microsoft TrueSkill: http://research.microsoft.com/en-us/projects/trueskill/
  • Serrano, N. (2006). A Bayesian framework for landing site selection during autonomous spacecraft descent. In Intelligent Robots and Systems, 2006 IEEE/RSJ International Conference on, Beijing, pp. 5112-5117.
  • Koller, D., & Friedman, N. (2009). Probabilistic Graphical Models: Principles and Techniques. MIT Press.
  • Darwiche, A. (2009). Modeling and Reasoning with Bayesian Networks (1st ed.). Cambridge University Press.
  • Shafer, G., & Shenoy, P. P. (1989). Probability Propagation.
  • Charniak, E. (1991). Bayesian networks without tears. AI Magazine, 12(4), 50-63.