SLIDE 1

Reasoning with Bayes Networks

Course: CS40022
Instructor: Dr. Pallab Dasgupta

Department of Computer Science & Engineering
Indian Institute of Technology Kharagpur

SLIDE 2

Belief Network Example

[Figure: Burglary → Alarm ← Earthquake, with Alarm → JohnCalls and Alarm → MaryCalls]

P(B) = 0.001        P(E) = 0.002

B  E  | P(A)        A | P(J)        A | P(M)
T  T  | 0.95        T | 0.90        T | 0.70
T  F  | 0.95        F | 0.05        F | 0.01
F  T  | 0.29
F  F  | 0.001
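The following short Python sketch (an illustration added here, not part of the original slides; the variable and function names are our own) encodes these CPTs and answers a query such as P(Burglary | JohnCalls = true, MaryCalls = true) by brute-force enumeration of the joint distribution:

```python
from itertools import product

# CPTs from Slide 2 (values as printed on the slide)
P_B = {True: 0.001, False: 0.999}
P_E = {True: 0.002, False: 0.998}
P_A = {(True, True): 0.95, (True, False): 0.95,
       (False, True): 0.29, (False, False): 0.001}   # P(A=true | B, E), keyed by (B, E)
P_J = {True: 0.90, False: 0.05}                       # P(J=true | A)
P_M = {True: 0.70, False: 0.01}                       # P(M=true | A)

def joint(b, e, a, j, m):
    """Probability of one complete assignment, via the chain rule of the network."""
    p = P_B[b] * P_E[e]
    p *= P_A[(b, e)] if a else 1 - P_A[(b, e)]
    p *= P_J[a] if j else 1 - P_J[a]
    p *= P_M[a] if m else 1 - P_M[a]
    return p

def query_burglary(j_obs, m_obs):
    """P(Burglary | J=j_obs, M=m_obs) by summing the joint over the hidden variables."""
    dist = {}
    for b in (True, False):
        dist[b] = sum(joint(b, e, a, j_obs, m_obs)
                      for e, a in product((True, False), repeat=2))
    total = sum(dist.values())
    return {b: p / total for b, p in dist.items()}

print(query_burglary(True, True))   # posterior over Burglary given both calls
```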

SLIDE 3

Answering queries

  • We consider cases where the belief network is a poly-tree
  • There is at most one undirected path between any two nodes

SLIDE 4

Answering queries

[Figure: generic poly-tree fragment around the query node X. The parents U1 … Um lie above X and carry the causal support EX+; the children Y1 … Yn lie below X, each with additional parents Z1j … Znj, and together they carry the evidential support EX–.]

SLIDE 5

Answering queries

  • U = U1 … Um are parents of node X
  • Y = Y1 … Yn are children of node X
  • X is the query variable
  • E is a set of evidence variables
  • The aim is to compute P(X | E)
SLIDE 6

Definitions

  • EX+ is the causal support for X
      - the evidence variables "above" X that are connected to X through its parents
  • EX– is the evidential support for X
      - the evidence variables "below" X that are connected to X through its children
  • EUi\X refers to all the evidence connected to node Ui except via the path from X
  • EYi\X+ refers to all the evidence connected to node Yi through its parents, other than via X

SLIDE 7

The computation of P(X|E)

$$P(X \mid E) = P(X \mid E_X^+, E_X^-) = \frac{P(E_X^- \mid X, E_X^+)\, P(X \mid E_X^+)}{P(E_X^- \mid E_X^+)}$$

  • Since X d-separates EX+ from EX–, we can use conditional independence to simplify the first term in the numerator
  • We can treat the denominator as a constant

$$P(X \mid E) = \alpha\, P(E_X^- \mid X)\, P(X \mid E_X^+)$$
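As a concrete illustration (added here, not from the slides): in the network of Slide 2, take X = Alarm with evidence E = {JohnCalls = true}. Then EX+ is empty, EX– = {J = true}, and the decomposition specializes to

$$P(A \mid j) = \alpha\, P(j \mid A)\, P(A)$$

where P(A), the causal support with no evidence above Alarm, is simply the prior of Alarm.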

SLIDE 8

The computation of P(X | EX+)

$$P(X \mid E_X^+) = \sum_{u} P(X \mid u) \prod_i P(u_i \mid E_{U_i \setminus X})$$

  • P(X | u) is a lookup in the cond prob table of X
  • P(ui | EUi\X) is a recursive (smaller) sub-problem
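Continuing the illustration from Slide 7 (our own worked numbers, using the CPT values as printed on Slide 2): with no evidence above Alarm, each P(ui | EUi\X) reduces to the prior P(ui), so

$$P(A \mid E_A^+) = \sum_{b,e} P(A \mid b,e)\, P(b)\, P(e) = 0.95(0.001)(0.002) + 0.95(0.001)(0.998) + 0.29(0.999)(0.002) + 0.001(0.999)(0.998) \approx 0.0025$$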
SLIDE 9

The computation of P(EX– | X)

  • Let Zi be the parents of Yi other than X, and let zi be an assignment of values to the parents
  • The evidence in each Yi box is conditionally independent of the others given X

$$P(E_X^- \mid X) = \prod_i P(E_{Y_i \setminus X} \mid X)$$

SLIDE 10

The computation of P(EX– | X)

$$P(E_X^- \mid X) = \prod_i P(E_{Y_i \setminus X} \mid X)$$

Averaging over Yi and zi yields:

$$P(E_X^- \mid X) = \prod_i \sum_{y_i} \sum_{z_i} P(E_{Y_i \setminus X} \mid X, y_i, z_i)\, P(y_i, z_i \mid X)$$

SLIDE 11

The computation of P(EX– | X)

$$P(E_X^- \mid X) = \prod_i \sum_{y_i} \sum_{z_i} P(E_{Y_i \setminus X} \mid X, y_i, z_i)\, P(y_i, z_i \mid X)$$

Breaking EYi\X into the two independent components EYi– and EYi\X+:

$$P(E_X^- \mid X) = \prod_i \sum_{y_i} \sum_{z_i} P(E_{Y_i}^- \mid X, y_i, z_i)\, P(E_{Y_i \setminus X}^+ \mid X, y_i, z_i)\, P(y_i, z_i \mid X)$$

SLIDE 12

The computation of P(EX– | X)

$$P(E_X^- \mid X) = \prod_i \sum_{y_i} \sum_{z_i} P(E_{Y_i}^- \mid X, y_i, z_i)\, P(E_{Y_i \setminus X}^+ \mid X, y_i, z_i)\, P(y_i, z_i \mid X)$$

EYi– is independent of X and zi given yi, and EYi\X+ is independent of X and yi:

$$P(E_X^- \mid X) = \prod_i \sum_{y_i} \sum_{z_i} P(E_{Y_i}^- \mid y_i)\, P(E_{Y_i \setminus X}^+ \mid z_i)\, P(y_i, z_i \mid X)$$

SLIDE 13

The computation of P(EX– | X)

$$P(E_X^- \mid X) = \prod_i \sum_{y_i} \sum_{z_i} P(E_{Y_i}^- \mid y_i)\, P(E_{Y_i \setminus X}^+ \mid z_i)\, P(y_i, z_i \mid X)$$

Apply Bayes' rule to P(EYi\X+ | zi):

$$P(E_X^- \mid X) = \prod_i \sum_{y_i} \sum_{z_i} P(E_{Y_i}^- \mid y_i)\, \frac{P(z_i \mid E_{Y_i \setminus X}^+)\, P(E_{Y_i \setminus X}^+)}{P(z_i)}\, P(y_i, z_i \mid X)$$

SLIDE 14

The computation of P(EX– | X)

$$P(E_X^- \mid X) = \prod_i \sum_{y_i} \sum_{z_i} P(E_{Y_i}^- \mid y_i)\, \frac{P(z_i \mid E_{Y_i \setminus X}^+)\, P(E_{Y_i \setminus X}^+)}{P(z_i)}\, P(y_i, z_i \mid X)$$

  • Rewriting the conjunction of Yi and zi:

$$P(E_X^- \mid X) = \prod_i \sum_{y_i} P(E_{Y_i}^- \mid y_i) \sum_{z_i} \frac{P(z_i \mid E_{Y_i \setminus X}^+)\, P(E_{Y_i \setminus X}^+)}{P(z_i)}\, P(y_i \mid X, z_i)\, P(z_i \mid X)$$

SLIDE 15

The computation of P(EX– | X)

$$P(E_X^- \mid X) = \prod_i \sum_{y_i} P(E_{Y_i}^- \mid y_i) \sum_{z_i} \frac{P(z_i \mid E_{Y_i \setminus X}^+)\, P(E_{Y_i \setminus X}^+)}{P(z_i)}\, P(y_i \mid X, z_i)\, P(z_i \mid X)$$

P(zi | X) = P(zi) because Z and X are d-separated. Also P(EYi\X+) is a constant.

$$P(E_X^- \mid X) = \beta \prod_i \sum_{y_i} P(E_{Y_i}^- \mid y_i) \sum_{z_i} P(z_i \mid E_{Y_i \setminus X}^+)\, P(y_i \mid X, z_i)$$

SLIDE 16

The computation of P(EX– | X)

$$P(E_X^- \mid X) = \beta \prod_i \sum_{y_i} P(E_{Y_i}^- \mid y_i) \sum_{z_i} P(z_i \mid E_{Y_i \setminus X}^+)\, P(y_i \mid X, z_i)$$

  • The parents of Yi (the Zij) are independent of each other
  • We also combine the βi into one single β
SLIDE 17

The computation of P(EX– | X)

$$P(E_X^- \mid X) = \beta \prod_i \sum_{y_i} P(E_{Y_i}^- \mid y_i) \sum_{z_i} P(y_i \mid X, z_i) \prod_j P(z_{ij} \mid E_{Z_{ij} \setminus Y_i})$$

  • P(EYi– | yi) is a recursive instance of P(EX– | X)
  • P(yi | X, zi) is a cond prob table entry for Yi
  • P(zij | EZij\Yi) is a recursive sub-instance of the P(X | E) calculation
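To make the recursive structure concrete, here is a minimal Python sketch (our own illustration, not from the slides; the data layout and names are hypothetical) that evaluates the expression above for one node X, with the two kinds of recursive sub-problems supplied as callables:

```python
from itertools import product

def evidential_support(x_val, children, beta=1.0):
    """
    Evaluate P(E_X^- | X = x_val) following the formula on this slide.

    `children` has one entry per child Yi of X. Each entry is a dict with:
      y_values   : the domain of Yi
      z_parents  : list of domains, one per other parent Zij of Yi
      cpt        : function (yi, x_val, zi) -> P(yi | X = x_val, zi)
      below      : function (yi) -> P(E_Yi^- | yi)            (recursive instance of this routine)
      parent_msg : list of functions zij -> P(zij | E_Zij\\Yi)  (recursive instances of P(X | E))
    """
    result = beta
    for child in children:
        total = 0.0
        for yi in child["y_values"]:
            inner = 0.0
            for zi in product(*child["z_parents"]):
                # P(yi | X, zi) * prod_j P(zij | E_Zij\Yi)
                term = child["cpt"](yi, x_val, zi)
                for j, zij in enumerate(zi):
                    term *= child["parent_msg"][j](zij)
                inner += term
            total += child["below"](yi) * inner
        result *= total
    return result
```

The `below` and `parent_msg` callables stand for the two recursive sub-problems named in the bullets above.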

SLIDE 18

Inference in multiply connected belief networks

  • Clustering methods
      - Transform the net into a probabilistically equivalent (but topologically different) poly-tree by merging offending nodes
  • Conditioning methods
      - Instantiate variables to definite values, and then evaluate a poly-tree for each possible instantiation

SLIDE 19

Inference in multiply connected belief networks

  • Stochastic simulation methods
      - Use the network to generate a large number of concrete models of the domain that are consistent with the network distribution
      - They give an approximation of the exact evaluation
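A minimal sketch of one such method, rejection sampling on the Slide 2 network (added here for illustration; rejection sampling is only one of several stochastic methods, and the names are ours):

```python
import random

def sample_alarm_network():
    """Draw one complete assignment from the Slide 2 network (prior sampling)."""
    b = random.random() < 0.001
    e = random.random() < 0.002
    p_a = {(True, True): 0.95, (True, False): 0.95,
           (False, True): 0.29, (False, False): 0.001}[(b, e)]
    a = random.random() < p_a
    j = random.random() < (0.90 if a else 0.05)
    m = random.random() < (0.70 if a else 0.01)
    return {"B": b, "E": e, "A": a, "J": j, "M": m}

def estimate(query_var, evidence, n=200_000):
    """Estimate P(query_var = true | evidence) by rejection sampling."""
    consistent, positive = 0, 0
    for _ in range(n):
        s = sample_alarm_network()
        if all(s[var] == val for var, val in evidence.items()):
            consistent += 1
            positive += s[query_var]
    return positive / consistent if consistent else float("nan")

# e.g. approximate P(Burglary | JohnCalls = true)
print(estimate("B", {"J": True}))
```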

SLIDE 20

Default reasoning

  • Some conclusions are made by default unless counter-evidence is obtained
  • Non-monotonic reasoning
  • Points to ponder:
      - What is the semantic status of default rules?
      - What happens when the evidence matches the premises of two default rules with conflicting conclusions?
      - If a belief is retracted later, how can a system keep track of which conclusions need to be retracted as a consequence?

SLIDE 21

Issues in Rule-based methods for Uncertain Reasoning

  • Locality
      - In logical reasoning systems, if we have A ⇒ B, then we can conclude B given evidence A, without worrying about any other rules. In probabilistic systems, we need to consider all available evidence.

SLIDE 22

Issues in Rule-based methods for Uncertain Reasoning

  • Detachment
      - Once a logical proof is found for proposition B, we can use it regardless of how it was derived (it can be detached from its justification). In probabilistic reasoning, the source of the evidence is important for subsequent reasoning.

SLIDE 23

Issues in Rule-based methods for Uncertain Reasoning

  • Truth functionality
      - In logic, the truth of complex sentences can be computed from the truth of the components. Probability combination does not work this way, except under strong independence assumptions.
      - A famous example of a truth-functional system for uncertain reasoning is the certainty factors model, developed for the Mycin medical diagnostic program.

SLIDE 24

Dempster-Shafer Theory

  • Designed to deal with the distinction between uncertainty and ignorance
  • We use a belief function Bel(X) – the probability that the evidence supports the proposition
  • When we do not have any evidence about X, we assign Bel(X) = 0 as well as Bel(¬X) = 0

SLIDE 25

Dempster-Shafer Theory

For example, if we do not know whether a coin is fair, then:
    Bel(Heads) = Bel(¬Heads) = 0
If we are given that the coin is fair with 90% certainty, then:
    Bel(Heads) = 0.9 × 0.5 = 0.45
    Bel(¬Heads) = 0.9 × 0.5 = 0.45
Note that we still have a gap of 0.1 that is not accounted for by the evidence.
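The same example can be phrased in code via a mass assignment over sets of outcomes (our illustration; reading the slide's numbers as 0.45 mass on {Heads}, 0.45 on {Tails}, and the remaining 0.1 on the whole frame is one possible encoding):

```python
# Mass assignment for the coin example: the 0.1 left on the whole frame
# {Heads, Tails} is the part of the belief committed to neither outcome.
mass = {
    frozenset({"Heads"}): 0.45,
    frozenset({"Tails"}): 0.45,
    frozenset({"Heads", "Tails"}): 0.10,
}

def bel(proposition, mass):
    """Bel(A) = sum of the mass of every focal set contained in A."""
    a = frozenset(proposition)
    return sum(m for focal, m in mass.items() if focal <= a)

print(bel({"Heads"}, mass))            # 0.45
print(bel({"Tails"}, mass))            # 0.45
print(bel({"Heads", "Tails"}, mass))   # 1.0 (the 0.1 gap sits on the whole frame)
```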

SLIDE 26

Fuzzy Logic

  • Fuzzy set theory is a means of specifying how well an object satisfies a vague description
  • Truth is a value between 0 and 1
  • Uncertainty stems from lack of evidence, but given the dimensions of a man, concluding whether he is fat has no uncertainty involved

SLIDE 27

Fuzzy Logic

  • The rules for evaluating the fuzzy truth, T, of a complex sentence are:

    T(A ∧ B) = min( T(A), T(B) )
    T(A ∨ B) = max( T(A), T(B) )
    T(¬A) = 1 − T(A)
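A tiny Python rendering of these three rules (added for illustration; the function names are ours):

```python
def t_and(ta, tb):
    """Fuzzy conjunction: T(A ∧ B) = min(T(A), T(B))."""
    return min(ta, tb)

def t_or(ta, tb):
    """Fuzzy disjunction: T(A ∨ B) = max(T(A), T(B))."""
    return max(ta, tb)

def t_not(ta):
    """Fuzzy negation: T(¬A) = 1 − T(A)."""
    return 1.0 - ta

# e.g. T(Tall ∧ ¬Heavy) with T(Tall) = 0.7 and T(Heavy) = 0.4
print(t_and(0.7, t_not(0.4)))   # 0.6
```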