Bayesian Networks
Intro to AI: Lecture 8
Volker Sorge

Outline: Introduction · A Bayesian Network · Inference in Bayesian Networks
Specifying Probability Distributions
◮ Specifying a probability for every atomic event is impractical.
◮ We have already seen that it can be easier to specify probability distributions by using (conditional) independence.
◮ Bayesian (Belief) Networks allow us
  ◮ to specify any distribution,
  ◮ to specify such distributions concisely, in a natural way, if there is (conditional) independence.
Idea of a Bayesian Network
◮ Fix a set of random variables {X1, . . . , Xn}.
◮ If every variable takes k values, we have to specify k^n probabilities to get the complete joint distribution.
◮ A Bayesian Network tries to avoid this by representing direct influences between random variables and restricting the probability distributions that need to be computed to those direct influences.
Setting Up a Bayesian Network
◮ Every random variable in {X1, . . . , Xn} is a node in the network.
◮ Influences are given by directed edges between nodes.
◮ Each node holds the conditional probability distribution of the node given its parent nodes.
◮ If we do this naively, we can still end up computing close to k^n probabilities.
◮ If we exploit conditional independence, we can reduce the complexity to the order of k · n.

Every node of a Bayesian Network is conditionally independent of its non-descendants given its parents.
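The saving can be made concrete by counting table entries; a back-of-the-envelope sketch in Python (the bound of at most two parents per node is an assumption for illustration, not from the slides):

```python
# Number of probabilities needed for n variables with k values each.
n, k = 5, 2                 # e.g. five Boolean random variables

# Full joint distribution: one entry per atomic event.
full_joint = k ** n         # 2^5 = 32 entries

# Bayesian network in which every node has at most p parents:
# each node only needs a table with k^p rows, one per parent
# assignment, so the total grows linearly in n.
p = 2                       # assumed bound on the number of parents
network = n * k ** p        # at most 5 * 2^2 = 20 entries
```
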
Example: Bayesian Net
Edges: A → C, B → C, B → D, C → E, D → E.

Oversleeps (A)
  P(A) = .6

Pershore closed (B)
  P(B) = .2

Volker late (C)
  A B | P(C)
  T T |  .9
  T F |  .7
  F T |  .8
  F F |  .2

Mark late (D)
  B | P(D)
  T |  .3
  F |  .4

Committee cancelled (E)
  C D | P(E)
  T T |  .9
  T F |  .4
  F T |  .5
  F F |  .3
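The tables above translate directly into code; a minimal Python sketch of the network's joint distribution via its factorisation (variable and function names are mine, not from the lecture):

```python
# Conditional probability tables of the example network.
# Each value is P(node = True | parent assignment).
P_A = 0.6                      # P(Oversleeps)
P_B = 0.2                      # P(Pershore closed)
P_C = {(True, True): 0.9, (True, False): 0.7,    # P(Volker late | A, B)
       (False, True): 0.8, (False, False): 0.2}
P_D = {True: 0.3, False: 0.4}  # P(Mark late | B)
P_E = {(True, True): 0.9, (True, False): 0.4,    # P(Committee cancelled | C, D)
       (False, True): 0.5, (False, False): 0.3}

def p(prob, value):
    """Return prob if value is True, else 1 - prob."""
    return prob if value else 1.0 - prob

def joint(a, b, c, d, e):
    """Joint probability of one atomic event via the network
    factorisation P(a)P(b)P(c|a,b)P(d|b)P(e|c,d)."""
    return (p(P_A, a) * p(P_B, b) * p(P_C[(a, b)], c)
            * p(P_D[b], d) * p(P_E[(c, d)], e))
```

Summing `joint` over all 32 atomic events yields 1, a quick sanity check on the tables.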
Probabilistic Inference: Goal
◮ Compute the probability distribution for some event
given some evidence.
◮ More formally:
  ◮ let Q be a set of query variables,
  ◮ let E be a set of evidence variables,
  ◮ compute 𝐏(Q|E).
◮ Here evidence means that we know the exact event for
the variables in E.
◮ E.g., given that we know Volker has overslept, how likely is it that the committee will be cancelled?
Types of Inference
Diagnostic Inferences: From effects to causes. How likely is a cause for some observed event?
Causal Inferences: From causes to effects. How likely will some observed events cause some other event?
Intercausal Inferences: Between causes of a common effect. How likely is it that, if we know one cause for an event, some other cause is also happening? This is also sometimes called “explaining away”.
Mixed Inferences: Combining two or more of the above.
Inferences in Bayesian Nets
A schematic overview for queries Q and observed evidence E:

[Diagram omitted: the relative positions of Q and E in the network for the diagnostic, causal, intercausal, and mixed cases.]
Inference Examples
Diagnostic:  𝐏(B|E) = ⟨.21, .79⟩
Causal:      𝐏(E|A) = ⟨.533, .467⟩
Intercausal: 𝐏(C|D) = ⟨.557, .443⟩
Mixed:       𝐏(C|B, E) = ⟨.904, .096⟩

Computed with http://aispace.org/bayes/.
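These four values can be reproduced by brute-force enumeration over all atomic events; a Python sketch with the CPTs transcribed from the example network (identifiers are my own):

```python
from itertools import product

# CPTs of the example network: P(node = True | parent assignment).
CPT = {
    'A': {(): 0.6},                                 # Oversleeps
    'B': {(): 0.2},                                 # Pershore closed
    'C': {(True, True): 0.9, (True, False): 0.7,    # Volker late | A, B
          (False, True): 0.8, (False, False): 0.2},
    'D': {(True,): 0.3, (False,): 0.4},             # Mark late | B
    'E': {(True, True): 0.9, (True, False): 0.4,    # Cancelled | C, D
          (False, True): 0.5, (False, False): 0.3},
}
PARENTS = {'A': (), 'B': (), 'C': ('A', 'B'), 'D': ('B',), 'E': ('C', 'D')}
VARS = ['A', 'B', 'C', 'D', 'E']

def joint(event):
    """P(full assignment) = product over nodes of P(x | parents(x))."""
    prob = 1.0
    for x in VARS:
        px = CPT[x][tuple(event[u] for u in PARENTS[x])]
        prob *= px if event[x] else 1.0 - px
    return prob

def query(q, evidence):
    """P(q = True | evidence): sum the joint over all assignments
    consistent with the evidence, then normalise."""
    num = den = 0.0
    for vals in product([True, False], repeat=len(VARS)):
        event = dict(zip(VARS, vals))
        if any(event[k] != v for k, v in evidence.items()):
            continue
        pr = joint(event)
        den += pr
        if event[q]:
            num += pr
    return num / den
```

For instance, `query('B', {'E': True})` gives ≈ .211, matching the diagnostic result above.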
Notation
◮ P stands for a simple probability.
◮ 𝐏 stands for a probability distribution (i.e., a set of probabilities).
◮ P(A|B) denotes the probability of A under the condition B.
◮ 𝐏(A|B) denotes the probability distribution for A under the condition B.
◮ 𝐏(A, B) is the not yet normalised distribution for A under the condition B. That is, α𝐏(A, B) = 𝐏(A|B).
◮ Finally, small letters stand for variables that have to be summed out (sometimes called nuisance variables).
Example Computation
◮ 𝐏(B|E = T) = α𝐏(B, E = T).
◮ We compute 𝐏(B, E) by summing out the remaining variables A, C, D.
◮ We will write a, c, d for the respective events.
◮ This means we have to compute
    Σ_a Σ_c Σ_d 𝐏(B, E, a, c, d),
  which is the (not normalised) distribution of B under the assumption that E = T, while summing out a, c, d.
◮ A simple example of how to sum out is:
    Σ_a 𝐏(B, a) = 𝐏(B, A) + 𝐏(B, ¬A).
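As an illustration of the last point, a tiny Python sketch summing A out of the joint 𝐏(B, a); since A and B are both root nodes of the example network, their joint factorises as P(B) · P(a) (names are mine):

```python
# Summing out A from the joint P(B, a), with the example network's
# numbers: P(A) = 0.6, P(B) = 0.2. A and B are root nodes here,
# so they are independent and the joint is just the product.
P_A, P_B = 0.6, 0.2

joint = {(b, a): (P_B if b else 1 - P_B) * (P_A if a else 1 - P_A)
         for b in (True, False) for a in (True, False)}

# Sum out a:  sum_a P(B, a) = P(B, A) + P(B, ¬A), which recovers P(B).
marginal_B = {b: sum(joint[(b, a)] for a in (True, False))
              for b in (True, False)}
```
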
The great advantage of a Bayesian network is that we can effectively use all the conditional probabilities given in the network to express the term 𝐏(B, E, a, c, d) as follows:

  𝐏(B, E) = Σ_a Σ_c Σ_d 𝐏(B, E, a, c, d)
          = Σ_a Σ_c Σ_d P(B)P(a)P(d|B)P(c|B, a)P(E|c, d)

We observe that all the probability distributions on the right-hand side are indeed fully given in the network. The summing out works as follows:

  𝐏(B, E) = Σ_a Σ_c Σ_d 𝐏(B, E, a, c, d)
          = Σ_a Σ_c Σ_d P(B)P(a)P(d|B)P(c|B, a)P(E|c, d)
          = Σ_c Σ_d P(B)P(A)P(d|B)P(c|B, A)P(E|c, d)
            + Σ_c Σ_d P(B)P(¬A)P(d|B)P(c|B, ¬A)P(E|c, d)
          = Σ_d P(B)P(A)P(d|B)P(C|B, A)P(E|C, d)
            + Σ_d P(B)P(A)P(d|B)P(¬C|B, A)P(E|¬C, d)
            + Σ_d P(B)P(¬A)P(d|B)P(C|B, ¬A)P(E|C, d)
            + Σ_d P(B)P(¬A)P(d|B)P(¬C|B, ¬A)P(E|¬C, d)
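The derivation above can be carried out numerically; a Python sketch computing the unnormalised 𝐏(B, E = T) from the factorisation and then normalising with α (CPT values from the example network; names are mine):

```python
# CPTs of the example network: P(node = True | parent assignment).
P_A = 0.6
P_B = 0.2
P_C = {(True, True): 0.9, (True, False): 0.7,
       (False, True): 0.8, (False, False): 0.2}   # keyed by (a, b)
P_D = {True: 0.3, False: 0.4}                      # keyed by b
P_E = {(True, True): 0.9, (True, False): 0.4,
       (False, True): 0.5, (False, False): 0.3}   # keyed by (c, d)

def p(prob, val):
    """Return prob if val is True, else 1 - prob."""
    return prob if val else 1.0 - prob

TF = (True, False)

# Unnormalised P(B = b, E = True): sum out a, c, d using the
# factorisation P(B)P(a)P(d|B)P(c|B,a)P(E|c,d).
unnorm = {b: sum(p(P_A, a) * p(P_B, b) * p(P_D[b], d)
                 * p(P_C[(a, b)], c) * p(P_E[(c, d)], True)
                 for a in TF for c in TF for d in TF)
          for b in TF}

alpha = 1.0 / (unnorm[True] + unnorm[False])       # normalisation constant
dist_B_given_E = {b: alpha * unnorm[b] for b in TF}
```

The result, ≈ ⟨.211, .789⟩, matches the diagnostic query 𝐏(B|E) quoted earlier.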
Questions
◮ In the above Bayesian Network, give examples for the following concepts:
  ◮ independent events,
  ◮ conditionally independent events, and
  ◮ dependent events.