Larry Holder, School of EECS, Washington State University: Artificial Intelligence (PowerPoint PPT presentation)



SLIDE 1

Larry Holder School of EECS Washington State University

Artificial Intelligence

SLIDE 2

} Full joint probability distribution

  • Can answer any query
  • But typically too large

} Conditional independence

  • Can reduce the number of probabilities needed
  • P(X | Y,Z) = P(X | Z), if X independent of Y given Z

} Bayesian network

  • Concise representation of above

SLIDE 3

} Example

SLIDE 4

} Bayesian network is a directed, acyclic graph
} Each node corresponds to a random variable
} A directed link from node X to node Y implies that X “influences” Y

  • X is the parent of Y

} Each node X has a conditional probability distribution P(X | Parents(X))

  • Quantifies the influence on X from its parent nodes
  • Conditional probability table (CPT)

SLIDE 5

} Represents full joint distribution
} Represents conditional independence

  • E.g., JohnCalls is independent of Burglary and Earthquake given Alarm


P(X1=x1 ∧ … ∧ Xn=xn) = ∏i=1..n P(xi | parents(Xi))

i.e., P(x1,…,xn) = ∏i=1..n P(xi | parents(Xi))

SLIDE 6

} P(b,¬e,a,j,m) = ?

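The query above follows directly from the chain-rule factorization: multiply one CPT entry per variable. A minimal Python sketch, assuming the classic burglary-network CPT values (the deck itself confirms P(B)=⟨0.001, 0.999⟩, P(E)=⟨0.002, 0.998⟩, and the A=false rows of P(J|A) and P(M|A) on a later slide; the remaining entries, e.g. P(a|b,¬e)=0.94, are assumptions from the standard textbook example):

```python
# Full joint probability via the chain rule: P(x1,...,xn) = prod_i P(xi | parents(Xi)).
# CPT values follow the classic burglary network (assumed; see lead-in).

P_b = 0.001                                         # P(Burglary=true)
P_e = 0.002                                         # P(Earthquake=true)
P_a = {(True, True): 0.95, (True, False): 0.94,     # P(Alarm=true | B, E)
       (False, True): 0.29, (False, False): 0.001}
P_j = {True: 0.90, False: 0.05}                     # P(JohnCalls=true | Alarm)
P_m = {True: 0.70, False: 0.01}                     # P(MaryCalls=true | Alarm)

# P(b, ¬e, a, j, m) = P(b) P(¬e) P(a|b,¬e) P(j|a) P(m|a)
p = P_b * (1 - P_e) * P_a[(True, False)] * P_j[True] * P_m[True]
print(round(p, 7))  # ≈ 0.000591
```

One multiplication per node, instead of a joint table with 2^5 entries.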
SLIDE 7

} Determine set of random variables {X1,…,Xn}
} Order them so that causes precede effects
} For i = 1 to n do

  • Choose minimal set of parents for Xi such that P(Xi | Xi-1,…,X1) = P(Xi | Parents(Xi))
  • For each parent Xk insert link from Xk to Xi
  • Write down the CPT, P(Xi | Parents(Xi))

} E.g., Burglary, Earthquake, Alarm, JohnCalls, MaryCalls

SLIDE 8

} Bad orderings lead to more complex networks with more CPT entries

  a) MaryCalls, JohnCalls, Alarm, Burglary, Earthquake
  b) MaryCalls, JohnCalls, Earthquake, Burglary, Alarm

SLIDE 9

} Example: Tooth World


            toothache           ¬toothache
            catch   ¬catch      catch   ¬catch
cavity      .108    .012        .072    .008
¬cavity     .016    .064        .144    .576
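The joint table above supports a direct numeric check of the conditional-independence idea from slide 2: Catch is independent of Toothache given Cavity. A short Python verification, with probabilities copied straight from the table:

```python
# Joint distribution P(Cavity, Toothache, Catch) from the Tooth World table.
joint = {
    # (cavity, toothache, catch): probability
    (True,  True,  True):  0.108, (True,  True,  False): 0.012,
    (True,  False, True):  0.072, (True,  False, False): 0.008,
    (False, True,  True):  0.016, (False, True,  False): 0.064,
    (False, False, True):  0.144, (False, False, False): 0.576,
}

def p(pred):
    """Sum the joint over all worlds satisfying pred."""
    return sum(pr for w, pr in joint.items() if pred(w))

# P(catch | toothache, cavity) should equal P(catch | cavity).
p_catch_given_tooth_cav = p(lambda w: w == (True, True, True)) / p(lambda w: w[0] and w[1])
p_catch_given_cav = p(lambda w: w[0] and w[2]) / p(lambda w: w[0])
print(round(p_catch_given_tooth_cav, 10), round(p_catch_given_cav, 10))  # 0.9 0.9
```

Both conditionals come out to 0.9, so knowing Toothache adds nothing once Cavity is known, which is exactly what the P(X | Y,Z) = P(X | Z) condition asserts.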

SLIDE 10

} Node X is conditionally independent of its non-descendants (Zij’s) given its parents (Ui’s)
} Markov blanket of node X is X’s parents (Ui’s), children (Yi’s) and children’s parents (Zij’s)
} Node X is conditionally independent of all other nodes in the network given its Markov blanket

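To make the non-descendants claim concrete: in the burglary network, JohnCalls has the single parent Alarm, so conditioning on anything beyond Alarm should change nothing. A brute-force check against the full joint, using the classic burglary-network CPTs (the A=true rows are assumptions; the deck only confirms the priors and the ¬a rows):

```python
# Assumed classic burglary-network CPTs (priors and ¬a rows match the deck).
P_b, P_e = 0.001, 0.002
P_a = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}
P_j = {True: 0.90, False: 0.05}
P_m = {True: 0.70, False: 0.01}

def joint(b, e, a, j, m):
    """Full joint via the chain rule over the network."""
    return ((P_b if b else 1 - P_b) * (P_e if e else 1 - P_e)
            * (P_a[(b, e)] if a else 1 - P_a[(b, e)])
            * (P_j[a] if j else 1 - P_j[a])
            * (P_m[a] if m else 1 - P_m[a]))

# P(j | a, b, e, m): condition on every other node, not just J's parent Alarm.
num = joint(True, True, True, True, True)
den = sum(joint(True, True, True, j, True) for j in (True, False))
print(round(num / den, 6), P_j[True])  # 0.9 0.9
```

Conditioning on all four other variables gives the same 0.9 as the single CPT entry P(j | a): Alarm alone is J's Markov blanket.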
SLIDE 11

} Want P(X | e)
} X is the query variable (can be more than one)
} e is an observed event, i.e., values for the evidence variables E = {E1,…,Em}
} Any other variables Y are hidden variables
} Example

  • P(Burglary | JohnCalls=true, MaryCalls=true) = ?
  • X = Burglary
  • e = {JohnCalls=true, MaryCalls=true}
  • Y = {Earthquake, Alarm}
SLIDE 12

} Enumerate over all possible values for Y

  • P(X | e) = α P(X,e) = α Σy P(X,e,y)

} Example

  • P(Burglary | JohnCalls=true, MaryCalls=true)
  • P(B | j, m) = ?

SLIDE 13

} P(B|j,m) = α P(B) ΣE P(E) ΣA P(A|B,E) P(j|A) P(m|A)
} P(b|j,m) = α P(b) ΣE P(E) ΣA P(A|b,E) P(j|A) P(m|A)
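The nested sums can be evaluated directly, variable by variable. A Python sketch, again assuming the classic burglary CPT values (only P(B), P(E), P(J|¬a), and P(M|¬a) are confirmed by the deck):

```python
# Assumed classic burglary-network CPTs.
P_b, P_e = 0.001, 0.002
P_a = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}
P_j = {True: 0.90, False: 0.05}
P_m = {True: 0.70, False: 0.01}

def unnorm(b):
    """P(b) * sum_E P(E) * sum_A P(A|b,E) P(j|A) P(m|A), before normalization."""
    total = 0.0
    for e in (True, False):
        for a in (True, False):
            pa = P_a[(b, e)] if a else 1 - P_a[(b, e)]
            total += (P_e if e else 1 - P_e) * pa * P_j[a] * P_m[a]
    return (P_b if b else 1 - P_b) * total

scores = {b: unnorm(b) for b in (True, False)}
alpha = 1 / sum(scores.values())          # normalization constant
print(round(alpha * scores[True], 3))     # ≈ 0.284
```

So even with both neighbors calling, the posterior probability of a burglary is only about 0.28, because the alarm has other explanations.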

SLIDE 14

} P(b|j,m) = α P(b) ΣE P(E) ΣA P(A|b,E) P(j|A) P(m|A)

SLIDE 15


function ENUMERATION-ASK(X, e, bn) returns a distribution over X
  inputs: X, the query variable
          e, observed values of variables E
          bn, a Bayes net with variables {X} ∪ E ∪ Y   // Y = hidden variables
  Q(X) ← a distribution over X, initially empty
  for each value xi of X do
    Q(xi) ← ENUMERATE-ALL(bn.VARS, exi)
      where exi is e extended with X = xi
  return NORMALIZE(Q(X))

function ENUMERATE-ALL(vars, e) returns a real number
  if EMPTY?(vars) then return 1.0
  Y ← FIRST(vars)
  if Y has value y in e
    then return P(y | parents(Y)) × ENUMERATE-ALL(REST(vars), e)
    else return Σy P(y | parents(Y)) × ENUMERATE-ALL(REST(vars), ey)
      where ey is e extended with Y = y

bn.VARS has variables in cause→effect order
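The pseudocode above translates almost line-for-line into Python. A sketch, representing the network as a dict mapping each variable (in cause→effect order) to its parents and CPT; the burglary CPT values are the assumed classic ones:

```python
def enumeration_ask(X, e, bn):
    """Return the distribution P(X | e) for Bayes net bn (dict: var -> (parents, cpt))."""
    Q = {}
    for xi in (True, False):
        Q[xi] = enumerate_all(list(bn), {**e, X: xi}, bn)   # e extended with X = xi
    norm = sum(Q.values())
    return {xi: q / norm for xi, q in Q.items()}            # NORMALIZE

def enumerate_all(variables, e, bn):
    if not variables:
        return 1.0
    Y, rest = variables[0], variables[1:]
    parents, cpt = bn[Y]
    def p(y):  # P(Y=y | parents(Y)) under the current assignment e
        pt = cpt[tuple(e[par] for par in parents)]
        return pt if y else 1 - pt
    if Y in e:  # Y has a value in e
        return p(e[Y]) * enumerate_all(rest, e, bn)
    return sum(p(y) * enumerate_all(rest, {**e, Y: y}, bn) for y in (True, False))

# Burglary network in cause->effect order (CPT entries assumed; see lead-in).
bn = {
    'B': ((), {(): 0.001}),
    'E': ((), {(): 0.002}),
    'A': (('B', 'E'), {(True, True): 0.95, (True, False): 0.94,
                       (False, True): 0.29, (False, False): 0.001}),
    'J': (('A',), {(True,): 0.90, (False,): 0.05}),
    'M': (('A',), {(True,): 0.70, (False,): 0.01}),
}
dist = enumeration_ask('B', {'J': True, 'M': True}, bn)
print(round(dist[True], 3))  # ≈ 0.284
```

Each CPT stores only P(var=true | parents); the cause→effect ordering guarantees that a variable's parents are already assigned when its factor is evaluated.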

SLIDE 16

} ENUMERATION-ASK evaluates trees using depth-first recursion
} Space complexity O(n)
} Time complexity O(v^n), where each of n variables has v possible values

SLIDE 17


Note redundant computation

SLIDE 18

} Avoid redundant computation

  • Dynamic programming
  • Store intermediate computations and reuse

} Eliminate irrelevant variables

  • Variables that are not an ancestor of a query or evidence variable

SLIDE 19

} General case (any type of network)

  • Worst case space and time complexity is exponential

} Polytree is a network with at most one undirected path between any two nodes

  • Space and time complexity is linear in size of network


(Figure: an example polytree, and a network that is not a polytree)

SLIDE 20

} P(Pit3,3 | Breeze3,2=true) = ?


SLIDE 21

} Exact inference can be too expensive
} Approximate inference

  • Estimate probabilities from samples, rather than computing exactly

} Monte Carlo methods

  • Choose values for hidden variables
  • Compute query variables
  • Repeat and average

} Direct sampling
} Converges to exact inference

SLIDE 22

} Choose a value for each variable according to its CPT

  • Consider variables in topological order

} E.g.,

  • P(B) = ⟨0.001, 0.999⟩, B=false
  • P(E) = ⟨0.002, 0.998⟩, E=false
  • P(A|B=false,E=false) = ⟨0.001, 0.999⟩, A=false
  • P(J|A=false) = ⟨0.05, 0.95⟩, J=false
  • P(M|A=false) = ⟨0.01, 0.99⟩, M=false
  • Sample is [false,false,false,false,false]


P(X = xi) ≈ |samples where X = xi| / |samples|
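Direct sampling plus the counting estimate is only a few lines. A sketch over the same assumed burglary CPTs, estimating P(Alarm=true); the exact value here is about 0.0025, which illustrates why rare events need many samples:

```python
import random

random.seed(0)  # make the run reproducible

# Assumed classic burglary-network CPTs.
P_b, P_e = 0.001, 0.002
P_a = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}
P_j = {True: 0.90, False: 0.05}
P_m = {True: 0.70, False: 0.01}

def prior_sample():
    """Sample each variable in topological order from its CPT."""
    b = random.random() < P_b
    e = random.random() < P_e
    a = random.random() < P_a[(b, e)]
    j = random.random() < P_j[a]
    m = random.random() < P_m[a]
    return b, e, a, j, m

N = 200_000
samples = [prior_sample() for _ in range(N)]
# P(A=true) ~= |samples where A=true| / |samples|
est = sum(1 for s in samples if s[2]) / N
print(est)  # close to the exact value, ~0.0025
```

Because each variable is drawn after its parents, every complete sample is an exact draw from the full joint, so the counting estimate converges to the true probability as N grows.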

SLIDE 23

} Another example

SLIDE 24

} Commercial

  • Bayes Server (www.bayesserver.com)
  • BayesiaLab (www.bayesia.com)
  • HUGIN (www.hugin.com)

} Free

  • BayesPy (www.bayespy.org)
  • JavaBayes (www.cs.cmu.edu/~javabayes)
  • SMILE (www.bayesfusion.com)

} Sample networks

  • www.bnlearn.com/bnrepository

SLIDE 25

} Bayesian networks

  • Captures full joint probability distribution and conditional independence

} Exact inference

  • Intractable in worst case

} Approximate inference

  • Sampling
  • Converges to exact inference
