Uncertainty: Chapter 14, AIMA Slides © Stuart Russell and Peter Norvig



slide-1
SLIDE 1 Uncertainty
Chapter 14
AIMA Slides © Stuart Russell and Peter Norvig, 1998
slide-2
SLIDE 2 Outline
  Uncertainty
  Probability
  Syntax
  Semantics
  Inference rules
slide-3
SLIDE 3 Uncertainty
Let action $A_t$ = leave for airport $t$ minutes before flight
Will $A_t$ get me there on time?
Problems:
  1) partial observability (road state, other drivers' plans, etc.)
  2) noisy sensors (KCBS traffic reports)
  3) uncertainty in action outcomes (flat tire, etc.)
  4) immense complexity of modelling and predicting traffic
Hence a purely logical approach either
  1) risks falsehood: "$A_{25}$ will get me there on time", or
  2) leads to conclusions that are too weak for decision making: "$A_{25}$ will get me there on time if there's no accident on the bridge and it doesn't rain and my tires remain intact etc etc."
($A_{1440}$ might reasonably be said to get me there on time but I'd have to stay overnight in the airport ...)
slide-4
SLIDE 4 Methods for handling uncertainty
Default or nonmonotonic logic:
  Assume my car does not have a flat tire
  Assume $A_{25}$ works unless contradicted by evidence
  Issues: What assumptions are reasonable? How to handle contradiction?
Rules with fudge factors:
  $A_{25} \mapsto_{0.3}$ get there on time
  $Sprinkler \mapsto_{0.99} WetGrass$
  $WetGrass \mapsto_{0.7} Rain$
  Issues: Problems with combination, e.g., $Sprinkler$ causes $Rain$??
Probability:
  Given the available evidence, $A_{25}$ will get me there on time with probability 0.04
  Mahaviracarya (9th C.), Cardano (1565) theory of gambling
(Fuzzy logic handles degree of truth, NOT uncertainty, e.g., $WetGrass$ is true to degree 0.2)
slide-5
SLIDE 5 Probability
Probabilistic assertions summarize effects of
  laziness: failure to enumerate exceptions, qualifications, etc.
  ignorance: lack of relevant facts, initial conditions, etc.
Subjective or Bayesian probability:
Probabilities relate propositions to one's own state of knowledge
  e.g., $P(A_{25} \mid \text{no reported accidents}) = 0.06$
These are not assertions about the world
Probabilities of propositions change with new evidence:
  e.g., $P(A_{25} \mid \text{no reported accidents}, \text{5 a.m.}) = 0.15$
(Analogous to logical entailment status $KB \models \alpha$, not truth.)
slide-6
SLIDE 6 Making decisions under uncertainty
Suppose I believe the following:
  $P(A_{25} \text{ gets me there on time} \mid \ldots) = 0.04$
  $P(A_{90} \text{ gets me there on time} \mid \ldots) = 0.70$
  $P(A_{120} \text{ gets me there on time} \mid \ldots) = 0.95$
  $P(A_{1440} \text{ gets me there on time} \mid \ldots) = 0.9999$
Which action to choose?
Depends on my preferences for missing flight vs. airport cuisine, etc.
Utility theory is used to represent and infer preferences
Decision theory = utility theory + probability theory
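A minimal sketch of how decision theory could combine these probabilities with preferences. The four probabilities are the ones on the slide; the utility values (`U_CATCH`, `U_MISS`, `WAIT_COST_PER_MIN`) and the linear waiting-cost model are hypothetical, chosen only to illustrate the idea of maximizing expected utility.

```python
# Probabilities of arriving on time for each departure lead time t (minutes),
# taken from the slide.
p_on_time = {25: 0.04, 90: 0.70, 120: 0.95, 1440: 0.9999}

# Hypothetical utilities (NOT from the slides), for illustration only.
U_CATCH = 100.0           # utility of catching the flight
U_MISS = 0.0              # utility of missing the flight
WAIT_COST_PER_MIN = 0.05  # disutility per minute spent leaving early


def expected_utility(t, p):
    """Expected utility of action A_t under the assumed utility model."""
    return p * U_CATCH + (1.0 - p) * U_MISS - WAIT_COST_PER_MIN * t


eu = {t: expected_utility(t, p) for t, p in p_on_time.items()}
best = max(eu, key=eu.get)  # action with the highest expected utility

for t in sorted(eu):
    print(f"A_{t}: EU = {eu[t]:.2f}")
print(f"Best action under these assumptions: A_{best}")
```

With a different waiting cost or a larger penalty for missing the flight, a different action may come out on top; the probabilities alone do not settle the choice.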
slide-7
SLIDE 7 Axioms of probability
For any propositions $A$, $B$
  1. $0 \le P(A) \le 1$
  2. $P(True) = 1$ and $P(False) = 0$
  3. $P(A \vee B) = P(A) + P(B) - P(A \wedge B)$
[Figure: Venn diagram of $A$ and $B$ inside $True$, overlapping in $A \wedge B$]
de Finetti (1931): an agent who bets according to probabilities that violate these axioms can be forced to bet so as to lose money regardless of outcome.
slide-8
SLIDE 8 Syntax
Similar to propositional logic: possible worlds defined by assignment of values to random variables.
Propositional or Boolean random variables
  e.g., $Cavity$ (do I have a cavity?)
Include propositional logic expressions
  e.g., $\neg Burglary \vee Earthquake$
Multivalued random variables
  e.g., $Weather$ is one of $\langle sunny, rain, cloudy, snow \rangle$
  Values must be exhaustive and mutually exclusive
Proposition constructed by assignment of a value:
  e.g., $Weather = sunny$; also $Cavity = true$ for clarity
slide-9
SLIDE 9 Syntax contd.
Prior or unconditional probabilities of propositions
  e.g., $P(Cavity) = 0.1$ and $P(Weather = sunny) = 0.72$
  correspond to belief prior to arrival of any (new) evidence
Probability distribution gives values for all possible assignments:
  $\mathbf{P}(Weather) = \langle 0.72, 0.1, 0.08, 0.1 \rangle$ (normalized, i.e., sums to 1)
Joint probability distribution for a set of variables gives values for each possible assignment to all the variables
  $\mathbf{P}(Weather, Cavity)$ = a $4 \times 2$ matrix of values:

  Weather =         sunny   rain   cloudy   snow
  Cavity = true
  Cavity = false
slide-10
SLIDE 10 Syntax contd.
Conditional or posterior probabilities
  e.g., $P(Cavity \mid Toothache) = 0.8$
  i.e., given that $Toothache$ is all I know
Notation for conditional distributions:
  $\mathbf{P}(Weather \mid Earthquake)$ = 2-element vector of 4-element vectors
If we know more, e.g., $Cavity$ is also given, then we have
  $P(Cavity \mid Toothache, Cavity) = 1$
Note: the less specific belief remains valid after more evidence arrives, but is not always useful
New evidence may be irrelevant, allowing simplification, e.g.,
  $P(Cavity \mid Toothache, 49ersWin) = P(Cavity \mid Toothache) = 0.8$
This kind of inference, sanctioned by domain knowledge, is crucial
slide-11
SLIDE 11 Conditional probability
Definition of conditional probability:
  $P(A \mid B) = \frac{P(A \wedge B)}{P(B)}$ if $P(B) \neq 0$
Product rule gives an alternative formulation:
  $P(A \wedge B) = P(A \mid B) P(B) = P(B \mid A) P(A)$
A general version holds for whole distributions, e.g.,
  $\mathbf{P}(Weather, Cavity) = \mathbf{P}(Weather \mid Cavity) \mathbf{P}(Cavity)$
  (View as a $4 \times 2$ set of equations, not matrix mult.)
Chain rule is derived by successive application of product rule:
  $\mathbf{P}(X_1, \ldots, X_n) = \mathbf{P}(X_1, \ldots, X_{n-1}) \, \mathbf{P}(X_n \mid X_1, \ldots, X_{n-1})$
  $= \mathbf{P}(X_1, \ldots, X_{n-2}) \, \mathbf{P}(X_{n-1} \mid X_1, \ldots, X_{n-2}) \, \mathbf{P}(X_n \mid X_1, \ldots, X_{n-1})$
  $= \ldots = \prod_{i=1}^{n} \mathbf{P}(X_i \mid X_1, \ldots, X_{i-1})$
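The chain rule can be checked numerically on a small example. The sketch below uses a hypothetical joint distribution over three Boolean variables (the numbers are illustrative and not from the slides) and confirms that the product of conditionals reproduces every joint entry.

```python
# Hypothetical joint distribution over three Boolean variables (X1, X2, X3).
# The numbers are illustrative only; they sum to 1.
joint = {
    (True, True, True): 0.10, (True, True, False): 0.05,
    (True, False, True): 0.15, (True, False, False): 0.20,
    (False, True, True): 0.05, (False, True, False): 0.10,
    (False, False, True): 0.05, (False, False, False): 0.30,
}


def marginal(prefix):
    """P(X1..Xk = prefix), obtained by summing out the remaining variables."""
    k = len(prefix)
    return sum(p for world, p in joint.items() if world[:k] == prefix)


def chain_rule(world):
    """Product over i of P(x_i | x_1, ..., x_{i-1})."""
    result = 1.0
    for i in range(1, len(world) + 1):
        # P(x_i | x_1..x_{i-1}) = P(x_1..x_i) / P(x_1..x_{i-1})
        result *= marginal(world[:i]) / marginal(world[:i - 1])
    return result


for world, p in joint.items():
    print(world, p, round(chain_rule(world), 10))
```

The product telescopes, so each world's chain-rule value equals its joint entry (up to floating-point rounding). The sketch assumes every prefix has nonzero probability, matching the condition $P(B) \neq 0$ in the definition.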
slide-12
SLIDE 12 Bayes' Rule
Product rule $P(A \wedge B) = P(A \mid B) P(B) = P(B \mid A) P(A)$
  $\Rightarrow$ Bayes' rule $P(A \mid B) = \frac{P(B \mid A) P(A)}{P(B)}$
Why is this useful???
For assessing diagnostic probability from causal probability:
  $P(Cause \mid Effect) = \frac{P(Effect \mid Cause) P(Cause)}{P(Effect)}$
E.g., let $M$ be meningitis, $S$ be stiff neck:
  $P(M \mid S) = \frac{P(S \mid M) P(M)}{P(S)} = \frac{0.8 \times 0.0001}{0.1} = 0.0008$
Note: posterior probability of meningitis still very small!
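The meningitis calculation can be reproduced in a few lines. This is a minimal sketch; the function name `bayes` and its parameter names are illustrative, while the three input probabilities are the ones given on the slide.

```python
def bayes(p_effect_given_cause, p_cause, p_effect):
    """Bayes' rule: P(cause | effect) = P(effect | cause) P(cause) / P(effect)."""
    if p_effect <= 0:
        raise ValueError("P(effect) must be positive")
    return p_effect_given_cause * p_cause / p_effect


# Values from the slide: P(S|M) = 0.8, P(M) = 0.0001, P(S) = 0.1
p_m_given_s = bayes(p_effect_given_cause=0.8, p_cause=0.0001, p_effect=0.1)
print(f"P(M | S) = {p_m_given_s:.4f}")
```

The causal direction $P(S \mid M)$ is high, but the tiny prior $P(M)$ keeps the diagnostic probability small.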
slide-13
SLIDE 13 Normalization
Suppose we wish to compute a posterior distribution over $A$ given $B = b$, and suppose $A$ has possible values $a_1 \ldots a_m$
We can apply Bayes' rule for each value of $A$:
  $P(A = a_1 \mid B = b) = P(B = b \mid A = a_1) P(A = a_1) / P(B = b)$
  $\ldots$
  $P(A = a_m \mid B = b) = P(B = b \mid A = a_m) P(A = a_m) / P(B = b)$
Adding these up, and noting that $\sum_i P(A = a_i \mid B = b) = 1$:
  $1 / P(B = b) = 1 / \sum_i P(B = b \mid A = a_i) P(A = a_i)$
This is the normalization factor, constant w.r.t. $i$, denoted $\alpha$:
  $\mathbf{P}(A \mid B = b) = \alpha \, \mathbf{P}(B = b \mid A) \mathbf{P}(A)$
Typically compute an unnormalized distribution, normalize at end
  e.g., suppose $\mathbf{P}(B = b \mid A) \mathbf{P}(A) = \langle 0.4, 0.2, 0.2 \rangle$
  then $\mathbf{P}(A \mid B = b) = \alpha \langle 0.4, 0.2, 0.2 \rangle = \frac{\langle 0.4, 0.2, 0.2 \rangle}{0.4 + 0.2 + 0.2} = \langle 0.5, 0.25, 0.25 \rangle$
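A minimal sketch of the normalize-at-the-end step, applied to the unnormalized vector from the slide. The helper name `normalize` is an illustrative choice.

```python
def normalize(weights):
    """Scale non-negative weights so they sum to 1 (alpha = 1 / sum)."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    alpha = 1.0 / total
    return [alpha * w for w in weights]


# Unnormalized P(B=b | A) P(A) from the slide
unnormalized = [0.4, 0.2, 0.2]
posterior = normalize(unnormalized)
print([round(p, 4) for p in posterior])
```

Note that $P(B = b)$ is never needed explicitly: it is recovered as the sum of the unnormalized entries.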
slide-14
SLIDE 14 Conditioning
Introducing a variable as an extra condition:
  $P(X \mid Y) = \sum_z P(X \mid Y, Z = z) P(Z = z \mid Y)$
Intuition: often easier to assess each specific circumstance, e.g.,
  $P(RunOver \mid Cross)$
  $= P(RunOver \mid Cross, Light = green) P(Light = green \mid Cross)$
  $+ P(RunOver \mid Cross, Light = yellow) P(Light = yellow \mid Cross)$
  $+ P(RunOver \mid Cross, Light = red) P(Light = red \mid Cross)$
When $Y$ is absent, we have summing out or marginalization:
  $P(X) = \sum_z P(X \mid Z = z) P(Z = z) = \sum_z P(X, Z = z)$
In general, given a joint distribution over a set of variables, the distribution over any subset (called a marginal distribution for historical reasons) can be calculated by summing out the other variables.
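The conditioning sum for the street-crossing example can be sketched as follows. The slide gives no numbers for this example, so every probability below is hypothetical and serves only to show the shape of the computation.

```python
# Hypothetical values (NOT from the slides), for illustration only.
# P(Light = l | Cross): how likely each light state is when I cross.
p_light_given_cross = {"green": 0.7, "yellow": 0.2, "red": 0.1}

# P(RunOver | Cross, Light = l): risk in each specific circumstance.
p_runover_given_cross_light = {"green": 0.001, "yellow": 0.01, "red": 0.1}

# Conditioning: P(RunOver | Cross) = sum over l of
#   P(RunOver | Cross, Light = l) * P(Light = l | Cross)
p_runover_given_cross = sum(
    p_runover_given_cross_light[l] * p_light_given_cross[l]
    for l in p_light_given_cross
)
print(f"P(RunOver | Cross) = {p_runover_given_cross:.4f}")
```

The weights $P(Light = l \mid Cross)$ must sum to 1 over the light states, since the values of $Light$ are exhaustive and mutually exclusive.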
slide-15
SLIDE 15 Full joint distributions
A complete probability model specifies every entry in the joint distribution for all the variables $\mathbf{X} = X_1, \ldots, X_n$
I.e., a probability for each possible world $X_1 = x_1, \ldots, X_n = x_n$
(Cf. complete theories in logic.)
E.g., suppose $Toothache$ and $Cavity$ are the random variables:

                    Toothache = true   Toothache = false
  Cavity = true          0.04               0.06
  Cavity = false         0.01               0.89

Possible worlds are mutually exclusive $\Rightarrow P(w_1 \wedge w_2) = 0$
Possible worlds are exhaustive $\Rightarrow w_1 \vee \cdots \vee w_n$ is $True$
  hence $\sum_i P(w_i) = 1$
slide-16
SLIDE 16 Full joint distributions contd.
1) For any proposition $\phi$ defined on the random variables, $\phi(w_i)$ is true or false
2) $\phi$ is equivalent to the disjunction of $w_i$s where $\phi(w_i)$ is true
Hence $P(\phi) = \sum_{\{w_i : \, \phi(w_i)\}} P(w_i)$
I.e., the unconditional probability of any proposition is computable as the sum of entries from the full joint distribution
Conditional probabilities can be computed in the same way as a ratio:
  $P(\phi \mid \xi) = \frac{P(\phi \wedge \xi)}{P(\xi)}$
E.g.,
  $P(Cavity \mid Toothache) = \frac{P(Cavity \wedge Toothache)}{P(Toothache)} = \frac{0.04}{0.04 + 0.01} = 0.8$
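A minimal sketch of both computations over the Toothache/Cavity joint table given in the slides. Propositions are represented as Python predicates over a world `(cavity, toothache)`; the helper names `prob` and `cond_prob` are illustrative.

```python
# Full joint distribution from the slides; each key is a possible world
# (cavity, toothache).
joint = {
    (True, True): 0.04,
    (True, False): 0.06,
    (False, True): 0.01,
    (False, False): 0.89,
}


def prob(phi):
    """P(phi): sum the entries of the worlds in which phi holds."""
    return sum(p for world, p in joint.items() if phi(*world))


def cond_prob(phi, xi):
    """P(phi | xi) = P(phi and xi) / P(xi); requires P(xi) > 0."""
    p_xi = prob(xi)
    if p_xi <= 0:
        raise ValueError("conditioning event has zero probability")
    return prob(lambda c, t: phi(c, t) and xi(c, t)) / p_xi


p_cavity = prob(lambda c, t: c)
p_cavity_or_toothache = prob(lambda c, t: c or t)
p_cavity_given_toothache = cond_prob(lambda c, t: c, lambda c, t: t)

print(f"P(Cavity) = {p_cavity:.2f}")
print(f"P(Cavity or Toothache) = {p_cavity_or_toothache:.2f}")
print(f"P(Cavity | Toothache) = {p_cavity_given_toothache:.2f}")
```

The disjunction also agrees with the third axiom: $P(Cavity) + P(Toothache) - P(Cavity \wedge Toothache) = 0.10 + 0.05 - 0.04 = 0.11$.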
slide-17
SLIDE 17 Inference from joint distributions
Typically, we are interested in the posterior joint distribution of the query variables $\mathbf{Y}$ given specific values $\mathbf{e}$ for the evidence variables $\mathbf{E}$
Let the hidden variables be $\mathbf{H} = \mathbf{X} - \mathbf{Y} - \mathbf{E}$
Then the required summation of joint entries is done by summing out the hidden variables:
  $\mathbf{P}(\mathbf{Y} \mid \mathbf{E} = \mathbf{e}) = \alpha \, \mathbf{P}(\mathbf{Y}, \mathbf{E} = \mathbf{e}) = \alpha \sum_{\mathbf{h}} \mathbf{P}(\mathbf{Y}, \mathbf{E} = \mathbf{e}, \mathbf{H} = \mathbf{h})$
The terms in the summation are joint entries because $\mathbf{Y}$, $\mathbf{E}$, and $\mathbf{H}$ together exhaust the set of random variables
Obvious problems:
  1) Worst-case time complexity $O(d^n)$ where $d$ is the largest arity
  2) Space complexity $O(d^n)$ to store the joint distribution
  3) How to find the numbers for $O(d^n)$ entries???
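A minimal sketch of inference by summing out hidden variables. It builds a three-variable joint from the Toothache/Cavity table and the $\mathbf{P}(Weather)$ vector given in the slides, under the added assumption (not stated in the slides) that $Weather$ is independent of the other two. The function name `enumerate_query` is illustrative.

```python
from itertools import product

VARIABLES = ("Cavity", "Toothache", "Weather")

# P(Cavity, Toothache) and P(Weather) as given in the slides.
dental = {(True, True): 0.04, (True, False): 0.06,
          (False, True): 0.01, (False, False): 0.89}
weather = {"sunny": 0.72, "rain": 0.1, "cloudy": 0.08, "snow": 0.1}

# Assumption for illustration: Weather is independent of Cavity and
# Toothache, so each joint entry is a simple product.
joint = {(c, t, w): dental[(c, t)] * weather[w]
         for (c, t), w in product(dental, weather)}


def enumerate_query(query_var, evidence):
    """Return P(query_var | evidence) as a dict, summing out hidden variables."""
    q = VARIABLES.index(query_var)
    unnormalized = {}
    for world, p in joint.items():
        # Keep only worlds consistent with the evidence E = e.
        if all(world[VARIABLES.index(v)] == val for v, val in evidence.items()):
            unnormalized[world[q]] = unnormalized.get(world[q], 0.0) + p
    total = sum(unnormalized.values())
    if total <= 0:
        raise ValueError("evidence has zero probability")
    alpha = 1.0 / total  # normalization factor
    return {value: alpha * p for value, p in unnormalized.items()}


posterior = enumerate_query("Cavity", {"Toothache": True})
print({value: round(p, 4) for value, p in posterior.items()})
```

The loop touches every one of the $d^n$ joint entries (16 here), which is exactly the worst-case cost the slide warns about.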