

Dealing with Uncertainty

Paolo Turrini
Department of Computing, Imperial College London
Introduction to Artificial Intelligence (2nd Part)

Uncertainty and Probabilities

The main reference

Stuart Russell and Peter Norvig, Artificial Intelligence: A Modern Approach, Chapter 13.

Outline

Uncertainty
Probability
Probability and logic
Inference

Uncertain outcomes

I have a lecture on Thursday in the early morning and an alarm clock set for even earlier.
Let action S_t = snooze the alarm clock t times.
Will S_t get me there on time?
Problems:

1 partial observability (planned engineering works, announced strikes, etc.)
2 noisy sensors (BBC reports, Google Maps)
3 uncertainty in action outcomes (my phone might die, etc.)
4 immense complexity of modelling and predicting traffic

Uncertainty

A binary true-false approach either:

1 might lead to conclusions that are too strong:
"S_25 will not get me there on time"

2 or too weak:
"S_25 will not get me there on time unless there's no delay on the District Line and it doesn't rain and I haven't forgotten the keys at home etc."

Methods for handling uncertainty: defaults

Default logic handles "normal circumstances":
The Tube normally runs
Announced strikes normally happen
Issues:
What assumptions are reasonable?
How to handle contradiction? (e.g., will the Tube run?)

Also, fuzzy logic handles degrees of truth; arguably, it does not handle uncertainty,
e.g., Asleep is true to degree 0.2

Rules with fudge factors

e.g., S_25 →0.4 AtLectureOnTime
But...
ReadingSteinbeck →0.7 FallAsleep
FallAsleep →0.99 DarkOutside
Problems with combination, e.g., ReadingSteinbeck →∼0.7 DarkOutside
Causal connections?

Probabilities

P(S_25 gets me there on time | ...) = 0.2
Given the available evidence, S_25 will get me there on time with probability 0.2.
Probabilistic assertions summarize effects of:
laziness: failure to enumerate exceptions, qualifications, etc.
ignorance: lack of relevant facts, initial conditions, etc.
Subjective/Bayesian view: probabilities relate propositions to one's own state of knowledge,
e.g., P(S_25 gets me there on time | no reported accidents) = 0.3

Probability

These are not claims of a "probabilistic tendency" in the current situation (but might be learned from past experience of similar situations).
Probabilities of propositions change with new evidence:
e.g., P(S_25 | no reported accidents, 5 a.m.) = 0.8
Analogous to logical entailment status KB ⊨ α, not truth.

If you snooze you lose?

Suppose I believe the following:
P(S_0 gets me there on time | ...) = 0.99
P(S_1 gets me there on time | ...) = 0.90
P(S_10 gets me there on time | ...) = 0.6
P(S_25 gets me there on time | ...) = 0.1
Which action should I choose?
IT DEPENDS on my preferences,
e.g., missing class vs. sleeping:
S_0: ages in the Huxley building, therefore feeling miserable.

Chances and Utility

Utility theory is used to represent and infer preferences.
Decision theory = utility theory + probability theory

Probability basics

Begin with a set Ω, the sample space,
e.g., the 6 possible rolls of a die.
w ∈ Ω is a sample point/possible world/atomic event.

A probability space or probability model is a sample space Ω with an assignment P(w) for every w ∈ Ω s.t.
0 ≤ P(w) ≤ 1
Σ_w P(w) = 1
e.g., P(1) = P(2) = P(3) = P(4) = P(5) = P(6) = 1/6.
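
To make this concrete, here is a minimal Python sketch (an illustration, not part of the slides) of the fair-die probability model, checking the two conditions directly:

    from fractions import Fraction

    omega = [1, 2, 3, 4, 5, 6]              # sample space: rolls of a fair die
    P = {w: Fraction(1, 6) for w in omega}

    # The two conditions on a probability model:
    assert all(0 <= P[w] <= 1 for w in omega)   # 0 <= P(w) <= 1
    assert sum(P.values()) == 1                 # Sigma_w P(w) = 1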

Events

An event A is any subset of Ω:
P(A) = Σ_{w∈A} P(w)
e.g., P(die roll < 4) = P(1) + P(2) + P(3) = 1/6 + 1/6 + 1/6 = 1/2
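
A small Python sketch (illustrative, assuming the fair-die model above) of computing an event's probability by summing its sample points:

    from fractions import Fraction

    P = {w: Fraction(1, 6) for w in [1, 2, 3, 4, 5, 6]}  # fair die

    def prob(event):
        # P(A) = Sigma_{w in A} P(w)
        return sum(P[w] for w in event)

    roll_below_4 = {w for w in P if w < 4}   # the event {1, 2, 3}
    print(prob(roll_below_4))                # 1/2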

Random variables

A random variable is a function from sample points to some range, e.g., R, [0, 1], {true, false}, ...
e.g., Odd(1) = true.
P induces a probability distribution for any random variable X:
P(X = x_i) = Σ_{w : X(w) = x_i} P(w)
e.g., P(Odd = true) = P(1) + P(3) + P(5) = 1/6 + 1/6 + 1/6 = 1/2
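
As a sketch (again on the assumed fair-die model), the induced distribution of a random variable can be computed by summing the weights of its pre-images:

    from collections import defaultdict
    from fractions import Fraction

    P = {w: Fraction(1, 6) for w in [1, 2, 3, 4, 5, 6]}  # fair die

    def distribution(X):
        # P(X = x_i) = Sigma_{w : X(w) = x_i} P(w)
        d = defaultdict(Fraction)
        for w, p in P.items():
            d[X(w)] += p
        return dict(d)

    odd = lambda w: w % 2 == 1   # the random variable Odd
    print(distribution(odd))     # {True: Fraction(1, 2), False: Fraction(1, 2)}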

Propositions

A proposition can be seen as the event (set of sample points) where the proposition is true.
Given Boolean random variables A and B:
event a = set of sample points where A(w) = true
event ¬a = set of sample points where A(w) = false
event a ∧ b = points where A(w) = true and B(w) = true

Events and Propositional Logic

Proposition = disjunction of the atomic events in which it is true,
e.g., (a ∨ b) ≡ (¬a ∧ b) ∨ (a ∧ ¬b) ∨ (a ∧ b)
⇒ P(a ∨ b) = P(¬a ∧ b) + P(a ∧ ¬b) + P(a ∧ b)
            = P(¬a ∧ b) + P(a ∧ ¬b) + P(a ∧ b) + P(a ∧ b) − P(a ∧ b)
            = P(a) + P(b) − P(a ∧ b)
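
A quick Python check of this inclusion-exclusion identity on the die, with two illustrative propositions that are not from the slides (a = "roll is odd", b = "roll < 4"):

    from fractions import Fraction

    P = {w: Fraction(1, 6) for w in [1, 2, 3, 4, 5, 6]}  # fair die
    prob = lambda event: sum(P[w] for w in event)

    a = {w for w in P if w % 2 == 1}   # "roll is odd" -> {1, 3, 5}
    b = {w for w in P if w < 4}        # "roll < 4"    -> {1, 2, 3}

    # P(a or b) = P(a) + P(b) - P(a and b)   (both sides equal 2/3 here)
    assert prob(a | b) == prob(a) + prob(b) - prob(a & b)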

Probabilities are logical

Theorem (De Finetti, 1931): An agent who bets according to "illogical" probabilities can be tricked into a bet that loses money regardless of the outcome.

Syntax for propositions

Propositional, e.g., Cavity (do I have a cavity?)
Cavity = true is a proposition, also written cavity
Discrete, e.g., Weather is one of sunny, rain, cloudy, snow
Weather = rain is a proposition
Important: the values must be exhaustive and mutually exclusive
Continuous, e.g., Temp = 21.6; Temp < 22.0

Probabilities

Unconditional probabilities
Conditional probabilities

Prior probability

Prior/unconditional probabilities of propositions,
e.g., P(Cavity = true) = 0.1 and P(Weather = sunny) = 0.72,
correspond to belief prior to the arrival of any (new) evidence.
Probability distribution gives values for all possible assignments:
P(Weather) = ⟨0.72, 0.1, 0.08, 0.1⟩ (normalized, i.e., sums to 1)

Prior probability cont.

Joint probability distribution: probability of every sample point.
P(Weather, Cavity) = a 4 × 2 matrix of values:

Weather =        sunny   rain   cloudy   snow
Cavity = true    0.144   0.02   0.016    0.02
Cavity = false   0.576   0.08   0.064    0.08

Every question about a domain can be answered by the joint distribution, because every event is a sum of sample points.
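
A Python sketch of this joint as a table keyed by sample points (the representation is illustrative, not prescribed by the slides); any question is answered by summation:

    # P(Weather, Cavity), keyed by sample point (weather value, cavity value)
    joint = {
        ("sunny", True): 0.144, ("rain", True): 0.02,
        ("cloudy", True): 0.016, ("snow", True): 0.02,
        ("sunny", False): 0.576, ("rain", False): 0.08,
        ("cloudy", False): 0.064, ("snow", False): 0.08,
    }

    # e.g., the marginal P(Weather = sunny), summing over Cavity:
    p_sunny = sum(p for (w, _), p in joint.items() if w == "sunny")
    print(round(p_sunny, 2))   # 0.72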

Conditional probability

Conditional or posterior probabilities,
e.g., P(cavity | toothache) = 0.8,
i.e., given that toothache is all I know.
NOT "if toothache then 80% chance of cavity"
(Notation for conditional distributions: P(Cavity | Toothache) = 2-element vector of 2-element vectors)

If we know more, e.g., cavity is also given, then we have
P(cavity | toothache, cavity) = 1
Note: the less specific belief remains valid after more evidence arrives, but it is not always useful.
New evidence may be irrelevant, allowing simplification,
e.g., P(cavity | toothache) = P(cavity | toothache, Cristiano Ronaldo scores) = 0.8
This kind of inference is crucial!

Definition of conditional probability:
P(a | b) = P(a ∧ b) / P(b)   if P(b) ≠ 0
Product rule gives an alternative formulation:
P(a ∧ b) = P(a | b) P(b) = P(b | a) P(a)
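
A minimal Python sketch of the definition, computing P(a | b) from the joint table above (the predicates are illustrative):

    joint = {  # P(Weather, Cavity), as in the table above
        ("sunny", True): 0.144, ("rain", True): 0.02,
        ("cloudy", True): 0.016, ("snow", True): 0.02,
        ("sunny", False): 0.576, ("rain", False): 0.08,
        ("cloudy", False): 0.064, ("snow", False): 0.08,
    }

    def cond(a, b):
        # P(a | b) = P(a and b) / P(b), defined only when P(b) != 0
        p_b = sum(p for w, p in joint.items() if b(w))
        p_ab = sum(p for w, p in joint.items() if a(w) and b(w))
        if p_b == 0:
            raise ValueError("P(b) = 0: conditional probability undefined")
        return p_ab / p_b

    # e.g., P(Weather = rain | Cavity = true):
    print(cond(lambda s: s[0] == "rain", lambda s: s[1]))   # 0.02 / 0.2 ≈ 0.1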

A general version holds for whole distributions,
e.g., P(Weather, Cavity) = P(Weather | Cavity) P(Cavity)
(View as a 4 × 2 set of equations, not matrix multiplication.)
Chain rule is derived by successive application of the product rule:
P(X_1, ..., X_n) = P(X_1, ..., X_{n−1}) P(X_n | X_1, ..., X_{n−1})
                 = P(X_1, ..., X_{n−2}) P(X_{n−1} | X_1, ..., X_{n−2}) P(X_n | X_1, ..., X_{n−1})
                 = ...
                 = Π_{i=1}^{n} P(X_i | X_1, ..., X_{i−1})
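
A small Python check of the chain-rule factorization on the three-variable Toothache/Catch/Cavity joint used on the next slide (the verification loop is illustrative):

    import math
    from itertools import product

    joint = {  # keys: (toothache, catch, cavity)
        (True,  True,  True):  0.108, (True,  False, True):  0.012,
        (True,  True,  False): 0.016, (True,  False, False): 0.064,
        (False, True,  True):  0.072, (False, False, True):  0.008,
        (False, True,  False): 0.144, (False, False, False): 0.576,
    }

    def P(pred):
        return sum(p for w, p in joint.items() if pred(w))

    # P(X1, X2, X3) = P(X1) * P(X2 | X1) * P(X3 | X1, X2)
    for t, c, cav in product([True, False], repeat=3):
        p1 = P(lambda w: w[0] == t)                             # P(X1 = t)
        p2 = P(lambda w: w[:2] == (t, c)) / p1                  # P(X2 = c | X1 = t)
        p3 = joint[(t, c, cav)] / P(lambda w: w[:2] == (t, c))  # P(X3 | X1, X2)
        assert math.isclose(p1 * p2 * p3, joint[(t, c, cav)])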

Inference by enumeration

Start with the joint distribution (the Toothache, Catch, Cavity table):

                 toothache            ¬toothache
             catch    ¬catch      catch    ¬catch
cavity       0.108    0.012       0.072    0.008
¬cavity      0.016    0.064       0.144    0.576

For any proposition ϕ, sum the atomic events where it is true:
P(ϕ) = Σ_{w : w ⊨ ϕ} P(w)

P(toothache) = 0.108 + 0.012 + 0.016 + 0.064 = 0.2
P(cavity ∨ toothache) = 0.108 + 0.012 + 0.072 + 0.008 + 0.016 + 0.064 = 0.28

Can also compute conditional probabilities:
P(¬cavity | toothache) = P(¬cavity ∧ toothache) / P(toothache)
                       = (0.016 + 0.064) / (0.108 + 0.012 + 0.016 + 0.064) = 0.4
P(cavity | toothache) = P(cavity ∧ toothache) / P(toothache)
                      = (0.108 + 0.012) / (0.108 + 0.012 + 0.016 + 0.064) = 0.6

Normalization

The denominator can be viewed as a normalization constant α:
P(Cavity | toothache) = α P(Cavity, toothache)
  = α [P(Cavity, toothache, catch) + P(Cavity, toothache, ¬catch)]
  = α [⟨0.108, 0.016⟩ + ⟨0.012, 0.064⟩]
  = α ⟨0.12, 0.08⟩ = ⟨0.6, 0.4⟩
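
A Python sketch of this computation: sum out Catch with toothache fixed, then normalize (the representation is illustrative):

    joint = {  # keys: (toothache, catch, cavity)
        (True,  True,  True):  0.108, (True,  False, True):  0.012,
        (True,  True,  False): 0.016, (True,  False, False): 0.064,
        (False, True,  True):  0.072, (False, False, True):  0.008,
        (False, True,  False): 0.144, (False, False, False): 0.576,
    }

    # Unnormalized P(Cavity, toothache): fix toothache, sum out Catch
    unnorm = {cav: sum(p for (t, c, cv), p in joint.items() if t and cv == cav)
              for cav in (True, False)}      # {True: 0.12, False: 0.08}

    alpha = 1 / sum(unnorm.values())         # alpha = 1 / P(toothache) = 5
    posterior = {cav: alpha * p for cav, p in unnorm.items()}
    print(posterior)                         # approx {True: 0.6, False: 0.4}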

Inference by enumeration, contd.

Let X be all the variables.
Typically, we want the posterior joint distribution of the query variables Y given specific values e for the evidence variables E.
Let the hidden variables be H = X − Y − E.
Then the required summation of joint entries is done by summing out the hidden variables:
P(Y | E = e) = α P(Y, E = e) = α Σ_h P(Y, E = e, H = h)
The terms in the summation are joint entries because Y, E, and H together exhaust the set of random variables.
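
A sketch of the general scheme in Python; the function name, the joint representation, and the variable ordering are assumptions made for illustration:

    def enumerate_query(joint, var_names, query_var, evidence):
        # Posterior distribution of query_var given evidence, by summing
        # out the hidden variables: P(Y | E=e) = alpha * Sigma_h P(Y, e, h)
        qi = var_names.index(query_var)
        unnorm = {}
        for assignment, p in joint.items():
            world = dict(zip(var_names, assignment))
            if all(world[v] == val for v, val in evidence.items()):
                y = assignment[qi]
                unnorm[y] = unnorm.get(y, 0.0) + p   # hidden vars summed here
        alpha = 1 / sum(unnorm.values())             # normalization constant
        return {y: alpha * p for y, p in unnorm.items()}

    joint = {  # keys: (toothache, catch, cavity)
        (True,  True,  True):  0.108, (True,  False, True):  0.012,
        (True,  True,  False): 0.016, (True,  False, False): 0.064,
        (False, True,  True):  0.072, (False, False, True):  0.008,
        (False, True,  False): 0.144, (False, False, False): 0.576,
    }
    print(enumerate_query(joint, ["Toothache", "Catch", "Cavity"],
                          "Cavity", {"Toothache": True}))
    # approx {True: 0.6, False: 0.4}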

Obvious problems with n variables:

1 Worst-case time complexity O(d^n), where d is the largest arity
2 Space complexity O(d^n) to store the joint distribution
3 How to find the numbers for the O(d^n) entries?

Summary

Probability is a rigorous formalism for uncertain knowledge.
Joint probability distribution specifies the probability of every atomic event.
Queries can be answered by summing over atomic events.
For nontrivial domains, we must find a way to reduce the size of the joint.
Independence and conditional independence provide the tools.

What's next?

Bayes' rule
Conditional and unconditional independence
(hopefully) Bayesian Networks

Appendix: Independence

A and B are independent iff
P(A | B) = P(A)  or  P(B | A) = P(B)  or  P(A, B) = P(A) P(B)

e.g., P(cavity | Cristiano Ronaldo scores) = P(cavity)
P(Cristiano Ronaldo scores | cavity) = P(Cristiano Ronaldo scores | ¬cavity) = P(Cristiano Ronaldo scores)

P(Toothache, Catch, Cavity, Weather) = P(Toothache, Catch, Cavity) P(Weather)
32 entries reduced to 12; for n independent biased coins, 2^n → n
Absolute independence is powerful but rare.
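
A final Python sketch of the saving: with Weather independent of the dental variables, any of the 32 joint entries can be reconstructed from 8 + 4 = 12 stored numbers (the representation is illustrative):

    p_weather = {"sunny": 0.72, "rain": 0.1, "cloudy": 0.08, "snow": 0.1}
    p_dental = {  # P(Toothache, Catch, Cavity), keys: (toothache, catch, cavity)
        (True,  True,  True):  0.108, (True,  False, True):  0.012,
        (True,  True,  False): 0.016, (True,  False, False): 0.064,
        (False, True,  True):  0.072, (False, False, True):  0.008,
        (False, True,  False): 0.144, (False, False, False): 0.576,
    }

    # Any of the 32 joint entries, reconstructed on demand by the product:
    def joint(toothache, catch, cavity, weather):
        return p_dental[(toothache, catch, cavity)] * p_weather[weather]

    print(joint(True, True, True, "sunny"))   # 0.108 * 0.72 = 0.07776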