Intelligente Systeme WS 18/19 – Dr. Benjamin Guthier, Professur für Bildverarbeitung



SLIDE 1

Intelligente Systeme

WS 18/19

  • Dr. Benjamin Guthier

Professur für Bildverarbeitung

Intelligente Systeme – Dr. Benjamin Guthier

SLIDE 2

  • 2. PROBABILISTIC MODELS

SLIDE 3

| 2. Probabilistic Models

Need for Probabilistic Reasoning

  • Human reasoning is based on uncertain evidence
  • In classical logic, conclusions are either true or false

– Does not account for uncertainty
– Cannot handle conflicting evidence

  • Probability theory can model uncertainty in the real world

SLIDE 4

  • Recommended reading:

– S. Russell and P. Norvig, Artificial Intelligence: A Modern Approach, Chapter 13 “Quantifying Uncertainty”

PROBABILITY THEORY

SLIDE 5

Random Variable

  • A random variable takes on values with a certain probability
  • Boolean random variables: Cavity (do I have a cavity?)
  • Discrete random variables: Weather

– one of <sunny, rainy, cloudy, snow>
– Values must be exhaustive and mutually exclusive

  • Construct propositions by assigning a value to a variable

– Weather = sunny
– Cavity = false
– Complex propositions: Weather = sunny ∨ Cavity = false

SLIDE 6

Events

  • Event: Complete specification of the state of the world about which the agent is uncertain
  • If the world consists of only two Boolean variables Cavity and Toothache, then there are 4 distinct events:

– Cavity = true ∧ Toothache = false (short: cavity ∧ ¬toothache)
– Cavity = false ∧ Toothache = false
– Cavity = true ∧ Toothache = true
– Cavity = false ∧ Toothache = true

SLIDE 7

Prior Probability

  • Prior probability of an event: the probability prior to the arrival of any new evidence

– P(Cavity = true) = 0.2
– P(Weather = sunny) = 0.72

  • Probability distribution: gives values for all possible assignments:

– 𝐏(Weather) = <0.72, 0.1, 0.08, 0.1> (for sunny, rainy, cloudy, snow)
– Must sum to 1

  • Joint probability distribution: gives the probabilities of all combinations of events

– 𝐏(Weather, Cavity) is a 2 × 4 matrix of values:

                 sunny   rainy   cloudy   snow
Cavity = true    0.144   0.02    0.016    0.02
Cavity = false   0.576   0.08    0.064    0.08
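The joint table above already contains everything needed to recover the two prior distributions: summing a column gives P(Weather = w), summing a row gives P(Cavity = c). A minimal Python sketch of this marginalization (the nested-dict layout is just one convenient choice, not from the slides):

```python
# Joint distribution P(Weather, Cavity) from the slide, as nested dicts.
joint = {
    "sunny":  {True: 0.144, False: 0.576},
    "rainy":  {True: 0.02,  False: 0.08},
    "cloudy": {True: 0.016, False: 0.064},
    "snow":   {True: 0.02,  False: 0.08},
}

# Marginalize: P(Weather = w) sums over Cavity; P(Cavity = c) sums over Weather.
p_weather = {w: sum(joint[w].values()) for w in joint}
p_cavity = {c: sum(joint[w][c] for w in joint) for c in (True, False)}

print(p_weather["sunny"])        # 0.72
print(round(p_cavity[True], 3))  # 0.2
```

Both results match the priors quoted above, and the marginals each sum to 1 as required.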

SLIDE 8

A Note on Notation

  • Logical operators: ∧ (and), ∨ (or), ¬ (not)
  • Random variables uppercase: X, Weather, Cavity
  • Values of random variables lowercase: x, sunny, cavity, true
  • Lists of values in boldface or with <…>

– List of probabilities: 𝐏(Weather) = <0.72, 0.1, 0.08, 0.1>
– List of variables: 𝐗 = <X1, X2, …, Xn>
– Corresponding list of values: 𝐱 = <true, sunny, …, cavity>
– List of equations: 𝐏(X, Y) = 𝐏(X) 𝐏(Y) (one for each combination of values of X and Y)

SLIDE 9

Conditional Probability

  • Also called posterior probability

– Probability after more information becomes available
– Or: “Probability conditioned on a prior event”

  • P(Cavity = true | Toothache = true) = 0.8

– Probability of having a cavity, given that one has a toothache
– In this case, higher than the prior probability P(Cavity = true) = 0.2

  • The posterior probability is often the answer in probabilistic reasoning

– Chance of rain, given a cloudy sky and high pressure
– Probability of someone voting for a party, given age, gender, location

SLIDE 10

Conditional Probability (2)

  • Definition of conditional probability:

P(A|B) = P(A ∧ B) / P(B)

  • Or as an alternative formulation (product rule):

P(A ∧ B) = P(A|B) P(B)
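As a quick sanity check, the definition and the product rule can be applied to the joint Weather/Cavity table from slide 7. A small Python sketch (the variable names are my own):

```python
# Definition: P(A | B) = P(A and B) / P(B).
# Here: P(Cavity = true | Weather = sunny), using the joint table values
# P(sunny and cavity) = 0.144 and P(sunny) = 0.72.
p_sunny_and_cavity = 0.144
p_sunny = 0.72

p_cavity_given_sunny = p_sunny_and_cavity / p_sunny
print(round(p_cavity_given_sunny, 3))  # 0.2

# Product rule: multiplying back recovers the joint probability.
print(round(p_cavity_given_sunny * p_sunny, 3))  # 0.144
```

Note that the result equals the prior P(Cavity = true) = 0.2; the significance of that coincidence is picked up on the independence slide later.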

SLIDE 11

Conditional Probability (Example)

  • A mother comes home and finds a broken flower vase (b). She suspects that her child did it (c).
  • The (incredibly smart) child argues: “But mom, the probability that on any given day, a child breaks a flower vase is only P(c, b) = 1/3650, i.e., it only happens like once every ten years! It’s pretty unlikely that it was me.”
  • The mother retorts: “Vases don’t just break. The probability of a vase breaking on any given day is only P(b) = 0.034%.”
  • Both calculate the probability of it being the child’s fault, given that the vase is already broken:

P(c|b) = P(c, b) / P(b) = (1/3650) / 0.00034 ≈ 80%
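The arithmetic of the vase example in a few lines of Python, using the numbers from the slide:

```python
# Vase example: P(child broke it and vase broken) per day, and P(vase broken).
p_child_and_broken = 1 / 3650   # "once every ten years"
p_broken = 0.00034              # 0.034% of days a vase breaks at all

# Conditional probability: given a broken vase, it was very likely the child.
p_child_given_broken = p_child_and_broken / p_broken
print(round(p_child_given_broken, 2))  # 0.81, i.e. about 80%
```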

SLIDE 12

Conditional Probability (Example)

P(c, b) = very small; P(b) = also small ⇒ P(c|b) = large (at most P(true) = 1)

SLIDE 13

Numerical Random Variables

  • Sometimes we need to model numerical values
  • If the values are discrete, nothing changes
  • Number of leaves on a clover: C ∈ {3, 4, 5}

– 𝐏(C) = <0.999, 0.0008, 0.0002>

  • Fair die: D ∈ {1, 2, 3, 4, 5, 6}

– 𝐏(D) = <1/6, 1/6, 1/6, 1/6, 1/6, 1/6>

SLIDE 14

Expected Value

  • Expected value (mean, average) of a numerical random variable:

E[X] = Σᵢ xᵢ P(X = xᵢ)

– xᵢ are the values the variable can take on

  • Clover example:

– E[C] = 3 ∗ 0.999 + 4 ∗ 0.0008 + 5 ∗ 0.0002 ≈ 3

  • Fair die:

– E[D] = Σᵢ xᵢ ∗ 1/6 = (1/6)(1 + 2 + 3 + 4 + 5 + 6) = 3.5
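The formula E[X] = Σᵢ xᵢ P(X = xᵢ) translates directly into code. A small sketch covering both slide examples (the dict representation of a distribution is my own choice):

```python
# Expected value E[X] = sum_i x_i * P(X = x_i).
def expected_value(dist):
    """dist maps each value x_i to its probability P(X = x_i)."""
    return sum(x * p for x, p in dist.items())

clover = {3: 0.999, 4: 0.0008, 5: 0.0002}   # leaves on a clover
die = {x: 1 / 6 for x in range(1, 7)}       # fair six-sided die

print(round(expected_value(clover), 4))  # 3.0012, i.e. ≈ 3
print(round(expected_value(die), 4))     # 3.5
```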

SLIDE 15

Continuous Random Variables

  • Continuous random variables can take on an infinite number of different values

– E.g., the temperature in a room X ∈ [18, 26] can be any value in the interval

  • Give the probability of the variable being in a sub-interval

– E.g., the probability of the temperature being between 18 and 26 is 1.0
– For any 4 degree interval, the probability is 0.5
– The probability that the temperature is exactly 21.384 degrees is 0

SLIDE 16

Probability Distribution Function

  • Probabilities of continuous variables are defined by a probability distribution function (pdf) p(x)

– E.g., p(x) = 1/8 if 18 ≤ x ≤ 26, and 0 otherwise

  • Calculate probabilities by integrating over intervals:

P(a ≤ X ≤ b) = ∫ₐᵇ p(x) dx

(Figure: uniform pdf of height 1/8 over [18, 26])

  • Example: ∫₂₄²⁶ (1/8) dx = 1/4
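The integral on this slide can be checked numerically. A small sketch using a midpoint-rule integrator (my own helper, not from the slides):

```python
# Uniform pdf p(x) = 1/8 on [18, 26], 0 otherwise.
def pdf(x):
    return 1 / 8 if 18 <= x <= 26 else 0.0

def integrate(f, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return sum(f(a + (i + 0.5) * h) for i in range(steps)) * h

print(round(integrate(pdf, 24, 26), 4))  # 0.25, matching 1/4 on the slide
print(round(integrate(pdf, 18, 26), 4))  # 1.0, the pdf integrates to 1
```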

SLIDE 17

Gaussian Distribution

  • Real variables are often centered around the average, and values are less likely the further from the average they are

– Example: height of adult males. 177cm is average. 169cm and 185cm are still fairly common. 200cm is very rare.

  • Gaussian Distribution:

p(x) = (1 / (σ √(2π))) e^(−(x−μ)² / (2σ²))

μ is the mean
σ is the standard deviation (from the mean)

(Figure: Gaussian pdf with μ = 177, σ = 8)
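The Gaussian pdf with the slide's height parameters (μ = 177 cm, σ = 8 cm) can be evaluated directly; this sketch shows the peak at the mean and the symmetry around it:

```python
import math

# Gaussian pdf: p(x) = exp(-(x - mu)^2 / (2 sigma^2)) / (sigma * sqrt(2 pi)).
def gaussian_pdf(x, mu=177.0, sigma=8.0):
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

print(round(gaussian_pdf(177), 4))            # density is highest at the mean
print(gaussian_pdf(169) == gaussian_pdf(185)) # True: symmetric around 177
print(gaussian_pdf(200) < gaussian_pdf(185))  # True: 200cm is much rarer
```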

SLIDE 18

PROBABILISTIC INFERENCE

SLIDE 19

Inference Using Joint Probability

  • Assume we have three random variables

– Toothache: My tooth is hurting
– Catch: The dentist’s hook catches on to something (possibly a cavity)
– Cavity: My tooth has a cavity

  • Also assume the joint probability distribution is given:

             toothache           ¬toothache
             catch    ¬catch     catch    ¬catch
cavity       .108     .012       .072     .008
¬cavity      .016     .064       .144     .576

SLIDE 20

Inference Using Joint Probability (2)

  • Calculate probabilities by summing up elements
  • Prior probability of having a cavity:

– P(cavity) = 0.108 + 0.012 + 0.072 + 0.008 = 0.2

  • Prior probability of having a toothache:

– P(toothache) = 0.108 + 0.012 + 0.016 + 0.064 = 0.2

SLIDE 21

Inference Using Joint Probability (3)

  • Conditional probabilities can also be calculated
  • If I have a toothache, what is the probability of having a cavity?

– P(cavity|toothache) = P(cavity ∧ toothache) / P(toothache) = (0.108 + 0.012) / 0.2 = 0.6

  • What if additionally the dentist’s hook caught on?

– P(cavity|toothache, catch) = P(cavity, toothache, catch) / P(toothache, catch) = 0.108 / (0.108 + 0.016) ≈ 0.87
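Both queries on this slide reduce to summing matching entries of the full joint table and dividing. A Python sketch of that procedure (the `prob` helper and predicate style are my own framing):

```python
# Full joint distribution over (cavity, toothache, catch) from slide 19.
joint = {
    (True,  True,  True):  0.108, (True,  True,  False): 0.012,
    (True,  False, True):  0.072, (True,  False, False): 0.008,
    (False, True,  True):  0.016, (False, True,  False): 0.064,
    (False, False, True):  0.144, (False, False, False): 0.576,
}

def prob(predicate):
    """Sum the joint entries (cavity, toothache, catch) matching a predicate."""
    return sum(p for event, p in joint.items() if predicate(*event))

# P(cavity | toothache) = P(cavity and toothache) / P(toothache)
p1 = prob(lambda cav, tooth, catch: cav and tooth) / prob(lambda cav, tooth, catch: tooth)

# P(cavity | toothache, catch) = P(all three) / P(toothache and catch)
p2 = (prob(lambda cav, tooth, catch: cav and tooth and catch)
      / prob(lambda cav, tooth, catch: tooth and catch))

print(round(p1, 2))  # 0.6
print(round(p2, 2))  # 0.87
```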

SLIDE 22

Problems of Joint Probability

  • Joint probability distributions can be used to make predictions about the world, but…
  • A real scenario may consist of n variables with v different values each

– E.g., hundreds of variables in dentistry

  • The joint probability table then contains vⁿ different entries

– Large storage space required
– Summation over O(vⁿ) values
– Difficult to determine all vⁿ probabilities in practice

SLIDE 23

Independence

  • A and B are independent iff (all equivalent):

P(A|B) = P(A) or P(B|A) = P(B) or P(A, B) = P(A) P(B)

  • This leads to:

𝐏(Toothache, Catch, Cavity, Weather) = 𝐏(Toothache, Catch, Cavity) ∗ 𝐏(Weather)
⇒ 2 ∗ 2 ∗ 2 ∗ 4 = 32 entries reduced to 2 ∗ 2 ∗ 2 + 4 = 12

– For n independent random variables: O(v ∗ n) instead of O(vⁿ)

  • Problem: Interesting variables typically are dependent…
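The factorization criterion P(A, B) = P(A) P(B) can be checked cell by cell. For the Weather/Cavity table from slide 7 it holds exactly, i.e. those two variables are independent (a small sketch; the tolerance check guards against floating-point rounding):

```python
# Joint P(Weather, Cavity) from slide 7 and its marginals.
joint = {
    ("sunny", True): 0.144,  ("sunny", False): 0.576,
    ("rainy", True): 0.02,   ("rainy", False): 0.08,
    ("cloudy", True): 0.016, ("cloudy", False): 0.064,
    ("snow", True): 0.02,    ("snow", False): 0.08,
}
p_weather = {"sunny": 0.72, "rainy": 0.1, "cloudy": 0.08, "snow": 0.1}
p_cavity = {True: 0.2, False: 0.8}

# Independent iff every joint entry factors into the product of its marginals.
independent = all(
    abs(p - p_weather[w] * p_cavity[c]) < 1e-12 for (w, c), p in joint.items()
)
print(independent)  # True
```

Storage-wise this is exactly the reduction on the slide: the 2 × 4 = 8 joint entries are replaced by 4 + 2 = 6 marginal entries.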

SLIDE 24

Conditional Independence

  • Toothache and Catch are not independent (both caused by a cavity)
  • However, they are independent given the presence/absence of a cavity

– Toothache depends on the state of the nerves
– Catch depends on the skill of the dentist

  • Mathematically:

𝐏(Toothache, Catch | Cavity) = 𝐏(Toothache | Cavity) 𝐏(Catch | Cavity)

  • Or in general: 𝐏(X, Y | z) = 𝐏(X | z) 𝐏(Y | z)

SLIDE 25

Bayes’ Rule

  • From the definition of conditional probability, we get

P(a ∧ b) = P(a|b) P(b) and P(a ∧ b) = P(b|a) P(a)

  • This leads to Bayes’ rule (or Bayes’ theorem)

P(b|a) = P(a|b) P(b) / P(a)

  • In an AI application, this could look like

P(cause|effect) = P(effect|cause) P(cause) / P(effect)

– Typically, the probability of an effect given a cause is more easily available
– Finding the most likely cause when observing effects is the goal

SLIDE 26

Bayes’ Rule Example

A patient with a stiff neck (s) goes to the doctor. The doctor knows that in 70% of the cases of meningitis (m), patients have a stiff neck. What is the probability of the patient having meningitis?

  • The doctor knows:

– P(m) = 1/50000 (meningitis is very rare)
– P(s) = 0.01 (a stiff neck is comparably common)
– P(s|m) = 0.7 (a stiff neck during meningitis is very common)

  • We want to know: P(m|s)

P(m|s) = P(s|m) P(m) / P(s) = (0.7 ∗ 1/50000) / 0.01 = 0.0014 ≈ 1/700
⇒ It’s probably just a stiff neck
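The meningitis calculation as a one-line application of Bayes' rule, using the slide's numbers:

```python
# Bayes' rule: P(m | s) = P(s | m) * P(m) / P(s).
p_m = 1 / 50000    # prior: meningitis is very rare
p_s = 0.01         # prior: a stiff neck is comparably common
p_s_given_m = 0.7  # likelihood: stiff neck given meningitis

p_m_given_s = p_s_given_m * p_m / p_s
print(round(p_m_given_s, 6))  # 0.0014, roughly 1/700
```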

SLIDE 27

Naïve Bayes Model

  • We want to build a machine that can classify fruits into apples, oranges and bananas. It can recognize the features of being red, being round and having a smooth surface.
  • Model this as

– Classes C = {apple, orange, banana}
– Boolean random variables X1, X2, X3 for red-, round-, smooth-ness

  • Start by picking 1000 random fruits and counting

– We find 400 apples, 300 oranges and 300 bananas
– Out of these, count how many are red, round and smooth:

          Apples   Oranges   Bananas
Red       60%      20%       0%
Round     90%      90%       10%
Smooth    80%      30%       70%

SLIDE 28

Naïve Bayes Model (2)

  • We now know

– 𝐏(C) = <0.4, 0.3, 0.3> (a priori probability of being an apple, orange, banana)
– 𝐏(xᵢ|C) (all probabilities of having a feature given the class)

  • E.g., 𝐏(red|C) = <0.6, 0.2, 0> or P(red|banana) = 0
  • We want to calculate 𝐏(C|x1, x2, x3) for a new fruit

– Given values for its features, what are the probabilities of it being an apple, orange, banana. Pick the highest for classification.

  • Use Bayes’ rule (1) and conditional independence (2):

𝐏(C|x1, x2, x3) =⁽¹⁾ 𝐏(x1, x2, x3|C) 𝐏(C) / P(x1, x2, x3) = α 𝐏(C) 𝐏(x1, x2, x3|C) =⁽²⁾ α 𝐏(C) ∏ᵢ₌₁ⁿ 𝐏(xᵢ|C)

– α = 1 / P(x1, x2, x3) is a normalization constant, identical for all classes

SLIDE 29

Naïve Bayes Model (Example)

  • We see a fruit that is not red, but round and smooth. What is it?

P(apple|¬red, round, smooth) = α P(apple) P(¬red|apple) P(round|apple) P(smooth|apple) = α ∗ 0.4 ∗ 0.4 ∗ 0.9 ∗ 0.8 = α ∗ 0.1152
P(orange|…) = α ∗ 0.3 ∗ 0.8 ∗ 0.9 ∗ 0.3 = α ∗ 0.0648
P(banana|…) = α ∗ 0.3 ∗ 1.0 ∗ 0.1 ∗ 0.7 = α ∗ 0.021

          Apples (40%)   Oranges (30%)   Bananas (30%)
Red       60%            20%             0%
Round     90%            90%             10%
Smooth    80%            30%             70%
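The whole fruit classifier fits in a short Python sketch: the unnormalized posterior is the prior times the product of per-feature likelihoods (α is dropped, since it is the same for every class and does not change the argmax). Function and variable names are my own:

```python
# Naive Bayes for the fruit example: posterior ∝ prior * product of
# per-feature likelihoods, assuming conditional independence given the class.
priors = {"apple": 0.4, "orange": 0.3, "banana": 0.3}
likelihood = {  # P(feature = true | class), from the counting table
    "apple":  {"red": 0.6, "round": 0.9, "smooth": 0.8},
    "orange": {"red": 0.2, "round": 0.9, "smooth": 0.3},
    "banana": {"red": 0.0, "round": 0.1, "smooth": 0.7},
}

def posterior(features):
    """Unnormalized P(class | features); features maps name -> observed bool."""
    scores = {}
    for c, prior in priors.items():
        score = prior
        for f, observed in features.items():
            p = likelihood[c][f]
            score *= p if observed else 1 - p  # P(¬feature | c) = 1 - P(feature | c)
        scores[c] = score
    return scores

# The slide's query: not red, but round and smooth.
scores = posterior({"red": False, "round": True, "smooth": True})
print(max(scores, key=scores.get))  # apple
print(round(scores["apple"], 4))    # 0.1152, matching the slide
```

The three scores reproduce the slide's values 0.1152, 0.0648 and 0.021, so the fruit is classified as an apple.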