ECE 4524 Artificial Intelligence and Engineering Applications - PowerPoint PPT Presentation



SLIDE 1

ECE 4524 Artificial Intelligence and Engineering Applications

Lecture 17: Bayesian Inference Reading: AIAMA 13.5 and MacKay book Chapter 28 Today’s Schedule:

◮ Bayes’ Rule and its implications ◮ Causal versus Diagnostic Reasoning ◮ Combining Evidence ◮ Conditional Independence ◮ Examples

SLIDE 2

Bayes’ Theorem

Consider a joint probability P(A, B) with events A, B ∈ A. We can factor it using conditionals in one of two ways:

P(A, B) = P(A|B)P(B) = P(B|A)P(A)

Rearranging gives Bayes' rule for probabilities:

P(A|B) = P(B|A)P(A) / P(B)

or equivalently

P(B|A) = P(A|B)P(B) / P(A)

This same relation holds for PMFs and PDFs.
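As a quick sanity check, Bayes' rule can be verified numerically from any joint table. The 2x2 joint below is made up for illustration:

```python
# Hypothetical 2x2 joint distribution P(A, B); values are assumed for illustration.
joint = {
    (0, 0): 0.30, (0, 1): 0.10,
    (1, 0): 0.20, (1, 1): 0.40,
}

# Marginals P(A=1) and P(B=1)
p_a1 = joint[(1, 0)] + joint[(1, 1)]   # 0.60
p_b1 = joint[(0, 1)] + joint[(1, 1)]   # 0.50

# Conditionals from the definition P(X|Y) = P(X, Y) / P(Y)
p_a1_given_b1 = joint[(1, 1)] / p_b1   # 0.80
p_b1_given_a1 = joint[(1, 1)] / p_a1   # 2/3

# Bayes' rule: P(A|B) = P(B|A) P(A) / P(B) recovers the same conditional
bayes_a1_given_b1 = p_b1_given_a1 * p_a1 / p_b1
print(abs(bayes_a1_given_b1 - p_a1_given_b1) < 1e-12)  # True
```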

SLIDE 3

Bayes’ Theorem (Discrete Case)

Given a discrete r.v. X and some data D:

p(x|D) = p(D|x)p(x) / Σ_x′ p(D|x′)p(x′)

Posterior = (Likelihood × Prior) / Evidence

SLIDE 4

Bayes’ Theorem (Continuous Case)

Given a continuous random variable X and some data D:

f(x|D) = f(D|x)f(x) / ∫ f(D|x)f(x) dx

Posterior = (Likelihood × Prior) / Evidence

SLIDE 5

Models

To specify the likelihood, we need a way to generate the probability of the data, D, given x. This is a forward or generative model that depends on x.

SLIDE 6

Causal versus Diagnostic Reasoning

Two ways to view Bayes' rule:

P(cause|effect) = P(effect|cause)P(cause) / P(effect)

This lets us do diagnostic reasoning (infer the cause from an observed effect) using a causal model P(effect|cause).

P(effect|cause) = P(cause|effect)P(effect) / P(cause)

This lets us do causal reasoning (predict the effect) from diagnostic knowledge P(cause|effect).

SLIDE 7

Warmup #1

There is a test for a deadly disease you could have. A test outcome of T=0 implies you do not have the disease and T=1 that you do. The test is 95% reliable (meaning it is correct 95% of the time). Given your age and family history you have a 1% prior probability of having the disease. The test comes back positive (T=1). How worried are you and why?
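A worked answer for this warmup, taking "95% reliable" to mean both the true-positive and true-negative rates are 0.95 (an assumption the slide leaves implicit):

```python
# Slide's numbers: sensitivity = specificity = 0.95, prior = 0.01.
p_d = 0.01                 # prior P(disease)
p_pos_given_d = 0.95       # P(T=1 | disease)
p_pos_given_not_d = 0.05   # P(T=1 | no disease), i.e. false-positive rate

# Evidence P(T=1) by total probability
p_pos = p_pos_given_d * p_d + p_pos_given_not_d * (1 - p_d)

# Posterior P(disease | T=1) via Bayes' rule
p_d_given_pos = p_pos_given_d * p_d / p_pos
print(round(p_d_given_pos, 3))  # 0.161
```

Despite the positive test, the posterior is only about 16%, because the 1% prior means most positives come from the large healthy population.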

SLIDE 8

Exercise

Suppose a Robot has an acoustic sensor that measures distance to an obstacle every T seconds. The sensor has an associated error represented as a bias and variance from the true distance. Establish a probability model for this problem, making appropriate suggestions for the form of any probability densities.
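One possible model (a sketch, not the only valid answer): each reading is z = d + bias + Gaussian noise, with a uniform prior over the true distance d on a grid that is updated by Bayes' rule after every reading. All parameter values below are assumptions for illustration:

```python
import numpy as np

# Assumed sensor model: z = d + bias + noise, noise ~ N(0, sigma^2)
rng = np.random.default_rng(0)
bias, sigma = 0.2, 0.5                  # assumed sensor bias and std. dev.
true_d = 3.0                            # hidden true distance (simulation only)

grid = np.linspace(0.0, 10.0, 1001)     # candidate distances
posterior = np.ones_like(grid) / grid.size  # uniform prior

for _ in range(20):                     # one reading every T seconds
    z = true_d + bias + rng.normal(0.0, sigma)
    # Gaussian likelihood of reading z at each candidate distance
    likelihood = np.exp(-0.5 * ((z - (grid + bias)) / sigma) ** 2)
    posterior *= likelihood             # Bayes' rule, unnormalized
    posterior /= posterior.sum()        # renormalize (evidence term)

estimate = grid[np.argmax(posterior)]   # MAP estimate, close to true_d
print(estimate)
```

The posterior sharpens with each reading, since multiplying 20 Gaussian likelihoods shrinks the effective standard deviation by roughly a factor of √20.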

SLIDE 9

Another Classic Example (Pearl 1988, MacKay 2003)

◮ Fred lives in Los Angeles and commutes 60 miles to work. Whilst at work, he receives a phone call from his neighbor saying that Fred’s burglar alarm is ringing. What is the probability that there was a burglar in his house today?

◮ While driving home to investigate, Fred hears on the radio that there was a small earthquake that day near his home. ‘Oh’, he says, feeling relieved, ‘it was probably the earthquake that set off the alarm’. What is the probability that there was a burglar in his house?
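The "explaining away" effect can be sketched by enumerating a small joint distribution. The numbers below are assumed for illustration (they follow the well-known burglary-network values from AIMA, not this slide):

```python
# Assumed priors for two rare, independent causes of the alarm
p_b = 0.001      # P(burglar)
p_e = 0.002      # P(earthquake that day)

def p_alarm(b, e):
    # Assumed P(alarm | b, e): fires almost surely given a burglar,
    # sometimes given an earthquake, rarely otherwise.
    if b and e: return 0.97
    if b:       return 0.94
    if e:       return 0.29
    return 0.001

def posterior_burglar(know_earthquake=None):
    # P(burglar | alarm=1), optionally also conditioning on the earthquake.
    num = den = 0.0
    for b in (0, 1):
        for e in (0, 1):
            if know_earthquake is not None and e != know_earthquake:
                continue
            pb = p_b if b else 1 - p_b
            pe = p_e if e else 1 - p_e
            joint = pb * pe * p_alarm(b, e)   # P(b, e, alarm=1)
            den += joint
            if b:
                num += joint
    return num / den

print(posterior_burglar())                   # P(burglar | alarm): sizeable
print(posterior_burglar(know_earthquake=1))  # collapses once the quake is known
```

Learning about the earthquake "explains away" the alarm: the burglar posterior drops from roughly 0.37 to well under 1%.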

SLIDE 10

Combining Evidence

◮ Conditional Independence ◮ Factoring the Joint Probability

Recall that the joint probability distribution tells us all we need to know to make inferences. However,

◮ The complexity of Bayesian inference is dominated by the dimensionality of the joint density.
◮ For every additional evidence feature introduced, the data required to estimate the parameters goes up by a factor of at least 10, even for simple N-D Gaussians.
◮ For 100s of features, most samples from a 100-D Gaussian distribution are not even inside the variance ellipsoid!
◮ This is even worse for more complex joint distributions.
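The 100-D Gaussian claim is easy to check empirically: sample from a standard Gaussian in d dimensions and count how many points fall inside the 1-sigma ball (Mahalanobis radius 1):

```python
import numpy as np

rng = np.random.default_rng(0)
fracs = {}
for d in (1, 10, 100):
    x = rng.standard_normal((100_000, d))
    # Fraction of samples with squared Mahalanobis distance <= 1
    fracs[d] = float(np.mean(np.sum(x**2, axis=1) <= 1.0))
print(fracs)  # ~0.68 for d=1, then shrinks rapidly toward 0
```

The mass concentrates in a thin shell at radius about √d, so in high dimensions essentially no sample lands inside the unit ellipsoid.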

SLIDE 11

Warmup #2

You are given the prior probability of an event A, P(A) = 0.7. There is another event, B, that we know the outcome of. Describe briefly what effect there is on our knowledge of event A in three cases: if P(A|B) < P(A), if P(A|B) > P(A), and if P(A|B) = P(A).

SLIDE 12

The naive Bayes model

Let C be a condition (class) and Ei the evidence for that condition (features). The naive Bayes model assumes a factorization of the joint probability:

P(C, E1, E2, · · · , EN) = P(C) ∏_{i=1}^{N} P(Ei|C)

i.e. the evidence features are independent given the condition.
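A minimal naive Bayes sketch, with made-up priors and likelihoods for three hypothetical binary evidence features:

```python
import math

# Assumed class priors P(C) and per-feature likelihoods P(Ei=1 | C);
# all numbers are illustrative, not from the slide.
priors = {"spam": 0.4, "ham": 0.6}
likelihoods = {
    "spam": [0.8, 0.6, 0.3],
    "ham":  [0.1, 0.4, 0.5],
}

def posterior(evidence):
    # log P(C) + sum_i log P(Ei | C), then normalize by the evidence term.
    scores = {}
    for c, prior in priors.items():
        logp = math.log(prior)
        for e, p1 in zip(evidence, likelihoods[c]):
            logp += math.log(p1 if e else 1 - p1)
        scores[c] = math.exp(logp)
    z = sum(scores.values())
    return {c: s / z for c, s in scores.items()}

post = posterior([1, 1, 0])
print(max(post, key=post.get))  # spam
```

Working in log space, as above, avoids numerical underflow when the number of evidence features N is large.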

SLIDE 13

Example from Wumpus World

SLIDE 14

Next Actions

◮ Reading on Bayesian Networks AIAMA 14.1-14.3 ◮ There is no warmup.

Quiz II is Thursday 3/22. Covers lectures 9-15 (PL and FOL).