  1. ECE 4524 Artificial Intelligence and Engineering Applications Lecture 17: Bayesian Inference Reading: AIAMA 13.5 and MacKay book Chapter 28 Today’s Schedule: ◮ Bayes’ Rule and its implications ◮ Causal versus Diagnostic Reasoning ◮ Combining Evidence ◮ Conditional Independence ◮ Examples

  2. Bayes’ Theorem Consider a joint probability P(A, B) with A, B ∈ 𝒜. We can factor it using conditionals in one of two ways: P(A, B) = P(A | B) P(B) = P(B | A) P(A). Rearranging gives Bayes’ rule for probabilities: P(A | B) = P(B | A) P(A) / P(B), or equivalently P(B | A) = P(A | B) P(B) / P(A). The same relation holds for PMFs and PDFs.

  3. Bayes’ Theorem (Discrete Case) Given a discrete r.v. X and some data D: p(x | D) = p(D | x) p(x) / Σ_i p(D | x_i) p(x_i), or Posterior = (Likelihood × Prior) / Evidence.
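A minimal numeric illustration of the discrete update (the prior and likelihood values below are made up for illustration, not taken from the course):

    import numpy as np

    # Hypothetical prior p(x) over three values of x and likelihoods p(D | x)
    # for one observed dataset D (illustrative numbers only).
    prior = np.array([0.5, 0.3, 0.2])
    likelihood = np.array([0.10, 0.40, 0.70])

    evidence = np.sum(likelihood * prior)        # p(D) = sum_i p(D | x_i) p(x_i)
    posterior = likelihood * prior / evidence    # p(x | D)

    print(posterior)         # ≈ [0.16 0.39 0.45]
    print(posterior.sum())   # 1.0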

  4. Bayes’ Theorem (Continuous Case) Given a continuous random variable X and some data D: f(x | D) = f(D | x) f(x) / ∫ f(D | x) f(x) dx, or again Posterior = (Likelihood × Prior) / Evidence.
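The continuous case can be sketched the same way by approximating the evidence integral on a grid; the standard-normal prior and unit-variance Gaussian likelihood below are assumed purely for illustration:

    import numpy as np

    x = np.linspace(-5.0, 5.0, 1001)
    dx = x[1] - x[0]

    prior = np.exp(-0.5 * x**2) / np.sqrt(2 * np.pi)             # f(x)
    d = 1.2                                                       # one observed datum
    likelihood = np.exp(-0.5 * (d - x)**2) / np.sqrt(2 * np.pi)   # f(D | x)

    evidence = np.sum(likelihood * prior) * dx                    # ≈ ∫ f(D|x) f(x) dx
    posterior = likelihood * prior / evidence                     # f(x | D)

    print(np.sum(posterior) * dx)    # ≈ 1, as a density should integrate to
    print(x[np.argmax(posterior)])   # posterior mode ≈ 0.6 for this conjugate case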

  5. Models To specify the likelihood, we need a way to generate the probability of the data D given x. This is a forward or generative model that depends on x.

  6. Causal versus Diagnostic Reasoning Two ways to view Bayes’ rule: P(cause | effect) = P(effect | cause) P(cause) / P(effect). This supports diagnostic reasoning (inferring causes from observed effects) using a causal model P(effect | cause). P(effect | cause) = P(cause | effect) P(effect) / P(cause). This supports causal reasoning (predicting effects from causes) using diagnostic knowledge P(cause | effect).

  7. Warmup #1 There is a test for a deadly disease you could have. A test outcome of T=0 implies you do not have the disease and T=1 that you do. The test is 95% reliable (meaning it is correct 95% of the time). Given your age and family history you have a 1% prior probability of having the disease. The test comes back positive (T=1). How worried are you and why?
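One way to check the intuition numerically, reading "95% reliable" as P(T=1 | disease) = P(T=0 | no disease) = 0.95 (an assumption about what "reliable" means here):

    p_disease = 0.01                 # prior from age and family history
    p_pos_given_disease = 0.95       # sensitivity
    p_pos_given_healthy = 0.05       # false-positive rate

    p_pos = (p_pos_given_disease * p_disease
             + p_pos_given_healthy * (1 - p_disease))              # P(T = 1)
    p_disease_given_pos = p_pos_given_disease * p_disease / p_pos  # Bayes' rule

    print(round(p_disease_given_pos, 3))   # ≈ 0.161, so the low prior dominates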

  8. Exercise Suppose a robot has an acoustic sensor that measures the distance to an obstacle every T seconds. The sensor has an associated error, represented as a bias and a variance relative to the true distance. Establish a probability model for this problem, making appropriate suggestions for the form of any probability densities.
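One possible model, sketched below under an assumption of additive Gaussian noise (the bias and standard deviation are hypothetical parameters, not given in the exercise): the reading z at each step is the true distance d plus a constant bias b and zero-mean Gaussian noise, so f(z | d) = N(z; d + b, sigma^2).

    import numpy as np

    RNG = np.random.default_rng(0)

    def simulate_reading(d, bias=0.05, sigma=0.10):
        """Forward (generative) model: a noisy, biased range reading for true distance d (m)."""
        return d + bias + sigma * RNG.normal()

    def likelihood(z, d, bias=0.05, sigma=0.10):
        """Measurement likelihood f(z | d) under the assumed Gaussian model."""
        return np.exp(-0.5 * ((z - d - bias) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

    z = simulate_reading(2.0)
    print(z, likelihood(z, 2.0))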

  9. Another Classic Example (Pearl 1988, MacKay 2003) ◮ Fred lives in Los Angeles and commutes 60 miles to work. Whilst at work, he receives a phone call from his neighbor saying that Fred’s burglar alarm is ringing. What is the probability that there was a burglar in his house today? ◮ While driving home to investigate, Fred hears on the radio that there was a small earthquake that day near his home. ‘Oh’, he says, feeling relieved, ‘it was probably the earthquake that set off the alarm’. What is the probability that there was a burglar in his house?

  10. Combining Evidence ◮ Conditional Independence ◮ Factoring the Joint Probability Recall that the joint probability distribution tells us all we need to know to make inferences. However: ◮ The complexity of Bayesian inference is dominated by the dimensionality of the joint density. ◮ For every additional evidence feature introduced, the amount of data required to estimate the parameters grows by a factor of at least 10, even for simple N-dimensional Gaussians. ◮ With 100s of features, most samples from a 100-D Gaussian distribution do not even fall inside the variance ellipsoid, as the sketch below illustrates. ◮ This is even worse for more complex joint distributions.
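The high-dimensional claim is easy to check empirically; the sketch below assumes a standard Gaussian and takes the "variance ellipsoid" to mean the 1-sigma ellipsoid x^T Σ^{-1} x ≤ 1, then reports the fraction of samples falling inside it as the dimension grows.

    import numpy as np

    rng = np.random.default_rng(0)
    for dim in (1, 2, 10, 100):
        x = rng.standard_normal((100_000, dim))        # samples from N(0, I_dim)
        inside = np.mean(np.sum(x**2, axis=1) <= 1.0)  # fraction with x^T x <= 1
        print(dim, inside)
    # Roughly 0.68 in 1-D, 0.39 in 2-D, about 0.0002 in 10-D, and 0.0 in 100-D.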

  11. Warmup #2 You are given the prior probability of an event A, P(A) = 0.7. There is another event, B, that we know the outcome of. Describe briefly what effect there is on our knowledge of event A in three cases: if P(A | B) < P(A), if P(A | B) > P(A), and if P(A | B) = P(A).

  12. The naive Bayes model Let C be a condition (class) and E_i the evidence for that condition (features). The naive Bayes model assumes a factorization of the joint probability: P(C, E_1, E_2, ..., E_N) = P(C) ∏_{i=1}^{N} P(E_i | C), i.e., the evidence features are independent given the condition.
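A minimal sketch of the naive Bayes factorization in code, with two classes and three binary evidence features (all probabilities below are illustrative assumptions):

    import numpy as np

    prior = {"c0": 0.6, "c1": 0.4}                       # P(C)
    p_ei_given_c = {                                     # P(E_i = 1 | C) for each feature
        "c0": np.array([0.2, 0.7, 0.1]),
        "c1": np.array([0.8, 0.4, 0.6]),
    }
    evidence = np.array([1, 0, 1])                       # observed values of E_1..E_3

    def joint(c):
        p = p_ei_given_c[c]
        per_feature = np.where(evidence == 1, p, 1 - p)  # P(E_i = e_i | C = c)
        return prior[c] * per_feature.prod()             # P(C) * prod_i P(E_i | C)

    scores = {c: joint(c) for c in prior}
    total = sum(scores.values())
    print({c: round(s / total, 3) for c, s in scores.items()})   # posterior P(C | E_1..E_N)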

  13. Example from Wumpus World

  14. Next Actions ◮ Reading on Bayesian networks: AIAMA 14.1-14.3 ◮ There is no warmup. Quiz II is Thursday 3/22. Covers lectures 9-15 (PL and FOL).
