
Lecture 19: Conditional Independence, Bayesian Networks Intro - PowerPoint PPT Presentation



  1. Lecture 19: Conditional Independence, Bayesian Networks Intro

  2. Announcement • Assignment 4 will be out next week. • Due Friday, Dec 1 (you can still use late days if you have any left)

  3. Lecture Overview • Recap of lecture 18 • Marginal Independence • Conditional Independence • Bayesian Networks Introduction

  4. Probability Distributions • Consider the case where possible worlds are simply assignments to one random variable. • Definition (probability distribution): a probability distribution P on a random variable X is a function P: dom(X) → [0, 1] such that ∑_{x ∈ dom(X)} P(X = x) = 1. • Example: X represents a female adult's height in Canada with domain {short, normal, tall} (based on some definition of these terms): P(height = short) = 0.2, P(height = normal) = 0.5, P(height = tall) = 0.3
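A minimal Python sketch of this definition, using the heights example (the dictionary name P_height is just illustrative): each value must lie in [0, 1] and the values must sum to 1 over the domain.

```python
# A distribution over a single random variable: height of a female adult in Canada.
P_height = {"short": 0.2, "normal": 0.5, "tall": 0.3}

# The definition requires each value to lie in [0, 1] ...
assert all(0.0 <= p <= 1.0 for p in P_height.values())
# ... and the values to sum to 1 over the whole domain dom(X).
assert abs(sum(P_height.values()) - 1.0) < 1e-9
```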

  5. Joint Probability Distribution (JPD) • Joint probability distribution over random variables X_1, …, X_n: a probability distribution over the joint random variable <X_1, …, X_n> with domain dom(X_1) × … × dom(X_n) (the Cartesian product) • Think of a joint distribution over n variables as the table of the corresponding possible worlds • There is a column (dimension) for each variable, and one for the probability • Each row corresponds to an assignment X_1 = x_1, …, X_n = x_n and its probability P(X_1 = x_1, …, X_n = x_n); we can also write P(X_1 = x_1 ∧ … ∧ X_n = x_n) • The sum of probabilities across the whole table is 1. Example from before over {Weather, Temperature}:

     Weather   Temperature   µ(w)
     sunny     hot           0.10
     sunny     mild          0.20
     sunny     cold          0.10
     cloudy    hot           0.05
     cloudy    mild          0.35
     cloudy    cold          0.20
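One way to hold such a table in code is a dictionary keyed by assignments (one key per possible world); a small sketch, with the name P_WT assumed for illustration:

```python
# Joint distribution over <Weather, Temperature>: one entry per possible world (table row).
P_WT = {
    ("sunny", "hot"): 0.10, ("sunny", "mild"): 0.20, ("sunny", "cold"): 0.10,
    ("cloudy", "hot"): 0.05, ("cloudy", "mild"): 0.35, ("cloudy", "cold"): 0.20,
}

# The sum of probabilities across the whole table is 1.
assert abs(sum(P_WT.values()) - 1.0) < 1e-9
```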

  6. Recap: Conditioning • Conditioning: revise beliefs based on new observations • We need to integrate two sources of knowledge: the prior probability distribution P(X) (all background knowledge) and the new evidence e • Combine the two to form a posterior probability distribution: the conditional probability P(h|e)

  7. Recap: Conditional probability

     Possible world   Weather   Temperature   µ(w)
     w1               sunny     hot           0.10
     w2               sunny     mild          0.20
     w3               sunny     cold          0.10
     w4               cloudy    hot           0.05
     w5               cloudy    mild          0.35
     w6               cloudy    cold          0.20

     Conditional distribution P(T | W=sunny):
     T      P(T | W=sunny)
     hot    0.10/0.40 = 0.25
     mild   0.20/0.40 = 0.50
     cold   0.10/0.40 = 0.25
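Conditioning is mechanical once the JPD is in a table; a sketch that reproduces P(T | W=sunny) above, reusing the P_WT dictionary assumed earlier:

```python
# Condition P(W, T) on the evidence W = sunny: keep only the worlds consistent
# with the evidence, then renormalize by P(W = sunny).
p_sunny = sum(p for (w, t), p in P_WT.items() if w == "sunny")   # P(W=sunny) = 0.40

P_T_given_sunny = {t: p / p_sunny
                   for (w, t), p in P_WT.items() if w == "sunny"}
print({t: round(p, 2) for t, p in P_T_given_sunny.items()})
# {'hot': 0.25, 'mild': 0.5, 'cold': 0.25}
```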

  8. Recap: Inference by Enumeration • Great, we can compute arbitrary probabilities now! • Given: a prior joint probability distribution (JPD) on the set of variables X, and specific values e for the evidence variables E (a subset of X) • We want to compute: the posterior joint distribution of the query variables Y (a subset of X) given evidence e • Step 1: Condition to get the distribution P(X|e) • Step 2: Marginalize to get the distribution P(Y|e)

  9. Inference by Enumeration: example • Given P(W,C,T) as the JPD below, and evidence e: "Windy = yes" • What is the probability that it is cold? I.e., P(T=cold | W=yes) • Step 1: condition to get the distribution P(C, T | W=yes), using P(C=c ∧ T=t | W=yes) = P(C=c ∧ T=t ∧ W=yes) / P(W=yes)

     Windy W   Cloudy C   Temperature T   P(W, C, T)
     yes       no         hot             0.04
     yes       no         mild            0.09
     yes       no         cold            0.07
     yes       yes        hot             0.01
     yes       yes        mild            0.10
     yes       yes        cold            0.12
     no        no         hot             0.06
     no        no         mild            0.11
     no        no         cold            0.03
     no        yes        hot             0.04
     no        yes        mild            0.25
     no        yes        cold            0.08

     Cloudy C   Temperature T   P(C, T | W=yes)
     no         hot             0.04/0.43 ≈ 0.10
     no         mild            0.09/0.43 ≈ 0.21
     no         cold            0.07/0.43 ≈ 0.16
     yes        hot             0.01/0.43 ≈ 0.02
     yes        mild            0.10/0.43 ≈ 0.23
     yes        cold            0.12/0.43 ≈ 0.28

  10. Inference by Enumeration: example • Given P(W,C,T) as the JPD on the previous slide, and evidence e: "Windy = yes" • What is the probability that it is cold? I.e., P(T=cold | W=yes) • Step 2: marginalize over Cloudy to get the distribution P(T | W=yes)

     Cloudy C   Temperature T   P(C, T | W=yes)
     no         hot             0.10
     no         mild            0.21
     no         cold            0.16
     yes        hot             0.02
     yes        mild            0.23
     yes        cold            0.28

     Temperature T   P(T | W=yes)
     hot             0.10 + 0.02 = 0.12
     mild            0.21 + 0.23 = 0.44
     cold            0.16 + 0.28 = 0.44

• This is a probability distribution: it defines the probability of all the possible values of Temperature (three here), given the observed value for Windy (yes). Because it is a probability distribution, the sum of all its values is 1. • P(T=cold | W=yes) is a specific entry of the probability distribution P(T | W=yes).
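Both enumeration steps are easy to mirror in code; a sketch (the dictionary P_WCT and the variable names are assumptions) that reproduces P(T=cold | W=yes) ≈ 0.44:

```python
# The JPD P(W, C, T) from the slide, keyed (Windy, Cloudy, Temperature).
P_WCT = {
    ("yes", "no", "hot"): 0.04, ("yes", "no", "mild"): 0.09, ("yes", "no", "cold"): 0.07,
    ("yes", "yes", "hot"): 0.01, ("yes", "yes", "mild"): 0.10, ("yes", "yes", "cold"): 0.12,
    ("no", "no", "hot"): 0.06, ("no", "no", "mild"): 0.11, ("no", "no", "cold"): 0.03,
    ("no", "yes", "hot"): 0.04, ("no", "yes", "mild"): 0.25, ("no", "yes", "cold"): 0.08,
}

# Step 1: condition on W = yes (keep consistent worlds, renormalize by P(W=yes) = 0.43).
p_w_yes = sum(p for (w, c, t), p in P_WCT.items() if w == "yes")
P_CT_given_w = {(c, t): p / p_w_yes
                for (w, c, t), p in P_WCT.items() if w == "yes"}

# Step 2: marginalize out Cloudy by summing over its values.
P_T_given_w = {}
for (c, t), p in P_CT_given_w.items():
    P_T_given_w[t] = P_T_given_w.get(t, 0.0) + p

print(round(P_T_given_w["cold"], 2))   # 0.44
```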

  11. Conditional Probability among Random Variables • P(X | Y) = P(X, Y) / P(Y) expresses the conditional probability of each possible value for X given each possible value for Y • P(X | Y) = P(Temperature | Weather) = P(Temperature ∧ Weather) / P(Weather) • Example: Temperature ∈ {hot, cold}; Weather ∈ {sunny, cloudy}

     P(Temperature | Weather)   T = hot         T = cold
     W = sunny                  P(hot|sunny)    P(cold|sunny)
     W = cloudy                 P(hot|cloudy)   P(cold|cloudy)

Which of the following is true? A. The probabilities in each row should sum to 1 • B. The probabilities in each column should sum to 1 • C. Both of the above • D. None of the above

  12. Conditional Probability among Random Variables • P(X | Y) = P(X, Y) / P(Y) expresses the conditional probability of each possible value for X given each possible value for Y • P(X | Y) = P(Temperature | Weather) = P(Temperature ∧ Weather) / P(Weather) • Example: Temperature ∈ {hot, cold}; Weather ∈ {sunny, cloudy}

     P(Temperature | Weather)   T = hot         T = cold
     W = sunny                  P(hot|sunny)    P(cold|sunny)    ← P(T | Weather = sunny)
     W = cloudy                 P(hot|cloudy)   P(cold|cloudy)   ← P(T | Weather = cloudy)

Answer: A. The probabilities in each row should sum to 1; each row (P(T | Weather = sunny) and P(T | Weather = cloudy)) is itself a probability distribution.
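A sketch of the same point in code, building P(Temperature | Weather) from the P_WT dictionary assumed earlier and checking that each row sums to 1:

```python
# Build the table P(Temperature | Weather) row by row from the joint P_WT.
P_T_given_W = {}
for w in {w for (w, t) in P_WT}:
    p_w = sum(p for (w2, t), p in P_WT.items() if w2 == w)              # marginal P(W = w)
    P_T_given_W[w] = {t: p / p_w for (w2, t), p in P_WT.items() if w2 == w}

# Each row is itself a probability distribution over Temperature, so each row sums to 1.
for row in P_T_given_W.values():
    assert abs(sum(row.values()) - 1.0) < 1e-9
```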

  13. Recap: Inference by Enumeration • Great, we can compute arbitrary probabilities now! • Given: a prior joint probability distribution (JPD) on the set of variables X, and specific values e for the evidence variables E (a subset of X) • We want to compute: the posterior joint distribution of the query variables Y (a subset of X) given evidence e • Step 1: Condition to get the distribution P(X|e) • Step 2: Marginalize to get the distribution P(Y|e) • Generally applicable, but memory-heavy and slow. We will see a better way to do probabilistic inference.

  14. Bayes Rule and Chain Rule • Example: P(fire | alarm). By Bayes rule, P(fire | alarm) = P(alarm | fire) P(fire) / P(alarm)

  15. Bayes Rule and Chain Rule

  16. Product Rule • By definition, we know that: P(f_2 | f_1) = P(f_2 ∧ f_1) / P(f_1) • We can rewrite this to P(f_2 ∧ f_1) = P(f_2 | f_1) × P(f_1) • In general: P(f_n ∧ … ∧ f_1) = P(f_n | f_{n-1} ∧ … ∧ f_1) × P(f_{n-1} ∧ … ∧ f_1)

  17. Chain Rule • Theorem (Chain Rule): P(f_1 ∧ … ∧ f_n) = ∏_{i=1}^{n} P(f_i | f_{i-1} ∧ … ∧ f_1)

  18. Chain Rule example • P(f_1 ∧ … ∧ f_n) = ∏_{i=1}^{n} P(f_i | f_{i-1} ∧ … ∧ f_1) • P(A,B,C,D) = P(D|A,B,C) × P(A,B,C) = P(D|A,B,C) × P(C|A,B) × P(A,B) = P(D|A,B,C) × P(C|B,A) × P(B|A) × P(A) = P(A) P(B|A) P(C|A,B) P(D|A,B,C)
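The factorization can also be checked numerically; a sketch using the P_WCT table assumed earlier, with the (arbitrary) variable ordering W, C, T:

```python
# Chain rule on three variables: P(W, C, T) = P(W) * P(C | W) * P(T | W, C).
def prob(condition):
    """Sum of the joint probabilities of all worlds (w, c, t) satisfying `condition`."""
    return sum(p for world, p in P_WCT.items() if condition(world))

for (w, c, t), p_joint in P_WCT.items():
    p_w = prob(lambda x: x[0] == w)                             # P(W = w)
    p_c_given_w = prob(lambda x: x[:2] == (w, c)) / p_w         # P(C = c | W = w)
    p_t_given_wc = p_joint / prob(lambda x: x[:2] == (w, c))    # P(T = t | W = w, C = c)
    assert abs(p_joint - p_w * p_c_given_w * p_t_given_wc) < 1e-9
```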

  19. Chain Rule • Allows representing a Joint Probability Distribution (JPD) as the product of conditional probability distributions • Theorem (Chain Rule): P(f_1 ∧ … ∧ f_n) = ∏_{i=1}^{n} P(f_i | f_{i-1} ∧ … ∧ f_1)

  20. Why does the chain rule help us? • We will see how, under specific circumstances (independence between variables), this rule helps gain compactness • We can represent the JPD as a product of marginal distributions • We can simplify some terms when the variables involved are marginally independent or conditionally independent

  21. Lecture Overview • Recap of lecture 18 • Marginal Independence • Conditional Independence • Bayesian Networks Introduction

  22. Marginal Independence • Intuitively: if X ⫫ Y, then learning that Y=y does not change your belief in X, and this is true for all values y that Y could take • For example, the weather is marginally independent of the result of a coin toss

  23. Examples for marginal independence • Is Temperature marginally independent of Weather (see previous example)?

     Weather W   Temperature T   P(W,T)
     sunny       hot             0.10
     sunny       mild            0.20
     sunny       cold            0.10
     cloudy      hot             0.05
     cloudy      mild            0.35
     cloudy      cold            0.20

  24. • Is Temperature marginally independent of Weather (see previous example)? A. yes • B. no • C. It depends on the value of T • D. It depends on the value of W

     Weather W   Temperature T   P(W,T)
     sunny       hot             0.10
     sunny       mild            0.20
     sunny       cold            0.10
     cloudy      hot             0.05
     cloudy      mild            0.35
     cloudy      cold            0.20

     T      P(T)                T      P(T | W=sunny)
     hot    0.15                hot    0.25
     mild   0.55                mild   0.50
     cold   0.30                cold   0.25
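A quick numeric check of this question on the P_WT dictionary assumed earlier: marginal independence would require P(T=t) = P(T=t | W=w) for every value t and w.

```python
# Compare the marginal P(T) with the conditional P(T | W=sunny).
P_T = {}
for (w, t), p in P_WT.items():
    P_T[t] = P_T.get(t, 0.0) + p                                  # marginal P(T)

p_sunny = sum(p for (w, t), p in P_WT.items() if w == "sunny")
P_T_given_sunny = {t: p / p_sunny for (w, t), p in P_WT.items() if w == "sunny"}

print(all(abs(P_T[t] - P_T_given_sunny[t]) < 1e-9 for t in P_T))
# False: e.g. P(T=hot) = 0.15 but P(T=hot | W=sunny) = 0.25, so the observation changes
# the belief about Temperature, and the two variables are not marginally independent.
```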
