
Probabilistic Graphical Models: Review of Probability Theory - PowerPoint PPT Presentation



  1. Probabilistic Graphical Models: Review of Probability Theory. Siamak Ravanbakhsh, Fall 2019.

  2. Learning objectives: probability distribution and density functions; random variables; Bayes' rule; conditional independence; expectation and variance.

  3. Sample space: Ω = {ω} is the set of all possible outcomes (a.k.a. outcome space). Example 1: three tosses of a coin, Ω = {hhh, hht, hth, …, ttt}. (Image: http://web.mnstate.edu/peil/MDEV102/U3/S25/Cartesian3.PNG)

  4. Sample space: Ω = {ω} is the set of all possible outcomes (a.k.a. outcome space). Example 2: two dice, Ω = {(1, 1), …, (6, 6)}. (Image source: http://www.stat.ualberta.ca/people/schmu/preprints/article/Article.htm)

  5. Event space: an event is a set of outcomes E ⊆ Ω; the event space is a set of events Σ ⊆ 2^Ω.

  6. Event space: an event is a set of outcomes E ⊆ Ω; the event space is a set of events Σ ⊆ 2^Ω. Example: the event "at least two heads" in three coin tosses is E = {hht, thh, hth, hhh}; for the event "draw a pair of aces from a deck", ∣E∣ = 6.
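As a side illustration (not part of the original slides), a minimal Python sketch of these definitions: the three-toss sample space is enumerated explicitly and the event "at least two heads" is built as a plain subset of it. All names here are my own.

```python
from itertools import product

# Sample space for three coin tosses: outcomes like ('h', 'h', 't').
omega = set(product("ht", repeat=3))            # |Omega| = 8

# An event is just a subset of the sample space.
at_least_two_heads = {w for w in omega if w.count("h") >= 2}

print(sorted("".join(w) for w in at_least_two_heads))
# ['hhh', 'hht', 'hth', 'thh']  -> the event E, with |E| = 4
```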

  7. Event space: requirements for an event space (σ-algebra): Ω ∈ Σ; the complement of an event is also an event, A ∈ Σ → Ω − A ∈ Σ; the (countable) intersection of events is also an event, A, B ∈ Σ → A ∩ B ∈ Σ. Example: "at least one head", "at least one tail" ∈ Σ → "at least one head and one tail" ∈ Σ; "at least one head" ∈ Σ → "no heads" ∈ Σ. This extends to uncountable sets (the real numbers).

  8. Probability distribution: assigns a real value to each event, P : Σ → R. Probability axioms (Kolmogorov axioms; other axiomatizations of probability?): the probability measure is non-negative, P(A) ≥ 0; the probability of disjoint events is (countably) additive, A ∩ B = ∅ → P(A ∪ B) = P(A) + P(B); and P(Ω) = 1. The triple (Ω, Σ, P) is a probability space.

  9. Probability distribution, probability axioms (Kolmogorov axioms): probability is non-negative, P(A) ≥ 0; disjoint events are additive, A ∩ B = ∅ → P(A ∪ B) = P(A) + P(B); P(Ω) = 1. Derived properties: P(∅) = 0; P(Ω \ A) = 1 − P(A); P(A ∪ B) = P(A) + P(B) − P(A ∩ B); P(A ∩ B) ≤ min{P(A), P(B)}; union bound: P(A ∪ B) ≤ P(A) + P(B).

  10. Probability distribution, examples: Ω = {1, 2, 3, 4, 5, 6}. A minimal choice of event space is Σ = {∅, Ω}, with P(∅) = 0 and P(Ω) = 1.

  11. Probability distribution, examples: Ω = {1, 2, 3, 4, 5, 6}. A minimal choice of event space is Σ = {∅, Ω}, with P(∅) = 0 and P(Ω) = 1. A maximal choice of event space is Σ = 2^Ω, with P(A) = ∣A∣/6, that is, P({1, 3}) = 2/6 (any other consistent assignment is acceptable).
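A minimal sketch (my own addition, not from the slides) of this maximal event space for the die: probabilities are assigned by counting outcomes, P(A) = |A|/6, and the derived properties from slide 9 can be checked directly.

```python
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Uniform distribution on the die: P(A) = |A| / |Omega|."""
    return Fraction(len(event), len(omega))

A, B = {1, 3}, {3, 4, 5}
assert prob(A) == Fraction(2, 6)
# Inclusion-exclusion: P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
assert prob(A | B) == prob(A) + prob(B) - prob(A & B)
# Union bound: P(A ∪ B) ≤ P(A) + P(B)
assert prob(A | B) <= prob(A) + prob(B)
```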

  12. Can't we always use 2^Ω, even for uncountable outcome spaces?

  13. Can't we always use 2^Ω, even for uncountable outcome spaces? It turns out some events are not measurable (Banach-Tarski paradox).

  14. Can't we always use 2^Ω, even for uncountable outcome spaces? It turns out some events are not measurable (Banach-Tarski paradox). Having an event space and a probability measure avoids this.

  15. Conditional probability: the probability of an event A after observing the event B is P(A ∣ B) = P(A ∩ B) / P(B).

  16. Conditional probability: the probability of an event A after observing the event B is P(A ∣ B) = P(A ∩ B) / P(B), where P(B) > 0.

  17. Conditional probability: the probability of an event A after observing the event B is P(A ∣ B) = P(A ∩ B) / P(B), where P(B) > 0. Example, three coin tosses: P(at least one head ∣ at least one tail) = P(at least one head and one tail) / P(at least one tail).
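To make the three-toss example concrete, here is a small enumeration-based check (my own sketch, assuming a fair coin so each of the 8 outcomes has probability 1/8).

```python
from fractions import Fraction
from itertools import product

omega = set(product("ht", repeat=3))     # 8 equally likely outcomes

def prob(event):
    return Fraction(len(event), len(omega))

at_least_one_head = {w for w in omega if "h" in w}
at_least_one_tail = {w for w in omega if "t" in w}

# P(A | B) = P(A ∩ B) / P(B), defined only when P(B) > 0
posterior = prob(at_least_one_head & at_least_one_tail) / prob(at_least_one_tail)
print(posterior)   # 6/7: of the 7 outcomes with a tail, 6 also contain a head
```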

  18. Chain rule: P(A ∣ B) = P(A ∩ B) / P(B).

  19. Chain rule: P(A ∣ B) = P(A ∩ B) / P(B). Chain rule: P(A ∩ B) = P(B) P(A ∣ B).

  20. Chain rule: P(A ∣ B) = P(A ∩ B) / P(B). Chain rule: P(A ∩ B) = P(B) P(A ∣ B), and let B = C ∩ D.

  21. Chain rule: P(A ∣ B) = P(A ∩ B) / P(B). Chain rule: P(A ∩ B) = P(B) P(A ∣ B), and let B = C ∩ D. Then P(A ∩ C ∩ D) = P(C ∩ D) P(A ∣ C ∩ D).

  22. Chain rule: P(A ∣ B) = P(A ∩ B) / P(B). Chain rule: P(A ∩ B) = P(B) P(A ∣ B), and let B = C ∩ D. Then P(A ∩ C ∩ D) = P(C ∩ D) P(A ∣ C ∩ D) = P(D) P(C ∣ D) P(A ∣ C ∩ D).

  23. Chain rule: P(A ∣ B) = P(A ∩ B) / P(B). Chain rule: P(A ∩ B) = P(B) P(A ∣ B), and let B = C ∩ D. Then P(A ∩ C ∩ D) = P(C ∩ D) P(A ∣ C ∩ D) = P(D) P(C ∣ D) P(A ∣ C ∩ D). More generally: P(A_1 ∩ … ∩ A_n) = P(A_1) P(A_2 ∣ A_1) … P(A_n ∣ A_1 ∩ … ∩ A_{n−1}).
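A quick numeric sanity check of the chain rule (my own example, not from the slides), again on the uniform three-toss distribution.

```python
from fractions import Fraction
from itertools import product

omega = set(product("ht", repeat=3))

def prob(event):
    return Fraction(len(event), len(omega))

def cond(a, b):
    """P(A | B) = P(A ∩ B) / P(B)."""
    return prob(a & b) / prob(b)

A1 = {w for w in omega if w[0] == "h"}         # first toss is heads
A2 = {w for w in omega if w.count("h") >= 2}   # at least two heads
A3 = {w for w in omega if w[2] == "t"}         # last toss is tails

# P(A1 ∩ A2 ∩ A3) = P(A1) P(A2 | A1) P(A3 | A1 ∩ A2)
assert prob(A1 & A2 & A3) == prob(A1) * cond(A2, A1) * cond(A3, A1 & A2)
```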

  24. Bayes' rule: reasoning about an event A. P(A ∣ B) = P(B ∣ A) P(A) / P(B), where P(B ∣ A) is the likelihood of the event B if A were to happen, P(A) is our prior belief about A, and P(A ∣ B) is our posterior belief about A after observing B.

  25. Bayes' rule, example: 1% of the population has cancer; a cancer test has a 10% false-positive rate and a 5% false-negative rate. Bayes' rule: P(A ∣ B) = P(B ∣ A) P(A) / P(B) (posterior = likelihood × prior / evidence). What is the chance of having cancer given a positive test result?

  26. Bayes' rule, example: 1% of the population has cancer; the test has a 10% false-positive rate and a 5% false-negative rate. What is the chance of having cancer given a positive test result? Sample space? {TP, TN, FP, FN}. Events A, B? A = {TP, FN} (has cancer), B = {TP, FP} (positive test). Prior? Likelihood? P(A) = .01, P(B ∣ A) = .95. P(B) is not trivial.

  27. Bayes' rule, example: 1% of the population has cancer; the test has a 10% false-positive rate and a 5% false-negative rate. Chance of having cancer given a positive test result? Sample space: {TP, TN, FP, FN}; events A = {TP, FN}, B = {TP, FP}; prior P(A) = .01, likelihood P(B ∣ A) = .95; P(B) is not trivial. P(cancer ∣ +) ∝ P(+ ∣ cancer) P(cancer) = .95 × .01 = .0095, and P(¬cancer ∣ +) ∝ P(+ ∣ ¬cancer) P(¬cancer) = .1 × .99 = .099, so P(cancer ∣ +) = .0095 / (.0095 + .099) ≈ .09.
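The same posterior computation written out in Python (my own sketch; it mirrors the arithmetic above, using the slide's stated rates: 1% prevalence, 10% false positives, 5% false negatives).

```python
p_cancer = 0.01                # prior P(A): 1% prevalence
p_pos_given_cancer = 0.95      # sensitivity = 1 - false-negative rate (5%)
p_pos_given_healthy = 0.10     # false-positive rate (10%)

# Unnormalized posteriors (the numerators of Bayes' rule)
num_cancer = p_pos_given_cancer * p_cancer            # 0.0095
num_healthy = p_pos_given_healthy * (1 - p_cancer)    # 0.099

p_cancer_given_pos = num_cancer / (num_cancer + num_healthy)
print(round(p_cancer_given_pos, 3))   # ~0.088: roughly a 9% chance despite the positive test
```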

  28. Independence: P ⊨ (A ⊥ B). Events A and B are independent iff P(A ∩ B) = P(A) P(B): observing A does not change P(B).

  29. Independence: P ⊨ (A ⊥ B). Events A and B are independent iff P(A ∩ B) = P(A) P(B): observing A does not change P(B). Equivalent definition (using P(A ∩ B) = P(A) P(B ∣ A)): P(B) = P(B ∣ A) or P(A) = 0.

  30. Independence, example: are A and B independent? (Venn diagram of two events A and B inside Ω.)

  31. Independence, example 1: P(hhh) = P(hht) = … = P(ttt) = 1/8, and P(h** ∣ *t*) = P(h**) = 1/2; equivalently, P(ht*) = P(*t*) P(h**) = 1/4.

  32. Independence, example 1: P(hhh) = P(hht) = … = P(ttt) = 1/8, and P(h** ∣ *t*) = P(h**) = 1/2; equivalently, P(ht*) = P(*t*) P(h**) = 1/4. Example 2: are these two events independent? P({ht, hh}) = .3, P({th}) = .1.
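A quick check of Example 1 in code (my own sketch), with Example 2 resolved by the disjointness argument in the comments.

```python
from fractions import Fraction
from itertools import product

omega = set(product("ht", repeat=3))

def prob(event):
    return Fraction(len(event), len(omega))

first_is_h = {w for w in omega if w[0] == "h"}    # h**
second_is_t = {w for w in omega if w[1] == "t"}   # *t*

# Example 1: P(A ∩ B) = P(A) P(B) holds, so the events are independent.
assert prob(first_is_h & second_is_t) == prob(first_is_h) * prob(second_is_t)  # 1/4 = 1/2 * 1/2

# Example 2: {ht, hh} and {th} are disjoint, so P(A ∩ B) = 0,
# while P(A) P(B) = 0.3 * 0.1 = 0.03 > 0 -- not independent.
```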

  33. Conditional independence: P ⊨ (A ⊥ B ∣ C), a more common phenomenon: P(A ∩ B ∣ C) = P(A ∣ C) P(B ∣ C).

  34. Conditional independence: P ⊨ (A ⊥ B ∣ C), a more common phenomenon: P(A ∩ B ∣ C) = P(A ∣ C) P(B ∣ C), using P(A ∩ B ∣ C) = P(A ∣ C) P(B ∣ A ∩ C).

  35. Conditional independence: P ⊨ (A ⊥ B ∣ C), a more common phenomenon: P(A ∩ B ∣ C) = P(A ∣ C) P(B ∣ C). Using P(A ∩ B ∣ C) = P(A ∣ C) P(B ∣ A ∩ C), an equivalent definition is P(B ∣ C) = P(B ∣ A ∩ C) or P(A ∩ C) = 0.
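A small synthetic illustration (entirely my own, not from the slides) of why conditional independence is the more common phenomenon: pick one of two biased coins, then flip it twice; the flips are independent given the chosen coin, but not marginally.

```python
# Choose coin C uniformly (bias 0.9 or 0.1), then flip it twice.
# A = "first flip is heads", B = "second flip is heads".
biases = [0.9, 0.1]

def joint(a, b, c):
    """P(A=a, B=b, C=c); the flips are independent given the chosen coin."""
    p = biases[c]
    return 0.5 * (p if a else 1 - p) * (p if b else 1 - p)

p_ab = sum(joint(True, True, c) for c in (0, 1))                      # P(A ∩ B) = 0.41
p_a = sum(joint(True, b, c) for b in (True, False) for c in (0, 1))   # P(A)     = 0.5
p_b = p_a                                                             # P(B)     = 0.5 by symmetry

p_ab_given_c0 = joint(True, True, 0) / 0.5     # P(A ∩ B | C=0) = 0.81
p_a_given_c0 = biases[0]                       # P(A | C=0)     = 0.9

assert abs(p_ab_given_c0 - p_a_given_c0 ** 2) < 1e-12   # A ⊥ B | C holds
assert abs(p_ab - p_a * p_b) > 0.1                      # but A ⊥ B fails (0.41 vs 0.25)
```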

  36. Conditional independence, example: a generalization of independence, P(A ∩ B ∣ C) = P(A ∣ C) P(B ∣ C). (Venn diagram over Ω illustrating P ⊨ (R ⊥ B ∣ Y); figure from Wikipedia.)

  37. Summary: basics of probability. Outcome space: a set. Event: a subset of outcomes. Event space: a set of events. A probability distribution is associated with events. Conditional probability is based on the intersection of events. The chain rule follows from conditional probability. (Conditional) independence: the relevance of some events to others.

  38. Random variable: an attribute associated with each outcome, X : Ω → Val(X). Examples: the intensity of a pixel; the head/tail value of the first coin in multiple coin tosses; whether the first draw from a deck is larger than the second. Random variables are a formalism to define events: P(X = x) ≜ P({ω ∈ Ω ∣ X(ω) = x}).
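As a final illustration (my own sketch), a random variable really is just a function on outcomes: below, X counts heads in three fair tosses, and P(X = x) is recovered as the probability of the induced event.

```python
from fractions import Fraction
from itertools import product

omega = set(product("ht", repeat=3))

def prob(event):
    return Fraction(len(event), len(omega))

def X(w):
    """Random variable: the number of heads in the outcome w."""
    return w.count("h")

# P(X = x) := P({w in Omega : X(w) = x})
for x in range(4):
    print(x, prob({w for w in omega if X(w) == x}))
# 0 -> 1/8, 1 -> 3/8, 2 -> 3/8, 3 -> 1/8
```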
