Graphical Models
Review of probability theory
Siamak Ravanbakhsh
Winter 2018

Learning objectives:
Probability distribution and density functions
Random variable
Bayes' rule
Conditional independence
Expectation and variance
Sample space Ω = {ω}: the set of all possible outcomes (a.k.a. outcome space)

Example 1: three tosses of a coin
Ω = {hhh, hht, hth, …, ttt}
image: http://web.mnstate.edu/peil/MDEV102/U3/S25/Cartesian3.PNG
Example 2: two dice
Image source: http://www.stat.ualberta.ca/people/schmu/preprints/article/Article.htm
Example 3: 2 cards from a deck (assuming order doesn't matter)
∣Ω∣ = (54 choose 2) = 54! / (2! 52!)
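A small Python sketch (mine, not from the slides) that enumerates the first two sample spaces explicitly:

```python
from itertools import product

# Ω for three coin tosses: every length-3 string over {h, t}
omega_coins = {"".join(ws) for ws in product("ht", repeat=3)}
assert len(omega_coins) == 2 ** 3  # 8 outcomes: hhh, hht, ..., ttt

# Ω for two dice: ordered pairs (d1, d2)
omega_dice = set(product(range(1, 7), repeat=2))
assert len(omega_dice) == 36
```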
An event F ⊆ Ω is a set of outcomes.
The event space S ⊆ 2^Ω is a set of events.

Example:
Event: at least two heads, F = {hht, thh, hth, hhh}
Event: a pair of aces, ∣F∣ = 6
Requirements for the event space:
The complement of an event is also an event: A ∈ S → Ω − A ∈ S
The intersection of events is also an event: A, B ∈ S → A ∩ B ∈ S
Example:
at least one head ∈ S → no heads ∈ S
at least one head, at least one tail ∈ S → at least one head and one tail ∈ S
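Representing events as sets makes these closure requirements concrete; a quick sketch (not from the slides) for two coin tosses:

```python
from itertools import product

omega = frozenset("".join(ws) for ws in product("ht", repeat=2))  # two tosses
at_least_one_head = frozenset(w for w in omega if "h" in w)
at_least_one_tail = frozenset(w for w in omega if "t" in w)

# Complement of an event is an event; so is the intersection of two events.
no_heads = omega - at_least_one_head
one_head_one_tail = at_least_one_head & at_least_one_tail
print(sorted(no_heads))           # ['tt']
print(sorted(one_head_one_tail))  # ['ht', 'th']
```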
A probability distribution P : S → ℝ assigns a real value to each event.
Probability axioms (Kolmogorov axioms):
Probability is non-negative: P(A) ≥ 0
The probability of disjoint events is additive: A ∩ B = ∅ → P(A ∪ B) = P(A) + P(B)
P(Ω) = 1
Derived properties:
P(∅) = 0
P(Ω∖A) = 1 − P(A)
P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
union bound: P(A ∪ B) ≤ P(A) + P(B)
P(A ∩ B) ≤ min{P(A), P(B)}
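These derived properties can be checked exactly on a finite sample space; a sketch of mine using the uniform measure on three tosses:

```python
from fractions import Fraction
from itertools import product

omega = ["".join(ws) for ws in product("ht", repeat=3)]

def P(event):  # uniform probability measure: |A| / |Ω|
    return Fraction(len(event), len(omega))

A = {w for w in omega if w.count("h") >= 2}  # at least two heads
B = {w for w in omega if w[0] == "h"}        # first toss is a head

# inclusion-exclusion, union bound, complement, monotonicity
assert P(A | B) == P(A) + P(B) - P(A & B)
assert P(A | B) <= P(A) + P(B)
assert P(set(omega) - A) == 1 - P(A)
assert P(A & B) <= min(P(A), P(B))
```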
Example: Ω = {1, 2, 3, 4, 5, 6}
S = {∅, Ω} with P(∅) = 0, P(Ω) = 1 (a minimal choice of event space)
S = 2^Ω with P(A) = ∣A∣/6, e.g. P({1, 3}) = 2/6 (a maximal choice of event space)
(any other consistent assignment is acceptable)
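The maximal (uniform) choice is one line of code; a sketch of mine, not from the slides:

```python
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}

def P(A):
    # uniform assignment over the maximal event space 2^Ω
    return Fraction(len(A), len(omega))

print(P({1, 3}))  # 1/3, i.e. 2/6 reduced
```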
Conditional probability of an event A after observing the event B:
P(A ∣ B) = P(A ∩ B) / P(B), assuming P(B) > 0

Example: three coin tosses
P(at least one head ∣ at least one tail) = P(at least one head and one tail) / P(at least one tail) = (6/8) / (7/8) = 6/7
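The same conditional probability can be computed by counting outcomes; a quick sketch of mine:

```python
from fractions import Fraction
from itertools import product

omega = ["".join(ws) for ws in product("ht", repeat=3)]
A = {w for w in omega if "h" in w}  # at least one head
B = {w for w in omega if "t" in w}  # at least one tail

P_B = Fraction(len(B), len(omega))        # 7/8 (all but ttt)
P_AB = Fraction(len(A & B), len(omega))   # 6/8 (all but hhh and ttt)
print(P_AB / P_B)                         # P(A|B) = 6/7
```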
Chain rule: P(A ∩ B) = P(B)P(A ∣ B)
Substituting B = C ∩ D:
P(A ∩ C ∩ D) = P(C ∩ D)P(A ∣ C ∩ D) = P(D)P(C ∣ D)P(A ∣ C ∩ D)
More generally:
P(A₁ ∩ … ∩ Aₙ) = P(A₁)P(A₂ ∣ A₁) … P(Aₙ ∣ A₁ ∩ … ∩ Aₙ₋₁)
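Since conditional probability is defined as a ratio, the three-event factorization telescopes; this sketch (mine) makes that explicit on the coin-toss space:

```python
from fractions import Fraction
from itertools import product

omega = ["".join(ws) for ws in product("ht", repeat=3)]

def P(E):
    return Fraction(len(E), len(omega))

A = {w for w in omega if w[0] == "h"}       # first toss is a head
C = {w for w in omega if w[1] == "h"}       # second toss is a head
D = {w for w in omega if w.count("h") >= 1}  # at least one head

# P(A ∩ C ∩ D) = P(D) P(C|D) P(A|C ∩ D) — the ratios telescope
lhs = P(A & C & D)
rhs = P(D) * (P(C & D) / P(D)) * (P(A & C & D) / P(C & D))
assert lhs == rhs
```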
Bayes' rule: reasoning about the event A after observing B
P(A ∣ B) = P(B ∣ A)P(A) / P(B)
P(A): prior; P(B ∣ A): likelihood of the event B if A were to happen; P(A ∣ B): posterior

Example: 1% of the population has cancer. A cancer test has a 10% false positive rate and a 5% false negative rate. What is the chance of having cancer given a positive test result?
sample space: {TP, TN, FP, FN}
events: A = {TP, FN} (has cancer), B = {TP, FP} (test is positive)
prior: P(A) = .01; likelihood: P(B ∣ A) = .95
P(B) is not trivial; expand it over the two cases:
P(cancer ∣ +) ∝ P(+ ∣ cancer)P(cancer) = .95 × .01 = .0095
P(no cancer ∣ +) ∝ P(+ ∣ no cancer)P(no cancer) = .1 × .99 = .099
P(cancer ∣ +) = .0095 / (.0095 + .099) ≈ .09
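The posterior is a few lines of arithmetic; a sketch of mine using the slide's setup (1% prevalence, 10% false-positive rate, 5% false-negative rate):

```python
p_cancer = 0.01
p_pos_given_cancer = 0.95   # 1 - false-negative rate
p_pos_given_healthy = 0.10  # false-positive rate

# law of total probability for the evidence P(+)
evidence = p_pos_given_cancer * p_cancer + p_pos_given_healthy * (1 - p_cancer)
posterior = p_pos_given_cancer * p_cancer / evidence
print(round(posterior, 3))  # ≈ 0.088: a positive test is still mostly a false alarm
```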
Events A and B are independent, P ⊨ (A ⊥ B), iff P(A ∩ B) = P(A)P(B).
Observing A does not change P(B).
Equivalent definition, using the chain rule P(A ∩ B) = P(A)P(B ∣ A):
P(B ∣ A) = P(B) or P(A) = 0
Are A and B independent?

Example 1: three fair coin tosses, P(hhh) = P(hht) = … = P(ttt) = 1/8
P(h** ∣ *t*) = P(h**) = 1/2
equivalently: P(ht*) = P(*t*)P(h**) = 1/4

Example 2: are these two events independent?
P({ht, hh}) = .3, P({th}) = .1
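Example 1 can be verified by counting; a small sketch of mine:

```python
from fractions import Fraction
from itertools import product

omega = ["".join(ws) for ws in product("ht", repeat=3)]

def P(E):
    return Fraction(len(E), len(omega))

first_head = {w for w in omega if w[0] == "h"}   # h**
second_tail = {w for w in omega if w[1] == "t"}  # *t*

# product rule holds, so the events are independent: 1/4 = 1/2 · 1/2
assert P(first_head & second_tail) == P(first_head) * P(second_tail)
```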
Conditional independence, P ⊨ (A ⊥ B ∣ C), iff P(A ∩ B ∣ C) = P(A ∣ C)P(B ∣ C)
a more common phenomenon than (marginal) independence
Equivalent definition, using P(A ∩ B ∣ C) = P(A ∣ C)P(B ∣ A ∩ C):
P(B ∣ C) = P(B ∣ A ∩ C) or P(A ∩ C) = 0

Generalization of independence: P ⊨ (R ⊥ B ∣ Y)
image from: wikipedia
Basics of probability (summary):
Outcome space: a set
Event: a subset of outcomes
Event space: a set of events
Probability dist. is associated with events
Conditional probability: based on intersection of events
Chain rule follows from conditional probability
(Conditional) independence: relevance of some events to others
A random variable X : Ω → Val(X) is an attribute associated with each outcome; a formalism to define events:
P(X = x) ≜ P({ω ∈ Ω ∣ X(ω) = x})
Examples: intensity of a pixel; head/tail value of the first coin in multiple coin tosses; whether the first draw from a deck is larger than the second.

Example: three tosses of a coin
X₁ : Ω → {0, 1, 2, 3}: number of heads
X₂ : Ω → {0, 1, 2}: number of heads in the first two trials
X₃ : Ω → {True, False}: at least one head

Multiple RVs X₁, …, Xₙ:
canonical outcome space: Ω_c ≜ Val(X₁) × … × Val(Xₙ)
joint probability: P(X₁ = x₁, …, Xₙ = xₙ) ≜ P(X₁ = x₁ ∩ … ∩ Xₙ = xₙ)
marginal probability: P(X₁ = x₁) = Σ_{x₂,…,xₙ} P(X₁ = x₁, …, Xₙ = xₙ)
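An RV really is just a function of the outcome, and P(X = x) is the probability of the event it induces; a sketch of mine for the coin example:

```python
from fractions import Fraction
from itertools import product

omega = ["".join(ws) for ws in product("ht", repeat=3)]

def X1(w):  # number of heads
    return w.count("h")

def X3(w):  # at least one head
    return "h" in w

# P(X = x) is the probability of the event {ω : X(ω) = x}
def P(X, x):
    return Fraction(sum(1 for w in omega if X(w) == x), len(omega))

print(P(X1, 2))     # 3/8  (hht, hth, thh)
print(P(X3, True))  # 7/8  (all but ttt)
```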
Example of a joint probability: three tosses of a coin
X₁ : Ω → {0, 1, 2, 3}: number of heads
X₂ : Ω → {True, False}: first trial is a head
Canonical outcome space: Ω_c = {(0, True), …, (3, False)}; each pair (x₁, x₂) is an atomic outcome.

X₂ \ X₁    0     1     2     3   | P(X₂)
True      .1    .1    .4    .05  |  .65
False     .2    .01   .09   .05  |  .35
P(X₁)     .3    .11   .49   .1   |

The last row and column are the marginal probabilities.
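Marginalization is just summing out the other variable; a sketch of mine using the table's numbers:

```python
# Joint table P(X1, X2); keys are (x1, x2) with x2 = "first trial is a head".
joint = {
    (0, True): .1,  (1, True): .1,   (2, True): .4,   (3, True): .05,
    (0, False): .2, (1, False): .01, (2, False): .09, (3, False): .05,
}

# marginals: sum the joint over the variable being removed
p_x1 = {x1: sum(p for (a, b), p in joint.items() if a == x1) for x1 in range(4)}
p_x2 = {x2: sum(p for (a, b), p in joint.items() if b == x2) for x2 in (True, False)}
print(round(p_x1[2], 2))     # 0.49
print(round(p_x2[True], 2))  # 0.65
```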
Given random variables X, Y, Z: P ⊨ (X ⊥ Y ∣ Z) iff P ⊨ (X = x ⊥ Y = y ∣ Z = z) ∀x, y, z
Therefore P ⊨ (X ⊥ Y ∣ Z) iff
P(X, Y ∣ Z) = P(X ∣ Z)P(Y ∣ Z) or P(X ∣ Y, Z) = P(X ∣ Z)
Marginal independence: P ⊨ (X ⊥ Y ∣ ∅)
Continuous case
probability density function (pdf): p : Val(X) → [0, +∞) s.t. ∫_{Val(X)} p(x) dx = 1
note: p(x) can often be larger than 1; it is not a probability distribution, and P(X = x) = 0
the cumulative distribution function (cdf): F(a) ≜ P(X ≤ a) ≜ ∫_{−∞}^{a} p(x) dx
P(a ≤ X ≤ b) = F(b) − F(a)
for discrete domains: probability mass function (pmf): p(x) ≜ P(X = x) s.t. Σ_{Val(X)} p(x) = 1
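The "a density can exceed 1" point is easy to see numerically; a sketch of mine with Uniform([0, 0.5]), whose density is 2 on its support:

```python
# Uniform([0, 0.5]): p(x) = 2 > 1 on the support, yet it integrates to 1
# and P(X = x) = 0 for any single point x.
def p(x):
    return 2.0 if 0 <= x <= 0.5 else 0.0

# crude Riemann sum for ∫ p(x) dx over [0, 1]
n = 100_000
total = sum(p(i / n) for i in range(n)) / n
print(round(total, 3))  # ≈ 1.0
```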
Joint density of multiple RVs (same conditions):
joint CDF: F(a₁, …, aₙ) ≜ P(X₁ ≤ a₁, …, Xₙ ≤ aₙ) ≜ ∫_{−∞}^{a₁} … ∫_{−∞}^{aₙ} p(x₁, …, xₙ) dxₙ … dx₁
Marginal density: p(x₁) = ∫_{−∞}^{∞} … ∫_{−∞}^{∞} p(x₁, …, xₙ) dxₙ … dx₂
marginal CDF: F(x₁) = lim_{x₂,…,xₙ→∞} F(x₁, …, xₙ)
Continuous case, conditional distribution:
P(X ∣ Y = y) = P(X, Y = y) / P(Y = y) is problematic: the event {Y = y} has zero measure!
Instead, take the limit ϵ → 0 in:
P(X ≤ a ∣ y − ϵ ≤ Y ≤ y + ϵ) = [∫_{−∞}^{a} ∫_{−ϵ}^{ϵ} p(x, y + e) de dx] / [∫_{−ϵ}^{ϵ} p(y + e) de]
using ∫_{−ϵ}^{ϵ} f(y + e) de = 2ϵ f(y) + O(ϵ²):
P(X ≤ a ∣ y − ϵ ≤ Y ≤ y + ϵ) ≈ [∫_{−∞}^{a} p(x, y) dx] / p(y)
So the conditional density of P(X ∣ Y = y) is p(x ∣ y) = p(x, y) / p(y).
This extends Bayes' rule, the chain rule, and conditional independence to densities.
An RV X : Ω → Val(X) is a function of the outcome; therefore g(X) ≜ g(X(ω)) is an RV itself, e.g., Y = X₁ + X₂.

Expectation: E[X] ≜ Σ_{x∈Val(X)} x p(x) or E[X] ≜ ∫_{x∈Val(X)} x p(x) dx
linearity: E[X + aY] = E[X] + aE[Y]
e.g., X: # heads, Y: # heads in the first trial (X & Y are not independent, yet linearity still holds)
for independent X & Y:
E[XY] = Σ_{x,y ∈ Val(X)×Val(Y)} p(x, y) xy = Σ_{x,y} p(x)p(y) xy = (Σ_x x p(x))(Σ_y y p(y)) = E[X]E[Y]
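Linearity holds even for dependent RVs, which a simulation makes vivid; a sketch of mine with the slide's X (total heads) and Y (heads in the first trial):

```python
import random

random.seed(0)
# X: # heads in three tosses, Y: # heads in the first trial.
# X and Y are NOT independent, but E[X + aY] = E[X] + aE[Y] holds regardless.
n = 200_000
xs, ys = [], []
for _ in range(n):
    tosses = [random.random() < 0.5 for _ in range(3)]
    xs.append(sum(tosses))
    ys.append(int(tosses[0]))

a = 2.0
lhs = sum(x + a * y for x, y in zip(xs, ys)) / n
rhs = sum(xs) / n + a * sum(ys) / n
# identical up to float rounding, by linearity of the sample mean
assert abs(lhs - rhs) < 1e-9
```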
Variance: Var[X] ≜ E[(X − E[X])²] = E[X² + E[X]² − 2X E[X]] = E[X²] + E[X]² − 2E[X]E[X] = E[X²] − E[X]²
for independent X and Y: Var[X + Y] = Var[X] + Var[Y]
if not independent: Var[X + Y] = Var[X] + Var[Y] + 2 Cov[X, Y]
Covariance generalizes variance: Cov[X, Y] ≜ E[(X − E[X])(Y − E[Y])] = E[XY] − E[X]E[Y], so Cov[X, X] = Var[X]
symmetric & bilinear: Cov[aX, bY] = ab Cov[Y, X]
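Both identities can be evaluated exactly on the three-toss space; a sketch of mine with X = total heads and Y = heads in the first toss:

```python
from itertools import product

# Exact expectations over the 8 equally likely outcomes of three fair tosses.
omega = list(product([0, 1], repeat=3))

def E(f):
    return sum(f(w) for w in omega) / len(omega)

X = lambda w: sum(w)  # total heads
Y = lambda w: w[0]    # heads in the first toss

var_x = E(lambda w: X(w) ** 2) - E(X) ** 2       # Var[X] = E[X²] − E[X]²
cov_xy = E(lambda w: X(w) * Y(w)) - E(X) * E(Y)  # Cov[X,Y] = E[XY] − E[X]E[Y]
print(var_x)   # 0.75: three independent Bernoulli(1/2), each contributing 1/4
print(cov_xy)  # 0.25: only the first toss is shared, Var[Y] = 1/4
```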
Classical members of the exponential family of distributions:
Gaussian, Bernoulli, Binomial, Multinomial, Gamma, Exponential, Poisson, Beta, Dirichlet
Bernoulli: discrete distribution with Val(X) = {0, 1} and P(X = 1; μ) = μ, where 0 ≤ μ ≤ 1
p(x; μ) = μ^x (1 − μ)^(1−x)

Binomial: number of heads in n coin tosses, Val(X) = {0, …, n}
P(X = k; μ, n) = (n choose k) μ^k (1 − μ)^(n−k)

Categorical (a.k.a. multinoulli): fully parameterized discrete distribution with Val(X) = {1, …, L}
P(X = l; μ) = μ_l, where Σ_l μ_l = 1

Multinomial distribution: counts from n independent categorical trials
P(X₁ = x₁, …, X_L = x_L; μ, n) = I(Σ_l x_l = n) · (n! / ∏_l x_l!) · ∏_l μ_l^{x_l}
Uniform: max-entropy distribution
discrete: Val(X) = {a, a + 1, …, b}, P(X = j) = 1/n (n = number of values)
continuous: Val(X) = [a, b], with constant density p(x)

Gaussian: motivated by the central limit theorem; max-entropy dist. with a fixed variance
p(x; μ, σ) = (1/√(2πσ²)) e^{−(x−μ)²/(2σ²)}
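These pmfs/pdfs translate directly into code; a sketch of mine (helper names are my own):

```python
from math import comb, exp, pi, sqrt

def bernoulli_pmf(x, mu):        # p(x; μ) = μ^x (1-μ)^(1-x)
    return mu ** x * (1 - mu) ** (1 - x)

def binomial_pmf(k, n, mu):      # (n choose k) μ^k (1-μ)^(n-k)
    return comb(n, k) * mu ** k * (1 - mu) ** (n - k)

def gaussian_pdf(x, mu, sigma):  # N(x; μ, σ²)
    return exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / sqrt(2 * pi * sigma ** 2)

# pmf sums to 1 over its support; binomial(2; 3, 1/2) counts 3 of 8 patterns
assert abs(sum(binomial_pmf(k, 3, 0.5) for k in range(4)) - 1.0) < 1e-12
assert binomial_pmf(2, 3, 0.5) == 0.375
```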
Summary:
Random variable: assigns a value to each outcome
Event (using RV): set of outcomes with a particular attribute
Continuous domains: same definition of probability, event, RV etc.
Specifying the prob. dist. using the density function
Adding random variables

Notation:
random variables: X, Y, Z
random vector: X = [X₁, …, Xₙ]
values: x, y, z
PDF, PMF: p(x), p(x, y)
probability distribution: P(X), P(x) ≜ P(X = x)
domain of an RV: Val(X), Val(X, Y, Z)
(used interchangeably)
bonus slides
Properties of conditional independence:
Symmetry: (X ⊥ Y ∣ Z) ⇒ (Y ⊥ X ∣ Z)
Decomposition: (X ⊥ Y, W ∣ Z) ⇒ (X ⊥ Y ∣ Z)
Weak union: (X ⊥ Y, W ∣ Z) ⇒ (X ⊥ Y ∣ W, Z)
Contraction: (X ⊥ W ∣ Y, Z) & (X ⊥ Y ∣ Z) ⇒ (X ⊥ Y, W ∣ Z)
Intersection (if P is positive): (X ⊥ Y ∣ W, Z) & (X ⊥ W ∣ Y, Z) ⇒ (X ⊥ Y, W ∣ Z)
image: Pearl's book
Poisson: frequency of rare events (events are assumed independent)
p(x; λ) = λ^x e^{−λ} / x!, where λ > 0 is the mean frequency (rate parameter)
Val(X) = ℤ⁺
similar to the binomial with a large number of trials (λ ≈ nμ)
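The binomial-to-Poisson approximation is easy to check numerically; an informal sketch of mine at k = 3:

```python
from math import comb, exp, factorial

# Poisson(λ) as the large-n limit of Binomial(n, μ) with λ = nμ.
lam, n, k = 2.0, 10_000, 3
mu = lam / n

binom = comb(n, k) * mu ** k * (1 - mu) ** (n - k)
poisson = lam ** k * exp(-lam) / factorial(k)
# the two pmfs nearly agree for large n and small μ
assert abs(binom - poisson) < 1e-3
```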
Exponential: time between events in a Poisson process; memoryless property
p(x; λ) = λe^{−λx}, where λ > 0; Val(X) = ℝ⁺

Geometric: number of Bernoulli trials until success; memoryless property
Val(X) = ℕ
p(k; μ) = (1 − μ)^{k−1} μ, where 0 < μ < 1
relation to the exponential: (1 − μ) ≡ e^{−λ}