CSC2515 Lecture 6: Probabilistic Models
Marzyeh Ghassemi
Material and slides developed by Roger Grosse, University of Toronto
UofT CSC2515 Lec6 1 / 54
Today's Agenda
Bayesian parameter estimation: average predictions over all hypotheses,
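Estimates of the heads probability θ under a Beta(a, b) prior. The numbers shown are consistent with a = b = 2.

| Estimator | Formula | $N_H = 2,\ N_T = 0$ | $N_H = 55,\ N_T = 45$ |
|---|---|---|---|
| Maximum likelihood | $\frac{N_H}{N_H + N_T}$ | $1$ | $\frac{55}{100} = 0.55$ |
| Posterior predictive | $\frac{N_H + a}{N_H + N_T + a + b}$ | $\frac{4}{6} \approx 0.67$ | $\frac{57}{104} \approx 0.548$ |
| MAP | $\frac{N_H + a - 1}{N_H + N_T + a + b - 2}$ | $\frac{3}{4} = 0.75$ | $\frac{56}{102} \approx 0.549$ |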
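The three estimators can be checked numerically. The following is a minimal sketch, assuming a Beta(a, b) prior with a = b = 2 and the two datasets (2 heads, 0 tails) and (55 heads, 45 tails); these settings are inferred from the fractions on the slide, and the function name `coin_estimates` is made up for illustration.

```python
from fractions import Fraction

def coin_estimates(n_heads, n_tails, a, b):
    """Return (ML, posterior predictive, MAP) estimates of the heads
    probability under a Beta(a, b) prior, as exact fractions."""
    n = n_heads + n_tails
    theta_ml = Fraction(n_heads, n)                       # N_H / (N_H + N_T)
    theta_pred = Fraction(n_heads + a, n + a + b)         # posterior mean
    theta_map = Fraction(n_heads + a - 1, n + a + b - 2)  # posterior mode
    return theta_ml, theta_pred, theta_map

# Assumed settings: a = b = 2, inferred from the numbers in the table.
small = coin_estimates(2, 0, a=2, b=2)
large = coin_estimates(55, 45, a=2, b=2)

for name, est in (("small", small), ("large", large)):
    print(name, [round(float(v), 3) for v in est])
```

With only two observations the three estimates differ widely (1, 2/3, 3/4), while with one hundred observations they nearly coincide, which is the point of the comparison.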
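$$
\underbrace{p(t \mid x)}_{\text{posterior}} = \frac{\overbrace{p(x \mid t)}^{\text{class likelihood}} \; \overbrace{p(t)}^{\text{prior}}}{\underbrace{p(x)}_{\text{constant}}}
$$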
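As a small illustration of how the posterior is assembled from the class likelihood and the prior, here is a sketch over a finite set of classes. The class names and probabilities are hypothetical values chosen only for the example.

```python
def posterior(class_likelihoods, priors):
    """Bayes' rule over a finite set of classes.

    class_likelihoods[c] = p(x | c) for the observed x; priors[c] = p(c).
    """
    joint = {c: class_likelihoods[c] * priors[c] for c in priors}
    evidence = sum(joint.values())  # p(x): the same constant for every class
    return {c: joint[c] / evidence for c in joint}

# Hypothetical numbers, purely for illustration.
post = posterior({"c0": 0.1, "c1": 0.8}, {"c0": 0.75, "c1": 0.25})
print(post)
```

Because the denominator is the same for every class, it can be ignored when only the most probable class is needed.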
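$$
\begin{aligned}
\ell(\theta) &= \sum_{i=1}^{N} \log p(x^{(i)}, t^{(i)}) \\
&= \sum_{i=1}^{N} \log \Big\{ p(t^{(i)}) \prod_{j=1}^{D} p(x_j^{(i)} \mid t^{(i)}) \Big\} \\
&= \sum_{i=1}^{N} \Big[ \log p(t^{(i)}) + \sum_{j=1}^{D} \log p(x_j^{(i)} \mid t^{(i)}) \Big] \\
&= \sum_{i=1}^{N} \log p(t^{(i)}) + \sum_{j=1}^{D} \underbrace{\sum_{i=1}^{N} \log p(x_j^{(i)} \mid t^{(i)})}_{\text{log-likelihood for feature } x_j}
\end{aligned}
$$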
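For binary features and labels, with $\theta_{11} = p(x_j = 1 \mid t = 1)$ and $\theta_{10} = p(x_j = 1 \mid t = 0)$:

$$
\begin{aligned}
\sum_{i=1}^{N} \log p(x_j^{(i)} \mid t^{(i)}) ={}& \sum_{i=1}^{N} t^{(i)} x_j^{(i)} \log \theta_{11} + \sum_{i=1}^{N} t^{(i)} (1 - x_j^{(i)}) \log(1 - \theta_{11}) \\
&+ \sum_{i=1}^{N} (1 - t^{(i)}) x_j^{(i)} \log \theta_{10} + \sum_{i=1}^{N} (1 - t^{(i)}) (1 - x_j^{(i)}) \log(1 - \theta_{10})
\end{aligned}
$$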
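Setting the derivative of this per-feature log-likelihood to zero gives simple counting estimates: θ11 is the fraction of class-1 examples with x_j = 1, and θ10 the same fraction among class-0 examples. The sketch below assumes binary features and labels and reads θ11 as p(x_j = 1 | t = 1) and θ10 as p(x_j = 1 | t = 0); the toy data and function names are hypothetical.

```python
import math

def fit_feature(x_j, t):
    """Counting (maximum likelihood) estimates for one binary feature.

    Assumes theta_11 = p(x_j = 1 | t = 1) and theta_10 = p(x_j = 1 | t = 0).
    """
    n1 = sum(t)
    n0 = len(t) - n1
    theta_11 = sum(x for x, c in zip(x_j, t) if c == 1) / n1
    theta_10 = sum(x for x, c in zip(x_j, t) if c == 0) / n0
    return theta_11, theta_10

def log_lik(x_j, t, theta_11, theta_10):
    """Bernoulli log-likelihood of feature x_j given the labels t."""
    total = 0.0
    for x, c in zip(x_j, t):
        theta = theta_11 if c == 1 else theta_10
        total += x * math.log(theta) + (1 - x) * math.log(1 - theta)
    return total

# Hypothetical toy data: 4 examples of class 1, then 4 of class 0.
x_j = [1, 1, 0, 1, 1, 0, 0, 0]
t = [1, 1, 1, 1, 0, 0, 0, 0]
theta_11, theta_10 = fit_feature(x_j, t)
print(theta_11, theta_10)
```

Evaluating `log_lik` at nearby parameter values gives a quick sanity check that the counting estimates are indeed the maximizers.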
$$
p(t \mid x) = \frac{p(t) \prod_{j=1}^{D} p(x_j \mid t)}{\sum_{t'} p(t') \prod_{j=1}^{D} p(x_j \mid t')}
$$
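The naive Bayes posterior can be computed by multiplying the prior by the per-feature likelihoods and normalizing over classes. The sketch below assumes binary features and works in log space to avoid underflow when D is large; the parameter values are hypothetical.

```python
import math

def nb_posterior(x, prior, theta):
    """p(t | x) for a binary feature vector x under naive Bayes.

    prior[t] = p(t); theta[t][j] = p(x_j = 1 | t).
    """
    log_joint = {}
    for t in prior:
        lp = math.log(prior[t])
        for x_j, th in zip(x, theta[t]):
            lp += math.log(th if x_j == 1 else 1.0 - th)
        log_joint[t] = lp
    # Subtract the max before exponentiating for numerical stability.
    m = max(log_joint.values())
    unnorm = {t: math.exp(lp - m) for t, lp in log_joint.items()}
    z = sum(unnorm.values())
    return {t: u / z for t, u in unnorm.items()}

# Hypothetical parameters for two classes and D = 3 features.
prior = {0: 0.5, 1: 0.5}
theta = {0: [0.2, 0.5, 0.1], 1: [0.9, 0.5, 0.6]}
post = nb_posterior([1, 0, 1], prior, theta)
print(post)
```

Here the unnormalized scores are 0.005 for class 0 and 0.135 for class 1, so class 1 receives posterior probability 0.135 / 0.14.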
$$
p(x \mid t = k) = \frac{1}{(2\pi)^{D/2} |\Sigma_k|^{1/2}} \exp\Big[ -\tfrac{1}{2} (x - \mu_k)^\top \Sigma_k^{-1} (x - \mu_k) \Big]
$$
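$$
\mu_k = \frac{\sum_{i=1}^{N} r_k^{(i)} x^{(i)}}{\sum_{i=1}^{N} r_k^{(i)}}
$$

$$
\Sigma_k = \frac{1}{\sum_{i=1}^{N} r_k^{(i)}} \sum_{i=1}^{N} r_k^{(i)} (x^{(i)} - \mu_k)(x^{(i)} - \mu_k)^\top
$$

$$
r_k^{(i)} = \mathbb{1}[t^{(i)} = k]
$$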
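The maximum likelihood estimates for the Gaussian class-conditional model are weighted averages in which the indicator r_k^(i) selects the examples of class k. A minimal NumPy sketch, assuming integer labels in {0, ..., K − 1} and synthetic data generated only for the example:

```python
import numpy as np

def fit_gda(X, t, num_classes):
    """Per-class mean and maximum likelihood (biased) covariance.

    X: (N, D) array of inputs; t: (N,) integer labels in {0, ..., K - 1}.
    Returns mus of shape (K, D) and sigmas of shape (K, D, D).
    """
    mus, sigmas = [], []
    for k in range(num_classes):
        r = (t == k).astype(float)               # r_k^(i) = 1[t^(i) = k]
        mu = (r[:, None] * X).sum(axis=0) / r.sum()
        diff = X - mu
        sigma = (r[:, None] * diff).T @ diff / r.sum()
        mus.append(mu)
        sigmas.append(sigma)
    return np.array(mus), np.array(sigmas)

# Synthetic data, purely for illustration.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
t = rng.integers(0, 3, size=200)
mus, sigmas = fit_gda(X, t, 3)
print(mus.shape, sigmas.shape)
```

Note that the covariance divides by the class count rather than the count minus one, so it matches `np.cov(..., bias=True)` rather than NumPy's default unbiased estimate.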
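$$
\log p(t = k \mid x) = -\tfrac{D}{2} \log(2\pi) - \tfrac{1}{2} \log |\Sigma_k| - \tfrac{1}{2} (x - \mu_k)^\top \Sigma_k^{-1} (x - \mu_k) + \log p(t = k) - \log p(x)
$$

Decision boundary between classes $k$ and $\ell$:

$$
(x - \mu_k)^\top \Sigma_k^{-1} (x - \mu_k) = (x - \mu_\ell)^\top \Sigma_\ell^{-1} (x - \mu_\ell) + \text{Const}
$$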
Expanding the quadratic forms on both sides of the decision boundary:

$$
x^\top \Sigma_k^{-1} x - 2 \mu_k^\top \Sigma_k^{-1} x = x^\top \Sigma_\ell^{-1} x - 2 \mu_\ell^\top \Sigma_\ell^{-1} x + \text{Const}
$$

With a shared covariance $\Sigma_k = \Sigma_\ell = \Sigma$, the quadratic terms cancel and the boundary is linear in $x$:

$$
-2 \mu_k^\top \Sigma^{-1} x = -2 \mu_\ell^\top \Sigma^{-1} x + \text{Const}
$$
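The linearity of the shared-covariance boundary can be checked numerically. The sketch below assumes two classes with a common covariance matrix and compares the difference of log joint densities with an explicit linear function w·x + b; the means, covariance, priors, and function names are hypothetical.

```python
import numpy as np

def gaussian_log_density(x, mu, sigma):
    """Log of the multivariate Gaussian density N(x; mu, sigma)."""
    d = x.shape[0]
    diff = x - mu
    _, logdet = np.linalg.slogdet(sigma)
    quad = diff @ np.linalg.solve(sigma, diff)
    return -0.5 * (d * np.log(2 * np.pi) + logdet + quad)

def linear_boundary(mu_k, mu_l, sigma, prior_k, prior_l):
    """Return (w, b) such that, for a shared covariance sigma,
    log p(t=k | x) - log p(t=l | x) = w @ x + b."""
    sk = np.linalg.solve(sigma, mu_k)   # Sigma^{-1} mu_k
    sl = np.linalg.solve(sigma, mu_l)   # Sigma^{-1} mu_l
    w = sk - sl
    b = -0.5 * mu_k @ sk + 0.5 * mu_l @ sl + np.log(prior_k) - np.log(prior_l)
    return w, b

# Hypothetical parameters, purely for illustration.
mu_k, mu_l = np.array([1.0, 0.0]), np.array([-1.0, 2.0])
sigma = np.array([[2.0, 0.5], [0.5, 1.0]])
w, b = linear_boundary(mu_k, mu_l, sigma, 0.3, 0.7)

x = np.array([0.4, -1.2])
gap = (gaussian_log_density(x, mu_k, sigma) + np.log(0.3)
       - gaussian_log_density(x, mu_l, sigma) - np.log(0.7))
print(np.isclose(gap, w @ x + b))
```

The decision boundary is the set of points where w·x + b = 0. With class-specific covariances the x⊤Σ⁻¹x terms no longer cancel and the boundary is quadratic.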
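$$
\mu_{jk} = \frac{\sum_{i=1}^{N} r_k^{(i)} x_j^{(i)}}{\sum_{i=1}^{N} r_k^{(i)}}
$$

$$
\sigma_{jk}^2 = \frac{\sum_{i=1}^{N} r_k^{(i)} (x_j^{(i)} - \mu_{jk})^2}{\sum_{i=1}^{N} r_k^{(i)}}
$$

$$
r_k^{(i)} = \mathbb{1}[t^{(i)} = k]
$$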
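When each feature is modelled by its own one-dimensional Gaussian per class, the estimates reduce to a per-class, per-feature mean and variance. A minimal NumPy sketch, assuming integer labels in {0, ..., K − 1} and synthetic data used only for the example:

```python
import numpy as np

def fit_gaussian_nb(X, t, num_classes):
    """Per-class, per-feature maximum likelihood mean and variance.

    X: (N, D) array; t: (N,) integer labels in {0, ..., K - 1}.
    Returns mu and var, each of shape (K, D).
    """
    n_features = X.shape[1]
    mu = np.zeros((num_classes, n_features))
    var = np.zeros((num_classes, n_features))
    for k in range(num_classes):
        r = (t == k).astype(float)   # r_k^(i) = 1[t^(i) = k]
        mu[k] = (r[:, None] * X).sum(axis=0) / r.sum()
        var[k] = (r[:, None] * (X - mu[k]) ** 2).sum(axis=0) / r.sum()
    return mu, var

# Synthetic data, purely for illustration.
rng = np.random.default_rng(0)
X = rng.normal(loc=1.0, scale=2.0, size=(150, 4))
t = rng.integers(0, 2, size=150)
mu, var = fit_gaussian_nb(X, t, 2)
print(mu.shape, var.shape)
```

This stores 2·K·D numbers instead of the K·D·(D+1)/2 covariance entries of the full-covariance model, which is the practical appeal of the naive Bayes assumption.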
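$$
\log p(t = k \mid x) = -\tfrac{D}{2} \log(2\pi) - \tfrac{1}{2} \log |\Sigma_k| - \tfrac{1}{2} (x - \mu_k)^\top \Sigma_k^{-1} (x - \mu_k) + \log p(t = k) - \log p(x)
$$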