CSC 411: Lecture 13: Mixtures of Gaussians and EM
Class based on Raquel Urtasun & Rich Zemel’s lectures Sanja Fidler
University of Toronto
March 9, 2016
Urtasun, Zemel, Fidler (UofT) CSC 411: 13-MoG March 9, 2016 1 / 33
◮ This makes it possible to judge different methods.
◮ It may help us decide on the number of clusters.
◮ Then we adjust the model parameters to maximize the probability that the model would produce the data we observed.
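A Gaussian mixture model represents the density as a weighted sum of K Gaussians:
$$p(x) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k), \qquad \pi_k \ge 0, \quad \sum_{k=1}^{K} \pi_k = 1$$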
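As a small illustration of a mixture density, the sketch below evaluates a one-dimensional mixture of two Gaussians and checks numerically that it integrates to roughly one. The restriction to 1-D, the function names `gaussian_pdf` and `mixture_pdf`, and the parameter values are assumptions made for this example, not part of the lecture.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian N(x | mean, var)."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def mixture_pdf(x, weights, means, variances):
    """Weighted sum of component densities: sum_k pi_k N(x | mu_k, var_k)."""
    return sum(w * gaussian_pdf(x, m, v)
               for w, m, v in zip(weights, means, variances))

# Hypothetical two-component mixture (values chosen only for illustration).
weights = [0.3, 0.7]      # mixing proportions, must sum to 1
means = [-2.0, 1.0]
variances = [0.5, 1.5]

# Crude Riemann sum over [-15, 15) to check the density integrates to about 1.
step = 0.01
total = sum(mixture_pdf(-15.0 + i * step, weights, means, variances) * step
            for i in range(3000))
```

Because the mixing proportions sum to one and each component is a valid density, the mixture is a valid density as well.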
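Maximum likelihood maximizes the log-likelihood
$$\ln p(X \mid \pi, \mu, \Sigma) = \sum_{n=1}^{N} \ln \left( \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x^{(n)} \mid \mu_k, \Sigma_k) \right)$$
◮ Singularities: Arbitrarily large likelihood when a Gaussian explains a single point
◮ Identifiability: Solution is unique only up to permutations of the components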
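The singularity problem can be seen numerically. In the sketch below, one component is centred exactly on a single datapoint and its variance is shrunk, while a second broad component covers the rest of the data; the log-likelihood keeps growing. The data, the 1-D setting, and all names are assumptions chosen for illustration.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian N(x | mean, var)."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def log_likelihood(data, weights, means, variances):
    """sum_n ln sum_k pi_k N(x_n | mu_k, var_k)."""
    total = 0.0
    for x in data:
        total += math.log(sum(w * gaussian_pdf(x, m, v)
                              for w, m, v in zip(weights, means, variances)))
    return total

# Hypothetical toy data.
data = [-1.0, 0.0, 0.5, 2.0, 3.0]

log_liks = []
for var in [1.0, 1e-2, 1e-4, 1e-6]:
    # Component 0 sits on data[0] with shrinking variance;
    # component 1 is broad and keeps every other point's density positive.
    ll = log_likelihood(data, [0.5, 0.5], [data[0], 1.0], [var, 4.0])
    log_liks.append(ll)
```

Each further shrinking of the variance raises the density at the captured point without bound, which is why unconstrained maximum likelihood is ill-posed for mixtures.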
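Introduce a latent variable z indicating which component generated x; marginalizing over z recovers the mixture:
$$p(x) = \sum_{k=1}^{K} p(x, z = k) = \sum_{k=1}^{K} p(z = k) \, p(x \mid z = k)$$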
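If the assignments $z^{(n)}$ were observed, the maximum likelihood estimates would be:
$$\mu_k = \frac{\sum_{n=1}^{N} 1[z^{(n)} = k] \, x^{(n)}}{\sum_{n=1}^{N} 1[z^{(n)} = k]}$$
$$\Sigma_k = \frac{\sum_{n=1}^{N} 1[z^{(n)} = k] \, (x^{(n)} - \mu_k)(x^{(n)} - \mu_k)^T}{\sum_{n=1}^{N} 1[z^{(n)} = k]}$$
$$\pi_k = \frac{1}{N} \sum_{n=1}^{N} 1[z^{(n)} = k]$$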
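When the assignments are known, these estimates reduce to per-cluster averages. A minimal sketch for scalar data follows; the function name `fit_with_known_labels`, the toy data, and the assumption that every component has at least one member are illustrative choices.

```python
def fit_with_known_labels(data, labels, num_components):
    """Maximum likelihood estimates for a 1-D mixture with observed assignments.

    Assumes every component index in range(num_components) appears in labels.
    """
    n = len(data)
    weights, means, variances = [], [], []
    for k in range(num_components):
        # The indicator 1[z_n = k] simply selects the members of component k.
        members = [x for x, z in zip(data, labels) if z == k]
        count = len(members)
        mean = sum(members) / count
        var = sum((x - mean) ** 2 for x in members) / count
        weights.append(count / n)
        means.append(mean)
        variances.append(var)
    return weights, means, variances

# Hypothetical toy data with known component labels.
data = [1.0, 2.0, 3.0, 10.0, 12.0]
labels = [0, 0, 0, 1, 1]
weights, means, variances = fit_with_known_labels(data, labels, 2)
```

In the unobserved case the hard indicators are replaced by soft posterior probabilities, which is what the E-step of EM supplies.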
◮ In order to adjust the parameters, we must first solve the inference problem: which Gaussian generated each datapoint?
◮ We cannot be sure, so it's a distribution over all possibilities:
$$\gamma_k^{(n)} = p(z^{(n)} = k \mid x^{(n)}; \pi, \mu, \Sigma)$$
◮ Each Gaussian gets a certain amount of posterior probability for each datapoint.
◮ At the optimum we shall satisfy
$$\frac{\partial \ln p(X \mid \pi, \mu, \Sigma)}{\partial \Theta} = 0$$
◮ We can derive closed form updates for all parameters
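$$\gamma_k = p(z = k \mid x) = \frac{p(z = k) \, p(x \mid z = k)}{\sum_{j=1}^{K} p(z = j) \, p(x \mid z = j)} = \frac{\pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k)}{\sum_{j=1}^{K} \pi_j \, \mathcal{N}(x \mid \mu_j, \Sigma_j)}$$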
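The posterior over components is Bayes' rule applied to the mixture: each component's prior-times-likelihood, normalized by their sum. A minimal 1-D sketch, with hypothetical parameter values and function names chosen for this example:

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian N(x | mean, var)."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def responsibilities(x, weights, means, variances):
    """Posterior p(z = k | x) for every component k."""
    joint = [w * gaussian_pdf(x, m, v)          # pi_k * N(x | mu_k, var_k)
             for w, m, v in zip(weights, means, variances)]
    evidence = sum(joint)                        # p(x), the normalizer
    return [j / evidence for j in joint]

# Hypothetical symmetric mixture: equal weights, means at -1 and +1.
weights, means, variances = [0.5, 0.5], [-1.0, 1.0], [1.0, 1.0]
gamma_mid = responsibilities(0.0, weights, means, variances)    # equidistant point
gamma_right = responsibilities(1.5, weights, means, variances)  # closer to mean +1
```

A point halfway between two identical components is shared equally, while a point nearer one mean is assigned mostly to that component.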
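$$\ln p(X \mid \pi, \mu, \Sigma) = \sum_{n=1}^{N} \ln \left( \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x^{(n)} \mid \mu_k, \Sigma_k) \right)$$
Setting the derivative with respect to $\mu_k$ to zero:
$$0 = \sum_{n=1}^{N} \frac{\pi_k \, \mathcal{N}(x^{(n)} \mid \mu_k, \Sigma_k)}{\sum_{j=1}^{K} \pi_j \, \mathcal{N}(x^{(n)} \mid \mu_j, \Sigma_j)} \, \Sigma_k^{-1} (x^{(n)} - \mu_k)$$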
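Setting the derivative of the log-likelihood with respect to $\mu_k$ to zero gives
$$0 = \sum_{n=1}^{N} \gamma_k^{(n)} \, \Sigma_k^{-1} (x^{(n)} - \mu_k), \qquad \gamma_k^{(n)} = \frac{\pi_k \, \mathcal{N}(x^{(n)} \mid \mu_k, \Sigma_k)}{\sum_{j=1}^{K} \pi_j \, \mathcal{N}(x^{(n)} \mid \mu_j, \Sigma_j)}$$
Solving for $\mu_k$:
$$\mu_k = \frac{1}{N_k} \sum_{n=1}^{N} \gamma_k^{(n)} \, x^{(n)}, \qquad N_k = \sum_{n=1}^{N} \gamma_k^{(n)}$$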
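$$\Sigma_k = \frac{1}{N_k} \sum_{n=1}^{N} \gamma_k^{(n)} \, (x^{(n)} - \mu_k)(x^{(n)} - \mu_k)^T$$
$$\pi_k = \frac{N_k}{N}, \qquad N_k = \sum_{n=1}^{N} \gamma_k^{(n)}$$
These are not closed-form solutions: the updates depend on the responsibilities $\gamma_k^{(n)}$, which are complex functions of the parameters.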
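◮ E-step: Evaluate the responsibilities given current parameters
$$\gamma_k^{(n)} = p(z^{(n)} = k \mid x^{(n)}) = \frac{\pi_k \, \mathcal{N}(x^{(n)} \mid \mu_k, \Sigma_k)}{\sum_{j=1}^{K} \pi_j \, \mathcal{N}(x^{(n)} \mid \mu_j, \Sigma_j)}$$
◮ M-step: Re-estimate the parameters given current responsibilities
$$\mu_k = \frac{1}{N_k} \sum_{n=1}^{N} \gamma_k^{(n)} \, x^{(n)}$$
$$\Sigma_k = \frac{1}{N_k} \sum_{n=1}^{N} \gamma_k^{(n)} \, (x^{(n)} - \mu_k)(x^{(n)} - \mu_k)^T$$
$$\pi_k = \frac{N_k}{N} \quad \text{with} \quad N_k = \sum_{n=1}^{N} \gamma_k^{(n)}$$
◮ Evaluate log likelihood and check for convergence
$$\ln p(X \mid \pi, \mu, \Sigma) = \sum_{n=1}^{N} \ln \left( \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x^{(n)} \mid \mu_k, \Sigma_k) \right)$$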
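The three steps above can be sketched for scalar data as follows. This is a minimal illustration under stated assumptions, not a reference implementation: it uses 1-D Gaussians with scalar variances in place of full covariance matrices, a hand-picked initialization, and no safeguard against collapsing variances. The names `em_step` and `history` and the toy data are hypothetical.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian N(x | mean, var)."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def em_step(data, weights, means, variances):
    """One E-step and one M-step.

    Returns the updated parameters and the log-likelihood of the data
    under the parameters that were passed in.
    """
    num_k = len(weights)
    resp, log_lik = [], 0.0
    # E-step: responsibilities gamma[n][k] under the current parameters.
    for x in data:
        joint = [weights[k] * gaussian_pdf(x, means[k], variances[k])
                 for k in range(num_k)]
        evidence = sum(joint)
        log_lik += math.log(evidence)
        resp.append([j / evidence for j in joint])
    # M-step: responsibility-weighted averages.
    new_w, new_m, new_v = [], [], []
    for k in range(num_k):
        n_k = sum(r[k] for r in resp)                       # effective count N_k
        mean = sum(r[k] * x for r, x in zip(resp, data)) / n_k
        var = sum(r[k] * (x - mean) ** 2 for r, x in zip(resp, data)) / n_k
        new_w.append(n_k / len(data))
        new_m.append(mean)
        new_v.append(var)
    return new_w, new_m, new_v, log_lik

# Hypothetical toy data: two well-separated groups.
data = [-2.1, -1.9, -2.4, -1.6, 3.0, 3.3, 2.7, 3.1, 2.9]
weights, means, variances = [0.5, 0.5], [-1.0, 1.0], [1.0, 1.0]

history = []
for _ in range(50):
    weights, means, variances, log_lik = em_step(data, weights, means, variances)
    history.append(log_lik)
    # Convergence check on the change in log-likelihood.
    if len(history) > 1 and abs(history[-1] - history[-2]) < 1e-9:
        break
```

The recorded log-likelihoods should never decrease from one iteration to the next, which is a useful sanity check when debugging an EM implementation.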
$$\Theta^{new} = \arg\max_{\Theta} Q(\Theta, \Theta^{old})$$
◮ But we know that the posterior will change after updating the parameters.
◮ The function we need is called Free Energy.
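For reference, one common way to write the free energy for a single observation x, latent variable z, and an arbitrary distribution q(z) is sketched below. The notation q and F is an assumption here, and sign conventions vary between texts (some define the free energy as the negative of this quantity).

```latex
\begin{align*}
F(q, \Theta) &= \sum_{z} q(z) \ln p(x, z \mid \Theta) - \sum_{z} q(z) \ln q(z) \\
             &= \ln p(x \mid \Theta) - \mathrm{KL}\big( q(z) \,\|\, p(z \mid x, \Theta) \big)
\end{align*}
```

Under this form, the E-step corresponds to maximizing F over q with the parameters fixed, which sets q to the posterior p(z | x, Θ) and makes the KL term zero, and the M-step to maximizing F over Θ with q fixed. Since F never exceeds ln p(x | Θ), neither step can decrease the log-likelihood.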