Learning From Data Lecture 19 A Peek At Unsupervised Learning
k-Means Clustering Probability Density Estimation Gaussian Mixture Models
- M. Magdon-Ismail
CSCI 4100/6100
recap: Radial Basis Functions

Nonparametric RBF (no training; e.g. r = 0.05):

    g(x) = Σ_{n=1}^N α_n(x) y_n,    α_n(x) = φ(||x − x_n||/r) / Σ_{m=1}^N φ(||x − x_m||/r)

Parametric k-RBF-Network (linear model given the µ_j):

    h(x) = w_0 + Σ_{j=1}^k w_j φ(||x − µ_j||/r)    (bump on µ_j)

Choose the µ_j as centers of k clusters of the data.
(Plots: fits in the (x, y) plane for k = 4, r = 1, and for k = 10, regularized.)
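The two estimators above can be sketched in numpy. This is a minimal illustration, assuming the Gaussian kernel φ(z) = e^{−z²/2}; the function names `rbf_nonparametric` and `rbf_network` are ours, not from the slides.

```python
import numpy as np

def phi(z):
    # Gaussian kernel bump
    return np.exp(-0.5 * z**2)

def rbf_nonparametric(x, X, y, r):
    # Nonparametric RBF: weighted average of all N targets, with weights
    # alpha_n(x) = phi(||x - x_n|| / r) normalized to sum to 1.
    a = phi(np.linalg.norm(X - x, axis=1) / r)
    return np.dot(a, y) / a.sum()

def rbf_network(x, w0, w, centers, r):
    # Parametric k-RBF-network: linear model in k bumps centered on mu_j.
    z = phi(np.linalg.norm(centers - x, axis=1) / r)
    return w0 + np.dot(w, z)
```

The nonparametric form uses every data point at prediction time; the network form compresses the data into k centers and learns the weights w.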
© AML Creator: Malik Magdon-Ismail
Unsupervised Learning: 2/23
Unsupervised learning

- Organize data for faster nearest neighbor search.
- Determine centers for RBF bumps.
- Learn the patterns in data, e.g. the patterns in a language, before getting into a supervised setting.
- amazon.com organizes books into categories.
Clustering digits

(Plot: the digits data in the average intensity–symmetry plane.)
Clustering

A cluster is a collection of points S. A k-clustering is a partition of the data into k clusters S_1, …, S_k:

    ∪_{j=1}^k S_j = D,    S_i ∩ S_j = ∅ for i ≠ j.

Each cluster has a center µ_j.
k-means error

    E_in(S_1, …, S_k; µ_1, …, µ_k) = Σ_{n=1}^N ||x_n − µ(x_n)||² = Σ_{j=1}^k Σ_{x_n ∈ S_j} ||x_n − µ_j||²

where µ(x_n) is the center of the cluster to which x_n belongs.
Add to S_j all points closest to µ_j. The center µ_j is the centroid of cluster S_j:

    µ_j = (1/|S_j|) Σ_{x_n ∈ S_j} x_n
Lloyd's algorithm

    E_in(S_1, …, S_k; µ_1, …, µ_k) = Σ_{n=1}^N ||x_n − µ(x_n)||²

1: Initialize: pick well separated centers µ_j.
2: Update S_j to be all points closest to µ_j:
       S_j ← {x_n : ||x_n − µ_j|| ≤ ||x_n − µ_ℓ|| for ℓ = 1, …, k}.
3: Update µ_j to the centroid of S_j:
       µ_j ← (1/|S_j|) Σ_{x_n ∈ S_j} x_n.
4: Repeat steps 2 and 3 until E_in stops decreasing.
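The four steps above can be sketched in numpy. This is a minimal version under our own naming (`lloyd`); for simplicity it initializes from random data points rather than deliberately well separated centers, and it keeps empty clusters at their old center.

```python
import numpy as np

def lloyd(X, k, n_iter=100, seed=0):
    # Lloyd's algorithm for k-means.
    rng = np.random.default_rng(seed)
    # Step 1: initialize centers (here: k random data points).
    mu = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Step 2: assign each point to its closest center.
        d = np.linalg.norm(X[:, None, :] - mu[None, :, :], axis=2)
        S = d.argmin(axis=1)
        # Step 3: move each center to the centroid of its cluster.
        new_mu = np.array([X[S == j].mean(axis=0) if np.any(S == j) else mu[j]
                           for j in range(k)])
        if np.allclose(new_mu, mu):  # E_in stopped decreasing
            break
        mu = new_mu
    return mu, S
```

Each iteration can only decrease E_in (step 2 by reassignment, step 3 by the centroid property), so the loop terminates, though possibly at a local minimum, which is why the initialization matters.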
Application to RBF-Network

Choosing k: use knowledge of the problem (e.g. 10 digits) or cross validation.
Probability density estimation
Clusters are regions of high probability.
Parzen windows

Put a Gaussian bump of width r on each data point:

    P̂(x) = (1/N) Σ_{n=1}^N (1/r^d) φ(||x − x_n||/r),    φ(z) = (1/(2π)^{d/2}) e^{−z²/2}
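This estimate can be sketched directly from the formula; `parzen_density` is our illustrative name for it.

```python
import numpy as np

def parzen_density(x, X, r):
    # Parzen window estimate: average of N Gaussian bumps of width r,
    # one on each data point, phi(z) = exp(-z^2/2) / (2*pi)^(d/2),
    # each scaled by 1/r^d so it integrates to 1.
    N, d = X.shape
    z = np.linalg.norm(X - x, axis=1) / r
    bumps = np.exp(-0.5 * z**2) / ((2 * np.pi) ** (d / 2) * r**d)
    return bumps.mean()
```

Small r gives a spiky estimate that follows the data closely; large r smooths the bumps together.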
Digits data
GMM

Parametric density estimation with a Gaussian Mixture Model: a mixture of k Gaussian bumps N(x; µ_j, Σ_j). (Similar to nonparametric RBF → parametric k-RBF-network.)
GMM formula

    N(x; µ_j, Σ_j) = (1/((2π)^{d/2} |Σ_j|^{1/2})) e^{−(1/2)(x − µ_j)ᵗ Σ_j^{−1} (x − µ_j)}

    P(x) = Σ_{j=1}^k w_j N(x; µ_j, Σ_j),    w_j > 0,    Σ_{j=1}^k w_j = 1

You get to pick {w_j, µ_j, Σ_j}_{j=1,…,k}.
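The mixture density can be sketched as follows; the names `gaussian` and `gmm_density` are ours, chosen to mirror the formulas above.

```python
import numpy as np

def gaussian(x, mu, Sigma):
    # Multivariate Gaussian density N(x; mu, Sigma).
    d = len(mu)
    diff = x - mu
    quad = diff @ np.linalg.solve(Sigma, diff)  # (x-mu)^t Sigma^-1 (x-mu)
    norm = (2 * np.pi) ** (d / 2) * np.sqrt(np.linalg.det(Sigma))
    return np.exp(-0.5 * quad) / norm

def gmm_density(x, w, mus, Sigmas):
    # P(x) = sum_j w_j N(x; mu_j, Sigma_j), with w_j > 0 summing to 1.
    return sum(wj * gaussian(x, mu, S) for wj, mu, S in zip(w, mus, Sigmas))
```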
Maximum likelihood

Pick the parameters {w_j, µ_j, Σ_j} that maximize the likelihood of the data. (We saw this when we derived the cross entropy error for logistic regression.)
E-M algorithm

Partition the variables into two sets. Given one set, you can estimate the other. 'Bootstrap' your way to a decent solution.
Parameters given γ_nj

Given the bump memberships γ_nj:

    N_j = Σ_{n=1}^N γ_nj                              ('number' of points in bump j)
    w_j = N_j / N                                     (probability of bump j)
    µ_j = (1/N_j) Σ_{n=1}^N γ_nj x_n                  (centroid of bump j)
    Σ_j = (1/N_j) Σ_{n=1}^N γ_nj x_n x_nᵗ − µ_j µ_jᵗ   (covariance matrix of bump j)
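These four updates translate almost line for line into numpy; `m_step` is our illustrative name (in E-M terminology this is the M-step), with `gamma[n, j]` holding γ_nj.

```python
import numpy as np

def m_step(X, gamma):
    # Recover the bump parameters from the memberships gamma[n, j].
    N, d = X.shape
    Nj = gamma.sum(axis=0)                 # 'number' of points in bump j
    w = Nj / N                             # probability of bump j
    mu = (gamma.T @ X) / Nj[:, None]       # centroid of bump j
    # (1/Nj) sum_n gamma_nj x_n x_n^t - mu_j mu_j^t: covariance of bump j
    Sigma = np.array([(gamma[:, j][:, None] * X).T @ X / Nj[j]
                      - np.outer(mu[j], mu[j]) for j in range(len(Nj))])
    return w, mu, Sigma
```

With hard memberships (each γ_nj either 0 or 1) these reduce to the ordinary cluster size, fraction, mean, and covariance.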
Re-estimating γ_nj

    γ_nj = w_j N(x_n; µ_j, Σ_j) / Σ_{ℓ=1}^k w_ℓ N(x_n; µ_ℓ, Σ_ℓ)

(probability of bump j: w_j; probability density for x_n given bump j: N(x_n; µ_j, Σ_j))
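This update (the E-step) is Bayes' rule with prior w_j and likelihood N(x_n; µ_j, Σ_j); a minimal sketch under our own naming (`e_step`):

```python
import numpy as np

def e_step(X, w, mus, Sigmas):
    # gamma_nj = w_j N(x_n; mu_j, Sigma_j) / sum_l w_l N(x_n; mu_l, Sigma_l)
    def gauss(x, mu, S):
        d = len(mu)
        diff = x - mu
        q = diff @ np.linalg.solve(S, diff)
        return np.exp(-0.5 * q) / np.sqrt((2 * np.pi) ** d * np.linalg.det(S))
    G = np.array([[wj * gauss(x, mu, S) for wj, mu, S in zip(w, mus, Sigmas)]
                  for x in X])
    # normalize each row so the memberships of x_n sum to 1
    return G / G.sum(axis=1, keepdims=True)
```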
E-M Algorithm

1: Start with estimates for the bump memberships γ_nj.
2: Estimate w_j, µ_j, Σ_j given the bump memberships.
3: Update the bump memberships given w_j, µ_j, Σ_j.
4: Iterate to step 2 until convergence.
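The four steps can be combined into one loop. This is a compact, self-contained sketch (`em_gmm` is our name); it starts from random memberships rather than a considered initialization, runs a fixed number of rounds instead of testing convergence, and adds a small ridge to each Σ_j to keep it invertible, none of which is prescribed by the slides.

```python
import numpy as np

def em_gmm(X, k, n_iter=50, seed=0):
    # 1: start with (random) estimates for the bump memberships gamma_nj.
    rng = np.random.default_rng(seed)
    gamma = rng.random((len(X), k))
    gamma /= gamma.sum(axis=1, keepdims=True)
    N, d = X.shape
    for _ in range(n_iter):
        # 2: estimate w_j, mu_j, Sigma_j given the bump memberships.
        Nj = gamma.sum(axis=0)
        w = Nj / N
        mu = (gamma.T @ X) / Nj[:, None]
        Sigma = [(gamma[:, j][:, None] * X).T @ X / Nj[j]
                 - np.outer(mu[j], mu[j])
                 + 1e-6 * np.eye(d)  # ridge keeps Sigma_j invertible
                 for j in range(k)]
        # 3: update the bump memberships given w_j, mu_j, Sigma_j.
        P = np.column_stack([
            w[j] * np.exp(-0.5 * np.sum((X - mu[j]) @ np.linalg.inv(Sigma[j])
                                        * (X - mu[j]), axis=1))
            / np.sqrt((2 * np.pi) ** d * np.linalg.det(Sigma[j]))
            for j in range(k)])
        # 4: iterate (here: a fixed number of rounds).
        gamma = P / P.sum(axis=1, keepdims=True)
    return w, mu, Sigma, gamma
```

Each round cannot decrease the likelihood, but like Lloyd's algorithm the result depends on the initialization.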
GMM on digits