Selecting Representatives and Being Represented in Unsupervised Learning
- AP & LLE
Xiangliang Zhang (张响亮)
King Abdullah University of Science and Technology
CNCC, Oct 25, 2018 Hangzhou, China
Outline
- Affinity Propagation (AP) [Frey and Dueck, Science, 2007]
- Locally Linear Embedding (LLE) [Roweis and Saul, Science, 2000]
[Frey and Dueck, Science, 2007]
[Frey and Dueck, NIPS 2005]
We describe a new method that, for the first time to our knowledge, combines the advantages of model-based clustering and affinity-based clustering.
Mixture model: $p(x) = \sum_{m=1}^{k} \pi_m \, p(x \mid \theta_m)$, where $\pi_m$ is the mixing coefficient and $p(x \mid \theta_m)$ is the m-th component.

K-means: minimize $\sum_{m=1}^{k} \sum_{x_i \in C_m} (x_i - \mu_m)^2$ where $\mu_m = \frac{1}{|C_m|} \sum_{x_i \in C_m} x_i$ (the center is the cluster mean, which need not be a data point).

K-medoids / K-medians: minimize $\sum_{m=1}^{k} \sum_{x_i \in C_m} (x_i - \mu_m)^2$ where $\mu_m \in \{x_i \mid x_i \in C_m\}$ (the center must be an actual data point; K-medians uses absolute rather than squared distances).
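To make the contrast concrete, here is a minimal NumPy sketch of the two center-update rules (function names and the toy data are illustrative, not from the slides):

```python
import numpy as np

def kmeans_center(cluster):
    """K-means: the center is the mean, which need not be a data point."""
    return cluster.mean(axis=0)

def kmedoids_center(cluster):
    """K-medoids: the center must be a data point; pick the one with the
    smallest total squared distance to the rest of the cluster."""
    d2 = ((cluster[:, None, :] - cluster[None, :, :]) ** 2).sum(-1)
    return cluster[d2.sum(axis=1).argmin()]

cluster = np.array([[0., 0.], [1., 0.], [0., 1.], [1., 1.], [5., 5.]])
print(kmeans_center(cluster))    # [1.4 1.4] -- pulled toward the outlier
print(kmedoids_center(cluster))  # [1. 1.]  -- an actual data point
```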
Data to cluster: x_1, …, x_N. In the mixture view, the likelihood of x_i belonging to the cluster with center x_k plays the role of the component density, and the Bayesian prior probability that x_k is a cluster center plays the role of the mixing coefficient.
The responsibility of the k-th component for generating x_i gives a soft assignment; the algorithm alternates between assigning x_i to a center s_i and choosing new centers.
Message ("responsibility") sent from x_i to each candidate center/exemplar: x_i's preference to be with that exemplar. On its own, this makes a hard decision about cluster centers/exemplars.
Introduce "availabilities": messages sent from exemplars back to the other points, providing soft evidence of the preference for each exemplar to be available as a center for each point.
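These two messages have simple closed-form updates. Below is a minimal NumPy sketch of the responsibility and availability updates of Frey and Dueck (2007), with damping; convergence checks and other practical details are omitted, and the function name is mine:

```python
import numpy as np

def affinity_propagation(S, damping=0.5, iters=200):
    """Minimal AP message passing on similarity matrix S (diagonal = preferences)."""
    N = S.shape[0]
    R = np.zeros((N, N))  # responsibilities r(i,k): sent from points to candidates
    A = np.zeros((N, N))  # availabilities a(i,k): sent from candidates to points
    for _ in range(iters):
        # r(i,k) = s(i,k) - max_{k' != k} [a(i,k') + s(i,k')]
        AS = A + S
        idx = AS.argmax(axis=1)
        first = AS[np.arange(N), idx]
        AS[np.arange(N), idx] = -np.inf
        second = AS.max(axis=1)
        Rnew = S - first[:, None]
        Rnew[np.arange(N), idx] = S[np.arange(N), idx] - second
        R = damping * R + (1 - damping) * Rnew
        # a(i,k) = min(0, r(k,k) + sum_{i' not in {i,k}} max(0, r(i',k)))
        # a(k,k) =         sum_{i' != k}                max(0, r(i',k))
        Rp = np.maximum(R, 0)
        np.fill_diagonal(Rp, R.diagonal())
        Anew = Rp.sum(axis=0)[None, :] - Rp
        diag = Anew.diagonal().copy()
        Anew = np.minimum(0, Anew)
        np.fill_diagonal(Anew, diag)
        A = damping * A + (1 - damping) * Anew
    return (A + R).argmax(axis=1)  # exemplar chosen by each point
```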
Affinities
c_i is the index of the exemplar for point x_i.
The objective function is to maximize the net similarity $S(c) = \sum_{i=1}^{N} s(i, c_i)$.
Constraints: a cluster should not be empty and has a single exemplar; an exemplar must select itself as its own exemplar (if c_j = k for some j, then c_k = k).
Preference (prior): the self-similarity s(k, k), expressing a priori how suitable point k is to be chosen as an exemplar; shared preferences control how many clusters emerge.
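The effect of the preference is easy to see with scikit-learn's AffinityPropagation (the data and parameter values below are illustrative only):

```python
import numpy as np
from sklearn.cluster import AffinityPropagation

# three well-separated 2-D blobs
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(c, 0.3, size=(30, 2)) for c in (0.0, 3.0, 6.0)])

# preference = s(k,k): larger (less negative) values allow more points to
# become exemplars, hence more clusters; scikit-learn defaults to the
# median similarity when preference is None
ap = AffinityPropagation(preference=-50, damping=0.9, random_state=0).fit(X)
print(ap.cluster_centers_indices_)  # indices of the chosen exemplars
print(len(set(ap.labels_)))         # number of clusters found
```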
[Roweis and Saul, Science, 2000]
Saul and Roweis. Think globally, fit locally: unsupervised learning of low dimensional manifolds. Journal of Machine Learning Research, 2003.
Given pairwise similarities between data points, find a low-dimensional embedding that preserves them.
Can we eliminate the need to estimate pairwise distances between widely separated data points?
- Locally, on a fine enough scale, everything looks linear.
- Represent each object as a linear combination of its neighbors.
- Assumption: the same linear representation will hold in the low-dimensional space.
- Find a low-dimensional embedding which minimizes the reconstruction loss (a usage sketch follows below).
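As a usage sketch (parameter values are illustrative), scikit-learn's LocallyLinearEmbedding implements this recipe, e.g. on a swiss-roll manifold:

```python
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import LocallyLinearEmbedding

# a 3-D manifold that is intrinsically 2-D
X, color = make_swiss_roll(n_samples=1000, random_state=0)

# each point is reconstructed from its 12 nearest neighbors; the same
# weights then place the points in the 2-D embedding
lle = LocallyLinearEmbedding(n_neighbors=12, n_components=2)
Y = lle.fit_transform(X)
print(Y.shape)                    # (1000, 2)
print(lle.reconstruction_error_)  # final value of the embedding cost
```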
Reconstruction error: $\varepsilon(W) = \sum_i \big\| x_i - \sum_j W_{ij} \, x_j \big\|^2$
$\Phi(Y) = \sum_i \big\| y_i - \sum_j W_{ij} \, y_j \big\|^2 = \sum_i \sum_j M_{ij} \, (y_i^T y_j)$, where $M = (I - W)^T (I - W)$,
s.t. $\sum_i y_i = 0$ (Y centered) and $\frac{1}{N} \sum_i y_i y_i^T = I$ (unit covariance).
Create a sparse matrix M = (I − W)^T (I − W). Set Y to the eigenvectors corresponding to the bottom d non-zero eigenvalues of M (the bottom eigenvector, with eigenvalue 0, is the constant vector and is discarded).
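A minimal NumPy sketch of this eigenvector step, assuming the weight matrix W has already been computed (dense here for clarity; real implementations keep M sparse):

```python
import numpy as np

def lle_embed(W, d):
    """Embedding step of LLE: bottom eigenvectors of M = (I - W)^T (I - W)."""
    N = W.shape[0]
    IW = np.eye(N) - W
    M = IW.T @ IW
    eigvals, eigvecs = np.linalg.eigh(M)  # eigenvalues in ascending order
    # the bottom eigenvector (eigenvalue ~ 0) is the constant vector, a free
    # translation of the embedding; discard it and keep the next d
    return eigvecs[:, 1:d + 1]
```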
$w_j = \dfrac{\sum_k C_{jk}^{-1}}{\sum_{l,m} C_{lm}^{-1}}$, where $C_{jk} = (x - \eta_j)^T (x - \eta_k)$ and $\eta_j$ is one of x's K nearest neighbors.
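In code, the explicit inverse can be avoided by solving the linear system C w = 1 and rescaling so the weights sum to one, which yields the same w_j. A sketch for a single point (names are illustrative; the small ridge term is a common regularization, since C is singular when K exceeds the input dimension):

```python
import numpy as np

def lle_weights(x, neighbors, reg=1e-3):
    """Reconstruction weights of one point x from its K nearest neighbors."""
    Z = neighbors - x                        # shift x to the origin, shape (K, D)
    C = Z @ Z.T                              # local covariance C_jk, shape (K, K)
    C += reg * np.trace(C) * np.eye(len(C))  # regularize when K > D
    w = np.linalg.solve(C, np.ones(len(C)))  # solve C w = 1
    return w / w.sum()                       # enforce the sum-to-one constraint
```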
SNE allows "many-to-one" mappings in which a single ambiguous object really belongs in several disparate locations in the low-dimensional space, while LLE makes a one-to-one mapping. p_{j|i} is the asymmetric probability that point i would pick point j as its neighbor (Gaussian neighborhood in the high-dimensional space).
q_{j|i} is the induced probability that point i picks point j as its neighbor (Gaussian neighborhood in the low-dimensional space).
t-SNE uses a Student-t distribution (heavier tails) rather than a Gaussian to compute the similarity between two points in the low-dimensional space, and a symmetrized version of the SNE cost function with simpler gradients.
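As a sketch (data and parameter values are illustrative), the Student-t similarities are simple to write down next to scikit-learn's TSNE:

```python
import numpy as np
from sklearn.manifold import TSNE

X = np.random.default_rng(0).normal(size=(500, 50))

# perplexity sets the effective size of the Gaussian neighborhood
# used for p_{j|i} in the high-dimensional space
Y = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)

# low-dimensional similarities use a Student-t kernel with one degree of
# freedom: q_ij is proportional to (1 + ||y_i - y_j||^2)^(-1)
d2 = ((Y[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
Q = 1.0 / (1.0 + d2)
np.fill_diagonal(Q, 0.0)  # q_ii is defined to be 0
Q /= Q.sum()              # normalize over all pairs
```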
Jing Chen and Yang Liu. Locally linear embedding: a survey. Artificial Intelligence Review (2011)
[Chen and Liu, 2011]
[Ting and Jordan, 2018]
Lab of Machine Intelligence and kNowledge Engineering (MINE): http://mine.kaust.edu.sa/