Information Retrieval & Data Mining Universität des Saarlandes, Saarbrücken Winter Semester 2013/14
Chapter VIII: Clustering
Chapter VIII: Clustering*
1. Basic idea
2. Representative-based clustering
2.1. k-means
IR&DM ’13/14 7 January 2014 VIII.1&2
*Zaki & Meira, Chapters 13–15; Tan, Steinbach & Kumar, Chapter 8
$$L_p(\mathbf{u}, \mathbf{v}) = \left( \sum_{i=1}^{d} |u_i - v_i|^{p} \right)^{1/p}$$
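The L_p (Minkowski) distance, the p-th root of the summed p-th powers of coordinate differences, can be sketched in a few lines. This is a minimal illustration; the function name `minkowski_distance` and the check that p is at least 1 are assumptions made for this example, not part of the slides.

```python
def minkowski_distance(u, v, p=2.0):
    """L_p distance between two equal-length sequences of numbers."""
    if len(u) != len(v):
        raise ValueError("u and v must have the same dimension")
    if p < 1:
        raise ValueError("p must be >= 1 for L_p to be a metric")
    # Sum |u_i - v_i|^p over all d coordinates, then take the p-th root.
    return sum(abs(a - b) ** p for a, b in zip(u, v)) ** (1.0 / p)


u, v = (0.0, 0.0), (3.0, 4.0)
print(minkowski_distance(u, v, p=1))  # 7.0 (Manhattan distance)
print(minkowski_distance(u, v, p=2))  # 5.0 (Euclidean distance)
```

Setting p = 1 gives the Manhattan distance and p = 2 the Euclidean distance.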
$$\sum_{i=1}^{d} |u_i - v_i|^{2}$$
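$$\sum_{i=1}^{k} \sum_{x_j \in C_i} \lVert x_j - \mu_i \rVert_2^{2} = \sum_{i=1}^{k} \sum_{x_j \in C_i} \sum_{l=1}^{d} (x_{jl} - \mu_{il})^{2}$$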
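The sum-of-squared-errors objective adds up, over all clusters C_i and all points x_j in C_i, the squared Euclidean distance from x_j to the cluster representative. A minimal sketch follows; the function name `sse` and the list-of-lists layout are illustrative assumptions.

```python
def sse(clusters, centroids):
    """Sum of squared errors of a clustering.

    clusters[i] is the list of points assigned to C_i and
    centroids[i] is the representative mu_i of that cluster.
    """
    total = 0.0
    for points, mu in zip(clusters, centroids):
        for x in points:
            # Inner sum over the d coordinates: (x_jl - mu_il)^2
            total += sum((x_l - mu_l) ** 2 for x_l, mu_l in zip(x, mu))
    return total


clusters = [[(1.0, 1.0), (3.0, 1.0)], [(8.0, 8.0)]]
centroids = [(2.0, 1.0), (8.0, 8.0)]
print(sse(clusters, centroids))  # 2.0
```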
$$S(n, k) = \frac{1}{k!} \sum_{j=0}^{k} (-1)^{j} \binom{k}{j} (k - j)^{n}$$
$$\mu_i = \frac{1}{|C_i|} \sum_{x_j \in C_i} x_j$$
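Computing each centroid as the mean of its cluster's points is one half of the usual k-means iteration; the other half reassigns every point to its nearest centroid. The following is a minimal sketch of that alternating scheme, assuming random initial centroids drawn from the data and a stop once the assignment no longer changes. The names `k_means`, `max_iter`, and `seed` are hypothetical choices for this example.

```python
import random


def squared_distance(x, y):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(data, k, max_iter=100, seed=0):
    """Alternate assignment and centroid update until assignments are stable."""
    rng = random.Random(seed)
    centroids = rng.sample(data, k)  # assumption: initialize from k data points
    assignment = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda i: squared_distance(x, centroids[i]))
            for x in data
        ]
        if new_assignment == assignment:
            break  # no point changed cluster
        assignment = new_assignment
        # Update step: mu_i = mean of the points currently in C_i.
        for i in range(k):
            members = [x for x, a in zip(data, assignment) if a == i]
            if members:  # an empty cluster keeps its old centroid
                centroids[i] = mean(members)
    return centroids, assignment


data = [(0.0,), (1.0,), (2.0,), (100.0,), (101.0,), (102.0,)]
centroids, assignment = k_means(data, k=2)
print(sorted(centroids), assignment)
```

The result depends on the initial centroids, so in practice the procedure is often run from several seeds and the clustering with the lowest sum of squared errors is kept.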
[Figure: k-means example with centroids k1, k2, k3 shown over successive steps; axes: expression in condition 1, expression in condition 2]
[Figure: k-means clustering shown iteration by iteration for two runs (Iterations 1 to 6 and Iterations 1 to 5); axes: x, y]
$$\frac{D(x')^{2}}{\sum_{x \in X} D(x)^{2}}$$
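Choosing a point x' with probability D(x')^2 divided by the sum of D(x)^2 over all x in X is the sampling rule of k-means++ seeding, where D(x) is the distance from x to the nearest centroid chosen so far. A minimal sketch under that reading; the function name `kmeans_pp_init` and the `seed` parameter are assumptions for illustration.

```python
import random


def squared_distance(x, y):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def kmeans_pp_init(data, k, seed=0):
    """Pick k initial centroids from data using D(x)^2-weighted sampling."""
    rng = random.Random(seed)
    centroids = [rng.choice(data)]  # first centroid: uniformly at random
    while len(centroids) < k:
        # D(x)^2: squared distance from x to its nearest chosen centroid.
        d2 = [min(squared_distance(x, c) for c in centroids) for x in data]
        # Sample x' with probability D(x')^2 / sum_x D(x)^2.
        # Note: random.choices raises ValueError if every weight is zero,
        # which happens when data has fewer than k distinct points.
        centroids.append(rng.choices(data, weights=d2, k=1)[0])
    return centroids


data = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
print(kmeans_pp_init(data, k=2))
```

Already chosen points have D(x) = 0 and are therefore never drawn again, while far-away points are favoured as the next centroid.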
[Figure: Original Points vs. K-means (3 Clusters)]
[Figure: Original Points vs. K-means (3 Clusters)]
[Figure: Original Points vs. K-means (2 Clusters)]
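$$f(x \mid \mu_i, \Sigma_i) = (2\pi)^{-d/2} \, |\Sigma_i|^{-1/2} \exp\left\{ -\frac{(x - \mu_i)^{T} \Sigma_i^{-1} (x - \mu_i)}{2} \right\}$$
$$f(x) = \sum_{i=1}^{k} f_i(x) P(C_i) = \sum_{i=1}^{k} f(x \mid \mu_i, \Sigma_i) P(C_i)$$
$$\ln P(D \mid \theta) = \sum_{j=1}^{n} \ln \left( \sum_{i=1}^{k} f(x_j \mid \mu_i, \Sigma_i) P(C_i) \right)$$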
$$P(C_i \mid x_j) = \frac{P(x_j \mid C_i) P(C_i)}{\sum_{a=1}^{k} P(x_j \mid C_a) P(C_a)}$$
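$$f(x \mid \mu_i, \sigma_i^{2}) = \frac{1}{\sqrt{2\pi}\,\sigma_i} \exp\left\{ -\frac{(x - \mu_i)^{2}}{2\sigma_i^{2}} \right\}$$
$$w_{ij} = P(C_i \mid x_j) = \frac{f(x_j \mid \mu_i, \sigma_i^{2}) P(C_i)}{\sum_{a=1}^{k} f(x_j \mid \mu_a, \sigma_a^{2}) P(C_a)}$$
$$\mu_i = \frac{\sum_{j=1}^{n} w_{ij} x_j}{\sum_{j=1}^{n} w_{ij}}$$
$$\sigma_i^{2} = \frac{\sum_{j=1}^{n} w_{ij} (x_j - \mu_i)^{2}}{\sum_{j=1}^{n} w_{ij}}$$
$$P(C_i) = \frac{\sum_{j=1}^{n} w_{ij}}{n}$$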
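The one-dimensional EM updates (posterior weights w_ij in the expectation step, then weighted mean, weighted variance, and prior per cluster in the maximization step) can be sketched as a single function. This is a minimal illustration, assuming a fixed number of iterations instead of a convergence test; the names `em_step` and `min_var`, and the variance floor itself, are assumptions added here to keep the example numerically safe.

```python
import math


def normal_pdf(x, mu, var):
    """Density of the univariate normal N(mu, var) at x."""
    return math.exp(-((x - mu) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_step(xs, mus, variances, priors, min_var=1e-6):
    """One EM iteration for a 1-D Gaussian mixture with k components."""
    n, k = len(xs), len(mus)
    # E-step: w[i][j] = P(C_i | x_j) by Bayes' rule.
    w = [[0.0] * n for _ in range(k)]
    for j, x in enumerate(xs):
        joint = [normal_pdf(x, mus[i], variances[i]) * priors[i] for i in range(k)]
        total = sum(joint)
        for i in range(k):
            w[i][j] = joint[i] / total
    # M-step: re-estimate mean, variance, and prior of every component.
    new_mus, new_vars, new_priors = [], [], []
    for i in range(k):
        w_sum = sum(w[i])
        mu = sum(w_ij * x for w_ij, x in zip(w[i], xs)) / w_sum
        var = sum(w_ij * (x - mu) ** 2 for w_ij, x in zip(w[i], xs)) / w_sum
        new_mus.append(mu)
        new_vars.append(max(var, min_var))  # floor is an assumption of this sketch
        new_priors.append(w_sum / n)
    return new_mus, new_vars, new_priors


xs = [0.8, 1.0, 1.2, 4.8, 5.0, 5.2]
mus, variances, priors = [0.0, 6.0], [1.0, 1.0], [0.5, 0.5]
for _ in range(20):
    mus, variances, priors = em_step(xs, mus, variances, priors)
print(mus, priors)
```

On this toy input the two means settle near 1 and 5 with roughly equal priors. Unlike k-means, every point contributes to every cluster, weighted by its posterior probability.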
[Figure: EM in one dimension. (a) Initialization: t = 0, µ1 = 6.63, µ2 = 7.57. (b) Iteration: t = 1, µ1 = 3.72, µ2 = 7.4. (c) Iteration: t = 5 (converged), µ1 = 2.48, µ2 = 7.56.]
$$\left( \sigma^{i}_{aa} \right)^{2} = \frac{\sum_{j=1}^{n} w_{ij} (x_{ja} - \mu_{ia})^{2}}{\sum_{j=1}^{n} w_{ij}}$$
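When each cluster's covariance matrix is restricted to its diagonal, the variance of dimension a in cluster i is a weighted average of squared deviations along that dimension alone. A minimal sketch for one cluster; the function name `diagonal_variances` and the argument layout are illustrative assumptions.

```python
def diagonal_variances(xs, w_i, mu_i):
    """Weighted per-dimension variances of one cluster.

    xs   : list of d-dimensional points x_j
    w_i  : posterior weights w_ij of this cluster for every point
    mu_i : the cluster mean (mu_i1, ..., mu_id)
    """
    total_w = sum(w_i)
    d = len(mu_i)
    return [
        # Weighted squared deviation along dimension a only.
        sum(w * (x[a] - mu_i[a]) ** 2 for w, x in zip(w_i, xs)) / total_w
        for a in range(d)
    ]


xs = [(0.0, 0.0), (2.0, 4.0)]
w_i = [1.0, 1.0]
mu_i = (1.0, 2.0)
print(diagonal_variances(xs, w_i, mu_i))  # [1.0, 4.0]
```

This needs d variance parameters per cluster instead of a full d-by-d covariance matrix.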
[Figure: EM in two dimensions; axes: X1, X2, f(x). (a) iteration: t = 0. (b) iteration: t = 1. (c) iteration: t = 36.]