SLIDE 1
Document clustering is the process of organizing documents into clusters so that
- documents within a cluster are highly similar to one another,
- but are very dissimilar to documents in other clusters.
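As a minimal sketch of what "similar" and "dissimilar" can mean here: the slide does not say which similarity measure the authors use, so the example below assumes cosine similarity over raw term counts, a common choice for text. The three sample documents and the helper names `term_vector` and `cosine_similarity` are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def term_vector(text):
    """Turn a document into a bag-of-words vector of term counts."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two term-count vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical toy corpus: d1 and d2 share a topic, d3 does not.
docs = {
    "d1": "parallel clustering of large document collections",
    "d2": "clustering large document collections in parallel",
    "d3": "recipe for tomato soup with basil",
}
vectors = {name: term_vector(text) for name, text in docs.items()}

# Similarity between two documents that belong in the same cluster.
sim_same = cosine_similarity(vectors["d1"], vectors["d2"])
# Similarity between documents that belong in different clusters.
sim_other = cosine_similarity(vectors["d1"], vectors["d3"])

print(f"d1 vs d2: {sim_same:.3f}")
print(f"d1 vs d3: {sim_other:.3f}")
```

A good clustering would place d1 and d2 together, since they share most of their terms, and keep d3 apart, since it shares none with d1.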
Parallel Clustering of Large Document Collections
Xiaohu Li, Deyun Gao, Zheyuan Yu
31 July 2003
[Figure: Running Time (s) versus Document Size. Labels: 98×1004, 185×1328, 2340×1458]