1 #" - - PDF document


Data Mining: Clustering 52

  • !"

# $% # &%'

Data Mining: Clustering 53

  • "(
  • )(

# *!" +

[Figure: dendrogram (leaves 1, 3, 2, 5, 4, 6; axis ticks 0.05 to 0.2) and nested clusters of points 1 to 6]

Data Mining: Clustering 54

  • ),+)-

+"

# " ./ ++)

+"

# '0+

,112*2+ 23-

Data Mining: Clustering 55

  • 4+"

# )

4+) +2+" ,*-"

# ))

42) +2++ ,*-

# 5+

Data Mining: Clustering 56

5++! "4

61 ++00 71 8+

  • 91

54 :1 &++00

  • ++"+0"

4

""++"4 ""

!"!"

Data Mining: Clustering 57

  • 4")

++00

[Figure: points p1 to p5 and their Proximity Matrix (rows and columns p1, p2, p3, p4, p5, ...); points p1, p2, p3, p4, ..., p9, p10, p11, p12]

Data Mining: Clustering 58

#"

  • "+2

43

[Figure: clusters C1 to C5 and their Proximity Matrix (rows and columns C1 to C5); points p1, p2, p3, p4, ..., p9, p10, p11, p12]

Data Mining: Clustering 59

#"

  • 44

,7:- ++001

[Figure: clusters C1 to C5 and their Proximity Matrix (rows and columns C1 to C5); points p1, p2, p3, p4, ..., p9, p10, p11, p12]

Data Mining: Clustering 60

!$

  • !;44

++00<=

[Figure: clusters C1, C3, C4 and the merged cluster C2 U C5; in the Proximity Matrix, the seven entries in the row and column of C2 U C5 are shown as "?"; points p1, p2, p3, p4, ..., p9, p10, p11, p12]
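The "?" entries in the figure are the proximities between the merged cluster C2 U C5 and every other cluster, which have to be recomputed after the merge. The sketch below shows one way to do that for a distance matrix. The function name `merge_clusters`, the `combine` parameter, and all numeric distances are hypothetical and chosen only for illustration; `combine=min` corresponds to MIN (single link) and `combine=max` to MAX (complete link).

```python
def merge_clusters(labels, prox, a, b, combine=min):
    """Return (labels, matrix) after merging clusters a and b.

    prox is a symmetric distance matrix; the merged cluster is appended
    as the last row and column.
    """
    ia, ib = labels.index(a), labels.index(b)
    keep = [k for k in range(len(labels)) if k not in (ia, ib)]
    new_labels = [labels[k] for k in keep] + [f"{a} U {b}"]
    # These are the "?" entries: distance from the merged cluster to each other cluster.
    merged_row = [combine(prox[ia][k], prox[ib][k]) for k in keep]
    new_prox = [
        [prox[r][c] for c in keep] + [merged_row[n]]
        for n, r in enumerate(keep)
    ]
    new_prox.append(merged_row + [0.0])
    return new_labels, new_prox


# Hypothetical distances between five clusters (not taken from the slides).
LABELS = ["C1", "C2", "C3", "C4", "C5"]
DIST = [
    [0, 4, 7, 3, 6],
    [4, 0, 5, 8, 1],
    [7, 5, 0, 2, 9],
    [3, 8, 2, 0, 5],
    [6, 1, 9, 5, 0],
]

labels_min, prox_min = merge_clusters(LABELS, DIST, "C2", "C5", combine=min)
labels_max, prox_max = merge_clusters(LABELS, DIST, "C2", "C5", combine=max)
print(labels_min)     # ['C1', 'C3', 'C4', 'C2 U C5']
print(prox_min[-1])   # [4, 5, 5, 0.0]
print(prox_max[-1])   # [6, 9, 8, 0.0]
```

Only the row and column of the merged cluster change; every other entry is copied unchanged, which is why the figure marks just those cells.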

Data Mining: Clustering 61

%&

'( 4+ "'( 4+ !'( )4+ ( 4

Data Mining: Clustering 62

!" 8* 5 ,5+ -8* +8* )8*

Data Mining: Clustering 63

"($#)' "44 ,-+""

  • # +"+2112*

+0+1

      I1    I2    I3    I4    I5
I1  1.00  0.90  0.10  0.65  0.20
I2  0.90  1.00  0.70  0.60  0.50
I3  0.10  0.70  1.00  0.40  0.30
I4  0.65  0.60  0.40  1.00  0.80
I5  0.20  0.50  0.30  0.80  1.00

[Figure: dendrogram with leaves 1 to 5]
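As a rough illustration, the similarity matrix above can be clustered bottom-up by repeatedly merging the two most similar clusters. The sketch below is not taken from the slides: the function names and the `linkage` strings are hypothetical, and it assumes that MIN, MAX and Group Average (the labels used in the comparison figure later in the deck) mean single link, complete link and average link. Because the matrix holds similarities rather than distances, MIN (closest pair) picks the largest pairwise similarity and MAX (farthest pair) the smallest.

```python
from itertools import combinations

# Similarity matrix for items I1..I5 (indices 0..4), as given in the slide.
SIM = [
    [1.00, 0.90, 0.10, 0.65, 0.20],
    [0.90, 1.00, 0.70, 0.60, 0.50],
    [0.10, 0.70, 1.00, 0.40, 0.30],
    [0.65, 0.60, 0.40, 1.00, 0.80],
    [0.20, 0.50, 0.30, 0.80, 1.00],
]


def cluster_similarity(a, b, sim, linkage):
    """Similarity between clusters a and b (tuples of item indices)."""
    pair_sims = [sim[i][j] for i in a for j in b]
    if linkage == "min":       # single link: most similar pair
        return max(pair_sims)
    if linkage == "max":       # complete link: least similar pair
        return min(pair_sims)
    if linkage == "average":   # group average: mean over all pairs
        return sum(pair_sims) / len(pair_sims)
    raise ValueError(f"unknown linkage: {linkage}")


def agglomerate(sim, linkage):
    """Merge the two most similar clusters until one remains.

    Returns a list of (cluster_a, cluster_b, similarity) merge steps.
    """
    clusters = [(i,) for i in range(len(sim))]
    merges = []
    while len(clusters) > 1:
        a, b = max(
            combinations(clusters, 2),
            key=lambda pair: cluster_similarity(pair[0], pair[1], sim, linkage),
        )
        merges.append((a, b, cluster_similarity(a, b, sim, linkage)))
        clusters = [c for c in clusters if c not in (a, b)]
        clusters.append(tuple(sorted(a + b)))
    return merges


for name in ("min", "max", "average"):
    print(name, [round(s, 4) for _, _, s in agglomerate(SIM, name)])
# min     [0.9, 0.8, 0.7, 0.65]
# max     [0.9, 0.8, 0.3, 0.1]
# average [0.9, 0.8, 0.4875, 0.375]
```

All three linkages start by merging I1 with I2 and then I4 with I5; they differ in where I3 ends up, which is the kind of difference the dendrograms on the following slides show.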


Data Mining: Clustering 64

> ? : ? ? * ? > 6 9 7 % : 6 > 7 7

  • ?

9 7 > 6 & ? 7 7 6 > ! * %

  • &

!

[Figure: points A, B, C, D, E and their dendrogram (leaves A, B, C, D, E; levels 1 to 5) with a "Threshold of" cut line]

!"$#)*+"

Data Mining: Clustering 65

($#)

[Figure: Nested Clusters (points 1 to 6) | Dendrogram (leaves 3, 6, 2, 5, 4, 1; axis ticks 0.05 to 0.2)]

Data Mining: Clustering 66

$#)

[Figure: Original Points | Two Clusters]

  • Can handle non-elliptical shapes

Data Mining: Clustering 67

"$#)

[Figure: Original Points | Two Clusters]

  • Sensitive to noise and outliers
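The sensitivity to noise can be made concrete with a small sketch. It assumes this slide refers to MIN (single link), the linkage usually associated with this weakness; the slide title itself is not legible in this copy. Cutting a single-link dendrogram at a distance threshold gives the connected components of the graph that links every pair of points no farther apart than that threshold, so a thin chain of noise points is enough to join two well-separated groups. The function name and the 1-D data are hypothetical.

```python
def single_link_clusters(points, threshold):
    """Clusters obtained by cutting a single-link dendrogram at `threshold`.

    Implemented as connected components: two 1-D points are linked when
    their distance is at most `threshold`.
    """
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if abs(points[i] - points[j]) <= threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i, p in enumerate(points):
        groups.setdefault(find(i), []).append(p)
    return sorted(groups.values())


clean = [0, 1, 2, 10, 11, 12]        # two well-separated groups
bridge = [3.5, 5, 6.5, 8, 9]         # a few noise points between them

print(single_link_clusters(clean, 1.5))
# [[0, 1, 2], [10, 11, 12]]
print(len(single_link_clusters(clean + bridge, 1.5)))
# 1  (the noise points chain the two groups together)
```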

Data Mining: Clustering 68

"($!,"' "4 4,-+ ""

# +"+4

      I1    I2    I3    I4    I5
I1  1.00  0.90  0.10  0.65  0.20
I2  0.90  1.00  0.70  0.60  0.50
I3  0.10  0.70  1.00  0.40  0.30
I4  0.65  0.60  0.40  1.00  0.80
I5  0.20  0.50  0.30  0.80  1.00

[Figure: dendrogram with leaves 1 to 5]

Data Mining: Clustering 69

($!,

[Figure: Nested Clusters (points 1 to 6) | Dendrogram (leaves 3, 6, 4, 1, 2, 5; axis ticks 0.05 to 0.4)]


Data Mining: Clustering 70

$!,

[Figure: Original Points | Two Clusters]

Data Mining: Clustering 71

"$!,

[Figure: Original Points | Two Clusters]

  • '

&-

Data Mining: Clustering 72

"(.!

  • 0"4)"+4

+04+41

  • ))"

+0")

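$$\mathrm{proximity}(C_i, C_j) = \frac{\sum_{p \in C_i,\; q \in C_j} \mathrm{proximity}(p, q)}{|C_i| \cdot |C_j|}$$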

      I1    I2    I3    I4    I5
I1  1.00  0.90  0.10  0.65  0.20
I2  0.90  1.00  0.70  0.60  0.50
I3  0.10  0.70  1.00  0.40  0.30
I4  0.65  0.60  0.40  1.00  0.80
I5  0.20  0.50  0.30  0.80  1.00

[Figure: dendrogram with leaves 1 to 5]

Data Mining: Clustering 73

(.!

[Figure: Nested Clusters (points 1 to 6) | Dendrogram (leaves 3, 6, 4, 1, 2, 5; axis ticks 0.05 to 0.25)]

Data Mining: Clustering 74

(.!

  • +4

+8*

  • #

8+

  • 8

# 4

Data Mining: Clustering 75

"(/$ "4 !44

  • # +)"4+

!

8+ 4 "

# (


Data Mining: Clustering 76

("

[Figure: nested clusters of the same six points under MIN, MAX, Group Average, and Ward's Method]

Data Mining: Clustering 77

("01"

@,7-++0 01

# "+1

@,?-

# ++(272 +00+ # +0@,7 ,-- "++

Data Mining: Clustering 78

(2-""

@4 2 A)"( "")+4 ""4

# ) # """"( )0+ # *

Data Mining: Clustering 79

23"4 &4 +++ )+ "+2 + "2* B "++

Data Mining: Clustering 80

"2 !" 5 5+ !' 5 55

# 8%8%++ # 8%8++ %(

Data Mining: Clustering 81

> ? : ? ? * ? > 6 9 7 % : 6 > 7 7

  • ?

9 7 > 6 & ? 7 7 6 > ! * %

  • &

!

B A E C D

$*+"


Data Mining: Clustering 82

$!"

Data Mining: Clustering 83

"1*3*4

Data Mining: Clustering 84

1*!"

Data Mining: Clustering 85

))- $)0

  • $

22" 044

Data Mining: Clustering 86

))-!"

Data Mining: Clustering 87

2!$(2!$ 356$4 4 @"++ ++0 4 '+2

  • $"*

4 0"+ 01


Data Mining: Clustering 88

2!$

+2" )+)1 A # "A 4 4++ 41

Data Mining: Clustering 89

2!$!"

Data Mining: Clustering 90

2!$*+"

Data Mining: Clustering 91

!!!!) 8% 8% ++

61 "5+/ "24C/CDDCC 71 ""5

8%8++ %(

# $+)8%4 )+*+/"

Data Mining: Clustering 92

%&!)(%& !) " @4""" 4+4+

# $2# "+ # * # "+ +4

"2*2+ "1

Data Mining: Clustering 93

%&!)*+"


Data Mining: Clustering 94

%&!)*+"

Data Mining: Clustering 95

%&!)*+"

Data Mining: Clustering 96

%&!)

# E"+4+" ,'+- # ++ " +""+,5-4'+

+"

# + "45 4'+2 "+ # + ++ +1

Data Mining: Clustering 97

%&!)%

*6-( 4'+" +1 ( 4 '+ ,5-"+"

  • %6-( ++

"+!" 4,'+-!+1 %6-( + "+"+" "+1

Data Mining: Clustering 98

%%&!)

[Figure (MinPts = 4, radius Eps): Eps-neighborhood | Core points & Border points | Density reachable (points p and q)]

Data Mining: Clustering 99

%&!)!" '+ "+


Data Mining: Clustering 100

%&!)(7&)2

[Figure: Original Points | Point types: core, border and noise (Eps = 10, MinPts = 4)]
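The three point types in the figure can be computed directly from Eps and MinPts. The sketch below follows the usual DBSCAN definitions and is an illustration rather than the slides' own code: a point is core if its Eps-neighborhood holds at least MinPts points, border if it is not core but lies in the Eps-neighborhood of a core point, and noise otherwise. It assumes that a point counts as a member of its own neighborhood; the function name and the sample coordinates are hypothetical.

```python
from math import dist


def classify_points(points, eps, min_pts):
    """Label each point as 'core', 'border' or 'noise'."""
    # Eps-neighborhood of every point (assumption: includes the point itself).
    neighbors = [
        [j for j, q in enumerate(points) if dist(p, q) <= eps]
        for p in points
    ]
    core = {i for i, nb in enumerate(neighbors) if len(nb) >= min_pts}

    labels = []
    for i, nb in enumerate(neighbors):
        if i in core:
            labels.append("core")
        elif any(j in core for j in nb):
            labels.append("border")   # not dense itself, but next to a core point
        else:
            labels.append("noise")
    return labels


# Hypothetical data: a unit square, one point at its edge, one far away.
pts = [(0, 0), (0, 1), (1, 0), (1, 1), (2.4, 1), (8, 8)]
print(classify_points(pts, eps=1.5, min_pts=4))
# ['core', 'core', 'core', 'core', 'border', 'noise']
```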

Data Mining: Clustering 101

/%&!)/'/

[Figure: Original Points | Clusters]

  • Resistant to Noise
  • Can handle clusters of different shapes and sizes
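A minimal sketch of density-based clustering with the Eps and MinPts parameters (the DBSCAN scheme) shows where both properties come from: clusters grow outward from core points, so an elongated chain is followed as easily as a compact blob, and points that no core point reaches are left unassigned. This is an illustrative version under the usual definitions, not the slides' own pseudocode; the function name, the `-1` noise label, and the sample data are assumptions.

```python
from math import dist


def dbscan(points, eps, min_pts):
    """Return one cluster id per point; -1 marks noise."""
    n = len(points)
    neighbors = [
        [j for j in range(n) if dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    is_core = [len(nb) >= min_pts for nb in neighbors]

    labels = [None] * n          # None = not yet assigned
    cluster_id = 0
    for i in range(n):
        if labels[i] is not None or not is_core[i]:
            continue
        # Start a new cluster at an unassigned core point and expand it.
        labels[i] = cluster_id
        frontier = [i]
        while frontier:
            p = frontier.pop()   # p is always a core point
            for q in neighbors[p]:
                if labels[q] is None:
                    labels[q] = cluster_id
                    if is_core[q]:
                        frontier.append(q)   # only core points keep expanding
        cluster_id += 1
    return [-1 if label is None else label for label in labels]


chain = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)]     # elongated cluster
blob = [(10, 10), (10, 11), (11, 10), (11, 11)]      # compact cluster
outlier = [(20, 0)]

print(dbscan(chain + blob + outlier, eps=1.5, min_pts=3))
# [0, 0, 0, 0, 0, 1, 1, 1, 1, -1]
```

The two end points of the chain are border points: they join cluster 0 because a core point reaches them, but they do not extend it further.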

Data Mining: Clustering 102

/%&!)%)/'/

[Figure: Original Points | (MinPts=4, Eps=9.75) | (MinPts=4, Eps=9.92)]

  • Varying densities
  • High-dimensional data

Data Mining: Clustering 103

%&!)(%"*2$2

$"+2*

  • +)* "
  • 2+")+*
Data Mining: Clustering 104

%- 5 4 +"" +"2++

  • # $%

# &%'

Data Mining: Clustering 105

%-(%8 @,-" @

# +4

+2+2 $

# +

4*4 ""!,11 +- +


Data Mining: Clustering 106

&#(&#

  • $22

)"

  • '

" 4 +( )

# 8⇒

Data Mining: Clustering 107

8

  • F +,282-

# "+ # 8"+ # "!"+

  • F

# # F+" # 8"+F)"

  • # 0

Data Mining: Clustering 108

&#!"

Data Mining: Clustering 109

&#(#"

61 F1 $"""F 4)2) 4F1 71 ++++++ "F1" +"1 ?1 +,4+- ++4 1

Data Mining: Clustering 110

*( * &++ " %+)+4

  • 4# )4

# '4)4 # @)"4+

Data Mining: Clustering 111

*!


Data Mining: Clustering 112

*!"

Data Mining: Clustering 113

*%- 61 @+" 71 ++ + ?1 ++ 91 %)(" :1 ++ ,+)- G1 * + +1 +4 +)+1

Data Mining: Clustering 114

"1

Data Mining: Clustering 115

9

F+)"4))" )4

# 2+2

F2!4 );="< ;"=H