  • Data Mining Lecture 1


  • !!! "# $ %&'
  • ()*+,,-).,-

+)/.+,*))00&,-- )-)

#1!!

2 3 4(1!! $+5-,+,1!!


  • ,-,

/-+--6 789--(

  • .-

29--&8

  • &:-+)

2--,+) ;(<8(),,)- 2+,(,--,=''>



  • -,,-.

(% .,+,?! (- -!


!

  • --
  • )
  • @-A

,

  • ,(

,

  • ,

,

  • 2,

,

  • A

,

  • 7+

,

,

  • 3.)&.-

,


"#$%

.7)-B 7-B %C*#-(-B %(,) 2,,B ,((-) D-


&

C,,);-<*?( .--*.,-( @(? *() +,,-**,,+( ,),*),.,-)! E8,%.-F;),<%#

0.)G'?--(;8) <+)#'-& -+,#! 2+ '''(-,,'-,)-

  • ,-.=3!7-(-

)--+.!, ;$,,,).=3,6<


'(!)

%8,+, 2--,,,-

  • +%@,),--

(--( %@ --* 7-*-+.(#*,-6


!*+% 5(-+ ,,--*-

7+-& # +- @# --

  • .+-

*(, .3

3.-+H-.(- !!F, !*+% %,,---

  • 4@

,, ,# ) 8- (, +)(-

$-,A (+,(*- %),

,()-- )(


",

  • $((-- --,).-
  • ,))#*#-.(,(
  • (-.,)H-,,

[Figure: The Data Gap. Total new disk (TB) since 1995 and number of analysts, 1995 to 1999; vertical axis from 500,000 to 4,000,000.]

From: R. Grossman, C. Kamath, V. Kumar, “Data Mining for Scientific and Engineering Applications”


*

%#*,--.)-+

=((8(.,-&., ,.,)#* -,,)(, ((-,-+

2,.-;-<

*,--.)-+%% #*,-8- ,)- ,)---(. +,,!

7-B

%-.A) =8),,5 ,!


  • I)

7,,-(- I5

I)

3,)-(- /;--<A),

%

C,

%

/,

C

3 +(-+

C

EHH) /+(-+


.!-

%+C5$3 C523I)

*)(--=2,,(( ''*-.!%#B E-,,=2*.+(* '!'''#! E-,,=2*.+(!

%I

7=2-,(,( (''B,(

  • (),,+)(=2(*.,+)

!, 7-(=2(A,)+) .)(B, 7(A,)+A( (((BA


!*/

! %+,)--

#,)- #, #+#,),,# F#,)- E.--* A,),.,) E-:-

  • C2,

$8 (, -,(

7+ @,):@&(%/2-,) 2)


!012

7-(,)B

  • -,),)--

,,,,+,,(),-

$#

E-, (;-,<* ,.,-+!

%.

.(,D+#!

&#,)

2 &,+*-, 3-+-(


!032

(,

  • -()*)(+)*
  • , ,(
  • ()A
  • ();+<-(-((
  • A(-*(*,,

*

3.-)(

.,&-,) ,)(-,-)- .


!

E,-.,

(,*,)-- ,,)., &,-,)(,&- ,)!

F,

H---

  • #-

,-,&+-

  • ),).#


4012

2,

*-,)-,,--. ,-(-!

2

,-+,--,((-,+.- ,)-,-(),

=8,

  • (,*-

,, ),--)

  • ,-(,(
  • ((


%-,

2,,-( )+,#*A- .-2,J )!

%,(-

$,,,-,-(,,- (-)*#!2,)H-.( 8-! @$,-(--(,,* (A&,,,,)+,- +#,,,-,,(-!

F,

2,)KL(,#-

  • ,)!

4032


  • @2-.- ,)H-/@2

+,#--(,.-. (/*M# # -

2)

N35-3,C+.)-.-A *,(-

7+(&2-

@(&2- ,-,7+ ,(#&,--. (-+.,)H((.( 7+#.7+H!

$


56')

  • %-(+#&,,,*+#
  • -*,-*!
  • 3(.,+.,(--

#,,!

  • ;%-+<!3,(#,.

,!

  • #)-,#).),#)+D+)
  • ,.,-((+-,,*-
  • +*,8+)-)#-

(,,+D!

  • ()(,* *-

,,*--.)((( )(-+!


7!!

4,$-,,8)(#)+D ,,).,,)(+-, .)(3,C+.)

'''*'O'8'O'8,

2

+(& O'(+D

  • ,,+-(

),-(->*-&(A ((+D-((,(-6

From [Fayyad et al.], Advances in Knowledge Discovery and Data Mining, 1996

!86

Early Intermediate Late

  • !"

#$ !$ %!% &''()


  • 9

9%!0927 ( (-(,(--! 7 0(,8 (--.-+)%%!


79

7 C+-(.! 7 ,-! :7 .(! $(*(! 7 C+--,! ;57 3, (,!


  • 5,-

,.#*,--,(,

  • -,
  • %, -)#>'L(((6
  • %-- (

E-(,(-,) .+,-. !

  • ((-

H,(,!

  • ,
  • %((
  • 3.,- #*,-

.,H(.--!

  • 0(-.-#*,-

9


9567"!

  • 7

,,---,

  • 7
  • F.-()0F5
  • F.,
  • :7

H-

  • 7
  • ()-
  • ;57
  • ()--,)(A,)-A!
  • <7

3,H


9

C.( C, P,H 5% %,) ,-% % ,.% /)% % 2,


:!!

[Figure: diagram with the labels "Data Warehouse", "Data cleaning & data integration", "Filtering", and "Databases".]


7$9* F,,-+ %* $,-+ 2-.-%@-(

C+D&--+D&,,-+ ,-,- $&--- $8-+-,--+

  • ,)-+

777


:


:

  • (-,

– .-, – 3 – 3-

  • ,.,-
  • .+,!

,-

,!

0.-, 3

=-

+*- ,-!

H 4,H

"! .

,-!

2((),) 2, A,,)

  • A,

!


567:!

=8,## 3-(., %,. ,()+.


4012

  • H-
  • 4,HH--

!!-).!*

2 ,-,)

,&-,.!,&-, Q;'!!G<:,)Q;'!!G< +)Q;3,$P< R9L(-9>'LS +)$;%P%< +)$;%P%.< R9L(-9TLS


4032

,(-3-

E--,(-+-- ,((- =!!,()+-,,() +-, 3-&,(,, *# 3-3-#*,.,

,,)

,,+,#*4-(*,!! ,(--+ ,+-,8H&, ,)-H,,)


40>2

C,,)

C,-+D-,)*, +.(- +--8+A(, (--.,)

$--.,,)

$---.,) A,-),) ,)&+-,)

C&--,,)


')*

  • 2-) A))-(

,,(!

  • &-A)&+-(-
  • 2(

,)-- +).,-*- *

  • (),,)(,.,.,-

) #(

  • $?-?7

C+D. +--(!! (-! +D. +-?+,(-!!8- .,)+,)!


4$!*

E-,,,

  • )(-,, B

2.!,(.!,

(,)CH

  • )(-,) B

2

E,,-(, ! 4,)UA) H


  • 2*,,-;,<

F,,- ,*# P,H %+ **,,8- ,)(-+.!


  • ,+,)

F,7,-% 0- =(0


7

  • !

"

  • #

$%

  • Bayes Theorem
  • Regression Analysis
  • EM Algorithm
  • K-Means Clustering
  • Time Series Analysis
  • Relational Data Model
  • SQL
  • Association Rule Algorithms
  • Data Warehousing
  • Scalability Techniques
  • Neural Networks
  • Decision Tree Algorithms
  • Similarity Measures
  • Hierarchical Clustering
  • IR Systems
  • Imprecise Queries
  • Textual Data
  • Web Search Engines
  • Algorithm Design Techniques
  • Algorithm Analysis
  • Data Structures


7 4,(,)

%.- 3-.-

%((.*-((,(

  • (-++-
  • (#*,-+-.-
  • (A,H-
  • (,--


,+%

  • F,,,+D&-+D&,,

.,&8,&- ,)777!

  • 9%

H-,( ,--.-,,)! ,, -(-,,,.,

  • : =

%+&--*C523 ,.,H,*#!

  • F,,+#(-,)%/2

##,)7+7+&,,)!


  • ,)-
  • ((#-(#*,--+

.( #*,-,,,.,( + (+#-#*,- %A),--&- =8-.,H(-,

  • ,-,-

3.,+,

3(-,+,)

=(()-,+,)(-, 3,,,-+--,-

?012


,-.)(-)

  • ,,,-,8)(-

((-+-,+, ()777

,-,-,

2,(-.-#*,-

%&(-, ,,A)* 3,--#

(-.-#*,-*8 #*,-2#*,-(+, 3(-))-.)

?032


$ ,,(%

3.) 3(, 0H-

%

0(, F.FC 2) ,8) 7*,,,)- $


!

  • %((,)-.(

,(-

  • 2,.,(-+,)--*

*-(,

  • 2%%,--,--

,(-.,- #*,-

  • +(-.)((
  • %(,H-

,(,,--,) !

  • ,((-)
  • D-