Concept and Applications of Data Mining
Week 1
Topics
Introduction
Syllabus
Data Mining Concepts
Team Organization
Introduction Session
Your name and major
The
Data-Mining Applications        Percentage
Banking                         13
Bioinformatics/biotech          10
Direct marketing/fundraising    10
Fraud detection                 9
Scientific data                 9
Insurance                       8
Telecommunication               8
Medical/pharmaceuticals         6
Retail                          6
e-Commerce/Web                  5
Other                           4
Investment/stocks               3
Manufacturing                   2
Security                        2
Supply chain analysis           2
Travel                          2
Entertainment                   1
Source: www.kdnuggets.com
Newsweek, May 22, 2006
Figure 9.14 A chemical database.
Source: Cover page of Advances in Knowledge Discovery and Data Mining, edited by U. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthurusamy, MIT Press
[Diagram: DATA MINING and its related fields: DATABASE TECHNOLOGY, MACHINE LEARNING, STATISTICS & MATH, INFORMATION THEORY, INFORMATION RETRIEVAL, OTHER DISCIPLINES]
Figure 1.1 The evolution of database system technology
Figure 1.4 Data mining as a step in the process of knowledge discovery
Graphical User Interface
Pattern/Model Evaluation
Data Mining Engine
Knowledge-Base
Database or Data Warehouse Server
data cleaning, integration, and selection
Database, Data Warehouse, World-Wide Web, Other Info Repositories
Figure 1.5 Architecture of a typical data mining system
Increasing potential to support business decisions (bottom to top):
Data Sources: Paper, Files, Information Providers, Database Systems, OLTP
Data Warehouses / Data Marts: OLAP (DBA)
Data Exploration: Statistical Analysis, Querying and Reporting
Data Mining: Knowledge Discovery (Data Analyst)
Data Presentation: Visualization Techniques (Business Analyst)
Making Decisions (End User)
<bank-1>
  <customer>
    <customer_name> Hayes </customer_name>
    <customer_street> Main </customer_street>
    <customer_city> Harrison </customer_city>
    <account>
      <account_number> A-102 </account_number>
      <branch_name> Perryridge </branch_name>
      <balance> 400 </balance>
    </account>
    <account>
      …
    </account>
  </customer>
  .
  .
</bank-1>
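As a rough illustration of how semi-structured data like the bank fragment above could be turned into flat records for mining, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. The function name customer_accounts and the record layout are assumptions made for this example, not part of the slides, and only the fully shown customer and account are included (the elided parts are left out rather than guessed).

```python
import xml.etree.ElementTree as ET

# Only the customer and account that the slide shows in full are included;
# the elided "..." parts of the fragment are deliberately omitted.
BANK_XML = """
<bank-1>
  <customer>
    <customer_name> Hayes </customer_name>
    <customer_street> Main </customer_street>
    <customer_city> Harrison </customer_city>
    <account>
      <account_number> A-102 </account_number>
      <branch_name> Perryridge </branch_name>
      <balance> 400 </balance>
    </account>
  </customer>
</bank-1>
"""


def customer_accounts(xml_text):
    """Return one flat record per (customer, account) pair (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    records = []
    for customer in root.findall("customer"):
        name = customer.findtext("customer_name", default="").strip()
        city = customer.findtext("customer_city", default="").strip()
        for account in customer.findall("account"):
            records.append({
                "customer_name": name,
                "customer_city": city,
                "account_number": account.findtext("account_number", default="").strip(),
                "branch_name": account.findtext("branch_name", default="").strip(),
                "balance": int(account.findtext("balance", default="0").strip()),
            })
    return records


records = customer_accounts(BANK_XML)
print(records)
```

Flattening nested elements into one row per account is only one possible design; it repeats customer fields on every row, which is convenient for table-oriented mining tools but loses the nesting.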
TID     List of item_IDs
T100    I1, I2, I5
T200    I2, I4
T300    I2, I3
T400    I1, I2, I4
T500    I1, I3
T600    I2, I3
T700    I1, I3
T800    I1, I2, I3, I5
T900    I1, I2, I3
Table 5.1 Transactional data for an AllElectronics branch
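To make the transactional data in Table 5.1 concrete, the following Python sketch counts how many transactions contain a given itemset (its support count) and lists the itemsets that meet a threshold. The helper names and the threshold of 2 are assumptions chosen for illustration; the slides do not specify them. The search is a plain brute-force enumeration, not an optimized algorithm such as Apriori.

```python
from itertools import combinations

# Transactions copied from Table 5.1.
TRANSACTIONS = {
    "T100": {"I1", "I2", "I5"},
    "T200": {"I2", "I4"},
    "T300": {"I2", "I3"},
    "T400": {"I1", "I2", "I4"},
    "T500": {"I1", "I3"},
    "T600": {"I2", "I3"},
    "T700": {"I1", "I3"},
    "T800": {"I1", "I2", "I3", "I5"},
    "T900": {"I1", "I2", "I3"},
}


def support_count(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    wanted = set(itemset)
    return sum(1 for items in transactions.values() if wanted <= items)


def frequent_itemsets(transactions, size, min_support):
    """Brute force: test every itemset of the given size against min_support."""
    all_items = sorted(set().union(*transactions.values()))
    result = {}
    for candidate in combinations(all_items, size):
        count = support_count(candidate, transactions)
        if count >= min_support:
            result[candidate] = count
    return result


# min_support=2 is an assumed threshold for this example.
singles = frequent_itemsets(TRANSACTIONS, size=1, min_support=2)
pairs = frequent_itemsets(TRANSACTIONS, size=2, min_support=2)
print(singles)
print(pairs)
```

With this data, I2 is the most common single item (7 of 9 transactions), and pairs such as I1 with I4 fall below the assumed threshold because they appear together only once.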
Figure 1.6. Fragments of Relations from a Relational Database for AllElectronics
Figure 1.7 Typical framework of a data warehouse for AllElectronics
Table 3.1 Comparison between OLTP and OLAP systems
Figure 3.4 Star schema of a data warehouse for sales
Table 3.3 A 3-D view of sales data for AllElectronics, according to the dimensions time, item, and location. The measure displayed is dollar_sold (in thousands).
Figure 3.1 A 3-D data cube representation of the data in Table 3.3, according to the dimensions time, item, and location. The measure displayed is dollar_sold (in thousands).
Figure 3.10. Examples of typical OLAP operations on a multidimensional data cube, commonly used for data warehousing
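The figure itself did not survive extraction, so as a stand-in here is a small Python sketch of three common OLAP operations (roll-up by dimension reduction, slice, and dice) on a cube with the dimensions time, item, and location. All cell values, city names, and item names below are hypothetical toy data, not the figures from Table 3.3, and the function names are assumptions made for this example.

```python
from collections import defaultdict

DIMENSIONS = ("time", "item", "location")

# Hypothetical cells: (time, item, location) -> dollar_sold in thousands.
CUBE = {
    ("Q1", "computer", "Vancouver"): 10,
    ("Q1", "phone", "Vancouver"): 4,
    ("Q2", "computer", "Vancouver"): 12,
    ("Q1", "computer", "Toronto"): 7,
    ("Q2", "phone", "Toronto"): 5,
}


def roll_up(cube, keep):
    """Roll up by dimension reduction: sum over every dimension not in keep."""
    positions = [DIMENSIONS.index(name) for name in keep]
    totals = defaultdict(int)
    for key, value in cube.items():
        totals[tuple(key[i] for i in positions)] += value
    return dict(totals)


def slice_cube(cube, dimension, value):
    """Slice: keep the cells where one dimension equals a single value."""
    i = DIMENSIONS.index(dimension)
    return {key: v for key, v in cube.items() if key[i] == value}


def dice(cube, **allowed):
    """Dice: keep the cells whose named dimensions fall in the allowed values."""
    checks = [(DIMENSIONS.index(name), set(values)) for name, values in allowed.items()]
    return {key: v for key, v in cube.items()
            if all(key[i] in values for i, values in checks)}


by_time = roll_up(CUBE, ["time"])
toronto = slice_cube(CUBE, "location", "Toronto")
q1_computers = dice(CUBE, time=["Q1"], item=["computer"])
print(by_time, toronto, q1_computers)
```

A dictionary keyed by coordinate tuples keeps the sketch dependency-free; a real warehouse would typically push these aggregations to the database or data warehouse server.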
Spatio-temporal databases