Part I: Introductory Materials
Introduction to Data Mining
Dr. Nagiza F. Samatova
Department of Computer Science, North Carolina State University, and Computer Science and Mathematics Division, Oak Ridge National Laboratory
What is common among
Terms:
T1: Bab
T2: Child
T3: Health
T4: Home
T5: Infant
T6: Safety
T7: Toddler
Term-document matrix (documents D1, D2, D3):
T1: 1
T2: 1
T3: 1
T4: 1
T5: 1 1
T6: 1 1
T7: 1 1
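A term-document matrix of this kind can be built by checking, for each term, which documents contain it. The following is a minimal Python sketch under stated assumptions: the three document strings are hypothetical placeholders (the slide's document texts are not shown here), so the resulting matrix is illustrative only and is not meant to reproduce the slide's entries. The function name `term_document_matrix` and the prefix-matching rule are also assumptions made for this example.

```python
def term_document_matrix(terms, docs):
    """Return a binary matrix: rows are terms, columns are documents.

    Entry [i][j] is 1 if any word in document j starts with term i
    (case-insensitive prefix match, a crude stand-in for stemming).
    """
    matrix = []
    for term in terms:
        prefix = term.lower()
        row = []
        for doc in docs:
            words = doc.lower().split()
            row.append(1 if any(w.startswith(prefix) for w in words) else 0)
        matrix.append(row)
    return matrix


# Terms T1..T7 as listed on the slide.
terms = ["Bab", "Child", "Health", "Home", "Infant", "Safety", "Toddler"]

# Hypothetical documents, invented for illustration only.
docs = [
    "infant and toddler first aid",
    "babies and children at home",
    "child safety and health",
]

matrix = term_document_matrix(terms, docs)

# Print one row per term, one column per document (D1, D2, D3).
for i, (term, row) in enumerate(zip(terms, matrix), start=1):
    print(f"T{i} {term:<8}", *row)
```

Prefix matching is used so that a truncated term such as "Bab" matches both "baby" and "babies"; a real pipeline would more likely apply a proper stemmer before counting.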
Credit: Images are from Google Images, found via keyword search
Petabytes Data
Complex regulation; single gene; ~30k genes
Climate
  Now: 20-40 Terabytes/year
  In 5 years: 5-10 Petabytes/year
Fusion
  Now: 100 Megabytes/15 min
  In 5 years: 1000 Megabytes/2 min
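To make these figures comparable, the arithmetic can be sketched in Python. This is a rough calculation, not part of the original slides: it assumes decimal prefixes (1 Petabyte = 1000 Terabytes) and compares the low end with the low end and the high end with the high end of each climate range. The helper name `rate_mb_per_s` is hypothetical.

```python
SECONDS_PER_MINUTE = 60
TB_PER_PB = 1000  # assumption: decimal prefixes


def rate_mb_per_s(megabytes, minutes):
    """Average data rate in Megabytes per second."""
    return megabytes / (minutes * SECONDS_PER_MINUTE)


# Fusion: 100 Megabytes per 15 min now, 1000 Megabytes per 2 min in 5 years.
fusion_now = rate_mb_per_s(100, 15)
fusion_5y = rate_mb_per_s(1000, 2)
fusion_growth = fusion_5y / fusion_now

# Climate: 20-40 Terabytes/year now, 5-10 Petabytes/year in 5 years.
climate_growth_low = 5 * TB_PER_PB / 20
climate_growth_high = 10 * TB_PER_PB / 40

print(f"Fusion now:     {fusion_now:.2f} MB/s")
print(f"Fusion in 5 y:  {fusion_5y:.2f} MB/s")
print(f"Fusion growth:  {fusion_growth:.0f}x")
print(f"Climate growth: {climate_growth_low:.0f}x to {climate_growth_high:.0f}x")
```

Under these assumptions the fusion rate rises from about 0.11 MB/s to about 8.3 MB/s (roughly 75-fold), and the climate volume grows about 250-fold at both ends of the range.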
CPU, Disk, Network Trend
Units: kB/s, GB/$M, MIPS/$M
CPU: every 1.2 years
Disk: every 1.4 years
WAN: every 0.7 years
Src: Richard Mount, SLAC
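The trend figures can be turned into growth factors over a fixed horizon. The slide text says only "every 1.2 years" and so on; the sketch below assumes these are doubling periods, which is an interpretation and not something the slide states. The 5-year horizon and the function name `growth_factor` are also chosen here purely for illustration.

```python
def growth_factor(doubling_years, horizon_years):
    """Growth multiple after horizon_years, given a doubling period."""
    return 2 ** (horizon_years / doubling_years)


# Assumption: the slide's "every N years" values are doubling periods.
doubling_years = {"CPU": 1.2, "Disk": 1.4, "WAN": 0.7}
horizon = 5  # years, chosen for illustration

factors = {
    name: growth_factor(period, horizon)
    for name, period in doubling_years.items()
}

for name, factor in factors.items():
    print(f"{name}: about {factor:.0f}x in {horizon} years")
```

If the doubling reading is right, network capacity per dollar grows far faster over five years than CPU or disk, with disk growing the slowest of the three.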
Latency and Speed – Storage Performance
[Figure: retrieval rate (Mbytes/s) versus log10(object size in bytes)]