MIDAS: An Execution-Driven Simulator for Active Storage Architectures


slide-1
SLIDE 1

MIDAS:

An Execution-Driven Simulator for Active Storage Architectures

Shahrukh R. Tarapore

Lockheed Martin Advanced Technologies Lab

Clinton W. Smullen, IV and Sudhanva Gurumurthi

Department of Computer Science, University of Virginia

slide-2
SLIDE 2

Outline

  • Growth of unstructured data-processing
  • Embarrassingly data-parallel
  • Active storage architectures
  • Small-scale data-parallel systems
  • Need a simulation infrastructure

slide-6
SLIDE 6

Growth of Data

  • Cost of storage continuously dropping
  • Growth of devices producing content
  • Study from IDC on the “digital universe” (Source: IDC)

slide-9
SLIDE 9

What is the data?

  • Majority of data is unstructured
  • Images
  • Audio
  • Video
  • Free-form text (books, email)
  • Need the ability to process this data

slide-10
SLIDE 10

What is the difference?

  • Unstructured data can be given metadata
  • Labor intensive (difficult to automate)
  • Imperfect - users want freedom to search
  • Unstructured search is data-intensive

slide-12
SLIDE 12

Workloads

  • Move data
  • Scan data - nearest neighbor search
  • Transform data - image edge detection
  • Developed benchmark suite
  • Presented at SNAPI ’07

slide-13
SLIDE 13

Unstructured-Data Processing Opportunities

  • Unstructured data processing has heavy I/O
  • Must scan large amounts of data
  • Embarrassingly data parallel
  • Parallel operations, MapReduce, etc.
  • Multi/manycore increases I/O demands further

slide-14
SLIDE 14

Unstructured-Data Processing Problems

  • Emerging workloads are very data parallel
  • Still moving data from storage to CPU
  • Processors consume lots of power
  • Data movement is ‘wasted’ power
  • Can systems target these workloads?

slide-17
SLIDE 17

Typical Approaches

  • Cluster+SAN
  • GFS+MapReduce

slide-18
SLIDE 18

Storage-centric Computing

Can we use disk drive and array controller processors to execute workloads?

slide-20
SLIDE 20

Active-Storage Architectures

  • Move computation to disk drives and array controllers
  • What is the performance?
  • What is their power consumption?
  • What are the microarchitectural tradeoffs?

slide-21
SLIDE 21

MIDAS

  • Both in-order and out-of-order cores
  • Hard disk timing model
  • Interconnect modeling
  • RPC programming model [Sivathanu02]

slide-22
SLIDE 22

Programming Model

Figure: timeline of the messages AS_COMPUTE_REQUEST, AS_DATA_READY, and AS_COMPUTE_DONE exchanged between the PE requesting computation and the PE performing computation.
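
The three message names on this slide can be written down as a small protocol sketch. The slide text does not say which PE sends which message or what each one carries, so the ordering below (request, then data ready, then done, read left to right along the time axis) is an assumption, and `ASMessage`, `EXPECTED_ORDER`, and `is_valid_exchange` are hypothetical names, not part of MIDAS.

```python
from enum import Enum, auto


class ASMessage(Enum):
    """Message names taken from the slide; their semantics here are assumed."""
    AS_COMPUTE_REQUEST = auto()
    AS_DATA_READY = auto()
    AS_COMPUTE_DONE = auto()


# Assumed order of one RPC exchange along the slide's time axis.
EXPECTED_ORDER = [
    ASMessage.AS_COMPUTE_REQUEST,
    ASMessage.AS_DATA_READY,
    ASMessage.AS_COMPUTE_DONE,
]


def is_valid_exchange(trace):
    """Return True if the trace is exactly one exchange in the assumed order."""
    return list(trace) == EXPECTED_ORDER
```

A trace that ends early or arrives out of order is rejected, which is the kind of check a simulator could use to catch protocol bugs in a workload.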

slide-23
SLIDE 23

Modeling

  • Network connects Processing Elements
  • PE consists of up to one core and disk
  • SimpleScalar for cores
  • DiskSim for disks
  • Finite amount of local memory
  • Full-duplex, point-to-point network links
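
One way to picture the modeling bullets above is as plain data types: a PE that may or may not have a core and a disk, a fixed amount of local memory, and links between pairs of PEs. This is only an illustrative sketch. All class and field names are hypothetical, the host values come from the Experimental Setup slide, and the disk PE's memory size and in-order core are made-up placeholders.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Core:
    frequency_mhz: int
    issue_width: int
    out_of_order: bool


@dataclass(frozen=True)
class Disk:
    name: str


@dataclass(frozen=True)
class ProcessingElement:
    """A PE has at most one core and at most one disk, plus finite local memory."""
    name: str
    local_memory_mb: int
    core: Optional[Core] = None
    disk: Optional[Disk] = None


@dataclass(frozen=True)
class Link:
    """Full-duplex, point-to-point link: it joins exactly two PEs, both directions."""
    a: str
    b: str

    def connects(self, src, dst):
        return {src, dst} == {self.a, self.b}


# Host values follow the Experimental Setup slide; the disk PE values are placeholders.
host = ProcessingElement("host", 512, core=Core(1600, 8, True))
disk_pe = ProcessingElement("pe0", 32, core=Core(200, 1, False), disk=Disk("disk0"))
link = Link("host", "pe0")
```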

slide-24
SLIDE 24

Space Manager

  • Standard UNIX syscalls for file I/O
  • Abstracts use of disks
  • Simulates a FAT-like filesystem
  • Handles file address translation
  • Models sequential and random layouts
  • Also handles swap space
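
The address-translation and layout bullets above can be illustrated with a toy FAT-style table. This is a minimal sketch of the general technique, not the Space Manager's actual code: the block size, the function names, and the use of a seeded shuffle for the random layout are all assumptions.

```python
import random

BLOCK_SIZE = 4096  # bytes per block; illustrative value, not from the slides


def allocate(num_blocks, disk_blocks, layout, seed=0):
    """Return the disk block numbers backing a file, in file order."""
    if num_blocks > disk_blocks:
        raise ValueError("file does not fit on the disk")
    if layout == "sequential":
        return list(range(num_blocks))
    if layout == "random":
        return random.Random(seed).sample(range(disk_blocks), num_blocks)
    raise ValueError(f"unknown layout: {layout}")


def build_fat(chain):
    """Build a FAT-style table: each block maps to the file's next block."""
    fat = {}
    for current, nxt in zip(chain, chain[1:]):
        fat[current] = nxt
    if chain:
        fat[chain[-1]] = None  # end-of-chain marker
    return fat


def translate(fat, first_block, offset):
    """Translate a file byte offset to (disk block, offset within that block)."""
    block = first_block
    for _ in range(offset // BLOCK_SIZE):
        block = fat[block]
        if block is None:
            raise ValueError("offset is past the end of the file")
    return block, offset % BLOCK_SIZE
```

With a sequential layout, consecutive file offsets land on adjacent disk blocks. With a random layout, the same offsets are scattered across the disk, so a disk timing model would see them as separate, non-adjacent accesses.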

slide-28
SLIDE 28

Complete PE

Figure: a computation request drives SimpleScalar, which issues disk block requests to DiskSim. Latency = computation latency + disk access latency.

slide-30
SLIDE 30

Disk-only

Figure: a disk block request goes directly to DiskSim. Latency = disk access latency.
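
The two PE variants differ only in whether a computation latency term is added to the disk access latency. A minimal sketch of that sum follows, assuming both latencies are expressed in the same time unit. The function name and its signature are hypothetical.

```python
def request_latency(disk_access_latency, computation_latency=0.0):
    """Latency of one request at a PE.

    Complete PE: computation latency plus disk access latency.
    Disk-only PE: leave computation_latency at 0.0, so only the disk term remains.
    """
    if disk_access_latency < 0 or computation_latency < 0:
        raise ValueError("latencies must be non-negative")
    return computation_latency + disk_access_latency
```

For example, `request_latency(5.0, 3.0)` gives the complete-PE case, and `request_latency(5.0)` gives the disk-only case.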

slide-31
SLIDE 31

Experimental Setup

  • Host is 1.6 GHz, 8-wide, out-of-order, with 512 MB of RAM

  • Vary the number of disks
  • Vary DPU frequency (200/300/400 MHz)
  • Vary data layout (sequential/random)
  • Vary DPU superscalar width (1/2/4 wide)
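
The varied parameters above define a small design space. The sketch below enumerates it, assuming every combination is simulated, which the slides do not state. The disk counts (2, 4, 8) are taken from the legends of the result charts, and the names are hypothetical.

```python
from itertools import product

DISK_COUNTS = (2, 4, 8)                  # from the result chart legends
DPU_FREQUENCIES_MHZ = (200, 300, 400)
DATA_LAYOUTS = ("sequential", "random")
DPU_WIDTHS = (1, 2, 4)                   # superscalar issue width


def configurations():
    """Yield one dict per point in the (assumed) full factorial sweep."""
    for disks, freq, layout, width in product(
        DISK_COUNTS, DPU_FREQUENCIES_MHZ, DATA_LAYOUTS, DPU_WIDTHS
    ):
        yield {
            "disks": disks,
            "dpu_frequency_mhz": freq,
            "layout": layout,
            "dpu_width": width,
        }
```

Under that assumption the sweep has 3 × 3 × 2 × 3 = 54 configurations.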

slide-33
SLIDE 33

Image Edge Detection

Data Layout

Figure: Image Edge Detection, Normalized Speedup of Active Storage, Sequential Data Layout. X-axis: DPU Frequency (MHz): 200, 300, 400. Y-axis: Normalized Speedup (0.1 to 0.8). Series: 2 Disks, 4 Disks, 8 Disks.

Figure: Image Edge Detection, Normalized Speedup of Active Storage, Random Data Layout. X-axis: DPU Frequency (MHz): 200, 300, 400. Y-axis: Normalized Speedup (0.5 to 3). Series: 2 Disks, 4 Disks, 8 Disks.

slide-36
SLIDE 36

Image Edge Detection

Superscalar Width

Figure: Image Edge Detection, Effect of Processor Width, 8 Disk Active Storage, Sequential Data Layout. X-axis: Processor Width (1-way, 2-way, 4-way). Y-axis: Normalized Speedup (0.5 to 2.5). Series: 200 MHz, 300 MHz, 400 MHz.

slide-37
SLIDE 37

Active Storage Architecture

Processing at the host

Figure labels: Host Core, Disk Drive

slide-38
SLIDE 38

Active Storage Architecture

Processing at the array controller

Figure labels: Host Core, Array Controller, Disk Drive

slide-39
SLIDE 39

Active Storage Architecture

Processing at the disk drives

Figure labels: Host Core, Array Controller, Disk Drive, Disk Processor

slide-40
SLIDE 40

Disk Processors

[Computing Frontiers ’08]

Figure: Image Edge Detection, Effect of Disk Processor Width, 8 Disk System. X-axis: Processor Width (1-way, 2-way, 4-way). Y-axis: Normalized Speedup (0.5 to 3.5). Series: 200 MHz, 300 MHz, 400 MHz.

Figure: Nearest Neighbor Search, Effect of Disk Processor Width, 8 Disk System. X-axis: Processor Width (1-way, 2-way, 4-way). Y-axis: Normalized Speedup (1 to 7). Series: 200 MHz, 300 MHz, 400 MHz.

slide-41
SLIDE 41

Conclusion

  • Unstructured data-processing is growing
  • Need smaller-scale systems than Google's
  • Shift data-parallel computation to storage
  • Need the ability to model them

slide-42
SLIDE 42

Questions?