Asynchronous Object Storage with QoS for Scientific and Commercial Data


Michael J. Brim, David A. Dillow, Sarp Oral, Bradley W. Settlemyer*, Feiyi Wang. PDSW, Nov. 18, 2013.


SLIDE 1

Asynchronous Object Storage with QoS for Scientific and Commercial Data

PDSW, Nov. 18, 2013

Michael J. Brim, David A. Dillow, Sarp Oral, Bradley W. Settlemyer*, Feiyi Wang

SLIDE 2

SOS: The Scalable Object Store

Introduction

  • What is Big Data?

– Volume, variety, and velocity
– Good support for statistical inference/induction

  • As opposed to traditional descriptive statistics

– Exposes some weaknesses in existing HPC storage systems

“Scientific discovery in energy research and a wide range of other fields increasingly depends on effectively managing and searching large datasets for new insights.”

Dr. Steven Chu, Secretary of Energy, Big Data Research Initiative, March 29, 2012

SLIDE 3

Big Data HPC Use Cases

  • 1. Scientific Application Checkpointing

– Codes are often time-step-based simulations
– Bursty I/O (write for 5 minutes, once an hour)
– Almost entirely limited by storage system write performance
– A large fraction of the application’s entire memory space streams to storage

  • 2. Big Data Analysis

– Data mining, parallel data analysis, statistical induction
– Ideally, Map-Reduce
– Large-block storage system reads
– Almost continuous storage system load

SLIDE 4

Interference Scenario

[Figure: Compute Nodes, Storage Nodes]

SLIDE 5

Interference Scenario

[Figure: Storage Nodes, Science Simulation]

SLIDE 6

Interference Scenario

[Figure: Storage Nodes, Science Simulation, Inferential Analysis]

SLIDE 7

Interference Scenario

[Figure: Science Simulation, Inferential Analysis]

Read(fd1, buf, len);
Read(fd2, buf, len);

SLIDE 8

Interference Scenario

[Figure: Science Simulation, Inferential Analysis]

Write(fd, buf, len);
Read(fd1, buf, len);
Read(fd2, buf, len);

SLIDE 9

Interference Scenario

[Figure: Science Simulation, Inferential Analysis]

Write(fd3, buf, len);
Read(fd1, buf, len);
Read(fd2, buf, len);

SLIDE 10

Interference Scenario

[Figure: Simulation, Inferential Analysis]

Write(fd, buf, len);
Read(fd1, buf, len);
Read(fd2, buf, len);

SLIDE 11

I/O Interference Reality

  • Why can’t scientific applications just overlap computation with I/O?

– A time-step-based simulation doesn’t checkpoint after every time step, just after some time steps
– The next program state depends on the previous program state
– Significant memory pressure
– Coordination across multiple compute nodes makes these problems worse rather than better

  • Doesn’t CFQ scheduling make this a non-issue?

– The storage servers issue every client’s I/O under the same PID, so CFQ’s per-process queues cannot tell the clients apart

  • Alternative solutions like ADIOS circumvent FS services

SLIDE 12

Read/Write I/O interference

  • Low-end storage array

– Simultaneous access falls off immediately

  • High-end solid state storage (FusionIO Octal)

– Performance loss due to large queue depths

SLIDE 13

A Scalable Object Store

  • HPC Storage + Analytics
  • SOS Goals:

1. 100% asynchronous access
2. Provide storage quality of service (QoS)
3. Object resilience with multi-tiered storage
4. In-transit object data transformation

[Figure: Big Data, labeled Volume, Velocity, Variety]

SLIDE 14

Quality of Service

  • Two factors make QoS tractable:

– SOS I/O is fundamentally async and server-directed
– Object semantics remove the need for consistency between two clients writing to the same store

  • Lots of existing work on how to do this for a single server with multiple clients

– No magic, we just take a slotted approach to multiple access (like ALOHA)
– Don’t try to make it perfect: if an I/O operation (IOP) runs past the end of the slot, complete it anyway

  • Clients request QoS via reservations

– Similar to network RSVP protocols
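
The slotted approach can be illustrated with a small virtual-time simulation. This is only a sketch under stated assumptions, not the SOS implementation: the function `run_slotted`, the round-robin slot ownership, and the client names and durations are all hypothetical. It shows the one rule the slide states explicitly, that an operation started inside a slot is completed even if it runs past the end of the slot.

```python
from collections import deque


def run_slotted(queues, slot_len, num_slots):
    """Simulate a slotted server in virtual time.

    queues maps a client name to a deque of I/O durations (arbitrary units).
    Slot i belongs to client i modulo the number of clients (an assumption).
    Returns a list of (start_time, client, duration) tuples.
    """
    owners = list(queues)
    log = []
    now = 0
    for i in range(num_slots):
        slot_start = i * slot_len
        slot_end = slot_start + slot_len
        # An overrun from the previous slot shortens this one.
        now = max(now, slot_start)
        client = owners[i % len(owners)]
        pending = queues[client]
        # Start new operations only while the slot is still open...
        while pending and now < slot_end:
            duration = pending.popleft()
            log.append((now, client, duration))
            # ...but never preempt: an operation past slot_end still completes.
            now += duration
    return log


queues = {"simulation": deque([4, 4]), "analysis": deque([3, 3, 3])}
schedule = run_slotted(queues, slot_len=5, num_slots=4)
```

In this run the second simulation write starts at time 4 and finishes at time 8, past the end of its slot at time 5, so the first analysis read begins at time 8 instead of time 5.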

SLIDE 15

SOS Prototype Data Organization Model

  • Objects

– Named data buffers

  • Object Collections

– Lists: named collections of objects
– Maps: named collections of lists

  • Provides a naming hierarchy that we believe is compatible with the use cases
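
The three levels of the naming hierarchy can be sketched as plain data classes. The class names (`SosObject`, `ObjectList`, `ObjectMap`), the `list/object` path syntax, and the checkpoint example are assumptions made for illustration; the slides do not describe the prototype's actual interface.

```python
from dataclasses import dataclass, field


@dataclass
class SosObject:
    """An object: a named data buffer."""
    name: str
    data: bytes


@dataclass
class ObjectList:
    """A list: a named collection of objects, kept in insertion order."""
    name: str
    objects: dict = field(default_factory=dict)  # object name -> SosObject

    def put(self, obj):
        self.objects[obj.name] = obj


@dataclass
class ObjectMap:
    """A map: a named collection of lists."""
    name: str
    lists: dict = field(default_factory=dict)  # list name -> ObjectList

    def list_for(self, list_name):
        # Create the list on first use so callers can write into it directly.
        return self.lists.setdefault(list_name, ObjectList(list_name))

    def resolve(self, path):
        """Look up an object by a hypothetical 'list/object' path."""
        list_name, object_name = path.split("/", 1)
        return self.lists[list_name].objects[object_name]


checkpoints = ObjectMap("checkpoints")
checkpoints.list_for("step-100").put(SosObject("rank-0", b"state-0"))
checkpoints.list_for("step-100").put(SosObject("rank-1", b"state-1"))
restored = checkpoints.resolve("step-100/rank-1")
```

Here one map holds a list per checkpointed time step and one object per rank, which is one way the hierarchy could fit the checkpointing use case.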

SLIDE 16

SOS Prototype Storage Organization Model

  • Object Lockers

– A dynamically provisioned allocation of storage resources
– Provides an opaque private namespace
– Returned via a reservation request
– Lockers provide the QoS scheme

  • Prototype implementation is based on Ceph

– Lockers provided by RADOS pools
– Asynchronous object placement with CRUSH is weird
– Leverage the lessons of POSIX AIO and Linux AIO
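
The locker idea can be sketched in memory: a reservation request either returns a locker with its own private namespace or is refused. This is a toy model, not the Ceph-based prototype; `StorageServer`, `Locker`, the byte-capacity reservation, and the `None` return for a refused request are all assumptions, and no RADOS calls are used.

```python
import uuid


class Locker:
    """A provisioned allocation with a private object namespace."""

    def __init__(self, capacity_bytes):
        self.handle = uuid.uuid4().hex  # opaque identifier handed to the client
        self.capacity_bytes = capacity_bytes
        self._objects = {}

    def put(self, name, data):
        # Count everything except the object being overwritten.
        used = sum(len(v) for k, v in self._objects.items() if k != name)
        if used + len(data) > self.capacity_bytes:
            raise ValueError("locker capacity exceeded")
        self._objects[name] = data

    def get(self, name):
        return self._objects[name]


class StorageServer:
    """Hands out lockers in response to reservation requests."""

    def __init__(self, total_bytes):
        self.free_bytes = total_bytes
        self.lockers = {}

    def reserve(self, capacity_bytes):
        # Refuse the request when it cannot be satisfied.
        if capacity_bytes > self.free_bytes:
            return None
        self.free_bytes -= capacity_bytes
        locker = Locker(capacity_bytes)
        self.lockers[locker.handle] = locker
        return locker


server = StorageServer(total_bytes=100)
sim_locker = server.reserve(60)
refused = server.reserve(60)       # only 40 bytes remain
analysis_locker = server.reserve(40)
sim_locker.put("x", b"checkpoint")
analysis_locker.put("x", b"index")
```

The same object name `x` lives in both lockers without conflict, which is the private-namespace property listed above.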

SLIDE 17

SOS Prototype

SLIDE 18

Ongoing work

  • Server-directed object placement

– Ceph uses CRUSH, which is client-directed placement
– Exploring building lockers out of Chord rings

  • Scalable reservation protocols

– Currently dedicating a server interval slot to handling reservations
– Simply dropping reservation requests when a reservation for a new locker cannot be satisfied
– Reservations acquired in order for existing lockers (to avoid the dining philosophers problem)
– Fault-tolerant runtimes are still rare
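
Acquiring reservations in a fixed order is the classic resource-ordering remedy for the dining philosophers problem. The sketch below models each locker's reservation as a `threading.Lock` and always acquires them in sorted name order; the locker names, the `with_reservations` helper, and the use of locks to stand in for reservations are assumptions for illustration only.

```python
import threading
from contextlib import ExitStack

# One lock per existing locker stands in for that locker's reservation.
locker_slots = {name: threading.Lock() for name in ("locker-a", "locker-b", "locker-c")}


def with_reservations(names, work):
    """Acquire the named reservations in one global order, run work, release."""
    with ExitStack() as stack:
        # Sorting gives every client the same acquisition order, so no cycle
        # of clients can each hold one reservation while waiting on another.
        for name in sorted(set(names)):
            stack.enter_context(locker_slots[name])
        return work()


completed = []


def client(names, tag, rounds=200):
    for _ in range(rounds):
        with_reservations(names, lambda: completed.append(tag))


# The two clients ask for the same lockers in opposite orders.
t1 = threading.Thread(target=client, args=(["locker-a", "locker-b"], 1))
t2 = threading.Thread(target=client, args=(["locker-b", "locker-a"], 2))
t1.start()
t2.start()
t1.join(timeout=10)
t2.join(timeout=10)
```

Without the `sorted` call, the two clients could each hold one lock while waiting for the other and never finish.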

SLIDE 19

Future Use Cases

  • Scalable Fault Tolerance Backplane (for runtimes)

– Storage systems are and should continue to be far more reliable than large compute resources

  • Visualization and performance traces

– Tricky because these are user-guided, and performance is dominated by small, unaligned data access

  • Key-value support

– Important to whole classes of Cloud applications
– E.g. HTTP session data

SLIDE 20

Acknowledgements

Research sponsored by the Laboratory Directed Research and Development Program of Oak Ridge National Laboratory, managed by UT-Battelle, LLC, for the U.S. Department of Energy. This work also used resources at the Extreme Scale Systems Center, located at Oak Ridge National Laboratory and supported by DoD.

SLIDE 21

Thanks!

Questions?

SLIDE 22

Asynchronous I/O Interfaces

  • HPC networks are moving steadily toward asynchrony simply to support scale
  • Many existing asynchronous I/O APIs are problematic
  • The only exception I am aware of is Linux AIO

– Has proven very useful in scenarios such as FS scan
– Effective because it improves disk access performance!

  • SOS attempts to build on the lessons of Linux AIO

– HPC networks are likely natively async
– Improve disk performance via server-directed I/O
– Improve client predictability via QoS
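
The submit-then-collect pattern behind asynchronous I/O can be shown with a thread pool. This is not Linux AIO (it makes no `io_submit` calls) and not the SOS client interface; it is a minimal sketch of issuing many reads up front and gathering completions in whatever order they finish. The helper names `read_block` and `async_read_all` are hypothetical.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor, as_completed


def read_block(path, offset, length):
    """Read one block; each call opens its own handle so reads are independent."""
    with open(path, "rb") as f:
        f.seek(offset)
        return offset, f.read(length)


def async_read_all(path, block_size, workers=4):
    """Submit every block read first, then collect completions as they arrive."""
    size = os.path.getsize(path)
    blocks = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(read_block, path, offset, block_size)
            for offset in range(0, size, block_size)
        ]
        # The caller is free to do other work here while reads are in flight.
        for future in as_completed(futures):
            offset, data = future.result()
            blocks[offset] = data
    # Completions may arrive out of order, so reassemble by offset.
    return b"".join(blocks[offset] for offset in sorted(blocks))


with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(bytes(range(256)) * 4)
data = async_read_all(tmp.name, block_size=100)
os.remove(tmp.name)
```

Keeping many requests outstanding is what lets the layer below reorder them for better disk access, which is the benefit the slide attributes to Linux AIO.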