Exploring Trade-offs in Transactional Parallel Data Movement




SLIDE 1

Exploring Trade-offs in Transactional Parallel Data Movement

Ivo Jimenez, Carlos Maltzahn (UCSC) Jay Lofstead (Sandia National Labs) November 18, 2013

SLIDE 2

The need for Transactional Atomicity

SLIDE 3

The difference with Databases

  • In terms of ACID, we want:
      • Atomicity
      • Durability
  • Leave Isolation/Consistency to the clients
  • Single transaction (vs. thousands)
  • Massive number of cohorts (vs. hundreds)

SLIDE 4

The approach

  • Assume that storage servers can do:
      • multi-version concurrency control
      • per-object visibility control
  • Clients handle consensus
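
The two server-side assumptions above can be illustrated with a minimal sketch. It is not the authors' implementation: the `StorageServer` and `VersionedObject` classes and all method names are hypothetical, chosen only to show how keeping several versions per object and flipping a per-object visibility pointer lets clients commit or abort after they reach a decision among themselves.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class VersionedObject:
    """One stored object; every write adds a version (multi-version storage)."""
    versions: dict = field(default_factory=dict)  # version id -> payload
    visible: Optional[int] = None                 # version that readers currently see


class StorageServer:
    """Hypothetical server; the interface is an assumption, not from the slides."""

    def __init__(self):
        self.objects = {}

    def write(self, key, version, payload):
        # A new version is stored but stays hidden until made visible.
        self.objects.setdefault(key, VersionedObject()).versions[version] = payload

    def make_visible(self, key, version):
        # Commit path: readers switch to the given version of this object.
        obj = self.objects[key]
        if version not in obj.versions:
            raise KeyError(f"{key!r} has no version {version}")
        obj.visible = version

    def discard(self, key, version):
        # Abort path: drop a version that was never made visible.
        self.objects[key].versions.pop(version, None)

    def read(self, key):
        obj = self.objects.get(key)
        if obj is None or obj.visible is None:
            return None
        return obj.versions[obj.visible]
```

In this sketch a transaction writes version 2 of an object while readers still see version 1; once the clients agree to commit, each server is asked to `make_visible` the new version, and on abort the hidden version is discarded.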

SLIDE 5

Consensus Protocols

SLIDE 6

NBTA

  • Non-blocking Transactional Atomicity
  • “HAT” formalization (Bailis et al. VLDB 2014)
  • In the context of highly available systems
  • Can also be applied in synchronous systems to achieve very low overhead

SLIDE 7

Features

Protocol   Fault Model    Block   Async   Replication
NBTA       none           Yes     No      No
2PC        fail-stop      Yes     No      No
3PC        fail-stop      No      No      No
Paxos      fail-recover   No      Yes     Yes
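
As a point of reference for the 2PC row, the textbook two-phase commit flow can be sketched as below. This is a simplified illustration, not the code evaluated in the talk: the `Cohort` class and the `two_phase_commit` function are hypothetical names, and the sketch has no timeouts or recovery. That omission is the reason 2PC is usually described as blocking: if the coordinator stops between the two phases, prepared cohorts have to wait for it.

```python
class Cohort:
    """Hypothetical participant that votes in phase 1 and obeys in phase 2."""

    def __init__(self, can_commit=True):
        self.can_commit = can_commit
        self.state = "init"

    def prepare(self):
        # Phase 1: vote yes (True) or no (False).
        self.state = "prepared" if self.can_commit else "refused"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(cohorts):
    """Run both phases; return True on commit, False on abort."""
    # Phase 1: collect one vote per cohort; an exception counts as a "no".
    votes = []
    for cohort in cohorts:
        try:
            votes.append(cohort.prepare())
        except Exception:
            votes.append(False)

    # Phase 2: commit only if every cohort voted yes, otherwise abort all.
    decision = all(votes)
    for cohort in cohorts:
        if decision:
            cohort.commit()
        else:
            cohort.abort()
    return decision
```

A single "no" vote aborts the whole transaction, which is what gives the all-or-nothing (atomic) outcome across cohorts.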

SLIDE 8

Our goal

  • One-size-fits-all solution won’t work
  • Let users pick based on their needs:
      • Length of job
      • MTTF
      • Fault modes
      • etc.
  • We want to explore trade-offs and characterize protocols based on the user needs
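
One way to picture "letting users pick" is to encode the feature table from the Features slide as data and filter it by stated requirements. The sketch below is only an illustration: the `Protocol` tuple, the `candidates` function, and its parameters are hypothetical, and reading the "Block" and "Async" columns as "blocking" and "asynchronous" is an assumption. Job length and MTTF are not modeled because the slides give no rule for them.

```python
from typing import NamedTuple


class Protocol(NamedTuple):
    name: str
    fault_model: str
    blocking: bool       # "Block" column (assumed to mean blocking)
    asynchronous: bool   # "Async" column (assumed to mean asynchronous)
    replication: bool


# Rows copied from the feature table in the slides.
PROTOCOLS = [
    Protocol("NBTA", "none", True, False, False),
    Protocol("2PC", "fail-stop", True, False, False),
    Protocol("3PC", "fail-stop", False, False, False),
    Protocol("Paxos", "fail-recover", False, True, True),
]


def candidates(fault_model=None, allow_blocking=True,
               need_async=False, need_replication=False):
    """Return the names of protocols whose table row meets the requirements."""
    names = []
    for p in PROTOCOLS:
        if fault_model is not None and p.fault_model != fault_model:
            continue
        if p.blocking and not allow_blocking:
            continue
        if need_async and not p.asynchronous:
            continue
        if need_replication and not p.replication:
            continue
        names.append(p.name)
    return names
```

For example, `candidates(allow_blocking=False)` keeps only the rows marked "No" in the Block column, and `candidates(need_replication=True)` keeps only the row with replication.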

SLIDE 9

Preliminary Evaluation

SLIDE 10

Future Work

  • Incorporate fault tolerance
      • Cohort failure: can recover individually
      • Coordinator failure: 3PC, Paxos
  • Coordinate asynchronously
      • No need to wait for global consensus

SLIDE 11

Related Work

  • DOE’s Fast Forward Storage and I/O. The FastForward approach is similar to the NBTA protocol.
  • Fault-tolerant MPI makes use of consensus protocols to identify faulty processes.
  • Recovery in multi-level checkpoint restart.

SLIDE 12

Thanks!
