

SLIDE 1

IBM Research

12/3/2009

An Empirical Study of High Availability in Stream Processing Systems

Yu Gu, Zhe Zhang, Fan Ye, Hao Yang, Minkyong Kim, Hui Lei, Zhen Liu

SLIDE 2

Stream Processing Model

[Figure: software operators (PEs), deployment machines, subjob]

  • Unexpected machine failures

– Loss of data and internal state
– Disruption to normal processing

  • Challenge: how to preserve data / state and minimize disruption?

An Empirical Study of High Availability in DSPS

SLIDE 3

Existing approaches: Active Standby vs. Passive Standby

[Figure: operator pipelines under Active Standby and under Passive Standby]

SLIDE 4

Basic Tradeoff between AS and PS

  • Active Standby (AS)

– Overhead: double processing load; at least double message load
– Recovery delay: almost zero

  • Passive Standby (PS)

– Overhead: checkpoint messages
– Recovery delay: failure detection + deploy new job + recover state

SLIDE 5

Motivation

  • Tradeoffs of AS & PS not fully understood

– Only systematic comparison: [Hwang ICDE05]
      • Used a variant of PS with high overhead
      • Evaluated in simulations rather than real systems

  • Our contributions

– A sweeping checkpointing method
      • Reduces checkpoint overhead by one order of magnitude
      • Proof of consistency
– A real prototype distributed stream processing system
– Comprehensive and empirical evaluation of AS and PS

SLIDE 6

Outline

  • Background and Motivation
  • Design and Implementation

– Sweeping Checkpointing
– System Architecture

  • Performance Evaluation
  • Related Work
  • Conclusions

SLIDE 7

Overview of Sweeping Checkpointing

  • What to include

– Input queues (recoverable from upstream output queues)
– Internal states
– Output queues (dominate checkpoint size at high data rates)

  • When to trim
  • Checkpointing Multiple PEs
  • Proof of consistency

SLIDE 8

When to Trim

In U’s output queue, remove only those packets that have been processed and checkpointed by D

[Figure: upstream node U and downstream node D, with packets numbered 1 to 5]

SLIDE 9

When to Trim

In U’s output queue, remove only those packets that have been processed and checkpointed by D

[Figure: upstream node U and downstream node D, with packets numbered 1 to 5]

SLIDE 10

When to Trim

In U’s output queue, remove only those packets that have been processed and checkpointed by D

[Figure: upstream node U and downstream node D, with packets numbered 1 to 5]

SLIDE 11

When to Trim

In U’s output queue, remove only those packets that have been processed and checkpointed by D

[Figure: upstream node U and downstream node D, with packets numbered 1 to 5; a checkpoint covering packets 1 and 2 is taken]

SLIDE 12

When to Trim

In U’s output queue, remove only those packets that have been processed and checkpointed by D

[Figure: upstream node U and downstream node D, with packets numbered 1 to 5; after the checkpoint, packets 1 and 2 have been processed and checkpointed]
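The trimming rule can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the OutputQueue class, its method names, and the use of integer sequence numbers to identify packets are assumptions introduced here.

```python
from collections import deque


class OutputQueue:
    """Hypothetical output queue of upstream node U."""

    def __init__(self):
        self._pending = deque()  # (sequence number, payload), oldest first

    def send(self, seq, payload):
        # Keep a copy of every packet sent downstream so it can be replayed.
        self._pending.append((seq, payload))

    def on_checkpoint_ack(self, checkpointed_seq):
        # D reports the highest sequence number covered by its latest checkpoint.
        # Only packets at or below that number are safe to remove.
        while self._pending and self._pending[0][0] <= checkpointed_seq:
            self._pending.popleft()

    def retained(self):
        return [seq for seq, _ in self._pending]


u_out = OutputQueue()
for seq in range(1, 6):
    u_out.send(seq, f"packet-{seq}")

# D has processed and checkpointed packets 1 and 2.
u_out.on_checkpoint_ack(2)
print(u_out.retained())  # [3, 4, 5]
```

Packets 3 to 5 stay in U's queue because D could still lose them in a failure; removing them earlier would make them unrecoverable.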

SLIDE 13

Checkpointing Multiple PEs – Synchronous

  • Freeze all PEs, then checkpoint all state, then resume all PEs

[Figure: sub job 1 and sub job 2 on Site 1 and Site 2, each site with a checkpoint manager (CM); the checkpoint is a snapshot of the whole sub job]

SLIDE 14

Checkpointing Multiple PEs – Individual

  • Freeze / checkpoint / resume each PE individually

[Figure: sub job 1 and sub job 2 on Site 1 and Site 2, each site with a checkpoint manager (CM)]

SLIDE 15

Checkpointing Multiple PEs – Sweeping

  • Checkpoint a PE immediately after receipt of acknowledgement and output queue trimming

[Figure: sub job 1 and sub job 2 on Site 1 and Site 2, each site with a checkpoint manager (CM)]
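One plausible reading of the sweeping rule is sketched below: a PE that receives a checkpoint acknowledgement from downstream first trims its output queue, then checkpoints at once, and its own acknowledgement triggers the same steps one hop upstream. The PE class, its fields, and the running-sum operator are hypothetical and chosen only to keep the example small.

```python
import copy


class PE:
    """Hypothetical processing element with state and an output queue."""

    def __init__(self, name, upstream=None):
        self.name = name
        self.upstream = upstream
        self.state = 0          # internal operator state (a running sum here)
        self.last_seq = 0       # highest input sequence number processed
        self.output_queue = []  # (seq, value) pairs not yet acknowledged
        self.checkpoint = None

    def process(self, seq, value):
        self.state += value
        self.last_seq = seq
        self.output_queue.append((seq, self.state))

    def take_checkpoint(self):
        # The input queue is left out: it can be rebuilt from upstream.
        self.checkpoint = {
            "state": self.state,
            "last_seq": self.last_seq,
            "output_queue": copy.deepcopy(self.output_queue),
        }
        if self.upstream is not None:
            self.upstream.on_ack(self.last_seq)

    def on_ack(self, acked_seq):
        # 1. Trim what the downstream PE has processed and checkpointed.
        self.output_queue = [(s, v) for s, v in self.output_queue if s > acked_seq]
        # 2. Checkpoint right away, while the output queue is smallest.
        self.take_checkpoint()


# Pipeline a -> b -> c, with each PE lagging behind its upstream neighbour.
a = PE("a")
b = PE("b", upstream=a)
c = PE("c", upstream=b)
for seq in (1, 2, 3):
    a.process(seq, 1)
for seq in (1, 2):
    b.process(seq, 1)
c.process(1, 1)

c.take_checkpoint()  # the checkpoint sweeps upstream: c, then b, then a
print(b.checkpoint["output_queue"], a.checkpoint["output_queue"])
```

Because each checkpoint is taken just after trimming, the saved output queue holds only the elements the downstream PE has not yet covered.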

SLIDE 16

Sketch of Proof for Consistency

  • Scenario: single node failure (Ni)

– Actions for recovery
      • Recovering operator state
      • Recovering input queue from output queues of upstream (only trimmed to reflect latest checkpoint of Ni)
      • Reprocessing affected elements

  • Scenario: multiple concurrent node failures

– Actions for recovery
      • Finding and recovering most upstream failed node
      • Recovering other nodes recursively
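The three recovery actions for a single failed node can be illustrated as follows. This is a sketch under stated assumptions, not the proof or the prototype's code: the recover function, the checkpoint dictionary layout, and the sequence-numbered queue entries are all hypothetical.

```python
def recover(checkpoint, upstream_output_queue, operator):
    """Rebuild failed node Ni from its last checkpoint and upstream's output queue."""
    # Action 1: recover operator state from the checkpoint.
    state = checkpoint["state"]
    last_seq = checkpoint["last_seq"]
    # Action 2: recover the input queue. Upstream has trimmed only what Ni's
    # latest checkpoint covers, so everything newer is still available.
    replay = [(seq, v) for seq, v in upstream_output_queue if seq > last_seq]
    # Action 3: reprocess the affected elements in order.
    for seq, value in replay:
        state = operator(state, value)
        last_seq = seq
    return state, last_seq


# Ni summed elements 1 and 2 before its checkpoint, then failed.
ni_checkpoint = {"state": 3, "last_seq": 2}
upstream_queue = [(3, 3), (4, 4)]
print(recover(ni_checkpoint, upstream_queue, lambda s, v: s + v))  # (10, 4)
```

The result equals what Ni would have held had it processed elements 1 to 4 without failing, which is the consistency property the slide argues for.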

SLIDE 17

System Architecture

  • Remote Execution Coordinator (REC)

– manages HA protection for distributed jobs

  • Job Management (JMN)

– manages job deployment

  • Checkpoint Manager (CM)

– manages checkpoint tasks according to the assigned checkpoint mechanism

  • Failover Manager (FM)

– monitors other nodes and initiates recovery

  • Jobs and Processing Nodes

– take data from upstream, execute processing tasks, and send results to downstream

  • Features:

– A distributed job consists of multiple subjobs, each of which can choose its own HA mechanism (AS or PS)
– The system coordinates the deployment and protection of subjobs among all machines

[Figure: a Node containing REC, CM, FM, JMN, and Jobs]
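The Failover Manager's role, monitoring other nodes and initiating recovery, is commonly built on heartbeats. The sketch below assumes that design; the slides do not describe the actual interface, so the class layout, the timeout parameter, and the callback are hypothetical. Timestamps are passed in explicitly to keep the example deterministic.

```python
class FailoverManager:
    """Hypothetical heartbeat monitor that triggers recovery on timeout."""

    def __init__(self, timeout_s, on_failure):
        self.timeout_s = timeout_s
        self.on_failure = on_failure  # callback that initiates recovery
        self.last_seen = {}           # node name -> time of last heartbeat
        self.failed = set()

    def heartbeat(self, node, now):
        self.last_seen[node] = now
        self.failed.discard(node)

    def check(self, now):
        # Declare a node failed once its heartbeat is older than the timeout,
        # and initiate recovery exactly once per failure.
        for node, seen in self.last_seen.items():
            if node not in self.failed and now - seen > self.timeout_s:
                self.failed.add(node)
                self.on_failure(node)


recovered = []
fm = FailoverManager(timeout_s=1.0, on_failure=recovered.append)
fm.heartbeat("n1", now=0.0)
fm.heartbeat("n2", now=0.0)
fm.heartbeat("n1", now=0.9)  # n2 stays silent
fm.check(now=1.5)
fm.check(now=1.6)            # no duplicate recovery for n2
print(recovered)             # ['n2']
```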

SLIDE 18

Outline

  • Background and Motivation
  • Design and Implementation
  • Performance Evaluation

– Experiment Setup
– Overhead and Delay Results

  • Related Work
  • Conclusions

SLIDE 19

Experiment Setup

  • Testbed: a cluster environment

– Dual Xeon 3.06GHz CPUs, 800MHz, 512KB L2 caches, 4GB memory, 80GB disk
– 1Gbps LAN
– A distributed job containing 4 subjobs, each having 2 processing nodes running on one machine

  • Metrics

– Recovery delay
– Message overhead

SLIDE 20

Avg. Checkpoint Queue Size Comparison

Sweeping reduces checkpoint size by about 96%

[Figure: average checkpoint queue size comparison at 3000 elements/second]

SLIDE 21

Checkpoint Time Comparison

Sweeping reduces checkpoint time by about 75%

[Figure: checkpoint time comparison; checkpoint interval = 500 ms]

SLIDE 22

Message Overhead Comparison

AS-AS incurs almost 4 times the message overhead of PS

SLIDE 23

Recovery Delay Decomposition

Detection delay becomes dominant with large heartbeat interval
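A rough model shows why detection delay comes to dominate. The sketch assumes that a failure is declared after a fixed number of missed heartbeats, so detection delay scales with the heartbeat interval, while job deployment and state recovery take constant time. The function name and every number are hypothetical placeholders, not measurements from the study.

```python
def detection_share(heartbeat_interval_ms, missed_beats=1,
                    deploy_ms=100, restore_ms=100):
    """Fraction of total recovery delay spent on failure detection.

    Total delay follows the Passive Standby decomposition:
    failure detection + deploy new job + recover state.
    """
    detection_ms = heartbeat_interval_ms * missed_beats
    total_ms = detection_ms + deploy_ms + restore_ms
    return detection_ms / total_ms


for interval_ms in (100, 800, 1800):
    print(interval_ms, round(detection_share(interval_ms), 2))
# 100 0.33
# 800 0.8
# 1800 0.9
```

Under these assumptions, shortening the heartbeat interval is the most direct way to cut recovery delay, at the cost of more monitoring traffic.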

SLIDE 24

Outline

  • Background and Motivation
  • Design and Implementation
  • Performance Evaluation
  • Related Work
  • Conclusions

SLIDE 25

Related Work

  • Borealis

1. “Fault tolerance in the Borealis distributed stream processing system” (SIGMOD ’05)

– A variant of AS
– Achieves a flexible trade-off between availability and consistency by introducing the concept of tentative data

2. “Fast and reliable stream processing over wide area networks” (ICDE ’07)

– A variant of AS
– Most expensive variant; upstream sends to all downstream replicas
– No switch required when a failure occurs

3. “A cooperative, self-configuring high-availability solution for stream processing” (ICDE ’07)

– A variant of PS
– Novel checkpoint scheduling and backup assignment
– Balances recovery load over multiple servers

4. “Borealis-R: a replication-transparent stream processing system for wide-area monitoring applications” (SIGMOD ’08)

– A variant of AS
– Same technique as in [2]
– Novel mechanism that allows replicas to execute without coordination but still produce consistent results

SLIDE 26

Related Work

  • System S

5. “Towards automatic fault recovery in System-S” (ICAC ’07)

– Checkpoint state
– Recovery of JMN, not jobs

6. “Failure recovery in cooperative data streaming analysis” (ARES ’07)

– How to select a backup site on demand, not a recovery technique

7. “Online failure forecast for fault-tolerant data stream processing” (ICDE ’08)

– Prediction of potential failures, a monitoring technique
– Leverages various system metrics (system productivity, available CPU, etc.) to predict failures before they occur

  • Comparison of AS and PS

8. “High-availability algorithms for distributed stream processing” (ICDE ’05)

– Valuable summaries of basic tradeoffs
– PS variant has large overhead
– Evaluation mainly based on simulations

SLIDE 27

Conclusions

  • Fundamental tradeoffs between AS and PS in stream processing systems not fully understood
  • Our contributions

– A novel sweeping checkpoint mechanism
– Proof of consistency
– System implementation and empirical evaluation

  • Performance results providing valuable insights

– Importance of queue trimming in checkpointing
– Decomposition of recovery delay

SLIDE 28

Questions?

Zhe Zhang

zhezhang@ornl.gov

http://users.nccs.gov/~zzhang3/pubs/empirical-middleware09.pdf