SLIDE 1

Libra and the Art of Task Sizing in Big-Data Analytic Systems

Rui Li, Peizhen Guo, Bo Hu, Wenjun Hu
Yale University

SLIDE 2

Background

SLIDE 3

Figure: a job's stages, Stage 0 through Stage 6

SLIDE 4

Figure: Stage 0 through Stage 6, with Stage 4 singled out

SLIDE 5

Figure: Stage 0 through Stage 6, with Stage 4 singled out and annotated with its stage input data and stage output data

SLIDE 7

Figure: Stage 0 through Stage 6, with Stage 4 singled out and annotated with its stage input data and stage output data

How to set task size?

SLIDE 8

Figure: Stage 0 through Stage 6, with Stage 4 singled out and annotated with its stage input data and stage output data

How to set task size?

  • User experience
  • System default value
SLIDE 9

The importance of task sizing

SLIDE 10

Observation 1: different jobs have different optimal task sizes

Figure: normalized stage completion time vs. task size

SLIDE 11

Figure: PageRank stage completion time vs. task size

Observation 2: different stages have different optimal task sizes

SLIDE 12
  1. Proper task sizing is important
SLIDE 13
  1. Proper task sizing is important
  2. Completion time follows a U-curve pattern as task size varies
SLIDE 14

Analysis of the U-curve pattern

SLIDE 15

Figure: per-task overhead for PageRank stage 1

Observation 3: tasks have similar scheduling delay and system overhead regardless of task size
SLIDE 16

Figure: number of I/O operations for different stages of PageRank

Observation 4:
  • Small task size => no batch processing
  • Large task size => memory swapping

SLIDE 17

Small task size => high aggregated overhead, no batch processing
Large task size => memory swapping
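
The two effects on this slide can be illustrated with a toy cost model. This is only a sketch under stated assumptions, not the model used in Libra: the function `stage_time` and every parameter (slot count, per-task overhead, processing rate, memory budget, swap penalty) are hypothetical values chosen for illustration. It models the fixed per-task overhead from Observation 3 and a slowdown once a task exceeds its memory budget; the loss of batch processing at small sizes is not modeled.

```python
import math

def stage_time(input_mb, task_mb, slots=8, overhead_s=0.5,
               rate_mb_s=50.0, mem_mb=512.0, swap_penalty=4.0):
    """Toy estimate of stage completion time (seconds) for one task size.

    All parameters are hypothetical; none come from the slides.
    """
    n_tasks = math.ceil(input_mb / task_mb)
    waves = math.ceil(n_tasks / slots)  # tasks run in waves of `slots`
    # Data beyond the memory budget is processed at a slower, swapped rate.
    in_mem = min(task_mb, mem_mb)
    spilled = max(task_mb - mem_mb, 0.0)
    per_task = (overhead_s
                + in_mem / rate_mb_s
                + spilled * swap_penalty / rate_mb_s)
    return waves * per_task

sizes = [2 ** k for k in range(0, 12)]  # 1 MB .. 2048 MB
times = {s: stage_time(8192, s) for s in sizes}
best = min(times, key=times.get)

for s in sizes:
    print(f"{s:5d} MB  {times[s]:8.2f} s")
print("best task size:", best, "MB")
```

With these made-up numbers the estimated time falls as the per-task overhead is amortized over fewer tasks, then rises once tasks exceed the memory budget, which reproduces the U-curve shape qualitatively.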

SLIDE 18

System design

  • Strawman solution
SLIDE 19

Refinement 1: ADAM optimization
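
For readers unfamiliar with it, the standard Adam update rule can be sketched as below. The slides do not show how Libra applies Adam to task sizing, so the setup here is an assumption for illustration only: the cost function, its minimum at 128 MB, the search over log2 of the task size, and the names `adam_minimize` and `cost_grad` are all hypothetical.

```python
import math

def adam_minimize(grad, x0, steps=500, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    """Standard Adam update on a single scalar parameter."""
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = b1 * m + (1 - b1) * g        # first-moment (mean) estimate
        v = b2 * v + (1 - b2) * g * g    # second-moment estimate
        m_hat = m / (1 - b1 ** t)        # bias correction
        v_hat = v / (1 - b2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x

# Hypothetical U-shaped cost over x = log2(task size in MB):
# cost(x) = (x - 7)^2 + 3, with its minimum at x = 7 (128 MB).
def cost_grad(x):
    return 2.0 * (x - 7.0)

x_opt = adam_minimize(cost_grad, x0=3.0)  # start from 8 MB
task_mb = 2 ** x_opt
print(f"task size near {task_mb:.0f} MB")
```

In a running system the gradient would have to be estimated from measured completion times rather than from a known formula; the closed-form gradient here only keeps the sketch short.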

SLIDE 20

Refinement 2: noise filtering

Figure: task processing rate fluctuation for stage 1 of PageRank
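
One simple way to suppress this kind of fluctuation is a sliding-window median, sketched below. The slides do not say which filter Libra uses, so the median filter is a stand-in; the function name `median_filter`, the window length, and the sample rates are all hypothetical.

```python
from collections import deque
from statistics import median

def median_filter(samples, window=5):
    """Replace each sample by the median of the most recent `window` samples."""
    recent = deque(maxlen=window)
    out = []
    for s in samples:
        recent.append(s)
        out.append(median(recent))
    return out

# Made-up task processing rates (MB/s) with two outliers: 5 and 120.
rates = [50, 52, 49, 5, 51, 50, 120, 48, 50]
smoothed = median_filter(rates)
print(smoothed)
```

A median is used rather than a mean because a single straggler or burst does not pull the estimate far from the typical rate.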

SLIDE 22

Refinement 3: contention avoidance

Figure: PageRank over two machines

SLIDE 24

Evaluation

  • 8 m4.xlarge VMs from EC2
  • Workloads generated from HiBench
SLIDE 25

Initial task size effect

Figure: PageRank completion time for different initial task sizes

SLIDE 26

Libra performance

Figure: PageRank completion time with different input data sizes

SLIDE 27

Q&A