Heterogeneity-Aware Resource Allocation and Scheduling in the Cloud



SLIDE 1

Heterogeneity-Aware Resource Allocation and Scheduling in the Cloud

Gunho Lee (UC Berkeley) Byung-Gon Chun (Yahoo! Research) Randy H. Katz (UC Berkeley)

SLIDE 2

We have resources and jobs

[Diagram labels: Resource, Job/Task]

SLIDE 3

Allocate resources (slots)

[Diagram labels: Allocation, Resource, Job/Task]

SLIDE 4

Then schedule jobs/tasks on them

[Diagram labels: Allocation, Resource, Scheduling, Job/Task]

SLIDE 5

Goal 1. Minimize the cluster size while providing good performance

[Diagram labels: Dynamic Resource Allocation, Resource, Job/Task]

SLIDE 6

Goal 2. Provide each job with a “fair share” of resources

[Diagram labels: Resource, Job/Task, Fair scheduling]

SLIDE 7

Heterogeneity makes the problem more complex

[Diagram labels: Resource, Job/Task, Allocation ???, Scheduling ???]

SLIDE 8

Our Approach

  • Consider Job Affinity to match more suitable resources to jobs
  • Redefine a share metric to provide fairness
  • Allocation
    – Core Nodes + Accelerator Nodes
  • Scheduling
    – Progress Share

SLIDE 9

Fair Share Metric

  • The scheduler tries to equalize the “share” of all jobs
    – SlotShare: Number of slots owned
      • Does not work well in heterogeneous environments
    – ProgressShare: Progress being made with owned slots / progress with all slots
      • Contribution of a slot to a job’s progress rate
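
The two metrics above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' implementation: it assumes a job's progress rate is the sum of independent per-slot contributions (a linear model), and the RATES table, the slot names, and both function names are hypothetical.

```python
from typing import Dict, Sequence

# Hypothetical per-slot progress rates: RATES[job][slot] is the progress per
# unit time that `job` makes when it runs on `slot`. Numbers are illustrative.
RATES: Dict[str, Dict[str, float]] = {
    "A": {"fast1": 3.0, "fast2": 3.0, "slow1": 1.0, "slow2": 1.0},
    "B": {"fast1": 1.0, "fast2": 1.0, "slow1": 1.0, "slow2": 1.0},
}
SLOTS = ["fast1", "fast2", "slow1", "slow2"]


def slot_share(owned: Sequence[str], all_slots: Sequence[str]) -> float:
    """SlotShare: fraction of the cluster's slots that the job owns."""
    return len(owned) / len(all_slots)


def progress_share(job: str, owned: Sequence[str], all_slots: Sequence[str],
                   rates: Dict[str, Dict[str, float]] = RATES) -> float:
    """ProgressShare: progress rate with the owned slots divided by the
    progress rate the job would have with every slot (assumed additive)."""
    with_owned = sum(rates[job][s] for s in owned)
    with_all = sum(rates[job][s] for s in all_slots)
    return with_owned / with_all


# Job A holds the two slots it runs slowly on; job B holds the other two.
print(slot_share(["slow1", "slow2"], SLOTS))           # 0.5
print(progress_share("A", ["slow1", "slow2"], SLOTS))  # 0.25
print(progress_share("B", ["fast1", "fast2"], SLOTS))  # 0.5
```

With these made-up rates both jobs own half of the slots, yet job A reaches only a quarter of its stand-alone progress rate, which is the kind of gap SlotShare cannot see.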
SLIDE 10

Progress Share

[Figure: progress (up to 1) vs. time. Line: progress without sharing (1 job)]

SLIDE 11

Progress Share

[Figure: progress (up to 1) vs. time. Lines: progress without sharing (1 job); just good progress with sharing (2 jobs)]

SLIDE 12

Progress Share

[Figure: progress (up to 1) vs. time. Lines: progress without sharing (1 job); just good progress with sharing (2 jobs). Regions labeled "Under-served" and "Even better"]

SLIDE 13

Progress Share

Progress Share of Job A = Ratio of progress slope (b/a)

[Figure: progress (up to 1) vs. time, with slopes marked a and b. Lines: progress without sharing (1 job); just good progress with sharing (2 jobs). Regions labeled "Under-served" and "Even better"]
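
Written out as a formula, the slope ratio on this slide might read as follows. The reading of the two labels is an assumption taken from the figure: a is taken to be the slope of Job A's progress line without sharing, and b its slope with sharing. The second equality additionally assumes that per-slot contributions add up linearly; the symbol r_{A,s} (contribution of slot s to Job A's progress rate) is hypothetical notation, not the authors'.

```latex
\[
\mathrm{ProgressShare}(A) \;=\; \frac{b}{a}
\;=\; \frac{\sum_{s \,\in\, \mathrm{owned}(A)} r_{A,s}}{\sum_{s \,\in\, \mathrm{all\ slots}} r_{A,s}}
\]
```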

SLIDE 14

Homogeneous case

[Figure: progress (up to 1) vs. time for Job A and Job B, with Slot Share and Progress Share panels (each up to 1)]

SLIDE 15

Heterogeneous case

Job A runs faster on gray slots

[Figure: two progress (up to 1) vs. time plots, one for Job A and one for Job B, each drawn over slots labeled with the job running on them]

SLIDE 16

Heterogeneous case 1

Using SlotShare

[Figure: progress (up to 1) vs. time for Job A and Job B over slots labeled A or B, with Slot Share and Progress Share (each up to 1) plotted over time]

SLIDE 17

Heterogeneous case 1

Using SlotShare

[Figure: progress (up to 1) vs. time for Job A and Job B over slots labeled A or B, with Slot Share and Progress Share (each up to 1) plotted over time]

SLIDE 18

Heterogeneous case 1

Using SlotShare

[Figure: progress (up to 1) vs. time for Job A and Job B over slots labeled A or B, with Slot Share and Progress Share (each up to 1) plotted over time]

Job A is making less progress, with the same number of slots

SLIDE 19

Heterogeneous case 2

Using ProgressShare

[Figure: progress (up to 1) vs. time for Job A and Job B over slots labeled A or B, with Slot Share and Progress Share (each up to 1) plotted over time]

SLIDE 20

Heterogeneous case 2

Using ProgressShare

[Figure: progress (up to 1) vs. time for Job A and Job B over slots labeled A or B, with Slot Share and Progress Share (each up to 1) plotted over time]

SLIDE 21

Heterogeneous case 2

Using ProgressShare

[Figure: progress (up to 1) vs. time for Job A and Job B over slots labeled A or B, with Slot Share and Progress Share (each up to 1) plotted over time]

Both jobs making progress >= 0.5
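
One way to picture scheduling by ProgressShare is a greedy loop that always hands the next slot to the job with the lowest progress share. The sketch below is an illustration under assumptions, not the scheduler from the talk: the rate table, the slot names (two "gray" slots that Job A runs faster on, two "white" ones), the additive progress model, and the tie-breaking rule are all hypothetical.

```python
from typing import Dict, List

# Hypothetical rates: Job A runs 3x faster on gray slots; Job B is indifferent.
RATES: Dict[str, Dict[str, float]] = {
    "A": {"gray1": 3.0, "gray2": 3.0, "white1": 1.0, "white2": 1.0},
    "B": {"gray1": 1.0, "gray2": 1.0, "white1": 1.0, "white2": 1.0},
}


def progress_share(job: str, owned: List[str],
                   rates: Dict[str, Dict[str, float]]) -> float:
    """Rate with the owned slots over the rate with all slots (assumed additive)."""
    return sum(rates[job][s] for s in owned) / sum(rates[job].values())


def assign_by_progress_share(rates: Dict[str, Dict[str, float]]) -> Dict[str, List[str]]:
    """Greedily give each free slot to the job with the lowest progress share."""
    jobs = sorted(rates)
    free = sorted(next(iter(rates.values())))  # slot names, in a fixed order
    owned: Dict[str, List[str]] = {j: [] for j in jobs}
    while free:
        # The most under-served job chooses next (ties broken by job name).
        job = min(jobs, key=lambda j: (progress_share(j, owned[j], rates), j))

        def slot_key(slot: str):
            # Prefer the slot that helps this job most; among equals, prefer
            # the slot that matters least to the other jobs (job affinity).
            own = rates[job][slot] / sum(rates[job].values())
            others = max((rates[o][slot] / sum(rates[o].values())
                          for o in jobs if o != job), default=0.0)
            return (own, -others)

        best = max(free, key=slot_key)
        free.remove(best)
        owned[job].append(best)
    return owned


owned = assign_by_progress_share(RATES)
print(owned)  # {'A': ['gray1', 'gray2'], 'B': ['white1', 'white2']}
print({j: progress_share(j, owned[j], RATES) for j in owned})  # {'A': 0.75, 'B': 0.5}
```

With these illustrative numbers Job A ends up on the gray slots and both jobs reach a progress share of at least 0.5, while each still owns the same number of slots.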

SLIDE 22

Performance Gain

  • Using Progress Share
SLIDE 23

Summary

  • Heterogeneity should be taken into account at both levels of two-level scheduling
    – Resource Allocation and Job Scheduling
  • Need to redefine “share” to provide performance and fairness simultaneously in heterogeneous environments
    – Propose “progress share”
  • Future Work
    – Combine with sub-linear performance model
    – Consider interference of co-located jobs