April 2020 1 Harvard University and intern at Google; 2 University of - - PowerPoint PPT Presentation



slide-1
SLIDE 1

3

April 2020

1 Harvard University and intern at Google; 2 University of St Andrews and visiting researcher at Google; 3 CMU and visiting researcher at Google


slide-2
SLIDE 2

Proprietary + Confidential

Borg

Google's internal cluster manager. Cell: a set of machines managed by Borg as one unit.

Cell


slide-3
SLIDE 3

Borg

Users submit work in the form of jobs, each of which contains one or more tasks.

[Diagram labels: Cell, Job, Task, Task]

slide-4
SLIDE 4

Borg

A job may run in an alloc set, making each of its tasks run in an alloc instance.

[Diagram labels: Cell, Job, Alloc set, Alloc instance, Task, Task]

slide-5
SLIDE 5

Borg

Jobs have tiers: production, mid, best-effort batch, free.
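The vocabulary introduced so far (cell, job, task, alloc set, alloc instance, tier) can be sketched as a small data model. This is only an illustration: the class and field names are hypothetical, and the rule "task i runs in alloc instance i" is a simplifying assumption, not something the slides specify.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Tier(Enum):
    # Tier names as listed on the slide.
    PRODUCTION = "production"
    MID = "mid"
    BEST_EFFORT_BATCH = "best-effort batch"
    FREE = "free"


@dataclass
class Task:
    index: int
    # Index of the alloc instance hosting this task, if any (hypothetical field).
    alloc_instance: Optional[int] = None


@dataclass
class AllocSet:
    name: str
    num_instances: int


@dataclass
class Job:
    name: str
    tier: Tier
    tasks: List[Task] = field(default_factory=list)
    alloc_set: Optional[AllocSet] = None


@dataclass
class Cell:
    name: str
    jobs: List[Job] = field(default_factory=list)


def run_in_alloc_set(job: Job, alloc_set: AllocSet) -> None:
    """Run `job` in `alloc_set`, giving each task its own alloc instance.

    Assumption for this sketch: task i is placed in alloc instance i.
    """
    if len(job.tasks) > alloc_set.num_instances:
        raise ValueError("not enough alloc instances for the job's tasks")
    job.alloc_set = alloc_set
    for i, task in enumerate(job.tasks):
        task.alloc_instance = i


cell = Cell("example-cell")
job = Job("example-job", Tier.MID, tasks=[Task(0), Task(1)])
run_in_alloc_set(job, AllocSet("example-allocs", num_instances=2))
cell.jobs.append(job)
```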

slide-6
SLIDE 6

Borg

More info: "Large-scale cluster management at Google with Borg" (EuroSys '15)

slide-7
SLIDE 7

traces

A single Borg trace describes the workload in a Borg cell:

  • {Jobs, tasks}, {alloc sets, alloc instances}
    ○ arrivals and departures: submit, update, finish
    ○ scheduling decisions: place, evict
  • Resource allocations and usage

2011 trace: 1 cell from May 2011
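As a rough illustration of the event side of a trace, the five event kinds named above can be modelled as records and tallied. The record layout here (`time_s`, `task_id`, `kind`) is hypothetical and is not the schema of the published traces.

```python
from collections import Counter
from enum import Enum
from typing import NamedTuple


class EventKind(Enum):
    # Arrivals and departures
    SUBMIT = "submit"
    UPDATE = "update"
    FINISH = "finish"
    # Scheduling decisions
    PLACE = "place"
    EVICT = "evict"


class Event(NamedTuple):
    time_s: float      # hypothetical timestamp, in seconds
    task_id: str       # hypothetical "job/task-index" identifier
    kind: EventKind


def eviction_counts(events):
    """Count how many times each task was evicted."""
    return Counter(e.task_id for e in events if e.kind is EventKind.EVICT)


events = [
    Event(0.0, "job-a/0", EventKind.SUBMIT),
    Event(1.0, "job-a/0", EventKind.PLACE),
    Event(5.0, "job-a/0", EventKind.EVICT),
    Event(6.0, "job-a/0", EventKind.PLACE),
    Event(9.0, "job-a/0", EventKind.FINISH),
    Event(2.0, "job-b/0", EventKind.SUBMIT),
]
evictions = eviction_counts(events)
```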

slide-8
SLIDE 8

new

2019 trace: 8 cells for May 2019

  • ~96k machines on 3 continents

github.com/google/cluster-data

  • CPU usage histograms
  • Job-parent information
  • Autopilot (see companion paper in session 5)
slide-9
SLIDE 9

Two metrics:

  • used
  • allocated
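Both metrics are plotted on the following slides as a fraction of cell capacity. A minimal sketch of that normalisation, broken down by tier, might look like the code below. The function name, the sample layout, and the numbers are made up for illustration. The same function applies to either metric, depending on whether the amounts passed in are used or allocated resources.

```python
from collections import defaultdict


def fraction_by_tier(samples, cell_capacity):
    """Sum per-task amounts by tier and divide by the cell's capacity.

    `samples` is an iterable of (tier_name, amount) pairs, where `amount`
    is in the same unit as `cell_capacity` (for example, CPU cores).
    """
    if cell_capacity <= 0:
        raise ValueError("cell_capacity must be positive")
    totals = defaultdict(float)
    for tier, amount in samples:
        totals[tier] += amount
    return {tier: total / cell_capacity for tier, total in totals.items()}


# Made-up usage samples for a cell with 100 units of capacity.
used_samples = [
    ("production", 20.0),
    ("mid", 5.0),
    ("best-effort batch", 15.0),
    ("production", 10.0),
]
used_fraction = fraction_by_tier(used_samples, cell_capacity=100.0)
```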
slide-10
SLIDE 10

used

[Figure: 2011 vs. 2019. Axes: time (days) vs. fraction of cell capacity.]

New “mid” tier

slide-11
SLIDE 11

used

[Figure: 2011 vs. 2019. Axes: time (days) vs. fraction of cell capacity.]

Much more “best-effort batch”

slide-12
SLIDE 12

used

[Figure: 2011 vs. 2019. Panels: CPU, memory.]

slide-13
SLIDE 13

allocated

[Figure: 2011 vs. 2019. Axes: time (days) vs. fraction of cell capacity.]

slide-14
SLIDE 14

Memory

[Figure: 2011 vs. 2019. Axes: time (days) vs. fraction of cell capacity.]

slide-15
SLIDE 15

allocated

[Figure: 2011 vs. 2019. Panels: CPU, memory.]

slide-16
SLIDE 16

used allocation

[Figure labels: allocated, used, 100%.]

slide-17
SLIDE 17

[Figure: 2011. Axes: x = utilization vs. P(utilization > x).]

slide-18
SLIDE 18

[Figure: Axes: x = utilization vs. P(utilization > x).]

  • Median machine in 2011: ~30% utilized
  • Median machine in 2019: 50-77% utilized

Median utilization is higher in 2019
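The curve P(utilization > x) is a complementary CDF, and the median is the point where it crosses 0.5. A minimal sketch of both quantities follows. The utilization values are made up and are not taken from either trace.

```python
from statistics import median


def ccdf(values, x):
    """Empirical complementary CDF: the fraction of values strictly above x."""
    if not values:
        raise ValueError("values must be non-empty")
    return sum(1 for v in values if v > x) / len(values)


# Made-up per-machine utilization, as a fraction of machine capacity.
utilization = [0.1, 0.3, 0.5, 0.7, 0.9]

median_utilization = median(utilization)
share_above_half = ccdf(utilization, 0.5)
```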

slide-19
SLIDE 19

[Figure: Axes: x = tasks submitted per hour vs. P(tasks submitted > x).]

Scheduler load today (2019): ~4 times higher than in 2011

slide-20
SLIDE 20

VERY

$C^2 = \text{variance} / \text{mean}^2$

for CPU-hours and memory-hours

  • CPU-hours of UNIX jobs (1996): $C^2 \approx 50$
  • CPU-hours of supercomputing jobs (2005): $C^2 \approx 250$
  • CPU-hours of Google Borg jobs (2011): $C^2 \approx 8400$

2019 Google Borg trace: $C^2 \approx 23\text{k}$
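The squared coefficient of variation defined above can be computed directly from per-job resource-hours. The sketch below uses the population variance, which is an assumption: the slides do not say which variance estimator was used. The sample values are made up.

```python
from statistics import mean, pvariance


def squared_cv(values):
    """Squared coefficient of variation: variance / mean**2."""
    m = mean(values)
    if m == 0:
        raise ValueError("mean is zero; C^2 is undefined")
    # Population variance, computed around the mean found above.
    return pvariance(values, m) / (m * m)


# Identical jobs have no variability at all.
uniform_jobs = [1.0, 1.0, 1.0, 1.0]
# One large job among idle ones: mean 1, variance 3.
skewed_jobs = [0.0, 0.0, 0.0, 4.0]

c2_uniform = squared_cv(uniform_jobs)
c2_skewed = squared_cv(skewed_jobs)
```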

slide-21
SLIDE 21

Largest 1% of jobs: hogs
Remaining 99%: mice

Fraction of resources consumed by hogs:

  • Prior work: 50%
  • Google, 2011: 97.3%
  • Google, 2019: 99.2%
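The hog share can be sketched as: rank jobs by resource consumption, take the largest 1%, and divide their total by the overall total. Ranking jobs by their own resource-hours and rounding the 1% cut-off up to a whole job are assumptions of this sketch. The workload below is made up.

```python
import math


def top_share(consumption, top_fraction=0.01):
    """Fraction of total consumption attributable to the largest jobs.

    `consumption` holds one non-negative number per job (for example,
    CPU-hours). The cut-off is rounded up so at least one job is a hog.
    """
    ordered = sorted(consumption, reverse=True)
    total = sum(ordered)
    if total <= 0:
        raise ValueError("total consumption must be positive")
    k = max(1, math.ceil(top_fraction * len(ordered)))
    return sum(ordered[:k]) / total


# Made-up workload: 99 mice using 1 unit each, and 1 hog using 901 units.
jobs = [1.0] * 99 + [901.0]
hog_share = top_share(jobs)
```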

slide-22
SLIDE 22

[Figure: Axes: x = {CPU, RAM}-hours vs. fraction of jobs where {CPU, RAM}-hours > x.]

  • α = 0.77: Extremely heavy-tailed
  • α = 0.69: Even more heavy-tailed!
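The slides do not say how α was fitted. One standard way to estimate a tail index, assuming the tail follows a Pareto law P(X > x) ∝ x^(-α), is the Hill estimator. The sketch below swaps that technique in purely as an illustration and checks it on synthetic Pareto quantiles with a known α. The choice of k (how many of the largest observations count as "the tail") is a free parameter here.

```python
import math


def hill_estimator(values, k):
    """Hill estimate of the tail index from the k largest observations."""
    ordered = sorted(values, reverse=True)
    if not 0 < k < len(ordered):
        raise ValueError("k must satisfy 0 < k < len(values)")
    threshold = ordered[k]  # the (k+1)-th largest value
    if threshold <= 0:
        raise ValueError("the estimator requires a positive threshold")
    log_excess = sum(math.log(v / threshold) for v in ordered[:k])
    if log_excess == 0:
        raise ValueError("the k largest values all equal the threshold")
    return k / log_excess


# Synthetic data: evenly spaced quantiles of a Pareto law with alpha = 0.7.
true_alpha = 0.7
n = 10_000
samples = [(1.0 - (i + 0.5) / n) ** (-1.0 / true_alpha) for i in range(n)]

# Should come out close to 0.7 for this synthetic input.
estimated_alpha = hill_estimator(samples, k=1_000)
```

A smaller α means a heavier tail, which matches the slide's ordering of the two annotations.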

slide-23
SLIDE 23

scheduling

Since Google's workload has high $C^2$, hogs can fill all the resources!

[Diagram label: Cell]

slide-24
SLIDE 24

  • New Borg workload trace:
    ○ 8 cells for the month of May 2019
    ○ 2.4 TB of data, accessed via BigQuery
    ○ github.com/google/cluster-data
  • Workload and machine utilization have increased
  • Disparity between hogs and mice more extreme than in any other reported trace
    ○ largest 1% of jobs consume >99% of resources