Quantifying Scalability with the USL - Baron Schwartz - DataEngConf NYC


slide-1
SLIDE 1

Quantifying Scalability with the USL

Baron Schwartz • DataEngConf NYC 2018

slide-2
SLIDE 2

Introduction

I’ve been focused on databases for about two decades, first as a developer, then a consultant, and now a startup founder. I’ve written High Performance MySQL and several other books, and created a lot of open source software, mostly focused on database monitoring, database operations, and database performance: innotop, Percona Toolkit, etc. I welcome you to get in touch at @xaprb or baron@vividcortex.com.

@xaprb 2

slide-7
SLIDE 7

Agenda

How can you quantify, forecast, and reason about scalability?

  • 1. Queueing theory: in which we discover load
  • 2. Amdahl’s Law: in which we define linearity
  • 3. The Universal Scalability Law (USL): in which Frederick Brooks laughs last
  • 4. Application: in which things are even worse than we thought
  • 5. Profit???: in which we do the impossible

slide-8
SLIDE 8

Queueing Theory

In Which We Discover Load

slide-9
SLIDE 9

Queueing Theory

There’s a branch of operations research called queueing theory. It analyzes the waiting that happens when systems get busy.

slide-11
SLIDE 11

What Causes Queueing?

Queueing happens even at low utilization:

  • 1. Irregular arrival timings
  • 2. Irregular job sizes
  • 3. Lost time is lost forever

A queue fundamentally changes how a system works:

  • Increases availability and utilization
  • Increases average residence time
  • Increases cost/overhead

slide-13
SLIDE 13

Arrival Rate and Queue Delay

Eben Freeman has a great visual that explains how arrival rate is related to queueing delay.

  • A request arrives, and the server processes it until it’s finished.
  • The height is the job size, and the width is the service time $S$.
  • The upper edge of the triangle is the amount of outstanding work to do.

[Figure: outstanding work over time, labeled with arrival rate $\lambda$ and service time $S$]

slide-14
SLIDE 14

Another Request Arrives

  • It has to wait in the queue until the first is done. That is its wait time $W$.
  • Then it has service time $S$ too.
  • Its total residence time is $R = W + S$.

slide-15
SLIDE 15

An Equation For Queue Wait

Eben uses the area under the graph to relate the height of the top edge to the width of the red wait parallelograms. Solving this for $W$ gives an equation for wait time:

$$W = \frac{\lambda S^2}{2(1 - \lambda S)}$$

This creates the familiar hockey stick curve, shown here in terms of utilization $\rho = \lambda S$.

[Plot: residence time vs. utilization]

slide-16
SLIDE 16

Some Implications

One of the nice things about this form is that it lets you reason about service time and arrival rate easily:

$$W = \frac{\lambda S^2}{2(1 - \lambda S)}$$

What if you…

  • double the arrival rate $\lambda$?
  • halve the service time $S$?
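
The two "what if" questions can be tried numerically. The following is a minimal sketch in Python, assuming the wait-time equation from the slide; the function name `queue_wait` and the sample arrival rates and service times are hypothetical values chosen only for illustration.

```python
def queue_wait(arrival_rate: float, service_time: float) -> float:
    """Queue wait W = lambda * S^2 / (2 * (1 - lambda * S)).

    The formula only makes sense while utilization (lambda * S) is below 1.
    """
    utilization = arrival_rate * service_time
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    return arrival_rate * service_time ** 2 / (2 * (1 - utilization))


# Hypothetical baseline: 40 requests/s, 10 ms service time (utilization 0.4).
base = queue_wait(arrival_rate=40.0, service_time=0.01)
# Double the arrival rate (utilization 0.8).
doubled = queue_wait(arrival_rate=80.0, service_time=0.01)
# Halve the service time instead (utilization 0.2).
halved = queue_wait(arrival_rate=40.0, service_time=0.005)

print(f"baseline wait:        {base * 1000:.3f} ms")
print(f"double arrival rate:  {doubled * 1000:.3f} ms")
print(f"halve service time:   {halved * 1000:.3f} ms")
```

With these made-up numbers, doubling the arrival rate makes the wait six times longer, while halving the service time cuts it by a factor of about 5.3. Neither change has a merely proportional effect.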

slide-17
SLIDE 17

The Hockey Stick Curve

The “hockey stick” queueing curve is hard to use in practice. And the sharpness of the “knee” is nonlinear and very hard for humans to intuit.

[Plot: residence time vs. utilization]

slide-18
SLIDE 18

Great Truths From Queueing Theory

  • 1. Requests into ~any system have to queue and wait for service.
  • 2. As the system gets busier, queueing escalates suddenly.
  • 3. Queueing is very sensitive to service time and variability.
  • 4. Contention over serialized resources causes nonlinear scaling.

The last point is quite a leap, but I’ll explain.

slide-19
SLIDE 19

Amdahl’s Law

In Which We Define Scalability

slide-21
SLIDE 21

What is Scalability?

There’s a mathematical definition of scalability as a function of concurrency. I’ll illustrate it in terms of a parallel processing system that uses concurrency to achieve speedup.

slide-22
SLIDE 22

Linear Scaling

Suppose a clustered system can complete $X$ tasks per second with no parallelism. With parallelism, it completes tasks faster, i.e. it achieves higher throughput.

[Figure: tasks running serially vs. in parallel]

slide-24
SLIDE 24

Ideal Linear Scalability

Ideally, throughput increases linearly with parallelism. For example, tripling the parallelism means three times as much work completes: $3X$.

[Plot: throughput vs. nodes]

slide-25
SLIDE 25

The Linear Scalability Equation

The equation of ideal linear scaling:

$$X(N) = \frac{\gamma N}{1}$$

where the slope is $\gamma = X(1)$.

[Plot: throughput vs. nodes]

slide-27
SLIDE 27

But Our Cluster Isn’t Perfect

Linear scaling comes from subdividing tasks perfectly. What if a portion isn’t subdividable?

[Figure: serial vs. parallel portions of the work]

slide-29
SLIDE 29

Amdahl’s Law Describes Serialization

Amdahl’s Law describes throughput when a fraction $\sigma$ can’t be parallelized:

$$X(N) = \frac{\gamma N}{1 + \sigma(N - 1)}$$

Serialization is queueing.

[Plot: throughput vs. nodes]

slide-31
SLIDE 31

Amdahl’s Law Has An Asymptote

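Parallelism delivers speedup, but there’s a limit:

$$X(N) = \frac{\gamma N}{1 + \sigma(N - 1)}$$

$$\lim_{N \to \infty} X(N) = \frac{1}{\sigma}$$

(This takes $\gamma = 1$, so $X$ is speedup relative to one worker; in general the limit is $\gamma / \sigma$.)

For example, a 5% serialized task can’t be sped up more than 20-fold.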
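
To see the asymptote numerically, here is a small Python sketch of Amdahl’s Law as written on the slide. The function name `amdahl_throughput` is an illustrative assumption, and $\gamma$ defaults to 1 so the result reads as speedup over a single worker.

```python
def amdahl_throughput(n: int, sigma: float, gamma: float = 1.0) -> float:
    """Amdahl's Law: X(N) = gamma * N / (1 + sigma * (N - 1))."""
    return gamma * n / (1 + sigma * (n - 1))


sigma = 0.05  # 5% of the work is serialized
for n in (1, 10, 100, 1000, 10000):
    # Speedup keeps growing with N but never reaches 1 / sigma = 20.
    print(f"N = {n:>5}: speedup = {amdahl_throughput(n, sigma):.2f}")
```

However many workers are added, the speedup stays below $1/\sigma = 20$.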

slide-32
SLIDE 32

The Universal Scalability Law (USL)

In Which Frederick Brooks Laughs Last

slide-34
SLIDE 34

What If Workers Coordinate?

Suppose the parallel workers also ask each other for things? They’re making each other do extra work. As load increases, each task’s job gets harder.

slide-35
SLIDE 35

How Bad Is Coordination?

$N$ workers = $N(N - 1)$ pairs of interactions, which is $O(N^2)$ in $N$.

slide-36
SLIDE 36

The Universal Scalability Law

The USL adds a term for crosstalk, multiplied by the coefficient $\kappa$. Crosstalk is also called coordination or coherence penalty.

$$X(N) = \frac{\gamma N}{1 + \sigma(N - 1) + \kappa N(N - 1)}$$

Now there’s a point of diminishing returns!

[Plot: throughput vs. nodes]
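
The point of diminishing returns can be located by evaluating the USL over a range of concurrency. This Python sketch assumes the USL equation as given on the slide; the parameter values for $\gamma$, $\sigma$, and $\kappa$ are hypothetical and were picked only to make the peak easy to see.

```python
def usl_throughput(n: int, gamma: float, sigma: float, kappa: float) -> float:
    """USL: X(N) = gamma * N / (1 + sigma * (N - 1) + kappa * N * (N - 1))."""
    return gamma * n / (1 + sigma * (n - 1) + kappa * n * (n - 1))


# Hypothetical parameters, not taken from any measured system.
gamma, sigma, kappa = 1000.0, 0.05, 0.002

# Evaluate the model at N = 1..64 and find where throughput is highest.
curve = {n: usl_throughput(n, gamma, sigma, kappa) for n in range(1, 65)}
peak_n = max(curve, key=curve.get)

print(f"throughput peaks at N = {peak_n}: {curve[peak_n]:.0f} tasks/s")
print(f"at N = 64 it has dropped to {curve[64]:.0f} tasks/s")
```

Unlike Amdahl’s Law, which only flattens out, the crosstalk term makes throughput fall once concurrency passes the peak.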

slide-37
SLIDE 37

The USL Describes Behavior Under Load

The USL explains the highly nonlinear behavior we know systems exhibit near their saturation point. desmos.com/calculator/3cycsgdl0b

slide-38
SLIDE 38

Application

In Which Things Are Even Worse Than We Thought

slide-39
SLIDE 39

Applying the USL to the Real World

Behold, I give you two metrics of concurrency and throughput. What do they mean?

slide-40
SLIDE 40

Let’s Scatterplot Concurrency vs Throughput

This is the USL’s input and output. Is it linear?

slide-41
SLIDE 41

It Looks Highly Linear, Doesn’t It?

R² = 0.9781

Don’t celebrate yet.

slide-42
SLIDE 42

Fit the USL Equation with Regression

[Plot: throughput vs. concurrency/load, showing measured throughput and the modeled USL curve]

Now the picture looks totally different!
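
The slides do not say which regression technique was used, so what follows is only one possible approach: a pure-Python least-squares fit on a linearized form of the USL. It assumes a throughput measurement at concurrency 1 is available to serve as $\gamma$. The data here is synthetic, generated from hypothetical parameters, and the names `fit_usl` and `usl_throughput` are illustrative.

```python
def usl_throughput(n, gamma, sigma, kappa):
    """USL: X(N) = gamma * N / (1 + sigma * (N - 1) + kappa * N * (N - 1))."""
    return gamma * n / (1 + sigma * (n - 1) + kappa * n * (n - 1))


def fit_usl(samples):
    """Estimate (gamma, sigma, kappa) from (concurrency, throughput) pairs.

    Assumption: samples include a measurement at concurrency 1, used as gamma.
    """
    gamma = dict(samples)[1]
    # Linearize: with x = N - 1 and y = gamma * N / X(N) - 1, the USL becomes
    # y = kappa * x^2 + (sigma + kappa) * x, a quadratic through the origin.
    pts = [(n - 1, gamma * n / x_n - 1) for n, x_n in samples if n > 1]
    s4 = sum(x ** 4 for x, _ in pts)
    s3 = sum(x ** 3 for x, _ in pts)
    s2 = sum(x ** 2 for x, _ in pts)
    s2y = sum(x ** 2 * y for x, y in pts)
    s1y = sum(x * y for x, y in pts)
    # Solve the 2x2 normal equations for y = a*x^2 + b*x by Cramer's rule.
    det = s4 * s2 - s3 * s3
    a = (s2y * s2 - s1y * s3) / det
    b = (s4 * s1y - s3 * s2y) / det
    return gamma, b - a, a  # sigma = b - a, kappa = a


# Synthetic, noise-free data from hypothetical parameters (not real measurements).
samples = [(n, usl_throughput(n, 1000.0, 0.05, 0.002)) for n in (1, 2, 4, 8, 16, 24, 32)]
gamma, sigma, kappa = fit_usl(samples)
print(f"gamma = {gamma:.1f}, sigma = {sigma:.4f}, kappa = {kappa:.4f}")
```

Real measurements are noisy and often lack a clean sample at concurrency 1, so a nonlinear least-squares fit that also estimates $\gamma$ is usually a better choice in practice.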

slide-43
SLIDE 43

How Much Headroom Does This System Have?

[Plot: throughput vs. concurrency/load, showing measured throughput and the modeled USL curve]

There's not much headroom. Just by looking, you can tell this system has maybe 10-15% more to give.

slide-44
SLIDE 44

Profit???

In Which We Do The Impossible

slide-45
SLIDE 45

What is the System’s Primary Bottleneck?

The regression gives estimates of the USL parameters. The parameters have physical meaning.

$$X(N) = \frac{\gamma N}{1 + \sigma(N - 1) + \kappa N(N - 1)}$$

  • $\gamma$ is the single-threaded throughput, $X(1)$.
  • $\sigma$ is the fraction that’s serialized/queued.
  • $\kappa$ is the fraction that’s crosstalk/coherency.

slide-46
SLIDE 46

This System Is Sublinear Because Of Queueing

[Plot: throughput vs. concurrency/load, showing measured throughput and the modeled USL curve]

$\sigma$ = 7.4%, $\kappa$ = 0.1%
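
As a rough worked example (not shown on the slide), the fitted values can be plugged into the standard USL result for the concurrency at which modeled throughput peaks. Setting $dX/dN = 0$ in the USL equation gives:

$$N_{\max} = \sqrt{\frac{1 - \sigma}{\kappa}} = \sqrt{\frac{1 - 0.074}{0.001}} = \sqrt{926} \approx 30.4$$

Assuming the rounded parameters above, the model predicts that throughput stops growing at a concurrency of about 30.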

slide-48
SLIDE 48

Slides and Contact Information

Slides are at https://www.xaprb.com/talks/ or you can scan the QR code. Contact: baron@vividcortex.com, @xaprb

slide-49
SLIDE 49

Further Reading & References

  • Neil Gunther, author of the USL.
  • My USL book.
  • My USL Excel workbook.
  • Eben Freeman’s LISA17 talk and slides.
  • Kavya Joshi’s QCon talk.
  • There are lots of good books on queueing theory and scalability from Neil Gunther, Mor Harchol-Balter, Gross & Harris, etc.