Introductory Queueing Theory Tutorial, Mor Harchol-Balter



SLIDE 1

Introductory Queueing Theory Tutorial

Mor Harchol-Balter
Computer Science Dept, Carnegie Mellon Univ.

SLIDE 2

Outline

I. Basic Vocabulary

  • Avg arrival rate, λ
  • Avg service rate, µ
  • Avg load, ρ
  • Avg throughput, X
  • Response time, T
  • Little’s Law
  • Exponential vs. Pareto/Heavy-tailed
  • Poisson Process

II. Single-server queues

  • M/G/1 response time
  • Inspection Paradox
  • Effect of job size variability
  • Effect of load
  • Scheduling: FCFS, PS, SJF, LAS, SRPT
  • Scheduling: Priority Classes
  • Scheduling: SOAP Framework (New)

III. Multi-server queues

  • Single shared queue, M/G/k
  • Load balancing across queues
  • Cycle stealing
  • Replication of jobs (New)
  • Multi-task jobs and fork-join (New)
  • Networks of queues

SLIDE 3

Vocabulary

[Diagram: jobs arrive at avg. arrival rate λ jobs/sec to an FCFS queue whose server has avg. service rate µ jobs/sec]

Assume λ < µ throughout.

S = job size = service requirement, with $E[S] = 1/\mu$ sec

Example:

  • On average, job needs 3×10^6 cycles
  • Server executes 9×10^6 cycles/sec
  • Avg service rate on this server: $\mu = 3$ jobs/sec
  • Avg size of job on this server: $E[S] = 1/3$ sec
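The arithmetic in this example can be sketched in a few lines of Python. The function name `service_rate` is an assumption made for illustration; the cycle counts are the ones on the slide.

```python
from fractions import Fraction

def service_rate(cycles_per_job, cycles_per_sec):
    """Avg service rate mu in jobs/sec (hypothetical helper, not from the slides)."""
    return Fraction(cycles_per_sec, cycles_per_job)

mu = service_rate(cycles_per_job=3 * 10**6, cycles_per_sec=9 * 10**6)
mean_size = 1 / mu  # E[S] = 1/mu, in seconds

print(mu)         # 3
print(mean_size)  # 1/3
```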

SLIDE 4

Vocabulary: Load

[Diagram: FCFS queue with arrival rate λ jobs/sec and service rate µ jobs/sec]

S: job size, $E[S] = 1/\mu$

Load (utilization): $\rho = \lambda/\mu = \lambda E[S]$

  • Frac. time server busy

Example:

  • $\lambda = 2$ jobs/sec arrive
  • Each job requires $E[S] = 1/3$ sec on avg
  • $\rho = 2 \cdot \frac{1}{3} = \frac{2}{3}$
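As a quick check of the load example, here is a minimal sketch; the helper name `load` is an assumption, and the numbers are taken from the slide.

```python
from fractions import Fraction

def load(arrival_rate, mean_job_size):
    """rho = lambda * E[S]; it is a busy fraction only when rho < 1."""
    return arrival_rate * mean_job_size

lam = Fraction(2)           # jobs/sec
mean_size = Fraction(1, 3)  # sec per job
rho = load(lam, mean_size)

print(rho)  # 2/3
```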

SLIDE 5

Vocabulary: Throughput

Defn: Throughput X is the average rate at which jobs complete (jobs/sec)

QUESTION: Which has higher throughput, X?

[Diagram: two single-server queues, each with arrival rate λ jobs/sec; one server has service rate µ jobs/sec, the other 2µ jobs/sec]

Assume λ < µ.

SLIDE 6

Vocabulary: Throughput

[Diagram: single-server queue with arrival rate λ jobs/sec and service rate µ jobs/sec]

X: avg rate at which jobs complete

$X = \lambda$ (assuming no jobs dropped)
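A small simulation can illustrate why throughput equals λ whether the server runs at rate µ or 2µ. This is a sketch under assumptions the slide does not state: Poisson arrivals, exponential job sizes, and the specific values λ = 2 and µ = 3.

```python
import random

def simulate_throughput(lam, mu, num_jobs=200_000, seed=1):
    """Simulate an FCFS single-server queue and return completions per second."""
    rng = random.Random(seed)
    arrival = 0.0
    departure = 0.0  # departure time of the previous job
    for _ in range(num_jobs):
        arrival += rng.expovariate(lam)          # next arrival time
        start = max(arrival, departure)          # wait if the server is busy
        departure = start + rng.expovariate(mu)  # exponential job size (assumed)
    return num_jobs / departure

lam, mu = 2.0, 3.0
x_slow = simulate_throughput(lam, mu)
x_fast = simulate_throughput(lam, 2 * mu)
print(round(x_slow, 2), round(x_fast, 2))  # both should be close to lam
```

Doubling the service rate shortens response time, but jobs still cannot complete faster than they arrive.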

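SLIDE 7

Vocabulary: Response Time

[Diagram: single-server queue with arrival rate λ jobs/sec and service rate µ jobs/sec; T spans queue plus service, T_Q spans the queue only]

S: job size, $E[S] = 1/\mu$

$\rho = \lambda/\mu = \lambda E[S]$

T = response time

T_Q = queueing time (waiting time)

N = number of jobs in system

Little’s Law: $E[T] = E[N]/\lambda$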
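Little's Law can be checked on a case with known closed forms. The sketch below assumes an M/M/1 queue (Poisson arrivals, exponential job sizes), for which E[N] = ρ/(1 − ρ) and E[T] = 1/(µ − λ); the M/M/1 assumption and the values λ = 2, µ = 3 are illustrative and not taken from this slide.

```python
from fractions import Fraction

lam = Fraction(2)  # jobs/sec
mu = Fraction(3)   # jobs/sec
rho = lam / mu

mean_jobs = rho / (1 - rho)      # E[N] for M/M/1
mean_response = mean_jobs / lam  # Little's Law: E[T] = E[N] / lambda

print(mean_jobs)      # 2
print(mean_response)  # 1

# Agrees with the M/M/1 closed form E[T] = 1 / (mu - lam).
assert mean_response == 1 / (mu - lam)
```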

SLIDE 8

Vocabulary: Response Time

[Diagram: single-server queue with arrival rate λ jobs/sec and service rate µ jobs/sec; T spans queue plus service, T_Q spans the queue only]

S: job size, $E[S] = 1/\mu$

$\rho = \lambda/\mu = \lambda E[S]$

T = response time

T_Q = queueing time (waiting time)

Q: Given that λ < µ, what causes wait?
A: Variability in the arrival process & service requirements

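SLIDE 9

Vocabulary: Response Time

[Diagram: single-server queue with arrival rate λ jobs/sec and service rate µ jobs/sec; T spans queue plus service, T_Q spans the queue only]

S: job size, $E[S] = 1/\mu$

$\rho = \lambda/\mu = \lambda E[S]$

  • Variability in job size, S
  • Variability in arrival process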

SLIDE 10

Job Size Distributions

“Most jobs are small; few jobs are large”

$S \sim \mathrm{Exp}(\mu)$: $\Pr\{S > x\} = e^{-\mu x}$

$S \sim \mathrm{Pareto}(\alpha)$: $\Pr\{S > x\} = 1/x^{\alpha}$ (heavy tail)

[Plots: Pr{S > x} versus x for each distribution; y-axis ticks 1, ½, ¼; x-axis ticks 0 to 8 (Exp) and 1 to 9 (Pareto)]

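SLIDE 11

Job Size Distributions

$S \sim \mathrm{Exp}(\mu)$: $\Pr\{S > x\} = e^{-\mu x}$

$S \sim \mathrm{Pareto}(\alpha = 1)$: $\Pr\{S > x\} = 1/x$

[Plots: Pr{S > x} versus x for each distribution; y-axis ticks 1, ½, ¼; x-axis ticks 0 to 8 (Exp) and 1 to 9 (Pareto)]

[Timeline: time axis marked δ, 2δ, ..., 8δ; a coin with probability µδ of heads is flipped in each interval of length δ]

S is time until coin with prob µδ comes up heads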
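The coin-flip picture can be checked numerically: the chance of seeing no heads in the first x/δ flips is (1 − µδ)^(x/δ), which approaches e^(−µx) as δ shrinks. The sketch below uses assumed values µ = 3 and x = 0.5, and the helper name is hypothetical.

```python
import math

def tail_by_coin_flips(mu, x, delta):
    """Pr{no heads in the first x/delta flips}, each flip heads w.p. mu*delta."""
    flips = round(x / delta)
    return (1 - mu * delta) ** flips

mu, x = 3.0, 0.5
for delta in (0.1, 0.01, 0.001):
    print(delta, tail_by_coin_flips(mu, x, delta))
print("exp", math.exp(-mu * x))  # limit as delta -> 0
```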

SLIDE 12

Job Size Distributions

$S \sim \mathrm{Exp}(\mu)$: $\Pr\{S > x\} = e^{-\mu x}$

  • “Memoryless”
  • Lower variability
  • Light-tail: top 1% of jobs comprise 5% load.

$S \sim \mathrm{Pareto}(\alpha = 1)$: $\Pr\{S > x\} = 1/x$

  • Decreasing hazard rate
  • Infinite variance
  • Heavy-tail: top 1% of jobs comprise 50% load.

[Plots: Pr{S > x} versus x for each distribution; y-axis ticks 1, ½, ¼; x-axis ticks 0 to 8 (Exp) and 1 to 9 (Pareto)]
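The "top 1% of jobs" figures can be reproduced approximately from closed forms. For Exp(µ), the largest fraction p of jobs carries p(1 − ln p) of the load. For an unbounded Pareto with Pr{S > x} = x^(−α), x ≥ 1, the fraction is p^((α−1)/α), but only when α > 1; with α = 1 as on the slide the mean is infinite, so the value α = 1.1 below is an assumption for illustration, and the slide's 50% figure presumably refers to a specific distribution not given here.

```python
import math

def exp_top_load(p):
    """Load fraction from the largest fraction p of jobs when S ~ Exp(mu).
    The result does not depend on mu."""
    return p * (1 - math.log(p))

def pareto_top_load(p, alpha):
    """Same quantity for Pr{S > x} = x**(-alpha), x >= 1, with alpha > 1."""
    if alpha <= 1:
        raise ValueError("mean job size is infinite for alpha <= 1")
    return p ** ((alpha - 1) / alpha)

print(round(exp_top_load(0.01), 3))          # 0.056
print(round(pareto_top_load(0.01, 1.1), 2))  # assumed alpha = 1.1
```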

SLIDE 13

Job Size Distributions

$S \sim \mathrm{Exp}(\mu)$: $\Pr\{S > x\} = e^{-\mu x}$

  • “Memoryless”
  • Lower variability
  • Light-tail: top 1% of jobs comprise 5% load.

$S \sim \mathrm{Pareto}(\alpha = 1)$: $\Pr\{S > x\} = 1/x$

Representative of:

  • UNIX job sizes
  • Supercomputing job sizes
  • File sizes
  • Human wealth
  • Damage due to forest fires, earthquakes, etc.

[Plots: Pr{S > x} versus x for each distribution; y-axis ticks 1, ½, ¼; x-axis ticks 0 to 8 (Exp) and 1 to 9 (Pareto)]

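SLIDE 14

Variability

[Diagram: single-server queue with arrival rate λ jobs/sec and service rate µ jobs/sec; T spans queue plus service, T_Q spans the queue only]

S: job size, $E[S] = 1/\mu$

$\rho = \lambda/\mu = \lambda E[S]$

  • Variability in job size, S
  • Variability in arrival process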

SLIDE 15

Vocabulary: Poisson Process with rate λ

(Poisson process comes up when aggregating many users)

[Timeline: time axis marked δ, 2δ, ..., 9δ with three arrivals; each gap between consecutive arrivals is $S \sim \mathrm{Exp}(\lambda)$]
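A Poisson process with rate λ can be generated by adding up independent Exp(λ) gaps, as the timeline suggests. The following is a minimal sketch; the function name, seed, and horizon are assumptions for illustration.

```python
import random

def poisson_arrivals(lam, horizon, seed=0):
    """Return arrival times in [0, horizon) with i.i.d. Exp(lam) gaps."""
    rng = random.Random(seed)
    times = []
    t = rng.expovariate(lam)
    while t < horizon:
        times.append(t)
        t += rng.expovariate(lam)  # gap until the next arrival
    return times

horizon = 10_000.0
arrivals = poisson_arrivals(lam=2.0, horizon=horizon)
print(len(arrivals) / horizon)  # empirical rate, close to lam
```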

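SLIDE 17

Single-Server Queue

[Diagram: single-server queue with arrival rate λ jobs/sec and service rate µ jobs/sec; T spans queue plus service, T_Q spans the queue only]

S: job size, $E[S] = 1/\mu$

$\rho = \lambda/\mu = \lambda E[S]$

M/G/1:

  • Exponential inter-arrival times (M = memoryless)
  • General i.i.d. service times
  • 1 server

Q: Does low ρ ⇒ low $E[T_Q]$?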

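SLIDE 18

Single-Server Queue

[Diagram: single-server queue with arrival rate λ jobs/sec and service rate µ jobs/sec; T spans queue plus service, T_Q spans the queue only]

S: job size, $E[S] = 1/\mu$

$\rho = \lambda/\mu = \lambda E[S]$

M/G/1:

  • Exponential inter-arrival times (M = memoryless)
  • General i.i.d. service times
  • 1 server

A: low load does NOT ensure low wait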

SLIDE 19

M/G/1

$E[T_Q] = \frac{\rho}{1-\rho} \cdot \frac{E[S^2]}{2E[S]}$

  • Where is this coming from?

A: low load does NOT ensure low wait
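To see why low load does not ensure low wait, the M/G/1 formula E[T_Q] = ρ/(1 − ρ) · E[S²]/(2E[S]) can be evaluated for two job size distributions with the same mean. This is a sketch; the load of 0.3 and the second-moment values are assumptions chosen for illustration.

```python
def mg1_mean_wait(lam, mean_size, second_moment):
    """E[T_Q] for M/G/1 FCFS: rho/(1-rho) * E[S^2] / (2 E[S])."""
    rho = lam * mean_size
    if rho >= 1:
        raise ValueError("need rho < 1 for a stable queue")
    return rho / (1 - rho) * second_moment / (2 * mean_size)

lam, mean_size = 0.3, 1.0  # rho = 0.3, a low load

# Exponential sizes: E[S^2] = 2 * E[S]^2
exp_wait = mg1_mean_wait(lam, mean_size, second_moment=2.0)
# Hypothetical high-variability sizes with the same mean
var_wait = mg1_mean_wait(lam, mean_size, second_moment=200.0)

print(exp_wait, var_wait)
```

With the same load, the mean wait scales directly with E[S²], so the high-variability case waits 100 times longer.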

SLIDE 20

Waiting for the bus

SLIDE 21

Waiting for the bus

S: time between buses, $E[S] = 10$ min

[Timeline: successive gaps S, S, S between buses along the time axis]

QUESTION: On average, how long do I have to wait for a bus?

  (a) < 5 min
  (b) 5 min
  (c) 10 min
  (d) > 10 min

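SLIDE 22

Waiting for the bus

S: time between buses

$E[\text{Wait}] = \frac{E[S^2]}{2E[S]} \gg E[S]$

[Timeline: successive gaps S, S, S between buses along the time axis; Wait is the time from a random arrival until the next bus]

“Inspection Paradox”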
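The formula E[Wait] = E[S²]/(2E[S]) can be evaluated for a few gap distributions with the same mean. The sketch below assumes a mean gap of 10 minutes, as the multiple-choice options on the previous slide suggest; the second-moment values are illustrative assumptions.

```python
def mean_wait_for_bus(mean_gap, second_moment):
    """E[Wait] = E[S^2] / (2 E[S]) for a rider arriving at a random time."""
    return second_moment / (2 * mean_gap)

mean_gap = 10.0  # minutes (assumed)

print(mean_wait_for_bus(mean_gap, 100.0))   # deterministic gaps, E[S^2] = 100: 5.0
print(mean_wait_for_bus(mean_gap, 200.0))   # exponential gaps, E[S^2] = 200: 10.0
print(mean_wait_for_bus(mean_gap, 1000.0))  # hypothetical high-variability gaps: 50.0
```

A random arrival is more likely to land in a long gap than a short one, which is why the wait grows with E[S²] even though the mean gap is unchanged.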

SLIDE 23

M/G/1

$E[T_Q] = \frac{\rho}{1-\rho} \cdot \frac{E[S^2]}{2E[S]}$

  • High load leads to high wait
  • High job size variability leads to high wait

To drop load, we can increase server speed.

Q: What can we do to combat job size variability?
A: Smarter scheduling!

SLIDE 24

Scheduling in M/G/1

[Diagram: single-server queue with arrival rate λ jobs/sec and service rate µ jobs/sec]

Well-studied scheduling policies:

  • FCFS (First-Come-First-Served, non-preemptive)
  • PS (Processor-Sharing, preemptive)
  • SJF (Shortest-Job-First, non-preemptive)
  • SRPT (Shortest-Remaining-Processing-Time, preemptive)
  • LAS (Least-Attained-Service First, preemptive)

SLIDE 25

Scheduling in M/G/1

  • FCFS (First-Come-First-Served, non-preemptive)
  • PS (Processor-Sharing, preemptive)
  • SJF (Shortest-Job-First, non-preemptive)
  • SRPT (Shortest-Remaining-Processing-Time, preemptive)
  • LAS (Least-Attained-Service First, preemptive)

[Plot: E[T] versus load ρ (0.2 to 1.0) with one curve each for FCFS, SJF, PS, LAS, and SRPT, under high job size variability]
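The effect of scheduling under high job size variability can be illustrated with a small simulation. The sketch below covers only the two non-preemptive policies, FCFS and SJF; the preemptive policies (PS, SRPT, LAS) are not simulated. The job size mix (99% small jobs, 1% very large ones, mean 1), the load of 0.7, and all function names are assumptions for illustration, not values from the slide.

```python
import heapq
import random

def make_jobs(n, lam, seed=0):
    """Poisson arrivals; sizes from an assumed two-branch mix with mean 1."""
    rng = random.Random(seed)
    t, jobs = 0.0, []
    for _ in range(n):
        t += rng.expovariate(lam)
        mean = 0.5 if rng.random() < 0.99 else 50.5
        jobs.append((t, rng.expovariate(1 / mean)))
    return jobs

def simulate(policy, jobs):
    """Non-preemptive single server; policy is 'FCFS' or 'SJF'.
    jobs is a list of (arrival, size) sorted by arrival. Returns mean E[T]."""
    waiting = []  # heap of (priority key, arrival, size)
    clock, total_response = 0.0, 0.0
    i, n = 0, len(jobs)
    while i < n or waiting:
        if not waiting and clock < jobs[i][0]:
            clock = jobs[i][0]  # server idles until the next arrival
        while i < n and jobs[i][0] <= clock:
            arrival, size = jobs[i]
            key = arrival if policy == "FCFS" else size
            heapq.heappush(waiting, (key, arrival, size))
            i += 1
        _, arrival, size = heapq.heappop(waiting)
        clock += size  # run the chosen job to completion
        total_response += clock - arrival
    return total_response / n

jobs = make_jobs(n=100_000, lam=0.7)
fcfs = simulate("FCFS", jobs)
sjf = simulate("SJF", jobs)
print(f"FCFS E[T] ~ {fcfs:.1f}, SJF E[T] ~ {sjf:.1f}")
```

Both policies see the same arrival sequence, so any difference in mean response time comes from the service order alone.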

SLIDE 26

Priority Classes

[Diagram: two arrival streams (jobs/sec each), labeled 1st and 2nd priority]

  • Class 1’s load and variability can really affect class 2

According to Ruth Williams (genetic networks):

  • Jobs → molecules
  • Server → enzyme
  • Classes → protein species
  • Reneging → dilution

SLIDE 27

Big Scheduling Breakthrough

[Scully, Harchol-Balter, Scheller-Wolf SIGMETRICS 2018]

The SOAP framework enables first analysis of many previously intractable policies:

  • SERPT: Prioritize jobs by Expected Remaining Size
  • Gittins: Prioritize jobs by their Gittins Index
  • Discretized Policies: Preemptions only at specific ages
  • Mixed Priority Classes: Priority classes, where each class can have its own scheduling policy.

SLIDE 29

M/G/k Model

[Diagram: one shared queue feeding k servers]

When a server frees up, it grabs the next available job.

Q: How does M/G/k compare with an M/G/1 whose single server runs at k times the speed?
A: Both worse and better!

SLIDE 30

Load Balancing Model

Probabilistically split into independent queues.

[Diagram: arrivals are routed to three queues with probabilities p1, p2, p3]

SLIDE 31

Load Balancing Model

[Diagram: a load balancer (L.B.) dispatches arrivals to the queues]

  • Round-Robin
  • Join-Shortest-Queue
  • Least-Work-Left
  • Size-Interval Assignment

Smart Load Balancing ⇒ Much reduced mean response time

SLIDE 32

Cycle Stealing Model (N-model)

[Diagram: a load balancer (L.B.) routes A’s and B’s to two servers]

  • One server: Only A’s.
  • Other server: B’s have priority, but if idle, then work on A’s.

2D-infinite Markov chain

SLIDE 33

Replication Model

[Gardner, Harchol-Balter, Scheller-Wolf Transactions on Networking 2017]
[Gardner, Harchol-Balter, Scheller-Wolf Operations Research 2017]

SLIDE 34

Replication Model

[Gardner, Harchol-Balter, Scheller-Wolf Transactions on Networking 2017]
[Gardner, Harchol-Balter, Scheller-Wolf Operations Research 2017]

Same job goes to multiple queues. Job is “done” as soon as first copy completes.

SLIDE 37

Replication Model

[Gardner, Harchol-Balter, Scheller-Wolf Transactions on Networking 2017]
[Gardner, Harchol-Balter, Scheller-Wolf Operations Research 2017]

Same job goes to multiple queues. Job is “done” as soon as first copy completes.

Replication Tradeoff:

  + Lower response time because only need first completion.
  − Higher response time due to extra load.

nD-infinite Markov chain

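SLIDE 38

Multi-task Job Model

[Diagram: arriving jobs feed servers numbered 1, 2, 3, 4, 5, ..., 10^4. A “job with 3 tasks” (three Map tasks) is headed for servers 1, 3, and 5.]

SLIDE 39

Multi-task Job Model

[Diagram: servers 1, 2, 3, 4, 5, ..., 10^4; the first job's tasks are at servers 1, 3, and 5.]

SLIDE 40

Multi-task Job Model

[Diagram: servers 1, 2, 3, 4, 5, ..., 10^4; the first job's tasks are at servers 1, 3, and 5. A second “job with 3 tasks” (three Map tasks) arrives, headed for servers 1, 2, and 5.]

SLIDE 41

Multi-task Job Model

[Diagram: servers 1, 2, 3, 4, 5, ..., 10^4; tasks of the first job at servers 1, 3, and 5, and tasks of the second job at servers 1, 2, and 5.]

SLIDE 42

Multi-task Job Model

[Diagram: servers 1, 2, 3, 4, 5, ..., 10^4; tasks of the first job at servers 1, 3, and 5, and tasks of the second job at servers 1, 2, and 5. A “job with 4 tasks” arrives, headed for servers 1, 2, 4, and 5.]

SLIDE 43

Multi-task Job Model

[Diagram: servers 1, 2, 3, 4, 5, ..., 10^4; tasks of the three jobs at servers 1, 3, 5; servers 1, 2, 5; and servers 1, 2, 4, 5.]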

SLIDE 44

Multi-task Job Model

Job not done until ALL its tasks are done

[Diagram: servers 1, 2, 3, 4, 5, ..., 10^4; tasks of the three jobs at servers 1, 3, 5; servers 1, 2, 5; and servers 1, 2, 4, 5.]

“Limited Fork-Join”: see [Wang, Harchol-Balter, Jiang, Scheller-Wolf, Srikant, 2018].

SLIDE 45

Networks of Queues Model

nD-infinite Markov chain

SLIDE 46

Conclusion

I. Basic Vocabulary

  • Avg arrival rate, λ
  • Avg service rate, µ
  • Avg load, ρ
  • Avg throughput, X
  • Response time, T
  • Little’s Law
  • Exponential vs. Pareto/Heavy-tailed
  • Poisson Process

II. Single-server queues

  • M/G/1 response time
  • Inspection Paradox
  • Effect of job size variability
  • Effect of load
  • Scheduling: FCFS, PS, SJF, LAS, SRPT
  • Scheduling: Priority Classes
  • Scheduling: SOAP Framework (New)

III. Multi-server queues

  • Single shared queue, M/G/k
  • Load balancing across queues
  • Cycle stealing
  • Replication of jobs (New)
  • Multi-task jobs and fork-join (New)
  • Networks of queues

SLIDE 47

THANK YOU!

www.cs.cmu.edu/~harchol/