On the Resource/Performance Tradeoff in Large Scale Queueing Systems



slide-1
SLIDE 1

Sections: High Level Comments | Join the Shortest Queue Policy | Dispatching with Limited Memory and Information Exchange

On the Resource/Performance Tradeoff in Large Scale Queueing Systems

David Gamarnik (MIT)

Joint work with Patrick Eschenfeldt, John Tsitsiklis and Martin Zubeldia (MIT)

slide-6
SLIDE 6

High level comments
  • Many modern queueing systems are large scale.
  • Operating optimally requires large-scale resources.
  • It is of interest to understand the best achievable performance when resources are limited.

In this work we study:
  • The Join-the-Shortest-Queue (JSQ) policy in heavy traffic, compared with the M/M/N design.
  • Dispatching policies with limited memory and limited information exchange in many-server queueing systems.

slide-7
SLIDE 7

Join the Shortest Queue in heavy traffic
[Figure: n parallel servers, labeled 1, 2, 3, ..., n]
  • $n$ parallel servers with Exp(1) service times.
  • Poisson arrivals at rate $n\lambda_n$, where $\lambda_n = 1 - \beta/\sqrt{n}$.
  • Each arrival joins any shortest queue.
  • Compare with M/M/N: a global buffer, i.e. join the smallest workload.
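
The model on this slide can be explored with a small event-driven simulation. The following is only a sketch under stated assumptions: the function name `simulate_jsq`, its parameters, the empty initial state, and the uniform tie-breaking rule are illustrative choices, not taken from the talk.

```python
import math
import random

def simulate_jsq(n, beta, t_end, seed=0):
    """Simulate n parallel Exp(1) servers under JSQ, starting empty.

    Arrivals are Poisson with rate n * lambda_n, lambda_n = 1 - beta / sqrt(n).
    Assumes beta < sqrt(n) so that the arrival rate is positive.
    Returns the list of queue lengths (including customers in service) at t_end.
    """
    rng = random.Random(seed)
    lam_n = 1.0 - beta / math.sqrt(n)
    arrival_rate = n * lam_n
    queues = [0] * n
    t = 0.0
    while True:
        busy = [i for i, q in enumerate(queues) if q > 0]
        # Each busy server completes service at rate 1.
        total_rate = arrival_rate + len(busy)
        t += rng.expovariate(total_rate)
        if t > t_end:
            break
        if rng.random() < arrival_rate / total_rate:
            # Arrival: join a shortest queue, ties broken uniformly (assumption).
            shortest = min(queues)
            candidates = [i for i, q in enumerate(queues) if q == shortest]
            queues[rng.choice(candidates)] += 1
        else:
            # Departure from a uniformly chosen busy server.
            queues[rng.choice(busy)] -= 1
    return queues

final_queues = simulate_jsq(n=100, beta=1.0, t_end=5.0)
```

Each event costs O(n) work here, which is fine for small experiments; a faster version would track the counts of servers at each queue length instead of individual queues.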

slide-10
SLIDE 10

Prior work on JSQ: fixed number of servers
  • Winston ’77: JSQ is the optimal policy if customers are routed to servers immediately.
  • Foschini and Salz ’78: diffusion limit for heavy traffic with a fixed number of servers.
  • Mukherjee, Borst, van Leeuwaarden and Whiting ’15: a combination of JSQ with a Supermarket Model.

slide-11
SLIDE 11

Notation
  • $Q_i^n(t)$ is the number of servers with queue length at least $i$, including those in service.
  • $n \ge Q_1^n(t) \ge Q_2^n(t) \ge \cdots \ge 0$.
  • $Q_i^n(t) - Q_{i+1}^n(t)$ is the number of servers with exactly $i$ customers.
  • $n - Q_1^n(t)$ is the number of idle servers.
  • $X_1^n(t) = (Q_1^n(t) - n)/\sqrt{n}$, $X_i^n(t) = Q_i^n(t)/\sqrt{n}$.
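
As a concrete reading of this notation, the sketch below computes the tail counts and the scaled state from a list of per-server queue lengths. The helper names `tail_counts` and `scaled_state` and the `max_len` cutoff are hypothetical, introduced only for illustration.

```python
import math

def tail_counts(queue_lengths, max_len):
    """Return [Q_1, ..., Q_max_len], Q_i = number of servers with length >= i."""
    return [sum(1 for q in queue_lengths if q >= i) for i in range(1, max_len + 1)]

def scaled_state(queue_lengths, max_len):
    """Return (Q, X) with X_1 = (Q_1 - n)/sqrt(n) and X_i = Q_i/sqrt(n) for i >= 2."""
    n = len(queue_lengths)
    root_n = math.sqrt(n)
    Q = tail_counts(queue_lengths, max_len)
    X = [(Q[0] - n) / root_n] + [q / root_n for q in Q[1:]]
    return Q, X

# Four servers: one idle, two with one customer, one with two customers.
Q, X = scaled_state([0, 1, 1, 2], max_len=3)
print(Q)  # [3, 1, 0]
print(X)  # [-0.5, 0.5, 0.0]
```

In this example $n - Q_1 = 1$ server is idle and $Q_1 - Q_2 = 2$ servers hold exactly one customer, matching the definitions above.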

slide-17
SLIDE 17

Main result described qualitatively: queues
  • There are $O(\sqrt{n})$ idle servers and $O(\sqrt{n})$ servers with exactly one customer waiting.
  • Upon rescaling, they form a 2-dimensional reflected Ornstein-Uhlenbeck process.
  • Longer queues disappear in constant time.

slide-24
SLIDE 24

Main result described qualitatively: waiting times

Two possibilities for an arriving customer:
  • At least one idle server, so zero wait.
  • No idle servers: join a queue behind one customer, so the wait is Exp(1).

Aggregate waiting time for customers arriving in $[0, t]$ is $O(\sqrt{n})$:
  • Order $n$ arrivals in $[0, t]$.
  • Fraction of customers who wait: $O(1/\sqrt{n})$.
  • Average waiting time is $O(1/\sqrt{n})$, the same as for M/M/N.
slide-25
SLIDE 25

JSQ as a reflected process

Fix $k \ge 3$, $b \in \mathbb{R}^k$, $y \in D^k$.

Theorem. There exists a unique solution $x(t)$ to the integral equation
$$x(t) = b + y(t) + \int_0^t \Delta(x(s))\,ds + U(t), \qquad \int_0^\infty \mathbf{1}\{x(t) \in \cdot\}\,dU(t) = 0.$$

This is a variation on a result of Pang, Talreja, and Whitt ’07.

slide-26
SLIDE 26

An integral equation

For $k \ge 3$, $B \in \mathbb{R}_+$, $b \in \mathbb{R}^k$, $y \in D^k$, there is a unique solution $(x, u)$ to
$$\begin{aligned}
x_1(t) &= b_1 + y_1(t) + \int_0^t (-x_1(s) + x_2(s))\,ds - u_1(t),\\
x_2(t) &= b_2 + y_2(t) + \int_0^t (-x_2(s) + x_3(s))\,ds + u_1(t) - u_2(t),\\
x_i(t) &= b_i + y_i(t) + \int_0^t (-x_i(s) + x_{i+1}(s))\,ds, \quad 3 \le i \le k-1,\\
x_k(t) &= b_k + y_k(t) + \int_0^t -x_k(s)\,ds,
\end{aligned}$$
subject to
$$x_1(t) \le 0, \quad 0 \le x_2(t) \le B, \quad x_i(t) \ge 0, \quad u_1(t), u_2(t) \ge 0, \quad t \ge 0,$$
$$\int_0^\infty \mathbf{1}\{x_1(t) < 0\}\,du_1(t) = 0, \qquad \int_0^\infty \mathbf{1}\{x_2(t) < B\}\,du_2(t) = 0.$$

slide-27
SLIDE 27

JSQ heavy traffic limit

Theorem (Main Result). Suppose $X^n(0) \Rightarrow X(0)$ with $X^n_{k+1}(0) = 0$. Then $X^n \Rightarrow X$, where $X_1 \le 0$, $X_i \ge 0$ for $i \ge 2$, and there is a nondecreasing $U_1 \ge 0$ such that
$$\begin{aligned}
X_1(t) &= X_1(0) + \sqrt{2}\,W(t) - \beta t + \int_0^t (-X_1(s) + X_2(s))\,ds - U_1(t),\\
X_2(t) &= X_2(0) + U_1(t) + \int_0^t (-X_2(s) + X_3(s))\,ds,\\
X_i(t) &= X_i(0) + \int_0^t (-X_i(s) + X_{i+1}(s))\,ds, \quad 3 \le i \le k-1,\\
X_k(t) &= X_k(0) + \int_0^t -X_k(s)\,ds,\\
X_i(t) &= 0, \quad i \ge k+1,\\
0 &= \int_0^\infty \mathbf{1}\{X_1(t) < 0\}\,dU_1(t),
\end{aligned}$$
where $W$ is a standard Brownian motion.
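
To get a feel for the limiting process, one can discretize it numerically. The sketch below is not from the talk: it assumes the simplest case where no queue is initially longer than 2 (so $X_i \equiv 0$ for $i \ge 3$), and it uses a plain Euler step followed by projection of $X_1$ onto $(-\infty, 0]$, with the pushed amount credited to $U_1$ and hence to $X_2$. The name `simulate_limit` and the step size are illustrative.

```python
import math
import random

def simulate_limit(x1_0, x2_0, beta, t_end, dt=1e-3, seed=0):
    """Euler scheme with projection for (X1, X2, U1), assuming X3 = 0.

    Requires x1_0 <= 0, x2_0 >= 0 and dt < 1.
    Returns a list of (t, X1, X2, U1) tuples.
    """
    rng = random.Random(seed)
    x1, x2, u1 = x1_0, x2_0, 0.0
    path = [(0.0, x1, x2, u1)]
    steps = int(round(t_end / dt))
    for j in range(1, steps + 1):
        dw = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment
        # Unconstrained step for X1: sqrt(2) dW - beta dt + (-X1 + X2) dt.
        x1_free = x1 + math.sqrt(2.0) * dw - beta * dt + (-x1 + x2) * dt
        # Minimal push that keeps X1 <= 0; it acts only when X1 would cross 0.
        du = max(x1_free, 0.0)
        # X2 gains the pushed amount and decays at rate X2 (since X3 = 0).
        x2 = x2 + du - x2 * dt
        x1 = x1_free - du
        u1 += du
        path.append((j * dt, x1, x2, u1))
    return path

path = simulate_limit(x1_0=-1.0, x2_0=0.5, beta=1.0, t_end=2.0)
```

The projection step is a discrete stand-in for the complementarity condition $\int \mathbf{1}\{X_1 < 0\}\,dU_1 = 0$: $U_1$ increases only at steps where $X_1$ is held at zero.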

slide-28
SLIDE 28

Proof outline
  • Introduce a truncated approximation of the system.
  • Show that the truncated system converges to the Ornstein-Uhlenbeck process.
  • Show that the original and truncated systems have the same behavior with high probability.

slide-29
SLIDE 29

Truncated model
  • Initially no queue is longer than $k$.
  • Reject any arrival when $\hat{Q}_2^n(t) = n$.
  • $\hat{Q}_i^n(t)$, $i \ge 3$, decreases monotonically in $t$.
[Figure: servers labeled 1, 2, 3, ..., n − 2, n − 1, n]

slide-31
SLIDE 31

Connecting truncated and untruncated

Since $Q_2^n(0) < n$, the truncated system and the full system are identical until the first time $\hat{Q}_2^n(t) = n$. The weak convergence of the truncated system, $\hat{X}^n \Rightarrow X$, implies
$$P\left(\sup_{0 \le s \le t} \hat{Q}_2^n(s) \ge n\right) \to 0.$$
This further implies $X^n \Rightarrow X$.

slide-32
SLIDE 32

Open questions
  • Waiting time distribution for a customer arriving at time $t$.
  • Steady state of the limiting system.
  • Convergence of the steady state in the $n$-th system to the steady state of the limiting system (interchange of limits).
  • General service time distributions.

slide-33
SLIDE 33

Dispatching with limited memory and information exchange

Resource Constrained Pull Based (RCPB) policy
[Figure: Dispatcher; labels 15, 3, 28, 6, 87 and 15]
  • $n$ parallel servers with Exp(1) service times.
  • Pois($\lambda n$) arrivals, $\lambda < 1$.
  • The Dispatcher can store up to $C$ IDs of idle servers.
  • Idle servers send "reminders" at rate $\mu$.
  • A job is assigned to an idle server if at least one idle server ID is available; otherwise it is assigned uniformly at random (u.a.r.).
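
The dispatcher side of this policy can be sketched as a small bounded token store. This is a hedged illustration, not the authors' specification: the class name `TokenDispatcher`, the rule that a token is discarded once used, the rule that a reminder is dropped when memory is full or the ID is already stored, and the random choice among stored tokens are all assumptions made here.

```python
import random

class TokenDispatcher:
    """Dispatcher that remembers at most `capacity` IDs of idle servers."""

    def __init__(self, n_servers, capacity, seed=0):
        self.n = n_servers
        self.capacity = capacity
        self.tokens = []  # IDs of servers believed to be idle
        self.rng = random.Random(seed)

    def on_reminder(self, server_id):
        """Handle a reminder from an idle server; return True if it was stored."""
        if len(self.tokens) < self.capacity and server_id not in self.tokens:
            self.tokens.append(server_id)
            return True
        return False  # memory full or ID already known (assumed: drop it)

    def dispatch(self):
        """Choose the server for an arriving job."""
        if self.tokens:
            # Use (and discard) a stored idle-server ID, chosen at random.
            return self.tokens.pop(self.rng.randrange(len(self.tokens)))
        # No idle-server ID available: route uniformly at random.
        return self.rng.randrange(self.n)

d = TokenDispatcher(n_servers=10, capacity=2)
d.on_reminder(3)
d.on_reminder(6)
d.on_reminder(8)      # dropped: memory already holds C = 2 IDs
first = d.dispatch()  # one of the stored IDs, 3 or 6
```

Memory use is bounded by $C$ IDs, and the message rate is controlled by how often idle servers call `on_reminder`, which is the resource tradeoff the slide describes.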

slide-34
SLIDE 34

Some relevant literature
  • Badonnel and Burgess ’08: Pull-based load balancing.
  • Stolyar ’15: Pull-based load distribution in heterogeneous systems.
  • Literature on the Supermarket Model.

slide-39
SLIDE 39

Main results: positive
  • $S^N(t) = (Q_i(t)/N,\ i \ge 1)$, where $Q_i(t)$ is the number of servers with queue length $\ge i$.
  • $0 \le M(t) \le C$ is the number of tokens at the Dispatcher.

Theorem
  • ODE (fluid model limit): $S^N(t) \to s(t)$ as $N \to \infty$.
  • $s(t) \to s^*$ as $t \to \infty$.
  • Interchange of limits: $S^N(t) \to \pi^N$ as $t \to \infty$, and $\pi^N \to s^*$ as $N \to \infty$.

slide-40
SLIDE 40

Description of the equilibrium

Theorem. The equilibrium is given by
$$P_0^* = \left[\sum_{0 \le k \le C} \left(\frac{\mu(1-\lambda)}{\lambda}\right)^k\right]^{-1}, \qquad s_i^* = \lambda\,(\lambda P_0^*)^{i-1}, \quad i \ge 1,$$
$$E[\text{Delay}] = \frac{\lambda P_0^*}{1 - \lambda P_0^*}.$$
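
The equilibrium formulas on this slide are easy to evaluate numerically. The sketch below does so and checks them against Little's law (mean number in system per server, divided by the arrival rate per server, equals one unit of service plus the delay). The function name `equilibrium` and the truncation level `max_i` are illustrative assumptions.

```python
def equilibrium(lam, mu, C, max_i=50):
    """Evaluate P0*, the first max_i entries of s*, and E[Delay].

    lam: arrival rate per server (0 < lam < 1)
    mu:  reminder rate of an idle server
    C:   number of server IDs the dispatcher can store
    """
    r = mu * (1.0 - lam) / lam
    p0 = 1.0 / sum(r ** k for k in range(C + 1))
    s = [lam * (lam * p0) ** (i - 1) for i in range(1, max_i + 1)]
    delay = lam * p0 / (1.0 - lam * p0)
    return p0, s, delay

p0, s, delay = equilibrium(lam=0.5, mu=1.0, C=1)
# Here r = 1, so P0* = 1/2, s* = (0.5, 0.125, ...), and E[Delay] = 1/3.
mean_sojourn = sum(s) / 0.5  # Little's law, per server
assert abs(mean_sojourn - (1.0 + delay)) < 1e-9
```

With $C = 0$ the sum has a single term, $P_0^* = 1$, and the delay reduces to $\lambda/(1-\lambda)$, the M/M/1 waiting time one expects when every job is routed uniformly at random.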

slide-45
SLIDE 45

Uniformly bounded delay in $\lambda \uparrow 1$
  • Note: as $\lambda \uparrow 1$, the effective rate of messages, $(1-\lambda)\mu$, decreases.
  • Given a "budget" $\nu \ge (1-\lambda)\mu$ and memory size $C$, how does the delay scale as $\lambda \uparrow 1$?

Theorem. The delay is uniformly bounded in $\lambda$:
$$\sup_{\lambda < 1} E[\text{Delay}] \le \left[\sum_{1 \le k \le C} \nu^k\right]^{-1}.$$

Note: for the supermarket model,
$$E[\text{Delay}] \sim \frac{1}{\log d}\,\log\left(\frac{1}{1-\lambda}\right).$$
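
A quick numerical check of this bound is sketched below. It assumes the budget is used exactly, $\nu = (1-\lambda)\mu$, and that the delay is given by $\lambda P_0^*/(1-\lambda P_0^*)$ with $P_0^* = [\sum_{0 \le k \le C} (\mu(1-\lambda)/\lambda)^k]^{-1}$; the function names are hypothetical.

```python
def delay_with_budget(lam, nu, C):
    """E[Delay] when the reminder rate is chosen so that (1 - lam) * mu = nu."""
    r = nu / lam  # equals mu * (1 - lam) / lam under the assumption above
    p0 = 1.0 / sum(r ** k for k in range(C + 1))
    return lam * p0 / (1.0 - lam * p0)

def delay_bound(nu, C):
    """Right-hand side of the theorem; requires C >= 1 and nu > 0."""
    return 1.0 / sum(nu ** k for k in range(1, C + 1))

nu, C = 1.0, 2
bound = delay_bound(nu, C)  # 1 / (1 + 1) = 0.5
for lam in (0.5, 0.9, 0.99, 0.999):
    d = delay_with_budget(lam, nu, C)
    assert d <= bound
    print(lam, d, bound)
```

Under these assumptions the delay can be rewritten as $\lambda / \big((1-\lambda) + \sum_{1 \le k \le C} (\nu/\lambda)^k\big)$, which makes the bound visible: the numerator is at most 1 and the denominator is at least $\sum_{1 \le k \le C} \nu^k$.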
slide-46
SLIDE 46

Lower bound on delays for general policies
[Figure: Dispatcher; labels: queries, messages]
  • Dispatcher memory capacity is $C \log n$.
  • The Dispatcher queries some servers upon arrivals.
  • Servers send messages to the Dispatcher.
  • The memory state is updated at "events".

slide-47
SLIDE 47

Lower bound on delays for general policies

Theorem. Every "symmetric" dispatching policy induces a delay bounded away from zero: for every $\lambda < 1$,
$$\liminf_{\pi} E[\text{Delay}_\pi] > 0.$$

slide-48
SLIDE 48

Thank you.