High Level Comments Join the Shortest Queue Policy Dispatching with Limited Memory and Information Exchange
On the Resource/Performance Tradeoff in Large Scale Queueing Systems
David Gamarnik, MIT
Joint work with Patrick Eschenfeldt, John
High level comments
Many modern queueing systems are large scale.
Operating optimally requires large scale resources.
It is of interest to understand the best performance achievable under limited resource availability.
In this work we study:
the Join-the-Shortest-Queue (JSQ) policy in heavy traffic, compared with the M/M/N design;
dispatching policies with limited memory and limited information exchange in many-server queueing systems.
Join the Shortest Queue in heavy traffic
[Figure: n parallel servers labeled 1, 2, 3, ..., n]
$n$ parallel servers, Exp(1) service.
Pois($n\lambda_n$) arrivals, with $\lambda_n = 1 - \beta/\sqrt{n}$.
Upon arrival, a customer joins any shortest queue.
Compare with M/M/N: global buffer, join the smallest workload.
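The JSQ model above can be explored numerically. What follows is a minimal event-driven sketch, not the authors' code: the function name `simulate_jsq`, the empty initial state, and the uniform tie-breaking rule are assumptions made for illustration.

```python
import math
import random

def simulate_jsq(n, beta, horizon, seed=0):
    """Simulate JSQ with n Exp(1) servers and Pois(n * lambda_n) arrivals.

    Returns the list of queue lengths (including customers in service)
    at time `horizon`, starting from an empty system (an assumption).
    """
    rng = random.Random(seed)
    lam = 1.0 - beta / math.sqrt(n)   # per-server load lambda_n
    queues = [0] * n
    t = 0.0
    while True:
        busy = [i for i, q in enumerate(queues) if q > 0]
        total_rate = n * lam + len(busy)      # arrivals plus service completions
        t += rng.expovariate(total_rate)      # time to the next event
        if t > horizon:
            return queues
        if rng.random() < n * lam / total_rate:
            # Arrival: join any shortest queue (ties broken uniformly at random).
            shortest = min(queues)
            candidates = [i for i, q in enumerate(queues) if q == shortest]
            queues[rng.choice(candidates)] += 1
        else:
            # Departure: each busy server completes service at rate 1.
            queues[rng.choice(busy)] -= 1
```

For example, `simulate_jsq(100, 1.0, 5.0)` returns 100 queue lengths; counting zeros in the result gives the number of idle servers at time 5.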
Prior work on JSQ: fixed number of servers
Winston ’77: JSQ is the optimal policy if customers are routed to servers immediately.
Foschini and Salz ’78: diffusion limit for heavy traffic with a fixed number of servers.
Mukherjee, Borst, van Leeuwaarden and Whiting ’15: a combination of JSQ with a Supermarket Model.
Notation
$Q_i^n(t)$ is the number of servers with queue length at least $i$, including those in service.
$n \ge Q_1^n(t) \ge Q_2^n(t) \ge \cdots \ge 0$.
$Q_i^n(t) - Q_{i+1}^n(t)$ is the number of servers with exactly $i$ customers.
$n - Q_1^n(t)$ is the number of idle servers.
$X_1^n(t) = (Q_1^n(t) - n)/\sqrt{n}$, and $X_i^n(t) = Q_i^n(t)/\sqrt{n}$ for $i \ge 2$.
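To make the notation concrete, here is a small sketch that computes the tail counts $Q_i$ and the scaled coordinates $X_i$ from a snapshot of queue lengths. The helper names `tail_counts` and `scaled_state` are hypothetical and chosen only for this illustration.

```python
import math

def tail_counts(queues, k):
    """Return [Q_1, ..., Q_k]: Q_i counts servers with at least i customers."""
    return [sum(1 for q in queues if q >= i) for i in range(1, k + 1)]

def scaled_state(queues, k):
    """Return (Q, X) with X_1 = (Q_1 - n)/sqrt(n) and X_i = Q_i/sqrt(n), i >= 2."""
    n = len(queues)
    Q = tail_counts(queues, k)
    X = [(Q[0] - n) / math.sqrt(n)] + [Qi / math.sqrt(n) for Qi in Q[1:]]
    return Q, X

# Four servers holding 0, 1, 1 and 2 customers.
Q, X = scaled_state([0, 1, 1, 2], k=3)
print(Q)  # [3, 1, 0]
print(X)  # [-0.5, 0.5, 0.0]
```

In this snapshot $n - Q_1 = 1$ server is idle and $Q_1 - Q_2 = 2$ servers hold exactly one customer.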
Main result described qualitatively: queues
There are $O(\sqrt{n})$ idle servers and $O(\sqrt{n})$ servers with exactly one customer waiting.
Upon rescaling, they form a 2-dimensional reflected Ornstein-Uhlenbeck process.
Longer queues disappear in constant time.
Main result described qualitatively: waiting times
Two possibilities for an arriving customer:
At least one idle server, so zero wait.
No idle servers: join a queue behind one customer, so wait Exp(1).
Aggregate waiting time for customers arriving in $[0, t]$ is $O(\sqrt{n})$.
Order $n$ arrivals in $[0, t]$.
Fraction of customers who wait: $O(1/\sqrt{n})$.
Average waiting time: $O(1/\sqrt{n})$, the same as for M/M/N.
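The scaling of the average waiting time follows from dividing the aggregate wait by the number of arrivals. The short calculation below spells this out; it is a heuristic restatement of the slide's claims rather than a proof, and the numerical value of $n$ is an arbitrary illustration.

```latex
\begin{align*}
\text{aggregate wait over } [0,t] &= O(\sqrt{n}), \qquad
\text{arrivals in } [0,t] = \Theta(n), \\
\text{average wait per customer} &= \frac{O(\sqrt{n})}{\Theta(n)}
  = O\!\left(\frac{1}{\sqrt{n}}\right), \\
\text{e.g. } n = 10^4 &\;\Rightarrow\; \frac{1}{\sqrt{n}} = 10^{-2}.
\end{align*}
```

Since each customer who waits does so for an Exp(1) time with mean 1, the same division also suggests that the fraction of customers who wait is $O(1/\sqrt{n})$.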
JSQ as a reflected process
Fix $k \ge 3$, $b \in \mathbb{R}^k$, $y \in D^k$.
Theorem. There exists a unique solution $x(t)$ to the integral equation
\[ x(t) = b + y(t) + \int_0^t \Delta(x(s))\,ds + U(t), \qquad \int_0^\infty \mathbb{1}\{x(t) \in \cdot\}\,dU(t) = 0. \]
This is a variation on a result of Pang, Talreja, and Whitt ’07.
An integral equation
For $k \ge 3$, $B \in \mathbb{R}_+$, $b \in \mathbb{R}^k$, $y \in D^k$, there is a unique solution $(x, u)$ to
\begin{align*}
x_1(t) &= b_1 + y_1(t) + \int_0^t (-x_1(s) + x_2(s))\,ds - u_1(t), \\
x_2(t) &= b_2 + y_2(t) + \int_0^t (-x_2(s) + x_3(s))\,ds + u_1(t) - u_2(t), \\
x_i(t) &= b_i + y_i(t) + \int_0^t (-x_i(s) + x_{i+1}(s))\,ds, \quad 3 \le i \le k-1, \\
x_k(t) &= b_k + y_k(t) + \int_0^t -x_k(s)\,ds,
\end{align*}
subject to $x_1(t) \le 0$, $0 \le x_2(t) \le B$, $x_i(t) \ge 0$, $u_1(t), u_2(t) \ge 0$, $t \ge 0$, and
\[ \int_0^\infty \mathbb{1}\{x_1(t) < 0\}\,du_1(t) = 0, \qquad \int_0^\infty \mathbb{1}\{x_2(t) < B\}\,du_2(t) = 0. \]
JSQ heavy traffic limit
Theorem (Main Result). Suppose $X^n(0) \Rightarrow X(0)$ with $X_{k+1}^n(0) = 0$. Then $X^n \Rightarrow X$, where $X_1 \le 0$, $X_i \ge 0$ for $i \ge 2$, and there is a nondecreasing $U_1 \ge 0$ such that
\begin{align*}
X_1(t) &= X_1(0) + \sqrt{2}\,W(t) - \beta t + \int_0^t (-X_1(s) + X_2(s))\,ds - U_1(t), \\
X_2(t) &= X_2(0) + U_1(t) + \int_0^t (-X_2(s) + X_3(s))\,ds, \\
X_i(t) &= X_i(0) + \int_0^t (-X_i(s) + X_{i+1}(s))\,ds, \quad 3 \le i \le k-1, \\
X_k(t) &= X_k(0) + \int_0^t -X_k(s)\,ds, \\
X_i(t) &= 0, \quad i \ge k+1, \\
0 &= \int_0^\infty \mathbb{1}\{X_1(t) < 0\}\,dU_1(t),
\end{align*}
where $W$ is a standard Brownian motion.
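One way to get a feel for the limit process is to discretize it. The sketch below applies a simple Euler scheme with a projection step at the boundary $X_1 = 0$, under the simplifying assumption that $X_3 \equiv 0$ (no queues longer than two), so only $(X_1, X_2, U_1)$ are tracked. The function name `simulate_limit`, the step size, and the projection scheme are illustrative choices, not taken from the talk.

```python
import math
import random

def simulate_limit(beta, x1_0, x2_0, horizon, dt=1e-3, seed=0):
    """Euler scheme for (X1, X2, U1), assuming X3 = 0 throughout.

    X1 is kept <= 0 by projection; the amount projected away is the
    increment of the regulator U1, which in turn feeds X2.
    """
    rng = random.Random(seed)
    x1, x2, u1 = x1_0, x2_0, 0.0
    path = [(0.0, x1, x2, u1)]
    steps = int(horizon / dt)
    for step in range(1, steps + 1):
        noise = math.sqrt(2.0 * dt) * rng.gauss(0.0, 1.0)   # sqrt(2) dW
        x1_free = x1 + noise - beta * dt + (-x1 + x2) * dt  # unreflected update
        push = max(x1_free, 0.0)     # dU1: positive only when X1 would exceed 0
        x2 = x2 + push - x2 * dt     # X2 gains dU1 and decays at rate X2
        x1 = x1_free - push          # reflect X1 back to the boundary
        u1 += push
        path.append((step * dt, x1, x2, u1))
    return path
```

By construction $U_1$ increases only at steps where $X_1$ ends at 0, which is a discrete analogue of the complementarity condition in the theorem.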
Proof outline
Introduce a truncated approximation of the system.
Show that the truncated system converges to the Ornstein-Uhlenbeck process.
Show that the original and truncated systems have the same behavior with high probability (whp).
Truncated model
Initially no queue is longer than $k$.
Reject any arrival when $\hat{Q}_2^n(t) = n$.
$\hat{Q}_i^n(t)$, $i \ge 3$, decreases monotonically in $t$.
[Figure: servers labeled 1, 2, 3, ..., n − 2, n − 1, n]
Connecting truncated and untruncated
Since $Q_2^n(0) < n$, the truncated system and the full system are identical until the first time $\hat{Q}_2^n(t) = n$.
The weak convergence of the truncated system, $\hat{X}^n \Rightarrow X$, implies
\[ P\Big( \sup_{0 \le s \le t} \hat{Q}_2^n(s) \ge n \Big) \to 0. \]
This further implies $X^n \Rightarrow X$.
Open questions
Waiting time distribution for a customer arriving at time $t$.
Steady state of the limiting system.
Convergence of the steady state of the $n$-th system to the steady state of the limiting system (interchange of limits).
General service time distributions.
Dispatching with limited memory and information exchange
Resource Constrained Pull Based (RCPB) policy
[Figure: Dispatcher with stored server IDs 15, 3, 28, 6, 87, and a separate label 15]
$n$ parallel servers. Exp(1) service. Pois($\lambda n$) arrivals, $\lambda < 1$.
The dispatcher can store up to $C$ IDs of idle servers.
Idle servers send "reminders" at rate $\mu$.
A job is assigned to an idle server if at least one idle server ID is available; otherwise it is assigned uniformly at random (u.a.r.).
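The dispatcher side of the RCPB policy can be sketched as a small data structure. This is an illustration under stated assumptions, not the authors' specification: the class name `Dispatcher`, the rule that a reminder is dropped when the memory is full or the ID is already stored, and the choice of a uniformly random stored ID are all assumptions.

```python
import random

class Dispatcher:
    """Dispatcher with room for at most `capacity` (C) idle-server IDs."""

    def __init__(self, n_servers, capacity, seed=0):
        self.n = n_servers
        self.capacity = capacity
        self.tokens = []                  # IDs of servers known to be idle
        self.rng = random.Random(seed)

    def receive_reminder(self, server_id):
        """Store a reminder from an idle server; return True if it was kept."""
        if server_id not in self.tokens and len(self.tokens) < self.capacity:
            self.tokens.append(server_id)
            return True
        return False                      # assumed: dropped if full or duplicate

    def dispatch(self):
        """Choose a server for an arriving job."""
        if self.tokens:
            # A stored ID is available: send the job there and discard the ID.
            return self.tokens.pop(self.rng.randrange(len(self.tokens)))
        # No stored ID: route uniformly at random over all servers.
        return self.rng.randrange(self.n)
```

In this sketch the stored IDs stay valid because random routing happens only when the memory is empty, so a server whose ID is stored cannot receive a job before its ID is used.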
Some relevant literature
Badonnel and Burgess ’08: Pull-based load balancing.
Stolyar ’15: Pull-based load distribution in heterogeneous systems.
Literature on the Supermarket Model.
Main results: positive
$S^N(t) = (Q_i(t)/N,\ i \ge 1)$, where $Q_i(t)$ is the number of servers with queue length at least $i$.
$0 \le M(t) \le C$ is the number of tokens at the dispatcher.
Theorem.
ODE (fluid model limit): $S^N(t) \to s(t)$ as $N \to \infty$.
$s(t) \to s^*$ as $t \to \infty$.
Interchange of limits: $S^N(t) \xrightarrow{t \to \infty} \pi^N \xrightarrow{N \to \infty} s^*$.
Description of the equilibrium
Theorem. The equilibrium is given by
\[ P_0^* = \Big[ \sum_{0 \le k \le C} \Big( \frac{\mu(1-\lambda)}{\lambda} \Big)^k \Big]^{-1}, \qquad s_i^* = \lambda\,(\lambda P_0^*)^{i-1}, \quad i \ge 1, \]
\[ E[\mathrm{Delay}] = \frac{\lambda P_0^*}{1 - \lambda P_0^*}. \]
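The equilibrium formulas are easy to evaluate directly. The sketch below does so; the function name `equilibrium` and the `max_len` cutoff on how many $s_i^*$ to return are hypothetical conveniences.

```python
def equilibrium(lam, mu, C, max_len=5):
    """Evaluate P0*, (s_1*, ..., s_max_len*) and E[Delay] from the theorem."""
    ratio = mu * (1.0 - lam) / lam
    p0 = 1.0 / sum(ratio ** k for k in range(C + 1))
    s = [lam * (lam * p0) ** (i - 1) for i in range(1, max_len + 1)]
    delay = lam * p0 / (1.0 - lam * p0)
    return p0, s, delay

# lambda = 0.5, mu = 1, C = 1: the ratio mu(1 - lambda)/lambda equals 1.
p0, s, delay = equilibrium(0.5, 1.0, 1)
print(p0)      # 0.5
print(s[:2])   # [0.5, 0.125]
```

As a sanity check, with $C = 0$ the sum has a single term, so $P_0^* = 1$ and the delay reduces to $\lambda/(1-\lambda)$.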
Uniformly bounded delay in $\lambda \uparrow 1$
Note: as $\lambda \uparrow 1$, the effective rate of messages, $(1-\lambda)\mu$, decreases.
Given a "budget" $\nu \ge (1-\lambda)\mu$ and memory size $C$, how does the delay scale as $\lambda \uparrow 1$?
Theorem. The delay is uniformly bounded in $\lambda$:
\[ \sup_{\lambda < 1} E[\mathrm{Delay}] \le \Big( \sum_{1 \le k \le C} \nu^k \Big)^{-1}. \]
Note: for the supermarket model, $E[\mathrm{Delay}] \sim \frac{1}{\log d} \log \frac{1}{1-\lambda}$.
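The contrast between the bounded delay and the supermarket model's logarithmic growth can be checked numerically. The sketch below assumes the budget is used in full, that is $(1-\lambda)\mu = \nu$, and plugs this into the equilibrium delay formula; the function names are hypothetical, and `supermarket_delay_scale` only evaluates the asymptotic expression from the note, not an exact delay.

```python
import math

def rcpb_delay(lam, nu, C):
    """Equilibrium E[Delay], assuming the message rate satisfies (1 - lam) * mu = nu."""
    p0 = 1.0 / sum((nu / lam) ** k for k in range(C + 1))
    return lam * p0 / (1.0 - lam * p0)

def delay_bound(nu, C):
    """Right-hand side of the theorem: (sum_{k=1}^{C} nu^k)^(-1)."""
    return 1.0 / sum(nu ** k for k in range(1, C + 1))

def supermarket_delay_scale(lam, d):
    """Asymptotic scale log(1 / (1 - lam)) / log(d) quoted for the supermarket model."""
    return math.log(1.0 / (1.0 - lam)) / math.log(d)

for lam in (0.5, 0.9, 0.99, 0.999):
    print(lam, rcpb_delay(lam, 1.0, 3), supermarket_delay_scale(lam, 2))
```

With $\nu = 1$ and $C = 3$ the bound is $1/3$; the first column of delays stays below it for every $\lambda$ printed, while the supermarket expression keeps growing as $\lambda \uparrow 1$.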
Lower bound on delays for general policies
[Figure: Dispatcher, queries, messages]
Dispatcher memory capacity $C \log n$.
The dispatcher queries some servers upon arrivals.
Servers send messages to the dispatcher.
The memory state is updated at "events".
Theorem. Every "symmetric" dispatching policy induces a delay bounded away from zero: for every $\lambda < 1$,
\[ \liminf_{\pi} E[\mathrm{Delay}_\pi] > 0. \]