SLIDE 1

Approximate Optimality with Bounded Regret in Dynamic Matching Models

Ana Bušić

Inria Paris
CS Department of École normale supérieure

Joint work with Sean Meyn (University of Florida), and with Ivo Adan, Varun Gupta, Jean Mairesse, and Gideon Weiss

Real-Time Decision Making, Simons Institute, Berkeley, Jun. 27 - Jul. 1, 2016

SLIDE 2

Outline

1. Background
2. Bipartite matching model
3. Optimization
  • Average Cost Criterion
  • Workload
  • Workload Relaxation
  • Asymptotic optimality
4. Final remarks

SLIDE 3

Background

Bipartite Matching

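$(\mathcal{D}, \mathcal{S}, E)$ is a bipartite graph, with
$\mathcal{D}(s) = \{d \in \mathcal{D} : (d, s) \in E\}$ and $\mathcal{S}(d) = \{s \in \mathcal{S} : (d, s) \in E\}$.

[Figure: bipartite graph on demand classes 1, 2, 3 (D) and supply classes 1', 2', 3' (S).]

$x_i$: number of elements of type $i \in \mathcal{D} \cup \mathcal{S}$.

Perfect matching: $m \in \mathbb{N}^E$ such that
$$x_d = \sum_{s \in \mathcal{S}(d)} m_{ds}, \ \forall d \in \mathcal{D}, \qquad x_s = \sum_{d \in \mathcal{D}(s)} m_{ds}, \ \forall s \in \mathcal{S}.$$

Hall's marriage theorem (1935): a perfect matching exists if and only if
$$\sum_{d \in U} x_d \le \sum_{s \in \mathcal{S}(U)} x_s, \ \forall U \subset \mathcal{D}, \qquad \sum_{s \in V} x_s \le \sum_{d \in \mathcal{D}(V)} x_d, \ \forall V \subset \mathcal{S}.$$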
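The two families of Hall inequalities can be checked directly on a small instance. The following is a minimal sketch, assuming class counts are stored in dictionaries and edges are (demand, supply) pairs; the graph `EDGES` and the function names are hypothetical and are not taken from the slides. It enumerates all subsets, so it is only meant for small graphs.

```python
from itertools import combinations

def neighbours(subset, edges, side):
    """Neighbours of `subset`; side=0 if `subset` is demand, side=1 if supply."""
    return {e[1 - side] for e in edges if e[side] in subset}

def hall_condition(x_d, x_s, edges):
    """Return True if both families of Hall inequalities hold (brute force)."""
    for counts, other, side in ((x_d, x_s, 0), (x_s, x_d, 1)):
        classes = list(counts)
        for r in range(1, len(classes) + 1):
            for subset in combinations(classes, r):
                lhs = sum(counts[i] for i in subset)
                rhs = sum(other[j] for j in neighbours(subset, edges, side))
                if lhs > rhs:
                    return False
    return True

# Hypothetical graph: edges are (demand class, supply class) pairs.
EDGES = [(1, "1'"), (1, "2'"), (2, "2'"), (2, "3'"), (3, "3'")]
SUPPLY_COUNTS = {"1'": 1, "2'": 1, "3'": 1}

balanced = hall_condition({1: 1, 2: 1, 3: 1}, SUPPLY_COUNTS, EDGES)
blocked = hall_condition({1: 0, 2: 0, 3: 3}, SUPPLY_COUNTS, EDGES)
```

In the second call all three demand items are of class 3, which in this hypothetical graph can only use supply class 3', so the inequality for U = {3} fails.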

SLIDE 5

Background

Matching in Health-care

Kidney paired donation: who can join this program?

For recipients: If you are eligible for a kidney transplant and are receiving care at a transplant center in the United States, you can join ... You must have a living donor who is willing and medically able to donate his or her kidney ...

For donors: You must also be willing to take part ...

United Network for Organ Sharing: Talking About Transplantation

Need for Dynamic Matching Models

SLIDE 6

Background

Dynamic Matching Model: FCFS

Another example: Boston area public housing (some 25 years ago).

Model:

  • Two independent infinite sequences of items; demand and supply are i.i.d.
  • The FCFS matching policy admits a product-form invariant distribution.

Caldentey, Kaplan, Weiss 2009; Adan, Weiss 2012; Adan, B., Mairesse, Weiss 2015

SLIDE 10

Bipartite matching model

Dynamic Bipartite Matching Model

Multiclass queueing model in which supply and demand play symmetric roles.

  • Discrete-time queueing model with two types of arrival: "supply" and "demand".
  • At each time step, one customer and one server arrive into the system, independently of the past.
  • Instantaneous matchings according to a bipartite matching graph.
  • Unmatched supply/demand is stored in a buffer.

The model is given by: a matching graph, a joint probability measure $\mu$ for arrivals of demand/supply, and a matching policy.
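The model description above can be turned into a few lines of simulation. This is a minimal sketch under assumptions that are not in the slides: demand and supply classes are drawn independently from their marginals (the slides allow a general joint measure), and the policy is a simple greedy rule that scans edges in list order. The function name `simulate_greedy`, the graph, and the probabilities are all hypothetical.

```python
import random

def simulate_greedy(edges, mu_d, mu_s, steps, seed=0):
    """Simulate the buffers under a greedy, edge-order priority policy."""
    rng = random.Random(seed)
    d_classes, s_classes = list(mu_d), list(mu_s)
    q_d = {d: 0 for d in d_classes}
    q_s = {s: 0 for s in s_classes}
    for _ in range(steps):
        # One demand item and one supply item arrive at each time step.
        d = rng.choices(d_classes, weights=[mu_d[c] for c in d_classes])[0]
        s = rng.choices(s_classes, weights=[mu_s[c] for c in s_classes])[0]
        q_d[d] += 1
        q_s[s] += 1
        # Match instantaneously while some edge has both endpoints non-empty.
        matched = True
        while matched:
            matched = False
            for i, j in edges:
                if q_d[i] > 0 and q_s[j] > 0:
                    q_d[i] -= 1
                    q_s[j] -= 1
                    matched = True
    return q_d, q_s

# Hypothetical instance.
EDGES = [("d1", "s1"), ("d1", "s2"), ("d2", "s2"), ("d2", "s3"), ("d3", "s3")]
MU_D = {"d1": 0.4, "d2": 0.3, "d3": 0.3}
MU_S = {"s1": 0.3, "s2": 0.3, "s3": 0.4}
q_d, q_s = simulate_greedy(EDGES, MU_D, MU_S, steps=1000)
```

Because every step adds one item on each side and every match removes one from each side, the total demand and total supply in the buffers stay equal throughout.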

SLIDE 11

Bipartite matching model

Dynamic Matching Model: Stability

For the dynamic model with i.i.d. arrivals, when is the Markovian model stable (positive recurrent)?

  • Necessary condition: a generalization of Hall's marriage theorem.
  • Under this condition, certain policies are stabilizing, such as MaxWeight.
  • Under this condition, other policies are not stabilizing.

B., Gupta, Mairesse 2013.

SLIDE 12

Bipartite matching model

Dynamic Matching Model: Approximate Optimality

Subject of this talk:

  • How to define 'heavy traffic'? This requires a formulation of 'network load'.
  • What is the structure of an optimal policy for the model in heavy traffic?
  • How do we use this structure for policy design?

B., Meyn 2016

SLIDE 15

Bipartite matching model

Necessary stability conditions

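Assumption: the matching graph $(\mathcal{D}, \mathcal{S}, E)$ is connected.

Necessary conditions: if the model is stable, then the marginals of $\mu$ satisfy

NCond: $\mu_{\mathcal{D}}(U) < \mu_{\mathcal{S}}(\mathcal{S}(U))$ for all $U \subsetneq \mathcal{D}$, and $\mu_{\mathcal{S}}(V) < \mu_{\mathcal{D}}(\mathcal{D}(V))$ for all $V \subsetneq \mathcal{S}$.

Sufficient conditions: if NCond holds, then there exists a policy that is stabilizing.

Prop. Given $[(\mathcal{D}, \mathcal{S}, E), \mu]$, there exists an algorithm of time complexity $O((|\mathcal{D}| + |\mathcal{S}|)^3)$ to decide if NCond is satisfied.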
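For a small graph, NCond can be verified by direct enumeration. The sketch below is a brute-force check over all nonempty proper subsets, so it runs in exponential time and is not the cubic-time algorithm mentioned in the proposition. The function name `ncond`, the graph, and the marginals are hypothetical; exact fractions are used so that the strict inequalities are not affected by rounding.

```python
from fractions import Fraction as F
from itertools import combinations

def ncond(mu_d, mu_s, edges):
    """Brute-force check of NCond over nonempty proper subsets (exponential time)."""
    def one_side(mass, other_mass, side):
        classes = list(mass)
        for r in range(1, len(classes)):          # nonempty, proper subsets only
            for subset in combinations(classes, r):
                nbrs = {e[1 - side] for e in edges if e[side] in subset}
                lhs = sum(mass[i] for i in subset)
                rhs = sum(other_mass[j] for j in nbrs)
                if not lhs < rhs:                 # the inequality must be strict
                    return False
        return True
    return one_side(mu_d, mu_s, 0) and one_side(mu_s, mu_d, 1)

# Hypothetical graph and marginals.
EDGES = [("d1", "s1"), ("d1", "s2"), ("d2", "s2"), ("d2", "s3"), ("d3", "s3")]
stable_case = ncond(
    {"d1": F(2, 5), "d2": F(3, 10), "d3": F(3, 10)},
    {"s1": F(3, 10), "s2": F(3, 10), "s3": F(2, 5)},
    EDGES,
)
uniform_case = ncond(
    {"d1": F(1, 3), "d2": F(1, 3), "d3": F(1, 3)},
    {"s1": F(1, 3), "s2": F(1, 3), "s3": F(1, 3)},
    EDGES,
)
```

With uniform marginals on this hypothetical graph, the set U = {d3} gives equality rather than strict inequality, so NCond fails.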

SLIDE 19

Optimization Average Cost Criterion

Optimization

Cost function $c$ on buffer levels. Average cost:
$$\eta = \limsup_{N \to \infty} \frac{1}{N} \sum_{t=0}^{N-1} \mathsf{E}\bigl[c(Q(t))\bigr]$$

Queue dynamics:
$$Q(t+1) = Q(t) - U(t) + A(t), \qquad t \ge 0$$

The input process $U$ represents the sequence of matching activities. Input space:
$$\mathsf{U}_\diamond = \Bigl\{ \sum_{e \in E} n_e u^e : n_e \in \mathbb{Z}_+ \Bigr\}, \quad \text{with } u^e = 1^i + 1^j \text{ for } e = (i, j) \in E.$$

$X(t) = Q(t) + A(t)$ is the state process of the MDP model:
$$X(t+1) = X(t) - U(t) + A(t+1)$$

The state space is $\mathsf{X}_\diamond = \{x \in \mathbb{Z}_+^\ell : \xi^0 \cdot x = 0\}$, with $\xi^0 = (1, \dots, 1, -1, \dots, -1)$.
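One step of the state recursion can be written out to see why the state stays in the balanced set. This is a small sketch, assuming three demand and three supply classes stored in one list (demand entries first); the sizes, the chosen edge, and the helper names are hypothetical.

```python
N_DEMAND, N_SUPPLY = 3, 3                      # hypothetical sizes
ELL = N_DEMAND + N_SUPPLY
XI0 = [1] * N_DEMAND + [-1] * N_SUPPLY         # xi^0 = (1, ..., 1, -1, ..., -1)

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def indicator(k):
    """Unit vector with a one in position k."""
    return [1 if idx == k else 0 for idx in range(ELL)]

def match_vector(i, j):
    """u^e = 1^i + 1^j for an edge e = (demand class i, supply class j)."""
    return [a + b for a, b in zip(indicator(i), indicator(N_DEMAND + j))]

def step(x, u, a_next):
    """X(t+1) = X(t) - U(t) + A(t+1)."""
    return [xi - ui + ai for xi, ui, ai in zip(x, u, a_next)]

x = [1, 1, 0, 1, 0, 1]          # balanced state: two demand items, two supply items
u = match_vector(0, 0)          # match demand class 0 with supply class 0
a_next = match_vector(2, 1)     # next arrival pair: demand class 2, supply class 1
x_next = step(x, u, a_next)
```

Both a single match and a single arrival pair are orthogonal to $\xi^0$, so the balance constraint is preserved at every step.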

SLIDE 23

Optimization Workload

Workload

For any $D \subset \mathcal{D}$, the corresponding workload vector $\xi^D$ is defined so that
$$\xi^D \cdot x = \sum_{i \in D} x_i^{\mathcal{D}} - \sum_{j \in \mathcal{S}(D)} x_j^{\mathcal{S}}$$

Necessary and sufficient condition for a stabilizing policy:

NCond: $\delta_D := -\xi^D \cdot \alpha > 0$ for each $D$,

where $\alpha = \mathsf{E}[A(t)]$ is the arrival rate vector.

Why is this workload? It is consistent with routing/scheduling models:

  • Fluid model: $\frac{d}{dt} x(t) = -u(t) + \alpha$.
  • The minimal time to reach the origin from $x(0) = x$: $T^*(x) = \max_D \frac{\xi^D \cdot x}{\delta_D}$.

Heavy traffic: $\delta_D \sim 0$ for one or more $D$.
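The workload vectors, the slack $\delta_D$, and the fluid draining time can be computed directly from these definitions. The sketch below assumes a hypothetical graph and arrival-rate vector, and takes the maximum over nonempty proper subsets $D$ of the demand classes; all names are illustrative.

```python
from fractions import Fraction as F
from itertools import combinations

DEMAND = ["d1", "d2", "d3"]
SUPPLY = ["s1", "s2", "s3"]
EDGES = [("d1", "s1"), ("d1", "s2"), ("d2", "s2"), ("d2", "s3"), ("d3", "s3")]
ALPHA = {"d1": F(2, 5), "d2": F(3, 10), "d3": F(3, 10),
         "s1": F(3, 10), "s2": F(3, 10), "s3": F(2, 5)}

def workload_vector(D):
    """xi^D: +1 on the classes in D, -1 on the supply classes S(D), 0 elsewhere."""
    s_of_d = {s for d, s in EDGES if d in D}
    xi = {d: (1 if d in D else 0) for d in DEMAND}
    xi.update({s: (-1 if s in s_of_d else 0) for s in SUPPLY})
    return xi

def dot(xi, x):
    return sum(xi[k] * x.get(k, 0) for k in xi)

def delta(D):
    """delta_D = -xi^D . alpha."""
    return -dot(workload_vector(D), ALPHA)

def demand_subsets():
    for r in range(1, len(DEMAND)):
        yield from combinations(DEMAND, r)

def minimal_draining_time(x):
    """T*(x) = max_D (xi^D . x) / delta_D."""
    return max(dot(workload_vector(D), x) / delta(D) for D in demand_subsets())

t_star = minimal_draining_time({"d3": 2, "s1": 2})
```

In this instance the sets {d3} and {d2, d3} both have slack 1/10, the smallest among all subsets, so they are the ones closest to heavy traffic.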

SLIDE 28

Optimization Workload

Workload Dynamics

Fix one workload vector $\xi^D$; denote $(\xi, \delta)$ for $(\xi^D, \delta_D)$.

  • The workload $W(t) = \xi \cdot X(t)$ can be positive or negative.
  • Dynamics as in other queueing models: $\mathsf{E}[W(t+1) - W(t) \mid X(t), U(t)] \ge -\delta$.
  • The lower bound is achieved $\iff$ $\mathcal{S}(D)$ matches with $D$ only.
  • Workload relaxation: take this as the model for control.
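The drift bound follows from the state recursion $X(t+1) = X(t) - U(t) + A(t+1)$ in a few lines. The derivation below is a sketch, assuming that $A(t+1)$ is independent of $(X(t), U(t))$ with mean $\alpha$, as in the i.i.d. arrival model.

```latex
\begin{align*}
W(t+1) - W(t) &= \xi \cdot \bigl( X(t+1) - X(t) \bigr)
               = -\xi \cdot U(t) + \xi \cdot A(t+1), \\
\mathsf{E}\bigl[ W(t+1) - W(t) \mid X(t), U(t) \bigr]
              &= -\xi \cdot U(t) + \xi \cdot \alpha
               = -\xi \cdot U(t) - \delta, \\
\xi \cdot u^e &= \mathbf{1}\{ i \in D \} - \mathbf{1}\{ j \in \mathcal{S}(D) \} \le 0
               \quad \text{for } e = (i, j) \in E .
\end{align*}
```

The last inequality holds because $i \in D$ forces $j \in \mathcal{S}(D)$. Hence $-\xi \cdot U(t) \ge 0$, with equality exactly when no supply class in $\mathcal{S}(D)$ is matched with a demand class outside $D$.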

SLIDE 33

Optimization Workload Relaxation

Relaxations

A workload relaxation takes this as the model for control. One-dimensional workload relaxation:
$$W(t+1) = W(t) - \delta + I(t) + \Delta(t+1),$$
where $I(t) \ge 0$ is the idleness and $\Delta(t+1)$ has zero mean.

Effective cost $\bar{c} : \mathbb{R} \to \mathbb{R}_+$: given a cost function $c$ for $Q$,
$$\bar{c}(w) = \min\{c(x) : \xi \cdot x = w\}.$$
It is piecewise linear if $c$ is linear: $\bar{c}(w) = \max(\bar{c}_+ w, -\bar{c}_- w)$.

Conclusions:

  • Control of the relaxation = inventory model of Clark & Scarf (1960).
  • Hedging policy with threshold $\tau^*$: idling is not permitted unless $W(t) < -\tau^*$.
  • Heavy traffic: for average-cost optimal control,
$$\tau^* \sim \frac{1}{2} \frac{\sigma^2}{\delta} \log\Bigl(1 + \frac{\bar{c}_+}{\bar{c}_-}\Bigr).$$
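The threshold formula and the hedging rule can be tried out numerically on the one-dimensional relaxation. This is a rough sketch under assumptions that are not in the slides: the noise $\Delta$ is a fair ±1 coin flip, idleness is set to its largest permitted value $\max(-W(t) - \tau, 0)$, and all parameter values and function names are hypothetical.

```python
import math
import random

def hedging_threshold(sigma2, delta, c_plus, c_minus):
    """Heavy-traffic approximation tau* ~ (1/2)(sigma^2/delta) log(1 + c+/c-)."""
    return 0.5 * sigma2 / delta * math.log(1.0 + c_plus / c_minus)

def simulate_relaxation(delta, tau, c_plus, c_minus, steps, seed=0):
    """Simulate W(t+1) = W(t) - delta + I(t) + Delta(t+1) under a hedging rule."""
    rng = random.Random(seed)
    w, total_cost, lowest = 0.0, 0.0, 0.0
    for _ in range(steps):
        total_cost += max(c_plus * w, -c_minus * w)   # effective cost of W(t)
        idleness = max(-w - tau, 0.0)                 # idle only if W(t) < -tau
        noise = rng.choice((-1.0, 1.0))               # zero-mean noise (assumed)
        w = w - delta + idleness + noise
        lowest = min(lowest, w)
    return total_cost / steps, lowest

tau = hedging_threshold(sigma2=1.0, delta=0.1, c_plus=4.0, c_minus=4.0)
avg_cost, lowest = simulate_relaxation(0.1, tau, 4.0, 4.0, steps=10_000)
```

Under this rule the workload is pushed back to $-\tau$ whenever it falls below it, so it can never drop further than one drift-plus-noise step below the threshold.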

SLIDE 34

Optimization Asymptotic optimality

Asymptotic optimality

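Family of arrival processes $\{A^\delta(t)\}$, parameterized by $\delta \in [0, \bar{\delta}^\bullet]$, with $\bar{\delta}^\bullet \in (0, 1)$.

Additional assumptions:

(A1) For one set $D \subsetneq \mathcal{D}$ we have $\xi^D \cdot \alpha^\delta = -\delta$, where $\alpha^\delta$ denotes the mean of $A^\delta(t)$. Moreover, there is a fixed constant $\underline{\delta} > 0$ such that $\xi^{D'} \cdot \alpha^\delta \le -\underline{\delta}$ for any $D' \subsetneq \mathcal{D}$, $D' \ne D$, and $\delta \in [0, \bar{\delta}^\bullet]$.

(A2) The distributions are continuous at $\delta = 0$, with linear rate: for some constant $b$, $\mathsf{E}[\|A^\delta(t) - A^0(t)\|] \le b\delta$.

(A3) The graph structure for arrivals and for feasible matches is independent of $\delta \ge 0$ $\implies$ the matching graph is connected even for $\delta = 0$. Moreover, there exist $i_0 \in \mathcal{S}(D)$, $j_0 \in D^c$, and $p_I > 0$ such that
$$\mathsf{P}\{A^\delta_{i_0}(t) \ge 1 \text{ and } A^\delta_{j_0}(t) \ge 1\} \ge p_I, \qquad 0 \le \delta \le \bar{\delta}^\bullet.$$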

SLIDE 36

Optimization Asymptotic optimality

Asymptotic optimality

h-MWT (h-MaxWeight with threshold) policy: for a differentiable function $h : \mathbb{R}^\ell \to \mathbb{R}_+$ and a threshold $\tau \ge 0$,
$$\phi(x) = \arg\max_u \ u \cdot \nabla h(x) \quad \text{subject to } u \text{ feasible and } I(t) \le \max(-W(t) - \tau, 0),$$
when $X(t) = x$ and $U(t) = u$.

Thm (Asymptotic Optimality With Bounded Regret) [B., Meyn '16]: There is an h-MWT policy with finite average cost $\eta$, satisfying
$$\hat{\eta}^* \le \eta^* \le \eta \le \hat{\eta}^* + O(1),$$
where $\eta^*$ is the optimal average cost for the MDP model, $\hat{\eta}^*$ is the optimal average cost for the workload relaxation, and the term $O(1)$ does not depend upon $\delta$.

SLIDE 38

Optimization Asymptotic optimality

Asymptotic optimality

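The average cost for the relaxation satisfies the uniform bound
$$\hat{\eta}^* = \hat{\eta}^{**} + O(1),$$
where $\hat{\eta}^{**}$ is the optimal cost for the diffusion approximation for the relaxation:
$$\hat{\eta}^{**} = \tau^* \bar{c}_- = \frac{1}{2} \frac{\sigma_N^2}{\delta} \bar{c}_- \log\Bigl(1 + \frac{\bar{c}_+}{\bar{c}_-}\Bigr).$$

The function $h$ used in the policy has the form
$$h(x) = \hat{h}(\xi \cdot x) + h_c(x)$$

  • $h_c$ is introduced to penalize deviations between $c(x)$ and $\bar{c}(\xi \cdot x)$.
  • The first term $\hat{h}$ is a function of workload. For $w \ge -\tau^*$, it solves the second-order differential equation
$$-\delta \hat{h}'(w) + \tfrac{1}{2} \sigma_\Delta^2 \hat{h}''(w) = -\bar{c}(w) + \hat{\eta}^{**}. \tag{1}$$
    There is a solution that is convex and increasing on $[-\tau^*, \infty)$, with $\hat{h}'(-\tau^*) = \hat{h}''(-\tau^*) = 0$. It is then extended to obtain a convex $C^2$ function on $\mathbb{R}$.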
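As a consistency check, the boundary conditions at $w = -\tau^*$ can be substituted into equation (1). The short derivation below is a sketch that assumes the piecewise-linear effective cost $\bar{c}(w) = \max(\bar{c}_+ w, -\bar{c}_- w)$ and $\tau^* \ge 0$, so that $\bar{c}(-\tau^*) = \bar{c}_- \tau^*$.

```latex
\begin{align*}
-\delta\, \hat{h}'(-\tau^*) + \tfrac{1}{2} \sigma_\Delta^2\, \hat{h}''(-\tau^*)
    &= -\bar{c}(-\tau^*) + \hat{\eta}^{**} \\
0   &= -\bar{c}_- \tau^* + \hat{\eta}^{**} \\
\hat{\eta}^{**} &= \tau^* \bar{c}_-
\end{align*}
```

This recovers the first equality in the expression for $\hat{\eta}^{**}$, so the two boundary conditions and the stated optimal cost are mutually consistent.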

SLIDE 40

Examples

Example


[Figure: average cost estimated in simulation, plotted against the parameter $r$ (axis ticks 5 to 30, cost axis 62 to 76), with $\bar{r}^* = 14.9$ marked.]

SLIDE 41

Examples

Example

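Example cost: $c(x) = x_1^{\mathcal{D}} + 2x_2^{\mathcal{D}} + 3x_3^{\mathcal{D}} + 3x_1^{\mathcal{S}} + 2x_2^{\mathcal{S}} + x_3^{\mathcal{S}}$ $\implies$ effective cost: $\bar{c}(w) = 4|w|$.

[Figure: matching graph on demand classes $x_1^{\mathcal{D}}, x_2^{\mathcal{D}}, x_3^{\mathcal{D}}$ and supply classes $x_1^{\mathcal{S}}, x_2^{\mathcal{S}}, x_3^{\mathcal{S}}$, with edges $e_1, \dots, e_5$.]

[Figure: average cost comparisons over time $T$ (horizontal axis up to $5 \times 10^6$, cost axis 50 to 150) for the Priority, MaxWeight, and Threshold (15) policies.]

Workload: $W(t) = Q_3^{\mathcal{D}}(t) - Q_1^{\mathcal{S}}(t)$.

Matching of Supply 1 and Demand 2 is allowed only if $W(t) < -\tau^*$.

Workload relaxation:

  • $Q_1^{\mathcal{S}}(t) = Q_2^{\mathcal{S}}(t) = 0$ if $W(t) > 0$
  • $Q_2^{\mathcal{D}}(t) = Q_3^{\mathcal{D}}(t) = 0$ if $W(t) < 0$

Simulation with $\tau = 14.9$.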
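The effective cost $\bar{c}(w) = 4|w|$ can be checked by enumeration for small integer workloads. The sketch below assumes the workload $w = x_3^{\mathcal{D}} - x_1^{\mathcal{S}}$ from this example and the balance constraint (total demand equals total supply) from the state space definition; it searches only integer states up to a hypothetical cap `max_level`, so it is a sanity check rather than a proof.

```python
from itertools import product

# Cost coefficients in the order (d1, d2, d3, s1, s2, s3), as in the example.
COST = (1, 2, 3, 3, 2, 1)

def effective_cost(w, max_level=4):
    """Minimise c(x) over balanced integer states with x_3^D - x_1^S = w."""
    best = None
    for x in product(range(max_level + 1), repeat=6):
        d1, d2, d3, s1, s2, s3 = x
        if d1 + d2 + d3 != s1 + s2 + s3:      # balance: xi^0 . x = 0
            continue
        if d3 - s1 != w:                       # workload constraint
            continue
        cost = sum(c * v for c, v in zip(COST, x))
        if best is None or cost < best:
            best = cost
    return best

values = {w: effective_cost(w) for w in range(-3, 4)}
```

For positive $w$ the minimiser found this way holds demand only in class 3 and supply only in class 3, and for negative $w$ it holds supply only in class 1 and demand only in class 1, which agrees with the empty-buffer statements of the workload relaxation.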

SLIDE 42

Final remarks

Final remarks

  • Performance bounds?
  • Approximate optimal control for relaxations in higher dimensions?
  • More general arrival assumptions.
  • Admission control? Abandonments?
  • Optimization for non-bipartite matching?
  • Applications?

SLIDE 43

References

References

Dynamic bipartite matching models

  • Caldentey, Kaplan, Weiss. FCFS infinite bipartite matching of servers and customers. Adv. Appl. Probab., 2009.
  • Adan, Weiss. Exact FCFS matching rates for two infinite multi-type sequences. Operations Research, 2012.
  • Bušić, Gupta, Mairesse. Stability of the bipartite matching model. Adv. Appl. Probab., 2013.
  • Mairesse, Moyal. Stability of the stochastic matching model. ArXiv, 2014.
  • Adan, Bušić, Mairesse, Weiss. Reversibility and further properties of FCFS infinite bipartite matching. ArXiv, 2015.
  • Bušić, Meyn. Approximate optimality with bounded regret in dynamic matching models. ArXiv, 2016.

Workload relaxations

  • Meyn. Control Techniques for Complex Networks. Cambridge Univ. Press, 2007.
  • Meyn. Stability and asymptotic optimality of generalized MaxWeight policies. SIAM J. Control Optim., 2009.
  • Gurvich, Ward. On the dynamic control of matching queues. Stoch. Systems, 2014.
