Optimal control of the MFG equilibrium for a pedestrian tourists' flow model



SLIDE 1

INTRODUCTION MODEL MAIN RESULT CONCLUSION

Optimal control of the MFG equilibrium for a pedestrian tourists’ flow model

  • R. Maggistro
  • F. Bagagiolo
  • S. Faggian
  • R. Pesenti

Department of Management 1 / 26

SLIDE 2

Introduction

Main goal: to define a mean field game model with both continuous and switching decision variables.

Motivations:

  • To model and analytically study the flow of tourists (or, more precisely, of daily pedestrian excursionists) along the narrow alleys of the historic center of a heritage city.
  • To define an optimization problem for an external controller who aims to induce a suitable mean field equilibrium.

SLIDE 3

Some features of the model

  • All excursionists have only two main attractions they want to visit: P1 and P2.
  • The excursionists arrive at the train station during a fixed interval of time.
  • They may decide to visit attraction P1 first and then attraction P2, or vice versa. This choice may depend, for example, on the crowdedness and the expected waiting time.
  • They have to return to the station at a fixed time T.
  • (Memory) Excursionists may occupy the same place in the path at the same instant while having different purposes: some have already visited P1 only, some P2 only, some both, some neither. Hence they have "different past histories".
  • During the day they split into several "populations" with different purposes, and possibly they eventually merge again into the same population.

SLIDE 4

The model

We describe the path of the excursionists inside the city as a circular network containing three nodes:

  • S: the train station
  • P1: attraction 1
  • P2: attraction 2

The position of an excursionist is given by the parameter θ ∈ [0, 2π], whose evolution is given by

\[
\begin{cases}
\theta'(s) = u(s), & s \in\, ]t, T], \\
\theta(t) = \theta,
\end{cases}
\]

to which we associate a time-varying label (w1, w2) ∈ {0, 1} × {0, 1}. For i ∈ {1, 2}, wi(t) = 1 means that, at time t, the attraction Pi has not been visited yet, and wi(t) = 0 that it has already been visited.
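A minimal Python sketch of these dynamics, under stated assumptions: the angular positions `theta_p1` and `theta_p2` of the attractions, the explicit Euler step `dt` and the function name `simulate` are illustrative choices that do not appear in the slides, and the wrap-around of the circle at 2π is ignored for brevity.

```python
def simulate(theta0, control, t0, T, theta_p1, theta_p2, dt=1e-3):
    """Integrate theta'(s) = u(s) with explicit Euler and track the labels (w1, w2).

    w_i = 1 means attraction P_i has not been visited yet, w_i = 0 that it has.
    theta_p1, theta_p2 and dt are hypothetical inputs, not values from the slides.
    """
    theta, w1, w2 = theta0, 1, 1
    s = t0
    while s < T:
        step = min(dt, T - s)              # do not overshoot the final time T
        new_theta = theta + control(s) * step
        lo, hi = min(theta, new_theta), max(theta, new_theta)
        if lo <= theta_p1 <= hi:           # the step passed through P1
            w1 = 0
        if lo <= theta_p2 <= hi:           # the step passed through P2
            w2 = 0
        theta = new_theta
        s += step
    return theta, w1, w2


# Usage: start at the station (theta = 0) and walk at unit speed for 3 time units.
final_state = simulate(0.0, lambda s: 1.0, 0.0, 3.0, theta_p1=1.0, theta_p2=4.0)
```

In this usage example the agent passes P1 but never reaches P2, so only the first label switches to 0.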

SLIDE 5

Need of Memory

The time-varying label (w1, w2) encodes the memory of the excursionists, that is, the information about which attractions they have already visited.

In the presence of more than one target, the Dynamic Programming Principle no longer holds and hence, in general, we do not recover a Hamilton-Jacobi-Bellman equation.

SLIDE 6

Need of Memory

Problem: visit three sites minimizing time, with evolution subject to y′(t) = f(y(t), u(t)), y(0) = x. The trajectory y(t) is optimal starting from x, but not when restarted from an intermediate point y(τ).

[Figure: a trajectory from x through the three targets T1, T2, T3, with the points y(t) and y(τ) marked.]

SLIDE 7

Need of Memory

[Figure: the three targets T1, T2, T3 and the intermediate point y(τ), marked with a question mark.]

SLIDE 8

State Space

The state of an agent is then (θ, w1, w2), and we denote by B = [0, 2π] × {0, 1} × {0, 1} the state space of the variables (θ, w1, w2).

[Figure: the evolution inside the network.]

SLIDE 9

State Space

We call a (circle-)branch any B_{w1,w2} ⊂ B which includes the states (θ, w1, w2), with (w1, w2) fixed and θ varying in [0, 2π]. Such branches correspond to the edges of the switching network.

[Figure: the switching network.]

Here g : [0, T] → [0, +∞[ is the exogenous arrival flow at the station, representing, roughly speaking, the density of arriving tourists per unit of time.

SLIDE 10

The mean field game model

The cost to be minimized by every agent is

\[
J(u; t, \theta, w_1, w_2) = \int_t^T \left( \frac{u(s)^2}{2} + F^{w_1(s), w_2(s)}(M(s)) \right) ds + c_1 w_1(T) + c_2 w_2(T) + c_3 Q(T),
\]

where M(s) is the actual distribution of the agents. It is defined as

\[
M = (m_{1,1}, m_{0,1}, m_{1,0}, m_{0,0}) : B \times [0, T] \to [0, +\infty[, \qquad (\theta, w_1, w_2, t) \mapsto m_{w_1,w_2}(\theta, t),
\]

and, by conservation of mass, it satisfies

\[
\int_B dM(t) = \int_0^t g(s)\, ds, \qquad t \in [0, T].
\]

SLIDE 11

The mean field game model

Hypotheses:

  • (H1) g : [0, T] → [0, +∞[ is a Lipschitz continuous function;
  • (H2) the m_{w1,w2} are continuous functions of time into the set of Borel measures on the corresponding branch B_{w1,w2}, and M(0) = 0;
  • (H3) t ↦ F^{w1,w2}(M(t)) is continuous and bounded for all (w1, w2) ∈ {0, 1}²;
  • (H4) F^{w1,w2} does not depend explicitly on the state variable θ.

Consequence: (H1), (H3) and (H4) imply that the control choice made by agents at the states (θS, 1, 1), (θ1, 0, 1), (θ2, 1, 0), (θ1, 0, 0) and (θ2, 0, 0) (the significant states) does not change as long as the agent remains in the same branch, and is constant in time.

SLIDE 12

Exit time interpretation

Given M, on each of the first three branches we may interpret the optimal control problem as a finite horizon/exit time optimal control problem. The exit cost is given by the value function at the point we switch onto.

On the fourth branch B_{0,0} the problem is just a finite horizon problem with all data given.

SLIDE 13

HJB problem

[Figure: the HJB problem on the branches; labels V(·, 0, 0, ·), V(·, 1, 0, ·), V(·, 0, 1, ·).]

SLIDE 14

The transport equation

If they behave optimally, every excursionist moves with the optimal feedback u*(θ, t, w1, w2) = −V_θ(θ, t, w1, w2). Thanks to the simplicity of our model (the simple controlled dynamics, the independence of F^{w1,w2} from θ, the one-dimensionality, ...) the optimal feedback control has some good properties:

  • No excursionist turns back along their path while inside the same branch (that is not an optimal behavior).
  • Stopping is not an optimal behavior (apart from the case where we are at the station and stay there until T).
  • On arrival at a switching point, the best choice is to switch immediately.

N.B. These facts simplify the transport equation somewhat.

SLIDE 15

The transport equation

[Figure: the transport equations for m_{1,1}, m_{1,0}, m_{0,1}, m_{0,0}.]

Mean field equilibrium: M → V → u* = −V_θ → M

SLIDE 16

Different characterization

We now consider the following cost to be minimized:

\[
J(u; t, \theta, w_1, w_2) = \int_t^T \left( \frac{u(s)^2}{2} + F^{w_1(s), w_2(s)}(M(s)) \right) ds + c_1 w_1(T) + c_2 w_2(T) + c_3\, \xi_{\theta=\theta_S}(T),
\]

where c1, c2, c3 > 0 are fixed, and ξ_{θ=θS}(s) ∈ {0, 1} is equal to 0 if and only if θ(s) = θS.

SLIDE 17

Different characterization

An agent standing at (θi, 0, 0) at time t ∈ [0, T], with i ∈ {1, 2}, has two possible choices: either staying at θi indefinitely, or moving so as to reach θS exactly at time T. The controls among which the agent chooses are then, respectively,

\[
u^{0,0}_0(t) \equiv 0, \qquad u^{0,0}_1(t) = \pm\frac{\theta_S - \theta_1}{T - t}, \qquad u^{0,0}_2(t) = \pm\frac{\theta_S - \theta_2}{T - t}.
\]

Hence, given the cost functional, we derive

\[
V(\theta_i, t, 0, 0) = \min\left\{ c_3,\ \frac{1}{2}\frac{(\theta_S - \theta_i)^2}{T - t} \right\} + \int_t^T F^{0,0}\, ds.
\]
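The second term in the minimum is the running cost of the constant-speed control: with |u| = |θS − θi| / (T − t) held over a time span of T − t, the integral of u²/2 equals (θS − θi)² / (2(T − t)). A minimal Python sketch of this closed formula follows; the function name `value_00` and the choice of passing the integral of F^{0,0} as a precomputed number `F00_integral` are illustrative assumptions, and t < T is required.

```python
def value_00(theta_i, t, T, theta_s, c3, F00_integral):
    """V(theta_i, t, 0, 0) = min{c3, (theta_S - theta_i)^2 / (2 (T - t))} + int_t^T F^{0,0} ds.

    F00_integral is the (precomputed) integral of F^{0,0} over [t, T];
    passing it as a number is a simplification made for this sketch.
    """
    if t >= T:
        raise ValueError("this sketch requires t < T")
    travel = 0.5 * (theta_s - theta_i) ** 2 / (T - t)  # cost of reaching the station at time T
    return min(c3, travel) + F00_integral              # c3 is paid when staying away from the station


# Usage: with a cheap walk back (0.25 < c3 = 1.0) the agent returns to the station.
v = value_00(theta_i=1.0, t=0.0, T=2.0, theta_s=0.0, c3=1.0, F00_integral=0.0)
```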

SLIDE 18

Different characterization

At (θ1, 0, 1) at time t the possible choices for a control are, respectively,

\[
u^{0,1}_0(t) \equiv 0, \qquad u^{0,1}_1(t) = \pm\frac{\theta_S - \theta_1}{T - t}, \qquad u^{0,1}_2(t) = \pm\frac{\theta_2 - \theta_1}{\tau - t},
\]

and the value function is

\[
V(\theta_1, t, 0, 1) = \min\left\{ c_2 + c_3 + \int_t^T F^{0,1}\, ds,\ \ c_2 + \frac{1}{2}\frac{(\theta_S - \theta_1)^2}{T - t} + \int_t^T F^{0,1}\, ds,\ \ \inf_{\tau \in ]t, T]} \left\{ \frac{1}{2}\frac{(\theta_2 - \theta_1)^2}{\tau - t} + \int_t^\tau F^{0,1}\, ds + V(\theta_2, \tau, 0, 0) \right\} \right\}.
\]

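The three-way minimum for V(θ1, t, 0, 1) can be approximated numerically. The sketch below is only an illustration under strong assumptions that are not in the slides: the congestion costs F^{0,1} and F^{0,0} are taken constant in time (parameters `F01`, `F00`), and the infimum over τ is replaced by a search over a uniform grid of `n` points in ]t, T]. All function and parameter names are hypothetical.

```python
import math


def value_00(theta_i, t, T, theta_s, c3, F00):
    """V(theta_i, t, 0, 0) for a constant congestion cost F00 (assumption of this sketch)."""
    remaining = T - t
    if remaining > 0:
        travel = 0.5 * (theta_s - theta_i) ** 2 / remaining
    else:
        # No time left: the station is reachable only if the agent is already there.
        travel = 0.0 if theta_i == theta_s else math.inf
    return min(c3, travel) + F00 * max(remaining, 0.0)


def value_01(theta1, theta2, theta_s, t, T, c2, c3, F01, F00, n=2000):
    """Approximate V(theta1, t, 0, 1) as the minimum of three strategies (requires t < T)."""
    stay = c2 + c3 + F01 * (T - t)                                        # never move
    to_station = c2 + 0.5 * (theta_s - theta1) ** 2 / (T - t) + F01 * (T - t)  # skip P2
    switch = math.inf
    for k in range(1, n + 1):                                             # grid on ]t, T]
        tau = t + (T - t) * k / n
        cost = (0.5 * (theta2 - theta1) ** 2 / (tau - t)
                + F01 * (tau - t)
                + value_00(theta2, tau, T, theta_s, c3, F00))
        switch = min(switch, cost)
    return min(stay, to_station, switch)


# Usage with illustrative numbers: visiting P2 beats both alternatives.
v = value_01(theta1=1.0, theta2=2.0, theta_s=0.0, t=0.0, T=4.0,
             c2=5.0, c3=1.0, F01=0.0, F00=0.0)
```

A grid search only bounds the infimum from above; a finer grid or a one-dimensional minimizer would be needed for accuracy when the congestion costs vary in time.
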
SLIDE 19

Different characterization

Similarly, at (θ2, 1, 0) the control is chosen among

\[
u^{1,0}_0(t) \equiv 0, \qquad u^{1,0}_1(t) = \pm\frac{\theta_S - \theta_2}{T - t}, \qquad u^{1,0}_2(t) = \pm\frac{\theta_2 - \theta_1}{\tau - t},
\]

and the value function is

\[
V(\theta_2, t, 1, 0) = \min\left\{ c_1 + c_3 + \int_t^T F^{1,0}\, ds,\ \ c_1 + \frac{1}{2}\frac{(\theta_S - \theta_2)^2}{T - t} + \int_t^T F^{1,0}\, ds,\ \ \inf_{\tau \in ]t, T]} \left\{ \frac{1}{2}\frac{(\theta_2 - \theta_1)^2}{\tau - t} + \int_t^\tau F^{1,0}\, ds + V(\theta_1, \tau, 0, 0) \right\} \right\}.
\]

SLIDE 20

Masses and Flows of agents

Consider the arrival flows at the significant states: the given external arrival flow g, and the four flows g_{0,1} at (θ1, 0, 1), g_{1,0} at (θ2, 1, 0), g_{1,2} at (θ2, 0, 0), and g_{2,1} at (θ1, 0, 0).

SLIDE 21

Masses and Flows of agents

Such flow functions are time densities entering the switching states. When the optimal controls are chosen, they generate spatial densities along the branches B_{w1,w2}, and such spatial densities are transformed once again into time densities at the subsequent switching point.

This procedure requires a fine decomposition of the transport of measures into its time and space components. For this approach, see Camilli, De Maio, Tosin: Transport of measures on networks, 2017.

SLIDE 22

Masses and Flows of agents

Denoting by ρ_{w1,w2}(t) the actual total mass of agents on the branch B_{w1,w2}, we have

\[
\begin{aligned}
\rho_{1,1}(t) &= \int_0^t g(\tau)\, d\tau - \int_0^t g_{0,1}(\tau)\, d\tau - \int_0^t g_{1,0}(\tau)\, d\tau, \\
\rho_{0,1}(t) &= \int_0^t g_{0,1}(\tau)\, d\tau - \int_0^t g_{1,2}(\tau)\, d\tau, \\
\rho_{1,0}(t) &= \int_0^t g_{1,0}(\tau)\, d\tau - \int_0^t g_{2,1}(\tau)\, d\tau, \\
\rho_{0,0}(t) &= \int_0^t g_{1,2}(\tau)\, d\tau + \int_0^t g_{2,1}(\tau)\, d\tau.
\end{aligned}
\]
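The four mass formulas can be evaluated from sampled flows by running integrals. The following Python sketch assumes the flows are given as lists of samples on a common time grid and uses the trapezoidal rule; the function names and the sample data are illustrative, not taken from the slides.

```python
def cumulative(flow, times):
    """Running trapezoidal integral of a sampled flow, starting from 0 at times[0]."""
    out = [0.0]
    for k in range(1, len(times)):
        out.append(out[-1] + 0.5 * (flow[k] + flow[k - 1]) * (times[k] - times[k - 1]))
    return out


def branch_masses(times, g, g01, g10, g12, g21):
    """Return (rho11, rho01, rho10, rho00) sampled on the grid `times`."""
    G, G01, G10, G12, G21 = (cumulative(f, times) for f in (g, g01, g10, g12, g21))
    rho11 = [a - b - c for a, b, c in zip(G, G01, G10)]  # arrived, not yet at P1 or P2
    rho01 = [a - b for a, b in zip(G01, G12)]            # visited P1, heading on
    rho10 = [a - b for a, b in zip(G10, G21)]            # visited P2, heading on
    rho00 = [a + b for a, b in zip(G12, G21)]            # visited both
    return rho11, rho01, rho10, rho00


# Usage with made-up sample flows on a coarse grid.
times = [0.0, 0.5, 1.0, 1.5, 2.0]
g = [1.0, 1.0, 1.0, 1.0, 1.0]
g01 = [0.0, 0.2, 0.4, 0.4, 0.4]
g10 = [0.0, 0.1, 0.3, 0.3, 0.3]
g12 = [0.0, 0.0, 0.1, 0.2, 0.2]
g21 = [0.0, 0.0, 0.0, 0.1, 0.2]
rho11, rho01, rho10, rho00 = branch_masses(times, g, g01, g10, g12, g21)
```

Summing the four masses at any grid time returns the integral of g up to that time, which is the conservation of mass stated for M.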

SLIDE 23

Masses and Flows of agents

We denote by ρ = (ρ_{1,1}, ρ_{0,1}, ρ_{1,0}, ρ_{0,0}) the vector of the masses in the different branches.

SLIDE 24

Masses and Flows of agents

(H3′) The functions F^{w1,w2} : [0, K] → [0, +∞[ are Lipschitz continuous, and

\[
F^{w_1,w_2}(M(t)) = F^{w_1,w_2}(\rho_{w_1,w_2}(t)), \qquad \forall t \in [0, T],\ \forall (w_1, w_2) \in \{0, 1\}^2.
\]

SLIDE 25

Existence of equilibrium: fixed point

Let ρ = (ρ_{1,1}, ρ_{0,1}, ρ_{1,0}, ρ_{0,0}) ∈ X, where X is a suitable space of bounded and continuous functions of time.

ρ → u → g_{ij} → ρ̄. Is ρ = ρ̄?

SLIDE 26

Existence of equilibrium: fixed point

Define ψ : X → X such that ψ(ρ) = ρ̄.

SLIDE 27

Existence of equilibrium: fixed point

Actually ψ is a multifunction, since the optimal control may not be unique: ψ(ρ) ⊂ X.

SLIDE 28

Existence of equilibrium: fixed point

Hence we need to apply the Kakutani fixed point theorem: if ψ has closed graph and compact, convex images, then it has a fixed point, i.e. ρ̄ ∈ ψ(ρ̄).

SLIDE 29

Existence of equilibrium: fixed point

We obtain ψ and its fixed point ρ̄ through a limiting procedure on a family {ψ_ε}_{ε>0} of functions approximating ψ, with corresponding fixed points ρ_ε. Each function ψ_ε is obtained through the MFG paradigm, with the difference that one now chooses ε-optimal controls and, along time, an ε-optimal stream.

ε-optimal stream. Assume the branch B_{w1,w2} is entered at the state (θ̂, w1, w2), with θ̂ ∈ {θS, θ1, θ2}, and let u^{w1,w2}_i, i ∈ {1, 2, 3}, be the controls defined through the characterization of the value functions. Consider also a partition τ^{w1,w2} = {t_n}_n of the interval [0, T], and fix ε > 0. Then u^{w1,w2}_ε is an ε-optimal stream for B_{w1,w2} associated with the partition τ^{w1,w2} if

\[
u^{w_1,w_2}_\varepsilon(s) = u^{w_1,w_2}_{i_n}(s), \qquad s \in [t_n, t_{n+1}[,
\]

where u^{w1,w2}_{i_n} is optimal at t_n and ε-optimal at all s ∈ ]t_n, t_{n+1}[, that is, it realizes the minimum cost up to an error not greater than ε.

SLIDE 30

Existence of equilibrium: fixed point

For a fixed ε > 0, consider τ_ε and

\[
u_\varepsilon = \left( u^{1,1}_\varepsilon,\ u^{0,1}_\varepsilon,\ u^{1,0}_\varepsilon,\ u^{1;0,0}_\varepsilon,\ u^{2;0,0}_\varepsilon \right),
\]

whose components are ε-optimal streams associated with τ_ε. To allow agents to split into fractions among different vectors u_ε, we introduce the ε-split function.

An ε-split function is a vector λ_ε ∈ L^∞(0, T)^{13}, with coordinates

\[
\left( \lambda^{(\theta_S,1,1)}_1, \lambda^{(\theta_S,1,1)}_2, \lambda^{(\theta_S,1,1)}_3,\ \lambda^{(\theta_1,0,1)}_1, \lambda^{(\theta_1,0,1)}_2, \lambda^{(\theta_1,0,1)}_3,\ \lambda^{(\theta_2,1,0)}_1, \lambda^{(\theta_2,1,0)}_2, \lambda^{(\theta_2,1,0)}_3,\ \lambda^{(\theta_1,0,0)}_1, \lambda^{(\theta_1,0,0)}_2,\ \lambda^{(\theta_2,0,0)}_1, \lambda^{(\theta_2,0,0)}_2 \right),
\]

constant on the subintervals induced by the partition τ_ε and such that the split fractions λ^{(θ̂,w1,w2)}_i satisfy:

  (i) λ^{(θ̂,w1,w2)}_i(s) ≥ 0 for all s ∈ [0, T], and λ^{(θ̂,w1,w2)}_i(t) = 0 if u^{w1,w2}_i is not optimal at (θ̂, w1, w2, t), for t ∈ τ_ε;
  (ii) Σ_{i=1}^{3} λ^{(θ̂,w1,w2)}_i(s) = 1 for all s ∈ [0, T].
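Conditions (i) and (ii) are easy to check mechanically for one entry state. The Python sketch below assumes a hypothetical data layout that is not in the slides: `fractions[n]` holds the split fractions on the subinterval [t_n, t_{n+1}[ (they are constant there), and `optimal[n][i]` says whether the i-th control is optimal at t_n.

```python
def is_valid_split(fractions, optimal, tol=1e-12):
    """Check conditions (i) and (ii) of an epsilon-split function for one entry state.

    fractions[n][i]: fraction of agents choosing control i on [t_n, t_{n+1}[.
    optimal[n][i]:   True if control i is optimal at the partition time t_n.
    Both containers are illustrative inputs for this sketch.
    """
    for lam, opt in zip(fractions, optimal):
        if any(x < 0 for x in lam):                 # (i) nonnegativity
            return False
        if any(x > 0 and not ok for x, ok in zip(lam, opt)):
            return False                            # (i) no mass on a non-optimal control
        if abs(sum(lam) - 1.0) > tol:               # (ii) fractions sum to one
            return False
    return True


# Usage: two subintervals, three candidate controls.
ok = is_valid_split([[0.5, 0.5, 0.0], [1.0, 0.0, 0.0]],
                    [[True, True, False], [True, False, False]])
```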

SLIDE 31

Existence of equilibrium: fixed point

We now have all the ingredients to build each ψ_ε and the corresponding fixed point ρ_ε. Hence we define an ε-mean field equilibrium as a total mass ρ_ε ∈ X such that ρ_ε ∈ ψ_ε(ρ_ε). Passing to the limit as ε → 0, we get that ψ_ε(ρ_ε) → ψ(ρ).

Theorem (Existence). Assume (H1), (H3′), (H4). Then there exists a mean field equilibrium, i.e., a total mass ρ ∈ X such that ρ ∈ ψ(ρ).

SLIDE 32

The optimal control of the mean field equilibrium

\[
F^{w_1,w_2}(\rho) = \alpha_{w_1,w_2}\, \rho_{w_1,w_2}(s) + \beta_{w_1,w_2}, \qquad \rho = (\rho_{1,1}, \rho_{0,1}, \rho_{1,0}, \rho_{0,0}) \in X.
\]

The coefficients (α_{w1,w2}, β_{w1,w2}) are chosen by the controller, aiming to force the equilibrium to be as close as possible (in the uniform topology) to a reference string ρ̄ ∈ X, i.e. to minimize

\[
\max_{w_1, w_2 \in \{0,1\}} \left\{ \max_{t \in [0,T]} \left| \rho_{w_1,w_2}(t) - \bar\rho_{w_1,w_2}(t) \right| \right\} = \| \rho - \bar\rho \|_X.
\]

Let us denote by χ_{α,β} the set of mean field game equilibria corresponding to the choice of parameters (α, β) ∈ K ⊂ R⁴ × R⁴. Then the optimization problem is

\[
\inf_{(\alpha,\beta) \in K}\ \inf_{\rho \in \chi_{\alpha,\beta}} \| \rho - \bar\rho \|_X. \tag{1}
\]
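On a time grid, both the affine congestion cost and the controller's objective reduce to a few lines. The Python sketch below is illustrative only: masses are stored in dictionaries keyed by the label (w1, w2), the names `congestion_cost` and `sup_distance` are hypothetical, and the maximum over [0, T] is approximated by the maximum over the grid samples.

```python
def congestion_cost(alpha, beta, rho_value):
    """Affine congestion cost F^{w1,w2} = alpha_{w1,w2} * rho_{w1,w2} + beta_{w1,w2}."""
    return alpha * rho_value + beta


def sup_distance(rho, rho_ref):
    """Grid approximation of the uniform distance between two mass vectors.

    rho and rho_ref map each label (w1, w2) to a list of samples on the same grid.
    """
    return max(
        abs(a - b)
        for branch in rho
        for a, b in zip(rho[branch], rho_ref[branch])
    )


# Usage with made-up samples at three grid times.
rho = {(1, 1): [0.0, 0.5, 0.2], (0, 1): [0.0, 0.1, 0.3],
       (1, 0): [0.0, 0.1, 0.2], (0, 0): [0.0, 0.0, 0.3]}
rho_ref = {(1, 1): [0.0, 0.4, 0.2], (0, 1): [0.0, 0.1, 0.1],
           (1, 0): [0.0, 0.1, 0.2], (0, 0): [0.0, 0.0, 0.3]}
distance = sup_distance(rho, rho_ref)
```

Solving problem (1) would additionally require computing the equilibria in χ_{α,β} for each candidate (α, β), which this sketch does not attempt.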

SLIDE 33

The optimal control of the mean field equilibrium

Theorem. If (H1), (H3′) and (H4) hold, then there exists an optimal pair (α, β) ∈ K that solves problem (1).

A variation of problem (1) is

\[
\inf_{(\alpha,\beta) \in K}\ \sup_{\rho \in \chi_{\alpha,\beta}} \| \rho - \bar\rho \|_X,
\]

where the controller tries to manage the worst-case scenario.

  • In this case the existence of an optimal pair is not evident.
  • We are not able to prove that (α, β) is optimal, due to the possible presence of multiple mean field equilibria.
  • The problem may be bypassed by assuming stronger hypotheses on the cost F.

SLIDE 34

Some references

  • L. Ambrosio, N. Gigli, G. Savaré. Gradient flows in metric spaces and in the space of probability measures, 2008.
  • F. Bagagiolo, R. Pesenti. Non-memoryless Pedestrian Flow in a Crowded Environment with Target Sets, 2017.
  • F. Bagagiolo, D. Bauso, R. M., M. Zoppello. Game theoretic decentralized feedback controls in Markov jump processes, 2017.
  • F. Bagagiolo, S. Faggian, R. M., R. Pesenti. Optimal control of the mean field equilibrium for a pedestrian tourists' flow model. Preprint.
  • F. Camilli, R. De Maio, A. Tosin. Transport of measures on networks, 2017.
  • P. Cardaliaguet. Notes on mean field games. Unpublished notes.

SLIDE 35

THANK YOU

rosario.maggistro@unive.it