Optimal control of the MFG equilibrium for a pedestrian tourists' flow model




1. Optimal control of the MFG equilibrium for a pedestrian tourists' flow model
R. Maggistro, F. Bagagiolo, S. Faggian, R. Pesenti (Department of Management)

2. Introduction
Main goal: to define a mean field game model with both continuous and switching decisional variables.
Motivations:
- to model and analytically study the flow of tourists (or, more precisely, daily pedestrian excursionists) along the narrow alleys of the historic center of a heritage city;
- to define an optimization problem for an external controller who aims to induce a suitable mean field equilibrium.

3. Some features of the model
- All excursionists have only two main attractions they want to visit: P1 and P2.
- The excursionists arrive at the train station during a fixed interval of time.
- They may decide to first visit attraction P1 and then attraction P2, or vice versa. This choice may, for example, depend on the crowdedness and the expected waiting time.
- They have to return to the station at a fixed time T.
- (Memory) Excursionists may occupy the same place on the path at the same instant, but with different purposes: someone has already visited P1, someone else only P2, someone both, someone else nothing. Hence they have "different past histories".
- During the day they split into several "populations" with different purposes, and may eventually merge back into the same population.

4. The model
We describe the path of the excursionists inside the city as a circular network containing three nodes:
- S: the train station
- P1: attraction 1
- P2: attraction 2
The position of an excursionist is given by the parameter θ ∈ [0, 2π], whose evolution is
$$\begin{cases} \theta'(s) = u(s), & s \in\, ]t, T], \\ \theta(t) = \theta, \end{cases}$$
to which we associate a time-varying label (w1, w2) ∈ {0, 1} × {0, 1}. For i ∈ {1, 2}, w_i(t) = 1 means that, at time t, the attraction P_i has not been visited yet, and w_i(t) = 0 that it has already been visited.
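As a rough illustration of this switching dynamics (not part of the model's analysis), the sketch below integrates θ'(s) = u(s) on the circle and flips the label components when the trajectory passes an attraction. The node positions THETA_S, THETA_1, THETA_2, the tolerance, and the helper simulate are all illustrative assumptions, not values from the model.

```python
import numpy as np

# Assumed node positions on the circle (illustrative values, not from the model):
# station S at 0, attractions P1 and P2 at 2*pi/3 and 4*pi/3.
THETA_S, THETA_1, THETA_2 = 0.0, 2 * np.pi / 3, 4 * np.pi / 3

def simulate(theta0, u, t, T, dt=1e-3, tol=1e-2):
    """Integrate theta'(s) = u(s) on [t, T] and update the label (w1, w2).

    (w1, w2) starts at (1, 1); a component switches to 0 when the trajectory
    passes within tol of the corresponding attraction.
    """
    theta, w1, w2 = theta0, 1, 1
    for s in np.arange(t, T, dt):
        theta = (theta + u(s) * dt) % (2 * np.pi)   # motion on the circular network
        if abs(theta - THETA_1) < tol:
            w1 = 0                                   # P1 has now been visited
        if abs(theta - THETA_2) < tol:
            w2 = 0                                   # P2 has now been visited
    return theta, w1, w2

# Example: constant angular speed 2*pi over the horizon [0, 1], starting at the station.
print(simulate(THETA_S, lambda s: 2 * np.pi, t=0.0, T=1.0))
```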

5. Need of Memory
The time-varying label (w1, w2) encodes the memory of the excursionists, that is, the information about which attractions they have already visited. In the presence of more than one target, the Dynamic Programming Principle no longer holds and hence, in general, we do not recover a Hamilton-Jacobi-Bellman equation.

6. Need of Memory (continued)
[Figure: example problem, visit three sites T1, T2, T3 in minimum time, with dynamics y'(t) = f(y(t), u(t)), y(0) = x; the trajectory y(t) drawn is optimal for the starting point x, but not for the intermediate point y(τ).]

7. Need of Memory (continued)
[Figure: restarting from the intermediate point y(τ), the next site to visit (T3?) may change, so the tail of the previous trajectory need not be optimal.]
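The failure of the memoryless Dynamic Programming Principle can also be seen in a toy computation. The sketch below uses assumed planar target positions and straight-line, unit-speed motion (my simplification, not the slides' dynamics): the value V(p), defined as the shortest length needed to visit all three sites from p while ignoring which sites were already visited, is not additive along the optimal trajectory.

```python
import itertools, math

# Toy illustration (assumed points, straight-line motion) of why memory is needed:
# the memoryless value V(p) = "shortest length to visit all three sites from p"
# is not additive along the optimal trajectory, so the plain DPP fails.
T1, T2, T3 = (1.0, 0.0), (1.0, 1.0), (0.0, 1.2)
x = (0.0, 0.0)

def tour_length(start, order):
    pts = (start,) + tuple(order)
    return sum(math.dist(a, b) for a, b in zip(pts, pts[1:]))

def V(p):
    # best order to visit all three sites from p, ignoring which are already visited
    return min(tour_length(p, order) for order in itertools.permutations((T1, T2, T3)))

best = min(itertools.permutations((T1, T2, T3)), key=lambda o: tour_length(x, o))
first, second = best[0], best[1]
y_tau = ((first[0] + second[0]) / 2, (first[1] + second[1]) / 2)  # point reached after visiting 'first'
cost_to_tau = math.dist(x, first) + math.dist(first, y_tau)

print("V(x)                       =", V(x))                    # optimal cost from x
print("cost to y(tau) + V(y(tau)) =", cost_to_tau + V(y_tau))  # strictly larger: V without
                                                               # memory would revisit 'first'
```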

8. State space
The state of an agent is then (θ, w1, w2) and we denote by B = [0, 2π] × {0, 1} × {0, 1} the state space of the variables (θ, w1, w2).
[Figure: the evolution inside the network.]

9. State space (continued)
We call (circle) branch any B_{w1,w2} ⊂ B which includes the states (θ, w1, w2) with (w1, w2) fixed and θ varying in [0, 2π]. Such branches correspond to the edges of the switching network, where g : [0, T] → [0, +∞[ is the exogenous arrival flow at the station, representing, roughly speaking, the density of arriving tourists per unit of time.

10. The mean field game model
The cost to be minimized by every agent is
$$J(u; t, \theta, w_1, w_2) = \int_t^T \Bigl[ \frac{u(s)^2}{2} + F_{w_1(s), w_2(s)}(M(s)) \Bigr] ds + c_1 w_1(T) + c_2 w_2(T) + c_3 Q(T),$$
where M(s) is the actual distribution of the agents. It is defined as
$$M = (m_{1,1}, m_{0,1}, m_{1,0}, m_{0,0}) \colon B \times [0, T] \to [0, +\infty[, \qquad (\theta, w_1, w_2, t) \mapsto m_{w_1, w_2}(\theta, t),$$
and, by the conservation of mass principle, it satisfies
$$\int_B dM(t) = \int_0^t g(s)\, ds, \qquad t \in [0, T].$$
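For concreteness, the following sketch evaluates this cost numerically for a sampled control path, under the simplifying assumption (mine, not the model's) that the congestion term F_{w1(s),w2(s)}(M(s)) has already been evaluated along the path as a function F_path(s); the terminal quantity Q(T) is passed in as a number. All names and values are illustrative.

```python
import numpy as np

# Minimal numerical evaluation of J(u; t, theta, w1, w2).  F_path(s) stands for
# the congestion cost F_{w1(s), w2(s)}(M(s)) already evaluated along the path
# (an assumption made here to keep the sketch self-contained).
def cost_J(u_path, F_path, w1_T, w2_T, Q_T, t, T, c1, c2, c3, n=1000):
    s = np.linspace(t, T, n)
    running = np.trapz(u_path(s) ** 2 / 2 + F_path(s), s)    # integral of u^2/2 + F
    return running + c1 * w1_T + c2 * w2_T + c3 * Q_T         # terminal penalties

# Example with hypothetical values: constant speed 2, constant congestion 0.2,
# both attractions visited by time T (w1(T) = w2(T) = 0), terminal term Q(T) = 0.
print(cost_J(lambda s: np.full_like(s, 2.0), lambda s: np.full_like(s, 0.2),
             w1_T=0, w2_T=0, Q_T=0.0, t=0.0, T=1.0, c1=1.0, c2=1.0, c3=1.0))
```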

11. The mean field game model: hypotheses
(H1) g : [0, T] → [0, +∞[ is a Lipschitz continuous function;
(H2) the m_{w1,w2} are continuous functions of time into the set of Borel measures on the corresponding branch B_{w1,w2}, and M(0) = 0;
(H3) t ↦ F_{w1,w2}(M(t)) is continuous and bounded for all (w1, w2) ∈ {0, 1}²;
(H4) F_{w1,w2} does not depend explicitly on the state variable θ.
Consequence: (H1), (H3) and (H4) imply that the control choice made by agents at the states (θ_S, 1, 1), (θ_1, 0, 1), (θ_2, 1, 0), (θ_1, 0, 0) and (θ_2, 0, 0) (the significant states) does not change as long as the agent remains in the same branch, and is constant in time.

12. Exit time interpretation
Given M, on each of the first three branches we may interpret the optimal control problem as a finite horizon/exit time optimal control problem. The exit cost is given by the value function at the point where the agent switches branch. On the fourth branch B_{0,0} the problem is just a finite horizon problem with all data given.

13. HJB problem
[Figure: the Hamilton-Jacobi-Bellman problems on the branches, involving the value functions V(·, 0, 1, ·), V(·, 1, 0, ·) and V(·, 0, 0, ·).]

14. The transport equation
If they behave optimally, excursionists move with the optimal feedback
$$u^*(\theta, t, w_1, w_2) = -V_\theta(\theta, t, w_1, w_2).$$
Due to the simplicity of our model (the simple controlled dynamics, the non-dependence of F_{w1,w2} on θ, the one-dimensionality, ...), the optimal feedback control has some good properties:
- no excursionist turns back on its path inside a branch (that is not optimal behavior);
- stopping is not optimal (except when an agent is at the station and stays there until T);
- once an agent arrives at a switching point, the best choice is to switch immediately.
N.B. These facts somewhat simplify the transport equation.

15. The transport equation (continued)
[Figure: the transport equations for the four densities m_{1,1}, m_{1,0}, m_{0,1}, m_{0,0}.]
Mean field equilibrium: a distribution M that is a fixed point of the map M → V → u* = −V_θ → M.
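Schematically, such an equilibrium can be searched for as a fixed point of this map. The sketch below is only an illustrative skeleton under simplifying assumptions (densities discretised as a numpy array, user-supplied best_response and transport solvers, a damped update); it is not the authors' scheme or proof technique.

```python
import numpy as np

def mfg_equilibrium(M0, best_response, transport, damping=0.5, tol=1e-6, max_iter=200):
    """Damped fixed-point iteration for the loop M -> V -> u* = -V_theta -> M.

    best_response(M) is a placeholder for 'solve the HJB problems given M and
    return the feedback u*'; transport(u_star) for 'push the densities
    m_{w1,w2} forward with u* and return the new distribution'.
    """
    M = np.asarray(M0, dtype=float)
    for _ in range(max_iter):
        u_star = best_response(M)                  # u* = -V_theta, branch by branch
        M_new = np.asarray(transport(u_star))      # solve the transport equations
        if np.max(np.abs(M_new - M)) < tol:        # fixed point reached: an MFG equilibrium
            return M_new
        M = (1 - damping) * M + damping * M_new    # damped update to help convergence
    return M
```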

16. Different characterization
We now consider the following cost to be minimized:
$$J(u; t, \theta, w_1, w_2) = \int_t^T \Bigl[ \frac{u(s)^2}{2} + F_{w_1(s), w_2(s)}(M(s)) \Bigr] ds + c_1 w_1(T) + c_2 w_2(T) + c_3\, \xi_{\theta = \theta_S}(T),$$
where c_1, c_2, c_3 > 0 are fixed and ξ_{θ=θ_S}(s) ∈ {0, 1} is equal to 0 if and only if θ(s) = θ_S.

17. Different characterization (continued)
An agent standing at (θ_i, 0, 0) at time t ∈ [0, T], with i ∈ {1, 2}, has two possible choices: either staying at θ_i indefinitely, or moving so as to reach θ_S exactly at time T. The controls among which the agent chooses are then, respectively,
$$u_0^{0,0}(s) \equiv 0, \qquad u_1^{0,0}(s) = \pm\frac{\theta_S - \theta_1}{T - t}, \qquad u_2^{0,0}(s) = \pm\frac{\theta_S - \theta_2}{T - t}.$$
Hence, given the cost functional, we derive
$$V(\theta_i, t, 0, 0) = \min\Bigl\{ c_3,\ \frac{(\theta_S - \theta_i)^2}{2(T - t)} \Bigr\} + \int_t^T F_{0,0}(M(s))\, ds.$$
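Under the additional simplifying assumption (mine, not the slides') that F_{0,0}(M(s)) equals a constant F00 on [t, T], the integral reduces to F00·(T − t) and the formula can be evaluated directly. The helper V_00 and all numerical values below are illustrative.

```python
import numpy as np

def V_00(theta_i, t, theta_S, T, c3, F00):
    """V(theta_i, t, 0, 0) with F_{0,0} taken constant equal to F00 on [t, T]."""
    if T - t <= 0:                        # no time left: pay c3 unless already at the station
        return 0.0 if theta_i == theta_S else c3
    move_cost = (theta_S - theta_i) ** 2 / (2 * (T - t))   # go to the station, arriving at T
    return min(c3, move_cost) + F00 * (T - t)              # versus staying put and paying c3

# Example with hypothetical values: station at 0, attraction at 2*pi/3.
print(V_00(theta_i=2 * np.pi / 3, t=0.0, theta_S=0.0, T=1.0, c3=1.0, F00=0.2))
```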

18. Different characterization (continued)
At (θ_1, 0, 1) at time t, the possible choices for a control are, respectively,
$$u_0^{0,1}(s) \equiv 0, \qquad u_1^{0,1}(s) = \pm\frac{\theta_S - \theta_1}{T - t}, \qquad u_2^{0,1}(s) = \pm\frac{\theta_2 - \theta_1}{\tau - t},$$
and correspondingly
$$V(\theta_1, t, 0, 1) = \min\Bigl\{ c_2 + c_3 + \int_t^T F_{0,1}(M(s))\, ds,\ \ c_2 + \frac{(\theta_S - \theta_1)^2}{2(T - t)} + \int_t^T F_{0,1}(M(s))\, ds,\ \ \inf_{\tau \in\, ]t, T]} \Bigl[ \frac{(\theta_2 - \theta_1)^2}{2(\tau - t)} + \int_t^\tau F_{0,1}(M(s))\, ds + V(\theta_2, \tau, 0, 0) \Bigr] \Bigr\}.$$
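Continuing with the same constant-congestion assumption (F_{0,1} ≡ F01 and F_{0,0} ≡ F00, both illustrative), the infimum over τ can be approximated on a grid. The helper V_01 below simply compares the three options in the formula; it is a brute-force sketch, not the slides' method.

```python
import numpy as np

def V_00(theta_i, t, theta_S, T, c3, F00):
    # same helper as in the previous sketch (constant F_{0,0} = F00 on [t, T])
    if T - t <= 0:
        return 0.0 if theta_i == theta_S else c3
    return min(c3, (theta_S - theta_i) ** 2 / (2 * (T - t))) + F00 * (T - t)

def V_01(theta_1, t, theta_S, theta_2, T, c2, c3, F01, F00, n_tau=2000):
    """V(theta_1, t, 0, 1) with constant congestion, minimising over tau on a grid."""
    stay = c2 + c3 + F01 * (T - t)                                           # never leave theta_1
    go_home = c2 + (theta_S - theta_1) ** 2 / (2 * (T - t)) + F01 * (T - t)  # skip P2, reach S at T
    taus = np.linspace(t, T, n_tau + 1)[1:]                                  # grid for tau in ]t, T]
    via_P2 = np.min(                                                         # reach P2 at tau, then act optimally
        (theta_2 - theta_1) ** 2 / (2 * (taus - t)) + F01 * (taus - t)
        + np.array([V_00(theta_2, tau, theta_S, T, c3, F00) for tau in taus])
    )
    return min(stay, go_home, via_P2)

# Hypothetical values: station at 0, attractions at 2*pi/3 and 4*pi/3.
print(V_01(theta_1=2 * np.pi / 3, t=0.0, theta_S=0.0, theta_2=4 * np.pi / 3, T=1.0,
           c2=1.0, c3=1.0, F01=0.2, F00=0.2))
```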
