deterministic mean field games

DETERMINISTIC MEAN FIELD GAMES Italo Capuzzo Dolcetta Sapienza - PowerPoint PPT Presentation


  1. DETERMINISTIC MEAN FIELD GAMES. Italo Capuzzo Dolcetta, Sapienza Università di Roma and GNAMPA - Istituto di Alta Matematica

  2. A classical optimization problem. Given a time interval $[0,T]$, consider the classical Bolza-type problem
     $$\inf \left\{ \int_t^T \left[ \tfrac12 |\dot X_s|^2 + L(X_s) \right] ds + G(X_T) \right\} \tag{1}$$
     where $X := X^{t,x}$ is any curve in the Sobolev space $W^{1,2}([t,T];\mathbb{R}^d)$ such that $X_t = x \in \mathbb{R}^d$, for $t \in [0,T]$. It is well known that if $L : \mathbb{R}^d \to \mathbb{R}$ and $G : \mathbb{R}^d \to \mathbb{R}$ are continuous and bounded, then the value function of problem (1), i.e.
     $$u(t,x) = \inf \left\{ \int_t^T \left[ \tfrac12 |\dot X_s|^2 + L(X_s) \right] ds + G(X_T) \;;\; X \in W^{1,2}([t,T];\mathbb{R}^d),\ X_t = x \right\}$$
     is the unique bounded continuous viscosity solution of

  3. the backward Cauchy problem of Hamilton-Jacobi (HJ) type
     $$\begin{cases} -\partial_t u(t,x) + \tfrac12 |\nabla_x u(t,x)|^2 = L(x) & \text{in } (0,T) \times \mathbb{R}^d, \\ u(T,x) = G(x) & \text{in } \mathbb{R}^d. \end{cases} \tag{2}$$
     The proof that $u$ solves (2) in the viscosity sense is a simple consequence of the following identity, the Dynamic Programming Principle:
     $$u(t,x) = \inf \left\{ u(s, X^{t,x}(s)) + \int_t^s \left[ \tfrac12 |\dot X_\tau|^2 + L(X_\tau) \right] d\tau \;;\; X \in W^{1,2}([t,T];\mathbb{R}^d),\ X_t = x \right\}$$
     valid for any given $(t,x) \in (0,T) \times \mathbb{R}^d$ and any $s \in [t,T]$. Uniqueness of the solution is a nontrivial, fundamental result in viscosity solutions theory (Lions 1982).
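As a concrete check of the value function, consider the special case $L \equiv 0$. Minimizing curves are then straight lines, and the infimum reduces to the Hopf-Lax formula $u(t,x) = \min_y \{ |y-x|^2 / (2(T-t)) + G(y) \}$. The sketch below evaluates this minimum on a grid in dimension one. The quadratic terminal cost $G(y) = y^2/2$, the grid, and all function names are illustrative assumptions, not taken from the slides; for this choice a direct computation gives $u(t,x) = x^2 / (2(1+T-t))$.

```python
import numpy as np

def hopf_lax(G, t, x, T, y_grid):
    """Value u(t, x) when L = 0: only the endpoint y of a straight line matters."""
    if t >= T:
        return float(G(x))
    # Kinetic cost of moving from x to y at constant speed over [t, T].
    kinetic = (y_grid - x) ** 2 / (2.0 * (T - t))
    return float(np.min(kinetic + G(y_grid)))

def G_quadratic(y):
    # Illustrative terminal cost (an assumption, not from the slides).
    return 0.5 * y ** 2

def u_exact(t, x, T):
    # Closed form for G(y) = y^2 / 2 and L = 0.
    return x ** 2 / (2.0 * (1.0 + T - t))

T = 1.0
y_grid = np.linspace(-5.0, 5.0, 20001)
u_num = hopf_lax(G_quadratic, 0.0, 1.0, T, y_grid)
u_ref = u_exact(0.0, 1.0, T)
```

Here `u_num` and `u_ref` should agree up to the grid resolution, and one can check by hand that `u_exact` satisfies $-\partial_t u + \tfrac12 |\partial_x u|^2 = 0$ with $u(T,x) = x^2/2$.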

  4. As for optimal curves, it is easy to check that $X^{t,x}$ is optimal for the initial setting $(t,x)$ if and only if
     $$u(t,x) = u(s, X^{t,x}(s)) + \int_t^s \left[ \tfrac12 |\dot X^{t,x}(\tau)|^2 + L(X^{t,x}(\tau)) \right] d\tau \quad \text{for all } s \in [t,T].$$
     Moreover, if $u$ is smooth enough, the velocity field of the optimal paths is minus the spatial gradient of the solution of the HJ equation. More precisely,

  5. A Verification Lemma. Lemma: Let $X^*$ be such that
     $$\dot X^*(s) = -\nabla_x u(s, X^*(s)) \ \text{ for } s \in [t,T], \qquad X^*(t) = x.$$
     Then,
     $$\int_t^T \left[ \tfrac12 |\dot X^*(s)|^2 + L(X^*(s)) \right] ds + G(X^*(T)) = \inf \left\{ \int_t^T \left[ \tfrac12 |\dot X_s|^2 + L(X_s) \right] ds + G(X_T) \right\}.$$

  6. The verification result above requires $u$ to be $C^1$ with respect to $x$. This turns out to be true in the present model problem under $C^2$ smoothness assumptions on $L$, $G$. The proof of $C^1$ regularity of $u$ is in 3 steps:
     step 1: $u$ is globally Lipschitz w.r.t. $(t,x)$;
     step 2: $u$ is semiconcave w.r.t. $x$, i.e. $x \mapsto u(t,x) - \tfrac12 C_t |x|^2$ is concave for some positive constant $C_t$;
     step 3: the upper semidifferential
     $$D_x^+ u(t,x) = \left\{ p \in \mathbb{R}^d : \limsup_{y \to x} \frac{u(t,y) - u(t,x) - p \cdot (y-x)}{|y-x|} \le 0 \right\}$$
     is a singleton at each $(t,x)$.
     An alternative way to obtain optimal feedbacks for general control problems when no smoothness is available is via semi-discretization (comments on this issue later on).
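To see why step 3 is the decisive one, it may help to compute the upper semidifferential in a one-dimensional example that is not from the slides. The function $v(x) = -|x|$ is concave, hence semiconcave, yet at the origin

$$D^+ v(0) = \left\{ p \in \mathbb{R} : \limsup_{y \to 0} \frac{-|y| - p\,y}{|y|} \le 0 \right\} = [-1, 1],$$

since the quotient equals $-1 - p$ for $y > 0$ and $-1 + p$ for $y < 0$. The set is not a singleton and $v$ is not differentiable at $0$. By contrast, for $w(x) = -x^2$ one finds $D^+ w(x) = \{-2x\}$ at every point. This reflects the general fact that a semiconcave function is differentiable at a point exactly when its upper semidifferential there reduces to a single element.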

  7. Proof of Verification Lemma: for any admissible curve $X$ with $X_t = x$,
     $$\begin{aligned} u(T, X_T) &= u(t, X_t) + \int_t^T \left[ \partial_s u(s, X_s) + \dot X_s \cdot \nabla_x u(s, X_s) \right] ds \\ &= u(t, X_t) + \int_t^T \left[ \tfrac12 |\nabla_x u(s, X_s)|^2 + \dot X_s \cdot \nabla_x u(s, X_s) - L(X_s) \right] ds && \text{[by HJ]} \\ &\ge u(t, X_t) + \int_t^T \left[ -\tfrac12 |\dot X_s|^2 - L(X_s) \right] ds && \text{[by convexity of } p \mapsto \tfrac12 |p|^2 \text{]} \end{aligned}$$
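The convexity step in the proof can be spelled out as a completed square. Writing $p = \nabla_x u(s, X_s)$ and $q = \dot X_s$ (shorthand introduced here only for this remark),

$$\tfrac12 |p|^2 + q \cdot p + \tfrac12 |q|^2 = \tfrac12 |p + q|^2 \ge 0 \quad \Longrightarrow \quad \tfrac12 |p|^2 + q \cdot p \ge -\tfrac12 |q|^2,$$

with equality if and only if $q = -p$, that is, $\dot X_s = -\nabla_x u(s, X_s)$. This is exactly the feedback law appearing in the Verification Lemma.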

  8. Since $u(T, X_T) = G(X_T)$ and $u(t, X_t) = u(t,x)$, the above yields
     $$G(X_T) + \int_t^T \left[ \tfrac12 |\dot X_s|^2 + L(X_s) \right] ds \ge u(t,x).$$
     The same computation with the generic curve $X$ replaced by $X^*$ given by
     $$\dot X^*(s) = -\nabla_x u(s, X^*(s)) \ \text{ for } s \in [t,T], \qquad X^*(t) = x$$
     gives equality in the last step, so that
     $$u(t,x) = \inf \left\{ \int_t^T \left[ \tfrac12 |\dot X_s|^2 + L(X_s) \right] ds + G(X_T) \right\}.$$
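The verification argument can be tested numerically in a simple case. The sketch below assumes dimension one, $L \equiv 0$ and $G(x) = x^2/2$ (illustrative choices, not from the slides), for which $u(t,x) = x^2/(2(1+T-t))$ solves the HJ problem. It integrates the feedback equation $\dot X = -\partial_x u(s, X)$ with an explicit Euler scheme and accumulates the cost along the resulting curve; the function names are hypothetical.

```python
T = 1.0

def u(t, x):
    # Closed-form value function for L = 0, G(x) = x^2 / 2 (assumed example).
    return x ** 2 / (2.0 * (1.0 + T - t))

def grad_u(t, x):
    return x / (1.0 + T - t)

def cost_along_feedback(t0, x0, n_steps=20000):
    """Explicit Euler for X' = -grad_u(s, X); returns kinetic cost plus G(X_T)."""
    ds = (T - t0) / n_steps
    x, s, running = x0, t0, 0.0
    for _ in range(n_steps):
        v = -grad_u(s, x)            # feedback velocity
        running += 0.5 * v ** 2 * ds # running cost with L = 0
        x += v * ds
        s += ds
    return running + 0.5 * x ** 2    # add terminal cost G(X_T)

cost_opt = cost_along_feedback(0.0, 1.0)
# Comparison curve that never moves: zero kinetic cost, terminal cost G(x0).
cost_rest = 0.5 * 1.0 ** 2
```

The cost along the feedback curve should match `u(0.0, 1.0)`, while the curve at rest is admissible but strictly more expensive, in line with the inequality proved above.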

  9. A deterministic mean field game problem. An interesting new class of optimal control problems has recently attracted attention after the 2006/07 papers by Lasry and Lions (see also P.-L. Lions, Cours au Collège de France, www.college-de-france.fr, for more recent developments). Related ideas were developed independently, and at about the same time, in the engineering literature by Huang, Caines and Malhamé. Assume that the running cost $L(X_s)$ also depends on an exogenous variable $m(s, X_s)$ modeling the population density of the other agents at state $X_s$ at time $s$.

  10. The new cost criterion is then
     $$\inf \left\{ \int_t^T \left[ \tfrac12 |\dot X_s|^2 + L(X_s, m(s, X_s)) \right] ds + G(X_T, m(T, X_T)) \right\} \tag{3}$$
     Here, $m$ is a non-negative function valued in $[0,1]$ such that $\int_{\mathbb{R}^d} m(s,x)\, dx = 1$ for all $s$. The time evolution of $m$ starting from an initial configuration $m(0,x)$ is governed by the continuity equation
     $$\partial_t m(t,x) - \operatorname{div}\big( m(t,x)\, \nabla_x u(t,x) \big) = 0 \quad \text{in } (0,T) \times \mathbb{R}^d.$$
     Note that in the cost criterion the evolution of the measure $m$ enters as a parameter. The value function of the agent is then given by
     $$u(t,x) = \inf \left\{ \int_t^T \left[ \tfrac12 |\dot X_s|^2 + L(X_s, m(s, X_s)) \right] ds + G(X_T, m(T, X_T)) \right\} \tag{4}$$

  11. The agent's optimal control is, at least heuristically, given in feedback form by $\alpha^*(t,x) = -\nabla_x u(t,x)$. Now, if all agents argue in this way, their distribution will move with the velocity given by the drift term $-\nabla_x u(t,x)$. This leads eventually to the continuity equation.
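The transport of the population by the feedback velocity can be illustrated with a small finite-volume computation. The sketch below discretizes $\partial_t m + \partial_x (m v) = 0$ with $v = -\partial_x u$ in dimension one, using a first-order upwind flux and no-flux walls. The velocity field comes from the illustrative choice $u(t,x) = x^2/(2(1+T-t))$; this $u$, the Gaussian initial density, the grid and the function names are all assumptions made for the example, not part of the slides.

```python
import numpy as np

T = 1.0

def velocity(t, x):
    # v = -du/dx for the assumed u(t, x) = x^2 / (2 (1 + T - t)).
    return -x / (1.0 + T - t)

def transport(m0, x_edges, n_steps):
    """Conservative upwind scheme for dm/dt + d(m v)/dx = 0 on a uniform grid."""
    dx = x_edges[1] - x_edges[0]
    dt = T / n_steps
    m = m0.copy()
    for k in range(n_steps):
        v = velocity(k * dt, x_edges[1:-1])            # velocity at interior faces
        flux = np.where(v > 0.0, v * m[:-1], v * m[1:])  # take the upwind cell value
        flux = np.concatenate(([0.0], flux, [0.0]))      # no flux through the walls
        m = m - dt / dx * (flux[1:] - flux[:-1])
    return m

x_edges = np.linspace(-4.0, 4.0, 401)
x_mid = 0.5 * (x_edges[1:] + x_edges[:-1])
dx = x_edges[1] - x_edges[0]
m0 = np.exp(-0.5 * x_mid ** 2)
m0 /= m0.sum() * dx                                    # normalize total mass to 1
mT = transport(m0, x_edges, n_steps=400)               # dt chosen so that |v| dt / dx <= 0.5
mass_T = mT.sum() * dx
std_0 = np.sqrt((m0 * x_mid ** 2).sum() * dx)
std_T = np.sqrt((mT * x_mid ** 2).sum() * dx)
```

Because the scheme is written in flux form, the total mass is preserved up to rounding, which mirrors the constraint $\int m(s,x)\,dx = 1$ for all $s$. With this velocity field all agents move toward the origin, so the density concentrates and `std_T` is smaller than `std_0`.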

  12. We are therefore led to consider the following system of nonlinear evolution PDEs for the unknown functions $u = u(t,x)$, $m = m(t,x)$:
     $$-\frac{\partial u}{\partial t} + \tfrac12 |\nabla u|^2 = L(x, m) \quad \text{in } (0,T) \times \mathbb{R}^d \tag{5}$$
     $$\frac{\partial m}{\partial t} - \operatorname{div}(m \nabla u) = 0 \quad \text{in } (0,T) \times \mathbb{R}^d \tag{6}$$
     with the initial and terminal conditions
     $$m(0,x) = m_0(x), \qquad u(T,x) = G(x, m(T,x)) \quad \text{in } \mathbb{R}^d \tag{7}$$

  13. Three crucial structural features:
     (i) the first equation is backward, the second one forward in time;
     (ii) the operator in the continuity equation is the adjoint of the linearization at $u$ of the HJ operator in the first equation;
     (iii) the nonlinearity in the HJ equation is convex with respect to $|\nabla u|$.
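The adjoint structure can be verified by a short formal computation. Denote by $N(u) = -\partial_t u + \tfrac12 |\nabla u|^2$ the HJ operator (the symbol $N$ is introduced here only for this remark). Its linearization at $u$ in the direction $v$ is

$$N'(u)\,v = \frac{d}{d\varepsilon} N(u + \varepsilon v) \Big|_{\varepsilon = 0} = -\partial_t v + \nabla u \cdot \nabla v.$$

For smooth $v$ with compact support in $(0,T) \times \mathbb{R}^d$, integrating by parts in $t$ and $x$ gives

$$\int_0^T \!\! \int_{\mathbb{R}^d} m \left( -\partial_t v + \nabla u \cdot \nabla v \right) dx\, dt = \int_0^T \!\! \int_{\mathbb{R}^d} v \left( \partial_t m - \operatorname{div}(m \nabla u) \right) dx\, dt,$$

so the formal adjoint of $N'(u)$ applied to $m$ is $\partial_t m - \operatorname{div}(m \nabla u)$, the operator of the continuity equation.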

  14. The planning problem. An interesting variant of the MFG system was proposed by Lions for modeling the presence of a regulator prescribing a target density to be reached at the final time:
     $$-\frac{\partial u}{\partial t} + \tfrac12 |\nabla u|^2 = L(x, m) \quad \text{in } (0,T) \times \mathbb{R}^d$$
     $$\frac{\partial m}{\partial t} - \operatorname{div}(m \nabla u) = 0 \quad \text{in } (0,T) \times \mathbb{R}^d$$
     with the initial and terminal conditions
     $$m(0,x) = m_0(x) \ge 0, \qquad m(T,x) = m_T(x) \quad \text{in } \mathbb{R}^d.$$
     There are no side conditions on $u$. For $L \equiv 0$, the above is the equivalent formulation of the Monge-Kantorovich optimal mass transport problem considered by Benamou-Brenier (2000); see also Achdou-Camilli-CD, SIAM J. Control Optim. (2011).

  15. Stochastic mean field game models. Consider the following system (MFG) of evolution PDEs:
     $$-\frac{\partial u}{\partial t} - \nu \Delta u + \tfrac12 |\nabla u|^2 = L(x, m) \quad \text{in } (0,T) \times \mathbb{R}^d \tag{8}$$
     $$\frac{\partial m}{\partial t} - \nu \Delta m - \operatorname{div}(m \nabla u) = 0 \quad \text{in } (0,T) \times \mathbb{R}^d \tag{9}$$
     with the initial and terminal conditions
     $$m(0,x) = m_0(x), \qquad u(T,x) = G(x, m(T,x)) \quad \text{in } \mathbb{R}^d \tag{10}$$
     Here $\nu$ is a positive number. The first equation is a backward Hamilton-Jacobi-Bellman (HJB) equation, the second one a forward Fokker-Planck (FP) equation.

  16. The heuristic interpretation of this system is as follows. Fix a solution of MFG: the classical dynamic programming approach to optimal control suggests that the solution $u$ of (HJB) is the value function of an agent controlling the stochastic differential equation
     $$dX_t = \alpha_t\, dt + \sqrt{2\nu}\, dB_t, \qquad X_0 = x,$$
     where $B_t$ is a standard Brownian motion, i.e.
     $$X_t = x + \int_0^t \alpha_s\, ds + \sqrt{2\nu}\, B_t.$$
     The agent aims at minimizing the integral cost
     $$J(x, \alpha) := E_x \left[ \int_0^T \left( \tfrac12 |\alpha_s|^2 + L(X_s, m(s)) \right) ds + G(X_T, m(T)) \right],$$
     considering the density $m(s)$ of "the other agents" as given.
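The cost $J(x,\alpha)$ of a given feedback control can be estimated by simulating the controlled dynamics with the Euler-Maruyama scheme and averaging over sample paths. The sketch below works in dimension one with the density $m$ frozen, so the costs are passed as plain functions of the state. The function name `simulate_cost`, its parameters, and the test case ($L \equiv 0$, $G(x) = x^2/2$, $\nu = 1/2$, $T = 1$, $x = 0$) are illustrative assumptions, not part of the slides.

```python
import numpy as np

def simulate_cost(x0, feedback, L, G, nu, T, n_steps, n_paths, seed=0):
    """Monte Carlo estimate of J(x0, alpha) for dX = alpha dt + sqrt(2 nu) dB."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    x = np.full(n_paths, float(x0))
    running = np.zeros(n_paths)
    for k in range(n_steps):
        a = feedback(k * dt, x)                       # control in feedback form
        running += (0.5 * a ** 2 + L(x)) * dt         # running cost with m frozen
        noise = rng.standard_normal(n_paths)
        x = x + a * dt + np.sqrt(2.0 * nu * dt) * noise  # Euler-Maruyama step
    return float(np.mean(running + G(x)))

T, nu = 1.0, 0.5
L_zero = lambda x: np.zeros_like(x)
G_quad = lambda x: 0.5 * x ** 2

# Uncontrolled agent versus the feedback -du/dx of the assumed quadratic example.
J_zero = simulate_cost(0.0, lambda t, x: np.zeros_like(x), L_zero, G_quad,
                       nu, T, n_steps=200, n_paths=50000)
J_feedback = simulate_cost(0.0, lambda t, x: -x / (1.0 + T - t), L_zero, G_quad,
                           nu, T, n_steps=200, n_paths=50000)
```

For this test case one can check by hand that $u(t,x) = x^2/(2(1+T-t)) + \nu \ln(1+T-t)$ solves the backward HJB equation, so the estimate `J_feedback` should be close to $u(0,0) = \nu \ln 2$, while the uncontrolled cost `J_zero` should be close to $\nu T = 1/2$. Both estimates carry Monte Carlo and time-discretization error.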

  17. Formal dynamic programming arguments indicate that the candidate optimal control for the agent should be constructed through the feedback strategy $\alpha^*(t,x) := -\nabla u(t,x)$, where $u$ is the unique solution of HJB for fixed $m$. Indeed, we have the simple verification result:
     Lemma: Let $X^*_t$ be the solution of
     $$dX_t = \alpha^*(t, X_t)\, dt + \sqrt{2\nu}\, dB_t, \qquad X_0 = x,$$
     and set $\alpha^*_t := \alpha^*(t, X_t)$. Then,
     $$\inf_\alpha J(x, \alpha) = J(x, \alpha^*_t) = \int_{\mathbb{R}^d} u(0, X_0)\, dm_0(x).$$
     Therefore, the optimal control problem is "completely" solved by solving the backward HJB equation, which determines $\nabla u(t,x)$ for all $t$ and the initial value $u(0,x)$.

  18. Proof: Take $\nu = 1$ for simplicity and let $\alpha_t$ be any admissible control. Then,
     $$\begin{aligned} E_x\big[ G(X_T, m(T)) \big] &= E_x\big[ u(T, X_T) \big] \\ &= E_x \left[ u(0, X_0) + \int_0^T \left( \frac{\partial u}{\partial t}(s, X_s) + \alpha_s \cdot \nabla u(s, X_s) + \Delta u(s, X_s) \right) ds \right] && \text{[by Itô's formula]} \\ &= E_x \left[ u(0, X_0) + \int_0^T \left( \tfrac12 |\nabla u(s, X_s)|^2 + \alpha_s \cdot \nabla u(s, X_s) - L(X_s, m(s)) \right) ds \right] && \text{[by HJB]} \\ &\ge E_x \left[ u(0, X_0) + \int_0^T \left( -\tfrac12 |\alpha_s|^2 - L(X_s, m(s)) \right) ds \right] && \text{[by convexity]} \end{aligned}$$

  19. Hence, by the very definition of $J$,
     $$E_x\big[ u(0, x) \big] \le J(x, \alpha)$$
     for any admissible control $\alpha$. The same computation with $\alpha_s$ replaced by $\alpha^*_s$ gives an equality in the last step, proving that
     $$\inf_\alpha J(x, \alpha) = J(x, \alpha^*).$$
