A numerical method for mean-field type problems Laurent Pfeiffer - PowerPoint PPT Presentation


  1. Mean-field type problems; Algorithm and application
     A numerical method for mean-field type problems
     Laurent Pfeiffer, Institute for Mathematics and Scientific Computing, University of Graz
     Numerical methods for HJB equations in optimal control and related fields, RICAM, Linz, November 22nd, 2016

  2. Introduction
     Goal: analysing and solving stochastic optimal control problems.
     Specificity: the cost is a function of the probability distribution of the state variable at the final time.
     Method: a kind of gradient method.
     Application: risk-averse optimization.

  3. Outline
     1 Mean-field type problems: Fokker-Planck equation; Problem formulation; Optimality conditions
     2 Algorithm and numerical example: Algorithm; Results

  5. Fokker-Planck equation
     Consider the stochastic differential equation (SDE):
     $$dX_t = f(X_t)\,dt + \sigma(X_t)\,dW_t, \quad X_0 = x_0,$$
     with $f: \mathbb{R}^n \to \mathbb{R}^n$, $\sigma: \mathbb{R}^n \to \mathbb{R}^n$, $(W_t)_{t \ge 0}$ a Brownian motion, and $x_0$ a random variable in $\mathbb{R}^n$ with probability distribution $m_0$.
     Let $m(t,\cdot) \in \mathcal{P}(\mathbb{R}^n)$ be the probability distribution of $X_t$:
     $$\mathbb{P}\big[X_t \in \Omega\big] = \int_\Omega 1 \, dm(t,x), \quad \forall \Omega \subset \mathbb{R}^n.$$
     Under suitable assumptions, $m$ is a weak solution to the Fokker-Planck equation (FP):
     $$\partial_t m = -\sum_{i=1}^n \partial_{x_i}(m f_i) + \frac{1}{2} \sum_{i,j=1}^n \partial_{x_i x_j}(m \sigma_i \sigma_j).$$
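
The link between the SDE and the distribution of its solution can be checked numerically on a case where the law is known in closed form. The sketch below is an illustration only, not the method of the talk: it assumes the scalar Ornstein-Uhlenbeck dynamics $f(x) = -x$, $\sigma = 1$, $X_0 = 0$, for which $X_t$ is Gaussian with mean 0 and variance $(1 - e^{-2t})/2$, and it uses a plain Euler-Maruyama scheme. The function name `euler_maruyama` and all parameter values are choices made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

def euler_maruyama(f, sigma, x0, T, n_steps, rng):
    """Simulate X_T for dX = f(X) dt + sigma(X) dW, one path per entry of x0."""
    x = np.array(x0, dtype=float)
    dt = T / n_steps
    for _ in range(n_steps):
        dw = rng.normal(0.0, np.sqrt(dt), size=x.shape)  # Brownian increments
        x = x + f(x) * dt + sigma(x) * dw
    return x

# Assumed test case: Ornstein-Uhlenbeck process started at 0.
n_paths = 100_000
x_T = euler_maruyama(
    f=lambda x: -x,
    sigma=lambda x: np.ones_like(x),
    x0=np.zeros(n_paths),
    T=1.0,
    n_steps=100,
    rng=rng,
)

empirical_var = x_T.var()
exact_var = 0.5 * (1.0 - np.exp(-2.0))  # variance of the Gaussian law at t = 1
print(empirical_var, exact_var)
```

The empirical variance of the samples should be close to the variance of the Gaussian law, up to time-discretization and sampling error.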

  6. Problem formulation
     Let $\mathcal{U}$ be the set of adapted control processes taking values in a given compact set $U$. For all $t \in [0,T]$, $x \in \mathbb{R}^n$, $u \in \mathcal{U}$, let $(X_s^{t,x,u})_{s \in [t,T]}$ be the solution to:
     $$dX_s = f(X_s, u_s)\,ds + \sigma(X_s, u_s)\,dW_s, \quad X_t = x,$$
     where $f: \mathbb{R}^n \times U \to \mathbb{R}^n$ and $\sigma: \mathbb{R}^n \times U \to \mathbb{R}^n$ are given.
     Assumptions: $\exists L > 0$, $\forall x, y \in \mathbb{R}^n$, $\forall u, v \in U$,
     $$|f(x,u)| + |\sigma(x,u)| \le L(1 + |x| + |u|),$$
     $$|f(x,u) - f(y,v)| + |\sigma(x,u) - \sigma(y,v)| \le L(|y - x| + |v - u|).$$
     Let $X_0$ be a random variable such that $\mathbb{E}\big[|X_0|^2\big] < +\infty$.

  7. Problem formulation
     For all $u \in \mathcal{U}$, we denote by $m^u$ the probability distribution of $X_T^{0,X_0,u}$, for a fixed initial state $X_0$. We aim at solving:
     $$\min_{u \in \mathcal{U}} \chi(m^u), \qquad (P)$$
     where the cost $\chi: \mathcal{P}(\mathbb{R}^n) \to \mathbb{R}$ is given.
     Remark: an attempt at a PDE-constrained formulation of the problem:
     $$\min_{u: [0,T] \times \mathbb{R}^n \to U} \chi(m(T,\cdot)),$$
     subject to:
     $$\begin{cases} \partial_t m(t,\cdot) = -\sum_{i=1}^n \partial_{x_i}\big(m(t,\cdot) f_i(\cdot,u(t,\cdot))\big) + \frac{1}{2} \sum_{i,j=1}^n \partial_{x_i x_j}\big(m(t,\cdot)\, \sigma_i \sigma_j(\cdot,u(t,\cdot))\big), \\ m(0,\cdot) = \mathcal{L}(X_0). \end{cases}$$
     However, well-posedness of the Fokker-Planck equation is not ensured in this formulation.

  8. Problem formulation
     Possible application: risk-averse optimization ($n = 1$).
     Penalization of the variance:
     $$\chi(m) = \int_{\mathbb{R}} x \, dm(x) + \varepsilon \int_{\mathbb{R}} \Big( x - \int_{\mathbb{R}} y \, dm(y) \Big)^2 dm(x).$$
     Conditional Value at Risk:
     $$\mathrm{CVaR}_\beta = \frac{1}{1-\beta} \int_{\mathbb{R}} x \, \mathbf{1}_{x \ge \mathrm{VaR}_\beta} \, dm(x),$$
     where:
     $$\mathrm{VaR}_\beta = \sup \Big\{ z \in \mathbb{R} \;\Big|\; \int_{\mathbb{R}} \mathbf{1}_{x \le z} \, dm(x) \le \beta \Big\}.$$
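
Both risk functionals are easy to evaluate when $m$ is a discrete distribution, which is the situation after space discretization. The sketch below assumes $m = \sum_j w_j \delta_{x_j}$ with weights summing to 1 and $\beta < 1$; the function names `mean_variance_cost` and `var_cvar` are hypothetical. It applies the two formulas above literally, so for a distribution with atoms the CVaR value follows the slide's definition rather than any refined convention.

```python
import numpy as np

def mean_variance_cost(x, w, eps):
    """chi(m) = mean + eps * variance for m = sum_j w[j] * delta_{x[j]}."""
    x, w = np.asarray(x, dtype=float), np.asarray(w, dtype=float)
    mean = np.dot(w, x)
    return mean + eps * np.dot(w, (x - mean) ** 2)

def var_cvar(x, w, beta):
    """VaR_beta = sup{z : m((-inf, z]) <= beta} and the matching CVaR_beta."""
    x, w = np.asarray(x, dtype=float), np.asarray(w, dtype=float)
    order = np.argsort(x)
    x, w = x[order], w[order]
    cdf = np.cumsum(w)
    # The supremum is the first atom whose cumulative weight exceeds beta.
    k = np.searchsorted(cdf, beta, side="right")
    var = x[k]
    tail = x >= var
    cvar = np.dot(w[tail], x[tail]) / (1.0 - beta)
    return var, cvar

# Small example: two equally likely outcomes, then a uniform law on four points.
cost = mean_variance_cost([0.0, 2.0], [0.5, 0.5], eps=1.0)
var_value, cvar_value = var_cvar([1.0, 2.0, 3.0, 4.0], [0.25] * 4, beta=0.5)
print(cost, var_value, cvar_value)
```

In the first example the mean and the variance are both 1; in the second, the upper half of the distribution consists of the atoms 3 and 4.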

  9. Optimality conditions
     Specific case: standard problems. Assume $\exists \varphi: \mathbb{R}^n \to \mathbb{R}$ such that:
     $$\chi(m^u) = \int_{\mathbb{R}^n} \varphi(x) \, dm^u(x) = \mathbb{E}\big[\varphi(X_T^{0,x_0,u})\big].$$
     The corresponding problem is solved by dynamic programming:
     $$\min_{u \in \mathcal{U}} \mathbb{E}\big[\varphi(X_T^{0,x_0,u})\big]. \qquad (P(\varphi))$$
     Theorem. The value function $V(t,x) = \min_{u \in \mathcal{U}} \mathbb{E}\big[\varphi(X_T^{t,x,u})\big]$ is the solution to the Hamilton-Jacobi-Bellman (HJB) equation:
     $$-\partial_t V(t,x) = \min_{u \in U} \Big\{ \nabla V(t,x)^\top f(x,u) + \frac{1}{2} \mathrm{tr}\big( \nabla^2 V(t,x)\, \sigma\sigma^\top(x,u) \big) \Big\}, \qquad V(T,x) = \varphi(x).$$
     This provides a characterization of the optimal control.

  10. Optimality conditions
     General case.
     Theorem. Assume the following:
     1 $\chi$ is continuous for the Wasserstein $d_1$-distance.
     2 $\chi$ is differentiable: $\forall m_1, m_2 \in \mathcal{P}(\mathbb{R}^n)$, $\exists D\chi(m_1,\cdot) \in C(\mathbb{R}^n,\mathbb{R})$ such that:
     $$\frac{\chi\big((1-\theta) m_1 + \theta m_2\big) - \chi(m_1)}{\theta} \xrightarrow[\theta \to 0]{} \int D\chi(m_1,x) \, d\big(m_2(x) - m_1(x)\big).$$
     We also assume: $\exists K > 0$, $\forall x \in \mathbb{R}^n$, $D\chi(m_1,x) \le K(1 + |x|^2)$.
     If $\bar{u} \in \mathcal{U}$ is a solution to $(P)$, then $\bar{u}$ is a solution to $P(D\chi(m^{\bar{u}}))$.
     Remark: the associated value function $V(t,x)$ may be seen as a Lagrange multiplier for the Fokker-Planck equation.
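
As a worked instance of the differentiability assumption, the variance-penalized cost of the risk-averse application ($n = 1$) can be differentiated by hand. The shorthands $\mu_i = \int x \, dm_i(x)$ and $s_i = \int x^2 \, dm_i(x)$ are introduced here only for this illustration.

```latex
% Variance-penalized cost written with first and second moments:
\chi(m) = \mu(m) + \varepsilon \big( s(m) - \mu(m)^2 \big).
% Both moments are linear in m, so along m_\theta = (1-\theta) m_1 + \theta m_2:
\mu(m_\theta) = \mu_1 + \theta (\mu_2 - \mu_1), \qquad
s(m_\theta) = s_1 + \theta (s_2 - s_1).
% Expanding the difference quotient:
\frac{\chi(m_\theta) - \chi(m_1)}{\theta}
  = (1 - 2 \varepsilon \mu_1)(\mu_2 - \mu_1)
    + \varepsilon (s_2 - s_1)
    - \varepsilon \theta (\mu_2 - \mu_1)^2 .
% Letting \theta \to 0 and reading off the integrand against d(m_2 - m_1):
D\chi(m_1, x) = (1 - 2 \varepsilon \mu_1)\, x + \varepsilon x^2 .
```

For a fixed $m_1$ this derivative grows quadratically in $x$, in line with the growth bound assumed in the theorem. It is only determined up to an additive constant, since $m_2 - m_1$ has total mass zero.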

  11. Optimality conditions
     Let $\mathcal{R}$ be the set of reachable probability distributions: $\mathcal{R} = \{ m^u \mid u \in \mathcal{U} \}$.
     Lemma. The closure of $\mathcal{R}$ (for the $d_1$-distance), $\mathrm{cl}(\mathcal{R})$, is convex.
     Proof of the theorem. Let $\bar{m} = m^{\bar{u}}$. By continuity of $\chi$,
     $$\chi(\bar{m}) = \inf_{m \in \mathrm{cl}(\mathcal{R})} \chi(m).$$
     By convexity of $\mathrm{cl}(\mathcal{R})$, for all $u \in \mathcal{U}$, for all $\theta \in [0,1]$,
     $$0 \le \frac{\chi\big(\theta m^u + (1-\theta) \bar{m}\big) - \chi(\bar{m})}{\theta} \xrightarrow[\theta \to 0]{} \int_{\mathbb{R}^n} D\chi(\bar{m},x) \, d\big(m^u(x) - \bar{m}(x)\big).$$
     Thus:
     $$\mathbb{E}\big[D\chi(\bar{m}, X_T^{0,X_0,\bar{u}})\big] \le \mathbb{E}\big[D\chi(\bar{m}, X_T^{0,X_0,u})\big].$$

  12. Outline
     1 Mean-field type problems: Fokker-Planck equation; Problem formulation; Optimality conditions
     2 Algorithm and numerical example: Algorithm; Results

  13. Algorithm
     Set $k = 0$, choose $m_0 \in \mathcal{R}$, fix $\delta > 0$.
     While $\varepsilon(m_k) > \delta$, do:
     1 Backward phase (HJB): solve $P(D\chi(m_k))$, optimal solution: $u_k$.
     2 Forward phase (FP): compute $m = m^{u_k}$.
     3 Solve: $\min_{\theta \in [0,1]} \chi(\theta m_k + (1-\theta) m)$, solution: $\theta_k$. Set: $m_{k+1} = \theta_k m_k + (1-\theta_k) m$.
     4 Set $k = k + 1$.
     The criterion $\varepsilon(m)$ is defined by:
     $$\varepsilon(m) = -\inf_{m' \in \mathrm{cl}(\mathcal{R})} \int_{\mathbb{R}^n} D\chi(m,x) \, d\big(m'(x) - m(x)\big) \ge 0.$$
     Note that $\bar{u}$ satisfies the optimality conditions iff $\varepsilon(m^{\bar{u}}) = 0$.
     Remark: the method does not provide a feedback optimal solution.
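
The loop above has the structure of a conditional-gradient iteration over distributions. The toy sketch below mimics that structure under strong simplifying assumptions that are not from the talk: distributions live on five grid points, the reachable set is taken to be all probability vectors on that grid, and the backward and forward phases are replaced by a hypothetical `oracle` that returns the Dirac mass minimizing the linearized cost. The cost, the target moments, and the grid line search over $\theta$ are likewise choices made for the example.

```python
import numpy as np

x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])   # support points (assumed)
phis = np.vstack([x, x ** 2])               # phi_1(x) = x, phi_2(x) = x^2
target = np.array([0.5, 1.5])               # hypothetical target moments

def chi(m):
    """Cost: squared distance between the moments of m and the target."""
    return float(np.sum((phis @ m - target) ** 2))

def d_chi(m):
    """Derivative D chi(m, x_j) evaluated at every grid point."""
    return 2.0 * (phis @ m - target) @ phis

def oracle(linear_cost):
    """Stand-in for the HJB + FP phases: minimizer of a linear cost over the toy set."""
    m = np.zeros(x.size)
    m[np.argmin(linear_cost)] = 1.0
    return m

def gap(m):
    """Criterion eps(m) = -min_{m'} sum_j D chi(m, x_j) (m'_j - m_j)."""
    g = d_chi(m)
    return float(g @ m - g.min())

m = np.zeros(x.size)
m[0] = 1.0                                   # initial distribution: Dirac at -2
thetas = np.linspace(0.0, 1.0, 1001)         # grid for the line search
delta = 1e-6
history = [chi(m)]

for k in range(200):
    if gap(m) <= delta:
        break
    m_new = oracle(d_chi(m))                                  # steps 1 and 2
    values = [chi(t * m + (1.0 - t) * m_new) for t in thetas]
    theta_k = thetas[int(np.argmin(values))]                  # step 3
    m = theta_k * m + (1.0 - theta_k) * m_new
    history.append(chi(m))

print(len(history) - 1, history[0], history[-1], gap(m))
```

Because $\theta = 1$ belongs to the search grid, the cost cannot increase from one iteration to the next. In the actual method the oracle would be the expensive part, since it requires solving an HJB equation.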

  14. Algorithm
     Theorem. Assume that: $\exists K > 0$, $\forall m_1, m_2, m_3, m_4 \in \mathrm{cl}(\mathcal{R})$,
     $$\int_{\mathbb{R}^n} \big( D\chi(m_2,x) - D\chi(m_1,x) \big) \, d\big(m_2(x) - m_1(x)\big) \le K d_1(m_1,m_2)^2,$$
     $$\int_{\mathbb{R}^n} \big( D\chi(m_2,x) - D\chi(m_1,x) \big) \, d\big(m_4(x) - m_3(x)\big) \le K d_1(m_1,m_2).$$
     Then, the sequence $(m_k)_{k \in \mathbb{N}}$ generated by the method (without stopping criterion) possesses a limit point $\bar{m}$ such that $\varepsilon(\bar{m}) = 0$. Moreover, $\chi(m_k) \to \chi(\bar{m})$.
     Idea of proof. Inspired by gradient descent methods. There exist $A > 0$ and $B > 0$ such that:
     $$\chi(m_{k+1}) - \chi(m_k) \le -\min\big\{ A \varepsilon(m_k), \, B \varepsilon(m_k)^2 \big\}.$$

  15. Algorithm
     Given $\varphi_1, \dots, \varphi_N: \mathbb{R}^n \to \mathbb{R}$ and $\Psi: \mathbb{R}^N \to \mathbb{R}$, define:
     $$\chi(m^u) = \Psi\Big( \int_{\mathbb{R}^n} \varphi_1(x) \, dm^u(x), \dots, \int_{\mathbb{R}^n} \varphi_N(x) \, dm^u(x) \Big) = \Psi\Big( \mathbb{E}\big[\varphi_1(X_T^{0,X_0,u})\big], \dots, \mathbb{E}\big[\varphi_N(X_T^{0,X_0,u})\big] \Big).$$
     Lemma. Assume that $\Psi$ is differentiable with a Lipschitz derivative, and that for some $p \ge 2$:
     $$|\varphi_i(x)| (1 + |x|)^{-p} \xrightarrow[|x| \to \infty]{} 0, \qquad \mathbb{E}\big[|X_0|^p\big] < +\infty.$$
     Then, the assumptions of the previous theorem are satisfied, with:
     $$D\chi(m,x) = \sum_{i=1}^N \partial_{y_i} \Psi\Big( \int_{\mathbb{R}^n} \varphi_1(x) \, dm(x), \dots, \int_{\mathbb{R}^n} \varphi_N(x) \, dm(x) \Big) \varphi_i(x).$$
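
The formula for $D\chi$ in the lemma can be sanity-checked with a finite difference along a segment between two distributions. The sketch below assumes discrete distributions on a one-dimensional grid, the test functions $\varphi_1(x) = x$, $\varphi_2(x) = x^2$, and a quadratic $\Psi$ measuring the distance to hypothetical target moments; none of these choices come from the talk.

```python
import numpy as np

x = np.linspace(-2.0, 2.0, 41)           # grid carrying the distributions (assumed)
phis = np.vstack([x, x ** 2])            # rows: phi_1 and phi_2 on the grid
target = np.array([0.5, 1.5])            # hypothetical target moments

def psi(y):
    return float(np.sum((y - target) ** 2))

def grad_psi(y):
    return 2.0 * (y - target)

def chi(m):
    """chi(m) = Psi(int phi_1 dm, int phi_2 dm) for a weight vector m."""
    return psi(phis @ m)

def d_chi(m):
    """D chi(m, x_j) = sum_i d_{y_i} Psi(moments of m) * phi_i(x_j)."""
    return grad_psi(phis @ m) @ phis

rng = np.random.default_rng(0)
m1 = rng.random(x.size)
m1 /= m1.sum()
m2 = rng.random(x.size)
m2 /= m2.sum()

theta = 1e-6
finite_difference = (chi((1.0 - theta) * m1 + theta * m2) - chi(m1)) / theta
directional = float(d_chi(m1) @ (m2 - m1))   # integral of D chi(m1, .) against m2 - m1
print(finite_difference, directional)
```

The two printed numbers should agree up to a term of order $\theta$, which is the statement of the differentiability assumption restricted to this example.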

  16. Algorithm
     Backward phase:
     Discretization of the SDE (semi-Lagrangian scheme) with a controlled Markov chain.
     Resolution of the HJB equation (discrete dynamic programming principle).
     Forward phase:
     Resolution of the FP equation (adjoint equation to the Markov chain, i.e. the Chapman-Kolmogorov equation).
     Remarks:
     Curse of dimensionality.
     Computational effort in the backward phase.
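
The two phases can be sketched on a small one-dimensional example. The code below is a simplified illustration, not the implementation used for the talk: it assumes the dynamics $dX = u\,dt + dW$, a two-point approximation of the Brownian increment ($\pm\sqrt{h}$ with probability 1/2 each), linear interpolation on a uniform grid with clamping at the boundary, and a terminal cost $\varphi(x) = x^2$. The helper names `interp_weights` and `transition_matrix` are hypothetical.

```python
import numpy as np

def interp_weights(y, grid):
    """Left index and the two weights of linear interpolation of y on a uniform grid."""
    dx = grid[1] - grid[0]
    s = np.clip((y - grid[0]) / dx, 0.0, len(grid) - 1.0)   # clamp to the grid
    lo = np.minimum(s.astype(int), len(grid) - 2)
    w_hi = s - lo
    return lo, 1.0 - w_hi, w_hi

def transition_matrix(u, grid, h, sigma=1.0):
    """Row-stochastic matrix of the chain when control u[i] is used at grid[i]."""
    n = len(grid)
    P = np.zeros((n, n))
    rows = np.arange(n)
    for sign in (1.0, -1.0):
        y = grid + u * h + sign * sigma * np.sqrt(h)        # arrival points
        lo, w_lo, w_hi = interp_weights(y, grid)
        np.add.at(P, (rows, lo), 0.5 * w_lo)
        np.add.at(P, (rows, lo + 1), 0.5 * w_hi)
    return P

grid = np.linspace(-5.0, 5.0, 201)
n_steps, h = 20, 1.0 / 20
controls = np.linspace(-1.0, 1.0, 5)
phi = grid ** 2                                             # assumed terminal cost

# One transition matrix per constant control value.
P_all = np.stack([transition_matrix(np.full(grid.size, u), grid, h) for u in controls])

# Backward phase: discrete dynamic programming, storing the minimizing control index.
V = phi.copy()
policies = []
for _ in range(n_steps):
    Q = P_all @ V                       # shape (n_controls, n_grid)
    policies.append(Q.argmin(axis=0))
    V = Q.min(axis=0)
policies.reverse()                      # policies[k] is now the feedback at step k
V0 = V

# Forward phase: push the initial distribution with the transposed (adjoint) chain.
m0 = np.zeros(grid.size)
m0[grid.size // 2] = 1.0                # Dirac mass at the central grid point
m = m0.copy()
for best in policies:
    P = P_all[best, np.arange(grid.size)]   # row i uses the control chosen at grid[i]
    m = P.T @ m

print(m.sum(), m @ phi, m0 @ V0)
```

The last two printed values coincide up to rounding: integrating the terminal cost against the final distribution gives the same number as integrating the initial value function against the initial distribution, which reflects the adjoint relation between the two phases. The dense matrices used here are only viable on small grids.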

  17. Results
     Example considered:
     SDE: $dX_s = u_s\,ds + dW_s$, $X_0 = 0$, with final time 1.
     Controls: $u_s \in U = [-1, 1]$.
     Cost: $\chi(m) = d_2(m, m_{\mathrm{ref}})$, with $m_{\mathrm{ref}} = \frac{1}{3}(\delta_{-2} + \delta_0 + \delta_2)$.
     Discretization: semi-Lagrangian scheme, $100 \times 5000$ points in $[0,1] \times [-5,5]$, 20 points for the control.
     Convergence:
     Iterations          0      10     20     30     40     50
     $\chi(m_k)$         0.874  0.551  0.536  0.531  0.528  0.526
     $\varepsilon(m_k)$  0.43   0.043  0.030  0.020  0.030  0.025

  18. Results
     Figure: Distribution along time (axes: Time, Space).

  19. Results
     Figure: Control (axes: Time, Space).

  20. Results
     Figure: Value function (axes: Time, Space).

  21. Bibliography
     References:
     A. Bensoussan, J. Frehse, and P. Yam. Mean-field games and mean-field type control theory. Springer, 2013.
     L. Pfeiffer. Optimality conditions for mean-field type control problems. Preprint.
     L. Pfeiffer. Numerical methods for mean-field type optimal control problems. Pure and Applied Functional Analysis, 1(4):629-655, 2016.
     Thank you for your attention.
