Mean-field type problems Algorithm and application
A numerical method for mean-field type problems
Laurent Pfeiffer
Institute for Mathematics and Scientific Computing, University of Graz
Numerical methods for HJB equations in optimal control
Introduction
Goal: analyse and solve stochastic optimal control problems.
Specificity: the cost is a function of the probability distribution of the state variable at the final time.
Method: a gradient-type method.
Application: risk-averse optimization.
1 Mean-field type problems
  Fokker-Planck equation
  Problem formulation
  Optimality conditions
2 Algorithm and numerical example
  Algorithm
  Results
Fokker-Planck equation
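Consider the stochastic differential equation (SDE):
$$dX_t = f(X_t)\,dt + \sigma(X_t)\,dW_t, \quad X_0 = x_0,$$
with $f : \mathbb{R}^n \to \mathbb{R}^n$, $\sigma : \mathbb{R}^n \to \mathbb{R}^n$, $(W_t)_{t \ge 0}$ a Brownian motion, and $x_0$ a random variable in $\mathbb{R}^n$ with probability distribution $m_0$.
Let $m(t,\cdot) \in \mathcal{P}(\mathbb{R}^n)$ be the probability distribution of $X_t$:
$$\mathbb{P}\big[X_t \in \Omega\big] = \int_{\Omega} 1\,dm(t,x), \quad \forall \Omega \subset \mathbb{R}^n.$$
Under suitable assumptions, $m$ is a weak solution to the Fokker-Planck equation (FP):
$$\partial_t m = -\sum_{i=1}^{n} \partial_{x_i}(m f_i) + \frac{1}{2} \sum_{i,j=1}^{n} \partial_{x_i x_j}(m \sigma_i \sigma_j).$$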
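The link between the SDE and the distribution $m(t,\cdot)$ can be illustrated by sampling. The following is a minimal sketch, assuming a one-dimensional state, the hypothetical coefficients $f(x) = -x$ and $\sigma(x) = 1$, and an Euler-Maruyama time discretization; none of these choices come from the slides. A histogram of the samples then approximates the density of $m(T,\cdot)$.

```python
import numpy as np

def euler_maruyama(f, sigma, x0_samples, T, n_steps, rng):
    """Simulate dX = f(X) dt + sigma(X) dW for many independent samples."""
    h = T / n_steps
    x = np.array(x0_samples, dtype=float)
    for _ in range(n_steps):
        # Brownian increments over one time step have variance h.
        dw = rng.normal(0.0, np.sqrt(h), size=x.shape)
        x = x + f(x) * h + sigma(x) * dw
    return x

# Illustrative (assumed) data: f(x) = -x, sigma(x) = 1, X_0 = 1 almost surely.
rng = np.random.default_rng(0)
x0 = np.ones(20000)
xT = euler_maruyama(lambda x: -x, lambda x: np.ones_like(x), x0,
                    T=1.0, n_steps=100, rng=rng)

# Empirical approximation of the density of m(T, .).
density, edges = np.histogram(xT, bins=50, density=True)
```

For these assumed coefficients the exact law at time 1 is Gaussian with mean $e^{-1}$ and variance $(1 - e^{-2})/2$, which gives a simple sanity check on the samples.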
Problem formulation
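Let $\mathcal{U}$ be the set of adapted control processes taking values in a given compact set $U$. For all $t \in [0,T]$, $x \in \mathbb{R}^n$, $u \in \mathcal{U}$, let $(X_s^{t,x,u})_{s \in [t,T]}$ be the solution to:
$$dX_s = f(X_s, u_s)\,ds + \sigma(X_s, u_s)\,dW_s, \quad X_t = x,$$
where $f : \mathbb{R}^n \times U \to \mathbb{R}^n$ and $\sigma : \mathbb{R}^n \times U \to \mathbb{R}^n$ are given.
Assumptions: $\exists L > 0$, $\forall x, y \in \mathbb{R}^n$, $\forall u, v \in U$,
$$|f(x,u)| + |\sigma(x,u)| \le L(1 + |x| + |u|),$$
$$|f(x,u) - f(y,v)| + |\sigma(x,u) - \sigma(y,v)| \le L(|y - x| + |v - u|).$$
Let $X_0$ be a random variable such that $\mathbb{E}\big[|X_0|^2\big] < +\infty$.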
Problem formulation
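For all $u \in \mathcal{U}$, we denote by $m^u$ the probability distribution of $X_T^{0,X_0,u}$, for a fixed initial state $X_0$. We aim at solving:
$$\min_{u \in \mathcal{U}} \chi(m^u) \tag{P}$$
where the cost $\chi : \mathcal{P}(\mathbb{R}^n) \to \mathbb{R}$ is given.
Remark: one may attempt a PDE-constrained formulation of the problem:
$$\min_{u : [0,T] \times \mathbb{R}^n \to U} \chi(m(T,\cdot)),$$
subject to:
$$\partial_t m(t,\cdot) = -\sum_{i=1}^{n} \partial_{x_i}\big(m(t,\cdot) f_i(\cdot, u(t,\cdot))\big) + \frac{1}{2} \sum_{i,j=1}^{n} \partial_{x_i x_j}\big(m(t,\cdot)\,\sigma_i \sigma_j(\cdot, u(t,\cdot))\big), \quad m(0,\cdot) = \mathcal{L}(X_0).$$
However, the well-posedness of this Fokker-Planck equation is not ensured.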
Problem formulation
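Possible application: risk-averse optimization ($n = 1$).
Penalization of the variance:
$$\chi(m) = \int_{\mathbb{R}} x\,dm(x) + \varepsilon \int_{\mathbb{R}} \Big(x - \int_{\mathbb{R}} y\,dm(y)\Big)^2 dm(x).$$
Conditional Value at Risk:
$$\mathrm{CVaR}_\beta = \frac{1}{1-\beta} \int_{\mathbb{R}} x\,\mathbf{1}_{x \ge \mathrm{VaR}_\beta}\,dm(x),$$
where:
$$\mathrm{VaR}_\beta = \sup\Big\{ z \in \mathbb{R} \;\Big|\; \int_{\mathbb{R}} \mathbf{1}_{x \le z}\,dm(x) \le \beta \Big\}.$$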
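To make the two risk-averse costs concrete, here is a minimal sketch that evaluates them for a finitely supported distribution, given as atoms `x` with weights `w`. The function names and the sample data are illustrative assumptions; the formulas are transcribed directly from the definitions above and are applied with $\beta < 1$.

```python
import numpy as np

def variance_penalized_cost(x, w, eps):
    """Mean plus eps times variance of the discrete distribution (x, w)."""
    mean = np.sum(w * x)
    var = np.sum(w * (x - mean) ** 2)
    return mean + eps * var

def value_at_risk(x, w, beta):
    """sup{z : m((-inf, z]) <= beta}, i.e. the first atom where the CDF exceeds beta."""
    order = np.argsort(x)
    xs, cdf = x[order], np.cumsum(w[order])
    j = np.searchsorted(cdf, beta, side="right")  # first index with cdf > beta
    return xs[min(j, len(xs) - 1)]

def cvar(x, w, beta):
    """Integral of x over {x >= VaR_beta}, divided by (1 - beta)."""
    var_beta = value_at_risk(x, w, beta)
    return np.sum(w * x * (x >= var_beta)) / (1.0 - beta)

# Illustrative data: uniform distribution on {0, 1, 2, 3}.
x = np.array([0.0, 1.0, 2.0, 3.0])
w = np.full(4, 0.25)
cost = variance_penalized_cost(x, w, eps=0.1)   # 1.5 + 0.1 * 1.25
tail = cvar(x, w, beta=0.5)                     # (2 + 3) * 0.25 / 0.5
```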
Optimality conditions
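Specific case: standard problems. Assume $\exists \phi : \mathbb{R}^n \to \mathbb{R}$ s.t.:
$$\chi(m^u) = \int_{\mathbb{R}^n} \phi(x)\,dm^u(x) = \mathbb{E}\big[\phi(X_T^{0,x_0,u})\big].$$
The corresponding problem is solved by dynamic programming:
$$\min_{u \in \mathcal{U}} \mathbb{E}\big[\phi(X_T^{0,x_0,u})\big]. \tag{P($\phi$)}$$
Theorem. The value function
$$V(t,x) = \min_{u \in \mathcal{U}} \mathbb{E}\big[\phi(X_T^{t,x,u})\big]$$
is the solution to the Hamilton-Jacobi-Bellman (HJB) equation:
$$-\partial_t V(t,x) = \min_{u \in U} \Big\{ \nabla V(t,x)^\top f(x,u) + \frac{1}{2} \mathrm{tr}\big(\nabla^2 V(t,x)\,\sigma\sigma^\top(x,u)\big) \Big\}, \quad V(T,x) = \phi(x).$$
→ Provides a characterization of the optimal control.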
Optimality conditions
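General case.
Theorem. Assume the following:
1 $\chi$ is continuous for the Wasserstein $d_1$-distance.
2 $\chi$ is differentiable: $\forall m_1, m_2 \in \mathcal{P}(\mathbb{R}^n)$, $\exists D\chi(m_1,\cdot) \in C(\mathbb{R}^n, \mathbb{R})$ s.t.:
$$\frac{\chi\big((1-\theta) m_1 + \theta m_2\big) - \chi(m_1)}{\theta} \xrightarrow[\theta \to 0]{} \int_{\mathbb{R}^n} D\chi(m_1, x)\,d\big(m_2(x) - m_1(x)\big).$$
We also assume: $\exists K > 0$, $\forall x \in \mathbb{R}^n$, $D\chi(m_1, x) \le K(1 + |x|^2)$.
If $\bar{u} \in \mathcal{U}$ is a solution to (P), then $\bar{u}$ is a solution to $P(D\chi(m^{\bar{u}}))$.
Remark: the associated value function $V(t,x)$ may be seen as a Lagrange multiplier for the Fokker-Planck equation.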
Optimality conditions
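Let $\mathcal{R}$ be the set of reachable probability distributions: $\mathcal{R} = \{ m^u \mid u \in \mathcal{U} \}$.
Lemma. The closure of $\mathcal{R}$ (for the $d_1$-distance), $\mathrm{cl}(\mathcal{R})$, is convex.
Proof of the theorem. Let $\bar{m} = m^{\bar{u}}$. By continuity of $\chi$,
$$\chi(\bar{m}) = \inf_{m \in \mathrm{cl}(\mathcal{R})} \chi(m).$$
By convexity of $\mathrm{cl}(\mathcal{R})$, for all $u \in \mathcal{U}$, for all $\theta \in [0,1]$,
$$0 \le \frac{\chi\big(\theta m^u + (1-\theta)\bar{m}\big) - \chi(\bar{m})}{\theta} \xrightarrow[\theta \to 0]{} \int_{\mathbb{R}^n} D\chi(\bar{m}, x)\,d\big(m^u(x) - \bar{m}(x)\big).$$
Thus:
$$\mathbb{E}\big[D\chi(\bar{m}, X_T^{0,X_0,\bar{u}})\big] \le \mathbb{E}\big[D\chi(\bar{m}, X_T^{0,X_0,u})\big].$$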
Algorithm
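Set $k = 0$, choose $m_0 \in \mathcal{R}$, fix $\delta > 0$. While $\varepsilon(m_k) > \delta$, do:
1 Backward phase (HJB): solve $P(D\chi(m_k))$, with optimal solution $u_k$.
2 Forward phase (FP): compute $m = m^{u_k}$.
3 Solve $\min_{\theta \in [0,1]} \chi(\theta m_k + (1-\theta) m)$, with solution $\theta_k$. Set $m_{k+1} = \theta_k m_k + (1-\theta_k) m$.
4 Set $k = k + 1$.
The criterion $\varepsilon(m)$ is defined by:
$$\varepsilon(m) = -\inf_{m' \in \mathrm{cl}(\mathcal{R})} \int_{\mathbb{R}^n} D\chi(m, x)\,d\big(m'(x) - m(x)\big) \ge 0.$$
Note that $\bar{u}$ satisfies the optimality conditions iff $\varepsilon(m^{\bar{u}}) = 0$.
Remark: the method does not provide an optimal solution in feedback form.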
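The loop above has the structure of a conditional-gradient iteration, which the following minimal sketch reproduces on a toy problem. Several things here are assumptions made for illustration only: distributions are probability vectors on a 5-point grid, the reachable set is replaced by the convex hull of a finite list `vertices`, the backward and forward phases are replaced by the hypothetical helper `linearized_step` (an enumeration over the vertices), the cost is a squared Euclidean distance to a reference vector, and the line search is a crude grid search.

```python
import numpy as np

def linearized_step(g, vertices):
    """Stand-in for the backward (HJB) and forward (FP) phases:
    minimize the linear cost m' -> <g, m'> over the given vertices."""
    scores = vertices @ g
    return vertices[scores.argmin()], scores.min()

def gradient_type_method(chi, D_chi, vertices, m0, delta=1e-2, max_iter=200):
    thetas = np.linspace(0.0, 1.0, 201)  # grid used for the line search
    m, history = m0.copy(), [chi(m0)]
    for _ in range(max_iter):
        g = D_chi(m)
        m_new, best = linearized_step(g, vertices)
        if m @ g - best <= delta:        # stopping criterion eps(m_k) <= delta
            break
        # Step 3: line search on theta in [0, 1].
        values = [chi(t * m + (1.0 - t) * m_new) for t in thetas]
        t = thetas[int(np.argmin(values))]
        m = t * m + (1.0 - t) * m_new
        history.append(chi(m))
    g = D_chi(m)
    eps = m @ g - linearized_step(g, vertices)[1]  # criterion at the final iterate
    return m, eps, history

# Toy data (assumed): Dirac masses on a 5-point grid as vertices.
vertices = np.eye(5)
m_ref = np.array([1/3, 0.0, 1/3, 0.0, 1/3])
chi = lambda m: np.sum((m - m_ref) ** 2)
D_chi = lambda m: 2.0 * (m - m_ref)

m, eps, history = gradient_type_method(chi, D_chi, vertices, vertices[1].copy())
```

Because $\theta = 1$ belongs to the search grid, the recorded cost values in `history` cannot increase from one iteration to the next.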
Algorithm
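Theorem. Assume that: $\exists K > 0$, $\forall m_1, m_2, m_3, m_4 \in \mathrm{cl}(\mathcal{R})$,
$$\int_{\mathbb{R}^n} \big(D\chi(m_2, x) - D\chi(m_1, x)\big)\,d\big(m_2(x) - m_1(x)\big) \le K d_1(m_1, m_2)^2,$$
$$\int_{\mathbb{R}^n} \big(D\chi(m_2, x) - D\chi(m_1, x)\big)\,d\big(m_4(x) - m_3(x)\big) \le K d_1(m_1, m_2).$$
Then, the sequence $(m_k)_{k \in \mathbb{N}}$ generated by the method (without stopping criterion) possesses a limit point $\bar{m}$ such that $\varepsilon(\bar{m}) = 0$. Moreover, $\chi(m_k) \to \chi(\bar{m})$.
Idea of proof. Inspired by gradient descent methods. There exist $A > 0$ and $B > 0$ such that:
$$\chi(m_{k+1}) - \chi(m_k) \le -\min\big\{ A\,\varepsilon(m_k),\; B\,\varepsilon(m_k)^2 \big\}.$$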
Algorithm
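Given $\phi_1, \dots, \phi_N : \mathbb{R}^n \to \mathbb{R}$ and $\Psi : \mathbb{R}^N \to \mathbb{R}$, define:
$$\chi(m^u) = \Psi\Big( \int_{\mathbb{R}^n} \phi_1(x)\,dm^u(x), \dots, \int_{\mathbb{R}^n} \phi_N(x)\,dm^u(x) \Big) = \Psi\Big( \mathbb{E}\big[\phi_1(X_T^{0,X_0,u})\big], \dots, \mathbb{E}\big[\phi_N(X_T^{0,X_0,u})\big] \Big).$$
Lemma. Assume that $\Psi$ is differentiable with a Lipschitz derivative, and assume that for some $p \ge 2$:
$$|\phi_i(x)|\,(1 + |x|)^{-p} \xrightarrow[|x| \to \infty]{} 0, \qquad \mathbb{E}\big[|X_0|^p\big] < +\infty.$$
Then, the assumptions of the previous theorem are satisfied, with:
$$D\chi(m, x) = \sum_{i=1}^{N} \partial_{y_i} \Psi\Big( \int_{\mathbb{R}^n} \phi_1(y)\,dm(y), \dots, \int_{\mathbb{R}^n} \phi_N(y)\,dm(y) \Big)\,\phi_i(x).$$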
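The formula for $D\chi$ can be checked numerically against the definition of the derivative. The sketch below is an illustration under assumptions: it takes $N = 2$, $\phi_1(x) = x$, $\phi_2(x) = x^2$ and $\Psi(y_1, y_2) = y_1 + \varepsilon (y_2 - y_1^2)$ (the variance-penalized cost with an assumed weight `EPS = 0.1`), represents distributions as weight vectors on a small grid, and compares the difference quotient with the integral of $D\chi(m_1, \cdot)$ against $m_2 - m_1$.

```python
import numpy as np

EPS = 0.1  # penalization weight (assumed value)
phis = [lambda x: x, lambda x: x ** 2]

def Psi(y):
    return y[0] + EPS * (y[1] - y[0] ** 2)

def grad_Psi(y):
    return np.array([1.0 - 2.0 * EPS * y[0], EPS])

def moments(m, x):
    """Vector of integrals of phi_i against the discrete distribution m."""
    return np.array([np.sum(m * phi(x)) for phi in phis])

def chi(m, x):
    return Psi(moments(m, x))

def D_chi(m, x):
    """Sum over i of d_i Psi(moments of m) * phi_i(x), evaluated on the grid."""
    g = grad_Psi(moments(m, x))
    return sum(g_i * phi(x) for g_i, phi in zip(g, phis))

# Two illustrative distributions on the grid {-2, -1, 0, 1, 2}.
x = np.linspace(-2.0, 2.0, 5)
m1 = np.full(5, 0.2)
m2 = np.array([0.0, 0.0, 0.0, 0.5, 0.5])

theta = 1e-6
quotient = (chi((1 - theta) * m1 + theta * m2, x) - chi(m1, x)) / theta
derivative = np.sum(D_chi(m1, x) * (m2 - m1))
```

For this choice of $\Psi$ the two quantities differ by a term of order $\theta$, so they agree to several digits for small `theta`.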
Algorithm
Backward phase:
- Discretization of the SDE (semi-Lagrangian scheme) with a controlled Markov chain.
- Resolution of the HJB equation (discrete dynamic programming principle).
Forward phase:
- Resolution of the FP equation (adjoint equation to the Markov chain → Chapman-Kolmogorov equation).
Remarks:
- Curse of dimensionality.
- Computational effort in the backward phase.
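The two phases can be sketched for a one-dimensional problem as follows. This is a minimal illustration under stated assumptions, not the implementation behind the reported results: the dynamics are $dX_s = u_s\,ds + dW_s$, the Markov chain moves from $x$ to $x + uh \pm \sqrt{h}$ with probability 1/2 each, values are linearly interpolated on a uniform grid, and points leaving the grid are clipped to its boundary. All function names and the terminal cost are hypothetical.

```python
import numpy as np

def interp_weights(y, grid):
    """Left node index and weight on the left node for linear interpolation."""
    dx = grid[1] - grid[0]
    y = np.clip(y, grid[0], grid[-1])  # assumed boundary treatment
    idx = np.minimum(((y - grid[0]) / dx).astype(int), len(grid) - 2)
    return idx, 1.0 - (y - grid[idx]) / dx

def backward_hjb(phi, grid, controls, n_steps, T=1.0):
    """Discrete dynamic programming; returns V at time 0 and the control indices."""
    h = T / n_steps
    V = phi(grid)
    policy = np.zeros((n_steps, len(grid)), dtype=int)
    for k in reversed(range(n_steps)):
        Q = np.zeros((len(controls), len(grid)))
        for a, u in enumerate(controls):
            for s in (1.0, -1.0):
                idx, wl = interp_weights(grid + u * h + s * np.sqrt(h), grid)
                Q[a] += 0.5 * (wl * V[idx] + (1.0 - wl) * V[idx + 1])
        policy[k] = Q.argmin(axis=0)
        V = Q.min(axis=0)
    return V, policy

def forward_fp(m0, policy, grid, controls, T=1.0):
    """Chapman-Kolmogorov step: push mass with the adjoint of the backward operator."""
    h = T / policy.shape[0]
    m = m0.copy()
    for k in range(policy.shape[0]):
        u = controls[policy[k]]
        new = np.zeros_like(m)
        for s in (1.0, -1.0):
            idx, wl = interp_weights(grid + u * h + s * np.sqrt(h), grid)
            np.add.at(new, idx, 0.5 * m * wl)
            np.add.at(new, idx + 1, 0.5 * m * (1.0 - wl))
        m = new
    return m

# Illustrative setup (assumed): terminal cost phi(x) = (x - 1)^2, X_0 = 0.
grid = np.linspace(-5.0, 5.0, 201)
controls = np.linspace(-1.0, 1.0, 5)
phi = lambda x: (x - 1.0) ** 2
m0 = np.zeros_like(grid)
m0[100] = 1.0  # Dirac mass at x = 0
V0, policy = backward_hjb(phi, grid, controls, n_steps=20)
mT = forward_fp(m0, policy, grid, controls)
```

Since the forward step is the exact adjoint of the backward step under the stored policy, the pairing of `m0` with `V0` equals the pairing of `mT` with the terminal cost up to rounding, which is a convenient consistency check.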
Results
Example considered:
- SDE: $dX_s = u_s\,ds + dW_s$, $X_0 = 0$, with final time 1.
- Controls: $u_s \in U = [-1, 1]$.
- Cost: $\chi(m) = d_2(m, m_{\mathrm{ref}})$, with $m_{\mathrm{ref}} = \frac{1}{3}(\delta_{-2} + \delta_0 + \delta_2)$.
Discretization: semi-Lagrangian scheme, 100×5000 points in $[0,1] \times [-5,5]$, 20 points for the control.
Convergence:
Iterations   0      10     20     30     40     50
χ(m_k)       0.874  0.551  0.536  0.531  0.528  0.526
ε(m_k)       0.43   0.043  0.030  0.020  0.030  0.025
Results
Figure: Distribution along time (axes: space, time)
Results
Figure: Control (axes: space, time)
Results
Figure: Value function (axes: space, time)
Bibliography
References:
- A. Bensoussan, J. Frehse, and P. Yam. Mean-field games and mean-field type control theory. Springer, 2013.
- L. Pfeiffer. Optimality conditions for mean-field type control problems. Preprint.
- L. Pfeiffer. Numerical methods for mean-field type optimal control problems. Pure and Applied Functional Analysis, 1(4):629-655, 2016.