

SLIDE 1

Numerical strategies for efficient control of large-scale particle systems

Michael Herty, IGPM, RWTH Aachen University. Joint work with

  • G. Albi, L. Pareschi, C. Ringhofer, S. Steffensen, M. Zanella

CIRM, Crowds: models and control, 2019

1 / 30

SLIDE 2

Control of interacting particle systems, i = 1, ..., N:

  dx_i(t) = Σ_j P(x_i, x_j)(x_j − x_i) dt + u_i dt + σ dW_i

◮ Coupled system of ODEs/SDEs
◮ x_i, u_i are the state and control of the i-th particle
◮ P is the interaction kernel, e.g. P(x, y) = χ(|x − y| ≤ ∆)
◮ P = const is used as an opinion formation model

Figure: Bounded confidence model, u_i ≡ 0 (Hegselmann/Krause)

2 / 30
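The particle system above can be simulated directly. The sketch below integrates the bounded-confidence model with an Euler-Maruyama scheme; the 1/N mean-field normalization of the interaction sum, the threshold `delta`, and all parameter values are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def simulate(N=50, T=5.0, dt=0.01, delta=0.5, sigma=0.0, seed=0):
    """Euler-Maruyama for dx_i = (1/N) sum_j P(x_i,x_j)(x_j - x_i) dt + sigma dW_i
    with bounded-confidence kernel P(x, y) = 1_{|x-y| <= delta} and u_i = 0."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1.0, 1.0, N)
    for _ in range(int(T / dt)):
        diff = x[None, :] - x[:, None]              # diff[i, j] = x_j - x_i
        P = (np.abs(diff) <= delta).astype(float)   # bounded-confidence kernel
        drift = (P * diff).sum(axis=1) / N
        x = x + dt * drift + sigma * np.sqrt(dt) * rng.standard_normal(N)
    return x

x = simulate()
```

Since the kernel is symmetric and the interaction antisymmetric, the empirical mean is conserved (for σ = 0) while the opinions contract into clusters of width governed by δ.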

SLIDE 3

Realistic example: Financial market model by Levy-Levy-Solomon

◮ Particles (investors) i hold two portfolios (x_i stocks, y_i bonds). S is the stock price, D > 0 the dividend, and r > 0 the interest rate:

  d/dt x_i = ((S′ + D)/S) x_i + u_i,   d/dt y_i = r y_i − u_i,   S = (1/N) Σ_j x_j

◮ Particles have a choice u_i to invest in stocks
◮ Objective: maximize total profit

3 / 30
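A minimal numerical sketch of these wealth dynamics, assuming explicit Euler stepping and a backward-difference approximation of S′ (the slides do not specify a discretization); the values of D, r, the initial portfolios, and the choice u ≡ 0 are all illustrative.

```python
import numpy as np

def market_step(x, y, u, S_prev, D=0.05, r=0.02, dt=0.01):
    """One explicit Euler step of the Levy-Levy-Solomon-type dynamics
    dx_i/dt = ((S' + D)/S) x_i + u_i,  dy_i/dt = r y_i - u_i,  S = mean(x).
    S' is approximated by a backward difference (a modeling shortcut)."""
    S = x.mean()
    S_dot = (S - S_prev) / dt
    x_new = x + dt * ((S_dot + D) / S * x + u)
    y_new = y + dt * (r * y - u)
    return x_new, y_new, S

N = 100
x = np.full(N, 1.0)        # stock holdings
y = np.full(N, 1.0)        # bond holdings
u = np.zeros(N)            # no investment decision in this sketch
S_prev = x.mean()
for _ in range(100):
    x, y, S_prev = market_step(x, y, u, S_prev)
```

With u ≡ 0 the price feedback is self-reinforcing in this naive discretization (stocks grow much faster than bonds), which is precisely why the investment choice u_i matters in the control problem.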

SLIDE 4

Problem setup: Optimal Control

Consider i = 1, ..., N particles with additive control and quadratic regularization:

  u = argmin_{ũ ∈ R} ∫_0^T (ν/2) ũ(s)² + h(X(s)) ds,   d/dt x_i = f(x_i, X_{−i}) + u

◮ Notation: X = (x_i)_{i=1}^N and X_{−i} = (x_j)_{j=1, j≠i}^N
◮ Simplifying setup: a single control u = u(t) for all particles, but similar results are possible for u = u_i
◮ Interest in open/closed loop control, but no game-theoretic setting (see other talks at this conference)
◮ Interest in the mean-field limit N → ∞

4 / 30

SLIDE 5

Results in the Linear-Quadratic Case

Opinion formation model with constant P and quadratic cost:

  f(x_i, X_{−i}) = Σ_{j=1}^N P (x_j − x_i),   h(X) = (1/2N) Σ_{i=1}^N x_i²

This is a linear-quadratic problem:

  u = argmin_{ũ ∈ R} ∫_0^T (ν/2) ũ(s)² + X(s)^T M X(s) ds,   d/dt X(t) = A X(t) + B u(t)

The solution is given by the Riccati equation with K(t) ∈ R^{N×N}:

  −(d/dt) K = K A + A^T K − (1/ν) K B B^T K + M,   K(T) = 0,   B u(t) = −(1/ν) B B^T K(t) X(t)

5 / 30
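The backward Riccati equation can be integrated numerically; the sketch below (using scipy, with illustrative tolerances) reverses time so that the terminal condition becomes an initial condition. As a sanity check, the scalar case A = 0, B = M = ν = 1 has the closed-form solution K(T − τ) = tanh(τ).

```python
import numpy as np
from scipy.integrate import solve_ivp

def riccati_gain(A, B, M, nu, T):
    """Solve -dK/dt = K A + A^T K - (1/nu) K B B^T K + M, K(T) = 0.
    Substituting tau = T - t turns it into a forward IVP with K(tau=0) = 0."""
    n = A.shape[0]
    def rhs(tau, k_flat):
        K = k_flat.reshape(n, n)
        dK = K @ A + A.T @ K - (1.0 / nu) * K @ B @ B.T @ K + M
        return dK.ravel()
    sol = solve_ivp(rhs, (0.0, T), np.zeros(n * n), rtol=1e-8, atol=1e-10)
    return sol.y[:, -1].reshape(n, n)       # K at tau = T, i.e. K(t = 0)

# scalar sanity check: A = 0, B = 1, M = 1, nu = 1 gives K(tau) = tanh(tau)
K0 = riccati_gain(np.zeros((1, 1)), np.ones((1, 1)), np.ones((1, 1)), nu=1.0, T=10.0)
```

The optimal feedback is then assembled as Bu(t) = −(1/ν) BB^T K(t) X(t).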

SLIDE 6

Structure of K and mean-field limit

The matrices have a symmetric structure:

  A_{i,i} = a_0,   A_{i,j} = a_d,   B_{i,j} = 1,   M_{i,j} = (1/N) δ_{i,j}

The structure extends to the Riccati equation:

  −(d/dt) K = K A + A^T K − (1/ν) K B B^T K + M,   K(T) = 0,   d/dt X(t) = A X(t) − (1/ν) B B^T K(t) X(t)

◮ Lemma. The solution to the matrix Riccati equation K(t) ∈ R^{N×N} fulfills, for a scalar K(t) ∈ R,

  (B B^T K(t))_{i,j} = (1/N) K(t),   −(d/dt) K(t) = 1 − (1/ν) K(t)²,   K(T) = 0

The corresponding mean-field equation for the probability measure μ(t) ∈ P(R) is

  ∂_t μ(t, x) + ∂_x ( ∫ [ P(y − x) − (1/ν) K(t) y ] μ(t, y) dy  μ(t, x) ) = 0

6 / 30

SLIDE 7

Long-term behavior of solutions

  0 = ∂_t μ(t, x) + ∂_x ( ∫ [ P(y − x) − (1/ν) K(t) y ] μ(t, y) dy  μ(t, x) ),
  −(d/dt) K(t) = 1 − (1/ν) K(t)²,   K(T) = 0

◮ The moments m(t) = ∫ x μ(t, x) dx and E(t) = ∫ x² μ(t, x) dx have the following asymptotic behavior: m(t) → 0 at rate 1/√ν, E(t) → 0 at rate 2P
◮ Results generalize to problems with a fixed desired state x_d

7 / 30
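The rate 1/√ν can be checked numerically. The scalar Riccati equation above has the closed-form solution K(t) = √ν tanh((T − t)/√ν), and multiplying the mean-field equation by x and integrating gives m′(t) = −(1/ν) K(t) m(t) (the interaction term cancels by antisymmetry). Parameter values below are illustrative.

```python
import numpy as np

def scalar_riccati(t, T, nu):
    """Closed-form solution of -K'(t) = 1 - K(t)^2/nu, K(T) = 0:
    with tau = T - t one gets K(t) = sqrt(nu) * tanh(tau / sqrt(nu))."""
    return np.sqrt(nu) * np.tanh((T - t) / np.sqrt(nu))

def mean_decay(T=20.0, nu=4.0, m0=1.0, dt=1e-3):
    """Integrate m'(t) = -(1/nu) K(t) m(t), the first moment of the
    controlled mean-field equation in the constant-P case."""
    m = m0
    for t in np.arange(0.0, T, dt):
        m += dt * (-(scalar_riccati(t, T, nu) / nu) * m)
    return m

m_end = mean_decay()
```

Away from the terminal layer, K ≈ √ν, so the mean decays like exp(−t/√ν); in closed form, m(T) = m(0)/cosh(T/√ν).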

SLIDE 8

Nonlinear Case: Model-Predictive Control

  d/ds x_i = f(x_i, X_{−i}) + u,   u = argmin_{ũ} ∫_0^T ( (ν/2) ũ² + h(X) ) ds

8 / 30

SLIDE 9

Evolution of State under Control Action for Time Control Horizon N

9 / 30

SLIDE 10

Model-Predictive Control on Single Time Horizon N = 2

  d/ds x_i = f(x_i, X_{−i}) + u,   u = argmin_{ũ} ∫_t^T ( (ν/2) ũ² + h(X) ) ds

◮ Piecewise constant control u on the time interval (t, t + ∆t)
◮ Discretized dynamics:

  x_i(t + ∆t) = x_i(t) + ∆t (f(x_i, X_{−i}) + u_i),   u = argmin_{ũ} ∆t ( (ν/2) ũ² + h(X(t + ∆t)) )

◮ The optimization problem is solved explicitly:

  u = −(∆t/ν) ∂_{x_i} h(X(t + ∆t)) ≈ −(∆t/ν) ∂_{x_i} h(X(t)) + O(∆t²)

◮ Scaling ν as ν ∆t yields the closed-loop system

  d/ds x_i = f(x_i, X_{−i}) − (1/ν) ∂_{x_i} h(X)

10 / 30
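The resulting closed loop is straightforward to simulate. The sketch below uses an all-to-all interaction (P ≡ 1) and h(X) = ½ Σ_i (x_i − x_d)², both illustrative choices not fixed by the slide; the control enters exactly as the derived feedback −(1/ν) ∂_{x_i} h.

```python
import numpy as np

def mpc_closed_loop(x0, T=5.0, dt=0.01, nu=1.0, x_d=0.5):
    """Closed loop obtained from single-horizon MPC after the scaling nu -> nu*dt:
    dx_i/dt = f(x_i, X_{-i}) - (1/nu) d_{x_i} h(X).
    Here f is all-to-all attraction (P = 1) and h(X) = (1/2) sum_i (x_i - x_d)^2
    (the 1/N normalization of h is dropped for this illustration)."""
    x = x0.copy()
    for _ in range(int(T / dt)):
        f = x.mean() - x                       # P = const = 1 interaction
        u = -(1.0 / nu) * (x - x_d)            # explicit MPC feedback
        x = x + dt * (f + u)
    return x

x = mpc_closed_loop(np.linspace(-1.0, 1.0, 20))
```

The interaction alone would only produce consensus at the initial mean; the instantaneous MPC feedback additionally steers that consensus toward x_d.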

SLIDE 11

Mean-field limit of the closed-loop system

  d/ds x_i = f(x_i, X_{−i}) − (1/ν) ∂_{x_i} h(X),   0 = ∂_t μ + ∂_x ( [ f(x, μ) − (1/ν) ∂_x h(μ) ] μ )

Comparison of the uncontrolled and controlled model with h(X) = (1/2N) Σ_i x_i²

11 / 30

SLIDE 12

Efficient computation of controlled particle systems

  d/ds x_i = f(x_i, X_{−i}) + u,   u = argmin_{ũ} ∫_t^T ( (ν/2) ũ² + h(X) ) ds

◮ The MPC approach at time t with horizon ∆t yields

  u = −(1/ν) ∂_{x_i} h(X)

◮ Binary discretized interaction model with f in N = 2:

  x_i^{n+1} = x_i^n + ∆t f^{bin}(x_j^n, x_i^n) − (∆t/ν) ∂_{x_i} h^{bin}(x_i^n, x_j^n),
  x_j^{n+1} = x_j^n + ∆t f^{bin}(x_i^n, x_j^n) − (∆t/ν) ∂_{x_j} h^{bin}(x_i^n, x_j^n)

◮ The binary interaction model has the same mean-field limit

12 / 30
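The binary scheme lends itself to a Nanbu-type Monte Carlo algorithm: at each step the agents are paired at random and only interact within their pair. The concrete f^bin and h^bin below (constant kernel, quadratic tracking cost toward x_d) are illustrative assumptions.

```python
import numpy as np

def binary_mc(N=200, steps=400, dt=0.05, nu=1.0, x_d=0.0, seed=1):
    """Nanbu-type Monte Carlo for the controlled binary interaction
    x_i <- x_i + dt*f_bin(x_j, x_i) - (dt/nu)*d_{x_i} h_bin(x_i, x_j),
    with f_bin(y, x) = y - x and h_bin(x, y) = (1/2)(x - x_d)^2
    (illustrative choices)."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1.0, 1.0, N)
    for _ in range(steps):
        perm = rng.permutation(N)             # random disjoint pairing
        i, j = perm[: N // 2], perm[N // 2:]
        fi, fj = x[j] - x[i], x[i] - x[j]     # f_bin for each partner
        gi, gj = x[i] - x_d, x[j] - x_d       # gradients of h_bin
        xi_new = x[i] + dt * fi - (dt / nu) * gi
        xj_new = x[j] + dt * fj - (dt / nu) * gj
        x[i], x[j] = xi_new, xj_new
    return x

x = binary_mc()
```

Each step costs O(N) instead of the O(N²) of the full interaction sum, which is the point of the binary reformulation.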

SLIDE 13

Sznajd's model: P = (1 − x_i²) and h(X) = (1/2)(x_i − w_d)²

Figure: Solution profiles at time T = 1 (first row) and T = 2 (second row) for the uncontrolled, mildly controlled, and strongly controlled cases (legend: exact, k = ∞, k = 1, k = 0.5). Left: desired state w_d = 0; right: w_d = 0.5 for the strongly controlled case and w_d = −0.25 for the mildly controlled case.

13 / 30

SLIDE 14

Model predictive control vs. Riccati control results

Figure: Evolution of the mean ∫ x f(t, x) dx in the Riccati control case (left) and the MPC case (right). Plots are in log scale for different penalizations ν of the control. The left plot scales down to 10^{−8}, the right to 10^{−0.55}.

14 / 30

SLIDE 15

Performance result for MPC measured by the value function

  V*(τ, Y) = min_u ∫_τ^T h(X) + (ν/2) u² ds,   x_i′(t) = f(x_i(t), X_{−i}(t)) + u

◮ V* is the value function for the optimal control u and initial data X(τ) = Y
◮ MPC controlled dynamics (x_i^{MPC})′(t) = f(X^{MPC}(t)) + u^{MPC} and the corresponding value function

  V^{MPC}(τ, y) = ∫_τ^T h(X^{MPC}) + (ν/2) (u^{MPC})² ds

◮ Grüne [2009]: there exists 0 < α < 1 such that V^{MPC}(τ, y) ≤ (1/α) V*(τ, y)
◮ α depends in particular on the MPC horizon M and on growth conditions

15 / 30

SLIDE 16

Performance result independent of the number of particles

◮ The result extends to the mean-field setting under the same assumptions as in finite dimensions
◮ The growth conditions are fulfilled, for example, for the opinion model
◮ α is independent of the number of agents

Quality of the estimate V^{MPC}(τ, y) ≤ (1/α) V*(τ, y).

16 / 30

SLIDE 17

Computation of mean-field optimality conditions

MPC with a horizon larger than one requires solving

  d/ds x_i = f(x_i, X_{−i}) + u,   u = argmin_{ũ} ∫_t^{t+M∆t} ( (ν/2) ũ² + h(x_i, X_{−i}) ) ds

or, on the mean-field level,

  u = argmin_{ũ} ∫_t^{t+M∆t} ∫ h([μ]) μ dx + (ν/2) ũ² dt,   0 = ∂_t μ + ∂_x ( (f(x, [μ]) + u) μ )

◮ Derivation of consistent optimality systems on the particle and mean-field level
◮ Leading to suitable numerical discretizations of both systems

Particle system → Pontryagin's maximum principle → Mean-field → Decomposition with conditional probabilities → Numerical scheme

17 / 30

SLIDE 18

Particle system → Pontryagin's maximum principle

Link the optimality systems for the N-particle system with pairwise interactions (see E. Caines), with x_i ∈ R^K, u ∈ R^M:

  d/dt x_i = (1/N) Σ_j p(x_i, x_j, u),   u = argmin_{ũ} ∫_0^T (1/N) Σ_j φ(x_j, ũ) dt

Pontryagin's maximum principle with adjoint variables ν_i ∈ R^K and zero terminal conditions:

  −(d/dt) ν_i = (1/N) Σ_j [ ∂_1 p(x_i, x_j, u)^T ν_i + ∂_2 p(x_i, x_j, u)^T ν_j ] + ∇_x φ(x_i, u),

  (1/N²) Σ_{i,j} ν_i^T ∂_u p(x_i, x_j, u) + (1/N) Σ_i ∇_u φ(x_i, u) = 0

Under suitable assumptions on ∇²_{uu} φ and ∇²_{uu} p the control can be expressed explicitly in terms of (x, ν).

18 / 30

SLIDE 19

Pontryagin's maximum principle → Mean-field

Under an i.i.d. assumption we obtain, via the BBGKY hierarchy, the mean-field limit of the PMP system for the kinetic density g = g(t, x, z):

  ∂_t g(t, x, z) + div_x ( g(t, x, z) ∫ g(t, y, w) p(x, y, u) dy dw )
  − div_z ( g(t, x, z) ∫ g(t, y, w) ∂_1 p(x, y, u)^T w dy dw + g(t, x, z) ∫ g(t, y, w) ∂_2 p(y, x, u)^T w dy dw )
  − div_z ( g(t, x, z) ∇_x φ(x, u) ) = 0

◮ The kinetic density g depends on the variable z corresponding to the Lagrange multiplier
◮ The goal is to derive the optimality system of the mean-field control problem
◮ The multiplier depends on the state, hence decompose g(t, x, z) := μ(t, x) g_c(t, x, z)

19 / 30

SLIDE 20

Mean-field Equation and Conditional Probability

  ∂_t g(t, x, z) + div_x ( g(t, x, z) ∫ g(t, y, w) p(x, y, u) dy dw )
  − div_z ( g(t, x, z) ∫ g(t, y, w) ∂_1 p(x, y, u)^T w dy dw + g(t, x, z) ∫ g(t, y, w) ∂_2 p(y, x, u)^T w dy dw )
  − div_z ( g(t, x, z) ∇_x φ(x, u) ) = 0

◮ g(t, x, z) := μ(t, x) g_c(t, x, z) with the conditional probability g_c satisfying ∫ g_c(t, x, z) dz = 1 for all x
◮ Proposition. The probability density μ = μ(t, x) and the conditional expectation

  μ_c(t, x) = ∫ z g_c(t, x, z) dz

fulfill

  0 = ∂_t μ + div_x (μ G_μ),
  0 = ∂_t (μ_c μ) + div_x ( μ G_μ μ_c^T ) + μ ∂_x G_μ^T μ_c + μ b_μ + μ ∇_x φ

with

  G_μ(x, t) = ∫ μ(y, t) p(x, y, u) dy ∈ R^K,   b_μ(x, t) = ∫ μ(y, t) ∂_2 p(y, x, u)^T μ_c(y, t) dy ∈ R^K

◮ The equation for μ is the mean-field limit of the discrete dynamics

20 / 30

SLIDE 21

Comparison of Decomposition and Control Problem

  0 = ∂_t μ + div_x (μ G_μ),
  0 = ∂_t (μ_c μ) + div_x ( μ G_μ μ_c^T ) + μ ∂_x G_μ^T μ_c + μ b_μ + μ ∇_x φ

Control problem in mean-field formulation:

  u = argmin_{ũ} ∫_0^T ∫ φ(x, u) μ(x, t) dx dt   s.t.   0 = ∂_t μ + div_x (μ G_μ)

◮ The multiplier λ(t, x) of the PDE constraint for μ is scalar and the cost is linear in φ (as an L²-derivative)
◮ Link:

  ∇_x λ(t, x) = μ_c(t, x)

◮ Rigorous approach: differentiability of the mean-field formulation in measure space is possible by variations as push-forwards along maps ψ
◮ The differential is then the gradient of the vector field, ∇_x ψ

21 / 30

SLIDE 22

Summary of Mean-field Control Relations

Particle and mean-field control problems:

  (P)  u = argmin_{ũ} ∫_0^T (1/N) Σ_j φ(x_j, u) dt   s.t.   d/dt x_i = (1/N) Σ_j p(x_i, x_j, u)

  (MF)  u = argmin_{ũ} ∫_0^T ∫ φ(x, u) μ(x, t) dx dt   s.t.   0 = ∂_t μ + div_x (μ G_μ)

The mean-field g of the PMP system of (P), its conditional probability g_c, and the L²-optimal control result are equivalent in the following sense. Let

  g(t, x, z) = μ(t, x) g_c(t, x, z),   μ_c(t, x) = ∫ z g_c(t, x, z) dz.

Then λ with ∇_x λ(t, x) = μ_c(t, x) fulfills the L²-optimality conditions of (MF):

  0 = ∂_t μ + div_x (μ G_μ),
  0 = −∂_t λ − ∇_x λ^T G_μ − ∫ μ(y, t) ∇_x λ(y, t)^T p(y, x, u) dy + φ

The expected conditional probability μ_c of the mean-field density of the PMP system is the gradient of the L² Lagrange multiplier λ.

22 / 30

SLIDE 23

Numerical scheme

The mean-field formulation allows a consistent discretization of

  ∂_t g(t, x, z) + div_x ( g(t, x, z) ∫ g(t, y, w) p(x, y, u) dy dw )
  − div_z ( g(t, x, z) ∫ g(t, y, w) ∂_1 p(x, y, u)^T w dy dw + g(t, x, z) ∫ g(t, y, w) ∂_2 p(y, x, u)^T w dy dw )
  − div_z ( g(t, x, z) ∇_x φ(x, u) ) = 0

through weighted particles:

  g(t, x, z) = (1/N) Σ_i δ(x − x_i) g_i(z, t),   ρ_i(t) = ∫ g_i(z, t) dz,   ν_i(t) = ∫ z g_i(z, t) dz

The weighted particles then fulfill

  d/dt x_i(t) = (1/N) Σ_j p(x_i, x_j, u) ρ_j(t),   d/dt ρ_j(t) = 0,
  −(d/dt) ν_i(t) = (1/N) Σ_j [ ∂_1 p(x_i, x_j, u)^T ν_i + ∂_2 p(x_i, x_j, u)^T ν_j ] + ∇_x φ(x_i, u)

23 / 30
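The coupled forward-backward structure of this system is typically solved by sweeping: integrate the states forward, the adjoints backward, then update the control from the stationarity condition, and iterate. A sketch for the toy choices p(x, y, u) = (y − x) + u and φ(x, u) = x²/2 + κ u²/2 (both assumptions, with weights ρ_i ≡ 1 as in the recovery of the PMP):

```python
import numpy as np

def fb_sweep(x0, T=0.5, dt=0.01, kappa=1.0, iters=20):
    """Forward-backward sweep for the particle optimality system with
    p(x, y, u) = (y - x) + u and phi(x, u) = x^2/2 + kappa*u^2/2 (toy choices,
    rho_i = 1).  Stationarity gives u(t) = -mean_i nu_i(t) / kappa."""
    N, S = len(x0), int(round(T / dt))
    u = np.zeros(S)
    X = np.empty((S + 1, N))
    for _ in range(iters):
        # forward sweep: dx_i/dt = (1/N) sum_j p(x_i, x_j, u) = mean(x) - x_i + u
        X[0] = x0
        for n in range(S):
            X[n + 1] = X[n] + dt * ((X[n].mean() - X[n]) + u[n])
        # backward sweep: -dnu_i/dt = mean(nu) - nu_i + x_i,  nu_i(T) = 0
        V = np.zeros((S + 1, N))
        for n in range(S, 0, -1):
            V[n - 1] = V[n] + dt * ((V[n].mean() - V[n]) + X[n])
        u = -V[:S].mean(axis=1) / kappa       # control update (stationarity)
    return X, u

X, u = fb_sweep(np.array([1.0, 2.0, 0.5, 1.5]))
```

For short horizons the sweep is a contraction and converges without damping; longer horizons usually require a relaxed control update.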

SLIDE 24

Properties of the scheme

◮ The decomposition g(t, x, z) = μ(t, x) g_c(t, z, x) is a closure relation
◮ In the derivation we required ∫ z div_z B dz = −∫ B dz for some function B (integration by parts). If this identity is exact for the discretized system, the scheme is a consistent discretization of the closed mean-field system
◮ For ∫ g_i(z, 0) dz = 1 and ∫ z g_i(z, T) dz = 0 we recover the PMP by the discretization
◮ Arbitrary discretization schemes of the mean-field equation only differ by the discretization of the initial and terminal conditions

24 / 30

SLIDE 25

Computational results for the Opinion Formation Model

◮ Pairwise interaction at rate α with other agents, interaction rate β for the control
◮ Post-interaction states x_i* with the relation

  x_i* = β ( α x_i + (1 − α) (1/N) Σ_j x_j ) + (1 − β) u

◮ Time-continuous ODE system with interaction probability ω ∆t:

  p(x_i, x_j, u) = ω(αβ − 1) x_i + ω(1 − α)β x_j + ω(1 − β) u,   d/dt x_i = (1/N) Σ_j p(x_i, x_j, u)

◮ Objective functional:

  Φ(X) = (1/N) Σ_j ∫_0^T (1/2)(x_j(t) − c)² dt

25 / 30
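For a constant control the model can be simulated directly. At consensus the stationarity condition ω(αβ − 1)x + ω(1 − α)βx + ω(1 − β)u = 0 reduces to x = u, so the dynamics settle at the desired opinion when u = c. The parameter values below are illustrative.

```python
import numpy as np

def opinion_dynamics(N=50, c=0.2, alpha=0.7, beta=0.8, omega=1.0,
                     T=30.0, dt=0.01, seed=2):
    """Controlled opinion model with
    p(x_i, x_j, u) = omega*(alpha*beta - 1)*x_i + omega*(1-alpha)*beta*x_j
                     + omega*(1-beta)*u
    and dx_i/dt = (1/N) sum_j p(x_i, x_j, u), run with the constant control
    u = c to illustrate convergence toward the desired opinion."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1.0, 1.0, N)
    for _ in range(int(T / dt)):
        xbar = x.mean()
        dx = omega * ((alpha * beta - 1.0) * x
                      + (1.0 - alpha) * beta * xbar
                      + (1.0 - beta) * c)
        x = x + dt * dx
    return x

x = opinion_dynamics()
```

The mean obeys x̄′ = ω(β − 1)(x̄ − c), so it relaxes to c at rate ω(1 − β) while deviations from the mean decay at the faster rate ω(1 − αβ).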

SLIDE 26

Trajectories for Desired Opinion c = 0.2

Figure: Optimal trajectories for 100 particles. The control converges to a fixed control u*.

26 / 30

SLIDE 27

Possible extensions and further discussion

◮ MPC for mean-field games is also possible: for i = 1, ..., N,

  d/dt x_i = f(x_i, X_{−i}) + u_i,   u_i = argmin_{ũ} ∫_t^T (ν/2) ũ(s)² + h(X(s)) ds

leads to the control action u_i = −(1/ν) ∂_{x_i} h(X)
◮ MPC with a single time horizon is able to stabilize stochastically perturbed dynamics

27 / 30

SLIDE 28

Summary of relations in game and control setting

28 / 30

SLIDE 29

Thank you for your attention. Contact details. Michael Herty, herty@igpm.rwth-aachen.de

29 / 30

SLIDE 30

Some references

◮ Modeling particle dynamics: Active Particles. Vol. 1. Advances in Theory, Models, and Applications, Model. Simul. Sci. Eng. Technol., Birkhäuser/Springer, Cham, 2017.
◮ Degond; Liu; Ringhofer: Evolution of wealth in a non-conservative economy driven by local Nash equilibria. Philos. Trans. R. Soc. Lond. Ser. A Math. Phys. Eng. Sci. 372 (2014).
◮ Performance bounds for MPC: Herty; Zanella: Performance bounds for the mean-field limit of constrained dynamics. Discrete Contin. Dyn. Syst. 37 (2017), no. 4.
◮ Grüne: Analysis and design of unconstrained nonlinear MPC schemes for finite and infinite dimensional systems. SIAM Journal on Control and Optimization 48 (2009).
◮ MPC for kinetic equations: Albi; Herty; Pareschi: Kinetic description of optimal control problems and applications to opinion consensus. Commun. Math. Sci. 13 (2015).
◮ Sparse control: Kalise; Kunisch; Rao: Infinite horizon sparse optimal control. J. Optim. Theory Appl. 172 (2017), no. 2.

30 / 30