Turnpike property in finite-dimensional nonlinear optimal control

SLIDE 1

Turnpike property in finite-dimensional nonlinear optimal control

Emmanuel Trélat (1)

(1) Univ. Paris 6 (Labo. J.-L. Lions) and Institut Universitaire de France

IHP, Nov. 2014

E. Trélat, Turnpike in optimal control

SLIDE 3

Turnpike property

The solution of an optimal control problem in large time should spend most of its time near a steady state. In infinite horizon the solution should converge to that steady state.

Historically: discovered in econometrics (Von Neumann points). The first turnpike result was discovered in 1958 by Dorfman, Samuelson and Solow, in view of deriving efficient programs of capital accumulation, in the context of a Von Neumann model in which labor is treated as an intermediate product.

Paul Samuelson (1915–2009), Nobel Prize in Economic Science, 1970

SLIDE 4

Turnpike property

The solution of an optimal control problem in large time should spend most of its time near a steady state. In infinite horizon the solution should converge to that steady state.

Dorfman, Samuelson, Solow (1958):

"Thus in this unexpected way, we have found a real normative significance for steady growth – not steady growth in general, but maximal von Neumann growth. It is, in a sense, the single most effective way for the system to grow, so that if we are planning long-run growth, no matter where we start, and where we desire to end up, it will pay in the intermediate stages to get into a growth phase of this kind. It is exactly like a turnpike paralleled by a network of minor roads. There is a fastest route between any two points; and if the origin and destination are close together and far from the turnpike, the best route may not touch the turnpike. But if origin and destination are far enough apart, it will always pay to get on to the turnpike and cover distance at the best rate of travel, even if this means adding a little mileage at either end. The best intermediate capital configuration is one which will grow most rapidly, even if it is not the desired one, it is temporarily optimal."

SLIDE 5

Turnpike property

The solution of an optimal control problem in large time should spend most of its time near a steady state. In infinite horizon the solution should converge to that steady state.

Turnpike theorems were derived in the 1960s for discrete-time optimal control problems arising in econometrics (McKenzie, 1963).

Continuous versions by Haurie for particular dynamics (economic growth models). See also Carlson, Haurie, Leizarowitz 1991; Zaslavski 2000.

More recently, in biology: Rapaport 2005; Coron, Gabriel, Shang 2014. Human locomotion: Chitour, Jean, Mason 2012.

Linear heat and wave equations: Porretta, Zuazua 2013.

Rockafellar 1973, Samuelson 1972: saddle point feature of the extremal equations of optimal control.

Different point of view by Anderson, Kokotovic (1987) and Wilde, Kokotovic (1972): exponential dichotomy property → hyperbolicity phenomenon.

SLIDE 6

General nonlinear optimal control problem

$f : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}^n$ (dynamics)

$R : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^k$, $R = (R_1, \dots, R_k)$ (terminal conditions)

$f^0 : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}$ (instantaneous cost)

all of class $C^2$.

Optimal control problem $(OCP)_T$: for $T > 0$ fixed, find $u_T(\cdot) \in L^\infty(0, T; \mathbb{R}^m)$ such that

\[
\dot x(t) = f(x(t), u(t)), \qquad R(x(0), x(T)) = 0, \qquad \min \int_0^T f^0(x(t), u(t))\, dt
\]

Examples of terminal conditions $R$: point-to-point, point-to-free, periodic, ...

Optimal (assumed) solution: $(x_T(\cdot), u_T(\cdot))$.

SLIDE 7

General nonlinear optimal control problem

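Pontryagin maximum principle $\Rightarrow \exists\, (\lambda_T(\cdot), \lambda^0_T) \neq (0, 0)$ such that

\[
\begin{aligned}
\dot x_T(t) &= \frac{\partial H}{\partial \lambda}(x_T(t), \lambda_T(t), \lambda^0_T, u_T(t)) \\
\dot \lambda_T(t) &= -\frac{\partial H}{\partial x}(x_T(t), \lambda_T(t), \lambda^0_T, u_T(t)) \\
\frac{\partial H}{\partial u}(x_T(t), \lambda_T(t), \lambda^0_T, u_T(t)) &= 0
\end{aligned}
\]

where $H(x, \lambda, \lambda^0, u) = \langle \lambda, f(x, u) \rangle + \lambda^0 f^0(x, u)$.

Moreover we have transversality conditions

\[
\begin{pmatrix} -\lambda_T(0) \\ \lambda_T(T) \end{pmatrix} = \sum_{i=1}^{k} \gamma_i \nabla R_i(x_T(0), x_T(T))
\]

(generic...) assumption made throughout: no abnormal extremal $\Rightarrow \lambda^0_T = -1$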

SLIDE 8

Static optimal control problem

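Static optimal control problem:

\[
\min_{\substack{(x,u) \in \mathbb{R}^n \times \mathbb{R}^m \\ f(x,u) = 0}} f^0(x, u)
\]

Optimal (assumed) solution: $(\bar x, \bar u)$.

Lagrange multipliers $\Rightarrow \exists\, (\bar\lambda, \bar\lambda^0) \neq (0, 0)$ such that

\[
\begin{aligned}
f(\bar x, \bar u) &= 0 \\
\bar\lambda^0 \frac{\partial f^0}{\partial x}(\bar x, \bar u) + \Big\langle \bar\lambda, \frac{\partial f}{\partial x}(\bar x, \bar u) \Big\rangle &= 0 \\
\bar\lambda^0 \frac{\partial f^0}{\partial u}(\bar x, \bar u) + \Big\langle \bar\lambda, \frac{\partial f}{\partial u}(\bar x, \bar u) \Big\rangle &= 0
\end{aligned}
\]

i.e.

\[
\frac{\partial H}{\partial \lambda}(\bar x, \bar\lambda, \bar\lambda^0, \bar u) = 0, \qquad
-\frac{\partial H}{\partial x}(\bar x, \bar\lambda, \bar\lambda^0, \bar u) = 0, \qquad
\frac{\partial H}{\partial u}(\bar x, \bar\lambda, \bar\lambda^0, \bar u) = 0
\]

with $H(x, \lambda, \lambda^0, u) = \langle \lambda, f(x, u) \rangle + \lambda^0 f^0(x, u)$.

(generic...) assumption made throughout: no abnormal multiplier $\Rightarrow \bar\lambda^0 = -1$ (Mangasarian-Fromowitz)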

SLIDE 9

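$(OCP)_T$:

\[
\dot x(t) = f(x(t), u(t)), \qquad R(x(0), x(T)) = 0, \qquad \min \int_0^T f^0(x(t), u(t))\, dt
\]

Extremal equations of $(OCP)_T$:

\[
\begin{aligned}
\dot x_T(t) &= \frac{\partial H}{\partial \lambda}(x_T(t), \lambda_T(t), -1, u_T(t)) \\
\dot \lambda_T(t) &= -\frac{\partial H}{\partial x}(x_T(t), \lambda_T(t), -1, u_T(t)) \\
\frac{\partial H}{\partial u}(x_T(t), \lambda_T(t), -1, u_T(t)) &= 0
\end{aligned}
\]

Static optimal control problem:

\[
\min_{\substack{(x,u) \in \mathbb{R}^n \times \mathbb{R}^m \\ f(x,u) = 0}} f^0(x, u)
\]

Optimality system of the static problem:

\[
\frac{\partial H}{\partial \lambda}(\bar x, \bar\lambda, -1, \bar u) = 0, \qquad
-\frac{\partial H}{\partial x}(\bar x, \bar\lambda, -1, \bar u) = 0, \qquad
\frac{\partial H}{\partial u}(\bar x, \bar\lambda, -1, \bar u) = 0
\]

with $H(x, \lambda, \lambda^0, u) = \langle \lambda, f(x, u) \rangle + \lambda^0 f^0(x, u)$.

$(\bar x, \bar\lambda, \bar u)$: equilibrium point of the extremal equations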

SLIDE 10

It is expected that, for large time $T$, the optimal extremal solution $(x_T(\cdot), \lambda_T(\cdot), u_T(\cdot))$ of $(OCP)_T$ approximately consists of 3 pieces:

1. short-time: $(x_T(0), \lambda_T(0), u_T(0)) \to (\bar x, \bar\lambda, \bar u)$ (transient arc)
2. long-time, stationary: $(\bar x, \bar\lambda, \bar u)$
3. short-time: $(\bar x, \bar\lambda, \bar u) \to (x_T(T), \lambda_T(T), u_T(T))$ (transient arc)

SLIDE 11

The main result

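Notation:

\[
H_{*\#} = \frac{\partial^2 H}{\partial * \, \partial \#}(\bar x, \bar\lambda, -1, \bar u)
\]

\[
A = H_{x\lambda} - H_{u\lambda} H_{uu}^{-1} H_{xu}, \qquad B = H_{u\lambda}, \qquad W = -H_{xx} + H_{ux} H_{uu}^{-1} H_{xu}.
\]

Theorem (Trélat, Zuazua, JDE 2014). Assume that:

  • $H_{uu} < 0$, $W > 0$
  • $\mathrm{rank}(B, AB, \dots, A^{n-1}B) = n$ (Kalman condition)
  • $(\bar x, \bar\lambda)$ "almost satisfies" the terminal + transversality conditions

Then for $T > 0$ large enough:

\[
\|x_T(t) - \bar x\| + \|\lambda_T(t) - \bar\lambda\| + \|u_T(t) - \bar u\| \le C_1 \big( e^{-C_2 t} + e^{-C_2 (T - t)} \big) \qquad \forall t \in [0, T]
\]

Moreover, let $E_-$ be the minimal solution and $E_+$ the maximal solution of the Riccati equation

\[
E_\pm A + A^* E_\pm - E_\pm B H_{uu}^{-1} B^* E_\pm - W = 0.
\]

Then

\[
C_2 = -\max \{ \mathrm{Re}(\mu) \mid \mu \in \mathrm{Spec}(A - B H_{uu}^{-1} B^* E_-) \} > 0.
\]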

SLIDE 13

Particular case: linear quadratic

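$(OCP)_T$:

\[
\dot x(t) = A x(t) + B u(t), \qquad x(0) = x_0, \quad x(T) = x_1
\]

\[
\min \frac{1}{2} \int_0^T \Big( (x(t) - x_d)^* Q (x(t) - x_d) + (u(t) - u_d)^* U (u(t) - u_d) \Big)\, dt
\]

Static optimal control problem:

\[
\min_{\substack{(x,u) \in \mathbb{R}^n \times \mathbb{R}^m \\ Ax + Bu = 0}} \frac{1}{2} \Big( (x - x_d)^* Q (x - x_d) + (u - u_d)^* U (u - u_d) \Big)
\]

Extremal equations and their equilibrium:

\[
\begin{aligned}
\dot x_T(t) &= A x_T(t) + B U^{-1} B^* \lambda_T(t) + B u_d, & A \bar x + B U^{-1} B^* \bar\lambda + B u_d &= 0 \\
\dot \lambda_T(t) &= Q x_T(t) - A^* \lambda_T(t) - Q x_d, & Q \bar x - A^* \bar\lambda - Q x_d &= 0
\end{aligned}
\]

Theorem. If $U > 0$, $Q > 0$ and $\mathrm{rank}(B, AB, \dots, A^{n-1}B) = n$ (Kalman condition), then

\[
\|x_T(t) - \bar x\| + \|\lambda_T(t) - \bar\lambda\| + \|u_T(t) - \bar u\| \le C_1 \big( e^{-C_2 t} + e^{-C_2 (T - t)} \big) \qquad \forall t \in [0, T]
\]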
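The estimate can be checked by hand on a toy problem that is not taken from the slides: the scalar case $A = 0$, $B = Q = U = 1$, $u_d = 0$, i.e. $\dot x = u$ with cost $\frac12 \int_0^T ((x - x_d)^2 + u^2)\,dt$ and fixed endpoints. Here $\bar x = x_d$, $\bar\lambda = \bar u = 0$, and the extremal system $\delta\dot x = \delta\lambda$, $\delta\dot\lambda = \delta x$ has a closed-form solution. The sketch below evaluates it and compares it with $C_1(e^{-C_2 t} + e^{-C_2(T-t)})$, where $C_2 = 1$ and the constant $C_1$ is an assumption derived for this toy case only (valid for $T \ge 1$). The function names are illustrative.

```python
import math

def deviation(t, T, a, b):
    """|x - x_bar| + |lam - lam_bar| + |u - u_bar| for the scalar toy problem,
    with boundary deviations x(0) - x_d = a and x(T) - x_d = b."""
    dx = (a * math.sinh(T - t) + b * math.sinh(t)) / math.sinh(T)
    dlam = (-a * math.cosh(T - t) + b * math.cosh(t)) / math.sinh(T)
    du = dlam  # u - u_bar = U^{-1} B^* (lam - lam_bar) with U = B = 1
    return abs(dx) + abs(dlam) + abs(du)

def bound(t, T, a, b):
    """C1 * (exp(-C2 t) + exp(-C2 (T - t))) with C2 = 1.
    C1 is a hand-derived constant for this toy case, assuming T >= 1."""
    c1 = 6.0 * max(abs(a), abs(b)) / (1.0 - math.exp(-2.0))
    c2 = 1.0
    return c1 * (math.exp(-c2 * t) + math.exp(-c2 * (T - t)))

T, a, b = 30.0, 1.0, -2.0
grid = [T * i / 300 for i in range(301)]
estimate_holds = all(deviation(t, T, a, b) <= bound(t, T, a, b) for t in grid)
mid_deviation = deviation(T / 2, T, a, b)  # distance to the turnpike at mid-horizon
```

In this toy case the deviation at $t = T/2$ is of order $e^{-T/2}$, which is the turnpike behaviour the theorem describes: the trajectory is exponentially close to the steady state away from the two endpoints.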

SLIDE 14

Example in LQ case

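Example:

\[
\begin{aligned}
\dot x_1(t) &= x_2(t), & x_1(0) &= 0 \\
\dot x_2(t) &= -x_1(t) + u(t), & x_2(0) &= 0
\end{aligned}
\qquad (x(T) \text{ free} \Rightarrow \lambda(T) = 0)
\]

\[
\min \frac{1}{2} \int_0^T \Big( (x_1(t) - 2)^2 + (x_2(t) - 7)^2 + u(t)^2 \Big)\, dt
\]

Optimal solution of the static problem: $\bar x_2 = 0$, $\bar x_1 = \bar u$,

\[
\min_{\substack{x_2 = 0 \\ x_1 = u}} \Big( (x_1 - 2)^2 + (x_2 - 7)^2 + u^2 \Big)
\]

whence $\bar x = (1, 0)$, $\bar u = 1$, $\bar\lambda = (-7, 1)$.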
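The stated steady state can be cross-checked by solving the linear equilibrium system $A \bar x + B U^{-1} B^* \bar\lambda + B u_d = 0$, $Q \bar x - A^* \bar\lambda - Q x_d = 0$ of the linear-quadratic case. The sketch below assumes the data read off from the example ($A = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$, $B = (0, 1)^*$, $Q = I$, $U = 1$, $x_d = (2, 7)$, $u_d = 0$); these matrices are inferred, not written on the slide, and `solve_linear` is an illustrative helper.

```python
from fractions import Fraction

def solve_linear(M, rhs):
    """Solve M z = rhs by Gauss-Jordan elimination in exact rational arithmetic."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(b)] for row, b in zip(M, rhs)]
    for col in range(n):
        # Pick any row at or below the diagonal with a nonzero entry in this column.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        piv = aug[col][col]
        aug[col] = [v / piv for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

# Block matrix [[A, B U^{-1} B^*], [Q, -A^*]] for the assumed example data.
M = [[0, 1, 0, 0],
     [-1, 0, 0, 1],
     [1, 0, 0, 1],
     [0, 1, -1, 0]]
rhs = [0, 0, 2, 7]  # (-B u_d, Q x_d) with u_d = 0, x_d = (2, 7)

x1_bar, x2_bar, lam1_bar, lam2_bar = solve_linear(M, rhs)
u_bar = lam2_bar  # u_bar = u_d + U^{-1} B^* lam_bar
```

Exact rational arithmetic is used only so that the result can be compared with the integers on the slide without rounding concerns.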

SLIDE 15

Example in LQ case

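Example:

\[
\begin{aligned}
\dot x_1(t) &= x_2(t), & x_1(0) &= 0 \\
\dot x_2(t) &= -x_1(t) + u(t), & x_2(0) &= 0
\end{aligned}
\qquad (x(T) \text{ free} \Rightarrow \lambda(T) = 0)
\]

\[
\min \frac{1}{2} \int_0^T \Big( (x_1(t) - 2)^2 + (x_2(t) - 7)^2 + u(t)^2 \Big)\, dt
\]

Oscillation of $(x_1(\cdot), x_2(\cdot))$ around the steady state $(1, 0)$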

SLIDE 16

Example in control-affine case

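Example:

\[
\begin{aligned}
\dot x_1(t) &= x_2(t), & x_1(0) &= 1 \\
\dot x_2(t) &= 1 - x_1(t) + x_2(t)^3 + u(t), & x_2(0) &= 1
\end{aligned}
\]

\[
\min \frac{1}{2} \int_0^T \Big( (x_1(t) - 1)^2 + (x_2(t) - 1)^2 + (u(t) - 2)^2 \Big)\, dt
\]

Optimal solution of the static problem: $\bar x_2 = 0$, $1 - \bar x_1 + \bar x_2^3 + \bar u = 0$,

\[
\min_{\substack{x_2 = 0 \\ 1 - x_1 + x_2^3 + u = 0}} \Big( (x_1 - 1)^2 + (x_2 - 1)^2 + (u - 2)^2 \Big)
\]

whence $\bar x = (2, 0)$, $\bar u = 1$, $\bar\lambda = (-1, -1)$.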
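The stated values can be recovered by hand. The following short check assumes the normal-case Hamiltonian ($\lambda^0 = -1$) used on the earlier slides:

\[
\begin{aligned}
&\text{Constraints: } \bar x_2 = 0,\ \bar u = \bar x_1 - 1, \text{ so the static cost reduces to } (\bar x_1 - 1)^2 + 1 + (\bar x_1 - 3)^2. \\
&\text{Stationarity: } 2(\bar x_1 - 1) + 2(\bar x_1 - 3) = 0 \ \Rightarrow\ \bar x_1 = 2,\ \bar u = 1. \\
&H = \lambda_1 x_2 + \lambda_2 (1 - x_1 + x_2^3 + u) - \tfrac{1}{2} \big( (x_1 - 1)^2 + (x_2 - 1)^2 + (u - 2)^2 \big). \\
&\partial H / \partial u = \lambda_2 - (u - 2) = 0 \ \Rightarrow\ \bar\lambda_2 = \bar u - 2 = -1. \\
&\partial H / \partial x_2 = \lambda_1 + 3 x_2^2 \lambda_2 - (x_2 - 1) = 0 \ \Rightarrow\ \bar\lambda_1 = -1. \\
&\partial H / \partial x_1 = -\lambda_2 - (x_1 - 1) = 1 - 1 = 0 \text{ is then satisfied as well.}
\end{aligned}
\]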

SLIDE 17

Example in control-affine case

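Example:

\[
\begin{aligned}
\dot x_1(t) &= x_2(t), & x_1(0) &= 1 \\
\dot x_2(t) &= 1 - x_1(t) + x_2(t)^3 + u(t), & x_2(0) &= 1
\end{aligned}
\]

\[
\min \frac{1}{2} \int_0^T \Big( (x_1(t) - 1)^2 + (x_2(t) - 1)^2 + (u(t) - 2)^2 \Big)\, dt
\]

Oscillation of $(x_1(\cdot), x_2(\cdot))$ around the steady state $(2, 0)$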

SLIDE 18

Proof in the LQ case

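Equilibrium:

\[
A \bar x + B U^{-1} B^* \bar\lambda + B u_d = 0, \qquad Q \bar x - A^* \bar\lambda - Q x_d = 0
\]

i.e.

\[
\underbrace{\begin{pmatrix} A & B U^{-1} B^* \\ Q & -A^* \end{pmatrix}}_{M} \begin{pmatrix} \bar x \\ \bar\lambda \end{pmatrix} = \begin{pmatrix} -B u_d \\ Q x_d \end{pmatrix}
\]

Extremal equations:

\[
\dot x_T(t) = A x_T(t) + B U^{-1} B^* \lambda_T(t) + B u_d, \qquad \dot \lambda_T(t) = Q x_T(t) - A^* \lambda_T(t) - Q x_d
\]

Setting $\delta x(t) = x_T(t) - \bar x$ and $\delta\lambda(t) = \lambda_T(t) - \bar\lambda$:

\[
\begin{cases}
\delta\dot x(t) = A\, \delta x(t) + B U^{-1} B^*\, \delta\lambda(t) \\
\delta\dot\lambda(t) = Q\, \delta x(t) - A^*\, \delta\lambda(t)
\end{cases}
\]

i.e. $\dot Z(t) = M Z(t)$ with $Z(t) = \begin{pmatrix} \delta x(t) \\ \delta\lambda(t) \end{pmatrix}$.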

SLIDE 19

Proof in the LQ case

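Shooting problem:

\[
\dot Z(t) = M Z(t) \quad \text{with} \quad Z(t) = \begin{pmatrix} \delta x(t) \\ \delta\lambda(t) \end{pmatrix}, \qquad \delta x(0) = x_0 - \bar x, \quad \delta x(T) = x_1 - \bar x
\]

$\delta\lambda(0)$ unknown.

Key lemma. $M$ is Hamiltonian, i.e. $M \in \mathrm{sp}(n, \mathbb{R})$ (Lie algebra of the group $\mathrm{Sp}(n, \mathbb{R})$ of symplectic matrices). Under the Kalman condition on $(A, B)$:

  • $\mathrm{Re}(\mu) \neq 0$ for every $\mu \in \mathrm{Spec}(M)$ (no eigenvalue on the imaginary axis)
  • $\mu \in \mathrm{Spec}(M) \Rightarrow -\mu \in \mathrm{Spec}(M)$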

SLIDE 20

Proof in the LQ case

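Proof (borrowed from Wilde-Kokotovic, 1972). Let $E_-$ be the minimal solution and $E_+$ the maximal solution of the Riccati equation (in the LQ case $H_{uu} = -U$ and $W = Q$):

\[
E_- A + A^* E_- - E_- B H_{uu}^{-1} B^* E_- - W = 0 \qquad (1)
\]

\[
E_+ A + A^* E_+ - E_+ B H_{uu}^{-1} B^* E_+ - W = 0 \qquad (2)
\]

\[
P = \begin{pmatrix} I_n & I_n \\ E_- & E_+ \end{pmatrix}
\ \Rightarrow\
P^{-1} M P = \begin{pmatrix} A + B U^{-1} B^* E_- & 0 \\ 0 & A + B U^{-1} B^* E_+ \end{pmatrix}
\]

Moreover, (2) − (1) gives

\[
(E_+ - E_-)(A + B U^{-1} B^* E_+) + (A + B U^{-1} B^* E_-)^* (E_+ - E_-) = 0
\]

$E_+ - E_-$ invertible $\Rightarrow \mathrm{Spec}(A + B U^{-1} B^* E_+) = -\mathrm{Spec}(A + B U^{-1} B^* E_-)$

$\mathrm{Re}\big( \mathrm{Spec}(A + B U^{-1} B^* E_-) \big) < 0$ by the algebraic Riccati theory $\Rightarrow$ conclusion.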
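As an illustration that is not on the slides, the scalar case $n = m = 1$ can be worked out explicitly. Write $A = a$, $B = b \neq 0$, $Q = q > 0$, $U = r > 0$ (these lowercase symbols are introduced here only for the example). With $H_{uu} = -r$ and $W = q$ the Riccati equation becomes a quadratic:

\[
\begin{aligned}
&\frac{b^2}{r} E^2 + 2 a E - q = 0
\quad\Rightarrow\quad
E_\pm = \frac{r}{b^2} \Big( -a \pm \sqrt{a^2 + b^2 q / r} \Big), \\
&A + B U^{-1} B^* E_\pm = a + \frac{b^2}{r} E_\pm = \pm \sqrt{a^2 + b^2 q / r}.
\end{aligned}
\]

The two closed-loop values are opposite and nonzero, $E_+ - E_- \neq 0$, and the decay rate in this scalar case is $C_2 = \sqrt{a^2 + b^2 q / r}$.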

SLIDE 21

Proof in the LQ case

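Setting $Z(t) = \begin{pmatrix} I_n & I_n \\ E_- & E_+ \end{pmatrix} Z_1(t)$, we get

\[
\dot Z_1(t) = \begin{pmatrix} A + B U^{-1} B^* E_- & 0 \\ 0 & A + B U^{-1} B^* E_+ \end{pmatrix} Z_1(t) \qquad \text{(purely hyperbolic)}
\]

With $Z_1(t) = \begin{pmatrix} v(t) \\ w(t) \end{pmatrix}$:

\[
\begin{aligned}
v'(t) &= (A + B U^{-1} B^* E_-)\, v(t) & &\to \mathrm{Re}(\text{eigenvalues}) < 0 \\
w'(t) &= (A + B U^{-1} B^* E_+)\, w(t) & &\to \mathrm{Re}(\text{eigenvalues}) > 0
\end{aligned}
\]

whence

\[
\|v(t)\| \le \|v(0)\|\, e^{-C_2 t}, \qquad \|w(t)\| \le \|w(T)\|\, e^{-C_2 (T - t)}
\]

where $C_2 = -\max \{ \mathrm{Re}(\mu) \mid \mu \in \mathrm{Spec}(A + B U^{-1} B^* E_-) \} > 0$.

[Figure: time evolution of the solution]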

SLIDE 22

Consequences for the numerical computations

Direct methods (full discretization): initialization with the solution of the static problem $\Rightarrow$ successful convergence.

Indirect method (shooting): solve $\dot z(t) = F(z(t))$, $G(z(0), z(T)) = 0$.

Usual implementation: $z(0)$ unknown, tuned such that $G(z(0), z(T)) = 0$.

Here we propose the following variant: $z(T/2)$ unknown, tuned such that $G(z(0), z(T)) = 0$, where $z(0)$ is obtained by backward integration and $z(T)$ by forward integration from $z(T/2)$:

\[
z(0) \ \xleftarrow{\ \text{backward integration}\ }\ z(T/2) \ \xrightarrow{\ \text{forward integration}\ }\ z(T)
\]
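A minimal sketch of the midpoint variant, on a toy problem that is not from the slides: $\dot x = u$, cost $\frac12 \int_0^T ((x - 1)^2 + u^2)\,dt$, $x(0) = 0$, $x(T)$ free. Its extremal system is $\dot x = \lambda$, $\dot\lambda = x - 1$, with turnpike $(\bar x, \bar\lambda) = (1, 0)$ and boundary conditions $x(0) = 0$, $\lambda(T) = 0$. The function names, the RK4 integrator and the step count are illustrative assumptions. Because this toy system is linear, the boundary residual is affine in $z(T/2)$ and one linear solve replaces the Newton iteration a nonlinear problem would need.

```python
def rhs(z):
    # Extremal system of the toy problem: x' = lam, lam' = x - 1.
    x, lam = z
    return (lam, x - 1.0)

def rk4(z, t_span, n_steps):
    # Classical RK4 over a time span; a negative t_span integrates backward.
    h = t_span / n_steps
    x, lam = z
    for _ in range(n_steps):
        k1 = rhs((x, lam))
        k2 = rhs((x + 0.5 * h * k1[0], lam + 0.5 * h * k1[1]))
        k3 = rhs((x + 0.5 * h * k2[0], lam + 0.5 * h * k2[1]))
        k4 = rhs((x + h * k3[0], lam + h * k3[1]))
        x += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        lam += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return (x, lam)

def boundary_residual(z_mid, T, n_steps=2000):
    # G(z(0), z(T)) = (x(0) - 0, lam(T)), both reached from the midpoint.
    z0 = rk4(z_mid, -T / 2, n_steps)  # backward integration to t = 0
    zT = rk4(z_mid, T / 2, n_steps)   # forward integration to t = T
    return (z0[0], zT[1])

def solve_midpoint(T):
    # The residual is affine in z_mid: r(m) = r0 + J m. Build J column by column.
    r0 = boundary_residual((0.0, 0.0), T)
    c1 = boundary_residual((1.0, 0.0), T)
    c2 = boundary_residual((0.0, 1.0), T)
    J = [[c1[0] - r0[0], c2[0] - r0[0]],
         [c1[1] - r0[1], c2[1] - r0[1]]]
    det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
    # Solve J m = -r0 with the explicit 2x2 inverse.
    x_mid = (-r0[0] * J[1][1] + r0[1] * J[0][1]) / det
    lam_mid = (r0[0] * J[1][0] - r0[1] * J[0][0]) / det
    return (x_mid, lam_mid)

T = 20.0
x_mid, lam_mid = solve_midpoint(T)
x_start = rk4((x_mid, lam_mid), -T / 2, 2000)[0]  # should reproduce x(0) = 0
```

In this toy case each half-integration spans only $T/2$, so perturbations of the unknown are amplified by roughly $e^{T/2}$ instead of $e^{T}$, and the turnpike $(\bar x, \bar\lambda)$ is a natural initial guess for $z(T/2)$.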

SLIDE 23

Example in control-affine case

Example:

\[
\begin{aligned}
\dot x_1(t) &= x_2(t), & x_1(0) &= 1 \\
\dot x_2(t) &= 1 - x_1(t) + x_2(t)^3 + u(t), & x_2(0) &= 1
\end{aligned}
\]

\[
\min \frac{1}{2} \int_0^T \Big( (x_1(t) - 1)^2 + (x_2(t) - 1)^2 + (u(t) - 2)^2 \Big)\, dt
\]

The usual shooting method cannot be made to converge if $T > 3$ (explosive term + too high sensitivity).

Easy convergence with the variant, $\forall T$.

SLIDE 24

Further comments

PDE framework: A. Porretta, E. Zuazua (2013) → linear heat and wave equations (quite different approach).

Nonlinear PDEs: to be done.

Interaction with discretizations.

Turnpikes in optimal design. Adiabatic theory.

\[
\dot x(t) = A_{\sigma(t)} x(t) + b, \qquad \min_{\sigma \in \Sigma} \int_0^T \Big( \|x(t) - x_d\|^2 + \|A_{\sigma(t)}\|^2 \Big)\, dt
\]

\[
\min_{\substack{\sigma \in \Sigma \\ A_\sigma x + b = 0}} \Big( \|x - x_d\|^2 + \|A_\sigma\|^2 \Big)
\]

$\sigma$: shape (for instance)
