Turnpike property in finite-dimensional nonlinear optimal control
Emmanuel Trélat
Univ. Paris 6 (Labo. J.-L. Lions) and Institut Universitaire de France
IHP, Nov. 2014
E. Trélat, Turnpike in optimal control
The solution of an optimal control problem in large time should spend most of its time near a steady-state. In infinite horizon, the solution should converge to that steady-state.
Historically: discovered in econometrics (Von Neumann points). The first turnpike result was discovered in 1958 by Dorfman, Samuelson and Solow, in view of deriving efficient programs, in a model in which labor is treated as an intermediate product.
Paul Samuelson (1915–2009), Nobel Prize in Economic Science, 1970.
Dorfman, Samuelson, Solow (1958): "Thus in this unexpected way, we have found a real normative significance for steady growth – not steady growth in general, but maximal von Neumann growth. It is, in a sense, the single most effective way for the system to grow, so that if we are planning long-run growth, no matter where we start, and where we desire to end up, it will pay in the intermediate stages to get into a growth phase of this kind. It is exactly like a turnpike paralleled by a network of minor roads. There is a fastest route between any two points; and if the origin and destination are close together and far from the turnpike, the best route may not touch the turnpike. But if origin and destination are far enough apart, it will always pay to get on to the turnpike and cover distance at the best rate of travel, even if this means adding a little mileage at either end. The best intermediate capital configuration is one which will grow most rapidly, even if it is not the desired one, it is temporarily optimal."
Turnpike theorems were derived in the 1960s for discrete-time optimal control problems arising in econometrics (McKenzie, 1963).
Continuous versions by Haurie for particular dynamics (economic growth models). See also Carlson Haurie Leizarowitz 1991, Zaslavski 2000.
More recently, in biology: Rapaport 2005, Coron Gabriel Shang 2014; human locomotion: Chitour Jean Mason 2012.
Linear heat and wave equations: Porretta Zuazua 2013.
Rockafellar 1973, Samuelson 1972: saddle point feature of the extremal equations.
Different point of view by Anderson Kokotovic (1987), Wilde Kokotovic (1972): exponential dichotomy property → hyperbolicity phenomenon.
Dynamics: $f : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}^n$
Terminal conditions: $R : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^k$, $R = (R_1, \dots, R_k)$
Instantaneous cost: $f^0 : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}$

Optimal control problem $(OCP)_T$: for $T > 0$ fixed, find $u_T(\cdot) \in L^\infty(0, T; \mathbb{R}^m)$ such that
$$\dot x(t) = f(x(t), u(t)), \qquad R(x(0), x(T)) = 0, \qquad \min \int_0^T f^0(x(t), u(t))\, dt.$$
Examples of terminal conditions $R$: point-to-point, point-to-free, periodic, ...
Optimal (assumed) solution: $(x_T(\cdot), u_T(\cdot))$.
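As an illustration, the three kinds of terminal conditions mentioned above could be encoded as follows. The prescribed points $x_0, x_1 \in \mathbb{R}^n$ are notation assumed for this sketch, and other encodings are possible.

```latex
\begin{align*}
\text{point-to-point:} \quad & R(x(0), x(T)) = \begin{pmatrix} x(0) - x_0 \\ x(T) - x_1 \end{pmatrix}, & k &= 2n, \\
\text{point-to-free:} \quad & R(x(0), x(T)) = x(0) - x_0, & k &= n, \\
\text{periodic:} \quad & R(x(0), x(T)) = x(0) - x(T), & k &= n.
\end{align*}
```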
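Pontryagin maximum principle ⇒ there exists $(\lambda_T(\cdot), \lambda^0_T) \neq (0, 0)$ such that
$$\begin{aligned}
\dot x_T(t) &= \frac{\partial H}{\partial \lambda}(x_T(t), \lambda_T(t), \lambda^0_T, u_T(t)) \\
\dot \lambda_T(t) &= -\frac{\partial H}{\partial x}(x_T(t), \lambda_T(t), \lambda^0_T, u_T(t)) \\
\frac{\partial H}{\partial u}(x_T(t), \lambda_T(t), \lambda^0_T, u_T(t)) &= 0
\end{aligned}$$
where $H(x, \lambda, \lambda^0, u) = \langle \lambda, f(x, u) \rangle + \lambda^0 f^0(x, u)$.
Moreover, we have the transversality conditions
$$\begin{pmatrix} -\lambda_T(0) \\ \lambda_T(T) \end{pmatrix} = \sum_{i=1}^{k} \gamma_i \nabla R_i(x_T(0), x_T(T)).$$
(Generic) assumption made throughout: no abnormal extremal ⇒ $\lambda^0_T = -1$.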
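To see how the transversality conditions are used, here is a short worked case, assuming the point-to-free encoding $R(x(0), x(T)) = x(0) - x_0$ and the periodic encoding $R(x(0), x(T)) = x(0) - x(T)$ (these encodings, and the canonical basis vectors $e_i$ of $\mathbb{R}^n$, are notation introduced for the sketch).

```latex
\begin{align*}
\text{point-to-free:} \quad & \nabla R_i = \begin{pmatrix} e_i \\ 0 \end{pmatrix}
  \;\Rightarrow\; -\lambda_T(0) = \sum_{i=1}^{n} \gamma_i e_i, \quad \lambda_T(T) = 0, \\
\text{periodic:} \quad & \nabla R_i = \begin{pmatrix} e_i \\ -e_i \end{pmatrix}
  \;\Rightarrow\; -\lambda_T(0) = \sum_{i=1}^{n} \gamma_i e_i = -\lambda_T(T)
  \;\Rightarrow\; \lambda_T(0) = \lambda_T(T).
\end{align*}
```

In the point-to-free case the multipliers $\gamma_i$ are free, so $\lambda_T(0)$ is unconstrained and the only information is $\lambda_T(T) = 0$.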
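Static optimal control problem:
$$\min_{\substack{(x,u) \in \mathbb{R}^n \times \mathbb{R}^m \\ f(x,u) = 0}} f^0(x, u)$$
Optimal (assumed) solution: $(\bar x, \bar u)$.
Lagrange multipliers ⇒ there exists $(\bar\lambda, \bar\lambda^0) \neq (0, 0)$ such that
$$\begin{aligned}
f(\bar x, \bar u) &= 0 \\
\bar\lambda^0 \frac{\partial f^0}{\partial x}(\bar x, \bar u) + \Big\langle \bar\lambda, \frac{\partial f}{\partial x}(\bar x, \bar u) \Big\rangle &= 0 \\
\bar\lambda^0 \frac{\partial f^0}{\partial u}(\bar x, \bar u) + \Big\langle \bar\lambda, \frac{\partial f}{\partial u}(\bar x, \bar u) \Big\rangle &= 0
\end{aligned}$$
i.e.
$$\frac{\partial H}{\partial \lambda}(\bar x, \bar\lambda, \bar\lambda^0, \bar u) = 0, \qquad -\frac{\partial H}{\partial x}(\bar x, \bar\lambda, \bar\lambda^0, \bar u) = 0, \qquad \frac{\partial H}{\partial u}(\bar x, \bar\lambda, \bar\lambda^0, \bar u) = 0,$$
where $H(x, \lambda, \lambda^0, u) = \langle \lambda, f(x, u) \rangle + \lambda^0 f^0(x, u)$.
(Generic) assumption made throughout: no abnormal multiplier ⇒ $\bar\lambda^0 = -1$ (Mangasarian-Fromovitz).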
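Comparison of $(OCP)_T$ and of the static optimal control problem, with $\lambda^0 = -1$ and $H(x, \lambda, \lambda^0, u) = \langle \lambda, f(x, u) \rangle + \lambda^0 f^0(x, u)$.

Extremal equations of $(OCP)_T$:
$$\begin{aligned}
\dot x_T(t) &= \frac{\partial H}{\partial \lambda}(x_T(t), \lambda_T(t), -1, u_T(t)) \\
\dot \lambda_T(t) &= -\frac{\partial H}{\partial x}(x_T(t), \lambda_T(t), -1, u_T(t)) \\
\frac{\partial H}{\partial u}(x_T(t), \lambda_T(t), -1, u_T(t)) &= 0
\end{aligned}$$
Optimality conditions of the static problem:
$$\frac{\partial H}{\partial \lambda}(\bar x, \bar\lambda, -1, \bar u) = 0, \qquad -\frac{\partial H}{\partial x}(\bar x, \bar\lambda, -1, \bar u) = 0, \qquad \frac{\partial H}{\partial u}(\bar x, \bar\lambda, -1, \bar u) = 0$$
Hence $(\bar x, \bar\lambda, \bar u)$ is an equilibrium point of the extremal equations.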
It is expected that, in large time $T$, the optimal extremal solution $(x_T(\cdot), \lambda_T(\cdot), u_T(\cdot))$ of $(OCP)_T$ approximately consists of 3 pieces:
1. short-time: $(x_T(0), \lambda_T(0), u_T(0)) \to (\bar x, \bar\lambda, \bar u)$ (transient arc)
2. long-time, stationary: $(\bar x, \bar\lambda, \bar u)$
3. short-time: $(\bar x, \bar\lambda, \bar u) \to (x_T(T), \lambda_T(T), u_T(T))$ (transient arc)
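Notation: $H_{*\#} = \dfrac{\partial^2 H}{\partial * \, \partial \#}(\bar x, \bar\lambda, -1, \bar u)$ and
$$A = H_{x\lambda} - H_{u\lambda} H_{uu}^{-1} H_{xu}, \qquad B = H_{u\lambda}, \qquad W = -H_{xx} + H_{ux} H_{uu}^{-1} H_{xu}.$$

Theorem (Trélat, Zuazua, JDE 2014). Assume that
$H_{uu} < 0$, $W > 0$,
$\mathrm{rank}(B, AB, \dots, A^{n-1}B) = n$ (Kalman condition),
$(\bar x, \bar\lambda)$ "almost satisfies" the terminal and transversality conditions.
Then, for $T > 0$ large enough,
$$\|x_T(t) - \bar x\| + \|\lambda_T(t) - \bar\lambda\| + \|u_T(t) - \bar u\| \le C_1 \big( e^{-C_2 t} + e^{-C_2 (T - t)} \big) \qquad \forall t \in [0, T].$$
Moreover, let $E_-$ be the minimal solution and $E_+$ the maximal solution of the Riccati equation
$$E A + A^* E - E B H_{uu}^{-1} B^* E - W = 0.$$
Then
$$C_2 = -\max\{ \mathrm{Re}(\mu) \mid \mu \in \mathrm{Spec}(A - B H_{uu}^{-1} B^* E_-) \} > 0.$$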
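Linear-quadratic case. $(OCP)_T$:
$$\dot x(t) = A x(t) + B u(t), \qquad x(0) = x_0, \quad x(T) = x_1,$$
$$\min \frac{1}{2} \int_0^T \Big( (x(t) - x_d)^* Q (x(t) - x_d) + (u(t) - u_d)^* U (u(t) - u_d) \Big)\, dt$$
Static optimal control problem:
$$\min_{\substack{(x,u) \in \mathbb{R}^n \times \mathbb{R}^m \\ Ax + Bu = 0}} \frac{1}{2} \Big( (x - x_d)^* Q (x - x_d) + (u - u_d)^* U (u - u_d) \Big)$$
Extremal equations of $(OCP)_T$:
$$\dot x_T(t) = A x_T(t) + B U^{-1} B^* \lambda_T(t) + B u_d, \qquad \dot \lambda_T(t) = Q x_T(t) - A^* \lambda_T(t) - Q x_d$$
Optimality conditions of the static problem:
$$A \bar x + B U^{-1} B^* \bar\lambda + B u_d = 0, \qquad Q \bar x - A^* \bar\lambda - Q x_d = 0$$

Theorem. Assume that $U > 0$, $Q > 0$ and $\mathrm{rank}(B, AB, \dots, A^{n-1}B) = n$ (Kalman condition). Then
$$\|x_T(t) - \bar x\| + \|\lambda_T(t) - \bar\lambda\| + \|u_T(t) - \bar u\| \le C_1 \big( e^{-C_2 t} + e^{-C_2 (T - t)} \big) \qquad \forall t \in [0, T].$$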
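The estimate can be checked by hand on a toy case. The sketch below assumes the scalar data $A = 0$, $B = 1$, $Q = U = 1$, $x_d = u_d = 0$ (hypothetical values chosen for illustration), for which the steady state is $(0, 0, 0)$, the extremal equations read $\dot x = \lambda$, $\dot\lambda = x$, $u = \lambda$, and the point-to-point extremal has a closed form. The constants $C_1 = 8 \max(|x_0|, |x_1|)$ and $C_2 = 1$ are hand-picked for this example (with $T \ge 1$), not the constants of the theorem.

```python
import math

def extremal(t, T, x0, x1):
    """Closed-form extremal (x, lam, u) of x' = u, min 1/2 int (x^2 + u^2),
    with x(0) = x0 and x(T) = x1 (toy problem assumed for this sketch)."""
    s = math.sinh(T)
    x = (x0 * math.sinh(T - t) + x1 * math.sinh(t)) / s
    lam = (-x0 * math.cosh(T - t) + x1 * math.cosh(t)) / s
    return x, lam, lam  # u = lam because dH/du = lam - u = 0

def turnpike_gap(t, T, x0, x1):
    """Distance |x| + |lam| + |u| to the steady state (0, 0, 0)."""
    x, lam, u = extremal(t, T, x0, x1)
    return abs(x) + abs(lam) + abs(u)

def bound(t, T, C1, C2=1.0):
    """Right-hand side C1 (exp(-C2 t) + exp(-C2 (T - t))) of the estimate."""
    return C1 * (math.exp(-C2 * t) + math.exp(-C2 * (T - t)))

T, x0, x1 = 20.0, 3.0, -2.0
C1 = 8.0 * max(abs(x0), abs(x1))  # hand-picked constant for this toy problem
times = [T * i / 200 for i in range(201)]
worst_ratio = max(turnpike_gap(t, T, x0, x1) / bound(t, T, C1) for t in times)
mid_gap = turnpike_gap(T / 2, T, x0, x1)
print(f"max gap/bound = {worst_ratio:.3f}, gap at T/2 = {mid_gap:.2e}")
```

A ratio below 1 on the whole grid means the trajectory stays under the two-sided exponential envelope, and the gap at $T/2$ shows how close the extremal is to the steady state in the middle of the interval.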
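Example:
$$\begin{aligned}
\dot x_1(t) &= x_2(t), & x_1(0) &= 0 \\
\dot x_2(t) &= -x_1(t) + u(t), & x_2(0) &= 0
\end{aligned} \qquad (x(T) \text{ free} \Rightarrow \lambda(T) = 0)$$
$$\min \frac{1}{2} \int_0^T \Big( (x_1(t) - 2)^2 + (x_2(t) - 7)^2 + u(t)^2 \Big)\, dt$$
Optimal solution of the static problem ($\bar x_2 = 0$, $\bar x_1 = \bar u$):
$$\min_{\substack{x_2 = 0 \\ x_1 = u}} \Big( (x_1 - 2)^2 + (x_2 - 7)^2 + u^2 \Big)$$
whence $\bar x = (1, 0)$, $\bar u = 1$, $\bar\lambda = (-7, 1)$.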
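The static solution above can be recovered by solving the linear optimality system $A \bar x + B U^{-1} B^* \bar\lambda + B u_d = 0$, $Q \bar x - A^* \bar\lambda - Q x_d = 0$ of the linear-quadratic case. The sketch below is one possible way to do it, with exact rational arithmetic; the helper `solve_linear` is a hypothetical name introduced here, and the data $A = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$, $B = (0, 1)^*$, $Q = I$, $U = 1$, $x_d = (2, 7)$, $u_d = 0$ are read off from the example.

```python
from fractions import Fraction

def solve_linear(M, rhs):
    """Solve M z = rhs exactly by Gauss-Jordan elimination over the rationals.
    Raises StopIteration if M is singular (no nonzero pivot is found)."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(b)] for row, b in zip(M, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

# Unknown z = (x1, x2, lam1, lam2). Rows 1-2: A x + B U^{-1} B^* lam = -B u_d.
# Rows 3-4: Q x - A^* lam = Q x_d.
M = [[0, 1, 0, 0],
     [-1, 0, 0, 1],
     [1, 0, 0, 1],
     [0, 1, -1, 0]]
rhs = [0, 0, 2, 7]

x1, x2, lam1, lam2 = solve_linear(M, rhs)
u_bar = lam2  # u = u_d + U^{-1} B^* lam with u_d = 0, U = 1, B = (0, 1)^*
print("x_bar =", (x1, x2), "u_bar =", u_bar, "lambda_bar =", (lam1, lam2))
```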
Example (continued): the optimal trajectory $(x_1(\cdot), x_2(\cdot))$ oscillates around the steady-state $(1, 0)$.
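Example:
$$\begin{aligned}
\dot x_1(t) &= x_2(t), & x_1(0) &= 1 \\
\dot x_2(t) &= 1 - x_1(t) + x_2(t)^3 + u(t), & x_2(0) &= 1
\end{aligned}$$
$$\min \frac{1}{2} \int_0^T \Big( (x_1(t) - 1)^2 + (x_2(t) - 1)^2 + (u(t) - 2)^2 \Big)\, dt$$
Optimal solution of the static problem ($\bar x_2 = 0$, $1 - \bar x_1 + \bar x_2^3 + \bar u = 0$):
$$\min_{\substack{x_2 = 0 \\ 1 - x_1 + x_2^3 + u = 0}} \Big( (x_1 - 1)^2 + (x_2 - 1)^2 + (u - 2)^2 \Big)$$
whence $\bar x = (2, 0)$, $\bar u = 1$, $\bar\lambda = (-1, -1)$.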
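The values above can be double-checked against the static optimality conditions $f = 0$, $\partial H / \partial x = 0$, $\partial H / \partial u = 0$ with $\lambda^0 = -1$. The sketch below writes the partial derivatives of $H$ by hand for this example (the function names are hypothetical) and also compares the cost at nearby feasible points, where feasibility means $x_2 = 0$, $u = x_1 - 1$.

```python
def f(x1, x2, u):
    """Dynamics of the nonlinear example."""
    return (x2, 1.0 - x1 + x2**3 + u)

def static_cost(x1, x2, u):
    """Instantaneous cost f0 of the example."""
    return 0.5 * ((x1 - 1.0)**2 + (x2 - 1.0)**2 + (u - 2.0)**2)

def grad_H(x1, x2, lam1, lam2, u):
    """Partial derivatives of H = <lam, f> - f0 with respect to (x1, x2, u)."""
    dH_dx1 = -lam2 - (x1 - 1.0)
    dH_dx2 = lam1 + 3.0 * lam2 * x2**2 - (x2 - 1.0)
    dH_du = lam2 - (u - 2.0)
    return dH_dx1, dH_dx2, dH_du

x_bar, u_bar, lam_bar = (2.0, 0.0), 1.0, (-1.0, -1.0)
residual_f = f(*x_bar, u_bar)
residual_H = grad_H(*x_bar, *lam_bar, u_bar)

# Cost along the feasible set x2 = 0, u = x1 - 1, near x1 = 2.
cost_at = lambda x1: static_cost(x1, 0.0, x1 - 1.0)
print("f =", residual_f, "grad H =", residual_H)
print("cost at 1.9, 2.0, 2.1:", cost_at(1.9), cost_at(2.0), cost_at(2.1))
```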
Example (continued): the optimal trajectory $(x_1(\cdot), x_2(\cdot))$ oscillates around the steady-state $(2, 0)$.
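The static optimality conditions
$$A \bar x + B U^{-1} B^* \bar\lambda + B u_d = 0, \qquad Q \bar x - A^* \bar\lambda - Q x_d = 0$$
read
$$M \begin{pmatrix} \bar x \\ \bar\lambda \end{pmatrix} = \begin{pmatrix} -B u_d \\ Q x_d \end{pmatrix}, \qquad M = \begin{pmatrix} A & B U^{-1} B^* \\ Q & -A^* \end{pmatrix}.$$
Subtracting them from the extremal equations
$$\dot x_T(t) = A x_T(t) + B U^{-1} B^* \lambda_T(t) + B u_d, \qquad \dot \lambda_T(t) = Q x_T(t) - A^* \lambda_T(t) - Q x_d$$
and setting $\delta x(t) = x_T(t) - \bar x$, $\delta\lambda(t) = \lambda_T(t) - \bar\lambda$, we get
$$\begin{cases} \delta\dot x(t) = A\, \delta x(t) + B U^{-1} B^*\, \delta\lambda(t) \\ \delta\dot\lambda(t) = Q\, \delta x(t) - A^*\, \delta\lambda(t) \end{cases}$$
i.e. $\dot Z(t) = M Z(t)$ with $Z(t) = \begin{pmatrix} \delta x(t) \\ \delta\lambda(t) \end{pmatrix}$.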
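Shooting problem: $\dot Z(t) = M Z(t)$ with $Z(t) = \begin{pmatrix} \delta x(t) \\ \delta\lambda(t) \end{pmatrix}$, where $\delta x(0) = x_0 - \bar x$ and $\delta x(T) = x_1 - \bar x$ are prescribed and $\delta\lambda(0)$ is unknown.

Key lemma. $M$ is Hamiltonian, i.e. $M \in \mathfrak{sp}(n, \mathbb{R})$ (Lie algebra of the group $\mathrm{Sp}(n, \mathbb{R})$ of symplectic matrices). Under the Kalman condition on $(A, B)$, the eigenvalues of $M$ have nonzero real part, i.e. $\mathrm{Spec}(M) \cap i\mathbb{R} = \emptyset$, and the spectrum is symmetric: $\mu \in \mathrm{Spec}(M) \Rightarrow -\mu \in \mathrm{Spec}(M)$.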
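As a sanity check of the lemma, the sketch below builds $M$ for the first example of these slides ($A = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$, $B = (0, 1)^*$, $Q = I$, $U = 1$). The characteristic polynomial $\mu^4 + \mu^2 + 2$ is a hand computation (obtained by eliminating $x_2, \lambda_1, \lambda_2$ from $\dot Z = M Z$), stated here as an assumption and then verified numerically against $\det(M - \mu I)$. The eigenvalues come out complex, with nonzero real parts; only the real parts matter for hyperbolicity, and the imaginary parts are consistent with the oscillations around the steady state.

```python
import cmath

# Hamiltonian matrix M = [[A, B U^{-1} B^*], [Q, -A^*]] for the first example.
M = [[0, 1, 0, 0],
     [-1, 0, 0, 1],
     [1, 0, 0, 1],
     [0, 1, -1, 0]]

def det(mat):
    """Determinant by cofactor expansion along the first row (fine for 4x4)."""
    if len(mat) == 1:
        return mat[0][0]
    total = 0
    for j, entry in enumerate(mat[0]):
        minor = [row[:j] + row[j + 1:] for row in mat[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def char_poly(mu):
    """det(M - mu I), evaluated directly from the matrix."""
    shifted = [[M[i][j] - (mu if i == j else 0) for j in range(4)]
               for i in range(4)]
    return det(shifted)

# Hand-derived claim, checked below: det(M - mu I) = mu^4 + mu^2 + 2,
# hence mu^2 = (-1 +/- i sqrt(7)) / 2.
squares = [(-1 + 1j * cmath.sqrt(7)) / 2, (-1 - 1j * cmath.sqrt(7)) / 2]
eigenvalues = [sign * cmath.sqrt(s) for s in squares for sign in (1, -1)]

for mu in eigenvalues:
    print(f"mu = {mu:.4f}, |det(M - mu I)| = {abs(char_poly(mu)):.1e}")
```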
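Proof (borrowed from Wilde-Kokotovic, 1972). Let $E_-$ be the minimal solution and $E_+$ the maximal solution of the Riccati equation (in the linear-quadratic case, $H_{uu} = -U$ and $W = Q$):
$$E_- A + A^* E_- - E_- B H_{uu}^{-1} B^* E_- - W = 0 \tag{1}$$
$$E_+ A + A^* E_+ - E_+ B H_{uu}^{-1} B^* E_+ - W = 0 \tag{2}$$
Then
$$P = \begin{pmatrix} I_n & I_n \\ E_- & E_+ \end{pmatrix} \quad \Rightarrow \quad P^{-1} M P = \begin{pmatrix} A + B U^{-1} B^* E_- & 0 \\ 0 & A + B U^{-1} B^* E_+ \end{pmatrix}.$$
Moreover, (2) − (1) gives
$$(E_+ - E_-)(A + B U^{-1} B^* E_+) + (A + B U^{-1} B^* E_-)^* (E_+ - E_-) = 0.$$
Since $E_+ - E_-$ is invertible, $\mathrm{Spec}(A + B U^{-1} B^* E_+) = -\mathrm{Spec}(A + B U^{-1} B^* E_-)$.
Finally, $\mathrm{Re}\big( \mathrm{Spec}(A + B U^{-1} B^* E_-) \big) < 0$ by the algebraic Riccati theory, which gives the conclusion.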
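Setting $Z(t) = \begin{pmatrix} I_n & I_n \\ E_- & E_+ \end{pmatrix} Z_1(t)$, we get
$$\dot Z_1(t) = \begin{pmatrix} A + B U^{-1} B^* E_- & 0 \\ 0 & A + B U^{-1} B^* E_+ \end{pmatrix} Z_1(t),$$
which is purely hyperbolic. Writing $Z_1(t) = \begin{pmatrix} v(t) \\ w(t) \end{pmatrix}$:
$$\begin{aligned}
v'(t) &= (A + B U^{-1} B^* E_-)\, v(t) && \text{Re(eigenvalues)} < 0 \\
w'(t) &= (A + B U^{-1} B^* E_+)\, w(t) && \text{Re(eigenvalues)} > 0
\end{aligned}$$
whence
$$\|v(t)\| \le \|v(0)\|\, e^{-C_2 t}, \qquad \|w(t)\| \le \|w(T)\|\, e^{-C_2 (T - t)},$$
where $C_2 = -\max\{ \mathrm{Re}(\mu) \mid \mu \in \mathrm{Spec}(A + B U^{-1} B^* E_-) \} > 0$.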
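In the scalar case the whole construction can be carried out explicitly. The sketch below assumes hypothetical scalar data $A = a$, $B = b \ne 0$, $Q = q > 0$, $U = r > 0$, for which the Riccati equation reduces to $2 a E + (b^2 / r) E^2 - q = 0$; it computes $E_\pm$, the two closed-loop coefficients, the rate $C_2$, and checks the block-diagonalization $M P = P\, \mathrm{diag}(\cdot, \cdot)$ entry by entry.

```python
import math

def riccati_roots(a, b, q, r):
    """Minimal and maximal roots of the scalar Riccati equation
    2 a E + (b^2 / r) E^2 - q = 0."""
    s = b * b / r
    disc = math.sqrt(a * a + s * q)
    return (-a - disc) / s, (-a + disc) / s

a, b, q, r = 0.5, 1.0, 2.0, 1.0  # hypothetical data: b != 0, q > 0, r > 0
E_minus, E_plus = riccati_roots(a, b, q, r)
s = b * b / r
closed_minus = a + s * E_minus   # A + B U^{-1} B^* E_-  (stable part)
closed_plus = a + s * E_plus     # A + B U^{-1} B^* E_+  (unstable part)
C2 = -closed_minus

# Check M P = P diag(closed_minus, closed_plus) entry by entry.
M = [[a, s], [q, -a]]
P = [[1.0, 1.0], [E_minus, E_plus]]
D = [closed_minus, closed_plus]
MP = [[sum(M[i][k] * P[k][j] for k in range(2)) for j in range(2)]
      for i in range(2)]
PD = [[P[i][j] * D[j] for j in range(2)] for i in range(2)]
residual = max(abs(MP[i][j] - PD[i][j]) for i in range(2) for j in range(2))
print(f"E- = {E_minus:.4f}, E+ = {E_plus:.4f}, C2 = {C2:.4f}, "
      f"residual = {residual:.1e}")
```

Here the two closed-loop coefficients are $\pm\sqrt{a^2 + b^2 q / r}$, which are also the eigenvalues of the $2 \times 2$ matrix $M$.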
Direct methods (full discretization): initialization with the solution of the static problem ⇒ successful convergence.
Indirect method (shooting): solve $\dot z(t) = F(z(t))$, $G(z(0), z(T)) = 0$.
Usual implementation: $z(0)$ is unknown, and tuned such that $G(z(0), z(T)) = 0$.
Here we propose the following variant: $z(T/2)$ is unknown; $z(0)$ is obtained by backward integration from $T/2$, $z(T)$ by forward integration from $T/2$, and $z(T/2)$ is tuned such that $G(z(0), z(T)) = 0$.
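A rough way to see why starting from the middle helps is to compare how strongly the shooting function depends on the unknown in each implementation. The sketch below does this on a toy linear problem, not on the nonlinear example of these slides: $\dot x = u$, $x(0) = x_0$, $x(T)$ free, cost $\frac{1}{2} \int_0^T (x^2 + u^2)\, dt$, for which $z = (x, \lambda)$ solves $\dot x = \lambda$, $\dot\lambda = x$ with an explicit flow. The function names and the choice of "sensitivity" (size of the Jacobian of $G$ with respect to the unknown) are assumptions made for this illustration.

```python
import math

def flow(t, z):
    """Exact flow of x' = lam, lam' = x over a (possibly negative) time t."""
    c, s = math.cosh(t), math.sinh(t)
    return (c * z[0] + s * z[1], s * z[0] + c * z[1])

def shoot_from_start(T, x0):
    """Usual shooting: unknown lam(0), tuned so that lam(T) = 0."""
    lam0 = -math.tanh(T) * x0      # root of sinh(T) x0 + cosh(T) lam0 = 0
    sensitivity = math.cosh(T)     # d lam(T) / d lam0
    return (x0, lam0), sensitivity

def shoot_from_middle(T, x0):
    """Variant: unknown z(T/2), integrated backward to 0 and forward to T."""
    c, s = math.cosh(T / 2), math.sinh(T / 2)
    # Conditions: x(0) = c*xm - s*lm = x0 and lam(T) = s*xm + c*lm = 0.
    det = c * c + s * s
    zm = (c * x0 / det, -s * x0 / det)
    sensitivity = c                # largest entry of the 2x2 Jacobian
    return zm, sensitivity

# Both implementations describe the same extremal (checked at T = 10).
z0, _ = shoot_from_start(10.0, 1.0)
zm, _ = shoot_from_middle(10.0, 1.0)
z_mid_from_start = flow(5.0, z0)

# Sensitivities for a long horizon.
_, sens_start = shoot_from_start(40.0, 1.0)
_, sens_middle = shoot_from_middle(40.0, 1.0)
print(f"sensitivity from t=0: {sens_start:.2e}, from t=T/2: {sens_middle:.2e}")
```

In this toy case the sensitivity grows like $e^{T}$ when shooting from $t = 0$ and like $e^{T/2}$ when shooting from $t = T/2$, since the unstable direction is only integrated over half of the interval in each direction.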
Example:
$$\begin{aligned}
\dot x_1(t) &= x_2(t), & x_1(0) &= 1 \\
\dot x_2(t) &= 1 - x_1(t) + x_2(t)^3 + u(t), & x_2(0) &= 1
\end{aligned}$$
$$\min \frac{1}{2} \int_0^T \Big( (x_1(t) - 1)^2 + (x_2(t) - 1)^2 + (u(t) - 2)^2 \Big)\, dt$$
The usual shooting method cannot be made to converge if $T > 3$ (explosive term and too high sensitivity).
The variant converges easily for every $T$.
PDE framework: A. Porretta, E. Zuazua (2013) → linear heat and wave equations (quite different approach).
Nonlinear PDEs: to be done.
Interaction with discretizations.
Turnpikes in optimal design. Adiabatic theory.
$$\dot x(t) = A_{\sigma(t)} x(t) + b, \qquad \min_{\sigma \in \Sigma} \int_0^T \Big( \|x(t) - x_d\|^2 + \|A_{\sigma(t)}\|^2 \Big)\, dt$$
$$\min_{\substack{\sigma \in \Sigma \\ A_\sigma x + b = 0}} \Big( \|x - x_d\|^2 + \|A_\sigma\|^2 \Big)$$
$\sigma$: shape (for instance).