

SLIDE 1

Numerical approximation for optimal control problems via MPC and HJB

Giulia Fabrini

Konstanz Women in Mathematics, 15 May 2018

G. Fabrini (University of Konstanz)
Numerical approximation for OCP 1 / 33

SLIDE 2

Outline

1. Introduction and motivations
2. Hamilton-Jacobi-Bellman
3. Model Predictive Control
4. Numerical Tests

SLIDE 3

Introduction and motivations

SLIDE 4

Introduction and Motivations

Some history: the field largely grew out of the work of Lev Pontryagin and Richard Bellman in the 1950s. Control theory analyzes the properties of controlled systems, i.e. dynamical systems on which we can act through a control.

Aim: bring the system from an initial state to a desired final state while satisfying given criteria.

Applications: robotics, aeronautics, electrical and aerospace engineering, biology and medicine.

SLIDE 5

Introduction and motivations

Controlled dynamical system

ẏ(t) = F(y(t), u(t)), t > 0,
y(0) = x, x ∈ R^n.

Assumptions on the data:
- u(·) ∈ U is the control, where U := {u(·) : [0, +∞) → U measurable};
- F : R^n × U → R^n is the dynamics, which is continuous with respect to (y, u), locally bounded, and Lipschitz continuous with respect to y.

Under these assumptions there exists a unique solution y_x(t, u) of the problem (Carathéodory theorem).
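As a minimal illustration, once a control law is fixed the controlled system can be integrated numerically. The scalar dynamics F(y, u) = −y + u and the saturated feedback below are assumptions made for the sketch, not the talk's example; the control is constrained to U = [−1, 1].

```python
import numpy as np

# Forward-Euler integration of the controlled system y'(t) = F(y(t), u(t)), y(0) = x.
# Dynamics F and control law u are illustrative assumptions.

def simulate(F, u, x, dt=0.01, T=5.0):
    """Integrate y' = F(y, u(t, y)) with explicit Euler; returns times and states."""
    n_steps = int(T / dt)
    ts = np.linspace(0.0, T, n_steps + 1)
    ys = np.empty(n_steps + 1)
    ys[0] = x
    for k in range(n_steps):
        ys[k + 1] = ys[k] + dt * F(ys[k], u(ts[k], ys[k]))
    return ts, ys

F = lambda y, u: -y + u                          # hypothetical scalar dynamics
u = lambda t, y: np.clip(-2.0 * y, -1.0, 1.0)    # feedback clipped to U = [-1, 1]
ts, ys = simulate(F, u, x=1.0)                   # state decays toward 0
```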

SLIDE 6

Introduction and motivations

The infinite horizon problem

Cost functional:
J(x, u) := ∫_0^∞ L(y_x(t, u), u(t)) e^(−λt) dt,
where λ > 0 is the interest (discount) rate and L is the running cost.

Goal: we are interested in minimizing this cost functional, i.e. in finding an optimal pair (y*, u*) which minimizes it over all admissible controls.
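Numerically, the discounted integral is truncated at a finite horizon T and approximated by quadrature; the discount factor e^(−λt) makes the tail negligible. The running cost L ≡ 1 below is an illustrative assumption, chosen so the exact value ∫_0^∞ e^(−t) dt = 1 is known.

```python
import numpy as np

# Trapezoidal approximation of the discounted cost J = ∫_0^T L(t) e^{-λt} dt.
# The truncation horizon T and the constant running cost are assumptions.

def discounted_cost(L_vals, ts, lam):
    """Trapezoidal rule for ∫ L(t) e^{-λt} dt on the grid ts."""
    f = L_vals * np.exp(-lam * ts)
    return float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(ts)))

lam = 1.0
ts = np.linspace(0.0, 20.0, 20001)            # truncation horizon T = 20
J = discounted_cost(np.ones_like(ts), ts, lam)  # exact limit value is 1
```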

SLIDE 7

Introduction and motivations

Two possible approaches

Open-loop control: the control is expressed as a function of time t (necessary conditions: Pontryagin Minimum Principle; or direct methods, e.g. gradient methods). Remark: it cannot take into account errors in the real state of the system due to model errors or external disturbances.

Feedback control: the control is expressed as a function of the system state (Dynamic Programming, Model Predictive Control). Remark: robust to external perturbations.

SLIDE 8

Hamilton-Jacobi-Bellman

SLIDE 9

Hamilton-Jacobi-Bellman

Value function:
v(x) := inf_{u(·) ∈ U} J(x, u(·))

The value function is the unique viscosity solution of the Bellman equation associated with the control problem via Dynamic Programming.

Dynamic Programming Principle:
v(x) = min_{u ∈ U} { ∫_{t0}^t L(y_x(s), u(s)) e^(−λs) ds + v(y_x(t)) e^(−λt) }

Hamilton-Jacobi-Bellman equation:
λ v(x) + max_{u ∈ U} { −F(x, u) · ∇v(x) − L(x, u) } = 0, x ∈ Ω.

SLIDE 10

Hamilton-Jacobi-Bellman

Feedback Control

Given v(x) for any x ∈ R^n, we define the feedback map
u*(x) = arg min_{u ∈ U} [ F(x, u) · ∇v(x) + L(x, u) ],
evaluated along the optimal trajectory y*(t), which solves
ẏ*(t) = F(y*(t), u*(y*(t))), t ∈ (t0, ∞), y*(t0) = x.

Technical difficulties: the bottleneck is the approximation of the value function v; nevertheless this remains the main goal, since v allows us to recover feedback controls in a rather simple way.
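Once v (or an approximation of it) is available, the pointwise arg min can be evaluated over a discretized control set. In the sketch below v and its gradient are given analytically for a toy problem, an assumption made so the example is self-contained; with F(x, u) = u, L(x, u) = x² + u² and v(x) = x² (illustrative choice), the cost at x = 1 is (u + 1)², so the minimizer is u = −1.

```python
import numpy as np

# Feedback synthesis: u*(x) = argmin_{u in U} [ F(x,u) * v'(x) + L(x,u) ],
# with the control set U discretized. v' is a hypothetical analytic gradient.

def feedback(x, v_grad, F, L, controls):
    """Pick the minimizing control over a discretized control set U."""
    costs = [F(x, u) * v_grad(x) + L(x, u) for u in controls]
    return controls[int(np.argmin(costs))]

v_grad = lambda x: 2.0 * x            # gradient of the illustrative v(x) = x^2
F = lambda x, u: u
L = lambda x, u: x**2 + u**2
controls = np.linspace(-2.0, 2.0, 401)

u_star = feedback(1.0, v_grad, F, L, controls)   # minimizer of (u + 1)^2
```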

SLIDE 11

Hamilton-Jacobi-Bellman

Numerical computation of the value function

The bottleneck of the DP approach is the computation of the value function, since this requires solving a nonlinear PDE in high dimension. This is a challenging problem due to the huge number of nodes involved and to the singularities of the solution. Another important issue is the choice of the computational domain. Several numerical schemes are available: finite differences, finite volumes, and semi-Lagrangian schemes (obtained from a discrete version of the Dynamic Programming Principle).

SLIDE 12

Hamilton-Jacobi-Bellman

Semi-Lagrangian discretization of HJB

These schemes are based on the direct discretization of the directional derivative F(x, u) · ∇v(x).

Continuous version:
λ v(x) = −max_{u ∈ U} { −F(x, u) · Dv(x) − L(x, u) }

Semi-discrete approximation (value iteration), obtained by a time discretization of the continuous control problem (e.g. via the Euler method):
V^(k+1)(x) = min_{u ∈ U} { e^(−λ∆t) V^k(x + ∆t F(x, u)) + ∆t L(x, u) }

SLIDE 13

Hamilton-Jacobi-Bellman

Semi-Lagrangian discretization of HJB

Fully discrete SL Value Iteration (VI) scheme: V^(k+1) = T(V^k), with, for i = 1, . . . , N,
[T(V^k)]_i = min_{u ∈ U} { β I1[V^k](x_i + ∆t F(x_i, u)) + ∆t L(x_i, u) },
where β = e^(−λ∆t) and I1 denotes interpolation on the grid.

- Fixed grid in a bounded domain Ω ⊂ R^n, mesh size ∆x, nodes {x_1, . . . , x_N}.
- Stability for large time steps ∆t.
- Error estimates: Falcone/Ferretti '97.
- Slow convergence, since β = e^(−λ∆t) → 1 as ∆t → 0.
- PROBLEM: find a reasonable computational domain.

SLIDE 14

Hamilton-Jacobi-Bellman

Value Iteration for infinite horizon optimal control (VI)

Require: mesh G, ∆t, initial guess V^0, tolerance ε.
1: while ‖V^(k+1) − V^k‖ ≥ ε do
2:   for x_i ∈ G do
3:     V_i^(k+1) = min_{u ∈ U} { e^(−λ∆t) I1[V^k](x_i + ∆t F(x_i, u)) + ∆t L(x_i, u) }
4:   end for
5:   k = k + 1
6: end while

Remarks: the VI algorithm converges (slowly) for any initial guess V^0, and an error estimate can be provided.
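The VI loop above can be sketched in code for a 1D problem. The dynamics F(x, u) = u, running cost L(x, u) = x² + u², grid and control discretization are illustrative assumptions, and np.interp plays the role of the piecewise-linear interpolation operator I1; characteristics leaving the grid are clipped back to the domain, a simple stand-in for boundary treatment.

```python
import numpy as np

# Semi-Lagrangian value iteration V^{k+1}(x_i) =
#   min_u { e^{-λ∆t} I1[V^k](x_i + ∆t F(x_i,u)) + ∆t L(x_i,u) }
# on a 1D grid; problem data are toy assumptions, not the talk's example.

def value_iteration(xs, controls, F, L, lam=1.0, dt=0.05, tol=1e-6, max_iter=5000):
    beta = np.exp(-lam * dt)           # discount factor per time step
    V = np.zeros_like(xs)              # initial guess V^0 = 0
    for _ in range(max_iter):
        candidates = []
        for u in controls:
            # Follow the characteristic one Euler step, clip to the domain,
            # and evaluate V^k there by linear interpolation (I1).
            x_next = np.clip(xs + dt * F(xs, u), xs[0], xs[-1])
            candidates.append(beta * np.interp(x_next, xs, V) + dt * L(xs, u))
        V_new = np.min(candidates, axis=0)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new
    return V

xs = np.linspace(-1.0, 1.0, 101)
controls = np.linspace(-1.0, 1.0, 21)
V = value_iteration(xs, controls, lambda x, u: u, lambda x, u: x**2 + u**2)
```

Since staying at the origin with u = 0 incurs zero running cost, the computed value function vanishes at x = 0 and grows toward the boundary.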

SLIDE 15

Model Predictive Control

SLIDE 16

Model Predictive Control

Dynamics:
ẏ(t) = F(y(t), u(t)), t > 0, y(0) = y_0.

Infinite horizon cost functional:
J_∞(u(·)) = ∫_0^∞ L(y(t; u, y_0)) e^(−λt) dt

Finite horizon cost functional:
J_N(u(·)) = ∫_{t0}^{t0 + N∆t} L(y(t; u, y_0)) e^(−λt) dt, N ∈ N.
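The receding-horizon idea can be sketched as follows: at each time solve the finite horizon problem, apply only the first control, then shift and re-solve. The dynamics, running cost, control set and parameters are toy assumptions, and the inner finite horizon problem is solved here by exhaustive search over piecewise-constant control sequences, a stand-in for a proper optimizer.

```python
import numpy as np
from itertools import product

# MPC loop for min J_N(u): exhaustive search over control sequences of
# length N, applying only the first control (receding horizon).
# All problem data below are illustrative assumptions.

def mpc_step(x, controls, F, L, N, dt, lam):
    """Return the first control of the best length-N piecewise-constant sequence."""
    best_cost, best_u0 = np.inf, None
    for seq in product(controls, repeat=N):
        y, cost = x, 0.0
        for k, u in enumerate(seq):
            cost += np.exp(-lam * k * dt) * L(y, u) * dt  # discounted running cost
            y = y + dt * F(y, u)                          # explicit Euler step
        if cost < best_cost:
            best_cost, best_u0 = cost, seq[0]
    return best_u0

F = lambda y, u: u                 # toy dynamics: pure integrator
L = lambda y, u: y**2 + 0.1 * u**2 # toy running cost
controls = np.linspace(-2.0, 2.0, 9)
dt, lam, N = 0.1, 1.0, 4

y, traj = 1.0, [1.0]
for _ in range(30):                # closed loop: apply u0, advance, re-solve
    u0 = mpc_step(y, controls, F, L, N, dt, lam)
    y = y + dt * F(y, u0)
    traj.append(y)
```

The closed-loop state is driven toward the origin even though each subproblem only looks N steps ahead.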

SLIDE 17

Model Predictive Control

MPC trajectories (from L. Grüne, J. Pannek, NMPC):
black = prediction (obtained with an open-loop optimization); red = MPC closed loop, y(t_n) = y_{μN}(t_n).

SLIDE 18

Model Predictive Control

MPC method: the infinite time horizon problem is solved by means of the iterative solution of finite horizon (N ≥ 2) optimal control problems:
min J_N(u) s.t. u ∈ U^N.

Feedback control: μ_N(y(t)) = u*(t). We obtain a closed-loop representation by applying the map μ_N:
ẏ = F(y(t), μ_N(y(t))).

Optimal trajectories: y*(t_0), . . . , y*(t_0 + N∆t); optimal controls: u*(t_0), . . . , u*(t_0 + N∆t).

SLIDE 19

Model Predictive Control

Advantages and disadvantages

HJB PRO
1. Valid for all problems in any dimension; a-priori error estimates in L∞.
2. SL schemes can work on structured and unstructured grids.
3. The computation of feedbacks is almost built in.

HJB CONS
1. "Curse of dimensionality".
2. Computational cost and huge memory allocations.

MPC PRO
1. Easy to implement, short computational time.
2. It can be applied to high-dimensional problems.
3. Feedback controls.

MPC CONS
1. Approximate feedback only along one trajectory.
2. With a short horizon → sub-optimal trajectory.

SLIDE 20

Model Predictive Control

IDEA: combine the advantages of the two methods to obtain an efficient algorithm. The approximation of the HJB equation requires restricting the computational domain Ω to a subset of R^n, and the choice of this domain is otherwise arbitrary.

GOAL: find a reasonable way to compute Ω.
QUESTION: how to compute the computational domain?
SOLUTION: an inexact MPC solver may provide a reference trajectory for the optimal control problem.

SLIDE 21

Model Predictive Control

Localized DP Algorithm

Start: initialization.
Step 1: Solve the MPC problem and compute y^MPC_{y0} for a given initial condition y_0.
Step 2: Compute the distance from y^MPC_{y0} via the Eikonal equation.
Step 3: Select the tube Ω_ρ of points within distance ρ of y^MPC_{y0}.
Step 4: Compute the constrained value function v_tube in Ω_ρ via HJB.
Step 5: Compute the optimal feedbacks and trajectory using v_tube.
End.

A posteriori error estimate for the choice of ρ:
‖y*(t) − y^MPC(t)‖ ≤ C e^(λT) σ ζ(y^MPC(t)) for all t,
where ζ is a perturbation which is independent of y*, and C is a computable constant.
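Steps 2 and 3 can be sketched as follows. For simplicity the distance field is computed directly from the trajectory sample points instead of solving the Eikonal equation, and the straight-line reference trajectory is a hypothetical stand-in for y^MPC; both are assumptions made for the sketch.

```python
import numpy as np

# Tube selection around a reference trajectory: mark the grid nodes whose
# distance to the (sampled) trajectory is at most rho. HJB would then be
# solved only on these nodes (the tube Ω_ρ).

def select_tube(grid_x, grid_y, traj, rho):
    """Boolean mask of 2D grid nodes within distance rho of the trajectory."""
    X, Y = np.meshgrid(grid_x, grid_y, indexing="ij")
    nodes = np.stack([X.ravel(), Y.ravel()], axis=1)           # (num_nodes, 2)
    d = np.linalg.norm(nodes[:, None, :] - traj[None, :, :], axis=2)
    dist = d.min(axis=1)                                       # distance to traj
    return (dist <= rho).reshape(X.shape)

# Hypothetical reference trajectory: segment from (0, 0) to (1, 1)
traj = np.linspace([0.0, 0.0], [1.0, 1.0], 50)
gx = gy = np.linspace(-0.5, 1.5, 41)
mask = select_tube(gx, gy, traj, rho=0.2)
```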

SLIDE 22

Numerical Tests

SLIDE 23

Numerical Tests

Heat equation

y_t(t, x) − ν∆y(t, x) = 0, a.e. in Q,
ν ∂y/∂n(t, s) = Σ_{i=1}^m u_i(t) b_i(s), a.e. on Σ,
y(0, x) = y°(x), a.e. in Ω,   (1)

where Q = (0, ∞) × Ω, Σ = (0, ∞) × Γ, Γ = ∂Ω, b_i : Γ → R are given shape functions, and ν ∈ R is a fixed parameter.

Cost functional:
min_{u ∈ U} J(u) = min_{u ∈ U} (1/2) ∫_0^∞ e^(−λt) ‖y(t) − ȳ‖²_{L²(Ω)} dt + (1/2) Σ_{i=1}^m σ_i ‖u_i‖²_{L²(0,∞)},
where U := {u ∈ L²(0, +∞; R^m) | u_a ≤ u(t) ≤ u_b ∀t}.

Goal: find a control which steers the trajectories to a desired state.

SLIDE 24

Numerical Tests

Finite Element Formulation

Considering the weak formulation and using an FE scheme with piecewise linear basis functions ϕ_i, i = 1, . . . , N, we arrive at
M y′(t) = ν A y(t) + B u(t) for t ∈ (0, T], y(0) = y°,   (2)
where A = (a(ϕ_j, ϕ_i))_{1≤i,j≤N}, B = (⟨b_j, ϕ_i⟩_{L²(Γ)})_{1≤i≤N, 1≤j≤m}, y(t) = (y_i^N(t))_{1≤i≤N}, y° = (y_i^N)_{1≤i≤N}.

Proper Orthogonal Decomposition (Kunisch, Hinze, Volkwein, ...): we have to solve an optimal control problem for a large system of ODEs → POD allows reducing the number of variables needed to approximate the partial differential equation.
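The semi-discrete system (2) can be advanced in time, e.g. with implicit Euler, which gives (M − ∆t ν A) y^(k+1) = M y^k + ∆t B u(t_k). The sketch below uses 1D piecewise-linear mass and stiffness matrices on [0, 1] as a stand-in for the talk's 2D setup; the control vector B, the control u ≡ 1 and all parameter values are illustrative assumptions.

```python
import numpy as np

# Implicit Euler for M y' = ν A y + B u(t) with 1D P1 finite elements.
# A = -K (negative stiffness), so M y' = ν A y is the semi-discrete heat
# equation; the Neumann control acts at the right boundary x = 1.

N, nu, dt = 50, 1.0, 0.01
h = 1.0 / N

M = (np.diag(np.full(N + 1, 2 * h / 3))
     + np.diag(np.full(N, h / 6), 1) + np.diag(np.full(N, h / 6), -1))
M[0, 0] = M[-1, -1] = h / 3                      # boundary mass entries

K = (np.diag(np.full(N + 1, 2 / h))
     - np.diag(np.full(N, 1 / h), 1) - np.diag(np.full(N, 1 / h), -1))
K[0, 0] = K[-1, -1] = 1 / h                      # boundary stiffness entries
A = -K

B = np.zeros((N + 1, 1)); B[-1, 0] = 1.0         # boundary control at x = 1
u = lambda t: np.array([1.0])                    # constant control u ≡ 1
y = np.zeros(N + 1)                              # initial condition y° ≡ 0

for k in range(200):                             # integrate up to T = 2
    y = np.linalg.solve(M - dt * nu * A, M @ y + dt * (B @ u(k * dt)))
```

With unit influx through the boundary, the total heat ∫ y dx grows at unit rate, so after T = 2 the mean value of y is close to 2, with the controlled end warmest.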

SLIDE 25

Numerical Tests

Proper Orthogonal Decomposition and SVD

Given snapshots Y = [y_1, . . . , y_n] ∈ R^(N×n) with y_i = y(t_i; ·), we look for an orthonormal basis {ψ_i}_{i=1}^ℓ with ℓ ≤ min{n, N} such that

J(ψ_1, . . . , ψ_ℓ) = Σ_{j=1}^n α_j ‖ y_j − Σ_{i=1}^ℓ ⟨y_j, ψ_i⟩ ψ_i ‖²_{L²(Ω)} = Σ_{i=ℓ+1}^d σ_i²

reaches a minimum, where {α_j}_{j=1}^n ⊂ R_+:

min J(ψ_1, . . . , ψ_ℓ) s.t. ⟨ψ_i, ψ_j⟩ = δ_ij.

Singular Value Decomposition: Y = ΨΣV^T. For ℓ ∈ {1, . . . , d = rank(Y)}, {ψ_i}_{i=1}^ℓ is called the POD basis of rank ℓ.

Ansatz: y(x, t) ≈ Σ_{i=1}^ℓ y_i^ℓ(t) ψ_i(x).
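The SVD construction of the POD basis can be sketched directly. The synthetic snapshot matrix below (two spatial modes with time-varying coefficients, hence exact rank 2) is an assumption, chosen so that a rank-2 POD basis captures the data essentially exactly; in the talk the snapshots come from the FE heat equation.

```python
import numpy as np

# POD basis of rank ell via SVD: the left singular vectors of the snapshot
# matrix Y = Psi Sigma V^T. Snapshot data here are synthetic assumptions.

def pod_basis(Y, ell):
    """Return the first ell left singular vectors (POD basis) and all singular values."""
    Psi, S, _ = np.linalg.svd(Y, full_matrices=False)
    return Psi[:, :ell], S

# Synthetic snapshots: two spatial modes with time-dependent coefficients
x = np.linspace(0.0, 1.0, 200)
t = np.linspace(0.0, 1.0, 40)
Y = np.array([np.sin(np.pi * x) * np.exp(-ti)
              + 0.1 * np.sin(3 * np.pi * x) * ti for ti in t]).T   # (N, n)

Psi, S = pod_basis(Y, ell=2)
Y_proj = Psi @ (Psi.T @ Y)          # projection onto the rank-2 POD subspace
err = np.linalg.norm(Y - Y_proj)    # equals the discarded singular values
```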

SLIDE 26

Numerical Tests

Reduced dynamics: the reduced system can be expressed as
ẏ(t) = F(y(t), u(t)) for t > 0, y(0) = y_0,
with F(y, u) = M^(−1)(A y + B u + f) ∈ R^ℓ for y ∈ R^ℓ, u ∈ R^m.

Reduced cost functional:
J^ℓ(u) = ∫_0^∞ e^(−λt) L(y(t), u(t)) dt,
with L(y, u) = (1/2) ( (y − ȳ)^T M (y − ȳ) + σ u^T u ) for (y, u) ∈ R^ℓ × R^m.

SLIDE 27

Numerical Tests

Test 1

Dynamics and cost functional: the heat equation (1) with the cost functional J(u) of Slide 23.

Parameters:
- Snapshots: Q = [0, 1] × Ω, Ω = [0, 1]², ν = 1, µ = 0, m = 1, y°(x) = 0 · χ_[0,0.5] + 0.5 · χ_[0.5,1], ∆x = 0.5 and ∆t = 0.02, u(t) ≡ 1, ℓ = 2 (number of POD basis functions).
- MPC: ∆t_MPC = 0.02, N = 5, λ = 1, u(t) ∈ [−4, 4], ȳ ≡ 1.
- HJB: ∆t_HJB = 0.05, ∆t_traj = 0.02, number of control values = 21, number of control values along the trajectory = 81.

SLIDE 28

[Figure]

SLIDE 29

Numerical Tests

Figure: optimal trajectories for different initial profiles.

λ = 1             MPC N=5   HJB in Ω_ρ   HJB in Ω
CPU time [s]      37        72           198
J(u)              0.05      0.05         0.05
‖y(T) − ȳ‖_L²    0.1       0.02         0.01

SLIDE 30

Numerical Tests

Test 2

Parameters: same as Test 1, with a different initial profile: y°(x) = sin(πx).

SLIDE 31

Numerical Tests

Figure: optimal trajectories for different initial profiles.

λ = 1             MPC N=5   HJB in Ω_ρ   HJB in Ω
CPU time [s]      72        88           205
J(u)              0.013     0.012        0.012
‖y(T) − ȳ‖_L²    0.02      0.003        0.003

SLIDE 32

Numerical Tests

CONCLUSIONS
- A local version of the dynamic programming approach for infinite horizon optimal control problems.
- The coupling of MPC and DP methods can produce rather accurate results.
- Computational speed-up.

FUTURE DIRECTIONS
- Advection term (using more POD basis functions).
- A posteriori error estimate including the advection term in the equation.

SLIDE 33

References

- A. Alla, G. Fabrini, M. Falcone, Coupling MPC and DP methods for an efficient solution of optimal control problems, Conference Proceedings of IFIP 2015.
- M. Bardi, I. Capuzzo Dolcetta, Optimal Control and Viscosity Solutions of Hamilton-Jacobi-Bellman Equations, Birkhäuser, 1997.
- G. Fabrini, M. Falcone, S. Volkwein, Coupling MPC and HJB for the computation of POD-based feedback laws, to appear in ENUMATH proceedings, 2017.
- M. Falcone, R. Ferretti, Semi-Lagrangian Approximation Schemes for Linear and Hamilton-Jacobi Equations, SIAM, 2013.
- L. Grüne, J. Pannek, Nonlinear Model Predictive Control, Springer London, 2011.
- V. Gaitsgory, L. Grüne, N. Thatcher, Stabilization with discounted optimal control, 2014.
- F. Tröltzsch, S. Volkwein, POD a-posteriori error estimates for linear-quadratic optimal control problems, Computational Optimization and Applications, 44:83-115, 2009.
- S. Volkwein, Model Reduction using Proper Orthogonal Decomposition, Lecture Notes, University of Konstanz, 2013.

THANK YOU FOR YOUR ATTENTION