

SLIDE 1

4SC000 Q2 2017-2018

Optimal Control and Dynamic Programming

Duarte Antunes

SLIDE 2

Introduction

  • We have seen in the two previous lectures (5, 6) that for stage decision problems with quadratic costs and linear models we can compute the costs-to-go of the DP algorithm and the optimal policy analytically.

  • However, linear models with quadratic costs are one of the few cases where this happens. In fact, we typically cannot run DP analytically. Example of slide 11, lecture 5 (shown below):

  • In the next three lectures we will discuss three alternative methods and their pros and cons:

  • Discretization (lecture 7)
  • Static optimization (lectures 7 and 8)
  • Approximate dynamic programming (lecture 9)

Model: $x_{k+1} = x_k + u_k$, $k \in \{0, 1\}$

Cost: $\sum_{k=0}^{1} x_k^2 + u_k^2 + g_2(x_2)$, with $g_2(x_2) = e^{x_2}$
SLIDE 3

Outline

  • Discretization
  • Introduction to static optimization
SLIDE 4

Discretization

Stage decision problems → Discrete optimization problems (state and input discretization)

  • Stage decision problems can be approximated by a process called discretization (also referred to as sampling or quantization).
SLIDE 5

Example

Model: $x_{k+1} = x_k + u_k$, $k \in \{0, 1\}$

Cost: $\sum_{k=0}^{1} x_k^2 + u_k^2 + g_2(x_2)$, with $g_2(x_2) = x_2^2$

For this problem we obtained (see lecture 5) the optimal policy and the optimal costs-to-go

$u_0 = -\frac{3}{5}x_0$, $u_1 = -\frac{1}{2}x_1$, $J_0(x_0) = \frac{8}{5}x_0^2$, $J_1(x_1) = \frac{3}{2}x_1^2$

We will now recover these functions using an alternative method (discretization) and then apply it to the problem with the non-quadratic terminal cost $g_2(x_2) = e^{x_2}$.
SLIDE 6

Example: discretization

Stage-decision problem → Discrete optimization problem (discretization)

Original problem: $x_{k+1} = x_k + u_k$, $k \in \{0, 1\}$, with cost $\sum_{k=0}^{1} x_k^2 + u_k^2 + g_2(x_2)$

Discretized variables: $x_k = \delta \bar{x}_k$, $\bar{x}_k \in \{-N, -N+1, \ldots, N\}$, and $u_k = \delta \bar{u}_k$, $\bar{u}_k \in \{-M, -M+1, \ldots, M\}$

Discretized model: $\bar{x}_{k+1} = \bar{x}_k + \bar{u}_k$, $k \in \{0, 1\}$

Discretized cost: $\delta^2 \left( \sum_{k=0}^{1} \bar{x}_k^2 + \bar{u}_k^2 \right) + g_2(\delta \bar{x}_2)$
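As an aside, the discretized problem can be solved numerically; below is a minimal MATLAB sketch (my own illustration, not the course code, with the grid sizes of slide 7 assumed) that runs the DP recursion on this grid and checks the result against the analytic cost-to-go at an interior point:

delta = 0.05; N = 100; M = 50;      % grid parameters (values from slide 7)
xbar = -N:N;                        % discretized state indices
ubar = -M:M;                        % discretized input indices
g2 = @(x2) x2.^2;                   % terminal cost; use @(x2) exp(x2) for the variant
Jnext = g2(delta*xbar);             % J2 on the grid
for k = [1 0]                       % backward recursion over stages 1, 0
    J = inf(size(xbar)); mu = zeros(size(xbar));
    for i = 1:numel(xbar)
        xnext = xbar(i) + ubar;                 % discretized dynamics
        ok = xnext >= -N & xnext <= N;          % keep transitions inside the grid
        cost = delta^2*(xbar(i)^2 + ubar(ok).^2) + Jnext(xnext(ok) + N + 1);
        [J(i), j] = min(cost);
        uok = ubar(ok); mu(i) = delta*uok(j);   % optimal input at this stage
    end
    Jnext = J;                      % J1 after the first pass, J0 after the second
end
Jnext(N + 1 + round(1/delta))       % ~1.6, matching J0(1) = 8/5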

SLIDE 7

Results

[Figure: discretized solutions (markers) vs. optimal solutions (curves) for $\mu_0(x_0)$, $\mu_1(x_1)$, $J_0(x_0)$, $J_1(x_1)$]

When $g_2(x_2) = x_2^2$ we indeed recover (approximately) the optimal policy and costs-to-go:

$u_0 = -\frac{3}{5}x_0$, $u_1 = -\frac{1}{2}x_1$, $J_0(x_0) = \frac{8}{5}x_0^2$, $J_1(x_1) = \frac{3}{2}x_1^2$, $J_2(x_2) = x_2^2$

Discretization parameters: $N = 100$, $M = 50$, $\delta = 0.05$
SLIDE 8

Results

This encourages us to think that when $g_2(x_2) = e^{x_2}$ the optimal policy can also be (approximately) obtained with the same method. This statement can be formalized (see LaValle's book) but we do not pursue this here.

[Figure: discretized $\mu_0(x_0)$, $\mu_1(x_1)$, $J_0(x_0)$, $J_1(x_1)$, $J_2(x_2)$ for the exponential terminal cost]

Discretization parameters: $N = 100$, $M = 50$, $\delta = 0.05$
SLIDE 9

Results

[Figure: zoom of $J_0(x_0)$, $\mu_0(x_0)$, $\mu_1(x_1)$ around $x_0 = 1$]

Note that the cost is $J_0(x_0) \approx 2.64$ if the initial state is $x_0 = 1$, and corresponds to an initial control $u_0 = -0.7$, leading to state $x_1 = 0.3$ and in turn to the control $u_1 = -0.45$. We will (approximately) obtain these values later with a different method!
SLIDE 10

Discussion

  • We discuss next how to extend the discretization method to the general case where the state belongs to $\mathbb{R}^n$, with the help of the following example.

  • Consider the following toy problem: compute the force to move a unitary mass 1 meter along a flat surface from rest to rest in 1 second with minimum energy:

$\min \int_0^1 u(t)^2 dt$ s.t. $\ddot{z}(t) = u(t)$, $z(0) = 0$, $z(1) = 1$, $\dot{z}(0) = \dot{z}(1) = 0$

  • Later in the course we will learn the tools to find an optimal solution to this problem, which is $u(t) = 6 - 12t$, resulting in $v(t) = 6t - 6t^2$ and $z(t) = 3t^2 - 2t^3$.

  • To convert from continuous to discrete time we also need temporal discretization.
SLIDE 11

Discretization

Continuous-time optimal control problems → (temporal discretization) → Stage decision problems → (state and input discretization) → Discrete optimization problems
SLIDE 12

Outline

  • Discretization
  • Digital control and temporal discretization
  • State and input discretization
  • Application: minimum energy control of a vehicle

  • Introduction to static optimization
SLIDE 13

Digital control

[Diagram: digital control loop. The control law / digital algorithm computes control decisions $u_k = \mu_k(x_k)$ from the sampled state $x_k$; a D/A converter and the actuators apply the actuation $u(t) = u_k$, $t \in [t_k, t_{k+1})$, to the physical system $\dot{x}(t) = f_c(t, x(t), u(t))$; the sensors and an A/D converter return the sampled state $x_k = x(t_k)$]

Discretization (the system "seen by the controller"): $x_{k+1} = f_k(x_k, u_k)$, with sampling period $\tau$ and sampling times $t_k := k\tau$
SLIDE 14

Temporal discretization

  • Solve the differential equation $\dot{x}(t) = f_c(t, x(t), u(t))$ in $t \in [t_k, t_{k+1})$ for an initial condition $x(t_k) = x_k$ and a constant control input $u(t) = u_k$, and evaluate at $t = t_{k+1}$ to obtain $x_{k+1} = x(t_{k+1}) = f(x_k, u_k)$.

  • For example, if $\dot{x}(t) = Ax(t) + Bu(t)$, use the variation of constants formula

$x(t) = e^{A(t - t_k)} x(t_k) + \int_{t_k}^{t} e^{A(t - s)} B u(s) ds$

to conclude that

$x_{k+1} = A_d x_k + B_d u_k$, with $A_d = e^{A\tau}$ and $B_d = \int_0^\tau e^{As} ds \, B$

[Figure: timeline $t_0, t_1, \ldots, t_k, t_{k+1}, \ldots, t_h$ with $x_k = x(t_k)$ and $x_{k+1} = x(t_{k+1}) = f(x_k, u_k)$]
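As a side note (my own sketch, not from the slides), $A_d$ and $B_d$ can be computed in MATLAB with a single matrix exponential of an augmented matrix, since the top blocks of expm([A B; 0 0]*tau) are exactly $A_d$ and $B_d$:

% Sketch: exact discretization of xdot = A x + B u under zero-order hold.
% expm([A B; 0 0]*tau) = [Ad Bd; 0 I], with Ad = e^(A tau), Bd = int_0^tau e^(A s) ds B.
A = [0 1; 0 0]; B = [0; 1];          % double integrator of the next slide
tau = 0.2;
n = size(A,1); m = size(B,2);
E = expm([A B; zeros(m, n+m)] * tau);
Ad = E(1:n, 1:n)                     % expected: [1 tau; 0 1]
Bd = E(1:n, n+1:end)                 % expected: [tau^2/2; tau]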
SLIDE 15

Double integrator

Model: $\begin{bmatrix} \dot{z}(t) \\ \dot{v}(t) \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} z(t) \\ v(t) \end{bmatrix} + \begin{bmatrix} 0 \\ 1 \end{bmatrix} u(t)$, with $u(t) = u_k$, $t \in [t_k, t_{k+1})$, $z_k = z(t_k)$, $v_k = v(t_k)$

Discretized model ($t_{k+1} - t_k = \tau$, $K\tau = 1$):

$\begin{bmatrix} z_{k+1} \\ v_{k+1} \end{bmatrix} = \begin{bmatrix} 1 & \tau \\ 0 & 1 \end{bmatrix} \begin{bmatrix} z_k \\ v_k \end{bmatrix} + \begin{bmatrix} \tau^2/2 \\ \tau \end{bmatrix} u_k$

Cost: $\int_0^1 u(t)^2 dt = \sum_{k=0}^{K-1} \int_{t_k}^{t_{k+1}} u_k^2 dt = \tau \sum_{k=0}^{K-1} u_k^2$

Initial and terminal conditions: $z(0) = 0$, $z(1) = 1$, $\dot{z}(0) = \dot{z}(1) = 0$, i.e., $z_0 = v_0 = v_K = 0$, $z_K = 1$

Optimal solution (can be obtained by LQR):

$\begin{bmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{bmatrix} = \begin{bmatrix} A & -BR^{-1}B^\top (A^\top)^{-1} \\ 0 & (A^\top)^{-1} \end{bmatrix}^K$, $\lambda_0 = M_{12}^{-1} \begin{bmatrix} 1 \\ 0 \end{bmatrix}$, $u_k = -R^{-1}B^\top ((A^\top)^{-1})^{k+1} \lambda_0$
SLIDE 16

Solution

[Figure: for $\tau = 0.2$ and $\tau = 0.1$, the discretized $u_k$, $x_k$, $v_k$ (stems) vs. the continuous-time optimal $u(t)$, $x(t)$, $v(t)$ (curves) over time]
SLIDE 17

Solution

[Figure: the same comparison for $\tau = 0.05$ and $\tau = 0.025$; the discretized solution approaches the continuous-time one as $\tau$ decreases]
SLIDE 18

Discussion on temporal discretization

Cost function

  • In the problem formulation, the state and input variables may already be penalized at sampling times:

$\sum_{k=0}^{h-1} g_k(x(t_k), u(t_k)) + g_h(x(T)) = \sum_{k=0}^{h-1} g_k(x_k, u_k) + g_h(x_h)$

  • The cost may also result from discretizing a continuous-time cost function:

$\int_0^T g_c(t, x(t), u(t)) dt = \sum_{k=0}^{h-1} \underbrace{\int_{t_k}^{t_{k+1}} g_c(t, x(t), u(t)) dt}_{g_k(x_k, u_k)}$

  • We can also use the approximation $\int_0^T g_c(t, x(t), u(t)) dt \approx \sum_{k=0}^{h-1} \tau g_c(k\tau, x_k, u_k)$, as illustrated below.

Dynamic model

  • For non-linear systems exact discretization may be hard. A numerical method (e.g. Euler) is typically used: $x_{k+1} = x_k + \tau f(t_k, x_k, u_k)$

Temporal discretization

  • It is also useful in other contexts and we will exploit this later in the course.
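A quick numeric illustration of the last approximation (my own check, using the toy problem's optimal input $u(t) = 6 - 12t$, whose exact energy is $\int_0^1 u(t)^2 dt = 12$):

% Sketch: approximate int_0^T gc dt by the Riemann sum sum_k tau*gc(k*tau, x_k, u_k).
gc = @(t) (6 - 12*t).^2;            % running cost along u(t) = 6 - 12t
tau = 0.05; K = round(1/tau);
k = 0:K-1;
Japprox = sum(tau * gc(k*tau))      % ~12.06, vs. the exact value 12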
SLIDE 19

Discussion on discretization

  • For linear systems $\dot{x}(t) = Ax(t) + Bu(t)$ we need to compute matrix exponentials:

$x_{k+1} = A_d x_k + B_d u_k$, $A_d = e^{A\tau}$, $B_d = \int_0^\tau e^{As} ds \, B$

  • Recall how to compute the matrix exponential of a matrix $A$*:

  • 1. Find the spectral decomposition of $A$, i.e., $A = UJU^{-1}$, where $U = \begin{bmatrix} u_1 & u_2 & \ldots & u_n \end{bmatrix}$ is a matrix with (generalized) eigenvectors $u_i$ ($Au_i = \lambda_i u_i$) and $J$ is a block diagonal matrix with the eigenvalues in its diagonal. If $A$ has $n$ distinct eigenvalues then $J$ is diagonal, i.e., $J = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$.

  • 2. Then $e^{At} = U e^{Jt} U^{-1}$, where $e^{Jt}$ is easy to compute, e.g., if $J$ is diagonal: $e^{Jt} = \mathrm{diag}(e^{\lambda_1 t}, e^{\lambda_2 t}, \ldots, e^{\lambda_n t})$

*If you do not know how, you can easily find tutorials online, or check the slides of the system theory course (TU/e).
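A small numeric check of this recipe (my own sketch; MATLAB's expm is used as the reference):

% Sketch: e^(At) via the spectral decomposition, for a diagonalizable A.
A = [0 1; -2 -3];                    % eigenvalues -1 and -2 (distinct)
t = 0.5;
[U, J] = eig(A);                     % A = U*J*U^{-1}
eAt = U * diag(exp(diag(J)*t)) / U;  % e^(At) = U e^(Jt) U^{-1}
norm(eAt - expm(A*t))                % agrees up to rounding error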
SLIDE 20

Outline

  • Discretization
  • Digital control and temporal discretization
  • State and input discretization
  • Application: minimum energy control of a vehicle

  • Introduction to static optimization
SLIDE 21

State space and input space

For a general stage decision problem, $x_k \in \mathbb{R}^n$ and $u_k \in \mathbb{R}^m$:

[Diagram: Stage 0 (state space $X_0$, cost $g_0(x_0, u_0)$, transition $x_1 = f_0(x_0, u_0)$) → Stage 1 (state space $X_1$, cost $g_1(x_1, u_1)$, transition $x_2 = f_1(x_1, u_1)$) → … → Stage h−1 (state space $X_{h-1}$, cost $g_{h-1}(x_{h-1}, u_{h-1})$, transition $x_h = f_{h-1}(x_{h-1}, u_{h-1})$) → Stage h (state space $X_h$, terminal cost $g_h(x_h)$)]
SLIDE 22

Example: discretization

State discretization: $z_k = \delta_z \bar{z}_k$, $\bar{z}_k \in \{0, 1, 2, \ldots, n_z\}$, and $v_k = \delta_v \bar{v}_k$, $\bar{v}_k \in \{0, 1, 2, \ldots, n_v\}$; the pair is encoded into a single state index $\bar{x}_k = 1 + \bar{v}_k + (n_v + 1)\bar{z}_k$, with $\bar{x}_k \in \{1, 2, \ldots, n_x\}$ and $n_x = (n_z + 1)(n_v + 1)$

Input discretization: $\bar{u}_k \in \{1, 2, \ldots, n_u\}$, $\hat{u}_k = \bar{u}_k - 1 - (n_u - 1)/2$, $u_k = \delta_u \hat{u}_k$
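To make the index map concrete, a short sketch (helper names are mine, not from the slides):

% Sketch: encode (zbar, vbar) into the single index xbar = 1 + vbar + (nv+1)*zbar,
% and decode it back; the same floor/mod pattern appears in the Matlab code below.
nv = 20;
encode = @(zbar, vbar) 1 + vbar + (nv+1)*zbar;
decode_z = @(xbar) floor((xbar - 1)/(nv+1));
decode_v = @(xbar) mod(xbar - 1, nv+1);
xbar = encode(3, 1);                       % state (zbar, vbar) = (3, 1)
[decode_z(xbar), decode_v(xbar)]           % recovers [3 1]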
SLIDE 23

Example: model

$\begin{bmatrix} z_{k+1} \\ v_{k+1} \end{bmatrix} = \begin{bmatrix} 1 & \tau \\ 0 & 1 \end{bmatrix} \begin{bmatrix} z_k \\ v_k \end{bmatrix} + \begin{bmatrix} \tau^2/2 \\ \tau \end{bmatrix} u_k$

Choosing $\delta_v = \delta_u \tau$ and $\delta_z = \delta_u \tau^2/2$, the discretized model becomes integer-valued:

$\bar{v}_{k+1} = \bar{v}_k + \hat{u}_k$, $\bar{z}_{k+1} = \bar{z}_k + 2\bar{v}_k + \hat{u}_k$

Example: if $(\bar{z}_k, \bar{v}_k) = (3, 1)$ and $\hat{u}_k = 0$, then $(\bar{z}_{k+1}, \bar{v}_{k+1}) = (5, 1)$; in index form, $\bar{x}_k = 1 + \bar{v}_k + (n_v + 1)\bar{z}_k = 2 + (n_v + 1)3$ and $\bar{x}_{k+1} = 2 + (n_v + 1)5$

With this procedure we can obtain $\bar{x}_{k+1} = f(\bar{x}_k, \bar{u}_k)$.
SLIDE 24

Example: cost and terminal conditions

Cost: $\tau \sum_{k=0}^{K-1} u_k^2 = \tau \delta_u^2 \sum_{k=0}^{K-1} \hat{u}_k^2$, with $\hat{u}_k = \bar{u}_k - 1 - (n_u - 1)/2$

The terminal conditions can be imposed by making

$J_h(\bar{x}_h) = \begin{cases} 0 & \text{if } \bar{x}_h = 1 + 0 + (n_v + 1)n_z \\ \infty & \text{otherwise} \end{cases}$

Initial condition: $\bar{x}_0 = 1 + 0 + (n_v + 1)(0) = 1$
SLIDE 25

Solution

[Figure: discretized solutions for ($\tau = 0.2$, $\delta_u = 0.5$) and ($\tau = 0.2$, $\delta_u = 1$)]
SLIDE 26

Solution

[Figure: discretized solutions for ($\tau = 0.1$, $\delta_u = 0.1$) and ($\tau = 0.2$, $\delta_u = 0.1$)]
SLIDE 27

Solution

[Figure: discretized solutions for ($\tau = 0.05$, $\delta_u = 0.1$) and ($\tau = 0.05$, $\delta_u = 0.2$)]
SLIDE 28

Matlab code

Problem formulation

% parameters -------------------------------------------
dt = 0.05; K = round(1/dt);           % sampling period and horizon
du = 0.2;                             % input quantization step
dv = dt*du; dd = dt^2/2*du;           % induced velocity and position steps
nu = 2*(7/du)+1;                      % number of input levels
nv = round(2/dv); nd = round(1/dd);   % velocity and position grid sizes
nx = (nd+1)*(nv+1);                   % total number of discrete states
% ------------------------------------------------------
% formulation ------------------------------------------
D = floor( ((1:1:nx)'-1)/(nv+1))*ones(1,nu);            % position index of each state
V = mod((1:1:nx)'-1,nv+1)*ones(1,nu);                   % velocity index of each state
U = ones(nx,1)*(-floor((nu-1)/2):1:floor((nu-1)/2));    % centered input levels
V_1 = min(max(V + U,0),nv);           % next velocity index (saturated to the grid)
D_1 = min(max(D + 2*V + U,0),nd);     % next position index (saturated to the grid)
X_1 = V_1 + (nv+1)*D_1 +1;            % next state index
M = X_1;                              % transition table M(state, input)
C = ones(nx,1)*(-floor((nu-1)/2):1:floor((nu-1)/2)).^2; % stage costs ~ uhat^2
Jh = inf*ones(nx,1);
Jh(nd*(nv+1)+1) = 1;                  % terminal cost: only the goal state is allowed
%-------------------------------------------------------

Get policy with DP and run system

% DP ---------------------------------------------------
[mu,J] = dp( M, C, Jh, K);            % backward DP: policy mu and costs-to-go J
% obtain u and x ---------------------------------------
d = zeros(1,K+1); v = zeros(1,K+1); xbar = zeros(1,K+1);
u = zeros(1,K); ubar = zeros(1,K);
xbar(1) = 1; d(1) = 0; v(1) = 0;      % initial state (d, v) = (0, 0)
for k=1:K
    ubar(:,k) = mu(xbar(:,k),k);                    % optimal discrete input
    xbar(:,k+1) = M(xbar(:,k),ubar(:,k));           % simulate one step
    u(:,k) = (ubar(:,k)-(nu-1)/2-1)*du;             % recover the physical input
    d(:,k+1) = floor( (xbar(:,k+1)-1)/(nv+1) )*dd;  % decode the position
    v(:,k+1) = mod( (xbar(:,k+1)-1),nv+1 )*dv;      % decode the velocity
end
% ------------------------------------------------------
% cost --------------------------------------------------
dt*J(1,1)*du^2                        % total cost in physical units
% ------------------------------------------------------

SLIDE 29

Matlab code

DP function

function [u_,J_] = dp( M, C, Jh, h)
% Backward DP over h stages: M(i,j) is the next-state index and C(i,j) the
% stage cost for state i and input j; Jh is the terminal cost vector.
[n,~] = size(M);
J = Jh;
J_ = zeros(n,h); u_ = zeros(n,h);
ind = 1:size(M,2);
for k=h:-1:1
    for i=1:n
        [J_(i,k),~] = min( C(i,:)' + J(M(i,:)) );        % Bellman update
        c = ind(J_(i,k) == (C(i,:)' + J(M(i,:))));
        u_(i,k) = c(end);                                % pick a minimizing input
    end
    J = J_(:,k);
end

Plots

hsol = 0.0001; tsol = 0:hsol:1;
zsol = 3*tsol.^2-2*tsol.^3;           % analytic optimal position
vsol = 6*tsol - 6*tsol.^2;            % analytic optimal velocity
usol = 6-12*tsol;                     % analytic optimal input
figure(1)
kvec = [0:K]/K;
subplot(1,3,1)
stem(kvec(1:end-1),u), hold on, plot(tsol,usol,'g')
xlabel('time'), legend({'u_k','u(t)'})
set(gca,'FontSize',18), grid on
subplot(1,3,2)
stem(kvec,d), hold on, plot(tsol,zsol,'g')
xlabel('time'), legend({'x_k','x(t)'})
set(gca,'FontSize',18), grid on
subplot(1,3,3)
stem(kvec,v), hold on, plot(tsol,vsol,'g')
xlabel('time'), legend({'v_k','v(t)'})
set(gca,'FontSize',18), grid on
set(1,'Position',[9 507 1426 293])

SLIDE 30

Outline

  • Discretization
  • Digital control and temporal discretization
  • State and input discretization
  • Application: minimum energy control of a vehicle

  • Introduction to static optimization
SLIDE 31

Problem formulation

Consider that we wish to minimize the energy of a vehicle traveling from rest at point A to rest at point B, a distance D away, in a given time H, by choosing the engine torque. A simple power model, considered e.g. in [1, page 5], is the following:

$P_w = \underbrace{m v(t) \dot{v}(t)}_{\text{inertial power}} + \underbrace{m g v(t) \sin(\alpha(d(t)))}_{\text{gravitational power}} + \underbrace{c_0 m g v(t) \cos(\alpha(d(t)))}_{\text{rolling losses}} + \underbrace{c_2 v(t)^3}_{\text{aerodynamic losses}}$

$P_w = P_m + P_k$ (motor power plus brake power)

where $v$ is the velocity, $d$ the distance from A, $\alpha$ the road angle, $m$ the mass, $g$ the gravitational acceleration, and $c_0$, $c_2$ constants.

[1] Optimal Control of Hybrid Vehicles, Bram de Jager, Thijs van Keulen, John Kessels, Springer, 2013
SLIDE 32

Problem formulation

The energy is obtained by integrating the power $P$, which is a function of the state (the velocity $v$ is proportional to the rotational velocity $\omega$) and of the input (the motor torque $T$):

$J = \int_0^H P(T(t), v(t)) dt$

[Figure: example ([1]) of the electric power as a function of the electric machine torque and rotational velocity]

Dynamic model (derived from the power model), with state $x(t) = \begin{bmatrix} d(t) \\ v(t) \end{bmatrix}$ and input $T(t)$ (motor torque):

$\dot{d}(t) = v(t)$, $\dot{v}(t) = -g \sin(\alpha(d(t))) - c_0 g \cos(\alpha(d(t))) 1_{v>0} - c_2 v(t)^2 + c_1 T(t)$

where $c_1$ is a constant and $1_{v>0} = 1$ if $v > 0$, $0$ if $v = 0$.
SLIDE 33

Approach

The approach is similar to the previous example:

  • 1. Discretize the plant using the Euler approximation (a sketch follows below)
  • 2. Discretize the state and input spaces as before
  • 3. Compute the dynamic model in the discretized space

With $x_k = x(t_k)$, $T(t) = u_k$ for $t \in [t_k, t_{k+1})$, $t_k = k\tau$:

$\underbrace{\begin{bmatrix} d_{k+1} \\ v_{k+1} \end{bmatrix}}_{x_{k+1}} = \underbrace{\begin{bmatrix} d_k + \tau v_k \\ v_k + \tau(-g \sin(\alpha(d_k)) - c_0 g \cos(\alpha(d_k)) 1_{v_k>0} - c_2 v_k^2 + c_1 u_k) \end{bmatrix}}_{f(x_k, u_k)}$

(Differently from the previous case, an approximation is now needed.)
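Step 1 in code, as a sketch (all parameter values below are illustrative placeholders, not from the slides):

% Sketch: Euler-discretized vehicle dynamics x_{k+1} = f(x_k, u_k), x = [d; v].
g = 9.81; c0 = 0.01; c2 = 0.001; c1 = 0.1; tau = 0.1;   % placeholder values
alpha = @(d) 0*d;                     % road angle profile (flat road assumed)
f = @(x, u) [x(1) + tau*x(2);
             x(2) + tau*(-g*sin(alpha(x(1))) ...
                         - c0*g*cos(alpha(x(1)))*(x(2) > 0) ...
                         - c2*x(2)^2 + c1*u)];
x = f([0; 0], 50)                     % one Euler step from rest with torque u = 50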
SLIDE 34

Approach

  • 4. Discretize the cost (assuming for simplicity $H = K\tau$):

$J = \int_0^H P(T(t), v(t)) dt \approx \tau \sum_{k=0}^{K-1} P(u_k, v_k)$

  • 5. Set the terminal cost such that only states close to the desired final state (distance $D$ and zero velocity) can be reached (note that due to the approximations it might not be possible to achieve the final state exactly):

$J_H(x_H) = \begin{cases} 0 & \text{if } \left\| x_H - \begin{bmatrix} D \\ 0 \end{bmatrix} \right\|^2 < \epsilon \\ \infty & \text{otherwise} \end{cases}$, for a small constant $\epsilon$

Results of this approach in the context of optimal driving for an electric bus can be found in the MSc thesis 'To identify the potential gains of an optimal driving strategy in an electric bus', A. R. Shenoy, TU/e, 2016.

A very good reference highlighting applications in automotive is: Vehicle Propulsion Systems, Lino Guzzella, Antonio Sciarretta (Appendix III), 2013.
SLIDE 35

Discussion

  • One can always use discretization for a general stage decision problem, although it might be hard to circumvent some issues (e.g., it may not be possible to obtain the final desired state exactly).

  • Is DP then so powerful that we can solve any stage decision problem? (Note that most digital control problems are stage decision problems.)

  • The answer is no: the discretization of state and input spaces is only computationally feasible when these spaces have a small dimension.

  • Suppose that $x_k \in \mathbb{R}^n$ and $u_k \in \mathbb{R}^m$, and that we discretize each component of the state with $N$ values and each component of the input with $M$ values. Then we need around $N^n$ states per stage and around $M^m$ decisions (arrows) per state.

  • For example, for the control of a quadcopter, $n = 12$ and $m = 4$. Picking $N = M = 100$ we would need $N^n = 10^{24}$ states and $M^m = 10^8$ decisions per state per stage!

  • Bellman called this the curse of dimensionality.
SLIDE 36

Outline

  • Discretization
  • Introduction to static optimization
SLIDE 37

Motivation

We can write the problem we started with,

$x_{k+1} = x_k + u_k$, $k \in \{0, 1\}$, cost $\sum_{k=0}^{1} x_k^2 + u_k^2 + g_2(x_2)$, $g_2(x_2) = e^{x_2}$, $x_0 = 1$

as an optimization problem

$\min_{y \in \mathbb{R}^n} c(y)$

In fact, writing $x_1 = x_0 + u_0$ and $x_2 = x_1 + u_1 = x_0 + u_0 + u_1$, we can reformulate the cost as

$x_0^2 + u_0^2 + x_1^2 + u_1^2 + e^{x_2} = x_0^2 + u_0^2 + (x_0 + u_0)^2 + u_1^2 + e^{(x_0 + u_0 + u_1)}$

which takes the desired form by making $y = (u_0, u_1)$.
SLIDE 38

Outline

  • Discretization
  • Introduction to static optimization
  • Unconstrained optimization
  • Constrained optimization
  • Numerical methods
SLIDE 39

Unconstrained optimization

$\min_{y \in \mathbb{R}^n} c(y)$

$y^* \in \mathbb{R}^n$ is a:

  • local minimum if there exists $\epsilon > 0$ such that for all $y \in \mathbb{R}^n$ with $\|y - y^*\| < \epsilon$ we have $c(y^*) \leq c(y)$   (1)
  • strict minimum if (1) holds with strict inequality.
  • global minimum if (1) holds for every $y \in \mathbb{R}^n$.

Local, strict, and global maxima are defined similarly.

[Figure: a function $c(y)$ with a global minimum, a local minimum (not strict), and a local maximum]
SLIDE 40

First order conditions

If $c \in C^1$, a necessary condition for $y^*$ to be a minimum is that the gradient of $c$ at $y^*$ is zero:

$\nabla c(y^*) = 0$

that is, the $n$ variables $(y^*_1, \ldots, y^*_n)$ must satisfy the $n$ equations

$\nabla c(y^*) = \begin{bmatrix} \frac{\partial}{\partial y_1} c(y^*_1, \ldots, y^*_n) & \ldots & \frac{\partial}{\partial y_n} c(y^*_1, \ldots, y^*_n) \end{bmatrix} = \begin{bmatrix} 0 & \ldots & 0 \end{bmatrix}$

Notation

  • With some abuse of notation, $\nabla c(y^*)$ shall indicate $\nabla c(y)|_{y = y^*}$
  • When convenient we will use $c_y(y^*) \equiv \nabla c(y^*)$ or $\frac{\partial c}{\partial y}(y^*) \equiv \nabla c(y^*)$
  • Note that by convention the gradient is a row vector.
SLIDE 41

Example

$c(y) = y_1 e^{-y_1^2 - y_2^2}$

$\nabla c(y) = e^{-y_1^2 - y_2^2} \begin{bmatrix} -2y_1^2 + 1 & -2y_1 y_2 \end{bmatrix}$

$\nabla c(y) = 0$ when $(y_1, y_2) = (\frac{1}{\sqrt{2}}, 0)$ (maximum) or $(y_1, y_2) = (-\frac{1}{\sqrt{2}}, 0)$ (minimum)

[Figure: contour plot of $c$ over $(y_1, y_2)$]
SLIDE 42

Second order conditions

The Hessian is

$\nabla^2 c(y) = \begin{bmatrix} \frac{\partial^2 c}{\partial y_1^2}(y) & \ldots & \frac{\partial^2 c}{\partial y_1 \partial y_n}(y) \\ \vdots & \ddots & \vdots \\ \frac{\partial^2 c}{\partial y_n \partial y_1}(y) & \ldots & \frac{\partial^2 c}{\partial y_n^2}(y) \end{bmatrix}$

Necessary conditions: if $c \in C^2$, a necessary condition for $y^*$ to be a local minimum is that the Hessian is positive semi-definite, $\nabla^2 c(y^*) \geq 0$.

Sufficient conditions: if $c \in C^2$, a sufficient condition for $y^*$ (with $\nabla c(y^*) = 0$) to be a strict local minimum is $\nabla^2 c(y^*) > 0$.

[Figure: one-dimensional examples $c(y) = y^2$, $c(y) = y^3$, $c(y) = -y^2$, $c(y) = -y^4$ illustrating the conditions]
SLIDE 43

Convex functions

A function $c : \mathbb{R}^n \rightarrow \mathbb{R}$ is convex if for every $y_1 \in \mathbb{R}^n$, $y_2 \in \mathbb{R}^n$, $\alpha \in [0, 1]$, we have

$c(\alpha y_1 + (1 - \alpha) y_2) \leq \alpha c(y_1) + (1 - \alpha) c(y_2)$

[Figure: a convex function; the chord $\alpha c(y_1) + (1 - \alpha) c(y_2)$ lies above the graph between $y_1$ and $y_2$]

  • If $c \in C^2$ then convexity is equivalent to $\nabla^2 c(y) \geq 0$, $\forall y \in \mathbb{R}^n$.

  • For convex functions $c \in C^1$, $\nabla c(y^*) = 0$ is (necessary and) sufficient for $y^*$ to be a global minimum.
SLIDE 44

Outline

  • Discretization
  • Introduction to static optimization
  • Unconstrained optimization
  • Constrained optimization
  • Numerical methods
SLIDE 45

Constrained optimization

$\min_{y \in \mathbb{R}^n} c(y)$ s.t. $h(y) = 0$, with $y = (x, u)$

  • We assume that $h \in C^1$.

  • The restriction $h(y) = (h_1(y), h_2(y), \ldots, h_p(y)) = 0$ constrains in general $p$ of the $n$ degrees of freedom.

  • Then we can write $y = (x, u)$, where $u \in \mathbb{R}^m$, $m = n - p$, are free variables and $x \in \mathbb{R}^p$ is determined by $h(x, u) = 0$ as $x = \beta(u)$ (at least implicitly, if $\frac{\partial h(x, u)}{\partial x} \neq 0$).

  • This leads to an unconstrained problem $\min_{u \in \mathbb{R}^m} c(\beta(u), u)$.
SLIDE 46

Example

$c(x, u) = 0.5 + x^2 + u^2$, $h(x, u) = 1 + 0.5u - x = 0$, i.e., $x = \beta(u) = 1 + 0.5u$

$c(\beta(u), u) = 0.5 + (1 + 0.5u)^2 + u^2$, minimized at $u^* = -\frac{2}{5}$, $x^* = \frac{4}{5}$

[Figure: contours of $c(x, u)$, the constraint line $x = 1 + 0.5u$, and the reduced cost $c(\beta(u), u)$]
SLIDE 47

Discussion

  • Restrictions are typically easy to describe, but after the parameterization the unconstrained problem may be hard to solve.

  • For example, for the simple optimization problem

$\min_{y \in \mathbb{R}^3} a^\top y$ s.t. $\|y\|^2 = 1$

we would need to parameterize the sphere (several charts would be needed), $y = (\cos(\phi)\cos(\theta), \cos(\phi)\sin(\theta), \sin(\phi))$, and solve "difficult problems" such as

$\min_{\theta \in [-\pi, \pi), \phi \in [-\frac{\pi}{2}, \frac{\pi}{2})} a^\top y(\theta, \phi)$

[Figure: the unit sphere in $\mathbb{R}^3$]

  • We describe next the method of Lagrange multipliers, which does not require parameterizations.
SLIDE 48

Preliminaries

  • The restriction $h(y) = 0$ defines a manifold $M \subset \mathbb{R}^n$.

  • The derivative $\gamma'(0)$ of a curve $\gamma(t) \in M$, $\gamma(0) = y_0$, at $t = 0$ belongs to the tangent space at $y_0$, denoted by $T_{y_0}(M)$.

  • The rows of the Jacobian matrix $\nabla h(y_0) = \begin{bmatrix} \nabla h_1(y_0) \\ \vdots \\ \nabla h_m(y_0) \end{bmatrix}$ of $h$ at $y_0$ belong to the normal space at $y_0$. In fact, $h(\gamma(t)) = 0$, which implies that

$\frac{d}{dt}\Big|_{t=0} h(\gamma(t)) = \nabla h(y_0) \gamma'(0) = 0$

i.e., $\nabla h_i(y_0) \perp T_{y_0}(M)$.

  • Actually, the rows of $\nabla h(y_0)$ span the normal space at $y_0$.

[Figure: manifold $M$ with tangent space $T_{y_0}(M)$ and curve $\gamma(t)$ through $y_0$]
SLIDE 49

Example

Unit circle: $y_1^2 + y_2^2 = 1$, $h(y) = y_1^2 + y_2^2 - 1$

$\gamma(t) = (\cos(t), \sin(t))$, $\gamma'(t) = (-\sin(t), \cos(t))$, $\gamma(\frac{\pi}{4}) = (\frac{\sqrt{2}}{2}, \frac{\sqrt{2}}{2})$

$\nabla h(y) = 2(y_1, y_2)$, so $\nabla h(\frac{\sqrt{2}}{2}, \frac{\sqrt{2}}{2}) = (\sqrt{2}, \sqrt{2})$, orthogonal to $\gamma'(\frac{\pi}{4}) = (-\frac{\sqrt{2}}{2}, \frac{\sqrt{2}}{2})$
SLIDE 50

Geometric optimality conditions

  • If $y^* \in M$ is a local minimum then, for any curve $\gamma(t) \in M$ with $\gamma(0) = y^*$,

$\frac{d}{dt} c(\gamma(t))\Big|_{t=0} = \nabla c(y^*) \gamma'(0) = 0$

i.e., $\nabla c(y^*) \perp T_{y^*}(M)$.

  • Since the rows of $\nabla h(y^*)$ are a basis of the normal space at $y^*$, we must have that $\nabla c(y^*)$ is a linear combination of these rows:

$\nabla c(y^*) + \sum_{i=1}^{m} \lambda_i \nabla h_i(y^*) = 0$

[Figure: manifold $M$ with $\nabla c(y^*)$ normal to the tangent space $T_{y^*}(M)$]
SLIDE 51

Method of the Lagrange multipliers

If we define the Lagrangian

$L(y, \lambda) = c(y) + \lambda^\top h(y) = c(y) + \sum_{i=1}^{m} \lambda_i h_i(y)$

we can write the optimality conditions as: if $y^* \in M$ is a local minimum then there must exist multipliers $\lambda_i$ s.t.

$L_y(y^*, \lambda) = \nabla c(y^*) + \sum_{i=1}^{m} \lambda_i \nabla h_i(y^*) = 0$
$L_\lambda(y^*, \lambda) = h(y^*) = 0$

or simply $\nabla L(y^*, \lambda) = 0$.

This is a necessary condition for optimality which can be used to find candidates for a minimum.
SLIDE 52

Example

$\min_{y \in \mathbb{R}^3} a^\top y$ s.t. $\|y\|^2 = 1$, i.e., $c(y) = a^\top y$, $h(y) = \|y\|^2 - 1$

$\nabla c(y) = a^\top$, $\nabla h(y) = 2y^\top$

$L_y(y^*, \lambda) = a^\top + 2\lambda (y^*)^\top = 0$, $L_\lambda(y^*, \lambda) = \|y^*\|^2 - 1 = 0$

Solutions: $(y^*, \lambda) = (-\frac{a}{\|a\|}, \frac{\|a\|}{2})$ (minimum) and $(y^*, \lambda) = (\frac{a}{\|a\|}, -\frac{\|a\|}{2})$ (maximum)

[Figure: the unit sphere with the minimizer and maximizer along the direction of $a$]
SLIDE 53

Other approach to previous example

$c(x, u) = 0.5 + x^2 + u^2$, $h(x, u) = 1 + 0.5u - x = 0$

$\nabla c(x, u) = 2 \begin{bmatrix} x & u \end{bmatrix}$, $\nabla h(x, u) = \begin{bmatrix} -1 & 0.5 \end{bmatrix}$

$\nabla c(x^*, u^*) + \lambda \nabla h(x^*, u^*) = 0$ is satisfied for $(x^*, u^*, \lambda) = (\frac{4}{5}, -\frac{2}{5}, \frac{8}{5})$

[Figure: contours of $c(x, u)$ and the constraint line]
SLIDE 54

Outline

  • Discretization
  • Introduction to static optimization
  • Unconstrained optimization
  • Constrained optimization
  • Numerical methods
SLIDE 55

Gradient method for unconstrained problems

Given an initial estimate $x_0$, iterate the method

$x_{k+1} = x_k - \epsilon_k \nabla c(y)|_{y = x_k}^\top$

until a certain stopping criterion is met (e.g. $\|\nabla c(y)|_{y = x_k}\| < \delta$ or $k > k_{max}$).

Choice of the step $\epsilon_k$

  • Ideally $\epsilon_k$ is taken as the solution to $\min_{\alpha \in \mathbb{R}_{\geq 0}} c(x_k - \alpha \nabla c(y)|_{y = x_k}^\top)$, but this requires a line search, which might be computationally demanding.

  • The simplest choice would be $\epsilon_k = \bar{\epsilon}$, but in general we need $\epsilon_k \rightarrow 0$, otherwise the method will (in the best case!) oscillate around the minimum. Small $\bar{\epsilon}$: many iterations needed to get close to the minimum; large $\bar{\epsilon}$: the method might diverge.

  • In practice a heuristic guaranteeing that $c(x_{k+1}) < c(x_k)$ may be used (e.g. Armijo's rule).

Newton's method (much faster)

  • Choose the descent direction $-[\nabla^2 c(y)]^{-1} \nabla c(y)^\top$ instead of $-\nabla c(y)^\top$.
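A minimal sketch of the iteration with a fixed step (the cost and all parameter choices below are my own, for illustration only):

% Sketch: gradient method with a fixed step on c(y) = (y1 - 1)^2 + 10*y2^2.
gradc = @(y) [2*(y(1) - 1); 20*y(2)];   % gradient, as a column vector
x = [-2; 1];                            % initial estimate x0
epsbar = 0.05; delta = 1e-8; kmax = 1e4;
for k = 1:kmax
    g = gradc(x);
    if norm(g) < delta, break, end      % stopping criterion
    x = x - epsbar*g;                   % x_{k+1} = x_k - eps*grad c(x_k)'
end
x                                       % converges to the minimum [1; 0]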
SLIDE 56

Projection method for constrained problems

$\min_{y \in \mathbb{R}^n} c(y)$ s.t. $h(y) = 0$

Iteratively:

  • For a given estimate $x_k$, compute the gradient $\nabla c(x_k)$.
  • Project the gradient on the tangent space of the manifold defined by the constraints.
  • Perform a step of the gradient method, $\tilde{x}_{k+1} = x_k - \epsilon_k (\nabla c(y)|_{y = x_k})^\top$, using the projected gradient.
  • Project $\tilde{x}_{k+1}$ on the manifold defined by the constraints, obtaining $x_{k+1}$.

[Figure: manifold $h(y) = 0$ with iterate $x_k$, the gradient step $\tilde{x}_{k+1}$, and its projection $x_{k+1}$ onto the manifold]
SLIDE 57

Discussion

  • These methods work very well for convex problems (cost and constraint functions are convex). For non-convex problems it is very hard to compute the minimum and the numerical methods have many issues (e.g. getting stuck in local minima).

  • There are many other numerical methods: for unconstrained problems, Newton and quasi-Newton methods, conjugate gradient methods, etc.; for constrained problems, penalty, barrier, and augmented Lagrangian methods.

  • A class of problems that can be solved very efficiently is quadratic programs, which allows modeling linear quadratic control problems with input and state constraints.

  • Two very good references if you want to know more about numerical optimization:

Numerical Optimization, Jorge Nocedal, Stephen Wright, Springer, 1999
Convex Optimization, Stephen Boyd, Lieven Vandenberghe, Cambridge University Press, 2004
SLIDE 58

Example

Non-quadratic terminal cost I

Consider again the following simple example:

Dynamic model: $x_{k+1} = x_k + u_k$, $k \in \{0, 1\}$
Cost: $\sum_{k=0}^{1} x_k^2 + u_k^2 + g_2(x_2)$
Terminal cost: $g_2(x_2) = e^{x_2}$
Initial condition: $x_0 = 1$
SLIDE 59

Non-quadratic terminal cost I

  • 1. Write the state variables as a function of the control inputs:

$x_1 = x_0 + u_0$, $x_2 = x_1 + u_1 = x_0 + u_0 + u_1$

  • 2. Write the cost function as a function of the control inputs:

$x_0^2 + u_0^2 + x_1^2 + u_1^2 + e^{x_2} = x_0^2 + u_0^2 + (x_0 + u_0)^2 + u_1^2 + e^{(x_0 + u_0 + u_1)}$

  • 3. Find the optimal values of the control inputs:

  • Differentiating and setting the derivative to zero would lead to a system of non-linear equations; one could apply a numerical method to solve it, but in general this is difficult.
  • One can apply the gradient method to find the optimal control input.
  • Since there are only two scalar control inputs to be determined, one can simply plot the function.
SLIDE 60

Non-quadratic terminal cost I

Plot $h(u_0, u_1) = 1 + u_0^2 + (1 + u_0)^2 + u_1^2 + e^{1 + u_0 + u_1}$ (a convex function!)

[Figure: contour plot of $h(u_0, u_1)$]

Gradient method with line search: $u^{(k+1)} = u^{(k)} - \alpha_k \frac{\partial J}{\partial u}(u^{(k)})$, with $\alpha_k = \arg\min_{\beta} J(u^{(k)} - \beta a)$, $a = \frac{\partial J}{\partial u}(u^{(k)})$, $u^{(k)} = [u_0^{(k)}\ u_1^{(k)}]$

Numerical values using Matlab for two initial guesses:

k   u0(k)      u1(k)
0    0.1000    1.5000
1   -1.1469    0.2060
2   -0.6441   -0.2790
3   -0.7466   -0.3853
4   -0.7104   -0.4201
5   -0.7181   -0.4281
6   -0.7154   -0.4307
7   -0.7159   -0.4313
8   -0.7157   -0.4315

k   u0(k)      u1(k)
0    1.0000    1.0000
1   -0.7086   -0.4466
2   -0.7169   -0.4325
3   -0.7156   -0.4318
4   -0.7158   -0.4315

  • 4. Replace the values of $u_0$, $u_1$ in the expression for the cost: $h(-0.7158, -0.4315) = 2.64$
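These numbers are easy to reproduce; a sketch (my own code, using a backtracking step instead of the exact line search):

% Sketch: minimize h(u) = 1 + u0^2 + (1+u0)^2 + u1^2 + exp(1+u0+u1)
% by the gradient method with an Armijo-type backtracking step.
h  = @(u) 1 + u(1)^2 + (1 + u(1))^2 + u(2)^2 + exp(1 + u(1) + u(2));
gh = @(u) [2*u(1) + 2*(1 + u(1)) + exp(1 + u(1) + u(2));
           2*u(2) + exp(1 + u(1) + u(2))];
u = [0.1; 1.5];                          % first initial guess from the table
for k = 1:100
    g = gh(u);
    if norm(g) < 1e-8, break, end
    beta = 1;                            % backtracking line search
    while h(u - beta*g) > h(u) - 0.5*beta*(g'*g)
        beta = beta/2;
    end
    u = u - beta*g;
end
u, h(u)                                  % approx (-0.7158, -0.4315) and 2.64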

SLIDE 61

Concluding remarks

To sum up:

  • State and input discretization allows transforming any stage-decision problem into a discrete optimization problem.

  • It is not feasible for problems with large input and state dimensions (the curse of dimensionality).

  • For linear systems we can exactly and easily obtain temporal discretizations.

After this lecture, you should be able to:

  • Formulate stage decision problems in the framework of discrete optimization problems (to be solved by DP).

  • Obtain a temporal discretization of a linear plant.

  • Have a basic understanding of static optimization.