

SLIDE 1

Analytical Solution of Constrained LQ Optimal Control for Horizon 2

José De Doná, September 2004

Centre of Complex Dynamic Systems and Control

SLIDE 2

Outline

1. A Motivating Example
   - Cautious Design
   - Serendipitous Design
   - Tactical Design
2. Explicit Solutions
   - Explicit vs Implicit
3. A Simple Problem with N = 2
   - Dynamic Programming
4. Motivating Example Revisited
5. Conclusions

SLIDE 3

A Motivating Example

Consider the double integrator plant:

d²y(t)/dt² = u(t).

The zero-order hold discretisation with sampling period 1 is:

xk+1 = Axk + Buk,   yk = Cxk,

with

A = [1 1; 0 1],   B = [0.5; 1],   C = [1 0].

Assume the actuator has a maximum and minimum allowed value (saturation) of ±1. Thus, the controller has to satisfy the constraint:

|uk| ≤ 1 for all k.
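As a quick numerical check (a sketch of mine, not part of the original slides), the discretisation above can be reproduced with scipy's zero-order-hold conversion:

```python
# Sketch (not from the slides): verify the ZOH discretisation of the double
# integrator with sampling period T = 1.
import numpy as np
from scipy.signal import cont2discrete

# Continuous-time double integrator: dx/dt = Ac x + Bc u, y = Cc x
Ac = np.array([[0.0, 1.0],
               [0.0, 0.0]])
Bc = np.array([[0.0],
               [1.0]])
Cc = np.array([[1.0, 0.0]])
Dc = np.array([[0.0]])

A, B, C, D, _ = cont2discrete((Ac, Bc, Cc, Dc), dt=1.0, method='zoh')
print(A)  # [[1. 1.], [0. 1.]]
print(B)  # [[0.5], [1.]]
```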

SLIDE 4

A Motivating Example

The schematic of the feedback control loop is shown in the figure: the controller output uk passes through the saturation block "sat" into the linear system, whose state xk is fed back to the controller.

Figure: Feedback control loop.

“sat” represents the actuator modelled by the saturation function

sat(u) =   1    if u > 1,
           u    if |u| ≤ 1,
          −1    if u < −1.

SLIDE 5

(i) Cautious Design

Suppose the initial condition is x0 = [−6, 0]'.

We start with a “low gain” linear state feedback controller (LQR), computed so as to minimise the objective function:

V∞({xk}, {uk}) = (1/2) Σ_{k=0}^{∞} (xk'Q xk + uk'R uk),

with weighting matrices Q = C'C = [1 0; 0 0] and R = 20.

The solution is computed by solving the algebraic Riccati equation P = A'PA + Q − K'(R + B'PB)K, where K = (R + B'PB)⁻¹B'PA. We obtain the linear state feedback law:

uk = −Kxk = −[0.1603  0.5662] xk.
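A minimal sketch (my own code, not from the slides) that reproduces P and K numerically; it assumes scipy's solve_discrete_are, which solves the same algebraic Riccati equation:

```python
# Sketch: compute P and K for the "cautious" design (Q = C'C, R = 20).
import numpy as np
from scipy.linalg import solve_discrete_are

A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.5], [1.0]])
C = np.array([[1.0, 0.0]])

Q = C.T @ C                # state weighting, [[1, 0], [0, 0]]
R = np.array([[20.0]])     # control weighting

P = solve_discrete_are(A, B, Q, R)                   # P = A'PA + Q - K'(R + B'PB)K
K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)    # K = (R + B'PB)^{-1} B'PA
print(K)   # approximately [[0.1603, 0.5662]], the value quoted on the slide
```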

SLIDE 6

(i) Cautious Design

“Low gain” control: Q = C'C = [1 0; 0 0], R = 20, uk = −Kxk.

The resulting input and output sequences are shown in the Figure.

Figure: input uk (left) and output yk (right) versus time k.

We can see from the figure that the input uk satisfies the given constraint |uk| ≤ 1, for all k, for this initial condition. However, the response is rather slow: the “settling time” is of the order of 8 samples.

SLIDE 7

(ii) Serendipitous Design

“High gain” control: Q = C'C = [1 0; 0 0], R = 2, uk = sat(−Kxk).

The resulting input and output sequences are shown in the Figure.

Figure: input uk (left) and output yk (right) versus time k.

We can see from the figure that the input uk satisfies the given constraint |uk| ≤ 1, for all k. The controller makes better use of the available control authority. The amount of overshoot is essentially the same. The “settling time” is of the order of 5 samples.
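The experiments on these slides are easy to reproduce. Below is a minimal simulation sketch of mine, assuming the initial state x0 = [−6, 0]' used earlier; change R to 20.0, 2.0 or 0.1 to compare the three designs:

```python
# Sketch (not from the slides): simulate u_k = sat(-K x_k) for the example.
import numpy as np
from scipy.linalg import solve_discrete_are

A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.5], [1.0]])
C = np.array([[1.0, 0.0]])
Q = C.T @ C
R = np.array([[2.0]])                 # "high gain" design; try 20.0 or 0.1 as well

P = solve_discrete_are(A, B, Q, R)
K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

x = np.array([[-6.0], [0.0]])         # assumed initial condition
for k in range(25):
    u = np.clip((-K @ x).item(), -1.0, 1.0)   # saturated actuator, |u_k| <= 1
    y = (C @ x).item()
    print(k, u, y)
    x = A @ x + B * u
```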

SLIDE 8

(ii) Serendipitous Design

“High gain” control: Q = C'C = [1 0; 0 0], R = 0.1, uk = sat(−Kxk).

The resulting input and output sequences are shown in the Figure.

Figure: input uk (left) and output yk (right) versus time k.

We can see from the figure that the input uk satisfies the given constraint |uk| ≤ 1 for all k, but the control sequence stays saturated much longer. There is significant overshoot, and the “settling time” blows out to 12 samples.

SLIDE 9

Recapitulation

Going from R = 20 → 2: same overshoot, faster response. Going from R = 2 → 0.1: large overshoot, long settling time. Let us examine the state space trajectory corresponding to the serendipitous strategy R = 0.1, uk = sat(−Kxk):

Figure: state-space trajectory (x1k, x2k), with the partition of the state space into regions R0, R1, R2.

The control law u = −sat(Kx) partitions the state space into three regions. Hence, the controller can be characterised as a switched control strategy:

u = K(x) =  −Kx   if x ∈ R0,
             1    if x ∈ R1,
            −1    if x ∈ R2.
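In code, this switched characterisation is just a region test on the unconstrained control −Kx; a small sketch of mine (with the saturation level 1 of this example):

```python
# Sketch: the serendipitous law u = sat(-Kx) written as a switched strategy
# over the regions R0 (unsaturated), R1 (u = 1) and R2 (u = -1).
import numpy as np

def switched_law(x, K):
    v = (-K @ x).item()      # unconstrained LQR control -Kx
    if v > 1.0:              # x in R1: upper saturation
        return 1.0
    if v < -1.0:             # x in R2: lower saturation
        return -1.0
    return v                 # x in R0: linear law -Kx
```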

SLIDE 10

Recapitulation

Figure: state-space trajectory (x1k, x2k), with the partition of the state space into regions R0, R1, R2.

u = K(x) =  −Kx   if x ∈ R0,
             1    if x ∈ R1,
            −1    if x ∈ R2.

Heuristically, we can think, in this example, of x2 as “velocity” and x1 as “position.” Now, in our attempt to change the position rapidly (from −6 to 0), the velocity has been allowed to grow to a relatively high level (+3). This would be fine if the braking action were unconstrained. However, our input (including braking) is limited to the range [−1, 1]. Hence, the available braking is inadequate to “pull the system up,” and overshoot occurs.

SLIDE 11

(iii) Tactical Design

We will now start “afresh” with a formulation that incorporates the constraints into the design process from the beginning. A sensible idea would seem to be to “look ahead” and take account of future input constraints (that is, the limited braking authority available). We now consider a Model Predictive Control law with prediction horizon N = 2. At each sampling instant i and for the current state xi, the two-step objective function

V2({xk}, {uk}) = (1/2) x_{i+2}'P x_{i+2} + (1/2) Σ_{k=i}^{i+1} (xk'Q xk + uk'R uk),    (1)

is minimised subject to the equality and inequality constraints:

xk+1 = Axk + Buk,   k = i, i + 1,
|uk| ≤ 1,           k = i, i + 1.    (2)
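For concreteness, here is a sketch of mine (not the method derived analytically on the later slides) that minimises the two-step objective (1) subject to (2) by direct numerical optimisation over (u_i, u_{i+1}); the helper names are my own:

```python
# Sketch: solve the N = 2 constrained problem (1)-(2) numerically.
import numpy as np
from scipy.linalg import solve_discrete_are
from scipy.optimize import minimize

A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.5], [1.0]])
C = np.array([[1.0, 0.0]])
Q, R = C.T @ C, np.array([[0.1]])
P = solve_discrete_are(A, B, Q, R)     # terminal weight from the Riccati equation

def V2(u, x0):
    """Two-step objective: 0.5 x2'P x2 + 0.5 sum_k (xk'Q xk + uk'R uk)."""
    x, cost = x0, 0.0
    for uk in u:
        cost += 0.5 * ((x.T @ Q @ x).item() + R.item() * uk**2)
        x = A @ x + B * uk
    return cost + 0.5 * (x.T @ P @ x).item()

def solve_p2(x0):
    """Optimal sequence {u_i*, u_{i+1}*} subject to |u_k| <= 1."""
    res = minimize(V2, np.zeros(2), args=(x0,), bounds=[(-1.0, 1.0)] * 2)
    return res.x

print(solve_p2(np.array([[-6.0], [0.0]])))
```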

SLIDE 12

(iii) Tactical Design

In the objective function (1), we set, as before, Q = C'C and R = 0.1. The terminal state weighting matrix P is taken to be the solution of the algebraic Riccati equation P = A'PA + Q − K'(R + B'PB)K, where K = (R + B'PB)⁻¹B'PA. As a result of minimising (1) subject to the constraints (2), we obtain an optimal fixed-horizon control sequence

{u_i*, u_{i+1}*}.

We then apply the resulting value of u_i* to the system, in a receding horizon control fashion. We can see that this strategy has the ability to “look ahead” by considering the constraints not only at the current time i, but also one step ahead, at i + 1.
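Continuing the previous sketch (same A, B, C and the hypothetical solve_p2 helper), the receding horizon loop applies only the first element of each optimal sequence:

```python
# Sketch: receding horizon control -- at each time i, solve P2(x_i) and
# apply only the first element u_i* of the optimal sequence.
x = np.array([[-6.0], [0.0]])            # assumed initial condition
for i in range(25):
    u_seq = solve_p2(x)                  # optimal sequence {u_i*, u_{i+1}*}
    u = float(u_seq[0])                  # receding horizon: apply u_i* only
    x = A @ x + B * u                    # plant update; |u| <= 1 by construction
    print(i, u, (C @ x).item())
```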

SLIDE 13

(iii) Tactical Design

MPC: Q = CC =

  • 1
  • , R = 0.1, Horizon N = 2

The resulting input and output sequences are shown in the Figure.

Figure: input uk (left) and output yk (right) versus time k.

Dashed line: control uk = −Kxk. Solid line: MPC. We can see from the figure that the output trajectory with constrained input now has minimal overshoot. Thus, the idea of “looking ahead” has paid dividends.

SLIDE 14

Recapitulation

As we will see in this part of the course, the receding horizon control strategy (MPC) we have used can be described as a partition of the state space into different regions in which affine control laws hold. Serendipitous strategy R = 0.1, uk = sat(−Kxk).

Figure: state-space trajectory (x1k, x2k) for the serendipitous design, with regions R0, R1, R2.

Receding horizon tactical design R = 0.1, N = 2.

Figure: state-space trajectory (x1k, x2k) for the receding horizon tactical design, with regions R0, R1, R2, R3, R4.

SLIDE 15

Explicit vs Implicit

Before we start studying how to find an explicit characterisation of the MPC solution, let us first define which procedures we would call explicit and which ones we would call implicit.

Explicit solution (p = parameter; a, b = constants):

fp(z) = z² + 2apz + bp,
∂fp/∂z = 0  ⇒  z*(p) = −ap.

Implicit (numerical) solution (p = parameter; a, b = constants):

fp(z) = z² + 2apz + bp; starting from an initial guess z0(p), the minimiser is approached iteratively, zk(p) → zk+1(p) → · · ·
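A toy illustration of the distinction (my own sketch, with arbitrary values for a, b and p): the explicit route evaluates a closed-form minimiser, while the implicit route iterates numerically for each given p.

```python
# Sketch: explicit vs implicit minimisation of f_p(z) = z^2 + 2 a p z + b p.
a, b, p = 1.5, 0.7, 2.0          # arbitrary constants and parameter value

# Explicit: differentiate once, offline:  df/dz = 0  =>  z*(p) = -a p.
z_explicit = -a * p

# Implicit: for the given p, iterate numerically (here, gradient descent).
z = 0.0                          # initial guess z_0
for _ in range(100):
    grad = 2.0 * z + 2.0 * a * p     # d f_p / d z
    z -= 0.1 * grad                  # z_{k+1} = z_k - step * grad
print(z_explicit, z)             # both approach -3.0
```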

SLIDE 16

Problem Setup

We consider the discrete time system xk+1 = Axk + Buk, where xk ∈ Rn and uk ∈ R. The pair (A, B) is assumed to be stabilisable. We consider the following fixed horizon optimal control problem:

PN(x):   VN*(x) = min VN({xk}, {uk}),    (3)

subject to:

x0 = x,
xk+1 = Axk + Buk   for k = 0, 1, . . . , N − 1,
uk ∈ U = [−∆, ∆]   for k = 0, 1, . . . , N − 1,

where ∆ > 0 is the input constraint level. The objective function is:

VN({xk}, {uk}) = (1/2) xN'P xN + (1/2) Σ_{k=0}^{N−1} (xk'Q xk + uk'R uk).    (4)

SLIDE 17

Problem Setup

Let the control sequence that achieves the minimum in (3) be

{u0*, u1*, . . . , u_{N−1}*}.

The receding horizon control law, which depends on the current state x = x0, is

KN(x) = u0*.

Can we obtain an explicit expression for KN(·), defined above, as a function of the parameter x?

SLIDE 18

A Simple Problem with N = 2

Consider now the following fixed horizon optimal control problem with prediction horizon N = 2:

P2(x):   V2*(x) = min V2({xk}, {uk}),    (5)

subject to:

x0 = x,
xk+1 = Axk + Buk   for k = 0, 1,
uk ∈ U = [−∆, ∆]   for k = 0, 1.

The objective function in (5) is

V2({xk}, {uk}) = (1/2) x2'P x2 + (1/2) Σ_{k=0}^{1} (xk'Q xk + uk'R uk).

SLIDE 19

A Simple Problem with N = 2

Objective function:

V2({xk}, {uk}) = (1/2) x2'P x2 + (1/2) Σ_{k=0}^{1} (xk'Q xk + uk'R uk).    (6)

The matrices Q and R are assumed positive semi-definite and positive definite, respectively. P is taken as the solution of the algebraic Riccati equation P = A'PA + Q − K'R̄K, where K = R̄⁻¹B'PA and R̄ = R + B'PB. Let the control sequence that minimises (6) be

{u0*, u1*}.

Then the RHC law is given by the first element of the optimal sequence (which depends on the current state x0 = x), that is,

K2(x) = u0*.

SLIDE 20

Dynamic Programming for N = 2

We will use Dynamic Programming to find the analytical solution of this problem. We define the partial value functions (along the predicted trajectory x0 = x → x1 → x2, driven by u0, u1) as:

V0*(x2) = (1/2) x2'P x2,

V1*(x1) = min { (1/2) x2'P x2 + (1/2) x1'Q x1 + (1/2) u1'R u1 },   over u1 ∈ U, with x2 = Ax1 + Bu1,

V2*(x)  = min { (1/2) x2'P x2 + (1/2) Σ_{k=0}^{1} (xk'Q xk + uk'R uk) },   over uk ∈ U, with xk+1 = Axk + Buk, k = 0, 1.

SLIDE 21

Dynamic Programming for N = 2

The dynamic programming algorithm makes use of the Principle of Optimality, which states that any portion of the optimal trajectory is itself an optimal trajectory:

Vk*(x) = min_{u∈U} { (1/2) x'Q x + (1/2) u'R u + V_{k−1}*(Ax + Bu) },

where u and x denote the control and state with k steps to go (that is, at time N − k). In the sequel we will use the saturation function:

sat∆(u) =   ∆    if u > ∆,
            u    if |u| ≤ ∆,
           −∆    if u < −∆.

SLIDE 22

Dynamic Programming for N = 2

Theorem (RHC Characterisation for N = 2). The RHC law has the form

K2(x) =  −sat∆(Gx + h)   if x ∈ Z−,
         −sat∆(Kx)       if x ∈ Z,
         −sat∆(Gx − h)   if x ∈ Z+,    (7)

where the gains K, G ∈ R^{1×n} and the constant h ∈ R are

K = R̄⁻¹B'PA,   G = (K + (KB)(KA)) / (1 + (KB)²),   h = (KB / (1 + (KB)²)) ∆,

and the sets (Z−, Z, Z+) are defined by

Z− = {x : K(A − BK)x < −∆},
Z  = {x : |K(A − BK)x| ≤ ∆},
Z+ = {x : K(A − BK)x > ∆}.
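The theorem translates directly into code. The following sketch (mine) builds the explicit law K2(·) for a scalar-input system from A, B, Q, R and ∆, exactly as the gains and regions are defined above:

```python
# Sketch: the explicit RHC law K2(x) of the theorem (scalar input u).
import numpy as np
from scipy.linalg import solve_discrete_are

def explicit_rhc(A, B, Q, R, Delta):
    P = solve_discrete_are(A, B, Q, R)
    Rbar = (R + B.T @ P @ B).item()
    K = (B.T @ P @ A) / Rbar                 # K = Rbar^{-1} B'PA   (1 x n)
    KB = (K @ B).item()
    G = (K + KB * (K @ A)) / (1.0 + KB**2)
    h = KB * Delta / (1.0 + KB**2)
    sat = lambda v: min(max(v, -Delta), Delta)

    def K2(x):
        s = (K @ (A - B @ K) @ x).item()     # decides which set x belongs to
        if s < -Delta:                       # x in Z-
            return -sat((G @ x).item() + h)
        if s > Delta:                        # x in Z+
            return -sat((G @ x).item() - h)
        return -sat((K @ x).item())          # x in Z
    return K2
```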

SLIDE 23

Dynamic Programming for N = 2

Outline of the Proof:

(i) The partial value function V0*: Here x = x2. By definition, the partial value function at time k = N = 2 is V0*(x) = (1/2) x'P x for all x ∈ Rn.

(ii) The partial value function V1*: Here x = x1 and u = u1. By the principle of optimality, for all x ∈ Rn,

V1*(x) = min_{u∈U} { (1/2) x'Q x + (1/2) u'R u + V0*(Ax + Bu) }
       = min_{u∈U} { (1/2) x'Q x + (1/2) u'R u + (1/2) (Ax + Bu)'P(Ax + Bu) }
       = min_{u∈U} { (1/2) x'P x + (1/2) R̄ (u + Kx)² },

where the last step completes the square in u and uses the Riccati equation P = A'PA + Q − K'R̄K.

SLIDE 24

Dynamic Programming for N = 2

V1*(x) = min_{u∈U} { (1/2) x'P x + (1/2) R̄ (u + Kx)² }.

Figure: the function (1/2) x'P x + (1/2) R̄ (u + Kx)² plotted against u, with the unconstrained minimiser u = −Kx lying inside the interval U in one case and outside it in the other.

From the convexity of the function R̄ (u + Kx)² it then follows that the constrained (u ∈ U) optimal control law is given by

u1* = sat∆(−Kx) = −sat∆(Kx)   for all x ∈ Rn.

SLIDE 25

Dynamic Programming for N = 2

Substituting the optimal control

u1* = −sat∆(Kx),   for all x ∈ Rn,

into the minimisation problem

V1*(x) = min_{u∈U} { (1/2) x'P x + (1/2) R̄ (u + Kx)² },

we have that the partial value function at time k = N − 1 = 1 is

V1*(x) = (1/2) x'P x + (1/2) R̄ [Kx − sat∆(Kx)]²   for all x ∈ Rn.
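As a small sanity check (a sketch of mine), this closed-form V1* can be compared against a brute-force minimisation over a grid of u ∈ U, given P, K, R̄ and ∆ computed as in the earlier sketches:

```python
# Sketch: check the closed-form V1*(x) against a brute-force grid search.
import numpy as np

def V1_closed_form(x, P, K, Rbar, Delta):
    Kx = (K @ x).item()
    return 0.5 * (x.T @ P @ x).item() + 0.5 * Rbar * (Kx - np.clip(Kx, -Delta, Delta))**2

def V1_brute_force(x, P, K, Rbar, Delta, n_grid=2001):
    Kx = (K @ x).item()
    us = np.linspace(-Delta, Delta, n_grid)        # grid over U = [-Delta, Delta]
    return (0.5 * (x.T @ P @ x).item() + 0.5 * Rbar * (us + Kx)**2).min()
```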

SLIDE 26

Dynamic Programming for N = 2

(iii) The partial value function V2*: Here x = x0 and u = u0. By the principle of optimality, we have that, for all x ∈ Rn,

V2*(x) = min_{u∈U} { (1/2) x'Q x + (1/2) u'R u + V1*(Ax + Bu) }
       = min_{u∈U} { (1/2) x'Q x + (1/2) u'R u + (1/2)(Ax + Bu)'P(Ax + Bu) + (1/2) R̄ [K(Ax + Bu) − sat∆(K(Ax + Bu))]² }
       = (1/2) min_{u∈U} { x'P x + R̄ (u + Kx)² + R̄ [KAx + KBu − sat∆(KAx + KBu)]² }.

SLIDE 27

Dynamic Programming for N = 2

u* = arg min_{u∈U} { f1(u) + f2(u) },   where   f1(u) = R̄ (u + Kx)²,   f2(u) = R̄ [KAx + KBu − sat∆(KAx + KBu)]².

Case (a): x ∈ Z−.

x ∈ Z−  ⇔  KAx + KB(−Kx) < −∆.

Figure: the functions f1(u), f2(u) and f1(u) + f2(u) plotted against u, with the unconstrained minimiser of f1 at u = −Kx.

SLIDE 28

Dynamic Programming for N = 2

We conclude that

u* = arg min_{u∈U} { R̄ (u + Kx)² + R̄ [KAx + KBu − sat∆(KAx + KBu)]² }
   = arg min_{u∈U} { R̄ (u + Kx)² + R̄ [KAx + KBu + ∆]² }.

Hence

u* = −sat∆(Gx + h)   for all x ∈ Z−,

where

G = (K + (KB)(KA)) / (1 + (KB)²),   h = (KB / (1 + (KB)²)) ∆.
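For the motivating example (Q = C'C, R = 0.1, ∆ = 1), these quantities are straightforward to evaluate numerically; a short sketch of mine:

```python
# Sketch: evaluate K, G and h for the double integrator example.
import numpy as np
from scipy.linalg import solve_discrete_are

A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.5], [1.0]])
C = np.array([[1.0, 0.0]])
Q, R, Delta = C.T @ C, np.array([[0.1]]), 1.0

P = solve_discrete_are(A, B, Q, R)
Rbar = (R + B.T @ P @ B).item()
K = (B.T @ P @ A) / Rbar                  # K = Rbar^{-1} B'PA
KB = (K @ B).item()
G = (K + KB * (K @ A)) / (1.0 + KB**2)
h = KB * Delta / (1.0 + KB**2)
print(K, G, h)
```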

SLIDE 29

Dynamic Programming for N = 2

u* = arg min_{u∈U} { f1(u) + f2(u) },   where   f1(u) = R̄ (u + Kx)²,   f2(u) = R̄ [KAx + KBu − sat∆(KAx + KBu)]².

Case (b): x ∈ Z.

x ∈ Z  ⇔  |KAx + KB(−Kx)| ≤ ∆.

Figure: the functions f1(u), f2(u) and f1(u) + f2(u) plotted against u, with the unconstrained minimiser of f1 at u = −Kx.

SLIDE 30

Dynamic Programming for N = 2

We conclude that

u* = arg min_{u∈U} { R̄ (u + Kx)² + R̄ [KAx + KBu − sat∆(KAx + KBu)]² }
   = arg min_{u∈U} { R̄ (u + Kx)² }.

Hence

u* = −sat∆(Kx)   for all x ∈ Z.

SLIDE 31

Dynamic Programming for N = 2

u* = arg min_{u∈U} { f1(u) + f2(u) },   where   f1(u) = R̄ (u + Kx)²,   f2(u) = R̄ [KAx + KBu − sat∆(KAx + KBu)]².

Case (c): x ∈ Z+.

x ∈ Z+  ⇔  KAx + KB(−Kx) > ∆.

Figure: the functions f1(u), f2(u) and f1(u) + f2(u) plotted against u, with the unconstrained minimiser of f1 at u = −Kx.

SLIDE 32

Dynamic Programming for N = 2

We conclude that

u* = arg min_{u∈U} { R̄ (u + Kx)² + R̄ [KAx + KBu − sat∆(KAx + KBu)]² }
   = arg min_{u∈U} { R̄ (u + Kx)² + R̄ [KAx + KBu − ∆]² }.

Hence

u* = −sat∆(Gx − h)   for all x ∈ Z+,

where

G = (K + (KB)(KA)) / (1 + (KB)²),   h = (KB / (1 + (KB)²)) ∆.

SLIDE 33

Motivating Example Revisited

We have, from the previous result, that the control law is:

K2(x) =  −sat∆(Gx + h)   if x ∈ Z−,
         −sat∆(Kx)       if x ∈ Z,
         −sat∆(Gx − h)   if x ∈ Z+,

or, equivalently:

K2(x) =  −∆         if x ∈ R1,
         −Gx − h    if x ∈ R2,
         −Kx        if x ∈ R3,
         −Gx + h    if x ∈ R4,
          ∆         if x ∈ R5.

Figure: partition of the state space (x1k, x2k) into the sets Z−, Z, Z+ (left) and into the regions R1, R2, R3, R4, R5 (right).
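As a final cross-check (a sketch of mine, reusing the hypothetical explicit_rhc and solve_p2 helpers and the matrices A, B, Q, R from the earlier sketches), the explicit law can be compared against the numerically optimised first control over random states:

```python
# Sketch: cross-check the explicit law K2(x) against the numerical solution
# of P2(x); reuses A, B, Q, R, solve_p2 and explicit_rhc defined earlier.
import numpy as np

K2 = explicit_rhc(A, B, Q, R, Delta=1.0)
rng = np.random.default_rng(0)
diffs = []
for _ in range(50):
    x = rng.uniform(-6.0, 6.0, size=(2, 1))       # random test states
    diffs.append(abs(K2(x) - solve_p2(x)[0]))     # explicit law vs first element u0*
print(max(diffs))                                  # small, up to optimiser tolerance
```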

SLIDE 34

Conclusions

Note that we have obtained, for the case N = 2, the following expression for the receding horizon control law:

K2(x) =  −∆         if x ∈ R1,
         −Gx − h    if x ∈ R2,
         −Kx        if x ∈ R3,
         −Gx + h    if x ∈ R4,
          ∆         if x ∈ R5.

That is, the control law can be characterised as a piecewise affine function of the state x. The calculation using Dynamic Programming can be extended to longer prediction horizons N > 2. However, we will instead reconsider this same problem, and further extensions, using geometric arguments later in the course.
