Optimal Control and Dynamic Programming 4SC000 Q2 2017-2018 Duarte Antunes



slide-1
SLIDE 1

4SC000 Q2 2017-2018

Optimal Control and Dynamic Programming

Duarte Antunes

slide-2
SLIDE 2

Outline

  • Loop transfer recovery
slide-3
SLIDE 3

1

Recall

The LQR loop for the system

$$\dot{x}(t) = Ax(t) + Bu(t), \qquad u(t) \in \mathbb{R},$$

[Block diagram: feedback loop through $K(sI - A)^{-1}B$, with signal $u(s)$ entering at a "+" summing junction]

where $K$ is such that $u(t) = Kx(t)$ is the optimal policy for the problem

$$\min_{u(t)} \int_0^\infty x(t)^\top Q x(t) + u(t)^\top R u(t)\, dt,$$

has the following properties:

  • Guaranteed gain margins $[\tfrac{1}{2}, \infty)$ and phase margins $(-60, 60)$ degrees for any matrices $Q$ and $R$.
  • Guarantees on the sensitivity (<1) and complementary sensitivity (<2) for any $Q$, $R$.
  • Root square locus: place the poles of the loop by tuning $Q$, $R$.
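As a rough numerical illustration of the sensitivity bounds listed on this slide, the following MATLAB sketch evaluates the sensitivity and complementary sensitivity magnitudes of one LQR loop on a frequency grid. The plant, the weights and the grid are assumptions chosen for illustration, and `K = -lqr(...)` follows the sign convention $u(t) = Kx(t)$; a single example does not prove the general bounds.

```matlab
% Assumed example data (not prescribed by the slide)
A = [1 1; 0 1]; B = [0; 1];
Q = [1 1]'*[1 1]; R = 1;
K = -lqr(A, B, Q, R);                    % convention u = K*x

w = logspace(-2, 3, 400);                % frequency grid in rad/s
S = zeros(size(w)); T = zeros(size(w));
for i = 1:numel(w)
    s  = 1j*w(i);
    Lw = -K*((s*eye(2) - A)\B);          % negative-feedback loop gain of the LQR loop
    S(i) = abs(1/(1 + Lw));              % sensitivity magnitude
    T(i) = abs(Lw/(1 + Lw));             % complementary sensitivity magnitude
end
fprintf('max |S| = %.4f, max |T| = %.4f on the grid\n', max(S), max(T));
```

On this grid the first value should stay below 1 and the second below 2, in line with the stated guarantees.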

slide-4
SLIDE 4

2

Recall

By duality, the Kalman loop for the system

$$\dot{x}(t) = Ax(t) + w(t), \qquad y(t) = Cx(t) + n(t), \qquad y(t) \in \mathbb{R},$$

$$E[w(t)w(t+\tau)^\top] = W\delta(\tau), \qquad E[n(t)n(t+\tau)^\top] = V\delta(\tau),$$

[Block diagram: loop through $C(sI - A)^{-1}L$, where $\hat{y}(s)$ is subtracted from $y(s)$ at a "+ −" summing junction]

where $L$ is such that

$$\dot{\hat{x}}(t) = A\hat{x}(t) + Bu(t) + L(y(t) - C\hat{x}(t))$$

is the optimal estimator (in the sense of slide 11, lecture 11), has the following properties:

  • Guaranteed gain margins $[\tfrac{1}{2}, \infty)$ and phase margins $(-60, 60)$ degrees for any matrices $W$ and $V$.
  • Guarantees on the sensitivity (<1) and complementary sensitivity (<2) for any $W$, $V$.
  • Root square locus: place the poles of the loop by tuning $W$, $V$.

slide-5
SLIDE 5

3

Motivation for today’s lecture

Since there are guaranteed frequency domain properties for the LQR and Kalman loops (discovered by Kalman in the 60's), it would be reasonable to search for such properties for LQG loops. However, in 1978, John Doyle wrote a paper giving an example, for some matrices $Q$, $R$, $W$, $V$, of an LQG loop without such guarantees.

slide-6
SLIDE 6

4

John Doyle’s Example

[Figure: Nyquist diagram of the LQG loop (Real Axis versus Imaginary Axis)]

% Doyle's example: plant, LQR weights and noise intensities
A = [1 1; 0 1]; B = [0; 1]; C = [1 0]; n = size(A,1);
Q = 1*[1 1]'*[1 1]; R = 0.001;
W = [1 1]'*[1 1]; V = 1;
K = lqr(A,B,Q,R); K = -K;   % convention u = K*x
[~,L] = kalman(ss(A,[B eye(n)],C,[0 zeros(1,n)]),W,V);   % Kalman gain
% LQG controller from y to u (output matrix -K) and plant
[numLQG,denLQG] = ss2tf(A+B*K-L*C,L,-K,0);
[numplant,denplant] = ss2tf(A,B,C,0);
tflqg = tf(numLQG,denLQG); tfplant = tf(numplant,denplant);
nyquist(tfplant*tflqg)   % Nyquist diagram of the LQG loop

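Special case of John Doyle's example:

$$A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}, \quad B = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \quad C = \begin{bmatrix} 1 & 0 \end{bmatrix}, \quad Q = W = \begin{bmatrix} 1 \\ 1 \end{bmatrix}\begin{bmatrix} 1 & 1 \end{bmatrix}, \quad R = 0.01$$

[Block diagram: LQG loop formed by the controller $K(sI - (A + BK - LC))^{-1}L$ and the plant $C(sI - A)^{-1}B$, with signals $u(s)$ and $y(s)$]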
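To make the poor margins of this example concrete, the sketch below checks closed-loop stability directly when the plant input gain is scaled by a factor `m`. It uses the values from the MATLAB listing on this slide (so `R = 0.001`). The use of `lqe` instead of `kalman`, the helper names `Acl` and `maxRe`, and the 5% perturbation are assumptions made for this illustration.

```matlab
% Doyle's example with the values of the MATLAB listing on this slide
A = [1 1; 0 1]; B = [0; 1]; C = [1 0]; n = size(A,1);
Q = [1 1]'*[1 1]; R = 0.001; W = [1 1]'*[1 1]; V = 1;
K = -lqr(A, B, Q, R);              % convention u = K*xhat
L = lqe(A, eye(n), C, W, V);       % Kalman gain (process noise enters through eye(n))

% Closed loop of plant and LQG controller, states [x; xhat];
% m scales the plant input gain (m = 1 is the nominal loop)
Acl   = @(m) [A, m*B*K; L*C, A + B*K - L*C];
maxRe = @(m) max(real(eig(Acl(m))));

fprintf('m = 1.00: max real part of closed-loop poles = %.4f\n', maxRe(1));
fprintf('m = 1.05: max real part of closed-loop poles = %.4f\n', maxRe(1.05));
```

The nominal loop is stable, while a 5% increase of the loop gain already yields a closed-loop pole in the right half plane, so the upward gain margin is below 1.05.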

slide-7
SLIDE 7

5

Discussion

  • While, as John Doyle's example shows, there are LQG loops where the gain and phase margins can be arbitrarily small, it is possible to avoid such cases via loop transfer recovery.
  • Loop transfer recovery states that, for minimum-phase systems, the LQG loop transfer function converges to the open-loop transfer function of the Kalman filter loop when the LQR control penalty converges to zero (cheap control).
  • Since the Kalman filter loop has good frequency domain properties, LQG control tuned in this way can also have good frequency domain properties (e.g. large gain and phase margins).
  • Minimum-phase single-input single-output (SISO) systems are those whose zeros (if any) lie in the left half of the complex plane (stable zeros).
  • Non-minimum-phase systems have limitations in the achievable closed-loop performance and, although there are some results for these systems as well, we will not address them here.
  • We first need a result on LQR and Kalman loops; we then state the general loop transfer recovery result for model-based controllers and specialise it to LQG loops.
  • For simplicity we will consider SISO systems, although the results extend to MIMO.
slide-8
SLIDE 8

6

Preliminary result

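Consider the family of gains $K_\rho$ obtained from the optimal policy $u(t) = K_\rho x(t)$ for the problem

$$\min_{u(t)} \int_0^\infty x(t)^\top Q x(t) + u(t)^\top R u(t)\, dt, \qquad \dot{x}(t) = Ax(t) + Bu(t), \qquad u(t) \in \mathbb{R},$$

with the weights $Q = C^\top C$ and $R = \rho$. Then

$$\lim_{\rho \to 0} (\sqrt{\rho} K_\rho) = -C \quad \text{or} \quad \lim_{\rho \to 0} (\sqrt{\rho} K_\rho) = C.$$

Note that, by duality, if $W = HH^\top$ and $V = \theta \to 0$ then the Kalman gains satisfy

$$\lim_{\theta \to 0} (\sqrt{\theta} L_\theta) = -H \quad \text{or} \quad \lim_{\theta \to 0} (\sqrt{\theta} L_\theta) = H.$$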
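The limit for the LQR gains can be observed numerically. The sketch below is only an illustration under assumptions: it reuses the plant matrices of the Doyle example (which has no zeros), picks an arbitrary list of values for `rho`, and uses `K = -lqr(...)` to match the convention $u(t) = K_\rho x(t)$.

```matlab
% Assumed plant (taken from the Doyle example); Q = C'*C as in the result above
A = [1 1; 0 1]; B = [0; 1]; C = [1 0];
Q = C'*C;
rhos = [1e-2 1e-4 1e-6 1e-8];        % decreasing control penalties R = rho
err  = zeros(size(rhos));
for i = 1:numel(rhos)
    rho = rhos(i);
    K = -lqr(A, B, Q, rho);          % convention u = K*x
    err(i) = norm(sqrt(rho)*K + C);  % distance between sqrt(rho)*K and -C
    fprintf('rho = %g: sqrt(rho)*K = [%.4f %.4f]\n', rho, sqrt(rho)*K);
end
```

For this plant the scaled gain approaches $-C = [-1 \;\; 0]$, although slowly: the second entry decays roughly like $\rho^{1/4}$.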

slide-9
SLIDE 9

7

Justification

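Start with the continuous-time algebraic Riccati equation

$$A^\top P + PA - PB\rho^{-1}B^\top P + C^\top C = 0$$

or equivalently, with $K = -\rho^{-1}B^\top P$,

$$A^\top P + PA - K^\top \rho K + C^\top C = 0,$$

and note that if $R = \rho \to 0$ then $P \to 0$, since

$$\rho A^\top P + \rho PA - PBB^\top P + \rho C^\top C = 0.$$

The terms $A^\top P + PA$ then vanish, and in the limit

$$\rho K^\top K \to C^\top C,$$

from which the conclusion follows.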
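A scalar case makes the argument explicit. The following worked example is an illustration under assumptions not made on the slide: a first-order system $\dot{x} = ax + bu$, $y = cx$ with $b > 0$ and $c > 0$, so that $Q = c^2$ and $R = \rho$.

```latex
% Assumed scalar system: \dot{x} = a x + b u, y = c x, with b > 0, c > 0
\begin{align*}
2aP - \frac{b^2}{\rho}P^2 + c^2 = 0
  \quad &\Longrightarrow \quad
  P_\rho = \frac{\rho}{b^2}\left(a + \sqrt{a^2 + \frac{b^2 c^2}{\rho}}\right)
  \quad \text{(positive root)}, \\
K_\rho = -\rho^{-1} b P_\rho
  &= -\frac{1}{b}\left(a + \sqrt{a^2 + \frac{b^2 c^2}{\rho}}\right), \\
\sqrt{\rho}\, K_\rho
  &= -\frac{1}{b}\left(\sqrt{\rho}\, a + \sqrt{\rho a^2 + b^2 c^2}\right)
  \;\xrightarrow[\rho \to 0]{}\; -c, \\
P_\rho &\approx \frac{c}{b}\sqrt{\rho} \;\to\; 0 .
\end{align*}
```

Here both claims can be read off directly: $P_\rho \to 0$ and $\sqrt{\rho}K_\rho \to -c$. With $b < 0$ the same computation gives the limit $+c$, which is why the result is stated with both signs.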

slide-10
SLIDE 10

8

Loop transfer recovery

Consider a linear system with no unstable zeros

$$\dot{x}(t) = Ax(t) + Bu(t), \qquad y(t) = Cx(t), \qquad y(t) \in \mathbb{R}, \quad u(t) \in \mathbb{R},$$

and a model-based controller, parameterized by gains $L$ and $K_\alpha$,

$$\dot{\bar{x}}(t) = A\bar{x}(t) + Bu(t) + L(y(t) - C\bar{x}(t)), \qquad u(t) = K_\alpha \bar{x}(t),$$

where:

(i) $L$ are fixed gains such that $(A - LC)$ is Hurwitz*.

(ii) $K_\alpha$ are a family of gains such that $(A + BK_\alpha)$ is Hurwitz and

$$\lim_{\alpha \to 0} (\sqrt{\alpha} K_\alpha) = -C \quad \text{or} \quad \lim_{\alpha \to 0} (\sqrt{\alpha} K_\alpha) = C.$$

Then, for each $s \in \mathbb{C}$,

$$\lim_{\alpha \to 0} C(sI - A)^{-1}B K_\alpha (sI - (A + BK_\alpha - LC))^{-1}L = C(sI - A)^{-1}L.$$

*A matrix is Hurwitz if all its eigenvalues have negative real part.
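The convergence can be checked at a single point $s$. The sketch below is an illustration under assumptions: the plant is the one from the Doyle example, $L$ is one particular Kalman gain computed with `lqe`, the gains $K_\alpha$ come from cheap-control LQR, and the test point is $s = j$. The LQG loop is formed as in the MATLAB listing of the Doyle example, with controller output matrix `-K`, that is, as a negative-feedback loop transfer function.

```matlab
% Assumed data: Doyle's plant, a fixed Kalman gain, and one test point s
A = [1 1; 0 1]; B = [0; 1]; C = [1 0]; n = size(A,1);
L = lqe(A, eye(n), C, [1 1]'*[1 1], 1);          % fixed gain, A - L*C is Hurwitz
s = 1j;
target = C*((s*eye(n) - A)\L);                   % Kalman loop C(sI-A)^{-1}L

alphas = [1e-2 1e-4 1e-6 1e-8];
relErr = zeros(size(alphas));
for i = 1:numel(alphas)
    K = -lqr(A, B, C'*C, alphas(i));             % K_alpha, convention u = K*xbar
    G = C*((s*eye(n) - A)\B);                    % plant C(sI-A)^{-1}B
    Ctrl = -K*((s*eye(n) - (A + B*K - L*C))\L);  % controller with output matrix -K
    relErr(i) = abs(G*Ctrl - target)/abs(target);
end
disp(relErr)                                     % relative error for each alpha
```

The relative error shrinks as `alpha` decreases, but only at a modest rate, so quite small penalties are needed for close recovery.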

slide-11
SLIDE 11

9

Loop transfer recovery for LQG

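Consider a linear system with no unstable zeros

$$\dot{x}(t) = Ax(t) + Bu(t), \qquad y(t) = Cx(t), \qquad y(t) \in \mathbb{R}, \quad u(t) \in \mathbb{R},$$

and an LQG controller

$$\dot{\hat{x}}(t) = A\hat{x}(t) + Bu(t) + L(y(t) - C\hat{x}(t)), \qquad u(t) = K_\rho \hat{x}(t),$$

where

$$L = \Phi C^\top V^{-1}, \qquad A\Phi + \Phi A^\top + W - \Phi C^\top V^{-1} C \Phi = 0,$$

$$K_\rho = -\rho^{-1} B^\top P_\rho, \qquad A^\top P_\rho + P_\rho A - P_\rho B \rho^{-1} B^\top P_\rho + C^\top C = 0.$$

Then, for each $s \in \mathbb{C}$,

$$\lim_{\rho \to 0} C(sI - A)^{-1}B K_\rho (sI - (A + BK_\rho - LC))^{-1}L = C(sI - A)^{-1}L.$$

Note that this is a direct consequence of the results of slides 6 and 8 (the preliminary result and the loop transfer recovery result).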

slide-12
SLIDE 12

10

Interpretation of LTR for LQG

  • Loop transfer recovery states that, for minimum phase systems, the LQG loop transfer function converges to the open loop transfer function of the Kalman filter loop when the LQR control penalty converges to zero (cheap control), $R = \rho \to 0$, and $Q = C^\top C$.

[Block diagram: the LQG loop, formed by the controller $K(sI - (A + BK - LC))^{-1}L$ and the plant $C(sI - A)^{-1}B$ with signals $u(s)$ and $y(s)$, tends (as $R = \rho \to 0$) to the Kalman loop through $C(sI - A)^{-1}L$, where $\hat{y}(s)$ is subtracted from $y(s)$]

slide-13
SLIDE 13

11

LTR/LQG design

Two steps

  • 1. Design Kalman loop such that desired frequency domain properties are obtained, obtaining gains $L$ (see lecture 11).

[Block diagram: Kalman loop through $C(sI - A)^{-1}L$, where $\hat{y}(s)$ is subtracted from $y(s)$]

  • 2. Make $Q = C^\top C$ and $R = \rho$ very small and obtain LQR gain $K$.

[Block diagram: LQG loop formed by the controller $K(sI - (A + BK - LC))^{-1}L$ and the plant $C(sI - A)^{-1}B$, with signals $u(s)$ and $y(s)$]

Then the LQG controller

$$\dot{\hat{x}}(t) = A\hat{x}(t) + Bu(t) + L(y(t) - C\hat{x}(t)), \qquad u(t) = K\hat{x}(t)$$

will yield a closed loop with similar frequency domain properties to the designed Kalman loop.

Interesting: design the observer first and then design a fast controller! (contrary to the traditional paradigm of designing the controller first and then making the observer fast).

slide-14
SLIDE 14

12

Example

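Let us use the LTR/LQG procedure to design a controller with good gain and phase margins for the following system.

Transfer function:

$$\frac{1}{(s+1)(s-2)(s+3)}$$

[Block diagram: plant $\frac{1}{(s+1)(s-2)(s+3)}$ with input $u(s)$ and output $y(s)$, in a feedback loop with a controller still to be designed ("?")]

State-space:

$$A = \begin{bmatrix} -2 & 5 & 6 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}, \quad B = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \quad C = \begin{bmatrix} 0 & 0 & 1 \end{bmatrix}$$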
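A quick consistency check between the two descriptions: the sketch below assumes the controllable canonical realization $A = [-2\ 5\ 6;\ 1\ 0\ 0;\ 0\ 1\ 0]$, $B = [1;\ 0;\ 0]$, $C = [0\ 0\ 1]$ and compares its transfer function with $1/((s+1)(s-2)(s+3))$. The variable names are chosen for this illustration only.

```matlab
% Assumed state-space realization (controllable canonical form)
A = [-2 5 6; 1 0 0; 0 1 0]; B = [1; 0; 0]; C = [0 0 1];

den = poly(A);                                    % characteristic polynomial of A
denExpected = conv(conv([1 1], [1 -2]), [1 3]);   % coefficients of (s+1)(s-2)(s+3)
[num, ~] = ss2tf(A, B, C, 0);                     % numerator of C(sI-A)^{-1}B

disp(den); disp(denExpected); disp(num);
```

Both denominators are $s^3 + 2s^2 - 5s - 6$ and the numerator is 1, so the realization matches the transfer function. The pole at $s = 2$ makes the plant unstable, and there are no zeros, so the minimum-phase assumption holds.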

slide-15
SLIDE 15

13

Example

  • 1. Design Kalman loop such that desired frequency domain properties are obtained (gain and phase margins, sensitivity and complementary sensitivity), using a matrix $W$ (only the entries 1, 1 survive in this transcript) and $V = 0.1$.

[Figures: Bode diagrams (Magnitude in dB and Phase in deg versus Frequency in rad/s) of the open loop, the sensitivity and the complementary sensitivity; Nyquist diagram (Real Axis versus Imaginary Axis) of the Kalman loop]

After tuning these are found acceptable.

slide-16
SLIDE 16

14

Example

  • 2. Make $Q = C^\top C$ and $R = \rho$ very small and obtain LQR gain $K$.

[Figure: Nyquist diagram (Real Axis versus Imaginary Axis) for $\rho = 1 \times 10^{-4}$. Blue: Kalman loop. Red: LQG loop.]

slide-17
SLIDE 17

15

Example

  • 2. Make $Q = C^\top C$ and $R = \rho$ very small and obtain LQR gain $K$.

[Figure: Nyquist diagram (Real Axis versus Imaginary Axis) for $\rho = 1 \times 10^{-7}$. Blue: Kalman loop. Red: LQG loop.]

slide-18
SLIDE 18

16

Example

  • 2. Make $Q = C^\top C$ and $R = \rho$ very small and obtain LQR gain $K$.

[Figure: Nyquist diagram (Real Axis versus Imaginary Axis) for $\rho = 1 \times 10^{-10}$. Blue: Kalman loop. Red: LQG loop.]

slide-19
SLIDE 19

17

Example

  • 2. Make $Q = C^\top C$ and $R = \rho$ very small and obtain LQR gain $K$.

[Figure: Nyquist diagram (Real Axis versus Imaginary Axis) for $\rho = 1 \times 10^{-14}$. Blue: Kalman loop. Red: LQG loop.]
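The sweep over $\rho$ shown in these figures can be sketched as follows. Several things here are assumptions rather than the slide's data: `W = eye(n)` is a placeholder because the matrix $W$ is not fully legible in this transcript, `lqe` is used for the Kalman gain, the comparison is made on a low-frequency grid instead of a Nyquist plot, and only three values of `rho` are used. The LQG loop is formed with controller output matrix `-K`, as in the MATLAB listing of the Doyle example.

```matlab
% Plant of the example; W is a placeholder assumption, not the slide's value
A = [-2 5 6; 1 0 0; 0 1 0]; B = [1; 0; 0]; C = [0 0 1]; n = size(A,1);
W = eye(n); V = 0.1;
L = lqe(A, eye(n), C, W, V);                 % step 1: Kalman gain

w = logspace(-1, 0, 30);                     % frequency grid in rad/s
rhos = [1e-4 1e-7 1e-10];
relErr = zeros(size(rhos)); stable = false(size(rhos));
for i = 1:numel(rhos)
    K = -lqr(A, B, C'*C, rhos(i));           % step 2: cheap-control LQR gain
    Acl = [A, B*K; L*C, A + B*K - L*C];      % plant and LQG controller, states [x; xhat]
    stable(i) = all(real(eig(Acl)) < 0);
    e = zeros(size(w));
    for k = 1:numel(w)
        s = 1j*w(k);
        kal = C*((s*eye(n) - A)\L);          % Kalman loop
        lqg = (C*((s*eye(n) - A)\B)) * (-K*((s*eye(n) - (A + B*K - L*C))\L));  % LQG loop
        e(k) = abs(lqg - kal)/abs(kal);
    end
    relErr(i) = max(e);                      % worst relative mismatch on the grid
end
disp([rhos; relErr])
```

Smaller values of `rho` should bring the LQG loop closer to the Kalman loop on the grid, mirroring how the red curve approaches the blue one in the figures.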

slide-20
SLIDE 20

18

John Doyle’s example

Going back to John Doyle's example

$$A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}, \quad B = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \quad C = \begin{bmatrix} 1 & 0 \end{bmatrix}, \quad Q = W = \begin{bmatrix} 1 \\ 1 \end{bmatrix}\begin{bmatrix} 1 & 1 \end{bmatrix}, \quad R = 0.01,$$

suppose that we consider instead the parameters dictated by LTR

$$Q = C^\top C = \begin{bmatrix} 1 \\ 0 \end{bmatrix}\begin{bmatrix} 1 & 0 \end{bmatrix}, \qquad R = \rho \to 0;$$

then we obtain the following Nyquist diagram (nice gain and phase margins!).

[Figure: Nyquist diagram (Real Axis versus Imaginary Axis)]
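The improvement can also be checked without a plot. The sketch below scales the plant input gain by a factor `m` and tests closed-loop stability for the LTR choice of weights. The value `rho = 1e-8`, the list of factors, the use of `lqe`, and the Kalman filter data ($W = [1\ 1]^\top[1\ 1]$, $V = 1$, as in the earlier MATLAB listing) are assumptions made for this illustration.

```matlab
% Doyle's plant with LTR weights Q = C'*C and a small R = rho (assumed value)
A = [1 1; 0 1]; B = [0; 1]; C = [1 0]; n = size(A,1);
L = lqe(A, eye(n), C, [1 1]'*[1 1], 1);    % Kalman gain, W and V as in the earlier listing
rho = 1e-8;
K = -lqr(A, B, C'*C, rho);                 % convention u = K*xhat

% Closed loop with the plant input gain scaled by m, states [x; xhat]
Acl = @(m) [A, m*B*K; L*C, A + B*K - L*C];
ms = [0.6 1 2];                            % gain factors to test
maxRe = arrayfun(@(m) max(real(eig(Acl(m)))), ms);
disp([ms; maxRe])
```

With these weights the closed loop remains stable when the loop gain is reduced to 0.6 or doubled, whereas the original weights of the example do not tolerate even a small gain increase.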

slide-21
SLIDE 21

19

Concluding remarks

After this lecture you should be able to

  • Synthesize an LQG controller with reasonable phase and gain margins, using the loop transfer recovery method.

Summary

  • LQG controllers can have arbitrarily poor gain and phase margins.
  • However, by properly selecting the gains of the controller (LQR), one can recover the good properties of the Kalman loop.