4SC000 Q2 2017-2018
Optimal Control and Dynamic Programming
Duarte Antunes
Outline
Loop transfer recovery
1
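Recall
The LQR loop for the system
$$\dot{x}(t) = Ax(t) + Bu(t), \qquad u(t) \in \mathbb{R},$$
[Block diagram: loop closed through $K(sI - A)^{-1}B$, with the signal $u(s)$ entering a summing junction (+).]
where $K$ is such that $u(t) = Kx(t)$ is the optimal policy for the problem
$$\min_{u(t)} \int_0^\infty x(t)^\top Q x(t) + u(t)^\top R u(t)\, dt,$$
has the following properties, for any admissible weights $Q$ and $R$:
gain margin $[\tfrac{1}{2}, \infty)$
phase margin $(-60, 60)$ degrees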
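A brief sketch of where these margins come from, assuming a scalar input, $R > 0$, $Q \succeq 0$, and the sign convention $u(t) = Kx(t)$ (so that $K = -R^{-1}B^\top P$, with $P$ the stabilizing Riccati solution). The return-difference identity below is the standard Kalman inequality; it is added for illustration and is not taken from the slides.

```latex
\begin{align*}
R\,\bigl|1 - K(j\omega I - A)^{-1}B\bigr|^2
  &= R + \bigl\| Q^{1/2}(j\omega I - A)^{-1}B \bigr\|^2 \\
\Longrightarrow\quad
\bigl|1 - K(j\omega I - A)^{-1}B\bigr| &\ge 1
  \quad \text{for all } \omega \in \mathbb{R}.
\end{align*}
```

Read in the negative-feedback convention, the loop gain $-K(j\omega I - A)^{-1}B$ therefore stays outside the unit disc centred at $-1$, which gives the gain margin $[\tfrac{1}{2}, \infty)$ and a phase margin of at least 60 degrees.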
2
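By duality, the Kalman loop for the system
$$\dot{x}(t) = Ax(t) + w(t), \qquad y(t) = Cx(t) + n(t), \qquad y(t) \in \mathbb{R},$$
$$E[w(t)w(t+\tau)^\top] = W\delta(\tau), \qquad E[n(t)n(t+\tau)^\top] = V\delta(\tau),$$
[Block diagram: loop closed through $C(sI - A)^{-1}L$, with $y(s)$ (+) and $\hat{y}(s)$ (−) entering the summing junction.]
where $L$ is such that
$$\dot{\hat{x}}(t) = A\hat{x}(t) + Bu(t) + L(y(t) - C\hat{x}(t))$$
is the optimal estimator (in the sense of slide 11, lecture 11), has the following properties, for any admissible $W$ and $V$:
gain margin $[\tfrac{1}{2}, \infty)$
phase margin $(-60, 60)$ degrees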
3
Since there are guaranteed frequency-domain properties for the LQR and Kalman loops (discovered by Kalman in the 1960s), it would be reasonable to search for such properties for LQG loops. However, in 1978, John Doyle wrote the paper "Guaranteed margins for LQG regulators", finding an example in which, for some matrices $Q$, $R$, $W$, $V$, the gain and phase margins of the LQG loop are arbitrarily small.
4
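Special case of John Doyle's example
$$A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}, \quad B = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \quad C = \begin{bmatrix} 1 & 0 \end{bmatrix}, \quad Q = W = \begin{bmatrix} 1 \\ 1 \end{bmatrix}\begin{bmatrix} 1 & 1 \end{bmatrix}, \quad R = 0.01$$
[Block diagram: LQG loop with controller $K(sI - (A + BK - LC))^{-1}L$ and plant $C(sI - A)^{-1}B$; signals $u(s)$ and $y(s)$, summing junction (+).]

    % Plant and weights
    A = [1 1; 0 1]; B = [0; 1]; C = [1 0];
    n = size(A,1);
    Q = 1*[1 1]'*[1 1]; R = 0.001;
    W = [1 1]'*[1 1]; V = 1;
    % LQR gain, sign flipped so that u = K*xhat
    K = lqr(A,B,Q,R); K = -K;
    % Kalman gain L; the plant inputs are [u w], with w entering through eye(n)
    [~,L] = kalman(ss(A,[B eye(n)],C,[0 zeros(1,n)]),W,V);
    % Controller (from y to -u) and plant as transfer functions
    [numLQG,denLQG] = ss2tf(A+B*K-L*C, L, -K, 0);
    [numplant,denplant] = ss2tf(A, B, C, 0);
    tflqg = tf(numLQG,denLQG);
    tfplant = tf(numplant,denplant);
    % Nyquist plot of the negative-feedback loop gain
    nyquist(tfplant*tflqg)

[Figure: Nyquist diagram of the LQG loop (real axis vs. imaginary axis).]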
5
Although, as Doyle's example shows, the gain and phase margins of an LQG loop can be arbitrarily small, it is possible via loop transfer recovery to avoid these examples.
The key fact is that the LQG open-loop transfer function converges to the open-loop transfer function of the Kalman filter loop when the LQR control penalty converges to zero (cheap control).
Since the Kalman loop has guaranteed margins, an LQG loop tuned in this way can also have good frequency-domain properties (e.g. large gain and phase margins).
The result requires that the zeros of the system (if any) lie in the left half of the complex plane (stable zeros).
Systems with unstable zeros have inherent limitations on performance, and although there are some results also for these systems, we will not address them here.
The loop transfer result is first stated for model-based controllers and then specialised to LQG loops.
6
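Consider the family of gains $K_\rho$ obtained from the optimal policy $u(t) = K_\rho x(t)$ for the problem
$$\min_{u(t)} \int_0^\infty x(t)^\top Q x(t) + u(t)^\top R u(t)\, dt, \qquad \dot{x}(t) = Ax(t) + Bu(t), \qquad u(t) \in \mathbb{R},$$
with $Q = C^\top C$ and $R = \rho$. Then
$$\lim_{\rho \to 0} (\sqrt{\rho}\, K_\rho) = -C \quad \text{or} \quad \lim_{\rho \to 0} (\sqrt{\rho}\, K_\rho) = C.$$
Note that, by duality, if $W = HH^\top$ and $V = \theta \to 0$, then the Kalman gains satisfy
$$\lim_{\theta \to 0} (\sqrt{\theta}\, L_\theta) = -H \quad \text{or} \quad \lim_{\theta \to 0} (\sqrt{\theta}\, L_\theta) = H.$$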
7
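Start with the continuous-time algebraic Riccati equation
$$A^\top P + PA - PB\rho^{-1}B^\top P + C^\top C = 0$$
and note that if $R = \rho \to 0$ then $P \to 0$, since
$$\rho A^\top P + \rho PA - PBB^\top P + \rho C^\top C = 0.$$
With $K = -\rho^{-1}B^\top P$ we have $PB\rho^{-1}B^\top P = K^\top \rho K$, so the Riccati equation reads
$$A^\top P + PA - K^\top \rho K + C^\top C = 0,$$
and then in the limit
$$\rho K^\top K \to C^\top C,$$
from which the conclusion follows.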
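As a sanity check of this limit, consider the scalar case $A = a$, $B = 1$, $C = 1$ (a hypothetical example, not from the slides), for which the Riccati equation can be solved in closed form.

```latex
\begin{align*}
2aP - \rho^{-1}P^2 + 1 = 0
  &\;\Longrightarrow\; P_\rho = \rho a + \sqrt{\rho^2 a^2 + \rho} \;\to\; 0, \\
K_\rho = -\rho^{-1}P_\rho
  &= -a - \sqrt{a^2 + \rho^{-1}}, \\
\sqrt{\rho}\,K_\rho
  &= -\sqrt{\rho}\,a - \sqrt{\rho a^2 + 1} \;\to\; -1 = -C
  \quad (\rho \to 0).
\end{align*}
```

Here $\rho K_\rho^2 \to 1 = C^\top C$, in agreement with the argument above; the gain itself grows like $\rho^{-1/2}$, which is why the scaled quantity $\sqrt{\rho}\,K_\rho$ is the one that converges.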
8
Consider a linear system with no unstable zeros
$$\dot{x}(t) = Ax(t) + Bu(t), \qquad y(t) = Cx(t), \qquad y(t) \in \mathbb{R}, \quad u(t) \in \mathbb{R},$$
and a model-based controller, parameterized by gains $K_\alpha$ and $L$,
$$\dot{\bar{x}}(t) = A\bar{x}(t) + Bu(t) + L(y(t) - C\bar{x}(t)), \qquad u(t) = K_\alpha \bar{x}(t),$$
where:
(i) $L$ is a fixed gain such that $(A - LC)$ is Hurwitz*;
(ii) $K_\alpha$ is a family of gains such that $(A + BK_\alpha)$ is Hurwitz and
$$\lim_{\alpha \to 0} (\sqrt{\alpha}\, K_\alpha) = -C \quad \text{or} \quad \lim_{\alpha \to 0} (\sqrt{\alpha}\, K_\alpha) = C.$$
Then, for each $s \in \mathbb{C}$,
$$\lim_{\alpha \to 0} C(sI - A)^{-1}B\, K_\alpha (sI - (A + BK_\alpha - LC))^{-1}L = C(sI - A)^{-1}L.$$
*A matrix is Hurwitz if all its eigenvalues have negative real part.
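A sketch of why the limit holds, under the stated assumptions (scalar $u$ and $y$) and for $s$ at which the expressions are defined. The shorthand $G_B$, $G_L$ and $M$ is introduced here for illustration and does not appear on the slides.

```latex
\begin{align*}
&G_B(s) = C(sI-A)^{-1}B, \qquad G_L(s) = C(sI-A)^{-1}L, \qquad
 M(s) = (sI - A + LC)^{-1}. \\[4pt]
&K_\alpha\bigl(sI - (A + BK_\alpha - LC)\bigr)^{-1}L
  = K_\alpha\bigl(M^{-1} - BK_\alpha\bigr)^{-1}L
  = \frac{K_\alpha M L}{1 - K_\alpha M B}
  = \frac{\sqrt{\alpha}\,K_\alpha M L}{\sqrt{\alpha} - \sqrt{\alpha}\,K_\alpha M B}. \\[4pt]
&\text{With } \sqrt{\alpha}\,K_\alpha \to \pm C: \qquad
  K_\alpha\bigl(sI - (A + BK_\alpha - LC)\bigr)^{-1}L
  \;\to\; -\frac{C M L}{C M B}. \\[4pt]
&CML = \frac{G_L}{1 + G_L}, \qquad CMB = \frac{G_B}{1 + G_L}
  \quad\Longrightarrow\quad
  G_B \cdot \Bigl(-\frac{CML}{CMB}\Bigr) = -\,G_L .
\end{align*}
```

In the limit the controller inverts the plant, since $K_\alpha(\cdot)^{-1}L \to -G_L/G_B$, so the zeros of $G_B$ become controller poles; this is where the assumption of no unstable zeros enters. The minus sign reflects the convention $u(t) = K_\alpha \bar{x}(t)$ with a positive summing junction: read as a negative-feedback loop gain, $-C(sI-A)^{-1}BK_\alpha(\cdot)^{-1}L$ tends to $C(sI-A)^{-1}L$, which is the Kalman loop.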
9
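Consider a linear system with no unstable zeros
$$\dot{x}(t) = Ax(t) + Bu(t), \qquad y(t) = Cx(t), \qquad y(t) \in \mathbb{R}, \quad u(t) \in \mathbb{R},$$
and an LQG controller
$$\dot{\hat{x}}(t) = A\hat{x}(t) + Bu(t) + L(y(t) - C\hat{x}(t)), \qquad u(t) = K_\rho \hat{x}(t),$$
where
$$L = \Phi C^\top V^{-1}, \qquad A\Phi + \Phi A^\top + W - \Phi C^\top V^{-1} C \Phi = 0,$$
$$K_\rho = -\rho^{-1}B^\top P_\rho, \qquad A^\top P_\rho + P_\rho A - P_\rho B \rho^{-1} B^\top P_\rho + C^\top C = 0.$$
Then, for each $s \in \mathbb{C}$,
$$\lim_{\rho \to 0} C(sI - A)^{-1}B\, K_\rho (sI - (A + BK_\rho - LC))^{-1}L = C(sI - A)^{-1}L.$$
Note that this is a direct consequence of the results of slides 6 and 8.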
10
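The LQG open-loop transfer function converges to the open-loop transfer function of the Kalman filter loop when the LQR control penalty converges to zero (cheap control) and $Q = C^\top C$:
$$C(sI - A)^{-1}B\, K(sI - (A + BK - LC))^{-1}L \;\to\; C(sI - A)^{-1}L \qquad (\text{as } R = \rho \to 0).$$
[Block diagrams: the LQG loop, with controller $K(sI - (A + BK - LC))^{-1}L$ and plant $C(sI - A)^{-1}B$ (signals $u(s)$, $y(s)$), and the Kalman loop $C(sI - A)^{-1}L$ (signals $y(s)$, $\hat{y}(s)$) to which it converges.]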
11
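Two steps
1. Design the gain $L$ so that the Kalman loop $C(sI - A)^{-1}L$ has the desired frequency-domain properties.
[Block diagram: Kalman loop $C(sI - A)^{-1}L$ with signals $y(s)$ (+) and $\hat{y}(s)$ (−).]
2. Design the gain $K$ as the LQR gain for $Q = C^\top C$, $R = \rho$, with $\rho$ small.
[Block diagram: LQG loop with controller $K(sI - (A + BK - LC))^{-1}L$ and plant $C(sI - A)^{-1}B$; signals $u(s)$, $y(s)$.]
Then the LQG controller
$$\dot{\hat{x}}(t) = A\hat{x}(t) + Bu(t) + L(y(t) - C\hat{x}(t)), \qquad u(t) = K\hat{x}(t)$$
will yield a closed loop with frequency-domain properties similar to those of the designed Kalman loop.
Interesting: design the observer first and then design a fast controller! (Contrary to the traditional paradigm of designing the controller first and then making the observer fast.)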
12
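Let us use the LTR/LQG procedure to design a controller with good gain and phase margins for the following system.
Transfer function:
$$\frac{1}{(s+1)(s-2)(s+3)}$$
[Block diagram: unknown controller "?" in a feedback loop with the plant $\frac{1}{(s+1)(s-2)(s+3)}$; signals $u(s)$, $y(s)$, summing junction (+).]
State-space:
$$A = \begin{bmatrix} -2 & 5 & 6 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}, \quad B = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \quad C = \begin{bmatrix} 0 & 0 & 1 \end{bmatrix}$$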
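The state-space data can be checked against the transfer function by expanding the denominator; the entries $-2$, $5$, $6$ in the first row of $A$ are the negated coefficients of a controllable canonical form. The state names $x_1, x_2, x_3$ below are introduced only for this check.

```latex
\begin{align*}
(s+1)(s-2)(s+3) &= (s^2 - s - 2)(s+3) = s^3 + 2s^2 - 5s - 6, \\
\dot{x}_1 &= -2x_1 + 5x_2 + 6x_3 + u, \qquad
\dot{x}_2 = x_1, \qquad \dot{x}_3 = x_2, \qquad y = x_3 .
\end{align*}
```

With $y = x_3$, $\dot{y} = x_2$ and $\ddot{y} = x_1$, the first equation becomes $\dddot{y} + 2\ddot{y} - 5\dot{y} - 6y = u$, which matches the denominator above. The pole at $s = 2$ makes the plant open-loop unstable.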
13
Step 1: tune the Kalman loop for good frequency-domain properties (gain and phase margins, sensitivity and complementary sensitivity). After tuning, these are found acceptable.
W = 1 1
[Figures: three Bode diagrams (magnitude in dB and phase in deg vs. frequency in rad/s, from $10^{-1}$ to $10^{2}$) of the Kalman loop: open loop, sensitivity, and complementary sensitivity.]
14
LQR gain $K$ designed with $Q = C^\top C$, $R = \rho$.
[Figure: Nyquist diagram (real axis vs. imaginary axis) for $\rho = 1 \times 10^{-4}$. Blue: Kalman loop. Red: LQG loop.]
15
LQR gain $K$ designed with $Q = C^\top C$, $R = \rho$.
[Figure: Nyquist diagram (real axis vs. imaginary axis) for $\rho = 1 \times 10^{-7}$. Blue: Kalman loop. Red: LQG loop.]
16
LQR gain $K$ designed with $Q = C^\top C$, $R = \rho$.
[Figure: Nyquist diagram (real axis vs. imaginary axis) for $\rho = 1 \times 10^{-10}$. Blue: Kalman loop. Red: LQG loop.]
17
LQR gain $K$ designed with $Q = C^\top C$, $R = \rho$.
[Figure: Nyquist diagram (real axis vs. imaginary axis) for $\rho = 1 \times 10^{-14}$. Blue: Kalman loop. Red: LQG loop.]
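The sweep over $\rho$ can be sketched in MATLAB along the lines of the earlier listing. This is a minimal sketch, not the script used to produce the figures: the noise intensities `W` and `V` below are placeholder assumptions (the tuned values are not recoverable from this transcript), and the variable names are illustrative. It relies only on `lqr`, `ss` and `nyquist` from the Control System Toolbox.

```matlab
% Plant 1/((s+1)(s-2)(s+3)) in controllable canonical form
A = [-2 5 6; 1 0 0; 0 1 0];
B = [1; 0; 0];
C = [0 0 1];
plant = ss(A, B, C, 0);

% Step 1: Kalman gain from the dual LQR problem.
% W and V are ASSUMED placeholder values, not the tuned ones.
W = B*B';                  % assumed process-noise intensity
V = 1;                     % assumed measurement-noise intensity
L = lqr(A', C', W, V)';    % L = Phi*C'/V, Phi solving the filter Riccati equation
kalmanLoop = ss(A, L, C, 0);             % C(sI-A)^{-1}L

% Step 2: cheap-control LQR gains with Q = C'*C and R = rho
for rho = [1e-4 1e-7 1e-10]
    K = -lqr(A, B, C'*C, rho);           % sign flipped so that u = K*xhat
    ctrl = ss(A + B*K - L*C, L, -K, 0);  % controller from y to -u
    lqgLoop = plant*ctrl;                % negative-feedback loop gain
    figure;
    nyquist(kalmanLoop, 'b', lqgLoop, 'r');
    title(sprintf('rho = %g', rho));
end
```

As $\rho$ decreases, the red LQG curve should approach the blue Kalman curve. For very small $\rho$ the Riccati solve becomes ill-conditioned, so numerical warnings are to be expected.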
18
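Going back to John Doyle's example
$$A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}, \quad B = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \quad C = \begin{bmatrix} 1 & 0 \end{bmatrix}, \quad Q = W = \begin{bmatrix} 1 \\ 1 \end{bmatrix}\begin{bmatrix} 1 & 1 \end{bmatrix}, \quad V = 1, \quad R = 0.01,$$
suppose that we consider instead the parameters dictated by LTR,
$$Q = C^\top C = \begin{bmatrix} 1 \\ 0 \end{bmatrix}\begin{bmatrix} 1 & 0 \end{bmatrix}, \qquad R = \rho \to 0.$$
Then we obtain the following Nyquist diagram (nice gain and phase margins!).
[Figure: Nyquist diagram of the resulting LQG loop (real axis vs. imaginary axis).]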
19
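Summary
After this lecture you should be able to
explain why LQG loops, unlike LQR and Kalman loops, have no guaranteed gain and phase margins;
design an LQG controller with good gain and phase margins, using the loop transfer recovery method.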