4SC000 Q2 2017-2018
Optimal Control and Dynamic Programming
Duarte Antunes
Recall

Formulation:
  discrete problems - transition diagram
  stage decision problems - discrete-time system & additive cost function
  continuous-time control problems - differential equations & additive cost function
DP algorithm:
  discrete problems - graphical DP algorithm & DP equation
  stage decision problems - DP equation
  continuous-time control problems - Hamilton-Jacobi-Bellman equation
Partial information:
  discrete problems - Bayesian inference & decisions based on beliefs
  stage decision problems - Kalman filter and separation principle
  continuous-time control problems - continuous-time Kalman filter and separation principle
Alternative algorithms:
  discrete problems - Dijkstra's algorithm
  stage decision problems - static optimization
  continuous-time control problems - Pontryagin's maximum principle (PMP)

Today: continuous-time Kalman filter and separation principle. And a new topic - frequency domain properties of LQR.
However, how to define disturbances for continuous-time systems? It is quite challenging! White noise disturbances are one of the few ways to define disturbances without ''memory'' for continuous-time systems. The analogous problem to linear quadratic control for continuous-time systems would then be

ẋ(t) = Ax(t) + Bu(t) + w(t)

min_{u(t)=μ(t,x(t))} E[∫_0^T x(t)ᵀQx(t) + u(t)ᵀRu(t) dt + x(T)ᵀQ_T x(T)]
Let us start with a scalar white noise process ω(t) ∈ ℝ. Very interesting (or strange!) properties:
- the amplitude has infinite variance: E[ω(t)ω(t)] = E[ω(t)²] = ∞
- the autocorrelation is zero for any nonzero lag: R(τ) = E[ω(t)ω(t + τ)] = 0 for τ ≠ 0
- the scalar white noise process is characterized by the amplitude a of its autocorrelation: R(τ) = aδ(τ)
The integral of white noise, ẋ(t) = w(t) (i.e., x = (1/s)w), is called a random walk or the Wiener process. It is also Gaussian for fixed time, it is more intuitive than white noise, and it is easier to handle mathematically. Note that the values of x(t) at different times are correlated.
Consider now the disturbed linear system

ẋ(t) = Ax(t) + Bu(t) + w(t),  x(t) ∈ ℝⁿ, w(t) ∈ ℝⁿ

where w(t) = N w̄(t), w̄(t) = [w̄₁(t) w̄₂(t) … w̄_p(t)]ᵀ, and the w̄ᵢ(t) are uncorrelated scalar Gaussian white noise variables, E[w̄(t)w̄(t + τ)ᵀ] = Iδ(τ). Thus w(t) satisfies E[w(t)w(t + τ)ᵀ] = NNᵀδ(τ) := Wδ(τ). The state of such a differential equation is more intuitive than white noise.
To make sense of this problem, consider its discretization

x_{k+1} = A_d x_k + B_d u_k + w_k,  x_k := x(t_k),  t_k = kτ,  u(t) = u_k, t ∈ [t_k, t_{k+1})

where, as before, A_d = e^{Aτ}, B_d = ∫_0^τ e^{As}B ds, and the w_k are zero-mean Gaussian independent random variables with covariance

E[w_k w_kᵀ] = ∫_0^τ e^{As} W e^{Aᵀs} ds.

Since this discretized problem admits an optimal state feedback control law as for the deterministic version of the problem, the next results come with no surprise; the optimal cost-to-go is also a quadratic function.
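The discretized covariance integral can be evaluated in closed form; a minimal sketch using Van Loan's matrix-exponential method (the system matrices below are made-up placeholders, not lecture data):

% Van Loan's method: Wd = int_0^tau e^{A s} W e^{A' s} ds
A = [0 1; -2 -3];                    % hypothetical system matrix
W = diag([0 1]);                     % hypothetical noise intensity (W = N*N')
tau = 0.01;                          % sampling period
n = size(A,1);
M = expm([-A W; zeros(n) A']*tau);   % exponentiate a 2n x 2n block matrix
Ad = M(n+1:end,n+1:end)';            % equals expm(A*tau)
Wd = Ad*M(1:n,n+1:end)               % discrete noise covariance E[wk wk']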
The optimal control law for the problem

min_{u(t)=μ(t,x(t))} E[∫_0^T x(t)ᵀQx(t) + u(t)ᵀRu(t) dt + x(T)ᵀQ_T x(T)],  Q > 0, R > 0,

ẋ(t) = Ax(t) + Bu(t) + w(t),

where w(t) is zero-mean Gaussian white noise with E[w(t)w(t + τ)ᵀ] = Wδ(τ), is u(t) = K(t)x(t), t ∈ [0, T), where

K(t) = −R⁻¹BᵀP(t)
Ṗ(t) = −(AᵀP(t) + P(t)A − P(t)BR⁻¹BᵀP(t) + Q),  P(T) = Q_T.
The optimal control law for the problem

min_{u(t)=μ(x(t))} lim_{T→∞} (1/T) E[∫_0^T x(t)ᵀQx(t) + u(t)ᵀRu(t) dt],  Q > 0, R > 0, (A, B) controllable,

ẋ(t) = Ax(t) + Bu(t) + w(t),

where w(t) is zero-mean Gaussian white noise with E[w(t)w(t + τ)ᵀ] = Wδ(τ), is u(t) = Kx(t), where K = −R⁻¹BᵀP and P is the unique positive definite solution to the (continuous-time) algebraic Riccati equation

AᵀP + PA − PBR⁻¹BᵀP + Q = 0.
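In matlab the gain and the Riccati solution are returned by lqr; a minimal sketch with assumed double-integrator data (not from the lecture):

A = [0 1; 0 0]; B = [0; 1];          % hypothetical system
Q = eye(2); R = 1;
[Km,P] = lqr(A,B,Q,R);               % matlab convention: u = -Km*x
K = -Km;                             % slide convention: u = K*x
A'*P + P*A - P*B*(R\(B'*P)) + Q      % residual of the ARE, ~ zero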
Problem formulation (output feedback)

ẋ(t) = Ax(t) + Bu(t) + w(t)
y(t) = Cx(t) + n(t)

where w(t) and n(t) are zero-mean Gaussian white noise processes with E[w(t)w(t + τ)ᵀ] = Wδ(τ), E[n(t)n(t + τ)ᵀ] = Vδ(τ), and the initial state x(0) is Gaussian with mean x̄₀ and covariance Φ̄₀. The control input can only depend on the information set I(t) = {y(s), u(s) | s ∈ [0, t)}:

min_{u(t)=μ(t,I(t))} E[∫_0^T x(t)ᵀQx(t) + u(t)ᵀRu(t) dt + x(T)ᵀQ_T x(T)]
The optimal control policy consists of an optimal estimator (Kalman filter, here in its continuous-time form, the Kalman-Bucy filter) + optimal controller (LQR). This is the separation principle; the proof of this result is mathematically quite involved.
Consider the problem of finding an estimator x̂ for the state of

ẋ(t) = Ax(t) + Bu(t) + w(t)
y(t) = Cx(t) + n(t)

as a function of the information set, which includes the measurements, where w(t) and n(t) are zero-mean Gaussian white noise with E[w(t)w(t + τ)ᵀ] = Wδ(τ), E[n(t)n(t + τ)ᵀ] = Vδ(τ), and the initial state is a Gaussian random variable with mean x̄₀ and covariance Φ̄₀. The optimal estimator, in the sense that it minimizes cᵀE[(x̂(t) − x(t))(x̂(t) − x(t))ᵀ | I(t)]c for any constant vector c, is the Kalman-Bucy filter

dx̂(t)/dt = Ax̂(t) + Bu(t) + L(t)(y(t) − Cx̂(t)),  x̂(0) = x̄₀,  t ≥ 0
L(t) = Φ(t)CᵀV⁻¹
Φ̇(t) = AΦ(t) + Φ(t)Aᵀ + W − Φ(t)CᵀV⁻¹CΦ(t),  Φ(0) = E[(x(0) − x̄₀)(x(0) − x̄₀)ᵀ] = Φ̄₀.
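The filter covariance ODE can be propagated with a standard integrator; a minimal sketch for an assumed scalar example (all numbers hypothetical):

A = -1; C = 1; W = 0.5; V = 0.1; Phi0 = 1;    % hypothetical scalar data
dPhi = @(t,Phi) A*Phi + Phi*A' + W - Phi*(C'/V)*C*Phi;  % Riccati ODE
[t,Phi] = ode45(dPhi,[0 5],Phi0);
L = Phi*C'/V;                                 % time-varying gain L(t)
plot(t,Phi), xlabel('t'), ylabel('\Phi(t)')   % converges to the ARE solution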
The optimal control input for the output feedback linear quadratic optimal control problem is u(t) = K(t)x̂(t), t ∈ [0, T), where

K(t) = −R⁻¹BᵀP(t)
Ṗ(t) = −(AᵀP(t) + P(t)A − P(t)BR⁻¹BᵀP(t) + Q),  P(T) = Q_T

and x̂(t) is the Kalman-Bucy estimate

dx̂(t)/dt = Ax̂(t) + Bu(t) + L(t)(y(t) − Cx̂(t))
L(t) = Φ(t)CᵀV⁻¹
Φ̇(t) = AΦ(t) + Φ(t)Aᵀ + W − Φ(t)CᵀV⁻¹CΦ(t),  Φ(0) = E[(x(0) − x̄₀)(x(0) − x̄₀)ᵀ] = Φ̄₀.
If instead of the finite-horizon cost we consider

min_{u(t)=μ(t,I(t))} lim_{T→∞} (1/T) E[∫_0^T x(t)ᵀQx(t) + u(t)ᵀRu(t) dt]    (1)

then the optimal control input for the output feedback linear quadratic optimal control problem with cost (1) is u(t) = Kx̂(t), where

dx̂(t)/dt = Ax̂(t) + Bu(t) + L(y(t) − Cx̂(t))
K = −R⁻¹BᵀP,  AᵀP + PA − PBR⁻¹BᵀP + Q = 0
L = ΦCᵀV⁻¹,  AΦ + ΦAᵀ + W − ΦCᵀV⁻¹CΦ = 0.
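Both steady-state Riccati equations can be solved with standard routines; a minimal sketch combining lqr and lqe on made-up matrices:

A = [0 1; -1 -1]; B = [0; 1]; C = [1 0];     % hypothetical system
Q = eye(2); R = 1;                           % control weights
W = 0.1*eye(2); V = 0.01;                    % noise intensities
K = -lqr(A,B,Q,R);                           % u = K*xhat
L = lqe(A,eye(2),C,W,V);                     % Kalman-Bucy gain
Alqg = A + B*K - L*C                         % LQG controller state matrix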
Example. For the model provided in Lecture II_1, slide 32 (state feedback, for simplicity), let us compare the discrete-time and continuous-time gains:
clear all, close all, clc
% definition of the continuous-time model
m = 0.2; M = 1; b = 0.05; I = 0.01; g = 9.8; l = 0.5;
p = (I+m*l^2)*(M+m)-m^2*l^2;
Ac = [0 1 0 0;
      0 -(I+m*l^2)*b/p (m^2*g*l^2)/p 0;
      0 0 0 1;
      0 -(m*l*b)/p m*g*l*(M+m)/p 0];
Bc = [0; (I+m*l^2)/p; 0; m*l/p];
Q = diag([1 1 1 1]); S = zeros(4,1); R = 1;
% discretization
n = 4; tau = 0.01;
sysd = c2d(ss(Ac,Bc,zeros(1,n),0),tau);
A = sysd.a; B = sysd.b;
% LQR control discrete time
K = dlqr(A,B,Q,R,S); K = -K;
% continuous-time
Kc = lqr(Ac,Bc,Q,R,S); Kc = -Kc;
Discrete-time gains (policy u_k = K x_k):
  τ = 0.1:   K = [0.5955 1.4650 −25.3322 −5.9529]
  τ = 0.01:  K = [0.9495 2.2551 −32.1930 −7.6156]
  τ = 0.001: K = [0.9948 2.3559 −33.0632 −7.8269]
Continuous-time gains (policy u(t) = Kc x(t)):
  Kc = [1.0000 2.3674 −33.1623 −7.8509]
(converging to the continuous-time gains as expected)
Some remarks:
- As the sampling period converges to zero, the discrete-time LQR gains converge to the continuous-time ones.
- Similarly, the discrete-time LQG controller (output feedback) converges to the continuous-time one.
- In practice the controller is typically implemented digitally, i.e., via the discrete-time version.
- Still, the continuous-time theory is much more used and known.
- The continuous-time solution does not depend on the sampling/discretization period and allows to gain insight about the system.
- For example, it is much easier to work with bode plots or root-locus plots in continuous-time than in discrete-time.
The LQR loop has very interesting properties in the frequency domain (impressive theory!): guaranteed gain and phase margins for any choice of the weights Q and R, and a connection between the LQR and Kalman filter loops known as loop transfer recovery (LTR). To derive these properties we first describe the LQR, Kalman filter, and LQG loops in the Laplace domain, and then review frequency-domain analysis of feedback systems.
LQR loop

Time-domain:  ẋ(t) = Ax(t) + Bu(t),  u(t) = Kx(t)

Laplace-domain: with u(s) = ∫_0^∞ u(t)e^{−st}dt and x(s) = ∫_0^∞ x(t)e^{−st}dt,

x(s) = (sI − A)⁻¹B u(s),  u(s) = Kx(s)

so the loop transfer function is K(sI − A)⁻¹B.
Kalman filter loop

Time-domain:  dx̂(t)/dt = Ax̂(t) + L(y(t) − ŷ(t)),  ŷ(t) = Cx̂(t)

Laplace-domain: with y(s) = ∫_0^∞ y(t)e^{−st}dt and x̂(s) = ∫_0^∞ x̂(t)e^{−st}dt,

x̂(s) = (sI − A)⁻¹L(y(s) − ŷ(s)),  ŷ(s) = Cx̂(s)

so the loop transfer function (from y − ŷ to ŷ) is C(sI − A)⁻¹L.
LQG loop

Time domain: the LQG controller is

u(t) = Kx̂(t),  dx̂(t)/dt = (A + BK − LC)x̂(t) + Ly(t)

Laplace domain: the controller transfer function from y(s) to u(s) is

u(s) = K(sI − (A + BK − LC))⁻¹L y(s)

and the process transfer function is C(sI − A)⁻¹B, so the open-loop transfer function of the LQG loop is K(sI − (A + BK − LC))⁻¹L · C(sI − A)⁻¹B.
Loop transfer recovery: the open-loop transfer function of the LQG loop,

K(sI − (A + BK − LC))⁻¹L · C(sI − A)⁻¹B,

converges to the open-loop transfer function of the Kalman filter loop, C(sI − A)⁻¹L, when the LQR control penalty converges to zero (cheap control), R = ρ → 0, with Q = CᵀC. In the next slides we derive the frequency-domain properties of the LQR loop and the dual properties of the Kalman loop; next lecture we address LTR.
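The recovery can be visualized numerically. A minimal sketch, assuming a hypothetical minimum-phase plant (all matrices below are illustrative):

A = [0 1; -2 -3]; B = [0; 1]; C = [1 0];  % hypothetical plant 1/((s+1)(s+2))
W = eye(2); V = 1;                        % assumed noise intensities
L = lqe(A,eye(2),C,W,V);                  % Kalman-Bucy gain
bode(ss(A,L,C,0),'k--'), hold on          % target loop C(sI-A)^{-1}L
for rho = [1 1e-2 1e-4]                   % cheap control: rho -> 0
  K = -lqr(A,B,C'*C,rho);                 % Q = C'*C
  bode(ss(A+B*K-L*C,L,K,0)*ss(A,B,C,0))   % LQG loop, approaches the target
end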
Consider the classical feedback loop: reference r, error e, controller K(s) producing the control u, process P(s) producing the output y, with disturbances d entering at the process and measurement noise n at the output. Writing K(s) = n_K(s)/d_K(s) and P(s) = n_P(s)/d_P(s),

K(s)P(s) = n_K(s)n_P(s)/(d_K(s)d_P(s)) := n(s)/d(s)   (open loop t.f.)

C(s) = K(s)P(s)/(1 + K(s)P(s))   (closed loop t.f.)

S(s) = 1/(1 + K(s)P(s))   (sensitivity t.f.)

Two of the most basic analysis tools are the Nyquist diagram and root-locus analysis; both allow to infer stability of the closed loop from the open loop t.f.
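These transfer functions are easy to form numerically; a small sketch with hypothetical K(s) and P(s) (both made up for illustration):

Ks = tf([1 1],[1 0]);    % hypothetical PI controller (s+1)/s
Ps = tf(1,[1 2 1]);      % hypothetical plant 1/(s+1)^2
Lol = Ks*Ps;             % open loop t.f. K(s)P(s)
Ccl = feedback(Lol,1);   % closed loop t.f. K(s)P(s)/(1+K(s)P(s))
S = 1/(1+Lol);           % sensitivity t.f. 1/(1+K(s)P(s))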
Nyquist stability criterion. Plot the open loop transfer function (bode plot of |K(jω)P(jω)| and arg(K(jω)P(jω))) and the Nyquist plot (the curve K(jω)P(jω) in the complex plane). Then

N = Z − P

where N is the number of (clockwise) encirclements of −1 by the Nyquist plot, Z is the number of zeros of 1 + K(s)P(s) in the right half plane (unstable closed loop poles), and P is the number of poles of 1 + K(s)P(s) in the right half plane (unstable open loop poles). In particular, if there are no unstable open loop poles, stability ≡ the Nyquist curve does not encircle −1.

[Figure: bode and Nyquist plots of two example loop transfer functions L1 and L2.]
Gain margins. Let I be the interval of positive gains G by which one can multiply the open-loop t.f. without destabilizing the closed loop. Then the downward gain margin is GM⁻ = min{G | G ∈ I} and the upward gain margin is GM⁺ = max{G | G ∈ I}. Both can be computed from the bode plot!

[Figure: bode plot of |K(jω)P(jω)| and arg(K(jω)P(jω)) indicating the gain margins at the −180° phase crossings.]
Phase margins. Let D be the interval of phases d such that multiplying the open-loop frequency response K(jω)P(jω) by e^{jd} does not destabilize the closed loop. Then the negative phase margin is PM⁻ = min{d | d ∈ D} and the positive phase margin is PM⁺ = max{d | d ∈ D}. Both can be computed from the bode plot!

[Figure: bode plot of |K(jω)P(jω)| and arg(K(jω)P(jω)) indicating the phase margins at the unit-gain crossings.]
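Both margins can be computed numerically from the open-loop model; a minimal sketch with a hypothetical open-loop transfer function:

Lol = tf(2,[1 3 2 0]);   % hypothetical open loop K(s)P(s) = 2/(s(s+1)(s+2))
allmargin(Lol)           % all gain/phase margins with crossover frequencies
margin(Lol)              % bode plot annotated with GM (here 3, ~9.5 dB) and PM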
Consider the LQR gains K which result from the optimal policy for the problem

min ∫_0^∞ x(t)ᵀQx(t) + u(t)ᵀRu(t) dt

and consider the LQR closed loop ẋ = Ax + Bu, u = Kx, with open-loop transfer function

n(s)/d(s) = −K(sI − A)⁻¹B.

The gain margins and phase margins for this closed-loop system are at least as follows:

I = (1/2, ∞):  GM⁻ = 1/2, GM⁺ = ∞
D = (−60°, 60°):  PM⁻ = −60°, PM⁺ = 60°

These margins hold for any system (A, B) and any (!) Q, R, and follow from the frequency domain equality derived next.
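These guaranteed margins can be checked numerically on an example; a minimal sketch (the system and weights are made up):

A = [0 1; -1 -2]; B = [0; 1];        % hypothetical system
K = -lqr(A,B,eye(2),1);
Lol = ss(A,B,-K,0);                  % LQR loop -K(sI-A)^{-1}B
allmargin(Lol)                       % guaranteed: GM in (1/2,inf), PM >= 60 deg
nyquist(Lol)                         % stays outside the unit circle around -1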
Frequency domain equality (FDE). Consider the LQR gains K = −R⁻¹BᵀP resulting from a standard linear quadratic control problem with Q = MᵀM, so that the Riccati equation reads 0 = AᵀP + PA + MᵀM − PBR⁻¹BᵀP. Then, defining Φ(s) = (sI − A)⁻¹, the following holds:

[I − KΦ(−s)B]ᵀ R [I − KΦ(s)B] = R + [MΦ(−s)B]ᵀ[MΦ(s)B]

and in particular for m = 1 (scalar input)

(1 − KΦ(−s)B)(1 − KΦ(s)B) = 1 + (1/R)(MΦ(−s)B)(MΦ(s)B).

Derivation: start with the continuous-time algebraic Riccati equation, written as

0 = −AᵀP − PA − MᵀM + PBR⁻¹BᵀP,  where PBR⁻¹BᵀP = KᵀRK,

and add and subtract sP:

0 = (−sI − A)ᵀP + P(sI − A) − MᵀM + KᵀRK.

Premultiply by BᵀΦ(−s)ᵀ and postmultiply by Φ(s)B, using Φ(s) = (sI − A)⁻¹:

0 = BᵀP Φ(s)B + BᵀΦ(−s)ᵀPB − BᵀΦ(−s)ᵀMᵀMΦ(s)B + BᵀΦ(−s)ᵀKᵀRKΦ(s)B

where BᵀP = −RK and PB = −KᵀR. Rearrange and add R to both sides of the equation to arrive at the FDE:

R + BᵀΦ(−s)ᵀMᵀMΦ(s)B = R − RKΦ(s)B − BᵀΦ(−s)ᵀKᵀR + BᵀΦ(−s)ᵀKᵀRKΦ(s)B = [I − KΦ(−s)B]ᵀR[I − KΦ(s)B].
Making s = jω we have

(1 − KΦ(−jω)B)(1 − KΦ(jω)B) = 1 + (1/R)(MΦ(−jω)B)(MΦ(jω)B)

that is,

|1 − KΦ(jω)B|² = 1 + (1/R)|MΦ(jω)B|²,  so  |1 − KΦ(jω)B| ≥ 1,

from which we conclude that the Nyquist plot of the loop, the curve φ(ω) = −KΦ(jω)B, is always outside the region {φ ∈ ℂ : |1 + φ| < 1}, i.e., the open unit disc centered at −1. Geometrically we can then infer the phase and gain margins given before.

[Figure: Nyquist plot of −KΦ(jω)B staying outside the unit disc centered at −1.]
From the FDE,

|1 − KΦ(jω)B|² = 1 + (1/R)|MΦ(jω)B|²  ⟹  |1 − KΦ(jω)B| ≥ 1,

we conclude that the sensitivity function of the LQR loop satisfies

|1/(1 − KΦ(jω)B)| ≤ 1

and from the identity

1/(1 − KΦ(jω)B) + (−KΦ(jω)B)/(1 − KΦ(jω)B) = 1

we conclude that the complementary sensitivity (closed-loop transfer function) satisfies

|−KΦ(jω)B/(1 − KΦ(jω)B)| ≤ 2.

[Figure: bode plots of the LQR sensitivity (magnitude below 0 dB) and complementary sensitivity (magnitude below 6 dB).]
From the FDE |1 − KΦ(jω)B|² = 1 + (1/R)|MΦ(jω)B|², with Φ(s) = (sI − A)⁻¹, we conclude that if R = ρ is very small (cheap control), then

|KΦ(jω)B| ≈ (1/√ρ)|MΦ(jω)B|

(since K is also very large). Moreover, note that at high frequencies Φ(jω) ≈ (1/jω)I, so MΦ(jω)B ≈ (1/jω)MB. For LQR with cheap control, we then conclude

|KΦ(jω)B| ≈ (1/(√ρ|ω|))|MB| as ω → ∞

and this implies that the complementary sensitivity satisfies

|−KΦ(jω)B/(1 − KΦ(jω)B)| ≈ (1/(√ρ|ω|))|MB| as ω → ∞.

Conclusion: cheap control LQR loops have −20 dB/dec frequency roll-off.
Suppose that the process is the state-space representation of

P(s) = 1/((s + 1)(s + 2)(s + 3))

and take Q = CᵀC, R = 0.0001 (cheap control), as in the matlab code below.

tf1 = tf([1],[1 1])*tf([1],[1 2])*tf([1],[1 3]);
[A,B,C,D] = tf2ss(tf1.num{1},tf1.den{1});
Q = C'*C; R = 0.0001;
W = [0:0.01:1e3];
K = lqr(A,B,Q,R); K = -K;
[num,den] = ss2tf(A,B,-K,0);
tflqr = tf(num,den);
figure(1), bode(tflqr), grid on
figure(2), nyquist(tflqr), grid on
figure(3), bode(tflqr/(1+tflqr),W), grid on
figure(4), bode(1/(1+tflqr)), grid on
[Figure: bode diagram of the LQR loop transfer function −K(sI − A)⁻¹B for R = 0.0001.]
[Figure: Nyquist diagram of the LQR loop (full view and zoom in). The Nyquist plot does not enter the unit circle around (−1, 0).]
[Figure: bode diagrams of the complementary sensitivity and of the sensitivity of the LQR loop. The complementary sensitivity magnitude stays below 2 (6 dB) and the sensitivity magnitude below 1 (0 dB).]
Another interesting fact that we can conclude from the FDE is the geometric place of the closed-loop poles! Let us start by reviewing the root-locus.

Root-locus: procedure to determine where the closed-loop poles (solutions to d(s) + Gn(s) = 0 for the loop G·n(s)/d(s) in negative feedback) are as a function of the gain G.

1. For G = 0 the closed-loop poles coincide with the open-loop poles (roots of d(s)).
2. As G → ∞, the closed-loop poles converge to the open-loop zeros (roots of n(s)); if the number of zeros is less than the number of poles, the remaining poles go to infinity in a Butterworth pattern.
3. ... (about 7 rules, see textbooks such as Feedback Control of Dynamic Systems, G. Franklin et al.; we will only need these first two)
Example:

n(s)/d(s) = (s − 2)(s + 4)/((s² + s + 1)(s + 5))

tf1 = tf([1 -2],[1 1 1])*tf([1 4],[1 5])
rlocus(tf1)

[Figure: root locus of n(s)/d(s).]
From the FDE

(1 − KΦ(−s)B)(1 − KΦ(s)B) = 1 + (1/R)(MΦ(−s)B)(MΦ(s)B)

we conclude that we can obtain the closed-loop poles as a function of R by a procedure identical to the root locus applied to p(−s)p(s), where p(s) = MΦ(s)B: we just have to mirror (with respect to the imaginary axis) the zeros and poles of p(s) and obtain the stable closed-loop poles from the diagram (the root locus will always be symmetric with respect to the imaginary axis). This is called the root-square locus.

[Figure: example of a root-square locus, showing the poles and zeros of p(s) together with those of p(−s).]
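The root-square locus can be drawn with rlocus applied to p(−s)p(s); a minimal sketch using the example of the next slide (p(s) with zeros 3 ± 1i and poles 5, −2, −1):

num = [1 -6 10]; den = [1 -2 -13 -10];       % p(s) = M(sI-A)^{-1}B
numm = num.*(-1).^(numel(num)-1:-1:0);       % coefficients of p(-s) numerator
denm = den.*(-1).^(numel(den)-1:-1:0);       % coefficients of p(-s) denominator
rlocus(tf(conv(num,numm),conv(den,denm)))    % symmetric root locus vs gain 1/R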
Example (model in companion form):

A = [0 1 0; 0 0 1; 10 13 2],  B = [0; 0; 1],  M = [10 −6 1]

Open-loop zeros of p(s) = MΦ(s)B: 3 ± 1i. Open-loop poles: 5, −2, −1.

Analyzing the root-square locus we can check the closed-loop poles, for instance when the "open-loop gain" 1/R equals 10 (R = 1/10), and confirm their location with the matlab instructions

K = lqr(A,B,M'*M,1/10); K = -K;
eig(A+B*K)

[Figure: root-square locus for this example.]
From the root-square locus we conclude that (with cheap control) we can assign the location of m < n closed-loop poles and the remaining ones will be stable and very fast.

Given ẋ(t) = Ax(t) + Bu(t) and desired pole locations p_i, i ∈ {1, 2, …, m}, m < n:

Procedure:
1. Compute det(sI − A) = sⁿ + α_{n−1}s^{n−1} + ⋯ + α₁s + α₀ and (s − p₁)(s − p₂)⋯(s − p_m) = s^m + γ_{m−1}s^{m−1} + ⋯ + γ₁s + γ₀, and write the model in companion form ẋ̄(t) = Āx̄(t) + B̄ū(t), x̄ = Tx, with α = [α₀ α₁ … α_{n−1}],

Ā = [0 1 0 ⋯ 0; 0 0 1 ⋯ 0; ⋮; 0 0 0 ⋯ 1; −α₀ −α₁ −α₂ ⋯ −α_{n−1}],  B̄ = [0 … 0 b]ᵀ,

T = [B̄ ĀB̄ … Ā^{n−1}B̄][B AB … A^{n−1}B]⁻¹.

2. Solve the LQR problem with cost ∫_0^∞ x̄(t)ᵀQ̄x̄(t) + ū(t)ᵀR̄ū(t) dt for ẋ̄(t) = Āx̄(t) + B̄ū(t), with

Q̄ = γᵀγ,  γ = [γ₀ γ₁ … γ_{m−1} 1 0 … 0] ∈ ℝⁿ,  R̄ → 0,

obtaining the law ū = K̄x̄.

3. Set u = Kx with K = K̄T. Then m eigenvalues of (A + BK) are (approximately) placed at the p_i and the remaining ones are stable and very fast.
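A compact sketch of this recipe, reusing the companion-form example from the root-square locus slide (so T = I; the desired poles −1 ± 1i are an arbitrary illustrative choice):

A = [0 1 0; 0 0 1; 10 13 2]; B = [0; 0; 1]; % model already in companion form
n = 3;
g = poly([-1+1i -1-1i]);                    % (s-p1)(s-p2) = s^2 + 2s + 2
gamma = [fliplr(g) zeros(1,n-numel(g))];    % [g0 g1 1 0 ... 0]
K = -lqr(A,B,gamma'*gamma,1e-8);            % Qbar = gamma'*gamma, Rbar -> 0
eig(A+B*K)                                  % ~ -1 +- 1i plus one fast stable pole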
Designing the Kalman filter loop (obtaining the gain L in C(sI − A)⁻¹L) is the same problem as designing the LQR loop (obtaining the gain K) if we make the transformation

Ā = Aᵀ,  B̄ = Cᵀ,  Q̄ = W,  R̄ = V,  K̄ = −Lᵀ.

Indeed, the dual Riccati equation and gain, ĀᵀP̄ + P̄Ā − P̄B̄R̄⁻¹B̄ᵀP̄ + Q̄ = 0 and K̄ = −R̄⁻¹B̄ᵀP̄, become, with P̄ = Φ,

AΦ + ΦAᵀ + W − ΦCᵀV⁻¹CΦ = 0,  L = ΦCᵀV⁻¹,

and the loop transfer functions match: K̄(sI − Ā)⁻¹B̄ = [B̄ᵀ(sI − Āᵀ)⁻¹K̄ᵀ]ᵀ = −C(sI − A)⁻¹L.
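This duality can be checked in matlab: the Kalman gain is the transposed LQR gain of the dual system. A minimal sketch with made-up matrices:

A = [0 1; -2 -3]; C = [1 0];         % hypothetical system
W = eye(2); V = 0.1;
Kbar = lqr(A',C',W,V);               % LQR on the dual (Abar,Bbar) = (A',C')
L1 = Kbar'                           % = Phi*C'/V
L2 = lqe(A,eye(2),C,W,V)             % matches L1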
FDE for the Kalman filter loop: when m = 1 (single output) and W = FFᵀ, with Φ(s) = (sI − A)⁻¹,

(1 + CΦ(−s)L)(1 + CΦ(s)L) = 1 + (1/V)(CΦ(−s)F)(CΦ(s)F).

From this FDE it follows that the gain margins, phase margins, and sensitivity bounds derived for the LQR loop also hold for the Kalman filter loop.
To summarize, after this lecture you should be able to:
- formulate continuous-time stochastic linear quadratic problems via an arbitrarily close discrete-time approximation
- state the continuous-time results of the Kalman filter, LQR (finite and infinite horizon), and LQG control
- derive the guaranteed gain and phase margins of the LQR loop from the frequency domain equality
- use the root-square locus to locate the poles of the LQR closed-loop.
Appendix

A1. Probability space Ω, outcomes ω ∈ Ω. Consider a function x(ω, t) of the outcome ω and of time t. A stochastic process is such a function: for fixed ω, x(ω, ·) is a function of time (a sample path); for fixed t, x(·, t) is a random variable.
A2. Let us assume that x(t) ∈ ℝ and E[x(t)] = 0. Then the auto-correlation function of the process is defined by R(r, s) := E[x(r)x(s)]. For signals of interest (stationary) it is only a function of τ = s − r:

R(τ) := E[x(r)x(r + τ)].

The power spectral density is defined as the Fourier transform of R(τ):

S(ω) = ∫_{−∞}^{∞} R(τ)e^{−jωτ} dτ,  R(τ) = (1/2π)∫_{−∞}^{∞} S(ω)e^{jωτ} dω.
A3. Example:

R(τ) = (a/2)e^{−a|τ|},  S(ω) = a²/(ω² + a²).

[Figure: sample paths, auto-correlation (peak a/2, width of order 1/a), and power spectral density (bandwidth of order a) for a = 0.01, a = 1, a = 100.]
A4. The power of the process is

R(0) = E[x(t)²] = (1/2π)∫_{−∞}^{∞} S(ω) dω.

For signals of interest (ergodic) the power is equal to lim_{T→∞} (1/2T)∫_{−T}^{T} x(t)² dt for a given sample path.
A5. White noise is obtained in the limit a → ∞ of the previous example:

R(τ) = δ(τ),  S(ω) = 1.

Very interesting (or strange!) properties: the power spectral density is one for every frequency, and the power R(0) = E[x(t)²] is infinite.
A6. The integral of white noise, ẋ(t) = w(t) (i.e., x = (1/s)w), is called a random walk or the Wiener process. It is also Gaussian for fixed time, it is more intuitive than white noise, and it is easier to handle mathematically; note that the values of x(t) at different times are correlated.

A7. For a differential equation ẋ(t) = Ax(t) + Bu(t) + w(t) (for now assume x(t) ∈ ℝ) we would like E[w(r)w(s)] = 0 for r ≠ s (uncorrelated noise); otherwise the state would not summarize all the information to make decisions*. There are then two options for the auto-correlation:
- R(τ) = 0 for τ ≠ 0 with R(0) = constant (finite): the power spectral density is zero, no power, zero signal - not interesting!
- R(τ) = δ(τ): white noise!

*e.g., if E[w(r)w(r + τ)] > 0, then if w(r) is positive and large it is likely that w(r + τ) is positive and large.
A8. Consider

ẋ(t) = Ax(t) + Bu(t) + w(t),  x(t) ∈ ℝⁿ, w(t) ∈ ℝⁿ

where w(t) = N w̄(t), w̄(t) = [w̄₁(t) w̄₂(t) … w̄_p(t)]ᵀ, and the w̄ᵢ(t) are uncorrelated scalar Gaussian white noise variables, E[w̄(t)w̄(t + τ)ᵀ] = Iδ(τ). Thus w(t) satisfies E[w(t)w(t + τ)ᵀ] = NNᵀδ(τ) := Wδ(τ). The state of such a differential equation is more intuitive than white noise (e.g., it has finite power).