Structure of the Hessian Graham C. Goodwin September 2004 Centre - - PowerPoint PPT Presentation

structure of the hessian
SMART_READER_LITE
LIVE PREVIEW

Structure of the Hessian Graham C. Goodwin September 2004 Centre - - PowerPoint PPT Presentation

Structure of the Hessian Graham C. Goodwin September 2004 Centre for Complex Dynamic Systems and Control 1. Introduction We saw in earlier lectures that a core ingredient in quadratic constrained optimisation problems is the Hessian matrix H .


slide-1
SLIDE 1

Structure of the Hessian

Graham C. Goodwin September 2004

Centre for Complex Dynamic Systems and Control

slide-2
SLIDE 2
  • 1. Introduction

We saw in earlier lectures that a core ingredient in quadratic constrained optimisation problems is the Hessian matrix H. So far we have simply given an “in principle” approach to the evaluation

  • f this matrix. That is, for a system with state equations

xk+1 = Axk + Buk, we can compute the Hessian as H = ΓQΓ + R, where Γ, Q, R.

Centre for Complex Dynamic Systems and Control

slide-3
SLIDE 3

This will be satisfactory for simple problems. However, for more complex problems (for example, high order systems or problems having mixed stable and unstable modes) this “brute force” approach may fail. A hint as to the source of the difficulties is that the direct way of computing the Hessian depends on powers of the system matrix A. Clearly, if the system has unstable modes, then some entries of Γ will diverge as N increases.

Centre for Complex Dynamic Systems and Control

slide-4
SLIDE 4

We will show in this lecture that this problem can be resolved by focusing attention on the stable and unstable parts of the system separately.

Centre for Complex Dynamic Systems and Control

slide-5
SLIDE 5
  • 2. Separating Stable and Unstable Modes

Our eventual goal in this lecture is to gain a better understanding

  • f the structure of the Hessian matrix particularly for large
  • ptimisation horizons. However, as mentioned above, the

straightforward approach to evaluating the Hessian will often meet difficulties for open loop unstable plants due to exponential divergence of the system impulse response.

Centre for Complex Dynamic Systems and Control

slide-6
SLIDE 6

One way of addressing this problem is to recognise that there is an intimate connection between “stability” and “causality.” In particular, a system having all modes unstable becomes stable if viewed in reverse time, that is, as an anti-causal system. This line

  • f reasoning leads to an alternative viewpoint in which unstable

modes are treated differently. We show below that this leads to an equivalent problem formulation with a different Hessian having different properties.

Centre for Complex Dynamic Systems and Control

slide-7
SLIDE 7

Consider a discrete time linear system and suppose that it has no eigenvalues on the unit circle. We can then partition the state vector xk ∈ Rn as xk =

  • xs

k

xu

k

  • ,

where the states xs

k and xu k are associated with the stable and

unstable modes, respectively.

Centre for Complex Dynamic Systems and Control

slide-8
SLIDE 8

We can then factor the state equations into stable and unstable parts as follows:

  • xs

k+1

xu

k+1

  • =
  • As

Au xs

k

xu

k

  • +
  • Bs

Bu

  • uk,

(1) yk = C xk =

  • Cs

Cu xs

k

xu

k

  • ,

where uk ∈ Rm and yk ∈ Rp (p ≥ m). The eigenvalues of As have moduli less than one, and the eigenvalues of Au have moduli greater than one.

Centre for Complex Dynamic Systems and Control

slide-9
SLIDE 9

We can then express the solution of the system equations as xs

k = Ak s xs 0 + k−1

  • j=0

Ak−1−j

s

Bsuj for k = 1, . . . , N, (2) xu

k = A−(N−k) u

µ −

N−1

  • j=k

Ak−1−j

u

Buuj for k = 0, . . . , N − 1. (3) xu

N µ,

Centre for Complex Dynamic Systems and Control

slide-10
SLIDE 10

We then need an equality constraint that both µ and the sequence

  • f control signals {u0, . . . , uN−1} need to satisfy in order to bring the

unstable states back to their correct initial values, that is, A−N

u µ − N−1

  • j=0

A−j−1

u

Buuj = xu

0.

(4)

Centre for Complex Dynamic Systems and Control

slide-11
SLIDE 11

We are thus led to the following equivalent statement of the

  • ptimisation problem.

PN(x) :

V

N (x) min VN({xk}, {uk}, µ),

(5) subject to: xk =

  • xs

k

xu

k

  • for k = 0, . . . , N,

xs

k = Ak s xs 0 + k−1

  • j=0

Ak−1−j

s

Bsuj for k = 1, . . . , N, xu

k = A−(N−k) u

µ −

N−1

  • j=k

Ak−1−j

u

Buuj for k = 0, . . . , N − 1, (6)

Centre for Complex Dynamic Systems and Control

slide-12
SLIDE 12

x0 =

  • xs

xu

  • = x,

uk ∈ U for k = 0, . . . , N − 1, xk ∈ X for k = 0, . . . , N − 1, xN =

  • xs

N

µ

  • ∈ Xf ⊂ X,

where VN({xk}, {uk}, µ) 1 2

  • xs

N

µ 

P

  • xs

N

µ

  • + 1

2

N−1

  • k=0

(x

kQxk + u kRuk).

(7)

Centre for Complex Dynamic Systems and Control

slide-13
SLIDE 13

The above formulation of the problem, at the expense of the introduction of the additional optimisation variable µ, avoids exponentially diverging terms in the computation of the Hessian

  • matrix. Thus, at least intuitively, it would seem to be more apposite

for studying the structure of the problem for large horizons.

Centre for Complex Dynamic Systems and Control

slide-14
SLIDE 14
  • 3. Numerical Issues in the Computation of the Hessian

We represent the time evolution of the system output using the usual vector notation. Thus, let y =

  • y

1

y

2

. . .

y

N

 ,

u =

  • u

u

1

. . .

u

N−1

 .

(8)

Centre for Complex Dynamic Systems and Control

slide-15
SLIDE 15

y = (Γs + Γu)

  • ¯

Γ

u + Ωsxs

0 + Ωuµ,

(9) where

Ωs =                 

CsAs CsA2

s

. . .

CsAN

s

                 , Γs =                 

CsBs

· · ·

CsAsBs CsBs

· · · . . . . . . ... . . .

CsAN−1

s

Bs CsAN−2

s

Bs

· · ·

CsBs

                 ,

Centre for Complex Dynamic Systems and Control

slide-16
SLIDE 16

Ωu =                        

CuA−(N−1)

u

CuA−(N−2)

u

. . .

CuA−1

u

Cu

                        , Γu = −                        

CuA−1

u Bu

CuA−2

u Bu

. . .

CuA−(N−1)

u

Bu CuA−1

u Bu

. . .

CuA−(N−2)

u

Bu

. . . . . . ... ... . . . · · ·

CuA−1

u Bu

· · ·                        

Centre for Complex Dynamic Systems and Control

slide-17
SLIDE 17

The matrix ¯

Γ Γs + Γu has the form ¯ Γ Γs + Γu =                                  ¯

h0

¯

h−1

¯

h−2

. . . . . . ¯

h−(N−1)

¯

h1

¯

h0

¯

h−1

. . . . . . ¯

h−(N−2)

. . . . . . ... ... . . . . . . . . . ... ... . . . . . . . . . ¯

h0

¯

h−1

¯

hN−1

¯

hN−2

. . . . . . ¯

h1

¯

h0

                                 ,

(10)

Centre for Complex Dynamic Systems and Control

slide-18
SLIDE 18

where {¯ hk : k = −(N − 1), . . . , N − 1}, is a finite subsequence of the infinite sequence {¯ hk : k = −∞, . . . , ∞} defined by

hk : k = 0, . . . , ∞} {CsBs, CsAsBs, CsA2

s Bs, . . . },

(11)

hk : k = −1, . . . , −∞} {−CuA−1

u Bu, −CuA−2 u Bu, −CuA−3 u Bu, . . . }.

(12)

Centre for Complex Dynamic Systems and Control

slide-19
SLIDE 19

Consider, P = Q and Q = CC and R = ρIm > 0. VN as follows1: VN(x, u, y, µ) = 1 2(xQx + yy + ρuu).

1We keep the function VN but change its arguments as appropriate. Centre for Complex Dynamic Systems and Control

slide-20
SLIDE 20

VN(x, u, µ) = 1 2

  • xQx +

¯ Γu + Ωsxs

0 + Ωuµ

 ¯ Γu + Ωsxs

0 + Ωuµ

  • + ρuu
  • = 1

2u[¯

Γ ¯ Γ + ρI]u + u ¯ Γ Ωsxs

0 + Ωuµ

  • + 1

2

  • Ωsxs

0 + Ωuµ

 Ωsxs

0 + Ωuµ

  • .

(13)

Centre for Complex Dynamic Systems and Control

slide-21
SLIDE 21

With respect to the new optimisation variables (u, µ), the modified Hessian of the quadratic objective function is H′

¯ Γ ¯ Γ + ρI ¯ ΓΩu Ω

u ¯

Γ Ω

uΩu

  • .

(14)

Centre for Complex Dynamic Systems and Control

slide-22
SLIDE 22

For future use, we extract the left-upper submatrix, which we call the “regularised sub-Hessian”:

¯

HN ¯

Γ ¯ Γ + ρI.

(15)

Centre for Complex Dynamic Systems and Control

slide-23
SLIDE 23

The above problem formulation has the ability to ameliorate the numerical difficulties encountered when dealing with unstable

  • plants. Indeed, we see that all terms depend only on exponentially

decaying quantities.

Centre for Complex Dynamic Systems and Control

slide-24
SLIDE 24

Example

Consider a single input-single output system with stable and unstable modes defined via the following matrices: As =

  • 1.442

−0.64

1

  • , Bs =
  • 1
  • , Cs =
  • 0.721

−0.64

  • ,

and Au = 2, Bu = 1, Cu = −1.

Centre for Complex Dynamic Systems and Control

slide-25
SLIDE 25

we take Q =

  • Cs

Cu

Cs Cu

  • , R = 0.1, P = Q.

We consider input constraints of the form |uk| ≤ 1 and no state constraints.

Centre for Complex Dynamic Systems and Control

slide-26
SLIDE 26

5 10 15 20 25 30 35 40 45 50 20 40 60 80 100 120 140 160 180 200 Standard Hessian Modified Hessian

Horizon N Condition number

Figure: Condition number of the modified Hessian (14) (circle-dashed line) and that of the standard Hessian (plus-solid line).

Centre for Complex Dynamic Systems and Control

slide-27
SLIDE 27

5 10 15 20 25 30 35 40 45 50 10 20 30 40 50 60 70 80 90 Modified Hessian Standard Hessian

Horizon N Objective function value

Figure: Comparison of the objective function value for different prediction horizons: values achieved using the modified Hessian (circle-dashed line) and using the standard Hessian (plus-solid line).

Centre for Complex Dynamic Systems and Control

slide-28
SLIDE 28
  • 4. Revision of System Frequency Response

Consider again the system split into stable and unstable parts. The system transfer function is clearly G(z) = Gs(z) + Gu(z), (16) where Gs(z) = Cs(zI − As)−1Bs, (17) Gu(z) = Cu(zI − Au)−1Bu, (18)

Centre for Complex Dynamic Systems and Control

slide-29
SLIDE 29

We will study properties of the frequency response of, G(ejω), via its singular value decomposition [SVD], which has the form G(ejω) = U(ejω)Σ(ω)VH(ejω). (19) where, U and V are matrices containing the left and right singular vectors of G, satisfying U ∈ Cp×p, UHU = UUH = Ip, V ∈ Cm×m, VHV = VVH = Im, and

Σ(ω) = diag{σ1(ω), . . . , σm(ω)},

(20)

Centre for Complex Dynamic Systems and Control

slide-30
SLIDE 30

Lemma

Let ¯ G(z) be the two-sided Z-transform of the sequence

hk : k = −∞, . . . , ∞} embedded in ¯

Γ. Then ¯

G(z) is given by

¯

G(z) = zG(z), (21) where G(z) is the system transfer function. Moreover, the region of convergence of ¯ G(z) is given by max{|λi(As)|} < |z| < min{|λi(Au)|}, where {λi(·)} is the set of eigenvalues of the corresponding matrix. Proof: By direct calculation.

Centre for Complex Dynamic Systems and Control

slide-31
SLIDE 31
  • 5. Connecting the Singular Values of the Regularised

Sub-Hessian to the System Frequency Response

We show that there is a direct connection between the singular values of ¯ HN, for large N, and the system frequency response.

Centre for Complex Dynamic Systems and Control

slide-32
SLIDE 32

The result of the previous Lemma ensures that given any ε > 0 there exists k0 > 0 such that

  • ¯

G(ejω)

  • 2

2 −

  • k0
  • k=−k0

¯

hke−jωk

  • 2

2

  • < ε

for all w ∈ [−π, π], (22) since the sequence

¯

hk

  • contains only exponentially decaying

terms.

Centre for Complex Dynamic Systems and Control

slide-33
SLIDE 33

Recall the structure of ¯

Γ and using the definition of Ψℓ given above,

we see that the regularised sub-Hessian can be written as

¯

HN =

                                        

X1

| . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Ψ−2k0 . . . Ψ0 . . . Ψ2k0 . . . ... ... ... ... . . . . . . ... ... ... ... . . . Ψ−2k0 . . . Ψ0 . . . Ψ2k0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . |

X2

                                         + ρI,

(23) provided N ≥ (4k0 + 1)m. Also, X1 and X2 are appropriate submatrices.

Centre for Complex Dynamic Systems and Control

slide-34
SLIDE 34

Consider the Nm × m matrix EN,ω

  • e1

N,ω

· · ·

em

N,ω

  • 1

N

¯

EN,ωV(ejω), (24) where

¯

EN,ω

                

Im e−jωIm

. . .

e−j(N−1)ωIm

                 , ω = 2π

N ℓ for ℓ ∈ {0, . . . , N − 1}.

Centre for Complex Dynamic Systems and Control

slide-35
SLIDE 35

Consider the discrete time linear system, and the SVD of its frequency response. Let k0 > 0 be such that the tails of the impulse response are negligible. Consider ¯ HN for N ≥ 4k0 + 1 and

ρ = 0, which is the regularised sub-Hessian of the quadratic

  • bjective function VN with ρ = 0. Then, for every given frequency

ω0 = 2π

N0 ℓ0, ℓ0 ∈ {0, . . . , N0 − 1}, we have that

lim

N N0 →∞

  • ¯

HN ei

N,ω0

  • 2 = σ2

i (ω0)

for i = 1, . . . , m, where ei

N,ω0 is the ith column of EN,ω0 defined in (24), and

N/N0 ∈ {1, 2, . . . }.

Centre for Complex Dynamic Systems and Control

slide-36
SLIDE 36

Proof

The key idea is to note that the inner sub-matrix of ¯ HN has a simple shift structure which allows one to easily apply Fourier

  • methods. We then simply have to bound the end effects due to the
  • ther terms.

Centre for Complex Dynamic Systems and Control

slide-37
SLIDE 37

Corollary

Consider the same conditions of the above Theorem and choose

ρ > 0. Then

lim

N N0 →∞

  • ¯

HN ei

N,ω0

  • 2 = σ2

i (ω0) + ρ

for i = 1, . . . , m, (25) where N/N0 ∈ {1, 2, . . . }.

Centre for Complex Dynamic Systems and Control

slide-38
SLIDE 38

Example

Consider a 2 input-2 output system with stable and unstable modes defined via the following matrices. As =

  • 1.442

−0.64

1

  • , Bs =
  • 1
  • , Cs =
  • 0.721

−0.64 −0.36

0.32

  • ,

and Au = 2, Bu =

  • 1
  • ,

Cu =

−0.1 −0.1

  • .

Centre for Complex Dynamic Systems and Control

slide-39
SLIDE 39

Consider the finite horizon optimal control problem (5)–(7) and select Q =

  • Cs

Cu

Cs Cu

  • , R = 0, P = Q.

Centre for Complex Dynamic Systems and Control

slide-40
SLIDE 40

10

−2

10

−1

10 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5 5.5

ω [rad/sec]

(a) σ1(w)2

10

−2

10

−1

10 −0.05 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.45

ω [rad/sec]

(b) σ2(w)2

Figure: Singular values of the regularised sub-Hessian ¯ HN (circles) with N = 61. The continuous lines represent the square of the two principal gains of the system.

Centre for Complex Dynamic Systems and Control

slide-41
SLIDE 41

10

−2

10

−1

10 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5 5.5

ω [rad/sec]

(a) σ1(w)2

10

−2

10

−1

10 −0.05 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.45

ω [rad/sec]

(b) σ2(w)2

Figure: Singular values of the regularised sub-Hessian ¯ HN (circles) with N = 401. The continuous lines represent the square of the two principal gains of the system.

Centre for Complex Dynamic Systems and Control

slide-42
SLIDE 42
  • 6. Discussion

In this lecture, we have seen: (i) Numerical conditioning of the Hessian can be improved by solving the stable part in forward time and the unstable part in reverse time. This exploits the connection between “stability” and “causality”. (ii) That the singular values of ¯ HN converge to the Frequency Domain Principle gains of the system.

Centre for Complex Dynamic Systems and Control