SLIDE 1

CS/ECE/ISyE 524 Introduction to Optimization Spring 2017–18

  • 9. Equality constraints and tradeoffs

• More least squares
• Example: moving average model
• Minimum-norm least squares
• Equality-constrained least squares
• Optimal tradeoffs
• Example: hovercraft

Laurent Lessard (www.laurentlessard.com)

SLIDE 2

More least squares

Solving the least squares optimization problem

$$\min_x \; \|Ax - b\|^2$$

is equivalent to solving the normal equations: $A^T A \hat{x} = A^T b$.

• If $A^T A$ is invertible ($A$ has linearly independent columns), then

$$\hat{x} = (A^T A)^{-1} A^T b$$

• $A^\dagger := (A^T A)^{-1} A^T$ is called the pseudoinverse of $A$.
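
As a sanity check, here is a minimal Julia sketch (the data `A`, `b` are made up for illustration) showing that the normal-equations solution agrees with Julia's built-in least-squares solve:

```julia
using LinearAlgebra

# hypothetical overdetermined system: 100 measurements, 3 unknowns
A = randn(100, 3)
b = randn(100)

x_normal = (A' * A) \ (A' * b)   # solve the normal equations AᵀA x̂ = Aᵀb
x_ls     = A \ b                 # built-in least-squares solve (QR-based)

@assert isapprox(x_normal, x_ls; atol=1e-8)
```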

9-2

slide-3
SLIDE 3

Example: moving average model

• We are given a time series of input data $u_1, u_2, \ldots, u_T$ and output data $y_1, y_2, \ldots, y_T$.

[figure: example input and output time series]

• A “moving average” model with window size $k$ assumes each output is a weighted combination of the $k$ most recent inputs:

$$y_t \approx w_1 u_t + w_2 u_{t-1} + \cdots + w_k u_{t-k+1} \quad \text{for all } t$$

• Goal: find weights $w_1, \ldots, w_k$ that best agree with the data.

SLIDE 4

Example: moving average model

• Moving average model ($k = 3$):

$$y_t \approx w_1 u_t + w_2 u_{t-1} + w_3 u_{t-2} \quad \text{for all } t$$

• Writing out all the equations in matrix form:

$$\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ \vdots \\ y_T \end{bmatrix} \approx
\begin{bmatrix} u_1 & & \\ u_2 & u_1 & \\ u_3 & u_2 & u_1 \\ \vdots & \vdots & \vdots \\ u_T & u_{T-1} & u_{T-2} \end{bmatrix}
\begin{bmatrix} w_1 \\ w_2 \\ w_3 \end{bmatrix}$$

• Solve the least squares problem! Moving Average.ipynb
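
A hedged sketch of how the notebook might set this up (the data vectors `u` and `y` are placeholders here; real data would come from the problem):

```julia
# Build the moving-average regression matrix for window size k = 3.
T, k = 200, 3
u = randn(T)      # hypothetical input data
y = randn(T)      # hypothetical output data

A = zeros(T, k)
for t in 1:T, j in 1:k
    if t - j + 1 >= 1
        A[t, j] = u[t - j + 1]   # column j holds u delayed by j - 1 steps
    end
end

w = A \ y   # least-squares estimate of the weights w₁, …, w₃
```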

SLIDE 5

Minimum-norm least squares

Underdetermined case: $A \in \mathbb{R}^{m \times n}$ is a wide matrix ($m \le n$), so $Ax = b$ generally has infinitely many solutions.

• The set of solutions of $Ax = b$ forms an affine subspace.
  Recall: if $Ay = b$ and $Az = b$ then $A(\alpha y + (1 - \alpha) z) = b$.

• One possible choice: pick the $x$ with smallest norm.

[figure: the solution set of $Ax = b$ drawn as a line in $\mathbb{R}^n$, with the minimum-norm point $\hat{x}$ closest to the origin]

• Insight: the optimal $\hat{x}$ must satisfy $A\hat{x} = b$ and $\hat{x}^T(\hat{x} - w) = 0$ for all $w$ satisfying $Aw = b$.

SLIDE 6

Minimum-norm least squares

• We want: $\hat{x}^T(\hat{x} - w) = 0$ for all $w$ such that $Aw = b$.

• We also know that $A\hat{x} = b$. Therefore $A(\hat{x} - w) = 0$. In other words,

  $\hat{x} \perp (\hat{x} - w)$ and $(\hat{x} - w) \perp$ (all rows of $A$).

  Therefore, $\hat{x}$ is a linear combination of the rows of $A$. Stated another way, $\hat{x} = A^T z$ for some $z$.

• Therefore, we must find $z$ and $\hat{x}$ such that

$$A\hat{x} = b \quad \text{and} \quad A^T z = \hat{x}$$

  (this also follows from $\mathcal{R}(A)^\perp = \mathcal{N}(A^T)$)

SLIDE 7

Minimum-norm least squares

Theorem: If there exist $\hat{x}$ and $z$ that satisfy $A\hat{x} = b$ and $A^T z = \hat{x}$, then $\hat{x}$ is a solution to the minimum-norm problem

$$\min_x \; \|x\|^2 \quad \text{subject to: } Ax = b$$

Proof: Suppose $A\hat{x} = b$ and $A^T z = \hat{x}$. For any $x$ that satisfies $Ax = b$, we have

$$\begin{aligned}
\|x\|^2 &= \|x - \hat{x} + \hat{x}\|^2 \\
        &= \|x - \hat{x}\|^2 + \|\hat{x}\|^2 + 2\hat{x}^T(x - \hat{x}) \\
        &= \|x - \hat{x}\|^2 + \|\hat{x}\|^2 + 2z^T A(x - \hat{x}) \\
        &= \|x - \hat{x}\|^2 + \|\hat{x}\|^2 \\
        &\ge \|\hat{x}\|^2
\end{aligned}$$

where the cross term vanishes because $Ax = A\hat{x} = b$.

SLIDE 8

Minimum-norm least squares

Solving the minimum-norm least squares problem

$$\min_x \; \|x\|^2 \quad \text{subject to: } Ax = b$$

is equivalent to solving the linear equations

$$A\hat{x} = b \;\text{ and }\; A^T z = \hat{x} \quad\Longrightarrow\quad AA^T z = b$$

• If $AA^T$ is invertible ($A$ has linearly independent rows), then

$$\hat{x} = A^T (AA^T)^{-1} b$$

• $A^\dagger := A^T (AA^T)^{-1}$ is also called the pseudoinverse of $A$.
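
A minimal Julia sketch (hypothetical wide system) comparing this closed form against the built-in pseudoinverse:

```julia
using LinearAlgebra

# hypothetical underdetermined system: 3 equations, 8 unknowns
A = randn(3, 8)
b = randn(3)

xhat = A' * ((A * A') \ b)   # x̂ = Aᵀ(AAᵀ)⁻¹b, assuming independent rows

@assert isapprox(xhat, pinv(A) * b; atol=1e-8)   # matches the pseudoinverse
@assert isapprox(A * xhat, b; atol=1e-8)         # and it solves Ax = b
```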

SLIDE 9

Equality-constrained least squares

A more general optimization problem:

$$\min_x \; \|Ax - b\|^2 \quad \text{subject to: } Cx = d$$

(equality-constrained least squares)

• If $C = 0$ and $d = 0$, we recover ordinary least squares
• If $A = I$ and $b = 0$, we recover minimum-norm least squares

SLIDE 10

Equality-constrained least squares

Solving the equality-constrained least squares problem

$$\min_x \; \|Ax - b\|^2 \quad \text{subject to: } Cx = d$$

is equivalent to solving the linear equations

$$A^T A\hat{x} + C^T z = A^T b \quad \text{and} \quad C\hat{x} = d$$
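
The two conditions stack into a single block linear system. A hedged Julia sketch with made-up problem sizes:

```julia
using LinearAlgebra

# hypothetical problem: 20 residuals, 5 unknowns, 2 equality constraints
A, b = randn(20, 5), randn(20)
C, d = randn(2, 5), randn(2)
n, p = size(A, 2), size(C, 1)

G = A' * A                    # Gram matrix AᵀA
g = A' * b                    # right-hand side Aᵀb
K = [G C'; C zeros(p, p)]     # block matrix of the optimality conditions
sol = K \ [g; d]              # solve for [x̂; z] in one shot
xhat, z = sol[1:n], sol[n+1:end]

@assert isapprox(C * xhat, d; atol=1e-8)   # constraint is satisfied
```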

SLIDE 11

Equality-constrained least squares

Proof: Suppose $\hat{x}$ and $z$ satisfy $A^T A\hat{x} + C^T z = A^T b$ and $C\hat{x} = d$. Let $x$ be any other point satisfying $Cx = d$. Then,

$$\begin{aligned}
\|Ax - b\|^2 &= \|A(x - \hat{x}) + (A\hat{x} - b)\|^2 \\
&= \|A(x - \hat{x})\|^2 + \|A\hat{x} - b\|^2 + 2(x - \hat{x})^T A^T (A\hat{x} - b) \\
&= \|A(x - \hat{x})\|^2 + \|A\hat{x} - b\|^2 - 2(x - \hat{x})^T C^T z \\
&= \|A(x - \hat{x})\|^2 + \|A\hat{x} - b\|^2 - 2(Cx - C\hat{x})^T z \\
&= \|A(x - \hat{x})\|^2 + \|A\hat{x} - b\|^2 \\
&\ge \|A\hat{x} - b\|^2
\end{aligned}$$

Therefore $\hat{x}$ is an optimal choice.

SLIDE 12

Recap so far

Several different variants of least squares problems are easy to solve, in the sense that they are equivalent to solving systems of linear equations.

• Least squares: $\min_x \|Ax - b\|^2$
• Minimum-norm: $\min_x \|x\|^2$ s.t. $Ax = b$
• Equality constrained: $\min_x \|Ax - b\|^2$ s.t. $Cx = d$

SLIDE 13

Optimal tradeoffs

We often want to optimize several different objectives simultaneously, but these objectives are conflicting.

• risk vs. expected return (finance)
• power vs. fuel economy (automobiles)
• quality vs. memory (audio compression)
• space vs. time (computer programs)
• mittens vs. gloves (winter)

SLIDE 14

Optimal tradeoffs

• Suppose $J_1 = \|Ax - b\|^2$ and $J_2 = \|Cx - d\|^2$.
• We would like to make both $J_1$ and $J_2$ small.
• A sensible approach: solve the optimization problem

$$\min_x \; J_1 + \lambda J_2$$

where $\lambda > 0$ is a (fixed) tradeoff parameter.

• Then tune $\lambda$ to explore possible results.
  ◦ When $\lambda \to 0$, we place more weight on $J_1$
  ◦ When $\lambda \to \infty$, we place more weight on $J_2$

SLIDE 15

Optimal tradeoffs

This problem is also equivalent to solving linear equations!

$$J_1 + \lambda J_2 = \|Ax - b\|^2 + \lambda \|Cx - d\|^2
= \left\| \begin{bmatrix} Ax - b \\ \sqrt{\lambda}\,(Cx - d) \end{bmatrix} \right\|^2
= \left\| \begin{bmatrix} A \\ \sqrt{\lambda}\,C \end{bmatrix} x - \begin{bmatrix} b \\ \sqrt{\lambda}\,d \end{bmatrix} \right\|^2$$

• An ordinary least squares problem!
• Equivalent to solving

$$(A^T A + \lambda C^T C)\,\hat{x} = A^T b + \lambda C^T d$$
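
Both routes are easy to check against each other; a Julia sketch with made-up matrices:

```julia
using LinearAlgebra

# hypothetical data for the two objectives
A, b = randn(30, 4), randn(30)
C, d = randn(10, 4), randn(10)
λ = 0.5

# Route 1: stack into one ordinary least squares problem
x1 = [A; sqrt(λ)*C] \ [b; sqrt(λ)*d]

# Route 2: solve the combined normal equations directly
x2 = (A' * A + λ * (C' * C)) \ (A' * b + λ * (C' * d))

@assert isapprox(x1, x2; atol=1e-8)
```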

SLIDE 16

Tradeoff analysis

1. Choose values for $\lambda$ (usually log-spaced). A useful command: lambda = logspace(p, q, n) produces n points logarithmically spaced between $10^p$ and $10^q$.

2. For each $\lambda$ value, find the $\hat{x}_\lambda$ that minimizes $J_1 + \lambda J_2$.

3. For each $\hat{x}_\lambda$, also compute the corresponding $J_1^\lambda$ and $J_2^\lambda$.

4. Plot the points $(J_1^\lambda, J_2^\lambda)$ for each $\lambda$ and connect the dots; a Julia sketch of this sweep follows below.

[figure: tradeoff curve in the $(J_1, J_2)$ plane, traced out from $\lambda \to 0$ to $\lambda \to \infty$]
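
A hedged sketch of the sweep with made-up data (note that logspace was removed in Julia 1.0, where exp10.(range(p, q, length=n)) plays the same role):

```julia
using LinearAlgebra

# hypothetical objectives J₁ = ‖Ax − b‖² and J₂ = ‖Cx − d‖²
A, b = randn(30, 4), randn(30)
C, d = randn(10, 4), randn(10)

λs = exp10.(range(-3, 3, length=50))   # 50 log-spaced values in [1e-3, 1e3]

J1, J2 = Float64[], Float64[]
for λ in λs
    xhat = (A' * A + λ * (C' * C)) \ (A' * b + λ * (C' * d))
    push!(J1, norm(A * xhat - b)^2)
    push!(J2, norm(C * xhat - d)^2)
end
# plotting (J1, J2) and connecting the dots traces the tradeoff curve
```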

SLIDE 17

Pareto curve

[figure: the tradeoff curve in the $(J_1, J_2)$ plane, from $\lambda \to 0$ to $\lambda \to \infty$, with a candidate point marked; the four quadrants around it are labeled better/worse $J_1$ combined with better/worse $J_2$]

SLIDE 18

Pareto curve

[figure: the same tradeoff curve; points above it are feasible but strictly suboptimal, points below it are infeasible, and points on the curve are Pareto-optimal]

SLIDE 19

Example: hovercraft

We are in command of a hovercraft. We are given a set of k waypoint locations and times. The objective is to hit the waypoints at the prescribed times while minimizing fuel use, by choosing appropriate thruster inputs at each instant.

SLIDE 20

Example: hovercraft

We are in command of a hovercraft. We are given a set of k waypoint locations and times. The objective is to hit the waypoints at the prescribed times while minimizing fuel use.

• Discretize time: $t = 0, 1, 2, \ldots, T$.
• Important variables: position $x_t$, velocity $v_t$, thrust $u_t$.
• Simplified model of the dynamics:

$$x_{t+1} = x_t + v_t, \qquad v_{t+1} = v_t + u_t \qquad \text{for } t = 0, 1, \ldots, T - 1$$

• We must choose the inputs $u_0, u_1, \ldots, u_T$.
• Initial position and velocity: $x_0 = 0$ and $v_0 = 0$.
• Waypoint constraints: $x_{t_i} = w_i$ for $i = 1, \ldots, k$.
• Minimize fuel use: $\|u_0\|^2 + \|u_1\|^2 + \cdots + \|u_T\|^2$

SLIDE 21

Example: hovercraft

First model: hit the waypoints exactly.

$$\begin{aligned}
\underset{x_t,\,v_t,\,u_t}{\text{minimize}} \quad & \sum_{t=0}^{T} \|u_t\|^2 \\
\text{subject to:} \quad & x_{t+1} = x_t + v_t \quad \text{for } t = 0, 1, \ldots, T - 1 \\
& v_{t+1} = v_t + u_t \quad \text{for } t = 0, 1, \ldots, T - 1 \\
& x_0 = v_0 = 0 \\
& x_{t_i} = w_i \quad \text{for } i = 1, \ldots, k
\end{aligned}$$

Julia model: Hovercraft.ipynb
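
A minimal JuMP sketch of this first model (not the notebook's exact code; the waypoint data and the choice of Ipopt as solver are assumptions, and any QP-capable solver works):

```julia
using JuMP, Ipopt

T  = 60                          # hypothetical horizon
tw = [20, 40, 60]                # hypothetical waypoint times t₁, t₂, t₃
w  = [1.0 3.0 5.0;               # column i is waypoint location wᵢ (2-D)
      4.0 1.0 5.0]

m = Model(Ipopt.Optimizer)
@variable(m, x[1:2, 0:T])        # position
@variable(m, v[1:2, 0:T])        # velocity
@variable(m, u[1:2, 0:T])        # thrust

@constraint(m, [j=1:2, t=0:T-1], x[j, t+1] == x[j, t] + v[j, t])
@constraint(m, [j=1:2, t=0:T-1], v[j, t+1] == v[j, t] + u[j, t])
@constraint(m, [j=1:2], x[j, 0] == 0)
@constraint(m, [j=1:2], v[j, 0] == 0)
@constraint(m, [j=1:2, i=1:3], x[j, tw[i]] == w[j, i])   # hit waypoints exactly

@objective(m, Min, sum(u[j, t]^2 for j in 1:2, t in 0:T))
optimize!(m)
```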

SLIDE 22

Example: hovercraft

Second model: allow waypoint misses.

$$\begin{aligned}
\underset{x_t,\,v_t,\,u_t}{\text{minimize}} \quad & \sum_{t=0}^{T} \|u_t\|^2 + \lambda \sum_{i=1}^{k} \|x_{t_i} - w_i\|^2 \\
\text{subject to:} \quad & x_{t+1} = x_t + v_t \quad \text{for } t = 0, 1, \ldots, T - 1 \\
& v_{t+1} = v_t + u_t \quad \text{for } t = 0, 1, \ldots, T - 1 \\
& x_0 = v_0 = 0
\end{aligned}$$

• $\lambda$ controls the tradeoff between making $u$ small and hitting all the waypoints.
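
In the JuMP sketch above, this amounts to dropping the exact-waypoint constraint and penalizing misses in the objective instead (λ = 100 is an arbitrary illustrative value):

```julia
# Second model: remove the exact-waypoint constraint and penalize misses.
λ = 100.0
@objective(m, Min,
    sum(u[j, t]^2 for j in 1:2, t in 0:T) +
    λ * sum((x[j, tw[i]] - w[j, i])^2 for j in 1:2, i in 1:3))
```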
