SLIDE 1
Continuous Optimization

Sanzheng Qiao

Department of Computing and Software
McMaster University

March, 2009

SLIDE 2

Outline

1. Introduction
2. Golden Section Search
3. Multivariate Functions: Steepest Descent Method
4. Linear Least Squares Problem
5. Nonlinear Least Squares: Newton's Method, Gauss-Newton Method
6. Software Packages


SLIDE 4

Problem setting

Single-variable functions. Minimization:

    min_{x ∈ S} f(x)

f(x): the objective function, single-variable and real-valued
S: the feasible set over which we minimize


SLIDE 8

Golden section search

Assumption: f(x) has a unique global minimum in [a, b]. If x∗ is the minimizer, then f(x) monotonically decreases on [a, x∗] and monotonically increases on [x∗, b].

Algorithm. Choose interior points c and d:

    c = a + r(b − a)
    d = a + (1 − r)(b − a),   0 < r < 0.5

    if f(c) ≤ f(d)
        b = d
    else
        a = c
    end

At each step, the length of the interval is reduced by a factor of (1 − r).

SLIDE 9

Golden section search (cont.)

The choice of r:
- When f(c) ≤ f(d), d+ = c (the next d is the current c).
- When f(c) > f(d), c+ = d (the next c is the current d).

Why? Reusing one interior point means only one new function evaluation is needed per step, reducing the number of function evaluations.

SLIDE 10

Choice of r

When f(c) ≤ f(d): b+ = d and d+ = a + (1 − r)(b+ − a) = a + (1 − r)(d − a), so d+ = c means

    a + (1 − r)(d − a) = a + r(b − a),

which implies (1 − r)² = r.

When f(c) > f(d): a+ = c, so c+ = d means

    c+ = c + r(b − c) = a + (1 − r)(b − a),

which also implies (1 − r)² = r. Thus

    r = (3 − √5)/2 ≈ 0.382.
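A quick numerical check (our own two-liner, not from the slides) confirms this value in MATLAB/Octave:

    r = (3 - sqrt(5))/2;     % r is approximately 0.3820
    (1 - r)^2 - r            % approximately 0, up to rounding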

SLIDE 11

Algorithm

c = a + r*(b - a);      fc = f(c);
d = a + (1-r)*(b - a);  fd = f(d);
if fc <= fd
    b = d;  fb = fd;                    % shrink to [a, d]
    d = c;  fd = fc;                    % reuse old c as new d
    c = a + r*(b-a);  fc = f(c);        % one new evaluation
else
    a = c;  fa = fc;                    % shrink to [c, b]
    c = d;  fc = fd;                    % reuse old d as new c
    d = a + (1-r)*(b-a);  fd = f(d);    % one new evaluation
end
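The slide shows a single iteration of the search. A complete, runnable MATLAB/Octave version (a minimal sketch of ours; the function name goldensection and the tolerance argument tol are our additions, not from the slides) wraps this step in a loop with the termination test given on the next slide:

    function [xmin, fmin] = goldensection(f, a, b, tol)
    % GOLDENSECTION  Golden section search for the minimum of a
    % unimodal function f on the interval [a, b].
    r = (3 - sqrt(5))/2;
    c = a + r*(b - a);        fc = f(c);
    d = a + (1 - r)*(b - a);  fd = f(d);
    while (d - c) > tol*max(abs(c), abs(d))
        if fc <= fd
            b = d;                            % discard [d, b]
            d = c;  fd = fc;                  % reuse old c as new d
            c = a + r*(b - a);  fc = f(c);    % one new evaluation
        else
            a = c;                            % discard [a, c]
            c = d;  fc = fd;                  % reuse old d as new c
            d = a + (1 - r)*(b - a);  fd = f(d);
        end
    end
    if fc <= fd, xmin = c; fmin = fc; else xmin = d; fmin = fd; end
    end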

SLIDE 13

Convergence and termination

Convergence rate: each step reduces the length of the interval by a factor of

    1 − r = 1 − (3 − √5)/2 = (√5 − 1)/2 ≈ 0.618.

Termination criterion: stop when (d − c) ≤ u · max(|c|, |d|), where u is the unit roundoff, or when a user-supplied tolerance is reached.
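For example, applying the goldensection sketch from above to f(x) = (x − 2)², whose minimizer is x∗ = 2:

    f = @(x) (x - 2).^2;
    [xmin, fmin] = goldensection(f, 0, 5, 1e-8)
    % xmin is approximately 2, fmin approximately 0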


SLIDE 17

Problem setting

min f(x), where x is a vector of variables x1, x2, ..., xn.

Gradient:

    ∇f(xc) = [ ∂f(xc)/∂x1, ..., ∂f(xc)/∂xn ]ᵀ

−∇f(xc) is the direction of greatest decrease of f from xc.

SLIDE 19

Steepest descent method

Idea:
- Steepest descent direction: sc = −∇f(xc);
- Find λc such that f(xc + λc sc) ≤ f(xc + λ sc) for all λ ∈ R (a single-variable minimization problem);
- Update x+ = xc + λc sc.

Remark. Conjugate gradient method: use the conjugate gradient direction in place of the gradient.
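A minimal steepest descent sketch (our own illustration, not the author's code; gradf is a user-supplied gradient function, and the line search reuses the goldensection helper sketched earlier with an assumed step-length bracket [0, 1]):

    function x = steepestdescent(f, gradf, x, maxit, tol)
    % STEEPESTDESCENT  Minimize f by line searches along -gradient.
    for k = 1:maxit
        s = -gradf(x);                    % steepest descent direction
        if norm(s) < tol, break; end      % gradient nearly zero: done
        phi = @(lambda) f(x + lambda*s);  % single-variable problem in lambda
        lambda = goldensection(phi, 0, 1, 1e-6);  % assumes minimizer in [0, 1]
        x = x + lambda*s;
    end
    end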

SLIDE 21

Problem setting

Given a matrix A (m-by-n, m ≥ n) and b (m-by-1), find x (n-by-1) minimizing ‖Ax − b‖₂².

Example. Square root problem revisited. Find a1 and a2 in y(x) = a1 x + a2 such that

    (y(0.25) − √0.25)² + (y(0.5) − √0.5)² + (y(1.0) − √1.0)²

is minimized. In matrix-vector form:

    A = [ 0.25  1 ;  0.5  1 ;  1.0  1 ],   x = [ a1 ; a2 ],   b = [ √0.25 ; √0.5 ; √1.0 ].

SLIDE 22

Method

Transform A into triangular form:

    PA = [ R ; 0 ],

where R is upper triangular (n-by-n). Then the problem becomes

    ‖Ax − b‖₂² = ‖P⁻¹(PAx − Pb)‖₂² = ‖P⁻¹([ R ; 0 ]x − Pb)‖₂².
SLIDE 23

Method (cont.)

Desirable properties of P:
- P⁻¹ is easy to compute;
- ‖P⁻¹z‖₂² = ‖z‖₂² for any z (the 2-norm is preserved).

Partitioning

    Pb = [ b1 ; b2 ],

the LS solution is the solution of the triangular system Rx = b1.

SLIDE 24

Choice of P

Orthogonal matrix (transformation) Q: Q⁻¹ = Qᵀ.

Example. Givens rotation:

    G = [  cos θ   sin θ
          −sin θ   cos θ ]

Introducing a zero into a 2-vector:

    G [ x1 ; x2 ] = [ × ; 0 ],

i.e., rotate x onto the x1-axis.
SLIDE 25

Givens rotation

    cos θ = x1 / √(x1² + x2²),    sin θ = x2 / √(x1² + x2²)

Algorithm.

    if x(2) == 0
        c = 1.0;  s = 0.0;               % nothing to eliminate
    elseif abs(x(2)) >= abs(x(1))
        ct = x(1)/x(2);                  % cotangent of theta
        s = 1/sqrt(1 + ct*ct);  c = s*ct;
    else
        t = x(2)/x(1);                   % tangent of theta
        c = 1/sqrt(1 + t*t);  s = c*t;
    end
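A quick check of the rotation (our own sketch, inlining the branch taken for this x): construct G from the computed c and s and verify that it zeros the second component while preserving the norm:

    x = [3; 4];
    ct = x(1)/x(2);  s = 1/sqrt(1 + ct*ct);  c = s*ct;   % branch abs(x2) >= abs(x1)
    G = [c s; -s c];
    G*x               % gives [5; 0]: norm preserved, second entry zeroed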

SLIDE 26

Givens rotation (cont.)

In general, a rotation acts on one pair of components and leaves the rest unchanged. For example,

    G13 = [ c  0  s  0
            0  1  0  0
           −s  0  c  0
            0  0  0  1 ] ,

    G13 [ x1 ; x2 ; x3 ; x4 ] = [ × ; x2 ; 0 ; x4 ].

Select a pair (xi, xj) and find a rotation Gij to eliminate xj.

SLIDE 27

QR factorization

    × × × ⊗ × × × × × × × ×     − →     × × × × × ⊗ × × × × ×     − →     × × × × × × × ⊗ × ×     − →     × × × × × ⊗ × × ×     − →     × × × × × × ⊗ ×     − →     × × × × × × ⊗     G34G24G23G14G13G12A = R

  • Q = GT

12GT 13GT 14GT 23GT 24GT 34

A = QR
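The sweep above translates directly into code. Here is a minimal Givens QR sketch in MATLAB/Octave (our own illustration, not the author's; it forms Q explicitly, which a production code would avoid, and givens_cs is the routine from the previous slide):

    function [Q, R] = givensqr(A)
    % GIVENSQR  QR factorization of m-by-n A (m >= n) by Givens rotations.
    [m, n] = size(A);
    Q = eye(m);  R = A;
    for j = 1:n
        for i = j+1:m                        % annihilate R(i,j)
            [c, s] = givens_cs(R(j,j), R(i,j));
            G = [c s; -s c];
            R([j i], :) = G * R([j i], :);   % rotate rows j and i
            Q(:, [j i]) = Q(:, [j i]) * G';  % accumulate Q = G12'*G13'*...
        end
    end
    end

    function [c, s] = givens_cs(x1, x2)
    % Rotation [c s; -s c] that zeros x2 in [x1; x2] (previous slide).
    if x2 == 0
        c = 1.0;  s = 0.0;
    elseif abs(x2) >= abs(x1)
        ct = x1/x2;  s = 1/sqrt(1 + ct*ct);  c = s*ct;
    else
        t = x2/x1;  c = 1/sqrt(1 + t*t);  s = c*t;
    end
    end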

SLIDE 28

Householder transformation

Basically, in the QR decomposition we introduce zeros below the main diagonal of A using orthogonal transformations. Another example: the Householder transformation

    H = I − 2uuᵀ,   with   uᵀu = 1.

H is symmetric and orthogonal (H² = I). Goal: Ha = αe1. Choose

    u = a ± ‖a‖₂ e1.

[Figure: geometric interpretation — H reflects a about the hyperplane orthogonal to u, sending a onto the e1-axis.]

SLIDE 29

Householder transformation (cont.)

Normalize u using

    ‖u‖₂² = 2(‖a‖₂² ± a1‖a‖₂)

for efficiency.

Algorithm. Given an n-vector x, this algorithm returns σ, α, and u such that (I − σ⁻¹uuᵀ)x = −αe1:

    m = max(abs(x));              % scale to avoid overflow/underflow
    u = x/m;
    alpha = sign(u(1))*norm(u);   % sign chosen to avoid cancellation
    u(1) = u(1) + alpha;
    sigma = alpha*u(1);
    alpha = m*alpha;
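A quick verification of this routine (our own sketch; it applies the transformation without forming the matrix):

    x = [3; 4; 0; 5];
    m = max(abs(x));  u = x/m;
    alpha = sign(u(1))*norm(u);
    u(1) = u(1) + alpha;
    sigma = alpha*u(1);
    alpha = m*alpha;
    x - u*((u'*x)/sigma)     % gives -alpha*e1 = [-norm(x); 0; 0; 0]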

SLIDE 30

Framework

A framework of the QR decomposition method for solving the linear least squares problem min ‖Ax − b‖₂:

1. Triangularize A using orthogonal transformations, applying the transformations to b simultaneously;
2. Solve the resulting triangular system.
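In MATLAB/Octave the framework is a few lines (our sketch, assuming A and b are given, and using the built-in qr; the slides build the transformations from Givens rotations or Householder transformations instead):

    [m, n] = size(A);
    [Q, R] = qr(A);             % triangularize: A = Q*R
    c = Q'*b;                   % apply the transformations to b
    x = R(1:n, 1:n) \ c(1:n);   % solve the triangular system R*x = b1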

SLIDE 33

Problem setting

Multivariate vector-valued function:

    f(x) = [ f1(x), ..., fm(x) ]ᵀ ∈ Rᵐ,   x ∈ Rⁿ.

Find the minimizer of

    ρ(x) = (1/2) Σ_{i=1}^{m} fi(x)².

Application: model fitting problems.

SLIDE 36

Newton’s Method

Idea: solve ∇ρ(x) = 0 (a root finding problem). At each step, find the correction sc (x+ = xc + sc) satisfying

    ∇²ρ(xc) sc = −∇ρ(xc).

Note. This is Newton's method for solving nonlinear systems.
SLIDE 41

Newton's method (cont.)

What is the gradient ∇ρ(xc)?

    ∇ρ(xc) = J(xc)ᵀ f(xc),

where the Jacobian is J(xc) = [ ∂fi(xc)/∂xj ].

How do we get ∇²ρ(xc)?

    ∇²ρ(xc) = J(xc)ᵀ J(xc) + Σ_{i=1}^{m} fi(xc) ∇²fi(xc).

If x∗ fits the model well (fi(x∗) ≈ 0) and xc is close to x∗, then fi(xc) ≈ 0 and

    ∇²ρ(xc) ≈ J(xc)ᵀ J(xc).
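As a concrete sketch (ours, not from the slides), a plain Newton iteration for minimizing a smooth ρ with user-supplied gradient and Hessian functions:

    function x = newtonmin(grad, hess, x, maxit, tol)
    % NEWTONMIN  Newton's method: solve grad(x) = 0 for a minimizer.
    for k = 1:maxit
        g = grad(x);
        if norm(g) < tol, break; end
        s = -(hess(x) \ g);   % Newton correction: hess(x)*s = -g
        x = x + s;
    end
    end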

SLIDE 44

Gauss-Newton Method

1. Evaluate fc = f(xc) and compute the Jacobian Jc = J(xc);
2. Solve (Jcᵀ Jc) sc = −Jcᵀ fc for sc;
3. Update x+ = xc + sc.

Note. sc is the solution to the normal equations for the linear least squares problem

    min_s ‖Jc s + fc‖₂,

so reliable methods such as the QR decomposition method can be used to solve for sc.

Remark. The Gauss-Newton method works well on small residual (fi(x∗) ≈ 0) problems.
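A minimal Gauss-Newton sketch in MATLAB/Octave (our own illustration; f returns the residual vector, J its Jacobian, and the step is computed with backslash, i.e., via a QR factorization rather than the normal equations):

    function x = gaussnewton(f, J, x, maxit, tol)
    % GAUSSNEWTON  Minimize (1/2)*norm(f(x))^2 for vector-valued f.
    for k = 1:maxit
        fc = f(x);  Jc = J(x);
        s = -(Jc \ fc);       % LS solution of min_s ||Jc*s + fc||_2
        x = x + s;
        if norm(s) < tol*(1 + norm(x)), break; end
    end
    end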

SLIDE 46

Software packages

IMSL:     uvmif, uminf, umiah, unlsf, flprs, nconf, ncong
MATLAB:   fmin, fmins, leastsq, lp, constr
NAG:      e04abf, e04jaf, e04laf, e04fdf, e04mbf, e04vdf
MINPACK:  lmdif1
NETLIB:   varpro, dqed
Octave:   sqp, ols, gls

SLIDE 47

Summary

- Problem setting: real-valued objective function.
- Golden section search: convergence rate.
- Direction of descent: steepest descent.
- Linear least squares: data fitting; QR decomposition, i.e., triangularization of a matrix using orthogonal transformations (Givens rotations, Householder transformations).
- Nonlinear least squares: Newton's method (its relation to solving nonlinear systems); Gauss-Newton method (its relation to solving linear least squares).