Nonlinear Least Squares – Lectures for PhD course on Numerical Optimization



SLIDE 1

Nonlinear Least Squares

Lectures for the PhD course on Numerical Optimization

Enrico Bertolazzi

DIMS – Università di Trento

November 21 – December 14, 2011


SLIDE 2

The Nonlinear Least Squares Problem

Outline

1. The Nonlinear Least Squares Problem
2. The Levenberg–Marquardt step
3. The Dog-Leg step


SLIDE 3

The Nonlinear Least Squares Problem – Introduction

An important class of minimization problems, when f : \mathbb{R}^n \to \mathbb{R}, is the nonlinear least squares problem, which takes the form:

f(x) = \frac{1}{2} \sum_{i=1}^{m} F_i(x)^2, \qquad m \ge n

When n = m, finding the minimum coincides with finding the solution of the nonlinear system F(x) = 0, where:

F(x) = \big( F_1(x), F_2(x), \ldots, F_n(x) \big)^T

Thus, special methods developed for the solution of nonlinear least squares can be used for the solution of nonlinear systems, but not conversely when m > n.


SLIDE 4

The Nonlinear Least Squares Problem – Introduction

Example

Consider the following fitting model

M(x, t) = x_3 e^{x_1 t} + x_4 e^{x_2 t}

which can be used to fit some data. The model depends on the parameters x = (x_1, x_2, x_3, x_4)^T. Given a number of points (t_k, y_k), k = 1, 2, \ldots, m, we want to find the parameters x such that

\frac{1}{2} \sum_{k=1}^{m} \big( M(x, t_k) - y_k \big)^2

is minimized. Defining

F_k(x) = M(x, t_k) - y_k, \qquad k = 1, 2, \ldots, m

the fitting problem can be viewed as a nonlinear least squares problem.
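As a minimal sketch (not part of the original slides) of how the residuals F_k and the objective f look in code, assuming the data arrays t and y are given:

import numpy as np

def residuals(x, t, y):
    # F_k(x) = M(x, t_k) - y_k with M(x, t) = x3*exp(x1*t) + x4*exp(x2*t)
    x1, x2, x3, x4 = x
    return x3 * np.exp(x1 * t) + x4 * np.exp(x2 * t) - y

def objective(x, t, y):
    # f(x) = 1/2 * sum_k F_k(x)^2
    F = residuals(x, t, y)
    return 0.5 * F.dot(F)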


SLIDE 5

The Nonlinear Least Squares Problem – Introduction

To solve the nonlinear least squares problem we can use any of the previously discussed methods, for example BFGS or Newton's method with globalization techniques. If, for example, we use Newton's method, we need to compute

\nabla^2 f(x) = \nabla^2 \frac{1}{2} \sum_{i=1}^{m} F_i(x)^2
             = \frac{1}{2} \sum_{i=1}^{m} \nabla^2 F_i(x)^2
             = \frac{1}{2} \sum_{i=1}^{m} \nabla \big( 2 F_i(x) \nabla F_i(x) \big)^T
             = \sum_{i=1}^{m} \nabla F_i(x)^T \nabla F_i(x) + \sum_{i=1}^{m} F_i(x) \nabla^2 F_i(x)
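A quick numerical sanity check of this decomposition (a sketch, not from the slides; F and jac are assumed callables returning F(x) and J(x)): a central-difference Hessian of f, built from the gradient \nabla f(x)^T = J(x)^T F(x), minus the term J^T J should match \sum_i F_i \nabla^2 F_i.

import numpy as np

def hessian_parts(F, jac, x, h=1e-5):
    # Finite-difference Hessian of f(x) = 0.5*||F(x)||^2 via its gradient,
    # returned next to the Gauss-Newton part J^T J for comparison.
    grad = lambda z: jac(z).T @ F(z)     # grad f(z)^T = J(z)^T F(z)
    n = x.size
    H_fd = np.zeros((n, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = h
        H_fd[:, j] = (grad(x + e) - grad(x - e)) / (2.0 * h)
    J = jac(x)
    return H_fd, J.T @ J                 # H_fd - J^T J ≈ sum_i F_i(x) * hess F_i(x)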


SLIDE 6

The Nonlinear Least Squares Problem – Introduction

If we define

J(x) = \begin{pmatrix} \nabla F_1(x) \\ \nabla F_2(x) \\ \vdots \\ \nabla F_m(x) \end{pmatrix}

then we can write

\nabla^2 f(x) = J(x)^T J(x) + \sum_{i=1}^{m} F_i(x) \nabla^2 F_i(x)

However, in practical problems J(x) is normally known, while \nabla^2 F_i(x) is unknown or impractical to compute.
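For the fitting example above, the Jacobian has one row \nabla F_k per data point. An analytic sketch (my code, assuming the same model and parameter ordering as before):

import numpy as np

def jacobian(x, t):
    # J[k, j] = dF_k/dx_j for F_k(x) = x3*exp(x1*t_k) + x4*exp(x2*t_k) - y_k
    # (the data y_k drops out of the derivatives)
    x1, x2, x3, x4 = x
    e1 = np.exp(x1 * t)
    e2 = np.exp(x2 * t)
    return np.column_stack([
        x3 * t * e1,   # dF/dx1
        x4 * t * e2,   # dF/dx2
        e1,            # dF/dx3
        e2,            # dF/dx4
    ])

# Gauss-Newton part of the Hessian: J(x)^T J(x)
# H = jacobian(x, t).T @ jacobian(x, t)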


SLIDE 7

The Nonlinear Least Squares Problem – Introduction

A common approximation is obtained by neglecting the terms \nabla^2 F_i(x), giving

\nabla^2 f(x) \approx J(x)^T J(x)

This choice can be appropriate near the solution when n = m, i.e. when solving a nonlinear system: near the solution we have F_i(x) \approx 0, so the contribution of the neglected terms is small. The choice is not good when the residual near the minimum is large (i.e. \|F(x)\| is large), because then the contribution of \nabla^2 F_i(x) cannot be neglected.


SLIDE 8

The Nonlinear Least Squares Problem – Introduction

From the previous considerations, applying Newton's method to \nabla f(x)^T = 0 gives

x_{k+1} = x_k - \nabla^2 f(x_k)^{-1} \nabla f(x_k)^T

and when f(x) = \frac{1}{2} \|F(x)\|^2:

\nabla f(x)^T = J(x)^T F(x)

\nabla^2 f(x) = J(x)^T J(x) + \sum_{i=1}^{m} F_i(x) \nabla^2 F_i(x) \approx J(x)^T J(x)

Using the last approximation we obtain the Gauss-Newton algorithm.


SLIDE 9

The Nonlinear Least Squares Problem – Introduction

Notice that the approximate Newton direction

d = -\big( J(x)^T J(x) \big)^{-1} J(x)^T F(x) \approx -\nabla^2 f(x)^{-1} \nabla f(x)^T

is a descent direction; in fact,

\nabla f(x)\, d = -\nabla f(x) \big( J(x)^T J(x) \big)^{-1} \nabla f(x)^T < 0

when J(x) has full rank, since then J(x)^T J(x) (and hence its inverse) is symmetric positive definite.
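The same check in code (a trivial sketch; J and Fx = F(x) are assumed numpy arrays):

import numpy as np

def is_descent(J, Fx, d):
    # grad f(x) d = (J^T F(x))^T d; a negative value means d is a descent direction
    return float((J.T @ Fx) @ d) < 0.0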


SLIDE 10

The Nonlinear Least Squares Problem – Introduction

Algorithm (Gauss-Newton algorithm)

x assigned; f ← F(x); J ← ∇F(x);
while ‖J^T f‖ > ε do
  — compute search direction
  d ← −(J^T J)^{−1} J^T f;
  — approximate arg min_{α>0} f(x + αd) by line search
  — perform step
  x ← x + αd; f ← F(x); J ← ∇F(x);
end while
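A runnable sketch of this loop (my code, not from the slides; F and jac are assumed callables returning F(x) and J(x), and the line search is a simple backtracking on the residual norm):

import numpy as np

def gauss_newton(F, jac, x, eps=1e-8, max_iter=100):
    # Minimize f(x) = 0.5*||F(x)||^2 by the Gauss-Newton method.
    f = F(x)
    J = jac(x)
    for _ in range(max_iter):
        if np.linalg.norm(J.T @ f) <= eps:
            break
        # d = -(J^T J)^{-1} J^T f, computed as a least-squares solve for stability
        d, *_ = np.linalg.lstsq(J, -f, rcond=None)
        # backtracking line search: halve alpha until f decreases
        alpha = 1.0
        fx = 0.5 * f.dot(f)
        while 0.5 * np.sum(F(x + alpha * d) ** 2) >= fx and alpha > 1e-12:
            alpha *= 0.5
        x = x + alpha * d
        f = F(x)
        J = jac(x)
    return x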


SLIDE 11

The Levenberg–Marquardt step

Outline

1. The Nonlinear Least Squares Problem
2. The Levenberg–Marquardt step
3. The Dog-Leg step


SLIDE 12

The Levenberg–Marquardt step

The Levenberg–Marquardt Method

Levenberg (1944) and later Marquardt (1963) suggested using a damped Gauss-Newton method:

d = -\big( J(x)^T J(x) + \mu I \big)^{-1} \nabla f(x)^T, \qquad \nabla f(x)^T = J(x)^T F(x)

1. For all \mu \ge 0, d is a descent direction; in fact,

\nabla f(x)\, d = -\nabla f(x) \big( J(x)^T J(x) + \mu I \big)^{-1} \nabla f(x)^T < 0

2. For large \mu we have d \approx -\frac{1}{\mu} \nabla f(x)^T, the (negative) gradient direction.

3. For small \mu we have d \approx -\big( J(x)^T J(x) \big)^{-1} \nabla f(x)^T, the Gauss-Newton direction.
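A sketch of the damped step for a given µ (my code; since J^T J + µI is symmetric positive definite for µ > 0, a plain solve is safe):

import numpy as np

def lm_step(J, Fx, mu):
    # d = -(J^T J + mu*I)^{-1} J^T F(x)
    n = J.shape[1]
    H = J.T @ J + mu * np.eye(n)   # SPD for mu > 0
    g = J.T @ Fx                   # grad f(x)^T
    return -np.linalg.solve(H, g)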


SLIDE 13

The Levenberg–Marquardt step

1. The choice of the parameter \mu affects both the size and the direction of the step.

2. Levenberg–Marquardt becomes a method without line search.

3. As for trust-region methods, each step (approximately) solves the minimization of the model problem

min m(x + s) = f(x) + \nabla f(x) s + \frac{1}{2} s^T H(x) s

where H(x) = J(x)^T J(x) + \mu I is symmetric and positive definite (SPD).

4. Since H(x) is SPD, the minimum is

s = -H(x)^{-1} g(x), \qquad g(x) = \nabla f(x)^T
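In code, evaluating this model looks as follows (a sketch; this is the quantity behind the predicted reduction used in the ratio test on the next slide):

import numpy as np

def model(J, Fx, mu, s):
    # m(x + s) = f(x) + grad f(x) s + 0.5 * s^T H(x) s, with H(x) = J^T J + mu*I
    f0 = 0.5 * Fx.dot(Fx)
    g = J.T @ Fx
    H = J.T @ J + mu * np.eye(J.shape[1])
    return f0 + g.dot(s) + 0.5 * s.dot(H @ s)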


SLIDE 14

The Levenberg–Marquardt step

Algorithm (Generic LM algorithm)

x, µ assigned; η1 = 0.25; η2 = 0.75; γ1 = 2; γ2 = 1/3;
f ← F(x); J ← ∇F(x);
while ‖f‖ > ε do
  s ← arg min_s m(x + s) = ½‖f‖² + f^T J s + ½ s^T (J^T J + µI) s;
  pred ← m(x + s) − m(x);
  ared ← ½‖F(x + s)‖² − ½‖f‖²;
  r ← ared/pred;
  if r < η1 then
    µ ← γ1 µ; — reject step, enlarge µ
  else
    x ← x + s; f ← F(x); J ← ∇F(x); — accept step
    if r > η2 then
      µ ← γ2 µ; — reduce µ
    end if
  end if
end while
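A runnable sketch of this generic LM loop (my code; F and jac as before, constants as on the slide; both pred and ared are negative for a successful step, so r > 0 means improvement):

import numpy as np

def levenberg_marquardt(F, jac, x, mu=1e-3, eps=1e-8, max_iter=200):
    eta1, eta2, gamma1, gamma2 = 0.25, 0.75, 2.0, 1.0 / 3.0
    f = F(x)
    J = jac(x)
    for _ in range(max_iter):
        if np.linalg.norm(f) <= eps:
            break
        n = J.shape[1]
        H = J.T @ J + mu * np.eye(n)
        g = J.T @ f
        s = -np.linalg.solve(H, g)                 # minimizer of the quadratic model
        pred = g.dot(s) + 0.5 * s.dot(H @ s)       # m(x+s) - m(x)
        Fs = F(x + s)
        ared = 0.5 * Fs.dot(Fs) - 0.5 * f.dot(f)   # actual reduction
        r = ared / pred
        if r < eta1:
            mu *= gamma1                           # reject step, enlarge mu
        else:
            x = x + s                              # accept step
            f, J = F(x), jac(x)
            if r > eta2:
                mu *= gamma2                       # reduce mu
    return x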


SLIDE 15

The Levenberg–Marquardt step

Let r be the ratio of the actual to the predicted reduction of a step. A faster strategy for the update of µ is the following:

Algorithm (Nielsen's µ update)

if r > 0 then
  µ ← µ · max{1/3, 1 − (2r − 1)^3}; ν ← 2;
else
  µ ← µν; ν ← 2ν;
end if

H. B. Nielsen, Damping Parameter in Marquardt's Method, IMM, DTU, Report IMM-REP-1999-05, 1999.
http://www.imm.dtu.dk/~hbn/publ/TR9905.ps
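The same update in code (a sketch; ν is an auxiliary state, initialized to 2, carried across iterations):

def nielsen_update(mu, nu, r):
    # Damping update from Nielsen (1999); r is the gain ratio ared/pred.
    if r > 0:                                        # step accepted
        mu *= max(1.0 / 3.0, 1.0 - (2.0 * r - 1.0) ** 3)
        nu = 2.0
    else:                                            # step rejected
        mu *= nu
        nu *= 2.0
    return mu, nu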


SLIDE 16

The Dog-Leg step

Outline

1. The Nonlinear Least Squares Problem
2. The Levenberg–Marquardt step
3. The Dog-Leg step


SLIDE 17


The Dog-Leg step

As for the trust-region method, we have two search directions. One is the Gauss-Newton direction (when µ = 0):

d_{GN} = -\big( J(x)^T J(x) \big)^{-1} \nabla f(x)^T, \qquad \nabla f(x)^T = J(x)^T F(x)

and the other is the steepest-descent (gradient) direction (when µ → ∞):

d_{SD} = -\nabla f(x)^T = -J(x)^T F(x)

to be finished!
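The slide stops here ("to be finished"). As a hedged sketch of the standard dog-leg construction built from these two directions (my completion, not the author's: it scales d_SD to the Cauchy point and blends toward d_GN inside a trust radius Δ):

import numpy as np

def dogleg_step(J, Fx, delta):
    g = J.T @ Fx                                     # so d_SD = -g
    # Cauchy point: optimal step length along -g for the model 0.5*||F + J d||^2
    d_sd = -(g.dot(g) / np.sum((J @ g) ** 2)) * g
    d_gn, *_ = np.linalg.lstsq(J, -Fx, rcond=None)   # Gauss-Newton direction
    if np.linalg.norm(d_gn) <= delta:
        return d_gn                                  # full GN step fits in the region
    if np.linalg.norm(d_sd) >= delta:
        return (delta / np.linalg.norm(d_sd)) * d_sd # truncated steepest descent
    # otherwise walk from d_sd toward d_gn until hitting the trust-region boundary
    p = d_gn - d_sd
    a, b, c = p.dot(p), 2.0 * d_sd.dot(p), d_sd.dot(d_sd) - delta ** 2
    beta = (-b + np.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)
    return d_sd + beta * p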


SLIDE 18

The Dog-Leg step

References

  • J. Stoer and R. Bulirsch, Introduction to Numerical Analysis, Springer-Verlag, Texts in Applied Mathematics 12, 2002.

  • J. E. Dennis, Jr. and R. B. Schnabel, Numerical Methods for Unconstrained Optimization and Nonlinear Equations, SIAM, Classics in Applied Mathematics 16, 1996.
