Overview of Multivariate Optimization Topics (PowerPoint PPT presentation)



Example 1: Optimization Problem

[Figure: contour map of the example objective over a1, a2 ∈ [−5, 5]]

J. McNames, Portland State University, ECE 4/557 Multivariate Optimization, Ver. 1.14

Overview of Multivariate Optimization Topics

  • Problem definition
  • Algorithms

– Cyclic coordinate method
– Steepest descent
– Conjugate gradient algorithms
– PARTAN
– Newton’s method
– Levenberg-Marquardt

  • Concise, subjective summary


Multivariate Optimization Overview

  • The “unconstrained optimization” problem is a generalization of the line search problem
  • Find a vector a such that

a∗ = argmin_a f(a)

  • Note that there are no constraints on a
  • Example: find the vector of coefficients (w ∈ R^{p×1}) that minimizes the average absolute error of a linear model
  • Akin to a blind person trying to find their way to the bottom of a valley in a multidimensional landscape
  • We want to reach the bottom with the minimum number of “cane taps”
  • Also vaguely similar to taking core samples for oil prospecting
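The problem statement can be made concrete with a small numerical sketch. This is not the course's MATLAB code; it is a hypothetical NumPy example with a made-up objective whose minimizer is known, found here by brute-force grid search (the “cane taps” taken literally):

```python
import numpy as np

def f(a):
    # hypothetical smooth objective with a unique minimum at a* = (1, -2)
    return (a[0] - 1.0) ** 2 + (a[1] + 2.0) ** 2

# brute-force "cane tap" search over a grid covering a1, a2 in [-5, 5]
grid = np.linspace(-5.0, 5.0, 201)
A1, A2 = np.meshgrid(grid, grid)
Z = (A1 - 1.0) ** 2 + (A2 + 2.0) ** 2
i, j = np.unravel_index(np.argmin(Z), Z.shape)
a_star = np.array([A1[i, j], A2[i, j]])
print(a_star)  # approximately [1, -2]
```

Grid search needs exponentially many “taps” as p grows, which is why gradient-based algorithms are the usual choice.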


Example 1: MATLAB Code

function [] = OptimizationProblem();
% ==============================================================================
% User-Specified Parameters
% ==============================================================================
x = -5:0.05:5;
y = -5:0.05:5;
% ==============================================================================
% Evaluate the Function
% ==============================================================================
[X,Y] = meshgrid(x,y);
[Z,G] = OptFn(X,Y);
functionName = 'OptimizationProblem';
fileIdentifier = fopen([functionName '.tex'],'w');
% ==============================================================================
% Contour Map
% ==============================================================================
figure;
FigureSet(2,'Slides');
contour(x,y,Z,50);
xlabel('a_1'); ylabel('a_2');
zoom on;
AxisSet(8);
fileName = sprintf('%s-%s',functionName,'Contour');



case 1, view(45,10);
case 2, view(-55,22);
case 3, view(-131,10);
otherwise, error('Not implemented.');
end
fileName = sprintf('%s-%s%d',functionName,'Surface',c1);
print(fileName,'-depsc');
fprintf(fileIdentifier,'%%==============================================================================\n');
fprintf(fileIdentifier,'\\newslide\n');
fprintf(fileIdentifier,'\\slideheading{Example \\arabic{exc}: Optimization Problem}\n');
fprintf(fileIdentifier,'%%==============================================================================\n');
fprintf(fileIdentifier,'\\includegraphics[scale=1]{Matlab/%s}\n',fileName);
fprintf(fileIdentifier,'\n');
end
% ==============================================================================
% List the MATLAB Code
% ==============================================================================
fprintf(fileIdentifier,'%%==============================================================================\n');
fprintf(fileIdentifier,'\\newslide\n');
fprintf(fileIdentifier,'\\slideheading{Example \\arabic{exc}: MATLAB Code}\n');
fprintf(fileIdentifier,'%%==============================================================================\n');
fprintf(fileIdentifier,'\t\\matlabcode{Matlab/%s.m}\n',functionName);
fclose(fileIdentifier);


print(fileName,'-depsc');
fprintf(fileIdentifier,'%%==============================================================================\n');
fprintf(fileIdentifier,'\\newslide\n');
fprintf(fileIdentifier,'\\stepcounter{exc}\n');
fprintf(fileIdentifier,'\\slideheading{Example \\arabic{exc}: Optimization Problem}\n');
fprintf(fileIdentifier,'%%==============================================================================\n');
fprintf(fileIdentifier,'\\includegraphics[scale=1]{Matlab/%s}\n',fileName);
fprintf(fileIdentifier,'\n');
% ==============================================================================
% Quiver Map
% ==============================================================================
figure;
FigureSet(1,'Slides');
axis([-5 5 -5 5]);
contour(x,y,Z,50);
h = get(gca,'Children'); set(h,'LineWidth',0.2);
hold on;
xCoarse = -5:0.5:5;
yCoarse = -5:0.5:5;
[X,Y] = meshgrid(xCoarse,yCoarse);
[ZCoarse,GCoarse] = OptFn(X,Y);
nr = size(xCoarse,1);
dzx = GCoarse(1:nr,1:nr);
dzy = GCoarse(nr+(1:nr),1:nr);
quiver(xCoarse,yCoarse,dzx,dzy);
hold off;
xlabel('a_1'); ylabel('a_2');
zoom on;


Global Optimization?

  • In general, optimization algorithms try to find a local minimum in as few steps as possible
  • There are also “global” optimization algorithms based on ideas such as
    – Evolutionary computing
    – Genetic algorithms
    – Simulated annealing
  • None of these guarantee convergence in a finite number of iterations
  • All require a lot of computation

AxisSet(8);
fileName = sprintf('%s-%s',functionName,'Quiver');
print(fileName,'-depsc');
fprintf(fileIdentifier,'%%==============================================================================\n');
fprintf(fileIdentifier,'\\newslide\n');
fprintf(fileIdentifier,'\\slideheading{Example \\arabic{exc}: Optimization Problem}\n');
fprintf(fileIdentifier,'%%==============================================================================\n');
fprintf(fileIdentifier,'\\includegraphics[scale=1]{Matlab/%s}\n',fileName);
fprintf(fileIdentifier,'\n');
% ==============================================================================
% 3D Maps
% ==============================================================================
figure;
set(gcf,'Renderer','zbuffer');
FigureSet(1,'Slides');
h = surf(x,y,Z);
set(h,'LineStyle','None');
xlabel('a_1'); ylabel('a_2');
shading interp;
grid on;
AxisSet(8);
hl = light('Position',[0,0,30]);
set(hl,'Style','Local');
set(h,'BackFaceLighting','unlit')
material dull
for c1 = 1:3
switch c1


Cyclic Coordinate Method

  • 1. For i = 1 to p,

ai := argmin_α f([a1, a2, . . . , ai−1, α, ai+1, . . . , ap])

  • 2. Loop to 1 until convergence

+ Simple to implement
+ Each line search can be performed semi-globally to avoid shallow local minima
+ Can be used with nominal variables
+ f(a) can be discontinuous
+ No gradient required
− Very slow compared to gradient-based optimization algorithms
− Usually only practical when the number of parameters, p, is small

  • There are modified versions with faster convergence
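The two-step loop above can be sketched in a few lines. This is an illustrative NumPy translation, not the CyclicCoordinate.m from the slides: f is a hypothetical quadratic with its minimum at (1, −2), and each line search is a dense "semi-global" scan over [−10, 10]:

```python
import numpy as np

def f(a):
    # hypothetical convex quadratic test function, minimum at (1, -2)
    return (a[0] - 1.0) ** 2 + 2.0 * (a[1] + 2.0) ** 2 + 0.5 * (a[0] - 1.0) * (a[1] + 2.0)

def line_min(f, a, i, lo=-10.0, hi=10.0, n=2001):
    # semi-global scalar minimization over coordinate i by dense sampling
    alphas = np.linspace(lo, hi, n)
    vals = [f(np.concatenate([a[:i], [al], a[i + 1:]])) for al in alphas]
    return alphas[int(np.argmin(vals))]

a = np.array([4.0, 4.0])
for sweep in range(20):          # step 2: loop until (approximate) convergence
    for i in range(a.size):      # step 1: one line search per coordinate
        a[i] = line_min(f, a, i)
print(a)  # close to (1, -2), up to the sampling resolution
```

No gradient is used anywhere, which is the method's main selling point; the price is the slow, axis-aligned zigzag.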

Optimization Comments

  • Ideally, when we construct models we should favor those which can be optimized with few shallow local minima and reasonable computation
  • Graphically, you can think of the function to be minimized as the elevation in a complicated high-dimensional landscape
  • The problem is to find the lowest point
  • The most common approach is to go downhill
  • The gradient points in the most “uphill” direction
  • The steepest downhill direction is the opposite of the gradient
  • Most optimization algorithms use a line search algorithm
  • The methods mostly differ only in the way that the “direction of descent” is generated


Example 2: Cyclic Coordinate Method

[Figure: cyclic coordinate search path over the contour map, X, Y ∈ [−5, 5]]


Optimization Algorithm Outline

  • The basic steps of these algorithms are as follows
  • 1. Pick a starting vector a
  • 2. Find the direction of descent, d
  • 3. Move in that direction until a minimum is found:

α∗ := argmin_α f(a + αd)
a := a + α∗d

  • 4. Loop to 2 until convergence
  • Most of the theory of these algorithms is based on quadratic surfaces
  • Near local minima, this is a good approximation
  • Note that the functions should (must) have continuous gradients (almost) everywhere
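Steps 1 to 4 map directly onto code. A minimal NumPy sketch, assuming a hypothetical convex test function and a crude sampled line search in place of the course's LineSearch routine:

```python
import numpy as np

def f(a):
    # hypothetical convex test function with minimum at the origin
    return float(a @ a)

def grad(a):
    return 2.0 * a

def line_search(f, a, d, n=201):
    # crude line search: sample step sizes in [0, 1] and keep the best
    alphas = np.linspace(0.0, 1.0, n)
    vals = [f(a + al * d) for al in alphas]
    return alphas[int(np.argmin(vals))]

a = np.array([3.0, -4.0])          # 1. pick a starting vector
for _ in range(50):                # 4. loop until convergence
    d = -grad(a)                   # 2. direction of descent
    alpha = line_search(f, a, d)   # 3. move along d until a minimum is found
    a = a + alpha * d
print(f(a))  # essentially zero
```

Swapping out how d is chosen in step 2 yields each of the methods that follow, while steps 1, 3, and 4 stay the same.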


Example 2: Cyclic Coordinate Method

[Figure: Euclidean position error vs. iteration]


Example 2: Cyclic Coordinate Method

[Figure: close-up of the cyclic coordinate search path near the minimum]


Example 2: Relevant MATLAB Code

function [] = CyclicCoordinate();
%clear all; close all;
ns = 26;
x = -3;
y = 1;
b0 = -1;
ls = 30;
a = zeros(ns,2);
f = zeros(ns,1);
[z,dzx,dzy] = OptFn(x,y);
a(1,:) = [x y];
f(1) = z;
for cnt = 2:ns,
  if rem(cnt,2)==1,
    d = [1 0]'; % Along x direction
  else
    d = [0 1]'; % Along y direction
  end;
  [b,fmin] = LineSearch([x y]',d,b0,ls);
  x = x + b*d(1);
  y = y + b*d(2);


Example 2: Cyclic Coordinate Method

[Figure: function value vs. iteration]


print -depsc CyclicCoordinateContourB;
figure;
FigureSet(2,4.5,2.75);
k = 1:ns;
xerr = (sum(((a-ones(ns,1)*[xopt2 yopt2])').^2)').^(1/2);
h = plot(k-1,xerr,'b');
set(h(1),'Marker','.'); set(h,'MarkerSize',6);
xlabel('Iteration'); ylabel('Euclidean Position Error');
xlim([0 ns-1]); ylim([0 xerr(1)]);
grid on; set(gca,'Box','Off');
AxisSet(8);
print -depsc CyclicCoordinatePositionError;
figure;
FigureSet(2,4.5,2.75);
k = 1:ns;
h = plot(k-1,f,'b',[0 ns],zopt*[1 1],'r',[0 ns],zopt2*[1 1],'g');
set(h(1),'Marker','.'); set(h,'MarkerSize',6);
xlabel('Iteration'); ylabel('Function Value');
ylim([0 f(1)]); xlim([0 ns-1]);
grid on; set(gca,'Box','Off');
AxisSet(8);


a(cnt,:) = [x y];
f(cnt) = fmin;
end;
[x,y] = meshgrid(0+(-0.01:0.001:0.01),3+(-0.01:0.001:0.01));
[z,dzx,dzy] = OptFn(x,y);
[zopt,id1] = min(z); [zopt,id2] = min(zopt); id1 = id1(id2);
xopt = x(id1,id2); yopt = y(id1,id2);
[x,y] = meshgrid(1.883+(-0.02:0.001:0.02),-2.963+(-0.02:0.001:0.02));
[z,dzx,dzy] = OptFn(x,y);
[zopt2,id1] = min(z); [zopt2,id2] = min(zopt2); id1 = id1(id2);
xopt2 = x(id1,id2); yopt2 = y(id1,id2);
figure;
FigureSet(1,4.5,2.75);
[x,y] = meshgrid(-5:0.1:5,-5:0.1:5);
z = OptFn(x,y);
contour(x,y,z,50);
h = get(gca,'Children'); set(h,'LineWidth',0.2);
axis('square');
hold on;
h = plot(a(:,1),a(:,2),'k',a(:,1),a(:,2),'r');


print -depsc CyclicCoordinateErrorLinear;


set(h(1),'LineWidth',1.2); set(h(2),'LineWidth',0.6);
h = plot(xopt,yopt,'kx',xopt,yopt,'rx');
set(h(1),'LineWidth',1.5); set(h(2),'LineWidth',0.5);
set(h(1),'MarkerSize',5); set(h(2),'MarkerSize',4);
hold off;
xlabel('X'); ylabel('Y');
zoom on;
AxisSet(8);
print -depsc CyclicCoordinateContourA;
figure;
FigureSet(1,4.5,2.75);
[x,y] = meshgrid(-1.5+(-2:0.05:2),-1.5+(-2:0.05:2));
[z,dzx,dzy] = OptFn(x,y);
contour(x,y,z,75);
h = get(gca,'Children'); set(h,'LineWidth',0.2);
axis('square');
hold on;
h = plot(a(:,1),a(:,2),'k',a(:,1),a(:,2),'r');
set(h(1),'LineWidth',1.2); set(h(2),'LineWidth',0.6);
hold off;
xlabel('X'); ylabel('Y');
zoom on;
AxisSet(8);


Example 3: Steepest Descent

[Figure: steepest descent search path over the contour map, X, Y ∈ [−5, 5]]


Steepest Descent

The gradient of the function f(a) is defined as the vector of partial derivatives:

∇af(a) ≡ [ ∂f(a)/∂a1   ∂f(a)/∂a2   · · ·   ∂f(a)/∂ap ]^T

  • It can be shown that the gradient, ∇af(a), “points” in the direction of maximum ascent
  • The negative of the gradient, −∇af(a), “points” in the direction of maximum descent
  • A vector d is a direction of descent if there exists an ε such that f(a + λd) < f(a) for all 0 < λ < ε
  • It can also be shown that d is a direction of descent iff (∇af(a))^T d < 0
  • The algorithm of steepest descent uses d = −∇af(a)
  • It is the most fundamental of all algorithms for minimizing a continuously differentiable function
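A compact sketch of steepest descent on a hypothetical ill-conditioned quadratic (not the slides' OptFn). The descent condition (∇af(a))^T d < 0 is asserted every iteration, and the line search uses the closed-form optimal step for a quadratic:

```python
import numpy as np

H = np.diag([2.0, 20.0])   # Hessian of the hypothetical quadratic below

def f(a):
    # hypothetical ill-conditioned quadratic, minimum at the origin
    return a[0] ** 2 + 10.0 * a[1] ** 2

def grad(a):
    return H @ a

a = np.array([4.0, 1.0])
for _ in range(200):
    g = grad(a)
    d = -g                          # steepest-descent direction
    assert g @ d < 0                # d satisfies the descent condition
    alpha = (g @ g) / (d @ H @ d)   # exact minimizer of f(a + alpha d) for a quadratic
    a = a + alpha * d
print(f(a))  # tiny, but reached by a slow zigzag
```

The larger the eigenvalue spread of the Hessian, the worse the zigzagging, which is exactly the slow near-quadratic convergence noted on the pros/cons slide.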


Example 3: Steepest Descent

[Figure: close-up of the steepest descent search path]


Steepest Descent

+ Very stable algorithm
− Can converge very slowly once near a local minimum, where the surface is approximately quadratic


Example 3: Relevant MATLAB Code

function [] = SteepestDescent();
%clear all; close all;
ns = 26; x = -3; y = 1; b0 = 0.01; ls = 30;
a = zeros(ns,2);
f = zeros(ns,1);
[z,g] = OptFn(x,y);
a(1,:) = [x y];
f(1) = z;
d = -g/norm(g);
for cnt = 2:ns,
  [b,fmin] = LineSearch([x y]',d,b0,ls);
  x = x + b*d(1);
  y = y + b*d(2);
  [z,g] = OptFn(x,y);
  d = -g;
  d = d/norm(d);


Example 3: Steepest Descent

[Figure: function value vs. iteration]


a(cnt,:) = [x y];
f(cnt) = z;
end;
[x,y] = meshgrid(0+(-0.01:0.001:0.01),3+(-0.01:0.001:0.01));
[z,dzx,dzy] = OptFn(x,y);
[zopt,id1] = min(z); [zopt,id2] = min(zopt); id1 = id1(id2);
xopt = x(id1,id2); yopt = y(id1,id2);
[x,y] = meshgrid(1.883+(-0.02:0.001:0.02),-2.963+(-0.02:0.001:0.02));
[z,dzx,dzy] = OptFn(x,y);
[zopt2,id1] = min(z); [zopt2,id2] = min(zopt2); id1 = id1(id2);
xopt2 = x(id1,id2); yopt2 = y(id1,id2);
[zopt zopt2]
figure;
FigureSet(1,4.5,2.75);
[x,y] = meshgrid(-5:0.1:5,-5:0.1:5);
z = OptFn(x,y);
contour(x,y,z,50);
h = get(gca,'Children'); set(h,'LineWidth',0.2);
axis('square');
hold on;


Example 3: Steepest Descent Method

[Figure: Euclidean position error vs. iteration]


AxisSet(8);
print -depsc SteepestDescentErrorLinear;


h = plot(a(:,1),a(:,2),'k',a(:,1),a(:,2),'r');
set(h(1),'LineWidth',1.2); set(h(2),'LineWidth',0.6);
h = plot(xopt,yopt,'kx',xopt,yopt,'rx');
set(h(1),'LineWidth',1.5); set(h(2),'LineWidth',0.5);
set(h(1),'MarkerSize',5); set(h(2),'MarkerSize',4);
hold off;
xlabel('X'); ylabel('Y');
zoom on;
AxisSet(8);
print -depsc SteepestDescentContourA;
figure;
FigureSet(1,4.5,2.75);
[x,y] = meshgrid(-1.6+(-0.5:0.01:0.5),-1.7+(-0.5:0.01:0.5));
z = OptFn(x,y);
contour(x,y,z,75);
h = get(gca,'Children'); set(h,'LineWidth',0.2);
axis('square');
hold on;
h = plot(a(:,1),a(:,2),'k',a(:,1),a(:,2),'r');
set(h(1),'LineWidth',1.2); set(h(2),'LineWidth',0.6);
hold off;
xlabel('X'); ylabel('Y');
zoom on;


Conjugate Gradient Algorithms

  • 1. Take a steepest descent step
  • 2. For i = 2 to p
    – α := argmin_α f(a + αd)
    – a := a + αd
    – gi := ∇f(a)
    – β := (gi^T gi) / (gi−1^T gi−1)
    – d := −gi + β di−1
  • 3. Loop to 1 until convergence
  • Based on quadratic approximations of f
  • Called the Fletcher-Reeves method
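For a quadratic f(a) = ½ a^T Q a − b^T a the recursion can be checked directly: with exact line searches it reaches the minimizer (the solution of Qa = b) in at most p steps. A hypothetical two-dimensional NumPy sketch:

```python
import numpy as np

# hypothetical quadratic f(a) = 0.5 a'Qa - b'a with symmetric positive-definite Q
Q = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])

def grad(a):
    return Q @ a - b

a = np.zeros(2)
g = grad(a)
d = -g                                 # step 1: steepest descent direction
for _ in range(2):                     # step 2: for i = 2 to p (p = 2 here)
    alpha = -(g @ d) / (d @ Q @ d)     # exact line search for a quadratic
    a = a + alpha * d
    g_new = grad(a)
    beta = (g_new @ g_new) / (g @ g)   # Fletcher-Reeves beta
    d = -g_new + beta * d
    g = g_new
print(a)  # the solution of Qa = b, reached in p = 2 steps
```

On non-quadratic functions the same loop is simply restarted (step 3) every p iterations.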

AxisSet(8);
print -depsc SteepestDescentContourB;
figure;
FigureSet(2,4.5,2.75);
k = 1:ns;
xerr = (sum(((a-ones(ns,1)*[xopt2 yopt2])').^2)').^(1/2);
h = plot(k-1,xerr,'b');
set(h(1),'Marker','.'); set(h,'MarkerSize',6);
xlabel('Iteration'); ylabel('Euclidean Position Error');
xlim([0 ns-1]); ylim([0 xerr(1)]);
grid on; set(gca,'Box','Off');
AxisSet(8);
print -depsc SteepestDescentPositionError;
figure;
FigureSet(2,4.5,2.75);
k = 1:ns;
h = plot(k-1,f,'b',[0 ns],zopt*[1 1],'r',[0 ns],zopt2*[1 1],'g');
set(h(1),'Marker','.'); set(h,'MarkerSize',6);
xlabel('Iteration'); ylabel('Function Value');
ylim([0 f(1)]); xlim([0 ns-1]);
grid on; set(gca,'Box','Off');


Example 4: Fletcher-Reeves Conjugate Gradient

[Figure: function value vs. iteration]


Example 4: Fletcher-Reeves Conjugate Gradient

[Figure: Fletcher-Reeves search path over the contour map, X, Y ∈ [−5, 5]]


Example 4: Fletcher-Reeves Conjugate Gradient

[Figure: Euclidean position error vs. iteration]


Example 4: Fletcher-Reeves Conjugate Gradient

[Figure: close-up of the Fletcher-Reeves search path near the minimum]


h = plot(a(:,1),a(:,2),'k',a(:,1),a(:,2),'r');
set(h(1),'LineWidth',1.2); set(h(2),'LineWidth',0.6);
h = plot(xopt,yopt,'kx',xopt,yopt,'rx');
set(h(1),'LineWidth',1.5); set(h(2),'LineWidth',0.5);
set(h(1),'MarkerSize',5); set(h(2),'MarkerSize',4);
hold off;
xlabel('X'); ylabel('Y');
zoom on;
AxisSet(8);
print -depsc FletcherReevesContourA;
figure;
FigureSet(1,4.5,2.75);
[x,y] = meshgrid(1.5:0.01:2.5,-3.5:0.01:-2.5);
z = OptFn(x,y);
contour(x,y,z,75);
h = get(gca,'Children'); set(h,'LineWidth',0.2);
axis('square');
hold on;
h = plot(a(:,1),a(:,2),'k',a(:,1),a(:,2),'r');
set(h(1),'LineWidth',1.2); set(h(2),'LineWidth',0.6);
hold off;
xlabel('X'); ylabel('Y');
zoom on;


Example 4: Relevant MATLAB Code

function [] = FletcherReeves();
%clear all; close all;
ns = 26; x = -3; y = 1; b0 = 0.01; ls = 30;
a = zeros(ns,2);
f = zeros(ns,1);
[z,g] = OptFn(x,y);
a(1,:) = [x y];
f(1) = z;
d = -g/norm(g); % First direction
for cnt = 2:ns,
  [b,fmin] = LineSearch([x y]',d,b0,ls);
  x = x + b*d(1);
  y = y + b*d(2);
  go = g; % Old gradient
  [z,g] = OptFn(x,y);
  beta = (g'*g)/(go'*go);


AxisSet(8);
print -depsc FletcherReevesContourB;
figure;
FigureSet(2,4.5,2.75);
k = 1:ns;
xerr = (sum(((a-ones(ns,1)*[xopt2 yopt2])').^2)').^(1/2);
h = plot(k-1,xerr,'b');
set(h(1),'Marker','.'); set(h,'MarkerSize',6);
xlabel('Iteration'); ylabel('Euclidean Position Error');
xlim([0 ns-1]); ylim([0 xerr(1)]);
grid on; set(gca,'Box','Off');
AxisSet(8);
print -depsc FletcherReevesPositionError;
figure;
FigureSet(2,4.5,2.75);
k = 1:ns;
h = plot(k-1,f,'b',[0 ns],zopt*[1 1],'r',[0 ns],zopt2*[1 1],'g');
set(h(1),'Marker','.'); set(h,'MarkerSize',6);
xlabel('Iteration'); ylabel('Function Value');
ylim([0 f(1)]); xlim([0 ns-1]);
grid on; set(gca,'Box','Off');


d = -g + beta*d;
a(cnt,:) = [x y];
f(cnt) = z;
end;
[x,y] = meshgrid(0+(-0.01:0.001:0.01),3+(-0.01:0.001:0.01));
[z,dzx,dzy] = OptFn(x,y);
[zopt,id1] = min(z); [zopt,id2] = min(zopt); id1 = id1(id2);
xopt = x(id1,id2); yopt = y(id1,id2);
[x,y] = meshgrid(1.883+(-0.02:0.001:0.02),-2.963+(-0.02:0.001:0.02));
[z,dzx,dzy] = OptFn(x,y);
[zopt2,id1] = min(z); [zopt2,id2] = min(zopt2); id1 = id1(id2);
xopt2 = x(id1,id2); yopt2 = y(id1,id2);
figure;
FigureSet(1,4.5,2.75);
[x,y] = meshgrid(-5:0.1:5,-5:0.1:5);
z = OptFn(x,y);
contour(x,y,z,50);
h = get(gca,'Children'); set(h,'LineWidth',0.2);
axis('square');
hold on;


Example 5: Polak-Ribiere Conjugate Gradient

[Figure: Polak-Ribiere search path over the contour map, X, Y ∈ [−5, 5]]


AxisSet(8);
print -depsc FletcherReevesErrorLinear;


Example 5: Polak-Ribiere Conjugate Gradient

[Figure: close-up of the Polak-Ribiere search path near the minimum]


Conjugate Gradient Algorithms Continued

  • There is also a variant called Polak-Ribiere where

β := ((gi − gi−1)^T gi) / (gi−1^T gi−1)

+ Only requires the gradient
+ Converges in a finite number of steps when f(a) is quadratic and perfect line searches are used
− Less stable numerically than steepest descent
− Sensitive to inexact line searches
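The only difference from Fletcher-Reeves is the numerator of β. A small sketch with made-up gradient vectors; note that when successive gradients are orthogonal, as they are for a quadratic with perfect line searches, the two formulas coincide:

```python
import numpy as np

def beta_fletcher_reeves(g_new, g_old):
    # beta = (g_i' g_i) / (g_{i-1}' g_{i-1})
    return (g_new @ g_new) / (g_old @ g_old)

def beta_polak_ribiere(g_new, g_old):
    # beta = ((g_i - g_{i-1})' g_i) / (g_{i-1}' g_{i-1})
    return ((g_new - g_old) @ g_new) / (g_old @ g_old)

g_old = np.array([3.0, 0.0])
g_new = np.array([0.0, 2.0])   # orthogonal to g_old, as after a perfect line search
b_fr = beta_fletcher_reeves(g_new, g_old)
b_pr = beta_polak_ribiere(g_new, g_old)
print(b_fr, b_pr)  # equal here because g_new @ g_old = 0
```

With inexact line searches the gradients are no longer orthogonal, the extra −gi−1^T gi term kicks in, and the two methods behave differently in practice.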


Example 5: MATLAB Code

function [] = PolakRibiere();
%clear all; close all;
ns = 26; x = -3; y = 1; b0 = 0.01; ls = 30;
a = zeros(ns,2);
f = zeros(ns,1);
[z,g] = OptFn(x,y);
a(1,:) = [x y];
f(1) = z;
d = -g/norm(g); % First direction
for cnt = 2:ns,
  [b,fmin] = LineSearch([x y]',d,b0,ls);
  x = x + b*d(1);
  y = y + b*d(2);
  go = g; % Old gradient
  [z,g] = OptFn(x,y);
  beta = ((g-go)'*g)/(go'*go);


Example 5: Polak-Ribiere Conjugate Gradient

[Figure: function value vs. iteration]


d = -g + beta*d;
a(cnt,:) = [x y];
f(cnt) = z;
end;
[x,y] = meshgrid(0+(-0.01:0.001:0.01),3+(-0.01:0.001:0.01));
[z,dzx,dzy] = OptFn(x,y);
[zopt,id1] = min(z); [zopt,id2] = min(zopt); id1 = id1(id2);
xopt = x(id1,id2); yopt = y(id1,id2);
[x,y] = meshgrid(1.883+(-0.02:0.001:0.02),-2.963+(-0.02:0.001:0.02));
[z,dzx,dzy] = OptFn(x,y);
[zopt2,id1] = min(z); [zopt2,id2] = min(zopt2); id1 = id1(id2);
xopt2 = x(id1,id2); yopt2 = y(id1,id2);
figure;
FigureSet(1,4.5,2.75);
[x,y] = meshgrid(-5:0.1:5,-5:0.1:5);
z = OptFn(x,y);
contour(x,y,z,50);
h = get(gca,'Children'); set(h,'LineWidth',0.2);
axis('square');
hold on;


Example 5: Polak-Ribiere Conjugate Gradient

[Figure: Euclidean position error vs. iteration]


AxisSet(8);
print -depsc PolakRibiereErrorLinear;


h = plot(a(:,1),a(:,2),'k',a(:,1),a(:,2),'r');
set(h(1),'LineWidth',1.2); set(h(2),'LineWidth',0.6);
h = plot(xopt,yopt,'kx',xopt,yopt,'rx');
set(h(1),'LineWidth',1.5); set(h(2),'LineWidth',0.5);
set(h(1),'MarkerSize',5); set(h(2),'MarkerSize',4);
hold off;
xlabel('X'); ylabel('Y');
zoom on;
AxisSet(8);
print -depsc PolakRibiereContourA;
figure;
FigureSet(1,4.5,2.75);
[x,y] = meshgrid(1.5:0.01:2.5,-3.5:0.01:-2.5);
z = OptFn(x,y);
contour(x,y,z,75);
h = get(gca,'Children'); set(h,'LineWidth',0.2);
axis('square');
hold on;
h = plot(a(:,1),a(:,2),'k',a(:,1),a(:,2),'r');
set(h(1),'LineWidth',1.2); set(h(2),'LineWidth',0.6);
hold off;
xlabel('X'); ylabel('Y');
zoom on;


Parallel Tangents (PARTAN)

  • 1. First gradient step
    – d := −∇f(a)
    – α := argmin_α f(a + αd)
    – sp := αd
    – a := a + sp
  • 2. Gradient step
    – dg := −∇f(a)
    – α := argmin_α f(a + αdg)
    – sg := αdg
    – a := a + sg
  • 3. Conjugate step
    – dp := sp + sg
    – α := argmin_α f(a + αdp)
    – sp := αdp
    – a := a + sp
  • 4. Loop to 2 until convergence
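The loop above can be sketched in NumPy on a hypothetical quadratic f(a) = ½ a^T Q a (minimum at the origin), with exact line searches; for a two-parameter quadratic, the accelerated (conjugate) step lands on the minimizer after a single full iteration:

```python
import numpy as np

Q = np.array([[5.0, 2.0], [2.0, 1.0]])   # hypothetical SPD matrix; f(a) = 0.5 a'Qa

def grad(a):
    return Q @ a

def step(a, d):
    # exact line search: minimize f(a + alpha d) along direction d
    alpha = -(grad(a) @ d) / (d @ Q @ d)
    return a + alpha * d

a_prev2 = np.array([4.0, -3.0])           # a0
a_prev1 = step(a_prev2, -grad(a_prev2))   # a1: first step is steepest descent
for _ in range(5):
    g = grad(a_prev1)
    if np.linalg.norm(g) < 1e-12:
        break                             # converged
    a_mid = step(a_prev1, -g)             # gradient step
    d = a_mid - a_prev2                   # direction through the point from two steps ago
    a_new = step(a_mid, d) if np.linalg.norm(d) > 1e-12 else a_mid  # conjugate step
    a_prev2, a_prev1 = a_prev1, a_new
print(a_prev1)  # near the minimizer at the origin
```

Note that each outer iteration costs two line searches, which is the main drawback listed on the pros/cons slide.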

AxisSet(8);
print -depsc PolakRibiereContourB;
figure;
FigureSet(2,4.5,2.75);
k = 1:ns;
xerr = (sum(((a-ones(ns,1)*[xopt2 yopt2])').^2)').^(1/2);
h = plot(k-1,xerr,'b');
set(h(1),'Marker','.'); set(h,'MarkerSize',6);
xlabel('Iteration'); ylabel('Euclidean Position Error');
xlim([0 ns-1]); ylim([0 xerr(1)]);
grid on; set(gca,'Box','Off');
AxisSet(8);
print -depsc PolakRibierePositionError;
figure;
FigureSet(2,4.5,2.75);
k = 1:ns;
h = plot(k-1,f,'b',[0 ns],zopt*[1 1],'r',[0 ns],zopt2*[1 1],'g');
set(h(1),'Marker','.'); set(h,'MarkerSize',6);
xlabel('Iteration'); ylabel('Function Value');
ylim([0 f(1)]); xlim([0 ns-1]);
grid on; set(gca,'Box','Off');


Example 6: PARTAN

[Figure: close-up of the PARTAN search path near the minimum]


PARTAN Concept

[Figure: PARTAN search path through points a0 … a7]

  • First two steps are steepest descent
  • Thereafter, each iteration consists of two steps
  • 1. Search along the direction di = ai − ai−2, where ai is the current point and ai−2 is the point from two steps ago
  • 2. Search in the direction of the negative gradient, di = −∇f(ai)


Example 6: PARTAN

[Figure: function value vs. iteration]


Example 6: PARTAN

[Figure: PARTAN search path over the contour map, X, Y ∈ [−5, 5]]


cnt = 2;
while cnt < ns,
  % Gradient step
  [z,g] = OptFn(x,y);
  d = -g/norm(g); % Direction
  [bg,fmin] = LineSearch([x y]',d,b0,ls);
  xg = x + bg*d(1);
  yg = y + bg*d(2);
  cnt = cnt + 1;
  a(cnt,:) = [xg yg];
  f(cnt) = OptFn(xg,yg);
  fprintf('G : %d %5.3f\n',cnt,f(cnt));
  if cnt==ns, break; end;
  % Conjugate step
  d = [xg-xa yg-ya]';
  if norm(d)~=0,
    d = d/norm(d);
    [bp,fmin] = LineSearch([xg yg]',d,b0,ls);
  else
    bp = 0;
  end;
  if bp>0, % Line search in conjugate direction was successful
    fprintf('P :');
    x = xg + bp*d(1);
    y = yg + bp*d(2);


Example 6: PARTAN

[Figure: Euclidean position error vs. iteration]


else % Could not move - do another gradient update
  cnt = cnt + 1;
  a(cnt,:) = a(cnt-1,:);
  f(cnt) = f(cnt-1);
  if cnt==ns, break; end;
  fprintf('G2:');
  [z,g] = OptFn(xg,yg);
  d = -g/norm(g); % Direction
  [bp,fmin] = LineSearch([xg yg]',d,b0,ls);
  x = xg + bp*d(1);
  y = yg + bp*d(2);
end;
% Update anchor point
xa = xg;
ya = yg;
cnt = cnt + 1;
a(cnt,:) = [x y];
f(cnt) = OptFn(x,y);
fprintf(' %d %5.3f\n',cnt,f(cnt));
end;
[x,y] = meshgrid(0+(-0.01:0.001:0.01),3+(-0.01:0.001:0.01));
[z,dzx,dzy] = OptFn(x,y);
[zopt,id1] = min(z); [zopt,id2] = min(zopt); id1 = id1(id2);
xopt = x(id1,id2);


Example 6: MATLAB Code

function [] = Partan();
%clear all; close all;
ns = 26; x = -3; y = 1; b0 = 0.01; ls = 30;
a = zeros(ns,2);
f = zeros(ns,1);
[z,g] = OptFn(x,y);
a(1,:) = [x y];
f(1) = z;
xa = x; ya = y;
% First step - substitute for a conjugate step
d = -g/norm(g); % First direction
[bp,fmin] = LineSearch([x y]',d,b0,100);
x = x + bp*d(1); % Stand-in for a conjugate step
y = y + bp*d(2);
a(2,:) = [x y];
f(2) = fmin;


xlim([0 ns-1]); ylim([0 xerr(1)]);
grid on; set(gca,'Box','Off');
AxisSet(8);
print -depsc PartanPositionError;
figure;
FigureSet(2,4.5,2.75);
k = 1:ns;
h = plot(k-1,f,'b',[0 ns],zopt*[1 1],'r',[0 ns],zopt2*[1 1],'g');
set(h(1),'Marker','.'); set(h,'MarkerSize',6);
xlabel('Iteration'); ylabel('Function Value');
ylim([0 f(1)]); xlim([0 ns-1]);
grid on; set(gca,'Box','Off');
AxisSet(8);
print -depsc PartanErrorLinear;


yopt = y(id1,id2);
[x,y] = meshgrid(1.883+(-0.02:0.001:0.02),-2.963+(-0.02:0.001:0.02));
[z,dzx,dzy] = OptFn(x,y);
[zopt2,id1] = min(z); [zopt2,id2] = min(zopt2); id1 = id1(id2);
xopt2 = x(id1,id2); yopt2 = y(id1,id2);
figure;
FigureSet(1,4.5,2.75);
[x,y] = meshgrid(-5:0.1:5,-5:0.1:5);
z = OptFn(x,y);
contour(x,y,z,50);
h = get(gca,'Children'); set(h,'LineWidth',0.2);
axis('square');
hold on;
h = plot(a(:,1),a(:,2),'k',a(:,1),a(:,2),'r');
set(h(1),'LineWidth',1.2); set(h(2),'LineWidth',0.6);
h = plot(xopt,yopt,'kx',xopt,yopt,'rx');
set(h(1),'LineWidth',1.5); set(h(2),'LineWidth',0.5);
set(h(1),'MarkerSize',5); set(h(2),'MarkerSize',4);
hold off;
xlabel('X'); ylabel('Y');
zoom on;


PARTAN Pros and Cons

[Figure: PARTAN search path through points a0 … a7]

+ For quadratic functions, converges in a finite number of steps
+ Easier to implement than second-order methods
+ Can be used with a large number of parameters
+ Each (composite) step is at least as good as steepest descent
+ Tolerant of inexact line searches
− Each (composite) step requires two line searches


AxisSet(8);
print -depsc PartanContourA;
figure;
FigureSet(1,4.5,2.75);
[x,y] = meshgrid(1.5:0.01:2.5,-3.5:0.01:-2.5);
z = OptFn(x,y);
contour(x,y,z,75);
h = get(gca,'Children'); set(h,'LineWidth',0.2);
axis('square');
hold on;
h = plot(a(:,1),a(:,2),'k',a(:,1),a(:,2),'r');
set(h(1),'LineWidth',1.2); set(h(2),'LineWidth',0.6);
hold off;
xlabel('X'); ylabel('Y');
zoom on;
AxisSet(8);
print -depsc PartanContourB;
figure;
FigureSet(2,4.5,2.75);
k = 1:ns;
xerr = (sum(((a-ones(ns,1)*[xopt2 yopt2])').^2)').^(1/2);
h = plot(k-1,xerr,'b');
set(h(1),'Marker','.'); set(h,'MarkerSize',6);
xlabel('Iteration'); ylabel('Euclidean Position Error');


SLIDE 18

Example 7: Newton’s with Steepest Descent Safeguard

[Plot: search trajectory near the minimum, X vs. Y]


Newton's Method

ak+1 = ak − H(ak)−1 ∇f(ak)

where ∇f(ak) is the gradient and H(ak) is the Hessian of f(a),

H(ak) ≡ ⎡ ∂²f(a)/∂a₁²     ∂²f(a)/∂a₁∂a₂   ⋯   ∂²f(a)/∂a₁∂aₚ ⎤
        ⎢ ∂²f(a)/∂a₂∂a₁   ∂²f(a)/∂a₂²     ⋯   ∂²f(a)/∂a₂∂aₚ ⎥
        ⎢       ⋮                ⋮         ⋱         ⋮       ⎥
        ⎣ ∂²f(a)/∂aₚ∂a₁   ∂²f(a)/∂aₚ∂a₂   ⋯   ∂²f(a)/∂aₚ²   ⎦

  • Based on a quadratic approximation of the function f(a)
  • If f(a) is quadratic, converges in one step
  • If H(a) is positive-definite, the problem is well defined near local minima where f(a) is nearly quadratic
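The update above can be sketched in a few lines of Python. The quadratic test function f(a) = (a1 − 1)² + 2(a2 + 2)² and its analytic gradient and Hessian are illustration choices (not the course's OptFn); because f is quadratic, a single Newton step lands exactly on the minimizer (1, −2):

```python
# One Newton step in 2-D: a_{k+1} = a_k - H(a_k)^{-1} grad f(a_k),
# for the quadratic f(a) = (a1-1)^2 + 2*(a2+2)^2.

def grad(a1, a2):
    # Analytic gradient of f
    return (2.0 * (a1 - 1.0), 4.0 * (a2 + 2.0))

def hess(a1, a2):
    # Analytic (constant) Hessian of f
    return ((2.0, 0.0), (0.0, 4.0))

def newton_step(a1, a2):
    """One Newton update using a hand-coded 2x2 matrix inverse."""
    g1, g2 = grad(a1, a2)
    (h11, h12), (h21, h22) = hess(a1, a2)
    det = h11 * h22 - h12 * h21
    d1 = -( h22 * g1 - h12 * g2) / det
    d2 = -(-h21 * g1 + h11 * g2) / det
    return a1 + d1, a2 + d2

a1, a2 = newton_step(5.0, -5.0)   # arbitrary starting point
print(a1, a2)                     # one step reaches the minimizer (1, -2)
```

On a non-quadratic function the same step only approximates the minimizer, which is why the safeguarded variant in Example 7 falls back to steepest descent when the Newton direction is not a descent direction.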


Example 7: Newton’s with Steepest Descent Safeguard

[Plot: Function Value vs. Iteration]


Example 7: Newton’s with Steepest Descent Safeguard

[Plot: search trajectory over the full domain, X vs. Y]


SLIDE 19

    y = y + b*d(2);
    [z,g,H] = OptFn(x,y);
    a(cnt,:) = [x y]; f(cnt) = z;
end;
[x,y] = meshgrid(0+(-0.01:0.001:0.01),3+(-0.01:0.001:0.01));
[z,dzx,dzy] = OptFn(x,y);
[zopt,id1] = min(z); [zopt,id2] = min(zopt); id1 = id1(id2);
xopt = x(id1,id2); yopt = y(id1,id2);
[x,y] = meshgrid(1.883+(-0.02:0.001:0.02),-2.963+(-0.02:0.001:0.02));
[z,dzx,dzy] = OptFn(x,y);
[zopt2,id1] = min(z); [zopt2,id2] = min(zopt2); id1 = id1(id2);
xopt2 = x(id1,id2); yopt2 = y(id1,id2);
figure;
FigureSet(1,4.5,2.75);
[x,y] = meshgrid(-5:0.1:5,-5:0.1:5);
z = OptFn(x,y);
contour(x,y,z,50);
h = get(gca,'Children'); set(h,'LineWidth',0.2);


Example 7: Newton’s with Steepest Descent Safeguard

[Plot: Euclidean Position Error vs. Iteration]


axis('square');
hold on;
h = plot(a(:,1),a(:,2),'k',a(:,1),a(:,2),'r');
set(h(1),'LineWidth',1.2); set(h(2),'LineWidth',0.6);
h = plot(xopt,yopt,'kx',xopt,yopt,'rx');
set(h(1),'LineWidth',1.5); set(h(2),'LineWidth',0.5);
set(h(1),'MarkerSize',5); set(h(2),'MarkerSize',4);
hold off;
xlabel('X'); ylabel('Y');
zoom on;
AxisSet(8);
print -depsc NewtonsContourA;
figure;
FigureSet(1,4.5,2.75);
[x,y] = meshgrid(1.0+(-1:0.02:1),-2.4+(-1:0.02:1));
z = OptFn(x,y);
contour(x,y,z,75);
h = get(gca,'Children'); set(h,'LineWidth',0.2);
axis('square');
hold on;
h = plot(a(:,1),a(:,2),'k',a(:,1),a(:,2),'r');
set(h(1),'LineWidth',1.2); set(h(2),'LineWidth',0.6);
hold off;
xlabel('X');


Example 7: Relevant MATLAB Code

function [] = Newtons();
%clear all; close all;
ns = 100;
x = 3;                               % Starting x
y = 1;                               % Starting y
b0 = 1;
a = zeros(ns,2); f = zeros(ns,1);
[z,g,H] = OptFn(x,y);
a(1,:) = [x y]; f(1) = z;
for cnt = 2:ns,
    d = -inv(H)*g;
    if d'*g>0,                       % Revert to steepest descent if d is not a descent direction
        %fprintf('(%2d of %2d) Min. Eig:%5.3f Reverting...\n',cnt,ns,min(eig(H)));
        d = -g;
    end;
    d = d/norm(d);
    [b,fmin] = LineSearch([x y]',d,b0,100);
    %a(cnt,:) = (a(cnt-1,:)' - inv(H)*g)';   % Pure Newton's Method
    x = x + b*d(1);


SLIDE 20

Newton's Method Pros and Cons

ak+1 = ak − H(ak)−1 ∇f(ak)

+ Very fast convergence near local minima
− Not guaranteed to converge (may actually diverge)
− Requires the p × p Hessian
− Requires a p × p matrix inverse that uses O(p³) operations


ylabel('Y');
zoom on;
AxisSet(8);
print -depsc NewtonsContourB;
figure;
FigureSet(2,4.5,2.75);
k = 1:ns;
xerr = (sum(((a-ones(ns,1)*[xopt2 yopt2])').^2)').^(1/2);
h = plot(k-1,xerr,'b');
set(h(1),'Marker','.'); set(h,'MarkerSize',6);
xlabel('Iteration'); ylabel('Euclidean Position Error');
xlim([0 ns-1]); ylim([0 xerr(1)]);
grid on; set(gca,'Box','Off');
AxisSet(8);
print -depsc NewtonsPositionError;
figure;
FigureSet(2,4.5,2.75);
k = 1:ns;
h = plot(k-1,f,'b',[0 ns],zopt*[1 1],'r',[0 ns],zopt2*[1 1],'g');
set(h(1),'Marker','.'); set(h,'MarkerSize',6);
xlabel('Iteration'); ylabel('Function Value');
ylim([0 f(1)]); xlim([0 ns-1]);


Levenberg-Marquardt

  • 1. Determine if ǫkI + H(ak) is positive definite. If not, ǫk := 4ǫk

and repeat.

  • 2. Solve the following equation for ak+1

[ǫkI + H(ak)] (ak+1 − ak) = −∇f(ak) 3. rk ≡ f(ak) − f(ak+1) q(ak) − q(ak+1) where q(a) is the quadratic approximation of f(a) based on the f(a), ∇f(a), and H(ak)

  • 4. If rk < 0.25, then ǫk+1 := 4ǫk

If rk > 0.75, then ǫk+1 := 1

2ǫk

If rk ≤ 0, then ak+1 := ak

  • 5. If not converged, k := k + 1 and loop to 1.
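The loop can be sketched in Python on a hypothetical smooth test function. The function f(a) = (a1 − 1)⁴ + (a2 + 2)², its derivatives, the starting point, and the initial ǫ are all illustration choices (not the course's OptFn); the diagonal Hessian lets the linear solve be written out directly:

```python
# Levenberg-Marquardt sketch for f(a) = (a1-1)^4 + (a2+2)^2, minimum at (1, -2).

def f(a):
    return (a[0] - 1.0) ** 4 + (a[1] + 2.0) ** 2

def grad(a):
    return (4.0 * (a[0] - 1.0) ** 3, 2.0 * (a[1] + 2.0))

def hess(a):
    # Diagonal Hessian of this separable test function
    return (12.0 * (a[0] - 1.0) ** 2, 2.0)

def lm(a, eps=1e-4, iters=50):
    for _ in range(iters):
        g, h = grad(a), hess(a)
        while min(h[0] + eps, h[1] + eps) <= 0:     # keep eps*I + H pos. def.
            eps *= 4.0
        # Solve (eps*I + H) * step = -grad (a diagonal system here)
        step = (-g[0] / (h[0] + eps), -g[1] / (h[1] + eps))
        a_new = (a[0] + step[0], a[1] + step[1])
        # Model decrease: q(a) - q(a_new) = -g'step - 0.5*step'*H*step
        model = -(g[0] * step[0] + g[1] * step[1]) \
                - 0.5 * (h[0] * step[0] ** 2 + h[1] * step[1] ** 2)
        r = (f(a) - f(a_new)) / model if model != 0 else 1.0
        if r < 0.25:
            eps *= 4.0        # poor model fit: damp more (shorter step)
        elif r > 0.75:
            eps /= 2.0        # good model fit: move toward a Newton step
        if r > 0.0:
            a = a_new         # accept only if f actually decreased
    return a

print(lm((3.0, 1.0)))
```

The quadratic a2-term is solved essentially in one damped Newton step, while the quartic a1-term converges linearly; the trust-ratio test keeps ǫ small throughout because the quadratic model predicts the decrease well.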

grid on; set(gca,'Box','Off');
AxisSet(8);
print -depsc NewtonsErrorLinear;


SLIDE 21

Example 8: Levenberg-Marquardt Conjugate Gradient

[Plot: search trajectory near the minimum, X vs. Y]


Levenberg-Marquardt Comments

  • Similar to Newton's method
  • Has safety provisions for regions where the quadratic approximation is inappropriate
  • Compare

    Newton's: ak+1 = ak − H(ak)−1 ∇f(ak)
    LM:       [ǫkI + H(ak)] (ak+1 − ak) = −∇f(ak)

  • If ǫ = 0, these are equivalent
  • If ǫ → ∞, ak+1 → ak
  • ǫ is chosen to ensure that the smallest eigenvalue of ǫkI + H(ak) is positive and sufficiently large (≥ δ)
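The two limits can be checked numerically on a small example. The 2×2 positive-definite H and gradient g below are arbitrary illustration values: with ǫ = 0 the step solves the Newton system exactly, and as ǫ grows the step shrinks toward zero while its direction approaches −∇f.

```python
# Limiting behavior of the LM step d solving (eps*I + H) d = -g.
import math

H = ((4.0, 1.0), (1.0, 3.0))     # symmetric positive-definite "Hessian"
g = (2.0, -1.0)                  # "gradient" at the current point

def lm_step(eps):
    """Solve the 2x2 system (eps*I + H) d = -g by Cramer's rule."""
    a, b = H[0][0] + eps, H[0][1]
    c, d = H[1][0], H[1][1] + eps
    det = a * d - b * c
    return ((-d * g[0] + b * g[1]) / det, (c * g[0] - a * g[1]) / det)

newton = lm_step(0.0)            # eps = 0: the pure Newton step -H^{-1} g
big = lm_step(1e6)               # huge eps: a tiny, gradient-like step
norm_big = math.hypot(big[0], big[1])
print(newton)                    # (-7/11, 6/11) for these H and g
print(norm_big)                  # step length shrinks toward 0 as eps grows
print(big[0] / norm_big, big[1] / norm_big)   # direction approaches -g/|g|
```

This is the sense in which LM interpolates between a Newton step (fast near a minimum) and a short steepest-descent step (safe far from one).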


Example 8: Levenberg-Marquardt Conjugate Gradient

[Plot: Function Value vs. Iteration]


Example 8: Levenberg-Marquardt Conjugate Gradient

[Plot: search trajectory over the full domain, X vs. Y]


SLIDE 22

    y = a(cnt,2);
    zo = zn;                             % Old function value
    zn = OptFn(x,y);
    xd = (a(cnt,:)'-ap);
    qo = zo;
    qn = zn + g'*xd + 0.5*xd'*H*xd;
    if qo==qn,                           % Test for convergence
        x = a(cnt,1); y = a(cnt,2);
        a(cnt:ns,:) = ones(ns-cnt+1,1)*[x y];
        f(cnt:ns,:) = OptFn(x,y);
        break;
    end;
    r = (zo-zn)/(qo-qn);
    if r<0.25,
        eta = eta * 4;
    elseif r>0.50,                       % 0.75 is recommended, but much slower
        eta = eta / 2;
    end;
    if zn>zo,                            % Back up
        a(cnt,:) = a(cnt-1,:);
    else
        ap = a(cnt,:)';
    end;
    x = a(cnt,1);


Example 8: Levenberg-Marquardt Conjugate Gradient

[Plot: Euclidean Position Error vs. Iteration]


    y = a(cnt,2);
    a(cnt,:) = [x y];
    f(cnt) = OptFn(x,y);
    %disp([cnt a(cnt,:) f(cnt) r eta])
end;
[x,y] = meshgrid(0+(-0.01:0.001:0.01),3+(-0.01:0.001:0.01));
[z,dzx,dzy] = OptFn(x,y);
[zopt,id1] = min(z); [zopt,id2] = min(zopt); id1 = id1(id2);
xopt = x(id1,id2); yopt = y(id1,id2);
[x,y] = meshgrid(1.883+(-0.02:0.001:0.02),-2.963+(-0.02:0.001:0.02));
[z,dzx,dzy] = OptFn(x,y);
[zopt2,id1] = min(z); [zopt2,id2] = min(zopt2); id1 = id1(id2);
xopt2 = x(id1,id2); yopt2 = y(id1,id2);
figure;
FigureSet(1,4.5,2.75);
[x,y] = meshgrid(-5:0.1:5,-5:0.1:5);
z = OptFn(x,y);
contour(x,y,z,50);
h = get(gca,'Children'); set(h,'LineWidth',0.2);
axis('square');


Example 8: Relevant MATLAB Code

function [] = LevenbergMarquardt();
%clear all; close all;
ns = 26;
x = 3;                               % Starting x
y = 1;                               % Starting y
eta = 0.0001;
a = zeros(ns,2); f = zeros(ns,1);
[zn,g,H] = OptFn(x,y);
a(1,:) = [x y]; f(1) = zn;
ap = [x y]';                         % Previous point
for cnt = 2:ns,
    [zn,g,H] = OptFn(x,y);
    while min(eig(eta*eye(2)+H))<0,
        eta = eta * 4;
    end;
    a(cnt,:) = (ap - inv(eta*eye(2)+H)*g)';
    x = a(cnt,1);


SLIDE 23

set(gca,'Box','Off');
AxisSet(8);
print -depsc LevenbergMarquardtErrorLinear;


hold on;
h = plot(a(:,1),a(:,2),'k',a(:,1),a(:,2),'r');
set(h(1),'LineWidth',1.2); set(h(2),'LineWidth',0.6);
h = plot(xopt,yopt,'kx',xopt,yopt,'rx');
set(h(1),'LineWidth',1.5); set(h(2),'LineWidth',0.5);
set(h(1),'MarkerSize',5); set(h(2),'MarkerSize',4);
hold off;
xlabel('X'); ylabel('Y');
zoom on;
AxisSet(8);
print -depsc LevenbergMarquardtContourA;
figure;
FigureSet(1,4.5,2.75);
[x,y] = meshgrid(1.5:0.01:2.5,-3.5:0.01:-2.5);
z = OptFn(x,y);
contour(x,y,z,75);
h = get(gca,'Children'); set(h,'LineWidth',0.2);
axis('square');
hold on;
h = plot(a(:,1),a(:,2),'k',a(:,1),a(:,2),'r');
set(h(1),'LineWidth',1.2); set(h(2),'LineWidth',0.6);
hold off;
xlabel('X'); ylabel('Y');


Levenberg-Marquardt Pros and Cons

[ǫkI + H(ak)] (ak+1 − ak) = −∇f(ak)

  • Many equivalent formulations

+ No line search required
+ Can be used with approximations to the Hessian
+ Extremely fast convergence (2nd order)
− Requires gradient and Hessian (or approximate Hessian)
− Requires O(p³) operations for each solution to the key equation


zoom on;
AxisSet(8);
print -depsc LevenbergMarquardtContourB;
figure;
FigureSet(2,4.5,2.75);
k = 1:ns;
xerr = (sum(((a-ones(ns,1)*[xopt2 yopt2])').^2)').^(1/2);
h = plot(k-1,xerr,'b');
set(h(1),'Marker','.'); set(h,'MarkerSize',6);
xlabel('Iteration'); ylabel('Euclidean Position Error');
xlim([0 ns-1]); ylim([0 xerr(1)]);
grid on; set(gca,'Box','Off');
AxisSet(8);
print -depsc LevenbergMarquardtPositionError;
figure;
FigureSet(2,4.5,2.75);
k = 1:ns;
h = plot(k-1,f,'b',[0 ns],zopt*[1 1],'r',[0 ns],zopt2*[1 1],'g');
set(h(1),'Marker','.'); set(h,'MarkerSize',6);
xlabel('Iteration'); ylabel('Function Value');
ylim([0 f(1)]); xlim([0 ns-1]);
grid on;


SLIDE 24

Optimization Algorithm Summary

Algorithm            Convergence  Stable  ∇f(a)  H(a)  LS
Cyclic Coordinate    Slow         Y       N      N     Y
Steepest Descent     Slow         Y       Y      N     Y
Conjugate Gradient   Fast         N       Y      N     Y
PARTAN               Fast         Y       Y      N     Y
Newton's Method      Very Fast    N       Y      Y     N
Levenberg-Marquardt  Very Fast    Y       Y      Y     N
