Higher Derivatives in Matlab Using MAD Shaun Forth Engineering - - PowerPoint PPT Presentation

higher derivatives in matlab using mad
SMART_READER_LITE
LIVE PREVIEW

Higher Derivatives in Matlab Using MAD Shaun Forth Engineering - - PowerPoint PPT Presentation

Higher Derivatives in Matlab Using MAD Shaun Forth Engineering Systems Department Cranfield University (DCMT Shrivenham) Shrivenham, Swindon SN6 8LA, U.K. email: S.A.Forth@cranfield.ac.uk www.amorg.co.uk/AD/staff_SAF.html 5th European Workshop


slide-1
SLIDE 1

Higher Derivatives in Matlab Using MAD

Shaun Forth

Engineering Systems Department Cranfield University (DCMT Shrivenham) Shrivenham, Swindon SN6 8LA, U.K. email: S.A.Forth@cranfield.ac.uk www.amorg.co.uk/AD/staff_SAF.html

5th European Workshop on Automatic Differentiation University of Hertfordshire, UK, May 21-22 2007

1/ 29 Higher Derivatives in Matlab Using MAD

slide-2
SLIDE 2

Plan

1

Introduction

2

MAD’s Forward Mode fmad class derivvec class

3

Higher Derivatives Second Derivatives How The Heck Does This Work? Required Changes to MAD

4

Reverse Mode AD in Matlab

5

Results Minpack Elastic-Plastic Torsion (EPT) Problem Object-Oriented Test Case

6

Conclusions & Future Work

2/ 29 Higher Derivatives in Matlab Using MAD

slide-3
SLIDE 3

Introduction

MAD’s overloaded forward mode is efficient for first derivatives [For06]. Requests for higher derivatives of order 2 to 4 for uncertainty analysis and robust optimisation. Could implement a Taylor series class ` a la ADOL-C [GJU96] - manpower expensive. OR use MAD to recursively differentiate itself (c.f. EuroAD2 - Barak Perlmutter and Jeffrey Siskind, [Gri00, p109]).

3/ 29 Higher Derivatives in Matlab Using MAD

slide-4
SLIDE 4

MAD’s Forward Mode

Based on two classes: fmad class - forward mode AD by operator overloading derivvec class - for storage and combination of multiple directional derivatives.

4/ 29 Higher Derivatives in Matlab Using MAD

slide-5
SLIDE 5

fmad Class Constructor

e.g. x=fmad([1.1 2 3],[4 5 6]);

5/ 29 Higher Derivatives in Matlab Using MAD

slide-6
SLIDE 6

fmad Class Constructor

e.g. x=fmad([1.1 2 3],[4 5 6]); Defines fmad object with

◮ value component - row vector [1.1 2 3] ◮ deriv component - single directional derivative [4 5 6] 5/ 29 Higher Derivatives in Matlab Using MAD

slide-7
SLIDE 7

fmad Class Constructor

e.g. x=fmad([1.1 2 3],[4 5 6]); Defines fmad object with

◮ value component - row vector [1.1 2 3] ◮ deriv component - single directional derivative [4 5 6]

Perform overloaded operations, e.g., element-wise multiplication via times z=x.*x value = 1.2100 4.0000 9.0000 derivatives = 8.8000 20.0000 36.0000 AD enabled by the times.m function of the fmad class.

5/ 29 Higher Derivatives in Matlab Using MAD

slide-8
SLIDE 8

fmad times function for z=x.*y

function z=times(x,y) if isa(x,’fmad’) z=x; % deep copy to avoid constructor, no refs. to x if isa(y,’fmad’) z.deriv=y.value.*z.deriv+z.value.*y.deriv; z.value=z.value.*y.value; else z.value=z.value.*y; z.deriv=y.*z.deriv; end else z=y; % deep copy to avoid constructor, no refs. to y z.value=x.*z.value; z.deriv=x.*z.deriv; end

6/ 29 Higher Derivatives in Matlab Using MAD

slide-9
SLIDE 9

Working with Multiple Directional Derivatives

What if we want the Jacobian?

7/ 29 Higher Derivatives in Matlab Using MAD

slide-10
SLIDE 10

Working with Multiple Directional Derivatives

What if we want the Jacobian? Seed derivatives with identity I3 x=fmad([1.1 2 3],eye(3));

7/ 29 Higher Derivatives in Matlab Using MAD

slide-11
SLIDE 11

Working with Multiple Directional Derivatives

What if we want the Jacobian? Seed derivatives with identity I3 x=fmad([1.1 2 3],eye(3)); Overloaded operation with same times function gives value = 1.2100 4.0000 9.0000 Derivatives Size = 1 3

  • No. of derivs = 3

derivs(:,:,1) = 2.2000 derivs(:,:,2) = 4 derivs(:,:,3) = 6

7/ 29 Higher Derivatives in Matlab Using MAD

slide-12
SLIDE 12

The fmad and derivvec classes

The fmad constructor function

function xad=fmad(x,dx) % FUNCTION: FMAD % SYNOPSIS: Class constructor for forward Matlab AD objects xad.value=x; sx=size(xad.value); sd=size(dx); if prod(sx)==prod(sd) % #values=#derivs => single directional derivative xad.deriv=reshape(dx,sx); else % #values~=#derivs => multiple directional derivatives % Pass derivatives to derivvec constructor xad.deriv=derivvec(dx,size(xad.value)); end

8/ 29 Higher Derivatives in Matlab Using MAD

slide-13
SLIDE 13

The fmad and derivvec classes

The fmad constructor function

function xad=fmad(x,dx) % FUNCTION: FMAD % SYNOPSIS: Class constructor for forward Matlab AD objects xad.value=x; sx=size(xad.value); sd=size(dx); if prod(sx)==prod(sd) % #values=#derivs => single directional derivative xad.deriv=reshape(dx,sx); else % #values~=#derivs => multiple directional derivatives % Pass derivatives to derivvec constructor xad.deriv=derivvec(dx,size(xad.value)); end

8/ 29 Higher Derivatives in Matlab Using MAD

slide-14
SLIDE 14

The derivvec class

Store derivatives as a matrix with each directional derivative unrolled into a column.

9/ 29 Higher Derivatives in Matlab Using MAD

slide-15
SLIDE 15

The derivvec class

Store derivatives as a matrix with each directional derivative unrolled into a column. e.g. derivvec(eye(3),[1 3]) derivatives conceptualised as as, direc 1 direc 2 direc 3 [1, 0, 0] [0, 1, 0] [0, 0, 1]

  • 9/ 29

Higher Derivatives in Matlab Using MAD

slide-16
SLIDE 16

The derivvec class

Store derivatives as a matrix with each directional derivative unrolled into a column. e.g. derivvec(eye(3),[1 3]) derivatives conceptualised as as, direc 1 direc 2 direc 3 [1, 0, 0] [0, 1, 0] [0, 0, 1]

  • But stored as,

    direc 1 direc 2 direc 3   1     1     1       =   1 1 1  

9/ 29 Higher Derivatives in Matlab Using MAD

slide-17
SLIDE 17

The times operation of the derivvec class

e.g. Need to calculate,

  • 1.1

2 3

  • . ∗

    direc 1 direc 2 direc 3   1     1     1       with multiplication of each of the 3 directional derivatives. Convert value to column matrix and replicate 3 times   1.1 1.1 1.1 2 2 2 3 3 3   . ∗   1 1 1   =   1.1 2 3   columns give required directional derivatives.

10/ 29 Higher Derivatives in Matlab Using MAD

slide-18
SLIDE 18

Accessor Functions

Getting the value getvalue(z) ans = 1.2100 4.0000 9.0000 Getting external representation of derivatives getderivs(z) ans(:,:,1) = 2.2000 ans(:,:,2) = 4 ans(:,:,3) = 6 Getting unrolled internal representation getinternalderivs(z) ans = 2.2000 4 6

11/ 29 Higher Derivatives in Matlab Using MAD

slide-19
SLIDE 19

Higher Derivatives

Basic idea is to be able to use MAD recursively. For first derivatives: x=fmad([1.1 2 3],eye(3)); z=x.*x zvalue=getvalue(z) zvalue = 1.2100 4.0000 9.0000 zderivs=getinternalderivs(z) zderivs = 2.2000 4.0000 6.0000 Can we just differentiate the above process a second time with MAD?

12/ 29 Higher Derivatives in Matlab Using MAD

slide-20
SLIDE 20

Second Derivatives

xx=fmad([1.1 2 3],eye(3)); % xx’s value and derivs. D1=I_3 x=fmad(xx,eye(3)); % x’s value =xx, x’s derivs. D2=I_3 z=x.*x; zvalue=getvalue(getvalue(z)) % z’s value’s value zvalue = 1.2100 4.0000 9.0000 zderivs=getinternalderivs(getvalue(z)) % z’s value’s derivs % in D1 direcs. zderivs = 2.2000 4.0000 6.0000 zderivs=getvalue(getinternalderivs(z)) % value of z’s derivs % in D2 direc. zderivs = 2.2000 4.0000 6.0000

13/ 29 Higher Derivatives in Matlab Using MAD

slide-21
SLIDE 21

Second Derivatives (ctd)

z2ndderivs=reshape(... getinternalderivs(getinternalderivs(z)),[3 3 3]) % derivs. in D1 direc. of z’s derivs. in D2 direc. z2ndderivs(:,:,1) = 2 z2ndderivs(:,:,2) = 0 2 z2ndderivs(:,:,3) = 0 2

14/ 29 Higher Derivatives in Matlab Using MAD

slide-22
SLIDE 22

Why The Heck Does This Work?

xx=fmad([1.1 2 3],eye(3)); creates object, xx [1.1 2 3] eye(3) value deriv x=fmad(xx,eye(3)); creates object, x eye(3) xx [1.1 2 3] eye(3) value deriv value deriv

15/ 29 Higher Derivatives in Matlab Using MAD

slide-23
SLIDE 23

How The Heck (ctd.)

z=x.*x; forms object, z 2.*x.value.*x.deriv x.value.*x.value value deriv

16/ 29 Higher Derivatives in Matlab Using MAD

slide-24
SLIDE 24

How The Heck (ctd.)

z=x.*x; forms object, z 2.*x.value.*x.deriv value deriv xx [1.1 2 3] eye(3) value deriv .* xx [1.1 2 3] eye(3) value deriv

16/ 29 Higher Derivatives in Matlab Using MAD

slide-25
SLIDE 25

How The Heck (ctd.)

z=x.*x; forms object, z 2.*x.value.*x.deriv value deriv [1.21, 4.0, 9.0]   2.2 0 0 4 0 0 6   value deriv

16/ 29 Higher Derivatives in Matlab Using MAD

slide-26
SLIDE 26

How The Heck (ctd.)

z=x.*x; forms object, z value deriv [1.21, 4.0, 9.0]   2.2 0 0 4 0 0 6   value deriv xx [1.1 2 3] eye(3) value deriv .* 2 eye(3)

16/ 29 Higher Derivatives in Matlab Using MAD

slide-27
SLIDE 27

How The Heck (ctd.)

z=x.*x; forms object, z value deriv [1.21, 4.0, 9.0]   2.2 0 0 4 0 0 6   value deriv xx [1.1 2 3] eye(3) value deriv .* 2 eye(3) xx   1.1 1.1 1.1 2 2 2 3 3 3   [I3, I3, I3] value deriv .* 2 I3

16/ 29 Higher Derivatives in Matlab Using MAD

slide-28
SLIDE 28

How The Heck (ctd.)

z=x.*x; forms object, z value deriv [1.21, 4.0, 9.0]   2.2 0 0 4 0 0 6   value deriv xx [1.1 2 3] eye(3) value deriv .* 2 eye(3)   2.2 0 0 4 0 0 6   [I3, I3, I3] .* 2   e1 e2 e3   value deriv

16/ 29 Higher Derivatives in Matlab Using MAD

slide-29
SLIDE 29

How The Heck (ctd.)

z=x.*x; forms object, z value deriv [1.21, 4.0, 9.0]   2.2 0 0 4 0 0 6   value deriv xx [1.1 2 3] eye(3) value deriv .* 2 eye(3)   2.2 0 0 4 0 0 6     I3 I3 I3   .* 2   e1 e1 e1 e2 e2 e2 e3 e3 e3   value deriv

16/ 29 Higher Derivatives in Matlab Using MAD

slide-30
SLIDE 30

How The Heck (ctd.)

z=x.*x; forms object, z value [1.21, 4.0, 9.0]   2.2 0 0 4 0 0 6   value deriv xx [1.1 2 3] eye(3) value deriv .* 2 eye(3)   2.2 0 0 4 0 0 6     e1 0 0 e2 0 0 e3   value deriv deriv

16/ 29 Higher Derivatives in Matlab Using MAD

slide-31
SLIDE 31

Required Changes to MAD

Few changes needed Made derivvec class superiorto(’fmad’). Subscript Referencing - make via direct calls to fmad’s subsref function. For efficiency - eliminate unnecessary temporary fmad objects created.

17/ 29 Higher Derivatives in Matlab Using MAD

slide-32
SLIDE 32

Made derivvec class superiorto(fmad)

derivvec Constructor Function

function dv=derivvec(x,shape,nd) . .

  • therwise

error(’derivvec must have 1 or 2 inputs’) end superiorto(’fmad’) Required for binary operations between fmad and derivvec objects. e.g. xfmad.*yderivvec Ensures object precedence is such that derivvec operation called and not the fmad operation. Enables differentiation through the derivvec class.

18/ 29 Higher Derivatives in Matlab Using MAD

slide-33
SLIDE 33

Subscript Referencing

Cannot make subscript references such as x(1:5,2) within fmad class if x is fmad class. Have to replace overloaded subscripting with direct calls to fmad’s subsref function.

In fmad’s prod.m Product Function

localderivs=pval(ones(s(1),1),:)./Aval; replaced with S.type=’()’; S.subs=ones(s(1),1),’:’; localderivs=subsref(pval,S)./Aval;

19/ 29 Higher Derivatives in Matlab Using MAD

slide-34
SLIDE 34

Eliminate Unnecessary Temporary fmad Objects

A.value of fmad class:

Original Code in prod.m (fmad objects highlighted)

Aval=A.value; s=size(Aval); S.subs={ones(s(1),1),’:’}; localderivs=subsref(pval,S)./Aval;

Modified Code in prod.m (fmad objects highlighted)

Aval=A.value; s=size(deactivate(Aval)); S.subs={ones(s(1),1),’:’}; localderivs=subsref(pval,S)./Aval; Reduces number of fmad operations performed.

20/ 29 Higher Derivatives in Matlab Using MAD

slide-35
SLIDE 35

Reverse Mode AD in Matlab

Uses a madtape class to build up 2 tapes:

1

  • perations tape of all arithmetic operations, including subscripting.

2

value tape of sizes of values involved in calculation, values of those involved nonlinearly.

Functions to set adjoints and propagate back through the computation. Ensure madtape class superior to fmad to enable forward-over-reverse.

21/ 29 Higher Derivatives in Matlab Using MAD

slide-36
SLIDE 36

Use of Reverse Mode - Minpack EPT Gradient

x=madtape(x0); % initialise madtape object f=MinpackEPT_F(x,Prob); % Evaluate the function % and create tapes fmadtape=getvalue(f) % Extract the objective function value setadjoint(f,1); % initialise adjoint of dependent var. MADCalcAdjoints % propagate adjoints back % through the operations tape gmadtape=getadjoint(x); % extract adjoint of independent

22/ 29 Higher Derivatives in Matlab Using MAD

slide-37
SLIDE 37

Results

Very preliminary Minpack EPT case. Objected oriented application

23/ 29 Higher Derivatives in Matlab Using MAD

slide-38
SLIDE 38

Minpack EPT Problem

Objective function with sparse Hessian

24/ 29 Higher Derivatives in Matlab Using MAD

slide-39
SLIDE 39

Minpack EPT Problem

Ratio of Hessian time to function time. n = nx ∗ nx 4 16 64 256 hand-coded 12. 13. 12. 14. fmad(sparse) on fmad(sparse) 662. 664. 1154. 18553. fmad(sparse) on reverse 1160. 1147. 1235. 2897. Modify Source Code hand-coded 13. 14. 14. 15. fmad(sparse) on fmad(sparse) 569. 559. 757. 4778. fmad(sparse) on reverse 1114. 1094. 1194. 2727. n2 16 256 4096 65536 All times averaged over a minimum of 5s.

25/ 29 Higher Derivatives in Matlab Using MAD

slide-40
SLIDE 40

Source Code Modification

[f,fgrad,fhess]=MinpackEPT_FGH(x,Prob) nx=Prob.user.nx; % nx is class double ny=length(x)./nx; % x active => ny active with zero derivs. if ny~=Prob.user.ny error(’Must have length(x)==nx*ny’) end hx = 1/(nx+1); % hx is inactive hy = 1/(ny+1); % hy is active with zero derivs. v = zeros(nx+2,ny+2); % ny active => v active v(2:nx+1,2:ny+1) = reshape(x,nx,ny); % v active allows subscripted adssignment % computer dvdx and dvdy on each edge of the grid dvdx = (v(2:nx+2,:)-v(1:nx+1,:))/hx; dvdy = (v(:,2:ny+2)-v(:,1:ny+1))/hy % unnecessary differentiated division op.

26/ 29 Higher Derivatives in Matlab Using MAD

slide-41
SLIDE 41

Source Code Modification

[f,fgrad,fhess]=MinpackEPT_FGH(x,Prob) nx=Prob.user.nx; % nx is class double ny=length(x)./nx; % x active => ny active with zero derivs. if ny~=Prob.user.ny error(’Must have length(x)==nx*ny’) end hx = 1/(nx+1); % hx is inactive hy = 1/(ny+1); % hy is active with zero derivs. v = zeros(nx+2,ny+2); % ny active => v active v(2:nx+1,2:ny+1) = reshape(x,nx,ny); % v active allows subscripted adssignment % computer dvdx and dvdy on each edge of the grid dvdx = (v(2:nx+2,:)-v(1:nx+1,:))/hx; dvdy = (v(:,2:ny+2)-v(:,1:ny+1))/hy % unnecessary differentiated division op.

26/ 29 Higher Derivatives in Matlab Using MAD

slide-42
SLIDE 42

Source Code Modification

[f,fgrad,fhess]=MinpackEPT_FGH(x,Prob) nx=Prob.user.nx; % nx is class double ny=length(x)./nx; % x active => ny active with zero derivs. if ny~=Prob.user.ny error(’Must have length(x)==nx*ny’) end hx = 1/(nx+1); % hx is inactive hy = 1/(ny+1); % hy is active with zero derivs. v = zeros(nx+2,ny+2); % ny active => v active v(2:nx+1,2:ny+1) = reshape(x,nx,ny); % v active allows subscripted adssignment % computer dvdx and dvdy on each edge of the grid dvdx = (v(2:nx+2,:)-v(1:nx+1,:))/hx; dvdy = (v(:,2:ny+2)-v(:,1:ny+1))/hy % unnecessary differentiated division op.

26/ 29 Higher Derivatives in Matlab Using MAD

slide-43
SLIDE 43

Source Code Modification

[f,fgrad,fhess]=MinpackEPT_FGH(x,Prob) nx=Prob.user.nx; % nx is class double ny=length(x)./nx; % x active => ny active with zero derivs. if ny~=Prob.user.ny error(’Must have length(x)==nx*ny’) end hx = 1/(nx+1); % hx is inactive hy = 1/(ny+1); % hy is active with zero derivs. v = zeros(nx+2,ny+2); % ny active => v active v(2:nx+1,2:ny+1) = reshape(x,nx,ny); % v active allows subscripted adssignment % computer dvdx and dvdy on each edge of the grid dvdx = (v(2:nx+2,:)-v(1:nx+1,:))/hx; dvdy = (v(:,2:ny+2)-v(:,1:ny+1))/deactivate(hy); % deactivate hy => cheaper division

26/ 29 Higher Derivatives in Matlab Using MAD

slide-44
SLIDE 44

Object-Oriented Test Case Preliminary Results

Test case with 4 classes from aerospace sector n = 21, m = 101.

  • max. 11 nonzeros/row of J.

Require derivatives for uncertainty propagation. Users require derivatives to order 3 - results here to order 2 - worked first time! technique cpu(JF)/cpu(F) cpu(∇2F)/cpu(F) FD 22.0 486.7 fmad 1.09

  • fmad on fmad
  • 2.41

Very little computation performed - nearly all time creating objects and some trivial computation - highlights benefits of OO!! OO destroys performance of FD.

27/ 29 Higher Derivatives in Matlab Using MAD

slide-45
SLIDE 45

Jacobian Sparsity

28/ 29 Higher Derivatives in Matlab Using MAD

slide-46
SLIDE 46

Conclusions & Future Work

MAD can differentiate object-oriented code - including itself! Further performance optimisation of fmad class needed:

◮ Inactive fmad object is needed. ◮ Store size of fmad object directly - don’t take size of it’s value.

Sparsity detection needed for first and higher derivatives - some proof

  • f concept done using sparse logical matrices - summer 2007.

madtape class needs improvements:

◮ Generate tape once and do multiple adjoints for same inputs x - need

to extend to different x with same control flow.

◮ Performance analysis and optimisation. 29/ 29 Higher Derivatives in Matlab Using MAD

slide-47
SLIDE 47

References

Shaun A. Forth. An efficient overloaded implementation of forward mode automatic differentiation in MATLAB. ACM Trans. Math. Softw., 32(2):195–222, June 2006. Andreas Griewank, David Juedes, and Jean Utke. Algorithm 755: ADOL-C: A package for the automatic differentiation of algorithms written in C/C++. ACM Transactions on Mathematical Software, 22(2):131–167, 1996. Andreas Griewank. Evaluating Derivatives: Principles and Techniques of Algorithmic Differentiation. Number 19 in Frontiers in Applied Mathematics. SIAM, Philadelphia, PA, 2000.

30/ 29 Higher Derivatives in Matlab Using MAD