Automatic Differentiation for Optimum Design, Applied to Sonic Boom Reduction
Laurent Hascoët, Mariano Vázquez, Alain Dervieux
Laurent.Hascoet@sophia.inria.fr
Tropics Project, INRIA Sophia-Antipolis
AD Workshop, Cranfield, June 5-6, 2003
1
PLAN:
- Part 1: A gradient-based shape optimization
to reduce the Sonic Boom
- Part 2: Utilization and improvements of reverse AD
to compute the Adjoint
- Conclusion
2
PART 1:
GRADIENT-BASED SONIC BOOM REDUCTION
- The Sonic Boom optimization problem
- A mixed Adjoint/AD strategy
- Resolution of the Adjoint and Gradient
- Numerical results
3
The Sonic Boom Problem
Control points γ ⇒ Geometry Ω(γ) ⇒ Euler flow W(γ) ⇒ Pressure shock ∇p(W) ⇒ Cost function j(γ)
4
Gradient-based Optimization
j(γ) models the strength of the downward Sonic Boom
⇒ Compute the gradient j′(γ) and use it in an optimization loop!
⇒ Use reverse-mode AD to compute this gradient
5
Differentiate the iterative solver?
W(γ) is defined implicitly through the Euler equations Ψ(γ, W) = 0
⇒ The program uses an iterative solver
⇒ Brute-force reverse AD differentiates the whole iterative process
- Does it make sense?
- Is it efficient?
6
A mixed Adjoint/AD strategy
Back to the mathematical adjoint system:
Ψ(γ, W) = 0                                      (state system)
∂J/∂W(γ, W) − (∂Ψ/∂W(γ, W))^t . Π = 0            (adjoint system)
j′(γ) = ∂J/∂γ(γ, W) − (∂Ψ/∂γ(γ, W))^t . Π        (optimality condition)
Lower level ⇒ reverse AD (cf. Part 2)
Upper level ⇒ hand-coded specific solver for Π
7
Upper level
Solve (∂Ψ/∂W(γ, W))^t . Π = ∂J/∂W(γ, W) ⇒ use a matrix-free solver
Preconditioning by "defect correction": use the inverse of the 1st-order ∂Ψ/∂W to precondition the 2nd-order ∂Ψ/∂W
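As a rough illustration of the matrix-free idea, the sketch below solves a tiny transposed system A^t . x = b using only matrix-vector products, with a plain (unpreconditioned) Richardson iteration; the 3x3 matrix, the relaxation factor omega, and the iteration count are made-up toy values, not the preconditioned solver of the talk.

```python
# Matrix-free solve of A^t . x = b: the solver never forms A^t, it only
# calls a routine that returns the product A^t . x (toy data, hypothetical).

def matvec_T(x):
    # Action of A^t on x, for a small diagonally dominant A (made-up values).
    A = [[4.0, 1.0, 0.0],
         [0.0, 3.0, 1.0],
         [1.0, 0.0, 5.0]]
    return [sum(A[i][j] * x[i] for i in range(3)) for j in range(3)]

def richardson(b, omega=0.2, iters=200):
    # Plain Richardson iteration: x <- x + omega * (b - A^t . x).
    x = [0.0, 0.0, 0.0]
    for _ in range(iters):
        r = [bi - ai for bi, ai in zip(b, matvec_T(x))]
        x = [xi + omega * ri for xi, ri in zip(x, r)]
    return x
```

A real adjoint solver would replace `matvec_T` by one reverse-differentiated residual assembly and add the defect-correction preconditioner.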
8
Overall Optimization Loop
Loop:
- compute Π with a matrix-free solver
- use Π to compute j′(γ)
- 1-D search in the direction j′(γ)
- update γ (using transpiration conditions)
Performance: assembling (∂Ψ/∂W(γ, W))^t . Π takes about 7 times as long as assembling the state residual Ψ(γ, W) ⇒ solving for Π takes about 4 times as long as solving for W.
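A minimal sketch of such a loop on a toy quadratic cost standing in for j(γ); the cost, its gradient, and the backtracking 1-D search are hypothetical stand-ins (the real loop obtains the gradient from the adjoint Π and updates γ through transpiration conditions).

```python
# Gradient-based optimization loop: gradient, 1-D search, update (toy cost).

def j(g):                       # hypothetical cost standing in for j(gamma)
    return (g[0] - 1.0) ** 2 + 2.0 * (g[1] + 0.5) ** 2

def grad_j(g):                  # analytic gradient of the toy cost
    return [2.0 * (g[0] - 1.0), 4.0 * (g[1] + 0.5)]

def optimize(g, cycles=50):
    for _ in range(cycles):
        d = grad_j(g)           # stands in for the adjoint-based j'(gamma)
        if max(abs(di) for di in d) < 1e-12:
            break               # converged: gradient vanished
        step = 1.0
        while j([gi - step * di for gi, di in zip(g, d)]) >= j(g):
            step *= 0.5         # backtracking 1-D search in direction d
        g = [gi - step * di for gi, di in zip(g, d)]
    return g
```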
9
Numerical results 1
Gradient j′(γ) on the skin of the plane:
10
Numerical results 2
Improvement of the sonic boom after 8 optimization cycles:
11
PART 2:
REVERSE AD TO COMPUTE THE ADJOINT
- Some principles of Reverse AD
- The “Tape” memory problem, the “Checkpointing” tactic
- Impact of Data Dependences Analysis
- Impact of In-Out Data Flow Analysis
12
Principles of reverse AD
AD rewrites source programs to make them compute derivatives.
Consider: P : {I1; I2; ... Ip;} implementing f : R^m → R^n
Identify with: f = f_p ∘ f_{p-1} ∘ ... ∘ f_1
Name: x_0 = x and x_k = f_k(x_{k-1})
Chain rule: f′(x) = f′_p(x_{p-1}) . f′_{p-1}(x_{p-2}) . ... . f′_1(x_0)
f′(x) is generally too large and expensive ⇒ take useful views!
Tangent AD: ẏ = f′(x) . ẋ = f′_p(x_{p-1}) . f′_{p-1}(x_{p-2}) . ... . f′_1(x_0) . ẋ
Reverse AD: x̄ = f′*(x) . ȳ = f′*_1(x_0) . ... . f′*_{p-1}(x_{p-2}) . f′*_p(x_{p-1}) . ȳ
Evaluate both from right to left!
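The two views can be checked on a tiny composition f = f2 ∘ f1 on R^2: tangent mode multiplies the Jacobians left-to-right onto ẋ, reverse mode multiplies the transposed Jacobians right-to-left onto ȳ, and both satisfy the dot-product identity ⟨ȳ, f′(x).ẋ⟩ = ⟨f′*(x).ȳ, ẋ⟩. The functions f1, f2 below are arbitrary examples, not from the application.

```python
# f = f2 o f1 on R^2, with hand-written Jacobians (arbitrary example).

def f1(x):     return [x[0] * x[1], x[0] + x[1]]
def f1_jac(x): return [[x[1], x[0]], [1.0, 1.0]]
def f2(x):     return [x[0] + 2.0 * x[1], x[0] * x[0]]
def f2_jac(x): return [[1.0, 2.0], [2.0 * x[0], 0.0]]

def mv(A, v):   # matrix-vector product A . v
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def mtv(A, v):  # transposed product A^t . v
    return [sum(A[i][j] * v[i] for i in range(len(A))) for j in range(len(A[0]))]

def tangent(x, xdot):   # ydot = f2'(x1) . f1'(x0) . xdot   (right to left)
    x1 = f1(x)
    return mv(f2_jac(x1), mv(f1_jac(x), xdot))

def reverse(x, ybar):   # xbar = f1'*(x0) . f2'*(x1) . ybar (right to left)
    x1 = f1(x)
    return mtv(f1_jac(x), mtv(f2_jac(x1), ybar))
```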
13
Reverse AD is more tricky than Tangent AD
x̄ = f′*(x) . ȳ = f′*_1(x_0) . ... . f′*_{p-1}(x_{p-2}) . f′*_p(x_{p-1}) . ȳ
Over time, the backward sweep applies f′*_p(x_{p-1}), then f′*_{p-1}(x_{p-2}), ..., finally f′*_1(x_0).
Each step needs the matching intermediate state: x_{p-1} = f_{p-1}(x_{p-2}); x_{p-2} = f_{p-2}(x_{p-3}); ... x_1 = f_1(x_0); x_0.
These states are computed by the forward sweep and retrieved, in reverse order, by the backward sweep.
Memory usage (the "Tape") is the bottleneck!
14
Example:
Program fragment:
...
v2 = 2 * v1 + 5
v4 = v2 + p1 * v3 / v2
...
Corresponding transposed partial Jacobians in f′*(x) = ... : for v2 = 2*v1 + 5, the only nontrivial entry is ∂v2/∂v1 = 2; for v4 = v2 + p1*v3/v2, the nontrivial entries are ∂v4/∂v2 = 1 − p1*v3/v2^2 and ∂v4/∂v3 = p1/v2.
15
Reverse mode on the example
Forward sweep:
...
Push(v2)
v2 = 2 * v1 + 5
Push(v4)
v4 = v2 + p1 * v3 / v2
...
Backward sweep:
...
Pop(v4)
v̄2 = v̄2 + v̄4 * (1 − p1 * v3 / v2^2)
v̄3 = v̄3 + v̄4 * p1 / v2
v̄4 = 0
Pop(v2)
v̄1 = v̄1 + 2 * v̄2
v̄2 = 0
...
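The Push/Pop mechanics above can be sketched in a few lines: the forward sweep pushes each value about to be overwritten, the backward sweep pops them back (restoring the state in reverse) while accumulating the adjoints. This is a hand-written illustration of the scheme, not generated Tapenade output.

```python
# Tape-based reverse sweep for: v2 = 2*v1 + 5; v4 = v2 + p1*v3/v2

def fragment_reverse(v1, v2, v3, v4, p1, v4bar):
    tape = []
    # -- forward sweep: push each value about to be overwritten --
    tape.append(v2); v2 = 2.0 * v1 + 5.0
    tape.append(v4); v4 = v2 + p1 * v3 / v2
    # -- backward sweep: adjoint statements in reverse order --
    v1bar = v2bar = v3bar = 0.0
    v4 = tape.pop()                          # restore v4 (pre-statement value)
    v2bar += v4bar * (1.0 - p1 * v3 / v2 ** 2)
    v3bar += v4bar * p1 / v2
    v4bar = 0.0
    v2 = tape.pop()                          # restore v2
    v1bar += 2.0 * v2bar
    v2bar = 0.0
    return v1bar, v3bar
```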
16
Reverse mode on our application
From subroutine Psi: γ, W → Ψ(γ, W), use reverse AD to build subroutine Psibar: γ, W, Ψ̄ → (∂Ψ/∂W(γ, W))^t . Ψ̄
But the Tape grows too large on large meshes!
17
The Checkpointing mechanism
A storage/recomputation tradeoff. Tapenade applies it on the Call Graph: the Tape size reaches 4 local maxima.
18
Tape size maxima

Tape local maximum #   |     1 |     2 |     3 |     4
                       | 12.40 | 12.37 | 13.60 |  9.66
773 R*8/node: (still) too expensive in memory ⇒ Use Data Flow analysis
19
Data-Flow "to the rescue" (1)
Data Dependence Graphs of P and P̄ are isomorphic, so we can improve Independent-Iterations loops ("II-loops"):

Standard:
  do // i = 1,N
    body(i)
  end
  do i = N,1,-1
    bodȳ(i)
  end

Improved:
  do // i = 1,N
    body(i)
    bodȳ(i)
  end
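A sketch of the two reversal schemes on a toy II-loop (each iteration squares x[i] in place, so the adjoint needs the overwritten value): the standard scheme tapes all N old values before the backward loop, while the improved scheme interleaves each body with its adjoint so only one saved value is live at a time. Names and data are hypothetical.

```python
# Standard reversal: forward loop tapes N values, backward loop pops them.
def standard_reverse(x, xbar):
    tape = []
    for i in range(len(x)):            # forward sweep: tape grows to N entries
        tape.append(x[i]); x[i] = x[i] * x[i]
    for i in reversed(range(len(x))):  # backward sweep, reversed order
        old = tape.pop()
        xbar[i] = 2.0 * old * xbar[i]  # adjoint of x[i] = x[i]**2
    return x, xbar

# II-loop reversal: iterations are independent, so body and adjoint body
# can be interleaved in forward order; the "tape" is one scalar.
def ii_loop_reverse(x, xbar):
    for i in range(len(x)):
        old = x[i]                     # one live saved value per iteration
        x[i] = x[i] * x[i]
        xbar[i] = 2.0 * old * xbar[i]
    return x, xbar
```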
20
Tape local maximum #       |     1 |     2 |     3 |     4
No modification:           | 12.40 | 12.37 | 13.60 |  9.66
II-loops improvement:      | 12.38 |  7.98 |  4.10 |  0.02

Improvement on maximum (4), but hidden by maximum (1)!
21
Data-Flow "to the rescue" (2)
Analyze the size of Snapshots:
Snapshot = IN(checkpoint) ∩ OUT(checkpoint and after)
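The rule can be illustrated with explicit sets (the variable names are hypothetical): only variables that are both needed by the checkpointed piece (IN) and possibly overwritten by it or by downstream code (OUT) must be saved in the snapshot.

```python
# Snapshot = IN(checkpoint) & OUT(checkpoint and after), on made-up sets.
in_ckp    = {"W", "gamma", "tmp"}   # variables read inside the checkpoint
out_after = {"tmp", "res"}          # variables written by checkpoint or after

snapshot = in_ckp & out_after       # only "tmp" must actually be saved
```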
22
Tape local maximum #       |     1 |     2 |     3 |     4
No modification:           | 12.40 | 12.37 | 13.60 |  9.66
Only snapshot reduction:   |  1.02 |  0.85 |  9.70 |  9.33
Only II-loops improvement: | 12.38 |  7.98 |  4.10 |  0.02
Both improvements:         |  1.02 |  0.61 |  0.22 |  0.02

58 R*8/node: quite acceptable!
23
CONCLUSION:
- Part 1: Brute-force reverse AD, including on the iterative solver, is
a hazardous strategy ⇒ define a manual strategy before applying AD.
- Part 1: ... but the matrix-free solver proves a delicate step.
- Part 2: Reverse AD can use reasonable memory space, thanks to
data-flow analyses.
- Advertisement! Tapenade: an AD tool on the web
http://tapenade.inria.fr:8080/tapenade
(alternatively, FTP from our web site)