SLIDE 1

The TAPENADE AD Tool

Laurent Hascoët, Valérie Pascual, Rose-Marie Greborio
Laurent.Hascoet@sophia.inria.fr
Tropics Project, INRIA Sophia-Antipolis
AD Workshop, Cranfield, June 5-6, 2003

SLIDE 2

PLAN:

  • AD: principles of Tangent and Reverse
  • Tapenade: technology from Compilation and Parallelization
  • Call Graphs, Flow Graphs, Symbol Tables
  • Static Analyses on Flow Graphs
  • Dependency Analysis
  • Tapenade: an AD tool on the web
  • Further Developments
SLIDE 6

AD: Principles of Tangent and Reverse

AD rewrites source programs to make them compute derivatives.

Consider: $P : \{I_1; I_2; \dots I_p;\}$ implementing $f : \mathbb{R}^m \to \mathbb{R}^n$
Identify with: $f = f_p \circ f_{p-1} \circ \dots \circ f_1$
Name: $x_0 = x$ and $x_k = f_k(x_{k-1})$
Chain rule: $f'(x) = f'_p(x_{p-1}) \cdot f'_{p-1}(x_{p-2}) \cdots f'_1(x_0)$

$f'(x)$ is generally too large and expensive ⇒ take useful views!

Tangent AD: $\dot{y} = f'(x) \cdot \dot{x} = f'_p(x_{p-1}) \cdot f'_{p-1}(x_{p-2}) \cdots f'_1(x_0) \cdot \dot{x}$
Reverse AD: $\bar{x} = f'^{*}(x) \cdot \bar{y} = f'^{*}_1(x_0) \cdots f'^{*}_{p-1}(x_{p-2}) \cdot f'^{*}_p(x_{p-1}) \cdot \bar{y}$

Evaluate both from right to left!
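
The right-to-left evaluation can be illustrated with a minimal Python sketch. It assumes the elementary Jacobians are already available as small dense matrices; the helper names (`matvec`, `transpose`, `dot`, `tangent`, `reverse`) and the numbers in `J1`, `J2` are made up for illustration and are not part of Tapenade. Only matrix-vector products are performed, and the full Jacobian is never formed.

```python
def matvec(m, v):
    """Product of a dense matrix (list of rows) with a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def transpose(m):
    return [list(col) for col in zip(*m)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def tangent(jacobians, x_dot):
    """y_dot = f'_p ... f'_1 x_dot: the rightmost factor f'_1 is applied
    first, which is the original execution order of the program."""
    for jac in jacobians:
        x_dot = matvec(jac, x_dot)
    return x_dot

def reverse(jacobians, y_bar):
    """x_bar = f'*_1 ... f'*_p y_bar: the rightmost factor f'*_p is applied
    first, which is the reverse of the original execution order."""
    for jac in reversed(jacobians):
        y_bar = matvec(transpose(jac), y_bar)
    return y_bar

# Hypothetical elementary Jacobians: f'_1 is 3x2, f'_2 is 2x3.
J1 = [[1.0, 2.0], [0.0, 1.0], [3.0, 0.0]]
J2 = [[1.0, 0.0, 2.0], [0.0, 1.0, 1.0]]

x_dot = [1.0, -1.0]
y_bar = [2.0, 0.5]
y_dot = tangent([J1, J2], x_dot)
x_bar = reverse([J1, J2], y_bar)
```

Both views describe the same Jacobian, so the two inner products `dot(y_bar, y_dot)` and `dot(x_bar, x_dot)` agree.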

SLIDE 8

AD: Example

    ...
    v2 = 2 * v1 + 5
    v4 = v2 + p1 * v3 / v2
    ...

The corresponding (fragment of) Jacobian, with variables ordered $(v_1, v_2, v_3, v_4)$, is:

$$f'(x) = \dots \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 - \frac{p_1 v_3}{v_2^2} & \frac{p_1}{v_2} & 0 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 & 0 \\ 2 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \dots$$

SLIDE 10

Tangent AD keeps the structure of P:

$\dot{y} = f'(x) \cdot \dot{x} = f'_p(x_{p-1}) \cdot f'_{p-1}(x_{p-2}) \cdots f'_1(x_0) \cdot \dot{x}$

Original instructions:

    ...
    v2 = 2 * v1 + 5
    v4 = v2 + p1 * v3 / v2
    ...

Derivative instructions:

$\dot{v}_2 = 2 \cdot \dot{v}_1$
$\dot{v}_4 = \dot{v}_2 \cdot (1 - p_1 v_3 / v_2^2) + \dot{v}_3 \cdot p_1 / v_2$

Tangent AD just inserts the products $\dot{x}_k = f'_k(x_{k-1}) \cdot \dot{x}_{k-1}$ for k = 1 to p.
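
As a minimal sketch, the two-statement fragment and its tangent version can be written in Python as follows. The function names and the `d` suffix for derivative variables are choices made for this illustration, and `p1` is treated as a constant parameter. Each derivative instruction is placed just before the instruction it differentiates, so that it reads the values the original instruction reads.

```python
def fragment(v1, v3, p1):
    """Original program fragment."""
    v2 = 2 * v1 + 5
    v4 = v2 + p1 * v3 / v2
    return v4

def fragment_d(v1, v1d, v3, v3d, p1):
    """Tangent version: same control structure, one derivative
    instruction inserted before each original instruction."""
    v2d = 2 * v1d
    v2 = 2 * v1 + 5
    v4d = v2d * (1 - p1 * v3 / v2**2) + v3d * p1 / v2
    v4 = v2 + p1 * v3 / v2
    return v4, v4d

# Directional derivative along v1 (v1d = 1, v3d = 0).
v4, v4d = fragment_d(1.0, 1.0, 2.0, 0.0, 3.0)

# Central finite difference along v1, for comparison only.
eps = 1e-6
fd = (fragment(1.0 + eps, 2.0, 3.0) - fragment(1.0 - eps, 2.0, 3.0)) / (2 * eps)
```

The value `v4d` should agree with the finite-difference estimate `fd` up to truncation and rounding error.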

SLIDE 19

AD: Reverse is more tricky than Tangent

$\bar{x} = f'^{*}(x) \cdot \bar{y} = f'^{*}_1(x_0) \cdots f'^{*}_{p-1}(x_{p-2}) \cdot f'^{*}_p(x_{p-1}) \cdot \bar{y}$

[Figure: along the time axis, a forward sweep computes $x_0$; $x_1 = f_1(x_0)$; ...; $x_{p-2} = f_{p-2}(x_{p-3})$; $x_{p-1} = f_{p-1}(x_{p-2})$. A backward sweep then starts from $\bar{y}$ and applies $f'^{*}_p(x_{p-1})$, then $f'^{*}_{p-1}(x_{p-2})$, ..., and finally $\bar{x} = f'^{*}_1(x_0)$. Arrows labelled "retrieve" link each $x_k$ of the forward sweep to the backward step that uses it.]

Memory usage (“Tape”) is the bottleneck!
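
The two sweeps can be sketched in Python for a chain of scalar steps. This is an illustration under simplifying assumptions, not Tapenade's generated code: each step is a hypothetical pair (function, derivative), and the tape is a plain list that stores every intermediate value, which is exactly why its size grows with the length of the computation.

```python
import math

# Hypothetical chain of elementary steps, each given as (f_k, f_k').
steps = [
    (lambda x: 2.0 * x + 5.0, lambda x: 2.0),
    (math.sin, math.cos),
    (lambda x: x * x, lambda x: 2.0 * x),
]

def reverse_sweeps(steps, x0, y_bar):
    """Return (y, x_bar) using a forward sweep that fills the tape
    and a backward sweep that empties it."""
    tape = []
    x = x0
    # Forward sweep: run the original steps, storing each x_{k-1}.
    for f, _ in steps:
        tape.append(x)
        x = f(x)
    y = x
    # Backward sweep: retrieve x_{k-1} in reverse order and apply f'_k*.
    x_bar = y_bar
    for _, df in reversed(steps):
        x_prev = tape.pop()
        x_bar = df(x_prev) * x_bar
    return y, x_bar

y, x_bar = reverse_sweeps(steps, 0.5, 1.0)
```

For this chain, y = sin(2x + 5)^2, so `x_bar` should match the analytic derivative 2 sin(4x + 10) at x = 0.5.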

SLIDE 21

AD: Continued Example

Program fragment:

    ...
    v2 = 2 * v1 + 5
    v4 = v2 + p1 * v3 / v2
    ...

Corresponding transposed partial Jacobians, with variables ordered $(v_1, v_2, v_3, v_4)$:

$$f'^{*}(x) = \dots \begin{pmatrix} 1 & 2 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 1 - \frac{p_1 v_3}{v_2^2} \\ 0 & 0 & 1 & \frac{p_1}{v_2} \\ 0 & 0 & 0 & 0 \end{pmatrix} \dots$$

SLIDE 24

AD: Reverse mode on the example

Forward sweep:

    ...
    Push(v2)
    v2 = 2 * v1 + 5
    Push(v4)
    v4 = v2 + p1 * v3 / v2
    ...

Backward sweep:

    ...
    Pop(v4)
$\bar{v}_2 = \bar{v}_2 + \bar{v}_4 \cdot (1 - p_1 v_3 / v_2^2)$
$\bar{v}_3 = \bar{v}_3 + \bar{v}_4 \cdot p_1 / v_2$
$\bar{v}_4 = 0$
    Pop(v2)
$\bar{v}_1 = \bar{v}_1 + 2 \cdot \bar{v}_2$
$\bar{v}_2 = 0$
    ...
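
A runnable Python sketch of this reverse-mode fragment follows. It assumes a plain list used as a stack for Push and Pop, uses a `b` suffix for adjoint variables as a naming choice of this illustration, and places each Push just before the overwrite it protects, which is one plausible reading of the slide rather than Tapenade's exact output.

```python
def fragment_b(v1, v2, v3, v4, p1, v1b, v2b, v3b, v4b):
    """Reverse-mode version of the two-statement fragment.
    Returns the updated adjoints (v1b, v2b, v3b, v4b)."""
    stack = []

    # Forward sweep: save each value just before it is overwritten.
    stack.append(v2)            # Push(v2)
    v2 = 2 * v1 + 5
    stack.append(v4)            # Push(v4)
    v4 = v2 + p1 * v3 / v2

    # Backward sweep: restore values, then run adjoints in reverse order.
    v4 = stack.pop()            # Pop(v4)
    v2b = v2b + v4b * (1 - p1 * v3 / v2**2)
    v3b = v3b + v4b * p1 / v2
    v4b = 0.0
    v2 = stack.pop()            # Pop(v2)
    v1b = v1b + 2 * v2b
    v2b = 0.0
    return v1b, v2b, v3b, v4b

# Seed the adjoint of v4 with 1 to obtain the gradient of v4.
v1b, v2b, v3b, v4b = fragment_b(1.0, 0.0, 2.0, 0.0, 3.0, 0.0, 0.0, 0.0, 1.0)
```

One call yields the derivatives of v4 with respect to both v1 and v3, whereas the tangent mode needs one run per input direction.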

SLIDE 26

AD: The Checkpointing tactic

A Storage/Recomputation tradeoff.

Tapenade does it on the Call Graph.
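
The tradeoff can be sketched in Python on a chain of identical scalar steps. This is a simplified illustration, not Tapenade's scheme, which works on the call graph: here the function name, the `every` parameter, and the choice of equally spaced checkpoints are all assumptions. Only the checkpointed states are kept during the forward sweep; each segment is recomputed and taped just before its own backward sweep.

```python
def reverse_with_checkpoints(step, dstep, x0, n_steps, y_bar, every):
    """Reverse-differentiate n_steps applications of `step`,
    storing a state only every `every` steps."""
    # Forward sweep: store checkpoints only, not the full tape.
    checkpoints = {}
    x = x0
    for k in range(n_steps):
        if k % every == 0:
            checkpoints[k] = x
        x = step(x)
    y = x

    # Backward sweep, one segment at a time, last segment first.
    x_bar = y_bar
    for start in sorted(checkpoints, reverse=True):
        end = min(start + every, n_steps)
        # Recompute the segment from its checkpoint, taping its states.
        tape = []
        x = checkpoints[start]
        for _ in range(start, end):
            tape.append(x)
            x = step(x)
        # Reverse the segment, consuming its short tape.
        for _ in range(start, end):
            x_bar = dstep(tape.pop()) * x_bar
    return y, x_bar

# Three squarings: y = x^8, so dy/dx = 8 x^7.
y, x_bar = reverse_with_checkpoints(
    lambda x: x * x, lambda x: 2.0 * x, 1.1, 3, 1.0, 2)
```

The tape never holds more than `every` values at once, at the price of running each step a second time.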

SLIDE 27

Tapenade: Internal Representation

Take advantage of well-known techniques from Compilation and Parallelization:

  • Use a general abstract Imperative Language (IL)
  • Represent programs as Call Graphs of Flow Graphs
  • Store symbol declarations in nested Symbol Tables
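
To make the three ingredients concrete, here is a minimal Python sketch of such a representation. All class and field names (`SymbolTable`, `BasicBlock`, `Unit`, `callees`) are hypothetical and chosen for this illustration; they do not describe Tapenade's actual internal classes.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class SymbolTable:
    """Nested symbol table: lookups fall back to the enclosing scope."""
    parent: Optional["SymbolTable"] = None
    symbols: dict = field(default_factory=dict)

    def declare(self, name, type_name):
        self.symbols[name] = type_name

    def lookup(self, name):
        table = self
        while table is not None:
            if name in table.symbols:
                return table.symbols[name]
            table = table.parent
        raise KeyError(name)

@dataclass
class BasicBlock:
    """Flow graph node: straight-line instructions plus control-flow arrows."""
    instructions: list
    successors: list = field(default_factory=list)

@dataclass
class Unit:
    """Call graph node: one procedure with its flow graph and symbol table."""
    name: str
    symbols: SymbolTable
    entry: BasicBlock
    callees: list = field(default_factory=list)

# A tiny two-procedure program.
global_scope = SymbolTable()
global_scope.declare("ndt", "integer")
sub_scope = SymbolTable(parent=global_scope)
sub_scope.declare("a", "real")

sub = Unit("sub", sub_scope, BasicBlock(["a = 2.0 * a + 10.0"]))
main = Unit("main", global_scope, BasicBlock(["call sub"]), callees=[sub])
```

A lookup of `ndt` from `sub_scope` succeeds through the parent link, which is how nested tables avoid duplicating declarations.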
SLIDE 29

Application: Inversion of the Flow Graph

SLIDE 30

Application: Loop Inversion

SLIDE 31

Tapenade: Global Static Analyses on Flow Graphs

  • classical IN-OUT analysis.
  • forward dependence with respect to independent inputs.
  • backward influence on dependent outputs.
  • specific TBR analysis for the reverse mode.
  • ... pointer analysis ...

Usual restrictions: conservative assumptions, arrays ...
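
The forward and backward analyses can be combined to find which instructions need a derivative at all. The Python sketch below is a simplified illustration on straight-line code only, with no control flow, arrays, or pointers; the statement encoding and the function name `active_statements` are assumptions of this example, not Tapenade's implementation.

```python
# Each statement is (written variable, set of variables read).
program = [
    ("v2", {"v1"}),
    ("v4", {"v2", "v3", "p1"}),
    ("w", {"v4"}),      # depends on the inputs, but never reaches the output
    ("t", {"p1"}),      # does not depend on the independent inputs
]

def active_statements(program, independents, dependents):
    """Flag statements whose result both depends on an independent
    input and influences a dependent output."""
    # Forward dependence: variables varied before each statement.
    varied = set(independents)
    varied_in = []
    for lhs, reads in program:
        varied_in.append(set(varied))
        if reads & varied:
            varied.add(lhs)
        else:
            varied.discard(lhs)

    # Backward influence: variables useful after each statement.
    useful = set(dependents)
    useful_out = [None] * len(program)
    for k in reversed(range(len(program))):
        lhs, reads = program[k]
        useful_out[k] = set(useful)
        if lhs in useful:
            useful = (useful - {lhs}) | reads

    return [bool(reads & varied_in[k]) and lhs in useful_out[k]
            for k, (lhs, reads) in enumerate(program)]

flags = active_statements(program, {"v1", "v3"}, {"v4"})
```

Only the first two statements are flagged; the last two need no derivative instruction.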

SLIDE 32

Application: reduced snapshots

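Snapshot = IN(checkpoint) ∩ OUT(checkpoint and after)

A variable needs to be saved only if the checkpointed piece reads it and it is overwritten by the checkpoint or by the code that follows.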

SLIDE 33

Tapenade: Using Data Dependencies

flow: write x → read x
anti: read x → write x
output: write x → write x

Data Dependencies form

  • a partial order between run-time instructions.
  • a graph between textual instructions.

Any shuffle of instructions that respects Data Dependencies is valid!
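
A minimal Python sketch of this rule for straight-line code follows. Each statement is described by hypothetical (writes, reads) sets, and the function names are choices of this illustration. The check is conservative: it requires every dependent pair of statements to keep its original relative order.

```python
def dependence_kinds(first, second):
    """Kinds of data dependence from `first` to a later `second`.
    Each statement is a pair (set of variables written, set read)."""
    writes1, reads1 = first
    writes2, reads2 = second
    kinds = set()
    if writes1 & reads2:
        kinds.add("flow")     # write x -> read x
    if reads1 & writes2:
        kinds.add("anti")     # read x -> write x
    if writes1 & writes2:
        kinds.add("output")   # write x -> write x
    return kinds

def is_valid_shuffle(statements, order):
    """`order` lists original statement indices in their new order."""
    position = {index: pos for pos, index in enumerate(order)}
    for i in range(len(statements)):
        for j in range(i + 1, len(statements)):
            if dependence_kinds(statements[i], statements[j]) \
                    and position[i] > position[j]:
                return False
    return True

statements = [
    ({"a"}, {"a"}),         # a = 2.0 * a + 10.0
    ({"b"}, {"c", "a"}),    # b = c + sin(a)
    ({"c"}, set()),         # c = 0.0
    ({"d"}, set()),         # d = 1.0 (extra, independent statement)
]
```

With these statements, moving `d = 1.0` to the front is a valid shuffle, while swapping `b = c + sin(a)` and `c = 0.0` violates an anti dependence on `c`.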

SLIDE 36

Application: Loop Fusion in “Vector” Mode

Original instructions:

    ...
    a = 2.0 * a + 10.0
    b = c + sin(a)
    c = 0.0
    ...

Derivative instructions, one loop per instruction:

    Do n = 1, ndt
      ȧ(n) = 2.0 * ȧ(n)
    Enddo
    Do n = 1, ndt
      ḃ(n) = ċ(n) + cos(a) * ȧ(n)
    Enddo
    Do n = 1, ndt
      ċ(n) = 0.0
    Enddo

After loop fusion:

    ...
    a = 2.0 * a + 10.0
    Do n = 1, ndt
      ȧ(n) = 2.0 * ȧ(n)
      ḃ(n) = ċ(n) + cos(a) * ȧ(n)
      ċ(n) = 0.0
    Enddo
    b = c + sin(a)
    c = 0.0
    ...
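
The equivalence of the two forms can be checked with a short Python sketch. The function names, the `d` suffix for derivative arrays, and the interleaving of original and derivative instructions in the unfused version are assumptions of this illustration. The fusion is legal because moving `a = 2.0 * a + 10.0` above the first loop and the assignments to `b` and `c` below the last one respects all data dependencies.

```python
import math

def tangent_unfused(a, c, ad, cd):
    """One derivative loop per original instruction."""
    ndt = len(ad)
    ad, cd, bd = list(ad), list(cd), [0.0] * ndt
    for n in range(ndt):
        ad[n] = 2.0 * ad[n]
    a = 2.0 * a + 10.0
    for n in range(ndt):
        bd[n] = cd[n] + math.cos(a) * ad[n]
    b = c + math.sin(a)
    for n in range(ndt):
        cd[n] = 0.0
    c = 0.0
    return (a, b, c), (ad, bd, cd)

def tangent_fused(a, c, ad, cd):
    """Same computation with the three derivative loops fused."""
    ndt = len(ad)
    ad, cd, bd = list(ad), list(cd), [0.0] * ndt
    a = 2.0 * a + 10.0
    for n in range(ndt):
        ad[n] = 2.0 * ad[n]
        bd[n] = cd[n] + math.cos(a) * ad[n]
        cd[n] = 0.0
    b = c + math.sin(a)
    c = 0.0
    return (a, b, c), (ad, bd, cd)

unfused = tangent_unfused(0.5, 1.0, [1.0, 0.0, 2.0], [0.0, 1.0, 3.0])
fused = tangent_fused(0.5, 1.0, [1.0, 0.0, 2.0], [0.0, 1.0, 3.0])
```

Both versions perform the same floating-point operations on the same values, so `unfused` and `fused` are identical, while the fused form traverses the derivative arrays only once.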

SLIDE 37

Tapenade: an AD tool on the web

  • Servlet on http://www-sop.inria.fr/tropics or batch
  • Uploads your Files and Includes
  • Displays results and messages with links to source
SLIDE 38

Future work...

Tapenade is now 18 months old. Several applications: Aeronautics, Hydrology, Chemistry, Biology...

Many developments are still waiting:

  • User Directives: active I-O, checkpoints, special loops
  • FORTRAN95, and then C
  • Dead code in the Reverse mode
  • Validity domain for derivatives