Fusing filters with Integer Linear Programming

SLIDE 1

Fusing filters with Integer Linear Programming

Amos Robinson (that’s me!), Gabriele Keller, Ben Lippmeier

SLIDE 2

I don’t want to write this

    DO 10 I = 1, SIZE(XS)
      SUM1 = SUM1 + XS(I)
      IF (XS(I) .GT. 0) THEN
        SUM2 = SUM2 + XS(I)
      END IF
    10 CONTINUE

    DO 20 I = 1, SIZE(XS)
      NOR1(I) = XS(I) / SUM1
      NOR2(I) = XS(I) / SUM2
    20 CONTINUE

SLIDE 3

I’d rather write this

    sum1 = fold (+) 0 xs
    nor1 = map (/ sum1) xs
    ys   = filter (> 0) xs
    sum2 = fold (+) 0 ys
    nor2 = map (/ sum2) xs
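The combinator program runs as ordinary Haskell once `fold` is given a definition. The sketch below assumes `fold` is a strict left fold over lists (the slides do not say which fold or which array type is meant) and wraps the five bindings in a hypothetical `normalize2` function.

```haskell
import Data.List (foldl')

-- Assumption: the slides' `fold` behaves like a strict left fold on lists.
fold :: (b -> a -> b) -> b -> [a] -> b
fold = foldl'

-- Hypothetical wrapper around the five bindings from the slide.
-- Unfused, each combinator is its own traversal: five loops in total.
normalize2 :: [Double] -> ([Double], [Double])
normalize2 xs = (nor1, nor2)
  where
    sum1 = fold (+) 0 xs
    nor1 = map (/ sum1) xs
    ys   = filter (> 0) xs
    sum2 = fold (+) 0 ys
    nor2 = map (/ sum2) xs
```

For example, `normalize2 [3, -1, 1, -1]` divides every element by 2 for the first result and by 4 for the second.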

SLIDE 4

But I also want speed

  • Naive compilation: one loop for each combinator
  • We need fusion!
SLIDE 5

Vertical fusion

    sum1 = fold (+) 0 xs      -- loop 1
    nor1 = map (/ sum1) xs    -- loop 2
    ys   = filter (> 0) xs    -- loop 3
    sum2 = fold (+) 0 ys      -- loop 4
    nor2 = map (/ sum2) xs    -- loop 5

SLIDE 6

Vertical fusion

    sum1 = fold (+) 0 xs                 -- loop 1
    nor1 = map (/ sum1) xs               -- loop 2
    sum2 = filterFold (> 0) (+) 0 xs     -- loop 3
    nor2 = map (/ sum2) xs               -- loop 4
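What vertical fusion buys here can be sketched by writing `filterFold` by hand. The definition below is an assumption about what the fused combinator computes (the slides only name it): a single left fold that tests the predicate as it goes, so the intermediate list `ys` is never built. `unfused` is a hypothetical reference version for comparison.

```haskell
import Data.List (foldl')

-- Hand-written sketch of the fused combinator named on the slide.
-- One traversal: elements failing the predicate leave the accumulator unchanged.
filterFold :: (a -> Bool) -> (b -> a -> b) -> b -> [a] -> b
filterFold p k z = foldl' step z
  where
    step acc x
      | p x       = k acc x
      | otherwise = acc

-- Unfused reference: filter materialises the intermediate list, then fold traverses it.
unfused :: (a -> Bool) -> (b -> a -> b) -> b -> [a] -> b
unfused p k z = foldl' k z . filter p
```

Both functions return the same value for any input; the fused one simply skips the intermediate list.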

SLIDE 7

Horizontal fusion

    sum1 = fold (+) 0 xs                 -- loop 1
    nor1 = map (/ sum1) xs               -- loop 2
    sum2 = filterFold (> 0) (+) 0 xs     -- loop 3
    nor2 = map (/ sum2) xs               -- loop 4

SLIDE 8

Horizontal fusion

    sum1         = fold (+) 0 xs                            -- loop 1
    (nor1, sum2) = mapFilterFold (/ sum1) (> 0) (+) 0 xs    -- loop 2
    nor2         = map (/ sum2) xs                          -- loop 3

SLIDE 9

Finished

    sum1         = fold (+) 0 xs                            -- loop 1
    (nor1, sum2) = mapFilterFold (/ sum1) (> 0) (+) 0 xs    -- loop 2
    nor2         = map (/ sum2) xs                          -- loop 3

SLIDE 10

Multiple choices

  • What if we applied the fusion rules in a different order?
  • There are far too many to try all of them, but…
SLIDE 11

Order matters

    sum1 = fold (+) 0 xs      -- loop 1
    nor1 = map (/ sum1) xs    -- loop 2
    ys   = filter (> 0) xs    -- loop 3
    sum2 = fold (+) 0 ys      -- loop 4
    nor2 = map (/ sum2) xs    -- loop 5

SLIDE 12

Order matters

    sum1 = fold (+) 0 xs                 -- loop 1
    nor1 = map (/ sum1) xs               -- loop 2
    sum2 = filterFold (> 0) (+) 0 xs     -- loop 3
    nor2 = map (/ sum2) xs               -- loop 4

SLIDE 13

Order matters

    (sum1, sum2) = foldFilterFold (+) 0 (> 0) (+) 0 xs    -- loop 1
    nor1         = map (/ sum1) xs                        -- loop 2
    nor2         = map (/ sum2) xs                        -- loop 3

SLIDE 15

Order matters

    (sum1, sum2) = foldFilterFold (+) 0 (> 0) (+) 0 xs    -- loop 1
    (nor1, nor2) = mapMap (/ sum1) (/ sum2) xs            -- loop 2
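The two-loop result can be sketched with hand-written versions of the fused combinators. Both definitions are assumptions about what `foldFilterFold` and `mapMap` compute (the slides only name them); the argument order follows the slide, and `normalize2Fused` is a hypothetical wrapper.

```haskell
import Data.List (foldl')

-- Loop 1: one traversal computes the plain fold and the filtered fold together.
foldFilterFold
  :: (b -> a -> b) -> b -> (a -> Bool) -> (c -> a -> c) -> c -> [a] -> (b, c)
foldFilterFold k1 z1 p k2 z2 = foldl' step (z1, z2)
  where
    step (acc1, acc2) x =
      let acc1' = k1 acc1 x
          acc2' = if p x then k2 acc2 x else acc2
      in acc1' `seq` acc2' `seq` (acc1', acc2')

-- Loop 2: one pass over the input produces both output lists.
mapMap :: (a -> b) -> (a -> c) -> [a] -> ([b], [c])
mapMap f g xs = unzip [ (f x, g x) | x <- xs ]

-- The program from the slide, written against the two sketches above.
normalize2Fused :: [Double] -> ([Double], [Double])
normalize2Fused xs = (nor1, nor2)
  where
    (sum1, sum2) = foldFilterFold (+) 0 (> 0) (+) 0 xs
    (nor1, nor2) = mapMap (/ sum1) (/ sum2) xs
```

The second loop cannot be merged into the first because both maps need the finished sums, which is why two loops is the floor for this program.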

SLIDE 17

Which order?

  • Finding the best order is the hard part.
  • That’s why we use…
SLIDE 18

Integer Linear Programming!

    Minimise    y - x           Objective
    Subject to  0 ≤ x ≤ 2       Constraints
                0 ≤ y ≤ 2
                x + 2y ≥ 3
    Where       x : ℤ           Variables
                y : ℤ

SLIDE 19

Integer Linear Programming!

    Minimise    y - x           Objective
    Subject to  0 ≤ x ≤ 2       Constraints
                0 ≤ y ≤ 2
                x + 2y ≥ 3
    Where       x : ℤ = 2       Variables
                y : ℤ = 1
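A problem this small can be checked by exhaustive search. The sketch below takes 0 ≤ y ≤ 2 as the bounds on y (the lower bound is hard to read in the transcript; the optimum is the same for any lower bound of 1 or less). The names `feasible`, `objective` and `solution` are illustrative.

```haskell
import Data.List (minimumBy)
import Data.Ord (comparing)

-- Integer points satisfying the constraints; the bounds on y are an assumption.
feasible :: [(Int, Int)]
feasible =
  [ (x, y)
  | x <- [0 .. 2]
  , y <- [0 .. 2]
  , x + 2 * y >= 3
  ]

-- Objective from the slide: minimise y - x.
objective :: (Int, Int) -> Int
objective (x, y) = y - x

-- Exhaustive search; a real ILP solver avoids enumerating every point.
solution :: (Int, Int)
solution = minimumBy (comparing objective) feasible
```

`solution` evaluates to `(2, 1)`, matching x = 2, y = 1 on the slide, with objective value -1.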

SLIDE 20

Create a graph

SLIDE 21

    sum1 = fold (+) 0 xs
    nor1 = map (/ sum1) xs
    ys   = filter (> 0) xs
    sum2 = fold (+) 0 ys
    nor2 = map (/ sum2) xs

[Figure: graph under construction; nodes so far: xs]

SLIDE 22

    sum1 = fold (+) 0 xs

[Figure: nodes so far: xs, sum1]

SLIDE 23

    nor1 = map (/ sum1) xs

[Figure: nodes so far: xs, sum1, nor1]

SLIDE 24

    ys = filter (> 0) xs

[Figure: nodes so far: xs, sum1, nor1, ys]

SLIDE 25

    sum2 = fold (+) 0 ys

[Figure: nodes so far: xs, sum1, nor1, ys, sum2]

SLIDE 26

    nor2 = map (/ sum2) xs

[Figure: nodes so far: xs, sum1, nor1, ys, sum2, nor2]

SLIDE 27

Different size loops

[Figure: graph with nodes xs, sum1, nor1, ys, sum2, nor2; loop sizes shown: |xs|, |xs|, |ys|, |xs|, |xs|]

SLIDE 30

Filter constraint

    Minimise    …
    Subject to  …
                f(sum1, ys) ≤ f(sum1, sum2)
                f(sum2, ys) ≤ f(sum1, sum2)

  • f(a,b) = 0 iff a and b are fused together
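One way to read the two inequalities: since each f is 0 or 1, f(sum1, sum2) can only be 0 (fused) when f(sum1, ys) and f(sum2, ys) are both 0 as well. In other words, the two folds may share a loop only if the filter producing ys is in that loop too. The sketch below enumerates the assignments the constraint allows; the tuple layout and the name `allowed` are just for illustration.

```haskell
-- Each variable is 0 (fused) or 1 (not fused), as on the slide.
-- Tuple layout (illustrative): (f(sum1, ys), f(sum2, ys), f(sum1, sum2)).
allowed :: [(Int, Int, Int)]
allowed =
  [ (fSum1Ys, fSum2Ys, fSum1Sum2)
  | fSum1Ys   <- [0, 1]
  , fSum2Ys   <- [0, 1]
  , fSum1Sum2 <- [0, 1]
  , fSum1Ys <= fSum1Sum2   -- first inequality on the slide
  , fSum2Ys <= fSum1Sum2   -- second inequality on the slide
  ]
```

Of the eight possible assignments, five survive, and the only one with f(sum1, sum2) = 0 is `(0, 0, 0)`.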
SLIDE 31

Objective function

SLIDE 32

[Figure: graph with nodes xs, sum1, nor1, ys, sum2, nor2; no weights yet]

SLIDE 33

[Figure: same graph; pair weights so far: 100]

SLIDE 34

[Figure: same graph; pair weights so far: 100, 1]

SLIDE 35

[Figure: same graph; pair weights so far: 100, 1]

SLIDE 36

[Figure: same graph; pair weights so far: 100, 1, 100]

SLIDE 37

[Figure: same graph; pair weights so far: 100, 1, 100, 100]

SLIDE 38

[Figure: same graph; pair weights so far: 100, 1, 100, 100, 100]

SLIDE 39

[Figure: same graph; pair weights so far: 100, 1, 100, 100, 100, 1]

SLIDE 40

[Figure: same graph; pair weights so far: 100, 1, 100, 100, 100, 1, 100]

SLIDE 41

Objective function

    Minimise  100f(sum1, ys) + 1f(sum1, sum2) + 100f(sum1, nor2)
            + 100f(ys, sum2) + 100f(ys, nor1) + 1f(sum2, nor1)
            + 100f(nor1, nor2)
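To see what the objective measures, it can be evaluated for a concrete clustering. The sketch below represents a clustering as a hypothetical function from node name to cluster number, sets f(a, b) to 0 when a and b share a cluster and 1 otherwise, and sums the weights from the slide. The two clusterings are the three-loop and two-loop results from the earlier slides; all names are illustrative.

```haskell
-- Weighted pairs, copied from the objective on the slide.
weights :: [((String, String), Int)]
weights =
  [ (("sum1", "ys"),   100)
  , (("sum1", "sum2"),   1)
  , (("sum1", "nor2"), 100)
  , (("ys",   "sum2"), 100)
  , (("ys",   "nor1"), 100)
  , (("sum2", "nor1"),   1)
  , (("nor1", "nor2"), 100)
  ]

-- f(a, b) = 0 iff a and b are in the same cluster, so only split pairs cost anything.
cost :: (String -> Int) -> Int
cost cluster = sum [ w | ((a, b), w) <- weights, cluster a /= cluster b ]

-- Three loops: {sum1}, {nor1, ys, sum2}, {nor2}.
threeLoops :: String -> Int
threeLoops n
  | n == "sum1"                     = 0
  | n `elem` ["nor1", "ys", "sum2"] = 1
  | otherwise                       = 2

-- Two loops: {sum1, ys, sum2}, {nor1, nor2}.
twoLoops :: String -> Int
twoLoops n
  | n `elem` ["sum1", "ys", "sum2"] = 0
  | otherwise                       = 1
```

Under these weights `cost threeLoops` is 301 and `cost twoLoops` is 201, so the minimiser prefers the two-loop clustering.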

SLIDE 42

Cyclic clusterings cannot be executed

[Figure: graph with nodes xs, sum1, nor1, ys, sum2, nor2]

SLIDE 43

Non-fusible edge

[Figure: graph with nodes xs, sum1, nor1, ys, sum2, nor2]

SLIDE 44

Non-fusible edge

    o(sum1) < o(nor1)

SLIDE 45

Fusible edge

[Figure: graph with nodes xs, sum1, nor1, ys, sum2, nor2]

SLIDE 46

Fusible edge

    if f(ys, sum2) = 0
      then o(ys) = o(sum2)
      else o(ys) < o(sum2)

SLIDE 47

Fusible edge

    1f(ys, sum2) ≤ o(sum2) - o(ys) ≤ 100f(ys, sum2)

SLIDE 48

Fusible edge - fused

    1f(ys, sum2) ≤ o(sum2) - o(ys) ≤ 100f(ys, sum2)
               0 ≤ o(sum2) - o(ys) ≤ 0

    o(sum2) = o(ys)

SLIDE 49

Fusible edge - unfused

    1f(ys, sum2) ≤ o(sum2) - o(ys) ≤ 100f(ys, sum2)
               1 ≤ o(sum2) - o(ys) ≤ 100

    o(sum2) > o(ys)
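The linear form can be checked against the if/then/else reading by plugging in both values of f. The sketch below writes d for the difference o(sum2) - o(ys) and lists which values of d the inequalities permit within a small window; `fusibleEdgeOk` and `permitted` are illustrative names.

```haskell
-- The pair of inequalities from the slide, for the fusible edge from ys to sum2.
-- f is 0 (fused) or 1 (unfused); d stands for o(sum2) - o(ys).
fusibleEdgeOk :: Int -> Int -> Bool
fusibleEdgeOk f d = 1 * f <= d && d <= 100 * f

-- Differences permitted for a given f, searched over a small illustrative window.
permitted :: Int -> [Int]
permitted f = [ d | d <- [-3 .. 3], fusibleEdgeOk f d ]
```

`permitted 0` is `[0]` (same position, so the two nodes share a loop) and `permitted 1` is `[1, 2, 3]` (sum2 strictly after ys), which is the conditional from the earlier slide expressed without a conditional.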

SLIDE 50

No edge

[Figure: graph with nodes xs, sum1, nor1, ys, sum2, nor2]

SLIDE 51

No edge

if f(sum1, ys) = 0 then o(sum1) = o(ys)

SLIDE 52

No edge

    -100f(sum1, ys) ≤ o(ys) - o(sum1) ≤ 100f(sum1, ys)

SLIDE 53

No edge - fused

    -100f(sum1, ys) ≤ o(ys) - o(sum1) ≤ 100f(sum1, ys)
                  0 ≤ o(ys) - o(sum1) ≤ 0

    o(ys) = o(sum1)

SLIDE 54

No edge - unfused

    -100f(sum1, ys) ≤ o(ys) - o(sum1) ≤ 100f(sum1, ys)
               -100 ≤ o(ys) - o(sum1) ≤ 100
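With no edge between two nodes, neither has to come first, so the constraint is symmetric around zero. The sketch below writes d for o(ys) - o(sum1); `noEdgeOk` and `permitted` are illustrative names. Reading 100 as a bound that simply needs to exceed any difference the o values could take is an interpretation, not something the slides state.

```haskell
-- The pair of inequalities from the slide, for two nodes with no edge between them.
-- f is 0 (fused) or 1 (unfused); d stands for o(ys) - o(sum1).
noEdgeOk :: Int -> Int -> Bool
noEdgeOk f d = -100 * f <= d && d <= 100 * f

-- Differences permitted for a given f, searched over a small illustrative window.
permitted :: Int -> [Int]
permitted f = [ d | d <- [-3 .. 3], noEdgeOk f d ]
```

`permitted 0` is `[0]`, while `permitted 1` is the whole window from -3 to 3: when the nodes are not fused, either order is acceptable.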

SLIDE 55

All together

    Minimise  100f(sum1, ys) + 1f(sum1, sum2) + 100f(sum1, nor2)
            + 100f(ys, sum2) + 100f(ys, nor1) + 1f(sum2, nor1)
            + 100f(nor1, nor2)

    Subject to
        f(sum1, ys) ≤ f(sum1, sum2)
        f(sum2, ys) ≤ f(sum1, sum2)

        -100f(sum1, ys)   ≤ o(ys)   - o(sum1) ≤ 100f(sum1, ys)
        -100f(sum1, sum2) ≤ o(sum2) - o(sum1) ≤ 100f(sum1, sum2)
           1f(ys, sum2)   ≤ o(sum2) - o(ys)   ≤ 100f(ys, sum2)
        -100f(nor1, nor2) ≤ o(nor2) - o(nor1) ≤ 100f(nor1, nor2)

        o(sum1) < o(nor1)
        o(sum2) < o(nor2)

SLIDE 56

Result clustering

    f(sum1, ys)   = 0
    f(ys, sum2)   = 0
    f(sum1, sum2) = 0
    f(sum1, nor2) = 1
    f(ys, nor1)   = 1
    f(sum2, nor1) = 1
    f(nor1, nor2) = 0

[Figure: clustered graph with nodes xs, sum1, nor1, ys, sum2, nor2]
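For a program this small the clustering can be reproduced by exhaustive search instead of an ILP solver. The sketch below rests on two assumptions that go beyond the slides: every pair in the objective other than (ys, sum2) is given the "no edge" constraint (the "All together" slide lists only some of them), and each o value is restricted to 0..2 so the search space stays finite. All names are illustrative.

```haskell
import Control.Monad (replicateM)
import Data.List (minimumBy)
import Data.Ord (comparing)

-- One candidate: the seven f variables, then the five o variables.
data Sol = Sol
  { fS1Ys, fS1S2, fS1N2, fYsS2, fYsN1, fS2N1, fN1N2 :: Int
  , oS1, oN1, oYs, oS2, oN2 :: Int
  } deriving (Eq, Show)

-- "No edge" constraint: -100f <= o(b) - o(a) <= 100f.
noEdge :: Int -> Int -> Int -> Bool
noEdge f oa ob = -100 * f <= ob - oa && ob - oa <= 100 * f

-- "Fusible edge" constraint: 1f <= o(b) - o(a) <= 100f.
fusible :: Int -> Int -> Int -> Bool
fusible f oa ob = f <= ob - oa && ob - oa <= 100 * f

feasible :: [Sol]
feasible =
  [ Sol a b c d e g h s1 n1 ys s2 n2
  | [s1, n1, ys, s2, n2] <- replicateM 5 [0 .. 2]  -- assumed range for o
  , s1 < n1, s2 < n2                               -- non-fusible edges
  , [a, b, c, d, e, g, h] <- replicateM 7 [0, 1]
  , a <= b, d <= b                                 -- filter constraint
  , noEdge a s1 ys, noEdge b s1 s2, noEdge c s1 n2
  , fusible d ys s2
  , noEdge e ys n1, noEdge g s2 n1, noEdge h n1 n2
  ]

objective :: Sol -> Int
objective s =
  100 * fS1Ys s + fS1S2 s + 100 * fS1N2 s + 100 * fYsS2 s
    + 100 * fYsN1 s + fS2N1 s + 100 * fN1N2 s

-- Exhaustive minimisation; ties on the o values are broken arbitrarily.
best :: Sol
best = minimumBy (comparing objective) feasible
```

Under these assumptions `objective best` is 201, and the f values of `best` are those on the slide: sum1, ys and sum2 share one loop, nor1 and nor2 share the other.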

SLIDE 57

In conclusion

  • Integer linear programming isn’t as scary as it sounds!
  • We can fuse small (<10 combinator) programs in adequate time
  • But we still need to look into large programs
  • And we need to support more combinators

SLIDE 58

Timing: small programs

  • Quickhull, Normalize2, Closest points, Quad tree and other test cases
  • GLPK and CPLEX both took < 100ms.

SLIDE 59

Timing: large program

  • Randomly generated with 24 combinators
  • GLPK (open source) took > 20min
  • COIN/CBC (open source) took 90s
  • CPLEX (commercial) took < 1s!

SLIDE 60

References

  • Megiddo 1997: Optimal weighted loop fusion for parallel programs
  • Darte 1999: On the complexity of loop fusion
  • Lippmeier 2013: Data flow fusion with series expressions in Haskell

SLIDE 61

Differences from Megiddo

  • With combinators instead of loops, we have more semantic information about the program.
  • Which lets us recognise size-changing operations like filters, and fuse together.

SLIDE 62

Future work

  • Currently only a few combinators: map, map2, filter, fold, gather (bpermute), cross product
  • Need to support: length, reverse, append, segmented fold, segmented map, segmented…