Approximating with Input Level Granularity (PowerPoint presentation)



SLIDE 1

Approximating with Input Level Granularity

Parker Hill, Michael Laurenzano, Mehrzad Samadi, Scott Mahlke, Jason Mars, Lingjia Tang

SLIDE 2

Computational Model

  • Each operation executed with several inputs

Computation

SLIDE 3

Sensitivity to Input

SLIDE 4

Sensitivity to Input

[Figure panels: Input, Gamma Filter]

SLIDE 5

Sensitivity to Input

[Figure panels: Input, Gamma Filter, Approximation (16x8 Tiling*)]

*Samadi et al. ASPLOS 2014
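To make the comparison on this slide concrete, here is a minimal sketch of a gamma filter and a tiled approximation of it. The slides do not spell out how tiling works, so the sketch assumes that one representative pixel per tile is computed exactly and its result is reused for the rest of the tile. The function names, the choice of the top-left pixel, reading "16x8" as 16 rows by 8 columns, and the error-based quality metric are all illustrative assumptions.

```python
def gamma_filter(image, gamma=2.2):
    """Exact gamma correction on a 2D list of intensities in [0, 1]."""
    return [[px ** (1.0 / gamma) for px in row] for row in image]


def tiled_gamma(image, tile_h, tile_w, gamma=2.2):
    """Approximate gamma filter: compute one pixel per tile and reuse it.

    Assumption: the top-left pixel stands in for the whole tile.
    """
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r0 in range(0, rows, tile_h):
        for c0 in range(0, cols, tile_w):
            value = image[r0][c0] ** (1.0 / gamma)  # only exact computation in the tile
            for r in range(r0, min(r0 + tile_h, rows)):
                for c in range(c0, min(c0 + tile_w, cols)):
                    out[r][c] = value
    return out


def output_quality(exact, approx):
    """Quality as 1 minus the mean absolute error (hypothetical metric)."""
    n = sum(len(row) for row in exact)
    err = sum(abs(e - a) for er, ar in zip(exact, approx) for e, a in zip(er, ar))
    return 1.0 - err / n


# Two synthetic 16x16 inputs: a flat image and a checkerboard.
flat = [[0.5] * 16 for _ in range(16)]
checker = [[float((r + c) % 2) for c in range(16)] for r in range(16)]
for name, img in (("flat", flat), ("checker", checker)):
    quality = output_quality(gamma_filter(img), tiled_gamma(img, 16, 8))
    print(name, quality)  # flat 1.0, checker 0.5
```

The same 16x8 tiling is lossless on the flat image and loses half the quality on the checkerboard, which is the input sensitivity the slide illustrates.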

SLIDE 6

Sensitivity to Input

[Figure panels: Input, Gamma Filter, Approximation (16x8 Tiling*)]

Is this an acceptable approximation method?

*Samadi et al. ASPLOS 2014

SLIDE 9

Sensitivity to Input

[Figure panels: Input, Gamma Filter, Approximation (16x8 Tiling*)]

✓ ⊗

*Samadi et al. ASPLOS 2014

SLIDE 10

Previous Work

  • Use some set of inputs to:

– Determine if approximation is accurate enough
– Pick fastest acceptable approximation

  • Reuse the approximation for several inputs
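The static scheme described above can be sketched as follows. This is an illustration, not the method of any specific prior system: the `(name, speedup)` pair format, the `quality_of` callback, and the quality numbers are hypothetical, and only the 5.9x and 49x speedups come from the slides.

```python
def pick_static_approximation(approximations, training_inputs, quality_of, toq):
    """Return the fastest approximation that is acceptable on every training input.

    approximations: list of (name, speedup) pairs (hypothetical format).
    quality_of(name, x): measured output quality in [0, 1] (hypothetical callback).
    toq: minimum acceptable output quality.
    """
    acceptable = [
        (name, speedup)
        for name, speedup in approximations
        if all(quality_of(name, x) >= toq for x in training_inputs)
    ]
    if not acceptable:
        return None  # nothing qualifies: run the exact computation
    return max(acceptable, key=lambda pair: pair[1])[0]


QUALITY = {  # made-up quality numbers for two training inputs
    ("4x2", "img_a"): 0.99, ("4x2", "img_b"): 0.95,
    ("16x8", "img_a"): 0.97, ("16x8", "img_b"): 0.70,
}
choice = pick_static_approximation(
    [("4x2", 5.9), ("16x8", 49.0)],
    ["img_a", "img_b"],
    lambda name, x: QUALITY[(name, x)],
    toq=0.90,
)
print(choice)  # 4x2
```

Because one choice is reused for all inputs, a single hard input (`img_b` here) forces the slower 4x2 tiling even for inputs where 16x8 would have been fine.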

SLIDE 11

Performance vs Accuracy

16x8 Tiling: 49x speedup
4x2 Tiling: 5.9x speedup

✓ ⊗

SLIDE 12

Performance vs Accuracy

16x8 Tiling: 49x speedup
4x2 Tiling: 5.9x speedup

✓ ⊗ ✓ ✓

SLIDE 13

Performance vs Accuracy

16x8 Tiling: 49x speedup
4x2 Tiling: 5.9x speedup

✓ ⊗ ✓ __ ✓

SLIDE 14

Trade-off with Many Inputs

[Figure: 4x2 tiling approximation (5.9x speedup). Axes: Output Quality, Proportion. Marked regions: Missed Opportunity, Fast + High Quality, TOQ Violation. TOQ = 90%]

SLIDE 15

Trade-off with Many Inputs

  • Conservative approximation → small speedup

[Figure: 4x2 tiling approximation (5.9x speedup). Axes: Output Quality, Proportion. Marked regions: Missed Opportunity, Fast + High Quality, TOQ Violation. TOQ = 90%]

SLIDE 16

Trade-off with Many Inputs

  • Conservative approximation → small speedup
  • Cannot approximate more aggressively

[Figure: 4x2 tiling approximation (5.9x speedup). Axes: Output Quality, Proportion. Marked regions: Missed Opportunity, Fast + High Quality, TOQ Violation. TOQ = 90%]

[Figure: 8x8 tiling (22x speedup). Axes: Output Quality, Proportion]

SLIDE 18

Trade-off with Many Inputs

  • Conservative approximation → small speedup
  • Cannot approximate more aggressively

[Figure: 4x2 tiling approximation (5.9x speedup). Axes: Output Quality, Proportion. Marked regions: Missed Opportunity, Fast + High Quality, TOQ Violation. TOQ = 90%]

[Figure: 8x8 tiling (22x speedup). Axes: Output Quality, Proportion]

  • We would like to approximate inputs differently
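One way to read the three regions named in the plots is sketched below. The slides do not define the regions, so the rules here are an interpretation: an input is a TOQ violation if the fixed approximation falls below the TOQ, a missed opportunity if some faster approximation would still have met the TOQ, and fast + high quality otherwise. The quality numbers are made up; the 5.9x and 22x speedups are the ones quoted on the slides.

```python
def classify_inputs(qualities, chosen, speedups, toq):
    """Label each input under one fixed approximation.

    qualities: {input_id: {approximation: quality}} (hypothetical data layout).
    chosen: the single approximation applied to every input.
    """
    labels = {}
    for x, per_approx in qualities.items():
        if per_approx[chosen] < toq:
            labels[x] = "TOQ violation"
        elif any(
            speedups[a] > speedups[chosen] and q >= toq
            for a, q in per_approx.items()
        ):
            labels[x] = "missed opportunity"
        else:
            labels[x] = "fast + high quality"
    return labels


SPEEDUP = {"4x2": 5.9, "8x8": 22.0}  # speedups quoted on the slides
QUALITY = {  # made-up qualities for three inputs
    "a": {"4x2": 0.99, "8x8": 0.95},
    "b": {"4x2": 0.93, "8x8": 0.80},
    "c": {"4x2": 0.85, "8x8": 0.60},
}
labels = classify_inputs(QUALITY, "4x2", SPEEDUP, toq=0.90)
print(labels)
```

With a fixed 4x2 tiling, input `a` could have used 8x8, input `b` is handled well, and input `c` already violates the TOQ, so no single setting suits all three.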

SLIDE 19

Dynamic Approximation Challenges

  • Must analyze accurately

– Cannot violate TOQ
– Need to pick a fast approximation

  • Must analyze quickly

– Limits potential speedup
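A back-of-the-envelope model shows why analysis time caps the achievable speedup. It assumes the analysis runs before the approximate kernel and costs a fixed fraction of the exact run time; both the model and the 5% figure are illustrative assumptions, while the 49x raw speedup is the 16x8 tiling number from the slides.

```python
def net_speedup(raw_speedup, analysis_fraction):
    """End-to-end speedup when analysis precedes the approximate run.

    raw_speedup: speedup of the approximation alone.
    analysis_fraction: analysis time divided by exact run time (assumed model).
    """
    return 1.0 / (analysis_fraction + 1.0 / raw_speedup)


print(round(net_speedup(49.0, 0.00), 1))  # 49.0: free analysis keeps the full speedup
print(round(net_speedup(49.0, 0.05), 1))  # 14.2: a 5% analysis cost removes most of it
```

Under this model the net speedup can never exceed `1 / analysis_fraction`, however fast the chosen approximation is.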

SLIDE 20

One Possible Dynamic System

1) Provide:

  • A set of approximations
  • Input

2) Apply analysis to each pair:

  • Performance
  • Output quality

3) Select best approximation:

  • Meets accuracy constraint
  • High performance

4) Apply approximation

[Diagram labels: Input; Approximations (Tiling: 2x2, 4x2, 16x8, 16x16); Analysis; Selection; Customized Approximation; Approx. Result]
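The four steps of this dynamic system can be sketched as a per-input selection loop. This is an idealized illustration in which the analysis simply looks up known values. The `analyze` callback, the 2x2 and 16x16 speedups, and all quality numbers are hypothetical; only the 5.9x (4x2) and 49x (16x8) speedups appear in the slides.

```python
def select_for_input(x, approximations, analyze, toq):
    """Steps 1 to 3: analyze each (approximation, input) pair and return
    the fastest approximation that meets the TOQ, or None to run exactly."""
    best_name, best_speedup = None, 0.0
    for name in approximations:
        speedup, quality = analyze(name, x)          # step 2: performance and quality
        if quality >= toq and speedup > best_speedup:
            best_name, best_speedup = name, speedup  # step 3: fastest acceptable so far
    return best_name  # step 4 would apply this approximation to x


SPEEDUP = {"2x2": 3.0, "4x2": 5.9, "16x8": 49.0, "16x16": 90.0}  # 2x2 and 16x16 are made up
QUALITY = {  # made-up qualities for two inputs
    "first": {"2x2": 0.99, "4x2": 0.98, "16x8": 0.93, "16x16": 0.85},
    "second": {"2x2": 0.97, "4x2": 0.92, "16x8": 0.80, "16x16": 0.60},
}


def table_analysis(name, x):
    """Hypothetical analysis that looks up precomputed values."""
    return SPEEDUP[name], QUALITY[x][name]


choices = {x: select_for_input(x, SPEEDUP, table_analysis, toq=0.90) for x in QUALITY}
print(choices)  # {'first': '16x8', 'second': '4x2'}
```

A real system cannot afford to evaluate every pair exactly, so the analysis step would have to be a cheap prediction rather than a lookup.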

SLIDE 22

One Possible Dynamic System

[Diagram labels: Input; Approximations (Tiling: 2x2, 4x2, 16x8, 16x16); Analysis; Selection; Customized Approximation; Approx. Result. Analysis marks: ✓ ✓ ✓ ⊗]

SLIDE 23

One Possible Dynamic System

[Diagram labels: Input; Approximations (Tiling: 2x2, 4x2, 16x8, 16x16); Analysis; Selection; Customized Approximation; Approx. Result. Analysis marks: ✓ ✓ ✓ ⊗. Tiling shown as selected: 16x8]

SLIDE 26

One Possible Dynamic System

[Diagram labels: Input; Approximations (Tiling: 2x2, 4x2, 16x8, 16x16); Analysis; Selection; Customized Approximation; Approx. Result. Analysis marks: ✓ ✓ ⊗ ⊗]

SLIDE 27

One Possible Dynamic System

[Diagram labels: Input; Approximations (Tiling: 2x2, 4x2, 16x8, 16x16); Analysis; Selection; Customized Approximation; Approx. Result. Analysis marks: ✓ ✓ ⊗ ⊗. Tiling shown as selected: 4x2]

SLIDE 29

Dynamic Oracle Selections

  • Optimal choice depends heavily on input

[Figure: Proportion of inputs for which the oracle selects each tiling: 16x16, 8x16, 16x32, 8x32, 32x32, 4x16, 8x8, 16x64, 32x16, 32x64, 64x128, 4x8, 4x32, 16x128, 32x128, 4x4, 8x64, 8x4, and a remaining "others" group]

SLIDE 30

Dynamic Oracle Performance

  • Accuracy near TOQ

[Figure: Axes: Output Quality, Proportion. Marked regions: Missed Opportunity, Fast + High Quality, TOQ Violation. TOQ = 90%]

SLIDE 31

Dynamic Oracle Performance

  • Accuracy near TOQ
  • 61x average speedup

[Figure: Axes: Output Quality, Proportion. Marked regions: Missed Opportunity, Fast + High Quality, TOQ Violation. TOQ = 90%]

[Figure: Axes: Speedup, Proportion]

SLIDE 32

Dynamic Oracle Performance

  • Accuracy near TOQ
  • 61x average speedup (compared to 5.9x for 4x2 tiling)

[Figure: Axes: Output Quality, Proportion. Marked regions: Missed Opportunity, Fast + High Quality, TOQ Violation. TOQ = 90%]

[Figure: Axes: Speedup, Proportion]

[Figure: 4x2 tiling approximation (5.9x speedup). Axes: Output Quality, Proportion]

SLIDE 33

Conclusion

  • Adjusting approximation per input is important

– 61x potential speedup for dynamic system
– 5.9x potential speedup for static system

  • To take advantage of this opportunity:

– Dynamic system predicts approximation per input
– High prediction accuracy
– Quick predictions

SLIDE 34

Questions?