Approximating with Input Level Granularity
Parker Hill, Michael Laurenzano, Mehrzad Samadi, Scott Mahlke, Jason Mars, Lingjia Tang
2
Computational Model
- Each operation executed with several inputs
Computation
3
Sensitivity to Input
[Figure panels: Input, Gamma Filter, Approximation (16x8 Tiling*)]
- Is this an acceptable approximation method?
- Acceptable (✓) for some inputs, not acceptable (⊗) for others
*Samadi et al. ASPLOS 2014
10
Previous Work
- Use some set of inputs to:
– Determine if approximation is accurate enough
– Pick fastest acceptable approximation
- Reuse the approximation for several inputs
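The static workflow described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the implementation of any cited system: `Approximation`, `select_static`, the input names, and the quality values are hypothetical. Only the 49x and 5.9x speedups come from the slides. TOQ is read here as the target output quality.

```python
from typing import Dict, NamedTuple, Optional, Sequence


class Approximation(NamedTuple):
    name: str                  # label such as "4x2"
    speedup: float             # speedup over exact execution
    quality: Dict[str, float]  # output quality per training input, in [0, 1]


def select_static(approximations: Sequence[Approximation],
                  training_inputs: Sequence[str],
                  toq: float) -> Optional[Approximation]:
    """Return the fastest approximation that meets the TOQ on every training input."""
    acceptable = [a for a in approximations
                  if all(a.quality[x] >= toq for x in training_inputs)]
    # None means no approximation is acceptable, so exact execution is kept.
    return max(acceptable, key=lambda a: a.speedup, default=None)


# Speedups are the figures quoted on the slides; quality values are made up.
candidates = [
    Approximation("16x8", 49.0, {"input_a": 0.97, "input_b": 0.82}),
    Approximation("4x2", 5.9, {"input_a": 0.99, "input_b": 0.93}),
]
chosen = select_static(candidates, ["input_a", "input_b"], toq=0.90)
print(chosen.name)  # 4x2
```

Because the single choice is reused for every later input, one difficult training input (here the hypothetical `input_b`) is enough to force the slower 4x2 tiling for all inputs.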
11
Performance vs Accuracy
- 16x8 Tiling: 49x speedup, ✓ ⊗
- 4x2 Tiling: 5.9x speedup, ✓ ✓
14
Trade-off with Many Inputs
- Conservative approximation → small speedup
- Cannot approximate more aggressively
[Figure: 4x2 tiling approximation (5.9x speedup). Axes: Output Quality, Proportion. Annotated regions: Missed Opportunity, Fast + High Quality, TOQ Violation. TOQ = 90%]
[Figure: 8x8 tiling (22x speedup). Axes: Output Quality, Proportion]
- We would like to approximate inputs differently
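One way to see why a single static choice is stuck is to measure, for each candidate approximation, the fraction of inputs whose output quality falls below the TOQ. The sketch below does this for two made-up quality distributions. `violation_rate` and both lists are hypothetical and are not data from the talk.

```python
from typing import Sequence


def violation_rate(qualities: Sequence[float], toq: float) -> float:
    """Fraction of inputs whose output quality falls below the TOQ."""
    if not qualities:
        raise ValueError("need at least one input")
    return sum(q < toq for q in qualities) / len(qualities)


# Made-up per-input output quality for two hypothetical approximations.
conservative = [0.99, 0.97, 0.96, 0.95, 0.93]
aggressive = [0.98, 0.94, 0.91, 0.88, 0.79]

print(violation_rate(conservative, toq=0.90))  # 0.0
print(violation_rate(aggressive, toq=0.90))    # 0.4
```

In this toy data the aggressive approximation is acceptable on three of the five inputs, yet a static system must reject it for all of them. That gap is the motivation for choosing an approximation per input.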
19
Dynamic Approximation Challenges
- Must analyze accurately
– Cannot violate TOQ
– Need to pick a fast approximation
- Must analyze quickly
– Analysis time limits potential speedup
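The second challenge can be made concrete with a little arithmetic. The sketch below assumes the analysis runs serially before the approximate computation, so its time adds directly to the approximate run time. `effective_speedup` and the timings are hypothetical.

```python
def effective_speedup(exact_time: float, approx_time: float,
                      analysis_time: float) -> float:
    """Speedup over exact execution when analysis precedes the approximation."""
    return exact_time / (analysis_time + approx_time)


# Hypothetical timings in milliseconds: the approximation alone is 50x faster.
print(effective_speedup(100.0, 2.0, 0.0))  # 50.0
print(effective_speedup(100.0, 2.0, 2.0))  # 25.0
print(effective_speedup(100.0, 2.0, 8.0))  # 10.0
```

Under this assumption, an analysis that takes as long as the approximation itself already halves the achievable speedup.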
20
One Possible Dynamic System
1) Provide:
- A set of approximations
- Input
2) Apply analysis to each pair:
- Performance
- Output quality
3) Select best approximation:
- Meets accuracy constraint
- High performance
4) Apply approximation
[Diagram: Input and Approximations (Tiling: 2x2, 4x2, 16x8, 16x16) → Analysis → Selection → Customized Approximation → Approx. Result]
- First example input: analysis marks ✓ ✓ ✓ ⊗ across the four tilings, 16x8 is selected
- Second example input: analysis marks ✓ ✓ ⊗ ⊗ across the four tilings, 4x2 is selected
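The four steps above can be sketched as follows. This is a minimal illustration, not the authors' system: `analyze` is a stand-in that looks up made-up estimates, and `select_dynamic`, the input names, the quality values, and the 2x2 and 16x16 speedups are all hypothetical. Only the 5.9x and 49x speedups appear on the slides.

```python
from typing import Dict, Optional, Tuple

Estimate = Tuple[float, float]  # (estimated speedup, estimated output quality)

# Step 1: a set of approximations. The 2x2 and 16x16 speedups are made up.
SPEEDUP = {"2x2": 2.0, "4x2": 5.9, "16x8": 49.0, "16x16": 90.0}

# Made-up per-input quality estimates standing in for a real analysis.
QUALITY = {
    "input_a": {"2x2": 0.99, "4x2": 0.98, "16x8": 0.93, "16x16": 0.85},
    "input_b": {"2x2": 0.98, "4x2": 0.94, "16x8": 0.86, "16x16": 0.70},
}


def analyze(input_name: str) -> Dict[str, Estimate]:
    """Step 2: estimate performance and output quality for each pair."""
    return {t: (SPEEDUP[t], QUALITY[input_name][t]) for t in SPEEDUP}


def select_dynamic(estimates: Dict[str, Estimate], toq: float) -> Optional[str]:
    """Step 3: fastest approximation whose estimated quality meets the TOQ."""
    passing = {name: est for name, est in estimates.items() if est[1] >= toq}
    if not passing:
        return None  # nothing is acceptable, so fall back to exact execution
    return max(passing, key=lambda name: passing[name][0])


# Step 4 would apply the selected tiling to the input; here we only print it.
for name in ("input_a", "input_b"):
    print(name, select_dynamic(analyze(name), toq=0.90))
# input_a 16x8
# input_b 4x2
```

The toy estimates are chosen so that the two inputs reproduce the pass and fail patterns shown in the slide builds. A real analysis would have to produce these estimates quickly and accurately.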
29
Dynamic Oracle Selections
- Optimal choice depends heavily on input
[Figure: proportion of oracle selections per tiling. Axis: Proportion. Tilings shown: 16x16, 8x16, 16x32, 8x32, 32x32, 4x16, 8x8, 16x64, 32x16, 32x64, 64x128, 4x8, 4x32, 16x128, 32x128, 4x4, 8x64, 8x4, others]
30
Dynamic Oracle Performance
- Accuracy near TOQ
- 61x average speedup (compared to 5.9x for 4x2 tiling)
[Figure: output quality with the dynamic oracle. Axes: Output Quality, Proportion. Annotated regions: Missed Opportunity, Fast + High Quality, TOQ Violation. TOQ = 90%]
[Figure: speedup with the dynamic oracle. Axes: Speedup, Proportion]
[Figure, for comparison: 4x2 tiling approximation (5.9x speedup). Axes: Output Quality, Proportion]
33
Conclusion
- Adjusting approximation per input is important
– 61x potential speedup for dynamic system
– 5.9x potential speedup for static system
- To take advantage of this opportunity:
– Dynamic system predicts approximation per input
– High prediction accuracy
– Quick predictions