Stabilizing Numeric Programs against Platform Uncertainties
Thomas Wahl, Northeastern University
August 28, 2017
Example: Ray Tracing
For the input on the right:
- AMD GPU A8-3850: D = -0.000122
- NVIDIA Quadro 600 GPU: D = +0.000244

    int raySphere(float *r, float *s, float radiusSq) {
        float A = dot3(r,r);
        float B = -2.0 * dot3(s,r);
        float C = dot3(s,s) - radiusSq;
        float D = B*B - 4*A*C;
        if (D > 0.00001) {
            float sign = ( C > 0.00001 ) ? 1 : -1;
            ⋮
[Figure: the sum 9.0 + 0.9 + 0.09 + 0.009 + 0.0009 + 0.00009 + 0.000009 + 0.0000009 evaluated on a CPU and on a GPU. Intermediate values shown: 9.9, 0.099, 0.00099, 0.0000099; then 9.999, 0.0009999; then 9.9999999.]
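The effect of summation order can be tried out with a small sketch. The two functions below are illustrative assumptions, not code from the talk: `sum_sequential` accumulates left to right, while `sum_pairwise` uses a tree reduction whose intermediate sums for eight terms follow the grouping (x0 + x1) + (x2 + x3), and so on. The sketch assumes IEEE 754 single precision with no extended-precision intermediates.

```c
#include <stddef.h>

/* Left-to-right accumulation: ((x[0] + x[1]) + x[2]) + ... */
float sum_sequential(const float *x, size_t n) {
    float acc = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        acc += x[i];
    }
    return acc;
}

/* Pairwise (tree) reduction: sum each half, then add the two halves. */
float sum_pairwise(const float *x, size_t n) {
    if (n == 0) return 0.0f;
    if (n == 1) return x[0];
    size_t mid = n / 2;
    float left = sum_pairwise(x, mid);
    float right = sum_pairwise(x + mid, n - mid);
    return left + right;
}
```

Both functions compute the same mathematical sum, but the rounding happens at different points. For the hypothetical input {1e8f, 1.0f, -1e8f, 1.0f}, the sequential version should return 1 and the pairwise version 0 under the stated assumptions, because 1e8f + 1.0f rounds back to 1e8f.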
Platform Variations: Contracted Operations
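Fused Multiply-Add (FMA)
- Without FMA (MULT, ROUND, ADD, ROUND): $(a \otimes b) \oplus c$, two roundings
- With FMA (MULT, ADD, ROUND): $(a \times b) \oplus c$, one rounding
Here $\otimes$ and $\oplus$ denote rounded floating-point multiplication and addition; $\times$ is the exact product.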
VOLATILITY IN NUMERIC PROGRAMS
Volatile Expressions
= expression whose semantics depends on the (expression) evaluation model, which determines:
- evaluation order
- availability and use of hardware features such as fused multiply-add (FMA)

Sums, products: $x_1(I) + x_2(I) + \ldots + x_n(I)$
Dot products: $x_1(I) \times y_1(I) + \ldots + x_n(I) \times y_n(I)$
Quantifying Volatility
Let $e$ be a floating-point expression in the program. The volatile bound of $e$ for input $I$ is
$[\,\min_{M} e(I, M),\ \max_{M} e(I, M)\,]$
where $M$ ranges over the expression evaluation models.

Questions:
- How do we compute this, for input $I$?
- Once computed, what is it good for?
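For a very small expression, the volatile bound can be computed by brute force. The sketch below is an assumption-laden illustration, not the method from the talk: it restricts the set of evaluation models to the three ways of associating a three-term sum, ignores FMA, and uses the hypothetical names `VolatileBound` and `sum3_volatile_bound`.

```c
typedef struct {
    float lo;  /* minimum result over all models considered */
    float hi;  /* maximum result over all models considered */
} VolatileBound;

/* Hypothetical model set: the three ways to associate a + b + c.
   The volatile temporary forces each partial sum to be rounded to float. */
VolatileBound sum3_volatile_bound(float a, float b, float c) {
    volatile float t;
    float results[3];

    t = a + b; results[0] = t + c;   /* (a + b) + c */
    t = b + c; results[1] = a + t;   /* a + (b + c) */
    t = a + c; results[2] = t + b;   /* (a + c) + b */

    VolatileBound vb = { results[0], results[0] };
    for (int i = 1; i < 3; ++i) {
        if (results[i] < vb.lo) vb.lo = results[i];
        if (results[i] > vb.hi) vb.hi = results[i];
    }
    return vb;
}
```

For the illustrative input (1e8f, -1e8f, 1.0f), the three orders should yield 1, 0 and 0 in IEEE 754 single precision, giving the bound [0, 1]. Enumerating models this way grows quickly with expression size, so it only serves to make the definition concrete.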
Volatility in Matrix Calculations
- well-designed programs
- high degree of reproducibility
But: Ray Tracing
For some inputs:
- AMD GPU A8-3850: D = -0.000122
- NVIDIA Quadro 600 GPU: D = +0.000244

    int raySphere(float *r, float *s, float radiusSq) {
        float A = dot3(r,r);
        float B = -2.0 * dot3(s,r);
        float C = dot3(s,s) - radiusSq;
        float D = B*B - 4*A*C;
        if (D > 0.00001) {
            float sign = ( C > 0.00001 ) ? 1 : -1;
            ⋮
Stabilizing Numeric Programs: Overview
Goal:
- Improve reproducibility of program results
Naive solution: "determinize" the whole program
    cl /fp:strict source.cpp
- implements strict evaluation: FMA disabled, expressions evaluated from left to right
Local Stabilization
Identifying Major Destabilizers
    int raySphere(float *r, float *s, float radiusSq) {
        float A = dot3(r,r);
        float B = -2.0 * dot3(s,r);
        float C = dot3(s,s) - radiusSq;
        float D = B*B - 4*A*C;
        if (D > 0.00001) {
            float sign = ( C > 0.00001 ) ? 1 : -1;
            ⋮

Stabilizing D's expression

    int raySphere(float *r, float *s, float radiusSq) {
        float A = dot3(r,r);
        float B = -2.0 * dot3(s,r);
        float C = dot3(s,s) - radiusSq;
        /* don't optimize ... */
        float D = B*B - 4*A*C;
        if (D > 0.00001) {
            float sign = ( C > 0.00001 ) ? 1 : -1;
            ⋮

Stabilizing D = B*B - 4*A*C: new volatile bound for D of [-0.250000000, 0.125000000]
Stabilizing D's expression is not enough
Two causes of large volatile bounds for D:
1. volatility caused by D's defining expression
2. volatility inherited from earlier expressions, which causes bloated inputs to D!
Provenance of D's Volatility
= for each preceding expression (here: A, B, C), the contribution to (impact on) D's volatility
[VMCAI 2017]
    int raySphere(float *r, float *s, float radiusSq) {
        float A = dot3(r,r);
        /* FMA forbidden! */
        float B = -2.0 * dot3(s,r);
        /* don't reorder! */
        float C = dot3(s,s) - radiusSq;
        float D = B*B - 4*A*C;
        if (D > 0.00001) {
            float sign = ( C > 0.00001 ) ? 1 : -1;
            ⋮

- Stabilizing B = -2.0 * dot3(s,r): new bound for D: [-0.002806663, 0.156753913]
- Stabilizing C = dot3(s,s) - radiusSq: new bound for D: [0.125000000, 0.156753913]
With the last bound, D > 0.00001 holds under every evaluation model considered, so the branch no longer depends on the evaluation model.
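One way such annotations might be enforced in plain C is sketched below. This is an assumption about a possible mechanism, not the technique used in the talk: the hypothetical helpers `dot3_stable` and `discriminant_stable` fix the evaluation order explicitly and store every intermediate result in a `volatile float`, which prevents the compiler from fusing a multiply with an add or reassociating the operations.

```c
/* Dot product with a fixed left-to-right order. Each product and partial
   sum is stored in a volatile float, so the compiler can neither fuse a
   multiply with an add nor reassociate the sum. */
float dot3_stable(const float *x, const float *y) {
    volatile float p0 = x[0] * y[0];
    volatile float p1 = x[1] * y[1];
    volatile float p2 = x[2] * y[2];
    volatile float s = p0 + p1;
    return s + p2;
}

/* D = B*B - 4*A*C evaluated as (B*B) - ((4*A)*C), one rounding per operation. */
float discriminant_stable(float A, float B, float C) {
    volatile float bb = B * B;
    volatile float fourA = 4.0f * A;
    volatile float fourAC = fourA * C;
    return bb - fourAC;
}
```

Using `volatile` is a blunt instrument that costs performance, which is why it makes sense to apply it only to the expressions identified as major destabilizers. Compiler options such as `-ffp-contract=off` in GCC and Clang address the FMA part at the level of a whole translation unit instead.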
Take-Home Messages
- FPA expressions are volatile: their semantics is fragile against platform variations
- expose volatility, or prove robustness, on a per-input basis
- repair: make the program robust against platform uncertainties
Future plans:
- platform uncertainties in machine learning (or other big-data)
- prove robustness for a range of inputs
- trade-offs: numerical stability vs. accuracy (≠!) vs. efficiency
Acknowledgements
Joint with (@ NEU):
- Yijia Gu (stud.), Computer and Information Science
- Miriam Leeser (fac.) and Mahsa Bayati (stud.), Engineering