Automatically Improving Accuracy for Floating Point Expressions
Pavel Panchekha Alex Sanchez-Stern James Wilcox Zach Tatlock
Floating Point’s Wild Success
Often floating point is close to real arithmetic.
But not always!
Numerous articles retracted [Altman ’99, ’03]
Financial regulations [Euro ’98]
Market distortions [McCullough ’99, Quinn ’83]
Rounding Error in Sculpture
Rounding error
Blake Courter @bcourter
Existing options
Big Float+ Fast Code + More Reliable
+ Reliable + Fast Code
Heuristic search to find expert transformations
[Diagram: e, e’, eF, eR]
Outline: Worked Example, How Herbie Works, Evaluation
Rounding Error in Quadratic
$\frac{-b + \sqrt{b^2 - 4ac}}{2a}$
What is rounding error?
error: the distance between $\llbracket e \rrbracket_R$ (exact) and $\llbracket e \rrbracket_F$ (computed), e.g. 7 ULPs
log(ULPs) estimates # of incorrect bits
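The ULP measure above can be sketched in a few lines of Python. This is a minimal illustration, not Herbie's own code: the helper names `ordinal`, `ulps_between`, and `bits_of_error` are hypothetical, and adding 1 inside the logarithm is an assumption made here so that identical values score 0 bits.

```python
import math
import struct

def ordinal(x: float) -> int:
    """Map a double to an integer so that adjacent doubles map to adjacent integers."""
    (bits,) = struct.unpack("<q", struct.pack("<d", x))
    # Negative doubles order in reverse as raw bit patterns, so mirror them.
    return bits if bits >= 0 else -(bits & 0x7FFFFFFFFFFFFFFF)

def ulps_between(exact: float, computed: float) -> int:
    """Number of representable doubles separating the two values."""
    return abs(ordinal(exact) - ordinal(computed))

def bits_of_error(exact: float, computed: float) -> float:
    """log2 of the ULP distance; the +1 (an assumption) makes equal values score 0."""
    return math.log2(ulps_between(exact, computed) + 1)

# 1.0 + 7 * 2**-52 is exactly seven doubles above 1.0.
print(ulps_between(1.0, 1.0 + 7 * 2**-52))
print(bits_of_error(1.0, 1.0 + 7 * 2**-52))
```

Counting in ULPs rather than in absolute or relative difference keeps the measure uniform across the whole range of magnitudes that doubles can represent.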
[Plot: log(ULPs), 0 to 64, against b for $\frac{-b + \sqrt{b^2 - 4ac}}{2a}$, with the b axis divided into regions A, B, C, D]
⇒
$\frac{c}{b} - \frac{b}{a}$ if b ∈ A
$\frac{-b + \sqrt{b^2 - 4ac}}{2a}$ if b ∈ B
$\frac{2c}{-b - \sqrt{b^2 - 4ac}}$ if b ∈ C
$-\frac{c}{b}$ if b ∈ D
Region A: Overflow
If b is large, $\llbracket b^2 \rrbracket_F$ overflows and the whole expression returns ∞.
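The overflow case can be reproduced directly. The sketch below uses hypothetical function names and input values chosen only for illustration; it compares the textbook formula with the c/b − b/a form on an input where b² exceeds the double range.

```python
import math

def naive_root(a: float, b: float, c: float) -> float:
    """Textbook quadratic formula, evaluated in double precision."""
    return (-b + math.sqrt(b * b - 4 * a * c)) / (2 * a)

def regime_a_root(a: float, b: float, c: float) -> float:
    """The c/b - b/a form, which never squares b."""
    return c / b - b / a

# Illustrative inputs: b * b is about 1e400, far above the largest double.
a, b, c = 1.0, -1e200, 1.0
print(naive_root(a, b, c))     # b * b overflows to infinity, and so does the result
print(regime_a_root(a, b, c))  # stays finite
```

The second form avoids the overflow because it only divides by b and by a.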
Region B: Pretty Accurate
Region C: Catastrophic Cancellation
If b is large, but a and c are small, $b \approx \sqrt{b^2 - 4ac}$ and the difference is rounded off.
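A small experiment makes the cancellation visible. This is a sketch under stated assumptions: Python's `decimal` module at 60 digits stands in for exact real arithmetic, the function names are hypothetical, and the inputs are picked for illustration.

```python
import math
from decimal import Decimal, getcontext

getcontext().prec = 60  # high-precision reference, a stand-in for real arithmetic

def naive_root(a: float, b: float, c: float) -> float:
    """Textbook formula: -b and the square root nearly cancel when b > 0 is large."""
    return (-b + math.sqrt(b * b - 4 * a * c)) / (2 * a)

def rewritten_root(a: float, b: float, c: float) -> float:
    """The 2c / (-b - sqrt(...)) form: the denominator adds two same-sign terms."""
    return (2 * c) / (-b - math.sqrt(b * b - 4 * a * c))

def reference_root(a: float, b: float, c: float) -> Decimal:
    """The same root computed with 60 significant digits."""
    A, B, C = Decimal(a), Decimal(b), Decimal(c)
    return (-B + (B * B - 4 * A * C).sqrt()) / (2 * A)

def relative_error(computed: float, exact: Decimal) -> float:
    return float(abs((Decimal(computed) - exact) / exact))

a, b, c = 1.0, 1e8, 1.0
exact = reference_root(a, b, c)
print("naive:    ", relative_error(naive_root(a, b, c), exact))
print("rewritten:", relative_error(rewritten_root(a, b, c), exact))
```

On this input the naive form loses most of its significant digits, while the rewritten form stays close to the rounding limit of a double.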
Region D: Overflow again
[Plots: log(ULPs), 0 to 64, against b, before and after: $\frac{-b + \sqrt{b^2 - 4ac}}{2a}$ ⇒ the four-regime expression over A, B, C, D]
How Herbie Works
Herbie Architecture
Stages: sample, cands, focus, generate more candidates, regimes
  sample: Ground truth
  focus: Localize error
  generate more candidates: Heuristic search
  cands: Keep all good candidates
  regimes: Combine candidates
Determine ground truth
X = sample(domain(e)), drawn as 64 random bits
e.g. X = {1.2 · 10^−17, −3.8 · 10^204, 173.5, …}
$\llbracket e \rrbracket_F(X)$: compute in F
$\mathrm{Round}(\llbracket e \rrbracket_R(X))$: get 64-bit prefix with MPFR. Subtle! See paper.
error: e.g. {13.2b, 51.7b, 1b, …}
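The sampling and ground-truth step can be sketched as follows. Several things here are assumptions rather than Herbie's method: exact rational arithmetic from Python's `fractions` module replaces MPFR (so it only covers expressions built from +, −, ×, ÷), the example expression (x + 1) − x is chosen for illustration, and all helper names are hypothetical.

```python
import math
import random
import struct
from fractions import Fraction

def sample_double(rng: random.Random) -> float:
    """Reinterpret 64 random bits as a double, retrying on NaN and infinity."""
    while True:
        (x,) = struct.unpack("<d", struct.pack("<Q", rng.getrandbits(64)))
        if math.isfinite(x):
            return x

def ordinal(x: float) -> int:
    """Order-preserving integer index of a double."""
    (bits,) = struct.unpack("<q", struct.pack("<d", x))
    return bits if bits >= 0 else -(bits & 0x7FFFFFFFFFFFFFFF)

def bits_of_error(exact: float, computed: float) -> float:
    return math.log2(abs(ordinal(exact) - ordinal(computed)) + 1)

def e_float(x: float) -> float:
    """The expression evaluated in F (double precision)."""
    return (x + 1.0) - x

def e_exact(x: float) -> float:
    """The expression evaluated exactly over the rationals, then rounded once."""
    q = Fraction(x)  # converting a double to a Fraction is exact
    return float((q + 1) - q)

rng = random.Random(0)
X = [sample_double(rng) for _ in range(256)]
errors = [bits_of_error(e_exact(x), e_float(x)) for x in X]
print(f"average error: {sum(errors) / len(errors):.1f} bits")
```

Sampling bit patterns rather than a uniform range of reals spreads the points across all exponents, so very large and very small inputs both get tested.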
Localize error
Cancellation Overflow
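Focus: Estimate Error Source
For each operation f in e, compare the exact $f_R$ with the floating-point $f_F$ on exactly computed arguments.
Example: the + in $\frac{-b + \sqrt{b^2 - 4ac}}{2a}$
$x = \llbracket -b \rrbracket_R$
$y = \llbracket \sqrt{b^2 - 4ac} \rrbracket_R$
Compare $\mathrm{Round}(x +_R y)$ with $(x +_F y)$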
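A per-operation comparison of this kind might look like the sketch below. It is an illustration under assumptions: `decimal` at 80 digits stands in for real arithmetic, relative error is used instead of bits to keep the code short, and `local_errors` is a hypothetical helper covering only the square root and the addition.

```python
import math
from decimal import Decimal, getcontext

getcontext().prec = 80  # stand-in for arbitrary-precision real arithmetic

def relative_error(exact: float, computed: float) -> float:
    if exact == 0.0:
        return 0.0 if computed == 0.0 else math.inf
    return abs(computed - exact) / abs(exact)

def local_errors(a: float, b: float, c: float) -> dict:
    """For two operations, feed in exactly computed arguments and compare
    Round(f_R(args)) with f_F(Round(args))."""
    A, B, C = Decimal(a), Decimal(b), Decimal(c)
    disc = B * B - 4 * A * C  # argument of the square root, computed exactly
    x = -B                    # first argument of the addition
    y = disc.sqrt()           # second argument of the addition
    return {
        "sqrt": relative_error(float(y), math.sqrt(float(disc))),
        "+": relative_error(float(x + y), float(x) + float(y)),
    }

errs = local_errors(1.0, 1e8, 1.0)
print(errs)
```

Because each operation receives exact arguments, error introduced earlier in the expression is not blamed on it; on this input the addition stands out and the square root does not.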
Herbie Architecture
Stages: sample, cands, focus, rewrite, series, simplify, regimes
rewrite: Create candidates
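Apply rewrites to $\frac{-b + \sqrt{b^2 - 4ac}}{2a}$
Rule DB:
  $-x \to 0 - x$
  $x + y \to \frac{x^2 - y^2}{x - y}$
  $(x - y) + z \to x - (y - z)$
  … 120 more …
$\left(\frac{(-b)^2 - (\sqrt{b^2 - 4ac})^2}{-b - \sqrt{b^2 - 4ac}}\right) / 2a$
No cancellation in denominator
Recursive rewrites:
$\frac{(0 - b) + \sqrt{b^2 - 4ac}}{2a}$, then $\frac{0 - (b - \sqrt{b^2 - 4ac})}{2a}$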
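Rule application on an expression tree can be sketched with nested tuples. This is a toy model, not Herbie's rule engine: the tuple encoding, the operator names (`"neg"`, `"sq"`, `"sqrt"`), and the three hard-coded rules are assumptions for illustration, mirroring the three rules listed above.

```python
def neg_to_sub(e):
    """-x  ->  0 - x"""
    if isinstance(e, tuple) and e[0] == "neg":
        return ("-", 0, e[1])
    return None

def flip_plus(e):
    """x + y  ->  (x^2 - y^2) / (x - y)"""
    if isinstance(e, tuple) and e[0] == "+":
        _, x, y = e
        return ("/", ("-", ("sq", x), ("sq", y)), ("-", x, y))
    return None

def reassociate(e):
    """(x - y) + z  ->  x - (y - z)"""
    if (isinstance(e, tuple) and e[0] == "+"
            and isinstance(e[1], tuple) and e[1][0] == "-"):
        (_, x, y), z = e[1], e[2]
        return ("-", x, ("-", y, z))
    return None

RULES = [neg_to_sub, flip_plus, reassociate]

def rewrites(e):
    """Every expression obtained by applying one rule at one position in e."""
    results = [r for r in (rule(e) for rule in RULES) if r is not None]
    if isinstance(e, tuple):
        for i in range(1, len(e)):  # recurse into each argument
            for sub in rewrites(e[i]):
                results.append(e[:i] + (sub,) + e[i + 1:])
    return results

# Numerator of the quadratic formula: -b + sqrt(b^2 - 4ac)
ROOT = ("sqrt", ("-", ("sq", "b"), ("*", 4, ("*", "a", "c"))))
numerator = ("+", ("neg", "b"), ROOT)

for candidate in rewrites(numerator):
    print(candidate)
```

Calling `rewrites` again on a candidate gives the recursive rewrites: the `(0 - b) + sqrt(...)` candidate only matches the reassociation rule after the negation rule has been applied.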
Herbie Architecture
series: Approximate expr
Series Expansions
Idea: near-identities
$\sqrt{1 - x} \approx 1 - x/2$ (for $x \approx 0$)
$\frac{-b + \sqrt{b^2 - 4ac}}{2a} \approx \frac{-b + b\left(1 - \frac{4ac}{2b^2}\right)}{2a}$
Bounded Laurent series
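Worked out step by step, the approximation collapses to a very short expression. The derivation below is a sketch that assumes b > 0 (so that $\sqrt{b^2} = b$) and that $4ac/b^2$ is close to 0:

```latex
\begin{align*}
\sqrt{b^2 - 4ac} &= b\sqrt{1 - \frac{4ac}{b^2}}
  \approx b\left(1 - \frac{4ac}{2b^2}\right) = b - \frac{2ac}{b} \\
\frac{-b + \sqrt{b^2 - 4ac}}{2a} &\approx \frac{-b + b - \frac{2ac}{b}}{2a}
  = -\frac{c}{b}
\end{align*}
```

The result, $-\frac{c}{b}$, is the candidate that appears in the regime for the largest values of b.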
Herbie Architecture
simplify: Cancel & clean up
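Simplify Expressions
Difficult! [Caviness ’70]
E-graphs [Nelson ’79]
$\left(\frac{(-b)^2 - (\sqrt{b^2 - 4ac})^2}{-b - \sqrt{b^2 - 4ac}}\right) / 2a$
$= \left(\frac{b^2 - (\sqrt{b^2 - 4ac})^2}{-b - \sqrt{b^2 - 4ac}}\right) / 2a$
$= \left(\frac{b^2 - (b^2 - 4ac)}{-b - \sqrt{b^2 - 4ac}}\right) / 2a$
$= \left(\frac{4ac}{-b - \sqrt{b^2 - 4ac}}\right) / 2a$
$= \frac{2c}{-b - \sqrt{b^2 - 4ac}}$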
Herbie Architecture
cands: Keep all good candidates
regimes: Combine candidates
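Regime Inference
Candidates:
  $\frac{-b + \sqrt{b^2 - 4ac}}{2a}$
  $-\frac{c}{b}$
  $\frac{c}{b} - \frac{b}{a}$
  $\frac{2c}{-b - \sqrt{b^2 - 4ac}}$
Dynamic programming over the regions A, B, C, D:
  $\frac{c}{b} - \frac{b}{a}$ if b ∈ (−∞, −1.15E122]
  $\frac{-b + \sqrt{b^2 - 4ac}}{2a}$ if b ∈ (−1.125E122, 1.06E−304]
  $\frac{2c}{-b - \sqrt{b^2 - 4ac}}$ if b ∈ (1.06E−304, 4.62E63]
  $-\frac{c}{b}$ if b ∈ (4.62E63, ∞)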
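One way to phrase regime selection as dynamic programming is sketched below. It is not Herbie's algorithm as published: the input layout (`errors[k][i]` is the error of candidate k at the i-th sample, with samples sorted by b), the function name `infer_regimes`, and the fixed `split_penalty` that discourages needless branches are all assumptions made for this illustration.

```python
def infer_regimes(errors, split_penalty=1.0):
    """Split sorted samples into contiguous segments, one candidate per segment,
    minimizing total error plus a penalty for each segment used.
    Returns a list of (start, end, candidate_index) with half-open ranges."""
    n = len(errors[0])
    # prefix[k][i] holds the sum of errors[k][:i], so segment costs are O(1).
    prefix = [[0.0] * (n + 1) for _ in errors]
    for k, errs in enumerate(errors):
        for i, err in enumerate(errs):
            prefix[k][i + 1] = prefix[k][i] + err

    best = [0.0] + [float("inf")] * n  # best[i]: cheapest cover of samples [0, i)
    back = [None] * (n + 1)            # back[i]: (segment start, candidate) achieving best[i]
    for i in range(1, n + 1):
        for j in range(i):
            for k in range(len(errors)):
                cost = best[j] + (prefix[k][i] - prefix[k][j]) + split_penalty
                if cost < best[i]:
                    best[i], back[i] = cost, (j, k)

    segments = []
    i = n
    while i > 0:  # walk the back-pointers from the end
        j, k = back[i]
        segments.append((j, i, k))
        i = j
    return segments[::-1]

# Two candidates over six samples: each is accurate on one half of the range.
example = [
    [0, 0, 0, 60, 60, 60],
    [60, 60, 60, 0, 0, 0],
]
print(infer_regimes(example))
```

Each segment boundary then becomes a branch condition on b, which is how a result with several `if b ∈ …` cases can arise.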
Evaluating Herbie
Examples from Hamming’s NMSE
Chapter 3: Function evaluation
28 worked examples & problems:
  Quadratic formula (4)
  Algebraic rearrangement (12)
  Series expansion (12)
  Branches and regimes (2)
[Figure: Average bits correct (longer is better). Accuracy of input, accuracy of output. Dramatic improvement.]
[Figure: Average bits correct (longer is better). Of 12 with answers: same in 8, different in 4. Annotations: handle overflow; more accurate series expansion; no trig factorization; only branch on var.]
Overhead CDF (left is better)
Median: 40%
I wasn't sure how to best rewrite [my]
stable versions of the formulas, and fixed all the divide-by-zero errors.
Harley Montgomery Clustering (bigger, darker blocks better)
Improve accuracy of floating point programs
  Sampling to estimate error
  Reduce global error to per-operation error
  Iteratively rewrite highest-error operations
  Different expressions for different inputs
http://herbie.uwplse.org/
Herbie and Maximum Error
Often improved by Herbie:
  Improvements large (28b) and small (.5b)
  1+b improvement for 10/28 programs
  Fewer high-error pts, same max error.
Bits error (histogram)
Herbie as Part of a Pipeline
Find inaccurate expressions: FPDebug
Improve accuracy: Herbie
Prove accuracy satisfactory: Rosa, FPTaylor
Optimize code: STOKE-FP
Error graphs along a and c
[Plots: error against a and against c]
Finding the rewrite rules
Standard mathematical identities: commutativity, inverses, fractions, trig identities
No numerical methods knowledge
Don’t need to be true identities: false rules do not improve accuracy, so Herbie will ignore them
Regimes often gain ~15 bits
Improvement from regimes (longer is better)
Dot: input program average accuracy
Bar: Herbie result w/out regimes