Daisy
A framework for sound accuracy analysis and optimization of numerical programs

Eva Darulova (eva@mpi-sws.org)
Anastasiia Izycheva (izycheva@in.tum.de)
Observation: many applications use more resources than they really need
➡ particularly important for embedded systems
➡ in numerical programs
Why? challenging to optimise manually
Daisy (automated): general real-valued program with accuracy spec ➡ finite-precision program
real-valued program with accuracy spec:

def sine(x: Real): Real = {
  require(-1.5 <= x && x <= 1.5 && x +/- 1e-11)
  x - (x^3)/6.0 + (x^5)/120.0 - (x^7)/5040.0
} ensuring(res => res +/- 1.001e-11)

finite-precision program, floating-point arithmetic:

def sine(x: Double): Double = {
  require(-1.5 <= x && x <= 1.5)
  x - (x^3)/6.0 + (x^5)/120.0 - (x^7)/5040.0
}

finite-precision program, fixed-point arithmetic:

def sine(x: Long): Long = {
  require(-1.5 <= x && x <= 1.5)
  val _t1 = ((x * x) >> 31)
  val _t2 = ((_t1 * x) >> 30)
  val _t3 = ((_t2 << 30) / 1610612736l)
  val _t4 = ((x << 1) - _t3)
  ...
}
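The fixed-point version implements each multiplication as an integer multiply followed by a shift. A minimal Python sketch of this scheme, assuming a single format with 30 fractional bits for all intermediate values (a simplifying assumption; Daisy picks a separate format per operation, as the varying shift amounts above show):

```python
# Sketch of fixed-point evaluation of the sine polynomial, assuming one
# Q-style format with 30 fractional bits (hypothetical; Daisy assigns
# per-operation formats).
FRAC = 30  # fractional bits

def to_fix(r: float) -> int:
    """Quantize a real value to a fixed-point integer with FRAC fractional bits."""
    return round(r * (1 << FRAC))

def from_fix(i: int) -> float:
    return i / (1 << FRAC)

def mul(a: int, b: int) -> int:
    """Fixed-point multiply: the product has 2*FRAC fractional bits, so shift
    right by FRAC to renormalize (the truncation is exactly the roundoff)."""
    return (a * b) >> FRAC

def sine_fix(x: int) -> int:
    x3 = mul(mul(x, x), x)
    x5 = mul(mul(x3, x), x)
    x7 = mul(mul(x5, x), x)
    return (x - mul(x3, to_fix(1 / 6.0))
              + mul(x5, to_fix(1 / 120.0))
              - mul(x7, to_fix(1 / 5040.0)))

x = 0.5
approx = from_fix(sine_fix(to_fix(x)))
exact = x - x**3 / 6 + x**5 / 120 - x**7 / 5040
assert abs(approx - exact) < 1e-8  # roundoff stays tiny at 30 fractional bits
```

The truncation at each `>> FRAC` is the roundoff the static analysis must bound.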
How to measure accuracy?

def sine(x: Real): Real = {
  require(-1.5 <= x && x <= 1.5 && x +/- 1e-11)
  x - (x^3)/6.0 + (x^5)/120.0 - (x^7)/5040.0
} ensuring(res => res +/- 1.001e-11)

absolute error:  max_{x ∈ I} |f(x) − f̃(x̃)|

relative error:  max_{x ∈ I} |f(x) − f̃(x̃)| / |f(x)|

Challenge: automatically and accurately bound worst-case errors
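Both measures compare the real-valued f against the finite-precision f̃. They can be estimated by sampling, with Python's `Fraction` standing in for exact real arithmetic; this sketch assumes the sine polynomial from the example:

```python
from fractions import Fraction

def f_real(x: Fraction) -> Fraction:
    # real-valued semantics of the sine polynomial (exact rational arithmetic)
    return x - x**3 / 6 + x**5 / 120 - x**7 / 5040

def f_float(x: float) -> float:
    # the same expression evaluated in double precision
    return x - x**3 / 6.0 + x**5 / 120.0 - x**7 / 5040.0

abs_err = Fraction(0)
rel_err = Fraction(0)
N = 1000
for i in range(N + 1):
    x = -1.5 + 3.0 * i / N           # sample I = [-1.5, 1.5]
    exact = f_real(Fraction(x))      # Fraction(x) is the exact value of the double x
    approx = Fraction(f_float(x))
    abs_err = max(abs_err, abs(exact - approx))
    if exact != 0:
        rel_err = max(rel_err, abs((exact - approx) / exact))

print(float(abs_err))  # a lower bound on max |f(x) - f~(x~)|
```

Sampling only ever produces a lower bound on the maximum, which is why a sound static analysis is needed for worst-case guarantees.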
Static dataflow analysis: for each arithmetic operation, track the real-valued range and the worst-case roundoff error

Challenge: tight bounds for nonlinear arithmetic
Challenge: conditionals (control-flow may diverge)
Challenge: loops (errors may grow unboundedly)

abstract domains: intervals, affine arithmetic
additional techniques: SMT-based range refinement, interval subdivision

error bounds often within one order of magnitude of true errors
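A minimal sketch of such an analysis over the interval domain: each abstract value carries a real-valued range and an accumulated error bound, and every operation commits a fresh roundoff of at most the unit roundoff times the result magnitude. This glosses over outward rounding of endpoints and the affine/SMT refinements Daisy also uses:

```python
# Minimal interval-domain sketch: an abstract value is (range, error bound).
EPS = 2.0 ** -53  # unit roundoff for IEEE 754 double precision

class Val:
    def __init__(self, lo, hi, err=0.0):
        self.lo, self.hi, self.err = lo, hi, err

    def __add__(self, other):
        lo, hi = self.lo + other.lo, self.hi + other.hi
        # propagated errors add; the addition itself commits a new roundoff
        # of at most eps * max|result|
        new_round = EPS * max(abs(lo), abs(hi))
        return Val(lo, hi, self.err + other.err + new_round)

    def __mul__(self, other):
        ps = [self.lo * other.lo, self.lo * other.hi,
              self.hi * other.lo, self.hi * other.hi]
        lo, hi = min(ps), max(ps)
        # |x*y - x~*y~| <= |x|*err_y + |y|*err_x + err_x*err_y, plus new roundoff
        mx = max(abs(self.lo), abs(self.hi))
        my = max(abs(other.lo), abs(other.hi))
        prop = mx * other.err + my * self.err + self.err * other.err
        return Val(lo, hi, prop + EPS * max(abs(lo), abs(hi)))

x = Val(-1.5, 1.5)   # exact input
x2 = x * x
print((x2.lo, x2.hi), x2.err)
```

Note the nonlinear-arithmetic challenge in action: for x ∈ [−1.5, 1.5] the analysis reports x·x ∈ [−2.25, 2.25], although the true range is [0, 2.25], because intervals forget that both factors are the same variable.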
state-of-the-art definition of relative error:

  max_{x⃗ ∈ I} |f(x⃗) − f̃(x̃⃗)| / min_{x⃗ ∈ I} |f(x⃗)|

Challenge: tight bounds
Challenge: division by zero

Our solution: assume NO division by zero over I, and bound
  max_{x⃗ ∈ I} |(f(x⃗) − f̃(x̃⃗)) / f(x⃗)| directly
Naive approach: replace f̃(x̃⃗) with its abstraction f̃(x⃗, e⃗, d⃗) according to IEEE 754

  f(x) = x + 0.3
  f̃(x⃗, e⃗, d⃗) = (x(1 + e_x) + d_x + 0.3(1 + e_0.3) + d_0.3)(1 + e_+) + d_+

without the denormal terms d_i:

  f̃(x⃗, e⃗, d⃗) = (x(1 + e_x) + 0.3(1 + e_0.3))(1 + e_+)

  err_rel = max_{x ∈ I, |e_i| ≤ ε_M, |d_i| ≤ δ} |(f̃(x⃗, e⃗, d⃗) − (x + 0.3)) / (x + 0.3)|

Does not scale!

Our solution: take a first-order Taylor approximation of f̃ in the error variables,

  f̃(x⃗, e⃗, d⃗) = f̃(x⃗, 0⃗, 0⃗) + Σ_{i=1}^{k} ∂f̃/∂e_i (x⃗, 0⃗, 0⃗) e_i + Σ_{i=1}^{k} ∂f̃/∂d_i (x⃗, 0⃗, 0⃗) d_i + R(x⃗, e⃗, d⃗)
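The abstraction used above can be checked empirically: IEEE 754 round-to-nearest guarantees fl(a + b) = (a + b)(1 + e) with |e| ≤ ε_M = 2⁻⁵³ for doubles (denormal terms d_i aside). A small check with exact rational arithmetic:

```python
from fractions import Fraction

EPS_M = Fraction(1, 2**53)  # unit roundoff for round-to-nearest doubles

def rel_round_err(a: float, b: float) -> Fraction:
    """Return |e| such that fl(a+b) = (a+b)(1+e), computed exactly with rationals."""
    exact = Fraction(a) + Fraction(b)
    if exact == 0:
        return Fraction(0)
    return abs((Fraction(a + b) - exact) / exact)

# fl(x + 0.3) = (x + 0.3)(1 + e) with |e| <= eps_M, for any sampled x
for i in range(1, 1000):
    x = i / 7.0
    assert rel_round_err(x, 0.3) <= EPS_M
```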
Evaluate the simplified expression using interval subdivision over x ∈ I, |e_i| ≤ ε_M, |d_i| ≤ δ.

Challenge: difficult to obtain tight bounds
Result: error bounds tighter than the state-of-the-art tools compute
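Interval subdivision tightens such bounds by splitting the input range and taking the union of per-piece results. A generic illustration on the range of x·x (a toy example, not Daisy's actual implementation):

```python
def mul_range(lo1, hi1, lo2, hi2):
    # naive interval multiplication: range is spanned by the endpoint products
    ps = [lo1 * lo2, lo1 * hi2, hi1 * lo2, hi1 * hi2]
    return min(ps), max(ps)

def square_range(lo, hi, pieces):
    """Bound the range of x*x over [lo, hi] by naive interval multiplication
    on `pieces` equal subintervals, taking the union of the piecewise bounds."""
    best_lo, best_hi = float("inf"), float("-inf")
    w = (hi - lo) / pieces
    for i in range(pieces):
        a, b = lo + i * w, lo + (i + 1) * w
        l, h = mul_range(a, b, a, b)   # still ignores that both factors are x
        best_lo, best_hi = min(best_lo, l), max(best_hi, h)
    return best_lo, best_hi

print(square_range(-1.5, 1.5, 1))   # one piece: gross overestimate [-2.25, 2.25]
print(square_range(-1.5, 1.5, 4))   # four pieces: lower bound tightens to 0
```

Subdivision shrinks the overapproximation because each subinterval exhibits less variable correlation for the domain to lose.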
Recap: Daisy turns the real-valued sine with its accuracy spec into a finite-precision program, verified by static analysis for linear and nonlinear arithmetic over a (short) numerical arithmetic expression:

✔ floating-point arithmetic (sine over Double)
✔ fixed-point arithmetic (sine over Long)
✔ mixed-precision
Rewriting: a + (b + c) ≠ (a + b) + c in finite-precision arithmetic

before: assumed fixed evaluation order for soundness
Opportunity: choose the evaluation order that commits the least roundoff
Goal: find an order which minimizes the error
Challenge: efficient search
Our solution: genetic (heuristic) algorithm
significant (up to 70%) improvements in errors possible
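The non-associativity that creates this search space is easy to exhibit with doubles:

```python
# Floating-point addition is not associative: with doubles,
# regrouping a + (b + c) vs (a + b) + c can change the result entirely.
a, b, c = 1e16, -1e16, 1.0

left = (a + b) + c   # 0.0 + 1.0 = 1.0
right = a + (b + c)  # b + c rounds back to -1e16, so the 1.0 is lost: 0.0

assert left == 1.0
assert right == 0.0
```

With more operands the number of orderings grows combinatorially, hence the heuristic search.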
def rigidBody(x1: Double, x2: Double, x3: Double): Double = {
  require(-15.0 ≤ x1 ≤ 15 && -15.0 ≤ x2 ≤ 15.0 && -15.0 ≤ x3 ≤ 15)
  ...
}

satisfies absolute error bound 3.5e-13, but not 1.75e-13

What if double floating-point precision is just not enough?

def rigidBody(x1: Quad, x2: Quad, x3: Quad): Quad = {
  require(-15.0 ≤ x1 ≤ 15 && -15.0 ≤ x2 ≤ 15.0 && -15.0 ≤ x3 ≤ 15)
  ...
}

satisfies absolute error bound 1.5e-28, but is significantly slower than double precision

Use mixed-precision:

def rigidBody(x1: Double, x2: Double, x3: Double): Quad = {
  ...
}

satisfies absolute error bound 1.75e-13, 28% faster than uniform quad precision
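Python has no quad type, so this sketch uses single vs. double precision to illustrate the same trade-off, rounding after every operation to mimic a uniform low-precision assignment. It reuses the sine polynomial from earlier; the operation order chosen here is one possible evaluation order, not one taken from the slides:

```python
import struct
from fractions import Fraction

def to_f32(x: float) -> float:
    """Round a double to the nearest IEEE 754 single (a stand-in for the
    double-vs-quad trade-off, using single-vs-double instead)."""
    return struct.unpack("f", struct.pack("f", x))[0]

def poly_f64(x):
    return x - x**3 / 6.0 + x**5 / 120.0 - x**7 / 5040.0

def poly_f32(x):
    # round after every operation, mimicking uniform single precision
    # (double rounding effects are ignored in this sketch)
    x = to_f32(x)
    x2 = to_f32(x * x)
    x3 = to_f32(x2 * x)
    x5 = to_f32(to_f32(x3 * x) * x)
    x7 = to_f32(to_f32(x5 * x) * x)
    r = to_f32(x - to_f32(x3 / 6.0))
    r = to_f32(r + to_f32(x5 / 120.0))
    return to_f32(r - to_f32(x7 / 5040.0))

def max_err(f, n=200):
    """Sampled absolute error against an exact rational reference."""
    worst = Fraction(0)
    for i in range(n + 1):
        x = -1.5 + 3.0 * i / n
        q = Fraction(x)
        exact = q - q**3 / 6 + q**5 / 120 - q**7 / 5040
        worst = max(worst, abs(Fraction(f(x)) - exact))
    return float(worst)

# lower precision, dramatically larger error
assert max_err(poly_f32) > 1000 * max_err(poly_f64)
```

A mixed assignment keeps high precision only where the analysis shows it is needed, recovering most of the speed of the low-precision version.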
Goal: automatically find a mixed-precision type assignment which satisfies the error bound at the lowest runtime cost
Challenge: efficient search & performance estimation
Our solution: delta-debugging-based search [3]
rewriting (improves error) + mixed-precision (improves performance) = improve performance even more

def rigidBody(x1: Quad, x2: Quad, x3: Double): Double = {
  (-D(x1 *Q x2) -D (x1 +Q x3)) -D ((x2 *Q 2.0f) *D x3)
}

satisfies absolute error bound 1.75e-13, 43% faster than uniform quad precision

Challenge: optimizing both at the same time does not scale
Our solution: rewrite first, then tune the precision assignment
Daisy: a framework for the analysis and optimisation of numerical programs

Supported Features:
➡ … for absolute error analysis with intervals, more features to come

Design Goals:
➡ dataflow analysis (as in Rosa, Fluctuat)
➡ optimisation-based analysis (as in FPTaylor, preliminary implementation)
➡ interval subdivision for all techniques

Several tools exist today: Rosa, Fluctuat, Gappa, FPTaylor, Real2Float, PRECiSA.

Daisy source code: https://github.com/malyzajko/daisy

[1] Towards a Compiler for Reals. E. Darulova, V. Kuncak. ACM Transactions on Programming Languages and Systems (TOPLAS), 2017.
[2] On Sound Relative Error Bounds for Floating-Point Arithmetic. A. Izycheva, E. Darulova. FMCAD 2017.
[3] Sound Mixed-Precision Optimization with Rewriting. E. Darulova, E. Horn, S. Sharma. ICCPS 2018.
[4] A Verified Certificate Checker for Floating-Point Error Bounds. H. Becker, E. Darulova, M. O. Myreen. arXiv:1707.02115, 2017.