Do Developers Understand IEEE Floating Point?
Peter Dinda, Conor Hetland
Prescience Lab, Department of EECS, Northwestern University
pdinda.org, presciencelab.org
Survey Available Here, Please Participate! Paper in IPDPS 2018
Paper in a Nutshell: Not Really
– Aimed at practitioners likely to use FP
– Quizzes for core, optimization, and suspicion of results
– First study of this kind
– … and don’t know it
– Some factors mitigate, but none particularly well
– … and do know it
– … but similar to students in a sophomore course
Outline
– Important caveat!
– What are we doing?
For a Long Time…
– Developer: Focused on scientific and engineering uses; some understanding of numerical methods; assumption/understanding of IEEE floating point
– IEEE 754(-2008) Standard: Stable, pretty much universal standard since early 1980s; considerable complexity
– Compiler (Optimizations): Small set of compilers used, slow change, difficult to break IEEE compliance
– Hardware (Optimizations): Small set of hardware, IEEE compliance universal, slow change
The Concerns Now
– Developer: Dramatic expansion in uses (e.g., machine learning, analytics, big data, and other expanding uses of FP); less knowledge of numerical methods, and the standard
– IEEE 754(-2008) Standard: Stable, pretty much universal standard since early 1980s; considerable complexity
– Compiler (Optimizations): Fast evolution (e.g., numerous compilers, automatic precision reduction, approximate computing, optimization flag choice, automatic optimization setting search, power/energy)
– Hardware (Optimizations): Fast evolution (e.g., hardware diversity (GPUs, FPGAs, ARM), half-floats, different denorm handling, non-IEEE compliance, power/energy)
Do Developers Understand….
– Core Focus
– Optimization Focus
– … and Suspicion
Study Design
– Participant background (for factor analysis)
– Core quiz
– Optimization quiz
– Suspicion quiz
– http://presciencelab.org/float
Study Design
– Pose questions that might arise during software development
– Don’t test whether they remember terminology; test whether they can see the concept
Core Quiz
arithmetic, even though it looks like it
– Commutativity, associativity, distributivity,
signaling…
integer arithmetic either...
– Overflow (saturation), underflow, NaN, signaling
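The properties named on this slide can be checked directly. The following is a minimal Python sketch, not a question taken from the survey; the function name `core_quiz_checks` and the constants are illustrative assumptions, and it assumes Python floats are IEEE 754 binary64, as they are on common platforms.

```python
import math

def core_quiz_checks():
    """Evaluate identities that hold for real numbers but not always for floats."""
    a, b, c = 0.1, 0.2, 0.3
    nan = float("nan")
    return {
        # Addition is commutative for these (non-NaN) operands.
        "commutative": a + b == b + a,
        # Grouping changes where rounding happens, so the two sums differ.
        "associative": (a + b) + c == a + (b + c),
        # Multiplying before or after the addition rounds differently.
        "distributive": 100.0 * (a + b) == 100.0 * a + 100.0 * b,
        # NaN compares unequal to everything, including itself.
        "nan_equals_itself": nan == nan,
        # Unlike wrapping integer arithmetic, overflow saturates to infinity.
        "overflow_is_inf": math.isinf(1e308 * 10.0),
    }

if __name__ == "__main__":
    for name, holds in core_quiz_checks().items():
        print(f"{name}: {holds}")
```

Only the commutativity and overflow checks come out true here; the associativity, distributivity, and NaN self-equality checks come out false.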
Example
Optimization Quiz
compliance
– MADD, Flush-to-Zero
compliance
– What’s the highest -O level that is standard compliant?
– Is --fast-math standards compliant?
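One reason MADD (fused multiply-add) raises compliance questions is that a fused operation rounds once, while a separate multiply and add round twice, so the two can produce different results. The sketch below illustrates this in Python, assuming IEEE 754 binary64 floats. `fused_madd` is a hypothetical helper that emulates single rounding with exact rational arithmetic; it does not invoke a hardware instruction, and the operands are chosen only to make the difference visible.

```python
from fractions import Fraction

def unfused_madd(a, b, c):
    """Multiply and round, then add and round: two roundings."""
    return a * b + c

def fused_madd(a, b, c):
    """Emulate a fused multiply-add: compute a*b + c exactly, then round once."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)

# Illustrative operands: the exact product is 1 - 2**-60,
# which rounds to 1.0 when stored as a double.
a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = -1.0

if __name__ == "__main__":
    print("unfused:", unfused_madd(a, b, c))  # the tiny term is lost
    print("fused:  ", fused_madd(a, b, c))    # the tiny term survives
```

With these operands the unfused form returns exactly zero, while the emulated fused form returns -2**-60. Whether a compiler is allowed to substitute one form for the other depends on its flags, which is the kind of question the optimization quiz asks about.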
Suspicion Quiz
numeric problems
when your code produces a…
– Overflow, underflow, precision (rounding), invalid (NaN), or denormalized result
through
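The result classes the suspicion quiz asks about can be produced and recognized in a few lines. This is a minimal Python sketch under the assumption of IEEE 754 binary64 floats; `classify_result` is a hypothetical helper, not part of the survey. It can only recognize conditions that leave a trace in the final value: precision loss (rounding) and underflow to zero leave a result that looks like any other number.

```python
import math
import sys

def classify_result(x: float) -> str:
    """Label a result by the exceptional condition its value reveals, if any."""
    if math.isnan(x):
        return "invalid (NaN)"
    if math.isinf(x):
        return "overflow (infinity)"
    if x != 0.0 and abs(x) < sys.float_info.min:
        # Nonzero but below the smallest normal double.
        return "denormalized"
    return "normal or zero"

if __name__ == "__main__":
    samples = {
        "inf - inf": float("inf") - float("inf"),
        "1e308 * 10": 1e308 * 10.0,
        "smallest normal / 4": sys.float_info.min / 4.0,
        "5e-324 / 2 (underflows to zero)": 5e-324 / 2.0,
        "0.1 + 0.2 (rounded)": 0.1 + 0.2,
    }
    for label, value in samples.items():
        print(f"{label}: {value!r} -> {classify_result(value)}")
```

The last two samples are classified as "normal or zero" even though underflow and rounding occurred, which is why those conditions are easy to overlook.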
Example
Participant Recruitment Goals
management for science and engineering
– Both as main and secondary roles
Biggest Caveat: Not a random sample
Participant Recruitment Process
– Relevant department chairs, center directors, faculty, postdocs, and Ph.D. students at NU
– Highest-level personal contacts at national labs
– Faculty contacts at >20 universities
to people relevant to our recruitment goals
Participant Background / Factors
– Plus additional 52 undergrads for suspicion quiz
– 2 pages of details in paper
– Will highlight a few as we go on
[Figure: count of participants by number of core questions answered correctly, with the chance level marked.]
Experience With Code Matters (slightly)
[Figure: number of questions correct, incorrect, don’t know, and unanswered, by contributed codebase size (100-1K, 1K-10K, 10K-100K, 100K-1M, >1M), with the chance level marked.]
Figure 16: Effect of Contributed Codebase Size on core
Area Matters (slightly)
[Figure: number of questions correct, incorrect, don’t know, and unanswered, by area (EE, CS, CE, Math, PhysSci, Eng), with the chance level marked.]
Participants Aware of Not Understanding Optimizations (HW/SW)
[Figure: number of questions correct, incorrect, don’t know, and unanswered, by area (EE, CE, CS, Math, PhysSci, Eng).]
Figure 20: Effect of Area on optimization quiz scores.
“don’t know”
[Figure: percent of participants reporting each suspicion level (1 to 5) for overflow, underflow, precision, invalid, and denorm results.]
Can you tell these graphs apart? One is undergrads in an introductory systems course
1/3 do not find NaN Maximally Suspicious
Caveats
– We cannot be sure we have hit our recruitment goals
– Survey design was iterated based on feedback
– But these are users
Potential Actions
– Much like PL and compilers community did with undefined behavior in C
Potential Actions
– Work in progress
precision arithmetic
– Work in progress
and hardware optimizations
– “Achievement Unlocked”
– Work in progress
A Work in Progress: FPSpy
application binary
– Gets out of the way on conflict with application
debugger-style techniques to track issues
– Aggregate mode:
– Individual mode:
unmodified applications
– Does developer confusion as measured in present study manifest in codes in current use?
A Work In Progress: FPKernel
latency and overhead in a kernel-only model
– Like our Hybrid Run-Time (HRT) scheme and the Nautilus Kernel Framework that supports it
arbitrary precision software FP to create simple arithmetic model for programmer
– FP exceptions trigger transition to software FP
– NaN boxing / signaling NaN for values
– Made more practical by fast FP exceptions
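The slides do not say how FPKernel encodes values, so the following only sketches the general NaN-boxing idea: a NaN's unused mantissa bits can carry a payload, for example a handle to a software arbitrary-precision value (that use is an assumption here). It is written in Python with hypothetical names `box` and `unbox`, assumes IEEE 754 binary64, and uses a quiet NaN rather than a signaling one, since signaling NaNs may be quieted when moved through some hardware.

```python
import math
import struct

QUIET_NAN_BITS = 0x7FF8_0000_0000_0000  # exponent all ones, quiet bit set
PAYLOAD_MASK = (1 << 51) - 1            # mantissa bits below the quiet bit

def box(payload: int) -> float:
    """Hide an integer payload in the mantissa of a quiet NaN."""
    if not 0 <= payload <= PAYLOAD_MASK:
        raise ValueError("payload does not fit in 51 bits")
    bits = QUIET_NAN_BITS | payload
    return struct.unpack("<d", struct.pack("<Q", bits))[0]

def unbox(value: float) -> int:
    """Recover the payload from a boxed NaN."""
    if not math.isnan(value):
        raise ValueError("not a boxed value")
    bits = struct.unpack("<Q", struct.pack("<d", value))[0]
    return bits & PAYLOAD_MASK

if __name__ == "__main__":
    boxed = box(12345)
    print(math.isnan(boxed), unbox(boxed))
```

The sketch only round-trips the value through memory. Whether a payload survives arithmetic on the NaN depends on the hardware, which is one reason a design like this would pair boxing with exception handling.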
[Figure: Minimum Time to Floating Point Exception Handler, in cycles, Linux User-level (U) Versus Nautilus Kernel-level (K), on the R415, R815, and KNL machines.]
Min: kernel-level is 6.5-30x faster
Median: kernel-level is 3-30x faster
Variance: kernel-level is 17-95x better
Paper in a Nutshell: Not Really
– Aimed at practitioners likely to use FP
– Quizzes for core, optimization, and suspicion of results
– First study of this kind
– … and don’t know it
– Some factors mitigate, but none particularly well
– … and do know it
– … but similar to students in a sophomore course
For More Information
– pdinda@northwestern.edu
– http://pdinda.org
– ConorHetland2015@u.northwestern.edu
– http://presciencelab.org/float
– http://presciencelab.org
– NSF, DOE