Low-Precision Arithmetic
CS4787 Lecture 21 — Spring 2020
The standard approach
Single-precision floating point (FP32): 32-bit floating point numbers
Bit layout (bits 31..0): sign (1 bit) | 8-bit exponent | 23-bit mantissa
represented number = (−1)^sign · 2^(exponent − 127) · 1.b22 b21 b20 … b0   (normal numbers; the leading 1 is implicit)
Denormal (subnormal) numbers, when the exponent field is all zeros:
represented number = (−1)^sign · 2^(−126) · 0.b22 b21 b20 … b0
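To make the bit layout concrete, here is a small sketch (not from the slides) that unpacks the sign, exponent, and mantissa fields of a float32 using Python's standard library and recomputes the represented value from the formulas above, including the denormal case.

```python
import struct

def decode_float32(x):
    """Unpack a float32 into (sign, exponent, mantissa) bit fields and
    recompute its value from the formulas above."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))   # raw 32-bit pattern
    sign     = (bits >> 31) & 0x1          # bit 31
    exponent = (bits >> 23) & 0xFF         # bits 30..23 (8 bits)
    mantissa = bits & 0x7FFFFF             # bits 22..0  (23 bits)

    if exponent == 0:
        # denormal: no implicit leading 1, exponent fixed at -126
        value = (-1) ** sign * 2.0 ** (-126) * (mantissa / 2 ** 23)
    elif exponent == 255:
        value = float("nan") if mantissa else (-1) ** sign * float("inf")
    else:
        # normal: implicit leading 1, biased exponent
        value = (-1) ** sign * 2.0 ** (exponent - 127) * (1 + mantissa / 2 ** 23)
    return sign, exponent, mantissa, value

print(decode_float32(-6.25))   # sign=1, exponent=129; value reconstructs -6.25
print(decode_float32(1e-40))   # denormal: exponent field is 0
```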
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
sign 8-bit exponent 23-bit mantissa
If x ∈ R is a real number within the range of a floating-point representation, and x̃ is the closest representable floating-point number to it, then |x̃ − x| = |round(x) − x| ≤ |x| · εmachine. Here, εmachine is called the machine epsilon and bounds the relative error of the format.
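A quick numerical check of this bound (an illustrative sketch, not part of the slides), using NumPy's float32 type: rounding a real number to the nearest float32 incurs relative error at most εmachine.

```python
import numpy as np

eps32 = np.finfo(np.float32).eps          # machine epsilon for float32, about 1.2e-7
x = 0.1                                    # not exactly representable in binary
x_rounded = float(np.float32(x))           # nearest representable float32
rel_error = abs(x_rounded - x) / abs(x)
print(eps32, rel_error, rel_error <= eps32)   # relative error is within eps
```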
If x and y are real numbers representable in a floating-point format, ⊙ denotes an (infinite-precision) binary operation (such as +, ·, etc.), and • denotes the floating-point version of that operation, then x • y = round(x ⊙ y) and |(x • y) − (x ⊙ y)| ≤ |x ⊙ y| · εmachine, as long as the result is in range.
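The same model can be checked for arithmetic: a float32 operation behaves like the exact operation followed by rounding. A minimal sketch, using float64 as a stand-in for the exact result:

```python
import numpy as np

eps32 = np.finfo(np.float32).eps
a, b = np.float32(1.0) / np.float32(3.0), np.float32(2.0) / np.float32(7.0)
exact   = float(a) * float(b)     # float64 product, effectively exact here
rounded = float(a * b)            # float32 product: round(a * b)
print(abs(rounded - exact) / abs(exact) <= eps32)   # True: relative error within eps
```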
If the result is out of range (overflow or underflow), we can get more error than the model predicts.
Hardware used for ML with smaller numbers: Google’s TPU, NVIDIA’s GPUs, Intel’s NNP
A low-precision alternative
Half-precision floating point (FP16): 16-bit floating point numbers
x = (−1)^sign bit · 2^(exponent − 15) · 1.significand₂
Bit layout (bits 15..0): sign (1 bit) | 5-bit exponent | 10-bit mantissa
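NumPy supports IEEE half precision directly, so we can query its key constants (a quick sketch; the values match the comparison given later in the lecture):

```python
import numpy as np

fp16 = np.finfo(np.float16)
print(fp16.eps)    # machine epsilon, about 9.8e-4
print(fp16.max)    # largest representable value: 65504.0 (about 6.5e4)
print(fp16.tiny)   # smallest positive normal number (about 6.1e-5)
```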
SIMD Precision
SIMD Parallelism (one 256-bit vector register):
64-bit float vector (4 × F64): 4 multiplies/cycle (vmulpd instruction)
32-bit float vector (8 × F32): 8 multiplies/cycle (vmulps instruction)
16-bit int vector (16 lanes): 16 multiplies/cycle (vpmaddwd instruction)
8-bit int vector (32 lanes): 32 multiplies/cycle (vpmaddubsw instruction)
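NumPy's elementwise kernels may use these SIMD instructions under the hood, so the effect can be observed roughly by timing multiplies at different precisions. This is only an illustrative sketch: the exact speedups depend on the CPU, and the measurement also includes memory traffic.

```python
import time
import numpy as np

def mult_throughput(dtype, n=1 << 22, reps=20):
    """Rough multiplies-per-nanosecond for elementwise multiplication at one precision."""
    a = np.ones(n, dtype=dtype)
    b = np.ones(n, dtype=dtype)
    start = time.perf_counter()
    for _ in range(reps):
        np.multiply(a, b)
    elapsed = time.perf_counter() - start
    return n * reps / elapsed / 1e9   # multiplies per nanosecond

for dtype in (np.float64, np.float32, np.int16, np.int8):
    print(dtype.__name__, round(mult_throughput(dtype), 2), "multiplies/ns")
```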
Precision in DRAM
Memory Throughput (assuming ~40 GB/sec memory bandwidth):
64-bit float vector: 5 numbers/ns
32-bit float vector: 10 numbers/ns
16-bit int vector: 20 numbers/ns
8-bit int vector: 40 numbers/ns
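The numbers-per-nanosecond figures follow directly from dividing the bandwidth by the element size; a one-line check (assuming the ~40 GB/sec figure from the slide):

```python
bandwidth_bytes_per_ns = 40e9 / 1e9   # ~40 GB/sec = 40 bytes per nanosecond
for name, size in [("float64", 8), ("float32", 4), ("int16", 2), ("int8", 1)]:
    print(name, bandwidth_bytes_per_ns / size, "numbers/ns")
# float64 5.0, float32 10.0, int16 20.0, int8 40.0 numbers/ns
```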
The same idea applies to communication (assuming ~40 GB/sec network bandwidth):
32-bit float vector: 10 numbers/ns
16-bit int vector: 20 numbers/ns
8-bit int vector: 40 numbers/ns
Specialized lossy compression: >40 numbers/ns
[Diagram: a float32 multiplier takes two float32 inputs and produces a float32 output, while an int16 multiplier takes two int16 inputs; the precision of the representation in memory affects the algorithm's runtime on the machine.]
For single-precision (32-bit) floats, εmachine ≈ 1.2 × 10^−7 (about 6.0 × 10^−8 if we account for rounding). The representable range is about ±3.4 × 10^38, with a smallest positive denormal of about 1.4 × 10^−45.
For half-precision (16-bit) floats, εmachine ≈ 9.8 × 10^−4, and the largest representable value is only about 6.5 × 10^4.
Micikevicius et al. “Mixed Precision Training.” arXiv, 2017.
One way to address limited range: more exponent bits
Q: What can we say about the range of bfloat16 numbers as compared with IEEE half-precision floats and single-precision floats? How does their machine epsilon compare?
Bit layout of bfloat16 (bits 15..0): sign (1 bit) | 8-bit exponent | 7-bit mantissa
(Intuition: numbers in ML workloads are more likely to underflow than they are to overflow.)
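To answer the question numerically, here is a sketch: the float16 and float32 constants come from NumPy, while the bfloat16 values are computed by hand from its layout above (8-bit exponent like float32, 7-bit mantissa).

```python
import numpy as np

for name, f in [("float16", np.finfo(np.float16)), ("float32", np.finfo(np.float32))]:
    print(name, "eps =", f.eps, "max =", f.max)

# bfloat16: same 8-bit exponent as float32, but only a 7-bit mantissa
bf16_eps = 2.0 ** -7                       # ~7.8e-3: much coarser than float16
bf16_max = (2 - 2.0 ** -7) * 2.0 ** 127    # ~3.4e38: same range as float32
print("bfloat16 eps =", bf16_eps, "max =", bf16_max)
```

So bfloat16 covers roughly the same range as float32 (its exponent bits match), while its machine epsilon (2^−7 ≈ 7.8 × 10^−3) is coarser than float16's (≈ 9.8 × 10^−4).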
An alternative to low-precision floating point
x = (−1)^sign bit · (integer part + 2^(−q) · fractional part)
Bit layout (bits 15..0): sign (1 bit) | fixed-point number (15 bits: integer and fractional parts)
Q: Why might we want to use fixed-point numbers for ML? Have you used something like fixed-point numbers in a programming assignment?
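A minimal sketch of fixed-point quantization (the function names and the choice of 16 bits with q = 8 fractional bits are illustrative, not from the slides): values are stored as small integers and interpreted with an implicit scale of 2^(−q).

```python
import numpy as np

Q = 8                       # number of fractional bits (illustrative choice)
SCALE = 2.0 ** Q

def to_fixed(x):
    """Quantize real numbers to 16-bit fixed point with Q fractional bits."""
    ints = np.clip(np.round(np.asarray(x) * SCALE), -2 ** 15, 2 ** 15 - 1)
    return ints.astype(np.int16)

def from_fixed(f):
    """Interpret the stored integers: x = (integer part + fractional part) * 2^(-Q)."""
    return f.astype(np.float64) / SCALE

x = np.array([3.14159, -1.25, 100.007])
print(to_fixed(x))              # stored int16 values
print(from_fixed(to_fixed(x)))  # 3.140625, -1.25, 100.0078125: error at most 2^(-Q-1)
```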
A powerful hybrid approach
Store one 8-bit shared exponent for a whole block of numbers; this works well when the numbers in the block lie in the same range.
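A sketch of this hybrid (often called block floating point; the function names are mine): all entries of a block share one exponent chosen from the largest magnitude, and each entry stores only a small integer mantissa.

```python
import numpy as np

def block_quantize(x, mantissa_bits=8):
    """Share one exponent across the whole block; keep a small int mantissa per entry."""
    shared_exp = int(np.ceil(np.log2(np.max(np.abs(x)))))    # one shared exponent per block
    scale = 2.0 ** (shared_exp - (mantissa_bits - 1))
    mantissas = np.round(x / scale).astype(np.int32)          # small signed integers
    return mantissas, shared_exp, scale

def block_dequantize(mantissas, scale):
    return mantissas * scale

x = np.array([0.5, -1.7, 2.2, 0.03])
m, e, s = block_quantize(x)
print(m, e)                     # integer mantissas plus one shared exponent
print(block_dequantize(m, s))   # close to x; the smallest entries lose the most precision
```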
A more specialized approach
Choose a custom set of quantization points: these are the numbers a particular low-precision bit string represents.
See “The ZipML Framework for Training Models with End-to-End Low Precision: The Cans, the Cannots, and a Little Bit of Deep Learning” (Zhang et al., 2017).
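One way to implement this idea (a sketch in the spirit of ZipML, not the paper's exact algorithm): pick an arbitrary sorted set of representable points, then round each value randomly to one of its two neighboring points with probabilities chosen so that the result is unbiased.

```python
import numpy as np

def quantize_to_points(x, points, rng=np.random.default_rng()):
    """Unbiased quantization onto an arbitrary sorted set of representable points."""
    points = np.asarray(points, dtype=np.float64)
    x = np.clip(np.asarray(x, dtype=np.float64), points[0], points[-1])
    hi = np.clip(np.searchsorted(points, x), 1, len(points) - 1)  # index of upper neighbor
    lo = hi - 1
    gap = points[hi] - points[lo]
    p_up = (x - points[lo]) / gap        # probability of rounding up, so E[output] = x
    go_up = rng.random(x.shape) < p_up
    return np.where(go_up, points[hi], points[lo])

points = [-1.0, -0.25, 0.0, 0.1, 0.5, 1.0]    # an arbitrary low-precision "codebook"
x = np.array([0.3, -0.6, 0.07])
print(quantize_to_points(x, points))
```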
Use different precisions for different parts of training.
[Figure: mixed-precision training data flow — low-precision weights and the previous layer's activations feed the forward pass; the backward pass produces activation and weight gradients; a higher-precision weight accumulator stores the master copy of the weights.]
Types of signals in backpropagation:
weights/parameters
activation and gradient vectors
updates communicated among parallel workers
These signals behave differently, so it can be hard to get a sense of how they would perform at low precision.
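This mixed-precision recipe (low-precision forward/backward signals, a higher-precision weight accumulator, and loss scaling) is what frameworks automate. Below is a hedged sketch using PyTorch's automatic mixed precision, assuming a CUDA device is available; the model, data, and hyperparameters are placeholders, not from the slides.

```python
import torch

# Illustrative placeholders: any model, optimizer, and (x, y) batches would do.
model = torch.nn.Linear(784, 10).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)   # FP32 master weights
scaler = torch.cuda.amp.GradScaler()                      # loss scaling to limit FP16 underflow
loss_fn = torch.nn.CrossEntropyLoss()

for step in range(100):
    x = torch.randn(32, 784, device="cuda")
    y = torch.randint(0, 10, (32,), device="cuda")
    optimizer.zero_grad()
    with torch.cuda.amp.autocast():        # forward pass runs mostly in FP16
        loss = loss_fn(model(x), y)
    scaler.scale(loss).backward()          # backward pass on the scaled loss
    scaler.step(optimizer)                 # unscale gradients, then update the FP32 weights
    scaler.update()                        # adjust the loss scale for the next step
```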
Using unbiased (stochastic) rounding, we can prove guarantees that SGD works with a low-precision model, since a low-precision gradient is an unbiased estimator. Example: 2.7 rounds up to 3.0 with probability 70% and down to 2.0 with probability 30%.
w_{t+1} = Q̃(w_t − α_t ∇f(w_t; x_t, y_t))
E[w_{t+1} | w_t] = E[Q̃(w_t − α_t ∇f(w_t; x_t, y_t)) | w_t] = E[w_t − α_t ∇f(w_t; x_t, y_t) | w_t] = w_t − α_t ∇f(w_t)
Sample u ∼ Unif[0, 1], then set Q(x) = ⌊x + u⌋.
E[Q(x)] = ⌊x⌋ · P(Q(x) = ⌊x⌋) + (⌊x⌋ + 1) · P(Q(x) = ⌊x⌋ + 1)
= ⌊x⌋ + P(Q(x) = ⌊x⌋ + 1)
= ⌊x⌋ + P(⌊x + u⌋ = ⌊x⌋ + 1)
= ⌊x⌋ + P(x + u ≥ ⌊x⌋ + 1)
= ⌊x⌋ + P(u ≥ ⌊x⌋ + 1 − x)
= ⌊x⌋ + 1 − (⌊x⌋ + 1 − x)
= x.
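An empirical sanity check of this argument (an illustrative sketch): implement Q(x) = ⌊x + u⌋ with NumPy and verify that the average of many quantizations recovers x, unlike nearest rounding.

```python
import numpy as np

rng = np.random.default_rng(0)

def stochastic_round(x, size=None):
    """Q(x) = floor(x + u) with u ~ Unif[0, 1): unbiased, so E[Q(x)] = x."""
    u = rng.random(size)
    return np.floor(x + u)

x = 2.7
samples = stochastic_round(x, size=1_000_000)
print(np.mean(samples == 3.0))   # about 0.7: rounds up with probability 70%
print(np.mean(samples))          # about 2.7: unbiased, unlike round(2.7) = 3.0
```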
From https://devblogs.nvidia.com/parallelforall/mixed-precision-programming-cuda-8/
Low precision introduces quantization error in our computations, resulting in less accurate learned systems.