LLVM Numerics Improvements
Michael C. Berg, Apple



SLIDE 1

LLVM Numerics Improvements

Michael C. Berg, Apple
LLVM Developers’ Meeting, Brussels, Belgium, April 2019
SLIDE 2

Agenda

  • Handling Numerics via Flags
  • Current LLVM Numerics Models
  • How Unsafe Changes Behavior
  • Mixed Mode
  • Flag Guided Optimizations
  • Conclusions
SLIDE 3

Handling Numerics via Flags

[Figure: LLVM compilation pipeline and where numerics flags live]

  • Stages: Language Front Ends, Mid Level Optimizer, SelectionDAG, Targeted Backends (GlobalISel is also shown).
  • Representations: LLVM IR, DAG Lowering, SDNode, MachineInstr.
  • Flag flow: Module and IR flags introduced; IR flags translated to SDNode; IR flags translated to MachineInstr.

SLIDE 5

Current LLVM Numerics Models

  • Unsafe: module-wide scope; overrides Fast Math Flags (FMF).
  • Fast-Math: IR scope, all FMFs set.
  • Precise-Math: IR scope, all FMFs unset, IEEE-754.
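
The three models can be read as rules for which flags an optimization is allowed to rely on. The Python sketch below is only an illustration of that reading: `FMF`, `FAST`, `PRECISE`, and `effective_flags` are hypothetical names rather than LLVM's API, and it assumes a module-wide Unsafe setting behaves as if every flag were set.

```python
from enum import Flag


class FMF(Flag):
    """Hypothetical model of the fast-math flags (not LLVM's real API)."""
    NONE = 0
    NNAN = 1
    NINF = 2
    NSZ = 4
    ARCP = 8
    CONTRACT = 16
    AFN = 32
    REASSOC = 64


# Fast-Math: every flag is set on the instruction.
FAST = (FMF.NNAN | FMF.NINF | FMF.NSZ | FMF.ARCP
        | FMF.CONTRACT | FMF.AFN | FMF.REASSOC)

# Precise-Math: no flag is set.
PRECISE = FMF.NONE


def effective_flags(instr_flags: FMF, module_unsafe: bool) -> FMF:
    """Return the flags an optimization may assume for one instruction.

    Assumption: a module-wide Unsafe setting overrides whatever the
    instruction itself carries, as if all flags were set.
    """
    return FAST if module_unsafe else instr_flags
```

With this model, `effective_flags(PRECISE, module_unsafe=True)` returns `FAST`, which is one way to see why code emitted as precise can still be rewritten when Unsafe is enabled.
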
SLIDE 13

Current LLVM Numerics Models

Models and their Flags

Flag        Unsafe       Fast-Math    Precise-Math
Nsz         Overrides    √            X
Nnan        Overrides    √            X
Ninf        Overrides    √            X
Arcp        Overrides    √            X
Contract    Overrides    √            X
Reassoc     Overrides    √            X
Afn         Overrides    √            X

  • Nsz: Allow optimizations to treat the sign of a zero argument or result as insignificant.
  • Nnan: Allow optimizations to assume the arguments and result are not NaN.
  • Ninf: Allow optimizations to assume the arguments and result are not +/-Inf.
  • Arcp: Allow optimizations to use reciprocal operations with approximate expressions.
  • Contract: Allow floating-point contraction (e.g. fusing a multiply add/sub).
  • Reassoc: Allow reassociation transformations on floating-point instructions.
  • Afn: Allow substitution of approximate calculations for functions (sin, log, cos, etc).

SLIDE 23

Current LLVM Numerics Models

FMF Precision and Behavior

FMF         Math operation    IEEE behavior    IEEE precision
            order changed     changed          changed
Nsz         √                 √                √
Nnan        √                 √                √
Ninf        X                 √                √
Arcp        NA                √                √
Contract    √                 √                √
Reassoc     √                 √                √
Afn         NA                √                √
Fast        √                 √                √

Changing the order of operations may cause rounding differences. NaN and Inf values may materialize in new ways or even disappear, so results can differ from the values the user code was written to expect.

Note: The FMFs above, when set on IR, map to the same optimizations as Unsafe.
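
The effect of reordering can be reproduced with ordinary double-precision arithmetic. The following Python sketch is an illustration only; the operand values are chosen for the demonstration and do not come from the slides.

```python
import math

# Rounding difference: 1.0 is smaller than half the spacing between
# adjacent doubles near 1e17, so it is lost when added to -1e17 first.
big = 1e17
left_to_right = (big + -big) + 1.0   # the large values cancel first
right_to_left = big + (-big + 1.0)   # the 1.0 is absorbed first

# Inf appears or disappears depending on grouping.
huge = 1e308
overflowed = (huge + huge) + -huge   # intermediate sum overflows
stays_finite = huge + (huge + -huge)

# NaN appears or disappears depending on grouping.
makes_nan = (huge + huge) + (-huge + -huge)   # inf + -inf
makes_zero = (huge + -huge) + (huge + -huge)
```

Here `left_to_right` is 1.0 while `right_to_left` is 0.0, `overflowed` is infinite while `stays_finite` equals `huge`, and `makes_nan` is NaN while `makes_zero` is 0.0.
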

SLIDE 29

Current LLVM Numerics Models

Model Attributes

Model                       Fine Grain   IR annotated    NaNs and Infs   Best Performance   IEEE
                            Control      with flags      Preserved       and Size           Compliant
Unsafe                      X            NA              X               √                  X
Fast-math                   √            √               X               √                  X
Precise-math                √            None or arcp    √               X                  √
Unsafe with Precise-math    X            NA              X               X                  X

SLIDE 31

How Unsafe Changes Behavior

  • Code emitted as precise can be modified by Unsafe.
  • Math functions like acos, cos, sin, asin, etc., created by another model can have modified behavior and precision.
  • Reassociation globally/locally removes constraints.

SLIDE 33

Mixed Mode

  • Interleave IR with a mixture of flags at some granularity (library, function, expression).
  • Incompatible with Unsafe.
  • Fast-Math, Precise-Math and other models can coexist.
  • Fine granularity of optimization control.
  • No loss of generality from the expressed model.
  • More design options to manage optimizations.
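
One way to picture expression-level granularity is a rewrite that consults only the flags carried by the instruction it is looking at. The Python sketch below is a toy model under stated assumptions: `FAdd` and `combine` are hypothetical names, and the rule that constants are folded only when both `reassoc` and `nsz` are present is used purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FAdd:
    """Hypothetical stand-in for `fadd <flags> (fadd x, c1), c2`."""
    x: str
    c1: float
    c2: float
    flags: frozenset = frozenset()


def combine(inst: FAdd):
    """Fold the two constants only if this instruction's own flags allow it."""
    if {"reassoc", "nsz"} <= inst.flags:
        return ("fadd", inst.x, inst.c1 + inst.c2)
    # No permission on this instruction: keep the original evaluation order.
    return ("fadd", ("fadd", inst.x, inst.c1), inst.c2)


# Mixed mode: a precise expression and a relaxed one side by side.
function_body = [
    FAdd("%a", 1.0, -1.0),                                  # precise
    FAdd("%b", 1.0, -1.0, frozenset({"reassoc", "nsz"})),   # relaxed
]
results = [combine(inst) for inst in function_body]
```

The first expression keeps its nested form, while the second is folded to a single add of `%b` and 0.0. A module-wide override would remove exactly this per-expression distinction.
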
SLIDE 35

Mixed Mode

Model Attributes

Attribute                    Unsafe   Fast-math   Precise-math    Mixed Mode    Unsafe with Precise-math
Fine Grain Control           X        √           √               √             X
IR annotated with flags      NA       √           None or arcp    In context    NA
NaNs and Infs Preserved      X        X           √               In context    X
Best Performance and Size    √        √           X               In context    X
IEEE Compliant               X        X           √               In context    X

Mixed Mode is available in LLVM 8.0.

SLIDE 37

Fadd Combine

fadd nsz reassoc (fadd x, c1), c2 -> fadd nsz reassoc x, c1 + c2

For the following f32 input:
    %x ~= 3.4028234664E+38 (largest positive number in f32)
    c1 = 1.0, c2 = -1.0

We convert this IR:
    %t1 = fadd float %x, 0x3FF0000000000000              ; t1 = x + 1.0
    %t2 = fadd nsz reassoc float %t1, 0xBFF0000000000000 ; t2 = t1 + -1.0

To this with Unsafe or IR flags:
    %t3 = fadd nsz reassoc float %x, 0.0

The result of %t3 is %x.

Whereas the precise version must evaluate %t1 first; if that intermediate sum overflows to Infinity, the Infinity propagates to %t2.
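
The combine can be checked numerically by rounding every intermediate to f32. The Python sketch below is an illustration, not LLVM's implementation: `f32`, `precise`, and `combined` are hypothetical helpers, and default round-to-nearest is assumed. Under that rounding mode, adding 1.0 to the largest finite f32 rounds back to the same value, so the sketch also tries constants large enough to make the intermediate sum overflow; those larger constants are an assumption of this example.

```python
import math
import struct


def f32(value: float) -> float:
    """Round a Python float to the nearest IEEE-754 binary32 value."""
    try:
        return struct.unpack("f", struct.pack("f", value))[0]
    except OverflowError:
        # struct rejects finite values beyond the f32 range; round-to-nearest
        # arithmetic would produce an infinity of the same sign.
        return math.copysign(math.inf, value)


FLT_MAX = (2.0 - 2.0 ** -23) * 2.0 ** 127  # largest finite f32


def precise(x: float, c1: float, c2: float) -> float:
    """(x + c1) + c2, rounding after each operation."""
    return f32(f32(x + c1) + c2)


def combined(x: float, c1: float, c2: float) -> float:
    """x + (c1 + c2), the form produced by the combine."""
    return f32(x + f32(c1 + c2))


# Small constants: the intermediate sum rounds back to FLT_MAX.
small = (precise(FLT_MAX, 1.0, -1.0), combined(FLT_MAX, 1.0, -1.0))

# Large constants (assumed for illustration): the intermediate sum overflows.
large = (precise(FLT_MAX, FLT_MAX, -FLT_MAX),
         combined(FLT_MAX, FLT_MAX, -FLT_MAX))
```

In the `large` case the precise order produces Infinity, while the combined form returns `FLT_MAX`; in the `small` case both orders agree.
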

SLIDE 38

Fmul Combine

fmul reassoc (fmul x, c1), c2 -> fmul reassoc x, c1 * c2

For the following f32 input:
    %x = 10, %t1 = 0.3
    0x36A0000000000000 ~= 1.4012984643E−45 (smallest positive number)

We convert this IR:
    %t1 = fdiv float 3.0, 10.0
    %t2 = fmul reassoc float %x, 0x36A0000000000000   ; t2 = x * 1.4012984643E−45
    %t3 = fmul reassoc float %t2, %t1                 ; t3 = t2 * 0.3

To this with Unsafe or IR flags:
    %t4 = fmul reassoc float %x, 0.0

The combined constant is 1.4012984643E−45 * 0.3, which is correctly rounded to zero.

Whereas the precise version yields 1.4012984643E−44 * 0.3, which is nonzero.
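
The underflow in the folded constant can be reproduced by rounding each step to f32. This Python sketch is an illustration under the assumption of round-to-nearest; `f32` is a hypothetical helper and the variable names are chosen for this example.

```python
import struct


def f32(value: float) -> float:
    """Round a Python float to the nearest IEEE-754 binary32 value."""
    return struct.unpack("f", struct.pack("f", value))[0]


TINY = 2.0 ** -149           # smallest positive (subnormal) f32
x = 10.0
t1 = f32(3.0 / 10.0)         # 0.3 rounded to f32

# Precise order: (x * TINY) * t1, rounding after each multiply.
precise = f32(f32(x * TINY) * t1)

# Combined order: the constants are multiplied first and underflow to zero.
folded_constant = f32(TINY * t1)
combined = f32(x * folded_constant)
```

The precise order yields three times the smallest subnormal, which is nonzero, while the folded constant is already 0.0 and so is the combined result.
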

SLIDE 39

Fdiv Code Generation

This IR:
    %div = fdiv arcp half %x, 10.0
    %z = fpext half %div to float

Produces (Unsafe/Fast) x86_64 with avx:
    .LCPI4_0:
        .long 1036828672                      # float 0.0999755859
    …
    vmulss .LCPI4_0(%rip), %xmm0, %xmm0       # z = x * 0.0999755859

This IR:
    %div = fdiv half %x, 10.0
    %z = fpext half %div to float

Produces (Precise) x86_64 with avx:
    .LCPI4_0:
        .long 1092616192                      # float 10
    …
    vdivss .LCPI4_0(%rip), %xmm0, %xmm0       # z = x / 10
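
The two constants in the listings can be recovered numerically: 0.0999755859 is 1/10 rounded to half precision, and the `.long` values are the f32 bit patterns. The Python sketch below is an illustration that follows the comments in the listings (`z = x * 0.0999755859` versus `z = x / 10`). The helper names and the input `x = 7.0` are assumptions, and the sketch does not try to model every conversion the generated code may perform.

```python
import struct


def f16(value: float) -> float:
    """Round a Python float to the nearest IEEE-754 binary16 value."""
    return struct.unpack("e", struct.pack("e", value))[0]


def f32(value: float) -> float:
    """Round a Python float to the nearest IEEE-754 binary32 value."""
    return struct.unpack("f", struct.pack("f", value))[0]


def f32_bits(value: float) -> int:
    """Return the binary32 bit pattern as an unsigned integer."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


recip = f16(1.0 / 10.0)        # reciprocal of 10 at half precision

x = 7.0                        # illustrative input, not from the slides
z_arcp = f32(x * recip)        # multiply by the approximate reciprocal
z_precise = f32(x / 10.0)      # true division

# The difference survives a final rounding to half precision as well.
z_arcp_half = f16(z_arcp)
z_precise_half = f16(z_precise)
```

Here `f32_bits(recip)` matches the constant in the first listing and `f32_bits(10.0)` matches the constant in the second, and the two results differ for this input whether compared in f32 or after rounding to half.
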

SLIDE 41

Conclusions

  • Emit flags on IR and exclude Unsafe to get the desired model behavior.
  • Mixed mode facilitates fine-grained control while promoting versatility in implementing optimizations.
  • Compiler implementers can use the current infrastructure to implement Mixed mode today for their targets.

SLIDE 42

Future Work Ideas

  • FMF function specialization along a call edge
  • Inlining with FMF applied from the caller's instance of the call
  • Pragma controls
  • Per-function controls for replacing math lib calls
  • New math models, new FMF and combinatorics

Note: See the llvm-dev "EuroLLVM Numerics issues" email thread for continuing discussion.

SLIDE 43

LLVM Numerics Improvements
Michael C. Berg, LLVM Developers’ Meeting, Brussels, Belgium, April 2019

  • Questions?