Synthesizing an Instruction Selection Rule Library from Semantic Specifications



slide-1
SLIDE 1 1 2018-02-28 Sebastian Buchwald, Andreas Fried, Sebastian Hack - Synthesizing Instruction Selection IPD Programming Paradigms Chair, IPD Snelting Compiler Design Lab

Synthesizing an Instruction Selection Rule Library from Semantic Specifications

Sebastian Buchwald, Andreas Fried, Sebastian Hack

KIT – The Research University in the Helmholtz Association

www.kit.edu

slide-2
SLIDE 2

Instruction Selection

[Figure: the IR pattern z = And(Not(x), y) next to the single machine instruction andn %x, %y, %z]

Replace an IR pattern with a single goal instruction.
No total ordering and no (virtual) register allocation yet.

slide-3
SLIDE 3

State of the Art

[Figure: rule library → instruction selector]

Syntactic specification of patterns
Code generation
E.g. GCC machine description, LLVM TableGen
Large rule libraries, growing larger
Tedious manual maintenance
Error-prone, especially missing patterns

slide-4
SLIDE 4

Multiple Patterns per Goal

[Figure: four IR patterns over x and y that all match the same goal: And with Not, Xor with Or, Xor with And, Sub with And]

Full support of a new instruction needs 4 rules, plus commutativity variants.
It is easier to specify the semantics once.
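The claim that several IR patterns compute the same goal can be checked by brute force. The sketch below is only an illustration: the operand wiring (for example Xor(Or(x, y), x)) is an assumption read off the slide's graphs, the goal is taken to be andn as on the earlier slide, and 8-bit values are used instead of 32-bit ones so that exhaustive enumeration stays cheap.

```python
from itertools import product

BITS = 8                  # assumption: 8-bit values keep the exhaustive check cheap
MASK = (1 << BITS) - 1

def andn(x, y):
    """Goal semantics: (not x) and y, truncated to BITS bits."""
    return ~x & y & MASK

# Candidate IR patterns; the wiring is an assumption read from the slide's graphs.
PATTERNS = {
    "And(Not(x), y)":    lambda x, y: (~x & MASK) & y,
    "Xor(Or(x, y), x)":  lambda x, y: (x | y) ^ x,
    "Xor(And(x, y), y)": lambda x, y: (x & y) ^ y,
    "Sub(y, And(x, y))": lambda x, y: (y - (x & y)) & MASK,
}

def equivalent(pattern):
    """True if the pattern agrees with andn on every pair of inputs."""
    return all(pattern(x, y) == andn(x, y)
               for x, y in product(range(1 << BITS), repeat=2))

results = {name: equivalent(fn) for name, fn in PATTERNS.items()}
```

Each entry of `results` reports whether the corresponding pattern matches the goal on all inputs; a handwritten rule library would need one rule per entry, plus the commuted variants.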

slide-7
SLIDE 7

Existing Rulesets are Incomplete

x86 has extensive addressing modes:

    r = &a[x + 4*y + 42];    ⇒  leal a+42(%x,%y,4), %r
    r = *((p + x) + x);      ⇒  movb (%p,%x,2), %r

Rules are missing from GCC 7.3 (left example) and Clang 6.0.0 (right example), which emit two instructions instead:

    GCC 7.3:        leal (%x,%y,4), %z
                    addl $a+42, %z
    Clang 6.0.0:    addl %x, %p
                    movb (%x,%p), %r

…but the matchers are susceptible to commutativity or associativity; the reordered forms yield the single instruction:

    r = &a[42 + x + 4*y];    ⇒  leal a+42(%x,%y,4), %r
    r = *(p + (x + x));      ⇒  movb (%p,%x,2), %r

slide-8
SLIDE 8

New Approach

[Figure: IR specification + machine specification → Synthesis → rule library → instruction selector]

Semantic specification of instructions
Synthesize rule library
    For each machine instruction g: find all smallest IR patterns equivalent to g
Correct and complete rule libraries
Push-button support for new ISAs or ISA extensions

slide-9
SLIDE 9

Specifying Instructions

Gulwani et al., PLDI 2011

[Figure: components And (va[0], va[1] → vr[0]), Not (va[0] → vr[0]), and andn (va[0], va[1] → vr[0])]

Specification as SMT terms:
Arguments va and results vr are 32-bit bitvectors.
Semantics Q relate arguments to results:

    Q_And  = (vr[0] = va[0] ∧ va[1])
    Q_Not  = (vr[0] = ¬va[0])
    Q_andn = (vr[0] = ¬va[0] ∧ va[1])
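One way to read these specifications is as predicates over argument and result lists. The sketch below mirrors the three formulas with plain Python integers standing in for 32-bit bitvectors; the function names and the example values are illustrative and not taken from the authors' tool.

```python
MASK = 0xFFFFFFFF  # 32-bit bitvectors, as stated on the slide

def q_and(va, vr):
    """Q_And: the result is the bitwise and of both arguments."""
    return vr[0] == (va[0] & va[1])

def q_not(va, vr):
    """Q_Not: the result is the bitwise complement of the argument."""
    return vr[0] == (~va[0] & MASK)

def q_andn(va, vr):
    """Q_andn: the result is (not va[0]) and va[1]."""
    return vr[0] == (~va[0] & va[1] & MASK)

# Illustrative values: the predicate holds for exactly one result per argument list.
example_args = [0x0000FFFF, 0x00FF00FF]
example_result = [0x00FF0000]
holds = q_andn(example_args, example_result)
```

Writing the semantics as a relation rather than a function matches the slide's formulation, where Q relates arguments to results.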

slide-13
SLIDE 13

Component-Based Synthesis

Gulwani et al., PLDI 2011

[Figure: candidate pattern And(Xor(x, y), y)]

Provide IR instructions as components, machine instruction as goal.
SMT encoding of connections between components.
Produce pattern semantics Q+ from connections.
SMT solver finds connections with correct semantics.

    Q+ = Q_Xor([a, b], [c]) ∧ Q_And([d, e], [f])
         ∧ (a = x) ∧ (b = y) ∧ (d = c) ∧ (e = y) ∧ (result = f)
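The conjunction Q+ can be evaluated directly to see that this particular wiring has the semantics of andn. The following sketch assumes 4-bit values so that all combinations can be enumerated, and it replaces the SMT solver with plain enumeration of the internal variable c; it illustrates the formula and is not the authors' encoding.

```python
from itertools import product

BITS = 4                       # assumption: small width for exhaustive checking
MASK = (1 << BITS) - 1
VALUES = range(1 << BITS)

def q_xor(args, res):
    return res[0] == args[0] ^ args[1]

def q_and(args, res):
    return res[0] == args[0] & args[1]

def q_plus(x, y, result, a, b, c, d, e, f):
    """Component semantics conjoined with the connection equalities."""
    return (q_xor([a, b], [c]) and q_and([d, e], [f])
            and a == x and b == y and d == c and e == y and result == f)

def pattern_relates(x, y, result):
    """Existentially quantify the internal variables a..f.

    The equalities fix a, b, d, e and f, so only c has to be searched.
    """
    return any(q_plus(x, y, result, x, y, c, c, y, result) for c in VALUES)

def q_andn(x, y, result):
    return result == (~x & y & MASK)

# The pattern and the goal relate exactly the same (x, y, result) triples.
agrees = all(pattern_relates(x, y, r) == q_andn(x, y, r)
             for x, y, r in product(VALUES, repeat=3))
```

In the real setting the connection equalities are not fixed in advance; the solver chooses them, and Q+ is derived from that choice.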

slide-14
SLIDE 14

Memory Access

[Figure: the IR pattern Store(Add(Load(addr), x)) threaded through the memory state m, next to the machine instruction addl %x, (%addr)]

IR graph includes memory dependencies (→ HotSpot).
Actually use a notional SSA value for the memory state m : M.
Store: update, Load: query.

slide-16
SLIDE 16

SMT Representation

Theory “ArraysEx” provides maps, M = Array(Pointer, Value)
Problem: ∀m : M … and ∄m : M …: 2^(2^35) possibilities
But most addresses are irrelevant
⇒ Extract symbolic addresses from goal’s semantics
Only model those

    addr     → (*addr + x) bits 0...7
    addr + 1 → (*addr + x) bits 8...15
    addr + 2 → (*addr + x) bits 16...23
    addr + 3 → (*addr + x) bits 24...31
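The idea of modeling only the addresses the goal touches can be illustrated with a small byte-level memory. In this sketch the memory is a Python dict holding just the four relevant byte addresses, laid out little-endian as in the address table above; the helper names and the concrete address are illustrative assumptions.

```python
WORD_MASK = 0xFFFFFFFF

def relevant_addresses(addr, width_bytes=4):
    """Byte addresses touched by a 32-bit access at addr."""
    return [(addr + i) & WORD_MASK for i in range(width_bytes)]

def load32(mem, addr):
    """Query: assemble a 32-bit value from four bytes (little-endian)."""
    return sum(mem[a] << (8 * i) for i, a in enumerate(relevant_addresses(addr)))

def store32(mem, addr, value):
    """Update: return a new memory with four bytes replaced."""
    updated = dict(mem)
    for i, a in enumerate(relevant_addresses(addr)):
        updated[a] = (value >> (8 * i)) & 0xFF
    return updated

def addl_to_memory(mem, addr, x):
    """Semantics of addl %x, (%addr): *addr = *addr + x modulo 2^32."""
    return store32(mem, addr, (load32(mem, addr) + x) & WORD_MASK)

# Only four bytes are modeled; every other address is irrelevant to the goal.
mem = {0x1000: 0xFF, 0x1001: 0x00, 0x1002: 0x00, 0x1003: 0x00}
after = addl_to_memory(mem, 0x1000, 1)
```

Because each function returns a new dict, every memory state behaves like an SSA value, in line with the notional memory value m on the Memory Access slide.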

slide-17
SLIDE 17

Synthesis Task

∃p : Pattern. ∀va : Args. ∀vr : Results. Q+(p, va, vr) ⇔ Q(goal, va, vr)

Unfortunately intractable as-is:
∀ quantifiers ⇒ Counterexample-guided inductive synthesis (CEGIS)
Too many different components
    Gulwani’s technique: assumes right components already selected
    Enumeration: search space too large
    ⇒ Need a compromise

slide-18
SLIDE 18

Iterative CEGIS

∃p : Pattern. ∀va : Args. ∀vr : Results. Q+(p, va, vr) ⇔ Q(goal, va, vr)

Gulwani’s algorithm has problems with extraneous components
    IRs provide > 20 instructions
    Each pattern needs few, but some multiple times
Solution: Iterate over sub-multisets of IR in increasing size
    Run synthesis for each

IR = {Add, Load, Store}
{Add} {Load} {Store}
{Add, Add} {Add, Load} {Add, Store} {Load, Load} {Load, Store} {Store, Store}
{Add, Add, Add} {Add, Add, Load} …
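The enumeration order shown above corresponds to combinations with replacement, taken size by size. A minimal sketch using the standard library follows; `sub_multisets` is an illustrative name, and the synthesis call that would run for each multiset is omitted.

```python
from itertools import combinations_with_replacement, islice

def sub_multisets(ir, max_size):
    """Yield multisets of IR operations in order of increasing size."""
    for size in range(1, max_size + 1):
        yield from combinations_with_replacement(ir, size)

IR = ["Add", "Load", "Store"]

# The first eleven multisets, matching the sequence listed on the slide.
first = list(islice(sub_multisets(IR, 3), 11))
```

Since smaller multisets come first, the first multiset for which synthesis succeeds yields a smallest pattern, which fits the goal of finding all smallest IR patterns per machine instruction.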

slide-19
SLIDE 19

Synthesis Results

IR: 22 simple operations
Machine instructions: IA32 32-bit integer subset
    Basic group: RISC-like, no addressing mode
One eight-core desktop workstation

Group               #Goals   #Patterns   Max. Size   Synthesis Time
Basic                   39         575           4             3:25
Load/Store              35         607           4             5:45
Unary arithmetic        70        2106           7         18:10:58
Binary arithmetic      260        6316           6         10:27:06
cmp/test; jcc          265      145441           7       3:00:07:05
Total                  630      154470           7       4:04:50:54

slide-20
SLIDE 20

Application: Instruction Selection

Turn patterns into instruction selection rules
Greedy DAG matcher (≈ LLVM)
Integrated in Firm research compiler
    Synthesized matcher goes first
    Handwritten matcher used as fallback
Synthesized matcher covers 75.7 % of SPEC CINT2000

slide-21
SLIDE 21

SPEC CINT2000 Performance Results

[Figure: two bar charts of overhead per SPEC CINT2000 benchmark (164.gzip, 175.vpr, 176.gcc, 181.mcf, 186.crafty, 197.parser, 253.perlbmk, 254.gap, 255.vortex, 256.bzip2, 300.twolf), with axis ticks up to +15 %. Panel "Basic / Handwritten": labels +30.67 % (next to 253.perlbmk) and +11.56 %. Panel "Full / Handwritten": label +1.13 %.]

slide-22
SLIDE 22

Rule Libraries of Other Compilers

Turn patterns into compiler test cases

[Figure: IR pattern built from three Add nodes and one Mul node over a, 42, x, 4, and y]

    char a[4242];
    char *ia32_Lea(int x, int y) {
        return &a[x + 4 * y + 42];
    }

Compile and check for the goal instruction:

    leal a+42(%x,%y,4), %r

slide-23
SLIDE 23

Results

GCC 7.3 supports 31 400 / 63 012 rules (50 %)
Clang 6.0.0 RC3 supports 26 647 / 63 012 rules (42 %)
More information on our website: http://libfirm.org/selgen
    Full tables of unsupported patterns
    Links to examples in Godbolt’s Compiler Explorer
⇒ Instruction selection patterns still missing in production compilers

slide-24
SLIDE 24

Further Work

Waiting for SMT solver progress
    Floating point
    Division
The to-do list
    Synthesis techniques for larger patterns
    Vector instructions, loops
    Multiple bit widths
Might be a good idea
    ? Completeness vs. synthesis performance
    ? Function calls

slide-25
SLIDE 25

Conclusion

Contributions
Automatic synthesis of instruction rule libraries
    Memory encoding for synthesis
    Iterative CEGIS
Generated instruction selector
    On par with handwritten counterpart
Instruction selector testing
    Manual rule libraries are incomplete

Artifact
Synthesis tool, research compiler libFirm, compiler testing scripts
Freely available under GPL
http://libfirm.org/selgen

slide-26
SLIDE 26

END

slide-30
SLIDE 30

SMT Solving

∃x : BitVec32. ∃y : BitVec32. x > 0 ∧ y > 0 ∧ x ∗ x + y ∗ y = 0

SAT + first-order quantifiers + Theories
    “FixedSizeBitVectors” implements two’s-complement arithmetic
Solver produces model for outer ∃ quantifiers
    No other quantifiers: “quantifier-free” → better performance

Model: x = 16382 ∗ 2^16, y = 32766 ∗ 2^16
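The model looks surprising until wraparound is taken into account: both squares are multiples of 2^32 and therefore vanish in 32-bit arithmetic. The check below reads the model as x = 16382 · 2^16 and y = 32766 · 2^16 and emulates 32-bit two's-complement arithmetic with masking; it verifies the given model and does not call an SMT solver.

```python
MASK32 = 0xFFFFFFFF

def to_signed(v):
    """Interpret a 32-bit pattern as a two's-complement integer."""
    return v - (1 << 32) if v & (1 << 31) else v

x = 16382 * 2**16
y = 32766 * 2**16

# Multiplication and addition wrap around modulo 2^32.
wrapped_sum = (x * x + y * y) & MASK32

# Both values stay below 2^31, so they are positive under a signed
# as well as an unsigned reading of the comparison.
is_model = to_signed(x) > 0 and to_signed(y) > 0 and wrapped_sum == 0
```

Over the mathematical integers the formula is unsatisfiable, which is why the choice of the bitvector theory matters for instruction semantics.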

slide-33
SLIDE 33

Memory Access

[Figure: the IR pattern Store(Add(Load(addr), x)) threaded through the memory state m, next to the machine instruction addl %x, (%addr)]

Store: update, Load: query
Keep the antidependencies!

slide-35
SLIDE 35

Remembering Loads

Remember loads with an access flag: M = Pointer → (Bool × Value)
Load: set access flag, extract data
Store: update data, leave access flag untouched

[Figure: memory cell addr → (flag, value) for the pattern Store(Add(Load(addr), x)); the stored value is ld_addr + x]
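A small model of M = Pointer → (Bool × Value) shows what the flag buys. In this sketch, memory is a dict from address to a (flag, value) pair; the function names and values are illustrative, and the interpretation in the final comparison (a load leaves a visible trace in the memory state) is one plausible reading of the slide rather than a statement about the authors' encoding.

```python
def load(mem, addr):
    """Load: set the access flag for addr and extract the data."""
    _, value = mem[addr]
    updated = dict(mem)
    updated[addr] = (True, value)
    return updated, value

def store(mem, addr, value):
    """Store: update the data, leave the access flag untouched."""
    flag, _ = mem[addr]
    updated = dict(mem)
    updated[addr] = (flag, value)
    return updated

m0 = {0x10: (False, 7)}

# Pattern with a load: Store(Add(Load(addr), 5)).
m1, loaded = load(m0, 0x10)
m2 = store(m1, 0x10, loaded + 5)

# A store of the same final value that never loads.
m_store_only = store(m0, 0x10, 12)
```

Both final states hold the value 12, but they differ in the flag, so two patterns that differ only in whether they load can be told apart by their memory results.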

slide-37
SLIDE 37

SMT Representation

Theory “ArraysEx” provides maps, M = Array(Pointer, (Bool × Value))
Problem: ∀m : M … and ∄m : M …: 2^(2^35) possibilities
But most addresses are irrelevant
⇒ Extract relevant addresses from goal’s semantics
Only model those
Bit-vectors for efficiency

[Figure: flag and data bit-vectors with one slot each for va[1] + 0, va[1] + 1, va[1] + 2, va[1] + 3]

slide-43
SLIDE 43

Component-Based Synthesis

Gulwani et al., PLDI 2011

[Figure: the pattern Store(Add(Load(addr), x)) with every value assigned a position]

Position   0      1      2      3             4             5     6
Value      arg0   arg1   arg2   Load (res0)   Load (res1)   Add   Store

load-pos = 3, add-pos = 5, store-pos = 6
store-arg-0-pos = 3, store-arg-1-pos = 0, store-arg-2-pos = 5

Constraints ensure well-formedness
Derive pattern semantics Q+(p, va, vr) from assignment to *-pos

slide-45
SLIDE 45

Counterexample-Guided Inductive Synthesis

a.k.a. CEGIS

∃p : Pattern. ∀va : Args. ∀vr : Results. Q+(p, va, vr) ⇔ Q(goal, va, vr)

Small set of test cases T usually enough

Start: T ← ∅
Synthesis: ∃p : Pattern. ⋀_{t∈T} testOK(p, t)
    (sat, p∗) → continue with Counterexample
    unsat → no solution
Counterexample: ∃t : TestCase. ¬testOK(p∗, t)
    (sat, t∗) → T ← T ∪ {t∗}, continue with Synthesis
    unsat → solution: p∗

slide-47
SLIDE 47

Linear Type Encoding

Alternative to the access flag
Add SMT constraints to ensure linear type property (i.e. exactly one use per def)

[Figure: one def connected to use 1, use 2, use 3, …, use n]

    Σ_i (use-i-arg-0-pos = def-pos) = 1

Pseudo-boolean constraint, supported by Z3 but not SMT-LIB
Other optimization relies on access flag
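The constraint counts how many uses take their first argument from the definition's position and requires that count to be exactly one. A plain Python rendering of that check for concrete position assignments follows; the function name is illustrative, and in the actual encoding the positions are solver variables rather than fixed integers.

```python
def is_linear(def_pos, use_arg_positions):
    """True if exactly one use refers to the position of the definition.

    use_arg_positions holds the value of use-i-arg-0-pos for each use i.
    """
    return sum(pos == def_pos for pos in use_arg_positions) == 1

one_use = is_linear(3, [0, 3, 5])     # exactly one use reads position 3
two_uses = is_linear(3, [3, 3, 5])    # the definition is used twice
no_use = is_linear(3, [0, 1, 5])      # the definition is never used
```

Summing boolean comparisons is what makes this a pseudo-boolean constraint.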

slide-49
SLIDE 49

Further Optimizations

Load/Store/both necessary?
    ∃m_before : M. ∃m_after : M. Q(goal, […m_before…], […m_after…]) ∧ m_before ≠ m_after
≥ d uses of a sort with d defs?
Source (def without use) for all uses?

slide-51
SLIDE 51

Instruction Coverage

Frequency of unsupported instructions:

Phi/Sync      35.7 %
Conditional   20.0 %
Call          18.5 %
Internal      11.1 %
Load/Store     7.8 %
Cast           4.8 %
Arithmetic     1.5 %
Div/Mod        0.4 %
Builtin        0.1 %