
Instruction Selection



  1. Compilation 2016: Instruction Selection
     Aslan Askarov, aslan@cs.au.dk
     Partially based on slides by E. Ernst

  2. Where are we?
     High-level source code
       → Lexing/Parsing
       → Semantic analysis
       → Translation to LLVM-- IR
       → Instruction selection
       → Register allocation
       → Low-level target code

  3. Instruction selection: translating IR elements into target code
     • How to pick instructions for different IR elements?
     • When the IR is relatively simple, such as LLVM--, the process is relatively straightforward
       • most of the hard work is done by the codegen
     • When the IR is a bit more complex, such as the textbook IR Tree language, there is more work to be done at this phase
       • Maximal Munch algorithm

  4. Tree IR language (from Textbook)
     • A simple tree expression language:

       signature TREE =
       sig
         type label = Temp.label

         datatype stm = MOVE of exp * exp
                      | EXP of exp
                      | JUMP of exp * label list
                      | CJUMP of relop * exp * exp * label * label
                      | SEQ of stm * stm
                      | LABEL of label

              and exp = CONST of int
                      | NAME of label
                      | TEMP of Temp.temp
                      | BINOP of binop * exp * exp
                      | MEM of exp
                      | CALL of exp * exp list
                      | ESEQ of stm * exp

              and binop = PLUS | MINUS | MUL | DIV | AND | OR
                        | LSHIFT | RSHIFT | ARSHIFT | XOR

              and relop = EQ | NE | LT | GT | LE | GE
                        | ULT | ULE | UGT | UGE
         ...
       end

  5. Instruction Selection for Tree IR language
     • Each IR node does one thing, real machine instructions typically do several things
     • Ex: typical memory access ➜ MEM(BINOP(PLUS, e, CONST c))
     • This is good, IR should be primitive
     • Instruction selection = find ways to express IR trees using instructions
     • NB: using shorthand notation ➜ MEM(+(e, CONST c))

  6. Describing Instructions
     • Basic device: the tree pattern
     • Matching idea
       • A tree pattern is a partial tree, a tile
       • From the top: concrete nodes
       • At bottom: blanks, standing for subtrees, called leaves
     • Repeated matching, tiling, reconstructs an IR tree
     • Read off instruction sequence: top-down traversal = reverse order

  7. For illustration: Jouette
     • Need concrete instruction set
     • Hypothetical (RISC) CPU architecture ‘Jouette’
     • Three-address format: flexible locations
     • Arithmetic operations: only in registers
     • Addressing modes: only one address, fixed offset
     • Instructions ➜
         ADD   r_i ← r_j + r_k
         MUL   r_i ← r_j * r_k
         SUB   r_i ← r_j - r_k
         DIV   r_i ← r_j / r_k
         ADDI  r_i ← r_j + c
         SUBI  r_i ← r_j - c
         LOAD  r_i ← M[r_j + c]

  8. Jouette Tiles
     • Two categories:
       • ‘Expression tile’: produces a result in a register
       • ‘Statement tile’: creates a side-effect
     • Special case: a register is an atomic expression
         (no name)  r_i    pattern: TEMP t (shorthand: TEMP)

  9. Jouette Expression Tiles
     • Main arithmetic operations: unique patterns
         ADD  r_i ← r_j + r_k    pattern: +(_, _)
         SUB  r_i ← r_j - r_k    pattern: -(_, _)
         MUL  r_i ← r_j * r_k    pattern: *(_, _)
         DIV  r_i ← r_j / r_k    pattern: /(_, _)

  10. Jouette Expression Tiles
      • Arithmetic operations involving an immediate: multiple interpretations, multiple patterns
          ADDI  r_i ← r_j + c    patterns: +(_, CONST), +(CONST, _), CONST
          SUBI  r_i ← r_j - c    pattern:  -(_, CONST)

  11. Jouette Expression Tiles
      • Reading from memory: many interpretations
          LOAD  r_i ← M[r_j + c]
          patterns: MEM(+(_, CONST)), MEM(+(CONST, _)), MEM(CONST), MEM(_)

  12. Jouette Statement Tiles
      • Storing in memory: larger tiles
          STORE  M[r_i + c] ← r_j
          patterns: MOVE(MEM(+(_, CONST)), _), MOVE(MEM(+(CONST, _)), _),
                    MOVE(MEM(CONST), _), MOVE(MEM(_), _)

  13. Jouette Statement Tiles
      • Moving in memory
          MOVEM  M[r_i] ← M[r_j]    pattern: MOVE(MEM(_), MEM(_))
      • (Not a typical RISC instruction, but illustrative)
      • NB: store tiles always match the two nodes MOVE(MEM,_) simultaneously

  14. Example Tilings
      • Consider an IR tree for a[i] := x

          MOVE(
            MEM(+(MEM(+(FP, CONST a)),
                  *(TEMP i, CONST 4))),
            MEM(+(FP, CONST x)))

      • Discuss how this tree can specify that assignment!

  15. Example Tilings
      • One way to tile this IR tree for a[i] := x (IR tree as on slide 14)
          LOAD   r1 ← M[FP + a]
          ADDI   r2 ← r0 + 4
          MUL    r2 ← ri * r2
          ADD    r1 ← r1 + r2
          LOAD   r2 ← M[FP + x]
          STORE  M[r1 + 0] ← r2
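To check that the six instructions of this tiling really perform a[i] := x, the sequence can be run on a toy machine model. The sketch below is only an illustration: the `run` helper, the frame offsets `A_OFF` and `X_OFF`, the concrete addresses, and the convention that `r0` always holds zero are assumptions made for the example, not part of the slides.

```python
def run(program, regs, mem):
    """Execute Jouette-style instructions on a register dict and a memory dict."""
    for op, dst, x, y in program:
        if op == "ADD":
            regs[dst] = regs[x] + regs[y]
        elif op == "MUL":
            regs[dst] = regs[x] * regs[y]
        elif op == "ADDI":
            regs[dst] = regs[x] + y          # y is an immediate constant
        elif op == "LOAD":
            regs[dst] = mem[regs[x] + y]     # r_dst <- M[r_x + y]
        elif op == "STORE":
            mem[regs[dst] + y] = regs[x]     # M[r_dst + y] <- r_x
        else:
            raise ValueError(f"unknown instruction {op}")
    return mem


# Hypothetical frame layout: the frame slot at FP + A_OFF holds the array's
# base address, the slot at FP + X_OFF holds the value of x.
A_OFF, X_OFF = 8, 12
regs = {"r0": 0, "FP": 100, "ri": 3}         # i = 3
mem = {100 + A_OFF: 1000, 100 + X_OFF: 42}   # a starts at 1000, x = 42

program = [
    ("LOAD",  "r1", "FP", A_OFF),   # r1 <- M[FP + a]
    ("ADDI",  "r2", "r0", 4),       # r2 <- r0 + 4
    ("MUL",   "r2", "ri", "r2"),    # r2 <- ri * r2
    ("ADD",   "r1", "r1", "r2"),    # r1 <- r1 + r2
    ("LOAD",  "r2", "FP", X_OFF),   # r2 <- M[FP + x]
    ("STORE", "r1", "r2", 0),       # M[r1 + 0] <- r2
]
run(program, regs, mem)
print(mem[1000 + 3 * 4])            # the element a[3] now holds x
```

With these assumed values the store lands at address 1000 + 3 * 4, which is where a[3] lives when elements are 4 bytes wide, as the CONST 4 in the tree suggests.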

  16. Example Tilings
      • Another way to tile this IR tree for a[i] := x (IR tree as on slide 14)
          LOAD  r1 ← M[FP + a]
          ADDI  r2 ← r0 + 4
          MUL   r2 ← ri * r2
          ADD   r1 ← r1 + r2

  17. Example Tilings
      • An “anti-optimal” tiling of the tree for a[i] := x (IR tree as on slide 14)
          ADDI  r1 ← r0 + a
          ADD   r1 ← FP + r1
          LOAD  r1 ← M[r1 + 0]
          ADDI  r2 ← r0 + 4
          MUL   r2 ← ri * r2
          ADD   r1 ← r1 + r2
          ADDI  r2 ← r0 + x
          ADD   r2 ← FP + r2

  18. Optimal vs Optimum Tilings
      • What’s the “best” tiling?
        • Minimal number of instructions?
        • Best performance at runtime?
      • Compositionality assumption: can compute “best” based on each tile (reality: cost is not additive!)
      • Choice here: minimal number of instructions
      • Optimal: no gain from combining two neighboring tiles
      • Optimum: no tiling has lower cost
      • Property for optimal: local, for optimum: global
      • Note that optimum ⇒ optimal, not vice versa

  19. Comparing Criteria
      • Obviously, optimal is easier than optimum
      • Then, how valuable is optimum?
      • RISC CPU architecture: not terribly important
        • each tile small, optimal/optimum often identical
      • CISC CPU architecture: more important
        • larger tiles, many choices everywhere

  20. Algorithm: Maximal Munch
      • A greedy algorithm, fast, easy to understand
      • Idea:
        • Start from the root of the IR tree, work downward
        • At each node N, choose the biggest tile that matches
        • Recurse on the leaves of the chosen tile (not the children of N!)
      • Note: never gets stuck if all single-node tiles exist
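The greedy procedure above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the textbook implementation: trees are nested tuples, `"_"` marks a blank, tile size is the number of non-blank nodes, ties are broken by listing order, and only tile names are emitted (no register assignment). The names `match`, `size`, `munch`, and `TILES` are hypothetical; the tile list transcribes the Jouette tiles from the earlier slides.

```python
BLANK = "_"
C = ("CONST",)                      # pattern node matching any constant

# (instruction name, tree pattern); None means "no instruction" (a register).
TILES = [
    ("MOVEM", ("MOVE", ("MEM", BLANK), ("MEM", BLANK))),
    ("STORE", ("MOVE", ("MEM", ("+", BLANK, C)), BLANK)),
    ("STORE", ("MOVE", ("MEM", ("+", C, BLANK)), BLANK)),
    ("STORE", ("MOVE", ("MEM", C), BLANK)),
    ("STORE", ("MOVE", ("MEM", BLANK), BLANK)),
    ("LOAD", ("MEM", ("+", BLANK, C))),
    ("LOAD", ("MEM", ("+", C, BLANK))),
    ("LOAD", ("MEM", C)),
    ("LOAD", ("MEM", BLANK)),
    ("ADDI", ("+", BLANK, C)),
    ("ADDI", ("+", C, BLANK)),
    ("ADDI", C),
    ("SUBI", ("-", BLANK, C)),
    ("ADD", ("+", BLANK, BLANK)),
    ("SUB", ("-", BLANK, BLANK)),
    ("MUL", ("*", BLANK, BLANK)),
    ("DIV", ("/", BLANK, BLANK)),
    (None, ("TEMP",)),
]


def size(pat):
    """Number of concrete (non-blank) nodes in a pattern."""
    return 0 if pat == BLANK else 1 + sum(size(p) for p in pat[1:])


def match(pat, tree):
    """Return the subtrees under the pattern's blanks, or None if no match."""
    if pat == BLANK:
        return [tree]
    if pat[0] != tree[0]:
        return None
    if pat[0] in ("CONST", "TEMP"):
        return []                   # any constant / any register matches
    if len(pat) != len(tree):
        return None
    leaves = []
    for p, t in zip(pat[1:], tree[1:]):
        sub = match(p, t)
        if sub is None:
            return None
        leaves += sub
    return leaves


BY_SIZE = sorted(TILES, key=lambda t: size(t[1]), reverse=True)


def munch(tree, out=None):
    """Pick the biggest matching tile at the root, then recurse on its leaves."""
    out = [] if out is None else out
    for name, pat in BY_SIZE:
        leaves = match(pat, tree)
        if leaves is not None:
            for leaf in leaves:     # operands are computed first ...
                munch(leaf, out)
            if name:
                out.append(name)    # ... then this tile's instruction
            return out
    raise ValueError(f"no tile matches a {tree[0]} node")


FP = ("TEMP", "FP")
tree = ("MOVE",
        ("MEM", ("+", ("MEM", ("+", FP, ("CONST", "a"))),
                      ("*", ("TEMP", "i"), ("CONST", 4)))),
        ("MEM", ("+", FP, ("CONST", "x"))))
print(munch(tree))  # ['LOAD', 'ADDI', 'MUL', 'ADD', 'ADDI', 'MOVEM']
```

Tiles are chosen top-down, but each instruction is appended only after the instructions for its leaves, so the emitted list is already in execution order. Under the tile set assumed here, the root of the a[i] := x tree is covered by the three-node MOVEM tile, because the larger STORE patterns require a CONST operand under the + node.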

  21. Maximal Munch Example
      • The second tiling for a[i] := x (IR tree as on slide 14)
          LOAD  r1 ← M[FP + a]
          ADDI  r2 ← r0 + 4
          MUL   r2 ← ri * r2
          ADD   r1 ← r1 + r2
          ADDI  r2 ← FP + x

  22. Optimum Algorithm
      • An algorithm based on dynamic programming, a bit more complex than maximal munch
      • Idea:
        • Start from the bottom of the IR tree, work upward (recursion: process children, then current node)
        • Concept: assign a cost to each node (bottom up)
        • At each node, compute the cost for each matching tile T by adding the cost of T to the costs of T's leaves, and keep the cheapest
      • Solution is optimum
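The cost computation can be sketched as a memoized recursion: the cost of a node is only known once the costs of a tile's leaves have been computed, which gives the bottom-up order. This is a minimal sketch under assumptions: trees are nested tuples, `"_"` marks a blank, every instruction costs 1 and a register costs 0, and the names `match`, `best`, and `TILES` are hypothetical. The tile list transcribes the Jouette tiles from the earlier slides.

```python
from functools import lru_cache

BLANK = "_"
C = ("CONST",)                      # pattern node matching any constant

# (instruction name, tree pattern, cost); cost 1 per instruction is assumed.
TILES = [
    ("MOVEM", ("MOVE", ("MEM", BLANK), ("MEM", BLANK)), 1),
    ("STORE", ("MOVE", ("MEM", ("+", BLANK, C)), BLANK), 1),
    ("STORE", ("MOVE", ("MEM", ("+", C, BLANK)), BLANK), 1),
    ("STORE", ("MOVE", ("MEM", C), BLANK), 1),
    ("STORE", ("MOVE", ("MEM", BLANK), BLANK), 1),
    ("LOAD", ("MEM", ("+", BLANK, C)), 1),
    ("LOAD", ("MEM", ("+", C, BLANK)), 1),
    ("LOAD", ("MEM", C), 1),
    ("LOAD", ("MEM", BLANK), 1),
    ("ADDI", ("+", BLANK, C), 1),
    ("ADDI", ("+", C, BLANK), 1),
    ("ADDI", C, 1),
    ("SUBI", ("-", BLANK, C), 1),
    ("ADD", ("+", BLANK, BLANK), 1),
    ("SUB", ("-", BLANK, BLANK), 1),
    ("MUL", ("*", BLANK, BLANK), 1),
    ("DIV", ("/", BLANK, BLANK), 1),
    (None, ("TEMP",), 0),           # a register needs no instruction
]


def match(pat, tree):
    """Return the subtrees under the pattern's blanks, or None if no match."""
    if pat == BLANK:
        return [tree]
    if pat[0] != tree[0]:
        return None
    if pat[0] in ("CONST", "TEMP"):
        return []
    if len(pat) != len(tree):
        return None
    leaves = []
    for p, t in zip(pat[1:], tree[1:]):
        sub = match(p, t)
        if sub is None:
            return None
        leaves += sub
    return leaves


@lru_cache(maxsize=None)
def best(tree):
    """Return (minimum cost, name of the tile chosen at the root) for tree."""
    options = []
    for name, pat, cost in TILES:
        leaves = match(pat, tree)
        if leaves is not None:
            # cost of tile T plus the (already minimal) costs of T's leaves
            total = cost + sum(best(leaf)[0] for leaf in leaves)
            options.append((total, name))
    return min(options, key=lambda o: o[0])   # ValueError if nothing matches


FP = ("TEMP", "FP")
tree = ("MOVE",
        ("MEM", ("+", ("MEM", ("+", FP, ("CONST", "a"))),
                      ("*", ("TEMP", "i"), ("CONST", 4)))),
        ("MEM", ("+", FP, ("CONST", "x"))))
print(best(tree)[0])  # 6
```

For the a[i] := x tree the minimum under these assumed unit costs is six instructions, the same length as the first example tiling, which fits the remark that optimal and optimum tilings often coincide on a RISC-like instruction set.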

  23. Algorithm Complexity
      • Parameters:
        • N: number of nodes in the given IR tree
        • T: number of tiles
        • K: average number of non-leaf nodes in tiles
        • K': max no. of nodes to check to see which tiles match
        • T': average number of tiles matching at a node
      • Maximal Munch: (K' + T') · N / K
      • Optimum (dyn. pgm.) algorithm: (K' + T') · N
      • But both are linear in N, the size of the IR tree!
      • “No problem!”
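To get a feeling for the two expressions, they can be evaluated for sample parameters. The numbers below are hypothetical values chosen for illustration, not measurements, and the function names are made up for this sketch.

```python
def maximal_munch_work(n, k, k_prime, t_prime):
    """Work estimate for maximal munch: only about N/K nodes are tile roots."""
    return (k_prime + t_prime) * n / k


def dynamic_programming_work(n, k_prime, t_prime):
    """Work estimate for the optimum algorithm: every node is examined."""
    return (k_prime + t_prime) * n


# Hypothetical parameters: 1000 IR nodes, tiles with 2 non-leaf nodes on
# average, 4 nodes inspected per match attempt, 5 tiles matching per node.
N, K, K_PRIME, T_PRIME = 1000, 2, 4, 5
print(maximal_munch_work(N, K, K_PRIME, T_PRIME))      # 4500.0
print(dynamic_programming_work(N, K_PRIME, T_PRIME))   # 9000
```

The two estimates differ only by the constant factor K, so doubling N doubles both.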

  24. Tree Grammars
      • Motivation: some CPUs, e.g., Motorola 68000, have register classes: data vs. address registers
      • Problem: using the previous algorithm, sub-tiling may produce a result in the wrong type of register
      • Idea:
        • Specify tiles as CFG rules ➜
        • Non-terminal indicates class
        • Derivation creates IR tree
        • Ambiguity = alternative tilings
      • Example rules (d = data register, a = address register):
          d ➜ MEM(+(a, CONST))
          d ➜ MEM(+(CONST, a))
          d ➜ MEM(CONST)
          d ➜ MEM(a)
          d ➜ a
          a ➜ d
      • Tools exist (code-generator generators), usage not unlike parser generators
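One way to read these rules is as a cost computation per register class: for every subtree, record the cheapest way to get its value into a d register and into an a register. The sketch below is a simplified illustration, not how any particular code-generator generator works. Assumptions: trees are nested tuples, each rule costs 1, the chain rules d ➜ a and a ➜ d stand for a register-to-register move of cost 1, and an extra rule a ➜ TEMP of cost 0 is added so that leaves can be derived. The names `RULES`, `CHAIN`, `match`, and `costs` are hypothetical.

```python
INF = float("inf")

# (non-terminal, pattern, cost); a string inside a pattern is a non-terminal.
RULES = [
    ("d", ("MEM", ("+", "a", ("CONST",))), 1),
    ("d", ("MEM", ("+", ("CONST",), "a")), 1),
    ("d", ("MEM", ("CONST",)), 1),
    ("d", ("MEM", "a"), 1),
    ("a", ("TEMP",), 0),            # assumption: temporaries live in a-registers
]
# Chain rules (lhs, rhs, cost): d -> a and a -> d, assumed to cost one move.
CHAIN = [("d", "a", 1), ("a", "d", 1)]


def match(pat, tree):
    """Return [(non-terminal, subtree), ...] for the pattern's leaves, or None."""
    if isinstance(pat, str):
        return [(pat, tree)]
    if pat[0] != tree[0]:
        return None
    if pat[0] in ("CONST", "TEMP"):
        return []
    if len(pat) != len(tree):
        return None
    leaves = []
    for p, t in zip(pat[1:], tree[1:]):
        sub = match(p, t)
        if sub is None:
            return None
        leaves += sub
    return leaves


def costs(tree):
    """Map each non-terminal to the minimum cost of deriving tree from it."""
    best = {}
    for lhs, pat, cost in RULES:
        leaves = match(pat, tree)
        if leaves is None:
            continue
        sub = [costs(t).get(nt) for nt, t in leaves]
        if None in sub:             # some leaf cannot reach the required class
            continue
        total = cost + sum(sub)
        if total < best.get(lhs, INF):
            best[lhs] = total
    changed = True
    while changed:                  # apply chain rules until nothing improves
        changed = False
        for lhs, rhs, cost in CHAIN:
            if rhs in best and best[rhs] + cost < best.get(lhs, INF):
                best[lhs] = best[rhs] + cost
                changed = True
    return best


tree = ("MEM", ("MEM", ("TEMP", "t")))
print(costs(tree))
```

In the nested load, the inner MEM produces a d register, but the outer MEM needs its address in an a register, so the cheapest derivation pays for one a ➜ d move between the two loads.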

  25. CPU Architecture Issues
      • RISC was mostly invented to fit well with modern code generation
      • RISC features, good and bad:
        • many registers (e.g., 32)
        • every register can do everything (just one class)
        • arithmetic operations only on registers (no MUL?)
        • three-address instructions (flexible placement)
        • just one memory addressing mode (M[reg+const])
        • uniform instruction size (e.g., 32 bit)
        • every instruction has a single effect/result

  26. Summary
      • IR nodes do one thing, instructions many
      • Tree patterns, tiles, ‘leaves’ of tiles
      • Instruction selection: cover IR tree with tiles
      • Jouette architecture, instruction set
      • Jouette statement tiles, expression tiles
      • Example tilings
      • Optimum vs. optimal tilings
      • Algorithms: maximal munch; dyn. programming
      • Tree grammars
