

  1. Combinatorial Register Allocation and Instruction Scheduling
     Christian Schulte, KTH Royal Institute of Technology; RISE SICS (Swedish Institute of Computer Science) until June 2018
     Joint work with: Mats Carlsson (RISE SICS), Roberto Castañeda Lozano (RISE SICS + KTH), Frej Drejhammar (RISE SICS), Gabriel Hjort Blindell (KTH + RISE SICS)
     Funded by: Ericsson AB, Swedish Research Council (VR 621-2011-6229)

  2. Compilation
     (Combinatorial Register Allocation & Instruction Scheduling, Schulte, September 2019)
     source program → front-end → optimizer → back-end (code generator) → assembly program
     • Front-end: depends on source programming language
       • changes infrequently (well…)
     • Optimizer: independent optimizations
       • changes infrequently (well…)
     • Back-end: depends on processor architecture
       • changes often: new process, new architectures, new features, …

  3. Generating Code: Unison
     source program → front-end → optimizer → back-end (code generator): Unison → assembly program
     • Infrequent changes: front-end & optimizer
       • reuse state-of-the-art: LLVM, for example
     • Frequent changes: back-end
       • use flexible approach: Unison

  4. State of the Art
     Instruction selection:
       x = y + z;  →  add r0 r1 r2
                      mv $a6f0 r0
     • Code generation organized into stages
       • instruction selection

  5. State of the Art
     Register allocation, for x = y + z; …
       x → register r0
       y → memory (spill to stack)
       …
     • Code generation organized into stages
       • instruction selection, register allocation

  6. State of the Art
     Instruction scheduling:
       x = y + z; … u = v - w;  →  u = v - w; … x = y + z;
     • Code generation organized into stages
       • instruction selection, register allocation, instruction scheduling

  7. State of the Art
     Stages: instruction selection → register allocation → instruction scheduling
     • Code generation organized into stages
       • stages are interdependent: no optimal order possible

  8. State of the Art
     Stages: instruction selection → instruction scheduling → register allocation
     • Code generation organized into stages
       • stages are interdependent: no optimal order possible
     • Example: instruction scheduling → register allocation
       • increased delay between instructions can increase throughput
         → registers used over longer time-spans
         → more registers needed

  9. State of the Art
     Stages: instruction selection → register allocation → instruction scheduling
     • Code generation organized into stages
       • stages are interdependent: no optimal order possible
     • Example: instruction scheduling ← register allocation
       • put variables into fewer registers
         → more dependencies among instructions
         → less opportunity for reordering instructions
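
The effect described on slide 9 can be sketched in a few lines of Python. This is only an illustration, not part of the talk: the two four-instruction blocks, the register names, and the helpers `depends` and `legal_orders` are all assumptions made for the example. Each instruction is modelled as a pair (registers written, registers read), and two instructions must keep their relative order if they share a read-after-write, write-after-read, or write-after-write dependency on some register.

```python
from itertools import permutations

def depends(earlier, later):
    """True if `later` must stay after `earlier` (RAW, WAR or WAW on a register)."""
    (w1, r1), (w2, r2) = earlier, later
    return bool(w1 & r2 or r1 & w2 or w1 & w2)

def legal_orders(block):
    """Count the instruction orders that respect every pairwise dependency."""
    n = len(block)
    deps = [(i, j) for i in range(n) for j in range(i + 1, n)
            if depends(block[i], block[j])]
    count = 0
    for order in permutations(range(n)):
        pos = {ins: p for p, ins in enumerate(order)}
        if all(pos[i] < pos[j] for i, j in deps):
            count += 1
    return count

# Hypothetical block: two independent load/add chains.
# REUSED recycles r0 for the second load; FRESH gives it its own register r3.
REUSED = [({"r0"}, set()), ({"r1"}, {"r0"}), ({"r0"}, set()), ({"r2"}, {"r0"})]
FRESH = [({"r0"}, set()), ({"r1"}, {"r0"}), ({"r3"}, set()), ({"r2"}, {"r3"})]

print(legal_orders(REUSED), legal_orders(FRESH))
```

In this toy block, reusing `r0` chains all four instructions into a single legal order, while spending one more register leaves six legal orders for a scheduler to choose from.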

  10. State of the Art
     Stages: instruction selection → instruction scheduling → register allocation
     • Code generation organized into stages
       • stages are interdependent: no optimal order possible
     • Stages use heuristic algorithms
       • for hard combinatorial problems (NP hard)
       • assumption: optimal solutions not possible anyway
       • difficult to take advantage of processor features
       • error-prone when adapting to change

  11. State of the Art
     Stages: instruction selection → instruction scheduling → register allocation
     • Code generation organized into stages
       • stages are interdependent: no optimal order possible
     • Stages use heuristic algorithms
       • for hard combinatorial problems (NP hard)
       • assumption: optimal solutions not possible anyway
       • difficult to take advantage of processor features
       • error-prone when adapting to change
     → preclude optimal code, make development complex

  12. Rethinking: Unison Idea
     • No more staging and complex heuristic algorithms!
       • many assumptions are decades old...
     • Use state-of-the-art technology for solving combinatorial optimization problems: constraint programming
       • tremendous progress in last two decades...
     • Generate and solve single model
       • captures all code generation tasks in unison
       • high-level of abstraction: based on processor description
       • flexible: ideally, just change processor description
       • potentially optimal: tradeoff between decisions accurately reflected

  13. Unison Approach
     input program + processor description → constraint model
       (instruction selection constraints, instruction scheduling constraints, register allocation constraints)
     • Generate constraint model
       • based on input program and processor description
       • constraints for all code generation tasks
       • generate but not solve: simpler and more expressive

  14. Unison Approach
     input program + processor description → constraint model
       (instruction selection constraints, instruction scheduling constraints, register allocation constraints)
       → off-the-shelf constraint solver → assembly program
     • Off-the-shelf constraint solver solves constraint model
       • solution is assembly program
       • optimization takes inter-dependencies into account
       • optimal solution (with respect to the model) possible in principle, given enough time
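
To make the idea of one integrated model concrete, here is a minimal sketch in Python. It is not Unison and uses no constraint solver: plain exhaustive search stands in for the off-the-shelf solver, and the block `OPS`, its latencies, and the helpers `evaluate` and `solve` are hypothetical. The search picks an instruction order that minimises cycles on a single-issue machine while staying within a register limit. For a straight-line block the number of registers needed is taken to be the peak number of simultaneously live temporaries, where a value's register may be reused by the instruction that last reads it.

```python
from itertools import permutations

# Hypothetical block: name -> (defined temporary, used temporaries, latency in cycles)
OPS = {
    "load_a": ("a", (), 2),
    "load_b": ("b", (), 2),
    "load_c": ("c", (), 2),
    "load_d": ("d", (), 2),
    "add_x": ("x", ("a", "b"), 1),
    "add_y": ("y", ("c", "d"), 1),
    "add_z": ("z", ("x", "y"), 1),
}

def evaluate(order):
    """Return (cycles, registers) for an order, or None if a use precedes its def."""
    ready = {}      # temporary -> first cycle in which its value is available
    born = {}       # temporary -> position of its definition in the order
    last_use = {}   # temporary -> position of its last use in the order
    cycle = 0
    for pos, name in enumerate(order):
        dest, sources, latency = OPS[name]
        if any(s not in ready for s in sources):
            return None
        cycle = max([cycle] + [ready[s] for s in sources])  # stall for operands
        ready[dest] = cycle + latency
        born[dest] = pos
        for s in sources:
            last_use[s] = pos
        cycle += 1  # single issue: the next instruction starts at least a cycle later
    # Temporaries without a use (here: z) are treated as live until the block ends.
    pressure = max(
        sum(1 for t in born if born[t] <= k < last_use.get(t, len(order)))
        for k in range(len(order))
    )
    return max(ready.values()), pressure

def solve(num_registers):
    """Best (cycles, order) within the register limit, or None if none exists."""
    best = None
    for order in permutations(OPS):
        result = evaluate(order)
        if result is None or result[1] > num_registers:
            continue
        if best is None or result[0] < best[0]:
            best = (result[0], order)
    return best

print(solve(4))
print(solve(3))
```

In this toy block the best schedule takes 7 cycles with four registers and 8 cycles with three, and no schedule fits in two. Because ordering and register demand are decided together, the search sees that tradeoff directly instead of committing to one stage before the other.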

  15. Scope of this Talk
     • Unison proper
       • instruction scheduling
       • register allocation
     • Instruction selection not covered
       • also constraint-based model available
       • less mature
       • Complete and Practical Universal Instruction Selection, Gabriel Hjort Blindell, Mats Carlsson, Roberto Castañeda Lozano, Christian Schulte. Transactions on Embedded Computing Systems, ACM Press, 2017.

  16. Overview
     • Basic Register Allocation
     • Instruction Scheduling
     • Advanced Register Allocation [sketch]
     • Global Register Allocation
     • Solving
     • Evaluation
     • Discussion

  17. Source Material
     • Constraint-based Register Allocation and Instruction Scheduling, Roberto Castañeda Lozano, Mats Carlsson, Frej Drejhammar, Christian Schulte. CP 2012.
     • Combinatorial Spill Code Optimization and Ultimate Coalescing, Roberto Castañeda Lozano, Mats Carlsson, Gabriel Hjort Blindell, Christian Schulte. LCTES 2014.
     • Combinatorial Register Allocation and Instruction Scheduling, Roberto Castañeda Lozano, Mats Carlsson, Gabriel Hjort Blindell, Christian Schulte. Transactions on Programming Languages and Systems, ACM Press, 2019.
     • Survey on Combinatorial Register Allocation and Instruction Scheduling, Roberto Castañeda Lozano, Christian Schulte. Computing Surveys, ACM Press, 2019.

  18. Unit and Scope
     • Function is unit of compilation
       • generate code for one function at a time
     • Scope
       • global: generate code for whole function
       • local: generate code for each basic block in isolation
     • Basic block: instructions that are always executed together
       • start execution at beginning of block
       • execute all instructions
       • leave execution at end of block
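
As a rough illustration of the basic-block definition above, the following Python sketch splits a flat instruction list into blocks. The textual instruction format, the label convention (a trailing colon), and the opcode set `BRANCHES` are assumptions made for the example, not something the talk specifies.

```python
BRANCHES = {"jmp", "beq", "jr"}  # hypothetical opcodes that end a block

def basic_blocks(instructions):
    """Split a flat instruction list into basic blocks.

    A label starts a new block (execution may enter there) and a branch
    ends the current block (execution may leave there).
    """
    blocks, current = [], []
    for ins in instructions:
        if ins.endswith(":") and current:
            blocks.append(current)
            current = []
        current.append(ins)
        if ins.split()[0] in BRANCHES:
            blocks.append(current)
            current = []
    if current:
        blocks.append(current)
    return blocks

PROGRAM = [
    "entry:",
    "t1 = load",
    "beq t1 exit",
    "t2 = mul t1 2",
    "jmp exit",
    "exit:",
    "jr t2",
]

for block in basic_blocks(PROGRAM):
    print(block)
```

Local code generation would then handle each returned block in isolation, while global code generation considers all blocks of the function together.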

  19. BASIC REGISTER ALLOCATION
     Local (and slightly naïve) register allocation

  20. Local Register Allocation
       t2 ← mul t1, 2
       t3 ← sub t1, 2
       t4 ← add t2, t3
       ...
       t5 ← mul t1, t4
       ← jr t5
     • Instruction selection has already been performed
     • Temporaries
       • defined, or def-occurrence (lhs): t3 in t3 ← sub t1, 2
       • used, or use-occurrence (rhs): t1 in t3 ← sub t1, 2
     • Basic blocks are in SSA (static single assignment) form
       • each temporary is defined once
       • standard state-of-the-art approach
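
The def/use terminology and the SSA property can be checked mechanically. The sketch below encodes the slide's example block as Python tuples; the tuple layout (destination, opcode, operands) and the helper names `defs_and_uses` and `is_ssa` are assumptions made for illustration.

```python
# The example block from the slide: (defined temporary or None, opcode, operands).
BLOCK = [
    ("t2", "mul", ["t1", "2"]),
    ("t3", "sub", ["t1", "2"]),
    ("t4", "add", ["t2", "t3"]),
    ("t5", "mul", ["t1", "t4"]),
    (None, "jr", ["t5"]),
]

def defs_and_uses(block):
    """Map each temporary to the instruction indices that define and use it."""
    defs, uses = {}, {}
    for index, (dest, _, operands) in enumerate(block):
        if dest is not None:
            defs.setdefault(dest, []).append(index)
        for op in operands:
            if op.startswith("t"):  # skip immediates such as "2"
                uses.setdefault(op, []).append(index)
    return defs, uses

def is_ssa(block):
    """A block is in SSA form if every temporary is defined exactly once."""
    defs, _ = defs_and_uses(block)
    return all(len(sites) == 1 for sites in defs.values())

defs, uses = defs_and_uses(BLOCK)
print(defs)
print(uses)
print(is_ssa(BLOCK))
```

Note that `t1` has uses but no def inside the block: it is defined before the block is entered.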

  21. Liveness & Interference
       t2 ← mul t1, 2
       t3 ← sub t1, 2
       t4 ← add t2, t3
       ...
       t5 ← mul t1, t4
       ← jr t5
     [Figure: live ranges of t1, t2, t3, t4, t5 drawn next to the instructions]
     • Temporary is live from def to last use, defining its live range
       • live ranges are linear (basic block + SSA)
     • Temporaries interfere if their live ranges overlap
     • Non-interfering temporaries can be assigned to same register
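
The three bullets above translate into a short Python sketch over the slide's example block. Several choices here are assumptions rather than the talk's definitions: `t1` is treated as live on entry (start position -1), live ranges are half-open intervals so that an instruction's result may reuse the register of an operand it reads for the last time, and registers are handed out greedily in order of definition.

```python
# The example block: (defined temporary or None, used temporaries).
BLOCK = [
    ("t2", ["t1"]),
    ("t3", ["t1"]),
    ("t4", ["t2", "t3"]),
    ("t5", ["t1", "t4"]),
    (None, ["t5"]),
]
LIVE_IN = ["t1"]  # assumption: t1 is defined before the block

def live_ranges(block, live_in):
    """Map each temporary to (def position, last-use position)."""
    start = {t: -1 for t in live_in}
    end = {}
    for index, (dest, operands) in enumerate(block):
        for t in operands:
            end[t] = index
        if dest is not None:
            start[dest] = index
    return {t: (start[t], end.get(t, start[t] + 1)) for t in start}

def interferes(a, b):
    """Two half-open live ranges interfere if they overlap."""
    return a[0] < b[1] and b[0] < a[1]

def assign(ranges):
    """Give each temporary the lowest register not held by an interfering one."""
    registers = {}
    for t in sorted(ranges, key=lambda t: ranges[t][0]):
        taken = {registers[u] for u in registers if interferes(ranges[t], ranges[u])}
        registers[t] = next(r for r in range(len(ranges)) if r not in taken)
    return registers

ranges = live_ranges(BLOCK, LIVE_IN)
print(ranges)
print(assign(ranges))
```

Under these assumptions `t1`, `t2` and `t3` interfere pairwise and need three registers, while `t4` and `t5` can reuse registers that have already been freed.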
