
Combinatorial Register Allocation and Instruction Scheduling
Christian Schulte
KTH Royal Institute of Technology
RISE SICS (Swedish Institute of Computer Science), until June 2018
Joint work with:
• Mats Carlsson, RISE SICS
• Roberto Castañeda Lozano, RISE SICS + KTH
• Frej Drejhammar, RISE SICS
• Gabriel Hjort Blindell, KTH + RISE SICS
Funded by:
• Ericsson AB
• Swedish Research Council (VR 621-2011-6229)

Compilation
source program → front-end → optimizer → back-end (code generator) → assembly program
• Front-end: depends on source programming language
  • changes infrequently (well…)
• Optimizer: independent optimizations
  • changes infrequently (well…)
• Back-end: depends on processor architecture
  • changes often: new process, new architectures, new features, …
Combinatorial Register Allocation & Instruction Scheduling, Schulte, September 2019

Generating Code: Unison
source program → front-end → optimizer → back-end (code generator): Unison → assembly program
• Infrequent changes: front-end & optimizer
  • reuse state-of-the-art: LLVM, for example
• Frequent changes: back-end
  • use flexible approach: Unison

State of the Art
Instruction selection: x = y + z; → add r0 r1 r2; mv $a6f0 r0
• Code generation organized into stages
  • instruction selection,

State of the Art
Register allocation: x = y + z; … with x → register r0, y → memory (spill to stack)
• Code generation organized into stages
  • instruction selection, register allocation,

State of the Art
Instruction scheduling: x = y + z; u = v - w; … → u = v - w; x = y + z; …
• Code generation organized into stages
  • instruction selection, register allocation, instruction scheduling

State of the Art
Stage order: instruction selection → register allocation → instruction scheduling
• Code generation organized into stages
  • stages are interdependent: no optimal order possible

State of the Art
Stage order: instruction selection → instruction scheduling → register allocation
• Code generation organized into stages
  • stages are interdependent: no optimal order possible
• Example: instruction scheduling ↔ register allocation
  • increased delay between instructions can increase throughput → registers used over longer time-spans → more registers needed
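The effect described above can be checked on a small example. The sketch below is illustrative and not taken from the talk: the eight-instruction block, the (dst, srcs) encoding, and the helper `max_live` are all assumptions. It counts how many temporaries are live at the same time under two schedules of the same computation.

```python
# Hypothetical encoding: (defined temporary or None, used temporaries).
def max_live(schedule):
    """Return the largest number of temporaries live at the same time."""
    def_at, last_use = {}, {}
    for i, (dst, srcs) in enumerate(schedule):
        for s in srcs:
            last_use[s] = i          # later uses overwrite earlier ones
        if dst is not None:
            def_at[dst] = i
    # A temporary is live after instruction i if it is defined at or
    # before i and its last use comes after i.
    return max(
        sum(1 for t in def_at if def_at[t] <= i < last_use.get(t, def_at[t]))
        for i in range(len(schedule))
    )

def load(t):
    """A load defines t and uses no temporaries."""
    return (t, ())

# Compact: each sum is computed right after its operands are loaded.
compact = [load("a"), load("b"), ("c", ("a", "b")),
           load("d"), load("e"), ("f", ("d", "e")),
           ("g", ("c", "f")), (None, ("g",))]

# Spread: all loads are hoisted, so each load is further from its use.
spread = [load("a"), load("b"), load("d"), load("e"),
          ("c", ("a", "b")), ("f", ("d", "e")),
          ("g", ("c", "f")), (None, ("g",))]

print(max_live(compact), max_live(spread))  # prints "3 4"
```

Hoisting the loads gives a scheduler more freedom to hide latency, but in this example it raises the register need from three to four.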

State of the Art
Stage order: instruction selection → register allocation → instruction scheduling
• Code generation organized into stages
  • stages are interdependent: no optimal order possible
• Example: instruction scheduling ↔ register allocation
  • put variables into fewer registers → more dependencies among instructions → less opportunity for reordering instructions

State of the Art
Stage order: instruction selection → instruction scheduling → register allocation
• Code generation organized into stages
  • stages are interdependent: no optimal order possible
• Stages use heuristic algorithms
  • for hard combinatorial problems (NP hard)
  • assumption: optimal solutions not possible anyway
  • difficult to take advantage of processor features
  • error-prone when adapting to change

State of the Art
Stage order: instruction selection → instruction scheduling → register allocation
• Code generation organized into stages
  • stages are interdependent: no optimal order possible
• Stages use heuristic algorithms
  • for hard combinatorial problems (NP hard)
  • assumption: optimal solutions not possible anyway
  • difficult to take advantage of processor features
  • error-prone when adapting to change
→ preclude optimal code, make development complex

Rethinking: Unison Idea
• No more staging and complex heuristic algorithms!
  • many assumptions are decades old...
• Use state-of-the-art technology for solving combinatorial optimization problems: constraint programming
  • tremendous progress in last two decades...
• Generate and solve single model
  • captures all code generation tasks in unison
  • high level of abstraction: based on processor description
  • flexible: ideally, just change processor description
  • potentially optimal: tradeoff between decisions accurately reflected

Unison Approach
input program + processor description → constraint model (instruction selection constraints, instruction scheduling constraints, register allocation constraints)
• Generate constraint model
  • based on input program and processor description
  • constraints for all code generation tasks
  • generate but not solve: simpler and more expressive

Unison Approach
input program + processor description → constraint model (instruction selection constraints, instruction scheduling constraints, register allocation constraints) → off-the-shelf constraint solver → assembly program
• Off-the-shelf constraint solver solves constraint model
  • solution is assembly program
  • optimization takes inter-dependencies into account
  • optimal solution (with respect to the model) is possible in principle, given enough time
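To make "solving in unison" concrete, here is a toy sketch under strong assumptions. It is not Unison's model and does not use a constraint solver: exhaustive enumeration stands in for the solver, and the seven-instruction block, `USES`, and both helper functions are hypothetical. The decision is the instruction order, the constraint is that every temporary is defined before use, and the cost is the number of registers that order needs.

```python
from itertools import permutations

# Hypothetical block: each instruction defines the temporary it is named
# after and uses the listed temporaries.
USES = {"a": (), "b": (), "d": (), "e": (),
        "c": ("a", "b"), "f": ("d", "e"), "g": ("c", "f")}

def respects_dependencies(order):
    """Constraint: every temporary is defined before it is used."""
    pos = {t: i for i, t in enumerate(order)}
    return all(pos[s] < pos[t] for t in order for s in USES[t])

def registers_needed(order):
    """Cost: maximum number of simultaneously live temporaries.

    For linear live ranges this equals the number of registers needed.
    """
    pos = {t: i for i, t in enumerate(order)}
    last_use = {t: max((pos[u] for u in order if t in USES[u]),
                       default=pos[t])
                for t in order}
    return max(sum(1 for t in order if pos[t] <= i < last_use[t])
               for i in range(len(order)))

# Enumerate all schedules, keep the feasible ones, and optimize the cost.
feasible = [o for o in permutations(USES) if respects_dependencies(o)]
best = min(feasible, key=registers_needed)
worst = max(feasible, key=registers_needed)
print(registers_needed(best), registers_needed(worst))  # prints "3 4"
```

Because schedule and register need are evaluated together, the search can pick an order that a separate scheduling stage would have no reason to prefer. Enumeration only works for tiny blocks; a constraint solver replaces it with propagation and search.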

Scope of this Talk
• Unison proper
  • instruction scheduling
  • register allocation
• Instruction selection not covered
  • also constraint-based model available
  • less mature
  • Complete and Practical Universal Instruction Selection, Gabriel Hjort Blindell, Mats Carlsson, Roberto Castañeda Lozano, Christian Schulte. Transactions on Embedded Computing Systems, ACM Press, 2017.

Overview
• Basic Register Allocation
• Instruction Scheduling
• Advanced Register Allocation [sketch]
• Global Register Allocation
• Solving
• Evaluation
• Discussion

Source Material
• Constraint-based Register Allocation and Instruction Scheduling, Roberto Castañeda Lozano, Mats Carlsson, Frej Drejhammar, Christian Schulte. CP 2012.
• Combinatorial Spill Code Optimization and Ultimate Coalescing, Roberto Castañeda Lozano, Mats Carlsson, Gabriel Hjort Blindell, Christian Schulte. LCTES 2014.
• Combinatorial Register Allocation and Instruction Scheduling, Roberto Castañeda Lozano, Mats Carlsson, Gabriel Hjort Blindell, Christian Schulte. Transactions on Programming Languages and Systems, ACM Press, 2019.
• Survey on Combinatorial Register Allocation and Instruction Scheduling, Roberto Castañeda Lozano, Christian Schulte. Computing Surveys, ACM Press, 2019.

Unit and Scope
• Function is unit of compilation
  • generate code for one function at a time
• Scope
  • global: generate code for whole function
  • local: generate code for each basic block in isolation
• Basic block: instructions that are always executed together
  • start execution at beginning of block
  • execute all instructions
  • leave execution at end of block

BASIC REGISTER ALLOCATION
Local (and slightly naïve) register allocation

Local Register Allocation
t2 ← mul t1, 2
t3 ← sub t1, 2
t4 ← add t2, t3
...
t5 ← mul t1, t4
← jr t5
• Instruction selection has already been performed
• Temporaries
  • defined, or def-occurrence (lhs): t3 in t3 ← sub t1, 2
  • used, or use-occurrence (rhs): t1 in t3 ← sub t1, 2
• Basic blocks are in SSA (static single assignment) form
  • each temporary is defined once
  • standard state-of-the-art approach
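The def/use terminology and the SSA property can be illustrated on the block from the slide. The tuple encoding and the helper names `defs`, `uses`, and `is_ssa` are assumptions made for this sketch; the instructions elided by "..." on the slide are left out.

```python
# Block from the slide: (defined temporary or None, opcode, operands).
# Integer operands are immediates, strings are temporaries.
BLOCK = [
    ("t2", "mul", ("t1", 2)),
    ("t3", "sub", ("t1", 2)),
    ("t4", "add", ("t2", "t3")),
    ("t5", "mul", ("t1", "t4")),
    (None, "jr", ("t5",)),
]

def defs(block):
    """Temporaries with a def-occurrence (left-hand side), in order."""
    return [dst for dst, _, _ in block if dst is not None]

def uses(block):
    """Temporaries with at least one use-occurrence (right-hand side)."""
    return {op for _, _, ops in block for op in ops if isinstance(op, str)}

def is_ssa(block):
    """True if no temporary is defined more than once in the block."""
    d = defs(block)
    return len(d) == len(set(d))

print(defs(BLOCK))                             # ['t2', 't3', 't4', 't5']
print(sorted(uses(BLOCK) - set(defs(BLOCK))))  # ['t1']
print(is_ssa(BLOCK))                           # True
```

The second line of output shows that t1 is used but not defined here, so it must be defined before the block is entered.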

Liveness & Interference
t2 ← mul t1, 2
t3 ← sub t1, 2
t4 ← add t2, t3
...
t5 ← mul t1, t4
← jr t5
[Figure: live ranges of t1, t2, t3, t4, t5 drawn alongside the instructions]
• Temporary is live from def to last use, defining its live range
  • live ranges are linear (basic block + SSA)
• Temporaries interfere if their live ranges overlap
• Non-interfering temporaries can be assigned to same register
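A minimal sketch of these definitions, applied to the block from the slide. Several things are assumptions made here rather than stated in the talk: t1 is treated as live on entry, live ranges are half-open so that a temporary defined by an instruction may reuse the register of a temporary last used by that same instruction, and registers are assigned by a simple greedy pass instead of the combinatorial model.

```python
from itertools import combinations

# Block from the slide as (defined temporary or None, used temporaries).
# Immediates and the elided "..." are left out.
BLOCK = [
    ("t2", ("t1",)),
    ("t3", ("t1",)),
    ("t4", ("t2", "t3")),
    ("t5", ("t1", "t4")),
    (None, ("t5",)),
]

def live_ranges(block, live_in=("t1",)):
    """Map each temporary to (def position, last-use position)."""
    start = {t: -1 for t in live_in}   # assumption: defined before the block
    end = {}
    for i, (dst, srcs) in enumerate(block):
        if dst is not None:
            start[dst] = i
        for s in srcs:
            end[s] = i
    return {t: (start[t], end.get(t, start[t])) for t in start}

def interferes(r1, r2):
    """True if the half-open ranges [def, last use) overlap."""
    return r1[0] < r2[1] and r2[0] < r1[1]

def assign_registers(ranges):
    """Greedy pass in definition order: take the lowest free register."""
    reg = {}
    for t in sorted(ranges, key=lambda t: ranges[t][0]):
        taken = {reg[u] for u in reg if interferes(ranges[t], ranges[u])}
        reg[t] = next(r for r in range(len(ranges)) if r not in taken)
    return reg

ranges = live_ranges(BLOCK)
pairs = [(a, b) for a, b in combinations(sorted(ranges), 2)
         if interferes(ranges[a], ranges[b])]
print(pairs)   # [('t1', 't2'), ('t1', 't3'), ('t1', 't4'), ('t2', 't3')]
print(assign_registers(ranges))
# {'t1': 0, 't2': 1, 't3': 2, 't4': 1, 't5': 0}
```

Under these assumptions three registers suffice: t4 reuses the register of t2, and t5 reuses the register of t1.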
