Register Allocation III (CS553 Lecture)

Last time
  – Register allocation across function calls
Today
  – Register allocation options

Interference Graph Allocators

  – Chaitin
  – Briggs

Granularity of Allocation (Renumber step in Briggs)

What is allocated to registers?
  – Variables/Temporaries
  – Live ranges/Webs (i.e., du-chains with common uses)
  – Values (i.e., definitions; same as variables with SSA)

Example (flow graph with blocks b1, b2, b3, b4):
  b1:  t1: x := 5
  b2:  t2: y := x
       t3: x := y+1
  b3:  t4: ... x ...
       t5: x := 3
  b4:  t6: ... x ...

  Variables: 2 (x & y)
  Live ranges/Webs: 3 (t1 → t2, t4; t2 → t3; t3, t5 → t6)
  Values: 4 (t1, t2, t3, t5, φ(t3, t5))

What are the tradeoffs?

Each allocation unit is given a symbolic register name (e.g., s1, s2, etc.)

Coalescing

Move instructions
  – Code generation can produce unnecessary move instructions
        mov t1, t2
  – If we can assign t1 and t2 to the same register, we can eliminate the move

Idea
  – If t1 and t2 are not connected in the interference graph, coalesce them into a single variable

Problem
  – Coalescing can increase the number of edges and make a graph uncolorable
  – Limit coalescing to avoid uncolorable graphs

[Figure: nodes t1 and t2 are coalesced into a single node t1t2]
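To make the coalescing problem concrete, here is a minimal Python sketch. The graph, the node names a and b, and the helpers coalesce and is_k_colorable are illustrative assumptions, not part of the lecture or of any compiler. It merges two non-interfering nodes and shows how the merged node inherits the neighbors of both, which can turn a 2-colorable graph into an uncolorable one.

```python
from itertools import product


def coalesce(graph, a, b):
    """Merge non-interfering nodes a and b into one node named a+b."""
    if b in graph[a]:
        raise ValueError("cannot coalesce interfering nodes")
    merged = a + b
    new_graph = {}
    for node, nbrs in graph.items():
        if node in (a, b):
            continue
        # Redirect edges that pointed at a or b to the merged node.
        new_graph[node] = {merged if n in (a, b) else n for n in nbrs}
    # The merged node inherits the neighbors of both originals.
    new_graph[merged] = graph[a] | graph[b]
    return new_graph


def is_k_colorable(graph, k):
    """Brute-force check; only sensible for tiny example graphs."""
    nodes = sorted(graph)
    for colors in product(range(k), repeat=len(nodes)):
        assignment = dict(zip(nodes, colors))
        if all(assignment[u] != assignment[v] for u in graph for v in graph[u]):
            return True
    return False


# Hypothetical graph: the path t1 - a - b - t2 (t1 and t2 do not interfere).
graph = {"t1": {"a"}, "a": {"t1", "b"}, "b": {"a", "t2"}, "t2": {"b"}}
merged = coalesce(graph, "t1", "t2")

print(is_k_colorable(graph, 2))   # the path is 2-colorable
print(is_k_colorable(merged, 2))  # the merged graph is a triangle
```

In this example the path needs only two colors, while the coalesced graph is a triangle and needs three. This is the situation that limiting coalescing is meant to avoid.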
Coalescing Logistics

Rule
  – When building the interference graph, do NOT make virtual registers interfere due to copies.
  – If the virtual registers s1 and s2 do not interfere and there is a copy statement s1 = s2, then s1 and s2 can be coalesced.
  – Example

Computing the Interference Graph (in MiniJava compiler)

Use results of live variable analysis

Computing interference graph to enable coalescing

    for each flow graph node n do
      for each def in def(n) do
        for each temp in liveout(n) do
          { use is the source temp when stmt(n) is a MOVE }
          if ( not stmt(n) isa MOVE or use != temp) then
            E ← E ∪ (def, temp)

Coalescing in MiniJava compiler

Currently the InterferenceGraph only has one Temp.Temp associated with each node
  – represent each merged node with just one of the temps
  – keep a separate map of representatives mapped to sets of temps
  – also keep a map of temps mapped to their representative
  – when rewriting the code use the representative instead of the original temp

Register Allocation: Spilling

If we can’t find a k-coloring of the interference graph
  – Spill variables (nodes) until the graph is colorable

Choosing variables to spill
  – Choose least frequently accessed variables
  – Break ties by choosing nodes with the most conflicts in the interference graph
  – Yes, these are heuristics!
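The interference-graph loop above can be sketched in Python as follows. This is a sketch under stated assumptions, not the MiniJava compiler's code: the Instr record, its field names, and the tiny instruction sequence are all hypothetical. It adds an edge between each defined temp and each temp live out of the same statement, except between the destination and source of a MOVE.

```python
from collections import namedtuple

# Hypothetical instruction record: defined temps, used temps,
# temps live out of the statement, and whether it is a MOVE.
Instr = namedtuple("Instr", "defs uses live_out is_move")


def build_interference(instrs):
    """Return the set of interference edges as frozenset pairs."""
    edges = set()
    for ins in instrs:
        for d in ins.defs:
            for t in ins.live_out:
                if t == d:
                    continue  # a temp does not interfere with itself
                if ins.is_move and t in ins.uses:
                    continue  # copy: dest and source do not interfere
                edges.add(frozenset((d, t)))
    return edges


# Hypothetical straight-line code with its live-out sets:
#   s4 = 5
#   s2 = 1
#   s1 = s2          (MOVE)
#   s3 = s1 + s2 + s4
instrs = [
    Instr({"s4"}, set(), {"s4"}, False),
    Instr({"s2"}, set(), {"s2", "s4"}, False),
    Instr({"s1"}, {"s2"}, {"s1", "s2", "s4"}, True),
    Instr({"s3"}, {"s1", "s2", "s4"}, {"s3"}, False),
]
edges = build_interference(instrs)
print(sorted(sorted(e) for e in edges))
```

Here s1 and s2 are both live after the copy, yet no edge is added between them, so they remain candidates for coalescing. Both still interfere with s4.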
Weighted Interference Graph

Goal
  – $\mathrm{Weight}(s) = \sum_{\text{references } r \text{ of } s} f(r)$
  – f(r) is execution frequency of r

Static approximation
  – Use some reasonable scheme to rank variables
  – Some possibilities
      – Weight(s) = num of times s is used in program
      – Weight(s) = 10 × (# uses in loops) + (# uses in straightline code)
      – Weight(s) = 20 × (# uses in loops) + 2 × (# uses in straightline code) + (# uses in a branch statement)

Improvement #1: Simplification Phase [Chaitin 81]

Idea
  – Nodes with < k neighbors are guaranteed colorable
  – Improvement over simple greedy coloring algorithm

Remove them from the graph first
  – Reduces the degree of the remaining nodes

Must spill only when all remaining nodes have degree ≥ k

Referred to as pessimistic spilling

The Problem: Worst Case Assumptions

Is the following graph 2-colorable?

[Figure: interference graph on nodes s1, s2, s3, s4]

Clearly 2-colorable
  − But Chaitin’s algorithm leads to an immediate block and spill
  − The algorithm assumes the worst case, namely, that all neighbors will be assigned a different color

Improvement #2: Optimistic Spilling [Briggs 89]

[Figure: the same interference graph on nodes s1, s2, s3, s4]

Idea
  – Some neighbors might get the same color
  – Nodes with k neighbors might be colorable
  – Blocking does not imply that spilling is necessary
  – Push blocked nodes on stack (rather than place in spill set)   { defer decision }
  – Check colorability upon popping the stack, when more information is available
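The simplification phase and its worst-case behavior can be sketched in Python. Two things here are assumptions: the edges of the figure did not survive extraction, so the graph is taken to be the 4-cycle s1 - s2 - s3 - s4 - s1, and the helper names spill_weight and chaitin_simplify are hypothetical. The weight function is the second static approximation listed above.

```python
def spill_weight(uses_in_loops, uses_straightline):
    """Static estimate: 10 x (# uses in loops) + (# uses in straightline code)."""
    return 10 * uses_in_loops + uses_straightline


def chaitin_simplify(graph, k):
    """Repeatedly remove nodes with fewer than k neighbors.

    Returns (stack, blocked): the removed nodes in removal order, and the
    nodes still in the graph once every remaining node has degree >= k.
    """
    g = {n: set(nbrs) for n, nbrs in graph.items()}
    stack = []
    while True:
        low_degree = sorted(n for n in g if len(g[n]) < k)
        if not low_degree:
            break
        n = low_degree[0]
        for m in g[n]:
            g[m].discard(n)  # removing n lowers its neighbors' degree
        del g[n]
        stack.append(n)
    return stack, sorted(g)


# Assumed shape of the figure: a 4-cycle, so every node has degree 2.
cycle = {
    "s1": {"s2", "s4"},
    "s2": {"s1", "s3"},
    "s3": {"s2", "s4"},
    "s4": {"s1", "s3"},
}

print(chaitin_simplify(cycle, 2))  # nothing can be removed: immediate block
print(chaitin_simplify(cycle, 3))  # every node simplifies away
print(spill_weight(2, 3))
```

With k = 2 no node has fewer than two neighbors, so the pessimistic scheme blocks at once and would spill, even though alternating two colors around the cycle works.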
Algorithm [Briggs et al. 89]

    while interference graph not empty do
      while ∃ a node n with < k neighbors do          { simplify }
        Remove n from the graph
        Push n on a stack
      if any nodes remain in the graph then           { blocked with >= k edges }
        Pick a node n to spill                        { lowest spill-cost/highest degree }
        Push n on stack                               { defer decision }
        Remove n from the graph
    while stack not empty do
      Pop node n from stack
      if n is colorable then                          { make decision }
        Allocate n to a register
      else
        Insert spill code for n                       { Store after def; load before use }
        Reconstruct interference graph & start over

Example

Attempt to 2-color this graph

[Figure: interference graph on nodes a, b, c, d, e, f, shown with the nodes ordered by increasing weight and the resulting stack; * marks a blocked node]

Possible Register Allocation Design

Overall algorithm: graph coloring with simplification

Interference graph: two temps interfere if
  – one is defined in a stmt and the other is live out of the same stmt
  – exception is a MOVE statement where the temps are the source and dest

Coalesce: Briggs strategy
  – coalesce if new node will have fewer than K neighbors of significant degree (>= K) and both nodes are involved in a move

Spill heuristic:
  – spill the node with the lowest weight and break ties by spilling the node with the most total adjacent edges

Simplification:
  – optimistic, push by increasing weight and feasibility, select blocked nodes using spill heuristic

Select:
  – attempt selection on everything in the stack before generating spill code

Example

    // missing stack setup
    // callee-save assigns to temps
    t2 = $s0
    // procedure body
    t3 = 1
    Loop:
    if t3 > 20 goto End
    t3 = t3 + 1
    jal bar // defs: $v0
    goto Loop
    End:
    t4 = t3 + $v0
    t5 = t4 + t3
    $v0 = t5
    // callee-save reads from temps
    $s0 = t2
    // sink stmt, uses callee-saves
    // and caller-saves
    // missing stack cleanup and ret

After Spilling t2

    // missing stack setup
    // callee-save assigns to temps
    t2 = $s0
    M[$fp-12] = t2
    // procedure body
    t3 = 1
    Loop:
    if t3 > 20 goto End
    t3 = t3 + 1
    jal bar // defs: $v0
    goto Loop
    End:
    t4 = t3 + $v0
    t5 = t4 + t3
    $v0 = t5
    // callee-save reads from temps
    t2 = M[$fp-12]
    $s0 = t2
    // sink stmt, uses callee-saves
    // and caller-saves
    // missing stack cleanup and ret
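The Briggs et al. algorithm above can be sketched in Python as a simplify phase followed by a select phase. This is a simplified sketch under assumptions: the function name briggs_color, the weight dictionary, and the two test graphs are hypothetical. Instead of inserting spill code and rebuilding the graph, it only reports which nodes would need spilling, after attempting selection on everything in the stack.

```python
def briggs_color(graph, k, weight):
    """Optimistic coloring: returns (colors, spilled)."""
    g = {n: set(nbrs) for n, nbrs in graph.items()}
    stack = []
    # Simplify: push nodes, deferring the spill decision for blocked ones.
    while g:
        low_degree = sorted(n for n in g if len(g[n]) < k)
        if low_degree:
            n = low_degree[0]
        else:
            # Blocked: choose lowest weight, breaking ties by highest degree,
            # but push it on the stack rather than spilling right away.
            n = min(sorted(g), key=lambda x: (weight[x], -len(g[x])))
        for m in g[n]:
            g[m].discard(n)
        del g[n]
        stack.append(n)
    # Select: pop nodes and check colorability with actual neighbor colors.
    colors, spilled = {}, []
    while stack:
        n = stack.pop()
        used = {colors[m] for m in graph[n] if m in colors}
        free = [c for c in range(k) if c not in used]
        if free:
            colors[n] = free[0]
        else:
            spilled.append(n)  # a real allocator would insert spill code here
    return colors, spilled


# Assumed 4-cycle: blocks immediately under pessimistic spilling with k = 2.
cycle = {
    "s1": {"s2", "s4"},
    "s2": {"s1", "s3"},
    "s3": {"s2", "s4"},
    "s4": {"s1", "s3"},
}
cycle_colors, cycle_spilled = briggs_color(cycle, 2, dict.fromkeys(cycle, 1))

# A triangle really does need a spill when k = 2.
triangle = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}
tri_colors, tri_spilled = briggs_color(triangle, 2, {"a": 1, "b": 2, "c": 3})

print(cycle_colors, cycle_spilled)
print(tri_colors, tri_spilled)
```

On the 4-cycle the blocked node s1 is pushed anyway, and by the time it is popped its two neighbors share a color, so nothing is spilled. On the triangle the lowest-weight node a is the one left without a color.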
Example

After coalesce

    // missing stack setup
    // callee-save assigns to temps
    M[$fp-12] = $s0
    // procedure body
    t3 = 1
    Loop:
    if t3 > 20 goto End
    t3 = t3 + 1
    jal bar // defs: $v0
    goto Loop
    End:
    t4 = t3 + $v0
    t5 = t4 + t3
    $v0 = t5
    // callee-save reads from temps
    $s0 = M[$fp-12]
    // sink stmt, uses callee-saves
    // and caller-saves
    // missing stack cleanup and ret

After allocation

    // missing stack setup
    // callee-save assigns to temps
    M[$fp-12] = $s0
    // procedure body
    $s0 = 1
    Loop:
    if $s0 > 20 goto End
    $s0 = $s0 + 1
    jal bar // defs: $v0
    goto Loop
    End:
    $v0 = $s0 + $v0
    $s0 = $v0 + $s0
    $v0 = $s0
    // callee-save reads from temps
    $s0 = M[$fp-12]
    // sink stmt, uses callee-saves
    // and caller-saves
    // missing stack cleanup and ret

Improvement #3: Live Range Splitting [Chow & Hennessy 84]

Idea
  – Start with variables as our allocation unit
  – When a variable can’t be allocated, split it into multiple subranges for separate allocation
  – Selective spilling: put some subranges in registers, some in memory
  – Insert memory operations at boundaries

Why is this a good idea?

Improvement #4: Rematerialization [Chaitin 82] & [Briggs 84]

Idea
  – Selectively re-compute values rather than loading from memory
  – “Reverse CSE”

Easy case
  – Value can be computed in single instruction, and
  – All operands are available

Examples
  – Constants
  – Addresses of global variables
  – Addresses of local variables (on stack)

Next Time

Lecture
  – Instruction scheduling
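Returning to rematerialization (Improvement #4), the easy case can be sketched in Python by contrasting it with ordinary spill code. Everything in this sketch is an assumption made for illustration: the (dest, op, srcs) instruction tuples, the op names, and the function rewrite_spill are hypothetical. It also reuses the spilled temp's name for reloads, where a real allocator would typically introduce fresh temps.

```python
def rewrite_spill(code, temp, slot, rematerialize=True):
    """Rewrite code so that temp no longer needs a register across its range.

    code is a list of (dest, op, srcs) tuples. If temp has a single
    definition that loads a constant, each use recomputes the constant;
    otherwise a store follows each def and a load precedes each use.
    """
    defs = [ins for ins in code if ins[0] == temp]
    remat = rematerialize and len(defs) == 1 and defs[0][1] == "const"
    out = []
    for dest, op, srcs in code:
        if temp in srcs:
            if remat:
                out.append((temp, "const", defs[0][2]))  # re-compute before use
            else:
                out.append((temp, "load", (slot,)))      # load before use
        if dest == temp and remat:
            continue  # original def is dropped: every use recomputes the value
        out.append((dest, op, srcs))
        if dest == temp and not remat:
            out.append((None, "store", (slot, temp)))    # store after def
    return out


# Hypothetical code: t1 holds a constant and is used twice.
code = [
    ("t1", "const", (5,)),
    ("t2", "add", ("t1", "t3")),
    ("t4", "mul", ("t1", "t2")),
]
with_remat = rewrite_spill(code, "t1", "M[$fp-12]")
with_spill = rewrite_spill(code, "t1", "M[$fp-12]", rematerialize=False)

for ins in with_remat:
    print(ins)
```

In this example the rematerialized version issues no memory operations at all, while the plain spill adds one store and two loads.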