Dependence Testing - PowerPoint PPT Presentation

  1. Java HotSpot VM
     Optimizations in Java HotSpot Server VM (a JIT)
     – uses SSA: dead code, LICM, CSE, CP
     – range check elimination
     – loop unrolling
     – instruction scheduling for the UltraSPARC III
     – OOP optimizations for Java reflection API
     – hot spot detection
     – virtual method inlining and dynamic deoptimization to undo
     Other features
     – generational copying collection, mark and compact or incremental for old objects
     – fast thread synchronization using "a breakthrough"
     CS553 Lecture Data Dependence Analysis

     Compiling for Parallelism & Locality
     Last time
     – Data dependences and loops
     Today
     – Finish data dependence analysis for loops

     Dependence Testing in General
     General code
         do i_1 = l_1,h_1
           ...
           do i_n = l_n,h_n
             A(f(i_1,...,i_n)) = ... A(g(i_1,...,i_n))
           enddo
           ...
         enddo
     There exists a dependence between iterations I=(i_1,...,i_n) and J=(j_1,...,j_n) when
     – f(I) = g(J)
     – (l_1,...,l_n) ≤ I,J ≤ (h_1,...,h_n)

     Algorithms for Solving the Dependence Problem
     Heuristics
     – GCD test (Banerjee76, Towle76): determines whether integer solution is possible, no bounds checking
     – Banerjee test (Banerjee 79): checks real bounds
     – Independent-variables test (pg. 820): useful when inequalities are not coupled
     – I-Test (Kong et al. 90): integer solution in real bounds
     – Lambda test (Li et al. 90): all dimensions simultaneously
     – Delta test (Goff et al. 91): pattern matches for efficiency
     – Power test (Wolfe et al. 92): extended GCD and Fourier-Motzkin combination
     Use some form of Fourier-Motzkin elimination for integers, exponential worst-case
     – Parametric Integer Programming (Feautrier91)
     – Omega test (Pugh92)
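The general formulation above can be checked by brute force on small iteration spaces. The following is a minimal sketch, not part of the lecture: the function name `find_dependences` and its argument layout are assumptions chosen for illustration, and the enumeration is exponential in the loop depth, which is exactly why the tests listed above exist.

```python
from itertools import product

def find_dependences(f, g, bounds):
    """Enumerate all iteration pairs (I, J) with f(I) == g(J).

    f, g   -- subscript functions of the written and the read reference
    bounds -- list of inclusive (lower, upper) pairs, one per loop level
    (hypothetical helper for illustration; not from the slides)
    """
    ranges = [range(lo, hi + 1) for lo, hi in bounds]
    points = list(product(*ranges))
    return [(I, J) for I in points for J in points if f(*I) == g(*J)]

# Loop used elsewhere in these slides:
#   do i = 1,5
#     A(3*i+2) = A(2*i+1)+1
#   enddo
pairs = find_dependences(lambda i: 3 * i + 2, lambda i: 2 * i + 1, [(1, 5)])
print(pairs)  # [((1,), (2,)), ((3,), (5,))]
```

Here the write in iteration 1 and the read in iteration 2 both touch A(5), and the write in iteration 3 and the read in iteration 5 both touch A(11).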

  2. Dependence Testing
     Consider the following code…
         do i = 1,5
           A(3*i+2) = A(2*i+1)+1
         enddo
     Question
     – How do we determine whether one array reference depends on another across iterations of an iteration space?

     Dependence Testing: Simple Case
     Sample code
         do i = l,h
           A(a*i+c_1) = ... A(a*i+c_2)
         enddo
     Dependence?
     – a*i_1 + c_1 = a*i_2 + c_2, or
     – a*i_1 - a*i_2 = c_2 - c_1
     – Solution exists if a divides c_2 - c_1

     Example
     Code
         do i = l,h
           A(2*i+2) = A(2*i-2)+1
         enddo
     Dependence?
       2*i_1 - 2*i_2 = -2 - 2 = -4 (yes, 2 divides -4)
     Kind of dependence?
     – Anti? i_2 + d = i_1 ⇒ d = -2
     – Flow? i_1 + d = i_2 ⇒ d = 2

     GCD Test
     Idea
     – Generalize test to linear functions of iterators
     Code
         do i = l_i,h_i
           do j = l_j,h_j
             A(a_1*i + a_2*j + a_0) = ... A(b_1*i + b_2*j + b_0) ...
           enddo
         enddo
     Again
     – a_1*i_1 - b_1*i_2 + a_2*j_1 - b_2*j_2 = b_0 - a_0
     – Solution exists if gcd(a_1,a_2,b_1,b_2) divides b_0 - a_0
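The divisibility condition of the GCD test is short enough to write out directly. This is a minimal sketch under stated assumptions: the name `gcd_test` and its parameters are hypothetical, subscripts are assumed to be linear in the loop indices, and loop bounds are ignored, as the test itself ignores them. A result of True means only that a dependence is possible.

```python
from functools import reduce
from math import gcd

def gcd_test(write_coeffs, read_coeffs, a0, b0):
    """Return True if an integer solution may exist (dependence possible).

    write_coeffs -- (a_1, a_2, ...) of the written subscript, constant a0
    read_coeffs  -- (b_1, b_2, ...) of the read subscript, constant b0
    (hypothetical helper for illustration; bounds are not checked)
    """
    coeffs = list(write_coeffs) + list(read_coeffs)
    g = reduce(gcd, (abs(c) for c in coeffs), 0)
    diff = b0 - a0
    if g == 0:
        # All coefficients are zero: both subscripts are constants.
        return diff == 0
    return diff % g == 0

# A(2*i+2) = A(2*i-2)+1 : gcd(2,2) = 2 divides -2 - 2 = -4
print(gcd_test([2], [2], 2, -2))   # True
# A(3*i+2) = A(2*i+1)+1 : gcd(3,2) = 1 divides everything
print(gcd_test([3], [2], 2, 1))    # True
# Assumed extra case A(2*i) = A(2*i+1): 2 does not divide 1
print(gcd_test([2], [2], 0, 1))    # False
```

The third call uses a loop that does not appear in the slides; it is included only to show a case where the test proves independence.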

  3. Example
     Code
         do i = l_i,h_i
           do j = l_j,h_j
             A(4*i + 2*j + 1) = ... A(6*i + 2*j + 4) ...
           enddo
         enddo
     gcd(4,-6,2,-2) = 2
     Does 2 divide 4-1? (no, 2 does not divide 3, so there is no dependence)

     Banerjee Test
         for (i=L; i<=U; i++) {
           x[a_0 + a_1*i] = ...
           ... = x[b_0 + b_1*i]
         }
     Does a_0 + a_1*i = b_0 + b_1*i' for some real i and i'?
     If so then (a_1*i - b_1*i') = (b_0 - a_0)
     Determine upper and lower bounds on (a_1*i - b_1*i')
         for (i=1; i<=5; i++) {
           x[i+5] = x[i];
         }
     upper bound = a_1*max(i) - b_1*min(i') = 4
     lower bound = a_1*min(i) - b_1*max(i') = -4
     b_0 - a_0 = 0 - 5 = -5, which lies outside [-4, 4], so there is no dependence

     Distance Vectors: Legality
     Definition
     – A dependence vector, v, is lexicographically nonnegative when the left-most nonzero entry in v is positive or all elements of v are zero
       Yes: (0,0,0), (0,1), (0,2,-2)
       No: (-1), (0,-2), (0,-1,1)
     – A dependence vector is legal when it is lexicographically nonnegative (assuming that indices increase as we iterate)
     Why are lexicographically negative distance vectors illegal?
     What are legal direction vectors?

     Direction Vector
     Definition
     – A direction vector serves the same purpose as a distance vector when less precision is required or available
     – Element i of a direction vector is <, >, or = based on whether the source of the dependence precedes, follows or is in the same iteration as the target in loop i
     Example
         do i = 1,6
           do j = 1,5
             A(i,j) = A(i-1,j-1)+1
           enddo
         enddo
     [Figure: iteration space, axes i and j]
     Direction vector: (<,<)
     Distance vector: (1,1)
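The bounds check of the Banerjee test for a single loop can be sketched as follows. The function name `banerjee_test` and its return shape are assumptions made for this illustration. Unlike the formulas on the slide, which are written for positive coefficients, the sketch takes the minimum and maximum of each term so that negative coefficients are also handled.

```python
def banerjee_test(a0, a1, b0, b1, lo, hi):
    """Single-loop Banerjee check for x[a0 + a1*i] = ... x[b0 + b1*i'].

    Returns (possible, (lower, upper, target)), where possible is True
    when b0 - a0 lies within the real bounds of a1*i - b1*i' for
    lo <= i, i' <= hi. (Hypothetical helper for illustration.)
    """
    a_terms = (a1 * lo, a1 * hi)   # extreme values of a1*i
    b_terms = (b1 * lo, b1 * hi)   # extreme values of b1*i'
    lower = min(a_terms) - max(b_terms)
    upper = max(a_terms) - min(b_terms)
    target = b0 - a0
    return lower <= target <= upper, (lower, upper, target)

# Slide example: for (i=1; i<=5; i++) { x[i+5] = x[i]; }
print(banerjee_test(5, 1, 0, 1, 1, 5))  # (False, (-4, 4, -5))
# Assumed variant x[i+2] = x[i]: the target -2 falls inside the bounds
print(banerjee_test(2, 1, 0, 1, 1, 5))  # (True, (-4, 4, -2))
```

A True result only says that a real solution exists within the bounds; an integer solution still has to be confirmed, for example by combining this check with the GCD test.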

  4. Loop-Carried Dependences
     Definition
     – A dependence D=(d_1,...,d_n) is carried at loop level i if d_i is the first nonzero element of D
     Example
         do i = 1,6
           do j = 1,6
             A(i,j) = B(i-1,j)+1
             B(i,j) = A(i,j-1)*2
           enddo
         enddo
     Distance vectors:
       (0,1) for accesses to A
       (1,0) for accesses to B
     Loop-carried dependences
     – The j loop carries dependence due to A
     – The i loop carries dependence due to B

     Parallelization
     Idea
     – Each iteration of a loop may be executed in parallel if it carries no dependences
     Example (different from last slide)
         do i = 1,6
           do j = 1,5
             A(i,j) = B(i-1,j-1)+1
             B(i,j) = A(i,j-1)*2
           enddo
         enddo
     [Figure: iteration space, axes i and j]
     Parallelize i loop?
     Distance Vectors:
       (0,1) for A (flow)
       (1,1) for B (flow)

     Scalar Expansion: Motivation
     Problem
     – Loop-carried dependences inhibit parallelism
     – Scalar references result in loop-carried dependences
     Example
         do i = 1,6
           t = A(i) + B(i)
           C(i) = t + 1/t
         enddo
     Can this loop be parallelized? No.
     What kind of dependences are these? Anti dependences.
     Convention for these slides: Arrays start with upper case letters, scalars do not

     Scalar Expansion
     Idea
     – Eliminate false dependences by introducing extra storage
     Example
         do i = 1,6
           T(i) = A(i) + B(i)
           C(i) = T(i) + 1/T(i)
         enddo
     Can this loop be parallelized?
     Disadvantages?
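The definitions of "carried at loop level i" and of the parallelization rule translate directly into a few lines of code. The sketch below is illustrative only: the helper names are assumptions, loop levels are numbered from 1 (outermost), and the legality check uses the lexicographic rule from the distance-vector slides.

```python
def is_lex_nonnegative(v):
    """True if the first nonzero entry of v is positive, or v is all zeros."""
    for d in v:
        if d > 0:
            return True
        if d < 0:
            return False
    return True

def carried_level(v):
    """1-based loop level that carries distance vector v, or None if v is
    all zeros (a loop-independent dependence)."""
    if not is_lex_nonnegative(v):
        raise ValueError(f"illegal distance vector: {v}")
    for level, d in enumerate(v, start=1):
        if d != 0:
            return level
    return None

def parallel_loops(distance_vectors, depth):
    """Loop levels that carry no dependence and may therefore run in parallel."""
    carried = {carried_level(v) for v in distance_vectors}
    return [lvl for lvl in range(1, depth + 1) if lvl not in carried]

# First example: (0,1) for A, (1,0) for B -> both loops carry a dependence
print(parallel_loops([(0, 1), (1, 0)], 2))  # []
# Second example: (0,1) for A, (1,1) for B -> i carries B, j carries A
print(parallel_loops([(0, 1), (1, 1)], 2))  # []
# A(i,j) = A(i-1,j-1)+1 alone: (1,1) is carried by i, so j is parallel
print(parallel_loops([(1, 1)], 2))          # [2]
```

In the second example the answer to "Parallelize i loop?" is therefore no: the (1,1) dependence on B is carried at level 1.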

  5. Scalar Expansion Details
     Restrictions
     – The loop must be a countable loop, i.e., the loop trip count must be independent of the body of the loop
     – The expanded scalar must have no upward exposed uses in the loop
         do i = 1,6
           print(t)
           t = A(i) + B(i)
           C(i) = t + 1/t
         enddo
     − Nested loops may require much more storage
     − When the scalar is live after the loop, we must move the correct array value into the scalar

     Example 2: Parallelization (reprise)
     Why can't this loop be parallelized?
         do i = 1,100
           A(i) = A(i-1)+1
         enddo
     [Figure: iteration space along i]
     Distance Vector: (1)
     Why can this loop be parallelized?
         do i = 1,100
           A(i) = A(i)+1
         enddo
     [Figure: iteration space along i]
     Distance Vector: (0)

     Example 1: Loop Permutation (reprise)
     Sample code
         do j = 1,6
           do i = 1,5
             A(j,i) = A(j,i)+1
           enddo
         enddo
     after permutation:
         do i = 1,5
           do j = 1,6
             A(j,i) = A(j,i)+1
           enddo
         enddo
     Why is this legal?
     – No loop-carried dependences, so we can arbitrarily change order of iteration execution

     Concepts
     Improve performance by ...
     – improving data locality
     – parallelizing the computation
     Data Dependence Testing
     – general formulation of the problem
     – GCD test and Banerjee test
     Data Dependences
     – iteration space
     – distance vectors and direction vectors
     – loop carried
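Scalar expansion and its live-out restriction can be illustrated with a small simulation. This is a sketch under stated assumptions rather than a compiler transformation: the function names are hypothetical, Python lists stand in for the arrays A, B, C and T, indices are 0-based, and a shuffled iteration order stands in for parallel execution.

```python
import random

def original_loop(A, B):
    """Sequential loop with the scalar t (reused in every iteration)."""
    C = [0.0] * len(A)
    t = None
    for i in range(len(A)):
        t = A[i] + B[i]
        C[i] = t + 1 / t
    return C, t

def expanded_loop(A, B, order):
    """Same loop after scalar expansion: t becomes T[i], so iterations
    may run in any order. `order` is an assumed stand-in for a parallel
    schedule."""
    n = len(A)
    T = [0.0] * n
    C = [0.0] * n
    for i in order:
        T[i] = A[i] + B[i]
        C[i] = T[i] + 1 / T[i]
    # t is live after the loop: copy the value of the last iteration back.
    t = T[-1] if n else None
    return C, t

A = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
B = [2.0] * 6
order = list(range(6))
random.Random(0).shuffle(order)   # arbitrary iteration order

print(original_loop(A, B) == expanded_loop(A, B, order))  # True
```

The extra array T is the storage cost the slides point to, and the final copy into t corresponds to the note about scalars that are live after the loop.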

  6. Next Time
     Lecture
     – Loop transformations for parallelism and locality
     Suggested Exercises
     – 11.3.2, 11.3.3, 11.6.2, 11.6.5, examples in slides
