
Loop Transformations for Parallelism & Locality



  1. Loop Transformations for Parallelism & Locality

     Previously
       - Loop transformations, unimodular transformation framework
       - Loop interchange/permutation
       - Loop reversal
       - Checking transformation legality
     Today
       - Loop transformations and transformation frameworks
       - Loop skewing
       - Using Fourier-Motzkin Elimination for code generation

     CS553 Lecture: Loop Transformations

     Why Transformation Frameworks?

     Currently
       - Frameworks are used in the compiler to ...
           - abstract loops, memory accesses, and data dependences in a loop
           - specify the effect of a sequence of loop transformations on the loop, its memory accesses, and its data dependences
           - generate code from the transformed loop
       - Loop transformations affect the schedule of the loop
     Future
       - How can framework technology be exposed in the programming model?
     Frameworks
       - Unimodular
       - Polyhedral
       - Presburger
       - Sparse Polyhedral

  2. Frameworks for Loop Transformations

     Unimodular Loop Transformations [Banerjee 90], [Wolf & Lam 91]
       - can represent loop permutation, loop reversal, and loop skewing
       - unimodular linear mapping (the determinant of the matrix is +1 or -1)
       - T i = i', where T is a matrix and i and i' are iteration vectors
       - a transformation is legal if the transformed dependence vectors remain lexicographically positive
       - limitations
           - only perfectly nested loops
           - all statements are transformed the same

     Loop Skewing

     Original code
         do i = 1,6
           do j = 1,5
             A(i,j) = A(i-1,j+1)+1
           enddo
         enddo

     Distance vector: (1, -1)
     Can we permute the original loop?
     Skewing: $\begin{pmatrix} i' \\ j' \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} i \\ j \end{pmatrix}$, that is, i' = i and j' = i + j

     [Figure: original iteration space (axes i, j) and skewed iteration space (axes i', j')]
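The legality test above can be tried on the slide's distance vector. The following is a minimal sketch in Python (used here instead of the slides' Fortran-style pseudocode so it can be run directly); the helper names `det2`, `apply`, and `lex_positive` are illustrative and do not come from the lecture.

```python
# Sketch: check unimodularity and legality for the loop A(i,j) = A(i-1,j+1)+1.
# All helper names are hypothetical, chosen for this example only.

def det2(T):
    """Determinant of a 2x2 integer matrix."""
    return T[0][0] * T[1][1] - T[0][1] * T[1][0]

def apply(T, v):
    """Matrix-vector product T v, returned as a tuple."""
    return tuple(sum(T[r][c] * v[c] for c in range(len(v))) for r in range(len(T)))

def lex_positive(v):
    """True if the first nonzero entry of v is positive."""
    for x in v:
        if x != 0:
            return x > 0
    return False  # the zero vector is not lexicographically positive

d = (1, -1)                  # distance vector from the slide
permute = [[0, 1], [1, 0]]   # loop interchange
skew = [[1, 0], [1, 1]]      # i' = i, j' = i + j

for name, T in [("permute", permute), ("skew", skew)]:
    assert abs(det2(T)) == 1             # both matrices are unimodular
    td = apply(T, d)
    print(name, td, "legal" if lex_positive(td) else "illegal")
# prints:
# permute (-1, 1) illegal
# skew (1, 0) legal
```

Under these assumptions, plain interchange turns the distance vector into (-1, 1), which is lexicographically negative, while skewing yields (1, 0), which stays positive.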

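  3. Transforming the Dependences and Array Accesses

     Original code
         do i = 1,6
           do j = 1,5
             A(i,j) = A(i-1,j+1)+1
           enddo
         enddo

     Dependence vector: $T\,d = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} 1 \\ -1 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$
     New array accesses: substitute i = i' and j = j' - i', giving A(i',j'-i') = A(i'-1,j'-i'+1)+1

     [Figure: skewed iteration space (axes i', j')]

     Transforming the Loop Bounds

     Original code
         do i = 1,6
           do j = 1,5
             A(i,j) = A(i-1,j+1)+1
           enddo
         enddo

     Bounds: 1 <= i <= 6 and 1 <= j <= 5 become 1 <= i' <= 6 and 1 <= j'-i' <= 5

     Transformed code
         do i' = 1,6
           do j' = 1+i',5+i'
             A(i',j'-i') = A(i'-1,j'-i'+1)+1
           enddo
         enddo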
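One way to gain confidence in the transformed bounds and accesses is to run both loop nests on the same data and compare. The sketch below does this in Python; the array size, the initial values, and the function names are assumptions made for the example, not part of the slides.

```python
# Sketch: the skewed nest should compute the same array as the original nest.

def fresh_array():
    # Rows and columns 0..6 cover every index touched by A(i-1, j+1).
    return [[10 * r + c for c in range(7)] for r in range(7)]

def original(A):
    for i in range(1, 7):                    # do i = 1,6
        for j in range(1, 6):                # do j = 1,5
            A[i][j] = A[i - 1][j + 1] + 1
    return A

def skewed(A):
    for ip in range(1, 7):                   # do i' = 1,6
        for jp in range(1 + ip, 5 + ip + 1): # do j' = 1+i',5+i'
            A[ip][jp - ip] = A[ip - 1][jp - ip + 1] + 1
    return A

assert original(fresh_array()) == skewed(fresh_array())
```

Skewing alone does not change the execution order of this nest; it only relabels the iterations, which is why the results match trivially. The payoff comes when the skew is combined with a permutation.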

  4. Code Generation

     Goals
       - express outermost loop bounds in terms of symbolic constants and constants
       - express inner loop bounds in terms of any enclosing loop variables, symbolic constants, and constants
     Approach
       - project out inner loop iteration variables to determine loop bounds for outer loops
       - Fourier-Motzkin elimination is the algorithm that projects a variable out of a polyhedron

     Fourier-Motzkin Elimination: The Idea

     Polyhedron
       - convex intersection of a set of inequalities
       - model for iteration spaces
     Problem
       - given a polyhedron, how do we generate loop bounds that scan all of its points?
       - example: two possible loop orders
           - (i, j)
           - (j, i)

     [Figure: triangular iteration space (axes i, j) bounded by 1 <= i, i <= j, and j <= 5]
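To make "projecting out a variable" concrete, the sketch below enumerates the integer points of the triangular example by brute force and takes their shadow on each axis. This is only an illustration under the assumption that a small bounding box is known in advance; Fourier-Motzkin elimination obtains the same outer bounds symbolically, without enumerating points.

```python
# Sketch: brute-force projection of the triangle 1 <= i, i <= j, j <= 5.

def in_polyhedron(i, j):
    return 1 <= i and i <= j and j <= 5

# The search box [-10, 10] x [-10, 10] is an assumption for this example.
points = [(i, j)
          for i in range(-10, 11)
          for j in range(-10, 11)
          if in_polyhedron(i, j)]

i_shadow = sorted({i for i, _ in points})   # projection onto i (j projected out)
j_shadow = sorted({j for _, j in points})   # projection onto j (i projected out)

print(len(points))   # 15
print(i_shadow)      # [1, 2, 3, 4, 5]
print(j_shadow)      # [1, 2, 3, 4, 5]
```

The shadow on i gives the outer bounds for the (i, j) loop order, and the shadow on j gives the outer bounds for the (j, i) order.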

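  5. Fourier-Motzkin Elimination: The Algorithm

     FM(P, i_k) => P'
       Input: a polyhedron P (a set of linear inequalities) and a variable i_k to eliminate
       Output: P', the projection of P onto the remaining variables
       Algorithm:
         start P' with the constraints of P that do not involve i_k
         for each lower bound L of i_k
           for each upper bound U of i_k
             add the constraint L <= U to P'

     Distinguishing Upper and Lower Bounds

     Simple Algorithm
       - given that the polyhedron is represented as a system of inequalities of the form $A\,\vec{i} + \vec{b} \ge \vec{0}$:
       - any constraint with a positive coefficient for i_k is a lower bound
       - any constraint with a negative coefficient for i_k is an upper bound

     [Figure: triangular iteration space (axes i, j) bounded by 1 <= i, i <= j, and j <= 5]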
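The elimination step can be sketched in a few lines of Python. The representation is an assumption made for this example: each constraint is a pair `(coeffs, const)` meaning `coeffs . x + const >= 0`, and `fm_eliminate` is a hypothetical name. The sketch does no redundancy removal, and for integer points the projection is exact only in simple cases such as unit coefficients.

```python
# Sketch: one Fourier-Motzkin elimination step over constraints of the form
# coeffs . x + const >= 0, with variables ordered (i, j).

def fm_eliminate(constraints, k):
    """Project variable k out of the constraint system."""
    lower = [c for c in constraints if c[0][k] > 0]    # positive coefficient: lower bound
    upper = [c for c in constraints if c[0][k] < 0]    # negative coefficient: upper bound
    result = [c for c in constraints if c[0][k] == 0]  # constraints without x_k carry over
    for lc, l0 in lower:
        for uc, u0 in upper:
            a, b = lc[k], -uc[k]                       # both are positive
            # b * lower + a * upper cancels the x_k term.
            coeffs = tuple(b * l + a * u for l, u in zip(lc, uc))
            result.append((coeffs, b * l0 + a * u0))
    return result

triangle = [
    ((1, 0), -1),    # i - 1 >= 0   (1 <= i)
    ((-1, 1), 0),    # j - i >= 0   (i <= j)
    ((0, -1), 5),    # 5 - j >= 0   (j <= 5)
]

print(fm_eliminate(triangle, 1))  # [((1, 0), -1), ((-1, 0), 5)]  i.e. 1 <= i <= 5
print(fm_eliminate(triangle, 0))  # [((0, -1), 5), ((0, 1), -1)]  i.e. 1 <= j <= 5
```

Eliminating j leaves the bounds for an outer i loop, and eliminating i leaves the bounds for an outer j loop, matching the two loop orders of the triangular example.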

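  6. Triangular Iteration Space Example

     Target iteration space: 1 <= i, i <= j, j <= 5
       - (i, j) loop order: 1 <= i <= 5, then i <= j <= 5
       - (j, i) loop order: 1 <= j <= 5, then 1 <= i <= j

     [Figure: triangular iteration space (axes i, j)]

     General Algorithm for Generating Loop Bounds

       Input: a polyhedron P over (i_1, ..., i_d), where the i vector is the desired loop order
       Output: lower and upper bounds for each loop variable i_k
       Algorithm:
         for k = d to 1 by -1
           read the lower and upper bounds of i_k from the current constraints
           P = FM(P, i_k)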
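As a quick check of the two loop orders for the triangular space, the sketch below writes each nest by hand and compares the visited points against a brute-force reference. The function names and the reference search box are assumptions for this example.

```python
# Sketch: both loop orders should scan exactly the points of 1 <= i <= j <= 5.

def scan_i_then_j():
    # (i, j) order: outer 1 <= i <= 5, inner i <= j <= 5
    return [(i, j) for i in range(1, 6) for j in range(i, 6)]

def scan_j_then_i():
    # (j, i) order: outer 1 <= j <= 5, inner 1 <= i <= j
    return [(i, j) for j in range(1, 6) for i in range(1, j + 1)]

reference = {(i, j) for i in range(-3, 9) for j in range(-3, 9) if 1 <= i <= j <= 5}

assert set(scan_i_then_j()) == reference
assert set(scan_j_then_i()) == reference
assert len(scan_i_then_j()) == len(scan_j_then_i()) == len(reference)  # no point visited twice

print(scan_i_then_j()[:3])   # [(1, 1), (1, 2), (1, 3)]
print(scan_j_then_i()[:3])   # [(1, 1), (1, 2), (2, 2)]
```

Both nests visit the same 15 points; only the order of the visits differs, which is what a change of schedule means here.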

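  7. Loop Skewing and Permutation

     Original code
         do i = 1,6
           do j = 1,5
             A(i,j) = A(i-1,j+1)+1
           enddo
         enddo

     Distance vector: (1, -1)
     Skewing followed by permutation: $T = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}$, that is, i' = i + j and j' = i

     [Figure: original iteration space (axes i, j) and transformed iteration space (axes i', j')]

     Transforming the Dependences and Array Accesses

     Original code
         do i = 1,6
           do j = 1,5
             A(i,j) = A(i-1,j+1)+1
           enddo
         enddo

     Dependence vector: $T\,d = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 1 \\ -1 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$
     New array accesses: substitute i = j' and j = i' - j', giving A(j',i'-j') = A(j'-1,i'-j'+1)+1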
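Combining transformations amounts to multiplying their matrices, and the new array subscripts come from the inverse matrix. The following Python sketch works this out for the skew-then-permute example; `matmul` and `inverse2` are illustrative helpers, not names from the lecture.

```python
# Sketch: compose skewing with permutation and derive the subscript mapping.

def matmul(A, B):
    """Product of two small integer matrices given as nested lists."""
    return [[sum(A[r][k] * B[k][c] for k in range(len(B)))
             for c in range(len(B[0]))]
            for r in range(len(A))]

def inverse2(T):
    """Inverse of a 2x2 unimodular integer matrix (determinant +1 or -1)."""
    det = T[0][0] * T[1][1] - T[0][1] * T[1][0]
    assert det in (1, -1)
    # For det = +1 or -1, dividing the adjugate by det equals multiplying by det.
    return [[T[1][1] * det, -T[0][1] * det],
            [-T[1][0] * det, T[0][0] * det]]

skew = [[1, 0], [1, 1]]
permute = [[0, 1], [1, 0]]

T = matmul(permute, skew)       # skew is applied first, then the permutation
T_inv = inverse2(T)

print(T)                        # [[1, 1], [1, 0]]   i' = i + j, j' = i
print(T_inv)                    # [[0, 1], [1, -1]]  i = j',  j = i' - j'
print(matmul(T, [[1], [-1]]))   # [[0], [1]]         transformed distance vector
```

The rows of the inverse give the substitutions i = j' and j = i' - j' used to rewrite A(i,j) as A(j',i'-j'), and the transformed distance vector (0, 1) is lexicographically positive, so the combined transformation is legal.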

  8. Transforming the Loop Bounds

     Original code
         do i = 1,6
           do j = 1,5
             A(i,j) = A(i-1,j+1)+1
           enddo
         enddo

     Bounds: 1 <= i <= 6 and 1 <= j <= 5 become 1 <= j' <= 6 and 1 <= i'-j' <= 5

     Transformed code (use the general loop bound algorithm)
         do i' = 2,11
           do j' = max(i'-5,1), min(6,i'-1)
             A(j',i'-j') = A(j'-1,i'-j'+1)+1
           enddo
         enddo

     Wavefront Parallelism Example

     Example
         do i = 1,6
           do j = 1,min(5,7-i)
             A(i,j) = A(i-1,j-1) + A(i,j-1)
           enddo
         enddo

     [Figure: iteration space (axes i, j)]

     Goal
       - Determine a unimodular transformation that makes the inner loop fully parallel, so that it can be marked as such (with an OpenMP directive, for example).

         do i' = 1,5
           do j' = 1, 7-i'   (parallel)
             A(j',i') = A(j'-1,i'-1) + A(j',i'-1)
           enddo
         enddo
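The claim that the inner loop of the wavefront example is fully parallel can be probed by running that loop in different orders and comparing against the original nest. The sketch below does this sequentially in Python; the array contents, the function names, and the use of a reversed inner loop as a stand-in for parallel execution are all assumptions for this example.

```python
# Sketch: after interchange (i' = j, j' = i) the inner loop order should not matter.

def fresh_array():
    # Rows and columns 0..6 cover every index read or written below.
    return [[3 * r + c for c in range(7)] for r in range(7)]

def original(A):
    for i in range(1, 7):                              # do i = 1,6
        for j in range(1, min(5, 7 - i) + 1):          # do j = 1,min(5,7-i)
            A[i][j] = A[i - 1][j - 1] + A[i][j - 1]
    return A

def interchanged(A, inner_order):
    for ip in range(1, 6):                             # do i' = 1,5
        for jp in inner_order(range(1, 7 - ip + 1)):   # do j' = 1,7-i' (parallel)
            A[jp][ip] = A[jp - 1][ip - 1] + A[jp][ip - 1]
    return A

# Distance vectors (1,1) and (0,1) become (1,1) and (1,0) after interchange:
# both are carried by the outer loop, so the inner loop carries no dependence.
for di, dj in [(1, 1), (0, 1)]:
    assert (dj, di)[0] > 0

expected = original(fresh_array())
assert interchanged(fresh_array(), list) == expected      # forward inner loop
assert interchanged(fresh_array(), reversed) == expected  # reversed inner loop
```

A reversed inner loop is only one alternative ordering, so this is a sanity check rather than a proof; the dependence-vector argument in the comment is what justifies the parallel annotation.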

  9. Concepts

     Unimodular transformation framework
       - represents loop permutation, loop reversal, and loop skewing
       - provides a mathematical framework for ...
           - testing transformation legality,
           - transforming array accesses and loop bounds,
           - and combining transformations
     Fourier-Motzkin Elimination
       - algorithm
       - using it for code generation
     Loop bounds
       - how to determine upper and lower bounds for a variable when bounds are in matrix format
     Examples
       - triangular matrix, skew and permute example, and wavefront example

     Next Time

     Lecture
       - More loop transformations
       - Another transformation framework
