Uninterpreted Functions: Their use in Code Transformation



SLIDE 1

Uninterpreted Functions: Their use in Code Transformation

Catherine Olschanowsky Mary Hall Michelle Strout

SLIDE 2

CHiLL-I/E Team, Collaborators, and Funding

  • Mary Hall (University of Utah)
  • Michelle Strout (University of Arizona)
  • Catherine Olschanowsky (Boise State University)
  • Mahdi Soltan Mohammadi (University of Arizona)
  • Payal Nandi (University of Utah)
  • Eddie Davis (Boise State University)
  • Wei He (University of Arizona)
  • Jongsoo Park, Hongbo Rong, Raj Barik (Intel)
  • Anand Venkat (Intel; PhD in 2016 at Utah)

This work was supported in part by NSF grant CCF-1563732.

SLIDE 3

I/E Transformations

[Diagram: I/E transformation pipeline, split into compile time and runtime. Labels at compile time: programmer-defined CHiLL transformation script, inspector/executor API, CHiLL compiler, CUDA-CHiLL, Sparse Polyhedral Framework, compositions of loop and data transformations. Labels at runtime: irregular computation, index arrays, Inspector 1 (e.g. index set splitting), Inspector 2 (e.g. compact-and-pad), Inspector K, composed inspector, explicit functions, executor (transformed irregular computation).]

  • Inspectors: Traverse index arrays at runtime, collecting information and generating new index arrays.
  • Executors: Execute the original computation using the information and/or index arrays produced.
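The inspector/executor split described above can be sketched in C. This is a minimal illustration, not CHiLL output: it assumes a matrix in CSR form (arrays rowptr, col, val, as in the SpMV slides later in the deck), and the function names, the threshold parameter, and the short/long row split are hypothetical, loosely modeled on index set splitting.

```c
#include <assert.h>
#include <stddef.h>

/* Inspector (hypothetical): walks the rowptr index array once at runtime and
 * generates two new index arrays, one listing rows with fewer than
 * `threshold` nonzeros and one listing the remaining rows.
 * Returns the number of short rows; the number of long rows is n minus that. */
static size_t inspect_split_rows(size_t n, const int *rowptr, int threshold,
                                 int *short_rows, int *long_rows)
{
    size_t n_short = 0, n_long = 0;
    for (size_t i = 0; i < n; i++) {
        int nnz_in_row = rowptr[i + 1] - rowptr[i];
        assert(nnz_in_row >= 0);
        if (nnz_in_row < threshold)
            short_rows[n_short++] = (int)i;
        else
            long_rows[n_long++] = (int)i;
    }
    return n_short;
}

/* Executor: the original SpMV loop body, run over a row list that the
 * inspector produced instead of over 0..n-1. */
static void execute_rows(size_t count, const int *rows, const int *rowptr,
                         const int *col, const double *val,
                         const double *x, double *y)
{
    for (size_t r = 0; r < count; r++) {
        int i = rows[r];
        for (int k = rowptr[i]; k < rowptr[i + 1]; k++)
            y[i] += val[k] * x[col[k]];
    }
}

/* Small made-up example: the 3x3 matrix [[1,0,2],[0,3,0],[4,5,6]]. */
static int split_demo_ok(void)
{
    const int rowptr[] = {0, 2, 3, 6};
    const int col[] = {0, 2, 1, 0, 1, 2};
    const double val[] = {1, 2, 3, 4, 5, 6};
    const double x[] = {1, 1, 1};
    double y[3] = {0, 0, 0};
    int short_rows[3], long_rows[3];

    size_t n_short = inspect_split_rows(3, rowptr, 3, short_rows, long_rows);
    execute_rows(n_short, short_rows, rowptr, col, val, x, y);
    execute_rows(3 - n_short, long_rows, rowptr, col, val, x, y);

    /* Rows 0 and 1 are short, row 2 is long; y holds the row sums. */
    return n_short == 2 && y[0] == 3.0 && y[1] == 3.0 && y[2] == 15.0;
}
```

In a real setting the two row lists would typically be handed to differently optimized executors; here both use the same loop so the result can be checked against plain SpMV.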

SLIDE 4

Effective at Improving Performance

  • Wavefront parallelism [Mirchandaney 88] [Rauchwerger 98] [Zhuang 09] [Park 14]
  • Distributed memory parallelism [Saltz 91] [Basumallik 09] [Ravishankar 12]
  • Automatic dense-to-sparse data transformation [Mateev 00] [Pugh 98] [Arnold 10]
  • Data and iteration reordering of parallel and reduction loops for improved data locality [Ding 99] [Mitchell 99] [Mellor-Crummey 01] [Han 06]
  • Sparse tiling for aggregating across loops [Douglas 00] (Strout 01) [Mohiyuddin 09] (Krieger 13)

SLIDE 5

Fast, Compiler Generated I/Es Require a Common Abstraction

Common Among

  • The Inspector
  • The Executor
  • The Loop Transformation Framework

  • Uninterpreted functions at compile time
  • Explicit functions at runtime

Performance vs. Generality

  • Faster: specifically optimized I/E
  • Slower: compiler-generated I/E

SLIDE 6

Transformation Framework for Sparse Codes

Problem: need to modify current approaches to ...

  • Express inspector-executor transformations (Cathie)
  • Perform data dependence analysis (Michelle)
  • Express sparse data transformations (Mary)

Approach: Uninterpreted functions to represent

  • Non-affine loop bounds
  • Memory accesses
  • Run-time reordering functions
  • Run-time groupings
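To make "run-time reordering function" concrete, here is a hedged C sketch. At compile time a reordering such as sigma can stay uninterpreted (the compiler only knows that an access x[col[k]] becomes x_new[sigma(col[k])]); at runtime an inspector fills in sigma as an explicit array. The first-touch ordering used here, and all function and variable names, are assumptions chosen for illustration rather than the deck's own code.

```c
#include <assert.h>
#include <stddef.h>

/* Inspector (hypothetical): builds sigma so that sigma[j] is the new location
 * of x[j]. Elements are numbered in the order the index array `col` first
 * touches them; untouched elements are placed after the touched ones. */
static void inspect_first_touch(size_t nnz, const int *col, size_t n, int *sigma)
{
    size_t next = 0;
    for (size_t j = 0; j < n; j++)
        sigma[j] = -1;                      /* -1 marks "not placed yet" */
    for (size_t k = 0; k < nnz; k++) {
        int j = col[k];
        assert(j >= 0 && (size_t)j < n);
        if (sigma[j] < 0)
            sigma[j] = (int)next++;
    }
    for (size_t j = 0; j < n; j++)
        if (sigma[j] < 0)
            sigma[j] = (int)next++;
}

/* Applies the reordering to the data and rewrites the index array so that
 * x_new[col_new[k]] reads the same value as x[col[k]]. */
static void apply_reordering(size_t nnz, const int *col, size_t n,
                             const int *sigma, const double *x,
                             int *col_new, double *x_new)
{
    for (size_t j = 0; j < n; j++)
        x_new[sigma[j]] = x[j];
    for (size_t k = 0; k < nnz; k++)
        col_new[k] = sigma[col[k]];
}

/* Small made-up example with four data elements and four accesses. */
static int reorder_demo_ok(void)
{
    const int col[] = {2, 0, 2, 3};
    const double x[] = {10, 20, 30, 40};
    int sigma[4], col_new[4];
    double x_new[4];

    inspect_first_touch(4, col, 4, sigma);
    apply_reordering(4, col, 4, sigma, x, col_new, x_new);

    /* Every access must still read the value it read before reordering. */
    for (size_t k = 0; k < 4; k++)
        if (x_new[col_new[k]] != x[col[k]])
            return 0;
    /* First touches are x[2], x[0], x[3]; x[1] is never touched. */
    return sigma[2] == 0 && sigma[0] == 1 && sigma[3] == 2 && sigma[1] == 3;
}
```

The only property the executor relies on is that sigma is a permutation, which is the kind of fact a compile-time framework can attach to an uninterpreted function without knowing its values.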

SLIDE 7

Sparse Polyhedral Framework (SPF)

  • Loop transformation framework built on the polyhedral model
  • Uses uninterpreted functions to represent index arrays
  • Enables the composition of inspector-executor transformations
  • Exposes opportunities for the compiler to simplify indirect array accesses

SLIDE 8

SPF: Uninterpreted Functions Represent Index Arrays

Iteration space (CSR), where rowptr is an uninterpreted function:

I = {[i, k] | 0 ≤ i < n ∧ rowptr(i) ≤ k < rowptr(i + 1)}

[Figure: y = A*x for a sparse matrix A, with A stored in CSR form as the arrays val, rowptr, and col.]

    // Dense matrix-vector multiplication
    for (i = 0; i < N; i++) {
      for (j = 0; j < N; j++) {
        y[i] += A[i][j] * x[j];
      }
    }

    // Sparse matrix-vector multiplication (SpMV)
    for (i = 0; i < n; i++) {
      for (k = rowptr[i]; k < rowptr[i+1]; k++) {
        y[i] += val[k] * x[col[k]];
      }
    }
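A runnable version of the two loops above may help show what the CSR arrays encode. The sketch below uses a made-up 3x3 matrix (not the matrix from the slide's figure) and hypothetical function names; it checks that the dense and CSR products agree.

```c
#include <assert.h>
#include <stddef.h>

#define DIM 3

/* Dense matrix-vector product: visits every (i, j) pair, zeros included. */
static void dense_mv(const double A[DIM][DIM], const double *x, double *y)
{
    for (size_t i = 0; i < DIM; i++)
        for (size_t j = 0; j < DIM; j++)
            y[i] += A[i][j] * x[j];
}

/* CSR SpMV: row i owns the nonzeros val[rowptr[i]] .. val[rowptr[i+1]-1],
 * and col[k] gives the column of nonzero k. The inner loop bounds are the
 * non-affine part that SPF models with the uninterpreted function rowptr. */
static void csr_spmv(size_t n, const int *rowptr, const int *col,
                     const double *val, const double *x, double *y)
{
    for (size_t i = 0; i < n; i++) {
        assert(rowptr[i] <= rowptr[i + 1]);
        for (int k = rowptr[i]; k < rowptr[i + 1]; k++)
            y[i] += val[k] * x[col[k]];
    }
}

/* Made-up example: A = [[4,0,7],[0,9,0],[3,0,1]], x = [1,2,3]. */
static int spmv_demo_ok(void)
{
    const double A[DIM][DIM] = {{4, 0, 7}, {0, 9, 0}, {3, 0, 1}};
    const int rowptr[] = {0, 2, 3, 5};
    const int col[] = {0, 2, 1, 0, 2};
    const double val[] = {4, 7, 9, 3, 1};
    const double x[] = {1, 2, 3};
    double y_dense[DIM] = {0}, y_csr[DIM] = {0};

    dense_mv(A, x, y_dense);
    csr_spmv(DIM, rowptr, col, val, x, y_csr);

    for (size_t i = 0; i < DIM; i++)
        if (y_dense[i] != y_csr[i])
            return 0;
    return y_csr[0] == 25.0 && y_csr[1] == 18.0 && y_csr[2] == 6.0;
}
```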

SLIDE 9

SPF: Representing Inspector-Executor Transformations with Uninterpreted Functions

    // SpMV for CSR (Compressed Sparse Row)
    for (i = 0; i < n; i++) {
      for (k = rowptr[i]; k < rowptr[i+1]; k++) {
        y[i] += a[k] * x[col[k]];
      }
    }

    // Inspector code
    NNZ = count( rowptr )
    c = order( rowptr )
    c_inv = inverse( c )

    // Executor code
    // SpMV for COO (Coordinate Storage)
    for (k' = 0; k' < NNZ; k'++) {
      y[c_inv[k'][0]] += a[c_inv[k'][1]] * x[col[c_inv[k'][1]]];
    }

Coalesce Transformation

T = {[i, k] → [k'] | k' = c(i, k) ∧ 0 ≤ k' < NNZ}
NNZ = count(I)
c = order(I)

Old Iterators as Function of New Iterator

i = c⁻¹(k')[0]
k = c⁻¹(k')[1]
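The coalesce inspector and executor above are pseudocode; the following C sketch shows one way they could be realized. It is an assumption-laden illustration: count, order, and inverse are folded into a single hypothetical inspector that enumerates the CSR iteration space in its original order and records, for each new iterator value, the old (i, k) pair in c_inv. The names and the example matrix are invented for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Reference CSR SpMV, used only to check the transformed code. */
static void csr_spmv_ref(size_t n, const int *rowptr, const int *col,
                         const double *a, const double *x, double *y)
{
    for (size_t i = 0; i < n; i++)
        for (int k = rowptr[i]; k < rowptr[i + 1]; k++)
            y[i] += a[k] * x[col[k]];
}

/* Inspector (hypothetical): walks the iteration space {[i, k]} and assigns
 * each point the next value of the coalesced iterator. c_inv[kp][0] holds the
 * old i and c_inv[kp][1] the old k. Returns NNZ, the number of points. */
static size_t inspect_coalesce(size_t n, const int *rowptr, int c_inv[][2])
{
    size_t kp = 0;
    for (size_t i = 0; i < n; i++) {
        assert(rowptr[i] <= rowptr[i + 1]);
        for (int k = rowptr[i]; k < rowptr[i + 1]; k++) {
            c_inv[kp][0] = (int)i;
            c_inv[kp][1] = k;
            kp++;
        }
    }
    return kp;
}

/* Executor: a single loop over the coalesced iterator, with the old
 * iterators recovered through c_inv as on the slide. */
static void execute_coalesced(size_t nnz, int c_inv[][2], const int *col,
                              const double *a, const double *x, double *y)
{
    for (size_t kp = 0; kp < nnz; kp++)
        y[c_inv[kp][0]] += a[c_inv[kp][1]] * x[col[c_inv[kp][1]]];
}

/* Made-up example: A = [[4,0,7],[0,9,0],[3,0,1]], x = [1,2,3]. */
static int coalesce_demo_ok(void)
{
    const int rowptr[] = {0, 2, 3, 5};
    const int col[] = {0, 2, 1, 0, 2};
    const double a[] = {4, 7, 9, 3, 1};
    const double x[] = {1, 2, 3};
    double y_ref[3] = {0}, y_new[3] = {0};
    int c_inv[5][2];

    csr_spmv_ref(3, rowptr, col, a, x, y_ref);
    size_t nnz = inspect_coalesce(3, rowptr, c_inv);
    execute_coalesced(nnz, c_inv, col, a, x, y_new);

    for (size_t i = 0; i < 3; i++)
        if (y_ref[i] != y_new[i])
            return 0;
    return nnz == 5 && y_new[0] == 25.0 && y_new[1] == 18.0 && y_new[2] == 6.0;
}
```

Because this inspector preserves the original iteration order, c_inv[kp][1] equals kp here; a framework that sees this can simplify a[c_inv[k'][1]] to a[k'], which is the kind of indirect-access simplification SPF aims to expose.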

SLIDE 10

UFs within a framework

  • Omega+
  • IEGenLib

Stop by Eddie’s poster to learn about his proposed IR