A Unified Framework for Schedule and Storage Optimization



slide-1
SLIDE 1

A Unified Framework for Schedule and Storage Optimization

William Thies, Frédéric Vivien*, Jeffrey Sheldon, and Saman Amarasinghe

MIT Laboratory for Computer Science * ICPS/LSIIT, Université Louis Pasteur http://compiler.lcs.mit.edu/aov

slide-2
SLIDE 2

Motivating Example

for i = 1 to n
  for j = 1 to n
    A[j] = f(A[j], A[j-2])

[Figure: array A along axis j]

slide-3
SLIDE 3

Motivating Example

[Figure: array A along axis j] t = 1, (i, j) = (1, 1)

for i = 1 to n
  for j = 1 to n
    A[j] = f(A[j], A[j-2])

slide-4
SLIDE 4

Motivating Example

[Figure: array A along axis j] t = 2, (i, j) = (1, 2)

for i = 1 to n
  for j = 1 to n
    A[j] = f(A[j], A[j-2])

slide-5
SLIDE 5

Motivating Example

[Figure: array A along axis j] t = 3, (i, j) = (1, 3)

for i = 1 to n
  for j = 1 to n
    A[j] = f(A[j], A[j-2])

slide-6
SLIDE 6

Motivating Example

[Figure: array A along axis j] t = 4, (i, j) = (1, 4)

for i = 1 to n
  for j = 1 to n
    A[j] = f(A[j], A[j-2])

slide-7
SLIDE 7

Motivating Example

[Figure: array A along axis j] t = 5, (i, j) = (1, 5)

for i = 1 to n
  for j = 1 to n
    A[j] = f(A[j], A[j-2])

slide-8
SLIDE 8

Motivating Example

[Figure: array A along axis j] t = 6, (i, j) = (2, 1)

for i = 1 to n
  for j = 1 to n
    A[j] = f(A[j], A[j-2])

slide-9
SLIDE 9

Motivating Example

[Figure: array A along axis j] t = 7, (i, j) = (2, 2)

for i = 1 to n
  for j = 1 to n
    A[j] = f(A[j], A[j-2])

slide-10
SLIDE 10

Motivating Example

[Figure: array A along axis j] t = 8, (i, j) = (2, 3)

for i = 1 to n
  for j = 1 to n
    A[j] = f(A[j], A[j-2])

slide-11
SLIDE 11

Motivating Example

[Figure: array A along axis j] t = 9, (i, j) = (2, 4)

for i = 1 to n
  for j = 1 to n
    A[j] = f(A[j], A[j-2])

slide-12
SLIDE 12

Motivating Example

[Figure: array A along axis j] t = 10, (i, j) = (2, 5)

for i = 1 to n
  for j = 1 to n
    A[j] = f(A[j], A[j-2])

slide-13
SLIDE 13

Motivating Example

[Figure: array A along axis j] t = 25, (i, j) = (5, 5)

for i = 1 to n
  for j = 1 to n
    A[j] = f(A[j], A[j-2])
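The frames above step through the original loop nest one iteration per time step. A minimal sketch of that sequential schedule, assuming 1-based loop counters; the helper name `sequential_time` is hypothetical and not from the talk:

```python
def sequential_time(i, j, n):
    """Time step at which iteration (i, j) runs in the original loop nest.

    Assumes 1-based counters and exactly one iteration per time step.
    """
    return (i - 1) * n + j


# Table of the sequential schedule for n = 5: time step -> iteration.
schedule = {
    sequential_time(i, j, 5): (i, j)
    for i in range(1, 6)
    for j in range(1, 6)
}
```

With n = 5 the table has 25 entries, so the last iteration (5, 5) runs at t = 25.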

slide-14
SLIDE 14

Motivating Example

[Figure: array A along axis j] t = 25, (i, j) = (5, 5)

for i = 1 to n
  for j = 1 to n
    A[j] = f(A[j], A[j-2])

Array Expansion [Figure: array A along axes i, j]

init A[0][j]
for i = 1 to n
  for j = 1 to n
    A[i][j] = f(A[i-1][j], A[i][j-2])

slide-15
SLIDE 15

Motivating Example

[Figure: array A along axis j] t = 25, (i, j) = (5, 5)

for i = 1 to n
  for j = 1 to n
    A[j] = f(A[j], A[j-2])

Array Expansion [Figure: array A along axes i, j] t = 1, (i, j) = (1, 1)

init A[0][j]
for i = 1 to n
  for j = 1 to n
    A[i][j] = f(A[i-1][j], A[i][j-2])

slide-16
SLIDE 16

Motivating Example

[Figure: array A along axis j] t = 25, (i, j) = (5, 5)

for i = 1 to n
  for j = 1 to n
    A[j] = f(A[j], A[j-2])

Array Expansion [Figure: array A along axes i, j] t = 2, (i, j) = {(1, 2), (2, 1)}

init A[0][j]
for i = 1 to n
  for j = 1 to n
    A[i][j] = f(A[i-1][j], A[i][j-2])

slide-17
SLIDE 17

Motivating Example

[Figure: array A along axis j] t = 25, (i, j) = (5, 5)

for i = 1 to n
  for j = 1 to n
    A[j] = f(A[j], A[j-2])

Array Expansion [Figure: array A along axes i, j] t = 3, (i, j) = {(1, 3), (2, 2), (3, 1)}

init A[0][j]
for i = 1 to n
  for j = 1 to n
    A[i][j] = f(A[i-1][j], A[i][j-2])

slide-18
SLIDE 18

Motivating Example

[Figure: array A along axis j] t = 25, (i, j) = (5, 5)

for i = 1 to n
  for j = 1 to n
    A[j] = f(A[j], A[j-2])

Array Expansion [Figure: array A along axes i, j] t = 4, (i, j) = {(1, 4), (2, 3), (3, 2), (4, 1)}

init A[0][j]
for i = 1 to n
  for j = 1 to n
    A[i][j] = f(A[i-1][j], A[i][j-2])

slide-19
SLIDE 19

Motivating Example

[Figure: array A along axis j] t = 25, (i, j) = (5, 5)

for i = 1 to n
  for j = 1 to n
    A[j] = f(A[j], A[j-2])

Array Expansion [Figure: array A along axes i, j] t = 5, (i, j) = {(1, 5), (2, 4), (3, 3), (4, 2), (5, 1)}

init A[0][j]
for i = 1 to n
  for j = 1 to n
    A[i][j] = f(A[i-1][j], A[i][j-2])

slide-20
SLIDE 20

Motivating Example

[Figure: array A along axis j] t = 25, (i, j) = (5, 5)

for i = 1 to n
  for j = 1 to n
    A[j] = f(A[j], A[j-2])

Array Expansion [Figure: array A along axes i, j] t = 6, (i, j) = {(2, 5), (3, 4), (4, 3), (5, 2)}

init A[0][j]
for i = 1 to n
  for j = 1 to n
    A[i][j] = f(A[i-1][j], A[i][j-2])

slide-21
SLIDE 21

Motivating Example

[Figure: array A along axis j] t = 25, (i, j) = (5, 5)

for i = 1 to n
  for j = 1 to n
    A[j] = f(A[j], A[j-2])

Array Expansion [Figure: array A along axes i, j] t = 7, (i, j) = {(3, 5), (4, 4), (5, 3)}

init A[0][j]
for i = 1 to n
  for j = 1 to n
    A[i][j] = f(A[i-1][j], A[i][j-2])

slide-22
SLIDE 22

Motivating Example

[Figure: array A along axis j] t = 25, (i, j) = (5, 5)

for i = 1 to n
  for j = 1 to n
    A[j] = f(A[j], A[j-2])

Array Expansion [Figure: array A along axes i, j] t = 8, (i, j) = {(4, 5), (5, 4)}

init A[0][j]
for i = 1 to n
  for j = 1 to n
    A[i][j] = f(A[i-1][j], A[i][j-2])

slide-23
SLIDE 23

Motivating Example

[Figure: array A along axis j] t = 25, (i, j) = (5, 5)

for i = 1 to n
  for j = 1 to n
    A[j] = f(A[j], A[j-2])

Array Expansion [Figure: array A along axes i, j] t = 9, (i, j) = (5, 5)

init A[0][j]
for i = 1 to n
  for j = 1 to n
    A[i][j] = f(A[i-1][j], A[i][j-2])
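The expanded version finishes in 9 steps instead of 25 because all iterations with the same i + j can run together. A minimal sketch that runs both versions and compares the results. The function `f`, the boundary handling for A[-1] and A[0], and all helper names are assumptions made for illustration; the slides leave them unspecified.

```python
def f(x, y):
    # Hypothetical stand-in for the slides' unspecified function f.
    return 3 * x + y + 1


def run_original(n, init):
    """Sequential loop nest on the 1-D array; init maps j in -1..n to start values."""
    A = dict(init)
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            A[j] = f(A[j], A[j - 2])
    return [A[j] for j in range(1, n + 1)]


def run_expanded(n, init):
    """Expanded array A[i][j], executed one wavefront (i + j constant) at a time."""
    A = {}

    def read(i, j):
        # Row 0 and columns j < 1 hold the initial values (boundary assumption).
        return init[j] if i == 0 or j < 1 else A[(i, j)]

    steps = 0
    for s in range(2, 2 * n + 1):  # s = i + j
        steps += 1
        # Iterations on one wavefront are independent, so any order works here.
        for i in range(1, n + 1):
            j = s - i
            if 1 <= j <= n:
                A[(i, j)] = f(read(i - 1, j), read(i, j - 2))
    return [A[(n, j)] for j in range(1, n + 1)], steps


init = {j: j * j for j in range(-1, 6)}
expanded_result, steps = run_expanded(5, init)
```

For n = 5 the wavefront version takes 9 steps and produces the same final row as the sequential loop, at the cost of storing every intermediate A[i][j].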

slide-24
SLIDE 24

Parallelism/Storage Tradeoff

  • Increasing storage can enable parallelism

– But storage can be expensive

  • Phase ordering problem

– Optimizing for storage restricts parallelism
– Maximizing parallelism restricts storage options
– Too complex to consider all combinations

Need efficient framework to integrate schedule and storage optimization

[Figure: storage hierarchy: Cache, RAM, Disk]

slide-25
SLIDE 25

Outline

  • Abstract problem
  • Simplifications
  • Concrete problem
  • Solution Method
  • Conclusions
slide-26
SLIDE 26

Abstract Problem

  • Given DAG of dependent operations

– Must execute producers before consumers
– Must store a value until all consumers execute

  • Two parameters control execution:

– 1. A scheduling function θ: maps each operation to an execution time; parallelism is implicit
– 2. A fully associative store of size m
slide-27
SLIDE 27

Abstract Problem

  • Two parameters control execution:

– 1. A scheduling function θ: maps each operation to an execution time; parallelism is implicit
– 2. A fully associative store of size m

  • We can ask three questions:
slide-28
SLIDE 28

Abstract Problem

  • Two parameters control execution:

– 1. A scheduling function θ: maps each operation to an execution time; parallelism is implicit
– 2. A fully associative store of size m

  • We can ask three questions:

– 1. Given θ, what is the smallest m?
slide-29
SLIDE 29

Abstract Problem

  • Two parameters control execution:

– 1. A scheduling function θ: maps each operation to an execution time; parallelism is implicit
– 2. A fully associative store of size m

  • We can ask three questions:

– 1. Given θ, what is the smallest m?
– 2. Given m, what is the “best” θ?
slide-30
SLIDE 30

Abstract Problem

  • Two parameters control execution:

– 1. A scheduling function θ: maps each operation to an execution time; parallelism is implicit
– 2. A fully associative store of size m

  • We can ask three questions:

– 1. Given θ, what is the smallest m?
– 2. Given m, what is the “best” θ?
– 3. What is the smallest m that is valid for all legal θ?
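
Question 1 can be made concrete on a small DAG. A minimal sketch, assuming that a value occupies a slot from the step it is produced until the step of its last consumer, at which point the slot may be reused (a value with no consumers holds a slot for one step). The function name and the dictionary encoding of the DAG are hypothetical.

```python
def min_store_size(consumers, theta):
    """Smallest store size m for schedule theta on a DAG.

    consumers maps each operation to the list of operations that read its value.
    theta maps each operation to its execution time.
    """
    events = []
    for op, users in consumers.items():
        start = theta[op]
        end = max((theta[u] for u in users), default=start + 1)
        events.append((start, +1))  # slot allocated
        events.append((end, -1))    # slot released
    live = best = 0
    # Sorting puts releases (-1) before allocations (+1) at the same time step.
    for _, delta in sorted(events):
        live += delta
        best = max(best, live)
    return best


# Two independent chains: a1 -> a2 and b1 -> b2.
dag = {"a1": ["a2"], "a2": [], "b1": ["b2"], "b2": []}
parallel = {"a1": 0, "b1": 0, "a2": 1, "b2": 1}
sequential = {"a1": 0, "a2": 1, "b1": 2, "b2": 3}
```

Under these assumptions the parallel schedule needs two slots while the sequential one needs only one, which is the parallelism/storage tradeoff in miniature.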
slide-31
SLIDE 31

Outline

  • Abstract problem
  • Simplifications
  • Concrete problem
  • Solution Method
  • Conclusions
slide-32
SLIDE 32

Simplifying the Schedule

  • Real programs aren’t DAGs

– Dependence graph is parameterized by loops
– Too many nodes to schedule

  • Size could even be unknown (symbolic constants)
  • Use classical solution: affine schedules

– Each statement has a scheduling function θ
– Each θ is an affine function of the enclosing loop counters and symbolic constants
– To simplify talk, ignore symbolic constants: θ(i) = B • i

slide-33
SLIDE 33

Simplifying the Storage Mapping

  • Programs use arrays, not associative maps

– If size decreases, need to specify which elements are mapped to the same location

slide-36
SLIDE 36

Simplifying the Storage Mapping

Occupancy Vectors (Strout et al.)

  • Specifies unit of overwriting within an array
  • Locations collapsed if separated by a multiple of v

[Figure: iteration space with axes j, i and occupancy vector v = (1, 1)]

slide-39
SLIDE 39

Simplifying the Store

Occupancy Vectors (Strout et al.)

  • Specifies unit of overwriting within an array
  • Locations collapsed if separated by a multiple of v

[Figure: iteration space with axes j, i and occupancy vector v = (1, 1)]

slide-45
SLIDE 45

Simplifying the Store

Occupancy Vectors (Strout et al.)

  • Specifies unit of overwriting within an array
  • Locations collapsed if separated by a multiple of v

[Figure: iteration space with axes j, i and occupancy vector v = (1, 1)]

Original: n², Transformed: 2n

slide-46
SLIDE 46

Simplifying the Store

Occupancy Vectors (Strout et al.)

  • For a given schedule, v is valid if semantics are unchanged using transformed array
  • Shorter vectors require less storage

[Figure: iteration space with axes j, i and occupancy vector v = (1, 1)]

Original: n², Transformed: 2n
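
The collapsing rule can be sketched as a mapping from array locations to storage classes. This is a 2-D illustration only, assuming the components of v have no common factor; the names `storage_class` and `count_locations` are hypothetical.

```python
from math import gcd


def storage_class(point, v):
    """Key shared by all locations separated by a multiple of v (2-D only).

    Assumes gcd(|v[0]|, |v[1]|) == 1. The cross product of point and v is then
    equal for exactly those points that differ by an integer multiple of v.
    """
    assert gcd(abs(v[0]), abs(v[1])) == 1
    return point[0] * v[1] - point[1] * v[0]


def count_locations(n, v):
    """Number of distinct storage locations for an n-by-n array collapsed along v."""
    return len({
        storage_class((i, j), v)
        for i in range(1, n + 1)
        for j in range(1, n + 1)
    })
```

For v = (1, 1) the key is i - j, so a 5-by-5 array collapses to 2n - 1 = 9 locations instead of n² = 25.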

slide-47
SLIDE 47

Outline

  • Abstract problem
  • Simplifications
  • Concrete problem
  • Solution Method
  • Conclusions
slide-48
SLIDE 48

Answering Question #1

  • Given θ(i, j) = i + j, what is the shortest valid occupancy vector v?

[Figure: iteration space with axes i, j]

slide-50
SLIDE 50

Answering Question #1

  • Given θ(i, j) = i + j, what is the shortest valid occupancy vector v?

Solution: v = (1, 1)

[Figure: iteration space with axes i, j]

slide-63
SLIDE 63

Answering Question #1

  • Given θ(i, j) = i + j, what is the shortest valid occupancy vector v?

Why not v = (0, 1)?

[Figure: iteration space with axes i, j]

slide-70
SLIDE 70

Answering Question #1

  • Given θ(i, j) = i + j, what is the shortest valid occupancy vector v?

Why not v = (0, 1)?

[Figure: iteration space with axes i, j, marked "???"]

slide-71
SLIDE 71

Answering Question #2

  • Given v = (0, 1), what is the range of valid schedules θ?

[Figure: iteration space with axes i, j]

slide-72
SLIDE 72

Answering Question #2

  • Given v = (0, 1), what is the range of valid schedules θ?

θ(i, j) is between:

– θ(i, j) = 2 ∗ i + j (inclusive)
– θ(i, j) = i (exclusive)

[Figure: iteration space with axes i, j]

slide-79
SLIDE 79

Answering Question #2

  • Given v = (0, 1), what is the range of valid schedules θ?

Let’s try θ(i, j) = 2 ∗ i + j

[Figure: iteration space with axes j, i]

slide-94
SLIDE 94

Answering Question #3

  • What is the shortest v that is valid for all legal affine schedules?

[Figure: iteration space with axes i, j]

slide-95
SLIDE 95

Answering Question #3

  • What is the shortest v that is valid for all legal affine schedules?

[Figure: iteration space with axes i, j, showing the range of legal θ]

slide-104
SLIDE 104

Answering Question #3

  • What is the shortest v that is valid for all legal affine schedules?

[Figure: iteration space with axes i, j, showing the range of legal θ and v = (2, 1)]

slide-107
SLIDE 107

Answering Question #3

  • What is the shortest v that is valid for all legal affine schedules?
  • Def: v is an affine occupancy vector (AOV)

[Figure: iteration space with axes i, j, showing the range of legal θ and v = (2, 1)]

slide-108
SLIDE 108

Outline

  • Abstract problem
  • Simplifications
  • Concrete problem
  • Solution Method
  • Conclusions
slide-109
SLIDE 109

Schedule Constraints

  • Dependence analysis yields:

– iteration i depends on iteration h(i)
– h is an affine function

  • Consumer must execute after producer

Schedule Constraint: θ(i) ≥ θ(h(i)) + 1

[Figure: iteration space with axes i1, i2, showing iteration i and h(i)]

slide-110
SLIDE 110

Storage Constraints

[Figure: iteration space with axes i1, i2, showing iteration i and h(i)]

slide-111
SLIDE 111

Storage Constraints

[Figure: iteration space with axes i1, i2, showing i, h(i), and occupancy vector v]

slide-113
SLIDE 113

Storage Constraints

[Figure: iteration space with axes i1, i2, showing i, h(i), and occupancy vector v]

dynamic single assignment

for i = 1 to n
  for j = 1 to n
    A[i][j] = …
    B[i][j] = …

slide-114
SLIDE 114

Storage Constraints

[Figure: iteration space with axes i1, i2, showing i, h(i), and occupancy vector v]

Consumer: i
Producer: h(i)

slide-115
SLIDE 115

Storage Constraints

[Figure: iteration space with axes i1, i2, showing i, h(i), h(i) + v, and occupancy vector v]

Consumer: i
Producer: h(i)
Overwriting producer: h(i) + v

slide-116
SLIDE 116

Storage Constraints

[Figure: iteration space with axes i1, i2, showing i, h(i), h(i) + v, and occupancy vector v]

Consumer: i
Producer: h(i)
Overwriting producer: h(i) + v

Consumer must execute before producer is overwritten

slide-117
SLIDE 117

Storage Constraints

[Figure: iteration space with axes i1, i2, showing i, h(i), h(i) + v, and occupancy vector v]

Consumer: i
Producer: h(i)
Overwriting producer: h(i) + v

Consumer must execute before producer is overwritten

Storage Constraint: θ(i) ≤ θ(h(i) + v)

slide-118
SLIDE 118

The Constraints

  • A given (θ, v) combination is valid if

– For all dependences h,
– For all iterations i in the program:

θ(i) ≥ θ(h(i)) + 1 (schedule constraint)
θ(i) ≤ θ(h(i) + v) (storage constraint)

  • Given θ, want to find v satisfying constraints

– Might look simple, but it is not
– Too many i’s and n’s to enumerate!
– Need to reduce the number of constraints

slide-119
SLIDE 119

The Constraints

  • A given (θ, v) combination is valid if

– For all dependences h,
– For all iterations i in the program:

θ(i) ≥ θ(h(i)) + 1 (schedule constraint)
θ(i) ≤ θ(h(i) + v) (storage constraint)

  • Given θ, want to find v satisfying constraints

– Might look simple, but it is not
– Too many i’s to enumerate!
– Need to reduce the number of constraints
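
For a small fixed n the two constraints can simply be enumerated. A brute-force sketch on the expanded motivating example A[i][j] = f(A[i-1][j], A[i][j-2]), whose dependences are h(i, j) = (i-1, j) and h(i, j) = (i, j-2). All names are hypothetical. Vectors are stored in (i, j) order here; the deck's figures appear to list v with the j component first (an assumption inferred from the axis labels, not stated in the talk), so the deck's v = (0, 1) is entered as (1, 0) and its v = (2, 1) as (1, 2).

```python
def theta(B, p):
    """Affine schedule without symbolic constants: theta(p) = B . p."""
    return B[0] * p[0] + B[1] * p[1]


# Dependences of A[i][j] = f(A[i-1][j], A[i][j-2]), as functions h(i, j).
DEPENDENCES = [lambda p: (p[0] - 1, p[1]), lambda p: (p[0], p[1] - 2)]


def is_valid(B, v, n=5):
    """Brute-force check of the schedule and storage constraints on an n-by-n domain."""
    domain = {(i, j) for i in range(1, n + 1) for j in range(1, n + 1)}
    for h in DEPENDENCES:
        for p in domain:
            producer = h(p)
            if producer not in domain:
                continue  # value comes from initialization, not from an iteration
            overwriter = (producer[0] + v[0], producer[1] + v[1])
            if theta(B, p) < theta(B, producer) + 1:  # schedule constraint
                return False
            if theta(B, p) > theta(B, overwriter):    # storage constraint
                return False
    return True
```

Under these assumptions, θ(i, j) = i + j accepts v = (1, 1) but rejects both unit vectors, and θ(i, j) = 2 ∗ i + j accepts the shorter vector entered as (1, 0). Enumeration only works because n is fixed and small, which is why the number of constraints has to be reduced.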

slide-120
SLIDE 120

The Vertex Method (1-D)

  • An affine function is non-negative within an interval [x1, x2] iff it is non-negative at x1 and x2

[Figure: interval with endpoints x1, x2]

slide-121
SLIDE 121

The Vertex Method (1-D)

  • An affine function is non-negative over an unbounded interval [x1, ∞) iff it is non-negative at x1 and is non-decreasing along the interval

[Figure: unbounded interval starting at x1]

slide-122
SLIDE 122

The Vertex Method

  • The same result holds in higher dimensions

– An affine function is nonnegative over a bounded polyhedron D iff it is nonnegative at vertices of D
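
A small sketch of the bounded case, using an axis-aligned box as the polyhedron D (a simplifying assumption; the result holds for any bounded polyhedron). It compares the check at the corners with a brute-force check over every integer point. Function names are hypothetical.

```python
from itertools import product


def affine(coeffs, const):
    """Return f(x) = coeffs . x + const."""
    return lambda x: sum(c * xi for c, xi in zip(coeffs, x)) + const


def nonneg_at_vertices(f, lo, hi):
    """Check f >= 0 at the 2^d corners of the box [lo, hi]."""
    return all(f(corner) >= 0 for corner in product(*zip(lo, hi)))


def nonneg_everywhere(f, lo, hi):
    """Brute-force check over every integer point of the box, for comparison."""
    ranges = [range(a, b + 1) for a, b in zip(lo, hi)]
    return all(f(x) >= 0 for x in product(*ranges))


lo, hi = (0, 0), (3, 2)
f_ok = affine((1, 1), 0)    # x + y
f_bad = affine((1, -2), 3)  # x - 2y + 3, negative at the corner (0, 2)
```

The corner check looks at 4 points while the brute-force check looks at 12, and the two agree for both functions.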

slide-123
SLIDE 123

Applying the Method (Quinton87)

  • Recall the storage constraints

– For all iterations i in the program: θ(i) ≤ θ(h(i) + v)
– Re-arrange: 0 ≤ θ(h(i) + v) - θ(i)

  • The right hand side is:

– 1. An affine function of i
– 2. Nonnegative over the domain D of iterations

We can apply the vertex method

slide-124
SLIDE 124

Applying the Method

  • Replace i with the vertices w of its domain:

∀i∈D, θ(h(i) + v) - θ(i) ≥ 0

becomes

θ(h(w1) + v) - θ(w1) ≥ 0
θ(h(w2) + v) - θ(w2) ≥ 0
θ(h(w3) + v) - θ(w3) ≥ 0
θ(h(w4) + v) - θ(w4) ≥ 0

[Figure: iteration space with axes i1, i2 and vertices w1, w2, w3, w4, plotting θ(h(i) + v) - θ(i)]

slide-125
SLIDE 125

The Reduced Constraints

  • Apply same method to schedule constraints
  • Now a given (θ, v) combination is valid if

– For all dependences h,
– For all vertices w of the iteration domain:

θ(w) ≥ θ(h(w)) + 1 (schedule constraint)
θ(w) ≤ θ(h(w) + v) (storage constraint)

  • These are linear constraints

– θ and v are variables; h and w are constants
– Given θ, constraints are linear in v (& vice-versa)

slide-126
SLIDE 126

Answering the Questions

θ(w) ≥ θ(h(w)) + 1 (schedule constraint)
θ(w) ≤ θ(h(w) + v) (storage constraint)

  • 1. Given θ, we can “minimize” |v|

– Linear programming problem

  • 2. Given v, we can find a “good” θ

– Feautrier, 1992

  • 3. To find an AOV... still too many constraints!

– For all θ satisfying the schedule constraints: v must satisfy the storage constraints
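
Question 1 can be illustrated without an LP solver. The sketch below is a stand-in that enumerates small integer vectors rather than solving the linear program. It relies on the dependences of the motivating example being uniform: with d = h(i) - i constant, the storage constraint θ(i) ≤ θ(h(i) + v) reduces to B • (d + v) ≥ 0 for θ(i) = B • i. The names and the search radius are hypothetical.

```python
from itertools import product

# Distance vectors d = h(i) - i for A[i][j] = f(A[i-1][j], A[i][j-2]), in (i, j) order.
DISTANCES = [(-1, 0), (0, -2)]


def satisfies_storage(B, v):
    """Storage constraint for uniform dependences: B . (d + v) >= 0 for each d."""
    return all(
        B[0] * (d[0] + v[0]) + B[1] * (d[1] + v[1]) >= 0
        for d in DISTANCES
    )


def shortest_occupancy_vector(B, radius=4):
    """Shortest nonzero integer v in the search box that satisfies the storage constraint."""
    candidates = [
        v
        for v in product(range(-radius, radius + 1), repeat=2)
        if v != (0, 0) and satisfies_storage(B, v)
    ]
    return min(candidates, key=lambda v: v[0] ** 2 + v[1] ** 2)
```

For θ(i, j) = i + j, that is B = (1, 1), the search returns v = (1, 1). A real implementation would hand the same linear constraints to an LP solver instead of enumerating.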

slide-127
SLIDE 127

Finding an AOV

θ(w) ≥ θ(h(w)) + 1 (schedule constraint)
θ(w) ≤ θ(h(w) + v) (storage constraint)

  • Apply the vertex method again!
  • Schedule constraints define domain of valid θ
  • Storage constraints can be written as a non-negative affine function of components of θ:

– Expand θ(i) = B • i: B • w ≤ B • (h(w) + v)
– Simplify: (h(w) + v – w) • B ≥ 0

slide-128
SLIDE 128

Finding an AOV

  • Our constraints are now as follows:

– For all dependences h,
– For all vertices w of the iteration domain,
– For all vertices t of the space of valid schedules:

t • w ≤ t • (h(w) + v) (AOV constraint)

  • Can find “shortest” AOV with linear program

– Finite number of constraints
– h, w, and t are known constants
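
A sketch of the AOV search for the motivating example, again enumerating integer vectors as a stand-in for the linear program. The description of the schedule space is derived by hand for this one example and is an assumption of the sketch: with uniform distances d = h(i) - i, legal schedules B = (a, b) satisfy a ≥ 1 and 2b ≥ 1, an unbounded region with one vertex and two rays, so the constraint is checked at the vertex and along each ray (the unbounded case of the vertex method). All names are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Distance vectors d = h(i) - i for A[i][j] = f(A[i-1][j], A[i][j-2]), in (i, j) order.
DISTANCES = [(-1, 0), (0, -2)]

# Hand-derived for this example: legal B = (a, b) satisfy a >= 1 and 2b >= 1.
SCHEDULE_VERTICES = [(Fraction(1), Fraction(1, 2))]
SCHEDULE_RAYS = [(1, 0), (0, 1)]


def is_aov(v):
    """AOV constraint t . (d + v) >= 0 at every vertex and along every ray."""
    return all(
        t[0] * (d[0] + v[0]) + t[1] * (d[1] + v[1]) >= 0
        for t in SCHEDULE_VERTICES + SCHEDULE_RAYS
        for d in DISTANCES
    )


def shortest_aov(radius=4):
    """Shortest integer AOV inside the search box."""
    candidates = [
        v for v in product(range(-radius, radius + 1), repeat=2) if is_aov(v)
    ]
    return min(candidates, key=lambda v: v[0] ** 2 + v[1] ** 2)
```

The search returns (1, 2) in (i, j) order. That agrees with the deck's v = (2, 1) if the figures list the j component first, which is an assumption rather than something the talk states.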

slide-129
SLIDE 129

The Big Picture

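[Flow diagram]

Input program → (dependence analysis) → Affine Dependences → Schedule & Storage Constraints
Schedule & Storage Constraints → (vertex method) → Constraints without i
Constraints without i → (linear program) → Given θ, find v; Given v, find θ
Constraints without i → (vertex method) → Constraints without θ
Constraints without θ → (linear program) → Find an AOV, valid for all θ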

slide-130
SLIDE 130

Details in Paper

  • Symbolic constants
  • Inter-statement dependences across loops
  • Farkas’ Lemma for improved efficiency
slide-131
SLIDE 131

Related Work

  • Universal Occupancy Vector (Strout et al.)

– Valid for all schedules, not just affine ones
– Stencil of dependences in single loop nest

  • Storage for ALPHA programs (Quilleré, Rajopadhye, Wilde)

– Polyhedral model, with occupancy vector analog
– Assume schedule is given

  • PAF compiler (Cohen, Lefebvre, Feautrier)

– Minimal expansion → scheduling → contraction
– Storage mapping A[i mod x][j mod y]

slide-132
SLIDE 132

Future Work

  • Allow affine left hand side references

– A[2∗j][n-i] = …

  • Consider multi-dimensional time schedules
  • Collapse multiple dimensions of storage
slide-133
SLIDE 133

Conclusions

  • Unified framework for determining:

– 1. A good storage mapping for a given schedule
– 2. A good schedule for a given storage mapping
– 3. A good storage mapping for all valid schedules

  • Take away: representations and techniques

– Occupancy vectors
– Affine schedules
– Vertex method