PetaBricks: A Language and Compiler for Algorithmic Choice

PetaBricks A Language and Compiler for Algorithmic Choice Jason Ansel Cy Chan Yee Lok Wong Marek Olszewski Qin Zhao Alan Edelman Saman Amarasinghe MIT - CSAIL June 16, 2009 Jason Ansel (MIT) PetaBricks June 16, 2009 1 / 47


slide-1
SLIDE 1

PetaBricks

A Language and Compiler for Algorithmic Choice Jason Ansel Cy Chan Yee Lok Wong Marek Olszewski Qin Zhao Alan Edelman Saman Amarasinghe

MIT - CSAIL

June 16, 2009

Jason Ansel (MIT) PetaBricks June 16, 2009 1 / 47

slide-2
SLIDE 2

Introduction Motivating Example

Outline

1. Introduction: Motivating Example, Language & Compiler Overview, Why choices
2. PetaBricks Language: Key Ideas, Compilation Example, Other Language Features
3. Results: Benchmarks, Scalability, Variable Accuracy
4. Conclusion: Final thoughts

slide-13
SLIDE 13

Introduction Motivating Example

Algorithmic choice

Candidate algorithms: Quicksort, Insertionsort, Radixsort, Mergesort (N-way)

[Figure: hybrid sort configurations. The "@" values are the cutoffs labeled in the figure; N is the number of merge ways.]

  STL Algorithm:                    @15, N=2
  Optimized for Xeon (1 core):      @98, @75, N=4
  Optimized for Xeon (8 cores):     @1420, @ 6, N=2
  Optimized for Niagara (8 cores):  @75, @1461, @2400, N=2,4,8,16
  Optimized for Core 2 (2 cores):   @150, @600, @1295, @38400, N=2,4,8

slide-19
SLIDE 19

Introduction Motivating Example

The PetaBricks language

The case for autotuning is obvious
How should the programmer represent choice?
We present the PetaBricks programming language and compiler:

  • Choice as a fundamental language construct
  • Autotuning performed by the compiler
  • Automatically parallelized

slide-20
SLIDE 20

Introduction Motivating Example

Sort in PetaBricks

transform Sort
from A[n]
to B[n]
{
  from (A a) to (B b) {
    tunable WAYS;
    /* Mergesort */
  } or {
    /* Insertionsort */
  } or {
    /* Radixsort */
  } or {
    /* Quicksort */
  }
}
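As a rough illustration of what the Sort transform leaves open, the Python sketch below hand-writes one possible hybrid: insertion sort below a cutoff and N-way mergesort above it. The names `hybrid_sort`, `cutoff`, and `ways` are hypothetical stand-ins (with `ways` playing the role of the tunable WAYS); none of this is PetaBricks code, where the equivalent decisions would be left to the autotuner.

```python
import heapq


def insertion_sort(items):
    """Return a sorted copy; cheap for very small inputs."""
    out = list(items)
    for i in range(1, len(out)):
        key = out[i]
        j = i - 1
        while j >= 0 and out[j] > key:
            out[j + 1] = out[j]
            j -= 1
        out[j + 1] = key
    return out


def hybrid_sort(items, cutoff=75, ways=4):
    """Insertion sort at or below `cutoff`, otherwise `ways`-way mergesort.

    `cutoff` and `ways` are illustrative stand-ins for tuned parameters.
    """
    ways = max(ways, 2)  # fewer than two ways would never shrink the input
    n = len(items)
    if n <= max(cutoff, 1):
        return insertion_sort(items)
    # Split into at most `ways` nearly equal chunks and sort each recursively.
    step = -(-n // ways)  # ceiling division
    chunks = [hybrid_sort(items[i:i + step], cutoff, ways)
              for i in range(0, n, step)]
    # heapq.merge performs the N-way merge of the sorted chunks.
    return list(heapq.merge(*chunks))
```

Changing `cutoff` or `ways` changes only the running time, not the result, which is what makes them candidates for automatic tuning.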

slide-33
SLIDE 33

Introduction Language & Compiler Overview

The PetaBricks compiler

Sort is compiled into an autotuning binary
Trained on target architecture

  • Structured genetic tuner
  • Trained with full number of threads
  • Under 1 minute for Sort

Results fed back into the compiler
Final binary created
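The train-then-feed-back loop can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: it swaps in an exhaustive search for the structured genetic tuner, and the helpers `measure_seconds`, `tune`, and `to_config_text` are hypothetical names, not part of the PetaBricks toolchain.

```python
import json
import time


def measure_seconds(run, data, config):
    """Wall-clock time of one call to run(data, config) on a copy of data."""
    start = time.perf_counter()
    run(list(data), config)
    return time.perf_counter() - start


def tune(configs, cost):
    """Return the config with the lowest measured cost.

    Exhaustive search, used here in place of a genetic tuner for brevity.
    """
    best, best_cost = None, float("inf")
    for cfg in configs:
        c = cost(cfg)
        if c < best_cost:
            best, best_cost = cfg, c
    return best


def to_config_text(name, value):
    """Serialize the winning choice, loosely mimicking a configuration file."""
    return json.dumps({name: value})
```

A caller would pass something like `cost=lambda c: measure_seconds(my_sort, sample, c)` on the target machine, then hand the serialized result to the final build step.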

slide-38
SLIDE 38

Introduction Language & Compiler Overview

Sort algorithm timings

[Figure: Time (s) versus Input Size (sizes up to 1750, times up to 0.0025 s) for InsertionSort, QuickSort, MergeSort, RadixSort, and Autotuned]

Measured on an 8-way Xeon E7340 system

slide-39
SLIDE 39

Introduction Language & Compiler Overview

Timings on different architectures

Run on \ Trained on   Mobile   Xeon 1-way   Xeon 8-way   Niagara
Mobile                -        1.09x        1.67x        1.47x
Xeon 1-way            1.61x    -            2.08x        2.50x
Xeon 8-way            1.59x    2.14x        -            2.35x
Niagara               1.12x    1.51x        1.08x        -

slide-45
SLIDE 45

Introduction Why choices

Early compilers

Diagram labels: Constrained Input Language (No choices), Parsing, Code Gen

  • Early computers (and compilers) were weak
  • Parsing and code generation dominated compilation
  • Needed a constrained input language to simplify compilation

slide-50
SLIDE 50

Introduction Why choices

Current compilers

Diagram labels: Constrained Input Language (No choices), Parsing, Exposing Choices, Decisions, Code Gen

  • Current computers are much more powerful
  • Compilers can do a lot more
  • Input language is still constraining
  • Compilation dominated by exposing choices
  • Input language specifies only one:
      - Algorithmic choice
      - Iteration order choice
      - Parallelism strategy choice
      - Data layout choice
  • Compiler must perform heroic analysis to reconstruct other choices

slide-53
SLIDE 53

Introduction Why choices

PetaBricks compiler

Diagram labels: Rich Input Language (w/ choices), Parsing, Exploring Choices & Making Decisions, Code Gen

  • We propose explicit choices in the language
  • The programmer defines the space of legal:
      - Algorithmic choices
      - Iteration orders (including parallel)
      - Data layouts
  • Allows compilers to focus on exploring choices
  • Compiler no longer needs to reconstruct choices

slide-56
SLIDE 56

Introduction Why choices

Future-proof programs

  • The result: programs can adapt to their environment
  • Choices make programs less brittle
  • Programs change with architecture, available cores, inputs, etc.

slide-61
SLIDE 61

PetaBricks Language Key Ideas

Algorithmic choice in the language

  • Algorithmic choice is the key aspect of PetaBricks
  • Programmer can define multiple rules to compute the same data
  • Compiler reuses rules to create hybrid algorithms
  • Can express choices at many different granularities

slide-68
SLIDE 68

PetaBricks Language Key Ideas

Synthesized outer control flow

  • Outer control flow synthesized by compiler
  • Another choice that the programmer should not make:
      - By rows?
      - By columns?
      - Diagonal? Reverse order? Blocked?
      - Parallel?
  • Instead programmer provides explicit producer-consumer relations
  • Allows compiler to explore choice space

slide-70
SLIDE 70

PetaBricks Language Compilation Example

Simple example program

transform RollingSum
from A[n]
to B[n]
{
  // rule 0: use the previously computed value
  B.cell(i) from (A.cell(i) a,
                  B.cell(i-1) leftSum) {
    return a+leftSum;
  }

  // rule 1: sum all elements to the left
  B.cell(i) from (A.region(0, i) in) {
    return sum(in);
  }
}
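To make the two rules concrete, here is a small Python rendering of each. It is only a sketch: the function names are hypothetical, and it assumes that the region read by rule 1 includes element i, so that both rules produce the same B.

```python
def rolling_sum_rule0(a):
    """Rule 0: B[i] = A[i] + B[i-1]; needs B[i-1] first, so runs left to right."""
    b = [0] * len(a)
    for i in range(len(a)):
        if i >= 1:
            b[i] = a[i] + b[i - 1]
        else:
            # Rule 0 has no B[i-1] at i = 0, so this cell falls back to rule 1.
            b[i] = sum(a[:1])
    return b


def rolling_sum_rule1(a):
    """Rule 1: B[i] = sum of A[0..i]; every cell is independent of the others."""
    return [sum(a[: i + 1]) for i in range(len(a))]
```

Rule 0 does O(n) total work but is inherently sequential, while rule 1 does O(n^2) work with no dependencies between cells; that trade-off is the kind of choice the compiler and tuner are meant to explore.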

slide-72
SLIDE 72

PetaBricks Language Compilation Example

Simple example program

...
// rule 0: use the previously computed value
B.cell(i) from (A.cell(i) a,
                B.cell(i-1) leftSum) {
  return a+leftSum;
}
...

[Diagram: arrays A and B]

slide-73
SLIDE 73

PetaBricks Language Compilation Example

Simple example program

[Diagram: arrays A and B]

...
// rule 1: sum all elements to the left
B.cell(i) from (A.region(0, i) in) {
  return sum(in);
}
...

slide-76
SLIDE 76

PetaBricks Language Compilation Example

Applicable regions

Compilation Process: Applicable regions, Choice grids, Choice dependency graph

// rule 0: use the previously computed value
B.cell(i) from (A.cell(i) a,
                B.cell(i-1) leftSum) {
  return a+leftSum;
}

Applicable where 1 ≤ i < n

// rule 1: sum all elements to the left
B.cell(i) from (A.region(0, i) in) {
  return sum(in);
}

Applicable where 0 ≤ i < n

slide-80
SLIDE 80

PetaBricks Language Compilation Example

Choice grids

Compilation Process: Applicable regions, Choice grids, Choice dependency graph

Divide data space into symbolic regions with common sets of choices
In this simple example:

  • A: Input (no choices)
  • B: [0, 1) = rule 1
  • B: [1, n) = rule 0 or rule 1

[Diagram: B split at 1 and n; [0, 1) labeled R1, [1, n) labeled R0 or R1]

Applicable regions map rules → symbolic data
Choice grids map symbolic data → rules
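The inversion from applicable regions to a choice grid can be sketched as follows. This is an illustrative approximation, not the compiler's algorithm: it works on a concrete `n` rather than symbolic bounds, handles one dimension only, and the name `choice_grid` is hypothetical.

```python
def choice_grid(n, applicable):
    """Split [0, n) into maximal half-open intervals with the same rule set.

    `applicable` maps a rule name to its half-open applicable interval (lo, hi).
    Returns a list of ((lo, hi), [rule names]) pairs in increasing order.
    """
    # Every interval endpoint inside [0, n] is a potential region boundary.
    points = {0, n}
    for lo, hi in applicable.values():
        points.update(p for p in (lo, hi) if 0 <= p <= n)
    bounds = sorted(points)

    grid = []
    for lo, hi in zip(bounds, bounds[1:]):
        rules = sorted(name for name, (rlo, rhi) in applicable.items()
                       if rlo <= lo and hi <= rhi)
        if grid and grid[-1][1] == rules:
            # Merge with the previous region when the rule set is unchanged.
            grid[-1] = ((grid[-1][0][0], hi), rules)
        else:
            grid.append(((lo, hi), rules))
    return grid
```

For RollingSum with n = 8, passing `{"rule0": (1, 8), "rule1": (0, 8)}` yields the two regions listed above: [0, 1) with rule 1 only, and [1, 8) with either rule.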

slide-86
SLIDE 86

PetaBricks Language Compilation Example

Choice dependency graph

Compilation Process: Applicable regions, Choice grids, Choice dependency graph

[Graph: nodes A.region(0, n); B.region(0, 1) with Choices: r1; B.region(1, n) with Choices: r0, r1. Edge annotations: (r1,<=),(r0,=) on two edges and (r0,=,-1) on two edges]

  • Adds dependency edges between symbolic regions
  • Edges annotated with directions and rules
  • Many compiler passes on this IR to:
      - Simplify complex dependency patterns
      - Add choices
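One use of such a graph is ordering the regions so that each is computed after the regions it reads. The Python sketch below shows that idea with the standard-library `graphlib` module. The edge list in the usage note is an assumption about which regions the figure connects, and `region_schedule` is a hypothetical helper, not a PetaBricks compiler pass.

```python
from graphlib import TopologicalSorter


def region_schedule(edges):
    """Order regions so each is computed after the regions it reads.

    `edges` is a list of (source_region, destination_region) pairs.
    Self-edges are skipped: they constrain the iteration order inside one
    region rather than the order between regions.
    """
    deps = {}
    for src, dst in edges:
        deps.setdefault(src, set())
        deps.setdefault(dst, set())
        if src != dst:
            deps[dst].add(src)
    # TopologicalSorter takes a mapping of node -> predecessors.
    return list(TopologicalSorter(deps).static_order())
```

With assumed edges from A.region(0, n) to both B regions, from B.region(0, 1) to B.region(1, n), and a self-edge on B.region(1, n), the schedule is A.region(0, n), then B.region(0, 1), then B.region(1, n).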

slide-90
SLIDE 90

PetaBricks Language Compilation Example

Code generation

Diagram labels: PetaBricks Source Code, PetaBricks Compiler, Autotuning Binary, Choice Configuration File, Final Binary

1. PetaBricks source code is compiled
2. An autotuning binary is created
3. Autotuning occurs, creating a choice configuration file
4. Choices are fed back into the compiler to create a final binary

slide-93
SLIDE 93

PetaBricks Language Compilation Example

Autotuning

Based on two building blocks:

  • A genetic tuner
  • An n-ary search algorithm

Flat parameter space
Compiler generates a dependency graph describing this parameter space
Entire program tuned from bottom up
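A generic n-ary search over one integer parameter might look like the Python sketch below. It is an illustration of the general technique only: the slides do not give the tuner's actual procedure, the function name and signature are hypothetical, and the search assumes the cost is roughly unimodal over the interval.

```python
def nary_search(cost, lo, hi, n=4):
    """Approximately minimize cost(x) over the integers in [lo, hi].

    Each round samples up to n + 1 evenly spaced points and keeps the
    interval between the neighbors of the best sample.
    """
    n = max(n, 3)  # with fewer than 3 sub-intervals the range may not shrink
    while hi - lo > n:
        step = (hi - lo) / n
        points = sorted({lo + round(k * step) for k in range(n + 1)})
        best = min(points, key=cost)
        i = points.index(best)
        lo = points[max(i - 1, 0)]
        hi = points[min(i + 1, len(points) - 1)]
    # The remaining interval is small enough to check exhaustively.
    return min(range(lo, hi + 1), key=cost)
```

In a tuning setting, `cost` would run and time the program with the parameter set to `x`, so each evaluation is expensive and narrowing the interval quickly matters.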

slide-95
SLIDE 95

PetaBricks Language Compilation Example

Parallel Runtime Library

  • Task-based parallel runtime
  • Thread-local deques of runnable tasks
  • Uses a work-stealing algorithm similar to that of Cilk
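The core of work stealing can be shown with a single-threaded Python simulation: each worker takes work from one end of its own deque, and an idle worker steals from the other end of a victim's deque. This is a simplified sketch, not the PetaBricks runtime; the `Worker` class, `run_all`, and the victim-selection policy are assumptions made for illustration.

```python
from collections import deque


class Worker:
    def __init__(self, name):
        self.name = name
        self.tasks = deque()

    def push(self, task):
        self.tasks.append(task)

    def pop_local(self):
        """Owner takes the most recently pushed task (LIFO end)."""
        return self.tasks.pop() if self.tasks else None

    def steal_from(self, victim):
        """Thief takes the victim's oldest task (FIFO end)."""
        return victim.tasks.popleft() if victim.tasks else None


def run_all(workers):
    """Round-robin simulation: run local work if any, otherwise steal."""
    log = []
    while any(w.tasks for w in workers):
        for w in workers:
            task = w.pop_local()
            if task is None:
                # Assumed policy: steal from the worker with the most tasks.
                victim = max(workers, key=lambda v: len(v.tasks))
                task = w.steal_from(victim)
            if task is not None:
                log.append((w.name, task()))
    return log
```

Taking local work from the newest end and stealing from the oldest end keeps owners and thieves at opposite ends of the deque, which is the property that real work-stealing schedulers rely on to limit contention.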

slide-97
SLIDE 97

PetaBricks Language Other Language Features

More PetaBricks features

  • Automatic consistency checking
  • The tunable keyword
  • Call external code
  • Custom training data generators
  • Matrix versions for iterative algorithms
  • Rule priorities
  • where clause (for limiting applicable regions)
  • Template transforms

slide-102
SLIDE 102

Results Benchmarks

Eigenvector Solve

Bisection
QR decomposition
Divide and conquer

slide-107
SLIDE 107

Results Benchmarks

Eigenvector Solve

Plot: Time (s) versus Input Size (up to 1000). Series: Bisection, DC, QR, Autotuned, Cutoff 25

slide-108
SLIDE 108

Results Benchmarks

Matrix Multiply

Basic
Recursive decompositions
Strassen’s algorithm
Iteration order (blocking)
Transpose
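To make a few of these choices concrete, here is a minimal sketch in plain Python rather than PetaBricks. The function names and the `cutoff` parameter are illustrative assumptions, and the recursive version assumes square matrices. It shows a basic triple loop, a variant that transposes the second operand first, and a recursive quadrant decomposition that falls back to the transposed variant below the cutoff. A cutoff of this kind is the sort of parameter an autotuner could search over.

```python
def matmul_basic(a, b):
    """Textbook triple loop: c[i][j] = sum over k of a[i][k] * b[k][j]."""
    n, m, p = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]


def matmul_transposed(a, b):
    """Transpose b first so the inner loop walks rows of both operands."""
    bt = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def matmul_recursive(a, b, cutoff=2):
    """Split square matrices into quadrants; switch algorithm at the cutoff."""
    n = len(a)
    if n <= cutoff or n % 2:
        return matmul_transposed(a, b)
    h = n // 2

    def quad(m, r, c):
        return [row[c:c + h] for row in m[r:r + h]]

    def add(x, y):
        return [[p + q for p, q in zip(rx, ry)] for rx, ry in zip(x, y)]

    def rec(x, y):
        return matmul_recursive(x, y, cutoff)

    a11, a12, a21, a22 = quad(a, 0, 0), quad(a, 0, h), quad(a, h, 0), quad(a, h, h)
    b11, b12, b21, b22 = quad(b, 0, 0), quad(b, 0, h), quad(b, h, 0), quad(b, h, h)
    c11 = add(rec(a11, b11), rec(a12, b21))
    c12 = add(rec(a11, b12), rec(a12, b22))
    c21 = add(rec(a21, b11), rec(a22, b21))
    c22 = add(rec(a21, b12), rec(a22, b22))
    # Stitch the four result quadrants back into one matrix.
    return [l + r for l, r in zip(c11, c12)] + [l + r for l, r in zip(c21, c22)]


a = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
b = [[1, 0, 2, 0], [0, 1, 0, 2], [3, 0, 1, 0], [0, 3, 0, 1]]
c = matmul_recursive(a, b, cutoff=1)
```

All three functions compute the same product; they differ only in memory access pattern and recursion structure, which is what makes the choice between them a tuning decision.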

slide-114
SLIDE 114

Results Benchmarks

Matrix Multiply

Plot: Time (s) versus Input Size, both on log scales (time 1e-06 to 10000 s, size 1 to 10000). Series: Basic, Blocking, Transpose, Recursive, Strassen 256, Autotuned

slide-119
SLIDE 119

Results Scalability

Scalability

Plot: Speedup (1 to 8) versus Number of Threads (1 to 8). Series: Autotuned Matrix Multiply, Autotuned Sort, Autotuned Poisson, Autotuned Eigenvector Solve

slide-125
SLIDE 125

Results Variable Accuracy

Variable accuracy

Most algorithms produce exact solutions
Large class of algorithms can produce approximate solutions

Iterative convergence
Grid coarsening
Others

Compiler/autotuner should be aware of variable accuracy
Compiler can examine optimal frontier of algorithms
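One way to read "optimal frontier" is as the set of candidates that no other candidate beats on both time and accuracy. The sketch below computes such a frontier in Python. The function name and the sample measurements are made up for illustration and are not results from the talk.

```python
def optimal_frontier(candidates):
    """Return the (name, time, accuracy) entries not dominated by any other entry.

    An entry is dominated if another one is at least as fast and at least as accurate.
    """
    frontier = []
    best_accuracy = float("-inf")
    # Visit fastest first; on equal time, most accurate first.
    for name, time, accuracy in sorted(candidates, key=lambda c: (c[1], -c[2])):
        if accuracy > best_accuracy:
            frontier.append((name, time, accuracy))
            best_accuracy = accuracy
    return frontier


# Hypothetical measurements: (algorithm, time in seconds, accuracy achieved).
measured = [
    ("A", 1.0, 1e1),
    ("B", 2.0, 1e3),
    ("C", 3.0, 1e2),   # slower and less accurate than B, so it is dominated
    ("D", 4.0, 1e5),
]
frontier = optimal_frontier(measured)
```

Only the entries on the frontier are worth keeping: for any accuracy requirement, the best choice is the fastest frontier entry that meets it.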

slide-126
SLIDE 126

Results Variable Accuracy

Poisson’s equation

A variable accuracy benchmark
Accuracy level expressed as a template parameter
Autotuner exploits variable accuracy in a general way
Choices:

Direct solve
Jacobi iteration
Successive over-relaxation (SOR)
Multigrid
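The link between accuracy and running time is easiest to see on the iterative choices. The following Python sketch solves a 1D model problem, -u'' = 1 with zero boundary values, using Jacobi and SOR until the residual drops below a tolerance. The 1D setting, the grid size, the relaxation factor `omega=1.5`, and the tolerances are assumptions chosen for brevity and are not taken from the benchmark itself.

```python
def residual_norm(u, f, h):
    """Max-norm of f - A u for the operator A u = (2u_i - u_{i-1} - u_{i+1}) / h^2."""
    return max(abs(f[i] - (2 * u[i] - u[i - 1] - u[i + 1]) / (h * h))
               for i in range(1, len(u) - 1))


def jacobi(f, h, tol, max_iters=100_000):
    """Jacobi iteration: every point is updated from the previous sweep's values."""
    u = [0.0] * len(f)
    for it in range(1, max_iters + 1):
        u = ([0.0]
             + [(u[i - 1] + u[i + 1] + h * h * f[i]) / 2 for i in range(1, len(f) - 1)]
             + [0.0])
        if residual_norm(u, f, h) < tol:
            return u, it
    return u, max_iters


def sor(f, h, tol, omega=1.5, max_iters=100_000):
    """Successive over-relaxation: in-place sweep, overshooting by the factor omega."""
    u = [0.0] * len(f)
    for it in range(1, max_iters + 1):
        for i in range(1, len(f) - 1):
            gauss_seidel = (u[i - 1] + u[i + 1] + h * h * f[i]) / 2
            u[i] = (1 - omega) * u[i] + omega * gauss_seidel
        if residual_norm(u, f, h) < tol:
            return u, it
    return u, max_iters


n = 17                      # grid points, including both boundary points
h = 1.0 / (n - 1)
f = [1.0] * n               # right-hand side of -u'' = 1 with u(0) = u(1) = 0

_, jacobi_loose = jacobi(f, h, tol=1e-2)
_, jacobi_tight = jacobi(f, h, tol=1e-6)
u_sor, sor_tight = sor(f, h, tol=1e-6)
```

A looser tolerance stops Jacobi after fewer sweeps, and SOR reaches the tight tolerance in fewer sweeps than Jacobi, so the cheapest algorithm depends on the accuracy requested.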

slide-131
SLIDE 131

Results Variable Accuracy

Choices in Multigrid

Figure: Grid Size (128, 64, 32, 16) versus Time, with steps labeled SOR Iteration and Direct Solve

SOR is an iterative algorithm
Multigrid changes grid coarseness to speed up convergence
Many standard shapes: V-Cycle, W-Cycle, etc.
Direct solver
Different shapes = different algorithms

slide-135
SLIDE 135

Results Variable Accuracy

Autotuned V-cycle shapes for different accuracy requirements

Figure: autotuned cycle shapes for accuracy requirements 10^1, 10^3, 10^5, and 10^7, drawn against Grid Size (2048, 1024, 512, 256, 128, 64, 32, 16)

slide-144
SLIDE 144

Results Variable Accuracy

Dynamic programming technique for autotuning Multigrid

Figure labels: Grid size i, Grid size 2i

Partition accuracy space into discrete levels
Base space of candidate algorithms on optimal algorithms from coarser level
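The two steps above can be sketched as a bottom-up table fill. The Python below is a toy illustration, not the PetaBricks autotuner: `toy_cost` is an invented cost model standing in for real timing runs, and the rule that one cycle gains as many digits as its coarse solve delivers is an assumption made only so the example runs. What it does show is the structure: for each grid size and each discrete accuracy level, the candidates at size 2i are built from the already-tuned winners at size i.

```python
ACCURACY_DIGITS = [1, 3, 5, 7]      # discrete accuracy levels 10^1 .. 10^7
GRID_SIZES = [16, 32, 64, 128]      # each level doubles the previous grid size


def toy_cost(kind, n, digits, coarse_cost=0.0, coarse_digits=1):
    """Made-up cost model; a real autotuner would time actual runs instead."""
    if kind == "direct":
        return float(n ** 3)                    # exact answer, cost independent of target
    if kind == "sor":
        return digits * n ** 3 / 4              # more digits need more sweeps
    # "cycle": smooth on this grid, then call the tuned solver for the coarser grid.
    # Assumption: one cycle gains as many digits as the coarse solve delivers.
    cycles = -(-digits // coarse_digits)        # ceiling division
    return cycles * (4 * n * n + coarse_cost)


def autotune(sizes, digit_levels, cost=toy_cost):
    """Bottom-up dynamic program: best[(size, digits)] = (cost, plan)."""
    best = {}
    for level, n in enumerate(sizes):
        for digits in digit_levels:
            candidates = [(cost("direct", n, digits), ("direct",)),
                          (cost("sor", n, digits), ("sor",))]
            if level > 0:
                coarser = sizes[level - 1]
                # Reuse only the tuned winners from the coarser grid.
                for q in digit_levels:
                    sub_cost, _ = best[(coarser, q)]
                    candidates.append((cost("cycle", n, digits, sub_cost, q),
                                       ("cycle", q)))
            best[(n, digits)] = min(candidates, key=lambda c: c[0])
    return best


best = autotune(GRID_SIZES, ACCURACY_DIGITS)
```

Because each level looks only at the winners one level down, the search is linear in the number of grid sizes rather than exponential in the depth of the cycle.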

slide-149
SLIDE 149

Results Variable Accuracy

Poisson’s Equation

Plot: Time (s) versus Input Size, both on log scales (time 1e-05 to 10000 s, size 1 to 1000). Series: Direct, Jacobi, SOR, Multigrid, Autotuned

slide-151
SLIDE 151

Conclusion Final thoughts

Related work

Languages
Sequoia

Libraries & domain-specific tuners
STAPL
ATLAS
FFTW
SPARSITY
SPIRAL
...

slide-152
SLIDE 152

Conclusion Final thoughts

For more information

PetaBricks makes programs future-proof by allowing them to adapt to new architectures
We plan to release PetaBricks at the end of summer
Sign up for our mailing list to be notified
For more information see: http://projects.csail.mit.edu/petabricks/
Questions?

slide-153
SLIDE 153

Conclusion Final thoughts

Thank you!
