SLIDE 1

PetaBricks and Julia

Kathleen C. Alexander Massachusetts Institute of Technology December 11th, 2013

SLIDE 2

Motivation

SLIDE 3

Motivation Background Approach Results Recommendations Index

The Programmer’s Dilemma

a personal example: energy landscapes

K.C. Alexander (MIT) PetaBricks and Julia 1 / 15

SLIDE 5

The Programmer’s Dilemma

Which algorithm is best?

Goal: determine the best algorithm for the application, which may be machine dependent.

SLIDE 6

Parallel Programming

  • many parts of these algorithms can be written in parallel
  • often they can be parallelized in many different ways
  • optimizing these options is a challenge

Determine the best way to parallelize the program, which will be machine dependent.

SLIDE 8

Background

SLIDE 10

PetaBricks – Algorithmic Choice

PetaBricks was developed to relieve the programmer of some of the optimization responsibility.

[Figure: the transform]

[Figure: compiling framework]

Ansel, et al. ACM SIGPLAN Conference (2009).

SLIDE 11

PetaBricks – Autotuning

The autotuner determines the best configuration for the machine under the tuning constraints.
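
A minimal Julia sketch of the idea, not PetaBricks itself: a sort exposes one algorithmic choice (the size below which it switches from merge sort to insertion sort), and a brute-force tuner times each candidate on the current machine. The names `hybrid_sort` and `tune_cutoff`, the candidate cutoffs, and the sample size are illustrative assumptions; a real autotuner would search a much larger configuration space.

```julia
# Insertion sort: low overhead, a reasonable choice for short vectors.
function insertion_sort!(v::AbstractVector)
    for i in 2:length(v)
        x = v[i]
        j = i - 1
        while j >= 1 && v[j] > x
            v[j + 1] = v[j]
            j -= 1
        end
        v[j + 1] = x
    end
    return v
end

# Merge sort that switches to insertion sort at or below `cutoff` elements.
# `cutoff` is the algorithmic choice left open for the tuner (hypothetical name).
function hybrid_sort(v::AbstractVector, cutoff::Integer)
    n = length(v)
    n <= max(cutoff, 1) && return insertion_sort!(collect(v))
    mid = n ÷ 2
    left = hybrid_sort(view(v, 1:mid), cutoff)
    right = hybrid_sort(view(v, mid+1:n), cutoff)
    out = similar(left, n)
    i = j = k = 1
    while i <= length(left) && j <= length(right)
        if left[i] <= right[j]
            out[k] = left[i]; i += 1
        else
            out[k] = right[j]; j += 1
        end
        k += 1
    end
    while i <= length(left); out[k] = left[i]; i += 1; k += 1; end
    while j <= length(right); out[k] = right[j]; j += 1; k += 1; end
    return out
end

# Exhaustive tuner: time every candidate on a sample input, keep the fastest.
function tune_cutoff(sample::AbstractVector, candidates)
    best_cutoff, best_time = first(candidates), Inf
    for c in candidates
        hybrid_sort(sample, c)   # warm-up run so JIT compilation is not timed
        t = minimum(@elapsed(hybrid_sort(sample, c)) for _ in 1:3)
        if t < best_time
            best_cutoff, best_time = c, t
        end
    end
    return best_cutoff
end

data = rand(20_000)                 # assumed sample size
candidates = (1, 8, 32, 128, 512)   # assumed search space
cutoff = tune_cutoff(data, candidates)
println("selected cutoff on this machine: ", cutoff)
```

The selected cutoff can differ from machine to machine and from run to run, which is the point of tuning on the target hardware.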

SLIDE 12

PetaBricks – Autotuning

[Figure: Sort]

Ansel, et al. ACM SIGPLAN Conference (2009).

SLIDE 13

PetaBricks – Autotuning

[Figure: Eigen Problem]

Ansel, et al. ACM SIGPLAN Conference (2009).

SLIDE 14

PetaBricks – Autotuning

[Figure: Matrix Multiply]

Ansel, et al. ACM SIGPLAN Conference (2009).

SLIDE 15

Julia

  • Julia was developed to bridge the gap between interpreted and compiled scientific computing
  • streamlining parallelization techniques has been a priority

SLIDE 17

Julia

http://forio.com/julia/julia

Question: is there room for overlap between the PetaBricks and Julia approaches?

SLIDE 18

Approach

SLIDE 21

Options for Implementation

Julia in PetaBricks

  • can utilize PetaBricks autotuner and compiler
  • PetaBricks compiler needs to interpret Julia

PetaBricks in Julia

  • can run PetaBricks binaries inside Julia
  • no PetaBricks shared object files, functions require disk i/o
  • doesn’t take advantage of JuliaLang

Julia + OpenTuner

  • apply PetaBricks framework to Julia
  • utilize OpenTuner to optimize Julia

SLIDE 24

Approach Used Here

PetaBricks in Julia

  • can run PetaBricks binaries inside Julia
  • no PetaBricks shared object files, functions require disk i/o
  • doesn’t take advantage of JuliaLang

⇒ most naive approach possible:
  → compile PetaBricks executable, exe
  → julia> run(`$exe $in $out`)
⇒ compare with PetaBricks and Julia alone
  → lower bound of performance improvement
  → is there proof of benefit?
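
The naive approach can be sketched in Julia 1.x roughly as follows. This is an illustration under assumptions, not the code used in the talk: `run_external` is a hypothetical helper, the matrix is written as plain whitespace-delimited text (the actual PetaBricks file format may differ), and `cp` stands in for a compiled PetaBricks binary so the example runs on a Unix-like system without PetaBricks installed.

```julia
using DelimitedFiles   # writedlm / readdlm for plain-text matrices

# Run an external executable that reads one file and writes another,
# mirroring run(`$exe $in $out`). `infile`/`outfile` are used as names
# because `in` is already a function in Julia.
function run_external(exe::AbstractString, A::AbstractMatrix)
    infile, outfile = tempname(), tempname()
    try
        writedlm(infile, A)              # disk i/o: input for the executable
        run(`$exe $infile $outfile`)     # blocks until the process exits
        return readdlm(outfile)          # disk i/o: result back into Julia
    finally
        rm(infile; force=true)           # clean up temporary files
        rm(outfile; force=true)
    end
end

# Stand-in executable: `cp` copies the input file to the output file.
A = [1.0 2.0; 3.0 4.0]
B = run_external("cp", A)
println(B)
```

Every call pays for two file transfers and a process launch, which is why this approach gives only a lower bound on the achievable improvement.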

SLIDE 25

Results

SLIDE 26

PetaBricks – Tuning Improvements

Performance improvement: tuned vs. untuned PetaBricks

[Figure: Matrix Multiply; wall-clock time [s] vs. size; series: tuned, untuned]

SLIDE 27

Comparing PetaBricks with Julia - Apples to Apples

PetaBricks
  → functions read ASCII input files and write ASCII output files
  → determines parallelization during autotuning
  → autotuning can take days

Julia
  → JIT compilation on each independent execution
  → can addprocs(n), but may not parallelize
  → can be used interactively

SLIDE 28

Comparing PetaBricks with Julia - Apples to Apples

  → make both programs do i/o
  → run both programs from shell
  → try addprocs(n) in Julia, with no other instructions
  → subtract 'hello world' start-up time from Julia wall-clock
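
One way to take such a measurement is sketched below for Julia 1.x: time a full shell-level launch of a workload, time a launch that only prints one line, and subtract. The helper `wallclock` and the 500 × 500 workload are assumptions made for illustration, and the difference may or may not match how the "Julia-Scaled" series in the plots was computed.

```julia
# Wall-clock seconds for one shell-level run of `cmd`, output discarded.
wallclock(cmd::Cmd) = @elapsed run(pipeline(cmd, stdout=devnull))

julia = Base.julia_cmd()   # command line of the currently running Julia binary

# Baseline: start Julia, print one line, exit.
startup = wallclock(`$julia -e 'println("hello world")'`)

# Workload: start Julia, then run the computation (assumed example workload).
total = wallclock(`$julia -e 'A = rand(500, 500); println(sum(A * A))'`)

scaled = total - startup   # wall-clock time with the start-up cost removed
println("total = $total s, start-up = $startup s, scaled = $scaled s")
```

Single measurements are noisy, so repeating each run several times and taking the minimum or median would give a steadier estimate.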

SLIDE 29

Comparing PetaBricks to Julia - EigenSolve

[Figure: EigenSolve; wall-clock time [s] vs. size; series: Julia, Julia-Scaled, PetaBricks]

  → Julia seems to do the best for large matrices
  → however, the results were not comparable
  → this test was not a good apples-to-apples performance test

SLIDE 30

Comparing PetaBricks with Julia - Sort

[Figure: Sort; wall-clock time [s] vs. size [10^4]; series: Julia, Julia-Scaled, PetaBricks]

  → Julia and PetaBricks converge for large vectors
  → PetaBricks is better with shorter vectors
  → effect of i/o on performance not considered

SLIDE 31

Comparing PetaBricks with Julia - Matrix Multiply

[Figure: i5-3339 (4 CPU); wall-clock time [s] vs. size; series: Julia, Julia-Scaled, PetaBricks]

[Figure: i7-3770 (8 CPU); wall-clock time [s] vs. size; series: Julia, Julia-Scaled, Julia-Scaled-8p, PetaBricks]

  → Julia and PetaBricks converge at moderate matrix sizes on fewer cores
  → PetaBricks is better with smaller lists and larger matrices
  → using addprocs(n) with no other instruction does not utilize parallel functionality in Julia
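
The last point can be illustrated with a short Julia 1.x sketch (the 2013-era syntax differed, so this is an assumption about how the same idea looks today). Adding workers only makes them available; ordinary code keeps running on the master process until work is distributed explicitly, for example with `@distributed` or `pmap`.

```julia
using Distributed

addprocs(2)      # start two local worker processes
w = workers()    # ids of the worker processes

# Ordinary code: still runs on the master process only.
serial = sum(i^2 for i in 1:1_000)

# Explicit distribution: iterations are split across the workers and the
# partial results are combined with (+).
parallel = @distributed (+) for i in 1:1_000
    i^2
end

# pmap is another explicit option; record which process handled each item.
pids = pmap(_ -> myid(), 1:8)

println("serial = $serial, parallel = $parallel, handled by processes $(unique(pids))")
```

Both sums agree; only the second one actually uses the added processes.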

SLIDE 32

Running PetaBricks from Julia

[Figure: Matrix Multiply; wall-clock time [s] vs. size; series: Julia-Scaled, PB-Julia-Scaled, PetaBricks]

  → can get the PetaBricks improvement by incorporating the PetaBricks executable in Julia
  → effect of i/o on performance not considered

SLIDE 33

Recommendations

SLIDE 34

Recommendations

[Figure: Matrix Multiply; wall-clock time [s] vs. size; series: Julia, Julia-Scaled, PetaBricks]

  → under many circumstances, Julia performs as well as PetaBricks without days of compilation

SLIDE 35

Recommendations

[Figure: Matrix Multiply; wall-clock time [s] vs. size; series: Julia, Julia-Scaled, PetaBricks]

  → there is room for improvement on the start-up time for Julia

SLIDE 36

Recommendations

[Figure: Matrix Multiply; wall-clock time [s] vs. size; series: Julia-Scaled, PB-Julia-Scaled, PetaBricks]

  → PetaBricks performance can be achieved by using a shell command in Julia

SLIDE 37

Recommendations

Ansel, et al. MIT CSAIL Technical Report MIT-CSAIL-TR-2013-026 (2013).

  → implementing OpenTuner (when better documentation is available) with Julia may be a reasonable long-term goal for performance gains of this kind
