Accuracy-Aware Program Transformations Sasa Misailovic MIT CSAIL - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Accuracy-Aware Program Transformations

Sasa Misailovic MIT CSAIL

slide-2
SLIDE 2

Collaborators

Martin Rinard, Michael Carbin, Stelios Sidiroglou, Henry Hoffmann, Deokhwan Kim, Daniel Roy, Zeyuan Allen Zhu, Michael Kling, Jonathan Kelner, Anant Agarwal

slide-3
SLIDE 3

Emerging Software and Hardware

slide-4
SLIDE 4

Emerging Software and Hardware

Big Data; Approximate

slide-5
SLIDE 5

Emerging Software and Hardware

Energy Conscious Big Data; Approximate

slide-6
SLIDE 6

Emerging Software and Hardware

Energy Conscious

Automatically Transform Computations to Trade Accuracy for Performance and Energy

Big Data; Approximate

slide-7
SLIDE 7

Solving Problems with Transformations

  • Data center needs to draw less power
  • Voltage drops, clock ticks slower, start missing deadlines
  • Program is taking too long to run
  • System gets loaded, start missing deadlines
  • Hand held needs to go longer between charges
  • Lose cores, start missing deadlines

Automatically Transform Computations to Trade Accuracy for Performance and Energy

slide-8
SLIDE 8

Consider This Transformation

Before: for (i = 0; i < n; i++) { … }
After:  for (i = 0; i < n; i += 2) { … }

slide-9
SLIDE 9

Loop Perforation

Effects:

  • Should improve performance
  • Broadly applicable

Before: for (i = 0; i < n; i++) { … }
After:  for (i = 0; i < n; i += 2) { … }
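A minimal sketch of what perforating a loop can look like in practice, assuming the loop body accumulates an average. The function names, the stride parameter, and the synthetic data are illustrative choices, not taken from the slides.

```python
def mean_full(values):
    """Original loop: visit every element."""
    total = 0.0
    for i in range(0, len(values)):
        total += values[i]
    return total / len(values)


def mean_perforated(values, stride=2):
    """Perforated loop: visit only every `stride`-th element."""
    total = 0.0
    count = 0
    for i in range(0, len(values), stride):
        total += values[i]
        count += 1
    return total / count


# Synthetic input (an assumption for illustration): a repeating ramp 0.0 .. 9.9
data = [(x % 100) / 10.0 for x in range(1000)]
exact = mean_full(data)
approx = mean_perforated(data)
relative_error = abs(exact - approx) / abs(exact)
```

The perforated version does roughly half the iterations. How much accuracy it loses depends entirely on the data and on what the loop computes, which is why the effect has to be measured or analyzed rather than assumed.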

slide-10
SLIDE 10

Loop Perforation

Common Reaction: But it changes the program semantics! The result will be wrong?!

Before: for (i = 0; i < n; i++) { … }
After:  for (i = 0; i < n; i += 2) { … }

slide-11
SLIDE 11

Loop Perforation

Common Reaction: But it changes the program semantics! The result will be wrong?!

The result can be less accurate!

Before: for (i = 0; i < n; i++) { … }
After:  for (i = 0; i < n; i += 2) { … }

slide-12
SLIDE 12

Acceptability = Accuracy + Integrity

slide-13
SLIDE 13

Acceptability = Accuracy + Integrity

Optimization problem: minimize execution time given constraints on accuracy and integrity of the computation

slide-14
SLIDE 14

Optimization Inputs

  • Original Program
  • Input & Accuracy Specification
  • Program Transformation:

Before: for (i = 0; i < n; i++) { … }
After:  for (i = 0; i < n; i += 2) { … }

slide-15
SLIDE 15

Optimization Framework

  • Find Candidates for Transformation
  • Analyze Effects of the Transformations
  • Navigate Tradeoff Space

Before: for (i = 0; i < n; i++) { … }
After:  for (i = 0; i < n; i += 2) { … }

slide-16
SLIDE 16

[Plot: axes Time and Error]

Before: for (i = 0; i < n; i++) { … }
After:  for (i = 0; i < n; i += 2) { … }

slide-18
SLIDE 18

[Plot: axes Time and Error]

slide-20
SLIDE 20

[Plot: axes Time and Error]

Property: the result of the optimized program is within the specified error bound

Query: Return the program that executes in minimal time
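The property and query can be read as a constrained selection over measured (time, error) points. The sketch below assumes such points are already available; the candidate names and numbers are made up for illustration.

```python
def fastest_within_bound(candidates, error_bound):
    """candidates: list of (name, time, error) tuples.

    Returns the lowest-time candidate whose error respects the bound,
    or None when no candidate is feasible.
    """
    feasible = [c for c in candidates if c[2] <= error_bound]
    if not feasible:
        return None
    return min(feasible, key=lambda c: c[1])


# Hypothetical measurements: time is normalized to the original program
candidates = [
    ("original", 1.00, 0.00),
    ("perforate loop A", 0.70, 0.03),
    ("perforate loops A+B", 0.55, 0.08),
    ("perforate loops A+B+C", 0.40, 0.25),
]
best = fastest_within_bound(candidates, error_bound=0.10)
```

With a bound of 0.10 the fastest variant overall is rejected because its error is too large, and the next fastest one is returned.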

slide-21
SLIDE 21

Explicit Search Algorithm for Perforation

Find Transformation Candidates:

  • Profile program to find time-consuming for loops

Analyze the Effects of Perforation:

  • Integrity: memory safety, well-formed output
  • Performance: Compare execution times
  • Accuracy: Compare the quality of the results

Navigate Tradeoff Space:

  • Combine multiple perforatable loops
  • Prioritize loops by their individual performance and accuracy
  • Greedy or Exhaustive Search with Pruning
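One way the greedy variant of this search could be sketched, under simplifying assumptions: loops are prioritized by their individual (time, error) results, and a loop is added only if the combined configuration stays within the error bound and is faster. The `toy_evaluate` model (times multiply, errors add) is a stand-in for actually running the perforated program and is not from the slides.

```python
def greedy_perforation(loops, evaluate, error_bound):
    """loops: dict mapping loop name -> (individual_time, individual_error).

    evaluate(config) must return (time, error) for a list of perforated loops.
    """
    # Prioritize loops by individual result: fastest first, ties by lower error
    order = sorted(loops, key=lambda name: loops[name])
    chosen = []
    best_time, _ = evaluate(chosen)
    for name in order:
        trial = chosen + [name]
        time, error = evaluate(trial)
        # Keep the loop only if the combination is still acceptable and faster
        if error <= error_bound and time < best_time:
            chosen, best_time = trial, time
    return chosen, best_time


# Hypothetical individual results: (normalized time, accuracy loss)
INDIVIDUAL = {"A": (0.70, 0.03), "B": (0.80, 0.05), "C": (0.75, 0.20)}


def toy_evaluate(config):
    """Toy model of a program run (assumption): times multiply, errors add."""
    time, error = 1.0, 0.0
    for name in config:
        t, e = INDIVIDUAL[name]
        time *= t
        error += e
    return time, error


chosen, time = greedy_perforation(INDIVIDUAL, toy_evaluate, error_bound=0.10)
```

In a real setting the effects of perforating several loops together need not compose this simply, which is why each combined configuration is re-evaluated instead of predicted from the individual results.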

slide-22
SLIDE 22

Accuracy Analysis of Computation

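[Diagram: the same Input is run through the Original Program and the Transformed Program; each Output passes through an application-specific Output Abstraction, and the Difference between the two abstracted outputs is checked against the Bound: difference < δ]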
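A minimal sketch of such an accuracy check, assuming the output abstraction maps a program output to a list of non-zero numbers and the difference is the mean relative difference. Both the metric and the example abstraction are assumptions for illustration; the slides only say the abstraction is application-specific.

```python
def accuracy_loss(original_output, transformed_output, abstraction):
    """Mean relative difference between the two abstracted outputs.

    Assumes the abstraction yields equally long lists of non-zero numbers.
    """
    a = abstraction(original_output)
    b = abstraction(transformed_output)
    return sum(abs(x - y) / abs(x) for x, y in zip(a, b)) / len(a)


def is_acceptable(original_output, transformed_output, abstraction, delta):
    """The transformed run is acceptable when the difference stays below delta."""
    return accuracy_loss(original_output, transformed_output, abstraction) < delta


# Hypothetical application: the output is a record, the abstraction keeps the prices
def price_abstraction(output):
    return list(output["prices"])


original = {"prices": [100.0, 50.0], "log": "full run"}
transformed = {"prices": [97.0, 51.0], "log": "perforated run"}
loss = accuracy_loss(original, transformed, price_abstraction)
```

The abstraction deliberately ignores parts of the output that do not matter for quality (here the `log` field), so unrelated differences do not count as accuracy loss.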

slide-23
SLIDE 23

Analysis for Individual Loop Perforation

  1. Perforate one time-consuming loop at a time
  2. Execute perforated program
  3. Filter out critical loops:
     a) Program crashes
     b) Accuracy loss > δmax
     c) Execution slows down
     d) Latent memory errors (Valgrind)
  4. Repeat 1-3 for all loops, inputs, perforation rates
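The filtering step can be sketched as a classification over recorded runs. The record layout and helper names below are hypothetical; in practice each record would come from executing the perforated program (and, for latent errors, from a memory checker such as Valgrind).

```python
def classify(run, delta_max):
    """Apply the filters in the order listed above."""
    if run["crashed"]:
        return "Crash"
    if run["accuracy_loss"] > delta_max:
        return "Bad Accuracy"
    if run["speedup"] <= 1.0:
        return "Bad Speedup"
    if run["latent_memory_errors"]:
        return "Latent Errors"
    return "Perforatable"


def perforatable_loops(runs, delta_max):
    """A loop survives only if every one of its runs is classified Perforatable."""
    verdict = {}
    for run in runs:
        ok = classify(run, delta_max) == "Perforatable"
        verdict[run["loop"]] = verdict.get(run["loop"], True) and ok
    return sorted(loop for loop, ok in verdict.items() if ok)


def make_run(loop, crashed=False, loss=0.0, speedup=1.5, latent=False):
    """Helper that builds one hypothetical run record."""
    return {"loop": loop, "crashed": crashed, "accuracy_loss": loss,
            "speedup": speedup, "latent_memory_errors": latent}


runs = [
    make_run("L1", loss=0.02),
    make_run("L1", loss=0.04, speedup=1.3),   # second input, still fine
    make_run("L2", crashed=True),
    make_run("L3", loss=0.30),
    make_run("L4", speedup=0.9),
    make_run("L5", latent=True),
]
kept = perforatable_loops(runs, delta_max=0.10)
```

Requiring all runs of a loop to pass mirrors step 4: a loop that fails for any input or perforation rate in the recorded set is treated as critical.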
slide-24
SLIDE 24

Individual Loop Perforation Results

[Bar chart: # loops (axis ticks 5 to 40), with categories Perforatable, Latent Errors, Bad Speedup, Bad Accuracy, Crash]

From [ICSE 2010]


slide-30
SLIDE 30

Percentage of Work Done in Perforatable Loops

[Bar chart: % instructions (axis ticks 20 to 120)]

slide-31
SLIDE 31

Performance Increase of the Top Perforatable Loop (Relative Error < 0.1)

[Bar chart: Speedup (axis ticks 1 to 2.2)]

slide-32
SLIDE 32

Result Interpretation

Manual inspection of perforatable computations:

  • x264: motion estimation
  • bodytrack: MCMC
  • swaptions: Monte Carlo simulation
  • ferret: similarity hashing
  • blackscholes: redundant computation
  • canneal: simulated annealing
  • streamcluster: cluster center search

Common: Approximate/heuristic computations

slide-33
SLIDE 33

x264 Cumulative Loop Scores

[Plot: Mean Normalized Time and Accuracy loss]

From [FSE 2011]

slide-35
SLIDE 35

Status

Good:

  • Profitable accuracy/performance tradeoffs
  • Matches the approximate computations

But:

  • No guarantees on accuracy
  • No guarantees on safety

How to improve it?

  • How often do large errors happen?
  • What safety guarantees can we provide?

slide-36
SLIDE 36

Reasoning About Transformed Programs

  • Accuracy: Probabilistic Reasoning [SAS ’11, POPL ’12] (with Z. Zhu, J. Kelner, D. Roy, M. Rinard)
  • Integrity: Relational Logic Reasoning [PLDI ’12, PEPM ’13] (with M. Carbin, D. Kim, M. Rinard)

slide-37
SLIDE 37

[Diagram: computation graph]

  • Nodes represent computation
  • Edges represent flow of data

From [POPL ’12]

slide-39
SLIDE 39

[Diagram node labels: min, avg, avg, avg, avg]

  • Functions – process individual data
  • Reduction nodes – aggregate data

slide-40
SLIDE 40

Function substitution

  • Multiple implementations
  • Each has expected error/time $(E, T)$

[Diagram node labels: min, avg, avg, avg, avg; function implementations f1, f2, f3]
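A small sketch of how expected error and time could be combined when a function call picks one of several implementations at random. Representing each implementation by an (expected error, expected time) pair follows the slide; the probability vector, function names, and numbers are assumptions for illustration.

```python
import random


def expected_cost(implementations, probabilities):
    """implementations: list of (expected_error, expected_time) pairs.

    Returns the expected (error, time) when implementation i runs
    with probability probabilities[i].
    """
    error = sum(p * e for p, (e, _) in zip(probabilities, implementations))
    time = sum(p * t for p, (_, t) in zip(probabilities, implementations))
    return error, time


def pick_implementation(functions, probabilities, rng):
    """Randomly select which implementation to execute for one call."""
    return rng.choices(functions, weights=probabilities, k=1)[0]


# Hypothetical specs: an exact version and a faster, less accurate one
specs = [(0.0, 10.0), (0.2, 4.0)]
error, time = expected_cost(specs, [0.5, 0.5])
```

Treating the selection probabilities as the unknowns turns the choice among implementations into a continuous optimization problem rather than a finite enumeration.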

slide-42
SLIDE 42

Function substitution

  • Inputs of functions have specified ranges
  • Each function has Lipschitz property

[Diagram node labels: min, avg, avg, avg, avg; function inputs annotated with ranges [a,b], [c,d]]
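A worked bound suggesting why the Lipschitz property matters for error propagation. This is a worst-case sketch under stated assumptions, not the analysis from the cited paper: $f$ is $K$-Lipschitz on its input range, the substituted implementation $\hat{f}$ differs from $f$ by at most $E$ on that range, and the input $x'$ lies in the range and already carries an error of at most $e$ relative to the exact input $x$. The symbols $K$, $E$, $e$, and $\hat{f}$ are introduced here for illustration.

```latex
\begin{align*}
|\hat{f}(x') - f(x)|
  &\le |\hat{f}(x') - f(x')| + |f(x') - f(x)| \\
  &\le E + K\,|x' - x| \\
  &\le E + K e .
\end{align*}
```

The first term is the error that emerges at this node from substitution, and the second is the upstream error amplified by at most the Lipschitz constant.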

slide-43
SLIDE 43

Sampling inputs of reduction nodes

  • Reductions consume fewer inputs

[Diagram node labels: min, avg, avg, avg, avg]
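A minimal sketch of a sampled reduction, assuming an averaging node that draws a uniform random subset of its inputs without replacement. The sampling scheme and names are assumptions; the slide only states that reductions consume fewer inputs.

```python
import random


def sampled_average(values, sample_size, rng):
    """Average over a random subset of the inputs (drawn without replacement)."""
    sample = rng.sample(values, sample_size)
    return sum(sample) / sample_size


rng = random.Random(0)                       # fixed seed for repeatability
values = [float(v) for v in range(1, 101)]   # hypothetical inputs 1.0 .. 100.0
exact = sum(values) / len(values)
approx = sampled_average(values, 25, rng)    # uses a quarter of the inputs
```

Because only the sampled inputs are needed, the computations that would have produced the skipped inputs can be omitted as well, which is where the time savings come from.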

slide-44
SLIDE 44

Sampling inputs of reduction nodes

  • Reductions consume fewer inputs

[Diagram node labels: min, avg, avg]

slide-46
SLIDE 46

Search for Optimized Programs

[Plot: axes Time and Error]

Property: With high probability the result of the optimized program is within the specified error bound

$\Pr\big[\,|\mathit{Res} - \mathit{Res}'| < B\,\big] > 1 - \delta$

slide-47
SLIDE 47

Search for Optimized Programs

[Plot: axes Time and Error]

Property: $\Pr\big[\,|\mathit{Res} - \mathit{Res}'| < B\,\big] > 1 - \delta$

Query: Generate randomized program that executes in minimal time

slide-48
SLIDE 48

Constraint Based Search Algorithm

Find Transformation Candidates:

  • User provides function implementations and specs

Analyze Transformed Computations:

  • Construct analytic expressions for (1) performance and (2) error emergence and propagation
  • Variables: probabilities of executing alternate versions

Navigate Tradeoff Space:

  • Construct mathematical optimization problem using expressions for performance and error
  • Non-linear, non-convex tradeoff space: $(1 + \varepsilon)$-approximation of globally optimal tradeoff curve

From [POPL ’12]
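To make the idea of analytic expressions over probability variables concrete, here is a deliberately tiny instance: one function with an exact and an approximate version, and a single variable p, the probability of running the approximate one. The linear cost model and the closed-form solution are assumptions that hold only for this toy case; the general problem described above is non-linear and non-convex.

```python
def optimize_single_choice(t_exact, t_approx, e_approx, error_bound):
    """Choose p, the probability of running the approximate version.

    Expected time  = (1 - p) * t_exact + p * t_approx
    Expected error = p * e_approx   (the exact version contributes no error)
    Assumes error_bound >= 0.
    """
    if t_approx >= t_exact:
        p = 0.0                       # approximation is not faster: never use it
    elif e_approx <= error_bound:
        p = 1.0                       # always within the bound: always use it
    else:
        p = error_bound / e_approx    # largest p that keeps expected error in bound
    expected_time = (1.0 - p) * t_exact + p * t_approx
    expected_error = p * e_approx
    return p, expected_time, expected_error


# Hypothetical numbers: exact takes 10 units, approximate takes 4 with error 0.2
p, time, error = optimize_single_choice(10.0, 4.0, 0.2, 0.05)
```

Since expected time decreases monotonically in p when the approximate version is faster, the optimum sits at the largest p the error constraint allows.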

slide-49
SLIDE 49

Tradeoff Curve Construction Algorithm

Divide and conquer

  • For each subcomputation construct tradeoff curve
  • Dynamic programming

Properties

  • Polynomial time
  • $(1 + \varepsilon)$-approximation of true tradeoff curve

[Diagram node labels: min, avg, avg; edge labels 1, n, m]
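A simplified sketch of the divide-and-conquer idea: build a tradeoff curve per subcomputation, combine curves pairwise, and prune dominated points. It assumes curves are finite lists of (error, time) points and that errors and times add under composition; it omits the discretization that would give the approximation guarantee mentioned above, so it illustrates the structure only.

```python
def pareto(points):
    """Keep the (error, time) points that no other point dominates."""
    frontier = []
    for error, time in sorted(points):
        # After sorting by error, a point survives only if it is strictly faster
        if not frontier or time < frontier[-1][1]:
            frontier.append((error, time))
    return frontier


def combine(curve_a, curve_b):
    """Compose two subcomputations (assumption: errors add and times add)."""
    return pareto([(ea + eb, ta + tb)
                   for ea, ta in curve_a
                   for eb, tb in curve_b])


# Hypothetical curves for two subcomputations, as (error, time) points
curve_a = [(0, 4), (1, 2)]
curve_b = [(0, 6), (2, 3)]
combined = combine(curve_a, curve_b)
pruned = pareto([(0, 10), (1, 8), (1, 9), (2, 8), (3, 5)])
```

Pruning after every combination keeps the intermediate curves small, which is what makes a bottom-up, dynamic-programming style construction feasible.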

slide-55
SLIDE 55

Comparison With Explicit Search

  • Finite vs Infinite Size Search Space
  • Input vs Declarative Specification Based
  • General vs Restricted Model of Computation

slide-56
SLIDE 56

Related Work

Other Accuracy-aware Transformations We Explored:

  • Task Skipping [Rinard ICS ’06, Rinard OOPSLA ’07]
  • Loop Parallelization with Data Races [TECS PEC ’12, RACES ’12]
  • Dynamic Knobs [ASPLOS ’11]

Our group has also been working on transformations to prevent otherwise fatal errors (segmentation faults, infinite loops, buffer overflows, SQL injection attacks)

More Accuracy-aware Transformations Researchers Explored:

  • Unreliable Data Stores [Liu et al ASPLOS ’11, Sampson et al PLDI ’11]
  • Multiple Implementations [Ansel et al PLDI ’09, Chilimbi et al PLDI ’10]
  • Approximate Memoization [Chaudhuri et al FSE ’11]
slide-57
SLIDE 57

Takeaway

Emerging trend of computations on large data sets

Accuracy-aware transformations are a powerful tool:

  • Improve performance
  • Reduce power
  • Facilitate dynamic adaptation

Interaction of program analysis and search techniques to find profitable, safe, and predictable tradeoffs