Evaluating Heuristic Optimization Phase Order Search Algorithms


SLIDE 1

2007 International Symposium on Code Generation and Optimization (CGO) / 32

Evaluating Heuristic Optimization Phase Order Search Algorithms

by Prasad A. Kulkarni, David B. Whalley, Gary S. Tyson, Jack W. Davidson

Computer Science Department, Florida State University, Tallahassee, Florida
Computer Science Department, University of Virginia, Charlottesville, Virginia

SLIDE 2

Compiler Optimizations

  • Optimization phases require enabling conditions
    – need specific patterns in the code
    – many also need available registers
  • Phases interact with each other
  • Applying optimizations in different orders generates different code

SLIDE 3

Phase Ordering Problem

  • To find an ordering of optimization phases that produces optimal code with respect to possible phase orderings
  • Evaluating each sequence involves compiling, assembling, linking, executing, and verifying results
  • Best optimization phase ordering depends on
    – source application
    – target platform
    – implementation of optimization phases
  • Long-standing problem in compiler optimization!
SLIDE 4

Addressing Phase Ordering

  • Exhaustive phase order space evaluation [CGO ’06, LCTES ’06]
    – possible for most functions
    – long search times for larger functions
  • Heuristic approaches
    – commonly employed, extensively studied
    – allow faster searches
    – no guarantees on solution quality

SLIDE 5

Survey of Heuristic Algorithms

  • Cost and performance comparison
    – with optimal
    – with other heuristic searches
  • Analyze phase order space properties
    – sequence length
    – leaf sequences
  • Improving heuristic search algorithms
    – propose new algorithms

SLIDE 6

Outline

  • Experimental setup
  • Local search techniques
    – distribution of local minima
    – local search algorithms
  • Exploiting properties of leaf sequences
    – genetic search algorithm
  • Conclusions
SLIDE 7

Outline

  • Experimental setup
  • Local search techniques
    – distribution of local minima
    – local search algorithms
  • Exploiting properties of leaf sequences
    – genetic search algorithm
  • Conclusions
SLIDE 8

Experimental Framework

  • We used the VPO compilation system
    – established compiler framework, started development in 1988
    – comparable performance to gcc -O2
  • VPO performs all transformations on a single representation (RTLs), so it is possible to perform most phases in an arbitrary order
  • Experiments use all the 15 re-orderable optimization phases in VPO
  • Target architecture was the StrongARM SA-100 processor

SLIDE 9

VPO Optimization Phases

Optimization Phase         ID    Optimization Phase         ID
branch chaining            b     loop transformations       l
common subexpr. elim.      c     code abstraction           n
remv. unreachable code     d     eval. order determin.      o
loop unrolling             g     strength reduction         q
dead assignment elim.      h     reverse branches           r
block reordering           i     instruction selection      s
minimize loop jumps        j     remv. useless jumps        u
register allocation        k

SLIDE 10

Benchmarks

  • 12 MiBench benchmarks; 88 functions

Category    Program        Description
auto        bitcount       test processor bit manipulation abilities
auto        qsort          sort strings using quicksort sorting algorithm
network     dijkstra       Dijkstra’s shortest path algorithm
network     patricia       construct patricia trie for IP traffic
telecomm    fft            fast fourier transform
telecomm    adpcm          compress 16-bit linear PCM samples to 4-bit
consumer    jpeg           image compression and decompression
consumer    tiff2bw        convert color .tiff image to b&w image
security    sha            secure hash algorithm
security    blowfish       symmetric block cipher with variable length key
office      stringsearch   searches for given words in phrases
office      ispell         fast spelling checker

SLIDE 11

Terminology

  • Active phase – an optimization phase that modifies the function representation
  • Dormant phase – a phase that is unable to find any opportunity to change the function
  • Function instance – any semantically, syntactically, and functionally correct representation of the source function (that can be produced by our compiler)

SLIDE 12

Terminology (cont...)

  • Attempted sequence – phase sequence comprising both active and dormant phases
  • Active sequence – phase sequence comprising only active phases
  • Batch sequence – active sequence applied by the default (batch) compiler
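
The relation between attempted and active sequences can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `toy_apply` is a hypothetical stand-in for a real optimization phase (it is not VPO's behavior), and a function instance is modeled as a set of pending phase IDs. A phase counts as active when it changes the instance and as dormant otherwise.

```python
from typing import Callable, FrozenSet, List, Tuple

Instance = FrozenSet[str]


def toy_apply(instance: Instance, phase: str) -> Instance:
    """Hypothetical phase model: a phase is active only if its ID is pending."""
    if phase not in instance:
        return instance                 # dormant: no opportunity found
    remaining = instance - {phase}
    if phase == "s":
        remaining = remaining | {"c"}   # toy interaction: "s" enables "c"
    return remaining


def active_sequence(
    attempted: str,
    start: Instance,
    apply_phase: Callable[[Instance, str], Instance] = toy_apply,
) -> Tuple[List[str], Instance]:
    """Split an attempted sequence into its active phases and the final instance."""
    instance = start
    active: List[str] = []
    for phase in attempted:
        new_instance = apply_phase(instance, phase)
        if new_instance != instance:    # the phase modified the representation
            active.append(phase)
            instance = new_instance
    return active, instance


active, final = active_sequence("cscbk", frozenset({"s", "k"}))
print("".join(active))  # prints "sck": the first "c" and the "b" were dormant
```

In this toy run the attempted sequence has five phases, but only three of them are active.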

SLIDE 13

Setup for Analyzing Search Algorithms

  • Exhaustively evaluate optimization phase order space
    – represent phase order space as DAG
  • For each search algorithm
    – use algorithm to generate next optimization phase sequence
    – look up performance in DAG

SLIDE 14

Phase Order Search Space DAG

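  • Performance evaluation of each phase order is traversal in the DAG
    – a-b-d = 52

[Figure: example phase order DAG; nodes carry performance values (100, 60, 66, 46, 79, 99, 54, 96, 52) and edges are labeled with phases a, b, c, d]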
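
A lookup of this kind might be sketched as follows. The node numbering and edge structure below are invented for illustration and do not reproduce the figure; only the fact that the sequence a-b-d evaluates to 52 is taken from the slide. The sketch also assumes that a dormant phase simply leaves the traversal at the current node.

```python
from typing import Dict

# Hypothetical toy DAG: node -> performance, and node -> {active phase: child}.
PERF: Dict[int, int] = {0: 100, 1: 60, 2: 66, 3: 46, 4: 79, 5: 52}
EDGES: Dict[int, Dict[str, int]] = {
    0: {"a": 1, "b": 2, "c": 3},
    1: {"b": 4},
    2: {"a": 4},   # a-b and b-a reach the same function instance
    3: {},
    4: {"d": 5},
    5: {},
}


def lookup(sequence: str, root: int = 0) -> int:
    """Return the performance of the instance reached by a phase sequence."""
    node = root
    for phase in sequence:
        # A phase with no outgoing edge is dormant here: stay at the node.
        node = EDGES[node].get(phase, node)
    return PERF[node]


print(lookup("abd"))  # 52
print(lookup("bad"))  # 52, since both orders converge on the same node
```

Because the whole space has been enumerated beforehand, a search algorithm can be evaluated with dictionary lookups instead of compiling and running each candidate.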

SLIDE 15

Outline

  • Experimental setup
  • Local search techniques
    – distribution of local minima
    – local search algorithms
  • Exploiting properties of leaf sequences
    – genetic search algorithm
  • Conclusions
SLIDE 16

Local Search Techniques

  • Consecutive sequences differ in only one position

[Figure: a base sequence (bseq) over phases a, b, c and its neighbors, each differing from it in one position]

  • With m phases and a sequence length of n, each sequence has n(m-1) neighbors
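
The neighbor count can be checked with a short sketch. The function name `neighbors` is an assumption made for this example; sequences are represented as strings of single-letter phase IDs.

```python
from typing import List


def neighbors(seq: str, phases: str) -> List[str]:
    """All sequences that differ from seq in exactly one position."""
    result = []
    for i, current in enumerate(seq):
        for phase in phases:
            if phase != current:
                result.append(seq[:i] + phase + seq[i + 1:])
    return result


# m = 3 phases, n = 3 positions: n * (m - 1) = 6 neighbors.
print(len(neighbors("aaa", "abc")))

# m = 15 phases, n = 10 positions: 10 * 14 = 140 neighbors.
vpo_ids = "bcdghijklnoqrsu"
print(len(neighbors("bcdghijkln", vpo_ids)))
```

Each of the n positions can be replaced by any of the other m-1 phases, which gives the n(m-1) count.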

SLIDE 17

Local Search Space Properties

  • Analyze local search space to
    – study distribution of local and global minima
    – study importance of sequence length

[Flowchart boxes: up to 100 attempts to generate seq. of length n; is new sequence? (Y/N); get perf. of seq. & (m-1)n neighbors; mark node as seen; record if node is minima; exit]
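
One possible reading of this sampling loop is sketched below. It is a guess at the flowchart's control flow, not the authors' code: the loop exits when 100 attempts fail to produce an unseen sequence, a local minimum is taken to be a sequence with no strictly better neighbor, and `toy_perf` is a hypothetical cost (lower is better) standing in for a DAG lookup.

```python
import random
from typing import Callable, List, Set, Tuple


def neighbors(seq: str, phases: str) -> List[str]:
    """All sequences that differ from seq in exactly one position."""
    return [seq[:i] + p + seq[i + 1:]
            for i, cur in enumerate(seq) for p in phases if p != cur]


def sample_minima(phases: str, n: int, perf: Callable[[str], float],
                  rng: random.Random,
                  max_attempts: int = 100) -> Tuple[Set[str], List[str]]:
    """Sample unseen sequences of length n and record the local minima."""
    seen: Set[str] = set()
    minima: List[str] = []
    while True:
        for _ in range(max_attempts):
            seq = "".join(rng.choice(phases) for _ in range(n))
            if seq not in seen:
                break
        else:
            return seen, minima      # no new sequence found: exit
        seen.add(seq)                # mark node as seen
        if all(perf(seq) <= perf(nb) for nb in neighbors(seq, phases)):
            minima.append(seq)       # record if node is a minimum


def toy_perf(seq: str) -> int:
    """Hypothetical cost: number of positions that differ from 'abc'."""
    return sum(1 for x, y in zip(seq, "abc") if x != y)


seen, minima = sample_minima("abc", 3, toy_perf, random.Random(0))
print(len(seen), minima)
```

With this toy cost the only possible local minimum is "abc", which is also the global one.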

SLIDE 18

Distribution of Minima

[Plot: x-axis: Multiple of Batch Sequence Length (1 to 4.5); y-axis: 0% to 100%; series: % (num. minima / total samples) and % (num. global minima / total minima)]

SLIDE 19

Hill Climbing

[Flowchart: randomly generate sequence of length n → get perf. of seq. & (m-1)n neighbors → is seq. minima? → N: select best neighbor as new base seq.; Y: record local minima]

  • Steepest descent
    – compare all successors of base
    – exit on local minima
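
A minimal steepest-descent sketch along these lines is shown below. The cost function `toy_perf` is hypothetical (lower is better); in the setup described in this deck the performance of each sequence would come from the pre-computed DAG.

```python
from typing import Callable, List, Tuple


def neighbors(seq: str, phases: str) -> List[str]:
    """All sequences that differ from seq in exactly one position."""
    return [seq[:i] + p + seq[i + 1:]
            for i, cur in enumerate(seq) for p in phases if p != cur]


def hill_climb(start: str, phases: str,
               perf: Callable[[str], float]) -> Tuple[str, int]:
    """Steepest descent: move to the best neighbor until none improves."""
    base, steps = start, 0
    while True:
        best = min(neighbors(base, phases), key=perf)  # compare all successors
        if perf(best) >= perf(base):
            return base, steps                         # local minimum: exit
        base, steps = best, steps + 1


def toy_perf(seq: str) -> int:
    """Hypothetical cost: number of positions that differ from 'cab'."""
    return sum(1 for x, y in zip(seq, "cab") if x != y)


print(hill_climb("aaa", "abc", toy_perf))  # ('cab', 2)
```

The returned step count corresponds to the number of moves taken before a local minimum is reached.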

SLIDE 20

Hill Climbing Results (cont...)

[Plot: x-axis: Multiple of Batch Sequence Length; series: % best perf. distance from optimal and avg. steps to local minimum]
SLIDE 21

Local Search Conclusions

  • Phase order space consists of few minima, but a significant percentage of local minima can be optimal
  • Selecting appropriate sequence length is important
    – smaller length results in bad performance
    – larger length is expensive

SLIDE 22

Outline

  • Experimental setup
  • Local search techniques
    – distribution of local minima
    – local search algorithms
  • Exploiting properties of leaf sequences
    – genetic search algorithm
  • Conclusions
SLIDE 23

Leaf Sequence Properties

  • Leaf function instances are generated when no additional phases can be successfully applied
    – sequences leading to leaf function instances are leaf sequences
  • Leaf sequences result in good performance
    – at least one leaf instance represents an optimal phase ordering for over 86% of functions
    – significant percentage of leaf instances among optimal
SLIDE 24

Focusing on Leaf Sequences

  • Modify phase order search algorithms to only produce leaf sequences
    – no need to guess appropriate sequence length
    – likely to result in optimal or close to optimal performance
    – leaf function instances comprise only 4.2% of the total instances
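
One way to turn an arbitrary sequence into a leaf sequence is sketched below. The slides do not spell out the procedure, so this is an assumption: dormant phases are dropped, and randomly chosen active phases are appended until the current node has no outgoing edges. The toy DAG `EDGES` is hypothetical.

```python
import random
from typing import Dict, Tuple

# Hypothetical toy DAG: node -> {active phase: child}. Nodes 3 and 5 are leaves.
EDGES: Dict[int, Dict[str, int]] = {
    0: {"a": 1, "b": 2, "c": 3},
    1: {"b": 4},
    2: {"a": 4},
    3: {},
    4: {"d": 5},
    5: {},
}


def to_leaf_sequence(seq: str, edges: Dict[int, Dict[str, int]],
                     rng: random.Random, root: int = 0) -> Tuple[str, int]:
    """Drop dormant phases, then extend with active phases until a leaf."""
    node = root
    active = []
    for phase in seq:
        child = edges[node].get(phase)
        if child is not None:            # keep only active phases
            active.append(phase)
            node = child
    while edges[node]:                   # some phase is still active here
        phase = rng.choice(sorted(edges[node]))
        active.append(phase)
        node = edges[node][phase]
    return "".join(active), node


rng = random.Random(0)
print(to_leaf_sequence("a", EDGES, rng))   # ('abd', 5)
print(to_leaf_sequence("dc", EDGES, rng))  # ('c', 3)
```

Since every result ends at a leaf, the search no longer has to fix a sequence length in advance.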

SLIDE 25

Genetic Algorithm

  • A biased sampling search method
    – evolves solutions by merging parts of different solutions

[Flowchart: Create initial population of optimization sequences → Evaluate fitness of each sequence in the population → Terminate cond.? → Y: Output the best sequence found; N: crossover & mutation to create new generation]
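
The loop in the flowchart can be sketched as a small genetic search. All parameters here (population size, number of generations, mutation rate, keeping the better half as parents, one-point crossover) are assumptions chosen for the example; the slides do not state the settings used in the study. `toy_perf` is a hypothetical cost where lower is better.

```python
import random
from typing import Callable


def genetic_search(phases: str, length: int, perf: Callable[[str], float],
                   rng: random.Random, pop_size: int = 20,
                   generations: int = 50, mutation_rate: float = 0.05) -> str:
    """Evolve phase sequences; requires length >= 2 and pop_size >= 4."""
    population = ["".join(rng.choice(phases) for _ in range(length))
                  for _ in range(pop_size)]
    best = min(population, key=perf)
    for _ in range(generations):                 # termination condition
        ranked = sorted(population, key=perf)    # evaluate fitness
        if perf(ranked[0]) < perf(best):
            best = ranked[0]
        parents = ranked[: pop_size // 2]
        children = []
        while len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            cut = rng.randrange(1, length)       # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = "".join(                     # per-position mutation
                rng.choice(phases) if rng.random() < mutation_rate else ch
                for ch in child)
            children.append(child)
        population = children
    return min(population + [best], key=perf)


def toy_perf(seq: str) -> int:
    """Hypothetical cost: number of positions that differ from 'scklbh'."""
    return sum(1 for x, y in zip(seq, "scklbh") if x != y)


result = genetic_search("bcdghijklnoqrsu", 6, toy_perf, random.Random(0))
print(result, toy_perf(result))
```

Crossover is the step that merges parts of different solutions; mutation keeps some diversity in the population.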

SLIDE 26

Modified Genetic Algorithm

  • Only generate leaf sequences

[Flowchart: Create initial population of optimization sequences → adjust sequences to leaf sequences → Evaluate fitness of each sequence in the population → Terminate cond.? → Y: Output the best sequence found; N: crossover & mutation to create new generation]

SLIDE 27

Genetic Algorithm – Performance

[Plot: x-axis: Multiple of Batch Sequence Length (1 to 10.75); y-axis: % Performance from Optimal; series: All Sequences, Leaf Sequences]

SLIDE 28

Genetic Algorithm – Cost

[Plot: x-axis: Multiple of Batch Sequence Length (1 to 10.75); y-axis: Number of Generations; series: All Sequences, Leaf Sequences]

SLIDE 29

Leaf Search Conclusion

  • Benefits of restricting searches to leaf sequences
    – no need for a priori knowledge of appropriate sequence length
    – near-optimal performance
    – low cost

SLIDE 30

Outline

  • Experimental setup
  • Local search techniques
    – distribution of local minima
    – local search algorithms
  • Exploiting properties of leaf sequences
    – genetic search algorithm
  • Conclusions
SLIDE 31

Conclusions

  • First study to compare heuristic search solutions with optimal orderings
  • Analyzed properties of phase order search space
    – few local and global minima
  • Illustrated importance of choosing the appropriate sequence length
  • Demonstrated importance of leaf sequences
    – achieve near-optimal performance at low cost

SLIDE 32

Questions?

SLIDE 33

Simulated Annealing

  • A worse solution is accepted with prob = exp(-δf/T)
    – δf: difference in performance
    – T: current temperature
  • Annealing schedule
    – initial temperature
    – cooling schedule

[Flowchart: randomly generate sequence of length n → get perf. of seq. & (m-1)n neighbors → is seq. minima? → N: select best neighbor as new base seq.; Y: record local minima → select best with prob p? (Y/N)]
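
The acceptance rule can be written out directly. In this sketch δf is taken as the performance of the candidate minus that of the current sequence, with lower values better, so improving moves are always accepted. The geometric cooling in `temperatures` (multiplying T by a constant factor each step) is an assumption; the slides name a cooling schedule without defining its form.

```python
import math
import random
from typing import List


def acceptance_probability(delta_f: float, temperature: float) -> float:
    """Probability of accepting a move that changes the cost by delta_f."""
    if delta_f <= 0:
        return 1.0                       # not worse: always accept
    return math.exp(-delta_f / temperature)


def accept(delta_f: float, temperature: float, rng: random.Random) -> bool:
    """Draw a random number and apply the acceptance probability."""
    return rng.random() < acceptance_probability(delta_f, temperature)


def temperatures(initial: float, cooling: float, steps: int) -> List[float]:
    """Assumed geometric schedule: T is multiplied by `cooling` each step."""
    temps, t = [], initial
    for _ in range(steps):
        temps.append(t)
        t *= cooling
    return temps


for t in temperatures(0.95, 0.9, 3):
    # The same worsening of 0.1 becomes less acceptable as T drops.
    print(round(t, 4), round(acceptance_probability(0.1, t), 4))
```

As T falls, the probability of accepting a worse sequence shrinks and the search behaves more and more like plain hill climbing.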

SLIDE 34

Simulated Annealing (cont...)

  • Simulated annealing parameters
    – sequence length 1.5 times batch length
    – initial temperature 0.5 to 0.95
    – cooling schedule 0.5 to 0.95, steps 0.5
  • Experimental results
    – perf. 0.15% from optimal, std. dev. of 0.13%
    – avg. perf. 15.95% worse, std. dev. of 0.55%
    – 41.06% iter. reach optimal, std. dev. of 0.81%

SLIDE 35

Random Algorithm

  • Random sampling used for search spaces that are discrete and sparse

[Flowchart: Randomly initialize (leaf) sequence → Lookup performance → Record best performance → No improve > 100? → Y: Output best sequence found & exit; N: repeat]
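
The stopping rule in the flowchart might look like this in code. The sketch assumes that the counter tracks consecutive samples without improvement and that the search exits once it exceeds 100; `sample` and `toy_perf` are hypothetical stand-ins for the sequence generator and the DAG lookup.

```python
import random
from typing import Callable, Tuple


def random_search(sample: Callable[[], str], perf: Callable[[str], float],
                  patience: int = 100) -> Tuple[str, float, int]:
    """Sample sequences until more than `patience` in a row fail to improve."""
    best_seq = sample()
    best_perf = perf(best_seq)
    attempts, stale = 1, 0
    while stale <= patience:
        seq = sample()
        attempts += 1
        value = perf(seq)
        if value < best_perf:            # record best performance
            best_seq, best_perf, stale = seq, value, 0
        else:
            stale += 1
    return best_seq, best_perf, attempts


def toy_perf(seq: str) -> int:
    """Hypothetical cost: number of positions that differ from 'abca'."""
    return sum(1 for x, y in zip(seq, "abca") if x != y)


rng = random.Random(0)
best_seq, best_perf, attempts = random_search(
    lambda: "".join(rng.choice("abc") for _ in range(4)), toy_perf)
print(best_seq, best_perf, attempts)
```

Under this reading every run makes at least 102 attempts: the initial sample plus 101 non-improving ones.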

SLIDE 36

Random Algorithm – Performance

[Plot: x-axis: Multiple of Batch Sequence Length; y-axis: % Performance from Optimal; series: All Sequences, Leaf Sequences]

SLIDE 37

Random Algorithm – Cost

[Plot: x-axis: Multiple of Batch Sequence Length; y-axis: Number of Attempts (100 to 180); series: All Sequences, Leaf Sequences]