Automatically Characterizing Large Scale Program Behavior



SLIDE 1

Automatically Characterizing Large Scale Program Behavior

Timothy Sherwood Erez Perelman Greg Hamerly Brad Calder

SLIDE 2

ASPLOS: Sherwood et al.

Title

  • Ideal: To understand the effects of cycle-level events on full program execution
  • Challenge: To achieve this without doing complete detailed simulation
  • How: Build a high-level model of program behavior that can be used in conjunction with limited detailed simulation

SLIDE 3

Goals

  • The goals of this research are:

– To create an automatic system that is capable of intelligently characterizing time-varying program behavior
– To provide both analytic and software tools to help with program phase identification
– To demonstrate the utility of these tools for finding places to simulate (SimPoints)
– Without full program detailed simulation

SLIDE 4

Our Approach

  • Programs are neither

– Completely Homogeneous
– nor Totally Random

  • Instead they are quite structured
  • Discover this structure
  • The key is the code that is executing

– the code determines the program behavior

SLIDE 5

Large Scale Behavior (gzip)

[Figure: gzip behavior over execution; panels: IPC, Energy, DL1, IL1, L2, bpred]

SLIDE 6

Some Definitions

  • Interval is

– A set of instructions that execute one after the other in program order
– 100 Million Instructions

  • Phase is

– A set of intervals with very similar behavior
– Regardless of temporal adjacency

SLIDE 7

Outline

  • Examining the Programs
  • Finding Phases Automatically
  • Application to Efficient Simulation
  • Conclusions
SLIDE 8

Fingerprinting Intervals

  • Fingerprint each interval in program

– Enabling us to build high level model

  • Basic Block Vector [PACT’01]

– Tracks the code that is executing
– Long sparse vector
– 1 dimension per static basic block
– Based on instruction execution frequency

SLIDE 9

Basic Block Vectors

BB   Assembly Code of bzip
1    srl    a2, 0x8, t4
     and    a2, 0xff, t12
     addl   zero, t12, s6
     subl   t7, 0x1, t7
     cmpeq  s6, 0x25, v0
     cmpeq  s6, 0, t0
     bis    v0, t0, v0
     bne    v0, 0x120018c48
2    subl   t7, 0x1, t7
     cmple  t7, 0x3, t2
     beq    t2, 0x120018b04
3    ble    t7, 0x120018bb4
4    and    t4, 0xff, t5
     srl    t4, 0x8, t4
     addl   zero, t5, s6
     cmpeq  s6, 0x25, s0
     cmpeq  s6, 0, a0
     bis    s0, a0, s0
     bne    s0, 0x120018c48
5    subl   t7, 0x1, t7
     gt     t7, 0x120018b90
...  ...

For each interval:

ID:                     1    2    3    4    5    …
BB Exec Count:         <1,   20,  0,   5,   0,   …>
Weigh by Block Size:   <8,   3,   1,   7,   2,   …>
                     = <8,   60,  0,   35,  0,   …>
Normalize to 1:      = <8%,  58%, 0%,  34%, 0%,  …>

  • One BBV for each interval
  • We can now compare vectors
  • Start with simple manual analysis

– Compare all N² pairs of intervals

  • Enter the Similarity Matrix…
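The weighting and normalization steps on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' tool: the function name `basic_block_vector` and the list-based representation are assumptions, and only the five blocks shown on the slide are used (the elided "…" entries are left out).

```python
def basic_block_vector(exec_counts, block_sizes):
    """Build a normalized basic block vector (BBV) for one interval.

    exec_counts[i] is how often static basic block i ran in the interval;
    block_sizes[i] is the number of instructions in that block.
    """
    # Weigh each execution count by the block's instruction count.
    weighted = [count * size for count, size in zip(exec_counts, block_sizes)]
    total = sum(weighted)
    if total == 0:
        return [0.0] * len(weighted)
    # Normalize so the entries sum to 1.
    return [w / total for w in weighted]


# Values for the five blocks shown on the slide.
counts = [1, 20, 0, 5, 0]
sizes = [8, 3, 1, 7, 2]
bbv = basic_block_vector(counts, sizes)
percentages = [round(100 * x) for x in bbv]
print(percentages)  # [8, 58, 0, 34, 0]
```

A real BBV has one dimension per static basic block, so a sparse representation would be the more natural choice at full scale.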
SLIDE 10

Similarity Matrix

  • Compare N² pairs of intervals
  • Executed Instructions

– Diagonal axis

  • To compare 2 points, go horizontally from one and vertically from the other
  • Darker points indicate similar vectors
  • Clearly shows the phase behavior
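As a rough sketch of how such a matrix could be computed, the snippet below compares every pair of interval vectors. The slides do not name the distance metric; Manhattan distance is an assumption here, and the three toy vectors are made up for illustration.

```python
def manhattan(u, v):
    """Sum of absolute element-wise differences between two vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))


def similarity_matrix(bbvs):
    """Distance between every pair of interval vectors (N x N entries)."""
    n = len(bbvs)
    return [[manhattan(bbvs[i], bbvs[j]) for j in range(n)] for i in range(n)]


# Three hypothetical normalized BBVs: intervals 0 and 1 run the same code,
# interval 2 runs mostly different code.
bbvs = [
    [0.5, 0.5, 0.0],
    [0.5, 0.5, 0.0],
    [0.0, 0.1, 0.9],
]
matrix = similarity_matrix(bbvs)
for row in matrix:
    print(row)
```

A small distance corresponds to a dark point in the plot. For vectors normalized to sum to 1, the Manhattan distance ranges from 0 (identical) to 2 (no code in common).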

SLIDE 11

A More Complex Matrix - gcc

  • Still much structure
  • Dark boxes show phase behavior
  • Boxes in interior show recurring phases
  • Strong diagonal line indicates first half is similar to second half
  • Manual inspection is not feasible or scalable

SLIDE 12

Outline

  • Examining the Programs
  • Finding Phases Automatically
  • Application to Efficient Simulation
  • Conclusions
SLIDE 13

Finding the Phases

  • Basic Block Vector is a point in space
  • The problem is to find groups of vectors/points that are all similar

– Making sure that all points in a group are similar to one another
– And ensuring all points that are different are put into different groups

  • This is a Clustering Problem
  • A Phase is a Cluster of BBVectors
SLIDE 14

Phase-finding Algorithm

I. Profile Program and track BB Vectors
II. Use the K-means algorithm to find clusters in the data for many different values of K
III. Score the likelihood of each clustering
IV. Pick the best clustering
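Steps II to IV can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' implementation: the slides do not give the scoring formula, so `penalized_score` is a stand-in that adds a model-size penalty to the clustering error (lower is better), and the k-means initialization (farthest-first) is chosen only to keep the example deterministic. All function names and the sample points are hypothetical.

```python
import math


def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def kmeans(points, k, iterations=100):
    """Plain k-means with deterministic farthest-first initialization."""
    centers = [points[0]]
    while len(centers) < k:
        # Next center: the point farthest from all centers chosen so far.
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(far)
    labels = None
    for _ in range(iterations):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster empties
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    sse = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return labels, centers, sse


def penalized_score(points, k, sse):
    """Stand-in score (assumption): fit term plus a penalty that grows with K."""
    n, d = len(points), len(points[0])
    fit = n * d * math.log(sse / (n * d) + 1e-12)
    penalty = k * (d + 1) * math.log(n)
    return fit + penalty  # lower is better


def best_clustering(points, candidate_ks):
    """Run k-means for each K, score each result, and keep the best one."""
    scored = []
    for k in candidate_ks:
        labels, _, sse = kmeans(points, k)
        scored.append((penalized_score(points, k, sse), k, labels))
    return min(scored, key=lambda t: t[0])


# Twelve hypothetical 2-D points forming three tight groups.
points = [[x + dx, y + dy] for (x, y) in [(0, 0), (10, 10), (20, 0)]
          for dx in (0, 1) for dy in (0, 1)]
score, k, labels = best_clustering(points, range(1, 6))
print(k, labels)
```

On this toy data the penalty stops the search from always preferring the largest K, which is the role the scoring step plays in the algorithm above.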

SLIDE 15

Improving Performance

  • K-means requires many manipulations

– Basic Block Vectors are very long

  • > 100,000 dimensions for gcc; 800,000 for Microsoft apps

– Need to make the Vectors smaller

  • Still preserve relative distances
  • Random Projection

– Multiply the vector by a random matrix
– Can safely reduce down to 15 dimensions
– Reduce run-time from days to minutes
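A minimal sketch of the projection step, assuming a dense random matrix with entries drawn uniformly from [-1, 1] and a BBV stored sparsely as a dictionary. The entry distribution, the seed, the function names, and the sample vector are all assumptions made for illustration; the slide only states that the vector is multiplied by a random matrix and reduced to 15 dimensions.

```python
import random


def random_projection_matrix(source_dims, target_dims=15, seed=0):
    """One row per original dimension, one column per projected dimension."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(target_dims)]
            for _ in range(source_dims)]


def project(sparse_bbv, matrix):
    """Multiply a sparse BBV ({block index: weight}) by the random matrix."""
    target_dims = len(matrix[0])
    out = [0.0] * target_dims
    # Only non-zero entries contribute, so the cost follows the number of
    # basic blocks actually executed in the interval.
    for block, weight in sparse_bbv.items():
        row = matrix[block]
        for j in range(target_dims):
            out[j] += weight * row[j]
    return out


# Hypothetical interval touching 3 of 1,000 static basic blocks.
matrix = random_projection_matrix(source_dims=1000)
bbv = {4: 0.08, 17: 0.58, 912: 0.34}
projected = project(bbv, matrix)
print(len(projected))  # 15
```

The same matrix must be reused for every interval so that distances between the projected vectors remain comparable.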

SLIDE 16

Example: gzip Revisited

[Figure: gzip behavior over execution; panels: IPC, Energy, DL1, IL1, L2, bpred]

SLIDE 17

gzip – Phases Discovered

[Figure: gzip with discovered phases; panels: IPC, Energy, DL1, IL1, L2, bpred]

SLIDE 18

gcc - A Complex Example

[Figure: gcc behavior over execution; panels: IPC, Energy, DL1, IL1, L2, bpred]

SLIDE 19

gcc – Phases Discovered

[Figure: gcc with discovered phases; panels: IPC, Energy, DL1, IL1, L2, bpred]

SLIDE 20

Outline

  • Examining the Programs
  • Finding Phases Automatically
  • Application to Efficient Simulation
  • Conclusions
SLIDE 21

Efficient Simulation

  • Simulating to completion not feasible

– Detailed simulation on SPEC takes months
– Cycle level effects can’t be ignored

  • To reduce simulation time

– Simulate only a subset of the program at cycle-level accuracy
– What subset you pick is very important

  • For accuracy and efficiency
SLIDE 22

Simulation Options

  • Simulate Blind: no estimate of accuracy
  • Single Point: problem with complex programs that have many phases
  • Random Sample: high accuracy, but with many sections of similar code you will be doing a lot of redundant work
  • Choose Multiple Points: by examining the calculated phase information

SLIDE 23

Multiple SimPoints

  • Perform phase analysis
  • For each phase in the program

– Pick the interval most representative of the phase
– This is the SimPoint for that phase

  • Perform detailed simulation for SimPoints
  • Weigh results for each SimPoint

– According to the size of the phase it represents
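The selection and weighting steps above can be sketched like this. The slide does not define "most representative"; taking the interval closest to the phase centroid is an assumption, as are the function names, the toy vectors, and the per-SimPoint IPC values.

```python
def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def choose_simpoints(vectors, labels):
    """Return {phase: (representative interval index, phase weight)}."""
    phases = {}
    for idx, label in enumerate(labels):
        phases.setdefault(label, []).append(idx)
    dims = len(vectors[0])
    simpoints = {}
    for label, members in phases.items():
        centroid = [sum(vectors[i][d] for i in members) / len(members)
                    for d in range(dims)]
        # Assumption: "most representative" means closest to the centroid.
        rep = min(members, key=lambda i: sq_dist(vectors[i], centroid))
        # Weight is the fraction of all intervals that fall in this phase.
        simpoints[label] = (rep, len(members) / len(labels))
    return simpoints


def weighted_estimate(simpoints, measured):
    """Combine per-SimPoint results (e.g. IPC) using the phase weights."""
    return sum(weight * measured[rep] for rep, weight in simpoints.values())


# Four hypothetical intervals: three in phase 0, one in phase 1.
vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.8, 0.2]]
labels = [0, 0, 1, 0]
simpoints = choose_simpoints(vectors, labels)

# Hypothetical IPC measured by detailed simulation of intervals 1 and 2 only.
measured_ipc = {1: 2.0, 2: 1.0}
estimate = weighted_estimate(simpoints, measured_ipc)
print(simpoints, estimate)
```

Only the chosen intervals need detailed simulation; the weights carry each result over to the rest of its phase.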

SLIDE 24

Results – Average Error

[Figure: bar chart, IPC - Average Error. Bars: No FastFwd, FastFwd Billion, Single SimPoint, Multiple SimPoints. Axis from 0% to 30%; off-scale bars labeled 253% and 131%.]

SLIDE 25

Results – Max Error

[Figure: bar chart, IPC - Max Error. Bars: No FastFwd, FastFwd Billion, Single SimPoint, Multiple SimPoints. Axis from 0% to 100%; off-scale bars labeled 3736% and 1986%.]

SLIDE 26

Outline

  • Examining the Programs
  • Finding Phases Automatically
  • Application to Efficient Simulation
  • Conclusions
SLIDE 27

Conclusions

  • Gap between

– Cycle level events
– Full program effects

  • Exploit large scale structure

– Provide high level model
– Find the model with no detailed simulation
– In conjunction with limited detailed simulation

SLIDE 28

Conclusions

  • Our Strategy

– Take advantage of structure found in program
– Summarize the structure in the form of phases
– Find phases using techniques from clustering

  • Use this for doing efficient simulation

– High accuracy
– With orders of magnitude less time

  • http://www.cs.ucsd.edu/~sherwood