SLIDE 1

Evaluating whether the training data provided for profile feedback is a realistic control flow for the real workload.

Darryl Gove, Lawrence Spracklen, John Henning

Sun Microsystems Inc.

SLIDE 2

Outline

  • The trouble with feedback
  • Correspondence Values
  • Coverage
  • Concluding remarks
SLIDE 3

The trouble with feedback

  • Profile feedback uses a training run of the code
  • At a minimum, improves decisions about:

> Basic block layout
> Inlining of routines

  • But...
SLIDE 4

The trouble with feedback

  • Profile feedback uses a training run of the code
  • At a minimum, improves decisions about:

> Basic block layout
> Inlining of routines

  • But...

> It takes twice as long to build
> Requires 'representative' training data

SLIDE 5

The trouble with feedback

  • Profile feedback uses a training run of the code
  • At a minimum, improves decisions about:

> Basic block layout
> Inlining of routines

  • But...

> It takes twice as long to build (out of scope for this talk)
> Requires 'representative' training data (but what does this really mean?)

SLIDE 6

Method

  • Using SPEC CPU2000 benchmark suite
  • Checking to see how well the training workloads match the reference workloads
  • Benchmarks compiled with low optimisation
  • Instrumented to gather data on basic block counts and whether particular branches are taken or not
  • Multiple training (or reference) datasets added together to give a single training (or reference) workload (see the sketch below).
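As one concrete reading of that last step, the sketch below shows how per-dataset counts could be merged into a single workload profile. The data layout (per-branch taken/untaken pairs and per-block execution counts) and the function names are assumptions made for illustration, not the instrumentation format actually used in the study.

```python
# A minimal sketch, assuming each instrumented run yields per-branch
# (taken, untaken) counts and per-basic-block execution counts.
# Several datasets are merged by simply summing their counts.
from collections import Counter

def merge_branch_counts(runs):
    """Sum (taken, untaken) pairs per static branch across several runs."""
    merged = {}
    for run in runs:
        for branch, (taken, untaken) in run.items():
            t, u = merged.get(branch, (0, 0))
            merged[branch] = (t + taken, u + untaken)
    return merged

def merge_block_counts(runs):
    """Sum basic block execution counts across several runs."""
    merged = Counter()
    for run in runs:
        merged.update(run)   # Counter.update adds counts rather than replacing them
    return dict(merged)
```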

SLIDE 7

What can't be used

  • For SPEC CPU, the profile is used on multiple platforms
  • Each platform may do different optimisations
  • So performance cannot be used as a test for representative data sets
  • If Platform A gets faster with profile feedback, it does not imply that Platform B also will.
  • And similarly if Platform A derives no benefit.
  • So the metrics have to be platform agnostic (if possible)

SLIDE 8

Static and dynamic branches

  • A static branch is a branch instruction that exists in the code.
  • A dynamic branch is one execution of a static branch at runtime.
  • Hence one static branch can contribute many dynamic branches.

SLIDE 9

A representative workload is....

A representative training workload is one for which each static branch is either:

  • usually taken by both the training and reference workloads, or
  • usually untaken by both of them.
SLIDE 10

Correspondence Value

  • The Correspondence Value (CV) for a benchmark is the total number of correctly predicted dynamic branches divided by the total number of dynamic branches.
  • It ranges from zero (no branches correctly predicted) to 100% (all branches correctly predicted).

$$\mathrm{CV} = \frac{\displaystyle\sum_{\mathrm{branches}} \mathrm{Frequency}(\mathit{branch}) \cdot \left[\mathrm{Taken}_{\mathrm{Train}}(\mathit{branch}) \equiv \mathrm{Taken}_{\mathrm{Ref}}(\mathit{branch})\right]}{\displaystyle\sum_{\mathrm{branches}} \mathrm{Frequency}(\mathit{branch})}$$
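The bracketed term is 1 when the usual direction (taken vs. untaken) agrees between the training and reference workloads, and 0 otherwise. Below is a minimal sketch of the calculation, assuming the instrumentation yields per-branch (taken, untaken) counts for both workloads and that the weighting frequency is each branch's dynamic count in the reference workload; the names and data layout are illustrative, not the authors' actual tooling.

```python
def usually_taken(counts):
    """True if a static branch is taken at least as often as it is untaken."""
    taken, untaken = counts
    return taken >= untaken

def correspondence_value(train, ref):
    """Fraction of reference dynamic branches whose static branch has the same
    usual direction (taken vs. untaken) in both workloads."""
    matched = total = 0
    for branch, ref_counts in ref.items():
        freq = sum(ref_counts)                    # dynamic executions in the reference run
        total += freq
        train_counts = train.get(branch, (0, 0))  # branch may never run in training
        if usually_taken(train_counts) == usually_taken(ref_counts):
            matched += freq
    return matched / total if total else 0.0

# Example: br_a agrees between the workloads, br_b does not.
train = {"br_a": (90, 10), "br_b": (5, 95)}
ref   = {"br_a": (800, 200), "br_b": (700, 300)}
print(f"CV = {correspondence_value(train, ref):.0%}")   # CV = 50%
```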

SLIDE 11

Correspondence Values for CPU2000

CPU2000_INT benchmark   Correspondence (train vs. ref)   CPU2000_FP benchmark   Correspondence (train vs. ref)
164.gzip                100%                             168.wupwise            100%
175.vpr                 100%                             171.swim               100%
176.gcc                  98%                             172.mgrid               98%
181.mcf                 100%                             173.applu              100%
186.crafty               96%                             177.mesa                96%
197.parser               99%                             178.galgel              83%
252.eon                 100%                             179.art                100%
253.perlbmk              95%                             183.equake             100%
254.gap                  95%                             187.facerec            100%
255.vortex              100%                             188.ammp               100%
256.bzip2                96%                             189.lucas               89%
300.twolf               100%                             191.fma3d              100%
                                                         200.sixtrack           100%
                                                         301.apsi                72%

SLIDE 12

Visualising Correspondence Values

  • Results are easier to understand as graphs
  • x-axis is the probability taken in the reference workload
  • y-axis is the probability taken in the training workload
  • Size of mark is proportional to the frequency with which the branch is encountered (i.e. taken or untaken) in the reference workload (see the plotting sketch below).
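One way such a graph could be produced, reusing the per-branch count dictionaries assumed in the CV sketch; matplotlib is used purely for illustration, and the slides do not say how the original figures were made.

```python
# A minimal plotting sketch for a correspondence scatter plot.
import matplotlib.pyplot as plt

def correspondence_plot(train, ref):
    xs, ys, sizes = [], [], []
    for branch, (r_taken, r_untaken) in ref.items():
        r_total = r_taken + r_untaken
        if r_total == 0:
            continue
        t_taken, t_untaken = train.get(branch, (0, 0))
        t_total = t_taken + t_untaken
        xs.append(r_taken / r_total)                      # P(taken) in reference
        ys.append(t_taken / t_total if t_total else 0.0)  # P(taken) in training
        sizes.append(r_total)                             # mark size ~ reference frequency
    # Scale mark areas so the most frequent branch gets a fixed maximum size.
    scale = 400.0 / max(sizes)
    plt.scatter(xs, ys, s=[s * scale for s in sizes], alpha=0.5)
    plt.xlabel("Probability taken (reference)")
    plt.ylabel("Probability taken (training)")
    plt.xlim(0, 1)
    plt.ylim(0, 1)
    plt.show()
```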

SLIDE 13

Visualised Correspondence Value

[Figure: example correspondence plot, with the four corner regions annotated: branch usually untaken in both the training and reference workloads; branch usually taken in both; branch usually untaken in training but not in the reference workload; branch usually taken in training but not in the reference workload.]

SLIDE 14

300.twolf (CV=100%)

Benchmark is well-trained

SLIDE 15

178.galgel (CV=83%)

A couple of branches are mis-trained

SLIDE 16

186.crafty (CV=96%)

Benchmark is unpredictable

SLIDE 17

301.apsi (CV=72%)

Several important branches are mis-trained

SLIDE 18

Coverage data

  • However, one branch instruction might have multiple targets.
  • Data per branch is not easily accessible.
  • Basic block counts are more easily accessible, and unique.

SLIDE 19

Coverage data

  • However, one branch instruction might have multiple targets.
  • Data per branch is not easily accessible.
  • Basic block counts are more easily accessible, and unique.
  • However,

> Some basic block counts scale with runtime (e.g. an inner loop)
> Some basic block counts are constant for all runtimes (e.g. initialisation code)
SLIDE 20

A representative workload is...

  • A representative training workload will exercise all the critical basic blocks of the reference workload.

SLIDE 21

Coverage definition

  • The coverage is the sum of the dynamic basic block counts for the reference workload, over those blocks that are also executed by the training workload, divided by the sum of all the dynamic basic block counts for the reference workload.

$$\mathrm{coverage} = \frac{\displaystyle\sum_{\mathrm{blocks}} \mathrm{Frequency}_{\mathrm{Ref}}(\mathit{block}) \cdot \left[\mathrm{Frequency}_{\mathrm{Train}}(\mathit{block}) > 0\right]}{\displaystyle\sum_{\mathrm{blocks}} \mathrm{Frequency}_{\mathrm{Ref}}(\mathit{block})}$$
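The bracketed term is 1 when the training workload executes the block at least once, and 0 otherwise. Below is a minimal sketch of the calculation under the same assumed per-block count layout as earlier; names and data are illustrative only.

```python
def coverage(train_counts, ref_counts):
    """Fraction of the reference workload's dynamic basic block count that falls
    in blocks also executed (at least once) by the training workload."""
    covered = sum(freq for block, freq in ref_counts.items()
                  if train_counts.get(block, 0) > 0)
    total = sum(ref_counts.values())
    return covered / total if total else 0.0

# Example: block "bb3" is hot in the reference run but never reached in training.
train_counts = {"bb0": 1, "bb1": 5_000, "bb2": 4_000}
ref_counts   = {"bb0": 1, "bb1": 60_000, "bb2": 50_000, "bb3": 40_000}
print(f"coverage = {coverage(train_counts, ref_counts):.0%}")   # roughly 73%
```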

SLIDE 22

Coverage CPU2000

CPU2000_INT benchmark   Coverage   CPU2000_FP benchmark   Coverage
164.gzip                100%       168.wupwise            100%
175.vpr                 100%       171.swim               100%
176.gcc                 100%       172.mgrid              100%
181.mcf                 100%       173.applu              100%
186.crafty              100%       177.mesa                98%
197.parser              100%       178.galgel              85%
252.eon                 100%       179.art                100%
253.perlbmk             100%       183.equake             100%
254.gap                  99%       187.facerec            100%
255.vortex              100%       188.ammp               100%
256.bzip2               100%       189.lucas               81%
300.twolf               100%       191.fma3d              100%
                                   200.sixtrack           100%
                                   301.apsi                37%

SLIDE 23

Visualising coverage

  • Sort the blocks in order of increasing execution count
  • x-axis is the sorted basic block count for the reference workload
  • y-axis is the sorted basic block count for the training workload
  • Size of mark is proportional to the execution count for the reference workload (see the sketch below).
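A sketch of one plausible reading of this plot, reusing the assumed per-block count dictionaries: "sorted basic block count" is interpreted here as each block's rank once blocks are sorted by execution count in the given workload, which is an assumption rather than necessarily the authors' exact construction.

```python
# A sketch of a coverage plot: block rank in the reference ordering vs.
# block rank in the training ordering, mark size ~ reference execution count.
import matplotlib.pyplot as plt

def rank_by_count(counts):
    ordered = sorted(counts, key=counts.get)          # increasing execution count
    return {block: rank for rank, block in enumerate(ordered)}

def coverage_plot(train_counts, ref_counts):
    ref_rank = rank_by_count(ref_counts)
    train_rank = rank_by_count(train_counts)
    xs = [ref_rank[b] for b in ref_counts]
    ys = [train_rank.get(b, 0) for b in ref_counts]   # uncovered blocks fall to the bottom
    sizes = [ref_counts[b] for b in ref_counts]
    scale = 400.0 / max(sizes)
    plt.scatter(xs, ys, s=[s * scale for s in sizes], alpha=0.5)
    plt.xlabel("Block rank by reference count")
    plt.ylabel("Block rank by training count")
    plt.show()
```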

SLIDE 24

300.twolf coverage (100%)

'Lollipop' shape, indicating that blocks are similarly hot in the training and reference workloads

SLIDE 25

301.apsi coverage (37%)

Code that is hot in the reference workload but not covered by the training workload

SLIDE 26

Concluding remarks

  • Coverage is easy to calculate, and provides a low bar for representative training workloads.
  • If a block is not covered, it cannot have been trained.
  • Correspondence Value calculations are a more detailed approach.
  • As can be seen from the apsi results, the two approaches are complementary.
  • Using these calculations it is possible to evaluate whether the current training workloads are sufficient for code path optimisations.

SLIDE 27

John Henning

John.Henning@sun.com

Evaluating whether the training data provided for profile feedback is a realistic control flow for the real workload.