Benchmark Design for Robust Profile-Directed Optimization
SPEC Workshop 2007 Paul Berube and José Nelson Amaral University of Alberta
NSERC Alberta Ingenuity iCore
January 21, 2007
In this talk
- SPEC CPU
- Offline, profile-guided optimization
- Evaluation
- Program input data
PDF in Research
- Used in many recent compiler and architecture works
- Run rules are seldom followed exactly
  - PDF will continue regardless of admissibility in reported results
An Opportunity to Improve
- The current PDF evaluation methodology is not rigorous
  - Dictated by the inputs/rules provided in SPEC CPU
  - Usually followed when reporting PDF research
- An opportunity to step back and consider the methodology
Current Methodology
- Test: input.ref, compiled with static optimization and flag tuning → peak_static
Current Methodology
- Train: run input.train under an instrumenting compiler to collect a profile
- Test: input.ref, compiled with the PDF compiler (flag tuning + profile) → peak_pdf

if (peak_pdf > peak_static) peak := peak_pdf;
else peak := peak_static;

Questions:
- Is this comparison sound? And what about (peak_pdf > other_pdf)?
- Does 1 training and 1 test input predict PDF performance?
- Variance between inputs can be larger than reported improvements!
bzip2 – Train on xml
[Bar chart: speedup (%) on each test input (combined, compressed, docs, gap, graphic, jpeg, xml, log, mp3, mpeg, program, random, reuters, pdf, source); the variation across inputs exceeds 14%]
PDF is like Machine Learning
- Goal: maximize expected performance
Evaluation of Learning Systems
- Must take both training and evaluation inputs into account
  - PDF specializes code according to training data
  - Changing inputs can greatly alter performance
- Requires statistical significance measures
  - Differentiate between gains/losses and noise
Overfitting
- Learning characteristics of the training data that do not generalize
- Possible causes:
  - insufficient quantity of training data
  - insufficient variation among training data
  - a deficient learning system
Overfitting
✗ Engineering the compiler not to overfit a single training input leads to underfitting
✗ No clear rules for input selection
✗ Some benchmark authors replicate data between train and ref
Criteria for Evaluation
– Cross-validation addresses these criteria
Cross-Validation
- Partition the inputs into non-overlapping training and testing sets

Leave-one-out Cross-Validation
- Train on all inputs but one; only 1 input in test
- Leave N out: only N inputs in test
Cross-Validation
- No data is shared between the training set and the testing set
  - Overfitting will not enhance performance
- Multiple results allow statistical measures to be calculated on the results
  - Standard deviation, confidence intervals...
- Training on multiple inputs lets the compiler exploit commonalities between inputs
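As a concrete sketch (the input names are hypothetical, not from the talk), leave-one-out cross-validation over a set of profiling inputs could look like:

```python
# Leave-one-out cross-validation: each profiling input is held out for
# testing exactly once; the profile is gathered from all remaining inputs.
inputs = ["jpeg", "mpeg", "xml", "html", "text"]  # hypothetical input names

folds = []
for held_out in inputs:
    train = [name for name in inputs if name != held_out]
    folds.append((train, [held_out]))
# Training and testing sets never overlap, so overfitting the training
# inputs cannot inflate the reported numbers.
```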
Proposed Methodology
- Report results with standard deviation
- Inputs used for both training and evaluation, so "medium" sized (~2 min running time)
- 9 inputs needed for meaningful statistical measures
Proposed Methodology
- For each test input, measure speedup compared to (non-PDF) peak
- Report the mean speedup and the standard deviation of speedups
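The reporting step could be sketched with Python's statistics module; the speedup values below are hypothetical stand-ins for measured per-input results:

```python
import statistics

# Hypothetical per-input speedups (%) collected across cross-validation folds
speedups = [1.0, -3.0, 3.0, 5.0, 5.0, 4.0, -1.0, 1.0, 3.0]

mean = statistics.mean(speedups)    # mean speedup across test inputs
stdev = statistics.stdev(speedups)  # sample standard deviation

print(f"speedup: {mean:.2f}% +/- {stdev:.2f}%")
```

Reporting the standard deviation alongside the mean is what lets a reader separate real gains from cross-input noise.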
Example
PDF Workload (9 inputs): jpeg, mpeg, xml, html, text, doc, pdf, source, program

Example – Split workload
- Fold A: jpeg, xml, pdf
- Fold B: mpeg, html, source
- Fold C: text, doc, program
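The example's split and train/test pairing can be sketched as follows (fold membership taken from the example; train on one fold, test on the other two):

```python
# The 9-input workload, split into three folds as in the example
folds = {
    "A": ["jpeg", "xml", "pdf"],
    "B": ["mpeg", "html", "source"],
    "C": ["text", "doc", "program"],
}

workload = [name for fold in folds.values() for name in fold]

# Train on one fold, test on the other two: every input is tested
# exactly twice, and never against a profile it helped produce.
runs = []
for fold_name, train in folds.items():
    test = [name for name in workload if name not in train]
    runs.append((fold_name, train, test))
```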
Example – Train and Run
- Train: fold A (instrumenting compiler → Profile(A))
- Test: B+C with the PDF compiler and Profile(A)
- Results: mpeg 1%, html 5%, text 4%, doc -3%, source 4%, program 2%
Example – Train and Run
- Train: fold B (instrumenting compiler → Profile(B))
- Test: A+C with the PDF compiler and Profile(B)
- Results: jpeg 4%, xml ?, text 5%, doc 1%, pdf 4%, program 1%
Example – Train and Run
- Train: fold C (instrumenting compiler → Profile(C))
- Test: A+B with the PDF compiler and Profile(C)
- Results: jpeg 5%, xml 2%, mpeg -1%, html 3%, pdf 3%, source 3%
Combined results (each input tested by two folds):
doc: 1%, -3%
html: 3%, 5%
jpeg: 5%, 4%
mpeg: -1%, 1%
pdf: 3%, 4%
program: 1%, 2%
source: 3%, 4%
text: 5%, 4%
xml: ?, 2%
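One way to aggregate the per-fold results into per-input measurements, using the example's reported numbers (xml has only its fold-C measurement here):

```python
# Per-fold test results (%) from the example
fold_results = {
    "A": {"mpeg": 1, "html": 5, "text": 4, "doc": -3, "source": 4, "program": 2},
    "B": {"jpeg": 4, "text": 5, "doc": 1, "pdf": 4, "program": 1},
    "C": {"jpeg": 5, "xml": 2, "mpeg": -1, "html": 3, "pdf": 3, "source": 3},
}

# Group the measurements by input: each input is tested by the two
# folds that did not train on it.
per_input = {}
for results in fold_results.values():
    for name, speedup in results.items():
        per_input.setdefault(name, []).append(speedup)
```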
Example – Evaluate
Average speedup: 2.33%
Example – Evaluate
Does PDF improve performance?
- (peak_pdf > peak_static)? (new_pdf > other_pdf)?
- Depends on the mean and variance of both!
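To make "depends on mean and variance" concrete, here is a minimal sketch of an unpaired two-sample check using Welch's t statistic, computed by hand; both speedup lists are hypothetical:

```python
import statistics

def welch_t(a, b):
    """Welch's t statistic for two independent samples."""
    va, vb = statistics.variance(a), statistics.variance(b)
    return (statistics.mean(a) - statistics.mean(b)) / (va / len(a) + vb / len(b)) ** 0.5

# Hypothetical speedups (%) measured for two PDF configurations
pdf_a = [1.0, 5.0, 4.0, -3.0, 4.0, 2.0]
pdf_b = [2.0, 5.0, 3.0, -7.0, 1.0, 1.0]

t = welch_t(pdf_a, pdf_b)
# |t| is small here, so the difference in means is within the noise:
# neither configuration can be declared faster from this data.
```

For samples of this size, a |t| of roughly 2 or more would be needed before the difference in means could plausibly be attributed to the optimization rather than input variance.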
Pieces of Effective Evaluation
- Rules and guidelines for authors
- Statistical evaluation
Practical Concerns
- Many additional runs, but on smaller inputs
- Two additional program compilations
- Most INT benchmarks use multiple data, and/or additional data is easily available
- The PDF input set could be used for REF
Conclusion
- PDF is used in compilers and architecture, in research and in practice
- The current evaluation methodology is not reliable
- Cross-validation provides a rigorous statistical evaluation
Thanks
Questions?