
Performance

CSE378 Winter, 2001

What do we mean by Performance?
• We must take many different factors into account:
  • Technology
    • basic circuit speed (clock speed, usually in MHz: millions of cycles per second, now in GHz: billions of cycles per second)
    • process technology (how many transistors on a chip)
  • Organization
    • what style of ISA (RISC or CISC)
    • what type of memory hierarchy
    • how many processors in the system
  • Software
    • quality of the compiler, OS, database driver, etc.
• There's a lot more to measuring performance than clock speed...

Metrics
• Raw speed (peak performance, but it is never attained)
• Execution time (also called response time, i.e. the time to execute one program from beginning to end). Need specific benchmarks for:
  • Integer dominated programs (compilers, etc.)
  • Scientific (lots of floating point usage)
  • Graphics/multimedia
• Throughput (total amount of work in a given time)
  • Good metric for systems managers
  • Database programs (keeping the most people happy at the same time)
• Often, improving execution time will improve throughput, and vice versa.

Execution Time
• Performance:
  $$\text{Performance}_A = \frac{1}{\text{Execution time}_A}$$
• Processor A is faster than processor B if:
  $$\text{Execution time}_A < \text{Execution time}_B$$
  $$\text{Performance}_A > \text{Performance}_B$$
• Relative performance:
  $$\frac{\text{Performance}_A}{\text{Performance}_B} = \frac{\text{Execution time}_B}{\text{Execution time}_A}$$
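The relative performance definition can be checked with a few lines of Python. This is only a sketch: the function names and the two execution times (10 s and 15 s) are hypothetical values chosen for illustration, not measurements from the lecture.

```python
def performance(execution_time_s: float) -> float:
    """Performance is defined as the reciprocal of execution time."""
    if execution_time_s <= 0:
        raise ValueError("execution time must be positive")
    return 1.0 / execution_time_s


def relative_performance(exec_time_a_s: float, exec_time_b_s: float) -> float:
    """Performance_A / Performance_B, which equals ExecTime_B / ExecTime_A."""
    return performance(exec_time_a_s) / performance(exec_time_b_s)


# Hypothetical measurements: machine A takes 10 s, machine B takes 15 s.
ratio = relative_performance(10.0, 15.0)
print(f"A is {ratio:.2f}x as fast as B")
```

A ratio above 1 means A is the faster machine; swapping the arguments gives the reciprocal.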

Measuring Execution Time
• Wall clock, response time, elapsed time
• Unix time function:
    [tahiti]:~ % time someprogram
    346.085u 0.394s 5:48.32 99.4% 5+302k 0+0io 0pf+0w
• "time" lists User CPU time, System CPU time, elapsed time, the percentage of elapsed time which is total CPU time, as well as information about the process size, quantity of IO, etc.
• Because of OS differences, it is hard to make comparisons from one system to another...
• For the remainder of this lecture, we'll use User CPU time to mean CPU execution time (or just execution time)

Definition of CPU Execution Time
• Definition:
  $$\text{CPU Execution Time} = \text{CPU clock cycles} \times \text{clock cycle time}$$
  • CPU execution time is program dependent
  • The number of CPU clock cycles is program dependent
  • Clock cycle time (usually in nanoseconds, ns) depends on the particular machine
• Since clock cycle time = 1 / clock cycle rate (clock cycle rate is in MHz, millions of cycles per second), an alternate definition is:
  $$\text{CPU Execution Time} = \frac{\text{CPU clock cycles}}{\text{clock cycle rate}}$$

CPI - Cycles Per Instruction
• Definition: CPI is the average number of clock cycles per instruction.
  $$\text{CPU clock cycles} = \text{Number of Instructions} \times \text{CPI}$$
  $$\text{CPU Execution Time} = \text{Number of Instructions} \times \text{CPI} \times \text{clock cycle time}$$
• CPI in isolation is not a measure of performance (program and compiler dependent)
• Ideally, CPI = 1, but this might slow down the clock (compromise)
• CPI can be (and usually is) greater than 1 because of breaks in control flow and the impact of the memory hierarchy
• Can we have CPI < 1?

Class of Instructions
• You can give different CPIs for various classes of instructions (e.g. floating point arithmetic instructions take longer than integer instructions, load-store instructions take longer than logical instructions, etc.)
  $$\text{CPU Execution Time} = \sum_{i=1}^{n} \left( \text{CPI}_i \times C_i \right) \times \text{clock cycle time}$$
• $C_i$ is the number of instructions in the $i$th class that have been executed
• Note that minimizing the number of instructions does not necessarily improve the execution time of the program
• Improving part of the architecture can improve a $C_j$. We often talk about the contribution to CPI of a certain class of instructions.
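The per-class formula for CPU execution time can be sketched in Python as follows. The instruction classes, their CPIs, their counts, and the 500 MHz clock are all hypothetical numbers picked to make the arithmetic easy to follow; only the formula itself comes from the slides.

```python
def total_cycles(classes):
    """Sum of CPI_i * C_i over all instruction classes.

    `classes` is a list of (cpi_i, count_i) pairs.
    """
    return sum(cpi * count for cpi, count in classes)


def cpu_exec_time(classes, clock_cycle_time_s):
    """CPU execution time = sum(CPI_i * C_i) * clock cycle time."""
    return total_cycles(classes) * clock_cycle_time_s


def average_cpi(classes):
    """Average CPI = total cycles / total instructions executed."""
    instruction_count = sum(count for _, count in classes)
    return total_cycles(classes) / instruction_count


# Hypothetical program: (CPI, instruction count) for each class.
program = [
    (1.0, 600_000),  # integer and logical instructions
    (2.0, 300_000),  # loads and stores
    (4.0, 100_000),  # floating point instructions
]
clock_rate_hz = 500e6                 # assumed 500 MHz clock
cycle_time_s = 1.0 / clock_rate_hz    # 2 ns per cycle

avg = average_cpi(program)
exec_time = cpu_exec_time(program, cycle_time_s)
print(f"average CPI = {avg:.2f}, CPU time = {exec_time * 1e3:.2f} ms")
```

With these made-up numbers the floating point class accounts for only 10% of the instructions but 25% of the cycles, which is the sense in which a class "contributes" to CPI.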

Measuring average CPI
• Instruction count: need a simulator or (possibly less precise) a profiler
  • Simulator "interprets" every instruction and counts them
  • Profiler can either count how many times each basic block has been executed or use some sampling technique
• CPU Execution time can be measured (elapsed time)
• Clock cycle time is given by the processor
• We know execution time, cycle time, so we can solve for total cycles.
• Knowing the total cycles together with the total number of instructions executed lets us solve for average CPI.

Other Popular Metrics - MIPS
• MIPS = Millions of Instructions Per Second
  $$\text{MIPS} = \frac{\text{Instruction Count}}{\text{Execution time} \times 10^{6}}$$
  or
  $$\text{MIPS} = \frac{\text{clock rate}}{\text{CPI} \times 10^{6}}$$
• Since MIPS is a rate, the higher the better.
• But MIPS in isolation is no better than CPI in isolation. MIPS is:
  • Program dependent
  • Does not take the instruction set into account (CISC programs will typically take fewer instructions than RISC, so we can't compare different ISAs)

The Trouble with MIPS
• Using MIPS can give "wrong" results:
  • Machine A with compiler C1 executes program P in 10 seconds, using 100,000,000 instructions (10 MIPS)
  • Machine A with compiler C2 executes program P in 15 seconds, using 180,000,000 instructions (12 MIPS)
  • While C1 is clearly faster than C2, C1 has a lower MIPS rating than C2...
• ... the trouble with MIPS is that it doesn't take CPI into account.

Other Popular Metrics - MFLOPS
• MFLOPS = Millions of floating point operations per second
  $$\text{MFLOPS} = \frac{\text{Number of floating point instructions}}{\text{Execution time} \times 10^{6}}$$
• Same problems as MIPS:
  • Program dependent
  • Doesn't take instruction set into account
  • Counts operations, not the time to execute them...
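The compiler C1 versus C2 example from "The Trouble with MIPS" can be reproduced directly. The sketch below uses the instruction counts and execution times given on the slide; the function and variable names are illustrative assumptions.

```python
def mips(instruction_count: float, exec_time_s: float) -> float:
    """MIPS = instruction count / (execution time * 10^6)."""
    return instruction_count / (exec_time_s * 1e6)


# Figures from the slide: same machine A, same program P, two compilers.
c1_instructions, c1_time_s = 100_000_000, 10.0
c2_instructions, c2_time_s = 180_000_000, 15.0

c1_mips = mips(c1_instructions, c1_time_s)
c2_mips = mips(c2_instructions, c2_time_s)

print(f"C1: {c1_time_s:.0f} s, {c1_mips:.0f} MIPS")
print(f"C2: {c2_time_s:.0f} s, {c2_mips:.0f} MIPS")

# The two rankings disagree: C2 wins on MIPS, C1 wins on execution time.
faster_by_time = "C1" if c1_time_s < c2_time_s else "C2"
higher_mips = "C1" if c1_mips > c2_mips else "C2"
print(f"faster by execution time: {faster_by_time}, higher MIPS: {higher_mips}")
```

The code generated by C2 executes 80% more instructions, so its higher instruction rate still leaves it with the longer execution time.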

Benchmarks
• Benchmarks: workload representative of what the computer will actually be used for.
• Industry benchmarks to compare machines: SPEC benchmarks (SPECint, SPECfp), Perfect Club
• Database benchmarks
• Multimedia benchmarks
• Caveats:
  • Compilers optimize specifically for benchmarks
  • Old SPEC benchmarks (1992) were too small (didn't test the memory system sufficiently)
  • Utilities, user interface, etc. are often not in benchmarks

Amdahl's Law
• The amount that we can improve performance with a given improvement is limited by the amount that the improved feature is actually used:
  $$\text{Execution time after improvement} = \frac{\text{Execution time affected by improvement}}{\text{Amount of improvement}} + \text{Execution time unaffected}$$
• For instance, if loads/stores take up 33% of our execution time, how much do we need to improve loads/stores to make the program run 1.5 times faster?
• Important corollary: Make the common case fast.

Example Measurements

  Instruction Category                  GCC    SPICE   Ave. CPI
  Load/Store                            33%    40%      1.4
  Branch                                16%     8%      1.8
  Jumps                                  2%     2%      1.2
  FP Add                                 -      5%      2.0
  FP Sub                                 -      3%      4.0
  FP Mul                                 -      6%      5.0
  FP Div                                 -      3%     19.0
  Other (integer add/sub, slt, etc.)    49%    33%      1.0

• What is the average CPI for gcc? For spice? Should we expect the CPI for a given category to be the same between two programs?

Evolution of ISAs
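The two questions posed on the Amdahl's Law and Example Measurements slides can be worked through with a short Python sketch. It rests on two assumptions that are not in the slides: execution time is normalized to 1, and the "Ave. CPI" column is applied unchanged to both programs (the slide itself asks whether that is reasonable). All function and variable names are illustrative.

```python
def required_improvement(affected_fraction, target_speedup):
    """Solve Amdahl's Law for the improvement factor on the affected part.

    Execution time is normalized to 1.0. Returns None when the target
    cannot be reached even if the affected part took no time at all.
    """
    target_time = 1.0 / target_speedup
    unaffected = 1.0 - affected_fraction
    remaining = target_time - unaffected  # time budget left for the affected part
    if remaining <= 0:
        return None
    return affected_fraction / remaining


def average_cpi(mix, category_cpi):
    """Weighted average CPI: sum of (instruction fraction * category CPI)."""
    return sum(fraction * category_cpi[name] for name, fraction in mix.items())


# Values transcribed from the Example Measurements table.
CATEGORY_CPI = {
    "load/store": 1.4, "branch": 1.8, "jump": 1.2, "fp add": 2.0,
    "fp sub": 4.0, "fp mul": 5.0, "fp div": 19.0, "other": 1.0,
}
GCC_MIX = {"load/store": 0.33, "branch": 0.16, "jump": 0.02, "other": 0.49}
SPICE_MIX = {
    "load/store": 0.40, "branch": 0.08, "jump": 0.02, "fp add": 0.05,
    "fp sub": 0.03, "fp mul": 0.06, "fp div": 0.03, "other": 0.33,
}

needed = required_improvement(0.33, 1.5)
gcc_cpi = average_cpi(GCC_MIX, CATEGORY_CPI)
spice_cpi = average_cpi(SPICE_MIX, CATEGORY_CPI)

if needed is None:
    print("1.5x speedup is not attainable by improving loads/stores alone")
else:
    print(f"loads/stores must become {needed:.1f}x faster")
print(f"average CPI: gcc = {gcc_cpi:.3f}, spice = {spice_cpi:.3f}")
```

Under these assumptions the 1.5x target is out of reach: the unaffected 67% of the time already exceeds the 66.7% budget that a 1.5x speedup allows. The weighted averages come out to roughly 1.26 for gcc and 2.15 for spice, with floating point divide alone contributing 0.57 to the spice figure despite being only 3% of its instructions.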


