Can We Understand Performance Counter Results? Vince Weaver ICL - PowerPoint PPT Presentation

Can We Understand Performance Counter Results? Vince Weaver ICL Lunch Talk 23 July 2010

How Do We Know if Counters are Working? Three common failures: • Wrong counter (PAPI, Kernel, User) • Counter works but gives wrong values • Counter is giving “right” values but documentation is wrong 1

Deterministic Events Easiest to Validate • Retired Instructions • Retired Branches • Retired Loads and Stores • Retired Multiplies and Divides • Retired µ ops • Retired Floating Point and SSE • Other ( fxch , cpuid , move operations, serializing instructions, memory barriers, and not-taken branches) 2

Ideal Deterministic Events • Results are same run-to-run • Event is frequent enough to be useful • The expected count can easily be determined by code inspection • Available on many processors 3

Retired Instruction Overcount Estimated Timer Frequency (Hz) 253.perlbmk.535 253.perlbmk.704 253.perlbmk.957 253.perlbmk.535 1000 253.perlbmk.957 250 100 253.perlbmk.704 550MHz Pentium III 100Hz 250Hz 1000Hz Estimated Timer Frequency (Hz) 253.perlbmk.704 176.gcc.expr 253.perlbmk.957 1000 176.gcc.200 250 100 176.gcc.scilab 2.2GHz Phenom 100Hz 250Hz 1000Hz Estimated Timer Frequency (Hz) 253.perlbmk.535 1000 176.gcc.166 250 176.gcc.166 100 2.8GHz Pentium 4 100Hz 250Hz 1000Hz 4

Tracking Down the Source of Overcounts • Work backward from existing benchmarks? • Assembly Language! 5

Contributors to Instruction Count on x86 64 Expected Count +1 for every Hardware Interrupt +1 for each memory page touched +1 for first floating point ins Processor Errata Undocumented processor quirks 6

Retired Instruction Results machine Raw Results Adjusted Results Adjustments Made Expected 226,990,030 226,990,030 Core2 10,793 ± 40 12 ± 1 HW Int Atom 11,601 ± 495 -43 ± 12 HW Int Nehalem 11,794 ± 1316 2 ± 7 HW Int Nehalem-EX 11,915 ± 9 6 ± 2 HW Int Pentium D R 2,610,571 ± 8 200,561 ± 8 Instr Double Counts Pentium D C 10,794 ± 28 -52 ± 5 HW Int Phenom 310,601 ± 11 11 ± 0 HW Int, FP Except Istanbul 311,830 ± 78 9 ± 1 HW Int, FP Except Pin 2.51868e9 ± 0 0 ± 0 Count rep string as 1 Qemu -16,410,000 ± 0 Valgrind -6,909,896 ± 0 7

Retired Stores Results machine Raw Results Adjusted Results Adjustments Made Expected 24,060,000 24,060,000 Core2 0 ± 0 0 ± 0 Atom n/a n/a Nehalem 411,632 ± 1483 410,014 ± 1 HW Int Nehalem-EX 411,914 ± 6 410,018 ± 1 HW Int Pentium D -12,880,000 ± 0 Phenom n/a n/a Istanbul n/a n/a Pin 802,180,000 ± 0 980,000 ± 0 Count rep string as 1 Qemu n/a n/a Valgrind -7,542,176 ± 0 8

Retired Floating Point machine FP1 FP2 SSE Core2 73,500,376 ± 140 40,299,997 ± 0 23,200,000 ± 0 Atom 38,800,000 ± 0 0 ± 0 88,299,597 ± 792 Nehalem 50,150,648 ± 140 17,199,998 ± 1 24,201,639 ± 957 Nehalem-EX 50,155,704 ± 562 17,199,998 ± 2 24,007,005 ± 197,401 Pentium D 100,400,262 ± 9 140,940,555 ± 39,287 53,149,435 ± 522,879 Phenom 26,600,001 ± 0 112,700,001 ± 0 15,800,000 ± 0 Istanbul 26,600,001 ± 0 112,700,001 ± 0 15,800,000 ± 0 9

Other Architectures • ARM – Cannot select only userspace events • ia64 – Loads, Stores, Instructions all deterministic • POWER – On Power6 Instructions is deterministic, Branches is not • SPARC – on Niagara1, Instructions is deterministic 10

Non-deterministic Events • Cache and Memory Related • Branch Predictor • Cycles • Stalls 11

Simplistic Cache Model 0 1 2 3 4 5 511 12

Simplistic Cache Model 0 1 2 3 4 5 511 13

Simplistic Cache Model �� 0 �� 1 2 3 4 5 511 14

L1 Data Cache Accesses float array[1000],sum = 0.0; PAPI_start_counters(events,1); for(int i=0; i<1000; i++) { sum += array[i]; } PAPI_stop_counters(counts,1); 21

Can We Understand Performance Counter Results? Vince Weaver ICL - PowerPoint PPT Presentation

Can We Understand Performance Counter Results? Vince Weaver ICL Lunch Talk 23 July 2010 How Do We Know if Counters are Working? Three common failures: Wrong counter (PAPI, Kernel, User) Counter works but gives wrong values Counter

8051 Serial Port and Timer/Counter Serial Port Timer Counter Chatchai Jantaraprim

The UN Global Counter- -Terrorism Strategy Terrorism Strategy The UN Global Counter The UN

Counter Braids: A novel counter architecture Balaji Prabhakar Balaji Prabhakar Stanford

For Loops and Arrays November 13, 2008 Counting Initialize counter Test counter against limit

Decidable Problems for Counter Systems Day 1 Introduction to Counter Systems St ephane Demri

Counter/Timers Overview ATmega328P has two 8-bit and one 16-bit counter/timers. Unit C

MEDI-MEAL A Reference Guide to Over-the-Counter Drugs Designed by Alexander Chen and Fiona Luo

Overview Narcotics Trafficking Operational picture Counter narcotics Strategies

Reversal-Bounded Counter Machines St ephane Demri LSV, CNRS, ENS Cachan Workshop on Logics

Below the Stomach 121 Counter Attack Lines close close Bearish Bullish Bearish Bullish 122

DSN02 Block Diagram. Use CORDIC IP Core. Counter limit n step wrap clock reset

8051 Serial Port and Timer/Counter Serial Port Timer Counter Chatchai Jantaraprim

Pro-Forestry Counter-Movement Study Overheads 2 3 COUNTER-MOVEMENT HYPOTHESES: VALUES 1.

Decidable Problems for Counter Systems Day 5 Model-Checking Counter Systems St ephane Demri

Revisiting Counter Mode to Repair Galois/Counter Mode Bo Zhu, Yin Tan and Guang Gong University

Decidability and complexity issues for subclasses of counter systems Lecture 4 Counter automata

Biological Organisation as the True Foundation of Reality Brian D. Josephson MindMatter

ECEN 5032 Data Networks Wireless PANs/MANs Peter Mathys mathys@colorado.edu University of

Transition a and Care M e Managem emen ent S Services es FY2020 Adrienne Weede, LCSW

Generic Theses on Spontaneous Wave Function Collapse Lajos Disi Wigner Research Centre for

A UDITING THE M EDICAID P ROGRAM Christopher Holder, CFE OIG Senior Auditor Office of Inspector

Embedded analy+cs delivers system-wide visibility for debug, safety, security and more... Design

Topological Complexity and related invariants Lucile Vandembroucq Centro de Matem atica -

Invariant Ricci-flat K ahler metrics on tangent bundles of compact symmetric spaces Jos e

Sambuz

Useful Links

Newsletter

Mail Us