
Modeling Performance and Energy Efficiency of Application Codes
Shirley Moore
University of Texas at El Paso
svmoore@utep.edu
10/27/12, 12th UTEP/NMSU Workshop

Introduction
• Current trends in HPC put great focus on constraining power consumption without decreasing performance.
• Multicore systems are hierarchical and can include heterogeneous components.
• Understanding the mapping of scientific applications onto multicore and heterogeneous systems is necessary to optimize performance and power consumption.
• Goal: Efficient use of multicore and heterogeneous systems by scientific applications in terms of runtime and power consumption.

Performance-Power Co-Modeling
• Goals
  – Understand and predict execution time and energy consumption of a given code or kernel
  – Devise DVFS strategies that reduce energy consumption with minimal effect on performance
• Models vary from purely analytical to empirically based statistical models based on regression analysis.
• Different models may be required for different classes of applications and even different phases of the same application.

What is DVFS?
• Dynamic Voltage Frequency Scaling
• Changing the frequency and operating voltage of a processor based on performance requirements in order to reduce energy consumption
• Power: $P = cV^2F$
  – Linear reduction in voltage and frequency yields cubic reduction in power
• Energy consumption is power integrated over time.
• Voltage transitions on order of tens of nanoseconds with on-chip voltage regulators.
• A DVFS strategy is a schedule for changing voltage levels during application execution.
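The cubic relationship and the energy integral can be checked numerically. The following is a minimal sketch, not part of the original slides: it assumes voltage and frequency are scaled by the same factor, and the function names and the baseline values for c, V, and F are hypothetical, chosen only for illustration.

```python
def dynamic_power(c, voltage, frequency):
    """Dynamic power P = c * V^2 * F, as on the slide."""
    return c * voltage ** 2 * frequency


def energy(power_samples, dt):
    """Approximate energy as power integrated over time (rectangle rule).

    power_samples: power readings in watts, one per interval of length dt.
    """
    return sum(p * dt for p in power_samples)


# Hypothetical baseline values (assumptions, not measured data).
C = 1.0e-9   # effective switched capacitance, farads
V0 = 1.2     # operating voltage, volts
F0 = 2.0e9   # clock frequency, hertz

scale = 0.5  # scale voltage and frequency by the same factor
p0 = dynamic_power(C, V0, F0)
p1 = dynamic_power(C, V0 * scale, F0 * scale)

# When V and F scale together, power scales with the cube of the factor.
power_ratio = p1 / p0
```

Halving both voltage and frequency in this sketch cuts power to one eighth, but whether energy also drops depends on how much longer the code runs at the lower frequency.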

CPU vs. Memory Bound Applications
• For a totally CPU-bound computation (no main memory accesses):
  Given execution time $t_0$ at CPU frequency $f_0$ and a target CPU frequency $f_{new}$, the execution time $t_{new}$ is given by
  $t_{new} = t_0 \cdot \frac{f_0}{f_{new}}$
• A totally memory-bound computation (all execution time is spent accessing memory) experiences no slowdown at a slower CPU clock frequency.
• Most applications are somewhere in between:
  $t_{new} = \text{memory access time} + \frac{f_0}{f_{new}} \cdot (\text{compute time at } f_0)$
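The three cases above can be expressed as one function. This is a minimal sketch under the slide's assumptions (memory access time is independent of CPU frequency, compute time scales inversely with it); the function name `predict_time` and the sample timings are hypothetical.

```python
def predict_time(t_mem, t_cpu_f0, f0, f_new):
    """Predict execution time at frequency f_new.

    t_mem:    time spent in memory accesses (assumed frequency independent)
    t_cpu_f0: compute time measured at the baseline frequency f0
    """
    if f0 <= 0 or f_new <= 0:
        raise ValueError("frequencies must be positive")
    return t_mem + (f0 / f_new) * t_cpu_f0


# Hypothetical timings in seconds, frequencies in GHz.
cpu_bound = predict_time(t_mem=0.0, t_cpu_f0=10.0, f0=2.0, f_new=1.0)
mem_bound = predict_time(t_mem=10.0, t_cpu_f0=0.0, f0=2.0, f_new=1.0)
mixed = predict_time(t_mem=4.0, t_cpu_f0=6.0, f0=2.0, f_new=1.0)
```

With these inputs, halving the frequency doubles the CPU-bound time, leaves the memory-bound time unchanged, and stretches only the compute portion of the mixed case.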

How to Measure Memory Access and Compute Times?
• Cycle breakdown model:
  Total cycles = Retired cycles + Non-retired cycles + Stall cycles
• Stall cycles can be approximately decomposed into a sum of counts of events causing stalls ($N_i$) weighted by their penalties ($P_i$):
  $\text{Counted\_Stall\_Cycles} = \sum_i P_i \cdot N_i$
• Errors range from 5-30%
  – Some needed counters are not available (varies by platform).
  – Penalties are estimates and aren’t really constants.
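The weighted sum can be sketched in a few lines. The event names, counts, and penalties below are hypothetical placeholders, not values from any particular platform; on real hardware the counts would come from performance counters and the penalties from manuals or benchmarks.

```python
def counted_stall_cycles(event_counts, penalties):
    """Estimate stall cycles as the sum over events of penalty * count."""
    return sum(penalties[name] * count for name, count in event_counts.items())


# Hypothetical event counts N_i (assumed, for illustration only).
counts = {"l1d_miss": 1000, "l2_miss": 100, "branch_mispredict": 50}

# Hypothetical per-event penalties P_i in cycles (estimates, not constants).
penalty_cycles = {"l1d_miss": 10, "l2_miss": 200, "branch_mispredict": 15}

stalls = counted_stall_cycles(counts, penalty_cycles)
```

Because each penalty is itself an estimate, the result is best read as an approximation of the measured stall cycles, consistent with the error range quoted above.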

PerfExpert LCPI Metric
• M. Burtscher, B. Kim, J. Diamond, J. McCalpin, L. Koesterke, and J. Browne. “PerfExpert: An Easy-to-Use Performance Diagnosis Tool for HPC Applications”, SC10, November 2010 (to appear)
• Combines performance counter measurements with architectural parameters to compute upper bounds on local cycle-per-instruction (LCPI) contributions

Six LCPI Categories
• Data memory accesses:
  (L1_DCA * L1_lat + L2_DCA * L2_lat + L2_DCM * Mem_lat) / TOT_INS
• Branching contribution:
  (BR_INS * BR_lat + BR_MSP * BR_miss_lat) / TOT_INS
• Other categories:
  – Instruction accesses
  – Floating point instructions
  – Instruction TLB accesses
  – Data TLB accesses
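The two formulas above translate directly into code. This is a minimal sketch, not PerfExpert's implementation: the dictionary keys mirror the names on the slide, while the counter values and latencies are hypothetical numbers picked for easy arithmetic.

```python
def lcpi_data(counts, lat):
    """Upper bound on the data memory access contribution to LCPI."""
    cycles = (counts["L1_DCA"] * lat["L1_lat"]
              + counts["L2_DCA"] * lat["L2_lat"]
              + counts["L2_DCM"] * lat["Mem_lat"])
    return cycles / counts["TOT_INS"]


def lcpi_branch(counts, lat):
    """Upper bound on the branching contribution to LCPI."""
    cycles = (counts["BR_INS"] * lat["BR_lat"]
              + counts["BR_MSP"] * lat["BR_miss_lat"])
    return cycles / counts["TOT_INS"]


# Hypothetical counter readings for one code region (assumed values).
counts = {
    "TOT_INS": 1_000_000,
    "L1_DCA": 400_000, "L2_DCA": 20_000, "L2_DCM": 2_000,
    "BR_INS": 100_000, "BR_MSP": 5_000,
}

# Hypothetical latencies in cycles (assumed system constants).
latencies = {
    "L1_lat": 3, "L2_lat": 9, "Mem_lat": 310,
    "BR_lat": 1, "BR_miss_lat": 10,
}

data_lcpi = lcpi_data(counts, latencies)
branch_lcpi = lcpi_branch(counts, latencies)
```

Comparing the per-category values for a region indicates which category is the most likely bottleneck; in this made-up example the data access bound dominates the branching bound.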

Hardware Counters
Hardware performance counters available on most modern microprocessors can provide insight into:
1. Whole program timing
2. Cache behaviors
3. Branch behaviors
4. Memory and resource access patterns
5. Pipeline stalls
6. Floating point efficiency
7. Instructions per cycle
Hardware counter information can be obtained with:
  – Subroutine or basic block resolution
  – Process or thread attribution

PAPI Background
• Performance API - http://icl.utk.edu/papi/
• Middleware to provide a consistent programming interface for the performance counter hardware found in most major micro-processors
• De facto standard
• Countable events are defined in two ways:
  – Platform-neutral preset events
  – Platform-dependent native events
• Presets can be derived from multiple native events.
• Events can be multiplexed if physical counters are limited.
• Statistical sampling implemented by:
  – Hardware overflow if supported by the platform
  – Software overflow with timer driven sampling

PAPI user-defined event mechanism
• Allows users to define their own metrics
  – User can combine events and constants in an expression to define and name a new metric
  – Maps the new metric to events available on a platform without the need to re-install PAPI
• User-defined event names can be used in PAPI library calls the same way as preset and native events.
• User-defined events can be used with end-user performance tools such as TAU and Scalasca without modifying those tools.

Specification of user-defined events
• Event specification file
  – parsed at PAPI_library_init time, or
  – anytime afterwards with PAPI_set_opt call
• Also allow static definition at PAPI compile time
• File named by
  – PAPI_USER_EVENTS_FILE environment variable, or
  – option to PAPI_set_opt
• Event defined by
  Event, OPERATION_STRING

System constants
• e.g., Cache and memory latencies used in event definitions
• Not strictly “constant” → upper bounds
• Some obtained from architecture manuals
• We provide a set of benchmarks for measuring constants on different platforms and maintain a database of results.
  – LMBench
  – STREAM
