

  1. Hardware/Software Divergence in Accelerator Computing Volodymyr Kindratenko Innovative Systems Laboratory National Center for Supercomputing Applications University of Illinois at Urbana-Champaign

  2. How do you envision the role of accelerators in parallel computing and high performance computing in the next decade, including their role in exascale systems?
     • As a means to claim the highest peak performance, but not as a means to achieve the highest efficiency at scale
       • Tianhe-1A was #1 on the Top500 in 2010
       • Its Rmax was about 54% of its Rpeak
     • Accelerators will play a minimal role in extreme-scale systems
       • Sure, systems such as Titan and Blue Waters will have a lot of them
       • But we have yet to see what performance application scientists will achieve on these systems using GPUs at scale
     • Accelerators will play a substantial role in small-scale systems
       • A system with O(10) GPUs can replace a cluster with O(1000) CPU cores for applications that can tolerate the strict scaling limitations imposed by accelerators
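For reference, the 54% figure above is simply the ratio of the published Top500 numbers; assuming the November 2010 listing for Tianhe-1A (Rmax about 2.566 PFLOPS against Rpeak about 4.701 PFLOPS):

    Rmax / Rpeak = 2.566 / 4.701 ≈ 0.546, i.e., roughly 54%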

  3. How do you view the hardware/software divergence in accelerator computing?
     • It is getting worse
     • We know how to build systems with O(1M) CPU cores
     • But we do not know how to write software that can take advantage of such systems
     • Accelerators add another layer of complexity to already overly complex systems
     • Heterogeneity in hardware also means a greater degree of divergence in software: host code, accelerator code, communication layer, etc.

  4. Which accelerator (hardware) do you think will have advantages in the next 10 years and most likely win the battle in the next decade, and why?
     • Intel Many Integrated Core (MIC) Architecture-like accelerators will eventually win the battle
       • The architecture is sound (many cores, wide vector units, high memory bandwidth)
       • The programming model is very flexible, ranging from kernel offload to a co-processor to running the entire application on the MIC
       • Programming tools are conventional: icc, idb, vtune
       • Programming languages are familiar: C/C++ with pragmas and libraries (see the offload sketch after this slide)
       • Software development effort on the MIC is comparable to a performance-tuning effort rather than to a code reimplementation
     • Oh yes, when the "war" is over, what we consider today to be an accelerator will be in our mainstream processor
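A minimal sketch of the kernel-offload style described on this slide, assuming the Intel compiler's offload pragmas for the MIC; the kernel, function name, and array size are illustrative and not taken from the presentation:

    /* scale.c: build with icc. The offload pragma ships the kernel and the
       array to the MIC card; the code falls back to the host if no
       coprocessor is present. */
    #include <stdio.h>

    #define N 1024

    __attribute__((target(mic)))            /* also compile this function for the MIC */
    void scale(float *a, float s, int n)
    {
        #pragma omp parallel for             /* spread the loop over the many cores */
        for (int i = 0; i < n; i++)
            a[i] *= s;
    }

    int main(void)
    {
        static float a[N];
        for (int i = 0; i < N; i++)
            a[i] = (float)i;

        /* 'inout' copies the array to the card before the call and back after. */
        #pragma offload target(mic) inout(a : length(N))
        scale(a, 2.0f, N);

        printf("a[10] = %.1f\n", a[10]);     /* expect 20.0 */
        return 0;
    }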

  5. Why not NVIDIA GPUs?
     • Market forces are working against NVIDIA
       • With the introduction of APUs, Intel and AMD are taking away the low-end discrete GPU market from NVIDIA
       • Without this low-end mass market, NVIDIA will have a harder time justifying the expense of developing high-end GPUs
       • The market for high-end (HPC) GPUs is too small to sustain NRC
     • The software development effort necessary to efficiently utilize GPUs is substantial, despite all the efforts by NVIDIA and its partners to develop tools and compilers
     • The programming model (kernel offload to a co-processor) is inherently limited
     • The NVIDIA CUDA SDK is great, but it locks developers into a particular software-hardware environment that is incompatible with the rest of the world
     • Other approaches, such as OpenCL, have yet to deliver the performance levels achievable with CUDA

  6. What programming model/library for accelerated computing do you think will have advantages in the next 10 years and most likely win the battle in the next decade, and why?
     • Anything that is easy to use without sacrificing performance
     • Libraries, for applications that heavily rely on standard libraries (FFT, linear algebra, …); see the BLAS sketch after this slide
     • Kernel offload, for codes with distinct, well-defined, and dense computational kernels
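To make the library point concrete, here is a small sketch (not from the presentation) that calls the standard CBLAS interface; whichever BLAS implementation is linked at build time, including an accelerator-backed one, performs the multiplication without any change to the application source. The matrix size is arbitrary:

    #include <stdio.h>
    #include <cblas.h>                       /* standard C interface to BLAS */

    #define N 256

    int main(void)
    {
        static double A[N * N], B[N * N], C[N * N];

        for (int i = 0; i < N * N; i++) {
            A[i] = 1.0;
            B[i] = 2.0;
            C[i] = 0.0;
        }

        /* C = 1.0 * A * B + 0.0 * C; the linked BLAS decides how and where it runs. */
        cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                    N, N, N, 1.0, A, N, B, N, 0.0, C, N);

        printf("C[0] = %.1f (expected %.1f)\n", C[0], 2.0 * N);
        return 0;
    }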

  7. What research challenges do you envision will be most critical and should be addressed in the coming years for the success of accelerator computing?
     • Ease of use
     • Programmer productivity
     • Automation: auto-parallelization, auto-vectorization, auto-tuning, … (see the loop sketch after this slide)
     • The communication bottleneck
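As a small illustration of the auto-vectorization item above (a hand-written sketch, not from the presentation), the loop below is written so a compiler can map it onto wide vector units without intrinsics; 'restrict' promises the arrays do not alias:

    /* y = a*x + y: independent iterations, no aliasing, unit stride;
       an auto-vectorizer can turn this loop into vector instructions. */
    void saxpy(int n, float a, const float *restrict x, float *restrict y)
    {
        for (int i = 0; i < n; i++)
            y[i] = a * x[i] + y[i];
    }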
