

SLIDE 1

Towards Exascale Computing

Yutaka Ishikawa University of Tokyo RIKEN AICS

SLIDE 2

Outline of This Talk

  • Activities at U. of Tokyo and RIKEN AICS

– Many-core based PC cluster
– System software stack
– Prototype system

  • Rethinking how to use the MPI library in state-of-the-art supercomputers

– Do MPI_Isend/MPI_Irecv really help with overlapping communication and computation?

SLIDE 3

Post T2K Todai


[Roadmap figure, 2005–2020. Production systems: Hitachi SR11000, T2K Todai (HA8000 cluster, 140 TFlops), Hitachi SR16000/M1 (54.9 TFlops), Fujitsu FX10 (1 PFlops). R&D: the "K" Computer (10 PFlops), 40 to 100 PFlops systems, and a 100+ PFlops exascale supercomputer.]

Kashiwa Campus

  • PRIMEHPC FX10
– 4800 nodes (16 cores/node)
– 1.13 PFlops
– 150 TByte memory
  • Hitachi SR16000/M1
– 56 nodes (32 cores/node)
– 54.9 TFlops
– 11200 GByte memory

[Figure labels: Hongo Campus, FX10, HA8000, SR16000/M1]

SLIDE 4

Variations of Many-core based machines

  • Many-core board connected to PCI-Express — e.g., Intel Knights Ferry, Knights Corner
  • Many-core chip connected to the system bus — not existing so far
  • Many-core inside the CPU die — cf. Intel Sandy Bridge with GPU
  • Many-core only — not existing so far

[Diagrams show the host CPU, IOH, memory, and many-core parts for each variation.]

http://pc.watch.impress.co.jp/docs/column/kaigai/20100412_360173.html

SLIDE 5

Post T2K System Image: Requirements

  • Both the requirements of large data analysis and of number-crunching applications must be satisfied:

– Performance of I/O
– Performance of floating-point operations
– Parallel performance

Many Core Units

  • Number crunching

Host CPU Units

  • Controlling the many-core units
  • Processing data analysis code
  • Handling the file system

[System diagram: many-core nodes, each containing many-core units, an interconnect, a host CPU unit, and an SSD, connected through a network for nodes and a storage area network.]

SLIDE 6

Post T2K System Image: Execution Image

(Same requirements, unit roles, and system diagram as on Slide 5.)

Co-execution of two types of jobs within a partition:

  • Many-cores: number-crunching application (the host CPU is used for its file I/O and memory swap)
  • Host CPUs: I/O-intensive application

SLIDE 7

Post T2K System Image: Execution Image

(Same requirements, unit roles, and system diagram as on Slide 5.)

One job executing within a partition:

  • Many-cores: computation and communication
  • Host CPUs: memory share/swap, communication, and I/O

SLIDE 8

Post T2K System Image: Execution Image

(Same requirements, unit roles, and system diagram as on Slide 5.)

One job executing within a partition:

  • Many-cores: computation and communication
  • Host CPUs: memory share/swap, communication, and I/O

The pseudocode below illustrates the execution pattern:

do {
    for (…) {
        for (…) {
            for (…) {
                /* Computation */
                /* Due to the limited memory in the many-core units,
                   data is swapped out to memory on the host CPU */
            }
        }
    }
    /* Most of the data is now located in host memory */
    /* Data exchange with remote nodes */
} while (…);

(The nested loops are the computation phase; the data exchange at the end of each iteration is the communication phase.)

SLIDE 9

Post T2K System Software Stack

  • AAL (Accelerator Abstraction Layer)

– Provides the low-level accelerator interface
– Enhances portability of the micro kernel

  • IKCL (Inter-Kernel Communication Layer)

– Provides general-purpose communication and data-transfer mechanisms

  • SMSL (System Service Layer)

– Provides basic system services on top of the communication layer

[Software stack diagram: on the host, the Linux kernel with AAL-Host, IKCL, SMSL, a device driver, glibc, the MPI and next-generation communication libraries (on a basic communication library), and a P2P parallel file system; on the many-core, a micro kernel with AAL-Manycore, IKCL, SMSL, glibc for many-core, and the same communication libraries. The many-core is attached to the host over PCI-Express, and the InfiniBand network card is attached to the host. Two cases are shown: a bootable many-core and a non-bootable many-core.]

Design Criteria

  • Cache-aware system software stack
  • Scalability
  • Minimum overhead of communication facility
  • Portability

SLIDE 10

Post T2K System Software Stack

(Stack overview, diagram, and design criteria as on Slide 9.)

Cache-aware system software stack: because many-core processors have small caches and limited memory bandwidth, the cache footprint of both user and system program execution should be minimized.

SLIDE 11

Post T2K System Software Stack

(Stack overview, diagram, and design criteria as on Slide 9.)

Scalability: one scalability issue arises from enlarging the internal data structures needed to manage resources not only for the local node but also for other nodes; a new resource-management technique should be designed.

SLIDE 12

Post T2K System Software Stack

(Stack overview, diagram, and design criteria as on Slide 9.)

Minimum overhead of the communication facility: minimal overhead for communication between cores, as well as direct memory access between many-core units, is required for strong scaling.

SLIDE 13

Post T2K System Software Stack

(Stack overview, diagram, and design criteria as on Slide 9.)

Portability: easy software migration from existing cluster systems must be maintained.

SLIDE 14

Current Status

  • SMSL, AAL, and IKCL — Taku Shimosawa (Ph.D. student)
  • HIDOS prototype kernel — Taku Shimosawa (Ph.D. student)
  • Direct communication in MIC — Min Si (Master's student)
  • Paging system and file I/O — Yuki Matsuo (Bachelor's student)

[Diagram: the host/many-core software stack running both on a real many-core device attached over PCI-Express with an InfiniBand network card, and on MEE, the Many-core Emulation Environment.]

  • MEE (Many-core Emulation Environment): lets developers who cannot access MIC hardware work with the software stack — Taku Shimosawa (Ph.D. student)

SLIDE 15


DCFA: Direct Communication Facility for Accelerator

  • An accelerator is a PCI-Express device, and thus it cannot configure/initialize another device such as a communication device.
  • Even though the PCI-Express address is known to a GPU, the GPU cannot issue commands to a communication device.
  • The Mellanox GPUDirect technology does not provide direct communication between GPUs; instead, data is copied to memory in the host CPU and then transferred to the remote host, …


  • If the MIC knows the PCI-Express address of a communication device, it may issue commands to that device.
  • However, the MIC cannot receive signals from PCI-Express devices.

[Figure: communication in GPU vs. communication in MIC vs. DCFA (Direct Communication Facility for Accelerator), designed and implemented at U. of Tokyo. The InfiniBand card is configured and initialized by the host CPU; in DCFA, communication commands are issued by the many-core rather than by the host CPU.]

SLIDE 16

Rethinking of MPI Library Usage in state-of-the-art Supercomputers

  • Misunderstanding the semantics of the MPI_Isend / MPI_Irecv primitives


for (…) {
    /* Computation */
    MPI_Irecv(rbuf, count, MPI_DOUBLE, src, tag, MPI_COMM_WORLD, &req[0]);
    MPI_Isend(sbuf, count, MPI_DOUBLE, dest, tag, MPI_COMM_WORLD, &req[1]);
    /* Computation */
    MPI_Waitall(2, req, stat);
}

The programmer thinks that communication and computation are overlapping.
SLIDE 17

Rethinking of MPI Library Usage in state-of-the-art Supercomputers

  • Misunderstanding the semantics of the MPI_Isend / MPI_Irecv primitives


(Same MPI_Irecv/MPI_Isend loop as on Slide 16.)

Eager Protocol

  • When a send primitive is posted, the message is immediately sent to the receiver. Pros: small latency. Cons: if the message is large and the receiver has not yet posted a matching receive, a large buffer has to be allocated and the data copied.

Rendezvous Protocol

  • See the figure below: after the send and the matching receive have been posted, control messages are exchanged between sender and receiver, and only then does the data transfer take place.

[Figure: rendezvous protocol between sender and receiver — send, control message, recv, control message, data transfer.]

Progress of the data transfer is postponed until the application calls MPI functions such as MPI_Waitall.
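A minimal sketch (my own illustration, not from the slides) of one common way to work around this: periodically calling MPI_Testall during the computation gives the MPI library a chance to progress a rendezvous transfer before MPI_Waitall is reached. Here rbuf, sbuf, count, src, dest, tag, nsteps, and compute_chunk() are placeholders.

    /* Sketch: drive MPI progress during computation so that a rendezvous
       transfer is not postponed until MPI_Waitall. */
    MPI_Request req[2];
    MPI_Status  stat[2];
    int done = 0;

    MPI_Irecv(rbuf, count, MPI_DOUBLE, src,  tag, MPI_COMM_WORLD, &req[0]);
    MPI_Isend(sbuf, count, MPI_DOUBLE, dest, tag, MPI_COMM_WORLD, &req[1]);

    for (int step = 0; step < nsteps; step++) {
        compute_chunk(step);                  /* a slice of the computation     */
        if (!done)
            MPI_Testall(2, req, &done, stat); /* lets the library make progress */
    }
    MPI_Waitall(2, req, stat);                /* returns immediately if done    */

This is only a workaround; the persistent-communication approach on the following slides is the cleaner solution advocated in the talk.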

SLIDE 18

Rethinking of MPI Library Usage in state-of-the-art Supercomputers

(Same MPI_Irecv/MPI_Isend loop and eager/rendezvous discussion as on Slides 16 and 17.)

So, what should we do?

SLIDE 19

Rethinking of MPI Library Usage in state-of-the-art Supercomputers

  • Persistent Communication

MPI_Irecv/MPI_Isend version:

for (…) {
    /* Computation */
    MPI_Irecv(rbuf, count, MPI_DOUBLE, src, tag, MPI_COMM_WORLD, &req[0]);
    MPI_Isend(sbuf, count, MPI_DOUBLE, dest, tag, MPI_COMM_WORLD, &req[1]);
    /* Computation */
    MPI_Waitall(2, req, stat);
}

Persistent-communication version:

MPI_Recv_init(rbuf, count, MPI_DOUBLE, src, tag, MPI_COMM_WORLD, &req[0]);
MPI_Send_init(sbuf, count, MPI_DOUBLE, dest, tag, MPI_COMM_WORLD, &req[1]);
for (i = 0; …; i++) {
    /* Computation */
    MPI_Startall(2, req);
    /* Computation */
    MPI_Waitall(2, req, stat);
}

MPI_Recv_init, MPI_Send_init: initialize the receive and send endpoints; the request structures returned by these functions are used to issue the actual communication. MPI_Start / MPI_Startall: issue the actual communication specified by the request structures.
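As a concrete illustration, below is a small self-contained sketch of this pattern (my own example, not from the slides): each rank exchanges a buffer with its ring neighbors, setting the transfers up once with MPI_Recv_init/MPI_Send_init and issuing them repeatedly with MPI_Startall/MPI_Waitall.

    /* persistent_ring.c — sketch of the persistent-communication pattern.
       Build with e.g.:  mpicc persistent_ring.c -o persistent_ring        */
    #include <mpi.h>
    #include <stdio.h>

    #define N     1024
    #define NITER 100

    int main(int argc, char **argv)
    {
        double sbuf[N], rbuf[N];
        MPI_Request req[2];
        MPI_Status  stat[2];
        int rank, size, i, it;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        int dest = (rank + 1) % size;         /* right neighbor */
        int src  = (rank - 1 + size) % size;  /* left neighbor  */

        for (i = 0; i < N; i++)
            sbuf[i] = (double)rank;

        /* Set the communication up once, outside the loop. */
        MPI_Recv_init(rbuf, N, MPI_DOUBLE, src,  0, MPI_COMM_WORLD, &req[0]);
        MPI_Send_init(sbuf, N, MPI_DOUBLE, dest, 0, MPI_COMM_WORLD, &req[1]);

        for (it = 0; it < NITER; it++) {
            /* Computation that does not touch sbuf/rbuf can go here. */
            MPI_Startall(2, req);             /* issue the pre-set transfers */
            /* More computation, potentially overlapped with the transfers. */
            MPI_Waitall(2, req, stat);
        }

        MPI_Request_free(&req[0]);
        MPI_Request_free(&req[1]);

        if (rank == 0)
            printf("rank 0 received %g from rank %d\n", rbuf[0], src);

        MPI_Finalize();
        return 0;
    }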

SLIDE 20

Rethinking of MPI Library Usage in state-of-the-art Supercomputers

  • Persistent Communication

(Same persistent-communication code as on Slide 19.)

MPI_Recv_init, MPI_Send_init: initialize the receive and send endpoints; the request structures they return are used to issue the actual communication. MPI_Start / MPI_Startall: issue the actual communication specified by the request structures. Since all communication patterns are already known when MPI_Start/MPI_Startall is called, the communication hardware can be exploited: the K computer and the Fujitsu FX10 (the commercialized version of K) have four DMA engines for communication, so low-latency RDMA-based communication is realized in the persistent communication feature.

SLIDE 21

Rethinking of MPI Library Usage in state-of-the-art Supercomputers

  • Persistent Communication

[Plot: latency (seconds) versus message size (bytes), comparing the RDMA-based implementation ("RDMA") with the original one ("ORIGINAL").]

Preliminary result: MPI_Send_init, MPI_Recv_init, MPI_Start, MPI_Startall, MPI_Wait, and MPI_Waitall are replaced using the MPI profiling interface, and the same program was run with both the RDMA-based implementation and the original one.

SLIDE 22

Summary

  • Activities at U. of Tokyo and RIKEN AICS

– Many-core based PC cluster
– System software stack
– Prototype system

  • Rethinking how to use the MPI library in state-of-the-art supercomputers

– Do MPI_Isend/MPI_Irecv really help with overlapping communication and computation?
– Persistent communication should be used instead of MPI_Isend/MPI_Irecv

SLIDE 23

Strategic Direction/Development of HPC in JAPAN

2011/10/6, IESP@Cologne

MEXT (Ministry of Education, Culture, Sports, Science, and Technology)

Report on Strategic Direction/Development of HPC in JAPAN (今後のHPCI技術開発に関する報告書, Report on Future HPCI Technology Development)

Council on HPCI Plan and Promotion

  • WG for Applications — Chairmen: Hirofumi Tomita (RIKEN AICS), Junichiro Makino (TITECH)

– Field 1: Life science / drug manufacture
– Field 2: New material / energy creation
– Field 3: Global change prediction for disaster prevention/mitigation
– Field 4: Mono-zukuri (manufacturing technology)
– Field 5: The origin of matter and the universe

  • WG for HPC Systems — Chairman: Yutaka Ishikawa (U. of Tokyo/RIKEN AICS)

– Subleaders: Naoya Maruyama (TITECH), Masaaki Kondo (Elec. …), Yasuo Ishii (NEC), Akihiro Nomura (TITECH), Hiroyuki Takizawa (Tohoku U.), Reiji Suda (U. of Tokyo), Takahiro Katagiri (U. of Tokyo)
– Advisers: Satoshi Matsuoka (TITECH), Kei Hiraki (U. of Tokyo), Hiroshi Nakamura (U. of Tokyo), Taisuke Boku (U. of Tsukuba), Atsushi Hori (RIKEN AICS), Mitaro Namiki (..), Shinji Sumimoto (Fujitsu), Hiroshi Nakashima (Kyoto U.), Mitsuhisa Sato (Tsukuba U.), Akinori Yonezawa (RIKEN AICS), Kengo Nakajima (U. of Tokyo), Ryutaro Himeno (RIKEN), Satoshi Sekiguchi (AIST), Kimihiko Hirao (RIKEN AICS), Akira Ukawa (U. of Tsukuba)

SLIDE 24

Strategic Direction/Development of HPC in JAPAN

(Same MEXT report and HPCI Council/WG organization as on Slide 23.)

  • Discussing a science roadmap for tackling key socio-scientific problems in Japan by the year 2020
  • Demands for the HPC systems needed to carry out those results
  • Studying key technologies for building an HPC system by 2018 that achieves the science roadmap, under constraints of 2000 m² of floor space and 20 to 30 MW of electricity

– Architectures, operating systems, middleware, programming models and languages, math libraries

From summer to winter 2011: three joint meetings of the WGs, with 368 attendees in total.

SLIDE 25

Required Systems to carry out Science Results by 2020

  • Four types of processor architectures have been considered:

– General purpose (GP)
– Capacity-bandwidth oriented (CB)
– Reduced-memory (RM)
– Throughput-oriented (TP)

  • Projection of 2018's systems if industries continue to develop their technologies without driving national projects. Constraints: 20 to 30 MW of electricity, 2000 m² of space.


Source: “Report on Strategic Direction/Development of HPC”


CPU (projected totals per architecture type):

Type | Total CPU Performance (PFLOPS) | Total Memory Bandwidth (PB/s) | Total Memory Capacity (PB)
GP   | 200–400                        | 20–40                         | 20–40
CB   | 50–100                         | 50–100                        | 50–100
RM   | 500–1000                       | 250–500                       | 0.1–0.2
TP   | 1000–2000                      | 5–10                          | 5–10

Network:

Topology               | Injection | P-to-P  | Bisection | Min latency | Max latency
High-radix (Dragonfly) | 32 GB/s   | 32 GB/s | 2.0 PB/s  | 200 ns      | 1000 ns
Low-radix (4D Torus)   | 128 GB/s  | 16 GB/s | 0.13 PB/s | 100 ns      | 5000 ns

Storage:

Total capacity: 1 EB (100 times larger than main memory)
Total bandwidth: 10 TB/s (enough bandwidth to save all data in main memory to disks within 1000 seconds)

(Examples noted on the slide: vector machines, SoC-based designs, GPUs.)

SLIDE 26

Required Systems to carry out Science Results by 2020

(Same four processor-architecture types and 2018 projection as on Slide 25.)

  • Preliminary investigation of the performance gap between 2018's systems and application requirements

– A detailed report will be available at the end of March

[Plot: gap between application requirements and technology trends. Application requirements are plotted by PFLOPS, B/F, and memory capacity (PB) against the four architecture types: GP (汎用型, general purpose), CB (容量・帯域, capacity-bandwidth), RM (メモリ削減, reduced memory), TP (演算重視, compute-oriented); examples noted: vector machines, SoC-based designs, GPUs. Source: "Report on Strategic Direction/Development of HPC".]

SLIDE 27

Concluding Remarks

  • A two-year feasibility study will start in FY2012

– About three groups will be selected

  • After that, the government will decide on development