Tianhe-3 and the Exascale Road in China
Ruibo WANG, National University of Defense Technology
Contents
❑ NUDT & TianHe ❑ the Exascale Road in China ❑ Tianhe-3
NUDT & Tianhe
❑ NUDT
❑ 1953: originally founded in Harbin ❑ 1970: moved to Changsha ❑ 1978: renamed the National University of Defense Technology
[Map: Harbin and Changsha]
NUDT & Tianhe
❑ Galaxy-I
❑ 1983, the 1st supercomputer in China ❑ Peak performance: 100 Mflops ❑ Project started in 1978; widely used in oil exploration and weather forecasting
Galaxy-I supercomputer
NUDT & Tianhe
❑ Galaxy-I
❑ 1983, 100 Mflops
❑ Galaxy-II
❑ 1994, Gflops level ❑ Vector architecture
❑ Galaxy-III
❑ 1997, 13 Gflops ❑ MPP ❑ MIPS CPU
Galaxy-II Galaxy-III
NUDT & Tianhe
❑ TianHe-1, deployed in 2009, 1.2Pflops
❑ Rank No.1 in China ❑ Rank No.5 in Top500 (Nov. 2009)
❑ TianHe-1A, deployed in 2010, 4.7Pflops
❑ Rank No.1 in Top500 (Nov. 2010)
TianHe-1 TianHe-1A
NUDT & Tianhe
❑ TianHe-1A, deployed in 2010, 4.7Pflops
❑ Rank No.1 in Top500 (Nov. 2010) ❑ the 1st time China took the No.1 spot ❑ deployed in the National Supercomputer Center in Tianjin
NUDT & Tianhe
❑ TianHe-2 made its pre-release @ IHPCF2013
❑ International High Performance Computing Forum ❑ http://www.ihpcf.org/ ❑ Changsha, May 2013
NUDT & Tianhe
❑ TianHe-2 ranked No.1 from Jun. 2013 to Nov. 2015
❑ No.1 for 3 years (6 times) ❑ Peak 55 Pflops, Linpack 33.86 Pflops (efficiency worked out below)
- Jun. 2013, Leipzig
- Nov. 2013, Denver
- Jun. 2014, Leipzig
- Nov. 2014, New Orleans
- Jun. 2015, Frankfurt
- Nov. 2015, Austin
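From the figures above, Tianhe-2's Linpack efficiency works out to 33.86 / 55 ≈ 62% of peak.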
NUDT & Tianhe
❑ TianHe-2
❑ 16,000 compute nodes ❑ Frame: 32 compute nodes ❑ Rack: 4 compute frames ❑ Whole system: 125 racks (multiplied out below)
[Photos: compute blade → compute frame → compute rack → full system]
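As a check, the packaging hierarchy multiplies out to the full machine: 32 nodes/frame × 4 frames/rack × 125 racks = 16,000 nodes.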
NUDT & Tianhe
❑ TianHe-2 Background
❑ Sponsored by the 863 High Tech. Program, the Government of Guangdong Province, and the Government of Guangzhou City
❑ deployed in National Supercomputer Center in
Guangzhou (NSCC-GZ)
❑ Oct. 2013: Phase 1 system was moved to NSCC-GZ
NUDT & Tianhe
❑ Jan. 2014: Tianhe-2 began providing service at NSCC-GZ
NUDT & Tianhe
❑ Originally planned to finish its upgrade to Phase 2 in 2015
❑ Replace the KNC with the new-generation KNL ❑ The peak performance would reach 100 Pflops
❑ In early 2015, for various reasons, we turned to the homegrown accelerator to upgrade the system
❑ The Phase 2 system was ready at the end of 2017
NUDT & Tianhe
❑ Comparison of Tianhe-2 & Tianhe-2A
                        Tianhe-2                          Tianhe-2A
Nodes & Performance     16,000 nodes, Intel CPU + KNC,    17,792 nodes, Intel CPU + Matrix-2000,
                        54.9 Pflops                       100.68 Pflops
Interconnect            10 Gbps, 1.57 us                  14 Gbps, 1 us
Memory                  1.4 PB                            3 PB
Storage                 12.4 PB, 512 GB/s                 19 PB, 1 TB/s
Energy Efficiency       17.8 MW, 1.9 Gflops/W             18.5 MW, 5.4 Gflops/W
Programming Environment MPSS for Intel KNC                OpenMP/OpenCL for Matrix-2000
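Note that the upgrade nearly doubled peak performance (54.9 → 100.68 Pflops) at roughly the same power (17.8 → 18.5 MW), i.e. 5.4 / 1.9 ≈ 2.8× better energy efficiency.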
NUDT & Tianhe
❑ Matrix-2000
❑ 4 super-nodes (SN) ❑ 8 clusters per SN ❑ 4 cores per cluster ❑ Core:
❑ Self-defined 256-bit vector ISA ❑ 16 DP flops/cycle per core
❑ Peak performance: 2.4576 Tflops @ 1.2 GHz ❑ Power: ~240 W ❑ 8 DDR4-2400 channels ❑ x16 PCIe Gen3
4 SNs × 8 clusters × 4 cores × 16 flops/cycle × 1.2 GHz = 2.4576 Tflops
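From the same figures, the chip's energy efficiency comes out to roughly 2.4576 Tflops / 240 W ≈ 10 Gflops/W.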
[Block diagram: Matrix-2000 with 4 super-nodes (SN0–SN3), each containing 8 clusters of 4 cores, joined by an on-chip interconnect to PCIe and four DDR4 interfaces]
NUDT & Tianhe
❑ Heterogeneous Compute Nodes
❑ Intel Xeon CPU ×2 ❑ Matrix-2000 ×2 ❑ Memory: 192 GB
❑ Interconnect: 14 Gbps homegrown network
❑ Peak performance: 5.34 Tflops
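Of that peak, the two Matrix-2000s contribute 2 × 2.4576 ≈ 4.92 Tflops, leaving roughly 0.42 Tflops (about 0.21 Tflops each) for the two Xeons.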
NUDT & Tianhe
❑ Heterogeneous Compute Blades
❑ Compute blade = Xeon part + Matrix-2000 part ❑ Use the Matrix-2000 part to replace the KNC part
[Photo: a compute blade holding 4 Intel Xeon CPUs and 4 Matrix-2000s, forming 2 compute nodes]
NUDT & Tianhe
❑ Heterogeneous programming environment
❑ Supports OpenMP 4.x and OpenCL (see the sketch after the diagram below)
[Software stack diagram: on the Xeon host, OpenMP 4.x and OpenCL applications run on the OpenMP/OpenCL runtimes, a heterogeneous computing library, a symmetric communication library, an API wrapper, and the driver over the host OS; on the Matrix-2000 device, a math library, OpenCL runtime plugin, X compiler, device driver, and symmetric communication library run over the device OS]
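To give a concrete feel for this programming model, below is a minimal OpenMP 4.x offload sketch in C. It is a generic illustration that assumes the Matrix-2000 is exposed as an OpenMP target device; the array names and kernel are hypothetical, not taken from the Tianhe-2A toolchain.

    #include <stdio.h>
    #include <omp.h>

    #define N 1024

    int main(void) {
        static double a[N], b[N], c[N];
        for (int i = 0; i < N; i++) { a[i] = i; b[i] = 2.0 * i; }

        /* OpenMP 4.x offload: copy a and b to the accelerator,
         * run the loop there in parallel, copy c back to the host. */
        #pragma omp target map(to: a, b) map(from: c)
        #pragma omp parallel for
        for (int i = 0; i < N; i++)
            c[i] = a[i] + b[i];

        printf("c[42] = %f\n", c[42]);   /* expect 126.0 */
        return 0;
    }

If no accelerator is present, OpenMP falls back to executing the target region on the host, so the same code stays portable across Xeon-only and Xeon + Matrix-2000 configurations.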
Contents
❑ NUDT & TianHe ❑ the Exascale Road in China ❑ Tianhe-3
Next step: Exascale
❑ Governments are targeting Exascale computing
❑ US, Japan, EU, China
❑ China has reached the 100P level, but Exascale poses far greater challenges
❑ Memory wall ❑ Communication wall ❑ Reliability wall ❑ Energy consumption wall ❑ etc.
More Walls for China
❑ Microelectronics & chip industry
❑ Still at an underdeveloped stage ❑ Calls for more technology accumulation
❑ Various & complex needs
❑ Huge & highly diverse market ❑ Calls for multiple design & development roads
❑ Self-controllable road
❑ Processor ❑ Platform & OS ❑ APP ❑ Eco-system
China’s Development
National Projects & Plans in China
❑ Since 1990, China has launched an HPC project in every 5-year plan, sponsored by the 863 High Tech. Program of the Ministry of Science & Technology
❑ the 10th 5-year plan (2001~2005)
❑ Project: High Performance Computer and Software System ❑ Targets: TFlops supercomputer and high-performance computing environment
❑ Successfully developed TF-scale computers and the China National Grid (CNGrid) testbed
❑ the 11th 5-year plan (2006~2010)
❑ Project: High-productivity computer and network computing environment
❑ Targets: PFlops supercomputer and Grid computing environment ❑ Successfully developed Peta-scale computers and upgraded CNGrid into the national HPC service environment
❑ the 12th 5-year plan (2011~2015)
❑ Project: High-productivity computer and computing environment
❑ Targets: 100 PFlops supercomputer and cloud computing environment
❑ Developed world-class computer systems
❑ Tianhe-2 ❑ Sunway TaihuLight
❑ the 13th 5-year plan (2016~2020)
❑ Project: Exascale system ❑ Targets: key technologies for an EFlops supercomputer
National Projects & Plans in China
❑ GOALS
❑ Develop self-dependent and controllable core technology for exascale computing, and keep China's leading position
❑ Develop a series of critical HPC applications and software centers, building the HPC application eco-system
❑ Build a national HPC environment with globally top-level resources and services
❑ Two Steps to Exascale
❑ Support vendors to develop prototypes (2016-2018) ❑ Choose and support vendors to achieve exascale
The 13th 5-year plan (2016~2020)
Exascale goals in the 2016 proposal
❑ System performance 1 Eflops ❑ Node performance > 10Tflops ❑ Network bandwidth > 400Gbps ❑ Network scale up to more than 100,000 nodes ❑ MPI latency < 1.2us ❑ Linpack efficiency > 60% ❑ Power efficiency > 30Gflops/W
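At these targets, the power budget follows directly: 1 Eflops at ≥ 30 Gflops/W caps the whole system at 10^18 / (30 × 10^9) ≈ 33 MW.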
Vendors in China
❑ University
❑ NUDT
❑ homegrown CPU, accelerator and interconnect
❑ Institute
❑ National Research Center of Parallel Computer Engineering and Technology (NRCPC)
❑ homegrown many-core CPU
❑ Company
❑ Dawning (Sugon)
❑ Various product lines besides HPC: servers, PCs, data center products, etc.
❑ High share of the market
NUDT exascale prototype system
Deployed in the National Supercomputer Center in Tianjin, 2018
NUDT exascale prototype system
❑ 512 nodes
❑ 3 MT-2000+ processors per node ❑ 6 Tflops peak performance per node
❑ Matrix-2000+
❑ 128 cores ❑ 2 GHz ❑ 2 Tflops ❑ ~130 W, ~15 Gflops/W
❑ 400 Gbps homegrown network
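These figures are mutually consistent: 3 processors × 2 Tflops = 6 Tflops per node, and 2 Tflops / ~130 W ≈ 15 Gflops/W.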
NUDT exascale prototype system
❑ Air and water hybrid cooling ❑ PUE < 1.15 ❑ High density
NRCPC exascale prototype system
❑ SW26010 CPU
❑ Used in the Sunway TaihuLight system
❑ 512 nodes
❑ Each node has 2 CPUs
❑ Homegrown network
Sugon exascale prototype system
❑ Heterogeneous architecture
❑ Hygon CPU + DCU
❑ 6D torus network
Sugon exascale prototype system
❑ Hierarchy
❑ 512 Nodes ❑ 32 Supernodes ❑ 6 Silicon Units ❑ 1 Silicon Cube
❑ Cooling
❑ Total immersion cooling ❑ No noise ❑ Better heat-exchange performance
Sugon exascale prototype system
Exascale prototype systems
❑ Compute
❑ Traditional multi-core CPU ❑ Many-core CPU ❑ CPU + DCU
❑ Network
❑ Homegrown interconnect network ❑ Commercial network
❑ Cooling
❑ Air & water hybrid cooling ❑ Traditional water cooling ❑ Total immersion cooling
Contents
❑ NUDT & TianHe ❑ the Exascale Road in China ❑ Tianhe-3
❑ Heterogeneous (CPU + Accelerator) is the trend
❑ Summit (US) ❑ Effectively increases single-node performance ❑ Mitigates the Communication Wall & Reliability Wall ❑ Heterogeneous programming is prevalent ❑ A practical way to Exascale
❑ Our plan
❑ Based on our current Matrix accelerator technology (upcoming model: Matrix-3000)
❑ Better manufacturing process ❑ Increased peak performance ❑ Optimized vector performance
Architecture
❑ Heterogeneous architecture
Architecture
[Diagram: two blade configurations — a multi-processor blade (CPU CPU CPU CPU) and a multi-processor blade with accelerators (CPU CPU CPU CPU + MT MT MT MT)]
❑ Heterogeneous, flexible architecture
Architecture
[Diagram: flexible node configurations over the interconnect — accelerator-dominant (mostly MTs), CPU-dominant (mostly CPUs), and heterogeneous (CPUs and MTs mixed)]
Engineering
❑ Easily replaceable
❑ Two kinds of compute blades ❑ High density
❑ Fast interconnect, support > 8
CPU
❑ 64 cores, > 2 Tflops ❑ DDR4 ❑ PCIe Gen4 ❑ Half-precision support (rough per-core arithmetic after the diagram)
[Block diagram: CPU built from dual-core tiles with shared L2 caches (C C L2C), connected through a crossbar mesh (X) to L3 cache slices, DDR4 channels, and two PCIe 4.0 interfaces]
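The slide gives no CPU clock rate, but as a rough sanity check, 2 Tflops across 64 cores is about 31 Gflops per core; at an assumed 2 GHz (an assumption, not a stated figure) that would correspond to roughly 16 DP flops/cycle per core.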
Matrix-3000
❑ GPDSP ❑ Cores ≥ 96, > 10 Tflops ❑ HBM2 ❑ PCIe Gen4 ❑ Half-precision support
[Block diagram: Matrix-3000 with four GPDSP clusters (0–3), each paired with its own HBM2 stack (HBM2_0–HBM2_3), plus a PCIe 4.0 x16 interface]
Interconnect Network
❑ Homegrown ❑ Bandwidth > 400 Gbps ❑ MPI latency < 2 us ❑ Supports ~100,000 nodes, max 5 hops
Interconnect Network
❑ 3D Butterfly topology
❑ Maximum of 5 hops between any two nodes ❑ Intelligent network management: path tracing, link testing, fault reporting, chip configuration, etc.
❑ Other features: QoS, fault tolerance, etc. ❑ Multiple network planes
[Network diagram: nodes node1…node24, each with an MNI connecting its CPUs to ZNR router chips; a node's ZNI attaches through PCIe_0…PCIe_3]
Cooling
❑ Air and water hybrid cooling
❑ An efficient approach to cooling ❑ Practical, with good cost-performance ❑ PUE < 1.1
Software
❑ OpenCL support
❑ Software-defined super node
❑ GPDSP library
❑ BLAS library optimized for the underlying hardware (see the sketch after this list)
❑ 3 major platforms
❑ Traditional scientific computing ❑ Big Data ❑ AI
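As an illustration of how an application would consume such a tuned BLAS, here is a minimal sketch in C. It assumes the GPDSP library exposes the standard CBLAS interface; cblas_dgemm and the cblas.h header are the generic convention, not confirmed details of the Tianhe-3 toolchain.

    /* C = alpha*A*B + beta*C on row-major 2x2 matrices. */
    #include <stdio.h>
    #include <cblas.h>

    int main(void) {
        double A[4] = {1, 2, 3, 4};
        double B[4] = {5, 6, 7, 8};
        double C[4] = {0, 0, 0, 0};

        /* C = 1.0 * A * B + 0.0 * C */
        cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                    2, 2, 2, 1.0, A, 2, B, 2, 0.0, C, 2);

        printf("%g %g\n%g %g\n", C[0], C[1], C[2], C[3]);   /* 19 22 / 43 50 */
        return 0;
    }

The point of shipping a hardware-optimized BLAS is that unchanged application code like this picks up the GPDSP-specific tuning simply by linking against the vendor library.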
Software
❑ Container support
❑ For future supercomputer-center use ❑ Supercomputing cloud ❑ Cloud supercomputing
❑ Fault tolerance
❑ Autonomic management system ❑ Handles hardware and software errors
Tianhe-3
Blade (8 CPUs: 16 Tflops + 8 MT-3000s: 80 Tflops) → Frame (32 blades) → Cabinet (4 frames) → System (100 cabinets); per chip: CPU 2 Tflops, MT-3000 10 Tflops
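Multiplying out the published hierarchy shows how the design reaches exascale: (16 + 80) Tflops per blade × 32 blades/frame × 4 frames/cabinet × 100 cabinets = 1,228.8 Pflops ≈ 1.23 Eflops peak.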
HPC Eco-system
❑ We aim to develop the eco-system
❑ homegrown processors and ISAs are the basic part
❑ Many competitors
Development road
❑ Government guiding & leading
❑ National Supercomputer Centers
❑ Huge systems ❑ General purpose, various users & needs
❑ National support
❑ Sufficient and constant funding & support ❑ An HPC system is a strategic tool for a nation
❑ Cannot rely on the market alone to push things forward ❑ Push technology forward and use technology to push markets forward
❑ HPC as an engine to drive other high-tech industries
❑ Keep pace with friends (Japan, the EU and the US)