SLIDE 1

Operating two InfiniBand grid clusters over 28 km distance

Sabine Richling, Steffen Hau, Heinz Kredel, Hans-Günther Kruse

IT-Center University of Heidelberg, Germany; IT-Center University of Mannheim, Germany

3PGCIC-2010, Fukuoka, 4. November 2010

SLIDE 2

Introduction

Motivation

Circumstances in Baden-Württemberg (BW)

Increasing demand for high-performance computing capacities from scientific communities
Demands are not high enough to qualify for the top German HPC centers in Jülich, Munich and Stuttgart
⇒ Grid infrastructure concept for the Universities in Baden-Württemberg

SLIDE 3

Introduction

Motivation

Special Circumstances in Heidelberg/Mannheim

Both IT-centers have a long record of cooperation
Both IT-centers are connected by a 10 Gbit dark fibre connection of 28 km (two color lines already used for backup and other services)
⇒ Connection of the clusters in Heidelberg and Mannheim to ease operation and to enhance utilization

SLIDE 4

Introduction

Outline

1. Introduction
2. bwGRiD cooperation
3. Interconnection of two bwGRiD clusters
4. Cluster operation
5. Performance modeling
6. Summary and Conclusions

SLIDE 5

bwGRiD cooperation

bwGRiD cooperation

SLIDE 6

bwGRiD cooperation

D-Grid

German Grid Initiative (www.d-grid.de)
Start: September 2005
Aim: Development and establishment of a reliable and sustainable Grid infrastructure for e-science in Germany
Funded by the Federal Ministry of Education and Research (BMBF) with ∼ 50 Million Euro

SLIDE 7

bwGRiD cooperation

bwGRiD

Community project of the Universities of BW (www.bw-grid.de)
Compute clusters at 8 locations: Stuttgart, Ulm (Konstanz), Karlsruhe, Tübingen, Freiburg, Mannheim/Heidelberg, Esslingen
Central storage unit in Karlsruhe
Distributed system with local administration
Access for all D-Grid virtual organizations via at least one middleware supported by D-Grid

SLIDE 8

bwGRiD cooperation

bwGRiD – Objectives

Verifying the functionality and the benefit of Grid concepts for the HPC community in BW
Managing organisational and security problems
Development of new cluster and Grid applications
Solving license difficulties
Enabling the computing centers to specialize

SLIDE 9

bwGRiD cooperation

bwGRiD – Access Possibilities

Access with local university accounts (via ssh):

→ Access to a local bwGRiD cluster only

Access with Grid Certificate and VO membership using a Grid middleware (e.g. Globus Toolkit: gsissh, GridFTP or Webservices):

→ Access to all bwGRiD resources

SLIDE 10

bwGRiD cooperation

bwGRiD – Resources

Compute cluster:

Mannheim/Heidelberg: 280 nodes (direct interconnection)
Karlsruhe: 140 nodes
Stuttgart: 420 nodes
Tübingen: 140 nodes
Ulm (Konstanz): 280 nodes (hardware in Ulm)
Freiburg: 140 nodes
Esslingen: 180 nodes (more recent hardware)

Central storage:

Karlsruhe: 128 TB (with backup), 256 TB (without backup)

[Map of bwGRiD sites: Mannheim and Heidelberg (interconnected to a single cluster), Karlsruhe, Stuttgart, Esslingen, Tübingen, Ulm (joint cluster with Konstanz), Freiburg; Frankfurt and München shown for orientation]

SLIDE 11

bwGRiD cooperation

bwGRiD – Software

Common Software:

Scientific Linux, Torque/Moab batch system, GNU and Intel compiler suite
Central repository for software modules (MPI versions, mathematical libraries, various free software, application software from each bwGRiD site)

Application areas of bwGRiD sites:

Freiburg: System Technology, Fluid Mechanics
Karlsruhe: Engineering, Compiler & Tools
Heidelberg: Mathematics, Neuroscience
Mannheim: Business Administration, Economics, Computer Algebra
Stuttgart: Automotive simulations, Particle simulations
Tübingen: Astrophysics, Bioinformatics
Ulm: Chemistry, Molecular Dynamics
Konstanz: Biochemistry, Theoretical Physics

SLIDE 12

Interconnection of two bwGRiD clusters

Interconnection of two bwGRiD clusters

SLIDE 13

Interconnection of two bwGRiD clusters

Hardware before Interconnection

10 Blade-Centers in Heidelberg and 10 Blade-Centers in Mannheim
Each Blade-Center contains 14 IBM HS21 XM Blades
Each Blade contains
  2 Intel Xeon CPUs, 2.8 GHz (each CPU with 4 cores)
  16 GB memory
  140 GB hard drive (since January 2009)
  Gigabit Ethernet (1 Gbit)
  InfiniBand network (20 Gbit)

⇒ 1120 Cores in Heidelberg and 1120 Cores in Mannheim
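For reference, the per-site core count is simply the product of the numbers above:

$$10 \;\text{Blade-Centers} \times 14 \;\text{Blades} \times 2 \;\text{CPUs} \times 4 \;\text{Cores} = 1120 \;\text{cores}.$$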

SLIDE 14

Interconnection of two bwGRiD clusters

Hardware – Bladecenter

SLIDE 15

Interconnection of two bwGRiD clusters

Hardware – Infiniband

SLIDE 16

Interconnection of two bwGRiD clusters

Interconnection of the bwGRiD clusters

Proposal in 2008
Acquisition and assembly until May 2009
Running since July 2009
InfiniBand over Ethernet over fibre optics: Longbow adaptor from Obsidian

[Photo: Longbow adaptor with InfiniBand connector (black cable) and fibre optic connector (yellow cable)]

SLIDE 17

Interconnection of two bwGRiD clusters

Interconnection of the bwGRiD clusters

ADVA component: transformation of white light from the Longbow to one-color light for the dark fibre connection between the IT centers

SLIDE 18

Interconnection of two bwGRiD clusters

MPI Performance – Prospects

Measurements for different distances (HLRS, Stuttgart, Germany)
Bandwidth of 900-1000 MB/sec for up to 50-60 km
Latency is not published

SLIDE 19

Interconnection of two bwGRiD clusters

MPI Performance – Latency

Local: ∼ 2 µsec
Interconnection: 145 µsec

[Plot: IMB 3.2 PingPong latency, time [µsec] vs. buffer size [bytes], local vs. MA-HD]

SLIDE 20

Interconnection of two bwGRiD clusters

MPI Performance – Bandwidth

Local: 1400 MB/sec
Interconnection: 930 MB/sec

[Plot: IMB 3.2 PingPong bandwidth [MB/sec] vs. buffer size [bytes], local vs. MA-HD]
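The latency and bandwidth figures on this and the previous slide come from the IMB 3.2 PingPong benchmark. As a rough illustration of what such a measurement does, here is a minimal ping-pong sketch; the use of Python with mpi4py, the buffer sizes and the repetition counts are assumptions for illustration, not the original IMB setup.

```python
# Minimal MPI ping-pong sketch (not the IMB 3.2 benchmark used on the slides).
# Assumes mpi4py and an MPI runtime; run with: mpirun -np 2 python pingpong.py
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

for size in (1, 1024, 1024**2, 16 * 1024**2):      # buffer sizes in bytes (assumed)
    buf = np.zeros(size, dtype=np.uint8)
    reps = 100 if size <= 1024**2 else 10           # fewer repetitions for large buffers
    comm.Barrier()
    t0 = MPI.Wtime()
    for _ in range(reps):
        if rank == 0:
            comm.Send(buf, dest=1, tag=0)           # ping
            comm.Recv(buf, source=1, tag=0)         # pong
        elif rank == 1:
            comm.Recv(buf, source=0, tag=0)
            comm.Send(buf, dest=0, tag=0)
    dt = (MPI.Wtime() - t0) / reps / 2              # one-way time per message
    if rank == 0:
        print(f"{size:>9} bytes: latency {dt * 1e6:9.1f} usec, "
              f"bandwidth {size / dt / 1e6:9.1f} MB/sec")
```

Rank 0 reports the one-way time (dominated by latency for small buffers) and the resulting bandwidth (approached for large buffers), which is exactly the shape of the two curves in the plots.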

SLIDE 21

Interconnection of two bwGRiD clusters

Experiences with Interconnection Network

Cable distance MA-HD is 28 km (18 km linear distance in air)
⇒ Light needs 143 µsec for this distance (worked out below)
Latency is high: 145 µsec = light transit time + 2 µsec local latency
Bandwidth is as expected: about 930 MB/sec (local bandwidth 1200-1400 MB/sec)
Obsidian needs a license for 40 km
  Obsidian has buffers for larger distances
  Activation of the buffers requires a license
  A license for 10 km is not sufficient
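The light transit time can be checked with a short calculation; assuming a group refractive index of roughly 1.5 for the fibre (an assumed value, not stated on the slide):

$$t \approx \frac{28\,\text{km}}{(3 \times 10^5\,\text{km/s}) / 1.5} = \frac{28\,\text{km}}{2 \times 10^5\,\text{km/s}} \approx 140\,\mu\text{s},$$

close to the quoted 143 µsec; adding the ∼2 µsec local latency gives the measured 145 µsec.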

SLIDE 22

Interconnection of two bwGRiD clusters

MPI Bandwidth – Influence of the Obsidian License

[Plot: IMB 3.2 PingPong bandwidth [MB/sec] with 1 GB buffer vs. measurement start time, 16 Sep to 07 Oct]

SLIDE 23

Cluster operation

Cluster operation

SLIDE 24

Cluster operation

bwGRiD Cluster Mannheim/Heidelberg

[Diagram: Cluster Mannheim and Cluster Heidelberg with access nodes (Benutzer), admin node, passwd, LDAP (MA), AD (HD), PBS, Lustre bwFS MA and Lustre bwFS HD; the two InfiniBand fabrics are coupled via Obsidian + ADVA; external access via Belwue and Grid access via VORM]

SLIDE 25

Cluster operation

bwGRiD Cluster Mannheim/Heidelberg – Overview

Two clusters (blue boxes) are connected by InfiniBand (orange lines)
"Obsidian and ADVA" (orange box) represents the 28 km fibre connection
bwGRiD storage systems (grey boxes) are also connected by InfiniBand
Access nodes ("Benutzer") are connected with 10 Gbit (light orange lines) to the outside Internet "Belwue" (BW science net)

Access with local accounts from Mannheim ("LDAP")
Access with local accounts from Heidelberg ("AD")
Access with Grid certificates ("VORM")

Ethernet connection between all components is not shown

SLIDE 26

Cluster operation

Node Management

Compute nodes are booted via PXE and use an NFS read-only export as root file system
Administration server provides
  DHCP service for the nodes (MAC-to-IP address configuration file; see the sketch at the end of this slide)
  NFS export for the root file system
  NFS directory for software packages, accessible via module utilities
  queuing and scheduling system

Node administration (power on/off, execute commands, BIOS update, etc.) with
  adjusted shell scripts originally developed by HLRS
  IBM management module (command line interface and Web GUI)
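The slide mentions a MAC-to-IP address configuration file for the DHCP service. The sketch referenced above shows one way such ISC dhcpd host entries could be generated from a node list; the node names, addresses and the PXE filename are hypothetical, and this is not the actual bwGRiD administration script.

```python
# Hypothetical generator for ISC dhcpd host entries from a MAC-to-IP node list.
# Names, MACs, IPs and the PXE boot filename below are made-up examples.
NODES = [
    ("node001", "00:1a:64:aa:bb:01", "10.1.0.1"),
    ("node002", "00:1a:64:aa:bb:02", "10.1.0.2"),
]

def dhcp_host_entry(name, mac, ip):
    """Return a dhcpd.conf host block that pins a node's MAC to a fixed IP."""
    return (f"host {name} {{\n"
            f"  hardware ethernet {mac};\n"
            f"  fixed-address {ip};\n"
            f"  filename \"pxelinux.0\";\n"   # PXE boot loader served via TFTP
            f"}}\n")

if __name__ == "__main__":
    print("\n".join(dhcp_host_entry(*node) for node in NODES))
```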

SLIDE 27

Cluster operation

User Management

Users should have exclusive access to compute nodes

user names and user-ids must be unique
replacing passwd with a reduced passwd proved unreliable
a direct connection to PBS for user authorization via a PAM module works better

Authentication at the access nodes

directly against the directory services: LDAP (MA) and AD (HD)
or with a D-Grid certificate

Combining information from the directory services of both universities (see the sketch at the end of this slide)

Prefix "ma", "hd" or "mh" for group names
Adding offsets to group-ids
Adding offsets to user-ids
Activated user names from MA and HD must be different

Activation process

Adding a special attribute for the user in the directory service (for authentication)
Updating the user database of the cluster (for authorization)
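A minimal sketch of the combination scheme described above (site prefixes for group names, offsets for user- and group-ids); the concrete offset values and account data are made-up examples, not the values used in Mannheim/Heidelberg.

```python
# Sketch: merge site-local accounts into one cluster-wide namespace.
# Offsets and example entries are assumptions for illustration only.
UID_OFFSET = {"ma": 100000, "hd": 200000}   # hypothetical per-site uid offsets
GID_OFFSET = {"ma": 100000, "hd": 200000}   # hypothetical per-site gid offsets

def merge_user(site, name, uid, gid):
    """Map a site-local account to cluster-wide unique uid/gid; names from MA and HD must differ."""
    return {"name": name, "uid": uid + UID_OFFSET[site], "gid": gid + GID_OFFSET[site]}

def merge_group(site, name, gid):
    """Prefix the group name with its site and shift the gid into the site's range."""
    return {"name": f"{site}{name}", "gid": gid + GID_OFFSET[site]}

print(merge_user("ma", "alice", 1234, 100))
print(merge_group("hd", "physics", 500))
```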

SLIDE 28

Cluster operation

Job Management

The interconnection (high latency, limited bandwidth) provides
  enough bandwidth for I/O operations
  but is not sufficient for all kinds of MPI jobs

Jobs only run on nodes located either in HD or in MA (realized with attributes provided by the queuing system)
Before the interconnection:
  in Mannheim: mostly single-node jobs → free nodes
  in Heidelberg: many MPI jobs → long waiting times

With the interconnection: better resource utilization (see the Ganglia report on the next slide)

SLIDE 29

Cluster operation

Monitoring Report during activation of the interconnection

[Ganglia graphs: number of processes and percent CPU usage]

SLIDE 30

Performance modeling

Performance modeling

SLIDE 31

Performance modeling

MPI Jobs running across the interconnection

How does the interconnection influence the performance?
How much bandwidth would be necessary to improve the performance?
How much would such an upgrade cost?

SLIDE 32

Performance modeling

Performance modeling

Numerical model

High-Performance Linpack (HPL) benchmark
OpenMPI
Intel MKL

Model variants

Calculations on a single cluster with up to 1025 CPU cores
Calculations on the coupled cluster with up to 2048 CPU cores, symmetrically distributed

Analytical model for the speed-up to analyze the characteristics of the interconnection:
  high latency of 145 µsec
  limited bandwidth of 930 MB/sec

SLIDE 33

Performance modeling

Results for a single cluster

[Plot: HPL 1.0a speed-up vs. number of processors p on the local cluster, curves for np = 10000, 20000, 30000, 40000 and the p/ln(p) reference]

np: load parameter (matrix size for HPL)
Ideal speed-up for perfect parallel programs: Sideal(p) = p
Speed-up for a simple model ("all CPU configurations have equal probability"): Ssimple(p) = p/ln p
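For scale, the simple model at p = 1000 gives

$$S_{\text{simple}}(1000) = \frac{1000}{\ln 1000} \approx \frac{1000}{6.9} \approx 145.$$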

SLIDE 34

Performance modeling

Results for coupled cluster

[Plot: HPL 1.0a speed-up vs. number of processors p on the coupled cluster MA-HD, curves for np = 10000, 20000, 30000, 40000]

np: load parameter (matrix size for HPL)
for p > 256: speed-up reduced by a factor of ∼ 4 compared to the single cluster
for p > 500: constant (decreasing) speed-up

SLIDE 35

Performance modeling

Direct comparison of the two cases

[Plot: HPL 1.0a speed-up vs. number of processors p, local vs. MA-HD, for np = 10000 and np = 40000 (log-log scale)]

np: load parameter (matrix size for HPL)
for p < 50 the speed-up of the coupled cluster is acceptable; applications could run effectively across the interconnection (in the case of exclusive usage)

SLIDE 36

Performance modeling

Performance modeling

Following a performance model developed by Kruse (2009):

tc(p): communication time
tB(1): processing time for p = 1

Speed-up: $S_c(p) \le \dfrac{p}{\ln p + t_c(p)/t_B(1)}$

For tc(p) = 0, we recover the result of the simple model: Ssimple(p) = p/ln p

SLIDE 37

Performance modeling

Performance model for the high latency

Modeling tc(p) as a function of the typical communication time between 2 processes, $t_c^{(2)}$, and the communication topology c(p): $t_c(p) = t_c^{(2)}\, c(p)$

Defining a rate $r = t_c^{(2)}/t_A$ between $t_c^{(2)}$ and the computation time for a typical instruction, $t_A = t_B(1)/n$, so that $t_c(p)/t_B(1) = \frac{r}{n}\, c(p)$:

Speed-up: $S_c(p) \le \dfrac{p}{\ln p + \frac{r}{n}\, c(p)}$

Analysis for HPL ($n = \frac{2}{3}\, n_p^3 / p$):

for np = 1000: ∼ p/ln p for small p, decrease for p ≥ 30
for np = 10 000: ∼ p/ln p for p ≤ 10 000, decrease for c(p) > 10^6

Analysis does not explain the numerical results: the speed-up already decreases for smaller p.

SLIDE 38

Performance modeling

Performance model including a limited bandwidth

Modeling the interconnection as a shared medium for the communication of p processes with a given bandwidth B and average message length ⟨m⟩:

$t_c^{(2)} = t_L + \dfrac{\langle m \rangle}{B/p}$, hence $r(p) = \dfrac{t_L}{t_A} + \dfrac{\langle m \rangle}{t_A B}\, p$

With the measured bandwidth B = 1.5 · 10^6 and ⟨m⟩ = 10^6:

Speed-up: $S_c(p) \le \dfrac{p}{\ln p + \frac{3}{4}\left(\frac{100}{n_p}\right)^3 (1 + 4p)\, c(p)}$

With the assumption $c(p) = \frac{1}{2} p^2$:

for np = 10 000: ∼ p/ln p, decrease for p ≥ 50
for np = 40 000: ∼ p/ln p, decrease for p ≥ 250

Speed-up reproduces the measurements.
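A small evaluation sketch of the reconstructed formula (an illustration only, not the authors' code) makes the stated behaviour visible:

```python
# Bandwidth-limited speed-up model as given on this slide:
#   S_c(p) <= p / ( ln(p) + 3/4 * (100/np)^3 * (1 + 4*p) * c(p) ),  c(p) = p^2 / 2
import math

def model_speedup(p, np_):
    c = 0.5 * p ** 2                                     # assumed topology c(p) = p^2/2
    comm = 0.75 * (100.0 / np_) ** 3 * (1 + 4 * p) * c   # communication term
    return p / (math.log(p) + comm)

for np_ in (10_000, 40_000):
    print(f"np = {np_}")
    for p in (10, 50, 100, 250, 500, 1000, 2000):
        print(f"  p = {p:5d}   S_c <= {model_speedup(p, np_):7.1f}"
              f"   (p/ln p = {p / math.log(p):7.1f})")
```

With these numbers the deviation from p/ln p sets in around p ≈ 50 for np = 10 000 and around p ≈ 250 for np = 40 000, matching the bullet points above.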

SLIDE 39

Performance modeling

Speed-up of the model including limited bandwidth

np: load parameter (matrix size for HPL)
⇒ limited bandwidth is the performance bottleneck for the shared connection between the clusters
⇒ Doubling the bandwidth: 25 % improvement for np = 40 000
⇒ 100 % improvement with a ten-fold bandwidth (in the case of exclusive usage)

SLIDE 40

Summary and Conclusions

Summary and Conclusions

SLIDE 41

Summary and Conclusions

InfiniBand connection of two compute clusters

Network (Obsidian, ADVA and InfiniBand switches) is stable and reliable
Latency of 145 µsec is very high
Bandwidth of 930 MB/sec is as expected
Jobs are limited to one site, because MPI jobs would be slow (interconnection is a "shared medium")
Performance model predicts the cost for an improvement of the interconnection
Bandwidth sufficient for cluster administration and file I/O on Lustre file systems
Interconnection is useful and stable for a "Single System Cluster" administration
Better load balance at both sites due to common PBS
Solving organizational issues between two universities is a great challenge
