
On the challenges of deploying an unusual high performance hybrid object/file parallel storage system in JASMIN


  1. On the challenges of deploying an unusual high performance hybrid object/file parallel storage system in JASMIN
     Cristina del Cano Novales¹, Jonathan Churchill¹, Athanasios Kanaris¹, Robert Döbbelin², Felix Hupfeld², Aleksander Trofimowicz²
     ¹ Scientific Computing Department, Science and Technology Facilities Council, RAL, Didcot OX11 0QX, UK
     ² Quobyte GmbH, Berlin, AG Charlottenburg HRB 149012 B, Germany

  2. Environmental Data Analysis
     ■ COMET-CPOM UoLeeds
       ■ Near real time monitoring of all active earthquakes and volcanoes.
       ■ Relies on full ESA Sentinel data, managed and unmanaged tenancies, LOTUS batch
     ■ Centre for Environment and Hydrology
       ■ Trends for 1000's of species
       ■ Analysis unprecedented in complexity and scope within the UK.

  3. JASMIN: the missing piece
     [Images: MetOffice supercomputer; ARCHER supercomputer (EPSRC/NERC); JASMIN (STFC/Stephen Kill)]

  4. Blending PB's of data, 1000's of Cloud VM's, Batch Computing & WAN Data transfer
     – 24.5 PB Panasas ~ 250GByte/s
     – 44 PB Quobyte SDS ~ 220GBytes/s
     – 5PB Caringo Object Store
     – 80PB Tape
     – Batch HPC 6-10k cores
     – Optical Private WAN + Science DMZ
     – “Managed” VMware Cloud
     – OpenStack “Community” Cloud
     – Pure FlashBlade scratch
     – Non-blocking ethernet 12-20Tbit/sec

  5. JASMIN4 Disc Storage
     [Chart: "JASMIN Disc Storage", useable PB's (0 to 60) by year, 2012 to 2019, with JASMIN4 marked. Series: Caringo (S3/NFS), QuoByte (SoF/S3/NFS), PURE (NVMe/NFS), NetApp (Block/NFS), Equallogic (Block), Panasas (Parallel File)]
     – No boundaries on data growth (or network topology)
     – S3 interface to file and object system. RW both sides.
     – Performance similar to Panasas PFS
     – Online upgrades. Redundant networking.
     – No client “call back” port.
       • Previous root/network and UMC restrictions

  6. Quobyte SDS
     – 45PB raw, ~30PB usable (EC 8+3)
     – Hardware split 50:50 Dell / Supermicro
     – 47x R730xd's + MD3060 arrays (1 / server pair), 40Gb NICs
     – 40x Supermicro 4U “Top loader” servers, 50Gb NICs
     – Target > 50MB/sec/HDD. Ideally 70-100MB/sec/HDD
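
The capacity and per-drive figures on the Quobyte slide can be sanity-checked with a little arithmetic. The sketch below is only illustrative: the 45PB raw figure, the 8+3 erasure coding and the MB/sec/HDD targets come from this slide, the drive count of 3090 is borrowed from the congestion slide, and the function names are hypothetical.

```python
# Figures taken from the slides; function names are illustrative only.
RAW_PB = 45            # raw capacity (Quobyte SDS slide)
DATA_STRIPES = 8       # EC 8+3: data stripes
CODING_STRIPES = 3     # EC 8+3: coding stripes
HDD_COUNT = 3090       # drive count quoted on the congestion slide


def ec_usable_fraction(data: int, coding: int) -> float:
    """Fraction of raw capacity left for user data under data+coding erasure coding."""
    return data / (data + coding)


def aggregate_gbytes_per_sec(hdd_count: int, mb_per_sec_per_hdd: float) -> float:
    """Aggregate throughput in GB/s if every drive sustains the given MB/s."""
    return hdd_count * mb_per_sec_per_hdd / 1000


# Upper bound on usable capacity from erasure coding alone.
usable_ceiling_pb = RAW_PB * ec_usable_fraction(DATA_STRIPES, CODING_STRIPES)
print(f"EC 8+3 ceiling on usable capacity: {usable_ceiling_pb:.1f} PB")

# Aggregate throughput at the slide's per-drive targets.
for target in (50, 70, 100):
    total = aggregate_gbytes_per_sec(HDD_COUNT, target)
    print(f"{target} MB/s per HDD -> {total:.1f} GB/s aggregate")
```

Erasure coding alone caps usable space at about 32.7 PB, so the quoted ~30PB usable sits a little under that ceiling; the slides do not say what accounts for the difference. At 70 MB/s per drive the aggregate is about 216 GB/s, which is in line with the ~220GBytes/s quoted for Quobyte on the earlier overview slide.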

  7. “5 Tier” CLOS Network
     – Traditional for BGP throughout
     – JASMIN2/3 all OSPF
     – OSPF lower complexity cf BGP
     – Keep OSPF Leaf-Spine for JASMIN4
     – Ease of use at the edges.
     – BGP only in Spine to SuperSpine
     – For the core network specialists
     – But stops EVPN leaf use for now

  8. Connecting JASMIN2 to JASMIN4
     Superspine: 16 Spines (32x 100Gb)
     – 4 clusters/groups of 4 routers (each group 4x 32x100Gb)
     J4 Network
     – 8 Spines (32x 100Gb)
       • 4x 100Gb to Super-Spine
     – 17 Leaf pairs (2 of 16x 100Gb)
       • 8x 100Gb uplinks. 1 per spine
     – Storage/Compute
       • 1x 25/40/50Gb to ‘A’ and ‘B’ leafs
     J2 Network
     – 12 Spines (36x 40Gb)
       • 4x 40Gb to Super-Spine
     – 30 Leafs (48x10Gb+12x40Gb)
       • 12x 40Gb uplinks. 1 per spine
     – Storage/Compute
       • 2x 10Gb to local leaf
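
One way to read the leaf port counts on this slide is as an oversubscription ratio: host-facing bandwidth divided by spine-facing bandwidth. The sketch below is a rough check, not a description of the actual configuration. The J2 numbers are taken directly from the slide. For J4 it assumes, without confirmation from the slides, that the 8 uplinks are per leaf and that the other 8 of the 16x 100Gb ports face hosts. The `LeafSwitch` class is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LeafSwitch:
    """Port budget of a leaf switch (illustrative model, not a vendor API)."""
    name: str
    downlinks: int       # host-facing ports
    downlink_gbps: int
    uplinks: int         # spine-facing ports
    uplink_gbps: int

    @property
    def oversubscription(self) -> float:
        """Host-facing bandwidth divided by spine-facing bandwidth."""
        down = self.downlinks * self.downlink_gbps
        up = self.uplinks * self.uplink_gbps
        return down / up


# J2 leaf: 48x10Gb + 12x40Gb, with 12x 40Gb uplinks (from the slide).
j2_leaf = LeafSwitch("J2 leaf", downlinks=48, downlink_gbps=10,
                     uplinks=12, uplink_gbps=40)

# J4 leaf: 16x 100Gb with 8x 100Gb uplinks (from the slide).
# ASSUMPTION: uplinks are per leaf and the remaining 8 ports face hosts.
j4_leaf = LeafSwitch("J4 leaf", downlinks=8, downlink_gbps=100,
                     uplinks=8, uplink_gbps=100)

for leaf in (j2_leaf, j4_leaf):
    print(f"{leaf.name}: {leaf.oversubscription:.2f}:1")
```

Under these assumptions both leaf types come out at 1:1, which is what "non-blocking" means at the leaf layer. A ratio above 1 would mean hosts can offer more traffic than the uplinks can carry.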

  9. Congestion in a “non-blocking” network
     – Storage can overwhelm a client
     – 8 Threads, 8+3 EC = 88 servers
     [Diagram: non-blocking fabric with links labelled 25Gb, 40Gb, 50Gb and 100Gb]
     – But 180x25Gb > 4Tbits/s
     – 3090 HDD's x 70MB/s > 250GBytes/sec > 2Tbits/sec
     – ~200GB/s for a few minutes
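
The arithmetic behind this slide can be worked through in a few lines. This is a sketch under stated assumptions: the thread count, EC layout, link speeds, drive count and 70MB/s figure are from the slide; it assumes each thread touches a distinct set of 8+3 servers (which is how the slide arrives at 88) and that a client has a single 25Gb link. The variable names are illustrative.

```python
# Figures from the slide; variable names are illustrative.
EC_DATA, EC_CODING = 8, 3
THREADS = 8
CLIENT_LINK_GBITS = 25      # ASSUMPTION: one 25Gb link per client
LINK_COUNT = 180            # "180x25Gb" on the slide
HDD_COUNT = 3090
HDD_MB_PER_SEC = 70


def gbytes_to_tbits(gbytes_per_sec: float) -> float:
    """Convert GB/s to Tbit/s (decimal units)."""
    return gbytes_per_sec * 8 / 1000


# Fan-in: servers that may be sending to one client at once.
servers_per_client = THREADS * (EC_DATA + EC_CODING)

# Fair share of the client link if all of them send simultaneously.
share_mbytes = CLIENT_LINK_GBITS * 1000 / 8 / servers_per_client

# Aggregate of the 25Gb links versus what the drives can source.
links_tbits = LINK_COUNT * CLIENT_LINK_GBITS / 1000
storage_gbytes = HDD_COUNT * HDD_MB_PER_SEC / 1000
storage_tbits = gbytes_to_tbits(storage_gbytes)

# Per-drive rate that would be needed to reach 250 GB/s.
mb_per_hdd_for_250 = 250_000 / HDD_COUNT

print(f"servers per client: {servers_per_client}")
print(f"fair share per server: {share_mbytes:.1f} MB/s")
print(f"180 x 25Gb links: {links_tbits:.1f} Tbit/s")
print(f"drives at 70 MB/s: {storage_gbytes:.1f} GB/s = {storage_tbits:.2f} Tbit/s")
print(f"MB/s per drive for 250 GB/s: {mb_per_hdd_for_250:.1f}")
```

With 88 servers converging on one 25Gb link, each server's fair share is only about 35 MB/s, so even a modest burst from every server exceeds what the client can absorb, regardless of how much capacity the fabric has. Note that 3090 drives at exactly 70MB/s give about 216 GB/s (roughly 1.7 Tbit/s); reaching the 250GBytes/sec and 2Tbits/sec figures on the slide would take about 81 MB/s per drive. The slide does not state how its figures were rounded.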

  10. Thank you!
