
Hyperscale FPGAs for HPC and Cloud, Christoph Hagleitner

SC 2016, Salt Lake City, USA, November 14, 2016. Hyperscale FPGAs for HPC and Cloud. Christoph Hagleitner, hle@zurich.ibm.com, IBM Research - Zurich Lab. IBM Research Zurich Lab (ZRL): established in 1956; two Nobel Prizes (1986 and 1987).


  1. SC 2016, Salt Lake City, USA, November 14, 2016
     Hyperscale FPGAs for HPC and Cloud
     Christoph Hagleitner, hle@zurich.ibm.com
     IBM Research - Zurich Lab

  2. IBM Research - Zurich Lab (ZRL)
     - Established in 1956
     - Two Nobel Prizes (1986 and 1987)
     - Today:
       - ~300 employees (~3000 worldwide)
       - 40+ different nationalities
       - open innovation w/ 277 projects & 1900 partners in FP7, H2020, ...
     11/14/2016 IBM Research - Zurich Lab

  3. Acknowledgements
     Accelerator Technologies @ ZRL
     - μServer team @ ZRL
     - Supervessel team @ CRL
     - openPOWER team
     - Peter Hofstee, ...
     L. Fiorin, F. Abel, E. Vermij, J. Weerasinghe, S. Dragone, M. Purandare, R. Polig, J. van Lunteren, H. Giefers, C. Hagleitner

  4. Cognitive Computing is the new (complex) workload
     [Figure: pipeline from Data through Information and Knowledge to Intelligence, with Context and Decisions & Actions; data sources SQL, NoSQL, ESB; stage labels Connect, Gather, Reason, Adapt. Below the pipeline, five "Accelerators for ..." columns list targets including text analytics, key-value stores, in-memory graph stores, massively scalable and accelerated graph analytics, (deep) learning acceleration, diagram & image understanding, knowledge graph creation, advanced inference algorithms, queries, and real-time, interactive crowdsourcing.]

  5. Cognitive Computing Workflows

  6. Inter-node vs. Intra-node Heterogeneous Computing Systems
     Inter-node:
     - hadoop-style workloads
     - main metrics
       - cost (capital, energy)
       - compute density
       - scalability
     - specialized, homogeneous nodes
     - datacenter disaggregation
     Intra-node:
     - complex HPC-like workloads
     - main metrics
       - memory / accelerator / inter-node BW
       - data-centric design
       - heterogeneous compute resources
     - versatile, heterogeneous nodes

  7. FPGAs @ SuperVessel (work in progress)
     [Figure labels: IaaS; Infrastructure Resource Management; FPGA Resource Management; Server Resource Management; CloudFPGA; CAPI FPGA; Disaggregated FPGA Infrastructure; Server/Storage/Network Infrastructure; FPGA (repeated); Heterogeneous Server; Storage; Network; Data Center Network.]

  8. Heterogeneous Nodes: POWER8 Accelerator Interfaces

  9. Inter-node vs. Intra-node Heterogeneous Computing Systems
     Inter-node:
     - hadoop-style workloads
     - main metrics
       - compute density
       - cost (capital, energy)
       - scalability
     - specialized, homogeneous nodes
     - datacenter disaggregation
     Intra-node:
     - complex HPC-like workloads
     - main metrics
       - memory / accelerator / inter-node BW
       - data-centric design
       - heterogeneous compute resources
     - versatile, heterogeneous nodes

  10. Hyperscale FPGAs @ μServer
     [Figure labels: Server (repeated); Memory; Fabric; Server Infrastructure; FPGA Infrastructure; DC Network.]

  11. From a practical point of view ...
     - KU060 FPGA w/ 16GB memory, 10GbE, PCIe extension, board management controller
     - The iNIC enables the FPGA to hook itself to the network and to communicate with other DC resources, such as servers, disks, I/O and other FPGA appliances
     [Figure labels: FPGA Card; Memory; FPGA; Management Layer (ML); User Logic (vFPGA); iNIC; Network Service Layer (NSL); Data Center Network.]

  12. Network-attached FPGAs @ Hyperscale
     - Disaggregation of compute resources
       - FPGAs can be deployed independently of:
         - the # of CPUs (respectively servers)
         - the server form factor (which keeps on shrinking)
       - FPGAs can be provisioned / rented similar to other cloud compute, storage and network resources
     - Scalability
       - Users can build SDN fabrics of FPGAs in the cloud
       - FPGAs are promoted to the rank of peer processor (end of slavery)
       - HW-based FPGA-to-FPGA communication provides low latency and high throughput (RDMA NICs)
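The disaggregation point on slide 12 can be illustrated with a small Python sketch. It is not taken from the presentation: the `ResourcePool` class, the resource ids and the tenant name are all hypothetical. It only shows the idea that servers and network-attached FPGAs sit in separate pools, so the number of FPGAs a user rents is not tied to the number of servers.

```python
from dataclasses import dataclass, field


@dataclass
class ResourcePool:
    """A pool of individually rentable resources (hypothetical model)."""
    kind: str
    free: set = field(default_factory=set)
    leased: dict = field(default_factory=dict)  # resource id -> tenant

    def lease(self, tenant: str, count: int) -> list:
        """Lease `count` resources to `tenant`, or raise if too few are free."""
        if count > len(self.free):
            raise RuntimeError(f"only {len(self.free)} {self.kind} resources free")
        picked = sorted(self.free)[:count]
        for rid in picked:
            self.free.remove(rid)
            self.leased[rid] = tenant
        return picked

    def release(self, rid: str) -> None:
        """Return a leased resource to the free set."""
        del self.leased[rid]
        self.free.add(rid)


# Servers and FPGAs live in separate pools: neither count constrains the other.
servers = ResourcePool("server", free={f"srv-{i}" for i in range(4)})
fpgas = ResourcePool("fpga", free={f"fpga-{i}" for i in range(16)})

# A tenant rents one server but eight network-attached FPGAs.
tenant_servers = servers.lease("tenant-a", 1)
tenant_fpgas = fpgas.lease("tenant-a", 8)
```

With PCIe-attached cards the FPGA count would be bounded by the slots of the rented servers; in this model the two leases are independent calls.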

  13. Dense, Disaggregated Nodes: ZRL “Dome” μServer
     - Cloud economics
       - density (>1000 nodes / rack)
       - integrated NICs
       - switch card (backplane, no cables)
       - medium to low-cost compute chips
     - Passive liquid cooling
       - ultimate density (cooling >70W / node)
       - energy re-use
     - Built to integrate heterogeneous resources
       - CPUs
       - Accelerators

  14. SuperVessel: The OpenPOWER Cloud for Developers and Ecosystem
     www.ptopenlab.com
     - SPARK, Symphony
     - Cloud Data Service
     - Accelerator service
     - IoT application development platform
     - POWER open source migration service
     - Machine learning & deep learning
     - Science computation platform

  15. Accelerator DevOps Service on OpenPOWER cloud
     - Online Accelerator project management
     - Online development service with Cloud-based IDE (Collaboration with Xilinx)
     - Test in VM/Docker equipped with FPGA (for POWER8 & CAPI)
     - Publish to Accelerator App. Store and deployment for application on cloud
     Supporting functions:
     - FPGA resource virtualization with Docker
     - Accelerator scheduling for FPGA resource in Cloud
     - Data synchronization in DevOps environment
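Slide 15 mentions accelerator scheduling for FPGA resources without describing the policy. The Python sketch below is only an assumption of what the simplest such policy could look like: first-come, first-served assignment of container jobs to free FPGAs. The `FpgaScheduler` class, the container names and the accelerator names are hypothetical and do not come from SuperVessel.

```python
from collections import deque


class FpgaScheduler:
    """First-come, first-served assignment of accelerator jobs to FPGAs (hypothetical)."""

    def __init__(self, fpga_ids):
        self.free = deque(fpga_ids)   # FPGAs with no accelerator loaded
        self.pending = deque()        # jobs waiting for an FPGA
        self.running = {}             # fpga id -> (container, accelerator)

    def submit(self, container, accelerator):
        """Queue a job and try to place it; return the FPGA id, or None if it must wait."""
        self.pending.append((container, accelerator))
        placed = self._dispatch()
        return placed.get(container)

    def release(self, fpga_id):
        """Free an FPGA and hand it to the next waiting job, if any."""
        del self.running[fpga_id]
        self.free.append(fpga_id)
        return self._dispatch()

    def _dispatch(self):
        """Pair free FPGAs with waiting jobs in arrival order."""
        placed = {}
        while self.free and self.pending:
            fpga_id = self.free.popleft()
            container, accelerator = self.pending.popleft()
            self.running[fpga_id] = (container, accelerator)
            placed[container] = fpga_id
        return placed


sched = FpgaScheduler(["fpga-0", "fpga-1"])
first = sched.submit("container-a", "text-analytics")
second = sched.submit("container-b", "key-value-store")
third = sched.submit("container-c", "text-analytics")  # no FPGA free: the job waits
handed_over = sched.release(first)                     # the waiting job gets the freed FPGA
```

A production scheduler would also have to account for reconfiguration time and for which accelerator image is already loaded; this sketch ignores both.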

  16. SuperVessel Acceleration App Store
     Accelerators
     - ... allow accelerator developers to create a new accelerator and publish it.
     - ... allow application developers to create VMs/Docker containers with the selected accelerators
     Applications
     - ... demos for new clients to try applications with accelerators.

  17. Conclusions
     - Heterogeneous computing systems are the sustainable way to advance the two main cloud metrics: € to solution & time to solution
       - reconfigurable computing is one of the few options available (... in the short term)
       - powerful heterogeneous compute nodes for complex workloads (strong, HPC-like nodes) openpower.org
       - specialized nodes to build rack-level heterogeneous systems for hadoop-like applications (e.g., cloudFPGA)
     - (Hyperscale) cloud deployment of disaggregated, heterogeneous computing systems (IaaS) ... is still at the research stage but advancing quickly
       - Supervessel @ www.ptopenlab.com
       - Zurich Heterogeneous Computing Cloud (ZHC2) @ zhc2.zurich.ihost.com
     - FPGAs are getting there, but standardization & community effort are required for
       - accelerator interfaces
       - FPGA compatibility and legacy code
       - cloud orchestration
       - libraries, usage models
