HPC Environment Management: New Challenges in the Petaflop Era


  1. HPC Environment Management: New Challenges in the Petaflop Era
     Jonas Dias (jonas@nacad.ufrj.br), Albino Aveleda (bino@nacad.ufrj.br)
     VECPAR’10

  2. Agenda
     1. Introduction
     2. Available Tools
        1. Deployment
        2. Monitoring
        3. Proprietary Solutions
     3. LEMMing Project
     4. Conclusion
     HPC Environment Management: New Challenges in the Petaflop Era, 6/24/2010

  3. Introduction
     • High Performance Computing Systems
       – Universities
       – Research centers
       – Experiments, simulations
       – Industry Sector
         • 62.4% (11/09 top500.org list)
     • Petaflop barrier

  4. Growth of processors per system

  5. Management and Monitoring Tools
     • Systems with many processors
     • Organized information
       – List of nodes
       – Hierarchical approach
     • Usability
       – Expert and non-expert managers

  6. Managing a Supercomputer
     • Grant secure access
     • Handle defects and problems quickly
     • Offer a queue system
     • Use some monitoring tools
     • Support non-uniform infrastructure
     • Integrate with local tools

  7. Administering an HPC center
     • Proprietary Software
       – Integration
       – Usability
     • An open source proposal
       – LEMMing
         • Single point of management
         • RIA
         • Customization

  8. Available Tools
     • Deployment Tools
       – OSCAR
       – ROCKS
       – xCAT
     • Monitoring Tools
       – Cacti
       – Ganglia
       – Nagios

  9. The Deployment
     • Should be easy
       – GUI
       – Out of the box installation
     • Integrate management features
       – Node adding and removal
       – Changes in properties
     • Basic HPC Tools
       – MPI
       – Queue system
     • Offer monitoring tools

  10. Comparison

      | Tool  | Cluster Installation | Node Adding                  | MPI | Queuing System | Monitoring Tool |
      |-------|----------------------|------------------------------|-----|----------------|-----------------|
      | OSCAR | GUI                  | GUI + Network listening      | Yes | Yes            | Yes             |
      | Rocks | GUI                  | UI + Network listening       | Yes | Yes            | Yes             |
      | xCAT  | Command Line         | Command Line + Manual Adding | No  | No             | No              |

  11. Monitoring Tools
      • Web based
        – Easy access
      • Rich Internet Application
      • Alert sending
      • Customizable
        – Plug-ins
      • The monitoring focus

  12. Comparison

      | Tool    | Web Based | RIA | Send Alert | Plugins | Monitoring focus |
      |---------|-----------|-----|------------|---------|------------------|
      | Cacti   | Yes       | No  | No         | Yes     | Network          |
      | Ganglia | Yes       | No  | No         | No      | Cluster/Grid     |
      | Nagios  | Yes       | No  | Yes        | Yes     | Network          |

  13. Proprietary solutions
      • Usually use some open source apps
        – OSCAR, Rocks, xCAT, Ganglia, Nagios, Cacti…
      • Tune the cluster configuration
      • Proprietary tools for administration
        – Hardware specific
      • Poor integration
        – Different vendors

  14. Challenges for an HPC environment
      • Increasing number of processors
      • Heterogeneous environments
        – Resources from different machines
      • Particular/Local tools
      • Administrators with different levels of knowledge
      • Present available resources as a whole
        – Organized and customized

  15. Node naming and organization: simple form
      • An example, with the nodes laid out in two columns
        – Column 1: node 0, node 1, node 2, node 3, node 4
        – Column 2: node 5, node 6, node 7, node 8
      • Expansion
        – Column 1 gains node 9 and node 10
        – Column 2 gains node 11 and node 12

  16. Node naming and organization: hierarchical approach
      • An example, with the nodes laid out in two columns
        – Column 1: c0n0, c0n1, c0n2, c0n3, c0n4
        – Column 2: c1n0, c1n1, c1n2, c1n3
      • Expansion
        – Column 1 gains c0n5 and c0n6
        – Column 2 gains c1n4 and c1n5
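
The contrast between the two naming schemes can be sketched in a few lines of Python. This is only an illustration, not part of LEMMing: the helper names are hypothetical, and reading `c` and `n` in names such as `c0n3` as a group index and a node index is an assumption based on the example names on the slide.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Assumed name pattern "c<group>n<node>", taken from the slide's examples.
NAME_RE = re.compile(r"^c(\d+)n(\d+)$")


def hierarchical_name(group: int, node: int) -> str:
    """Build a hierarchical node name such as 'c0n4'."""
    return f"c{group}n{node}"


def parse_name(name: str) -> Tuple[int, int]:
    """Split a hierarchical name into (group, node) indices."""
    match = NAME_RE.match(name)
    if match is None:
        raise ValueError(f"not a hierarchical node name: {name!r}")
    return int(match.group(1)), int(match.group(2))


def group_nodes(names: Iterable[str]) -> Dict[int, List[int]]:
    """Recover the layout (group -> sorted node indices) from names alone."""
    groups: Dict[int, List[int]] = defaultdict(list)
    for name in names:
        group, node = parse_name(name)
        groups[group].append(node)
    return {g: sorted(nodes) for g, nodes in sorted(groups.items())}


# Layout before expansion: five nodes in group 0, four in group 1.
initial = [hierarchical_name(0, n) for n in range(5)]
initial += [hierarchical_name(1, n) for n in range(4)]

# Expansion: two nodes added to each group. Existing names do not change,
# and each new name still says which group the node belongs to.
expanded = initial + ["c0n5", "c0n6", "c1n4", "c1n5"]

if __name__ == "__main__":
    print(group_nodes(expanded))
```

With flat names such as `node 9`, the same grouping cannot be recovered from the name; it would have to be kept in a separate table.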

  17. LEMMing Project
      • Inspired by Zimbra Collaboration Suite
      • Use Open Source tools
      • Use AJAX technologies
      • LEMMing is not an extension
        – Less dependent
        – Great usability

  18. What is LEMMing?
      • Linux Enterprise Management and Monitoring
      • Cluster with thousands of nodes
        – Many failures
      • Flexibility
      • Ease of adding features
      • Great usability
        – Detect and solve problems faster
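
A rough calculation shows why thousands of nodes go together with many failures. Assuming, for illustration only, that each node fails independently with the same probability p over some period, the chance that at least one of N nodes fails in that period is 1 - (1 - p)^N. The function names and the numbers below are made up for the sketch; they are not measurements from any cluster.

```python
def prob_any_failure(nodes: int, p_node: float) -> float:
    """Probability that at least one of `nodes` independent nodes fails."""
    if nodes < 0:
        raise ValueError("nodes must be non-negative")
    if not 0.0 <= p_node <= 1.0:
        raise ValueError("p_node must be a probability in [0, 1]")
    return 1.0 - (1.0 - p_node) ** nodes


def expected_failures(nodes: int, p_node: float) -> float:
    """Expected number of failed nodes in the same period."""
    return nodes * p_node


if __name__ == "__main__":
    # Hypothetical per-node failure probability for one period.
    p = 0.001
    for n in (10, 1_000, 10_000):
        print(n, round(prob_any_failure(n, p), 4), expected_failures(n, p))
```

Under these assumptions, a per-node probability of 0.001 already gives about a 0.63 chance of at least one failure among 1,000 nodes.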

  19. Features
      • Freeware
      • Web Service based
      • AJAX interface design
      • Integration of other tools
      • Single point of management
      • Tested with Rocks clusters
      • Support for many cluster topologies
      • Integrated with workload management
      • Parallel shell tools
      • Customizable Dashboard

  20. LEMMing Modules
      • LEMM-WS
        – Web Services
        – Coupled to the supercomputer
        – API
      • LEMM-GATE
        – Web application
        – Independent of the cluster
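
One way to picture the split between the two modules is a per-cluster service object plus a cluster-independent front end that aggregates several of them. The sketch below is purely illustrative: the class names, methods, and sample data are hypothetical and do not reflect the actual LEMM-WS API, and plain Python objects stand in for the web service calls.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ClusterService:
    """Hypothetical stand-in for one web service instance tied to a cluster."""
    name: str
    node_status: Dict[str, str] = field(default_factory=dict)  # node -> "up" / "down"

    def list_nodes(self) -> List[str]:
        return sorted(self.node_status)

    def down_nodes(self) -> List[str]:
        return sorted(n for n, s in self.node_status.items() if s == "down")


class Gate:
    """Hypothetical stand-in for the front end: one view over many clusters."""

    def __init__(self) -> None:
        self._services: Dict[str, ClusterService] = {}

    def register(self, service: ClusterService) -> None:
        self._services[service.name] = service

    def summary(self) -> Dict[str, Dict[str, int]]:
        """Per-cluster node counts, gathered through each service."""
        return {
            name: {"nodes": len(svc.list_nodes()), "down": len(svc.down_nodes())}
            for name, svc in sorted(self._services.items())
        }

    def all_down_nodes(self) -> List[str]:
        """Every failed node across all clusters, as 'cluster:node'."""
        return [
            f"{name}:{node}"
            for name, svc in sorted(self._services.items())
            for node in svc.down_nodes()
        ]


# Sample data, invented for the sketch.
gate = Gate()
gate.register(ClusterService("alpha", {"c0n0": "up", "c0n1": "down"}))
gate.register(ClusterService("beta", {"c0n0": "up"}))

if __name__ == "__main__":
    print(gate.summary())
    print(gate.all_down_nodes())
```

Because the front end only talks to the service interface, it does not need to run on, or know the internals of, any particular cluster.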

  21. LEMMing Modules Relationship

  22. LEMM-GATE interface

  23. LEMM-GATE interface

  24. Conclusion
      • Huge HPC centers
        – Heterogeneous machines
        – Many nodes per cluster
      • LEMMing
        – Integrates the management and monitoring software stack of multiple clusters
        – Rich internet application
        – Open Source model
        – Use of available tools

  25. Future Work
      • Add support for different cluster systems
      • IPMI support
      • Queue management
      • Visit us:
        – http://lemm.sf.net
        – Check the video demonstration

  26. Acknowledgments
      • The author thanks:
        – High Performance Computing Center
        – Professor Alvaro Coutinho
        – DELL Brazil

  27. HPC Environment Management: New Challenges in the Petaflop Era
      Thanks!
      Jonas Dias (jonas@nacad.ufrj.br), Albino Aveleda (bino@nacad.ufrj.br)
      VECPAR’10
