
SURFsara NOC Flash Talk: Erik Ruiter, Sr. Network Specialist



  1. SURFsara NOC Flash Talk
     Erik Ruiter, Sr. Network Specialist, SURFsara
     TF-NOC Meeting, Cambridge UK, 20-3-2014

  2. Services
     • National supercomputer: Cartesius (capability computing)
     • National compute cluster: Lisa (capacity computing)
     • Grid compute & storage: Gina (middleware services)
     • HPC Cloud: IaaS (Do-it-yourself)
     • Hadoop – Data processing (map-reduce algorithm)
     • GPU cluster (Computing on a video card)
     • Collaboratorium: Remote collaboration (video wall)
     • Render cluster (Data visualization)
     • Beehub / SURFDrive (Dropbox unlimited)
     TF-NOC meeting Cambridge 2014 – NOC Flash Talk

  3. Core Network: High-level overview
     • Fully redundant topology
     • Core routers: 2x Juniper MX960
     • Internal firewall cluster: 2x FortiGate 311B + 5x Cisco 3750
     • External firewall cluster: 2x FortiGate 3040 + 2x Juniper EX4550

  4. Core Network: E-infra compute and storage network
     SURFsara E-infra network infrastructure
     • Connects the HPC environments in SURFsara
     • High capacity (160 Gbps between QNodes)
     • Used for east-west traffic
     • Low latency
     • High scalability (up to 786 x 10 Gbps)
     • Easy scaling
     • Single CLI management
     • Based on Juniper QFabric

  5. NOC Tools: Monitoring
     • Icinga
        • Currently monitoring 149 hosts, with 379 services
        • NConf is used for configuration (no manual text editing)
     • Cacti
     • Syslog daemon
        • Currently syslog-ng; looking for a better solution (Logstash)

  6. NOC Tools: NfSen
     NetFlow is enabled on our 2 core routers. NfSen is used for:
     - Intrusion detection
        - Alert triggers fire when suspicious traffic patterns are detected (only a few rules in place at the moment)
     - Traffic monitoring for our main uplinks:
        - SURFnet (10 Gbps)
        - LHCOPN (10 Gbps)
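The alert-trigger idea on this slide can be illustrated with a small standalone sketch. It does not use NfSen itself: the `Flow` record, the `scan_suspects` function and the `max_targets` threshold are all hypothetical names chosen for this example, and the rule shown (one source touching many distinct destination/port pairs) is only one assumed kind of "suspicious traffic pattern", not one of the rules actually in place.

```python
from collections import defaultdict
from typing import Iterable, List, NamedTuple


class Flow(NamedTuple):
    """A simplified flow record (hypothetical; real NetFlow records carry more fields)."""
    src: str
    dst: str
    dst_port: int
    octets: int


def scan_suspects(flows: Iterable[Flow], max_targets: int = 100) -> List[str]:
    """Return sources that contacted more than max_targets distinct (dst, port) pairs."""
    targets_by_src = defaultdict(set)
    for flow in flows:
        targets_by_src[flow.src].add((flow.dst, flow.dst_port))
    # Sorted so that the alert output is stable between runs.
    return sorted(
        src for src, targets in targets_by_src.items() if len(targets) > max_targets
    )
```

A threshold on distinct targets, rather than on byte counts, is used here because port scans tend to be low-volume; the right threshold depends entirely on local traffic.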

  7. NOC Tools: Management
     Network management:
     • All elements reachable over SSH via the management infrastructure
     • All elements reachable through the console port (on centralized console servers)
     • All elements authenticated by TACACS+
     • All elements have SNMP v2 read-only access (limited to a single IP address)
     Configuration management:
     • Rancid + SVNWEB
     • Small in-house developed web interface to easily find configs

  8. NOC Ticketing system
     • Trac
        • Combined issue tracking and wiki system
        • Used for software development projects; can interface with SVN, Git, etc.
     • All departments within SURFsara use Trac, each with its own Trac wiki and ticket queue
     • Network access requests also have a separate Trac queue: a request is first validated by the Security team, then assigned to the NOC team
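The two-step handling of network access requests can be sketched as a tiny state machine. This is only an illustration of the order of steps described on the slide, not Trac's workflow configuration; the class and state names are hypothetical, and the `REJECTED` state is an assumption added so that a failed validation has somewhere to go.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    NEW = "new"
    VALIDATED = "validated by Security"
    ASSIGNED = "assigned to NOC"
    REJECTED = "rejected"  # assumption: not mentioned on the slide


@dataclass
class AccessRequest:
    summary: str
    state: State = State.NEW

    def validate(self, approved: bool) -> None:
        """Security team decision; only allowed on a new request."""
        if self.state is not State.NEW:
            raise ValueError("only new requests can be validated")
        self.state = State.VALIDATED if approved else State.REJECTED

    def assign_to_noc(self) -> None:
        """Hand over to the NOC queue; requires prior Security validation."""
        if self.state is not State.VALIDATED:
            raise ValueError("request must be validated by Security first")
        self.state = State.ASSIGNED
```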

  9. NOC Documentation
     • MediaWiki / Trac
        • Stores all project-related and operational information
        • Currently looking into moving all documentation from MediaWiki to Trac
        • Exporting is difficult (different markup language, Trac supports fewer functions)
     • Racktables
        • Rack space
        • VLANs
        • IP space
        • Looking into storing cabling information
     • Racktables custom-added features:
        • Daily script that does reverse DNS lookups to determine IP subnet occupation
        • Daily script that reads IP information from routers (SNMP) to document ‘routed by’ information
        • Racktables API is used to create a lightweight IP / VLAN overview
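The reverse-DNS occupation check mentioned above could look roughly like the following. This is a minimal sketch under stated assumptions, not the in-house script: it treats an address as "occupied" when it has a PTR record, it uses only the Python standard library, and the function names (`reverse_lookup`, `subnet_occupation`, `occupation_ratio`) are hypothetical. Writing the result back into Racktables is left out.

```python
import ipaddress
import socket
from typing import Callable, Dict, Optional


def reverse_lookup(ip: str) -> Optional[str]:
    """Return the PTR name for ip, or None when the lookup fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:  # socket.herror and socket.gaierror are OSError subclasses
        return None


def subnet_occupation(
    cidr: str, resolve: Callable[[str], Optional[str]] = reverse_lookup
) -> Dict[str, str]:
    """Map each host address in cidr that resolves to its reverse DNS name."""
    used = {}
    for host in ipaddress.ip_network(cidr).hosts():
        name = resolve(str(host))
        if name is not None:
            used[str(host)] = name
    return used


def occupation_ratio(
    cidr: str, resolve: Callable[[str], Optional[str]] = reverse_lookup
) -> float:
    """Fraction of host addresses in cidr that have a reverse DNS name."""
    total = len(list(ipaddress.ip_network(cidr).hosts()))
    if total == 0:
        return 0.0
    return len(subnet_occupation(cidr, resolve)) / total
```

The resolver is passed in as a parameter so the logic can be exercised against a plain dictionary (for example `subnet_occupation("192.0.2.0/29", table.get)`) without touching DNS; a daily job would simply call it with the default resolver for each documented subnet.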

  10. NOC Structure
     • Small team
        • 5 network engineers
        • 1 team leader
     • All engineers work on support, implementation and innovation projects
     • Rotating NOC duty days (once per week)
        • Answering mail
        • Small operational requests
        • Handling incidents
     • Support during working hours only

  11. NOC Frontend / Communication
     • Customers
        • Mainly internal: system administrators of HPC systems
        • No real SLAs; however, providing redundant connectivity is getting more attention
     • Internal communications are mainly done through email and Trac tickets
     • For external communications we have a mailing list: noc@surfsara.nl
     • There is no helpdesk phone number

  12. Erik Ruiter
     Erik.Ruiter@surfsara.nl
     www.surfsara.nl
