Publishing ALICE data & CVMFS infrastructure monitoring

  1. Publishing ALICE data & CVMFS infrastructure monitoring
     Costin.Grigoras@cern.ch

  2. Publishing ALICE data
     ● VoBox services health
       – AliEn services running status
         ● CE, ClusterMonitor, CMreport
       – Proxy status and time left
         ● The certificate used to start AliEn services
         ● Delegated proxy, proxy server, proxy of the machine
     ● Storage Element test results
       – ADD and GET results

  3. Publishing details
     ● dashb-test-mb.cern.ch:6162
       – Persistent SSL connection
       – Client certificate authentication
     ● Using ActiveMQ Java library ver. 5.9.1
       – activemq-client and activemq-stomp JARs
     ● Running as a thread in the central MonALISA repository for ALICE
     ● Currently pushing 640 values every 30 minutes
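The publishing loop on this slide can be sketched as a background thread that wakes up at a fixed interval and hands a batch of values to a sender. This is only an illustration in Python: the original runs inside the MonALISA repository and uses the ActiveMQ Java client. `PeriodicPublisher`, `collect`, and `send` are hypothetical names, and the broker connection is replaced by a plain callable.

```python
import threading


class PeriodicPublisher(threading.Thread):
    """Calls send(value) for every collected value, once per interval."""

    def __init__(self, collect, send, interval_s=30 * 60):
        super().__init__(daemon=True)
        self._collect = collect        # hypothetical: returns an iterable of values
        self._send = send              # hypothetical: publishes a single value
        self._interval_s = interval_s  # 30 minutes, as stated on the slide
        self._stop_event = threading.Event()
        self.cycles = 0

    def publish_once(self):
        """Run one publishing cycle and return how many values were sent."""
        count = 0
        for value in self._collect():
            self._send(value)
            count += 1
        self.cycles += 1
        return count

    def run(self):
        # Event.wait() doubles as an interruptible sleep between cycles.
        while not self._stop_event.is_set():
            self.publish_once()
            self._stop_event.wait(self._interval_s)

    def stop(self):
        self._stop_event.set()
```

Calling `publish_once()` directly is handy for testing a cycle without starting the thread; `stop()` interrupts the wait so the thread exits promptly.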

  4. Message structure
     ● Headers:
         nagios_host=alimonitor.cern.ch
         persistent=true
         destination=/topic/sam.alice.metric
     ● Body:
         {"mlServiceName": "CNAF",
          "hostName": "ui01-alice.cr.cnaf.infn.it",
          "serviceFlavour": "AliEn-VoBox-Test",
          "siteName": "CNAF",
          "metricStatus": "OK",
          "metricName": "Proxy of the machine",
          "summaryData": "Proxy is ok",
          "gatheredAt": "ui01-alice.cr.cnaf.infn.it",
          "timestamp": "2014-06-04T15:38:53Z",
          "voName": "alice",
          "detailsData": "Time left: 20:51"}
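To show how the headers and the JSON body fit together, here is a minimal Python sketch that assembles them into a STOMP `SEND` frame. It assumes the message travels over STOMP (suggested by the activemq-stomp JAR and the `destination` header, but not stated outright in the slides). `build_send_frame` is a hypothetical helper, the `content-length` header is an optional addition, and nothing is sent over the network.

```python
import json

# Header and body values copied from the slide.
HEADERS = {
    "nagios_host": "alimonitor.cern.ch",
    "persistent": "true",
    "destination": "/topic/sam.alice.metric",
}

BODY = {
    "mlServiceName": "CNAF",
    "hostName": "ui01-alice.cr.cnaf.infn.it",
    "serviceFlavour": "AliEn-VoBox-Test",
    "siteName": "CNAF",
    "metricStatus": "OK",
    "metricName": "Proxy of the machine",
    "summaryData": "Proxy is ok",
    "gatheredAt": "ui01-alice.cr.cnaf.infn.it",
    "timestamp": "2014-06-04T15:38:53Z",
    "voName": "alice",
    "detailsData": "Time left: 20:51",
}


def build_send_frame(headers, body):
    """Serialise headers and a JSON body into one STOMP SEND frame (bytes)."""
    payload = json.dumps(body).encode("utf-8")
    lines = ["SEND"]
    # STOMP writes headers as key:value, one per line.
    lines += [f"{key}:{value}" for key, value in headers.items()]
    lines.append(f"content-length:{len(payload)}")
    head = "\n".join(lines).encode("utf-8")
    # A blank line separates headers from the body; a NUL byte ends the frame.
    return head + b"\n\n" + payload + b"\x00"


frame = build_send_frame(HEADERS, BODY)
```

Note that the slide writes the headers as `key=value`, while on the wire STOMP uses `key:value`.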

  5. CVMFS infrastructure monitoring proposal
     ● Now a critical service, and not only for ALICE
     ● Currently missing information about the performance of the Stratum 0/1 servers and the local site proxies
       – Some bits of information in various places, like availability of Stratum 0, awstats ...
       – Not enough to assess whether the services' performance is OK
     ● Some sites are alerted to failures by the users (tasks failing)

  6. To address that
     ● Deploy a monitoring service on each server of the infrastructure
       – Full host monitoring (CPU, memory, network IO, disk IO performance, sockets and processes in each state)
       – CVMFS and Squid-specific probes (catalogue version, request counters, size)
     ● Real-time access to the parameters, plus
       – Alarms, history of all parameters, simple display options
       – Now trivial to integrate into the dashboard
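As a rough illustration of the host-monitoring and alarm ideas above, the following Python sketch collects a few host parameters with the standard library only and applies one threshold check. It is not the MonALISA probe set: `collect_host_metrics`, `check_alarms`, and the 10% free-disk threshold are hypothetical, and the CVMFS and Squid-specific probes are left out because the slides do not say how those values are read.

```python
import os
import shutil
import time


def collect_host_metrics(path="/"):
    """Return a small snapshot of host parameters."""
    metrics = {"timestamp": time.time(), "cpu_count": os.cpu_count()}
    try:
        # 1-, 5- and 15-minute load averages (Unix only).
        metrics["load1"], metrics["load5"], metrics["load15"] = os.getloadavg()
    except (AttributeError, OSError):
        pass  # load average not available on this platform
    usage = shutil.disk_usage(path)
    metrics["disk_total_bytes"] = usage.total
    metrics["disk_free_bytes"] = usage.free
    return metrics


def check_alarms(metrics, min_free_fraction=0.1):
    """Return a list of alarm strings; the default threshold is an example value."""
    alarms = []
    total = metrics["disk_total_bytes"]
    if total and metrics["disk_free_bytes"] / total < min_free_fraction:
        alarms.append(f"disk free below {min_free_fraction:.0%}")
    return alarms
```

A history of all parameters would then amount to appending each snapshot to a time series keyed by `timestamp`.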

  7. Additionally
     ● MonALISA services also perform network topology discovery out of the box
       – This would help with the automatic configuration of local site proxies
       – Similar algorithm to the one used for the automatic SE selection for ALICE jobs
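The slides do not describe the selection algorithm itself. As a hypothetical sketch of the general idea, candidate proxies could be ranked by a measured network distance (here, round-trip time) and the closest few kept. The function `rank_proxies`, the host names, and the RTT values below are all made up for illustration.

```python
def rank_proxies(rtt_ms_by_proxy, keep=2):
    """Return the `keep` closest reachable proxies, nearest first."""
    # Drop proxies with no measurement (treated as unreachable).
    reachable = {
        proxy: rtt for proxy, rtt in rtt_ms_by_proxy.items() if rtt is not None
    }
    # Sort by RTT, breaking ties by name so the result is deterministic.
    return sorted(reachable, key=lambda proxy: (reachable[proxy], proxy))[:keep]


# Made-up measurements, in milliseconds.
measurements = {
    "squid-a.example.org:3128": 12.4,
    "squid-b.example.org:3128": 3.1,
    "squid-c.example.org:3128": None,  # unreachable during the last probe
    "squid-d.example.org:3128": 48.0,
}

closest = rank_proxies(measurements)
```

A real deployment would presumably draw on the discovered topology rather than a single RTT figure; the ranking step is the part this sketch shows.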
