Short introduction to monitoring systems for large computer farms
[ Pierre Zelnicek 2010 ]

Security Auditing: Definition and setup of security policies for systems, users and network. Daily system checks and log analysis to find any incidents.
Failure Monitoring: Monitoring of hardware and software failures. Event-based alerting makes it possible to react autonomously.
Performance Monitoring: Long-term monitoring and statistical evaluation of system and network performance values.
Baseline (Service Level Agreement)
Immediate Reaction / Configuration Change

Performance Monitoring

What has to be monitored?
CPU, disk and memory (CDM) usage; network bandwidth usage; CPU, disk and network I/O; network latency.

Where does it have to be monitored?
On the systems themselves, per single user/process/instance. Network monitoring is done by accessing the switches via SNMP.

What open-source solutions exist?
Ganglia, Lemon, Cacti, Smokeping and self-built solutions based on RRDTool.
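As an illustration of the network bandwidth item, here is a minimal Python sketch of how link utilization can be derived from two readings of an interface octet counter, as a switch typically exposes them over SNMP. The function names, the 32-bit default and the sample numbers are assumptions made for this example, not part of the original slides, and the SNMP query itself is left out.

```python
def counter_delta(previous: int, current: int, bits: int = 32) -> int:
    """Difference between two readings of a wrapping counter.

    Assumes the counter wrapped at most once between the two readings.
    """
    modulus = 1 << bits
    return (current - previous) % modulus


def utilization_percent(prev_octets: int, curr_octets: int,
                        interval_s: float, link_bps: float,
                        bits: int = 32) -> float:
    """Average link utilization in percent over one polling interval."""
    if interval_s <= 0 or link_bps <= 0:
        raise ValueError("interval_s and link_bps must be positive")
    # Octet counters count bytes, link speed is given in bits per second.
    bits_transferred = counter_delta(prev_octets, curr_octets, bits) * 8
    return 100.0 * bits_transferred / (interval_s * link_bps)


if __name__ == "__main__":
    # Hypothetical readings taken 300 s apart on a 1 Gbit/s port.
    print(utilization_percent(1_000_000, 38_500_000, 300, 1_000_000_000))
```

On fast links a 32-bit counter can wrap more than once per polling interval, so a shorter interval or a 64-bit counter (bits=64) is the safer choice there.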

Failure Monitoring

What has to be monitored?
Errors and failures in hardware and software components. Critical thresholds for performance values.

Where does it have to be monitored?
On the systems themselves, per single software instance or hardware component. Performance thresholds can be monitored via access to a performance monitoring system.

What open-source solutions exist?
SysMES, Nagios.

What should the software additionally provide?
Autonomous execution of reactions to known or possible errors and failures.

Is it possible to foresee hardware failures?
Yes, for some hardware components which provide indicator values.
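The combination of threshold checks and autonomous reactions can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions: the metric names, the threshold values and the reaction callback are hypothetical, and the numeric states only mirror the OK/WARNING/CRITICAL convention known from Nagios plugins rather than any particular tool's API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

OK, WARNING, CRITICAL = 0, 1, 2


@dataclass(frozen=True)
class Threshold:
    warning: float
    critical: float


def classify(value: float, threshold: Threshold) -> int:
    """Map a measured value to a state; higher values are assumed to be worse."""
    if value >= threshold.critical:
        return CRITICAL
    if value >= threshold.warning:
        return WARNING
    return OK


def evaluate(metrics: Dict[str, float],
             thresholds: Dict[str, Threshold],
             reactions: Dict[str, Callable[[str, float], None]]
             ) -> List[Tuple[str, int]]:
    """Classify every metric that has a threshold and run the registered
    reaction for each metric found in the CRITICAL state."""
    results = []
    for name, value in sorted(metrics.items()):
        if name not in thresholds:
            continue
        state = classify(value, thresholds[name])
        results.append((name, state))
        if state == CRITICAL and name in reactions:
            reactions[name](name, value)
    return results


if __name__ == "__main__":
    # Hypothetical metrics and limits, for illustration only.
    limits = {"disk_used_percent": Threshold(80.0, 95.0)}
    react = {"disk_used_percent": lambda n, v: print(f"reacting to {n}={v}")}
    print(evaluate({"disk_used_percent": 97.0}, limits, react))
```

Keeping the reactions in a separate table means a known failure can get an automatic response while unknown ones are still reported for manual handling.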

Failure Monitoring

Enclosure Device ID: 252
Slot Number: 7
Device Id: 11
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SATA
Raw Size: 931.512 GB [0x74706db0 Sectors]
Non Coerced Size: 931.012 GB [0x74606db0 Sectors]
Coerced Size: 930.390 GB [0x744c8000 Sectors]
Firmware state: Online, Spun Up
SAS Address(0): 0x96803721a299998b
Connected Port Number: 7(path0)
Inquiry Data: WD-WMATV1432482WDC WD1002FBYS-02A6B0 03.00C06
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None
Device Speed: 3.0Gb/s
Link Speed: 3.0Gb/s
Media Type: Hard Disk Device
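A drive report of this kind lends itself to an automated check of its indicator values. The Python sketch below parses "Key: Value" lines and flags non-zero error counters. Which counters to watch, and the rule that a healthy drive reports a firmware state starting with "Online", are assumptions made for this example and not statements from the slides.

```python
from typing import Dict, List

# Counters treated as failure indicators (an assumption for this sketch).
WATCHED_COUNTERS = (
    "Media Error Count",
    "Other Error Count",
    "Predictive Failure Count",
)


def parse_report(text: str) -> Dict[str, str]:
    """Turn 'Key: Value' lines into a dictionary; other lines are ignored."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def drive_warnings(fields: Dict[str, str]) -> List[str]:
    """Return one message per indicator that suggests a failing drive."""
    warnings = []
    for name in WATCHED_COUNTERS:
        raw = fields.get(name)
        if raw is not None and raw.isdigit() and int(raw) > 0:
            warnings.append(f"{name} = {raw}")
    state = fields.get("Firmware state", "")
    if state and not state.startswith("Online"):
        warnings.append(f"Firmware state = {state}")
    return warnings


if __name__ == "__main__":
    sample = (
        "Slot Number: 7\n"
        "Media Error Count: 0\n"
        "Predictive Failure Count: 0\n"
        "Firmware state: Online, Spun Up\n"
    )
    print(drive_warnings(parse_report(sample)))
```

An empty result means none of the watched indicators is raised. Any non-empty result could be handed to the alerting side of a failure monitoring system.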

Security Auditing

What does security auditing cover?
Checking the policy enforcement setup and checking for incidents.

Where does it have to be done?
On the systems themselves for the policy enforcement setup. On a central logging facility for detecting incidents.

What open-source solutions exist?
SELinux or AppArmor for policy enforcement. RSyslog and LogCheck, SNORT for incident detection.
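Incident detection on a central logging facility often starts with simple pattern matching over collected log lines. The following Python sketch counts failed SSH password logins per source address. The log line layout follows the usual OpenSSH "Failed password for ... from ... port ..." message, while the function names, the limit of three attempts and the sample lines are assumptions for illustration.

```python
import re
from collections import Counter
from typing import Dict, Iterable

# Matches the usual sshd message for a rejected password login.
FAILED_LOGIN = re.compile(
    r"Failed password for (?:invalid user )?(\S+) from (\S+) port \d+"
)


def failed_logins_by_source(lines: Iterable[str]) -> Counter:
    """Count failed password logins per source address."""
    counts = Counter()
    for line in lines:
        match = FAILED_LOGIN.search(line)
        if match:
            counts[match.group(2)] += 1
    return counts


def suspicious_sources(lines: Iterable[str], limit: int = 3) -> Dict[str, int]:
    """Sources with at least `limit` failed logins in the given lines."""
    counts = failed_logins_by_source(lines)
    return {source: n for source, n in counts.items() if n >= limit}


if __name__ == "__main__":
    # Hypothetical log lines using documentation-only IP addresses.
    log = [
        "sshd[411]: Failed password for root from 192.0.2.10 port 4022 ssh2",
        "sshd[412]: Failed password for invalid user test from 192.0.2.10 port 4023 ssh2",
        "sshd[413]: Failed password for root from 192.0.2.10 port 4024 ssh2",
        "sshd[414]: Accepted password for alice from 198.51.100.7 port 5000 ssh2",
    ]
    print(suspicious_sources(log))
```

A dedicated tool covers far more patterns than this. The sketch only shows the kind of check that runs against centrally collected logs.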

Reporting
Service Level Agreement [ week / month / year ]
Configuration/Setup Change Management
Policy Change
Security Auditing
Check for incident
Failure Monitoring
Check thresholds
Performance Monitoring
Immediate autonomous reaction / response
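For the reporting step, a service level figure per week, month or year can be computed from the recorded downtime. The Python sketch below shows the arithmetic. The function names, the sample downtime and the 99.5 % target are assumptions for illustration and do not come from the slides.

```python
def availability_percent(period_hours: float, downtime_hours: float) -> float:
    """Share of the reporting period during which the service was up."""
    if period_hours <= 0:
        raise ValueError("period_hours must be positive")
    if not 0 <= downtime_hours <= period_hours:
        raise ValueError("downtime_hours must lie within the period")
    return 100.0 * (period_hours - downtime_hours) / period_hours


def meets_sla(period_hours: float, downtime_hours: float,
              target_percent: float) -> bool:
    """True if the measured availability reaches the agreed target."""
    return availability_percent(period_hours, downtime_hours) >= target_percent


if __name__ == "__main__":
    week_hours = 7 * 24  # one-week reporting period
    # Hypothetical downtime and target, for illustration only.
    print(availability_percent(week_hours, 1.68))
    print(meets_sla(week_hours, 1.68, 99.5))
```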
