The Cluster Monitoring System of IHEP
Qingbao Hu
huqb@ihep.ac.cn
Computing Center, Institute of High Energy Physics, Chinese Academy of Sciences
International Symposium on Grids and Clouds (ISGC) 2016
Qingbao Hu/CC/IHEP 2016/3/24 - 2
~13,500 CPU cores
Lustre, Gluster, openAFS, etc.
~5 PB
3584 tape libraries, LTO4 tape
Modified CERN CASTOR 1.7
Cluster built with blades
Tape libraries
Ganglia: recording the performance of different resource groups
Icinga: monitoring the status of cluster devices and services
Logger Analysis: collecting more comprehensive data and providing an overview of the health status of the whole cluster
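As an illustration of the Icinga role listed above, here is a minimal sketch of a check written to the Nagios/Icinga plugin convention: the exit code carries the state (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) and one line of output is printed, with optional performance data after the `|`. The load-average check, the function names and the thresholds are assumptions for illustration; they are not taken from the IHEP deployment.

```python
import os

# Exit codes defined by the Nagios/Icinga plugin convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate_load(load1, warn=8.0, crit=16.0):
    """Map a 1-minute load average to (exit_code, status_line).

    The thresholds are illustrative defaults, not values from the talk.
    """
    if load1 >= crit:
        code = CRITICAL
    elif load1 >= warn:
        code = WARNING
    else:
        code = OK
    # Text before '|' is shown to operators; text after it is performance data.
    line = (f"LOAD {LABELS[code]} - load1={load1:.2f} "
            f"| load1={load1:.2f};{warn};{crit}")
    return code, line


def main():
    """Plugin entry point: print one status line and return the exit code."""
    try:
        load1 = os.getloadavg()[0]  # available on Unix-like systems only
    except (OSError, AttributeError):
        print("LOAD UNKNOWN - cannot read load average")
        return UNKNOWN
    code, line = evaluate_load(load1)
    print(line)
    return code

# When installed as a plugin script, finish with: sys.exit(main())
```

The monitoring server only needs the exit code and the first output line, so a check like this could be written in any language.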
Different clusters
The status of bws1 farm
automatically
Server
results to Icinga server.
Mode      Scale of monitoring hosts   Scale of monitoring services   Average host delay   Average service delay
No DNX    1257                        9796                           251.588 sec          256.930 sec
No DNX    1265                        12222                          789.429 sec          789.000 sec
Use DNX   1343                        13841                          0.365 sec            0.644 sec
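To make the size of the improvement explicit, the delay figures above can be compared directly. The sketch below copies the three rows as reported and divides each non-DNX delay by the DNX delay. The column interpretation (hosts, services, host delay, service delay) follows the table headers, and `delay_ratios` is a hypothetical helper name.

```python
# Rows copied from the DNX comparison above:
# (mode, monitored hosts, monitored services, avg host delay [s], avg service delay [s])
ROWS = [
    ("No DNX", 1257, 9796, 251.588, 256.930),
    ("No DNX", 1265, 12222, 789.429, 789.000),
    ("Use DNX", 1343, 13841, 0.365, 0.644),
]


def delay_ratios(rows):
    """Return (hosts, services, host_ratio, service_ratio) for each non-DNX row.

    Each ratio is the non-DNX delay divided by the DNX delay.
    """
    dnx = next(r for r in rows if r[0] == "Use DNX")
    out = []
    for mode, hosts, services, host_delay, service_delay in rows:
        if mode == "Use DNX":
            continue
        out.append((hosts, services, host_delay / dnx[3], service_delay / dnx[4]))
    return out


for hosts, services, host_ratio, service_ratio in delay_ratios(ROWS):
    print(f"{hosts} hosts / {services} services: host delay {host_ratio:.0f}x, "
          f"service delay {service_ratio:.0f}x higher without DNX")
```

The service delay falls by a factor of roughly 400 to 1,200, even though the DNX row covers more hosts and services than either non-DNX row, so the comparison is conservative rather than strictly like for like.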
the file offset info to guarantee the continuity of log data collection when the collect service crashes.
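The offset idea can be sketched as follows: after each line is handled, the byte offset is saved to a small state file, and a restarted collector seeks to that offset instead of re-reading the log from the start. This is a minimal sketch under stated assumptions, not the IHEP collector itself. The function names, the JSON state file and the `handle` callback are all hypothetical.

```python
import json
import os


def load_offset(state_path):
    """Return the last saved byte offset, or 0 if no usable state exists yet."""
    try:
        with open(state_path, "r", encoding="utf-8") as f:
            return int(json.load(f)["offset"])
    except (FileNotFoundError, KeyError, TypeError, ValueError):
        return 0


def save_offset(state_path, offset):
    """Persist the offset atomically so a crash cannot leave a half-written file."""
    tmp_path = state_path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as f:
        json.dump({"offset": offset}, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp_path, state_path)  # atomic rename


def collect_new_lines(log_path, state_path, handle):
    """Pass every complete line added since the saved offset to handle()."""
    offset = load_offset(state_path)
    if offset > os.path.getsize(log_path):
        offset = 0  # file shorter than saved offset: assume rotation or truncation
    count = 0
    with open(log_path, "rb") as log:
        log.seek(offset)
        while True:
            raw = log.readline()
            if not raw.endswith(b"\n"):
                break  # end of file or a partially written line: retry next pass
            handle(raw.decode("utf-8", errors="replace").rstrip("\n"))
            offset = log.tell()
            save_offset(state_path, offset)
            count += 1
    return count
```

Because the offset is saved only after `handle` returns, a crash between the two steps replays that one line on restart, so delivery is at-least-once. Saving after every line is the simplest choice. A production collector would more likely checkpoint in batches to reduce disk syncs.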
Collect configuration file
log_all
collector