EGEE Asia Pacific Regional Operation Center, Min-Hong Tsai, ASGC


  1. EGEE Asia Pacific Regional Operation Center
     Enabling Grids for E-sciencE
     Min-Hong Tsai, ASGC
     ISGC 2007, March 29, Taipei
     http://www.eu-egee.org/
     http://www.twgrid.org/aproc/
     EGEE-II INFSO-RI-031688

  2. Agenda
     • APROC Introduction
     • Status
     • Joining EGEE

  3. APROC Introduction I
     • APROC Mission
       – Provide deployment support facilitating Grid expansion
       – Maximize the availability of Grid services
     • Supports EGEE sites in Asia Pacific since April 2005
       – 20 production sites, 8 countries
       – 9 sites joined EGEE since last ISGC: recently HKU, KISTI
       – 3 sites in certification process
         ▪ Philippines: Advanced Science and Technology Institute
         ▪ Korea: KONKUK
         ▪ Mongolia: Mongolian Academy of Sciences (MAS IPT)

  4. APROC Services
     • Site Deployment Support
       – Registration
       – Installation
       – Certification
     • Operations Support
       – Monitoring, troubleshooting
       – Problem tracking
       – Software updates and security coordination
       – Regional VO services: VOMS and LFC
     • ASGCCA CA Service
       – Provides certificates for AP EGEE/LCG sites without a domestic CA
     • EGEE Operations
       – CIC-on-duty: EGEE global operations
       – Monitoring tool development: GStat and GGUS Search
       – TPM: Front line user support (Q4 2006)
       – OSCT: Incident Response duty (Dec 2006)

  5. APROC Usage
     • New Active VOs: Belle and TWGrid
     • This year: 200 KSI2K years
     • Last year: 41 KSI2K years
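The year-on-year change behind the two usage figures can be worked out directly. The sketch below only does the arithmetic on the numbers quoted on the slide; the helper name `growth` is an illustrative choice, not something from the presentation.

```python
def growth(previous, current):
    """Return (factor, percent_increase) between two usage figures."""
    factor = current / previous
    return factor, (factor - 1.0) * 100.0


# Figures from the slide, in KSI2K years: last year 41, this year 200.
factor, percent = growth(41, 200)
print(f"{factor:.1f}x, +{percent:.0f}%")  # prints "4.9x, +388%"
```

In other words, regional usage grew to roughly 4.9 times the previous year's level.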

  6. APROC Availability
     • Daily snapshots of SAM results of region
     • Availability increased to 70-80% range
       – From 60-70% a half year ago
     • CT mostly replica management failure
       – Sensitive to Information System access/performance
       – Request that data management clients can failover to secondary BDII
     • Network Issues
       – Often the root cause of CT, JL and JS
       – Network congested site set up local top-level BDII
         ▪ Increase default update timeout and breathe time
     [Figure: stacked chart of daily SAM test states (SD, CT, JL, JS, ER, OK), 0-100%, 2005-04 to 2007-01; chart annotations as extracted: "JS from LHC OPN Remove SSH Hardware Slow BDII upgrade Failure"; labels 2.4, 2.6, 2.7, 3.0]
     [Figure: "reliab" and "avail" series, 0-100, 2005-04 to 2007-02]
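The request that data management clients fail over to a secondary BDII amounts to trying an ordered list of endpoints and using the first one that answers. The following is a minimal sketch of that pattern only, not how any EGEE client is actually implemented: the function name, the `query` callback and the endpoint strings are all hypothetical, and the real lookup (an LDAP search against the BDII) is left to the caller.

```python
def query_with_failover(endpoints, query, timeout=30.0):
    """Try each endpoint in order and return (endpoint, result) from the first success.

    `query` is a caller-supplied function (hypothetical here) that performs the
    actual information system lookup and raises OSError on network failure.
    """
    errors = []
    for endpoint in endpoints:
        try:
            return endpoint, query(endpoint, timeout)
        except OSError as exc:
            # Record the failure and fall through to the next endpoint.
            errors.append(f"{endpoint}: {exc}")
    raise RuntimeError("all BDII endpoints failed: " + "; ".join(errors))
```

A site on a congested link would list its local top-level BDII first and a remote one second, and could pass a larger `timeout` in line with the slide's suggestion to increase the default timeouts.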

  7. Monitoring and Notification
     • Planned integration of Asset DB
     • Nagios plugins developed
       ▪ CE
       ▪ LFC
       ▪ VOMS
       ▪ Storage
       ▪ IT services
       ▪ OS
     • Notification via Email
       – SMS transmission device currently being tested

  8. Nagios Regional Monitoring
     • Tests run at faster frequency
       – 5-10 minutes
       – Faster response to faults
     • Add customized plugins
       – Run low level tests for faster isolation of problems
       – Tests may not be available in global monitoring tools yet
       – Ability to run tests on the target host via NRPE
     • Management Interface
       – Acknowledgement
       – On demand execution of tests
       – Historical availability
       – Test dependencies
     http://lists.grid.sinica.edu.tw/apwiki/Nagios_monitoring_-_APROC_sites
     http://lists.grid.sinica.edu.tw/apwiki/Nagios_Plugins_Description
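For readers unfamiliar with customized Nagios plugins, the sketch below shows the general shape of a low level test: a plugin reports one status line and signals its state through the exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). This is not one of the APROC plugins; the TCP connect test, the thresholds and the function names are assumptions chosen for illustration.

```python
import socket
import time

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATUS_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def classify(elapsed, warn, crit):
    """Map a connect time in seconds to a Nagios status code."""
    if elapsed >= crit:
        return CRITICAL
    if elapsed >= warn:
        return WARNING
    return OK


def check_tcp(host, port, warn=1.0, crit=5.0, connect=socket.create_connection):
    """Return (status_code, status_line) for a TCP connect test.

    `connect` is injectable so the logic can be exercised without a network.
    Thresholds are illustrative defaults, not values from the presentation.
    """
    start = time.monotonic()
    try:
        conn = connect((host, port), timeout=crit)
        conn.close()
    except OSError as exc:
        return CRITICAL, f"CRITICAL - {host}:{port} unreachable ({exc})"
    elapsed = time.monotonic() - start
    code = classify(elapsed, warn, crit)
    # Text after "|" is performance data that Nagios can record for graphing.
    line = f"{STATUS_NAMES[code]} - {host}:{port} connect time {elapsed:.3f}s | time={elapsed:.3f}s"
    return code, line
```

A command line wrapper would print the status line and call `sys.exit` with the status code. The same script could be run from the Nagios server or, through NRPE, on the target host itself.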

  9. Plans
     • Increase monitoring coverage
       – Information System
       – Network performance monitoring
         ▪ Available/achievable bandwidth
         ▪ Full mesh monitoring
     • Improve troubleshooting tools
       – http://lists.grid.sinica.edu.tw/apwiki/APROC/Troubleshooting_Guides
       – FAQ system
       – Service diagnostic scripts
     • Integration of ticketing system with GGUS
     • Training
       – EGEE Induction at GridAsia 2007, June 5, 2007, Singapore
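Full mesh monitoring means measuring every site against every other site, so the number of measurements grows quadratically with the number of sites. The sketch below enumerates the directed pairs, on the assumption that bandwidth is measured separately in each direction; the site names are placeholders, and only the count of 20 comes from the presentation.

```python
from itertools import permutations


def full_mesh(sites):
    """Return every ordered (source, destination) pair of distinct sites."""
    return list(permutations(sites, 2))


# Placeholder names for the 20 production sites mentioned earlier in the talk.
sites = [f"site-{n:02d}" for n in range(1, 21)]
pairs = full_mesh(sites)
print(len(pairs))  # prints 380, i.e. 20 * 19 directed pairs
```

With 380 directed measurements for 20 sites, scheduling matters: bandwidth tests that overlap on a low capacity link would disturb each other's results.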

  10. Joining EGEE Infrastructure
      • Contact APROC
      • If domestic CA is not available
        – Register as an ASGCCA RA during ISGC
      • Dedicate an administrator with Unix experience
      • Allocate servers
        – 5: UI, CE, WN, DPM, MON
        – 3: CE/WN, MON, DPM
          ▪ UI can be installed in user account
          ▪ Consider Virtual Machine for MON
      • Study user guide and installation manual
      • Send configuration file to APROC for review before deployment
      • Complete registration and certification process

  11. Long Term Operations
      • Establish domestic CA if none exists
      • Increase availability and resource levels
      • Establish domestic operations structure
        – Operations procedures
        – Tools: monitoring and notification, ticketing system
        – User and administrator support
      • Training for administrators and users
      • Collaborate with APROC in Regional operations
      • Q: Need for regional experimental Grid?

  12. Issues in Asia Pacific
      • No regional projects to promote collaboration in EGEE
      • Network bandwidth
        – Low capacity: regional and last mile
        – Usage based billing
      • Need for training
        – Training for trainers
        – Application Training
        – E-Learning material
      • However EGEE already provides
        – M/W development and integration
        – Operations structure, coordination and support
        – Close to 200 user communities

  13. Summary
      • APROC provides EGEE operations support services to Asia Pacific
      • EGEE sites in the region have grown to 20, with utilization of 200 KSI2K years
      • We have also improved availability, but there is still significant room for improvement
      • We look forward to more sites joining EGEE in the region and the possibility of further collaboration
        – Applications
        – Operations
      • Feedback on what we can improve

  14. Thank You for Your Attention!
      • Questions?
        – roc@lists.grid.sinica.edu.tw
        – http://www.twgrid.org/aproc/
      • Thanks to efforts from:
        – T1/APROC Team
          ▪ Jason Shih, Dave Wei
          ▪ Felix Lee, Joanna Huang
          ▪ Aries Hong, Hung-Che Jen
          ▪ Jinny Chien, Shu-Ting Liao
          ▪ Yi-Ping Wu, Min Tsai
