GlideinWMS
Marco Mambelli Stakeholders Meeting May 11, 2018
Overview
– Releases since last stakeholders meeting
– Upcoming releases
– Current focus
– GlideinWMS roadmap
– Reference slides
  – GlideinWMS Architecture
  – Quick Facts
05/11/2018 Marco Mambelli | GlideinWMS - Stakeholders Meeting 2
Releases Since Last Stakeholders Meeting
– Bug Fix: Incorrect behavior of Singularity
– Bug Fix: proxy-renewal-script updates and bug fixes
– Bug Fix: Protection against malformed Frontend messages and hardening of forked processes
and 17, to adapt to the new Singularity 2.4.6 requirement and because the rushed 3.2.22.1 release contained only a partial fix
– Fixes to the proxy-renewal-script (OSG contributed) were also added
– Includes all features and bug fixes released in v3_2_22_2
Next Planned Release
– Merging of production and development branches (v3.2 and v3.3), which will bring Google CE support and the policy plugin to the production version
– Code modernization to Python 2.7 (and 2.6) standards
– Increase the number and coverage of the unit tests
[Chart: Tickets per release, for releases 3.2.21, 3.2.22.2, and 3.4, broken down into Features, Bug fix, Other, and Total]
Next Planned Release (cont)
– Glidein lifetime no longer based on the lifetime of the proxy
– Internal support of condor_switchboard (discontinued by HTCondor)
– New option to kill glideins when job requests decrease
– Estimate in advance the cores provided to glideins that discover cores automatically
– Add entry monitoring breakdown for metasites
– Review Factory and Frontend tools, especially glidein_off and manual_glidein_submit.py
GlideinWMS: Current Focus
– More automated testing & CI (pylint, pythoscope, futurize, unittest …) is an ongoing focus
– Developer’s test infrastructure to connect to Factory ITB services for scale testing
– External contributions should be production ready
– Consider site topology
– AUTO estimate
– Actively follow the requests and adapt as the requests go down
– Solution addressed in phases
– Singularity support changes
– Adapt to sites with tighter security restrictions
– Impacts how we determine lifetime of a glidein
– Successful test w/ FIFE
GlideinWMS Roadmap
– Keep up with the scalability requirements
numpy, etc
– Outsource GlideinWMS functionality to HTCondor
functionality natively through HTCondor
– Leaner & modular Frontend
– Dependent on the work that will be done in HTCondor in future
– Support for new HPC sites with stricter policies (e.g. no outbound connection except gateways, MFA)
team next week.
– Monitoring Modernization
GlideinWMS Roadmap
– Moving to Decision Engine (DE)
– Make Glidein as a service capable of talking to multiple WMS middleware/frameworks
GlideinWMS
[Architecture diagram. Labeled components: condor submit; VO Frontend; HTCondor Central Manager; HTCondor Schedulers; GlideinWMS Factory with HTCondor-G; HTCondor CE; Grid Site; Clouds (AWS, OpenStack, OpenNebula); Super Computers (via BOSCO). On each resource, a worker node or virtual machine runs a Glidein with an HTCondor Startd, which pulls a Job. The figure also carries the year labels 2006, 2012, 2014, and 2014.]
NOTE: A Frontend can talk to multiple Factories, and a Factory can serve multiple Frontends.
GlideinWMS: Quick Facts
Table: Current Resources & Roles
Role | Resources | Effort (FTE)
Project Mgmt/Lead | Parag Mhashilkar (0.15 USCMS) | 0.15
Development & Support | Parag Mhashilkar (0.20 SCD), Marco Mambelli (1 SCD), Dennis Box (0.75 SCD), Marco Mascheroni (0.5 CMS - Contractor) | 2.45
TOTAL | | 2.60
Quick Facts: Releases & Support Structure
– Issues tracked in the Redmine issue tracker
– Issues are now associated with respective stakeholders
workload
slides)
– SCM
– Release notes & history
– Entire development team is responsible for support
Quick Facts: Project Status & Communication Channels
Area of Interest | Mailing Lists
Support | glideinwms-support@fnal.gov
Stakeholders | glideinwms-stakeholders@fnal.gov
Release Announcements | glideinwms-support@fnal.gov, cms-dct-wms@fnal.gov, glideinwms-stakeholders@fnal.gov
Future Release plans | See next slide
Discussions | glideinwms-discuss@fnal.gov
Code commits | glideinwms-commit@fnal.gov
Twitter Tag: @glideinwms
– Technical discussions & status updates
– Regular stakeholder participation
– Contact Parag Mhashilkar if you need an invite for this meeting
– Project Status reported monthly at CS Project status meetings
Tracking Releases in Redmine
Default tabs not too useful