Open Science Grid
- one grid among many
Ruth Pordes, Fermilab, May 3rd 2006
…of course, a special grid: it's the people…
(some of them at the consortium meeting in Jan 06)
With special partners…
Grid providers serve multiple communities; Grid consumers use multiple grids.
The Open Science Grid Consortium brings
- the grid service providers: middleware developers; cluster, network and storage administrators; local-grid communities;
- the grid consumers: from global collaborations to the single researcher, through campus communities to under-served science domains;
into a cooperative to share and sustain a common heterogeneous distributed facility in the US and beyond.
The Open Science Grid Consortium brings…
- I am the Executive Director.
- Miron Livny is Manager of the OSG Distributed Facility: Head of the Condor Project and the Virtual Data Toolkit, Coordinator of the US federation in EGEE, and a member of the EGEE/gLite design team.
- Bill Kramer is Chair of the Science Council and Head of the Berkeley Lab NERSC supercomputing facility.
- Ian Foster is co-PI of the OSG Proposal, responsible for Globus and Computer Science research contributions and partnerships.
- Harvey Newman represents Advanced Network project contributions and collaborations.
- Alan Blatecky is Engagement Coordinator for new communities.
- Experiment software leadership: US ATLAS and US CMS software/computing leaders, LIGO, CDF, D0, STAR, etc.
The OSG Eco-System: Bio Interdependence
With international and national infrastructures (EGEE, TeraGrid); a growing number of campus grids (GLOW, GROW, GRASE, FermiGrid, Crimson Grid, TIGRE); and end-user integrated distributed systems (the LIGO Data Grid, the CMS and ATLAS distributed analysis systems, the Tevatron SAMGrid, and the STAR Data Grid).
What is Open Science Grid?
- … storage and networks.
- … hear in June.
- … (legacy) systems to Grids:
- … life use of technology, integration, operation.
What is Open Science Grid?
Blueprint Principles (June 2004):
- Preserve site autonomy, and shared Grid use with local access.
- VO-based environment and services.
- Recursive principles throughout - support “grids of grids”.
First & foremost - delivery to the WLCG schedule for LHC science
And soon a third: NAREGI (Japan)
OSG: More than a US Grid
Taiwan (CDF, LHC); Brazil (D0, STAR, LHC); Korea.
OSG, 1 day last week:
- 50 clusters: used locally as well as through the grid.
- 5 large disk or tape stores.
- 23 VOs.
- >2000 jobs running through the Grid.
(Monitoring snapshot: bioinformatics jobs routed from the local UWisconsin campus grid; 2000 running and 500 waiting jobs from LHC and Run II.)
The Trend?
OSG 0.4.0 deployment
While LHC Physics drives the schedule and performance envelope (1 GigaByte/sec data rates)…
OSG also serves other stakeholders
- Gravitational-wave and other legacy physics experiments. E.g. from the OSG Proposal, on LIGO: “With an annual science run of data collected at roughly a terabyte of raw data per day, this will be critical to the goal of transparently carrying out LIGO data analysis on the …”
- Opportunity to share use of a “standing army” of resources. E.g. from OSG news: the Genome Analysis and Database Update system.
- Interfacing existing computing and storage facilities and campus grids to a common infrastructure. E.g. the FermiGrid strategy: to allow opportunistic use of otherwise dedicated resources, and to save effort by implementing shared services to run on the Grid.
3 Examples of Interoperation
Grid Laboratory of Wisconsin (GLOW):
GLOW to OSG and the Football Pool problem:
- Routing jobs from the “lan-grid” local security, job and storage infrastructure to the “wan-grid”.
- Middleware development from CMS DISUN.
- The goal of the application is to determine the smallest “covering code” of ternary words of length six. (Or, in the football pool: how many lottery tickets would one have to buy to guarantee that no more than one prediction is incorrect?) Even after decades of study, only fairly weak bounds are known on this problem, which has applications in data compression, coding theory and statistical designs. (A checking sketch follows below.)
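The covering condition itself is easy to state in code. A minimal Python sketch (mine, not from the slides): it only verifies a candidate code; it does not search for the optimal one, which is the hard open problem the slide refers to.

from itertools import product

# Every ternary word of length six: 3**6 = 729 possible outcomes of 6 matches.
WORDS = list(product(range(3), repeat=6))

def hamming(u, v):
    # Number of positions in which two words differ.
    return sum(a != b for a, b in zip(u, v))

def is_covering_code(code, radius=1):
    # True if every word is within `radius` of some codeword, i.e. some
    # "ticket" in the set has at most one wrong prediction.
    return all(any(hamming(w, c) <= radius for c in code) for w in WORDS)

# Sanity check: the (wasteful) set of all words trivially covers.
assert is_covering_code(WORDS)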
Opportunistic Routing from GLOW to OSG
TeraGrid
Through high-performance network connections, TeraGrid integrates high-performance computers, data resources and tools, and high-end experimental facilities around the (US) country.
CDF Monte Carlo jobs run on a Purdue TeraGrid resource; they are able to access OSG data areas and are accounted to both Grids.
http://www.nsf.gov/news/news_images.jsp?cntn_id=104248&org=OLPA
Genome Analysis and Database Update system
- Runs across TeraGrid and OSG; uses the Virtual Data System (VDS) for workflow & provenance.
- Passes through public DNA and protein databases for new and newly updated genomes of different organisms, and runs BLAST, Blocks and Chisel; 1200 users of the resulting DB.
- Request: 1000 CPUs for 1-2 weeks, once a month, every month. On OSG at the moment: >600 CPUs and 17,000 jobs a week. (A sketch of the fan-out follows.)
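A hedged sketch of the fan-out such a monthly update implies. All names are hypothetical; the real system expresses this as VDS-generated workflows with recorded provenance, not a hand-written loop.

# One work unit per (genome, tool) pair - the monthly update burst.
TOOLS = ["BLAST", "Blocks", "Chisel"]  # the analysis tools named above

def plan_jobs(new_genomes):
    # Yield the work units; VDS would express these as a DAG so that
    # the provenance of every derived table is recorded.
    for genome in new_genomes:
        for tool in TOOLS:
            yield {"genome": genome, "tool": tool}

for job in plan_jobs(["organism_a.fasta", "organism_b.fasta"]):
    print("submit:", job)  # in production: one grid job per unit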
Interoperation & Commonality with EGEE
OSG sites publish information to the WLCG BDII so that Resource Brokers can route jobs (see the query sketch below).
Areas of commonality: Operations, Security, Middleware.
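What publishing to the BDII buys: any client can query the information system over LDAP. A sketch assuming the python-ldap package and the conventional BDII endpoint details (port 2170, GLUE schema); the host name is hypothetical.

import ldap  # python-ldap package

conn = ldap.initialize("ldap://bdii.example.org:2170")
results = conn.search_s(
    "mds-vo-name=local,o=grid",      # conventional BDII base DN
    ldap.SCOPE_SUBTREE,
    "(objectClass=GlueCE)",          # all published Computing Elements
    ["GlueCEUniqueID", "GlueCEStateFreeCPUs"],
)
for dn, attrs in results:
    print(attrs.get("GlueCEUniqueID"), attrs.get("GlueCEStateFreeCPUs"))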
OSG Middleware Layers (from infrastructure up to applications):
- NSF Middleware Initiative (NMI): Condor, Globus, MyProxy.
- Virtual Data Toolkit (VDT) common services: NMI + VOMS, MonALISA, Clarens, AuthZ etc.
- OSG Release Cache: VDT + configuration, validation, VO management.
- VO-specific services & interfaces: LHC services & interfaces; LIGO Data Grid; Tevatron CDF, D0 interfaces.
Virtual Data Toolkit V1.3.10b - a collection of components, integrated into a distributed system. Easy to download, install and use.
Apache HTTPD 2.2.0; Apache Tomcat 5.0.28; Clarens 0.7.2; ClassAds 0.9.7; Condor/Condor-G 6.7.18; DOE and LCG CA Certificates v4 (includes LCG 0.25 CAs); DRM 1.2.10; EDG CRL Update 1.2.5; EDG Make Gridmap 2.1.0; Fault Tolerant Shell (ftsh) 2.0.12; Generic Information Provider 1.0.15 (Iowa 15-Feb-2006); gLite CE Monitor 1.6.0 (INFN prerelease from 2005-11-15); Globus Toolkit, pre web-services 4.0.1; Globus Toolkit, web-services 4.0.1; GLUE Schema 1.2 draft 7; GSI-Enabled OpenSSH 3.6; GUMS 1.1.0; Java SDK 1.4.2_10; jClarens 0.6.1; jClarens Discovery Services registration scripts 20060206; jClarens Web Service Registry 0.6.1; JobMon 0.2; KX509 20031111; MonALISA 1.4.12; MyProxy 3.4; MySQL 4.1.11; Nest 0.9.7-pre1; Netlogger 3.2.4; PPDG Cert Scripts 1.7; PRIMA Authorization Module 0.3; PRIMA Authorization Module for GT4 Web Services 0.1.0; pyGlobus gt4.0.1-1.13; pyGridWare gt4.0.1a; RLS 3.0.041021; SRM Tester 1.1; UberFTP 1.18; Virtual Data System 1.4.4; VOMS 1.6.10.2; VOMS Admin 1.2.10-r0 (client 1.2.10, interface 1.0.2, server 1.2.10)
Common with EGEE/WLCG
EGEE/LCG at VDT 1.2.4? OSG prepared to help facilitate upgrade if needed.
OSG Program of Work:
A sustained, robust distributed facility:
- Operations + Integration
- Security
- Software Releases
- Engagement
- Education, Training and Outreach
- Science-Driven Extensions
No middleware development within OSG itself - OSG depends on external projects for extended and new middleware:
- Driven by the schedule of stakeholders.
- Will actively monitor, and provide input to, the Globus CDIGS roadmap and campaigns.
- Participate in Grid Interoperability Now (GIN) when effort is available.
- Collaborate with gLite wherever possible.
The Vision: the Grid
Secure:
Apply the NIST process (http://csrc.nist.gov/index.html):
- Management: risk assessment, planning, service auditing and checking.
- Operational: incident response, awareness and training, configuration management.
- Technical: authentication and revocation, auditing and analysis.
End-to-end trust in the quality of code executed on remote CPUs - signatures? Controls.
Usable: “me, my friends, the grid” (Frank Würthwein)
(1) Lower cost of resource-owner entry: minimize the software stack. Facility: sites, services & admins.
(2) Lower cost of user entry: thin user grid interface - the user interface lives on my laptop. (A submit sketch follows below.)
(3) Rich set of Virtual Organization services. Virtual Organization: services, systems, admins.
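A minimal sketch of that thin interface: a Condor-G grid-universe submit description generated and submitted from the laptop. The gatekeeper address is hypothetical, and a valid X.509 proxy (from grid-proxy-init) is assumed.

import subprocess, textwrap

# Hypothetical gatekeeper; a real one comes from VO docs or the BDII.
gatekeeper = "osg-ce.example.edu/jobmanager-condor"

submit = textwrap.dedent(f"""\
    universe      = grid
    grid_resource = gt2 {gatekeeper}
    executable    = my_analysis
    output        = job.out
    error         = job.err
    log           = job.log
    queue
""")

with open("job.sub", "w") as f:
    f.write(submit)
subprocess.run(["condor_submit", "job.sub"], check=True)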
New Services coming in OSG
- “Pull mode” & pilot jobs - just-in-time binding of job to site (Panda, GlideCAF, Condor-C). VO-downloaded executables are subject to site authorization and security callouts/services; use of gLite GLEXEC.
- Virtual-machine-based workspaces: VO/Globus workspaces encapsulate services.
- Worker nodes need not have access to the WAN: use of the Condor Grid Connection Broker (GCB).
- Resource selection based on ClassAds & gLite CEMon (sketched below).
- Move to WS GT4: tests of WS GRAM with CMS CRAB jobs sent Globus back to the development table; MDS4 next. Incremental upgrades where sensible; for head nodes (edge services), where that is cleaner, we may require replicating the service and supporting both in parallel.
- Accounting: Condor meter; possibility to share probes/meters with gLite. Agreement on the GGF Usage Record - needs …
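For the ClassAds-based resource selection above, a plain-Python analogue of matchmaking. Illustrative only: the real mechanism is Condor's ClassAd language, and all names below are made up.

# A job ad carries Requirements (a predicate a machine ad must satisfy)
# and Rank (a preference among the matches).
machine_ads = [
    {"Name": "site-a", "Arch": "X86_64", "FreeCPUs": 120},
    {"Name": "site-b", "Arch": "IA64",   "FreeCPUs": 8},
]

job_ad = {
    "Requirements": lambda m: m["Arch"] == "X86_64" and m["FreeCPUs"] > 50,
    "Rank": lambda m: m["FreeCPUs"],  # prefer more free slots
}

matches = [m for m in machine_ads if job_ad["Requirements"](m)]
best = max(matches, key=job_ad["Rank"])
print("matched:", best["Name"])  # site-a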
Digression: Accounting - what is an OSG Job?
Resources can be on multiple grids: MyApplication → job submission → Condor-G → OSG and EGEE. The job is counted on OSG & EGEE - and on the campus grid as well (e.g. US CMS DISUN accounting). A job may run on the campus grid or on the VO grid, be submitted to the local cluster by the resource selector, do work across multiple grids, and consume differing “value”… (see the de-duplication sketch below).
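A sketch of why a shared global job identifier matters here: if each grid reports usage independently, the same job is double-counted unless the records carry a common ID. Field names are made up, in the spirit of the GGF Usage Record rather than its exact schema.

# Usage records for ONE job, reported independently by two grids.
records = [
    {"GlobalJobId": "condorg.example.gov#1234.0", "Grid": "OSG",  "WallSecs": 3600},
    {"GlobalJobId": "condorg.example.gov#1234.0", "Grid": "EGEE", "WallSecs": 3600},
]

naive = sum(r["WallSecs"] for r in records)               # 7200: double counted
unique = {r["GlobalJobId"]: r for r in records}           # keep one per job
true_usage = sum(r["WallSecs"] for r in unique.values())  # 3600
print(naive, true_usage)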
Reliable: Central Operations Activities
- Automated validation of basic services and site configuration.
- Robots of various kinds.
- Grid Exerciser (a probe sketch follows).
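A hedged sketch of what such a validation robot might do: loop over sites and run a trivial job through each gatekeeper with the standard globus-job-run client. Site names are hypothetical; a real robot would read the site registry.

import subprocess

SITES = ["osg-ce.example.edu", "cluster.example.org"]

def probe(site):
    # Success means gatekeeper, authentication and jobmanager all
    # work end to end for a trivial /bin/true job.
    try:
        result = subprocess.run(
            ["globus-job-run", f"{site}/jobmanager-fork", "/bin/true"],
            capture_output=True, timeout=300,
        )
        return result.returncode == 0
    except subprocess.TimeoutExpired:
        return False

for site in SITES:
    print(site, "OK" if probe(site) else "FAILED")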
Fast:
- Integrating network management into the s/w stack (Lambda Station).
- VO-specific resource selection/brokering.
- Support for persistent VO s/w on sites; POSIX(-like) I/O to data at worker nodes.
- Tune/configure/replicate head nodes.
- Trying to stay ahead of the needed amount of resources - while fully supporting opportunistic use.
OSG: Where to find information:
- OSG Web site: www.opensciencegrid.org
- Work in progress: http://osg.ivdgl.org/twiki/bin/view/Integration/OverviewGuide
- Virtual Data Toolkit: http://vdt.cs.wisc.edu//index.html
- News about Grids in Science in “Science Grid This Week”: www.interactions.org/sgtw
- OSG Consortium meeting: Seattle, Aug 21st.
Thank you!