Open Science Grid: one grid among many - Ruth Pordes, Fermilab, May 3rd 2006



SLIDE 1

Open Science Grid

one grid among many

Ruth Pordes, Fermilab, May 3rd 2006

SLIDE 2

of course a special grid …

it’s the people…

(some of them at the consortium meeting in Jan 06)

SLIDE 3

With special partners…

SLIDE 4

Grid providers serve multiple communities; Grid consumers use multiple grids.

The Open Science Grid Consortium brings

  • the grid service providers - middleware developers, cluster, network and storage administrators, local-grid communities
  • the grid consumers - from global collaborations to the single researcher, through campus communities to under-served science domains

into a cooperative to share and sustain a common heterogeneous distributed facility in the US and beyond.

SLIDE 5

The Open Science Grid Consortium brings

  • I am the Executive Director.
  • Miron Livny is Manager of the OSG Distributed Facility: Head of the Condor Project and Virtual Data Toolkit, Coordinator of the US federation in EGEE, member of the EGEE/gLITE design team.
  • Bill Kramer is Chair of the Science Council; Head of the Berkeley Lab NERSC supercomputing facility.
  • Ian Foster is co-PI of the OSG Proposal: responsible for Globus and Computer Science research contributions and partnerships.
  • Harvey Newman represents Advanced Network project contributions and collaborations.
  • Alan Blatecky is Engagement Coordinator for new communities.
  • Experiment software leadership: US ATLAS and US CMS Software & Computing (Sw/C) leaders, LIGO, CDF, D0, STAR, etc.

SLIDE 6

The OSG Eco-System: Bio Interdependence

With international and national infrastructures - EGEE, TeraGrid; a growing number of campus grids - GLOW, GROW, GRASE, FermiGrid, Crimson Grid, TIGRE; the end-user integrated distributed systems - LIGO Data Grid, CMS and ATLAS distributed analysis systems, Tevatron SAMGrid, and STAR Data Grid.

SLIDE 7

What is Open Science Grid?

  • High Throughput Distributed Facility
  • Shared opportunistic access to existing clusters, storage and networks.
  • Owner-controlled resources and usage policies.
  • Supporting Science
  • 5-year Proposal submitted to NSF and DOE - should hear in June.
  • Open and Heterogeneous
  • Research groups transitioning from & extending (legacy) systems to Grids.
  • Experiments developing new systems.
  • Application Computer Scientists looking for real-life use of technology, integration, operation.
  • University Researchers...

SLIDE 8

What is Open Science Grid?

Blueprint Principles (June 2004)

  • Preserve site autonomy and shared Grid use with local access.
  • VO-based environment and services.
  • Recursive principles throughout - support a "grid of grids".
SLIDE 9

First & foremost - delivery to the WLCG schedule for LHC science

And soon a third: Naregi

SLIDE 10

OSG: More than a US Grid

Taiwan (CDF, LHC), Brazil (D0, STAR, LHC), Korea

SLIDE 11

OSG, 1 day last week:

  • 50 clusters: used locally as well as through the grid
  • 5 large disk or tape stores
  • 23 VOs
  • >2000 jobs running through the Grid

[Monitoring snapshot annotations: Bioinformatics jobs routed from the local UWisconsin Campus Grid; 2000 running jobs, 500 waiting jobs; LHC; Run II]

SLIDE 12

The Trend?

OSG 0.4.0 deployment

SLIDE 13

While LHC Physics drives the schedule and performance envelope

1 GigaByte/sec

SLIDE 14

OSG also Serves other stakeholders

  • Gravitational Wave and other legacy Physics experiments.
    E.g. from the OSG Proposal, LIGO: with an annual science run collecting roughly a terabyte of raw data per day, this will be critical to the goal of transparently carrying out LIGO data analysis on the opportunistic cycles available on other VOs' hardware.
  • Opportunity to share use of a "standing army" of resources.
    E.g. the Genome Analysis and Database Update system.
  • Interfacing existing computing and storage facilities and Campus Grids to a common infrastructure.
    E.g. FermiGrid strategy: to allow opportunistic use of otherwise dedicated resources; to save effort by implementing shared services; to work coherently to move all of our applications and services to run on the Grid.

SLIDE 15

OSG also Serves other stakeholders

(Same stakeholder examples as Slide 14; the Genome Analysis and Database Update system example is drawn from OSG news.)

3 Examples of Interoperation

SLIDE 16

Grid Laboratory of Wisconsin (GLOW):

SLIDE 17

GLOW to OSG and the Football Pool problem:

  • Routing jobs from the "lan-grid" local security, job and storage infrastructure to the "wan-grid".
  • Middleware development from the CMS DISUN outreach program.
  • The goal of the application is to determine the smallest "covering code" of ternary words of length six. (Or, in the football pool, to determine how many lottery tickets one would have to buy to guarantee that no more than one prediction is incorrect.) Even after decades of study, only fairly weak bounds are known on this value. Solutions to this problem have applications in data compression, coding theory and statistical designs. (A checking sketch follows below.)
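To make the covering-code goal concrete, here is a minimal, illustrative Python sketch (our own construction, not part of the GLOW application; the function names are hypothetical) that checks whether a candidate set of ternary words of length six covers every word within Hamming distance one:

```python
from itertools import product

def hamming(a, b):
    """Number of positions in which two equal-length words differ."""
    return sum(x != y for x, y in zip(a, b))

def is_covering_code(code, length=6, q=3, radius=1):
    """True if every q-ary word of the given length lies within
    `radius` of at least one codeword in `code`."""
    return all(
        any(hamming(word, c) <= radius for c in code)
        for word in product(range(q), repeat=length)
    )

# Example usage: the full space of 3^6 = 729 words trivially covers itself;
# the hard part, and the point of the GLOW-to-OSG runs, is finding the
# smallest such code.
space = list(product(range(3), repeat=6))
assert is_covering_code(space)
```

Exhaustively searching for ever-smaller covering codes is the kind of embarrassingly parallel workload that the GLOW-to-OSG job routing described above was used for.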

SLIDE 18

Opportunistic Routing from GLOW to OSG

SLIDE 19

TeraGrid

Through high-performance network connections, TeraGrid integrates high-performance computers, data resources and tools, and high-end experimental facilities around the country (US).

  • CDF Monte Carlo jobs running on a Purdue TeraGrid resource are able to access OSG data areas and be accounted to both Grids.

http://www.nsf.gov/news/news_images.jsp?cntn_id=104248&org=OLPA

SLIDE 20

Genome Analysis and Database Update system

Runs across TeraGrid and OSG. Uses the Virtual Data System (VDS) workflow & provenance.

Passes through the public DNA and protein databases for new and newly updated genomes of different organisms and runs BLAST, Blocks and Chisel. The resulting database has 1200 users.

Request: 1000 CPUs for 1-2 weeks, once a month, every month. On OSG at the moment: >600 CPUs and 17,000 jobs a week.

SLIDE 21

Interoperation & Commonality with EGEE

 OSG sites publish Information to WLCG BDII so

Resource Brokers can route jobs.

 Operations  Security  Middleware
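As a rough illustration of what publishing to a BDII enables, the hedged Python sketch below queries an LDAP-based information index for compute elements and their free job slots, the sort of data a Resource Broker uses to route jobs. The host, port and base DN are placeholders, not values from this talk:

```python
import subprocess

def query_free_slots(host="bdii.example.org", port=2170):
    """Query a GLUE-schema information index (BDII-style) over LDAP
    for compute-element identifiers and free job slots."""
    cmd = [
        "ldapsearch", "-x", "-LLL",
        "-H", f"ldap://{host}:{port}",
        "-b", "mds-vo-name=local,o=grid",   # conventional site-level base DN
        "(objectClass=GlueCE)",
        "GlueCEUniqueID", "GlueCEStateFreeJobSlots",
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout

if __name__ == "__main__":
    print(query_free_slots())
```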

SLIDE 22

OSG Middleware Layers

Layers, from infrastructure to applications:

  • NSF Middleware Initiative (NMI): Condor, Globus, MyProxy
  • Virtual Data Toolkit (VDT): NMI + Common Services (VOMS, MonALISA, Clarens, AuthZ etc.)
  • OSG Release Cache: VDT + Configuration, Validation, VO management
  • VO-Specific Services & Interfaces: LHC Services & Interfaces, LIGO Data Grid, Tevatron CDF and D0 Interfaces

SLIDE 23

OSG Middleware Layers

(Same middleware-layer diagram as Slide 22.)

SLIDE 24

Virtual Data Toolkit V1.3.10b - a collection of components to integrate into a Distributed System. Easy to download, install and use.

Apache HTTPD 2.2.0
Apache Tomcat 5.0.28
Clarens 0.7.2
ClassAds 0.9.7
Condor/Condor-G 6.7.18
DOE and LCG CA Certificates v4 (includes LCG 0.25 CAs)
DRM 1.2.10
EDG CRL Update 1.2.5
EDG Make Gridmap 2.1.0
Fault Tolerant Shell (ftsh) 2.0.12
Generic Information Provider 1.0.15 (Iowa 15-Feb-2006)
gLite CE Monitor (INFN prerelease from 2005-11-15) 1.6.0
Globus Toolkit, pre web-services 4.0.1
Globus Toolkit, web-services 4.0.1
GLUE Schema 1.2 draft 7
GSI-Enabled OpenSSH 3.6
GUMS 1.1.0
Java SDK 1.4.2_10
jClarens 0.6.1
jClarens Discovery Services registration scripts 20060206
jClarens Web Service Registry 0.6.1
JobMon 0.2
KX509 20031111
MonALISA 1.4.12
MyProxy 3.4
MySQL 4.1.11
Nest 0.9.7-pre1
Netlogger 3.2.4
PPDG Cert Scripts 1.7
PRIMA Authorization Module 0.3
PRIMA Authorization Module For GT4 Web Services 0.1.0
pyGlobus gt4.0.1-1.13
pyGridWare gt4.0.1a
RLS 3.0.041021
SRM Tester 1.1
UberFTP 1.18
Virtual Data System 1.4.4
VOMS 1.6.10.2
VOMS Admin (client 1.2.10, interface 1.0.2, server 1.2.10) 1.2.10-r0

Common with EGEE/WLCG

SLIDE 25

(Same VDT V1.3.10b component list as Slide 24.)

EGEE/LCG at VDT 1.2.4? OSG prepared to help facilitate upgrade if needed.

SLIDE 26

OSG Program of Work:

  • Sustained, Robust Distributed Facility
  • Operations + Integration
  • Security
  • Software Releases
  • Engagement
  • Education, Training and Outreach
  • Science-Driven Extensions
  • No developments in OSG itself - so dependent on external projects for extended and new middleware.
  • Driven by the schedule of stakeholders.
  • Will be actively monitoring and providing input to the Globus CDIGS roadmap and campaigns. Participate in Grid Interoperability Now (GIN) when effort is available.
  • Collaborate with gLITE wherever possible.

SLIDE 27

S U R F

The Vision: the Grid

SLIDE 28

Usable Reliable Fast Secure

SLIDE 29

Secure:

Apply the NIST process:

  • Management - risk assessment, planning, service auditing and checking
  • Operational - incident response, awareness and training, configuration management
  • Technical - authentication and revocation, auditing and analysis

End-to-end trust in the quality of code executed on a remote CPU - signatures? Controls.

http://csrc.nist.gov/index.html

SLIDE 30

Usable: "me, my friends, the grid" (Frank Würthwein)

(1) Lower cost of resource owner entry: minimize the software stack.
    Facility: Sites, Services & Admins
(2) Lower cost of user entry: thin user Grid interface.
    User Interface (on my laptop)
(3) Rich set of Virtual Organization services.
    Virtual Organization: Services, Systems, Admins

SLIDE 31

New Services coming in OSG

“Pull Mode” & Pilot Jobs just in time binding of job to site: (Panda, GlideCAF, Condor-C) VO downloaded executables subject to site authorization and security callouts/services. Use of gLITE GLEXEC. Virtual Machine based Workspaces: VO/Globus workspaces encapsulate services. Worker Nodes need not have access to the WAN; use of Condor Grid Connection Broker (GCB) Resource Selection based on ClassAds & gLITE CEMON. Move to WS GT4: Tests of WS Gram with CMS CRAB jobs sent Globus back to development table.Next MDS4. Incremental upgrades where sensible. For HeadNodes (edge services) cleaner, we may make it a requirement, to replicate service and support both in parallel. Accounting: Condor meter; possibility to share probes/meters with gLITE. Agreement on GGF Usage Record - needs

  • extending. Joint EGEE, OSG, TeraGrid monthly phone-calls.
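For readers unfamiliar with the pattern, here is a hedged, minimal Python sketch of the pull-mode idea referenced in the first bullet: a pilot job lands on a worker node first and only then fetches real work from a VO task queue, giving just-in-time binding of job to site. The queue URL and payload format are illustrative and are not drawn from Panda, GlideCAF or Condor-C.

```python
import json
import subprocess
import time
import urllib.request

VO_TASK_QUEUE = "https://tasks.example-vo.org/next"  # placeholder URL

def fetch_task():
    """Ask the VO's central queue for the next payload; None if idle."""
    try:
        with urllib.request.urlopen(VO_TASK_QUEUE, timeout=30) as resp:
            return json.load(resp)   # e.g. {"cmd": ["./analyze", "run42"]}
    except Exception:
        return None

def pilot_main(max_idle_polls=10, poll_interval=60):
    """Body of the pilot: it runs only after the site's batch system has
    already scheduled it, so the real job binds to the site late."""
    idle = 0
    while idle < max_idle_polls:
        task = fetch_task()
        if task is None:
            idle += 1
            time.sleep(poll_interval)
            continue
        idle = 0
        # In a real deployment, site authorization and identity switching
        # (e.g. via glexec) would wrap this execution; omitted here.
        subprocess.run(task["cmd"], check=False)

if __name__ == "__main__":
    pilot_main()
```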

SLIDE 32

Digression: Accounting. What is an OSG job? Resources can be on multiple Grids.

Diagram: MyApplication → Job Submission (Condor-G) → OSG and EGEE.

The job is counted on both OSG & EGEE.

SLIDE 33

Diagram: MyApplication → Job Submission (Condor-G, US CMS DISUN accounting) → Campus Grid, OSG, EGEE.

The job is counted on the Campus Grid as well: jobs appear on the Campus Grid and on the VO grid, are submitted to the local cluster by a resource selector, do work across multiple grids, and consume differing "value"… (see the sketch below).
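To make the double-counting problem concrete, here is an illustrative Python sketch (our own construction, not an OSG accounting tool) that aggregates usage records from several grids and charges each globally identified job only once, while still recording every grid it touched:

```python
from collections import defaultdict

# Hypothetical per-grid usage records keyed by a global job identifier.
records = [
    {"grid": "CampusGrid", "global_job_id": "job-001", "wall_hours": 4.0},
    {"grid": "OSG",        "global_job_id": "job-001", "wall_hours": 4.0},
    {"grid": "EGEE",       "global_job_id": "job-002", "wall_hours": 7.5},
]

def attribute_usage(records):
    """Group records by global job id so a job that traversed several
    grids is charged once, with every grid it touched listed."""
    jobs = defaultdict(lambda: {"grids": set(), "wall_hours": 0.0})
    for r in records:
        job = jobs[r["global_job_id"]]
        job["grids"].add(r["grid"])
        # Take the maximum rather than the sum: the same execution
        # reported by two grids must not be counted twice.
        job["wall_hours"] = max(job["wall_hours"], r["wall_hours"])
    return dict(jobs)

print(attribute_usage(records))
```

A shared, extended GGF Usage Record (as mentioned on the previous slide) is what would make such a global job identifier meaningful across OSG, EGEE and campus grids.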

SLIDE 34

Reliable: Central Operations Activities

  • Automated validation of basic services and site configuration (a probe sketch follows below).
  • Robots of various kinds.

Grid Exerciser:
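As a rough illustration of what automated validation can look like (this is our own minimal sketch, not the actual OSG site-verification scripts or the Grid Exerciser), a robot can loop over registered sites and record whether a few basic probes succeed:

```python
import subprocess
from datetime import datetime, timezone

# Placeholder site list; a real robot would read registered resources
# from the OSG information services.
SITES = ["ce.siteA.example.edu", "ce.siteB.example.edu"]

def probe(site):
    """Run basic checks against a site and report pass/fail per check."""
    checks = {
        # Reachability of the gatekeeper host.
        "ping": ["ping", "-c", "1", site],
        # An authenticated job-manager probe (e.g. submitting a trivial
        # grid job) would go here in a real robot; omitted in this sketch.
    }
    results = {}
    for name, cmd in checks.items():
        completed = subprocess.run(cmd, capture_output=True)
        results[name] = (completed.returncode == 0)
    return results

if __name__ == "__main__":
    stamp = datetime.now(timezone.utc).isoformat()
    for site in SITES:
        print(stamp, site, probe(site))
```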

SLIDE 35

Fast:

  • Integrating network management into the s/w stack (LambdaStation).
  • VO-specific resource selection/brokering.
  • Include support for persistent VO s/w on sites; POSIX(-like) I/O to data at Worker Nodes.
  • Tune/Configure/Replicate Headnodes.
  • Trying to stay ahead of the needed amount of resources - while fully supporting opportunistic use.
  • Policy, Priorities, Monitoring.

SLIDE 36

OSG: Where to find information:

  • OSG Web site: www.opensciencegrid.org
  • Work in progress: http://osg.ivdgl.org/twiki/bin/view/Integration/OverviewGuide
  • Virtual Data Toolkit: http://vdt.cs.wisc.edu//index.html
  • News about Grids in Science in "Science Grid This Week": www.interactions.org/sgtw
  • OSG Consortium meeting: Seattle, Aug 21st.

Thank you!

Thank you!