EGEE and Interoperation, Laurence Field, CERN-IT-GD, ISGC 2008



SLIDE 1

EGEE-II INFSO-RI-031688

Enabling Grids for E-sciencE

www.eu-egee.org

EGEE and gLite are registered trademarks

EGEE and Interoperation

Laurence Field CERN-IT-GD ISGC 2008

SLIDE 2

Overview

  • The grid problem definition
  • gLite and EGEE
  • The interoperability problem
  • The interoperation problem
  • Interoperation activities in EGEE
  • Grid Interoperability Now!
  • The need for standards
SLIDE 3

What is a Grid?

Cross-organizational Grids, Intra-organizational Grids, Data Centers, Virtualization, Volunteer Computing, Campus Grids, Clusters, Cloud Computing

Vaporware?

SLIDE 4

What is the problem?

  • Organization A and B are administrative domains

– Independent policies, systems and authentication mechanisms

  • Users have local access to their local system using local methods
  • Users from A wish to collaborate with users from B

– Pool the resources
– Split tasks by specialty
– Share common frameworks

[Figure: Organization A and Organization B]

SLIDE 5

The Solution

  • The Users from A and B create a Virtual Organization

– Users have a unique identity but also the identity of the VO

  • Organizations A and B support the Virtual Organization

– Place “grid” interfaces at the organizational boundary
– These map the generic “grid” functions/information/credentials to the local security functions/information/credentials

  • Multi-institutional e-Science Infrastructures

[Figure: Organization A, Organization B and the Virtual Organization]
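
The mapping performed at the organizational boundary can be illustrated with a minimal sketch. Everything named here (the `GridCredential` and `SiteBoundary` classes, the account names, the subject strings and the VO name) is hypothetical and chosen only for illustration; it is not how gLite or any other middleware implements the mapping.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GridCredential:
    """A user's grid identity: a unique personal identity plus a VO."""
    subject: str  # unique identity of the user (hypothetical values below)
    vo: str       # Virtual Organization the user acts on behalf of


class SiteBoundary:
    """Hypothetical 'grid' interface placed at one organization's boundary."""

    def __init__(self, name, vo_accounts, banned_subjects=()):
        self.name = name
        self.vo_accounts = dict(vo_accounts)  # VO name -> local account
        self.banned = set(banned_subjects)    # local policy stays local

    def map_to_local(self, cred):
        """Return the local account for a grid credential, or None."""
        if cred.subject in self.banned:
            return None
        return self.vo_accounts.get(cred.vo)


# Two organizations support the same VO but map it to different local accounts.
org_a = SiteBoundary("Organization A", {"physics-vo": "grid001"})
org_b = SiteBoundary("Organization B", {"physics-vo": "vo_phys"},
                     banned_subjects={"/O=Example/CN=Mallory"})

alice = GridCredential("/O=Example/CN=Alice", "physics-vo")
mallory = GridCredential("/O=Example/CN=Mallory", "physics-vo")

print(org_a.map_to_local(alice))
print(org_b.map_to_local(alice))
print(org_b.map_to_local(mallory))
```

The point of the sketch is that the user presents the same grid credential everywhere, while each organization keeps its own accounts and its own policy.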

SLIDE 6

What is gLite?

  • gLite is an integrated middleware distribution that provides the abstract interfaces required for building a grid infrastructure that enables resource sharing across administrative domains.
  • The distribution consists of software repositories containing validated components from multiple software providers, including components from gLite, together with the documentation and tools required to deploy them as a production-quality service.
  • The release procedure for the gLite distribution follows the release methodology used by many Linux distributions: a major baseline release to which updates are continually added.
  • The standard tools of the reference operating system are used to create the software repositories, which are logically separated by service so that the services can evolve independently.
  • The latest major release is 3.1, which is available for the reference OS, SL4, in both 32-bit and 64-bit flavors. Availability for other operating systems is a high priority, and the order of priority is driven by demand.

http://glite.web.cern.ch/glite/

SLIDE 7

What is EGEE?

  • The Enabling Grids for E-sciencE (EGEE) project

– 139 partner institutes from over 32 countries
– Providing a service grid infrastructure of ~50,000 CPUs and ~5 PB (5 million gigabytes) of disk storage, plus tape MSS

Distributed across 260+ sites in 48 countries

– Which is available to more than 7500 users

Organized in over 200 Virtual Organizations across 10 application domains

– Who are running more than 190K jobs per day

24 hours a day, 7 days a week, 365 days a year

SLIDE 8

The Solutions

[Figure: batch systems (PBS/Torque, LSF, Condor, Load Leveler, Sun Grid Engine), grid interfaces (GRAM v2, GRAM v4, ARC, CREAM, NAREGI, Unicore) and grid infrastructures (OSG, Nordugrid, Naregi, DEISA, EGEE, Teragrid)]

SLIDE 9

The New Problem

  • Multiple grid infrastructures have evolved

– Using different interfaces at the organizational boundary

  • Users have grid access to their grid systems using grid methods
  • A grid itself can be seen as an organizational domain

– Independent policies, systems and authentication mechanisms

  • VOs from Grid A wish to use resources in Grid B

– Pool the resources
– Split tasks by specialty
– Share common frameworks

[Figure: Grid A, Grid B and the Virtual Organization]

SLIDE 10

Why?

  • Required common interfaces

– Now have multiple “common” interfaces
– Tried to solve one problem, but created another

  • Reasons:

– The infrastructures were developed independently

Funding based on regions and application domains

– Grid infrastructures are based on different middleware

Experimentation with different approaches
Initially there were no standards

– Standards take time to mature

We need to build the infrastructures now!

  • The infrastructures outpaced standardization

Good standards require experience

SLIDE 11

What can we do?

  • Interoperability:

“The ability to exchange information and to use what has been exchanged” (software)

  • Interoperation:

“The use of interoperable systems” (infrastructures)

SLIDE 12

How to Start

  • Understanding the differences

– Compatibility matrix

  • Domains that have to be linked for interoperability

– Security
– Information Services
– Job Management
– Data Management

  • For interoperation you have to add

– Monitoring
– Accounting
– Operational links and joint policies
– Trouble ticket systems
– Operational security

SLIDE 13

Interoperability Matrix

  • 1. Understand both middleware stacks
  • 2. Identify the “common” interfaces
  • 3. Create an interoperability matrix

Interface                   EGEE        ARC         OSG
Security                    GSI/VOMS    GSI/VOMS    GSI/VOMS
Service Discovery           LDAP/BDII   LDAP/GIIS   LDAP/GIIS
Schema                      GLUE v1.2   ARC         GLUE v1
Storage Transfer Protocol   GridFTP     GridFTP     GridFTP
Storage Control Protocol    SRM         SRM         SRM
Job Submission              GRAM        GridFTP     GRAM
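
The three steps above can be sketched in a few lines: treat the matrix on this slide as data and compare two columns to see where the interfaces already agree. The function name `compare` and the dictionary layout are assumptions made for this illustration; only the cell values come from the slide.

```python
# Interoperability matrix from the slide: interface area -> grid -> technology.
MATRIX = {
    "Security": {"EGEE": "GSI/VOMS", "ARC": "GSI/VOMS", "OSG": "GSI/VOMS"},
    "Service Discovery": {"EGEE": "LDAP/BDII", "ARC": "LDAP/GIIS", "OSG": "LDAP/GIIS"},
    "Schema": {"EGEE": "GLUE v1.2", "ARC": "ARC", "OSG": "GLUE v1"},
    "Storage Transfer Protocol": {"EGEE": "GridFTP", "ARC": "GridFTP", "OSG": "GridFTP"},
    "Storage Control Protocol": {"EGEE": "SRM", "ARC": "SRM", "OSG": "SRM"},
    "Job Submission": {"EGEE": "GRAM", "ARC": "GridFTP", "OSG": "GRAM"},
}


def compare(matrix, grid_a, grid_b):
    """Split the interface areas into those two grids share and those that differ."""
    common, differing = [], []
    for area, row in matrix.items():
        if row[grid_a] == row[grid_b]:
            common.append(area)
        else:
            differing.append(area)
    return common, differing


common, differing = compare(MATRIX, "EGEE", "ARC")
print("Already common:", common)
print("Needs work:", differing)
```

The areas in the second list are the ones where a gateway, an adaptor or a translator would be needed until a common interface exists.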

SLIDE 14

Select Strategy

  • Long term solution

– Common interfaces
– Standards

  • Medium term solutions

– Gateways
– Adaptors and Translators

  • Short term solutions

– Parallel Infrastructures

  • User driven
  • Site driven
SLIDE 15

Parallel Infrastructures

  • User Driven

– The user joins both grids

Uses different clients

  • Depending on which interface

– More work for the User

Required for each infrastructure

– Keyhole approach

Restricts functionality

– Method initially used by ATLAS

Split workload between grids

SLIDE 16

Parallel Infrastructures

  • Site Driven

– The site joins both grids

Deploys both interfaces

– User only sees their grid interface
– More work for the site

Can only be supported by large sites

  • Reduced resources

– Used by FZK

Participating in EGEE, Nordugrid and D-Grid

SLIDE 17

Gateway

  • A gateway is a bridge between grid infrastructures

– Single point of failure: if the gateway breaks, the grid disappears
– Scalability bottleneck: all the load goes through one service

  • Useful as a proof of concept and to demonstrate the need
  • NAREGI approach using glite-CE

[Figure: Gateway]

SLIDE 18

Adaptors and Translators

  • Adaptors allow connection
  • Translators understand/modify information
  • They are built into the middleware

– The middleware can then work with both interfaces

Useful feature even when using standards!

  • Requires modification to the grid middleware

– Existing service interfaces can still be used

  • Used in the GIN information system

[Figure: an API with two plugins]
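
The API-with-plugins arrangement can be sketched as follows. This is an illustration under stated assumptions, not the GIN implementation: the class names, the common record layout (`site`, `free_cpus`) and the foreign field names (`cluster-name`, `cpus-total`, `cpus-used`) are all hypothetical.

```python
class InfoPlugin:
    """One plugin per interface: an adaptor (fetch) plus a translator (translate)."""

    def fetch(self):
        """Adaptor: connect to the interface and return raw records."""
        raise NotImplementedError

    def translate(self, record):
        """Translator: convert one raw record into the common layout."""
        raise NotImplementedError


class NativePlugin(InfoPlugin):
    """Records already use the common layout, so translation is a copy."""

    def __init__(self, records):
        self._records = records

    def fetch(self):
        return list(self._records)

    def translate(self, record):
        return dict(record)


class ForeignPlugin(InfoPlugin):
    """Records use different field names and must be translated."""

    def __init__(self, records):
        self._records = records

    def fetch(self):
        return list(self._records)

    def translate(self, record):
        return {
            "site": record["cluster-name"],
            "free_cpus": record["cpus-total"] - record["cpus-used"],
        }


class InfoAPI:
    """The middleware-facing API; callers never see which plugin answered."""

    def __init__(self):
        self._plugins = {}

    def register(self, grid, plugin):
        self._plugins[grid] = plugin

    def query(self):
        results = []
        for grid, plugin in self._plugins.items():
            for raw in plugin.fetch():
                entry = plugin.translate(raw)
                entry["grid"] = grid
                results.append(entry)
        return results


api = InfoAPI()
api.register("native", NativePlugin([{"site": "site-a", "free_cpus": 12}]))
api.register("foreign", ForeignPlugin(
    [{"cluster-name": "cluster-b", "cpus-total": 100, "cpus-used": 60}]))

for entry in api.query():
    print(entry)
```

Because the existing interface is just one more plugin, adding support for a second interface does not remove the first, which matches the point that existing service interfaces can still be used.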

SLIDE 19

Bilateral Activities within EGEE

  • EGEE/OSG

– Already interoperating since Autumn 2005

  • EGEE/NDGF

– Working on interoperability since Summer 2005
– Anticipated completion May 2008

  • EGEE/Unicore

– Started Summer 2006
– Prototype components available

  • EGEE/Naregi

– Working on interoperability with EGEE since Winter 2006
– Interoperable components available

  • EUCHINAGrid

– Separate project

  • EGEE/Garuda

– See talk in the next session!

SLIDE 20

Grid Interoperability Now

  • Building upon the many bi-lateral activities
  • Started at GGF-16 (now OGF) in Feb 2006
  • Demonstrate what we can for SC 2006

– Applications, Security, Job Management
– Information Systems, Data Management

GIN

SLIDE 21

GIN Information System

[Figure: GIN information system architecture. Labels: Generic Information Provider; Providers for EGEE, OSG, NDGF, Naregi, Teragrid and Pragma; GIN BDII; ARC BDII; EGEE Site, OSG Site, NDGF Site; Naregi Grid, Teragrid Grid, Pragma Grid; Translators; Glue]

SLIDE 22

Google Earth Demo

[Figure: Google Earth view. Labels: EGEE, OSG, Naregi, Teragrid, Pragma, Nordugrid]

SLIDE 23

The Need For Standards

  • Identified areas where standards are needed

– From the various interoperation activities

  • Common interfaces

– Critical interfaces at the organizational boundary

Security, Information, Computing, Storage

  • Standards are less important for higher level services

– Problem constrained within the VO

Choose one solution and somewhere to host it.

SLIDE 24

Security

  • Security is the fundamental aspect

– Users belong to a VO and do work on behalf of the VO

Their identity is their experiment, not their institution

  • Require a common security mechanism

– All other standards will inherit from this one

  • Most grids use X.509 credentials

– Already an existing standard ☺
– This has significantly reduced interoperability problems
– Roots of trust (CAs) are coordinated by the IGTF

  • Require common methods for VO policy management

– Groups and roles within a VO
– Capabilities etc.
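
As an illustration of what a common representation of groups, roles and capabilities might look like, the sketch below parses attributes written in the VOMS-style form `/vo/group/Role=.../Capability=...`. That this form is the one in use is an assumption (the slide does not name a format), and the VO, group and role names are made up.

```python
def parse_fqan(fqan):
    """Parse a VOMS-style attribute string into VO, group, role and capability.

    A value of NULL for the role or capability is treated as 'not set'.
    """
    if not fqan.startswith("/"):
        raise ValueError("attribute must start with '/': %r" % fqan)

    groups = []
    role = None
    capability = None
    for part in fqan.split("/")[1:]:
        if not part:
            raise ValueError("empty path component in %r" % fqan)
        if part.startswith("Role="):
            role = part[len("Role="):]
        elif part.startswith("Capability="):
            capability = part[len("Capability="):]
        else:
            groups.append(part)

    if not groups:
        raise ValueError("no VO name in %r" % fqan)

    return {
        "vo": groups[0],                  # the first component names the VO
        "group": "/" + "/".join(groups),  # full group path, VO included
        "role": None if role == "NULL" else role,
        "capability": None if capability == "NULL" else capability,
    }


print(parse_fqan("/examplevo/production/Role=admin/Capability=NULL"))
print(parse_fqan("/examplevo"))
```

A site that receives such attributes can base its authorization decision on the VO, group and role rather than on the individual user, which is what lets the policy be managed inside the VO.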

SLIDE 25

Information Service

  • Separate content and interface
  • Schema defines the content.

– Glue Schema created to facilitate interoperation

Currently v1.3

– Now an OGF working group

Draft of v2.0 ready now!

  • LDAP is the dominant interface

– 55% of grids and 95% of sites provide an LDAP interface

(of the grids and sites participating in GIN)

– Various web service interfaces

These all have problems with large query results
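
Separating content from interface can be illustrated with one schema-defined record served through two different encodings. The attribute names follow GLUE 1.x naming, but the host name, site name and values are hypothetical, and the LDIF handling is deliberately simplified (no line folding, no base64 values).

```python
import json

# Content: one record whose attribute names are fixed by the schema.
ENTRY = {
    "dn": "GlueCEUniqueID=ce.example.org:2119/jobmanager-pbs-short,"
          "mds-vo-name=example-site,o=grid",
    "objectClass": ["GlueCE"],
    "GlueCEUniqueID": ["ce.example.org:2119/jobmanager-pbs-short"],
    "GlueCEStateFreeCPUs": ["12"],
}


# Interface 1: LDIF text, as an LDAP-based service might return it.
def to_ldif(entry):
    lines = ["dn: " + entry["dn"]]
    for attr, values in entry.items():
        if attr == "dn":
            continue
        for value in values:
            lines.append(attr + ": " + value)
    return "\n".join(lines) + "\n"


def from_ldif(text):
    entry = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        attr, _, value = line.partition(": ")
        if attr == "dn":
            entry["dn"] = value
        else:
            entry.setdefault(attr, []).append(value)
    return entry


# Interface 2: JSON, standing in for a web service interface.
def to_json(entry):
    return json.dumps(entry, sort_keys=True)


def from_json(text):
    return json.loads(text)


# Whichever interface carries it, the client recovers the same content.
print(from_ldif(to_ldif(ENTRY)) == ENTRY)
print(from_json(to_json(ENTRY)) == ENTRY)
```

Agreeing on the schema is what makes the records comparable across grids; the choice of interface then only affects how they are transported.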

SLIDE 26

Data Management

  • GridFTP

– Supported in most grid infrastructures

Reduced interoperability problems

  • Storage Resource Manager (SRM)

– Is the proposed interface to storage
– Problems with different interpretations of the specification
– Incompatible implementations
– With a huge amount of effort, it has taken 18 months to get right

  • The Storage Resource Broker (SRB)

– An alternative which is widely used.

SLIDE 27

Job Management

  • Job Description Language

– JSDL as defined by the OGF

  • Computing Interface

– As many interfaces as batch systems!
– Need to agree on a common interface

OGSA-BES is the current candidate

  • OGSA-BES

– V1.0 draft document
– A number of prototypes exist, but they are unproven in production
– CREAM CE and KnowARC CE will implement BES

  • Need to think about accounting
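
To make the job description language concrete, here is a minimal sketch that builds a JSDL document with Python's standard library. It uses the JSDL 1.0 and JSDL POSIX application namespaces; the helper name `build_jsdl` and the example executable are illustrative choices, and a real submission would usually carry more elements (resources, data staging) than shown here.

```python
import xml.etree.ElementTree as ET

JSDL_NS = "http://schemas.ggf.org/jsdl/2005/11/jsdl"
POSIX_NS = "http://schemas.ggf.org/jsdl/2005/11/jsdl-posix"


def build_jsdl(executable, arguments=(), output=None):
    """Return a minimal JSDL job description as an XML string."""
    ET.register_namespace("jsdl", JSDL_NS)
    ET.register_namespace("jsdl-posix", POSIX_NS)

    job = ET.Element("{%s}JobDefinition" % JSDL_NS)
    description = ET.SubElement(job, "{%s}JobDescription" % JSDL_NS)
    application = ET.SubElement(description, "{%s}Application" % JSDL_NS)
    posix = ET.SubElement(application, "{%s}POSIXApplication" % POSIX_NS)

    ET.SubElement(posix, "{%s}Executable" % POSIX_NS).text = executable
    for argument in arguments:
        ET.SubElement(posix, "{%s}Argument" % POSIX_NS).text = argument
    if output is not None:
        ET.SubElement(posix, "{%s}Output" % POSIX_NS).text = output

    return ET.tostring(job, encoding="unicode")


print(build_jsdl("/bin/echo", ["hello", "grid"], output="stdout.txt"))
```

The same document could in principle be handed to any computing interface that accepts JSDL, which is the motivation for agreeing on the description language separately from the submission interface.
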
SLIDE 28

Final Thoughts

  • The problem of grid interoperation

– A second attempt at the original problem

  • The solution is common interfaces

– Most crucially at the site boundary
– The only way forward is real standards

  • The most important part is to agree

– Production feedback will ensure it works!
– The initial choice only selects the starting point

  • Interoperability problems can be overcome in the short term

– But only standards are sustainable in the long term

SLIDE 29

Summary

  • We need to put “Grids” into context

– What problem are you addressing?

Multi-institutional e-Science Infrastructures

  • Grid Interoperability is an avoidable problem

– Grid Interoperation is not!

  • More focus is needed on the interfaces

– Less focus required on specific implementations

  • Standards are critical for the future

– It doesn’t matter what they are as long as we agree
– Existing use cases will ensure the standards work