An overview of the EGEE project and gLite middleware (Brendan Hamill)


SLIDE 1

EGEE-II INFSO-RI-031688

Enabling Grids for E-sciencE

www.eu-egee.org

An overview of the EGEE project and gLite middleware

Brendan Hamill NeSC Edinburgh

Based in part on material originated by Mike Mineter and Nicolas Jacq

SLIDE 2

Overview of EGEE GridKa-School 2006

Contents

  • History of EGEE
  • Expanding into new areas of Science
  • Overview of the main grid services
SLIDE 3

EGEE – international e-infrastructure

Network infrastructure & Resource centres

Operations, Support and training

Collaboration

Pan-European Grid

A four-year programme:

  • Build, deploy and operate a consistent, robust, large-scale production grid service that links with and builds on national, regional and international initiatives
  • Improve and maintain the middleware in order to deliver a reliable service to users
  • Attract new users from research and industry and ensure training and support for them

SLIDE 4

History of EGEE

SLIDE 5

History of EGEE (2)

  • European DataGrid (EDG) Project
    – Ended March 2004
  • EGEE phase 1:
    – April 2004-March 2006
  • EGEE-II:
    – April 2006-March 2008
    – Part of the EU Sixth Framework Programme (FP6)
    – Budget > €50M
    – 970 individuals in 91 partner organisations

SLIDE 6

CERN Large Hadron Collider

SLIDE 7

Large Hadron Collider (2)

SLIDE 8

Large Hadron Collider (3)

SLIDE 9

Large Hadron Collider (4)

SLIDE 10

LHC pre-accelerators and detectors

SLIDE 11

SLIDE 12

CMS Detector

SLIDE 13

Other subject areas in EGEE

Bioinformatics, Computational Chemistry, Astrophysics

SLIDE 14

How do we make Grid Computing sexy?

SLIDE 15

Malaria

  • ~300 million people worldwide are affected
  • 1-1.5 million people die every year
  • Widely spread
  • Caused by protozoan parasites of the genus Plasmodium

Life cycle

SLIDE 16

Role of Plasmepsins

  • Plasmepsins are involved in hemoglobin degradation during the parasite’s life cycle.
  • Present in the 4 species of Plasmodium causing the disease in humans
  • Sequence homology between the plasmepsins is high (65-70%)
  • X-ray crystallography data available

Figure labels (hemoglobin degradation pathway): hemoglobin, plasmepsins (I, II, IV, and HAP), small peptides, falcipain and plasmepsin, smaller peptides, aminopeptidases, amino acids; heme, oxidation, hematin, polymerization, hemozoin (malarial pigment)

SLIDE 17

“In-Silico” Drug Discovery

  • WISDOM Project (Wide In-Silico Docking On Malaria)
    – About 80 CPU years to produce TB of data

SLIDE 18

“In-Silico” Drug Discovery

  • Profiling Inhibitors of Influenza H5N1
    – docking of 300,000 compounds studied
    – 8 different target structures of Influenza A neuraminidases
    – 2000 CPUs were used over 4 weeks (>100 CPU-years)
    – >60,000 output files with a data volume of 600 Gigabytes

Figure labels: H5, N1
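The figures on this slide can be sanity-checked with a little arithmetic. The sketch below is only a back-of-envelope check: it assumes every CPU is busy for the whole four weeks (an upper bound) and uses decimal units (1 GB = 1000 MB). The function names are made up for illustration.

```python
# Back-of-envelope check of the H5N1 docking figures quoted on the slide.
WEEKS_PER_YEAR = 365.25 / 7  # about 52.18


def cpu_years(cpus: int, weeks: float) -> float:
    """Upper bound on CPU-years, assuming every CPU is busy for the whole period."""
    return cpus * weeks / WEEKS_PER_YEAR


def mean_file_size_mb(total_gb: float, files: int) -> float:
    """Average output file size in MB, using decimal units (1 GB = 1000 MB)."""
    return total_gb * 1000 / files


if __name__ == "__main__":
    print(f"{cpu_years(2000, 4):.0f} CPU-years at full utilisation")
    print(f"{mean_file_size_mb(600, 60_000):.0f} MB per output file on average")
```

At full utilisation 2000 CPUs over 4 weeks give roughly 153 CPU-years, so the quoted ">100 CPU-years" is consistent with CPUs that were busy most, but not all, of the time. With more than 60,000 files sharing 600 Gigabytes, the average output file is at most about 10 MB.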

SLIDE 19

Molecular Dynamics

Molecular structure of β-lactamase

SLIDE 20

Earth Sciences

  • Climate Change
  • Energy Supply
  • Population Growth
  • Natural Hazards
SLIDE 21

Atmospheric Modelling

SLIDE 22

Asteroid impact simulations

SLIDE 23

Chicxulub crater

SLIDE 24

Impact Processes

SLIDE 25

Schiller Crater, Moon

SLIDE 26

In the first 2 years EGEE

  • Established production quality sustained Grid services
    – 3000 users from at least 5 disciplines
    – Goal was to integrate 50 sites into a common infrastructure; currently 180
    – offer 5 Petabytes (10^15) storage
  • Demonstrated a viable general process to bring other scientific communities on board
  • Secured a second phase from April 2006

Figure legend: Pilot, New

SLIDE 27

EGEE-I Grid Operations

SLIDE 28

Natural continuation of EGEE

  • Expanded consortium
  • Emphasis on providing an infrastructure
  • increased support for applications
  • interoperability with other infrastructures
  • more involvement from Industry
SLIDE 29

EGEE-II Activities

SA: service activities

  • establishing operations

NA: network activities

  • supporting VOs

JRA: “joint research activities”

  • e.g. hardening middleware

SLIDE 30

EGEE-II: Expertise & Resources

  • More than 90 partners
  • 32 countries
  • 12 federations

Major and national Grid projects in Europe, USA, Asia

+ 27 countries through related projects:

– BalticGrid
– SEE-GRID
– EUMedGrid
– EUChinaGrid
– EELA

SLIDE 31

Related projects: infrastructure, education, application

Name            Description
BalticGrid      EGEE extension to Estonia, Latvia, Lithuania
EELA            EGEE extension to Brazil, Chile, Cuba, Mexico, Argentina
EUChinaGRID     EGEE extension to China
EUMedGRID       EGEE extension to Malta, Algeria, Morocco, Egypt, Syria, Tunisia, Turkey
ISSeG           Site security
eIRGSP          Policies
ETICS           Repository, Testing
BELIEF          Digital Library of Grid documentation, organisation of workshops, conferences
BIOINFOGRID     Biomedical
Health-e-Child  Biomedical – Integration of heterogeneous biomedical information for improved healthcare
ICEAGE          International Collaboration to Extend and Advance Grid Education

SLIDE 32

Grid services

How can EGEE middleware support collaboration and resource sharing within and between many diverse VOs?

SLIDE 33

Grid Middleware

  • When using a PC or workstation you
    – Login with a username and password (“Authentication”)
    – Use rights given to you (“Authorisation”)
    – Run jobs
    – Manage files: create them, read/write, list directories
  • Components are linked by a bus
  • Operating system
  • One admin domain
  • When using a Grid you
    – Login with digital credentials (“Authentication”)
    – Use rights given to you (“Authorisation”)
    – Run jobs
    – Manage files: create them, read/write, list directories
  • Services are linked by the Internet
  • Middleware
  • Many admin domains
SLIDE 34

Typical current grid

  • Grid middleware runs on each shared resource
    – Data storage
    – (Usually) batch queues on pools of processors
  • Users join VOs
  • Virtual organisation negotiates with sites to agree access to resources
  • Distributed services (both people and middleware) enable the grid, allow single sign-on

INTERNET

SLIDE 35

Overview of gLite Middleware

SLIDE 36

Authorisation, Authentication (AA)

Users in many locations and organisations

Hardware: Computing clusters,…, Network resources, Data storage

System software: Operating system, Local scheduler, File system

Resources in many locations and organisations

Grid Security Infrastructure

SLIDE 37

Basic job submission

Users

Resources: Compute elements, Network resources, Data storage

How do I run a job on a compute element (CE)? (CE = batch queue)

Tools that:

  • copy files to and between CEs and data storage
  • Submit job to a CE
  • Monitor job
  • Get output
SLIDE 38

Information service (IS)

Users

Resources: Compute elements, Network resources, Data storage

How do I know which CE could run my job? Which is free?

Information service:

  • Resources send updates to IS
  • Grid services query IS before running jobs
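The publish-then-query pattern on this slide can be sketched in a few lines. This is a toy in-memory model, not the gLite information system: the class names, the "most free slots" selection rule, the host names and the VO names are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class CEStatus:
    """State a compute element reports about itself (hypothetical fields)."""
    name: str
    free_slots: int
    supported_vos: frozenset


class InformationService:
    """Toy IS: resources push updates, grid services query before running jobs."""

    def __init__(self) -> None:
        self._status: Dict[str, CEStatus] = {}

    def publish(self, status: CEStatus) -> None:
        # A later update from the same CE overwrites the earlier one.
        self._status[status.name] = status

    def find_ce(self, vo: str) -> Optional[CEStatus]:
        # Keep only CEs that accept this VO and have at least one free slot.
        candidates = [
            s for s in self._status.values()
            if vo in s.supported_vos and s.free_slots > 0
        ]
        # Assumed policy: prefer the CE with the most free slots.
        return max(candidates, key=lambda s: s.free_slots, default=None)


if __name__ == "__main__":
    info = InformationService()
    info.publish(CEStatus("ce-a.example.org", 0, frozenset({"biomed"})))
    info.publish(CEStatus("ce-b.example.org", 12, frozenset({"biomed", "atlas"})))
    print(info.find_ce("biomed"))
```

A real broker would weigh many more attributes (queue length, software tags, data location); the point here is only that the decision is made from published state rather than by contacting each CE.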

SLIDE 39

File management

Users

Resources: Compute elements, Network resources, Data storage

We’ve terabytes of data in files.

Storage, Transfer, Replica management

  • EGEE data: primarily file-based
  • services for databases used by some VOs, not in release yet
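Replica management boils down to tracking which storage elements hold a copy of a given file. The sketch below is a minimal in-memory illustration of that idea, assuming a file is identified by a logical name; the class, its methods and the host names are hypothetical and are not the gLite catalogue API.

```python
from collections import defaultdict
from typing import Dict, List, Set


class ReplicaCatalogue:
    """Toy catalogue mapping a logical file name to the storage elements holding it."""

    def __init__(self) -> None:
        self._replicas: Dict[str, Set[str]] = defaultdict(set)

    def register(self, logical_name: str, storage_element: str) -> None:
        """Record that a copy of the file exists on the given storage element."""
        self._replicas[logical_name].add(storage_element)

    def replicas(self, logical_name: str) -> List[str]:
        """List every storage element known to hold the file (sorted for stability)."""
        return sorted(self._replicas.get(logical_name, set()))

    def replicate(self, logical_name: str, source: str, target: str) -> None:
        """Record a new copy on target; the actual file transfer is out of scope here."""
        if source not in self._replicas.get(logical_name, set()):
            raise KeyError(f"{logical_name} has no replica on {source}")
        self._replicas[logical_name].add(target)


if __name__ == "__main__":
    catalogue = ReplicaCatalogue()
    catalogue.register("docking/run-001.out", "se-a.example.org")
    catalogue.replicate("docking/run-001.out", "se-a.example.org", "se-b.example.org")
    print(catalogue.replicas("docking/run-001.out"))
```

Keeping the logical name separate from the physical copies is what lets a job be sent to whichever site already holds its input data.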

SLIDE 40

Parts of the Grid “ecosystem”

Diagram labels (timeline 2001, 2004, future grids): USA, EU, Globus, MyProxy, Condor, ..., VDT, EDG, DataTAG, CrossGrid, ..., LCG, EGEE, SRM, interactive, NextGrid, DEISA, GridCC; used in OSG, …

SLIDE 41

Security, Authentication and Authorisation

SLIDE 42

How does EGEE…

  • How does EGEE build dynamic distributed systems?
    – For many international collaborations (“virtual organisations”)
    – With n,000 processors in hundreds of independent sites (“administrative domains”)
    – With no prior direct relationship between users and resource providers
    – In a world where public networks are abused by hackers, etc.

SLIDE 43

Authentication and Authorisation

  • 1. Authentication - communication of identity
    Basis for
    – Message integrity - so tampering is recognised
    – Message confidentiality, if needed - so only sender and receiver can understand the message
    – Non-repudiation: knowing who did what, when – can’t deny it
  • 2. Authorisation - once identity is known, what can a user do?
  • 3. Delegation - A allows service B to act on behalf of A

– Based on X.509 certificates
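The three steps can be illustrated with a deliberately simplified model. The sketch below contains no cryptography at all: a "credential" is just a record naming its subject and issuer, and trust is checked by walking the chain back to a hypothetical certificate authority. Real X.509 certificates and proxies rely on digital signatures, which this toy omits; every name in it is an assumption for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set

TRUSTED_CA = "Example CA"  # hypothetical certificate authority


@dataclass(frozen=True)
class Credential:
    subject: str
    issuer: str
    expires: int  # toy clock, in hours
    parent: Optional["Credential"] = None


def delegate(cred: Credential, lifetime: int, now: int) -> Credential:
    """Delegation: the holder issues a short-lived proxy a service can present."""
    return Credential(
        subject=cred.subject + "/proxy",
        issuer=cred.subject,
        expires=min(cred.expires, now + lifetime),  # a proxy never outlives its parent
        parent=cred,
    )


def authenticate(cred: Credential, now: int) -> Optional[str]:
    """Authentication: return the end user's identity, or None if the chain is invalid."""
    current = cred
    while True:
        if now >= current.expires:
            return None
        if current.parent is None:
            # End of the chain: it must have been issued by the trusted CA.
            return current.subject if current.issuer == TRUSTED_CA else None
        if current.issuer != current.parent.subject:
            return None
        current = current.parent


def authorise(identity: str, action: str, rights: Dict[str, Set[str]]) -> bool:
    """Authorisation: once identity is known, look up what that user may do."""
    return action in rights.get(identity, set())


if __name__ == "__main__":
    user = Credential("alice", TRUSTED_CA, expires=24 * 365)
    proxy = delegate(user, lifetime=12, now=0)
    who = authenticate(proxy, now=5)
    print(who, authorise(who, "submit-job", {"alice": {"submit-job"}}))
```

The short proxy lifetime is the design point worth noticing: a service holding the proxy can act for the user only for a limited time, which limits the damage if the proxy is stolen.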

SLIDE 44

Current production middleware

Diagram components: “User interface”, Resource Broker, Replica Catalogue, Information Service, Logging & Book-keeping, Computing Element, Storage Element, Author. & Authen.

Diagram flows: Job Submit Event, Job Query, Job Status, Input “sandbox”, Input “sandbox” + Broker Info, Output “sandbox”, DataSets info, Publish SE & CE info

SLIDE 45

User Interface node

  • The user’s interface to the Grid
  • Command-line interface to:

    – Create/Manage proxy certificates
    – Job operations
        To submit a job
        Monitor its status
        Retrieve output
    – Data operations
        Upload file to SE
        Create replica
        Discover replicas
    – Other grid services

SLIDE 46

User Interface node

  • Also C++ and Java APIs
  • To run a job, the user creates a JDL (Job Description Language) file

Diagram labels: UI, JDL
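A JDL file is a short list of `Attribute = value;` statements. The sketch below generates one from a Python dictionary. The attribute names used (`Executable`, `StdOutput`, `StdError`, `OutputSandbox`) are the ones commonly shown in introductory gLite examples, but treat them as an assumption and check them against the gLite-3 User Guide; `render_jdl` is a hypothetical helper, not part of any gLite API.

```python
from typing import Mapping, Sequence, Union

Value = Union[str, Sequence[str]]


def render_jdl(attributes: Mapping[str, Value]) -> str:
    """Render attributes as JDL text: strings are quoted, lists become {...} sets."""
    lines = []
    for name, value in attributes.items():
        if isinstance(value, str):
            rendered = f'"{value}"'
        else:
            rendered = "{" + ", ".join(f'"{item}"' for item in value) + "}"
        lines.append(f"{name} = {rendered};")
    return "\n".join(lines)


if __name__ == "__main__":
    # A minimal job: run /bin/hostname and bring back its output and error streams.
    job = {
        "Executable": "/bin/hostname",
        "StdOutput": "std.out",
        "StdError": "std.err",
        "OutputSandbox": ["std.out", "std.err"],
    }
    print(render_jdl(job))
```

The resulting text would be saved to a file on the User Interface node and passed to the job submission command.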

SLIDE 47

Querying job status

Possible Job States:

Flag       Meaning
SUBMITTED  submission logged in the LB
WAIT       job match making for resources
READY      job being sent to executing CE
SCHEDULED  job scheduled in the CE queue manager
RUNNING    job executing on a WN of the selected CE queue
DONE       job terminated without grid errors
CLEARED    job output retrieved
ABORT      job aborted by middleware, check reason
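The job states listed on this slide can be read as a simple state machine. The sketch below encodes them under two assumptions that the slide does not state: that a successful job passes through the states in the order listed, and that the middleware may abort a job from any state that is not already final. Real middleware may allow other transitions (for example resubmission), so this is an illustration only.

```python
from enum import Enum


class JobState(Enum):
    SUBMITTED = "submission logged in the LB"
    WAIT = "job match making for resources"
    READY = "job being sent to executing CE"
    SCHEDULED = "job scheduled in the CE queue manager"
    RUNNING = "job executing on a WN of the selected CE queue"
    DONE = "job terminated without grid errors"
    CLEARED = "job output retrieved"
    ABORT = "job aborted by middleware, check reason"


# Assumed order for a job that succeeds.
HAPPY_PATH = [
    JobState.SUBMITTED, JobState.WAIT, JobState.READY, JobState.SCHEDULED,
    JobState.RUNNING, JobState.DONE, JobState.CLEARED,
]
FINAL_STATES = {JobState.CLEARED, JobState.ABORT}


def is_valid_transition(current: JobState, new: JobState) -> bool:
    """True if `new` may directly follow `current` under the assumptions above."""
    if current in FINAL_STATES:
        return False
    if new is JobState.ABORT:
        return True
    return HAPPY_PATH.index(new) == HAPPY_PATH.index(current) + 1


def output_ready(state: JobState) -> bool:
    """Output can be retrieved once the job is DONE and not yet CLEARED."""
    return state is JobState.DONE


if __name__ == "__main__":
    print(is_valid_transition(JobState.RUNNING, JobState.DONE))
    print(is_valid_transition(JobState.WAIT, JobState.RUNNING))
```

A script polling job status would typically loop until the state is DONE or ABORT, then either retrieve the output or inspect the abort reason.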

SLIDE 48

http://gridportal.hep.ph.ic.ac.uk/rtm

SLIDE 49

Further information

  • EGEE www.eu-egee.org
  • EGEE: 1st user Forum http://egee-intranet.web.cern.ch/egee-intranet/User-Forum
  • gLite-3 User Guide https://edms.cern.ch/file/722398//gLite-3-UserGuide.pdf

  • EGEE digital library: http://egee.lib.ed.ac.uk/

NEW!!!

SLIDE 50

eLearning – Digital Library

http://egee.lib.ed.ac.uk/

SLIDE 51

e-Learning statistics

  • 70 articles specific to EGEE – ICEAGE repository has 535

  • 7 courses
  • 8 exercises
  • 37 modules
  • 2559 presentations
  • 35 tutorials
  • 82 videos
  • EGEE digital library: http://egee.lib.ed.ac.uk/
SLIDE 52

Other Learning Resources

gLite 3.0 Tutorial twiki:

https://grid.ct.infn.it/twiki/bin/view/GILDA/UserTutorials

User Information Group Pages

  • Regularly reviewed
  • Certified up-to-date

http://homepages.nesc.ac.uk/~scotland/UIG/index.html

Grid Acronym Soup

http://www.gridpp.ac.uk/gas/

SLIDE 53

Contacts

  • EGEE Public Portal

http://www.eu-egee.org

  • Technical Information

http://egee-technical.web.cern.ch/egee-technical/

  • How to try the grid running gLite

https://gilda.ct.infn.it

  • EGEE Project Office

project-eu-egee-po@cern.ch

SLIDE 54

Summary

  • From the rich grid ecosystem emerged the EGEE production middleware
    – Built on tools for
        Authorisation and authentication
        Job submission (direct to a Computing Element)
        File transfer
    – …with higher level services
        Job submission to “a grid” (via resource broker)
        Data management
        Information Systems
    – …and upon these can be built toolkits and services for new application communities
        Workflow
        Portals: e.g. P-GRADE Portal – www.lpds.sztaki.hu/pgportal
  • Authorisation and authentication underpin the middleware
    – resource-sharing across organisations, without centralised control