SLIDE 1

World Wide Computing and the ATLAS Experiment

Taipei, 25th July 2004

Roger Jones
ATLAS International Computing Board Chair

SLIDE 2

RWL Jones, Lancaster University

Complexity of the Problem

Major challenges associated with:

  • Communication and collaboration at a distance (collaborative tools)
  • Distributed computing resources
  • Remote software development and physics analysis

SLIDE 3

RWL Jones, Lancaster University

Context

  • There is a real challenge presented by the large data volumes (~10 PB/year) expected by 2007 and beyond
    • Despite Moore's Law, we will need large-scale distributed computing → Grids
    • There are many Grid initiatives trying to make this possible
  • ATLAS is a worldwide collaboration, and so we span most Grid projects
    • This has strengths and weaknesses: we benefit from all developments, but we have problems maintaining coherence
    • We will ultimately be working with several Grids (with defined interfaces)
    • This may not be what funders like the EU want to hear!

SLIDE 4

RWL Jones, Lancaster University

The ATLAS Data

  • ATLAS
    • Not one experiment!
    • A facility for many different measurements and physics topics
  • Event selection
    • ~1 GHz pp collision rate (40 MHz bunch-crossing rate)
    • ~200 Hz event rate to mass storage
    • Real-time selection on leptons and jets

SLIDE 5

RWL Jones, Lancaster University

From Trigger to Reconstruction

[Dataflow diagram: detector read-out (RODs/ROBs) feeds LVL1 (~2.5 µs latency) and LVL2, then the Event Builder and Event Filter (~seconds per event, ~4 GB/s in), with ~300 MB/s of Raw + Calibration + ESD data written to mass storage at the CERN Tier-0 and Tier-1/2s. An accompanying table lists automatic tape (TB), disk (TB) and CPU (kSI2k) requirements.]

T0 reconstruction: ~15 kSI2k-s/event (1.3 kSI2k = one P4 at 3.2 GHz)
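
As a rough scale check (a sketch only, using the figures quoted on this slide; the ~1e7 s live time per year is an assumption, not stated here):

```python
# Rough Tier-0 reconstruction load from the figures quoted on this slide.
EVENT_RATE_HZ = 200          # events/s to mass storage
RECO_COST_KSI2K_S = 15       # kSI2k-seconds per event
P4_3_2GHZ_KSI2K = 1.3        # one 3.2 GHz Pentium 4 in kSI2k
LIVE_SECONDS_PER_YEAR = 1e7  # assumed accelerator live time

cpu_needed_ksi2k = EVENT_RATE_HZ * RECO_COST_KSI2K_S      # ~3000 kSI2k
cpus_needed = cpu_needed_ksi2k / P4_3_2GHZ_KSI2K          # ~2300 boxes
events_per_year = EVENT_RATE_HZ * LIVE_SECONDS_PER_YEAR   # ~2e9 events

print(f"Prompt reconstruction needs ~{cpu_needed_ksi2k:.0f} kSI2k "
      f"(~{cpus_needed:.0f} P4/3.2 GHz boxes) for ~{events_per_year:.1e} events/year")
```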

SLIDE 6

RWL Jones, Lancaster University

The Global View

Distribution to ~6 Tier-1s (T1s)

Each of the N T1s holds 2/N of the reconstructed data. Doing research therefore requires a sophisticated software infrastructure for complete and convenient data access across the whole collaboration, and sufficient network bandwidth (~2.5 Gb/s) to keep up with the data transfer from T0 to the T1s.
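
A small sketch of the sharing arithmetic implied here; the ~300 MB/s average export rate per T1 is taken from the later "System" slide, so treat it as an assumption in this context:

```python
# "Each of N T1s holds 2/N of the reconstructed data", and a check that the
# quoted ~2.5 Gb/s T0->T1 link keeps up with an assumed ~300 MB/s per T1.
N_T1 = 6
share_per_t1 = 2 / N_T1                       # 1/3 of the data at each T1
export_rate_MB_s = 300                        # assumed average T0 -> one T1
export_rate_Gb_s = export_rate_MB_s * 8 / 1000

print(f"Each T1 holds {share_per_t1:.2f} of the reconstructed data "
      f"(two complete copies spread over {N_T1} sites)")
print(f"~{export_rate_MB_s} MB/s = {export_rate_Gb_s:.1f} Gb/s, "
      f"within the quoted ~2.5 Gb/s T0->T1 bandwidth")
```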

SLIDE 7

RWL Jones, Lancaster University

Digital Divide

  • This is an opportunity to move closer to a single scientific community, wealthy/developed/emerging/poor
  • This requires investment on all sides
    • For the 'haves': international networks, Grid infrastructure and toolkits, an open approach
    • For the 'have less': investment in local networks and infrastructure, regional T2s, a negotiated relationship with a Tier-1

SLIDE 8

RWL Jones, Lancaster University

Digital Divide

  • Excellent networking is assumed in the Grid model
    • Requires global high-bandwidth connectivity
    • Projects like GLORIAD will provide this
  • Allows countries to contribute to ATLAS computing through in-kind local capacity
    • High-bandwidth integration with the ATLAS infrastructure
    • MC data produced locally may stay there and be accessed by ATLAS collaborators
  • Will give member countries a collaborative-tools connection to the collaboration
  • All local institutions in ATLAS must have significant bandwidth to the local T2

GLORIAD: GLobal RIng network for Advanced applications Development. A 10 Gb/s network planned to start in 2004.

SLIDE 9

RWL Jones, Lancaster University

Digital Divide

  • Needed: ATLAS computing capacity in countries with small numbers of users, e.g. China (10-15 end users, 88% for local usage, 12% shared with the whole collaboration)
    • 120 kSI2k CPU
    • 80 TB disk
    • 34 TB tape/slow media
    • Internal connection to the local T2: 0.16-0.65 Gb/s
    • Good PC as workstation for each researcher
  • Variation 1
    • Copy of the 3 TB/year TAG at the local institution
    • Network to the T2 on the low end
    • ~0.5 TB locally per user
    • ~1 kSI2k locally per user
  • Variation 2
    • Access the 3 TB TAG at the local T2
    • Network to the T2 on the high end, less disk space locally

SLIDE 10

RWL Jones, Lancaster University

ATLAS Components

  • Grid Projects
    • Develop the middleware
    • Provide hardware resources
    • Provide some manpower resource
    • But also drain resources from our core activities
  • Computing Model
    • Dedicated group to develop the computing model
    • Revised resources and planning paper evolving, Sep 2004
    • Now examined from DAQ to end-user
    • Must include University/local resources
    • Devise various scenarios, different distributions of data
  • Data Challenges
    • Test the computing model
    • Service other needs in ATLAS (but this must be secondary in DC2)

SLIDE 11

RWL Jones, Lancaster University

Grid Projects

[Diagram of the Grid projects ATLAS spans, including EGEE]

Until these groups provide interoperability, the experiments must provide it themselves.

SLIDE 12

RWL Jones, Lancaster University

[Gartner "hype cycle" curve (Hype vs. Time): Technology Trigger, Peak of Inflated Expectations, Trough of Disillusionment, Slope of Enlightenment, Plateau of Productivity. Where is the Grid on this curve?]

SLIDE 13

RWL Jones, Lancaster University

Test Bench – Data Challenges

  • DC1: Jul 2002 - May 2003
    • Showed the many resources available (hardware, willing people)
    • Made clear the need for an integrated system
    • Some tests of Grid software
    • Mainly driven by HLT and Physics Workshop needs
    • One external driver is sustainable, two is not!
  • DC2: June - Sept 04
    • Real test of the computing model for the computing TDR
    • Must use Grid systems
    • Analysis and calibration + reconstruction and simulation
    • Pre-production period (end June 04…) then 1-week intensive tests
  • DC3: 05/06
    • Physics readiness TDR. Big increase in scale

SLIDE 14

RWL Jones, Lancaster University

DC2: June – Sept 2004

  • The goal includes:
    • Wide use of the Grid middleware and tools
    • Large-scale physics analysis in the latter phase
    • Computing model studies (document end 2004)
    • Slice test of the computing activities in 2007
  • Also
    • Full use of Geant4; POOL; LCG applications
    • Pile-up and digitization in Athena
    • Deployment of the complete Event Data Model and the Detector Description
    • Simulation of full ATLAS and the 2004 combined test beam
    • Test the calibration and alignment procedures
    • Run as much as possible of the production on LCG-2
    • Combined test beam operation foreseen as concurrent with DC2 and using the same tools
  • Also need networking tests at full rate and above
    • Co-ordinate with LCG service challenges, ESLEA and other light-path network tests

SLIDE 15

RWL Jones, Lancaster University

  • Preparation phase: worldwide exercise (June-July 04)
    • Event generation; simulation (>10^7 events); pile-up and digitization
    • All "byte-stream" data sent to CERN
  • Reconstruction: at Tier0
    • ~400 processors, short term, sets the scale
    • Several streams: express lines, calibration and alignment lines, different output streams
    • ESD and AOD replicated to Tier1 sites
  • Out of Tier0
    • Re-calibration: new calibrations and alignment parameters
    • Re-processing
    • Analysis using ATLAS Distributed Analysis in the late phase
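
A back-of-the-envelope check of the Tier-0 scale set here (a sketch only: it combines the ~400 processors above with the ~15 kSI2k-s/event and 1.3 kSI2k/CPU quoted on the earlier trigger slide, so the mixing of numbers across slides is my assumption):

```python
# Rough DC2 Tier-0 throughput estimate from figures quoted on these slides.
N_CPUS = 400                 # short-term Tier-0 farm for DC2
CPU_KSI2K = 1.3              # one 3.2 GHz P4 (earlier slide)
RECO_COST_KSI2K_S = 15       # kSI2k-seconds per event (earlier slide)
N_EVENTS = 1e7               # DC2 simulated sample (lower bound quoted above)

events_per_second = N_CPUS * CPU_KSI2K / RECO_COST_KSI2K_S   # ~35 Hz
days_needed = N_EVENTS / events_per_second / 86400           # ~3-4 days

print(f"~{events_per_second:.0f} events/s, ~{days_needed:.1f} days "
      f"to reconstruct {N_EVENTS:.0e} events")
```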

SLIDE 16

RWL Jones, Lancaster University

Test Beds & Grid Projects

All pre-production for DC2 to be done using Grid tools from:

  • LCG2
    • Common to all LHC experiments
    • LCG2 now rolling out
    • Much improved job success rate
  • Grid3/US ATLAS Test Bed
    • Demonstrated success of the grid computing model for HEP
    • Developing & deploying grid middleware and applications
    • Wrap layers around applications, simplify deployment
    • Very important tools for data management (MAGDA) and software installation (pacman)
    • Will evolve into a fully functioning, scalable, distributed tiered grid
  • NorduGrid
    • A very successful regional test bed
    • Lightweight Grid user interface, middleware, working prototypes etc.
    • Now to be part of the Northern European Grid in EGEE

SLIDE 17

RWL Jones, Lancaster University

DC2 and Grid tools

  • Much work done:
    • Don Quijote for file replication; AMI (Grenoble) for metadata
    • CMT + Pacman have been combined to allow installation on demand
    • Windmill production system, with executors for each Grid and for batch
    • GANGA (ATLAS-LHCb joint project, Grid and batch interface for ATLAS production and analysis) + DIAL (BNL) + the above = ADA
    • A coherent view of tool use (bookkeeping; production) and integration is emerging (task force) – must span Grid efforts
    • Analysis will later use the ARDA middleware, but we need a solution 'yesterday'
  • LCG-2
    • We are using, contributing to and validating LCG-2 components as they become available (R-GMA; RLS; …)
    • The ATLAS-EDG task force has become the ATLAS-LCG task force

SLIDE 18

RWL Jones, Lancaster University

New ATLAS Production System

[Diagram: the Windmill supervisor talks (via jabber/soap) to one executor per facility – LCG (Lexor), NorduGrid (Dulcinea), Grid3 (Capone) and LSF batch – and to the production database (ProdDB). The Don Quijote data management system and the AMI metadata catalogue sit above the per-Grid RLS replica catalogues.]

Much of the problem is data management: this must cope with >= 3 Grid catalogues, and the demands will be greater for analysis.
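
The supervisor/executor split sketched in the diagram can be illustrated with a minimal Python sketch; the class and method names below are illustrative only, not the actual Windmill, Lexor, Capone or Dulcinea interfaces:

```python
# Minimal sketch of the supervisor/executor pattern: one supervisor pulls job
# definitions (nominally from the production DB) and hands them to an executor
# per facility.  Names are illustrative, not the real production-system APIs.
from abc import ABC, abstractmethod
from typing import Dict

class Executor(ABC):
    """One executor per facility (LCG, NorduGrid, Grid3, LSF batch)."""
    @abstractmethod
    def submit(self, job: dict) -> str: ...
    @abstractmethod
    def status(self, job_id: str) -> str: ...

class LCGExecutor(Executor):
    def submit(self, job: dict) -> str:
        # would translate the job definition into an LCG submission here
        return f"lcg-{job['id']}"
    def status(self, job_id: str) -> str:
        return "RUNNING"

class Supervisor:
    """Dispatches jobs to the executor for the chosen facility."""
    def __init__(self, executors: Dict[str, Executor]):
        self.executors = executors
    def dispatch(self, job: dict, facility: str) -> str:
        return self.executors[facility].submit(job)

supervisor = Supervisor({"LCG": LCGExecutor()})
print(supervisor.dispatch({"id": 42, "transformation": "G4 simulation"}, "LCG"))
```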

SLIDE 19

RWL Jones, Lancaster University

Computing Model Group

  • Areas being addressed:
    1. Computing Resources
    2. Networks from DAQ to primary storage
    3. Databases
    4. Grid Interfaces
    5. Computing Farms
    6. Distributed Analysis
    7. Distributed Production
    8. Alignment & Calibration Procedures
    9. Tests of Computing Model
    10. Minimum permissible service
    11. Simulation of model
  • Interim report Dec 03 identified areas for further work and clarification
  • Final Report by end 2004, ready for the C-TDR

SLIDE 20

RWL Jones, Lancaster University

A More Grid-like Model

[Diagram: the LHC Computing Facility as a cloud. CERN at the centre; Tier-1 centres (Germany, Taipei ASCC, UK, France, Italy, NL, USA Brookhaven, …); Tier-2 centres, labs and universities (Lancs, Lab a, Lab b, Uni x, …, plus the UK NorthGrid, SouthGrid, LondonGrid and ScotGrid); physics departments and desktops at the edge.]

SLIDE 21

RWL Jones, Lancaster University

The System

[Diagram: Event Builder → Event Filter (~7 MSI2k) → T0 (~5 MSI2k) at CERN, then Tier-1 regional centres (UK RAL, US, French, Asian, …), Tier-2 centres (~200 kSI2k each, e.g. a Northern Tier of Sheffield, Manchester, Liverpool, Lancaster, ~0.25 TIPS) and workstations. Indicative rates: ~PB/s off the detector, >10 GB/s into the Event Filter, ~300 MB/s/T1/experiment out of T0, ≥622 Mb/s T1-T2 links, 100-1000 MB/s within sites. PC (2004) = ~1 kSpecInt2k.]

  • Tier 0 / Tier 1
    • ~9 PB/year; no simulation at Tier 1
    • N Tier 1s each store 1/N of the raw data, reprocess it and archive the ESD, hold 2/N of the current ESD for scheduled analysis, and hold all AOD+TAG
  • Tier 2
    • Each Tier 2 has ~25 physicists working on one or more channels
    • Each Tier 2 should have the full AOD, TAG and relevant Physics Group summary data
    • Tier 2s do the bulk of simulation and hold a physics data cache
    • Some data for calibration and monitoring goes to institutes; calibrations flow back

SLIDE 22

RWL Jones, Lancaster University

Operation of Tier-1's and Tier-2's

  • Tiers defined by capacity, role and level of service
    • No assumption of a single site (especially for T2s), but each must present as a single entity in human/response terms
  • Envisage at least 6 Tier-1's (24x7 response) for ATLAS. Each will:
    • Keep on disk 1/3 of the ESDs and full copies of the AODs and TAGs
    • Keep on tape 1/6 of the Raw Data, reprocess it and retain the ESD produced
    • Keep on disk 1/3 of the currently simulated ESDs and on tape 1/6 of previous versions
    • Provide facilities for physics-group-controlled ESD analysis
    • Calibration of real data
    • Support role for a defined set of Tier-2s
  • Estimate ~4 Tier-2's (various sizes, slower response) for each Tier-1. Each will:
    • Keep on disk a full copy of the TAG and roughly one full AOD copy per four T2s
    • Keep on disk a small selected sample of ESDs
    • Provide facilities (CPU and disk space) for user analysis and user simulation (~25 users/Tier-2)
    • Run central simulation
  • Note: we see the CERN 'Tier-1' as a super Tier-2 in role

SLIDE 23

RWL Jones, Lancaster University

Analysis on Tier-2's and Tier-3's

  • This area is under the most active change
    • We are trying to forecast resource usage and usage patterns from the Physics Working Groups
  • Assume ~10 selected large AOD datasets, one for each physics analysis group
  • Assume that each large local centre will have the full TAG to allow simple selections
    • Using these, jobs are submitted to the T1 cloud to select on the full ESD
    • A new collection or ntuple-equivalent is returned to the local resource
  • Distributed analysis systems under development
    • Metadata integration, event navigation and database designs are all top priority
    • ARDA may help, but will be late in the day for DC2 (risk of interference with DC2 developments)

SLIDE 24

RWL Jones, Lancaster University

External Tier 1, year 1

External T1 storage requirement:

  Item                     Disk (TB)   Tape (TB)   Fraction
  General ESD (curr.)            429         150        1/3
  General ESD (prev.)            214         150        1/6
  AOD                            257         180        1/1
  TAG                              3           2        1/1
  RAW Data (sample)                6         533        1/6
  RAW Sim                        0.0        33.3        1/6
  ESD Sim (curr.)               23.8         8.3        1/3
  ESD Sim (prev.)               11.9         8.3        1/6
  AOD Sim                         14          10        1/1
  TAG Sim                                                1/1
  User Data (20 groups)          171         120        1/6
  Total                         1130        1195

CPU: Processing for Physics Groups 1760 kSI2k; Reconstruction 340 kSI2k
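
For orientation, the fractions in the table imply the collaboration-wide size of each data product; a small sketch of that arithmetic (dividing the T1 disk share by its fraction is my inference from the table layout, not stated on the slide):

```python
# Back out the implied total size of selected data products from one external
# Tier-1's disk share and its listed fraction (values from the table above).
from fractions import Fraction

t1_disk_tb = {
    "General ESD (curr.)": (429,  Fraction(1, 3)),
    "AOD":                 (257,  Fraction(1, 1)),
    "TAG":                 (3,    Fraction(1, 1)),
    "ESD Sim (curr.)":     (23.8, Fraction(1, 3)),
}

for item, (disk_tb, frac) in t1_disk_tb.items():
    total_tb = disk_tb / float(frac)
    print(f"{item:22s}: {disk_tb:6.1f} TB at this T1 -> ~{total_tb:6.0f} TB collaboration-wide")
```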

SLIDE 25

RWL Jones, Lancaster University

A Regular T2, year 1

External T2 storage requirement:

  Item                       Disk (TB)   Tape (TB)   Fraction
  General ESD (curr.)               26                   1/50
  General ESD (prev.)                          18        1/50
  AOD                               64                    1/4
  TAG                                3                    1/1
  ESD Sim (curr.)                  1.4                   1/50
  ESD Sim (prev.)                    1                   1/50
  AOD Sim                           14          10        1/1
  User Data (600/6/4=25)            37          26
  Total                            146          57

CPU: Simulation 21 kSI2k; Reconstruction 2 kSI2k; Users 176 kSI2k; Total 199 kSI2k

SLIDE 26

RWL Jones, Lancaster University

CERN T1/2, year 1

CERN T1/2 storage requirement:

  Item                      Disk (TB)   Tape (TB)   Fraction
  General ESD (curr.)              26                   1/50
  General ESD (prev.)                         18        1/50
  AOD                             257                    1/1
  TAG                               3                    1/1
  ESD Sim (curr.)                 5.7                   2/25
  ESD Sim (prev.)                 5.7                   2/25
  AOD Sim                          14                   2/25
  TAG Sim
  User Data (100 users)           149        104
  Total                           460        122

CPU: 704 kSI2k required

SLIDE 27

RWL Jones, Lancaster University

Resource Summary, Year 1

                    CERN   All T1   All T2   Total
  Auto tape (PB)     4.4      7.2      1.4    12.9
  Disk (PB)          0.5      6.8      3.5    10.8
  CPU (MSI2k)        4.8     12.7      4.8    22.2
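
These summary columns are roughly consistent with the per-site tables above once scaled by the assumed site counts; the 6 T1s and ~4 T2s per T1 (hence ~24 T2s) are my reading of the earlier operations slide, so this is a sketch of a cross-check, not a quoted calculation:

```python
# Scale the single-site tables by the assumed site counts and compare with
# the summary row above (>=6 Tier-1s, ~4 Tier-2s per Tier-1 => ~24 Tier-2s).
N_T1, N_T2 = 6, 24

t1 = {"tape_tb": 1195, "disk_tb": 1130, "cpu_ksi2k": 1760 + 340}   # one external T1
t2 = {"tape_tb": 57,   "disk_tb": 146,  "cpu_ksi2k": 199}          # one regular T2

all_t1_disk_pb = N_T1 * t1["disk_tb"] / 1000    # ~6.8 PB   (summary: 6.8)
all_t1_cpu_msi = N_T1 * t1["cpu_ksi2k"] / 1000  # ~12.6 MSI2k (summary: 12.7)
all_t2_disk_pb = N_T2 * t2["disk_tb"] / 1000    # ~3.5 PB   (summary: 3.5)
all_t2_cpu_msi = N_T2 * t2["cpu_ksi2k"] / 1000  # ~4.8 MSI2k (summary: 4.8)

print(f"All T1: {all_t1_disk_pb:.1f} PB disk, {all_t1_cpu_msi:.1f} MSI2k")
print(f"All T2: {all_t2_disk_pb:.1f} PB disk, {all_t2_cpu_msi:.1f} MSI2k")
```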

SLIDE 28

RWL Jones, Lancaster University

Networking

  • Bob Dobinson was a great loss to the group
  • EF → T0: maximum 300 MB/s (450 MB/s with headroom)
  • If the EF is away from the pit, require 7 GB/s for the SFI inputs (10x10 Gb/s with headroom)
  • Off-site networking now being calculated with David Foster
  • Recent exercise with (almost) current numbers
    • Full bandwidth estimated as requirement * 1.5 (headroom) * 2 (capacity)
  • Propose a dedicated networking test in DC2, allied with plans in HLT

                        RAL (typical T1?)     T0
  ATLAS (MB/s)                        107    708
  Total (Gb/s)                        2.3   12.6
  Full (Gb/s)                         6.9   37.9
  Assumed (Gb/s)                       10     70
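
The headroom/capacity rule quoted above reproduces the "Full" row of the table; a one-line sketch of that formula applied to the site totals:

```python
# "Full" bandwidth = requirement * 1.5 (headroom) * 2 (capacity),
# applied to the site totals in the table above.
def full_bandwidth(required_gbps: float) -> float:
    return required_gbps * 1.5 * 2

for site, total_gbps in {"RAL (typical T1?)": 2.3, "T0": 12.6}.items():
    print(f"{site}: {total_gbps} Gb/s required -> {full_bandwidth(total_gbps):.1f} Gb/s full")
# -> 6.9 Gb/s and 37.8 Gb/s, matching the 6.9 / 37.9 row up to rounding
```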

SLIDE 29

RWL Jones, Lancaster University

ATLAS Distributed Analysis & GANGA

  • The ADA (ATLAS Distributed Analysis) project started in late 2003 to bring together existing effort to develop a DA infrastructure:
    • GANGA (GridPP in the UK) – front-end, splitting
    • DIAL (PPDG in the USA) – job model
  • It is based on a client/server model with an abstract interface between services
    • A thin client on the user's computer; an "analysis service" consisting itself of a collection of services on the server
  • The vast majority of GANGA modules fit easily into this scheme (or have been integrated):
    • GUI, CLI, JobOptions editor, job splitter, output merger, ...
  • Job submission will go through (a clone of) the production system
    • Using the existing infrastructure to access resources on the 3 Grids and the legacy systems
  • The forthcoming release of ADA (with GANGA 2.0) will have the first basic functionality to allow DC2 Phase III to proceed
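
The client/server split described above can be sketched as follows; the AnalysisService name and its methods are illustrative stand-ins for the AJDL-based interfaces, not the real ADA/GANGA/DIAL APIs:

```python
# Minimal sketch of the ADA-style split: a thin client holds only a job
# description and talks to an abstract "analysis service"; the concrete
# service hides where and how the job actually runs.  Names are illustrative.
from dataclasses import dataclass
from typing import Protocol

@dataclass
class JobDescription:          # stand-in for an AJDL-style job description
    application: str
    dataset: str
    options: str = ""

class AnalysisService(Protocol):
    def submit(self, job: JobDescription) -> str: ...
    def result(self, job_id: str) -> str: ...

class ProductionSystemService:
    """A service that forwards jobs through (a clone of) the production system."""
    def submit(self, job: JobDescription) -> str:
        return f"prod-{hash((job.application, job.dataset)) & 0xffff}"
    def result(self, job_id: str) -> str:
        return f"output collection for {job_id}"

def thin_client(service: AnalysisService) -> None:
    job_id = service.submit(JobDescription("athena-analysis", "dc2.aod.sample"))
    print(service.result(job_id))

thin_client(ProductionSystemService())
```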

SLIDE 30

RWL Jones, Lancaster University

Analysis

  • This is just the first step
  • Integrate with the ARDA back-end
  • Much work needed on metadata for analysis (LCG and GridPP metadata projects)
  • N.B. GANGA allows non-production MC job submission and data reconstruction end-to-end in LCG

[Diagram of the analysis architecture: client tools (ROOT command line, GANGA command line, GANGA GUI, Graphical Job Builder) sit on high-level services (GANGA Task Management, GANGA Job Management, Dataset Splitter, Dataset Merger, Catalogue services) and an Analysis Service, connected through high-level service interfaces (AJDL) to the middleware service interfaces (CE, WMS, File Catalogue, etc.) and the middleware services themselves.]

SLIDE 31

RWL Jones, Lancaster University

TAG Analysis

[Diagram: the LHC Computing Facility cloud (CERN; Tier-1s – Germany, Taipei ASCC, UK, France, Italy, NL, USA Brookhaven, …; Tier-2s, labs and universities; physics department), annotated with a TAG-based selection returning a tuple/collection to the local site (Lancs).]

SLIDE 32

RWL Jones, Lancaster University

AOD Analysis

[Diagram: the same computing-facility cloud, annotated with an AOD-based analysis running at the NorthGrid Tier-2 (Liverpool, Manchester, Sheffield, Lancaster) and returning a tuple/collection to the local site.]

SLIDE 33

RWL Jones, Lancaster University

ESD Analysis

[Diagram: ESD-level analysis runs in the Tier-1 cloud under convener/manager control, returning a tuple/collection/AOD to the requesting site.]

SLIDE 34

RWL Jones, Lancaster University

Conclusions

  • The Grid is the only practical way to function as a worldwide collaboration
  • DC1 showed we have many resources, especially people
  • Grid projects are starting to deliver
    • Slower than desirable
    • Tensions over manpower
    • Problems of coherence
  • Real tests of the computing model are due this year
    • Serious and prompt input needed from the community
    • Revised costs are encouraging
  • Real sharing of resources is required
    • The rich must shoulder a large part of the burden
    • Poorer members must also contribute
    • This technology allows them to do this more effectively