SLIDE 1

ATLAS Worldwide Distributed Computing

Zhongliang Ren, 03 May 2006, ISGC 2006, Academia Sinica, Taipei, Taiwan

SLIDE 2

Agenda of My Talk

Why the LHC and ATLAS?
– The Standard Model
– The basic questions that remain to be answered

What are the LHC and ATLAS?
– The Large Hadron Collider (p-p or Pb-Pb collisions)
– The ATLAS experiment at the LHC
– The data rate of ATLAS → the Computing Grid!

The ATLAS Data Rate and Computing Model

The ATLAS Production Systems on Three Grids

The Worldwide Computing Operations
– Tier-0 operation and data distribution to T1s
– MC data production at T2s
– Data reprocessing at T1s

Summary & Outlook

SLIDE 3

Why the LHC and ATLAS?

Our current knowledge is the Standard Model:
– The electroweak (EW) unification theory
– Quantum chromodynamics (QCD)

The Standard Model has been highly successful in precision tests:
– Results from LEP (the Large Electron-Positron Collider), etc.

However, it introduces a fifth force, the Higgs field:
– which has never been observed in any experiment!

Many basic questions remain to be answered!

SLIDE 4

Basic Questions

What is the origin of the mass of particles?
Can the electroweak and the strong forces be unified?
What is "dark matter" made of?
Why are there three generations of particles?
Where did the antimatter go?
…

SLIDE 5

What is the LHC?

• The LHC is the Large Hadron Collider:
– Being built at CERN, Geneva, across the French-Swiss border
– 27 km circumference underground tunnel at ~100-150 m depth
– 1232 superconducting dipole magnets (max. 8.3 T, 15 m long)
– Head-on proton-proton (14 TeV) or Pb-Pb collisions (~30 GeV/fm^3 energy density)
– Design luminosity of 10^34 cm^-2 s^-1 (~10^9 Hz interaction rate!); luminosity in 2007: 10^33 cm^-2 s^-1
• New particles with masses up to 5 TeV/c^2 can be produced & studied
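
As a sanity check, the quoted interaction rate is just luminosity times cross-section. A minimal sketch, assuming an inelastic p-p cross-section of ~70 mb (a commonly quoted figure for 14 TeV, not stated on the slide):

```python
# Cross-check of the quoted ~10^9 Hz interaction rate.
# The cross-section value is an assumption, not taken from the slide.
luminosity = 1e34            # design luminosity, cm^-2 s^-1
sigma_inelastic = 70e-27     # ~70 mb in cm^2 (1 mb = 1e-27 cm^2)

rate_hz = luminosity * sigma_inelastic
print(f"interaction rate ~ {rate_hz:.0e} Hz")   # ~7e8, i.e. order 10^9 Hz
```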
SLIDE 6

Magnet Installation

Interconnection of the dipoles and connection to the cryoline are now the real challenges in the installation process.

Transport of dipoles in the tunnel with an optically guided vehicle.

SLIDE 7

Schedule of LHC Installation

All key objectives set for the end of 2005 have been reached. The magnet installation rate is now 20 per week, with more than 200 dipoles installed.

This, together with the interconnect work, will remain the bottleneck until the end of installation.

Main objectives:
– Complete installation in Feb. 2007
– First beam collisions in summer 2007

SLIDE 8

What Is the ATLAS Experiment?

SLIDE 9

Basic HEP Detector Components

[Diagram: particle signatures (photons, e±, muons, π±, p, neutrons) across the detector layers, innermost to outermost: tracking detector, electromagnetic calorimeter, hadronic calorimeter, muon chambers]

SLIDE 10

ATLAS: A Toroidal LHC ApparatuS

Diameter: 25 m; barrel toroid length: 26 m; overall length: 46 m; overall weight: 7000 tons.

General-purpose particle detector (coverage up to |η|=5, L=10^34 cm^-2 s^-1), colliding 7 TeV protons on 7 TeV protons:
– Tracking (|η|<2.5, B=2 T): Si pixels and strips, TRT (e/π separation)
– Calorimetry (|η|<4.9): LAr EM calorimeter (|η|<3.2); hadronic calorimeter (|η|<4.9): scintillator tile (central), LAr (forward)
– Muon spectrometer (|η|<2.7): air-core toroids with muon chambers

SLIDE 11

Scale of ATLAS

ATLAS superimposed on the five floors of building 40; ATLAS is assembled 92 m underground at CERN.

SLIDE 12

Inner Detector (ID)

The Inner Detector (ID) has three sub-systems:
– Pixels (0.8×10^8 channels)
– SemiConductor Tracker (SCT) (6×10^6 channels)
– Transition Radiation Tracker (TRT) (4×10^5 channels; 6 m long, 1.1 m radius)

Precision tracking: Pixels and SCT. Continuous tracking and electron identification: TRT. The ID sits inside a 2 T solenoid field.

SLIDE 13

All four completed SCT barrel cylinders have been integrated in their thermal enclosure.

Contribution from Taiwan: optical links for the Pixel and SCT detectors, developed by the team of the Institute of Physics, Academia Sinica.

SLIDE 14

November 4th: Barrel Toroid view after removal of the central support platform (ATLAS Cavern)

SLIDE 15

The ATLAS Data Rate and Computing Model

SLIDE 16

Types and sizes of event data, processing times and operation parameters

Raw data size:                          1.6 MB
ESD size:                               0.5 MB
AOD size:                               100 kB
TAG size:                               1 kB
Simulated data size:                    2.0 MB
Simulated ESD size:                     0.5 MB
Time for reconstruction:                15 kSI2k-sec/event
Time for simulation:                    100 kSI2k-sec/event
Time for analysis:                      0.5 kSI2k-sec/event
Raw event rate from online DAQ:         200 Hz
Operation time:                         50,000 seconds/day
Operation time:                         200 days/year
Operation time (2007):                  50 days/year
Event statistics:                       10^7 events/day
Event statistics (from 2008 onwards):   2×10^9 events/year
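
The headline data volumes follow directly from this table. A back-of-envelope sketch, using only the values above:

```python
# Daily data volumes implied by the table above; all inputs are taken
# directly from the slide (a rough sketch, not ATLAS code).
event_rate_hz = 200          # raw event rate from the online DAQ
seconds_per_day = 50_000     # effective data-taking time per day

sizes_mb = {"RAW": 1.6, "ESD": 0.5, "AOD": 0.1, "TAG": 0.001}

events_per_day = event_rate_hz * seconds_per_day    # 10^7 events/day
for kind, size_mb in sizes_mb.items():
    tb_per_day = events_per_day * size_mb / 1e6     # MB -> TB
    print(f"{kind}: {tb_per_day:8.2f} TB/day")
# RAW alone is 16 TB/day, i.e. the 320 MB/s quoted later in the talk.
```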

SLIDE 17

ATLAS Computing Model

• Worldwide distributed, computing-grid-based tier structure
• Two copies of raw and ESD data
• Tier-0 at CERN: archiving and distribution of raw data, calibration and first-pass processing, export of raw, ESD and AOD data to T1s
• 10 Tier-1s: reprocessing of raw and MC data; storage of raw (1/10 each), ESD and AOD data; storage of MC data produced at T2s; AOD replication to T2s
• Tier-2s: MC production, analysis, calibration, etc.
• Tier-3s: local user data analysis and storage, etc.
• Requires MC data equivalent to 20% of raw data events (see the storage sketch below)
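
The storage implications of this model follow from the table on slide 16. A rough sketch under the slide's own numbers (two raw/ESD copies, a 1/10 raw share per Tier-1, MC at 20% of raw events):

```python
# Rough annual storage implied by the computing model; all inputs
# come from the slides, the arithmetic is only an estimate.
events_per_year = 2e9
raw_mb, esd_mb = 1.6, 0.5
sim_mb = 2.0
mc_fraction = 0.20           # MC events as a fraction of raw events
n_tier1 = 10

raw_pb = events_per_year * raw_mb / 1e9     # one full raw copy, PB
esd_pb = events_per_year * esd_mb / 1e9     # one full ESD copy, PB
mc_pb = mc_fraction * events_per_year * sim_mb / 1e9
print(f"raw: {raw_pb:.1f} PB/year x 2 copies; per-T1 share {raw_pb / n_tier1:.2f} PB")
print(f"ESD: {esd_pb:.1f} PB/year x 2 copies")
print(f"MC (20% of raw events): {mc_pb:.1f} PB/year, produced at the T2s")
```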
SLIDE 18

SLIDE 19

SLIDE 20

The ATLAS Production Systems on Three Grids

• Globally distributed production using three computing grids: LCG, OSG and NorduGrid
• Four different production systems:
– LCG-Lexor (3 running instances: 2 in Italy, 1 in Spain)
– LCG-CG (2 running instances in Canada and Spain, 1 in France; 2 more planned at DESY and CERN)
– OSG-PANDA (1 instance)
– NorduGrid-Dulcinea (2 instances)
• DQ2 Distributed Data Management (DDM) system:
– Integrated in OSG-PANDA, LCG-CG and CERN T0 operations
– Being integrated in both LCG and NG now, ready to test soon
• DDM operations & production software integration:
– ATLAS VO-boxes & DQ2 servers ready at CERN and 10 T1s (ASGC, BNL, CNAF, FZK, CC-IN2P3, NG, PIC, RAL, SARA and TRIUMF); FTS channel configurations done
– The remaining task is to configure FTS channels for all T2s!
– Need to know the T1-T2 associations (see the sketch below)
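
To make the missing piece concrete, here is a hypothetical sketch of the kind of association table the FTS channel configuration needs. The Tier-1 names are from the slide; the Tier-2 entries and the helper function are placeholders invented for illustration, not the real associations (which the slide says are still to be determined):

```python
# Hypothetical T1-T2 association table; Tier-2 names are placeholders.
t1_to_t2 = {
    "ASGC":     ["TW-T2-A", "TW-T2-B"],
    "BNL":      ["US-T2-A"],
    "CC-IN2P3": ["FR-T2-A", "FR-T2-B"],
    # ... CNAF, FZK, NG, PIC, RAL, SARA, TRIUMF would follow
}

def fts_channels(assoc):
    """Enumerate the T1<->T2 channel pairs to configure in FTS."""
    for t1, t2s in assoc.items():
        for t2 in t2s:
            yield (t1, t2)     # T1 -> T2 (e.g. AOD replication)
            yield (t2, t1)     # T2 -> T1 (e.g. MC data upload)

for src, dst in fts_channels(t1_to_t2):
    print(f"configure FTS channel: {src} -> {dst}")
```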

SLIDE 21

LCG World View

Currently 51 sites with ATLAS SW installed; 9300+ CPUs (shared with other VOs).

SLIDE 22

OSG World View

Currently ~50 sites, ~5000 CPUs; ~1000 CPUs dedicated to ATLAS.

SLIDE 23

NorduGrid World View

13 sites, 789 CPUs available for ATLAS now

SLIDE 24

The Worldwide Computing Operations

SLIDE 25

Requirements & Deliverables

• The ATLAS Computing TDR (CTDR) assumes:
– 200 Hz event rate and 320 MB/s
– 50,000 seconds of effective data-taking per day, i.e. 10 million events per day
– MC data production equivalent to 20% of the raw data rate
– 2 million MC events per day
• ATLAS data-taking at full efficiency:
– 17.28 million events per day
– ~3.5 million MC events per day
• Computing operation deliverables in 2007-2008 (see the consistency check after this list):
– T0 operation:
  • Detector calibrations finish and event reconstruction starts within 24 hours of the start of data-taking
  • 15 kSI2K CPU-seconds per event
  • 3,000 CPUs in total
  • Effects from RAW data streaming w.r.t. luminosity blocks
– Worldwide distributed MC production (at T2s):
  • 50 events/job for physics events, 100 events/job for single-particle events
  • Central MC production: ~100K jobs/day
  • User analysis & production: up to ~1M jobs/day (job rate ~12 Hz)
– Additional capacity for calibration, alignment and reprocessing at T1s
– Global operation with more than 100 sites
– Requires stable 24x7 services
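
Several of these figures can be cross-checked against each other. A minimal sketch, assuming 1 kSI2K per CPU so that kSI2K totals map directly onto CPU counts:

```python
# Consistency check of the numbers above (the 1 kSI2K/CPU mapping is
# an assumption, not stated on the slide).
daq_rate_hz = 200
reco_ksi2k_sec = 15                      # per event
t0_ksi2k = daq_rate_hz * reco_ksi2k_sec  # steady-state reconstruction load
print(f"T0 reconstruction: {t0_ksi2k} kSI2K ~ {t0_ksi2k} CPUs at 1 kSI2K each")

full_day_sec = 24 * 3600                 # data-taking at full efficiency
events = daq_rate_hz * full_day_sec
print(f"events/day at full efficiency: {events / 1e6:.2f} million")  # 17.28

jobs_per_day = 1_000_000                 # production + user analysis ceiling
print(f"job rate: {jobs_per_day / full_day_sec:.1f} Hz")             # ~12 Hz
```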

SLIDE 26

Computing Operations: Work Groups

Since Sep. 2005, central computing operation teams have been set up at CERN:
– CERN T0 operations
– Service Challenges SC3/SC4
– Worldwide distributed production
– Software distribution & installation
– Site & software installation validation
– Job definitions
– Database replica deployment
– DDM operation & data management
– Production monitoring
– User support

This is a running-in of continuous operations rather than standalone challenges.

Offline shifts during data-taking are foreseen & planned.

SLIDE 27

ATLAS SC3 Tier-0 Operation Test

• Quasi-raw data generated at CERN and transferred from the Event Filter output to CERN CASTOR storage
• Real-time event reconstruction jobs run at the CERN T0 (~3000 kSI2K of CPU)
• "Raw data" and the reconstructed ESD and AOD data are replicated to Tier-1 sites, including ASGC
• Tests use of the CERN infrastructure…
– CASTOR2, LSF
• … and the WLCG Grid middleware…
– FTS, LFC, VO boxes
• … and ATLAS-specific components (sketched below):
– Production system: new supervisor (Eowyn)
– Tier-0 Management System (TOM)
– Raw data generator (Jerry)
– Distributed Data Management (DDM) software (DQ2)
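
How these components fit together can be sketched as one dataflow. Below is a minimal, self-contained mock-up of the Tier-0 chain exercised in the test; every function body is a stand-in for the real component it names (Jerry, TOM/Eowyn plus LSF, DQ2/FTS), not actual ATLAS code:

```python
# Mock-up of the SC3 Tier-0 dataflow; all bodies are illustrative stubs.

TIER1_SITES = ["ASGC", "BNL", "CNAF", "FZK", "CC-IN2P3",
               "NG", "PIC", "RAL", "SARA", "TRIUMF"]

def generate_quasi_raw(n_events):
    """Stand-in for Jerry: fakes Event Filter output."""
    return [{"id": i, "size_mb": 1.6} for i in range(n_events)]

def reconstruct(raw):
    """Stand-in for the LSF reconstruction jobs steered by TOM/Eowyn."""
    esd = [{"id": e["id"], "size_mb": 0.5} for e in raw]
    aod = [{"id": e["id"], "size_mb": 0.1} for e in raw]
    return esd, aod

def replicate(datasets, dest):
    """Stand-in for DQ2-managed FTS transfers out of CASTOR."""
    total_mb = sum(d["size_mb"] for ds in datasets for d in ds)
    print(f"replicating {total_mb:.0f} MB to {dest}")

raw = generate_quasi_raw(1000)        # quasi-raw data "from the EF"
esd, aod = reconstruct(raw)           # first-pass processing at the T0
for t1 in TIER1_SITES:                # RAW + ESD + AOD fan-out to the T1s
    replicate([raw, esd, aod], dest=t1)
```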

SLIDE 28

Tier-0 Operations

SLIDE 29

SC4 Test Plans in 2006

• Tier-0 operation test:
– SFO simulator at Point 1
– Realistic CondDB and TagDB data flows
– Automatic and continuous operation for weeks
– Induced error scenarios
– Etc.
• DDM data movement test
• Distributed simulation test (using only T2s, as stated in the CTDR)
• Data reprocessing test
• Distributed analysis test:
– "Chaotic" user analysis vs. the analysis-train model (see the sketch below)
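
The difference between the two models is essentially an I/O access pattern. A toy sketch (dummy analyses, counting event reads) to illustrate why the train model scales better:

```python
# Toy contrast of the two analysis models: in the "train" model many
# analyses share one sequential pass over a dataset; in the "chaotic"
# model each analysis reads the data independently.

def chaotic(analyses, dataset):
    reads = 0
    for analyze in analyses:           # each analysis re-reads everything
        for event in dataset:
            analyze(event)
            reads += 1
    return reads

def train(analyses, dataset):
    reads = 0
    for event in dataset:              # one shared pass over the data
        reads += 1
        for analyze in analyses:
            analyze(event)
    return reads

events = list(range(10_000))
analyses = [lambda e: None] * 50       # 50 dummy user analyses
print("chaotic event reads:", chaotic(analyses, events))  # 500,000
print("train event reads:  ", train(analyses, events))    # 10,000
```

With 50 analyses the train does 50x fewer event reads, at the cost of coupling every analysis to the train's schedule.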

SLIDE 30

Worldwide Distributed Productions

SLIDE 31

Worldwide Distributed Productions

SLIDE 32

Worldwide Distributed Productions

SLIDE 33

Summary and Conclusions

• Great progress in every ATLAS computing operation area:
– SC4 and Tier-0 test plans for 2006 established
– Distributed production rate has reached ~2 million events/week
– Need to ramp up the production rate by a factor of ~10 for the ATLAS startup and data-taking period!
– Need to achieve a production capacity of 500K to 1 million jobs/day to also accommodate distributed user analysis jobs
• Near-future plans:
– CSC production of ~20 million MC events in mid-2006 (depending on ATLAS offline software readiness)
– General rehearsal in spring 2007 of the full trigger, data-distribution, offline-processing and analysis chain, feeding MC-simulated "bytestream" data through the HLT, SFO, Tier-0, Tier-1, etc.
– Operation for the ATLAS start-up and real data-taking period in 2007
• We look forward to the successful application of the WLCG grids in ATLAS physics runs!

SLIDE 34

Backup Slides

SLIDE 35

Basic questions to be answered (1)

Do all the forces become one?
– A single grand unified force, as Einstein once dreamed?

Are there undiscovered principles of nature?
– New symmetries, new physical laws?

How can we solve the mystery of dark energy?
– Is it related to the Higgs field?

Are there extra dimensions of space?
– As in string theory, and do they manifest in particle physics?

Why are there so many kinds of particles?
– Why three families? Do the patterns and variations among them suggest undiscovered physics principles?

SLIDE 36

Basic questions to be answered (2)

What is the dark matter? Is it made of SUSY particles?
– How can we make it in the laboratory?

What are neutrinos telling us?
– Tiny masses? New physics at very high energies?

How did the Universe come to be? What happened to the antimatter produced in the big bang?
– How did the matter-antimatter asymmetry arise?

The LHC and ATLAS are being built to answer some (not all) of these questions!

SLIDE 37

Worldwide Computing Grid

To process the ~10 PB of data: the WLCG (Worldwide LHC Computing Grid). Service Challenge 3 test.

SLIDE 38

NorduGrid World View

SLIDE 39

IOP-AS Taiwan in ATLAS

Hardware: developed optical links for the Pixel, SCT, LAr and TileCal detectors, and the LMT for the SCT.

Software: ATLAS software release coordination; monitoring software for the TRT and global ID in the Athena framework; ATLAS computing operation coordination (offline computing from EF to T0 operation, worldwide real and MC data production, alignment, calibration and data reprocessing, etc.).

Grid: active in the development, deployment and testing of the WLCG middleware; contributed actively to the DC1, DC2 and Rome-workshop MC productions.

Analysis: simulation of cosmic muons and measurement of the muon spectrum; simulation studies for SM Higgs searches (di-boson channels).