

SLIDE 1: CMS Plans and Strategy for Physics Analysis on the Grids
Lothar A. T. Bauerdick / Fermilab
Invited Talk at the International Symposium for Grid Computing 2006, Academia Sinica, Taipei, May 2006
SLIDE 2: Contents of Talk
✦ Introduction
  ★ the Worldwide LHC Computing Grid should be ready for physics soon!
✦ CMS Computing and Data Model
  ★ computing tiers, data tiers, data structures
✦ Data Analysis Strategy
  ★ analysis process and model
✦ First experience
  ★ CMS data analysis on the grid using the CRAB system
  ★ first attempts and many open questions
✦ Acknowledgment
  ★ many slides lifted from CMS talks at the recent CHEP06 in Mumbai!
SLIDE 3: LHC Startup Plan
✦ Stage 1 (initial commissioning): 43x43 to 156x156 bunches, N = 3x10^10, zero to partial squeeze, L = 3x10^28 to 2x10^31 cm-2 s-1
✦ Stage 2 (75 ns operation): 936x936 bunches, N = 3-4x10^10, partial squeeze, L = 10^32 to 4x10^32 cm-2 s-1
✦ Stage 3 (25 ns operation): 2808x2808 bunches, N = 3-5x10^10, partial to near full squeeze, L = 7x10^32 to 2x10^33 cm-2 s-1
SLIDE 4: LHC First Physics Run in 2008
✦ Integrated luminosity with the current LHC plans
[Plot: 2008 run, weeks 1-20; instantaneous luminosity (10^30 cm-2 s-1) stepping from 10^31 to 10^32 to 10^33 cm-2 s-1, integrated luminosity (pb-1) reaching about 1.9 fb-1, and events/crossing]
✦ Assuming an LHC efficiency of 30% (optimistic!): about 1 fb-1 (optimistic?)
SLIDE 5: LHC First Physics Run in 2008 (continued)
✦ Same luminosity projection as the previous slide, annotated with the physics reach of the 2008 run: top re-discovery, Higgs (?), Z' to muons, SUSY
SLIDE 6: Pilot Run
✦ 30 days, maybe less (?); 43 x 43 bunches, then 156 x 156 bunches
✦ Jets and IVB production: 15 pb-1 ==> 30K W's and 4K Z's into leptons
✦ Measure cross sections, W and Z charge asymmetry (pdfs; IVB+jet production; top!)
[Plot: pilot run, days 1-30; instantaneous luminosity rising from about 10^28 to 10^31 cm-2 s-1, integrated luminosity of order 0.1 to 10 pb-1, and pile-up (events/crossing); assumes an LHC efficiency of 20% (optimistic!)]
SLIDE 7: Distributed Computing Better Be Delivering on Time!
✦ This is the last year of preparation to make Grid computing for the LHC work
  ★ computing resources are geographically distributed, interconnected via high-throughput networks, and operated by means of Grid software
  ★ the WLCG system is still very fragile, but it is functional and being used
  ★ Tier-0, Tier-1, Tier-2 and the CAFs are all essential for success
✦ Large aggregate computing resources are required:
  ★ in 2008 CMS requests a total of 45 MSI2k CPU, 14 PB disk, 24 PB tape
  ➡ CMS Computing Model document (CERN-LHCC-2004-035)
  ➡ CMS Computing Technical Design Report (CERN-LHCC-2005-023)
SLIDE 8: Large Computing Resource Pledges
✦ Seven Tier-1 centers catering to CMS
  ★ ASGC amongst them!
  ★ still, CMS did not get sufficient pledges for data storage at Tier-1 centers
  ➡ not enough tape library space to store all 2008 event and simulated data
✦ Most of the CMS data analysis will happen at Tier-2 centers
  ★ Tier-2 resource situation looks good! 50 MSI2k ~ 10,000 nodes!
  ★ some 30 Tier-2 sites are offering computing resources to CMS
  ➡ most of them listed in the WLCG MoU
  ★ some eleven Tier-2s are already working actively

Summary of Tier-2 pledges for 2008 (Offered vs. TDR Request):

                        ALICE   ATLAS     CMS    LHCb     SUM
  CPU (kSI2K)  Offered   5636   20114   18329    4436   48515
               TDR Req. 14400   19940   19300    7650   61290
               Balance   -61%     +1%     -5%    -42%    -21%
  Disk (TB)    Offered   1464    6252    4760     840   13316
               TDR Req.  3479    8748    4900      23   17150
               Balance   -58%    -29%     -3%     n/a    -22%
  Tape (TB)    Offered    345     683     500     380    1908
SLIDE 9: Principles of the CMS Computing Model
✦ Emphasizing the inherent structure of CMS data and data access
  ★ structured data and structured Grids
✦ Data granularity and data tiers
  ★ optimize sequential data access to well-defined data tiers
  ➡ eliminate the object-database philosophy from the Event Data Model
  ★ data always needs to be considered in its trigger context -> trigger paths
  ➡ O(2 PB)/yr of raw data split into O(50) trigger-determined datasets of ~40 TB each
✦ Computing tiers and a hierarchical Data Grid
  ★ map data flow and data-handling functions to a hierarchical structure
  ➡ event data flows Tier-0 -> Tier-1 -> Tier-2, with data being analyzed at Tier-2
  ★ facilitates well-defined roles and visible responsibilities for centers
✦ Building the ability to prioritize is very important
  ★ in 2007/8 the computing system efficiency may not be 100%...
SLIDE 10: Computing Tiers
✦ Tier-0:
  ★ accepts data from the DAQ
  ★ prompt reconstruction
  ★ data archive and distribution to the Tier-1's
✦ Tier-1's:
  ★ making samples accessible for selection and distribution
  ★ data-intensive analysis
  ★ re-processing
  ★ calibration
  ★ FEVT and MC data archiving
✦ Tier-2's:
  ★ user data analysis
  ★ MC production
  ★ import skimmed datasets from Tier-1 and export MC data
  ★ calibration/alignment
SLIDE 11: Computing Tiers (continued)
✦ CMS-CAF, the Analysis Facility at CERN:
  ★ access to the 10% express stream and eventually the full raw dataset
  ★ focused on latency-critical detector and trigger calibration and analysis activities
  ★ provides some CMS central services (e.g. storing conditions and calibrations)
✦ LPC-CAF and other user analysis facilities:
  ★ typically associated with Tier-1/2 centers
  ★ provide an interactive and batch analysis environment to users outside CERN
  ★ sizes range from "Tier-3" over "Tier-2" to significantly larger analysis facilities, e.g. at Fermilab and BNL
  ★ backbone for the analysis infrastructure
SLIDE 12: CMS Data Analysis at the CMS-CAF
✦ LHC running time is precious
  ★ requires short-latency feedback and fast turnaround: hours, not days
✦ Fast, efficient monitoring of data quality and trigger quality
  ★ with ad-hoc studies of detector data (special data streams)
  ★ with a few critical analyses that verify the physics (masses, cross sections)
✦ Calibration and alignment
  ★ require fast turnaround for the Tier-0 and (potentially) the online filter farm
✦ Physics assurance and analysis
  ★ are we seeing something unexpected (background or signal) that calls for a trigger adjustment now?
  ★ rapid analysis of 'express-line' physics without initially having to rely on a fully functional and perfectly operating Grid
✦ As the experiment matures, in 2010 and beyond, some CAF responsibilities can be farmed out to Tier-1s or Tier-2s, but not during the startup phase
SLIDE 13: Data Tiers and Data Volume for 2008
✦ RAW
  ➡ detector data + L1 and HLT results after online formatting
  ➡ includes factors for poor understanding of the detector, compression, etc.
  ➡ 1.5 MB/evt at ~150 Hz; ~4.5 PB/year (two copies; see the cross-check below)
✦ RECO
  ➡ reconstructed objects with their associated hits
  ➡ 250 kB/evt; ~2.1 PB/year (incl. 3 reprocessing versions)
  ➡ supports pattern recognition and recalibration; ROOT-browsable for interactive analysis
✦ FEVT = RAW + RECO
  ➡ ~1.75 MB/event, to keep RAW and RECO together for data handling
  ➡ one copy at the Tier-0 and one spread over all Tier-1's
✦ AOD
  ➡ the main analysis format; a fragment of RECO for analysis: objects + minimal hit info
  ➡ 50 kB/evt; ~2.6 PB/year; a whole copy at each Tier-1
  ➡ shipped out to all Tier-1s and on demand to Tier-2s and laptops
  ➡ should be inclusive so that all groups can use it
  ➡ may allow some re-calibration and re-alignment (refit)
✦ Plus MC in a ~1:1 ratio with data
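As a rough cross-check of the RAW figure above (assuming the canonical ~10^7 live seconds of LHC running per year, which is not stated on the slide): 150 Hz x 10^7 s ≈ 1.5x10^9 events/year; at 1.5 MB/event this is ≈ 2.25 PB, and with two copies ≈ 4.5 PB/year, consistent with the number quoted.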
SLIDE 14: Event Data Model (RECO contains AOD)
✦ The "Event" holds all data taken during a triggered physics event (see the access sketch after this slide)
✦ The provenance of each Event Data Product is tracked
✦ Persistent objects are designed such that they provide useful analysis information without needing external information
✦ Event tiers: FEVT contains RAW and RECO, which in turn includes AOD
[Diagram: example data products per detector domain. For tracking: Tracks, TracksExtra, TracksHits, TrackDigis, TrackRaw. For e/gamma: Electrons, Photons, SuperClusters, BasicClusters, Cells, EcalDigis, EcalRaw. For jets: KtJets, ConeJets, ..., CaloTowers, HcalDigis, HcalRaw. RAW includes the Digis.]
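A schematic sketch (not from the talk) of what "the Event holds all data products" looks like in user code, in the style of a CMSSW analyzer of that era; the module label "trackProducer" is a hypothetical placeholder, and header paths are indicative and may not match the exact CMSSW release of the time:

    // Read a provenance-tracked track collection from the Event and count
    // tracks above a pT threshold; RECO/AOD objects carry everything needed,
    // no pointers to external data are required.
    #include "FWCore/Framework/interface/Event.h"
    #include "DataFormats/Common/interface/Handle.h"
    #include "DataFormats/TrackReco/interface/Track.h"
    #include "DataFormats/TrackReco/interface/TrackFwd.h"

    unsigned int countHighPtTracks(const edm::Event& event) {
      edm::Handle<reco::TrackCollection> tracks;   // reco::TrackCollection = std::vector<reco::Track>
      event.getByLabel("trackProducer", tracks);   // product identified by the label of the module that made it
      unsigned int nHighPt = 0;
      for (reco::TrackCollection::const_iterator t = tracks->begin(); t != tracks->end(); ++t) {
        if (t->pt() > 20.0) ++nHighPt;
      }
      return nHighPt;
    }

The key design point, as the slide says, is that each product is retrieved by its provenance (the producer label), not by following object-database pointers.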
SLIDE 15: CMS Data in ROOT Format
✦ Goal: one format, one program for all (reconstruction, analysis)
  ★ store "simple" structures that are browsable by plain ROOT
  ★ and then: load the CMSSW classes and act on the data as in a "batch"/"reconstruction" job
  ➡ same jet-finding, muon-matching code, cluster corrections
  ➡ the issue is what data are available (RAW, RECO, AOD)
✦ Example ROOT session on a RECO file:

    gSystem->Load("libPhysicsToolsFWLite");
    AutoLibraryLoader::enable();
    TFile f("reco.root");
    Events.Draw("Tracks.phi() - TrackExtra.outerPhi() : Tracks.pt()", "Tracks.pt() < 10", "box");
SLIDE 16: CMS Has Re-done Its Event Data Model
✦ The persistent representation has only simple objects, no pointers etc.
  ★ e.g. going from RECO back to RAW requires Data Management interaction

SLIDE 17: Data Structured for Sequential Access: Primary Datasets
✦ Membership of an event in a Primary Dataset is determined immutably by its "trigger path", a real-time decision during data taking (a toy illustration follows below)
  ★ allows events to be classified into separate datasets
✦ The trigger path is the "name" of the sequence of requirements that resulted in an event being recorded:
  ★ Level-1 Trigger accept
  ★ High-Level Trigger confirmation
✦ All data analysis is based on these trigger paths!
✦ We foresee about 50 primary datasets to be identified during the HLT processing, each following a strict trigger path
  ★ how many will CMS need?
  ★ a serious Physics Coordination issue!
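A toy illustration (not CMS code) of the idea that the fired trigger paths immutably determine which primary dataset(s) an event belongs to; the path and dataset names are invented:

    #include <iostream>
    #include <map>
    #include <set>
    #include <string>
    #include <vector>

    int main() {
      // HLT configuration: each trigger path is routed to one primary dataset
      std::map<std::string, std::string> pathToDataset;
      pathToDataset["HLT_SingleMuon"] = "Muons";
      pathToDataset["HLT_DiElectron"] = "Electrons";
      pathToDataset["HLT_Jet100"]     = "Jets";

      // trigger paths that fired for one recorded event
      std::vector<std::string> firedPaths;
      firedPaths.push_back("HLT_SingleMuon");
      firedPaths.push_back("HLT_Jet100");

      // the event is written to every primary dataset whose path fired
      std::set<std::string> datasets;
      for (size_t i = 0; i < firedPaths.size(); ++i) {
        std::map<std::string, std::string>::const_iterator it = pathToDataset.find(firedPaths[i]);
        if (it != pathToDataset.end()) datasets.insert(it->second);
      }
      for (std::set<std::string>::const_iterator d = datasets.begin(); d != datasets.end(); ++d)
        std::cout << "event stored in primary dataset: " << *d << "\n";
      return 0;
    }

Because the assignment happens once, at data-taking time, an analysis can later select its events simply by picking the primary datasets whose trigger paths it cares about.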
SLIDE 18: Using the System
✦ Example: CMS Data Reduction ("Back near the bottom of a mountain...")
  ★ slide adapted from P. Elmer, CHEP06, Mumbai, India, 15 Feb 2006
SLIDE 19: Analysis Process
[Diagram: RECO/AOD datasets (official CMS format: tracks, muons, electrons, ...; complete datasets; at Tier-0/Tier-1) -> AOD (official CMS analysis format) -> preselection (selected events and selected products (drop, add, ...): AOD, Cand, user data; specific content for different channels; x N steps, at Tier-1/Tier-2) -> plot (final analysis format)]
✦ Analysis will proceed by Primary Datasets
✦ Skimming of AOD/RECO/FEVT produces datasets for local analysis (see the sketch below)
✦ Still many uncertainties about the optimal dataflow/workflow for analysis
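A minimal ROOT sketch of what one such reduction ("skimming") step could look like on a locally available file, assuming an "Events" tree as on slide 15; the file names and the preselection expression are invented for illustration, and the object-method syntax in the cut requires the CMSSW dictionaries to be loaded:

    #include "TFile.h"
    #include "TTree.h"

    void skimStep() {
      TFile input("aod.root");                           // hypothetical input dataset fragment
      TTree* events = (TTree*)input.Get("Events");
      TFile output("skim.root", "RECREATE");
      // keep only events passing a hypothetical preselection:
      // at least one track with pT > 20 GeV
      TTree* skim = events->CopyTree("Max$(Tracks.pt()) > 20");
      skim->Write();
      output.Close();
    }

Each such step shrinks the event sample and the per-event content, which is exactly the "x N steps at Tier-1/Tier-2" reduction chain sketched in the diagram above.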
SLIDE 20: Another Example
[Diagram: a signal dataset and background dataset(s) in RECO/AOD form at Tier-0/Tier-1 are preselected into AOD, Cand and user data at Tier-1/Tier-2 (order 500 GB), then preselected further at Tier-2 down to ~50 GB, small enough for a laptop (?)]
✦ Guessed numbers at start-up; 10 times more is already possible
SLIDE 21: How Will the User Interact? (ARDA scenario, ca. 2003)
★ Setup: authenticate to the system, which creates the analysis context and ensures the correct installation of software packages
★ Determine the datasets and eventually the event components
  ➡ input data are selected via a query to a metadata catalogue
★ Perform the iterative analysis activity
  ➡ pass the selection and algorithm, together with a specification of the execution environment, to a workload management system
  ➡ algorithms are executed on one or many nodes
  ➡ the user monitors the progress of the job execution
  ➡ results are gathered together and passed back to the job owner
  ➡ resulting datasets can be published to be accessible to other users
SLIDE 22: Distributed Analysis Scenarios (ARDA scenario, ca. 2003)
★ The system selects dataset(s) from the CMS Dataset Bookkeeping System using metadata conditions provided by the user
★ The system splits the task and sends sub-tasks to the locations of the datasets
  ➡ the Data Management system decides on data replication to avoid hot-spots
★ Sub-jobs are spawned and submitted to Workload Management with precise job descriptions
  ➡ the user can monitor the results while and after the data are processed
★ Available results from all terminated sub-jobs are collected and merged on request
★ Analysis objects associated with the analysis task remain persistent in the Grid environment, so the user can go offline and reload the analysis task at a later date, check its status, merge current results, or resubmit the same task with modified analysis code
SLIDE 23: Data Analysis on the Grids: CRAB
✦ CRAB, the "CMS Remote Analysis Builder"
  ★ CMS application service layer interfacing to the Grid middleware
  ★ CRAB encapsulates data discovery, job creation and submission
✦ Makes it easy to create large numbers of user analysis jobs
  ★ assumes all jobs are the same except for some parameters (event numbers to be accessed, output file name, ...)
✦ Allows distributed data to be accessed efficiently
  ★ hides the WLCG middleware complications; all interactions are transparent to the end user
✦ Manages job submission, tracking, monitoring and output harvesting
  ★ the user does not have to take care of how to interact with sometimes complicated Grid commands
✦ Uses BOSS as a Grid-independent submission engine
SLIDE 24: Main CRAB Functionalities
✦ Data discovery
  ★ data are distributed, so CRAB interacts with Data Management to find out where the data are
✦ Job creation
  ★ both a .sh (wrapper script for the real executable) and a .jdl (a script which steers the real job towards the "grid")
  ★ user parameters are passed via a configuration file
✦ Job submission
  ★ the scripts are ready to be sent to those sites which host the data
  ★ jobs are submitted to the Resource Broker, or to Condor-G (on OSG), via BOSS
✦ CRAB monitors, via BOSS, the status of the whole submission
  ★ the user has to ask for the job status
✦ When the jobs finish, CRAB retrieves all output
  ★ both standard output/error and the relevant files produced by the analysis code
  ★ either the job copies the output to an SE, or it is brought back to the UI
✦ A schematic usage example is sketched below
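To make the workflow above concrete, a schematic sketch of how a user of that era might have driven CRAB from the UI; the dataset path, parameter-set file, and the exact section/option names and command flags are illustrative assumptions (they varied between CRAB releases) and are not taken from this talk:

    # crab.cfg (illustrative sketch, option names approximate)
    [CRAB]
    jobtype   = cmssw
    scheduler = glite                 # or condor_g on OSG

    [CMSSW]
    datasetpath            = /SomePrimaryDataset/SomeProcessing/AOD   # hypothetical dataset
    pset                   = myAnalysis.cfg                           # hypothetical CMSSW parameter set
    total_number_of_events = -1
    events_per_job         = 10000
    output_file            = histos.root

    [USER]
    return_data = 1                   # bring output back to the UI (or copy it to an SE)

    # typical command sequence
    crab -create       # data discovery, job splitting, .sh/.jdl creation
    crab -submit       # submission via BOSS to the Grid
    crab -status       # ask for the status of the jobs
    crab -getoutput    # retrieve stdout/stderr and the produced files

The point of the design is that the user only edits the configuration file and runs these few commands; the splitting into identical sub-jobs and all middleware interaction stay hidden.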
SLIDE 25: Most Accessed Sites Since July 2005
✦ The system is already providing analysis capabilities to CMS
  ★ many tens of thousands of pTDR analysis jobs have run successfully
  ★ Tier-1 centers are gaining operational experience (not yet so much the Tier-2s)
  ➡ a subset of really reliable and responsive sites was used in pTDR production
  ★ a good fraction of Tier-2s are acting fast to become analysis sites for CMS
✦ Physics users are using the Grid!
SLIDE 26: Distributed Analysis Scenarios
★ Communities of scientists using the Grid for distributed analysis
★ Infrastructure for sharing, and for consistency of physics and calibration data and of software
SLIDE 27: Future Distributed Analysis Scenarios
★ Persistency of the (virtual) analysis workspace, which a user can later re-connect to, re-submit the analysis with modified parameters or code, check the status, merge results between analyses, and share datasets with other users and analysis workspaces, while the system keeps provenance information about jobs and datasets
★ Users are accessing shared computing resources. To do so, they must register with their VO, authenticate and assume roles according to their needs, and their actions must be authorized within the VO
★ The user specifies the required execution environment (software packages, databases, system requirements, etc.) and the system ensures it on the execution node. In particular, the necessary environment can be installed according to the needs of a particular job
★ The execution of the user job may trigger transfers of various datasets and database-type information between a user interface computer, execution nodes and storage elements (including Tier-2s, Tier-3s, even user laptops). These transfers are transparent to the user
★ etc.
SLIDE 28: Distributed Analysis Still Requires a Shift of Focus
✦ Focusing on data analysis requires a significant paradigm shift
  ★ from well-defined production jobs -> to interactive user analysis
  ★ from DAGs of processes -> to "sessions" and stateful environments
  ★ from producing "sets of files" -> to accessing massive amounts of data
  ★ from files -> to datasets and collections of objects
  ★ from using essentially "raw data" -> to complex layers of event data tiers
  ★ from "assignments" from a central repository -> to Grid-wide queries
  ★ from "user registration" -> to enabling sharing and building communities
✦ Are the (Grid) technologies ready for this?
  ★ currently, Grid middleware functionality usage is basic, and the WLCG focus is on robustness and performance, not functionality
✦ What will the "new paradigms" be that are exposed to the user?
  ★ a local user "data analysis session" transparently extended to the Grid?
  ➡ but this requires a more prescriptive and declarative approach to analysis
  ★ a set of services for "collaborative" work?
  ➡ new paradigms beyond "analysis"