CDF Data production model
S. Hou
for the CDF data production team
02 May 2006

Outline: Data streams - trigger, streaming, …
The Collider Detector at Fermilab (CDF) experiment at the Tevatron collider
DAQ rates (2005 achieved → 2006 upgrade):
  Tevatron luminosity  : 1.8x10^32 cm^-2 s^-1 → 3x10^32 cm^-2 s^-1
  Level-1 acceptance   : 27 kHz       → 40 kHz
  Level-2 acceptance   : 850 Hz       → 1 kHz
  Event Builder (EVB)  : 850x0.2 MB/s → 500 MB/s
  Level-3 acceptance   : 110 Hz       → 150 Hz
  To-tape storage rate : 20 MB/s      → 40 MB/s
Event size: ~140 kByte. '06 data-taking rate ~5 M events/day. Upgrade to improve DAQ efficiency.
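As a sanity check, the quoted numbers are roughly self-consistent; a minimal back-of-the-envelope sketch (event size and rates from the table above, the ~40% average duty factor is an assumption, not a number from the slides):

```python
# Back-of-the-envelope check of the 2006 upgrade numbers quoted above.
EVENT_SIZE_KB = 140          # ~140 kByte per event
L3_ACCEPT_HZ = 150           # Level-3 acceptance (2006 upgrade)
TAPE_RATE_MBS = 40           # to-tape storage rate (2006 upgrade)

# Sustained to-tape bandwidth implied by the Level-3 output: 21 MB/s,
# comfortably inside the 40 MB/s to-tape budget.
implied_mbs = L3_ACCEPT_HZ * EVENT_SIZE_KB / 1000.0
assert implied_mbs <= TAPE_RATE_MBS

# Events per day at an assumed ~40% average duty factor: ~5.2 M,
# consistent with the quoted ~5 M events/day.
events_per_day = L3_ACCEPT_HZ * 0.4 * 86400
```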
Data in 52 triggers, written to 8 consumer streams: A, B, C, D, E, G, H, J.
An event may have multiple triggers; stream overlap is ~5% and increases with Tevatron luminosity.
Data logging rate up to Nov 2005: 1.3 fb^-1 of data written to tape; the logging rate increases with Tevatron luminosity.
Good-run physics data:
  Feb 2002 - Dec 2004: 1040 M events = 210 k files = 188 TByte
  Dec 2004 - Feb 2006: 1270 M events = 172 k files = 159 TByte
1.6 fb^-1 delivered by the Tevatron, 1.3 fb^-1 on tape!
Data flow: CDF DAQ → production farm → Enstore tape storage. Raw datasets and production datasets are served through dCache to the CDF Analysis Farm, remote CAFs, and user desktops.
Storage layout: dCache file-servers on a 10 Gbit link serve the analysis farm and remote sites (2 Gbit); the Enstore tape library with its file-servers, and the Oracle DB, sit on 2 Gbit links.
Production chain: sub-detectors → Level-1,2 trigger/DAQ → Level-3 farm → 8 raw datasets; the run splitter, steered by the file catalog, the database, and calibration, splits these into 52 physics datasets.
Split data in production: 8 raw data streams → 52 physics datasets.
Final storage: Enstore tape library, STK 9940B drives, 200 GB/tape, 30 MByte/s read/write; steady R/W rate ~1 TByte/drive/day.
Data are served to the CAF and fileservers via dCache.
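The run-splitter step (8 raw streams fanned out into 52 physics datasets by trigger) can be sketched as follows; the trigger names and the trigger-to-dataset table here are made-up illustrations, not the real CDF mapping:

```python
from collections import defaultdict

# Hypothetical trigger -> physics-dataset table (illustrative only;
# the real table maps CDF triggers to the 52 physics datasets).
TRIGGER_TO_DATASET = {
    "JET_20": "qcd_jets",
    "HIGH_PT_E": "high_pt_electron",
    "MUON_CMU": "high_pt_muon",
}

def split_events(events):
    """Fan events from a raw stream out into physics datasets.

    An event can fire several triggers, so it may be written to more
    than one dataset -- this is the ~5% stream overlap noted earlier.
    """
    datasets = defaultdict(list)
    for event in events:
        for trig in event["triggers"]:
            if trig in TRIGGER_TO_DATASET:
                datasets[TRIGGER_TO_DATASET[trig]].append(event["id"])
    return dict(datasets)

events = [
    {"id": 1, "triggers": ["JET_20"]},
    {"id": 2, "triggers": ["HIGH_PT_E", "MUON_CMU"]},  # overlap event
]
out = split_events(events)
```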
First production model (dfarm): fileservers buffer input and output files; a MySQL DB and the run-splitter steer the flow over the network. Stager, worker, and concatenator stages (steps 1-6 in the original diagram): register input files, stage them to the workers, reconstruct with calibration, concatenate the output, and register the concatenated files.
SAM production model (steps 1-5 in the original diagram): the run-splitter and calibration are steered through SAM/DB over the fileserver network; workers fetch input by URL from dCache. Each job processes an assigned SAM dataset for bookkeeping. Output files are merged, sorted in run sequence; metadata is declared and file parentage updated for bookkeeping.
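The bookkeeping just described (declare metadata per file, record parentage after merging) can be mimicked with a toy file catalog; the field names and function signatures here are assumptions for illustration, not SAM's actual schema or API:

```python
# Toy file catalog mimicking the SAM bookkeeping described above.
# Field names ("dataset", "parents") are illustrative, not SAM's schema.
catalog = {}

def declare_file(name, dataset, parents=()):
    """Declare a file's metadata, recording its parent files."""
    catalog[name] = {"dataset": dataset, "parents": list(parents)}

def children_of(parent):
    """Parent -> daughter lookup used to verify production completeness."""
    return [f for f, meta in catalog.items() if parent in meta["parents"]]

# A raw input file, then its reconstructed output declared with parentage:
declare_file("raw_run1001.dat", "gphysr_raw")
declare_file("reco_run1001.dat", "reco.gphysr", parents=["raw_run1001.dat"])
```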
Resource monitoring, with automatic submission and monitoring, covers: network and Enstore tape I/O; dCache and SAM data handling and DB service; CDF online, the calibration DB, and software.
Processing passes on raw data:
  - Histograms and concatenation, immediately after data are available on Enstore.
  - Production to Enstore with histograms, immediately after the beam-line is available, for quick detector feedback and good-run definition.
  - Histograms with multiple outputs concatenated to Enstore, for the statistics required for chosen events.
Production executable (ProExe) and merge jobs query the next file, declare raw, merged, and reco-children metadata, and store output via samStore. Input datasets follow the online-DB good-run list: gphysr_… (e.g. gphysr_runXX), with stream datasets gX… (e.g. gXjs00, gXcrs0); output physics datasets are named reco.gphysr_… Control metadata links input datasets to physics datasets.
Jobs on the condor CAF fetch the files of the input dataset, reading and writing dCache; reconstructed and merged output is staged under /pnfs, /dCache, and /samcache.
Worker job flow: the CAF headnode distributes the job; each worker unpacks the tarball into its scratch area, reads input from dCache via the SAM DB, applies constants from the calibration DB, and sends output to concatenation.
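The worker lifecycle (unpack tarball into scratch, process each input, ship output onward to concatenation) can be sketched locally; this is a stand-in, with plain file copies replacing the real SAM/dCache transfers and an uppercasing step standing in for reconstruction:

```python
import os
import shutil
import tarfile
import tempfile

def run_worker_job(tarball, input_files, output_dir):
    """Sketch of one worker's lifecycle, per the flow above:
    unpack the execution tarball into a scratch area, process each
    input file, and hand the outputs to the concatenation step.
    (Local stand-in: real jobs fetch inputs via SAM from dCache.)"""
    scratch = tempfile.mkdtemp(prefix="scratch_")
    with tarfile.open(tarball) as tar:
        tar.extractall(scratch)                 # unpack tarball on the worker
    outputs = []
    for f in input_files:
        out = os.path.join(scratch, os.path.basename(f) + ".reco")
        with open(f) as src, open(out, "w") as dst:
            dst.write(src.read().upper())       # stand-in for reconstruction
        outputs.append(shutil.copy(out, output_dir))  # ship to fileserver
    shutil.rmtree(scratch)                      # clean the scratch area
    return outputs
```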
Production: 8 streams → 52 datasets → Concatenation → SAMstore.
Required services: CDF DB, SAM DB, data handling, the CAF condor batch system, and fileserver storage. Jobs are prohibited when a required service is missing.
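The guard that holds back jobs when a required service is missing amounts to a pre-submission check; the service names and the probe shape here are illustrative stand-ins, not the actual CDF monitoring interface:

```python
# Pre-submission guard: prohibit jobs when a required service is missing.
# Service names and the status-dict shape are illustrative stand-ins.
REQUIRED_SERVICES = ["CDF_DB", "SAM_DB", "data_handling", "condor", "fileserver"]

def submit_if_ready(status, submit):
    """Submit only when every required service reports up; otherwise
    return the list of missing services and hold the job."""
    missing = [s for s in REQUIRED_SERVICES if not status.get(s, False)]
    if missing:
        return missing          # job prohibited; nothing submitted
    submit()
    return []

submitted = []
status = {s: True for s in REQUIRED_SERVICES}
assert submit_if_ready(status, lambda: submitted.append("job")) == []
```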
Worker CPUs are monitored with Ganglia; traffic to the fileservers with xfs.
Input: Enstore loading to dCache. Output: multiple workers to fileservers.
1 Gbit network port to IDE: 40 MB/s; one output dataset to Enstore: 30 MB/s.
The tarball (archived execution binaries) is distributed to the worker CPUs; input files are copied via SAM from dCache; at end of job, output files are copied to the assigned fileserver.
Monitoring shows the commands currently executing on the CPUs of each section of a CAF job.
Monitoring pages show the list of projects and the cumulative file consumption.
Per-dataset pages show the status of file consumption and the list of files and their parentage in a dataset.
Each file created in production has metadata; the parent-daughter relation is updated after concatenation and SAMstore. The metadata is used for bookkeeping: SAM queries on metadata are tabulated and counted.
Recovery jobs are submitted automatically for datasets with incomplete daughters, to complete the production tasks of an input dataset.
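Detecting which datasets need recovery amounts to a completeness scan over the file parentage; a minimal sketch, with a made-up catalog shape (daughter → parents) rather than SAM's real query interface:

```python
def incomplete_parents(raw_files, parentage):
    """Return raw files that still lack a reconstructed daughter.
    `parentage` maps daughter file -> list of parent files (toy shape)."""
    have_daughter = {p for parents in parentage.values() for p in parents}
    return [f for f in raw_files if f not in have_daughter]

# raw_2.dat has no daughter, so a recovery job is needed for it.
raw = ["raw_1.dat", "raw_2.dat", "raw_3.dat"]
parentage = {"reco_1.dat": ["raw_1.dat"], "reco_3.dat": ["raw_3.dat"]}
to_recover = incomplete_parents(raw, parentage)
```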
SAM farm peak performance: jobs distributed to two CAFs (Analysis and Production farm) using 540 CPUs for 6 physics streams, with 8 dCache input fileservers and 6 output fileservers; stable processing speed of 25 M events/day (~5 times the '05 CDF DAQ rate), i.e. 3 TB input and 4 TB output per day (output has 20% overlap across the 52 datasets; H and J streams are 15% compressed). Plots: integrated output event logging and daily file consumption.
CAF condor is very reliable; worker hardware failures and degraded RAIDs are occasional. Service is 24x7: Oracle and Enstore service, SAM and dCache shift support. Production ran 6 streams in parallel, with output to 6 fileservers; CPU usage became rougher at the end as streams finished up.
Plots: CAF+Farm max = 540 jobs, Farm CPU; traffic to/from the production farm (green: in bits/s, blue: out bits/s, dark: peak in, pink: peak out).
Concatenation I/O (reco → merged): dCache on 10 Gbit, farm on 2 Gbit, 1 Gbit per server.
Server port and IDE speed: 1 Gbit peak ~50 MB/s, IDE peak ~40 MB/s, matching ~100 CPUs max.
Enstore single-dataset write: single mover, 30 MB/s instantaneous, ~1 TB/day.
Dual-P3 server (2 TB): network average 50 MB/s in+out; concatenation CPU 1 GB/3 min/CPU, ~1 TB/day.
Dual-Xeon server (8 TB): network average 100 MB/s in+out; concatenation CPU 1 GB/1.5 min/CPU, ~2 TB/day.
The port ratio is unmatched.
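The per-server daily throughput figures follow from simple arithmetic (taking "dual" to mean 2 CPUs per server, an assumption consistent with the quoted totals):

```python
SECONDS_PER_DAY = 86400

def daily_tb(gb_per_file, minutes_per_file, n_cpu):
    """TB concatenated per day by n_cpu cores at the quoted per-file speed."""
    files_per_day = SECONDS_PER_DAY / (minutes_per_file * 60)
    return gb_per_file * files_per_day * n_cpu / 1000.0

p3_tb_day = daily_tb(1, 3.0, 2)    # dual P3,  1 GB/3 min/CPU:   ~1 TB/day
xeon_tb_day = daily_tb(1, 1.5, 2)  # dual Xeon, 1 GB/1.5 min/CPU: ~2 TB/day
```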
The farm switch (2 Gbit capacity) carries an average load of 800 Mbit/s, limited by the fileserver Gbit links (40 MB/s each). Scaling up means more CPUs in a CAF, more streams in parallel production, and more fileservers (more Gbit links); the eventual limit is the tape drives.
Outlook: bookkeeping stays with SAM; the production scripts are portable and can be replicated; the binary tarball is self-contained and grid compatible. Data handling moves to SAMGrid (modify the dCache copy); the CAF moves to OSG-CAF (modify batch submission).