Raw-Data Reconstruction with PROOF
- C. Cheshkov, P. Hristov
24/10/2008, ALICE Offline Week
Many thanks to: Andrei, Federico, Fons, Jan, Gerri, Latchezar, Rene for the discussions and help and Marco &
24/10/2008 Raw-Data Reco with PROOF (C.Cheshkov, P.Hristov) 2
Run AliRoot raw-data reco in parallel mode using PROOF
− Fast reconstruction feedback
− Tuning of reco-parameters and code
− Fast test before going to full-blast AliEn production
It may sound a bit abstract, but in fact that was one of the
Use case:
− #raw-data files << #slaves
− Higher (compared to ESD) event size and processing time
Overview
AliReconstruction implemented as TSelector
− Input List
− Code executed on slaves (also I/O)
− Output files
Performance on CAF
Documentation
Outlook
July 08, Federico's office: Let's make a TSelector
[Side-by-side TkDiff view of the sequential and the modified AliReconstruction flow]
Sequential flow:
− Init QA mgr
− InitRun: Create RawReader, Init run-loaders, Init CDB, Set run# from RAW, Init Geometry, Init GRP, Init Reco-params, Init vertexers and trackers, Open ESD files
− Reconstruct Event: ..., In-loop QA, Intermediate ESD files
− Finish Run: Close files etc., Out-of-loop QA, ESD tags, Finish QA
Modified flow:
− InitRun: Create RawReader, Init CDB, Set run# from RAW, Init Geometry, Init GRP, Load all needed OCDB entries, Unload used OCDB entries, Init Reco-params
− SlaveBegin: Read selector input list, Init run-loaders, Init vertexers and trackers, Open ESD files, Init QA mgr
− Process: Recreate RawReader, ..., In-loop QA
− SlaveTerminate: Close files etc., Out-of-loop QA, Finish QA
− Terminate: ESD tags
TkDiff 4.1.4 report: number of diffs: 104 (17 regions were deleted, 20 regions were added, 67 regions were changed)
[Diagram: AliReconstruction driven through gProof]
Input: collection of raw-data files (raw data: xrootd)
− InitRun: Create raw-reader, Get run #, Load OCDB
− Input List: TGeo geometry, OCDB entries, Mag field map
− Run: Raw-data-chain->Process()
− SlaveBegin: Restore AliReconstruction, Init run-loaders, Init vertexers/trackers, Open ESD files, Init QA mgr
− Process: Recreate raw-reader from tree entry, Standard event reco, Event QA
− SlaveTerminate: Close files etc., Finish QA
− Terminate: Create tags
Output: Merging ESD files -> ESD
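The split between client-side and slave-side steps can be mimicked with a toy model. The sketch below is plain Python, not ROOT or AliRoot code: the class ToySelector, the function names, the run number 1234 and the fake "reconstruction" (counting hits) are all hypothetical stand-ins. It only illustrates the order of the calls: the client fills the input list once, every slave restores its state from it, processes its share of events, and the client merges the per-slave output at the end.

```python
from dataclasses import dataclass, field


@dataclass
class ToySelector:
    """Hypothetical stand-in for a selector; NOT the ROOT TSelector API."""
    input_list: dict                      # parameter-like objects common to all slaves
    esd: list = field(default_factory=list)
    opened: bool = False

    def slave_begin(self):
        # On each slave: restore state from the input list, "open" the output.
        self.run = self.input_list["run"]
        self.opened = True

    def process(self, raw_event):
        # On each slave: called once per event; fake single-event reconstruction.
        self.esd.append({"run": self.run,
                         "event": raw_event["id"],
                         "ntracks": len(raw_event["hits"]) // 2})

    def slave_terminate(self):
        # On each slave: "close" the output and hand it back.
        self.opened = False
        return self.esd


def init_run(raw_events):
    # On the client: single access to conditions data; result goes to the input list.
    return {"run": raw_events[0]["run"], "geometry": "toy-geometry"}


def run(raw_events, n_slaves):
    input_list = init_run(raw_events)
    partial_outputs = []
    for s in range(n_slaves):
        selector = ToySelector(input_list=input_list)
        selector.slave_begin()
        # Crude round-robin split; real PROOF hands out packets dynamically.
        for ev in raw_events[s::n_slaves]:
            selector.process(ev)
        partial_outputs.append(selector.slave_terminate())
    # "Terminate" on the client: merge the per-slave output.
    merged = [esd for part in partial_outputs for esd in part]
    return sorted(merged, key=lambda e: e["event"])


# Made-up input: 10 events of a hypothetical run 1234.
events = [{"run": 1234, "id": i, "hits": list(range(2 * i))} for i in range(10)]
merged = run(events, n_slaves=3)
print(len(merged), merged[3])
```

In this toy the merged result does not depend on the number of slaves, which is the property one wants from the real system as well.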
Completely transparent for the user:
− Prepare collection of raw-data files (one run)
− Open PROOF session
− Enable AliRoot (reco libs) on master and slaves
− Run your favourite (or standard) rec.C by providing the files collection as input (“collection://xxx”)
− One can use any specific OCDB storage, custom reco options, reco-params etc.
− If PROOF session is not opened -> runs locally and allows quick check/debugging of the AliRoot code
Same code base used as if running sequentially
Note: One has to patch v5-21-01-alice with the fixes in PROOF
Contains all 'parameter-like' objects common to slaves
Single access to OCDB from client machine
Allows customizing the reconstruction (as if running locally)
Size:
− Dominated by OCDB entries
− Depends on the active detectors
− From a few MBs to ~50-60 MBs at most
It took quite some time to debug the code as input-list was
− Restore AliReconstruction state from client session
− Restore OCDB manager state <- OCDB entries
− Set TGeo geometry and field map
− AliRoot run-loaders (for managing intermediate reco files)
− Open ESD/ESD-friend files & initialize QA mgr
− Get AliRawEvent entry from raw-data tree
− Recreate raw-reader out of it
− Run standard single-event reco
− Close all files & finalize QA
AliRoot reco output files:
− ESD
− ESD-friend (switchable)
− QA files (switchable)
− Intermediate (RecPoints, Digits) files (was not
− Log (switchable)
If we get rid of intermediate files, the I/O would
Idea: TFile -> TDirectory (no I/O, all event objects in
Was implemented and tested (details on performance
− Controlled via the url of galice.root file
− Some methods not implemented at TDirectory level
− In AliReconstruction - disable unloading/loading and writing of rec-
− At the moment code is unstable (problems with ROOT garbage
Will be committed as soon as we get more confidence
Using output-file merging functionality in
By default
− Only ESD files are merged
− Resulting file arrives locally
Necessitates running an xrootd daemon on the client machine
Check the way to run and configure xrootd on CAF-reco
How to merge ESD-friend files?
− Opened transparently inside ROOT
− How to create PROOF-output file?
Possibility to specify another output ESD file location (url)
− Make resulting ESDs available in CAF as data-set (optionally)
Possibility to retrieve other output files (expert mode)
AliReconstruction::SetOutput(url) ?
− If location is a file url: store ESD (and ESD-friends) there
− If location is a folder: store all output files (one sub-folder per slave)
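The file-versus-folder rule proposed for SetOutput(url) can be written down as a small lookup. This is only a sketch in plain Python of how such a rule could behave, not AliRoot code: the helper resolve_outputs, the test "ends with .root means file", the host name and the slave and log-file names are all assumptions made for illustration.

```python
def resolve_outputs(url, slaves, files=("AliESDs.root",)):
    """Map an output location to destinations (hypothetical helper)."""
    if url.endswith(".root"):
        # Assumption: a location ending in .root is a file url,
        # so only the merged ESD is stored there.
        return {"merged": url}
    # Otherwise treat the location as a folder: one sub-folder per slave,
    # each holding all output files of that slave.
    base = url.rstrip("/")
    return {slave: [f"{base}/{slave}/{name}" for name in files]
            for slave in slaves}


# Placeholder host, slave names and log-file name, for illustration only.
as_file = resolve_outputs("root://example.host//out/AliESDs.root", ["slave-0"])
as_folder = resolve_outputs("root://example.host//out/",
                            ["slave-0", "slave-1"],
                            files=("AliESDs.root", "reco.log"))
print(as_file)
print(as_folder["slave-1"])
```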
Linear dependence
Slope depends mainly on the
Gerri is implementing the
− Data is uploaded on slaves storage similarly to PAR files
− Transparent to the user
− Updated only when input data changes
− There should be no dependence on #slaves -> init time will be diminished to a few s
[Plot legend: No QA; No QA, ESD-friend; No QA, ESD-friend, run-loaders; No QA, ESD-friend, run-loaders, log]
Packets much shorter than
− Leads to an overhead in the input data rate
− More slaves -> packets become shorter -> effect is more pronounced
One has to play with both
Processing rate:
− No effect for the runs
− Performance of ITS-
Overall slow-down is
Input data rate:
− Now effect is
− I guess ROOT resets
Some crashes with
Not tested to full extent due to some technical
Recent tests show
− Mediocre results
− System is unstable...
http://aliceinfo.cern.ch/Offline/Activities/Reconst
Will try to update with each important change or
Reminder: users can also use analysis-task-
Optimize performance
− Find optimal parameters (packet-size, tree cache depending
− Check the influence of reco code memory consumption
− Somehow put obtained optimizations in the PROOF code
Input data: under development, OK
Handling of output files: straightforward
Scalability of the system in case of more files
− Master is quite unstable - serialization?
The machinery works, so try it out and send us
[Plot legend: 4 files x 4 slaves (eff ~24 slaves); 4 files x 2 slaves (eff ~22 slaves); 4 files x 1 slave (only local slaves); 4 files x all slaves (via xrootd from castor)]