Raw-Data Reconstruction with PROOF


C. Cheshkov, P. Hristov, 24/10/2008, ALICE Offline Week


SLIDE 1

Raw-Data Reconstruction with PROOF

  • C. Cheshkov, P. Hristov

24/10/2008, ALICE Offline Week

SLIDE 2

Many thanks to:

Andrei, Federico, Fons, Jan, Gerri, Latchezar, Rene for the discussions and help, and Marco & Jan Fiete for their great support on CAF!

24/10/2008 Raw-Data Reco with PROOF (C.Cheshkov, P.Hristov) 2

SLIDE 3

What is it all about?

Run AliRoot raw-data reco in parallel mode using PROOF

− Fast reconstruction feedback
− Tuning of reco-parameters and code
− Fast test before going to full-blast AliEn production

It may sound a bit abstract, but in fact that was one of the options we needed urgently during the LHC start-up

Use case:

− Number of raw-data files << #slaves
− Higher (compared to ESD) event size and processing time

SLIDE 4

Contents

  • Overview
  • AliReconstruction implemented as TSelector

− Input List
− Code executed on slaves (also I/O)
− Output files

  • Performance on CAF
  • Documentation
  • Outlook

SLIDE 5

July 08, Federico's office: Let's make a TSelector

SLIDE 6

It turned out to be a bit more difficult...

[Diagram: original vs. modified AliReconstruction flow, shown as a TkDiff 4.1.4 report]

Original flow:

  • Init Run: Create RawReader, Init CDB, Set run# from RAW, Init Geometry, Init GRP, Init Reco-params, Init run-loaders, Init vertexers and trackers, Open ESD files, Init QA mgr
  • Reconstruct Event: ..., In-loop QA, Intermediate ESD files
  • Finish Run: Close files etc., Out-of-loop QA, ESD tags, Finish QA

Modified flow:

  • InitRun: Create RawReader, Init CDB, Set run# from RAW, Init Geometry, Init GRP, Load all needed OCDB entries, Unload used OCDB entries, Init Reco-params
  • SlaveBegin: Read selector input list, Init run-loaders, Init vertexers and trackers, Open ESD files, Init QA mgr
  • Process: Recreate RawReader, ..., In-loop QA
  • SlaveTerminate: Close files etc., Finish QA
  • Terminate: ESD tags

TkDiff report: number of diffs: 104 (17 regions were deleted, 20 regions were added, 67 regions were changed)

SLIDE 7

AliReconstruction :: public TSelector

[Diagram: work split between CLIENT, MASTER and SLAVES]

  • CLIENT

− InitRun: Create raw-reader, Get run #, Load OCDB
− Input List (passed via gProof): AliReconstruction, TGeo geometry, OCDB entries, Mag field map
− Run: Raw-data-chain->Process()
− Terminate: Create tags

  • MASTER

− Merging ESD files

  • SLAVES

− Input: collection of raw-data files (raw data read via xrootd)
− SlaveBegin: Restore AliReconstruction, Init run-loaders, Init vertexers/trackers, Open ESD files, Init QA mgr
− Process: Recreate raw-reader from tree entry, Standard event reco, Event QA
− SlaveTerminate: Close files etc., Finish QA

SLIDE 8

AliReconstruction :: public TSelector

Completely transparent for the user:

− Prepare collection of raw-data files (one run)
− Open PROOF session
− Enable AliRoot (reco libs) on master and slaves
− Run your favourite (or standard) rec.C by providing files collection as input (“collection://xxx”)
− One can use any specific OCDB storage, custom reco options, reco-params etc.
− If PROOF session is not opened -> runs locally and allows quick check/debugging of the AliRoot code

Same code base used as if running sequentially

Note: One has to patch v5-21-01-alice with the fixes in PROOF output files merging
SLIDE 9

Input List

Contains all 'parameter-like' objects common to slaves

Single access to OCDB from client machine

Allows to customize reconstruction (as if running locally)

Size:

− Dominated by OCDB entries
− Depends on the active detectors
− From a few MBs to ~50-60 MBs at most

It took quite some time to debug the code, as the input list was copied on the master (the default constructor of TGeoManager deletes the previous instance)
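
The pitfall can be illustrated with a toy sketch. This is not ROOT code: ToyGeoManager, copy_with_default_ctor and copy_preserving_global are hypothetical names, and the only behaviour mimicked is "constructing a new manager takes over the global one". The save/restore guard is one possible workaround, not necessarily the fix applied in AliRoot.

```python
class ToyGeoManager:
    """Toy stand-in for a geometry manager with one global 'current' instance."""

    current = None

    def __init__(self, volumes=()):
        # Assumption for illustration: any construction, even a default
        # (empty) one, replaces the global instance.
        self.volumes = list(volumes)
        ToyGeoManager.current = self


def copy_with_default_ctor(items):
    # Naive list copy: default-construct one placeholder per item.
    # Side effect: the global geometry now points to an empty manager.
    return [type(item)() for item in items]


def copy_preserving_global(items):
    # Same copy, but the global pointer is saved and restored around it.
    saved = ToyGeoManager.current
    try:
        return copy_with_default_ctor(items)
    finally:
        ToyGeoManager.current = saved
```

After the naive copy, code that asks for the current geometry gets the empty placeholder, which is hard to spot because nothing fails at copy time.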

SLIDE 10

On Slaves

  • First read input list:

Restore AliReconstruction state from client session

Restore OCDB manager state <- OCDB entries

Set TGeo geometry and field map

  • Initialize:

AliRoot run-loaders (for managing intermediate reco files)

Open ESD/ESD-friend files & initialize QA mgr

  • Process:

Get AliRawEvent entry from raw-data tree

Recreate raw-reader out of it

Run standard single-event reco

  • Finally:

Close all files & finalize QA
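
The sequence of steps on a slave can be sketched as a toy selector. This is a plain-Python mock, not the AliReconstruction code: MockRecoSelector, run_on_slave and the dictionary-based "input list" are hypothetical, and the single-event reco is reduced to building a tuple.

```python
class MockRecoSelector:
    """Toy selector following the read-input / initialize / process / finalize steps."""

    def __init__(self):
        self.state = None
        self.esd = []
        self.closed = False

    def slave_begin(self, input_list):
        # Restore the state prepared once in the client session.
        self.state = dict(input_list)

    def process(self, raw_event):
        # Recreate a per-event reader, then run the (mock) single-event reco.
        reader = {"event": raw_event, "run": self.state["run"]}
        self.esd.append(("esd", reader["run"], reader["event"]))
        return True

    def slave_terminate(self):
        # Close outputs and hand back what this slave produced.
        self.closed = True
        return self.esd


def run_on_slave(input_list, entries):
    selector = MockRecoSelector()
    selector.slave_begin(input_list)
    for entry in entries:
        selector.process(entry)
    return selector.slave_terminate()
```

The point of the layout is that everything event-independent happens once per slave, while the per-event step only needs the tree entry.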

SLIDE 11

On Slaves (Local I/O)

AliRoot reco output files:

− ESD
− ESD-friend (switchable)
− QA files (switchable)
− Intermediate (RecPoints, Digits) files (was not switchable)
− Log (switchable)

If we get rid of intermediate files, the I/O would be minimal

SLIDE 12

File-less Run-Loaders

Idea: TFile -> TDirectory (no I/O, all event objects in memory)

Was implemented and tested (details on performance slides)

− Controlled via the url of galice.root file
− Some methods not implemented at TDirectory level
− In AliReconstruction - disable unloading/loading and writing of rec-points, digits data
− At the moment code is unstable (problems with ROOT garbage collection)

Will be committed as soon as we get more confidence

SLIDE 13

Output Files

Using output-file merging functionality in PROOF (based on TFileMerger)

By default:

− Only ESD files are merged
− Resulting file arrives locally

Necessitates running xrootd daemon on client machine

Check the way to run and configure xrootd on the CAF-reco web page (shown at the end of the talk)

SLIDE 14

Output Files

How to merge ESD-friend files?

− Opened transparently inside ROOT
− How to create PROOF-output file?

Possibility to specify another output ESD file location (url)

− Make resulting ESDs available in CAF as data-set (optionally)

Possibility to retrieve other output files (expert mode)

AliReconstruction::SetOutput(url) ?

− If location is file url – store ESD (and ESD-friends) there
− If location is folder – store all output files (one sub-folder per slave)
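
The proposed file/folder rule could look roughly like the sketch below. Everything here is an assumption for illustration: plan_output is a hypothetical helper, the ".root" suffix test is just one way to tell a file url from a folder, and the slave and file names are supplied by the caller.

```python
def plan_output(url, slaves, files):
    """Map an output location to the paths that would be written.

    File url   -> a single merged output at that url.
    Folder url -> every output file, one sub-folder per slave.
    """
    if url.endswith(".root"):
        # Assumption: a url ending in .root denotes a file.
        return {"merged": [url]}
    base = url.rstrip("/")
    return {slave: [f"{base}/{slave}/{name}" for name in files] for slave in slaves}
```

A call with a folder url and two slaves returns two entries, each listing all requested files under that slave's sub-folder.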

SLIDE 15

Performance on CAF...

SLIDE 16

Performance – init time

Linear dependence on #slaves

Slope depends mainly on the size of OCDB objects sent to slaves

Gerri is implementing the concept of 'input data'

− Data is uploaded on slaves storage similarly to PAR files
− Transparent to the user
− Updated only when input data changes

There should be no dependence on #slaves -> init time will be diminished to a few s
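
A minimal model of the linear behaviour, assuming the input list is shipped to the slaves one after another at a fixed rate. The function name, the transfer rate and the fixed overhead are hypothetical parameters, not measured values from CAF.

```python
def init_time_s(n_slaves, input_list_mb, mb_per_s, fixed_s=0.0):
    """Init time grows linearly with #slaves; the slope is size / rate."""
    return fixed_s + n_slaves * input_list_mb / mb_per_s
```

For example, a 50 MB input list sent to 10 slaves at an assumed 100 MB/s gives 5 s in this model. With cached 'input data' the n_slaves term would drop out, leaving only the fixed part.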

SLIDE 17

Performance – processing rate

SLIDE 18

Performance – local I/O

Processing rate in ev/s (4 files, 2.2 GB, 1200 evts)

[Bar chart: processing rate for each configuration]

− No QA
− No QA, ESD-friend
− No QA, ESD-friend, run-loaders
− No QA, ESD-friend, run-loaders, log
SLIDE 19

Performance – cache size

Packets much shorter than tree cache (and xrootd read-ahead?) size

− Leads to an overhead in input data rate
− More slaves -> packets become shorter -> effect is more pronounced

One has to play with both sizes
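
A crude way to see the effect, assuming each packet triggers a full cache fill of which only the packet's own events are used. This is a simplified model with hypothetical names and parameters; real tree-cache and read-ahead behaviour is more involved.

```python
def read_overhead(events_per_packet, event_size_mb, cache_mb):
    """Ratio of data fetched to data actually used for one packet."""
    useful_mb = events_per_packet * event_size_mb
    # Assumption: at least one full cache worth of data is fetched per packet.
    fetched_mb = max(useful_mb, cache_mb)
    return fetched_mb / useful_mb
```

In this model the ratio stays at 1 while packets are longer than the cache and grows as packets shrink, which is the direction of the effect seen when adding slaves.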

SLIDE 20

Test w/o read-ahead in xrootd client and smaller tree cache

Processing rate:

− No effect for the runs with more detectors
− Performance of ITS-only greatly improved

Overall slow-down is most likely due to AliRoot update ;-)

SLIDE 21

Test w/o read-ahead in xrootd client and smaller tree cache

Input data rate:

− Now effect is pronounced not only for ITS run, but also for all runs with small event size
− I guess ROOT resets tree cache depending on the entry size

Some crashes with smaller cache size
SLIDE 22

Tests with longer packets

Not tested to full extent due to some technical problem

Recent tests show

− Mediocre results
− System is unstable...

SLIDE 23

Documentation

http://aliceinfo.cern.ch/Offline/Activities/Reconstruction/RecoOnCAF/index.html

Will try to update with each important change or addition

Reminder: users can also use analysis-task-force mailing list

SLIDE 24

Experience on GSI-AF

SLIDE 25

Outlook

Optimize performance

− Find optimal parameters (packet-size, tree cache depending on event size, #files/slaves etc.)
− Check the influence of reco code memory consumption
− Somehow put obtained optimizations in the PROOF code (packetizer,...)

Input data – under development - OK

Handling of output files - straightforward

Scalability of the system in case of more files

− Master is quite unstable - serialization?

SLIDE 26

The machinery works, so try it out and send us your feedback!

SLIDE 27

Spares

SLIDE 28

Exercise with local data-set

[Plot: results for each configuration]

− 4 files x 4 slaves (eff ~24 slaves)
− 4 files x 2 slaves (eff ~22 slaves)
− 4 files x 1 slave (only local slaves)
− 4 files x all slaves, via xrootd from castor