ATLAS I/O Overview Peter van Gemmeren (ANL) gemmeren@anl.gov for many in ATLAS 8/23/2018 Peter van Gemmeren (ANL): ATLAS I/O Overview 1

Overview
- High-level overview of the ATLAS Input/Output framework and data persistence.
- Athena: The ATLAS event processing framework
- The ATLAS event data model
- Persistence:
  - Writing Event Data: OutputStream and OutputStreamTool
  - Reading Event Data: EventSelector and AddressProvider
  - ConversionSvc and Converter
- Timeline
  - Run 2: AthenaMP, xAOD
  - Run 3: AthenaMT
  - Run 4: Serialization, Streaming, MPI, ESP

Athena: The ATLAS event processing framework
- Simulation, reconstruction, and analysis/derivation are run as part of the Athena framework:
  - Using the most current (transient) version of the Event Data Model
- Athena software architecture belongs to the blackboard family:
  - StoreGate is the Athena implementation of the blackboard:
    - A proxy defines and hides the cache-fault mechanism:
      - Upon request, a missing data object instance can be created and added to the transient data store, retrieving it from persistent storage on demand.
    - Support for object identification via data type and key string:
      - Base-class and derived-class retrieval, key aliases, versioning, and inter-object references.
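The proxy and cache-fault idea can be sketched in a few lines of Python. This is only an illustration of the pattern, not the StoreGate API: `Blackboard`, `DataProxy`, the container classes and the loader function are all hypothetical names made up for the example.

```python
from typing import Any, Callable, Dict, List, Tuple


class DataProxy:
    """Stands in for an object that may not have been read yet."""

    def __init__(self, loader: Callable[[], Any]) -> None:
        self._loader = loader   # called at most once, on first access
        self._obj: Any = None
        self.loaded = False

    def access(self) -> Any:
        if not self.loaded:     # cache fault: fetch from "persistent storage"
            self._obj = self._loader()
            self.loaded = True
        return self._obj


class Blackboard:
    """Toy transient store; objects are identified by (type, key string)."""

    def __init__(self) -> None:
        self._proxies: Dict[Tuple[type, str], DataProxy] = {}

    def add_proxy(self, cls: type, key: str, loader: Callable[[], Any]) -> None:
        self._proxies[(cls, key)] = DataProxy(loader)

    def alias(self, cls: type, key: str, alias_key: str) -> None:
        # A key alias shares the proxy, so the object is still read only once.
        self._proxies[(cls, alias_key)] = self._proxies[(cls, key)]

    def retrieve(self, cls: type, key: str) -> Any:
        # issubclass() also matches derived classes (base-class retrieval).
        for (stored_cls, stored_key), proxy in self._proxies.items():
            if stored_key == key and issubclass(stored_cls, cls):
                return proxy.access()
        raise KeyError((cls.__name__, key))


class TrackContainer(list):
    pass


class MuonContainer(TrackContainer):
    pass


reads: List[str] = []           # records every simulated read from storage


def load_muons() -> MuonContainer:
    reads.append("Muons")
    return MuonContainer([10.0, 42.0])


store = Blackboard()
store.add_proxy(MuonContainer, "Muons", load_muons)   # nothing is read yet
muons = store.retrieve(TrackContainer, "Muons")       # first access triggers the read
again = store.retrieve(MuonContainer, "Muons")        # served from the transient store
```

The client code never sees whether the object was already in memory; the proxy decides that on first access.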

Workflows
- Athena is used for different workflows in Reconstruction, Simulation and Analysis (mainly Derivation).

Step              Total Read             Total Write          ROOT compression   Total CPU
                  (incl. ROOT and P->T)  (w/o compression)                       evt-loop time
EVNTtoHITS        0.006   0.01%          0.017   0.02%        0.027   0.03%      91.986
HITtoRDO          1.978   5.30%          0.046   0.12%        0.288   0.77%      37.311
RDOtoRDOTrigger   0.125   1.23%          0.153   1.51%        0.328   3.23%      10.149
RDOtoESD          0.166   1.88%          0.252   2.85%        0.444   5.02%       8.838
ESDtoAOD          0.072  23.15%          0.147  47.26%        0.049  15.79%       0.311
AODtoDAOD         0.052   5.35%          0.040   4.06%        0.071   7.24%       0.979
RAWtoALL          N/A     N/A            0.112   0.72%        0.043   0.28%      15.562

The ATLAS event data model
- The transient ATLAS event model is implemented in C++, and uses the full power of C++, including pointers, inheritance, polymorphism, templates, STL and Boost classes, and a variety of external packages.
- At any processing stage, event data consist of a large and heterogeneous assortment of objects, with associations among objects.
- The final production outputs are xAOD and DxAOD, which were designed for Run II and after to simplify the data model, and make it more directly usable with ROOT.
  - More about this later…

Persistence
[Diagram: StoreGate, Conversion Service, POOL Svc, APR:Database and ROOT, showing on-demand single object retrieval (with optional T/P conversion) and on-demand single attribute retrieval via the Dynamic Attr Reader.]
- ATLAS currently has almost 400 petabytes of event data
  - Including replicated datasets
- ATLAS stores most of its event data using ROOT as its persistence technology
  - Raw readout data from the detector is in another format.

Writing Event Data
Sequence Diagram for writing Data Objects via AthenaPOOL:
- The AthenaPoolOutputStreamTool is used for writing data objects into POOL/APR files and hides any persistency technology dependence from the Athena software framework.
[Sequence diagram. Participants: AthenaPoolOutputStreamTool, AthenaPoolCnvSvc, AthenaPoolConverter, PoolSvc. Calls shown: connectOutput(), new DataHeader, setProcessTag(pTag), streamObjects(); loop [object in item list]: createRep(obj, addr), DataObjectToPool(), transToPers(obj, pObj) [T-P sep., trans.-pers. conversion], registerForWrite(place, pObj, desc) returning token, addr, insert(addr); register DataHeader in POOL, get token and insert to self; commitOutput(outputName, true); alt [full commit]: commit(), [else]: commitAndHold().]

OutputStream and OutputStreamTool
- OutputStreams connect a job to a data sink, usually a file (or sequence of files).
  - Configured with ItemList for event and metadata to be written.
- Similar to Athena algorithms:
  - Executed once for each event
  - Can be vetoed to write filtered events
  - Can have multiple instances per job, writing to different data sinks/files
- OutputStreamTools are used to interface the OutputStream to a ConversionSvc and its Converter which depend on the persistent technology.
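A minimal sketch of how an item list, a veto and several stream instances fit together is given below. It is a toy model: `OutputStream`, `ListStreamTool`, their methods and the event contents are hypothetical stand-ins, and the tool simply collects objects in a Python list instead of talking to a conversion service.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

Item = Tuple[str, str]                    # (type name, key)
Event = Dict[Item, Any]


class ListStreamTool:
    """Hypothetical stand-in for an OutputStreamTool; hides the data sink."""

    def __init__(self) -> None:
        self.committed: List[List[Tuple[Item, Any]]] = []
        self._pending: List[Tuple[Item, Any]] = []

    def connect_output(self) -> None:
        self._pending = []

    def stream_objects(self, objects: List[Tuple[Item, Any]]) -> None:
        self._pending.extend(objects)

    def commit_output(self) -> None:
        self.committed.append(self._pending)
        self._pending = []


class OutputStream:
    """Executed once per event; writes only what is named in its item list."""

    def __init__(self, name: str, item_list: List[Item], tool: ListStreamTool,
                 accept: Optional[Callable[[Event], bool]] = None) -> None:
        self.name = name
        self.item_list = item_list
        self.tool = tool
        self.accept = accept              # optional filter decision

    def execute(self, event: Event) -> bool:
        if self.accept is not None and not self.accept(event):
            return False                  # vetoed: nothing is written
        selected = [(item, event[item]) for item in self.item_list if item in event]
        self.tool.connect_output()
        self.tool.stream_objects(selected)
        self.tool.commit_output()
        return True


TRACKS: Item = ("TrackContainer", "Tracks")
CELLS: Item = ("CaloCells", "Cells")
events: List[Event] = [
    {TRACKS: [10.0, 42.0], CELLS: [1, 2, 3]},
    {TRACKS: [5.0], CELLS: [4]},
]

all_tool, high_pt_tool = ListStreamTool(), ListStreamTool()
streams = [
    OutputStream("StreamAll", [TRACKS, CELLS], all_tool),
    OutputStream("StreamHighPt", [TRACKS], high_pt_tool,
                 accept=lambda ev: max(ev[TRACKS]) > 20.0),
]
for ev in events:
    for stream in streams:                # several instances in one job
        stream.execute(ev)
```

Both streams see every event, but the second one writes only the filtered events and only the objects named in its own item list.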

Reading Event Data
Sequence Diagram for reading Data Objects via AthenaPOOL:
- An EventSelector is used to access selected events by iterating over the input DataHeaders.
- An AddressProvider preloads proxies for the data objects in the current input event into StoreGate.
[Sequence diagram. Participants include EventSelectorAthenaPool, PoolCollectionCnv, PoolSvc. Calls shown: next(); alt [no more events in collection]: getCollectionCnv(), new PoolCollectionCnv, initialize(), createCollection(type, connection, input, context), POOL::create(type, des, mode, session) returning ICollection, executeQuery(), newQuery() returning iterator; [else]: next() returning iterator; loadAddresses(), retrieve(iterator, ref), eventRef() returning token, retrieve(token), persToTrans(ptr, dataHeader) [T-P sep., pers.-trans. conversion], setObjPtr(ptr, token, context); loop [element != end()] over DataHeaderElement: getAddress().]

EventSelector and AddressProvider
- The EventSelector connects a job to a data source, usually a file (or sequence of files).
  - For event processing it implements the next() function that provides the persistent reference to the DataHeader.
    - The DataHeader stores persistent references and StoreGate state for all data objects in the event.
  - It also has other functionality, such as handling file boundaries for e.g. metadata processing.
- An AddressProvider is called automatically if an object retrieved from StoreGate has not been read.
  - The AddressProvider interacts with the ConversionSvc and Converter
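The read path can be illustrated with a toy model in which tokens are plain strings and the "file" is a dictionary. All names here (`STORAGE`, `EventSelector`, `EventStore`, `load_addresses`) are hypothetical; the sketch assumes only what is described above: next() yields a reference to a DataHeader, the header lists references for every object in the event, and objects are read only when first retrieved.

```python
from typing import Any, Dict, List, Optional, Tuple

Item = Tuple[str, str]                    # (type name, key)

# Toy "file": token -> payload. Header entries map items to object tokens.
STORAGE: Dict[str, Any] = {
    "hdr/0": {("TrackContainer", "Tracks"): "obj/0/tracks",
              ("CaloCells", "Cells"): "obj/0/cells"},
    "hdr/1": {("TrackContainer", "Tracks"): "obj/1/tracks",
              ("CaloCells", "Cells"): "obj/1/cells"},
    "obj/0/tracks": [10.0, 42.0],
    "obj/0/cells": [1, 2, 3],
    "obj/1/tracks": [5.0],
    "obj/1/cells": [4],
}
reads: List[str] = []                     # every token actually read


def read_token(token: str) -> Any:
    reads.append(token)
    return STORAGE[token]


class EventSelector:
    """Iterates over DataHeader tokens; next() returns None at end of input."""

    def __init__(self, header_tokens: List[str]) -> None:
        self._tokens = list(header_tokens)
        self._pos = -1

    def next(self) -> Optional[str]:
        self._pos += 1
        if self._pos >= len(self._tokens):
            return None
        return self._tokens[self._pos]


class EventStore:
    """Holds addresses (tokens) for the current event and reads on demand."""

    def __init__(self) -> None:
        self._addresses: Dict[Item, str] = {}
        self._cache: Dict[Item, Any] = {}

    def clear(self) -> None:
        self._addresses.clear()
        self._cache.clear()

    def preload(self, item: Item, token: str) -> None:
        self._addresses[item] = token     # address only, no object read

    def retrieve(self, type_name: str, key: str) -> Any:
        item = (type_name, key)
        if item not in self._cache:       # first access: read via the token
            self._cache[item] = read_token(self._addresses[item])
        return self._cache[item]


def load_addresses(header_token: str, store: EventStore) -> None:
    """Address-provider step: read the header and preload one address per object."""
    header = read_token(header_token)
    store.clear()
    for item, token in header.items():
        store.preload(item, token)


selector = EventSelector(["hdr/0", "hdr/1"])
store = EventStore()
n_tracks: List[int] = []
while (header_token := selector.next()) is not None:
    load_addresses(header_token, store)
    n_tracks.append(len(store.retrieve("TrackContainer", "Tracks")))
```

Because the job never asks for the cells, their tokens are never dereferenced; only the headers and the track containers are read.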

ConversionSvc and Converter
- The role of conversion services and their converters is to provide a means to write C++ data objects to storage and read them back.
  - Each storage technology is implemented via a ConversionSvc and Converter.
- ATLAS uses ROOT via POOL/APR that is implemented via Athena/Pool Conversion
  - APR implements ROOT TKey and TTree technologies.
- Converter dispatching is done by type.
  - Converters can do (optional) Transient/Persistent mappings and handle schema evolution.
  - When writing, Converters return an externalizable reference.
[Diagram: Input File -> (read) -> Compressed baskets (b) -> (decompress) -> Baskets (B) -> (stream) -> Persistent State (P) -> (t/p conv.) -> Transient State (T)]
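The chain read, decompress, stream, t/p conversion, together with dispatch by type, can be mimicked with standard-library pieces. In this sketch JSON stands in for ROOT streaming and zlib for basket compression; `ConversionSvc`, `TrackCnv`, `create_rep`, `create_obj` and the version field are hypothetical and only loosely modelled on the roles described above.

```python
import json
import zlib
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class Track:                              # transient class
    pt: float
    eta: float


class TrackCnv:
    """Converter with a T/P mapping; the persistent form is a plain dict."""

    def trans_to_pers(self, obj: Track) -> Dict[str, Any]:
        return {"v": 2, "pt": obj.pt, "eta": obj.eta}

    def pers_to_trans(self, pers: Dict[str, Any]) -> Track:
        if pers["v"] == 1:                # schema evolution: old field name
            return Track(pt=pers["p_t"], eta=pers["eta"])
        return Track(pt=pers["pt"], eta=pers["eta"])


class ConversionSvc:
    """Dispatches to a converter by type and owns the toy 'file'."""

    def __init__(self) -> None:
        self._converters: Dict[type, Any] = {}
        self._file: Dict[str, bytes] = {}  # token -> compressed bytes

    def add_converter(self, cls: type, cnv: Any) -> None:
        self._converters[cls] = cnv

    def create_rep(self, obj: Any, key: str) -> str:
        cnv = self._converters[type(obj)]             # dispatch by type
        pers = cnv.trans_to_pers(obj)                 # T -> P
        raw = json.dumps(pers).encode("utf-8")        # stream
        token = f"{type(obj).__name__}/{key}"
        self._file[token] = zlib.compress(raw)        # compress and write
        return token                                  # externalizable reference

    def create_obj(self, cls: type, token: str) -> Any:
        compressed = self._file[token]                # read
        raw = zlib.decompress(compressed)             # decompress
        pers = json.loads(raw.decode("utf-8"))        # stream
        return self._converters[cls].pers_to_trans(pers)  # P -> T


svc = ConversionSvc()
svc.add_converter(Track, TrackCnv())
token = svc.create_rep(Track(pt=42.5, eta=-1.2), "LeadTrack")
restored = svc.create_obj(Track, token)
```

The token is an ordinary string, so it can itself be stored (for example in a header object) and used later to find the data again.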

Run 2 Multi Process: AthenaMP
- Since Run II, ATLAS has deployed AthenaMP, the multi-process version of Athena.
  - Starts up and initializes as a single (mother) process.
    - Optionally processes events
  - Forks off (worker) processes that do the event processing in parallel.
    - Utilizes Copy On Write, thereby saving large amounts of memory.
    - Each worker has its own address space, no sharing of event data.
- In default mode, workers are independent of each other for I/O: they read their own data directly from file and write their own output to a (temporary) file.
  - Input may be non-optimal as workers have to de-compress the same buffers to process different subsections of events -> cluster dispatching
  - Output from different workers needs to be merged, which can create a bottleneck -> deployment of SharedWriter
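Why dispatching by cluster helps on input can be shown with simple counting. The sketch below assumes a fixed number of events per compressed cluster and that each worker decompresses a cluster once and keeps it; the function names and the round-robin baseline are illustrative assumptions, not a description of the actual AthenaMP scheduler.

```python
from typing import Dict, List


def round_robin(n_events: int, n_workers: int) -> Dict[int, List[int]]:
    """Baseline: event i goes to worker i % n_workers."""
    return {w: list(range(w, n_events, n_workers)) for w in range(n_workers)}


def by_cluster(n_events: int, n_workers: int, cluster_size: int) -> Dict[int, List[int]]:
    """Hand out whole clusters, so no cluster is shared between workers."""
    assignment: Dict[int, List[int]] = {w: [] for w in range(n_workers)}
    for start in range(0, n_events, cluster_size):
        worker = (start // cluster_size) % n_workers
        stop = min(start + cluster_size, n_events)
        assignment[worker].extend(range(start, stop))
    return assignment


def decompressions(assignment: Dict[int, List[int]], cluster_size: int) -> int:
    """Clusters decompressed, summed over workers (no sharing between processes)."""
    return sum(len({event // cluster_size for event in events})
               for events in assignment.values())


N_EVENTS, N_WORKERS, CLUSTER_SIZE = 40, 4, 10
naive = decompressions(round_robin(N_EVENTS, N_WORKERS), CLUSTER_SIZE)
aligned = decompressions(by_cluster(N_EVENTS, N_WORKERS, CLUSTER_SIZE), CLUSTER_SIZE)
```

With these numbers every worker touches all four clusters under round-robin dispatch (16 decompressions in total), while cluster-aligned dispatch needs each cluster to be decompressed only once (4 in total).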

AthenaMP: Shared I/O components

SharedReader
- The Shared Data Reader reads, de-compresses and de-serializes the data for all workers and therefore provides a single location to store the decompressed data and serve as caching layer.

SharedWriter
- The Shared Writer collects output data objects from all AthenaMP workers via shared memory and writes them to a single output file.
- This helps to avoid a separate merge step in AthenaMP processing.

Also Run 2: xAOD Data Model
- Each xAOD container has an associated data store object (called Auxiliary Store).
  - Both are recorded in StoreGate.
  - The key for the aux store should be the same as the data object with ‘Aux.’ appended.
- The xAOD aux store object contains the ‘static’ aux variables.
- It also holds a SG::AuxStoreInternal object which manages any additional ‘dynamic’ variables.

xAOD: Auxiliary data
- Most xAOD object data are not stored in the xAOD objects themselves, but in a separate auxiliary store.
- Object data stored as vectors of values.
  - (“Structure of arrays” versus “array of structures.”)
- Allows for better interaction with ROOT, partial reading of objects, and user extension of objects.
- Opens up opportunities for more vectorization and better use of compute accelerators.
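The structure-of-arrays layout and the ‘Aux.’ key convention can be sketched as follows. This is a toy model in Python rather than the xAOD classes: `AuxStore`, `Element`, `record` and the variable names are hypothetical, and a plain dictionary plays the role of the event store.

```python
from typing import Any, Dict, List


class AuxStore:
    """One list per variable (structure of arrays), static plus dynamic."""

    def __init__(self, static_names: List[str]) -> None:
        self.static: Dict[str, List[Any]] = {name: [] for name in static_names}
        self.dynamic: Dict[str, List[Any]] = {}
        self.size = 0

    def push_back(self, **values: Any) -> None:
        for name, column in self.static.items():
            column.append(values[name])
        for column in self.dynamic.values():
            column.append(None)           # keep dynamic columns the same length
        self.size += 1

    def decorate(self, name: str, values: List[Any]) -> None:
        """User extension: add a dynamic variable for all elements at once."""
        if len(values) != self.size:
            raise ValueError("one value per element is required")
        self.dynamic[name] = list(values)

    def column(self, name: str) -> List[Any]:
        if name in self.static:
            return self.static[name]
        return self.dynamic[name]


class Element:
    """Lightweight object: holds no data, only a store reference and an index."""

    def __init__(self, store: AuxStore, index: int) -> None:
        self._store = store
        self._index = index

    def get(self, name: str) -> Any:
        return self._store.column(name)[self._index]


def record(event: Dict[str, Any], key: str, store: AuxStore) -> None:
    """Record the container and its aux store under key and key + 'Aux.'."""
    event[key] = [Element(store, i) for i in range(store.size)]
    event[key + "Aux."] = store


aux = AuxStore(["pt", "eta"])
aux.push_back(pt=10.0, eta=0.1)
aux.push_back(pt=42.0, eta=-1.3)
aux.decorate("isGood", [False, True])

event: Dict[str, Any] = {}
record(event, "Muons", aux)
mean_pt = sum(aux.column("pt")) / aux.size   # touches only the pt column
```

Computing `mean_pt` reads a single contiguous column and never touches `eta` or `isGood`, which is the property that makes partial reading and vectorized processing natural in this layout.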
