SLIDE 1

How to make a petabyte ROOT file: proposal for managing data with columnar granularity

Jim Pivarski

Princeton University – DIANA

October 11, 2017

SLIDE 2

Motivation: start by stating the obvious

ROOT’s selective reading is very important for analysis. Datasets have about a thousand branches,¹ so if you want to plot a quantity from a terabyte dataset with TTree::Draw, you only have to read a few gigabytes from disk. Same for reading over a network (XRootD):

auto file = TFile::Open("root://very.far.away/mydata.root");

This is GREAT.

¹ Branch counts: 3116 ATLAS MC, 1717 ATLAS data, 2151 CMS MiniAOD, 675+ CMS NanoAOD, 560 LHCb.
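A minimal sketch of what that selective read looks like in a session; the tree name "events" and branch name "Muon_pt" are placeholders, not names from the slides.

// Open a remote file over XRootD; only the branches actually touched get read.
auto file = TFile::Open("root://very.far.away/mydata.root");
TTree* tree = nullptr;
file->GetObject("events", tree);   // "events" is a placeholder tree name
tree->Draw("Muon_pt");             // fetches only Muon_pt's TBaskets, not the other ~1000 branches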

SLIDE 3

Conversation with computer scientist

“So it sounds like you already have a columnar database.”

“Not exactly: we still have to manage data as files, rather than columns.”

“What? Why? Couldn’t you just use XRootD to manage (move, backup, cache) columns directly? Why does it matter that they’re inside of files?”

“Because... because...”

SLIDE 4

Evidence that it matters: the CMS NanoAOD project

Stated goal: to serve 30–50% of CMS analyses with a single selection of columns. Need to make hard decisions about which columns to keep: reducing more makes data access easier for 50% of analyses while completely excluding the rest.

If we really had columnar data management, the problem would be moot: we’d just let the most frequently used 1–2 kB of each event migrate to warm storage while the rest cools. Instead, we’ll probably put the whole small copy (NanoAOD) in warm storage and the whole large copy (MiniAOD) in colder storage.

This is artificial. There’s a steep popularity distribution across columns, but we cut it abruptly with file schemas (data tiers).

SLIDE 5

Except for the simplest TTree structures, we can’t pull individual branches out of a file and manage them on their own.

“But you have XRootD!”

Yes, but only ROOT knows how to interpret a branch’s relationship with other branches (for example, which branches belong to the same split object, or how their baskets line up entry by entry).

SLIDE 6

What would it look like if we could?

CREATE TABLE derived_data AS
    SELECT pt, eta, phi, deltaphi**2 + deltaeta**2 AS deltaR
    FROM original_data WHERE deltaR < 0.2;

creates a new derived_data table from original_data, but links, rather than copies, pt, eta, and phi.² If original_data is deleted, the database would not delete pt, eta, and phi, as they’re in use by derived_data.

For data management, this is a very flexible system, as columns are a more granular unit for caching and replication. For users, there is much less cost to creating derived datasets: many versions of corrections and cuts.

² Implementation dependent, but common. “WHERE” selection may be implemented with a stencil.
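A minimal sketch of the bookkeeping this implies, in plain C++ and independent of any real database: each column is stored once and reference-counted, a derived dataset links to existing columns, and dropping a dataset only deletes columns that nothing else uses. ColumnStore and Dataset are made-up names for illustration.

#include <iostream>
#include <map>
#include <string>
#include <vector>

// Hypothetical bookkeeping: column name -> reference count.
// In the proposal, this role would be played by the controlling database.
struct ColumnStore {
    std::map<std::string, int> refcount;

    void link(const std::string& column) { ++refcount[column]; }

    void unlink(const std::string& column) {
        if (--refcount[column] == 0) {
            refcount.erase(column);
            std::cout << "column " << column << " deleted (no longer in use)\n";
        }
    }
};

// A dataset is just a list of column names it links to.
struct Dataset {
    std::vector<std::string> columns;
};

int main() {
    ColumnStore store;

    Dataset original{{"pt", "eta", "phi"}};
    for (auto& c : original.columns) store.link(c);

    // The derived dataset links pt, eta, phi (shared) and adds a new column deltaR.
    Dataset derived{{"pt", "eta", "phi", "deltaR"}};
    for (auto& c : derived.columns) store.link(c);

    // Deleting original_data does not delete pt, eta, phi: derived_data still uses them.
    for (auto& c : original.columns) store.unlink(c);

    // Only when derived_data is also dropped do the shared columns go away.
    for (auto& c : derived.columns) store.unlink(c);
}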

SLIDE 7

Idea #1. Cast data from ROOT files into a well-known standard for columnar, hierarchical data; manage those columns individually in an object store like Ceph.

1. Apache Arrow is one such standard. It’s similar to ROOT’s splitting format but permits O(1) random access and splits down to all levels of depth.

2. PLUR or PLURP is my subset of the above with looser rules about how data may be referenced. Acronym for the minimum data model needed for physics: Primitives, Lists, Unions, Records, and maybe Pointers (beyond Arrow). (A minimal layout sketch follows below.)

(the “standard database” approach)
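To make “splits down to all levels of depth” and “O(1) random access” concrete, here is a minimal sketch of an Arrow/PLUR-style layout for a jagged quantity (muon pt per event): one flat content array plus one offsets array per list level. The variable names are illustrative only, not part of Arrow or PLUR.

#include <cstdio>
#include <vector>

int main() {
    // Two events: event 0 has muons with pt {50.0, 31.5}; event 1 has {24.2}.
    // Columnar layout: one flat content array and one offsets array per list level.
    std::vector<double> muon_pt_content{50.0, 31.5, 24.2};
    std::vector<int>    muon_offsets{0, 2, 3};   // event i spans [offsets[i], offsets[i+1])

    // O(1) random access to event 1 without touching event 0:
    int event = 1;
    for (int j = muon_offsets[event]; j < muon_offsets[event + 1]; ++j)
        std::printf("event %d muon pt = %g\n", event, muon_pt_content[j]);
}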

SLIDE 8

Idea #2 (this talk). Keep ROOT data as they are, but put individual TBaskets in the object store. TFile/TTree subclasses fetch data from the object store instead of seeking to file positions.

1. Presents the same TFile/TTree interface to users; old scripts still work.
2. But data replication, storage class, and caching are handled by the object store with columnar granularity.
3. Branches are shared transparently across derived datasets: all trees are friends.
4. The logic of sharing, reference-counting branches, managing datasets, etc. must all be implemented in ROOT; only ROOT understands how to combine branches.

(the “ROOT becomes the database” approach)
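A minimal sketch, and an assumption rather than existing ROOT code, of what the read path of such a TFile subclass could look like. The class name TObjectStoreFile, the httpGet() helper, and the key scheme are invented for illustration; the ReadBuffer override and the “WEB” constructor option are the hooks that existing remote-file subclasses of TFile typically use.

// Sketch only: a TFile subclass whose byte reads come from an object store.
#include "TFile.h"
#include <string>

// Placeholder for an HTTP GET against the warehouse database; a real version
// might use libcurl, and would fetch the whole TBasket object covering
// [pos, pos+len) by key so that a web cache can hold popular baskets.
static bool httpGet(const std::string& baseUrl, Long64_t pos, Int_t len, char* out) {
    (void)baseUrl; (void)pos; (void)len; (void)out;
    return false;  // stub
}

class TObjectStoreFile : public TFile {
public:
    // "WEB" asks the TFile base class not to open a local file (the usual setup for
    // web/remote TFile subclasses); metadata would come from the controlling database.
    explicit TObjectStoreFile(const char* url) : TFile(url, "WEB"), fBaseUrl(url) {}

    using TFile::ReadBuffer;

    // ROOT calls ReadBuffer whenever a TKey/TBasket needs bytes [pos, pos+len);
    // here the range is served by the object store instead of a local seek+read.
    Bool_t ReadBuffer(char* buf, Long64_t pos, Int_t len) override {
        return httpGet(fBaseUrl, pos, len, buf) ? kFALSE   // kFALSE = success (TFile convention)
                                                : kTRUE;   // kTRUE  = error
    }

private:
    std::string fBaseUrl;
};

In the actual proposal the subclass would first consult the controlling database for the TTree and TKey metadata and then fetch TBaskets by key; the byte-range form above is just the smallest hook that shows where the object store plugs in.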

SLIDE 9

How it could be done

◮ Subclass of TFile initializes itself by getting data from a “controlling” database (a document store like MongoDB might be best).

◮ Reference counts for objects referenced by TKeys (including TBaskets and user objects like histograms) are maintained by this controlling database.

◮ Bulk data, the contents of TKeys, are in a “warehouse” database (object store; might be the same database). Optimal basket size may be big, like megabytes.

◮ REST APIs for flexibility; TBaskets fetched by HTTP GET, may be web-cached. No new ROOT dependencies.

◮ Methods for deriving new TTrees from old TTrees:
  ◮ share common TBranch data by default;
  ◮ “soft skim” by stencil (event list/event bitmap); “hard skim” only if re-basketization is needed to compactify results (keeping fewer than ∼10% of original). A TEntryList-based sketch of the stencil idea follows below;
  ◮ save all provenance and use git-like versioning to determine whether two branches are related/may be combined (for a join by index position, rather than a mutual column).

◮ No user-facing partition boundaries: a huge dataset appears as one TTree.

◮ Users work in a shared TFile: home TDirectories; permissions managed by the database.
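The “soft skim by stencil” item can be pictured with ROOT’s existing TEntryList: the skim is just a list of selected entry numbers attached to the tree, so no branch data are rewritten. The file name, tree name, and cut below are placeholders; in the proposed system the stencil would live in the controlling database alongside the shared TBaskets.

// Sketch: a "soft skim" as a stencil (entry list) rather than a copied dataset.
// "mydata.root", "events", and "deltaR" are placeholder names for illustration.
#include "TFile.h"
#include "TTree.h"
#include "TEntryList.h"
#include "TDirectory.h"

void soft_skim() {
    TFile* file = TFile::Open("mydata.root");
    TTree* tree = nullptr;
    file->GetObject("events", tree);

    // Build the stencil: entry numbers passing the cut, collected into a TEntryList.
    tree->Draw(">>selected", "deltaR < 0.2", "entrylist");
    auto* selected = static_cast<TEntryList*>(gDirectory->Get("selected"));

    // Attach the stencil; subsequent draws see only the selected entries, while the
    // underlying branch data (TBaskets) are untouched and can still be shared.
    tree->SetEntryList(selected);
    tree->Draw("pt");
}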

SLIDE 10

Two modes of use

Direct connection

User launches ROOT, does TFile::Open("rootdb://data.cern/cms"), and extracts objects for analysis: Get("home/username/myhist")->Draw().

Job submission

User passes a macro, TTree::Draw request, or TDataFrame to a service that parallelizes it and puts results in the user’s home TDirectory.

◮ compute nodes use this same interface to communicate with storage;
◮ but a scheduler attempts to maximize shared cache locality on the compute nodes.

This is the “query server” idea I’ve been exploring for some time now, except that all of the interface is ROOT.
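A sketch of what a submitted payload could be: a declarative TDataFrame chain that the service can parallelize over TBaskets. Only the TDataFrame calls are real ROOT (as of 2017, in ROOT::Experimental); the dataset name "derived_data", the branch names, and the rootdb:// URL are the hypothetical ones from this talk.

// What a user might hand to the submission service.
#include "ROOT/TDataFrame.hxx"

void submit_me() {
    ROOT::Experimental::TDataFrame df("derived_data", "rootdb://data.cern/cms");
    auto hist = df.Filter("deltaR < 0.2").Histo1D("pt");
    hist->Draw();   // in the proposal, the filled histogram lands in the user's home TDirectory
}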


SLIDE 11

auto file = TFile::Open("rootdb://data.cern/cms");
file->Get("home/username")->cd();
file->Get("derived_data")->Draw("x >> hist");
file->Get("hist")->Fit("gaus");

[Architecture diagram: the user’s laptop and compute nodes talk to a control database and a warehouse database over HTTP/REST, with a cache on the compute nodes and job dispatch coordinated by Zookeeper. Jobs get TBasket data, perform the calculation, and save the result to “hist” in the database; the dispatcher preferentially sends jobs to compute nodes that already have the needed TBaskets in cache.]

SLIDE 12

Questions for you

Question: How would you feel if I developed this kind of service within ROOT (idea #2), rather than outside of ROOT (idea #1)? I’d want to sketch it out in Python (my uproot project) to figure out the architecture before committing to the ROOT codebase: ∼year timescale.

Question: Deeply nested columnar splitting, zero-copy structure manipulations, and many database indexing techniques are not possible with today’s ROOT serialization. Are you interested in forward-incompatible changes to ROOT serialization that would make these things possible? I could propose them as a ROOT 7 serialization format. (Subject of another talk.)
