

SLIDE 1

Managed by Triad National Security, LLC for the U.S. Department of Energy’s NNSA

A Foundation for Automated Placement of Data

Douglass Otstott, Sean Williams, Latchesar Ionkov, Michael Lang, Ming Zhao

LA-UR-17-22686

SLIDE 2

Los Alamos National Laboratory

Memory and Storage are Converging

10/22/2019

  • Persistent storage on the memory bus (NVDIMMs)
  • Remote memory (GenZ)
  • Which memory bus? (DRAM, HBM, GPU memory, …)


SLIDE 3

Data Layouts are Different

[Figure: the same M×N dataset in two layouts. In memory, each of the M rows is one record holding all fields (pressure=5.1, temp=33.1, density=0.4, …); in storage, each field (pressure, temperature, density) is written as its own contiguous M×N array, row by row.]
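The two layouts in the figure are the classic array-of-structs (memory) versus struct-of-arrays (storage) split. A minimal C sketch of the difference, using the field names from the slide; the transpose function is an illustrative assumption about what such a layout transformation does, not the paper's code:

```c
#include <stddef.h>

#define M 3  /* number of rows; illustrative */

/* Memory layout: one record per point (array of structs). */
struct record { double pressure, temp, density; };

/* Storage layout: each field contiguous (struct of arrays). */
struct columns { double pressure[M], temp[M], density[M]; };

/* Transform the memory layout into the storage layout by
   copying each field into its own contiguous array. */
static void to_columns(const struct record rows[M], struct columns *cols)
{
    for (size_t i = 0; i < M; i++) {
        cols->pressure[i] = rows[i].pressure;
        cols->temp[i]     = rows[i].temp;
        cols->density[i]  = rows[i].density;
    }
}
```

Reading one field across all rows touches contiguous memory in the column layout but strided memory in the record layout, which is why the two sides of the figure differ.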

SLIDE 4

Data Sharing

  • Less distinction between memory and storage means more confusion
  • More complicated workloads bring many options
  • in situ, in transit, …
  • No generic way to share in-memory data between applications
  • ad hoc sharing
  • in-memory file systems
  • What data format do the data producer and data consumer agree on?


SLIDE 5

Need for Data Management Service

  • Handles all data that an application shares
  • Moves data between the many memory and storage layers
  • Allows data layout transformations
  • This work
  • describes the foundations for building such a service
  • allows data movement and transformation
  • does not include support for global data optimizations


SLIDE 6

Components

  • Name server
  • handles metadata
  • global
  • Runtime
  • runs on every node
  • handles local data
  • talks to runtimes on other nodes
  • Global/Local placement services (not included)

  • optimize data locality and format
  • Application (not included)

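The name server's role in this component split can be pictured as a global lookup table mapping (dataset, fragment, version) to the node whose runtime holds the data. A hypothetical sketch; the entry layout and function name are assumptions, not the actual metadata schema:

```c
#include <string.h>
#include <stddef.h>

/* One metadata record held by the global name server (assumed shape). */
struct ns_entry {
    const char *dataset;   /* object name */
    const char *fragment;  /* fragment name */
    unsigned    version;   /* published version */
    int         node;      /* rank of the runtime holding the data */
};

/* Resolve a fragment version to the node that holds it;
   returns -1 if no runtime has published it yet. */
static int ns_lookup(const struct ns_entry *tab, size_t n,
                     const char *dataset, const char *fragment,
                     unsigned version)
{
    for (size_t i = 0; i < n; i++)
        if (tab[i].version == version &&
            strcmp(tab[i].dataset, dataset) == 0 &&
            strcmp(tab[i].fragment, fragment) == 0)
            return tab[i].node;
    return -1;
}
```

Under this picture, per-node runtimes do the data movement and the name server only answers "who has what, at which version".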

SLIDE 7

Data Model

  • Dataset
  • types
  • primitive types (integer, floating point, string)
  • structs
  • (multidimensional) arrays
  • variables
  • Fragments
  • subsets of a dataset
  • types - based on dataset types
  • variables - based on dataset variables
  • Versions
  • provide consistent view of distributed dataset

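The dataset/fragment/version hierarchy above can be sketched as C types. These declarations are illustrative assumptions about the model's shape, not the actual implementation:

```c
#include <stddef.h>

/* A dataset's type system: primitives, structs, and
   (multidimensional) arrays. */
enum type_kind { T_PRIMITIVE, T_STRUCT, T_ARRAY };

struct type {
    enum type_kind kind;
    /* plus: primitive tag, struct field list, or element type + dims */
};

/* A named, typed variable as declared in a dataset. */
struct variable {
    const char  *name;
    struct type *type;
};

struct dataset {
    const char      *name;
    struct variable *vars;
    size_t           nvars;
};

/* A fragment is a subset of a dataset: its types and variables
   derive from the parent's, and a version number gives a
   consistent view of the distributed whole. */
struct fragment {
    struct dataset  *parent;
    struct variable *vars;
    size_t           nvars;
    unsigned         version;
};
```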

SLIDE 8

Declarative Data Language & Transformations

  • For the computers: transformation rules that convert data between the dataset and its subsets

[Figure: transformation rules mapping fields a, b, c of fragment p to the default, pa, and pba fragments for the viz consumer; each rule copies a field from its source offset (S 0000/0004/0008) to its destination offset (T 0000/0004).]

  • For the user: define the abstract dataset and subsets

fragment dataset {
    var p struct { a, b, c float64 }
}
fragment default {
    var p = p
}
fragment viz {
    var pa { a } = p
    var pba { b, a } = p
}
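As a concrete reading of the viz rules, `pba { b, a } = p` both projects and reorders fields of p. A C sketch under assumed struct layouts; the runtime's generated transformation would perform the equivalent copies:

```c
#include <stddef.h>

/* The dataset variable p: struct { a, b, c float64 }. */
struct p_rec   { double a, b, c; };
/* The viz fragment pba { b, a } = p: fields b and a, reordered. */
struct pba_rec { double b, a; };

/* Apply the transformation rule: copy each record's b and a
   fields into the reordered subset layout, dropping c. */
static void p_to_pba(const struct p_rec *p, struct pba_rec *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        out[i].b = p[i].b;
        out[i].a = p[i].a;
    }
}
```

The `pa { a }` rule is the same pattern with a single field, which is why one declarative language can describe both projections and reorderings.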

SLIDE 9
  • API
  • create object
  • name
  • dataset description
  • attach fragment
  • dataset name
  • fragment description
  • version
  • publish fragment
  • data pointer
  • version

  • Operations
  • create object: the object is registered in the name server
  • attach fragment: the runtime finds the locations of the fragments that contain the relevant data and version, brings the data, and transforms it to the required format
  • publish fragment: the runtime registers the fragment version in the name server and keeps a copy of the data in memory or local storage
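The producer-side flow implied by this API, creating the object, attaching a fragment at a version, then publishing the data, might look like the following. The function names follow the slides' wording, but the signatures are assumptions, and the stub bodies stand in for the real name-server and runtime calls:

```c
#include <stdio.h>

/* Stub: register the object and its dataset description
   in the global name server. */
static int create_object(const char *name, const char *dataset_desc)
{
    (void)dataset_desc;
    printf("create_object(%s)\n", name);
    return 0;
}

/* Stub: attach a fragment of the named dataset at a version;
   the runtime would locate the fragments holding the data
   and transform them to the required format. */
static int attach(const char *dataset, const char *frag_desc,
                  unsigned version)
{
    (void)frag_desc;
    printf("attach(%s, v%u)\n", dataset, version);
    return 0;
}

/* Stub: publish the fragment's data; the runtime would register
   the version in the name server and keep a copy in memory or
   local storage. */
static int publish(const void *data, unsigned version)
{
    (void)data;
    printf("publish(v%u)\n", version);
    return 0;
}
```

A producer would call the three in order; a consumer on another node would call attach with the same name and version to pull the published data.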

SLIDE 10

[Figure: dataset A distributed as fragments F11, F12, F13, F21, F22, F31.]

  • Can be used for communication between ranks
  • A fragment can have read-only and read-write parts of complex geometry

SLIDE 11

Results

  • Synthetic benchmark
  • Evaluates the overhead of the operations
  • Single name server
  • 16 ranks per node

[Chart: operations per second (20,000–140,000) vs. ranks (16–2048) for the create_object, attach, and publish operations.]

SLIDE 12

Results: SNAP checkpoint

  • Original SNAP (no checkpoints) vs. adding the checkpoint code
  • Evaluate the overhead

[Chart: time in seconds (20–100) vs. ranks (16–2048) for RT/NS SNAP and original SNAP.]

SLIDE 13

[Chart: time in seconds (500–3500) vs. ranks (16–2048) for RT/NS SNAP and MPI-IO SNAP.]

SLIDE 14

[Chart: time in seconds (20–120) vs. ranks (4–2048) for N to N restart, N to N over 2 restart, and N to N over 4 restart.]

SLIDE 15

Results: VPIC

[Chart: percent overhead (10–90) vs. ranks (16–1024) for VPIC I/O, RT/NS I/O, and RT/NS with no I/O.]

SLIDE 16

Conclusions

  • Scalable data service
  • Easy-to-use API
  • Future
  • Integration with data placement services
  • Additional applications (E3SM)
  • Scalable name server
