Modular Data Storage with Anvil Mike Mammarella Shant Hovsepian Eddie Kohler

Motivation
• Data storage and databases drive modern applications
• Facebook, Twitter, Google Mail, system logs, even Firefox
• Yet hand-built data stores can outperform general-purpose ones by 100x [Boncz]
• Changing the layout of stored data can substantially improve performance
• Recent systems implement custom storage engines
• Custom storage engines are hard to write
• Reason: they must be consistent and fast for both reads and writes
• What if you want to experiment with a new layout?

The Question
Can we give applications a simple and efficient modular framework, supporting a wide variety of different data layouts, enabling better performance?
Yes we can!

Anvil
• Fine-grained modules called dTables
• Composable to build complex data stores from simple parts
• Easy to implement new dTables to store specialized data
• Isolates all writing to dedicated writable dTables
• Many data storage layouts only add or change read-only dTables, which are significantly easier to implement
• Good disk access characteristics come as well
• Unifying dTables combine write- and read-optimized dTables

Contributions
• Fine-grained, modular dTable design
• Core dTables: Overlay dTable, Managed dTable, Exception dTable
• Anvil implementation
• Shows that such a system can be fast

dTables
• Key/value store
• Keys are integers, floats, strings, or blobs
• Values are byte arrays
• Iterators support in-order traversal
• Most are read-only

dTable interface: blob lookup(key k); bool insert(key k, blob v); bool remove(key k); iter iterator()
iterator interface: key key(); blob value(); bool valid(); bool next()
Slightly simplified, but not much!
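The two interfaces listed on this slide can be sketched in Python as a pair of abstract classes plus one tiny concrete store. This is only an illustration, not Anvil's actual code: the class names `DTable`, `DTableIterator`, and `SortedListDTable` are hypothetical, and the choice to make `insert()` and `remove()` return `False` on read-only tables is an assumption.

```python
from abc import ABC, abstractmethod
from bisect import bisect_left
from typing import Optional


class DTableIterator(ABC):
    """In-order cursor; method names follow the slide's iterator list."""

    @abstractmethod
    def valid(self) -> bool: ...

    @abstractmethod
    def key(self): ...

    @abstractmethod
    def value(self) -> bytes: ...

    @abstractmethod
    def next(self) -> bool: ...


class DTable(ABC):
    """Key/value store; read-only unless a subclass overrides the writers."""

    @abstractmethod
    def lookup(self, key) -> Optional[bytes]: ...

    @abstractmethod
    def iterator(self) -> DTableIterator: ...

    def insert(self, key, value: bytes) -> bool:
        return False  # assumption: read-only tables simply refuse writes

    def remove(self, key) -> bool:
        return False


class _ListIterator(DTableIterator):
    def __init__(self, keys, values):
        self._keys, self._values, self._pos = list(keys), list(values), 0

    def valid(self) -> bool:
        return self._pos < len(self._keys)

    def key(self):
        return self._keys[self._pos]

    def value(self) -> bytes:
        return self._values[self._pos]

    def next(self) -> bool:
        self._pos += 1
        return self.valid()


class SortedListDTable(DTable):
    """Hypothetical in-memory dTable that keeps its keys sorted."""

    def __init__(self):
        self._keys, self._values = [], []

    def _find(self, key):
        i = bisect_left(self._keys, key)
        return i, i < len(self._keys) and self._keys[i] == key

    def lookup(self, key) -> Optional[bytes]:
        i, found = self._find(key)
        return self._values[i] if found else None

    def insert(self, key, value: bytes) -> bool:
        i, found = self._find(key)
        if found:
            self._values[i] = value
        else:
            self._keys.insert(i, key)
            self._values.insert(i, value)
        return True

    def remove(self, key) -> bool:
        i, found = self._find(key)
        if found:
            del self._keys[i], self._values[i]
        return found

    def iterator(self) -> DTableIterator:
        return _ListIterator(self._keys, self._values)
```

A caller walks the table with `it = table.iterator()` followed by a `while it.valid():` loop that reads `it.key()` and `it.value()` and then calls `it.next()`.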

dTable Layering
• Applications (and frontends) use the dTable interface
• But so do other dTables!
• Transform data
• Add indices
• Construct complex functionality from simple pieces
[Figure: two stacked dTables; lookup() and iterator() calls pass from the upper dTable to the lower one, and the returned iterator is labeled "iter wrap".]
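A minimal sketch of layering, assuming a data-transforming layer that compresses values with Python's standard `zlib` module. The classes `DictDTable`, `ListIter`, `CompressingDTable`, and `DecompressingIter` are hypothetical stand-ins and not taken from Anvil; the point is only that the upper dTable speaks the same interface as the one beneath it and wraps the iterator it gets back.

```python
import zlib


class ListIter:
    """Iterator over a pre-sorted list of (key, value) pairs."""

    def __init__(self, pairs):
        self._pairs, self._pos = pairs, 0

    def valid(self):
        return self._pos < len(self._pairs)

    def key(self):
        return self._pairs[self._pos][0]

    def value(self):
        return self._pairs[self._pos][1]

    def next(self):
        self._pos += 1
        return self.valid()


class DictDTable:
    """Hypothetical base store backed by an in-memory dict."""

    def __init__(self):
        self._items = {}

    def lookup(self, key):
        return self._items.get(key)

    def insert(self, key, value):
        self._items[key] = value
        return True

    def iterator(self):
        return ListIter(sorted(self._items.items()))


class DecompressingIter:
    """Wraps the lower iterator and transforms values on the way out."""

    def __init__(self, inner):
        self._inner = inner

    def valid(self):
        return self._inner.valid()

    def key(self):
        return self._inner.key()

    def value(self):
        return zlib.decompress(self._inner.value())

    def next(self):
        return self._inner.next()


class CompressingDTable:
    """Layered dTable: same interface on top, any dTable underneath."""

    def __init__(self, base):
        self._base = base

    def lookup(self, key):
        raw = self._base.lookup(key)
        return None if raw is None else zlib.decompress(raw)

    def insert(self, key, value):
        return self._base.insert(key, zlib.compress(value))

    def iterator(self):
        return DecompressingIter(self._base.iterator())
```

Because `CompressingDTable` only calls `lookup()`, `insert()`, and `iterator()` on its base, it can sit on top of any other object with those methods, including another layered dTable.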

An Application-Specific Backend
[Figure: a stack of dTables: Managed dTable, Journal dTable, Overlay dTable, Bloom dTable, Exception dTable, State Dict. dTable, B-tree dTable, Array dTable, Linear dTable.]

Application-Specific Data Example
• Want to store the state of residence of customers
• Identified by mostly-contiguous IDs
• Most live in the US, but a few don’t
• Move between states occasionally
• Common case could be stored efficiently as an array of state IDs
• But don’t want to penalize the uncommon case
• Want transactional semantics

Requirements checklist:
- Mostly-contiguous IDs
- Most live in the US
- Some live elsewhere
- Don’t penalize them
- Occasionally relocate

Array dTable
• Stores an array of fixed-size values
• Keys must be contiguous integers
• Locating data items becomes constant time
• Can’t store some types of data (anything that breaks the two restrictions above)
• Read-only
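The constant-time lookup follows directly from the two restrictions: with contiguous integer keys and fixed-size values, a key maps to a byte offset by arithmetic alone. The sketch below assumes a single in-memory byte buffer and a hypothetical `first_key` parameter; it is not Anvil's on-disk format.

```python
class ArrayDTable:
    """Read-only array of fixed-size values keyed by contiguous integers."""

    def __init__(self, first_key, values, value_size):
        for v in values:
            if len(v) != value_size:
                raise ValueError("every value must be exactly value_size bytes")
        self._first = first_key      # key of element 0 (assumed parameter)
        self._size = value_size
        self._count = len(values)
        self._data = b"".join(values)

    def lookup(self, key):
        index = key - self._first
        if not 0 <= index < self._count:
            return None              # key outside the contiguous range
        offset = index * self._size  # constant-time address computation
        return self._data[offset:offset + self._size]
```

For example, `ArrayDTable(1000, [b"\x1f", b"\x05", b"\x1f"], 1)` stores one-byte state IDs for customers 1000 through 1002, and `lookup(1001)` reads the byte at offset 1.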

Storing Common Case Data Efficiently
[Figure: a stack of dTables: Managed dTable, Journal dTable, Overlay dTable, Bloom dTable, Exception dTable, State Dict. dTable, B-tree dTable, Array dTable, Linear dTable. The labels “California” and 31 appear beside the State Dict. dTable and Array dTable.]
Requirements checklist:
- Mostly-contiguous IDs ✔
- Most live in the US ✔
- Some live elsewhere
- Don’t penalize them
- Occasionally relocate

Exception dTable
• Many data sets mostly but not entirely conform to some pattern that would allow more efficient storage
• Exception dTable combines a “restricted” dTable with an “unrestricted” dTable
• Sentinel value in restricted dTable indicates that the unrestricted dTable should be checked
• Simple unrestricted dTable: Linear dTable
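The sentinel mechanism described above can be sketched as follows. The sentinel byte `0xff`, the `fits` predicate, the `build_exception_dtable` helper, and the dict-backed stand-ins for the restricted and unrestricted dTables are all assumptions made for illustration; Anvil's real implementation may differ.

```python
SENTINEL = b"\xff"  # assumed marker meaning "check the unrestricted dTable"


class DictDTable:
    """Hypothetical read-only stand-in for any lower-level dTable."""

    def __init__(self, items):
        self._items = dict(items)

    def lookup(self, key):
        return self._items.get(key)


class ExceptionDTable:
    """Combines a restricted dTable with an unrestricted one."""

    def __init__(self, restricted, unrestricted, sentinel=SENTINEL):
        self._restricted = restricted
        self._unrestricted = unrestricted
        self._sentinel = sentinel

    def lookup(self, key):
        value = self._restricted.lookup(key)
        if value == self._sentinel:
            # The common-case store could not hold this value.
            return self._unrestricted.lookup(key)
        return value


def build_exception_dtable(items, fits, sentinel=SENTINEL):
    """Split items by whether the restricted store can hold each value."""
    restricted, unrestricted = {}, {}
    for key, value in items.items():
        if fits(value) and value != sentinel:
            restricted[key] = value
        else:
            restricted[key] = sentinel
            unrestricted[key] = value
    return ExceptionDTable(DictDTable(restricted), DictDTable(unrestricted), sentinel)
```

In the customer example, `fits` could accept only one-byte state IDs, so a customer whose value is a free-form string such as `b"Ontario"` lands in the unrestricted store while everyone else stays in the compact one.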

Storing All Data
[Figure: a stack of dTables: Managed dTable, Journal dTable, Overlay dTable, Bloom dTable, Exception dTable, State Dict. dTable, B-tree dTable, Array dTable, Linear dTable.]
Requirements checklist:
- Mostly-contiguous IDs ✔
- Most live in the US ✔
- Some live elsewhere ✔
- Don’t penalize them ✔
- Occasionally relocate

General dTables
• We’ve seen how to build a read-only data store specialized for an application-specific layout
• The pieces can be recombined for other layouts
• Next section shows how to build a writable store
• Writable store dTables are common to many layouts
• Split data write functionality and management policies

Writable dTables
• Array dTable is hard to update transactionally
• Idea: use separate writable dTables
• Can be optimized for writing (e.g. a log)
• Several design questions:
• Implementation of write-optimized dTable
• Building an efficient store from write-optimized and read-only pieces

Fundamental Writable dTable
• Appends new/updated data to a shared journal
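A rough sketch of a writable dTable that appends every change to a journal shared by several tables. The slide states only the append behavior; the record layout `(table_id, key, value)`, the use of `None` as a removal marker, the in-memory index, and the replay-on-open step are assumptions added here for illustration, and the names `Journal` and `JournalDTable` are hypothetical.

```python
class Journal:
    """Shared append-only log (kept in memory here; assumed record layout)."""

    def __init__(self):
        self.records = []

    def append(self, record):
        self.records.append(record)


class JournalDTable:
    """Writable dTable whose updates are appended to a shared journal."""

    def __init__(self, table_id, journal):
        self._id = table_id
        self._journal = journal
        self._index = {}  # assumed in-memory view of this table's data
        # Replay earlier records so a reopened table sees its past writes.
        for tid, key, value in journal.records:
            if tid == table_id:
                self._apply(key, value)

    def _apply(self, key, value):
        if value is None:              # None marks a removal (assumption)
            self._index.pop(key, None)
        else:
            self._index[key] = value

    def insert(self, key, value):
        self._journal.append((self._id, key, value))
        self._apply(key, value)
        return True

    def remove(self, key):
        self._journal.append((self._id, key, None))
        self._apply(key, None)
        return True

    def lookup(self, key):
        return self._index.get(key)
```

Since writes only ever append, two tables can interleave their records in one journal, and each reconstructs its own state by filtering on its table ID.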
