SLIDE 1

Solving Everyday Data Problems with FoundationDB

Ryan Worl (ryantworl@gmail.com) Consultant

SLIDE 2

About Me

  • Ryan Worl
  • @ryanworl on Twitter
  • ryantworl@gmail.com
  • Independent software engineer
  • Today’s real example is from ClickFunnels
  • > 70,000 customers, > 1.8B in payments processed
  • Billions of rows of OLTP data (Amazon Aurora MySQL)
SLIDE 3

Agenda

  • How FoundationDB Works
  • “Everyday” data problems
  • Why FoundationDB can be the solution
  • ClickFunnels’ recent data problem
  • FoundationDB for YOUR data problems
SLIDE 4

1 Coordinators elect & heartbeat Cluster Controller (Paxos)

  • Coordinators store core cluster state, used like ZooKeeper
  • All processes register themselves with the Cluster Controller

SLIDE 5

2 Cluster Controller (CC) assigns Master Role

Master

SLIDE 6

3 CC assigns TLog, Proxy, Resolver, and Storage Roles

Proxy

SLIDE 7

3 CC assigns TLog, Proxy, Resolver, and Storage Roles

Resolver

SLIDE 8

3 CC assigns TLog, Proxy, Resolver, and Storage Roles

TLog

SLIDE 9

3 CC assigns TLog, Proxy, Resolver, and Storage Roles

Storage

SLIDE 10

4 On Start: Your App Connects and Asks CC For Topology

SLIDE 11

4 Client Library Asks a Proxy for Key Range to Storage Mapping

SLIDE 12

4 Data Distribution Runs On Master, Key Map Stored in Database

SLIDE 13

5 Start a Transaction: Ask Master for Latest Version

SLIDE 14

5 Start a Transaction: Ask Master for Latest Version (Batched)

SLIDE 15

6 Perform Reads at Read Version Directly to Storage

[Diagram: the app reads directly from storage servers covering key ranges 00, 55, AF, FF]

SLIDE 16

Consequences

  • All replicas participate in reads
  • Client load balances among different replicas
  • Failures of all but one replica for each range keep the system alive

@ryanworl
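The read-path consequences above can be sketched in a few lines. This is a toy simulation, not the real FDB client: server names and the replica map are illustrative, but the property it demonstrates is the one on the slide, namely that a key range stays readable while any one replica survives.

```python
import random

# Toy replica map: key-range start -> storage servers holding that range.
# Each range is triple-replicated, as in a typical FDB configuration.
REPLICAS = {
    "00": ["ss1", "ss2", "ss3"],
    "55": ["ss4", "ss5", "ss6"],
}

alive = {"ss1", "ss3", "ss4", "ss5", "ss6"}  # ss2 has failed

def read(range_start):
    # Client load balances among the replicas that are still alive.
    candidates = [s for s in REPLICAS[range_start] if s in alive]
    if not candidates:
        raise RuntimeError("all replicas for this range are down")
    return random.choice(candidates)

# The "00" range is still readable even though ss2 is down.
```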

SLIDE 17

7 Buffer Writes Locally Until Commit
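A minimal sketch of the client-side write buffering this step describes: writes stay in a local buffer until commit, and reads consult that buffer first (read-your-writes). The `ToyTransaction` class is a hypothetical stand-in for illustration, not the real FDB client API.

```python
class ToyTransaction:
    def __init__(self, storage):
        self.storage = storage   # committed data (a plain dict here)
        self.writes = {}         # local, uncommitted write buffer

    def set(self, key, value):
        self.writes[key] = value  # buffered locally; nothing sent yet

    def get(self, key):
        if key in self.writes:    # read-your-writes: the buffer wins
            return self.writes[key]
        return self.storage.get(key)

    def commit(self):
        # In FDB the buffered mutations plus conflict ranges go to a
        # Proxy at commit time; here we simply apply them.
        self.storage.update(self.writes)

storage = {"a": 1}
tr = ToyTransaction(storage)
tr.set("b", 2)
assert tr.get("b") == 2              # visible to this transaction
assert storage.get("b") is None      # invisible to everyone else
tr.commit()
assert storage["b"] == 2             # visible after commit
```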

SLIDE 18

8 Commit Part 1: Send R/W Conflict Ranges + Mutations to Proxy

SLIDE 19

8 Part 2: Proxy Batches Txns to Master To Get Commit Version

SLIDE 20

Consequences

  • Master is not a throughput bottleneck
  • Intelligent batching makes Master workload small
  • Conflict ranges and mutations are not sent to Master at all

@ryanworl

SLIDE 21

8 Part 3: Send Conflict Ranges to Resolvers for Conflict Detection
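The optimistic check a Resolver performs can be sketched as follows. This is a simplification for illustration: the real Resolver tracks key *ranges* within a recent window of versions, but the rule is the same. A transaction conflicts if anything it read was written by a transaction that committed after its read version.

```python
# Recently committed writes: list of (commit_version, key).
recent_writes = []

def try_commit(read_version, read_keys, write_keys, next_version):
    # Conflict: someone committed a write to a key we read, after the
    # version we read at. Caller must retry the whole transaction.
    for commit_version, key in recent_writes:
        if commit_version > read_version and key in read_keys:
            return None
    # No conflict: record our writes at the assigned commit version.
    for key in write_keys:
        recent_writes.append((next_version, key))
    return next_version

# t1 read "x" at version 100 and commits a write to "y" at version 101.
assert try_commit(100, {"x"}, {"y"}, 101) == 101
# t2 also started at version 100 but read "y", which t1 just wrote.
assert try_commit(100, {"y"}, {"z"}, 102) is None
```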

SLIDE 22

8 Part 4: If Isolation Passes, Send Mutations to Relevant TLogs

SLIDE 23

8 Part 5: (Async) Storages Pull Mutations from Their Buddy TLogs

SLIDE 24

9 Failure Detection: Cluster Controller Heartbeats

SLIDE 25

10 Initiate Recovery on Any Transaction Role Failure

SLIDE 26

11 Cluster Controller Failure: Coordinators Elect New One

SLIDE 27

12 Storage Server Failure: No Recovery, Repair in Background

SLIDE 28

Status Quo

  • Most apps start uncomplicated
  • One database, one queue
  • … five years later, a dozen data systems

@ryanworl

SLIDE 29

“Everyday” Data Problems?

@ryanworl

https://twitter.com/coda

SLIDE 30

“Microservices”

  • Can make this worse

Service A | Service B | Service C

@ryanworl

SLIDE 31

Why is this a problem?

  • Operational costs
    ○ Administrative costs
    ○ Duplicated data
  • Development costs
    ○ Atomicity mostly ignored in the real world
    ○ Corrupted data extremely common

@ryanworl

SLIDE 32

Why is this a problem?

  • Security costs
    ○ More systems = More risk
  • Error handling never exercised
    ○ “De-coupled”, “redundant”, “fault tolerant” services mostly a myth

@ryanworl

SLIDE 33

Why is this a problem?

  • “Managed cloud services”
    ○ They will never pick up the pieces
    ○ They will reboot the machine for you…
    ○ A weak system run by someone else is still weak
      ■ e.g. data loss from async replication

@ryanworl

SLIDE 34

Why is FoundationDB a solution?

  • Build anything you want or need
  • Multiple systems in one cluster
  • Eventually consistent models are easier to build too
    ○ OLTP → Change Log → OLAP

@ryanworl

SLIDE 35

ClickFunnels’ Recent Data Problem

SLIDE 36

The Everyday Data Problem

  • “Smart Lists” based on user-defined rules
  • Running against billions of rows in an OLTP database
  • Both user-facing and automated (100s of QPS)
SLIDE 37

The Everyday Data Problem

SELECT contacts.id
FROM contacts
LEFT JOIN emails ON emails.contact_id = contacts.id
LEFT JOIN templates ON templates.id = emails.template_id
WHERE ...

SLIDE 38

Breaking it down

  • Data volume = 100s of GB
  • Complex joins and row-oriented storage
  • Indexes can’t satisfy every query efficiently
  • Aurora = single threaded queries
  • Really just set operations on integers at the core...

@ryanworl

SLIDE 39

Bitmap indexes!

SLIDE 40

Bitmap Indexes 101

  • Roaring Bitmap Library (roaringbitmap.org)
  • Space usage proportional to number of set bits
  • Billions of operations per second (SIMD)
  • Easily parallelizable (multi-core and distributed)

@ryanworl
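The rules on these slides reduce to set operations on integer IDs, which is exactly what a bitmap index stores: per predicate, the set of matching row IDs, with bit i set when row i matches. Roaring bitmaps do this compactly and fast; plain Python ints are enough to show the semantics. The contact IDs and predicates here are made up for illustration.

```python
# Bit i set <=> contact i matches the predicate.
opened_email = 0
for cid in (1, 5, 9):
    opened_email |= 1 << cid      # contacts who opened the email

purchased = 0
for cid in (5, 7):
    purchased |= 1 << cid         # contacts who purchased

# Rule "opened but did not purchase" = AND with the complement.
matched = opened_email & ~purchased
ids = [i for i in range(10) if (matched >> i) & 1]
assert ids == [1, 9]
```

With Roaring bitmaps the same AND/OR/AND-NOT operators run over compressed chunks with SIMD, which is where the billions of operations per second come from.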

SLIDE 41

Bitmap Indexes 101

  • Multi-minute evaluation times of rules in Aurora
  • Under 100ms with bitmaps

@ryanworl

SLIDE 42

New Possibilities

  • Evaluating rules on every customer website page view
  • “How many people will this rule match?” in real time
  • Stats and analytics can adapt to this format
  • E.g. unique email opens per hour with rules applied for fancy charts

  • Pages load instantly even for the largest customers

@ryanworl

SLIDE 43

How to get there - Step One

  • Replicate Aurora binlog into FoundationDB
  • Write volume not high enough to worry about sharding
  • Example of a log structure

[“binlog”, VersionStamp] => MySQL Binlog as JSON

@ryanworl
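The log structure on this slide can be sketched as follows. FDB versionstamps are assigned at commit time and are monotonically increasing, so a lexicographic range scan over the "binlog" prefix replays events in commit order. In this toy version a counter stands in for the versionstamp and a dict for the database; the event fields are invented for illustration.

```python
import json

db = {}          # stand-in for the FDB keyspace
next_stamp = 0   # stand-in for the commit-time versionstamp

def append_binlog(event):
    global next_stamp
    key = ("binlog", next_stamp)   # tuple key, like the FDB tuple layer
    db[key] = json.dumps(event)    # MySQL binlog event as JSON
    next_stamp += 1

append_binlog({"table": "contacts", "op": "insert", "id": 42})
append_binlog({"table": "emails", "op": "update", "id": 7})

# A range read over the "binlog" prefix yields events in commit order.
events = [json.loads(db[k]) for k in sorted(db) if k[0] == "binlog"]
assert [e["id"] for e in events] == [42, 7]
```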

SLIDE 44

How to get there - Step Two

  • Chunk bitmaps into small segments (2^18 is fine)
  • Evaluate rules, set bits where rules match
  • One writer at a time for low contention
  • Example of storing a large object among many keys

[“bitmaps”, rule_id, chunk_id] => Bitmap Chunk

SLIDE 45

How to get there - Step Three

  • Do a range read for every chunk for each rule
  • Parallelize by evaluating different ranges
  • Classic fork-join pattern

CORE 1: [“bitmaps”, rule_1, chunk_1] => Chunk
        [“bitmaps”, rule_2, chunk_1] => Chunk
CORE 2: [“bitmaps”, rule_1, chunk_N] => Chunk
        [“bitmaps”, rule_2, chunk_N] => Chunk
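The fork-join pattern above can be sketched with threads standing in for cores or machines. The rule data and the AND combination are invented for illustration; in the real system each worker would do a range read of its `["bitmaps", rule, chunk]` keys and combine the Roaring chunks.

```python
from concurrent.futures import ThreadPoolExecutor

# chunk_id -> bitmap, with Python ints as toy bitmap chunks.
rule_1 = {0: 0b1010, 1: 0b0110}
rule_2 = {0: 0b1000, 1: 0b0100}

def evaluate(chunk_ids):
    # Fork: each worker ANDs the two rules over its slice of chunks.
    return {c: rule_1.get(c, 0) & rule_2.get(c, 0) for c in chunk_ids}

with ThreadPoolExecutor(max_workers=2) as pool:
    # Core 1 gets chunk 0, core 2 gets chunk 1.
    parts = pool.map(evaluate, [[0], [1]])

# Join: merge the per-chunk results back into one answer.
result = {}
for part in parts:
    result.update(part)
assert result == {0: 0b1000, 1: 0b0100}
```

Because chunks never overlap, workers share nothing and the join step is a plain merge, which is why this distributes later with little extra work.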

SLIDE 46

Experimental Results

  • Real-world queries take ~100ms
  • One large box today
  • Distributed later with little extra work
  • HA from auto-scaling group and load balancer
  • < 3000 lines of JavaScript + RoaringBitmap

@ryanworl

SLIDE 47

YOUR Everyday Data Problems

  • FoundationDB’s performance
    ○ Concurrency Potential
    ○ Coordination Avoidance
  • Break down the transaction critical path

@ryanworl

SLIDE 48

https://en.wikipedia.org/wiki/Amdahl%27s_law
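Amdahl's law, which this slide links, is why breaking down the transaction critical path matters: if a fraction p of the work parallelizes across n workers, overall speedup is 1 / ((1 - p) + p / n), so the serial part caps the gain no matter how many workers you add.

```python
def amdahl_speedup(p, n):
    # p: parallelizable fraction of the work, n: number of workers.
    return 1.0 / ((1.0 - p) + p / n)

# 95% parallel work on 8 workers gives well under 8x...
assert round(amdahl_speedup(0.95, 8), 2) == 5.93
# ...and even infinite workers can never beat 1/0.05 = 20x.
assert amdahl_speedup(0.95, 10**9) < 20
```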

SLIDE 49

~ 275 allocations / second, 160ms latency and growing

https://www.activesphere.com/blog/2018/08/05/high-contention-allocator

SLIDE 50

> 3500 allocations / second ~ 13ms latency @ high concurrency

https://www.activesphere.com/blog/2018/08/05/high-contention-allocator

SLIDE 51

YOUR Everyday Data Problems

  • Tables, logs, queues, secondary indexes
  • Simple to implement with little code
  • Freedom to build your exact solution
  • … without the explosion of data systems
  • One cluster to manage

@ryanworl

SLIDE 52

Questions

  • Email or tweet me if you have questions or want to talk about specific use cases for FoundationDB

@ryanworl ryantworl@gmail.com