Design Principles for Scaling Multi-core OLTP Under High Contention

Kun Ren, Jose Faleiro, Daniel Abadi
Yale University

Conflicts: The scourge of database systems
• Logical conflicts
  • Due to data conflicts between transactions, e.g., T1: Read(x); T2: Write(x)
  • Addressed via new correctness criteria, exploiting semantics
• Physical conflicts
  • Due to contention on internal data-structures
  • Addressed via new protocols, DB architectures

… but conflicts are inevitable
• Logical conflicts are application dependent
• Logical conflicts directly result in physical conflicts
We address these physical conflicts in multi-core main-memory DBs

The life of a transaction
[Figure: a transaction T is handed to a thread/process pool]
• Assign a transaction to an “execution context” (a thread or process from the pool)
• Assigned context performs all actions required to execute the transaction
  • Concurrency control
  • Transaction logic
  • Logging
• Deal with conflicts via shared concurrency control meta-data

Example: Logical lock acquisition
[Figure: lock table with entries A, B, C; T1 is on A's lock list; T2, T3, T4, T5 each request A and are appended to A's lock list one at a time]
• Latch bucket
• Add lock request
• Unlatch bucket
Several threads must acquire a single latch
• Synchronization overhead
• Overhead increases with contention
Lock list moves across cores
• Coherence overhead
More synchronization overhead
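The three steps on this slide (latch bucket, add lock request, unlatch bucket) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `LockTable` class, its method names, the bucket count, and the assumption that every lock is exclusive (so only the head of the list is granted) are all hypothetical.

```python
import threading
from collections import defaultdict, deque


class LockTable:
    """Hypothetical hash-bucketed lock table with one latch per bucket."""

    def __init__(self, num_buckets=4):
        self.latches = [threading.Lock() for _ in range(num_buckets)]
        self.buckets = [defaultdict(deque) for _ in range(num_buckets)]

    def request_lock(self, key, txn_id):
        b = hash(key) % len(self.buckets)
        with self.latches[b]:                 # latch bucket
            lock_list = self.buckets[b][key]
            lock_list.append(txn_id)          # add lock request
            granted = lock_list[0] == txn_id  # assumed: exclusive locks, head is granted
        return granted                        # bucket is unlatched on leaving `with`

    def lock_list(self, key):
        b = hash(key) % len(self.buckets)
        with self.latches[b]:
            return list(self.buckets[b][key])


def run_conventional(num_txns=5):
    """Every transaction thread latches the same bucket to request record "A"."""
    table = LockTable()
    granted = {}

    def worker(txn_id):
        granted[txn_id] = table.request_lock("A", txn_id)

    threads = [threading.Thread(target=worker, args=(t,))
               for t in range(1, num_txns + 1)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return table, granted
```

Every thread that wants record "A" goes through the same latch and mutates the same list, so the number of contenders on that latch is limited only by the number of threads in the system.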

The result?
[Figure: throughput vs. number of threads]

Dealing with contention on few cores

Dealing with contention on lots of cores

Observations
• Contention for lock list depends on workload, not implementation
• Latches can be made as fine-grained as possible
  • E.g., bucket-level latches
  • But if records are popular, fine-grained latching will not help

Every protocol has the same overheads
• Concurrency control protocols use object meta-data
  • Lock lists in locking
  • Timestamps in timestamp ordering, MVCC, OCC
• Object meta-data is accessible by any thread (scalability anti-pattern)
  • E.g., threads update read and write timestamps in timestamp ordering
  • E.g., threads manipulate lock lists in 2PL
• Globally updatable shared meta-data is the problem
  • Synchronization, coherence overheads
  • No bound on threads contending for the same meta-data
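To make the timestamp-ordering case concrete, here is a minimal sketch of the textbook basic timestamp-ordering rules (without the Thomas write rule). The `Record` layout, the per-record latch, and the function names are assumptions made for illustration. The point to notice is that even a read writes to the shared meta-data (`rts`), so any thread touching a popular record contends on it.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class Record:
    """Hypothetical record with globally updatable timestamp meta-data."""
    value: int = 0
    rts: int = 0  # largest timestamp of any transaction that read the record
    wts: int = 0  # timestamp of the transaction that last wrote the record
    latch: threading.Lock = field(default_factory=threading.Lock)


def to_read(rec, ts):
    """Return (ok, value); ok is False if the reader must abort."""
    with rec.latch:
        if ts < rec.wts:               # a younger transaction already wrote
            return False, None
        rec.rts = max(rec.rts, ts)     # a read still updates shared meta-data
        return True, rec.value


def to_write(rec, ts, value):
    """Return True on success, False if the writer must abort."""
    with rec.latch:
        if ts < rec.rts or ts < rec.wts:   # a younger transaction read or wrote
            return False
        rec.value = value
        rec.wts = ts
        return True
```

For example, after `to_write(rec, 5, 42)` succeeds, a read at timestamp 3 is rejected, while a read at timestamp 7 succeeds and raises `rec.rts` to 7.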

Need a mechanism to bound contention on shared meta-data

Decouple concurrency control and execution
• Delegate concurrency control to a specific set of threads
• These threads are responsible for performing only concurrency control logic
• Access to concurrency control meta-data is mediated via concurrency control threads

Communication via message-passing
• No data sharing between concurrency control and execution threads
• Concurrency control and execution threads interact via explicit message-passing
  • Like RPC in distributed systems

Example: Logical lock acquisition
[Figure: records A, B, C are each managed by a concurrency control thread (CC A, CC B, CC C); T1 is on A's lock list; T2, T3, T4, T5 each enqueue a lock request for A, and CC A adds them to A's lock list one at a time]
• Enqueue lock request
• Acquire lock
One consumer & producer per queue
• Bounded contention per queue
One core manipulates lock list
• List cannot “bounce” around cores
• List likely remains cached under high contention
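The delegated scheme can be sketched in the same style. This is an illustration under stated assumptions rather than the authors' system: the function names, the `STOP` sentinel, the polling loop, and the exclusive-lock grant rule are hypothetical. Python's `queue.Queue` is internally synchronized, so the sketch models only the structure (one producer and one consumer per queue, lock lists touched by a single thread), not the performance of a real single-producer/single-consumer queue.

```python
import queue
import threading
import time
from collections import defaultdict, deque

STOP = object()  # sentinel: this execution thread will send nothing more


def cc_thread(request_queues, reply_queues, lock_lists):
    """Only this thread ever reads or writes lock_lists."""
    open_queues = set(range(len(request_queues)))
    while open_queues:
        idle = True
        for i in sorted(open_queues):          # poll one queue per execution thread
            try:
                msg = request_queues[i].get_nowait()
            except queue.Empty:
                continue
            idle = False
            if msg is STOP:
                open_queues.discard(i)
                continue
            key, txn_id = msg
            lock_list = lock_lists[key]
            lock_list.append(txn_id)           # acquire lock: add to lock list
            reply_queues[i].put(lock_list[0] == txn_id)
        if idle:
            time.sleep(0.0005)                 # yield instead of spinning hard


def execution_thread(i, txn_id, request_queues, reply_queues, granted):
    request_queues[i].put(("A", txn_id))       # enqueue lock request
    granted[txn_id] = reply_queues[i].get()    # wait for the CC thread's reply
    request_queues[i].put(STOP)


def run_delegated(num_txns=5):
    request_queues = [queue.Queue() for _ in range(num_txns)]
    reply_queues = [queue.Queue() for _ in range(num_txns)]
    lock_lists = defaultdict(deque)
    granted = {}
    cc = threading.Thread(target=cc_thread,
                          args=(request_queues, reply_queues, lock_lists))
    workers = [threading.Thread(target=execution_thread,
                                args=(i, i + 1, request_queues, reply_queues, granted))
               for i in range(num_txns)]
    cc.start()
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    cc.join()
    return lock_lists, granted
```

Each queue has exactly one producer (an execution thread) and one consumer (the concurrency control thread), and the lock list for "A" is never touched by any other thread, which is the bounded-contention property the slide describes.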

TPC-C NewOrder and Payment
• 16 Warehouses
• 80 core machine
[Figure: throughput (txns/sec, axis from 0.0 M to 3.0 M) vs. number of CPU cores (10, 20, 40, 60, 80); series: Delegated, Conventional]

Observations
• Could be adapted to any concurrency control protocol
  • Indeed, to any multi-core DB sub-system
• Key idea: Delegate functionality to threads
  • E.g., concurrency control vs. execution
• Message-passing for communication
  • Message-passing may be inevitable on heterogeneous hardware

Examples of delegating functionality
• Delegating functionality has been successfully used in a variety of domains
  • Multi-core indexing: Physiological partitioning (PLP), PALM
  • Distributed OCC validation: Hyder, Centiman
  • Multi-core MVCC: Bohm, Lazy transactions

Conclusions
• DB implementations cannot circumvent workload conflicts
• Workload conflicts result in data-structure contention
• Transaction-to-thread assignment causes unbounded data-structure contention
• Delegate functionality to threads to bound contention

If your DB is in this position…
