Design Principles for Scaling Multi-core OLTP Under High Contention - PowerPoint PPT Presentation


  1. Design Principles for Scaling Multi-core OLTP Under High Contention Kun Ren, Jose Faleiro, Daniel Abadi (Yale University)

  2. Conflicts: The scourge of database systems • Logical conflicts • Due to data conflicts between transactions: T1: Read(x); T2: Write(x) • Physical conflicts • Due to contention on internal data-structures

  3. Conflicts: The scourge of database systems • Logical conflicts • Due to data conflicts between transactions: T1: Read(x); T2: Write(x) • Addressed via new correctness criteria, exploiting semantics • Physical conflicts • Due to contention on internal data-structures • Addressed via new protocols, DB architectures

  4. … but conflicts are inevitable • Logical conflicts are application dependent • Logical conflicts directly result in physical conflicts

  5. … but conflicts are inevitable • Logical conflicts are application dependent • Logical conflicts directly result in physical conflicts We address these physical conflicts in multi-core main-memory DBs

  6. The life of a transaction (diagram: the thread/process pool)

  7. The life of a transaction (diagram: a transaction T arrives at the thread/process pool)

  8. The life of a transaction • Assign a transaction to an “execution context” from the thread/process pool • Assigned context performs all actions required to execute the transaction • Concurrency control • Transaction logic • Logging • Deal with conflicts via shared concurrency control meta-data

  9. The life of a transaction • Assign a transaction to an “execution context” from the thread/process pool • Assigned context performs all actions required to execute the transaction • Concurrency control • Transaction logic • Logging • Deal with conflicts via shared concurrency control meta-data

  10. The life of a transaction • Assign a transaction to an “execution context” from the thread/process pool • Assigned context performs all actions required to execute the transaction • Concurrency control • Transaction logic • Logging • Deal with conflicts via shared concurrency control meta-data
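A minimal sketch of the conventional model described on slides 8-10, in which the worker thread assigned to a transaction performs concurrency control, the transaction logic, and logging itself. All type and function names (Txn, LockManager, Log, execute_conventional) are illustrative assumptions, not an API from the talk:

```cpp
#include <functional>
#include <vector>

// Illustrative placeholder types; the talk does not name a concrete API.
struct Txn {
    std::vector<int> read_set;     // record ids the transaction reads
    std::vector<int> write_set;    // record ids the transaction writes
    std::function<void()> logic;   // the application's transaction logic
};

struct LockManager {
    // Stubs: a real lock manager would latch a bucket and edit its lock list.
    void acquire(int /*record*/, bool /*exclusive*/) {}
    void release_all(const Txn& /*t*/) {}
};

struct Log {
    void append(const Txn& /*t*/) {}   // stub for the commit-log write
};

// Conventional model: the thread handed the transaction does everything,
// touching shared concurrency-control meta-data directly.
void execute_conventional(Txn& t, LockManager& lm, Log& log) {
    for (int r : t.read_set)  lm.acquire(r, /*exclusive=*/false);  // concurrency control
    for (int r : t.write_set) lm.acquire(r, /*exclusive=*/true);
    t.logic();        // transaction logic
    log.append(t);    // logging
    lm.release_all(t);
}
```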

  11. Example: Logical lock acquisition (diagram: lock-table buckets A, B, C; T1 already holds a lock on record A; T2 arrives with a request for A)

  12. Example: Logical lock acquisition • Latch bucket (diagram: T2 latches A's bucket)

  13. Example: Logical lock acquisition • Latch bucket • Add lock request (diagram: T2 is appended to A's lock list behind T1)

  14. Example: Logical lock acquisition • Latch bucket • Add lock request • Unlatch bucket

  15. Example: Logical lock acquisition • Latch bucket • Add lock request • Unlatch bucket (diagram: T2, T3, T4, T5 all request record A)

  16. Example: Logical lock acquisition • Latch bucket • Add lock request • Unlatch bucket • Several threads must acquire a single latch: synchronization overhead that increases with contention (diagram: T2, T3, T4, T5 all request record A)

  17. Example: Logical lock acquisition • Latch bucket • Add lock request • Unlatch bucket • Lock list moves across cores: coherence overhead (diagram: T2, T3, T4, T5 all request record A)

  18. Example: Logical lock acquisition • Latch bucket • Add lock request • Unlatch bucket • Lock list moves across cores: coherence overhead (diagram: A's lock list holds T1; T2, T3, T4, T5 wait to add their requests)

  19. Example: Logical lock acquisition • Latch bucket • Add lock request • Unlatch bucket • Lock list moves across cores: coherence overhead (diagram: T2's request is added; the list moves to T2's core)

  20. Example: Logical lock acquisition • Latch bucket • Add lock request • Unlatch bucket • Lock list moves across cores: coherence overhead (diagram: T3's request is added; the list moves to T3's core)

  21. Example: Logical lock acquisition • Latch bucket • Add lock request • Unlatch bucket • Lock list moves across cores: coherence overhead (diagram: T4's request is added; the list moves to T4's core)

  22. Example: Logical lock acquisition • Latch bucket • Add lock request • More synchronization overhead • Unlatch bucket (diagram: T2, T3, T4, T5 still contend for bucket A's latch)
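A hedged sketch of the bucket-latch pattern the example walks through (latch bucket, add lock request, unlatch bucket). The struct and field names are assumptions for illustration, not the paper's code; the point is that every requesting thread takes the same latch and writes the same lock list:

```cpp
#include <cstdint>
#include <deque>
#include <mutex>

// One hash bucket of the lock table. Every thread that wants a lock on a
// record hashing to this bucket must take the same latch, so under high
// contention the latch and the lock list become the bottleneck.
struct LockBucket {
    std::mutex latch;                    // "Latch bucket" / "Unlatch bucket"
    std::deque<uint64_t> lock_requests;  // "Add lock request": queue of txn ids
};

// Conventional acquisition path: called directly by the executing thread.
// The bucket's cache lines bounce between the cores of all requesters.
void request_lock(LockBucket& bucket, uint64_t txn_id) {
    std::lock_guard<std::mutex> guard(bucket.latch);  // Latch bucket
    bucket.lock_requests.push_back(txn_id);           // Add lock request
}                                                     // Unlatch bucket
```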

  23. The result? (graph: throughput vs. number of threads)

  24. Dealing with contention on few cores

  25. Dealing with contention on lots of cores

  26. Observations • Contention for lock list depends on workload, not implementation • Latches can be made as fine-grained as possible • E.g., bucket-level latches • But if records are popular, fine-grained latching will not help

  27. Every protocol has the same overheads • Concurrency control protocols use object meta-data • Lock lists in locking • Timestamps in timestamp ordering, MVCC, OCC • Object meta-data is accessible by any thread • E.g., threads update read and write timestamps in timestamp ordering • E.g., threads manipulate lock lists in 2PL • Globally updatable shared meta-data is the problem • Synchronization, coherence overheads • No bound on threads contending for the same meta-data

  28. Every protocol has the same overheads • Concurrency control protocols use object meta-data • Lock lists in locking • Timestamps in timestamp ordering, MVCC, OCC • Object meta-data is accessible by any thread • E.g., threads update read and write timestamps in timestamp ordering • E.g., threads manipulate lock lists in 2PL • Globally updatable shared meta-data is the problem • Synchronization, coherence overheads • No bound on threads contending for the same meta-data (callout: the scalability anti-pattern)
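As a hedged illustration of the anti-pattern (not code from the talk): in timestamp ordering, every reader may bump a record's shared read timestamp, so an unbounded number of threads can contend on the same word. The RecordMeta type and record_read function below are assumptions for illustration only:

```cpp
#include <atomic>
#include <cstdint>

// Per-record meta-data in a timestamp-ordering protocol (illustrative).
struct RecordMeta {
    std::atomic<uint64_t> read_ts{0};   // largest timestamp that has read the record
    std::atomic<uint64_t> write_ts{0};  // largest timestamp that has written it
};

// Every reading thread, on every read, may update the shared read timestamp.
// Any number of threads can hammer this one cache line: the "globally
// updatable shared meta-data" problem the slide describes.
void record_read(RecordMeta& meta, uint64_t my_ts) {
    uint64_t cur = meta.read_ts.load(std::memory_order_relaxed);
    while (cur < my_ts &&
           !meta.read_ts.compare_exchange_weak(cur, my_ts,
                                               std::memory_order_relaxed)) {
        // cur is refreshed by compare_exchange_weak on failure; retry.
    }
}
```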

  29. Need a mechanism to bound contention on shared meta-data

  30. Decouple concurrency control and execution • Delegate concurrency control to a specific set of threads • These threads are responsible for performing only concurrency control logic • Access to concurrency control meta-data is mediated via concurrency control threads

  31. Communication via message-passing • No data sharing between concurrency control and execution threads • Concurrency control and execution threads interact via explicit message-passing • Like RPC in distributed systems
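A minimal sketch of the message-passing split described on slides 30-31. The LockRequest and RequestQueue names are assumptions; a mutex-guarded queue keeps the sketch short, whereas a real design would likely use a lock-free single-producer/single-consumer ring buffer per (execution thread, concurrency control thread) pair:

```cpp
#include <cstdint>
#include <mutex>
#include <queue>

// Lock request sent from an execution thread to a concurrency-control (CC)
// thread, "like RPC in distributed systems": no shared CC meta-data.
struct LockRequest {
    uint64_t txn_id;
    uint64_t record_id;
    bool     exclusive;
};

// One inbox per (execution thread, CC thread) pair.
class RequestQueue {
public:
    void send(const LockRequest& req) {        // called by the execution thread
        std::lock_guard<std::mutex> g(m_);
        q_.push(req);
    }
    bool try_receive(LockRequest* out) {       // polled by the owning CC thread
        std::lock_guard<std::mutex> g(m_);
        if (q_.empty()) return false;
        *out = q_.front();
        q_.pop();
        return true;
    }
private:
    std::mutex m_;
    std::queue<LockRequest> q_;
};
```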

  32. Example: Logical lock acquisition (diagram: concurrency control threads CC-A, CC-B, CC-C each own one of the buckets A, B, C; T1 already holds a lock on record A; the execution thread running T2 prepares a request for A)

  33. Example: Logical lock acquisition • Enqueue lock request (diagram: T2's request is placed on CC-A's queue)

  34. Example: Logical lock acquisition • Add to lock list (diagram: CC-A appends T2 to A's lock list)

  35. Example: Logical lock acquisition • Enqueue lock request • Acquire lock (diagram: T2, T3, T4, T5 all enqueue requests for record A with CC-A)

  36. Example: Logical lock acquisition • Enqueue lock request • Acquire lock • One consumer & producer per queue • Bounded contention per queue

  37. Example: Logical lock acquisition • Enqueue lock request • Acquire lock • One consumer & producer per queue • Bounded contention per queue

  38. Example: Logical lock acquisition • Enqueue lock request • Acquire lock • One core manipulates the lock list • List cannot “bounce” around cores • List likely remains cached under high contention (diagram: CC-A begins draining the queued requests from T2, T3, T4, T5)

  39. Example: Logical lock acquisition • Enqueue lock request • Acquire lock • One core manipulates the lock list • List cannot “bounce” around cores • List likely remains cached under high contention (diagram: CC-A adds T2 to A's lock list)

  40. Example: Logical lock acquisition • Enqueue lock request • Acquire lock • One core manipulates the lock list • List cannot “bounce” around cores • List likely remains cached under high contention (diagram: CC-A adds T3 to A's lock list)

  41. Example: Logical lock acquisition • Enqueue lock request • Acquire lock • One core manipulates the lock list • List cannot “bounce” around cores • List likely remains cached under high contention (diagram: CC-A adds T4 to A's lock list)

  42. Example: Logical lock acquisition • Enqueue lock request • Acquire lock • One core manipulates the lock list • List cannot “bounce” around cores • List likely remains cached under high contention (diagram: CC-A adds T5 to A's lock list)
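A hedged sketch of the delegated side of the example: a single CC thread (CC-A above) drains the per-execution-thread queues and is the only thread that ever touches A's lock list, so the list needs no latch and tends to stay cached on one core. It reuses the hypothetical RequestQueue and LockRequest types from the earlier sketch; the loop structure is an assumption, not the paper's implementation:

```cpp
#include <cstdint>
#include <deque>
#include <unordered_map>
#include <vector>

// CC thread owning one partition of the lock table (e.g., record A's bucket).
// Only this thread manipulates the lock lists, so they cannot "bounce"
// between requesters' cores.
void cc_thread_loop(std::vector<RequestQueue*>& inboxes,
                    std::unordered_map<uint64_t, std::deque<uint64_t>>& lock_lists) {
    for (;;) {
        for (RequestQueue* inbox : inboxes) {      // one inbox per execution thread
            LockRequest req{};
            while (inbox->try_receive(&req)) {     // bounded contention: one producer, one consumer
                lock_lists[req.record_id].push_back(req.txn_id);  // "Add to lock list", no latch needed
                // Granting the lock (replying to the execution thread once the
                // request reaches the head of the list) is omitted for brevity.
            }
        }
    }
}
```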

  43. TPC-C NewOrder and Payment • 16 warehouses • 80-core machine

  44. TPC-C NewOrder and Payment (graph: throughput in txns/sec, 0.0 M to 3.0 M, vs. number of CPU cores from 10 to 80, comparing Delegated and Conventional)

  45. Observations • Could be adapted to any concurrency control protocol • Indeed, to any multi-core DB sub-system • Key idea: Delegate functionality to threads • E.g., concurrency control vs. execution • Message-passing for communication • Message-passing may be inevitable on heterogeneous hardware

  46. Examples of delegating functionality • Delegating functionality has been successfully used in a variety of domains • Multi-core indexing: Physiological partitioning (PLP), PALM • Distributed OCC validation: Hyder, Centiman • Multi-core MVCC: Bohm, Lazy transactions

  47. Conclusions • DB implementations cannot circumvent workload conflicts • Workload conflicts result in data-structure contention • Transaction-to-thread assignment causes unbounded data-structure contention • Delegate functionality to threads to bound contention

  48. If your DB is in this position…
