
UNIST UNIST Hanyang Univ. UNIST/SKKU Fast but Asymmetric - PowerPoint PPT Presentation



  1. Deukyeon Hwang (UNIST), Wook-Hee Kim (UNIST), Youjip Won (Hanyang Univ.), Beomseok Nam (UNIST/SKKU)

  2. Non-Volatility, Byte-Addressability, Large Capacity. Access Latency: Fast but Asymmetric

  3. CPU Caches (Volatile) and Persistent Memory (Non-Volatile). [Figure: keys 10 20 30 40 being shifted across cache lines; cache line FLUSH; "LOST 40!"]

  4. Inserting 25 into a node: (0) 10 20 30 40; (1) 10 20 30 40 40; (2) 10 20 30 30 40; (3) 10 20 25 30 40. Partially updated tree node is inconsistent → Append-Only Update

  5. Node Split. Before: Node A with keys 10 20 30 40 60 and pointers P1 P2 P3 P4 P6 ʌ. After: Node A with keys 10 20 30 and pointers P1 P2 P3 ʌ; Node B with keys 40 60 and pointers P4 P6 ʌ. Logging → Selective Persistence (Internal node in DRAM)

  6. ▪ Append-Only
       • Unsorted keys
     ▪ Selective Persistence
       • Internal node → DRAM
       • Internal nodes have to be reconstructed from leaf nodes after failures
       • Logging for leaf nodes
     ▪ Previous solutions
       • NV-Tree [FAST’15]: Append-Only leaf update + Selective Persistence
       • wB+-Tree [VLDB’15]: Append-Only node update + bitmap/slot array metadata
       • FP-Tree [SIGMOD’16]: Append-Only leaf update + fingerprints + Selective Persistence

  7. Failure-Atomic ShifT (FAST) instead of Append-Only (Unsorted keys); Failure-Atomic In-place Rebalancing (FAIR) instead of Selective Persistence (DRAM + PM); Lock-Free Search

  8. ▪ Modern processors reorder instructions to utilize the memory bandwidth
     ▪ Memory ordering in x86 and ARM

                               x86   ARM
         stores-after-stores    Y     N
         stores-after-loads     N     N
         loads-after-stores     N     N
         loads-after-loads      N     N
         Inst. w/ dependency    Y     Y

     ▪ x86 guarantees Total Store Ordering (TSO)
     ▪ Dependent instructions are not reordered

  9. ▪ Pointers in B+-Tree store unique memory addresses
     ▪ 8-byte pointer can be atomically updated
     ▪ Read transactions detect transient inconsistency between duplicate pointers
     ▪ Transient inconsistency
       • In-flight state partially updated by a write transaction
     Example: keys 10 20 30 40 40; pointers P1 P2 P3 P4 P5 P5
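
The duplicate-pointer rule can be illustrated with a small single-threaded model. This is only a sketch: the names `GARBAGE` and `valid_keys`, the parallel-list layout, and the pairing of key slot i with pointer slots i and i+1 are assumptions made for illustration, not details taken from the slides.

```python
GARBAGE = "g"  # placeholder for an unused key slot (assumed representation)

def valid_keys(keys, ptrs):
    """Return the keys a reader would accept.

    Assumed layout: key slot i sits between pointer slots i and i+1, and the
    pointer array is terminated by None (the null pointer on the slides).
    A key whose two neighbouring pointers are equal is treated as an
    in-flight duplicate and skipped.
    """
    accepted = []
    for i, key in enumerate(keys):
        left, right = ptrs[i], ptrs[i + 1]
        if right is None:      # reached the end of the node
            break
        if left == right:      # duplicate pointers: transient state, ignore key
            continue
        accepted.append(key)
    return accepted

# Keys 10 20 30 40 40 with pointers P1 P2 P3 P4 P5 P5: the second 40 is hidden.
keys = [10, 20, 30, 40, 40, GARBAGE]
ptrs = ["P1", "P2", "P3", "P4", "P5", "P5", None]
print(valid_keys(keys, ptrs))
```

In this model a reader needs no lock: it only compares two adjacent 8-byte pointers before trusting the key between them.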

  10. mfence(); mfence(); TSO. Node state: keys 10 20 30 40, pointers P1 P2 P3 P4 P5 P5 → keys 10 20 30 40 40, pointers P1 P2 P3 P4 P5 P5

  11. Insert (25, P6) into a node using FAST (g: Garbage, ʌ: Null). Keys: 10 20 30 40 g g. Pointers: P1 P2 P3 P4 P5 ʌ ʌ. Read transactions can succeed in finding a key even if a system crashes in any step

  12. Insert (25, P6) into a node using FAST. Keys: 10 20 30 40 g g. Pointers: P1 P2 P3 P4 P5 P5 ʌ

  13. Insert (25, P6) into a node using FAST. Keys: 10 20 30 40 40 g. Pointers: P1 P2 P3 P4 P5 P5 ʌ

  14. Insert (25, P6) into a node using FAST. Keys: 10 20 30 40 40 g. Pointers: P1 P2 P3 P4 P5 P5 ʌ

  15. Insert (25, P6) into a node using FAST, with a concurrent read transaction. Keys: 10 20 30 40 40 g. Pointers: P1 P2 P3 P4 P5 P5 ʌ. Key 40 between duplicate pointers is ignored!

  16. Insert (25, P6) into a node using FAST. Keys: 10 20 30 40 40 g. Pointers: P1 P2 P3 P4 P4 P5 ʌ. Shifting P4 invalidates the left 40

  17. Insert (25, P6) into a node using FAST. Keys: 10 20 30 30 40 g. Pointers: P1 P2 P3 P4 P4 P5 ʌ

  18. Insert (25, P6) into a node using FAST. Keys: 10 20 30 30 40 g. Pointers: P1 P2 P3 P3 P4 P5 ʌ

  19. Insert (25, P6) into a node using FAST. Keys: 10 20 25 30 40 g. Pointers: P1 P2 P3 P3 P4 P5 ʌ

  20. Insert (25, P6) into a node using FAST. Keys: 10 20 25 30 40 g. Pointers: P1 P2 P3 P6 P4 P5 ʌ. Storing P6 validates 25
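
The store sequence walked through above can be replayed in a single-threaded simulation. The sketch below is an illustration under assumptions, not the authors' implementation: `fast_insert_steps` and `search` are hypothetical names, each Python assignment stands in for one 8-byte store, and the node is assumed to have at least one free slot.

```python
GARBAGE = "g"  # unused key slot (assumed representation)

def search(keys, ptrs, key):
    """Left-to-right search that skips keys between duplicate pointers."""
    for i, k in enumerate(keys):
        if ptrs[i + 1] is None:          # null pointer ends the node
            return None
        if ptrs[i] != ptrs[i + 1] and k == key:
            return ptrs[i + 1]
    return None

def fast_insert_steps(keys, ptrs, key, ptr):
    """Insert (key, ptr) by shifting right; yield after every single store.

    Every yielded state is one that a crash or a concurrent reader
    could observe.
    """
    n = ptrs.index(None) - 1                    # number of live keys
    pos = sum(1 for k in keys[:n] if k < key)   # slot for the new key
    for i in range(n - 1, pos - 1, -1):
        ptrs[i + 2] = ptrs[i + 1]   # shift pointer first (creates a duplicate)
        yield
        keys[i + 1] = keys[i]       # then shift the key into the hidden slot
        yield
    ptrs[pos + 1] = ptrs[pos]       # duplicate pointer hides slot pos
    yield
    keys[pos] = key                 # new key is written but still invisible
    yield
    ptrs[pos + 1] = ptr             # storing the pointer validates the key
    yield

keys = [10, 20, 30, 40, GARBAGE, GARBAGE]
ptrs = ["P1", "P2", "P3", "P4", "P5", None, None]
expected = {10: "P2", 20: "P3", 30: "P4", 40: "P5"}
for _ in fast_insert_steps(keys, ptrs, 25, "P6"):
    # Every pre-existing key stays findable after every single store.
    assert all(search(keys, ptrs, k) == p for k, p in expected.items())
print(keys, ptrs)
```

The loop asserts the property stated on the slides: whichever store the system stops at, a reader still finds every key that was in the node before the insert.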

  21. ▪ It is necessary to call clflush at the boundary of cache line. [Figure: the node spans Cache Line 1 and Cache Line 2. Keys 10 20 30 40 g g with pointers P1 P2 P3 P4 P5 ʌ ʌ become keys 10 20 30 30 40 g with pointers P1 P2 P3 P3 P4 P5 ʌ; mfence(), clflush(Cache Line 2), mfence() are issued at the boundary.]
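
To make the flush placement concrete, here is a minimal sketch that lists the operations of a right shift and inserts a fenced flush whenever the shift leaves a cache line. The 64-byte line size, the 8-byte slot size, the function name `shift_right_ops`, and the final flush of the last touched line are all assumptions for illustration; the sketch only prints operation names and does not issue real `clflush` or `mfence` instructions.

```python
CACHE_LINE = 64   # bytes per cache line (assumption)
SLOT = 8          # bytes per key or pointer slot (assumption)

def shift_right_ops(first_slot, last_slot):
    """List the operations for shifting slots first_slot..last_slot right by one.

    Stores run from the highest destination slot down. Whenever the next
    store falls in a different cache line than the previous one, the line
    just completed is flushed, with a fence on both sides.
    """
    ops = []
    current_line = None
    for slot in range(last_slot + 1, first_slot, -1):   # destination slots
        line = slot * SLOT // CACHE_LINE
        if current_line is not None and line != current_line:
            ops += ["mfence", f"clflush line {current_line}", "mfence"]
        ops.append(f"store slot {slot}")
        current_line = line
    if current_line is not None:
        # Assumed: the last touched line is flushed once the shift is done.
        ops += ["mfence", f"clflush line {current_line}", "mfence"]
    return ops

for op in shift_right_ops(6, 8):
    print(op)
```

With these sizes, slots 8 and 9 share one line and slot 7 lies in the previous one, so the shift of slots 6 to 8 issues one flush at the boundary and one at the end; a shift that stays inside a single line issues only the final flush.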

  22. ▪ Let’s avoid expensive logging by making read transactions aware of rebalancing operations
      ▪ B-link-Tree [Figure: keys 10 20 30 40 70 80 90]

  23. FAIR split a node. Node A: keys 10 20 30 40 60, pointers P1 P2 P3 P4 P6 ʌ. Node B: keys 40 60, pointers P4 P6 ʌ. A read transaction can detect transient inconsistency if keys are out of order

  24. FAIR split a node. Node A: keys 10 20 30, pointers P1 P2 P3 ʌ. Node B: keys 40 60, pointers P4 P6 ʌ. Setting NULL pointer validates Node B. Node A and Node B are virtually a single node

  25. FAIR split a node. Node A: keys 10 20 30, pointers P1 P2 P3 ʌ. Node B: keys 40 60, pointers P4 P6 ʌ. Migrated keys can be accessed via sibling pointer

  26. FAIR split a node. Node A: keys 10 20 30, pointers P1 P2 P3 ʌ. Node B: keys 40 50 60, pointers P4 P5 P6 ʌ
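
The split steps above can be modelled in a few lines. This is a simplified sketch under stated assumptions: `Node`, `live`, `search`, and `fair_split_steps` are hypothetical names, each key is paired with the pointer in the same slot, and the reader simply follows sibling pointers until it finds the key, without the out-of-order key check mentioned on the slides.

```python
class Node:
    def __init__(self, keys, ptrs):
        self.keys = list(keys)
        self.ptrs = list(ptrs)   # ptrs[i] belongs to keys[i]; None ends the node
        self.sibling = None

    def live(self):
        """(key, ptr) pairs up to the first null pointer."""
        pairs = []
        for k, p in zip(self.keys, self.ptrs):
            if p is None:
                break
            pairs.append((k, p))
        return pairs

def search(node, key):
    """Look in the node, then follow sibling pointers."""
    while node is not None:
        for k, p in node.live():
            if k == key:
                return p
        node = node.sibling
    return None

def fair_split_steps(a):
    """Split node `a`; yield after each step a reader could observe."""
    entries = a.live()
    mid = (len(entries) + 1) // 2
    b = Node([k for k, _ in entries[mid:]], [p for _, p in entries[mid:]] + [None])
    b.sibling = a.sibling
    a.sibling = b          # upper half now exists in both A and B
    yield "linked"
    a.ptrs[mid] = None     # null pointer truncates A; B alone holds the upper half
    yield "truncated"

a = Node([10, 20, 30, 40, 60], ["P1", "P2", "P3", "P4", "P6", None])
expected = dict(a.live())
for step in fair_split_steps(a):
    # No key disappears at any observable point of the split.
    assert all(search(a, k) == p for k, p in expected.items())

b = a.sibling
b.keys.insert(1, 50)       # plain list insert, standing in for a FAST insert
b.ptrs.insert(1, "P5")
print(a.live(), b.live(), search(a, 50))
```

Between the two steps the upper half of the keys is reachable twice, which is harmless for a reader; after the null pointer is stored, the migrated keys are reachable only through the sibling pointer.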

  27. Insert a key into the parent node using FAST after FAIR split. Node R (root): keys 10 70 70, pointers C2 C3 C3. Node A: 10 20 30. Node B: 40 50 60. Node C: 70 80 90

  28. Insert a key into the parent node using FAST after FAIR split. Node R (root): keys 10 70 70, pointers C2 C2 C3. Node A: 10 20 30. Node B: 40 50 60. Node C: 70 80 90. Node B can be accessed from Node A

  29. Insert a key into the parent node using FAST after FAIR split. ➢ Searching the key 50 from the root after a system crash. Node R (root): keys 10 70 70 (the key accessed by the read transaction is highlighted), pointers C2 C2 C3. Node A: 10 20 30. Node B: 40 50 60. Node C: 70 80 90. Node B can be accessed from Node A

  30. Insert a key into the parent node using FAST after FAIR split. Node R (root): keys 10 40 70, pointers C2 C4 C3. Node A: 10 20 30. Node B: 40 50 60. Node C: 70 80 90. FAST inserting makes Node B visible atomically
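
The search for key 50 before and after the parent update can be sketched as follows. The model is deliberately simplified and its names are assumptions: `Leaf`, `descend`, and `search` are hypothetical, and the root is represented as a list of (separator, child) pairs with invalid duplicate entries already filtered out.

```python
class Leaf:
    def __init__(self, name, keys):
        self.name = name
        self.keys = keys
        self.sibling = None

def descend(root, key):
    """Pick the child of the last separator that is <= key."""
    child = root[0][1]
    for separator, node in root:
        if separator <= key:
            child = node
    return child

def search(root, key):
    """Return the names of the leaves visited and whether the key was found."""
    node, visited = descend(root, key), []
    while node is not None:
        visited.append(node.name)
        if key in node.keys:
            return visited, True
        node = node.sibling        # not here: try the sibling
    return visited, False

node_a = Leaf("A", [10, 20, 30])
node_b = Leaf("B", [40, 50, 60])
node_c = Leaf("C", [70, 80, 90])
node_a.sibling, node_b.sibling = node_b, node_c

root_before = [(10, node_a), (70, node_c)]                # parent not yet updated
root_after = [(10, node_a), (40, node_b), (70, node_c)]   # after the insert into the parent

print(search(root_before, 50))
print(search(root_after, 50))
```

Before the parent is updated, the search reaches Node B through Node A's sibling pointer; afterwards it goes to Node B directly. Both states return the key, which is why a crash between the split and the parent update leaves the tree usable.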

  31. Read transactions can tolerate any inconsistency caused by write transactions → Read transactions can access the transient inconsistent tree node being modified by a write transaction → Lock-Free Search

  32. [Example 1] Read transaction: searching 30 (read →). Write transaction: inserting (15, P6) (shift →). Keys: 10 20 30 40 g g. Pointers: P1 P2 P3 P4 P5 ʌ ʌ

  33. [Example 1] Searching 30 (read →) while inserting (15, P6) (shift →). Keys: 10 20 30 40 g g. Pointers: P1 P2 P3 P4 P5 P5 ʌ

  34. [Example 1] Searching 30 (read →) while inserting (15, P6) (shift →). Keys: 10 20 30 40 40 g. Pointers: P1 P2 P3 P4 P5 P5 ʌ

  35. [Example 1] Searching 30 (read →) while inserting (15, P6) (shift →). Keys: 10 20 30 40 40 g. Pointers: P1 P2 P3 P4 P4 P5 ʌ

  36. [Example 1] Searching 30 (read →) while inserting (15, P6) (shift →). Keys: 10 20 30 30 40 g. Pointers: P1 P2 P3 P4 P4 P5 ʌ

  37. [Example 1] Searching 30 (read →) while inserting (15, P6) (shift →). Keys: 10 20 30 30 40 g. Pointers: P1 P2 P3 P3 P4 P5 ʌ

  38. [Example 1] Searching 30 (read →) while inserting (15, P6) (shift →). Keys: 10 20 20 30 40 g. Pointers: P1 P2 P3 P3 P4 P5 ʌ

  39. [Example 1] Searching 30 (read →) while inserting (15, P6) (shift →). Keys: 10 20 20 30 40 g. Pointers: P1 P2 P2 P3 P4 P5 ʌ

  40. [Example 1] Searching 30 (read →) while inserting (15, P6) (shift →). Keys: 10 20 20 30 40 g. Pointers: P1 P2 P2 P3 P4 P5 ʌ. FOUND!

  41. [Example 2] Read transaction: searching 30 (read →). Write transaction: deleting (20, P2) (← shift). Keys: 10 20 30 40 g g. Pointers: P1 P2 P3 P4 P5 ʌ ʌ

  42. [Example 2] Searching 30 (read →) while deleting (20, P2) (← shift). Keys: 10 20 30 40 g g. Pointers: P1 P3 P3 P4 P5 ʌ ʌ

  43. [Example 2] Searching 30 (read →) while deleting (20, P2) (← shift). Keys: 10 30 30 40 g g. Pointers: P1 P3 P3 P4 P5 ʌ ʌ

  44. [Example 2] Searching 30 (read →) while deleting (20, P2) (← shift). Keys: 10 30 30 40 g g. Pointers: P1 P3 P4 P4 P5 ʌ ʌ

  45. [Example 2] Searching 30 (read →) while deleting (20, P2) (← shift). Keys: 10 30 40 40 g g. Pointers: P1 P3 P4 P4 P5 ʌ ʌ

  46. [Example 2] Searching 30 (read →) while deleting (20, P2) (← shift). Keys: 10 30 40 40 g g. Pointers: P1 P3 P4 P5 P5 ʌ ʌ

  47. [Example 2] Searching 30 (read →) while deleting (20, P2) (← shift). Keys: 10 30 40 40 g g. Pointers: P1 P3 P4 P5 P5 ʌ ʌ. 30 NOT FOUND: the read transaction cannot find the key 30 due to the shift operation
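
The missed key in Example 2 can be reproduced by enumerating interleavings of the reader and the writer. The sketch below is a simulation under assumptions: the reader examines one slot per step and reads a key together with its two neighbouring pointers atomically, and the names `delete_stores`, `run`, and `all_schedules` are hypothetical.

```python
from itertools import combinations

GARBAGE = "g"  # unused key slot (assumed representation)

def delete_stores():
    """The five stores of the deletion example, in order (shift to the left)."""
    return [("P", 1, "P3"), ("K", 1, 30), ("P", 2, "P4"), ("K", 2, 40), ("P", 3, "P5")]

def run(schedule, order, target=30):
    """Replay one interleaving: 'W' applies the next store, 'R' lets the
    reader examine its next slot. Returns True if the reader sees `target`."""
    keys = [10, 20, 30, 40, GARBAGE, GARBAGE]
    ptrs = ["P1", "P2", "P3", "P4", "P5", None, None]
    stores, slots = iter(delete_stores()), iter(order)
    for event in schedule:
        if event == "W":
            kind, i, value = next(stores)
            (keys if kind == "K" else ptrs)[i] = value
        else:
            i = next(slots)
            if keys[i] == target and ptrs[i] != ptrs[i + 1]:
                return True
    return False

def all_schedules(writes, reads):
    """Every ordering of `writes` writer steps and `reads` reader steps."""
    total = writes + reads
    for w_positions in combinations(range(total), writes):
        yield ["W" if i in w_positions else "R" for i in range(total)]

left_to_right = list(range(6))
right_to_left = list(reversed(range(6)))
missed_l2r = sum(not run(s, left_to_right) for s in all_schedules(5, 6))
missed_r2l = sum(not run(s, right_to_left) for s in all_schedules(5, 6))
print(missed_l2r, missed_r2l)  # the second number is 0
```

In this model some left-to-right schedules miss key 30, because the key moves from slot 2 to slot 1 behind the reader, while a reader that scans against the shift direction finds it in every schedule.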

  48. ▪ Direction flag: Odd Number: deletion shifts to the left, search must scan from Right to Left. Even Number: insertion shifts to the right, search must scan from Left to Right. Example: counter 2; Insert 25 (shift →); Search 40 (read →). Keys: 10 20 30 40 g g. Pointers: P1 P2 P3 P4 P5 ʌ ʌ

  49. ▪ Direction flag: Odd Number: deletion shifts to the left, search must scan from Right to Left. Even Number: insertion shifts to the right, search must scan from Left to Right. Example: counter 3; Delete 25 (← shift); Search 40 (← read). Keys: 10 20 30 40 g g. Pointers: P1 P2 P3 P4 P5 ʌ ʌ

  50. ▪ Direction flag: Odd Number: deletion shifts to the left, search must scan from Right to Left. Even Number: insertion shifts to the right, search must scan from Left to Right. Example: counter 2, then 3; Delete 25 (← shift); Search 40 (read →). Keys: 10 20 30 40 g g. Pointers: P1 P2 P3 P4 P5 ʌ ʌ. The read transaction has to check the counter once again to make sure the counter has not changed. Otherwise, search the node again.
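
A reader that follows the direction flag and re-checks the counter might look like the sketch below. It is a single-threaded illustration with assumed names (`Node`, `scan`, `search`); the `interfere` argument is a test hook that stands in for a concurrent writer and is not part of any real interface.

```python
GARBAGE = "g"  # unused key slot (assumed representation)

class Node:
    def __init__(self, keys, ptrs, counter):
        self.keys, self.ptrs, self.counter = keys, ptrs, counter

def scan(node, key, left_to_right):
    """Scan the key slots in one direction, skipping keys between duplicate pointers."""
    slots = range(len(node.keys))
    for i in (slots if left_to_right else reversed(slots)):
        if node.keys[i] == key and node.ptrs[i] != node.ptrs[i + 1]:
            return True
    return False

def search(node, key, interfere=None):
    """Scan in the direction given by the counter's parity (even: left to
    right, odd: right to left) and retry if the counter changed meanwhile.
    `interfere` runs once, right after the first scan."""
    attempts = 0
    while True:
        attempts += 1
        before = node.counter
        found = scan(node, key, left_to_right=(before % 2 == 0))
        if interfere is not None:
            interfere(node)
            interfere = None
        if node.counter == before:    # counter unchanged: the result can be trusted
            return found, attempts

def start_delete(node):
    node.counter += 1   # assumed: a writer switching from insert to delete bumps the counter

node = Node([10, 20, 30, 40, GARBAGE, GARBAGE],
            ["P1", "P2", "P3", "P4", "P5", None, None], counter=2)
print(search(node, 40, interfere=start_delete))
```

Here the first scan runs left to right under counter 2, the simulated writer changes the counter to 3, and the reader repeats the scan right to left before returning.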

  51. Dirty reads problem

          Transaction A      Transaction B
          BEGIN
          INSERT 10
          SUSPENDED
                             BEGIN
                             SEARCH 10 (FOUND)
                             COMMIT
          WAKE UP
          ABORT

      The ordering of Transaction A and Transaction B cannot be determined

  52. Isolation Level, from Highest to Lowest: Serializable, Repeatable reads, Read committed, Read uncommitted. Our Lock-Free Search supports low isolation level
