Write-Optimized and High-Performance Hashing Index Scheme for Persistent Memory
Pengfei Zuo, Yu Hua, Jie Wu Huazhong University of Science and Technology, China
13th USENIX Symposium on Operating Systems Design and Implementation (OSDI), 2018
Persistent Memory (PM)
➢ Non-volatile memory (NVM) used as persistent memory is expected to replace or complement DRAM as main memory
– Non-volatility, low power, large capacity

                      PCM       ReRAM     DRAM
  Read (ns)           20-70     20-50     10
  Write (ns)          150-220   70-140    10
  Non-volatility      √         √         ×
  Standby power       ~0        ~0        High
  Density (Gb/cm²)    13.5      24.5      9.1
Index Structures in DRAM vs PM
➢ Index structures are critical for memory & storage systems
➢ Traditional indexing techniques originally designed for DRAM become inefficient in PM
– Hardware limitations of NVM
– The requirement of data consistency
(Figure: a CPU with volatile caches persisting data to non-volatile memory)
Tree-based vs Hashing Index Structures
➢ Tree-based index structures
– Pros: good for range queries
– Cons: O(log(n)) time complexity for point queries
– Tree-based indexes for PM have been widely studied
➢ Hashing index structures
– Pros: constant time complexity for point queries
– Cons: do not support range queries
– Widely used in main-memory systems, e.g., Memcached and Redis
– When maintained in PM, multiple non-trivial challenges exist
Challenges of Hashing Indexes for PM
① High overhead for consistency guarantees
– Ordering memory writes
– Avoiding partial updates for non-atomic writes
(Figure: a CPU with volatile caches connected to non-volatile memory over the memory bus; the atomic write unit is 8 bytes wide)
② Performance degradation for reducing writes
– Hashing schemes for DRAM usually cause many extra writes when dealing with hash collisions [INFLOW'15, MSST'17]
– Write-friendly hashing schemes reduce writes but at the cost of decreased access performance
③ Cost inefficiency for resizing the hash table
– Double the table size and iteratively rehash all items
– Takes O(N) time to complete: N insertions with cache-line flushes & memory fences
(Figure: rehashing all items from the old hash table into the new hash table)
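For contrast, challenge ③ can be made concrete with a minimal Python sketch (an illustration, not code from the paper) of traditional out-of-place resizing: the table is doubled and every one of the N items is rehashed, so on PM each of those N re-insertions would additionally pay a cache-line flush and a memory fence.

```python
def traditional_resize(old_buckets):
    """Double the table and iteratively rehash ALL items: O(N) insertions."""
    new_size = 2 * len(old_buckets)
    new_buckets = [[] for _ in range(new_size)]
    moved = 0
    for bucket in old_buckets:
        for key, value in bucket:
            # On PM, each rehashed item would also need a cache-line
            # flush and a memory fence to be made durable.
            new_buckets[hash(key) % new_size].append((key, value))
            moved += 1
    return new_buckets, moved
```

A table holding N items performs N re-insertions here; the in-place resizing scheme presented later in the talk rehashes only a third of the buckets.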
Existing Hashing Index Schemes for PM

                      BCH    PFHT [1]   Path hashing [2]
  Memory efficiency    √        √           √
  Search               √        --          --
  Deletion             √        --          --
  Insertion            ×        --          --
  NVM writes           ×        √           √
  Resizing             ×        ×           ×
  Consistency          ×        ×           ×

("×": bad, "√": good, "--": moderate)
[1] B. Debnath et al. "Revisiting hash table design for phase change memory", INFLOW, 2015.
[2] P. Zuo and Y. Hua. "A write-friendly hashing scheme for non-volatile memory systems", MSST, 2017.
Existing Hashing Index Schemes for PM

                      BCH    PFHT [1]   Path hashing [2]   Level hashing
  Memory efficiency    √        √           √                  √
  Search               √        --          --                 √
  Deletion             √        --          --                 √
  Insertion            ×        --          --                 √
  NVM writes           ×        √           √                  √
  Resizing             ×        ×           ×                  √
  Consistency          ×        ×           ×                  √

("×": bad, "√": good, "--": moderate)
Level Hashing
(Figure: the two-level structure, with a top level (TL) of buckets and a shared bottom level (BL); an item x has two hash locations, and a successful insertion moves at most one existing item)
➢ Write-optimized & high-performance hash table structure
➢ Cost-efficient in-place resizing scheme (resizing support)
➢ Low-overhead consistency guarantee scheme (consistency support)
Write-optimized Hash Table Structure
① Multiple slots per bucket
② Two hash locations for each key
③ Sharing-based two-level structure
④ At most one movement for each successful insertion
(Figure: top level (TL) and bottom level (BL); an item x can reside in its two candidate TL buckets or the BL buckets they share, and an insertion into a full set of candidates moves at most one item to its alternative location)
➢ Maximum load factor as the four designs are added incrementally:
– D1 (multiple slots per bucket): 2.2%
– D1+D2 (two hash locations): 47.6%
– D1+D2+D3 (sharing-based two-level structure): 82.5%
– All (at most one movement): 91.1%
➢ Write-optimized: only 1.2% of insertions incur one movement
➢ High-performance: constant-scale time complexity for all operations
➢ Memory-efficient: achieves a high load factor by evenly distributing items
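The four designs above can be sketched in a few lines of Python (a simplified stand-in for illustration; the actual implementation is in C and manages slots, tokens, and persistence explicitly). `LevelHashSketch`, `_h1`, and `_h2` are hypothetical names, and the hash functions assume integer keys; each key has two candidate top-level buckets plus the bottom-level buckets they share, and a failed insertion tries at most one movement.

```python
SLOTS = 4  # slots per bucket

class LevelHashSketch:
    """Two-level hash table: a top level (TL) of n buckets and a bottom
    level (BL) of n/2 buckets, each BL bucket shared by two TL buckets."""

    def __init__(self, n):
        self.n = n                               # n top-level buckets
        self.tl = [[] for _ in range(n)]         # top level
        self.bl = [[] for _ in range(n // 2)]    # shared bottom level

    # two independent (hypothetical, deterministic) hashes for integer keys
    def _h1(self, key):
        return (key * 2654435761) % self.n

    def _h2(self, key):
        return (key * 40503 + 2057) % self.n

    def _candidates(self, key):
        i1, i2 = self._h1(key), self._h2(key)
        # two TL buckets, plus the BL buckets they share (index // 2)
        return [(self.tl, i1), (self.tl, i2),
                (self.bl, i1 // 2), (self.bl, i2 // 2)]

    def search(self, key):
        for level, i in self._candidates(key):
            for k, v in level[i]:
                if k == key:
                    return v
        return None

    def insert(self, key, value):
        cands = self._candidates(key)
        for level, i in cands:
            if len(level[i]) < SLOTS:
                level[i].append((key, value))
                return True
        # all four candidate buckets are full: try at most ONE movement,
        # evicting an item from a candidate TL bucket to its alternative
        for level, i in cands[:2]:
            for j, (k, v) in enumerate(level[i]):
                alt = self._h2(k) if self._h1(k) == i else self._h1(k)
                if len(self.tl[alt]) < SLOTS:
                    self.tl[alt].append((k, v))   # the one movement
                    level[i][j] = (key, value)
                    return True
        return False  # insertion failure triggers a resizing
```

Because every operation only probes the four candidate buckets (plus at most one movement), search, insertion, and deletion all run in constant-scale time.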
Cost-efficient In-place Resizing
➢ Put a new level on top of the old hash table and rehash only the items in the old bottom level
– The new top level is twice the size of the old one; the old top level becomes the new bottom level
– The old bottom level becomes the interim level (IL), whose items are rehashed into the new table; the IL is then reclaimed
– The new hash table is exactly double the size of the old one
– Only 1/3 of the buckets (i.e., the old bottom level) are rehashed
(Figure: the resizing steps, showing the new top level stacked above the old TL and BL, with the old BL serving as the interim level during rehashing)
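In the same spirit, a standalone Python sketch of the in-place resizing (hypothetical names, and simplified so that rehashed items go only into the new top level rather than through a full insertion): the old top level is reused in place as the new bottom level, and only the old bottom level, now the interim level, is rehashed.

```python
def in_place_resize(tl, bl, h):
    """tl: old top level (N buckets); bl: old bottom level (N/2 buckets);
    h: the hash function for the new table. A new top level of 2N buckets
    is stacked on top, the old top level is reused in place as the new
    bottom level, and only the old bottom level (the interim level, IL),
    i.e. 1/3 of all buckets, is rehashed."""
    new_n = 2 * len(tl)
    new_tl = [[] for _ in range(new_n)]   # the new, doubled top level
    new_bl = tl                           # old TL reused in place as new BL
    rehashed = 0
    for bucket in bl:                     # the IL: the only buckets rehashed
        for key, value in bucket:
            new_tl[h(key) % new_n].append((key, value))
            rehashed += 1
    return new_tl, new_bl, rehashed
```

The old table has N + N/2 buckets and only the N/2 IL buckets are touched, which is where the roughly 4.3× resizing speedup reported later comes from.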
Low-overhead Consistency Guarantee
➢ A token is associated with each slot in the open-addressing hash table
– The token indicates whether the slot is empty
– A token is 1 bit, e.g., "1" for non-empty, "0" for empty
➢ Modifying the token area needs only an atomic write
– The tokens are leveraged to perform log-free operations
(Figure: a bucket, with one token bit per key-value slot)
Log-free Deletion
➢ To delete an existing item, modify its token from "1" to "0" in a single atomic write; the slot is then logically empty and reusable
(Figure: deleting KV1 by atomically clearing its token)
➢ Log-free insertion and log-free resizing
– Please find them in our paper
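The token mechanism can be sketched as follows (a Python stand-in with illustrative names; in the paper's C implementation a bucket's tokens fit in a single word, so rewriting them is one 8-byte atomic PM write). Writing the item before flipping its token means a crash never exposes a half-written slot as valid.

```python
SLOTS = 4  # slots per bucket

class Bucket:
    def __init__(self):
        self.slots = [None] * SLOTS   # key-value items
        self.tokens = 0               # bit i == 1 means slot i is valid

    def insert(self, key, value):
        for i in range(SLOTS):
            if not (self.tokens >> i) & 1:
                self.slots[i] = (key, value)   # write the item first...
                self.tokens |= (1 << i)        # ...then flip its token
                                               # (one atomic write on PM)
                return True
        return False                           # bucket full

    def delete(self, key):
        for i in range(SLOTS):
            if (self.tokens >> i) & 1 and self.slots[i][0] == key:
                self.tokens &= ~(1 << i)       # single atomic write: the
                return True                    # slot is logically empty
        return False

    def search(self, key):
        for i in range(SLOTS):
            if (self.tokens >> i) & 1 and self.slots[i][0] == key:
                return self.slots[i][1]
        return None
```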
Consistency Guarantee for Update
➢ Directly updating an existing key-value item in place can leave it inconsistent on system failures
– The item is larger than the 8-byte atomic-write unit, so a crash may leave a partial update
➢ A straightforward solution is to use logging, but logging is expensive
Opportunistic Log-free Update
➢ Our scheme: check whether there is an empty slot in the bucket storing the old item
– Yes: log-free update
① Write the new item KV1' into an empty slot
② Modify the two tokens (old slot to "0", new slot to "1") in one atomic write
– No: use logging
(Figure: log-free probability vs. load factor with 4, 8, and 16 slots per bucket; most updates find an empty slot and thus avoid logging)
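The two-step update above can be sketched as a standalone Python function (illustrative names; the token word models the bucket's token area, which on PM is rewritten with a single atomic word-sized write):

```python
SLOTS = 4  # slots per bucket

def opportunistic_update(slots, tokens, key, new_value):
    """slots: list of SLOTS items (or None); tokens: int bitmap where bit i
    set means slot i holds a valid item. Returns the new token word, or
    None if the bucket has no empty slot (caller falls back to logging)."""
    # locate the slot holding the old version of the item
    old = next(i for i in range(SLOTS)
               if (tokens >> i) & 1 and slots[i][0] == key)
    for new in range(SLOTS):
        if not (tokens >> new) & 1:          # found an empty slot
            slots[new] = (key, new_value)    # 1. write (and persist) KV'
            # 2. flip BOTH tokens (old -> 0, new -> 1) in ONE atomic
            #    word write; a crash leaves the old or the new version
            #    visible, never a partial item
            return (tokens & ~(1 << old)) | (1 << new)
    return None  # no empty slot: the opportunistic path is unavailable
```

Because both token flips land in the same atomic write, the update is all-or-nothing without a log whenever an empty slot exists.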
Performance Evaluation
➢ Both DRAM and simulated PM platforms
– Quartz (Hewlett Packard)
➢ Comparisons
– Bucketized cuckoo hashing (BCH) [NSDI'13]
– PCM-friendly hash table (PFHT) [INFLOW'15]
– Path hashing [MSST'17]
– In PM, their persistent versions are implemented using our proposed log-free consistency guarantee schemes
Insertion Latency
➢ Level hashing has the best insertion performance in both DRAM and NVM
(Figure: insertion latency vs. load factor for BCH, PFHT, Path, and Level, in DRAM and in simulated NVM with 200 ns read / 600 ns write latency)
Update Latency
➢ The opportunistic log-free update scheme reduces the update latency by 15% to 52%, i.e., speeds up updates by 1.2× to 2.1×
(Figure: update latency vs. load factor for BCH, PFHT, Path, Level, and Level without the opportunistic scheme)
Search Latency
➢ The search latency of level hashing is close to that of BCH, and both are much lower than those of PFHT and path hashing
(Figure: positive and negative search latency for BCH, PFHT, Path, and Level at load factors 0.6 and 0.8)
Resizing Time
➢ Level hashing reduces the resizing time by about 76%, i.e., speeds up resizing by about 4.3×
(Figure: total resizing time for BCH, PFHT, Path, Level-Trad (level hashing with traditional resizing), and Level, in DRAM and in simulated NVM with 200 ns/600 ns latency)
Concurrent Throughput
➢ Concurrent level hashing supports multiple-reader multiple-writer concurrency simply by using fine-grained locking
➢ Concurrent level hashing achieves 1.6× to 2.1× higher throughput than libcuckoo [1], due to locking fewer slots for insertions
(Figure: throughput of libcuckoo and level hashing with 2, 4, 8, and 16 threads, under search/insertion ratios from 90/10 to 10/90)
[1] X. Li et al. "Algorithmic improvements for fast concurrent cuckoo hashing", EuroSys, 2014.
Conclusion
➢ Traditional indexing techniques originally designed for DRAM become inefficient in PM
➢ We propose level hashing, a write-optimized and high-performance hashing index scheme for PM
– Write-optimized hash table structure
– Cost-efficient in-place resizing
– Log-free consistency guarantee
➢ 1.4× to 3.0× speedup for insertion, 1.2× to 2.1× speedup for update, and over 4.3× speedup for resizing
Open-source code: https://github.com/Pfzuo/Level-Hashing