AC-Key: Adaptive Caching for
LSM-based Key-Value Stores
Fenggang Wu, Ming-Hong Yang, Baoquan Zhang, David H.C. Du University of Minnesota, Twin Cities July 2020. USENIX ATC’20
Key-Value Stores
Ø Key-value stores are popular, for example in web searching.
[Figure: LSM-tree write path. KV pairs are batched in DRAM, written to storage, and merged from Level 1 through Level 3; each level is 10X larger than the previous one.]
Ø Write-optimized
Ø Batch and write sequentially
Ø Never perform scattered in-place updates
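The write path above can be illustrated with a toy model. This is a minimal sketch, not the design of any real store: `TinyLSM`, `memtable_limit`, and `merge_all` are hypothetical names, and a single full merge stands in for leveled compaction.

```python
class TinyLSM:
    """Toy write path: buffer writes in memory, flush sorted runs, merge later."""

    def __init__(self, memtable_limit=4):
        self.memtable = {}          # in-DRAM write buffer
        self.runs = []              # flushed sorted runs, newest first
        self.memtable_limit = memtable_limit  # assumed flush threshold

    def put(self, key, value):
        # Writes only touch the in-memory buffer; nothing on "storage"
        # is updated in place.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # One batched, sequential write of a sorted run.
        self.runs.insert(0, sorted(self.memtable.items()))
        self.memtable = {}

    def merge_all(self):
        # Simplified merge: apply runs oldest to newest so that newer
        # values overwrite older ones, then keep one sorted run.
        merged = {}
        for run in reversed(self.runs):
            merged.update(run)
        self.runs = [sorted(merged.items())]


db = TinyLSM(memtable_limit=2)
db.put("b", 1)
db.put("a", 2)   # triggers the first flush
db.put("a", 3)
db.put("c", 4)   # triggers the second flush
db.merge_all()
print(db.runs)
```

After the merge, the newer value for key "a" has replaced the older one without any in-place update of an existing run.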
[Figure: point lookup of key K across DRAM and storage Levels 1 to 3.]
Block:
Ø Sorted range of key/value pairs
Ø Basic I/O unit (16KB)
LSM-KVS suffers from a read performance issue:
Ø Potentially every level needs to be checked
Ø The whole block is fetched even when only one key is queried
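The two costs above can be made concrete with a small sketch. It assumes a simplified layout that the slides do not specify: each level is a list of blocks, each block is a sorted list of (key, value) pairs, and blocks within a level do not overlap. `point_lookup` is a hypothetical helper.

```python
from bisect import bisect_left


def point_lookup(levels, key):
    """Return (value, blocks_fetched); value is None if the key is absent."""
    blocks_fetched = 0
    for level in levels:                      # upper (newer) levels first
        for block in level:
            first_key, last_key = block[0][0], block[-1][0]
            if first_key <= key <= last_key:
                blocks_fetched += 1           # the whole block is read for one key
                keys = [k for k, _ in block]
                i = bisect_left(keys, key)
                if i < len(keys) and keys[i] == key:
                    return block[i][1], blocks_fetched
                break                         # assumed: one candidate block per level
    return None, blocks_fetched


levels = [
    [[("a", 1), ("c", 3)]],                        # Level 1
    [[("a", 0), ("b", 2)], [("d", 4), ("f", 6)]],  # Level 2
]
print(point_lookup(levels, "b"))
```

Looking up "b" reads one block in Level 1 without finding the key, then a second block in Level 2, so a single-key query costs two block fetches in this toy layout.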
[Figure: range query from K1 to K2. Matching entries from DRAM and storage Levels 1 to 3 are merged into the result.]
Read performance issue: Both point lookup and range query will fetch many blocks for one query.
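A range query can be sketched in the same spirit. The layout is again an assumption (levels as lists of non-overlapping blocks, each block a sorted list of (key, value) pairs, newer levels first), and `range_query` is a hypothetical helper, not an API of any real store.

```python
import heapq


def range_query(levels, lo, hi):
    """Return (sorted list of (key, value) with lo <= key <= hi, blocks_fetched)."""
    blocks_fetched = 0
    streams = []
    for age, level in enumerate(levels):      # age 0 is the newest level
        for block in level:
            # Fetch every block whose key range overlaps [lo, hi].
            if block[0][0] <= hi and block[-1][0] >= lo:
                blocks_fetched += 1
                streams.append([(k, age, v) for k, v in block if lo <= k <= hi])
    result = {}
    # Merge the sorted streams; for equal keys the newest level comes first.
    for k, _age, v in heapq.merge(*streams):
        result.setdefault(k, v)
    return list(result.items()), blocks_fetched


levels = [
    [[("a", 1), ("c", 3)]],                        # Level 1
    [[("a", 0), ("b", 2)], [("d", 4), ("f", 6)]],  # Level 2
]
print(range_query(levels, "a", "d"))
```

In this toy layout the query touches three blocks across two levels to return four keys, which is the kind of read amplification the slide refers to.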
Can we use cache? Yes, workloads have hotspots!
Ø “One key contributes 20% of a server’s requests”: Memcache [Atikoglu 2012]
Ø “1% of the keys takes up 50% of total point lookup”: ZippyDB@Facebook [Cao, 2020]
Ø Range queries have hot ranges too [Cooper 2012, Gilad 2020]
[Figure: cache space in DRAM above storage Levels 1 to 3.]
But popular caching schemes do NOT fit!
Unique caching challenges in LSM-KVS:
Ø Data have different sizes and sit at different levels, so they have different cache cost/benefit.
Ø There are distinct types of reads: point lookup and range query.
Existing solutions:
Ø General caching schemes:
  Ø No special consideration of the cache cost/benefit in LSM-KVS.
Ø Existing LSM-KVS caching:
  Ø Favors only a particular workload.
  Ø Not efficient for a different or dynamic workload.
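Comparison of caching schemes (items that can be cached: Block, KV, KP; favorite workload: Point, Range):

Scheme      Block  KV   KP   Point                    Range          Adaptive
LevelDB     Yes    No   No   Inefficient              Supported      Fix-sized
RocksDB     Yes    Yes  No   Large Value inefficient  Supported      Fix-sized
Cassandra   No     Yes  Yes  Efficient                Not Supported  Fix-sized
AC-Key      Yes    Yes  Yes  Efficient                Supported      Adaptive-sized

[Figure: a Block entry holds a sorted run of Key/Value pairs, a KV entry holds a single Key/Value pair, and a KP entry holds a Key and a pointer (P).]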
Cached entry types:
Ø Block: range query
Ø KV: point lookup (small/hot value)
Ø KP: point lookup (large/warm value)
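The three entry types can be modeled as small data structures. This is an illustrative sketch only: the class names, the `block_id` field standing in for the pointer, and the 8-byte pointer size are assumptions, not details taken from the slides.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class BlockEntry:
    pairs: List[Tuple[str, bytes]]   # sorted run of key/value pairs


@dataclass
class KVEntry:
    key: str
    value: bytes                     # the value itself is cached


@dataclass
class KPEntry:
    key: str
    block_id: int                    # hypothetical pointer to where the value lives


POINTER_BYTES = 8                    # assumed pointer size


def footprint(entry) -> int:
    """Approximate bytes of cache space used by one entry (metadata ignored)."""
    if isinstance(entry, BlockEntry):
        return sum(len(k) + len(v) for k, v in entry.pairs)
    if isinstance(entry, KVEntry):
        return len(entry.key) + len(entry.value)
    return len(entry.key) + POINTER_BYTES


large_value = b"x" * 1000
print(footprint(KVEntry("k1", large_value)), footprint(KPEntry("k1", 7)))
```

For a large value, the KP entry occupies far less cache space than the KV entry, at the price of still having to read the value from storage. That trade-off is why the slide pairs KV with small/hot values and KP with large/warm values.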
Key challenge: adjust the sizes of different types of caches according to dynamic workloads
[Figure: DRAM holds a Block Cache and a Point Cache; the Point Cache is made up of a KV Cache and a KP Cache.]
[Figure: the cache space is split into a Block part and a Point part by a movable boundary. Each part has a real cache and a ghost cache; a Block ghost cache hit or a Point ghost cache hit moves the boundary.]
Ø Ghost cache
  Ø Stores only the metadata of entries evicted from the real cache
  Ø On a ghost hit: push the boundary away to grow the corresponding real cache
Ø The sizes finally reach a dynamic equilibrium for a given workload.
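The ghost-cache mechanism can be sketched as follows. This is a simplified illustration under stated assumptions, not the AC-Key implementation: every entry counts as one unit of space, the boundary moves by a fixed `step`, both parts use plain LRU, and all names (`AdaptiveCache`, `block_target`, `access`) are hypothetical.

```python
from collections import OrderedDict


class AdaptiveCache:
    KINDS = ("block", "point")

    def __init__(self, capacity, step=1):
        self.capacity = capacity
        self.step = step
        self.block_target = capacity // 2          # the movable boundary
        self.real = {k: OrderedDict() for k in self.KINDS}
        self.ghost = {k: OrderedDict() for k in self.KINDS}  # keys only

    def target(self, kind):
        if kind == "block":
            return self.block_target
        return self.capacity - self.block_target

    def _shrink_to_target(self, kind):
        # Evict LRU entries from the real cache; remember only their keys.
        while len(self.real[kind]) > self.target(kind):
            key, _ = self.real[kind].popitem(last=False)
            self.ghost[kind][key] = None
            if len(self.ghost[kind]) > self.capacity:
                self.ghost[kind].popitem(last=False)

    def access(self, kind, key, value=None):
        if key in self.real[kind]:
            self.real[kind].move_to_end(key)
            return "hit"
        outcome = "miss"
        if key in self.ghost[kind]:
            # Ghost hit: this part evicted something it needed, so push
            # the boundary away to give it more room.
            del self.ghost[kind][key]
            delta = self.step if kind == "block" else -self.step
            self.block_target = min(self.capacity,
                                    max(0, self.block_target + delta))
            outcome = "ghost_hit"
        self.real[kind][key] = value
        for k in self.KINDS:
            self._shrink_to_target(k)
        return outcome


cache = AdaptiveCache(capacity=4)
outcomes = [cache.access("point", k) for k in ("p1", "p2", "p3", "p1")]
print(outcomes, cache.target("block"), cache.target("point"))
```

In this run the point part starts with room for two entries, evicts "p1" when "p3" arrives, and then grows to three entries when "p1" is requested again and found in the ghost cache.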
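[Figure: hierarchical cache layout. Upper level: Point Cache and Block Cache; lower level: KV Cache and KP Cache inside the Point Cache. Each cache has a real cache and a ghost cache, separated by dynamic boundaries.]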
Ø Upper level: Point Cache vs. Block Cache
Ø Lower level: KV Cache vs. KP Cache
Ø Measure caching efficiency to consider different entry cost/benefit
Ø Special cached entry handling due to compaction and flush
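The two-level split can be expressed as simple arithmetic. The sketch below assumes the upper boundary is given as a fraction of the total cache and the lower boundary as a fraction of the Point Cache; this parametrization and the name `split_cache` are illustrative assumptions, not definitions from the slides.

```python
def split_cache(total_bytes, block_ratio, kp_ratio):
    """Split a cache budget using an upper and a lower boundary.

    block_ratio: share of the total given to the Block Cache (upper level).
    kp_ratio:    share of the Point Cache given to the KP Cache (lower level).
    """
    if not (0.0 <= block_ratio <= 1.0 and 0.0 <= kp_ratio <= 1.0):
        raise ValueError("ratios must be between 0 and 1")
    block = int(total_bytes * block_ratio)
    point = total_bytes - block          # the rest is the Point Cache
    kp = int(point * kp_ratio)
    kv = point - kp                      # the rest of the Point Cache is KV
    return {"block": block, "kp": kp, "kv": kv}


print(split_cache(1000, 0.5, 0.25))
```

Moving the upper boundary changes the Block Cache and the Point Cache as a whole, while moving the lower boundary only redistributes space between the KV Cache and the KP Cache.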
Ø AC-Key is implemented on top of RocksDB
Ø Evaluated with various workloads and system settings
For the complete evaluation results, see our paper.
[Figure: QPS vs. range query ratio (0% to 100%, from an all point-lookup workload to an all range-query workload) for ac-key, rocksdb, pure-kv, pure-kp, and pure-block-cache / industry.]
[Figure: QPS as a function of Block Cache Ratio and KP Cache Ratio, in bands of 0-500, 500-1000, and 1000-1500, with the winner marked.]
[Figure: QPS vs. operations (million) for ac-key.]
[Figure: size (KB) of the block, kp, and kv caches vs. operations (million).]
Figure labels: pure range-query, pure point-lookup; competing scheme: offline.
Ø LSM-based key-value stores are widely used.
Ø Write-optimized, but with a read performance issue.
Ø AC-Key: adaptive caching for LSM-based key-value stores.
  Ø Integrates all of the KV, KP, and Block cache components.
  Ø Hierarchical size-adaptive design.
  Ø Outperforms industry solutions.
Fenggang Wu wuxx0835@umn.edu