

AC-Key: Adaptive Caching for LSM-based Key-Value Stores. Fenggang Wu, Ming-Hong Yang, Baoquan Zhang, David H.C. Du. University of Minnesota, Twin Cities. July 2020. USENIX ATC '20.


SLIDE 1

AC-Key: Adaptive Caching for LSM-based Key-Value Stores

Fenggang Wu, Ming-Hong Yang, Baoquan Zhang, David H.C. Du
University of Minnesota, Twin Cities
July 2020. USENIX ATC '20

SLIDE 2

Key-Value Stores

  • Key-value stores are popular.
  • Use cases: web search, social networks, e-commerce, etc.
  • LSM-tree based key-value stores (LSM-KVS) are widely used.
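
SLIDE 3

LSM Tree -- Write

[Figure: KV pairs are batched in DRAM and written to storage, then merged down through Level 1, Level 2, and Level 3; each level is 10X larger than the one above it.]

  • Write-optimized
  • Batches writes and writes them sequentially
  • Never performs scattered in-place updates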
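The write path described on this slide can be sketched in a few lines. This is a toy model, not RocksDB's implementation: the class name `TinyLSM`, the `memtable_limit` parameter, and the one-sorted-run-per-level layout are assumptions made for illustration. Only the 10X growth factor between levels comes from the slide.

```python
class TinyLSM:
    """Toy LSM tree: an in-memory buffer plus one sorted run per level."""

    def __init__(self, memtable_limit=4, fanout=10):
        self.memtable = {}                  # write buffer in DRAM
        self.memtable_limit = memtable_limit
        self.fanout = fanout                # each level is `fanout` times larger
        self.levels = []                    # levels[i]: sorted list of (key, value)

    def put(self, key, value):
        self.memtable[key] = value          # buffered only, no storage I/O yet
        if len(self.memtable) >= self.memtable_limit:
            self._flush()

    def _flush(self):
        run = sorted(self.memtable.items())  # one sequential batch write
        self.memtable = {}
        self._merge_into(0, run)

    def _merge_into(self, i, run):
        if i == len(self.levels):
            self.levels.append([])
        merged = dict(self.levels[i])       # older data already in this level
        merged.update(run)                  # newer data wins on duplicate keys
        self.levels[i] = sorted(merged.items())
        capacity = self.memtable_limit * self.fanout ** (i + 1)
        if len(self.levels[i]) > capacity:  # level is full: merge it downward
            spill, self.levels[i] = self.levels[i], []
            self._merge_into(i + 1, spill)
```

Updates never touch existing data in place: a newer value for a key simply shadows the older one until a merge drops the stale copy.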

SLIDE 4

LSM Tree -- Read: Point Lookup

[Figure: a point lookup for key K goes from DRAM to storage and searches Level 1, Level 2, and Level 3.]

Block:

  • Sorted range of key/value pairs
  • Basic I/O unit (16KB)

LSM trees suffer from a read performance issue:

  • Potentially every level needs to be checked
  • The whole block is fetched even when only one key is queried
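The two costs listed above can be made concrete with a small sketch. It assumes a simplified layout in which each level is a list of blocks with disjoint key ranges, and it deliberately ignores filters and indexes that real stores use to skip levels; `point_lookup` and the sample `levels` data are hypothetical.

```python
from bisect import bisect_right

def point_lookup(levels, key):
    """Search levels from newest to oldest; return (value, blocks_fetched).

    Each level is a list of blocks. Each block is a sorted list of
    (key, value) pairs, and blocks within a level cover disjoint key ranges.
    """
    blocks_fetched = 0
    for level in levels:
        first_keys = [block[0][0] for block in level]
        idx = bisect_right(first_keys, key) - 1  # block whose range may hold key
        if idx < 0:
            continue                             # key sorts before this level
        block = level[idx]                       # the whole block is read
        blocks_fetched += 1
        for k, v in block:
            if k == key:
                return v, blocks_fetched
    return None, blocks_fetched

levels = [
    [[("b", 1), ("c", 2)], [("m", 3), ("n", 4)]],    # Level 1 (newest)
    [[("a", 5), ("d", 6)], [("k", 7), ("p", 8)]],    # Level 2
    [[("a", 9), ("e", 10)], [("f", 11), ("z", 12)]], # Level 3 (oldest)
]
value, fetched = point_lookup(levels, "e")  # one block read per level checked
```

Looking up `"e"` reads one block in every level before finding the key in the last one, although only a single pair out of the fetched blocks is actually needed.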

SLIDE 5

LSM Tree -- Read: Range Query

[Figure: a range query over [K1, K2] reads the matching blocks from Level 1, Level 2, and Level 3 and merges them into the result.]

Read performance issue: both point lookups and range queries fetch many blocks for a single query.
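A range query has to merge overlapping blocks from every level, which is sketched below under the same simplified block layout (each level is a list of sorted blocks). The function name `range_query` and the sample data are assumptions for illustration, and a production store would merge with iterators rather than a dictionary.

```python
def range_query(levels, lo, hi):
    """Return (sorted result for keys in [lo, hi], blocks_fetched).

    `levels` is ordered newest first; each level is a list of blocks and
    each block is a sorted list of (key, value) pairs.
    """
    result = {}
    blocks_fetched = 0
    for level in reversed(levels):          # oldest first, so newer levels overwrite
        for block in level:
            if block[-1][0] < lo or block[0][0] > hi:
                continue                    # block does not overlap the range
            blocks_fetched += 1             # overlapping block is read in full
            for k, v in block:
                if lo <= k <= hi:
                    result[k] = v
    return sorted(result.items()), blocks_fetched

levels = [
    [[("b", 1), ("c", 2)], [("m", 3), ("n", 4)]],    # Level 1 (newest)
    [[("a", 5), ("d", 6)], [("k", 7), ("p", 8)]],    # Level 2
    [[("a", 9), ("e", 10)], [("f", 11), ("z", 12)]], # Level 3 (oldest)
]
pairs, fetched = range_query(levels, "b", "e")
```

Here a query for four keys touches three blocks, one per level, which is the behavior a block cache is meant to absorb.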

SLIDE 6

Addressing Read Issue

Can we use cache? Yes, workloads have hotspots!

  • “One key contributes 20% of a server’s requests”: Memcache [Atikoglu 2012]
  • “1% of the keys takes up 50% of total point lookup”: ZippyDB@Facebook [Cao, 2020]
  • Range queries have hot ranges too [Cooper 2012, Gilad 2020]

[Figure: cache space in DRAM in front of Level 1, Level 2, and Level 3 in storage.]

SLIDE 7

Addressing Read Issue

Can we use cache? Yes, workloads have hotspots!

But… popular caching schemes do NOT fit!!

Unique caching challenges in LSM-KVS:

  • Data have different sizes and reside in different levels, so their cache cost/benefit differs.
  • There are distinct types of reads: point lookup and range query.

Existing solutions:

  • General caching schemes:
      • No special consideration of the cache cost/benefit in LSM-KVS.
  • Existing LSM-KVS caching:
      • Favors only a particular workload.
      • Not efficient for a different or dynamic workload.

SLIDE 8

Columns Block, KV, KP: the different items that can be cached. Columns Point, Range: the favored workload.

System      Block  KV   KP   Point                    Range          Adaptive
LevelDB     Yes    No   No   Inefficient              Supported      Fixed-size
RocksDB     Yes    Yes  No   Large value inefficient  Supported      Fixed-size
Cassandra   No     Yes  Yes  Efficient                Not supported  Fixed-size
AC-Key      Yes    Yes  Yes  Efficient                Supported      Adaptive-size

Cache entry types:

  • Block (a run of key/value pairs): range query
  • KV (key/value): point lookup (small/hot value)
  • KP (key/pointer): point lookup (large/warm value)

[Figure: DRAM holds a Block Cache and a Point Cache; the Point Cache is made up of a KV Cache and a KP Cache.]

Key challenge: adjust the sizes of the different types of caches according to dynamic workloads.
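To show how the three entry types differ on a point lookup, here is a minimal sketch. The lookup order (KV, then KP, then block), the `Caches` container, and the `locate` callback that stands in for the level-by-level search are all assumptions for illustration; the slides do not specify AC-Key's actual read path.

```python
from dataclasses import dataclass, field

@dataclass
class Caches:
    kv: dict = field(default_factory=dict)     # key -> value
    kp: dict = field(default_factory=dict)     # key -> block id (the "pointer")
    block: dict = field(default_factory=dict)  # block id -> {key: value}

def get(key, caches, storage, locate):
    """Return (value, path taken).

    storage: block id -> block contents, standing in for on-disk blocks.
    locate:  key -> block id, standing in for the search through the levels.
    """
    if key in caches.kv:
        return caches.kv[key], "kv-hit"        # value served straight from DRAM
    if key in caches.kp:
        block_id, path = caches.kp[key], "kp-hit"  # search through levels skipped
    else:
        block_id, path = locate(key), "miss"       # full search needed
    if block_id in caches.block:
        block, path = caches.block[block_id], path + "+block-hit"
    else:
        block, path = storage[block_id], path + "+disk-read"
    return block[key], path

storage = {0: {"a": 1, "b": 2}, 1: {"c": 3}}
caches = Caches(kv={"a": 1}, kp={"b": 0}, block={1: {"c": 3}})
locate = lambda k: 0 if k in ("a", "b") else 1
```

A KV entry costs DRAM proportional to the value size, while a KP entry stays small regardless of the value but still needs one block read, which is consistent with the slide's pairing of KV with small/hot values and KP with large/warm values.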

SLIDE 9

Cache Size Adjustment using Ghost Cache

[Figure: the real cache is split between Block and Point by a movable boundary, and each side has its own ghost cache. A block ghost cache hit or a point ghost cache hit moves the boundary.]

  • Ghost cache
      • Stores only the metadata of entries evicted from the real cache
      • On a ghost hit: push the boundary away to grow the corresponding real cache

The boundary eventually reaches a dynamic equilibrium for a given workload.
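The boundary movement can be sketched as two LRU caches that share a fixed number of slots. This is a simplified illustration in the spirit of ghost-list adaptive caches, not AC-Key's code: the class `AdaptivePair`, counting entries instead of bytes, and the one-slot step per ghost hit are assumptions.

```python
from collections import OrderedDict

class AdaptivePair:
    """Two LRU caches ("block" and "point") sharing `total` slots."""

    def __init__(self, total, ghost_size):
        self.target = {"block": total // 2, "point": total - total // 2}
        self.real = {"block": OrderedDict(), "point": OrderedDict()}
        self.ghost = {"block": OrderedDict(), "point": OrderedDict()}
        self.ghost_size = ghost_size

    def access(self, kind, key, value=None):
        other = "point" if kind == "block" else "block"
        real, ghost = self.real[kind], self.ghost[kind]
        if key in real:
            real.move_to_end(key)              # refresh LRU position
            return "hit"
        outcome = "miss"
        if key in ghost:                       # evicted too early: grow this side
            del ghost[key]
            if self.target[other] > 0:
                self.target[kind] += 1         # push the boundary away
                self.target[other] -= 1
            outcome = "ghost-hit"
        real[key] = value
        self._shrink(kind)
        self._shrink(other)
        return outcome

    def _shrink(self, kind):
        real, ghost = self.real[kind], self.ghost[kind]
        while len(real) > self.target[kind]:
            victim, _ = real.popitem(last=False)  # evict least recently used
            ghost[victim] = None                  # keep metadata only, no value
            if len(ghost) > self.ghost_size:
                ghost.popitem(last=False)
```

Under a steady workload the ghost hits on the two sides balance out, so the targets stop drifting, which corresponds to the equilibrium mentioned on the slide.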

SLIDE 10

AC-Key: Hierarchical Adaptive Caching

[Figure: the Block Cache and the Point Cache, with the Point Cache split into a KV Cache and a KP Cache. Each component has a real cache and a ghost cache, and the components are separated by dynamic boundaries.]

  • Upper level: Point Cache vs. Block Cache
  • Lower level: KV Cache vs. KP Cache
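The two-level split can be expressed as two nested ratios. The sketch below is one possible reading of the slide, not the published algorithm: it assumes that a KV or KP ghost hit moves both boundaries, that a block ghost hit moves only the upper one, and that each move is a fixed `step`. The class `HierarchicalSplit` and its fields are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class HierarchicalSplit:
    total: int                 # total cache budget, e.g. in KB
    point_share: float = 0.5   # upper level: fraction of total for the point cache
    kv_share: float = 0.5      # lower level: fraction of the point cache for KV
    step: float = 0.1          # assumed boundary movement per ghost hit

    def on_ghost_hit(self, kind):
        if kind == "block":
            self.point_share = max(0.0, self.point_share - self.step)
        elif kind in ("kv", "kp"):
            # Upper boundary: the point cache as a whole grows.
            self.point_share = min(1.0, self.point_share + self.step)
            # Lower boundary: shift toward whichever point component was hit.
            delta = self.step if kind == "kv" else -self.step
            self.kv_share = min(1.0, max(0.0, self.kv_share + delta))
        else:
            raise ValueError(f"unknown cache kind: {kind}")

    def sizes(self):
        point = self.total * self.point_share
        kv = point * self.kv_share
        return {"block": self.total - point, "kv": kv, "kp": point - kv}
```

Keeping the lower boundary as a fraction of the point cache means the KV/KP balance is preserved when the upper boundary moves.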

SLIDE 11

Other Solved Challenges

  • Measuring caching efficiency so that the different cost/benefit of each entry is taken into account
  • Special handling of cached entries affected by compaction and flush

SLIDE 12

Evaluation

  • AC-Key is implemented on top of RocksDB
  • Evaluated with various workloads and system settings

For the complete evaluation results, see our paper.

SLIDE 13

Evaluation

[Figure: QPS (1000 to 4000) versus range query ratio (0% to 100%), from an all point-lookup workload to an all range-query workload. Series: ac-key, rocksdb, pure-kv, pure-kp, offline. One annotation reads "pure-block-cache / industry".]

SLIDE 14

Evaluating Adaptive Size

[Figure, three panels: (1) QPS as a function of Block Cache Ratio and KP Cache Ratio, in bands 0-500, 500-1000, and 1000-1500, with the winner marked; (2) QPS versus operations (million) for ac-key and offline; (3) size (KB) of the block, kp, and kv caches. Workload phases are labeled pure range-query and pure point-lookup.]

Competing scheme: offline

  • Try different combinations
  • 1/10 cache granularity
  • Pick the winner
  • Fixed configuration
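The offline baseline amounts to an exhaustive search over fixed cache splits. The sketch below assumes the search covers the block, KP, and KV shares in tenths of the total cache; `offline_best` and the `measure_qps` callback, which stands in for replaying the workload against one fixed configuration, are hypothetical names.

```python
from itertools import product

def offline_best(measure_qps, steps=10):
    """Try every block/kp/kv split at 1/steps granularity and pick the winner."""
    best = None
    for block, kp in product(range(steps + 1), repeat=2):
        kv = steps - block - kp
        if kv < 0:
            continue                       # shares must sum to the whole cache
        config = {"block": block / steps, "kp": kp / steps, "kv": kv / steps}
        qps = measure_qps(config)          # one full run per fixed configuration
        if best is None or qps > best[1]:
            best = (config, qps)
    return best

# Toy stand-in for a benchmark run, peaking at block=0.3 and kp=0.5.
def toy_qps(config):
    return 1000 - 1000 * abs(config["block"] - 0.3) - 500 * abs(config["kp"] - 0.5)

winner, qps = offline_best(toy_qps)
```

With ten steps this means 66 separate runs, and the winning configuration stays fixed afterward, so it cannot follow a workload that changes over time.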

SLIDE 15

Summary

  • LSM-based key-value stores are widely used.
      • Write-optimized, but they have a read performance issue.
  • AC-Key: adaptive caching for LSM-based key-value stores.
      • Integrates the KV, KP, and Block cache components.
      • Hierarchical size-adaptive design.
      • Outperforms industry solutions.

SLIDE 16

Thank you!

Fenggang Wu
wuxx0835@umn.edu