
LSM-trie: An LSM-tree-based Ultra-Large Key-Value Store for Small Data

Xingbo Wu, Yuehai Xu, Zili Shao, Song Jiang (Wayne State University; The Hong Kong Polytechnic University), ATC 2015


  1. LSM-trie: An LSM-tree-based Ultra-Large Key-Value Store for Small Data. Xingbo Wu, Yuehai Xu, Zili Shao, Song Jiang. Wayne State University; The Hong Kong Polytechnic University. ATC 2015.

  2. Motivation • Very small KV items are widespread. • For a store of a given capacity, smaller KV items demand more metadata to locate them. • Demand on a KV store's capacity at individual KV servers keeps increasing. • Many KV stores require high performance for both reads and writes. • However, SILT spends its major effort on optimizing reads by minimizing metadata size, and write performance is compromised.

  3. Question 1: In the meantime, for some KV stores, such as SILT (Small Index Large Table), major efforts are made to optimize reads by minimizing metadata size, while write performance can be compromised without conducting multi-level incremental compactions. Explain how high write amplification is produced in SILT.

  4. Question 3: Use Figure 1 in this paper to explain the difference between the linear and exponential growth patterns.

  5. Question 2: Note that LSM-trie uses hash functions to organize its data and accordingly does not support range search. Does LevelDB support range search? Answer: LevelDB supports range search. (Figure from the paper "WiscKey: Separating Keys from Values in SSD-Conscious Storage".)

  6. Question 4: Among all compactions moving data from Lk to Lk+1, we must make sure their key ranges do not overlap, to keep any two SSTables at level Lk+1 from having overlapping key ranges. However, this cannot be achieved with the LevelDB data organization. Explain why LevelDB cannot achieve it. Answer: The key range of an SSTable is variable in LevelDB, and the distribution of the ranges can differ from level to level.

  7. Question 5: Use Figures 2 and 3 to describe the LSM-trie's structure and how compaction is performed in the trie.

  8. Question 5: Use Figures 2 and 3 to describe the LSM-trie's structure and how compaction is performed in the trie.

  9. Question 6: The indices and Bloom filters in a KV store can grow very large. Use an example to show that this metadata in LevelDB may have to be kept out of core. Answer: Suppose we have a 10 TB disk, the block size is 4 KB, and the index entry for each block is 12 B. Then the total index size is (10 TB / 4 KB) × 12 B = 30 GB. Suppose also that each KV item is 100 B and the Bloom filter uses 10 bits (1.25 B) per key. Then the total Bloom filter size is (10 TB / 100 B) × 1.25 B = 125 GB. Together, the metadata takes 155 GB.

  10. Question 7: Therefore, the Bloom filter must be beefed up by using more bits. Use an example to show why the Bloom filter has to be longer. Answer: Suppose we have an SSTable-trie with 7 levels, each with 8 sub-levels. The false positive rate of a Bloom filter using 10 bits per item is 0.82%. In the worst case a lookup checks one filter per sub-level (7 × 8 = 56 filters) instead of one per level (7 filters), so the chance of needlessly reading a level that does not hold the requested KV item rises from about 5.74% (7 × 0.82%) to about 45.92% (7 × 8 × 0.82%).
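
The arithmetic in the answers to Questions 6 and 7 can be reproduced with a short Python sketch. This is only a back-of-the-envelope check, not code from LevelDB or LSM-trie: the function names are hypothetical, decimal units (1 TB = 10^12 B, 1 KB = 10^3 B) are assumed because they reproduce the slide's figures, and the false positive rate assumes an optimal number of hash functions. The 5.74% and 45.92% figures on the slide are simple sums of per-filter rates, so they are modeled here as an expected count of false-positive reads per lookup.

```python
import math

# Assumption: decimal units, which reproduce the 30 GB and 125 GB figures.
KB, GB, TB = 10**3, 10**9, 10**12


def block_index_bytes(store_bytes, block_bytes, index_entry_bytes):
    """Index size when every data block needs one fixed-size index entry."""
    return (store_bytes // block_bytes) * index_entry_bytes


def bloom_filter_bytes(store_bytes, item_bytes, bits_per_key):
    """Total Bloom filter size when every KV item gets bits_per_key bits."""
    return (store_bytes // item_bytes) * bits_per_key / 8


def bloom_false_positive_rate(bits_per_key):
    """Approximate false positive rate with an optimal number of hash functions."""
    return 0.5 ** (bits_per_key * math.log(2))


def expected_false_positive_reads(num_filters, rate):
    """Sum of per-filter rates: the worst case where no checked filter holds the key."""
    return num_filters * rate


if __name__ == "__main__":
    index = block_index_bytes(10 * TB, 4 * KB, 12)
    bloom = bloom_filter_bytes(10 * TB, 100, 10)
    print("index size (GB):", index / GB)
    print("Bloom filter size (GB):", bloom / GB)
    print("total metadata (GB):", (index + bloom) / GB)

    rate = bloom_false_positive_rate(10)
    print("false positive rate at 10 bits/key:", rate)
    print("7 filters:", expected_false_positive_reads(7, rate))
    print("7 x 8 filters:", expected_false_positive_reads(7 * 8, rate))
```

Because the slide rounds the per-filter rate to 0.82% before multiplying, the last two printed values differ slightly from 5.74% and 45.92%.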
