A Low-Latency Multi-Version Key-Value Store Using B-tree on an FPGA-CPU Platform


  1. A Low-Latency Multi-Version Key-Value Store Using B-tree on an FPGA-CPU Platform. Yuchen Ren, Jinyu Xie, Yunhui Qiu, Hankun Lv, Wenbo Yin, Lingli Wang (State Key Laboratory of ASIC and System, Fudan University); Bowei Yu, Hua Chen, Xianjun He, Zhijian Liao, Xiaozhong Shi (IT R&D Dept., Chengdu Research Institute, Huawei Technologies Co., Ltd.). FPL’19, Barcelona, September 11th, 2019.

  2. Introduction - Background. Performance & power consumption of KVS approaches: • CPU-based: low (power) efficiency of CPU-centric memory hierarchy • RDMA-based: limited flexibility and efficiency of RDMA • FPGA-based. Multi-Version KVS (Key-Value Store): one key maps to several version-value entries: (Version 1, Value 1), (Version 2, Value 2), (Version ..., Value ...). *RDMA: Remote Direct Memory Access

  3. Introduction - Contribution. Design: • a low-latency multi-version in-memory KVS • FPGA-CPU heterogeneous architecture. Storage: • keys – hash table (Cuckoo hashing) – FPGA board • version-value pairs (VVPs) – B-trees – host memory. Operation: • get, put, delete, CAS, getPredecessor – bypassing the CPU • range query – with the help of the CPU. *CAS: Compare and Swap
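
The operation set on this slide can be sketched in software to make the data model concrete. This is a minimal illustration, not the authors' design: a Python `dict` stands in for the cuckoo hash table on the FPGA board, and a sorted list searched with `bisect` stands in for each per-key B-tree in host memory. The class name, method signatures, and the exact semantics of `cas`, `get_predecessor`, and `range_query` are assumptions, since the slides list the operations without defining them.

```python
import bisect


class MultiVersionKVS:
    """Toy multi-version store: each key maps to sorted (version, value) pairs."""

    def __init__(self):
        # Assumption: dict replaces the cuckoo hash table (first-level index);
        # a sorted version list replaces the per-key B-tree (second-level index).
        self._versions = {}  # key -> sorted list of versions
        self._values = {}    # key -> {version: value}

    def put(self, key, version, value):
        versions = self._versions.setdefault(key, [])
        values = self._values.setdefault(key, {})
        if version not in values:
            bisect.insort(versions, version)
        values[version] = value

    def get(self, key, version):
        return self._values.get(key, {}).get(version)

    def delete(self, key, version):
        values = self._values.get(key, {})
        if version not in values:
            return False
        del values[version]
        versions = self._versions[key]
        versions.pop(bisect.bisect_left(versions, version))
        return True

    def cas(self, key, version, expected, new_value):
        # Assumed semantics: swap only if the stored value equals `expected`.
        values = self._values.get(key, {})
        if version in values and values[version] == expected:
            values[version] = new_value
            return True
        return False

    def get_predecessor(self, key, version):
        # Assumed semantics: pair with the largest version strictly below `version`.
        versions = self._versions.get(key, [])
        i = bisect.bisect_left(versions, version)
        if i == 0:
            return None
        v = versions[i - 1]
        return v, self._values[key][v]

    def range_query(self, key, low, high):
        # Assumed semantics: all pairs with low <= version <= high, in version order.
        versions = self._versions.get(key, [])
        lo = bisect.bisect_left(versions, low)
        hi = bisect.bisect_right(versions, high)
        return [(v, self._values[key][v]) for v in versions[lo:hi]]
```

In this sketch every operation touches one key, which matches the slide's split only loosely: the point lookups are the ones the slide says bypass the CPU, while `range_query` is the one that returns a variable-length result.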

  4. Architecture

  5. Architecture - Network Offload Engine

  6. Architecture - Key-Value Store Engine

  7. Architecture - First-level indexing by key
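
The figure for this slide is not preserved, but the contribution slide states that keys are stored in a hash table using Cuckoo hashing. A minimal software sketch of that technique follows. Everything beyond "Cuckoo hashing" is an assumption made for illustration: the two-table layout, the salted use of Python's built-in `hash`, the `max_kicks` limit, growing on failure, and the idea that each entry holds a reference to that key's B-tree.

```python
class CuckooHashTable:
    """Two-table cuckoo hashing: every key lives in one of two candidate slots."""

    def __init__(self, capacity=8, max_kicks=16):
        self.capacity = capacity
        self.max_kicks = max_kicks  # evictions allowed before the table grows
        self.tables = [[None] * capacity, [None] * capacity]

    def _slot(self, which, key):
        # Assumption: two hash functions obtained by salting Python's hash().
        return hash((which, key)) % self.capacity

    def lookup(self, key):
        # A lookup probes at most two slots, so its cost is bounded.
        for which in (0, 1):
            entry = self.tables[which][self._slot(which, key)]
            if entry is not None and entry[0] == key:
                return entry[1]
        return None

    def insert(self, key, ref):
        # Update in place if the key is already stored.
        for which in (0, 1):
            slot = self._slot(which, key)
            entry = self.tables[which][slot]
            if entry is not None and entry[0] == key:
                self.tables[which][slot] = (key, ref)
                return
        # Otherwise place the item, evicting occupants to their other table.
        item, which = (key, ref), 0
        for _ in range(self.max_kicks):
            slot = self._slot(which, item[0])
            item, self.tables[which][slot] = self.tables[which][slot], item
            if item is None:
                return
            which = 1 - which
        self._grow(item)

    def _grow(self, pending):
        # Too many evictions: double the capacity and reinsert everything.
        entries = [e for table in self.tables for e in table if e is not None]
        entries.append(pending)
        self.capacity *= 2
        self.tables = [[None] * self.capacity, [None] * self.capacity]
        for k, r in entries:
            self.insert(k, r)
```

The property that makes this scheme attractive for a hardware lookup path is visible in `lookup`: it never inspects more than two slots, regardless of how full the table is. How the actual design handles insertion failures is not stated in the slides.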

  8. Architecture - Second-level indexing by version
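
The figure for this slide is also missing. As a rough illustration of indexing version-value pairs with a B-tree, the sketch below implements a textbook B-tree keyed by version, with a `search` that also reports how many levels it visited. The node layout, the minimum degree `t`, and all names are assumptions; the slides do not describe the node format used in host memory.

```python
import bisect


class BTreeNode:
    def __init__(self, leaf=True):
        self.leaf = leaf
        self.keys = []      # sorted versions
        self.values = []    # values[i] belongs to keys[i]
        self.children = []  # len(keys) + 1 children when not a leaf


class VersionBTree:
    def __init__(self, t=2):
        self.t = t  # minimum degree: a node holds at most 2t - 1 keys
        self.root = BTreeNode()

    def search(self, version):
        """Return (value, levels_visited); value is None if the version is absent."""
        node, levels = self.root, 1
        while True:
            i = bisect.bisect_left(node.keys, version)
            if i < len(node.keys) and node.keys[i] == version:
                return node.values[i], levels
            if node.leaf:
                return None, levels
            node, levels = node.children[i], levels + 1

    def insert(self, version, value):
        if len(self.root.keys) == 2 * self.t - 1:
            # Grow the tree by one level when the root is full.
            old_root, self.root = self.root, BTreeNode(leaf=False)
            self.root.children.append(old_root)
            self._split_child(self.root, 0)
        node = self.root
        while True:
            i = bisect.bisect_left(node.keys, version)
            if i < len(node.keys) and node.keys[i] == version:
                node.values[i] = value  # overwrite an existing version
                return
            if node.leaf:
                node.keys.insert(i, version)
                node.values.insert(i, value)
                return
            if len(node.children[i].keys) == 2 * self.t - 1:
                self._split_child(node, i)
                if version == node.keys[i]:
                    node.values[i] = value
                    return
                if version > node.keys[i]:
                    i += 1
            node = node.children[i]

    def _split_child(self, parent, i):
        # Move the median of a full child up and split the rest in two.
        t, child = self.t, parent.children[i]
        right = BTreeNode(leaf=child.leaf)
        parent.keys.insert(i, child.keys[t - 1])
        parent.values.insert(i, child.values[t - 1])
        parent.children.insert(i + 1, right)
        right.keys, child.keys = child.keys[t:], child.keys[:t - 1]
        right.values, child.values = child.values[t:], child.values[:t - 1]
        if not child.leaf:
            right.children, child.children = child.children[t:], child.children[:t]
```

Each level visited by `search` corresponds to one node fetch. In a design where the tree sits in host memory and the lookup logic sits on the FPGA, that would mean one memory access across the host link per level, which is one plausible reason the conclusion quotes latency for a tree of a given depth.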

  9. Implementation. FPGA platform: • Xilinx KCU105 • 2GB DDR4. Frequency: • KVSE: 120MHz • DMA: 250MHz • DDR4: 300MHz • NOE: 156.25MHz. PC: • Intel i5-2400 quad-core CPU • 12GB DDR3 • 256GB SSD • CentOS 7. System diagram labels: PC, Xilinx KCU105, PCIe gen3 x8, two 10GbE.

  10. Evaluation - Key-Value Store Message Generator in FPGA hardware

  11. Evaluation - Results. • Latency increases almost linearly • KVSE is the bottleneck. * Kops: Thousand operations per second

  12. Conclusion. Comparison (latency, get operation): • Our KVS: < 8μs (within a B-tree of 5 levels) • Hybrid FPGA approach: ≈ 75μs (within a B+-tree of 5 levels) • Many software-based KVS systems: > 1ms (when supporting versioning). Future work: • Optimize the system architecture of our multi-version KVS. • Expand to a distributed KVS by setting up multiple storage hosts. * Hybrid FPGA approach: D. Heinrich, S. Werner, M. Stelzner, C. Blochwitz, T. Pionteck and S. Groppe, “Hybrid FPGA approach for a B+ tree in a semantic Web database system,” 2015 10th International Symposium on Reconfigurable Communication-centric Systems-on-Chip (ReCoSoC), Bremen, 2015, pp. 1-8.
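
The latency figures in this comparison are bounds and approximations, so the ratios between them are bounds too. The short calculation below works them out from the slide's numbers only; the function name is hypothetical, and the systems compared run on different platforms and tree variants, so the ratios are indicative rather than a like-for-like benchmark.

```python
def latency_ratio(ours_upper_us, other_us):
    """Ratio of another system's latency to our upper-bound latency."""
    return other_us / ours_upper_us


OURS_UPPER_US = 8.0          # slide: "< 8 us", so this is an upper bound
HYBRID_APPROX_US = 75.0      # slide: "about 75 us"
SOFTWARE_LOWER_US = 1000.0   # slide: "> 1 ms", so this is a lower bound

# Because our latency is below 8 us, the true ratios are at least these values.
print(f"vs hybrid FPGA approach: at least about {latency_ratio(OURS_UPPER_US, HYBRID_APPROX_US):.1f}x")
print(f"vs software-based KVS:   more than {latency_ratio(OURS_UPPER_US, SOFTWARE_LOWER_US):.0f}x")
```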

  13. Thanks! Contact: ycren18@fudan.edu.cn, wbyin@fudan.edu.cn, llwang@fudan.edu.cn
