
Providing Multi-tenant Services with FPGAs: Case Study on a Key-Value Store



  1. Providing Multi-tenant Services with FPGAs: Case Study on a Key-Value Store. Zsolt István*, Gustavo Alonso, Ankit Singla. Systems Group, Computer Science Dept., ETH Zürich. (* Now at IMDEA Software Institute, Madrid.)

  2. FPGAs in the Cloud • Wider adoption of FPGAs (e.g., Amazon F1, Microsoft Catapult, …) • Many promising use cases, but often single-tenant designs • Clouds are built on sharing and multi-tenancy, which requires: high utilization, flexible provisioning, load isolation and QoS guarantees

  3. Providing multi-tenancy with FPGAs • FPGA virtualization: general purpose (PR); few tenants; trades off functionality; coarse-grained resource allocation; tenants “bring” applications • Multi-tenant applications: domain-specific; many tenants; trades off performance (?); fine-grained resource allocation; provider “brings” the application

  4. Multi-tenant application as a service: Key-value store • Widely deployed in the cloud and datacenters • Different tradeoffs but similar interfaces, e.g.: • Memcached: caching, no replication, latency-optimized, main-memory • Amazon S3: BLOB store, replicated, bandwidth-optimized, needs large capacity

  5. Building a multi-tenant KVS (Multes) • Area well studied in related work • Several pipelined designs, all saturate the network link • Caribou: interfaces and functionality similar to software [VLDB17] • FPGA can provide replication for fault-tolerance [NSDI16] • Requirements for multi-tenancy: performance isolation; data isolation; flexibility in resource allocation (focus on network bandwidth); efficient use of resources regardless of the number of tenants. References: [VLDB17] Z. István, D. Sidler, G. Alonso: Caribou: Intelligent Distributed Storage. [NSDI16] Z. István, D. Sidler, G. Alonso, M. Vukolic: Consensus in a Box: Inexpensive Coordination in Hardware.

  6. Designing for multi-tenancy • Caribou is composed of four modules • Requests can take various routes • Some traffic is inter-node • Hard to reason about load interactions! • Multes: reorganized pipeline to ensure all requests take the same path (1) • Hash table implements parts of the replication log features (multi-version) • More coupling between modules (op-codes) [Figure: the Caribou design (Network Stack (TCP), Replication + Log Manager, Hash Table + Allocator, Value Access + Processing, Memory; client and replication messages) beside the Multes single pipeline (Network Stack (TCP), Traffic Shapers, Replication, Multivers. Hash Table + Allocator, Value Access + Processing, Memory).]

  7. Token buckets • Commonly used in networking scenarios • Max. number of tokens (D), adding C tokens every T cycles • Limits data rate, burst size • Buffer space on the FPGA? • Queue commands before data movement • Token buckets can be configured with no overhead at runtime (2) • Per-tenant allocations controlled by software [Figure: Traffic Shaper. Input packets/commands pass through a tenant ID extraction step into per-tenant token buckets, which are served round-robin to produce output packets/commands; the configuration holds per-tenant limits (D, C, T). Each request/command consists of metadata and a body, and the metadata encodes the “real cost” of the request.]
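
The shaping scheme on this slide can be illustrated with a small software model. This is only a sketch under stated assumptions, not the Multes hardware design: the class and method names (`TokenBucket`, `TrafficShaper`, `enqueue`, `step`) are hypothetical, one token is assumed to pay for one unit of request cost, and at most one command leaves the shaper per cycle.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class TokenBucket:
    depth: int    # D: maximum number of tokens (bounds the burst size)
    refill: int   # C: tokens added on each refill
    period: int   # T: cycles between refills
    tokens: int = 0

    def __post_init__(self):
        self.tokens = self.depth  # assumption: buckets start full

    def tick(self, cycle):
        # Add C tokens every T cycles, never exceeding D.
        if cycle % self.period == 0:
            self.tokens = min(self.depth, self.tokens + self.refill)

    def try_consume(self, cost):
        if cost <= self.tokens:
            self.tokens -= cost
            return True
        return False


class TrafficShaper:
    """Per-tenant command queues served round-robin, gated by token buckets."""

    def __init__(self, limits):
        # limits: tenant_id -> (D, C, T); software could rewrite these at runtime.
        self.buckets = {t: TokenBucket(*p) for t, p in limits.items()}
        self.queues = {t: deque() for t in limits}
        self.order = list(limits)
        self.next_idx = 0

    def enqueue(self, tenant, cost):
        # Only the command and its cost (taken from the request metadata)
        # are queued, i.e. before any data movement happens.
        self.queues[tenant].append(cost)

    def step(self, cycle):
        """Advance one cycle; return the tenant whose command was released, or None."""
        for bucket in self.buckets.values():
            bucket.tick(cycle)
        n = len(self.order)
        for i in range(n):
            idx = (self.next_idx + i) % n
            tenant = self.order[idx]
            queue = self.queues[tenant]
            if queue and self.buckets[tenant].try_consume(queue[0]):
                queue.popleft()
                self.next_idx = (idx + 1) % n
                return tenant
        return None
```

In this model a tenant with limits `(D, C, T)` can never release more than `D + C * (cycles // T)` units of cost over a run, which is the rate and burst bound the slide refers to.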

  8. Replicated KVS • Caribou implements inter-FPGA replication (leader-based algorithm) [Figure: four FPGA nodes hosting three replicated tenant groups (Tenant 1, Tenant 2, Tenant 3), each group with one leader and two replicas.]

  9. Having multiple roles • Control state machine at the heart of the replication protocol • Data and control handled separately • Multiple copies not an option: complex logic + plumbing • State machine (SM) extended to store state for each tenant, so it can context switch on each packet (3) • Not all states need tenant context • Latency inside the SM is not on the critical path • State is now in registers, but BRAMs could be used to store it [Figure: replication controller (atomic broadcast protocol) with separate state for Tenant 1, Tenant 2 and Tenant 3 (list of peers, role in protocol, outstanding proposals, etc.); the input message and output command encode key, data op., socket numbers, etc.]
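
The idea of one shared control state machine with per-tenant contexts can be sketched in software as follows. This is a deliberately simplified, hypothetical leader-based scheme (propose, ack, commit on majority), not the atomic broadcast protocol from [NSDI16]; `TenantContext`, `ReplicationController`, and the message names are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class TenantContext:
    role: str                      # "leader" or "replica" for this tenant's group
    peers: list                    # the other nodes in this tenant's group
    next_seq: int = 0
    outstanding: dict = field(default_factory=dict)  # proposal seq -> acks received


class ReplicationController:
    """One copy of the control logic; state is selected per message by tenant ID."""

    def __init__(self, contexts):
        self.contexts = contexts   # tenant_id -> TenantContext

    def handle(self, tenant_id, msg_type, payload):
        ctx = self.contexts[tenant_id]   # the "context switch": a state lookup
        if msg_type == "write" and ctx.role == "leader":
            seq = ctx.next_seq
            ctx.next_seq += 1
            ctx.outstanding[seq] = 0
            return [("propose", peer, seq, payload) for peer in ctx.peers]
        if msg_type == "propose" and ctx.role == "replica":
            return [("ack", payload)]    # payload carries the proposal seq
        if msg_type == "ack" and ctx.role == "leader":
            seq = payload
            if seq not in ctx.outstanding:
                return []                # late ack for an already committed proposal
            ctx.outstanding[seq] += 1
            group_size = len(ctx.peers) + 1
            # The leader counts itself; commit once a majority has the proposal.
            if ctx.outstanding[seq] + 1 > group_size // 2:
                del ctx.outstanding[seq]
                return [("commit", seq)]
        return []
```

Because every call selects its context by tenant ID, messages of different tenants can be interleaved freely, and the same node can act as leader for one tenant and as replica for another without duplicating the control logic.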

  10. Evaluation of Multes • Multiple Xilinx VC709s connected to a 10Gbps switch • 9 load-generating machines, Go-based benchmarking tool • Tenants connect to different TCP port numbers (e.g. 2880, 2881, …) ✓ Multes offers flexible multi-tenancy while efficiently using the network link [Figure: clients of Tenant 1 and Tenant 2 connect over the network to Multes nodes, which run the replication protocol and attach to memory/storage.]

  11. No performance loss due to multi-tenancy • Read-only throughput on a single node

  12. Load isolation • Replicated write latency of Tenant 0 (group = 3) • Additional tenants using their full read bandwidth (1/8 of 10Gbps) [Figure: replicated write latency in µs, without client overhead.]
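
As a worked example of what a 1/8 share of the 10Gbps link means for token-bucket refill parameters, the following sketch computes a tokens-per-cycle ratio C/T. The clock frequencies and the choice of one token per byte are assumptions made for illustration, and `refill_params` is a hypothetical helper; the slides do not state these values.

```python
from fractions import Fraction


def refill_params(link_bps, share, clock_hz, bytes_per_token=1):
    """Return (C, T) in lowest terms so that C tokens every T cycles match the share."""
    bytes_per_sec = Fraction(link_bps) * share / 8        # bits/s -> bytes/s
    tokens_per_cycle = bytes_per_sec / (clock_hz * bytes_per_token)
    return tokens_per_cycle.numerator, tokens_per_cycle.denominator


# 1/8 of 10 Gbps is 1.25 Gbps, i.e. 156.25 MB/s.
# Assumed 156.25 MHz clock: one byte-token per cycle.
print(refill_params(10_000_000_000, Fraction(1, 8), 156_250_000))
# Assumed 200 MHz clock: 25 byte-tokens every 32 cycles.
print(refill_params(10_000_000_000, Fraction(1, 8), 200_000_000))
```

The bucket depth D is independent of this ratio and would be chosen from the burst size a tenant is allowed.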

  13. Resource Usage: Small cost for sharing [Figure: % of VC709 resources (logic and BRAM) used by Multes as the max. number of supported tenants grows from 1 to 16, compared against Caribou and 2x Caribou.] The FPGA part on the VC709 is XC7VX690T-2FFG1761C.

  14. Thoughts on the future: Platform-as-a-service • Customize the KVS with tenant-defined processing for different “flavors” • Combining the multi-tenant application with small PR regions • Simple streaming interfaces, so HLS, OpenCL, etc. can be used • A misbehaving PR region does not impact others [Figure: the Multes pipeline (Network Stack (TCP), Traffic Shapers, Replication, Multivers. Hash Table + Allocator, Value Access + Processing, Memory).]

  15. Conclusion • Multes: a multi-tenant KVS service that doesn’t sacrifice performance • Project on GitHub: https://github.com/fpgasystems/caribou • Relied on three techniques: 1) Single-pipeline architecture and traffic shapers → no load interaction 2) Runtime parameterization of control modules → flexible allocations 3) “Contexts” in controlling state machines → no overhead when switching between tenants • Applicable to many network-facing applications on FPGAs
