
Multitenancy for Fast and Programmable Networks in the Cloud - PowerPoint PPT Presentation

Multitenancy for Fast and Programmable Networks in the Cloud. Tao Wang*, Hang Zhu*, Fabian Ruffy, Xin Jin, Anirudh Sivaraman, Dan Ports, and Aurojit Panda (* Equal contribution)


  1. Multitenancy for Fast and Programmable Networks in the Cloud
     Tao Wang*, Hang Zhu*, Fabian Ruffy, Xin Jin, Anirudh Sivaraman, Dan Ports, and Aurojit Panda (* Equal contribution)

  2. What does today’s cloud offer as a service?
     - Generic compute and storage resources
     - Specialized accelerators

  3. Emergence of programmable network devices
     - Pipeline-based programmable devices
       - In-network switches
       - At-host SmartNICs
     - Enable wide-ranging innovations for classical networked systems
       - Consensus: NOPaxos, NetPaxos
       - Concurrency control: Eris
       - Caching: NetCache, IncBricks
       - Storage: NetChain, SwitchKV
       - Applications: SwitchML, NetAccel
       - …

  4. Why not offer such systems as a cloud service?
     - Need for multitenancy support
       - Provider’s aspect
         - Improve resource utilization
         - One application can hardly consume all the hardware resources
         - Heterogeneous resource requirements
       - Tenant’s aspect
         - Enable innovations
         - New programs can be easily tested w/o impacting basic network functionality

  5. How to enable multitenancy for programmable devices?
     Requirements:
     - Resource efficiency
       - Little overhead
     - Isolation
       - Performance
       - Allocated resource
     Our vision: a hybrid compile-time and run-time solution

  6. Background on programmable network devices
     [Figure: pipeline of a programmable switch. A packet passes through the parser, the ingress pipeline, the queues, and the egress pipeline. Each pipeline stage (Stage 1, …) contains exact-match and ternary-match crossbars (Xbars), SRAMs/TCAMs, match-action units, action units, and stateful memory (e.g., registers). The PHV containers carry packet headers (e.g., the Ethernet header) and per-packet metadata (e.g., queue length, enqueue port).]

  7. Programmable devices’ characteristics
     - Various types of hardware resources
       - Most of them are decided during compile time
     - Limited run-time support
       - Hardware wirings are decided during compile time
       - Line-rate performance achieved after successful compilation
       - No temporal scheduling (e.g., CPU or NPU scheduling)
       - No spatial reconfiguration (e.g., FPGA [AmorphOS, OSDI’18])
     Requirements:
     - Resource efficiency
       - Little overhead
     - Isolation
       - Performance
       - Allocated resource
     [Diagram labels: Performance, Programmability]

  8. A hybrid compile-time and run-time solution
     - Compile-time program linker
       - Targets generic resources (e.g., SRAMs/TCAMs, action units, etc.)
       - But static
     - Run-time memory allocator
       - Targets stateful memory
       - But limited

  9. System overview
     [Figure: system architecture. Tenants (T1 … Tn) and the system program (S) submit requests. Control plane, run-time: Memory Allocator, Utility Calculator, Reallocation Problem Solver, Table Entry Handler. Compile-time linker: Resource Sharing Policy, Resource Usage Checker, Program Linker. A Translation Layer also appears in the diagram. Data plane: a merged jumbo program spanning Stage 1 … Stage m, holding header & metadata, sys & tenant tables, and Config Params, Counter, and Record register arrays, each laid out as "One Big Array".]

  10. Goals of compile-time linker
      - Restrict resource usage
      - Provide isolation
        - Ensure a tenant program does not interfere with others’
        - Ensure no infinite packet resubmitting
        - Ensure no loop forwarding configuration
        - …

  11. Parser
      - Fixed packet format
        - Eth, VLAN, IP, TCP or UDP header followed by custom headers
      - System program
        - Extract common headers
      - Tenant programs
        - Extract tenant-defined headers
      [Diagram: the merged parser applies S’s parser to extract common headers, then "if (tag==T1’s VID) apply T1’s parser …". Header { Ethernet hdr, VLAN hdr, IP hdr, TCP or UDP hdr (system program); T1 hdr … Tn hdr (tenant programs) }]
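
The parser composition on this slide can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the authors' linker: packets are plain dicts, and `link_parsers`, `system_parser`, `parse_t1`, and the 2-byte tenant header are hypothetical names and formats introduced only for illustration. The slide itself only says that S's parser runs first and that the tenant parser is chosen by VID.

```python
from typing import Callable, Dict

def link_parsers(system_parser: Callable[[dict], dict],
                 tenant_parsers: Dict[int, Callable[[bytes], dict]]) -> Callable[[dict], dict]:
    """Return one merged parser: S's parser first, then the matching tenant's."""
    def merged(pkt: dict) -> dict:
        headers = system_parser(pkt)                  # extract common headers
        tenant_parser = tenant_parsers.get(headers["vid"])
        if tenant_parser is not None:                 # if (tag == T_i's VID)
            # The tenant parser only ever sees the custom-header bytes.
            headers["tenant"] = tenant_parser(pkt["custom"])
        return headers
    return merged

def system_parser(pkt: dict) -> dict:
    """Toy stand-in for S's parser: copy the common fields out of the packet."""
    return {"vid": pkt["vid"], "ip_src": pkt["ip_src"]}

def parse_t1(custom: bytes) -> dict:
    """Hypothetical tenant 1 header: a 2-byte big-endian key."""
    return {"key": int.from_bytes(custom[:2], "big")}

merged_parser = link_parsers(system_parser, {1: parse_t1})
```

Keeping the tenant's output under its own `"tenant"` key means a tenant parser cannot overwrite the common headers, which mirrors the isolation goal stated for the linker.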

  12. Control (ingress and egress) pipeline
      - Feed-forward packet flow
      - "Sandwich" architecture
        - write-then-read half
        - read-then-write half
      - System program
        - Interact with tenant programs
        - E.g., pass system states
        - Convert virtual addresses to physical ones
      [Diagram: control pipeline along the packet flow, with three blocks: "Pass system states to tenants", "if (tag==T1’s VID) apply T1’s ctrl …", and "Convert to system states". System states { egress_port, link utilization, packet count, … }]

  13. Run-time memory allocator
      - Page-table-like indirection
      - Match-action table, populated by the control-plane memory allocator:
        - VID==1 (Tenant 1): metadata.offset=0, metadata.amount=2^6
        - VID==2 (Tenant 2): metadata.offset=512, metadata.amount=2^4
        - …
      - Address translation: pkt.physical_address = metadata.offset + (pkt.virtual_address % metadata.amount)
      - Register arrays (Config Params, Counter, Record): each is One Big Array
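
The page-table-like indirection on this slide can be illustrated with a minimal Python sketch. It assumes the slide's "2 6" and "2 4" are the exponents 2^6 and 2^4; `Allocation`, `ALLOC_TABLE`, and `translate` are hypothetical names, and a Python dict stands in for the match-action table keyed on VID.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Allocation:
    offset: int   # first physical slot of the tenant's region
    amount: int   # number of slots in the tenant's region

# Entries from the slide, keyed by VID (amounts assumed to be 2^6 and 2^4).
ALLOC_TABLE = {
    1: Allocation(offset=0, amount=2**6),
    2: Allocation(offset=512, amount=2**4),
}

def translate(vid: int, virtual_address: int, table=ALLOC_TABLE) -> int:
    """physical = offset + (virtual % amount), as written on the slide."""
    entry = table[vid]
    return entry.offset + (virtual_address % entry.amount)

# One shared register array; each tenant only reaches its own slice.
big_array = [0] * 1024
big_array[translate(1, 70)] += 1    # tenant 1, virtual 70 -> physical 6
big_array[translate(2, 17)] += 1    # tenant 2, virtual 17 -> physical 513
```

Because of the modulo, any virtual address a tenant supplies lands inside `[offset, offset + amount)`, so an out-of-range access wraps within the tenant's own region instead of touching a neighbour's.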

  14. Implementation
      - Prototype on Barefoot Tofino switch
      - Compile-time linker
        - Extends the open-source P4 compiler [1]
      - Run-time memory allocator
        - Based on auto-generated APIs to pull records and modify table entries
      [1] https://github.com/p4lang/p4c

  15. Compile-time program linker correctness
      - Resource usage on Tofino
      - Packet-level validation on PTF
        - Sys program
          - Basic parsing and forwarding logic
        - [SOSP’17] NetCache
        - [NSDI’18] NetChain
      - Overhead
        - Additional gateway tables to check which program is to be executed
        - Additional tag-along PHV containers
      [Figure: bar chart of resource usage (% of total, y-axis 0 to 150) per hardware resource type for the merged program, Sys program, NetCache, and NetChain.]

  16. Run-time memory allocator efficiency
      - Experimental setting
        - 64 tenants submit 1-min heavy hitter detection tasks against source IP addresses within their /6 subnets
        - 10-min CAIDA trace replay
      - Evaluation metric
        - Utility: memory hit ratio
        - Satisfaction: time fraction w/ utility > 0.9
        - We show the mean and 5th percentile
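
As a rough illustration of the two metrics on this slide, the sketch below computes a hit ratio, the fraction of time samples with utility above 0.9, and a percentile across tenants. The slide does not define these precisely, so everything here is an assumption: utility is taken as hits divided by accesses, satisfaction is computed over evenly spaced utility samples, and the percentile uses the nearest-rank method. The function names are hypothetical.

```python
import math
from statistics import mean

def hit_ratio(hits: int, accesses: int) -> float:
    """Utility, assumed here to be hits / accesses for one time sample."""
    if accesses <= 0:
        raise ValueError("accesses must be positive")
    return hits / accesses

def satisfaction(utilities, threshold: float = 0.9) -> float:
    """Fraction of time samples whose utility is strictly above the threshold."""
    if not utilities:
        raise ValueError("need at least one utility sample")
    return sum(u > threshold for u in utilities) / len(utilities)

def percentile(values, p: float) -> float:
    """Nearest-rank percentile (one of several common definitions)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]

# Made-up per-tenant utility samples, one list per tenant.
per_tenant = [[0.95, 0.85, 0.92, 0.91], [0.99, 0.97, 0.93, 0.96]]
scores = [satisfaction(u) for u in per_tenant]
summary = (mean(scores), percentile(scores, 5))
```

With the made-up samples above, the first tenant is satisfied in three of four samples and the second in all four; the numbers are illustrative only and are not results from the talk.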

  17. Conclusion
      - Takeaways
        - A hybrid solution for multi-tenancy support
        - Compile-time linker: general but static
        - Run-time memory allocator: dynamic but limited
      - Future work
        - Seek new hardware design
        - Both general and dynamic

  18. Thanks! Happy to take questions.
      tw1921@nyu.edu
