
FreeFlow: Software-based Virtual RDMA Networking for Containerized Clouds



  1. FreeFlow: Software-based Virtual RDMA Networking for Containerized Clouds
     Daehyeok Kim, Tianlong Yu (1), Hongqiang Liu (3), Yibo Zhu (4), Jitu Padhye (2), Shachar Raindel (2), Chuanxiong Guo (4), Vyas Sekar (1), Srinivasan Seshan (1)
     (1) Carnegie Mellon University, (2) Microsoft, (3) Alibaba Group, (4) ByteDance

  2. Two Trends in Cloud Applications
     • Containerization: lightweight isolation, portability
     • RDMA networking: higher networking performance

  3. Benefits of Containerization
     [Figure: Host 1 runs Container 1 (IP 10.0.0.1) and Container 2 (IP 20.0.0.1); Container 2 migrates to Host 2 and keeps its IP. Each container runs its app in its own network namespace (isolation), and a software switch on each host connects the containers to the host NIC (IPs 30.0.0.1 and 40.0.0.1), which gives portability.]

  4. Containerization and RDMA are in Conflict!
     [Figure: RDMA apps in Container 1 and Container 2 on Host 1 both carry the IP of the host's RDMA NIC (10.0.0.1); after migration to Host 2, Container 2 carries that host's RDMA NIC IP (20.0.0.1). Labels: Namespace Isolation, Portability, Migration.]

  5. Existing H/W based Virtualization Isn’t Working
     Using Single Root I/O Virtualization (SR-IOV)
     [Figure: each container's RDMA app is attached to a virtual function (VF) of the host's RDMA NIC, and the VFs are connected through the NIC switch. Container IPs on Host 1 are 10.0.0.1 and 10.0.0.2; the container migrated to Host 2 has IP 20.0.0.1. Labels: Namespace Isolation, Portability, Migration.]

  6. Sub-optimal Performance of Containerized Apps
     RDMA networking can improve the training speed of NN models by ~10x!
     [Figure: two plots comparing Native RDMA with Container+TCP. Speech recognition RNN training: CDF of time per step (sec). Image classification CNN training: training speed (images/sec) for Resnet-50, Inception-v3, and Alexnet. Annotated speedups: 14.4x and 9.2x.]

  7. Our Work: FreeFlow
     • Enable high-speed RDMA networking capabilities for containerized applications
     • Compatible with existing RDMA applications
     • Close to native RDMA performance
     • Evaluation with real-world data-intensive applications

  8. Outline
     • Motivation
     • FreeFlow Design
     • Implementation and Evaluation

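  9. FreeFlow Design Overview
     [Figure: side-by-side comparison. Native RDMA: an RDMA app on the host calls the Verbs API, and the Verbs library issues NIC commands to the RDMA NIC. FreeFlow: RDMA apps in Container 1 (IP 10.0.0.1) and Container 2 (IP 20.0.0.1) call the same Verbs API; FreeFlow sits between the apps and the host's Verbs library, which drives the RDMA NIC (IP 30.0.0.1).]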

  10. Background on RDMA
      “Host 1 wants to write contents in MEM-1 to MEM-2 on Host 2”
      1. Control path
         - Set up RDMA context
         - Post work requests (e.g., write)
      2. Data path
         - NIC processes work requests
         - NIC directly accesses memory
      [Figure: on each host, an RDMA app holds an RDMA context (RDMA CTX) and a memory region (MEM-1 on Host 1, MEM-2 on Host 2) on top of the Verbs library and the RDMA NIC.]
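
The split between control path and data path can be illustrated with a toy model. This is only a sketch: `ToyNic`, its methods, and the string keys are hypothetical names invented for illustration, not part of libibverbs or FreeFlow, and a real NIC moves the bytes over the network with DMA rather than through a Python slice assignment.

```python
from collections import deque


class ToyNic:
    """Hypothetical stand-in for an RDMA NIC; not a real verbs binding."""

    def __init__(self):
        self.regions = {}          # key -> registered buffer
        self.send_queue = deque()  # posted work requests
        self.peer = None           # the NIC on the other host

    # Control path: the application sets up state and posts work requests.
    def register_memory(self, key, buffer):
        self.regions[key] = buffer

    def post_write(self, local_key, remote_key, length):
        self.send_queue.append((local_key, remote_key, length))

    # Data path: the NIC processes work requests and touches memory itself.
    def process_work_requests(self):
        completed = 0
        while self.send_queue:
            local_key, remote_key, length = self.send_queue.popleft()
            src = self.regions[local_key]
            dst = self.peer.regions[remote_key]
            dst[:length] = src[:length]  # stands in for the NIC's direct memory access
            completed += 1
        return completed


nic1, nic2 = ToyNic(), ToyNic()
nic1.peer = nic2

mem1 = bytearray(b"payload from host 1")
mem2 = bytearray(len(mem1))
nic1.register_memory("MEM-1", mem1)
nic2.register_memory("MEM-2", mem2)

nic1.post_write("MEM-1", "MEM-2", len(mem1))  # control path
completed = nic1.process_work_requests()      # data path
```

After the work request is processed, `mem2` holds the same bytes as `mem1`, and the application on Host 2 never ran any code to receive them.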

  11. FreeFlow in the Scene
      “Container 1 wants to write contents in MEM-1 to MEM-2 on Container 2”
      • C1: How to forward verbs calls?
      • C2: How to synchronize memory?
      [Figure: the RDMA apps in Container 1 and Container 2 hold RDMA CTX and MEM-1 / MEM-2; FreeFlow on each host holds S-RDMA CTX and S-MEM-1 / S-MEM-2 above the Verbs library and the RDMA NIC.]

  12. Challenge 1: Verbs forwarding in Control Path
      struct ibv_qp { struct ibv_context *context; … };
      ibv_post_send(struct ibv_qp *qp, …)
      How should a FreeFlow shim under the Verbs API forward such a call from the RDMA app in the container to the Verbs library and RDMA NIC?
      • Attempt 1: Forward “as it is” ➔ Incorrect
      • Attempt 2: “Serialize” and forward ➔ Inefficient
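
One plausible reading of why forwarding "as it is" is incorrect is that structures such as `struct ibv_qp` hold pointers, which are only meaningful inside the process that created them. The sketch below illustrates this with `ctypes`. `FakeContext` and `FakeQP` are hypothetical, heavily simplified stand-ins and do not reproduce the real libibverbs layouts.

```python
import ctypes
import struct
import sys


class FakeContext(ctypes.Structure):
    # Hypothetical fields; the real struct ibv_context looks different.
    _fields_ = [("device_id", ctypes.c_uint32), ("num_ports", ctypes.c_uint32)]


class FakeQP(ctypes.Structure):
    # Mirrors only the idea that a queue pair points at its context.
    _fields_ = [("context", ctypes.POINTER(FakeContext)), ("qp_num", ctypes.c_uint32)]


def raw_bytes(obj):
    """Attempt 1: copy the struct's memory as it is (a shallow copy)."""
    return ctypes.string_at(ctypes.addressof(obj), ctypes.sizeof(obj))


def flatten(qp):
    """Attempt 2: follow the pointer by hand and pack the values it leads to."""
    ctx = qp.context.contents
    return struct.pack("<III", qp.qp_num, ctx.device_id, ctx.num_ports)


ctx = FakeContext(7, 2)
qp = FakeQP(ctypes.pointer(ctx), 42)

shallow = raw_bytes(qp)
pointer_size = ctypes.sizeof(ctypes.c_void_p)
# The first field of the shallow copy is just an address in this process.
address = int.from_bytes(shallow[:pointer_size], sys.byteorder)

flat = flatten(qp)
```

The shallow copy carries an address that another process cannot dereference, while the hand-written `flatten` has to know every nested field of every structure, which is the manual effort the second attempt implies.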

  13. Internal Structure of Verbs Library
      struct ibv_qp { struct ibv_context *context; … };
      ibv_post_send(struct ibv_qp *qp, …)
      Parameters are serialized by the Verbs library when it issues the NIC command!
      [Figure: the RDMA app in the container calls the Verbs API; a FreeFlow shim sits between the app and the Verbs library, which sends the NIC command to the RDMA NIC.]

  14. FreeFlow Control Path Channel
      Idea: Leveraging the serialized output of the verbs library
      • In the container, the RDMA app calls ibv_post_send(struct ibv_qp *qp, …) on the Verbs library; the FreeFlow library passes the output to a VNIC: Write (VNIC_fd, serialized parameters)
      • The FreeFlow router receives the data from the VNIC and hands it to the host's Verbs library, which issues the NIC command to the RDMA NIC
      Parameters are forwarded correctly without manual serialization!
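
The idea of writing already-serialized parameters to a file descriptor can be sketched with a local socket pair. Everything specific here is an assumption made for illustration: the slides do not say what kind of descriptor `VNIC_fd` is or how messages are framed, so the `HEADER` layout, `CMD_POST_SEND`, and the helper names are hypothetical.

```python
import socket
import struct

# Hypothetical framing: (command id, payload length), both little-endian u32.
HEADER = struct.Struct("<II")
CMD_POST_SEND = 1  # hypothetical command id


def send_command(vnic, cmd_id, payload):
    """Library side: forward an opaque, already-serialized command."""
    vnic.sendall(HEADER.pack(cmd_id, len(payload)) + payload)


def recv_exact(sock, n):
    """Read exactly n bytes, since recv may return fewer."""
    chunks = []
    while n > 0:
        chunk = sock.recv(n)
        if not chunk:
            raise ConnectionError("peer closed the channel")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def recv_command(sock):
    """Router side: read one framed command without interpreting its payload."""
    cmd_id, length = HEADER.unpack(recv_exact(sock, HEADER.size))
    return cmd_id, recv_exact(sock, length)


# The socket pair stands in for the VNIC between container and router.
app_side, router_side = socket.socketpair()
payload = b"\x2a\x00\x00\x00opaque-nic-command"
send_command(app_side, CMD_POST_SEND, payload)
received = recv_command(router_side)
app_side.close()
router_side.close()
```

The router side never parses the payload. It only needs the bytes to arrive intact, which is why no per-structure serialization code is required on this path.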

  15. Challenge 2: Synchronizing Memory for Data Path
      • Shadow memory (S-MEM) in the FreeFlow router
        • A copy of the application’s memory region (MEM)
        • Directly accessed by NICs
      • S-MEM and MEM must be synchronized.
        • How to synchronize S-MEM and MEM?

  16. Strawman Approach for Synchronization
      “Container 1 wants to write contents in MEM-1 to MEM-2 on Container 2”
      Explicit synchronization between MEM and S-MEM?
      • High frequency ➔ High overhead
      • Low frequency ➔ Wrong data for app
      [Figure: in each container, the RDMA app holds RDMA CTX and MEM-1 / MEM-2 and connects through a VNIC to the FreeFlow router, which holds S-RDMA CTX and S-MEM-1 / S-MEM-2 above the Verbs library and the RDMA NIC.]
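
The trade-off in the strawman can be made concrete with a small simulation. It is a sketch under simple assumptions that are not from the slides: the application updates its buffer once per step, an explicit copy to the shadow buffer happens every `period` steps, and the NIC reads the shadow buffer once per step.

```python
def simulate(period, steps=12, size=4096):
    """Count stale NIC reads and bytes copied for a given sync period."""
    mem = bytearray(size)      # application memory (MEM)
    shadow = bytearray(size)   # shadow memory in the router (S-MEM)
    stale_reads = 0
    bytes_copied = 0
    for step in range(1, steps + 1):
        mem[0] = step % 256            # the application updates its buffer
        if step % period == 0:         # explicit synchronization
            shadow[:] = mem
            bytes_copied += size
        if shadow[0] != mem[0]:        # the NIC would read outdated data
            stale_reads += 1
    return stale_reads, bytes_copied


frequent = simulate(period=1)  # (0, 49152): always fresh, 12 full copies
rare = simulate(period=4)      # (9, 12288): 3 copies, stale in 9 of 12 steps
```

Synchronizing on every step removes stale reads at the cost of copying the whole region each time, while synchronizing less often saves copies but lets the NIC see outdated data.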

  17. Containers can Share Memory Regions
      • The FreeFlow router is running in a container
      • MEM and S-MEM can be located on the same physical memory region
      [Figure: on one host, the RDMA app's container (RDMA CTX, MEM-1) and the FreeFlow router (S-RDMA CTX, S-MEM-1) are connected through the VNIC and backed by shared memory, above the Verbs library and the RDMA NIC.]

  18. Zero-copy Synchronization in Data Path
      How to allocate MEM-1 in the shadow memory space?
      Synchronization without explicit memory copy:
      • Method 1: Allocate shared buffers with FreeFlow APIs
      • Method 2: Re-map the app’s memory space to the shadow memory space
      FreeFlow supports both!
      [Figure: on one host, the RDMA app's container (RDMA CTX, MEM-1) and the FreeFlow router (S-RDMA CTX, S-MEM-1) share memory, above the Verbs library and the RDMA NIC.]
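
The effect of placing MEM and S-MEM on the same physical memory can be illustrated with Python's standard `multiprocessing.shared_memory` module. This is a sketch of the general shared-memory idea only, run inside a single process for simplicity. It does not use FreeFlow's allocation APIs, and the variable names are hypothetical.

```python
from multiprocessing import shared_memory


def demo():
    # "Router" side: create the shared region that plays the role of S-MEM.
    shadow = shared_memory.SharedMemory(create=True, size=4096)
    try:
        # "App" side: attach to the same region by name; this plays MEM.
        app_view = shared_memory.SharedMemory(name=shadow.name)
        try:
            app_view.buf[:5] = b"hello"             # the app writes its data
            seen_by_router = bytes(shadow.buf[:5])  # no copy step in between
        finally:
            app_view.close()
    finally:
        shadow.close()
        shadow.unlink()  # release the region once both sides are done
    return seen_by_router


result = demo()
```

Because both handles map the same region, the write through one is immediately visible through the other, so there is nothing left to synchronize.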

  19. FreeFlow Design Summary
      • FreeFlow control path channel
      • Zero-copy memory synchronization
      FreeFlow provides near-native RDMA performance for containers!
      [Figure: RDMA apps in Container 1 (IP 10.0.0.1) and Container 2 (IP 20.0.0.1) each connect through a VNIC to the FreeFlow router, which uses the host's Verbs library and RDMA NIC (IP 30.0.0.1).]

  20. Outline
      • Motivation
      • FreeFlow Design
      • Implementation and Evaluation

  21. Implementation and Experimental Setup
      • FreeFlow Library
        • Adds 4000 lines of C to libibverbs and libmlx4
      • FreeFlow Router
        • 2000 lines of C++
      • Testbed setup
        • Two Intel Xeon E5-2620 8-core CPUs, 64 GB RAM
        • 56 Gbps Mellanox ConnectX-3 NICs
        • Docker containers

  22. Does FreeFlow Support Low Latency?
      [Figure: latency (μs) vs. message size (64 B to 4 KB) for Native RDMA and FreeFlow; annotation: 0.38 μs.]

  23. Does FreeFlow Support High Throughput?
      [Figure: throughput (Gbps) vs. message size (2 KB to 1 MB) for Native RDMA and FreeFlow; annotation: “Bounded by control path channel performance”.]

  24. Do Applications Benefit from FreeFlow?
      [Figure: CDF of time per step (sec) for Container+TCP, Native RDMA, and FreeFlow; annotated speedup: 8.7x.]

  25. Summary
      • Containerization today can’t benefit from the speed of RDMA.
      • Existing solutions for NIC virtualization (e.g., SR-IOV) don’t work.
      • FreeFlow enables containerized apps to use RDMA.
      • Challenges and Key Ideas
        • Control path: Leveraging Verbs library structure for efficient Verbs forwarding
        • Data path: Zero-copy memory synchronization
      • Performance close to native RDMA
      github.com/microsoft/freeflow
