SoftRDMA: Rekindling High Performance Software RDMA over Commodity Ethernet
Mao Miao, Fengyuan Ren, Xiaohui Luo, Jing Xie, Qingkai Meng, Wenxue Cheng
Dept. of Computer Science and Technology, Tsinghua University
control to prevent packet drops
separate networks
actually provide that capability
congestion problems, etc
control mechanisms to ensure scalability, routability and reliability
[Figure: three layouts of the same protocol stack (Applications, Verbs API, RDMAP, DDP, MPA, TCP, IP, NIC driver, Data Link). One layout places the stack in user space on top of DPDK; the other two are divided between user space and kernel space.]
Two-Copy
[Figure: packet receive path in the kernel, steps 1 to 7. Components: NIC1 (DMA engine, interrupt generator), RX ring buffer, device driver, NIC interrupt handler (hardware interrupt, Netif_rx_schedule(), raised softirq check), softirq (Net_rx_action, Poll_queue(), dev->poll), IP/TCP stack processing in the kernel, and App1/App2 read() in user space. Legend: control flow, data flow, 1st copy, 2nd copy.]
space to remove the copy in Step 7
Step 6
how to manage them?
the input Pkts before stack processing
reused fast to hold input packets?
up to thousands of input packets
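The buffer-management questions above (how to manage the buffers, and how to reuse them quickly for new input packets) can be illustrated with a preallocated pool. This is only a minimal sketch under stated assumptions, not the SoftRDMA implementation: the class `PacketBufferPool`, its methods, and the slot size of 2048 bytes are hypothetical names and values chosen for illustration. Buffers are carved from one allocation up front and handed out as `memoryview` slices, so writing into a buffer does not copy it.

```python
class PacketBufferPool:
    """Fixed-size pool of preallocated packet buffers (hypothetical sketch)."""

    def __init__(self, num_buffers: int, buffer_size: int) -> None:
        self.buffer_size = buffer_size
        # One contiguous allocation, carved into equal slots up front.
        self._arena = bytearray(num_buffers * buffer_size)
        self._view = memoryview(self._arena)
        # Free list of slot indices, used as a stack (O(1) push and pop).
        self._free = list(range(num_buffers))

    def acquire(self):
        """Return (slot, writable view), or None when the pool is exhausted."""
        if not self._free:
            return None
        slot = self._free.pop()
        start = slot * self.buffer_size
        # Slicing a memoryview yields a view into the arena, not a copy.
        return slot, self._view[start:start + self.buffer_size]

    def release(self, slot: int) -> None:
        """Hand a slot back so it can hold the next input packet."""
        self._free.append(slot)

    def available(self) -> int:
        return len(self._free)


pool = PacketBufferPool(num_buffers=4, buffer_size=2048)
slot, buf = pool.acquire()
payload = b"hello"
buf[:len(payload)] = payload          # stands in for the NIC filling the buffer
received = bytes(buf[:len(payload)])  # explicit copy, only to inspect the data
pool.release(slot)
```

Because the free list is a stack, the most recently released slot is handed out first. The sketch does not guard against releasing the same slot twice; a real pool would need that check.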
[Figure (c1): threading model with Thread 1 and Thread 2. Labels: TCP/IP, App, event conditions, batched syscalls.]
[Figure (c2): threading model with Thread 1 and Thread 2. Labels: TCP/IP, App, NIC driver, user space.]
processing and Pkts’ TX
efficiency and reduce the latency
[Figure (c3): threading model with Thread 1 and Thread 2. Labels: TCP/IP, App.]
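The labels above (Thread 1, Thread 2, event conditions, batching) suggest a hand-off between a stack-processing thread and an application thread. The following is a rough sketch of that general pattern only, assuming Thread 1 runs the stack and Thread 2 runs the application; the slides do not confirm this split. The function names, `BATCH_SIZE`, and the `upper()` call standing in for TCP/IP processing are all hypothetical.

```python
import queue
import threading

BATCH_SIZE = 4      # assumed batch size, not a value from the slides
STOP = object()     # sentinel that tells the application thread to exit


def stack_thread(packets, handoff):
    """Thread 1: 'process' packets and hand them over in batches."""
    batch = []
    for pkt in packets:
        batch.append(pkt.upper())   # placeholder for TCP/IP processing
        if len(batch) == BATCH_SIZE:
            handoff.put(batch)      # one hand-off per batch, not per packet
            batch = []
    if batch:
        handoff.put(batch)          # flush the final partial batch
    handoff.put(STOP)


def app_thread(handoff, delivered, handoffs):
    """Thread 2: block until a batch arrives, then consume all of it."""
    while True:
        batch = handoff.get()       # blocks until the other thread signals
        if batch is STOP:
            break
        handoffs.append(len(batch))
        delivered.extend(batch)


packets = [f"pkt{i}".encode() for i in range(10)]
handoff = queue.Queue()
delivered, handoffs = [], []
t1 = threading.Thread(target=stack_thread, args=(packets, handoff))
t2 = threading.Thread(target=app_thread, args=(handoff, delivered, handoffs))
t1.start()
t2.start()
t1.join()
t2.join()
```

With ten packets and a batch size of four, the application thread is woken three times rather than ten, which is the cost that batching is meant to amortize.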
API (Sequential API)
(SoftRDMA)
3.59 µs / 64 B, 5.29 µs / 1 KB, 16.27 µs / 10 KB
93.45 µs / 100 KB, 432.50 µs / 500 KB
7893.31 Mbps / 100 KB
8854.16 Mbps / 10 KB, 8917.44 Mbps / 100 KB
high-performance network I/O