Implementation and Analysis of Large Receive Offload in a Virtualized System

Implementation and Analysis of Large Receive Offload in a Virtualized System Takayuki Hatori and Hitoshi Oi* Computer Architecture and Operating Systems Group, The University of Aizu (*presenter) VPACT08, Austin, TX Apr 20, 2008


SLIDE 1

Implementation and Analysis of Large Receive Offload in a Virtualized System

Takayuki Hatori and Hitoshi Oi* Computer Architecture and Operating Systems Group, The University of Aizu

(*presenter)

VPACT08, Austin, TX Apr 20, 2008

SLIDE 2
  • T. Hatori & Hitoshi Oi, VPACT08, Austin, TX, Apr 20, 2008

Implementation and Analysis of LRO in a Virtualized System

Outline

  • System level virtualization
  • Large Receive Offload
  • Xen internal network architecture
  • Large Receive Offload Implementation
  • Experimental Results & Analysis
  • Conclusions

SLIDE 3

System Level Virtualization

  • Multiple independent “machines” on top of a single hardware platform
  • Utilization and consolidation of hardware resources
  • Isolation and protection against software malfunctions and attacks on VMs
  • These advantages come with overhead, especially in I/O components

SLIDE 4

Objectives

  • Port LRO to a Xen virtualized system and see how it improves network performance
  • Modify the internal network architecture (the interface between domains)
  • Experiments and evaluations
      • Sender-receiver programs transfer 10GB of data
      • Measured throughput and CPU utilization
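The slides do not show the sender-receiver programs themselves. As a rough illustration only, a throughput measurement of this kind could be sketched as below. The function names, the loopback address, and the chunk size are assumptions made for this example, and the transfer size is left as a parameter so it can be tried with far less than 10GB.

```python
import socket
import threading
import time


def _receive(server: socket.socket, result: dict) -> None:
    """Accept one connection and count bytes until the sender closes it."""
    conn, _ = server.accept()
    total = 0
    start = time.perf_counter()
    with conn:
        while True:
            chunk = conn.recv(65536)
            if not chunk:  # empty read: the sender closed the connection
                break
            total += len(chunk)
    result["bytes"] = total
    result["seconds"] = time.perf_counter() - start


def run_transfer(total_bytes: int, chunk_size: int = 65536) -> dict:
    """Send total_bytes over loopback TCP and report receiver-side throughput."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.settimeout(10)           # avoid hanging forever if nothing connects
    server.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
    server.listen(1)
    result: dict = {}
    receiver = threading.Thread(target=_receive, args=(server, result), daemon=True)
    receiver.start()

    buf = b"\0" * chunk_size
    with socket.create_connection(server.getsockname()) as sender:
        sent = 0
        while sent < total_bytes:
            n = min(chunk_size, total_bytes - sent)
            sender.sendall(buf[:n])
            sent += n

    receiver.join()
    server.close()
    seconds = max(result["seconds"], 1e-9)  # guard against a zero interval
    result["mbit_per_s"] = result["bytes"] * 8 / seconds / 1e6
    return result
```

Calling `run_transfer(1_000_000)` returns the byte count seen by the receiver and a throughput figure in Mbit/s. CPU utilization, the other metric named above, would need a separate tool and is not covered by this sketch.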

SLIDE 5

Large Receive Offload (LRO)

  • Receive multiple packets in a single receive operation
  • Aggregate packets into groups and pass them to the upper layers
  • Reduce overhead in packet-related data structures (e.g. skb)
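To make the aggregation idea concrete, here is a minimal sketch, not the implementation presented in this talk. It merges back-to-back TCP segments of one flow into a larger segment before handing them upward. `Segment`, `lro_aggregate`, and `max_bytes` are hypothetical names; a real LRO path would also track several flows at once and check TCP flags, options, and checksums.

```python
from dataclasses import dataclass
from typing import List, Tuple

Flow = Tuple[str, int, str, int]  # (src ip, src port, dst ip, dst port)


@dataclass
class Segment:
    flow: Flow
    seq: int        # sequence number of the first payload byte
    payload: bytes


def lro_aggregate(segments: List[Segment], max_bytes: int = 65535) -> List[Segment]:
    """Merge in-order, contiguous segments of the same flow.

    Simplification: only the most recent aggregate is a merge candidate.
    """
    merged: List[Segment] = []
    for seg in segments:
        last = merged[-1] if merged else None
        if (last is not None
                and last.flow == seg.flow
                and last.seq + len(last.payload) == seg.seq   # no gap, no overlap
                and len(last.payload) + len(seg.payload) <= max_bytes):
            # Extend the current aggregate instead of creating a new entry.
            merged[-1] = Segment(last.flow, last.seq, last.payload + seg.payload)
        else:
            merged.append(Segment(seg.flow, seg.seq, seg.payload))
    return merged
```

With four contiguous 1460-byte segments of one flow, the upper layer would see a single 5840-byte segment, so per-packet bookkeeping is paid once instead of four times. A gap in sequence numbers starts a new aggregate.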

SLIDE 6

Xen Internal Network Architecture

SLIDE 7

LRO Implementation

[Figure: LRO implementation diagram. Labels: LRO, LRO, Data Copy]

SLIDE 8

Experimental Setup

  • Operating System: CentOS (Linux 2.6.18)
  • Xen: 3.1.0
  • Receiver: Xeon 1.86GHz, 2GB Mem
  • Sender: AMD Athlon 64 X2 2GHz, 2GB Mem
  • NIC: On-board 1Gbps
  • Workload: Simple sender-receiver programs that transfer 10GB of data with MTU=1500B
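For a sense of scale, the packet count implied by this workload can be estimated with a few lines of arithmetic. This is a back-of-the-envelope sketch under stated assumptions: "10GB" is read as 10 GiB, each packet carries 40 bytes of IPv4 and TCP headers with no options, and `per_aggregate` (the number of packets merged per LRO delivery) is a hypothetical parameter, not a figure from the slides.

```python
MTU = 1500            # bytes, from the experimental setup
HEADERS = 40          # assumption: 20 B IPv4 + 20 B TCP, no options
TOTAL = 10 * 2**30    # assumption: "10GB" read as 10 GiB


def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division."""
    return -(-a // b)


def packets_needed(total_bytes: int, mtu: int = MTU, headers: int = HEADERS) -> int:
    """Full-sized packets required to carry total_bytes of payload."""
    return ceil_div(total_bytes, mtu - headers)


def deliveries_with_lro(packets: int, per_aggregate: int) -> int:
    """Deliveries to the upper layers if LRO merged per_aggregate packets each time."""
    return ceil_div(packets, per_aggregate)


if __name__ == "__main__":
    n = packets_needed(TOTAL)
    print(n, deliveries_with_lro(n, 8))
```

Under these assumptions the transfer involves several million packets, so any per-packet cost removed by aggregation is multiplied accordingly. If TCP options such as timestamps are in use, the payload per packet is smaller and the count is higher.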

SLIDE 9

Throughput & LRO Rate Comparison

SLIDE 10

Clock Cycles & Instruction Count

SLIDE 11

Network Traffic at Receiver

SLIDE 12

Bandwidth Estimation at Sender

Each ACK received at the sender increases the estimated bandwidth, until a SACK is received
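The statement above can be read as a simple rule, sketched here as a toy model. It is not the Linux TCP code: `estimate_bandwidth`, the unit `step`, and the event names are assumptions for illustration. The slide does not say what happens after a SACK arrives, so this sketch simply stops growing the estimate.

```python
from typing import Iterable, List


def estimate_bandwidth(events: Iterable[str], start: float = 1.0,
                       step: float = 1.0) -> List[float]:
    """Toy model: every "ack" raises the estimate until the first "sack"."""
    estimate = start
    growing = True
    history: List[float] = []
    for event in events:
        if event == "sack":
            growing = False          # assumption: growth stops at the first SACK
        elif event == "ack" and growing:
            estimate += step
        history.append(estimate)
    return history
```

In this reading, a stream with many ACKs before the first SACK ends with a higher estimate than one where a SACK arrives early.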

SLIDE 13

Breakdown of Acknowledgment

SLIDE 14

Analysis of Performance Improvement

  • LRO reduced acknowledgment
  • Ack. increased
      • By unsuccessful delayed acknowledgment
  • Selective acknowledgment rate dropped
      • Sacks were not aggregated by delayed acknowledgment
  • High estimated bandwidth in TCP layer
      • Uses selective acknowledgment rate for estimation
  • High throughput achieved

SLIDE 15

Conclusions

  • LRO implementation in Xen and experimental results were presented
  • LRO in the physical interface improved CPU utilization across the system
  • LRO in the virtual interface achieved high throughput, a result of combining LRO with delayed acknowledgment
  • Further optimization
      • LRO-aware network bridge
      • Option to disable delayed acknowledgment

SLIDE 16

Thank you for your attention

Any questions?