

SLIDE 1

Keeping up with the hardware

Challenges in scaling I/O performance

Jonathan Davies

XenServer System Performance Lead

XenServer Engineering, Citrix Cambridge, UK

18 Aug 2015

SLIDE 2

Outline

1. The virtualisation performance challenge
2. Networking performance
3. Storage performance

SLIDE 3

The virtualisation performance challenge

Outline

1. The virtualisation performance challenge
2. Networking performance
3. Storage performance

SLIDE 4

The virtualisation performance challenge

Recent hardware trends

[Chart: device speed (log scale) against year, 2000–2015. NIC speeds climb from 1 Gb/s through 10 Gb/s and 40 Gb/s to 100 Gb/s; disks progress from HDD to SSD to NVMe; CPU speeds stay roughly flat.]

SLIDE 5

The virtualisation performance challenge

Virtualisation overhead is increasing

As I/O devices get faster while CPU speeds remain roughly constant, the relative virtualisation overhead increases:

[Chart: for old I/O devices, the time spent on the physical device dwarfs the virtualisation overhead; for modern I/O devices, the device time shrinks and the overhead becomes a much larger fraction of the total.]
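One way to make this concrete (my notation, not from the slide): write \(t_{\mathrm{virt}}\) for the time an I/O operation spends in the virtualisation layers and \(t_{\mathrm{dev}}\) for the time it spends on the physical device. The overhead fraction is then

\[
  f_{\mathrm{overhead}} \;=\; \frac{t_{\mathrm{virt}}}{t_{\mathrm{virt}} + t_{\mathrm{dev}}},
\]

so as devices get faster (\(t_{\mathrm{dev}} \to 0\)) while \(t_{\mathrm{virt}}\) stays roughly fixed, \(f_{\mathrm{overhead}}\) tends towards 1: the virtualisation path comes to dominate the cost of each I/O.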

SLIDE 6

Networking performance

Outline

1. The virtualisation performance challenge
2. Networking performance
3. Storage performance

SLIDE 7

Networking performance

Areas of weak networking performance

Metric                                   Xen's performance
Intrahost VM-to-VM throughput            weak
Intrahost aggregate throughput           weak
Interhost from-VM transmit throughput    strong
Interhost into-VM receive throughput     weak
Interhost aggregate throughput           strong

SLIDE 8

Networking performance Improving intrahost single-stream throughput

Outline

1. The virtualisation performance challenge
2. Networking performance
   • Improving intrahost single-stream throughput
   • Improving intrahost aggregate throughput
   • Summary
3. Storage performance

SLIDE 9

Networking performance Improving intrahost single-stream throughput

Where do we stand?

Intrahost VM-to-VM single-stream throughput measurements (using CentOS 7):

XenServer 6.5: 15 Gb/s
Target: 30 Gb/s

(more is better)

Dell R720 (2 × Xeon E5-2643 v2)

SLIDE 10

Networking performance Improving intrahost single-stream throughput

It’s even worse with an upstream guest kernel!

Intrahost VM-to-VM single-stream throughput measurements (using CentOS 7):

XenServer 6.5: 15 Gb/s
XenServer 6.5 (guests with 4.0 kernel): 9 Gb/s
Target: 30 Gb/s

(more is better)

Dell R720 (2 × Xeon E5-2643 v2)

SLIDE 11

Networking performance Improving intrahost single-stream throughput

Datapath analysis with 4.0 kernel in guests

[Trace: timeline (x-axis: tsc/1000) of per-packet datapath events, from the transmitting guest kernel (tcp_transmit_skb, skb clone, IP layer, handoff to netfront), through netfront writing the TX ring slots, netback grant operations (build_gops, grant-copy, grant-map), the dom0 bridge, rx netback and its dealloc thread, and finally rx netfront filling frags and passing the skb to the receiving guest kernel.]

Two CentOS 7.0 VMs (4.0.9 kernel) on Dell R720 (2 × Xeon E5-2643 v2)

SLIDE 12

Networking performance Improving intrahost single-stream throughput

Datapath analysis with 4.0 kernel in guests

[Trace: the per-packet datapath events above plotted against tsc/1000 (≈2000–12000), from tx kernel calling tcp_transmit_skb through to rx netfront passing the skb to the kernel.]

Two CentOS 7.0 VMs (4.0.9 kernel) on Dell R720 (2 × Xeon E5-2643 v2)

SLIDE 13

Networking performance Improving intrahost single-stream throughput

Transmitter often stalls; only ever two packets in flight

[Trace: same per-packet datapath event timeline (tsc/1000).]

Red boxes: periods when netfront is not running

Two CentOS 7.0 VMs (4.0.9 kernel) on Dell R720 (2 × Xeon E5-2643 v2)

SLIDE 14

Networking performance Improving intrahost single-stream throughput

Principal bottleneck: high TX completion latency

High TX completion latency is a serious problem for guests running 4.x kernels, which aggressively limit the amount of uncompleted transmit data.

Definition of TX completion latency

[Timeline: skb generated by guest → request put in TX ring → request consumed by dom0 → response received in TX ring. The TX completion latency is the time from the request being placed in the TX ring until the response is received.]
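In symbols (my notation, following the timeline above):

\[
  T_{\mathrm{TX\ completion}} \;=\; t_{\mathrm{response\ received\ in\ TX\ ring}} \;-\; t_{\mathrm{request\ put\ in\ TX\ ring}} .
\]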

SLIDE 15

Networking performance Improving intrahost single-stream throughput

The transmitter waits for TX completion

[Trace: same per-packet datapath event timeline (tsc/1000).]

Yellow slice: point of TX completion

Two CentOS 7.0 VMs (4.0.9 kernel) on Dell R720 (2 × Xeon E5-2643 v2)

SLIDE 16

Networking performance Improving intrahost single-stream throughput

Principal bottleneck: high TX completion latency

Idea to reduce TX completion latency

1. Pretend that TX completion happens as soon as netback consumes the request (a sketch follows below).
   This can be done using skb_orphan, which runs the skb's destructor early and detaches the skb from its socket, so the socket's accounting no longer waits for the skb to actually be freed.
   Rationale: on physical NIC drivers, TX completion occurs when the packet has hit the wire, not when it has reached the receiver's queue.

[Timeline: skb generated by guest → request put in TX ring → request consumed by dom0 (orphan the skb here?) → response received in TX ring. Orphaning the skb when dom0 consumes the request shortens the effective TX completion latency.]
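A minimal sketch of the idea (hypothetical code, not the xen-netfront implementation; the pending_tx bookkeeping and early_tx_complete helper are invented for illustration): once the backend's consumer index has passed an skb's last ring slot, orphan the skb so the guest's TCP stack stops counting it as in-flight, even though the real TX response has not yet arrived.

/* Hypothetical sketch, not the actual xen-netfront patch. */
#include <linux/skbuff.h>

struct pending_tx {
	struct sk_buff *skb;      /* skb still waiting for its TX response */
	unsigned int last_slot;   /* last TX ring slot this skb occupies   */
	bool orphaned;            /* already detached from its socket?     */
};

/*
 * backend_consumed: the consumer index the backend has advanced past
 * (ring-index wrap-around ignored for brevity).
 */
static void early_tx_complete(struct pending_tx *pend, unsigned int n,
			      unsigned int backend_consumed)
{
	unsigned int i;

	for (i = 0; i < n; i++) {
		struct pending_tx *p = &pend[i];

		if (p->orphaned || p->last_slot >= backend_consumed)
			continue;

		/*
		 * skb_orphan() runs the skb's destructor and clears skb->sk,
		 * releasing the socket's write-memory accounting now; the skb
		 * itself is still freed later, when the real response arrives.
		 */
		skb_orphan(p->skb);
		p->orphaned = true;
	}
}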

SLIDE 17

Networking performance Improving intrahost single-stream throughput

Datapath analysis with 3.18 kernel in guests

[Trace: same per-packet datapath event timeline (tsc/1000).]

Two CentOS 7.0 VMs (3.18.20 kernel) on Dell R720 (2 × Xeon E5-2643 v2)

SLIDE 18

Networking performance Improving intrahost single-stream throughput

The main problem is still TX completion latency

[Trace: same per-packet datapath event timeline (tsc/1000).]

Red boxes: periods when netfront is not running

Two CentOS 7.0 VMs (3.18.20 kernel) on Dell R720 (2 × Xeon E5-2643 v2)

SLIDE 19

Networking performance Improving intrahost single-stream throughput

Next bottleneck: NAPI CPU utilisation

[Trace: same per-packet datapath event timeline (tsc/1000).]

Red boxes: periods when NAPI is not running

Two CentOS 7.0 VMs (3.18.20 kernel) on Dell R720 (2 × Xeon E5-2643 v2)

SLIDE 20

Networking performance Improving intrahost single-stream throughput

Next bottleneck: NAPI CPU utilisation

After TX completion latency, the next bottleneck is that netback's NAPI thread (softirq context) fully utilises a CPU.

Ideas to reduce NAPI CPU utilisation

1. Avoid spilling over into a frag-list by copying more data into the main skb.
   Rationale: it is much more costly to handle an skb with a frag-list, so try to fit the data into a single skb. For intrahost VM-to-VM traffic, around 30% of skbs have a frag-list.

2. Unbatch grant-map operations (see the sketch after this list).
   Rationale: historically, batching was best because of the per-hypercall overhead, but recent improvements in grant-map locking mean it is no longer so expensive.
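To illustrate what "unbatching" means here (a sketch under my own assumptions, not the actual netback change; the two helper names are invented), compare mapping a packet's grant references with one batched call against one call per reference using the kernel's gnttab_map_refs():

#include <xen/grant_table.h>
#include <linux/mm.h>

/* Batched: one operation maps every slot of the packet at once. */
static int map_all_batched(struct gnttab_map_grant_ref *ops,
			   struct page **pages, unsigned int count)
{
	return gnttab_map_refs(ops, NULL, pages, count);
}

/*
 * Unbatched: one operation per slot.  Per the slide, recent grant-table
 * locking improvements in Xen mean the per-entry cost is no longer
 * prohibitive, so the simpler per-slot path becomes viable.
 */
static int map_each_unbatched(struct gnttab_map_grant_ref *ops,
			      struct page **pages, unsigned int count)
{
	unsigned int i;
	int err;

	for (i = 0; i < count; i++) {
		err = gnttab_map_refs(&ops[i], NULL, &pages[i], 1);
		if (err)
			return err;
	}
	return 0;
}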

SLIDE 21

Networking performance Improving intrahost single-stream throughput

Avoiding frag-lists and unbatching grant-map

[Trace: same per-packet datapath event timeline (tsc/1000).]

Red boxes: periods when NAPI is not running

Two CentOS 7.0 VMs (3.18.20 kernel) on Dell R720 (2 × Xeon E5-2643 v2)

SLIDE 22

Networking performance Improving intrahost single-stream throughput

NAPI CPU utilisation bottleneck

These ideas make the datapath look a lot 'cleaner', but they do not reduce the CPU utilisation noticeably.

Conclusion

Further work is required to increase the efficiency of the NAPI thread.

SLIDE 23

Networking performance Improving intrahost aggregate throughput

Outline

1. The virtualisation performance challenge
2. Networking performance
   • Improving intrahost single-stream throughput
   • Improving intrahost aggregate throughput
   • Summary
3. Storage performance

SLIDE 24

Networking performance Improving intrahost aggregate throughput

Intrahost aggregate throughput measurements

XenServer 6.5: 33 Gb/s
Target: (value not captured)

(more is better)

Dell R730 (2 × Xeon E5-2670 v3)

SLIDE 25

Networking performance Improving intrahost aggregate throughput

Intrahost aggregate throughput analysis

Intrahost aggregate throughput is typically limited by dom0 CPU utilisation.

Ideas to improve aggregate throughput

1. Improve grant-map scalability:
   • per-vCPU maptrack free lists – already in Xen 4.6
   • per-active-entry locking – already in Xen 4.6
   • avoid TLB flush on unmap – patches proposed by Malcolm Crossley

2. Provide dom0 with more CPU power.

SLIDE 26

Networking performance Improving intrahost aggregate throughput

Grant-map locking improvements have really helped

Aggregate intrahost throughput, 40 VMs

[Chart: aggregate throughput (Gb/s, y-axis up to ≈30) against number of dom0 vCPUs, before and after the grant-map improvements.]

Dell R730 (2 × Xeon E5-2670 v3)

SLIDE 27

Networking performance Summary

Outline

1. The virtualisation performance challenge
2. Networking performance
   • Improving intrahost single-stream throughput
   • Improving intrahost aggregate throughput
   • Summary
3. Storage performance

SLIDE 28

Networking performance Summary

Summary

Bottlenecks with intrahost VM-to-VM throughput (listed in order):

  • TX completion latency – potential mitigation using skb_orphan
  • NAPI CPU utilisation – prototype showed minimal improvement

Bottlenecks with aggregate intrahost throughput:

  • dom0 CPU utilisation – already improved in Xen 4.6

Future work

  • Work to minimise TX completion latency is required to avoid a regression with recent kernels.
  • Further optimisations need implementing.

SLIDE 29

Storage performance

Outline

1. The virtualisation performance challenge
2. Networking performance
3. Storage performance

SLIDE 30

Storage performance

Xen is weakest in single-VBD performance

Metric                              Xen's performance
Single-VBD throughput               weak
Multiple-VBD aggregate throughput   strong

For example, consider 4 KB serial IOPS:

[Chart: 4 KB serial IOPS for XenServer 6.5 versus the target; the target is higher (more is better).]

Debian 6.0 VM on Dell R815 (Opteron 6272), Intel S3700 SSD

Deficiencies with single-VBD performance

1. Latency is too high
2. Not enough data in-flight
3. Backend CPU utilisation too high

SLIDE 31

Storage performance Reduce latency

Outline

1. The virtualisation performance challenge
2. Networking performance
3. Storage performance
   • Reduce latency
   • Allow more data in-flight
   • Summary

SLIDE 32

Storage performance Reduce latency

Reduce latency

The problem

Latency is too high. This especially impacts serial I/O with small block sizes. XenServer uses tapdisk3, a user-space backend that uses grant-copy via the gntdev.

Ideas to reduce latency

1. Polling in the backend (see the sketch after this list).
   Rationale: event-channel and backend-scheduling latency is too high.

2. Use grant-map in the backend.
   Rationale: in principle, grant-copy should be slower than grant-map.
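A sketch of what backend polling could look like (illustrative only, not tapdisk3 code; ring_has_unconsumed_requests, process_requests and clear_evtchn_notification are invented helpers): keep busy-polling the shared ring for roughly 1 ms after going idle, and only then fall back to blocking on the event-channel file descriptor.

#include <poll.h>
#include <stdbool.h>
#include <time.h>

#define POLL_WINDOW_NS (1 * 1000 * 1000)        /* ~1 ms busy-poll window */

extern bool ring_has_unconsumed_requests(void); /* invented helpers */
extern void process_requests(void);
extern void clear_evtchn_notification(int evtchn_fd);

static long long now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (long long)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

void backend_loop(int evtchn_fd)
{
	for (;;) {
		long long deadline = now_ns() + POLL_WINDOW_NS;

		/* Busy-poll: pick up new requests without waiting for an
		 * event-channel notification or for the backend to be
		 * rescheduled. */
		while (now_ns() < deadline) {
			if (ring_has_unconsumed_requests()) {
				process_requests();
				deadline = now_ns() + POLL_WINDOW_NS;
			}
		}

		/* Idle for a whole window: block until the frontend kicks
		 * the event channel, then start polling again. */
		struct pollfd pfd = { .fd = evtchn_fd, .events = POLLIN };

		poll(&pfd, 1, -1);
		clear_evtchn_notification(evtchn_fd);
	}
}

The CPU cost the later slide warns about is visible in the inner loop: every idle millisecond is spent spinning, which is why polling helps single-VBD latency but can hurt multi-VBD aggregate throughput.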

SLIDE 33

Storage performance Reduce latency

Idea 1: Polling in the backend

Single-threaded sequential reads, queue-depth 1, varying block size

[Chart: IOPS (≈2,000–18,000) against block size (0.5–512 KB), with polling (1 ms) versus without polling.]

Debian 6.0 VM on Dell R720 (2 × Xeon E5-2643 v2), Micron P320h SSD

SLIDE 34

Storage performance Reduce latency

Idea 1: Polling in the backend

Polling for just 1 millisecond can yield a significant improvement [1]. The faster the disk, the bigger the improvement [2].

Conclusion

XenServer will likely adopt polling in tapdisk3. But we need to be careful about eating too much CPU, which can hurt multi-VBD aggregate throughput.

[1] On blkback the improvement may be even larger.
[2] Until the tapdisk3 process fully utilises a CPU even when not polling – the next bottleneck.

SLIDE 35

Storage performance Reduce latency

Idea 2: Grant-map in the backend

Single-threaded sequential reads, queue-depth 1

[Chart: IOPS (≈1,000–8,000) against block size (0.5–512 KB) for the grant-map experiment.]

Debian 6.0 VM on Dell R720, Intel S3700 SSD

SLIDE 36

Storage performance Reduce latency

Idea 2: Grant-map in the backend

So grant-copy is still faster in practice, despite recent improvements to grant-map locking. This suggests inefficiencies in the gntdev…?

Conclusion

XenServer will likely retain grant-copy for now.

SLIDE 37

Storage performance Allow more data in-flight

Outline

1. The virtualisation performance challenge
2. Networking performance
3. Storage performance
   • Reduce latency
   • Allow more data in-flight
   • Summary

SLIDE 38

Storage performance Allow more data in-flight

Allow more data in-flight

The problem

Each blkif ring has 32 slots, each of which can address up to 44 KB, i.e. a total of 1.375 MB in flight (see the worked arithmetic after the list below). Meanwhile, modern disks and arrays can give better throughput when issued with more than this.

Ideas to get more data in-flight

1. Multi-queue – patches proposed by Bob Liu.
   Rationale: more than one blkif ring per device.

2. Multi-page ring – patches proposed by Bob Liu.
   Rationale: a larger blkif ring.

3. Indirect descriptors – available since kernel 3.11.
   Rationale: the ability to address more data per ring slot.
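The worked arithmetic behind those figures, assuming the usual blkif limit of 11 segments of 4 KB per request (which is where the 44 KB per slot comes from):

\[
  11 \times 4\,\mathrm{KB} = 44\,\mathrm{KB\ per\ request},
  \qquad
  32 \times 44\,\mathrm{KB} = 1408\,\mathrm{KB} = 1.375\,\mathrm{MB\ in\ flight}.
\]

With indirect descriptors (up to 1 MB per slot, as covered later): \(32 \times 1\,\mathrm{MB} = 32\,\mathrm{MB}\) in flight.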

SLIDE 39

Storage performance Allow more data in-flight

Idea 1: Multi-queue measurements

Sequential reads, 8 threads, queue-depth 32, varying block size

[Chart: IOPS (up to ≈350,000) against block size (0.5–512 KB) with multi-queue.]

Ubuntu 15.04 VM using blkback on Dell R720 (2 × Xeon E5-2643 v2), Micron P320h SSD

SLIDE 40

Storage performance Allow more data in-flight

Idea 1: Multi-queue measurements in context

Sequential reads, 8 threads, queue-depth 32, varying block size

[Chart: the same IOPS-against-block-size measurements shown in context.]

Ubuntu 15.04 VM using blkback on Dell R720 (2 × Xeon E5-2643 v2), Micron P320h SSD

SLIDE 41

Storage performance Allow more data in-flight

Idea 1: Multi-queue

Adding multi-queue support hurts performance for small block sizes.

Explanation

Explanation pending! The guest does no request merging, and we rely on merging to get good sequential I/O performance on modern disks.

Conclusion

Unless the sequential I/O performance obtained when requests are merged can be retained, XenServer will likely not adopt multi-queue.

SLIDE 42

Storage performance Allow more data in-flight

Idea 2: Multi-page ring: good for random I/O

Random 4 KB reads, queue-depth 4, varying number of threads

[Chart: IOPS (up to ≈250,000) against number of threads (10–60) for random 4 KB reads at queue depth 4.]

Ubuntu 15.04 VM (16 vCPUs) using blkback on Dell R720 (2 × Xeon E5-2643 v2), Micron P320h SSD

SLIDE 43

Storage performance Allow more data in-flight

Idea 2: Multi-page ring: poor for sequential I/O

Sequential reads, 8 threads, queue-depth 32, varying block size

[Chart: IOPS (up to ≈300,000) against block size (0.5–2048 KB) for sequential reads with a multi-page ring.]

Ubuntu 15.04 VM (4 vCPUs) using blkback on Dell R720 (2 × Xeon E5-2643 v2), Micron P320h SSD

SLIDE 44

Storage performance Allow more data in-flight

Idea 2: Multi-page ring

A multi-page ring improves random I/O throughput by over 50% when the ring would otherwise be full, but reduces sequential I/O throughput for small block sizes at high queue depth.

Explanation

The guest kernel does not merge requests when there is a multi-page ring.

Conclusion

Further work is needed to mitigate the effect on request merging. XenServer will likely retain a single-page ring for now.

SLIDE 45

Storage performance Allow more data in-flight

Idea 3: Indirect descriptors

Background

Indirect descriptors have been available in blkfront/blkback since kernel 3.11. They allow up to 1 MB to be addressed per ring slot, so the total in-flight data can be 32 MB rather than 1.375 MB. But is this actually a good thing? Most modern disks respond better to smaller requests…

SLIDE 46

Storage performance Allow more data in-flight

Idea 3: Indirect descriptors – is it worthwhile?

Reading direct from physical disk, splitting requests into chunks issued in parallel

[Chart: throughput (y-axis ≈0.2–1.6) against chunk size (0.5–2048 KB).]

Dell R720 (2 × Xeon E5-2643 v2), Micron P320h SSD
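A rough reconstruction of this kind of experiment (my code, not the original benchmark): read from the device with O_DIRECT, but split one large request into fixed-size chunks submitted in parallel with Linux AIO (link with -laio). CHUNK_SIZE mirrors the 44 KB blkif per-request limit; the chunk count and command-line interface are arbitrary.

#define _GNU_SOURCE             /* for O_DIRECT */
#include <fcntl.h>
#include <libaio.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define CHUNK_SIZE (44 * 1024)  /* one blkif-sized chunk                  */
#define NR_CHUNKS  32           /* chunks kept in flight at the same time */

int main(int argc, char **argv)
{
	struct iocb cbs[NR_CHUNKS], *cbps[NR_CHUNKS];
	struct io_event events[NR_CHUNKS];
	void *bufs[NR_CHUNKS];
	io_context_t ctx = 0;
	int fd, i;

	if (argc != 2) {
		fprintf(stderr, "usage: %s <block-device>\n", argv[0]);
		return 1;
	}

	fd = open(argv[1], O_RDONLY | O_DIRECT);
	if (fd < 0 || io_setup(NR_CHUNKS, &ctx) < 0)
		return 1;

	for (i = 0; i < NR_CHUNKS; i++) {
		/* O_DIRECT requires aligned buffers. */
		if (posix_memalign(&bufs[i], 4096, CHUNK_SIZE))
			return 1;
		/* Chunk i covers offset i * CHUNK_SIZE of one large read. */
		io_prep_pread(&cbs[i], fd, bufs[i], CHUNK_SIZE,
			      (long long)i * CHUNK_SIZE);
		cbps[i] = &cbs[i];
	}

	/* Issue every chunk at once, then wait for all of them. */
	if (io_submit(ctx, NR_CHUNKS, cbps) != NR_CHUNKS)
		return 1;
	if (io_getevents(ctx, NR_CHUNKS, NR_CHUNKS, events, NULL) != NR_CHUNKS)
		return 1;

	io_destroy(ctx);
	close(fd);
	return 0;
}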

SLIDE 47

Storage performance Allow more data in-flight

Idea 3: Indirect descriptors

Conclusion

On modern disks, throughput generally improves when large requests are split into 44 KB chunks! Allowing bigger requests through can hurt performance. Ideally we need the Linux block layer to know the disk's optimal block size, and to split or merge requests accordingly. Then indirect I/O would present an improvement by allowing more data in flight.
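For what it is worth, the block layer already exports a per-device hint of this kind, though many devices simply report 0; a minimal sketch of how a backend or tool could read it (the helper name is mine):

#include <stdio.h>

/* Read the kernel's optimal I/O size hint for a disk, e.g.
 * /sys/block/sda/queue/optimal_io_size; 0 means "no hint reported". */
long optimal_io_size(const char *disk)
{
	char path[256];
	long bytes = -1;
	FILE *f;

	snprintf(path, sizeof(path),
		 "/sys/block/%s/queue/optimal_io_size", disk);
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (fscanf(f, "%ld", &bytes) != 1)
		bytes = -1;
	fclose(f);
	return bytes;
}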

SLIDE 48

Storage performance Summary

Outline

1. The virtualisation performance challenge
2. Networking performance
3. Storage performance
   • Reduce latency
   • Allow more data in-flight
   • Summary

SLIDE 49

Storage performance Summary

Summary

Reduce latency:
  • Polling – promising results
  • Grant-map – needs more work for a userspace backend

Allow more data in-flight:
  • Multi-queue – prevents request merging
  • Multi-page ring – prevents request merging
  • Indirect descriptors – prevents use of the optimal block size

Future work
  • Improve the performance of the gntdev
  • A better strategy for getting more data in-flight whilst ensuring that requests are of optimal size

SLIDE 50

Questions?

SLIDE 51

Extra slides

There’s little benefit from batching nowadays


Dell R220 (Xeon E3-1230 v3)
