Experience with DPDK Troubleshooting


  1. Experience with DPDK Troubleshooting
     Juha Kosonen - juha.kosonen@nokia.com
     Ajay Simha - asimha@redhat.com

  2. Abstract
     Getting near real-time performance for VNF applications requires designing the application carefully and profiling its performance under realistic conditions. OpenStack has developed many features that help with this, such as CPU and PCI topology awareness and pinning virtual machines to dedicated cores. The host networking stack, in turn, has been optimized with features such as accelerated virtual switches and DPDK. This presentation gives an update on using DPDK in real VNF tests and shares real-life experiences from DPDK troubleshooting.
     What can I expect to learn?
     • Overview of DPDK
     • What can go wrong when installing DPDK and how to troubleshoot it
     • Tuning OVS-DPDK and making sure you get the expected performance

  3. Agenda
     • Introduction
     • Telco Requirements
       • HA and Performance
     • Journey from virtio to DPDK
     • SR-IOV vs. DPDK (the two viable options today)
     • DPDK Data Plane
       • Hardware Tuning
       • Throughput
       • CPU core allocation
       • HA
     • DPDK Troubleshooting
       • Install time
       • HA
       • Performance
     • Summary

  4. Telco Requirements: HA and Performance
     Two pillars of NFV
     • Maximum subscribers per core
     • Highly available - as close to five 9s as possible
     • If you have no HA
       • Failure = zero PPS!!
     • Telco applications/services serve millions of subscribers
     • Hardware-based solutions provided the required performance
     • We need to get maximum throughput from our virtualized solution!
     • Need maximum PPS (packets per second) without drops!!
     • NEPs as well as telcos (operators) don't like VNFs depending on NIC drivers
     • The closer to cloud ready, the better

  5. Journey from VIRTIO to DPDK: virtio

  6. Journey from VIRTIO to DPDK: PCI Passthrough

  7. Journey from VIRTIO to DPDK: SR-IOV

  8. Journey from VIRTIO to DPDK: PCI Passthrough

  9. DPDK Data Plane
     Things to consider
     • Hardware tuning
     • Throughput
     • CPU core allocation
     • HA

  10. DPDK Data Plane: Hardware tuning
      ● Zero loss requires the latest hardware generation
      ● Intel NICs, either the Niantic series (X520/X540) or the latest Fortville series (X710)

  11. DPDK Data Plane: Throughput
      L2 forwarding, intra-NUMA node, single queue
      ● 2 x virtio-net interfaces (node0)
      ● 2 x 10Gb interfaces (node0)
      ● Testpmd DPDK application in VM
      ● Bidirectional traffic
      ● Maximum rate while within specified loss
      [Diagram: 10G SFP+Cu packet generator -> PCI NICs -> host Open vSwitch (pci and vhost ports, polling threads en-queueing and de-queueing) -> virtio rings -> VM running testpmd on two virtio interfaces]
      Frame size   Mpps          Gbps          Mpps/core     Mpps          Gbps       Mpps/core
      (bytes)      @0.002% loss  @0.002% loss  @0.002% loss  @0% loss [1]  @0% loss   @0% loss
      64           12.14 [1]     8.15          6.07          7.34          -          3.67
      256          7.65 [2]      16.88         3.82          4.85          -          2.42
      1024         2.38          19.94         1.19          2.36          -          1.18
      1500         1.63          19.90         0.81          1.61          -          -
      - = value not present in the extracted slide text
      [1] 59% higher than v2.5.1, [2] 34% higher than v2.5.1
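
The Mpps and Gbps columns on the throughput slide can be related with simple arithmetic. The sketch below assumes (this is not stated on the slide) that the Gbps figures count 20 bytes of per-frame wire overhead, that is 8 bytes of preamble/SFD plus a 12-byte inter-frame gap. The function name `mpps_to_gbps` is made up for illustration.

```python
def mpps_to_gbps(mpps: float, frame_size: int, overhead: int = 20) -> float:
    """Convert a packet rate in Mpps to on-the-wire Ethernet throughput in Gbps.

    overhead=20 is an assumption: 8 bytes preamble/SFD + 12 bytes inter-frame gap.
    """
    bits_per_frame = (frame_size + overhead) * 8
    return mpps * 1e6 * bits_per_frame / 1e9


# (frame size in bytes, Mpps, Gbps) as reported on the slide at 0.002% loss
SLIDE_ROWS = [
    (64, 12.14, 8.15),
    (256, 7.65, 16.88),
    (1024, 2.38, 19.94),
    (1500, 1.63, 19.90),
]

for size, mpps, reported in SLIDE_ROWS:
    computed = mpps_to_gbps(mpps, size)
    print(f"{size:>5} B: computed {computed:6.2f} Gbps, slide reports {reported:6.2f} Gbps")
```

Under that assumption the computed values land within about 0.1 Gbps of the slide's numbers, which is consistent with rounding of the Mpps column.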

  12. DPDK Data Plane: CPU Core Allocation
      Resource partitioning/allocation
      Red Hat OpenStack Platform 10 - December, 2016

  13. DPDK Data Plane: DPDK HA (DPDK Bonding)
      Red Hat OpenStack Platform 10 - December, 2016

  14. DPDK Troubleshooting
      Three things to consider
      • Installation
      • HA
      • Performance

  15. DPDK Troubleshooting: Installation
      Heat template parameter settings (network-environment.yaml)
      • ComputeKernelArgs: "iommu=pt intel_iommu=on default_hugepagesz=1GB hugepagesz=1G hugepages=12"
      • NeutronDpdkCoreList: "'4,6,20,22'"
      • NeutronDpdkMemoryChannels: "4"
      • NeutronDpdkSocketMemory: "'1024,1024'"
      • NeutronDpdkDriverType: "vfio-pci"
      • NovaVcpuPinSet: "8,10,12,14,18,24,26,28,30"
      • NeutronTunnelTypes: ""
      • NeutronNetworkType: 'vlan'
      • NeutronNetworkVLANRanges: 'datacentre:4000:4070, dpdk:4071:4071'
      • HostCpusList: "'2,4,6,8,10,12,14,18,20,22,24,26,28,30'"
      • NeutronDatapathType: "netdev"
      • NeutronVhostuserSocketDir: "/var/run/openvswitch"
      • NeutronBridgeMappings: 'datacentre:br-isolated, dpdk:br-link'
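
In the parameter values above, the cores in NeutronDpdkCoreList and NovaVcpuPinSet do not overlap, and both sets are contained in HostCpusList. A minimal sketch of a sanity check for that pattern follows. The rule itself is inferred from the slide's values, not from a documented requirement, and the helper names `parse_cpu_list` and `check_cpu_partitioning` are hypothetical.

```python
from __future__ import annotations


def parse_cpu_list(value: str) -> set[int]:
    """Parse a CPU list such as "'4,6,20,22'" or "2,4-7" into a set of ints."""
    cpus: set[int] = set()
    for part in value.strip().strip("'\"").split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-", 1)
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return cpus


def check_cpu_partitioning(host: str, dpdk: str, nova: str) -> list[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    host_set, dpdk_set, nova_set = (parse_cpu_list(v) for v in (host, dpdk, nova))
    problems = []
    if dpdk_set & nova_set:
        problems.append(f"cores used by both DPDK and Nova: {sorted(dpdk_set & nova_set)}")
    if dpdk_set - host_set:
        problems.append(f"DPDK cores missing from HostCpusList: {sorted(dpdk_set - host_set)}")
    if nova_set - host_set:
        problems.append(f"Nova cores missing from HostCpusList: {sorted(nova_set - host_set)}")
    return problems


# Values copied from the slide
HOST_CPUS = "'2,4,6,8,10,12,14,18,20,22,24,26,28,30'"
DPDK_CORES = "'4,6,20,22'"
NOVA_PIN_SET = "8,10,12,14,18,24,26,28,30"

print(check_cpu_partitioning(HOST_CPUS, DPDK_CORES, NOVA_PIN_SET) or "no problems found")
```

Running a check like this before deployment is cheaper than discovering after a reboot that a PMD thread and a guest vCPU share a core.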

  16. DPDK Troubleshooting: Installation
      What OVS looks like in a working DPDK setup
          [root@overcloud-compute-0 ~]# ovs-vsctl show
          9ef8aab3-0afa-4fca-9a49-7ca97d3e2ffa
              Manager "ptcp:6640:127.0.0.1"
                  is_connected: true
              Bridge br-link
                  Controller "tcp:127.0.0.1:6633"
                      is_connected: true
                  fail_mode: secure
                  Port br-link
                      Interface br-link
                          type: internal
                  Port phy-br-link
                      Interface phy-br-link
                          type: patch
                          options: {peer=int-br-link}
                  Port "dpdkbond0"
                      Interface "dpdk1"
                          type: dpdk
                      Interface "dpdk0"
                          type: dpdk

  17. DPDK Troubleshooting: Installation
      What options are being set for OVS-DPDK?
          [root@overcloud-compute-0 ~]# cat /etc/sysconfig/openvswitch
          OPTIONS=""
          DPDK_OPTIONS="-l 2,3,14,15 -n 4 --socket-mem 1024,1024 -w 0000:83:00.0 -w 0000:83:00.1"
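
The DPDK_OPTIONS string packs several EAL settings into one line: the core list (`-l`), the number of memory channels (`-n`), per-socket memory in MB (`--socket-mem`) and the whitelisted PCI devices (`-w`). A small sketch that splits the string into those pieces makes it easier to compare against the Heat parameters. It handles only the four flags shown on the slide and only comma-separated core lists; `parse_dpdk_options` is a hypothetical helper name.

```python
import shlex


def parse_dpdk_options(options: str) -> dict:
    """Split a DPDK_OPTIONS string into the four settings shown on the slide."""
    parsed = {"cores": [], "memory_channels": None, "socket_mem": [], "whitelist": []}
    tokens = shlex.split(options)
    i = 0
    while i < len(tokens):
        flag = tokens[i]
        has_value = i + 1 < len(tokens)
        if flag == "-l" and has_value:
            parsed["cores"] = [int(c) for c in tokens[i + 1].split(",")]
        elif flag == "-n" and has_value:
            parsed["memory_channels"] = int(tokens[i + 1])
        elif flag == "--socket-mem" and has_value:
            parsed["socket_mem"] = [int(m) for m in tokens[i + 1].split(",")]
        elif flag == "-w" and has_value:
            parsed["whitelist"].append(tokens[i + 1])
        else:
            # Unknown token: skip it rather than guess its meaning
            i += 1
            continue
        i += 2
    return parsed


# Value copied from the slide
SLIDE_OPTIONS = "-l 2,3,14,15 -n 4 --socket-mem 1024,1024 -w 0000:83:00.0 -w 0000:83:00.1"
print(parse_dpdk_options(SLIDE_OPTIONS))
```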

  18. DPDK Troubleshooting: Installation
      What OVS looks like in a broken DPDK setup
          [root@overcloud-compute-0 heat-admin]# ovs-vsctl show
          857105ff-db67-41c6-812b-eaa65d224ca0
              Bridge br-link
                  fail_mode: standalone
                  Port "dpdk0"
                      Interface "dpdk0"
                          type: dpdk
                          error: "could not open network device dpdk0 (Address family not supported by protocol)"
                  Port br-link
                      Interface br-link
                          type: internal
              ovs_version: "2.5.0"
          [root@overcloud-compute-0 heat-admin]#
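
The broken setup is recognisable by the `error:` line that `ovs-vsctl show` prints under the affected interface. As a rough sketch, the output can be scanned for such lines and mapped back to interface names. The parsing relies only on the layout visible in the slide's output, and `find_interface_errors` is a hypothetical helper.

```python
import re

INTERFACE_RE = re.compile(r'Interface\s+"?([^"\s]+)"?$')
ERROR_RE = re.compile(r'error:\s*"?(.*?)"?$')


def find_interface_errors(ovs_vsctl_show: str) -> dict:
    """Map interface name -> error text for every interface reporting an error."""
    errors = {}
    current = None
    for line in ovs_vsctl_show.splitlines():
        stripped = line.strip()
        match = INTERFACE_RE.match(stripped)
        if match:
            current = match.group(1)
            continue
        match = ERROR_RE.match(stripped)
        if match and current is not None:
            errors[current] = match.group(1)
    return errors


# Output copied from the slide's broken setup
BROKEN_OUTPUT = '''857105ff-db67-41c6-812b-eaa65d224ca0
    Bridge br-link
        fail_mode: standalone
        Port "dpdk0"
            Interface "dpdk0"
                type: dpdk
                error: "could not open network device dpdk0 (Address family not supported by protocol)"
        Port br-link
            Interface br-link
                type: internal
    ovs_version: "2.5.0"
'''

for name, message in find_interface_errors(BROKEN_OUTPUT).items():
    print(f"{name}: {message}")
```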

  19. DPDK Troubleshooting: Installation
      Checking if physical NICs are bound to DPDK
          [root@overcloud-novacompute-0 ~]# dpdk-devbind --status
          Network devices using DPDK-compatible driver
          ============================================
          0000:03:00.1 '82599ES 10-Gigabit SFI/SFP+ Network Connection' drv=uio_pci_generic unused=ixgbe,vfio-pci
          Network devices using kernel driver
          ===================================
          0000:01:00.0 'Ethernet Controller 10-Gigabit X540-AT2' if=ens255f0 drv=ixgbe unused=vfio-pci,uio_pci_generic
          0000:01:00.1 'Ethernet Controller 10-Gigabit X540-AT2' if=ens255f1 drv=ixgbe unused=vfio-pci,uio_pci_generic
          0000:03:00.0 '82599ES 10-Gigabit SFI/SFP+ Network Connection' if=ens1f0 drv=ixgbe unused=vfio-pci,uio_pci_generic
          0000:05:00.0 '82599ES 10-Gigabit SFI/SFP+ Network Connection' if=ens4f0 drv=ixgbe unused=vfio-pci,uio_pci_generic
          0000:05:00.1 '82599ES 10-Gigabit SFI/SFP+ Network Connection' if=ens4f1 drv=ixgbe unused=vfio-pci,uio_pci_generic
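
When many NICs are present, it can help to reduce the `dpdk-devbind --status` listing to "which PCI addresses are bound to a DPDK-compatible driver". The sketch below parses the line format shown in the slide's output. The set of driver names treated as DPDK-compatible is an assumption, and the function names are hypothetical.

```python
import re

DEVICE_RE = re.compile(
    r"^(?P<pci>[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7])\s+"
    r"'(?P<name>[^']*)'\s+(?P<rest>.*)$"
)

# Assumption: drivers treated as "DPDK-compatible" for this check
DPDK_DRIVERS = {"vfio-pci", "uio_pci_generic", "igb_uio"}


def parse_devbind_status(text: str) -> dict:
    """Map PCI address -> {'name', 'drv', 'if'} for each device line."""
    devices = {}
    for line in text.splitlines():
        match = DEVICE_RE.match(line.strip())
        if not match:
            continue  # headings, separators and blank lines
        fields = dict(kv.split("=", 1) for kv in match.group("rest").split() if "=" in kv)
        devices[match.group("pci")] = {
            "name": match.group("name"),
            "drv": fields.get("drv"),
            "if": fields.get("if"),
        }
    return devices


def dpdk_bound(devices: dict) -> list:
    """Return the sorted PCI addresses bound to a DPDK-compatible driver."""
    return sorted(pci for pci, dev in devices.items() if dev["drv"] in DPDK_DRIVERS)


# Two lines copied from the slide's output
SAMPLE = """Network devices using DPDK-compatible driver
============================================
0000:03:00.1 '82599ES 10-Gigabit SFI/SFP+ Network Connection' drv=uio_pci_generic unused=ixgbe,vfio-pci
Network devices using kernel driver
===================================
0000:03:00.0 '82599ES 10-Gigabit SFI/SFP+ Network Connection' if=ens1f0 drv=ixgbe unused=vfio-pci,uio_pci_generic
"""

print(dpdk_bound(parse_devbind_status(SAMPLE)))
```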

  20. DPDK Troubleshooting: HA
      What a DPDK bond looks like
          [root@overcloud-compute-0 ~]# ovs-appctl bond/show dpdkbond0
          ---- dpdkbond0 ----
          bond_mode: balance-tcp
          bond may use recirculation: yes, Recirc-ID : 1
          bond-hash-basis: 0
          updelay: 0 ms
          downdelay: 0 ms
          next rebalance: 94 ms
          lacp_status: negotiated
          active slave mac: a0:36:9f:47:e0:62(dpdk1)
          slave dpdk0: enabled
              may_enable: true
          slave dpdk1: enabled
              active slave
              may_enable: true
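
The two things worth checking in the bond output are `lacp_status` and the per-slave state. A minimal sketch of that check follows. The definition of "healthy" used here (LACP negotiated and every slave enabled) is an assumption drawn from the working example on the slide, and the helper names are hypothetical.

```python
def parse_bond_show(text: str) -> dict:
    """Parse `ovs-appctl bond/show` output into top-level keys plus a 'slaves' map."""
    info = {"slaves": {}}
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("slave ") and ":" in line:
            name, state = line[len("slave "):].split(":", 1)
            info["slaves"][name.strip()] = state.strip()
        elif ":" in line and not line.startswith("----"):
            key, value = line.split(":", 1)
            info.setdefault(key.strip(), value.strip())
    return info


def bond_is_healthy(info: dict) -> bool:
    """Assumed criterion: LACP negotiated and every slave reported as enabled."""
    slaves = info["slaves"]
    return (
        info.get("lacp_status") == "negotiated"
        and bool(slaves)
        and all(state == "enabled" for state in slaves.values())
    )


# Output copied from the slide
BOND_OUTPUT = """---- dpdkbond0 ----
bond_mode: balance-tcp
bond may use recirculation: yes, Recirc-ID : 1
bond-hash-basis: 0
updelay: 0 ms
downdelay: 0 ms
next rebalance: 94 ms
lacp_status: negotiated
active slave mac: a0:36:9f:47:e0:62(dpdk1)
slave dpdk0: enabled
    may_enable: true
slave dpdk1: enabled
    active slave
    may_enable: true
"""

print(bond_is_healthy(parse_bond_show(BOND_OUTPUT)))
```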

  21. DPDK Troubleshooting: Performance
      Things to look for
      • Inspecting the node
      • Measurements

  22. DPDK Troubleshooting: Inspecting the node
      Running tuned-adm manually - tuned HostCpuList
          [root@overcloud-compute-1 ~]# grep TUNED /etc/tuned/bootcmdline
          TUNED_BOOT_CMDLINE="nohz=on nohz_full='2,4,6,8,10,12,14,18,20,22,24,26,28,30' rcu_nocbs='2,4,6,8,10,12,14,18,20,22,24,26,28,30' intel_pstate=disable nosoftlockup"
          [root@overcloud-compute-0 ~]# cat /etc/tuned/cpu-partitioning-variables.conf
          # Examples:
          # isolated_cores=2,4-7
          # isolated_cores=2-23
          #
          isolated_cores='2,4,6,8,10,12,14,18,20,22,24,26,28,30'
          [root@overcloud-compute-1 ~]# grep vcpu /etc/nova/nova.conf
          vcpu_pin_set=8,10,12,14,18,24,26,28,30

  23. DPDK Troubleshooting: Inspecting the node
      Verifying tuned is working for you (pbench): https://github.com/distributed-system-analysis/pbench
      Non-tuned CPU - CPU17
      Average number of Local Timer Interrupts: 1000 :-(
      HostCpusList: "'1,2,4,6,8,10,12,14,18,20,22,24,26,28,30'"

  24. DPDK Troubleshooting: Inspecting the node
      Verifying tuned is working for you (pbench): https://github.com/distributed-system-analysis/pbench
      Tuned CPU - CPU1
      Average number of Local Timer Interrupts: 2 :-)
      HostCpusList: "'1,2,4,6,8,10,12,14,18,20,22,24,26,28,30'"
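
The contrast between a tuned and a non-tuned CPU shows up in the local timer interrupt count. Without pbench, a rough equivalent is to take two snapshots of `/proc/interrupts` and compare the `LOC:` line per CPU. The sketch below does that on text snapshots; it is not how pbench computes its figure, the helper names are hypothetical, and the sample numbers are synthetic.

```python
def parse_loc(interrupts_text: str) -> list:
    """Return the per-CPU local timer interrupt counters from /proc/interrupts text."""
    for line in interrupts_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "LOC:":
            counts = []
            for field in fields[1:]:
                if not field.isdigit():
                    break  # reached the trailing description text
                counts.append(int(field))
            return counts
    raise ValueError("no LOC line found")


def loc_rate(before: str, after: str, seconds: float) -> list:
    """Per-CPU local timer interrupts per second between two snapshots."""
    return [(b - a) / seconds for a, b in zip(parse_loc(before), parse_loc(after))]


# Synthetic snapshots taken 10 seconds apart: CPU0 isolated, CPU1 not
BEFORE = """           CPU0       CPU1
LOC:       1000       5000   Local timer interrupts
"""
AFTER = """           CPU0       CPU1
LOC:       1020      15000   Local timer interrupts
"""

for cpu, rate in enumerate(loc_rate(BEFORE, AFTER, 10)):
    print(f"CPU{cpu}: {rate:.0f} local timer interrupts/s")
```

On a live node the two snapshots would come from reading `/proc/interrupts` twice with a sleep in between; an isolated core running a busy-polling thread should show a rate close to zero.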

  25. DPDK Troubleshooting: Measurements
      The following command displays the packet counters as well as the packet drops
          [root@overcloud-compute-0 ~]# ovs-ofctl dump-ports br-link
          OFPST_PORT reply (xid=0x2): 4 ports
            port LOCAL: rx pkts=542285, bytes=67242760, drop=0, errs=0, frame=0, over=0, crc=0
                        tx pkts=1, bytes=70, drop=0, errs=0, coll=0
            port  1: rx pkts=153700, bytes=11011978, drop=0, errs=0, frame=?, over=?, crc=?
                     tx pkts=271137, bytes=33620934, drop=0, errs=0, coll=?
            port  2: rx pkts=18084, bytes=2875028, drop=0, errs=0, frame=?, over=?, crc=?
                     tx pkts=271135, bytes=33620740, drop=0, errs=0, coll=?
            port  3: rx pkts=0, bytes=0, drop=?, errs=?, frame=?, over=?, crc=?
                     tx pkts=0, bytes=0, drop=?, errs=?, coll=?
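
During a test run it is convenient to extract just the drop counters from the `ovs-ofctl dump-ports` output. The following sketch parses the layout shown in the slide's output and treats `?` (counter not reported) as `None`. The function names are hypothetical.

```python
import re

PORT_RE = re.compile(r"^\s*port\s+(\S+):")
DIRECTION_RE = re.compile(r"\s*(rx|tx)\b")
COUNTER_RE = re.compile(r"(\w+)=(\d+|\?)")


def parse_dump_ports(text: str) -> dict:
    """Map port -> {'rx': {...}, 'tx': {...}}; '?' counters become None."""
    ports = {}
    current = None
    for line in text.splitlines():
        match = PORT_RE.match(line)
        if match:
            current = ports.setdefault(match.group(1), {})
            line = line[match.end():]
        if current is None:
            continue  # header line before the first port
        direction = DIRECTION_RE.match(line)
        if not direction:
            continue
        current[direction.group(1)] = {
            key: (None if value == "?" else int(value))
            for key, value in COUNTER_RE.findall(line)
        }
    return ports


def ports_with_drops(ports: dict) -> list:
    """Return (port, direction, drops) for every counter with at least one drop."""
    found = []
    for port, directions in ports.items():
        for direction, counters in directions.items():
            drops = counters.get("drop")
            if drops:  # skips both None (not reported) and 0
                found.append((port, direction, drops))
    return found


# Part of the output copied from the slide
DUMP = """OFPST_PORT reply (xid=0x2): 4 ports
  port LOCAL: rx pkts=542285, bytes=67242760, drop=0, errs=0, frame=0, over=0, crc=0
           tx pkts=1, bytes=70, drop=0, errs=0, coll=0
  port  1: rx pkts=153700, bytes=11011978, drop=0, errs=0, frame=?, over=?, crc=?
           tx pkts=271137, bytes=33620934, drop=0, errs=0, coll=?
  port  3: rx pkts=0, bytes=0, drop=?, errs=?, frame=?, over=?, crc=?
           tx pkts=0, bytes=0, drop=?, errs=?, coll=?
"""

print(ports_with_drops(parse_dump_ports(DUMP)) or "no drops reported")
```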
