Analyzing DPDK applications with eBPF: Sharpening the toolset

slide-1
SLIDE 1

Analyzing DPDK applications with eBPF

Sharpening the toolset

Stephen Hemminger

FOSDEM, February 1, 2020

Microsoft

slide-2
SLIDE 2

Table of Contents

  • Introduction
  • Packet Capture
  • Tracing
  • LTTng
  • bpftrace
  • Performance
  • Conclusion

slide-3
SLIDE 3

Introduction

slide-4
SLIDE 4

Ancient wisdom

French proverb: "Mauvés ovriers ne trovera ja bon hostill" (Bad workers will never find a good tool)

Chinese proverb: "To do a good job, a craftsman must sharpen his tools."

slide-6
SLIDE 6

Methodology

  • Don’t focus on a tool set
  • Problem statement
  • Workload Characterization
  • USE
      • Utilization
      • Saturation
      • Errors

See Linux tracing talks (Brendan Gregg et al.)

slide-11
SLIDE 11

Capture vs Tracing

[Diagram labels: DPDK Application (send, receive), ring, pdump, tcpdump, Tracer; the two paths are labeled Capture and Tracing]

slide-12
SLIDE 12

Packet Capture

slide-13
SLIDE 13

DPDK pdump

[Diagram labels: Traffic Generator, DPDK Primary Application, dpdk_port0, librte_pdump, dpdk-pdump tool, PCAP PMD, capture.pcap]

  • Packet copied and queued to ring
  • Secondary process sends to libpcap
  • Packets recorded in pcap format
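The copy-and-queue step can be sketched as a single-producer, single-consumer ring. This is only an illustration of the idea, assuming a fixed snapshot length; the names `struct ring`, `ring_enqueue_copy` and `ring_dequeue` are hypothetical, and the real pdump library uses DPDK's own ring and mbuf APIs rather than this code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RING_SLOTS 8u   /* hypothetical size; must be a power of two */
#define SNAPLEN    64u  /* hypothetical snapshot length per packet */

struct pkt_copy {
    uint16_t len;
    uint8_t  data[SNAPLEN];
};

struct ring {
    _Atomic uint32_t head;              /* advanced by the producer only */
    _Atomic uint32_t tail;              /* advanced by the consumer only */
    struct pkt_copy  slots[RING_SLOTS];
};

/* Producer side (the forwarding application): copy the packet into the ring. */
static bool ring_enqueue_copy(struct ring *r, const uint8_t *pkt, size_t len)
{
    assert(pkt != NULL);
    uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);

    if (head - tail == RING_SLOTS)
        return false;                   /* ring full: the copy is dropped */

    struct pkt_copy *slot = &r->slots[head & (RING_SLOTS - 1)];
    slot->len = (uint16_t)(len < SNAPLEN ? len : SNAPLEN);
    memcpy(slot->data, pkt, slot->len);

    /* Publish the slot only after its contents are written. */
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return true;
}

/* Consumer side (the capture process): take the oldest copy, if any. */
static bool ring_dequeue(struct ring *r, struct pkt_copy *out)
{
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&r->head, memory_order_acquire);

    if (head == tail)
        return false;                   /* ring empty */

    *out = r->slots[tail & (RING_SLOTS - 1)];
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return true;
}
```

Every captured packet costs the forwarding core one copy and one enqueue, and a full ring means the copy is dropped.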

slide-14
SLIDE 14

Pdump limitations

  • No metadata (VLAN, offload, ...)
  • Inaccurate timestamp
  • No direction information
  • Single port only
  • No filtering
  • Poor performance

slide-20
SLIDE 20

PCAP Next Generation

  • Nanosecond resolution timestamp
  • System and Interface metadata
  • Multiple interfaces
  • Flags (direction, hash, ...)
  • Comments
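
To make the timestamp and multiple-interface points concrete, here is a rough sketch of the fixed part of a pcapng Enhanced Packet Block. The field order follows the pcapng specification as I understand it; the struct and function names (`epb_header`, `epb_make`, `pad4`) are invented for this example, no options are written, and the 64-bit timestamp counts nanoseconds only if the interface's declared resolution (the `if_tsresol` option) says so.

```c
#include <assert.h>
#include <stdint.h>

#define PCAPNG_EPB_TYPE 0x00000006u  /* Enhanced Packet Block */

/* Fixed fields that precede the packet data in an Enhanced Packet Block. */
struct epb_header {
    uint32_t block_type;
    uint32_t block_total_length;
    uint32_t interface_id;      /* which capture interface saw the packet */
    uint32_t timestamp_high;    /* upper 32 bits of the 64-bit timestamp */
    uint32_t timestamp_low;     /* lower 32 bits of the 64-bit timestamp */
    uint32_t captured_len;
    uint32_t original_len;
};

/* Packet data is padded to a 32-bit boundary inside the block. */
static uint32_t pad4(uint32_t n)
{
    return (n + 3u) & ~3u;
}

static struct epb_header epb_make(uint32_t interface_id, uint64_t timestamp,
                                  uint32_t captured_len, uint32_t original_len)
{
    struct epb_header h;

    assert(captured_len <= original_len);
    h.block_type = PCAPNG_EPB_TYPE;
    /* header + padded data + trailing copy of the total length, no options */
    h.block_total_length = (uint32_t)sizeof(h) + pad4(captured_len) + 4u;
    h.interface_id = interface_id;
    h.timestamp_high = (uint32_t)(timestamp >> 32);
    h.timestamp_low = (uint32_t)(timestamp & 0xffffffffu);
    h.captured_len = captured_len;
    h.original_len = original_len;
    return h;
}
```

Direction, hash and comments would be appended as options after the packet data; they are left out here to keep the sketch short.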

slide-25
SLIDE 25

Packet filtering with libpcap

PCAP filter string: ip dst fosdem.org

cBPF program (6 insns):

    (000) ldh [12]
    (001) jeq #0x800 jt 2 jf 5
    (002) ld [30]
    (003) jeq #0x1f16168c jt 4 jf 5
    (004) ret #65535
    (005) ret #0
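
To see what the six-instruction cBPF listing does, the sketch below walks it over a raw Ethernet frame. It is a toy interpreter for just the four instruction kinds used here, not the kernel or libpcap implementation; the opcode names are hypothetical, and the jump targets are absolute instruction indices as they appear in the listing (real cBPF encodes them as relative offsets).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcode names for this sketch; not the real BPF_* encoding. */
enum op { LDH_ABS, LD_ABS, JEQ_K, RET_K };

struct insn {
    enum op  op;
    uint32_t k;       /* offset, constant or return value */
    uint8_t  jt, jf;  /* absolute jump targets, as printed in the listing */
};

/* "ip dst fosdem.org": 0x1f16168c is the address 31.22.22.140. */
static const struct insn ip_dst_prog[] = {
    { LDH_ABS, 12,         0, 0 },  /* (000) load EtherType */
    { JEQ_K,   0x800,      2, 5 },  /* (001) IPv4? */
    { LD_ABS,  30,         0, 0 },  /* (002) load IPv4 destination */
    { JEQ_K,   0x1f16168c, 4, 5 },  /* (003) matches the address? */
    { RET_K,   65535,      0, 0 },  /* (004) accept, up to 65535 bytes */
    { RET_K,   0,          0, 0 },  /* (005) reject */
};
#define IP_DST_PROG_LEN (sizeof(ip_dst_prog) / sizeof(ip_dst_prog[0]))

/* Returns the number of bytes to capture, or 0 to reject the packet. */
static uint32_t run_filter(const struct insn *prog, size_t n,
                           const uint8_t *pkt, size_t len)
{
    uint32_t a = 0;  /* the accumulator */
    size_t pc = 0;

    while (pc < n) {
        const struct insn *i = &prog[pc];

        switch (i->op) {
        case LDH_ABS:  /* 16-bit big-endian load; out of bounds rejects */
            if ((size_t)i->k + 2 > len)
                return 0;
            a = ((uint32_t)pkt[i->k] << 8) | pkt[i->k + 1];
            pc++;
            break;
        case LD_ABS:   /* 32-bit big-endian load; out of bounds rejects */
            if ((size_t)i->k + 4 > len)
                return 0;
            a = ((uint32_t)pkt[i->k] << 24) | ((uint32_t)pkt[i->k + 1] << 16) |
                ((uint32_t)pkt[i->k + 2] << 8) | pkt[i->k + 3];
            pc++;
            break;
        case JEQ_K:
            assert(i->jt < n && i->jf < n);
            pc = (a == i->k) ? i->jt : i->jf;
            break;
        case RET_K:
            return i->k;
        }
    }
    return 0;
}
```

Calling `run_filter(ip_dst_prog, IP_DST_PROG_LEN, frame, frame_len)` on an IPv4 frame whose destination bytes at offset 30 are 31.22.22.140 returns 65535; any other frame returns 0.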

slide-26
SLIDE 26

Packet filtering cBPF

Translated to eBPF

eBPF program (11 insns):

    L0:  xor r0, r0
    L1:  xor r7, r7
    L2:  mov r6, r1
    L3:  ldh r0, [12]
    L4:  jne r0, #0x800, L9
    L5:  ldw r0, [30]
    L6:  jne r0, #0x1f16168c, L9
    L7:  mov32 r0, #0xffff
    L8:  exit
    L9:  mov32 r0, #0x1
    L10: exit

slide-27
SLIDE 27

Tracing

slide-28
SLIDE 28

Linux Trace toolkit

  • Easy to use
  • User Defined Trace Points
  • Filtering
  • Common Trace Format
  • High performance

slide-33
SLIDE 33

Adding an LTTng tracepoint

    /* Send burst of packets on an output interface */
    static inline int
    send_burst(struct lcore_conf *qconf, uint16_t n, uint16_t port)
    {
        struct rte_mbuf **m_table = qconf->tx_mbufs[port].m_table;
        uint16_t queueid = qconf->tx_queue_id[port];
        int ret;

        ret = rte_eth_tx_burst(port, queueid, m_table, n);
        tracepoint(l3fwd, tx_burst, port, queueid, n, ret);
        if (unlikely(ret < n))
            rte_pktmbuf_free_bulk(&m_table[ret], n - ret);

        return 0;
    }

slide-34
SLIDE 34

Using eBPF from userspace

  • Origin: DTrace
  • Adds NOP locations and ELF section
  • Run code at tracepoint
  • Prerequisites
      • uprobe: Linux 3.14 (or later) kernel
      • sys/sdt.h: systemtap-sdt-dev

slide-40
SLIDE 40

Adding DTRACE probes

    static void
    pkt_burst_receive(struct fwd_stream *fs)
    {
        struct rte_mbuf *pkts_burst[MAX_PKT_BURST];
        uint16_t i, nb_rx;

        /* Receive a burst of packets. */
        nb_rx = rte_eth_rx_burst(fs->rx_port, fs->rx_queue,
                                 pkts_burst, nb_pkt_per_burst);
        DTRACE_PROBE1(testpmd, rx_burst, nb_rx);
        if (unlikely(nb_rx == 0))
            return;
        ...
    }
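
A standalone way to try the same probe pattern outside DPDK is sketched below. It assumes `sys/sdt.h` may or may not be installed and falls back to a no-op macro when it is not; `fake_rx_burst` and `poll_loop` are hypothetical stand-ins for the receive call and the forwarding loop, and the provider name `demo` is made up.

```c
#include <assert.h>
#include <stdint.h>

/* Use the real USDT macros when sys/sdt.h is available. */
#if defined(__has_include)
#  if __has_include(<sys/sdt.h>)
#    include <sys/sdt.h>
#  endif
#endif
/* Fallback: the probe compiles away, so the code still builds without the header. */
#ifndef DTRACE_PROBE1
#  define DTRACE_PROBE1(provider, name, arg0) ((void)(arg0))
#endif

#define BURST 32u  /* hypothetical burst size for this sketch */

/* Hypothetical stand-in for a receive call: a full burst on every fourth poll. */
static uint16_t fake_rx_burst(uint16_t poll)
{
    return (poll % 4 == 0) ? BURST : 0;
}

/* Poll `iterations` times, firing the probe once per poll, as in the excerpt above. */
static uint64_t poll_loop(uint16_t iterations)
{
    uint64_t total = 0;

    for (uint16_t i = 0; i < iterations; i++) {
        uint16_t nb_rx = fake_rx_burst(i);

        assert(nb_rx <= BURST);
        DTRACE_PROBE1(demo, rx_burst, nb_rx);
        if (nb_rx == 0)
            continue;
        total += nb_rx;
    }
    return total;
}
```

With the header present, the probe site is a NOP plus an ELF note, so an untraced run should pay very little for it.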

slide-41
SLIDE 41

Looking for USDT

Use bpftrace to look for tracepoints in application

    $ sudo bpftrace -l "usdt:./build/app/testpmd"
    usdt:./build/app/testpmd:testpmd:rx_burst

slide-42
SLIDE 42

Running bpftrace

Build a histogram of the number of packets per loop

    $ sudo bpftrace -e 'usdt:./build/app/testpmd:rx_burst { @ = hist(arg0); }'
    Attaching 1 probe...
    ^C

    @:
    [0]              16001930 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
    [1]                     0 |                                                    |
    [2, 4)                  0 |                                                    |
    [4, 8)                  0 |                                                    |
    [8, 16)                 0 |                                                    |
    [16, 32)                0 |                                                    |
    [32, 64)          5333977 |@@@@@@@@@@@@@@@@@                                   |
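
The bucket labels in that output are powers of two. A minimal sketch of the bucketing is below, assuming non-negative values only; `log2_bucket` and `hist_add` are names invented here, not bpftrace internals.

```c
#include <assert.h>
#include <stdint.h>

#define NBUCKETS 65u  /* bucket 0 holds zero; buckets 1..64 cover every bit length */

/*
 * Map a value to its power-of-two bucket:
 *   0 -> bucket 0 ("[0]"), 1 -> bucket 1 ("[1]"),
 *   2..3 -> bucket 2 ("[2, 4)"), 32..63 -> bucket 6 ("[32, 64)").
 * The result is the number of bits needed to represent v.
 */
static unsigned log2_bucket(uint64_t v)
{
    unsigned bits = 0;

    while (v != 0) {
        bits++;
        v >>= 1;
    }
    return bits;
}

/* Count one observation; `hist` must have NBUCKETS entries. */
static void hist_add(uint64_t *hist, uint64_t v)
{
    unsigned b = log2_bucket(v);

    assert(b < NBUCKETS);
    hist[b]++;
}
```

Read this way, the output above shows only two kinds of poll: empty ones, and ones returning between 32 and 63 packets.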

slide-43
SLIDE 43

Performance

slide-44
SLIDE 44

Caveats

  • Limited hardware - x86 with 25G NIC
  • One-off test
  • Untuned
  • Limited scope
  • Testpmd - 64 byte packets
  • Immediate drop
  • Current DPDK 19.11
  • Single queue active

slide-51
SLIDE 51

Test configuration

[Diagram labels: Testpmd tx_only, Testpmd rx_only]

  • Sender tx-only
  • Receiver rx-only
  • Observe Receive packets/sec

slide-52
SLIDE 52

Capture and Tracing Performance

Test Disabled Enabled Pdump

  • 36.85

Pdump + eBPF Lttng

  • 0.42
  • 0.02

bpftrace

  • 0.01
  • 56.72


slide-53
SLIDE 53

Conclusion

slide-54
SLIDE 54

Ongoing work

  • DPDK packet capture
  • Pcapng support
  • Capture filter
  • dumpcap (tshark) syntax
  • DPDK trace points

slide-60
SLIDE 60

Thank you

  • Questions
  • Thanks
  • DPDK community
  • LTTng
  • eBPF developers
  • Contact: stephen@networkplumber.org, @networkplumber