SLIDE 1 Rx Listener Performance
- Or: How to Saturate a 10GbE Link with an OpenAFS Rx Fileserver
Andrew Deason June 2019
OpenAFS Workshop 2019
SLIDE 2 Overview
- Problem and background
- Baseline: ~1.7 gbps
- foreach(why_are_we_so_slow):
- Discuss issue
- Show solution
- Performance impact
- End result: 10gbps+
- Other considerations, future
SLIDE 3 The Problem
- Customer has a ~1 GiB volume, files are 1 MiB+
- Hundreds of clients, all fetch at once
- Fileserver saturated at 1-2gbps
- 1 GiB × 100 clients @ 1.5 gbps ≈ 9.5 minutes
- 1 GiB × 100 clients @ 10 gbps ≈ 1.5 minutes
- We do not care about:
- Single-client performance, latency
- Uncached files
- Complex solutions (DPF, TCP)
- Other servers
SLIDE 4 Test Environment
- Fileserver
- Solaris 11.4
- HP ProLiant DL360 Gen9
- Xeon E5-2667v3, 8/16 cores
- Clients
- Fake afscp clients on Debian 9.5
- HP ProLiant DL360 Gen10
- Xeon Gold 6136, 12/24 cores
- 2x Broadcom Limited NetXtreme II BCM57810 10gbps NIC
- Harness: afscp_bench Python script
SLIDE 5 Step 0: Baseline (master fc7e1700)
[Graph: throughput (MiB/s, gbps) vs. number of rx listeners; series: 00base; iperf UDP shown for reference]
SLIDE 6
Step 0: Baseline (master fc7e1700)
“Is this server even busy?”
SLIDE 7
Step 0: Baseline (master fc7e1700)
“Is this server even busy?”
SLIDE 8
Step 0: Baseline (master fc7e1700)
One thread is doing all the work!
SLIDE 9 rx_Listener
- aka rxi_ListenerProc, “the listener”, etc.
- TCP: read(fd)/recv(fd) per stream
- UDP: recvmsg(fd) for everyone
SLIDE 10 rx_Listener
- Listener calls recvmsg(), parses, hands out data
- Other processing, too (later)
- ...for all 128/256/etc. threads (-p)
- We're sending data, but we still receive ACKs
SLIDE 11 Step 1: Multiple Listeners
- Create multiple threads to run rxi_ListenerProc() (see the sketch below)
- recvmsg() itself internally serialized
- Everything after recvmsg() runs in parallel (per-conn)
- conn_recv_lock
- How many threads?
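A minimal, self-contained sketch of the multi-listener idea, not the actual OpenAFS change: several threads share the Rx UDP socket, the recvmsg() call itself is serialized by a mutex, and everything after it runs concurrently (the real code serializes per connection via conn_recv_lock). Names like listener_main, NLISTENERS, and recv_lock are made up for illustration; the printf stands in for real packet processing.

/* Illustration only: multiple listener threads on one UDP socket.
 * recvmsg() is serialized; post-receive processing runs in parallel. */
#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/uio.h>

#define NLISTENERS 4

static int rx_socket;
static pthread_mutex_t recv_lock = PTHREAD_MUTEX_INITIALIZER;

static void *
listener_main(void *arg)
{
    char buf[65536];
    (void)arg;
    for (;;) {
        struct sockaddr_in from;
        struct iovec iov = { buf, sizeof(buf) };
        struct msghdr msg;
        ssize_t len;

        memset(&msg, 0, sizeof(msg));
        msg.msg_name = &from;
        msg.msg_namelen = sizeof(from);
        msg.msg_iov = &iov;
        msg.msg_iovlen = 1;

        /* Only one thread at a time sits in recvmsg() on the shared socket. */
        pthread_mutex_lock(&recv_lock);
        len = recvmsg(rx_socket, &msg, 0);
        pthread_mutex_unlock(&recv_lock);
        if (len < 0)
            continue;

        /* Parsing, ACK handling, etc. happen here, concurrently with the
         * other listeners; the real patch serializes this per connection. */
        printf("listener %lu: got %zd bytes\n",
               (unsigned long)pthread_self(), len);
    }
    return NULL;
}

int
main(void)
{
    struct sockaddr_in addr;
    pthread_t tids[NLISTENERS];
    int i;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(7000);        /* 7000/udp: the AFS fileserver port */
    addr.sin_addr.s_addr = htonl(INADDR_ANY);

    rx_socket = socket(AF_INET, SOCK_DGRAM, 0);
    if (rx_socket < 0 ||
        bind(rx_socket, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        perror("socket/bind");
        return 1;
    }
    for (i = 0; i < NLISTENERS; i++)
        pthread_create(&tids[i], NULL, listener_main, NULL);
    pthread_join(tids[0], NULL);        /* listeners loop forever */
    return 0;
}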
SLIDE 12 Step 1: Multiple Listeners
[Graph: throughput (MiB/s, gbps) vs. number of rx listeners; series: 00base, 01mlx; iperf UDP shown for reference]
SLIDE 13
Step 1: Multiple Listeners
SLIDE 14
Step 1: Multiple Listeners
SLIDE 15 Syscall Overhead
- Each packet received is 1 syscall, plus locking
- in rx_Listener
- Each packet sent is 1 syscall, plus locking
- sometimes in rx_Listener
- We must send packets separately, but:
SLIDE 16 Step 2: recvmmsg/sendmmsg
- recvmmsg()/sendmmsg() (note the extra m)
- Solaris 11.4+, RHEL 7+
- Receive packets in bulk, qsort() to group same-call packets (see the sketch below)
- Also benefits platforms without *mmsg
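Rough arithmetic for why batching matters: at 10 gbps with roughly 1400-byte packets, that is on the order of a million packets per second in each direction, so one syscall per packet is a lot of syscalls. Below is a hedged sketch of the batched receive side only; recv_batch, BATCH, and PKT_SIZE are made-up names, and the real change also sorts the batch (the qsort() above) and keeps a plain recvmsg() fallback for platforms without *mmsg.

/* Illustration only: receive up to BATCH datagrams in one recvmmsg() call. */
#define _GNU_SOURCE               /* for recvmmsg()/struct mmsghdr on Linux */
#include <string.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/uio.h>

#define BATCH    64
#define PKT_SIZE 16384

/* Returns the number of packets received (or -1 on error), with the
 * per-packet lengths written to lens[]. */
static int
recv_batch(int fd, char bufs[BATCH][PKT_SIZE],
           struct sockaddr_in froms[BATCH], int lens[BATCH])
{
    struct mmsghdr msgs[BATCH];
    struct iovec iovs[BATCH];
    int i, n;

    memset(msgs, 0, sizeof(msgs));
    for (i = 0; i < BATCH; i++) {
        iovs[i].iov_base = bufs[i];
        iovs[i].iov_len = PKT_SIZE;
        msgs[i].msg_hdr.msg_iov = &iovs[i];
        msgs[i].msg_hdr.msg_iovlen = 1;
        msgs[i].msg_hdr.msg_name = &froms[i];
        msgs[i].msg_hdr.msg_namelen = sizeof(froms[i]);
    }

    /* MSG_WAITFORONE (Linux): block until at least one datagram arrives,
     * then return as many as are already queued, up to BATCH. */
    n = recvmmsg(fd, msgs, BATCH, MSG_WAITFORONE, NULL);
    for (i = 0; i < n; i++)
        lens[i] = (int)msgs[i].msg_len;
    return n;
}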
SLIDE 17 Step 2: recvmmsg/sendmmsg
[Graph: throughput (MiB/s, gbps) vs. number of rx listeners; series: 00base, 01mlx, 02mmsg; iperf UDP shown for reference]
SLIDE 18
Step 2: recvmmsg/sendmmsg
SLIDE 19
Step 2: recvmmsg/sendmmsg
SLIDE 20
rx_Listener (again)
Where is the listener spending all its time?
Lots of time in sendmmsg()
SLIDE 21 rx_Write/rx_Writev buffering
- Normally: buffer, then sendmsg()
- If the tx window is full:
- Wait?
- Overfill tx window
- The listener calls sendmsg()
- Why?
- Reduces context switching for LWP
- Allows RPC threads to move on
SLIDE 22 Step 3: rxi_Start Defer
- Skip calling rxi_Start() in the listener (see the sketch below)
- Flag the call instead
- Wake up rx_Write, which calls rxi_Start()
- Only when rx_Write is waiting for the tx window
- Alternate approach: process packets in rx_Write
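A rough sketch of the deferral scheme described above, using made-up names (struct call, CALL_START_DEFERRED, start_transmit(), tx_window_open()) rather than the real Rx internals; the actual patch is on gerrit under the rxi_startdefer topic. The point is that the listener only flags the call and wakes a waiting writer instead of doing the transmit work itself, and it does so only when an rx_Write() thread is actually waiting for tx window space.

/* Illustration only; field, flag, and helper names are hypothetical. */
#include <pthread.h>

struct call {
    pthread_mutex_t lock;
    pthread_cond_t  tx_window_cv;   /* rx_Write waits here when the tx window is full */
    int             flags;
    int             writer_waiting; /* an rx_Write thread is blocked on the window */
};

#define CALL_START_DEFERRED 0x1

/* Trivial stand-ins for the real transmit-window check and the rxi_Start() work. */
static int  tx_window_open(struct call *c) { (void)c; return 1; }
static void start_transmit(struct call *c) { (void)c; }

/* Listener side: an incoming ACK has opened the call's tx window. */
static void
listener_ack_opened_window(struct call *c)
{
    pthread_mutex_lock(&c->lock);
    if (c->writer_waiting) {
        /* Step 3: don't send from the listener; flag the call, wake the writer. */
        c->flags |= CALL_START_DEFERRED;
        pthread_cond_signal(&c->tx_window_cv);
    } else {
        start_transmit(c);          /* old behavior: transmit from the listener */
    }
    pthread_mutex_unlock(&c->lock);
}

/* Writer side: rx_Write()/rx_Writev() blocked waiting for tx window space. */
static void
writer_wait_for_window(struct call *c)
{
    pthread_mutex_lock(&c->lock);
    c->writer_waiting = 1;
    while (!tx_window_open(c))
        pthread_cond_wait(&c->tx_window_cv, &c->lock);
    c->writer_waiting = 0;
    if (c->flags & CALL_START_DEFERRED) {
        c->flags &= ~CALL_START_DEFERRED;
        start_transmit(c);          /* do the deferred rxi_Start()-style work here */
    }
    pthread_mutex_unlock(&c->lock);
}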
SLIDE 23 Step 3: rxi_Start Defer
[Graph: throughput (MiB/s, gbps) vs. number of rx listeners; series: 00base, 01mlx, 02mmsg, 03defer; iperf UDP shown for reference]
SLIDE 24
Step 3: rxi_Start Defer
SLIDE 25
Step 3: rxi_Start Defer
SLIDE 26 recvmsg() parallelization
- Remember: recvmsg() itself internally serialized
- per socket
- SO_REUSEPORT allows multiple sockets bound to the same addr/port (see the sketch below)
- Solaris 11+, RHEL 6.5+
- Packets assigned to sockets based on a configurable hash
- Default: IP and port for source and destination
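A sketch of the SO_REUSEPORT setup, assuming one UDP socket per listener thread; NLISTENERS and open_reuseport_sockets() are made-up names for illustration.

/* Illustration only: one UDP socket per listener, all bound to the same
 * port via SO_REUSEPORT, so the kernel spreads incoming packets across
 * them using its (source, destination) address/port hash. */
#include <string.h>
#include <netinet/in.h>
#include <sys/socket.h>

#define NLISTENERS 8

static int
open_reuseport_sockets(unsigned short port, int fds[NLISTENERS])
{
    struct sockaddr_in addr;
    int i;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    addr.sin_addr.s_addr = htonl(INADDR_ANY);

    for (i = 0; i < NLISTENERS; i++) {
        int one = 1;
        fds[i] = socket(AF_INET, SOCK_DGRAM, 0);
        if (fds[i] < 0)
            return -1;
        /* All sockets may bind the same addr/port only with SO_REUSEPORT set. */
        if (setsockopt(fds[i], SOL_SOCKET, SO_REUSEPORT, &one, sizeof(one)) < 0)
            return -1;
        if (bind(fds[i], (struct sockaddr *)&addr, sizeof(addr)) < 0)
            return -1;
    }
    return 0;
}

Each listener thread then calls recvmsg()/recvmmsg() on its own descriptor, so the per-socket serialization from step 1 goes away; with the default hash (source/destination IP and port), all traffic from a given client address/port lands on the same socket.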
SLIDE 27 Step 4: SO_REUSEPORT
[Graph: throughput (MiB/s, gbps) vs. number of rx listeners; series: 00base, 01mlx, 02mmsg, 03defer, 04reuse; iperf UDP shown for reference]
SLIDE 28
Step 4: SO_REUSEPORT
SLIDE 29
Step 4: SO_REUSEPORT
SLIDE 30 Small RPCs
[Graph: small RPCs with SO_REUSEPORT (step 4); MiB/s vs. number of rx listeners; series: 00base, 01mlx, 02mmsg, 03defer, 04reuse]
SLIDE 31 Options Impact
- So far, default options besides -p
- What options matter?
- -auditlog
SLIDE 32 Auditlog
[Graph: sysvmq auditing; MiB/s vs. number of rx listeners; series: 00base, 01mlx, 02mmsg, 03defer, 04reuse]
SLIDE 33 Auditlog
[Graph: sysvmq auditing (zoom); MiB/s vs. number of rx listeners; series: 00base, 01mlx, 02mmsg, 03defer, 04reuse]
SLIDE 34 Auditlog
- Audit subsystem uses one big global lock
- Addressed for new pipe audit interface
- See “OpenAFS Audit Interfaces enhancements” tomorrow
SLIDE 35 Lessons Learned
- Recording per-function runtimes is way too heavyweight
- DTrace profile probes vs pid
- Must verify profiling performance impact
- Test, don’t assume
- VMs
- localhost
- auditlog
SLIDE 36 Future Possibilities
- More efficient ACK processing?
- Revisit jumbograms
- AF_RXRPC
- Kernel client improvements
- TCP (DPF)
SLIDE 37
Code
- Top commit: https://gerrit.openafs.org/13621
- Public: https://gerrit.openafs.org/#/q/topic:recvmmsg and https://gerrit.openafs.org/#/q/topic:sendmmsg
- Drafts: https://gerrit.openafs.org/#/q/topic:multi-listener , https://gerrit.openafs.org/#/q/topic:rxi_startdefer , https://gerrit.openafs.org/#/q/topic:reuseport
- Slides: http://dson.org/talks
SLIDE 38
?