Multipath TCP Upstreaming: Mat Martineau (Intel) and Matthieu Baerts (Tessares)


slide-1
SLIDE 1

Multipath TCP Upstreaming

Mat Martineau (Intel) and Matthieu Baerts (Tessares)

slide-2
SLIDE 2

Plan

  • Multipath TCP Overview
  • First Patch Set Upstreaming Roadmap
  • Advanced Features Roadmap
  • Conclusion and links


slide-3
SLIDE 3

What is MPTCP?


slide-4
SLIDE 4

Multipath TCP (MPTCP)

  • Exchange data for a single connection over different paths, simultaneously
  • Specified in RFC 6824 and supported by the IETF Multipath TCP (MPTCP) working group

Smartphone and WiFi icons by Blurred203 and Antü Plasma under CC-by-sa, others from Tango project, public domain

slide-5
SLIDE 5

Multipath TCP (MPTCP)

  • Exchange data for a single connection over different paths, simultaneously
  • Specified in RFC 6824 and supported by the IETF Multipath TCP (MPTCP) working group
  • More bandwidth (illustration not preserved)

slide-6
SLIDE 6

Multipath TCP (MPTCP)

  • Exchange data for a single connection over different paths, simultaneously
  • Specified in RFC 6824 and supported by the IETF Multipath TCP (MPTCP) working group
  • More mobility: the “walk-out” scenario (illustration not preserved)

slide-10
SLIDE 10

Multipath TCP Use Cases

  • Smartphones (Apple, Samsung, LG, others)

○ Support failover / “walk-out” scenario
○ More bandwidth

  • Residential gateways (LTE + DSL, for example)

○ More bandwidth

  • Multipath TCP is part of 5G standardisation:

○ Access Traffic Steering, Switching and Splitting (ATSSS)

slide-11
SLIDE 11

Multipath TCP Use Cases: ATSSS

  • Steering: best network selection
  • Switching: seamless handover from one access network to the other, and vice versa
  • Splitting: network aggregation and improved end-user experience

Defined in 3GPP Release 16, ATSSS is a core network function in 5G networks. It plays a key role in managing data traffic between 3GPP (5G, 4G) networks and non-3GPP (Wi-Fi) networks.

Diagram: 5G and Wi-Fi icons for each mode (not preserved)

slide-12
SLIDE 12

Existing Linux implementation

  • First implementation for the Linux kernel in March 2009

○ Latest out-of-tree MPTCP Linux kernel version is v0.95
○ Generally used as a client / server in current deployments, for millions of users

  • But not upstreamable

○ Built to support experiments and rapid changes, but not generic enough
○ Special-purpose implementation of MPTCP

slide-13
SLIDE 13

Guidelines for upstream

  • New implementation cannot affect the existing TCP stack:

○ No performance regressions, and no code size change if CONFIG_MPTCP=n
○ Maintainable and configurable
○ Can be used in a variety of deployments

  • Multipath TCP will be "opt-in"
  • Proceed in steps:

○ Minimal feature set first
○ Optimisations and advanced features later
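
One common kernel idiom for the "no code size change if CONFIG_MPTCP=n" guideline is to give each call-out from TCP an inline stub that returns a constant when the feature is configured out. The sketch below shows the idea in standalone C. The names `demo_sock`, `demo_sk_is_mptcp()` and `demo_tcp_rcv()` are hypothetical and are not taken from the patch set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a TCP socket; not a kernel structure. */
struct demo_sock {
    bool is_mptcp;
};

#ifdef CONFIG_MPTCP
/* Feature built in: the check reads the per-socket flag. */
static inline bool demo_sk_is_mptcp(const struct demo_sock *sk)
{
    return sk->is_mptcp;
}
#else
/* Feature configured out: the check is a constant, so the compiler
 * can discard the MPTCP branch in every caller. */
static inline bool demo_sk_is_mptcp(const struct demo_sock *sk)
{
    (void)sk;
    return false;
}
#endif

/* Hypothetical receive path with a single MPTCP call-out.
 * Returns 1 when the segment would be handed to MPTCP, 0 for plain TCP. */
static int demo_tcp_rcv(const struct demo_sock *sk)
{
    assert(sk != NULL);
    if (demo_sk_is_mptcp(sk))
        return 1;
    return 0;
}
```

Built without `-DCONFIG_MPTCP`, `demo_tcp_rcv()` always takes the plain TCP path, whatever the flag says.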

slide-14
SLIDE 14

Protocol Overview: RFC 6824

  • Looks like TCP on the wire, similar usage for apps

Diagram: Socket Layer above MPTCP, above two TCP subflows, above the IP Layer

slide-15
SLIDE 15

Protocol Overview: RFC 6824

  • Looks like TCP on the wire, similar usage for apps
  • Subflow mapping: Data Sequence Number

The byte stream A B C D E F is sent over two subflows:

Dseq=0, seq=123, “A”
Dseq=1, seq=124, “B”
Dseq=2, seq=456, “C”
Dseq=4, seq=125, “E”
Dseq=5, seq=126, “F”
Dseq=3, seq=456, “D”

Smartphone icon by Blurred203 under CC-by-sa, others from Tango project, public domain
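
The mapping on this slide can be sketched in C. Each byte carries a connection-level Data Sequence Number (Dseq) and a subflow-level TCP sequence number (seq), and the receiver orders the stream by Dseq alone. `struct mapping` and `reassemble()` are hypothetical names. The slide prints seq=456 for both “C” and “D”; the sketch assumes 457 for “D”, since two bytes on one subflow cannot share a sequence number.

```c
#include <assert.h>
#include <stddef.h>

/* One byte-level mapping, as drawn on the slide. */
struct mapping {
    unsigned int dseq;  /* connection-level Data Sequence Number */
    unsigned int seq;   /* subflow-level TCP sequence number */
    char data;          /* payload byte */
};

/* Arrival order from the slide; seq=457 for 'D' is an assumption. */
static const struct mapping slide_arrivals[] = {
    { 0, 123, 'A' }, { 1, 124, 'B' }, { 2, 456, 'C' },
    { 4, 125, 'E' }, { 5, 126, 'F' }, { 3, 457, 'D' },
};

/* Place each byte at its connection-level offset. The subflow seq is not
 * consulted: it only orders bytes within one subflow.
 * Returns 0 on success, -1 if a mapping falls outside the buffer. */
static int reassemble(const struct mapping *m, size_t n,
                      unsigned int base_dseq, char *out, size_t out_len)
{
    assert(m != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        unsigned int off = m[i].dseq - base_dseq;
        if (off >= out_len)
            return -1;
        out[off] = m[i].data;
    }
    return 0;
}
```

Calling `reassemble(slide_arrivals, 6, 0, buf, 6)` fills `buf` in Dseq order even though “D” arrives last.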

slide-19
SLIDE 19

Protocol Overview: RFC 6824

  • Looks like TCP on the wire, similar usage for apps
  • Subflow mapping: Data Sequence Number + ACK
slide-20
SLIDE 20

Protocol Overview: RFC 6824

  • Looks like TCP on the wire, similar usage for apps
  • Subflow mapping: Data Sequence Number + ACK
  • MP_CAPABLE, MP_JOIN, DATA_FIN (carried in TCP options)
slide-21
SLIDE 21

Protocol Overview: RFC 6824

  • Looks like TCP on the wire, similar usage for apps
  • Subflow mapping: Data Sequence Number + ACK
  • MP_CAPABLE, MP_JOIN, DATA_FIN
  • Signaling: Add/Remove Addresses, Fast Close (sent via TCP ACK)

slide-22
SLIDE 22

Protocol Overview: RFC 6824

  • Looks like TCP on the wire, similar usage for apps
  • Subflow mapping: Data Sequence Number + ACK
  • MP_CAPABLE, MP_JOIN, DATA_FIN
  • Signaling: Add/Remove Addresses, Fast Close
  • Coupled receive windows across TCP subflows

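
Because MPTCP signaling travels in TCP options, a receiver has to walk the option list to find it. The sketch below is a minimal, hedged illustration of that walk. The option kind (30) and the subtype values follow RFC 6824, where the subtype sits in the upper four bits of the third option byte. The function name `find_mptcp_subtype()` and the enum names are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TCPOPT_EOL   0
#define TCPOPT_NOP   1
#define TCPOPT_MPTCP 30   /* option kind assigned to MPTCP (RFC 6824) */

/* MPTCP option subtypes from RFC 6824; the enum names are illustrative. */
enum mptcp_subtype {
    MPTCP_SUB_CAPABLE     = 0x0,
    MPTCP_SUB_JOIN        = 0x1,
    MPTCP_SUB_DSS         = 0x2,
    MPTCP_SUB_ADD_ADDR    = 0x3,
    MPTCP_SUB_REMOVE_ADDR = 0x4,
    MPTCP_SUB_PRIO        = 0x5,
    MPTCP_SUB_FAIL        = 0x6,
    MPTCP_SUB_FASTCLOSE   = 0x7,
};

/* Walk a TCP option block and return the subtype of the first MPTCP
 * option, or -1 if none is present or the block is malformed. */
static int find_mptcp_subtype(const uint8_t *opts, size_t len)
{
    assert(opts != NULL || len == 0);
    size_t i = 0;
    while (i < len) {
        uint8_t kind = opts[i];
        if (kind == TCPOPT_EOL)
            break;
        if (kind == TCPOPT_NOP) {       /* single-byte padding */
            i++;
            continue;
        }
        if (i + 1 >= len)
            break;                      /* truncated: no length byte */
        uint8_t olen = opts[i + 1];
        if (olen < 2 || i + olen > len)
            break;                      /* malformed length */
        if (kind == TCPOPT_MPTCP && olen >= 3)
            return opts[i + 2] >> 4;    /* subtype: upper 4 bits */
        i += olen;
    }
    return -1;
}
```

For example, the bytes `01 01 1e 04 40 05` (two NOPs followed by a 4-byte MPTCP option) yield `MPTCP_SUB_REMOVE_ADDR`.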

slide-23
SLIDE 23

Multiple versions of MPTCP

  • RFC 6824: Experimental

○ All known implementations support this version, and only this version

  • RFC 6824 bis: Standard

○ Submitted to IESG for publication
○ Behavioral changes: MPTCP v0 → MPTCP v1
○ Some parts easier to implement
○ Selected by 3GPP for 5G

slide-24
SLIDE 24

First Patch Set Roadmap


slide-25
SLIDE 25

MPTCP Socket architecture

Diagram: Socket Layer above MPTCP, above two TCP subflows, each with its own queue of SKBs (SKB 1, SKB 2, ...)

Diagram labels:

  • IP Proto: struct proto
  • TCP ULP: struct tcp_ulp_ops
  • We start from: tcp_request_sock_ops
  • SKB extension: struct mptcp_ext, to store the Data Sequence Signal (25 bytes)

slide-26
SLIDE 26

Userspace API

  • MPTCP selected when creating the socket:

socket(AF_INET(6), SOCK_STREAM, IPPROTO_MPTCP);


slide-27
SLIDE 27

Userspace API

  • MPTCP selected when creating the socket:

socket(AF_INET(6), SOCK_STREAM, IPPROTO_MPTCP);

○ IPPROTO_MPTCP = IPPROTO_TCP | 0x100; /* = 262 */


slide-28
SLIDE 28

Userspace API

  • MPTCP selected when creating the socket:

socket(AF_INET(6), SOCK_STREAM, IPPROTO_MPTCP);

○ IPPROTO_MPTCP = IPPROTO_TCP | 0x100; /* = 262 */

  • getsockopt() / setsockopt() with MPTCP socket or its TCP subflows?


slide-29
SLIDE 29

Userspace API

  • MPTCP selected when creating the socket:

socket(AF_INET(6), SOCK_STREAM, IPPROTO_MPTCP);

○ IPPROTO_MPTCP = IPPROTO_TCP | 0x100; /* = 262 */

  • getsockopt() / setsockopt() with MPTCP socket or its TCP subflows?
  • Security: who can create MPTCP sockets?

○ Initial implementation will not be hardened by broad use yet (syzkaller, etc.)
○ sysctl per network namespace, MPTCP disabled by default: is it enough?
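
From an application's point of view, opting in could look like the sketch below. It requests IPPROTO_MPTCP at socket creation and, as a policy chosen only for this example, falls back to plain TCP when the request fails (for instance because the kernel lacks MPTCP or it is disabled). The helper name `open_stream_socket()` is hypothetical. The fallback `#define` uses the value 262 given on the slide, for headers that do not yet define the constant.

```c
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>
#include <sys/socket.h>

#ifndef IPPROTO_MPTCP
#define IPPROTO_MPTCP 262   /* IPPROTO_TCP | 0x100, as on the slide */
#endif

/* Try MPTCP first, then plain TCP (an assumed policy, not from the talk).
 * family is AF_INET or AF_INET6. Returns a file descriptor or -1, and
 * sets *used_mptcp to 1 only if the MPTCP socket was created. */
static int open_stream_socket(int family, int *used_mptcp)
{
    assert(used_mptcp != NULL);

    int fd = socket(family, SOCK_STREAM, IPPROTO_MPTCP);
    if (fd >= 0) {
        *used_mptcp = 1;
        return fd;
    }

    *used_mptcp = 0;
    return socket(family, SOCK_STREAM, IPPROTO_TCP);
}
```

The rest of the application can then use the descriptor with the usual connect(), send() and recv() calls, in line with the "similar usage for apps" point made earlier in the deck.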

slide-30
SLIDE 30

Diagnostics

  • MPTCP will have a collection of counters for diagnostic and debug purposes
  • Per-socket data will be shared with userspace via sock_diag(7)

○ TCP ULP framework has been extended to enable diag

  • Some TCP counters are also found in /proc

○ Should MPTCP add to these as well?


slide-31
SLIDE 31

Tests

  • Kernel Self Tests

○ Between multiple namespaces (veth)
○ MPTCP ⇔ MPTCP, MPTCP ⇔ TCP, TCP ⇔ MPTCP
○ Various conditions including packet loss, reordering, and variations in routing

  • Packetdrill

○ Background project ongoing to add MPTCP support
○ An out-of-tree Packetdrill with MPTCP support exists, but it is old and limited

slide-32
SLIDE 32

Initial use case

  • Server role is a good place to start
  • Simpler path management

○ Client side handles multiple interfaces (like cellular + Wi-Fi)

  • Common server configuration uses one public interface for clients

○ Advertising additional interfaces not required

  • Client features all build on what’s needed for servers

Diagram: a server with one address (IP1) reached across the Internet by a client with two addresses (IP2, IP3), each behind a NAT

slide-33
SLIDE 33

Code already merged upstream

  • SKB extensions

○ Needed to carry MPTCP options that are tied to the data payload
○ Also used to remove sp (sec_path) and nf_bridge pointers from struct sk_buff
○ Suitable for data that can’t fit in sk_buff and justifies memory overhead

  • Add inet_diag_ulp_info to socket diag format and ULP get_info hook

slide-34
SLIDE 34

Change in TCP Code

Git Stat:

 include/linux/skbuff.h          | 11 ++
 include/linux/tcp.h             | 51 ++++++++++
 include/net/sock.h              |  6 +-
 include/net/tcp.h               | 20 ++++
 include/trace/events/sock.h     |  5 +-
 include/uapi/linux/in.h         |  2 +
 net/Kconfig                     |  1 +
 net/Makefile                    |  1 +
 net/ax25/af_ax25.c              |  2 +-
 net/core/skbuff.c               |  7 ++
 net/decnet/af_decnet.c          |  2 +-
 net/ipv4/inet_connection_sock.c |  2 +
 net/ipv4/tcp.c                  |  8 +-
 net/ipv4/tcp_input.c            | 29 +++++-
 net/ipv4/tcp_ipv4.c             |  4 +-
 net/ipv4/tcp_minisocks.c        |  6 ++
 net/ipv4/tcp_output.c           | 62 +++++++++++-
 net/ipv4/tcp_ulp.c              | 12 +++

slide-35
SLIDE 35

Changes to TCP code

  • tcp_ulp_clone()
  • Export two low-level TCP functions and one struct
  • SKBs with MPTCP extensions can’t be coalesced or collapsed
  • MPTCP option parsing and writing
  • is_mptcp flag in tcp_sock and tcp_request_sock


slide-36
SLIDE 36

Changes to TCP code, continued

  • One MPTCP-specific branch in TCP minisocks
  • Call out to MPTCP from tcp_data_queue to add SKB extension and process ACKs
  • Additional members in struct tcp_options_received
  • Subflow receive window sharing will introduce changes too

slide-37
SLIDE 37

Advanced Features Roadmap


slide-38
SLIDE 38

Path Manager

Which path to create/remove? Which address to announce?


slide-39
SLIDE 39

Userspace Path Manager

  • Peers share ADD_ADDR and REMOVE_ADDR signals to advertise available addresses for each MPTCP connection
  • Path manager runs in userspace and uses generic netlink to track address and local interface updates and request subflow changes
  • Can be customized with different policies
  • Multipath TCP Daemon alpha release is available at github.com/intel/mptcpd
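
To make "customized with different policies" concrete, the sketch below models a policy as a function that decides which subflows to request when a new local address appears. It is a hypothetical illustration only. It does not use the mptcpd API or generic netlink, and the types, address identifiers and policy names are invented for this example.

```c
#include <assert.h>
#include <stddef.h>

/* A request to create one subflow between two addresses. Plain integer
 * IDs stand in for the IP addresses a real path manager would track. */
struct subflow_req {
    unsigned int local_id;
    unsigned int remote_id;
};

/* A policy fills 'out' with the subflows it wants and returns the count. */
typedef size_t (*pm_policy_fn)(unsigned int new_local_id,
                               const unsigned int *remote_ids, size_t n_remote,
                               struct subflow_req *out, size_t max_out);

/* Full mesh: pair the new local address with every known remote address. */
static size_t policy_full_mesh(unsigned int new_local_id,
                               const unsigned int *remote_ids, size_t n_remote,
                               struct subflow_req *out, size_t max_out)
{
    assert(out != NULL);
    size_t n = 0;
    for (size_t i = 0; i < n_remote && n < max_out; i++) {
        out[n].local_id = new_local_id;
        out[n].remote_id = remote_ids[i];
        n++;
    }
    return n;
}

/* Conservative: at most one extra subflow, to the first remote address. */
static size_t policy_single(unsigned int new_local_id,
                            const unsigned int *remote_ids, size_t n_remote,
                            struct subflow_req *out, size_t max_out)
{
    return policy_full_mesh(new_local_id, remote_ids,
                            n_remote > 1 ? 1 : n_remote, out, max_out);
}
```

A daemon could hold a `pm_policy_fn` pointer per connection and call it on each address event, which keeps the policy swappable without touching the event handling.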

slide-40
SLIDE 40

Packet Scheduler

On which available path packets will be sent? Reinject packets in another path?


slide-41
SLIDE 41

Packet scheduling

  • Different connections may optimize for throughput, latency, or redundancy.
  • Peers can set a ‘backup’ flag on each subflow to limit transmission on that flow
  • Include basic scheduler options in the kernel
  • Consider eBPF to define custom schedulers, instead of kernel modules

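
One possible "basic scheduler" is sketched below: it picks the sendable subflow with the lowest smoothed RTT and uses subflows flagged as backup only when no regular subflow can send. This is an illustrative policy under those assumptions, not the scheduler from the patch set. `struct subflow_state` and `pick_subflow()` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Per-subflow state the scheduler looks at (illustrative fields). */
struct subflow_state {
    unsigned int srtt_us;  /* smoothed round-trip time, microseconds */
    bool backup;           /* peer or local 'backup' flag is set */
    bool can_send;         /* has room to transmit right now */
};

/* Return the index of the subflow to use, or -1 if none can send.
 * Regular subflows always win over backup subflows; within each group
 * the lowest RTT wins. */
static int pick_subflow(const struct subflow_state *sf, size_t n)
{
    assert(sf != NULL || n == 0);
    int best = -1;
    int best_backup = -1;

    for (size_t i = 0; i < n; i++) {
        if (!sf[i].can_send)
            continue;
        int *slot = sf[i].backup ? &best_backup : &best;
        if (*slot < 0 || sf[i].srtt_us < sf[*slot].srtt_us)
            *slot = (int)i;
    }
    return best >= 0 ? best : best_backup;
}
```

A throughput-oriented or redundant scheduler would replace only this selection function, which is the kind of customization the eBPF idea on the slide aims at.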

slide-42
SLIDE 42

Using MPTCP with unmodified binaries

  • Some organizations want to take advantage of MPTCP without recompiling their userspace
  • Can add BPF_CGROUP_SOCKET to attach an eBPF program that rewrites the protocol number passed to socket()
  • Similar attachment points exist for bind() and connect()
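
The decision such a program might make is small. The sketch below writes it as ordinary userspace C rather than eBPF, so it shows only the rewrite rule and not the attachment mechanism. `rewrite_protocol()` is a hypothetical name, and the rule itself (upgrade every IPv4/IPv6 stream socket that asks for TCP) is an assumption made for this example.

```c
#include <assert.h>
#include <netinet/in.h>
#include <sys/socket.h>

#ifndef IPPROTO_MPTCP
#define IPPROTO_MPTCP 262   /* IPPROTO_TCP | 0x100, as on the earlier slide */
#endif

/* Given the arguments of a socket() call, return the protocol number
 * that should actually be used. */
static int rewrite_protocol(int family, int type, int protocol)
{
    /* Sanity check: the constant matches the slide's definition. */
    assert(IPPROTO_MPTCP == (IPPROTO_TCP | 0x100));

    /* Strip creation flags that may be OR-ed into 'type'. */
    int base_type = type;
#ifdef SOCK_NONBLOCK
    base_type &= ~SOCK_NONBLOCK;
#endif
#ifdef SOCK_CLOEXEC
    base_type &= ~SOCK_CLOEXEC;
#endif

    if ((family == AF_INET || family == AF_INET6) &&
        base_type == SOCK_STREAM &&
        (protocol == 0 || protocol == IPPROTO_TCP))
        return IPPROTO_MPTCP;

    return protocol;   /* leave everything else untouched */
}
```

With this rule, an unmodified binary calling `socket(AF_INET, SOCK_STREAM, 0)` would get an MPTCP socket, while UDP and other socket types pass through unchanged.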

slide-43
SLIDE 43

MPTCP Performance optimizations

  • Initial emphasis is on correctness and reasonable MPTCP performance

○ While not disrupting TCP’s optimizations!

  • Target performance optimizations based on data
  • Protocol optimizations

○ Example: changing scheduler behavior for reinjection of data on different subflows

  • TCP Fast Open support


slide-44
SLIDE 44

Break-before-make

  • MPTCP can keep a connection active even with zero subflows connected

○ Allows the session to continue by adding a subflow with MP_JOIN

  • Can be useful to switch between access points
  • Will add this capability if there’s demand for it


slide-45
SLIDE 45

Subflow socket options

  • One MPTCP socket manages a set of in-kernel subflow sockets
  • Socket options that use TCP option space or change data flow could interfere
  • The MPTCP socket can act as an intermediary for subflow options
  • Will need to whitelist specific known-safe options
  • Could expose file descriptors only good for getsockopt()/setsockopt()


slide-46
SLIDE 46

Kernel TLS and MPTCP

  • kTLS is built on top of TCP using ULP framework
  • An MPTCP socket is not a TCP socket, so it doesn’t have ULP
  • TLS needs to operate on the MPTCP data stream, not subflow streams

○ TLS records could be split across subflows
○ MPTCP DSS mappings are specific to TCP sequence numbers

  • TLS_SW appears feasible but would need work to integrate with an MPTCP socket type

slide-47
SLIDE 47

Conclusion


slide-48
SLIDE 48

Conclusion

  • Build around TCP as much as we can.
  • We are close to having an initial patch set ready.


This project is open to everybody.

  • Wiki: https://is.gd/mptcp_upstream
  • Mailing list: https://lists.01.org/mailman/listinfo/mptcp
  • Git repository: https://github.com/multipath-tcp/mptcp_net-next
  • Paper: https://linuxplumbersconf.org/event/4/contributions/435/
  • mathew.j.martineau@linux.intel.com
  • matthieu.baerts@tessares.net
slide-49
SLIDE 49

Backup slides


slide-50
SLIDE 50

Protocol challenges

Relations between structures (diagram not preserved)

Used with Christoph Paasch’s permission

slide-51
SLIDE 51

kTLS record


From: https://netdevconf.org/1.2/papers/netdevconf-TLS.pdf

slide-52
SLIDE 52

Protocol challenges

Coupled receive windows across TCP subflows


Used with Sébastien Barré’s permission

slide-53
SLIDE 53

Multipath TCP (MPTCP)

Hybrid access network use-case (BBF TR-348 by Tessares - SwissCom - OVH)

Diagram labels: Telco Cloud; DSL CPE; 4G CPE; TCP; MPTCP; copper (long line); xDSL network; 4G/LTE network; fiber; cloud native HAG; multi-platform Agent; available capacity needed, for 4G/LTE connectivity

Images from Tessares

slide-54
SLIDE 54

Protocol challenges


slide-55
SLIDE 55

Protocol challenges

Data sequence numbers and mappings

Byte stream: A B C

slide-56
SLIDE 56

Protocol challenges

Data sequence numbers and mappings

Byte stream: A B C D E F

slide-58
SLIDE 58

Protocol challenges

Data sequence numbers and mappings

Byte stream: A B C D E F

Dseq=0, seq=123, “A”
Dseq=1, seq=124, “B”
Dseq=2, seq=456, “C”
Dseq=4, seq=125, “E”
Dseq=5, seq=126, “F”
Dseq=3, seq=456, “D”

slide-59
SLIDE 59

Protocol challenges

Data sequence numbers and mappings

Diagram: Socket Layer, mptcp_sock, tcp_sock, IP Layer

App sends data

slide-60
SLIDE 60

Protocol challenges

Data sequence numbers and mappings

Select the TCP subflow

slide-61
SLIDE 61

Protocol challenges

Data sequence numbers and mappings

TCP header: What DSS to set?

slide-62
SLIDE 62

Protocol challenges

Sending of ACKs to signal options, e.g. REMOVE_ADDR in a TCP ACK

Diagram: Socket Layer, mptcp_sock, two tcp_sock subflows (each with subflow ops), IP Layer

Notification: one interface is down

slide-63
SLIDE 63

Protocol challenges

Sending of ACKs to signal options, e.g. REMOVE_ADDR in a TCP ACK

Select the subflow

slide-64
SLIDE 64

Protocol challenges

Sending of ACKs to signal options, e.g. REMOVE_ADDR in a TCP ACK

Sending an ACK that does not come from the TCP stack

slide-65
SLIDE 65

Protocol challenges

Reception of ACKs with signaling options, e.g. REMOVE_ADDR in a TCP ACK

TCP ACK received

slide-66
SLIDE 66

Protocol challenges

Reception of ACKs with signaling options, e.g. REMOVE_ADDR in a TCP ACK

TCP ACK is not dropped

slide-68
SLIDE 68

Protocol challenges

Signaling with MPTCP:

  • MP_CAPABLE: SYN
  • MP_JOIN: SYN
  • DSEQ / DACK: ALL
  • FAST_CLOSE: ACK followed by RST
  • ADD_ADDR: ACK
  • REMOVE_ADDR: ACK