Breaking Up the Transport Logjam
Bryan Ford
Max Planck Institute for Software Systems and Yale University
baford@mpi-sws.org
Janardhan Iyengar
Franklin & Marshall College
jiyengar@fandm.edu
TSVAREA Open Meeting, IETF 73, 19 November 2008
— many semantic variations [RDP, DCCP, SCTP, SST, ...]
— high-speed [Floyd03], wireless links [Lochert07], ...
— dispersion [Gustafsson97], multihoming [SCTP],
logistics [Swany05], concurrent multipath [Iyengar06]…
— Performance Enhancing Proxies [RFC3135],
NATs and Firewalls [RFC3022], traffic shapers
— NATs & firewalls
— chicken & egg: application demand vs kernel support
— impassable “TCP-friendliness” barrier
— must work end-to-end, on all network types in path
— “You want how many flows? Not on my network!”
— Fundamentally “TCP-unfriendly”?
Traditional transports conflate 3 function areas...
Transport Protocol:
— Endpoint Identification (port numbers): Naming, Routing Concerns (NATs, firewalls care)
— Congestion Control: Performance Concerns (users, operators care)
— Transport Abstraction: Semantics, Reliability Concerns (applications care)
Break up the Transport according to these functions:
[Figure: layer stack before and after. Before: Physical, Data Link, Network, Transport, Session, Presentation, and Application Layers. After: the single Transport Layer is replaced by an Endpoint Layer, a Flow Regulation Layer, and a Transport Layer.]
Current transports have separate port spaces
[Figure: the TCP, UDP, and DCCP headers each carry their own Source Port and Dest Port fields (separate TCP, UDP, and DCCP port spaces), above an IP header carrying Source IP Address and Dest IP Address (Network Layer IP address space).]
Ports are routing info!
— IP address ⇒ Inter-Host Routing
— port numbers ⇒ Intra-Host Routing
Do ports really belong in the Network Layer?
— Parse transport headers ⇒ only TCP, UDP get through
— DHCP port borrowing/sharing [Despres, Bajko, Boucadair]
— Assign each host a CIDR subnet, low bits = “port #”
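The CIDR option in the last bullet can be illustrated with a small sketch. It assumes each host is assigned an IPv6 /112, so the low 16 bits of the address play the role of the port number. The prefix `2001:db8::/112` (a documentation prefix), the 16-bit split, and both helper names are hypothetical choices made for illustration, not something the slides specify.

```python
import ipaddress

PORT_BITS = 16  # assumption: the low 16 bits of the address act as the "port #"


def endpoint_address(host_subnet: str, port: int) -> ipaddress.IPv6Address:
    """Combine a host's subnet and a port number into one routable address."""
    net = ipaddress.IPv6Network(host_subnet)
    if net.prefixlen != 128 - PORT_BITS:
        raise ValueError("host subnet must leave exactly PORT_BITS low bits free")
    if not 0 <= port < (1 << PORT_BITS):
        raise ValueError("port out of range")
    return net.network_address + port


def split_endpoint(addr: ipaddress.IPv6Address):
    """Recover (host subnet, port) from an endpoint address."""
    value = int(addr)
    port = value & ((1 << PORT_BITS) - 1)
    base = ipaddress.IPv6Address(value - port)
    return ipaddress.IPv6Network((base, 128 - PORT_BITS)), port
```

With this layout, `endpoint_address("2001:db8::/112", 80)` yields `2001:db8::50`, and a router or NAT can deliver to the right endpoint without parsing any transport header.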
Factor endpoints into shared Endpoint Layer
[Figure: transport headers sit above a shared Endpoint Header carrying Source Port and Dest Port (Endpoint Layer port space), which sits above the IP header carrying Source IP Address and Dest IP Address (Network Layer IP address space).]
Workable starting point exists — UDP!
[Figure: the UDP header (Source Port, Dest Port) provides the Endpoint Layer port space, above the IP header (Source IP Address, Dest IP Address) providing the Network Layer IP address space.]
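As a rough sketch of using UDP as the endpoint layer, a user-space transport could place its own segment inside the UDP payload, so that UDP's ports do the endpoint identification. The 4-byte shim below (transport id, flags, length), the id values, and the function names are all hypothetical. They are not taken from the slides or from any of the cited encapsulation drafts.

```python
import struct

# Hypothetical shim at the start of each UDP payload:
# 1 byte transport id, 1 byte flags, 2 bytes segment length (network byte order).
SHIM = struct.Struct("!BBH")
TRANSPORT_IDS = {"sctp": 1, "dccp": 2}  # illustrative values only


def encapsulate(transport: str, flags: int, segment: bytes) -> bytes:
    """Build the UDP payload for one transport segment."""
    return SHIM.pack(TRANSPORT_IDS[transport], flags, len(segment)) + segment


def decapsulate(datagram: bytes):
    """Parse a UDP payload back into (transport name, flags, segment)."""
    tid, flags, length = SHIM.unpack_from(datagram)
    segment = datagram[SHIM.size:SHIM.size + length]
    if len(segment) != length:
        raise ValueError("truncated segment")
    names = {v: k for k, v in TRANSPORT_IDS.items()}
    return names[tid], flags, segment
```

The bytes returned by `encapsulate` would be handed to `sendto` on an ordinary, unprivileged UDP socket, so no raw sockets or kernel changes are needed.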
It's happening in any case!
[Rosenberg 08]
— SCTP [Ong 00, Tuexen 07, Denis-Courmont 08]
— DCCP [Phelan 08]
— IPSEC [RFC 3947/3948], Mobile IP [RFC 3519],
Teredo [RFC 4380], …
...but the new model also has technical benefits...
Can now evolve separately:
— New transports get through firewalls, NATs, etc.
— Easily deploy new user-space transports,
interoperable with kernel transports
— Application controls negotiation among transports
— Better cooperation with NATs [UPnP, NAT
— identity/locator split, port/service names [Touch06],
security and authentication info ...?
[Figure: Host A runs an application over a kernel-space transport directly on the network protocol; Host B runs an application over a user-space transport layered on UDP and the network protocol.]
User-space transports are easy to deploy, but can't talk to kernel implementations of same transport!
(without special privileges, raw sockets, etc.)
[Figure: Host A (kernel-space transport) and Host B (user-space transport) both run their transports over a common Endpoint Protocol above the network protocol.]
Endpoint layer provides full interoperability, user-space transports require no special privileges
Many applications support multiple transports, but can't negotiate them efficiently
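[Figure: two message-sequence diagrams between Host A and Host B. “Cautious Negotiation”: A first asks “TCP or UDP?”, B answers “UDP!”, and only then does the “Hello?” / “Hello!” exchange proceed. “Shotgun Negotiation”: A sends “Hello?” over both UDP and TCP at once; one attempt is answered with “Hello!” and the other ends with RST.]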
When application controls its Endpoint Layer ports, it can combine transport negotiation with setup
[Figure: Host A sends a transport negotiation “Meta-SYN” carrying T1 SYN, T2 SYN, and T3 SYN; Host B chooses Transport 2 and replies with T2 SYN/ACK.]
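The combined negotiation can be sketched as follows. The `MetaSyn` container, the `respond` function, and the selection rule (B takes the first offer, in A's order, that it supports) are assumptions made for illustration. The slides show only that B picks one of the offered transports and answers with that transport's SYN/ACK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetaSyn:
    # ((transport_name, syn_bytes), ...) in the initiator's preference order
    offers: tuple


def respond(meta_syn: MetaSyn, supported: dict):
    """Responder side: pick one offered transport and build its SYN/ACK.

    `supported` maps a transport name to a handler taking the SYN bytes and
    returning SYN/ACK bytes. Returns (name, syn_ack), or None if no offered
    transport is supported.
    """
    for name, syn in meta_syn.offers:
        handler = supported.get(name)
        if handler is not None:
            return name, handler(syn)
    return None
```

Because every candidate SYN travels in one message, the chosen transport is set up in a single round trip, without a separate "which transport?" exchange and without parallel connection attempts.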
“Next-Generation Endpoint Layer” could:
— Use same port space, fall back on UDP transparently
— Port names [Touch 06], user/service names, auth info, ...?
— NATs could propagate listener advertisements upstream,
translate inbound connections as policy permits
— Enable cleaner solutions to “NAT signaling” mess?
[UPnP, NAT
Transport includes end-to-end congestion control
— regulates flow transmission rate to network capacity
But one E2E path may cross many...
— different network technologies
— different administrative domains
Can't tune performance, fairness in one domain w/o affecting other domains, E2E semantics [RFC3515]
Factor flow regulation into underlying Flow Layer
[Figure: layer stack, top to bottom: Transport Layer (Transport Semantics, Reliability), Flow Layer (Flow Performance Regulation), Endpoint Layer (Endpoint Naming), Network Layer.]
Can split E2E flow into separate CC segments
— Specialize CC algorithm to network technology
— Specialize CC algorithm within admin domain
… without interfering with E2E transport semantics!
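A toy model may help make the segment idea concrete. Each segment runs its own AIMD loop against its own bottleneck, with its own parameters, and the end-to-end rate is bounded by the slowest segment. The `Segment` class, the capacities, and the AIMD parameters are illustrative assumptions. The model ignores middlebox buffering and any back-pressure between segments, which a real design would need.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    name: str
    capacity: float        # bottleneck capacity of this segment (arbitrary units)
    increase: float = 1.0  # additive increase per round
    decrease: float = 0.5  # multiplicative decrease on congestion
    rate: float = 1.0      # current sending rate on this segment

    def step(self) -> None:
        """One AIMD round driven only by this segment's own congestion signal."""
        if self.rate > self.capacity:
            self.rate *= self.decrease
        else:
            self.rate += self.increase


def simulate(segments, rounds: int) -> float:
    """Run every segment's control loop independently; return the E2E bound."""
    for _ in range(rounds):
        for seg in segments:
            seg.step()
    return min(seg.rate for seg in segments)
```

For example, a satellite segment could be given a gentler `decrease` than a WiFi segment without changing anything the end-to-end transport sees.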
[Figure: Host A and Host B each run Application, Transport, Flow, Endpoint, and Network layers; two Flow Middleboxes (Flow, Endpoint, and Network layers only) divide the path into Segment 1 (WiFi LAN), Segment 2 (Satellite), and Segment 3 (Internet Core).]
(1) Last-mile proxies for wireless/mobile links
[Figure: Host, Flow MidB, Flow MidB, Host across an Ad Hoc Wireless Network, the Wired Internet, and a Mobile Wireless Link. Congestion control per segment: Ad Hoc Wireless Congestion Control [WTCP, ATCP, ...], TCP-friendly Congestion Control [Reno, TFRC, ...], Mobility-Aware Congestion Control [M-TCP, ELFN, ...].]
(2) Lossy Satellite or Long-Distance Wireless Links
[Figure: Host, Flow MidB, Flow MidB, Host across a LAN, the long-distance link, and a LAN. TCP-friendly CC [Reno, TFRC, ...] on each LAN; Specialized/High-Performance CC [HS-TCP, Scalable TCP, BIC-TCP, ...] between the Flow MidBs.]
(3) Inter-Site WAN Links in Corporate Networks
[Figure: Host, Flow MidB, Flow MidB, Host across Site 1 LAN, a Reserved Bandwidth WAN Link, and Site 2 LAN. TCP-friendly or Locally Configured Congestion Control on each site LAN; Explicit Congestion Control [XCP, manually configured max rate, ...] on the WAN link.]
[Figure: a Source Host and a Target Host (App, Net) connected through routers and a Flow Middlebox with Receive and Transmit Buffers, forming Congestion Control Loop 1 and Congestion Control Loop 2, each with its own Feedback (ACKs, etc.). Sequence: (1) Link Bottleneck, (2) Queue Fills, (3) “Packets Dropped!”, (4) “Slow Down!”, (5) Queue Fills, (6) “Packets Dropped!”, (7) “Slow Down!”]
Incrementally deploy performance enhancements
— multihoming [RFC 4960], multipath [Lee 01],
dispersion [Gustafsson 97], aggregation [Seshan 97], ...
… without affecting E2E transport semantics!
[Figure: Host A and Host B each run Application, Transport, and Endpoint Protocols. Multiple Flow Protocol instances run either between the hosts (end-to-end multipath) or between Flow Middleboxes along the path (per-segment multipath).]
[Figure: Hosts A1 and A2 communicate with Hosts B1 and B2 (each running Application, Transport, Flow, and Endpoint Protocols) through two Flow Middleboxes; the individual flows are carried as one Aggregate Flow between the middleboxes across a Shared Access Network.]
— Efficient traffic measurement, management
— Fairness at “macro-flow” granularity
Give customers equal shares of upstream BW independent of # connections per customer
[Figure: two Home Networks, each with Hosts behind ISP-controlled CPE with flow aggregation, connect to a Flow Aggregation Middlebox in the ISP Network and on to Upstream Providers. Per-bundle CC gives an FTP User and a BitTorrent User 1:1 BW sharing.]
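The difference between per-connection and per-customer ("macro-flow") fairness can be shown with a little arithmetic. The connection counts (one for the FTP user, nine for the BitTorrent user), the capacity of 10 units, and the function names are assumptions chosen for illustration.

```python
from collections import Counter


def per_connection_share(connections, capacity: float) -> dict:
    """Idealized per-flow fairness: every connection gets an equal slice.

    `connections` lists one customer id per open connection.
    """
    counts = Counter(connections)
    total = len(connections)
    return {cust: capacity * n / total for cust, n in counts.items()}


def per_customer_share(connections, capacity: float) -> dict:
    """Macro-flow fairness: each customer's bundle gets an equal slice."""
    customers = set(connections)
    return {cust: capacity / len(customers) for cust in customers}
```

With `["ftp"] + ["bt"] * 9` and a capacity of 10, per-connection sharing gives the FTP user 1 unit and the BitTorrent user 9, while per-customer sharing gives each of them 5.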
Two likely “starting points” already exist:
— Congestion Manager [Balakrishnan99]
— DCCP [Kohler06]
(just stop thinking of it as a “transport”)
— Support for flow middleboxes, path segmenting
— Interfaces between (new) higher & lower layers
Contains “what's left”:
— Datagrams, streams, multi-streams, …
— “Hard” acknowledgment, retransmission
— Receiver-directed flow control
— Stream prioritization
— ...
— Can traverse NATs & firewalls
— Can deploy interoperably in kernel or user space
— Apps can negotiate efficiently among transports
— Can specialize to different network types
— Can deploy/manage within administrative domains
Promising architecture (we think), but lots of details to work out
— Functionality within each layer
— Interfaces between each layer
— Application-visible API changes
Big, open-ended design space
— We are starting to explore, but
would love to collaborate
— We are interested in learning about
Transport evolution is stuck
— Endpoint naming/routing into separate Endpoint Layer
— Flow regulation into separate Flow Layer
— Leave semantic abstractions in Transport Layer
=> increase
=> decrease
focus on communication performance
— Precisely the role for which the e2e principle
justifies in-network mechanisms
related soft state
— End-to-end fate-sharing is thus preserved