Structured Streams: A New Transport Abstraction (Bryan Ford)



SLIDE 1

Structured Streams: A New Transport Abstraction

Bryan Ford

Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology ACM SIGCOMM, August 30, 2007 http://pdos.csail.mit.edu/uia/sst/

SLIDE 2

Current Transport Abstractions

Streams

– Extended lifetime
– In-order delivery

Examples:

– TCP
– SCTP

Datagrams

– Ephemeral lifetime
– Independent delivery

Examples:

– UDP
– RDP
– DCCP

SLIDE 3

Simplistic Overview

The Problem:

  • Streams don't quite match applications' needs
  • Datagrams make the application do everything

The Solution:

  • Structured Streams: like streams, only better
SLIDE 4

How Applications Use TCP

Natural approach: streams as transactions or application data units (ADUs) [Clark/Tennenhouse]

Example: HTTP/1.0

[Figure: a web client and a web server exchange each "GET" request and its "200 OK <...>" response over a separate TCP stream]

SLIDE 5

TCP Streams as Transactions/ADUs

Advantages:

– Reliability, ordering within each ADU
– Independence, parallelism between ADUs

☞ Application-Layer Framing [Clark/Tennenhouse]

Disadvantages:

– Setup cost: 3-way handshake per stream
– Setup cost: slow start per stream
– Shutdown cost: 4-minute TIME-WAIT period
– Network cost: firewall/NAT state per stream
– Network cost: unfair congestion control behavior

SLIDE 6

How Applications Use TCP

Practical approach: streams as sessions

[Figure: three long-lived TCP streams, each carrying a whole session]

– SSH client/server: Cmd, Echo, CR, Echo, Cmd Output, ...
– POP client/server: LIST / +OK 1 <...>, RETR / +OK <...>, DELE / +OK, ...
– Web client/server: repeated GET / 200 OK <...> exchanges

SLIDE 7

TCP Streams as Sessions

Advantages:

– Stream costs amortized across many ADUs

Disadvantages:

– TCP's reliability/ordering applies across many ADUs

  • Unnecessary serialization: no parallelism between ADUs
  • Head-of-line blocking: one loss delays everything behind

⇒ TCP unusable for real-time video/voice conferencing
⇒ HTTP/1.1 made web browsers slower! [Nielsen/W3C]

– Makes applications more complicated

  • Pipelined HTTP/1.1 still not widely used after 7 years!
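To make the head-of-line blocking point concrete, here is a minimal Python sketch. It is a toy model, not TCP or SST code: it assumes each ADU fits in one packet, and the arrival times and function names are invented for illustration.

```python
def delivery_times_single_stream(arrivals):
    """In-order delivery over one stream: ADU i is handed to the
    application only once ADUs 0..i have all arrived."""
    delivered = []
    latest = 0.0
    for t in arrivals:
        latest = max(latest, t)
        delivered.append(latest)
    return delivered


def delivery_times_independent_streams(arrivals):
    """One stream per ADU: each ADU is delivered as soon as it arrives."""
    return list(arrivals)


# Hypothetical arrival times in ms for five ADUs. ADU 1 was lost once and
# only arrives after a retransmission.
arrivals = [10.0, 250.0, 30.0, 40.0, 50.0]

single = delivery_times_single_stream(arrivals)
independent = delivery_times_independent_streams(arrivals)
```

In the single-stream case every ADU behind the lost one waits for the retransmission; with independent streams only the lost ADU is delayed.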

SLIDE 8

What about Datagrams?

“Do Everything Yourself”:

– Tag & associate related ADUs
– Fragment large ADUs (> ~8KB)
– Retransmit lost datagrams (except w/ RDP)
– Perform flow control
– Perform congestion control (except w/ DCCP)

⇒ complexity, fragility, duplication of effort...

SLIDE 9

Structured Stream Transport

“Don't give up on streams; fix 'em!” Goals:

– Make streams cheap

  • Let application use one stream per ADU, efficiently

– Make streams independent

  • Preserve natural parallelism between ADUs

– Make streams easy to manage

  • Don't have to bind, pass IP address & port number, or separately authenticate each new stream

SLIDE 10

What is a Structured Stream?

Unix “fork” model for stream creation

Given parent stream s between A and B:

  • B listens on s
  • A creates child s' on s
  • B accepts s' on s

[Figure: example stream hierarchy. A web browser's top-level stream has child streams for a web page download (HTML and images) and for a multimedia plug-in's control stream; the plug-in has video and audio codec streams, whose frames are sent as ephemeral streams.]
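The three fork-model steps (listen, create, accept) can be sketched in Python as follows. This is an illustrative in-memory model only: both endpoints are collapsed into one object, and the class and method names are assumptions, not the SST library's API.

```python
class Stream:
    """Toy model of a structured stream shared by endpoints A and B."""

    def __init__(self, parent=None):
        self.parent = parent
        self.listening = False
        self.pending = []    # children created by the peer, not yet accepted
        self.children = []   # children that have been accepted

    def listen(self):
        """B declares that it will accept substreams on this stream."""
        self.listening = True

    def create_child(self):
        """A creates child stream s' on this stream."""
        if not self.listening:
            raise RuntimeError("peer is not listening on this stream")
        child = Stream(parent=self)
        self.pending.append(child)
        return child

    def accept(self):
        """B accepts the oldest pending child stream."""
        child = self.pending.pop(0)
        self.children.append(child)
        return child


s = Stream()                # parent stream s between A and B
s.listen()                  # B listens on s
child = s.create_child()    # A creates child s' on s
accepted = s.accept()       # B accepts s' on s
```

Because a child is itself a `Stream`, the same three steps can be repeated on it, which gives the hierarchy shown in the figure.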

SLIDE 11

Talk Outline

✔ Introduction to Structured Streams

  • SST Protocol Design
  • Prototype Implementation
  • Evaluation, Related Work
  • Conclusion
SLIDE 12

SST Protocol Design

SLIDE 13

SST Transport Services

Independent per stream:

– Data ordering
– Reliable delivery (optional)
– Flow control (receive window)

Shared among all streams:

– Congestion control
– Replay/hijacking protection
– Transport security (optional)

SLIDE 14

SST Organization

[Figure: protocol stack, top to bottom]

– Application Protocol
– Structured Stream Transport (SST): Stream Protocol, Channel Protocol, Negotiation Protocol
– Underlying Protocol (e.g., UDP, IP, link layer)

Abstractions labeled in the figure: Streams, Channels, Sessions

SLIDE 15

Streams, Channels, Packets

[Figure: timeline with three rows (Streams, Channels, Packets). A top-level application stream and its substreams (1, 2, 3, 1.1, 1.2, 3.1, 3.2, ...) are multiplexed onto channel 1; when channel 1 nears end of life, the streams migrate to channel 2 and are multiplexed onto it.]

SLIDE 16

SST Packet Header

[Figure: packet layout in 32-bit words]

Channel Header (8 bytes):

– Channel ID, Transmit Sequence Number (TSN)
– AckCt, Acknowledgment Sequence Number (ASN)

Stream Header (4–8 bytes):

– Local Stream Identifier (LSID)
– Type, Flags, Window
– Additional Stream Header Fields (depends on Type)

Stream Payload (variable): Application Data

Message Authentication Check (MAC)

Encrypted (optionally)

(Typical header overhead: 16 bytes + MAC)
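As a rough illustration of the 8-byte channel header, the Python sketch below packs the four named fields into two 32-bit words. The slide gives only the field names and the total size, so the bit widths used here (8-bit channel ID, 24-bit TSN, 4-bit AckCt, 24-bit ASN, 4 unused bits) and the function names are assumptions.

```python
import struct


def pack_channel_header(channel_id, tsn, ack_count, asn):
    """Pack a channel header into 8 bytes (two big-endian 32-bit words).

    Assumed layout, not confirmed by the slide:
      word 0: 8-bit channel ID | 24-bit TSN
      word 1: 4 unused bits | 4-bit AckCt | 24-bit ASN
    """
    if not 0 <= channel_id < (1 << 8):
        raise ValueError("channel_id out of range")
    if not 0 <= ack_count < (1 << 4):
        raise ValueError("ack_count out of range")
    if not (0 <= tsn < (1 << 24) and 0 <= asn < (1 << 24)):
        raise ValueError("sequence number out of range")
    word0 = (channel_id << 24) | tsn
    word1 = (ack_count << 24) | asn
    return struct.pack(">II", word0, word1)


def unpack_channel_header(data):
    """Inverse of pack_channel_header; returns (channel_id, tsn, ack_count, asn)."""
    word0, word1 = struct.unpack(">II", data[:8])
    return (word0 >> 24, word0 & 0xFFFFFF, (word1 >> 24) & 0xF, word1 & 0xFFFFFF)


header = pack_channel_header(channel_id=1, tsn=5, ack_count=2, asn=3)
fields = unpack_channel_header(header)
```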

SLIDE 17

Channel Protocol Design

  • Sequencing
  • Acknowledgment
  • Congestion Control
  • Security (see paper)
SLIDE 18

Channel Protocol: Sequencing

Every transmission gets new packet sequence #

– Including acks, retransmissions [DCCP]

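[Figure: timeline of transmissions 1–9 and their arrivals. One later transmission is labeled "(retransmit #2)": the data of lost packet 2 is resent under a new sequence number.]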
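A minimal Python sketch of this sequencing rule follows. The class and method names are hypothetical; the only behavior taken from the slide is that every transmission, including a retransmission, gets a new packet sequence number.

```python
class ChannelSender:
    """Assigns a fresh sequence number to every transmission."""

    def __init__(self):
        self.next_tsn = 1
        self.in_flight = {}   # sequence number -> payload awaiting ack

    def transmit(self, payload):
        tsn = self.next_tsn
        self.next_tsn += 1
        self.in_flight[tsn] = payload
        return tsn

    def retransmit(self, lost_tsn):
        """Resend a lost packet's payload under a new sequence number."""
        payload = self.in_flight.pop(lost_tsn)
        return self.transmit(payload)


sender = ChannelSender()
first_six = [sender.transmit(f"segment-{i}") for i in range(1, 7)]
new_tsn = sender.retransmit(2)   # packet 2 was lost; its data is sent again
```

Since sequence numbers are never reused, an acknowledgment always identifies one specific transmission, which removes the retransmission ambiguity that TCP round-trip estimation has to work around.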

SLIDE 19

Channel Protocol: Acknowledgment

  • All acknowledgments are selective [DCCP]

– No cumulative ack point as in TCP, SCTP

SLIDE 20

Channel Protocol: Acknowledgment

  • All acknowledgments are selective [DCCP]
  • Each packet acknowledges a sequence range

[Figure: sequence number space over time. Packets 1, 2, 3, 5, 6, 7 are received (packet 4 dropped); each return packet acknowledges a sequence number range.]

Packet received    Acknowledgment sent in return packet
1                  Ack 1
2                  Ack 1–2
3                  Ack 1–3
5                  Ack 5
6                  Ack 5–6
7                  Ack 5–7
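The acknowledgment ranges in this example can be reproduced with a short Python sketch. It assumes packets arrive in increasing sequence order and that each acknowledgment covers the contiguous run of received packets ending at the newest one; the function name is illustrative.

```python
def ack_ranges(received):
    """For each received sequence number, return the (first, last) range
    acknowledged in the return packet."""
    seen = set()
    acks = []
    for seq in received:
        seen.add(seq)
        first = seq
        # Extend the range backwards while the preceding packet was received.
        while first - 1 in seen:
            first -= 1
        acks.append((first, seq))
    return acks


acks = ack_ranges([1, 2, 3, 5, 6, 7])   # packet 4 was dropped
```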

SLIDE 21

Channel Protocol: Acknowledgment

  • All acknowledgments are selective [DCCP]
  • Each packet acknowledges a sequence range

– Successive ACKs usually overlap

⇒ redundancy against lost ACKs

– No variable-length SACK headers needed

⇒ all info in fixed header

SLIDE 22

Channel Protocol: Acknowledgment

  • All acknowledgments are selective [DCCP]
  • Each packet acknowledges a sequence range
  • Congestion control at channel granularity

– Many streams share congestion state

SLIDE 23

Stream Protocol Design

  • Stream Creation
  • Data Transfer
  • Best-effort Datagrams
  • Stream Shutdown/Reset (see paper)
  • Stream Migration (see paper)
SLIDE 24

Stream Protocol: Creating Streams

Goal:

Create & start sending data on new stream without round-trip handshake delay

Challenges:

1. What happens to subsequent data segments if the initial “create-stream” packet is lost?
2. Flow control: how much data may be sent before seeing the receiver's initial window update?

SLIDE 25

Stream Protocol: Creating Streams

Solution:

– All segments during 1st round-trip carry “create” info

(special segment type, parent & child stream IDs)

– Child borrows from parent stream's receive window

(“create” packets belong to parent stream for flow control)

[Figure: stream header of a "create" segment: Local Stream Identifier (LSID), Parent Stream Identifier (PSID), Type, flags P and C, Window, Byte Sequence Number (BSN), followed by the Application Payload]

SLIDE 26

Stream Protocol: Data Transfer

Regular data transfer (after 1st round-trip):

– 32-bit wraparound byte sequence numbers (BSNs)

(just like TCP)

– Unlimited stream lifetime

(just like TCP)

[Figure: stream header of a regular data segment: Local Stream Identifier (LSID), Type, flags P and C, Window, Byte Sequence Number (BSN), followed by the Application Payload]
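Wraparound sequence numbers need modular arithmetic for advancing and comparing. The Python sketch below shows the usual serial-number technique for a 32-bit space, as used for TCP sequence numbers; the helper names are illustrative, and the comparison assumes the two values are less than 2^31 apart.

```python
BSN_MODULUS = 1 << 32


def bsn_add(bsn, nbytes):
    """Advance a byte sequence number, wrapping at 2**32."""
    return (bsn + nbytes) % BSN_MODULUS


def bsn_before(a, b):
    """True if a precedes b in wraparound order.

    Valid only when a and b are less than 2**31 apart.
    """
    return a != b and ((b - a) % BSN_MODULUS) < (1 << 31)


near_wrap = 0xFFFFFFF0
after_wrap = bsn_add(near_wrap, 0x20)   # crosses the 2**32 boundary
```

This is why a stream can have unlimited lifetime with a fixed-width counter: only the relative order of nearby sequence numbers matters.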

SLIDE 27

Stream Protocol: Best-effort Datagrams

“Datagrams” are ephemeral streams

Semantically equivalent to:

1. Create child stream
2. Send data on child stream
3. Close child stream

...but without buffering data for retransmission

(like setting a short SO_LINGER timeout)

SLIDE 28

Stream Protocol: Best-effort Datagrams

When datagram is small:

– Stateless best-effort delivery optimization

(avoids need to assign stream identifier to child)

[Figure: stream header of a stateless datagram: Parent Stream Identifier (PSID), Type, flags F and L, Window, followed by the Application Payload]

Flags:

– F: First Fragment
– L: Last Fragment

SLIDE 29

Stream Protocol: Best-effort Datagrams

When datagram is small:

– Stateless best-effort delivery optimization

When datagram is large:

– Fall back to delivery using regular child stream

Makes no difference to application; datagrams of any size “just work”!
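The small/large decision can be sketched in Python as below. Everything here is hypothetical scaffolding: the size threshold, the `RecordingStream` stand-in, and the method names are assumptions made so the example runs, not the SST library's API.

```python
MAX_STATELESS_PAYLOAD = 1200   # hypothetical threshold in bytes


class RecordingStream:
    """Stand-in for a real stream; it only records the calls made on it."""

    def __init__(self, log=None):
        self.log = [] if log is None else log

    def send_stateless(self, data):
        self.log.append(("stateless", len(data)))

    def create_child(self):
        self.log.append(("create_child",))
        return RecordingStream(self.log)

    def write(self, data, buffer_for_retransmit=True):
        self.log.append(("write", len(data), buffer_for_retransmit))

    def close(self):
        self.log.append(("close",))


def send_datagram(parent, data, max_stateless=MAX_STATELESS_PAYLOAD):
    """Send a best-effort datagram on `parent`, whatever its size."""
    if len(data) <= max_stateless:
        # Small: stateless optimization, no stream identifier for a child.
        parent.send_stateless(data)
        return "stateless"
    # Large: create child, send, close, without retransmission buffering.
    child = parent.create_child()
    child.write(data, buffer_for_retransmit=False)
    child.close()
    return "child-stream"


small_parent, large_parent = RecordingStream(), RecordingStream()
small_path = send_datagram(small_parent, b"x" * 100)
large_path = send_datagram(large_parent, b"x" * 5000)
```

The caller uses the same `send_datagram` call in both cases, which mirrors the slide's point that the choice is invisible to the application.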

SLIDE 30

Implementation & Evaluation

SLIDE 31

Current Prototype

User-space library in C++

  • Application-linkable ⇒ simple deployment
  • Runs atop UDP ⇒ NAT/firewall compatibility
  • ~13,000 lines; ~4,400 semicolons

(including crypto security & key agreement)

Available at: http://pdos.csail.mit.edu/uia/sst/

SLIDE 32

Performance

Transfer performance vs native kernel TCP

– Minimal slowdown at DSL, WiFi LAN speeds

TCP-friendliness

– Congestion control fair to TCP within ± 2%

Transaction microbenchmark: SST vs TCP, UDP

Web browsing workloads

– Performance: HTTP on SST vs TCP
– Responsiveness: request prioritization

SLIDE 33

Transaction Microbenchmark

SLIDE 34

Web Browsing Workloads

Performance of transactional HTTP/1.0 on SST:

  • Much faster than HTTP/1.0 on TCP
  • Faster than persistent HTTP/1.1 on TCP [most browsers]
  • As fast as pipelined HTTP/1.1 on TCP [Opera browser]

SLIDE 35

Web Browsing Workloads

HTTP/1.0 over SST can be more responsive

– No unnecessary request serialization
– Simple out-of-band communication via substreams

⇨ Easy to dynamically prioritize requests

(Demo)

SLIDE 36

Related Work

  • Application-Layer Framing [Clark/Tennenhouse]
  • Transports: TCP, RDP, VMTP, SCTP, DCCP
  • Multiplexers: SSL, SSH, MUX, BXXP/BEEP
  • T/TCP: TCP for Transactions [Braden]
  • TCP congestion state sharing [Touch], Congestion Manager [Balakrishnan]

  • Transport-layer migration support [Snoeren]
  • Network-layer prioritization for QoS [...many...]
SLIDE 37

Summary

A New Transport Mindset

TCP: “think serial” SST: “think parallel”

SLIDE 38

Future Work

Lots of stuff to do

– Fill holes in spec, code
– Efficient implementation(s)

Protocol improvements/extensions

– Fat headers for high-BDP paths
– Chunk bundling
– “Widening the endpoints”: multihoming, etc.

SLIDE 39

High Bandwidth-Delay Product Paths

w/ Regular Headers

– ~2^22 packets
– ~2^15 new streams
– ~2^30 stream bytes

...per round-trip

w/ Fat Headers

– ~2^46 channel packets
– ~2^23 new streams
– ~2^46 stream bytes

...per round-trip
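To get a feel for these per-round-trip limits, the Python sketch below converts them into throughput ceilings. It reads the figures as powers of two (2^22 packets, 2^30 and 2^46 stream bytes), which is an interpretation of the extracted slide text; the 100 ms round-trip time and 1500-byte packet size are illustrative assumptions.

```python
def max_rate_bps(units_per_rtt, bytes_per_unit, rtt_seconds):
    """Throughput ceiling in bits/s when at most `units_per_rtt` units of
    `bytes_per_unit` bytes can be outstanding per round trip."""
    return units_per_rtt * bytes_per_unit * 8 / rtt_seconds


RTT_SECONDS = 0.1      # assumed 100 ms round-trip time
PACKET_BYTES = 1500    # assumed packet size

# Regular headers: packet limit per channel and byte limit per stream.
regular_channel_limit = max_rate_bps(2**22, PACKET_BYTES, RTT_SECONDS)
regular_stream_limit = max_rate_bps(2**30, 1, RTT_SECONDS)

# Fat headers: byte limit per stream.
fat_stream_limit = max_rate_bps(2**46, 1, RTT_SECONDS)
```

Under these assumptions a single stream with regular headers tops out at roughly 86 Gbit/s, and the ceiling scales inversely with the round-trip time, so longer or faster paths are where fat headers would matter.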

[Figure: regular vs. fat header layouts]

– Regular header fields: Channel ID, Transmit Sequence Number (TSN), AckCt, Acknowledgment Sequence Number (ASN), Local Stream Identifier (LSID), Type, Flags, Window, Additional Stream Header Fields (depends on Type)
– Fat header fields: Channel ID, TSN high bits, TSN low bits, AckCt, ASN high bits, ASN low bits, Local Stream Identifier (LSID), Parent Stream Identifier (PSID), Type, Flags, Window, BSN high bits, Byte Sequence Number (BSN) low bits, Chunk Size

SLIDE 40

Chunk Bundling

Bundle segments from multiple streams into one channel packet

– e.g., VoIP trunking, multiplayer gaming, etc.

[Figure: one channel packet: Channel Header | Stream 1 Header | Stream 1 Data | Stream 2 Header | Stream 2 Data | ...]
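One simple way to bundle is greedy packing, sketched in Python below. This is an assumed strategy for illustration, not the proposed SST mechanism: it ignores per-chunk header overhead, and the function name and payload limit are hypothetical.

```python
def bundle(segments, max_payload):
    """Greedily pack (stream_id, data) segments into channel packets.

    Each packet is a list of segments whose combined data length does not
    exceed max_payload. Header overhead is ignored for simplicity.
    """
    packets = []
    current, used = [], 0
    for stream_id, data in segments:
        if len(data) > max_payload:
            raise ValueError("segment does not fit in one packet")
        if current and used + len(data) > max_payload:
            packets.append(current)
            current, used = [], 0
        current.append((stream_id, data))
        used += len(data)
    if current:
        packets.append(current)
    return packets


# Three small segments from different streams, e.g. voice frames.
segments = [(1, b"a" * 20), (2, b"b" * 20), (3, b"c" * 20)]
packets = bundle(segments, max_payload=50)
```

With many small segments, as in VoIP trunking, this amortizes one channel header over several streams' data.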

SLIDE 41

Widening the Endpoints

Multihoming: multiple interfaces, multiple paths

– Redundancy: fail-over across paths [SCTP]

[Figure: two hosts, each with two NICs, connected by Path 1 and Path 2]

SLIDE 42

Widening the Endpoints

Multihoming: multiple paths per logical host

– Redundancy: fail-over across paths [SCTP]
– Parallelism: sharing load across paths

[Figure: two hosts, each with three NICs, connected by Paths 1, 2, and 3]

SLIDE 43

Widening the Endpoints

Multihoming: multiple paths per logical host

– Redundancy: fail-over across paths [SCTP]
– Parallelism: sharing load across paths
– Scaling: clustered/distributed implementations?

[Figure: several clients connecting to a server farm made up of several servers]

SLIDE 44

Widening the Endpoints

Facilitated by SST's design:

– Channel represents physical path
– Stream represents logical transaction/activity
– Many-to-many relationship

[Figure: several streams mapped many-to-many onto several channels, each channel running over its own path]

SLIDE 45

Widening the Endpoints

Facilitated by SST's design:

– Channel represents physical path
– Stream represents logical transaction/activity
– Many-to-many relationship

Applications use many streams, not just one
⇒ fewer inherent concurrency bottlenecks

Congestion control is per-channel/path
⇒ no confusion from varying path delays

SLIDE 46

Conclusion

SST enables applications to use streams as:

– Sessions (as in legacy TCP apps), or
– ADUs/Transactions (as in HTTP/1.0), or
– Datagrams (as in VoIP, RPC over UDP)

...without:

– TCP's per-stream costs, unnecessary serialization
– UDP's datagram size limits

http://pdos.csail.mit.edu/uia/sst/

SLIDE 47

“Can't HTTP/1.1 over TCP do this?”

Answer: “Sort of, if you work really hard.”

1. Enable HTTP/1.1 pipelining

  • Most browsers still don't because servers get it wrong!

2. Fragment large downloads via Range requests

  • Pummel server with many small HTTP requests
  • Risk atomicity issues with dynamic content

3. Track round-trip time, bandwidth in application

  • Try to keep pipeline full without adding extra delay

But:

Still get head-of-line blocking on TCP segment loss!

SLIDE 48

Comparing SST to SCTP

SCTP:

  • No dynamic stream creation/destruction
  • No per-stream flow control (just per session)
  • Best-effort datagrams limited in size

SST:

  • No multihoming/failover (yet)

...but channel/stream split should facilitate

SLIDE 49

Comparing SST to DCCP

DCCP:

  • No reliability, ordering, flow control
  • No association between packets
  • No cryptographic security

SST:

  • No congestion control negotiation (yet)
SLIDE 50

Channel Protocol: Security

Design based on IPsec

  • Cryptographic security mode:

– Encrypt-then-MAC + replay protection [IPsec]

  • TCP-grade security mode:

– No encryption
– MAC = 32-bit checksum + 32-bit “key”

  • “Key” depends on system time [Tomlinson], secret data [Bellovin]
  • Stronger protection than TCP: “validity window” size = 1