Fastpass: A Centralized "Zero-Queue" Datacenter Network (PowerPoint presentation)
SLIDE 1

Fastpass: A Centralized “Zero-Queue” Datacenter Network

Jonathan Perry, Amy Ousterhout, Hari Balakrishnan, Devavrat Shah, Hans Fugal

SLIDE 2
(Alizadeh et al., "DCTCP", SIGCOMM '10)

Ideal datacenter network properties

No current design satisfies all these properties simultaneously

Burst control, low tail latency, multiple objectives

Prior approaches each address only a subset: Scaling Memcache at Facebook, Fine-grained TCP retransmissions, Datacenter TDMA, Tail at scale, pFabric, PDQ, DCTCP, D3, Orchestra, EyeQ, Seawall, Oktopus, Hedera, VL2, Mordia, SWAN, MATE, DARD

SLIDE 3

Fastpass goals

Is it possible to design a network that provides

  • 1. Zero network queues
  • 2. High Utilization
  • 3. Multiple app and user objectives

Burst control, low tail latency, multiple objectives

SLIDE 4

Centralized arbiter schedules and assigns paths to all packets

Concerns with centralization: latency, scaling, fault tolerance

Chuck Norris doesn't wait in queues.

He schedules every packet in the datacenter!

SLIDE 5

Example: Packet from A to B

  • 5 µs: A → Arbiter: "A has 1 packet for B"
  • 1-20 µs: Arbiter performs timeslot allocation and path selection
  • 15 µs: Arbiter → A: "@t=107: A → B through R1"
  • A → B: data is sent, with no queuing

(Diagram: Arbiter, endpoints A and B, switches R1 and R2)
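To make the exchange concrete, here is a minimal Python sketch of the two messages in this example; the class names and fields are illustrative assumptions, not the actual FCP wire format.

    # Hypothetical message shapes for the request/allocation exchange above.
    from dataclasses import dataclass

    @dataclass
    class Request:          # endpoint -> arbiter (~5 µs in the example)
        src: str
        dst: str
        n_packets: int

    @dataclass
    class Allocation:       # arbiter -> endpoint (~15 µs in the example)
        src: str
        dst: str
        timeslot: int       # when to send
        path: str           # which path to use

    req = Request(src="A", dst="B", n_packets=1)                   # "A has 1 packet for B"
    alloc = Allocation(src="A", dst="B", timeslot=107, path="R1")  # "@t=107: A → B through R1"
    # At timeslot 107, A sends directly to B along R1; the packet never waits in a switch queue.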

SLIDE 6

Scheduling and selecting paths

Arbiter treats network as a big switch

Timeslot = 1.2 µs
Step 1: Timeslot allocation (choose a matching)
Step 2: Path selection (map the matching onto paths)
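As a rough illustration of Step 2, the sketch below greedily assigns each matched source-destination pair a core switch whose uplink and downlink are still unused in the timeslot. The paper casts path selection as an edge-coloring problem; this greedy version, and the tor_of/cores inputs, are simplifying assumptions.

    # Map one timeslot's matching onto paths so no core link is used twice.
    def select_paths(matching, tor_of, cores):
        """matching: list of (src, dst) pairs granted in this timeslot.
        tor_of: endpoint -> its ToR switch; cores: list of core switches."""
        up_used, down_used = set(), set()   # (tor, core) uplinks / (core, tor) downlinks used
        paths = {}
        for src, dst in matching:
            s_tor, d_tor = tor_of[src], tor_of[dst]
            for core in cores:
                if (s_tor, core) not in up_used and (core, d_tor) not in down_used:
                    up_used.add((s_tor, core))
                    down_used.add((core, d_tor))
                    paths[(src, dst)] = (s_tor, core, d_tor)
                    break
        return paths

A greedy pass can fail to place a pair that an edge-coloring assignment would handle, which is one reason the coloring formulation is used.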

SLIDE 7

System structure

(System structure: each endpoint runs an FCP client in its host networking stack, next to the NIC; the arbiter runs an FCP server, timeslot allocation, and path selection. Endpoints report destination and size; the arbiter returns timeslots and paths.)

Challenges: latency, scaling, fault tolerance

SLIDE 8

Timeslot allocation = maximal matching

~10ns per demand

t=100 pending demands:

src → dst   pkts
1 → 2       3
3 → 1       3
7 → 4       1
5 → 8       2
4 → 3       4
1 → 3       1
8 → 6       3
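A minimal sketch of the allocation step, assuming the arbiter grants at most one MTU-sized packet per source and per destination in a timeslot; the real allocator is a heavily optimized implementation (~10 ns per demand), so this Python version only illustrates the greedy maximal-matching logic.

    # Greedy maximal matching over one timeslot: a demand is granted if both its
    # source and destination are still free in this timeslot.
    def allocate_timeslot(demands):
        """demands: list of [src, dst, remaining_pkts]; mutates remaining_pkts."""
        src_busy, dst_busy = set(), set()
        allocated = []
        for d in demands:
            src, dst, pkts = d
            if pkts > 0 and src not in src_busy and dst not in dst_busy:
                src_busy.add(src)
                dst_busy.add(dst)
                d[2] -= 1                     # one MTU-sized packet this timeslot
                allocated.append((src, dst))
        return allocated  # maximal: every skipped demand conflicts with a granted one

    demands = [[1, 2, 3], [3, 1, 3], [7, 4, 1], [5, 8, 2], [4, 3, 4], [1, 3, 1], [8, 6, 3]]
    print(allocate_timeslot(demands))
    # [(1, 2), (3, 1), (7, 4), (5, 8), (4, 3), (8, 6)]; 1 → 3 waits for a later timeslot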

SLIDE 9

How to support different objectives?

Order matters!

t=100 pending demands:

src → dst   pkts
8 → 5       4
6 → 2       2
6 → 7       5
1 → 4       1
6 → 1       5
2 → 5       6
1 → 7       3
4 → 3       2
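One way to read "order matters": keep the greedy allocator unchanged and only change the order in which demands are considered each timeslot. The two orderings below are illustrative policy sketches (a fairness-flavored order and a shortest-remaining-first order), not the exact policies used in production.

    # Reordering demands before allocation changes which objective the arbiter serves.
    def order_for_fairness(demands, last_granted):
        # Serve the demand that was granted longest ago first (fairness-flavored).
        return sorted(demands, key=lambda d: last_granted.get((d[0], d[1]), -1))

    def order_for_short_flows(demands):
        # Serve demands with the fewest remaining packets first (favors completion time).
        return sorted(demands, key=lambda d: d[2])

    # Example: allocate_timeslot(order_for_short_flows(demands))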

SLIDE 10

How to scale timeslot allocation?

Timeslot allocation can be pipelined across cores

(Diagram: Core 1, Core 2, Core 3, Core 4 pipeline the allocation of timeslots t=100, t=101, t=102, t=103.)

Pending demands (src → dst: pkts): 2 → 4: 7, 9 → 12: 6, 1 → 6: 6, 5 → 9: 8, 11 → 7: 8, 1 → 11: 8

2211.8 Gbits/s on 8 cores
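A sketch of the pipelining idea, assuming core i allocates timeslot t+i and hands still-unsatisfied demands to the next core; the real arbiter streams demands between cores rather than processing them in whole-timeslot batches, so this only approximates the structure.

    # Each core allocates one timeslot, reusing allocate_timeslot() from Slide 8;
    # demands left unsatisfied flow on to the core handling the next timeslot.
    def pipelined_allocation(demands, first_timeslot, n_cores):
        schedule = {}
        remaining = [list(d) for d in demands]       # copy so the input is untouched
        for i in range(n_cores):                     # core i works on timeslot t+i
            t = first_timeslot + i
            schedule[t] = allocate_timeslot(remaining)
            remaining = [d for d in remaining if d[2] > 0]
        return schedule

    # pipelined_allocation([[2, 4, 7], [9, 12, 6], [1, 6, 6]], first_timeslot=100, n_cores=4)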

SLIDE 11

Are maximal matchings good matchings?

Compare a maximal-matching scheduler on a network of capacity 2C with an optimal scheduler on a network of capacity C:

Dai-Prabhakar '00: the maximal-matching scheduler has finite average latency whenever the optimal scheduler does.

Our theorem: the maximal-matching scheduler's average latency is at most 2× the optimal scheduler's average latency.
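Restated in symbols (the notation is ours, not the paper's), with the maximal-matching scheduler at capacity 2C and the optimal scheduler at capacity C:

    % L_MM(2C): average latency under maximal matching at capacity 2C
    % L_OPT(C): average latency under an optimal scheduler at capacity C
    \text{Dai-Prabhakar '00:}\quad L_{\mathrm{MM}}(2C) < \infty \ \text{ whenever }\ L_{\mathrm{OPT}}(C) < \infty
    \text{Our theorem:}\quad L_{\mathrm{MM}}(2C) \;\le\; 2\, L_{\mathrm{OPT}}(C)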

SLIDE 12

System structure

(System structure, revisited: FCP client in the host networking stack at each endpoint, next to the NIC; FCP server, timeslot allocation, and path selection at the arbiter. Endpoints report destination and size; the arbiter returns timeslots and paths.)

Challenges: latency, scaling, fault tolerance

SLIDE 13

Fault-tolerance

Arbiter failures: hot backups, with TCP as a last resort
Switch failures, packet loss to the arbiter

SLIDE 14

Experimental results

Timeslot allocation: 2.21 Terabits/s with 8 cores
Path selection: >5 Terabits/s with 10 cores

Facebook experiments:

Switch queue length and RTT; convergence to network share; reducing retransmissions in production

SLIDE 15

Queues & RTT

(Plot: density of TCP ping times in milliseconds, fastpass vs. baseline. Baseline: 3.56 ms; Fastpass: 0.23 ms.)

SLIDE 16

Convergence to network share

(Plot: per-connection throughput in Gbits/s over time in seconds for senders 1-5, baseline vs. fastpass. Fastpass: 5200× lower standard deviation of per-connection throughput.)

SLIDE 17

Reducing retransmissions in production

(Plot: median packet retransmissions per node per second over time, during baseline and fastpass phases of the experiment.)

Each server: ~50k QPS

SLIDE 18

Benefits

A: "Now I can see pictures of other's people's food and children so much more quickly...can't wait..>.>" B: "You forgot about [...] cats. I will say, faster pics of cats is probably worth some merit."

SLIDE 19

Benefits

  • Low user latency
  • Stronger network semantics: no packet drops, predictable latency, deadlines, SLAs
  • Developer productivity: less dealing with bursts, tail latency, and hotspots; simplifies building complex systems
  • Lower infrastructure cost: less over-provisioning

SLIDE 20

Fastpass enables new network designs

Traditional: flow control, congestion control, updating routing tables, scheduling & queue management, packet forwarding
SDN: flow control, congestion control, updating routing tables, scheduling & queue management, packet forwarding
Fastpass: flow control, congestion control, per-packet path selection, scheduling & queue management, packet forwarding
(In each design these functions are split among the endpoint, a centralized controller, and the switch.)

Fastpass centralizes control at packet granularity; switches can become even simpler and faster.

SLIDE 21

Conclusion

Zero network queues, high utilization, multiple app and user objectives

Pushes centralization to a logical extreme

Opens up new possibilities for even faster networks

Code (MIT licensed): http://fastpass.mit.edu