6.888 Lecture 9: Wireless/Optical Datacenters



SLIDE 1

6.888 Lecture 9: Wireless/Optical Datacenters

Mohammad Alizadeh and Dinesh Bharadia

Spring 2016

Many thanks to George Porter (UCSD) and Vyas Sekar (Berkeley)

SLIDE 2

Datacenter Fabrics

[Figure: leaf-spine fabric; the leaf layer provides 1000s of server ports and the spine layer interconnects the leaves]

Scale-out designs (VL2, Fat-tree)

– Little to no oversubscription
– Cost, power, complexity

SLIDE 3

[Figure: Facebook data center fabric. Source: https://code.facebook.com/posts/360346274145943/introducing-data-center-fabric-the-next-generation-facebook-data-center-network/]

Multiple switching layers (Why?)

SLIDE 4

Building Block: Merchant Silicon Switching Chips

[Figures: Facebook Wedge; Facebook 6-pack; switch ASIC. Images courtesy of Facebook]

  • Limited radix: 16 x 40 Gbps
  • High power: 17 W/port
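The limited radix is one reason a fabric needs several switching layers. As a rough illustration, the sketch below assumes a standard three-tier k-ary fat-tree built entirely from identical k-port switches (the slides name Fat-tree only as an example design, so the topology and the function name `fat_tree_capacity` are assumptions made for this example).

```python
def fat_tree_capacity(k):
    """Size of a three-tier k-ary fat-tree built from k-port switches.

    Assumes the textbook construction: k pods, each with k/2 edge and
    k/2 aggregation switches, plus (k/2)^2 core switches.
    """
    if k < 2 or k % 2:
        raise ValueError("k must be a positive even port count")
    hosts = k ** 3 // 4              # k pods * (k/2 edge switches) * (k/2 hosts each)
    edge = k * (k // 2)              # k pods * k/2 edge switches
    aggregation = k * (k // 2)       # k pods * k/2 aggregation switches
    core = (k // 2) ** 2
    return {"hosts": hosts, "switches": edge + aggregation + core}


# A radix-16 chip, as on the slide, supports about a thousand hosts at three tiers.
print(fat_tree_capacity(16))  # {'hosts': 1024, 'switches': 320}
```

Under these assumptions, reaching many thousands of server ports with radix-16 chips requires either more tiers or many chips per logical switch, which is where the switch, fiber, and transceiver counts grow.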

SLIDE 5

[Figure: Facebook data center fabric. Source: https://code.facebook.com/posts/360346274145943/introducing-data-center-fabric-the-next-generation-facebook-data-center-network/]

Long cables (fiber)

SLIDE 6

Scale-out packet-switch fabrics

[Figure: N-layer fabric of packet switches S0,0 … SN,k/2 above hosts Hi; the legend distinguishes core transceivers from edge transceivers]

  • Large number of switches, fibers, optical transceivers
  • Power hungry
  • Hard to expand

SLIDE 7

Beyond Packet-Switched DC Fabrics

Optical circuit switching [Helios, c-Through, Mordia, REACToR, …]

[Figure: hosts Hi attached to switches S0,0 … S0,k, which connect through edge transceivers to both a k x k optical circuit switch (OCS) and a packet switch (Pkt)]

Steerable Links

– 60 GHz RF [Flyways, MirrorMirror] (fig. from presentation by Xia Zhou)
– Free-space Optics [FireFly]

SLIDE 8

Integrating Microsecond Circuit Switching into the Data Center

Slides based on presentation by George Porter (UCSD)

SLIDE 9

Key idea: Hybrid Circuit/Packet Networks

[Figure: hosts Hi attached to switches S0,0 … S0,k, which connect through edge transceivers to both a k x k OCS and a packet switch (Pkt)]

Why build a hybrid switch?

SLIDE 10

Circuit vs. Packet Switching

                Electrical Packet        Optical Circuit
Cost            $500/port                $500/port
Rate            10 Gb/s fixed rate       Rate free (10/40/100/400/+)
Power           12 W/port                240 mW/port
Transceivers    Transceivers (OEO)       No transceivers
Buffering       Buffering                No buffering
Switching       Per-packet switching     Duty cycle overhead
Control         In-band control          Out-of-band control

Observation: Correlated traffic → Circuits

SLIDE 11

Disadvantages of Circuits

Despite their advantages, circuits present a different service model:

– Point-to-point connectivity
– Must wait for circuit to be assigned
– Circuit "down" while being reconfigured

Consequences:

– affects throughput, latency
– affects network duty cycle; overall efficiency

[Figure: hosts Hi attached to switches S0,0 … S0,k, which connect through edge transceivers to both a k x k OCS and a packet switch (Pkt)]

SLIDE 12

Stability Increases with Aggregation

Inter-Thread → Inter-Process → Inter-Server → Inter-Rack → Inter-Pod → Inter-Data Center

Where is the Sweet Spot?

  1. Enough Stability
  2. Enough Traffic

SLIDE 13

Mordia OCS model

[Figure: switches S0 … Sk attached to a k x k OCS, modeled as a bipartite graph from inputs S0 … Sk to outputs S0 … Sk]

  • Directly connects inputs to outputs
  • Reconfiguration time: 10 µs

– "Night" time (Tn): no traffic during reconfiguration
– "Day" time (Td): circuits/mapping established

  • Duty cycle: Td / (Td + Tn)
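The duty cycle formula Td / (Td + Tn) can be made concrete with a small calculation. This is a minimal sketch; the function names and the 90 µs "day" value are illustrative assumptions, while the 10 µs "night" is the reconfiguration time quoted on the slide.

```python
def duty_cycle(day_us, night_us):
    """Fraction of time circuits carry traffic: Td / (Td + Tn)."""
    return day_us / (day_us + night_us)


def min_day_for_duty_cycle(target, night_us):
    """Shortest day length Td that reaches the target duty cycle.

    Solves target = Td / (Td + Tn) for Td.
    """
    if not 0 < target < 1:
        raise ValueError("target duty cycle must be strictly between 0 and 1")
    return target * night_us / (1 - target)


# With a 10 us night, a 90 us day gives a 90% duty cycle.
print(duty_cycle(90, 10))  # 0.9
# Conversely, a 90% target with a 10 us night needs a day of about 90 us.
print(min_day_for_duty_cycle(0.9, 10))
```

The second function shows the trade-off: a higher target duty cycle forces longer days, so each circuit configuration must be held longer before the next reconfiguration.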

SLIDE 14

Previous approaches: Hotspot Scheduling

Step 1. Observe network traffic (traffic matrix TM)
Step 2. Compute schedule
Step 3. Reconfigure the OCS

[Figure: timeline of repeated rounds: 1. Observe, 2. Compute, 3. Reconfig]

Assign circuits to elephants
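Assigning circuits to elephants can be viewed as choosing the single circuit configuration (a matching of inputs to outputs) that carries the most observed demand. The sketch below is a toy illustration under that assumption: it brute-forces all permutations, which only works for very small switches, and the function name and demand values are made up for the example. A practical scheduler would use a polynomial-time maximum-weight matching algorithm instead.

```python
from itertools import permutations


def best_circuit_assignment(demand):
    """Return (assignment, carried) for the heaviest matching.

    demand[i][j] is the observed traffic from input i to output j.
    assignment[i] is the output port circuit-connected to input i.
    Brute force over all n! matchings, so only suitable for tiny n.
    """
    n = len(demand)
    best_perm, best_weight = None, -1
    for perm in permutations(range(n)):
        weight = sum(demand[i][perm[i]] for i in range(n))
        if weight > best_weight:
            best_perm, best_weight = perm, weight
    return list(best_perm), best_weight


# Hypothetical 3-port traffic matrix with three elephant flows (9, 8, 7).
demand = [
    [0, 9, 1],
    [1, 0, 8],
    [7, 1, 0],
]
assignment, carried = best_circuit_assignment(demand)
print(assignment, carried)  # [1, 2, 0] 24
```

The remaining small entries (the mice) are not served by this configuration, which is why hybrid designs keep a packet switch alongside the OCS.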

SLIDE 15

Limitations of Hotspot Scheduling

[Figure: two timelines over TM(t). Hotspot scheduling: each Observe phase (1) is followed by a single Reconfig (3). Goal: each Observe (1) and Compute (2) is followed by many back-to-back reconfigurations (3 3 3 …)]

SLIDE 16

Traffic Matrix Scheduling

Step 1. Gather traffic matrix TM
Step 2. Scale TM into TM´
Step 3. Decompose TM´ into schedule
Step 4. Execute schedule in hardware

TM → TM´ = t1·P1 + t2·P2 + … + tN·PN (circuit configuration Pi is held for duration ti)

Birkhoff-von Neumann Decomposition
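Step 2 (scaling TM into TM´) has to produce a matrix whose rows and columns all sum to the same value, because that is what the decomposition in Step 3 requires. The slides do not say how the scaling is done; one common technique is Sinkhorn-Knopp iteration, sketched here as an assumption. The function name and example matrix are hypothetical, and the sketch assumes the matrix has enough nonzero entries for the iteration to converge (for example, all off-diagonal entries positive).

```python
def sinkhorn_scale(tm, iterations=200):
    """Scale a nonnegative square matrix toward a doubly-stochastic one.

    Alternately normalizes every row and every column to sum to 1
    (Sinkhorn-Knopp). Assumes no row or column is all zeros.
    """
    n = len(tm)
    m = [[float(x) for x in row] for row in tm]
    for _ in range(iterations):
        # Normalize each row to sum to 1.
        for i in range(n):
            row_sum = sum(m[i])
            m[i] = [x / row_sum for x in m[i]]
        # Normalize each column to sum to 1.
        for j in range(n):
            col_sum = sum(m[i][j] for i in range(n))
            for i in range(n):
                m[i][j] /= col_sum
    return m


tm = [
    [0, 9, 1],
    [1, 0, 8],
    [7, 1, 0],
]
scaled = sinkhorn_scale(tm)
for row in scaled:
    print([round(x, 3) for x in row])
```

Zero entries stay zero under this scaling, so pairs with no demand are never given circuit time by the later decomposition.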

SLIDE 17

BvN Decomposition

Suppose T is a scaled doubly-stochastic matrix. Then T can be written as a weighted sum of k' permutation matrices.

– k' could be large (O(n²) in the worst case)
– T has to be doubly-stochastic
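A Birkhoff-von Neumann decomposition can be computed by repeatedly finding a perfect matching on the nonzero entries and peeling it off with the largest weight it supports. The following is a minimal sketch of that textbook procedure, not the implementation used by any of the systems in this lecture; the function names are assumptions, and the input is assumed to be a nonnegative matrix whose rows and columns all have the same sum.

```python
def perfect_matching(support):
    """Find a perfect matching with augmenting paths (Kuhn's algorithm).

    support[i][j] is True when row i may be matched to column j.
    Returns perm with perm[i] = column matched to row i, or None.
    """
    n = len(support)
    col_owner = [-1] * n  # col_owner[j] = row currently matched to column j

    def try_assign(i, seen):
        for j in range(n):
            if support[i][j] and not seen[j]:
                seen[j] = True
                if col_owner[j] == -1 or try_assign(col_owner[j], seen):
                    col_owner[j] = i
                    return True
        return False

    for i in range(n):
        if not try_assign(i, [False] * n):
            return None
    perm = [0] * n
    for j, i in enumerate(col_owner):
        perm[i] = j
    return perm


def bvn_decompose(t, tol=1e-9):
    """Decompose t into a list of (weight, permutation) terms."""
    n = len(t)
    residual = [row[:] for row in t]
    terms = []
    while True:
        support = [[residual[i][j] > tol for j in range(n)] for i in range(n)]
        perm = perfect_matching(support)
        if perm is None:
            break
        # The smallest matched entry limits how much this matching can carry.
        weight = min(residual[i][perm[i]] for i in range(n))
        for i in range(n):
            residual[i][perm[i]] -= weight
        terms.append((weight, perm))
    return terms


# Scaled doubly-stochastic example: every row and column sums to 4.
T = [
    [0, 3, 1],
    [1, 0, 3],
    [3, 1, 0],
]
terms = bvn_decompose(T)
print(terms)
```

Each iteration zeroes at least one entry, so the loop ends after at most n² matchings. That is also why the number of terms k' can grow quadratically with the number of ports.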

SLIDE 18

Scheduling

Circuit switch configuration: bipartite graph matching

[Figure: traffic matrix T for n = 5 nodes, with entries of 1 and 4, and a time axis]

SLIDE 19

Scheduling

Configuration of circuit switch modeled as bipartite graph matching

[Figure: traffic matrix T for n = 5 nodes, with entries of 1 and 4, and a time axis]

SLIDE 20

Scheduling

Configuration of circuit switch modeled as bipartite graph matching

[Figure: traffic matrix T for n = 5 nodes, showing only entries of 1; the time axis includes a reconfiguration delay]

SLIDE 21

Scheduling

Configuration of circuit switch modeled as bipartite graph matching

[Figure: traffic matrix T for n = 5 nodes, showing only entries of 1, and a time axis]

SLIDE 22

Scheduling

Configuration of circuit switch modeled as bipartite graph matching

[Figure: traffic matrix T for n = 5 nodes, showing no entries, and a time axis]

SLIDE 23

Scheduling

Maximize throughput in time-window W

[Figure: traffic matrix T for n = 5 nodes, with entries of 1 and 4; the time axis marks a window W whose schedule is left open (??)]

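SLIDE 24

Problem Statement

[Equation: maximize an objective subject to (s.t.) constraints; the formula itself is not recoverable. Annotated terms: permutation matrices, duration, number of matchings]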
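The formula on this slide did not survive extraction. One formulation consistent with its annotations (permutation matrices, durations, number of matchings) and with the goal of maximizing throughput in a time window W is sketched below. The symbols are assumptions introduced here: \alpha_i is the duration of the i-th matching P_i, k is the number of matchings, \delta is the reconfiguration delay, and the minimum is taken entry by entry against the traffic matrix T.

```latex
\begin{aligned}
\text{maximize}\quad
  & \Bigl\lVert \min\Bigl( \sum_{i=1}^{k} \alpha_i P_i,\; T \Bigr) \Bigr\rVert_1 \\
\text{s.t.}\quad
  & \sum_{i=1}^{k} (\alpha_i + \delta) \le W, \\
  & P_i \text{ a permutation matrix},\quad \alpha_i \ge 0,\quad k \in \mathbb{N}.
\end{aligned}
```

Read this way, every extra matching costs one reconfiguration delay out of the window, so a schedule with many short matchings can serve less traffic than one with a few long ones.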

SLIDE 25

Eclipse: Greedy Algorithm

(with provable guarantees)

Venkatakrishnan et al., “Costly Circuits, Submodular Schedules, Hybrid Switch Scheduling for Data Centers”, to appear in SIGMETRICS 2016.
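To give a feel for greedy scheduling under a reconfiguration delay, here is a simplified sketch. It is not the Eclipse algorithm from the cited paper: it is a toy greedy that, at each step, picks the matching and duration serving the most residual demand per unit of time spent, counting the delay. It assumes a link rate of one demand unit per time unit, brute-forces all matchings (tiny n only), and the names `greedy_schedule`, `window`, and `delta` are hypothetical.

```python
from itertools import permutations


def greedy_schedule(traffic, window, delta):
    """Greedy circuit schedule within a time window.

    traffic[i][j]: demand from port i to port j.
    window: total time available; delta: reconfiguration delay per matching.
    Returns (schedule, served), where schedule is a list of
    (duration, permutation) pairs and served is the total demand carried.
    """
    n = len(traffic)
    residual = [row[:] for row in traffic]
    schedule = []
    time_left = window
    while time_left > delta:
        # Candidate durations: residual demand values, capped by the time left.
        durations = {min(x, time_left - delta)
                     for row in residual for x in row if x > 0}
        best = None  # (score, duration, perm, served_now)
        for alpha in durations:
            for perm in permutations(range(n)):
                served_now = sum(min(residual[i][perm[i]], alpha)
                                 for i in range(n))
                score = served_now / (alpha + delta)
                if best is None or score > best[0]:
                    best = (score, alpha, perm, served_now)
        if best is None or best[3] == 0:
            break  # nothing left to serve
        _, alpha, perm, _ = best
        for i in range(n):
            residual[i][perm[i]] = max(0, residual[i][perm[i]] - alpha)
        schedule.append((alpha, list(perm)))
        time_left -= alpha + delta
    served = sum(map(sum, traffic)) - sum(map(sum, residual))
    return schedule, served


traffic = [
    [0, 4, 1],
    [1, 0, 4],
    [4, 1, 0],
]
schedule, served = greedy_schedule(traffic, window=7, delta=1)
print(schedule, served)  # [(4, [1, 2, 0]), (1, [2, 0, 1])] 15
```

In this example the heavy matching is held for 4 time units and the light one for 1, with two reconfiguration delays filling the rest of the window. Shrinking the window to 5 would leave room for the first matching only.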

SLIDE 26

Discussion

SLIDE 27

FireFly

Slides based on presentation by Vyas Sekar (CMU)

SLIDE 28

Why FSO instead of RF?

                RF (e.g. 60 GHz)     FSO (free-space optical)
Beam            Wide                 Narrow
Steering        Faster               Slow
Interference    High                 Zero
Active links    Limited              No limit
Throughput      Limited              High

SLIDE 29

Today’s FSO

– Cost: $15K per FSO
– Size: 3 ft³
– Power: 30 W
– Non-steerable

  • Current: bulky, power-hungry, and expensive
  • Required: small, low power, and low cost

SLIDE 30

Why Can Size, Cost, and Power Be Reduced?

  • Traditional use: outdoor, long haul

‒ High power

‒ Weatherproof

  • Data centers: indoor, short haul
  • Feasible roadmap via commodity fiber optics

‒ E.g. small form-factor pluggable transceivers (optical SFP)

SLIDE 31

FSO Design Overview

[Figure labels: SFP, fiber optic cables, diverging beam, lens focal distance, collimating lens, parallel beam, focusing lens, large-core fiber optic cables]

  • Large cores (> 125 microns) are more robust

SLIDE 32

FSO Link Performance

[Figure: link performance under movement, with a 6 mm tolerance marked on each side]

  • Effect of vibrations, etc.: 6 mm movement tolerance
  • Range up to 24 m tested

FSO link is as robust as a wired link

SLIDE 33

Steerability

Shortcomings of current FSOs

✓ Cost
✓ Size
✓ Power

  • Not Steerable

FSO design using SFP addresses cost, size, and power. Steerability is added via switchable mirrors or Galvo mirrors.

SLIDE 34

Steerability via Switchable Mirror

[Figure: FSO endpoints A, B, and C under a ceiling mirror, with a switchable mirror (SM) in “mirror” mode]

  • Switchable Mirror: switches between glass (transparent) and mirror states
  • Electronic control, low latency

SLIDE 35

Steerability via Galvo Mirror

[Figure: FSO endpoints A, B, and C under a ceiling mirror, with a Galvo mirror steering the beam]

  • Galvo Mirror: small rotating mirror
  • Very low latency

SLIDE 36

How to design FireFly network?

Goals: Robustness to current and future traffic

Budget & Physical Constraints

Design parameters

– Number of FSOs?
– Number of steering mirrors?
– Initial mirrors’ configuration

Performance metric

– Dynamic bisection bandwidth

SLIDE 37

Discussion

SLIDE 38

Next Time: Rack-Scale Computing
