SLIDE 1

Rx for Data Center Communication Scalability

Ýmir Vigfússon, Gregory Chockler, Yoav Tock (IBM Research, Haifa Labs)
Hussam Abu-Libdeh, Robert Burgess, Ken Birman, Haoyuan Li (Cornell University)
Mahesh Balakrishnan (Microsoft Research, Silicon Valley)

SLIDE 2

IP Multicast in Data Centers

Useful:
  – IPMC is fast, and widely supported
  – Multicast and pub/sub often used implicitly
  – Lots of redundant traffic in data centers [Anand et al., SIGMETRICS '09]

Rarely used:
  – IP Multicast has scalability problems!

SLIDE 3

IP Multicast in Data Centers

  • Switching hierarchies
SLIDE 4

IP Multicast in Data Centers

  • Switches have limited state space

Switch model (10 Gbps)              Group capacity
Alcatel-Lucent OmniSwitch OS6850-4     260
Cisco Catalyst 3750E-48PD-EF         1,000
D-Link DGS-3650                        864
Dell PowerConnect 6248P                 69
Extreme Summit X450a-48t               792
Foundry FastIron Edge X 448+2XG        511
HP ProCurve 3500yl                   1,499

SLIDE 5

IP Multicast in Data Centers

SLIDE 6

IP Multicast in Data Centers

  • NICs also have limited state space

E.g. 16 exact addresses, plus a 512-bit Bloom filter for the rest
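The consequence of such an imperfect filter can be seen with a toy model. The sketch below (the 512-bit / 4-hash parameters mirror the slide's example; real NIC filters differ) shows why a Bloom-style filter admits false positives that the kernel must then discard:

```python
import hashlib

class BloomFilter:
    """Tiny Bloom filter standing in for a NIC's imperfect multicast filter."""
    def __init__(self, m_bits=512, k_hashes=4):
        self.m, self.k = m_bits, k_hashes
        self.bits = 0                         # bit vector packed into an int
    def _positions(self, item):
        # k independent positions derived from a salted hash of the item
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.m
    def add(self, item):
        for p in self._positions(item):
            self.bits |= 1 << p
    def might_contain(self, item):
        return all(self.bits >> p & 1 for p in self._positions(item))

nic = BloomFilter()
subscribed = [f"224.1.2.{i}" for i in range(20)]  # more groups than 16 exact slots
for addr in subscribed:
    nic.add(addr)

# Every subscribed group passes; some unsubscribed groups may also pass
# (false positives), and those packets must be filtered out in software.
assert all(nic.might_contain(a) for a in subscribed)
false_pos = sum(nic.might_contain(f"239.9.9.{i}") for i in range(1000))
print(f"false positives out of 1000 foreign groups: {false_pos}")
```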

SLIDE 7

IP Multicast in Data Centers

SLIDE 8

IP Multicast in Data Centers

  • Kernel has to filter out unwanted packets!
SLIDE 9

IP Multicast in Data Centers

• Packet loss triggers further problems
  – Reliability layer may aggravate loss
  – Major companies have suffered multicast storms

IPMC has dangerous scalability issues

SLIDE 10

Key ideas

• Treat IPMC groups as a scarce resource
  – Limit the number of physical IPMC groups
  – Translate logical IPMC groups into either physical IPMC groups or multicast by iterated unicast
• Merge similar groups together
• Dr. Multicast
SLIDE 11
Dr. Multicast

• Transparent: standard IPMC interface to the user, standard IGMP interface to the network
• Robust: distributed, fault-tolerant service
• Optimizes resource use: merges similar multicast groups together
• Scalable in number of groups: limits the number of physical IPMC groups
SLIDE 12
Dr. Multicast

• Library maps logical IPMC to physical IPMC or iterated unicast
• Transparent to the application
  – IPMC calls intercepted and modified
• Transparent to the network
  – Ordinary IPMC/IGMP traffic
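As a rough illustration of this send path (the class name, mapping format, and method are hypothetical stand-ins, not the actual library API, which interposes on the standard socket calls), the translation might look like:

```python
import socket

class MCMDSocket:
    """Hedged sketch of the Dr. Multicast library layer's send path."""
    def __init__(self, mapping):
        # mapping: logical group -> physical IPMC address (str), or a list of
        # (host, port) unicast receivers when no IPMC address is granted
        self.mapping = mapping
        self.sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    def sendto_group(self, data, logical_group, port):
        target = self.mapping[logical_group]
        if isinstance(target, str):            # granted a physical IPMC address
            return [self.sock.sendto(data, (target, port))]
        # iterated unicast: one point-to-point send per receiver
        return [self.sock.sendto(data, peer) for peer in target]

mapping = {
    "224.5.5.5": "224.1.2.3",                                 # physical IPMC
    "224.5.5.6": [("127.0.0.1", 9001), ("127.0.0.1", 9002)],  # iterated unicast
}
s = MCMDSocket(mapping)
sent = s.sendto_group(b"hello", "224.5.5.6", 9000)
print(len(sent))  # one send per unicast receiver
```

The application still addresses the logical group; whether the message goes out as one multicast packet or r unicast packets is decided entirely by the mapping.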

SLIDE 13
Dr. Multicast

• Transparent: standard IPMC interface to the user, standard IGMP interface to the network
• Robust: distributed, fault-tolerant service
• Optimizes resource use: merges similar multicast groups together
• Scalable in number of groups: limits the number of physical IPMC groups
SLIDE 14
Dr. Multicast

• Per-node agent maintains global group membership and mapping
  – Library consults local agent
• Leader agent periodically computes new mapping (see later)
• State reconciled via gossip
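A minimal sketch of version-based gossip reconciliation between two agents (the state layout and last-writer-wins-by-version merge rule are illustrative assumptions, not MCMD's actual protocol):

```python
class Agent:
    """Per-node agent holding versioned membership/mapping state."""
    def __init__(self):
        self.state = {}                         # key -> (value, version)

    def update(self, key, value):
        _, ver = self.state.get(key, (None, 0))
        self.state[key] = (value, ver + 1)      # bump version on local write

    def gossip_with(self, peer):
        # exchange full state; per key, the entry with the higher version
        # wins on both sides, so the two agents converge
        for key in set(self.state) | set(peer.state):
            mine = self.state.get(key, (None, 0))
            theirs = peer.state.get(key, (None, 0))
            winner = mine if mine[1] >= theirs[1] else theirs
            self.state[key] = winner
            peer.state[key] = winner

a, b = Agent(), Agent()
a.update("members:224.5.5.5", ["node1", "node2"])
b.update("mapping:224.5.5.5", "224.1.2.3")
a.gossip_with(b)
assert a.state == b.state                       # agents converge after gossip
```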
SLIDE 15

Library Layer Overhead

• Experiment measuring sends/sec at one sender
• Sending to r addresses achieves roughly 1/r of the send rate (iterated unicast costs r sends per message)
• Insignificant overhead when mapping a logical IPMC group to a physical IPMC group

SLIDE 16

Network Overhead and Robustness

• Experiment on 90 Emulab nodes, introduced 10 at a time
• Total network traffic grows linearly
• Average traffic received per node (figure)
• Robust to major correlated failure: half of the nodes die

SLIDE 17
Dr. Multicast

• Transparent: standard IPMC interface to the user, standard IGMP interface to the network
• Robust: distributed, fault-tolerant service
• Optimizes resource use: merges similar multicast groups together
• Scalable in number of groups: limits the number of physical IPMC groups
SLIDE 18

Optimization Questions

(Figure: users and multicast groups as a bipartite interest graph)

SLIDE 19

Optimization Questions

Assign IPMC and unicast addresses s.t.

  • Min. receiver filtering
  • Min. network traffic
  • Min. # IPMC addresses
  • … yet deliver all messages to interested parties
SLIDE 20

Optimization Questions

Assign IPMC and unicast addresses s.t.

• Minimize receiver filtering
• Minimize network traffic
• Use at most M IPMC addresses (hard constraint)

  minimize  λ · (receiver filtering) + (1 − λ) · (network traffic)
  subject to at most M physical IPMC addresses

• λ is a knob to control the relative costs of CPU filtering and of duplicate traffic.
• Both λ and M are part of administrative policy.
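One plausible reading of this slide's objective is cost = λ · (filtering) + (1 − λ) · (traffic), subject to at most M physical IPMC addresses. A toy evaluation of that reading (the `cost` function, group data, and λ = 0.5 default are all illustrative assumptions):

```python
def cost(groups, assignment, lam=0.5, M=2):
    """groups: logical group -> set of interested users.
       assignment: logical group -> meta-group id sharing one IPMC
       address, or None for iterated unicast."""
    # members of each meta-group = union of its logical groups' members
    metas = {}
    for g, meta in assignment.items():
        if meta is not None:
            metas.setdefault(meta, set()).update(groups[g])
    assert len(metas) <= M, "hard cap on physical IPMC addresses"

    filtering = traffic = 0
    for g, meta in assignment.items():
        if meta is None:
            traffic += len(groups[g])                  # r unicast copies
        else:
            traffic += 1                               # one multicast send
            filtering += len(metas[meta] - groups[g])  # unwanted deliveries
    return lam * filtering + (1 - lam) * traffic

groups = {"A": {1, 2, 3}, "B": {2, 3, 4}, "C": {9}}
merged = {"A": "m1", "B": "m1", "C": None}  # A and B share one IPMC address
print(cost(groups, merged))  # 2.5: two filtered deliveries, three sends
```

Raising λ makes sharing an address (and thus filtering) more expensive; lowering it penalizes the duplicate copies of iterated unicast instead.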

SLIDE 21

MCMD Heuristic

Groups in `user-interest' space, shown as binary membership vectors over users:

  GRAD STUDENTS: (1,1,1,1,1,0,1,0,1,0,1,1)
  FREE FOOD:     (0,1,1,1,1,1,1,0,0,1,1,1)

SLIDE 22

MCMD Heuristic

Groups in `user-interest' space

Grow M meta-groups around the groups greedily while cost decreases

SLIDE 23

MCMD Heuristic

Groups in `user-interest' space

Grow M meta-groups around the groups greedily while cost decreases
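A simplified rendition of this greedy step (the cost model and the merge-down-to-M order are assumptions for illustration; the real MCMD heuristic is more involved):

```python
from itertools import combinations

def meta_cost(metas, lam=0.5):
    """metas: list of meta-groups, each a list of logical-group member sets.
       Cost per logical group: lam * filtering + (1 - lam) * one send."""
    total = 0.0
    for logical_sets in metas:
        union = set().union(*logical_sets)     # receivers of the meta-group
        for s in logical_sets:
            total += lam * len(union - s) + (1 - lam)
    return total

def greedy_merge(groups, M, lam=0.5):
    # start with one meta-group per logical group, then repeatedly take
    # the cheapest pairwise merge until only M meta-groups remain
    metas = [[set(g)] for g in groups]
    while len(metas) > M:
        best = None
        for i, j in combinations(range(len(metas)), 2):
            trial = [m for k, m in enumerate(metas) if k not in (i, j)]
            trial.append(metas[i] + metas[j])
            c = meta_cost(trial, lam)
            if best is None or c < best[0]:
                best = (c, trial)
        metas = best[1]
    return metas

groups = [{1, 2, 3}, {1, 2, 3, 4}, {7, 8}, {7, 8, 9}]
metas = greedy_merge(groups, M=2)
print([sorted(set().union(*m)) for m in metas])
```

Because merging similar groups adds little filtering cost while merging dissimilar ones adds a lot, the greedy step naturally clusters nearby groups in user-interest space.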

SLIDE 24

MCMD Heuristic

(Figure: groups in `user-interest' space; meta-groups assigned physical IPMC addresses 224.1.2.3, 224.1.2.4, 224.1.2.5, remaining groups served by unicast)

SLIDE 25
Data sets/models

• Social:
  – Yahoo! Groups
  – Amazon Recommendations
  – Wikipedia Edits
  – LiveJournal Communities
  – Mutual Interest Model

SLIDE 26

MCMD Heuristic

  • Total cost on samples of 1000 logical groups.

– Costs drop exponentially with more IPMC addresses

SLIDE 27
Data sets/models

• Social:
  – Yahoo! Groups
  – Amazon Recommendations
  – Wikipedia Edits
  – LiveJournal Communities
  – Mutual Interest Model
• Systems:
  – IBM WebSphere

SLIDE 28

MCMD Heuristic

• Total cost on IBM WebSphere data set (simulation)

– Negligible costs when using only 4 IPMC addresses

SLIDE 29
Dr. Multicast

• Transparent: standard IPMC interface to the user, standard IGMP interface to the network
• Robust: distributed, fault-tolerant service
• Optimizes resource use: merges similar multicast groups together
• Scalable in number of groups: limits the number of physical IPMC groups
SLIDE 30

Group Scalability

  • Experiment on Emulab with 1 receiver, 9 senders
  • MCMD prevents ill-effects when the # of groups scales up
SLIDE 31

Dr. Multicast

• IPMC is useful, but has scalability problems
• Dr. Multicast treats IPMC groups as scarce and sensitive resources
  – Transparent, backward-compatible
  – Scalable in the number of groups
  – Robust against failures
  – Optimizes resource use by merging similar groups
• Enables safe and scalable use of multicast
SLIDE 32