


SLIDE 1

MP-HULA: A Transport Layer aware Load Balancing Scheme for Programmable Data Planes 1

Cristian Hernandez Benet, Andreas J. Kassler, Theophilus Benson, Gergely Pongracz

MP-HULA – A Multipath Transport Layer Aware Datacenter Load Balancing Scheme Using Programmable Data Planes

SLIDE 2

Motivation

• Multiple paths
• Large bisection bandwidth
  – But: at most 25% of core links are highly utilized → effective load balancing required
• Volatile, unpredictable traffic patterns
• Multipath transport protocols (e.g. MPTCP)
  – Applications enhance their performance using several paths (e.g. Siri)
• Symmetric/asymmetric topologies with different numbers of layers

SLIDE 3

State of the Art

                             ECMP   CONGA     HULA      DRILL    CLOVE
GRANULARITY                  FLOW   FLOWLET   FLOWLET   PACKET   FLOWLET
CONGESTION-AWARE             NO     YES       YES       YES      YES
CUSTOM-ASIC                  NO     YES       NO        YES      NO
PROGRAMMABLE                 NO     NO        YES       NO       NO
SCALABLE                     YES    NO        YES       YES      YES
MULTIPATH-TRANSPORT-AWARE    NO     NO        NO        NO       NO

SLIDE 4

State of the Art

Not multipath-transport-aware: none of ECMP, CONGA, HULA, DRILL or CLOVE takes multipath transport protocols (e.g. SCTP, MPTCP, QUIC) into account.

SLIDE 5

Challenges

– How to detect congestion in a timely, accurate and scalable way?
– Robust against reordering
– Efficient load balancing for asymmetric topologies or link failures
– Scalable

SLIDE 6

State of the Art - HULA

• HULA (SOSR '16)
  – Distance-vector-like propagation
    • Periodic probes carry path utilization
  – Each switch chooses the best downstream path
    • Maintains only the best next hop → cannot exploit multipath transport features → focus of this work
    • Scales to large topologies
  – Programmable at line rate

[Figure: probe propagation and per-next-hop utilization monitoring. Flowlet condition shown in the figure: Gap ≥ |d1 - d2|, where d1 and d2 are the delays of the two paths.]
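The gap condition Gap ≥ |d1 - d2| can be illustrated with a small sketch. It assumes d1 and d2 are the one-way delays of the current and the alternative path, expressed in integer microseconds. The function names and the example numbers are illustrative and do not come from the slides.

```python
def min_safe_gap(d1_us: int, d2_us: int) -> int:
    """Smallest idle gap (microseconds) after which a flow can change paths
    without the new packets overtaking the old ones."""
    return abs(d1_us - d2_us)


def is_new_flowlet(now_us: int, last_seen_us: int, gap_us: int) -> bool:
    """A packet starts a new flowlet if its flow was idle for at least gap_us."""
    return (now_us - last_seen_us) >= gap_us


# Hypothetical delays: 100 us on the current path, 250 us on the alternative.
gap = min_safe_gap(100, 250)
print(gap)                              # 150
print(is_new_flowlet(400, 0, gap))      # True: idle for 400 us
print(is_new_flowlet(50, 0, gap))       # False: idle for only 50 us
```

Only packets for which `is_new_flowlet` holds may be moved to another next hop. All other packets stay on the path of their current flowlet.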

SLIDE 7

Challenges

– How to detect congestion in a timely, accurate and scalable way? → Probing: global link utilization information
– Robust against reordering → Flowlet switching
– Efficient load balancing for asymmetric topologies or link failures → The periodic arrival of probes is used as a keep-alive
– Scalable → Only the best next hop is stored for a selected destination

SLIDE 8

MP-HULA – Problem statement

[Figure: MPTCP connection 1 with sub-flows SF:1 and SF:2 (TCP Connection 1 and TCP Connection 2), flowlets FL:1 and FL:2 separated by a flowlet gap, and a switch holding a single best next-hop.]

The switch does not have contextual information about MPTCP.

SLIDE 9

MP-HULA – Problem statement

• Most load balancing schemes are not multipath-transport-aware
  – Sub-flows might be routed over the same path → bandwidth aggregation might be reduced
  – Redundancy and persistence might be reduced if all sub-flows end up on a failed link

SLIDE 10

MP-HULA – Problem statement

➢ When both flowlets arrive, the best next-hop is port 0

SLIDE 11

MP-HULA – Problem statement

➢ Both flowlets are sent over port 0. The best next-hop is then updated, but the flowlets are still sent over the same hop until the flowlet expires
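The behaviour described on this slide can be reproduced with a minimal sketch of a switch that stores a single best next hop per destination and pins each flowlet to the port chosen when it started. The class name, the timeout value and the time units are assumptions made for illustration.

```python
FLOWLET_GAP = 5  # hypothetical flowlet timeout, arbitrary time units


class SingleBestHopSwitch:
    """HULA-like forwarding: one best next hop per destination."""

    def __init__(self, best_hop):
        self.best_hop = dict(best_hop)  # destination -> port
        self.flowlets = {}              # flow key -> (port, last_seen)

    def forward(self, flow, dst, now):
        entry = self.flowlets.get(flow)
        if entry is not None and now - entry[1] < FLOWLET_GAP:
            port = entry[0]             # flowlet still active: keep its port
        else:
            port = self.best_hop[dst]   # new flowlet: take the current best hop
        self.flowlets[flow] = (port, now)
        return port


sw = SingleBestHopSwitch({"ToR10": 0})
first = [sw.forward("SF:1", "ToR10", 0), sw.forward("SF:2", "ToR10", 0)]
sw.best_hop["ToR10"] = 1                # a probe reports that port 1 is now better
pinned = [sw.forward("SF:1", "ToR10", 2), sw.forward("SF:2", "ToR10", 2)]
later = [sw.forward("SF:1", "ToR10", 10), sw.forward("SF:2", "ToR10", 10)]
print(first, pinned, later)             # [0, 0] [0, 0] [1, 1]
```

Both sub-flows always share one port, first port 0 and later port 1, so the MPTCP connection never uses two paths at the same time.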

SLIDE 12

MP-HULA – Problem statement

➢ When the flowlet expires, the new flowlet is sent over the current best next-hop (port 1)

SLIDE 13

MP-HULA – Problem statement

➢ When the flowlet expires, the new flowlet is sent over the current best next-hop (port 1)

SLIDE 14

MP-HULA – Problem statement

➢ The best next-hop is port 1, so flowlet 2 is sent over port 1

SLIDE 15

MP-HULA – Problem statement

What do we want to achieve instead?

  • Bandwidth aggregation
  • Redundancy & Persistence

SLIDE 16

MP-HULA – Problem statement

What do we want to achieve instead?

[Figure: the switch keeps a 1st and a 2nd best next-hop; the flowlets of sub-flows SF:1 and SF:2 of MPTCP connection 1 leave over different ports.]

SLIDE 17

MP-HULA – Problem statement

How can we do it?

1) Track not only the best next-hop but the k best next-hops

SLIDE 18

MP-HULA – Problem statement

2) Identify the MPTCP session and its sub-flows in order to send their flowlets over different ports

SLIDE 19

MP-HULA – Problem statement

2) Identify the MPTCP session and its sub-flows in order to send their flowlets over different ports

SLIDE 20

MP-HULA – Problem statement

3) Mark sub-flows belonging to a specific MPTCP session

[Figure callout on flowlet FL:2: not aware that this flowlet belongs to the same MPTCP connection.]

SLIDE 21

MP-HULA – MPTCP Identification Problem

• MPTCP spreads application data over multiple sub-flows
• MPTCP in general improves fairness, throughput and robustness
• Beneficial for long flows (elephant flows)

Step 1: SYN

SLIDE 22

MP-HULA – MPTCP Identification Problem

Step 2: ACK

• MPTCP spreads application data over multiple sub-flows
• MPTCP in general improves fairness, throughput and robustness
• Beneficial for long flows (elephant flows)

SLIDE 23

MP-HULA – MPTCP Identification Problem

Step 3: ACK

The MPTCP sender and receiver generate Token A and Token B from {Key A} and {Key B} for authentication.

SLIDE 24

MP-HULA – MPTCP Identification Problem

Step 4: ACK

Sender MPTCP A sends the generated Token B and a random number (nonce).

SLIDE 25

MP-HULA – MPTCP Identification Problem

Step 5: ACK

MPTCP receives the generated Token A and validates it.

SLIDE 26

MP-HULA – MPTCP Identification Problem

Step 5: ACK

MPTCP sends the generated authentication code HMAC A and the connection is initiated.

[Figure callout: this node is not aware of the 3-way handshake messages.]

SLIDE 27

MP-HULA – Parse, Identification and Correlation

• (1) Parse: the ToR parses the MPTCP option messages carrying the keys and tokens
• (2) Identify: the ToR identifies the MPTCP session, using an external function to compute SHA1
• (3) Correlate: the ToR correlates sub-flows to a given MPTCP connection

The ToR parses, identifies, correlates and marks the MPTCP traffic.

[Figure callout: this node is not aware of the 3-way handshake messages.]
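The parse, identify and correlate steps can be sketched as follows. The sketch relies on the MPTCP v0 rule from RFC 6824 that a token is the most significant 32 bits of the SHA-1 hash of the 64-bit key. The option names MP_CAPABLE and MP_JOIN are taken from that RFC rather than from the slides, and `SessionTable`, its methods and the example key are hypothetical.

```python
import hashlib


def mptcp_token(key: bytes) -> int:
    """MPTCP v0 token: most significant 32 bits of SHA-1(key)."""
    if len(key) != 8:
        raise ValueError("MPTCP keys are 64 bits long")
    return int.from_bytes(hashlib.sha1(key).digest()[:4], "big")


class SessionTable:
    """Hypothetical ToR state: token -> locally assigned MPTCP session ID."""

    def __init__(self):
        self.by_token = {}
        self.next_id = 1

    def learn_key(self, key: bytes) -> int:
        """Called when a key is parsed from an MP_CAPABLE option."""
        token = mptcp_token(key)
        if token not in self.by_token:
            self.by_token[token] = self.next_id
            self.next_id += 1
        return self.by_token[token]

    def correlate_join(self, token: int):
        """Called when a token is parsed from an MP_JOIN option.
        Returns the session ID, or None if the token is unknown."""
        return self.by_token.get(token)


table = SessionTable()
key_b = bytes.fromhex("0123456789abcdef")   # made-up example key
session = table.learn_key(key_b)
print(table.correlate_join(mptcp_token(key_b)) == session)  # True
```

A new sub-flow carries only the token, so the ToR has to compute the same hash itself to link that sub-flow back to the session it saw earlier. This is the role of the external SHA1 function.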

SLIDE 28

MP-HULA – Parse, Identification and Correlation

The ToR parses, identifies, correlates and marks the MPTCP traffic.

P4 primitives:
  • Programmable parsing
  • RW packet metadata
  • RW access to stateful memory
  • Comparison/arithmetic operators
  • External function (SHA1)

[Figure callout: this node is not aware of the 3-way handshake messages.]

SLIDE 29

MP-HULA – Marking

• (4) Marking: the ToR needs to augment MPTCP data packets with an additional header that uniquely identifies the MPTCP connection and sub-flow to upper-layer switches.
  – MPTCP_ID (64 bits) to identify the MPTCP connection
  – Sub-flow_num (4 bits) to identify the sub-flow number within the MPTCP connection

The ToR parses, identifies, correlates and marks the MPTCP traffic.

[Figure callout: this node is not aware of the 3-way handshake messages.]
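A minimal sketch of such a marking header is given below. The slides specify only the two fields, MPTCP_ID (64 bits) and Sub-flow_num (4 bits). The byte layout, the four padding bits and the function names are assumptions made so that the example is byte-aligned.

```python
import struct

# Assumed layout: 8 bytes of MPTCP_ID in network byte order, then 1 byte
# holding Sub-flow_num in the high nibble and zero padding in the low nibble.
MARK_FORMAT = "!QB"


def pack_mark(mptcp_id: int, subflow_num: int) -> bytes:
    if not 0 <= mptcp_id < 2**64:
        raise ValueError("MPTCP_ID must fit in 64 bits")
    if not 0 <= subflow_num < 2**4:
        raise ValueError("Sub-flow_num must fit in 4 bits")
    return struct.pack(MARK_FORMAT, mptcp_id, subflow_num << 4)


def unpack_mark(data: bytes):
    mptcp_id, last_byte = struct.unpack(MARK_FORMAT, data[:9])
    return mptcp_id, last_byte >> 4


mark = pack_mark(mptcp_id=1, subflow_num=2)
print(mark.hex())          # 000000000000000120
print(unpack_mark(mark))   # (1, 2)
```

With 4 bits, a header of this kind can distinguish at most 16 sub-flows per MPTCP connection.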

SLIDE 30

MP-HULA – Marking

• MPTCP_ID (64 bits) to identify the MPTCP connection
• Sub-flow_num (4 bits) to identify the sub-flow number within the MPTCP connection

Extra tables, registers

The ToR parses, identifies, correlates and marks the MPTCP traffic.

P4 primitives:
  • New header format
  • RW packet metadata
  • RW access to stateful memory

[Figure callout: this node is not aware of the 3-way handshake messages.]

SLIDE 31

Our Approach – MP-HULA

• MP-HULA probe processing
  – Extended HULA approach to collect k-path utilization

P4 primitives:
  • New header format
  • Programmable parsing
  • RW packet metadata
  • Comparison/arithmetic operators

Each switch maintains a link utilization estimator per switch port, based on an exponentially weighted moving average (EWMA).

Probes originate at ToRs. A probe replicates through the network until it reaches another ToR.
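The per-port estimator can be sketched with a textbook exponentially weighted moving average. The slides do not give the exact update rule or its parameters, so the smoothing factor `alpha`, the class name and the sample values are assumptions.

```python
class PortUtilization:
    """One smoothed utilization value per switch port (generic EWMA)."""

    def __init__(self, alpha: float = 0.5):
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self.alpha = alpha
        self.util = {}  # port -> smoothed utilization in [0, 1]

    def update(self, port: int, sample: float) -> float:
        # The first sample of a port seeds its average.
        previous = self.util.get(port, sample)
        self.util[port] = self.alpha * sample + (1.0 - self.alpha) * previous
        return self.util[port]


estimator = PortUtilization(alpha=0.5)
estimator.update(port=1, sample=0.8)
smoothed = estimator.update(port=1, sample=0.4)
print(round(smoothed, 3))  # 0.6
```

A larger `alpha` reacts faster to bursts. A smaller one gives probes a steadier value to carry.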

SLIDE 32

Our Approach – MP-HULA

• MP-HULA probe processing
  – Collect k-path utilization

Topology: ToR 1, S2, S3, S4, ToR 10

Probe labels in the figure: ToR ID = 10 with Max_util = 50%, 80%, 60% and 50%

Best hop tables (k)

          1st best next-hop        2nd best next-hop
Dst       Best hop   Path util     Best hop   Path util
ToR 10    S4         50%           S3         60%
ToR 1     S2         10%           S2         10%
…         …          …             …          …
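The following sketch shows one way to derive the k best next hops per destination from arriving probes. It assumes each probe carries the destination ToR and the maximum utilization seen so far on its path, as the probe labels suggest. The class and method names are hypothetical, and for simplicity the sketch remembers every next hop and ranks on lookup, whereas a switch would store only k entries.

```python
class KBestTable:
    """Per destination ToR, the path utilization seen via each next hop."""

    def __init__(self, k: int):
        self.k = k
        self.paths = {}  # dst -> {next_hop: path utilization in percent}

    def on_probe(self, dst, next_hop, max_util, link_util=0):
        # Path utilization is the bottleneck: the larger of the value carried
        # by the probe and the utilization of the local link to next_hop.
        util = max(max_util, link_util)
        self.paths.setdefault(dst, {})[next_hop] = util

    def best_hops(self, dst):
        ranked = sorted(self.paths.get(dst, {}).items(), key=lambda item: item[1])
        return ranked[: self.k]


table = KBestTable(k=2)
table.on_probe("ToR 10", "S4", 50)
table.on_probe("ToR 10", "S2", 80)
table.on_probe("ToR 10", "S3", 60)
print(table.best_hops("ToR 10"))  # [('S4', 50), ('S3', 60)]
```

With k=2 the result matches the 1st and 2nd best next-hop entries for ToR 10: S4 at 50% and S3 at 60%.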

SLIDE 33

Our Approach – MP-HULA

• MP-HULA and MPTCP
  – Switches load balance flowlets
  – Correlates MPTCP sub-flows to connection IDs
  – Routes different sub-flows on different next hops

Topology: ToR 1, S2, S3, S4, ToR 10

P4 primitives:
  • RW access to stateful memory
  • Comparison/arithmetic operators

Flowlet ID   Dest    Timestamp   Sub-flow ID   MPTCP ID   Best-hop
HASH1        TOR10   1           1             1          S4
HASH2        TOR10   2           2             1          S3
…            …       …

SLIDE 34

Our Approach – MP-HULA

Topology: ToR 1, S2, S3, S4, ToR 10

Best hop tables (k)

Dst      1-Best hop   Path util   2-Best hop   Path util   3-Best hop   Path util
ToR 10   S4           50%         S3           60%         S2           80%
ToR 1    S2           10%         S3           20%         S4           30%
…        …                        …                        …

Marked packet: MPTCP_ID: ID1, Sub_flow_num: 1

MPTCP ID   Sub-flow1   Hop1   Sub-flow2   Hop2
ID1        1           S4

SLIDE 35

Our Approach – MP-HULA

Topology: ToR 1, S2, S3, S4, ToR 10

Best hop tables (k=3)

Dst      1-Best hop   Path util   2-Best hop   Path util   3-Best hop   Path util
ToR 10   S4           50%         S3           60%         S2           80%
ToR 1    S2           10%         S3           20%         S4           30%
…        …                        …                        …

Marked packets: MPTCP_ID: ID1, Sub_flow_num: 1; MPTCP_ID: ID1, Sub_flow_num: 2

MPTCP ID   Sub-flow1   Hop1   Sub-flow2   Hop2
ID1        1           S4     2           S3

SLIDE 36

Our Approach – MP-HULA

Topology: ToR 1, S2, S3, S4, ToR 10

Best hop tables (k=3)

Dst      1-Best hop   Path util   2-Best hop   Path util   3-Best hop   Path util
ToR 10   S4           50%         S3           60%         S2           80%
ToR 1    S2           10%         S3           20%         S4           30%
…        …                        …                        …

Marked packets: MPTCP_ID: ID1, Sub_flow_num: 1; MPTCP_ID: ID1, Sub_flow_num: 2; MPTCP_ID: ID1, Sub_flow_num: 3

MPTCP ID   Sub-flow1   Hop1   Sub-flow2   Hop2   . . .
ID1        1           S4     2           S3

SLIDE 37

Our Approach – MP-HULA

Topology: ToR 1, S2, S3, S4, ToR 10

Best hop tables (k=3)

Dst      1-Best hop   Path util   2-Best hop   Path util   3-Best hop   Path util
ToR 10   S4           50%         S3           60%         S2           80%
ToR 1    S2           10%         S3           20%         S4           30%
…        …                        …                        …

Marked packets: MPTCP_ID: ID1, Sub_flow_num: 1; MPTCP_ID: ID1, Sub_flow_num: 2; MPTCP_ID: ID1, Sub_flow_num: 3; MPTCP_ID: ID1, Sub_flow_num: 4

MPTCP ID   Sub-flow1   Hop1   Sub-flow2   Hop2   . . .
ID1        1           S4     2           S3

Further sub-flows (here Sub_flow_num: 4 with k=3): e.g. round-robin
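The round-robin option mentioned here can be sketched as a mapping from the sub-flow number to a rank in the k best hop tables, wrapping around once the sub-flow number exceeds k. The function name is hypothetical, and the mapping of sub-flows 3 and 4 is an assumed reading of "round-robin", since the slides show the hops only for sub-flows 1 and 2.

```python
def hop_for_subflow(subflow_num: int, best_hops: list) -> str:
    """Map sub-flow numbers 1, 2, 3, ... onto the ranked best hops, wrapping around."""
    if subflow_num < 1:
        raise ValueError("sub-flow numbers start at 1")
    if not best_hops:
        raise ValueError("no next hop known for this destination")
    return best_hops[(subflow_num - 1) % len(best_hops)]


# Ranked next hops towards ToR 10, taken from the k=3 best hop tables.
best_hops_tor10 = ["S4", "S3", "S2"]
for n in range(1, 5):
    print(n, hop_for_subflow(n, best_hops_tor10))
# 1 S4
# 2 S3
# 3 S2
# 4 S4
```

Sub-flows 1 and 2 land on S4 and S3 as in the sub-flow tables. The fourth sub-flow shares S4 with the first because only three ranked hops exist.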

SLIDE 38

Evaluation

• Evaluation
  – NS2 simulator
  – RPC-based workload generator
  – End-to-end metric
    • Average Flow Completion Time (FCT)
  – Two empirical flow size distributions

[Figure: evaluation topology with 16 servers per leaf; link labels 40 Gbps, 40 Gbps and 10 Gbps; asymmetric variant.]

SLIDE 39

MP-HULA exploits transport layer multipath much better

[Figure: two plots, "All flows - websearch" and "Small flows (<100 kB) - websearch"; annotations: 21%, 24%.]
SLIDE 40

MP-HULA exploits transport layer multipath much better

[Figure: two plots, "All flows - websearch" and "Small flows (<100 kB) - websearch"; annotations: 21%, 34%, 24%, 45%.]
SLIDE 41

MP-HULA exploits transport layer multipath much better

[Figure: two plots, "All flows - websearch - uncoupled" and "All flows - websearch - asymmetric"; annotations: 54%, 13%.]
SLIDE 42

MP-HULA exploits transport layer multipath much better

[Figure: two plots, "All flows - websearch - uncoupled" and "All flows - websearch - asymmetric"; annotations: 54%, 15%, 13%, 32%.]
SLIDE 43

Conclusion & Future Work

• MP-HULA
  – Transport-layer-multipath-aware in-network load balancing
  – Distributes congestion state for the k best paths
  – Proactive path probing
  – Adaptive to network congestion, reliable when failures occur
  – Programmable for emerging data planes, scalable to large topologies
  – Outperforms HULA in average FCT: 1.27× at 50% load, 1.5× at 90% load

• Future work
  – Test MP-HULA in a P4 testbed (Netronome, NetFPGA)
  – Extend to other multipath protocols → MP-QUIC
  – Enhance the load-balancing logic