

SLIDE 1

Time-Muxed Parsing in Marking-based Network Telemetry

Alon Riesenberg*, Yonnie Kirzon*, Michael Bunin*, Elad Galili*, Gidi Navon•, Tal Mizrahi⋄*

ACM SYSTOR, Haifa, May 2019


SLIDE 2

6/16/2019 2

Background

What is network telemetry? Performance measurement + exporting to a remote location

  • Delay
  • Queue status
  • Packet loss

Why do we need telemetry? Detection

  • Failures
  • ‘Elephant’ flows
  • Congestion / Bottlenecks

SLIDE 3

Network Telemetry


Operations, Administration, Maintenance (OAM)

Network measurement / monitoring:

Control Message Control Message

SLIDE 4

Ping / Traceroute

SLIDE 5

Old-School Passive Monitoring

Counters Per port Queue State Latency Per flow Per queue

… …

SLIDE 6

Carrier Network OAM

OAM Protocols

  • IP OAM: IETF ICMPv4, IETF ICMPv6, IETF IPPM
  • MPLS / PWE3 OAM: ITU-T Y.1711 (MPLS OAM), IETF LSP-Ping (MPLS OAM), IETF PWE3 VCCV
  • MPLS-TP OAM: ITU-T G.8113.1, IETF MPLS-TP OAM
  • Ethernet OAM: IEEE 802.1ag, ITU-T Y.1731, IEEE 802.3ah
  • IETF BFD

Layers shown: Higher Layers, Layer 3, Layer 2, Layer 1

Active measurement / monitoring:

Control Message Control Message

SLIDE 7

Fate Sharing

http://www.speedtest.net

SLIDE 8

Piggybacked Measurement

Measurement info is piggybacked onto data packets

IOAM / INT

AM-PM

SLIDE 9

Piggybacked Metadata – IOAM / INT

[Figure labels: IOAM / INT Domain, Analytics Server, Telemetry Info]

Switches push local metadata into header: delay, queue state, …

  • IOAM: In situ OAM
  • INT: In-band Network Telemetry

Per-packet metadata
Per-packet overhead

SLIDE 10

RFC 8321

Fioccola, G., Capello, A., Cociglio, M., Castaldelli, L., Chen, M., Zheng, L., Mirsky, G., and T. Mizrahi, “Alternate Marking method for passive and hybrid performance monitoring”, RFC 8321, 2018.

draft-mizrahi-ippm-multiplexed-alternate-marking (internet draft)

  • T. Mizrahi, C. Arad, G. Fioccola, M. Cociglio, M. Chen, L. Zheng, and G. Mirsky, “Compact Alternate Marking Methods for Passive Performance Monitoring”, draft-mizrahi-ippm-compact-alternate-marking, work in progress, IETF, 2018.

AM-PM: Alternate Marking – Performance Measurement

SLIDE 11

AM-PM: What Can We Do with ONE Bit Per Packet?

Measurement

  • Step: marking bit over time: 000 11111 00000 111
  • Pulse: marking bit over time: 00000001000000000

SLIDE 12

AM-PM: Pulse Marking – Delay Measurement

Analytics Server

  • Time Sent: March 8th, 16:02, 123400789 nsec (UTC)
  • Time Received: March 8th, 16:02, 123500789 nsec (UTC)
  • Network Delay: 100 μsec

[Figure: servers at both ends; one measurement point checks when the packet is sent, the other checks when the packet is received.]
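The delay in this example is the difference between the two timestamps recorded for the pulse-marked packet. A minimal Python sketch of that arithmetic, assuming both measurement points share a synchronized clock and that both timestamps fall in the same minute, so only the nanosecond parts differ (the function name is illustrative, not from the talk):

```python
def one_way_delay_nsec(sent_nsec: int, received_nsec: int) -> int:
    """Delay of the pulse-marked packet, in nanoseconds.

    Both timestamps are assumed to come from synchronized clocks.
    """
    if received_nsec < sent_nsec:
        raise ValueError("receive time precedes send time; clocks may be out of sync")
    return received_nsec - sent_nsec


# Values taken from the slide (nanosecond part of each timestamp).
delay_nsec = one_way_delay_nsec(123_400_789, 123_500_789)
delay_usec = delay_nsec / 1_000
print(f"Network delay: {delay_usec:g} usec")
```

The accuracy of the result is bounded by how well the two clocks are synchronized, which is why delay measurement depends on time synchronization between the measurement points.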

SLIDE 13

AM-PM: Pulse Marking – Loss Measurement

Analytics Server

  • Counter: 2100
  • Counter: 2000
  • Packets lost: 100

[Figure: servers at both ends; each measurement point records its counter value.]

Out of order?

SLIDE 14

AM-PM: Alternate Marking – Loss Measurement

Analytics Server

  • Packets Sent: 10,000
  • Packets Received: 9,500
  • Packets Lost: 500

[Figure: servers at both ends; one measurement point counts the number of packets sent, the other counts the number of packets received.]

Consistent counting:

  • Export the counter of each color when it is not in use.
  • Resilient to reordering.

... per-color counting
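A rough Python sketch of per-color counting, using the numbers from this slide. The class and method names are hypothetical, and the export trigger is simplified to an explicit call. The point is only that a counter is read while the other color is active, so packets of the active color that arrive early or out of order cannot disturb it:

```python
from typing import List, Tuple


class ColorCounters:
    """Per-color packet counters at one measurement point (hypothetical helper)."""

    def __init__(self) -> None:
        self.counts: List[int] = [0, 0]

    def on_packet(self, marking_bit: int) -> None:
        # Each packet increments the counter of its own color.
        self.counts[marking_bit] += 1

    def export_idle(self, active_color: int) -> Tuple[int, int]:
        """Return (color, count) for the color not in use, then reset it."""
        idle = 1 - active_color
        value = self.counts[idle]
        self.counts[idle] = 0
        return idle, value


sender, receiver = ColorCounters(), ColorCounters()

# Sender: 10,000 packets of color 0, then the color switches to 1.
for _ in range(10_000):
    sender.on_packet(0)
for _ in range(3):
    sender.on_packet(1)

# Receiver: 500 color-0 packets are lost, and one color-1 packet
# overtakes the last two color-0 packets (reordering).
for _ in range(9_498):
    receiver.on_packet(0)
receiver.on_packet(1)
for _ in range(2):
    receiver.on_packet(0)
for _ in range(2):
    receiver.on_packet(1)

# Export color 0 while color 1 is the active color.
_, sent = sender.export_idle(active_color=1)
_, received = receiver.export_idle(active_color=1)
lost = sent - received
print(f"sent={sent} received={received} lost={lost}")
```

Because the color-0 counters are read only after traffic has moved to color 1, the reordered packet does not affect the loss figure.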

SLIDE 15

AM-PM: Double Marking

[Figure labels: Analytics Server, Servers, Servers]

  • Pulse bit: Delay
  • Step bit: Loss
  • TWO bits per packet

SLIDE 16

AM-PM: Multiplexed Marking

Servers Servers

  • ONE bit per packet
  • Accurate loss and delay measurement!
  • Pulse: Delay
  • Step: Loss
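One way to picture multiplexed marking from the sender's side, as a hedged Python sketch. It assumes the step color alternates every `period` packets (a real deployment would switch color on a time interval) and that the pulse is a single packet carrying the inverted color. `multiplexed_bits` and its parameters are illustrative names, not taken from the talk:

```python
from typing import List


def multiplexed_bits(num_packets: int, period: int, pulse_offset: int) -> List[int]:
    """Marking bit for each packet: a step signal with one pulse per period."""
    # The pulse must sit strictly inside the period; at either edge it would
    # be indistinguishable from the step transition itself.
    if not 0 < pulse_offset < period - 1:
        raise ValueError("pulse_offset must be strictly inside the period")
    bits = []
    for i in range(num_packets):
        color = (i // period) % 2            # step: alternate the color per period
        is_pulse = (i % period) == pulse_offset
        bits.append(color ^ 1 if is_pulse else color)  # pulse: invert one packet
    return bits


bits = multiplexed_bits(num_packets=16, period=8, pulse_offset=3)
print("".join(str(b) for b in bits))
```

In the output, each run of equal bits carries the loss measurement (step), and the lone inverted bit inside each run marks the packet used for delay measurement (pulse).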

SLIDE 17

Design and Implementation of AM-PM

  • Match-Action Lookup: TCAM / Exact match / P4
  • State: Detect first packet (pulse/step)
  • Time-as-a-match: TimeFlip

SLIDE 18

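Time-as-a-match: TimeFlip [MRM]

[Figure: a TCAM entry matches on the switch header / metadata fields (Time, field2, field3, field4, …) and selects an action. The Time field is split into Time.Sec and Time.Frac; an entry with one bit set to 1 and all other bits wildcarded (1 * … * * … *) matches a periodic range of time. The time axis is labeled "1 second".]

[MRM] Mizrahi, Rottenstreich, Moses, INFOCOM 2015.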
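The time-as-a-match idea can be sketched in Python as a ternary (value/mask) comparison on a timestamp. The 32-bit width of Time.Frac and the choice of the least significant bit of Time.Sec as the matched bit are assumptions made for illustration; the slide does not specify them:

```python
FRAC_BITS = 32  # assumed width of Time.Frac; not given on the slide


def time_key(sec: int, frac: int) -> int:
    """Concatenate Time.Sec and Time.Frac into a single match key."""
    return (sec << FRAC_BITS) | frac


def ternary_match(key: int, value: int, mask: int) -> bool:
    """TCAM-style match: only bits set in `mask` are compared."""
    return (key & mask) == (value & mask)


# One entry: lowest bit of Time.Sec must be 1, every other bit is a wildcard.
# It matches during every odd second, i.e. a periodic range of time.
ODD_SECOND_MASK = 1 << FRAC_BITS
ODD_SECOND_VALUE = 1 << FRAC_BITS

in_range = [
    ternary_match(time_key(sec, 0), ODD_SECOND_VALUE, ODD_SECOND_MASK)
    for sec in range(4)
]
print(in_range)
```

A single entry therefore covers an unbounded, repeating set of time intervals, which is what makes time usable as an ordinary match field.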

SLIDE 19

Design and Implementation of AM-PM: Step/Pulse

  • Match-Action Lookup: TCAM / Exact match / P4
  • State: Detect first packet (pulse/step)
  • Time-as-a-match: TimeFlip

SLIDE 20

Multiplexed Marking: a Naïve Implementation

[Figure: marking bit plotted over time.]

Track the value of the marking bit.

  • Detect pulse: when the value changes for one packet.
  • Detect step: when the value changes for more than one packet.

Non-trivial to implement using a match-action abstraction.
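To make the difficulty concrete, here is a small Python sketch of the naïve stateful detector described above. It assumes in-order delivery with no loss, and the function name is hypothetical. Note that it needs both remembered state (the current color) and a one-packet look-ahead, neither of which maps naturally onto a single stateless table lookup:

```python
from typing import List, Tuple


def classify_changes(bits: List[int]) -> List[Tuple[int, str]]:
    """Return (packet index, 'pulse' | 'step') for each change in the marking bit."""
    events: List[Tuple[int, str]] = []
    if not bits:
        return events
    current = bits[0]  # state: the color currently in use
    i = 1
    while i < len(bits):
        if bits[i] != current:
            if i + 1 >= len(bits):
                break  # last packet changed: cannot be classified yet
            if bits[i + 1] == current:
                # Value changed for exactly one packet: a pulse.
                events.append((i, "pulse"))
                i += 2
                continue
            # Value changed for more than one packet: a step.
            events.append((i, "step"))
            current = bits[i]
        i += 1
    return events


events = classify_changes([int(c) for c in "0001000011101111"])
print(events)
```

Every change has to wait for the following packet before it can be classified, so the decision for one packet depends on another packet's header.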

SLIDE 21

Our Approach: Time-multiplexed Parsing

  • TimeFlip is used to divide time into time slots.
  • The marking bit has a different interpretation in each time slot.
  • Requires rough time synchronization, e.g., ~ 1 second.

Header field(s) have a different interpretation in each time slot!

[Figure: the time axis is divided into eight time slots, 000 through 111, with the marking bit plotted over time. Time slots are labeled Detect step, Detect ‘1’ pulse, and Detect ‘0’ pulse.]
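A hedged Python sketch of the idea: a priority-ordered ternary table keyed on (time slot, marking bit), so that the same bit value yields a different action depending on the slot. The slot-to-interpretation mapping below is hypothetical and chosen only for illustration (color 0 in slots 000 to 011, color 1 in slots 100 to 111, pulses expected only in the two middle slots of each half); the actual assignment used in the talk is not recoverable from the slide text. `SLOT_SECONDS` and the action names are likewise assumptions:

```python
SLOT_SECONDS = 1  # assumed slot length; hypothetical


def slot_of(time_sec: int) -> int:
    """3-bit time slot derived from the time of day."""
    return (time_sec // SLOT_SECONDS) & 0b111


# Key layout: 3 slot bits followed by the marking bit. First match wins.
# Each entry is (value, mask, action), as in a TCAM.
TABLE = [
    (0b0011, 0b1111, "pulse_1"),        # slot 001, bit 1: '1' pulse
    (0b0101, 0b1111, "pulse_1"),        # slot 010, bit 1: '1' pulse
    (0b1010, 0b1111, "pulse_0"),        # slot 101, bit 0: '0' pulse
    (0b1100, 0b1111, "pulse_0"),        # slot 110, bit 0: '0' pulse
    (0b0000, 0b0001, "count_color_0"),  # any other slot, bit 0: step traffic
    (0b0001, 0b0001, "count_color_1"),  # any other slot, bit 1: step traffic
]


def lookup(slot: int, marking_bit: int) -> str:
    """Stateless lookup: the slot decides how the marking bit is interpreted."""
    key = (slot << 1) | marking_bit
    for value, mask, action in TABLE:
        if (key & mask) == value:
            return action
    raise LookupError("no matching entry")


print(lookup(slot_of(10), 1))  # slot 010, bit 1
print(lookup(0b100, 1))        # slot 100, bit 1
```

In this sketch the slots next to a color change (000, 011, 100, 111) never interpret a bit as a pulse, which leaves room for clock skew between switches and is one way to read the rough time synchronization requirement.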

SLIDE 23

AM-PM Evaluation using Marvell Prestera Switches

Loss and delay → congestion is detected

Traffic Generator

Management Monitored data flow Background traffic Switch 1 Switch 2

SLIDE 24

Software Implementation using P4

S1 S2 S3 H1 H2 Server

  • Implemented in P4.
  • Time-of-day match field.
  • AM-PM in P4.
  • Tested in Mininet.
  • Open source code.
SLIDE 25

AM-PM: Where is it going?


Ongoing AM-PM work in the IETF:

  • QUIC
  • MPLS
  • NSH
  • BIER
  • Geneve

AM-PM is under discussion in 6 working groups in the IETF…

[Figure labels: Network telemetry, Low overhead, AM-PM]

SLIDE 26

Large Scale Deployment in Telecom Italia


  • Mobile backhaul network ~ 1000 eNodeBs.
  • AM-PM one bit (step-based) loss measurement.
  • Uses unused bit in DSCP.
  • Off-the-shelf network equipment.
SLIDE 27

Summary


  • Design and implementation of AM-PM
  • Hardware-based implementation using a Marvell switch.
  • Experimental results
  • Software-based implementation in P4 – open source.
  • Novel time-multiplexed parsing

[Figure: time axis divided into time slots 000 through 111, with the marking bit plotted over time.]

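table look_for_flag {
    reads {
        // Time-as-a-match: ternary match on the time of day.
        intrinsic_metadata.time_of_day : ternary;
        // Exact match on the marking bit carried in the IPv4 header.
        ipv4.flag_a : exact;
    }
    actions {
        _look_for_flag;
        _drop;
    }
    size: 256;
}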

SLIDE 28


Thanks!

SLIDE 29

References

[1] Fioccola, G., Capello, A., Cociglio, M., Castaldelli, L., Chen, M., Zheng, L., Mirsky, G., and T. Mizrahi, “Alternate Marking method for passive and hybrid performance monitoring”, RFC 8321, 2018.

[2] Mizrahi, T., Arad, C., Fioccola, G., Cociglio, M., Chen, M., Zheng, L., and G. Mirsky, “Compact Alternate Marking Methods for Passive and Hybrid Performance Monitoring”, draft-mizrahi-ippm-compact-alternate-marking, work in progress, IETF, 2019.

[3] Brockners, F., Bhandari, S., Pignataro, C., Gredler, H., Leddy, J., Youell, S., Mizrahi, T., Mozes, D., Lapukhov, P., Chang, R., Bernier, D., and J. Lemon, “Data Fields for In-situ OAM”, draft-ietf-ippm-ioam-data, work in progress, 2019.

[4] Kim, C., et al., “In-band network telemetry (INT)”, P4 consortium, 2015.

[5] Mizrahi, T., Vovnoboy, V., Nisim, M., Navon, G., and A. Soffer, “Network Telemetry Solutions for Data Center and Enterprise Networks”, Marvell white paper, 2018.

[6] Mizrahi, T., Rottenstreich, O., and Y. Moses, “TimeFlip: Scheduling Network Updates with Timestamp-based TCAM Ranges”, IEEE INFOCOM, 2015.

[7] Mizrahi, T., Navon, G., Fioccola, G., Cociglio, M., Chen, M., and G. Mirsky, “AM-PM: Efficient Network Telemetry using Alternate Marking”, IEEE Network, 2019.

[8] Riesenberg, A., Kirzon, Y., Bunin, M., Galili, E., Navon, G., and T. Mizrahi, “Time-Multiplexed Parsing in Marking-based Network Telemetry”, ACM SYSTOR, 2019.

[9] P4 AM-PM, https://github.com/AlternateMarkingP4/FlaseClase, 2018.