 
High Precision Based Network Performance Monitoring in Critical Infrastructures
Presented by Rik Boelee
26-9-2019
Agenda
• High Precision Based Network Performance Monitoring
• Why is it needed
• Benefits
• Customer cases
Challenge in assuring five nines network availability
• Network downtime is exceptionally costly for Communications Service Providers (CSPs), topping some €13 billion each year.
• Telecom operators experience an average of five to six severe outages per year (nearly one every other month), despite great efforts to address the problem.
• Current Network Monitoring and Service Assurance solutions on the market have clearly failed to mitigate the problem: they are inaccurate and reactive, and they do not provide full network visibility.
• Aside from the costs of these “solutions”, CSPs spend between 1.5% and 5% of their annual revenue fixing network issues.
• Customers are demanding stricter SLAs, with latency guarantees.
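As a rough illustration of what "five nines" leaves as a downtime budget, the arithmetic can be sketched as follows. This assumes a 365-day year; the function name is illustrative and not part of any product.

```python
def downtime_budget_minutes(availability_percent: float, days: float = 365.0) -> float:
    """Return the maximum downtime, in minutes, allowed over `days` days."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return unavailable_fraction * days * 24 * 60


if __name__ == "__main__":
    # Compare three, four and five nines over an assumed 365-day year.
    for target in (99.9, 99.99, 99.999):
        print(f"{target}% -> {downtime_budget_minutes(target):.2f} min/year")
```

At five nines the budget is only a few minutes per year, so a single severe outage can consume it entirely.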
High Precision Network Monitoring
Active (Network Performance Focus) | Passive (Application, User Data Focus)
Diagram labels: Taps, Packet Brokers, Active Measurements, Latency, Video and Voice Quality, Jitter, Financial Applications (HFT), Data Delivery, Core Sniffers, End to End, Cloud, Service Level Agreements, DCI
Precision Based Network Performance Monitoring

Probe-to-probe measurements
• End-to-End Latency monitoring
• Segmented view
• Ring & Mesh topologies
• NetPrecision test for one-way loss, one-way delay, one-way delay variation
• TrueTCP Service Activation Testing on L2/L3 and L4-L7
• Easy-to-use Test Topology Designer

KPI collection: differentiating requirements
• Down to 1 ms sampling rate to catch microbursts
• Microsecond accurate HW timestamping
• Unique service provisioning capabilities
• Add’l KPIs: MOS & DSCP changed
• LAG & Equal Cost Multipath Routing supported
• Microsecond and ppm level reporting granularities

Diagram labels: Test Topology Designer, Measurements, vProbe

Granular high-accuracy visibility into any network
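To see why a 1 ms sampling rate matters for microbursts, the sketch below averages the same traffic over 1 ms and over 1 s windows. All numbers are made up for illustration (a 1 Gbit/s link, a 10% baseline load and a 5 ms burst at line rate), and the helper name is hypothetical.

```python
def utilization_per_window(bytes_per_ms, window_ms, link_bytes_per_ms):
    """Average link utilization (0..1) for consecutive windows of `window_ms` ms."""
    result = []
    for start in range(0, len(bytes_per_ms), window_ms):
        chunk = bytes_per_ms[start:start + window_ms]
        result.append(sum(chunk) / (len(chunk) * link_bytes_per_ms))
    return result


# Assumed 1 Gbit/s link: 125,000 bytes can be sent per millisecond.
LINK_BYTES_PER_MS = 125_000

# One second of traffic at 10% load, with a 5 ms burst at line rate.
traffic = [12_500] * 1000
for ms in range(500, 505):
    traffic[ms] = LINK_BYTES_PER_MS

coarse = utilization_per_window(traffic, 1000, LINK_BYTES_PER_MS)  # 1 s windows
fine = utilization_per_window(traffic, 1, LINK_BYTES_PER_MS)       # 1 ms windows

print(f"1 s average utilization: {coarse[0]:.4f}")
print(f"1 ms peak utilization:   {max(fine):.4f}")
```

With one-second averaging the burst barely moves the reported utilization, while the 1 ms view shows the link fully saturated for the duration of the burst.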
Network Precision measurements
• With Network Precision measurement, you can monitor end-to-end segments and topologies such as mesh, star and ring.
• TrueTCP (RFC 6349) and Y.1564 Service Activation Testing provide a toolset for service activation and bandwidth testing on all layers (L2-L7).
• Hardware Timestamp Engine for 1 microsecond accuracy
• L2 to L7 Service Activation Testing and Troubleshooting
• Enables localized and high-frequency SNMP polling for granular bandwidth monitoring etc.
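The one-way metrics named on these slides (loss, delay, delay variation) can be derived from per-packet send and receive timestamps. The following is a minimal sketch, not the NetPrecision implementation: it assumes both probes have synchronized clocks, that timestamps are in microseconds keyed by sequence number, and it takes delay variation as the absolute delay difference between consecutive received packets. All names are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, Optional


@dataclass
class OneWayStats:
    sent: int
    received: int
    loss_ratio: float
    mean_delay_us: Optional[float]
    mean_abs_ipdv_us: Optional[float]
    max_abs_ipdv_us: Optional[float]


def one_way_stats(tx_us: Dict[int, int], rx_us: Dict[int, int]) -> OneWayStats:
    """Compute one-way loss, delay and delay variation.

    tx_us / rx_us map sequence number -> timestamp in microseconds.
    Assumes sender and receiver clocks are synchronized.
    """
    # One-way delay for every packet that actually arrived, in send order.
    delays = [rx_us[seq] - tx_us[seq] for seq in sorted(tx_us) if seq in rx_us]
    # Delay variation between consecutive received packets.
    ipdv = [abs(later - earlier) for earlier, later in zip(delays, delays[1:])]
    sent = len(tx_us)
    received = len(delays)
    return OneWayStats(
        sent=sent,
        received=received,
        loss_ratio=(sent - received) / sent if sent else 0.0,
        mean_delay_us=mean(delays) if delays else None,
        mean_abs_ipdv_us=mean(ipdv) if ipdv else None,
        max_abs_ipdv_us=max(ipdv) if ipdv else None,
    )


if __name__ == "__main__":
    # Four packets sent 1 ms apart; packet 3 is lost.
    tx = {1: 0, 2: 1000, 3: 2000, 4: 3000}
    rx = {1: 500, 2: 1510, 4: 3495}
    print(one_way_stats(tx, rx))
```

Because each value is a difference of two timestamps taken on different devices, the accuracy of the result is bounded by the timestamping and clock synchronization accuracy, which is why hardware timestamping matters here.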
Probe Originated Measurements
Diagram labels: Probe, vProbe, Probe

Precision Test
• High Accuracy L3 measurement
• Segmented
• Mesh/Partial Mesh
• Delay (one-way, two-way), Delay Variation, Packet Loss, Percentiles
• High PPS

Continuous active monitoring
• NetPrecision
• TWAMP
• UDP Echo
• Y.1731
• TCP Connect

Passive Measurements
• SNMP Bandwidth collection
• Generic SNMP data
• PCAP packet capture

Throughput Tests
• Y.1564 SAT
• RFC 2544
• TrueTCP (RFC 6349)
• iPerf3 TCP, UDP, SCTP
NetPre End-to-End | Multi-Service
Diagram labels: HQ/Data Center, Customer Premises, Hardware probe (at each end), End-to-End NetPre, Multi-Service Stats

• NetPre 1 (VLAN 100, DSCP0)
• NetPre 2 (VLAN 200, DSCP46)
• NetPre 3 (VLAN 300, DSCP48)
• NetPre 4 (VLAN 400, DSCP52)

• One-Way Latency
• One-Way Loss
• One-Way Delay Variation
• Two-Way Metrics
Latency Management Topologies
Problems providers are trying to solve
• Need to know about outages better than customers do
• Today L2VPN customers might see an outage (BFD/LACP) that you are not aware of
• Need to know how LSP metric changes affect the whole network
• Did the change make things better or worse when looking at the big picture?
• Need to know how new software and new hardware affect network performance
• Need a simple, high-level single number to track over time, to ensure network quality is stable or getting better
Additional use cases
• Use the solution's high-precision data for customer portal reporting
• Use performance data to trigger LSP reroutes when performance issues occur (REST/RESTCONF APIs)
• Optimize buffering, since the “jitter” in the network is known precisely
• Use virtual probes in major data centers, also to gain coarse visibility into external networks
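The reroute use case above implies some decision logic between the measurement data and the API call. A minimal sketch of such logic follows, assuming latency samples in milliseconds and using hysteresis so that a single spike does not cause route flapping. The class, thresholds and event names are hypothetical, and the actual REST/RESTCONF call is deliberately left out because its shape depends on the controller in use.

```python
from typing import Optional


class RerouteTrigger:
    """Emit 'reroute' / 'restore' events from a stream of latency samples.

    Hypothetical helper: thresholds and the consecutive-sample count are
    assumptions to be tuned per network.
    """

    def __init__(self, raise_ms: float, clear_ms: float, consecutive: int = 3) -> None:
        if clear_ms > raise_ms:
            raise ValueError("clear_ms must not exceed raise_ms")
        self.raise_ms = raise_ms
        self.clear_ms = clear_ms
        self.consecutive = consecutive
        self.degraded = False
        self._streak = 0

    def update(self, latency_ms: float) -> Optional[str]:
        """Feed one sample; return an event name when the state changes."""
        if self.degraded:
            crossing = latency_ms < self.clear_ms   # waiting for recovery
        else:
            crossing = latency_ms > self.raise_ms   # waiting for degradation
        self._streak = self._streak + 1 if crossing else 0
        if self._streak < self.consecutive:
            return None
        self._streak = 0
        self.degraded = not self.degraded
        return "reroute" if self.degraded else "restore"


if __name__ == "__main__":
    trigger = RerouteTrigger(raise_ms=115.0, clear_ms=110.0, consecutive=3)
    for sample in [108, 116, 116, 109, 116, 116, 116, 112, 109, 109, 109]:
        event = trigger.update(sample)
        if event:
            # This is where a call to the controller API would go.
            print(f"latency {sample} ms -> {event}")
```

Using separate raise and clear thresholds keeps the path from switching back as soon as latency dips just below the alarm level.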
Economic benefits
• Protecting the business (avoiding churn): the annual revenue from one major customer can be millions
• Building trust by visualizing the network from the customer's perspective (outage, delay, jitter)
• Quote from a global transit provider: the customer portal is important for a customer with 35-40 sites and annual revenue of €2-3M
• Saves personnel costs
Use Cases
NetPre:
• End-to-End: Cloud Connectivity, Business Extensive IP/Ethernet, Data Center Interconnect...
• Core Mesh: Provider Core, Data Center Interconnect
• Ring & Segmented: Metro / Access, Data Center Top of Racks
Latency Management
Probe-to-Probe One-Way Visibility
Customer case: Problem Statement
Backbone Network Monitoring

• Part of a Tier-1 operator group
• Customer was not satisfied with inbuilt router measurements because they were inaccurate and unreliable
• Management solution did not provide the necessary view of the whole network

Example:
• Experienced quality issues, but packet loss was reported to be zero

Problem Statement
• Need to know about outages better than customers do
• Today L2VPN customers might see an outage (BFD/LACP) that the service provider is not aware of
• Cannot discriminate whether the customer or the provider caused the issue
• Need to know how IP/MPLS metric changes affect the whole network
• Did the change make things better or worse when looking at the big picture?
• Need to know how new software and new hardware affect network performance
• Need a simple, high-level single number to track over time, to ensure network quality is stable or getting better

Do Not Operate the Network Blind
Customer case: Validation
Backbone Network Monitoring
ams -> chcg L2VPN Latency and PacketLoss (daily averaged)

Date        Latency (ms)  Latency (ms)  PacketLoss (%)  PacketLoss (%)
            Creanord      Indicative    Creanord        Indicative
11.2.2018   107,9046833   100,5173611   0               0
12.2.2018   108,1344165   101,4479167   0,000127315     0
13.2.2018   108,2146514   99,45138889   8,10185E-05     0
14.2.2018   107,9063917   102,0590278   0,000300926     0
15.2.2018   107,9068383   100,8090278   0               0
16.2.2018   107,9068715   101,0798611   0               0
17.2.2018   107,9063826   101,7708333   0               0
18.2.2018   107,9064528   101,09375     0               0
19.2.2018   107,9073111   100,1875      0,000150463     0
20.2.2018   107,9119171   102,1840278   0               0
21.2.2018   107,1188285   105,8680556   0,03650463      0
22.2.2018   110,540758    104,1805556   0               0
23.2.2018   116,2365895   103,6631944   0,000138889     0
24.2.2018   116,4771983   102,0034722   9,25926E-05     0
25.2.2018   116,2009374   102,3576389   0,002060185     0
26.2.2018   116,0155176   103,2083333   0,002071759     0
27.2.2018   114,8006412   102,9166667   0,000173611     0
28.2.2018   108,3591696   112,3541667   0,000162037     0
1.3.2018    116,5418087   109,3645833   0               0
2.3.2018    115,9532056   106,7881944   0,57210223      0

• 10% difference in accuracy (latency)
• Existing product was showing 0% packet loss
• In business-critical applications, a continuous ~10 ms difference really matters
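The latency gap between the two measurement sources can be checked with simple arithmetic. The sketch below uses two rows from the validation data (11.2.2018 and 23.2.2018), with decimal commas converted to decimal points, and treats the Creanord value as the reference; that choice of reference and the function name are assumptions made for illustration.

```python
def percent_difference(reference: float, other: float) -> float:
    """Signed difference of `other` from `reference`, in percent of `reference`."""
    return (other - reference) / reference * 100.0


# (date, Creanord latency in ms, indicative latency in ms)
samples = [
    ("11.2.2018", 107.9046833, 100.5173611),
    ("23.2.2018", 116.2365895, 103.6631944),
]

for date, creanord_ms, indicative_ms in samples:
    diff = percent_difference(creanord_ms, indicative_ms)
    gap_ms = creanord_ms - indicative_ms
    print(f"{date}: indicative differs by {diff:.1f}% ({gap_ms:.1f} ms gap)")
```

A negative result means the indicative measurement under-reports latency relative to the reference.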
Customer case: Validation
Backbone Network Monitoring
ams -> chcg L2VPN TrueJitter and MaxJitter (daily averaged)

Date        TrueJitter (µs)  TrueJitter (µs)  MaxJitter (µs)  MaxJitter (µs)
            Creanord         Indicative       Creanord        Indicative
            (owfeipdvavg)                     (owfeipdvmax)
11.2.2018   2,584953705      164,656309       11,67812502     944,4444444
12.2.2018   2,774305558      147,2952257      12,98460648     937,5
13.2.2018   2,781944443      152,5949271      13,06932874     958,3333333
14.2.2018   2,751504627      154,6051597      18,10277783     934,0277778
15.2.2018   2,814814816      154,2396563      13,25613425     944,4444444
16.2.2018   2,714004629      151,498441       12,55879632     965,2777778
17.2.2018   2,660995369      157,5291354      11,79282402     958,3333333
18.2.2018   2,594328703      140,1680417      11,58310184     930,5555556
19.2.2018   2,660185191      164,2908368      12,78483797     937,5
20.2.2018   2,751736113      168,4940313      13,03194442     975,6944444
21.2.2018   2,71613757       156,9808819      16,58849211     954,8611111
22.2.2018   2,676620372      161,0013785      15,19537034     954,8611111
23.2.2018   2,639351851      156,7981389      15,66712965     954,8611111
24.2.2018   2,553703704      156,9808924      11,92037035     972,2222222
25.2.2018   2,474652779      150,4019444      11,85937501     951,3888889
26.2.2018   2,693287036      152,9604063      13,21921296     958,3333333
27.2.2018   2,757986114      142,7265139      13,12812497     979,1666667
28.2.2018   2,802199079      147,2952118      13,0806713      961,8055556
1.3.2018    2,879745375      148,9399375      16,02349543     965,2777778
2.3.2018    2,978977933      145,8332465      15,13822801     968,75

• Almost 50 X difference in jitter
• Almost 70 X difference in jitter (max)
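The order of magnitude of the jitter gap can be reproduced by comparing mean values from the two sources. The sketch below uses only the first five daily TrueJitter averages (11.2.2018 to 15.2.2018), with decimal commas converted to decimal points, so it illustrates the calculation and does not recompute the figure for the whole period. The function name is hypothetical.

```python
from statistics import mean

# First five daily TrueJitter averages, in microseconds.
creanord_jitter_us = [2.584953705, 2.774305558, 2.781944443, 2.751504627, 2.814814816]
indicative_jitter_us = [164.656309, 147.2952257, 152.5949271, 154.6051597, 154.2396563]


def ratio_of_means(reference, other):
    """How many times larger the mean of `other` is than the mean of `reference`."""
    return mean(other) / mean(reference)


jitter_ratio = ratio_of_means(creanord_jitter_us, indicative_jitter_us)
print(f"Indicative / Creanord mean jitter: {jitter_ratio:.1f}x")
```

The same function applies to the MaxJitter columns if the corresponding values are substituted.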