Metrics and models for Web performance evaluation or, How to measure SpeedIndex from raw encrypted packets, and why it matters - PowerPoint PPT Presentation




SLIDE 1

Dario Rossi <dario.rossi@huawei.com>, Director, DataCom Lab (*)
(*) Data Communication Network Algorithm and Measurement Technology Laboratory

This talk

Metrics and models for Web performance evaluation

or, How to measure SpeedIndex from raw encrypted packets, and why it matters

QoE = f(QoS)

Huawei Brussels, Feb 1st 2020

SLIDE 2

Dario Rossi and, in alphabetical order, Alemnew Asrese, Alexis Huet, Diego Da Hora, Enrico Bocchi, Flavia Salutari, Florian Metzger, Gilles Dubuc, Hao Shi, Jinchun Xu, Luca De Cicco, Marco Mellia, Matteo Varvello, Renata Teixeira, Tobias Hossfeld, Shengming Cai, Vassilis Christophides, Zied Ben Houidi

This talk

Metrics and models for Web performance evaluation

or, How to measure SpeedIndex from raw encrypted packets, and why it matters

QoE = f(QoS)

SLIDE 3

Offering good user QoE is a common goal

[Diagram: user, ISP, Internet]

SLIDE 4

For ISPs/vendors, encryption makes the inference harder

[Diagram: the ISP sits between user and Internet; it observes Net QoS and App QoS in the network, but not the User QoE at the endpoint]

Detecting/forecasting/preventing QoE degradation is important!

SLIDE 5

Quality at different layers: Application, Network, User, Context

Network QoS affects Application QoS, which influences User QoE, together with human, system, and context influence factors.

31/01/2020

SLIDE 6

Quality at different layers: Application, Network, User, Context

Network QoS (latency, bandwidth, packet loss) affects Application QoS (page load time (PLT), video bitrate, SpeedIndex), which influences User QoE (mean opinion score (MOS), user PLT (uPLT), engagement metrics), in a context of device type, location, Wi-Fi quality, and activity.

SLIDE 7

Quality at different layers: Application, Network, User, Context

Network QoS (latency, bandwidth, packet loss) affects Application QoS, which influences User QoE (MOS, uPLT), in a context of device type, location, Wi-Fi quality, and activity.

At the application layer: HTTP/2, QUIC… (true for any other apps)

SLIDE 9

Agenda

Metrics and models for Web performance evaluation

or, How to measure SpeedIndex from raw encrypted packets, and why it matters

  • 1. Data collection (Crowdsourcing campaign)
  • 2. Models (Data driven vs Expert models)
  • 3. Browser metrics (Instant vs Integral vs Compound)
  • 4. Method (From raw encrypted packets)

From Network QoS to User QoE

SLIDE 10

  • Mean opinion score (MOS): “Rate your experience from 1-poor to 5-excellent”
  • User-perceived PLT (uPLT): “Which of these two pages finished first?”
  • User acceptance: “Did the page load fast enough?” (Yes/No)

Data collection: Crowdsourcing campaigns

Lab experiments

  • Small user diversity, volunteers
  • Web browsing, but artificial websites
  • Artificial controlled conditions

Crowdsourcing (paid crowdworkers)

  • Larger userbase, but higher noise
  • Side-by-side videos ≠ Web browsing!
  • Artificial controlled conditions

Experiments from an operational website

  • Actual service users
  • Browsing in typical user conditions
  • Huge heterogeneity (devices/browsers/nets)

Ongoing, with award-winning dataset [PAM18]; collaboration in [WWW19]

https://webqoe.telecom-paristech.fr/data

SLIDE 11

Models: Data driven vs Expert models

Expert models: fit a predetermined y = f(x)

  • x = single scalar metric, generally Page Load Time (PLT)
  • f(.) pre-selected by the expert: Weber-Fechner law, IQX Hypothesis [1], ITU-T G.1030 standard (https://www.itu.int/rec/T-REC-G.1030/en)

Data-driven models: learn y = f(x)

  • x = vector of input features, y = user QoE
  • optimal f(.) selected & tuned by machine learning

[1] M. Fiedler et al., “A generic quantitative relationship between quality of experience and quality of service”, IEEE Network, 2010

[INFOCOM19]: more flexible and (slightly) more accurate. [PAM18]: still room for improvement (see [WWW19]). Comparison of the two models in [QoMEX-18].

https://webqoe.telecom-paristech.fr/models
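To make the expert-model family concrete, here is a minimal sketch (not from the talk) that fits the IQX exponential hypothesis, QoE = a·exp(−b·PLT) + c, to (PLT, MOS) samples with scipy; the data below is synthetic and for illustration only.

```python
# Sketch: fitting the IQX exponential hypothesis QoE = a*exp(-b*PLT) + c
# to (PLT, MOS) samples. Data is synthetic, for illustration only.
import numpy as np
from scipy.optimize import curve_fit

def iqx(plt_s, a, b, c):
    """IQX: exponential relationship between QoS (here PLT) and QoE (MOS)."""
    return a * np.exp(-b * plt_s) + c

# Synthetic samples: MOS decays from ~5 (excellent) toward ~1 (poor)
plt_s = np.linspace(0.5, 10, 50)           # page load time, seconds
mos = 4.0 * np.exp(-0.35 * plt_s) + 1.0    # ground-truth relationship
mos += np.random.default_rng(0).normal(0, 0.1, mos.size)  # rating noise

(a, b, c), _ = curve_fit(iqx, plt_s, mos, p0=(4, 0.3, 1))
print(f"fitted IQX: QoE = {a:.2f}*exp(-{b:.2f}*PLT) + {c:.2f}")
```

A data-driven model would instead learn f(.) itself, from a feature vector rather than the single PLT scalar.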

SLIDE 12

Browser metrics: Time Instant vs Time Integral (1/2)

[Plot: visual progress x(t) from 0 to 1 over time t for www.iceberg.com, with milestones TB, DOM, ATF, PLT]

t = DOM: page structure loaded

* Images by vvstudio, vectorpocket, Ydlabs / Freepik

SLIDE 13

Browser metrics: Time Instant vs Time Integral (1/2)

[Plot: visual progress over time, milestones DOM, ATF, PLT]

t = ATF: visible portion (aka Above the Fold) loaded

SLIDE 14

Browser metrics: Time Instant vs Time Integral (1/2)

[Plot: visual progress x(t) over time, milestones DOM, ATF, PLT]

t = ATF: visible portion (aka Above the Fold) loaded

SpeedIndex = ∫ (1 − x(t)) dt
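All metrics of this family share the same time-integral form, ∫ (1 − x(t)) dt over the loading. A minimal sketch of how such an index can be computed from sampled progress values by trapezoidal integration; the sample points below are hypothetical.

```python
# Sketch: a time-integral metric of the SpeedIndex family,
#   Index = integral of (1 - x(t)) dt,
# computed by the trapezoidal rule from sampled progress x(t) in [0, 1].
def time_integral_index(t, x):
    """Integrate (1 - x(t)) over t; lower is better (content shown earlier)."""
    area = 0.0
    for i in range(1, len(t)):
        # trapezoid between consecutive samples
        area += (t[i] - t[i - 1]) * (1.0 - (x[i] + x[i - 1]) / 2.0)
    return area

# Hypothetical progress: DOM at 1s (40%), ATF at 2s (80%), PLT at 4s (100%)
t = [0.0, 1.0, 2.0, 4.0]
x = [0.0, 0.4, 0.8, 1.0]
print(time_integral_index(t, x))  # → 1.4
```

With x(t) taken as visual completeness this is SpeedIndex; with the fraction of objects or bytes downloaded it becomes ObjectIndex or ByteIndex.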

SLIDE 16

Browser metrics: Time Instant vs Time Integral (1/2)

[Plot: visual progress x(t) over time, milestones DOM, ATF, PLT]

t = PLT: all page content loaded

SpeedIndex = ∫ (1 − x(t)) dt

SLIDE 17

Browser metrics: Time Instant vs Time Integral (2/2)

SpeedIndex: % of visual completeness (histogram, rectangles or SSIM)
ObjectIndex: % of objects downloaded
ByteIndex: % of bytes downloaded
ImageIndex: % of bytes of images

SpeedIndex, RUMSI, PSSI
  › Processing intensive
  › Only at L7 (in browser)
  › Visual progress metric

ObjectIndex, ByteIndex and ImageIndex
  › Lightweight
  › ByteIndex also at L3 (in network)
  › Highly correlated with SpeedIndex
  › Possibly far from user QoE?
SLIDE 18

Browser metrics: Time Instant vs Time Integral (2/2)

[Plot: two progress curves x(t) and x'(t) with the same PLT but slower loading for x'(t): time-instant metrics coincide, time-integral metrics differ]

SLIDE 19

Browser metrics: Time Instant vs Time Integral (2/2)

[Plot: the different metrics (SpeedIndex, ObjectIndex, ByteIndex, ImageIndex) use different cutoffs, hence yield different values on the same loading]

SLIDE 20

Method: From raw packets to browser metrics (1/2)

[Diagram: a single session to domain x.com downloads individual objects (img1, img2, css, js, htm); from their progress, x(t) and SpeedIndex = ∫ (1 − x(t)) dt can be computed in the browser]

SLIDE 21

Method: From raw packets to browser metrics (1/2)

[Diagram: at the network, the same session is only visible as a stream of encrypted packets over time; objects and browser metrics are not directly observable (??!)]

SLIDE 22

Method: From raw packets to browser metrics (1/2)

[Diagram: per-session packet traces are used to train ML models (XGBoost, 1D-CNN) that predict browser metrics such as SpeedIndex = ∫ (1 − x(t)) dt]
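As a rough illustration of the learning step (not the talk's actual pipeline, features, or data), the sketch below regresses a target metric from synthetic per-session flow features, using scikit-learn's gradient boosting as a stand-in for XGBoost.

```python
# Sketch: regressing a browser metric from flow-level features, in the
# spirit of the talk's XGBoost models. scikit-learn's gradient boosting
# is used as a stand-in; features and data are synthetic, for illustration.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
n = 500
# Per-session features an ISP could compute from encrypted packets:
# total bytes, packet count, duration (s), mean inter-arrival time (s)
X = rng.uniform([1e4, 10, 0.5, 0.001], [1e6, 1000, 10.0, 0.1], size=(n, 4))
# Synthetic target: an integral-style metric growing with duration/sparsity
y = 0.5 * X[:, 2] + 5.0 * X[:, 3] + rng.normal(0, 0.05, n)

model = GradientBoostingRegressor(random_state=0).fit(X[:400], y[:400])
r2 = model.score(X[400:], y[400:])  # held-out R^2
print(f"held-out R^2: {r2:.2f}")
```

A 1D-CNN variant would instead consume the raw per-packet time series directly, trading interpretability for fewer hand-crafted features.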

SLIDE 24

Method: From raw packets to browser metrics (2/2)

[Diagram: Browser (L7) / Network (L3) / User]

  • Works with encryption
  • Handles multi-sessions (not in this talk)
  • Exact online algorithm for ByteIndex
  • Machine learning for any metric
  • Accurate on joint tests with Orange
  • Accurate for unseen pages & networks
  • Available soon in Huawei products
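The exact ByteIndex computation boils down to integrating the byte-completion step function over time, which works at L3 even under encryption since payload sizes and timestamps remain visible. A minimal sketch under that assumption; the packet trace is hypothetical.

```python
# Sketch: computing ByteIndex directly from packet arrivals.
#   ByteIndex = integral of (1 - bytes_so_far/total_bytes) dt
# Byte progress is a step function, so the integral is exact.
def byte_index(packets):
    """packets: list of (timestamp_s, payload_bytes), sorted by time."""
    total = sum(size for _, size in packets)
    done, prev_t, area = 0, packets[0][0], 0.0
    for t, size in packets:
        # progress stayed at done/total since the previous packet
        area += (t - prev_t) * (1.0 - done / total)
        done += size
        prev_t = t
    return area

trace = [(0.0, 1500), (0.1, 1500), (0.5, 3000), (1.0, 6000)]
print(byte_index(trace))  # → 0.6375
```

Because each packet only updates a running sum, the same computation can run online, one packet at a time.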

SLIDE 25

Aftermath (1/3): From raw packets to rough sentiments

Modeling approaches

  • Expert-driven feature engineering
    › Explainable, but inherently heuristic approach
    › Hard to keep in sync with application/network changes
  • Neural networks
    › Less interpretable but more versatile
    › Downside: requires lots of samples...

Possible inputs

  › Feed the NN with the x(t) signal: still lightweight
  › Feed the NN with a filmstrip: more complex

Possible outputs

  › User feedback (e.g. MOS, user PLT, etc.)
  › Smartphone sensors (e.g. happiness estimation via facial recognition)
  › Brain signals acquired with sensors: activity of brain areas correlated with user happiness

SLIDE 26

Aftermath (2/3): Divide et impera

  • World Wild Web
    › Huge diversity, not captured by a single model
  • Increase accuracy
    › Per-page QoE models
    › Inherently non-scalable
  • Increase accuracy & scalability
    › Per-page QoE models (e.g. Alexa top 100 pages)
    › Aggregate QoE models (e.g. 100 clusters for the Alexa top 1M)
    › Generic QoE model (for the tail, up to 1B pages)

[Diagram: many per-page models vs one average model vs 100 clusters over the Alexa top 1M]

SLIDE 27

Aftermath (3/3): Keep collecting (and sharing) data

  • Other applications/players are doing this already! (Facebook, Skype, Android, Wikipedia, the physical world: “Did you find this suggestion useful?”)
  • Sustained, continuous user QoE indications benefit:
    › Useful samples for QoE management: assessment, troubleshooting, regression detection, etc.
    › A continuous stream of samples for improving QoE = f(QoS) models in the long run
  • Very limited downsides (risk of annoying users if leveraging small panels)

SLIDE 28

[SIGCOMM-19] A. Huet, Z. Ben Houidi, S. Cai, H. Shi, J. Xu, D. Rossi, “Web Quality of Experience from Encrypted Packets”, ACM SIGCOMM Demo, Aug. 2019
[INFOCOM-19] A. Huet, D. Rossi, “Explaining Web Users' QoE with Factorization Machines”, IEEE INFOCOM Demo, Apr. 2019
[WWW-19] F. Salutari, D. Da Hora, G. Dubuc, D. Rossi, “A Large Scale Study of Wikipedia Users' Quality of Experience”, Proc. WWW, 2019
[SIGCOMM-18] D. Da Hora, D. Rossi, V. Christophides, R. Teixeira, “A Practical Method for Measuring Web Above-the-Fold Time”, ACM SIGCOMM Demo, Aug. 2018
[QOMEX-18] T. Hossfeld, F. Metzger, D. Rossi, “Speed Index: Relating the Industrial Standard for User Perceived Web Performance to Web QoE”, Proc. QoMEX, Jun. 2018
[PAM-18] D. Da Hora, A. Asrese, V. Christophides, R. Teixeira, D. Rossi, “Narrowing the Gap between QoS Metrics and Web QoE Using Above-the-Fold Metrics”, Proc. PAM, 2018 (Best dataset award ★)
[PAM-17] E. Bocchi, L. De Cicco, M. Mellia, D. Rossi, “The Web, the Users, and the MOS: Influence of HTTP/2 on User Experience”, Proc. PAM, 2017
[SIGCOMM-QoE-16] E. Bocchi, L. De Cicco, D. Rossi, “Measuring the Quality of Experience of Web Users”, ACM SIGCOMM Internet-QoE Workshop, 2016 (Best paper award ★)

Documents, datasets, and code (9k real human grades, 10k automated experiments, 60k+ real user grades, Chrome plugin implementation): https://webqoe.telecom-paristech.fr/

SLIDE 29

Thanks for listening!