Live Video Analytics at Scale with Approximation and Delay-Tolerance
Live Video Analytics at Scale with Approximation and Delay-Tolerance
Haoyu Zhang, Ganesh Ananthanarayanan, Peter Bodik, Matthai Philipose, Paramvir Bahl, Michael J. Freedman
Video cameras are pervasive
Video analytics queries

- Intelligent Traffic System
- AMBER Alert
- Electronic Toll Collection
- Video Doorbell
Video query: a pipeline of transforms
decode → detect object → track object → count object
- Vision algorithms chained together
- Example: traffic counter pipeline
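The chained-transform structure can be sketched in a few lines of Python. The transform names mirror the slide (decode, detect, track, count), but the bodies are toy stand-ins for illustration, not VideoStorm's implementation.

```python
# Toy sketch of a video query as a pipeline of chained transforms.

def decode(frame_bytes):
    # Stand-in for video decoding: wrap raw bytes as a "frame".
    return {"pixels": frame_bytes}

def detect_objects(frame):
    # Stand-in detector: pretend every frame contains exactly one car.
    return [{"label": "car", "box": (0, 0, 10, 10)}]

def track_objects(detections, state):
    # Stand-in tracker: assign each detection a fresh track id.
    tracks = []
    for det in detections:
        state["next_id"] += 1
        tracks.append({"id": state["next_id"], **det})
    return tracks

def count_objects(tracks, state):
    # Running count of tracked objects across the stream.
    state["count"] += len(tracks)
    return state["count"]

def run_pipeline(frames):
    # Chain the transforms: decode -> detect -> track -> count.
    state = {"next_id": 0, "count": 0}
    total = 0
    for f in frames:
        frame = decode(f)
        detections = detect_objects(frame)
        tracks = track_objects(detections, state)
        total = count_objects(tracks, state)
    return total

print(run_pipeline([b"f1", b"f2", b"f3"]))  # one "car" per frame -> 3
```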
Video queries are expensive in resource usage
decode → b/g subtract → track object → count object
- Best car tracker [1] — 1 fps on an 8-core CPU
- DNN for object classification [2] — 30GFlops
[1] VOT Challenge 2015 Results. [2] Simonyan et al., arXiv:1409.1556, 2014.
- When processing thousands of video streams in multi-tenant clusters
- How to reduce processing cost of a query?
- How to manage resources efficiently across queries?
Vision algorithms are intrinsically approximate
- License plate reader → window size
- Car tracker → mapping metric
- Object classifier → DNN model
- Query configuration: a combination of knob values
- Example knobs: frame rate, resolution, window size, mapping metric
- Knobs: parameters / implementation choices for transforms
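A configuration assigns one value to every knob, so the configuration space is the cross product of the knob value sets. A minimal sketch (the knob names and candidate values below are illustrative, taken loosely from the slides):

```python
# Sketch: a query configuration is one value per knob; the full
# configuration space is the cross product of all knob value sets.
from itertools import product

knobs = {
    "frame_rate": [1, 2, 4, 8, 15, 30],                # fps
    "resolution": ["480p", "576p", "720p", "900p", "1080p"],
    "window_size": [1.3, 1.4, 1.5, 1.6, 1.7],          # step factor
    "mapping_metric": ["DIST", "HIST", "SURF", "SIFT"],
}

# One configuration = a dict assigning a value to every knob.
configs = [dict(zip(knobs, values)) for values in product(*knobs.values())]
print(len(configs))  # 6 * 5 * 5 * 4 = 600
```

Even this small knob set yields 600 configurations, which is why the profiler later prunes the space to its Pareto boundary.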
Knobs impact quality and resource usage
- Example: one configuration (720p) gives quality = 0.93 at CPU = 0.54; a cheaper one (480p) gives quality = 0.57 at CPU = 0.09
Knobs impact quality and resource usage

[Figure: quality and CPU demand (0.0–1.0) vs. each knob: frame resolution (480p–1080p), frame rate (2–14), object mapping metric (DIST, HIST, SURF, SIFT), and window size step (1.3–1.7)]
Knobs impact quality and resource usage
- Orders of magnitude cheaper resource demand for little quality drop
- No analytical models to predict resource-quality tradeoff
- Different from approximate SQL queries
[Figure: license plate reader: quality of result vs. resource demand (CPU cores, log scale)]
Diverse quality and lag requirements
Query                 Quality    Lag
Toll Collection       High       Hours
Intelligent Traffic   Moderate   Few seconds
AMBER Alert           High       Few seconds

Lag: time difference between frame arrival and frame processing
Goal

Decide the configuration and resource allocation for each query to maximize quality and minimize lag within the resource capacity
Video analytics framework: Challenges

- 1. Many knobs → large configuration space
  - No known analytical models to predict quality and resource impact
- 2. Diverse requirements on quality and lag
  - Hard to configure and allocate resources jointly across queries
VideoStorm: Solution Overview

- Offline Profiler: builds a model of each query and reduces its configuration space, producing a resource-quality profile and utility function for the scheduler
- Online Scheduler: trades off quality and lag across queries and assigns them to the Workers
Offline: query profiling
- Profile: configuration ⟹ resource, quality
- Ground-truth: labeled dataset or results from golden configuration
- Explore configuration space, compute average resource and quality
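The profiling loop above can be sketched as follows. The F1 scorer and the single-knob toy query are illustrative assumptions, not VideoStorm's code; the point is the mapping from configuration to a (resource, quality) pair measured against ground truth.

```python
# Sketch of offline profiling: run each configuration on labeled video,
# score its output against ground truth (here with an F1 score), and
# record its measured CPU demand. All names and numbers are illustrative.

def f1_score(predicted, truth):
    predicted, truth = set(predicted), set(truth)
    true_pos = len(predicted & truth)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(predicted)
    recall = true_pos / len(truth)
    return 2 * precision * recall / (precision + recall)

def build_profile(configs, run_query, ground_truth):
    """Map each configuration to its (cpu_demand, quality) pair."""
    profile = {}
    for cfg in configs:
        output, cpu = run_query(cfg)  # cpu measured while running
        profile[cfg] = (cpu, f1_score(output, ground_truth))
    return profile

# Toy query with a single integer knob: a higher setting reads more
# plates (better quality) but costs proportionally more CPU.
truth = ["ABC123", "XYZ789"]
def toy_run(cfg):
    return truth[:cfg], 0.1 * cfg

prof = build_profile([1, 2], toy_run, truth)
print(prof[2])  # (0.2, 1.0): full quality at 0.2 cores
```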
[Figure: quality of result vs. resource demand (CPU cores, log scale); one configuration is strictly better than another when it has both higher quality and lower resource demand (more efficient)]
Offline: Pareto boundary of configuration space
- Pareto boundary: optimal configurations in resource efficiency and quality
- Cannot further increase one without reducing the other
- Orders of magnitude reduction in config. search space for scheduling
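The Pareto-boundary pruning can be sketched with a dominance check over the profiled (resource, quality) points. This is a generic O(n²) scan under made-up data, not VideoStorm's exact algorithm, but it is cheap enough for the few hundred configurations a profile typically contains.

```python
# Sketch of extracting the Pareto boundary from a profiled configuration
# space: keep a configuration only if no other configuration is at least
# as cheap AND at least as good (and not identical). Data is made up.

def pareto_boundary(profile):
    """profile: {config: (cpu_demand, quality)} -> Pareto-optimal subset."""
    frontier = {}
    for cfg, (cpu, quality) in profile.items():
        dominated = any(
            other_cpu <= cpu and other_q >= quality
            and (other_cpu, other_q) != (cpu, quality)
            for other_cpu, other_q in profile.values()
        )
        if not dominated:
            frontier[cfg] = (cpu, quality)
    return frontier

prof = {"A": (0.1, 0.5), "B": (0.5, 0.9), "C": (0.5, 0.6), "D": (1.0, 0.9)}
print(sorted(pareto_boundary(prof)))  # ['A', 'B']: C and D are dominated by B
```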
[Figure: the Pareto-optimal configurations form a boundary on the quality vs. resource demand plane (more efficient toward lower demand, higher quality toward the top)]
Online: utility function and scheduling
- Utility function: encodes the goals and sensitivities of quality and lag
  - Users set the required quality and tolerable lag
  - Additional quality is rewarded; higher lag is penalized
- Schedule for two natural goals:
  - Maximize the minimum utility: (max-min) fairness
  - Maximize the total utility: overall performance
- Allow lag to accumulate during resource shortage, then catch up
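A hedged sketch of such a per-query utility: reward quality above the user's required minimum and penalize only lag beyond the tolerable bound. The linear shape and coefficients are illustrative assumptions, not VideoStorm's exact formulation.

```python
# Illustrative utility: reward extra quality, penalize only excess lag.

def utility(quality, lag, q_min, lag_max,
            quality_reward=1.0, lag_penalty=0.1):
    reward = quality_reward * (quality - q_min)       # extra quality
    penalty = lag_penalty * max(0.0, lag - lag_max)   # only excess lag
    return reward - penalty

# Within the lag goal: only the quality term matters.
print(round(utility(0.9, 5.0, 0.6, lag_max=20.0), 3))   # 0.3
# 10 s over the lag goal: penalized.
print(round(utility(0.8, 30.0, 0.6, lag_max=20.0), 3))  # -0.8
```

Given such utilities, a scheduler can pick configurations and allocations to maximize either the minimum utility across queries (max-min fairness) or their sum (overall performance).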
Online: scheduling approximate video queries
- Queries: blue and orange (both tolerate 8 s of lag)
- Total CPU capacity over time: 4 → 2 → 4 cores
- Fair scheduler: each query runs its best configuration without lag
- Quality-aware scheduler: allows lag during the shortage, then catches up

[Figure: per-query CPU allocation, quality, and lag over time under the fair vs. quality-aware schedulers; queries switch among configurations C1–C3; total utility ∑ = 1.0 (fair) vs. ∑ = 1.5 (quality-aware)]
Additional Enhancements
- Handle incorrect resource profiles
  - Profiled resource demand might not match the query's actual usage
  - Robust to errors in query profiles
- Query placement and migration
  - Better utilization, load balancing, and lag spreading
- Hierarchical scheduling
  - Cluster- and machine-level scheduling
  - Better efficiency and scalability
VideoStorm Evaluation Setup
- VideoStorm Manager: Profiler + Scheduler
- Platform: Microsoft Azure cluster with 100 worker machines
- Each worker has 4 cores of a 2.4 GHz Intel Xeon processor and 14 GB RAM
- Four types of vision queries: license plate reader, car counter, DNN classifier, object tracker
Experiment Video Datasets
- Operational traffic cameras in Bellevue and Seattle
- 14–30 frames per second, 240p–1080p resolution
Resource allocation during burst of queries
- Start with 300 queries: Lag Goal = 300 s running at low quality (~60%), and Lag Goal = 20 s running at low quality (~40%)
- Burst for 150 seconds (t = 50 s to 200 s): 200 LPR queries (AMBER Alert), High-Quality, Lag Goal = 20 s
- VideoStorm scheduler: the burst queries dominate the resource allocation; the 300 s-lag queries are significantly delayed; the 20 s-lag queries run with lower quality; all meet their quality and lag goals

[Figure: share of cluster CPUs, quality, and lag over time for the three query classes (Lag Goal = 300 s; Lag Goal = 20 s; High-Quality, Lag Goal = 20 s)]
- Compared to a fair scheduler, with varying burst durations:
  - Quality improvement: up to 80%
  - Lag reduction: up to 7×
VideoStorm Scalability
- Frequently reschedules and reconfigures in reaction to changes in the query mix
- Even with thousands of queries, VideoStorm makes rescheduling decisions in just a few seconds

[Figure: scheduling time (1–6 s) vs. number of queries (500–8000), for clusters of 100, 200, 500, and 1000 machines]
VideoStorm: account for errors in query profiles
- Errors in the profiled resource demands lead to over/under-allocation, missing quality and lag goals
- Example: 3 copies of the same query should get the same allocation; their profiled resource demands are synthetically doubled, halved, and left unchanged
- VideoStorm tracks a mis-estimation factor: the multiplicative error between the profiled demand and the actual usage
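One way to sketch this adaptation: maintain a per-query multiplicative factor (actual usage / profiled demand), smooth it with an exponentially weighted moving average, and scale future allocations by it. The smoothing constant and the doubled-demand scenario below are illustrative assumptions, not VideoStorm's exact mechanism.

```python
# Sketch: EWMA-tracked multiplicative mis-estimation factor.

def update_factor(factor, profiled, actual, alpha=0.2):
    """Blend the observed actual/profiled ratio into the running factor."""
    observed = actual / profiled
    return (1 - alpha) * factor + alpha * observed

factor = 1.0                      # start by trusting the profile
for _ in range(10):               # query really uses 2x its profiled demand
    factor = update_factor(factor, profiled=1.0, actual=2.0)

print(round(factor, 2))           # converging toward 2.0: 1.89
corrected_demand = factor * 1.0   # scale the profiled demand for allocation
```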
[Figure: CPU allocation over time (with adaptation) for the accurate, doubled ("Twice"), and halved ("Half") profiles, and the tracked mis-estimation factor over time]
Conclusion
- VideoStorm is a video analytics system that scales to processing thousands of video streams in large clusters
- Offline profiler: efficiently estimates resource-quality profiles
- Online scheduler: jointly optimizes the quality and lag of queries
- VideoStorm is currently deployed by the Bellevue Traffic Department and will soon be deployed in more cities