SLIDE 1

Live Video Analytics at Scale with Approximation and Delay-Tolerance

Haoyu Zhang, Ganesh Ananthanarayanan, Peter Bodik, Matthai Philipose, Paramvir Bahl, Michael J. Freedman

SLIDE 2

Video cameras are pervasive


SLIDE 3

Video analytics queries

  • Intelligent Traffic System
  • AMBER Alert
  • Electronic Toll Collection
  • Video Doorbell

SLIDE 4

Video query: a pipeline of transforms

decode → detect object → track object → count object

  • Vision algorithms chained together
  • Example: traffic counter pipeline
SLIDE 5

Video queries are expensive in resource usage

decode → b/g subtract → track object → count object

  • Best car tracker [1] — 1 fps on an 8-core CPU
  • DNN for object classification [2] — 30GFlops

[1] VOT Challenge 2015 Results. [2] Simonyan and Zisserman, arXiv:1409.1556, 2014.

  • When processing thousands of video streams in multi-tenant clusters:
    • How to reduce the processing cost of a query?
    • How to manage resources efficiently across queries?
SLIDE 6

Vision algorithms are intrinsically approximate

  • License plate reader → window size
  • Car tracker → mapping metric
  • Object classifier → DNN model
  • Knobs: parameters / implementation choices for transforms (frame rate, resolution, window size, mapping metric)
  • Query configuration: a combination of knob values
SLIDE 7

Knobs impact quality and resource usage

Example configurations (resolution, frame rate):

  • 720p, 3 fps → Quality = 0.93, CPU = 0.54
  • 480p, 1 fps → Quality = 0.57, CPU = 0.09

SLIDE 8

Knobs impact quality and resource usage

[Four plots: Quality and CPU vs. Frame Resolution (480p – 1080p), Frame Rate (2 – 14 fps), Object Mapping Metric (DIST, HIST, SURF, SIFT), and Window Size Step (1.3 – 1.7)]

SLIDE 9

Knobs impact quality and resource usage

  • Orders of magnitude cheaper resource demand for little quality drop
  • No analytical models to predict resource-quality tradeoff
  • Different from approximate SQL queries

[Plot: License Plate Reader — quality of result vs. resource demand (CPU cores, log scale)]

SLIDE 10

Diverse quality and lag requirements

                              Quality?   Lag?
  Intelligent Traffic System  Moderate   Few seconds
  AMBER Alert                 High       Few seconds
  Toll Collection             High       Hours

Lag: time difference between frame arrival and frame processing

SLIDE 11

Goal

Decide configuration and resource allocation to maximize quality and minimize lag within the resource capacity.

[Diagram: configuration + resource allocation → quality, lag]

SLIDE 12

Video analytics framework: Challenges


  • 1. Many knobs → large configuration space
    • No known analytical models to predict quality and resource impact
  • 2. Diverse requirements on quality and lag
    • Hard to configure and allocate resources jointly across queries

SLIDE 13

VideoStorm: Solution Overview

[Diagram: query → Profiler (offline) → resource-quality profile + utility function → Scheduler (online) → Workers]

SLIDE 14

VideoStorm: Solution Overview

[Diagram: query → Profiler (offline) → resource-quality profile + utility function → Scheduler (online) → Workers]

  • Profiler: builds model, reduces config space
  • Scheduler: trades off quality and lag across queries

SLIDE 15

VideoStorm: Solution Overview

[Diagram recap: query → Profiler (offline) → resource-quality profile → Scheduler → Workers]

SLIDE 16

Offline: query profiling

  • Profile: configuration ⟹ resource, quality
  • Ground-truth: labeled dataset or results from golden configuration
  • Explore configuration space, compute average resource and quality

[Plot: quality of result vs. resource demand (CPU cores, log scale); toward the top-left is higher quality and more efficient]

One configuration can be strictly better than another in both quality and resource efficiency.
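As a concrete illustration, the exploration loop can be sketched in Python; `run_config` is a hypothetical stand-in for executing the pipeline on labeled video, and quality is measured as an F1 score against the ground truth:

```python
import itertools

def f1_score(result, truth):
    """Quality of a configuration's output vs. ground truth (F1 score)."""
    if not result or not truth:
        return 0.0
    tp = len(result & truth)               # correctly detected objects
    precision, recall = tp / len(result), tp / len(truth)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def profile_query(knobs, run_config, golden):
    """Explore the knob space exhaustively; return (config, quality, cpu) triples.

    knobs: dict mapping knob name -> list of candidate values.
    run_config: callable that runs the pipeline under one configuration and
                returns (set of detected objects, average CPU cores used).
    golden: ground-truth results (labeled data or golden configuration output).
    """
    profile = []
    for values in itertools.product(*knobs.values()):
        config = dict(zip(knobs.keys(), values))
        result, cpu = run_config(config)
        profile.append((config, f1_score(result, golden), cpu))
    return profile
```

The exhaustive loop is only a sketch; with many knobs the space is large, which is exactly why the Pareto-boundary pruning on the next slide matters.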

SLIDE 17

Offline: Pareto boundary of configuration space

  • Pareto boundary: optimal configurations in resource efficiency and quality
  • Cannot further increase one without reducing the other
  • Orders of magnitude reduction in config. search space for scheduling

[Plot: Pareto-optimal configurations on the quality vs. resource-demand plane]
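Extracting the Pareto boundary from a profile is a single sorted scan; a minimal sketch over `(config, quality, cpu)` triples:

```python
def pareto_boundary(profile):
    """Keep only configurations for which no other configuration achieves
    at least the same quality with lower CPU demand."""
    # Sort by CPU ascending; break ties by quality descending.
    pts = sorted(profile, key=lambda t: (t[2], -t[1]))
    frontier, best_quality = [], -1.0
    for config, quality, cpu in pts:
        # A point joins the frontier only if it strictly improves quality
        # over every cheaper (or equally cheap) configuration seen so far.
        if quality > best_quality:
            frontier.append((config, quality, cpu))
            best_quality = quality
    return frontier
```

The scheduler then only searches over the frontier, which is how the configuration space shrinks by orders of magnitude.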

SLIDE 18

VideoStorm: Solution Overview

[Diagram recap: Profiler (offline) → resource-quality profile + utility function → Scheduler (online) → Workers]

SLIDE 19

Online: utility function and scheduling

  • Utility function: encode goals and sensitivities of quality and lag
  • Users set required quality and tolerable lag
  • Reward additional quality, penalize higher lag
  • Schedule for two natural goals:
    • Maximize the minimum utility – (max-min) fairness
    • Maximize the total utility – overall performance
  • Allow lag accumulation during resource shortage, then catch up
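A minimal sketch of such a utility function; the shape (linear reward above a required quality, linear penalty beyond a tolerable lag) and the weights `alpha`/`beta` are illustrative choices, not the paper's exact form:

```python
def utility(quality, lag, q_min, lag_max, alpha=1.0, beta=1.0):
    """Illustrative VideoStorm-style utility.

    quality: achieved quality in [0, 1]; lag: current lag in seconds.
    q_min:   quality the user requires; lag_max: lag the user tolerates.
    alpha:   reward per unit of quality above q_min (assumed weight).
    beta:    penalty per second of lag beyond lag_max (assumed weight).
    """
    quality_reward = alpha * (quality - q_min)
    lag_penalty = beta * max(0.0, lag - lag_max)   # no penalty within tolerance
    return quality_reward - lag_penalty
```

Because lag within `lag_max` costs nothing, the scheduler is free to let lag build up during a resource shortage and catch up later, exactly the behavior described above.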

SLIDE 20

Online: scheduling approximate video queries

  • Queries: blue and orange (tolerate 8s lag)
  • Total CPU: 4 → 2 → 4
  • Fair scheduler: best configurations w/o lag
  • Quality-aware scheduler: allow lag → catch up

[Plots: per-query resource allocation, quality, and lag over time under the fair and quality-aware schedulers; total utility ∑ = 1.0 (fair) vs. ∑ = 1.5 (quality-aware)]
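One way to sketch the maximize-total-utility goal is a greedy allocator that repeatedly grants a CPU increment to the query whose utility improves the most; this is an illustration of the objective, not VideoStorm's actual optimizer. Each query's frontier entries are assumed `(quality, cpu)` pairs:

```python
def schedule_max_total_utility(queries, capacity, step=0.5):
    """Greedy sketch: hand out `step` cores at a time to the query whose
    utility gains the most from the extra allocation.

    queries: dict name -> (pareto_frontier, utility_fn), where the frontier
             is a list of (quality, cpu) pairs and utility_fn maps quality
             to utility for that query.
    """
    alloc = {name: 0.0 for name in queries}

    def best_utility(name, cores):
        frontier, util = queries[name]
        feasible = [q for q, cpu in frontier if cpu <= cores]
        # No affordable configuration: model the query as not running.
        return util(max(feasible)) if feasible else util(0.0)

    remaining = capacity
    while remaining >= step:
        gains = {n: best_utility(n, alloc[n] + step) - best_utility(n, alloc[n])
                 for n in queries}
        winner = max(gains, key=gains.get)
        if gains[winner] <= 0:          # nobody benefits from more cores
            break
        alloc[winner] += step
        remaining -= step
    return alloc
```

Replacing the sum-of-gains criterion with "raise the worst-off query first" would sketch the max-min fairness goal instead.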

SLIDE 21

Additional Enhancements

  • Handle incorrect resource profiles
    • Profiled resource demand might not match actual usage
    • Robust to errors in query profiles
  • Query placement and migration
    • Better utilization, load balancing and lag spreading
  • Hierarchical scheduling
    • Cluster- and machine-level scheduling
    • Better efficiency and scalability

SLIDE 22

VideoStorm Evaluation Setup

VideoStorm Manager (Profiler + Scheduler) controlling 100 worker machines

  • Platform:
    • Microsoft Azure cluster
    • Each worker has 4 cores of a 2.4GHz Intel Xeon processor and 14GB RAM
  • Four types of vision queries:
    • license plate reader
    • car counter
    • DNN classifier
    • object tracker
SLIDE 23

Experiment Video Datasets

  • Operational traffic cameras in Bellevue and Seattle
  • 14 – 30 frames per second, 240p – 1080p resolution

SLIDE 24

Resource allocation during burst of queries

  • Start with 300 queries:
    • Lag Goal=300s, low-quality: ~60%
    • Lag Goal=20s, low-quality: ~40%
  • Burst of 150 seconds (time 50 – 200): 200 LPR queries (AMBER Alert), High-Quality, Lag Goal=20s
  • VideoStorm scheduler:
    • burst queries dominate the resource allocation
    • Lag Goal=300s queries are significantly delayed
    • Lag Goal=20s queries run with lower quality
    • all meet their quality and lag goals

[Plots: share of cluster CPUs, quality, and lag over time for the three query classes (Lag Goal=300s; Lag Goal=20s; High-Quality, Lag Goal=20s)]

SLIDE 25

Resource allocation during burst of queries


  • Compare to a fair scheduler with varying burst duration:
    • Quality improvement: up to 80%
    • Lag reduction: up to 7x
SLIDE 26

VideoStorm Scalability

  • Frequently reschedule and reconfigure in reaction to changes in queries
  • Even with thousands of queries, VideoStorm makes rescheduling decisions in just a few seconds

[Plot: scheduling time (s) vs. number of queries (500 – 8000), for 100 – 1000 machines]

SLIDE 27

VideoStorm: account for errors in query profiles

  • Errors in the profile's resource demands
    • Over/under-allocate resources → miss quality and lag goals!
  • Example: 3 copies of the same query, which should get the same allocation
    • Profiled resource demand synthetically doubled, halved and unchanged
  • VideoStorm keeps track of a mis-estimation factor – the multiplicative error between the profiled demand and actual usage

[Plots: CPU allocation over time with and without adaptation for the Accurate, Twice, and Half profiles, and the estimated mis-estimation factor over time]
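The adaptation step can be sketched as tracking the multiplicative correction with an exponentially weighted moving average; the decay constant here is an illustrative choice, not a value from the paper:

```python
def update_misestimation(mu, profiled_demand, observed_usage, decay=0.8):
    """Update the multiplicative mis-estimation factor from one observation
    of actual resource usage, smoothing with an EWMA (decay is illustrative)."""
    sample = observed_usage / profiled_demand
    return decay * mu + (1 - decay) * sample

def corrected_demand(profiled_demand, mu):
    """Scale the profiled demand by the learned mis-estimation factor
    before the scheduler allocates resources."""
    return profiled_demand * mu
```

With this correction, a query whose profile understates its demand by 2x converges to twice the allocation, which is why the three synthetic copies end up with matching CPU shares under adaptation.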

SLIDE 28

Conclusion

  • VideoStorm is a video analytics system that scales to processing thousands of video streams in large clusters
  • Offline profiler: efficiently estimates resource-quality profiles
  • Online scheduler: optimizes jointly for the quality and lag of queries
  • VideoStorm is currently deployed with the Bellevue Traffic Department, and will soon be deployed in more cities
