Sprocket: A Serverless Video Processing Framework (PowerPoint presentation by Lixiang Ao et al.)

SLIDE 1

Sprocket: A Serverless Video Processing Framework

Lixiang Ao, Liz Izhikevich, Geoffrey M. Voelker, George Porter

SLIDE 2

Video processing

$ ffmpeg -i input.mp4 -vf hue=s=0 greyscale.mp4

3 min clip vs. 120 min movie: 4.5 min vs. 190 min processing time
Low parallelism

"Show just the scenes in the movie in which Wonder Woman appears"
Complex queries not supported

SLIDE 3

$ tr ' ' '\n' < input | sort | uniq -c
$ ffmpeg -i input.mp4 -vf hue=s=0 greyscale.mp4

?

Larger dataset, more complex queries

SLIDE 4

$ tr ' ' '\n' < input | sort | uniq -c
$ ffmpeg -i input.mp4 -vf hue=s=0 greyscale.mp4

?

Larger dataset, more complex queries

A framework for highly parallel, complex video pipelines

SLIDE 5

Related work

ExCamera [NSDI '17]: low-latency video encoding with serverless functions and a functional codec
Facebook SVE [SOSP '17]: large-scale video processing on a dedicated cluster

SLIDE 6

Sprocket

Serverless video processing framework (AWS Lambda).
Highly parallel, low-latency, low cost.
Build complex video pipelines with a simple domain-specific language.
Process an hour of 1080p video with 1000-way parallelism in tens of seconds for < $3.

SLIDE 7

Intra-video parallelism

Video frames are interdependent within a Group of Pictures (GOP).
GOPs are independent of each other.
Each GOP is relatively small in size.
Processing GOPs independently gives intra-video parallelism.
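The GOP idea can be illustrated with a minimal Python sketch. It is not Sprocket's code: it assumes, as a simplification, that every GOP is closed and starts at an I-frame, and `split_into_gops` and `process_gop` are hypothetical helper names. A local thread pool stands in for the Lambda workers.

```python
from concurrent.futures import ThreadPoolExecutor

def split_into_gops(frame_types):
    """Split a sequence of frame types into GOPs, starting a new GOP at every I-frame."""
    gops = []
    for frame_type in frame_types:
        if frame_type == "I" or not gops:
            gops.append([])
        gops[-1].append(frame_type)
    return gops

def process_gop(gop):
    # Placeholder for real per-GOP work (decode, filter, encode).
    return len(gop)

frame_types = list("IPPBIPBBPIPP")
gops = split_into_gops(frame_types)

# Each GOP is independent, so the GOPs can be handed to separate workers.
with ThreadPoolExecutor(max_workers=4) as pool:
    frames_per_gop = list(pool.map(process_gop, gops))
```

Because no GOP reads another GOP's frames, the workers need no coordination beyond collecting results in order.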

SLIDE 8

Why serverless?

Serverless: run user code in the cloud without managing servers, e.g., AWS Lambda.
Each instance naturally matches a GOP's size.
Burst parallelism: thousands of instances on demand in under a second.
Pay only for actual running time.
Cloud computer vision services, e.g., AWS Rekognition and Google Vision.

SLIDE 9

System Overview

[Figure: system overview. Diagram labels: RPC, video, API call, coordinator]

SLIDE 10

How do we program Sprocket applications?

SLIDE 11

Logical DAG (Directed Acyclic Graph):

[Figure: logical DAG with inputs (Input 0: video, Input 1: name), stages (decode, scene change, face recognition, match face, draw, encode), and an output]

SLIDE 12

Domain-specific language: pipespec

    {
      "nodes": [
        { "name": "matchFace", "stage": "matchFace", "config": { } },
        { "name": "decode", "stage": "stealwork_decode",
          "config": { "stealwork": true, "transform": "-f image2 -c:v png" } },
        { "name": "face_rek", "stage": "rek", "delivery_function": "serialized_scene", "config": { } },
        …
      "streams": [
        { "src": "input_0:chunks", "dst": "decode:chunks" },
        { "src": "input_1:person", "dst": "matchFace:person" },
        { "src": "decode:frames", "dst": "scenechange:frames" },
        { "src": "scenechange:scene_list", "dst": "face_rek:scene_list" },
        { "src": "face_rek:frame", "dst": "draw:frame" },
        …

Callouts on the slide: logical DAG node; logical DAG edge; control logic encoded in stages; stage configs; dependency definition; node:edgeID.
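To make the node:edgeID convention concrete, here is a minimal Python sketch that reads a pipespec-like document and recovers the logical DAG edges. The embedded spec is a hypothetical, abridged example that reuses only field names visible on the slide; the stage name given to `scenechange` is an assumption, and `split_endpoint` and `build_logical_dag` are illustrative helpers, not part of Sprocket.

```python
import json
from collections import defaultdict

# Hypothetical, abridged pipespec; only the field names come from the slide.
PIPESPEC = """
{
  "nodes": [
    {"name": "decode", "stage": "stealwork_decode", "config": {"stealwork": true}},
    {"name": "scenechange", "stage": "scenechange", "config": {}},
    {"name": "face_rek", "stage": "rek", "delivery_function": "serialized_scene", "config": {}}
  ],
  "streams": [
    {"src": "input_0:chunks", "dst": "decode:chunks"},
    {"src": "decode:frames", "dst": "scenechange:frames"},
    {"src": "scenechange:scene_list", "dst": "face_rek:scene_list"}
  ]
}
"""

def split_endpoint(endpoint):
    """Split 'node:edgeID' into its node name and edge identifier."""
    node, edge_id = endpoint.split(":", 1)
    return node, edge_id

def build_logical_dag(spec):
    """Return {source node: [destination nodes]} from the 'streams' entries."""
    successors = defaultdict(list)
    for stream in spec["streams"]:
        src_node, _ = split_endpoint(stream["src"])
        dst_node, _ = split_endpoint(stream["dst"])
        successors[src_node].append(dst_node)
    return dict(successors)

spec = json.loads(PIPESPEC)
dag = build_logical_dag(spec)
```

Here `dag` links `input_0` to `decode`, `decode` to `scenechange`, and `scenechange` to `face_rek`, which is the chain the three stream entries describe.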

SLIDE 13

[Figure: submit RPC ("youtube.com/v/12345", "Wonder Woman") → coordinator; logical DAG → physical DAG]

SLIDE 14

Data dependencies

Chain of filters
Decode to frames
Encode from frames
Full shuffling
User defined?

SLIDE 15

Delivery function

f: (I, global states) → (I → O)

- user-defined dependency between upstream & downstream
- produces a mapping from inputs to outputs using inputs and/or global states
- dynamically updates physical DAG
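As a rough illustration of what a delivery function might look like, the Python sketch below maps each upstream frame to a downstream task, one task per scene, using scene boundaries kept in global state. The function name `scene_delivery`, the `scene_starts` key, and the task naming are all assumptions made for this example; the slides do not show Sprocket's actual interface.

```python
from bisect import bisect_right

def scene_delivery(frame_ids, global_state):
    """Hypothetical delivery function: map each input frame to a per-scene downstream task."""
    boundaries = sorted(global_state["scene_starts"])
    mapping = {}
    for frame_id in frame_ids:
        # Index of the last scene start that is <= frame_id.
        scene_index = bisect_right(boundaries, frame_id) - 1
        mapping[frame_id] = f"scene_{scene_index}"
    return mapping

# Scenes start at frames 0, 3 and 5 (assumed output of an upstream scene-change stage).
mapping = scene_delivery(range(6), {"scene_starts": [0, 3, 5]})
```

The returned mapping is the kind of information a coordinator would need in order to add edges to the physical DAG at run time.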

SLIDE 16

Scheduling

Manages limited resources, e.g., concurrent Lambda workers
Simplified by serverless platform
Implements fine-grained (task-level) priority control
Priority is defined with an API
Streaming scheduler
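Task-level priority under a concurrency cap can be sketched in Python with a heap. This is an illustrative toy, not Sprocket's scheduler: the class `PriorityScheduler`, its methods, and the convention that a lower number means higher priority are all assumptions.

```python
import heapq
import itertools

class PriorityScheduler:
    """Toy scheduler: launch the highest-priority tasks while worker slots remain."""

    def __init__(self, max_workers):
        self.max_workers = max_workers       # cap on concurrent workers
        self.running = set()
        self._heap = []
        self._counter = itertools.count()    # tie-breaker keeps FIFO order among equal priorities

    def submit(self, task_id, priority):
        # Lower number = higher priority (an assumption of this sketch).
        heapq.heappush(self._heap, (priority, next(self._counter), task_id))

    def dispatch(self):
        """Start queued tasks until the concurrency cap is reached; return what was started."""
        launched = []
        while self._heap and len(self.running) < self.max_workers:
            _, _, task_id = heapq.heappop(self._heap)
            self.running.add(task_id)
            launched.append(task_id)
        return launched

    def complete(self, task_id):
        self.running.discard(task_id)

sched = PriorityScheduler(max_workers=2)
sched.submit("encode_gop_7", priority=5)
sched.submit("decode_gop_0", priority=0)
sched.submit("decode_gop_1", priority=1)
first_wave = sched.dispatch()
sched.complete("decode_gop_0")
second_wave = sched.dispatch()
```

With two slots, the two decode tasks start first and the lower-priority encode task waits until a slot frees up.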

SLIDE 17

Straggler mitigation

Stragglers seen in:
- Lambda invocation
- Intermediate data I/O
- Worker task

Solved by:
- Worker late binding + over-provisioning
- Speculative I/O
- Work stealing by exploiting the GOP structure
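One way to read "late binding + over-provisioning" is that more workers are invoked than there are tasks, and each task is bound to whichever worker reports ready first, so a slow-starting worker never holds a task. The Python sketch below illustrates that reading only; `late_bind` and the worker and task names are hypothetical.

```python
from collections import deque

def late_bind(tasks, worker_arrivals):
    """Assign tasks in the order workers report ready, not at invocation time."""
    pending = deque(tasks)
    assignment = {}
    for worker_id in worker_arrivals:    # workers listed in the order they became ready
        if not pending:
            break                        # surplus (over-provisioned) workers get no task
        assignment[worker_id] = pending.popleft()
    return assignment

# Three tasks, four workers invoked; "w2" starts slowly and reports ready last.
assignment = late_bind(["gop_0", "gop_1", "gop_2"], ["w0", "w3", "w1", "w2"])
```

The straggling worker `w2` ends up with nothing to do, so its slow start does not delay any task.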

SLIDE 18

Evaluations

Questions we want to answer:
- Can Sprocket utilize the burst parallelism provided by serverless platforms?
- Can Sprocket schedule pipelines efficiently?
- Is Sprocket cost-efficient?
- Can Sprocket mitigate stragglers? (see paper)

SLIDE 19

Parallelism tests

Three-stage greyscale pipeline.
Each Lambda worker handles a GOP.
[Figure: pipeline completion time]
Burst parallelism of serverless supports highly parallel video processing.

SLIDE 20

Streaming scheduler

Users consume output while the video is still being processed.
Meet the streaming deadline while minimizing resource consumption.
Adjust the number of workers according to progress and deadline.
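The worker-count adjustment can be sketched as a simple calculation. The Python sketch below is a deliberately simplified model, not Sprocket's scheduler: it assumes a single deadline, a fixed per-GOP processing time, and hypothetical parameter names.

```python
import math

def workers_needed(remaining_gops, seconds_per_gop, seconds_to_deadline, max_workers):
    """Estimate the fewest workers that can finish the remaining GOPs before the deadline."""
    if remaining_gops <= 0:
        return 0
    if seconds_to_deadline <= 0:
        return max_workers               # already late: use everything available
    # GOPs one worker can finish before the deadline (at least one is assumed).
    gops_per_worker = max(1, math.floor(seconds_to_deadline / seconds_per_gop))
    return min(max_workers, math.ceil(remaining_gops / gops_per_worker))

relaxed = workers_needed(remaining_gops=100, seconds_per_gop=2.0,
                         seconds_to_deadline=10.0, max_workers=1000)
tight = workers_needed(remaining_gops=100, seconds_per_gop=2.0,
                       seconds_to_deadline=1.0, max_workers=50)
```

In the relaxed case each worker can handle five GOPs, so 20 workers suffice; in the tight case the demand exceeds the cap and the result is clamped to 50.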

SLIDE 21

Monetary cost

FFmpeg greyscale filter on a 30-min 1080p video.
Local command: an m4.16xlarge instance with 64 cores and 256 GB RAM.
Spark: an 18-node cluster of m4.2xlarge instances, each with 8 cores and 32 GB RAM.
Sprocket: 900 concurrent Lambdas with 3 GB RAM each.

SLIDE 22

Conclusion

A framework for highly parallel, complex video processing is needed.
Serverless is an ideal platform for such a framework.
Sprocket delivers low-latency, complex video processing at low cost.

SLIDE 23

Thank you!

Q & A