Sprocket: A Serverless Video Processing Framework Lixiang Ao - PowerPoint PPT Presentation

Sprocket: A Serverless Video Processing Framework
Lixiang Ao, Liz Izhikevich, Geoffrey M. Voelker, George Porter

Video processing
$ ffmpeg -i input.mp4 -vf hue=s=0 greyscale.mp4
3 min clip vs. 120 min movie: 4.5 min vs. 190 min processing time. Low parallelism.
"Show just the scenes in the movie in which Wonder Woman appears": complex queries not supported.

$ tr ' ' '\n' < input | sort | uniq -c
$ ffmpeg -i input.mp4 -vf hue=s=0 greyscale.mp4
Larger dataset, more complex queries: a framework for highly parallel, complex video pipelines?

Related work
ExCamera [NSDI '17]: low-latency video encoding with serverless, functional codec.
Facebook SVE [SOSP '17]: large-scale video processing on a dedicated cluster.

Sprocket
Serverless video processing framework (AWS Lambda).
Highly parallel, low latency, low cost.
Build complex video pipelines with a simple domain-specific language.
Process an hour of 1080p video with 1000-way parallelism in tens of seconds for less than $3.

Intra-video parallelism
Video frames are interdependent within a Group of Pictures (GOP).
GOPs are independent of each other.
Each GOP is relatively small in size.
Together, these properties allow parallelism within a single video.
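The GOP independence described on this slide can be illustrated with a minimal Python sketch. It assumes closed GOPs that each start at a keyframe (marked "I"); `process_gop` is a hypothetical stand-in for real per-GOP work, not part of Sprocket.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_gops(frame_types):
    """Group frame indices into GOPs; a new GOP starts at every "I" frame."""
    gops = []
    for index, frame_type in enumerate(frame_types):
        # Start a new group at each keyframe (or at the very first frame).
        if frame_type == "I" or not gops:
            gops.append([])
        gops[-1].append(index)
    return gops


def process_gop(gop):
    """Hypothetical per-GOP work; returns (first frame index, frame count)."""
    return (gop[0], len(gop))


def process_video(frame_types, workers=4):
    """Process every GOP independently; results come back in GOP order."""
    gops = split_into_gops(frame_types)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process_gop, gops))


frames = ["I", "P", "B", "P", "I", "B", "P", "I", "P"]
print(split_into_gops(frames))
print(process_video(frames))
```

Because no GOP needs data from another, the only coordination required is putting the results back in order, which `pool.map` does here.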

Why serverless?
Serverless: run user code in the cloud without managing servers, e.g., AWS Lambda.
Each instance naturally matches a GOP's size.
Burst parallelism: thousands of instances on demand in under a second.
Pay only for actual running time.
Cloud computer vision services, e.g., AWS Rekognition and Google Vision.

System Overview
[figure: coordinator, API call, RPC, video]

How do we program Sprocket applications?

Logical DAG (Directed Acyclic Graph)
[figure nodes: Input 0: video, decode, scene change, face recognition, draw, encode, output; Input 1: name, match face]

Domain-specific language: pipespec

Slide annotations: "nodes" describe the logical DAG, with control logic encoded in stages and per-stage configs; "streams" are the logical DAG edges, with endpoints written as node:edgeID; "delivery_function" is the dependency definition.

    "nodes": [
      {
        "name": "matchFace",
        "stage": "matchFace",
        "config": {
        }
      },
      {
        "name": "decode",
        "stage": "stealwork_decode",
        "config": {
          "stealwork": true,
          "transform": "-f image2 -c:v png"
        }
      },
      {
        "name": "face_rek",
        "stage": "rek",
        "delivery_function": "serialized_scene",
        "config": {
        }
      },
      …

    "streams": [
      {
        "src": "input_0:chunks",
        "dst": "decode:chunks"
      },
      {
        "src": "input_1:person",
        "dst": "matchFace:person"
      },
      {
        "src": "decode:frames",
        "dst": "scenechange:frames"
      },
      {
        "src": "scenechange:scene_list",
        "dst": "face_rek:scene_list"
      },
      {
        "src": "face_rek:frame",
        "dst": "draw:frame"
      },
      …
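To show how the "streams" entries describe a graph, here is a minimal Python sketch that turns them into a logical DAG and orders the nodes. It uses only the streams visible on the slide (the excerpt is truncated); `build_logical_dag` and `topological_order` are illustrative names, not Sprocket functions.

```python
from collections import deque

# Streams copied from the visible part of the pipespec excerpt.
PIPESPEC = {
    "streams": [
        {"src": "input_0:chunks", "dst": "decode:chunks"},
        {"src": "input_1:person", "dst": "matchFace:person"},
        {"src": "decode:frames", "dst": "scenechange:frames"},
        {"src": "scenechange:scene_list", "dst": "face_rek:scene_list"},
        {"src": "face_rek:frame", "dst": "draw:frame"},
    ]
}


def build_logical_dag(pipespec):
    """Map each node to its downstream nodes, dropping the ':edgeID' suffix."""
    dag = {}
    for stream in pipespec["streams"]:
        src_node = stream["src"].split(":", 1)[0]
        dst_node = stream["dst"].split(":", 1)[0]
        dag.setdefault(src_node, set()).add(dst_node)
        dag.setdefault(dst_node, set())
    return dag


def topological_order(dag):
    """Kahn's algorithm: list nodes so every edge points forward."""
    indegree = {node: 0 for node in dag}
    for downstream in dag.values():
        for node in downstream:
            indegree[node] += 1
    ready = deque(sorted(n for n, d in indegree.items() if d == 0))
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for nxt in sorted(dag[node]):
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    if len(order) != len(dag):
        raise ValueError("streams contain a cycle, so this is not a DAG")
    return order


dag = build_logical_dag(PIPESPEC)
print(topological_order(dag))
```

The cycle check matters because a pipespec is only a valid logical DAG if every node can be ordered after all of its upstream nodes.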

Logical DAG to physical DAG
[figure: the logical DAG and the inputs "youtube.com/v/12345", "Wonder Woman" are submitted via RPC to the coordinator, which produces the physical DAG]

Data dependencies
- Chain of filters
- Decode to frames
- Encode from frames
- Full shuffling
- User defined?

Delivery function: (I, global states) → (I → O)
- User-defined dependency between upstream and downstream
- Produces a mapping from inputs to outputs using inputs and/or global states
- Dynamically updates the physical DAG
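As an illustration of the idea on this slide, the following Python sketch shows a function that takes inputs plus global state and produces a mapping from inputs to downstream tasks. Everything here is hypothetical: `scene_delivery`, the `scene_starts` key, and the task naming are assumptions for the example, not Sprocket's API.

```python
from bisect import bisect_right


def scene_delivery(frame_ids, global_state):
    """Hypothetical delivery function: route each frame to one task per scene.

    global_state["scene_starts"] is an assumed sorted list of frame indices
    at which a new scene begins, with 0 as its first entry.
    Returns a mapping {frame_id: downstream_task_id}.
    """
    scene_starts = global_state["scene_starts"]
    mapping = {}
    for frame_id in frame_ids:
        # Index of the last scene start that is <= frame_id.
        scene_index = bisect_right(scene_starts, frame_id) - 1
        mapping[frame_id] = f"scene_{scene_index}"
    return mapping


def group_by_task(mapping):
    """Invert the mapping into {task_id: [frame_ids]} for dispatch."""
    tasks = {}
    for frame_id, task_id in sorted(mapping.items()):
        tasks.setdefault(task_id, []).append(frame_id)
    return tasks


state = {"scene_starts": [0, 3, 5]}
print(group_by_task(scene_delivery(range(7), state)))
```

Since the scene boundaries are only known at run time, the set of downstream tasks is also only known at run time, which is why such a function implies updating the physical DAG dynamically.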

Scheduling
- Manages limited resources, e.g., concurrent Lambda workers
- Simplified by the serverless platform
- Implements fine-grained (task-level) priority control
- Priority is defined with an API
- Streaming scheduler

Straggler mitigation
Stragglers seen in:
- Lambda invocation
- Intermediate data I/O
- Worker task
Solved by:
- Worker late binding + over-provisioning
- Speculative I/O
- Work-stealing by exploiting the GOP structure
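One common way to realize speculative I/O is to issue redundant reads and keep whichever finishes first. The Python sketch below shows that general technique only; the slide does not say how Sprocket implements it, and `speculative_read` and `fake_read` are hypothetical names.

```python
import concurrent.futures as cf
import time


def speculative_read(read_fn, key, copies=2):
    """Issue `copies` redundant reads and return the first one that finishes."""
    pool = cf.ThreadPoolExecutor(max_workers=copies)
    futures = [pool.submit(read_fn, key, attempt) for attempt in range(copies)]
    done, _pending = cf.wait(futures, return_when=cf.FIRST_COMPLETED)
    pool.shutdown(wait=False)  # do not block on the straggling reads
    return next(iter(done)).result()


def fake_read(key, attempt):
    """Hypothetical storage read in which attempt 0 is a straggler."""
    time.sleep(0.5 if attempt == 0 else 0.01)
    return (key, attempt)


print(speculative_read(fake_read, "gop-0007"))
```

The trade-off is extra requests in exchange for a shorter tail: the caller waits only as long as the fastest copy.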

Evaluations
Questions we want to answer:
- Can Sprocket utilize the burst parallelism provided by serverless platforms?
- Can Sprocket schedule pipelines efficiently?
- Is Sprocket cost-efficient?
- Can Sprocket mitigate stragglers? (see paper)

Parallelism tests
Three-stage greyscale pipeline. Each Lambda worker handles a GOP.
[figure: pipeline completion time]
Burst parallelism of serverless supports highly parallel video processing.

Streaming scheduler
Users consume output while the video is still being processed.
Meet the streaming deadline while minimizing resource consumption.
Adjust the number of workers according to progress and deadline.
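The idea of adjusting workers to progress and deadline can be sketched with simple arithmetic. This Python example is an assumed policy for illustration, not Sprocket's scheduler: it supposes each worker handles one chunk at a time and that every chunk takes about `seconds_per_chunk`.

```python
import math


def workers_needed(chunks_left, seconds_per_chunk, seconds_to_deadline, max_workers):
    """Smallest worker count expected to finish the remaining chunks in time.

    Assumed model: a worker can finish floor(deadline / chunk time) chunks
    before the deadline, so divide the remaining chunks by that number.
    """
    if chunks_left <= 0:
        return 0
    if seconds_to_deadline <= 0:
        return max_workers  # already late: use everything available
    rounds_available = math.floor(seconds_to_deadline / seconds_per_chunk)
    if rounds_available < 1:
        return max_workers  # not even one chunk fits: use everything
    return min(max_workers, math.ceil(chunks_left / rounds_available))


# Plenty of time left: few workers are enough.
print(workers_needed(100, 5, 60, 1000))
# Deadline approaching: scale up.
print(workers_needed(100, 5, 20, 1000))
```

Re-evaluating this as chunks complete keeps the worker count low when the pipeline is ahead of the deadline and raises it when the pipeline falls behind.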

Monetary cost
FFmpeg greyscale filter on a 30-min 1080p video.
Local command: an m4.16xlarge instance with 64 cores, 256 GB RAM.
Spark: 18-node cluster of m4.2xlarge instances, each with 8 cores, 32 GB RAM.
Sprocket: 900 concurrent Lambdas with 3 GB RAM each.

Conclusion
A framework for highly parallel, complex video processing is needed.
Serverless is an ideal platform for such a framework.
Sprocket introduces low-latency, complex video processing at low cost.

Thank you! Q & A
