How GPUs Power Comcast's X1 Voice Remote and Smart Video Analytics



SLIDE 1

How GPUs Power Comcast's X1 Voice Remote and Smart Video Analytics

Jan Neumann, Comcast Labs DC
May 10th, 2017

SLIDE 2

Comcast Applied Artificial Intelligence Lab

  • Smart Home
  • Smart TV
  • Smart Internet

  • Media & Video Analytics
  • Deep Learning
  • Data Science
  • Recommendations & Search
  • Voice & NLP

SLIDE 3

Today: How Comcast Uses AI to Evolve and Reinvent the TV Experience

  • Smart Home
  • Smart TV
  • Smart Internet

  • Media & Video Analytics
  • Deep Learning
  • Data Science
  • Recommendations & Search
  • Voice & NLP

SLIDE 4

AI for Content Discovery – Voice Search

  • Netflix
  • Live TV
  • Online Video

SLIDE 5

X1 Smart TV with Voice

  • Query: “HBO”

Diagram: Voice remote → ASR → query → NLP modules → Answer Selector → action → Set-top Box → TV

SLIDE 6

Open NLP: Multiple Domains with Voice

Diagram: query → Domain Selector → domains (TV, Home, Customer Care, News, ...) → Answer Selector → response

SLIDE 7

Open NLP: Multiple Domains with Voice

  • Query: “turn on the heat”
  • Domains: TV, Home, Customer Care, News
  • Domain Selector scores: 0.80, 0.15, 0.02, 0.03 (the two scores above the threshold belong to TV and Home)
  • Threshold = 0.10
  • Selected = {TV, Home}; Applicable = {TV, Home}
  • Precision = 100%; Recall = 100%

SLIDE 8

Open NLP: Multiple Domains with Voice

  • Query: “Show me my password”
  • Domains: TV, Home, Customer Care, News
  • Domain Selector scores: 0.03, 0.04, 0.03, 0.90 (only Customer Care, at 0.90, exceeds the threshold)
  • Threshold = 0.10
  • Selected = {Customer Care}; Applicable = {Customer Care}
  • Precision = 100%; Recall = 100%
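
The two Domain Selector examples (threshold 0.10) can be reproduced with a few lines of Python. This is a minimal sketch, not Comcast's implementation: the function names are hypothetical, and the assignment of individual scores to domains is an assumption, since the slides only show which domains end up selected.

```python
def select_domains(scores, threshold=0.10):
    """Return every domain whose selector score exceeds the threshold."""
    return {domain for domain, score in scores.items() if score > threshold}


def precision_recall(selected, applicable):
    """Precision and recall of the selected set against the applicable set."""
    hits = len(selected & applicable)
    # Convention chosen for this sketch: an empty set counts as perfect.
    precision = hits / len(selected) if selected else 1.0
    recall = hits / len(applicable) if applicable else 1.0
    return precision, recall


# ASSUMPTION: which score belongs to which domain is not recoverable from
# the slides; only the selected sets are. This mapping is illustrative.
heat_scores = {"Home": 0.80, "TV": 0.15, "Customer Care": 0.02, "News": 0.03}
password_scores = {"TV": 0.03, "Home": 0.04, "News": 0.03, "Customer Care": 0.90}

heat_selected = select_domains(heat_scores)
password_selected = select_domains(password_scores)

print(sorted(heat_selected), precision_recall(heat_selected, {"TV", "Home"}))
print(sorted(password_selected),
      precision_recall(password_selected, {"Customer Care"}))
```

A low threshold such as 0.10 favors recall: a query is forwarded to every plausible domain, and the Answer Selector picks among the responses afterwards.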

SLIDE 9

Domain Selector in Practice

  • Cascade of Deep Learning Models of increasing complexity

Diagram, for the query “HBO”: Entity Detection Service → Simple Model → Complex Model, each stage with YES/NO branches leading to SEND TO DOMAIN or DO NOT SEND TO DOMAIN

SLIDE 10

Domain Selector in Practice

  • Cascade of Deep Learning Models of increasing complexity

Diagram, for the query “Show me funny comedies”: Entity Detection Service → Simple Model → Complex Model, each stage with YES/NO branches leading to SEND TO DOMAIN or DO NOT SEND TO DOMAIN
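
The cascade idea can be sketched as a chain of stages in which each stage either decides or hands the query to the next, more expensive one. Everything in this sketch is a hypothetical stand-in: the three functions are simple rules rather than deep learning models, the entity list and keywords are invented, and the outcomes are illustrative rather than read off the slides.

```python
from typing import Callable, List, Optional, Tuple

# A stage returns True (send to domain), False (do not send),
# or None (undecided: escalate to the next stage).
Stage = Callable[[str], Optional[bool]]

KNOWN_TV_ENTITIES = {"hbo"}  # hypothetical entity list


def entity_detection(query: str) -> Optional[bool]:
    """Cheapest stage: an exact match against known entities."""
    return True if query.lower() in KNOWN_TV_ENTITIES else None


def simple_model(query: str) -> Optional[bool]:
    """Stand-in for a small classifier that decides only on obvious cues."""
    words = set(query.lower().split())
    return True if words & {"comedies", "movies", "shows"} else None


def complex_model(query: str) -> Optional[bool]:
    """Stand-in for the most expensive model, which must always decide."""
    words = set(query.lower().split())
    return bool(words & {"watch", "tv", "channel"})


STAGES: List[Stage] = [entity_detection, simple_model, complex_model]


def run_cascade(query: str, stages: List[Stage]) -> Tuple[bool, str]:
    """Return the decision and the name of the stage that made it."""
    for stage in stages:
        decision = stage(query)
        if decision is not None:
            return decision, stage.__name__
    return False, "default"


for q in ["HBO", "Show me funny comedies", "Show me my password"]:
    print(q, "->", run_cascade(q, STAGES))
```

The point of the ordering is cost: most queries are resolved by the cheap early stages, so the expensive model only runs on the queries that remain ambiguous.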

SLIDE 11

X1 Smart TV with Voice

  • Query: “who plays the oracle in matrix”

Diagram: Voice remote → ASR → query → NLP modules → action → Set-top Box → TV. The NLP modules pass the Question (text) to a QA component, which returns the Answer (id or text).

SLIDE 12

First-order Question Answering

  • Given:
      • Question in natural-language form q
      • Structured knowledge base that contains a list of facts
        [ subject – relation – (attribute) – object ]
  • Return:
      • Answer to q
  • Assuming:
      • q is answerable by a single fact.
      • The source entity is mentioned in q.
      • The answer is a neighbor of the source entity node.

Diagram labels: subject, object, attribute. Example nodes: “Matrix”, “Keanu Reeves”, “Neo”, “Tom Hanks”, “9/1/1956”
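
Under these assumptions, answering a question reduces to a single lookup in the fact list. The sketch below is illustrative only: the `Fact` type, the relation names `cast` and `birth`, and the mapping of the example nodes onto subject, attribute and object are assumptions; the node values themselves come from the slide.

```python
from typing import List, NamedTuple, Optional


class Fact(NamedTuple):
    subject: str
    relation: str
    attribute: Optional[str]  # optional qualifier, e.g. a character name
    obj: str


# Node values are from the slide; relation names and roles are assumed.
FACTS = [
    Fact("Matrix", "cast", "Neo", "Keanu Reeves"),
    Fact("Tom Hanks", "birth", None, "9/1/1956"),
]


def answer(subject: str, relation: str, attribute: Optional[str] = None,
           facts: List[Fact] = FACTS) -> List[str]:
    """Return the objects of all facts matching subject, relation and,
    if given, attribute: the neighbors of the source entity node."""
    return [
        f.obj
        for f in facts
        if f.subject == subject
        and f.relation == relation
        and (attribute is None or f.attribute == attribute)
    ]


print(answer("Matrix", "cast", "Neo"))  # who plays Neo in Matrix
print(answer("Tom Hanks", "birth"))
```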

SLIDE 13

Question Answering with Knowledge Graph

Example question: “How old is Tom Hanks?”

  • Extract Entities: Question → [ e1, …, eN ] (names / titles)
  • Predict Relation: Question → relation r
  • Structured Query: Subj=e1, Rel=r, Obj=?
  • Search the Knowledge Graph: e1 | r | e2
  • Generate Answer: Text answer
  • Train: subj | rel | obj

SLIDE 14

Question Answering with Knowledge Graph

Example question: “How old is Tom Hanks?”

  • Extract Entities: [ e1, …, eN ] (names / titles) → Tom Hanks
  • Predict Relation: relation r → birth
  • Structured Query: Subj=e1, Rel=r, Obj=e2 → Subj=Tom Hanks, Rel=birth, Obj=?
  • Search the Knowledge Graph: e1 | r | e2 → Tom Hanks|birth|1956
  • Generate Answer: Text answer → “Tom Hanks is 59 years old” (the slide also shows “Tom Hanks is 55 years old.”)
  • Train: subj | rel | obj
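
The Tom Hanks example can be traced end to end in a short script. This is a sketch under stated assumptions: entity extraction and relation prediction are replaced by string matching (the talk uses learned models), all function names are hypothetical, and `current_year=2015` is chosen only because it reproduces the slide's answer of 59 from the birth year 1956.

```python
from typing import List, Optional

# The one triple shown on the slide: Tom Hanks | birth | 1956
KNOWLEDGE_GRAPH = {("Tom Hanks", "birth"): 1956}
KNOWN_ENTITIES = ["Tom Hanks"]  # hypothetical entity dictionary


def extract_entities(question: str) -> List[str]:
    """Stand-in for the entity extractor: plain substring matching."""
    q = question.lower()
    return [e for e in KNOWN_ENTITIES if e.lower() in q]


def predict_relation(question: str) -> Optional[str]:
    """Stand-in for the relation classifier: a keyword rule."""
    q = question.lower()
    return "birth" if ("how old" in q or "born" in q) else None


def answer_question(question: str, current_year: int) -> Optional[str]:
    entities = extract_entities(question)
    relation = predict_relation(question)
    if not entities or relation is None:
        return None
    subject = entities[0]
    # Structured query: Subj=subject, Rel=relation, Obj=?
    birth_year = KNOWLEDGE_GRAPH.get((subject, relation))
    if birth_year is None:
        return None
    # Year difference only; ignores whether the birthday has passed.
    return f"{subject} is {current_year - birth_year} years old"


print(answer_question("How old is Tom Hanks?", current_year=2015))
```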

SLIDE 15

Question Answering with Knowledge Graph using Recurrent Neural Networks (RNNs)

Question: “where Tom Hanks was born”

  • Entity Detection ~ Tagging: an RNN with memory tags each word
      where → NA, Tom → Subj, Hanks → Subj, was → NA, born → NA
    and yields [ e1, …, eN ] (names / titles)
  • Relation Prediction ~ Classification: an RNN with memory reads the question and predicts relation r, here “place of birth”
  • Structured Query: subj=e, obj=?, attr=?, rel=r
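
Once the tagger has labeled each word, turning the tags into a structured query is a small post-processing step. The sketch below assumes the tag sequence from the slide as given (it does not run an RNN); the function name and the dictionary layout of the query are hypothetical.

```python
from typing import List


def extract_spans(tokens: List[str], tags: List[str],
                  label: str = "Subj") -> List[str]:
    """Join each run of consecutive tokens tagged `label` into one span."""
    spans, current = [], []
    for token, tag in zip(tokens, tags):
        if tag == label:
            current.append(token)
        elif current:
            spans.append(" ".join(current))
            current = []
    if current:  # flush a span that ends at the last token
        spans.append(" ".join(current))
    return spans


# Words and tags as shown on the slide.
tokens = ["where", "Tom", "Hanks", "was", "born"]
tags = ["NA", "Subj", "Subj", "NA", "NA"]

entities = extract_spans(tokens, tags)
# The relation would come from the separate classification model.
query = {"subj": entities[0], "rel": "place of birth", "obj": "?"}
print(entities, query)
```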

SLIDE 16

Recurrent Neural Networks

Diagram labels: word, input, hidden, memory, output. Example: the words “washington” and “heights” with candidate tags PER and LOC and output scores 0.39 / 0.61 and 0.89 / 0.11.
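
The role of the memory can be made concrete with a forward pass of a plain (Elman-style) recurrent network: the hidden state computed for one word is fed back in when the next word is read, so the output for “heights” depends on having seen “washington”. This is a minimal sketch with random, untrained weights, so the printed probabilities are arbitrary and will not match the slide; the layer sizes and all names are assumptions.

```python
import math
import random


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vec):
    return [sum(w * v for w, v in zip(row, vec)) for row in matrix]


def rnn_forward(inputs, W_xh, W_hh, W_hy, hidden_size):
    """h_t = tanh(W_xh x_t + W_hh h_{t-1}); y_t = softmax(W_hy h_t)."""
    h = [0.0] * hidden_size  # the "memory", empty before the first word
    outputs = []
    for x in inputs:
        pre = [a + b for a, b in zip(matvec(W_xh, x), matvec(W_hh, h))]
        h = [math.tanh(p) for p in pre]
        outputs.append(softmax(matvec(W_hy, h)))
    return outputs


def rand_matrix(rows, cols):
    return [[random.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
VOCAB = ["washington", "heights"]
LABELS = ["PER", "LOC"]
HIDDEN = 4  # arbitrary size for the sketch

W_xh = rand_matrix(HIDDEN, len(VOCAB))
W_hh = rand_matrix(HIDDEN, HIDDEN)
W_hy = rand_matrix(len(LABELS), HIDDEN)

# One-hot input vectors for the sequence "washington heights".
sequence = [[1.0, 0.0], [0.0, 1.0]]
probs = rnn_forward(sequence, W_xh, W_hh, W_hy, HIDDEN)
for word, p in zip(VOCAB, probs):
    print(word, dict(zip(LABELS, (round(v, 2) for v in p))))
```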

SLIDE 17

AI for Content Discovery – Automatic Content Analysis

  • Netflix
  • Live TV
  • Online Video

SLIDE 18

Most metadata is at the asset level

  • Genres
  • Credits
  • Synopsis
  • Keywords

SLIDE 19

Much more data exists within the asset

  • Chapters
  • Moments
  • Annotations

Granularity levels, finest to coarsest: Frame, Shot, Scene, Chapter, Movie

SLIDE 20

Why is this useful?

  • Who is in this scene?
  • What are the best moments on TV?
  • In-game highlight navigation
  • Search & Recommendations

SLIDE 21

How does Automatic Content Analysis work?

Diagram: Video → Computer Vision, Audio Analysis, Natural Language Processing → AI & Machine Learning → Chaptering, Scene-level Annotations, Frame-level Annotations

SLIDE 22

Why is it possible now?

  • Big Data
  • Better Algorithms (Deep learning)
  • Cloud/GPU Computing

Chart: Large-scale image recognition performance

SLIDE 23

Super-human accuracy in speech and image recognition!

  • Big Data
  • Better Algorithms (Deep learning)
  • Cloud/GPU Computing

Chart: Large-scale image recognition performance

SLIDE 24

New experiences!

  • Big Data
  • Better Algorithms (Deep learning)
  • Cloud/GPU Computing

SLIDE 25

Example Application: In-Game Highlights

  • Place highlights over games recorded onto customers’ DVRs for football, baseball, hockey, basketball and soccer.
  • The “In-Game Highlights” feature for NFL was released on Comcast X1 last fall.

“I’ll record as many games as I can. When I don’t want to watch the whole game, it’s a great way to do it.” – Customer Testimonial

SLIDE 26

AI for Content Discovery – Personalization

  • Netflix
  • Live TV
  • Online Video

SLIDE 27

Personalized Entertainment Experiences

What is popular right now? + What do you like? = Personalized Recommendations

SLIDE 28

Live TV Recommender System

What should I watch right now?

Deep learning-based recommender system for Live TV, trained on a joint embedding space to combine the scores

  • Channel- and program-based recommendations
  • Time-dependent recommendations
  • Trending/popular and personal favorite channels, programs, sports teams
  • Rich content descriptions from automatic content analysis

Inputs: Favorite Channels, Favorite Programs, Collaborative Filtering, Trending, Popularity, Content Descriptions
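
To make "combining the scores" concrete, the sketch below blends per-signal scores with a fixed weighted sum. This is deliberately a simpler technique than the learned joint embedding space the slide describes, swapped in only to show how several signals can produce one ranking; the weights, program names and score values are all invented for illustration.

```python
# Hypothetical weights per signal; a learned model would replace these.
WEIGHTS = {
    "favorite_channel": 0.25,
    "favorite_program": 0.25,
    "collaborative_filtering": 0.20,
    "trending": 0.10,
    "popularity": 0.10,
    "content_similarity": 0.10,
}


def blend(signals, weights=WEIGHTS):
    """Weighted sum of the available signals; missing signals count as 0."""
    return sum(w * signals.get(name, 0.0) for name, w in weights.items())


def rank(candidates):
    """Order candidate programs by blended score, best first."""
    return sorted(candidates, key=lambda name: blend(candidates[name]),
                  reverse=True)


# Invented candidates airing "right now", with per-signal scores in [0, 1].
candidates = {
    "program_a": {"favorite_channel": 1.0, "trending": 0.2},
    "program_b": {"collaborative_filtering": 0.5, "popularity": 0.8},
    "program_c": {"trending": 1.0},
}

print(rank(candidates))
```

A time-dependent recommender would recompute the candidate set and the trending and popularity signals for the current time slot before ranking.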

SLIDE 29

Deep Learning Infrastructure

  • Netflix
  • Live TV
  • Online Video

SLIDE 30

Deep Learning Infrastructure

  • Deep Learning Frameworks
      – Keras, TensorFlow, Theano, PyTorch, Caffe (older models)
  • All deployments use nvidia-docker
      – Thanks to the Nvidia solutions team for help with best practices
  • All deep learning training is done on multi-GPU servers
      – Nvidia Tesla (Production) and 8x Titan X (Dev) GPUs
      – Nvidia DGX-1 for large-scale training (video and NLP)
  • Next steps
      – Container scheduler: Kubernetes and HashiCorp Nomad
      – Network compression/simplification for increased efficiency (TensorRT)

SLIDE 31

Deep Learning-based ML is applied everywhere at Comcast

Machine Learning, Data Science, Big Data, AI: Improving Customer Experience Everywhere at Comcast/NBCU

  • High Speed Internet
  • Video
  • IP Telephony
  • Home Security / Automation
  • Universal Parks
  • Media Properties

For more info see: dclabs.comcast.com

SLIDE 32