How GPUs Power Comcast's X1 Voice Remote and Smart Video Analytics Jan Neumann Comcast Labs DC May 10th, 2017
2
Comcast Applied Artificial Intelligence Lab
Smart Home Smart TV Smart Internet
Media & Video Analytics Deep Learning Data Science Recommendations & Search Voice & NLP
3
Today: How Comcast Uses AI to Evolve and Reinvent the TV Experience
Smart Home Smart TV Smart Internet
Media & Video Analytics Deep Learning Data Science Recommendations & Search Voice & NLP
4
Netflix
LIVETV
Online Video
AI for Content Discovery – Voice Search
5
- Query: “HBO”
X1 Smart TV with Voice
Answer Selector Voice remote ASR
query
NLP modules
action
Set-top Box TV
6
Open NLP: Multiple Domains with Voice
TV HOME
. . . query Domain Selector Answer Selector
. . .
Answer Selector response
CUSTOMER CARE
NEWS
7
Open NLP: Multiple Domains with Voice
TV HOME
“Turn on the heat”
Domain Selector Answer Selector Answer Selector
CUSTOMER CARE
NEWS
response
Scores: 0.80 0.15 0.02 0.03 (Threshold=0.10)
Selected={TV, Home} Applicable={TV, Home}
Precision=100% Recall=100%
8
Open NLP: Multiple Domains with Voice
TV HOME
“Show me my password”
Domain Selector Answer Selector Answer Selector
CUSTOMER CARE
NEWS
response
Scores: 0.03 0.04 0.03 0.90 (Threshold=0.10)
Selected={Customer Care} Applicable={Customer Care}
Precision=100% Recall=100%
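The thresholding on these two slides can be sketched in a few lines of Python. This is a minimal illustration, not Comcast's implementation: the function names are hypothetical, and the assignment of each score to a domain is an assumption chosen to agree with the selected set on the first example slide, since the scores appear without labels.

```python
def select_domains(scores, threshold=0.10):
    """Return every domain whose score reaches the threshold."""
    return {domain for domain, score in scores.items() if score >= threshold}


def precision_recall(selected, applicable):
    """Precision and recall of the selected set against the applicable set."""
    hits = len(selected & applicable)
    precision = hits / len(selected) if selected else 0.0
    recall = hits / len(applicable) if applicable else 0.0
    return precision, recall


# Assumed score-to-domain mapping for the first example query.
scores = {"TV": 0.80, "Home": 0.15, "Customer Care": 0.02, "News": 0.03}
selected = select_domains(scores)
precision, recall = precision_recall(selected, applicable={"TV", "Home"})
print(sorted(selected), precision, recall)
```

A lower threshold sends the query to more domains, which favors recall at the cost of precision; a higher threshold does the opposite.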
9
- Cascade of Deep Learning Models of increasing complexity
Domain Selector in Practice
Entity Detection Service “HBO” Simple Model Complex Model SEND TO DOMAIN DO NOT SEND TO DOMAIN YES YES NO YES NO NO
10
- Cascade of Deep Learning Models of increasing complexity
Domain Selector in Practice
Entity Detection Service “Show me funny comedies” Simple Model Complex Model SEND TO DOMAIN DO NOT SEND TO DOMAIN YES YES NO YES NO NO
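One plausible reading of the cascade diagram is that each stage is tried in order of increasing cost, and the query is sent to the domain as soon as any stage answers YES. The sketch below assumes that reading. The three stage functions are toy keyword rules standing in for the real entity detection service and deep learning models, and every name in it is hypothetical.

```python
def should_send_to_domain(query, stages):
    """Try stages from cheapest to most expensive; stop at the first YES."""
    for name, accepts in stages:
        if accepts(query):
            return True, name
    return False, None


# Toy stand-ins for the three stages named on the slide.
KNOWN_ENTITIES = {"hbo"}


def entity_detection(query):
    return query.lower() in KNOWN_ENTITIES


def simple_model(query):
    return query.lower().startswith(("watch ", "record "))


def complex_model(query):
    return any(genre in query.lower() for genre in ("comedies", "dramas", "movies"))


TV_STAGES = [
    ("entity detection", entity_detection),
    ("simple model", simple_model),
    ("complex model", complex_model),
]

for query in ("HBO", "Show me funny comedies", "Turn on the heat"):
    print(query, should_send_to_domain(query, TV_STAGES))
```

Which stage accepts a given query here is an artifact of the toy rules. The point of the structure is that cheap checks handle the easy queries, so the expensive model only runs on the remainder.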
11
- Query: “who plays the oracle in matrix”
X1 Smart TV with Voice
Voice remote
query action
QA
Answer (id or text) Question (text) ASR NLP modules Set-top Box TV
12
- Given:
- Question in natural-language form q
- Structured knowledge base that contains list of facts
- [ subject – relation – (attribute) – object ]
- Return:
- Answer to q
- Assuming:
- q answerable by a single fact.
- Source entity mentioned in q.
- Answer is neighbor of source entity node.
First-order Question Answering
subject object attribute
“Matrix” “Keanu Reeves” “Neo” “Tom Hanks” “9/1/1956”
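Under the three assumptions listed on this slide, answering a question reduces to looking up one fact next to the source entity. A minimal sketch follows, with the knowledge base held as a list of [ subject – relation – (attribute) – object ] tuples. The relation name "actor" and the grouping of the example values into facts are assumptions; only the relation name "birth" comes from a later slide.

```python
from typing import NamedTuple, Optional


class Fact(NamedTuple):
    subject: str
    relation: str
    obj: str
    attribute: Optional[str] = None  # e.g. the role an actor plays


# Assumed grouping of the example values shown on the slide.
FACTS = [
    Fact("Matrix", "actor", "Keanu Reeves", attribute="Neo"),
    Fact("Tom Hanks", "birth", "9/1/1956"),
]


def answer(subject, relation, attribute=None):
    """Return the objects of all facts that match subject, relation and attribute."""
    return [
        fact.obj
        for fact in FACTS
        if fact.subject == subject
        and fact.relation == relation
        and (attribute is None or fact.attribute == attribute)
    ]


print(answer("Matrix", "actor", attribute="Neo"))
print(answer("Tom Hanks", "birth"))
```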
13
Question Answering with Knowledge Graph
Predict Relation Question Extract Entities [ e1, …, eN ] names / titles
Structured Query Subj=e1 Obj=? Rel=r
Knowledge Graph
Search
e1 | r | e2
relation r Generate Answer
Text answer
Train
subj | rel | obj How old is Tom Hanks?
14
Question Answering with Knowledge Graph
Predict Relation Question Extract Entities [ e1, …, eN ] names / titles
Structured Query
Subj=e1 Obj=e2 Rel=r
Knowledge Graph
Search
e1 | r | e2
relation r
Generate Answer
Text answer
Train
subj | rel | obj
[ e1, …, eN ] names / titles
Subj=Tom Hanks Rel=birth Obj = ?
relation r
Tom Hanks is 55 years old.
birth Tom Hanks
Tom Hanks|birth|1956 Tom Hanks is 59 years old How old is Tom Hanks?
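The pipeline on this slide (extract entities, predict the relation, build a structured query, search the knowledge graph, generate a text answer) can be sketched end to end. In this sketch the entity extractor and relation predictor are simple string matches standing in for the trained models, all function names are hypothetical, and the reference year 2015 is an assumption picked so that the toy output matches the "59 years old" answer on the slide. The age is computed from the birth year alone.

```python
KNOWLEDGE_GRAPH = {("Tom Hanks", "birth"): "1956"}  # subj | rel | obj, as on the slide
KNOWN_ENTITIES = ["Tom Hanks"]
RELATION_KEYWORDS = {"how old": "birth", "born": "birth"}  # toy relation predictor


def extract_entities(question):
    """Return the known names or titles mentioned in the question."""
    return [e for e in KNOWN_ENTITIES if e.lower() in question.lower()]


def predict_relation(question):
    """Map the question to a relation by keyword; None if nothing matches."""
    lowered = question.lower()
    for keyword, relation in RELATION_KEYWORDS.items():
        if keyword in lowered:
            return relation
    return None


def answer_question(question, as_of_year):
    entities = extract_entities(question)
    relation = predict_relation(question)
    if not entities or relation is None:
        return None
    subject = entities[0]
    # Structured query: Subj=e1, Rel=r, Obj=?
    obj = KNOWLEDGE_GRAPH.get((subject, relation))
    if obj is None:
        return None
    # Generate the text answer from the retrieved fact.
    if relation == "birth" and "how old" in question.lower():
        return f"{subject} is {as_of_year - int(obj)} years old"
    return obj


print(answer_question("How old is Tom Hanks?", as_of_year=2015))
# prints: Tom Hanks is 59 years old
```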
15
Entity Detection [ e1, …, eN ] names / titles Predict Relation relation r subj=e obj=? attr=? rel=r
Question Answering with Knowledge Graph using Recurrent Neural Networks (RNNs)
Structured Query Question
where Tom Hanks was born
place of birth
memory
where Tom Hanks was born
NA Subj Subj NA NA
memory
Entity Detection ~ Tagging Relation Prediction ~ Classification
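The difference between the two uses of the RNN is where the output is read: tagging reads one output per word, while classification reads a single output after the last word. The pure-Python sketch below shows only that structure. It is an Elman-style RNN with random, untrained weights, so its probabilities are meaningless. The class name, layer sizes, and the number of relations are all hypothetical.

```python
import math
import random


def softmax(values):
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class TinyRNN:
    """Elman-style RNN with untrained random weights (illustration only)."""

    def __init__(self, vocab, hidden_size, n_tags, n_relations, seed=0):
        rng = random.Random(seed)
        self.vocab = {word: i for i, word in enumerate(vocab)}
        self.hidden_size = hidden_size
        self.w_in = random_matrix(hidden_size, len(vocab), rng)
        self.w_rec = random_matrix(hidden_size, hidden_size, rng)
        self.w_tag = random_matrix(n_tags, hidden_size, rng)
        self.w_rel = random_matrix(n_relations, hidden_size, rng)

    def run(self, words):
        hidden = [0.0] * self.hidden_size  # the "memory" carried between words
        tag_probs = []
        for word in words:
            one_hot = [0.0] * len(self.vocab)
            one_hot[self.vocab[word]] = 1.0
            pre = [a + b for a, b in zip(matvec(self.w_in, one_hot),
                                         matvec(self.w_rec, hidden))]
            hidden = [math.tanh(p) for p in pre]
            # Tagging: one distribution over tags (e.g. NA / Subj) per word.
            tag_probs.append(softmax(matvec(self.w_tag, hidden)))
        # Classification: one distribution over relations per question.
        relation_probs = softmax(matvec(self.w_rel, hidden))
        return tag_probs, relation_probs


words = ["where", "tom", "hanks", "was", "born"]
rnn = TinyRNN(vocab=words, hidden_size=4, n_tags=2, n_relations=3)
tag_probs, relation_probs = rnn.run(words)
print(len(tag_probs), len(relation_probs))
```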
16
word
hidden input output
0.39 0.61
washington heights
0.89 0.11 memory
Recurrent Neural Networks
LOC PER PER LOC
17
AI for Content Discovery – Automatic Content Analysis
18
Most metadata is at the asset level
- Genres
- Credits
- Synopsis
- Keywords
19
Much more data exists within the asset
- Chapters
- Moments
- Annotations
Movie Frame Shot Scene Chapter
20
Why is this useful?
- Who is in this scene?
- What are the best moments on TV?
- In-game highlight navigation
- Search & Recommendations
21
How does Automatic Content Analysis work?
Computer Vision
Audio Analysis
Natural Language Processing AI & Machine Learning
Chaptering Scene-level Annotations
Video
Frame-level Annotations
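One way to picture the step from frame-level to scene-level annotations is as an aggregation over the frames in a scene. The sketch below is an assumption about how such a roll-up could work, not a description of the actual system: the class names, the label strings, and the rule "keep labels seen in at least half of the frames" are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Frame:
    timestamp: float  # seconds from the start of the asset
    labels: list      # frame-level annotations, e.g. recognized people or objects


@dataclass
class Scene:
    start: float
    end: float
    frames: list = field(default_factory=list)

    def annotations(self, min_fraction=0.5):
        """Keep labels that appear in at least min_fraction of the scene's frames."""
        if not self.frames:
            return []
        counts = Counter(label for frame in self.frames for label in set(frame.labels))
        needed = min_fraction * len(self.frames)
        return sorted(label for label, n in counts.items() if n >= needed)


scene = Scene(start=0.0, end=3.0, frames=[
    Frame(0.0, ["actor_a", "car"]),
    Frame(1.0, ["actor_a"]),
    Frame(2.0, ["actor_a", "actor_b"]),
])
print(scene.annotations())
# prints: ['actor_a']
```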
22
Why is it possible now?
Large-scale Image recognition performance
Big Data
Better Algorithms (Deep learning) Cloud/GPU Computing
23
Super-human accuracy in speech and image recognition!
Large-scale Image recognition performance
Big Data
Better Algorithms (Deep learning) Cloud/GPU Computing
24
New experiences! Big Data
Better Algorithms (Deep learning) Cloud/GPU Computing
25
- Place highlights over games recorded onto customers’ DVRs for football, baseball, hockey, basketball and soccer.
Example Application: In-Game Highlights
“I’ll record as many games as I can. When I don’t want to watch the whole game, it’s a great way to do it.” – Customer Testimonial
The “In-Game Highlights” feature for the NFL was released on Comcast X1 last fall
26
AI for Content Discovery – Personalization
27
+ =
Personalized Entertainment Experiences
What is popular right now? What do you like?
Personalized Recommendations
28
- Deep learning-based recommender system for Live TV
- Training a joint embedding space to combine the scores
- Channel- and Program-based recommendations
- Time-dependent recommendations
- Trending/popular and personal favorite channels, programs, sport teams
- Rich content descriptions from automatic content analysis
What should I watch right now?
Live TV Recommender System
Favorite Channels Favorite Programs Collaborative Filtering Trending Popularity Content Descriptions
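To make the idea of combining these signals concrete, the sketch below ranks the programs airing right now by a weighted sum of per-signal scores. This is a deliberate simplification: the slide describes a trained joint embedding space, whereas this example swaps in a fixed linear combination. The weights, signal names, and program names are all hypothetical.

```python
# Hypothetical weights; a trained model would learn how to combine the signals.
WEIGHTS = {
    "favorite_channel": 0.3,
    "favorite_program": 0.3,
    "collaborative_filtering": 0.2,
    "trending": 0.1,
    "content_similarity": 0.1,
}


def combined_score(signals, weights=WEIGHTS):
    """Weighted sum of the available signals; missing signals count as 0."""
    return sum(weight * signals.get(name, 0.0) for name, weight in weights.items())


def recommend(candidates, top_k=2):
    """Rank the programs airing now and return the top_k titles."""
    ranked = sorted(candidates, key=lambda c: combined_score(c["signals"]), reverse=True)
    return [c["program"] for c in ranked[:top_k]]


now_airing = [
    {"program": "Live football",
     "signals": {"favorite_channel": 1.0, "trending": 0.9, "collaborative_filtering": 0.4}},
    {"program": "Cooking show",
     "signals": {"favorite_program": 1.0, "content_similarity": 0.6}},
    {"program": "Late news",
     "signals": {"trending": 0.5}},
]
print(recommend(now_airing))
# prints: ['Live football', 'Cooking show']
```

Because the candidate list holds only what is on air at the moment of the request, the same viewer gets different recommendations at different times of day.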
29
Deep Learning Infrastructure
30
- Deep Learning Frameworks
– Keras, TensorFlow, Theano, PyTorch, Caffe (older models)
- All deployments using nvidia-docker
– Thanks to the Nvidia solutions team for help with best practices
- All deep learning training done on multi-GPU servers
– Nvidia Tesla (Production) and 8x Titan X (Dev) GPUs
– Nvidia DGX-1 for large-scale training (video and NLP)
- Next steps
– Container scheduler: Kubernetes and HashiCorp Nomad
– Network compression/simplification for increased efficiency (TensorRT)
Deep Learning Infrastructure
31
Machine Learning Data Science Big Data AI Improving Customer Experience Everywhere at Comcast/NBCU
Deep Learning-based ML is applied everywhere at Comcast
High Speed Internet Video IP Telephony Home Security / Automation Universal Parks Media Properties
For more info see: dclabs.comcast.com