
How GPUs Power Comcast's X1 Voice Remote and Smart Video Analytics

Jan Neumann, Comcast Labs DC, May 10th, 2017


  1. How GPUs Power Comcast's X1 Voice Remote and Smart Video Analytics. Jan Neumann, Comcast Labs DC. May 10th, 2017.

  2. Comcast Applied Artificial Intelligence Lab. Capabilities: Media & Video Analytics; Voice & NLP; Deep Learning; Data Science; Recommendations & Search. Products: Smart TV; Smart Home; Smart Internet.

  3. Today: How Comcast Uses AI to Evolve and Reinvent the TV Experience. Capabilities: Media & Video Analytics; Voice & NLP; Deep Learning; Data Science; Recommendations & Search. Products: Smart TV; Smart Home; Smart Internet.

  4. AI for Content Discovery: Voice Search. Content sources shown: Live TV, Online Video, Netflix.

  5. X1 Smart TV with Voice. Query: “HBO”. Pipeline: Voice remote → ASR → query → NLP modules → Answer Selector → action → Set-top Box → TV.

  6. Open NLP: Multiple Domains with Voice. Pipeline: query → Domain Selector → domains (TV, HOME, NEWS, CUSTOMER CARE, ...) → Answer Selector → response.

  7. Open NLP: Multiple Domains with Voice. Query: “turn on the heat”. Domain Selector scores: TV 0.15, HOME 0.80, NEWS 0.02, CUSTOMER CARE 0.03. Threshold=0.10. Selected={TV, Home}. Applicable={TV, Home}. Precision=100%, Recall=100%.

  8. Open NLP: Multiple Domains with Voice. Query: “Show me my password”. Domain Selector scores: TV 0.04, HOME 0.03, NEWS 0.03, CUSTOMER CARE 0.90. Threshold=0.10. Selected={Customer Care}. Applicable={Customer Care}. Precision=100%, Recall=100%.
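
The thresholding in the two examples above can be sketched in a few lines of Python. This is only an illustration: it assumes the Domain Selector emits one score per domain and forwards the query to every domain whose score is at or above the threshold (the slides do not say whether the comparison is inclusive). The names `select_domains` and `precision_recall` are hypothetical; the scores and threshold are the ones shown on the slides.

```python
def select_domains(scores, threshold=0.10):
    """Return the set of domains whose score is at or above the threshold."""
    return {domain for domain, score in scores.items() if score >= threshold}


def precision_recall(selected, applicable):
    """Precision and recall of the selected domains against the applicable ones."""
    hits = len(selected & applicable)
    precision = hits / len(selected) if selected else 1.0
    recall = hits / len(applicable) if applicable else 1.0
    return precision, recall


# Scores from the "turn on the heat" slide.
heat_scores = {"TV": 0.15, "HOME": 0.80, "NEWS": 0.02, "CUSTOMER CARE": 0.03}
heat_selected = select_domains(heat_scores)
heat_metrics = precision_recall(heat_selected, {"TV", "HOME"})

# Scores from the "Show me my password" slide.
password_scores = {"TV": 0.04, "HOME": 0.03, "NEWS": 0.03, "CUSTOMER CARE": 0.90}
password_selected = select_domains(password_scores)
password_metrics = precision_recall(password_selected, {"CUSTOMER CARE"})

print(sorted(heat_selected), heat_metrics)
print(sorted(password_selected), password_metrics)
```

A lower threshold forwards the query to more domains (favoring recall), while a higher one forwards it to fewer (favoring precision).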

  9. Domain Selector in Practice • Cascade of Deep Learning Models of increasing complexity. Example query: “HBO”. Stages: Entity Detection Service → Simple Model → Complex Model, each with YES/NO branches, ending in SEND TO DOMAIN or DO NOT SEND TO DOMAIN.

  10. Domain Selector in Practice • Cascade of Deep Learning Models of increasing complexity. Example query: “Show me funny comedies”. Stages: Entity Detection Service → Simple Model → Complex Model, each with YES/NO branches, ending in SEND TO DOMAIN or DO NOT SEND TO DOMAIN.
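
A cascade of this kind can be sketched as a list of stages tried from cheapest to most expensive. The slides do not spell out which branch escalates to the next model, so the three-way outcome used here (send, do not send, undecided) is an assumption, and the stage functions are keyword stand-ins for the real entity service and deep learning models. All names and word lists are hypothetical.

```python
from typing import Callable, List, Optional, Tuple

# A stage returns True (send), False (do not send), or None (undecided: escalate).
Stage = Callable[[str], Optional[bool]]

KNOWN_ENTITIES = {"hbo"}  # hypothetical entity list


def entity_detection(query: str) -> Optional[bool]:
    """Cheapest stage: an exact entity match is enough to send the query."""
    return True if query.lower().strip() in KNOWN_ENTITIES else None


def simple_model(query: str) -> Optional[bool]:
    """Stand-in for a small classifier that only decides on clear evidence."""
    words = set(query.lower().split())
    return False if words & {"password", "bill"} else None


def complex_model(query: str) -> Optional[bool]:
    """Stand-in for the most expensive model, which always decides."""
    words = set(query.lower().split())
    return bool(words & {"show", "watch", "comedies", "movies"})


def run_cascade(query: str, stages: List[Stage]) -> Tuple[bool, int]:
    """Run stages in order and stop at the first one that reaches a decision.

    Returns the decision and the index of the stage that made it.
    """
    for index, stage in enumerate(stages):
        decision = stage(query)
        if decision is not None:
            return decision, index
    return False, len(stages)  # no stage decided: do not send


STAGES = [entity_detection, simple_model, complex_model]
for example in ("HBO", "Show me funny comedies"):
    print(example, run_cascade(example, STAGES))
```

The point of the ordering is cost: a short query such as “HBO” is resolved by the first stage, and only queries the cheaper stages cannot decide reach the complex model.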

  11. X1 Smart TV with Voice. Query: “who plays the oracle in matrix”. Pipeline: Voice remote → ASR → query → NLP modules, including QA: Question (text) → Answer (id or text) → action → Set-top Box → TV.

  12. First-order Question Answering
     • Given:
       • Question in natural-language form q
       • Structured knowledge base that contains a list of facts
       • [ subject – relation – (attribute) – object ]
     • Return:
       • Answer to q
     • Assuming:
       • q is answerable by a single fact.
       • The source entity is mentioned in q.
       • The answer is a neighbor of the source entity node.
     Example graph nodes, with subject and object roles marked: “Tom Hanks”, “9/1/1956”, “Keanu Reeves”, “Matrix”. Edge attribute: “Neo”.
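
The fact format and the single-fact assumption above can be illustrated with a tiny in-memory fact list. This is a sketch, not the knowledge base used in the talk: the `Fact` type, the relation names `acted_in` and `birth`, and the `answer` function are hypothetical, and the two facts are a reading of the labels on the slides.

```python
from typing import List, NamedTuple, Optional


class Fact(NamedTuple):
    """One [ subject - relation - (attribute) - object ] entry."""
    subject: str
    relation: str
    obj: str
    attribute: Optional[str] = None


FACTS = [
    Fact("Keanu Reeves", "acted_in", "Matrix", attribute="Neo"),
    Fact("Tom Hanks", "birth", "1956"),
]


def answer(facts: List[Fact], source: str, relation: str,
           attribute: Optional[str] = None) -> List[str]:
    """Answer a first-order question with a single fact lookup.

    The answer is the neighbor of `source` across one matching edge,
    whichever end of the fact the source entity sits on.
    """
    neighbors = []
    for fact in facts:
        if fact.relation != relation:
            continue
        if attribute is not None and fact.attribute != attribute:
            continue
        if fact.subject == source:
            neighbors.append(fact.obj)
        elif fact.obj == source:
            neighbors.append(fact.subject)
    return neighbors


# "Who plays Neo in Matrix?": source entity "Matrix", attribute "Neo".
print(answer(FACTS, "Matrix", "acted_in", attribute="Neo"))
```

Questions that need two or more facts chained together fall outside this first-order setting.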

  13. Question Answering with Knowledge Graph. Question: “How old is Tom Hanks?” Steps: Extract Entities (search names / titles) → [ e1, …, eN ]; Predict Relation → r; Structured Query: Subj = e1, Rel = r, Obj = ?; Knowledge Graph (subj | rel | obj, also marked “Train”) returns e1 | r | e2; Generate Answer → Text answer.

  14. Question Answering with Knowledge Graph (worked example). Question: “How old is Tom Hanks?” Extract Entities (search names / titles over [ e1, …, eN ]): Tom Hanks. Predict Relation: birth. Structured Query: Subj = Tom Hanks, Rel = birth, Obj = ?. Knowledge Graph fact: Tom Hanks | birth | 1956. Generate Answer → Text answer: “Tom Hanks is 59 years old” (an overlapping label on the slide reads “is 55 years old.”).
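
The steps of this pipeline can be sketched end to end in Python. Everything here is a simplified stand-in: entity extraction is a plain substring match against known names, relation prediction is a keyword table rather than a trained model, and the reference year is passed in explicitly because the slide does not state it. All function and variable names are hypothetical.

```python
from typing import List, Optional

# subj | rel | obj facts, keyed by (subject, relation).
KNOWLEDGE_GRAPH = {("Tom Hanks", "birth"): "1956"}

# Keyword table standing in for a learned relation predictor.
RELATION_KEYWORDS = {"how old": "birth", "born": "birth"}


def extract_entities(question: str, known_names: List[str]) -> List[str]:
    """Return the known entity names that appear in the question."""
    lowered = question.lower()
    return [name for name in known_names if name.lower() in lowered]


def predict_relation(question: str) -> Optional[str]:
    """Map the question to a relation via keyword match."""
    lowered = question.lower()
    for keyword, relation in RELATION_KEYWORDS.items():
        if keyword in lowered:
            return relation
    return None


def answer_question(question: str, current_year: int) -> Optional[str]:
    """Extract entity, predict relation, query the graph, generate text."""
    names = sorted({subject for subject, _ in KNOWLEDGE_GRAPH})
    entities = extract_entities(question, names)
    relation = predict_relation(question)
    if not entities or relation is None:
        return None
    subject = entities[0]
    # Structured query: Subj = subject, Rel = relation, Obj = ?
    obj = KNOWLEDGE_GRAPH.get((subject, relation))
    if obj is None:
        return None
    if relation == "birth" and "how old" in question.lower():
        return f"{subject} is {current_year - int(obj)} years old"
    return f"{subject} | {relation} | {obj}"


print(answer_question("How old is Tom Hanks?", current_year=2015))
```

With only a birth year stored, the generated age depends entirely on the reference year supplied to `answer_question`.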

  15. Question Answering with Knowledge Graph using Recurrent Neural Networks (RNNs). Pipeline: Question → Entity Detection (names / titles, [ e1, …, eN ]) and Predict Relation (r) → Structured Query: subj = e, rel = r, attr = ?, obj = ?. Entity Detection ~ Tagging: “where Tom Hanks was born” → NA Subj Subj NA NA. Relation Prediction ~ Classification: “where Tom Hanks was born” → place of birth. Both networks carry a memory across words.

  16. Recurrent Neural Networks. Layers: input word → hidden (with memory) → output. Example inputs: “washington”, “heights”. Outputs per word: LOC 0.39 / PER 0.61, then LOC 0.89 / PER 0.11.
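
The input, hidden, memory, and output layers on this slide correspond to a simple (Elman-style) recurrent network. The sketch below runs one forward pass in pure Python; it is an assumption about the architecture, not the model used in the talk. The weights are random and untrained, so the printed probabilities will not match the slide's values; the point is only how the memory from the previous word feeds into the current step.

```python
import math
import random

LABELS = ["LOC", "PER"]
VOCAB = {"washington": 0, "heights": 1}
HIDDEN_SIZE = 4


def softmax(values):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def init_matrix(rows, cols, rng):
    """Small random weights; a real model would learn these."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def tag_sequence(words, seed=0):
    """Return one label distribution per word from an untrained RNN."""
    rng = random.Random(seed)
    w_input = init_matrix(HIDDEN_SIZE, len(VOCAB), rng)    # input -> hidden
    w_memory = init_matrix(HIDDEN_SIZE, HIDDEN_SIZE, rng)  # memory -> hidden
    w_output = init_matrix(len(LABELS), HIDDEN_SIZE, rng)  # hidden -> output
    memory = [0.0] * HIDDEN_SIZE
    outputs = []
    for word in words:
        one_hot = [1.0 if i == VOCAB[word] else 0.0 for i in range(len(VOCAB))]
        pre_activation = [a + b for a, b in zip(matvec(w_input, one_hot),
                                                matvec(w_memory, memory))]
        hidden = [math.tanh(v) for v in pre_activation]
        outputs.append(dict(zip(LABELS, softmax(matvec(w_output, hidden)))))
        memory = hidden  # carried forward to the next word
    return outputs


for word, probs in zip(["washington", "heights"],
                       tag_sequence(["washington", "heights"])):
    print(word, probs)
```

Because the hidden state for “heights” depends on the memory left by “washington”, a trained network of this shape can shift its prediction once the second word arrives.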

  17. AI for Content Discovery: Automatic Content Analysis. Content sources shown: Live TV, Online Video, Netflix.

  18. Most metadata is at the asset level • Genres • Credits • Synopsis • Keywords

  19. Much more data exists within the asset • Chapters • Moments • Annotations. Levels shown: Movie, Chapter, Scene, Shot, Frame.

  20. Why is this useful? In-game highlight navigation. What are the best moments on TV? Who is in this scene? Search & Recommendations.

  21. How does Automatic Content Analysis work? Input: Video & Audio Analysis. Techniques: Computer Vision; AI & Machine Learning; Natural Language Processing. Outputs: Chaptering; Scene-level Annotations; Frame-level Annotations.

  22. Why is it possible now? Better Algorithms (Deep learning); Big Data; Cloud/GPU Computing. Chart: large-scale image recognition performance.

  23. Super-human accuracy in speech and image recognition! Better Algorithms (Deep learning); Big Data; Cloud/GPU Computing. Chart: large-scale image recognition performance.

  24. New experiences! Better Algorithms (Deep learning); Big Data; Cloud/GPU Computing.

  25. Example Application: In-Game Highlights • Place highlights over games recorded onto customers’ DVRs for football, baseball, hockey, basketball and soccer. The “In-Game Highlights” feature for the NFL was released on Comcast X1 last fall. “I’ll record as many games as I can. When I don’t want to watch the whole game, it’s a great way to do it.” (Customer Testimonial)

  26. AI for Content Discovery: Personalization. Content sources shown: Live TV, Online Video, Netflix.

  27. Personalized Entertainment Experiences. What is popular right now? + What do you like? = Personalized Recommendations.

  28. What should I watch right now? Deep learning-based recommender system for Live TV
     - Training a joint embedding space to combine the scores
     - Channel- and Program-based recommendations
     - Time-dependent recommendations
     - Trending/popular and personal favorite channels, programs, sports teams
     - Rich content descriptions from automatic content analysis
     Inputs to the Live TV Recommender System: Favorite Channels, Favorite Programs, Content Descriptions, Collaborative Filtering, Trending Popularity.
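
One way to picture how personal taste and trending popularity might be combined is to score programs against a user vector in a shared embedding space and blend that with a popularity signal. The sketch below is purely illustrative: the slide does not describe the model, so the cosine scoring, the fixed `popularity_weight`, and all names and numbers are assumptions, and no training step is shown.

```python
import math


def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_programs(user_vector, program_vectors, popularity, popularity_weight=0.3):
    """Rank programs by a blend of personal similarity and current popularity."""
    scored = []
    for name, vector in program_vectors.items():
        personal = cosine(user_vector, vector)
        trending = popularity.get(name, 0.0)
        score = (1 - popularity_weight) * personal + popularity_weight * trending
        scored.append((score, name))
    return [name for _, name in sorted(scored, reverse=True)]


# Hypothetical two-dimensional embeddings and popularity scores.
user = [1.0, 0.0]
programs = {"football": [1.0, 0.0], "news": [0.0, 1.0]}
trending_now = {"football": 0.2, "news": 0.5}

print(rank_programs(user, programs, trending_now))
```

Time dependence could enter through the popularity table, which would change with the hour, while the user and program vectors stay comparatively stable.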

  29. Deep Learning Infrastructure. Content sources shown: Live TV, Online Video, Netflix.

  30. Deep Learning Infrastructure
     • Deep Learning Frameworks
       - Keras, TensorFlow, Theano, PyTorch, Caffe (older models)
     • All deployments using nvidia-docker
       - Thanks to the Nvidia solutions team for help with best practices
     • All deep learning training done on multi-GPU servers
       - Nvidia Tesla (Production) and 8x Titan X (Dev) GPUs
       - Nvidia DGX-1 for large-scale training (video and NLP)
     • Next steps
       - Container scheduler: Kubernetes and HashiCorp Nomad
       - Network compression/simplification for increased efficiency (TensorRT)

  31. Deep Learning-based ML is applied everywhere at Comcast. Disciplines: Machine Learning, Data Science, Big Data, AI. Improving Customer Experience Everywhere at Comcast/NBCU: High Speed Internet, Video, IP Telephony, Home Security / Automation, Universal Parks, Media Properties. For more info see: dclabs.comcast.com
