DEEP LEARNING IN BUSINESS CONVERSATION ANALYSIS ANTHONY SCODARY, GRIDSPACE WONKYUM LEE, GRIDSPACE

INTRO “Which translation speech recognition so and so forth I mean there's a whole bunch of amazing applications that are made possible by deep learning and so internet service providers are using it for internal application development. And then lastly what you mentioned as cloud service providers and basically because of the adoption of gp use and because of the success of kuta and so many applications are now able to be accelerate on gp use so that we can extend the capabilities of moore's law so that we can continue. You'd have the benefits of of computing acceleration, which which in the cloud means reducing cost. And that's on the serve cloud service provider side of of the Internet company so that would be amazon web services as the Google compute cloud.”

OVERVIEW 1. Business Conversations 2. Recognition 3. Analysis

DEEP LEARNING IN BUSINESS CONVERSATION ANALYSIS 1. Business Conversations

PROTOCOLS SIGNAL PROCESSING

PROTOCOLS - Symbol Set (Lexicon) - Rules (Syntax) - Meaning (Semantics)

TYPES OF PROTOCOLS SOURCE MEDIUM SINK

TYPES OF PROTOCOLS: ENDPOINTS
SOURCE \ SINK | NATURE | MACHINE | HUMAN
NATURE | BIRD CALL | SEISMOGRAPH | GROWLING
MACHINE | ELECTRIC FENCE | TCP | FIRE ALARM
HUMAN | “SIT” | SIRI | SPEECH

TYPES OF PROTOCOLS: H2H MEDIA [Chart: media placed along BANDWIDTH and INFORMATION DENSITY axes: SPEECH, CHAT, MISSED CALL, VOICEMAIL, SMS, WAVING, EMAIL, POSTCARD]

WHY DO WE STILL TALK? - Fast - Innate - Layered - Synchronous - Dense in meaning

ORGANIZATIONS EXTERNAL COMMUNICATION / INTERNAL COMMUNICATION - Calls - Meetings - Support Calls - Hallway Chats - In-Person Sales - Documents - Chat - Support Email - Social Media - Chat - Email - SMS

ORGANIZATIONS EXTERNAL COMMUNICATION / INTERNAL COMMUNICATION Mostly lost today: - Calls - Meetings - Support Calls - Hallway Chats - In-Person Sales Also shown: - Documents - Chat - Support Email - Social Media - Chat - Email - SMS

THIS DATA MATTERS

DEEP LEARNING IN BUSINESS CONVERSATION ANALYSIS 2. Recognition

REAL-TIME CALL ANALYSIS ASR DSP SCANNER CLASSIFIER

ASR Conventional ASR - A combination of blocks, each designed by a different area of expertise. Blocks: Feature Extraction (MFCC), Acoustic Model (GMM), Lexicon, Language Model. Output: “hello”. GMM-HMM: 1980-2010

ASR Lots of tuning to improve accuracy: robust features, speaker adaptation, application-specific LM. Blocks: Feature Extraction (MFCC), Acoustic Model (GMM), Lexicon, Language Model. Output: “hello”

ASR Replacing the acoustic model with a deep neural net. Blocks: Feature Extraction (MFCC), Acoustic Model, Lexicon, Language Model. Output: “hello”. DNN-HMM: 30%-40% improvement (2011-2017)

ASR Someday in the near future: replacing all of the models with one neural net. All-in-one Deep Learning Model. Output: “hello”. End-to-End ASR: active research in progress

ASR HISTORY ASR error rate over decades (in academia) [Chart: WER (log scale) with a “Human parity” level. Series: Simple Linear model (GMM), Advanced Linear model (GMM-SAT-DT), Deep Learning Model, End-to-End Deep Learning (under development)]
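
WER, the metric on this chart, is conventionally computed as the word-level edit distance between a reference transcript and the recognizer output, divided by the number of reference words. The sketch below is a minimal illustration, not the presenters' scoring tool. The function name is hypothetical, and the reference sentence is only a guess at what the INTRO speaker actually said.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Assumed reference (a guess) versus a fragment of the INTRO transcript.
reference = "adoption of gpus and the success of cuda"
hypothesis = "adoption of gp use and the success of kuta"
print(f"WER = {word_error_rate(reference, hypothesis):.3f}")
```

Here "gpus" becomes "gp use" (one substitution plus one insertion) and "cuda" becomes "kuta" (one substitution), so three errors against eight reference words.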

ASR CHALLENGES “However, it’s still NOT Easy in real-world business conversational voice”
Language Challenge
• Domain-specific terminology (company name, product name, …)
• Spontaneous speech (natural conversation)
• Accent, dialect, mispronunciation
Acoustic Challenge
• Noise (background, channel)
• Acoustic effects (reverberation, Lombard effect)
• Variability from speakers
• Microphone displacement (near/far field)

LARGE-SCALE DATA PROCESSING Data is King! - General conversational data + in-domain data (training with in-domain data improves accuracy by 15-30%) - Simulated data with a variety of noise helps (improves accuracy by 10-15%) - Data collection with semi-supervised training helps
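
One common way to simulate noisy training data is to add a noise recording to clean speech at a chosen signal-to-noise ratio (SNR). The sketch below assumes that approach; the slides do not say how the simulated data was produced. The function name `mix_at_snr` and the synthetic sine and Gaussian signals are illustrative stand-ins for real audio.

```python
import math
import random


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so that speech power / noise power matches snr_db, then add it."""
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    speech_power = sum(s * s for s in speech) / len(speech)
    noise_power = sum(n * n for n in noise) / len(noise)
    if noise_power == 0:
        raise ValueError("noise must not be silent")
    # SNR(dB) = 10 * log10(speech_power / noise_power)
    target_noise_power = speech_power / (10 ** (snr_db / 10))
    gain = math.sqrt(target_noise_power / noise_power)
    return [s + gain * n for s, n in zip(speech, noise)]


# Synthetic stand-ins: a sine wave for speech, Gaussian samples for noise.
rng = random.Random(0)
speech = [math.sin(0.1 * i) for i in range(1000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(1000)]

# One clean utterance yields several training variants at different SNRs.
augmented = {snr: mix_at_snr(speech, noise, snr) for snr in (20, 10, 5)}
```

Lower SNR values give harder examples, so a training set would typically cover a range of SNRs and noise types.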

LARGE-SCALE DATA PROCESSING Multi-GPU Training - 4x Titan X with parallel training - One week for full training on 25k hours of audio - 80x faster than a 32-core CPU machine

REAL-TIME ADAPTIVE PROCESSING
- Online i-vector adaptation (5-10% improvement)
  - Speaker characteristics
  - Environmental noise
  - Accent & dialect
- Context-based grammar adaptation (recognizes in-domain specific terms)

STATE OF THE ART DEEP LEARNING MODEL
- Time-delay neural network (TDNN)
- Computation optimization (subsampling, bi-phone, etc.)
- WFST framework for search
WER: 5~6% (Capital Market Model), 12~15% (Customer Intelligence Model)
Real-time factor: 0.3-0.35
Reference: “Purely sequence-trained neural networks for ASR based on lattice-free MMI”, Interspeech 2016
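
Real-time factor (RTF) is the ratio of processing time to audio duration, so any value below 1.0 means recognition runs faster than the audio plays. The sketch below works through that arithmetic for the reported range. The function names and the 10-minute call length are assumptions chosen for illustration.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """RTF = time spent decoding / duration of the audio decoded."""
    if audio_seconds <= 0:
        raise ValueError("audio duration must be positive")
    return processing_seconds / audio_seconds


def processing_time(audio_seconds: float, rtf: float) -> float:
    """Expected decoding time for a recording at a given RTF."""
    return audio_seconds * rtf


call_seconds = 600.0  # hypothetical 10-minute call
for rtf in (0.3, 0.35):
    seconds = processing_time(call_seconds, rtf)
    print(f"RTF {rtf}: about {seconds:.0f} s to decode {call_seconds:.0f} s of audio")
```

At the reported RTF of 0.3-0.35, a 10-minute call would take roughly 180 to 210 seconds of compute, which leaves headroom for real-time use.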

DEEP LEARNING IN BUSINESS CONVERSATION ANALYSIS 3. Analysis

IS TRANSCRIPTION REALLY WHAT YOU WANT ANYWAY?

STUFF WITH ACTUAL USE TO COMPANIES - Prediction - Classification - Summarization - Entity Extraction - Anomaly Detection

“ARTIFICIAL INTELLIGENCE”

“ARTIFICIAL INTELLIGENCE” CONSCIOUSNESS ABOVE THIS LINE EMOTION THIS SURELY IS “REAL” INTELLIGENCE CONVERSATION IMAGE RECOGNITION CHESS GRAPH SEARCH ARITHMETIC

“ARTIFICIAL INTELLIGENCE” TECHNOLOGY REVOLUTION WASTE OF MONEY AND TIME

“ARTIFICIAL INTELLIGENCE” We focus on the industry needs as an engineering task.

ANALYSIS 1. Speech is complex. Let models decide what features matter for a task or application.

ANALYSIS 2. Speech is high dimensional. Datasets must be large enough to train large models to match.

ANALYSIS 3. Conversational speech is noisy. Large, well-augmented datasets are necessary to be robust.

ANALYSIS [Diagram: one-hot word vector (D dimensions, “aardvark” to “zebra”) shown with ℝ^300 and ℝ^40 vectors]
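
A one-hot vector has one dimension per vocabulary word, and multiplying it by an embedding matrix simply selects that word's row, which is how a D-dimensional sparse input becomes a small dense vector. The sketch below illustrates this with a four-word vocabulary and 3-dimensional embeddings. The values are made up, and the slide's 300 dimensions are shrunk to 3 for readability.

```python
# Hypothetical tiny vocabulary; a real one has D entries, "aardvark" to "zebra".
vocab = ["aardvark", "king", "queen", "zebra"]
index = {word: i for i, word in enumerate(vocab)}

# Embedding matrix with one row per word (made-up values, 3 dims instead of 300).
embedding = [
    [0.1, 0.2, 0.3],
    [0.9, 0.8, 0.1],
    [0.9, 0.7, 0.9],
    [0.2, 0.1, 0.4],
]
dim = len(embedding[0])


def one_hot(word):
    """D-dimensional vector with a single 1.0 at the word's index."""
    vec = [0.0] * len(vocab)
    vec[index[word]] = 1.0
    return vec


def embed(word):
    """Multiply the one-hot row vector by the embedding matrix."""
    x = one_hot(word)
    return [sum(x[i] * embedding[i][k] for i in range(len(vocab))) for k in range(dim)]


# The matrix product is equivalent to looking up the word's row.
print(embed("zebra") == embedding[index["zebra"]])
```

In practice the multiplication is never carried out; libraries index the row directly, which gives the same result at a fraction of the cost.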

ANALYSIS WOMAN SISTER MAN BROTHER QUEEN KING
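
The word pairs on this slide are the classic illustration that relationships can appear as consistent offsets in embedding space, so that king - man + woman lands near queen. The sketch below demonstrates the arithmetic with hand-made 3-dimensional vectors. These are assumptions constructed so that the analogy holds, not trained embeddings, and `analogy` is a hypothetical helper.

```python
import math

# Hand-made toy vectors: (gender, royalty, sibling). Not trained embeddings.
vectors = {
    "man":     [1.0, 0.0, 0.0],
    "woman":   [-1.0, 0.0, 0.0],
    "king":    [1.0, 1.0, 0.0],
    "queen":   [-1.0, 1.0, 0.0],
    "brother": [1.0, 0.0, 1.0],
    "sister":  [-1.0, 0.0, 1.0],
}


def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def analogy(a, b, c):
    """Answer 'a is to b as c is to ?' via the nearest word to vec(b) - vec(a) + vec(c)."""
    target = [vb - va + vc for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(target, vectors[w]))


print(analogy("man", "king", "woman"))
print(analogy("man", "brother", "woman"))
```

With trained embeddings the match is approximate rather than exact, which is why the nearest neighbour by cosine similarity is used instead of an equality check.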

ANALYSIS “i have no political party actually” ~~~ ‘democrat’

API gridspace.com

QUESTIONS?
