Developing Your Own Wake Word Engine Just Like Alexa and OK Google - PowerPoint PPT Presentation

Developing Your Own Wake Word Engine Just Like “Alexa” and “OK Google”
Xuchen Yao, CEO, KITT.AI
Guoguo Chen, CTO, KITT.AI

What’s a “wake word”?
“Alexa, what’s the weather today?” / “OK Google” / “Hey Siri”
Wake word / hot word:
• Offline
• Code runs on CPU/DSP/MCU
• 7x24
• Always listening
One-shot understanding:
• Online
• Code runs on cloud
• On demand
• Explicit permission

Conversational UI Pipeline
voice → wake up device → speech → text → text understanding → dialogue management → text → speech

A customizable hotword detection engine, a.k.a. a deep neural network in 2MB of RAM
hotword.io (video, blog)

Who’s using it (released 5/2016)
10,000+ developers, 7,000+ unique hotwords
Dominating developer community for hotword detection

Use Cases

#1 Hotword: Smart Mirror https://github.com/evancohen/smart-mirror (credits to Evan Cohen) video link

Command & Control: GoPiGo (credits to Paul Matz) video link

Project RePL (credits to Chris Burns) video link

Conversational UI Pipeline, highlighting the Speech Pipeline
voice → wake up device → speech → text → text understanding → dialogue management → text → speech

Speech Pipeline
Voice → Microphone Array → Wake Word Detection (local) → Speech Recognition (cloud/local)
Voice:
• Close talking
• Telephone (8 kHz sampling)
• Others (16 kHz)
• Noises: TV, radio, street, café, car, music
• Pitch: children, adults, senior
• Accent: US/UK/Europe/Asian …
Microphone Array:
• Far field (3-9 feet)
• 2, 4, or 6 microphones
• Linear/circular
• Voice Activity Detection
• Auto Gain Control
• Adaptive Echo Cancellation
• Beam forming
Wake Word Detection (local):
• Fast response (0.1 second)
• High accuracy
Speech Recognition (cloud/local):
• IBM/Microsoft/Nuance/Google
• Alexa Voice Service
• Kaldi
• PocketSphinx
• HTK
• Command & Control
• Language Understanding

Supported Platforms and Wrappers
• Raspberry Pi
• Mac OS X
• iPhone/iPad/iPod
• x86/64bit Ubuntu
• Android
• Pine 64
• Intel Edison
• Samsung Artik
• Allwinner R-series
• Ingenic X1000
• Rockchip

Personal vs. Universal models
                       Personal      Universal
Voice samples needed   3             At least 1500
Speaker-independent    No            Yes
Speaker-specific       Sort of       No
Robust against noise   No            Yes
Free                   Yes           No
Time needed            Immediately   2 weeks

Customizing a universal model
define hotword → collect voice (web API) → train a model → deliver & evaluate → deploy to beta users → collect voice from device → iterate & improve → ship & success
Desired performance:
• >90% detection rate
• <= 3 false alarms in 24 hours
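The desired-performance targets above (>90% detection rate, <= 3 false alarms in 24 hours) can be checked mechanically on a test set. The following is only a minimal sketch: the function name `evaluate`, the `EvalResult` type, and the example counts are hypothetical and not part of the talk or of any KITT.AI tooling.

```python
from dataclasses import dataclass


@dataclass
class EvalResult:
    detection_rate: float         # fraction of hotword utterances detected
    false_alarms_per_24h: float   # false alarms normalized to a 24-hour period
    meets_target: bool


def evaluate(detected: int, positives: int, false_alarms: int,
             negative_hours: float,
             min_detection_rate: float = 0.90,
             max_false_alarms_per_24h: float = 3.0) -> EvalResult:
    """Compare test-set counts against the shipping targets from the slide."""
    if positives <= 0 or negative_hours <= 0:
        raise ValueError("need positive samples and some hours of negative audio")
    detection_rate = detected / positives
    # Scale the false-alarm count to a 24-hour window.
    false_alarms_per_24h = false_alarms * 24.0 / negative_hours
    meets = (detection_rate > min_detection_rate
             and false_alarms_per_24h <= max_false_alarms_per_24h)
    return EvalResult(detection_rate, false_alarms_per_24h, meets)


# Hypothetical numbers: 188 of 200 hotword samples detected,
# 5 false alarms over 48 hours of non-hotword audio.
result = evaluate(detected=188, positives=200, false_alarms=5, negative_hours=48.0)
print(result)
```

Normalizing false alarms to 24 hours lets test sets of different lengths be compared against the same threshold.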

Science behind wake word

Challenges
Is this “Alexa”?
• High detection rate
• Low false alarm
• Efficient: detect every 0.1 second
• Small RAM: <2MB
• Too much ambiguity, not much context
[Figure: short window vs. longer window]
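To make the "detect every 0.1 second" requirement concrete, here is a rough sketch of a sliding window that advances in 0.1-second hops over an audio buffer. The 16 kHz sample rate and the 1-second window length are assumptions chosen for illustration; the talk does not state the engine's actual window size, and `sliding_windows` is a hypothetical helper.

```python
from typing import Iterator, Sequence

SAMPLE_RATE = 16_000    # assumption: 16 kHz audio
HOP_SECONDS = 0.1       # from the slide: one decision every 0.1 second
WINDOW_SECONDS = 1.0    # assumption: context seen by each decision


def sliding_windows(samples: Sequence[float],
                    sample_rate: int = SAMPLE_RATE,
                    hop_seconds: float = HOP_SECONDS,
                    window_seconds: float = WINDOW_SECONDS
                    ) -> Iterator[Sequence[float]]:
    """Yield the most recent `window_seconds` of audio once per hop."""
    hop = int(round(sample_rate * hop_seconds))
    window = int(round(sample_rate * window_seconds))
    for end in range(window, len(samples) + 1, hop):
        yield samples[end - window:end]


# Three seconds of silence as stand-in audio.
audio = [0.0] * (3 * SAMPLE_RATE)
windows = list(sliding_windows(audio))
print(len(windows), "decisions,", len(windows[0]), "samples per window")

# Buffer size for one window if samples are stored as 16-bit integers.
buffer_bytes = int(SAMPLE_RATE * WINDOW_SECONDS) * 2
print(buffer_bytes, "bytes of audio buffer")
```

Under these assumptions the audio buffer itself takes about 32 KB, so most of a 2MB budget would go to the model rather than to buffering.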

Existing Algorithm

Existing Algorithm
• Advantage:
  – Simplified pipeline
  – Simplified decoder
• Disadvantage:
  – Massive hotword-specific training data

Possible Ways to Improve
• Data augmentation
  – Adding noise
  – Adding reverberation
  – And so on …
[Figure: original; add noise; add noise and reverberation]
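The two augmentations named above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the authors' pipeline: noise is mixed at a chosen signal-to-noise ratio, reverberation is approximated by convolving with a toy impulse response, and the names `add_noise`, `add_reverb`, and the impulse-response values are hypothetical. A real pipeline would more likely use recorded noise, measured room responses, and an array library.

```python
import math
import random
from typing import List, Sequence


def add_noise(signal: Sequence[float], noise: Sequence[float],
              snr_db: float) -> List[float]:
    """Mix noise into signal so that signal power / noise power hits snr_db."""
    if len(noise) < len(signal):
        raise ValueError("noise must be at least as long as the signal")
    noise = noise[:len(signal)]
    p_signal = sum(s * s for s in signal) / len(signal)
    p_noise = sum(n * n for n in noise) / len(noise)
    if p_noise == 0:
        return list(signal)
    # Scale the noise so its power is p_signal / 10^(snr_db / 10).
    scale = math.sqrt(p_signal / (p_noise * 10 ** (snr_db / 10)))
    return [s + scale * n for s, n in zip(signal, noise)]


def add_reverb(signal: Sequence[float],
               impulse_response: Sequence[float]) -> List[float]:
    """Convolve with an impulse response, truncated to the input length."""
    out = [0.0] * len(signal)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            if i + j >= len(signal):
                break
            out[i + j] += s * h
    return out


rng = random.Random(0)
# Stand-in "clean" sample: 50 ms of a 440 Hz tone at 16 kHz.
clean = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(800)]
noise = [rng.gauss(0.0, 1.0) for _ in range(800)]
toy_ir = [1.0, 0.0, 0.0, 0.5, 0.0, 0.25]   # hypothetical decaying echoes

noisy = add_noise(clean, noise, snr_db=10.0)
noisy_reverb = add_reverb(noisy, toy_ir)
print(len(clean), len(noisy), len(noisy_reverb))
```

Each original sample can be turned into several training samples this way by varying the noise source, the SNR, and the impulse response.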

Possible Ways to Improve
• Network models
  – Model selection
    • Feedforward models? Recurrent models?
  – Model compression
    • 32-bit float → 16-bit float → 8-bit integer
    • Parameters with small absolute value
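The two compression ideas above can be illustrated on a toy weight list. This sketch assumes symmetric linear quantization with a single scale per weight list, and a fixed magnitude threshold for pruning; the talk does not say which schemes the engine actually uses, and all function names and values here are hypothetical. Python floats are 64-bit, so the snippet demonstrates the arithmetic rather than the real memory saving.

```python
from typing import List, Sequence, Tuple


def quantize_int8(weights: Sequence[float]) -> Tuple[List[int], float]:
    """Map floats to integers in [-127, 127] plus one float scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized: Sequence[int], scale: float) -> List[float]:
    """Recover approximate float weights from the integer representation."""
    return [q * scale for q in quantized]


def prune_small(weights: Sequence[float], threshold: float) -> List[float]:
    """Zero out parameters whose absolute value is below the threshold."""
    return [0.0 if abs(w) < threshold else w for w in weights]


weights = [0.82, -0.003, 0.41, -1.27, 0.0009, 0.05]

q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
pruned = prune_small(weights, threshold=0.01)

print(q)
print(pruned)
```

Quantization error per weight is bounded by half the scale, which is why weights with a small dynamic range tolerate 8-bit storage well; pruned zeros can additionally be stored sparsely.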

Possible Ways to Improve
• Decoder redesigning
  – Modeling smaller units
    • Syllables, phones, etc.
  – False alarm suppression
    • Additional classifier?

Training with Tesla K20/K80
• Positive data
  – 1,500 hotword samples
• Negative data
  – Thousands of hours of speech
• Training time
  – Half a day with 4 K80 GPUs

Software Architecture Backend Frontend

[Architecture diagram; labels: KITT.AI, Scientific Computing, Content, Data, Training, Model, Deploy, Websocket (audio, msg), HTTPs Traffic, Deep Learning Cloud, ELB, Message Queue, Production Cloud, Devices]

Running Your First Snowboy Demo
