SIV for VoiceXML 3.0: Language and Application Design

SIV for VoiceXML 3.0: Language and Application Design Considerations
Ken Rehor, Cisco Systems, Inc.
krehor@cisco.com
March 05, 2009

VoiceXML Application Architecture
[Diagram: caller on PSTN / IP (VoIP); VoIP gateway; VoiceXML server with ASR, TTS, SIV, audio, and DTMF resources; VoiceXML application and verification application reached over HTTP; SIV engine; VP DB.]

SIV in VoiceXML 2.x
• Server-side SIV processing
  – <record>
  – <field> with recordutterance
• Language extensions
  – Nuance "voiceprint forms"
  – BeVocal

VoiceXML 2.x SIV Integration
[Diagram: caller on PSTN / IP (VoIP); VoiceXML server capturing audio with <record> or recordutterance; VoiceXML application (HTTP); verification VoiceXML application (HTTP) invoked via <subdialog>; SIV engine; VP DB.]


Standard VoiceXML prompt/field model
• Text-independent
  – <prompt> / <record>
  – Submit recording to application server
• Text-dependent, Text-prompted
  – <prompt> / <field> (with recordutterance)
  – Submit utterance recording to application server

VoiceXML 2.x <record>
  <form id="verify">
    <!-- Record up to five seconds of speech into the variable "utterance" -->
    <record name="utterance" maxtime="5s">
      <prompt>Say this digit sequence: one two three four five.</prompt>
      <noinput>I didn't hear anything, please try again.</noinput>
    </record>
    <block>
      <!-- Post the recording to the application server for verification -->
      <submit next="check_utterance.pl" enctype="multipart/form-data"
              method="post" namelist="utterance"/>
    </block>
  </form>
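On the server side, the script named in the <submit> element (check_utterance.pl on the slide) receives the recording as a multipart/form-data part named "utterance". The following is a minimal Python sketch of that extraction step, not the presenter's implementation: the function name extract_utterance and the sample request are hypothetical, and handing the audio to an SIV engine is left out.

```python
from email.parser import BytesParser
from email.policy import HTTP
from typing import Optional


def extract_utterance(content_type: str, body: bytes,
                      field: str = "utterance") -> Optional[bytes]:
    """Return the raw bytes of the named multipart/form-data field, or None."""
    # The email parser needs the Content-Type header (with its boundary)
    # in front of the body in order to split the parts.
    head = ("Content-Type: " + content_type + "\r\n"
            "MIME-Version: 1.0\r\n\r\n").encode("ascii")
    message = BytesParser(policy=HTTP).parsebytes(head + body)
    if not message.is_multipart():
        return None
    for part in message.iter_parts():
        # Each form field carries its name in the Content-Disposition header.
        if part.get_param("name", header="content-disposition") == field:
            return part.get_payload(decode=True)
    return None


# Hypothetical request, shaped like the post a VoiceXML browser might send.
SAMPLE_TYPE = "multipart/form-data; boundary=XyZ"
SAMPLE_BODY = (
    b"--XyZ\r\n"
    b'Content-Disposition: form-data; name="utterance"; filename="u.wav"\r\n'
    b"Content-Type: audio/x-wav\r\n"
    b"\r\n"
    b"RIFF-fake-audio\r\n"
    b"--XyZ--\r\n"
)

if __name__ == "__main__":
    audio = extract_utterance(SAMPLE_TYPE, SAMPLE_BODY)
    print(len(audio) if audio is not None else "no utterance field")
```

In a real deployment the request would come from a web framework or CGI layer rather than a literal byte string; the sketch only shows where the "utterance" name from namelist reappears on the server.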

VoiceXML 2.1 <field>
  <form id="verify">
    <field type="digits">
      <prompt>Say this digit sequence: one two three four five.</prompt>
      <filled>
        <!-- handle the recognized digits here -->
      </filled>
    </field>
  </form>

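VoiceXML 2.1 <field> with recordutterance
  <form id="verify">
    <!-- Keep a recording of each utterance alongside the recognition result -->
    <property name="recordutterance" value="true"/>
    <field type="digits">
      <prompt>Say this digit sequence: one two three four five.</prompt>
      <filled>
        <!-- the utterance audio is available as application.lastresult$.recording -->
      </filled>
    </field>
  </form>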

Security Concerns

Architecture / Security / Trust
• One architecture may not be suitable for every use case
• Some architectures may not support the level of (dis)trust required for a particular deployment

Security, Trust and Protocol Considerations in Distributed Voice Web Applications
Architecture options carry security implications
[Diagram: PSTN or IP network; VoiceXML browser; Voice Web Application Server (<vxml>, .wav); Authentication Web Service; other application web services; MRCP server with TTS engine (<ssml>), ASR engine (<grxml>), and SIV engine (voice template); voice template database behind a web service interface.]
Voice DEFF may be used between SIV components and services

SIV engine and database managed by App server
VoiceXML browser records the utterance and forwards it to the app server (typical scenario for VoiceXML 2.0/2.1)
[Diagram: PSTN or IP network; VoiceXML browser (<record>, MRCP client); Voice Web Application Server (<vxml>, .wav audio) with SIV engine and voice template database; MRCP server with TTS engine (<ssml>) and ASR engine (<grxml>).]
• Note: DTMF processing not shown
• Voice templates managed and stored locally by SIV engine
• Audio stream vs. buffers: Streaming handled by RTP? Buffers may be handled by audio recorder function. Part of browser or MRCP engine?

SIV engine and database managed by App server
VoiceXML browser records the utterance and forwards it to the app server (typical scenario for VoiceXML 2.0/2.1)
[Diagram: Service Provider; PSTN or IP network; VoiceXML browser (<record>, MRCP client); two Voice Web Application Servers (<vxml>, .wav audio, IP); SIV engine and voice template database; MRCP server with TTS engine (<ssml>) and ASR engine (<grxml>).]
• Note: DTMF processing not shown
• Voice templates managed and stored locally by SIV engine

SIV engine and database managed by MRCP server
[Diagram: PSTN or IP network; VoiceXML browser (MRCP client); Voice Web Application Server (<vxml>, .wav); audio to MRCP server with TTS engine (<ssml>), ASR engine (<grxml>), and SIV engine; voice template database.]
• Note: DTMF processing not shown
• Voice templates managed and stored locally by SIV engine
• Audio stream vs. buffers: Streaming handled by RTP? Buffers may be handled by audio recorder function. Part of browser or MRCP engine?

SIV engine managed by MRCP server
SIV database managed by app server
Voice model transmission managed by engine or MRCP Server
[Diagram: PSTN or IP network; VoiceXML browser (MRCP client); Voice Web Application Server (<vxml>, .wav) with voice template database; audio to MRCP server with TTS engine (<ssml>), ASR engine (<grxml>), and SIV engine (voice template).]
• Note: DTMF processing not shown
• Voice templates retrieved from database by app server

SIV engine managed by MRCP server
SIV database managed by app server
Voice model transmission managed by VoiceXML browser
[Diagram: PSTN or IP network; VoiceXML browser (MRCP client); Voice Web Application Server (<vxml>, .wav) with voice template database; audio to MRCP server with TTS engine (<ssml>), ASR engine (<grxml>), and SIV engine (voice template).]
• Note: DTMF processing not shown
• Voice templates managed and stored locally by SIV engine
• Voice templates retrieved from database by app server

SIV in VoiceXML 3.0

V3 Integration Requirements
• Control multiple Input Resources
  – ASR and biometric engines
  – Simultaneously
  – Switch on a per <field> or verification basis
• Consistent with V3 overall design goals
• Simplify integration, yet provide sufficient control
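The first requirement, driving several input resources at once and switching the active set per field, can be pictured with a small Python sketch. It is illustrative only: the class and method names (InputResource, ResourceController, activate, feed) are hypothetical and do not come from any VoiceXML 3.0 draft, and the two engines are stand-in functions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class InputResource:
    """A named input resource, e.g. an ASR or SIV engine (stubbed here)."""
    name: str
    process: Callable[[bytes], dict]


@dataclass
class ResourceController:
    """Hypothetical controller that fans audio out to the active resources."""
    resources: Dict[str, InputResource] = field(default_factory=dict)
    active: List[str] = field(default_factory=list)

    def add(self, resource: InputResource) -> None:
        self.resources[resource.name] = resource

    def activate(self, *names: str) -> None:
        # Switch the active set, e.g. once per field or per verification step.
        unknown = [n for n in names if n not in self.resources]
        if unknown:
            raise KeyError("unknown input resources: " + ", ".join(unknown))
        self.active = list(names)

    def feed(self, audio: bytes) -> Dict[str, dict]:
        # Every active resource sees the same audio.
        return {n: self.resources[n].process(audio) for n in self.active}


def fake_asr(audio: bytes) -> dict:
    # Stand-in for a recognizer; returns a fixed transcript.
    return {"text": "one two three four five"}


def fake_siv(audio: bytes) -> dict:
    # Stand-in for a verifier; returns a fixed score.
    return {"score": 0.9}


if __name__ == "__main__":
    controller = ResourceController()
    controller.add(InputResource("asr", fake_asr))
    controller.add(InputResource("siv", fake_siv))
    controller.activate("asr", "siv")   # both engines on this field
    print(controller.feed(b"audio"))
    controller.activate("asr")          # recognition only on the next field
    print(controller.feed(b"audio"))
```

The point of the design is that the dialog author names which resources listen on a given turn, while the controller hides how the audio reaches each engine.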

V3 Data, Event relationship between components
[Diagram: a Resource Controller (an object with semantics similar to a form item) exchanging commands, events, and data with prompt resources (prompt queue, SSML/media player), input resources (Input, Input 2, Input 3: recognition, verification, recorder, etc.), and audio device(s).]
• Commands: Add grammar(), Add voiceprint(), Barge-in on/off, Stop, Play
• Events: audio, mark, error, DTMF, done
• Inputs are all session-level
• Recording types to consider:
  – <record>
  – Utterance recording
  – Whole-call recording (two-channel?)
  – Multi-turn recording (e.g. mixed-initiative recording)

SIV "Session"
• Enrollment Session or Verification Session
• Verification process: an uninterrupted process spanning several dialog states (identified by a Session-ID) in which the results of each utterance are accumulated
[Diagram: a VoiceXML Session containing a Verification Session made up of several SIV dialogs.]
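A verification session of this kind can be sketched as a small object that carries a session ID and accumulates one score per utterance. This is a hedged illustration, not a defined API: the class name, the additive scoring rule, and the threshold value are all assumptions, and real engines combine evidence in engine-specific ways.

```python
import uuid
from dataclasses import dataclass, field
from typing import List


@dataclass
class VerificationSession:
    """Hypothetical SIV verification session spanning several dialog states."""
    claimed_identity: str
    threshold: float = 2.0  # assumed acceptance threshold, for illustration
    session_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    scores: List[float] = field(default_factory=list)

    def add_utterance_score(self, score: float) -> None:
        # One score per SIV dialog (utterance) within the same session.
        self.scores.append(score)

    @property
    def cumulative_score(self) -> float:
        # Assumption: evidence is combined by simple addition.
        return sum(self.scores)

    def decision(self) -> str:
        if self.cumulative_score >= self.threshold:
            return "accepted"
        return "undecided"


if __name__ == "__main__":
    session = VerificationSession("alice")
    session.add_utterance_score(0.75)
    print(session.decision())   # not enough evidence after one utterance
    session.add_utterance_score(1.5)
    print(session.decision())   # threshold reached after the second
```

Keeping the scores under one session ID is what lets a later dialog state build on earlier utterances instead of starting the verification over.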

Define Data Model
• Data passed to SIV engine
  – Environment
  – Properties
  – Attributes
  – Voice models
• Data returned from SIV engine
  – Results specified as an EMMA result
  – Errors/info
• Data used within SIV session
• Associate SIV result with ASR result
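To make "results specified as an EMMA result" concrete, here is a minimal Python sketch that reads a verification outcome from an EMMA 1.0 document. The emma:emma and emma:interpretation elements and the emma:confidence attribute are part of EMMA 1.0; the <decision> payload element and the sample values are assumptions made for illustration, since the slides do not define the SIV payload.

```python
import xml.etree.ElementTree as ET

EMMA_NS = "http://www.w3.org/2003/04/emma"

# Hypothetical SIV result; the <decision> payload is an assumed format.
SAMPLE_RESULT = """\
<emma:emma version="1.0" xmlns:emma="http://www.w3.org/2003/04/emma">
  <emma:interpretation id="siv1" emma:confidence="0.87"
                       emma:medium="acoustic" emma:mode="voice">
    <decision>accepted</decision>
  </emma:interpretation>
</emma:emma>
"""


def parse_siv_result(text: str) -> dict:
    """Extract id, confidence, and decision from the first interpretation."""
    root = ET.fromstring(text)
    interp = root.find("{%s}interpretation" % EMMA_NS)
    if interp is None:
        raise ValueError("no emma:interpretation element found")
    raw_confidence = interp.get("{%s}confidence" % EMMA_NS)
    return {
        "id": interp.get("id"),
        # emma:confidence is optional, so it may be absent.
        "confidence": float(raw_confidence) if raw_confidence else None,
        "decision": interp.findtext("decision"),
    }


if __name__ == "__main__":
    print(parse_siv_result(SAMPLE_RESULT))
```

Because ASR results are commonly delivered as EMMA as well, using the same container for SIV is one way the two results could be associated, for example by matching interpretation IDs or timestamps.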

Define event model
• Combine references from:
  – VoiceXML Forum
  – MRCP v2
  – Engine vendors

VoiceXML and SIV Web Services

VoiceXML 2.x/3.x SIV Integration via BIAS web service
[Diagram: caller on PSTN / IP (VoIP); VoiceXML browser capturing audio with <record> or recordutterance; VoiceXML application (HTTP); BIAS verification web service; BioAPI; SIV engine; VP DB.]

VoiceXML 2.x/3.x SIV Integration via <subdialog>
[Diagram: caller on PSTN / IP (VoIP); VoiceXML browser capturing audio with <record> or recordutterance; VoiceXML application (HTTP); verification VoiceXML application (HTTP) invoked via <subdialog>; SIV engine; VP DB.]

VoiceXML 3.0 SIV Integration
[Diagram: caller on PSTN / VoIP; VoiceXML browser; VoiceXML application (HTTP); SIV engine reached via BioAPI, MRCP, etc.; VP DB.]
• V3 SIV native language features
• Browser/Engine integration via BioAPI, MRCP, proprietary API, etc.

VoiceXML 3.0 SIV Integration
[Diagram: caller on PSTN / IP (VoIP); VoiceXML browser; VoiceXML application (HTTP); verification VoiceXML application (HTTP) invoked via <subdialog>; two SIV engines, one reached via BioAPI, MRCP, etc.; VP DB.]
• V3 SIV native language features
• Browser/Engine integration via BioAPI, MRCP, proprietary API, etc.

VoiceXML SIV Integration via BIAS web service or <subdialog>
[Diagram: caller on PSTN / IP (VoIP); VoiceXML browser capturing audio with <record> or recordutterance; VoiceXML application (HTTP); BIAS verification web service; verification VoiceXML application (HTTP) invoked via <subdialog>; two SIV engines; VP DB.]

VoiceXML Application Switching
[Diagram: caller on PSTN / IP (VoIP); VoiceXML browser capturing audio with <record> or recordutterance; VoiceXML application (HTTP); verification VoiceXML application (HTTP) invoked via <subdialog>; two SIV engines; VP DB.]

Pros and Cons of Native V3 SIV functions
