Multimodal Interaction for Next Generation Networks, Jürgen Sienel


Jürgen Sienel, Alcatel Research & Innovation, Stuttgart, Germany. Presented at the W3C MM Workshop, Sophia Antipolis, July 19, 2004.


SLIDE 1

www.alcatel.com/enterprise Alcatel e-Business Networking

Multimodal Interaction for Next Generation Networks

Jürgen Sienel, Alcatel Research & Innovation, Stuttgart, Germany
W3C MM Workshop, Sophia Antipolis, July 19, 2004

SLIDE 2

Alcatel R&I 19- 7 - 2004 Page 2

Outline
  Motivation
  Multimodal Applications
  Multimodal Architecture Approaches
  Standardisation Issues
  Conclusion

SLIDE 3

Access Information Services through Communication Networks

To deliver next generation services across the domains of enterprise, fixed and mobile, and across disparate devices.

[Diagram labels] Converged Functionality: Voice, IT, Data. Domains: Private, Public.

SLIDE 4

Motivation

Mobile Environment

SLIDE 5

Human – Machine Communication

Multimodal User Interaction: customizable (operator, user), adaptive to user profile, preference and terminal capability

[Diagram labels] Environments: Fixed, Mobile, Car, Home, Public Areas. Modalities: Voice, Graphic, Voice/Graphic.

SLIDE 6

Multimodal Interaction

Reasons

  Human perception allows the parallel processing of multiple input channels
  Higher "bandwidth" of communication (non-verbal)
  Concentration on the strengths of each modality
  Selection of the most appropriate modality depending on:
    environment, e.g. noisy
    context, e.g. driving in a car
    complexity of task, e.g. directory assistance
    device capability, e.g. small displays
    preferences and disabilities of the user, e.g. visually impaired
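The selection criteria on this slide can be pictured as a small rule set. The following is only a minimal sketch: the class `InteractionContext`, its fields and the function `select_modalities` are hypothetical names introduced for illustration, and the rules are one plausible reading of the slide, not a method described in the talk.

```python
from dataclasses import dataclass


@dataclass
class InteractionContext:
    """Hypothetical description of the current interaction situation."""
    noisy: bool = False              # environment, e.g. a loud street
    driving: bool = False            # context, e.g. eyes and hands are busy
    visually_impaired: bool = False  # user preference or disability
    has_audio: bool = True           # device capability: microphone/loudspeaker
    has_display: bool = True         # device capability: usable screen


def select_modalities(ctx: InteractionContext) -> set:
    """Pick the modalities that suit the context (assumed rules)."""
    # Start from what the device can offer at all.
    offered = set()
    if ctx.has_audio:
        offered.add("voice")
    if ctx.has_display:
        offered.add("graphic")

    # Remove modalities that the situation makes impractical.
    chosen = set(offered)
    if ctx.noisy:
        chosen.discard("voice")
    if ctx.driving or ctx.visually_impaired:
        chosen.discard("graphic")

    # Never leave the user without any channel: fall back to what is offered.
    return chosen or offered


if __name__ == "__main__":
    print(select_modalities(InteractionContext(driving=True)))
    print(select_modalities(InteractionContext(noisy=True)))
```

A real system would presumably weigh these factors together with task complexity rather than apply hard exclusions.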

SLIDE 7

Multimodal Applications: Operators' Visions

Application Area          Services / Features
Telephone Services        Voice-activated dialing, Call Handling
Information Services      Voice Portals, Wireless Web, Telematics
Messaging                 Handling of Voice mail, email and UM, IM
Operator Services         Voice deputy, Directory Assistance
Enterprise Applications   Call/Contact center
Mobile Commerce           Multi Modal Event Notification, Mobile transactions
Security Services         Speaker verification, Biometrics

Enablers
  Text-to-Speech
  Multimodal Interaction
  Web Interfaces
  Automatic Speech Recognition
  User Identification

SLIDE 8

Multimodal Application: Instant Messaging

▼ Adaptation to terminal capability and user preference
▼ Flexible combination of visual and acoustical interaction
▼ Customization

SLIDE 9

Approaches

Multimodal Browser

[Architecture diagram labels] Client Device: User Interface (Microphone, Keyboard, Mouse, Pen, Display, Loudspeaker), Browser, Handwriting Recognition, Speech Recognition, Speech Generation (TTS), APIs. Web Server: Application. Connection: HTTP.

SLIDE 10

Multimodal Browser

Some pros and cons

Pros:
  All functionality in one device
  Just one handler for a document
  Easy synchronisation methods of graphics and voice
  Direct interpretation and handling of sensors
Cons:
  Limited resources on mobile devices
  Dependent on the device
  Multilinguality may be missing
  Interaction Management has no deeper application knowledge, can only interpret the document
  Transfer of application data (e.g. grammars) might be more expensive than transfer of speech
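The last drawback, that shipping a grammar to the device can cost more than shipping the speech to a server, comes down to simple arithmetic. The sketch below illustrates it; the function names are hypothetical, and the default bit rate of 4800 bit/s is an assumed, illustrative figure for a compressed speech stream, not a number given in the talk.

```python
def speech_upload_bytes(seconds: float, bitrate_bps: int = 4800) -> int:
    """Bytes needed to send an utterance at an assumed constant bit rate."""
    return int(seconds * bitrate_bps / 8)


def cheaper_to_send_speech(grammar_bytes: int, utterance_seconds: float,
                           bitrate_bps: int = 4800) -> bool:
    """True if uploading the utterance costs fewer bytes than
    downloading the grammar to the device."""
    return speech_upload_bytes(utterance_seconds, bitrate_bps) < grammar_bytes


if __name__ == "__main__":
    # A 3-second utterance at the assumed rate: 3 * 4800 / 8 = 1800 bytes.
    print(speech_upload_bytes(3))
    # Hypothetical sizes: a large directory grammar vs. a tiny yes/no grammar.
    print(cheaper_to_send_speech(200_000, 3))
    print(cheaper_to_send_speech(500, 3))
```

The comparison ignores that a grammar can be reused over many utterances, so it only captures the single-request case.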

SLIDE 11

Multimodal Architectures

Server based Approach

[Architecture diagram labels] Components: Terminal (Browser/Graphics, SE, SD), Speech Server (Recognition, TTS), Application Server, Multi-modal Interpreter. Flows: signalling, speech input, speech output, grammar, XML page, recognition results.

SE: Speech coding or DSR
SD: Speech decoding or TTS
Distributed Speech Recognition: backend processing will be in the Speech Server
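The flows named in the diagram (speech input from the terminal, a grammar loaded into the speech server, recognition results going back, an XML page returned) can be walked through in a toy model. This is a sketch under stated assumptions: the class and function names are hypothetical, the "recogniser" merely decodes text bytes as a stand-in for real ASR, and the assignment of roles to components is one reading of the slide.

```python
from dataclasses import dataclass, field


def terminal_encode(utterance: str) -> bytes:
    """Stand-in for the terminal's speech encoder (SE): text as bytes."""
    return utterance.encode("utf-8")


@dataclass
class SpeechServer:
    """Holds the active grammar and returns recognition results."""
    grammar: set = field(default_factory=set)

    def load_grammar(self, phrases) -> None:
        self.grammar = {p.lower() for p in phrases}

    def recognise(self, encoded_speech: bytes) -> dict:
        # Placeholder for backend ASR: decode the bytes back into text.
        text = encoded_speech.decode("utf-8").lower()
        return {"utterance": text, "in_grammar": text in self.grammar}


@dataclass
class ApplicationServer:
    """Knows the application, supplies grammars, produces pages."""
    speech_server: SpeechServer

    def start_dialog(self) -> None:
        # Hypothetical two-entry grammar for a small information service.
        self.speech_server.load_grammar(["weather", "news"])

    def handle_speech(self, encoded_speech: bytes) -> str:
        result = self.speech_server.recognise(encoded_speech)
        topic = result["utterance"] if result["in_grammar"] else "help"
        # The page markup is invented for the example.
        return f"<page topic='{topic}'/>"


if __name__ == "__main__":
    app = ApplicationServer(SpeechServer())
    app.start_dialog()
    print(app.handle_speech(terminal_encode("Weather")))
    print(app.handle_speech(terminal_encode("stock prices")))
```

Keeping the grammar on the server side is what lets the application server change it per dialog step without touching the terminal.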

SLIDE 12

Server based Approach

Some pros and cons

Pros:
  Exact knowledge of the application
  Handling of meta dialog
  Storage of voice records for security reasons (banking application)
  Easy support of multilingual applications
Cons:
  How to get detailed knowledge about the class of device
  Direct interpretation and handling of sensor data in terminal
  Harder to synchronise (delays)
  No sharing of ASR and TTS resources

SLIDE 13

Approaches

Distributed Architecture

[Architecture diagram labels] CLIENT SIDE; SERVER SIDE; Resource Manager; Front-end; Dialog Manager; Web Browser; Context / Presence; Media Server; W3C MM Markup; NGN; MuMo Proxy; ASR; TTS; HWR; MuMo Backend; Applications; HTTP; Application Server; Internet; Interaction Server.

SLIDE 14

Distributed Architecture

Some pros and cons

Pros:
  On demand functionality (use of local functionality where possible)
  Storage of voice records for security reasons (banking application)
  Could support multilingual applications
  Better interpretation and handling of sensors and device capabilities
  Optimised network traffic
  May support multiple devices
Cons:
  Complex synchronisation
  Interaction Management has no deeper application knowledge, can only interpret the document
  Higher standardisation effort needed
  Architecture may not be transparent for the application developer
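"On demand functionality (use of local functionality where possible)" suggests a binding step in which each required component is taken from the device when available and from the network otherwise. The sketch below illustrates that idea only; `bind_components` and the component sets are hypothetical, and the talk does not describe how its Resource Manager actually decides.

```python
from typing import Dict, Iterable, Optional, Set


def bind_components(required: Iterable[str],
                    local: Set[str],
                    network: Set[str]) -> Dict[str, Optional[str]]:
    """Map each required component to 'local', 'network', or None.

    Local components are preferred, which keeps traffic off the network;
    None marks a component that nobody can provide.
    """
    binding: Dict[str, Optional[str]] = {}
    for name in required:
        if name in local:
            binding[name] = "local"
        elif name in network:
            binding[name] = "network"
        else:
            binding[name] = None
    return binding


if __name__ == "__main__":
    # Hypothetical handset with on-device TTS only; ASR lives in the network.
    print(bind_components(["ASR", "TTS", "HWR"],
                          local={"TTS"},
                          network={"ASR", "TTS"}))
```

The same call with a richer `local` set would model a more capable terminal, which is how one architecture could serve multiple devices.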

SLIDE 15

Requirements for Standardisation

  Multimodal Framework and Components
  System and Environment Definition
  Result proposition (EMMA)
  Support of Distributed Processing (DOM)
  Interface to Media Processing Modules (SpeechSc)
  Improved device descriptions and presence (DI, OMA)
  WebService Interface for component binding
  Interface on parallel devices
  Definition of modality independent dialogs and content
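To make the EMMA item concrete, the sketch below builds a small EMMA-style document for one recognition result. Assumptions: the element and attribute names (`emma:emma`, `emma:interpretation`, `emma:mode`, `emma:confidence`, `emma:tokens`) and the namespace follow the EMMA 1.0 specification as later published by the W3C, which may differ from the draft current at the time of the talk; the helper `build_emma` and the application slots `action` and `callee` are invented for the example.

```python
import xml.etree.ElementTree as ET

EMMA_NS = "http://www.w3.org/2003/04/emma"
ET.register_namespace("emma", EMMA_NS)


def build_emma(interp_id: str, mode: str, confidence: float,
               tokens: str, slots: dict) -> str:
    """Serialise one interpretation as an EMMA-style XML string."""
    root = ET.Element(f"{{{EMMA_NS}}}emma", {"version": "1.0"})
    interp = ET.SubElement(root, f"{{{EMMA_NS}}}interpretation", {
        "id": interp_id,
        f"{{{EMMA_NS}}}mode": mode,                    # e.g. voice or ink
        f"{{{EMMA_NS}}}confidence": str(confidence),   # recogniser score
        f"{{{EMMA_NS}}}tokens": tokens,                # what was recognised
    })
    # Application-specific payload: one child element per slot.
    for name, value in slots.items():
        ET.SubElement(interp, name).text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_emma("int1", "voice", 0.8, "call john",
                     {"action": "call", "callee": "john"}))
```

Because the annotation is the same whatever the input modality, an interaction manager could consume speech and handwriting results through one code path.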

SLIDE 16

Conclusion

  Next Generation Networks will provide converged IT and communication access to a set of existing and new services and applications
  Independence from the end-user device is a must
  Multimodal interfaces support the usability of such services and devices
  A network-centric architecture offering on-demand capabilities can support multi-device access
  Standardisation has to be continued; more interaction between the organisations might be needed to fulfil the common vision

SLIDE 17

www.alcatel.com