Prosody Basics
ECE 596D/LING 580G – Conversational AI
Trang Tran
University of Washington
Agenda
Announcements: Final presentations + demo (15 mins); poster session Monday, June 10, ECE 303, 2-4pm
Amazon guests
written words
“Mary knows many languages (that) you know.” (syntax)
elements in utterance
utterance
→ Acoustic cues, individually and in combination, signal prominence and phrasing
emphasis
phrase boundaries
→ Mapping between acoustic & symbolic levels is complex; challenging to annotate
Common annotation system: ToBI
Sequence of H(igh) & L(ow) tones
Break indices: 0-4
From: https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-911-transcribing-prosodic-structure-of-spoken-utterances-with-tobi-january-iap-2006/lecture-notes/chapter2_3/
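A minimal sketch of how a ToBI-style annotation could be held in code, assuming one record per word with an optional tone string (built from H and L symbols) and a break index from 0 to 4. The class name `ToBIWord`, the helper functions, and the labels on the example utterance are hypothetical and chosen for illustration only; they are not taken from an annotated corpus.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ToBIWord:
    """One word with ToBI-style labels (hypothetical container, not a standard API)."""
    word: str
    tones: Optional[str]  # e.g. "H*" or "H* L-L%"; None if no tonal event on this word
    break_index: int      # 0-4: strength of the juncture after this word

    def __post_init__(self) -> None:
        if not 0 <= self.break_index <= 4:
            raise ValueError(f"break index must be in 0-4, got {self.break_index}")


def prominent_words(words: List[ToBIWord]) -> List[str]:
    """Words carrying a pitch accent (a tone marked with '*')."""
    return [w.word for w in words if w.tones and "*" in w.tones]


def phrase_boundaries(words: List[ToBIWord], min_break: int = 3) -> List[str]:
    """Words followed by a break at or above min_break (assumed phrase-level threshold)."""
    return [w.word for w in words if w.break_index >= min_break]


# Illustrative labels only; a real annotation would come from a trained labeler.
utterance = [
    ToBIWord("Mary", "H*", 1),
    ToBIWord("knows", None, 1),
    ToBIWord("many", "H*", 1),
    ToBIWord("languages", "H* L-L%", 4),
]

print(prominent_words(utterance))
print(phrase_boundaries(utterance))
```

Keeping tones and break indices as two separate fields mirrors the two kinds of information listed on the slide: the tone sequence marks prominence and boundary melody, while the break index marks how strongly adjacent words are separated.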
boundaries (Grosjean et al., 1979)
Mary knows many languages you know
vs.
Mary knows many languages you know
(annotations in the figure: [pause], [reduced], [prominent])
structure of a sentence
text
repairs
annotations (ToBI)
Input: Mary knows many languages.
Input with disfluencies: [she knew] mary knows many uh languages
Output: (ROOT (S (NP (NNP Mary)) (VP (VBZ knows) (NP (JJ many) (NNS languages))) (. .)))
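The constituency parse of "Mary knows many languages." on this slide can be written as a bracketed string and read back into a nested structure. The sketch below is a small hand-written reader, assuming Penn-Treebank-style brackets; the function names `parse_brackets` and `leaves` are hypothetical helpers, not part of any parsing library, and the sketch does not handle the disfluent input.

```python
def parse_brackets(text):
    """Read a bracketed tree string into nested (label, children) tuples."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        if tokens[pos] != "(":
            raise ValueError("expected '(' at token %d" % pos)
        pos += 1
        label = tokens[pos]          # first token after '(' is the node label
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read())      # nested constituent
            else:
                children.append(tokens[pos])  # terminal word
                pos += 1
        pos += 1                     # consume ')'
        return (label, children)

    return read()


def leaves(tree):
    """Collect the terminal words of a tree, left to right."""
    _, children = tree
    words = []
    for child in children:
        if isinstance(child, tuple):
            words.extend(leaves(child))
        else:
            words.append(child)
    return words


TREE = "(ROOT (S (NP (NNP Mary)) (VP (VBZ knows) (NP (JJ many) (NNS languages))) (. .)))"
tree = parse_brackets(TREE)
print(leaves(tree))
```

Reading the leaves back out recovers the input sentence, which is a quick sanity check that the bracketed form and the word string describe the same utterance.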
boundaries (Grosjean et al., 1979)
1967; Huang & Hirschberg, 2015)
Mary knows many languages
vs.
Mary knows many languages
Useful for understanding structure (parsing)
Useful for generation (concept-to-speech)
context independent
predefined schemata
intensive signal processing; prone to distortion
available in most commercial systems
(acoustics) signals
to words
speech rate, voice, etc.
synthesis-markup-language-ssml-reference.html
reference-interjections-english-us.html
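As a rough illustration of controlling speech rate, emphasis, and pauses through markup, the sketch below builds a small SSML string with Python's standard `xml.etree.ElementTree`. The elements used (`speak`, `prosody`, `emphasis`, `break`) come from the W3C SSML specification; which attributes and values a given synthesis service accepts varies, so the specific values here are assumptions. The function `build_ssml` and its parameters are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_ssml(before, emphasized, after, rate="slow", pause_ms=300):
    """Wrap a phrase in <prosody>, emphasize one word, and add a trailing pause."""
    speak = ET.Element("speak")
    prosody = ET.SubElement(speak, "prosody", rate=rate)   # speech-rate control
    prosody.text = before + " "
    emphasis = ET.SubElement(prosody, "emphasis", level="strong")
    emphasis.text = emphasized
    emphasis.tail = " " + after                            # text following the emphasized word
    ET.SubElement(speak, "break", time=f"{pause_ms}ms")    # explicit pause
    return ET.tostring(speak, encoding="unicode")


ssml = build_ssml("Mary knows", "many", "languages")
print(ssml)
```

Building the markup through an XML library rather than string concatenation keeps the result well-formed and escapes any special characters in the text.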
with syntax
success
always available in low socio-economic communities
expressive prosody