
COMPUTATIONAL PARALINGUISTICS AND WHAT WE MIGHT GET FROM PHONETICS / SPEECH SCIENCE

Anton Batliner, May 7th, 2015, NSF, Arlington


  1. COMPUTATIONAL PARALINGUISTICS AND WHAT WE MIGHT GET FROM PHONETICS / SPEECH SCIENCE
     Anton Batliner, May 7th, 2015, NSF, Arlington

  2. The Topic
     Paralinguistics: not what but how → the person(s) behind
     The Interspeech Computational Paralinguistic Challenges
     ● 2009: emotion (children's speech)
     ● 2010: age & gender, affect (level of interest)
     ● 2011: intoxication (+/- alcoholised), sleepiness
     ● 2012: personality (big 5), likability, pathology
     ● 2013: social signals, conflict, emotion, autism
     ● 2014: physical load, cognitive load
     ● 2015: degree of nativeness, Parkinson's condition, eating condition
     The Book
     Computational Paralinguistics: Emotion, Affect and Personality in Speech and Language Processing. Björn Schuller & Anton Batliner, 344 pages, 2014, Wiley.

  3. Cultural Clashes: Phonetics (Speech Science) vs. Speech Processing
     ● interpretation
       Phonetics (knowledge-based): we don't really know what's happening, because only what we are looking for is what we get.
       Speech Processing (brute force): we don't know what's happening, but we know how good we can be (roughly).
     ● data
       Phonetics: small, laboratory, controlled; manual (labels, segmentation)
       Speech Processing: large, real-life; automatic pre-processing
     ● features
       Phonetics: few (low resolution, high generalisation)
       Speech Processing: many, brute forcing, MFCC (high resolution)
     ● processing
       Phonetics: basic inferential statistics, (M)anova, mixed models → significance
       Speech Processing: ML / Pattern Recognition, (fusion of) classifiers / regression → effect size
     ● driving force
       Phonetics: description, explanation, models
       Speech Processing: performance, applications
     both: what can we model, convey, teach?
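The "processing" row contrasts significance with effect size. The following is only a minimal sketch of why the two can diverge, using made-up numbers rather than anything from the talk: the same small shift between two groups yields a growing test statistic as the sample grows, while the standardised effect size stays roughly constant. The function names `cohens_d` and `welch_t` and the synthetic data are assumptions for illustration.

```python
import math
import statistics


def cohens_d(a, b):
    """Standardised mean difference using the pooled sample standard deviation."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    pooled_sd = math.sqrt(((na - 1) * va + (nb - 1) * vb) / (na + nb - 2))
    return (statistics.mean(a) - statistics.mean(b)) / pooled_sd


def welch_t(a, b):
    """Welch's t statistic (no p-value; larger magnitude means 'more significant')."""
    va, vb = statistics.variance(a), statistics.variance(b)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va / len(a) + vb / len(b))


def make_groups(n):
    """Synthetic data (assumption): values cycle through 0..9; group a is shifted by 0.5."""
    base = [float(i % 10) for i in range(n)]
    shifted = [x + 0.5 for x in base]
    return shifted, base


small_a, small_b = make_groups(20)
large_a, large_b = make_groups(2000)

print("n=20:   t =", round(welch_t(small_a, small_b), 2), " d =", round(cohens_d(small_a, small_b), 2))
print("n=2000: t =", round(welch_t(large_a, large_b), 2), " d =", round(cohens_d(large_a, large_b), 2))
```

With large real-life corpora, almost any difference can reach significance, which is one reason to report an effect size alongside it.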

  4. What to do: CP Challenges
     challenges:
     ● ML procedures, multi-modality, acoustic normalisation
     ● cross-corpus/language/culture databases
     ● speaker normalisation/adaptation
     ● confusions: hits vs. severe wrong assignments
     ● 'most important' features (from phonetics)
     ● hybrid approach: same constellation, a few features based on tradition / phonetic evidence vs. brute force feature sets with/without feature reduction/selection
     ● interests: performance, interpretation, usability in applications
     ● loudness in Parkinson's Condition - primary feature, to teach
     ● speech tempo in non-nativeness - secondary feature, not to teach
     ● speaker overlap in conflict - primary, but: different cultures! - to teach
     ● variability in depression or autism - cover feature, maybe to teach
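One way to read the "hits vs. severe wrong assignments" bullet is that two systems with identical accuracy can differ in how bad their errors are. The sketch below assumes ordinal class labels (for example three levels of some graded condition) and uses the absolute distance between class indices as a hypothetical severity measure; neither the measure nor the toy labels come from the slides.

```python
def accuracy(reference, predicted):
    """Fraction of exact hits."""
    return sum(r == p for r, p in zip(reference, predicted)) / len(reference)


def mean_severity(reference, predicted):
    """Mean absolute distance between class indices: 0 for a hit, larger for worse confusions."""
    return sum(abs(r - p) for r, p in zip(reference, predicted)) / len(reference)


# Toy ordinal labels (assumption): 0 = low, 1 = medium, 2 = high
reference = [0, 0, 1, 1, 2, 2]
system_a = [0, 1, 1, 2, 2, 1]  # every error lands in a neighbouring class
system_b = [0, 2, 1, 0, 2, 0]  # two errors jump across the whole scale

for name, predicted in [("A", system_a), ("B", system_b)]:
    print(name, "accuracy:", accuracy(reference, predicted),
          "mean severity:", round(mean_severity(reference, predicted), 3))
```

Both systems score the same accuracy here, but system B makes the more severe wrong assignments, which a plain hit rate does not show.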

  5. Features: Hybrid approach
     [Diagram: three processing chains]
     ● brute force: huge feature vector x → processing → performance; interpretation ? ?
     ● phonetics: hand-picked, few features y → processing → performance, interpretation
     ● hybrid: huge feature vector x → processing → performance; phonetic knowledge, y=x → processing → interpretation; usability in applications
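The hybrid idea can be sketched as locating a hand-picked, phonetically motivated feature inside a ranked brute-force feature vector. This is a minimal illustration under stated assumptions, not the procedure from the talk: the data are synthetic, the feature names (`loudness`, `brute_0` ...) are hypothetical, and the ranking uses a simple univariate standardised mean difference instead of any particular feature selection method.

```python
import random
import statistics


def separation(values_a, values_b):
    """Absolute standardised mean difference between two classes (univariate score)."""
    pooled_sd = ((statistics.variance(values_a) + statistics.variance(values_b)) / 2) ** 0.5
    return abs(statistics.mean(values_a) - statistics.mean(values_b)) / pooled_sd


def rank_features(class_a, class_b, names):
    """Return feature names sorted by descending separation; inputs are lists of rows."""
    scores = {}
    for j, name in enumerate(names):
        scores[name] = separation([row[j] for row in class_a], [row[j] for row in class_b])
    return sorted(names, key=lambda n: scores[n], reverse=True)


# Synthetic data (assumption): one informative feature plus 20 pure-noise features.
rng = random.Random(0)
names = ["loudness"] + [f"brute_{i}" for i in range(20)]


def make_rows(loudness_mean, n=200):
    """Each row: one 'loudness' value around loudness_mean, then 20 noise values."""
    return [[rng.gauss(loudness_mean, 1.0)] + [rng.gauss(0.0, 1.0) for _ in range(20)]
            for _ in range(n)]


controls = make_rows(0.0)
patients = make_rows(-2.0)  # hypothetical: reduced loudness in the patient group

ranking = rank_features(controls, patients, names)
hand_picked = ["loudness"]  # the few features chosen from phonetic knowledge
for name in hand_picked:
    print(name, "is ranked", ranking.index(name) + 1, "of", len(names))
```

If the hand-picked features also surface near the top of the brute-force ranking, the high-performing feature set gains an interpretation; if they do not, that mismatch is itself worth examining.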

  6. A Bandanna Approach
     Thank you for your attention
