F0 of Adolescent Speakers: First Results for the German Ph@ttSessionz Database


SLIDE 1

F0 of Adolescent Speakers

First Results for the German Ph@ttSessionz Database

  • Chr. Draxler, F. Schiel, T. Ellbogen

BAS (Bavarian Archive of Speech Signals), University of Munich, Germany

SLIDE 2

Introduction

  • previous f0 studies of adolescents
      • small numbers of speakers
      • limited and artificial speech material, e.g. sustained vowels

  • no speech data available
  • forensic databases
  • not available for German
SLIDE 3

Ph@ttSessionz: Goals

  • 1000 speakers
  • 50% male, 50% female (±5%)
  • 13-19 years
  • good dialect coverage
  • recorded via Internet in secondary schools
  • 22.05 kHz, 16 bit linear PCM, stereo
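
The recording format above can be checked mechanically. The following is a minimal sketch using Python's standard `wave` module; the function names and the in-memory helper are hypothetical and not part of the Ph@ttSessionz tooling, and it assumes the recordings are stored as plain PCM WAV files.

```python
import io
import wave

# Target format from the slide: 22.05 kHz, 16 bit linear PCM, stereo.
EXPECTED_RATE_HZ = 22050
EXPECTED_SAMPLE_WIDTH_BYTES = 2  # 16 bit
EXPECTED_CHANNELS = 2            # stereo


def matches_target_format(wav_file) -> bool:
    """Return True if a WAV file (path or file-like object) has the target format."""
    with wave.open(wav_file, "rb") as w:
        return (
            w.getframerate() == EXPECTED_RATE_HZ
            and w.getsampwidth() == EXPECTED_SAMPLE_WIDTH_BYTES
            and w.getnchannels() == EXPECTED_CHANNELS
        )


def bytes_per_second() -> int:
    """Uncompressed data rate implied by the target format."""
    return EXPECTED_RATE_HZ * EXPECTED_SAMPLE_WIDTH_BYTES * EXPECTED_CHANNELS


def make_silent_wav(rate: int, channels: int, seconds: float = 0.1) -> io.BytesIO:
    """Build a short silent 16 bit WAV in memory (hypothetical helper for trying this out)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(channels)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(b"\x00\x00" * channels * int(rate * seconds))
    buf.seek(0)
    return buf
```

At this format, one second of audio occupies 22050 × 2 × 2 = 88200 bytes before any container overhead.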
SLIDE 4

Session Contents

item                   #    item                   #
isolated digit        10    date                   3
numbers 11-100        19    time                   3
PC command phrases    12    directory assistance   9
telephone numbers     13    spelling              10
mobile phone keys      3    phonetically rich     30
credit card            3    spontaneous            5
PIN                    3    narrative              2

SLIDE 5

Session Contents

  • SpeechDat and RVG-I compatible
SLIDE 6

Speaker Data

  • date of birth, sex, weight, height
  • dialect region (federal state at age 6)
  • mother tongue of speaker and family
  • smoking habits, dental braces, piercings
SLIDE 7

F0 Analysis

  • pre-release version of the database
  • 762 speakers
  • ~49% female, ~51% male
  • good age distribution
  • biased dialect region distribution
  • 90829 utterances
SLIDE 8

F0 Calculation

  • Praat built-in algorithm
  • frequency range 75-400 Hz
  • max candidates 15
  • silence/voicing threshold 0.03/0.45
  • octave/jump/voiced cost 0.01/0.35/0.14
  • f0 mean, min, max (in Hz and mel)
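
As an illustration of the last bullet, the per-utterance statistics could be derived from a frame-wise f0 track roughly as follows. This is a sketch under stated assumptions, not the authors' procedure: the slides do not say which mel formula was used, so the common variant 2595 · log10(1 + f/700) is assumed here (Praat's own conversion uses a different formula), unvoiced frames are assumed to be encoded as 0 or `None`, and the mel statistics are computed by converting each frame before aggregating. The function names are hypothetical.

```python
import math


def hz_to_mel(f_hz: float) -> float:
    """Convert Hz to mel with 2595 * log10(1 + f / 700) (assumed variant)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def _stats(values):
    """Mean, min, and max of a non-empty list of numbers."""
    return {"mean": sum(values) / len(values), "min": min(values), "max": max(values)}


def f0_summary(track_hz, floor_hz=75.0, ceiling_hz=400.0):
    """Summarise the voiced frames of an f0 track in Hz and mel.

    Frames that are 0/None (assumed to mean unvoiced) or outside the
    analysis range are ignored. Returns None if no voiced frame remains.
    """
    voiced = [f for f in track_hz if f and floor_hz <= f <= ceiling_hz]
    if not voiced:
        return None
    return {
        "hz": _stats(voiced),
        "mel": _stats([hz_to_mel(f) for f in voiced]),
    }
```

Note that the mean of the mel values is generally not equal to the mel value of the Hz mean, because the conversion is nonlinear; which of the two was reported is not stated on the slide.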
SLIDE 9

F0mean vs. Age

[Chart: mean f0 (axis 0 to 250) by age (13 to 19 years), separate series for male (m) and female (f) speakers]
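
A plot of this kind needs one mean value per age and sex group. The sketch below shows one way to aggregate per-speaker means into such groups; the tuple layout `(age_years, sex, mean_f0_hz)` and the function name are assumptions made for illustration, not the database schema.

```python
from collections import defaultdict


def mean_f0_by_age_and_sex(speakers):
    """Average per-speaker mean f0 within each (age, sex) group.

    `speakers` is an iterable of (age_years, sex, mean_f0_hz) tuples
    (hypothetical layout). Returns a dict keyed by (age, sex).
    """
    sums = defaultdict(lambda: [0.0, 0])  # (age, sex) -> [running total, count]
    for age, sex, f0_hz in speakers:
        acc = sums[(age, sex)]
        acc[0] += f0_hz
        acc[1] += 1
    return {key: total / count for key, (total, count) in sorted(sums.items())}
```

Each speaker contributes equally to a group here, regardless of how many utterances they recorded; weighting by utterance count would be an alternative design.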

SLIDE 10

F0 vs. BMI

[Two charts: mean f0 vs. BMI (female) and mean f0 vs. BMI (male); BMI axis 0 to 40, f0 axis 0 to 350 Hz]
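
BMI is not stored directly in the speaker data; it follows from weight and height as weight divided by height squared (kg/m²). The sketch below computes it and, as one possible way to quantify the relationship shown in the charts, a Pearson correlation coefficient. The slides do not report a coefficient, and the units (kg, cm) and function names are assumptions.

```python
import math


def bmi(weight_kg: float, height_cm: float) -> float:
    """Body mass index in kg/m^2, assuming weight in kg and height in cm."""
    height_m = height_cm / 100.0
    return weight_kg / (height_m ** 2)


def pearson_r(xs, ys) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)
```

For example, `pearson_r([bmi(w, h) for w, h in body_data], mean_f0_values)` would give a single number per sex group, where `body_data` and `mean_f0_values` are hypothetical per-speaker lists.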

SLIDE 11

F0 Data

[Chart: f0 single digit, female (f): f0 min, f0 max, and f0 mean (axis 0 to 400) by age (13 to 19 years)]

SLIDE 12

F0 Data

[Two charts: f0 single digit, male (m), and f0 single digit, female (f); each shows f0 min, f0 max, and f0 mean (axis 0 to 400) by age (13 to 19 years)]

SLIDE 13

F0 Data

[Four charts: f0 single digit (m), f0 spelling geographical name (m), f0 spelling geographical name (f), and f0 single digit (f); each shows f0 min, f0 max, and f0 mean (axis 0 to 400) by age (13 to 19 years)]

SLIDE 14

F0 Range

  • F0abs = F0max - F0min
  • F0rel = F0max / F0min
  • scale
      • absolute Hz scale
      • perception-based mel scale
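
The two range measures above can be sketched as follows, computed on both scales. The mel conversion 2595 · log10(1 + f/700) is an assumption, since the slides do not name the formula used, and the function names are hypothetical.

```python
import math


def hz_to_mel(f_hz: float) -> float:
    """Convert Hz to mel with 2595 * log10(1 + f / 700) (assumed variant)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def f0_range(f0_min_hz: float, f0_max_hz: float):
    """Absolute (max - min) and relative (max / min) f0 range in Hz and mel."""
    if not 0 < f0_min_hz <= f0_max_hz:
        raise ValueError("expected 0 < f0_min_hz <= f0_max_hz")
    min_mel, max_mel = hz_to_mel(f0_min_hz), hz_to_mel(f0_max_hz)
    return {
        "hz": {"abs": f0_max_hz - f0_min_hz, "rel": f0_max_hz / f0_min_hz},
        "mel": {"abs": max_mel - min_mel, "rel": max_mel / min_mel},
    }
```

On the Hz scale, F0rel = 2.0 corresponds to exactly one octave. On the mel scale the same interval yields a smaller ratio, and the exact value depends on the conversion formula chosen.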
SLIDE 15

F0rel mel

[Chart: F0rel on the mel scale (axis 0 to 3.5) per utterance class. Classes shown: digit, number, n. geographical, n. company, n. person, command, time, PIN code, date, sentence, telephone, sp. geographical, sp. arbitrary, sp. person, mobile keys, credit card, short text, long production]

SLIDE 16

Outlook

  • use final release of the database
  • 864 speakers
  • refine analysis
  • re-compute F0 for phrases
SLIDE 17

Summary

  • Ph@ttSessionz database
  • largest database for adolescent speakers
  • technology development and research
  • statistically reliable voice data for German
  • F0 variation dependent on utterance class
SLIDE 18

Summary

  • F0 variation dependent on utterance class?