SLIDE 1

Predicting Tongue Shapes From A Few Landmark Locations

Chao Qin1, Miguel Á. Carreira-Perpiñán1, Korin Richmond2, Alan Wrench3, Steve Renals2

1EECS, School of Engineering, UC Merced, USA 2Centre for Speech Technology Research, University of Edinburgh, UK 3Queen Margaret University, Edinburgh, UK

Interspeech’08, Brisbane

SLIDE 2

Introduction

  • The tongue is the most important articulator in speech production
  • Articulatory datasets (MOCHA, Wisconsin X-ray microbeam) provide only a sparse representation of the tongue: 3 or 4 pellets
  • Questions

1. Are these 3 or 4 pellets sufficient to reconstruct the tongue shape?
2. How many pellets are necessary for an accurate reconstruction?
3. Where should they be placed optimally?

SLIDE 3

Machine learning approach

  • Assume midsagittal contours
  • Collect a training set of tongue contours (ground truth): y1, . . . , yn ∈ R^(2N)
  • Predict a test contour from the locations of the K pellets using a nonlinear regression y = f(x), with x ∈ R^(2K)
  • Estimate the mapping f from the training set (least squares)
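The regression just described can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names, the choice of centres, the width σ and the regularizer are all assumptions.

```python
import numpy as np

def rbf_features(X, centres, sigma):
    """Gaussian RBF features: phi_m(x) = exp(-||x - mu_m||^2 / sigma^2)."""
    d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(-1)   # (n, M) squared distances
    return np.exp(-d2 / sigma ** 2)

def train_rbf(X, Y, centres, sigma, reg=1e-6):
    """Least-squares fit of the weights W in f(x) = W' phi(x).

    X: (n, 2K) landmark coordinates, Y: (n, 2N) full contours.
    Solves the regularized normal equations (Phi'Phi + reg*I) W = Phi'Y.
    """
    Phi = rbf_features(X, centres, sigma)
    A = Phi.T @ Phi + reg * np.eye(Phi.shape[1])
    return np.linalg.solve(A, Phi.T @ Y)                        # (M, 2N)

def predict_contour(X, centres, sigma, W):
    """Map landmark locations to full reconstructed contours."""
    return rbf_features(X, centres, sigma) @ W                  # (n, 2N)
```

With the training inputs themselves as centres the network can interpolate the training set; in practice the centres, σ and the regularizer would be tuned on held-out data.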

SLIDE 4

Data collection

  • Ultrasound data of tongue movement

[Ultrasound image: midsagittal tongue contour, with the teeth shadow (front) and the hyoid bone shadow (back)]

SLIDE 5

Data collection

  • Ultrasound machine and head stabilization device (QMU)

SLIDE 6

Data collection

  • Tongue contour tracking

– A difficult task due to noisy ultrasound images
– Tongue parts are invisible from time to time
– Our solution: automatic tracking + manual correction

  • Automatic tracking by EdgeTrak (Li et al., 2005), based on snake segmentation
  • Tongue contour dataset

– One native English speaker with a Scottish accent
– 20 read TIMIT sentences
– Tongue contours and audio

  • Each contour = 2D positions of N = 24 points, i.e. y ∈ R^(2N)

SLIDE 7

Reconstructing tongue shape from a few landmarks

  • Unsupervised spline interpolation

– Uses only the information in the landmarks
– Smooth, but can easily penetrate the palate or teeth; poor extrapolation

  • Supervised prediction: learn a mapping from a training set

– Linear prediction
– Nonlinear prediction

  • We use Gaussian radial basis function (RBF) networks

– Universal mapping approximator
– Simple and fast training
– f(x) = W φ(x), with Gaussian basis functions φm(x) = exp(−‖x − μm‖²/σ²), m = 1, . . . , M
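The spline baseline can be sketched with SciPy's CubicSpline; parameterizing the spline by point index along the contour, and the helper's name, are assumptions, not the paper's exact setup:

```python
import numpy as np
from scipy.interpolate import CubicSpline

def spline_reconstruct(landmark_idx, landmarks, n_points=24):
    """Reconstruct an n_points contour by cubic-spline interpolation
    through K landmarks, parameterized by point index along the contour.

    landmark_idx: strictly increasing indices in [0, n_points) of the landmarks.
    landmarks:    (K, 2) array of the landmarks' (x, y) positions.
    """
    t = np.asarray(landmark_idx, dtype=float)
    cs = CubicSpline(t, landmarks, axis=0)   # one cubic per coordinate
    return cs(np.arange(n_points))           # (n_points, 2)
```

Because the spline sees only the K landmarks, nothing constrains the curve to stay clear of the palate or teeth, and points outside the landmark span are polynomial extrapolations.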

SLIDE 8

Experimental results

[Figure: reconstructions for frames F3, F97, F205, F428, F553, F663, F711 and frame 754, comparing the N-point contour, cubic B-spline interpolation, and RBF prediction from K = 3 landmarks; 10 mm scale bars]

SLIDE 9

Experimental results by RBF prediction

  • Landmarks: test each of the possible combinations of K contour points
  • Ignore unreasonable arrangements of landmarks

– Divide the contour into consecutive segments
– Constrain each landmark to select points from one segment

[Table: RMSE (mm) of RBF prediction vs. tongue position, for several values of K]
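The segment-constrained enumeration can be sketched as a Cartesian product with one landmark per segment; the 24-point contour and the equal three-way split here are illustrative assumptions:

```python
from itertools import product

def constrained_placements(segments):
    """All landmark placements with exactly one landmark per segment.

    segments: (start, end) half-open index ranges partitioning the contour;
    returns tuples of point indices, one index drawn from each segment.
    """
    return list(product(*(range(a, b) for a, b in segments)))

# e.g. a 24-point contour divided into 3 consecutive segments
placements = constrained_placements([(0, 8), (8, 16), (16, 24)])
```

Each candidate placement would then be scored by the RMSE of its reconstruction, keeping the best-scoring tuple.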

SLIDE 10

Experimental results by spline interpolation

  • Run spline interpolation on the same landmark locations as the RBF prediction
  • Worse than RBF prediction by an order of magnitude

[Table: RMSE (mm) of spline interpolation vs. tongue position, for several values of K]

SLIDE 11

Optimal locations of landmarks

Practical rule: quasi-equidistant placement, more landmarks on the tongue tip
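The quasi-equidistant part of the rule can be sketched by spacing landmarks evenly in arclength; this helper is hypothetical, and biasing `targets` toward the tip end would implement the "more landmarks on the tongue tip" part:

```python
import numpy as np

def quasi_equidistant(contour, k):
    """Indices of k points spaced (quasi-)equidistantly in arclength
    along an (n, 2) contour."""
    steps = np.linalg.norm(np.diff(contour, axis=0), axis=1)
    s = np.concatenate([[0.0], np.cumsum(steps)])   # cumulative arclength
    targets = np.linspace(0.0, s[-1], k)            # even arclength spacing
    return np.searchsorted(s, targets).clip(0, len(contour) - 1)
```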

SLIDE 12

Conclusions

  • Using 3 or 4 landmarks is sufficient to predict the tongue shape by a nonlinear mapping, with RMS error below 0.4 mm
  • Nonlinear prediction can produce very realistic tongue shapes and is much more reliable than spline interpolation
  • Useful for determining the optimal number and locations of landmarks for EMA and X-ray microbeam techniques
  • Small deviations from the optimal landmark locations increase the error only slightly
  • The approach is applicable to reconstructing 3D tongue shapes if 3D data are available
  • Future work

– Speaker adaptation
– Tongue contour animation for vocal tract visualization
– Augmenting the tongue pellets in the MOCHA and X-ray microbeam datasets, e.g. for articulatory inversion

  • Supported by NSF CAREER award IIS-0754089 and Marie Curie Early Stage Training Site EdSST (MEST-CT-2005-020568)

SLIDE 13

Acknowledgement

  • Thanks to D. Massaro and M. Cohen (UC Santa Cruz) for useful discussions