Automatic Speech Recognition (CS753)

Lecture 4: WFST algorithms contd. + WFSTs in ASR
Instructor: Preethi Jyothi
August 3, 2017

Quiz-1 Postmortem
Common mistakes:
• Missing insertion/deletion in E.fst
• Forgot to mark final states/self-loops
• Output vocabulary for T.fst has to be complete words (“bad”, “bead”, etc.) rather than letters
[Bar chart: correct vs. incorrect answers for (a) E.fst and (b) T.fst; axis from 0 to 50]

Project Proposal
• Start brainstorming!
• In case of doubt, discuss potential ideas with me during my office hours (Thur, 5:00 pm to 6:30 pm)
• Once decided, you will have to fill out a form specifying:
  • Title of the project
  • Names/roll numbers of all project members
  • A 300-400 word abstract of the proposed project
• Due by 11:59 pm on Aug 14th

Composition: Recap
• If T1 transduces x to z, and T2 transduces z to y, then T1 ∘ T2 transduces x to y
• Note: output alphabet of T1 ⊆ input alphabet of T2
• E.g. if T1 removes punctuation symbols from a string, and T2 changes uppercase letters to lowercase letters, then T1 ∘ T2 brings about both changes
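The punctuation/lowercase example can be sketched in a few lines of Python. This is a minimal illustration, not OpenFst's composition algorithm: the tuple representation, the names `compose` and `transduce`, and the chosen alphabets are all assumptions made for this sketch, and it only handles the case where T2 has no epsilon input labels.

```python
import string

EPS = ""  # the empty string stands in for the epsilon label (assumption)

def compose(t1, t2):
    """Compose transducers given as (arcs, start, finals).

    arcs maps a state to a list of (input, output, next_state).
    Assumes t2 has no epsilon inputs, so no epsilon filter is needed.
    """
    arcs1, start1, finals1 = t1
    arcs2, start2, finals2 = t2
    start = (start1, start2)
    arcs, finals = {}, set()
    stack, seen = [start], {start}
    while stack:
        q1, q2 = state = stack.pop()
        if q1 in finals1 and q2 in finals2:
            finals.add(state)
        out = arcs.setdefault(state, [])
        for x, z, n1 in arcs1.get(q1, []):
            if z == EPS:
                # T1 emits nothing, so T2 stays in its current state
                targets = [(EPS, (n1, q2))]
            else:
                # match T1's output symbol against T2's input symbol
                targets = [(y, (n1, n2))
                           for z2, y, n2 in arcs2.get(q2, []) if z2 == z]
            for y, nxt in targets:
                out.append((x, y, nxt))
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
    return arcs, start, finals

def transduce(fst, text):
    """Return all outputs for text (fst must have no epsilon inputs)."""
    arcs, start, finals = fst
    frontier = {(start, "")}
    for ch in text:
        frontier = {(n, out + y)
                    for q, out in frontier
                    for x, y, n in arcs.get(q, []) if x == ch}
    return {out for q, out in frontier if q in finals}

KEEP = string.ascii_letters + " "
PUNCT = ",.!?"
# T1 deletes punctuation; T2 lowercases letters (both single-state)
T1 = ({0: [(c, c, 0) for c in KEEP] + [(p, EPS, 0) for p in PUNCT]}, 0, {0})
T2 = ({0: [(c, c.lower(), 0) for c in KEEP]}, 0, {0})
T12 = compose(T1, T2)

print(transduce(T12, "Hi, Bob!"))  # {'hi bob'}
```

Note that T1's output alphabet (letters and space) is contained in T2's input alphabet, as the slide requires.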

Determinization: Recap
• A (W)FST is deterministic if:
  • Unique start state
  • No two transitions from a state share the same input label
  • No epsilon input labels
• Not all WFSAs can be determinized
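The three conditions translate directly into a check over an arc list. The sketch below assumes a hypothetical representation (a list of `(src, input_label, output_label, dst)` tuples and a collection of start states); it is not an OpenFst API.

```python
EPS = "<eps>"

def is_deterministic(arcs, start_states):
    """Check the three conditions for a (W)FST to be deterministic.

    arcs: list of (src, input_label, output_label, dst) tuples.
    """
    # Condition 1: unique start state
    if len(start_states) != 1:
        return False
    seen = set()
    for src, ilabel, _olabel, _dst in arcs:
        # Condition 3: no epsilon input labels
        if ilabel == EPS:
            return False
        # Condition 2: no two arcs from a state share an input label
        if (src, ilabel) in seen:
            return False
        seen.add((src, ilabel))
    return True

ok = [(0, "a", "x", 1), (0, "b", "y", 1), (1, "a", "x", 0)]
clash = [(0, "a", "x", 1), (0, "a", "y", 2)]
with_eps = [(0, "a", "x", 1), (1, EPS, "y", 2)]

print(is_deterministic(ok, [0]))        # True
print(is_deterministic(clash, [0]))     # False
print(is_deterministic(with_eps, [0]))  # False
```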

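Determinization: Weighted FSA
Some weighted FSAs are not determinizable! [M97]
[Figure: weighted FSA with states 0, 1, 2, 3]
In the example, two families of strings share a prefix repeated n times; one family has weight n and the other has weight 2n (the symbols themselves were lost in extraction).
After seeing the repeated prefix, an FSA can’t remember n.
[M97] M. Mohri. Finite-State Transducers in Language and Speech Processing. Computational Linguistics, 23(2), 1997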
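One way to see the problem is to run the weighted subset construction by hand. The sketch below is an assumption-laden illustration: the original figure is not recoverable, so the machine here (a placeholder symbol "x", one branch looping with weight 1, the other with weight 2, tropical semiring) is a hypothetical stand-in with the same weight pattern of n versus 2n.

```python
# Hypothetical WFSA (tropical semiring: weights add along a path).
# From state 0, reading "x" leads to state 1 (weight 1) or state 2 (weight 2);
# each branch then loops on "x" with the same weight.
ARCS = {
    (0, "x"): [(1, 1), (2, 2)],
    (1, "x"): [(1, 1)],
    (2, "x"): [(2, 2)],
}

def weighted_subset(n):
    """Weighted subset reached after reading "x" n times.

    After each step the smallest weight is factored out, leaving the
    residual weights that weighted determinization stores in the state.
    """
    subset = {0: 0}
    for _ in range(n):
        nxt = {}
        for q, w in subset.items():
            for dst, arc_w in ARCS.get((q, "x"), []):
                nxt[dst] = min(nxt.get(dst, float("inf")), w + arc_w)
        low = min(nxt.values())
        subset = {q: w - low for q, w in nxt.items()}
    return subset

for n in range(1, 5):
    print(n, weighted_subset(n))
# 1 {1: 0, 2: 1}
# 2 {1: 0, 2: 2}
# 3 {1: 0, 2: 3}
# 4 {1: 0, 2: 4}
```

Every n yields a different residual, so the construction would need a distinct state for each n and never terminates.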

Determinization: Recap
• A (W)FST is deterministic if:
  • Unique start state
  • No two transitions from a state share the same input label
  • No epsilon input labels
• Not all WFSAs can be determinized
• Guaranteed to yield a deterministic WFSA under some technical conditions characterising the automata (e.g. twins property)

Minimization
Minimization: find an equivalent deterministic FSA with the least number of states (and transitions)
Unweighted FSAs have a unique minimal FSA [Aho74]
[Figure: an FSA with states 0, 1, 2, 3 and its minimized form with states 0, 12, 3]
Obtained by identifying and merging equivalent states
[Aho74] Alfred V. Aho, John E. Hopcroft, and Jeffrey D. Ullman. The design and analysis of computer algorithms. Addison Wesley, 1974.
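Identifying equivalent states can be sketched with Moore-style partition refinement. This is one standard way to do it, not necessarily the algorithm the lecture had in mind; the DFA below is a hypothetical example chosen so that states 1 and 2 merge, and the function assumes a complete DFA (every state has an arc for every symbol).

```python
def equivalent_blocks(states, alphabet, delta, finals):
    """Group the states of a complete DFA into equivalence classes.

    delta maps (state, symbol) to the next state.
    Returns a dict mapping each state to its block id.
    """
    # Start by separating final from non-final states
    block = {q: int(q in finals) for q in states}
    while True:
        ids, new_block = {}, {}
        for q in states:
            # A state's signature: its block plus the blocks it moves to
            sig = (block[q],) + tuple(block[delta[q, a]] for a in alphabet)
            new_block[q] = ids.setdefault(sig, len(ids))
        if len(ids) == len(set(block.values())):
            return new_block  # no block was split, so we are done
        block = new_block

STATES = [0, 1, 2, 3]
ALPHABET = ["a", "b"]
DELTA = {
    (0, "a"): 1, (0, "b"): 2,
    (1, "a"): 3, (1, "b"): 3,
    (2, "a"): 3, (2, "b"): 3,
    (3, "a"): 3, (3, "b"): 3,
}
blocks = equivalent_blocks(STATES, ALPHABET, DELTA, finals={3})
print(blocks)  # {0: 0, 1: 1, 2: 1, 3: 2}
```

States 1 and 2 land in the same block, so the minimal FSA has three states.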

Minimization: Weighted FSA
Two states are equivalent only if, for every input string, the outcome (the weight assigned to the string, if accepted) starting from the two states is the same
[Figure: an FSA with states 0, 1, 2, 3 and its minimized form with states 0, 12, 3]
Redistribute weights before identifying equivalent states

Minimization: Weighted FSA
Reweighting OK as long as resulting WFSA is equivalent
Can reweight using a “potential function” on states
[Figure: a WFSA with states 0, 1, 2, 3 before and after reweighting, with per-state potentials added to and subtracted from the arc weights]
“Weight pushing”: Reweighting using a potential function that optimally moves weights towards the start state
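A minimal sketch of weight pushing in the tropical semiring, assuming the potential of a state is its shortest distance to a final state. The arc representation, the function names, and the example machine are hypothetical; the sketch also assumes every state can reach a final state and that there are no negative cycles.

```python
def push_weights(arcs, finals, start):
    """Push weights towards the start state (tropical semiring).

    arcs: list of (src, label, weight, dst); finals: {state: final weight}.
    Returns (initial_weight, new_arcs, new_finals).
    """
    states = {start} | set(finals) | {a[0] for a in arcs} | {a[3] for a in arcs}
    # Potential V[q] = shortest distance from q to a final state
    V = {q: finals.get(q, float("inf")) for q in states}
    for _ in range(len(states)):  # Bellman-Ford style relaxation
        for src, _label, w, dst in arcs:
            V[src] = min(V[src], w + V[dst])
    # Reweight: each arc gains the potential of its destination
    # and loses the potential of its source
    new_arcs = [(s, l, w + V[d] - V[s], d) for s, l, w, d in arcs]
    new_finals = {q: w - V[q] for q, w in finals.items()}
    return V[start], new_arcs, new_finals

def path_weight(initial, arcs, finals, start, labels):
    """Total weight of the (unique) path spelling out labels."""
    q, total = start, initial
    for lab in labels:
        _, _, w, q = next(a for a in arcs if a[0] == q and a[1] == lab)
        total += w
    return total + finals[q]

ARCS = [(0, "a", 1, 1), (0, "b", 3, 2), (1, "c", 3, 3), (2, "c", 2, 3)]
FINALS = {3: 0}
init, pushed, pushed_finals = push_weights(ARCS, FINALS, 0)

print(init)    # 4
print(pushed)  # [(0, 'a', 0, 1), (0, 'b', 1, 2), (1, 'c', 0, 3), (2, 'c', 0, 3)]
```

The string weights are unchanged (for example, "bc" weighs 3 + 2 = 5 before and 4 + 1 + 0 = 5 after), but states 1 and 2 now have identical outgoing arcs, which is what lets them be merged.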

Minimization: Weighted FSA
After weight-pushing, can simply apply unweighted FSA minimization (treating label/weight as label)
[Figure: the weight-pushed WFSA with states 0, 1, 2, 3 and its minimized form with states 0, 12, 3]
Guaranteed to yield a minimal WFSA (under some technical conditions required for weight-pushing)

Toolkits to work with finite-state machines
• AT&T FSM Library (no longer supported)
  http://www3.cs.stonybrook.edu/~algorith/implement/fsm/implement.shtml
• RWTH FSA Toolkit
  https://www-i6.informatik.rwth-aachen.de/~kanthak/fsa.html
• Carmel
  https://www.isi.edu/licensed-sw/carmel/
• MIT FST Toolkit
  http://people.csail.mit.edu/ilh/fst/
• OpenFst Toolkit (actively supported)
  http://www.openfst.org/twiki/bin/view/FST/WebHome

Brief Introduction to OpenFst

Quick Intro to OpenFst (www.openfst.org)
[Figure: FST with arcs 0 → 1 (an:a), 1 → 2 (<eps>:n), 0 → 2 (a:a)]
A.txt:
    0 1 an a
    1 2 <eps> n
    0 2 a a
    1
    2
Input alphabet (in.txt):
    <eps> 0
    an 1
    a 2
Output alphabet (out.txt):
    <eps> 0
    a 1
    n 2
The “0” label is reserved for epsilon

Quick Intro to OpenFst (www.openfst.org)
[Figure: the same FST with weighted arcs; state 2 is final with weight 0.1]
A.txt:
    0 1 an a 0.5
    1 2 <eps> n 1.0
    0 2 a a 0.5
    1
    2 0.1
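To make the text format concrete, here is a small Python reader for it. This is a sketch for illustration, not part of OpenFst: `parse_text_fst` is a hypothetical helper, it only handles transducer lines (`src dst ilabel olabel [weight]`) and final-state lines (`state [weight]`), and the embedded A.txt content is reconstructed from the slide, so the final-state lines in particular are an assumption.

```python
A_TXT = """\
0 1 an a 0.5
1 2 <eps> n 1.0
0 2 a a 0.5
1
2 0.1
"""

def parse_text_fst(text):
    """Parse a transducer in the text format used by fstcompile.

    Returns (start, arcs, finals). Omitted weights default to 0.0,
    which is the "one" of the tropical semiring.
    """
    arcs, finals, start = [], {}, None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if start is None:
            # The source state of the first line is the start state
            start = int(fields[0])
        if len(fields) <= 2:
            # Final state line: state [weight]
            finals[int(fields[0])] = float(fields[1]) if len(fields) == 2 else 0.0
        else:
            # Arc line: src dst ilabel olabel [weight]
            src, dst, ilabel, olabel = fields[:4]
            weight = float(fields[4]) if len(fields) == 5 else 0.0
            arcs.append((int(src), int(dst), ilabel, olabel, weight))
    return start, arcs, finals

start, arcs, finals = parse_text_fst(A_TXT)
print(start)   # 0
print(arcs[1]) # (1, 2, '<eps>', 'n', 1.0)
print(finals)  # {1: 0.0, 2: 0.1}
```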

Compiling & Printing FSTs
The text FSTs need to be “compiled” into binary objects before further use with OpenFst utilities
• Command used to compile:
    fstcompile --isymbols=in.txt --osymbols=out.txt A.txt A.fst
• Get back the text FST using a print command with the binary file:
    fstprint --isymbols=in.txt --osymbols=out.txt A.fst A.txt

Drawing FSTs
Small FSTs can be visualized easily using the draw tool:
    fstdraw --isymbols=in.txt --osymbols=out.txt A.fst | dot -Tpdf > A.pdf
[Figure: rendered FST with arcs 0 → 1 (an:a), 1 → 2 (<eps>:n), 0 → 2 (a:a)]

FSTs can get very large!

WFSTs applied to ASR

WFST-based ASR System
Acoustic Indices → [Acoustic Models] → Triphones → [Context Transducer] → Monophones → [Pronunciation Model] → Words → [Language Model] → Word Sequence

WFST-based ASR System: H (Acoustic Models, acoustic indices to triphones)
• One 3-state HMM for each triphone: a/a_b, b/a_b, ..., x/y_z
• FST union + closure of these HMMs gives the resulting FST H
[Figure: HMM FST for triphone a/a_b, with arcs f0:a/a_b and f1:ε through f6:ε]

WFST-based ASR System: C (Context Transducer, triphones to monophones)
[Figure: C⁻¹ over the phones x and y, with states (ε,*), (x,ε), (y,ε), (x,x), (x,y), (y,x), (y,y) and arcs such as x:x/ε_y, y:y/x_x, x:x/y_ε]
C⁻¹ arc labels: “monophone : phone / left-context_right-context”
Figure reproduced from “Weighted Finite State Transducers in Speech Recognition”, Mohri et al., 2002
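The “phone / left-context_right-context” labels can be generated for a phone sequence with a sliding window. This sketch only produces the label strings; it does not build the transducer itself, and `to_triphones` is a hypothetical helper that uses ε for a missing context, as in the figure.

```python
def to_triphones(phones, boundary="ε"):
    """Label each phone as phone/left-context_right-context."""
    padded = [boundary] + list(phones) + [boundary]
    # Slide a window of (left, phone, right) over the padded sequence
    return [f"{p}/{left}_{right}"
            for left, p, right in zip(padded, padded[1:], padded[2:])]

print(to_triphones(["x", "y", "x"]))
# ['x/ε_y', 'y/x_x', 'x/y_ε']
```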

WFST-based ASR System: L (Pronunciation Model, monophones to words)
[Figure: pronunciation lexicon with states 0 to 6 and arcs d:data/1, ey:ε/0.5, ae:ε/0.5, t:ε/0.3, dx:ε/0.7, ax:ε/1, d:dew/1, uw:ε/1]
Figure reproduced from “Weighted Finite State Transducers in Speech Recognition”, Mohri et al., 2002
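Enumerating the paths of L shows how one word maps to several weighted pronunciations. The topology below is an assumption: the arc labels come from the slide, but the state connections and final states (4 and 6) are reconstructed from the cited figure, and the weights are treated as probabilities that multiply along a path.

```python
# state -> list of (phone, output word or "", probability, next state)
LEXICON_ARCS = {
    0: [("d", "data", 1.0, 1), ("d", "dew", 1.0, 5)],
    1: [("ey", "", 0.5, 2), ("ae", "", 0.5, 2)],
    2: [("t", "", 0.3, 3), ("dx", "", 0.7, 3)],
    3: [("ax", "", 1.0, 4)],
    5: [("uw", "", 1.0, 6)],
}
FINALS = {4, 6}  # assumed final states

def pronunciations(state=0, phones=(), word="", prob=1.0):
    """Yield (word, phone string, probability) for every accepting path."""
    if state in FINALS:
        yield word, " ".join(phones), prob
    for phone, out, p, nxt in LEXICON_ARCS.get(state, []):
        yield from pronunciations(nxt, phones + (phone,), word + out, prob * p)

for word, phones, prob in pronunciations():
    print(f"{word}: {phones} ({prob:.2f})")
# data: d ey t ax (0.15)
# data: d ey dx ax (0.35)
# data: d ae t ax (0.15)
# data: d ae dx ax (0.35)
# dew: d uw (1.00)
```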

WFST-based ASR System: G (Language Model, words to word sequence)
[Figure: word-level grammar acceptor starting at state 0, with arcs including the, birds/0.404, animals/1.789, boy/1.789, are/0.693, were/0.693, is, walking]
