SLIDE 1

HTK Version 3.4 Features (cont)

Mark Gales, Andrew Liu & Phil Woodland

19th April 2007

HTK3 Development Team Cambridge University Engineering Department

HTK users meeting ICASSP’07

SLIDE 2

HTK Large Vocabulary Decoder - HDecode

  • Basic Features:

– bi-gram or tri-gram full decoding
– lattice generation
– lattice rescoring and alignment

  • Supporting many other HTK Features:

– fully integrated with adaptation schemes
– STC and HLDA
– lattice generation for discriminative training

  • Typical use in a multi-pass system
  • Limitations and Future Development


SLIDE 3

HDecode: Basic Features (1)

  • Tree-structured network based beam search cross-word tri-phone decoder.
  • Effective pruning techniques to constrain the search space (see the sketch after this list):

– main search beam
– word end beam
– maximum active models
– lattice beam
– LM back-off beam

  • Efficient likelihood computation during decoding:

– state and/or component output probability caching
– language model probability caching

  • Token set merging and LM score look-ahead during propagation
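As an illustration of the main search beam, here is a minimal token-passing pruning step in C. It is a sketch with hypothetical types (Token, the active flag), not HDecode's actual code; a real decoder interleaves this step with the word end beam, the lattice beam and the active-model cap.

    #include <math.h>
    #include <stddef.h>

    /* Hypothetical token: one active path hypothesis per HMM state. */
    typedef struct {
        double logScore;   /* combined acoustic + LM log score           */
        int    active;     /* non-zero while the token survives pruning  */
    } Token;

    /* Main-beam pruning: after propagating tokens for one frame, find
     * the best score and kill every token more than 'beam' below it.  */
    static void prune_main_beam(Token *toks, size_t n, double beam)
    {
        double best = -INFINITY;
        for (size_t i = 0; i < n; i++)
            if (toks[i].active && toks[i].logScore > best)
                best = toks[i].logScore;

        for (size_t i = 0; i < n; i++)
            if (toks[i].active && toks[i].logScore < best - beam)
                toks[i].active = 0;   /* outside the beam: prune */
    }

The word end and lattice beams apply the same rule to word-end tokens and lattice arcs respectively, while the maximum-active-models limit caps the number of surviving tokens directly.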


SLIDE 4

HDecode: Basic Features (2)

HDecode performs search using a model-level network expanded from the pronunciation dictionary and a finite-state grammar constructed from a word-based bi-gram or tri-gram model. In full decoding:

  • 1-best transcription stored in HTK MLF format.
  • word lattices may be generated in HTK SLF format (a fragment is shown below) with

– detailed timing
– word level scores (acoustic, LM and pron)
– LM and pron prob scaling factors
– other model specific information

  • Higher-order N-gram models can be applied to the resulting lattices (HLRescore).
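For orientation, a fragment of an SLF word lattice might look like the following (words and scores invented for illustration; the full field inventory, e.g. a = acoustic score, l = LM score, v = pronunciation variant, is in the HTK book):

    VERSION=1.0
    UTTERANCE=utt001
    lmscale=14.00 wdpenalty=-4.00
    N=3  L=2
    # nodes: I = node id, t = time (seconds)
    I=0  t=0.00
    I=1  t=0.31
    I=2  t=0.58
    # arcs: S/E = start/end node, W = word
    J=0  S=0  E=1  W=the  v=1  a=-312.64  l=-2.31
    J=1  S=1  E=2  W=cat  v=1  a=-401.12  l=-3.87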


SLIDE 5

HDecode: Basic Features (3)

  • HDecode can also take word lattices marked with LM scores as input, as in lattice rescoring.
  • HDecode outputs “word lattices” containing duplicate word paths of

– different pronunciation variants - “counterpoint”
– silence related different phone contexts - “fugue”

  • determinization of word lattices required prior to rescoring (HLRescore).
  • 1-best hypothesis and lattices generated as in full decoding.
  • model level alignment may also be generated in resulting lattices:

– model alignment and duration marked on lattice arcs
– important for discriminative training


SLIDE 6

HDecode: Newly Supported HTK Features

  • A variety of linear transformations for adaptation:

– MLLR transforms
– CMLLR transforms
– covariance transforms
– hierarchy of linear transformations

  • Covariance modeling and linear projection schemes:

– STC
– HLDA

  • Lattice generation for discriminative training:

– denominator word lattice generation
– model alignment of numerator and denominator lattices


SLIDE 7

HDecode: Typical use in a multi-pass system

  • Unadapted tri-gram decoding plus 4-gram rescoring to generate initial hypotheses, with tight pruning.
  • Bi-gram or tri-gram adapted full decoding to generate word lattices, with wide pruning (a command-line sketch follows the diagram).
  • Lattice expansion and pruning using more complicated LMs (HLRescore).
  • Lattice rescoring using re-adapted, more complicated acoustic models, and system combination.

[Diagram: multi-pass decoding flow: segmentation; normalisation and adaptation; initial transcription; adapt; lattice generation; lattices; adapt; rescoring passes P3a/P3x; 1-best and confusion network (CN) lattices; CNC combination.]
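As a rough command-line sketch of the lattice-generation pass above (all file names and numeric values are placeholders; exact option semantics are in the HTK book):

    HDecode -C config -H models/MMF -S test.scp \
            -t 220.0 -v 200.0 -u 8000 \
            -s 14.0 -p -4.0 \
            -w lm/trigram -z lat -l lattices/ \
            dict hmmlist

Here -t, -v and -u correspond to the main, word end and maximum-active-model pruning of slide 3; -s and -p are the LM scale factor and word insertion penalty; -z and -l request SLF lattice output.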


SLIDE 8

HDecode: Limitations and Future Development

  • Known limitations are:

– only works for cross-word tri-phones;
– sil and sp symbols are reserved for silence models; sp is appended to all words in the pronunciation dictionary;
– lattices generated require determinization for rescoring;
– only batch mode adaptation supported.

  • Possible future work areas:

– fast Gaussian likelihood computation?
– more efficient token pruning?
– incremental adaptation?


SLIDE 9

HTK Discriminative Training Tools

  • Basic Features:

– MMI
– MPE and MWE
– efficient lattice based implementation

  • Supporting many other HTK Features:

– fully integrated with adaptation schemes
– discriminative MAP
– lattice based adaptation
– single-pass re-training using new front-ends

  • Typical procedure for building discriminatively trained models


SLIDE 10

HTK Discriminative Training Tools: Training Criteria

Two types of discriminative training criteria supported:

  • maximum mutual information (MMI)

    F(\lambda) = \sum_r \log P(W_r \mid O_r, \lambda)

  • minimum Bayes risk (MBR)

    F(\lambda) = \sum_{r,\tilde{W}} P(\tilde{W} \mid O_r, \lambda)\, A(W_r, \tilde{W})

with the error cost function A(W, \tilde{W}) computed on

– phone model level - minimum phone error (MPE)
– word level - minimum word error (MWE)
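In practice the error cost is approximated rather than computed exactly. As a sketch of the standard approach (following Povey's MPE formulation, not stated on this slide), the accuracy of a hypothesis phone arc q is derived from its time overlap e(q, z) with each reference phone z:

    A(q) = \max_z \begin{cases} -1 + 2\,e(q,z) & \text{if } q \text{ and } z \text{ are the same phone} \\ -1 + e(q,z) & \text{otherwise} \end{cases}

This is the approximate correctness selected via EXACTCORRECTNESS (see slide 14).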


SLIDE 11

HTK Discriminative Training Tools: Basic Procedure

[Diagram: audio and reference transcripts, together with the ML acoustic model, feed HDecode (denominator lattices, using a weak LM) and HLRescore (numerator lattices); HMMIRest then produces the MPE acoustic model.]


SLIDE 12

HTK Discriminative Training Tools: I-smoothing

Flexible use of prior information for parameter smoothing:

  • Common priors used in I-smoothing:

– ML statistics
– MMI statistics
– static model based priors
– hierarchy of smoothing statistics back-off
– important for MPE/MWE training to generalize well

  • Applicable to a variety of systems:

– useful in discriminative MAP training
– gender dependent HMMs
– cluster adaptively trained HMMs (CAT)
– STC/HLDA models
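To make the smoothing concrete, here is the standard I-smoothed EBW mean update as a sketch (the general form from the MPE literature, not copied from these slides):

    \hat{\mu}_{jm} = \frac{\theta_{jm}^{\mathrm{num}}(O) - \theta_{jm}^{\mathrm{den}}(O) + D_{jm}\,\mu_{jm} + \tau\,\mu_{jm}^{\mathrm{prior}}}{\gamma_{jm}^{\mathrm{num}} - \gamma_{jm}^{\mathrm{den}} + D_{jm} + \tau}

Here the γ and θ(O) are occupancies and first-order statistics for state j, component m; D_{jm} is the per-Gaussian EBW constant (controlled by E on slide 14); μ^prior comes from whichever prior above is selected; and τ is the I-smoothing constant (ISMOOTHTAU). With τ = 0 the update reverts to plain EBW.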


SLIDE 13

HTK Discriminative Training Tools: Lattice Implementation

Two sets of model-marked lattices are required:

  • numerator lattices: from reference transcription
  • denominator lattices: from full recognition using weak LM

The efficient lattice-level forward-backward algorithm (a minimal sketch follows the list) benefits from:

  • support of flexible sharing of model parameters
  • state and Gaussian level output probability caching
  • Gaussian frame occupancy caching
  • fixed phone boundary model internal re-alignment - “Exact Match”
  • batch I/O access of lattices as merged lattice label files (LLF)
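To make the lattice-level forward-backward concrete, here is a minimal log-domain sketch in C over a topologically sorted arc list. The types are hypothetical; HMMIRest's real implementation additionally handles the cached likelihoods, occupancies and phone-boundary constraints listed above.

    #include <math.h>
    #include <stddef.h>

    #define LOG_ZERO (-1.0e10)

    /* log(exp(a) + exp(b)), computed stably */
    static double log_add(double a, double b)
    {
        if (a < b) { double t = a; a = b; b = t; }
        return (b <= LOG_ZERO) ? a : a + log1p(exp(b - a));
    }

    typedef struct {          /* one lattice arc, assumed topologically sorted */
        int    start, end;    /* node indices: 0 = initial, nNodes-1 = final   */
        double logLike;       /* scaled acoustic + LM log likelihood           */
    } Arc;

    /* alpha over nodes forward, beta backward, then
     * posterior(arc) = alpha[start] + like + beta[end] - alpha[final]. */
    void arc_posteriors(const Arc *arcs, size_t nArcs, size_t nNodes,
                        double *alpha, double *beta, double *post)
    {
        for (size_t n = 0; n < nNodes; n++) alpha[n] = beta[n] = LOG_ZERO;
        alpha[0] = 0.0;
        beta[nNodes - 1] = 0.0;

        for (size_t a = 0; a < nArcs; a++)           /* forward pass  */
            alpha[arcs[a].end] = log_add(alpha[arcs[a].end],
                                         alpha[arcs[a].start] + arcs[a].logLike);

        for (size_t a = nArcs; a-- > 0; )            /* backward pass */
            beta[arcs[a].start] = log_add(beta[arcs[a].start],
                                          beta[arcs[a].end] + arcs[a].logLike);

        for (size_t a = 0; a < nArcs; a++)           /* arc posteriors */
            post[a] = alpha[arcs[a].start] + arcs[a].logLike
                    + beta[arcs[a].end] - alpha[nNodes - 1];
    }

These arc posteriors are what weight the numerator and denominator statistics accumulated for the EBW update.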


SLIDE 14

HTK Discriminative Training Tools: Std Configurations

Useful common configuration variables (collected into an example file after the list):

  • E: constant used in EBW update, e.g., 2.0
  • LATPROBSCALE: acoustic scaling by LM score inverse, e.g., 1/13
  • ISMOOTH{TAU,TAUT,TAUW}: I-smoothing constants, e.g., 50/1/1 for MPE
  • PRIOR{TAU,TAUT,TAUW,K}: static prior, e.g., 25/10/10/1, for MPE-MAP
  • PHONEMEE: MWE or MPE training
  • EXACTCORRECTNESS: “Exact” or approximate error in MPE/MWE
  • MMIPRIOR: use MMI prior
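A minimal HMMIRest configuration for MPE training might therefore look like this (a sketch assembled from the values above; exact variable spellings and any required module prefixes should be checked against the HTK book):

    # MPE training with HMMIRest -- illustrative values only
    E                = 2.0       # EBW update constant
    LATPROBSCALE     = 0.0769    # 1/13: inverse LM scale, applied to acoustics
    ISMOOTHTAU       = 50        # I-smoothing for Gaussian parameters
    ISMOOTHTAUT      = 1         # ... for transition probabilities
    ISMOOTHTAUW      = 1         # ... for mixture weights
    PHONEMEE         = TRUE      # MPE (phone level) rather than MWE
    EXACTCORRECTNESS = FALSE     # approximate phone error (cf. slide 10)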


SLIDE 15

HTK Discriminative Training Tools: Supported HTK Features & Limitations

Many other useful HTK features are supported:

  • multi-streams, tied-mixtures and parameter tying
  • a variety of adaptation schemes, e.g., MMI/MPE-SAT
  • lattice based adaptation
  • single-pass re-training using new front-ends, e.g., bandwidth specific models

Known limitations are:

  • only diagonal covariance HMMs supported
  • Gaussian means and variances tied on the same level


SLIDE 16

HTK Discriminative Training Tools: General procedure

[Diagram: general procedure: speech audio and word-level reference transcripts, with the MLE model; numerator lattices are built from the reference transcripts; HDecode generates denominator word lattices using a deterministic uni-gram or heavily pruned bi-gram LM (HTK LM tools); HLRescore prepares both lattice sets and HMMIRest performs the update, yielding the MPE model.]
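In command form, the procedure sketched above might run roughly as follows (directory names are hypothetical and flags abbreviated; the HTK book gives the authoritative invocations):

    # 1. Denominator lattices: recognise the training audio with a weak LM
    HDecode -C cfg -H mle/MMF -S train.scp -w lm/weak_bg \
            -z lat -l denlat/ dict hmmlist

    # 2. Numerator lattices from the reference transcripts, then model-mark
    #    both lattice sets (HDecode.mod in HTK 3.4 performs the marking);
    #    see the HTK book / DT tutorial for the exact invocations.

    # 3. Discriminative (MPE) update of the MLE model
    HMMIRest -C mpe.cfg -H mle/MMF -q numlat/ -r denlat/ \
             -S train.scp hmmlist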


SLIDE 17

Thank you!
