Tandem investigations - Dan Ellis 2001-01-25 - 1
Tandem modeling investigations
Dan Ellis, International Computer Science Institute, Berkeley CA <dpwe@icsi.berkeley.edu>

Outline
1  What makes Tandem successful?
2  Can we make Tandem better?
3  Does Tandem work with LVCSR tricks?
What makes Tandem work?
(with Manuel Reyes)
- Model diversity?
- try a phone-based GMM model
- try training the NN model to HTK state labels
- Discriminative network training?
- (try posteriors derived from GMMs via Bayes' rule)
[Figure: Tandem system block diagram — PLP and MSG feature streams from the input sound each feed a neural-net classifier; the classifiers' pre-nonlinearity outputs are combined, PCA-orthogonalized, and passed as features to Gaussian mixture models and the HTK decoder, which outputs words.]
Relative improvements:
- Tandem combo over HTK mfcc baseline: +53%
- Tandem over HTK: +35%
- Tandem over hybrid: +25%
- Combo over mfcc: +25%
- Combo over plp: +20%
- Combo over msg: +20%
- NN over HTK: +15%
- Combo-into-HTK over combo-into-noway: +15%
- Pre-nonlinearity over posteriors: +12%
- KLT over direct: +8%
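The "pre-nonlinearity over posteriors" and "KLT over direct" gains correspond to taking the net's pre-softmax activations and decorrelating them with a KLT (PCA) before handing them to the Gaussian mixtures. A minimal numpy sketch of that step, under assumptions: the 181-to-40 dimensions are illustrative (taken from a later slide), and the function name and synthetic data are hypothetical:

```python
import numpy as np

def tandem_features(pre_softmax, n_keep=None):
    """Decorrelate NN pre-nonlinearity (pre-softmax) outputs with a
    KLT/PCA learned from the data itself.

    pre_softmax: (n_frames, n_units) array of linear net outputs.
    n_keep: number of leading components to retain (None keeps all).
    """
    centered = pre_softmax - pre_softmax.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)         # ascending eigenvalues
    basis = eigvecs[:, np.argsort(eigvals)[::-1]]  # descending variance
    if n_keep is not None:
        basis = basis[:, :n_keep]
    return centered @ basis

# Synthetic stand-in for 181 correlated net outputs over 400 frames,
# reduced to 40 decorrelated dimensions.
rng = np.random.default_rng(0)
acts = rng.normal(size=(400, 181)) @ rng.normal(size=(181, 181))
feats = tandem_features(acts, n_keep=40)
```

Projecting onto the covariance eigenvectors makes the retained feature dimensions uncorrelated, which suits the diagonal-covariance Gaussian mixtures downstream.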
Phone vs. word models
- Try a phone-based HTK model
(instead of whole-word models)
- Try training NN model to subword-state labels
- 181 net outputs; reduce to 40 in KLT
- Results (Aurora2k, HTK-baseline WER ratio):
- Diversity doesn't help
- Subword units may be good for the NN
System                   test A: matched   test B: var noise   test C: var chan
Tandem PLP baseline           63.5%             70.3%              59.5%
Phone-based HTK sys           63.6%             72.5%              61.5%
Subword-based NN sys          63.1%             62.8%              55.1%
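The table entries are WER ratios relative to the HTK baseline, so lower is better and 100% would exactly match the baseline. A trivial sketch of how such an entry is read (the example WER values here are hypothetical, not from the deck):

```python
def wer_ratio(system_wer, baseline_wer):
    """Express a system's word error rate as a percentage of the
    HTK-baseline WER (the convention used in these tables)."""
    return 100.0 * system_wer / baseline_wer

# A hypothetical system at 6.35% WER against a 10% HTK baseline
# would be reported as roughly 63.5 in the table.
ratio = wer_ratio(6.35, 10.0)
```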
[Figure: Tandem block diagram — PLP features from the input sound feed a neural-net classifier (trained either on phoneme targets or on subword states); its outputs are KLT-orthogonalized and passed to Gaussian mixture models and the HTK decoder, which outputs words.]
Enhancements to Tandem-Aurora
- More tandem-feature-domain processing:
- Results (HTK baseline WER ratio):
- delta-KLT-norm reaches 80% of the Tandem-baseline WER
System                     test A: matched   test B: var noise   test C: var chan
PLP: Tandem baseline            63.5%             70.3%              59.5%
PLP: norm - KLT                 72.6%             71.2%              63.6%
PLP: KLT - norm                 57.8%             58.8%              51.3%
PLP: KLT - delta                59.0%             60.2%              52.9%
PLP: KLT - delta - norm         58.1%             59.9%              48.9%
PLP: delta - KLT - norm         54.7%             53.6%              46.9%
[Figure: processing order — neural-net classifier pre-nonlinearity outputs, with optional norm/deltas applied before and/or after KLT orthogonalization, then Gaussian mixture models.]
Best effort Tandem system
- Deltas & norms help PLP: try them on the combo (PLP+MSG) system
- Deltas hurt for MSG: features too sluggish?
- Deltas help clean, norms help noisy:
System                  test A: matched   test B: var noise   test C: var chan
PLP+MSG: baseline            51.1%             52.0%              45.6%
PLP+MSG: dlt-KLT-nrm         50.9%             50.5%              43.6%
PLP+MSG: KLT-nrm             48.3%             49.5%              39.4%
[Figure: WER (%) vs. SNR (dB), from clean down to 5 dB, comparing the baseline against the KLT-delta (K-D) and KLT-norm (K-N) systems.]
Tandem for LVCSR: the SPINE task
(with Rita Singh/CMU & Sunil Sivadas/OGI)
- Noisy spontaneous speech, ~5000 word vocab
- Recognition:
- same tandem features
- NN training bootstrapped from Broadcast News, then iterated
- GMM-HMM adds context dependence and MLLR adaptation
[Figure: SPINE Tandem system — PLP and MSG feature calculation on the input sound each feed a neural-net classifier; the pre-nonlinearity outputs are combined and PCA-decorrelated, then passed to the SPHINX recognizer (GMM classifier producing subword likelihoods, HMM decoder with MLLR adaptation) to produce words.]
SPINE-Tandem results
- Evaluation WER results:
- much better for CI systems
- differences evaporate with CD, MLLR
- Not quite fair:
- CD senones optimized for MFCC
- worth 2-3% absolute?
- Not unexpected:
- NN confounds CD variants
- Tandem ‘space’ very nonlinear - bad for MLLR
- Any hope?
- more training data / train CD classes / ...
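The MLLR concern above can be made concrete: MLLR adapts a recognizer by applying a shared affine transform to the Gaussian means, which presumes that speaker/channel variation is roughly linear in the feature space; in a strongly nonlinear tandem feature space no single transform fits all classes well. A minimal sketch of the mean update (all dimensions and values here are hypothetical):

```python
import numpy as np

def mllr_adapt_means(means, A, b):
    """Apply one shared MLLR-style affine transform to all Gaussian
    means: mu' = A @ mu + b. A single (A, b) only adapts well when
    the mismatch is near-linear in feature space, which is the worry
    for the highly nonlinear tandem features."""
    return means @ A.T + b

means = np.zeros((8, 3))       # 8 Gaussians with 3-dim means (toy sizes)
A = np.eye(3) * 1.1            # hypothetical regression matrix
b = np.full(3, 0.5)            # hypothetical bias
adapted = mllr_adapt_means(means, A, b)
```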