BAT System Description for NIST LRE 2015
BUT+Agnitio+Torino

Oldrich Plchot, Pavel Matejka, Radek Fer, Ondrej Glembek, Ondrej Novotny, Jan Pesan, Lukas Burget, Martin Karafiat, Karel Vesely, Lucas Ondel, Santosh Kesiraju, Frantisek Grezl, Sri Harish Mallidi (JHU), Ruizhi Li (JHU), Niko Brummer, Albert Swart, Sandro Cumani

June 22, Bilbao, Odyssey 2016

Data
● Fixed training condition
  ○ Train - 60% of training data, short cuts generated evenly from 3 to 30 seconds
  ○ Dev - 40% of training data, short cuts ranging from 3 to 30 seconds with uniform distribution
● Open training condition
  ○ all relevant data we managed to find ;) (no Babel data for i-vectors, just for BN features)
  ○ main additions are KALAKA-3 (European Spanish, British English) and Arabic - Al Jazeera free corpus
● Details in our system description / Odyssey paper
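The split and the cut generation can be sketched as follows. This is a minimal illustration, assuming the 60/40 split is done over whole recordings and that cut lengths are drawn independently and uniformly from the 3 to 30 second range; the function names, the seeded generator, and the handling of the last cut are hypothetical and not taken from the authors' tooling.

```python
import random


def split_train_dev(recordings, train_fraction=0.6, seed=0):
    """Shuffle recordings and split them into train and dev portions."""
    rng = random.Random(seed)
    shuffled = list(recordings)
    rng.shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


def make_cuts(duration, min_len=3.0, max_len=30.0, seed=0):
    """Chop one recording of `duration` seconds into consecutive cuts.

    Cut lengths are drawn uniformly from [min_len, max_len]. The last cut is
    truncated to the end of the recording (an assumption of this sketch), and
    a remainder shorter than min_len is dropped.
    """
    rng = random.Random(seed)
    cuts, start = [], 0.0
    while duration - start >= min_len:
        length = min(rng.uniform(min_len, max_len), duration - start)
        cuts.append((start, start + length))
        start += length
    return cuts


train, dev = split_train_dev(range(10))
cuts = make_cuts(100.0)
```

Each entry of `cuts` is a `(start, end)` pair in seconds; every cut is between 3 and 30 seconds long.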

Stacked Bottleneck features (SBN)
● Based on a hierarchy of two NNs. Bottlenecks from the first network are stacked in time and used as inputs to the second NN.
● Bottlenecks from the second NN are the final features.
● Fixed condition training data
  ○ Switchboard with ~7k triphone state targets
  ○ LRE15 training data with labels obtained using an acoustic unit discovery tool (200 3-state units)
● Open condition training data
  ○ 17 languages from the Babel project (IARPA) as Multilingual BN - with ~100 phone states per language
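The "stacked in time" step can be illustrated with a short sketch. The context offsets used here (-10, -5, 0, +5, +10 frames) and the edge handling are assumptions made for the example; the slides do not state which frames are stacked. The function name is hypothetical.

```python
def stack_in_time(frames, offsets=(-10, -5, 0, 5, 10)):
    """Concatenate each frame with its neighbours at the given frame offsets.

    frames: list of per-frame bottleneck vectors from the first network
            (each a list of floats).
    Edges are handled by clamping the index, i.e. repeating the first or
    last frame (an assumption of this sketch).
    """
    last = len(frames) - 1
    stacked = []
    for t in range(len(frames)):
        row = []
        for off in offsets:
            idx = min(max(t + off, 0), last)
            row.extend(frames[idx])
        stacked.append(row)
    return stacked


# Toy input: 30 frames of 1-dimensional "bottleneck" outputs.
toy_frames = [[float(t)] for t in range(30)]
second_nn_input = stack_in_time(toy_frames)
```

With five offsets, the input to the second network is five times as wide as one first-stage bottleneck vector.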

General system overview
● i-vector based systems:
  ○ Features:
    ■ DNN bottlenecks trained on
      ● Switchboard English (Fixed cond.)
      ● Babel data – multilingual bottleneck features (Open cond.)
    ■ MFCC-SDC+PLLR (phone LLH ratios)
  ○ 2048 Full or Diagonal GMM/UBM, 600 dimensional i-vectors
  ○ Gaussian Linear Classifier (GLC) seems sufficient
    ■ Including i-vector uncertainty in scoring helps
● Frame Level Sequence Summarizing NN (SSNN)
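A Gaussian Linear Classifier models each language with its own mean and one covariance matrix shared by all languages, which makes the per-language scores linear in the i-vector. The sketch below is a toy, pure-Python illustration of that idea on 2-dimensional vectors; it is not the authors' implementation, it does not include the i-vector uncertainty mentioned above, and all function names are hypothetical.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def train_glc(vectors, labels):
    """Estimate per-class means and one within-class covariance shared by all classes."""
    classes = sorted(set(labels))
    dim = len(vectors[0])
    means = {}
    for c in classes:
        members = [v for v, l in zip(vectors, labels) if l == c]
        means[c] = [sum(v[d] for v in members) / len(members) for d in range(dim)]
    cov = [[0.0] * dim for _ in range(dim)]
    for v, l in zip(vectors, labels):
        diff = [v[d] - means[l][d] for d in range(dim)]
        for i in range(dim):
            for j in range(dim):
                cov[i][j] += diff[i] * diff[j] / len(vectors)
    # Shared covariance => score_c(x) = w_c . x + b_c,
    # with w_c = cov^-1 mean_c and b_c = -0.5 * mean_c . w_c
    model = {}
    for c in classes:
        w = solve(cov, means[c])
        model[c] = (w, -0.5 * sum(mi * wi for mi, wi in zip(means[c], w)))
    return model


def score_glc(model, x):
    """Per-class log-likelihoods, up to a constant that is the same for all classes."""
    return {c: sum(wi * xi for wi, xi in zip(w, x)) + b for c, (w, b) in model.items()}


# Toy data: two well-separated classes "a" and "b" (hypothetical labels).
vectors = [[0, 0], [1, 0], [0, 1], [1, 1], [4, 4], [5, 4], [4, 5], [5, 5]]
labels = ["a"] * 4 + ["b"] * 4
model = train_glc(vectors, labels)
scores_near_a = score_glc(model, [0.2, 0.3])
scores_near_b = score_glc(model, [4.8, 4.9])
```

The highest-scoring class is the decision; in a real system the vectors would be 600-dimensional i-vectors and the scores would go on to calibration and fusion.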

Fusion with Prior-weighted Logistic Regression
● Fusion is trained on dev data in the score domain
● One weight per system and one bias per language
● Cluster prior: for the data of each cluster, we used a cluster-specific prior, with zero probabilities for out-of-cluster languages and equal weights within the cluster
● Alternative system, to allow between-cluster analysis, uses a uniform (flat) prior over all languages
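The scoring side of this fusion can be sketched as below: a weighted sum of the systems' per-language scores plus a per-language bias, followed by a posterior computed under a cluster prior. Training of the weights and biases (the logistic regression itself) is not shown. The language labels, cluster names, and function names are hypothetical placeholders chosen for the example.

```python
import math


def fuse(system_scores, weights, biases):
    """Fused score per language: sum_k weights[k] * system_scores[k][l] + biases[l]."""
    return [
        sum(w * s[l] for w, s in zip(weights, system_scores)) + biases[l]
        for l in range(len(biases))
    ]


def cluster_prior(languages, cluster_of, cluster):
    """Equal prior inside `cluster`, zero for every out-of-cluster language."""
    inside = [cluster_of[lang] == cluster for lang in languages]
    return [1.0 / sum(inside) if flag else 0.0 for flag in inside]


def posteriors(fused, prior):
    """Posterior over languages: prior times exp(fused score), normalized."""
    active = [f for f, p in zip(fused, prior) if p > 0]
    top = max(active)  # subtract the max for numerical stability
    unnorm = [p * math.exp(f - top) if p > 0 else 0.0 for f, p in zip(fused, prior)]
    total = sum(unnorm)
    return [u / total for u in unnorm]


# Hypothetical example: two systems, three languages in two clusters.
languages = ["fr-a", "fr-b", "en-a"]
cluster_of = {"fr-a": "french", "fr-b": "french", "en-a": "english"}
fused = fuse([[1.0, 2.0, 3.0], [0.0, 1.0, 5.0]], weights=[0.5, 2.0], biases=[0.1, 0.2, 0.3])
prior = cluster_prior(languages, cluster_of, "french")
post = posteriors(fused, prior)
```

With the cluster prior, the out-of-cluster language gets exactly zero posterior even though it has the highest fused score; replacing `prior` with equal values for all languages gives the uniform-prior alternative.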

Fixed Training Condition

Fusion
System name    DEV cavg*    EVL cavg / cavg*
Primary        1.9          18.1 / 13.5
Alternate1     1.24         19.4 / 13.4

Single systems
System name                    classf     DEV cavg*    EVL cavg
SBN80-SWB1-KALDI--CD           GLC COV    2.41         16.9
SBN80-SWB1--CD                 NN         2.80         19.9
SDC-PLLR--CD                   GLC        4.72         22.0
SBN80-AUTO600-KALDI--CD        GLC COV    5.46         27.0
SSNN / Alternate 2             NN         10.46        35.0
SBN80-SWB1-KALDI--CD / Alt3    GLC        2.31         18.48

● Eval: Single best system better than Primary fusion
● Calibration
  ○ Almost no calibration loss on Dev
  ○ Fairly large calibration loss on eval
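The size of the eval calibration loss can be read off the two fusion rows. The snippet below assumes that cavg* denotes Cavg after ideal recalibration and that calibration loss is the gap cavg - cavg*; this reading of the asterisk is an assumption, since the slides do not define it.

```python
# Eval results for the fixed-condition fusions, copied from the table above,
# stored as (cavg, cavg*) in the table's units.
eval_fusion = {"Primary": (18.1, 13.5), "Alternate1": (19.4, 13.4)}


def calibration_loss(cavg, cavg_star):
    """Gap between actual and (assumed) ideally calibrated Cavg."""
    return cavg - cavg_star


losses = {name: round(calibration_loss(*pair), 1) for name, pair in eval_fusion.items()}
```

Under that assumption the gap is 4.6 for Primary and 6.0 for Alternate1.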

Cluster dependent i-vector
● Average of scores from 6 systems, where each UBM is trained only on data in a given cluster

Fixed Training Condition
System name                 DEV cavg*    EVAL cavg    EVAL cavg*
SBN80-SWB1-KALDI            2.9          20.1         16.2
SBN80-SWB1-KALDI-CD         2.5          19.7         15.4
SBN80-SWB1-KALDI-CD diag    2.3          18.5         14.9
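The score averaging across the cluster-dependent systems amounts to a per-language mean, as in this minimal sketch. The score values are made up for illustration and the function name is hypothetical; the slides do not say whether any normalization is applied before averaging.

```python
def average_scores(per_system_scores):
    """Average score vectors (one per cluster-dependent system) language by language."""
    n_systems = len(per_system_scores)
    n_langs = len(per_system_scores[0])
    return [sum(s[l] for s in per_system_scores) / n_systems for l in range(n_langs)]


# Six hypothetical systems, each scoring the same three languages.
six_systems = [
    [1.0, 0.0, -1.0],
    [2.0, 0.0, -2.0],
    [3.0, 0.0, -3.0],
    [1.0, 0.0, -1.0],
    [2.0, 0.0, -2.0],
    [3.0, 0.0, -3.0],
]
averaged = average_scores(six_systems)
```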

Sequence Summarizing NN

Open Training Condition

Fusion
System name    DEV cavg*    EVL cavg / cavg*
Primary        7.14         14.1 / 10.3
Alternate1     7.15         14.1 / 10.4

Single systems
System name                    classf     DEV cavg*    EVL cavg
SSNN                           NN         30.0         41.3
ML-17-SBN-CD                   GLC        8.8          13.9
MultilangRDT                   GLC        10.4         13.6
SBN80-SWB1-KALDI--CD           GLC        10.4         17.6
SDC-PLLR-CD                    NN         12.7         21.4
SBN80-AUTO600-KALDI            NN         15.6         25.0
ML-17-SBN - trained on Open    GLC COV    8.9          12.0

● Single best system trained fully on the Open Training condition is better than the fusion

Analysis of training data
- Analysis of using different training data for UBM/ivec and classifier
- Important to train i-vector and classifier on Open dataset
[Figure legend: F .... Fixed Training data; O .... Open; labels read UBM/IVEC_Classifier; "Submission"]

Comparison of different features
- Fixed Training Condition
- all systems with 2048G FullCov UBM, 600 ivec and Gaussian classifier
[Bar chart; feature labels not recoverable. Values as extracted: * 16.1, 20.1, * 19.7, 22.1, 28.9, 23.8]
* Violates fixed data condition (post eval analysis only)

French cluster disaster
● Radio vs. Telephone in DEV - most probably overtrained for channel
● Channel is taking over on the EVAL data
● Calibration on eval data is not able to fix a wrong classifier

Comparison of different i-vector classifiers
- Different classifiers perform similarly
- Gaussian Linear Classifier (GLC)
- Language Dependent i-vector (LDI)
- Multiclass Multivariate Fully Bayesian Gaussian Classifier (MMFBG)
- Neural Network
- Logistic Regression

Automatically derived acoustic units for BN training
- Variational Bayes trained Dirichlet Process mixture of HMMs
- Open loop of infinite number of phone-like units
- 3 state HMMs, 2 Gaussians per state
- 2048G FullCov UBM, 600 ivec and GLC + cuts

Fixed data condition
Features               DEV cavg*    EVAL cavg / cavg*
MFCC-SDC               6.3          23.8 / 21.5
SBN80-AUTO600-KALDI    5.4          28.9 / 24.2
SBN80-SWB1-KALDI       2.9          20.1 / 16.2

- we can do better than SDC baseline on DEV even without transcription
- conventional bottleneck trained on (probably) any data is still better

Conclusion - lessons learned
● State-of-the-art system is an i-vector system with Bottleneck features
● GLC with uncertainty performs similarly to GLC trained with a lot of small cuts
● Phonotactic systems do not contribute to the final fusion
● Data engineering is always important
● Frame level NN approaches
  ○ prone to overtraining
  ○ Better to use NN as a source of counts which are modelled by another classifier
● Other systems
  ○ Denoising/Dereverberation with NN - helping on EVL but not on DEV
  ○ Phonotactic systems - with Switchboard phoneme recognizer
  ○ Frame level DNN

THANK YOU
