SLIDE 1

Improved Statistical Models for SMT-Based Speaking Style Transformation

Graham Neubig, Yuya Akita, Shinsuke Mori, Tatsuya Kawahara School of Informatics, Kyoto University, Japan

SLIDE 2

  • 1. Overview of Speaking-Style Transformation

SLIDE 3

Speaking Style Transformation (SST)

  • ASR is generally modeled to find the verbatim utterance V given acoustic features X
  • In many cases verbatim speech is difficult to read
  • In order to create usable transcripts from ASR results, it is necessary to transform V into clean text W

V: ya know when I was asked earlier about uh the issue of coal uh you under my plan uh of a cap and trade system ...
W: When I was asked earlier about the issue of coal under my plan of a cap and trade system, ...

SLIDE 4

Previous Research

  • Detection-Based Approaches
  • Focus on deletion of fillers, repeats, and repairs, as well as insertion of punctuation
  • Modeled using noisy-channel models [Honal & Schultz 03, Maskey et al. 06], HMMs, and CRFs [Liu et al. 06]
  • SMT-Based Approaches
  • Treat spoken and written language as different languages, and “translate” between them
  • Proposed by [Shitaoka et al. 04] and implemented using WFSTs and log-linear models in [Neubig et al. 09]
  • Able to handle colloquial expression correction and insertion of dropped words (important for formal settings)

SLIDE 5

Research Summary

  • Propose two enhancements of the statistical model for finite-state SMT-based SST
  • Incorporation of context in a noisy-channel model by transforming context-sensitive joint probabilities into conditional probabilities
  • Allowing greater emphasis on frequent patterns by log-linearly interpolating joint and conditional probability models
  • Evaluation of the proposed methods on both verbatim transcripts and ASR output for the Japanese Diet (national congress)

SLIDE 6

  • 2. Noisy-Channel and Joint-Probability Models for SMT

SLIDE 7

Noisy Channel Model

  • Statistical models for SST attempt to maximize P(W|V)
  • Training requires a parallel corpus of W and V
  • It is generally easier to acquire a large volume of clean transcripts (W) than a parallel corpus (W and V)
  • Bayes' law is used to decompose the probability
  • P_l(W) is estimated using an n-gram (3-gram) model

Ŵ = argmax_W P(W|V) = argmax_W P_t(V|W) P_l(W)

where P_t(V|W) is the Translation Model (TM) and P_l(W) is the Language Model (LM)

SLIDE 8

Probability Estimation for the TM

  • P_t(V|W) is difficult to estimate for the whole sentence
  • Assume that the word TM probabilities are independent
  • Set the sentence TM probability equal to the product of the word TM probabilities
  • However, the word TM probabilities are actually not context independent

PtV∣W ≈∏

i

P tvi∣wi

I like told him that I really like his new hairstyle.

Pt(like| ε ) Pt(like| ε, H1 ) (large) Pt(like| ε, H2 ) (small)

PtV∣W 

SLIDE 9

Joint Probability Model [Casacuberta & Vidal 2004]

  • The joint probability model is an alternative to the noisy-channel model for speech translation
  • Sentences are aligned into matching words or phrases

V = ironna e- koto de chumon tsukeru to desu ne ...
W = iroiro na koto de chumon o tsukeru to ...

  • A sequence Γ of word/phrase pairs is created

Γ = ironna/iroiro_na e-/ε koto/koto de/de chumon/chumon ε/o tsukeru/tsukeru to/to desu/ε ne/ε

Ŵ = argmax_W P_t(W,V)

SLIDE 10

Joint Probability Model (2)

  • The probability of Γ is estimated using a smoothed n-gram model trained on Γ strings
  • Context information is contained in the joint probability
  • However, this probability can only be trained on parallel text (an LM probability cannot be used)
  • It is desirable to have a context-sensitive model that can be used with a language model

PtW ,V =Pt≈∏k =1

K

Ptk∣k−n1

k−1

 argmax

W

PtW∣V ≠argmax

W

P tW ,V PlW 

SLIDE 11

  • 3. A Context-Sensitive Translation Model

SLIDE 12

Context-Sensitive Conditional Probability

  • It is possible to model the conditional (TM) probability from right-to-left, similarly to the joint probability

PtV∣W =∏i=1

k

Ptvi∣v1,,vi−1,w1,,wk =∏i=1

k

Ptvi∣1, ,i−1 ,wi ,,wk   vi−2 vi−1 vi vi1 vi 2   wi−2 wi−1 wi wi1 wi2 

Context Information Prediction Unit

SLIDE 13

Independence Assumptions

  • To simplify the model, we make two assumptions
  • Assume that word probabilities rely only on preceding words
  • Limit the history length

Pt V∣W ≈∏i=1

k

Ptvi∣1, ,i−1 ,wi  vi−2 vi−1 vi vi1 vi2   wi−2 wi−1 wi wi1 wi2  Pt V∣W ≈∏i=1

k

Ptvi∣i−n1,,i−1,wi

SLIDE 14

Calculating Conditional Probabilities from Joint Probabilities

  • It is possible to decompose this equation into its numerator and denominator
  • The numerator is equal to the joint n-gram probability, while the denominator can be marginalized
  • This conditional probability uses context information and can be combined with a language model

Ptvi∣i−n1 ,,i−1,wi= Pti∣i−n1 ,,i−1 Ptwi∣i−n1 ,,i−1 Pt vi∣i−n1 ,,i−1,wi= Pti∣i− n1,,i−1

 ∈{ :〈  v , wi〉}

Pt  ∣i−n1 ,, i−1
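The division-and-marginalization step can be sketched with a toy bigram over pairs γ = (v, w). The table values are invented for the illustration; a real model would use a smoothed n-gram trained on Γ strings:

```python
# Sketch of P_t(v_i | γ_{i-1}, w_i): the joint bigram probability of the pair,
# divided by the marginal over all pairs whose target side is w_i.

joint_bigram = {
    # (history pair, current pair) -> P_t(γ_i | γ_{i-1}), toy values
    (("chumon", "chumon"), ("ε", "o")): 0.30,
    (("chumon", "chumon"), ("o", "o")): 0.10,
    (("chumon", "chumon"), ("tsukeru", "tsukeru")): 0.25,
}

def conditional(v, w, prev_pair):
    """P_t(v | prev_pair, w): joint n-gram prob over the marginal for w."""
    num = joint_bigram.get((prev_pair, (v, w)), 0.0)
    den = sum(prob for (hist, (_, ww)), prob in joint_bigram.items()
              if hist == prev_pair and ww == w)
    return num / den if den > 0 else 0.0

p = conditional("ε", "o", ("chumon", "chumon"))
# 0.30 / (0.30 + 0.10) = 0.75: given history "chumon/chumon" and target word
# "o", the verbatim side is ε (the particle was dropped in speech) most often
```

The result is a conditional probability that keeps the pair-level context of the joint model, so it can be multiplied with a language model as in the noisy channel.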

SLIDE 15

Training the Proposed Model (Noisy-Channel Model)

[Diagram: clean transcripts (W; the Diet meeting minutes) form a clean corpus used to train the LM P_l(W); verbatim transcripts or ASR results (V) paired with clean transcripts form a parallel corpus used to train the joint probability P_t(W,V), from which the context-sensitive TM P_t(V|W) is calculated]

SLIDE 16

Log-Linear Interpolation with the Joint Probability

  • The joint probability contains information about pattern frequency not present in the conditional probability
  • High-frequency patterns are more reliable
  • The strong points of both models can be utilized through log-linear interpolation

Example: c(γ_1) = 100, c(w_1) = 1000; c(γ_2) = 1, c(w_2) = 10 → P_t(v_1|w_1) = P_t(v_2|w_2), but P_t(γ_1) ≠ P_t(γ_2)

log P(W|V) ∝ λ_1 log P_t(V|W) + λ_2 log P_l(W) + λ_3 log P_t(V,W)

(the first two terms are the noisy-channel model; the third is the joint probability)
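The interpolation itself is a weighted sum of log probabilities. In the sketch below the weights and probabilities are illustrative only; in practice the λ weights are tuned on held-out data (the "weight training" set in the evaluation):

```python
import math

# Sketch of log P(W|V) ∝ λ1 log P_t(V|W) + λ2 log P_l(W) + λ3 log P_t(V,W).
# Weights and probabilities are toy values, not tuned parameters.

def loglinear_score(pt_v_given_w, pl_w, pt_joint, lambdas=(1.0, 1.0, 0.5)):
    """Weighted sum of log TM, LM, and joint-probability scores."""
    l1, l2, l3 = lambdas
    return (l1 * math.log(pt_v_given_w)
            + l2 * math.log(pl_w)
            + l3 * math.log(pt_joint))

# Two candidates with equal noisy-channel scores but different joint probs:
a = loglinear_score(0.2, 0.05, 0.01)    # frequent pattern
b = loglinear_score(0.2, 0.05, 0.0001)  # rare pattern
# a > b: the joint term breaks the tie in favor of the frequent pattern
```

This is exactly the situation in the count example above: identical conditional probabilities, but the joint term rewards the pattern seen 100 times over the one seen once.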

SLIDE 17

Training the Proposed Model (Log-Linear Model)

[Diagram: the same training flow as for the noisy-channel model, with the context-sensitive TM P_t(V|W), the LM P_l(W), and the joint probability P_t(W,V) combined log-linearly using weights λ_1, λ_2, λ_3]

SLIDE 18

  • 4. Evaluation
SLIDE 19

Experimental Setup

  • Verbatim transcripts and ASR output of meetings from the Japanese Diet were used as a target
  • TM training:
  • Verbatim system: verbatim transcripts and clean text
  • ASR system: ASR output and clean text
  • Baseline: noisy channel, 3-gram LM, 1-gram TM

Data Type       | Size  | Time Period
LM Training     | 158M  | 1/1999 - 8/2007
TM Training     | 2.31M | 1/2003 - 10/2006
Weight Training | 66.3k | 10/2006 - 12/2006
Testing         | 300k  | 10/2007

SLIDE 20

Effect of Translation Models (Verbatim Transcripts)

  • 4 models were compared:
    A) The context-sensitive noisy-channel model
    B) A with log-linear interpolation of the LM and TM
    C) The joint-probability model
    D) B and C log-linearly interpolated
  • Evaluated using edit distance from the clean transcript (WER); with no editing, the WER was 18.62%

Model                        | TM 1-gram | TM 2-gram | TM 3-gram
A. Noisy-Channel (Noisy)     | 6.51%     | 5.33%     | 5.32%
B. Noisy-Channel (Noisy LL)  | 5.99%     | 5.15%     | 5.13%
C. Joint Probability (Joint) | 9.89%     | 4.70%     | 4.60%
D. B+C (Noisy+Joint LL)      | 5.81%     | 4.12%     | 4.05%
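The evaluation metric, word-level edit distance against the clean reference, can be sketched with standard dynamic programming. The example sentence is invented for the illustration:

```python
# Minimal sketch of the evaluation metric: WER as word-level edit distance
# between the system output and the clean reference, via dynamic programming.

def wer(ref, hyp):
    """Word error rate: (substitutions + insertions + deletions) / len(ref)."""
    r, h = ref.split(), hyp.split()
    # d[i][j] = edit distance between r[:i] and h[:j]
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i
    for j in range(len(h) + 1):
        d[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            sub = d[i - 1][j - 1] + (r[i - 1] != h[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(r)][len(h)] / len(r)

rate = wer("when I was asked about the issue",
           "ya know when I was asked about uh the issue")
# 3 extra words against a 7-word reference
```

An unedited verbatim transcript scores 18.62% under this metric, which is the "no editing" baseline quoted above.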

SLIDE 21

Effect of Translation Models (ASR Output)

Model                        | TM 1-gram | TM 2-gram | TM 3-gram
A. Noisy-Channel (Noisy)     | 21.83%    | 21.00%    | 21.09%
B. Noisy-Channel (Noisy LL)  | 21.63%    | 20.97%    | 21.09%
C. Joint Probability (Joint) | 28.61%    | 22.62%    | 21.98%
D. B+C (Noisy+Joint LL)      | 21.32%    | 20.04%    | 20.03%

  • The WER between ASR output and verbatim transcripts (ASR WER) was 17.10%
  • The WER between ASR output and clean transcripts was 36.10%
  • The noisy-channel model was more effective than the joint-probability model for ASR output

SLIDE 22

Comparison with Phrase-Based SMT (New Results)

  • The proposed techniques were also compared with Moses, a popular system for phrase-based SMT
  • Noisy LL achieves performance as good as or better than Moses, while Noisy+Joint greatly outperforms it

Model                          | Verbatim WER | ASR WER
Baseline                       | 6.51%        | 21.83%
Noisy LL (2-gram or 3-gram)    | 5.13%        | 20.97%
Noisy+Joint (2-gram or 3-gram) | 4.05%        | 20.03%
Moses                          | 5.45%        | 20.97%

SLIDE 23

Effect of Corpus Size (Verbatim Transcripts)

[Figure: WER (%) from 4.0% to 8.0% vs. words in TM training data (3.2k to 2.32M), comparing Baseline, Noisy LL 3-gram, Joint 3-gram, and Noisy+Joint 3-gram]

  • The noisy-channel model is more effective with small data sizes, but the joint model improves rapidly
  • Combining both allows for greater accuracy at all sizes
SLIDE 24

Conclusion

  • We proposed two improved statistical models for SMT-based SST
  • The proposed methods showed a significant improvement over the baseline for verbatim transcripts and ASR results
  • Models transforming ASR output can be trained without using verbatim transcripts
  • A promising future direction is tight coupling with a WFST-based ASR decoder

SLIDE 25

Thank you for listening.

SLIDE 26

Target Phenomena

  • Deletion of Extraneous Words: these include fillers (“um”), context-dependent deletions (“like”), repeats
  • Colloquial Expressions: expressions used in speech but less in writing (“ya'know” → “you know”, “ironna” → “iroiro-na”)
  • Insertion of Words and Punctuation: words are omitted in speech, but not in writing (“[did you] talk to the boss?”, “chumon [o] tsukeru”)

  • Other Phenomena: order reversal, repairs, fragments

[Annotated example:
V: いろんな あー こと で 注文 つける と です ね … (ironna a- koto de chumon tsukeru to desu ne)
W: いろいろ な こと で 注文 を つける と … (iroiro na koto de chumon o tsukeru to)
with substitutions, fillers, insertions, and non-filler deletions labeled]

SLIDE 27

Effect of Corpus Size (ASR Results)

[Figure: WER (%) from 19.5% to 25.5% vs. words in TM training data (3.2k to 2.32M), comparing Baseline, Noisy LL 3-gram, Joint 3-gram, and Noisy+Joint 3-gram]

SLIDE 28

Accuracy by Transformation Type (Verbatim Transcript)

[Figure: F-measure (0%–100%) by transformation type (fillers, deletions, insertions, substitutions, commas, periods) for Noisy LL 1, Joint 3, Noisy LL 3, and Noisy+Joint LL 3]

SLIDE 29

Accuracy by Transformation Type (ASR Output)

[Figure: F-measure (0%–100%) by transformation type (fillers, deletions, insertions, substitutions, commas, periods) for Noisy LL 1, Joint 3, Noisy LL 3, and Noisy+Joint LL 3]