Empirical Study of a Two-Step Approach to Estimate Translation - PowerPoint PPT Presentation

Empirical Study of a Two-Step Approach to Estimate Translation Quality
J. González-Rubio, J.R. Navarro-Cerdán, F. Casacuberta
jegonzalez@dsic.upv.es, jonacer@iti.upv.es, fcn@dsic.upv.es
Pattern Recognition and Human Language Technology Group
Instituto Tecnológico de Informática
Universitat Politècnica de València (Spain)
Work supported by the EU 7th Framework program (FP/2007-2013) under the CasMaCat project (grant no. 287576)
JGR,JNC,FCN Empirical Study of a Two-Step Approach to Estimate Translation Quality IWSLT’13

Overview
• Introduction
• Proposed Two-Step Quality Estimation Approach
• Experimental Setup
• Results
• Conclusions

Introduction

Motivation
• Quality estimation (QE) is a key element in practical translation systems
• Usually addressed as a regression problem
  – Predict a quality score from a set of translation features
• Problem: translation features are ambiguous, noisy, and collinear
• Chosen solution: a two-step training methodology
[Figure: pipeline diagram. Source sentence, translation, and additional sources of information → feature computation → original feature set → dimensionality reduction → reduced feature set → machine learning model → prediction]

Two-Step Quality Estimation Approach

Dimensionality Reduction
• Based on Partial Least Squares Regression (PLSR)
• The widely used PCA takes into account only the features
  – Principal components (PCs) contain almost no redundancy...
  – ...but they are not necessarily the best features for prediction
• In contrast, PLSR does take into account the values to be predicted
  – The new set of Latent Variables (LVs) contains almost no redundancy
  – Additionally, the LVs explain most of the variability in the quality scores
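The contrast between PCA and PLSR can be illustrated with a minimal sketch of single-response PLS (the NIPALS-style PLS1 recursion) in plain Python. This is an illustration under stated assumptions, not the authors' implementation: the toy feature columns, the quality scores, and the function name pls1_scores are hypothetical. Each weight vector is built from the covariance between the features and the scores, so the latent variables follow the values to be predicted, and the deflation step makes successive latent variables orthogonal to each other.

```python
import math

def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))

def centered(values):
    """Subtract the mean from a list of numbers."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def pls1_scores(columns, y, n_lvs):
    """Return up to n_lvs latent-variable score vectors (PLS1, one response).

    `columns` holds one list per feature; `y` holds the quality scores.
    """
    X = [centered(col) for col in columns]
    r = centered(y)                      # residual of the response
    scores = []
    for _ in range(n_lvs):
        # Weights: covariance of each (deflated) feature with the residual.
        w = [dot(col, r) for col in X]
        norm = math.sqrt(dot(w, w))
        if norm < 1e-12:                 # nothing left to explain
            break
        w = [v / norm for v in w]
        # Score vector: projection of every sample onto the weights.
        t = [sum(wj * col[i] for wj, col in zip(w, X)) for i in range(len(r))]
        tt = dot(t, t)
        loadings = [dot(col, t) / tt for col in X]
        q = dot(r, t) / tt
        # Deflation: remove what this latent variable already explains.
        X = [[v - ti * p for v, ti in zip(col, t)] for col, p in zip(X, loadings)]
        r = [v - q * ti for v, ti in zip(r, t)]
        scores.append(t)
    return scores

# Hypothetical toy data: the second feature is collinear with the first.
features = [
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    [2.0, 4.0, 6.0, 8.0, 10.0, 12.0],
    [0.5, -0.3, 0.8, -0.1, 0.4, -0.6],
]
quality = [1.2, 1.9, 3.1, 3.8, 5.1, 5.9]

lvs = pls1_scores(features, quality, n_lvs=2)
print("latent variables:", len(lvs))
print("orthogonal:", abs(dot(lvs[0], lvs[1])) < 1e-8)
```

A PCA-based reduction would replace the weight computation with the leading eigenvector of the feature covariance, which never looks at the quality scores.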

Prediction Model
• Goal: predict the actual quality scores from the LVs
• Model: Support Vector Machines for regression (SVR)
• Good empirical prediction accuracy in a number of tasks
• Widely used in the QE literature

Experimental Setup

Corpus
• English-Spanish news texts from the WMT 2012 QE task
• 1832 translations for training and 422 for test
• Each translation has a real-valued score between one and five
• Post-editing effort Likert scale:
  5: The translation requires little editing to be publishable
  4: 10% – 25% of the translation needs to be edited
  3: 25% – 50% of the translation needs to be edited
  2: 50% – 70% of the translation needs to be edited
  1: The translation must be translated from scratch

Feature Sets
[Figure: stacked bar chart showing, for each feature set, the percentage of clean, collinear, and constant features. Feature sets (number of features): DCU-SYMC (308), LORIA (49), SDLLW (15), TCD (43), UEDIN (56), UPV (497), UU (82), WLV-SHEF (147)]
• Wide variety of experimental conditions

Experimental Methodology
• Evaluation metric: Root Mean Squared Error (RMSE)
• Free parameters optimized by 10-fold cross-validation
  – Number of LVs and SVR meta-parameters
• Eight dev-train folds, one dev-tuning fold, and one dev-test fold
  – Result: prediction accuracy averaged over the held-out dev-test folds
• Final models built on the whole training set using the best parameter values
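The methodology above can be sketched in plain Python as follows. This is a hedged illustration, not the authors' code: the function names, the rule that the dev-tuning fold is the one following the dev-test fold, and the stand-in model (a scaled mean predictor in place of PLSR plus SVR) are all assumptions made for the example.

```python
import math

def rmse(predicted, reference):
    """Root Mean Squared Error between two equal-length lists."""
    squared = sum((p - r) ** 2 for p, r in zip(predicted, reference))
    return math.sqrt(squared / len(reference))

def fold_roles(n_samples, n_folds=10):
    """Yield (dev_train, dev_tuning, dev_test) index lists for each rotation.

    Assumption: the dev-tuning fold is the fold right after the dev-test fold.
    """
    folds = [list(range(k, n_samples, n_folds)) for k in range(n_folds)]
    for k in range(n_folds):
        tune_k = (k + 1) % n_folds
        train = [i for j, fold in enumerate(folds)
                 if j not in (k, tune_k) for i in fold]
        yield train, folds[tune_k], folds[k]

def cross_validate(fit, candidates, xs, ys, n_folds=10):
    """Average dev-test RMSE, choosing the free parameter on dev-tuning.

    `fit(param, xs, ys)` is a hypothetical callable returning a predict function.
    """
    def pick(indices, seq):
        return [seq[i] for i in indices]

    errors = []
    for train, tuning, test in fold_roles(len(ys), n_folds):
        def tuning_error(param):
            predict = fit(param, pick(train, xs), pick(train, ys))
            return rmse([predict(x) for x in pick(tuning, xs)], pick(tuning, ys))

        best = min(candidates, key=tuning_error)
        predict = fit(best, pick(train, xs), pick(train, ys))
        errors.append(rmse([predict(x) for x in pick(test, xs)], pick(test, ys)))
    return sum(errors) / len(errors)

def fit_scaled_mean(scale, xs, ys):
    """Stand-in model: predicts scale * mean(training scores), ignoring x."""
    mean = sum(ys) / len(ys)
    return lambda x: scale * mean

# Hypothetical toy data: 20 samples that all have quality score 2.0.
toy_xs = list(range(20))
toy_ys = [2.0] * 20
print(cross_validate(fit_scaled_mean, [0.5, 1.0], toy_xs, toy_ys))
```

In the setting described above, the candidates would be combinations of the number of LVs and the SVR meta-parameters, and the final model would be refit on the whole training set with the selected values.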

Results

Cross-Validation RMSE Results
[Figure: bar chart of cross-validation RMSE for Baseline, PCA, and Our approach on the feature sets DCU-SYMC, LORIA, SDLLW, TCD, UEDIN, UPV, UU, WLV-SHEF]
• Equal or lower prediction error than Baseline and PCA
  – Error reduction correlated with the number of noisy features

Cross-Validation Feature Reduction Ratio
[Figure: bar chart of the percentage of the original features retained by Baseline, PCA, and Our approach on the feature sets DCU-SYMC, LORIA, SDLLW, TCD, UEDIN, UPV, UU, WLV-SHEF]
• About half as many LVs as PCs
• Operational time of the QE system greatly reduced

Cross-Validation Learning Curves
[Figure: two panels, one for the SDLLW feature set and one for the UPV feature set, plotting RMSE against the number of latent variables for Baseline, PCA, and Our approach. The band indicates the 95% confidence interval of prediction accuracy (RMSE)]
• Larger and faster error reduction for highly redundant sets (left plot)
• Same accuracy with fewer features for concise sets (right plot)

Test Results
[Figure: bar chart of test RMSE for Baseline, PCA, and Our approach on the feature sets DCU-SYMC, LORIA, SDLLW, TCD, UEDIN, UPV, UU, WLV-SHEF]
• The results differ from those of cross-validation. Why?

Analysis of Test Results
• Hypothesis: the training partition did not adequately represent the test partition
• Studied by a series of Hotelling’s two-sample T² tests
  – Multivariate analog of Student’s t-test
  – Compares two independently drawn samples
    ∗ E.g., training and test partitions
  – Do they belong to the same population?
[Figure: samples from the same population (Population 1), p > 0.01; samples from different populations (Population 2), p < 0.01]
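The statistic behind such a test can be sketched in plain Python using the standard textbook definition of the two-sample Hotelling T² with a pooled covariance matrix. The helper names and the toy samples are hypothetical, and this is not the authors' implementation. Converting the resulting F statistic into a p-value needs the F distribution, which is not part of the Python standard library, so the sketch stops at the statistic.

```python
def mean_vector(rows):
    """Column-wise mean of a list of equal-length rows."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def scatter(rows, mean):
    """Sum of outer products of deviations from the mean, i.e. (n - 1) * S."""
    p = len(mean)
    s = [[0.0] * p for _ in range(p)]
    for row in rows:
        d = [v - m for v, m in zip(row, mean)]
        for i in range(p):
            for j in range(p):
                s[i][j] += d[i] * d[j]
    return s

def solve(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[pivot] = m[pivot], m[c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= factor * m[c][k]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][k] * z[k] for k in range(r + 1, n))
        z[r] = (m[r][n] - tail) / m[r][r]
    return z

def hotelling_t2(sample1, sample2):
    """Return (T2, F) for two samples given as lists of feature rows.

    Under the null hypothesis F follows an F distribution with
    (p, n1 + n2 - p - 1) degrees of freedom.
    """
    n1, n2, p = len(sample1), len(sample2), len(sample1[0])
    m1, m2 = mean_vector(sample1), mean_vector(sample2)
    s1, s2 = scatter(sample1, m1), scatter(sample2, m2)
    pooled = [[(s1[i][j] + s2[i][j]) / (n1 + n2 - 2) for j in range(p)]
              for i in range(p)]
    diff = [a - b for a, b in zip(m1, m2)]
    z = solve(pooled, diff)              # pooled^{-1} * diff
    t2 = n1 * n2 / (n1 + n2) * sum(d * zi for d, zi in zip(diff, z))
    f_stat = (n1 + n2 - p - 1) / (p * (n1 + n2 - 2)) * t2
    return t2, f_stat

# Hypothetical toy partitions with two features each.
train_like = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0]]
test_like = [[6.0, 7.0], [7.0, 6.0], [8.0, 9.0], [9.0, 8.0]]

t2_same, _ = hotelling_t2(train_like, train_like)
t2_shifted, f_shifted = hotelling_t2(train_like, test_like)
print(t2_same < t2_shifted)
```

A large T² relative to its null distribution corresponds to the "different populations" case, in which the mean feature vectors of the two partitions differ by more than sampling noise would explain.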

Analysis of Test Results II
• T² tests indicated that training and test were from different populations
  – Main reason: data scarcity (only 1832 training samples)
• Further analysis of each individual feature:
  – Most had statistically different values in test
  – Between one quarter and three quarters, depending on the set
• In contrast, only about 1% differed between dev-train and dev-test folds

Conclusions

Conclusions
• Empirical results showed the soundness of the proposed approach
  – Improvements in prediction accuracy
  – Large feature reduction ratios
• Weaker test results, attributed to data scarcity
• Feature reduction boosts QE scalability and time-efficiency
  – Suitable for scenarios with time restrictions
  – Allows the use of thousands of features

Thank you, questions?
