Speeding up target-language driven part-of-speech tagger training

Speeding up target-language driven part-of-speech tagger training for machine translation

Felipe Sánchez-Martínez, Juan Antonio Pérez-Ortiz, Mikel L. Forcada
Transducens Group, Departament de Llenguatges i Sistemes Informàtics
Universitat d'Alacant, E-03071 Alacant, Spain
{fsanchez,japerez,mlf}@dlsi.ua.es

MICAI, 5th Mexican International Conference on Artificial Intelligence
Apizaco, México, November 16, 2006

Outline

1 Introduction
    Part-of-speech tagging for machine translation
    Part-of-speech tagging with HMM
2 Target-language driven HMM training
    Method overview
    Disadvantage
3 Pruning of disambiguation paths
    Pruning method
    HMM updating
4 Experiments
    Overview
    Results
5 Discussion
    Concluding remarks
    Future work

Part-of-speech tagging

• Part-of-speech tagging: determining the lexical category or part-of-speech (PoS) of each word that appears in a text
• Lexically ambiguous word: a word with more than one possible lexical category or PoS

  Word   Lemma   PoS
  book   book    noun
  book   book    verb

• Ambiguities are usually resolved according to the surrounding context

PoS tagging for machine translation /1

• Indirect rule-based machine translation (MT) systems usually perform PoS tagging as a subtask of the analysis procedure

  source text → Analysis → Transfer → Generation → target text

PoS tagging for machine translation /2

PoS tagging becomes crucial:

• The translation may differ from one PoS to another

  English   PoS    Spanish
  book      noun   libro
  book      verb   reservar

• Some transformations, such as reordering, are applied (or not) depending on the PoS

  English: the green house

  PoS of "green"   Spanish
  adj              la casa verde      ← reordering rule applied
  noun             * el césped casa

PoS tagging with HMM

• Hidden Markov models (HMM) are one of the standard statistical solutions for PoS tagging
• Each HMM state corresponds to a different PoS tag
• Each input word is replaced by its corresponding ambiguity class

[Figure: fragment of an HMM with PoS-tag states (verb, noun) and ambiguity classes (verb|noun|adj, verb|noun, noun|verb, noun|prp), annotated with example probabilities]
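The replacement of words by ambiguity classes can be illustrated with a minimal sketch. The toy lexicon and the function names below are assumptions made for illustration, not part of the presented system.

```python
# Hypothetical toy lexicon (an assumption): word -> set of possible PoS tags.
LEXICON = {
    "he": {"prn"},
    "books": {"noun", "verb"},
    "the": {"art"},
    "room": {"noun", "verb"},
}


def ambiguity_class(word):
    """Return the word's ambiguity class as a canonical string, e.g. 'noun|verb'."""
    tags = LEXICON[word.lower()]
    # Sorting makes the class name independent of set iteration order.
    return "|".join(sorted(tags))


def to_observations(sentence):
    """Map every word of a whitespace-tokenized sentence to its ambiguity class."""
    return [ambiguity_class(word) for word in sentence.split()]


observations = to_observations("He books the room")
print(observations)  # ['prn', 'noun|verb', 'art', 'noun|verb']
```

The HMM then observes the sequence of ambiguity classes rather than the words themselves, which keeps the set of observable symbols small.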

HMM parameter estimation

• Supervised (unambiguous corpora available): maximum-likelihood estimate (MLE)
• Unsupervised (only ambiguous corpora available):
  - Baum-Welch (expectation-maximization, EM)
  - Our recently proposed (Sánchez-Martínez et al. 2004) target-language (TL) driven method ...
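For the supervised case, maximum-likelihood estimation reduces to relative frequencies over a tagged corpus. The sketch below estimates transition probabilities only; the tiny corpus and the function name `mle_transitions` are assumptions for illustration, and no smoothing is applied.

```python
from collections import Counter


def mle_transitions(tagged_sentences):
    """Estimate P(cur | prev) as count(prev, cur) / count(prev as predecessor)."""
    pair_counts = Counter()
    prev_counts = Counter()
    for sentence in tagged_sentences:
        tags = [tag for _, tag in sentence]
        for prev, cur in zip(tags, tags[1:]):
            pair_counts[(prev, cur)] += 1
            prev_counts[prev] += 1
    return {pair: n / prev_counts[pair[0]] for pair, n in pair_counts.items()}


# Hypothetical hand-tagged corpus (an assumption, for illustration only).
corpus = [
    [("he", "prn"), ("books", "verb"), ("the", "art"), ("room", "noun")],
    [("the", "art"), ("books", "noun")],
    [("the", "art"), ("green", "adj"), ("house", "noun")],
]

probs = mle_transitions(corpus)
print(probs[("art", "noun")])  # two of the three transitions leaving "art"
```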

Target-language driven method overview

• The method uses the MT system in which the resulting tagger will be embedded; however, it will also work for other natural language processing tasks
• A target-language (TL) model is used to choose the best disambiguations
• HMM parameters are calculated according to the likelihood of the corresponding translations into the TL
• The resulting tagger is tuned to translation quality

Example

Source-language (SL) sentence (English):

  He-prn books-noun|verb the-art room-noun|verb

Possible translations (Spanish) according to each disambiguation, with their normalized likelihoods according to a target-language (TL) model:

• Él-prn reserva-verb la-art habitación-noun   0.75
• Él-prn reserva-verb la-art aloja-verb        0.15
• Él-prn libros-noun la-art habitación-noun    0.06
• Él-prn libros-noun la-art aloja-verb         0.04
                                        Total: 1.00

The HMM parameters involved in these 4 disambiguations are updated according to their likelihoods in the TL
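One way to read the update in this example is as fractional counting: each disambiguation contributes to the tag-transition and tag-emission counts in proportion to its TL likelihood. The sketch below follows that reading; the counting scheme and the name `collect_counts` are assumptions, and the method's actual bookkeeping may differ.

```python
from collections import defaultdict

# The four disambiguation paths of the example with their normalized TL likelihoods.
paths = [
    (("prn", "verb", "art", "noun"), 0.75),
    (("prn", "verb", "art", "verb"), 0.15),
    (("prn", "noun", "art", "noun"), 0.06),
    (("prn", "noun", "art", "verb"), 0.04),
]
# Ambiguity classes observed for "He books the room".
observations = ["prn", "noun|verb", "art", "noun|verb"]


def collect_counts(paths, observations):
    """Accumulate likelihood-weighted transition and emission counts (assumed scheme)."""
    transition = defaultdict(float)
    emission = defaultdict(float)
    for tags, weight in paths:
        for prev, cur in zip(tags, tags[1:]):
            transition[(prev, cur)] += weight
        for tag, obs in zip(tags, observations):
            emission[(tag, obs)] += weight
    return transition, emission


transition, emission = collect_counts(paths, observations)
# prn -> verb gathers the weight of the two paths that tag "books" as a verb.
print(round(transition[("prn", "verb")], 2))
```

Under this reading, the transition from prn to verb receives most of the mass because the two translations with "reserva" dominate the TL likelihoods.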

Disadvantage

• The number of possible disambiguations to translate grows exponentially with the segment length
• Translation is the most time-consuming task

Consequence:
• Segment length must be constrained to keep complexity under control
• The potential benefits of likelihoods estimated from longer segments are lost

Goal: to overcome this problem
How? By pruning unlikely disambiguation paths using a priori knowledge
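The exponential growth is easy to see by enumerating the paths: the number of disambiguations is the product of the ambiguity-class sizes. A minimal sketch, with hypothetical function names:

```python
from itertools import product
from math import prod


def disambiguation_paths(ambiguity_classes):
    """Enumerate every tag sequence compatible with the given ambiguity classes."""
    options = [cls.split("|") for cls in ambiguity_classes]
    return list(product(*options))


def count_paths(ambiguity_classes):
    """Count the paths without enumerating them: product of the class sizes."""
    return prod(len(cls.split("|")) for cls in ambiguity_classes)


segment = ["prn", "noun|verb", "art", "noun|verb"]
print(len(disambiguation_paths(segment)))  # 4
print(count_paths(["noun|verb"] * 10))     # 1024
```

Each path has to be translated, so a segment with ten two-way ambiguous words already requires 1024 translations.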

Pruning method /1

• Based on an initial model of SL tags (M_tag)
• Assumption: any reasonable model of SL tags may be useful to choose a set of possible disambiguation paths that contains the correct one
• It is not necessary to translate all possible disambiguation paths, only the "promising" ones
• The model used for pruning can be updated dynamically

Pruning method /2

1. The a priori likelihood p(g_i | s, M_tag) of each possible disambiguation path g_i of segment s is calculated using the model M_tag
2. Then, the set of disambiguation paths to take into account is determined:
   • Only the most likely disambiguation paths are considered
   • A probability mass threshold ρ is introduced
   • The set of disambiguation paths taken into account satisfies

     $$\rho \le \sum_{g_i \in T(s)} p(g_i \mid s, M_{tag})$$
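The selection step can be sketched as follows: rank the paths by a priori likelihood and keep the most likely ones until their accumulated mass reaches ρ. The function name `prune_paths` and the probabilities below are invented for illustration, and the sketch assumes the a priori likelihoods are normalized over the paths of the segment.

```python
def prune_paths(path_probs, rho):
    """Keep the most likely paths until their accumulated probability reaches rho.

    path_probs maps each disambiguation path to its a priori likelihood
    p(g_i | s, M_tag); rho is expected to lie in (0, 1].
    """
    ranked = sorted(path_probs.items(), key=lambda item: item[1], reverse=True)
    kept, mass = [], 0.0
    for path, prob in ranked:
        if mass >= rho:
            break
        kept.append(path)
        mass += prob
    return kept, mass


# Hypothetical a priori likelihoods from an initial tag model (made-up values).
a_priori = {
    ("prn", "verb", "art", "noun"): 0.60,
    ("prn", "noun", "art", "noun"): 0.25,
    ("prn", "verb", "art", "verb"): 0.10,
    ("prn", "noun", "art", "verb"): 0.05,
}

kept, mass = prune_paths(a_priori, rho=0.8)
print(len(kept))  # 2
```

With ρ = 0.8 only two of the four paths would be translated in this toy case; with ρ = 1 every path is kept and no pruning takes place.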

HMM updating

• The model M_tag used for pruning can be updated with the new evidence collected from the TL
• The update consists of:
  1. Calculating the HMM parameters with the counts collected from the TL
  2. Mixing the parameters of the new HMM with those of the initial one

HMM parameters mixing

• Let $\theta = (a_{\gamma_1\gamma_1}, \ldots, a_{\gamma_{|\Gamma|}\gamma_{|\Gamma|}}, b_{\gamma_1\sigma_1}, \ldots, b_{\gamma_{|\Gamma|}\sigma_{|\Sigma|}})$ be a vector containing all the parameters of a given HMM
• Mixing equation:

  $$\theta_{mixed}(x) = \lambda(x)\,\theta_{TL}(x) + (1 - \lambda(x))\,\theta_{init}$$

• λ(x) assigns a weight to the model estimated using the counts collected from the TL (θ_TL)
• This weight function is made to depend on the number x of SL words processed so far:

  $$\lambda(x) = x / C$$
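The mixing equation is a per-parameter linear interpolation, which a few lines of code make concrete. In this sketch the parameter vectors are plain lists, C is treated as a fixed constant, and the weight is clamped to 1 once x exceeds C; the clamping and all numeric values are assumptions, not something the slides state.

```python
def mixing_weight(x, C):
    """lambda(x) = x / C; clamping to 1.0 for x > C is an assumption of this sketch."""
    return min(x / C, 1.0)


def mix_parameters(theta_tl, theta_init, x, C):
    """Interpolate element-wise between the TL-estimated and the initial parameters."""
    lam = mixing_weight(x, C)
    return [lam * tl + (1.0 - lam) * init for tl, init in zip(theta_tl, theta_init)]


# Made-up two-parameter vectors, e.g. two transition probabilities leaving one state.
theta_init = [0.5, 0.5]
theta_tl = [0.8, 0.2]

mixed = mix_parameters(theta_tl, theta_init, x=250, C=1000)
print([round(value, 3) for value in mixed])
```

Early in training (small x) the initial model dominates; as more SL words are processed, the model estimated from the TL counts takes over. Because the result is a convex combination, parameters that sum to one in both models still sum to one after mixing.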
