Using word alignments to assist computer-aided translation users by marking which target-side words to change or keep unedited

  1. Introduction Related Work Methodology Experiments and Results Conclusion Current and future Work
     Using word alignments to assist computer-aided translation users by marking which target-side words to change or keep unedited
     Miquel Esplà-Gomis, Felipe Sánchez-Martínez, Mikel L. Forcada
     {mespla,fsanchez,mlf}@dlsi.ua.es
     Departament de Llenguatges i Sistemes Informàtics, Universitat d’Alacant, E-03071 Alacant, Spain
     15th Annual Conference of the EAMT, Leuven, May 30, 2011

  2. Outline: 1 Introduction, 2 Related Work, 3 Methodology, 4 Experiments and Results, 5 Conclusion, 6 Current and future Work

  7. Translation Memories
     English                                            Catalan
     s1: European Association for Machine Translation   t1: Associació Europea per a la Traducció Automàtica
     s2: The EAMT is a member of the IAMT               t2: L’EAMT és membre de l’IAMT
     s3: current year’s conference is held in Leuven    t3: el congrés d’enguany se celebra a Lovaina
     ...                                                ...
     New sentence s′: The AMTA is a member of the IAMT
     Best match s2: The EAMT is a member of the IAMT
     Proposal t2: L’EAMT és membre de l’IAMT

  9. Fuzzy Matching Scores
     Fuzzy matching scores measure the similarity between the segments $s'$ (the segment to be translated) and $s_i$ (a matching segment in the translation memory):
     $$\mathrm{score}(s', s_i) = 1 - \frac{\mathrm{EditDistance}(s', s_i)}{\max(|s'|, |s_i|)}$$
     Example
     $s'$: The Association for Machine Translation in the Americas is the American branch of the IAMT
     $s_i$: The European Association for Machine Translation is a member of the IAMT
     $\mathrm{score}(s', s_i) = 1 - \frac{7}{15} \simeq 0.53$
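
The score on the slide above can be reproduced with a short Python sketch. It assumes whitespace tokenization, case-sensitive word comparison, and a unit-cost word-level Levenshtein distance; the slides do not spell out these details, and the function names are illustrative only.

```python
def word_edit_distance(a, b):
    """Word-level Levenshtein distance between two token lists (unit costs)."""
    prev = list(range(len(b) + 1))
    for i, word_a in enumerate(a, start=1):
        curr = [i]
        for j, word_b in enumerate(b, start=1):
            cost = 0 if word_a == word_b else 1
            curr.append(min(prev[j] + 1,          # delete word_a
                            curr[j - 1] + 1,      # insert word_b
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def fuzzy_match_score(s_new, s_tm):
    """score(s', s_i) = 1 - EditDistance(s', s_i) / max(|s'|, |s_i|)."""
    a, b = s_new.split(), s_tm.split()
    return 1 - word_edit_distance(a, b) / max(len(a), len(b))


s_new = ("The Association for Machine Translation in the Americas "
         "is the American branch of the IAMT")
s_tm = "The European Association for Machine Translation is a member of the IAMT"

print(word_edit_distance(s_new.split(), s_tm.split()))  # 7
print(round(fuzzy_match_score(s_new, s_tm), 2))         # 0.53
```

With these assumptions the distance is 7 and the longer segment has 15 words, which agrees with the slide's 1 − 7/15.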

  10. Translation-Memory Based CAT Tools

  11. Fuzzy Match Scores + Alignment
     Edit distance provides information about the matching words between $s'$ and $s_i$:
     Example
     $t_i$: l’ Associació Europea per a la Traducció Automàtica
     $s_i$: the European Association for Machine Translation
     $s'$: the Asia-Pacific Association for Machine Translation
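
A minimal sketch of how the matched and unmatched words of $s_i$ might be obtained for the example above. The slide derives this information from the edit distance; the sketch swaps in Python's `difflib.SequenceMatcher`, which finds matching blocks with a different heuristic, so it is a stand-in rather than the authors' procedure. The helper name `matched_flags` is hypothetical.

```python
from difflib import SequenceMatcher


def matched_flags(s_tm_tokens, s_new_tokens):
    """Return one boolean per token of s_i: True if it is matched in s'."""
    flags = [False] * len(s_tm_tokens)
    matcher = SequenceMatcher(a=s_tm_tokens, b=s_new_tokens, autojunk=False)
    # Each block says: s_i[a : a + size] equals s'[b : b + size].
    for block in matcher.get_matching_blocks():
        for k in range(block.a, block.a + block.size):
            flags[k] = True
    return flags


s_i = "the European Association for Machine Translation".split()
s_new = "the Asia-Pacific Association for Machine Translation".split()

for word, flag in zip(s_i, matched_flags(s_i, s_new)):
    print(word, "matched" if flag else "unmatched")
```

Only "European" comes out as unmatched, which is the word whose translation the user would be expected to change.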

  12. Fuzzy Match Scores + Alignment
     Word alignment may be used to “project” source-side matching information onto $t_i$ to suggest which words to change and which to keep unedited:
     Example
     $t_i$: l’ Associació Europea per a la Traducció Automàtica
     Figure: word-alignment links between $t_i$ and $s_i$
     $s_i$: the European Association for Machine Translation
     $s'$: the Asia-Pacific Association for Machine Translation

  14. Related Work
     Simard (2003): statistical MT techniques allow exploiting TMs at the sub-segment (sub-sentential) level: translation spotting
     Bourdaillet et al. (2009): a similar approach for a bilingual concordancer, TransSearch
     Kranias and Samiotou (2004): sub-segment level alignments using a bilingual dictionary to (i) detect words to be changed and (ii) propose translations for them

  16. Rationale
     Figure: target words $w_{ij}$ [keep], $w_{ij'}$ [change] and $w_{ij''}$ [?] in $t_i$, with alignment links to source words of $s_i$ (among them $v_{ik}$ and $v_{ik'}$), which are labelled as matched or unmatched with $s'$
     $w_{ij}$ and $v_{ik}$ aligned and $v_{ik}$ matched $\Rightarrow$ keep $w_{ij}$
     $w_{ij}$ and $v_{ik}$ aligned and $v_{ik}$ not matched $\Rightarrow$ change $w_{ij}$
     $w_{ij}$ not aligned $\Rightarrow$ ???

  17. Rationale
     What to do if there is more than one alignment with contradictory evidence?
     Figure: $w_{ij}$ [???] in $t_i$ is aligned with both $v_{ik}$ (matched with $s'$) and $v_{ik'}$ (unmatched with $s'$) in $s_i$

  18. Rationale
     We define the likelihood of keeping the word $w_{ij}$ unedited as:
     $$f_K(w_{ij}, s', s_i, t_i) = \frac{\sum_{v_{ik} \in \mathrm{aligned}(w_{ij})} \mathrm{matched}(v_{ik})}{|\mathrm{aligned}(w_{ij})|}$$
     $\mathrm{aligned}(w_{ij})$: set of source-side words aligned with $w_{ij}$ in $s_i$
     $\mathrm{matched}(v_{ik})$: 1 if $v_{ik}$ is matched in $s'$ and 0 otherwise
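
The definition above translates almost directly into code. The following is a sketch, not the authors' implementation: it represents source words by their positions in $s_i$, and it returns `None` for an unaligned target word, where the fraction is undefined. The function name and the toy data are assumptions made for illustration.

```python
def keep_likelihood(aligned_positions, matched):
    """f_K for a single target word w_ij.

    aligned_positions: positions k of the source words v_ik aligned with w_ij.
    matched: list of 0/1 flags; matched[k] is 1 if v_ik is matched in s'.
    Returns None when w_ij has no alignments (f_K is undefined in that case).
    """
    if not aligned_positions:
        return None
    total = sum(matched[k] for k in aligned_positions)
    return total / len(aligned_positions)


# Hypothetical source segment with three words: matched, unmatched, matched.
matched = [1, 0, 1]

print(keep_likelihood({0}, matched))     # 1.0  (only matched evidence)
print(keep_likelihood({1}, matched))     # 0.0  (only unmatched evidence)
print(keep_likelihood({0, 1}, matched))  # 0.5  (contradictory evidence)
print(keep_likelihood(set(), matched))   # None (unaligned word)
```

The third call corresponds to the contradictory case raised on the previous slide: one matched and one unmatched source word give a likelihood of 0.5.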

  19. Interpretation of $f_K(w_{ij}, s', s_i, t_i)$
     Two ways to interpret $f_K(w_{ij}, s', s_i, t_i)$:
     Unanimity:
       if $f_K(w_{ij}, s', s_i, t_i) = 1$: $w_{ij}$ → keep unedited
       if $f_K(w_{ij}, s', s_i, t_i) = 0$: $w_{ij}$ → change
       otherwise → not marked
     Majority:
       if $f_K(w_{ij}, s', s_i, t_i) > \frac{1}{2}$: $w_{ij}$ → keep unedited
       if $f_K(w_{ij}, s', s_i, t_i) < \frac{1}{2}$: $w_{ij}$ → change
       otherwise → not marked

  20. Example of Unanimity Criterion
     $t_i$: he [change] missed [?] his [keep] brother [keep]
     Figure: word-alignment links between $t_i$ and $s_i$
     $s_i$: él echó de menos a su hermano
     $s'$: ella echó de casa a su hermano

  21. Example of Majority Criterion
     $t_i$: he [change] missed [keep] his [keep] brother [keep]
     Figure: word-alignment links between $t_i$ and $s_i$
     $s_i$: él echó de menos a su hermano
     $s'$: ella echó de casa a su hermano
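
The two criteria and the example above can be sketched together in Python. The alignment links used here (he–él; missed–echó, de, menos; his–su; brother–hermano) are an assumption: the links drawn on the slides did not survive extraction, and this set was chosen because it reproduces the labels shown for both criteria. The function name `mark_words` is hypothetical.

```python
def mark_words(alignment, matched, criterion):
    """Label each target word as 'keep', 'change' or '?' (not marked).

    alignment: for each target word, the list of aligned source positions.
    matched: 0/1 flag per source word of s_i (1 if matched in s').
    criterion: 'unanimity' or 'majority'.
    """
    labels = []
    for aligned in alignment:
        if not aligned:  # unaligned word: f_K is undefined, leave unmarked
            labels.append("?")
            continue
        f_k = sum(matched[k] for k in aligned) / len(aligned)
        if criterion == "unanimity":
            keep, change = f_k == 1, f_k == 0
        elif criterion == "majority":
            keep, change = f_k > 0.5, f_k < 0.5
        else:
            raise ValueError(f"unknown criterion: {criterion}")
        labels.append("keep" if keep else "change" if change else "?")
    return labels


# s_i: él echó de menos a su hermano
# s' : ella echó de casa a su hermano  -> "él" and "menos" are unmatched
matched = [0, 1, 1, 0, 1, 1, 1]

# ASSUMED alignment for t_i = he missed his brother (source positions in s_i)
alignment = [[0], [1, 2, 3], [5], [6]]

print(mark_words(alignment, matched, "unanimity"))
# ['change', '?', 'keep', 'keep']
print(mark_words(alignment, matched, "majority"))
# ['change', 'keep', 'keep', 'keep']
```

Under this assumed alignment, "missed" has two matched source words out of three, so unanimity leaves it unmarked while majority marks it as keep.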

  23. Corpora

  24. Evaluation Metrics
     $$\mathrm{Accuracy} = \frac{\text{correctly marked words}}{\text{marked words}}$$
     $$\mathrm{Coverage} = \frac{\text{marked words}}{\text{total words}}$$
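
As a sketch of how the two metrics could be computed, assume each target word has a predicted label (`"keep"`, `"change"`, or `None` when it is left unmarked) and a reference label. How the reference labels are obtained is not stated on this slide, so the data below are purely hypothetical.

```python
def accuracy_and_coverage(predicted, reference):
    """Return (accuracy, coverage) over parallel lists of word labels.

    predicted: 'keep', 'change', or None for words left unmarked.
    reference: 'keep' or 'change' for every word.
    """
    marked = [(p, r) for p, r in zip(predicted, reference) if p is not None]
    correct = sum(1 for p, r in marked if p == r)
    accuracy = correct / len(marked) if marked else 0.0
    coverage = len(marked) / len(predicted) if predicted else 0.0
    return accuracy, coverage


# Hypothetical labels for a four-word proposal.
predicted = ["change", None, "keep", "keep"]
reference = ["change", "change", "keep", "change"]

accuracy, coverage = accuracy_and_coverage(predicted, reference)
print(accuracy, coverage)  # 2 of 3 marked words correct, 3 of 4 words marked
```

The two metrics pull in opposite directions: leaving doubtful words unmarked lowers coverage but can raise accuracy, which is the trade-off between the unanimity and majority criteria.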

  25. Statistical Word Alignment
     We use the free/open-source tool GIZA++ (Och and Ney, 2003)
     We obtain an SL-to-TL alignment and a TL-to-SL alignment on the TM
     We experiment with three ways to combine the alignments:
       union
       intersection
       grow-diag-final-and
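
The two simplest combination methods can be sketched with Python sets. This assumes each directional alignment is available as a set of position pairs; it does not read GIZA++ output files, and it leaves out grow-diag-final-and, a heuristic that starts from the intersection and adds selected links from the union. The function name and the toy links are hypothetical.

```python
def symmetrize(src_to_tgt, tgt_to_src, method):
    """Combine two directional word alignments.

    src_to_tgt: set of (source_pos, target_pos) links from the SL-to-TL run.
    tgt_to_src: set of (target_pos, source_pos) links from the TL-to-SL run.
    method: 'union' or 'intersection'.
    Returns a set of (source_pos, target_pos) links.
    """
    # Put both alignments in the same (source, target) orientation.
    flipped = {(s, t) for (t, s) in tgt_to_src}
    if method == "union":
        return src_to_tgt | flipped
    if method == "intersection":
        return src_to_tgt & flipped
    raise ValueError(f"unsupported method: {method}")


# Hypothetical links for a short segment pair.
src_to_tgt = {(0, 0), (1, 1), (2, 1)}
tgt_to_src = {(0, 0), (1, 1), (2, 3)}

print(sorted(symmetrize(src_to_tgt, tgt_to_src, "union")))
# [(0, 0), (1, 1), (2, 1), (3, 2)]
print(sorted(symmetrize(src_to_tgt, tgt_to_src, "intersection")))
# [(0, 0), (1, 1)]
```

The union yields more links, so more target words end up aligned, while the intersection keeps only the links both directions agree on.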
