
Improving Bilingual Sub-sentential Alignment by Sampling-based Transpotting

Improving Bilingual Sub-sentential Alignment by Sampling-based Transpotting. Li Gong, Aurélien Max, François Yvon. LIMSI-CNRS & Université Paris-Sud, Orsay, France.


  1. Improving Bilingual Sub-sentential Alignment by Sampling-based Transpotting
     Li Gong, Aurélien Max, François Yvon
     LIMSI-CNRS & Université Paris-Sud, Orsay, France

  2. Context of this work

     Building SMT systems, step 1: align the parallel corpus

     [Figure: alignment matrix for an example sentence pair. The French side reads "... une troupe de comédiens déguisés dans ..."; the English side is not recoverable.]

     • the parallel corpus can be huge
     • we don’t use / need everything
     • we may regularly receive new data

     Our method for parallel corpus alignment
     • is very simple to describe and implement
     • processes each sentence pair independently
     • uses new data transparently (plug-and-play)


  4. Outline

     1. Method
        • Sampling-based transpotting
        • Sub-sentential alignment extraction
     2. Experimental Results
        • Basic alignment task
        • Incremental alignment task
     3. Conclusion and future work



  8. Sampling-based transpotting

     1. Given a source-target sentence pair, extract an association table:
        one diet coke , please . ↔ un coca zéro , s’il vous plaît .
     2. Draw a random sub-corpus from the parallel corpus and compute profiles for each word
     3. Increment the count for each contiguous phrase pair
     4. Repeat steps 2 to 3 N times, so as to obtain an association table for the given sentence pair

     Example sub-corpus:

        English                      French
     1  one coffee , please .        un café , s’il vous plaît .
     2  the coffee is not bad .      ce café est correct .
     3  yes , one tea .              oui , un thé .

     Word profiles over this sub-corpus (one entry per sub-corpus sentence pair):

     one     [1, 0, 1]
     diet    [0, 0, 0]
     coke    [0, 0, 0]
     ,       [1, 0, 1]
     please  [1, 0, 0]
     .       [1, 1, 1]
     un      [1, 0, 1]
     coca    [0, 0, 0]
     ...
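The four steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation. It reads a word's profile as a presence vector over the sampled sentence pairs, which matches the values on the slide. Grouping source and target words that share a non-zero profile, and counting a group only when it forms a contiguous phrase on both sides, are assumptions about how steps 2 and 3 fit together. The names `profile`, `group_by_profile`, `contiguous_phrase` and `association_table` are hypothetical.

```python
import random
from collections import Counter, defaultdict


def profile(word, tokenized_sentences):
    """Presence vector of `word` over the sub-corpus sentences (one side)."""
    return tuple(1 if word in sent else 0 for sent in tokenized_sentences)


def group_by_profile(src_tokens, tgt_tokens, sub_corpus):
    """Group source and target words sharing the same non-zero profile.

    Assumption: words with identical profiles are alignment candidates.
    """
    src_side = [src.split() for src, _ in sub_corpus]
    tgt_side = [tgt.split() for _, tgt in sub_corpus]
    groups = defaultdict(lambda: (set(), set()))
    for word in src_tokens:
        groups[profile(word, src_side)][0].add(word)
    for word in tgt_tokens:
        groups[profile(word, tgt_side)][1].add(word)
    # Drop all-zero profiles and groups that are empty on one side.
    return {p: g for p, g in groups.items() if any(p) and g[0] and g[1]}


def contiguous_phrase(tokens, words):
    """Return the phrase covered by `words` if it is contiguous, else None."""
    positions = [i for i, tok in enumerate(tokens) if tok in words]
    if positions and positions == list(range(positions[0], positions[-1] + 1)):
        return " ".join(tokens[positions[0]:positions[-1] + 1])
    return None


def association_table(source, target, corpus, n_iter, sample_size, seed=0):
    """Steps 2 to 4: sample, compute profiles, count phrase pairs, repeat."""
    rng = random.Random(seed)
    src_tokens, tgt_tokens = source.split(), target.split()
    counts = Counter()
    for _ in range(n_iter):
        sub_corpus = rng.sample(corpus, sample_size)  # step 2: random sub-corpus
        groups = group_by_profile(src_tokens, tgt_tokens, sub_corpus)
        for src_words, tgt_words in groups.values():
            src_phrase = contiguous_phrase(src_tokens, src_words)
            tgt_phrase = contiguous_phrase(tgt_tokens, tgt_words)
            if src_phrase and tgt_phrase:  # step 3: contiguous on both sides
                counts[(src_phrase, tgt_phrase)] += 1
    return counts


# The sentence pair and the three-sentence sub-corpus from the slide.
source = "one diet coke , please ."
target = "un coca zéro , s'il vous plaît ."
corpus = [
    ("one coffee , please .", "un café , s'il vous plaît ."),
    ("the coffee is not bad .", "ce café est correct ."),
    ("yes , one tea .", "oui , un thé ."),
]
english_side = [src.split() for src, _ in corpus]
one_profile = profile("one", english_side)

# With sample_size equal to the corpus size, every draw is the full corpus.
table = association_table(source, target, corpus, n_iter=5, sample_size=3)
for pair, count in table.items():
    print(pair, count)
```

On this toy corpus, "please" shares its profile with "s'il", "vous" and "plaît", so the pair ("please", "s'il vous plaît") is counted on every iteration. The words "one" and "," also share a profile, but they are not adjacent in the source sentence, so under the contiguity assumption that group is skipped.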

