Phrasal Rank-Encoding: Exploiting Phrase Redundancy and Translational Relations for Phrase Table Compression


  1. Phrasal Rank-Encoding: Exploiting Phrase Redundancy and Translational Relations for Phrase Table Compression. Marcin Junczys-Dowmunt. Department of Mathematics and Computer Science, Adam Mickiewicz University in Poznań, ul. Umultowska 87, 61-614 Poznań, Poland. Global Databases Section, World Intellectual Property Organization (WIPO), 34, chemin des Colombettes, 1211 Geneva, Switzerland. junczys@amu.edu.pl. September 4, 2012. M. Junczys-Dowmunt (UAM) Phrasal Rank-Encoding September 2012 1 / 31

  2. Translation Model Size (bar chart, y-axis in gigabytes from 0 to 70; legend: Phrase T., Reord. T.). Bar labels, in the order the bars are revealed:
     Text: 30.4 + 27.6
     Gzipped: 5.9 + 2.6
     Moses: 22.0 + 44.4
     Compact: 4.8 + 1.4
     PR-Enc: 2.9 + 1.4
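The bar labels on the Translation Model Size slide can be summed to compare the five representations. The sketch below is plain arithmetic on the printed numbers. The name `sizes_gb` is hypothetical, and reading each label as phrase table + reordering table (following the legend order) is an assumption, not something the slide states explicitly.

```python
# Sizes in gigabytes as printed on the bars. Assumption: each label reads
# "phrase table + reordering table", following the order of the legend.
sizes_gb = {
    "Text": (30.4, 27.6),
    "Gzipped": (5.9, 2.6),
    "Moses": (22.0, 44.4),
    "Compact": (4.8, 1.4),
    "PR-Enc": (2.9, 1.4),
}

# Total size per representation, rounded to the precision used on the slide.
totals = {name: round(pt + rt, 1) for name, (pt, rt) in sizes_gb.items()}
text_total = totals["Text"]

for name, total in totals.items():
    share = total / text_total
    print(f"{name:8s} {total:5.1f} GB  ({share:.1%} of the plain-text size)")
```
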

  7. Word-aligned sentence pair (alignment diagram):
     Maria no daba una bofetada a la bruja verde
     Mary did not slap the green witch

  9. Word-aligned sentence pair (alignment diagram):
     Maria no daba una bofetada a la bruja verde
     Mary did not slap the green witch
     Phrase pairs (source ||| target ||| word alignment):
     Maria no daba una bofetada a la bruja verde ||| Mary did not slap the green witch ||| 0-0 1-1 1-2 2-3
     no daba una bofetada a la bruja verde ||| did not slap the green witch ||| 0-0 0-1 1-2 2-2 3-2 4-3 5-3
     Maria no daba una bofetada a la ||| Mary did not slap the ||| 0-0 1-1 1-2 2-3 3-3 4-3 5-4 6-4
     daba una bofetada a la bruja verde ||| slap the green witch ||| 0-0 1-0 2-0 3-1 4-1 5-3 6-2
     Maria no daba una bofetada ||| Mary did not slap ||| 0-0 1-1 1-2 2-3 3-3 4-3
     no daba una bofetada a la ||| did not slap the ||| 0-0 0-1 1-2 2-2 3-2 4-3 5-3
     a la bruja verde ||| the green witch ||| 0-0 1-0 2-2 3-1
     Maria no ||| Mary did not ||| 0-0 1-1 1-2
     no daba una bofetada ||| did not slap ||| 0-0 0-1 1-2 2-2 3-2
     daba una bofetada a la ||| slap the ||| 0-0 1-0 2-0 3-1 4-1
     bruja verde ||| green witch ||| 0-1 1-0
     Maria ||| Mary ||| 0-0
     no ||| did not ||| 0-0 0-1
     daba una bofetada ||| slap ||| 0-0 1-0 2-0
     a la ||| the ||| 0-0 1-0
     bruja ||| witch ||| 0-0
     verde ||| green ||| 0-0
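Each phrase pair on this slide has three fields separated by `|||`: source phrase, target phrase, and word alignment points written as `i-j` (source index, target index, both counted from 0). A minimal parsing sketch follows. The function name `parse_phrase_line` is hypothetical, and a real phrase table may carry additional fields such as scores that this sketch does not handle.

```python
from typing import List, Tuple


def parse_phrase_line(line: str) -> Tuple[List[str], List[str], List[Tuple[int, int]]]:
    """Split 'source ||| target ||| i-j ...' into tokens and alignment points."""
    # Assumes exactly three fields, as shown on the slide.
    source, target, alignment = (field.strip() for field in line.split("|||"))
    points = [tuple(int(x) for x in point.split("-")) for point in alignment.split()]
    return source.split(), target.split(), points


src, tgt, points = parse_phrase_line("bruja verde ||| green witch ||| 0-1 1-0")
for i, j in points:
    # Each point links source word i to target word j.
    print(src[i], "->", tgt[j])
```

For the crossed pair above this prints `bruja -> witch` and `verde -> green`.
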

  10. One phrase pair from the aligned sentence, followed by the sub-phrase pairs it contains (revealed in two steps):
      no daba una bofetada a la ||| did not slap the ||| 0-0 0-1 1-2 2-2 3-2 4-3 5-3
      no daba una bofetada ||| did not slap ||| 0-0 0-1 1-2 2-2 3-2
      daba una bofetada a la ||| slap the ||| 0-0 1-0 2-0 3-1 4-1
      no ||| did not ||| 0-0 0-1
      daba una bofetada ||| slap ||| 0-0 1-0 2-0
      a la ||| the ||| 0-0 1-0
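The five sub-phrase pairs shown for `no daba una bofetada a la ||| did not slap the` are exactly the ones that are consistent with its word alignment, in the usual phrase-extraction sense: no target word inside the sub-phrase is aligned to a source word outside it. The sketch below enumerates them under that assumption. The function name `consistent_subphrases` is hypothetical, and handling of unaligned words is left out because the example has none.

```python
from typing import List, Tuple


def consistent_subphrases(source: str, target: str, alignment: str) -> List[Tuple[str, str]]:
    """List proper sub-phrase pairs that are consistent with the alignment."""
    src, tgt = source.split(), target.split()
    points = [tuple(int(x) for x in p.split("-")) for p in alignment.split()]
    pairs = []
    for s in range(len(src)):
        for e in range(s, len(src)):
            if e - s + 1 == len(src):
                continue  # skip the full phrase itself
            linked = [j for i, j in points if s <= i <= e]
            if not linked:
                continue
            lo, hi = min(linked), max(linked)
            # Consistent: every point landing in target span [lo, hi]
            # must start inside source span [s, e].
            if all(s <= i <= e for i, j in points if lo <= j <= hi):
                pairs.append((" ".join(src[s:e + 1]), " ".join(tgt[lo:hi + 1])))
    return pairs


found = consistent_subphrases(
    "no daba una bofetada a la",
    "did not slap the",
    "0-0 0-1 1-2 2-2 3-2 4-3 5-3",
)
for source_phrase, target_phrase in found:
    print(source_phrase, "|||", target_phrase)
```
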

  13. Encoding the phrase pair step by step. First "did not slap" is replaced by (0,2,0) and the alignment points it covers are dropped. Then "the" is replaced by (1,0,0) and the remaining points are dropped:
      no daba una bofetada a la ||| did not slap the ||| 0-0 0-1 1-2 2-2 3-2 4-3 5-3
      no daba una bofetada a la ||| (0,2,0) the ||| 0-0 0-1 1-2 2-2 3-2 4-3 5-3
      no daba una bofetada a la ||| (0,2,0) the ||| 4-3 5-3
      no daba una bofetada a la ||| (0,2,0) (1,0,0) ||| 4-3 5-3
      no daba una bofetada a la ||| (0,2,0) (1,0,0) |||
      The sub-phrase pairs stay unchanged throughout:
      no daba una bofetada ||| did not slap ||| 0-0 0-1 1-2 2-2 3-2
      daba una bofetada a la ||| slap the ||| 0-0 1-0 2-0 3-1 4-1
      no ||| did not ||| 0-0 0-1
      daba una bofetada ||| slap ||| 0-0 1-0 2-0
      a la ||| the ||| 0-0 1-0
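The replacement of target words by triples can be reproduced with a small sketch. The slides do not define the triple, so the reading used here is an assumption chosen because it reproduces (0,2,0) and (1,0,0): first component = source start minus current target position, second = number of source words left after the sub-phrase, third = rank of the target sub-phrase among the translations of that source sub-phrase. The names `SUBPHRASES`, `find_pointer`, `rank_encode` and `rank_decode` are hypothetical, and the brute-force search over source spans stands in for whatever alignment-driven lookup the actual method uses.

```python
from typing import Dict, List, Optional, Tuple, Union

Pointer = Tuple[int, int, int]
Symbol = Union[str, Pointer]

# Toy table: source phrase -> target phrases ordered by rank (0 = first).
# Only the sub-phrase pairs shown on the slide are included.
SUBPHRASES: Dict[str, List[str]] = {
    "no daba una bofetada": ["did not slap"],
    "daba una bofetada a la": ["slap the"],
    "no": ["did not"],
    "daba una bofetada": ["slap"],
    "a la": ["the"],
}


def find_pointer(src, tgt, t, table) -> Optional[Tuple[Pointer, int]]:
    """Find the longest target span starting at t that some sub-phrase yields."""
    for t_end in range(len(tgt), t, -1):  # longest target span first
        tgt_sub = " ".join(tgt[t:t_end])
        for s in range(len(src)):
            for s_end in range(len(src), s, -1):
                if s == 0 and s_end == len(src):
                    continue  # skip the full phrase itself
                ranked = table.get(" ".join(src[s:s_end]), [])
                if tgt_sub in ranked:
                    # Assumed triple layout, see the lead-in.
                    return (s - t, len(src) - s_end, ranked.index(tgt_sub)), t_end
    return None


def rank_encode(source: str, target: str, table) -> List[Symbol]:
    src, tgt = source.split(), target.split()
    out: List[Symbol] = []
    t = 0
    while t < len(tgt):
        hit = find_pointer(src, tgt, t, table)
        if hit is None:
            out.append(tgt[t])  # no sub-phrase applies: keep the plain word
            t += 1
        else:
            pointer, t = hit
            out.append(pointer)
    return out


def rank_decode(source: str, symbols: List[Symbol], table) -> str:
    src, out = source.split(), []
    for sym in symbols:
        if isinstance(sym, tuple):
            offset, tail, rank = sym
            start, end = len(out) + offset, len(src) - tail
            out.extend(table[" ".join(src[start:end])][rank].split())
        else:
            out.append(sym)
    return " ".join(out)


encoded = rank_encode("no daba una bofetada a la", "did not slap the", SUBPHRASES)
print(encoded)
print(rank_decode("no daba una bofetada a la", encoded, SUBPHRASES))
```

Under these assumptions the encoder yields the two triples from the slide, and decoding them against the same table restores `did not slap the`. The dropped alignment points would be recovered the same way, from the alignments stored with the referenced sub-phrase pairs.
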

  21. Phrase pairs (source ||| target ||| word alignment):
      Maria no daba una bofetada a la bruja verde ||| Mary did not slap the green witch ||| 0-0 1-1 1-2 2-3
      no daba una bofetada a la bruja verde ||| did not slap the green witch ||| 0-0 0-1 1-2 2-2 3-2 4-3 5-3
      Maria no daba una bofetada a la ||| Mary did not slap the ||| 0-0 1-1 1-2 2-3 3-3 4-3 5-4 6-4
      daba una bofetada a la bruja verde ||| slap the green witch ||| 0-0 1-0 2-0 3-1 4-1 5-3 6-2
      Maria no daba una bofetada ||| Mary did not slap ||| 0-0 1-1 1-2 2-3 3-3 4-3
      no daba una bofetada a la ||| did not slap the ||| 0-0 0-1 1-2 2-2 3-2 4-3 5-3
      a la bruja verde ||| the green witch ||| 0-0 1-0 2-2 3-1
      Maria no ||| Mary did not ||| 0-0 1-1 1-2
      no daba una bofetada ||| did not slap ||| 0-0 0-1 1-2 2-2 3-2
      daba una bofetada a la ||| slap the ||| 0-0 1-0 2-0 3-1 4-1
      bruja verde ||| green witch ||| 0-1 1-0
      Maria ||| Mary ||| 0-0
      no ||| did not ||| 0-0 0-1
      daba una bofetada ||| slap ||| 0-0 1-0 2-0
      a la ||| the ||| 0-0 1-0
      bruja ||| witch ||| 0-0
      verde ||| green ||| 0-0
