inversion transduction grammars

Inversion Transduction Grammars, Wilker Aziz, 3/5/17



  1. Inversion Transduction Grammars Wilker Aziz 3/5/17

  2. Word-based Translation
     • Mary did not slap the green witch
     • Mary no dió una bofetada a la bruja verde
     • Every French word is generated by an English word (or NULL)

  3. Generative Story (IBM ≥ 3): Given E
     • Mary did not slap the green witch

  4. Generative Story (IBM ≥ 3): Fertility
     • Mary did not slap the green witch
     • Mary did not slap slap slap the green witch

  5. Generative Story (IBM ≥ 3): NULL insertion
     • Mary did not slap the green witch
     • Mary did not slap slap slap the green witch NULL

  6. Generative Story (IBM ≥ 3): Translation
     • Mary did not slap the green witch
     • Mary did not slap slap slap the green witch NULL
     • Mary no dió una bofetada a la verde bruja

  7. Generative Story (IBM ≥ 3): Distortion
     • Mary did not slap the green witch
     • Mary did not slap slap slap the green witch NULL
     • Mary no dió una bofetada a la verde bruja
     • Mary no dió una bofetada a la bruja verde

  8. Discussion
     • IBM models do not constrain divergence with respect to word order
     • The distortion step must consider all the m! permutations of m French words

  9. All permutations: sensible or not? If we do not impose structural constraints (yet they do exist)
     • the model will have to learn (rather implicitly) how not to violate them
     • which ought to require more data

  12. Practical consequences
      • Estimation: modelling outcomes that, even though possible, are not plausible (unlikely to be observed)
      • Generation: NP-completeness!

  17. NP-completeness: NP-complete problem
      • Generalised TSP [Knight, 1999; Zaslavskiy et al, 2009]
      • Perfect matching [DeNero and Klein, 2008]
      • All permutations [Asveld, 2006; 2008]

  18. All permutations. Let $\Sigma_n = \{a_1, \ldots, a_n\}$
      • $S \to A_{\Sigma_n}$
      • $A_X \to a\, A_{X - \{a\}}$ for $X \subseteq \Sigma_n$, $\#X \geq 2$, $a \in X$
      • $A_{\{a\}} \to a$
      Regular grammar (there is an equivalent FSA). Asveld (2006, 2008)

  19. Complexity. Note that nonterminals are indexed by subsets of $\Sigma_n$, i.e. by the power set of $\Sigma_n$
      • $2^n$ nonterminals (states)
      • $n \times 2^n$ productions (transitions)
      • $n!$ strings (paths)

  20. Example: 3 elements
      • $S \to A_{123}$
      • $A_{123} \to a_1 A_{23} \mid a_2 A_{13} \mid a_3 A_{12}$
      • $A_{12} \to a_1 A_2 \mid a_2 A_1$
      • $A_{13} \to a_1 A_3 \mid a_3 A_1$
      • $A_{23} \to a_2 A_3 \mid a_3 A_2$
      • $A_1 \to a_1$, $A_2 \to a_2$, $A_3 \to a_3$
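
The subset-indexed grammar above can be built and counted mechanically. The following is a minimal sketch, assuming symbols are encoded as the integers 1..n and each nonterminal A_X as a `frozenset`; the function names and the encoding are illustrative choices, not taken from Asveld's papers.

```python
from itertools import combinations
from math import factorial


def permutation_grammar(n):
    """Build the regular grammar generating all permutations of {1, ..., n}.

    Nonterminals are "S" plus one frozenset per non-empty subset X (for A_X).
    A production is a triple (lhs, terminal, next_nonterminal_or_None).
    """
    sigma = frozenset(range(1, n + 1))
    productions = [("S", None, sigma)]  # S -> A_Sigma
    for size in range(1, n + 1):
        for subset in combinations(sorted(sigma), size):
            x = frozenset(subset)
            for a in subset:
                rest = x - {a}
                # A_X -> a A_{X-{a}} when #X >= 2, otherwise A_{a} -> a
                productions.append((x, a, rest if rest else None))
    return productions


def count_strings(productions, start="S"):
    """Count complete derivations from start (one per generated string)."""
    successors = {}
    for lhs, _, nxt in productions:
        successors.setdefault(lhs, []).append(nxt)

    def paths(state):
        if state is None:  # derivation finished
            return 1
        return sum(paths(nxt) for nxt in successors[state])

    return paths(start)


for n in range(1, 6):
    prods = permutation_grammar(n)
    nonterminals = {lhs for lhs, _, _ in prods}
    # columns: n, #nonterminals, #productions, #strings, n!
    print(n, len(nonterminals), len(prods), count_strings(prods), factorial(n))
```

Under this encoding the grammar has exactly 2^n nonterminals (counting S) and n·2^(n-1) + 1 productions, which is within the n × 2^n stated on the slide, while the number of strings is n!.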

  21. "IBM constraint": distortion limit in generation but not in estimation
      • any reasons why that may be unsatisfactory?

  22. Constraining permutations without a distortion limit: Inversion Transduction Grammars (ITGs) [Wu, 1995; 1997]
      • binarizable permutations
      • two streams are simultaneously generated
      • context-free backbone

  23. [Wu, 1997]

  24. Number of Permutations [Wu, 1997]
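
The gap between all permutations and ITG-reachable ones can be checked by brute force for small n. This is a sketch under one assumption: a permutation counts as binarizable when it can be split recursively into two adjacent blocks that are either in order (straight) or swapped (inverted). The function names are hypothetical.

```python
from functools import lru_cache
from itertools import permutations


@lru_cache(maxsize=None)
def is_binarizable(perm):
    """True if perm (a tuple of distinct ints) can be built by recursive
    binary concatenation in straight or inverted order."""
    if len(perm) <= 1:
        return True
    for k in range(1, len(perm)):
        left, right = perm[:k], perm[k:]
        straight = max(left) < min(right)   # left block precedes right block
        inverted = min(left) > max(right)   # blocks are swapped
        if (straight or inverted) and is_binarizable(left) and is_binarizable(right):
            return True
    return False


def count_binarizable(n):
    """Number of permutations of n items reachable with binary reordering."""
    return sum(is_binarizable(p) for p in permutations(range(n)))


for n in range(1, 7):
    print(n, count_binarizable(n))
```

For n = 4 this finds 22 of the 24 permutations; the two excluded ones are the "inside-out" patterns 2413 and 3142. The brute-force enumeration itself costs n! time, so it is only meant for small n.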

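  32. ITG (each rule rewrites the English side and the French side simultaneously)
      • $S \to X$ / $X$ (copy)
      • $X \to X_1 X_2$ / $X_1 X_2$ (copy)
      • $X \to X_1 X_2$ / $X_2 X_1$ (invert)
      • $X \to e$ / $f$ (transduce)
      • $X \to e$ / $\epsilon$ (delete)
      • $X \to \epsilon$ / $f$ (insert)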
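
How one derivation produces two strings at once can be sketched in a few lines. The tree encoding is an assumption made for illustration: a terminal is a pair `(e, f)` with `None` standing for ε, and an internal node is a triple `(orientation, left, right)` with orientation `"[]"` (copy) or `"<>"` (invert).

```python
def yields(tree):
    """Return (english_words, french_words) for an ITG derivation tree."""
    if len(tree) == 2:  # lexical rule: transduce, delete, or insert
        e, f = tree
        return ([e] if e is not None else [], [f] if f is not None else [])
    orientation, left, right = tree
    e1, f1 = yields(left)
    e2, f2 = yields(right)
    if orientation == "[]":   # copy: same order on both sides
        return e1 + e2, f1 + f2
    if orientation == "<>":   # invert: French side swaps the children
        return e1 + e2, f2 + f1
    raise ValueError(f"unknown orientation: {orientation!r}")


# "the green witch" / "a la bruja verde": one insertion, one inversion
tree = ("[]",
        (None, "a"),
        ("[]",
         ("the", "la"),
         ("<>", ("green", "verde"), ("witch", "bruja"))))

english, french = yields(tree)
print(" ".join(english))  # the green witch
print(" ".join(french))   # a la bruja verde
```

The English yield is always read left to right; only the French yield depends on the orientation of each binary node.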

  33. ITG Trees [figure: ITG trees over the sentence pair "I really miss you" / "Sinto tanto sua falta"]

  34. ITG Trees [figure: the same trees, annotated with the labels B, E, A, F]

  35. Model. Joint probability model $P(T) = P(A, B, E, F)$
      • $t = \langle r_1, \ldots, r_n \rangle$
      • $e = \mathrm{yield}_1(t)$, $f = \mathrm{yield}_2(t)$
      • $a = \mathrm{alignment}(t)$, $b = \mathrm{bracketing}(t)$
      $$P(T = t) = P(A = a, B = b, E = e, F = f) = \prod_{i=1}^{n} \theta_{r_i}$$

  42. Parametrisation. Multinomial: one parameter per rule
      • $\theta_{[]}$: one parameter for monotone
      • $\theta_{\langle\rangle}$: one parameter for swap
      • $\theta_{e/f}$: one parameter per word pair
      • $\theta_{e/\epsilon}$: one parameter per deleted English word
      • $\theta_{\epsilon/f}$: one parameter per inserted French word
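
With one parameter per rule, the probability of a derivation is the product of the parameters of the rules it uses. A minimal sketch follows; the parameter values are made up for illustration, and the keys `"[]"`, `"<>"` and `(e, f)` pairs (with `None` for ε) are an assumed encoding.

```python
import math

# Hypothetical multinomial over all rules rewriting X (values sum to 1).
theta = {
    "[]": 0.3,                  # monotone
    "<>": 0.2,                  # swap
    ("the", "la"): 0.2,         # word pairs
    ("green", "verde"): 0.1,
    ("witch", "bruja"): 0.1,
    ("did", None): 0.05,        # deleted English word
    (None, "a"): 0.05,          # inserted French word
}


def derivation_log_prob(rules, theta):
    """log P(T = t) for a derivation t given as its sequence of rules."""
    return sum(math.log(theta[r]) for r in rules)


# Derivation of "the green witch" / "la bruja verde"
rules = ["[]", ("the", "la"), "<>", ("green", "verde"), ("witch", "bruja")]
print(math.exp(derivation_log_prob(rules, theta)))
```

Working in log space avoids underflow on longer derivations; the product here is 0.3 × 0.2 × 0.2 × 0.1 × 0.1.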

  43. MLE. We do not typically construct treebanks of ITG trees
      • potential counts instead of observed counts
      $$\theta_{X \to \alpha} = \frac{\langle n(X \to \alpha) \rangle_{P(A,B \mid F,E)}}{\sum_{\alpha'} \langle n(X \to \alpha') \rangle_{P(A,B \mid F,E)}}$$
      Expectations from parse forests
      • Inside-Outside [Baker, 1979; Lari and Young, 1990; Goodman, 1999]
      Typically initialised with IBM1
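
Once expected rule counts are available, the update is a per-nonterminal normalisation. The sketch below covers only that step and assumes the expected counts have already been computed (for example by Inside-Outside); the dictionary layout and function name are illustrative.

```python
from collections import defaultdict


def m_step(expected_counts):
    """Normalise expected counts <n(X -> alpha)> into rule probabilities.

    expected_counts maps (lhs, rhs) to a non-negative expected count.
    """
    totals = defaultdict(float)
    for (lhs, _), count in expected_counts.items():
        totals[lhs] += count
    return {
        (lhs, rhs): count / totals[lhs]
        for (lhs, rhs), count in expected_counts.items()
        if totals[lhs] > 0
    }


# Made-up expected counts for three rules rewriting X
counts = {
    ("X", "[]"): 3.0,
    ("X", "<>"): 1.0,
    ("X", ("green", "verde")): 4.0,
}
print(m_step(counts))
```

Rules whose left-hand side received no expected mass are dropped rather than assigned an arbitrary value.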

  44. Difficulties
      • Inference: complexity $O(l^3 m^3)$
      • Model: too few reordering parameters
      • Decisions: ambiguity. The disambiguation problem is NP-complete [Sima'an, 1996]
      $$\arg\max_A P(A \mid F, E) = \arg\max_A \sum_B P(A, B \mid F, E) \approx \arg\max_{A,B} P(A, B \mid F, E)$$

  45. Bibliography
      • Knight, Kevin. 1999. Decoding complexity in word-replacement translation models. In Computational Linguistics. MIT Press.
      • Zaslavskiy, Mikhail and Dymetman, Marc and Cancedda, Nicola. 2009. Phrase-based statistical machine translation as a traveling salesman problem. In Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1.
      • DeNero, John and Klein, Dan. 2008. The Complexity of Phrase Alignment Problems. In Proceedings of ACL-08: HLT.
      • Asveld, Peter R. J. 2006. Generating All Permutations by Context-free Grammars in Chomsky Normal Form. In Theoretical Computer Science. Elsevier Science Publishers Ltd.
      • Asveld, Peter R. J. 2008. Generating All Permutations by Context-free Grammars in Greibach Normal Form. In Theoretical Computer Science. Elsevier Science Publishers Ltd.
      • Wu, D. 1995. An Algorithm for Simultaneously Bracketing Parallel Texts by Aligning Words. In Proceedings of the 33rd Annual Meeting of the Association for Computational Linguistics. ACL.

  46. Bibliography
      • Wu, D. 1997. Stochastic Inversion Transduction Grammars and Bilingual Parsing of Parallel Corpora. In Computational Linguistics. MIT Press.
      • Baker, James K. 1979. Trainable grammars for speech recognition. In Proceedings of the Spring Conference of the Acoustical Society of America.
      • Lari, Karim and Young, Steve J. 1990. The estimation of stochastic context-free grammars using the inside-outside algorithm. In Computer Speech and Language.
      • Goodman, Joshua. 1999. Semiring parsing. In Computational Linguistics.
      • Sima'an, Khalil. 1996. Computational complexity of probabilistic disambiguation by means of tree-grammars. In Proceedings of the 16th conference on Computational linguistics - Volume 2.
