Generation in Machine Translation from Deep Syntactic Trees

  1. Generation in Machine Translation from Deep Syntactic Trees. Keith Hall (Johns Hopkins University) and Petr Němec (Charles University in Prague)

  2. Outline ● Transfer-based MT ● Tectogrammatical Representation (TR) (deep syntax) ● Generation from English TR trees (process, models) ● Empirical results SSST '07 - Hall & Němec

  3-7. Transfer-based MT [Diagram, built up over five slides: Source (Czech) and Target (English); later builds add the labels Interlingua and then Tectogrammar.]

  8-14. Tecto Transfer-based MT [Diagram, built up over seven slides: Czech sentence, surface syntax, and deep syntax (Czech Tecto) on the source side; deep syntax (English Tecto), surface syntax, and English sentence on the target side. Successive builds add the step labels parsing (source side), tree transduction (between the two deep-syntax levels), and generation (target side); the last build adds a question mark.]

  15. Transfer-based MT ● Allows us to explore deep syntactic representations ● Factored models are clear ● Need not be a greedy one-best process (although we present one-best generation and results)

  16-21. Tectogrammatical Representation of “Now the network has opened a news bureau in the Hungarian capital”. Each node carries FORM, LEMM (lemma), FUNC (functor), and POS (part-of-speech); the verb also carries T_M (tense & mood). Slides 18 to 21 add these four attribute labels one at a time.

      FORM: #2, LEMM: #, FUNC: SENT
        FORM: opened, LEMM: open, FUNC: PRED, POS: 'VBN', T_M: 'SIM'_'IND'
          FORM: network, LEMM: network, FUNC: ACT, POS: 'NN'
          FORM: Now, LEMM: now, FUNC: TWHEN, POS: 'RB'
          FORM: bureau, LEMM: bureau, FUNC: PAT, POS: 'NN'
            FORM: news, LEMM: news, FUNC: RSTR, POS: 'NN'
          FORM: capital, LEMM: capital, FUNC: LOC, POS: 'NN'
            FORM: Hungarian, LEMM: hungarian, FUNC: RSTR, POS: 'JJ'
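The tree on these slides could be held in a small data structure along the following lines. This is only a minimal sketch: the class name `TectoNode`, its field names, and the helper `example_tree` are illustrative assumptions, not part of the system described in the talk. The attachment of "news" under "bureau" and "Hungarian" under "capital" is read off the example sentence.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class TectoNode:
    """One node of a tectogrammatical tree (hypothetical container)."""
    form: str
    lemma: str
    functor: str
    pos: Optional[str] = None          # the technical root carries no POS
    tense_mood: Optional[str] = None   # T_M, shown on the verb only
    children: List["TectoNode"] = field(default_factory=list)

    def add(self, child: "TectoNode") -> "TectoNode":
        """Attach a child and return it so calls can be chained."""
        self.children.append(child)
        return child

    def walk(self) -> Iterator["TectoNode"]:
        """Yield this node and all descendants, depth first."""
        yield self
        for child in self.children:
            yield from child.walk()


def example_tree() -> TectoNode:
    """Build the tree for the example sentence from the slides."""
    root = TectoNode("#2", "#", "SENT")
    pred = root.add(TectoNode("opened", "open", "PRED", "VBN", "SIM_IND"))
    pred.add(TectoNode("network", "network", "ACT", "NN"))
    pred.add(TectoNode("Now", "now", "TWHEN", "RB"))
    bureau = pred.add(TectoNode("bureau", "bureau", "PAT", "NN"))
    bureau.add(TectoNode("news", "news", "RSTR", "NN"))
    capital = pred.add(TectoNode("capital", "capital", "LOC", "NN"))
    capital.add(TectoNode("Hungarian", "hungarian", "RSTR", "JJ"))
    return root
```

Note that function words ("the", "has", "a", "in") have no nodes of their own in this representation, which is why generation has to insert them.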

  22. Generation Process 1. Insert syn-semantic (function) words 2. Subtree reordering ● Intermediary surface syntax? ● Reordering constraints? (maximum subtree size, coordination) [Diagram: deep syntax (English Tecto), surface syntax, English sentence.]

  23-25. Generation Model (the three slides show the same formula; slide 24 labels the insertion term $P(A \mid T)$ and slide 25 the reordering term $P(f \mid A, T)$)

$$\arg\max_{A,f} P(A, f \mid T) = \arg\max_{A,f} P(f \mid A, T)\, P(A \mid T) \approx \arg\max_{f} P\bigl(f \mid T, \arg\max_{A} P(A \mid T)\bigr)$$

  ● tecto nodes: $T = \{t_1, \ldots, t_i, \ldots, t_n\}$ ● insertion string: $A = \{a_1, \ldots, a_i, \ldots, a_k\}$, $n \le k \le 2n$ ● order mapping: $f : \{A \cup T\} \to \{1, \ldots, 2n\}$
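The approximation in the formula replaces a joint search over insertions and orderings with two sequential searches: first pick the best insertion, then pick the best order given that insertion. The sketch below illustrates only that control flow. The function `pipeline_argmax`, the toy tables, and the exhaustive search over permutations are all assumptions made for illustration; the order mapping f is represented simply as an ordered tuple of words, and the toy scores are unnormalised stand-ins for the two probabilities.

```python
from itertools import permutations


def pipeline_argmax(tecto, candidate_insertions, p_insert, p_order):
    """Two-step approximation: best insertion first, then best order."""
    # Step 1: A* = argmax_A P(A | T)
    best_a = max(candidate_insertions, key=lambda a: p_insert(a, tecto))
    # Step 2: f* = argmax_f P(f | T, A*), here by brute force over orders
    items = tuple(tecto) + tuple(best_a)
    best_f = max(permutations(items),
                 key=lambda order: p_order(order, tecto, best_a))
    return best_a, best_f


def toy_p_insert(a, tecto):
    """Hypothetical insertion scores for a two-node tecto input."""
    table = {(): 0.1, ("the",): 0.3, ("the", "has"): 0.6}
    return table.get(tuple(a), 0.0)


def toy_p_order(order, tecto, a):
    """Hypothetical order score: number of hand-listed good bigrams."""
    good = {("the", "network"), ("network", "has"), ("has", "open")}
    return sum((x, y) in good for x, y in zip(order, order[1:]))


insertion, order = pipeline_argmax(
    ("network", "open"),
    [(), ("the",), ("the", "has")],
    toy_p_insert,
    toy_p_order,
)
```

Because step 1 commits to a single insertion, an ordering that would have scored better with a different insertion is never considered; that is the price of the one-best pipeline.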

  26-27. Insertion Process “Now the network has opened a news bureau in the Hungarian capital”. Slide 26 shows the tecto tree before insertion (lemma, functor, POS):

      open, PRED, VBN, SIM_IND
        network, ACT, NN
        now, TWHEN, RB
        bureau, PAT, NN
          news, RSTR, NN
        capital, LOC, NN
          hungarian, RSTR, JJ

  Slide 27 shows the same tree with the inserted function-word nodes: has (AUX), in (PP), the (DT), the (DT), a (DT).

  28. Insertion Model

$$P(A \mid T) = \prod_i P(a_i \mid a_1, \ldots, a_{i-1}, T) \approx \prod_i P(a_i \mid t_i, t_{g(i)})$$

  ● Insertion is dependent on local context: the tecto node $t_i$ (includes lemma, functor, POS) and its parent node $t_{g(i)}$ ● Three independent models: articles; prepositions and subordinating conjunctions; modals (deterministic, given functor)
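A local model of the form P(a_i | t_i, t_g(i)) can be illustrated with simple relative-frequency counts over (node, parent) contexts. The slides do not say how the probabilities are estimated, so the counting scheme, the class `LocalInsertionModel`, the `<none>` symbol, and the toy observations below are all assumptions for the sake of a runnable sketch.

```python
from collections import Counter, defaultdict

NONE = "<none>"  # hypothetical symbol for "insert nothing"


class LocalInsertionModel:
    """Relative-frequency estimate of P(a | node, parent)."""

    def __init__(self):
        # (node, parent) context -> counts of inserted words
        self.counts = defaultdict(Counter)

    def observe(self, node, parent, inserted):
        """Record one training example for this local context."""
        self.counts[(node, parent)][inserted] += 1

    def prob(self, inserted, node, parent):
        """Return the estimated probability, 0.0 for unseen contexts."""
        seen = self.counts.get((node, parent))
        if not seen:
            return 0.0
        return seen[inserted] / sum(seen.values())

    def best(self, node, parent):
        """Return the most frequent insertion, NONE for unseen contexts."""
        seen = self.counts.get((node, parent))
        if not seen:
            return NONE
        return seen.most_common(1)[0][0]


# Toy article model: each context is (lemma, functor, POS) of node and parent.
articles = LocalInsertionModel()
node = ("network", "ACT", "NN")
parent = ("open", "PRED", "VBN")
for word in ["the", "the", "the", "a"]:
    articles.observe(node, parent, word)
```

Under the slide's design, one such model would be kept for articles and another for prepositions and subordinating conjunctions, while modals would be looked up deterministically from the functor.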

  29-31. Reordering Process “Now the network has opened a news bureau in the Hungarian capital”. Three successive snapshots of the subtree headed by open (PRED, VBN, SIM_IND), with nodes listed in the order in which they appear on each slide:

      slide 29: head open; children network (ACT, NN), bureau (PAT, NN), capital (LOC, NN), now (TWHEN, RB), has (AUX)
      slide 30: head open; children network (ACT, NN), has (AUX), bureau (PAT, NN), capital (LOC, NN), now (TWHEN, RB)
      slide 31: network (ACT, NN), open (PRED, VBN, SIM_IND), bureau (PAT, NN), capital (LOC, NN), has (AUX), now (TWHEN, RB)
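Subtree reordering of this kind can be sketched as a recursive linearization: at each node, choose an order for the head and its children, then expand each child in place. The slides in this section do not specify the reordering model, so everything below is an assumption: the function `linearize`, the `max_items` cap (loosely inspired by the "maximum subtree size" constraint), and in particular the hand-written `RANK` table, which stands in for a learned model. Function words other than "has" are left out to keep node names unique.

```python
from itertools import permutations

# Hypothetical preferred left-to-right rank of each label within a subtree.
RANK = {"TWHEN": 0, "ACT": 1, "AUX": 2, "PRED": 3, "PAT": 4, "LOC": 5,
        "RSTR": 0}

CHILDREN = {
    "open": ["network", "bureau", "capital", "now", "has"],
    "bureau": ["news"],
    "capital": ["hungarian"],
}
LABEL = {"open": "PRED", "network": "ACT", "bureau": "PAT",
         "capital": "LOC", "now": "TWHEN", "has": "AUX",
         "news": "RSTR", "hungarian": "RSTR"}


def toy_score(order):
    """Negative count of label pairs that violate the RANK preference."""
    ranks = [RANK[LABEL[w]] for w in order]
    return -sum(ranks[i] > ranks[j]
                for i in range(len(ranks))
                for j in range(i + 1, len(ranks)))


def linearize(node, children_of, score, max_items=6):
    """Return the words of the subtree rooted at `node` in surface order."""
    items = [node] + list(children_of.get(node, []))
    if len(items) <= max_items:
        # Exhaustive search is only feasible for small sibling sets.
        items = list(max(permutations(items), key=score))
    words = []
    for item in items:
        if item == node:
            words.append(node)   # the head itself
        else:
            words.extend(linearize(item, children_of, score, max_items))
    return words


surface = linearize("open", CHILDREN, toy_score)
```

Each child subtree is kept contiguous, so the search at any node ranges only over the head and its direct children rather than over all words of the sentence.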
