Finite-State Technology in Natural Language Processing
Andreas Maletti, Institute for Natural Language Processing, Universität Stuttgart, Germany
maletti@ims.uni-stuttgart.de
Umeå, August 18, 2015


  1. Part-of-Speech Tagging: Tags (from the Penn treebank, English):
     DT = determiner, NN = noun (singular or mass), JJ = adjective, MD = modal,
     VB = verb (base form), VBD = verb (past tense), VBG = verb (gerund or present participle).
     Example (tagging exercise): "the show" → DT NN.

  2. Part-of-Speech Tagging: History.
     1960s: manually tagged Brown corpus (1,000,000 words); tag lists with frequency
     for each token, e.g. {VB, MD, NN} for "can"; excluding linguistically implausible
     sequences (e.g. DT VB); "most common tag" yields 90% accuracy [Charniak, '97].
     1980s: hidden Markov models (HMM); dynamic programming and Viterbi algorithms
     (wA algorithms).
     2000s: British National Corpus (100,000,000 words); parsers are better taggers
     (wTA algorithms).

  3. Markov Model: Statistical approach. Given a sequence $w = w_1 \cdots w_k$ of
     tokens, determine the most likely sequence $t_1 \cdots t_k$ of part-of-speech
     tags ($t_i$ is the tag of $w_i$):
     $$\begin{aligned}
     (\hat t_1, \ldots, \hat t_k)
       &= \arg\max_{(t_1, \ldots, t_k)} p(t_1, \ldots, t_k \mid w)
        = \arg\max_{(t_1, \ldots, t_k)} \frac{p(t_1, \ldots, t_k, w)}{p(w)} \\
       &= \arg\max_{(t_1, \ldots, t_k)} p(t_1, \ldots, t_k, w_1, \ldots, w_k) \\
       &= \arg\max_{(t_1, \ldots, t_k)} p(t_1, w_1) \cdot \prod_{i=2}^{k}
          p(t_i, w_i \mid t_1, \ldots, t_{i-1}, w_1, \ldots, w_{i-1})
     \end{aligned}$$

  4. Markov Model: Modelling as a stochastic process. Introduce the event
     $E_i = w_i \cap t_i = (w_i, t_i)$; then
     $$p(t_1, w_1) \cdot \prod_{i=2}^{k} p(t_i, w_i \mid t_1, \ldots, t_{i-1}, w_1, \ldots, w_{i-1})
       = p(E_1) \cdot \prod_{i=2}^{k} p(E_i \mid E_1, \ldots, E_{i-1})$$
     and assume the Markov property:
     $$p(E_i \mid E_1, \ldots, E_{i-1}) = p(E_i \mid E_{i-1}) = p(E_2 \mid E_1)$$
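
The factorization above drops straight into code. A minimal sketch (mine, not from the slides): event sequences are lists of (token, tag) pairs, and p_init and p_trans are hypothetical probability tables.

```python
def sequence_probability(events, p_init, p_trans):
    """p(E_1) * prod_{i=2..k} p(E_i | E_{i-1}) for a first-order
    Markov chain over events E_i = (w_i, t_i)."""
    if not events:
        return 0.0
    prob = p_init.get(events[0], 0.0)
    for prev, cur in zip(events, events[1:]):
        prob *= p_trans.get((prev, cur), 0.0)
    return prob

# Hypothetical weights, for illustration only.
p_init = {("the", "DT"): 0.2}
p_trans = {(("the", "DT"), ("car", "NN")): 0.4}
print(sequence_probability([("the", "DT"), ("car", "NN")], p_init, p_trans))  # 0.08
```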

  5. Markov Model: Summary. Initial weights $p(E)$ (not indicated in the diagram)
     and transition weights $p(E \mid E')$. [Diagram: a Markov chain over the events
     (the, DT), (a, DT), (fun, NN), and (car, NN), with transition weights between
     0 and 0.4.]

  6. Markov Model: Maximum likelihood estimation (MLE). Assume that
     likelihood = relative frequency in the corpus: the initial weight $p(E)$ asks
     how often E starts a tagged sentence, and the transition weight $p(E \mid E')$
     asks how often E follows E'. Problems: Vocabulary, with ≈ 350,000 English
     tokens but only 50,000 tokens (14%) in the Brown corpus; and sparsity, since
     "(car, NN) (fun, NN)" is not attested in the corpus but plausible (frequency
     estimates might be wrong). (A counting sketch follows below.)
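
The counting announced above is a few lines of code. A sketch under the slide's assumption that likelihood = relative frequency; the corpus format (a list of tagged sentences) is an assumption of the sketch.

```python
from collections import Counter

def mle_estimates(tagged_sentences):
    """Relative-frequency estimates: p(E) = how often E starts a sentence,
    p(E | E') = how often E follows E', each suitably normalized."""
    starts, bigrams, contexts = Counter(), Counter(), Counter()
    for sent in tagged_sentences:  # sent is a list of (token, tag) events
        if not sent:
            continue
        starts[sent[0]] += 1
        for prev, cur in zip(sent, sent[1:]):
            bigrams[(prev, cur)] += 1
            contexts[prev] += 1
    total = sum(starts.values())
    p_init = {e: c / total for e, c in starts.items()}
    p_trans = {pair: c / contexts[pair[0]] for pair, c in bigrams.items()}
    return p_init, p_trans
```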

  7. Transformation into a Weighted Automaton. [Diagrams: the Markov chain from
     slide 5 is unfolded step by step into a weighted automaton with start state
     $q_0$ and states $q_1, \ldots, q_4$, one per event; each transition is
     labelled with an event such as (the, DT) or (car, NN) and carries its
     weight.] (A construction sketch follows below.)
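
The unfolding in the diagrams can be mimicked directly: one automaton state per event plus a fresh start state. A sketch with an assumed representation (a transition dictionary); final weights are not modelled here.

```python
START = "q0"

def markov_to_wa(p_init, p_trans):
    """Unfold a first-order Markov model into a weighted automaton:
    states are the events themselves (plus START); reading event E in
    state E' carries weight p(E | E'), and in START it carries p(E)."""
    delta = {}  # (state, input symbol) -> (next state, weight)
    for e, w in p_init.items():
        delta[(START, e)] = (e, w)
    for (prev, cur), w in p_trans.items():
        delta[(prev, cur)] = (cur, w)
    return delta
```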

  8. Part-of-Speech Tagging: Typical questions.
     Decoding (or language model evaluation): given a model M and a sentence w,
     determine the probability $M_1(w)$; project the labels to their first
     components and evaluate w in the obtained wA $M_1$; efficient via
     initial-algebra semantics (forward algorithm).
     Tagging: given a model M and a sentence w, determine the best tag sequence
     $t_1 \cdots t_k$; intersect M with the DFA accepting w combined with any tag
     sequence and determine the best run in the obtained wA; efficient via the
     Viterbi algorithm (see the sketch below).
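
For tagging, the Viterbi algorithm computes the best run by dynamic programming. A sketch over the joint event model above (replacing max by sum gives the forward algorithm used for decoding); the tag set and the probability tables are assumed inputs, and the sentence is assumed nonempty.

```python
def viterbi_tags(tokens, tagset, p_init, p_trans):
    """Best tag sequence for w_1..w_k: maximize
    p(E_1) * prod p(E_i | E_{i-1}) with E_i = (w_i, t_i)."""
    # best[t] = (probability, tag sequence) of the best path ending in tag t
    best = {t: (p_init.get((tokens[0], t), 0.0), [t]) for t in tagset}
    for prev_w, w in zip(tokens, tokens[1:]):
        best = {
            t: max((p * p_trans.get(((prev_w, pt), (w, t)), 0.0), seq + [t])
                   for pt, (p, seq) in best.items())
            for t in tagset
        }
    prob, tags = max(best.values())
    return tags, prob
```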

  9. Part-of-Speech Tagging: Typical questions (continued).
     (Weight) Induction (or MLE training): given an NFA $(Q, \Sigma, I, \Delta, F)$
     and a sequence $w_1, \ldots, w_k$ of tagged sentences $w_i \in \Sigma^*$,
     determine transition weights $wt\colon \Delta \to [0, 1]$ such that
     $\prod_{i=1}^{k} M_{wt}(w_i)$ is maximal, where
     $M_{wt} = (Q, \Sigma, I, \Delta, F, wt)$; no closed-form solution in general,
     but many approximations; efficient hill-climbing methods (EM, simulated
     annealing, etc.).
     Learning (or HMM induction): the same, but for untagged sentences
     $w_1, \ldots, w_k$, maximizing $\prod_{i=1}^{k} (M_{wt})_1(w_i)$; no exact
     solution in general, but again many approximations via hill climbing
     (a naive sketch follows below).
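
Since no closed-form solution exists, any hill-climbing scheme applies in principle. A deliberately naive sketch (random perturbation standing in for the EM or simulated-annealing methods named above); corpus_likelihood is an assumed callable that scores a complete weight assignment.

```python
import random

def hill_climb(wt, corpus_likelihood, steps=1000, noise=0.05):
    """Greedy hill climbing over transition weights wt: delta -> [0, 1].
    Keeps a perturbed candidate whenever it improves the corpus likelihood."""
    best, best_score = dict(wt), corpus_likelihood(wt)
    for _ in range(steps):
        cand = {d: min(1.0, max(0.0, w + random.gauss(0.0, noise)))
                for d, w in best.items()}
        score = corpus_likelihood(cand)
        if score > best_score:
            best, best_score = cand, score
    return best
```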

  10. Part-of-Speech Tagging: Issues.
      The wA is too big (in comparison to the training data): we cannot reliably
      estimate that many probabilities $p(E \mid E')$, so simplify the model,
      e.g. assume that the transition probability depends only on the tags:
      $p((w, t) \mid (w', t')) = p(t \mid t')$.
      Unknown words: no statistics on words that do not occur in the corpus;
      allow only the assignment of open tags (open tag = potentially unbounded
      number of elements, e.g. NNP; closed tag = fixed finite number of elements,
      e.g. DT or PRP); use morphological clues (capitalization, affixes, etc.);
      use context to disambiguate; use "global" statistics.

  11. Part-of-Speech Tagging: TCS contributions. Efficient evaluation and
      complexity considerations (initial-algebra semantics, best runs, best
      strings, etc.); model simplifications (trimming, determinization,
      minimization, etc.); model transformations (projection, intersection,
      RegEx-to-DFA, etc.); model induction (grammar induction, weight training,
      etc.).

  12. Parsing

  13. Parsing: Motivation. (Syntactic) parsing = determining the syntactic
      structure of a sentence; important in several applications: co-reference
      resolution (determining which noun phrases refer to the same object or
      concept), comprehension (determining the meaning), and speech repair and
      sentence-like unit detection in speech (speech offers no punctuation, so
      it must be predicted).

  14. Parsing: Example. "We must bear in mind the Community as a whole."
      [Figure: the parse tree of this sentence, with S spanning NP (We) and
      VP (must bear in mind the Community as a whole).]

  15. Trees. Let $\Sigma$ and W be finite sets. Definition: the set
      $T_\Sigma(W)$ of $\Sigma$-trees indexed by W is the smallest set T such
      that (i) $w \in T$ for all $w \in W$, and (ii)
      $\sigma(t_1, \ldots, t_k) \in T$ for all $k \in \mathbb{N}$,
      $\sigma \in \Sigma$, and $t_1, \ldots, t_k \in T$.
      Notes: this yields the obvious recursion and induction principle (see the
      sketch below).
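
The recursion principle translates directly into a recursive datatype. A minimal sketch (names mine): a childless node plays the role of an index w ∈ W.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Tree:
    """sigma(t_1, ..., t_k); a node without children is a leaf w in W."""
    label: str
    children: List["Tree"] = field(default_factory=list)

    def leaves(self) -> List[str]:
        """The yield: leaf labels from left to right."""
        if not self.children:
            return [self.label]
        return [w for c in self.children for w in c.leaves()]

t = Tree("NP", [Tree("DT", [Tree("the")]), Tree("NN", [Tree("car")])])
print(t.leaves())  # ['the', 'car']
```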

  16. Parsing: Problem. Assume a hidden $g\colon W^* \to T_\Sigma(W)$ (reference
      parser); given a finite set $T \subseteq T_\Sigma(W)$ (training set)
      generated by g, develop a system representing
      $f\colon W^* \to T_\Sigma(W)$ (parser) approximating g.
      Clarification: T is generated by g if and only if $T = g(L)$ for some
      finite $L \subseteq W^*$; as a measure of approximation we could use
      $|\{w \in W^* \mid f(w) = g(w)\}|$.

  17. Parsing: Short history.
      Before 1990: hand-crafted rules based on POS tags (unlexicalized parsing);
      corrections and selection by human annotators.
      1990s: Penn treebank (1,000,000 words); weighted local tree grammars
      (weighted CFG) as parsers (often still unlexicalized); Wall Street Journal
      treebank (30,000,000 words).
      Since 2000: weighted tree automata (weighted CFG with latent variables);
      lexicalized parsers.

  18. Weighted Local Tree Grammars. [Figure: parse trees for "I scored well"
      and "My dog sleeps".] LTG production extraction: simply read off the CFG
      productions: S → NP VP, NP → PRP$ NN, PRP$ → My, NN → dog, VP → VBZ,
      VBZ → sleeps, NP → PRP, PRP → I, VP → VBD ADVP, VBD → scored, ADVP → RB,
      RB → well. (A traversal sketch follows below.)
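
Reading off the productions is a single tree traversal. A sketch reusing the Tree class from the earlier sketch:

```python
from collections import Counter

def extract_productions(t, counts=None):
    """One CFG production per internal node, label -> child labels,
    counted over the whole tree (and reusable across a treebank)."""
    if counts is None:
        counts = Counter()
    if t.children:
        counts[(t.label, tuple(c.label for c in t.children))] += 1
        for c in t.children:
            extract_productions(c, counts)
    return counts
```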

  19. Weighted Local Tree Grammars: Observations. An LTG offers a unique
      explanation on the tree level (the rules are observable in the training
      data, as for POS tagging), but ambiguity on the string level (i.e., on
      unannotated data, again as for POS tagging) → weighted productions.
      [Illustration: two parses of "We saw her duck", one reading "her duck" as
      a noun phrase (PRP$ NN) and one with an embedded clause S-BAR in which
      "duck" is a verb (VBP).]

  20. Weighted Local Tree Grammars: Definition. A weighted local tree grammar
      (wLTG) is a weighted CFG $G = (N, W, S, P, wt)$ with a finite set N
      (nonterminals), a finite set W (terminals), start nonterminals
      $S \subseteq N$, a finite set $P \subseteq N \times (N \cup W)^*$
      (productions), and a mapping $wt\colon P \to [0, 1]$ (weight assignment).
      It computes the weighted derivation trees of the wCFG.

  21. Weighted Local Tree Grammars: wLTG production extraction. [Figure: the
      two parse trees from slide 18.] Simply read off the CFG productions and
      keep counts: S → NP VP (2), NP → PRP$ NN (1), PRP$ → My (1), NN → dog (1),
      VP → VBZ (1), VBZ → sleeps (1), NP → PRP (1), PRP → I (1),
      VP → VBD ADVP (1), VBD → scored (1), ADVP → RB (1), RB → well (1).

  22. Weighted Local Tree Grammars: wLTG production extraction (continued).
      Normalize the counts, here by left-hand side: S → NP VP (1),
      NP → PRP$ NN (0.5), NP → PRP (0.5), PRP$ → My (1), NN → dog (1),
      VP → VBZ (0.5), VP → VBD ADVP (0.5), VBZ → sleeps (1), PRP → I (1),
      VBD → scored (1), ADVP → RB (1), RB → well (1). (See the sketch below.)
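
The normalization is a per-left-hand-side division. A sketch operating on the counts returned by extract_productions above:

```python
from collections import Counter

def normalize_by_lhs(counts):
    """Turn production counts (lhs, rhs) -> n into probabilities such that
    the weights of all productions with the same left-hand side sum to 1."""
    totals = Counter()
    for (lhs, _rhs), c in counts.items():
        totals[lhs] += c
    return {prod: c / totals[prod[0]] for prod, c in counts.items()}
```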

  23. Weighted Local Tree Grammars: Weighted parses. [Figure: the two parse
      trees from slide 18, each with weight 0.25.] The only productions with
      weight ≠ 1 are NP → PRP$ NN (0.5), NP → PRP (0.5), VP → VBZ (0.5), and
      VP → VBD ADVP (0.5).

  24. Parser Evaluation. [Figure: for "We must bear in mind the Community as a
      whole", the Berkeley parser output (the reference) next to the
      Charniak-Johnson parser output, which attaches the PP "as a whole"
      differently and tags "Community" as NNP.]

  25. Parser Evaluation: Definition (ParseEval measure).
      Precision = the number of correct constituents (heading the same phrase as
      in the reference) divided by the number of all constituents in the parse.
      Recall = the number of correct constituents divided by the number of all
      constituents in the reference.
      The (weighted) harmonic mean is
      $$F_\alpha = (1 + \alpha^2) \cdot
        \frac{\text{precision} \cdot \text{recall}}
             {\alpha^2 \cdot \text{precision} + \text{recall}}$$
      (see the sketch below).
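
All three quantities fit in a few lines. A sketch; deciding which constituents are correct (i.e., head the same phrase as in the reference) is assumed to have happened already.

```python
def parseval(correct, in_parse, in_reference, alpha=1.0):
    """ParseEval: precision, recall, and the weighted harmonic mean
    F_alpha = (1 + alpha^2) * P * R / (alpha^2 * P + R)."""
    p = correct / in_parse
    r = correct / in_reference
    f = (1 + alpha ** 2) * p * r / (alpha ** 2 * p + r)
    return p, r, f

print(parseval(9, 9, 10))  # (1.0, 0.9, 0.947...), the ~95% example below
```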

  26. Parser Evaluation: Example. [Figure: the reference parse and the parser
      output from slide 24.] precision = 9/9 = 100%, recall = 9/10 = 90%,
      $F_1 = 2 \cdot \frac{1 \cdot 0.9}{1 + 0.9} \approx 95\%$.

  27. Parser Evaluation: Standardized setup. Training data: Penn treebank
      Sections 2–21 (articles from the Wall Street Journal); development test
      data: Penn treebank Section 22; evaluation data: Penn treebank Section 23.
      Experiment [Post, Gildea, '09]:

          grammar model   precision   recall   F1
          wLTG            75.37       70.05    72.61

      These numbers are bad compared to the state of the art!

  28. Parser Evaluation: State-of-the-art models. Context-free grammars with
      latent variables (CFG_lv) [Collins, '99; Klein, Manning, '03; Petrov,
      Klein, '07]; tree substitution grammars with latent variables (TSG_lv)
      [Shindo et al., '12] (both as expressive as weighted tree automata); other
      models. Experiment [Shindo et al., '12]:

          grammar model                    F1
          wLTG = wCFG                      72.6
          wTSG [Cohn et al., 2010]         84.7
          wCFG_lv [Petrov, 2010]           91.8
          wTSG_lv [Shindo et al., 2012]    92.4

  29. Grammars with Latent Variables: Definition. A grammar with latent
      variables (a grammar with relabeling) is a grammar G generating
      $L(G) \subseteq T_\Sigma(W)$ together with a (total) mapping
      $\rho\colon \Sigma \to \Delta$ (a functional relabeling).
      Definition (Semantics):
      $L(G, \rho) = \rho(L(G)) = \{\rho(t) \mid t \in L(G)\}$.
      Language class: $\mathrm{REL}(\mathcal{L})$ for a language class
      $\mathcal{L}$.

  30. Weighted Tree Automata: Definition. A weighted tree automaton (wTA) is a
      system $G = (Q, N, W, S, P, wt)$ with a finite set Q (states), a finite
      set N (nonterminals), a finite set W (terminals), start states
      $S \subseteq Q$, a finite set
      $P \subseteq (Q \times N \times (Q \cup W)^+) \cup (Q \times W)$
      (productions), and a mapping $wt\colon P \to [0, 1]$ (weight assignment).
      A production $(q, n, w_1, \ldots, w_k)$ is often written
      $q \to n(w_1, \ldots, w_k)$.

  31. Grammars with Latent Variables: Theorem.
      $\mathrm{REL}(wLTL) = \mathrm{REL}(wTSL) = wRTL$.
      [Figure: a diagram relating the classes wCFL [wCFG], wTSL [wTSG],
      REL(wCFL) [wCFG_lv], REL(wTSL) [wTSG_lv], and wRTL [wTA].]
      Here: latent variables ≈ finite-state.

  32. Parsing: Typical questions.
      Decoding (or language model evaluation): given a model M and a sentence w,
      determine the probability M(w); intersect M with the DTA accepting w
      combined with any parse and evaluate w in the obtained wTA; efficient via
      initial-algebra semantics (forward algorithm).
      Parsing: given a model M and a sentence w, determine the best parse t for
      w; intersect M with the DTA accepting w combined with any parse and
      determine the best tree in the obtained wTA; no efficient algorithm
      (NP-hard even for wLTG); see the sketch below for the usual
      best-derivation workaround.
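
Because the best tree is NP-hard to find, practice usually settles for the best derivation instead. A Viterbi-CKY sketch for a weighted CFG in Chomsky normal form (the wLTG case); the rule tables and the bracketed output format are assumptions of this sketch, not the talk's notation.

```python
def cky_best_derivation(tokens, lexical, binary, start="S"):
    """Highest-weight derivation of tokens from `start`.
    lexical: (A, word) -> weight; binary: (A, B, C) -> weight of A -> B C.
    Returns (weight, bracketed parse) or None."""
    n = len(tokens)
    chart = {}  # (i, j, A) -> (weight, backpointer)
    for i, w in enumerate(tokens):
        for (a, word), wt in lexical.items():
            if word == w and wt > chart.get((i, i + 1, a), (0.0,))[0]:
                chart[(i, i + 1, a)] = (wt, w)
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for (a, b, c), wt in binary.items():
                    left, right = chart.get((i, k, b)), chart.get((k, j, c))
                    if left and right:
                        cand = wt * left[0] * right[0]
                        if cand > chart.get((i, j, a), (0.0,))[0]:
                            chart[(i, j, a)] = (cand, (b, c, k))

    def build(i, j, a):
        _, bp = chart[(i, j, a)]
        if isinstance(bp, str):
            return f"({a} {bp})"
        b, c, k = bp
        return f"({a} {build(i, k, b)} {build(k, j, c)})"

    if (0, n, start) not in chart:
        return None
    return chart[(0, n, start)][0], build(0, n, start)

# Tiny hypothetical grammar.
lex = {("DT", "the"): 1.0, ("NN", "car"): 1.0}
rules = {("S", "DT", "NN"): 1.0}
print(cky_best_derivation(["the", "car"], lex, rules))  # (1.0, '(S (DT the) (NN car))')
```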

  33. Parsing: Statistical parsing approach. Given a wLTG M and a sentence w,
      return the highest-scoring parse for w. Consequence: the first parse
      should be preferred ("duck" is more frequently a noun, etc.). [Figure: the
      two parses of "We saw her duck" from slide 19.]

  34. Parsing: TCS contributions. Efficient evaluation and complexity
      considerations (initial-algebra semantics, best runs, best trees, etc.);
      model simplifications (trimming, determinization, minimization, etc.);
      model transformations (intersection, normalization, lexicalization, etc.);
      model induction (grammar induction, weight training, spectral learning,
      etc.).

  35. Parsing: NLP contributions to TCS. A good source of (relevant) problems;
      a good source of practical techniques (e.g., coarse-to-fine decoding); a
      good source of (relevant) large wTA:

          language   states   non-lexical productions
          English    1,132    1,842,218
          Chinese    994      1,109,500
          German     981      616,776

  36. Machine Translation

  37. Machine Translation: Applications. Technical manuals. Example (an mp3
      player; machine output reproduced verbatim):
      "The synchronous manifestation of lyrics is a procedure for can
      broadcasting the music, waiting the mp3 file at the same time showing the
      lyrics."
      "With the this kind method that the equipments that synchronous function
      of support up broadcast to make use of document create setup, you can pass
      the LCD window way the check at the document contents that broadcast."
      "That procedure returns offerings to have to modify, and delete, and stick
      top, keep etc. edit function."

  38. Machine Translation: Applications. Technical manuals; US military.
      Example (speech-to-text [Jones et al., '09]); intended dialogue vs. system
      output:
      Intended: A: Okay, what is your name? E: Abdul. A: And your last name?
      E: Al Farran.
      System output: A: Okay, what's your name? E: milk a mechanic and I am
      here, I mean yes. A: What is your last name? E: every two weeks, my son's
      name is ismail.

  39. Machine Translation: The Vauquois triangle. Translation between a foreign
      language and German can operate at the phrase, syntax, or semantics level.
      Translation models: string-to-tree and tree-to-tree.
