Natural Language Generation (mostly) for Spoken Dialogue Systems

Ondřej Dušek
Charles University in Prague, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics
May 11th, 2016


  1. Standard NLG Pipeline (Textbook)
[Inputs]
↓ Content/text planning ("what to say")
• content selection, basic ordering
[Content plan]
↓ Sentence planning/microplanning ("middle ground")
• aggregation, lexical choice, referring expressions
[Sentence plan(s)]
↓ Surface realization ("how to say it")
• linearization, conforming to rules of the target language
[Text]

  2. Standard NLG Pipeline (Textbook): Content Planning
Inputs:
• Communication goal (e.g. "inform user about search results")
• Knowledge base (e.g. list of matching entries in a database, weather report numbers, etc.)
• User model (constraints, e.g. user wants short answers)
• Dialogue history (referring expressions, repetition)
Content planning:
• Content selection according to the communication goal
• Basic structuring (ordering)

  4. Standard NLG Pipeline (Textbook): Sentence Planning and Surface Realization
Sentence planning (micro-planning):
• Word and syntax selection (e.g. choosing templates)
• Dividing content into sentences
• Aggregation (merging simple sentences)
• Lexicalization
• Referring expressions
Surface realization:
• Creating linear text from (typically) structured input
• Ensuring grammatical correctness
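As a rough illustration of how the three textbook stages hand data to each other, here is a toy end-to-end pipeline. All function names, templates, and slot names are invented for this sketch; it glosses over everything the pipeline really involves (aggregation, referring expressions, real grammar).

```python
# Toy sketch of the textbook NLG pipeline (all names and data are illustrative).

def content_planning(goal, knowledge_base):
    """'What to say': select facts relevant to the goal and order them."""
    facts = [f for f in knowledge_base if f["slot"] in goal["slots"]]
    return sorted(facts, key=lambda f: goal["slots"].index(f["slot"]))

def sentence_planning(content_plan):
    """'Middle ground': lexical choice via toy templates, one clause per fact."""
    templates = {"name": "{value} is a venue",
                 "food": "serving {value} food",
                 "area": "in the {value} area"}
    return [templates[f["slot"]].format(value=f["value"]) for f in content_plan]

def surface_realization(sentence_plan):
    """'How to say it': linearize the clauses into one sentence."""
    return ", ".join(sentence_plan) + "."

kb = [{"slot": "area", "value": "riverside"},
      {"slot": "name", "value": "X"},
      {"slot": "food", "value": "Italian"}]
goal = {"act": "inform", "slots": ["name", "food", "area"]}

text = surface_realization(sentence_planning(content_planning(goal, kb)))
print(text)  # X is a venue, serving Italian food, in the riverside area.
```

Even this toy version shows why the stages are separable: the content plan is language-independent, and only the last step knows about word order and punctuation.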

  6. Real NLG Systems
• Few systems implement the whole pipeline:
  • systems focused on content planning with trivial surface realization
  • surface-realization-only or word-order-only systems
  • one-step (holistic) approaches
  • SDS (spoken dialogue systems): content planning is done by the dialogue manager → only sentence planning and realization happen here
• Approaches: templates, grammars, rules, statistics, or a mix thereof
• Data representations: varied, custom-tailored, non-compatible

  9. Two-Step or One-Step Architecture?
Why go two-step:
• Dividing makes the tasks simpler
  • no need to worry about morphology in sentence planning
• Surface realization can be rule-based
  • you can hardcode the grammar, which is more straightforward to fix
• A surface realizer is (relatively) easy to implement
  • and you can use a third-party one
Why go one-step:
• Problem of all pipelines: error propagation
  • the more steps, the more chances to screw up
• Need to provide training sentence plans (for statistical planners)
  • though sometimes you may use existing analysis tools

  14. NLG Systems Examples
• Divided by NLG stage:
  1. Sentence planning
  2. Surface realization
  3. One-step approaches to NLG
• Each stage:
  1. History
  2. Current state of the art / our works

  19. Sentence Planning Examples
• Various input/output formats, not very comparable
• Actually typically handcrafted or non-existent
  • one-step approaches or simplistic systems
• Here we focus on trainable approaches
  • …and especially on our own ☺

  22. Trainable Sentence Planning: SPoT
• Spoken Dialogue System in the flight information domain
• Handcrafted generator + overgeneration
• Statistical reranker (RankBoost) trained on hand-annotated sentence plans

  23. Trainable Sentence Planning: Parameter Optimization
• Requires a flexible handcrafted planner
  • no overgeneration
  • adjusting its parameters "somehow"
Examples:
• Paiva & Evans: linguistic features annotated in a corpus generated with many parameter settings, correlation analysis
• PERSONAGE-PE: personality traits connected to linguistic features via machine learning

  25. Our Approach to Sentence Planning
Our A*/Perceptron Sentence Planner (TGen):
1. Requires no handcrafted module
2. Learns from unaligned data
• Typical NLG training:
  a) requires alignment of MR elements and words/phrases
  b) uses a separate alignment step
• Our sentence planner learns alignments jointly
  • training from pairs: MR + sentence
Example pair:
• MR: inform(name=X, type=placetoeat, eattype=restaurant, area=riverside, food=Italian)
• Text: "X is an italian restaurant in the riverside area."
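A dialogue-act MR in the format shown above can be read into slot-value pairs with a few lines of parsing. `parse_mr` below is a hypothetical helper written for this sketch, not part of TGen.

```python
import re

def parse_mr(mr):
    """Parse a dialogue-act MR like 'inform(name=X, food=Italian)' into
    (act, {slot: value}) -- a toy reader for the format on the slide."""
    match = re.fullmatch(r"(\w+)\((.*)\)", mr.strip())
    act, body = match.group(1), match.group(2)
    slots = dict(pair.split("=", 1) for pair in re.split(r",\s*", body) if pair)
    return act, slots

act, slots = parse_mr("inform(name=X, type=placetoeat, eattype=restaurant, "
                      "area=riverside, food=Italian)")
print(act, slots["food"])  # inform Italian
```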

  28. Our Sentence Planner: Input and Output
• Input: an MR
  • dialogue acts: "inform" + slot-value pairs
  • other formats possible
• Output: deep-syntax dependency trees
  • based on TectoMT's t-layer, but with very simplified I/O formats
  • two attributes per tree node: t-lemma + formeme
  • using surface word order
• Conversion to plain text sentences (surface realization)
  • Treex/TectoMT English synthesis (rule-based, more later)
[Example t-tree for "X is an Italian restaurant in the riverside area.": be/v:fin with children X-name/n:subj and restaurant/n:obj; restaurant modified by italian/adj:attr and area/n:in+X; area modified by riverside/n:attr]
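A sentence-plan tree with a t-lemma and a formeme per node can be held in a small recursive structure. The `TNode` class and its methods are illustrative, not Treex's actual API; the example rebuilds the tree from the slide.

```python
from dataclasses import dataclass, field

@dataclass
class TNode:
    """One deep-syntax node: t-lemma + formeme (toy stand-in for a t-layer node)."""
    t_lemma: str
    formeme: str
    children: list = field(default_factory=list)

    def add(self, t_lemma, formeme):
        """Attach and return a new child node."""
        child = TNode(t_lemma, formeme)
        self.children.append(child)
        return child

    def size(self):
        """Number of nodes in the subtree (a feature the planner can use)."""
        return 1 + sum(c.size() for c in self.children)

# The slide's example tree: "X is an Italian restaurant in the riverside area."
root = TNode("be", "v:fin")
root.add("X-name", "n:subj")
restaurant = root.add("restaurant", "n:obj")
restaurant.add("italian", "adj:attr")
area = restaurant.add("area", "n:in+X")
area.add("riverside", "n:attr")
print(root.size())  # 6
```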

  31. Overall Structure of Our Sentence Planner
• A*-style search: "finding the path" from an empty tree to a full sentence plan tree
  • always expand the most promising candidate sentence plan
  • stop when candidates don't improve for a while
• Using two subcomponents:
  • candidate generator: churns out candidate sentence plan trees; given an incomplete candidate tree, adds node(s)
  • scorer/ranker for the candidates: influences which candidate trees will be expanded (selects the most promising)
[Diagram: MR → sentence planner (candidate generator + scorer, A* search) → sentence plan (deep syntax tree)]
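The search loop described above (expand the most promising candidate, stop after a stretch of no improvement) can be sketched with a priority queue. The `successors` and `score` stand-ins below, and the toy slot-covering problem they solve, are illustrative assumptions, not TGen's actual components.

```python
import heapq, itertools

def plan(mr, successors, score, patience=20):
    """A*-style planning sketch: a priority queue of candidate plans,
    always expand the highest-scoring one, stop after `patience`
    expansions without improvement (or when the queue empties)."""
    counter = itertools.count()          # tie-breaker for equal scores
    start = ()                           # the empty plan
    heap = [(-score(start, mr), next(counter), start)]
    best, best_score, stale = start, score(start, mr), 0
    while heap and stale < patience:
        neg, _, cand = heapq.heappop(heap)
        if -neg > best_score:
            best, best_score, stale = cand, -neg, 0
        else:
            stale += 1
        for nxt in successors(cand, mr):
            heapq.heappush(heap, (-score(nxt, mr), next(counter), nxt))
    return best

# Toy problem: a "plan" is a tuple of slots; reward covering each MR slot once.
mr = {"name", "food", "area"}
succ = lambda cand, mr: [cand + (s,) for s in mr if s not in cand]
score = lambda cand, mr: len(set(cand) & mr) - 2 * (len(cand) - len(set(cand)))
print(sorted(plan(mr, succ, score)))  # ['area', 'food', 'name']
```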

  34. Candidate Generator
• Given a candidate plan tree, generate its successors by adding 1 node (at every possible place)
• "Possible places" must be limited in practice
  • using combinations of things seen in the training data:
    • parent–child pairs
    • t-lemma + formeme combinations
    • number of children, tree size, …
[Example: successors of the empty t-tree add a root verb (be, recommend, serve, …/v:fin); successors of be/v:fin attach arguments (X-name/n:subj, restaurant/n:obj, bar/n:obj, …)]
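A minimal successor function in this spirit follows, assuming a made-up table of parent–child pairs "seen in training data". The lemmas and formemes echo the slide's example, but the table itself and the one-occurrence-per-lemma restriction are toy simplifications.

```python
# Sketch of the candidate generator: given a candidate plan tree, produce
# successors by adding one node at every place allowed by parent-child
# pairs "seen in training data" (the table below is invented).

SEEN_CHILDREN = {            # parent t-lemma -> allowed (t-lemma, formeme) pairs
    "ROOT": [("be", "v:fin")],
    "be": [("X-name", "n:subj"), ("restaurant", "n:obj")],
    "restaurant": [("italian", "adj:attr"), ("area", "n:in+X")],
}

def successors(tree):
    """A tree is a tuple of (parent, lemma, formeme) triples; the root's
    parent is 'ROOT'. Add one new node under any node already present."""
    present = ["ROOT"] + [lemma for _, lemma, _ in tree]
    result = []
    for parent in present:
        for lemma, formeme in SEEN_CHILDREN.get(parent, []):
            if lemma not in present:         # toy restriction: each lemma once
                result.append(tree + ((parent, lemma, formeme),))
    return result

empty = ()
step1 = successors(empty)      # only 'be' can hang off the root
step2 = successors(step1[0])   # now 'X-name' or 'restaurant' under 'be'
print([t[-1] for t in step2])  # [('be', 'X-name', 'n:subj'), ('be', 'restaurant', 'n:obj')]
```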

  43. Scorer
• A function: sentence plan tree t + MR m → real-valued score
  • describes the fitness of t for m
How to describe the fitness? Features:
• occurrence of input DA slots + t-lemmas/formemes
• tree shape
• tree edges (parent–child)
• …
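The feature classes listed above can be sketched as a sparse feature extractor. The tree encoding (tuples of parent/lemma/formeme triples) and the feature-name scheme below are assumptions made for the sketch.

```python
from collections import Counter

def features(tree, slots):
    """Toy feature extractor: `tree` is a tuple of (parent, t-lemma, formeme)
    triples, `slots` the MR's DA slots. Emits slot/t-lemma co-occurrences,
    node attributes, parent-child edges, and tree size, as sparse counts."""
    feats = Counter()
    for parent, lemma, formeme in tree:
        feats[f"edge:{parent}->{lemma}"] += 1
        feats[f"node:{lemma}+{formeme}"] += 1
        for slot in slots:
            feats[f"cooc:{slot}&{lemma}"] += 1
    feats["size"] = len(tree)
    return feats

tree = (("ROOT", "be", "v:fin"), ("be", "restaurant", "n:obj"))
f = features(tree, ["name", "eattype"])
print(f["edge:be->restaurant"], f["size"])  # 1 2
```

A sparse dictionary of named features like this pairs naturally with a linear (e.g. perceptron) scorer: the score is just a dot product of the feature counts with a weight vector.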


  50–55. NLG Systems Examples · Our Approach to Sentence Planning
Perceptron scorer
Basic form:
• score = w⊤ · feat(t, m)
• Training:
  • given m, generate the best tree t_top with the current weights
  • update the weights if t_top ≠ t_gold (the gold standard)
  • Update: w = w + α · (feat(t_gold, m) − feat(t_top, m))
Our improvements (guiding the search on incomplete trees):
• Updates based on partial trees
• Estimating the future value of the trees
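The perceptron update from the slides above can be sketched over sparse feature dictionaries. The feature names below are made up for illustration; the real TGen scorer extracts its features (slot occurrence, t-lemmas/formemes, tree shape, parent-child edges) from actual sentence-plan trees.

```python
def score(w, feats):
    """score = w^T . feat(t, m), over a sparse feature dict."""
    return sum(w.get(f, 0.0) * v for f, v in feats.items())

def perceptron_update(w, feat_gold, feat_top, alpha=1.0):
    """w = w + alpha * (feat(t_gold, m) - feat(t_top, m))."""
    for f in set(feat_gold) | set(feat_top):
        w[f] = w.get(f, 0.0) + alpha * (feat_gold.get(f, 0.0)
                                        - feat_top.get(f, 0.0))
    return w

# Toy example with hypothetical features: the wrong tree won the search,
# so the update pushes the weights toward the gold-standard tree.
w = {}
feat_gold = {"slot:area": 1.0, "lemma:riverside": 1.0}
feat_top = {"slot:area": 1.0, "lemma:centre": 1.0}
w = perceptron_update(w, feat_gold, feat_top)
assert score(w, feat_gold) > score(w, feat_top)
```

After one update, features unique to the gold tree gain weight and features unique to the wrongly top-ranked tree lose it; shared features cancel out.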


  56–62. NLG Systems Examples · Evaluation of Our Sentence Planner
Data:
• Restaurant recommendations from the BAGEL generator
  • restaurant location, food type, etc.
  • just 404 utterances for 202 DAs
Results:
• basic setup 54.24% BLEU, best version 59.89%
• less than BAGEL's ~67% BLEU, but:
  • we do not use alignments
  • our generator decides itself what to include
• outputs mostly fluent and meaningful
• problems:
  • repeated/missing/irrelevant information in the output
  • slow, doesn't scale very well
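The BLEU figures above come from the standard corpus-level metric. As a rough illustration only, here is a toy sentence-level BLEU with n-grams up to 2 (the real evaluation uses corpus-level BLEU, typically up to 4-grams) on a hypothetical hypothesis/reference pair in the style of the slides:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def modified_precision(hyp, ref, n):
    """Clipped n-gram precision, as used inside BLEU."""
    h, r = ngrams(hyp, n), ngrams(ref, n)
    clipped = sum(min(c, r[g]) for g, c in h.items())
    return clipped / max(sum(h.values()), 1)

def bleu2(hyp, ref):
    """Toy sentence-level BLEU-2 with brevity penalty."""
    ps = [modified_precision(hyp, ref, n) for n in (1, 2)]
    if min(ps) == 0.0:
        return 0.0
    bp = 1.0 if len(hyp) > len(ref) else math.exp(1 - len(ref) / len(hyp))
    return bp * math.exp(sum(math.log(p) for p in ps) / len(ps))

hyp = "X is a restaurant in the riverside area".split()
ref = "X restaurant is near X on the riverside".split()
print(round(bleu2(hyp, ref), 3))  # -> 0.299
```

Five of the eight hypothesis unigrams and one of its seven bigrams appear in the reference, giving precisions 5/8 and 1/7; with equal lengths the brevity penalty is 1.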


  63–66. NLG Systems Examples · Our Approach to Sentence Planning
Example Outputs

Input DA:  inform(name=X-name, type=placetoeat, area=riverside, near=X-near, eattype=restaurant)
Reference: X restaurant is near X on the riverside.
Generated: X is a restaurant in the riverside area near X.

Input DA:  inform(name=X-name, type=placetoeat, area=X-area, pricerange=moderate, eattype=restaurant)
Reference: X is a moderately priced restaurant in X.
Generated: X is a restaurant in the X area.

Input DA:  inform(name=X-name, type=placetoeat, eattype=restaurant, area=riverside, food=French)
Reference: X is a French restaurant in the riverside area which serves French food.
Generated: X is a French restaurant on the riverside.

Input DA:  inform(name=X-name, type=placetoeat, eattype=restaurant, area=citycentre, near=X-near, food="Chinese takeaway", food=Japanese)
Reference: X is a Chinese takeaway and Japanese restaurant in the city centre near X.
Generated: X is a Japanese restaurant in the centre of town near X and X.
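The inform(...) strings in these examples are flat slot-value lists and can be parsed mechanically. The parser below is a hypothetical minimal sketch (the actual BAGEL/TGen readers handle more of the format); it copes with quoted values and keeps repeated slots as a list, which matters when a slot such as food= occurs twice:

```python
import re

def parse_da(da):
    """Parse a dialogue-act string like inform(a=1, b="x y") into
    (act_type, [(slot, value), ...]). A list of pairs, not a dict,
    so that repeated slots survive."""
    act_type, body = re.match(r"(\w+)\((.*)\)$", da).groups()
    slots = []
    for slot, value in re.findall(r'(\w+)=("[^"]*"|[^,]+)', body):
        slots.append((slot, value.strip().strip('"')))
    return act_type, slots

act, slots = parse_da('inform(name=X-name, type=placetoeat, '
                      'area=riverside, near=X-near, eattype=restaurant)')
print(act, dict(slots))
```

Quoted values like food="Chinese takeaway" come back without the quotes; unquoted values are read up to the next comma.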

  67. NLG Systems Examples · Surface Realization
Surface Realization Examples
• Also various input formats; at least the output is always text
• From handcrafted to different trainable realizers
• Also including our own (developed here at ÚFAL): the Treex / TectoMT realizer
  • actually handcrafted for the most part


  68–69. NLG Systems Examples · Surface Realization
Grammar-based Realizers (90's): KPML, FUF/SURGE
KPML:
• General purpose, multilingual
• Systemic Functional Grammar
FUF/SURGE:
• General purpose
• Functional Unification Grammar
Example KPML input ("It is raining cats and dogs."):
  (EXAMPLE
    :NAME EX-SET-1
    :TARGETFORM "It is raining cats and dogs."
    :LOGICALFORM
      (A / AMBIENT-PROCESS :LEX RAIN
         :TENSE PRESENT-CONTINUOUS
         :ACTEE (C / OBJECT :LEX CATS-AND-DOGS :NUMBER MASS)))

  70. NLG Systems Examples · Surface Realization
Grammar-based Realizer: OpenCCG
• General purpose, multilingual
• Combinatory Categorial Grammar
• Used in several projects
• With statistical enhancements

  71. NLG Systems Examples · Surface Realization
Procedural Realizer: SimpleNLG
• General purpose (procedural)
• Java implementation
• English, adapted to several other languages

  Lexicon lexicon = new XMLLexicon("my-lexicon.xml");
  NLGFactory nlgFactory = new NLGFactory(lexicon);
  Realiser realiser = new Realiser(lexicon);
  SPhraseSpec p = nlgFactory.createClause();
  p.setSubject("Mary");
  p.setVerb("chase");
  p.setObject("the monkey");
  p.setFeature(Feature.TENSE, Tense.PAST);
  String output = realiser.realiseSentence(p);
  System.out.println(output);
  >>> Mary chased the monkey.


  72–74. NLG Systems Examples · Surface Realization
Trainable Realizers: Overgenerate and Rank
• Require a handcrafted realizer, e.g. the CCG realizer
• Input underspecified → more outputs possible
• Overgenerate, then use a statistical reranker
• Ranking according to:
  • n-gram models (NITROGEN, HALOGEN)
  • Tree models (XTAG grammar: FERGUS)
  • Predicted Text-to-Speech quality (Nakatsu and White)
  • Personality traits (extraversion, agreeableness…: CRAG)
  • + alignment (repeating words uttered by the dialogue counterpart)
• Provides variance, but at a greater computational cost
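The overgenerate-and-rank loop can be sketched with the simplest of the rankers above, an n-gram language model. Everything here is illustrative: the candidate strings stand in for the output of a handcrafted realizer, and the bigram probabilities are made up, not estimated from data.

```python
import math

BIGRAM_P = {  # hypothetical bigram probabilities
    ("<s>", "the"): 0.4, ("the", "restaurant"): 0.3,
    ("restaurant", "is"): 0.4, ("is", "nearby"): 0.2,
    ("<s>", "nearby"): 0.05, ("nearby", "is"): 0.05,
    ("is", "the"): 0.1,
}

def lm_logprob(sentence, unk=1e-4):
    """Sum of log bigram probabilities; unseen bigrams get a floor."""
    toks = ["<s>"] + sentence.split()
    return sum(math.log(BIGRAM_P.get(bg, unk))
               for bg in zip(toks, toks[1:]))

def rank(candidates):
    """Pick the candidate the LM considers most fluent."""
    return max(candidates, key=lm_logprob)

candidates = [
    "the restaurant is nearby",   # fluent word order
    "nearby is the restaurant",   # marked order, lower LM score
]
best = rank(candidates)
print(best)
```

Scoring every candidate against a statistical model is exactly where the extra computational cost mentioned on the slide comes from: the more the realizer overgenerates, the more reranking work is needed.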

  75. NLG Systems Examples · Surface Realization
Trainable Realizers: Syntax-Based
• StuMaBa: a general realizer based on SVMs
• Pipeline: deep syntax/semantics → surface syntax → linearization → morphologization


  76–79. NLG Systems Examples · Our Surface Realizer
Treex / TectoMT Surface Realizer
• Domain-independent
• We use it for our experiments (TGen)
  • analysis → synthesis on BAGEL data = 89.79% BLEU
• Pipeline approach
  • Mostly simple, single-purpose, rule-based modules (blocks)
  • Word inflection: statistical (Flect)
• Gradual transformation of deep trees into surface dependency trees
• Surface trees are then simply linearized
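The final step, "surface trees are then simply linearized", is the easiest to sketch: once every surface node carries a word form and a word-order index, realization is just an ordered traversal. The node layout below is a hypothetical toy, not Treex's actual data structures.

```python
class Node:
    """Toy surface-dependency node: a word form, a word-order index,
    and child nodes."""
    def __init__(self, form, order, children=()):
        self.form, self.order, self.children = form, order, list(children)

    def subtree(self):
        yield self
        for child in self.children:
            yield from child.subtree()

def linearize(root):
    """Order all nodes by their word-order attribute and join forms."""
    return " ".join(n.form for n in sorted(root.subtree(),
                                           key=lambda n: n.order))

# A tree in the spirit of the deck's jump/cat/window example:
root = Node("jumps", 3, [
    Node("cat", 2, [Node("The", 1)]),
    Node("through", 4, [Node("window", 6, [Node("the", 5)])]),
    Node(".", 7),
])
sentence = linearize(root)
print(sentence)
```

The hard work happens earlier in the pipeline, where the blocks assign those word forms and order indices; by this point the tree structure itself no longer matters.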


  80–84. NLG Systems Examples · Our Surface Realizer
Treex / TectoMT Surface Realization Example
[Example deep tree (t-tree, zone=en_gen): jump (v:fin) with children cat (n:subj) and window (n:through+X)]
• Realizer steps (simplified):
  • Copy the deep tree (sentence plan)
  • Determine morphological agreement
  • Add prepositions and conjunctions
  • Add articles
  • Compound verb forms (add auxiliaries)
  • Punctuation
  • Word inflection
  • Capitalization
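A few of those steps can be walked through on the deck's jump/cat/window example. Every rule below is a drastic simplification invented for illustration; the real Treex blocks operate on trees, not flat lists, and cover far more cases.

```python
def realize(lemmas):
    """Toy realizer over (lemma, formeme) pairs, mimicking a few
    steps: articles, formeme-driven prepositions, agreement,
    punctuation, capitalization."""
    words = []
    for lemma, formeme in lemmas:
        if formeme.startswith("n:"):            # add articles to nouns
            words.append("the")
        if lemma == "jump" and formeme == "v:fin":
            lemma = "jumps"                     # 3rd-person agreement
        if formeme == "n:through+X":            # preposition from formeme
            words.insert(len(words) - 1, "through")
        words.append(lemma)
    words.append(".")                           # punctuation
    words[0] = words[0].capitalize()            # capitalization
    return " ".join(words)

sentence = realize([("cat", "n:subj"), ("jump", "v:fin"),
                    ("window", "n:through+X")])
print(sentence)
```

Note how the formeme n:through+X alone tells the realizer to insert a preposition, which is why the deep tree never needs to contain function words explicitly.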
