 
Natural Language Generation
Ondřej Dušek

Introduction to NLG: Textbook NLG Pipeline

Standard NLG Pipeline (Textbook)

[Inputs]
↓ Content/text planning (“what to say”)
  • Content selection, basic ordering
[Content plan]
↓ Sentence planning/microplanning (“middle ground”)
  • aggregation, lexical choice, referring…
[Sentence plan(s)]
↓ Surface realization (“how to say it”)
  • linearization, conforming to rules of the target language
[Text]
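The three stages of the textbook pipeline can be sketched as three chained functions. This is only a toy illustration under stated assumptions: the slot names, the single hardcoded template, and all function names (`content_planning`, `sentence_planning`, `surface_realization`, `generate`) are hypothetical and not taken from any real system.

```python
from typing import Dict, List, Tuple

Fact = Tuple[str, str]  # (slot, value); a hypothetical toy representation


def content_planning(inputs: Dict[str, str], wanted: List[str]) -> List[Fact]:
    """'What to say': select the relevant slots and fix their order."""
    return [(slot, inputs[slot]) for slot in wanted if slot in inputs]


def sentence_planning(content_plan: List[Fact]) -> List[List[Fact]]:
    """'Middle ground': here, aggregate all selected facts into one sentence."""
    return [content_plan] if content_plan else []


def surface_realization(sentence_plans: List[List[Fact]]) -> str:
    """'How to say it': linearize each sentence plan with a toy template."""
    sentences = []
    for facts in sentence_plans:
        slots = dict(facts)
        noun_phrase = [slots[s] for s in ("food", "eattype") if s in slots] or ["place"]
        # Minimal agreement rule: choose "a"/"an" by the next word's first letter.
        article = "an" if noun_phrase[0][0].lower() in "aeiou" else "a"
        text = f"{slots.get('name', 'It')} is {article} {' '.join(noun_phrase)}"
        if "area" in slots:
            text += f" in the {slots['area']} area"
        sentences.append(text + ".")
    return " ".join(sentences)


def generate(inputs: Dict[str, str], wanted: List[str]) -> str:
    """Run the full pipeline: inputs -> content plan -> sentence plans -> text."""
    return surface_realization(sentence_planning(content_planning(inputs, wanted)))
```

Calling `generate` with a knowledge-base entry and a list of wanted slots runs all three stages; an error in an early stage (e.g. a wrongly dropped slot) cannot be repaired by the later ones.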
Standard NLG Pipeline (Textbook)

Inputs
• Communication goal (e.g. “inform user about search results”)
• Knowledge base (e.g. list of matching entries in database, weather report numbers etc.)
• User model (constraints, e.g. user wants short answers)
• Dialogue history (referring expressions, repetition)

Content planning
• Content selection according to communication goal
• Basic structuring (ordering)
Standard NLG Pipeline (Textbook)

Sentence planning (micro-planning)
• Word and syntax selection (e.g. choose templates)
• Dividing content into sentences
• Aggregation (merging simple sentences)
• Lexicalization
• Referring expressions

Surface realization
• Creating linear text from (typically) structured input
• Ensuring grammatical correctness
Real NLG Systems

Few systems implement the whole pipeline
• Systems focused on content planning with trivial surface realization
• Surface-realization-only, word-order-only systems
• One-step (holistic) approaches
• SDS: content planning done by dialogue manager → only sentence planning and realization here

Approaches
• Templates, grammars, rules, statistics, or a mix thereof

Data representations
• Varied, custom-tailored, non-compatible
Two-step or one-step?

Why go two-step
• Dividing makes the tasks simpler
  • no need to worry about morphology in sentence planning
• Surface realization can be rule based
  • you can hardcode the grammar, it is more straightforward to fix
• Surface realizer is (relatively) easy to implement
  • and you can use a third-party one

Why go one-step
• Problem of all pipelines: error propagation
  • the more steps, the more chance to screw it up
• Need to provide training sentence plans (statistical planners)
  • sometimes you may use existing analysis tools
NLG Systems Examples

• Divided by NLG stage:
  1. Sentence planning
  2. Surface realization
  3. One-step approaches to NLG
• Each stage:
  1. History
  2. Current state of the art / our works
Sentence Planning Examples

• Various input/output formats, not very comparable
• Actually typically handcrafted or non-existent
  • One-step approaches or simplistic systems
• Here we focus on trainable approaches
  • …and especially on our own ☺
Trainable Sentence Planning: SPoT

• Spoken Dialogue System in the flight information domain
• Handcrafted generator + overgeneration
• Statistical reranker (RankBoost) trained on hand-annotated sentence plans
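The overgenerate-and-rerank idea can be sketched as follows. This is a minimal illustration, not SPoT itself: the candidates here are plain strings rather than sentence plans, the features and weights are made up, and a simple linear scoring function stands in for the RankBoost-trained reranker.

```python
from typing import Callable, Dict, List

Features = Dict[str, float]


def toy_features(candidate: str) -> Features:
    """Hypothetical features: number of sentences and number of words."""
    return {
        "n_sentences": float(candidate.count(".")),
        "n_words": float(len(candidate.split())),
    }


def rerank(candidates: List[str],
           features: Callable[[str], Features],
           weights: Features) -> str:
    """Return the candidate with the highest weighted feature sum."""
    def score(candidate: str) -> float:
        return sum(weights.get(name, 0.0) * value
                   for name, value in features(candidate).items())
    return max(candidates, key=score)


# Overgeneration step (here simply a fixed list of alternatives).
candidates = [
    "X is a flight. It leaves at noon. It goes to Boston.",
    "X is a flight to Boston leaving at noon.",
]
# Made-up weights that prefer fewer sentences and fewer words.
weights = {"n_sentences": -1.0, "n_words": -0.1}
best = rerank(candidates, toy_features, weights)
```

With these weights the aggregated single-sentence variant is selected; in a trained system the weights would come from annotated preferences instead of being set by hand.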
Trainable Sentence Planning: Parameter Optimization

• Requires a flexible handcrafted planner
• No overgeneration
• Adjusting its parameters “somehow”

Examples
• Paiva&Evans: linguistic features annotated in corpus generated with many parameter settings, correlation analysis
• PERSONAGE-PE: personality traits connected to linguistic features via machine learning
Our Approach to Sentence Planning

Our A*/Perceptron Sentence Planner (TGEN1)

1. Requires no handcrafted module
2. Learns from unaligned data
• Typical NLG training:
  a) requires alignment of MR elements and words/phrases
  b) uses a separate alignment step
• Our sentence planner learns alignments jointly
  • training from pairs: MR + sentence

Example training pair:
MR: inform(name=X, type=placetoeat, eattype=restaurant, area=riverside, food=Italian)
text: X is an italian restaurant in the riverside area.
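To make the "MR + sentence" training pairs concrete, the sketch below parses the textual MR notation used on the slide into a dialogue act type and slot-value pairs. The function name `parse_mr` and the exact grammar accepted (one act, comma-separated `slot=value` pairs, no nesting or quoting) are assumptions for illustration only.

```python
import re
from typing import Dict, Tuple


def parse_mr(mr: str) -> Tuple[str, Dict[str, str]]:
    """Split 'act(slot=value, ...)' into the act type and a slot-value dict."""
    match = re.fullmatch(r"\s*(\w+)\((.*)\)\s*", mr)
    if match is None:
        raise ValueError(f"not a dialogue act: {mr!r}")
    act, body = match.groups()
    slots: Dict[str, str] = {}
    for pair in body.split(","):
        pair = pair.strip()
        if not pair:
            continue  # tolerate an empty slot list or a trailing comma
        slot, _, value = pair.partition("=")
        slots[slot.strip()] = value.strip()
    return act, slots


# One unaligned training pair: no word in the sentence is linked to any slot.
mr = "inform(name=X, type=placetoeat, eattype=restaurant, area=riverside, food=Italian)"
sentence = "X is an italian restaurant in the riverside area."
training_pair = (parse_mr(mr), sentence)
```

Note that the pair carries no alignment information: which words express `area` or `food` is left for the learner to work out.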
I/O formats

• Input: a MR
  • dialogue acts: “inform” + slot-value pairs
  • other formats possible
• Output: deep-syntax dependency trees
  • based on TectoMT's t-layer, but very simplified
  • two attributes per tree node: t-lemma + formeme
  • using surface word order
• Conversion to plain text sentences: surface realization
  • Treex / TectoMT English synthesis (rule-based, later)

Example:
MR: inform(name=X, type=placetoeat, eattype=restaurant, area=riverside, food=Italian)
t-tree:
  be (v:fin)
    X-name (n:subj)
    restaurant (n:obj)
      italian (adj:attr)
      area (n:in+X)
        riverside (n:attr)
text: X is an Italian restaurant in the riverside area.
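The simplified t-tree from the example can be represented with a small data structure. This is a sketch under assumptions: the class name `TNode` and the explicit `order` field (used here to encode surface word order) are hypothetical and do not reflect the actual Treex/TectoMT data model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TNode:
    t_lemma: str   # deep lemma of the content word
    formeme: str   # morpho-syntactic form label, e.g. "n:subj"
    order: int     # assumed field: position in surface word order
    children: List["TNode"] = field(default_factory=list)


def nodes_in_surface_order(root: TNode) -> List[TNode]:
    """Collect all nodes of the tree and sort them by surface position."""
    stack, found = [root], []
    while stack:
        node = stack.pop()
        found.append(node)
        stack.extend(node.children)
    return sorted(found, key=lambda n: n.order)


# The example tree for "X is an Italian restaurant in the riverside area."
tree = TNode("be", "v:fin", 1, [
    TNode("X-name", "n:subj", 0),
    TNode("restaurant", "n:obj", 3, [
        TNode("italian", "adj:attr", 2),
        TNode("area", "n:in+X", 5, [TNode("riverside", "n:attr", 4)]),
    ]),
])
lemmas = [node.t_lemma for node in nodes_in_surface_order(tree)]
```

Function words ("is", "an", "in", "the") have no nodes of their own; they are implied by the formemes and would be restored by the surface realizer.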
Overall Structure of Our Sentence Planner

• A*-style search: “finding the path” from an empty tree to a full sentence plan tree
  • always expand the most promising candidate sentence plan
  • stop when candidates don't improve for a while
• Using two subcomponents:
  • candidate generator
    • churning out candidate sentence plan trees
    • given an incomplete candidate tree, add node(s)
  • scorer/ranker for the candidates
    • influences which candidate trees will be expanded (selects the most promising)

[Diagram: MR → Sentence planner (candidate generator, A* search, scorer) → Sentence plan (deep syntax tree)]
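The search loop described above can be sketched as a generic best-first search over candidates. This is an illustration under assumptions, not the planner's actual implementation: it uses a single score per candidate rather than a separate cost and heuristic, the `patience` stopping rule is one possible reading of "stop when candidates don't improve for a while", and the toy candidates are sorted tuples of words rather than trees.

```python
import heapq
from itertools import count
from typing import Callable, Hashable, Iterable


def best_first_search(start: Hashable,
                      successors: Callable[[Hashable], Iterable[Hashable]],
                      score: Callable[[Hashable], float],
                      patience: int = 3,
                      max_iter: int = 1000) -> Hashable:
    """Always expand the best-scoring open candidate; stop after
    `patience` expansions in a row that did not beat the best score."""
    tie = count()  # tie-breaker so the heap never compares candidates directly
    open_list = [(-score(start), next(tie), start)]
    seen = {start}
    best, best_score, since_improved = start, float("-inf"), 0
    for _ in range(max_iter):
        if not open_list or since_improved >= patience:
            break
        neg_score, _, candidate = heapq.heappop(open_list)
        if -neg_score > best_score:
            best, best_score, since_improved = candidate, -neg_score, 0
        else:
            since_improved += 1
        for nxt in successors(candidate):
            if nxt not in seen:
                seen.add(nxt)
                heapq.heappush(open_list, (-score(nxt), next(tie), nxt))
    return best


# Toy problem: grow a set of words towards a hypothetical target set.
VOCAB = ["be", "X-name", "restaurant", "bar"]
TARGET = {"be", "X-name", "restaurant"}


def toy_successors(cand):
    return [tuple(sorted(cand + (w,))) for w in VOCAB if w not in cand]


def toy_score(cand):
    return len(TARGET & set(cand)) - len(set(cand) - TARGET)


result = best_first_search((), toy_successors, toy_score)
```

In the planner itself, the successor function would be the candidate generator and the score would come from the trained scorer.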
Candidate generator

• Given a candidate plan tree, generate its successors by adding 1 node (at every possible place)
• “possible places” must be limited in practice
  • using combination of things seen in training data
    • parent–child
    • t-lemma + formeme
    • number of children, tree size …

[Figure: example expansions of t-trees. The empty tree is expanded to the roots be, recommend, or serve (all v:fin); the tree with root be (v:fin) is expanded by adding restaurant (n:obj), X-name (n:subj), or restaurant (n:subj); later steps add a further child, e.g. X-name (n:subj) together with restaurant (n:obj) or bar (n:obj).]
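A successor function of this kind can be sketched as follows. The encoding is an assumption made for illustration: a tree is a tuple of `(parent index, (t-lemma, formeme))` entries, and the only limit applied is a table of parent-child combinations (the hypothetical `allowed` dictionary, standing in for what was seen in training data) plus a maximum tree size.

```python
from typing import Dict, List, Set, Tuple

Node = Tuple[str, str]                 # (t-lemma, formeme)
Tree = Tuple[Tuple[int, Node], ...]    # (parent index, node); -1 = attached to the root
ROOT: Node = ("<root>", "")            # technical root, not a real word


def successors(tree: Tree,
               allowed: Dict[Node, Set[Node]],
               max_size: int = 10) -> List[Tree]:
    """Return all trees obtained by adding exactly one allowed node."""
    if len(tree) >= max_size:
        return []
    if not tree:  # empty tree: only a top-level node can be added
        return [((-1, child),) for child in sorted(allowed.get(ROOT, ()))]
    result = []
    for parent_idx, (_, parent) in enumerate(tree):
        for child in sorted(allowed.get(parent, ())):
            result.append(tree + ((parent_idx, child),))
    return result


# Hypothetical parent-child combinations "seen in training data".
allowed = {
    ROOT: {("be", "v:fin"), ("recommend", "v:fin"), ("serve", "v:fin")},
    ("be", "v:fin"): {("X-name", "n:subj"), ("restaurant", "n:obj"),
                      ("restaurant", "n:subj")},
}
first_step = successors((), allowed)
second_step = successors(first_step[0], allowed)
```

This sketch does not prevent adding the same child twice under one parent; the limits on number of children and tree shape mentioned above would prune such candidates.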
Scorer

• a function: sentence plan tree t, MR m → real-valued score
  • describes the fitness of t for m

How to describe the fitness? Features:
• occurrence of input DA slots + t-lemmas / formemes
• tree shape
• tree edges (parent-child)
• …
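A feature function feat(t, m) along these lines might look as follows. The concrete feature templates, their string names, and the tree encoding (a tuple of `(parent index, (t-lemma, formeme))` entries) are assumptions chosen for illustration; the slides list only the feature families.

```python
from collections import Counter
from typing import Iterable, Tuple

Node = Tuple[str, str]                 # (t-lemma, formeme)
Tree = Tuple[Tuple[int, Node], ...]    # (parent index, node); -1 = attached to the root


def extract_features(tree: Tree, mr_slots: Iterable[str]) -> Counter:
    """Map a (tree, MR) pair to sparse counts of hypothetical features."""
    feats: Counter = Counter()
    feats["size"] = len(tree)  # crude tree-shape feature
    lemmas = [node[0] for _, node in tree]
    # Co-occurrence of input DA slots with t-lemmas in the tree.
    for slot in mr_slots:
        for lemma in lemmas:
            feats[f"slot+lemma:{slot}+{lemma}"] += 1
    for parent_idx, (lemma, formeme) in tree:
        parent = tree[parent_idx][1][0] if parent_idx >= 0 else "<root>"
        feats[f"edge:{parent}>{lemma}"] += 1   # parent-child edge
        feats[f"formeme:{formeme}"] += 1
    return feats


tree = ((-1, ("be", "v:fin")), (0, ("X-name", "n:subj")))
feats = extract_features(tree, ["name"])
```

Because the features are sparse counts keyed by strings, the scorer only needs weights for feature names that actually occur.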
Perceptron scorer

Basic form
• $\mathrm{score} = w^{\top} \cdot \mathrm{feat}(t, m)$
• Training:
  • given $m$, generate the best tree $t_{\mathrm{top}}$ with current weights
  • update weights if $t_{\mathrm{top}} \neq t_{\mathrm{gold}}$ (gold-standard)
• Update: $w = w + \alpha \cdot (\mathrm{feat}(t_{\mathrm{gold}}, m) - \mathrm{feat}(t_{\mathrm{top}}, m))$

Our improvements (trying to guide the search on incomplete trees)
• Updates based on partial trees
• Estimating future value of the trees
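The basic update rule above can be illustrated with a minimal sketch. It assumes sparse feature counts stored in dictionaries and a fixed list of candidate trees; the slides describe generating the best tree by search, so the candidate list, the `toy_feat` feature function, and the tuple encoding of trees are hypothetical simplifications, not the system's actual implementation.

```python
from collections import Counter


def score(weights, features):
    """Linear score w^T · feat(t, m) over sparse feature counts."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())


def perceptron_update(weights, feat_gold, feat_top, alpha=0.1):
    """Apply w = w + alpha * (feat(t_gold, m) - feat(t_top, m)) in place."""
    for name in set(feat_gold) | set(feat_top):
        delta = feat_gold.get(name, 0.0) - feat_top.get(name, 0.0)
        if delta:
            weights[name] = weights.get(name, 0.0) + alpha * delta
    return weights


def train_step(weights, candidates, gold, feat, m, alpha=0.1):
    """Pick the best-scoring candidate; update only if it differs from gold."""
    top = max(candidates, key=lambda t: score(weights, feat(t, m)))
    if top != gold:
        perceptron_update(weights, feat(gold, m), feat(top, m), alpha)
    return top


def toy_feat(tree, da):
    """Hypothetical features: one count per (input slot, tree node) pair."""
    return Counter((slot, node) for slot in da for node in tree)


if __name__ == "__main__":
    da = ("name", "area")                        # stands in for the input m
    gold = ("be", "restaurant", "riverside")     # gold-standard tree (toy encoding)
    other = ("be", "restaurant")                 # a competing candidate
    weights = {}
    for epoch in range(2):
        top = train_step(weights, [other, gold], gold, toy_feat, da)
        print(epoch, top == gold, weights)
```

Only features on which the gold and top trees differ receive a non-zero update, so a correct prediction leaves the weights untouched.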

Evaluation of Our Sentence Planner

Data
• Restaurant recommendations from the BAGEL generator
  • restaurant location, food type, etc.
• just 404 utterances for 202 DAs

Results
• basic setup 54.24% BLEU, best version 59.89%
• less than BAGEL's ~67% BLEU… but:
  • we do not use alignments
  • our generator decides itself what to include
• outputs mostly fluent and meaningful
• problems:
  • repeated/missing/irrelevant information on the output
  • slooooow, doesn't scale very well

Example Outputs

Input DA: inform(name=X-name, type=placetoeat, eattype=restaurant, area=riverside, food=French)
Reference: X is a French restaurant on the riverside.
Generated: X is a French restaurant in the riverside area which serves French food.

Input DA: inform(name=X-name, type=placetoeat, eattype=restaurant, area=citycentre, near=X-near, food=“Chinese takeaway”, food=Japanese)
Reference: X is a Chinese takeaway and Japanese restaurant in the city centre near X.
Generated: X is a Japanese restaurant in the centre of town near X and X.

Input DA: inform(name=X-name, type=placetoeat, area=riverside, near=X-near, eattype=restaurant)
Reference: X restaurant is near X on the riverside.
Generated: X is a restaurant in the riverside area near X.

Input DA: inform(name=X-name, type=placetoeat, area=X-area, pricerange=moderate, eattype=restaurant)
Reference: X is a moderately priced restaurant in X.
Generated: X is a restaurant in the X area.
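
To make the input format in these examples concrete, here is a small sketch that parses a dialogue act string of the form `type(slot=value, ...)`. The function name `parse_da` and the choice of a list of pairs as the result are assumptions made for illustration; they are not taken from the BAGEL data tools or from the generator described here.

```python
import re


def parse_da(text):
    """Parse 'type(slot=value, ...)' into (da_type, [(slot, value), ...])."""
    match = re.fullmatch(r"\s*(\w+)\((.*)\)\s*", text, flags=re.DOTALL)
    if match is None:
        raise ValueError(f"not a dialogue act: {text!r}")
    da_type, body = match.groups()
    slots = []
    # A value is either quoted (straight or curly quotes) or a run without commas.
    pattern = r'(\w+)\s*=\s*(?:["“]([^"”]*)["”]|([^,]+))'
    for slot, quoted, bare in re.findall(pattern, body):
        slots.append((slot, quoted if quoted else bare.strip()))
    return da_type, slots


if __name__ == "__main__":
    da = ("inform(name=X-name, type=placetoeat, eattype=restaurant, "
          "area=citycentre, near=X-near, food=“Chinese takeaway”, food=Japanese)")
    da_type, slots = parse_da(da)
    print(da_type)
    for slot, value in slots:
        print(f"  {slot} = {value}")
```

A list of pairs is used rather than a dictionary because a slot can occur more than once, as `food` does in the second example.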

Surface Realization Examples
• Also various input formats, at least output is always text
• From handcrafted to different trainable realizers
• Also including our own (developed here at ÚFAL): Treex / TectoMT realizer
  • actually handcrafted for the most part

Grammar-based Realizers (90's): KPML, FUF/SURGE

KPML
• General purpose, multilingual
• Systemic Functional Grammar

    (EXAMPLE
      :NAME EX-SET-1
      :TARGETFORM "It is raining cats and dogs."
      :LOGICALFORM
        (A / AMBIENT-PROCESS :LEX RAIN
           :TENSE PRESENT-CONTINUOUS :ACTEE
           (C / OBJECT :LEX CATS-AND-DOGS :NUMBER MASS))
    )

FUF/SURGE
• General purpose
• Functional Unification Grammar

Grammar-based Realizer: OpenCCG
• General purpose, multi-lingual
• Combinatory Categorial Grammar
• Used in several projects
• With statistical enhancements

Procedural Realizer: SimpleNLG
• General purpose
• English, adapted to several other languages
• Java implementation (procedural)

    Lexicon lexicon = new XMLLexicon("my-lexicon.xml");
    NLGFactory nlgFactory = new NLGFactory(lexicon);
    Realiser realiser = new Realiser(lexicon);
    SPhraseSpec p = nlgFactory.createClause();
    p.setSubject("Mary");
    p.setVerb("chase");
    p.setObject("the monkey");
    p.setFeature(Feature.TENSE, Tense.PAST);
    String output = realiser.realiseSentence(p);
    System.out.println(output);
    >>> Mary chased the monkey.

Trainable Realizers: Overgenerate and Rank
• Require a handcrafted realizer, e.g. CCG realizer
• Input underspecified → more outputs possible
• Overgenerate
• Then use a statistical reranker
• Ranking according to:
  • n-gram models (NITROGEN, HALOGEN)
  • Tree models (XTAG grammar – FERGUS)
  • Predicted Text-to-Speech quality (Nakatsu and White)
  • Personality traits (extraversion, agreeableness… – CRAG) + alignment (repeating words uttered by dialogue counterpart)
• Provides variance, but at a greater computational cost
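
The reranking step can be sketched with a tiny bigram language model. This is only an illustration of ranking by n-gram score: the hand-written candidate list stands in for the output of a handcrafted realizer, and the add-one smoothing and length normalization are assumptions, not details of NITROGEN or HALOGEN.

```python
import math
from collections import Counter


def train_bigram_lm(sentences):
    """Count bigram histories and bigrams over whitespace-tokenized sentences."""
    histories, bigrams, vocab = Counter(), Counter(), set()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        vocab.update(tokens)
        histories.update(tokens[:-1])
        bigrams.update(zip(tokens, tokens[1:]))
    return histories, bigrams, len(vocab)


def log_prob(candidate, lm):
    """Length-normalized log-probability with add-one smoothing."""
    histories, bigrams, vocab_size = lm
    tokens = ["<s>"] + candidate.lower().split() + ["</s>"]
    total = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        total += math.log((bigrams[(prev, cur)] + 1) / (histories[prev] + vocab_size))
    return total / (len(tokens) - 1)


def rerank(candidates, lm):
    """Sort overgenerated candidates from most to least probable."""
    return sorted(candidates, key=lambda c: log_prob(c, lm), reverse=True)


if __name__ == "__main__":
    corpus = [
        "X is a restaurant in the riverside area .",
        "X is a French restaurant on the riverside .",
    ]
    lm = train_bigram_lm(corpus)
    candidates = [
        "restaurant X a is riverside the in .",
        "X is a restaurant on the riverside .",
    ]
    for text in rerank(candidates, lm):
        print(round(log_prob(text, lm), 3), text)
```

Every candidate has to be scored, which is where the extra computational cost mentioned above comes from.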

Trainable Realizers: Syntax-Based
• StuMaBa: general realizer based on SVMs
• Pipeline:
  ↓ Deep syntax/semantics
  ↓ surface syntax
  ↓ linearization
  ↓ morphologization

Treex / TectoMT Surface Realizer
• Domain-independent
• We use it for our experiments (TGEN1)
  • analysis → synthesis on BAGEL data = 89.79% BLEU
• Pipeline approach
• Mostly simple, single-purpose, rule-based modules (blocks)
• Word inflection: statistical (Flect)
• Gradual transformation of deep trees into surface dependency trees
• Surface trees are then simply linearized

Treex / TectoMT Surface Realization Example
• Realizer steps (simplified):
  • Copy the deep tree (sentence plan)
  • Determine morphological agreement
  • Add prepositions and conjunctions
  • Add articles
  • Compound verb forms (add auxiliaries)
  • Punctuation
  • Word inflection
  • Capitalization

[Figure: t-tree, zone=en_gen, with nodes jump (v:fin), cat (n:subj), window (n:through+X)]
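
The simplified steps above can be mimicked by a toy pipeline of small functions applied in sequence. This sketch is not the Treex API: the node dictionaries, the block names, the always-definite article, the present-tense "-s" inflection, and the assumption that nodes arrive already in surface order are all hypothetical simplifications. The compound verb step is omitted because the example needs no auxiliaries.

```python
def copy_tree(deep_nodes):
    """Copy the deep tree so later blocks can modify it freely."""
    return [dict(node) for node in deep_nodes]


def determine_agreement(nodes):
    """Mark the finite verb as 3rd person singular if a subject noun exists."""
    has_subject = any(n["formeme"] == "n:subj" for n in nodes)
    for n in nodes:
        if n["formeme"] == "v:fin" and has_subject:
            n["agreement"] = "3sg"
    return nodes


def add_prepositions(nodes):
    """Insert the preposition encoded in formemes such as 'n:through+X'."""
    out = []
    for n in nodes:
        role = n["formeme"].partition(":")[2]
        if role.endswith("+X"):
            out.append({"lemma": role[:-2], "formeme": "aux"})
        out.append(n)
    return out


def add_articles(nodes):
    """Toy rule: put a definite article before every noun."""
    out = []
    for n in nodes:
        if n["formeme"].startswith("n:"):
            out.append({"lemma": "the", "formeme": "aux"})
        out.append(n)
    return out


def inflect(nodes):
    """Toy inflection: add -s to 3sg verbs, otherwise keep the lemma."""
    for n in nodes:
        n["form"] = n["lemma"] + "s" if n.get("agreement") == "3sg" else n["lemma"]
    return nodes


def linearize(nodes):
    """Join word forms, add final punctuation, capitalize the first letter."""
    text = " ".join(n["form"] for n in nodes) + "."
    return text[0].upper() + text[1:]


BLOCKS = [copy_tree, determine_agreement, add_prepositions, add_articles, inflect]


def realize(deep_nodes):
    nodes = deep_nodes
    for block in BLOCKS:
        nodes = block(nodes)
    return linearize(nodes)


if __name__ == "__main__":
    tree = [
        {"lemma": "cat", "formeme": "n:subj"},
        {"lemma": "jump", "formeme": "v:fin"},
        {"lemma": "window", "formeme": "n:through+X"},
    ]
    print(realize(tree))  # prints: The cat jumps through the window.
```

Keeping each step as a separate single-purpose function mirrors the block-based design described for the realizer: blocks can be reordered or replaced (for example, swapping the toy inflection for a statistical one) without touching the rest.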