Candidacy Exam Content Planning in Generation Pablo A. Duboue - PowerPoint PPT Presentation

Candidacy Exam
Content Planning in Generation
Pablo A. Duboue
December 11th, 2001
Computer Science Department, Columbia University in the City of New York
Committee: Kathleen R. McKeown, Rebecca J. Passoneau, Nina Wacholder

The Problem: Content Planning
Generation of multisentential text.
Knowledge base:
breed(Fido,CockerSpaniel) loves(John,dog) owner(Fido,John) is-a(Fido,dog) located(John,NYC)
loves(John,Fido) aunt(John,Marie) located(Marie,Paris) is-a(NYC,city) is-a(Paris,city)
Request: Tell me about Fido.
Compare:
– Fido is a dog. Marie lives in Paris. Fido is a CockerSpaniel. New York is a city . . .
– Fido is a CockerSpaniel dog, owned by John. John loves Fido because he loves dogs. He lives in the city of New York . . .
• Given certain information, structure a subset of it.

The Problem: Content Planning
• The Tasks
– Content Selection: choosing the right bits of information to include in the final output.
– Content Structuring: organizing the data in some sensible way.
• The Goals
– Coherence (Structuring)
– Conciseness (Both)
– Appropriateness (Selection)
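The two tasks can be illustrated on the Fido example with a minimal sketch. Everything beyond the facts themselves is an assumption made for illustration: the triple encoding, the `select` heuristic (keep facts that mention the topic) and the fixed predicate order used by `structure` are hypothetical and do not come from any of the surveyed papers.

```python
# Facts from the "Tell me about Fido" example, encoded as (predicate, arg1, arg2).
FACTS = [
    ("breed", "Fido", "CockerSpaniel"),
    ("loves", "John", "dog"),
    ("owner", "Fido", "John"),
    ("is-a", "Fido", "dog"),
    ("located", "John", "NYC"),
    ("loves", "John", "Fido"),
    ("aunt", "John", "Marie"),
    ("located", "Marie", "Paris"),
    ("is-a", "NYC", "city"),
    ("is-a", "Paris", "city"),
]

# Assumed presentation order of predicates (hypothetical, schema-like).
ORDER = ["is-a", "breed", "owner", "loves"]


def select(facts, topic):
    """Content selection: keep only the facts that mention the topic."""
    return [fact for fact in facts if topic in fact[1:]]


def structure(facts):
    """Content structuring: sort facts by the assumed predicate order."""
    def rank(fact):
        return ORDER.index(fact[0]) if fact[0] in ORDER else len(ORDER)
    return sorted(facts, key=rank)


plan = structure(select(FACTS, "Fido"))
for fact in plan:
    print(fact)
```

Even this crude selection drops the facts about Marie and Paris, which is the difference between the two texts compared in the example. It also drops loves(John,dog), which the second text does use, so a realistic selector needs more than a topic match.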

The Papers
This candidacy exam covers 29 papers:
• Milestones: in the search for a solution to the content planning problem.
• Prestigious: highly cited papers published in well-established journals.
• Innovative: remarkably new approaches to the problem.

[Figure: timeline of the 29 papers from 1988 to 2001, grouped under the area labels RST, C.P., MM and AI. Paper labels shown: rsttheory2, planningadvisory, parsimonious, incompatiblerst, setrst, problemrst, bottomup, instructional, stochasticsearch, multilingualplanning, edpo, centering, knowledgeselection, rulesselection, constraintsatisfaction, edp, domaindependent, ooplanning, cognitive, sdrt, coherentvisual, rhetoricaldialog, hypertextdialogue, layoutrstcl, injuries, nag, dpocl, planningargumentative, reinterpretationarch.]

Issues
The perspectives I use to analyze the papers:
• Input
• Output
• Algorithm
• Relations
• Intentions
• Domain Communication Knowledge
• Surface
• Multimedia
• Centering
• Tree Structure

Input
What should the input to a Content Planner be?
• Should RST relations be part of it?
– Yes: (Marcu, ’97), (Mellish et al., ’98). This looks limited to problems that make no real use of the communicative power of the rhetorical structure.
– No: this is what most content planners (Young and Moore, ’94) and architectures (Cahill et al., ’00) do; they find the relations while structuring the document. This lets them find relations that hold as a result of the structure (presentational relations).
• Should it include the whole Knowledge Base or just some part?
– Total Access: this happens mainly with generation-specific KBs.
– Partial Access: for example, the idea of views (Lester and Porter, ’97). This allows generation to be more detached from the KB.

Output
What should the output of a Content Planner be?
• Should it be a tree, a sequence, or an equational system?
– Tree: most content planners use trees as output (Cahill et al., ’00).
– Sequence: more restricted than trees, but sequences may be enough for several applications (Huang, ’94), (Mellish et al., ’98). I find them very appealing. Further stages can do local revision, if needed.
– Equational System: more expressive than trees (Danlos et al., ’01). They seem to solve some of the problems involved in trees.
• Should it include textual levels (paragraphs, etc.)?
– Yes: the content planner has the high-level perspective to do so.
– No: text structure may be incompatible with rhetorical structure (Bouayad-Agha et al., ’00), so a new task should be spawned. The problem may actually lie in the definition of what rhetorical structure is.
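To make the tree versus sequence contrast concrete, here is a minimal sketch of a rhetorical tree and its flattening into a sequence. The `Leaf` and `Node` classes, the `linearize` function and the relation names are assumptions made for this illustration, not the representation used by any of the cited systems.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Leaf:
    """A single fact to be conveyed."""
    fact: str


@dataclass
class Node:
    """A rhetorical relation holding over its children (names are illustrative)."""
    relation: str
    children: List[Union["Node", Leaf]]


def linearize(node):
    """Flatten a tree into a left-to-right sequence of facts, dropping the relations."""
    if isinstance(node, Leaf):
        return [node.fact]
    facts = []
    for child in node.children:
        facts.extend(linearize(child))
    return facts


tree = Node("elaboration", [
    Leaf("is-a(Fido,dog)"),
    Node("cause", [Leaf("loves(John,Fido)"), Leaf("loves(John,dog)")]),
])
sequence = linearize(tree)
print(sequence)
```

The sequence keeps the ordering but loses the relation labels and the nesting, which is the sense in which it is more restricted than the tree.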

Algorithm
What should the process inside a Content Planner be?
• Should it actually be a planning process?
– Yes: (Moore and Paris, ’92) is a well-motivated example of a complex planning process. Also (Huang, ’94), (Ansari and Hirst, ’98), (Kosseim and Lapalme, ’00). Full planning is the “real thing”, although it is expensive and requires modeling a wide range of issues.
– No: other alternatives are macro-expansion (Lester and Porter, ’97) and rule systems (Reiter et al., ’97), as pointed out in (Rambow, ’99). Simpler architectures are always appealing.
• With which operators?
Rhetorical Relations Intentional Intentions Domain Communication Language Pragmatical

Algorithm
• Should it be top-down or bottom-up?
– Top-Down: speed and ease of understanding motivate building top-down planners (Young and Moore, ’94).
– Bottom-Up: (Marcu, ’97) sees the whole process as linking facts by means of RST relations given in the input. An interesting perspective, although too shallow to be far-reaching.
– Hybrid: (Huang, ’94) combines a top-down (planned) approach with a bottom-up opportunistic perspective based on centering. The top-down module has too high a priority for this to be a real hybrid.
• Other approaches
– (Mellish et al., ’98): stochastic search (e.g., genetic algorithms).
– (Power, ’00): constraint satisfaction.
– (Knott et al., ’97): defeasible rules.
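The top-down option can be sketched as recursive goal decomposition. The operator table and goal names below are hypothetical, invented for the Fido example. Since no preconditions or effects are checked, this is closer to macro-expansion than to the full planning of the cited systems.

```python
# Hypothetical decomposition operators: each goal maps to its ordered sub-goals.
# Anything that is not a key of this table is treated as a primitive fact.
OPERATORS = {
    "describe(Fido)": ["identify(Fido)", "describe-owner(Fido)"],
    "identify(Fido)": ["is-a(Fido,dog)", "breed(Fido,CockerSpaniel)"],
    "describe-owner(Fido)": ["owner(Fido,John)", "loves(John,Fido)"],
}


def expand(goal):
    """Top-down expansion: replace a goal by its sub-goals until only facts remain."""
    if goal not in OPERATORS:
        return goal  # primitive: a fact from the knowledge base
    return (goal, [expand(sub_goal) for sub_goal in OPERATORS[goal]])


plan = expand("describe(Fido)")
print(plan)
```

A bottom-up planner would instead start from the four facts and look for relations that let it join them into larger spans.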

Relations
The existence of rhetorical relations holding between spans of text is an agreed fact.
• What are the sizes of those spans?
– (Mann and Thompson, ’88), (Stent, ’00).
• What are the relations themselves? How many?
– (Knott and Dale, ’93), (Hovy and Maier, ’95).
• Is there a fixed number?
– Yes: (Rambow, ’99), (Knott and Dale, ’93).
– No: (Mann and Thompson, ’88), (Hovy and Maier, ’95).

Intentions
The beliefs of the hearer (H) and the intentions of the speaker (S) are important to generate well-motivated discourse.
• How can they be represented?
– (Moore and Paris, ’92).
• Should we model degrees of belief?
– Yes: (Zukerman et al., ’96), (Walker and Rambow, ’94), (Rambow, ’99).
– No: (Moore and Paris, ’92), (Young and Moore, ’94).

Domain Communication Knowledge (DCK)
Some discourses are completely shaped by their use and environment.
• The concept of DCK and its necessity are introduced in (Kittredge et al., ’91).
• (Rambow, ’99) proposes an integrated approach to deal with DCK and other issues.
• How is this knowledge represented?
– Implicit: in most cases.
– Explicit: (Huang, ’94) (Proof Communicative Acts); (Lester and Porter, ’97) (Schemas).
• Is it distinguished from domain knowledge?

Surface
The planner’s decisions may relate to lower-level issues than the mere rhetorical tree.
• Where should the connectives (e.g., cue phrases) be defined?
• Should the particularities in the realization of given phrases (active/passive, to-infinitive/gerund) be synchronized with rhetorical decisions?
• (Kosseim and Lapalme, ’00) gives a very detailed analysis of the issues involved in selecting syntactic forms given a communicative context.
• (Rambow, ’99) provides a framework that allows the content planner to synthesize decisions at different levels of abstraction, as it sees fit.
• (Bouayad-Agha et al., ’00) analyzes possible incompatibilities between the text structure (paragraphs, etc.) and the rhetorical structure.

Multimedia/Multilingual/Dialog/Layout
Planning different types of content.
• How compatible are textual and non-textual materials from the planning perspective?
• How does layout affect the communicative process?
• How do different languages affect the structuring of the message?
• (Dale et al., ’97), hyperlinks; (Kamps et al., ’01), layout; (Stent, ’00), dialog; (Marcu et al., ’00), multilingual; (Power, ’00), (Bouayad-Agha et al., ’00), text structure; (André and Rist, ’95), multimedia.

Centering
Local focus is an agreed ingredient in the coherence of texts. The relation between centering theory and content planning is made explicit in (Kibble and Power, ’99).
• How explicitly should centering be represented and dealt with?
• For understanding, centering is a local issue; is that also the case for generation?
• (Mellish et al., ’98) uses some idea of local focus for scoring possible candidates during the genetic search.
• (Huang, ’94) uses centering to drive its bottom-up planner, but this treats it as a local behaviour.
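One way to picture local focus as a scoring signal is to count how often adjacent facts share an entity. The `continuity` function below is a crude proxy assumed for illustration. It is neither the transition typology of centering theory nor the actual scoring function of (Mellish et al., ’98).

```python
def entities(fact):
    """Entities mentioned by a (predicate, arg1, arg2) fact."""
    return set(fact[1:])


def continuity(sequence):
    """Count adjacent pairs of facts that share at least one entity."""
    return sum(
        1
        for current, following in zip(sequence, sequence[1:])
        if entities(current) & entities(following)
    )


# Two candidate orderings built from the Fido example.
scattered = [
    ("is-a", "Fido", "dog"),
    ("located", "Marie", "Paris"),
    ("breed", "Fido", "CockerSpaniel"),
    ("is-a", "NYC", "city"),
]
focused = [
    ("is-a", "Fido", "dog"),
    ("breed", "Fido", "CockerSpaniel"),
    ("owner", "Fido", "John"),
    ("located", "John", "NYC"),
]

print(continuity(scattered), continuity(focused))
```

The score only looks at neighbouring facts, which mirrors the question raised above: a signal this local cannot by itself account for the global organization of the text.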

Tree Structure
There is some agreement that the output of a Content Planner should be a tree. However, the many incompatibility results suggest that this may not be the case. What other options do we have?
• (Mann and Thompson, ’88) makes a strong argument for the rhetorical structure being a tree in most texts. However, their JOINT relation seems to be an ad hoc way to complete the tree.
• (Danlos et al., ’01) provides a good discussion in the other direction, i.e., that a tree is not enough.
• Incompatibility results: (Marcu et al., ’00), multilingual; (Bouayad-Agha et al., ’00), text structure; (Mellish et al., ’98), cross-over.
• (Ansari and Hirst, ’98), (Rambow, ’99), (Lester and Porter, ’97), (Marcu, ’97), (Kittredge et al., ’91).

Conclusion
Normally a content planner has to integrate the following contrasting issues:
• DCK
• Intentional
• Rhetoric
• Semantic
Each particular planner explores some directions according to the particularities of the problem at hand. As (Rambow, ’99) points out, a general framework for dealing with the problem will require powerful, aggregated operators. Given such operators, an appealing way to combine them is to think of them as adding constraints to the search space in a constraint satisfaction setting (as sketched in (Power, ’00)).
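The idea of operators adding constraints to a search space can be sketched with a toy ordering problem. The constraint constructors `before` and `adjacent`, the particular constraints and the brute-force enumeration are all assumptions made for illustration. This is not the method of (Power, ’00), and a real system would use a proper constraint solver rather than enumerating permutations.

```python
from itertools import permutations

FACTS = [
    "is-a(Fido,dog)",
    "breed(Fido,CockerSpaniel)",
    "owner(Fido,John)",
    "located(John,NYC)",
]


def before(first, second):
    """Constraint: `first` must appear somewhere before `second`."""
    return lambda order: order.index(first) < order.index(second)


def adjacent(first, second):
    """Constraint: the two facts must be next to each other."""
    return lambda order: abs(order.index(first) - order.index(second)) == 1


# Hypothetical constraints, each imagined as contributed by a different operator.
CONSTRAINTS = [
    before("is-a(Fido,dog)", "breed(Fido,CockerSpaniel)"),    # identify, then elaborate
    before("owner(Fido,John)", "located(John,NYC)"),          # introduce John first
    adjacent("owner(Fido,John)", "located(John,NYC)"),        # keep John in focus
    before("breed(Fido,CockerSpaniel)", "owner(Fido,John)"),  # finish Fido before John
]


def solutions(facts, constraints):
    """Enumerate every ordering of the facts that satisfies all constraints."""
    return [
        order
        for order in permutations(facts)
        if all(constraint(order) for constraint in constraints)
    ]


# Each added constraint shrinks the set of admissible orderings.
for count in range(len(CONSTRAINTS) + 1):
    print(count, len(solutions(FACTS, CONSTRAINTS[:count])))
```

With no constraints all 24 orderings are admissible; with the four assumed constraints a single ordering remains.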
