Example Sentences and Making them Useful for Theoretical and Computational Linguistics
Stefan Müller
Email: Stefan.Mueller@cl.uni-bremen.de
http://www.cl.uni-bremen.de/~stefan/
DGfS-Jahrestagung Mainz, 27.02.2004

Outline
• Why test suites / data collections?
• What do we have?
• B-Ger-TS
• Demo
• Suggestions for using test suites / data collections
• Guidelines
• Conclusions

Why are Test Suites Needed for NLP?
• Language is very complex → minimal changes to a grammar may have unexpected effects
• Check improvement in grammar development
  – coverage
  – processing speed
  – memory requirements
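The kind of check this slide motivates can be sketched as follows. This is a minimal illustration only: `run_suite`, `regressions`, and the toy parsers `old` and `new` are hypothetical names, not part of [incr TSDB()] or any grammar system mentioned in the talk. A parser is assumed to be a function that returns the number of analyses for a sentence. Memory requirements are not measured here.

```python
import time
from typing import Callable, Dict, List, Tuple

# A test item is (sentence, expected grammaticality).
Item = Tuple[str, bool]
Parser = Callable[[str], int]  # hypothetical: returns the number of analyses


def run_suite(parse: Parser, items: List[Item]) -> Dict[str, float]:
    """Run a parser over all items; report correct judgments and elapsed time."""
    correct = 0
    start = time.perf_counter()
    for sentence, grammatical in items:
        accepted = parse(sentence) > 0
        if accepted == grammatical:
            correct += 1
    elapsed = time.perf_counter() - start
    return {"items": len(items), "correct": correct, "seconds": elapsed}


def regressions(parse_old: Parser, parse_new: Parser, items: List[Item]) -> List[str]:
    """Sentences the old grammar version judged correctly but the new one does not."""
    broken = []
    for sentence, grammatical in items:
        old_ok = (parse_old(sentence) > 0) == grammatical
        new_ok = (parse_new(sentence) > 0) == grammatical
        if old_ok and not new_ok:
            broken.append(sentence)
    return broken


# Toy stand-ins for two grammar versions (assumptions for illustration only).
items = [("die alte Wand", True), ("der alte Wand", False)]
old = lambda s: 1 if s == "die alte Wand" else 0  # accepts only the good phrase
new = lambda s: 1                                 # overgenerates after a "small" change

print(run_suite(old, items)["correct"], run_suite(new, items)["correct"])
print(regressions(old, new, items))
```

Running both grammar versions over the same items makes an unexpected side effect of a small change visible as a concrete list of sentences rather than as a vague impression.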

What Test Suites and Databases are There?
• Test suites developed in TSNLP (Oepen, Netter and Klein, 1997)
  – English
  – German
  – French
• Test suites that come with [incr TSDB()], which is part of the LKB (Copestake, 2002)
  – English (LinGO, CSLI)
  – German (VM, DFKI)
  – Spanish
  – Japanese
  – Norwegian
• Babel Test Suite
• A3-Datenbank in Tübingen (Sternefeld et al.)
• Others?

Why Should we Have Additional Ones? (I)
• Babel Test Suite is unsystematic, naturally grown from a diploma thesis
• TSNLP is very systematic:
  (1) a.   die alte Wand
      b. * der alte Wand
      c. * das alte Wand
      d. * des alte Wand
      e. * den alte Wand
      f. * dem alte Wand
      g. * die alte Wände
      h. * der alte Wände
      i. * das alte Wände
      j. * des alte Wände
      k. * den alte Wände
      l. * dem alte Wände
      m. * der alte Wänden
      n. * die alte Wänden
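A systematic paradigm like (1) can be generated mechanically instead of being typed by hand. The following is a minimal sketch, not the TSNLP tooling: the names `DETERMINERS`, `NOUN_FORMS`, `GRAMMATICAL`, and `paradigm` are hypothetical, and grammaticality is supplied as a hand-written set rather than computed. The full cross-product has 18 members, four more than the 14 shown in (1).

```python
from itertools import product

# Word forms taken from example (1); the adjective form is held constant.
DETERMINERS = ["die", "der", "das", "des", "den", "dem"]
NOUN_FORMS = ["Wand", "Wände", "Wänden"]
ADJECTIVE = "alte"

# Hand-listed grammatical pairs (assumption: supplied by the test suite author).
# With the adjective fixed as "alte", (1) marks only this combination as well-formed.
GRAMMATICAL = {("die", "Wand")}


def paradigm():
    """Yield (phrase, is_grammatical) for every determiner/noun-form pair."""
    for noun, det in product(NOUN_FORMS, DETERMINERS):
        yield f"{det} {ADJECTIVE} {noun}", (det, noun) in GRAMMATICAL


# Render in the usual notation: ungrammatical items carry a leading asterisk.
lines = [("" if ok else "* ") + phrase for phrase, ok in paradigm()]
for line in lines:
    print(line)
```

Generating the items this way guarantees that no cell of the paradigm is forgotten, which is the sense in which such a test suite is systematic.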

Why Should we Have Additional Ones? (II)
but it is only a part of what is needed:
• phenomena are missing
• There are many odd ungrammatical sentences that are relevant only in the context of a discussion of a particular analysis. Such examples are not in TSNLP. Examples:
  – Agreement as a head feature and coordination
  – Haider’s Designated Argument as a head feature and coordination of unergatives and unaccusatives

B-Ger-TS (I)
• B-Ger-TS developed from Babel-TS
• contains examples I gathered over the past ten years
• I started to systematize it and to cross-classify items with regard to phenomena
• extended the database with examples from the literature
• provided references to bibliographic sources
• eliminated lexical ambiguity

B-Ger-TS (II)
• verb position, scrambling, fronting and island data, extraposition, subjacency, ...
• coherent/incoherent constructions, complex predicates, particle verbs, control and raising, AcI constructions
• incomplete category fronting with adjectives and verbs, multiple frontings
• adjunction in the nominal and verbal area
  – attributive adjectives and participles
  – prepositional phrases
  – relative clauses
• free relative clauses
• left dislocation
• topic drop

B-Ger-TS (III)
• depictive secondary predicates
• passive in various forms (e.g., stative passive, dative passive, lassen passive)
• modal infinitives
• coordination
• and the interaction between all of this!
• items are cross-classified according to the phenomena
• retrieval with respect to various aspects is possible

Demo of TSDB

Suggestions for Using Test Suites / Data Collections
• All published grammar fragments should come with a list of the test suites used and the results. (Many already do, mainly those connected to the CSLI/DFKI groups.)
• example: http://www.cl.uni-bremen.de/Fragments/b-ger-gram.html
• Journal articles can be written and reviewed with reference to publicly available data collections.

The Format
• simple ASCII text
• a line with ‘;;;’ names a phenomenon, which applies until the next line with ‘;;;’

    ;;; Extraposition
    daß der Mann schläft, der stirbt. ;; Extraposition aus Subjekt
    Der Mann liebt Maria, der ihn verachtet. ;; Extraposition aus Subjekt im Vorfeld
    Den Mann liebt Maria, der ihn verachtet. ;; Extraposition aus Objekt im Vorfeld
    Daß Karl schläft, ist dem Mann aufgefallen, der ihn kennt. ;; @ nach Haider94

• everything that follows ‘;;’ and precedes ‘@’ is a comment
• everything that follows ‘@’ is the source of the example
• cross-classification of phenomena: list the phenomena separated by ‘+’

    ;;; Extraktion + w-Satz
    * daß ich nicht weiß, dieses Buch warum ich lesen sollte. ;; @GMueller98a:244
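The format rules above are simple enough to read with a few lines of code. The following is a minimal sketch under stated assumptions, not the tool used for B-Ger-TS: `Item`, `parse_suite`, and `by_phenomenon` are hypothetical names, each example is assumed to sit on one line with its ‘;;’ annotation trailing it, and a leading ‘*’ is read as the usual ungrammaticality mark.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Item:
    sentence: str
    grammatical: bool
    phenomena: List[str]
    comment: str = ""
    source: str = ""


def parse_suite(text: str) -> List[Item]:
    """Parse test suite text into items, tracking the current ';;;' phenomena."""
    items: List[Item] = []
    phenomena: List[str] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(";;;"):
            # Phenomenon header; '+' separates cross-classified phenomena.
            phenomena = [p.strip() for p in line[3:].split("+") if p.strip()]
            continue
        # Text after ';;' is the annotation: comment before '@', source after it.
        sentence, _, annotation = line.partition(";;")
        comment, _, source = annotation.partition("@")
        sentence = sentence.strip()
        if not sentence:
            continue  # annotation-only line, nothing to test
        grammatical = not sentence.startswith("*")
        if not grammatical:
            sentence = sentence[1:].strip()
        items.append(Item(sentence, grammatical, list(phenomena),
                          comment.strip(), source.strip()))
    return items


def by_phenomenon(items: List[Item], name: str) -> List[Item]:
    """Retrieve all items classified under the given phenomenon."""
    return [item for item in items if name in item.phenomena]


SAMPLE = """\
;;; Extraposition
daß der Mann schläft, der stirbt. ;; Extraposition aus Subjekt
Daß Karl schläft, ist dem Mann aufgefallen, der ihn kennt. ;; @ nach Haider94
;;; Extraktion + w-Satz
* daß ich nicht weiß, dieses Buch warum ich lesen sollte. ;; @GMueller98a:244
"""

items = parse_suite(SAMPLE)
for item in by_phenomenon(items, "w-Satz"):
    print(item.grammatical, item.sentence, item.source)
```

Because every item carries the full list of its phenomena, a cross-classified example is found under each of them, so the last sentence is retrieved for both "Extraktion" and "w-Satz".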

Lexical Ambiguity and Efficiency
Ambiguity in case does not hurt, but ambiguity in number does.
  (2) a. Will der Manager lachen?
      b. Will der Mann lachen?
Manager can also be a plural form, so Manager on its own projects to a full NP, and Manager lachen to a full VP and a sentence.
Even worse: If the verb has an optional object, we get unwanted ambiguities:
  (3) Will der Manager essen? (der = subject, Manager = object)
  (4) a. Will der Manager essen? → 307 passive edges
      b. Will der Mann essen? → 114 passive edges

Lexical Ambiguity and Usability of Test Suites (Grammatical Sentences)
ihr is ambiguous between the dative feminine pronoun, the second person plural pronoun, and the possessive pronoun. A theory/grammar that makes wrong claims about case could analyze (5) as a sentence with two nominatives.
  (5) Ihr helfen wir.
So the grammatical sentence could be parsed even though the theory assigns a wrong structure or wrong case values.
