Proc. of the 37th ACL (Assoc. for Computational Linguistics) (1999)


Efficient Parsing for Bilexical Context-Free Grammars and Head Automaton Grammars*

Jason Eisner
Dept. of Computer & Information Science
University of Pennsylvania
200 South 33rd Street, Philadelphia, PA 19104 USA
jeisner@linc.cis.upenn.edu

Giorgio Satta
Dip. di Elettronica e Informatica
Università di Padova
via Gradenigo 6/A, 35131 Padova, Italy
satta@dei.unipd.it

Abstract

Several recent stochastic parsers use bilexical grammars, where each word type idiosyncratically prefers particular complements with particular head words. We present $O(n^4)$ parsing algorithms for two bilexical formalisms, improving the prior upper bounds of $O(n^5)$. For a common special case that was known to allow $O(n^3)$ parsing (Eisner, 1997), we present an $O(n^3)$ algorithm with an improved grammar constant.

1 Introduction

Lexicalized grammar formalisms are of both theoretical and practical interest to the computational linguistics community. Such formalisms specify syntactic facts about each word of the language; in particular, the type of arguments that the word can or must take. Early mechanisms of this sort included categorial grammar (Bar-Hillel, 1953) and subcategorization frames (Chomsky, 1965). Other lexicalized formalisms include (Schabes et al., 1988; Mel'čuk, 1988; Pollard and Sag, 1994).

Besides the possible arguments of a word, a natural-language grammar does well to specify possible head words for those arguments. "Convene" requires an NP object, but some NPs are more semantically or lexically appropriate here than others, and the appropriateness depends largely on the NP's head (e.g., "meeting"). We use the general term bilexical for a grammar that records such facts. A bilexical grammar makes many stipulations about the compatibility of particular pairs of words in particular roles. The acceptability of "Nora convened the party" then depends on the grammar writer's assessment of whether parties can be convened.

Several recent real-world parsers have improved state-of-the-art parsing accuracy by relying on probabilistic or weighted versions of bilexical grammars (Alshawi, 1996; Eisner, 1996; Charniak, 1997; Collins, 1997). The rationale is that soft selectional restrictions play a crucial role in disambiguation.¹

The chart parsing algorithms used by most of the above authors run in time $O(n^5)$, because bilexical grammars are enormous (the part of the grammar relevant to a length-$n$ input has size $O(n^2)$ in practice). Heavy probabilistic pruning is therefore needed to get acceptable runtimes. But in this paper we show that the complexity is not so bad after all:

• For bilexicalized context-free grammars, $O(n^4)$ is possible.
• The $O(n^4)$ result also holds for head automaton grammars.
• For a very common special case of these grammars where an $O(n^3)$ algorithm was previously known (Eisner, 1997), the grammar constant can be reduced without harming the $O(n^3)$ property.

Our algorithmic technique throughout is to propose new kinds of subderivations that are not constituents. We use dynamic programming to assemble such subderivations into a full parse.

2 Notation for context-free grammars

The reader is assumed to be familiar with context-free grammars. Our notation fol-

* The authors were supported respectively under ARPA Grant N6600194-C-6043 "Human Language Technology" and Ministero dell'Università e della Ricerca Scientifica e Tecnologica project "Methodologies and Tools of High Performance Systems for Multimedia Applications."

¹ Other relevant parsers simultaneously consider two or more words that are not necessarily in a dependency relationship (Lafferty et al., 1992; Magerman, 1995; Collins and Brooks, 1995; Chelba and Jelinek, 1998).
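
As an informal illustration of the $O(n^5)$ figure quoted in the introduction, the sketch below counts the combinations that a naive lexicalized chart parser would enumerate: every way of splitting a span into two adjacent children, times every choice of head word in each child. This is only a counting exercise under simplifying assumptions (one nonterminal, any word may head any span containing it); the function names are hypothetical and it is not the algorithm presented in the paper.

```python
from math import comb


def naive_combination_count(n: int) -> int:
    """Count the (i, k, j, h, h2) tuples a naive lexicalized CKY loop visits.

    Assumption for illustration: spans are half-open over positions 0..n,
    the left child covers [i, k) and the right child covers [k, j).
    h is a candidate head position in the left child, h2 in the right child.
    """
    steps = 0
    for i in range(n):
        for k in range(i + 1, n):
            for j in range(k + 1, n + 1):
                for h in range(i, k):        # head word of the left child
                    for h2 in range(k, j):   # head word of the right child
                        steps += 1
    return steps


def closed_form(n: int) -> int:
    """Same count as a binomial coefficient, a degree-5 polynomial in n."""
    return comb(n + 3, 5)


if __name__ == "__main__":
    for n in (5, 10, 20, 40):
        count = naive_combination_count(n)
        # The brute-force count and the closed form should agree.
        print(n, count, closed_form(n))
```

The three span indices account for the usual cubic factor of chart parsing, and the two head positions contribute the extra quadratic factor, which corresponds to the $O(n^2)$ grammar size for a length-$n$ input mentioned in the text.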
