Hybrid NLP



  1. Hybrid NLP

  2. Outline
     • Problems of Deep and Shallow Processing
     • Hybrid Architectures
     • An Advanced Platform for Hybrid NLP: Deep Thought
     • Applications for Hybrid Processing
     • Conclusion and Outlook
     LTII – SS 2008

  3. Deep & Shallow Processing
     • Deep methods for morphological, syntactic, and semantic processing exploit our knowledge about the structure of human language,
     • as opposed to shallow methods such as pattern-matching grammars and n-gram language models.
     • Deep methods are needed for getting at the meaning of language input.
     • Shallow methods perform a partial or heavily underspecified analysis that is sufficient for certain applications.

  4. Logical form: ∃x[(old'(penny'))(x) ∧ Past(give'(sue', paul', x))]
     [Figure: phrase-structure trees for the two sentences below; node labels S, S/NP, VP, NP, V, Det, A, N]
     German: Sue gab Paul einen alten Pfennig.
     English: Sue gave Paul an old penny.

  5. Applications
     • Machine Translation, e.g. Systran, Logos, METAL-Comprendium, IBM PT
     • Access to Databases, e.g. Core Language Engine

  6. Once Upon a Time
     • Broad industrial research in deep parsing
       • Xerox - LFG
       • Siemens - LFG
       • IBM Germany - HPSG
       • Hewlett Packard - GPSG and HPSG
       • IBM USA - PLNLP and Slot Grammar
     • Very large projects
       • EUROTRA
       • LILOG
       • LS-GRAM

  7. Grammar Frameworks
     • Head-Driven Phrase Structure Grammar (HPSG)
     • Lexical Functional Grammar (LFG)
     • Tree-Adjoining Grammar (TAG)
     • Categorial Grammar (CG)
     • Dependency Grammar (DG)
     • GB / Minimalist Program

  8. HPSG
     • Head-Driven Phrase Structure Grammar by Pollard and Sag
     • Uniform formalism: typed feature structures
     • High degree of lexicalization: very few PS rules, rich lexicon structure
     • Ontological structure: multiple-inheritance type hierarchy
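To make "typed feature structures" and "multiple-inheritance type hierarchy" concrete, here is a minimal Python sketch of unification over typed feature structures. It is an illustration under stated assumptions, not the formalism of any actual HPSG system: the type names, the features (`AGR`, `NUM`, `PER`, `TENSE`), and the tuple encoding are hypothetical, and structure sharing (reentrancy) is left out.

```python
# Minimal sketch of typed feature structure unification (illustrative only).
# All type and feature names are hypothetical, not taken from a real grammar.

# Type hierarchy with multiple inheritance: each type lists its direct supertypes.
SUPERTYPES = {
    "top": [],
    "sign": ["top"],
    "verbal": ["sign"],
    "finite": ["sign"],
    "finite-verb": ["verbal", "finite"],  # inherits from two parents
    "noun": ["sign"],
}


def ancestors(t):
    """Return t together with all of its transitive supertypes."""
    seen = {t}
    stack = [t]
    while stack:
        for parent in SUPERTYPES[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def unify_types(a, b):
    """Return the most general type that is a subtype of both a and b, or None."""
    candidates = [t for t in SUPERTYPES if a in ancestors(t) and b in ancestors(t)]
    for c in candidates:
        # The greatest lower bound is a supertype of every other common subtype.
        if all(c in ancestors(other) for other in candidates):
            return c
    return None


def unify(fs1, fs2):
    """Unify two feature structures encoded as (type, {feature: value}) pairs.

    Values are nested feature structures or atomic strings.
    Returns the unified structure, or None if unification fails.
    """
    if isinstance(fs1, str) or isinstance(fs2, str):
        return fs1 if fs1 == fs2 else None
    type1, feats1 = fs1
    type2, feats2 = fs2
    t = unify_types(type1, type2)
    if t is None:
        return None
    result = dict(feats1)
    for feat, val in feats2.items():
        if feat in result:
            merged = unify(result[feat], val)
            if merged is None:
                return None  # conflicting values for the same feature
            result[feat] = merged
        else:
            result[feat] = val
    return (t, result)


a = ("verbal", {"AGR": ("top", {"NUM": "sg"})})
b = ("finite", {"AGR": ("top", {"PER": "3"}), "TENSE": "past"})
print(unify(a, b))
```

Unifying `a` and `b` yields a structure of type `finite-verb` that carries the features of both inputs, while two structures with incompatible types (for example `noun` and `verbal`) or conflicting atomic values fail to unify.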

  9. Problems with Deep Analysis
     • Coverage (Development Time)
     • Robustness (Coping with Out-of-Grammar Input)
     • Efficiency (Runtime and Space Efficiency)
     • Specificity (Selection among Readings)

  10. Problems with Shallow Analysis
      • Accuracy
      • Problems with embeddings, grammatical control, anaphora, and modal as well as negative contexts
      Examples:
      According to SVP Raul Lopez, Slator expected him to be appointed CEO of Crawford Inc. at the upcoming shareholders' meeting.
      After the retirement of Peter Smith, Mary Hopp was introduced by VP Brown as the new director of the marketing division.
      After every former US-based vice president except Lisa Ronell served as Chairman of the Board, the shareholders for the first time appointed a non-US Chairperson.
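The following Python sketch shows how a simple pattern-matching rule can go wrong on the first two example sentences. The rule itself ("the closest capitalized name before the trigger phrase holds the position") and the helper name `nearest_name_before` are hypothetical, chosen only to illustrate the kind of shallow heuristic the slide has in mind.

```python
import re

# Hypothetical shallow rule: the closest capitalized name before the trigger
# phrase is taken to be the person who holds the position.
NAME = r"[A-Z][a-z]+(?: [A-Z][a-z]+)*"

SENTENCE_1 = ("According to SVP Raul Lopez, Slator expected him to be appointed "
              "CEO of Crawford Inc. at the upcoming shareholders' meeting.")
SENTENCE_2 = ("After the retirement of Peter Smith, Mary Hopp was introduced by "
              "VP Brown as the new director of the marketing division.")


def nearest_name_before(sentence, trigger):
    """Return the last capitalized name occurring before the trigger phrase."""
    idx = sentence.find(trigger)
    if idx < 0:
        return None
    names = re.findall(NAME, sentence[:idx])
    return names[-1] if names else None


# Picks "Slator", although "him" refers to Raul Lopez and the appointment
# is only expected (anaphora, control, modal context).
print(nearest_name_before(SENTENCE_1, "appointed CEO"))

# Picks "Brown", although the passive makes Mary Hopp the new director.
print(nearest_name_before(SENTENCE_2, "as the new director"))
```

Both answers are wrong for reasons that only a structural analysis reveals: resolving the pronoun and the control relation in the first sentence, and recognizing the passive construction in the second.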

  11. Real Grammars
      • LinGO - English Resource Grammar
        • 8,000 types
        • 100,000 lines of code
        • average feature structure > 300 nodes
      • German grammar of equal size
      • Japanese and Norwegian grammars are getting close

  12. International Collaboration
      • Tokyo: Tsujii Lab at the University of Tokyo
        Tsujii, Torisawa, Ninomiya, Taura, Yoshida, Mitsoishi, ...
      • Stanford: HPSG Group at CSLI
        Sag, Flickinger, Copestake, Malouf, Carroll (Brighton), ...
      • Saarbrücken: LT Lab at DFKI and Dept. of CL
        Oepen, Callmeier, Krieger, Kiefer, Ciortuz, Müller, ...

  13. The Evaluation Setup
      [Figure: evaluation setup; only the label "tsdb" is recoverable]

  14. Results
      • All participating systems have benefited from the systematic comparative evaluation.
      • Currently the fastest system is the runtime parser PET by Ulrich Callmeier (Saarbrücken).
      • But the other parsers also improved drastically, e.g.:
        • LKB (Stanford, Cambridge)
        • LILFES (Tokyo)
        • PAGE (Saarbrücken)

  15. Results
      • HPSG parsing is now 2000 times faster than before
      • Normal-length sentences parse in 0.1 - 1.0 seconds
      • Steady increase in hardware efficiency will also help

  16. References
      • D. Flickinger, S. Oepen, H. Uszkoreit, and J. Tsujii (eds.). 2000. Special Issue on Efficient Processing with HPSG: Methods, Systems, Evaluation. Journal of Natural Language Engineering 6(1). Cambridge University Press, Cambridge.
      • A. Copestake. 2002. Implementing Typed Feature Structure Grammars. CSLI Publications, Stanford.
      • S. Oepen, D. Flickinger, J. Tsujii, and H. Uszkoreit. 2002. Collaborative Language Engineering: A Case Study in Efficient Grammar-based Processing. CSLI Publications, Stanford.

  17. The Core Machinery
      [Figure: components PET (runtime parser), LKB (development platform), tsdb, the English, German, and Japanese grammars, and an application; labels: Open Source, Public Domain]

  18. However
      • Back to the problems of
        • robustness
        • coverage
        • specificity

  19. Assumptions
      • Information extraction is not an alternative to deep processing but a continuum between classification and "full" semantic analysis
      • Information extraction via text enrichment
      • We can detect topics, names, binary relations, complex relations, answers, etc.
      • Question: At what point is deep processing needed?

  20. Approach
      • Lack of robustness and coverage remains a serious problem for deep processing.
      • So we need to find applications where deep processing can improve detection without spoiling the performance.
      • Example: relation extraction.
      • Let deep processing assist shallow methods.
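One way to read "let deep processing assist shallow methods" is a fallback scheme: the shallow component always delivers a result, and a deep analysis replaces it only when the deep parser succeeds. The Python sketch below illustrates this idea for relation extraction. It is an assumption about how such a combination could look, not the architecture described in the slides; `hybrid_extract`, `toy_shallow`, `toy_deep`, and the tiny lexicon `KNOWN` are all hypothetical stand-ins.

```python
from typing import Callable, Optional, Tuple

Relation = Tuple[str, str, str]  # (subject, predicate, object)


def hybrid_extract(
    sentence: str,
    shallow: Callable[[str], Optional[Relation]],
    deep: Callable[[str], Optional[Relation]],
) -> Tuple[Optional[Relation], str]:
    """Prefer the deep analysis when it succeeds; otherwise keep the shallow one."""
    shallow_result = shallow(sentence)
    if shallow_result is None:
        # No candidate relation detected: skip the expensive deep analysis.
        return None, "none"
    try:
        deep_result = deep(sentence)
    except Exception:
        deep_result = None  # treat a parser failure like out-of-grammar input
    if deep_result is not None:
        return deep_result, "deep"
    return shallow_result, "shallow"


def toy_shallow(sentence: str) -> Optional[Relation]:
    """Hypothetical shallow rule: word before 'gave', last word of the sentence."""
    words = sentence.rstrip(".").split()
    if "gave" not in words:
        return None
    i = words.index("gave")
    if i == 0 or i == len(words) - 1:
        return None
    return (words[i - 1], "give", words[-1])


KNOWN = {"Sue", "gave", "Paul", "an", "old", "penny"}  # toy lexicon


def toy_deep(sentence: str) -> Optional[Relation]:
    """Hypothetical deep parser stub: fails on any word outside its lexicon."""
    words = sentence.rstrip(".").split()
    if not set(words) <= KNOWN:
        return None  # simulates out-of-grammar input
    if len(words) >= 4 and words[1] == "gave":
        return (words[0], "give", " ".join(words[3:]))
    return None


print(hybrid_extract("Sue gave Paul an old penny.", toy_shallow, toy_deep))
print(hybrid_extract("Sue gave Paul a shiny penny.", toy_shallow, toy_deep))
```

In the first call the deep stub covers the sentence and returns the full object phrase; in the second it fails on unknown words and the shallow result is kept, so coverage never drops below that of the shallow component.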

  21. LT Methods
      [Chart with axis labels: discrete, non-discrete, hybrid; shallow, deep]
      Example placed in the chart: HMM-based POS tagger
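The slide only names an HMM-based POS tagger as an example of a shallow method. As a rough illustration of how such a tagger picks tags, the Python sketch below runs Viterbi decoding over a first-order HMM. The tag set, the probabilities, and the `floor` smoothing value are made-up toy assumptions, and the input sentence is assumed to be non-empty.

```python
import math


def viterbi(words, tags, start_p, trans_p, emit_p, floor=1e-6):
    """Return the most probable tag sequence for words under a first-order HMM.

    Missing or zero probabilities are replaced by a small floor value.
    """
    def lp(p):
        return math.log(p if p > 0 else floor)

    # best[i][t]: log-probability of the best tag path ending in tag t at word i
    best = [{t: lp(start_p.get(t, 0)) + lp(emit_p[t].get(words[0], 0)) for t in tags}]
    back = [{}]
    for i in range(1, len(words)):
        best.append({})
        back.append({})
        for t in tags:
            prev, score = max(
                ((p, best[i - 1][p] + lp(trans_p[p].get(t, 0))) for p in tags),
                key=lambda x: x[1],
            )
            best[i][t] = score + lp(emit_p[t].get(words[i], 0))
            back[i][t] = prev
    # Follow the back-pointers from the best final tag.
    last = max(tags, key=lambda t: best[-1][t])
    path = [last]
    for i in range(len(words) - 1, 0, -1):
        path.append(back[i][path[-1]])
    return path[::-1]


# Toy model (all numbers are illustrative assumptions).
TAGS = ["N", "V", "D"]
START = {"N": 0.8, "V": 0.1, "D": 0.1}
TRANS = {
    "N": {"N": 0.2, "V": 0.6, "D": 0.2},
    "V": {"N": 0.5, "V": 0.05, "D": 0.45},
    "D": {"N": 0.9, "V": 0.05, "D": 0.05},
}
EMIT = {
    "N": {"Sue": 0.3, "Paul": 0.3, "penny": 0.3},
    "V": {"gave": 0.9},
    "D": {"a": 0.9},
}

print(viterbi(["Sue", "gave", "Paul", "a", "penny"], TAGS, START, TRANS, EMIT))
```

The tagger assigns one category per word from local statistics alone; it builds no syntactic structure, which is why the slides class it as shallow.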

  22. LT Methods
      [Chart with axis labels: discrete, non-discrete, hybrid; shallow, deep]
      Example placed in the chart: HPSG parser with MRS

  23. LT Methods
      [Chart with axis labels: discrete, non-discrete, hybrid; shallow, deep]
      Example placed in the chart: PCF parser

  24. LT Methods
      [Chart with axis labels: discrete, non-discrete, hybrid; shallow, deep]
      Example placed in the chart: syntactic LFG parser with ME selection

  25. Combination of Methods
      [Chart with axis labels: discrete, non-discrete, hybrid; shallow, deep]
