
Question Answering & the Semantic Web (Günter Neumann)



  1. Question Answering & the Semantic Web Günter Neumann Language Technology-Lab DFKI, Saarbrücken

  2. Overview • Hybrid Question Answering • Language Technology and the Semantic Web © 2004 G. Neumann

  3. Motivation: From Search Engines to Answer Engines [slide body not recoverable]

  4. Question Answering • Input: a question in NL; a set of text and database resources • Output: a set of possible answers drawn from the resources

  5. Hybrid QA Architecture
     • Real-life QA systems will perform best if they can
       • combine the virtues of domain-specialized QA with open-domain QA
       • utilize general knowledge about frequent types
       • access semi-structured knowledge bases
     [Architecture diagram; labels: NL Questions, NL Answers, Question Analysis, Query Generation, Response Analysis, Answer Generation, Hypothesis, web mining, Off-Line Data Harvesting, On-Line Information Extraction, Off-Line Information Extraction, Fact DB, External Fact DB, DB of Enriched Texts, The Web via an External Search Engine]

  6. Design Issues • Foster bottom-up system development • Data-driven, robustness, scalability • From shallow & deep NLP • Large-scale answer processing • Coarse-grained uniform representation of query/documents • Text zooming • From paragraphs to sentences to phrases • Ranking scheme for answer selection • Common basis for • Online Web pages • Large textual sources
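The text-zooming and ranking bullets on slide 6 can be illustrated with a minimal Python sketch. It assumes a plain term-overlap count as the ranking scheme, which the slides do not specify. The names `zoom`, `overlap_score`, and `split_sentences` are hypothetical, and the final zoom step from sentences to phrases is left out.

```python
import re


def tokens(text):
    """Lower-cased word tokens of a text."""
    return re.findall(r"\w+", text.lower())


def overlap_score(query, text):
    """Number of distinct query terms that also occur in the text (assumed ranking scheme)."""
    return len(set(tokens(query)) & set(tokens(text)))


def split_sentences(paragraph):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", paragraph) if s.strip()]


def zoom(query, paragraphs, top_n=1):
    """Zoom from paragraphs to sentences, ranking each level by term overlap."""
    best_paragraphs = sorted(
        paragraphs, key=lambda p: overlap_score(query, p), reverse=True
    )[:top_n]
    sentences = [s for p in best_paragraphs for s in split_sentences(p)]
    return sorted(
        sentences, key=lambda s: overlap_score(query, s), reverse=True
    )[:top_n]
```

Using the same scoring function at both levels mirrors the "coarse-grained uniform representation" bullet: paragraphs and sentences are compared to the query in the same way, so only the unit size changes.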

  7. BiQue: A Cross-Language Question-Answering System (cf. Neumann & Sacaleanu, 2003) • Goal: • Given a question in German, find answers in English text corpora • Sub-tasks • Integration of existing components • IR-engines, our IE-core engine, EuroWordNet • Development of methods/components for • Question translation & expansion • Unsupervised NE recognition • Participation in the QA track at CLEF 2003/2004

  8. Major control flow of BiQue
     • German question: “Mit wem ist David Beckham verheiratet?” (“Who is David Beckham married to?”)
     • Question Analysis: yields {person:David Beckham, married, person:?} and the answer type
     • Query: translation, WSD, and expansion produce the English query
     • Lucene IR over the XML-indexed text corpus returns documents (the diagram also shows the Web as a source)
     • Paragraph selection on the annotated corpus returns passages, e.g. “David Beckham, the soccer star engaged to marry Posh Spice, is being blamed for England's World Cup defeat.”
     • Answer Extraction yields answer candidates: {person:David Beckham, person:Posh Spice}
     • Answer Validation yields the answer: Posh Spice
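The Beckham example on slide 8 can be traced end to end with a toy Python sketch. This is not the BiQue implementation: the `QueryObject` class, the single German question pattern, and the English trigger phrases in `RELATION_PATTERNS` are all assumptions made for illustration, and retrieval (Lucene, paragraph selection) is skipped by passing the passage in directly.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class QueryObject:
    """Toy stand-in for the frame {person:David Beckham, married, person:?}."""
    subject: str       # known entity, e.g. "David Beckham"
    relation: str      # relation asked about, e.g. "married"
    answer_type: str   # expected answer type, e.g. "person"


def analyse_question(question_de: str) -> Optional[QueryObject]:
    """Hypothetical rule covering only questions of the form 'Mit wem ist X verheiratet?'."""
    m = re.match(r"Mit wem ist (.+) verheiratet\?", question_de)
    if m:
        return QueryObject(subject=m.group(1), relation="married", answer_type="person")
    return None


# Hypothetical English trigger phrases per relation; the capture group takes
# the run of capitalised words that follows the trigger.
RELATION_PATTERNS = {
    "married": r"(?:married to|engaged to marry)\s+((?:[A-Z]\w+\s?)+)",
}


def extract_answer(query: QueryObject, passage: str) -> Optional[str]:
    """Return the answer candidate if the passage mentions the subject and the relation."""
    if query.subject not in passage:
        return None
    m = re.search(RELATION_PATTERNS[query.relation], passage)
    return m.group(1).strip() if m else None
```

Calling `analyse_question` on the German question and then `extract_answer` on the passage from the slide returns the candidate "Posh Spice". Answer validation, the last stage on the slide, is not modelled here.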

  9. Query Translation & Expansion
     • First idea:
       • Only use EuroWordNet
       • Defines a word-based translation via synset offsets
       • Experience
         • EuroWordNet too sparse on German side
         • Nevertheless introduced too much ambiguity
         • So far, not very much of help
     • Second idea:
       • Use EuroWordNet
       • Use external MT-services
       • Overlap-mechanism for query expansion
       • Crosslingual because
         • Q-type & A-type from DE-Question Analysis
         • Synsets from EuroWN: direct query expansion (online alignment)
       • Experience
         • NE-translation is crucial
         • External MT services also used for Word-Sense-Disambiguation (WSD)
         • Reduced degree of ambiguity
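Slide 9 names an overlap mechanism and the use of external MT services for word-sense disambiguation, but does not spell out how they work. One plausible reading, sketched below in Python as an assumption, is to keep only those EuroWordNet candidate synsets whose English lemmas also appear in the MT output and to expand the query with their lemmas. The synset identifiers and lemma lists in the usage note are made-up toy data, and the function names are hypothetical.

```python
def tokens(text):
    """Lower-cased whitespace tokens with question marks stripped."""
    return set(text.lower().replace("?", " ").split())


def select_synsets(candidate_synsets, mt_translations):
    """Keep the synsets whose English lemmas overlap with any MT translation.

    candidate_synsets: dict mapping a synset id to a list of English lemmas.
    mt_translations:   list of English translations of the whole question.
    """
    mt_vocab = set()
    for translation in mt_translations:
        mt_vocab |= tokens(translation)
    kept = {
        sid: lemmas
        for sid, lemmas in candidate_synsets.items()
        if mt_vocab & set(lemmas)
    }
    # Fall back to all candidates when the MT output gives no evidence.
    return kept or dict(candidate_synsets)


def expand(candidate_synsets, mt_translations):
    """Expansion terms: all lemmas of the synsets that survive the overlap test."""
    kept = select_synsets(candidate_synsets, mt_translations)
    return sorted({lemma for lemmas in kept.values() for lemma in lemmas})
```

For a German word with the toy candidates `{"syn-1": ["bank", "depository"], "syn-2": ["bench"]}` and MT output containing the word "bank", only `syn-1` survives, so the expansion terms are `bank` and `depository`. Multi-word lemmas would need phrase matching, which this sketch does not attempt.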

  10. Example (cf. Neumann & Sacaleanu, 2003) [slide body not recoverable]

  11. What we learned ... • Different MT services can help each other • Logos suitable for EN-query parsing • Necessary to determine A-type, Q-focus on EN side • Systran/FreeTranslation better in NE-translation • Problem: MT-services often compute • Ill-formed strings: bad for query parsing • “partial” translation (mixed strings): problem for IR/paragraph selection • Our envisaged approach • Use DE-query analysis as control object for determining EN query object • Prefer DE-determined EAT, NE, Q-focus • Further decrease role of external MT services; only used for WSD
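The envisaged approach on slide 11, using the German query analysis as the control object and preferring its EAT, NE, and Q-focus, could look roughly like the following Python sketch. The dictionary layout and the key names `eat`, `ne`, and `q_focus` are assumptions for illustration; the slides do not describe the actual data structures.

```python
# Fields for which the German-side analysis takes precedence (per slide 11).
PREFERRED_FROM_DE = ("eat", "ne", "q_focus")


def build_en_query(de_analysis, en_analysis):
    """Build the English query object, preferring German-determined values.

    Both arguments are plain dicts (assumed layout). German values win for the
    preferred fields whenever they are present; everything else, and any gaps,
    come from the English-side analysis of the translated question.
    """
    merged = dict(en_analysis)
    for key in PREFERRED_FROM_DE:
        if de_analysis.get(key) is not None:
            merged[key] = de_analysis[key]
    return merged
```

With this merge, an ill-formed or partial MT translation can at worst leave non-preferred fields empty. It cannot overwrite an answer type or named entity that the German analysis has already determined.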

  12. Even more to learn ... • Off-line annotation of the corpus would help define more controlled IR • Query/Answer processing • Question analysis as “deep” as possible • Question classification as basis for answer strategy selection • Answer strategies for definition/list-based questions • Has led to substantial improvements of our CLEF-2003 system for CLEF-2004
