

  1. LT-Lab DFKI at QA@Clef 2007
     Günter Neumann, Bogdan Sacaleanu, Christian Spurk, Rui Wang
     Language Technology Lab at DFKI Saarbrücken, Germany
     Clef-07, German Research Center for Artificial Intelligence

  2. LT-Lab Overview
     ✩ DFKI has participated since 2003
       – Focus on German monolingual QA and German/English cross-lingual QA
       – Promising results so far (accuracy): DE→DE = 43.50%, EN→DE = 32.98%, DE→EN = 25.50%
     ✩ Goal for Clef 2007: broaden the spectrum of activities
       – Additional language pairs (ES→EN, PT→DE)
       – Participation in the QAST pilot task
       – Participation in the Answer Validation Exercise (AVE)

  3. LT-Lab QA architecture: some design issues
     ✩ NL question
       – Declarative description of the search strategy and of control information
       – Analysis should be as complete and accurate as possible
       – Use of full parsing and semantic constraints
     ✩ Consider document sources as an implicit search space
       – Off-line: question-type-oriented preprocessing for context selection
       – On-line: question-specific preprocessing for answer processing

  4. LT-Lab Common architecture for different answer pools
     ✩ Answer sources (covered by our technology)
       – Structured sources (DBMS)
       – Linguistically well-formed textual sources (news articles)
       – Well-structured web sources (Wikipedia)
       – Web snippets
       – Speech transcripts (cf. QAST)
     ✩ Assumption: QA over different answer sources shares a pool of the same components
     ✩ Service-oriented architecture (SOA) for QA
       – Strongly component-oriented approach
       – Basis for an open-source QA architecture (cf. EU project QALL-ME)

  5. LT-Lab Overview of the QA architecture
     [Architecture diagram: the Clef corpus, a Wikipedia corpus and speech transcripts are accessed by a Retrieval Component via IR queries; an Analysis Component turns questions into Q-Objects for the Strategy Selector and the QA-Controller; an Extraction Component produces candidate sentences and strings; Answer Selection and Answer Validation Components deliver the final answers; cross-linguality is handled either before or after the core method.]

  6. LT-Lab System Architecture for Clef 2007

  7. LT-Lab Query processing components

  8. LT-Lab Cross-lingual approach to ODQA: the "Before" method
     ✩ Assumption: the better the query analysis of a translated question, the better the underlying translation was
     ✩ Steps:
       – Question translation, possibly via external MT services
       – Processing of the translations into QObjects
       – QObject selection: rank by completeness wrt. the parse tree and the major semantic Wh-types, and by confidence; select the best QO
     [Diagram: source question (DE/EN/ES/PT) → MT → German/English questions Q1, Q2, Q3 → German/English Wh-parser → QO1, QO2, QO3 → best-QO selection → answer processing]
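The selection step above can be sketched as follows. The ranking criteria (parse-tree completeness, a recognized Wh-type, parser confidence) come from the slide, but the field names, the completeness weighting, and the tie-breaking are hypothetical.

```python
# Hypothetical sketch of the "Before" method's QObject selection:
# several MT translations of the source question are parsed, and the
# QObject whose analysis is most complete and most confident wins.
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class QObject:
    question: str              # translated question (DE or EN)
    has_full_parse: bool       # did the Wh-parser build a complete tree?
    wh_type: Optional[str]     # major semantic Wh-type, if recognized
    confidence: float          # parser confidence in [0, 1]

def completeness(qo: QObject) -> float:
    # Hypothetical weighting: a full parse tree and a recognized
    # Wh-type each count toward completeness.
    return (1.0 if qo.has_full_parse else 0.0) + (0.5 if qo.wh_type else 0.0)

def select_best_qobject(candidates: List[QObject]) -> QObject:
    # Rank by completeness first, break ties by confidence.
    return max(candidates, key=lambda qo: (completeness(qo), qo.confidence))
```

A fully parsed translation thus beats a more "confident" but incomplete one, which is exactly the slide's assumption that analysis quality is a proxy for translation quality.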

  9. LT-Lab Question analysis
     Pipeline: NL questions → topic processing → syntactic analysis → semantic analysis → IA proto-query construction → Q-Object / IA proto query (information access)
     – Topic processing (LingPipe): NER and coreference resolution over the sequence of Wh-questions, yielding NE-resolved Wh-questions
     – Syntactic analysis (SMES for DE & EN): Wh-attachment, morphology, dependency trees, shallow & deep processing
     – Semantic analysis: Q-type, A-type, Q-focus
     – IA proto-query construction (for the translated IA schema): generated word forms, NE types/concepts, weights

  10. LT-Lab Output example of query analysis
      Question: "Which Jewish painter lived from 1904-1944?" (DE: "Welche juedischen Maler lebten von 1904-1944?")

      Q-Object (exploiting natural language generation):
      <QOBJ msg="quest" id="qId0" lang="DE" score="1">
        <NL-STRING id="qId0">
          <SOURCE id="qId0" lang="DE">Welche juedischen Maler lebten von 1904-1944?</SOURCE>
          <TARGETS/>
        </NL-STRING>
        <QA-control>
          <Q-FOCUS>Maler</Q-FOCUS>
          <Q-SCOPE>leb</Q-SCOPE>
          <Q-TYPE restriction="TEMP">C-COMPLETION</Q-TYPE>
          <A-TYPE type="list:SOME">NUMBER</A-TYPE>
        </QA-control>
        <KEYWORDS>
          <KEYWORD id="kw0" type="UNIQUE">
            <TK pos="V" stem="leb">lebten</TK>
          </KEYWORD>
          <KEYWORD id="kw1" type="UNIQUE">
            <TK pos="A" stem="juedisch">juedischen</TK>
          </KEYWORD>
          …
        </KEYWORDS>
        <EXPANDED-KEYWORDS/>
        <NE-LIST>
          <NE id="ne0" type="DATE">1944</NE>
          <NE id="ne1" type="DATE">1904</NE>
        </NE-LIST>
      </QOBJ>

      IA query created for Lucene:
      +neTypes:NUMBER
      AND ("lebten" OR "lebte" OR "gelebt" OR "leben" OR "lebt")
      AND +maler^4
      AND jüdisch^1
      AND 1944^1
      AND 1904^1
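A minimal sketch of how such a Lucene-style query string could be assembled from Q-Object fields. The input dict and the boosting scheme (focus boosted with ^4, other keywords with ^1) mirror the example on the slide, but the function and field names are hypothetical, not the system's actual code.

```python
# Hypothetical sketch: turn (parts of) a Q-Object into a Lucene-style
# boolean query string like the one shown on the slide.
def build_ia_query(qobj: dict) -> str:
    clauses = []
    # The expected answer type becomes a required field constraint.
    clauses.append("+neTypes:" + qobj["a_type"])
    # The scope verb is expanded into generated word forms (disjunction).
    forms = " OR ".join('"%s"' % f for f in qobj["scope_forms"])
    clauses.append("(" + forms + ")")
    # The question focus gets a high boost; remaining keywords a low one.
    clauses.append("+%s^4" % qobj["focus"])
    for kw in qobj["keywords"]:
        clauses.append("%s^1" % kw)
    return " AND ".join(clauses)

query = build_ia_query({
    "a_type": "NUMBER",
    "scope_forms": ["lebten", "lebte", "gelebt", "leben", "lebt"],
    "focus": "maler",
    "keywords": ["jüdisch", "1944", "1904"],
})
```

Run on the example Q-Object, this reproduces the query shown above.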

  11. LT-Lab Answer processing components

  12. LT-Lab Experiments & Results

      Run ID      | Mode | Right # | Right % |  W  |  X  |  U
      dfki061dede |  M   |   60    |  30.0   | 121 | 14  |  5
      dfki061ende |  C   |   37    |  18.5   | 144 | 18  |  1
      dfki061deen |  C   |   14    |   7.0   | 178 |  6  |  2
      dfki062esen |  C   |   10    |   5.0   | 180 | 10  |  0
      dfki062ptde |  C   |    5    |   2.5   | 189 |  4  |  2
      (W = wrong, X = inexact, U = unsupported; M = monolingual, C = cross-lingual)

      ✩ Performance still OK, although with some loss
      ✩ Coverage problems of the English Wh-parser
      ✩ Bug in NE-informed translation (a DE-based recognizer was used)
      ✩ Problems with online MT services (PT→EN→DE)
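The "Right %" column is simply the Right count over the 200 assessed questions; each row's judgments (Right + W + X + U) sum to 200. A quick check:

```python
# Each run was assessed on 200 questions (the four judgment counts sum
# to 200 in every row), so the percentage column is right / 200.
runs = {
    "dfki061dede": (60, 121, 14, 5),
    "dfki061ende": (37, 144, 18, 1),
    "dfki061deen": (14, 178, 6, 2),
    "dfki062esen": (10, 180, 10, 0),
    "dfki062ptde": (5, 189, 4, 2),
}
accuracies = {}
for run, (right, wrong, inexact, unsupported) in runs.items():
    total = right + wrong + inexact + unsupported  # 200 for every run
    accuracies[run] = 100.0 * right / total
```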

  13. LT-Lab Remarks
      ✩ Online MT services are still insufficient
        – Develop our own MT solutions (cf. EU project EuroMatrix)
      ✩ Poor coverage of our English Wh-parser
        – First prototype for Clef 2007
      ✩ Answer extraction is currently robust enough for different answer sources
        – Similar performance on newspaper texts and Wikipedia
      ✩ Need more semantic analysis on the answer side without loss of coverage and domain independence
        – We are exploring cognitive semantics (cf. Talmy, 1987)
      ✩ A number of QA components were also used in the QAST pilot task and in AVE

  14. LT-Lab DFKI at QAST and AVE
      ✩ QAST pilot task: for a given written factoid question, extract the answer from manual or automatic speech transcripts
        Results (encouraging):

        Task | #Q | #A | MRR  | ACC
        T1   | 98 | 19 | 0.17 | 0.15
        T2   | 98 |  9 | 0.09 | 0.09
        (T1 = CHIL corpus, manual transcripts; T2 = CHIL corpus, automatic transcripts)

      ✩ Answer Validation Exercise: given a triple of the form (question, QA answer, supporting text), decide whether the answer to the question is correct and supported according to the given supporting text
        Results (really encouraging):

        Run         | Recall | Precision | F-measure | QA Accuracy
        dfki07-run1 | 0.62   | 0.37      | 0.46      | 0.16
        dfki07-run2 | 0.71   | 0.44      | 0.55      | 0.21
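Recomputing from the (rounded) recall and precision figures approximately reproduces the reported F-measures, confirming that the column is the balanced F1 = 2PR/(P+R):

```python
# Balanced F-measure; computed here from the rounded recall/precision
# values in the table, so it matches the reported figures only
# approximately (to within the rounding).
def f1(precision: float, recall: float) -> float:
    return 2 * precision * recall / (precision + recall)

f_run1 = f1(0.37, 0.62)  # reported: 0.46
f_run2 = f1(0.44, 0.71)  # reported: 0.55
```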

  15. LT-Lab DFKI at the QAST pilot task
      ✩ Goals
        – Gain experience with this sort of answer source
        – Adapt the text-based open-domain QA system that we used for the Clef main tasks
        – Since QAST required a different set of expected answer types, we developed a federated search strategy for NER called Meta-NER
      ✩ Same core as our textual QA system

  16. LT-Lab META-NER
      ✩ Call several NERs in parallel
      ✩ Merge the results by a voting strategy
      ✩ BiQueNER, developed by our group, extends the co-training algorithm of Collins and Singer:
        1. Chunking only, instead of full parsing
        2. Use of typed gazetteers and rules
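The voting merge can be sketched as below, assuming each underlying NER returns annotations as (start, end) spans with an NE type. The per-span majority vote and its tie-breaking rule are hypothetical; the slide specifies only "merge results by a voting strategy".

```python
from collections import Counter

# Hypothetical Meta-NER merge: run several NERs, collect their
# span -> type annotations, and keep for each span the type that
# received the most votes.
def merge_ner_results(annotations_per_ner):
    votes = {}  # (start, end) span -> Counter of proposed NE types
    for annotations in annotations_per_ner:
        for span, ne_type in annotations.items():
            votes.setdefault(span, Counter())[ne_type] += 1
    # Majority vote per span; Counter.most_common breaks ties by
    # insertion order, i.e. the first NER to propose the type wins.
    return {span: c.most_common(1)[0][0] for span, c in votes.items()}

merged = merge_ner_results([
    {(0, 2): "PERSON", (5, 7): "LOC"},   # NER 1
    {(0, 2): "PERSON", (5, 7): "ORG"},   # NER 2
    {(0, 2): "PERSON", (5, 7): "LOC"},   # NER 3
])
```

Here the second span is labeled LOC (two votes) rather than ORG (one vote); a real federation would also have to reconcile overlapping, non-identical spans.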

  17. LT-Lab DFKI's AVE System
      ✩ The AVE system is based on our RTE system (cf. Wang & Neumann, AAAI-2007; RTE-3 challenge)
      ✩ The RTE method has already demonstrated good results on QA data
        – RTE-3 (QA pairs only): 81.5%; TREC-2003 QA: 65.7%
      ✩ RTE method: a novel sentence-level kernel method
        – Subtree alignment on the syntactic level
          • Check the similarity between the tree of H and the relevant subtree of T
        – Subsequence kernel
          • Consider all possible subsequences of the spines (paths) of the difference pairs
        – SVM for classification
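The subsequence-kernel idea can be illustrated with a plain dynamic-programming count of common (possibly non-contiguous) subsequences of a given length between two token sequences. The actual system applies such a kernel to spines of the difference pairs and classifies with an SVM, so this is only an illustrative sketch of the kernel itself, without decay weighting.

```python
# Illustrative subsequence kernel: count pairs of index subsequences of
# length p in s and t that spell out the same token sequence (tokens
# need not be contiguous).
def subsequence_kernel(s, t, p):
    n, m = len(s), len(t)
    # K[q][i][j]: common subsequences of length q in s[:i] and t[:j]
    K = [[[0] * (m + 1) for _ in range(n + 1)] for _ in range(p + 1)]
    for i in range(n + 1):
        for j in range(m + 1):
            K[0][i][j] = 1  # the empty subsequence always matches
    for q in range(1, p + 1):
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                # Inclusion-exclusion over dropping the last token of
                # s[:i] or t[:j] ...
                K[q][i][j] = (K[q][i - 1][j] + K[q][i][j - 1]
                              - K[q][i - 1][j - 1])
                # ... plus extending shorter matches when the last
                # tokens agree.
                if s[i - 1] == t[j - 1]:
                    K[q][i][j] += K[q - 1][i - 1][j - 1]
    return K[p][n][m]
```

For example, "a b c" and "a c" share two single tokens (a, c) and one length-2 subsequence (a c, non-contiguous in the first sequence).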

  18. LT-Lab AVE architecture
      [Architecture diagram]

      Runs | R    | P    | F    | QA Acc.
      run1 | 0.62 | 0.37 | 0.46 | 0.16
      run2 | 0.71 | 0.44 | 0.55 | 0.21

  19. LT-Lab Error Analysis
      ✩ Supporting texts from web documents cause parsing problems
      ✩ Violation of some of our RTE system's assumptions
        – Required: H should be "verbally" smaller than T
        – Violated: patterns built from Q-A pairs are too long; impact on recall
      ✩ If the supporting text is very long (a complete document), our RTE system is misled
        – Impact on precision

  20. LT-Lab Thanks!
