SLIDE 1

A Multilingual Hybrid Question-Answering System

Cross-Lingual Open-Domain Question Answering

Günter Neumann, Bogdan Sacaleanu

German Research Center for Artificial Intelligence (DFKI)

30th DFKI SAB MEETING • 04/04/2006

SLIDE 2


[Architecture diagram: NL questions enter a QA Controller (question analysis, search, answer preparation) that returns NL answers. Free-text QA runs over a DB of enriched texts and over the Web via an external search engine; DB QA and semi-structured QA run over fact DBs filled by off-line information extraction and off-line data harvesting from external DBs. Heart of Gold and an inference engine connect the pipeline to linguistic knowledge bases and world and domain knowledge.]

SLIDE 3


Cross-lingual Open-Domain Question-Answering

Question Analysis → IR-Query Construction → Passage Selection → Answer Extraction → Answer Selection

German Question

"Mit wem ist David Beckham verheiratet?" ("Who is David Beckham married to?")

Query Translation

  • Online MT-systems
  • WSD
  • Expansion

English Question Object

{person:David Beckham, married, person:?}

Question Object:

  • Focus, Scope
  • AnswerType

Documents: IR-Lucene/XML, IR-Google, Annotated Corpus

Passages

"David Beckham, the soccer star engaged to marry Posh Spice, is being blamed for England's World Cup defeat."

Candidates

{person:David Beckham, person:Posh Spice}

Answer

Posh Spice
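
The slide's data flow can be made concrete with a small sketch. This is a toy illustration, not the actual Quantico/Quetal code: the passage store, the hard-coded person list standing in for an NE recognizer, and all names are assumptions.

```python
from dataclasses import dataclass

@dataclass
class QuestionObject:
    focus: str          # question focus, e.g. the wh-phrase head
    answer_type: str    # expected answer type, e.g. "person"
    relation: tuple     # predicate-argument triple with a "?" slot

# Toy passage store standing in for the IR step (Lucene/Google on the slide).
PASSAGES = [
    "David Beckham, the soccer star engaged to marry Posh Spice, "
    "is being blamed for England's World Cup defeat.",
]

# Toy NE lexicon; the real system runs a named-entity recognizer here.
KNOWN_PERSONS = ["David Beckham", "Posh Spice"]

def extract_candidates(passages, answer_type):
    """Collect NE candidates of the expected answer type from the passages."""
    if answer_type != "person":
        return []
    return [ne for text in passages for ne in KNOWN_PERSONS if ne in text]

def answer(qobj: QuestionObject) -> str:
    """Answer selection: drop entities already bound in the question."""
    bound = {arg for arg in qobj.relation if arg != "?"}
    candidates = [c for c in extract_candidates(PASSAGES, qobj.answer_type)
                  if c not in bound]
    return candidates[0] if candidates else "NIL"

# English question object for "Mit wem ist David Beckham verheiratet?"
qobj = QuestionObject(focus="whom", answer_type="person",
                      relation=("David Beckham", "married", "?"))
print(answer(qobj))  # -> Posh Spice
```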

SLIDE 4


Challenges for Textual QA

✩ Open domain

– No restriction on the domain and type of question
– No restriction on document source and style (news text corpus, Web, …)

✩ High demands on robustness & efficiency of LT core components

– From keywords to full NL questions
– Very large-scale sources of free text
– Trade-off between off-line and on-line annotation

✩ Cross-linguality

– How to exploit MT technology for textual QA?

✩ Reusability & Scalability

– Same QA framework for heterogeneous document sources
– Incremental bottom-up software development

SLIDE 5


Our Design Perspective

✩ Foster bottom-up system development

– Data-driven, robustness, scalability
– Combination of shallow & deep NLP

✩ Large-scale answer processing

– Coarse-grained uniform representation of query/documents
– Text zooming
– Ranking scheme for answer selection

✩ Need-triggered use of knowledge sources

– Rather exploit data-driven strategies & linguistic structure

✩ Common basis for

– Online Web pages
– Large textual sources

SLIDE 6


Textual QA in Quetal: R&D Results

✩ Question-type specific selection of answer extraction strategies

✩ Hybrid approach for cross-lingual textual QA

✩ Clef participation: best results for German & English as target languages (25% DE2EN, 47.5% DE2DE)

✩ Answer credibility checking

✩ QA framework Quantico

  • Web & XML-annotated documents
  • ~5-8 sec/QA-cycle

✩ Flexible, robust free question analysis

✩ Dissemination (projects):

  • SmartWeb (BMBF)
  • HyLaP (BMBF)
  • QALL-ME (EC)
  • RASCALLI (EC)

SLIDE 7


Quantico: Activity Flow

[Activity-flow diagram: the QA Controller selects a strategy per question class (Definition, Temporal, Factoid) and drives five components:

  • Analysis Component: Parse Question, Select Strategy
  • Retrieval Component: Retrieve Sentences, Retrieve Appositions, Retrieve Abbreviations
  • Extraction Component: Extract Possible Answers
  • Selection Component: Select Best Answers
  • Credibility Component: Credibility Check

Off-line processing builds the <NE,XP> Store, the Abbrev Store, and the NE/Sentence Index over the Clef corpus, LT-world, and Aquaint; the on-line components query these stores.]
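
The controller's per-class strategy selection can be sketched as a simple dispatch. The classifier and handler bodies below are illustrative placeholders; Quantico derives the question class from the parsed question object rather than from surface patterns.

```python
def definition_strategy(q):
    """Definition route: e.g. look up appositions and abbreviations."""
    return f"definition lookup for: {q}"

def temporal_strategy(q):
    """Temporal route: decompose and fuse sub-answers (see slide 9)."""
    return f"temporal decomposition for: {q}"

def factoid_strategy(q):
    """Factoid route: sentence retrieval plus NE extraction."""
    return f"factoid extraction for: {q}"

STRATEGIES = {
    "definition": definition_strategy,
    "temporal": temporal_strategy,
    "factoid": factoid_strategy,
}

def classify(question: str) -> str:
    """Toy surface classifier standing in for the real question analysis."""
    q = question.lower()
    if q.startswith(("what is a ", "who is ", "what does ")):
        return "definition"
    if any(cue in q for cue in ("when ", " before ", " after ", " between ")):
        return "temporal"
    return "factoid"

def control(question: str) -> str:
    """Parse -> select strategy -> run the matching handler chain."""
    return STRATEGIES[classify(question)](question)

print(control("Who is Javier Solana?"))                     # definition route
print(control("Who won Wimbledon between 1980 and 1990?"))  # temporal route
print(control("Whom is David Beckham married to?"))         # factoid route
```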

SLIDE 8


Free Question Analysis for Textual QA

✩ Query analysis as control information

– Q-type/A-type/Q-constraints/…
– Local Wh-grammars + dependency structure for initial (underspecified) Q-info
– Tree-traversal for determining more specific Q-info

  • Non-local syntactic constraints
  • Coarse-grained lexical semantic consistency checks
  • Semantic types for main noun/verb lemmas

✩ Q-type specific Strategy selection

[Component diagram: the Q-Parser maps the question to Q-objects; the QA-Controller selects Q-strategies and invokes handlers (Abbrev Handler, Sentence Handler, NE-term Handler, Relation Handler, WebQA) over the text corpus and the off-line stores (NE-Store, Abbrev.-Store, Sentence-Index, <NE,NP>-Store) to drive A-Extraction and produce the answer.]
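
A minimal sketch of the coarse expected-answer-type assignment described above. The wh-word table is a toy stand-in for the local Wh-grammars and tree traversal; the real system additionally consults the dependency structure and the lexical-semantic types of the main noun and verb.

```python
# Coarse mapping from the wh-phrase to the expected answer type.
WH_TO_ATYPE = {
    "who": "PERSON", "whom": "PERSON",
    "where": "LOCATION",
    "when": "DATE",
    "how many": "NUMBER", "how much": "NUMBER",
}

def expected_answer_type(question: str) -> str:
    q = question.lower()
    # Try longer wh-phrases first so "how many" wins over a bare prefix.
    for wh, atype in sorted(WH_TO_ATYPE.items(), key=lambda kv: -len(kv[0])):
        if q.startswith(wh):
            return atype
    # "what/which X": the semantic type of the head noun X would decide;
    # this is where tree traversal and consistency checks kick in.
    return "UNDERSPECIFIED"

print(expected_answer_type("Whom is David Beckham married to?"))  # PERSON
print(expected_answer_type("How many people live in Berlin?"))    # NUMBER
```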

SLIDE 9


Temporal Question Strategies*

*The implementation was done by Rob Basten as part of his Master's thesis Answering Open Domain Temporally Restricted Questions in a Multi-Lingual Context, DFKI & Univ. of Twente, NL

Examples (1 & 3 from Clef):

  • What nearly caused the cancellation or postponement of the 1996 European Football Championship?
  • Name a German tennis player who won Wimbledon between 1980 and 1990.
  • Whom was Michael Jackson married to before he married Debbie Rowe?

Core idea: process questions of this kind on the basis of our existing technology, following a divide-and-conquer approach:

✩ Initial/fallback strategy

– The existing methods for handling factoid questions are used without change to get initial answer candidates.
– In a follow-up step, the temporal restriction from the question is used to check the answers' temporal consistency.

✩ Question decomposition

– A temporally restricted question Q is decomposed into two sub-questions:

  • one referring to the "timeless" proposition of Q, and
  • the other to the temporally restricting part.

Who was the German Chancellor when the Berlin Wall was opened?
⇒ Who was the German Chancellor? & When was the Berlin Wall opened?

✩ Answer fusion

– The answers to both sub-questions are searched for independently,
– but checked for consistency in a follow-up answer fusion step:
– the explicit temporal restriction found is used to constrain the "timeless" proposition.
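
The decomposition-plus-fusion idea for the Chancellor example, as a self-contained sketch. The fact base, date intervals, and function names are invented for illustration; in the real system each sub-question is answered by the existing factoid machinery.

```python
from datetime import date

# Toy fact base for the "timeless" sub-question, with validity intervals.
CHANCELLORS = [
    ("Helmut Schmidt", date(1974, 5, 16), date(1982, 10, 1)),
    ("Helmut Kohl",    date(1982, 10, 1), date(1998, 10, 27)),
]

def answer_timeless():
    """Sub-question 1: 'Who was the German Chancellor?' -> all candidates."""
    return CHANCELLORS

def answer_temporal():
    """Sub-question 2: 'When was the Berlin Wall opened?'"""
    return date(1989, 11, 9)

def fuse():
    """Answer fusion: keep candidates whose interval covers the restriction."""
    t = answer_temporal()
    return [name for name, start, end in answer_timeless() if start <= t <= end]

print(fuse())  # ['Helmut Kohl']
```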

SLIDE 10


Cross-linguality in QA

[Diagram: the QA-Controller's strategy selector places cross-linguality either before question analysis (EN-DE, "Before Method") or after it (DE-EN, "After Method"). Data flows as Q-objects, strings, data-storage queries, sentences, possible answers, and answers through the Analysis, Retrieval, Extraction, Selection, and Credibility components.]

SLIDE 11


Cross-lingual QA strategies developed in Quetal

After Method DE-EN

  • Question processing -> QObject
  • Question translation + alignment
  • QObject alignment

[Diagram: the German question is parsed (query parsing) and translated by online MT supported by a language model via pCFG; the Q-focus and NEs are expanded and disambiguated (WSD), and the German QObject is aligned with the translation to yield the English QObject.]

Before Method EN-DE

  • Question translation
  • Translations processing -> QObjects
  • QObject selection

[Diagram: the English question is translated into German candidates Q1, Q2, Q3 by external MT services; the SMES Wh-parser maps them to QO1, QO2, QO3; confidence selection picks the best QO, which is passed on to answer processing.]
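
The Before Method lends itself to a short sketch: translate with several MT services, parse each candidate, keep the question object the parser scores highest. The MT outputs and the additive confidence heuristic below are invented; the actual system relies on the SMES Wh-parser's own analysis quality.

```python
# Competing translations of "Whom is David Beckham married to?" -- the
# second and third are deliberately broken, as MT output often is.
TRANSLATIONS = [
    "Mit wem ist David Beckham verheiratet?",   # MT service 1
    "Wer ist David Beckham verheiratet?",       # MT service 2
    "David Beckham verheiratet mit wem?",       # MT service 3
]

def parse(translation: str):
    """Toy stand-in for the Wh-parser: returns (qobject, confidence)."""
    tokens = translation.rstrip("?").split()
    score = 0.0
    if tokens and tokens[0].lower() in ("mit", "wer", "wem", "wann", "wo"):
        score += 0.5                     # question is wh-initial
    if "verheiratet" in tokens:
        score += 0.3                     # main verb recognized
    if [t.lower() for t in tokens[:2]] == ["mit", "wem"]:
        score += 0.2                     # complete wh-PP recognized
    return {"surface": translation}, score

def best_qobject(translations):
    """Confidence selection over the competing question objects."""
    return max((parse(t) for t in translations), key=lambda pair: pair[1])[0]

print(best_qobject(TRANSLATIONS)["surface"])
# -> "Mit wem ist David Beckham verheiratet?" (the well-formed one wins)
```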

SLIDE 12


SAB Recommendation

The SAB recommended taking into account the dimension of credibility of the answer.

✩ There is very little prior work in this area of textual QA, e.g., Lita et al. (CMU), AAAI-2005

✩ Credibility in QA:

– Provide criteria about the assumed quality of an answer
– Determine the credibility of the answer source
– Incorporate a measure of credibility in computing the answer confidence

✩ Examples of meta information

– Table of trusted links per question topic
– Information from the URL (last update, semantic relationship of link name with answers)
– Textual information (style, fingerprints, discourse markers)


SLIDE 13


Our starting point

✩ It is known that redundancy plays an important role for Web-based/textual QA

– Answers get a higher rank if they are mentioned more often in different documents.

✩ Seen this way, redundancy is already a measure of credibility

✩ But how can further information that supports an answer be collected?

– Use a list of trusted links to filter document sources
– Select the document that best supports the answer
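
Redundancy as a baseline credibility signal fits in a few lines: count how many distinct documents support each answer, optionally filtering by a trusted-link list. The domains and hits below are made up for illustration.

```python
from collections import Counter

TRUSTED = {"news.example.de", "encyclopedia.example.org"}  # hypothetical list

def rank_by_redundancy(answer_hits):
    """answer_hits: (answer, document_id, source_domain) extraction events."""
    supported = {(ans, doc) for ans, doc, dom in answer_hits if dom in TRUSTED}
    counts = Counter(ans for ans, doc in supported)  # distinct documents only
    return counts.most_common()

hits = [
    ("Posh Spice", "doc1", "news.example.de"),
    ("Posh Spice", "doc2", "encyclopedia.example.org"),
    ("Posh Spice", "doc2", "encyclopedia.example.org"),  # duplicate, ignored
    ("Victoria",   "doc3", "spam.example.com"),          # untrusted, filtered
]
print(rank_by_redundancy(hits))  # [('Posh Spice', 2)]
```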

SLIDE 14


Two methods have been investigated

✩ Google’s total frequency counts

– For answers extracted from a (small) text corpus, exploit their external Web redundancy

✩ More general model that integrates

– Table of trusted links
– Automatic determination of credibility for Web document sources

SLIDE 15


Web-based Answer Validation

✩ Assume answers have been extracted from some text corpus

✩ Web-based answer plausibility check

– direct_answer_string := question + answer
– Google's Total Estimated Counts (TEC) for ranking answer candidates

✩ Presupposes independence between answer candidates ⇒ the method seems to be useful (cf. Clef 2005)

✩ In case of a "hidden semantic relationship" (e.g., is-a), the method is not suited/sufficient.

Q: What is the capital of Germany?
AC: Berlin, New York

"Berlin" "capital of Germany" → TEC=331
"New York" "capital of Germany" → TEC=75
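
A sketch of this TEC ranking, with a placeholder for the search-engine call. The tec() function and the stored counts (taken from the slide's example) are stand-ins; no real Google API is invoked.

```python
# Counts from the slide's example, keyed by (answer, question phrase).
FAKE_TEC = {
    ("Berlin", "capital of Germany"): 331,
    ("New York", "capital of Germany"): 75,
}

def tec(answer: str, question_phrase: str) -> int:
    """Placeholder for a Total Estimated Counts query: '"answer" "phrase"'."""
    return FAKE_TEC.get((answer, question_phrase), 0)

def validate(candidates, question_phrase):
    """Rank answer candidates by joint answer + question-phrase hit counts."""
    return sorted(candidates, key=lambda a: tec(a, question_phrase),
                  reverse=True)

print(validate(["New York", "Berlin"], "capital of Germany"))
# -> ['Berlin', 'New York']
```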

SLIDE 16


General Model

[Diagram: a Web-based QA system returns {answer + document} pairs for an NL question; the credibility checker intersects {answers consistent with trusted links} (from a per-question-topic table of trusted links, maintained via user feedback) with {answers with the most supporting documents}.]

If an answer does not come via trusted links, trusted documents are determined automatically ("credibility assessment"). Currently used checkers:

1. LSA + URL content
2. Update info of the URL
3. Discourse markers
4. W3C HTML quality
5. Spelling

Current major problem: how to evaluate credibility checks? Plausible: via user feedback.
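
The intersection step of the general model, sketched with invented data. How the checker cascade scores untrusted sources and how ties are broken is left open here; the trusted-link table and URLs are placeholders.

```python
def credibility_check(answer_docs, trusted_links):
    """answer_docs: answer -> set of supporting document URLs."""
    via_trusted = {a for a, docs in answer_docs.items() if docs & trusted_links}
    best_supported = max(answer_docs, key=lambda a: len(answer_docs[a]))
    # Intersect the two criteria; if the result is empty, fall back to
    # document support alone -- the point where the checker cascade (LSA,
    # update info, discourse markers, HTML quality, spelling) would take over.
    agreed = via_trusted & {best_supported}
    return agreed or {best_supported}

docs = {
    "Berlin": {"https://trusted.example.org/de", "https://a.example.com"},
    "Bonn":   {"https://b.example.com"},
}
print(credibility_check(docs, {"https://trusted.example.org/de"}))
# -> {'Berlin'}
```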

SLIDE 17


What information to consider?

Rank   Comment Topic                    Percent (of 2440 comments)
 1     Design Look                      46.1
 2     Information Design/Structure     28.5
 3     Information Focus                25.1
 4     Company Motive                   15.5
 5     Information Usefulness           14.8
 6     Information Accuracy             14.3
 7     Name Recognition & Reputation    14.1
 8     Advertising                      13.8
 9     Information Bias                 11.6
10     Writing Tone                      9.0
11     Identity of Site Operator         8.8
12     Site Functionality                8.6
13     Customer Service                  6.4
14     Past Experience with Site         4.6
15     Information Clarity               3.7
16     Performance on Test by User       3.6
17     Readability                       3.6
18     Affiliations                      3.4

Fogg et al. 2002 “How do people evaluate a Web Site’s credibility?”

Corresponding checkers: semantic checker, spelling/grammar checker, discourse checker, site server (update info), W3C HTML quality, list of trusted links.

SLIDE 18


QA@Clef 2005

✩ Motivation for participation

– External evaluation
– Fosters development of software infrastructure
– International research community
– It's fun

✩ Further increase in participants and languages

– 24 groups
– 9 source/10 target languages (8 monolingual/73 cross-lingual tasks)

✩ Task

– Corpus: newspaper articles from 1994/1995; for DE/EN ~500 MB
– 200 questions: 120 factoid (F), 50 definitions (D), 30 temporally restricted (T), 20 NIL
– Return a single best exact answer for each question

SLIDE 19

DFKI Results for Clef-2005

Run (200 questions)   Type            # Right   % Right   Wrong   IneXact   % Right F   % Right D   % Right T
dfki051dede           monolingual     87        43.50     100     13        35.83       66.00       36.67
dfki052dede*          monolingual     54        27.00     127     19        15.00       52.00       33.33
dfki051deen           cross-lingual   51        25.50     141      8        18.18       50.00       13.79
dfki051ende           cross-lingual   46        23.00     141     12        17.67       50.00        3.33
dfki052ende*          cross-lingual   31        15.50     159      8         8.33       42.00        —

* dfki052xxde = dfki051xxde + WebValidation

We achieved the best results for the target languages:

  • German (one other group DE2DE: 36%; one other group EN2DE: 5%)
  • English (12 runs; 2nd system: 23.5%, 3rd system: 19%)

For comparison, DFKI@QA@Clef-2004: DE2DE: 25.38%, DE2EN: 23.5%, EN2DE: no run

SLIDE 20


Some remarks concerning the performance decrease when using Web validation …

✩ Error sources:

– Lack of redundancy, given the limited number of German Web pages
– The correct Clef answer might be "spoiled down" (outranked by Web counts)
– The timeline of the Clef corpus (1994/1995) is problematic for validating questions that are not "historically" anchored
– Errors in the translation of complex and long questions had a negative effect on the recall of the Web search (EN2DE)

✩ However, after detailed analysis of German runs:

– 51 different assignments between the runs without & with validation
– 13 questions (of which 8 are definition questions) are now answered correctly
– 28 questions are now answered wrongly, but
– 14 of them only because of the different timeline

✩ Needed:

– Integration of contextual and situational information into the QA cycle, taking into account user feedback
  ⇒ HyLaP, QALL-ME