slide-1
SLIDE 1

Overview of the Recognizing Inference in TExt (RITE-2) at NTCIR-10

Yotaro Watanabe (Tohoku University), Yusuke Miyao (NII), Junta Mizuno (Tohoku University), Tomohide Shibata (Kyoto University), Hideki Shima (CMU), Cheng-Wei Lee (Academia Sinica), Chuan-Jie Lin (National Taiwan Ocean University), Teruko Mitamura (CMU), Hiroshi Kanayama (IBM Research), Kohichi Takeda (IBM Research), Shuming Shi (MSRA), Noriko Kando (NII)

RITE-2: Recognizing Inference in TExt @ NTCIR-10

slide-2
SLIDE 2

Overview of RITE-2

  • RITE-2 is a generic benchmark task that addresses the common semantic inference required by various NLP/IA applications

The 10th NTCIR Conference

t1: The Kamakura Shogunate was considered to have begun in 1192, but the current leading theory is that it was effectively formed in 1185.
t2: The Kamakura Shogunate began in Japan in the 12th century.

Can t2 be inferred from t1? (entailment?)

slide-3
SLIDE 3

Motivation

  • Natural Language Processing (NLP) / Information Access (IA) applications

Ø Question Answering, Information Retrieval, Information Extraction, Text Summarization, Automatic Evaluation for Machine Translation, Complex Question Answering

  • Current entailment recognition systems are not yet mature enough

Ø The highest accuracy on the Japanese BC subtask in NTCIR-9 RITE was only 58%
Ø There is still ample room to advance entailment recognition technologies by addressing the task


slide-4
SLIDE 4

RITE vs. RITE-2

[Figure: "Pyramid of entailment recognition technology". From foundation-oriented to application-oriented layers: linguistic phenomena-level inference (negation, modification, coordination, quantification, case alternation, phrase relations, lexical relations; UnitTest), sentence-level inference (BC: entailment?; MC: paraphrase? entailment? contradiction?), multiple sentence-level inference over documents with world knowledge (Search), and applications (QA, IR, bio apps). RITE covered the foundation-oriented layers; RITE-2 extends toward the application-oriented ones.]


slide-5
SLIDE 5

RITE-2 Subtasks


slide-6
SLIDE 6

BC and MC subtasks

  • BC subtask

Ø Entailment (t1 entails t2) or Non-Entailment (otherwise)

  • MC subtask

Ø Bi-directional Entailment (t1 entails t2 & t2 entails t1) Ø Forward Entailment (t1 entails t2 & t2 does not entail t1) Ø Contradiction (t1 contradicts t2 or cannot be true at the same time) Ø Independence (otherwise)


t1: The Kamakura Shogunate was considered to have begun in 1192, but the current leading theory is that it was effectively formed in 1185.
t2: The Kamakura Shogunate began in Japan in the 12th century.

BC: Y or N
MC: B, F, C or I
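As an illustration of how the two subtasks relate (the mapping below is my reading of the label definitions above, not an official tool), an MC label determines the corresponding BC label, since both B and F mean that t1 entails t2:

```python
# Hypothetical helper: collapse a 4-way MC label into the binary BC label.
# B (bi-directional) and F (forward) both mean t1 entails t2 -> Y;
# C (contradiction) and I (independence) mean non-entailment -> N.
MC_TO_BC = {"B": "Y", "F": "Y", "C": "N", "I": "N"}

def bc_label_from_mc(mc_label: str) -> str:
    return MC_TO_BC[mc_label]
```

The reverse direction does not hold: a BC label of Y cannot tell B and F apart.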

slide-7
SLIDE 7

Development of BC and MC data


  • Retrieve pairs of sentences
  • Edit the pairs if needed
  • For each example, 5 annotators assigned a semantic label; an example was accepted only if 4 or more annotators assigned it the same label

RITE-2 BC, MC data
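The acceptance rule above can be sketched as follows (a minimal illustration; function and variable names are mine):

```python
from collections import Counter

def accept_example(labels, min_agree=4):
    """Apply the RITE-2 BC/MC filtering rule: keep an example only if
    at least `min_agree` of its annotators assigned the same label.
    Returns the agreed label, or None if the example is discarded."""
    label, count = Counter(labels).most_common(1)[0]
    return label if count >= min_agree else None
```

A 4-of-5 majority is accepted, while a 3-2 split is discarded; this strict filtering is what slide 24 credits for part of the accuracy jump over NTCIR-9.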

slide-8
SLIDE 8

Entrance Exam subtasks (Japanese only)

The 10th NTCIR Conference

8

Entrance exam problem:

t1: スレイマン1世は数多くの軍事的成功を収めてオスマン帝国を最盛期に導いた。(Suleiman I achieved many military successes and led the Ottoman Empire to its peak.)
t2: オスマン帝国ではスレイマン1世の時代が最盛期であった。(The Ottoman Empire's peak was during the reign of Suleiman I.)

Source: National Center Test for University Admission (Daigaku Nyushi Center Shiken)

slide-9
SLIDE 9

Entrance Exam subtask: BC and Search

  • Entrance Exam BC

Ø Binary-classification problem ( Entailment or Non- entailment ) Ø t1 and t2 are given

  • Entrance Exam Search

Ø Binary-classification problem ( Entailment or Non- entailment ) Ø t2 and a set of documents are given

v Systems are required to search sentences in Wikipedia and textbooks to decide semantic labels


slide-10
SLIDE 10

UnitTest (Japanese only)

  • Motivation

Ø Evaluate how systems can handle linguistic phenomena that affects entailment relations

  • Task definition

Ø Binary classification problem (same as BC subtask)


Category: modifier
t1: In the Meiji Constitution, a legally clear distinction between the Imperial Family and Japan had been allowed.
t2: In the Meiji Constitution, a distinction between the Imperial Family and Japan had been allowed.

Category: meronymy
t1: In the Meiji Constitution, a distinction between the Imperial Family and Japan had been allowed.
t2: In the Meiji Constitution, a distinction between the Emperor and Japan had been allowed.

slide-11
SLIDE 11


Development of the UnitTest data

  • Procedure

Ø Sentence pairs {<t1, t2>} were sampled from the BC subtask data Ø An annotator transformed each sampled sentence pair from t1 to t2 by breaking down the pair in a set of linguistic phenomena

  • [Kaneko+ 13] (to appear in ACL 2013)


[Diagram: BC subtask data → sampling → sampled sentence pairs → break down → UnitTest data (e.g., pair 1: <t1, t2> is decomposed into 1.1: <t1, t2>, 1.2: <t1, t2>, …)]

slide-12
SLIDE 12

Distribution of the linguistic phenomena in UnitTest data


Phenomenon             dev  test
lexical: synonymy       10    10
lexical: hypernymy       6     3
lexical: meronymy        1     1
lexical: entailment      1     -
phrase: synonymy        45    35
phrase: hypernymy        3     -
phrase: entailment      28    45
case alternation         9     7
modifier                30    42
nominalization           2     1
coreference             12     4
clause                  29    14
relative clause         10     8
transparent head         2     1
list                    11     3
quantity                 1     -
scrambling              16    15
inference                4     2
implicit relation       10    18
apposition               3     1
temporal                 2     1
spatial                  4     1
disagree: lexical        5     2
disagree: phrase        25    25
disagree: modality       2     1
disagree: spatial        1     1
disagree: temporal       -     1
Total                  272   241

slide-13
SLIDE 13

RITE4QA (Chinese only)

  • Motivation

Ø Can an entailment recognition system rank a set of unordered answer candidates in QA?

  • Dataset

Ø Developed from NTCIR-7 and NTCIR-8 CLQA data

v t1: answer-candidate-bearing sentence v t2: a question in an affirmative form

  • Requirements

Ø Generate confidence scores for ranking process


slide-14
SLIDE 14

Evaluation Metrics

  • Macro F1 and Accuracy (BC, MC, ExamBC, ExamSearch and UnitTest)

  • Correct Answer Ratio (Entrance Exam)

Ø Y/N labels are mapped into answer selections, and the accuracy of those answers is calculated

  • Top1 and MRR (RITE4QA)


$$\mathrm{MacroF1} = \frac{1}{|C|} \sum_{c \in C} \mathrm{F1}_c$$

$$\mathrm{Top1} = \frac{1}{|Q|} \sum_{i=1}^{|Q|} [\text{top answer for } q_i \text{ is correct}]$$

$$\mathrm{MRR} = \frac{1}{|Q|} \sum_{i=1}^{|Q|} \frac{1}{\mathrm{rank}_i}$$

$$\mathrm{Accuracy} = 100 \times \frac{N_{\mathrm{correct}}}{N_{\mathrm{examples}}}$$
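Under the definitions above, the metrics can be implemented directly (a sketch; the official rite2eval tool, not this code, defines the exact behavior, e.g. tie handling):

```python
def macro_f1(f1_per_class):
    # Unweighted mean of per-class F1 scores.
    return sum(f1_per_class) / len(f1_per_class)

def top1(ranks):
    # ranks[i] is the rank of the correct answer for question i (1 = top).
    return sum(1 for r in ranks if r == 1) / len(ranks)

def mrr(ranks):
    # Mean reciprocal rank of the correct answer.
    return sum(1.0 / r for r in ranks) / len(ranks)

def accuracy(n_correct, n_examples):
    return 100.0 * n_correct / n_examples
```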

slide-15
SLIDE 15

Organization Effort

slide-16
SLIDE 16
  • We provided pre-processed data and tools to lower barriers to entry

Generic Framework


[Diagram: sentences/documents → Linguistic Analyzer → Entailment Recognizer → Evaluator → evaluation results (accuracies, F1-values, …)]

(1) Provided pre-processed data
(2) Provided a fundamental entailment recognition tool
(3) Provided RITE-2 evaluators

slide-17
SLIDE 17
  • Morphological and syntactic analysis

Ø MeCab [Kudo+ 05] + CaboCha [Kudo+ 02] Ø Juman + KNP Ø Provided as XML data

  • Search results for the Exam Search subtask

Ø Used TSUBAKI [Shinzato+ 11] to provide search results Ø Provided at most five search results extracted from Wikipedia and textbooks


(1) Pre-processed data

slide-18
SLIDE 18

(2) A fundamental entailment recognition tool (Baseline tool)

  • Features

Ø a machine learning-based entailment recognition system Ø simple features are implemented (Feature Extractor)

v Bag-of- {content words, aligned chunks, head words} v Ratio of aligned {content words, aligned chunks}

Ø new features can be easily added Ø outputs files compatible with the format of the RITE-2 formal run


[Diagram: Baseline Tool — instance (XML) → Feature Extractor → Machine Learning → relation (Y or N / B, F, C or I)]
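A toy version of the overlap features (illustrative only; the actual Feature Extractor works on aligned chunks and head words from the parsed XML, which is omitted here):

```python
def overlap_features(t1_words, t2_words):
    """Bag-of-content-words overlap and the ratio of t2 covered by t1,
    the simplest of the baseline tool's feature types."""
    t1, t2 = set(t1_words), set(t2_words)
    aligned = t1 & t2
    return {
        "overlap_count": len(aligned),
        "coverage_ratio": len(aligned) / len(t2) if t2 else 0.0,
    }
```

Feature dictionaries like this would then be fed to a standard classifier to predict Y/N or B/F/C/I.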

slide-19
SLIDE 19

(3) RITE-2 Evaluators

  • Generic Evaluator (all of the subtasks)
  • Additional Evaluator (Entrance Exam)

Ø Calculate correct answer ratio


$ java -jar rite2eval.jar -g RITE2_JA_test_bc.xml -s output_bc.txt

  • Example output:

Label    #    Precision          Recall             F1
N      354    60.18 (204/339)    57.63 (204/354)    58.87
Y      256    44.65 (121/271)    47.27 (121/256)    45.92

Accuracy: 53.28 (325/610)
Macro F1: 52.40

Confusion Matrix
gold \ sys     N     Y
N            204   150
Y            135   121
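The numbers in the sample output can be reproduced from the confusion matrix alone (a sketch, not the evaluator's actual code):

```python
def scores_from_confusion(matrix, labels):
    """Per-label precision/recall/F1 plus accuracy and macro F1 from a
    gold-by-system confusion matrix (rows = gold, columns = system)."""
    n = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(labels)))
    result = {"accuracy": 100.0 * correct / n}
    f1s = []
    for i, label in enumerate(labels):
        tp = matrix[i][i]
        gold_total = sum(matrix[i])                  # row sum
        sys_total = sum(row[i] for row in matrix)    # column sum
        p = tp / sys_total if sys_total else 0.0
        r = tp / gold_total if gold_total else 0.0
        f1 = 2 * p * r / (p + r) if p + r else 0.0
        result[label] = {"P": 100 * p, "R": 100 * r, "F1": 100 * f1}
        f1s.append(100 * f1)
    result["macro_f1"] = sum(f1s) / len(f1s)
    return result

# Confusion matrix from the sample run above (gold rows N, Y):
scores = scores_from_confusion([[204, 150], [135, 121]], ["N", "Y"])
```

This recovers Accuracy 53.28, Macro F1 52.40, and the per-label F1 values shown above.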

slide-20
SLIDE 20

RITE-2 Formal Run Participation

slide-21
SLIDE 21

Number of submissions


NTCIR-10 RITE-2    JA   CT   CS   Total
BC                 41   20   21     82
MC                 20   21   21     62
Exam BC            31    -    -     31
Exam Search         4    -    -      4
UnitTest           14    -    -     14
RITE4QA             -   12   10     22
Total             110   53   52    215

NTCIR-9 RITE       JA   CT   CS   Total
Total              65   70   77    212

slide-22
SLIDE 22

Countries/Regions of Participants


Japan: 15 groups
Taiwan: 8 groups
China: 3 groups
India: 1 group
Ireland: 1 group

slide-23
SLIDE 23

Formal Run Results

slide-24
SLIDE 24

BC ( Japanese)


[Chart: Macro F1 and Accuracy for all BC (Japanese) runs. Best system: Macro F1 80.49, Accuracy 81.64]

  • The best system achieved over 80% accuracy (the highest score in the BC subtask at NTCIR-9 RITE was 58%)

  • The difference is caused by

Ø Advancement of entailment recognition technologies
Ø Strict data filtering during data development


slide-25
SLIDE 25

BC (Traditional/Simplified Chinese)


[Charts: Macro F1 and Accuracy for BC runs in Traditional and Simplified Chinese. Top scores: Macro F1 67.14 / Accuracy 67.76 and Macro F1 73.84 / Accuracy 74.65]

  • The top scores are almost the same as those in NTCIR-9 RITE

slide-26
SLIDE 26

MC (Japanese)

[Chart: Macro F1 and Accuracy for all MC (Japanese) runs. Best system: Macro F1 59.96, Accuracy 69.53]

  • The top system achieved approx. 70% accuracy (the highest accuracy in NTCIR-9 RITE was only 51%)

slide-27
SLIDE 27

MC (Japanese, F1 for each label)

[Chart: per-label F1 for all MC (Japanese) runs. Best F1: Forward Entailment 76.47, Bi-directional Entailment 69.29, Contradiction 28.57]

Difficulty: Contradiction >>> Bi-directional > Forward Ent.

slide-28
SLIDE 28

MC (Traditional/Simplified Chinese)

[Charts: Macro F1 and Accuracy for MC runs in Traditional and Simplified Chinese. Best TC: Macro F1 46.32, Accuracy 51.99; best SC: Macro F1 56.82, Accuracy 61.08]

  • The top system in TC achieved approx. 52% of accuracy
  • The top system in SC achieved over 60% of accuracy
slide-29
SLIDE 29

Exam BC (Japanese)

[Chart: Macro F1, Accuracy and Correct Answer Ratio for Exam BC (Japanese) runs. Best: Macro F1 67.15, Accuracy 70.31; highest Correct Answer Ratio: 57.41]

  • If candidate sentences from the knowledge sources (Wikipedia and textbooks) are already obtained, the best system can answer more than 57% of exam questions correctly

slide-30
SLIDE 30

Exam Search

  • The best system could answer 34% of questions correctly in the search task setting


[Chart: Macro F1, Accuracy and Correct Answer Ratio for Exam Search runs. Best: Macro F1 58.12, Accuracy 64.51; highest Correct Answer Ratio: 34.26]

slide-31
SLIDE 31

UnitTest

[Chart: Macro F1, Accuracy, Y-F1 and N-F1 for UnitTest runs. Best: Macro F1 77.77, Accuracy 90.87, N-F1 60.71]

  • Since almost all of the examples are Y (Y: 219, N: 29), improving performance at detecting "N" is important

  • Due to limited space, per-category performance cannot be shown here

slide-32
SLIDE 32

RITE4QA (Traditional/Simplified Chinese)

[Charts: Top1 accuracy and MRR for RITE4QA runs in Traditional and Simplified Chinese. Top scores: Top1 27.33 / MRR 34.57 and Top1 28.00 / MRR 33.77]

slide-33
SLIDE 33

Review of Participants’ Systems

slide-34
SLIDE 34

Participants' approaches

  • Category

Ø Statistical (50%) Ø Hybrid (27%) Ø Rule-based (23%)

  • Fundamental approach

Ø Overlap-based (77%) Ø Alignment-based (63%) Ø Transformation-based (23%)


slide-35
SLIDE 35

Summary of types of information explored

Ø Character/word overlap (85%) Ø Syntactic information (67%) Ø Temporal/numerical information (63%) Ø Named entity information (56%) Ø Predicate-argument structure (44%) Ø Entailment relations (30%) Ø Polarity information (7%) Ø Modality information (4%)


slide-36
SLIDE 36

Summary of Resources Explored

  • Japanese

Ø Wikipedia (10) Ø Japanese WordNet (9) Ø ALAGIN Entailment DB (5) Ø Nihongo Goi-Taikei (2) Ø Bunruigoihyo (2) Ø Iwanami Dictionary (2)

  • Chinese

Ø Chinese WordNet (3) Ø TongYiCi CiLin (3) Ø HowNet (2)


slide-37
SLIDE 37

Advanced approaches

  • Logical approaches

Ø Dependency-based Compositional Semantics (DCS) [BnO], Markov Logic [EHIME], Natural Logic [THK]

  • Alignment

Ø GIZA [CYUT], ILP [FLL], Labeled Alignment [bcNLP, THK]

  • Search Engine

Ø Google and Yahoo [DCUMT]

  • Deep Learning

Ø RNN language models [DCUMT]

  • Probabilistic Models

Ø N-gram HMM [DCUMT], LDA [FLL]

  • Machine Translation

Ø [ JUNLP, JAIST, KC99]


slide-38
SLIDE 38

Oral Presentations (6/20 13:00-)

  • [DCUMT] Tsuyoshi Okita. Local Graph Matching with Active Learning for Recognizing Inference in Text at NTCIR-10.
  • [SKL] Shohei Hattori and Satoshi Sato. Team SKL's Strategy and Experience in RITE2.
  • [BnO] Ran Tian, Yusuke Miyao, Takuya Matsuzaki and Hiroyoshi Komatsu. BnO at NTCIR-10 RITE: A Strong Shallow Approach and an Inference-based Textual Entailment Recognition System.
  • [FLL] Takuya Makino, Seiji Okajima and Tomoya Iwakura. FLL: Local Alignments based Approach for NTCIR-10 RITE-2.
  • [KDR] Daniel Andrade, Masaaki Tsuchida, Takashi Onishi and Kai Ishikawa. Detecting Contradiction in Text by Using Lexical Mismatch and Structural Similarity.
  • [NTTD] Megumi Ohki, Takashi Suenaga, Daisuke Satoh, Yuji Nomura and Toru Takaki. Expanded Dependency Structure based Textual Entailment Recognition System of NTTDATA for NTCIR10-RITE2.
  • [IASL] Cheng-Wei Shih, Chad Liu, Cheng-Wei Lee and Wen-Lian Hsu. IASL RITE System at NTCIR-10.
  • [WHUTE] Han Ren, Hongmiao Wu, Chen Lv, Donghong Ji and Jing Wan. The WHUTE System in NTCIR-10 RITE Task.
  • [bcNLP] Xiao-Lin Wang, Hai Zhao and Bao-Liang Lu. BCMI-NLP Labeled-Alignment-Based Entailment System for NTCIR-10 RITE-2 Task.
  • [IMTKU] Chun Tu, Min-Yuh Day, Shih-Jhen Huang, Hou-Cheng Vong and Sih-Wei Wu. IMTKU Textual Entailment System for Recognizing Inference in Text at NTCIR-10 RITE2.


slide-39
SLIDE 39

Conclusion

  • NTCIR-10 RITE-2

Ø Benchmark task of evaluating systems that infer semantic relations between sentences Ø Two subtasks were added

v Exam Search: provided more realistic task setting v UnitTest: enabled us fine-grained evaluation and analysis of RITE systems

Ø Organization Efforts

v Provided pre-processed data (XML), Baseline tool and Evaluation tools

Ø 28 teams participated! (NTCIR-9 RITE: 24 teams) Ø Diverse advanced approaches and resources were explored


RITE-2 was a success!