Overview of the 7th NTCIR Workshop




1. Overview of the 7th NTCIR Workshop. Noriko Kando, National Institute of Informatics, Japan. http://research.nii.ac.jp/ntcir/ kando (at) nii.ac.jp. With thanks to Tetsuya Sakai for the slides.

2. NTCIR: NII Test Collection for Information Retrieval. A research infrastructure for evaluating IA: a series of evaluation workshops designed to enhance research in information-access technologies by providing an infrastructure for large-scale evaluations, comprising data sets, evaluation methodologies, and a forum. The project started in late 1997 and runs once every 18 months (NTCIR-1 through NTCIR-7). The data sets (test collections, or TCs) cover scientific documents, news, patents, and the web, in Chinese, Korean, Japanese, and English. Tasks: IR (monolingual and cross-lingual tasks, patents, web, QA), summarization, trend information, patent maps, opinion analysis, and text mining. These are community-based research activities. NTCIR-7 participants: 82 groups from 15 countries. [Chart: number of participating groups and countries, NTCIR-1 through NTCIR-7.]

3. Information access (IA). The whole process of making information from a vast collection of documents usable by users; for example, IR, text summarization, QA, text mining, and clustering. Human assessments serve as the success criteria.

4. Focus of NTCIR. New challenges: lab-type IR tests; the intersection of IR and NLP; Asian languages and cross-language search; a variety of genres; parallel/comparable corpora; realistic evaluation and user tasks. The goal: to make the information in documents more usable for users! NTCIR is also a forum for researchers: idea exchange and discussion/investigation of evaluation methods and metrics.

5. Tasks (research areas) of past NTCIR workshops (1st through 6th): Japanese IR (news, scientific); cross-lingual IR; patent retrieval (including patent map/classification); web retrieval (navigational, geographic); result classification; term extraction; question answering; information access dialogue; summarization (including metrics); cross-lingual text summarization; trend information; opinion analysis. [Table: which tasks ran at which of the six workshops.]

6. NTCIR-7 Clusters. Cluster 1, Advanced CLIA: Complex CLQA (Chinese, Japanese, English) and IR for QA (Chinese, Japanese, English). Cluster 2, User-Generated: Multilingual Opinion Analysis. Cluster 3, Focused Domain (Patent): Patent Translation (English -> Japanese) and Patent Mining (paper -> IPC). Cluster 4, MuST: Multi-modal Summarization of Trends, with a Visualization Challenge.

7. NTCIR-7 is made up of: Cluster 1: Advanced Cross-lingual Information Access (ACLIA) = CCLQA + IR4QA. Cluster 2: Multilingual Opinion Analysis Task (MOAT) + CLIRB. Cluster 3: Focused Domains = PATMT + PATMN. Multimodal Summarization of Trend information (MuST). The 2nd International Workshop on Evaluating Information Access (EVIA).

8. Evaluation Workshops. "Evaluation" here is not a competition and not an exam! A workshop constructs a common data set usable for experiments and provides participants with the data sets and unified evaluation procedures; each participating research group conducts experiments with various approaches and can take part with its own purpose. Successful examples: TREC, CLEF, DUC, INEX, TAC, and FIRE (new!). These are community-based activities, and their implications are various.

9. IA Systems Evaluation (Cleverdon & Keen, 1966). Engineering level: efficiency. Input level: e.g., exhaustivity, quality, and novelty of the database. Process level: effectiveness, e.g., recall and precision. Output level: display of output. User level: e.g., the effort users must expend. Social level: e.g., importance.
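The process-level metrics named above, recall and precision, are the workhorses of this kind of evaluation. The following is a minimal sketch, not taken from the slides: `run` and `qrels` are hypothetical names for a system's ranked list of document IDs and the set of documents judged relevant for one topic.

```python
# Minimal sketch of process-level effectiveness metrics (recall, precision).
# `run` and `qrels` are hypothetical inputs: a ranked list of document IDs
# returned by a system, and the set of documents judged relevant for a topic.

def precision_at_k(run, qrels, k):
    """Fraction of the top-k retrieved documents that are relevant."""
    retrieved = run[:k]
    return sum(1 for doc in retrieved if doc in qrels) / k

def recall_at_k(run, qrels, k):
    """Fraction of all relevant documents found in the top k."""
    retrieved = run[:k]
    return sum(1 for doc in retrieved if doc in qrels) / len(qrels)

if __name__ == "__main__":
    run = ["d3", "d7", "d1", "d9", "d4"]   # system output, best first
    qrels = {"d1", "d3", "d8"}             # judged-relevant documents
    print(precision_at_k(run, qrels, 5))   # 0.4  (2 of 5 retrieved are relevant)
    print(recall_at_k(run, qrels, 5))      # 0.667 (2 of 3 relevant were found)
```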

10. Retrieval Difficulty Varies with Topics. [Figure: 11-point recall-precision curves (検索システム別の11pt再現率精度, "11-point recall-precision by retrieval system") for the J-J Level 1 D automatic runs: one panel of curves per system (A through P), averaged over 50 topics, i.e., effectiveness across systems; and one panel of curves per topic (#101 onward) on a single system, i.e., effectiveness across topics.]

11. Retrieval Difficulty Varies with Topics (continued). Moreover, "difficult topics" vary with systems. For reliable and stable evaluation, using a substantial number of topics is inevitable. [Figure: the same 11-point recall-precision curves for the J-J Level 1 D automatic runs, plus mean average precision per topic for requests #101-150 across systems A through P.]
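To make the point about topic counts concrete, here is a small illustrative sketch, under stated assumptions: the average-precision formula is the standard one, but the per-topic scores below are synthetic, not NTCIR results.

```python
# A sketch of why evaluation must average over many topics. The AP function
# is standard; the per-topic scores simulated below are synthetic.
import random
import statistics

def average_precision(run, qrels):
    """AP for one topic: mean precision at the rank of each relevant doc."""
    hits, precisions = 0, []
    for rank, doc in enumerate(run, start=1):
        if doc in qrels:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(qrels) if qrels else 0.0

# One easy and one difficult topic for the same hypothetical system:
print(average_precision(["d1", "d2", "d3"], {"d1", "d2"}))  # 1.0
print(average_precision(["d9", "d8", "d1"], {"d1", "d2"}))  # ~0.167

# Simulated per-topic AP for 50 topics (cf. requests #101-150): scores
# spread widely across topics, so a mean over few topics is unstable,
# while the standard error shrinks as the topic count grows.
random.seed(0)
ap_scores = [random.betavariate(2, 5) for _ in range(50)]
map_50 = statistics.mean(ap_scores)
stderr_50 = statistics.stdev(ap_scores) / len(ap_scores) ** 0.5
print(f"MAP over 50 topics: {map_50:.3f} (std. error {stderr_50:.3f})")
print(f"MAP over  5 topics: {statistics.mean(ap_scores[:5]):.3f}")
```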

12. What is a TC usable to evaluate? An analogy with pharmaceutical R&D. Phase I: in vitro experiments. Phase II: animal tests. Phase III: tests with healthy human subjects. Phase IV: clinical tests.

13. TC usable to evaluate what? NTCIR test collections support users' information-seeking tasks at stages analogous to the pharmaceutical phases of the previous slide. Phase I: sharing modules, laboratory-type testing. Phase II: controlled, interactive prototype testing. Phase III: uncontrolled, interactive, pre-operational testing using human subjects. The levels of evaluation involved: 1. engineering level (efficiency); 2. input level; 3. process level (effectiveness); 4. user level; 5. output level; 6. social level.

14. Summary of "What is NTCIR". Providing a scientific basis for understanding the effectiveness of automated information-access technologies. Leveraging R&D and technology transfer. A reusable test collection is a key component. Evaluating search effectiveness is not easy: a small-scale or carelessly designed TC may skew the test results.

15. NTCIR-7: Advanced CLIA. Organizers: Teruko Mitamura (CMU), Eric Nyberg (CMU), Ruihua Chen (MSRA), Fred Gey (UCB), Donghong Ji (Wuhan Univ.), Noriko Kando (NII), Chin-Yew Lin (MSRA), Chuan-Jie Lin (National Taiwan Ocean Univ.), Tsuneaki Kato (Univ. of Tokyo), Tatsunori Mori (Yokohama National Univ.), and Tetsuya Sakai (NewsWatch). Advisor: K. L. Kwok (Queens College).

16. NTCIR-7: Advanced CLIA. The pipeline: question analyzers produce questions with question types; a translation module carries them across languages; retrieval returns documents (IR for QA, i.e., CLIR); and answer extraction and formatting produces the answers (CCLQA). Modules exchange data via XML/API. Evaluation covers both IR effectiveness and QA effectiveness, tests the effectiveness of OOV handling, PRF, and QE in QA, and includes focused retrieval.
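A hedged sketch of such a modular pipeline follows: question analysis, translation, retrieval, and answer extraction as separate, swappable stages that pass along one standardized record (the slides mention XML/API interchange). All class names, field names, and stage logic here are hypothetical illustrations, not the actual ACLIA API.

```python
# Hypothetical ACLIA-style modular QA pipeline; names and logic are
# illustrative only, not the real ACLIA interfaces.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Question:
    text: str
    qtype: str = ""           # filled in by the question analyzer
    translated: str = ""      # filled in by the translation module
    docs: List[str] = field(default_factory=list)   # from retrieval (IR4QA)
    answers: List[str] = field(default_factory=list)

def analyze(q: Question) -> Question:
    # Question analyzer: assign a question type (stub heuristic).
    q.qtype = "EVENT" if q.text.lower().startswith("what happened") else "DEFINITION"
    return q

def translate(q: Question) -> Question:
    # CLIR step: source -> target language (stub; a real module would translate).
    q.translated = q.text
    return q

def retrieve(q: Question) -> Question:
    # IR4QA: fetch candidate documents (stub document IDs).
    q.docs = ["doc-042", "doc-137"]
    return q

def extract(q: Question) -> Question:
    # Answer extraction and formatting.
    q.answers = [f"answer drawn from {d}" for d in q.docs]
    return q

# Because every stage reads and writes the same record, teams can contribute
# a single module and still assemble a complete "dream team" system.
pipeline = [analyze, translate, retrieve, extract]
q = Question("What happened at the 2008 NTCIR workshop?")
for stage in pipeline:
    q = stage(q)
print(q.qtype, q.docs, q.answers)
```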

17. ACLIA: Complex Cross-lingual Question Answering (CCLQA) Task. Different teams can exchange data and create a "dream team" system; small teams that do not possess an entire QA system can still contribute; the IR and QA communities can collaborate.
