Welcome!
Twitter: #ntcir9
Ust: ntcir-9-kick
NTCIR-9 Kick-Off Event
2010.10.05
Japanese Session: 13:30-
English Session: 15:30-
Program
– About NTCIR
– About NTCIR-9
– Accepted Tasks
Research Infrastructure for Evaluating IA
A series of evaluation workshops designed to enhance research in information-access technologies by providing an infrastructure for large-scale evaluations.
■ Data sets, evaluation methodologies, and a forum
Project started in late 1997
Once every 18 months
Data sets (Test collections or TCs)
Scientific, news, patents, and web; Chinese, Korean, Japanese, and English
Tasks (Research Areas)
IR: cross-lingual tasks, patents, web, geo
QA: monolingual tasks, cross-lingual tasks
Summarization, trend information, patent maps
Opinion analysis, text mining
Community-based Research Activities
– Users’ information needs
– Judgments vary
– Comparative evaluations on the same infrastructure
The whole process of making information usable by users, e.g. IR, text summarization, QA, text mining, and clustering
Tasks at Past NTCIRs

[Matrix figure: tasks by NTCIR round (NTCIR-1 through NTCIR-8; meetings in '99, '01, '02, '04, '05, '07, '08, '09). Task areas, from newest to oldest:]
– Community QA
– Opinion Analysis (User-Generated Contents)
– Opinion Analysis (Module-Based)
– Cross-Lingual QA + IR
– GeoTemporal
– Patent (Patent Contents; IR for Focused Domain)
– QA: Complex / Any Types; Dialog; Cross-Lingual; Factoid, List
– Cross-Lingual Domain Question Answering
– Text Mining / Classification
– Summarization / Answering
– Trend Info. Visualization
– Text Summarization
– Web Summarization / Consolidation
– Statistical MT
– Cross-Lingual IR
– Non-English Search (Cross-lingual Retrieval)
– Text Retrieval (Ad Hoc IR, IR for QA)

The years are the years the meetings were held; the tasks started 18 months before.
Call for Task Proposals
– … (can be continued to Formal Runs)
– Deliver Training Data (Documents, Topics, Answers)
– Deliver Test Data (Documents and Topics)
– Conduct Manual Judgments
– Discussion for the Next Round
http://research.nii.ac.jp/ntcir/
Mark Sanderson, Doug Oard, Atsushi Fujii, Tatsunori Mori, Fred Gey, Noriko Kando (and Ellen Voorhees, Sung Hyun Myaeng, Hsin-Hsi Chen, Tetsuya Sakai)
NTCIR Test Collections
Test Collections = Docs + Topics/Questions + Answers
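To make this concrete, here is a minimal sketch of how the three components might be represented in code; the field names and IDs are illustrative, not NTCIR's actual file formats.

```python
# Minimal sketch of a test collection; all names and IDs are illustrative.
docs = {
    "D001": "Full text of document 1 ...",
    "D002": "Full text of document 2 ...",
}
topics = {
    "101": "Natural-language statement of information need #101",
}
# Relevance judgments (the "answers"): topic -> doc -> judged relevance.
# Binary here for simplicity; NTCIR collections often use graded levels.
qrels = {
    "101": {"D001": 1, "D002": 0},
}
```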
Available to non-participants for research purposes
– Asian languages / cross-language
– Variety of genres
– Intersection of IR + NLP
– To make information in the documents more usable for users!
– Parallel/comparable corpora
– Realistic evaluation / user tasks
– Interactive/exploratory search
– QA types at topic creation
– Idea exchange
– Discussion/investigation on evaluation methods/metrics
Process level: Effectiveness (e.g. recall, precision)
User level: Effect on the user
[Figure: 11-point recall-precision curves (検索システム別の11pt再現率精度, "11-pt recall-precision by retrieval system") for the J-J Level1 D auto runs. One panel shows effectiveness across SYSTEMS A-P, averaged over 50 topics; the other shows effectiveness across TOPICS (requests #101-150) on a single system, including per-topic mean average precision. Axes: recall (0-1) vs. precision (0-1).]

For reliable and stable evaluation, a substantial number of topics is used.
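For reference, such curves are standard 11-point interpolated recall-precision averages. A minimal sketch of the computation, assuming binary relevance and one ranked result list per topic (our own illustration, not NTCIR's evaluation code):

```python
def eleven_point_curve(ranked_doc_ids, relevant_ids):
    """Interpolated precision at recall 0.0, 0.1, ..., 1.0 for one topic.

    Assumes `relevant_ids` is non-empty and relevance is binary.
    """
    hits, points = 0, []
    for rank, doc in enumerate(ranked_doc_ids, start=1):
        if doc in relevant_ids:
            hits += 1
            points.append((hits / len(relevant_ids), hits / rank))  # (recall, precision)
    curve = []
    for level in (r / 10 for r in range(11)):
        # Interpolated precision: the max precision at any recall >= level.
        precs = [p for r, p in points if r >= level]
        curve.append(max(precs) if precs else 0.0)
    return curve

def averaged_curve(per_topic_curves):
    """Average the 11 points over all topics (e.g. 50 requests)."""
    n = len(per_topic_curves)
    return [sum(curve[i] for curve in per_topic_curves) / n for i in range(11)]
```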
Pharmaceutical R&D. Phase I: in vitro experiments; Phase II: animal experiments; Phase III: tests with healthy human subjects; Phase IV: clinical tests
NTCIR test collections and users' information-seeking tasks:
– Phase I: Laboratory-type testing
– Phase II: Sharing modules, prototype testing
– Phase III: Controlled interactive testing using human subjects
– Phase IV: Uncontrolled interactive testing
(an analogy to the pharmaceutical R&D phases above)
Levels of evaluation:
2. Input level
3. Process level: Effectiveness, Efficiency
4. User level
5. Output level
6. Social level
– Document types + user community
– User's situation, purpose of search: realistic
Experiments are abstractions of real-world tasks; there is a trade-off between “reality” and “controllability”.
To learn how and why a system works better (or worse) than others, and how it can be improved: scientific understanding of effectiveness.
Improvement of Effectiveness by Evaluation Workshops
1.5 – 2 times in 3 years

[Figure: Mean Average Precision of Cornell University's TREC systems ('92 through '98 systems) evaluated on the TREC-1 through TREC-7 test sets.]
Research Trends

[Figure: number of papers presented at ACM SIGIR per publication-year bin ('77-79, '80-84, '85-89, '90-94, '95-99, '00-04, '05-09), broken down by topic: Web, User, Evaluation, Non-Text, QA & Summarization, NLP, Cross-Lingual, ML, Clustering, Efficiency, Filtering, Query Processing, IR Models, General.]
– Ex. Enterprise search, Federated Search, etc.
– Users' intention, diversity
– Collaborative search
– Expert search, search for expertise and knowledge, inference, etc.
– Noriko Kando (NII)
– Tsuneaki Kato (Tokyo University)
– Eiichiro Sumita (NICT)
– Hideo Joho (Tsukuba University)
– Tetsuya Sakai (MSRA)
– Mark Sanderson (RMIT)
– William Webber (Melbourne University)
– 31 researchers worldwide
– Participants (You!)
Jun 2010: New structure formed for NTCIR-9
Jul 2010: Call for task proposals announced; 10 proposals were submitted
Aug 2010: 7 proposals were accepted by the task selection committee and Evaluation co-chairs
Sep 2010: Calls for task participation prepared
Oct 2010: NTCIR-9 Kick-Off Event
– Given a real web query, participating systems mine possible intents from web collections and query logs
QUERY: Harry Potter INTENTS: Books? Movies? Character?...
– Submitted intent lists will be evaluated in terms of coverage and novelty; each intent will be weighted by votes from many assessors
– Participating systems selectively diversify search results
– Search results will be evaluated by diversity metrics using key intents obtained from Subtopic Mining
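To give a flavour of the evaluation idea, here is a toy sketch of vote-weighted intent coverage; this is our own illustration, not the task's official metric, and the intents and vote counts are made up.

```python
def weighted_intent_coverage(submitted_intents, intent_votes):
    """Fraction of assessor votes covered by a submitted intent list.

    intent_votes: gold intent -> number of assessor votes (illustrative).
    """
    covered = sum(votes for intent, votes in intent_votes.items()
                  if intent in submitted_intents)
    return covered / sum(intent_votes.values())

# The "Harry Potter" query from the slide, with made-up vote counts:
votes = {"books": 10, "movies": 8, "character": 3}
print(weighted_intent_coverage({"books", "movies"}, votes))  # 18/21 ~ 0.857
```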
– Current search engines: the user (1) enters a query, (2) clicks on the Search button, (3) scans the ranked list, (4) clicks on a URL that looks relevant, (5) reads the page, (6) finds the answer
– One Click Access (for desktop and mobile): the user (1) enters a query, (2) clicks on the Search button, (3) finds the answer
– Zero Click Access: the user (1) finds the answer without clicking on Search!
Hideki Shima (1), Teruko Mitamura (1), Hiroshi Kanayama (2), Koichi Takeda (2), Chuan-Jie Lin (3), Cheng-Wei Lee (4)
(1) Carnegie Mellon University, (2) IBM Research - Tokyo, (3) National Taiwan Ocean University, (4) Academia Sinica
October 5th, 2010
– t1: Yasunari Kawabata won the Nobel Prize in Literature for his novel “Snow Country”
– t2: Yasunari Kawabata is the writer of “Snow Country”
(Target languages: Japanese, Simplified Chinese, Traditional Chinese)
Binary-class subtask
– Given a text pair, a system will detect whether text t1 entails hypothesis t2 or not
Multi-class (5-way) subtask
– has entailment relation: t1 → t2 / t2 → t1 / t1 ↔ t2
– does not have entailment relation: contradiction / independence
RITE4QA subtask
– Evaluation method: design the dataset/metric as if a system were an answer-filtering module in a Question Answering system.
– Data: t2 is a question converted to an affirmative statement with a wh-word replaced by an answer candidate; t1 is a sentence/paragraph containing the answer candidate.
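For the binary subtask, a toy lexical-overlap baseline looks like this; it is our own sketch, not an official RITE baseline, and real systems would need morphological analysis for Japanese and Chinese plus far richer features.

```python
def entails(t1, t2, threshold=0.6):
    """Toy baseline: YES if most of t2's tokens also occur in t1.

    The 0.6 threshold is an arbitrary illustrative choice.
    """
    t1_tokens = set(t1.lower().split())
    t2_tokens = set(t2.lower().split())
    overlap = len(t1_tokens & t2_tokens) / len(t2_tokens)
    return overlap >= threshold

t1 = "Yasunari Kawabata won the Nobel Prize in Literature for his novel Snow Country"
t2 = "Yasunari Kawabata is the writer of Snow Country"
print(entails(t1, t2))  # True with this loose threshold (5/8 = 0.625)
```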
In addition to researchers in entailment and paraphrase, various research fields can benefit from RITE:
– Machine learning, …
– Summarization, …
We try hard to welcome a wide variety of participants: from undergraduate students to industry researchers, from all over the world.
A resource pool will be available to help you build a prototype system quickly, or you can participate in collaboration by sharing useful resources and joining the task design discussion.
Website: http://artigas.lti.cs.cmu.edu/rite
Organisers: Fred Gey, Ray Larson (UCB), Noriko Kando (NII),
All topics include timestamps to indicate the query period. Search for new information on a topic since some start date (up to the query time).
etc.
Geographical reasoning such as “near” and “part of” is feasible
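For instance, a “near” predicate might be operationalised as a great-circle distance test between coordinates; a minimal sketch, where the 50 km threshold and the coordinates are our own illustrative choices:

```python
from math import asin, cos, radians, sin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))  # Earth radius ~6371 km

def near(p1, p2, threshold_km=50):
    """Toy notion of 'near': within an arbitrary 50 km radius."""
    return haversine_km(*p1, *p2) <= threshold_km

print(near((35.68, 139.69), (35.44, 139.64)))  # Tokyo vs. Yokohama: True
```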
Tomoyosi Akiba (Toyohashi University of Technology), Hiromitsu Nishizaki (Yamanashi University), Kiyoaki Aikawa (Tokyo University of Technology), Tatsuya Kawahara (Kyoto University), Tomoko Matsui (The Institute of Statistical Mathematics)
Spoken Document Processing WG, SIG-SLP, IPSJ
– UGC with typos and specific usage of terms
– Text data obtained by automatic processing like OCR
– Speech data (spoken documents), e.g. podcasts, broadcast news clips, spoken lectures, etc.
– ...
– 2,702 lectures in the Corpus of Spontaneous Japanese (CSJ), 628 hrs.
– Task 1: Spoken Term Detection
– Task 2: Spoken Document Retrieval
Find the passages that include the relevant information related to a given query topic.
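As a toy illustration of Task 1 (our own sketch, not the task's tooling), spoken term detection over time-stamped ASR transcripts; real STD must also cope with recognition errors, e.g. via lattices or phone-based matching.

```python
# A "spoken document" as a list of (start_sec, end_sec, word) tuples,
# e.g. produced by an ASR system; the data here is illustrative.
transcript = [(0.0, 0.4, "音声"), (0.4, 0.9, "検索"), (0.9, 1.3, "の"), (1.3, 1.9, "評価")]

def spoken_term_detection(transcripts, term):
    """Return (doc_id, start_sec, end_sec) for each occurrence of `term`."""
    hits = []
    for doc_id, words in transcripts.items():
        for start, end, word in words:
            if word == term:
                hits.append((doc_id, start, end))
    return hits

print(spoken_term_detection({"lecture01": transcript}, "検索"))
# [('lecture01', 0.4, 0.9)]
```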
Two test collections have already been released:
1. CSJ STD test collection [Itoh et al., 2010]
2. CSJ SDR test collection [Akiba et al., 2009]
The task extends these test collections:
– More query terms and query topics
– Pooling-based relevance judgments
– New evaluation metrics (including time and space efficiency, and document-level and passage-level relevancy)
– http://www.nlp.cs.tut.ac.jp/~sdpwg/index.php?ntcir9
cross-lingual link discovery
example
1. Trying to search what “花蟹” is on Wikipedia, and maybe the “花蟹” articles in other languages.
2. Found “… (Hong Kong ten-dollar note)”, not the “flower crab” itself; but there is no language link here to the equivalent page in Chinese. Where is the English page about 花蟹?
– Mono-lingual link discovery: actually, there is a page about “花蟹” (the crabs) in English, but it is not linked yet in the text; link these two pages with each other.
– Cross-lingual link discovery: a language link exists here, but there is no language link to a page in languages other than Chinese.
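A toy sketch of the link-discovery idea (our own simplification; the title table and the English page title are assumptions): scan an article's text for phrases that match titles of pages in the other language.

```python
# Assumed cross-language title table (Chinese title -> English page title).
zh_to_en_titles = {"花蟹": "Charybdis feriata"}  # illustrative mapping

def discover_cross_lingual_links(text, title_map):
    """Propose (anchor, target_page) pairs for titles found in the text."""
    return [(title, target) for title, target in title_map.items()
            if title in text]

article = "香港的十元紙幣上印有花蟹的圖案。"  # made-up sentence mentioning 花蟹
print(discover_cross_lingual_links(article, zh_to_en_titles))
# [('花蟹', 'Charybdis feriata')]
```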
FACT:
– Wikis are used by people from different language backgrounds for different needs.
– There are at least 44 wiki-style documentation management software packages, plus their forks, helping numerous projects and corporations manage knowledge. (source: Wikipedia)
[Image: logos of various wiki software]
The University of Tokyo
Kansai University
Participants submit their own Information Access Environment Systems (IAESs)
– which should be able to be embedded in a common framework
– which should be able to handle given experimental tasks
Laboratory experiments with human subjects for gathering subjective and objective data
Evaluation in terms of the process primitives and process models of submitted IAESs
Final objective: an efficient and effective evaluation framework, and a model of explorative information access
[Diagram: the organizer provides the common framework (IAES Core, browser with log collection, editor, display, search engine, documents), a baseline IAES, and the experimental tasks; participants submit IAESs; laboratory experiments are run with human subjects, and logs are collected.]
It is important to discuss the following points through the workshop, e.g. the interface between an IAES core and the framework.
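To make the notion of an embeddable IAES concrete, here is a hypothetical interface sketch; the method names are our invention, and the actual I/F description is to be released per the schedule below.

```python
from abc import ABC, abstractmethod

class IAES(ABC):
    """Hypothetical interface for an Information Access Environment System
    embeddable in the common framework; all method names are our guesses."""

    @abstractmethod
    def start_task(self, task_description):
        """Initialise the environment for one experimental task."""

    @abstractmethod
    def handle_query(self, query):
        """Return results (e.g. document IDs) for a subject's query."""

    @abstractmethod
    def collect_logs(self):
        """Hand interaction logs back to the framework's log collection."""
```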
– Uses the event-list questions from the NTCIR-7 ACLIA task (e.g. … friendly fire)
– Requests subjects to gather as many nuggets (event characteristics such as its time and place) as possible in a given time period
– Is on summarization of trends (not only the changes but also their background and influence) in time-series statistical information, such as the subjects of the NTCIR-5, 6, 7 MuST task
– Requests subjects to gather as many nuggets as possible in a given time period; nuggets are the primitive pieces of information that constitute a requested summary
Schedule:
– End of Oct. 2010: Participation registration (first) due
– End of Dec. 2010: IAES I/F description release
– Latter part of Mar. 2011: IAES framework and baseline IAES Core release
– Latter part of Jul. 2011: Laboratory experiments
– Latter part of Aug. 2011: Experiment results release
Contact:
– Tsuneaki Kato kato@boz.c.u-tokyo.ac.jp
– Mitsunori Matsushita mat@res.kutc.kansai-u.ac.jp
– http://must.c.u-tokyo.ac.jp/visex
Subtasks and parallel data:
– Chinese to English (new): 1 million sentence pairs
– Japanese to English / English to Japanese: 3 million sentence pairs
Test data: 2,000 sentences
Data type: patent descriptions
Primary evaluation
– Patent sentences could be quite long and contain complex structures
– Human evaluations will be carried out
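Alongside human judgments, automatic n-gram overlap metrics such as BLEU are commonly used in MT evaluation; the slide does not name a specific automatic metric, so the simplified clipped n-gram precision below is our own illustration.

```python
from collections import Counter
from math import exp, log

def ngram_precision(candidate, reference, n):
    """Clipped n-gram precision of a tokenised candidate vs. one reference."""
    cand = Counter(tuple(candidate[i:i + n]) for i in range(len(candidate) - n + 1))
    ref = Counter(tuple(reference[i:i + n]) for i in range(len(reference) - n + 1))
    clipped = sum(min(count, ref[gram]) for gram, count in cand.items())
    return clipped / max(sum(cand.values()), 1)

def bleu2(candidate, reference):
    """Geometric mean of 1- and 2-gram precision (brevity penalty omitted)."""
    p1 = ngram_precision(candidate, reference, 1)
    p2 = ngram_precision(candidate, reference, 2)
    return exp((log(p1) + log(p2)) / 2) if p1 > 0 and p2 > 0 else 0.0

candidate = "the valve is closed by the spring".split()
reference = "the valve is closed by a spring".split()
print(round(bleu2(candidate, reference), 3))  # 0.756
```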
[Diagram: levels of context in information access, adapted from Ingwersen & Järvelin (2005): word/character, syntax, document structure, hyperlink/reference, document, between-document/multi-docs, information and user-system interaction, roles and work task, organisation, contextual task, time, and the outer layers of infrastructure, economy, technology, society, history, and temporal data.]
Covers wide context + rich media types

[Diagram: the same context model with the NTCIR-9 tasks overlaid; document genres include news, web, legal, and speech (SpokenDoc).]
Impact on real challenges

[Diagram: the same context model again, highlighting where the NTCIR-9 tasks, including SpokenDoc, address news, web, legal, and speech documents.]
Schedule:
– Task: Jan - Aug 2011
– Writing: Sep - Nov 2011
– Presentation: Dec 2011
Why participate:
– Comparison with other participants can produce stronger arguments
– Inspired by the international community for future work
– Much of the experimental setup is provided
– Performance measures are (often) defined
– A range of information access tasks to tackle
– Brush up your product by eliminating bugs in a short period of time
– Recruit smart people
– … to your end-users and …
– Comparison with your competitors (critical self-assessments can be biased)
– Secondary resources developed by the task are yours, too
Don’t hesitate to send feedback to the task organisers (TO).
05/10/2010  Kick-off event in Tokyo
20/12/2010  Task registration due
05/01/2011  Document set release
01 - 05/2011  Dry run
03 - 07/2011  Formal run (contact the TO for the exact schedule)
22/08/2011  Evaluation results due; task overview partial release
20/09/2011  Participant paper submission due
04/11/2011  All camera-ready copy for the Proceedings due
06-09/12/2011  NTCIR-9 Meeting, NII, Tokyo, Japan
http://research.nii.ac.jp/ntcir/ntcir-9/
For further enquiries, contact the NTCIR office: ntc-secretariat [at] nii.ac.jp