NTCIR-9 Kick-Off Event (2010.10.05)


slide-1
SLIDE 1

Welcome!

Twitter: #ntcir9


Ust: ntcir-9-kick

NTCIR-9 Kick-Off Event

2010.10.05

Japanese Session: 13:30-

English Session: 15:30-

1

slide-2
SLIDE 2

Program

  • About NTCIR
  • About NTCIR-9
  • Accepted Tasks
  • Why participate?
  • How to participate


  • Important Dates
  • Q & A

2

slide-3
SLIDE 3

About NTCIR

3

slide-4
SLIDE 4

NTCIR: NII Testbeds and Community for Information access Research

Research Infrastructure for Evaluating IA

A series of evaluation workshops designed to enhance research in information-access technologies by providing an infrastructure for large-scale evaluations.

■ Data sets, evaluation methodologies, and forum

Project started in late 1997

Once every 18 months

Data sets (Test collections or TCs)

Scientific, news, patent, and web documents in Chinese, Korean, Japanese, and English

Tasks (Research Areas)

IR: cross-lingual tasks, patents, web, geo
QA: monolingual tasks, cross-lingual tasks
Summarization, trend info., patent maps
Opinion analysis, text mining

Community-based Research Activities

4

slide-5
SLIDE 5

Information retrieval (IR)

  • Retrieve RELEVANT information from a vast collection to meet users' information needs
  • Using computers since the 1950s
  • One of the first CS areas to use human assessments as success criteria

– Judgments vary
– Comparative evaluations on the same infrastructure

Information access (IA)

  • The whole process of making information usable by users, e.g. IR, text summarization, QA, text mining, and clustering

5

slide-6
SLIDE 6

Tasks at Past NTCIRs

[Table: tasks run at NTCIR-1 through NTCIR-8 (meeting years '99, '01, '02, '04, '05, '07, '08, '09-), marked per round. Task areas include: Community QA; Opinion Analysis (user-generated contents, module-based); Cross-Lingual QA + IR; GeoTemporal; Patent (contents IR for a focused domain); Question Answering (complex/any types, dialog, cross-lingual, factoid/list); Text Mining / Classification; Summarization / Trend Info. Visualization; Text Summarization; Web (summarization/consolidation); Statistical MT; Cross-Lingual IR; Non-English Search / Cross-lingual Retrieval; Text Retrieval (ad hoc IR, IR for QA). The years are when the meetings were held; the tasks started 18 months before.]

6

slide-7
SLIDE 7

Procedures in NTCIR Workshops

  • Call for Task Proposals
  • Selection of Task Proposals by the Committee
  • Discussion about Experimental Designs and Evaluation Methods (can continue through the Formal Runs)
  • Registration to Task(s)
    – Deliver Training Data (Documents, Topics, Answers)
  • Experiments and Tuning by Each Participant
    – Deliver Test Data (Documents and Topics)
  • Experiments by Each Participant
  • Submission of Experimental Results
  • Pooling the Answer Candidates from the Submissions, and Conducting Manual Judgments
  • Return Answers (Relevance Judgments) and Evaluation Results
  • Workshop Meeting
  • Discussion for the Next Round
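The pooling step above is the usual TREC/NTCIR-style procedure. As a rough illustration of the idea only (not NTCIR's actual tooling; the run and topic identifiers are hypothetical):

# Sketch of pooling: take the top-k results from every submitted run per topic,
# merge and deduplicate, and hand the pool to human assessors for judgment.
def build_pools(runs, depth=100):
    """runs: {run_id: {topic_id: [doc_id ranked best-first, ...]}}"""
    pools = {}
    for run in runs.values():
        for topic_id, ranked_docs in run.items():
            pool = pools.setdefault(topic_id, set())
            pool.update(ranked_docs[:depth])   # only the top `depth` docs are judged
    return pools                                # {topic_id: set of doc_ids to judge manually}

if __name__ == "__main__":
    runs = {
        "groupA-run1": {"T001": ["d3", "d7", "d1"]},
        "groupB-run1": {"T001": ["d7", "d9", "d2"]},
    }
    print(build_pools(runs, depth=2))           # {'T001': {'d3', 'd7', 'd9'}}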

7

slide-8
SLIDE 8

NTCIR Workshop Meeting

http://research.nii.ac.jp/ntcir/

8

slide-9
SLIDE 9

NTCIR-7 & -8 Program Committee

Mark Sanderson, Doug Oard, Atsushi Fujii, Tatsunori Mori, Fred Gey, Noriko Kando (and Ellen Voorhees, Sung Hyun Myaeng, Hsin-Hsi Chen, Tetsuya Sakai)

9

slide-10
SLIDE 10

NTCIR Test Collections

Test Collections = Docs + Topics/Questions + Answers

10

Available to Non-participants for Research Purposes

slide-11
SLIDE 11

Focus of NTCIR

Lab-type IR Tests + New Challenges

Asian languages / cross-language; variety of genres; intersection of IR + NLP; parallel/comparable corpora; realistic evaluation and user tasks; interactive/exploratory search; QA types at topic creation. To make information in the documents more usable for users!

Forum for Researchers and Other Experts/Users

Idea exchange; discussion/investigation of evaluation methods and metrics

11

slide-12
SLIDE 12

IR Systems Evaluation

  • Engineering Level: Efficiency
  • Input Level: ex. exhaustivity, quality, novelty of the DB
  • Process Level: Effectiveness, ex. recall, precision
  • Output Level: Display of output
  • User Level: ex. effort that users need
  • Social Level: ex. importance (Cleverdon & Keen 1966)
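For reference, the standard set-based definitions behind the "Process Level" measures above (not spelled out on the slide): with R the set of relevant documents and A the set of retrieved documents,

precision = |R ∩ A| / |A|,   recall = |R ∩ A| / |R|.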

12

slide-13
SLIDE 13

Difficulty of retrieval varies with topics

[Figure: 11-point recall-precision curves by retrieval system (J-J, Level 1, D, auto). Left panel: effectiveness across SYSTEMS (systems A-P), averaged over 50 topics. Right panel: effectiveness across TOPICS on one system (topics #101-129). Both panels plot precision against recall.]

13

slide-14
SLIDE 14

Difficulty of retrieval varies with topics

[Figure: the same 11-point recall-precision curves by system (A-P, averaged over 50 topics) and by topic on one system (J-J, Level 1, D, auto; topics #101-129), plus mean average precision per topic for requests #101-150.]

"Difficult topics" vary with systems.

For reliable and stable evaluation, using a substantial number of topics is necessary.

14

slide-15
SLIDE 15

What are TCs usable for evaluating?

Pharmaceutical R & D:
  Phase I: In vitro experiments
  Phase II: Animal experiments
  Phase III: Tests with healthy human subjects
  Phase IV: Clinical tests

15

slide-16
SLIDE 16

What are TCs usable for evaluating?

NTCIR test collections and users' information-seeking tasks:
  Phase I: Laboratory-type testing
  Phase II: Sharing modules, prototype testing
  Phase III: Controlled interactive testing using human subjects
  Phase IV: Uncontrolled (pre-)operational testing

(By analogy with pharmaceutical R & D: Phase I: in vitro experiments; Phase II: animal experiments; Phase III: tests with healthy human subjects; Phase IV: clinical tests)

Levels of evaluation:
  1. Engineering level: Efficiency
  2. Input level
  3. Process level: Effectiveness
  4. User level
  5. Output level
  6. Social level

16

slide-17
SLIDE 17
  • Information Seeking Task
    – Document types + user community
    – User's situation, purpose of search, realism
    Experiments are an abstraction of real-world tasks. Trade-off between "reality" and "controllability".
  • Testing & Benchmarking
    To learn how and why a system works better (or worse) than others; to learn how it can be improved; scientific understanding of the effectiveness

17

slide-18
SLIDE 18

Improvement of Effectiveness by Evaluation Workshops

1.5 - 2 times in 3 years (Cornell University TREC systems)

[Figure: Mean Average Precision of the '92-'98 Cornell systems evaluated on the TREC-1 through TREC-7 test sets, roughly in the 0.1-0.5 range.]
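For reference, the measure on the chart's y-axis is Mean Average Precision (standard definition, added here): for a query q with relevant set R_q and a ranked list of n documents,

AP(q) = (1 / |R_q|) · Σ_{k=1..n} P(k) · rel(k),   MAP = (1 / |Q|) · Σ_{q ∈ Q} AP(q),

where P(k) is the precision of the top k results and rel(k) is 1 if the document at rank k is relevant (0 otherwise).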

18

slide-19
SLIDE 19

Research Trends

Number of Papers Presented at ACM SIGIR

[Figure: number of SIGIR papers per publication period ('77-79, '80-84, '85-89, '90-94, '95-99, '00-04, '05-09), up to roughly 400-450, broken down by topic: Web, User, Evaluation, Non-Text, QA & Summarization, NLP, Cross-Lingual, ML, Clustering, Efficiency, Filtering, Query Processing, IR Models, General.]

19

slide-20
SLIDE 20

Some Thoughts on the Future

  • Requirements for Evaluating Individual Applications
    – Ex. Enterprise search, Federated Search, etc.
  • Interactive and Exploratory Information Access
    – Users' Intention, Diversity
    – Collaborative Search
    – Expert Search, Search for Expertise and Knowledge, Inference, etc.

  • Answer “No”
  • Using Ontology, Metadata
  • Multilingual, Cross-Lingual

etc.

20

slide-21
SLIDE 21


About NTCIR-9

What's new?

21

slide-22
SLIDE 22

NTCIR-9: Objectives

  • Solid foundation
    – New structure
    – Community-led task organisation
  • Task diversity
    – Covers a wide context in Information Access
    – Studies rich media types
  • Promotion of research resources
    – Sustainability of research
    – Showcase in the NTCIR-9 Meeting

22

slide-23
SLIDE 23

NTCIR 9: Structure NTCIR-9: Structure

  • General Co-Chairs

– Noriko Kando (NII)

  • Task Organisers

– 31 researchers – Tsuneaki Kato (Tokyo University) worldwide – Participants (You!) – Eiichiro Sumita (NICT)

  • Evaluation Co-Chairs

p ( )

  • EVIA Co-Chairs

k d – Hideo Joho (Tsukuba University) – Mark Sanderson (RMIT) – William Webber – Tetsuya Sakai (MSRA) (Melbourne University)

23

slide-24
SLIDE 24

NTCIR-9: Development so far

Jun 2010: New structure formed for NTCIR-9
Jul 2010: Call for task proposals announced; 10 proposals were submitted
Aug 2010: 7 proposals were accepted by the task selection committee and the Evaluation co-chairs
Sep 2010: Calls for task participation prepared
Oct 2010: NTCIR-9 Kick-Off Event

24

slide-25
SLIDE 25

NTCIR-9 Evaluation Tasks

Calls for task participation

25

slide-26
SLIDE 26

Tasks accepted for NTCIR-9

CORE TASKS

  • [Intent] Intent (with One-Click Access)
  • [RITE] Recognizing Inference in Text
  • [GeoTime] Geotemporal Information Retrieval
  • [SpokenDoc] IR for Spoken Documents

PILOT TASKS

  • [CrossLink] Cross-lingual Link Discovery
  • [Vis-Ex] Interactive Visual Exploration
  • [PatentMT] Patent Machine Translation

26

slide-27
SLIDE 27

Intent + One-Click Access

Calls for task participation

27

slide-28
SLIDE 28

NTCIR-9 Intent Task

28

slide-29
SLIDE 29

Subtasks

  • Subtopic Mining [Chinese and Japanese]

– Given a real web query, participating systems mine possible intents from web collections and query logs

  • A one-month click-through log is available for Chinese

QUERY: Harry Potter  INTENTS: Books? Movies? Character?...

– Submitted intent lists will be evaluated in terms of coverage and novelty; each intent will be weighted by votes from many assessors

  • Ranking [Chinese and Japanese]

– Participating systems selectively diversify search results
– Search results will be evaluated by diversity metrics using key intents obtained from Subtopic Mining
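As a rough illustration of the vote-weighted coverage idea described above (a hypothetical sketch only, not the official Intent evaluation script):

# Sketch of vote-weighted intent coverage: each gold intent carries a weight
# derived from assessor votes; a submitted intent list is scored by the
# fraction of total weight it covers. Data below are hypothetical.
def intent_coverage(submitted, gold_votes):
    """submitted: iterable of intent labels; gold_votes: {intent: vote count}"""
    total = sum(gold_votes.values())
    covered = sum(v for intent, v in gold_votes.items() if intent in set(submitted))
    return covered / total if total else 0.0

if __name__ == "__main__":
    gold = {"harry potter books": 7, "harry potter movies": 9, "harry potter characters": 4}
    print(intent_coverage(["harry potter movies", "harry potter games"], gold))  # 9/20 = 0.45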

29

slide-30
SLIDE 30

Subtasks (cont.)

  • One Click Access (“1CLICK”) [Japanese]

– Current search engines: the user (1) enters a query, (2) clicks on the Search button, (3) scans the ranked list, (4) clicks on a URL that looks relevant, (5) reads the page, (6) finds the answer
– One Click Access (for desktop and mobile): the user (1) enters a query, (2) clicks on the Search button, (3) finds the answer
– Zero Click Access: the user (1) finds the answer without clicking Search!

30

slide-31
SLIDE 31

Introduction to NTCIR-9 RITE Task
(Recognizing Inference in TExt)

Hideki Shima1, Teruko Mitamura1, Hiroshi Kanayama2, Koichi Takeda2, Chuan-Jie Lin3, Cheng-Wei Lee4

1 Carnegie Mellon University, 2 IBM Research - Tokyo, 3 National Taiwan Ocean University, 4 Academia Sinica

NTCIR-9 Kick-Off Event

October 5th, 2010

slide-32
SLIDE 32

Overview of RITE

RITE is a benchmark task (not a competition!) for automatically detecting entailment, paraphrase, and contradiction in text. Given t1, can a computer infer that t2 is most likely true?

– t1: Yasunari Kawabata won the Nobel Prize in Literature for his novel "Snow Country"
– t2: Yasunari Kawabata is the writer of "Snow Country"

(Target languages: Japanese, Simplified Chinese, Traditional Chinese)

slide-33
SLIDE 33

Three Subtasks in RITE

Binary-class subtask

  • Given a text pair <t1, t2>, your system will detect whether t1 entails a hypothesis t2 or not

Multi-class (5-way) subtask

  • Given a text pair <t1, t2>, your system detects whether t1 and t2
    – have an entailment relation: t1 → t2 / t1 ← t2 / t1 ↔ t2
    – do not have an entailment relation: contradiction / independence

RITE4QA subtask

  • Same input and output as the binary-class subtask
  • Evaluated in an extrinsic way

– Evaluation method: design the dataset/metric as if a system is an answer-filtering module in a Question Answering system. – Data: t2 is a question converted to affirmative statement with a wh-word replaced with an answer candidate. t1 is a sentence/paragraph containing the answer candidate.
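To make the binary-class input/output concrete, here is a naive lexical-overlap baseline (an illustrative sketch only; real RITE systems use much richer semantic processing, and whitespace tokenization is only meaningful for the English gloss here, not for the Japanese/Chinese target languages):

# Naive baseline for the RITE binary-class subtask: predict "Y" (t1 entails t2)
# when most of t2's words also appear in t1. Illustrative only.
def entails(t1, t2, threshold=0.5):
    w1, w2 = set(t1.lower().split()), set(t2.lower().split())
    overlap = len(w1 & w2) / len(w2) if w2 else 0.0
    return "Y" if overlap >= threshold else "N"

if __name__ == "__main__":
    t1 = "Yasunari Kawabata won the Nobel Prize in Literature for his novel Snow Country"
    t2 = "Kawabata is the writer of Snow Country"
    print(entails(t1, t2))  # "Y" (4 of t2's 7 words appear in t1)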

slide-34
SLIDE 34

Why you should participate

In addition to researchers in entailment and paraphrase, various research fields can benefit from RITE:

  • Core technologies: semantic processing, lexical acquisition, machine learning, ...
  • Applications: information retrieval, question answering, summarization, ...

We try hard to welcome a wide variety of participants, from undergraduate students to industry researchers, from all over the world.

  • A resource pool will be available to help you build a prototype system quickly or participate in collaboration by sharing useful resources with others.
  • A mailing list is available for receiving important announcements and joining the task design discussion.

Website: http://artigas.lti.cs.cmu.edu/rite

slide-35
SLIDE 35

GeoTime

Calls for task participation

35

slide-36
SLIDE 36

GeoTime2 (E, J, K?)

Organisers: Fred Gey, Ray Larson (UCB), Noriko Kando (NII)

  • Second round of ad hoc IR for WHEN and WHERE
  • GeoTime1 topic example: How old was Max Schmeling when he died, and where did he die?

At GeoTime1, docs that contain both WHEN and WHERE were treated as relevant; those that contain only WHEN or only WHERE info were treated as partially relevant.

Can we do better? Can we create more realistic GeoTime topics?

36

slide-37
SLIDE 37

GeoTime2: Some ideas to make the task more challenging (under discussion):

Search New Information

  • All topics include timestamps to indicate the query period. Search for new information on a topic since some start date (up to the query time).

Time Reasoning

  • Relative expressions such as "yesterday", "last week", "after", etc. (a toy illustration follows at the end of this slide)

Geo Reference

  • Geo Tagging: Annotating geographical names

  • Geo Coding: coding geographical points by codes, making geographical reasoning such as "near" and "part of" feasible
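The "Time Reasoning" point above (resolving expressions like "yesterday" or "last week" against a document's timestamp) can be illustrated with a toy sketch; the expression handling below is hypothetical and far from complete:

# Toy illustration of resolving relative temporal expressions against a
# document timestamp, as needed for GeoTime-style time reasoning.
from datetime import date, timedelta

def resolve(expression, doc_date):
    if expression == "yesterday":
        return doc_date - timedelta(days=1)
    if expression == "last week":
        start = doc_date - timedelta(days=doc_date.weekday() + 7)  # Monday of previous week
        return (start, start + timedelta(days=6))
    raise ValueError(f"unhandled expression: {expression}")

if __name__ == "__main__":
    print(resolve("yesterday", date(2003, 2, 17)))   # 2003-02-16
    print(resolve("last week", date(2003, 2, 17)))   # (2003-02-10, 2003-02-16)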

37

slide-38
SLIDE 38

Test Collection

  • Newspapers
    – English: New York Times (2002-2005) from LDC
    – Japanese: Mainichi Newspapers (2002-2005)
    – Newspapers from Korea (2002-2005) (?)
  • Topics: 25+?
  • Topic creation and relevance judgments will be done by the participants

38

slide-39
SLIDE 39

NTCIR-9 Core Task

IR for Spoken Documents (SpokenDoc)

Tomoyosi Akiba (Toyohashi University of Technology), Hiromitsu Nishizaki (Yamanashi University), Kiyoaki Aikawa (Tokyo University of Technology), Tatsuya Kawahara (Kyoto University), Tomoko Matsui (The Institute of Statistical Mathematics)
Spoken Document Processing WG, SIG-SLP, IPSJ

39

slide-40
SLIDE 40

Background

  • Current IR methods assume clean target documents, i.e. text without errors.
  • However, real-world documents are noisy!
    – UGC with typos and specific usage of terms
    – Text data obtained by automatic processing like OCR or MT
    – Speech data (spoken documents), e.g. podcasts, broadcast news clips, spoken lectures, etc. (our focus)
    – ...
  • Methodologies are required to deal with such noisy documents for IR.

40

slide-41
SLIDE 41

Task Overview

  • Target Documents
    – 2,702 lectures in the Corpus of Spontaneous Japanese (CSJ), 628 hrs.
  • Subtasks
    – Task 1: Spoken Term Detection
      • Find the occurrences of the given queried term.
    – Task 2: Spoken Document Retrieval
      • Find the passages including the relevant information related to a given query topic.
  • We are going to provide reference speech recognition results (N-best or lattice representation) so that those who are interested in SDR but not in ASR can participate in our tasks.
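As an illustration of the Spoken Term Detection subtask (Task 1), a minimal sketch that scans N-best ASR hypotheses for a queried term; the data format below is hypothetical, not the collection's actual distribution format:

# Minimal sketch of spoken term detection over N-best ASR hypotheses:
# report every (lecture, hypothesis rank) in which the queried term occurs.
def detect_term(term, nbest):
    """nbest: {lecture_id: [hypothesis string, ...]}, ordered best-first."""
    hits = []
    for lecture_id, hyps in nbest.items():
        for rank, hyp in enumerate(hyps, start=1):
            if term in hyp:                      # naive substring match, no word segmentation
                hits.append((lecture_id, rank))
    return hits

if __name__ == "__main__":
    nbest = {"S01F0001": ["音声認識 の 評価", "音声 認識 の 評価"],
             "S02M0002": ["情報検索 の 課題"]}
    print(detect_term("音声認識", nbest))          # [('S01F0001', 1)]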

41

slide-42
SLIDE 42

Merits of Participation

  • The organizing group (IPSJ SIG-SLP Spoken Document Processing working group) has already released 2 test collections.
    1. CSJ STD test collection [Itoh et al., 2010]
      • 250 query terms in total.
    2. CSJ SDR test collection [Akiba et al., 2009]
      • 39 query topics asking for passages in lectures.
      • Relevance judgment has been conducted manually (maybe imperfect).
  • Task participants can get the extended and refined versions of those test collections.
    – More query terms and query topics
    – Pooling-based relevance judgment
    – New evaluation metrics (including time and space efficiency, document-level and passage-level relevancy)

  • Visit our Web site for more information

– http://www.nlp.cs.tut.ac.jp/~sdpwg/index.php?ntcir9

42

slide-43
SLIDE 43

Cross-Lingual Link Discovery

Calls for task participation

43

slide-44
SLIDE 44

Why Cross-Lingual Link Discovery?

[Figure: Wikipedia example contrasting mono-lingual and cross-lingual link discovery. A Chinese page has language links, but no language link to the topic in languages other than Chinese; a page about "花蟹" (flower crab) actually exists in English but is not linked yet, so link discovery should link these two pages with each other.]

1. Trying to search what "花蟹" is on Wikipedia, and maybe find the "花蟹" articles in other languages.
2. Wikipedia brings you the "香港十元紙幣" (Hong Kong ten-dollar note), not the flower crab.
3. What about the actual crab? Where can we find the English page about 花蟹?
4. What can link discovery do?

44

slide-45
SLIDE 45

Cross-lingual Link Discovery Task in NTCIR-9

  • Cross-lingual link discovery (CLLD) is a way of automatically finding potential links between isolated documents in different languages.
  • Task goal: to create a reusable resource for evaluating automated cross-language link discovery approaches. The results of this research will be used in building and refining systems for automated link discovery.
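As a rough illustration of the idea (not the task's actual evaluation tooling): one naive cross-lingual link discovery approach is to spot anchor candidates in a source-language article and map them to target-language titles, e.g. via known language links. The titles and the mapping below are hypothetical:

# Naive sketch of cross-lingual link discovery: find phrases in a Chinese
# article that match known article titles, then map each matched title to an
# English article title via a (hypothetical) cross-language title dictionary.
def discover_links(text, zh_titles, zh_to_en):
    links = []
    for title in zh_titles:
        if title in text:                               # naive anchor spotting by substring match
            links.append((title, zh_to_en.get(title)))  # None if no English counterpart is known
    return links

if __name__ == "__main__":
    article = "花蟹是常見的食用蟹，也出現在香港十元紙幣的設計上。"
    zh_titles = ["花蟹", "香港十元紙幣"]
    zh_to_en = {"花蟹": "Flower crab", "香港十元紙幣": "Hong Kong ten-dollar note"}
    print(discover_links(article, zh_titles, zh_to_en))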

45

slide-46
SLIDE 46

We Need Cross-Lingual Link Discovery

  • Not just for Wikipedia and not just for wikis.
  • Cross-language link discovery can allow us to discover documents on the web or in a digital library in languages which we are either more familiar with, or which have a richer set of documents than our language of choice.

FACT:

  • There are at least 83 popular wikis in the world serving people from different language backgrounds for different needs.
  • There are at least 44 wiki-style documentation management software packages, and their forks, helping numerous projects and corporations manage knowledge. (source: Wikipedia)

logos of various Wiki software

46

slide-47
SLIDE 47

VisEx

Interactive Visual Exploration Task

Organized by

Tsuneaki Kato

The University of Tokyo

Mitsunori Matsushita

Kansai University

47

slide-48
SLIDE 48

What is VisEx

  • A pilot task for establishing an evaluation framework for explorative information access environments
  • Participants submit their proposed Information Access Environment Systems (IAESs)
    – which should be able to be embedded in a common framework
    – which should be able to handle given experimental tasks
  • Submitted IAESs are evaluated in laboratory experiments with human subjects for gathering subjective and objective data
  • Reports are requested that explain the experimental results in terms of the process primitives and process model of the submitted IAESs

Final objective: an efficient and effective evaluation framework and a model of explorative information access

48

slide-49
SLIDE 49

Task Outline

[Figure: task outline. The organizer provides the framework (browser with log collection, editor, display and other functions, an information retrieval engine, documents) and the experimental tasks, plus a baseline IAES core; participants submit their IAES cores, which are evaluated in laboratory experiments with human subjects while logs are collected.]

It is important to discuss the following through the WS:

  • I/F between an IAES core and the framework
  • Taxonomy of process primitives
  • Detailed design of the laboratory experiments

49

slide-50
SLIDE 50

Experimental Tasks

  • Event Collection Task
    – Uses the event-list questions in the NTCIR-7 ACLIA Task
      • Please tell me about incidents where NATO has recognized cases of friendly fire.
      • Please tell me about airplane crashes that have happened in Asia.
    – Requests subjects to gather as many nuggets (event characteristics such as its time and place) as possible in a given time period
  • Trend Summarization Task
    – Summarization of the trend (not only the changes but also their background and influence) of time-series statistical information, such as the subjects of the NTCIR-5, -6, -7 MuST workshops
      • Please tell me a summary of the states of the cabinet approval rating from 1998 to 1999.
    – Requests subjects to gather as many nuggets as possible in a given time period, where nuggets are the primitive pieces of information that constitute a requested summary

50

slide-51
SLIDE 51

Info.

  • Schedule
    – End of Oct. 2010: Participation registration (first) due
    – End of Dec. 2010: IAES I/F description release
    – Latter part of Mar. 2011: IAES framework and baseline IAES core release
    – Latter part of Jul. 2011: Laboratory experiments
    – Latter part of Aug. 2011: Experiment results release
  • Contacts
    – Tsuneaki Kato: kato@boz.c.u-tokyo.ac.jp
    – Mitsunori Matsushita: mat@res.kutc.kansai-u.ac.jp

  • Home Page

– http://must.c.u-tokyo.ac.jp/visex

51

slide-52
SLIDE 52

Patent Machine Translation Task (PatentMT)

Call for task participation

52

slide-53
SLIDE 53

Background

  • Patent information is important in society worldwide
  • There is a large need for translation to access patent information written in foreign languages and to apply for patents in foreign countries
  • We have organized patent machine translation tasks to address this significant practical need

53

slide-54
SLIDE 54

Outline

  • Participants machine translate patent sentences

    Subtasks and parallel data:
    – Chinese to English (new): 1 million sentence pairs
    – Japanese to English: 3 million sentence pairs
    – English to Japanese: 3 million sentence pairs
    Test data: 2,000 sentences; data type: patent descriptions

  • Participants select the subtasks in which they want to participate
  • Human evaluations of MT quality will be carried out (primary evaluation)

54

slide-55
SLIDE 55

Why is it so exciting to participate?

  • Patents are one of the challenging domains for MT
    – Patent sentences can be quite long and contain complex structures
  • Participants will receive a reliable evaluation of their MT quality
    – Human evaluations will be carried out
  • Participants can use large-scale patent parallel and monolingual corpora
  • Participants can choose subtasks from three language directions, including the popular language direction of Chinese to English

55

slide-56
SLIDE 56

Highlights

  • A Chinese-to-English subtask has been added
  • Human evaluations will be carried out
  • 1 million Chinese-English and 3 million Japanese-English patent parallel corpora will be provided

56

slide-57
SLIDE 57

NTCIR-9 Task Map

Summary

57

slide-58
SLIDE 58

Context of Information Access

[Figure: layers of context for information access, adapted from Ingwersen & Järvelin (2005): infrastructure (history, economy, technology, society); organisation and contextual task; roles and work task; user-system interaction; documents (between-document links, hyperlinks, references, multi-document relations, document structure, word/syntax/character level); temporal data / time.]

58

slide-59
SLIDE 59

NTCIR-9 Tasks

[Figure: NTCIR-9 tasks mapped onto the same context model, covering a wide context and rich media types. Work task and interaction: Intent, Vis-Ex; between-document and time: RITE, CrossLink; document: RITE, PatentMT, CrossLink, SpokenDoc, GeoTime; document structure: PatentMT, SpokenDoc, GeoTime. Genres: news, web, legal, speech.]

59

slide-60
SLIDE 60

NTCIR's Grand Challenge

[Figure: the same task map, with NTCIR's infrastructure for IA evaluation at the base, aiming at impact on real challenges in our society.]

60

slide-61
SLIDE 61

Why participate?

Case for students and industry

61

slide-62
SLIDE 62

Why participate? (Students)

  • Perfect schedule
    – Task: Jan - Aug 2011
    – Writing: Sep - Nov 2011
    – Presentation: Dec 2011
  • Publications
    – Comparison with other participants can produce stronger arguments
    – Inspired by the international community for future work
  • Easy start-up
    – Much of the experimental setup is provided
    – Performance measures are (often) defined
  • Diverse tasks
    – Range of information access tasks to tackle

62

slide-63
SLIDE 63

Why participate? (Industry) Why participate? (Industry)

  • Establish your brand

– To your end-users and

  • Faster development

– Brush up your product or y competitors – Recruit smart people p y p eliminating bugs in a short period of time p p

  • Fair benchmarking

h

  • Early access to resulted

resources

– Comparison with your

  • wn products can be

biased – Secondary resources developed by the task biased – Critical self-assessments p y are yours, too

63

slide-64
SLIDE 64

How to participate

Simple six steps

64

slide-65
SLIDE 65

How to participate

  1. Read the task description and CFP carefully
  2. Contact a TO if you have questions
  3. Decide a task to participate in
  4. Register as a participant at the NTCIR website
  5. Fill in the User Agreement Forms
  6. Keep an eye on a task's ML, website, etc. to follow the activity

Don't hesitate to send feedback to the TO

65

slide-66
SLIDE 66

Important Dates

For your diary

66

slide-67
SLIDE 67

Important Dates

05/10/2010: Kick-off event in Tokyo
20/12/2010: Task registration due
05/01/2011: Document set release
01 - 05/2011: Dry run
03 - 07/2011: Formal run (contact the TO for the exact schedule)
22/08/2011: Evaluation results due
22/08/2011: Task overview partial release
20/09/2011: Participant paper submission due
04/11/2011: All camera-ready copy for the Proceedings due
06-09/12/2011: NTCIR-9 Meeting, NII, Tokyo, Japan

67

slide-68
SLIDE 68

Wrap-up

  • The ninth cycle of NTCIR has started
    – Under the new structure
  • Seven exciting tasks are running
    – Organised by 31 researchers worldwide
  • Lots of opportunities for innovative work
    – Exchange great ideas with the community

  • What’s missing is your participation!

68

slide-69
SLIDE 69

http://research.nii.ac.jp/ntcir/ntcir-9/

Thank you for your attention!

For further enquiries, contact the NTCIR office ntc-secretariat nii.ac.jp

Q & A


69