A Proposal to Extend and Enrich the Scientific Data Curation of Evaluation Campaigns



SLIDE 1

First International Workshop on Evaluating Information Access (EVIA 2007)
Tokyo, May 15, 2007

A Proposal to Extend and Enrich the Scientific Data Curation of Evaluation Campaigns

Maristella Agosti, Giorgio Maria Di Nunzio, Nicola Ferro
Information Management Systems (IMS) Research Group
Department of Information Engineering
University of Padua, Italy

SLIDE 2

Outline

  • Background on experimental evaluation
  • Motivations and objectives
  • Discussion on the current methodology
  • Possible extensions to the current methodology
  • A concrete example: the DIRECT system
SLIDE 3

Experimental Evaluation Background

  • The Cranfield evaluation methodology is a very well understood paradigm
      • its main focus is on experiment comparability and performance evaluation
  • Several successful evaluation initiatives (TREC, CLEF, NTCIR, …) have adopted this paradigm
      • they have produced a huge amount of data, promoted the research in the IR field, and favoured the creation of cross-disciplinary communities
  • Steve Robertson, in his keynote at ECIR 2007, pointed out
      • the IR field has a long tradition in evaluation and “the tradition that began with Cranfield is alive and kicking, half a century later”
      • we need to understand “how and when to push its boundaries, how and when to transcend it without throwing it away”
      • such evaluation initiatives are an "extremely valuable infrastructure for the [IR] field".

SLIDE 4

Motivation

  • Comparable experiments
  • Performance measurements concerning the experiments
  • Descriptive statistics about a collection of experiments
  • Hypothesis tests for in-depth analysis of the experiments

Experimental evaluation is a scientific activity and, as such, we have to realise that its outcomes are very valuable scientific data.

SLIDE 5

Objectives

Investigate whether these scientific data are

  • properly modelled
  • effectively managed and archived
  • carefully curated and enriched

in the current evaluation methodology

[Diagram labels: Information Hierarchy, Data Curation, External Stakeholders]

SLIDE 6

Information Hierarchy

  • experimental collections and the experiments are data, since they are the raw, basic elements needed for any further investigation
  • performance measurements are information, since they are the result of computations and processing on the data
  • descriptive statistics and the hypothesis tests are knowledge, since they are a further elaboration of the information carried by the performance measurements
  • theories, models, algorithms, and techniques are wisdom, since they provide interpretation, explanation, and formalization of the content of the previous levels.
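
The first three levels of this hierarchy can be illustrated with a small sketch. It is not taken from the slides: the topic and document identifiers, the scores, and the choice of average precision as the performance measurement are all assumptions made for illustration.

```python
from statistics import mean, median


def average_precision(ranking, relevant):
    """Average precision of one ranked list against a set of relevant ids."""
    hits, total = 0, 0.0
    for rank, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank  # precision at this relevant document
    return total / len(relevant) if relevant else 0.0


# DATA: relevance judgments and one experiment (all values are invented).
qrels = {"t1": {"d1", "d3"}, "t2": {"d2"}}
run = {"t1": ["d1", "d2", "d3"], "t2": ["d1", "d2", "d3"]}

# INFORMATION: a performance measurement computed from the data, per topic.
ap = {topic: average_precision(run[topic], qrels[topic]) for topic in qrels}

# KNOWLEDGE: descriptive statistics elaborated from the measurements.
summary = {"mean": mean(ap.values()), "median": median(ap.values())}

print(ap)
print(summary)
```

Each level is derived only from the one below it, which is why keeping the lower levels archived allows the upper ones to be recomputed or extended later.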

SLIDE 7

Data Curation

  • Scientific data should be archived, preserved, maintained over time, and made easily accessible to interested users;
      • Re-use of data for new research
      • Retention of expensive or difficult-to-generate data
  • Their lineage should be tracked since it allows us to judge the quality and applicability of information for a given use;
      • Validation of published research results
  • Scientific data should be enriched by progressively adding further analyses and interpretations to them;
      • Enhancement of existing data available for research projects
  • It should be possible to cite scientific data and their further elaboration
      • cross-dissemination of scientific results to research communities and industrial partners

SLIDE 8

External Stakeholders

  • The EC in the 7FP i2010 Digital Library Initiative states that
      • digital repositories of scientific information are essential elements to build European eInfrastructure for knowledge sharing and transfer, feeding the cycles of scientific research and innovation up-take
  • The US National Scientific Board points out that
      • organizations make choices on behalf of the current and future user community on issues such as collection access; collection structure; technical standards and processes for data curation; ontology development; annotation; and peer review
  • The Australian Working Group on Data for Science suggests to
      • establish a nationally supported long-term strategic framework for scientific data management, including guiding principles, policies, best practices and infrastructure

SLIDE 9

Extending the Approach to the Evaluation (1/2)

  • Introduce a conceptual model
      • it makes clear what the entities entailed by the information space of an evaluation campaign are, together with their features and their relationships
      • logical models can be derived from it to manage and preserve the experimental data
      • commonly agreed data formats for exchanging information can be derived from it
  • Develop common metadata formats
      • they provide meaning to the data, and thereby enable their sharing and re-use
      • they make it possible to keep track of the lineage of the managed information
  • Adopt a unique identification mechanism
      • it allows for explicit citation and easy access to the scientific data and it supports the enrichment of the scientific data

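
How metadata, lineage, and unique identification could fit together is sketched below. The slides do not specify any metadata format or identifier scheme, so the `Record` class, the `rec:` prefix, and the content-hash identifier are hypothetical choices made only for illustration.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Hypothetical metadata record for one piece of scientific data."""
    kind: str                  # e.g. "experiment", "measure", "analysis"
    content: str               # short description of the data
    derived_from: tuple = ()   # identifiers of the source records (lineage)

    @property
    def identifier(self) -> str:
        # Assumption: the identifier is a hash of the record itself, so the
        # same record always receives the same citable identifier.
        payload = "|".join([self.kind, self.content, *self.derived_from])
        digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
        return "rec:" + digest[:12]


def lineage(identifier, store):
    """Return the identifiers of all ancestors of a record, nearest first."""
    seen = []
    queue = list(store[identifier].derived_from)
    while queue:
        current = queue.pop(0)
        if current not in seen:
            seen.append(current)
            queue.extend(store[current].derived_from)
    return seen


run = Record("experiment", "run-A ranked lists")
measure = Record("measure", "average precision of run-A", (run.identifier,))
analysis = Record("analysis", "comparison note on run-A", (measure.identifier,))
store = {r.identifier: r for r in (run, measure, analysis)}

print(lineage(analysis.identifier, store))
```

Because every enrichment points back to the identifiers it was derived from, a citation of the analysis can be traced down to the original experiment.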

SLIDE 10

Extending the Approach to the Evaluation (2/2)

  • Provide common tools for statistical analyses
      • they allow for judging whether measured differences between retrieval methods can be considered statistically significant
      • a uniform way of performing statistical analyses on experiments makes the analysis and assessment of the experiments comparable too
  • Design and develop a Digital Library System (DLS) for IR scientific data
      • it is well suited for managing and making accessible the scientific data and the experiments produced during the course of an evaluation campaign
      • it also provides tools for analyzing, comparing, and citing the scientific data of an evaluation campaign, as well as curating, preserving, annotating, enriching, and promoting the re-use of them
  • Give organizations responsible for evaluation initiatives an active role in this process
      • they should take a leadership role in developing a comprehensive strategy for long-lived digital data collections and drive the research community through this process in order to improve the way of doing research
      • they should also take care of defining guiding principles, policies, and best practices for making use of the scientific data produced during the evaluation campaign itself
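
As an illustration of what a common statistical tool might look like, the sketch below runs a paired randomization (sign-flip) test on per-topic scores of two retrieval methods. The slides do not name a specific test, so this choice is an assumption, and the scores are invented.

```python
from itertools import product


def paired_randomization_test(scores_a, scores_b):
    """Exact two-sided sign-flip test on per-topic score differences.

    Under the null hypothesis the two methods are interchangeable, so the
    sign of each per-topic difference could equally well be flipped.
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(sum(diffs))
    extreme = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        flipped = abs(sum(s * d for s, d in zip(signs, diffs)))
        if flipped >= observed - 1e-12:  # tolerance for float rounding
            extreme += 1
    return extreme / total


# Invented per-topic scores for two hypothetical retrieval methods.
method_a = [0.50, 0.62, 0.41, 0.70, 0.55, 0.48]
method_b = [0.45, 0.60, 0.35, 0.66, 0.50, 0.47]

p_value = paired_randomization_test(method_a, method_b)
print(p_value)
```

The exact enumeration visits 2^n sign assignments, so it is only practical for a handful of topics; with realistic topic sets the same idea is usually applied by sampling random sign assignments instead. Running every experiment of a campaign through one shared routine of this kind is what makes the resulting significance statements comparable.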

SLIDE 11

DIRECT: a DLS for IR Scientific Data

  • DIRECT (Distributed Information Retrieval Evaluation Campaign Tool) is a digital library system for managing the scientific data produced during an evaluation campaign
  • DIRECT has been adopted in
      • CLEF 2005: 30 participants spread over 15 different nations submitted more than 530 experiments; 15 assessors assessed more than 160,000 documents in 7 different languages (Latin and Cyrillic alphabets)
      • CLEF 2006: nearly 75 participants spread over 25 different nations submitted around 570 experiments; 40 assessors assessed more than 198,500 documents in 9 languages (Latin and Cyrillic alphabets)
      • CLEF 2007: ongoing

SLIDE 12

DIRECT Conceptual Model

[Figure: entity-relationship diagram of the DIRECT conceptual model. The cardinality constraints, such as (0, N), (1, N), (0, 1) and (1, 1), cannot be reliably re-attached to their relationships from the extracted text.]

Labels legible in the extraction, with the attribute names that follow them (the grouping is a best-effort reading of the extracted order):

  • CAMPAIGN: Acronym, ID, Status
  • USER: ID, Pwd, ContactMail, FullName, UserType, Country, Language
  • TRACK: ID, Description, ClosingDate, MetricStatus, SubmissionStatus
  • TASK: ID, MaxExperiments, Description
  • COLLECTION: ID
  • COLLECTIONFILE: ID
  • TOPIC: ID
  • TOPICFIELD: ID
  • TOPICCONTENT: Content, Mime
  • DOCUMENT: ID, Content
  • EXPERIMENT: ID, Handle, Type, Description, SourceLanguage, QueryConstruction, TopicField
  • EXPERIMENTBLOB: FileName, SubmittedFile, FileSize
  • ITEM: Rank, Score
  • SUBMIT: Timestamp (TS), Priority
  • HASFIELD: Mandatory
  • Other labels: STRUCTURE, USE, CONSISTOF, SUBSCRIBE, HASTOPIC, COMPOSE, ORGANIZE, USEFIELD, PARTICIPATE, CONTAIN, HASBLOB
  • Attribute names whose owner is unclear: Language, Parser, Mime, ID, Position
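
To show how a logical model might be derived from part of such a conceptual model, here is a minimal sketch covering only EXPERIMENT, ITEM, and DOCUMENT. The attribute names come from the diagram labels, but the way the classes are linked (an experiment holding items that each point to a document) is an assumption, not the actual DIRECT schema.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    id: str
    content: str


@dataclass
class Item:
    # Assumption: an item records the rank and score that an experiment
    # assigned to one retrieved document.
    document_id: str
    rank: int
    score: float


@dataclass
class Experiment:
    id: str
    description: str
    source_language: str
    items: list = field(default_factory=list)

    def ranked_documents(self):
        """Return the document ids ordered by ascending rank."""
        return [i.document_id for i in sorted(self.items, key=lambda i: i.rank)]


# Invented example values.
experiment = Experiment("exp-1", "baseline run", "en")
experiment.items.append(Item("d7", rank=2, score=0.61))
experiment.items.append(Item("d3", rank=1, score=0.87))

print(experiment.ranked_documents())
```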

SLIDE 13

DIRECT Architecture

[Figure: DIRECT architecture diagram, with components labelled Interface Logic, Application Logic, and Data Logic]

SLIDE 14

DIRECT Participant Interface (1/2)

SLIDE 15

DIRECT Participant Interface (2/2)

SLIDE 16

DIRECT Assessor Interface (1/2)

SLIDE 17

DIRECT Assessor Interface (2/2)

SLIDE 18

Conclusions

  • We have discussed the issues concerning the management, enrichment and curation of the scientific data produced during the evaluation
  • We have proposed some possible extensions to the current evaluation methodology
  • We have designed and developed the DIRECT system, a DLS for IR scientific data, which has been tested in the context of CLEF 2005 and 2006