Course Work: The trec_eval tool (IR4 Course, Iadh Ounis, Winter 2002)


IO IR4; November 2002 IR Assessed Exercise 1


Course Work

The trec_eval tool

IR4 Course Iadh Ounis Winter 2002


About trec_eval

  • This evaluation tool works only on a Solaris machine

    – Location: /local/ir/ir4tools

  • Please *Read* the README file!

    – Instructions + useful information about the tool


The trec_eval Syntax

  • Syntax:

    trec_eval [-q] [-a] MED.REL YOUR_TOP_REL

    MED.REL: the MEDLINE relevant docs file (provided in /local/ir/ir4tools)
    YOUR_TOP_REL: the relevant docs given by YOUR system


MED.REL (given)

    1 13 1
    1 14 1
    1 15 1
    1 72 1
    ...
    2 80 1
    2 90 1
    ...

i.e. tuples of the form (qid, iter, docno, rel)
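For processing the judgments programmatically, the file can be loaded into a lookup table. A minimal Python sketch, assuming qid is the first whitespace-separated field and the relevance flag the last (this matches both the three-field sample shown above and a four-field qid/iter/docno/rel layout); the name `load_qrels` is chosen here for illustration and is not part of the coursework:

```python
def load_qrels(lines):
    """Collect the set of relevant docnos for each query id.

    Assumes qid is the first field, rel the last, and docno the field
    just before rel, so both "1 13 1" and "1 0 13 1" shapes parse.
    """
    relevant = {}
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        qid, docno, rel = fields[0], fields[-2], fields[-1]
        if int(rel) > 0:
            relevant.setdefault(qid, set()).add(docno)
    return relevant
```

In practice you would pass it the open file, e.g. `load_qrels(open("MED.REL"))`.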


YOUR_TOP_REL

    1 18 2.789045 Bingo!
    1 19 2.129078 Bingo!
    1 31 2.000091 Bingo!
    1 45 1.889005 Bingo!
    ...
    2 58 4.567980 Bingo!
    2 99 3.210000 Bingo!
    ...

i.e. tuples of the form (qid, iter, docno, rank, sim, run_id)

Bingo! is the name of our system (please feel free to use any other name)
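One line of the results file can be assembled directly from the field order named in the caption. A hedged sketch (the helper name `format_run_line` and the constant iteration value 0 are assumptions made here, not part of the slides; the printed sample above omits some columns):

```python
def format_run_line(qid, docno, rank, sim, run_id="Bingo!", iteration=0):
    # Field order follows the slide's caption: qid iter docno rank sim run_id.
    return f"{qid} {iteration} {docno} {rank} {sim:.6f} {run_id}"

line = format_run_line(1, 18, 1, 2.789045)
# e.g. "1 0 18 1 2.789045 Bingo!"
```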


Ensure that ...

  • Your 2 input files are sorted numerically by qid
  • YOUR_TOP_REL is additionally sorted so that, within each query, higher similarity measures (sim) come first
  • You *read* the (short) README file!

At the end ….

  • Once your YOUR_TOP_REL file is ready

(MED.REL is given in /local/ir/i4tools!), all you have to do is to write on your console:

trec_eval MED.REL YOUR_TOP_REL (ensure that trec_eval, MED.REL and Your_Top_REL are copied to your local directory) Look at the -q option (but do not print it)


As a Result you should Get ...

  • A lot of tables (these tables should be included in your report and/or floppy disk according to the instructions of the README file!), things like ….

    Interpolated Recall - Precision Averages:
      at 0.00    0.49
      at 0.10    0.36
      at 0.20    0.32
      at 0.30    0.26
      etc…
      at 1.00    0.09

  • Use these values to draw your precision/recall graphs
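The interpolated figures in such a table come from a simple rule: precision at recall level r is the maximum precision observed at any rank whose recall is at least r. A sketch of that computation (the function name is assumed here for illustration):

```python
def interpolated_precisions(ranked, relevant, levels=None):
    """Interpolated precision at standard recall levels 0.0, 0.1, ..., 1.0:
    for each level, the best precision at any cutoff reaching that recall."""
    if levels is None:
        levels = [i / 10 for i in range(11)]
    points = []  # (recall, precision) after each retrieved document
    hits = 0
    for rank, docno in enumerate(ranked, start=1):
        if docno in relevant:
            hits += 1
        points.append((hits / len(relevant), hits / rank))
    return [max((p for r, p in points if r >= level), default=0.0)
            for level in levels]
```

For example, with ranking ["a", "b", "c", "d"] and relevant set {"a", "c"}, precision at recall 0.0 is 1.0 (the relevant "a" is ranked first) and at recall 1.0 it is 2/3 (both relevant docs are found within the top 3).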


    Query 1        Query 2
    R     P        R     P
    0.1   1        0.1   0.8
    0.2   0.8      0.3   0.6
    0.4   0.6      0.5   0.5
    0.6   0.4      0.7   0.4
    0.8   0.3      0.9   0.4

PR Curve

[Figure: PR curve, precision (0-1) plotted against recall (0-1)]

About Matching ….

  • You should first process the MED.QRY file.

    – Hint: Open up the query file. Take the first query, compute the result list, and write the result list into the file YOUR_TOP_REL using the right trec_eval input format! Take the second query, do the same, etc.

  • For the implementation of the similarity function, we suggest you use the similarity matching of the Best-match Model
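The per-query loop in the hint can be sketched as follows; `score` stands in for your own best-match similarity function and `write_run` is a name chosen here, not something the coursework provides:

```python
def write_run(out, queries, score, run_id="Bingo!"):
    """For each (qid, terms) query, score the collection, rank results by
    similarity, and emit lines in the format trec_eval expects."""
    for qid, terms in queries:
        results = score(terms)                 # -> list of (docno, sim)
        results.sort(key=lambda r: -r[1])      # best match first
        for rank, (docno, sim) in enumerate(results, start=1):
            out.write(f"{qid} 0 {docno} {rank} {sim:.6f} {run_id}\n")
```

Processing the queries in qid order keeps the whole file sorted by qid, which satisfies the "Ensure that" slide above.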


Submission Guidelines

  • Final system and reports are due on or before 20/12/2002

    – A short design report (4-5 pages):

      • inverted file structure
      • details of matching
      • details of building your inverted index

    – Input and output files of trec_eval (YOUR_TOP_REL and the -q flag output should be given in electronic format only)
    – A printout of your code
    – Precision-Recall graph of the trec_eval output

  • For more details: see /local/ir/ir4tools/README