Lecture 6: Evaluation
Information Retrieval Computer Science Tripos Part II Helen Yannakoudakis1
Natural Language and Information Processing (NLIP) Group helen.yannakoudakis@cl.cam.ac.uk
2018
1Based on slides from Simone Teufel and Ronan Cummins 1
[Figure: IR system architecture. Documents pass through document normalisation into the Indexer, which builds the Indexes; queries pass through query normalisation into the Ranking/Matching Module, which returns results through the UI.]
[Figure: the same IR system architecture, now with an Evaluation component attached.]
Relevant vs. retrieved contingency table:

                 Relevant           Non-relevant
Retrieved        True Positives     False Positives
Not retrieved    False Negatives    True Negatives
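The table above gives precision and recall directly as ratios of the contingency counts. A minimal sketch in Python (the example counts are illustrative, not from the lecture):

```python
# Precision and recall from relevant/retrieved contingency counts.

def precision(tp, fp):
    """Fraction of retrieved documents that are relevant: TP / (TP + FP)."""
    return tp / (tp + fp)

def recall(tp, fn):
    """Fraction of relevant documents that are retrieved: TP / (TP + FN)."""
    return tp / (tp + fn)

# Example: 5 relevant retrieved, 15 non-relevant retrieved, 5 relevant missed.
print(precision(5, 15))  # 0.25
print(recall(5, 5))      # 0.5
```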
F-measure: the harmonic mean of precision and recall.

F1 = 2PR / (P + R)

Example: with P = 1/3 and R = 1/4,

F1 = (2 × 1/3 × 1/4) / (1/3 + 1/4) = 2/7 ≈ 0.29
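The same computation as a small Python sketch, using exact fractions so the result 2/7 comes out without rounding (the helper name `f1` is illustrative):

```python
from fractions import Fraction

def f1(p, r):
    """Balanced F-measure: harmonic mean of precision p and recall r."""
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0

print(f1(Fraction(1, 3), Fraction(1, 4)))  # 2/7
```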
11-point interpolated average precision uses the standard recall levels rj = j/10 for j = 0, ..., 10: r0 = 0, r1 = 0.1, ..., r10 = 1.
[Figure: precision-recall plot for Query 1.]

Query 1 (5 relevant documents, marked X; non-relevant ranks 2, 4-5, 7-9, 11-19 omitted):

Rank  Relev.  R    P
  1   X       0.2  1.00    P̃1(r2)  = 1.00
  3   X       0.4  0.67    P̃1(r4)  = 0.67
  6   X       0.6  0.50    P̃1(r6)  = 0.50
 10   X       0.8  0.40    P̃1(r8)  = 0.40
 20   X       1.0  0.25    P̃1(r10) = 0.25
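The (R, P) points in the Query 1 table follow directly from the ranks of the relevant documents; a minimal Python sketch (the function name `rp_points` is illustrative):

```python
# Recall/precision after each relevant document in a ranking.
# After the i-th relevant document at rank k: R = i / (total relevant), P = i / k.

def rp_points(relevant_ranks, num_relevant):
    points = []
    for i, rank in enumerate(relevant_ranks, start=1):
        points.append((i / num_relevant, i / rank))  # (recall, precision)
    return points

# Query 1: relevant documents at ranks 1, 3, 6, 10, 20; 5 relevant in total.
for r, p in rp_points([1, 3, 6, 10, 20], 5):
    print(f"R={r:.1f}  P={p:.2f}")
# R=0.2  P=1.00
# R=0.4  P=0.67
# R=0.6  P=0.50
# R=0.8  P=0.40
# R=1.0  P=0.25
```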
Interpolated precision of query i at recall level rj: the highest precision observed at any recall at or above rj,

P̃i(rj) = max over r′ ≥ rj of Pi(r′)
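A minimal Python sketch of this interpolation at the 11 standard recall levels, using the Query 1 (recall, precision) points from the table above (the function name is illustrative):

```python
# Interpolated precision at the 11 standard recall levels r_j = j/10:
# P~(r_j) = max precision at any observed recall r' >= r_j.

def interpolate_11pt(points):
    """points: list of (recall, precision) pairs for one query."""
    return [max((p for r, p in points if r >= j / 10), default=0.0)
            for j in range(11)]

q1 = [(0.2, 1.00), (0.4, 0.67), (0.6, 0.50), (0.8, 0.40), (1.0, 0.25)]
print(interpolate_11pt(q1))
# [1.0, 1.0, 1.0, 0.67, 0.67, 0.5, 0.5, 0.4, 0.4, 0.25, 0.25]
```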
[Figure: precision-recall plot for Query 1 with interpolated precision.]

Query 1, interpolated precision at the 11 recall levels:

P̃1(r0) = 1.00    P̃1(r1) = 1.00    P̃1(r2)  = 1.00
P̃1(r3) = 0.67    P̃1(r4) = 0.67
P̃1(r5) = 0.50    P̃1(r6) = 0.50
P̃1(r7) = 0.40    P̃1(r8) = 0.40
P̃1(r9) = 0.25    P̃1(r10) = 0.25

(Worked avg-11-pt prec example for supervisions at the end of slides.)
Mean average precision (MAP) over N queries, where query j has Qj relevant documents and P(doci) is the precision at the rank of the i-th relevant document:

MAP = (1/N) Σ_{j=1}^{N} (1/Qj) Σ_{i=1}^{Qj} P(doci)
Query 1 (relevant documents at ranks 1, 3, 6, 10, 20):

Rank  P(doci)
  1   1.00
  3   0.67
  6   0.50
 10   0.40
 20   0.25
AVG: 0.564

Query 2 (relevant documents at ranks 1, 3, 15):

Rank  P(doci)
  1   1.00
  3   0.67
 15   0.20
AVG: 0.623

MAP = (0.564 + 0.623) / 2 ≈ 0.59
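The two averages above can be reproduced in a few lines of Python (a sketch; the small differences from 0.564/0.623 come from the table's rounded precisions):

```python
# Average precision for one query: mean of P(doc_i) over its relevant documents,
# where the i-th relevant document sits at the given rank (so P = i / rank).

def average_precision(relevant_ranks):
    return sum(i / rank for i, rank in enumerate(relevant_ranks, start=1)) \
        / len(relevant_ranks)

q1 = average_precision([1, 3, 6, 10, 20])  # ~0.563 (0.564 with rounded P values)
q2 = average_precision([1, 3, 15])         # ~0.622 (0.623 with rounded P values)
print(round((q1 + q2) / 2, 3))             # MAP: 0.593
```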
[Figure: precision-recall plot for Query 2.]

Query 2 (3 relevant documents, marked X; non-relevant ranks omitted):

Rank  Relev.  R     P
  1   X       0.33  1.00
  3   X       0.67  0.67
 15   X       1.00  0.20    P̃2(r10) = 0.20

Query 2, interpolated precision at the 11 recall levels:

P̃2(r0) = 1.00    P̃2(r1) = 1.00    P̃2(r2) = 1.00    P̃2(r3)  = 1.00
P̃2(r4) = 0.67    P̃2(r5) = 0.67    P̃2(r6) = 0.67
P̃2(r7) = 0.20    P̃2(r8) = 0.20    P̃2(r9) = 0.20    P̃2(r10) = 0.20
[Figure: averaged 11-point interpolated precision/recall graph over Queries 1 and 2.]
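The averaged curve can be computed by interpolating each query at the 11 standard recall levels and averaging level by level; a Python sketch using the two queries' (recall, precision) points (function name illustrative):

```python
# Averaged 11-point interpolated precision over two queries:
# interpolate each query at r_j = j/10, then average at each level.

def interpolate_11pt(points):
    """P~(r_j) = max precision at any observed recall r' >= r_j."""
    return [max((p for r, p in points if r >= j / 10), default=0.0)
            for j in range(11)]

q1 = [(0.2, 1.00), (0.4, 0.67), (0.6, 0.50), (0.8, 0.40), (1.0, 0.25)]
q2 = [(0.33, 1.00), (0.67, 0.67), (1.0, 0.20)]

avg = [(a + b) / 2 for a, b in zip(interpolate_11pt(q1), interpolate_11pt(q2))]
print([round(v, 3) for v in avg])
# [1.0, 1.0, 1.0, 0.835, 0.67, 0.585, 0.585, 0.3, 0.3, 0.225, 0.225]
```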