Evaluation Metrics
Presented by Dawn Lawrie
Some Possibilities
Precision
Recall
F-measure
Mean Average Precision
Mean Reciprocal Rank
Precision (Set)
Proportion of things of interest in some set
Example: I'm interested in apples
Set Precision = 3 apples / 5 pieces of fruit
Recall (Set)
Proportion of all things of interest that are found in the set
Example: I'm looking for apples
Set Recall = 3 apples / 6 total apples
F-measure
Harmonic mean of precision and recall
Combined measure that values each the same
F1 = (2 * precision * recall) / (precision + recall)
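A minimal sketch of these three set-based measures in Python, using the slides' fruit example; the function and item names are illustrative, not from the slides.

    def precision(retrieved: set, relevant: set) -> float:
        """Proportion of retrieved items that are of interest."""
        return len(retrieved & relevant) / len(retrieved) if retrieved else 0.0

    def recall(retrieved: set, relevant: set) -> float:
        """Proportion of all items of interest that were retrieved."""
        return len(retrieved & relevant) / len(relevant) if relevant else 0.0

    def f1(p: float, r: float) -> float:
        """Harmonic mean of precision and recall."""
        return 2 * p * r / (p + r) if (p + r) > 0 else 0.0

    # The fruit example: 3 apples among 5 retrieved pieces of fruit,
    # out of 6 apples of interest in total (item names are made up).
    retrieved = {"apple1", "apple2", "apple3", "pear1", "orange1"}
    apples = {"apple1", "apple2", "apple3", "apple4", "apple5", "apple6"}
    p, r = precision(retrieved, apples), recall(retrieved, apples)
    print(p, r, f1(p, r))  # 0.6 0.5 0.545...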
These measures assume the set is well defined and that the order of things in the set doesn't matter.
[Figure: two ranked lists of results at positions 1 through 10]
Mean Average Precision
Also known as MAP
Favored IR metric for ranked retrieval
Average Precision (AP)
Let Relevant = set of apples
Ordered list = ranked list; the apples appear at ranks 2, 3, 6, 10, 11, 12

AP(Relevant) = (1 / |Relevant|) × Σ_{r ∈ Relevant} Precision@Rank(r)

Walking down the ranked list, the precision at each relevant rank is
1/2, 2/3, 3/6, 4/10, 5/11, and 6/12, so

AP = (1/2 + 2/3 + 3/6 + 4/10 + 5/11 + 6/12) / 6 ≈ 0.50
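A short sketch of this computation, assuming the ranks of the relevant items are already known; the function name and signature are mine, not the slides'.

    def average_precision(relevant_ranks: list[int], num_relevant: int) -> float:
        """AP: average, over the relevant items, of precision at each item's rank.

        relevant_ranks -- 1-based ranks at which relevant items were retrieved
        num_relevant   -- |Relevant|, the total number of relevant items
        """
        hits, total = 0, 0.0
        for rank in sorted(relevant_ranks):
            hits += 1                # relevant items seen so far
            total += hits / rank     # precision at this rank
        return total / num_relevant

    # Slide example: 6 apples in total, retrieved at ranks 2, 3, 6, 10, 11, 12.
    print(average_precision([2, 3, 6, 10, 11, 12], 6))  # ~0.504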
Mean Average Precision (MAP)
Compute average over a query set
Example query set: Apple Query, Blueberry Query, Pineapple Query, Banana Query

MAP(Query) = (1 / |Query|) × Σ_{q ∈ Query} AP(Relevant(q))
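Given per-query AP values (for example, from the average_precision sketch above), MAP is just their mean; the AP numbers below are invented purely to show the averaging step.

    def mean_average_precision(ap_by_query: dict[str, float]) -> float:
        """MAP: mean of the per-query AP values over the query set."""
        return sum(ap_by_query.values()) / len(ap_by_query)

    # Hypothetical AP values for the four example queries.
    ap_by_query = {"apple": 0.504, "blueberry": 0.35, "pineapple": 0.80, "banana": 0.61}
    print(mean_average_precision(ap_by_query))  # 0.566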
MAP results can be biased for query sets that include queries with few relevant documents.
Mean Reciprocal Rank (MRR)

RR(q) = 1 / TopRank(q), or 0 if q retrieves no relevant documents,
where TopRank(q) is the rank of the highest-ranked relevant document retrieved for q

MRR(Query) = (1 / |Query|) × Σ_{q ∈ Query} RR(q)
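A sketch of both formulas, representing TopRank(q) as a 1-based rank or None when no relevant document is retrieved; that encoding is my assumption, not the slides'.

    def reciprocal_rank(top_rank: int | None) -> float:
        """RR(q): 1/TopRank(q), or 0 if q retrieved no relevant documents."""
        return 0.0 if top_rank is None else 1.0 / top_rank

    def mean_reciprocal_rank(top_ranks: list[int | None]) -> float:
        """MRR: mean of RR over the query set."""
        return sum(reciprocal_rank(r) for r in top_ranks) / len(top_ranks)

    # Three queries: top relevant hits at ranks 1 and 3, plus one query
    # that retrieved no relevant document (None).
    print(mean_reciprocal_rank([1, 3, None]))  # ~0.444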
MRR Example
Ranks of the top relevant document for four queries: 5, 15, 205, 215
RR values: 0.2, 0.067, 0.0049, 0.0047
Average rank: 110; MRR: 0.069
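The example's numbers check out with a quick standalone computation:

    ranks = [5, 15, 205, 215]                  # top relevant rank per query
    rr_values = [1 / r for r in ranks]         # 0.2, 0.0667, 0.0049, 0.0047
    print(sum(ranks) / len(ranks))             # average rank: 110.0
    print(sum(rr_values) / len(rr_values))     # MRR: ~0.069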
MRR Properties
MRR = MAP when each query has exactly one relevant document
Results are bounded between 0 and 1, where 1 is perfect retrieval
Average rank is greatly influenced by documents retrieved at large ranks, even though such high ranks do not reflect the importance of those documents in practice; MRR minimizes the difference between large ranks such as 750 and 900
Summary
Precision/Recall and F-measure: good for well defined sets
MAP: good for ranked results when you're looking for 5 or more things
MRR: good for ranked results when you're looking for fewer than 5 things, and best when there is just one thing