Evaluation Metrics - Presented by Dawn Lawrie (PowerPoint PPT Presentation)


SLIDE 1

Evaluation Metrics

Presented by Dawn Lawrie

SLIDE 2

Some Possibilities

• Precision
• Recall
• F-measure
• Mean Average Precision
• Mean Reciprocal Rank

SLIDE 3

Precision

Proportion of things of interest in some set

Example: I’m interested in apples

Set Precision = 3 apples / 5 pieces of fruit

SLIDE 4

Recall

Proportion of things of interest in the set, out of all the things of interest

Example: I’m looking for apples

Set Recall = 3 apples / 6 total apples
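
Read as counts, the two set examples above translate directly into code. A minimal sketch in Python, assuming the retrieved fruit and the full set of apples are modeled as sets (the item names are invented for illustration):

```python
# Set-based precision and recall for the apple example:
# 3 apples among 5 retrieved pieces of fruit, out of 6 apples in total.
retrieved = {"apple1", "apple2", "apple3", "pear1", "banana1"}            # 5 pieces of fruit
relevant = {"apple1", "apple2", "apple3", "apple4", "apple5", "apple6"}   # 6 apples overall

true_positives = len(retrieved & relevant)

precision = true_positives / len(retrieved)   # 3 / 5 = 0.6
recall = true_positives / len(relevant)       # 3 / 6 = 0.5

print(precision, recall)
```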

SLIDE 5

F-measure

• Harmonic mean of precision and recall
• Combined measure that values each the same

F1 = (2 * precision * recall) / (precision + recall)
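
Plugging the precision (3/5) and recall (3/6) from the apple example into this formula gives roughly 0.55. A small sketch (the helper name f1_score is illustrative, not scikit-learn's function):

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (F1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

print(f1_score(0.6, 0.5))   # about 0.545 for the apple example
```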

SLIDE 6

Where to use

• The set is well defined
• Order of things in the set doesn’t matter

SLIDE 7

But with a Ranked List

[Figure: two ranked lists with positions numbered 1–10]

SLIDE 8

Mean Average Precision

• Also known as MAP
• Favored IR metric for ranked retrieval

SLIDES 9-17 (builds of one slide)

Computing Average Precision

Let Relevant = Set of Apples

AP(Relevant) = ( Σ_{r ∈ Relevant} Precision(Rank(r)) ) / |Relevant|

Ordered list = ranked list

[Figure: a ranked list of fruit; the apples appear at ranks 2, 3, 6, 10, 11, and 12]

Across the builds, the precisions at the relevant ranks accumulate:

1/2
1/2 + 2/3
1/2 + 2/3 + 3/6
1/2 + 2/3 + 3/6 + 4/10 + 5/11 + 6/12

AP = (1/2 + 2/3 + 3/6 + 4/10 + 5/11 + 6/12) / 6 ≈ 0.50
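
A minimal sketch of this computation, assuming the ranked list is encoded as per-rank relevance flags with the apples at ranks 2, 3, 6, 10, 11, and 12 (the helper name average_precision is illustrative):

```python
def average_precision(is_relevant):
    """AP over a ranked list given per-rank relevance flags (rank 1 first)."""
    hits = 0
    precisions = []
    for rank, rel in enumerate(is_relevant, start=1):
        if rel:
            hits += 1
            precisions.append(hits / rank)   # Precision(Rank(r))
    # Divide by the number of relevant items found; here all 6 apples appear.
    return sum(precisions) / len(precisions) if precisions else 0.0

# Apples at ranks 2, 3, 6, 10, 11 and 12 of a 12-item ranked list.
relevant_ranks = {2, 3, 6, 10, 11, 12}
ranking = [r in relevant_ranks for r in range(1, 13)]
print(average_precision(ranking))   # (1/2 + 2/3 + 3/6 + 4/10 + 5/11 + 6/12) / 6 ≈ 0.504
```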

SLIDE 18

Compute MAP

Compute average over a query set

Example query set: Apple Query, Blueberry Query, Pineapple Query, Banana Query

MAP(Query) = ( Σ_{q ∈ Query} AP(Relevant(q)) ) / |Query|
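
MAP is just the mean of the per-query AP values. A sketch using the slide's four example queries; the AP numbers other than the apple query's are invented for illustration:

```python
# Hypothetical per-query AP values; only the apple query's AP comes from
# the worked example above, the rest are made up for illustration.
ap_by_query = {
    "apple": 0.504,
    "blueberry": 0.72,
    "pineapple": 0.31,
    "banana": 0.65,
}
map_score = sum(ap_by_query.values()) / len(ap_by_query)
print(map_score)
```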

SLIDE 19

Limitation of MAP

Results can be biased for query sets that include queries with few relevant documents
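
For example, a query with a single relevant document that happens to be retrieved at rank 1 contributes AP = 1.0 and can pull the mean up sharply. A toy illustration with assumed AP values:

```python
# Toy illustration (assumed AP values, not from the slides):
# one easy query with a single relevant document dominates the mean.
ap_hard_query = 0.25   # query with many relevant documents, middling AP
ap_easy_query = 1.00   # single relevant document, retrieved at rank 1
print((ap_hard_query + ap_easy_query) / 2)   # 0.625, pulled up by the easy query
```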

SLIDES 20-21

Mean Reciprocal Rank

Reciprocal Rank of a query q:

RR(q) = 0                 if q retrieves no relevant documents
RR(q) = 1 / TopRank(q)    otherwise

MRR(Query) = ( Σ_{q ∈ Query} RR(q) ) / |Query|
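
A minimal sketch of both definitions, assuming each query is summarized by the rank of its top-ranked relevant document (None when nothing relevant is retrieved); the function names are illustrative:

```python
def reciprocal_rank(top_rank):
    """RR(q): 1 / TopRank(q), or 0 when no relevant document is retrieved."""
    return 0.0 if top_rank is None else 1.0 / top_rank

def mean_reciprocal_rank(top_ranks):
    """MRR over a query set, given each query's top relevant-document rank."""
    return sum(reciprocal_rank(r) for r in top_ranks) / len(top_ranks)

print(mean_reciprocal_rank([1, None, 3]))   # (1 + 0 + 1/3) / 3 ≈ 0.444
```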

SLIDES 22-26 (builds of one slide)

Understanding MRR

Four queries retrieve their top relevant document at ranks 5, 15, 205, and 215.

Rank    RR value
5       0.2
15      0.067
205     0.0049
215     0.0047

Average rank: 110    MRR: 0.069
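
Redoing that arithmetic directly on the four ranks reproduces the numbers on the slide (a small self-contained sketch):

```python
# Top-relevant-document ranks for the four example queries.
top_ranks = [5, 15, 205, 215]

average_rank = sum(top_ranks) / len(top_ranks)             # 110.0
mrr = sum(1.0 / r for r in top_ranks) / len(top_ranks)     # about 0.069

print(average_rank, mrr)
```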

SLIDE 27

MRR vs. Average Rank

• MRR = MAP when there is only one relevant document per query
• Bounds the result between 0 and 1; 1 is perfect retrieval
• Average rank is greatly influenced by documents retrieved at large ranks
• High ranks do not reflect the importance of those documents in practice
• MRR minimizes the difference between, say, ranks 750 and 900

SLIDE 28

Take Home Message

• Precision/Recall and F-measure are good for well-defined sets
• MAP is good for ranked results when you’re looking for 5+ things
• MRR is good for ranked results when you’re looking for <5 things, and best when there is just 1 thing
