

SLIDE 1

ALPHATECH, Inc.

Performance Metrics for Group-Detection Algorithms

Presented at Interface 2004 May 29, 2004 Jim White jim.white@alphatech.com Sam Steingold Connie Fournelle

SLIDE 2

Introduction

  • What is the group detection problem?
  • Evaluating Group-Detection Algorithms (GDAs) with synthetic data
  • Performance metrics
  • Some performance evaluation results
SLIDE 3

Introduction to Group Detection

Flow: Links → GDA → Putative Groups (with link-quality parameters)

Example: links such as (1) P45, P671, P7; (2) P456, P73; (3) P7, P55, P873, P1356, P561, P3 go in; lists of proteins (the putative groups) come out.

  • Each link is a list of proteins that were observed to be working together or interacting, probably because they belong to a larger group of interacting proteins
  • Groups may be cellular processes, biological subsystems, …
  • Links are noisy fragments of evidence, possibly much smaller than the generating groups

SLIDE 4

Groups and Links

Diagram: Groups 1 through n generate links; orphan entities belong to no group.

  • Entities (proteins)
    • Exchangeable
    • May belong to more than one group
  • Groups (processes, systems)
    • Independent, may overlap
    • Generate links
  • Observed links
    • Either group-generated or clutter
    • Each group-generated link is produced by one of the groups
  • Orphan entities
    • Don’t belong to any group
  • Link-quality parameters
    • PI = prior probability that a link is clutter (independent of the groups)
    • PR = prior probability that an entity in a group-generated link is noise (not in the group)

SLIDE 5

How Links Are Generated

  • With probability PI, the link is clutter: select N entities from the population by uniform random sampling
  • With probability 1 − PI, make a group-generated link: randomly select a group, then randomly sample that group N times
  • Add noise to the group-generated link: each entity in the link has probability PR of being replaced by an entity from outside the group
  • Each link is either clutter or is generated by one of the groups
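The generative model above can be sketched in a few lines of Python. The function name and data layout (groups and the population as lists of entity ids) are illustrative assumptions, not part of the original slides.

```python
import random

def generate_link(groups, population, link_size, PI, PR):
    """Sketch of the link-generation model (names are illustrative).

    groups: list of lists of entity ids; population: list of all entity ids.
    PI: prior probability that a link is clutter.
    PR: probability that an entity in a group-generated link is noise.
    """
    if random.random() < PI:
        # Clutter link: N entities drawn uniformly from the population.
        return random.sample(population, link_size)
    # Group-generated link: pick a group, then sample it N times.
    group = random.choice(groups)
    link = [random.choice(group) for _ in range(link_size)]
    # Noise: each entity may be replaced by one from outside the group.
    outside = [e for e in population if e not in set(group)]
    return [random.choice(outside) if random.random() < PR else e
            for e in link]
```

Setting PI = PR = 0.1 reproduces the 10% clutter / 10% noise configuration used in the evaluations later in the deck.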

SLIDE 6

Evaluating GDAs with Synthetic Data

Flow: Synthetic Groups → Noisy Synthetic Links → GDA (the group-detection system under test) → GDA Outputs → Statistical Analysis → System Performance Metrics

System performance depends on both the GDA and the information content of the links

SLIDE 7

Testing with Synthetic Data Can Answer Important Questions

  • How many links are needed?
  • Is link size critical?
  • How sensitive is performance to noise and clutter?
  • How does performance vary with the number of groups and group size?
  • What are typical scenarios in which the GDA does very well?
  • What are problem scenarios in which the GDA underperforms?
  • Testing with synthetic data provides a rational basis for planning follow-up tests with real data

SLIDE 8

Performance Metrics

SLIDE 9

Input-Output Model for Analyzing Detection System Performance

Diagram: input x (actual group membership) → Detection System (link data & GDA) → output y (detector output), with P(y,x) = P(y|x)P(x)

Input SNR = P(x=1)/P(x=0)
Output SNR = P(x=1|y=1)/P(x=0|y=1)

Indicator variables (x,y) for membership of a generic entity in a generic group:

x = 1 if the entity actually belongs to the group, x = 0 otherwise
y = 1 if the detector assigns the entity to the group, y = 0 otherwise

Four probabilities characterize detection performance (the joint distribution):

P(x,y):   P(x=0,y=0)   P(x=0,y=1)
          P(x=1,y=0)   P(x=1,y=1)

SLIDE 10

Performance in a 3-D World

The four probabilities in P(x,y) sum to 1, so detection performance lives in a 3-D world. A convenient parameterization:

Pg = P(x=1), the prior probability of an entity belonging to the group (group prevalence)
Fn = P(y=0|x=1), the false-negative rate (miss rate)
Fp = P(y=1|x=0), the false-positive rate (false-alarm rate)

Classical performance metrics are functions of Fn, Fp, and Pg:

  • Error rate: Pe = P(x ≠ y) = Pg Fn + (1 − Pg) Fp
  • Detection probability: Tp = P(y=1|x=1) = 1 − Fn (recall, sensitivity)
  • Positive predictive value: PV+ = P(x=1|y=1) (precision)
  • Negative predictive value: PV− = P(x=0|y=0)
  • Bayes factor: G1 = posterior odds favoring x=1 divided by prior odds favoring x=1
  • Signal-to-noise ratios: SNRout = G1 · SNRin, where SNRin = P(x=1)/P(x=0)
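All of these metrics follow from the joint distribution built out of Pg, Fn, and Fp. A minimal sketch (the function name and returned keys are illustrative):

```python
def detection_metrics(Pg, Fn, Fp):
    """Classical detection metrics as functions of Pg, Fn, and Fp.

    Pg: group prevalence P(x=1); Fn: miss rate P(y=0|x=1);
    Fp: false-alarm rate P(y=1|x=0). Assumes 0 < Pg < 1 and Fp > 0.
    """
    # The four cells of the joint distribution P(x, y).
    p01 = (1 - Pg) * Fp        # false alarms
    p10 = Pg * Fn              # misses
    p11 = Pg * (1 - Fn)        # detections
    p00 = (1 - Pg) * (1 - Fp)  # correct rejections
    Pe = p10 + p01             # error rate: Pg*Fn + (1-Pg)*Fp
    Tp = 1 - Fn                # detection probability (recall)
    PVp = p11 / (p11 + p01)    # positive predictive value (precision)
    PVn = p00 / (p00 + p10)    # negative predictive value
    SNRin = Pg / (1 - Pg)      # prior odds favoring x=1
    SNRout = p11 / p01         # posterior odds given y=1
    G1 = SNRout / SNRin        # Bayes factor
    return dict(Pe=Pe, Tp=Tp, PVp=PVp, PVn=PVn,
                SNRin=SNRin, SNRout=SNRout, G1=G1)
```

For example, `detection_metrics(0.1, 1e-4, 1e-4)` gives a precision near 0.999, matching the large-group case in the comparison later in the deck.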

SLIDE 11

Proficiency Metric

Avoids a Limitation of Classical Metrics

  • No single classical metric is sensitive to both Fn and Fp as the input SNR goes to 0
  • The analyst must consider two metrics simultaneously to measure performance
    • ROC curves
    • Fn and Fp
    • Precision and recall
  • Juggling two metrics complicates algorithm optimization and the interpretation of detection performance
  • The usual focus on error rate can be very misleading
  • The Proficiency metric from information theory is never blind
    • Proficiency = I(x,y) / H(x)
    • I(x,y) = the amount of information about x that is provided by y (the mutual information)
    • H(x) = the amount of information about x that is required to achieve ideal error-free performance (the entropy of x)
    • 0 ≤ Proficiency ≤ 1
  • Deficiency is defined as 1 − Proficiency
SLIDE 12

Definitions of I(X,Y) and H(X)

  • Mutual information of the joint distribution P(x,y):

I(X,Y) = Σx Σy P(x,y) log { P(x,y) / [ P(x) P(y) ] }

  • Entropy of the marginal distribution P(x):

H(X) = − Σx P(x) log P(x)
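These definitions make Proficiency straightforward to compute for the binary detection model. A sketch, assuming the (Pg, Fn, Fp) parameterization from the earlier slides and an illustrative function name; log base 2 is used, but the ratio is base-independent:

```python
import math

def proficiency(Pg, Fn, Fp):
    """Proficiency = I(X;Y) / H(X) for a binary detector.

    Pg: group prevalence P(x=1); Fn: miss rate; Fp: false-alarm rate.
    """
    # Joint distribution P(x, y) from the parameterization.
    joint = {(0, 0): (1 - Pg) * (1 - Fp), (0, 1): (1 - Pg) * Fp,
             (1, 0): Pg * Fn, (1, 1): Pg * (1 - Fn)}
    px = {0: 1 - Pg, 1: Pg}
    py = {y: joint[(0, y)] + joint[(1, y)] for y in (0, 1)}
    # I(X;Y) = sum_xy P(x,y) log[ P(x,y) / (P(x) P(y)) ]
    I = sum(p * math.log2(p / (px[x] * py[y]))
            for (x, y), p in joint.items() if p > 0)
    # H(X) = -sum_x P(x) log P(x)
    H = -sum(p * math.log2(p) for p in px.values() if p > 0)
    return I / H
```

A perfect detector (Fn = Fp = 0) scores 1; a detector whose output is statistically independent of x scores 0.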

SLIDE 13

Proficiencies and ROC Curves

Figure: ROC curves (recall vs log10 false-positive rate), each at constant proficiency; P(x=1) = 0.002, with the precision = 0.5 contour marked

  • Each curve has constant proficiency
  • Red box
    • Recall > 0.75
    • Precision > 0.5
    • Contains the operating points such that Proficiency > 0.56

SLIDE 14

Comparing Deficiency with Error Rate

  • A detection system is looking for two groups: a large one and a small one
  • The group sizes are 1/10 and 1/10,000 of the population
  • The detection system has low error rates: Fn = Fp = 1/10,000
  • The Deficiency metric shows that the smaller group is harder to find, while the error rate is the same for both
    • Deficiencies: 0.0026 vs 0.136
    • Error rates: 0.0001 vs 0.0001 (insensitive to changes in group prevalence)
  • The Precision metric (and the output SNR) track the performance difference in this case
    • Precision: 0.999 vs 0.5
    • Output SNR: 999 vs 1
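The numbers above can be checked by plugging Fn = Fp = 1/10,000 into the earlier definitions. The sketch below (helper names are illustrative) computes deficiency through the identity I(X;Y) = H(X) − H(X|Y), so Deficiency = H(X|Y) / H(X):

```python
import math

def entropy(probs):
    """Shannon entropy (bits) of a probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def err_prec_def(Pg, Fn=1e-4, Fp=1e-4):
    """Error rate, precision, and deficiency for a group of prevalence Pg."""
    p01 = (1 - Pg) * Fp        # false alarms
    p10 = Pg * Fn              # misses
    p11 = Pg * (1 - Fn)        # detections
    p00 = (1 - Pg) * (1 - Fp)  # correct rejections
    py1 = p01 + p11            # P(y=1)
    err = p01 + p10
    precision = p11 / py1
    # Deficiency = H(X|Y) / H(X), i.e. 1 - I(X;Y)/H(X).
    H_x = entropy([Pg, 1 - Pg])
    H_x_given_y = (py1 * entropy([p11 / py1, p01 / py1]) +
                   (1 - py1) * entropy([p10 / (1 - py1), p00 / (1 - py1)]))
    return err, precision, H_x_given_y / H_x
```

Both groups show the same 10⁻⁴ error rate, but deficiencies of roughly 0.0026 (large group) versus 0.136 (small group), matching the slide.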
SLIDE 15

Example of Performance Evaluation Using Synthetic Data

SLIDE 16

Sensitivity to Number of Experiments

Using CMU’s K-Groups Algorithm (J. Schneider & A. Moore)

  • Synthetic universe
    • 10,000 proteins
    • 5 groups, each containing 20 proteins
  • Synthetic links
    • Link = proteins observed to interact or work together
    • Size = 2, 4, or 6 proteins
    • One link per experiment
  • Link quality
    • 10% clutter links
    • 10% noise in group links
  • Evaluation objective
    • Determine proficiency vs number of experiments
  • Google autonlab to get the k-groups software
  • Unsupervised detection

Figure: Group Proficiency (%) and Orphan Proficiency (%) vs number of experiments (2 to 512); P(x=1) = 0.002

SLIDE 17

More Noise or Clutter

Figure: Group Proficiency (%) and Orphan Proficiency (%) vs number of experiments (2 to 512), with PI = 0.2 (twice the clutter) and, separately, PR = 0.2 (twice the noise)

SLIDE 18

Less Noise or Clutter

Figure: Group Proficiency (%) and Orphan Proficiency (%) vs number of experiments (2 to 512), with PI = 0.05 (half the clutter) and, separately, PR = 0.05 (half the noise)

SLIDE 19

Summary

  • Group detection
    • Is distinct from clustering
    • Looks for small groups of interacting entities in large populations
  • The Proficiency metric
    • Is a rigorous information-theoretic performance measure
    • Is much safer than using just error rate or accuracy
    • May be used when tuning the parameters of machine-learning algorithms that use supervised learning
    • Simplifies the interpretation of performance evaluations based on synthetic or labeled real data

SLIDE 20

Appendix

SLIDE 21

Proficiency and Area Under ROC Curve

Figure: area under the ROC curve plotted against Proficiency, for P(x=1) = 0.1

SLIDE 22

Finding scientific teams doing research on aerosols

Using CMU’s K-Groups Algorithm (J. Schneider & A. Moore)

  • Synthetic links
    • Authors of 504 research papers published over the last 3 years
  • Synthetic universe
    • 10,000 scientists, engineers, and mathematicians
  • Link quality
    • 10% clutter (links that were not generated by a single aerosol research team)
    • 10% noise (percentage of authors that were actually not on the research team that wrote the paper)
  • Synthetic ground truth
    • Twenty research teams
  • Test objective
    • Determine proficiency vs number of groups to find (an input to K-Groups)
  • Underestimating the number of research teams is worse than overestimating it

Figure: Group Proficiency (%) and Orphan Proficiency (%) vs number of groups to find (10, 15, 20, 30, 40)

SLIDE 23

Proficiency and the ROC

P(x=1) is 0.1

  • Proficiency → ROC curve
    • Each ROC curve corresponds to a different value of proficiency
    • The mapping depends on the group prevalence P(x=1)
  • Red box
    • Recall > 75%
    • Precision > 50%

Figure: true-positive rate vs false-positive rate for the constant-proficiency ROC curves