SLIDE 1 SVMpAUC-tight: A new algorithm for optimizing partial AUC based on a tight convex upper bound
Harikrishna Narasimhan and Shivani Agarwal
Department of Computer Science and Automation, Indian Institute of Science, Bangalore
SLIDE 2
Receiver Operating Characteristic Curve
SLIDE 3 Binary Classification
Non-Spam vs. Spam
Area Under the ROC Curve (AUC)
Receiver Operating Characteristic Curve
SLIDE 4 Binary Classification
Non-Spam vs. Spam
Bipartite Ranking
Ranking of documents Area Under the ROC Curve (AUC)
Receiver Operating Characteristic Curve
SLIDE 5
Partial AUC?
Full AUC
SLIDE 6
Partial AUC?
Full AUC vs. Partial AUC
SLIDE 7 Ranking
http://www.google.com/
SLIDE 9 Medical Diagnosis
http://en.wikipedia.org/
SLIDE 10 Medical Diagnosis
KDD Cup 2008
http://en.wikipedia.org/
SLIDE 11 Bioinformatics
- Drug Discovery
- Gene Prioritization
- Protein Interaction Prediction
- ……
http://en.wikipedia.org/wiki http://commons.wikimedia.org/ http://www.google.com/imghp
SLIDE 13
Partial Area Under the ROC Curve is critical to many applications
SLIDE 14 Narasimhan, H. and Agarwal, S. “A structural SVM based approach for optimizing partial AUC”, ICML 2013.
SVMpAUC (ICML 2013)
SVMpAUC
SLIDE 15 Narasimhan, H. and Agarwal, S. “A structural SVM based approach for optimizing partial AUC”, ICML 2013.
SVMpAUC (ICML 2013)
SVMpAUC, SVM-AUC (Joachims, 2005)
SLIDE 16
Improved Version of SVMpAUC
- Tighter upper bound
- Improved accuracy
- Better runtime guarantee
SLIDE 17 Outline
- Overview of SVMpAUC
- Upper Bound Optimized by SVMpAUC
- Improved Formulation: SVMpAUC-tight
- Optimization Methods
- Experiments
SLIDE 18 Receiver Operating Characteristic Curve
[Figure: a set of positive instances (x1+, x2+, x3+, ……, xm+) and negative instances (x1, ……)]
GOAL? Learn a scoring function
SLIDE 19 Receiver Operating Characteristic Curve
[Figure: a set of positive instances (x1+, x2+, x3+, ……, xm+) and negative instances (x1, ……)]
GOAL? Learn a scoring function
Rank objects
[Figure: ranked list (…, x3+, x5+, x6+, x1, …), shown twice, with a threshold]
SLIDE 20 Receiver Operating Characteristic Curve
[Figure: a set of positive instances (x1+, x2+, x3+, ……, xm+) and negative instances (x1, ……)]
GOAL? Learn a scoring function
Rank objects
[Figure: ranked list (…, x3+, x5+, x6+, x1, …), shown twice, each with a threshold]
Threshold Assignment
Quality of scoring function?
SLIDE 21
Receiver Operating Characteristic Curve
[Figure: ROC curve for a ranked list with scores 20, 15, 14, 13, 11, 9, 8, 6, 5, 3, 2; axes: False Positives, True Positives]
SLIDE 25
Receiver Operating Characteristic Curve
[Figure: ROC curve for a ranked list with scores 20, 15, 14, 13, 11, 9, 8, 6, 5, 3, 2; axes: False Positives, True Positives]
Area Under the ROC Curve (AUC)
SLIDE 26
Receiver Operating Characteristic Curve
[Figure: ROC curve for a ranked list with scores 20, 15, 14, 13, 11, 9, 8, 6, 5, 3, 2; axes: False Positives, True Positives]
Area Under the ROC Curve (AUC)
Partial AUC
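A minimal sketch of how the ROC curve and the AUC can be computed from a ranked list. The scores are the ones shown on the slide; the +/- labels are not recoverable from the slide text, so the labels below are hypothetical (5 positives, 6 negatives), and the function names are illustrative only.

```python
# Scores as shown on the slide; the labels are an ASSUMPTION for illustration
# (1 = positive, 0 = negative), not taken from the original figure.
scores = [20, 15, 14, 13, 11, 9, 8, 6, 5, 3, 2]
labels = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 1]


def roc_points(scores, labels):
    """Return (false positives, true positives) counts as the threshold is lowered."""
    ranked = sorted(zip(scores, labels), reverse=True)
    fp = tp = 0
    points = [(0, 0)]
    for _, y in ranked:
        if y == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp, tp))
    return points


def auc(scores, labels):
    """Fraction of positive-negative pairs ranked correctly (ties count as 1/2)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    correct = sum(1.0 if p > q else 0.5 if p == q else 0.0
                  for p in pos for q in neg)
    return correct / (len(pos) * len(neg))


points = roc_points(scores, labels)
full_auc = auc(scores, labels)
print(points)
print(full_auc)
```

Each step of the ROC curve corresponds to moving the threshold past one more instance in the ranking; the AUC is the normalized area under those steps.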
SLIDE 31
Receiver Operating Characteristic Curve
[Figure: ROC curve for a ranked list with scores 20, 15, 14, 13, 11, 9, 8, 6, 5, 3, 2; axes: False Positives, True Positives]
β = 0.5
Top 3 negatives!
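As a rough illustration of the partial AUC with β = 0.5, the sketch below restricts the pairwise comparison to the top-ranked negatives. The labels are hypothetical (the slide text preserves only the scores), and the helper assumes that nα and nβ are whole numbers, which sidesteps the boundary terms a general definition would need.

```python
# Scores as shown on the slide; labels are an ASSUMPTION (1 = positive, 0 = negative).
scores = [20, 15, 14, 13, 11, 9, 8, 6, 5, 3, 2]
labels = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 1]


def partial_auc(scores, labels, alpha, beta):
    """Normalized partial AUC over false positive rates in [alpha, beta].

    Simplifying assumption: n * alpha and n * beta are integers, where n is
    the number of negatives.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = sorted((s for s, y in zip(scores, labels) if y == 0), reverse=True)
    j_alpha = round(len(neg) * alpha)
    j_beta = round(len(neg) * beta)
    selected = neg[j_alpha:j_beta]  # negatives ranked j_alpha+1 .. j_beta
    correct = sum(1 for p in pos for q in selected if p > q)
    return correct / (len(pos) * len(selected))


pauc_half = partial_auc(scores, labels, 0.0, 0.5)   # uses the top 3 negatives
pauc_full = partial_auc(scores, labels, 0.0, 1.0)   # reduces to the full AUC
print(pauc_half, pauc_full)
```

With six negatives and β = 0.5, only the three highest-scoring negatives enter the computation, which matches the "Top 3 negatives!" note on the slide.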
SLIDE 35
(1 – pAUC) for f
SLIDE 36 (1 – pAUC) for f
Convex Upper Bound
SLIDE 37 (1 – pAUC) for f
Convex Upper Bound
+ Regularizer
SLIDE 38 SVMpAUC (ICML 2013)
SVMpAUC: Structural SVM Approach (Narasimhan and Agarwal, 2013)
SLIDE 39 SVMpAUC (ICML 2013)
SVMpAUC: Structural SVM Approach (Narasimhan and Agarwal, 2013)
Ordering of training examples:
[Figure: ordering matrix with m rows and n columns; visible entries are 1]
SLIDE 40 SVMpAUC (ICML 2013)
SVMpAUC: Structural SVM Approach (Narasimhan and Agarwal, 2013)
Ordering of training examples:
[Figure: ordering matrix with m rows and n columns; visible entries are 1]
Scoring function f
SLIDE 43 (1 – pAUC) for f ≤ Convex Upper Bound
+ Regularizer
SLIDE 44 (1 – pAUC) for f ≤ Convex Upper Bound
+ Regularizer
How does this upper bound look?
SLIDE 45 (1 – pAUC) for f ≤ Convex Upper Bound
+ Regularizer
Can we obtain a tighter upper bound?
SLIDE 46 Outline
- Overview of SVMpAUC
- Upper Bound Optimized by SVMpAUC
- Improved Formulation: SVMpAUC-tight
- Optimization Methods
- Experiments
SLIDE 47
Upper bound we want?
1 - pAUC ∝ [equation image]
SLIDE 50
Upper bound we want?
1 - pAUC ∝ [equation image] ≤
pair-wise hinge loss!
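To make the relationship concrete, the sketch below compares the number of misranked pairs (the quantity 1 - pAUC is proportional to) with the pair-wise hinge loss over the same pairs. The labels are hypothetical, the raw slide scores stand in for the values of f, and the hinge form max(0, 1 - (f(x+) - f(x-))) is the standard pair-wise hinge, assumed here rather than copied from the slide.

```python
# Scores as shown on earlier slides; labels are an ASSUMPTION (1 = positive, 0 = negative).
scores = [20, 15, 14, 13, 11, 9, 8, 6, 5, 3, 2]
labels = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 1]

pos = [s for s, y in zip(scores, labels) if y == 1]
neg = sorted((s for s, y in zip(scores, labels) if y == 0), reverse=True)
top_neg = neg[:3]  # beta = 0.5 with six negatives


def zero_one(p, q):
    """1 if the positive is scored below the negative, else 0."""
    return 1 if p < q else 0


def hinge(p, q):
    """Pair-wise hinge loss: penalizes margins smaller than 1."""
    return max(0, 1 - (p - q))


misranked_pairs = sum(zero_one(p, q) for p in pos for q in top_neg)
hinge_loss = sum(hinge(p, q) for p in pos for q in top_neg)

# The hinge dominates the 0-1 loss on every pair, hence also in the sum.
pairwise_ok = all(hinge(p, q) >= zero_one(p, q) for p in pos for q in top_neg)
print(misranked_pairs, hinge_loss, pairwise_ok)
```

Because the hinge is convex in the score difference and never smaller than the 0-1 loss, summing it over the relevant pairs gives a convex upper bound of the kind the slide asks for.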
SLIDE 51
Upper bound optimized by SVMpAUC?
SLIDE 52
Upper bound optimized by SVMpAUC?
= pair-wise hinge loss + extra term
SLIDE 53
Upper bound optimized by SVMpAUC?
= pair-wise hinge loss + extra term
Subset of pairs of positive-negative examples
SLIDE 54
Upper bound optimized by SVMpAUC?
= pair-wise hinge loss + extra term
Subset of pairs of positive-negative examples
?
SLIDE 55
Upper bound optimized by SVMpAUC?
SLIDE 56
Upper bound optimized by SVMpAUC?
≤ pair-wise hinge loss + extra term
SLIDE 57
Upper bound optimized by SVMpAUC?
≤ pair-wise hinge loss + extra term
- approx. pair-wise hinge loss + extra term
≤
SLIDE 58
Upper bound optimized by SVMpAUC?
≤ pair-wise hinge loss + extra term
- approx. pair-wise hinge loss + extra term
≤
? ?
SLIDE 59 Outline
- Overview of SVMpAUC
- Upper Bound Optimized by SVMpAUC
- Improved Formulation: SVMpAUC-tight
- Optimization Methods
- Experiments
SLIDE 60
Rewriting the Partial AUC Loss
[Figure: ranked list with scores 20, 15, 14, 13, 11, 9, 8, 6, 5, 3, 2; axes: False Positives, True Positives]
3 + 2 + 2 = 7
α = 0, β = 0.5
SLIDE 61
Rewriting the Partial AUC Loss
[Figure: ranked list with scores 20, 15, 14, 13, 11, 9, 8, 6, 5, 3, 2; axes: False Positives, True Positives]
3 + 2 + 2 = 7
2 + 2 + 1 = 5
α = 0, β = 0.5
SLIDE 62
Rewriting the Partial AUC Loss
[Figure: ranked list with scores 20, 15, 14, 13, 11, 9, 8, 6, 5, 3, 2; axes: False Positives, True Positives]
3 + 2 + 2 = 7
2 + 2 + 1 = 5
. . .
1 + 1 + 1 = 3
α = 0, β = 0.5
SLIDE 63
Rewriting the Partial AUC Loss
[Figure: ranked list with scores 20, 15, 14, 13, 11, 9, 8, 6, 5, 3, 2; axes: False Positives, True Positives]
3 + 2 + 2 = 7
2 + 2 + 1 = 5
. . .
1 + 1 + 1 = 3
1 - AUC restricted to top β fraction of negatives
Maximum!
α = 0, β = 0.5
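The rewriting can be checked by brute force on a small example: count the misranked pairs for every subset of three negatives and confirm that the maximum is attained by the three top-ranked negatives. The labels below are hypothetical, chosen so that the counts reproduce the sums shown on the slide (7, 5, ..., 3).

```python
from itertools import combinations

# Scores as shown on the slide; labels are an ASSUMPTION (1 = positive, 0 = negative).
scores = [20, 15, 14, 13, 11, 9, 8, 6, 5, 3, 2]
labels = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 1]

pos = [s for s, y in zip(scores, labels) if y == 1]
neg = sorted((s for s, y in zip(scores, labels) if y == 0), reverse=True)
k = 3  # beta = 0.5 with six negatives


def positives_below(q):
    """Number of positives scored below the negative with score q."""
    return sum(1 for p in pos if p < q)


# Loss of every size-k subset of negatives (number of misranked pairs).
subset_loss = {subset: sum(positives_below(q) for q in subset)
               for subset in combinations(neg, k)}

top_k = tuple(neg[:k])
max_loss = max(subset_loss.values())
min_loss = min(subset_loss.values())
print(subset_loss[top_k], max_loss, min_loss)
```

Since a higher-scoring negative can only have more positives below it, the top-ranked negatives always achieve the maximum, which is what lets the partial AUC loss be written as a maximum over subsets.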
SLIDE 64
Top jβ negatives
SLIDE 65
SVM-AUC
Top jβ negatives
SLIDE 66
Negatives jα to jβ
SLIDE 67
Truncated SVMpAUC
Negatives jα to jβ
SLIDE 68
SVMpAUC-tight: Improved Formulation
SVMpAUC objective restricted to S
SLIDE 69
SVMpAUC-tight: Improved Formulation
Top jβ negatives
SLIDE 70 SVMpAUC-tight: Improved Formulation
Same pairs of positive-negative examples
= pair-wise hinge loss + extra term
Top jβ negatives
SLIDE 71
SVMpAUC-tight: Improved Formulation
Negatives jα to jβ
SLIDE 72 SVMpAUC-tight: Improved Formulation
≤ pair-wise hinge loss + extra term
- approx. pair-wise hinge loss + extra term
≤
Negatives jα to jβ
SLIDE 73 Outline
- Overview of SVMpAUC
- Upper Bound Optimized by SVMpAUC
- Improved Formulation: SVMpAUC-tight
- Optimization Methods
- Experiments
SLIDE 74 SVMpAUC-tight: Optimization Problem
+ Regularizer
exponential in size
SLIDE 75 SVMpAUC-tight: Optimization Problem
+ Regularizer
exponential in size
Quadratic program with an exponential number of constraints
SLIDE 76
SVMpAUC-tight: Cutting-Plane Solver
Repeat:
1. Solve OP for a subset of constraints.
2. Add the most violated constraint.
SLIDE 78
SVMpAUC-tight: Cutting-Plane Solver
Repeat:
1. Solve OP for a subset of constraints.
2. Add the most violated constraint.
Better Runtime Guarantees:
- Maximum number of iterations
- Time taken per iteration
SLIDE 79 SVMpAUC-tight: Projected Subgradient Solver
Primal formulation:
SLIDE 80 SVMpAUC-tight: Projected Subgradient Solver
Repeat:
1. Compute subgradient and perform update.
2. Project onto the constraint set.
Primal formulation:
SLIDE 81 SVMpAUC-tight: Projected Subgradient Solver
Repeat:
1. Compute subgradient and perform update.
2. Project onto the constraint set.
Primal formulation:
Sparsity-inducing regularizations
LASSO, Group LASSO, Elastic-Net
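The two-step loop can be sketched on a toy problem. The primal formulation itself is not recoverable from the slide, so everything specific here is an assumption: a linear scoring function, the interval [0, β], a surrogate equal to the pair-wise hinge loss over the highest-scoring negatives, an L2-ball constraint set, and a 1/sqrt(t) step size. The data and all function names are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def surrogate_and_subgradient(w, pos, neg, k):
    """Pair-wise hinge loss over the k highest-scoring negatives, plus one subgradient."""
    neg_scores = [dot(w, x) for x in neg]
    top = sorted(range(len(neg)), key=lambda j: neg_scores[j], reverse=True)[:k]
    loss = 0.0
    grad = [0.0] * len(w)
    for xp in pos:
        sp = dot(w, xp)
        for j in top:
            margin = sp - neg_scores[j]
            if margin < 1.0:  # hinge is active for this pair
                loss += 1.0 - margin
                for d in range(len(w)):
                    grad[d] -= xp[d] - neg[j][d]
    scale = 1.0 / (len(pos) * k)
    return loss * scale, [g * scale for g in grad]


def project_l2_ball(w, radius):
    """Euclidean projection onto {w : ||w||_2 <= radius} (assumed constraint set)."""
    norm = math.sqrt(dot(w, w))
    if norm <= radius:
        return list(w)
    return [radius * wi / norm for wi in w]


def projected_subgradient(pos, neg, beta, radius=1.0, step=0.5, iters=200):
    k = max(1, int(len(neg) * beta))
    w = [0.0] * len(pos[0])
    best_w = list(w)
    best_obj, _ = surrogate_and_subgradient(w, pos, neg, k)
    for t in range(1, iters + 1):
        _, g = surrogate_and_subgradient(w, pos, neg, k)   # step 1: subgradient
        eta = step / math.sqrt(t)
        w = [wi - eta * gi for wi, gi in zip(w, g)]        # step 1: update
        w = project_l2_ball(w, radius)                     # step 2: projection
        obj, _ = surrogate_and_subgradient(w, pos, neg, k)
        if obj < best_obj:                                 # keep the best iterate
            best_obj, best_w = obj, list(w)
    return best_w, best_obj


# Hypothetical toy data: three positives and four negatives in two dimensions.
toy_pos = [(2.0, 1.0), (3.0, 2.0), (2.5, 0.5)]
toy_neg = [(0.0, 0.0), (1.0, 0.5), (0.5, 1.0), (-1.0, 0.0)]
w_hat, obj_hat = projected_subgradient(toy_pos, toy_neg, beta=0.5, radius=2.0)
print(w_hat, obj_hat)
```

Swapping `project_l2_ball` for a projection onto a different set (for example an L1 ball) is the kind of change that would give sparsity-inducing variants; that projection is not shown here.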
SLIDE 82 Outline
- Overview of SVMpAUC
- Upper Bound Optimized by SVMpAUC
- Improved Formulation: SVMpAUC-tight
- Optimization Methods
- Experiments
SLIDE 83 SVMpAUC-tight Vs SVMpAUC
Partial AUC in [0, 0.1]
                Leukemia   PPI     Cheminformatics   KDD Cup 2001   Ovarian Cancer
SVMpAUC-tight   30.44      52.95   65.30             69.91          91.84
SVMpAUC         24.64      51.96   65.28             70.12          91.84
SVMAUC          28.83      39.72   62.78             62.23          92.17
Partial AUC in [0.2s, 0.3s]
                KDD Cup 2008
SVMpAUC-tight   53.43
SVMpAUC         51.89
SVMAUC          50.66
SLIDE 84 Run-time Analysis
Interval [0, β]
Repeat:
1. Solve OP for a subset of constraints.
2. Add the most violated constraint.
SLIDE 87
Cutting-Plane vs. Projected Subgradient
- Cutting-plane method is faster on high-dimensional data with L2 regularization
- Projected subgradient method is faster with L1 regularization
SLIDE 88
Sparse and Group Sparse Extensions
Sparse models at the cost of a decrease in accuracy
SLIDE 89 Conclusions
- A new support vector algorithm for optimizing partial AUC based on a tight convex upper bound
- Cutting-plane solver with better run-time guarantees
- Experiments on several bioinformatics tasks demonstrate improved accuracy
- Projected subgradient solver allows sparse and group sparse extensions
SLIDE 90
Questions?