 
Supervised Sequential Classification Under Budget Constraints
Kirill Trapeznikov and Venkatesh Saligrama
Boston University
May 1st, 2013

Overview
Introduce the sequential decision problem
Myopic approach: relies on current uncertainty to make a decision
Consider synthetic examples
Why it does not always work
Our approach: incorporate future uncertainty in the current decision
Examine a two-stage system
Reduce to supervised learning
Experiment
Extend to multiple stages
Generalization
Results

The Problem: Sequential Decision System
[Figure: a cascade of stages f_1, f_2, ..., f_K running from a cheap/fast sensor to a slow/costly sensor; each stage either classifies or rejects to the next stage.]
K-stage decision system:
Stage k can use sensor k for a cost c_k.
Measurements can be high dimensional.
The order of stages/sensors is fixed.
Decision at each stage: classify using the current measurements, or request (reject to) the next sensor.
Goal: find decisions F = {f_1, f_2, ..., f_K} that trade off error rate against average acquisition cost.

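The stage-by-stage procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `REJECT` sentinel, the `run_cascade` helper, and the two toy stage functions `f1` and `f2` are hypothetical names that do not come from the slides, and the last stage is assumed to always classify.

```python
from typing import Callable, List, Optional, Sequence, Tuple

REJECT = None  # hypothetical sentinel: a stage returns this to request the next sensor

# A stage decision f_k sees every measurement acquired so far and returns
# either a class label or REJECT.
Stage = Callable[[Sequence[float]], Optional[int]]


def run_cascade(stages: List[Stage], sensors: Sequence[float],
                costs: Sequence[float]) -> Tuple[int, float]:
    """Walk the fixed sensor order, paying c_k each time sensor k is acquired."""
    total_cost = 0.0
    acquired: List[float] = []
    for k, stage in enumerate(stages):
        acquired.append(sensors[k])  # acquire sensor k
        total_cost += costs[k]       # pay its cost c_k
        decision = stage(acquired)
        if decision is not REJECT:
            return decision, total_cost
    raise ValueError("the last stage must classify")


# Toy two-stage system (assumed for illustration only).
def f1(m: Sequence[float]) -> Optional[int]:
    # Classify only when sensor 1 is far from zero, otherwise reject.
    if abs(m[0]) <= 1.0:
        return REJECT
    return 1 if m[0] > 0 else -1


def f2(m: Sequence[float]) -> Optional[int]:
    # The final stage uses both sensors and always classifies.
    return 1 if m[0] + m[1] > 0 else -1
```

With costs `[0.0, 1.0]`, a sample that `f1` classifies pays nothing for sensor 2, while a sample that `f1` rejects pays the full cost of both sensors. Averaging the returned cost over a dataset gives the average acquisition cost that the goal trades against error rate.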
Example: Sensors of Increasing Resolution
Classify handwritten digit images.
[Figure: stages f_1 to f_4 applied to a digit image at increasing resolution, from low resolution (cheap) to high resolution (expensive).]
Do we need all sensors for every decision?

Difficult Decision
[Figure: the early stages reject in turn, and the digit is classified as 8 only at the final stage f_4.]
High acquisition cost: full resolution is needed to make a decision.

Easy Decision
[Figure: the digit is classified as 1 before the full-resolution stage is reached.]
Small acquisition cost: full resolution is unnecessary.

How to reduce sensor cost?
Sensor 1 is cheap, Sensor 2 is expensive.
[Figure: two panels, "Centralized" (Sensor 1 and Sensor 2) and "Non-Adaptive" (Sensor 1 only).]
Centralized strategy: use both sensors; high cost, low error.
Non-adaptive strategy: only use sensor 1; low cost, high error.

A better strategy: be adaptive
Only request the 2nd sensor on difficult examples.
[Figure: the Stage 1 decision uses Sensor 1 and either classifies or rejects; the Stage 2 decision uses Sensors 1 and 2.]

How does it compare?
The adaptive strategy reaches the same error rate as the centralized one for half the cost.
[Figure: error rate vs. average cost per sample for the centralized, non-adaptive, and adaptive strategies; 1st sensor cost = 0, 2nd sensor cost = 1.]

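The "half the cost" comparison follows from simple arithmetic on the sensor costs. The sketch below assumes the costs shown on the slide (1st sensor cost 0, 2nd sensor cost 1) and assumes that the adaptive strategy rejects half of the samples; `average_cost` is a hypothetical helper, not something defined in the slides.

```python
def average_cost(reject_fraction: float, c1: float = 0.0, c2: float = 1.0) -> float:
    """Expected acquisition cost per sample for a two-stage system.

    Every sample pays c1 for sensor 1; only rejected samples also pay c2.
    """
    if not 0.0 <= reject_fraction <= 1.0:
        raise ValueError("reject_fraction must lie in [0, 1]")
    return c1 + reject_fraction * c2


centralized = average_cost(1.0)   # always acquire sensor 2
non_adaptive = average_cost(0.0)  # never acquire sensor 2
adaptive = average_cost(0.5)      # assumed: sensor 2 requested on half of the samples
```

Under these assumptions the adaptive strategy sits halfway between the two extremes on the cost axis, which matches the slide's claim about cost. The error rates still have to be measured on data.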
Deciding to reject
How to decide whether to use the next sensor?
[Figure: measurement x enters f_1 (cheap/fast sensor), which classifies or rejects to f_2 (expensive/slow sensor), which classifies.]
Risk of a decision:
\[ \min\Big[\, \underbrace{\text{current uncertainty}}_{\text{classify}},\ \underbrace{\alpha \times \text{cost} + \text{future uncertainty}}_{\text{reject to next stage}} \,\Big] \]
(Uncertainty here means uncertainty in correct classification.)
Does the acquisition cost justify the reduction in uncertainty?

Deciding to reject
\[ \text{Risk} = \min\Big[\, \underbrace{\text{current uncertainty}}_{\text{classify}},\ \underbrace{\alpha \times \text{cost} + \text{future uncertainty}}_{\text{reject to next stage}} \,\Big] \]
Difficulty: the next sensor's output is not known, since it has not been acquired.
How to determine future uncertainty?
The decision must be based on the measurements collected so far!
[Figure: axes "1st sensor" and "2nd sensor".]

Myopic Approach
It is not clear how to determine the uncertainty of the future:
\[ \min\Big[\, \underbrace{\text{current uncertainty}}_{\text{classify}},\ \underbrace{\alpha \times \text{cost} + \text{future uncertainty}}_{\text{reject to next stage}} \,\Big] \]
Ignore the future, and use only the current uncertainty to make a decision:
\[ \min\Big[\, \underbrace{\text{current uncertainty}}_{\text{classify}},\ \underbrace{\alpha \times \text{cost}}_{\text{reject to next stage}} \,\Big] \]
This reduces to:
\[ \text{decision} = \begin{cases} \text{classify}, & \text{uncertainty} < \text{threshold} \\ \text{reject}, & \text{uncertainty} \ge \text{threshold} \end{cases} \]

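The myopic threshold rule can be written as a one-line comparison. In this minimal sketch, `myopic_decision` is a hypothetical function name, and the threshold is taken to be α × cost, which is what the two-term minimum implies when the future term is dropped.

```python
def myopic_decision(current_uncertainty: float, alpha: float, cost: float) -> str:
    """Pick the smaller term of min[current uncertainty, alpha * cost].

    Classifying incurs the current uncertainty; rejecting incurs alpha * cost.
    Ties go to "reject", matching the ">= threshold" case of the rule.
    """
    threshold = alpha * cost
    return "classify" if current_uncertainty < threshold else "reject"
```

For example, with `alpha=0.5` and `cost=1.0` a sample with uncertainty 0.1 is classified, while one with uncertainty 0.9 is rejected to the next stage. Sweeping `alpha` moves the threshold and changes how often the next sensor is requested.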
Myopic In Discriminative Setting
Train a classifier h(x) at a stage.
Classifier uncertainty ≈ distance to the decision boundary (margin):
Small distance → high uncertainty
Large distance → low uncertainty
[Figure: samples whose h(x) lies within the threshold of the decision boundary are rejected to the next stage.]
Related work: [Liu et al., 2008]

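A margin-based reject rule of this kind might look as follows. The sketch assumes a linear classifier h(x) = w·x + b and uses |h(x)| as a stand-in for the distance to the decision boundary (the true distance divides by the norm of w). The names `linear_score` and `margin_reject` are illustrative, not taken from the slides or from the cited work.

```python
from typing import Optional, Sequence


def linear_score(w: Sequence[float], x: Sequence[float], b: float) -> float:
    """Score h(x) = w . x + b of an assumed linear stage classifier."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def margin_reject(score: float, margin_threshold: float) -> Optional[int]:
    """Return +1 or -1 when |score| reaches the threshold, else None (reject).

    A small |score| means the sample is close to the decision boundary,
    which the myopic rule treats as high uncertainty.
    """
    if abs(score) < margin_threshold:
        return None
    return 1 if score > 0 else -1
```

Note that the inequality flips relative to the uncertainty threshold: a sample is rejected when its margin is small, because a small margin corresponds to high uncertainty.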
Example 1
[Figure: scatter plot of the data, Sensor 1 vs. Sensor 2, both axes from -6 to 6, shown in three builds.]
Data
1st stage classifier: uses only Sensor 1
2nd stage classifier: uses Sensors 1 and 2

Example 1: Myopic Reject Classifier
[Figure: two panels, "Stage 1 Decision" and "Stage 2 Decision", showing the reject and classify regions; axes from -6 to 6.]

Example 1: Myopic Reject Classifier
Requests sensor 2 where sensor 1 is ambiguous.
Current uncertainty seems to be a good criterion for rejecting.
[Figure: Sensor 1 vs. Sensor 2 scatter plot with the region rejected to the 2nd stage (request 2nd sensor) marked.]

Example 1: Error vs Budget
Sweep the threshold to generate different operating points.
[Figure: error vs. budget curves for the myopic and optimal strategies (budget from 0 to 1), next to the Sensor 1 vs. Sensor 2 scatter plot.]
Good performance: close to optimal, so the myopic rule seems to work here.

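Sweeping the threshold to trace an error vs. budget curve can be sketched as below. This is an illustration, not the slides' experiment: `error_budget_curve` is a hypothetical helper, the budget is taken to be the fraction of samples sent to stage 2 (sensor 1 cost 0, sensor 2 cost 1), and stage 1 is assumed to reject when the absolute value of its score falls below the threshold.

```python
from typing import List, Sequence, Tuple


def error_budget_curve(scores1: Sequence[float], preds2: Sequence[int],
                       labels: Sequence[int],
                       thresholds: Sequence[float]) -> List[Tuple[float, float]]:
    """Return one (budget, error) operating point per threshold."""
    n = len(labels)
    curve = []
    for t in thresholds:
        rejected = 0
        errors = 0
        for s, p2, y in zip(scores1, preds2, labels):
            if abs(s) < t:
                rejected += 1            # low margin: pay for sensor 2
                pred = p2                # use the stage 2 prediction
            else:
                pred = 1 if s > 0 else -1  # classify at stage 1
            errors += int(pred != y)
        curve.append((rejected / n, errors / n))
    return curve


# Tiny made-up dataset: two confident stage 1 scores, two ambiguous ones.
scores1 = [2.0, -1.5, 0.2, -0.1]
labels = [1, -1, -1, 1]
preds2 = [1, -1, -1, 1]  # stage 2 is assumed to be correct on every sample
curve = error_budget_curve(scores1, preds2, labels, [0.0, 0.5, 3.0])
```

Each threshold yields one operating point. Plotting the points from low to high threshold gives a curve of the kind compared against the optimal strategy.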
Example 2
[Figure: scatter plot of the data, Sensor 1 vs. Sensor 2, both axes from 0 to 2.5, shown in several builds.]
1st stage classifier: uses only Sensor 1
2nd stage classifier: uses Sensors 1 and 2
Region 1: separable only with sensor 2
Region 2: neither sensor helps

Example 2: Myopic Reject Decision
Sensor 1 uncertainty is equally distributed between regions 1 and 2.
The myopic rule therefore rejects uniformly in both regions.
[Figure: Sensor 1 vs. Sensor 2 scatter plot with the region rejected to the 2nd stage marked.]

Example 2: Myopic Reject Decision
Current uncertainty is equally distributed between regions 1 and 2.
Without future uncertainty, the myopic rule cannot tell where sensor 2 is useful.
[Figure: Sensor 1 vs. Sensor 2 scatter plot next to error vs. budget curves for the myopic and optimal strategies.]

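A small calculation shows why uniform rejection wastes budget in this setting. All numbers below are illustrative assumptions rather than values from the slides: the two regions have equal probability mass, stage 1 errs with probability 0.5 in both, sensor 2 resolves region 1 perfectly, and sensor 2 does not help at all in region 2.

```python
def budget(r1: float, r2: float, p1: float = 0.5) -> float:
    """Fraction of samples sent to sensor 2, given per-region reject rates."""
    return p1 * r1 + (1.0 - p1) * r2


def expected_error(r1: float, r2: float, p1: float = 0.5,
                   e_amb: float = 0.5) -> float:
    """Expected error under the toy assumptions stated above.

    r2 does not appear in the result: rejecting in region 2 spends budget
    but leaves the error at e_amb, because neither sensor helps there.
    """
    err_region1 = (1.0 - r1) * e_amb  # rejected region 1 samples become correct
    err_region2 = e_amb               # unchanged whether rejected or not
    return p1 * err_region1 + (1.0 - p1) * err_region2


# Both policies spend the same budget of 0.5.
myopic_error = expected_error(r1=0.5, r2=0.5)        # uniform rejection
region_aware_error = expected_error(r1=1.0, r2=0.0)  # reject only in region 1
```

At the same budget, the policy that knows where sensor 2 helps reaches a lower error than uniform rejection. The myopic rule cannot make that distinction, since current uncertainty is the same in both regions.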
Myopic
[Figure: two error vs. budget panels comparing the myopic and optimal strategies, titled "Myopic Works" (error range of Example 1) and "Myopic Fails" (error range of Example 2).]

Future Uncertainty is Important
Need to incorporate future uncertainty in the decision:
\[ \min\Big[\, \underbrace{\text{current uncertainty}}_{\text{classify}},\ \underbrace{\alpha \times \text{cost} + \text{future uncertainty}}_{\text{reject to next stage}} \,\Big] \]

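The full risk comparison can be sketched as below, assuming an estimate of the future uncertainty is already available as an input. How that estimate is obtained is not covered in this part of the deck, and `risk_decision` is a hypothetical name used only for illustration.

```python
from typing import Tuple


def risk_decision(current_uncertainty: float, future_uncertainty: float,
                  alpha: float, cost: float) -> Tuple[str, float]:
    """Evaluate min[current, alpha * cost + future] and report the chosen action.

    Ties go to "reject", so setting future_uncertainty to 0 reproduces
    the myopic threshold rule.
    """
    classify_risk = current_uncertainty
    reject_risk = alpha * cost + future_uncertainty
    if classify_risk < reject_risk:
        return "classify", classify_risk
    return "reject", reject_risk


# Same current uncertainty, different future uncertainty (made-up values).
helpful = risk_decision(0.5, 0.0, alpha=0.2, cost=1.0)  # next sensor resolves the sample
useless = risk_decision(0.5, 0.5, alpha=0.2, cost=1.0)  # next sensor adds nothing
```

The two calls differ only in the future term, yet they lead to different actions. That separation between regions where the next sensor helps and regions where it does not is exactly what the myopic rule lacks.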