Supervised Sequential Classification Under Budget Constraints
Kirill Trapeznikov and Venkatesh Saligrama Boston University May 1st, 2013
1 / 50
Overview
- Introduce sequential decision problem
- Myopic approach: relies on current uncertainty to …
2 / 50
[Figure: sequential system — each stage either classifies or rejects to the next stage; early stages use a cheap/fast sensor, later stages a slow/costly sensor]
3 / 50
[Figure: example measurements — low resolution (cheap) vs. high resolution (expensive)]
4 / 50
5 / 50
6 / 50
[Figure: sensor acquisition strategies — Non-Adaptive vs. Centralized orderings of Sensor 1 and Sensor 2]
7 / 50
[Figure: adaptive strategy — Stage 1 decision on Sensor 1: classify or reject; rejected samples acquire Sensor 2 for the Stage 2 decision]
8 / 50
[Plot: Error Rate (.1, .2, .5, 1) vs. Average Cost / Sample, with 1st sensor cost = 0 and 2nd sensor cost = 1, comparing Centralized, Non-adaptive, and Adaptive strategies]
9 / 50
[Figure: two-stage system — the cheap/fast sensor classifies or rejects to the expensive/slow sensor]
10 / 50
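The two-stage reject architecture above can be sketched as a small decision rule. This is a minimal illustration, not the authors' method: the linear scores, the weights `w1`, `w2`, and the threshold `tau` are hypothetical stand-ins for learned stage classifiers.

```python
import numpy as np

def two_stage_predict(x_cheap, x_full, w1, w2, tau):
    """Stage 1 sees only the cheap/fast sensor. If its margin is
    confident (|score| >= tau) it classifies immediately; otherwise it
    rejects, the expensive/slow sensor is acquired, and stage 2 decides.
    Returns (predicted label, number of sensors paid for)."""
    s1 = float(np.dot(x_cheap, w1))
    if abs(s1) >= tau:
        return (1 if s1 > 0 else -1), 1   # classify at stage 1
    s2 = float(np.dot(x_full, w2))        # reject: acquire the 2nd sensor
    return (1 if s2 > 0 else -1), 2

# confident cheap measurement -> classified with one sensor
print(two_stage_predict([2.0], [2.0, 0.3], [1.0], [1.0, 1.0], 0.5))   # (1, 1)
# ambiguous cheap measurement -> both sensors used
print(two_stage_predict([0.1], [0.1, -1.0], [1.0], [1.0, 1.0], 0.5))  # (-1, 2)
```

The budget trades off through `tau`: raising it rejects more samples to the costly sensor, lowering it saves cost at the risk of more stage-1 errors.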
[Figure: measurements from the 1st and 2nd sensors]
11 / 50
12 / 50
[Figure: threshold rule — samples below the threshold are rejected to the next stage]
13 / 50
[Plot: 2-D example data, axes from −6 to 6]
14 / 50
[Plots: 2-D example data across stages, axes from −6 to 6]
15 / 50
[Figure: Sensor 1 and Sensor 2 views (axes −6 to 6); shaded region labeled "reject to 2nd stage (request 2nd sensor)"]
16 / 50
[Plot: Error (0.01–0.09) vs. Budget (0.2–1) for the myopic strategy; Sensor 1 and Sensor 2 views (axes −6 to 6)]
17 / 50
[Plot: 2-D data, axes 0.5–2.5]
18 / 50
[Plot: 2-D data, axes 0.5–2.5, with region marked "reject to 2nd stage"]
19 / 50
[Plot: error (0.12–0.26) vs. budget (0.2–1) for the myopic strategy]
20 / 50
[Plots: error vs. budget for the myopic strategy on two datasets]
21 / 50
22 / 50
Previous Work: [Ji and Carin, 2007; Kapoor and Horvitz, 2009; Zubek and Dietterich, 2002]
Previous Work: [Kanani and Melville, 2008; Koller and Gao, 2011]
23 / 50
24 / 50
[Figure: two-stage system — the cheap/fast sensor classifies or rejects to the expensive/slow sensor]
25 / 50
26 / 50
27 / 50
[Figure: stage classifiers f1(x) and f2(x) — f1 on the cheap/fast sensor classifies x or rejects it to f2 on the expensive/slow sensor]
28 / 50
min_g E_{x,y}[R(g, x, y)] ≈ min_{g∈G} (1/N) ∑_{i=1}^{N} R(g, x_i, y_i)
29 / 50
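The approximation above replaces the expected risk with a sample average over N training points and minimizes over the class G. A minimal sketch, where the 0/1 loss of a scalar threshold rule and the finite candidate set `G` are illustrative stand-ins for the actual stage risk and classifier class:

```python
import numpy as np

def empirical_risk(g, X, y):
    """(1/N) * sum_i R(g, x_i, y_i), with R taken to be the 0/1 loss
    of the threshold rule sign(x - g) (an illustrative choice)."""
    preds = np.sign(X - g)
    return float(np.mean(preds != y))

X = np.array([-2.0, -1.0, 1.0, 2.0])
y = np.array([-1.0, -1.0, 1.0, 1.0])
G = [-1.5, 0.0, 1.5]                         # finite candidate set G
g_star = min(G, key=lambda g: empirical_risk(g, X, y))
print(g_star, empirical_risk(g_star, X, y))  # 0.0 0.0
```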
[Plot: 2-D data, axes 0.5–2.5]
30 / 50
[Plot: 2-D data with region marked "reject to 2nd stage"]
31 / 50
[Plots: two panels of 2-D data, axes 0.5–2.5]
32 / 50
[Plot: error (0.12–0.26) vs. budget (0.2–1) for the myopic strategy]
33 / 50
34 / 50
35 / 50
min_{g∈G} (1/N) ∑_{i=1}^{N} (weight of x_i) × R(g, x_i, y_i)
36 / 50
37 / 50
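The reweighted objective scales each training point's loss by its weight, so samples with large weight dominate the fit. A sketch under the same illustrative threshold-rule setup as before; the data and weights are made up to show the effect:

```python
import numpy as np

def weighted_empirical_risk(g, X, y, w):
    """(1/N) * sum_i w_i * R(g, x_i, y_i): mistakes on heavily
    weighted samples cost proportionally more."""
    errs = (np.sign(X - g) != y).astype(float)
    return float(np.mean(w * errs))

X = np.array([-2.0, 1.0, -0.5, 2.0])   # not separable by any single threshold
y = np.array([-1.0, -1.0, 1.0, 1.0])
G = [-1.5, 0.0, 1.5]
w_hi = np.array([1.0, 1.0, 4.0, 1.0])  # the positive point at -0.5 is upweighted
g_star = min(G, key=lambda g: weighted_empirical_risk(g, X, y, w_hi))
print(g_star)  # -1.5: the threshold moves to keep the high-weight point correct
```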
[Figure: low-resolution (cheap) vs. high-resolution (expensive) images]
38 / 50
[Figure: digits 0, 1, and 8 as seen by Sensors 1–4]
39 / 50
[Plot: error (0.08–0.28) vs. budget (1–4)]
40 / 50
[Figure: K-stage system — classifiers f_1, f_2, …, f_K; each stage classifies or rejects to the next, progressing from a cheap/fast sensor to a slow/costly sensor]
41 / 50
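The K-stage cascade above generalizes the two-stage rule: each intermediate stage may classify or reject to the next, and the last stage must classify. A hedged sketch in which precomputed scores stand in for the stage classifiers f_1, …, f_K, and the thresholds `taus` are hypothetical:

```python
def k_stage_predict(stage_scores, taus):
    """Cascade f_1, ..., f_K: stage k classifies if its margin clears
    tau_k, else rejects to stage k+1 (paying for the next sensor).
    The last stage always classifies. Returns (label, sensors used)."""
    K = len(stage_scores)
    for k, s in enumerate(stage_scores):
        if k == K - 1 or abs(s) >= taus[k]:
            return (1 if s > 0 else -1), k + 1

print(k_stage_predict([2.0, 0.0, 0.0], [0.5, 1.0]))   # (1, 1): stage 1 is confident
print(k_stage_predict([0.2, -1.5, 0.3], [0.5, 1.0]))  # (-1, 2): rejected once
print(k_stage_predict([0.2, 0.4, 0.3], [0.5, 1.0]))   # (1, 3): last stage decides
```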
42 / 50
43 / 50
[Equation fragment: empirical minimization over f ∈ F of an average over N training samples]
44 / 50
Dataset   | Stages | Sensors                            | Target Error | Myopic | Ours
----------|--------|------------------------------------|--------------|--------|-----
synthetic | 2      |                                    | .147         | 52%    | 28%
pima      | 3      | weight, age, blood tests           | .245         | 41%    | 15%
threat    | 3      | ir, pmmw, ammw                     | .16          | 89%    | 71%
covertype | 3      | soils, wild. areas, elev, aspect   | .285         | 79%    | 40%
letter    | 3      | pixel counts, moments, edge feat's | .25          | 81%    | 51%
mnist     | 4      |                                    | .085         | 90%    | 52%
landsat   | 4      | hyperspectral bands                | .17          | 56%    | 31%
mam       | 2      | CAD feat's, expert rating          | .173         | 65%    | 25%
45 / 50
46 / 50
[Generalization bound fragment: a complexity term of the form …h + 1) + log(4/δ), with h = max_k {VCD(F_k)}]
47 / 50
48 / 50
49 / 50
50 / 50
[Ji and Carin, 2007] Ji, S. and Carin, L. (2007). Cost-sensitive feature acquisition and classification. In Pattern Recognition.
[Kanani and Melville, 2008] Kanani, P. and Melville, P. (2008). Prediction-time active feature-value acquisition for cost-effective customer targeting. In NIPS.
[Kapoor and Horvitz, 2009] Kapoor, A. and Horvitz, E. (2009). Breaking boundaries: Active information acquisition across learning and diagnosis. In NIPS.
[Koller and Gao, 2011] Koller, D. and Gao, T. (2011). Active value. In NIPS.
[Liu et al., 2008] Liu, L.-P., Yu, Y., Jiang, Y., and Zhou, Z.-H. (2008). TEFE: A time-efficient approach to feature extraction. In ICDM.
[Zubek and Dietterich, 2002] Zubek, V. B. and Dietterich, T. G. (2002). Pruning improves heuristic search for cost-sensitive learning. In ICML.
50 / 50