SLIDE 12 KHARDON AND WACHMAN
[Plots: accuracy (%) versus parameter value for the Voted, Last, and Longest variants. Panels: parameter search on α, λ, and τ for the artificial data (f = 50, N = 0.05, M = 0.05), promoters, and USPS; parameter search on λ and τ for MNIST. α ranges over 5, 10, 20, 40, 60, 80, ∞; λ and τ range over powers of two up to 2^2.]
Figure 4: Parameter Search on Artificial and Real-world Data

parameters among τ, α, λ includes all values for the parameter except the non-active value. For example, any accuracy obtained in the τ column is necessarily obtained with a value τ ≠ 0. The column labeled “Nothing” shows results when all parameters are inactive.

Several things can be observed in the tables. Consider first the margin parameters. We can see that τ is useful even in data sets with noise; this is evident both in the noisy artificial data sets and in the real-world data sets, all of which are inseparable in the native feature space. We can also see that α and λ do improve performance in a number of cases. However, they are less effective in general than τ, and do not provide additional improvement when combined with τ.

Looking next at the on-line to batch conversions, we see that the differences between the basic algorithm, the longest survivor, and the voted perceptron are noticeable without margin-based variants. For the artificial data sets this only holds for one group of data sets (f = 50), the one with