SLIDE 12 Introduction Learning MT-DTs MT-Adaboost Experiments Conclusion
MT-Adaboost
We adapt Adaboost.M1, which was introduced in [Schapire and Singer, 1999]; any other boosting algorithm could be used. In the multi-task case, the distribution is defined over (example, task) pairs, i.e., $D$ is defined over $X \times \{1, \dots, N\}$.

The output of the algorithm is a function that takes an example as input and gives one label per task as output:
$$H_j(x) = \arg\max_{y \in Y_j} \sum_{t \,:\, h_t^j(x) = y} \ln(1/\beta_t), \qquad 1 \le j \le N$$
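The per-task weighted vote above can be sketched as follows. This is a minimal illustration, not the authors' implementation; `predict_task`, the `hypotheses` list, and the `betas` list are assumed names, and each hypothesis is assumed callable as `h(x, j)` returning a label for task `j`.

```python
import math
from collections import defaultdict

def predict_task(x, j, hypotheses, betas, labels_j):
    """H_j(x) = argmax over y in Y_j of the sum of ln(1/beta_t)
    over the rounds t whose hypothesis predicts y for task j."""
    votes = defaultdict(float)
    for h, beta in zip(hypotheses, betas):
        # each round adds weight ln(1/beta_t) to the label it predicts
        votes[h(x, j)] += math.log(1.0 / beta)
    return max(labels_j, key=lambda y: votes[y])
```

Smaller $\beta_t$ (lower weighted error $\epsilon_t$) gives a round a larger vote, exactly as in Adaboost.M1.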
MT-Adaboost
Require: $S = \bigcup_{j=1}^{N} \{ e_i = \langle x_i, y_i, j \rangle \mid x_i \in X,\ y_i \in Y_j \}$
1: $D_1 = \mathrm{init}(S)$ {initialize distribution}
2: for $t = 1$ to $T$ do
3: $\quad h_t = WL(S, D_t)$ {train the weak learner and get a hypothesis MT-DT}
4: $\quad$ Calculate the error of $h_t$: $\epsilon_t = \sum_{j=1}^{N} \sum_{i \,:\, h_t^j(x_i) \ne y_i} D_t(e_i)$
5: $\quad$ if $\epsilon_t > 1/2$ then
6: $\quad\quad$ Set $T = t - 1$ and abort loop
7: $\quad$ end if
8: $\quad \beta_t = \epsilon_t / (1 - \epsilon_t)$
$\quad$ {Update distribution, for each pair $e_i = \langle x_i, y_i, j \rangle$:}
9: $\quad$ if $h_t^j(x_i) = y_i$ then
10: $\quad\quad D_{t+1}(e_i) = D_t(e_i) \times \beta_t / Z_t$
11: $\quad$ else
12: $\quad\quad D_{t+1}(e_i) = D_t(e_i) / Z_t$
13: $\quad$ end if
14: end for
{where $Z_t$ is a normalization constant chosen so that $D_{t+1}$ is a distribution}
15: return classifier $H$ defined by:
$$H_j(x) = \arg\max_{y \in Y_j} \sum_{t \,:\, h_t^j(x) = y} \ln(1/\beta_t), \qquad 1 \le j \le N$$
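The training loop above can be sketched as follows. This is a hedged sketch under assumptions, not the authors' code: `mt_adaboost` and `weak_learner` are assumed names, `S` is assumed to be a list of `(x, y, j)` triples, and the weak learner is assumed to return a hypothesis callable as `h(x, j)`.

```python
def mt_adaboost(S, weak_learner, T):
    """Sketch of the MT-Adaboost loop. S: list of (example, label, task)
    triples; weak_learner(S, D) returns a hypothesis h(x, j) -> label."""
    n = len(S)
    D = [1.0 / n] * n                     # step 1: uniform over (example, task) pairs
    hypotheses, betas = [], []
    for t in range(T):
        h = weak_learner(S, D)            # step 3: train the weak MT-DT learner
        # step 4: weighted error summed over all tasks and examples
        eps = sum(D[i] for i, (x, y, j) in enumerate(S) if h(x, j) != y)
        # steps 5-7: abort if the weak learner is no better than chance;
        # eps == 0 is also stopped here to keep the sketch simple
        if eps > 0.5 or eps == 0.0:
            break
        beta = eps / (1.0 - eps)          # step 8
        # steps 9-13: downweight correctly classified pairs, then normalize by Z_t
        D = [D[i] * (beta if h(x, j) == y else 1.0)
             for i, (x, y, j) in enumerate(S)]
        Z = sum(D)
        D = [d / Z for d in D]
        hypotheses.append(h)
        betas.append(beta)
    return hypotheses, betas
```

Correctly classified pairs are multiplied by $\beta_t < 1$, so later rounds concentrate on the (example, task) pairs that earlier MT-DTs got wrong, across all tasks at once.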