SLIDE 3 Univariate Trees Rule Extraction Multivariate Trees
Greedy Splitting
If node m is pure, generate a leaf and stop. Otherwise split and continue recursively.
Impurity after the split: suppose $N_{mj}$ of the $N_m$ instances take branch $j$ and that $N^i_{mj}$ of these belong to class $C_i$:
$$\hat{P}(C_i \mid \mathbf{x}, m, j) \equiv p^i_{mj} = \frac{N^i_{mj}}{N_{mj}}$$
$$I'_m = -\sum_{j=1}^{n} \frac{N_{mj}}{N_m} \sum_{i=1}^{K} p^i_{mj} \log_2 p^i_{mj}$$
Select the variable and split that minimizes impurity
For numeric variables, include choices of split positions
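The impurity after a split can be sketched in Python as a weighted entropy over the branches. This is a minimal illustration, not code from the slides: the function name `split_entropy` and the list-of-label-lists input format are assumptions made here.

```python
import math
from collections import Counter

def split_entropy(branches):
    """Weighted entropy (in bits) after a split.

    `branches` is a list with one entry per branch j; each entry is the
    list of class labels of the instances that take that branch.
    """
    n_m = sum(len(labels) for labels in branches)  # N_m
    total = 0.0
    for labels in branches:
        n_mj = len(labels)  # N_mj
        if n_mj == 0:
            continue  # an empty branch contributes nothing
        for count in Counter(labels).values():  # count plays the role of N^i_mj
            p = count / n_mj  # p^i_mj
            total -= (n_mj / n_m) * p * math.log2(p)
    return total

pure_split = split_entropy([["a", "a"], ["b", "b"]])     # both branches pure
useless_split = split_entropy([["a", "b"], ["a", "b"]])  # both branches 50/50
```

A split that separates the classes perfectly gives impurity 0, while one that leaves every branch evenly mixed between two classes gives 1 bit, so the first would be preferred.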
GenerateTree(X)
if I(X) < θI then
    Create leaf labelled by majority class in X
else
    i ← SplitAttribute(X)
    for all branches of xi do
        Find Xi falling in branch
        GenerateTree(Xi)
    end for
end if
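The recursion above could look roughly like this in Python, for discrete attributes only. Everything named here is an assumption for illustration: `best_discrete_attribute` is a hypothetical stand-in for SplitAttribute, trees are represented as nested tuples and dicts, and the guard against splits that produce a single branch is an addition that the pseudocode does not contain.

```python
import math
from collections import Counter

def entropy(labels):
    """Node impurity I(X): entropy of the class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def majority(labels):
    """Most frequent class label in the node."""
    return Counter(labels).most_common(1)[0][0]

def best_discrete_attribute(rows, labels):
    """Hypothetical stand-in for SplitAttribute (discrete attributes only)."""
    def impurity_after_split(i):
        groups = {}
        for row, label in zip(rows, labels):
            groups.setdefault(row[i], []).append(label)
        return sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return min(range(len(rows[0])), key=impurity_after_split)

def generate_tree(rows, labels, split_attribute=best_discrete_attribute, theta=0.01):
    """Return a leaf label, or (attribute index, {value: subtree})."""
    if entropy(labels) < theta:
        return majority(labels)  # pure enough: create a leaf
    i = split_attribute(rows, labels)
    branches = {}
    for row, label in zip(rows, labels):
        sub_rows, sub_labels = branches.setdefault(row[i], ([], []))
        sub_rows.append(row)
        sub_labels.append(label)
    if len(branches) < 2:
        # Added guard (not in the pseudocode): the split separates nothing,
        # so stop here instead of recursing forever.
        return majority(labels)
    return (i, {value: generate_tree(r, l, split_attribute, theta)
                for value, (r, l) in branches.items()})

rows = [("sunny", "hot"), ("sunny", "cool"), ("rainy", "hot"), ("rainy", "cool")]
labels = ["no", "no", "yes", "yes"]
tree = generate_tree(rows, labels)
```

On this toy data the first attribute separates the classes perfectly, so the sketch returns a one-level tree that splits on attribute 0 and has a pure leaf under each value.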
SplitAttribute(X)
MinEnt ← ∞
for all attributes i = 1, ..., d do
    if xi is discrete with n values then
        Split X into X1, ..., Xn by xi
        e ← SplitEntropy(X1, ..., Xn)
        if e < MinEnt then
            MinEnt ← e
            bestf ← i
        end if
    else
        for all possible splits of numeric attribute do
            Split X into X1, X2 by xi
            e ← SplitEntropy(X1, X2)
            if e < MinEnt then
                MinEnt ← e
                bestf ← i
            end if
        end for
    end if
end for
return bestf
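A possible Python rendering of SplitAttribute is sketched below. It departs from the pseudocode in two labelled ways: it returns the chosen threshold alongside the attribute index (the pseudocode returns only bestf), and it takes "all possible splits" of a numeric attribute to mean the midpoints between consecutive distinct values, which is an assumption. The `numeric` argument, a set of column indices, is also hypothetical.

```python
import math
from collections import Counter

def split_entropy(groups):
    """Weighted entropy (in bits) of the label lists in `groups`."""
    n = sum(len(g) for g in groups)
    total = 0.0
    for g in groups:
        for count in Counter(g).values():
            p = count / len(g)
            total -= (len(g) / n) * p * math.log2(p)
    return total

def split_attribute(rows, labels, numeric=frozenset()):
    """Return (best attribute index, threshold), threshold None if discrete.

    `numeric` holds the indices of numeric attributes (assumed input).
    """
    min_ent, best = math.inf, (None, None)
    for i in range(len(rows[0])):
        if i not in numeric:
            # Discrete attribute: one branch per observed value.
            groups = {}
            for row, y in zip(rows, labels):
                groups.setdefault(row[i], []).append(y)
            candidates = [(None, list(groups.values()))]
        else:
            # Numeric attribute: try the midpoint between each pair of
            # consecutive distinct values (an assumption of this sketch).
            values = sorted({row[i] for row in rows})
            candidates = []
            for lo, hi in zip(values, values[1:]):
                t = (lo + hi) / 2
                left = [y for row, y in zip(rows, labels) if row[i] <= t]
                right = [y for row, y in zip(rows, labels) if row[i] > t]
                candidates.append((t, [left, right]))
        for t, groups in candidates:
            e = split_entropy(groups)
            if e < min_ent:
                min_ent, best = e, (i, t)
    return best

rows = [("a", 1.0), ("b", 2.0), ("a", 3.0), ("b", 4.0)]
labels = ["n", "n", "y", "y"]
choice = split_attribute(rows, labels, numeric={1})
```

Here the discrete attribute leaves both branches evenly mixed, while thresholding the numeric attribute between 2.0 and 3.0 separates the classes, so the numeric split is selected.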
Regression Trees
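Let $b_m(\mathbf{x}) = 1$ if $\mathbf{x} \in X_m$ (that is, $\mathbf{x}$ reaches node $m$), and 0 otherwise.
Error at node m:
$$E_m = \frac{1}{N_m} \sum_t (r^t - g_m)^2 \, b_m(\mathbf{x}^t), \qquad g_m = \frac{\sum_t b_m(\mathbf{x}^t)\, r^t}{\sum_t b_m(\mathbf{x}^t)}$$
After splitting:
$$E'_m = \frac{1}{N_m} \sum_j \sum_t (r^t - g_{mj})^2 \, b_{mj}(\mathbf{x}^t), \qquad g_{mj} = \frac{\sum_t b_{mj}(\mathbf{x}^t)\, r^t}{\sum_t b_{mj}(\mathbf{x}^t)}$$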
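The node error before and after a split can be illustrated with a short Python sketch. The function names and the representation of a node as a plain list of target values (the instances for which the indicator equals 1) are assumptions made for this example.

```python
def node_mean(targets):
    """g_m: mean target value of the instances reaching the node."""
    return sum(targets) / len(targets)

def node_error(targets):
    """E_m: mean squared error of predicting g_m for every instance."""
    g_m = node_mean(targets)
    return sum((r - g_m) ** 2 for r in targets) / len(targets)

def split_error(branches):
    """E'_m: squared error after the split, normalised by N_m.

    `branches` is a list with the target values that fall in each branch j.
    """
    n_m = sum(len(b) for b in branches)
    total = 0.0
    for targets in branches:
        if not targets:
            continue  # an empty branch contributes nothing
        g_mj = node_mean(targets)  # each branch predicts its own mean
        total += sum((r - g_mj) ** 2 for r in targets)
    return total / n_m

before = node_error([1.0, 1.0, 3.0, 3.0])
after_good = split_error([[1.0, 1.0], [3.0, 3.0]])
after_bad = split_error([[1.0, 3.0], [1.0, 3.0]])
```

In this toy case the error at the node is 1.0; a split that groups equal targets together drives the error to 0, whereas a split that mixes them leaves it at 1.0, so the first split would be chosen.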