CS570 Data Mining Classification: Ensemble Methods
Cengiz Günay
- Dept. Math & CS, Emory University
Fall 2013 Some slides courtesy of Han-Kamber-Pei, Tan et al., and Li Xiong
Günay (Emory) Classification: Ensemble Methods Fall 2013 1 / 6
- Due today midnight: Homework #2 – Frequent itemsets
- Given today: Homework #3 – Classification
- Today’s menu: Classification: Ensemble Methods
Suppose there are 25 base classifiers
Each classifier has error rate ε = 0.35
Assume the classifiers make independent errors
The majority-vote ensemble makes a wrong prediction only if at least 13 of the 25 classifiers are wrong:
\sum_{i=13}^{25} \binom{25}{i} \varepsilon^i (1-\varepsilon)^{25-i} \approx 0.06
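The binomial sum above is easy to verify numerically; a minimal sketch of the computation:

```python
from math import comb

eps = 0.35   # error rate of each base classifier
n = 25       # number of independent base classifiers

# the majority-vote ensemble errs only when >= 13 of the 25 classifiers err
p_wrong = sum(comb(n, i) * eps**i * (1 - eps)**(n - i)
              for i in range(13, n + 1))
print(round(p_wrong, 2))  # -> 0.06, far below the individual rate of 0.35
```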
An ensemble can be obtained by manipulating:
1 Training set: bagging, boosting
2 Input features: random forests, multi-objective evolutionary algorithms, forward/backward elimination?
3 Class labels: multi-class decompositions, active learning
4 Learning algorithm: e.g., ANNs, decision trees
Sampling with replacement: build a classifier on each bootstrap sample
Each record has probability 1 − (1 − 1/n)^n ≈ 0.632 (for large n) of being selected into a given bootstrap sample
Original Data:       1  2  3  4  5  6  7  8  9  10
Bagging (Round 1):   7  8  10 8  2  5  10 10 5  9
Bagging (Round 2):   1  4  9  1  2  3  2  7  3  2
Bagging (Round 3):   1  8  5  10 5  5  9  6  3  7
Advantages:
- Less overfitting
- Helps when the classifier is unstable (has high variance)
Disadvantages:
- Not useful when the classifier is stable and has large bias
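The bootstrap rounds in the table above can be simulated directly (an illustrative sketch; the drawn samples depend on the random seed and will not match the table):

```python
import random

random.seed(1)
data = list(range(1, 11))   # record IDs 1..10, as in the table
n = len(data)

# each bagging round draws n records uniformly with replacement
rounds = []
for r in range(1, 4):
    sample = [random.choice(data) for _ in range(n)]
    rounds.append(sample)
    print(f"Bagging (Round {r}): {sample}")

# expected fraction of distinct records in one bootstrap sample: 1 - (1 - 1/n)^n
print(round(1 - (1 - 1/n)**n, 3))  # -> 0.651 for n = 10; tends to ~0.632 as n grows
```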
Boosting: an iterative procedure to adaptively change the distribution of the training data, focusing on records that are hard to classify
- The final prediction is a weighted vote; each classifier's weight is determined by its accuracy
- Initially, all N records are assigned equal weights
- Unlike bagging, weights may change at the end of each boosting round
- Records that are wrongly classified will have their weights increased
- Records that are classified correctly will have their weights decreased

Original Data:       1  2  3  4  5  6  7  8  9  10
Boosting (Round 1):  7  3  2  8  7  9  4  10 6  3
Boosting (Round 2):  5  4  9  4  2  5  1  7  4  2
Boosting (Round 3):  4  4  8  10 4  5  4  6  3  4

Example: record 4 is hard to classify; as its weight increases, it is more likely to be chosen again in subsequent rounds
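The resampling effect can be illustrated with weighted sampling (my own sketch, not code from the slides; record 4 is given a larger weight, standing in for a hard-to-classify record):

```python
import random

random.seed(42)
records = list(range(1, 11))   # record IDs 1..10
weights = [1.0] * 10
weights[3] = 5.0               # record 4 is "hard to classify": larger weight

# boosting draws each round's training set with probability proportional
# to weight; count how often each record is picked over many simulated rounds
counts = {r: 0 for r in records}
for _ in range(1000):
    for r in random.choices(records, weights=weights, k=10):
        counts[r] += 1

# record 4 carries 5/14 of the total weight, so it should get ~36% of all picks,
# versus 10% under uniform sampling
print(counts[4] / 10000)
```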
Advantages: focuses on samples that are hard to classify
Sample weights can be used:
1 As the sampling probability when drawing the next round's training set
2 Directly by the base classifier, to value heavier records more
AdaBoost: calculates a classifier importance instead of equal voting; uses exponential weight-update rules; but is susceptible to overfitting
Base classifiers: C_1, C_2, \ldots, C_T
Error rate of classifier C_j:
\varepsilon_j = \frac{1}{N} \sum_{i=1}^{N} w_i \,\delta\!\left(C_j(x_i) \neq y_i\right)
Importance of a classifier:
\alpha_j = \frac{1}{2} \ln\!\left(\frac{1-\varepsilon_j}{\varepsilon_j}\right)
Weight update:
w_i^{(j+1)} = \frac{w_i^{(j)}}{Z_j} \times \begin{cases} e^{-\alpha_j} & \text{if } C_j(x_i) = y_i \\ e^{\alpha_j} & \text{if } C_j(x_i) \neq y_i \end{cases}
where Z_j is a normalization factor chosen so that \sum_i w_i^{(j+1)} = 1
If any intermediate round produces an error rate above 50%, the weights are reset to 1/N and the resampling procedure is repeated
Classification:
C^*(x) = \arg\max_y \sum_{j=1}^{T} \alpha_j \,\delta\!\left(C_j(x) = y\right)
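These update rules can be exercised end to end on a toy problem. The following is a minimal AdaBoost sketch (my own illustration, not code from the slides) using one-dimensional decision stumps as base classifiers; the dataset, stump search, and round count are all made up for demonstration:

```python
import math

# toy 1-D training set; y is not separable by a single threshold,
# but a boosted combination of decision stumps can fit it
X = [1, 2, 3, 4, 5, 6, 7, 8]
y = [1, 1, 1, -1, -1, -1, 1, 1]
N = len(X)

def best_stump(w):
    """Return (weighted error, threshold, sign, predictions) of the best stump."""
    best = None
    for thr in [0.5] + [x + 0.5 for x in X]:
        for sign in (1, -1):
            pred = [sign if x < thr else -sign for x in X]
            err = sum(wi for wi, p, yi in zip(w, pred, y) if p != yi)
            if best is None or err < best[0]:
                best = (err, thr, sign, pred)
    return best

w = [1.0 / N] * N              # initially, equal weights
ensemble = []                  # (alpha_j, threshold, sign) per round
for _ in range(5):
    err, thr, sign, pred = best_stump(w)
    if err == 0:               # perfect stump: stop with a large finite alpha
        ensemble.append((5.0, thr, sign))
        break
    alpha = 0.5 * math.log((1 - err) / err)   # classifier importance alpha_j
    ensemble.append((alpha, thr, sign))
    # exponential weight update, then normalize by Z_j
    w = [wi * math.exp(-alpha if p == yi else alpha)
         for wi, p, yi in zip(w, pred, y)]
    Z = sum(w)
    w = [wi / Z for wi in w]

def classify(x):
    """Weighted vote of all stumps: sign of sum_j alpha_j C_j(x)."""
    s = sum(a * (sg if x < t else -sg) for a, t, sg in ensemble)
    return 1 if s >= 0 else -1

print([classify(x) for x in X])  # -> [1, 1, 1, -1, -1, -1, 1, 1]
```

No single stump classifies this set perfectly, yet the weighted vote of five stumps does, which is the point of the importance-weighted combination.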
[Figure slides: AdaBoost example showing the data points for training and the initial weights for each data point. (C) Vipin Kumar, Parallel Issues in Data Mining.]
Random forests:
- Apply only to decision tree base classifiers
- Lower generalization error
- Use randomization in tree construction: number of features examined per split is log2(d) + 1 (d = total number of features)
- Accuracy equivalent to AdaBoost, but faster to train
See the table in Tan et al., p. 294, for a comparison of ensemble methods.
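The two sources of randomization can be sketched as follows (an illustration assuming the log2(d) + 1 rule from the slide; d, n, and T are made-up sizes):

```python
import math
import random

random.seed(0)
d = 100   # number of input features
n = 1000  # number of training records
T = 10    # number of trees in the forest

# the slide's rule: examine F = log2(d) + 1 randomly chosen features per split
F = int(math.log2(d)) + 1
print(F)  # -> 7 for d = 100

for t in range(T):
    bootstrap = [random.randrange(n) for _ in range(n)]  # rows drawn with replacement
    split_features = random.sample(range(d), F)          # fresh feature subset at a split
# each tree trains on its own bootstrap sample, and every split re-draws F features;
# this decorrelates the trees so that majority voting reduces variance
```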