DM825 Introduction to Machine Learning Lecture 14
Tree-based Methods Principal Components Analysis
Marco Chiarandini
Department of Mathematics & Computer Science University of Southern Denmark
Tree-Based Methods PCA
Examples, with attributes Alt, Bar, Fri, Hun, Pat, Price, Rain, Res, Type, Est and target WillWait:

Example  Alt  Bar  Fri  Hun  Pat   Price  Rain  Res  Type     Est    WillWait
X1       T    F    F    T    Some  $$$    F     T    French   0–10   T
X2       T    F    F    T    Full  $      F     F    Thai     30–60  F
X3       F    T    F    F    Some  $      F     F    Burger   0–10   T
X4       T    F    T    T    Full  $      F     F    Thai     10–30  T
X5       T    F    T    F    Full  $$$    F     T    French   >60    F
X6       F    T    F    T    Some  $$     T     T    Italian  0–10   T
X7       F    T    F    F    None  $      T     F    Burger   0–10   F
X8       F    F    F    T    Some  $$     T     T    Thai     0–10   T
X9       F    T    T    F    Full  $      T     F    Burger   >60    F
X10      T    T    T    T    Full  $$$    F     T    Italian  10–30  F
X11      F    F    F    F    None  $      F     F    Thai     0–10   F
X12      T    T    T    T    Full  $      F     F    Burger   30–60  T
[Figure: decision tree for deciding whether to wait; internal nodes Patrons?, WaitEstimate?, Alternate?, Hungry?, Reservation?, Fri/Sat?, Bar?, Raining?; leaves T/F]
[Figure: small decision tree testing attributes A and B, with T/F leaves]
◮ test the most important attribute first
◮ divide the problem up into smaller subproblems that can be solved
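The two ideas above can be sketched as a greedy recursive procedure. This is a minimal sketch under stated assumptions, not necessarily the procedure used in the lecture: "most important" is measured here by entropy-based information gain, ties go to the attribute listed first, and leaves with mixed labels fall back to a majority vote. The names `learn_tree`, `importance`, `split`, `majority` and `classify` are hypothetical, and only four of the ten attributes from the restaurant examples are used to keep the block short.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(c / total * log2(c / total) for c in Counter(labels).values())

def split(examples, attr):
    """Group examples by their value of attr."""
    groups = {}
    for ex in examples:
        groups.setdefault(ex[attr], []).append(ex)
    return groups

def importance(examples, attr, target):
    """Assumed importance measure: information gain of attr."""
    remainder = sum(len(g) / len(examples) * entropy([ex[target] for ex in g])
                    for g in split(examples, attr).values())
    return entropy([ex[target] for ex in examples]) - remainder

def majority(examples, target):
    return Counter(ex[target] for ex in examples).most_common(1)[0][0]

def learn_tree(examples, attributes, target):
    labels = {ex[target] for ex in examples}
    if len(labels) == 1:            # pure subset: make a leaf
        return labels.pop()
    if not attributes:              # nothing left to test: majority leaf
        return majority(examples, target)
    # Test the most important attribute first ...
    best = max(attributes, key=lambda a: importance(examples, a, target))
    rest = [a for a in attributes if a != best]
    # ... then solve each smaller subproblem recursively.
    return (best, {value: learn_tree(subset, rest, target)
                   for value, subset in split(examples, best).items()})

def classify(tree, example):
    """Follow branches to a leaf; raises KeyError on an unseen value."""
    while isinstance(tree, tuple):
        attr, branches = tree
        tree = branches[example[attr]]
    return tree

FIELDS = ("Pat", "Hun", "Type", "Fri", "WillWait")
ROWS = [  # X1..X12, restricted to four attributes plus the target
    ("Some", True, "French", False, True),
    ("Full", True, "Thai", False, False),
    ("Some", False, "Burger", False, True),
    ("Full", True, "Thai", True, True),
    ("Full", False, "French", True, False),
    ("Some", True, "Italian", False, True),
    ("None", False, "Burger", False, False),
    ("Some", True, "Thai", False, True),
    ("Full", False, "Burger", True, False),
    ("Full", True, "Italian", True, False),
    ("None", False, "Thai", False, False),
    ("Full", True, "Burger", True, True),
]
examples = [dict(zip(FIELDS, row)) for row in ROWS]
tree = learn_tree(examples, ["Pat", "Hun", "Type", "Fri"], "WillWait")
print(tree)
```

On these twelve examples the sketch puts Pat (Patrons) at the root and reproduces every training label; the order of the lower tests depends on the tie-breaking rule assumed above.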
[Figure: splitting the examples on Patrons? (None, Some, Full) versus Type? (French, Italian, Thai, Burger)]
◮ Suppose we have p positive and n negative examples in a training set,
◮ An attribute A splits the training set E into subsets E_1, ..., E_d, each of
◮ Let E_i have p_i positive and n_i negative examples
◮ The information gain from attribute A is
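As an illustration, here is a minimal sketch that assumes the usual entropy-based definition: Gain(A) = B(p/(p+n)) - sum over i of (p_i+n_i)/(p+n) * B(p_i/(p_i+n_i)), where B(q) = -(q log2 q + (1-q) log2(1-q)) is the entropy of a Boolean variable that is true with probability q. The function names `binary_entropy` and `information_gain` are hypothetical; the data are the Pat, Type and WillWait columns of the twelve restaurant examples.

```python
from collections import Counter
from math import log2

def binary_entropy(q):
    """B(q): entropy in bits of a Boolean variable that is true with probability q."""
    if q in (0.0, 1.0):
        return 0.0
    return -(q * log2(q) + (1 - q) * log2(1 - q))

def information_gain(values, labels):
    """Gain of an attribute, given its value and the Boolean label per example."""
    total = len(labels)
    size = Counter()   # p_i + n_i for each attribute value
    pos = Counter()    # p_i for each attribute value
    for v, y in zip(values, labels):
        size[v] += 1
        pos[v] += y    # True counts as 1, False as 0
    remainder = sum(size[v] / total * binary_entropy(pos[v] / size[v])
                    for v in size)
    return binary_entropy(sum(labels) / total) - remainder

# (Pat, Type, WillWait) for X1..X12
rows = [
    ("Some", "French", True), ("Full", "Thai", False),
    ("Some", "Burger", True), ("Full", "Thai", True),
    ("Full", "French", False), ("Some", "Italian", True),
    ("None", "Burger", False), ("Some", "Thai", True),
    ("Full", "Burger", False), ("Full", "Italian", False),
    ("None", "Thai", False), ("Full", "Burger", True),
]
patrons = [r[0] for r in rows]
types = [r[1] for r in rows]
will_wait = [r[2] for r in rows]

gain_patrons = information_gain(patrons, will_wait)
gain_type = information_gain(types, will_wait)
print(f"Gain(Patrons) = {gain_patrons:.3f}")
print(f"Gain(Type)    = {gain_type:.3f}")
```

Under this definition Patrons has a gain of about 0.541 bits, while Type has a gain of 0 because every Type subset is split evenly between positive and negative examples.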
[Figure: decision tree with nodes Patrons?, Hungry?, Type? and Fri/Sat?, and T/F leaves]
◮ Missing data
◮ Multivalued attributes
◮ Continuous input attributes
◮ Continuous-valued output attributes
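For continuous input attributes, one common approach (an assumption here, not necessarily the one taken in the lecture) is to turn the attribute into a binary test "value <= θ" and pick the threshold θ with the highest information gain, trying only midpoints between consecutive distinct sorted values. The sketch below uses made-up waiting times in minutes; `best_threshold` is a hypothetical name.

```python
from math import log2

def entropy(labels):
    """Entropy in bits of a list of Boolean labels."""
    if not labels:
        return 0.0
    q = sum(labels) / len(labels)
    if q in (0.0, 1.0):
        return 0.0
    return -(q * log2(q) + (1 - q) * log2(1 - q))

def best_threshold(values, labels):
    """Return (theta, gain) for the best binary split 'value <= theta'."""
    pairs = sorted(zip(values, labels))
    sorted_labels = [y for _, y in pairs]
    base = entropy(sorted_labels)
    best = (None, 0.0)
    for i in range(1, len(pairs)):
        low, high = pairs[i - 1][0], pairs[i][0]
        if low == high:
            continue  # no threshold separates equal values
        theta = (low + high) / 2
        left, right = sorted_labels[:i], sorted_labels[i:]
        remainder = (len(left) * entropy(left)
                     + len(right) * entropy(right)) / len(pairs)
        gain = base - remainder
        if gain > best[1]:
            best = (theta, gain)
    return best

# Hypothetical data: estimated wait in minutes and whether the customer waited.
wait_minutes = [5, 8, 15, 25, 40, 70]
waited = [True, True, True, False, True, False]

theta, gain = best_threshold(wait_minutes, waited)
print(theta, round(gain, 3))
```

The chosen test can then be treated like any other Boolean attribute when growing the tree; unlike a discrete attribute, the same continuous attribute may be tested again further down with a different threshold.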
◮ Classification tree analysis is when the predicted outcome is the class to
◮ Regression tree analysis is when the predicted outcome can be
◮ Classification And Regression Tree (CART) analysis is used to refer to
◮ CHi-squared Automatic Interaction Detector (CHAID). Performs
◮ A Random Forest classifier uses a number of decision trees, in order to
◮ Boosting Trees can be used for regression-type and classification-type