 
              1/14 Extending Decision Trees Alice Gao Lecture 20 Based on work by K. Leyton-Brown, K. Larson, and P. van Beek
2/14 Outline
  Learning Goals
  Non-binary Class Variable
  Real-valued features
  Noise and over-fitting
  Revisiting the Learning goals
3/14 Learning Goals By the end of the lecture, you should be able to
4/14 Jeeves the valet - training set
Day  Outlook   Temp  Humidity  Wind    Tennis?
1    Sunny     Hot   High      Weak    No
2    Sunny     Hot   High      Strong  No
3    Overcast  Hot   High      Weak    Yes
4    Rain      Mild  High      Weak    Yes
5    Rain      Cool  Normal    Weak    Yes
6    Rain      Cool  Normal    Strong  No
7    Overcast  Cool  Normal    Strong  Yes
8    Sunny     Mild  High      Weak    No
9    Sunny     Cool  Normal    Weak    Yes
10   Rain      Mild  Normal    Weak    Yes
11   Sunny     Mild  Normal    Strong  Yes
12   Overcast  Mild  High      Strong  Yes
13   Overcast  Hot   Normal    Weak    Yes
14   Rain      Mild  High      Strong  No
5/14 Jeeves the valet - the test set
Day  Outlook   Temp  Humidity  Wind    Tennis?
1    Sunny     Mild  High      Strong  No
2    Rain      Hot   Normal    Strong  No
3    Rain      Cool  High      Strong  No
4    Overcast  Hot   High      Strong  Yes
5    Overcast  Cool  Normal    Weak    Yes
6    Rain      Hot   High      Weak    Yes
7    Overcast  Mild  Normal    Weak    Yes
8    Overcast  Cool  High      Weak    Yes
9    Rain      Cool  High      Weak    Yes
10   Rain      Mild  Normal    Strong  No
11   Overcast  Mild  High      Weak    Yes
12   Sunny     Mild  Normal    Weak    Yes
13   Sunny     Cool  High      Strong  No
14   Sunny     Cool  High      Weak    No
6/14 Extending Decision Trees
  1. Non-binary class variable
  2. Real-valued features
  3. Noise and over-fitting
7/14 The modified ID3 algorithm
Algorithm 1 ID3 Algorithm (Features, Examples)
  If all examples belong to the same class, return a leaf node with a decision for that class.
  If no features left, return a leaf node with the majority decision of the examples.
  If no examples left, return a leaf node with the majority decision of the examples in the parent.
  else
    choose feature f with the maximum information gain
    for each value v of feature f do
      add arc with label v
      add subtree ID3(F − f, {s ∈ S | f(s) = v})
    end for
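The algorithm on this slide can be sketched in Python as follows. This is a minimal illustration, not the lecture's reference implementation: the function names, the dictionary-based example format, the `domains` argument, and the toy three-class data set are all assumptions made for the sketch. The empty-examples case is checked first, because "all examples belong to the same class" is vacuously true for an empty set. Ties in the majority vote are broken arbitrarily.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Entropy in bits of a list of class labels (any number of classes)."""
    total = len(labels)
    return -sum(n / total * log2(n / total) for n in Counter(labels).values())

def info_gain(examples, feature, target):
    """Entropy before the test minus expected entropy after testing `feature`."""
    before = entropy([e[target] for e in examples])
    after = 0.0
    for v in {e[feature] for e in examples}:
        subset = [e[target] for e in examples if e[feature] == v]
        after += len(subset) / len(examples) * entropy(subset)
    return before - after

def majority(examples, target):
    """Most common class among the examples (ties broken arbitrarily)."""
    return Counter(e[target] for e in examples).most_common(1)[0][0]

def id3(features, examples, target, domains, parent_examples=None):
    if not examples:                      # no examples left: use the parent's majority
        return majority(parent_examples, target)
    classes = {e[target] for e in examples}
    if len(classes) == 1:                 # all examples in the same class
        return classes.pop()
    if not features:                      # no features left
        return majority(examples, target)
    best = max(features, key=lambda f: info_gain(examples, f, target))
    rest = [f for f in features if f != best]
    branches = {}
    for v in domains[best]:               # one arc per value of the chosen feature
        subset = [e for e in examples if e[best] == v]
        branches[v] = id3(rest, subset, target, domains, examples)
    return (best, branches)

# Hypothetical toy data with three classes, used only to exercise the sketch.
toy = [
    {"shape": "round", "size": "small", "cls": "c1"},
    {"shape": "round", "size": "large", "cls": "c2"},
    {"shape": "square", "size": "small", "cls": "c3"},
    {"shape": "square", "size": "large", "cls": "c3"},
]
domains = {"shape": ["round", "square"], "size": ["small", "large"]}
tree = id3(["shape", "size"], toy, "cls", domains)
```

In this sketch a tree is either a class label (a leaf) or a pair of the tested feature and a dictionary mapping each feature value to a subtree. Nothing in the code assumes two classes, which is the point of the non-binary extension.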
8/14 CQ: Calculating the information gain
CQ: Suppose that we are classifying examples into three classes. Before testing feature X, there are 3 examples in class c1, 5 examples in class c2, and 2 examples in class c3. Feature X has two values a and b. When X = a, there is 1 example in class c1, 5 examples in class c2, and 0 examples in class c3. When X = b, there are 2 examples in class c1, 0 examples in class c2, and 2 examples in class c3. What is the information gain for testing feature X at this node?
(A) [0, 0.2)
(B) [0.2, 0.4)
(C) [0.4, 0.6)
(D) [0.6, 0.8)
(E) [0.8, 1]
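One way to check the arithmetic for this question is the short script below. It assumes the usual definition, which the slide does not restate: information gain is the entropy before the test minus the expected entropy after the test, with entropy measured in bits. The variable names are chosen for this sketch only.

```python
from math import log2

def entropy(counts):
    """Entropy in bits of a class distribution given as counts; empty classes are skipped."""
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c > 0)

before = [3, 5, 2]                          # examples in c1, c2, c3 before testing X
splits = {"a": [1, 5, 0], "b": [2, 0, 2]}   # class counts for X = a and X = b

n = sum(before)
# Weight each branch's entropy by the fraction of examples that reach it.
expected_after = sum(sum(c) / n * entropy(c) for c in splits.values())
gain = entropy(before) - expected_after
print(round(gain, 3))  # 0.695
```

The entropy before the test is about 1.485 bits and the expected entropy after is about 0.790 bits, so the gain is about 0.695.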
9/14 Jeeves dataset with real-valued temperatures
Day  Outlook   Temp  Humidity  Wind    Tennis?
1    Sunny     29.4  High      Weak    No
2    Sunny     26.6  High      Strong  No
3    Overcast  28.3  High      Weak    Yes
4    Rain      21.1  High      Weak    Yes
5    Rain      20.0  Normal    Weak    Yes
6    Rain      18.3  Normal    Strong  No
7    Overcast  17.7  Normal    Strong  Yes
8    Sunny     22.2  High      Weak    No
9    Sunny     20.6  Normal    Weak    Yes
10   Rain      23.9  Normal    Weak    Yes
11   Sunny     23.9  Normal    Strong  Yes
12   Overcast  22.2  High      Strong  Yes
13   Overcast  27.2  Normal    Weak    Yes
14   Rain      21.7  High      Strong  No
10/14 Jeeves dataset ordered by temperatures
Day  Outlook   Temp  Humidity  Wind    Tennis?
7    Overcast  17.7  Normal    Strong  Yes
6    Rain      18.3  Normal    Strong  No
5    Rain      20.0  Normal    Weak    Yes
9    Sunny     20.6  Normal    Weak    Yes
4    Rain      21.1  High      Weak    Yes
14   Rain      21.7  High      Strong  No
8    Sunny     22.2  High      Weak    No
12   Overcast  22.2  High      Strong  Yes
10   Rain      23.9  Normal    Weak    Yes
11   Sunny     23.9  Normal    Strong  Yes
2    Sunny     26.6  High      Strong  No
13   Overcast  27.2  Normal    Weak    Yes
3    Overcast  28.3  High      Weak    Yes
1    Sunny     29.4  High      Weak    No
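Sorting by temperature is a natural first step toward choosing a binary split point for a real-valued feature. The slide does not spell out the procedure, so the sketch below uses one common approach as an assumption: consider the midpoint between two consecutive distinct temperatures whenever some label at the lower value differs from some label at the higher value, then score each candidate by information gain. The function names and the `x <= t` split convention are choices made for this sketch.

```python
from collections import Counter, defaultdict
from math import log2

# (temperature, Tennis?) pairs read off the temperature-ordered Jeeves table.
data = [(17.7, "Yes"), (18.3, "No"), (20.0, "Yes"), (20.6, "Yes"), (21.1, "Yes"),
        (21.7, "No"), (22.2, "No"), (22.2, "Yes"), (23.9, "Yes"), (23.9, "Yes"),
        (26.6, "No"), (27.2, "Yes"), (28.3, "Yes"), (29.4, "No")]

def entropy(labels):
    """Entropy in bits of a list of class labels."""
    total = len(labels)
    return -sum(n / total * log2(n / total) for n in Counter(labels).values())

def candidate_splits(pairs):
    """Midpoints between consecutive distinct values whose label sets disagree."""
    labels_at = defaultdict(set)
    for x, y in pairs:
        labels_at[x].add(y)
    values = sorted(labels_at)
    candidates = []
    for lo, hi in zip(values, values[1:]):
        # More than one label across the two values means some pair of labels differs.
        if len(labels_at[lo] | labels_at[hi]) > 1:
            candidates.append((lo + hi) / 2)
    return candidates

def gain_at(pairs, t):
    """Information gain of the binary test x <= t."""
    labels = [y for _, y in pairs]
    left = [y for x, y in pairs if x <= t]
    right = [y for x, y in pairs if x > t]
    after = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
    return entropy(labels) - after

candidates = candidate_splits(data)
best = max(candidates, key=lambda t: gain_at(data, t))
```

Because each test is binary (`x <= t` versus `x > t`), the feature is not used up by a single test. The same real-valued feature can be tested again with a different threshold further down the same path.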
11/14 CQ: Testing a discrete feature
CQ: Suppose that feature X has discrete values (e.g. Temp is Cool, Mild, or Hot.) On any path from the root to a leaf, how many times can we test feature X?
(A) 0 times
(B) 1 time
(C) > 1 time
(D) Two of (A), (B), and (C) are correct.
(E) All of (A), (B), and (C) are correct.
12/14 CQ: Testing a continuous feature
CQ: Suppose that feature X has continuous values (e.g. Temp ranges from 17.7 to 29.4.) On any path from the root to a leaf, how many times can we test feature X?
(A) 0 times
(B) 1 time
(C) > 1 time
(D) Two of (A), (B), and (C) are correct.
(E) All of (A), (B), and (C) are correct.
13/14 Jeeves training set is corrupted
Day  Outlook   Temp  Humidity  Wind    Tennis?
1    Sunny     Hot   High      Weak    No
2    Sunny     Hot   High      Strong  No
3    Overcast  Hot   High      Weak    No
4    Rain      Mild  High      Weak    Yes
5    Rain      Cool  Normal    Weak    Yes
6    Rain      Cool  Normal    Strong  No
7    Overcast  Cool  Normal    Strong  Yes
8    Sunny     Mild  High      Weak    No
9    Sunny     Cool  Normal    Weak    Yes
10   Rain      Mild  Normal    Weak    Yes
11   Sunny     Mild  Normal    Strong  Yes
12   Overcast  Mild  High      Strong  Yes
13   Overcast  Hot   Normal    Weak    Yes
14   Rain      Mild  High      Strong  No
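Comparing this table with the original training set, the only difference appears to be the label of Day 3 (Overcast), which changes from Yes to No. The small check below, a sketch with variable names chosen here, shows why a single noisy label matters. The Overcast examples (Days 3, 7, 12, 13) no longer share one class, so ID3 can no longer stop at a leaf there and must keep splitting to fit the corrupted example.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Entropy in bits of a list of class labels."""
    total = len(labels)
    return -sum(n / total * log2(n / total) for n in Counter(labels).values())

# Tennis? labels of the four Overcast days (3, 7, 12, 13).
overcast_clean = ["Yes", "Yes", "Yes", "Yes"]
overcast_corrupted = ["No", "Yes", "Yes", "Yes"]   # Day 3 flipped (inferred from the tables)

# A pure subset has zero entropy and becomes a leaf; an impure one needs more tests.
clean_is_leaf = entropy(overcast_clean) == 0
corrupted_is_leaf = entropy(overcast_corrupted) == 0
print(clean_is_leaf, corrupted_is_leaf)          # True False
print(round(entropy(overcast_corrupted), 3))     # 0.811
```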
14/14 Revisiting the Learning Goals By the end of the lecture, you should be able to