Decision Trees
LING 572 Advanced Statistical Methods for NLP January 9, 2020
Sunburn Example

Name    Hair     Height    Weight    Lotion   Result
-----   ------   -------   -------   ------   ------
Sarah   Blonde   Average   Light     No       Burn
Dana    Blonde   Tall      Average   Yes      None
Alex    Brown    Short     Average   Yes      None
Annie   Blonde   Short     Average   No       Burn
Emily   Red      Average   Heavy     No       Burn
Pete    Brown    Tall      Heavy     No       None
John    Brown    Average   Heavy     No       None
Katie   Blonde   Short     Light     Yes      None
NLP data sets can involve thousands or even millions of features.
[Figure: decision tree for a house example: root split on district with branches Suburban (5), Rural (4), Urban (5); one branch splits on house type, Detached (2) vs. Semi-detached (3); leaves labeled Yes (3) and No (2)]
NLTK book ch 6
➔ Find a decision tree that is as small as possible and fits the data
1. Find the “best” feature, A, and assign A as the decision feature for the node.
2. For each value (or a range of values) of A, create a new branch, and divide up the training examples.
3. Repeat steps 1–2 until the gain is small enough.
Geometrically, this repeatedly draws axis-parallel lines, each split cutting along a different feature axis.
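The greedy procedure above can be sketched in Python. This is a minimal illustration, not the slides' reference implementation; all function and variable names are ours.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def avg_entropy(examples, labels, feature):
    """Average entropy of the labels after splitting on `feature`."""
    total = len(labels)
    result = 0.0
    for value in set(ex[feature] for ex in examples):
        subset = [lab for ex, lab in zip(examples, labels) if ex[feature] == value]
        result += len(subset) / total * entropy(subset)
    return result

def build_tree(examples, labels, features, min_gain=1e-6):
    """Greedy top-down induction: pick the best feature, branch on each
    of its values, recurse on each subset; stop when the gain is small."""
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not features:
        return majority                      # leaf: predict the majority class
    # Step 1: the "best" feature minimizes average entropy after the split.
    best = min(features, key=lambda f: avg_entropy(examples, labels, f))
    if entropy(labels) - avg_entropy(examples, labels, best) < min_gain:
        return majority                      # step 3: gain too small, stop
    # Step 2: one branch per observed value of the chosen feature.
    branches = {}
    for value in set(ex[best] for ex in examples):
        idx = [i for i, ex in enumerate(examples) if ex[best] == value]
        branches[value] = build_tree([examples[i] for i in idx],
                                     [labels[i] for i in idx],
                                     [f for f in features if f != best])
    return {best: branches}
```

On the sunburn table above, this procedure splits on Hair at the root, then on Lotion under the Blonde branch, which matches the classic result for that data set.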
(e.g., creating tests that look at feature combinations)
[Figure: decision tree with threshold tests f1 > 10, f2 > 10, f1 > 0, f1 > 20, f2 > 20, f1 > -10 and leaves L1–L7, shown alongside the corresponding axis-parallel partition of the (f1, f2) plane]
us if both ends of the line knew X?
$$\mathrm{InfoGain}(S, A) = H(S) - H(S \mid A) = H(S) - \sum_{a \in \mathrm{Values}(A)} p(A = a)\, H(S \mid A = a) = H(S) - \sum_{a \in \mathrm{Values}(A)} \frac{|S_a|}{|S|}\, H(S_a)$$
Average Entropy
InfoGain(S, Income)     = 0.940 - (7/14)*0.985 - (7/14)*0.592 = 0.151
InfoGain(S, PrevCustom) = 0.940 - (8/14)*0.811 - (6/14)*1.0   = 0.048
where S_a is the subset of S for which feature A has value a.
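The definition can be computed directly. A minimal sketch (function names are ours); the example values in the usage note are computed on the sunburn table rather than on the Income/PrevCustom data, whose underlying examples are not shown here.

```python
import math
from collections import Counter

def entropy(labels):
    """H(S): Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def info_gain(examples, labels, feature):
    """InfoGain(S, A) = H(S) - sum over a in Values(A) of |S_a|/|S| * H(S_a),
    where S_a is the subset of S for which A has value a."""
    total = len(labels)
    remainder = 0.0
    for value in set(ex[feature] for ex in examples):
        s_a = [lab for ex, lab in zip(examples, labels) if ex[feature] == value]
        remainder += len(s_a) / total * entropy(s_a)
    return entropy(labels) - remainder
```

By this computation, on the sunburn table InfoGain(S, Hair) ≈ 0.454 and InfoGain(S, Lotion) ≈ 0.348, so Hair would be chosen at the root.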
$$\mathrm{SplitInfo}(S, A) = -\sum_{a \in \mathrm{Values}(A)} \frac{|S_a|}{|S|}\, \log_2 \frac{|S_a|}{|S|}$$
[Figure repeated: decision tree for the house example: root split on district with branches Suburban (5), Rural (4), Urban (5); one branch splits on house type, Detached (2) vs. Semi-detached (3); leaves labeled Yes (3) and No (2)]
➔ Question: how to choose split points?
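One standard answer, in the style of C4.5's handling of continuous features (the slides may give a different one; the function name here is ours): sort the examples by the feature's value and take midpoints between adjacent examples whose class labels differ as candidate thresholds, then score each candidate, e.g. by information gain.

```python
def candidate_splits(values, labels):
    """Candidate thresholds for a continuous feature: midpoints between
    adjacent (sorted) feature values at which the class label changes.
    Each candidate would then be scored, e.g. by InfoGain."""
    pairs = sorted(zip(values, labels))
    splits = []
    for (v1, l1), (v2, l2) in zip(pairs, pairs[1:]):
        if l1 != l2 and v1 != v2:
            splits.append((v1 + v2) / 2)
    return splits
```

Restricting candidates to class boundaries keeps the search small without losing any split that could improve the information gain.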
Possible solutions:
target class.
Train m classifiers and then let the m classifiers vote on a test instance; the resulting combined classifier is called an ensemble.
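A minimal sketch of such a voting ensemble, assuming bagging-style bootstrap sampling (a standard way to obtain m different classifiers; the slides' exact scheme may differ). `learner` is any function mapping (examples, labels) to a classifier; all names are ours.

```python
import random
from collections import Counter

def bagging_predict(examples, labels, test_instance, learner, m=5, seed=0):
    """Train m classifiers, each on a bootstrap sample of the training
    data, then let the m classifiers vote on a test instance."""
    rng = random.Random(seed)
    n = len(examples)
    votes = []
    for _ in range(m):
        idx = [rng.randrange(n) for _ in range(n)]   # sample with replacement
        model = learner([examples[i] for i in idx],
                        [labels[i] for i in idx])
        votes.append(model(test_instance))
    return Counter(votes).most_common(1)[0][0]       # majority vote
```

Because each classifier sees a different resample of the data, their errors are partly decorrelated, which is what makes the majority vote more stable than any single tree.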