SLIDE 19 73
Entropy Based Discretization (1)
1. Sort examples in increasing order
2. Each value forms an interval (‘m’ intervals)
3. Calculate the entropy measure of this discretization
4. Merge the pair of adjacent intervals whose merging would improve the entropy
5. Repeat 3-4 with new discretization intervals
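The bottom-up procedure above can be sketched roughly as follows. The slide does not pin down the entropy measure or the stopping rule, so this sketch makes two assumptions: the measure is the size-weighted class entropy of the intervals (which requires class labels), and merging stops when a hypothetical target of `k` intervals is reached. The function names are illustrative only.

```python
from collections import Counter
from math import log2

def class_entropy(labels):
    """Shannon entropy (bits) of the class labels in one interval."""
    n = len(labels)
    return -sum(c / n * log2(c / n) for c in Counter(labels).values())

def weighted_entropy(intervals):
    """Size-weighted average class entropy over all intervals."""
    total = sum(len(labels) for _, labels in intervals)
    return sum(len(labels) / total * class_entropy(labels) for _, labels in intervals)

def merge_bottom_up(values, labels, k):
    """Merge adjacent intervals until k remain (k is an assumed stopping rule)."""
    # Steps 1-2: sort, then start with one interval per distinct value.
    grouped = {}
    for v, y in sorted(zip(values, labels), key=lambda p: p[0]):
        grouped.setdefault(v, []).append(y)
    intervals = [([v], ys) for v, ys in grouped.items()]
    while len(intervals) > k:
        # Steps 3-4: score every adjacent merge, keep the lowest-entropy result.
        best = None
        for i in range(len(intervals) - 1):
            merged = (intervals[i][0] + intervals[i + 1][0],
                      intervals[i][1] + intervals[i + 1][1])
            candidate = intervals[:i] + [merged] + intervals[i + 2:]
            score = weighted_entropy(candidate)
            if best is None or score < best[0]:
                best = (score, candidate)
        intervals = best[1]  # Step 5: repeat on the new discretization.
    # Report each interval by its smallest and largest value.
    return [(vs[0], vs[-1]) for vs, _ in intervals]
```

For example, `merge_bottom_up([1, 2, 3, 10, 11, 12], list("aaabbb"), 2)` groups the values into the intervals `(1, 3)` and `(10, 12)`, because merging within a class never raises the entropy while merging across the class boundary does.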
74
Entropy Based Discretization (2)
1. Sort examples in increasing order
2. Each value forms an interval (‘m’ intervals)
3. Calculate the entropy measure of this discretization
4. Find the binary split boundary T that minimizes the entropy function E(S, T) over all possible boundaries, where S_1 and S_2 are the two parts of S on either side of T. The split is selected as a binary discretization.
$$E(S, T) = \frac{|S_1|}{|S|}\,\mathrm{Ent}(S_1) + \frac{|S_2|}{|S|}\,\mathrm{Ent}(S_2)$$
5. Apply the process recursively until some stopping criterion is met, e.g., keep splitting only while the reduction in entropy exceeds a threshold δ:
$$\mathrm{Ent}(S) - E(S, T) > \delta$$
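A minimal sketch of the recursive binary split, under stated assumptions: Ent is computed over class labels attached to each value, each boundary T is placed at the midpoint between two adjacent distinct values, and recursion stops as soon as Ent(S) - E(S, T) no longer exceeds `delta`. The names `ent`, `best_split` and `discretize` are hypothetical helpers, not part of any library.

```python
from collections import Counter
from math import log2

def ent(labels):
    """Ent(S): Shannon entropy (bits) of the class labels in S."""
    n = len(labels)
    return -sum(c / n * log2(c / n) for c in Counter(labels).values())

def best_split(pairs):
    """Return (E(S,T), T) for the boundary T minimizing E(S,T); pairs must be sorted."""
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no boundary between equal values
        s1 = [y for _, y in pairs[:i]]
        s2 = [y for _, y in pairs[i:]]
        e = len(s1) / len(pairs) * ent(s1) + len(s2) / len(pairs) * ent(s2)
        if best is None or e < best[0]:
            # Assumption: the boundary sits midway between the two values.
            best = (e, (pairs[i - 1][0] + pairs[i][0]) / 2)
    return best

def discretize(pairs, delta):
    """Return sorted cut points; split only while Ent(S) - E(S,T) > delta."""
    pairs = sorted(pairs, key=lambda p: p[0])
    found = best_split(pairs)
    if found is None:
        return []  # fewer than two distinct values: nothing to split
    e, t = found
    if ent([y for _, y in pairs]) - e <= delta:
        return []  # stopping criterion met
    left = [p for p in pairs if p[0] < t]
    right = [p for p in pairs if p[0] > t]
    return discretize(left, delta) + [t] + discretize(right, delta)
```

With `pairs = [(1, "a"), (2, "a"), (3, "a"), (10, "b"), (11, "b"), (12, "b")]` and `delta = 0.1`, the only cut point returned is `6.5`: that split removes all class entropy, and neither pure half yields any further gain.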
75
Example
Sorted values (15 examples): 11 11 11 14 15 15 15 16 16 18 21 21 21 22 23

Within-partition probabilities p for the whole set S and for four candidate splits (L = left partition, R = right partition):

x      freq.  whole S  after 11  after 14  after 15  after 16
11     3      0.20     1.00 L    0.75 L    0.43 L    0.33 L
14     1      0.07     0.08 R    0.25 L    0.14 L    0.11 L
15     3      0.20     0.25 R    0.27 R    0.43 L    0.33 L
16     2      0.13     0.17 R    0.18 R    0.25 R    0.22 L
18     1      0.07     0.08 R    0.09 R    0.13 R    0.17 R
21     3      0.20     0.25 R    0.27 R    0.38 R    0.50 R
22     1      0.07     0.08 R    0.09 R    0.13 R    0.17 R
23     1      0.07     0.08 R    0.09 R    0.13 R    0.17 R
Total  15     1.00

For the split after 15, the p.log(p) terms are -0.52, -0.40, -0.52 (left) and -0.50, -0.38, -0.53, -0.38, -0.38 (right).

Ent(S) = 2.82

Entropy of each partition and the size-weighted entropy I(S,T) of the split:

Split     Left n  Left Ent  Right n  Right Ent  I(S,T)
after 11  3       0.00      12       2.63       2.1
after 14  4       0.811     11       2.41       1.99
after 15  7       1.45      8        2.16       1.83
after 16  9       1.89      6        1.79       1.85
...                                             1.90, 2.26, 2.47

Best split: after 15, where I(S,T) is smallest:
$$\mathrm{Ent}(S) - I(S,T) = 2.82 - 1.83 = 0.99$$
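The figures in this example can be checked with a short script. As in the example's tables, the entropy here is taken over the distribution of the values themselves (no class labels), and I(S,T) is the size-weighted entropy of the two sides of a split. The helper names `ent` and `weighted` are illustrative.

```python
from collections import Counter
from math import log2

values = [11, 11, 11, 14, 15, 15, 15, 16, 16, 18, 21, 21, 21, 22, 23]

def ent(xs):
    """Entropy (bits) of the distribution of values in xs."""
    n = len(xs)
    return -sum(c / n * log2(c / n) for c in Counter(xs).values())

def weighted(xs, t):
    """I(S,T): size-weighted entropy of the parts x <= t and x > t."""
    left = [x for x in xs if x <= t]
    right = [x for x in xs if x > t]
    return len(left) / len(xs) * ent(left) + len(right) / len(xs) * ent(right)

# Candidate boundaries: after every distinct value except the largest.
candidates = sorted(set(values))[:-1]
scores = {t: weighted(values, t) for t in candidates}
best = min(scores, key=scores.get)

print("Ent(S) =", round(ent(values), 2))
for t in candidates:
    print("split after", t, "->", round(scores[t], 2))
print("best split after", best)
```

The script reports Ent(S) = 2.82 and the smallest weighted entropy, 1.83, for the split after 15. The unrounded difference is about 0.997; the 0.99 in the example comes from subtracting the two rounded figures.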
76
Concept Hierarchies
- A concept hierarchy defines a sequence of mappings from a set of low-level concepts to higher-level, more general concepts
- E.g., city values for location include Los Angeles and New York
- Each city can be mapped to the state or province where it belongs
- A state or province can, in turn, be mapped to the country to which it belongs
- These mappings form a concept hierarchy
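One simple way to represent such a hierarchy is as a chain of lookup tables, one per level. The sketch below is only an illustration: the table names, the `generalize` helper and the Vancouver entry are assumptions added for the example, not part of the slides.

```python
# Each table maps a concept to its parent concept one level up.
CITY_TO_STATE = {
    "Los Angeles": "California",
    "New York": "New York State",
    "Vancouver": "British Columbia",  # illustrative province example
}
STATE_TO_COUNTRY = {
    "California": "USA",
    "New York State": "USA",
    "British Columbia": "Canada",
}
HIERARCHY = [CITY_TO_STATE, STATE_TO_COUNTRY]  # ordered low level -> high level

def generalize(value, levels):
    """Climb `levels` steps up the hierarchy, starting from a city name."""
    for mapping in HIERARCHY[:levels]:
        value = mapping[value]
    return value

print(generalize("Los Angeles", 1))
print(generalize("Vancouver", 2))
```

Climbing one level from "Los Angeles" gives "California", and climbing two levels from "Vancouver" gives "Canada". Replacing raw city values with their state or country in this way is how the hierarchy generalizes the data.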