SLIDE 10 33459-01: Principles of Knowledge Discovery in Data – March-June, 2006
(Dr. O. Zaiane)
Naïve Bayes for the Tennis Example - 3
- Hence
$$P(\text{yes} \mid E) = \frac{P(E_1 \mid \text{yes})\,P(E_2 \mid \text{yes})\,P(E_3 \mid \text{yes})\,P(E_4 \mid \text{yes})\,P(\text{yes})}{P(E)}$$
$$P(\text{no} \mid E) = \frac{P(E_1 \mid \text{no})\,P(E_2 \mid \text{no})\,P(E_3 \mid \text{no})\,P(E_4 \mid \text{no})\,P(\text{no})}{P(E)}$$
- Probabilities in the numerator will be estimated from the data.
- There is no need to estimate P(E), as it also appears in the denominators of the other hypotheses, i.e. it will disappear when we compare them.
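One way to see why P(E) can be ignored is to write the comparison as a ratio of the two posteriors. This is only a restatement of the two formulas for P(yes|E) and P(no|E), using the same symbols E_1 ... E_4; the decision rule in the last line is the usual reading of "compare them" and is stated here as an assumption.

```latex
\[
\frac{P(\text{yes} \mid E)}{P(\text{no} \mid E)}
  = \frac{P(E_1 \mid \text{yes})\,P(E_2 \mid \text{yes})\,P(E_3 \mid \text{yes})\,P(E_4 \mid \text{yes})\,P(\text{yes})}
         {P(E_1 \mid \text{no})\,P(E_2 \mid \text{no})\,P(E_3 \mid \text{no})\,P(E_4 \mid \text{no})\,P(\text{no})}
\]
% P(E) appears in both posteriors and cancels in the ratio.
% Predict play = yes if the ratio is greater than 1, otherwise play = no.
```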
Naïve Bayes for the Tennis Example – cont.1
- Tennis data - counts and probabilities:

attribute    value     count yes  count no  P(value|yes)  P(value|no)
outlook      sunny     2          3         2/9           3/5
outlook      overcast  4          0         4/9           0/5
outlook      rainy     3          2         3/9           2/5
temperature  hot       2          2         2/9           2/5
temperature  mild      4          2         4/9           2/5
temperature  cool      3          1         3/9           1/5
humidity     high      3          4         3/9           4/5
humidity     normal    6          1         6/9           1/5
windy        false     6          2         6/9           2/5
windy        true      3          3         3/9           3/5

play: yes 9, no 5; P(yes) = 9/14, P(no) = 5/14
- 6/9: the proportion of days with humidity = normal among the days when play is yes, i.e. the probability of humidity being normal given that play is yes
- 9/14: the proportion of days when play is yes
Outlook   Temperature  Humidity  Windy  Play
sunny     hot          high      false  No
sunny     hot          high      true   No
overcast  hot          high      false  Yes
rainy     mild         high      false  Yes
rainy     cool         normal    false  Yes
rainy     cool         normal    true   No
overcast  cool         normal    true   Yes
sunny     mild         high      false  No
sunny     cool         normal    false  Yes
rainy     mild         normal    false  Yes
sunny     mild         normal    true   Yes
overcast  mild         high      true   Yes
overcast  hot          normal    false  Yes
rainy     mild         high      true   No
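The counts and conditional probabilities in the table can be reproduced by counting over the 14 training days. The following is a minimal sketch, not the course's own code: the names `DATA`, `ATTRIBUTES` and `estimate` are illustrative, the outlook value is spelled `rainy` as in the counts table, and the four overcast rows are assumed to be those of the standard version of this weather dataset (they are consistent with the counts above).

```python
from collections import Counter
from fractions import Fraction

ATTRIBUTES = ("outlook", "temperature", "humidity", "windy")

# Each row: (outlook, temperature, humidity, windy, play).
DATA = [
    ("sunny", "hot", "high", "false", "no"),
    ("sunny", "hot", "high", "true", "no"),
    ("overcast", "hot", "high", "false", "yes"),
    ("rainy", "mild", "high", "false", "yes"),
    ("rainy", "cool", "normal", "false", "yes"),
    ("rainy", "cool", "normal", "true", "no"),
    ("overcast", "cool", "normal", "true", "yes"),
    ("sunny", "mild", "high", "false", "no"),
    ("sunny", "cool", "normal", "false", "yes"),
    ("rainy", "mild", "normal", "false", "yes"),
    ("sunny", "mild", "normal", "true", "yes"),
    ("overcast", "mild", "high", "true", "yes"),
    ("overcast", "hot", "normal", "false", "yes"),
    ("rainy", "mild", "high", "true", "no"),
]


def estimate(data):
    """Return (class_counts, cond), where cond[(attr, value, cls)] = P(value | cls).

    Combinations that never occur (e.g. overcast with play=no) are simply
    absent from cond; their uncorrected estimate would be 0.
    """
    class_counts = Counter(row[-1] for row in data)
    pair_counts = Counter()
    for row in data:
        cls = row[-1]
        # zip stops after the four attribute columns, skipping the class label.
        for attr, value in zip(ATTRIBUTES, row):
            pair_counts[(attr, value, cls)] += 1
    cond = {
        key: Fraction(n, class_counts[key[2]]) for key, n in pair_counts.items()
    }
    return class_counts, cond


class_counts, cond = estimate(DATA)
prior_yes = Fraction(class_counts["yes"], len(DATA))

print(cond[("humidity", "normal", "yes")])  # 6/9, printed in lowest terms as 2/3
print(prior_yes)                            # 9/14
```

`Fraction` is used so that the results can be compared exactly with the fractions on the slide rather than with rounded decimals.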
Naïve Bayes for the Tennis Example – cont.2
- For the new day E, P(yes|E) = ?
$$P(\text{yes} \mid E) = \frac{P(E_1 \mid \text{yes})\,P(E_2 \mid \text{yes})\,P(E_3 \mid \text{yes})\,P(E_4 \mid \text{yes})\,P(\text{yes})}{P(E)}$$
- ⇒ The evidence probabilities are read from the table:
P(E1|yes) = P(outlook=sunny|yes) = 2/9
P(E2|yes) = P(temperature=cool|yes) = 3/9
P(E3|yes) = P(humidity=high|yes) = 3/9
P(E4|yes) = P(windy=true|yes) = 3/9
- P(yes) = ? This is the probability of play=yes without knowing any E, i.e. anything about the particular day (the prior probability of yes): P(play=yes) = 9/14
Naïve Bayes for the Tennis Example – cont.3
- By substituting the respective evidence probabilities:
$$P(\text{yes} \mid E) = \frac{\frac{2}{9} \cdot \frac{3}{9} \cdot \frac{3}{9} \cdot \frac{3}{9} \cdot \frac{9}{14}}{P(E)} = \frac{0.0053}{P(E)}$$
- Similarly calculating P(no|E):
$$P(\text{no} \mid E) = \frac{\frac{3}{5} \cdot \frac{1}{5} \cdot \frac{4}{5} \cdot \frac{3}{5} \cdot \frac{5}{14}}{P(E)} = \frac{0.0206}{P(E)}$$
- => P(no|E) > P(yes|E)
- => for the new day play = no is more likely than play = yes (about 4 times more likely)
Outlook      Yes   No       Humidity   Yes   No
sunny        2/9   3/5      high       3/9   4/5
overcast     4/9   0/5      normal     6/9   1/5
rainy        3/9   2/5

Temperature  Yes   No       Windy      Yes   No
hot          2/9   2/5      true       3/9   3/5
mild         4/9   2/5      false      6/9   2/5
cool         3/9   1/5

Play=yes 9/14    Play=no 5/14
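The arithmetic for the new day can be checked with exact fractions. This is a small sketch under the assumption that the new day is E = (outlook=sunny, temperature=cool, humidity=high, windy=true), with all values copied from the probability table; the variable names are illustrative. The normalisation step at the end is not on the slide: it uses the fact that, with only two hypotheses, P(E) equals the sum of the two numerators.

```python
from fractions import Fraction as F
from math import prod

# P(E_i | class) for the new day, read from the probability table.
likelihoods = {
    "yes": [F(2, 9), F(3, 9), F(3, 9), F(3, 9)],
    "no": [F(3, 5), F(1, 5), F(4, 5), F(3, 5)],
}
priors = {"yes": F(9, 14), "no": F(5, 14)}

# Unnormalised scores: the numerators of P(class | E), with P(E) left out.
scores = {c: prod(likelihoods[c]) * priors[c] for c in priors}

# Dividing by the sum of the scores recovers proper posterior probabilities.
total = sum(scores.values())
posteriors = {c: scores[c] / total for c in scores}

for c in scores:
    print(c, round(float(scores[c]), 4), round(float(posteriors[c]), 3))
print("prediction:", max(scores, key=scores.get))
```

The scores come out as 0.0053 for yes and 0.0206 for no, matching the slide, and their ratio is roughly 3.9, which is the "4 times more likely" figure.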
A Problem with Naïve Bayes
- Suppose that the training data for the tennis example were different:
– outlook=sunny had always been associated with play=no (i.e. outlook=sunny had never occurred together with play=yes)
– then P(yes|outlook=sunny)=0 and P(no|outlook=sunny)=1, and the estimate that enters the Naïve Bayes product is P(outlook=sunny|yes)=0
– => the final probability P(yes|E)=0 for any sunny day, regardless of the other probabilities, i.e. zero probabilities hold a veto over the other probabilities:
$$P(\text{yes} \mid E) = \frac{P(E_1 \mid \text{yes})\,P(E_2 \mid \text{yes})\,P(E_3 \mid \text{yes})\,P(E_4 \mid \text{yes})\,P(\text{yes})}{P(E)} = 0$$
– If this happens in the training set => poor prediction on new data
- Solution: use the Laplace estimator (correction) to calculate probabilities
– Adds 1 to the numerator and k to the denominator, where k is the number of attribute values for a given attribute
Laplace Correction – Modified Tennis Example
outlook    yes   no    …
sunny      0     5     …
overcast   4     0     …
rainy      3     2     …

sunny      0/7   5/7   …
overcast   4/7   0/7   …
rainy      3/7   2/7   …
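- Laplace correction adds 1 to the numerator and 3 to the denominator
- Uncorrected estimates: P(sunny|yes)=0/7, P(overcast|yes)=4/7, P(rainy|yes)=3/7
- Corrected estimates:
$$P(\text{overcast} \mid \text{yes}) = \frac{4+1}{7+3} = \frac{5}{10} = 0.5$$
$$P(\text{rainy} \mid \text{yes}) = \frac{3+1}{7+3} = \frac{4}{10} = 0.4$$
$$P(\text{sunny} \mid \text{yes}) = \frac{0+1}{7+3} = \frac{1}{10} = 0.1$$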
Ensures that an attribute value which occurs 0 times will receive a nonzero (although small) probability.
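The correction is easy to express as a one-line function. The sketch below applies it to the outlook counts for play=yes from the modified example (0, 4 and 3 out of 7 days); the function name `laplace` and the dictionary names are illustrative, not taken from the course material.

```python
from fractions import Fraction


def laplace(count, class_total, k):
    """Laplace-corrected estimate: (count + 1) / (class_total + k)."""
    return Fraction(count + 1, class_total + k)


# Outlook counts among the 7 days with play=yes in the modified example.
yes_counts = {"sunny": 0, "overcast": 4, "rainy": 3}
class_total = sum(yes_counts.values())  # 7
k = len(yes_counts)                     # 3 possible outlook values

raw = {v: Fraction(n, class_total) for v, n in yes_counts.items()}
smoothed = {v: laplace(n, class_total, k) for v, n in yes_counts.items()}

for v in yes_counts:
    print(v, raw[v], smoothed[v])
```

Because 1 is added once for each of the k values and k is added to the denominator, the corrected estimates for one attribute still sum to 1 (here 1/10 + 5/10 + 4/10).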