Decision Tree Algorithm – Week 4 – PowerPoint PPT Presentation



SLIDE 1

Decision Tree Algorithm

Week 4

SLIDE 2

Team Homework Assignment #5

  • Read pp. 105 – 117 of the textbook.
  • Do Examples 3.1, 3.2, 3.3 and Exercise 3.4 (a). Prepare to present the results of the homework assignment.

  • Due date

– beginning of the lecture on Friday February 25th.

SLIDE 3

Team Homework Assignment #6

  • Decide on a data warehousing tool for your future homework assignments
  • Play with the data warehousing tool

  • Due date

– beginning of the lecture on Friday February 25th.

SLIDE 4

Classification – A Two-Step Process

  • Model usage: classifying future or unknown objects
    – Estimate accuracy of the model
      • The known label of test data is compared with the classified result from the model
      • Accuracy rate is the percentage of test set samples that are correctly classified by the model
    – If the accuracy is acceptable, use the model to classify data tuples whose class labels are not known
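The accuracy rate described above can be sketched in a few lines of Python; the test labels and model predictions below are made-up illustrative values, not data from the slides.

```python
def accuracy_rate(known_labels, predicted_labels):
    """Fraction of test-set tuples whose predicted class matches the known label."""
    correct = sum(k == p for k, p in zip(known_labels, predicted_labels))
    return correct / len(known_labels)

# Known labels of the test data vs. the model's classifications (illustrative)
known = ["yes", "no", "yes", "yes", "no"]
predicted = ["yes", "no", "no", "yes", "no"]
print(accuracy_rate(known, predicted))  # 4 of 5 correct -> 0.8
```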


SLIDE 5

Process (1): Model Construction

Figure 6.1 The data classification process: (a) Learning: Training data are analyzed by a classification algorithm. Here, the class label attribute is loan_decision, and the learned model or classifier is represented in the form of classification rules.

SLIDE 6

Figure 6.1 The data classification process: (b) Classification: Test data are used to estimate the accuracy of the classification rules. If the accuracy is considered acceptable, the rules can be applied to the classification of new data tuples.

SLIDE 7

Decision Tree Classification Example

SLIDE 8

Decision Tree Learning Overview

  • Decision tree learning is one of the most widely used and practical methods for inductive inference over supervised data.
  • A decision tree represents a procedure for classifying categorical data based on their attributes.
  • It is also efficient for processing large amounts of data, so it is often used in data mining applications.
  • The construction of a decision tree does not require any domain knowledge or parameter setting, and is therefore appropriate for exploratory knowledge discovery.
  • Their representation of acquired knowledge in tree form is intuitive and easy for humans to assimilate.

SLIDE 9

Decision Tree Algorithm – ID3

  • Decide which attribute (splitting point) to test at node N by determining the “best” way to separate or partition the tuples in D into individual classes
  • The splitting criterion is determined so that, ideally, the resulting partitions at each branch are as “pure” as possible
    – A partition is pure if all of the tuples in it belong to the same class

SLIDE 10

Figure 6.3 Basic algorithm for inducing a decision tree from training examples.
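Figure 6.3 itself is not reproduced in this text, so the following is a hedged Python sketch of the basic induction loop it describes (pick the best splitting attribute, partition the tuples, recurse until the partition is pure or no attributes remain). Function and variable names are my own, not taken from the figure.

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy of a list of class labels."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def best_attribute(rows, labels, attributes):
    """Pick the attribute whose split minimizes expected entropy
    (equivalently, maximizes information gain)."""
    def expected_entropy(a):
        splits = {}
        for row, lab in zip(rows, labels):
            splits.setdefault(row[a], []).append(lab)
        return sum(len(s) / len(labels) * entropy(s) for s in splits.values())
    return min(attributes, key=expected_entropy)

def id3(rows, labels, attributes):
    """rows: list of dicts mapping attribute name -> categorical value."""
    if len(set(labels)) == 1:          # pure partition: make a leaf
        return labels[0]
    if not attributes:                 # no attributes left: majority class
        return Counter(labels).most_common(1)[0][0]
    a = best_attribute(rows, labels, attributes)
    tree = {a: {}}
    for value in sorted({row[a] for row in rows}):
        idx = [i for i, row in enumerate(rows) if row[a] == value]
        tree[a][value] = id3([rows[i] for i in idx],
                             [labels[i] for i in idx],
                             [x for x in attributes if x != a])
    return tree

print(id3([{"x": "a"}, {"x": "a"}, {"x": "b"}],
          ["yes", "yes", "no"], ["x"]))  # {'x': {'a': 'yes', 'b': 'no'}}
```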


SLIDE 11

What is Entropy?

  • The entropy is a measure of the uncertainty associated with a random variable
  • As uncertainty and/or randomness increases for a result set, so does the entropy
  • Values range from 0 – 1 to represent the entropy of information

Entropy(D) ≡ −Σ_{i=1}^{c} p_i log₂(p_i)
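The entropy formula can be checked with a few lines of Python; this sketch takes the class probabilities p_i directly.

```python
import math

def entropy(probabilities):
    # Entropy(D) = -sum_i p_i * log2(p_i); a zero probability contributes 0
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

print(entropy([0.5, 0.5]))    # 1.0: two equally likely classes (maximum uncertainty)
print(entropy([9/14, 5/14]))  # ~0.940: the class distribution used on the later slides
```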

SLIDE 12

Entropy Example (1)

SLIDE 13

Entropy Example (2)

SLIDE 14

Entropy Example (3)

SLIDE 15

Entropy Example (4)

SLIDE 16

Information Gain

  • Information gain is used as an attribute selection measure
  • Pick the attribute that has the highest information gain

Gain(A, D) = Entropy(D) − Σ_{j=1}^{v} (|Dj| / |D|) × Entropy(Dj)

D: a given data partition
A: an attribute
v: the number of distinct values of attribute A
D is split into v partitions or subsets {D1, D2, …, Dv}, where Dj contains those tuples in D that have outcome aj of A.
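A small sketch of Gain(A, D) as defined above; the function and variable names are illustrative, not from the slides.

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def information_gain(values, labels):
    """Gain(A, D) = Entropy(D) - sum_j |Dj|/|D| * Entropy(Dj), where Dj
    groups the tuples of D that share one distinct value of attribute A."""
    partitions = {}
    for v, lab in zip(values, labels):
        partitions.setdefault(v, []).append(lab)
    expected = sum(len(dj) / len(labels) * entropy(dj)
                   for dj in partitions.values())
    return entropy(labels) - expected

# A perfect split recovers all of Entropy(D); an uninformative one gains nothing
print(information_gain(["a", "a", "b", "b"], ["yes", "yes", "no", "no"]))  # 1.0
print(information_gain(["a", "b", "a", "b"], ["yes", "yes", "no", "no"]))  # 0.0
```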


SLIDE 17

Table 6.1 Class-labeled training tuples from AllElectronics customer database.

SLIDE 18
  • Class P: buys_computer = “yes”
  • Class N: buys_computer = “no”

Entropy(D) = −(9/14) log₂(9/14) − (5/14) log₂(5/14) = 0.940

  • Compute the expected information requirement for each attribute: start with the attribute age

Gain(age, D) = Entropy(D) − Σ_{v ∈ {youth, middle_aged, senior}} (|Sv| / 14) × Entropy(Sv)
             = Entropy(D) − (5/14) Entropy(S_youth) − (4/14) Entropy(S_middle_aged) − (5/14) Entropy(S_senior)
             = 0.246

Gain(income, D) = 0.029
Gain(student, D) = 0.151
Gain(credit_rating, D) = 0.048
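These numbers can be reproduced directly. The age and buys_computer columns below are transcribed from the standard AllElectronics example that Table 6.1 refers to; this is an assumption, since the table itself is not reproduced in this text.

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

# age and buys_computer columns of the 14 training tuples (assumed transcription
# of Table 6.1, which is only referenced here, not shown)
age = ["youth", "youth", "middle_aged", "senior", "senior", "senior",
       "middle_aged", "youth", "youth", "senior", "youth",
       "middle_aged", "middle_aged", "senior"]
buys = ["no", "no", "yes", "yes", "yes", "no", "yes",
        "no", "yes", "yes", "yes", "yes", "yes", "no"]

# Partition the class labels by the value of age, then apply the Gain formula
parts = {}
for a, b in zip(age, buys):
    parts.setdefault(a, []).append(b)
gain_age = entropy(buys) - sum(len(d) / len(buys) * entropy(d)
                               for d in parts.values())
print(round(entropy(buys), 3))  # 0.94
print(round(gain_age, 3))       # 0.247 (the slide's 0.246 rounds intermediate values)
```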


SLIDE 19

Figure 6.5 The attribute age has the highest information gain and therefore becomes the splitting attribute at the root node of the decision tree. Branches are grown for each outcome of age. The tuples are shown partitioned accordingly.

SLIDE 20

Figure 6.2 A decision tree for the concept buys_computer, indicating whether a customer at AllElectronics is likely to purchase a computer. Each internal (nonleaf) node represents a test on an attribute. Each leaf node represents a class (either buys_computer = yes or buys_computer = no).

SLIDE 21

Exercise

Construct a decision tree to classify “golf play.”

Weather and Possibility of Golf Play

Weather  Temperature  Humidity  Wind  Golf Play
fine     hot          high      none  no
fine     hot          high      few   no
cloud    hot          high      none  yes
rain     warm         high      none  yes
rain     cold         medium    none  yes
rain     cold         medium    few   no
cloud    cold         medium    few   yes
fine     warm         high      none  no
fine     cold         medium    none  yes
rain     warm         medium    none  yes
fine     warm         medium    few   yes
cloud    warm         high      few   yes
cloud    hot          medium    none  yes
rain     warm         high      few   no
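As a first step toward the exercise, the sketch below computes the information gain of each attribute of the golf table to pick the root split; the rows are transcribed from the table above (with the humidity value normalized to "medium"), and the helper names are my own.

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def gain(column, labels):
    """Gain(A, D): entropy of D minus the expected entropy after splitting on A."""
    parts = {}
    for v, lab in zip(column, labels):
        parts.setdefault(v, []).append(lab)
    return entropy(labels) - sum(len(d) / len(labels) * entropy(d)
                                 for d in parts.values())

# (Weather, Temperature, Humidity, Wind, Golf Play) rows of the exercise table
rows = [
    ("fine", "hot", "high", "none", "no"),
    ("fine", "hot", "high", "few", "no"),
    ("cloud", "hot", "high", "none", "yes"),
    ("rain", "warm", "high", "none", "yes"),
    ("rain", "cold", "medium", "none", "yes"),
    ("rain", "cold", "medium", "few", "no"),
    ("cloud", "cold", "medium", "few", "yes"),
    ("fine", "warm", "high", "none", "no"),
    ("fine", "cold", "medium", "none", "yes"),
    ("rain", "warm", "medium", "none", "yes"),
    ("fine", "warm", "medium", "few", "yes"),
    ("cloud", "warm", "high", "few", "yes"),
    ("cloud", "hot", "medium", "none", "yes"),
    ("rain", "warm", "high", "few", "no"),
]
headers = ("Weather", "Temperature", "Humidity", "Wind")
labels = [r[4] for r in rows]
gains = {h: gain([r[i] for r in rows], labels) for i, h in enumerate(headers)}
root = max(gains, key=gains.get)
print(root)  # Weather has the highest gain, so it becomes the root split
```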
