
Decision Trees

Steven J Zeil

Old Dominion Univ.

Fall 2010


Outline

1. Univariate Trees: Classification, Regression, Pruning

2. Rule Extraction

3. Multivariate Trees

Decision Trees


Tree Structure

Internal nodes: split on one attribute (univariate) or on multiple attributes (multivariate)

Numeric x_i: relational split, x_i > w_m
Discrete x_i: n-way split on all possible values

Leaves

Classification: a class label, or labels with proportions
Regression: the average of the numeric outputs r, or a local fit

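As a concrete sketch of this structure (not from the slides; the field names attribute, threshold, children, and output are illustrative), a univariate tree node might be represented in Python as:

    from dataclasses import dataclass, field
    from typing import Optional

    @dataclass
    class Node:
        """One node of a univariate decision tree."""
        # Internal-node fields: which attribute to test and, for a numeric
        # attribute, the threshold w_m used in the split x_i > w_m.
        attribute: Optional[int] = None
        threshold: Optional[float] = None
        # Maps each branch label (True/False for numeric splits, the
        # attribute value for discrete splits) to a child Node.
        children: dict = field(default_factory=dict)
        # Leaf field: a class label (classification) or an average (regression).
        output: object = None

        def is_leaf(self) -> bool:
            return not self.children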


Classification Trees

For a node m, let N_m be the number of training instances that reach m, of which N_m^i are in class C_i.

P̂(C_i | x, m) ≡ p_m^i = N_m^i / N_m

Node m is pure if each p_m^i is 0 or 1.

The measure of purity is “entropy”. Variants: ID3, CART, C4.5
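A minimal sketch of this estimate in Python, assuming the class labels of the instances reaching node m are given as a plain list (the helper names are mine, not the slides'):

    from collections import Counter

    def class_proportions(labels):
        """p_m^i = N_m^i / N_m for each class C_i seen at node m."""
        n_m = len(labels)
        return {c: n_mi / n_m for c, n_mi in Counter(labels).items()}

    def is_pure(labels):
        """A node is pure when every instance reaching it has the same class."""
        return len(set(labels)) <= 1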


Entropy

Shannon’s entropy measures the expected number of bits required to encode a message.

Alternatively, it is the amount of information we are missing if we don’t know the message.

It represents a limit on what can be achieved with lossless compression.


Why Entropy?

Information content of x ∈ {v_1, ..., v_n}: I(x) = −log p(x)

The log (base 2) reflects the idea of encoding as a bit string.

Entropy: H(x) ≡ E[I(x)] = −Σ_x p(x) log p(x)

Why “entropy” rather than “expected information content”? This formula resembles one from physics, H = −k_B Σ_i p_i ln p_i, and, like the original entropy, information loss is minimized as order increases.
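A sketch of node entropy under these definitions, reusing the class_proportions helper sketched above:

    import math

    def entropy(labels):
        """H = -sum_i p_i log2(p_i) over the classes present at a node.

        Terms with p_i = 0 are skipped, following the convention
        0 log 0 = 0.
        """
        return -sum(p * math.log2(p)
                    for p in class_proportions(labels).values()
                    if p > 0)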


Entropy and Tree Nodes

Entropy is a measure of the impurity of a node. If all instances that reach a node are in the same class, then we lose no information by not asking which instances actually brought us here. By contrast, if the instances reaching this node are evenly distributed among the classes, we learn nothing by coming here that we did not know in the step before.
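Concretely, with the entropy sketch above:

    # A pure node carries no missing information:
    entropy(['A', 'A', 'A', 'A'])   # 0.0
    # An evenly split node is maximally impure (log2 K bits for K classes):
    entropy(['A', 'A', 'B', 'B'])   # 1.0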



Greedy Splitting

If node m is pure, generate a leaf and stop. Otherwise, split and continue recursively.

Impurity after the split: suppose N_mj of the N_m instances take branch j, and that N_mj^i of these belong to class C_i.

P̂(C_i | x, m, j) ≡ p_mj^i = N_mj^i / N_mj

I′_m = −Σ_{j=1..n} (N_mj / N_m) Σ_{i=1..K} p_mj^i log2 p_mj^i

Select the variable and split that minimizes impurity.

For numeric variables, include choices of split positions.
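A sketch of the post-split impurity I′_m, assuming each branch's class labels are passed as a separate list and reusing the entropy sketch above:

    def split_entropy(*branches):
        """I'_m: entropy of each branch j, weighted by N_mj / N_m."""
        n_m = sum(len(b) for b in branches)
        return sum(len(b) / n_m * entropy(b) for b in branches)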


GenerateTree(X)

if I(X) < θI then
    Create leaf labelled by majority class in X
else
    i ← SplitAttribute(X)
    for all branches of xi do
        Find Xi falling in branch
        GenerateTree(Xi)
    end for
end if
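A rough Python rendering of this pseudocode (a sketch, not the slides' implementation). It assumes the Node, Counter, and entropy pieces sketched earlier, discrete attributes only, X given as a list of (feature_tuple, label) pairs, and a split_attribute that, for convenience, also returns the partition of X (see the sketch after the next slide):

    def generate_tree(X, theta_i=0.1):
        """Grow a classification tree greedily (sketch of GenerateTree)."""
        labels = [y for _, y in X]
        if entropy(labels) < theta_i:
            # Node is (almost) pure: leaf labelled by the majority class.
            majority = Counter(labels).most_common(1)[0][0]
            return Node(output=majority)
        i, partition = split_attribute(X)
        node = Node(attribute=i)
        for value, X_i in partition.items():
            node.children[value] = generate_tree(X_i, theta_i)
        return node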


SplitAttribute(X)

MinEnt ← ∞
for all attributes i = 1, . . . , d do
    if xi is discrete with n values then
        Split X into X1, . . . , Xn by xi
        e ← SplitEntropy(X1, . . . , Xn)
        if e < MinEnt then
            MinEnt ← e
            bestf ← i
        end if
    else
        for all possible splits of numeric attribute xi do
            Split X into X1, X2 by xi
            e ← SplitEntropy(X1, X2)
            if e < MinEnt then
                MinEnt ← e
                bestf ← i
            end if
        end for
    end if
end for
return bestf
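A matching sketch of SplitAttribute under the same assumptions (discrete attributes only; the numeric case would add the threshold search shown above), using the split_entropy sketch from earlier:

    def split_attribute(X):
        """Return the attribute index minimizing split entropy, plus the
        partition of X it induces (sketch of SplitAttribute)."""
        min_ent, best_f, best_partition = float('inf'), None, None
        d = len(X[0][0])                       # number of attributes
        for i in range(d):
            partition = {}                     # branch value -> subset of X
            for x, y in X:
                partition.setdefault(x[i], []).append((x, y))
            e = split_entropy(*[[y for _, y in s] for s in partition.values()])
            if e < min_ent:
                min_ent, best_f, best_partition = e, i, partition
        return best_f, best_partition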

Regression Trees

Let b_m(x) = 1 if x ∈ X_m, and 0 otherwise.

Error at node m:

E_m = (1/N_m) Σ_t (r^t − g_m)² b_m(x^t), where g_m = Σ_t b_m(x^t) r^t / Σ_t b_m(x^t)

After splitting:

E′_m = (1/N_m) Σ_j Σ_t (r^t − g_mj)² b_mj(x^t), where g_mj = Σ_t b_mj(x^t) r^t / Σ_t b_mj(x^t)
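A sketch of these error computations in Python, with the indicator b_m replaced by simply passing in the responses r^t of the instances that reach each node or branch:

    def node_mean_and_error(r):
        """g_m and E_m for the responses r reaching node m."""
        g_m = sum(r) / len(r)
        e_m = sum((rt - g_m) ** 2 for rt in r) / len(r)
        return g_m, e_m

    def split_error(*branches):
        """E'_m: within-branch squared error, pooled over all branches."""
        n_m = sum(len(b) for b in branches)
        total = 0.0
        for b in branches:
            g_mj = sum(b) / len(b)
            total += sum((rt - g_mj) ** 2 for rt in b)
        return total / n_m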


Examples

[This slide shows figures only, not reproduced in this transcript; the pruning example below refers to the third of these trees.]

Pruning

Pre-pruning: stop generating nodes when the number of training instances reaching a node is small (e.g., 5% of the training set).

Post-pruning: grow the full tree, then prune subtrees that overfit the training set:

Set aside a pruning set. Replace each subtree by a leaf and evaluate on the pruning set. If the leaf performs as well as the subtree, keep the leaf.

Example: replace the lowest node (x < 6.31) in the third tree by a leaf with value 0.9.
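A hedged sketch of the post-pruning loop, assuming the Node sketch above with every node (not just leaves) storing its majority class in output, and an evaluate(tree, pruning_set) scorer supplied by the caller; neither assumption is from the slides:

    def prune(node, tree, pruning_set, evaluate):
        """Bottom-up reduced-error pruning (sketch).

        `evaluate(tree, pruning_set)` is an assumed scorer returning
        accuracy on the set-aside pruning set."""
        for child in node.children.values():
            prune(child, tree, pruning_set, evaluate)
        if node.is_leaf():
            return
        subtree_score = evaluate(tree, pruning_set)
        saved = node.children
        node.children = {}                  # tentatively replace by a leaf
        if evaluate(tree, pruning_set) < subtree_score:
            node.children = saved           # subtree was better: restore it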


Rule Extraction

Trees can be interpreted as programming instructions:

R1: IF (age > 38.5) AND (years-in-job > 2.5) THEN y = 0.8
R2: IF (age > 38.5) AND (years-in-job <= 2.5) THEN y = 0.6
R3: IF (age <= 38.5) AND (job-type = 'A') THEN y = 0.4
R4: IF (age <= 38.5) AND (job-type = 'B') THEN y = 0.3
R5: IF (age <= 38.5) AND (job-type = 'C') THEN y = 0.2

Called a rule base.
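A sketch of extracting such a rule base by walking a tree, assuming the Node sketch above and discrete splits (a numeric split would emit xi > wm / xi <= wm conditions instead; the condition format is illustrative):

    def extract_rules(node, conditions=()):
        """Yield one 'IF ... THEN ...' string per root-to-leaf path."""
        if node.is_leaf():
            yield 'IF ' + ' AND '.join(conditions) + f' THEN y = {node.output}'
            return
        for value, child in node.children.items():
            cond = f'(x{node.attribute} = {value!r})'
            yield from extract_rules(child, conditions + (cond,))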


Multivariate Trees
