
Decision trees - PRISM, Nicolas Sutton-Charani, 20/01/2020



  1. Decision trees - PRISM, Nicolas Sutton-Charani, 20/01/2020

  2. Plan
     1. Introduction
     2. Use of decision trees
        2.1 Prediction
        2.2 Interpretability: Descriptive data analysis
     3. Learning of decision trees
        3.1 Purity criteria
        3.2 Stopping criteria
        3.3 Learning algorithm
     4. Pruning of decision trees
        4.1 Cost-complexity trade-off
     5. Extension: Random forests

  3.-5. Introduction - What is a decision tree? → supervised learning
     [Figure, built up over three slides: a tree whose internal nodes test attributes (J1, J2, J3, J4), whose branches carry attribute values, and whose leaves carry label predictions.]

  6. Introduction - A little history
     ⚠ Decision trees in machine learning (or data mining) ≠ decision trees in decision theory.

  7. Introduction - Types of decision trees
     Type of class label: numerical → regression tree; nominal → classification tree.
     Type of algorithm (→ structure): CART (statistics, binary trees); C4.5 (computer science, small trees).
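
As a concrete illustration (not from the slides), both tree types are available in scikit-learn, whose trees are CART-style binary trees; a minimal sketch:

```python
# Classification tree vs. regression tree in scikit-learn (CART-style).
from sklearn.datasets import load_iris, load_diabetes
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

# Nominal class label -> classification tree (Gini impurity by default).
X, y = load_iris(return_X_y=True)
clf = DecisionTreeClassifier(criterion="gini", max_depth=3).fit(X, y)

# Numerical class label -> regression tree (squared-error / variance criterion).
Xr, yr = load_diabetes(return_X_y=True)
reg = DecisionTreeRegressor(criterion="squared_error", max_depth=3).fit(Xr, yr)

print(clf.predict(X[:2]), reg.predict(Xr[:2]))
```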

  8. Plan: 2. Use of decision trees - 2.1 Prediction

  9. Use of decision trees - Prediction - Classification trees: Will the badminton match take place?

  10. Use of decision trees - Prediction - Classification trees: What fruit is it?

  11. Use of decision trees - Prediction - Classification trees: Will he/she come to my party?

  12. Use of decision trees - Prediction - Classification trees: Will they wait?

  13. Use of decision trees - Prediction - Classification trees: Who will win the election in this county?

  14. Use of decision trees - Prediction - Regression trees: What grade will a student get (given their homework average grade)?
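
A toy sketch of slide 14's regression setting, with made-up homework and exam grades (none of these numbers are from the slides):

```python
# Hypothetical data: predict an exam grade from the homework average.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

homework_avg = np.array([[6.0], [8.5], [11.0], [13.5], [15.0], [17.5]])
exam_grade = np.array([7.0, 9.0, 10.5, 12.0, 14.5, 16.0])

tree = DecisionTreeRegressor(max_depth=2).fit(homework_avg, exam_grade)
print(tree.predict([[12.0]]))  # a piecewise-constant prediction
```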

  15. Plan: 2. Use of decision trees - 2.2 Interpretability: Descriptive data analysis

  16. Use of decision trees - Interpretability: Descriptive data analysis
     Trees are very interpretable: they partition the attribute space, so a tree can be summarized by its leaves, which define a mixture of distributions → a wonderful collaboration tool with domain experts.
     ⚠ INSTABILITY ← overfitting!
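
One way to see this interpretability in practice (an illustration, not from the slides) is scikit-learn's rule printout, where each leaf is one cell of the attribute-space partition:

```python
# Print a fitted tree as explicit if/else rules.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_iris()
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(data.data, data.target)
print(export_text(tree, feature_names=list(data.feature_names)))
```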

  17. Learning of decision trees - Formalism
     Learning dataset (supervised learning): $\{(x_i, y_i)\}_{i=1}^{N}$ with $x_i = (x_i^1, \dots, x_i^J)$; samples are assumed to be i.i.d.
     Attributes: $X = (X^1, \dots, X^J) \in \mathcal{X} = \mathcal{X}^1 \times \dots \times \mathcal{X}^J$, where each space $\mathcal{X}^j$ can be categorical or numerical.
     Class label: $Y \in \Omega = \{\omega_1, \dots, \omega_K\}$ ($Y \in \mathbb{R}$ for regression).
     Tree: a partition $P_H = \{t_1, \dots, t_H\}$ into $H$ leaves, with $|t_h| = \#\{i : x_i \in t_h\}$ and $\pi_h = P(t_h) \approx \frac{|t_h|}{N}$.
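
In code this formalism is just two arrays (a hypothetical toy instantiation; the symbols map one-to-one):

```python
# N samples, J attributes: X has shape (N, J), y holds the class labels.
import numpy as np

X = np.array([[1.0, 0.0],    # x_1 = (x_1^1, ..., x_1^J)
              [2.0, 1.0],
              [0.5, 0.0],
              [3.0, 1.0]])
y = np.array([0, 1, 0, 1])   # y_i in Omega = {omega_1, ..., omega_K}, K = 2
N, J = X.shape

# A leaf t_h is a subset of the samples; pi_h = P(t_h) ~ |t_h| / N.
in_t_h = X[:, 0] > 1.5       # e.g. the leaf reached by the split X^1 > 1.5
pi_h = in_t_h.sum() / N
```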

  18.-21. Learning of decision trees - Recursive partitioning
     [Figure sequence over four slides; images not in the transcript.]

  22. Learning of decision trees - Learning principle
     Start with the whole dataset in the initial node. Choose the best splits (on attributes) in order to get pure leaves.
     Classification trees: purity = homogeneity in terms of class labels.
     - CART → Gini impurity: $i(t_h) = \sum_{k=1}^{K} p_k (1 - p_k)$ with $p_k = P(Y = \omega_k \mid t_h)$
     - ID3, C4.5 → Shannon entropy: $i(t_h) = -\sum_{k=1}^{K} p_k \log_2(p_k)$
     Regression trees: purity = low variance of class labels
     → $i(t_h) = \widehat{Var}(Y \mid t_h) = \frac{1}{|t_h|} \sum_{x_i \in t_h} \big(y_i - \widehat{E}(Y \mid t_h)\big)^2$ with $\widehat{E}(Y \mid t_h) = \frac{1}{|t_h|} \sum_{x_i \in t_h} y_i$
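
The three impurity measures in plain numpy (a sketch, not library code):

```python
import numpy as np

def gini(p):
    """CART: i(t) = sum_k p_k (1 - p_k) for class proportions p."""
    p = np.asarray(p, dtype=float)
    return float(np.sum(p * (1.0 - p)))

def entropy(p):
    """ID3/C4.5: i(t) = -sum_k p_k log2(p_k), with 0 log 0 taken as 0."""
    p = np.asarray(p, dtype=float)
    nz = p[p > 0]
    return float(-np.sum(nz * np.log2(nz)))

def variance(y):
    """Regression trees: i(t) = (1/|t|) sum_i (y_i - mean(y))^2."""
    y = np.asarray(y, dtype=float)
    return float(np.mean((y - y.mean()) ** 2))

print(gini([0.5, 0.5]), entropy([0.5, 0.5]))  # 0.5, 1.0 (maximal for K = 2)
```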

  23. Learning of decision trees - Impurity measures
     [Figure; image not in the transcript.]

  24. Plan: 3. Learning of decision trees - 3.1 Purity criteria

  25.-26. Learning of decision trees - Purity criteria
     The leaf to split, $t_h$, is split on an attribute into children $t_L$ and $t_R$.
     Impurity measure + tree structure → splitting criterion:
     - CART, ID3: purity gain $\Delta i = i(t_h) - \pi_L \, i(t_L) - \pi_R \, i(t_R)$
     - C4.5: information gain ratio $IGR = \frac{\Delta i}{H(\pi_L, \pi_R)}$
     - Regression trees (CART): variance minimisation → same $\Delta i$, with the variance as impurity
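
The same criteria in a few lines of numpy (a sketch; i_parent, i_left, i_right are impurities from any of the measures above):

```python
import numpy as np

def purity_gain(i_parent, i_left, i_right, n_left, n_right):
    """CART/ID3: delta_i = i(t_h) - pi_L * i(t_L) - pi_R * i(t_R)."""
    n = n_left + n_right
    pi_l, pi_r = n_left / n, n_right / n
    return i_parent - pi_l * i_left - pi_r * i_right

def gain_ratio(delta_i, n_left, n_right):
    """C4.5: IGR = delta_i / H(pi_L, pi_R); the split entropy normalizes
    away the bias toward splits with many small children."""
    n = n_left + n_right
    pi_l, pi_r = n_left / n, n_right / n
    h_split = -(pi_l * np.log2(pi_l) + pi_r * np.log2(pi_r))
    return delta_i / h_split

# Example: parent Gini 0.5, left child pure (0.0), right child Gini ~0.278.
delta = purity_gain(0.5, 0.0, 0.278, 4, 6)
print(delta, gain_ratio(delta, 4, 6))
```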

  27. Plan: 3. Learning of decision trees - 3.2 Stopping criteria

  28. Learning of decision trees - Stopping criteria (pre-pruning)
     For all leaves $\{t_h\}_{h=1,\dots,H}$ and their potential children:
     - leaf purity: $\exists k \in \{1, \dots, K\} : p_k = 1$
     - leaf and child sizes: $|t_h| \le \text{minLeafSize}$
     - leaf and child weights: $\pi_h = \frac{|t_h|}{|t_0|} \le \text{minLeafProba}$
     - number of leaves: $H \ge \text{maxNumberLeaves}$
     - tree depth: $\text{depth}(P_H) \ge \text{maxDepth}$
     - purity gain: $\Delta i \le \text{minPurityGain}$
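
For orientation (an analogy I am drawing, not something on the slide), most of these thresholds have direct counterparts among scikit-learn's pre-pruning hyperparameters:

```python
# Approximate mapping from the slide's thresholds to scikit-learn's
# hyperparameters (leaf purity needs no parameter: a pure node is never split).
from sklearn.tree import DecisionTreeClassifier

tree = DecisionTreeClassifier(
    min_samples_leaf=5,             # ~ minLeafSize: lower bound on |t_h|
    min_weight_fraction_leaf=0.01,  # ~ minLeafProba: lower bound on pi_h
    max_leaf_nodes=20,              # ~ maxNumberLeaves: upper bound on H
    max_depth=6,                    # ~ maxDepth
    min_impurity_decrease=1e-3,     # ~ minPurityGain: required delta_i
)
```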

  29. Plan: 3. Learning of decision trees - 3.3 Learning algorithm

  30. Learning of decision trees - Learning algorithm (recursive partitioning)
     Result: learnt tree
     Start with all the learning data in an initial node (a single leaf);
     while the stopping criteria are not met for all leaves do
         for each splittable leaf do
             compute the purity gain of every possible split;
         end
         SPLIT: apply the split achieving the maximum purity gain;
     end
     prune the obtained tree;
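
A runnable condensation of this loop (a sketch only: CART-style binary splits on numeric attributes with Gini impurity, two of the stopping criteria, and no pruning):

```python
import numpy as np

def gini_counts(y):
    """Gini impurity of a label vector."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return float(np.sum(p * (1 - p)))

def grow(X, y, depth=0, max_depth=3, min_leaf=2):
    """Recursively split (X, y); returns a nested-dict tree."""
    # Stopping criteria: pure leaf, too few samples, or maximum depth reached.
    if len(np.unique(y)) == 1 or len(y) < 2 * min_leaf or depth >= max_depth:
        values, counts = np.unique(y, return_counts=True)
        return {"leaf": values[np.argmax(counts)]}   # majority-vote prediction
    best = None
    for j in range(X.shape[1]):                      # every attribute ...
        for s in np.unique(X[:, j])[:-1]:            # ... and every threshold
            left = X[:, j] <= s
            if left.sum() < min_leaf or (~left).sum() < min_leaf:
                continue
            # Purity gain: i(t_h) - pi_L * i(t_L) - pi_R * i(t_R)
            gain = (gini_counts(y) - left.mean() * gini_counts(y[left])
                    - (~left).mean() * gini_counts(y[~left]))
            if best is None or gain > best[0]:
                best = (gain, j, s, left)
    if best is None:                                 # no admissible split
        values, counts = np.unique(y, return_counts=True)
        return {"leaf": values[np.argmax(counts)]}
    _, j, s, left = best
    return {"split": (j, s),
            "L": grow(X[left], y[left], depth + 1, max_depth, min_leaf),
            "R": grow(X[~left], y[~left], depth + 1, max_depth, min_leaf)}
```

grow(X, y) returns a nested dict; predicting for a new sample means walking it from the root, following "L" or "R" according to each stored (attribute, threshold) test.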

  31. Learning of decision trees - Learning algorithm - ID3: training examples [9+,5-]
     [Table of examples not in the transcript.]

  32.-33. Learning of decision trees - Learning algorithm - ID3: selecting the next attribute
     [Figures not in the transcript.]
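
The figures for slides 31-33 are not in the transcript; assuming the classic PlayTennis dataset (Mitchell, 1997), which the [9+,5-] counts suggest, the root entropy and one attribute's information gain work out as:

```python
# Worked numbers for slides 31-33, assuming Mitchell's PlayTennis dataset.
from math import log2

def entropy(pos, neg):
    total = pos + neg
    return -sum(p / total * log2(p / total) for p in (pos, neg) if p)

root = entropy(9, 5)                       # ~0.940 bits for [9+,5-]
# Splitting on Humidity: High -> [3+,4-], Normal -> [6+,1-]
gain_humidity = root - 7/14 * entropy(3, 4) - 7/14 * entropy(6, 1)
print(round(root, 3), round(gain_humidity, 3))  # 0.940, 0.151
```

ID3 computes such a gain for every attribute and splits on the one with the maximum.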

