
Comparing CART and Random Forest for the monitoring of wetland vegetation with multispectral data - PowerPoint PPT Presentation



  1. Comparing CART and Random Forest for the monitoring of wetland vegetation with multispectral data. Julie Campagna, PhD student, Angers, France. Aurélie Davranche, Associate Professor, Angers, France. IBS-DR Biometry Workshop, cnrs.fr, Würzburg University, 9 October 2015. Workshop Würzburg 10/2015

  2. SUMMARY
     - Decision trees: generalities
     - CART and Random Forest presentation
     - How CART works
     - How Random Forest works
     - Example of application: remote sensing

  3. DECISION TREES
     - Method of classification (or regression)
     - Non-parametric method
     - Can deal with large amounts of data
     - Splits the samples to obtain classes that are as homogeneous as possible
     - Existing separability criteria:
       - Gini index: CART
       - Chi-square automatic interaction detection: CHAID
       - Shannon entropy: C5.0

  4. COMPARISON OF CART AND RANDOM FOREST
     Two decision tree methods developed essentially by Breiman et al.
     - CART was the first, in 1984
     - Random Forest followed in 2001
     - Different applications: biology, medicine, remote sensing, …
     - Both deal with large numbers of samples and variables
     - Neither is perturbed by extreme values or by unnecessary variables

  5. CART: FUNDAMENTALS
     - CART uses the Gini criterion to split a training sample:

       $$I = 1 - \sum_{i=1}^{n} f_i^2$$

       where $n$ is the number of classes to predict and $f_i$ is the frequency of class $i$ in the node
     - Dichotomous partitioning
     - Decision rules appear
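
The Gini criterion and the dichotomous partitioning above can be illustrated with a minimal sketch. It is not the rpart implementation: the function names `gini` and `best_split`, the variable values, and the labels are all hypothetical, and only a single numeric variable is searched.

```python
from collections import Counter

def gini(labels):
    """Gini impurity I = 1 - sum_i f_i^2 for a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return 1.0 - sum((count / total) ** 2 for count in Counter(labels).values())

def best_split(values, labels):
    """Return (threshold, weighted_gini) of the best binary split 'value <= threshold'."""
    pairs = sorted(zip(values, labels))
    best = (None, gini(labels))  # start from the unsplit node
    for k in range(1, len(pairs)):
        if pairs[k - 1][0] == pairs[k][0]:
            continue  # no threshold can separate two equal values
        threshold = (pairs[k - 1][0] + pairs[k][0]) / 2
        left = [lab for _, lab in pairs[:k]]
        right = [lab for _, lab in pairs[k:]]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if weighted < best[1]:
            best = (threshold, weighted)
    return best

# Hypothetical values of one descriptive variable and presence (1) / absence (2) labels
index_values = [0.10, 0.15, 0.20, 0.55, 0.60, 0.70]
classes = [2, 2, 2, 1, 1, 1]
threshold, impurity = best_split(index_values, classes)
print(threshold, impurity)
```

Each retained threshold reads directly as a decision rule of the form "variable <= threshold", which is what makes the resulting tree explicit.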

  6. CART: IMPROVEMENT
     - Sample: 75% for the training sample and 25% for validation
     - Choosing the resulting tree: 10-fold cross-validation (Esposito et al., CV-1SE)
     - Workflow: sample (75% / 25%) → 10-fold cross-validation → minimal error → final tree → validation → accuracy
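
Assuming that CV-1SE refers to the usual one-standard-error rule (keep the smallest tree whose cross-validated error lies within one standard error of the minimum), the selection step could be sketched as follows. The function name `one_se_choice` and the per-fold error rates are hypothetical and only illustrate the rule.

```python
import math
from statistics import mean, stdev

def one_se_choice(cv_errors_by_size):
    """Pick the smallest tree whose mean CV error is within one SE of the minimum.

    cv_errors_by_size maps a tree size (number of leaves) to its per-fold error rates.
    """
    summary = {}
    for size, errors in cv_errors_by_size.items():
        standard_error = stdev(errors) / math.sqrt(len(errors))
        summary[size] = (mean(errors), standard_error)
    best_size = min(summary, key=lambda s: summary[s][0])
    best_mean, best_se = summary[best_size]
    eligible = [s for s, (m, _) in summary.items() if m <= best_mean + best_se]
    return min(eligible)

# Hypothetical 10-fold error rates for four candidate pruned trees
cv_errors = {
    2: [0.10, 0.12, 0.09, 0.11, 0.10, 0.10, 0.11, 0.09, 0.12, 0.10],
    5: [0.05, 0.06, 0.04, 0.06, 0.05, 0.06, 0.05, 0.04, 0.05, 0.05],
    9: [0.04, 0.06, 0.03, 0.07, 0.05, 0.05, 0.04, 0.03, 0.06, 0.05],
    15: [0.05, 0.07, 0.04, 0.07, 0.06, 0.05, 0.05, 0.04, 0.07, 0.05],
}
chosen_size = one_se_choice(cv_errors)
print(chosen_size)
```

In this made-up example the 9-leaf tree has the lowest mean error, but the 5-leaf tree falls within one standard error of it and is preferred because it is simpler.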

  7. CART: PRUNING RESULT

  8. CART: PARAMETERS
     - CART was implemented in R using the rpart package
     - Presence = "1"; absence = "2"
     - Unbalanced sample: the optimal "prior" parameter is found by iterative runs of the algorithm
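
To see why the prior matters for an unbalanced sample, the sketch below reweights the class proportions inside a node by the prior, following the class-probability formula given in the rpart documentation as understood here (prior times the share of each class that reaches the node, then normalised). It is written in Python for illustration only; the function name and all counts are hypothetical, and the actual search over priors used by the authors is not reproduced.

```python
def node_class_probabilities(node_counts, total_counts, priors):
    """Prior-adjusted class probabilities inside one node.

    node_counts:  class -> number of samples of that class in the node
    total_counts: class -> number of samples of that class in the whole training set
    priors:       class -> prior probability (should sum to 1)
    """
    weighted = {
        c: priors[c] * node_counts.get(c, 0) / total_counts[c] for c in total_counts
    }
    norm = sum(weighted.values())
    return {c: w / norm for c, w in weighted.items()}

# Hypothetical unbalanced sample: 50 presences (class 1), 1000 absences (class 2)
totals = {1: 50, 2: 1000}
node = {1: 20, 2: 40}

# Priors proportional to the data: the node looks dominated by absences
data_priors = {1: 50 / 1050, 2: 1000 / 1050}
p_data = node_class_probabilities(node, totals, data_priors)

# Equal priors: the same node is now labelled as presence
equal_priors = {1: 0.5, 2: 0.5}
p_equal = node_class_probabilities(node, totals, equal_priors)
print(p_data, p_equal)
```

In R, a prior is typically passed to rpart through `parms = list(prior = ...)`; trying several candidate values and keeping the one with the best validation accuracy is one plausible reading of "iterative runs".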

  9. RANDOM FOREST: GENERAL OPERATION
     - RF grows many classification trees
     - To classify a sample, its vector of variables is put down each of the trees in the forest
     - Each tree gives a classification: we say the tree "votes" for that class
     - The forest chooses the classification having the most votes (over all the trees in the forest)
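
The voting step can be sketched in a few lines. The three "trees" below are hypothetical one-rule stand-ins with made-up variable names and thresholds; a real forest would hold hundreds of fitted trees.

```python
from collections import Counter

def forest_vote(trees, sample):
    """Return the class that receives the most votes over all trees."""
    votes = Counter(tree(sample) for tree in trees)
    return votes.most_common(1)[0][0]

# Hypothetical trees: each maps a sample to class 1 (presence) or 2 (absence)
trees = [
    lambda s: 1 if s["ndvi"] > 0.5 else 2,
    lambda s: 1 if s["nir"] > 0.4 else 2,
    lambda s: 1 if s["ndvi"] > 0.4 else 2,
]
sample = {"ndvi": 0.6, "nir": 0.3}
predicted = forest_vote(trees, sample)
print(predicted)
```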

  10. RANDOM FOREST: STEP ONE
      - For each tree, a bootstrap sample is drawn at random with replacement: about 2/3 of the samples are used as the training set and the remaining 1/3 (Out Of Bag, OOB) for validation
      - At each node, a subset of the variables is chosen at random (generally sqrt(number of variables))
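
The two sources of randomness can be sketched as follows, assuming the standard bootstrap (draw as many samples as the dataset holds, with replacement) and a square-root rule for the number of candidate variables. The function names are hypothetical; 49 matches the number of descriptive variables mentioned later in the talk, and 1000 samples is an arbitrary size.

```python
import math
import random

def bootstrap_split(n_samples, rng):
    """Draw n_samples indices with replacement; indices never drawn form the OOB set."""
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    oob = sorted(set(range(n_samples)) - set(in_bag))
    return in_bag, oob

def candidate_variables(n_variables, rng):
    """Pick about sqrt(n_variables) distinct variables to try at one node."""
    mtry = max(1, int(math.sqrt(n_variables)))
    return rng.sample(range(n_variables), mtry)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
in_bag, oob = bootstrap_split(1000, rng)
oob_fraction = len(oob) / 1000
variables = candidate_variables(49, rng)
print(round(oob_fraction, 3), variables)
```

The OOB fraction comes out close to 1/3 (about 37% in expectation), which is where the "2/3 for training, 1/3 for validation" rule of thumb comes from.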

  11. RANDOM FOREST: STEP TWO, FOREST CONSTRUCTION

  12. RANDOM FOREST: PARAMETERS
      - Cannot deal with unbalanced samples
      - Two ways to adjust the data:
        - Up-sampling based on the size of the largest class
        - Down-sampling based on the size of the smallest class
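
Both adjustments can be sketched with simple random resampling. This is an illustration under assumptions, not the authors' procedure: the function names are hypothetical, up-sampling here duplicates randomly chosen minority samples, and the toy dataset is made up.

```python
import random
from collections import defaultdict

def group_by_class(samples):
    """Group (features, label) pairs by label."""
    groups = defaultdict(list)
    for features, label in samples:
        groups[label].append((features, label))
    return groups

def down_sample(samples, rng):
    """Keep a random subset of every class, sized like the smallest class."""
    groups = group_by_class(samples)
    size = min(len(g) for g in groups.values())
    return [s for g in groups.values() for s in rng.sample(g, size)]

def up_sample(samples, rng):
    """Repeat randomly chosen samples until every class matches the largest class."""
    groups = group_by_class(samples)
    size = max(len(g) for g in groups.values())
    balanced = []
    for g in groups.values():
        balanced.extend(g)
        balanced.extend(rng.choices(g, k=size - len(g)))
    return balanced

# Hypothetical unbalanced sample: 5 presences (1) and 20 absences (2)
data = [([i], 1) for i in range(5)] + [([i], 2) for i in range(20)]
rng = random.Random(1)
down = down_sample(data, rng)
up = up_sample(data, rng)
print(len(down), len(up))
```

Down-sampling discards information from the large class, while up-sampling repeats samples of the small class, so the two variants can give different error estimates.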

  13. EXAMPLE OF APPLICATION: REMOTE SENSING
      - Satellite images are useful for monitoring wetland environments
      - In this case we used a high spatial resolution image (WorldView-2) of the Camargue, in the south of France
      [image removed]
      - Needs:
        - Mapping the vegetation
        - Creating a method that is easy to apply without knowledge of remote sensing or R programming

  14. SAMPLE
      - 21 land-cover classes from field data
      - 49 descriptive variables: reflectance values from the spectral bands and multispectral indices
      [Figure: bar chart of class size per land-cover class, axis from 0 to 180]

  15. EXAMPLE OF APPLICATION: REMOTE SENSING
      Classification of Salicornia fruticosa
      [image removed]

  16. EXAMPLE OF APPLICATION: REMOTE SENSING
      Cartography results: Sarcocornia fruticosa

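  17. EXAMPLE OF APPLICATION: REMOTE SENSING
      Confusion matrices (rows: produced map; columns: reference map)

      Random Forest:

      | Classification | Produced map | Reference class 1 | Reference class 2 | Overall accuracy | OOB error |
      |---|---|---|---|---|---|
      | RF_Up | Class 1 | 858 | 9 | 0.991 | 0.26% |
      | RF_Up | Class 2 | 0 | 1822 | | |
      | RF_Up | Total | 858 | 1831 | | |
      | RF_Up | Omission error | 0 | 0.00494 | | |
      | RF_Down | Class 1 | 59 | 50 | 0.97 | 3% |
      | RF_Down | Class 2 | 7 | 1781 | | |
      | RF_Down | Total | 66 | 1831 | | |
      | RF_Down | Omission error | 0.11 | 0.027 | | |

      CART:

      | Sample | Produced map | Reference class 1 | Reference class 2 | Overall accuracy |
      |---|---|---|---|---|
      | Training | Class 1 | 49 | 65 | 0.953878407 |
      | Training | Class 2 | 1 | 1316 | |
      | Training | Total | 50 | 1381 | |
      | Training | Omission error | 0.02 | 0.04706 | |
      | Validation | Class 1 | 16 | 22 | 0.9527897 |
      | Validation | Class 2 | 0 | 428 | |
      | Validation | Total | 16 | 450 | |
      | Validation | Omission error | 0 | 0.04888 | |
      | Total | Class 1 | 65 | 87 | 0.953610965 |
      | Total | Class 2 | 1 | 1744 | |
      | Total | Total | 66 | 1831 | |
      | Total | Omission error | 0.015 | 0.047 | |

      - Close classification accuracy values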
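
The overall accuracy and omission errors reported for CART on the total sample can be recomputed from its confusion matrix. The sketch below uses the CART "Total" counts from the slide (65, 87, 1, 1744); the function name `accuracy_report` is hypothetical.

```python
def accuracy_report(matrix):
    """Overall accuracy and per-class omission error.

    matrix[i][j] = number of samples mapped as class i whose reference class is j.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))
    omission = []
    for j in range(n):
        reference_total = sum(matrix[i][j] for i in range(n))
        # Share of reference samples of class j that were not mapped as class j
        omission.append(1 - matrix[j][j] / reference_total)
    return correct / total, omission

# CART, total sample: rows = produced map, columns = reference map
cart_total = [[65, 87], [1, 1744]]
overall, omission = accuracy_report(cart_total)
print(round(overall, 6), [round(e, 3) for e in omission])
```

The overall accuracy is the sum of the diagonal divided by the number of samples, here 1809 / 1897; the omission errors are 1 / 66 and 87 / 1831.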

  18. EXAMPLE OF APPLICATION: REMOTE SENSING
      - The difference in overall accuracy between CART and Random Forest is very small (around 1.5%), and both results are good
      - CART provides an explicit model; the Random Forest model is implicit
      - An explicit model can be reused on a new dataset, or on another image of the same date, without repeating all the modeling steps: it is easier to use without specific knowledge
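
What "explicit model" means in practice can be sketched as a handful of threshold rules that anyone can reapply. The variable names and thresholds below are entirely hypothetical and are not the rules obtained in the study; they only show the form such a model takes once a CART tree has been read off.

```python
def classify_pixel(pixel):
    """Hypothetical two-rule tree: returns 1 (presence) or 2 (absence)."""
    if pixel["ndvi"] <= 0.25:  # hypothetical threshold on a vegetation index
        return 2
    if pixel["nir"] > 0.30:    # hypothetical threshold on a near-infrared band
        return 1
    return 2

# Reusing the same rules on pixels from another (hypothetical) image
new_image = [
    {"ndvi": 0.10, "nir": 0.40},
    {"ndvi": 0.45, "nir": 0.35},
    {"ndvi": 0.45, "nir": 0.20},
]
predicted_map = [classify_pixel(p) for p in new_image]
print(predicted_map)
```

A Random Forest offers no such compact rule set: predicting on new data requires the fitted forest itself.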

  19. CONCLUSION AND DISCUSSION
      - On the same dataset, and with all parameters suited to CART, we obtain results not significantly different from those of Random Forest
      - Both models need some parameter tuning to deal with unbalanced samples
      - CART can generate an explicit model, whereas Random Forest cannot
      - Both algorithms also make it possible to identify important variables

  20. THANKS FOR YOUR ATTENTION!
