Influence measures for CART
Jean-Michel Poggi
Orsay, Paris Sud & Paris Descartes
Joint work with Avner Bar-Hen Servane Gey
(MAP5, Paris Descartes )
J-M. Poggi Influence measures for CART
Introduction Influence measures for CART Exploring the Paris Tax Revenues dataset CART
▶ R package rpart
▶ the default parameters (Gini heterogeneity function to grow the maximal tree
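As a rough illustration of the Gini heterogeneity function mentioned above, here is a minimal Python sketch. It is not the rpart implementation (the slides use R); the function names `gini` and `split_gain` and the county labels in the usage lines are hypothetical and chosen only for illustration.

```python
from collections import Counter


def gini(labels):
    """Gini heterogeneity of a node: 1 - sum over classes of p_k squared."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def split_gain(parent, left, right):
    """Decrease in Gini heterogeneity when `parent` is split into `left` and `right`."""
    n = len(parent)
    return (gini(parent)
            - (len(left) / n) * gini(left)
            - (len(right) / n) * gini(right))


# Hypothetical node with two classes in equal proportion.
node = ["Paris", "Paris", "Val-de-Marne", "Val-de-Marne"]
print(gini(node))                            # heterogeneity of the mixed node
print(split_gain(node, node[:2], node[2:]))  # gain of a split that separates the classes
```

A tree-growing procedure of this kind would pick, at each node, the split with the largest gain; a pure node has heterogeneity zero.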
Introduction Influence measures for CART Exploring the Paris Tax Revenues dataset Presentation Influence on predictions Influence on partitions CART specific notion of influence
▶ reference tree T obtained from the complete sample ℒn
▶ jackknife trees
▶ influence on predictions focusing on predictive performance
▶ influence on partitions highlighting the tree structure
▶ CART specific influence derived from the pruned sequences of trees
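The jackknife idea in the list above can be sketched as follows: fit a reference model on the complete sample, refit with each observation left out in turn, and count how many observations receive a different label. This is only a sketch under stated assumptions. The learner `fit_stump` is a hypothetical one-split stand-in, not CART and not rpart, and the function names and toy data are made up for illustration.

```python
from collections import Counter


def fit_stump(xs, ys):
    """Stand-in learner (NOT CART): best single threshold on one numeric predictor."""
    def majority(labels):
        return Counter(labels).most_common(1)[0][0]

    best = None
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        if not right:
            continue  # threshold at the maximum does not split anything
        l_lab, r_lab = majority(left), majority(right)
        errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
        if best is None or errors < best[0]:
            best = (errors, t, l_lab, r_lab)
    if best is None:  # only one distinct predictor value: constant prediction
        lab = majority(ys)
        return lambda x: lab
    _, t, l_lab, r_lab = best
    return lambda x: l_lab if x <= t else r_lab


def jackknife_label_changes(xs, ys, fit):
    """For each i, count observations labeled differently by the model fitted without i."""
    reference = fit(xs, ys)
    ref_pred = [reference(x) for x in xs]
    influence = []
    for i in range(len(xs)):
        model_i = fit(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        influence.append(sum(model_i(x) != p for x, p in zip(xs, ref_pred)))
    return influence


xs = [1, 2, 3, 4, 10, 11, 12]
ys = ["A", "A", "A", "A", "B", "B", "B"]
print(jackknife_label_changes(xs, ys, fit_stump))
```

In this toy sample only the observation closest to the class boundary changes any label when it is removed. Passing the learner as the `fit` argument keeps the jackknife loop independent of the model, so a real tree learner could be substituted.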
cpj
Introduction Influence measures for CART Exploring the Paris Tax Revenues dataset Presentation Classification problem Influential cities
▶ Paris: 20 "arrondissements"
▶ Seine-Saint-Denis (north of
▶ Hauts-de-Seine (west of
▶ Val-de-Marne (south of
▶ first and 9th deciles (D1, D9)
▶ quartiles (Q1, Q2 and Q3)
▶ mean, and % of the tax
▶ each leaf: the predicted county and
▶ on the left subtree, homogeneous
▶ half the nodes of the right subtree are
▶ Paris and Hauts-de-Seine on the left
▶ while Val-de-Marne appears in both
▶ extreme quantiles separate
▶ global predictors are useful to
▶ splits on the left part mainly
▶ splits on the right part are
Figure: Influence function I1. x-axis: observations indices; y-axis: number of observations differently labeled.
Figure: Influence function I3. x-axis: observations indices; y-axis: total variation distance.
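The total variation distance named on the I3 axis can be computed, for two discrete class distributions, as half the sum of absolute differences. The sketch below assumes the two distributions being compared are class-probability estimates for the same observation under two different trees; that reading, the function name `total_variation`, and the example probabilities are assumptions, not taken from the slides.

```python
def total_variation(p, q):
    """Total variation distance between two discrete distributions given as dicts."""
    classes = set(p) | set(q)
    # Classes missing from one distribution are treated as having probability 0.
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in classes)


# Hypothetical class distributions for one observation under two trees.
p_reference = {"Paris": 0.5, "Hauts-de-Seine": 0.5}
p_jackknife = {"Paris": 1.0}
print(total_variation(p_reference, p_jackknife))
```

The distance is 0 for identical distributions and 1 for distributions with disjoint support, which matches the range of the plotted axis.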
Figure: Influence function I5. x-axis: observations indices; y-axis: dissimilarity.
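The slide names the I5 measure only as a dissimilarity. One common way to compare two partitions of the same observations, assumed here purely for illustration, is one minus the Jaccard index computed over pairs of observations that fall in the same leaf. The function name `partition_dissimilarity` and the leaf assignments below are hypothetical.

```python
from itertools import combinations


def partition_dissimilarity(leaves_a, leaves_b):
    """1 - Jaccard index over pairs of observations sharing a leaf (an assumed choice)."""
    if len(leaves_a) != len(leaves_b):
        raise ValueError("both partitions must cover the same observations")
    both = either = 0
    for i, j in combinations(range(len(leaves_a)), 2):
        same_a = leaves_a[i] == leaves_a[j]
        same_b = leaves_b[i] == leaves_b[j]
        both += same_a and same_b    # pair grouped together in both partitions
        either += same_a or same_b   # pair grouped together in at least one
    return 0.0 if either == 0 else 1.0 - both / either


# Leaf identifiers assigned to four observations by two hypothetical trees.
print(partition_dissimilarity([1, 1, 1, 2], [1, 1, 2, 2]))
```

When comparing a reference tree with a jackknife tree, the two leaf assignments would be evaluated on the same set of observations so that the pairs are comparable.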
Figure: plot of the first two principal components. x-axis: 1st principal component (78.5%); y-axis: 2nd principal component (17.1%). Legend: Hauts-de-Seine, Seine-Saint-Denis, Paris.