
1 2.1 Mobility trees Mobility tree, Men Working status 04 - PDF document



  1. IPUC, Neuchâtel, February 23-24, 2007

     Innovative Data Mining based approaches for life course analysis

     Gilbert Ritschard
     Alexis Gabadinho, Nicolas Müller, Matthias Studer
     University of Geneva, Switzerland
     http://mephisto.unige.ch

     Outline
     1 Aim of the research project
     2 Our first results
       2.1 Mobility trees
       2.2 Survival trees
       2.3 Characteristic sequences
     3 Foreseen Developments

     1 Aim of the research project

     Just started (February 1, 2007) FNS project
     “Mining event histories: Towards new insight on personal Swiss life courses”

     Methodological concern
     Explore and develop data mining approaches for individual longitudinal data
     • Methods for time to event analysis
     • Methods for sequence data analysis

     Socio-demographic concern
     Using mainly SHP data, but also other sources, gain original insight on
     • How familial, professional and other socio-demographic events are entwined,
     • Typical characteristics of Swiss life trajectories,
     • Changes in these characteristics over time.

     What is data mining?

     “Data Mining is the process of finding new and potentially useful knowledge from data”
     Gregory Piatetsky-Shapiro, editor of http://www.kdnuggets.com

     “Data mining is the analysis of (often large) observational data sets to find unsuspected relationships and to summarize the data in novel ways that are both understandable and useful to the data owner”
     (Hand et al., 2001)

     Also called Knowledge Discovery in Databases, KDD.
     Origin: IJCAI Workshop, 1989, Piatetsky-Shapiro (1989)
     Textbooks: Han and Kamber (2001), Hand et al. (2001)

     What is data mining? (2)

     Concerned with characterization of interesting patterns
     • per se (unsupervised learning)
       - Clustering
       - Frequent itemsets
       - Association rules
     • for classification or prediction purposes (supervised learning)
       - Decision trees
       - Bayesian networks
       - SVM and Kernel Methods
       - CBR (case based reasoning), K-NN (k nearest neighbors)

     Proceeds mainly heuristically.
     Unlike statistical modeling, makes no assumptions about the process generating the data.

     2 Our first results

     • Mobility trees
     • Survival trees
     • Characteristic sequences

     Typology of methods for individual longitudinal data

     Methods are classified by question (descriptive, causality) and by nature of data (time stamped event, state/event sequences).

     descriptive
       time stamped event:
       - Survival curves: parametric (Weibull, Gompertz) and non parametric (Kaplan-Meier, Nelson-Aalen) estimators
       state/event sequences:
       - Optimal matching clustering
       - Frequencies of typical patterns
       - Discovering typical patterns

     causality
       time stamped event:
       - Hazard regression models
       - Survival trees
       state/event sequences:
       - Markov models, Mobility trees
       - Association rules between subsequences
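As an illustration of the non parametric survival curve estimators named in the typology, here is a minimal sketch of the Kaplan-Meier (product-limit) estimator in plain Python. The function name `kaplan_meier`, its arguments, and the toy durations are assumptions made for this example only; they do not come from the presentation or from the SHP data.

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Kaplan-Meier estimate of the survival function.

    durations: time until the event or until censoring, one per individual.
    observed:  True if the event was observed, False if the case is censored.
    Returns a list of (time, survival probability) at each distinct event time.
    """
    # Number of observed events at each time (censored cases excluded).
    events = Counter(t for t, seen in zip(durations, observed) if seen)
    # Number of individuals leaving the risk set at each time (event or censoring).
    exits = Counter(durations)

    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = events.get(t, 0)
        if d:
            # Product-limit step: multiply by the share surviving time t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        # Events and censored cases both leave the risk set after time t.
        at_risk -= exits[t]
    return curve


# Hypothetical toy data: five individuals, two of them censored.
toy_durations = [2, 3, 3, 5, 8]
toy_observed = [True, True, False, True, False]
for time, prob in kaplan_meier(toy_durations, toy_observed):
    print(f"t={time}: S(t)={prob:.3f}")
```

Censored cases contribute to the risk set up to their censoring time but never trigger a drop in the curve, which is what distinguishes this estimator from a simple empirical proportion of survivors.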
