Longitudinal Analysis Survival Trees Mining Frequent Episodes

Longitudinal Analysis Survival Trees Mining Frequent Episodes Summary

Mining Event Histories: A Social Scientist View
Gilbert Ritschard
Department of Econometrics, University of Geneva
http://mephisto.unige.ch
IASC 2007, Aveiro, Portugal, August 30 - September 1
10/8/2007gr

Outline
1 Longitudinal Analysis
   • Motivation
   • Methods for Longitudinal Data
2 Survival Trees
   • Principle
   • Example
   • Social Science Issues
3 Mining Frequent Episodes
   • What Is It About?
   • Example: Counting Alternate Episode Structures
   • Issues Regarding Episode Rules

Motivation
• Individual life course paradigm.
   • Following macro quantities (e.g. #divorces, fertility rate, mean education level, ...) over time insufficient for understanding social behavior.
   • Need to follow individual life courses.
• Data availability
   • Large panel surveys in many countries (SHP, ...).
   • Biographical retrospective surveys (FFS, ...).
   • Statistical matching of censuses, population registers and other administrative data.

Motivation (continued)
• Need for suited methods for discovering interesting knowledge from these individual longitudinal data.
• Social scientists use
   • Essentially Survival analysis (Event History Analysis)
   • More rarely sequential data analysis (Optimal Matching, Markov Chain Models)
• Could social scientists benefit from data-mining approaches?
   • Which methods?
   • Are there specific issues with those methods for social scientists?
Alternative views of Individual Longitudinal Data

Table: Time stamped events, record for Sandra
event                     year
ending secondary school   1970
first job                 1971
marriage                  1973

Table: State sequence view, Sandra
year              1969     1970       1971       1972       1973
civil status      single   single     single     single     married
education level   primary  secondary  secondary  secondary  secondary
job               no       no         first      first      first

Issues with life course data
• Incomplete sequences
   • Censored and truncated data: Cases falling out of observation before experiencing an event of interest.
   • Sequences of varying length.
• Time varying predictors.
   • Example: When analysing time to divorce, presence of children is a time varying predictor.
• Data collected by clusters
   • Example: Household panel surveys.
   • Multi-level analysis to account for unobserved shared characteristics of members of a same cluster.
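The two views of Sandra's record carry the same information, and the state sequence can be derived from the time stamped events. The following Python sketch illustrates one way to do so. The `EFFECT` mapping (which state each event switches on), the function name `to_state_sequence`, and the 1969 starting states are assumptions made for this illustration; they are read off the two tables and are not part of the original slides.

```python
# Time stamped events for Sandra, as in the first table.
events = [
    (1970, "ending secondary school"),
    (1971, "first job"),
    (1973, "marriage"),
]

# Assumed (illustrative) mapping: event -> (state variable, new state).
EFFECT = {
    "ending secondary school": ("education level", "secondary"),
    "first job": ("job", "first"),
    "marriage": ("civil status", "married"),
}


def to_state_sequence(events, first_year, last_year, initial):
    """Expand time stamped events into one state per variable and year."""
    # Group events by the year in which they occur.
    by_year = {}
    for year, event in events:
        by_year.setdefault(year, []).append(event)

    state = dict(initial)                 # current state of each variable
    rows = {var: [] for var in initial}   # one yearly sequence per variable
    for year in range(first_year, last_year + 1):
        # Apply the events of this year, then record the resulting states.
        for event in by_year.get(year, []):
            var, value = EFFECT[event]
            state[var] = value
        for var in rows:
            rows[var].append(state[var])
    return rows


# Assumed states at the start of the observation window (1969).
initial = {"civil status": "single", "education level": "primary", "job": "no"}
sequence = to_state_sequence(events, 1969, 1973, initial)

for var, states in sequence.items():
    print(var, states)
```

The state view has a fixed length per observation window, which is convenient for sequence methods, while the event view is more compact and closer to what survival analysis needs.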

Classical statistical approaches
Survival Approaches
• Survival or Event history analysis (Blossfeld and Rohwer, 2002)
   • Focuses on one event.
   • Concerned with duration until event occurs or with hazard of experiencing event.
• Survival curves: Distribution of duration until event occurs
     $S(t) = p(T \ge t)$.
• Hazard models: Regression like models for $S(t, x)$ or hazard $h(t) = p(T = t \mid T \ge t)$
     $h(t, x) = g\bigl(t, \beta_0 + \beta_1 x_1 + \beta_2 x_2(t) + \cdots\bigr)$.

Multi-level: Simple linear regression example
[Figure: horizontal axis Education, vertical axis Children. Fitted lines shown: y = 15.6 - 0.8 x, y = 12.5 - 0.8 x, y = 6.2 - 0.8 x, and y = 3.2 + 0.2 x.]

Survival curves (Switzerland, SHP 2002 biographical survey)
[Figure: Women. Horizontal axis AGE (years), 0 to 80; vertical axis Survival probability, 0 to 1.]

Analysis of sequences
• Frequencies of given subsequences
   • Essentially event sequences.
   • Subsequences considered as categories ⇒ Methods for categorical data apply (Frequencies, cross tables, log-linear models, logistic regression, ...).
• Markov chain models
   • State sequences.
   • Focuses on transition rates between states.
   • Does the rate also depend on previous states?
   • How many previous states are significant?
• Optimal Matching (Abbott and Forrest, 1986).
   • State sequences.
   • Edit distance (Levenshtein, 1966; Needleman and Wunsch, 1970) between pairs of sequences.
   • Clustering of sequences.
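The survival function S(t) = p(T ≥ t) and the discrete hazard h(t) = p(T = t | T ≥ t) defined in the survival slides are linked by S(t) = ∏_{s<t} (1 - h(s)). The Python sketch below estimates both from durations with right censoring. The toy data, the function name `discrete_survival`, and the convention that a case censored at time c still counts as at risk at c are assumptions made for illustration, not taken from the slides.

```python
from collections import Counter


def discrete_survival(durations, observed):
    """Estimate h(t) = p(T = t | T >= t) and S(t) = p(T >= t).

    durations: time at which each case had the event or left observation.
    observed:  True if the event occurred, False if the case is censored.
    Assumption: a case censored at time c is still at risk at time c.
    """
    # Number of observed events at each time point.
    events = Counter(t for t, seen in zip(durations, observed) if seen)

    hazard, survival = {}, {}
    s = 1.0  # running product of (1 - h(u)) over earlier time points u
    for t in sorted(set(durations)):
        at_risk = sum(1 for u in durations if u >= t)
        survival[t] = s                   # S(t): probability of T >= t
        hazard[t] = events[t] / at_risk   # share of the risk set with event at t
        s *= 1.0 - hazard[t]
    return hazard, survival


# Hypothetical toy data: five cases, two of them censored.
durations = [2, 3, 3, 5, 5]
observed = [True, True, False, True, False]

hazard, survival = discrete_survival(durations, observed)
for t in sorted(hazard):
    print(t, round(hazard[t], 3), round(survival[t], 3))
```

Censored cases contribute to the risk set up to their censoring time but never to the event count, which is how incomplete sequences are used without being discarded.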
Events shown in the survival curves figure: Leaving home, Marriage, 1st Childbirth, Parents' death, Last child left, Divorce, Widowing.

Optimal Matching
Example from (Gauthier, Widmer, Bucher, and Notredame, 2007)
• Professional life course, age 16-64, Switzerland
• SHP retrospective survey, ∼ 3000 cases
• 5 clusters: Full Time, Part Time, Come Back, Home, Erratic
[Figure: two panels, "Full Time, 53%" and "Come Back, 16%". Horizontal axis: age, 16 to 64; vertical axis: 0% to 100%. States shown: Full time, Part time, Negative interruption, Positive interruption, Home, Retired, Education.]

Typology of methods for life course data

Questions: descriptive
   Issues: duration/hazard
      • Survival curves: Parametric (Weibull, Gompertz, ...) and non parametric (Kaplan-Meier, Nelson-Aalen) estimators.
   Issues: state/event sequencing
      • Optimal matching clustering
      • Frequencies of given patterns
      • Discovering typical episodes
Questions: causality
   Issues: duration/hazard
      • Hazard regression models (Cox, ...)
      • Survival trees
   Issues: state/event sequencing
      • Markov models
      • Mobility trees
      • Association rules among episodes
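Optimal matching rests on an edit distance between pairs of state sequences, which is then fed to a clustering procedure. The Python sketch below computes such a distance by dynamic programming. The constant insertion/deletion and substitution costs, the function name `om_distance`, and the two example sequences are assumptions for illustration; applied studies typically choose state-dependent substitution costs, and the slides do not specify the costs used in the cited example.

```python
def om_distance(a, b, indel=1.0, sub=1.0):
    """Edit distance between two state sequences (dynamic programming).

    indel: assumed cost of inserting or deleting one state.
    sub:   assumed cost of substituting one state for a different one.
    """
    # prev[j] = cost of turning the first i-1 states of a into the first j of b.
    prev = [j * indel for j in range(len(b) + 1)]
    for i, x in enumerate(a, start=1):
        curr = [i * indel]  # delete the first i states of a
        for j, y in enumerate(b, start=1):
            cost = 0.0 if x == y else sub
            curr.append(min(
                prev[j] + indel,      # delete x
                curr[j - 1] + indel,  # insert y
                prev[j - 1] + cost,   # match or substitute
            ))
        prev = curr
    return prev[-1]


# Hypothetical yearly state sequences using states named in the example.
seq_a = ["Education", "Education", "Full time", "Full time", "Full time"]
seq_b = ["Education", "Full time", "Full time", "Home", "Home"]

print(om_distance(seq_a, seq_b))           # unit costs
print(om_distance(seq_a, seq_b, sub=2.0))  # substitution as costly as delete + insert
```

Computing this distance for every pair of sequences yields a dissimilarity matrix, which is the input a clustering method needs to produce groups such as the five clusters reported in the example.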

