
Symbolic Analysis of Hierarchical-Structured Data. Application to Veterinary epidemiology


  1. Symbolic Analysis of Hierarchical-Structured Data. Application to Veterinary epidemiology
     C. Fablet (1), E. Diday (2), S. Bougeard (1), C. Toque (3) & L. Billard (4)
     (1) French agency for food, environmental and occupational health safety (Anses), France
     (2) University of Paris Dauphine, France
     (3) SYROKKO, France
     (4) University of Georgia, Athens, USA
     19th International Conference on Computational Statistics, Paris, August 22-27, 2010

  2. Context of veterinary epidemiological surveys
     Statistical issue:
     1. Description of the relationships between the dependent variables → variable selection,
     2. Summary of the dependent variables into an overall single variable (i.e. the disease),
     … with a hierarchical structure of observations (P animals each within N farms).
     [Figure: dependent variables y (farms x animals) and disease variable Y (farms); disease intensity: unapparent disease, average disease, fatal disease]

  3. Dataset: Study of pig respiratory diseases
     Description of pig respiratory diseases: 19 variables, 125 farms x 30 animals.
     • Pneumonia (0–28), pleuritis (0–4),
     • Lung abscess (0/1), lung nodules (0/1), healing from pneumonia (0/1),
     • Hypertrophy of lung lymph nodes (0–3), pericarditis (0/1),
     • Frequency of coughs at 16 and 22 weeks of age.
     [Figure: dependent variables y (125 farms x 30 animals) and disease variable Y (125 farms); disease intensity: unapparent disease, average disease, fatal disease]

  4. Step 1: Variable synthesis
     Classical procedure
     • Description of pig respiratory diseases: 19 variables, 125 farms x 30 animals,
     • Farm-level summary: animal frequencies (categorical var.), median score (continuous var.),
     • Description of pig respiratory diseases: 64 variables, 125 farms.
     Symbolic procedure
     • Categorical variable: histogram of the frequencies based on 30 animals,
     • Continuous variable: histogram which keeps the data variation.
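
The difference between the two farm-level summaries of Step 1 can be illustrated with a minimal Python sketch. It is not the SYR software: the farm names, the animal scores and the helper functions are hypothetical, and a single 0 to 4 score stands in for the 19 variables.

```python
from collections import Counter
from statistics import median

# Hypothetical animal-level records: (farm id, lesion score from 0 to 4).
# The values are invented for illustration only.
records = [
    ("farm_A", 0), ("farm_A", 0), ("farm_A", 2), ("farm_A", 4),
    ("farm_B", 1), ("farm_B", 1), ("farm_B", 1), ("farm_B", 3),
]
CATEGORIES = range(5)  # possible scores: 0, 1, 2, 3, 4


def group_by_farm(rows):
    """Collect the animal scores of each farm."""
    farms = {}
    for farm, score in rows:
        farms.setdefault(farm, []).append(score)
    return farms


def classical_summary(scores):
    """One number per farm: the median score across its animals."""
    return median(scores)


def symbolic_summary(scores):
    """One histogram per farm: relative frequency of every score."""
    counts = Counter(scores)
    return [counts[c] / len(scores) for c in CATEGORIES]


farms = group_by_farm(records)
classical = {f: classical_summary(s) for f, s in farms.items()}
symbolic = {f: symbolic_summary(s) for f, s in farms.items()}

for farm in farms:
    print(farm, classical[farm], symbolic[farm])
```

In this toy data both farms share the same median, while their histograms differ, which is the within-farm variation the symbolic description is meant to keep.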

  5. Step 1: Variable synthesis (symbolic results)
     SYR software with the TABSYR & STATSYR modules

  6. Step 2: Variable selection
     Classical procedure
     • Principal Component Analysis of the 64 variables,
     • Selection of the variables with the best contribution,
     • Principal Component Analysis of the selected variables.
     Symbolic procedure
     • Symbolic Principal Component Analysis of the 19 variables,
     • ‘Global’ variable selection (best var. contribution),
     • ‘Quadrants’ variable selection (best var. correlation),
     • Final symbolic PCA representation of the selected ‘bins’ variables.
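
Selecting variables by their contribution, as in the classical procedure of Step 2, can be sketched as follows. This is a pure-Python illustration under stated assumptions: the three variables and their values are hypothetical, only the first principal component is used, power iteration replaces a full eigendecomposition, and the cut-off (keep the two best variables) is arbitrary. The contribution of a variable to a component is taken as its squared coordinate on the unit-length principal axis.

```python
import math

# Hypothetical farm-level variables (one value per farm), invented for illustration.
names = ["pneumonia", "pleuritis", "cough"]
columns = [
    [2, 5, 9, 14, 20, 26],
    [0, 1, 1, 2, 3, 4],
    [3, 1, 4, 1, 5, 2],
]


def standardize(cols):
    """Center each variable and scale it to unit variance (assumes no constant variable)."""
    out = []
    for col in cols:
        n = len(col)
        mean = sum(col) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / n)
        out.append([(v - mean) / sd for v in col])
    return out


def correlation_matrix(cols):
    """Correlation matrix of the variables, computed from standardized columns."""
    z = standardize(cols)
    n = len(z[0])
    return [[sum(a * b for a, b in zip(zi, zj)) / n for zj in z] for zi in z]


def first_axis(matrix, iterations=500):
    """Leading unit eigenvector of a symmetric matrix, by power iteration."""
    v = [1.0] * len(matrix)
    for _ in range(iterations):
        w = [sum(m * x for m, x in zip(row, v)) for row in matrix]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


axis = first_axis(correlation_matrix(columns))
contrib = [x * x for x in axis]  # squared coordinates of a unit vector sum to 1

# Keep the two variables contributing most to the first component (arbitrary cut-off).
order = sorted(range(len(names)), key=lambda j: contrib[j], reverse=True)
selected = [names[j] for j in order[:2]]
print(dict(zip(names, contrib)), selected)
```

A second PCA would then be run on the selected variables only, as the slide describes.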

  7. Step 2: Variable selection (symbolic results)
     • Var. group PNEU+: severe pneumonia,
     • Var. group PLEU_PNEU: average level of pleuritis and pneumonia,
     • Var. group PLEU0_PNEU0: few lung lesions,
     • Var. group PNEU-: light pneumonia lesions.
     Symbolic PCA of the 8 selected ‘bins’ variables.
     SYR software with the ACPSYR module

  8. Step 3: Individual clustering
     Classical procedure
     • Hierarchical Ascendant Classification (Ward criterion),
     • Cluster description: comparison of the variable means (& standard deviations) of each cluster with the variable means on the whole sample.
     Symbolic procedure
     • Symbolic partitioning (inertia criterion),
     • Cluster description: variables sorted in order of overall discriminant power,
     • Cluster description with the most discriminant variables (or variable modalities).
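
The cluster description of Step 3 can be sketched in Python, assuming the cluster labels are already available (for instance from a Ward hierarchy, which is not implemented here). The data and labels are hypothetical. The slides do not say how discriminant power is measured, so the correlation ratio (between-cluster share of the total variance) is used as a stand-in.

```python
from statistics import mean, pstdev

# Hypothetical farm-level data: variable name -> one value per farm.
data = {
    "pneumonia": [2.0, 3.0, 4.0, 18.0, 20.0, 22.0],
    "cough": [1.0, 4.0, 2.0, 3.0, 1.0, 4.0],
}
# Hypothetical cluster label of each farm.
labels = [0, 0, 0, 1, 1, 1]


def describe(values, labels):
    """Mean and standard deviation of one variable within each cluster."""
    out = {}
    for k in sorted(set(labels)):
        members = [v for v, lab in zip(values, labels) if lab == k]
        out[k] = (mean(members), pstdev(members))
    return out


def correlation_ratio(values, labels):
    """Between-cluster share of the total variance (0 = none, 1 = all)."""
    overall = mean(values)
    total = sum((v - overall) ** 2 for v in values)
    between = 0.0
    for k in set(labels):
        members = [v for v, lab in zip(values, labels) if lab == k]
        between += len(members) * (mean(members) - overall) ** 2
    return between / total


# Variables sorted from most to least discriminant.
ranking = sorted(
    data, key=lambda name: correlation_ratio(data[name], labels), reverse=True
)

for name in ranking:
    print(name, "whole sample mean:", mean(data[name]), "clusters:", describe(data[name], labels))
```

Comparing each cluster mean with the whole-sample mean mirrors the classical description, while the ranking gives a rough analogue of sorting variables by discriminant power.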

  9. Step 3: Individual clustering (symbolic results)
     SYR software with the CLUSTSYR module

  10. Conclusion & perspectives
     Conclusion
     • Symbolic analysis to process hierarchical-structured data without reducing information,
     • Relevant and useful methods for veterinary epidemiological surveys (competes with GEE including a random measurement effect),
     • Available software (SYR).
     Perspectives
     • Other symbolic methods available for various aims,
     • Extension to multiblock modelling (hierarchical-structured observations and variables).

