M-A. Rizoiu, J. Velcin and S. Lallich Structuring Typical Evolutions using Temporal-Driven Constrained Clustering
Structuring Typical Evolutions using Temporal-Driven Constrained Clustering
8 November 2012
Marian-Andrei Rizoiu, Julien Velcin, Stéphane Lallich
ERIC Laboratory, Université Lumière Lyon 2, France
Dataset: the values of a number of numerical features (x^d) for multiple entities (φ) at different moments in time (t).
[Figure: six observations x_1 ... x_6 of two entities, φ1 and φ2, observed at moments t1, t2 and t3; each observation x_i carries a descriptive vector x_i^d.]
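As a rough illustration of the dataset described above, the sketch below stores each observation as an (entity, time, features) record and groups the records into one time-ordered sequence per entity. The class name `Observation`, the helper `group_by_entity` and all numeric values are hypothetical and are not taken from the presentation.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Observation:
    """One observation x_i: an entity, a timestamp and a descriptive vector."""
    entity: str            # x_i^phi (hypothetical label)
    time: float            # x_i^t
    features: List[float]  # x_i^d


def group_by_entity(observations: List[Observation]) -> Dict[str, List[Observation]]:
    """Return one time-ordered sequence of observations per entity."""
    grouped = defaultdict(list)
    for obs in observations:
        grouped[obs.entity].append(obs)
    for sequence in grouped.values():
        sequence.sort(key=lambda o: o.time)
    return dict(grouped)


# Two entities observed at three moments each (made-up feature values).
data = [
    Observation("phi1", 2.0, [0.4]), Observation("phi1", 1.0, [0.1]),
    Observation("phi1", 3.0, [0.9]), Observation("phi2", 1.0, [0.2]),
    Observation("phi2", 3.0, [0.3]), Observation("phi2", 2.0, [0.2]),
]
trajectories = group_by_entity(data)
```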
Goal: detect typical evolution patterns of individuals in the dataset:
a) the phases through which the entity collection went over time;
b) the trajectory of entities through the different phases.
Summary:
- 1. Problem
  1.1 Data
  1.2 Goal
- 2. Proposed solutions:
  2.1 A clustering solution
  2.2 Temporal-Aware Dissimilarity Measure
  2.3 Contiguity Penalty Measure
  2.4 TDCK-Means algorithm
  2.5 Evaluation measures
- 3. Experiments
  3.1 Qualitative evaluation
  3.2 Quantitative evaluation
- 4. Conclusion and perspectives
Proposed solution: a temporal-aware constrained clustering algorithm; the resulting clusters serve as phases.
The resulting partition must ensure:
- the descriptive coherence of clusters;
- the temporal coherence of clusters;
- continuous segmentation of observations belonging to an entity.
K-Means-like algorithm. Objective function to minimize:
\[ J = \sum_{\mu_j \in M} \sum_{x_i \in C_j} \Big( \|x_i - \mu_j\|_{TE} + \sum_{(x_k \notin C_j) \wedge (x_k^{\phi} = x_i^{\phi})} w(x_i, x_k) \Big) \]
The first term (1) is the temporal-aware dissimilarity measure; the second term (2) is the contiguity penalty measure.
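The objective J can be evaluated directly for a given partition. The following is a minimal sketch under stated assumptions, not the authors' implementation. Observations are assumed to be (entity, time, features) tuples and centroids (time, features) tuples. The helper names and the squared normalisation by `dx_max` and `dt_max` follow the formulas given later in this presentation.

```python
import math


def te_dissimilarity(xd_i, t_i, xd_j, t_j, dx_max, dt_max):
    """Temporal-aware dissimilarity between two (features, time) points."""
    desc_sq = sum((a - b) ** 2 for a, b in zip(xd_i, xd_j))
    time_sq = (t_i - t_j) ** 2
    return 1.0 - (1.0 - desc_sq / dx_max ** 2) * (1.0 - time_sq / dt_max ** 2)


def penalty(t_i, t_k, beta, delta):
    """Contiguity penalty w for two observations of the same entity."""
    return beta * math.exp(-0.5 * ((t_i - t_k) / delta) ** 2)


def objective(observations, assignment, centroids, dx_max, dt_max, beta, delta):
    """Evaluate J for observations [(entity, t, xd)], cluster indices and centroids [(t, xd)]."""
    total = 0.0
    for i, (ent_i, t_i, xd_i) in enumerate(observations):
        mu_t, mu_d = centroids[assignment[i]]
        # Term 1: dissimilarity between the observation and its centroid.
        total += te_dissimilarity(xd_i, t_i, mu_d, mu_t, dx_max, dt_max)
        # Term 2: penalty for every observation of the same entity
        # that was assigned to a different cluster.
        for k, (ent_k, t_k, _) in enumerate(observations):
            if ent_k == ent_i and assignment[k] != assignment[i]:
                total += penalty(t_i, t_k, beta, delta)
    return total
```

Note that each broken pair is counted twice, once from each of its two observations, which is what the double sum in J implies.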
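Euclidean distance: distance in the description space.
Temporal-aware dissimilarity measure: distance in both the description space and the temporal space:
\[ \|x_i - x_j\|_{TE} = 1 - \left(1 - \frac{\|x_i^d - x_j^d\|^2}{\Delta x_{max}^2}\right)\left(1 - \frac{|x_i^t - x_j^t|^2}{\Delta t_{max}^2}\right) \]
Properties:
- \( \|x_i - x_j\|_{TE} \in [0, 1], \ \forall x_i, x_j \in X \)
- \( \|x_i - x_j\|_{TE} = 0 \Leftrightarrow x_i^d = x_j^d \wedge x_i^t = x_j^t \)
- \( \|x_i - x_j\|_{TE} = 1 \Leftrightarrow \|x_i^d - x_j^d\| = \Delta x_{max} \vee |x_i^t - x_j^t| = \Delta t_{max} \)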
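The measure and its boundary properties can be checked numerically. This is a small sketch, assuming that `dx_max` and `dt_max` are the largest descriptive and temporal distances in the dataset. The function name and the example values are hypothetical.

```python
def te_dissimilarity(xd_i, t_i, xd_j, t_j, dx_max, dt_max):
    """Temporal-aware dissimilarity in [0, 1] between two observations."""
    desc_sq = sum((a - b) ** 2 for a, b in zip(xd_i, xd_j))  # squared Euclidean distance
    time_sq = (t_i - t_j) ** 2                               # squared temporal distance
    return 1.0 - (1.0 - desc_sq / dx_max ** 2) * (1.0 - time_sq / dt_max ** 2)


dx_max, dt_max = 5.0, 10.0  # made-up dataset-wide maxima

# Identical description and timestamp: the dissimilarity is 0.
same = te_dissimilarity([1.0, 2.0], 3.0, [1.0, 2.0], 3.0, dx_max, dt_max)
# Maximal temporal distance: the dissimilarity is 1 even with equal descriptions.
far_in_time = te_dissimilarity([1.0, 2.0], 0.0, [1.0, 2.0], 10.0, dx_max, dt_max)
# Maximal descriptive distance: the dissimilarity is 1 even at the same moment.
far_in_desc = te_dissimilarity([0.0, 0.0], 3.0, [3.0, 4.0], 3.0, dx_max, dt_max)
# An intermediate case, with both components contributing.
between = te_dissimilarity([0.0, 0.0], 0.0, [3.0, 0.0], 5.0, dx_max, dt_max)
```

Because the two components are multiplied, reaching the maximum in either space is enough to reach the maximal dissimilarity, which matches the third property.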
Semi-supervised clustering [Wagstaff & Cardie '00]: pair-wise constraints; a penalty is applied when constraints are broken.
Segmentation contiguity: soft MUST-LINK pair-wise constraints with a time-dependent Contiguity Penalty Function.
Contiguity Penalty Function:
\[ w(x_i, x_j) = \beta \, e^{-\frac{1}{2}\left(\frac{|x_i^t - x_j^t|}{\delta}\right)^2} \quad \text{for } x_i^{\phi} = x_j^{\phi} \]
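To show how the penalty decays with the time gap between two observations of the same entity, here is a minimal sketch. The function name is hypothetical. The values of β and δ are the ones quoted for the experiment in this presentation, and the time gaps are made up.

```python
import math


def contiguity_penalty(t_i, t_j, beta, delta):
    """Penalty for placing two observations of one entity in different clusters."""
    return beta * math.exp(-0.5 * (abs(t_i - t_j) / delta) ** 2)


beta, delta = 0.003, 3.0

# Penalties for increasing time gaps between the two observations.
gaps = (0.0, 1.0, 3.0, 10.0)
penalties = [contiguity_penalty(0.0, gap, beta, delta) for gap in gaps]
```

The penalty is largest (equal to β) for observations at the same moment and shrinks towards zero as the gap grows, so only temporally close observations of an entity are strongly pushed into the same cluster. In this form δ controls the width of that neighbourhood.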
The TDCK-Means algorithm:
Inspired by K-Means. Iteratively recomputes centroids and the assignments of observations to clusters.
Uses the Temporal-Aware Dissimilarity Function and the Contiguity Penalty Function.
Centroids: \( (\mu_j^t, \mu_j^d) \)
Centroids update (weighted averages):
\[ \mu_j^d = \frac{\sum_{x_i \in C_j} x_i^d \left(1 - \frac{|x_i^t - \mu_j^t|^2}{\Delta t_{max}^2}\right)}{\sum_{x_i \in C_j} \left(1 - \frac{|x_i^t - \mu_j^t|^2}{\Delta t_{max}^2}\right)} \]
\[ \mu_j^t = \frac{\sum_{x_i \in C_j} x_i^t \left(1 - \frac{\|x_i^d - \mu_j^d\|^2}{\Delta x_{max}^2}\right)}{\sum_{x_i \in C_j} \left(1 - \frac{\|x_i^d - \mu_j^d\|^2}{\Delta x_{max}^2}\right)} \]
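The centroid update can be sketched as a single weighted-average step. The update formulas contain the centroid on both sides, so this sketch assumes the weights are computed from the centroid of the previous iteration. That choice, the function name and the data layout are assumptions, not details stated in the presentation.

```python
def update_centroid(members, mu_t, mu_d, dx_max, dt_max):
    """One update of a centroid (mu_t, mu_d) from its members [(t, xd), ...].

    The weights use the previous centroid values (an assumption of this sketch).
    """
    # The descriptive component is weighted by temporal closeness to the centroid.
    w_time = [1.0 - (t - mu_t) ** 2 / dt_max ** 2 for t, _ in members]
    # The temporal component is weighted by descriptive closeness to the centroid.
    w_desc = [
        1.0 - sum((a - b) ** 2 for a, b in zip(xd, mu_d)) / dx_max ** 2
        for _, xd in members
    ]
    sum_w_time, sum_w_desc = sum(w_time), sum(w_desc)
    new_mu_d = [
        sum(w * xd[k] for w, (_, xd) in zip(w_time, members)) / sum_w_time
        for k in range(len(mu_d))
    ]
    new_mu_t = sum(w * t for w, (t, _) in zip(w_desc, members)) / sum_w_desc
    return new_mu_t, new_mu_d
```

A member at the maximal temporal distance from the centroid gets weight zero in the descriptive average, and the same holds for the temporal average at the maximal descriptive distance. The sketch does not guard against the degenerate case where all weights are zero.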
Partition evaluation measures:
- descriptive coherence of clusters: variance (MDvar);
- temporal coherence of clusters: variance (Tvar);
- continuous segmentation of observations belonging to an entity: Shannon entropy.
Shannon entropy alone does not reflect how often an entity changes cluster, as in the sequence A – B – A – B.
Proposal: correct the Shannon entropy to penalize changes:
\[ ShaP = \sum_{x_i \in X} \sum_{j=1}^{k} \left( -p(\mu_j) \log_2 p(\mu_j) \right) \left( 1 + \frac{n_{ch} - n_{min}}{n_{obs} - 1} \right) \]
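The effect of the correction can be illustrated on the cluster sequence of a single entity. The slides do not define the symbols, so this sketch rests on assumptions: p(μ_j) is taken as the share of the entity's observations assigned to cluster j, n_ch as the number of cluster changes along the time-ordered sequence, n_min as the smallest possible number of changes (distinct clusters minus one) and n_obs as the number of observations of the entity. The outer aggregation over the dataset is left out.

```python
import math
from collections import Counter


def penalized_entropy(labels):
    """Penalized Shannon entropy of one entity's time-ordered cluster labels.

    The symbol interpretations are assumptions; see the accompanying text.
    """
    n_obs = len(labels)
    if n_obs < 2:
        return 0.0
    counts = Counter(labels)
    # Plain Shannon entropy of the cluster proportions.
    entropy = -sum((c / n_obs) * math.log2(c / n_obs) for c in counts.values())
    # Number of times consecutive observations fall in different clusters.
    n_ch = sum(1 for a, b in zip(labels, labels[1:]) if a != b)
    # Fewest changes possible when visiting this many distinct clusters.
    n_min = len(counts) - 1
    return entropy * (1.0 + (n_ch - n_min) / (n_obs - 1))


contiguous = penalized_entropy(["A", "A", "B", "B"])
alternating = penalized_entropy(["A", "B", "A", "B"])
```

Both sequences have the same plain entropy, since each cluster holds half of the observations. Under these assumptions only the alternating sequence receives a penalty factor greater than one, so it scores worse.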
Comparative Political Data Set I: 23 countries, 60 years, 207 political, demographic, social and economic variables.
Execution: TDCK-Means (8 clusters, β = 0.003 and δ = 3).
Quantitative evaluation
5 algorithms:
- K-Means [MacQueen '67];
- tcK-Means [Lin and Hauptmann '10];
- Temporal-Driven K-Means (uses the Temporal-Aware Measure);
- Constrained K-Means (uses the Contiguity Penalty Function);
- TDCK-Means (combines the two above).
3 measures:
- MDvar
- Tvar
- ShaP
Conclusion:
- Studied the detection of typical evolutions starting from a collection of observations corresponding to entities;
- Proposed a new Temporal-Aware Measure;
- Proposed a new Contiguity Penalty Function;
- Proposed a new algorithm for detecting evolutions: TDCK-Means;
- Other applications: political careers, life trajectories, etc.
Perspectives:
- Generating the evolution graph;
- Automatic description of the generated evolution phases (clusters);
- Flexibly configuring the ratio between the descriptive component and the temporal component in the dissimilarity measure.
Thank you! Questions?