Structuring Typical Evolutions using Temporal-Driven Constrained Clustering


Marian-Andrei Rizoiu, Julien Velcin and Stéphane Lallich. ERIC Laboratory, Université Lumière Lyon 2, France. 8 November 2012.


slide-1
SLIDE 1

M-A. Rizoiu, J. Velcin and S. Lallich Structuring Typical Evolutions using Temporal-Driven Constrained Clustering

Structuring Typical Evolutions using Temporal-Driven Constrained Clustering

Marian-Andrei Rizoiu, Julien Velcin, Stéphane Lallich
8 November 2012
ERIC Laboratory, Université Lumière Lyon 2, France

slide-10
SLIDE 10

Dataset: the values of a certain number of numerical features ($x^d$) for multiple entities (φ) at different moments in time (t).

[Figure: two entities, φ1 and φ2, each observed at times t1, t2 and t3, giving six observations with descriptions $x_1^d, \dots, x_6^d$.]
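One way to picture this layout in code is a flat list of observations, each tagged with its entity and its timestamp. The sketch below is illustrative only: the `Observation` class, its field names, the `group_by_entity` helper and the toy values are assumptions, not part of the presentation.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Observation:
    """One row of the dataset: entity phi observed at time t with features x^d."""
    entity: str                  # phi (hypothetical identifier)
    time: float                  # t
    features: Tuple[float, ...]  # x^d, the numerical description


def group_by_entity(observations: List[Observation]) -> Dict[str, List[Observation]]:
    """Group observations per entity, each group sorted by time."""
    groups: Dict[str, List[Observation]] = {}
    for obs in observations:
        groups.setdefault(obs.entity, []).append(obs)
    for series in groups.values():
        series.sort(key=lambda o: o.time)
    return groups


# Toy data (made up): two entities observed at three moments of time.
dataset = [
    Observation("phi1", 1, (0.2, 1.0)),
    Observation("phi1", 2, (0.3, 1.1)),
    Observation("phi1", 3, (0.9, 0.4)),
    Observation("phi2", 1, (0.1, 0.9)),
    Observation("phi2", 2, (0.8, 0.5)),
    Observation("phi2", 3, (0.9, 0.3)),
]
trajectories = group_by_entity(dataset)
```

Keeping the entity and the timestamp next to the features is what later lets a clustering algorithm reason about time and about which observations belong together.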

slide-13
SLIDE 13

Goal: Detect typical evolution patterns of individuals in the dataset:

a) the phases through which the entity collection went over time;
b) the trajectory of entities through the different phases.

slide-14
SLIDE 14

Summary:

  • 1. Problem

1.1 Data
1.2 Goal

  • 2. Proposed solutions:

2.1 A clustering solution
2.2 Temporal-Aware Dissimilarity Measure
2.3 Contiguity Penalty Measure
2.4 TDCK-Means algorithm
2.5 Evaluation measures

  • 3. Experiments

3.1 Qualitative evaluation
3.2 Quantitative evaluation

  • 4. Conclusion and perspectives

slide-19
SLIDE 19

Proposed solution: a temporal-aware constrained clustering algorithm; the resulting clusters serve as phases.

The resulting partition must ensure:

  • the descriptive coherence of clusters;
  • the temporal coherence of clusters;
  • continuous segmentation of observations belonging to an entity.

K-Means-like algorithm. Objective function to minimize:

$$J = \sum_{\mu_j \in M} \sum_{x_i \in C_j} \Bigg( \underbrace{\|x_i - \mu_j\|_{TE}}_{(1)} + \underbrace{\sum_{(x_k \notin C_j) \wedge (x_k^{\phi} = x_i^{\phi})} w(x_i, x_k)}_{(2)} \Bigg)$$

Term (1) is the temporal-aware dissimilarity measure; term (2) is the contiguity penalty measure.
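The objective can be evaluated for any candidate partition by summing, for each observation, its dissimilarity to its own centroid plus the penalties for same-entity observations placed in other clusters. The sketch below is a minimal illustration under assumptions: the dictionary keys (`entity`, `t`, `d`), the function name `objective` and the stand-in measures used in the toy call are hypothetical, and the two measures are passed in as callables rather than fixed.

```python
from typing import Callable, Sequence


def objective(
    points: Sequence[dict],
    labels: Sequence[int],
    centroids: Sequence[dict],
    dissim: Callable[[dict, dict], float],
    penalty: Callable[[dict, dict], float],
) -> float:
    """Evaluate J for a given partition.

    points[i] is a dict with (hypothetical) keys "entity", "t" and "d";
    labels[i] is the index of the cluster that points[i] belongs to.
    """
    total = 0.0
    for i, x_i in enumerate(points):
        # Term (1): dissimilarity between the observation and its centroid.
        total += dissim(x_i, centroids[labels[i]])
        # Term (2): penalties for same-entity observations in other clusters.
        for k, x_k in enumerate(points):
            if labels[k] != labels[i] and x_k["entity"] == x_i["entity"]:
                total += penalty(x_i, x_k)
    return total


# Toy data and stand-in measures (made up for illustration).
pts = [
    {"entity": "a", "t": 0.0, "d": 0.0},
    {"entity": "a", "t": 1.0, "d": 1.0},
    {"entity": "b", "t": 0.0, "d": 0.0},
]
cents = [{"t": 0.0, "d": 0.0}, {"t": 1.0, "d": 1.0}]


def abs_diff(x, mu):
    return abs(x["d"] - mu["d"])  # stand-in for the TE measure


def unit_penalty(x_i, x_k):
    return 1.0  # stand-in for w


j_split = objective(pts, [0, 1, 0], cents, abs_diff, unit_penalty)  # entity "a" split
j_joint = objective(pts, [0, 0, 0], cents, abs_diff, unit_penalty)  # entity "a" together
```

Because the outer sum runs over every observation, each broken same-entity pair contributes twice, once from each side. With these stand-ins, splitting entity "a" across two clusters costs more than keeping it together, which is the trade-off the penalty term is meant to express.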
slide-23
SLIDE 23

Euclidean distance: distance in the description space.

Temporal-aware dissimilarity measure: distance in both the description space and the temporal space.

$$\|x_i - x_j\|_{TE} = 1 - \left(1 - \frac{\|x_i^d - x_j^d\|^2}{\Delta x_{max}^2}\right)\left(1 - \frac{|x_i^t - x_j^t|^2}{\Delta t_{max}^2}\right)$$

Properties:

$$\|x_i - x_j\|_{TE} \in [0, 1], \quad \forall x_i, x_j \in X$$

$$\|x_i - x_j\|_{TE} = 0 \Leftrightarrow x_i^d = x_j^d \wedge x_i^t = x_j^t$$

$$\|x_i - x_j\|_{TE} = 1 \Leftrightarrow \|x_i^d - x_j^d\| = \Delta x_{max} \vee |x_i^t - x_j^t| = \Delta t_{max}$$
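The measure and its boundary properties can be checked numerically. The following is a minimal sketch, assuming observations are passed as a feature tuple plus a timestamp and that the denominators are squared (consistent with the properties); the function name, argument layout and toy values are illustrative choices, not taken from the presentation.

```python
import math
from typing import Sequence


def temporal_aware_dissimilarity(
    xd_i: Sequence[float], t_i: float,
    xd_j: Sequence[float], t_j: float,
    dx_max: float, dt_max: float,
) -> float:
    """||x_i - x_j||_TE for observations given as (features, timestamp).

    dx_max and dt_max are the largest descriptive and temporal distances
    in the dataset; they must be positive.
    """
    d_desc_sq = sum((a - b) ** 2 for a, b in zip(xd_i, xd_j))
    d_time_sq = (t_i - t_j) ** 2
    return 1.0 - (1.0 - d_desc_sq / dx_max ** 2) * (1.0 - d_time_sq / dt_max ** 2)


# Toy values (made up): features in the unit square, a time span of 10.
DX_MAX, DT_MAX = math.sqrt(2.0), 10.0
same = temporal_aware_dissimilarity((0.0, 0.0), 3.0, (0.0, 0.0), 3.0, DX_MAX, DT_MAX)
far_in_time = temporal_aware_dissimilarity((0.0, 0.0), 0.0, (0.0, 0.0), 10.0, DX_MAX, DT_MAX)
halfway = temporal_aware_dissimilarity((0.0, 0.0), 0.0, (1.0, 0.0), 5.0, DX_MAX, DT_MAX)
```

Identical observations give 0, while two observations with identical descriptions but the maximal time gap give 1: being maximally distant in either space is enough to reach the maximum dissimilarity.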

slide-27
SLIDE 27

Semi-supervised clustering [Wagstaff & Cardie '00]: pair-wise constraints; a penalty is applied when constraints are broken.

Segmentation contiguity: soft MUST-LINK pair-wise constraints, weighted by a time-dependent Contiguity Penalty Function.

Contiguity Penalty Function:

$$w(x_i, x_j) = \beta \, e^{-\frac{1}{2}\left(\frac{|x_i^t - x_j^t|}{\delta}\right)^2} \quad \text{for } x_i^{\phi} = x_j^{\phi}$$
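The penalty is a Gaussian-shaped function of the time gap between two observations of the same entity: it is largest for simultaneous observations and fades as they move apart. A minimal sketch follows; the function name and the timestamps are made up for illustration, and the β and δ values are the ones reported for the experiments in this presentation.

```python
import math


def contiguity_penalty(t_i: float, t_k: float, beta: float, delta: float) -> float:
    """w(x_i, x_k) for two observations of the same entity.

    beta scales the penalty; delta controls how quickly it fades as the
    two observations get further apart in time.
    """
    return beta * math.exp(-0.5 * ((abs(t_i - t_k) / delta) ** 2))


# beta = 0.003 and delta = 3 as in the experiments; timestamps are hypothetical.
BETA, DELTA = 0.003, 3.0
w_same_time = contiguity_penalty(2000, 2000, BETA, DELTA)
w_one_step = contiguity_penalty(2000, 2001, BETA, DELTA)
w_ten_steps = contiguity_penalty(2000, 2010, BETA, DELTA)
```

Separating observations that are close in time is therefore penalized far more than separating observations that are many time steps apart, which is what makes the MUST-LINK constraints soft and time-dependent.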

slide-29
SLIDE 29

The TDCK-Means algorithm:

Inspired by K-Means. Iteratively recomputes the centroids and the assignments of observations to clusters. Uses the Temporal-Aware Dissimilarity Function and the Contiguity Penalty Function.

Centroids: $(\mu_j^t, \mu_j^d)$

Centroid update (weighted averages):

$$\mu_j^d = \frac{\sum_{x_i \in C_j} x_i^d \left(1 - \frac{|x_i^t - \mu_j^t|^2}{\Delta t_{max}^2}\right)}{\sum_{x_i \in C_j} \left(1 - \frac{|x_i^t - \mu_j^t|^2}{\Delta t_{max}^2}\right)}$$

$$\mu_j^t = \frac{\sum_{x_i \in C_j} x_i^t \left(1 - \frac{\|x_i^d - \mu_j^d\|^2}{\Delta x_{max}^2}\right)}{\sum_{x_i \in C_j} \left(1 - \frac{\|x_i^d - \mu_j^d\|^2}{\Delta x_{max}^2}\right)}$$
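The two update formulas are coupled: the descriptive part of the centroid is weighted by temporal closeness, and the temporal part by descriptive closeness. The sketch below shows one possible update step, under the assumption that both weights are computed from the centroid of the previous iteration; the function name, the `Point` alias and the toy cluster are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, Sequence[float]]  # (timestamp t, descriptive features x^d)


def update_centroid(
    members: List[Point],
    mu_t: float, mu_d: Sequence[float],
    dx_max: float, dt_max: float,
) -> Tuple[float, List[float]]:
    """One update of a cluster centroid (mu_t, mu_d) from its members.

    The new descriptive part is an average weighted by temporal closeness
    to the previous centroid; the new temporal part is weighted by
    descriptive closeness. Assumes the weights do not sum to zero.
    """
    w_time = [1.0 - (t - mu_t) ** 2 / dt_max ** 2 for t, _ in members]
    w_desc = [
        1.0 - sum((a - b) ** 2 for a, b in zip(xd, mu_d)) / dx_max ** 2
        for _, xd in members
    ]
    dim = len(mu_d)
    new_mu_d = [
        sum(w * xd[c] for w, (_, xd) in zip(w_time, members)) / sum(w_time)
        for c in range(dim)
    ]
    new_mu_t = sum(w * t for w, (t, _) in zip(w_desc, members)) / sum(w_desc)
    return new_mu_t, new_mu_d


# Toy cluster (made up): three one-dimensional observations.
cluster = [(0.0, [0.0]), (1.0, [0.0]), (2.0, [1.0])]
new_t, new_d = update_centroid(cluster, mu_t=1.0, mu_d=[0.0], dx_max=2.0, dt_max=2.0)
```

In the toy cluster, the observation at t = 2 differs descriptively from the previous centroid, so it pulls the temporal centre less than the other two members do.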

slide-32
SLIDE 32

Partition evaluation measures

  • descriptive coherence of clusters: variance (MDvar);
  • temporal coherence of clusters: variance (Tvar);
  • continuous segmentation of observations belonging to an entity: Shannon entropy.

Shannon entropy ignores the order of the assignments, so an alternating sequence such as A – B – A – B scores the same as A – A – B – B. Proposal: correct the Shannon entropy to penalize changes (ShaP):

$$ShaP = \sum_{x_i \in X} \sum_{j=1}^{k} \left( -p(\mu_j) \log_2\big(p(\mu_j)\big) \left(1 + \frac{n_{ch} - n_{min}}{n_{obs} - 1}\right) \right)$$
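The effect of the correction can be seen on a single entity. The sketch below is a simplified, per-entity illustration and rests on assumptions the slide does not spell out: p(μ_j) is taken as the share of the entity's observations assigned to cluster j, n_ch as the number of cluster changes along the time-ordered sequence, n_min as the fewest changes needed to visit every cluster that appears, and n_obs as the number of observations. The function name is hypothetical.

```python
import math
from collections import Counter
from typing import Sequence


def penalized_entropy(sequence: Sequence[str]) -> float:
    """Entropy of one entity's cluster sequence, inflated by extra changes.

    `sequence` lists the cluster of each observation in time order.
    """
    n_obs = len(sequence)
    if n_obs < 2:
        return 0.0  # a single observation has no entropy and no changes
    counts = Counter(sequence)
    entropy = sum(-(c / n_obs) * math.log2(c / n_obs) for c in counts.values())
    n_ch = sum(1 for a, b in zip(sequence, sequence[1:]) if a != b)
    n_min = len(counts) - 1  # assumed: fewest changes that visit every cluster
    return entropy * (1.0 + (n_ch - n_min) / (n_obs - 1))


contiguous = penalized_entropy(["A", "A", "B", "B"])
alternating = penalized_entropy(["A", "B", "A", "B"])
```

Both sequences have the same plain Shannon entropy (1 bit), but under these assumptions the alternating one is multiplied by 1 + (3 − 1)/(4 − 1) and therefore scores worse.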

slide-38
SLIDE 38

Dataset: Compared Political Dataset I (23 countries, 60 years, 207 political, demographic, social and economic variables).

Execution: TDCK-Means (8 clusters, β = 0.003 and δ = 3).

slide-39
SLIDE 39

Quantitative evaluation

5 algorithms:

  • K-Means [MacQueen '67];
  • tcK-Means [Lin and Hauptmann '10];
  • Temporal-Driven K-Means (uses the Temporal-Aware Measure);
  • Constrained K-Means (uses the Contiguity Penalty Function);
  • TDCK-Means (combines the two above).

3 measures:

  • MDvar
  • Tvar
  • ShaP


slide-42
SLIDE 42

Conclusion:

  • Studied the detection of typical evolutions starting from a collection of observations corresponding to entities;
  • Proposed a new Temporal-Aware Measure;
  • Proposed a new Contiguity Penalty Function;
  • Proposed a new algorithm for detecting evolutions: TDCK-Means;
  • Other applications: political careers, life trajectories, etc.
slide-43
SLIDE 43

Perspectives:

  • Generating the evolution graph;
  • Automatic description of the generated evolution phases (clusters);
  • Flexibly configuring the ratio between the descriptive component and the temporal component in the dissimilarity measure.

slide-44
SLIDE 44

Thank you! Questions?

slide-45
SLIDE 45

Impact of parameters β and δ