Distance Measure for Querying Arrangements of Temporal Intervals, by Panagiotis Papapetrou.
Orestis Kostakis, Panagiotis Papapetrou, and Jaakko Hollmén, Department of ...
Motivation
Sign Language similarity search
Motivation
An expression in sign language consists of a set of event channels that are on or off over time. Each event is characterized by a label (e.g., eyebrow raise) and a duration, defined by a start point and an end point.
Figure: An example of a Wh-question expressed in sign language.
Motivation
Problem: How to assess the similarity of such representations?
Figure: Two examples, (a) and (b), of arrangements of event-intervals labeled A, B, and C.
Outline
Background, Method, Experiments, Conclusions, Discussion
Background: Definitions
Sequences of interval-based events allow the representation of a wide range of real-world sequences. Formally, an e-sequence is defined as an ordered set $S = \{S_1, \ldots, S_n\}$, where each $S_i = (E_i, t^i_{start}, t^i_{end})$ is called an event-interval and $E_i \in \sigma$.
Figure: An e-sequence drawn on a time axis, S = {(A, 1, 10), (B, 5, 13), (C, 17, 30), (A, 20, 26), (D, 24, 30)}.
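One way the e-sequence from the figure could be held in code is sketched below. The `EventInterval` type and its field names are illustrative assumptions, not something the slides define.

```python
from typing import NamedTuple


class EventInterval(NamedTuple):
    """One event-interval (E_i, t_start, t_end); the name is hypothetical."""
    label: str
    start: int
    end: int


# The e-sequence S given in the figure caption.
S = [
    EventInterval("A", 1, 10),
    EventInterval("B", 5, 13),
    EventInterval("C", 17, 30),
    EventInterval("A", 20, 26),
    EventInterval("D", 24, 30),
]

# Keep the intervals ordered by start time (ties broken by end time),
# and collect the alphabet sigma of event labels that actually occur.
S_sorted = sorted(S, key=lambda e: (e.start, e.end))
alphabet = sorted({e.label for e in S})
```

Note that the label A occurs twice, so an e-sequence is not a mapping from labels to intervals.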
Background: Related Work
Existing work (e.g., Papapetrou et al. 2009, Moerchen 2010, Hoeppner 2001) focuses mainly on mining frequent patterns of interval-based events, mining association rules involving interval-based events, and mining semi-interval partial order events. So far, no robust distance or similarity measure has been formulated for such sequences.
Problem: Example
Problem: how to assess the similarity of two e-sequences?
Figure: How similar are these two e-sequences, (a) and (b)?
Problem: Formulation
Problem Formulation: Given two e-sequences S and T, define a distance measure D such that, for all S and T:
$D(S, T) \geq 0$ (1)
$D(S, S) = 0$ (2)
$D(S, T) = D(T, S)$ (3)
The degree to which the two e-sequences differ should be reflected in the value of D(S, T) and should be in accordance with the knowledge obtained from domain experts.
Problem: Solutions
Problem: how to assess the similarity of two e-sequences?
Figure: The two e-sequences (a) and (b) being compared.
Some options: map them to traditional sequences of instantaneous events?
Problem: Solutions
Sequences of instantaneous events do not capture all the important information.
Figure: Two different arrangements of two event-intervals labeled A.
Transforming both of the above e-sequences into sequences of instantaneous events yields the same result: $A_{start}, A_{start}, A_{end}, A_{end}$.
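The loss of information can be reproduced in a few lines. The sketch below assumes that the two arrangements are an overlap and a containment of two A intervals (the slide's figure is not recoverable, so the concrete endpoints are made up), and that a start point sorts before an end point at the same time stamp.

```python
def to_instantaneous(eseq):
    """Flatten (label, start, end) intervals into a time-ordered list of endpoint events."""
    points = []
    for label, start, end in eseq:
        # The middle element breaks ties: starts (0) sort before ends (1).
        points.append((start, 0, label + "_start"))
        points.append((end, 1, label + "_end"))
    points.sort()
    return [name for _, _, name in points]


# Hypothetical endpoints: the second A starts inside the first and outlives it.
overlapping = [("A", 1, 5), ("A", 3, 8)]
# Hypothetical endpoints: the second A lies entirely inside the first.
containing = [("A", 1, 8), ("A", 3, 5)]

same = to_instantaneous(overlapping) == to_instantaneous(containing)
```

Here `same` is true: once the link between a start point and its end point is dropped, the overlap and the containment become indistinguishable.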
Problem: Solutions
Problem: how to assess the similarity of two e-sequences?
Figure: The two e-sequences (a) and (b) being compared.
Some options:
map them to traditional sequences of instantaneous events? ×
compare event-labels? √
compare event-interval relations? √
Problem: Solutions
Problem: how to assess the similarity of two e-sequences?
Figure: The two e-sequences (a) and (b) being compared.
What about event durations? For simplicity, we ignore them.
Arrangement: an e-sequence from which the start and end “tags” are dropped [Papapetrou et al. 2009].
Method: Key Idea
Our approach: Focus on the relations between pairs of intervals.
Figure: The relations between two event-intervals A and B: Follow(A,B), Meet(A,B), Overlap(A,B), Match(A,B), Right Contain(A,B), Left Contain(A,B), and Contain(A,B). Allen’s temporal model [Allen et al. 1983].
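The slides name the seven relations but do not spell out their endpoint conditions. The sketch below assumes the usual ones (for example, Meet when A ends exactly where B starts) and exact equality of endpoints, with no tolerance. It also assumes A is the interval that starts first, or the longer one when both start together.

```python
def relation(a, b):
    """Classify the relation of interval a to interval b, each given as (start, end).

    Assumes canonical order: a starts before b, or both start together and
    a ends no earlier than b. The endpoint conditions are assumptions.
    """
    (sa, ea), (sb, eb) = a, b
    if sa > sb or (sa == sb and ea < eb):
        raise ValueError("intervals are not in canonical order")
    if sa == sb:
        return "match" if ea == eb else "left-contain"
    if ea < sb:
        return "follow"
    if ea == sb:
        return "meet"
    if ea < eb:
        return "overlap"
    return "right-contain" if ea == eb else "contain"


examples = {
    "follow": relation((1, 3), (5, 8)),
    "meet": relation((1, 5), (5, 8)),
    "overlap": relation((1, 5), (3, 8)),
    "match": relation((1, 5), (1, 5)),
    "right-contain": relation((1, 8), (3, 8)),
    "left-contain": relation((1, 8), (1, 5)),
    "contain": relation((1, 8), (3, 5)),
}
```

Each key in `examples` equals its value, so the seven branches cover the seven relations exactly once for canonically ordered pairs.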
Method: Relation Matrix
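The solution: given an event-interval sequence S, create the relation matrix $M_A$ of its arrangement A. Rows correspond to relations and columns to ordered pairs of event labels.
Columns: {A,A}, {A,B}, {B,A}, {B,B}
Rows with their non-empty entries: meet: 1; match: 1; overlap: 1, 2, 1; contain, left-contain, right-contain, follow: none.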
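A relation matrix of this shape could be built as sketched below. This is an illustration under assumptions rather than the authors' implementation: each entry counts how often a relation holds for an ordered label pair, intervals are paired after sorting by start time (longer first on ties), and the endpoint conditions in `relation` are the assumed ones.

```python
from itertools import combinations

RELATIONS = ["meet", "match", "overlap", "contain",
             "left-contain", "right-contain", "follow"]


def relation(a, b):
    """Relation of (start, end) interval a to b, with a first in canonical order."""
    (sa, ea), (sb, eb) = a, b
    if sa == sb:
        return "match" if ea == eb else "left-contain"
    if ea < sb:
        return "follow"
    if ea == sb:
        return "meet"
    if ea < eb:
        return "overlap"
    return "right-contain" if ea == eb else "contain"


def relation_matrix(eseq, labels):
    """Return a |I| x |sigma|^2 matrix of relation counts as a list of rows."""
    pairs = [(x, y) for x in labels for y in labels]
    col = {pair: j for j, pair in enumerate(pairs)}
    row = {name: i for i, name in enumerate(RELATIONS)}
    matrix = [[0] * len(pairs) for _ in RELATIONS]
    # Canonical order: earlier start first, longer interval first on ties.
    ordered = sorted(eseq, key=lambda e: (e[1], -e[2]))
    for (la, sa, ea), (lb, sb, eb) in combinations(ordered, 2):
        name = relation((sa, ea), (sb, eb))
        matrix[row[name]][col[(la, lb)]] += 1
    return matrix


labels = ["A", "B", "C", "D"]
S = [("A", 1, 10), ("B", 5, 13), ("C", 17, 30), ("A", 20, 26), ("D", 24, 30)]
M = relation_matrix(S, labels)
```

For this S, which has five intervals, the entries of `M` sum to 10, one count per pair of intervals.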
Method: Distance
Arrangement Distance
\[ \delta_p(A, B) = \left( \sum_{i=1}^{|I|} \sum_{j=1}^{|\sigma|^2} |M_A(i, j) - M_B(i, j)|^p \right)^{1/p}, \quad p \in \mathbb{N}^* \tag{4} \]
Question: What would be a suitable value for p?
Manhattan Distance: For p = 1, Eq. 4 corresponds to the Manhattan distance.
\[ \delta_1(A, B) = \sum_{i=1}^{|I|} \sum_{j=1}^{|\sigma|^2} |M_A(i, j) - M_B(i, j)| \tag{5} \]
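The arrangement distance translates directly into code. The sketch below assumes both relation matrices are plain lists of rows with identical shape; the function name and the toy matrices are illustrative only.

```python
def arrangement_distance(m_a, m_b, p=1):
    """Entry-wise p-norm of the difference between two relation matrices."""
    if p < 1:
        raise ValueError("p must be a positive integer")
    total = sum(
        abs(x - y) ** p
        for row_a, row_b in zip(m_a, m_b)
        for x, y in zip(row_a, row_b)
    )
    return total ** (1.0 / p)


# Toy 2 x 2 matrices (made-up counts, not taken from the slides).
M_A = [[1, 0], [2, 1]]
M_B = [[0, 0], [0, 1]]

manhattan = arrangement_distance(M_A, M_B, p=1)  # |1| + |0| + |2| + |0|
```

With these toy matrices, `manhattan` evaluates to 3.0.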
Method: Distance
Arrangement Distance
\[ \delta_p(A, B) = \left( \sum_{i=1}^{|I|} \sum_{j=1}^{|\sigma|^2} |M_A(i, j) - M_B(i, j)|^p \right)^{1/p}, \quad p \in \mathbb{N}^* \tag{6} \]
Question: What would be a suitable value for p?
Frobenius Norm: For p = 2, Eq. 6 corresponds to the Frobenius norm of $M_A - M_B$:
\[ \delta_2(A, B) = \sqrt{ \sum_{i=1}^{|I|} \sum_{j=1}^{|\sigma|^2} |M_A(i, j) - M_B(i, j)|^2 } \tag{7} \]
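As a small worked illustration (the numbers are made up, not taken from the experiments), suppose $M_A - M_B$ has only three non-zero entries: 1, -2, and 1. Then the two choices of p give:

```latex
\[
\delta_1(A, B) = |1| + |-2| + |1| = 4,
\qquad
\delta_2(A, B) = \sqrt{1^2 + (-2)^2 + 1^2} = \sqrt{6} \approx 2.45
\]
```

The entry of magnitude 2 contributes half of the Manhattan sum but two thirds of the sum under the square root, so p = 2 weights a single large disagreement more heavily than p = 1 does.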
Method: Distance
Normalized arrangement distance
\[ \delta_{norm}(A, B) = \sum_{i=1}^{|I|} \sum_{j=1}^{|\sigma|^2} \frac{|M_A(i, j) - M_B(i, j)|}{M_A(i, j) + M_B(i, j)} \tag{8} \]
Based on the L1 norm.
Normalized over the total possible number of relations where A and B can differ.
Non-metric: $\delta_{norm}(A, B) = 0$ iff A = B (identity of the indiscernibles) is violated.
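A sketch of the normalized distance follows, reading the formula as a sum of per-entry ratios. The slides do not say what happens when both entries are zero; skipping such entries (treating 0/0 as 0) is an assumption made here, as are the function name and the toy matrices.

```python
def normalized_distance(m_a, m_b):
    """Sum over entries of |a - b| / (a + b), skipping entries where a + b == 0."""
    total = 0.0
    for row_a, row_b in zip(m_a, m_b):
        for x, y in zip(row_a, row_b):
            if x + y > 0:  # assumption: an entry absent from both contributes nothing
                total += abs(x - y) / (x + y)
    return total


# Toy matrices (made-up counts): the terms are 1/1, skipped, 2/2, and 0/2.
M_A = [[1, 0], [2, 1]]
M_B = [[0, 0], [0, 1]]
d_norm = normalized_distance(M_A, M_B)
```

Each term lies between 0 and 1, so a relation present in only one arrangement contributes a full unit regardless of its count.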
Experiments: ASL Dataset
SignStream Database: by the National Center for Sign Language and Gesture Resources at Boston University. # of e-sequences: 873. # of intervals: 15675. Min size: 4. Max size: 41. Average size: 18. Labels: 216. Classes: 5.
Experiments: Setup
We tested robustness against artificial noise and classification accuracy.
Artificial noise:
shift probability s: each event-interval in S is shifted with probability s.
distortion level d: the start point of each such event-interval is shifted by ±d% of |S|.
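The noise model could be simulated roughly as below. Several details are assumptions, since the slides leave them open: |S| is taken to be the time span of the e-sequence, d is passed as a fraction (0.5 for 50%), the sign of the shift is chosen uniformly at random, and the whole interval moves so that its duration is preserved.

```python
import random


def add_noise(eseq, s, d, seed=None):
    """Shift each (label, start, end) interval with probability s by +/- d * span."""
    rng = random.Random(seed)
    span = max(end for _, _, end in eseq) - min(start for _, start, _ in eseq)
    offset = d * span
    noisy = []
    for label, start, end in eseq:
        if rng.random() < s:
            # Assumption: direction of the shift is a fair coin flip.
            shift = offset if rng.random() < 0.5 else -offset
            start, end = start + shift, end + shift
        noisy.append((label, start, end))
    return noisy


S = [("A", 1, 10), ("B", 5, 13), ("C", 17, 30), ("A", 20, 26), ("D", 24, 30)]
untouched = add_noise(S, s=0.0, d=0.5, seed=1)  # s = 0 leaves S unchanged
all_shifted = add_noise(S, s=1.0, d=0.1, seed=1)  # every interval moves by 2.9
```

Because intervals move independently, the relations between pairs of intervals can change, which is what the robustness experiments probe.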
Experiments: Robustness
Robustness to noise: we compared the Normalized, the Manhattan, and the Frobenius distance in terms of:
A. nearest neighbor retrieval accuracy: the fraction of noisy queries for which the originating sequence is retrieved.
B. rank of nearest neighbor: for each query, the number of database sequences with distance less than or equal to that of the originating counterpart.
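Both measures can be computed with any distance function. In the sketch below the function names are illustrative, a query counts as retrieved only when its rank is 1 (so ties count as misses, which is an assumption), and plain numbers with absolute difference stand in for e-sequences and the arrangement distance.

```python
def rank_of_origin(query, origin_index, database, dist):
    """Number of database items at distance <= that of the query's origin."""
    d_origin = dist(query, database[origin_index])
    return sum(1 for item in database if dist(query, item) <= d_origin)


def retrieval_accuracy(queries, database, dist):
    """Fraction of (origin_index, noisy_query) pairs whose origin has rank 1."""
    hits = sum(
        1 for origin_index, query in queries
        if rank_of_origin(query, origin_index, database, dist) == 1
    )
    return hits / len(queries)


def toy_dist(x, y):
    # Stand-in distance for the toy data below.
    return abs(x - y)


database = [0.0, 10.0, 20.0]
queries = [(0, 1.0), (1, 14.0), (2, 14.0)]
accuracy = retrieval_accuracy(queries, database, toy_dist)
```

In this toy run the third query lies closer to 10.0 than to its origin 20.0, so its rank is 2 and two of the three queries are retrieved.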
Experiments: Robustness
Figure: Retrieval accuracy: success ratio of matching the noisy sequences to their original counterpart. Panels: (a) Manhattan, (b) Normalized. Each panel plots retrieval accuracy against distortion (0.1 to 0.5), with one curve per shift probability (0.2, 0.4, 0.6, 0.8, 1.0); the accuracy axis spans 0.5 to 1 in (a) and 0.95 to 1 in (b).
Experiments: Robustness
Figure: Comparison of the cumulative histograms for the rank of the 1-NN for each distance measure (Normalized, Frobenius, Manhattan). Ranks are denoted as a ratio of the database size. Panels: (a) probability 0.6, distortion 50%; (b) probability 1.0, distortion 50%. Axes: rank of NN as a ratio (0.02 to 0.1) against database ratio (0.65 to 1).
Experiments: 1-NN Classification Accuracy
1-NN classification accuracy: the fraction of e-sequences whose class is the same as that of their nearest neighbor (1-NN).
Data: # of classes: 5. # of e-sequences: 873.
1-NN classification accuracy ≈ 88%.
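The accuracy defined above could be computed as sketched here, assuming a leave-one-out protocol in which each e-sequence looks for its nearest neighbor among all the others. The slides do not state the protocol, and the toy data and names below are illustrative.

```python
def one_nn_accuracy(items, labels, dist):
    """Fraction of items whose nearest other item carries the same class label."""
    correct = 0
    for i, x in enumerate(items):
        others = (j for j in range(len(items)) if j != i)
        nearest = min(others, key=lambda j: dist(x, items[j]))
        if labels[nearest] == labels[i]:
            correct += 1
    return correct / len(items)


def toy_dist(x, y):
    # Stand-in for an arrangement distance between two e-sequences.
    return abs(x - y)


items = [0.0, 1.0, 10.0, 11.0, 4.0]
labels = ["a", "a", "b", "b", "b"]
accuracy = one_nn_accuracy(items, labels, toy_dist)
```

Only the last item is misclassified, because its nearest neighbor 1.0 belongs to class "a", so four of the five items are classified correctly.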
Conclusions
Reduced a problem related to assistive environments to the problem of comparing temporal interval sequences.
Proposed a distance measure for temporal interval sequences, which makes it possible to quantify similarity among sequences.
The distance measure relies on creating relation matrices and comparing the matrices.
Experimented with three methods to compare sequences.
One of the methods proved very robust against artificial noise.
Directions for Future work
Formulate more robust distance metrics (ongoing work).
Examine the applicability of temporal interval sequences in other domains.