Distance Measure for Querying Arrangements of Temporal Intervals

Distance Measure for Querying Arrangements of Temporal Intervals, by Panagiotis Papapetrou.
Orestis Kostakis, Panagiotis Papapetrou, and Jaakko Hollmén
Department of Information and Computer Science, Aalto University
May 27, 2011

Motivation
Sign language similarity search

Motivation
An expression in sign language contains a set of event channels that are on or off over time. Each event is characterized by:
- a label, e.g., eyebrow raise;
- a duration, defined by a start point and an end point.
Figure: An example of a Wh-question expressed in sign language.

Motivation
Problem: how can the similarity of such representations be assessed?
Figure: Two examples, (a) and (b), each an arrangement of event-intervals labeled A, B, and C.

Outline
- Background
- Method
- Experiments
- Conclusions
- Discussion

Background: Definitions
Sequences of interval-based events can represent a wide range of real-world sequences.
Formally, an e-sequence is defined as an ordered set $S = \{S_1, \dots, S_n\}$, where each $S_i = (E_i, t^i_{start}, t^i_{end})$ is called an event-interval and $E_i \in \sigma$, the alphabet of event labels.
Figure: $S = \{(A, 1, 10), (B, 5, 13), (C, 17, 30), (A, 20, 26), (D, 24, 30)\}$, drawn as labeled intervals on a time axis.
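The definition above can be made concrete with a small data structure. This is a minimal sketch, not the authors' implementation: the names `EventInterval` and `make_esequence` are hypothetical, and the ordering rule (by start, then end, then label) is an assumption, since the slide only says the set is ordered.

```python
from typing import NamedTuple


class EventInterval(NamedTuple):
    """One event-interval S_i = (E_i, t_start, t_end)."""
    label: str
    start: int
    end: int


def make_esequence(intervals):
    """Build an e-sequence from (label, start, end) triples.

    Assumption: intervals are ordered by start, then end, then label.
    """
    seq = [EventInterval(*iv) for iv in intervals]
    for iv in seq:
        if iv.start > iv.end:
            raise ValueError(f"interval ends before it starts: {iv}")
    return sorted(seq, key=lambda iv: (iv.start, iv.end, iv.label))


# The example e-sequence from the figure caption, given in arbitrary order.
S = make_esequence([
    ("C", 17, 30), ("A", 1, 10), ("B", 5, 13), ("D", 24, 30), ("A", 20, 26),
])
print([tuple(iv) for iv in S])
```

Under the assumed ordering, the intervals come out in the same order as in the figure caption.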

Background: Related Work
Existing work (e.g., Papapetrou et al. 2009, Moerchen 2010, Hoeppner 2001) focuses mainly on:
- mining frequent patterns of interval-based events;
- mining association rules involving interval-based events;
- mining semi-interval partial order events.
So far, no robust distance or similarity measure has been formulated for such sequences.

Problem: Example
Problem: how to assess the similarity of two e-sequences?
Figure: How similar are these two e-sequences, (a) and (b), each with event-intervals labeled A, B, and C?

Problem: Formulation
Given two e-sequences $S$ and $T$, define a distance measure $D$ such that, for all $S$, $T$:
$D(S, T) \geq 0$ (1)
$D(S, S) = 0$ (2)
$D(S, T) = D(T, S)$ (3)
The degree to which the two e-sequences differ should be reflected in the value of $D(S, T)$ and should be in accordance with the knowledge obtained from domain experts.

Problem: Solutions
Problem: how to assess the similarity of two e-sequences?
Figure: Two e-sequences, (a) and (b), each with event-intervals labeled A, B, and C.
Some options:
- map them to traditional sequences of instantaneous events?

Problem: Solutions
Sequences of instantaneous events do not capture all the important information.
Figure: Two different arrangements of two event-intervals labeled A.
Transforming either of the above sequences to a sequence of instantaneous events yields the same result: $A_{start}, A_{start}, A_{end}, A_{end}$.
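The loss of information can be checked directly. The sketch below is illustrative only: the function name `endpoint_sequence` and the concrete interval values are made up for this example, with one arrangement where two A intervals overlap and one where the first contains the second.

```python
def endpoint_sequence(intervals):
    """Flatten (label, start, end) intervals into a time-ordered list of endpoint tags."""
    events = []
    for label, start, end in intervals:
        # The middle field makes a start sort before an end at the same time point.
        events.append((start, 0, f"{label}_start"))
        events.append((end, 1, f"{label}_end"))
    events.sort()
    return [tag for _, _, tag in events]


overlapping = [("A", 1, 5), ("A", 3, 8)]  # second A starts inside the first, ends after it
containing = [("A", 1, 8), ("A", 3, 5)]   # second A lies entirely inside the first

print(endpoint_sequence(overlapping))
print(endpoint_sequence(containing))
```

Both calls yield the same tag sequence, although the two arrangements differ in the relation between the intervals.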

Problem: Solutions
Problem: how to assess the similarity of two e-sequences?
Figure: Two e-sequences, (a) and (b), each with event-intervals labeled A, B, and C.
Some options:
- map them to traditional sequences of instantaneous events? ×
- compare event labels? √
- compare event-interval relations? √

Problem: Solutions
Problem: how to assess the similarity of two e-sequences?
Figure: Two e-sequences, (a) and (b), each with event-intervals labeled A, B, and C.
- What about event durations? For simplicity, we ignore them.
- Arrangement: an e-sequence where the start and end "tags" are dropped [Papapetrou et al. 2009].

Method: Key Idea
Our approach: focus on the relations between pairs of intervals.
Figure: Allen's temporal model [Allen et al. 1983], illustrating seven relations between two intervals A and B: Follow(A,B), Meet(A,B), Overlap(A,B), Match(A,B), Left-Contain(A,B), Right-Contain(A,B), and Contain(A,B).
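A pairwise relation can be decided from the interval endpoints alone. The sketch below is a hedged illustration: the slide shows the seven relations only as pictures, so the endpoint conditions used here are an assumption based on the usual reading of those names, and the function name `relation` is hypothetical. It expects A to come first (earlier start, or equal start and an end no earlier than B's) and every interval to have start < end.

```python
def relation(a, b):
    """Name the relation between intervals a=(start, end) and b=(start, end).

    Assumed endpoint conditions (not spelled out on the slide):
      match          same start, same end
      left-contain   same start, a ends after b
      follow         a ends before b starts
      meet           a ends exactly where b starts
      overlap        b starts inside a and ends after a
      right-contain  b starts inside a, same end
      contain        b lies strictly inside a
    """
    (a_start, a_end), (b_start, b_end) = a, b
    if a_start > b_start or (a_start == b_start and a_end < b_end):
        raise ValueError("a must precede b; swap the arguments")
    if a_start == b_start:
        return "match" if a_end == b_end else "left-contain"
    if a_end < b_start:
        return "follow"
    if a_end == b_start:
        return "meet"
    if a_end < b_end:
        return "overlap"
    if a_end == b_end:
        return "right-contain"
    return "contain"


print(relation((1, 6), (4, 9)))
print(relation((1, 9), (3, 6)))
```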

Method: Relation Matrix
The solution: given an event-interval sequence S, create its relation matrix $M_A$.
relation        {A,A}  {A,B}  {B,A}  {B,B}
meet              0      1      0      0
match             0      0      1      0
overlap           1      2      0      1
contain           0      0      0      0
left-contain      0      0      0      0
right-contain     0      0      0      0
follow            0      0      0      0
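One way to build such a matrix is to count, for every ordered pair of labels, how often each relation occurs. This is a sketch under stated assumptions, not the authors' code: the names `relation` and `relation_matrix` are hypothetical, the endpoint conditions for the seven relations are assumed, and the matrix is stored sparsely as a `Counter` instead of a dense table. The input is the example e-sequence from the Definitions slide, not the sequence behind the table above (which the slide does not give).

```python
from collections import Counter
from itertools import combinations


def relation(a_start, a_end, b_start, b_end):
    """Relation between two intervals, assuming a precedes b (see sort key below)."""
    if a_start == b_start:
        return "match" if a_end == b_end else "left-contain"
    if a_end < b_start:
        return "follow"
    if a_end == b_start:
        return "meet"
    if a_end < b_end:
        return "overlap"
    if a_end == b_end:
        return "right-contain"
    return "contain"


def relation_matrix(intervals):
    """Count relations per ordered label pair; keys are ((label_a, label_b), relation)."""
    # Earlier start first; on equal starts the longer interval first, so that
    # the first element of every pair is the "left" interval of the relation.
    ordered = sorted(intervals, key=lambda iv: (iv[1], -iv[2], iv[0]))
    counts = Counter()
    for (la, sa, ea), (lb, sb, eb) in combinations(ordered, 2):
        counts[((la, lb), relation(sa, ea, sb, eb))] += 1
    return counts


S = [("A", 1, 10), ("B", 5, 13), ("C", 17, 30), ("A", 20, 26), ("D", 24, 30)]
M = relation_matrix(S)
for (pair, rel), count in sorted(M.items()):
    print(pair, rel, count)
```

Cells that never occur are simply absent from the `Counter` and read as zero, which keeps the structure small when the alphabet is large.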

Method: Distance
Arrangement distance:
$$\delta_p(A, B) = \left( \sum_{i=1}^{|\sigma|^2} \sum_{j=1}^{|I|} |M_A(i,j) - M_B(i,j)|^p \right)^{1/p}, \quad p \in \mathbb{N}^* \qquad (4)$$
Question: what would be a suitable value for p?
Manhattan distance: for p = 1, Eq. 4 corresponds to the Manhattan distance:
$$\delta_1(A, B) = \sum_{i=1}^{|\sigma|^2} \sum_{j=1}^{|I|} |M_A(i,j) - M_B(i,j)| \qquad (5)$$

Method: Distance
Arrangement distance:
$$\delta_p(A, B) = \left( \sum_{i=1}^{|\sigma|^2} \sum_{j=1}^{|I|} |M_A(i,j) - M_B(i,j)|^p \right)^{1/p}, \quad p \in \mathbb{N}^* \qquad (6)$$
Question: what would be a suitable value for p?
Frobenius norm: for p = 2, Eq. 6 corresponds to the Frobenius norm of $M_A - M_B$:
$$\delta_2(A, B) = \sqrt{\sum_{i=1}^{|\sigma|^2} \sum_{j=1}^{|I|} |M_A(i,j) - M_B(i,j)|^2} \qquad (7)$$
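The arrangement distance for a given p can be sketched in a few lines. This is an illustration only: relation matrices are assumed to be stored as dictionaries mapping a (label pair, relation) cell to its count, with absent cells read as zero, and the function name `arrangement_distance` and the toy counts are hypothetical.

```python
def arrangement_distance(m_a, m_b, p=1):
    """Entrywise p-norm of M_A - M_B; p=1 is Manhattan, p=2 is Frobenius."""
    if p < 1:
        raise ValueError("p must be a positive integer")
    cells = set(m_a) | set(m_b)  # cells missing from a matrix count as 0
    total = sum(abs(m_a.get(c, 0) - m_b.get(c, 0)) ** p for c in cells)
    return total ** (1.0 / p)


# Toy relation matrices (made-up counts).
m_a = {(("A", "B"), "overlap"): 2, (("A", "A"), "meet"): 1}
m_b = {(("A", "B"), "overlap"): 1, (("B", "A"), "follow"): 2}

print(arrangement_distance(m_a, m_b, p=1))  # Manhattan
print(arrangement_distance(m_a, m_b, p=2))  # Frobenius
```

In this toy case the cell differences are 1, 1, and 2, so the Manhattan distance is their sum and the Frobenius distance is the square root of the sum of their squares.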

Method: Distance
Normalized arrangement distance:
$$\delta_{norm}(A, B) = \sum_{i=1}^{|\sigma|^2} \sum_{j=1}^{|I|} \frac{|M_A(i,j) - M_B(i,j)|}{M_A(i,j) + M_B(i,j)} \qquad (8)$$
- based on the $L_1$ norm;
- normalized over the total possible number of relations in which A and B can differ;
- non-metric: the identity of indiscernibles, $\delta_{norm}(A, B) = 0$ iff $A = B$, is violated.
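A possible reading of Eq. 8 in code follows. Two assumptions are made and should be checked against the paper: each cell's difference is divided by that cell's own sum, and cells that are zero in both matrices contribute nothing (the slide does not say how 0/0 is handled). The function name `normalized_distance` and the toy counts are hypothetical.

```python
def normalized_distance(m_a, m_b):
    """Sum over cells of |M_A - M_B| / (M_A + M_B).

    Assumption: a cell that is 0 in both matrices is skipped, i.e. 0/0 counts as 0.
    """
    total = 0.0
    for cell in set(m_a) | set(m_b):
        a, b = m_a.get(cell, 0), m_b.get(cell, 0)
        if a + b > 0:
            total += abs(a - b) / (a + b)
    return total


# Toy relation matrices (made-up counts), stored as sparse dictionaries.
m_a = {(("A", "B"), "overlap"): 2, (("A", "A"), "meet"): 1}
m_b = {(("A", "B"), "overlap"): 1, (("B", "A"), "follow"): 2}

print(normalized_distance(m_a, m_b))
```

Under this reading, a cell present in only one matrix always contributes exactly 1, regardless of its count, which is what distinguishes the measure from the plain Manhattan distance.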

Experiments: ASL Dataset
SignStream database, by the National Center for Sign Language and Gesture Resources at Boston University.
- Number of e-sequences: 873
- Number of intervals: 15675
- Min size: 4
- Max size: 41
- Average size: 18
- Labels: 216
- Classes: 5

Experiments: Setup
We tested:
- robustness against artificial noise;
- classification accuracy.
Artificial noise:
- shift probability s: each event-interval in S is shifted with probability s;
- distortion level d: the start point of each event-interval is shifted by $\pm d\%$ of $|S|$.
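The noise model can be sketched as follows. Several details are assumptions, because the slide leaves them open: |S| is taken to be the time span of the e-sequence, a selected interval is moved by exactly d times that span in a random direction, and the end point moves with the start point so the duration is preserved. The function name `add_noise` is hypothetical, and d is passed as a fraction (0.1 for 10%).

```python
import random


def add_noise(intervals, s, d, rng):
    """Shift each (label, start, end) interval with probability s by +/- d * span."""
    span = max(end for _, _, end in intervals) - min(start for _, start, _ in intervals)
    noisy = []
    for label, start, end in intervals:
        if rng.random() < s:
            offset = rng.choice((-1, 1)) * d * span
            # Assumption: the whole interval moves, keeping its duration.
            start, end = start + offset, end + offset
        noisy.append((label, start, end))
    return noisy


S = [("A", 1, 10), ("B", 5, 13), ("C", 17, 30), ("A", 20, 26), ("D", 24, 30)]
rng = random.Random(0)  # fixed seed so the run is repeatable
print(add_noise(S, s=0.6, d=0.1, rng=rng))
```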

Experiments: Robustness
Robustness to noise: we compared the Normalized, the Manhattan, and the Frobenius distance in terms of:
(A) nearest neighbor retrieval accuracy: the fraction of noisy queries for which the originating sequence is retrieved;
(B) rank of nearest neighbor: for each query, the number of database sequences with distance less than or equal to that of the originating counterpart.
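Both evaluation measures can be computed with any distance function. The sketch below is illustrative: the function names are hypothetical, the database items are plain numbers with absolute difference as a stand-in distance, and a query is counted as retrieved only when its rank is 1, so ties count as misses (an assumption the slide does not settle).

```python
def rank_of_original(query, original_index, database, dist):
    """Number of database items at distance <= that of the originating item.

    The originating item itself is included, so the best possible rank is 1.
    """
    target = dist(query, database[original_index])
    return sum(1 for item in database if dist(query, item) <= target)


def retrieval_accuracy(queries, database, dist):
    """Fraction of (query, original_index) pairs whose original has rank 1."""
    hits = sum(
        1 for query, idx in queries
        if rank_of_original(query, idx, database, dist) == 1
    )
    return hits / len(queries)


# Toy stand-ins: numbers instead of e-sequences, |x - y| instead of delta.
database = [0.0, 10.0, 20.0]
queries = [(1.0, 0), (14.0, 1), (16.0, 1)]  # (noisy query, index of its original)


def abs_diff(x, y):
    return abs(x - y)


ranks = [rank_of_original(q, i, database, abs_diff) for q, i in queries]
print(ranks)
print(retrieval_accuracy(queries, database, abs_diff))
```

Dividing each rank by `len(database)` gives the rank as a ratio of the database size.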

Experiments: Robustness
Figure: Retrieval accuracy: success ratio of matching the noisy sequences to their original counterparts. Panels: (a) Manhattan, (b) Normalized. x-axis: distortion (0.1 to 0.5); y-axis: retrieval accuracy; one curve per shift probability (0.2, 0.4, 0.6, 0.8, 1.0).

Experiments: Robustness
Figure: Comparison of the cumulative histograms for the rank of the 1-NN for each distance measure (Normalized, Frobenius, Manhattan). Ranks are denoted as a ratio of the database size. Panels: (a) probability 0.6, distortion 50%; (b) probability 1.0, distortion 50%. x-axis: rank of NN as a ratio (0 to 0.1); y-axis: database ratio.

Experiments: 1-NN Classification Accuracy
1-NN classification accuracy: the fraction of e-sequences whose class is the same as that of their 1-NN.
Data:
- Number of classes: 5
- Number of e-sequences: 873
1-NN classification accuracy ≈ 88%.
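The accuracy measure defined above can be sketched as a leave-one-out loop. The leave-one-out protocol is an assumption (the slide does not state how the neighbor is chosen), the function name `one_nn_accuracy` is hypothetical, and the toy data uses numbers with absolute difference in place of e-sequences and an arrangement distance.

```python
def one_nn_accuracy(items, labels, dist):
    """Fraction of items whose nearest other item carries the same class label."""
    correct = 0
    for i, item in enumerate(items):
        others = (j for j in range(len(items)) if j != i)
        # Ties are broken in favor of the lowest index.
        nearest = min(others, key=lambda j: dist(item, items[j]))
        if labels[nearest] == labels[i]:
            correct += 1
    return correct / len(items)


# Toy stand-ins for e-sequences and their classes.
items = [0.0, 1.0, 7.0, 10.0, 11.0]
labels = ["x", "x", "x", "y", "y"]

print(one_nn_accuracy(items, labels, lambda a, b: abs(a - b)))
```

In the toy data only the item 7.0 is misclassified, since its nearest neighbor 10.0 belongs to the other class.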
