Outline Introduction semantic trajectories over streaming movement - - PowerPoint PPT Presentation


SeTraStream: Semantic-Aware Trajectory Construction Over Streaming Movement Data. Zhixian Yan*, Nikos Giatrakos†, Vangelis Katsikaros†, Nikos Pelekis†, Yannis Theodoridis†. *Distributed Information Systems Lab; †Information Management Lab


slide-1
SLIDE 1

SeTraStream:

Semantic-Aware Trajectory Construction Over Streaming Movement Data

Minneapolis, MN, USA, 26 August 2011

12th International Symposium on Spatial and Temporal Databases

*Distributed Information Systems Lab

Swiss Federal Institute of Technology (EPFL), Lausanne, Switzerland

† Information Management Lab

University of Piraeus, Piraeus, Greece

Zhixian Yan* Nikos Giatrakos† Vangelis Katsikaros† Nikos Pelekis† Yannis Theodoridis†

slide-2
SLIDE 2

Outline

  • Introduction
      • semantic trajectories…
      • …over streaming movement data?
  • Related Work
  • SeTraStream Framework
      • Big Picture
      • Details of each module
          • Data Cleaning
          • Data Compression
          • Segmentation – Episode Identification
  • Experimental Evaluation
  • Conclusions

slide-4
SLIDE 4

What is a semantic trajectory?

  • Semantic Trajectory: T = {e_first, …, e_last}
  • Episode: e_i = (t_from, t_to, place, tag)

Input: raw mobility data, a sequence of (x,y,t) points, e.g., GPS feeds

Output: meaningful mobility tuples <place, time_in, time_out, tags>

Example places (tags): Home (breakfast), Office (work), Market (shopping), Home (relax), Road (bus), Train (metro), Sideway (walk)

Example time intervals: [~, 8am], [8am, 9am], [6pm, 6:30pm], [7:30pm, 8pm], [9am, 6pm], [6:30pm, 7:30pm], [8pm, ~]
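The two definitions above map naturally onto a small data structure. The following is a minimal sketch, not the authors' implementation: the class and field names are hypothetical, timestamps are kept as plain strings, `None` stands in for the open bound "~", and the pairing of places with time intervals is an assumption made only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Episode:
    """One episode e_i = (t_from, t_to, place, tag)."""
    t_from: Optional[str]  # None stands for the open bound "~"
    t_to: Optional[str]    # None stands for the open bound "~"
    place: str
    tag: str


@dataclass
class SemanticTrajectory:
    """A semantic trajectory T = {e_first, ..., e_last}, kept in time order."""
    episodes: List[Episode] = field(default_factory=list)

    def append(self, episode: Episode) -> None:
        self.episodes.append(episode)

    def places(self) -> List[str]:
        return [e.place for e in self.episodes]


# Illustrative pairing of places and intervals (assumed, not from the slide).
trajectory = SemanticTrajectory()
trajectory.append(Episode(None, "8am", "Home", "breakfast"))
trajectory.append(Episode("9am", "6pm", "Office", "work"))
trajectory.append(Episode("8pm", None, "Home", "relax"))
```

Appending episodes one at a time mirrors the streaming setting, where the trajectory grows as each new episode is identified.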

slide-5
SLIDE 5

Why semantic trajectories?

  • Detection of homogeneous fractions of movement
  • Trajectory is recreated as a sequence of episodes (stops/moves)
      • E.g., home, shopping, move with bus, in train …
  • Semantic data abstraction & compression (efficiency/effectiveness)
  • Better mobility understanding & LBS

[Figure: Raw GPS Points; Trajectory; Notion of Segments; Semantic-Aware Trajectory]

Mode     Road name         Start time
Walk     Ch. veilloud      08:50:26
Walk     Rt. du Boi        08:54:46
Walk     Rt. de Villar     08:57:24
Walk     Tir Fédéra        08:58:41
Metro    M1                08:59:24
Walk     Rt. de la Sorg    09:03:57
Walk     Ch. du Barrag     09:04:42
Walk     La Diagonal       09:05:24

Home-office trajectory examples: (a) Home-Office via Bike (b) Home-Office via Bus

slide-6
SLIDE 6

Why on streaming mobility data?

  • Offline vs. real-time
      • Offline: past trajectories
      • Mobility streams: ongoing trajectories; efficient computation
  • Real-life scenarios
      • Traffic control scenarios: real-time placement & rearrangement of traffic wardens
      • Modern navigation & social networking services, e.g. www.waze.com
      • …
  • Distributed setting
      • local site vs. coordinator
      • client vs. server side

[Figure: antennas and moving objects; status updates sent in batches; server side vs. client side]

slide-8
SLIDE 8

Offline Construction of Semantic Trajectories (ESWC ’10, EDBT ’11)

Data Preprocess Layer

  • outlier removal
  • kernel smoothing
  • compression

Trajectory Identification Layer

  • raw GPS gap
  • time interval
  • spatial extent

Trajectory Structure Layer

  • velocity-based
  • density-based
  • orientation

Semantic Annotation Layer

  • spatial join (region)
  • map-matching (line)
  • HMM (point)

[Figure: layered pipeline from input to output; data labels: original GPS feeds, cleansed GPS feeds, spatio-temporal trajectory, structured trajectory, semantic trajectory; segments S1 to S9 of "a trajectory" and "another trajectory", annotated HOME, SCHOOL, MARKET, SCHOOL, HOME, CUSTOMER, OFFICE, FACTORY]

slide-9
SLIDE 9

Related Work & Motivation

  • Semantic Trajectories (DKE ’08, ESWC ’10, EDBT ’11)
      • High-level trajectory concepts like episodes (e.g., stops/moves), trajectory ontologies
      • Offline training & tuning parameters (particularly on raw movement features like velocity/direction/density)
      • Tuning parameters, not efficient in real-time settings
  • Streaming data processing
      • Online mobility data compression (e.g., Honle @GIS ’10)
      • Time series online segmentation (e.g., Keogh @ICDM ’01)
      • Tilted time window specification (Giannotti ’02)

Semantic Trajectories + Online Algorithms

slide-11
SLIDE 11

SeTraStream - Server Side

[Figure: buffer of incoming batches of objects O1, O5, O8, …, Oi, …, ON (arriving every τ); for object O1, episodes e1: walk and e2: shopping, a candidate division point, left windows W1l, W2l, W3l and right window Wr]

  • 1. Filter noisy data
  • 2. Compress batch
  • 3. Extract Movement Feature Vectors

Location Stream Instances    Complementary Feature Instances
<x,y,t>                      Position in Lane    Distance to Headway Vehicle    Steering Wheel Activity
123.34, 121.21, 18:35:43     0.1m                1m                             π/36
…                            …                   …                              …
120.34, 125.21, 18:36:59     0.05m               3m                             π/16

Short term change? Long term change?

slide-12
SLIDE 12

Online Cleaning (1)

  • Two types of GPS errors
      • systematic errors (outliers): removing
      • random errors (e.g. ±15 meters): smoothing
  • ONE LOOP
      • build kernel smooth
      • calculate residual
      • calculate the outlier bound & the smooth bound
      • filter outlier or smooth error

[Figure: residual scale up to ∞ with keep, smooth, and remove ranges]
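The cleaning steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the SeTraStream implementation: the Gaussian kernel over timestamps, the function names, and the fixed `smooth_bound` and `outlier_bound` parameters are all hypothetical (the slide computes the two bounds, but the rule is not given here), and the sketch uses two passes for clarity where the slide describes a single loop.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float, float]  # (x, y, t)


def kernel_smooth(points: List[Point], bandwidth: float) -> List[Tuple[float, float]]:
    """Gaussian-kernel estimate of (x, y) at each point's own timestamp."""
    smoothed = []
    for _, _, t_i in points:
        weights = [math.exp(-((t_i - t_j) ** 2) / (2.0 * bandwidth ** 2))
                   for _, _, t_j in points]
        total = sum(weights)  # never zero: the point's own weight is 1
        sx = sum(w * p[0] for w, p in zip(weights, points)) / total
        sy = sum(w * p[1] for w, p in zip(weights, points)) / total
        smoothed.append((sx, sy))
    return smoothed


def residual(point: Point, estimate: Tuple[float, float]) -> float:
    """Distance between a raw position and its smoothed estimate."""
    return math.hypot(point[0] - estimate[0], point[1] - estimate[1])


def clean_batch(points: List[Point], bandwidth: float,
                smooth_bound: float, outlier_bound: float) -> List[Point]:
    # Pass 1: drop points whose residual exceeds the outlier bound.
    first = kernel_smooth(points, bandwidth)
    inliers = [p for p, s in zip(points, first) if residual(p, s) <= outlier_bound]
    # Pass 2: re-smooth without the outliers; replace moderately noisy points.
    second = kernel_smooth(inliers, bandwidth)
    cleaned = []
    for p, s in zip(inliers, second):
        if residual(p, s) > smooth_bound:
            cleaned.append((s[0], s[1], p[2]))  # smooth: keep the timestamp
        else:
            cleaned.append(p)                   # keep the raw point
    return cleaned


# Straight-line movement with a single far-off fix at t = 5.
points = [(float(t), 0.0, float(t)) for t in range(10)]
points[5] = (5.0, 30.0, 5.0)
cleaned = clean_batch(points, bandwidth=1.0, smooth_bound=5.0, outlier_bound=15.0)
```

Re-smoothing after outlier removal keeps a single bad fix from dragging the estimates of its neighbors, at the cost of a second pass over the batch.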

slide-13
SLIDE 13

Online Cleaning (2)

slide-14
SLIDE 14

SeTraStream - Compression

[Figure: buffer of incoming batches of objects O1, O5, O8, …, Oi, …, ON (arriving every τ); for object O1, episodes e1 and e2]

  • 1. Filter noisy data
  • 2. Compress batch
  • 3. Extract Movement Feature Vectors

slide-15
SLIDE 15

Online Compression (1)

  • Why Compression?
      • Data continuously growing
      • Remove “redundant” data points
      • Reduce transmission cost (local?)
      • Fast computation, application performance

[Figure: SED (Synchronized Euclidean Distance) of a point p(xp, yp, tp) with respect to its neighbors p-1 and p+1 in the location stream Qls; second panel: points Q1 to Q7 with u1, u2, u3 and threshold ε]
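For reference, SED can be computed as below. This is a minimal sketch assuming the standard definition (distance between a point and its time-synchronized position on the segment joining its neighbors); the greedy `compress` filter and the names `sed`, `compress`, and `epsilon` are illustrative assumptions, not the compression rule used by SeTraStream.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float, float]  # (x, y, t)


def sed(prev: Point, point: Point, nxt: Point) -> float:
    """Synchronized Euclidean Distance of `point` w.r.t. the segment prev -> nxt."""
    (x0, y0, t0), (xp, yp, tp), (x1, y1, t1) = prev, point, nxt
    if t1 == t0:
        return math.hypot(xp - x0, yp - y0)
    ratio = (tp - t0) / (t1 - t0)
    # Position the object would have at time tp if it moved uniformly prev -> nxt.
    xs = x0 + ratio * (x1 - x0)
    ys = y0 + ratio * (y1 - y0)
    return math.hypot(xp - xs, yp - ys)


def compress(points: List[Point], epsilon: float) -> List[Point]:
    """Greedy filter: drop interior points whose SED is at most epsilon."""
    if len(points) <= 2:
        return list(points)
    kept = [points[0]]
    for i in range(1, len(points) - 1):
        if sed(kept[-1], points[i], points[i + 1]) > epsilon:
            kept.append(points[i])
    kept.append(points[-1])
    return kept


track = [(0.0, 0.0, 0.0), (1.0, 0.0, 1.0), (2.0, 0.0, 2.0),
         (3.0, 3.0, 3.0), (4.0, 0.0, 4.0)]
compressed = compress(track, epsilon=0.5)
```

In this toy track only the second point lies exactly where uniform motion would place it, so it is the only one dropped.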

slide-16
SLIDE 16

Online Compression (2)

  • SED (Synchronized Euclidean Distance)
      • Relative Spatio-Temporal Significance
  • SCC (Synchronized Correlation Coefficient)
      • Relative Significance of the Complementary Features

Simple combination:

Normalization:

slide-17
SLIDE 17

SeTraStream - Feature Extraction

[Figure: buffer of incoming batches of objects O1, O5, O8, …, Oi, …, ON (arriving every τ); for object O1, episodes e1 and e2]

  • 1. Filter noisy data
  • 2. Compress batch
  • 3. Extract Movement Feature Vectors

slide-18
SLIDE 18

Movement Feature Vectors (MFVs)

<x,y,t>                      Position in Lane    Distance to Headway Vehicle    Steering Wheel Activity
123.34, 121.21, 18:35:43     0.1m                1m                             π/36
…                            …                   …                              …
120.34, 125.21, 18:36:59     0.05m               3m                             π/16

speed      direction    acceleration
35 m/s     76°          40 m/s²
…          …            …
60 m/s     85°          55 m/s²

MFVs in a batch make up a matrix (one row per feature, one column per point):

35      …    60
76      …    85
40      …    55
0.1     …    0.05
1       …    3
π/36    …    π/16
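The derived features (speed, direction, acceleration) can be computed from consecutive (x,y,t) points roughly as follows. This is a sketch under assumptions: planar coordinates, strictly increasing timestamps, direction measured in degrees counter-clockwise from the x-axis, and the first acceleration value set to 0.0. The function name is hypothetical, and complementary features would be appended as further rows.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float, float]  # (x, y, t), timestamps strictly increasing


def movement_features(points: List[Point]) -> List[List[float]]:
    """Return a feature matrix with rows [speeds, directions, accelerations].

    Each column describes one step between two consecutive points.
    """
    speeds, directions = [], []
    for (x0, y0, t0), (x1, y1, t1) in zip(points, points[1:]):
        dt = t1 - t0
        speeds.append(math.hypot(x1 - x0, y1 - y0) / dt)
        directions.append(math.degrees(math.atan2(y1 - y0, x1 - x0)) % 360.0)
    # Acceleration: change in speed between consecutive steps (0.0 for the first).
    accelerations = [0.0] if speeds else []
    for i in range(1, len(speeds)):
        dt = points[i + 1][2] - points[i][2]
        accelerations.append((speeds[i] - speeds[i - 1]) / dt)
    return [speeds, directions, accelerations]


matrix = movement_features([(0.0, 0.0, 0.0), (3.0, 4.0, 1.0), (3.0, 10.0, 2.0)])
```

Keeping one row per feature matches the matrix layout on the slide, so batches with different numbers of points still share the same row count d.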

slide-19
SLIDE 19

SeTraStream - Segmentation

[Figure: buffer of incoming batches of objects O1, O5, O8, …, Oi, …, ON (arriving every τ); for object O1, episodes e1 and e2, a candidate division point, left window W1l and right window Wr]

Similar Movement Pattern? If YES: σ thres

Which types of similarity measurement?
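The threshold test on window similarity can be sketched as a simple loop over batches. This is an assumption-laden illustration rather than the SeTraStream algorithm: the similarity function is passed in as a parameter, the left window is expanded by concatenating similar batches, and `find_division_points`, `toy_similarity`, and `sigma` are hypothetical names.

```python
from typing import Callable, List, Sequence

Matrix = List[List[float]]  # d feature rows, one column per point


def find_division_points(batches: Sequence[Matrix],
                         similarity: Callable[[Matrix, Matrix], float],
                         sigma: float) -> List[int]:
    """Return the indices of batches that start a new episode."""
    division_points: List[int] = []
    if not batches:
        return division_points
    left = batches[0]
    for i in range(1, len(batches)):
        right = batches[i]
        if similarity(left, right) >= sigma:
            # Similar pattern: expand the left window with the new batch.
            left = [l_row + r_row for l_row, r_row in zip(left, right)]
        else:
            # Pattern changed: batch i opens a new episode.
            division_points.append(i)
            left = right
    return division_points


def toy_similarity(a: Matrix, b: Matrix) -> float:
    """Stand-in measure: 1.0 if the mean of the first feature is close, else 0.0."""
    mean_a = sum(a[0]) / len(a[0])
    mean_b = sum(b[0]) / len(b[0])
    return 1.0 if abs(mean_a - mean_b) < 1.0 else 0.0


batches = [[[1.0, 1.0]], [[1.2, 1.1]], [[5.0, 5.0]], [[5.1, 5.0]]]
division_points = find_division_points(batches, toy_similarity, sigma=0.5)
```

Passing the similarity function as an argument keeps the segmentation loop independent of the specific measure chosen.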

slide-20
SLIDE 20

Movement Similarity

  • RV-coefficient:
      • A multivariate correlation coefficient, focusing on “trend” similarity; NOT on absolute differences
      • Measures the relative resemblance of two sequences of vectors
      • Dimension independent, since WlWl’ and WrWr’ both have dimension d × d (d is the number of features)

  • Existing trajectory computing:
      – Offline, thresholds on movement features like velocity/direction/density
  • Online solution:
      – Similarity on movement patterns (not individual attributes)
      – Threshold on movement pattern alteration
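A possible computation of the RV-coefficient on d × d cross-product matrices is sketched below. It assumes the common form RV = tr(S_l S_r) / sqrt(tr(S_l S_l) · tr(S_r S_r)) with S = W W', and it mean-centers each feature row first; the centering step and all function names are assumptions, not details confirmed by the slides.

```python
import math
from typing import List

Matrix = List[List[float]]  # d feature rows, one column per point


def center_rows(w: Matrix) -> Matrix:
    """Subtract each feature's mean so only its variation remains."""
    centered = []
    for row in w:
        mean = sum(row) / len(row)
        centered.append([v - mean for v in row])
    return centered


def cross_product(w: Matrix) -> Matrix:
    """Return W W' (d x d) for a d x n matrix given as a list of rows."""
    return [[sum(a * b for a, b in zip(ri, rj)) for rj in w] for ri in w]


def trace_of_product(a: Matrix, b: Matrix) -> float:
    """tr(A B) for square matrices of equal size."""
    d = len(a)
    return sum(a[i][k] * b[k][i] for i in range(d) for k in range(d))


def rv_coefficient(w_left: Matrix, w_right: Matrix) -> float:
    """RV-coefficient between two windows with the same d features."""
    s_l = cross_product(center_rows(w_left))
    s_r = cross_product(center_rows(w_right))
    denom = math.sqrt(trace_of_product(s_l, s_l) * trace_of_product(s_r, s_r))
    return trace_of_product(s_l, s_r) / denom if denom else 0.0


w_a = [[1.0, 2.0, 3.0], [2.0, 4.0, 7.0]]
w_b = [[1.0, 2.0, 3.0, 4.0], [8.0, 6.0, 4.0, 2.0]]  # different number of points
rv_same = rv_coefficient(w_a, w_a)
rv_diff = rv_coefficient(w_a, w_b)
```

Because both cross-product matrices are d × d, the two windows may hold different numbers of points, which is what allows a longer left window to be compared with a single right batch.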

slide-21
SLIDE 21

Short-term Movement Change

[Figure: buffer of incoming batches of objects O1, O5, O8, …, Oi, …, ON (arriving every τ); for object O1, episodes e1, e2, e3 with a division point marking the end of e3 and the start of e4; left window W1l and right window Wr]

  • As soon as we find an episode, we tag it
  • Tagging episodes: training offline, tagging online

slide-22
SLIDE 22

Long-term Movement Change

[Figure: buffer of incoming batches of objects O1, O5, O8, …, Oi, …, ON (arriving every τ); for object O1, episodes e1 and e2, a candidate division point, left windows W1l, W2l, W3l and right window Wr]

Similar Pattern? If NO: σ thres

Similarity (W1, W2), e.g. RV-coefficient (W1, W2)

slide-24
SLIDE 24

Experiment - Dataset

  • GPS data from Nokia Research Center @ Lausanne
  • User tags: home_cook, office_work, stand, jog, walk, bus ….

slide-25
SLIDE 25

Experiment - Compression

slide-26
SLIDE 26

Experiment - Segmentation

[Figure: segmentation results in two panels, "Different batch sizes" and "Different RV threshold"; axis ticks run from 50 to 450 and from 0.4 to 1; axis labels and legend not recoverable]

slide-27
SLIDE 27

Experiment - Latency

slide-28
SLIDE 28

Outline

  • Introduction
      • semantic trajectories…
      • …over streaming movement data?
  • SeTraStream Framework
      • Big Picture
      • Details of each module
          • Data Cleaning
          • Data Compression
          • Segmentation – Episode Identification
  • Experimental Evaluation
  • Related Work
  • Conclusions

slide-29
SLIDE 29

Conclusion and Future Work

  • We developed SeTraStream
      • Online Semantic Trajectory Construction
      • Complete Framework
          • Data Cleaning, Load Shedding, Trajectory Segmentation, Tagging
      • To our knowledge, the first work that tackles semantic trajectories in the context of streaming movement data
  • Future Work
      • Explore new similarity measurements (rather than RV-coefficients)
          • …and still allow Wl expansion so as to look for long-term motion pattern changes (e.g. Sketch Summaries?)
      • Further experimentation with larger datasets
      • Extensions to distributed settings: local vs. global computation
          • Can any part of the computation be conducted locally?
          • Most likely only cleaning & load shedding can be done locally

slide-30
SLIDE 30

Thank You!

Minneapolis, MN, USA 26 August 2011

12th International Symposium on Spatial and Temporal Databases

*Distributed Information Systems Lab

Swiss Federal Institute of Technology (EPFL), Lausanne, Switzerland

† Information Management Lab

University of Piraeus, Piraeus, Greece

Zhixian Yan* Nikos Giatrakos† Vangelis Katsikaros† Nikos Pelekis† Yannis Theodoridis†