Mining and Exploration of Multiple Intersecting Axis-aligned Objects

Master’s Thesis, Tilemachos Pechlivanoglou. Supervisor: Manos Papagelis


SLIDE 1

Mining and Exploration of Multiple Intersecting Axis-aligned Objects

Master’s Thesis

Tilemachos Pechlivanoglou

Supervisor: Manos Papagelis

SLIDE 2

Axis-aligned objects

  • 1-D: line segments/intervals
  • 2-D: rectangles
  • 3-D: boxes/cuboids
  • Multidimensional: regions

SLIDE 3

Object intersection problem

SLIDE 4

Object intersection problem

Input:

  • a set of axis-aligned geometric objects

Output:

  • which pairs of objects intersect
  • how much they overlap

SLIDE 5

Sweep-line algorithm (1-D)

[Figure: sweep line L passing over numbered intervals; reported intersecting pairs: (0, 2), (0, 1), (1, 2), (0, 3), (1, 3), (0, 4), (1, 4), (3, 4), (0, 5)]
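
The 1-D sweep line can be sketched as follows. This is a minimal illustration, not the thesis implementation: the function name `sweep_line_pairs` and the sample intervals are assumptions, and intervals are treated as closed, so intervals that only touch at an endpoint count as intersecting.

```python
def sweep_line_pairs(intervals):
    """Return index pairs (i, j), i < j, of closed intervals that overlap."""
    events = []
    for idx, (start, end) in enumerate(intervals):
        events.append((start, 0, idx))  # 0 = start; sorts before an end at the same point
        events.append((end, 1, idx))    # 1 = end
    events.sort()

    active = set()  # intervals currently cut by the sweep line
    pairs = []
    for _, kind, idx in events:
        if kind == 0:
            # A newly started interval intersects every active interval.
            pairs.extend((min(idx, other), max(idx, other)) for other in active)
            active.add(idx)
        else:
            active.discard(idx)
    return sorted(pairs)


# Hypothetical sample data, not taken from the slide figure.
intervals = [(0, 10), (1, 4), (3, 6), (5, 8), (12, 14)]
print(sweep_line_pairs(intervals))
# -> [(0, 1), (0, 2), (0, 3), (1, 2), (2, 3)]
```

Sorting dominates the cost, so the pass runs in O(n log n) plus the number of reported pairs.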

SLIDE 6

Sweep-line algorithm (2-D)

[Figure: sweep line L over 2-D rectangles; panel label: Interval tree]

SLIDE 7

Divide-and-conquer algorithm

Computationally equivalent to Sweep-Line

Interval tree

SLIDE 8

Multiple Intersecting Objects

SLIDE 9

Research questions

  • How to detect multiple intersecting objects?
  • What is the size of their overlap (common region)?
  • Where is that common region located?

SLIDE 10

Applications

  • Circuit design
  • Spatial databases
  • Simulations
  • Task scheduling

SLIDE 11

The problem

Input:

  • a set of regions in R^d

Output:

  − enumeration of all intersecting sets of regions
  − size and position of each common region

Sets: {A,B}, {A,C}, {A,D}, {B,C}, {B,D}, {C,D}, {D,E}, {A,B,C}, {A,B,D}, {B,C,D}, {A,B,C,D}

SLIDE 12

Multiple Intersection Calculation

SLIDE 13

Common region

A set of 3 or more axis-aligned objects that all intersect pairwise with each other has a non-empty common region.

(Helly’s theorem, convex sets)

SLIDE 14

Common region size: 1-D

For a fully intersecting set I, the common region length $|Z_I|$ is:

$|Z_I| = \min(\text{end points}) - \max(\text{start points})$

$|Z_{ABC}| = \min(a_1, b_1, c_1) - \max(a_0, b_0, c_0) = a_1 - c_0$
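
A small sketch of the 1-D computation, assuming each interval is a `(start, end)` tuple. The function name `common_region_1d` and the sample values are illustrative; they are chosen so that, as in the slide's example, the result equals a1 - c0.

```python
def common_region_1d(intervals):
    """Return (start, end) of the common region, or None if there is none.

    Intervals are treated as closed, so a zero-length region is still reported.
    """
    start = max(s for s, _ in intervals)  # latest start point
    end = min(e for _, e in intervals)    # earliest end point
    return (start, end) if start <= end else None


# Hypothetical intervals A, B, C: min end is a1 = 6, max start is c0 = 4.
a, b, c = (0, 6), (2, 9), (4, 8)
region = common_region_1d([a, b, c])
print(region, region[1] - region[0])
# -> (4, 6) 2
```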

SLIDE 15

Common region size: 2-D, 3-D ...

For more dimensions, $|Z|$ is the product of the common region lengths $|Z_d|$ in each dimension: $|Z| = \prod_d |Z_d|$
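
The per-dimension product can be sketched like this. The box representation (a list of `(low, high)` pairs, one per dimension) and the name `common_region_size` are assumptions made for the example.

```python
from math import prod


def common_region_size(boxes):
    """Size |Z| of the region shared by all boxes, or 0 if they share none.

    Each box is a list of (low, high) pairs, one pair per dimension.
    """
    dims = len(boxes[0])
    lengths = []
    for d in range(dims):
        low = max(box[d][0] for box in boxes)   # latest start in dimension d
        high = min(box[d][1] for box in boxes)  # earliest end in dimension d
        if high <= low:
            return 0
        lengths.append(high - low)              # |Z_d|
    return prod(lengths)


# Hypothetical 2-D rectangles: common region is [3, 4] x [2, 4].
A = [(0, 4), (0, 4)]
B = [(2, 6), (1, 5)]
C = [(3, 7), (2, 8)]
print(common_region_size([A, B, C]))
# -> 2
```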

SLIDE 16

Intersection cardinality (k)

The number of simultaneously overlapping objects in a set

$k_{ABCD} = 4$, $k_{DE} = 2$, $k_{AE} = 0$, $k_{ABCDE} = 4$

SLIDE 17

Sensible baseline algorithms

SLIDE 18

Naive approach

  1. Compare each object with every other
  2. If any 2 intersect, compare the pair’s common region with every other object
  3. If any 3 intersect, compare the triplet’s common region with every other object
  4. Repeat until no intersections are found or no objects are left

⦁ many nested loops
⦁ very high computational cost
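
One way to read the naive steps above is as a level-wise loop: grow each intersecting set by one object at a time and keep the sets whose common region is non-empty. The sketch below is an interpretation, not the thesis baseline; the names `intersect` and `naive_multiple_intersections` and the sample objects are hypothetical.

```python
def intersect(a, b):
    """Common region of two boxes (tuples of (low, high) per dimension), or None."""
    region = []
    for (a_lo, a_hi), (b_lo, b_hi) in zip(a, b):
        lo, hi = max(a_lo, b_lo), min(a_hi, b_hi)
        if lo > hi:
            return None
        region.append((lo, hi))
    return tuple(region)


def naive_multiple_intersections(objects):
    """objects: dict name -> box. Returns {frozenset(names): common region}."""
    names = sorted(objects)
    level = {frozenset([n]): objects[n] for n in names}  # sets of size 1
    found = {}
    while level:
        next_level = {}
        for members, region in level.items():
            for n in names:
                bigger = members | {n}
                if n in members or bigger in next_level:
                    continue
                # Compare the set's common region with one more object.
                common = intersect(region, objects[n])
                if common is not None:
                    next_level[bigger] = common
        found.update(next_level)
        level = next_level  # repeat until no new intersections are found
    return found


# Hypothetical 1-D objects, written as one-dimensional boxes.
objects = {"A": ((0, 5),), "B": ((1, 6),), "C": ((4, 9),), "D": ((8, 10),)}
result = naive_multiple_intersections(objects)
print(sorted("".join(sorted(s)) for s in result))
# -> ['AB', 'ABC', 'AC', 'BC', 'CD']
```

Every level re-tests each surviving set against every object, which is where the high computational cost comes from.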

SLIDE 19

Modified sweep-line approach

  1. Execute the sweep-line algorithm to find intersecting pairs
  2. Get the common regions of all resulting pairs
  3. Execute sweep-line on them to find triplets, quadruplets
  4. Repeat until no intersections are found

⦁ better performance than naive

SLIDE 20

Limitations

⦁ High computational cost
⦁ Difficult to implement
⦁ Lack of versatility
  − different implementations needed for different problems
  − hard to process/explore a specific part of the dataset

SLIDE 21

Our approach (SLIG)

SLIDE 22

Region intersection graph

A graph data structure where:
⦁ Each vertex corresponds to an object
⦁ An edge exists between two vertices if the corresponding objects intersect

SLIDE 23

Clique

Subset of vertices where every two are connected (i.e. a fully connected subgraph)

size-3 cliques: ABC, ABD, ACD, BCD
size-4 clique: ABCD (maximal clique)

SLIDE 24

Observation

On an intersection graph, a clique corresponds to a fully intersecting set with a common region

SLIDE 25

Sweep-Line with Intersection Graph (SLIG)

  1. Execute the sweep-line algorithm to find intersecting pairs
  2. Use the pairs to construct the intersection graph
  3. Execute a clique enumeration algorithm on the graph

⦁ best performance
⦁ uses established, efficient clique enumeration methods
⦁ much easier to implement
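
The three steps above can be sketched end to end for 1-D intervals. The slides do not name a specific clique enumeration method, so the basic Bron-Kerbosch algorithm (without pivoting) is used here as an assumption; all function names and the sample intervals are hypothetical.

```python
def intersecting_pairs(intervals):
    """Step 1: 1-D sweep line over {name: (start, end)}; returns overlapping pairs."""
    events = []
    for name, (start, end) in intervals.items():
        events.append((start, 0, name))  # starts sort before ends at the same point
        events.append((end, 1, name))
    events.sort()
    active, pairs = set(), []
    for _, kind, name in events:
        if kind == 0:
            pairs.extend((other, name) for other in active)
            active.add(name)
        else:
            active.discard(name)
    return pairs


def build_graph(names, pairs):
    """Step 2: adjacency sets of the intersection graph."""
    graph = {name: set() for name in names}
    for u, v in pairs:
        graph[u].add(v)
        graph[v].add(u)
    return graph


def maximal_cliques(graph):
    """Step 3: basic Bron-Kerbosch; yields each maximal clique as a sorted list."""
    def expand(clique, candidates, excluded):
        if not candidates and not excluded:
            yield sorted(clique)
            return
        for v in list(candidates):
            yield from expand(clique | {v}, candidates & graph[v], excluded & graph[v])
            candidates = candidates - {v}
            excluded = excluded | {v}
    yield from expand(set(), set(graph), set())


# Hypothetical intervals; E overlaps nothing.
intervals = {"A": (0, 5), "B": (1, 6), "C": (4, 9), "D": (8, 10), "E": (20, 21)}
graph = build_graph(intervals, intersecting_pairs(intervals))
print(sorted(maximal_cliques(graph)))
# -> [['A', 'B', 'C'], ['C', 'D'], ['E']]
```

Each maximal clique is a largest fully intersecting set; every subset of it with at least two members is also an intersecting set, so smaller sets can be derived without further geometry.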

SLIDE 26

Extensions: Querying capability

The intersection graph provides additional mining options, such as exploration using queries:
⦁ Single Region Query: given an object, find all other objects intersecting with it
⦁ Multiple Region Query: given a set of objects, find all intersections occurring in the set
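
Both queries reduce to simple graph lookups once the intersection graph exists. The sketch below assumes the graph is a dict of adjacency sets; the function names and the brute-force subset check are illustrative choices, suitable only for small query sets.

```python
from itertools import combinations


def single_region_query(graph, obj):
    """All objects intersecting `obj`: its neighbours in the intersection graph."""
    return sorted(graph[obj])


def multiple_region_query(graph, objects):
    """All fully intersecting subsets (size >= 2) of `objects`.

    These are the cliques of the subgraph induced by `objects`.
    """
    objects = sorted(objects)
    result = []
    for size in range(2, len(objects) + 1):
        for subset in combinations(objects, size):
            if all(v in graph[u] for u, v in combinations(subset, 2)):
                result.append(subset)
    return result


# Hypothetical intersection graph.
graph = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"}, "D": {"C"}, "E": set()}
print(single_region_query(graph, "C"))
# -> ['A', 'B', 'D']
print(multiple_region_query(graph, ["A", "B", "C", "D"]))
# -> [('A', 'B'), ('A', 'C'), ('B', 'C'), ('C', 'D'), ('A', 'B', 'C')]
```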

SLIDE 27

Multiple Intersections Evaluation

SLIDE 28

Randomly generated objects

[Figure panels: 1-D intervals; 1-D intersection graph; 2-D rectangles; 2-D intersection graph]

SLIDE 29

Intersection graph size

SLIDE 30

Performance of SLIG

SLIG scales much better than baseline

SLIDE 31

Effect of graph topology

smaller/sparser objects -> sparser graphs -> faster execution

SLIDE 32

SLIG query performance

SLIDE 33

Real-world data

Overlapping areas of extreme weather in CA & NV, USA

SLIDE 34

Node Importance in Trajectory Networks

SLIDE 35

Trajectories of moving objects

SLIDE 36

Trajectory Mining

  • Trajectory similarity
  • Trajectory clustering
  • Trajectory anomaly detection
  • Trajectory pattern mining
  • Trajectory classification
  • ...more

SLIDE 37

Node Importance

SLIDE 38

Node importance (or centrality)

  • Degree centrality
  • Betweenness centrality
  • Closeness centrality
  • Eigenvector centrality

SLIDE 39

Over time

  • Connected components over time (connectedness)
  • Node degree over time
  • Triangles over time

SLIDE 40

Applications

  • Infection spreading
  • Wireless signal security
  • Rich dynamic network analytics

SLIDE 41

Proximity networks

[Figure label: θ]

SLIDE 42

Distance can represent

  • line of sight
  • Wi-Fi signal range
  • travel distance in a day

SLIDE 43

Trajectory networks

SLIDE 44

Problem difficulty

  • Node importance algorithms for static networks
  • Sequence of static networks (snapshots)
  • One large network per discrete time unit!

SLIDE 45

Node Importance in Trajectory Networks

SLIDE 46

Naive approach

SLIDE 47

Naive approach

For every discrete time unit:

  1. get a static snapshot of the network
  2. run static node importance algorithms on the snapshot

Aggregate results at the end
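
A minimal sketch of the snapshot loop above, using node degree as the static metric and the mean over time as the aggregation. Both choices are assumptions made for illustration, as are the function name `naive_average_degree` and the sample snapshots.

```python
def naive_average_degree(snapshots, nodes):
    """snapshots: one edge list per discrete time unit. Returns mean degree per node."""
    per_step = []
    for edges in snapshots:               # 1. take the static snapshot
        degree = dict.fromkeys(nodes, 0)  # 2. run a static metric on it (degree here)
        for u, v in edges:
            degree[u] += 1
            degree[v] += 1
        per_step.append(degree)
    # Aggregate the per-snapshot results at the end.
    return {n: sum(d[n] for d in per_step) / len(per_step) for n in nodes}


# Hypothetical network with three time units.
snapshots = [[("a", "b")], [("a", "b"), ("b", "c")], []]
averages = naive_average_degree(snapshots, ["a", "b", "c"])
print(averages["b"])
# -> 1.0
```

The whole network is reprocessed at every time unit, even when nothing changed between snapshots.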

SLIDE 48

Streaming approach

Similar to naive, but:
﹘ no final aggregation
﹘ results calculated iteratively at every step

Still every time unit

SLIDE 49

Every discrete time unit

[Figure: network snapshots 1, 2, 3, 4, ... along the time axis up to T]

SLIDE 50

Sweep Line Over Trajectories (SLOT)

(algorithm sketch)

  • represent TN edges as time intervals
  • apply a variation of the sweep-line algorithm
  • simultaneously compute node degree, triangle membership, connected components in one pass
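
The idea of sweeping over edge intervals can be sketched for the simplest metric, node degree. The tuple layout `(u, v, t_start, t_stop)`, the function names, and the tie rule (an edge stop is processed before an edge start at the same time, i.e. half-open intervals) are assumptions made for this example.

```python
def edge_events(edge_intervals):
    """edge_intervals: list of (u, v, t_start, t_stop). Returns time-sorted events."""
    events = []
    for u, v, t_start, t_stop in edge_intervals:
        events.append((t_start, 1, u, v))  # 1 = edge start
        events.append((t_stop, 0, u, v))   # 0 = edge stop, sorts first on ties
    return sorted(events)


def degree_timeline(edge_intervals):
    """One pass over the events; returns {node: [(time, degree), ...]}."""
    degree, timeline = {}, {}
    for t, kind, u, v in edge_events(edge_intervals):
        delta = 1 if kind == 1 else -1
        for node in (u, v):
            degree[node] = degree.get(node, 0) + delta
            timeline.setdefault(node, []).append((t, degree[node]))
    return timeline


# Hypothetical trajectory-network edges.
edges = [("a", "b", 1, 4), ("b", "c", 2, 6)]
timeline = degree_timeline(edges)
print(timeline["b"])
# -> [(1, 1), (2, 2), (4, 1), (6, 0)]
```

Work is done only when an edge starts or stops, not at every discrete time unit.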

SLIDE 51

Edges as time intervals...

[Figure: edges e1:(n1,n2), ..., en drawn as intervals over time points t1 to t13, with sweep line L; axes: time (up to T) and edges]

SLIDE 52

Sweep Line Over Trajectories (SLOT)

SLIDE 53

At every edge start

⦁ Degree
  − nodes u, v now connected
  − increment u, v degree

⦁ Triangles
  − did a triangle just form?
  − look for u, v common neighbors
  − increment triangle (u,v,common)

⦁ Components
  − did two previously unconnected components connect?
  − compare old components of u, v
  − if not same, merge them

[Figure: edge e:(u, v) between nodes u and v; time axis t1, t2, ..., T; vertical axis: edges]
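
The edge-start bookkeeping above might look like the following. The state layout (adjacency sets, a degree dict, a set of triangles, and a node-to-component-id dict) and the name `on_edge_start` are assumptions; the component merge is a plain relabeling rather than an optimized union-find.

```python
def on_edge_start(u, v, adj, degree, triangles, component):
    """Update the state in place when edge (u, v) starts.

    adj: node -> set of neighbours; degree: node -> int;
    triangles: set of frozenset triples; component: node -> component id.
    """
    for node in (u, v):
        adj.setdefault(node, set())
        degree.setdefault(node, 0)
        component.setdefault(node, node)  # a new node starts as its own component

    # Triangles: every common neighbour w closes a triangle (u, v, w).
    for w in adj[u] & adj[v]:
        triangles.add(frozenset((u, v, w)))

    # Degree: u and v are now connected.
    adj[u].add(v)
    adj[v].add(u)
    degree[u] += 1
    degree[v] += 1

    # Components: if u and v were in different components, merge them.
    old, new = component[v], component[u]
    if old != new:
        for node in list(component):
            if component[node] == old:
                component[node] = new


# Hypothetical sequence of edge starts forming one triangle.
adj, degree, triangles, component = {}, {}, set(), {}
for u, v in [("a", "b"), ("b", "c"), ("a", "c")]:
    on_edge_start(u, v, adj, degree, triangles, component)
print(degree, len(triangles), len(set(component.values())))
# -> {'a': 2, 'b': 2, 'c': 2} 1 1
```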

SLIDE 54

At every edge stop

⦁ Degree
  − nodes u, v now disconnected
  − decrement u, v degree

⦁ Triangles
  − did a triangle just break?
  − look for u, v common neighbors
  − decrement triangle (u,v,common)

⦁ Components
  − did a component separate?
  − BFS to see if u, v still connected
  − if not, split component to two

[Figure: edge e:(u, v) between nodes u and v; time axis t1, t2, t3, ..., T; vertical axis: edges]
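
The edge-stop case can be sketched in the same style. The state layout, the integer component ids, and the name `on_edge_stop` are assumptions for illustration; the BFS follows the slide's description of checking whether u and v are still connected.

```python
from collections import deque


def on_edge_stop(u, v, adj, degree, triangles, component):
    """Update the state in place when edge (u, v) stops."""
    # Triangles: every common neighbour w loses the triangle (u, v, w).
    for w in adj[u] & adj[v]:
        triangles.discard(frozenset((u, v, w)))

    # Degree: u and v are now disconnected.
    adj[u].discard(v)
    adj[v].discard(u)
    degree[u] -= 1
    degree[v] -= 1

    # Components: BFS from u to see whether v is still reachable.
    seen, queue = {u}, deque([u])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    if v not in seen:
        # Split: the nodes reached from u get a fresh component id.
        new_id = max(component.values()) + 1
        for node in seen:
            component[node] = new_id


# Hypothetical state: path a - b - c in a single component (id 0).
adj = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
degree = {"a": 1, "b": 2, "c": 1}
triangles = set()
component = {"a": 0, "b": 0, "c": 0}
on_edge_stop("b", "c", adj, degree, triangles, component)
print(degree, component)
# -> {'a': 1, 'b': 1, 'c': 0} {'a': 1, 'b': 1, 'c': 0}
```

The BFS makes edge stops the more expensive event, since a single removal may require visiting a whole component.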

SLIDE 55

SLOT: At the end of the algorithm...

Rich analytics
⦁ node degrees: start/end time, duration
⦁ triangles: start/end time, duration
⦁ connected components: start/end time, duration

Exact results (not approximations)

SLIDE 56

Evaluation of SLOT

SLIDE 57

Simulating trajectories

[Figure panels: constant velocity; random velocity]

SLIDE 58

Degree

SLIDE 59

SLOT performance (triangles, connectedness)

SLIDE 60

with max=0.15, min=0

SLIDE 61

Seagull migration trajectories

SLIDE 62

Summary

SLIDE 63

Multiple Intersections

  • Region intersection graph
  • Axis-aligned object intersections
  • Sweep-line algorithm

SLIG properties:

  • Fast & efficient
  • Exact
  • Query capabilities

SLIDE 64

Node importance in TNs

  • SLOT algorithm
  • Trajectory networks
  • Network importance over time

SLOT properties:

  • Fast
  • Exact
  • Scalable

SLIDE 65

Contributions

⦁ Fast and Accurate Mining of Node Importance in Trajectory Networks

− IEEE International Conference on Big Data, 2018

⦁ Efficient Mining and Exploration of Multiple Axis-aligned Intersecting Objects

− Pending review in IEEE International Conference on Data Mining, 2019

⦁ Working on extensions/applications of shown concepts

− Data visualization
− Location-aware computation offloading
− Distributed versions of algorithms

⦁ Industry collaboration project with Fortran Traffic

SLIDE 66

Thank you!