Spatial Access Methods (SAMs) I . - - PowerPoint PPT Presentation



slide-1
SLIDE 1

DATABASES II

Spatial Access Methods (SAMs) I

V. Megalooikonomou, D. Christodoulakis

(presentation based in part on notes by Silberschatz, Korth and Sudarshan, by C. Faloutsos, and by V. S. Subrahmanian)

slide-2
SLIDE 2

General Overview

Multimedia Indexing

Spatial Access Methods (SAMs)

  • k-d trees
  • Point quadtrees
  • MX-quadtrees
  • z-ordering
  • R-trees

slide-3
SLIDE 3

SAMs - Detailed outline

spatial access methods

  • problem definition
  • k-d trees
  • point quadtrees
  • MX-quadtrees
  • z-ordering
  • R-trees

slide-4
SLIDE 4

Spatial Access Methods - problem

Given a collection of geometric objects (points, lines, polygons, ...), organize them on disk to answer spatial queries (such as?)

slide-5
SLIDE 5

Spatial Access Methods - problem

Given a collection of geometric objects (points, lines, polygons, ...), organize them on disk to answer:

  • point queries
  • range queries
  • k-nn queries
  • spatial joins (‘all pairs’ queries)

slide-9
SLIDE 9

Spatial Access Methods - problem

Given a collection of geometric objects (points, lines, polygons, ...), organize them on disk to answer:

  • point queries
  • range queries
  • k-nn queries
  • spatial joins (‘all pairs’ within ε)

slide-10
SLIDE 10

SAMs - motivation

Q: applications?

slide-11
SLIDE 11

SAMs - motivation

[Figure: two panels, ‘traditional DB’ (axes: age, salary) and ‘GIS’]

slide-13
SLIDE 13

SAMs - motivation

CAD/CAM: find elements too close to each other

slide-14
SLIDE 14

SAMs - motivation

CAD/CAM

slide-15
SLIDE 15

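SAMs - motivation

[Figure: time sequences S1, ..., Sn (x-axis: day, from 1 to 365), each mapped to a point F(S1), ..., F(Sn) in feature space (features: e.g., avg and std)]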

slide-16
SLIDE 16

SAMs: solutions

  • k-d trees
  • point quadtrees
  • MX-quadtrees
  • z-ordering
  • R-trees
  • (grid files)

Q: how would you organize, e.g., n-dim points, on disk? (C points per disk page)

slide-17
SLIDE 17

SAMs - Detailed outline

spatial access methods

  • problem definition
  • k-d trees
  • point quadtrees
  • MX-quadtrees
  • z-ordering
  • R-trees

slide-18
SLIDE 18

k-d trees

  • Used to store k-dimensional point data
  • Not used to store region data
  • A 2-d tree (i.e., k = 2) stores 2-dimensional point data, a 3-d tree stores 3-dimensional point data, etc.

slide-19
SLIDE 19

2-d trees – node structure

  • Binary trees
  • Info: information field
  • Xval, Yval: coordinates of the point associated with the node
  • Llink, Rlink: pointers to children

Properties (N: node):

  • If the level of N is even: for all nodes M in the subtree rooted at N.Llink, M.Xval < N.Xval; for all nodes P in the subtree rooted at N.Rlink, P.Xval >= N.Xval
  • If the level of N is odd: similarly, using Yvals

slide-20
SLIDE 20

2-d trees – Example

slide-21
SLIDE 21

2-d trees: Insertion/Search

To insert a node N into the tree pointed to by T:

  • If N and T agree on Xval and Yval, then overwrite T
  • Else, at even levels, branch left if N.Xval < T.Xval, right otherwise
  • Similarly for odd levels (branching on Yvals)

slide-22
SLIDE 22

2-d trees – Example of Insertion

City          (Xval, Yval)
Banja Luka    (19, 45)
Derventa      (40, 50)
Toslic        (38, 38)
Tuzla         (54, 35)
Sinj          (4, 4)

[Figures: splitting of the region by Banja Luka, by Derventa, by Toslic, and by Sinj]
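
The insertion rule and the city example can be sketched in Python roughly as follows. This is a minimal illustration, not the deck's own code: the field names mirror the slide (Info, Xval, Yval, Llink, Rlink), while the function names `insert` and `search` and the choice of level 0 for the root are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    info: str                       # information field
    xval: float
    yval: float
    llink: Optional["Node"] = None  # left child
    rlink: Optional["Node"] = None  # right child


def insert(t, info, x, y, level=0):
    """Insert point (x, y) into the 2-d tree rooted at t; return the root."""
    if t is None:
        return Node(info, x, y)
    if (x, y) == (t.xval, t.yval):
        t.info = info               # same coordinates: overwrite
        return t
    # Even levels branch on Xval, odd levels on Yval.
    go_left = x < t.xval if level % 2 == 0 else y < t.yval
    if go_left:
        t.llink = insert(t.llink, info, x, y, level + 1)
    else:
        t.rlink = insert(t.rlink, info, x, y, level + 1)
    return t


def search(t, x, y, level=0):
    """Return the node storing (x, y), or None if it is absent."""
    while t is not None and (x, y) != (t.xval, t.yval):
        go_left = x < t.xval if level % 2 == 0 else y < t.yval
        t = t.llink if go_left else t.rlink
        level += 1
    return t


root = None
for name, x, y in [("Banja Luka", 19, 45), ("Derventa", 40, 50),
                   ("Toslic", 38, 38), ("Tuzla", 54, 35), ("Sinj", 4, 4)]:
    root = insert(root, name, x, y)
```

With this insertion order, Banja Luka becomes the root, Sinj its left child, and Derventa its right child, with Toslic and Tuzla below Derventa.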

slide-23
SLIDE 23

2-d trees: Deletion

Deletion of point (x,y) from T (let N be the node that stores (x,y)):

  • If N is a leaf node: easy
  • Otherwise, either Tl (left subtree) or Tr (right subtree) is non-empty:
      • Find a “candidate replacement” node R in Tl or Tr
      • Replace all of N’s non-link fields by those of R
      • Recursively delete R from Ti (the subtree, Tl or Tr, that contains R)

Recursion guaranteed to terminate - Why?

slide-24
SLIDE 24

2-d trees: Deletion

Finding candidate replacement nodes for deletion:

The replacement node R must bear the same spatial relation to all nodes in Tl and Tr as node N

slide-25
SLIDE 25

2-d trees: Range Queries

Q: Given a point (xc, yc) and a distance r, find all points in the 2-d tree that lie within the circle

A: Each node N in a 2-d tree implicitly represents a region RN. If the circle (specified by the query) has no intersection with RN, then there is no point in searching the subtree rooted at node N
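
A possible sketch of this pruning rule in Python is shown below. It assumes each node's region RN can be tracked as an axis-aligned box (xlo, ylo, xhi, yhi) that starts as the whole plane and shrinks at every split; the names `range_query` and `circle_hits_box` are illustrative, not from the slides.

```python
import math


class Node:
    def __init__(self, x, y):
        self.x, self.y = x, y
        self.left = self.right = None


def insert(t, x, y, level=0):
    """Plain 2-d tree insertion (x on even levels, y on odd levels)."""
    if t is None:
        return Node(x, y)
    if (x, y) != (t.x, t.y):
        key, split = (x, t.x) if level % 2 == 0 else (y, t.y)
        if key < split:
            t.left = insert(t.left, x, y, level + 1)
        else:
            t.right = insert(t.right, x, y, level + 1)
    return t


def circle_hits_box(xc, yc, r, box):
    """True if the circle intersects the box (xlo, ylo, xhi, yhi)."""
    xlo, ylo, xhi, yhi = box
    dx = max(xlo - xc, 0, xc - xhi)   # horizontal gap between center and box
    dy = max(ylo - yc, 0, yc - yhi)   # vertical gap between center and box
    return dx * dx + dy * dy <= r * r


def range_query(t, xc, yc, r, level=0,
                box=(-math.inf, -math.inf, math.inf, math.inf)):
    """All points within distance r of (xc, yc), skipping pruned subtrees."""
    if t is None or not circle_hits_box(xc, yc, r, box):
        return []                      # region RN misses the circle: prune
    found = [(t.x, t.y)] if math.hypot(t.x - xc, t.y - yc) <= r else []
    xlo, ylo, xhi, yhi = box
    if level % 2 == 0:                 # node splits its region on x
        lbox, rbox = (xlo, ylo, t.x, yhi), (t.x, ylo, xhi, yhi)
    else:                              # node splits its region on y
        lbox, rbox = (xlo, ylo, xhi, t.y), (xlo, t.y, xhi, yhi)
    return (found
            + range_query(t.left, xc, yc, r, level + 1, lbox)
            + range_query(t.right, xc, yc, r, level + 1, rbox))
```

The left box is treated as closed on its splitting edge, which can only cause an extra visit, never a missed point.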

slide-26
SLIDE 26

SAMs - Detailed outline

spatial access methods

  • problem definition
  • k-d trees
  • point quadtrees
  • z-ordering
  • R-trees

slide-27
SLIDE 27

Point Quadtrees

  • Represent point data
  • Always split regions into 4 parts
  • 2-d tree: a node N splits a region into two by drawing one line through the point (N.xval, N.yval)
  • Point quadtree: a node N splits a region by drawing a horizontal and a vertical line through the point (N.xval, N.yval)
  • Four parts: NW, SW, NE, and SE quadrants

Q: Quadtree nodes have 4 children?

slide-28
SLIDE 28

Point Quadtrees

Nodes in point quadtrees represent regions

slide-29
SLIDE 29

Point quadtrees - Insertion

City          (Xval, Yval)
Banja Luka    (19, 45)
Derventa      (40, 50)
Toslic        (38, 38)
Tuzla         (54, 35)
Sinj          (4, 4)

[Figures: splitting of the region by Banja Luka, by Derventa, by Toslic, by Sinj, and by Tuzla]
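
A rough Python sketch of point-quadtree insertion for the same cities follows. The slides do not say which quadrant owns points lying exactly on a splitting line; the sketch assumes such points go to the east and north sides, and the names `QNode`, `quadrant`, and `insert` are illustrative.

```python
class QNode:
    def __init__(self, info, x, y):
        self.info, self.x, self.y = info, x, y
        # One child slot per quadrant of the region split at (x, y).
        self.children = {"NW": None, "SW": None, "NE": None, "SE": None}


def quadrant(n, x, y):
    """Quadrant of node n's region that contains (x, y).

    Assumption: ties on a splitting line go north / east.
    """
    ns = "N" if y >= n.y else "S"
    ew = "E" if x >= n.x else "W"
    return ns + ew


def insert(t, info, x, y):
    """Insert (x, y) into the point quadtree rooted at t; return the root."""
    if t is None:
        return QNode(info, x, y)
    if (x, y) == (t.x, t.y):
        t.info = info                # same coordinates: overwrite
        return t
    q = quadrant(t, x, y)
    t.children[q] = insert(t.children[q], info, x, y)
    return t


root = None
for name, x, y in [("Banja Luka", 19, 45), ("Derventa", 40, 50),
                   ("Toslic", 38, 38), ("Tuzla", 54, 35), ("Sinj", 4, 4)]:
    root = insert(root, name, x, y)
```

Under these assumptions Derventa lands in the NE quadrant of Banja Luka, Toslic in the SE, Sinj in the SW, and Tuzla in the SE quadrant of Toslic.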

slide-30
SLIDE 30

Point Quadtrees - Insertion

slide-31
SLIDE 31

Point quadtrees: Deletion

Deletion of point (x,y) from T:

  • If N is a leaf node: easy
  • Otherwise, a subtree (N.NW, N.SW, N.NE, N.SE) is non-empty:
      • Find a “candidate replacement” node R in one of the subtrees such that every other node R1 in N.NW is to the NW of R, every other node R2 in N.SW is to the SW of R, etc.
      • Replace all of N’s non-link fields by those of R
      • Recursively delete R from Ti
  • In general, it may not always be possible to find such a replacement node

Q: What happens in the worst case?

slide-32
SLIDE 32

Point quadtrees: Deletion

Deletion of point (x,y) from T:

  • If N is a leaf node: easy
  • Otherwise, a subtree (N.NW, N.SW, N.NE, N.SE) is non-empty:
      • Find a “candidate replacement” node R in one of the subtrees such that every other node R1 in N.NW is to the NW of R, every other node R2 in N.SW is to the SW of R, etc.
      • Replace all of N’s non-link fields by those of R
      • Recursively delete R from Ti
  • In general, it may not always be possible to find such a replacement node

Q: What happens in the worst case?
A: It may require all nodes to be reinserted

slide-33
SLIDE 33

Point quadtrees: Range Searches

  • Each node in a point quadtree represents a region
  • Do not search regions that do not intersect the circle defined by the query

slide-34
SLIDE 34

SAMs - Detailed outline

spatial access methods

  • problem definition
  • k-d trees
  • point quadtrees
  • MX-quadtrees
  • z-ordering
  • R-trees

slide-35
SLIDE 35

MX-Quadtrees

Drawbacks of 2-d trees, point quadtrees:

  • shape of tree depends upon the order in which objects are inserted into the tree
  • each node represents a region and splits the region into two or four
  • splits may be uneven depending upon where the point (N.xval, N.yval) is located inside the region (represented by N)

MX-quadtrees: shape (and height) of tree independent of number of nodes and order of insertion

slide-36
SLIDE 36

MX-Quadtrees

  • Assumption: the map is represented as a grid of size 2^k x 2^k for some k
  • They are like point quadtrees, but when a region gets “split” it is split down the middle

slide-37
SLIDE 37

MX-Quadtrees - Insertion

After insertion of A, B, C, and D respectively

slide-39
SLIDE 39

MX-Quadtrees - Deletion

  • Fairly easy - why? All points are represented at the leaf level
  • Total time for deletion: O(k)
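
The following Python sketch illustrates why MX-quadtree insertion and deletion take O(k) steps on a 2^k x 2^k grid: the path from root to leaf is fixed by the k bits of each coordinate. The dictionary-based node layout and the names `MXNode`, `insert`, `lookup`, and `delete` are assumptions made for illustration (k >= 1 is assumed).

```python
class MXNode:
    def __init__(self):
        # quadrant label -> MXNode (internal level) or stored value (leaf level)
        self.children = {}


def _path(k, x, y):
    """Quadrant labels from root to leaf for cell (x, y) of a 2^k x 2^k grid."""
    labels = []
    for level in range(k - 1, -1, -1):     # most significant bit first
        ew = "E" if (x >> level) & 1 else "W"
        ns = "N" if (y >> level) & 1 else "S"
        labels.append(ns + ew)
    return labels


def insert(root, k, x, y, value):
    """Store value at cell (x, y); always descends exactly k levels."""
    labels = _path(k, x, y)
    node = root
    for q in labels[:-1]:
        node = node.children.setdefault(q, MXNode())
    node.children[labels[-1]] = value


def lookup(root, k, x, y):
    """Return the value stored at cell (x, y), or None."""
    node = root
    for q in _path(k, x, y):
        if q not in node.children:
            return None
        node = node.children[q]
    return node


def delete(root, k, x, y):
    """Remove the point at cell (x, y); prune internal nodes left empty."""
    labels = _path(k, x, y)
    trail, node = [], root
    for q in labels[:-1]:
        if q not in node.children:
            return False
        trail.append((node, q))
        node = node.children[q]
    if labels[-1] not in node.children:
        return False
    del node.children[labels[-1]]
    for parent, q in reversed(trail):      # walk back up at most k - 1 levels
        if parent.children[q].children:
            break
        del parent.children[q]
    return True
```

Because every point sits at depth k, the tree's height does not depend on the insertion order.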

slide-40
SLIDE 40

MX-Quadtrees –Range Queries

Same as in point quadtrees. One difference: checking whether a point is in the circle defined by the range query needs to be performed at the leaf level (points are stored at the leaf level)

slide-41
SLIDE 41

SAMs - Detailed outline

spatial access methods

  • problem definition
  • k-d trees
  • point quadtrees
  • MX-quadtrees
  • z-ordering
  • R-trees

slide-42
SLIDE 42

z-ordering

Q: how would you organize, e.g., n-dim points, on disk? (C points per disk page)

Hint: reduce the problem to 1-d points (!!)

Q1: why?
A:
Q2: how?

slide-43
SLIDE 43

z-ordering

Q: how would you organize, e.g., n-dim points, on disk? (C points per disk page)

Hint: reduce the problem to 1-d points (!!)

Q1: why?
A: B-trees!
Q2: how?

slide-44
SLIDE 44

z-ordering

Q2: how?
A: assume finite granularity; z-ordering = bit-shuffling = N-trees = Morton keys = geo-coding = ...

slide-45
SLIDE 45

z-ordering

Q2: how?
A: assume finite granularity (e.g., 2^32 x 2^32; 4x4 here)
Q2.1: how to map n-d cells to 1-d cells?

slide-46
SLIDE 46

z-ordering

Q2.1: how to map n-d cells to 1-d cells?

slide-47
SLIDE 47

z-ordering

Q2.1: how to map n-d cells to 1-d cells?
A: row-wise
Q: is it good?

slide-48
SLIDE 48

z-ordering

Q: is it good?
A: great for the ‘x’ axis; bad for the ‘y’ axis

slide-49
SLIDE 49

z-ordering

Q: How about the ‘snake’ curve?

slide-50
SLIDE 50

z-ordering

Q: How about the ‘snake’ curve?
A: still problems:

[Figure: snake curve on a 2^32 x 2^32 grid]

slide-51
SLIDE 51

z-ordering

Q: Why are those curves ‘bad’?
A: no distance preservation (~ clustering)
Q: solution?

[Figure: 2^32 x 2^32 grid]

slide-52
SLIDE 52

z-ordering

Q: solution? (w/ good clustering, and easy to compute, for 2-d and n-d?)

slide-53
SLIDE 53

z-ordering

Q: solution? (w/ good clustering, and easy to compute, for 2-d and n-d?)
A: z-ordering/bit-shuffling/linear-quadtrees

‘looks’ better:

  • few long jumps;
  • scoops out the whole quadrant before leaving it
  • a.k.a. space filling curves
slide-54
SLIDE 54

z-ordering

z-ordering/bit-shuffling/linear-quadtrees

Q: How to generate this curve (z = f(x,y))?
A: 3 (equivalent) answers!

slide-55
SLIDE 55

z-ordering

z-ordering/bit-shuffling/linear-quadtrees

Q: How to generate this curve (z = f(x,y))?
A1: ‘z’ (or ‘N’) shapes, RECURSIVELY

[Figure: curves of order-1, order-2, ..., order (n+1)]
slide-56
SLIDE 56

z-ordering

Notice:

  • self-similar (we’ll see about fractals, soon)
  • method is hard to use: z =? f(x,y)

[Figure: curves of order-1, order-2, ..., order (n+1)]
slide-57
SLIDE 57

z-ordering

z-ordering/bit-shuffling/linear-quadtrees

Q: How to generate this curve (z = f(x,y))?
A: 3 (equivalent) answers!

Method #2?

slide-58
SLIDE 58

z-ordering

bit-shuffling

[Figure: 4x4 grid with the x and y axes labeled 00, 01, 10, 11]

Example: x = (00)2, y = (11)2; interleaving the bits (x bit first) gives z = (0101)2 = 5

slide-59
SLIDE 59

z-ordering

bit-shuffling

[Figure: 4x4 grid with the x and y axes labeled 00, 01, 10, 11]

Example: x = (00)2, y = (11)2; interleaving the bits (x bit first) gives z = (0101)2 = 5

How about the reverse: (x,y) = g(z)?
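
Bit-shuffling and its inverse can be sketched in a few lines of Python. The function names `z_encode` and `z_decode` and the explicit `bits` parameter (bits per coordinate) are assumptions for illustration; the bit order, x bit before y bit at each level, follows the slide's example.

```python
def z_encode(x, y, bits):
    """Interleave the bits of x and y (x bit first) into one z-value."""
    z = 0
    for i in range(bits - 1, -1, -1):       # most significant bit first
        z = (z << 1) | ((x >> i) & 1)       # next bit of x
        z = (z << 1) | ((y >> i) & 1)       # next bit of y
    return z


def z_decode(z, bits):
    """Inverse of z_encode: split a z-value back into (x, y)."""
    x = y = 0
    for i in range(bits - 1, -1, -1):
        x = (x << 1) | ((z >> (2 * i + 1)) & 1)   # odd positions hold x bits
        y = (y << 1) | ((z >> (2 * i)) & 1)       # even positions hold y bits
    return x, y


print(z_encode(0b00, 0b11, 2))   # the slide's example cell
print(z_decode(5, 2))
```

Both directions take time linear in the number of bits.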

slide-60
SLIDE 60

z-ordering

bit-shuffling

[Figure: 4x4 grid with the x and y axes labeled 00, 01, 10, 11]

Example: x = (00)2, y = (11)2; interleaving the bits (x bit first) gives z = (0101)2 = 5

How about n-d spaces?
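
One natural extension to n dimensions, sketched here as an assumption rather than as the deck's stated answer, is to interleave the bits of all n coordinates in a fixed round-robin order. The name `z_encode_nd` is hypothetical.

```python
def z_encode_nd(coords, bits):
    """Interleave the bits of n coordinates, first coordinate first."""
    z = 0
    for i in range(bits - 1, -1, -1):       # most significant bit first
        for c in coords:                    # one bit from each dimension
            z = (z << 1) | ((c >> i) & 1)
    return z


print(z_encode_nd((0, 3), 2))       # 2-d case, same cell as the slide's example
print(z_encode_nd((1, 0, 1), 1))    # a 3-d cell with one bit per coordinate
```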

slide-61
SLIDE 61

z-ordering

z-ordering/bit-shuffling/linear-quadtrees

Q: How to generate this curve (z = f(x,y))?
A: 3 (equivalent) answers!

Method #3?

slide-62
SLIDE 62

z-ordering

linear-quadtrees: assign N->1, S->0, etc.

[Figure: grid split into W/E and N/S halves; the four quadrants carry the z-value prefixes 00..., 01..., 10..., 11...]

slide-63
SLIDE 63

z-ordering

... and repeat recursively. E.g.: z(gray cell) = WN;WN = (0101)2 = 5

[Figure: grid split into W/E and N/S halves; quadrant prefixes 00..., 01..., 10..., 11...; sub-quadrants 00 and 11 marked]
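
The quadrant-naming method can be sketched as a small Python function. It assumes the encoding implied by the slide's example: W->0 and E->1 for the x bit, S->0 and N->1 for the y bit, with the east/west letter written first at each level. The name `z_from_path` is hypothetical.

```python
BIT = {"W": "0", "E": "1", "S": "0", "N": "1"}


def z_from_path(path):
    """z-value of a cell given its quadrant path, e.g. 'WN;WN'.

    Each level contributes two letters: east/west first, then north/south.
    Separators such as ';' are ignored.
    """
    bits = "".join(BIT[ch] for ch in path if ch in BIT)
    return int(bits, 2)


print(z_from_path("WN;WN"))   # the gray cell of the slide
```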

slide-64
SLIDE 64

z-ordering

Drill: z-value of grey cell, with the three methods?

[Figure: 4x4 grid with W/E and N/S halves marked and one cell shaded grey]

slide-65
SLIDE 65

z-ordering

Drill: z-value of grey cell, with the three methods?

  • method #1: 14
  • method #2: shuffle(11;10) = (1110)2 = 14

[Figure: 4x4 grid with W/E and N/S halves marked and one cell shaded grey]

slide-66
SLIDE 66

z-ordering

Drill: z-value of grey cell, with the three methods?

  • method #1: 14
  • method #2: shuffle(11;10) = (1110)2 = 14
  • method #3: EN;ES = ... = 14

[Figure: 4x4 grid with W/E and N/S halves marked and one cell shaded grey]

slide-67
SLIDE 67

z-ordering - Detailed outline

spatial access methods

z-ordering

  • main idea - 3 methods
  • use w/ B-trees; algorithms (range, knn queries, ...)
  • non-point (e.g., region) data
  • analysis; variations

R-trees

slide-68
SLIDE 68

z-ordering - usage & algo’s

Q1: How to store on disk?
A:
Q2: How to answer range queries etc.

slide-69
SLIDE 69

z-ordering - usage & algo’s

Q1: How to store on disk?
A: treat z-value as primary key; feed to B-tree

[Figure: grid with the cities SF and PGH]

z     cname   etc
5     SF
12    PGH

slide-70
SLIDE 70

z-ordering - usage & algo’s

MAJOR ADVANTAGES w/ B-tree:

  • already inside commercial systems (no coding/debugging!)
  • concurrency & recovery are ready

[Figure: grid with the cities SF (z = 5) and PGH (z = 12)]

slide-71
SLIDE 71

z-ordering - usage & algo’s

Q2: queries? (e.g.: find city at (0,3))?

[Figure: grid with the cities SF (z = 5) and PGH (z = 12)]

slide-72
SLIDE 72

z-ordering - usage & algo’s

Q2: queries? (e.g.: find city at (0,3))?
A: find z-value; search B-tree

[Figure: grid with the cities SF (z = 5) and PGH (z = 12)]
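
As a rough illustration of the point query, the sketch below uses a sorted Python list with the standard `bisect` module as a stand-in for the B-tree (a real system would use the DBMS index instead). The table contents come from the slide; `z_encode` and `point_query` are illustrative names.

```python
import bisect


def z_encode(x, y, bits):
    """Interleave the bits of x and y (x bit first)."""
    z = 0
    for i in range(bits - 1, -1, -1):
        z = (z << 1) | ((x >> i) & 1)
        z = (z << 1) | ((y >> i) & 1)
    return z


# Stand-in for the B-tree: (z, cname) rows kept sorted on the z-value key.
table = [(5, "SF"), (12, "PGH")]
keys = [z for z, _ in table]


def point_query(x, y, bits=2):
    """Find the city stored at cell (x, y), or None."""
    z = z_encode(x, y, bits)              # step 1: compute the z-value
    i = bisect.bisect_left(keys, z)       # step 2: search the sorted keys
    if i < len(keys) and keys[i] == z:
        return table[i][1]
    return None


print(point_query(0, 3))   # the slide's query: city at (0,3)
```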

slide-73
SLIDE 73

z-ordering - usage & algo’s

Q2: range queries?

[Figure: grid with the cities SF (z = 5) and PGH (z = 12)]

slide-74
SLIDE 74

z-ordering - usage & algo’s

Q2: range queries?
A: compute ranges of z-values; use B-tree

[Figure: grid with the cities SF (z = 5) and PGH (z = 12); the query rectangle covers the z-values 9, 11-15]

slide-75
SLIDE 75

z-ordering - usage & algo’s

Q2’: range queries - how to reduce # of qualifying ranges?

[Figure: grid with the cities SF (z = 5) and PGH (z = 12); the query rectangle covers the z-values 9, 11-15]

slide-76
SLIDE 76

z-ordering - usage & algo’s

Q2’: range queries - how to reduce # of qualifying ranges?
A: Augment the query!

[Figure: grid with the cities SF (z = 5) and PGH (z = 12); the query is enlarged so that the z-values 9, 11-15 become the single range 8-15]

slide-77
SLIDE 77

z-ordering - usage & algo’s

Q2’’: range queries - how to break a query into ranges?

[Figure: query rectangle covering the z-values 9, 11-15]

slide-78
SLIDE 78

z-ordering - usage & algo’s

Q2’’: range queries - how to break a query into ranges?
A: recursively, quadtree-style; decompose only non-full quadrants

[Figure: query rectangle covering the z-values 9, 11-15; the fully covered quadrant yields the range 12-15]

slide-79
SLIDE 79

z-ordering - usage & algo’s

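Q2’’: range queries - how to break a query into ranges?
A: recursively, quadtree-style; decompose only non-full quadrants

[Figure: query rectangle covering the z-values 9, 11-15; the fully covered quadrant yields the range 12-15, and the partially covered quadrant is decomposed further into 9 and 11]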
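
A minimal Python sketch of the quadtree-style decomposition is given below. It assumes an inclusive query rectangle on a 2^bits x 2^bits grid and the x-bit-first interleaving used earlier; `z_ranges` and `merge_adjacent` are hypothetical names. The rectangle x in [2,3], y in [1,3] on the 4x4 grid is the one whose cells have the z-values 9, 11-15.

```python
def z_ranges(qx0, qy0, qx1, qy1, bits):
    """Decompose the inclusive rectangle [qx0,qx1] x [qy0,qy1] into z-ranges."""
    out = []

    def visit(x0, y0, size, zlo):
        x1, y1 = x0 + size - 1, y0 + size - 1
        if x1 < qx0 or x0 > qx1 or y1 < qy0 or y0 > qy1:
            return                                   # quadrant misses the query
        if qx0 <= x0 and x1 <= qx1 and qy0 <= y0 and y1 <= qy1:
            out.append((zlo, zlo + size * size - 1)) # full quadrant: one range
            return
        half, step = size // 2, (size // 2) ** 2
        # Children in z-order: (x bit, y bit) = 00, 01, 10, 11.
        visit(x0, y0, half, zlo)
        visit(x0, y0 + half, half, zlo + step)
        visit(x0 + half, y0, half, zlo + 2 * step)
        visit(x0 + half, y0 + half, half, zlo + 3 * step)

    visit(0, 0, 1 << bits, 0)
    return out


def merge_adjacent(ranges):
    """Fuse consecutive ranges such as (11, 11) and (12, 15)."""
    merged = []
    for lo, hi in ranges:
        if merged and merged[-1][1] + 1 == lo:
            merged[-1] = (merged[-1][0], hi)
        else:
            merged.append((lo, hi))
    return merged


pieces = z_ranges(2, 1, 3, 3, bits=2)
print(pieces, merge_adjacent(pieces))
```

Each resulting range then becomes one range scan on the B-tree.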

slide-80
SLIDE 80

z-ordering - Detailed outline

spatial access methods

z-ordering

  • main idea - 3 methods
  • use w/ B-trees; algorithms (range, knn queries, ...)
  • non-point (e.g., region) data
  • analysis; variations

R-trees

slide-81
SLIDE 81

z-ordering - usage & algo’s

Q3: k-nn queries? (say, 1-nn)?

[Figure: grid with the cities SF (z = 5) and PGH (z = 12)]

slide-82
SLIDE 82

z-ordering - usage & algo’s

Q3: k-nn queries? (say, 1-nn)?
A: traverse B-tree; find nn wrt z-values and ...

[Figure: grid with the cities SF (z = 5) and PGH (z = 12)]

slide-83
SLIDE 83

z-ordering - usage & algo’s

... ask a range query.

[Figure: grid with the cities SF and PGH; labels 5, 3, 12 and ‘nn wrt z-value’]

slide-85
SLIDE 85

z-ordering - usage & algo’s

Q4: all-pairs queries? (all pairs of cities within 10 miles from each other?)

(we’ll see ‘spatial joins’ later: find all PA counties that intersect a lake)

[Figure: grid with the cities SF and PGH]

slide-86
SLIDE 86

z-ordering - Detailed outline

spatial access methods

z-ordering

  • main idea - 3 methods
  • use w/ B-trees; algorithms (range, knn queries, ...)
  • non-point (e.g., region) data
  • analysis; variations

R-trees ...

slide-87
SLIDE 87

z-ordering - regions

Q: z-value for a region?

[Figure: grid with regions A, B, C; zB = ??, zC = ??]

slide-88
SLIDE 88

z-ordering - regions

Q: z-value for a region?
A: 1 or more z-values; by quadtree decomposition

[Figure: grid with regions A, B, C; zB = ??, zC = ??]

slide-89
SLIDE 89

z-ordering - regions

Q: z-value for a region?

zB = 11** (‘*’ = “don’t care”)
zC = ??

[Figure: grid with regions A, B, C; W/E and N/S halves marked; quadrant prefixes 00..., 01..., 10..., 11...]

slide-90
SLIDE 90

z-ordering - regions

Q: z-value for a region?

zB = 11** (‘*’ = “don’t care”)
zC = {0010; 1000}

[Figure: grid with regions A, B, C; W/E and N/S halves marked; quadrant prefixes 00..., 01..., 10..., 11...]

slide-91
SLIDE 91

z-ordering - regions

Q: How to store in B-tree?
Q: How to search (range etc. queries)

[Figure: grid with regions A, B, C]

slide-92
SLIDE 92

z-ordering - regions

Q: How to store in B-tree?
A: sort (* < 0 < 1)
Q: How to search (range etc. queries)

z       obj-id   etc
0010    C
0101    A
1000    C
11**    B

[Figure: grid with regions A, B, C]

slide-93
SLIDE 93

z-ordering - regions

Q: How to search (range etc. queries) - e.g., the ‘red’ range query

[Figure: grid with regions A, B, C and the sorted z-value table (0010 C, 0101 A, 1000 C, 11** B)]

slide-94
SLIDE 94

z-ordering - regions

Q: How to search (range etc. queries) - e.g., the ‘red’ range query
A: break query into z-values; check B-tree

[Figure: grid with regions A, B, C and the sorted z-value table (0010 C, 0101 A, 1000 C, 11** B)]

slide-95
SLIDE 95

z-ordering - regions

Almost identical to range queries for point data, except for the “don’t cares” - i.e., 1100 ?? 11**

[Figure: grid with regions A, B, C and the sorted z-value table (0010 C, 0101 A, 1000 C, 11** B)]

slide-96
SLIDE 96

z-ordering - regions

Almost identical to range queries for point data, except for the “don’t cares” - i.e., z1 = 1100 ?? 11** = z2

Specifically: does z1 contain/avoid/intersect z2?

Q: what is the criterion to decide?

slide-97
SLIDE 97

z-ordering - regions

z1 = 1100 ?? 11** = z2

Specifically: does z1 contain/avoid/intersect z2?

Q: what is the criterion to decide?
A: Prefix property: let r1, r2 be the corresponding regions, and let r1 be the smaller one (=> z1 has the fewest ‘*’s). Then:

slide-98
SLIDE 98

z-ordering - regions

  • r2 will either completely contain or completely avoid r1
  • it will contain r1 if z2 (ignoring its ‘*’s) is a prefix of z1

[Figure: regions A, B, C; 1100 ?? 11**: the region of z1 is completely contained in the region of z2]
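
The prefix test is easy to express in Python. The sketch assumes z-values are given as strings of ‘0’/‘1’ padded with trailing ‘*’ characters, as on the slides; `contains` and `disjoint` are illustrative names.

```python
def contains(z_outer, z_inner):
    """True if the region of z_outer contains the region of z_inner."""
    # Drop the trailing don't-cares, then apply the prefix property.
    return z_inner.rstrip("*").startswith(z_outer.rstrip("*"))


def disjoint(za, zb):
    """Two quadtree blocks either nest or avoid each other completely."""
    return not contains(za, zb) and not contains(zb, za)


print(contains("11**", "1100"))   # the slide's z2 and z1
```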

slide-99
SLIDE 99

z-ordering - regions

Drill (True/False). Given:

z1 = 011001**
z2 = 01******
z3 = 0100****

T/F: r2 contains r1
T/F: r3 contains r1
T/F: r3 contains r2

slide-100
SLIDE 100

z-ordering - regions

Drill (True/False). Given:

z1 = 011001**
z2 = 01******
z3 = 0100****

T/F: r2 contains r1 - TRUE (prefix property)
T/F: r3 contains r1 - FALSE (disjoint)
T/F: r3 contains r2 - FALSE (r2 contains r3)

slide-101
SLIDE 101

z-ordering - regions

Drill (True/False). Given:

z1 = 011001**
z2 = 01******
z3 = 0100****

[Figure: the region of z2 marked on the grid]

slide-102
SLIDE 102

z-ordering - regions

Drill (True/False). Given:

z1 = 011001**
z2 = 01******
z3 = 0100****

[Figure: the regions of z3 and z2 marked on the grid]

T/F: r2 contains r1 - TRUE (prefix property)
T/F: r3 contains r1 - FALSE (disjoint)
T/F: r3 contains r2 - FALSE (r2 contains r3)

slide-103
SLIDE 103

z-ordering - regions

Spatial joins: find (quickly) all counties intersecting lakes

slide-104
SLIDE 104

z-ordering - regions

Spatial joins: find (quickly) all counties intersecting lakes

Naive algorithm: O(N * M)

Something faster?

slide-105
SLIDE 105

z-ordering - regions

Spatial joins: find (quickly) all counties intersecting lakes

z       obj-id   etc
0011    Erie
0101    Erie
…
10**    Ont.

z       obj-id   etc
0010    ALG
…       …
1000    WAS
11**    ALG

slide-106
SLIDE 106

z-ordering - regions

Spatial joins: find (quickly) all counties intersecting lakes

Solution: merge the lists of (sorted) z-values, looking for the prefix property

  • footnote #1: ‘*’ needs careful treatment
  • footnote #2: need duplicate elimination
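
One way to realize the merge, sketched here under stated assumptions rather than as the deck's exact algorithm, is a single pass over both sorted lists that keeps, per list, a stack of "open" blocks whose prefix still covers the current z-value. Trailing ‘*’s are stripped so that plain string order gives the * < 0 < 1 sort, and a set removes duplicate pairs. The name `spatial_join` is hypothetical, and the sample rows are only the ones legible on the slide (the elided rows are left out).

```python
def spatial_join(left, right):
    """Pairs (left_id, right_id) whose quadtree blocks intersect.

    left, right: lists of (z, obj_id), z a bit string padded with '*'.
    """
    events = sorted(
        [(z.rstrip("*"), 0, obj) for z, obj in left] +
        [(z.rstrip("*"), 1, obj) for z, obj in right]
    )
    open_blocks = ([], [])        # per side: nested blocks covering current z
    pairs = set()                 # set -> duplicate elimination
    for prefix, side, obj in events:
        for stack in open_blocks:
            # Close blocks that are not a prefix of the current z-value.
            while stack and not prefix.startswith(stack[-1][0]):
                stack.pop()
        for _, other in open_blocks[1 - side]:
            pairs.add((obj, other) if side == 0 else (other, obj))
        open_blocks[side].append((prefix, obj))
    return pairs


counties = [("0010", "ALG"), ("1000", "WAS"), ("11**", "ALG")]
lakes = [("0011", "Erie"), ("0101", "Erie"), ("10**", "Ont.")]
print(spatial_join(counties, lakes))
```

After sorting, the pass is linear in the two list lengths plus the number of reported pairs, compared with O(N * M) for the naive method.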

slide-107
SLIDE 107

z-ordering - Detailed outline

spatial access methods

z-ordering

  • main idea - 3 methods
  • use w/ B-trees; algorithms (range, knn queries, ...)
  • non-point (e.g., region) data
  • analysis; variations

R-trees

slide-108
SLIDE 108

z-ordering - variations

Q: is z-ordering the best we can do?

slide-109
SLIDE 109

z-ordering - variations

Q: is z-ordering the best we can do?
A: probably not - occasional long ‘jumps’
Q: then?

slide-110
SLIDE 110

z-ordering - variations

Q: is z-ordering the best we can do?
A: probably not - occasional long ‘jumps’
Q: then?
A1: Gray codes

slide-111
SLIDE 111

z-ordering - variations

A2: Hilbert curve! (a.k.a. Hilbert-Peano curve)

slide-112
SLIDE 112

z-ordering - variations

‘Looks’ better (never long jumps). How to derive it?

slide-113
SLIDE 113

z-ordering - variations

‘Looks’ better (never long jumps). How to derive it?

[Figure: Hilbert curves of order-1, order-2, ..., order (n+1)]

slide-114
SLIDE 114

z-ordering - variations

Q: function for the Hilbert curve (h = f(x,y))?
A: bit-shuffling, followed by post-processing, to account for rotations. Linear in the # of bits. See textbook for pointers to code/algorithms (e.g., [Jagadish, 90])
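
For illustration, here is a widely circulated iterative formulation of h = f(x,y) in Python. It is not necessarily the algorithm of [Jagadish, 90]: instead of shuffling first and post-processing afterwards, it emits one quadrant digit per level and rotates or reflects the remaining coordinates as it goes. The name `hilbert_index` and the orientation of the curve (which corner it starts in) are assumptions.

```python
def hilbert_index(order, x, y):
    """Position of cell (x, y) along a Hilbert curve on a 2^order grid."""
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:                          # one iteration per bit: linear time
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)      # quadrant digit at this level
        if ry == 0:                       # rotate/reflect the sub-square
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s >>= 1
    return d


# Visit order of the four cells of the order-1 curve.
print(sorted(((x, y) for x in range(2) for y in range(2)),
             key=lambda c: hilbert_index(1, *c)))
```

Unlike z-ordering, consecutive positions on this curve are always neighboring cells, which is the "never long jumps" property.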

slide-115
SLIDE 115

z-ordering - variations

Q: how about Hilbert curve in 3-d? n-d?
A: Exists (and is not unique!). E.g., 3-d, order-1 Hilbert curves (Hamiltonian paths on cube)

[Figure: two such paths on the cube, #1 and #2]

slide-116
SLIDE 116

z-ordering - Detailed outline

spatial access methods

z-ordering

  • main idea - 3 methods
  • use w/ B-trees; algorithms (range, knn queries, ...)
  • non-point (e.g., region) data
  • analysis; variations

R-trees ...

slide-117
SLIDE 117

z-ordering - analysis

Q: How many pieces (‘quad-tree blocks’) per region?
A: proportional to perimeter (surface etc.)

slide-118
SLIDE 118

z-ordering - analysis

(How long is the coastline, say, of England? Paradox: The answer changes with the yardstick -> fractals ...)

slide-119
SLIDE 119

z-ordering - analysis

Q: Should we decompose a region to full detail (and store in B-tree)?

slide-120
SLIDE 120

z-ordering - analysis

Q: Should we decompose a region to full detail (and store in B-tree)?
A: NO! approximation with 1-3 pieces/z-values is best [Orenstein90]

slide-121
SLIDE 121

z-ordering - analysis

Q: how to measure the ‘goodness’ of a curve?

slide-122
SLIDE 122

z-ordering - analysis

Q: how to measure the ‘goodness’ of a curve?
A: e.g., avg. # of runs, for range queries

[Figure: the same range query on two curves, giving 4 runs and 3 runs]

(#runs ~ #disk accesses on B-tree)
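
Counting runs for one query under z-ordering can be sketched as follows; a run is a maximal block of consecutive curve positions among the query's cells. The helper names `z_encode` and `count_runs` are assumptions, and the two sample rectangles are chosen for illustration rather than taken from the slide's figure.

```python
def z_encode(x, y, bits):
    """Interleave the bits of x and y (x bit first)."""
    z = 0
    for i in range(bits - 1, -1, -1):
        z = (z << 1) | ((x >> i) & 1)
        z = (z << 1) | ((y >> i) & 1)
    return z


def count_runs(x0, y0, x1, y1, bits, curve=z_encode):
    """Number of runs of consecutive curve values in an inclusive rectangle."""
    values = sorted(curve(x, y, bits)
                    for x in range(x0, x1 + 1)
                    for y in range(y0, y1 + 1))
    # A new run starts at the first value and after every gap.
    return sum(1 for i, v in enumerate(values)
               if i == 0 or v != values[i - 1] + 1)


print(count_runs(2, 1, 3, 3, bits=2))   # cells 9 and 11-15
print(count_runs(1, 1, 2, 2, bits=2))   # the 2x2 block in the middle of the grid
```

Averaging this count over many query rectangles, and passing a different `curve` function, gives one way to compare curves.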

slide-123
SLIDE 123

z-ordering - analysis

Q: So, is Hilbert really better?
A: 27% fewer runs, for 2-d (similar for 3-d)
Q: are there formulas for #runs, # of quadtree blocks etc.?
A: Yes ([Jagadish; Moon+ etc.] see textbook)

slide-124
SLIDE 124

z-ordering - fun observations

Hilbert and z-ordering curves: “space filling curves”: eventually, they visit every point in n-d space - therefore:

[Figure: curves of order-1, order-2, ..., order (n+1)]

slide-125
SLIDE 125

z-ordering - fun observations

... they show that the plane has as many points as a line (-> headaches for 1900’s mathematics/topology). (fractals, again!)

[Figure: curves of order-1, order-2, ..., order (n+1)]

slide-126
SLIDE 126

z-ordering - fun observations

Observation #2: Hilbert(-like) curve for video encoding [Y. Matias+, CRYPTO ‘87]: given a frame, visit its pixels in randomized Hilbert order; compress; and transmit

slide-127
SLIDE 127

z-ordering - fun observations

In general, Hilbert curve is great for preserving distances, clustering, vector quantization etc

slide-128
SLIDE 128

Conclusions

  • z-ordering is a great idea (n-d points -> 1-d points; feed to B-trees)
  • used by the TIGER system and (most probably) by other GIS products
  • works great with low-dim points