Orthogonal range searching


CG Lecture 4

Orthogonal range search

  • 1. Problem definition and motivation
  • 2. Space decomposition: techniques and trade-offs
  • 3. Space decomposition schemes:
      • Grids: uniform, non-hierarchical decomposition
      • Quad trees: uniform, hierarchical decomposition
      • Range trees and kd-trees: non-uniform, hierarchical
  • 4. Higher dimensions, variations on the theme

Orthogonal range searching

Problem: Given a set of n points in ℝ^d, preprocess them so that reporting or counting the k points inside a d-dimensional axis-parallel box (range) is efficient.

Sample application: Report all cities within a 20 km radius of Tel Aviv.
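As a baseline for the problem statement above, the following is a minimal sketch of the query with no preprocessing at all: every point is tested against the box, so each query costs O(d n). The function names and the sample points are illustrative assumptions, not part of the lecture.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def in_box(p: Point, lo: Sequence[float], hi: Sequence[float]) -> bool:
    """True if p lies in the closed axis-parallel box [lo[0], hi[0]] x ... ."""
    return all(l <= x <= h for x, l, h in zip(p, lo, hi))


def report_brute_force(points: List[Point], lo, hi) -> List[Point]:
    """Reporting query: return the k points inside the box."""
    return [p for p in points if in_box(p, lo, hi)]


def count_brute_force(points: List[Point], lo, hi) -> int:
    """Counting query: return only k."""
    return sum(1 for p in points if in_box(p, lo, hi))


# Hypothetical sample data.
pts = [(1.0, 1.0), (2.0, 5.0), (4.0, 3.0), (6.0, 2.0)]
found = report_brute_force(pts, lo=(1.5, 1.5), hi=(5.0, 6.0))
total = count_brute_force(pts, lo=(1.5, 1.5), hi=(5.0, 6.0))
```

The structures in the rest of the lecture aim to beat this linear scan by spending time and storage on preprocessing.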

Space decomposition techniques

  • Different schemes for different types of data, with various trade-offs
  • Types of space decomposition:
      • Grids: uniform, non-hierarchical decomposition
      • Quad trees: uniform, hierarchical decomposition
      • Range trees and kd-trees: non-uniform, hierarchical
  • Key efficiency parameters:
      • Preprocessing time: f(n) = Ω(n)
      • Preprocessing storage: f(n) = Ω(n)
      • Average query time: f(n, k)
      • Worst-case query time: f(n, k)
      • Dynamic update (insertions/deletions): f(n)

Grids: uniform, non-hierarchical

Method:

  • Store points in a uniform array [1:n]×[1:n].
  • Query is answered by reporting points in subarray [i:j]×[k:l].

Complexity:

  • Preprocessing in O(n) time and O(n^2) storage.
  • Query time O(k).
  • Update in O(1) when points lie on the grid.

Alternatives:

  • List, sparse matrix, hashing
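The hashing alternative can be sketched as follows: instead of a dense n×n array, only non-empty cells are kept in a dictionary keyed by cell index. This is an illustrative sketch; the class name `UniformGrid` and the `cell_size` parameter are assumptions, and points are not required to lie on grid nodes, so each candidate is re-checked against the query.

```python
from collections import defaultdict
from math import floor


class UniformGrid:
    """Uniform grid stored sparsely: only non-empty cells use memory."""

    def __init__(self, cell_size: float):
        self.cell = cell_size
        self.buckets = defaultdict(list)  # (ix, iy) -> points in that cell

    def _key(self, x: float, y: float):
        return (floor(x / self.cell), floor(y / self.cell))

    def insert(self, p):
        """O(1) expected-time update."""
        self.buckets[self._key(p[0], p[1])].append(p)

    def query(self, x1, x2, y1, y2):
        """Report points in [x1, x2] x [y1, y2] by scanning overlapped cells."""
        ix1, iy1 = self._key(x1, y1)
        ix2, iy2 = self._key(x2, y2)
        out = []
        for ix in range(ix1, ix2 + 1):
            for iy in range(iy1, iy2 + 1):
                # .get avoids creating empty buckets during queries
                for p in self.buckets.get((ix, iy), ()):
                    if x1 <= p[0] <= x2 and y1 <= p[1] <= y2:
                        out.append(p)
        return out


grid = UniformGrid(cell_size=2.0)
for p in [(1, 1), (2, 5), (4, 3), (6, 2), (4.5, 3.5)]:
    grid.insert(p)
hits = grid.query(1.5, 5, 1.5, 6)
```

Note that the query cost here depends on the number of cells overlapped as well as on k, which is one reason the cell size has to suit the data.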

Quad trees: uniform, hierarchical

Method:

  • Recursively partition the space into four quadrants until leaf quadrants contain a single point or no point.
  • Query is answered by recursively intersecting the query rectangle R with the quadrants and reporting the results according to the intersection.

Complexity: depends on the nature of the data set, not just the number of points!

Quad trees: construction

Method:

  • Start with the smallest rectangle containing all points.
  • Recursively classify the point set into four quadrants.
  • Stop when a quadrant has one or no points.

The height h of the tree is related to the size s of the initial rectangle and the smallest distance c between two points: a quadrant at depth i has side length s/2^i, so h ≤ log(s/c) + 3/2. A quad tree with n points and height h has O((h+1)n) nodes and can be constructed in O((h+1)n) time.

Quad trees: query answer

Method: At each level, check how the query rectangle R intersects each quadrant:

  • If it does not intersect, stop.
  • If it is included, report all points below.
  • Else, recursively check each quadrant.

Complexity:

  • Each test takes O(1) at each level.
  • Worst case: must descend to the deepest level and test all four quadrants: O(4(h+1)).
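The construction and query rules above can be sketched in a few lines. This is an illustrative sketch, assuming distinct points inside an initial square given by its lower-left corner and side length; the class and method names are hypothetical.

```python
class QuadTree:
    """Point quadtree over the square with corner (x, y) and side `size`."""

    def __init__(self, x, y, size, points):
        self.x, self.y, self.size = x, y, size
        self.points = list(points)  # non-empty only at leaves
        self.children = []
        if len(self.points) > 1:  # stop when one or no points remain
            half = size / 2
            quads = {(0, 0): [], (0, 1): [], (1, 0): [], (1, 1): []}
            for p in self.points:
                quads[(int(p[0] >= x + half), int(p[1] >= y + half))].append(p)
            self.children = [
                QuadTree(x + dx * half, y + dy * half, half, pts)
                for (dx, dy), pts in quads.items()
            ]
            self.points = []

    def report_all(self, out):
        out.extend(self.points)
        for c in self.children:
            c.report_all(out)

    def query(self, x1, x2, y1, y2, out=None):
        """Report points in [x1, x2] x [y1, y2]."""
        if out is None:
            out = []
        xa, xb = self.x, self.x + self.size
        ya, yb = self.y, self.y + self.size
        if x2 < xa or x1 > xb or y2 < ya or y1 > yb:
            return out  # no intersection: stop
        if x1 <= xa and xb <= x2 and y1 <= ya and yb <= y2:
            self.report_all(out)  # quadrant included: report everything below
            return out
        for p in self.points:  # partially overlapping leaf
            if x1 <= p[0] <= x2 and y1 <= p[1] <= y2:
                out.append(p)
        for c in self.children:  # else recurse into each quadrant
            c.query(x1, x2, y1, y2, out)
        return out


tree = QuadTree(0, 0, 8, [(1, 1), (2, 5), (4, 3), (6, 2), (4.5, 3.5)])
hits = tree.query(1.5, 5, 1.5, 6)
```

Two coincident points would make this recursion run forever, which mirrors the dependence of the height on the smallest inter-point distance c.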

Non-uniform, hierarchical decompositions

Range tree

  • Construct a binary search tree for coordinate x, and associate to each node in that tree a binary search tree for the y coordinate.
  • Complexity: O(n log n) to build and store the trees, O(log^2 n + k) to report k points.

Kd-tree

  • Construct a binary search tree by recursively splitting the points along a median line, alternating between the x and y coordinates.
  • Complexity: O(n log n) to build, O(n) to store the tree, O(√n + k) to report k points.

1D range searching

Points are real numbers, ranges are defined by two numbers u and v.

Algorithm:

  • Sort points in O(n log n) time.
  • Store points in a balanced binary tree whose leaves are points.
  • Each tree node stores the largest value of its left subtree.
  • Do binary search for u and v in the list in O(log n) time.
  • List all values in between.

[Figure: a sorted point list and the corresponding balanced binary search tree, with the query endpoints u and v marked.]

Search complexity: O(log n + k)
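The sorted-list version of the algorithm above maps directly onto Python's standard `bisect` module. A minimal sketch, with the function name and sample values chosen for illustration:

```python
from bisect import bisect_left, bisect_right


def range_query_1d(sorted_pts, u, v):
    """Report all values in [u, v] from a sorted list in O(log n + k)."""
    lo = bisect_left(sorted_pts, u)   # first index with value >= u
    hi = bisect_right(sorted_pts, v)  # first index with value > v
    return sorted_pts[lo:hi]


pts = sorted([7, -4, 3, 11, 0, 5, -2, 1])  # O(n log n) preprocessing
inside = range_query_1d(pts, 3.5, 8.2)
```

For a counting query, `hi - lo` gives k in O(log n) without listing the values.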

Range searching in a 1D tree

  • Find the two boundaries of the range, u and v, in the leaves.
  • Report all leaves in maximal subtrees between u and v.
  • Mark the vertex at which the search paths diverge as v-split.
  • Proceed to find the two boundaries, reporting values in the subtrees: when searching for the left (right) endpoint of the range, if the search goes left (right), report the entire right (left) subtree. When a leaf is reached, check its value.

[Figure: example tree with input range 3.5-8.2; the vertex where the two search paths diverge is labeled v-split.]

1D range trees: split node finding

1D range query algorithm
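The pseudocode from these two slides did not survive extraction. The following is a sketch of one standard way to realize the steps described earlier (leaves hold the points, each internal node holds the largest value of its left subtree); the names `find_split_node` and `range_query` are illustrative and not taken from the slides.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

    def is_leaf(self):
        return self.left is None


def build(sorted_pts):
    """Balanced tree over a non-empty sorted list; leaves store the points."""
    if len(sorted_pts) == 1:
        return Node(sorted_pts[0])
    mid = (len(sorted_pts) - 1) // 2
    # Internal key = largest value in the left subtree.
    return Node(sorted_pts[mid], build(sorted_pts[:mid + 1]), build(sorted_pts[mid + 1:]))


def find_split_node(root, u, v):
    """Descend while the search paths for u and v coincide."""
    node = root
    while not node.is_leaf() and (v <= node.key or u > node.key):
        node = node.left if v <= node.key else node.right
    return node


def report_subtree(node, out):
    if node.is_leaf():
        out.append(node.key)
    else:
        report_subtree(node.left, out)
        report_subtree(node.right, out)


def range_query(root, u, v):
    out = []
    split = find_split_node(root, u, v)
    if split.is_leaf():
        if u <= split.key <= v:
            out.append(split.key)
        return out
    node = split.left  # path toward u: report right subtrees
    while not node.is_leaf():
        if u <= node.key:
            report_subtree(node.right, out)
            node = node.left
        else:
            node = node.right
    if u <= node.key <= v:
        out.append(node.key)
    node = split.right  # path toward v: report left subtrees
    while not node.is_leaf():
        if v > node.key:
            report_subtree(node.left, out)
            node = node.right
        else:
            node = node.left
    if u <= node.key <= v:
        out.append(node.key)
    return out


root = build([-4, -2, 0, 1, 3, 5, 7, 11])
result = range_query(root, 3.5, 8.2)
```

The values are reported in path order rather than sorted order; sort the output if ordering matters.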

1D range query: run-time analysis

  • k: output size
  • Leaves: O(k) time
  • Internal nodes: O(k) time (since this is a binary tree)

  • Paths: O(log n) time
  • Total: O(log n + k) time
  • Worst case: k = n → Θ(n) time

2D range search: idea

Generalize 1D range searching.

Construction:

  • Construct a tree ordered by x coordinates.
  • Each inner vertex v contains a pointer to a secondary tree containing all the points of the primary subtree, ordered by y coordinate.
  • Points are stored only in the secondary trees.

[Figure: primary tree T sorted by x; vertex v points to its associated tree T_assoc(v), sorted by y.]

2D range tree: idea

Searching:

  • Given a 2D range, we simulate a 1D search and find subtrees sorted by x.
  • Instead of reporting the entire subtrees, invoke a search in the secondary trees sorted by y, and report only the points in the query range.

[Figure: primary tree T with vertex v and its associated tree T_assoc(v).]

2D range tree construction algorithm

2D range tree construction complexity

  • Same as a 1D tree, except that in each level the secondary trees are built as well.
  • Theorem: The space complexity is Θ(n log n).
  • Proof: The size of the primary tree is Θ(n). Each of its Θ(log n) levels corresponds to a collection of secondary trees that contains all the n points.
  • Time complexity (naïve analysis):

\[
T(n) =
\begin{cases}
O(1) & n = 1 \\
2\,T(n/2) + O(n \log n) & \text{else}
\end{cases}
\;=\; O(n \log^2 n)
\]

2D range tree construction complexity

Improvement:

  • Source of inefficiency: repeated sorting by y coordinate!
  • Instead, sort by y only once, and copy data in the recursive calls in linear time.
  • The resulting recursive equation is:

\[
T(n) =
\begin{cases}
O(1) & n = 1 \\
2\,T(n/2) + O(n) & \text{else}
\end{cases}
\;=\; O(n \log n)
\]
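The sort-once idea can be sketched as follows. Two simplifications are assumed here and are not from the slides: each associated structure is a y-sorted Python list rather than a balanced tree, and the input points are distinct. All names are illustrative.

```python
class RangeTreeNode:
    def __init__(self, x_key, by_y, left=None, right=None):
        self.x_key = x_key  # largest x-coordinate in the left subtree
        self.by_y = by_y    # associated structure: subtree points sorted by y
        self.left = left
        self.right = right


def build_range_tree(points):
    by_x = sorted(points)                              # x order, ties by y
    by_y = sorted(points, key=lambda p: (p[1], p[0]))  # sorted by y only once
    return _build(by_x, by_y)


def _build(by_x, by_y):
    if len(by_x) == 1:
        return RangeTreeNode(by_x[0][0], by_y)
    mid = (len(by_x) - 1) // 2
    median = by_x[mid]
    # Linear-time split of the y-sorted list; the relative order is preserved,
    # so no re-sorting is needed in the recursive calls.
    left_y = [p for p in by_y if p <= median]
    right_y = [p for p in by_y if p > median]
    return RangeTreeNode(
        median[0],
        by_y,
        _build(by_x[:mid + 1], left_y),
        _build(by_x[mid + 1:], right_y),
    )


def stored_points(node):
    """Total size of all associated structures (illustrates Θ(n log n) space)."""
    if node is None:
        return 0
    return len(node.by_y) + stored_points(node.left) + stored_points(node.right)


pts = [(1, 1), (2, 5), (4, 3), (6, 2), (4.5, 3.5), (3, 0), (5, 7), (7, 4)]
root = build_range_tree(pts)
```

With n = 8 points the tree has four levels, each holding every point once, so `stored_points(root)` is 32.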

2D range tree query algorithm
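The query pseudocode for this slide is missing from the extracted text. The sketch below follows the search idea described earlier: run the 1D search on x, and for every subtree that the 1D algorithm would report in full, search its y-sorted associated structure instead. It assumes distinct x-coordinates, builds the tree naively (re-sorting by y at every node) to stay short, and uses y-sorted lists with binary search as associated structures; all names are illustrative.

```python
from bisect import bisect_left, bisect_right


class Node:
    def __init__(self, pts):
        """pts: non-empty list of (x, y) sorted by x, with distinct x values."""
        self.by_y = sorted(pts, key=lambda p: p[1])
        self.ys = [p[1] for p in self.by_y]
        mid = (len(pts) - 1) // 2
        self.x_key = pts[mid][0]  # largest x in the left subtree
        self.left = self.right = None
        if len(pts) > 1:
            self.left = Node(pts[:mid + 1])
            self.right = Node(pts[mid + 1:])


def query_y(node, y1, y2, out):
    """1D search in the associated structure: O(log n + k_v)."""
    out.extend(node.by_y[bisect_left(node.ys, y1):bisect_right(node.ys, y2)])


def check_leaf(node, x1, x2, y1, y2, out):
    p = node.by_y[0]
    if x1 <= p[0] <= x2 and y1 <= p[1] <= y2:
        out.append(p)


def range_query_2d(root, x1, x2, y1, y2):
    out = []
    node = root  # find the split node on x
    while node.left is not None and (x2 <= node.x_key or x1 > node.x_key):
        node = node.left if x2 <= node.x_key else node.right
    if node.left is None:
        check_leaf(node, x1, x2, y1, y2, out)
        return out
    split = node
    node = split.left  # path toward x1
    while node.left is not None:
        if x1 <= node.x_key:
            query_y(node.right, y1, y2, out)
            node = node.left
        else:
            node = node.right
    check_leaf(node, x1, x2, y1, y2, out)
    node = split.right  # path toward x2
    while node.left is not None:
        if x2 > node.x_key:
            query_y(node.left, y1, y2, out)
            node = node.right
        else:
            node = node.left
    check_leaf(node, x1, x2, y1, y2, out)
    return out


pts = [(1, 1), (2, 5), (4, 3), (6, 2), (4.5, 3.5), (3, 0), (5, 7), (7, 4)]
root = Node(sorted(pts))
hits = range_query_2d(root, 1.5, 5, 1.5, 6)
```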

2D range search complexity

Recurrence equation:

\[
T(n) = \underbrace{O(\log n)}_{\text{traversing primary structure}}
     + \sum_{v} \Bigl( \underbrace{O(\log n)}_{\text{traversing secondary structure}}
     + \underbrace{k_v}_{\text{reporting}} \Bigr)
     = O(\log^2 n + k)
\]

The sum ranges over the vertices v whose secondary structures are called, and k_v is the number of points reported from the secondary structure of v.

  • The running time can be reduced to O(log n + k) by using fractional cascading.

d-dimensional range trees

The idea generalizes directly to d dimensions:

  • Create a series of trees, one for each dimension.
  • Search each as before.

Complexity:

  • Construction (time and storage):

\[
T_d(n) = O(n \log n) + T_{d-1}(n) \cdot O(\log n) = O(n \log^{d-1} n)
\]

  • Query (excluding the O(k) reporting term):

\[
Q_d(n) = O(\log n) + Q_{d-1}(n) \cdot O(\log n) = O(\log^{d} n)
\]

[Figure: nested trees sorted by x1, x2, and x3; a vertex v of one tree points to a tree on the next coordinate.]

2D kd-trees: idea

  • Bound the points by a rectangle.
  • Split the points into two equal-size subsets, using a horizontal or vertical line.
  • Continue recursively to partition the subsets, alternating the directions of the lines, until point subsets are small enough (of constant size).
  • Canonical subsets are subtrees.
  • In higher (k) dimensions: split directions alternate between the k axes (kd-trees).

Example of a 2D kd-tree

2D kd-tree construction

  • Partition the plane into axis-aligned rectangular regions.
  • Nodes represent partition lines, and leaves represent input points.
  • The bottleneck is finding the median, which requires only linear time!
  • Time complexity:

\[
T(n) =
\begin{cases}
O(1) & n = 1 \\
O(n) + 2\,T(\lceil n/2 \rceil) & n > 1
\end{cases}
\;=\; O(n \log n)
\]

[Figure: points A-H in the plane split by lines L1-L7, and the corresponding kd-tree with internal nodes L1-L7 and leaves A-H.]

2D kd-tree construction algorithm
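The construction pseudocode for this slide is missing from the extracted text. A minimal sketch of the recursive median split follows. One deviation is assumed for brevity: the median is found by sorting at each level, which costs O(n log^2 n) overall, whereas a linear-time selection would give the O(n log n) bound stated in the lecture. The class and field names are illustrative.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class KdNode:
    count: int                                    # points in this subtree
    point: Optional[Tuple[float, float]] = None   # set only at leaves
    axis: int = 0                                 # 0: vertical line, 1: horizontal
    split: float = 0.0                            # coordinate of the split line
    left: Optional["KdNode"] = None
    right: Optional["KdNode"] = None


def build_kdtree(points, depth=0):
    """Build a 2D kd-tree over a non-empty list of (x, y) points."""
    if len(points) == 1:
        return KdNode(count=1, point=points[0])
    axis = depth % 2  # alternate between x and y
    # Sorting stands in for linear-time median finding (assumption, see above).
    pts = sorted(points, key=lambda p: (p[axis], p[1 - axis]))
    mid = (len(pts) - 1) // 2
    return KdNode(
        count=len(pts),
        axis=axis,
        split=pts[mid][axis],
        left=build_kdtree(pts[:mid + 1], depth + 1),
        right=build_kdtree(pts[mid + 1:], depth + 1),
    )


pts = [(1, 1), (2, 5), (4, 3), (6, 2), (4.5, 3.5), (3, 0), (5, 7), (7, 4)]
root = build_kdtree(pts)
```

Storing `count` at each node is what later allows counting queries to add up whole subtrees without visiting them.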

2D range query search

  • Each node in the tree defines an axis-parallel rectangle in the plane, bounded by the lines marked by this vertex's ancestors.
  • Label each node with the number of points in that rectangle.

[Figure: the kd-tree on points A-H with each node labeled by its point count: 8 at the root, 4 at each child, 2 at each grandchild.]

2D range query search (cont.)

  • Given an axis-parallel range query R, search for this range in the tree.
  • Traverse only subtrees which represent regions overlapping R.
  • If a subtree is entirely contained in R:
      • Counting: Add up its count.
      • Reporting: Report entire subtree.

[Figure: query range R overlaid on the kd-tree subdivision, with the node counts used for counting queries.]

Example of kd-tree query answering

2D kd-tree search algorithm
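The search pseudocode for this slide is missing from the extracted text. The sketch below follows the traversal rules given earlier: skip regions that miss R, take whole subtrees whose region lies inside R, and recurse otherwise. Nodes are plain dictionaries and regions are closed rectangles `(xmin, xmax, ymin, ymax)`; these representations and all names are assumptions made for brevity.

```python
import math


def build(points, depth=0):
    if len(points) == 1:
        return {"point": points[0], "count": 1}
    axis = depth % 2
    pts = sorted(points, key=lambda p: p[axis])
    mid = (len(pts) - 1) // 2
    return {"axis": axis, "split": pts[mid][axis], "count": len(pts),
            "left": build(pts[:mid + 1], depth + 1),
            "right": build(pts[mid + 1:], depth + 1)}


def contains(R, reg):  # region reg entirely inside R
    return R[0] <= reg[0] and reg[1] <= R[1] and R[2] <= reg[2] and reg[3] <= R[3]


def intersects(R, reg):
    return R[0] <= reg[1] and reg[0] <= R[1] and R[2] <= reg[3] and reg[2] <= R[3]


def child_regions(node, reg):
    """Yield (child, region): the split line bounds each child's rectangle."""
    for side, offset in (("left", 1), ("right", 0)):
        r = list(reg)
        r[2 * node["axis"] + offset] = node["split"]  # left: new max, right: new min
        yield node[side], tuple(r)


def report_subtree(node, out):
    if "point" in node:
        out.append(node["point"])
    else:
        report_subtree(node["left"], out)
        report_subtree(node["right"], out)


EVERYTHING = (-math.inf, math.inf, -math.inf, math.inf)


def search(node, R, reg=EVERYTHING, out=None):
    """Reporting query for R = (x1, x2, y1, y2)."""
    if out is None:
        out = []
    if "point" in node:
        x, y = node["point"]
        if R[0] <= x <= R[1] and R[2] <= y <= R[3]:
            out.append(node["point"])
        return out
    for child, r in child_regions(node, reg):
        if contains(R, r):
            report_subtree(child, out)
        elif intersects(R, r):
            search(child, R, r, out)
    return out


def count(node, R, reg=EVERYTHING):
    """Counting query: contained subtrees contribute their stored count."""
    if "point" in node:
        x, y = node["point"]
        return int(R[0] <= x <= R[1] and R[2] <= y <= R[3])
    total = 0
    for child, r in child_regions(node, reg):
        if contains(R, r):
            total += child["count"]
        elif intersects(R, r):
            total += count(child, R, r)
    return total


pts = [(x, y) for x in range(8) for y in range(8)]
root = build(pts)
hits = search(root, (1.5, 5, 1.5, 6))
```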

Query time complexity analysis

  • k nodes are reported. How much time is spent on internal nodes? The nodes visited are those that are stabbed by R but are not contained in R. How many such cells exist?
  • Theorem: Every side of R stabs O(√n) cells of the tree.
  • Proof: Extend the side to a full line (wlog, horizontal). In the first level it stabs two children, and in the next level it stabs two out of the four grandchildren. Thus, the recursive equation is:

\[
Q(n) =
\begin{cases}
1 & n = 1 \\
2 + 2\,Q(n/4) & \text{else}
\end{cases}
\;=\; O(\sqrt{n})
\]

  • Total query time: O(√n + k).
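To see why the recurrence for the number of stabbed cells solves to O(√n), it can be unrolled directly. The derivation below is a sketch that assumes n is a power of 4 so that the recursion bottoms out exactly at Q(1) = 1.

```latex
\begin{align*}
Q(n) &= 2 + 2\,Q(n/4) \\
     &= 2 + 4 + 4\,Q(n/16) \\
     &= \sum_{i=1}^{j} 2^{i} + 2^{j}\,Q\!\left(n/4^{j}\right)
        && \text{after } j \text{ steps} \\
     &= \left(2^{j+1} - 2\right) + 2^{j}
        && \text{with } j = \log_4 n,\; Q(1) = 1 \\
     &= 3\sqrt{n} - 2
        && \text{since } 2^{\log_4 n} = \sqrt{n} \\
     &= O(\sqrt{n}).
\end{align*}
```

As a check, Q(4) = 2 + 2·1 = 4 = 3·2 − 2 and Q(16) = 2 + 2·4 = 10 = 3·4 − 2.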

kd-trees: higher dimensions

  • For a d-dimensional space:
      • Same algorithm.
      • Construction time: O(d n log n). (O(d) time is needed to handle a point.)
      • Space complexity: O(d n).
      • Query time complexity: O(d (n^(1-1/d) + k)).
  • Note: For large d, full scan is almost equally good.
  • Question: Are kd-trees useful for non-orthogonal range queries, e.g., disks, convex polygons?
  • Fact: Using interval trees, orthogonal range queries can be solved in O(d log^(d-1) n + k) time using O(d n log^(d-1) n) space.

Points in non-general position

  • Question: How can we handle sets of points which are not in general position, i.e., with multiple points with the same x coordinate?
  • Answer: By two-step order checks. When comparing according to x, resolve ties by y, and vice versa.
  • This splits points into two sides, having the same effect as infinitesimally rotating the plane.
  • Theorem: The modified order checks preserve the correctness of the algorithms.
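In code, the two-step order check amounts to comparing composite keys. The sketch below uses Python tuple comparison for this; the function names are illustrative. Extending the query endpoints with ±infinity in the tie-breaking coordinate is one common way to keep the reported set unchanged, and is stated here as an assumption rather than as the lecture's method.

```python
import math


def key_x(p):
    """Compare by x, resolve ties by y."""
    return (p[0], p[1])


def key_y(p):
    """Compare by y, resolve ties by x."""
    return (p[1], p[0])


def in_x_range(p, x1, x2):
    """x1 <= p.x <= x2 expressed with composite keys."""
    return (x1, -math.inf) <= key_x(p) <= (x2, math.inf)


# Several points share x = 2, and two share y = 5.
pts = [(2, 5), (2, 1), (1, 5), (2, 3)]
by_x = sorted(pts, key=key_x)
by_y = sorted(pts, key=key_y)
```

Under these keys no two distinct points compare equal, so a median split always separates the set into two well-defined sides.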