Box-Trees and R-Trees with Near-Optimal Query Time

Pankaj Agarwal - Duke University
Mark de Berg - Utrecht University
Joachim Gudmundsson - Utrecht University
Mikael Hammar - Lund University
Herman Haverkort - Utrecht University

Box-Trees
• each leaf node stores a geometric object
• each internal node:
  – has two children (or: O(1) children)
  – stores for each child the bounding box of all objects in the child's subtree
[Figure: 2D example of a box-tree]
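The definition above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the names `Leaf`, `Internal`, `bbox` and `make_internal` are hypothetical, objects are assumed to be 2D axis-parallel boxes `(x0, y0, x1, y1)`, and every internal node has exactly two children.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1), assumed layout


def union(a: Box, b: Box) -> Box:
    """Smallest axis-parallel box containing both a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


@dataclass
class Leaf:
    obj: Box  # the stored object, here represented by its own bounding box


@dataclass
class Internal:
    children: List[Union["Leaf", "Internal"]]
    child_boxes: List[Box]  # child_boxes[i] bounds everything below children[i]


def bbox(node: Union[Leaf, Internal]) -> Box:
    """Bounding box of all objects in the subtree rooted at node."""
    if isinstance(node, Leaf):
        return node.obj
    out = node.child_boxes[0]
    for b in node.child_boxes[1:]:
        out = union(out, b)
    return out


def make_internal(left, right) -> Internal:
    """Create an internal node that stores one bounding box per child."""
    return Internal([left, right], [bbox(left), bbox(right)])


# Three objects: two grouped under one internal node, one directly under the root.
tree = make_internal(
    make_internal(Leaf((0, 0, 1, 1)), Leaf((2, 0, 3, 1))),
    Leaf((0, 2, 1, 3)),
)
```

Storing the child boxes in the parent, rather than in the child itself, means a query can decide whether to descend without touching the child node.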

Applications
Box-trees store geometric data (2D, 3D, higher-D): maps, CAD models, etc.
Applications in:
• geographic information systems (e.g. point location)
• computer graphics (e.g. visibility queries)
• virtual reality (e.g. collision detection)
• robotics
• motion planning
Examples: point location, nearest neighbour, collision detection, range searching

Pros and cons
Advantages:
• low storage costs
  GIS databases and CAD models can be very large, so storage efficiency is critical and constants matter
• simple to implement
• flexible
  In many applications, many different types of objects must be stored and different types of queries are done
• usually good performance in practice
Disadvantages:
• no guarantee on performance
  query time depends on the way the tree is built; little theoretical work has been done on efficient constructions

Rectangle-Intersection Queries
Report all objects intersecting query rectangle R:
1. Check the bounding boxes stored at the root to see if they intersect R;
2. For each bounding box that intersects R, recursively visit the corresponding subtree; if that is a leaf, check the corresponding object and report it if it intersects R.
Running time: ≈ number of nodes visited = number of bounding boxes intersecting R.
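The two query steps can be sketched as a short recursive function. This is an illustrative sketch under stated assumptions: a node is a plain tuple, either `("leaf", box)` or `("node", [(child_box, child), ...])`, objects are 2D boxes `(x0, y0, x1, y1)`, and the names `intersects` and `query` are hypothetical.

```python
def intersects(a, b):
    """True if axis-parallel boxes a and b, each (x0, y0, x1, y1), share a point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def query(node, R, found, visited):
    """Append every object intersecting R to `found`; log each visited node."""
    visited.append(node)
    kind, payload = node
    if kind == "leaf":
        # Step 2, leaf case: check the object itself and report it if it hits R.
        if intersects(payload, R):
            found.append(payload)
        return
    # Steps 1 and 2: test each stored child box, descend only where it meets R.
    for child_box, child in payload:
        if intersects(child_box, R):
            query(child, R, found, visited)


a, b, c = (0, 0, 1, 1), (2, 0, 3, 1), (0, 2, 1, 3)
inner = ("node", [(a, ("leaf", a)), (b, ("leaf", b))])
root = ("node", [((0, 0, 3, 1), inner), (c, ("leaf", c))])

found, visited = [], []
query(root, (0.5, 0.5, 2.5, 0.8), found, visited)
```

In this example the query reports `a` and `b` and visits four nodes (the root, `inner`, and two leaves); the subtree holding `c` is never entered because its bounding box misses R.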

Known results
n = total number of input rectangles (object bounding boxes) in box-tree
k = number of input rectangles intersected by R
Lower bounds:
• De Berg et al. (2000):
  – input: disjoint unit cubes in d dimensions
  – query ranges: very thin/flat rectangles
  – bound: $\Omega(n^{1-1/d} + k)$
Upper bounds:
• A box in d dimensions can be represented by a point in 2d-dimensional 'configuration space': in 2D, the box with corners $(x_0, y_0)$ and $(x_1, y_1)$ becomes the point $(x_0, y_0, x_1, y_1)$.
  Determine which boxes are grouped together by partitioning the representative points using a kd-tree.
  Result: $O(n^{1-1/(2d)} + k)$
• De Berg et al. (2000):
  – input: rectangles in 2D
  – query range: rectangle with relative width w
  – bound: $O(\log^2 n + (w + k)\log n)$ ($\Theta(n)$ in the worst case)
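A minimal sketch of the configuration-space idea for d = 2: each box is already its own 4-dimensional representative point `(x0, y0, x1, y1)`, and a kd-tree-style median split on those points decides which boxes share a subtree. The function names and the choice to cycle through the four coordinates in index order are assumptions made for illustration; the slides do not specify the splitting order.

```python
def bounding_box(boxes):
    """Smallest axis-parallel box containing all boxes in the list."""
    return (
        min(b[0] for b in boxes),
        min(b[1] for b in boxes),
        max(b[2] for b in boxes),
        max(b[3] for b in boxes),
    )


def build(boxes, depth=0):
    """Build a box-tree by kd-splitting the 4D representative points.

    Returns ("leaf", box) or ("node", [(child_box, child), (child_box, child)]).
    Expects a non-empty list of boxes.
    """
    if len(boxes) == 1:
        return ("leaf", boxes[0])
    axis = depth % 4  # assumed order: x0, y0, x1, y1, then repeat
    ordered = sorted(boxes, key=lambda b: b[axis])
    mid = len(ordered) // 2
    halves = (ordered[:mid], ordered[mid:])
    return ("node", [(bounding_box(h), build(h, depth + 1)) for h in halves])


def leaves(node):
    """Collect all stored boxes in left-to-right order."""
    kind, payload = node
    if kind == "leaf":
        return [payload]
    return [box for _, child in payload for box in leaves(child)]


boxes = [(0, 0, 1, 1), (2, 0, 3, 1), (0, 2, 1, 3), (2, 2, 3, 3)]
tree = build(boxes)
```

The tree is balanced by construction because each split sends half of the points to each side; the bounding boxes are computed from the original boxes, not from the 4D cells.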

Our contribution
Lower bounds:
• $\Omega(n^{1-1/d} + k)$ also in the following case:
  – input: intersecting almost-unit-almost-cubes in d ≥ 2 dimensions
  – query ranges: points
• $\Omega(n^{1-1/d} + k)$ also in the following case:
  – input: disjoint almost-unit-almost-cubes in d ≥ 3 dimensions
  – query ranges: cubes
Upper bounds:
• Better analysis of configuration space approach: $O(n^{1-1/d} + k\log n)$ for point and rectangle queries
• After small modification of the construction: $\Theta(n^{1-1/d} + k)$ = optimal
• New construction for (almost) disjoint input in 2D:
  – $O(\sqrt{n}\log n + k)$ for rectangle queries
  – $O(\log^2 n)$ for point queries
• Variant of this construction: $O(\log^2 n + k)$ for queries with rectangles of bounded aspect ratio

Lower bound intersecting input
Theorem: for all n, there is a set of almost-unit-squares in 2D such that in any box-tree on this set, a point query with result ∅ takes $\Omega(\sqrt{n})$ time in the worst case.
Proof:
[Figure: $\sqrt{n}$ lower left corners and $\sqrt{n}$ upper right corners]
Input boxes are all combinations of a lower left corner with an upper right corner (n boxes).
• any box-tree has $\Theta(n)$ bounding boxes of pairs
• each intersects one of $O(\sqrt{n})$ query points
• at least one query point gets $\Omega(\sqrt{n})$ intersections
Generalises to higher dimensions: $\Omega(n^{1-1/d})$
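The last step of the proof is a pigeonhole count, which can be written out as follows. The symbols are introduced here for illustration only: $\mathcal{B}$ is the set of bounding boxes stored in the tree that cover at least two input boxes, $Q$ is the set of query points, and $c(p)$ is the number of boxes in $\mathcal{B}$ that contain $p$.

```latex
\[
  \sum_{p \in Q} c(p) \;\ge\; |\mathcal{B}| = \Theta(n),
  \qquad |Q| = O(\sqrt{n})
  \quad\Longrightarrow\quad
  \max_{p \in Q} c(p) \;\ge\; \frac{|\mathcal{B}|}{|Q|}
  = \frac{\Theta(n)}{O(\sqrt{n})} = \Omega(\sqrt{n}).
\]
```

The first inequality holds because every box in $\mathcal{B}$ contains at least one query point, so it is counted at least once in the sum. A query with the maximising point must visit every node whose bounding box contains it, although it reports nothing.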

Lower bound disjoint input
Lower bound holds also for disjoint input in 3D:
• start with the 2D construction on $n^{2/3}$ almost-squares ($n^{1/3}$ lower left corners, $n^{1/3}$ upper right corners)
• use the 3rd dimension to make disjoint almost-cubes (query points become edges of large cubes)
• line up $n^{1/3}$ such sets with query points in between
⟹ each of $\Theta(n)$ internal boxes intersects one of $O(n^{1/3})$ query points/cubes ⟹ $\Omega(n^{2/3})$ query time
Result:
• shows polylogarithmic point-query times are impossible without near-linear range-query time
• generalises to higher dimensions: $\Omega(n^{1-1/d})$
• does not work in 2D

Kd-Interval-Trees
• on each level, cut such that at most half of the input lies to one side and at most half lies to the other side
• store each side recursively
• store intersected boxes in separate substructures
• cut vertically on every odd level, horizontally on every even level
[Figure: boxes A, B, C, D with cutting lines 1, 2, 3, and the corresponding tree]
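The partitioning rule can be sketched as follows. This is a simplified illustration, not the construction from the talk: it only computes the recursive partition, the result is a plain dictionary rather than a box-tree, and the intersected boxes are kept in a list sorted along the cutting line instead of a binary tree. The cut position (the median lower coordinate) is an assumption; it is one choice that leaves at most half of the boxes strictly on either side.

```python
def build_kd_interval(boxes, depth=0):
    """Recursively partition 2D boxes (x0, y0, x1, y1) by alternating cuts."""
    if not boxes:
        return None
    # Depth 0 is level 1 (odd): vertical cut on x. Next level: horizontal cut on y.
    lo, hi = (0, 2) if depth % 2 == 0 else (1, 3)
    ordered = sorted(boxes, key=lambda b: b[lo])
    cut = ordered[len(ordered) // 2][lo]  # assumed choice: median lower coordinate
    before = [b for b in boxes if b[hi] < cut]   # entirely on the low side
    after = [b for b in boxes if b[lo] > cut]    # entirely on the high side
    # Boxes hit by the cutting line, ordered along the line (y for a vertical cut).
    crossing = sorted(
        (b for b in boxes if b[lo] <= cut <= b[hi]), key=lambda b: b[1 - lo]
    )
    return {
        "axis": "x" if lo == 0 else "y",
        "cut": cut,
        "crossing": crossing,
        "left": build_kd_interval(before, depth + 1),
        "right": build_kd_interval(after, depth + 1),
    }


boxes = [(0, 0, 1, 1), (2, 0, 3, 1), (4, 0, 5, 1), (1.5, 2, 4.5, 3)]
tree = build_kd_interval(boxes)
```

The box that supplies the cut coordinate always lands in `crossing`, so both recursive calls receive strictly fewer boxes and the recursion terminates.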

Kd-Interval-Trees: substructures
Substructures for boxes intersected by a cutting line: a binary tree on the order along the line
Analysis for search with query rectangle R:
• $O(\log n)$ bounding boxes may contain an endpoint of R's projection on the cutting line
• A bounding box in between the endpoints only intersects R if there is a leaf node to be reported in its subtree: $O(k\log n)$ bounding boxes
Total: $O(\log n + k\log n)$

Kd-Interval-Trees: query time
Analysis for the complete structure (rectangle query):
• known about kd-trees: only $O(\sqrt{n})$ kd-tree cells may intersect the boundary of a rectangle
• for each of them, spend $O(\log n + k'\log n)$ in the associated "intersected substructure", where $\sum k' = k$
Total: $O(\sqrt{n}\log n + k\log n)$
Analysis for the complete structure (point query):
• $O(\log n)$ cells may be visited
• spend $O(\log n)$ in each "intersected substructure"
Total: $O(\log^2 n)$
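The two totals follow by summing the per-substructure cost over the visited cells. In the sketch below, the index $i$ and the notation $k'_i$ for the number of answers found in the $i$-th substructure are introduced for illustration; the slide writes $k'$ without an index.

```latex
\[
  \text{rectangle query:}\quad
  \sum_{i=1}^{O(\sqrt{n})} O\bigl(\log n + k'_i \log n\bigr)
  = O\bigl(\sqrt{n}\,\log n\bigr) + O\Bigl(\log n \sum_i k'_i\Bigr)
  = O\bigl(\sqrt{n}\,\log n + k \log n\bigr),
\]
\[
  \text{point query:}\quad
  \sum_{i=1}^{O(\log n)} O(\log n) = O\bigl(\log^2 n\bigr).
\]
```

The rectangle bound uses $\sum_i k'_i = k$, so the output-sensitive term is charged once per reported object rather than once per cell.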

Priority nodes (like a priority search tree)
In each subtree, store the leftmost, rightmost, topmost and bottommost input objects as priority leaves directly under the root.
Effect for rectangle queries:
• search time in the substructures improves from $O(\log n + k\log n)$ to $O(\log n + k)$
• total search time improves from $O(\sqrt{n}\log n + k\log n)$ to $O(\sqrt{n}\log n + k)$

Conclusions
Results:
• lower bounds that hold with "normal" query ranges
• an easy construction which achieves optimal query time for range searching in box-trees on overlapping input in any number of dimensions
• an easy construction which achieves near-optimal query time for range and point searching in box-trees on disjoint input in 2D
• generalisations of the bounds and efficient conversions to R-trees
Open problems:
• Why do bounding volume hierarchies seem to work well in practice, despite bad bounds?
  – analysis under realistic constraints on input?
  – analysis for approximate range searching?
• How do our box-tree constructions compare to known heuristic approaches?
• How to deal with insertions and deletions?
