Approximate Range Emptiness in Constant Time and Optimal Space

Mayank Goswami, Allan Grønlund, Kasper Larsen, Rasmus Pagh
Max-Planck Institute for Informatics; MADALGO, Aarhus; IT University of Copenhagen
SODA 2015, San Diego

Approximate Range Emptiness
[Figure: number line from 0 to $U$ with the points $x_1, x_2, \dots, x_i, \dots, x_n$ marked and a query interval labeled "Empty?"]
Input: a set $S$ of $n$ elements from $[U]$.
Preprocess it to answer queries of the form $[a, b]$: is $[a, b] \cap S \neq \emptyset$?

Motivation: Exact versus Approximate Membership
Membership: Given a set $S = \{x_1, \dots, x_n\}$ from a universe $[U]$, preprocess the set to answer membership queries for a queried element $q$ (is $q \in S$?).
Minimum space required: $B = \lg \binom{U}{n}$ bits.
There exist data structures using $B + o(B)$ bits and $O(1)$ query time.
Is there a reduction in space if we only want $\varepsilon$-approximate answers? Yes.
Bloom filters (footnote 1): $O(n \lg(1/\varepsilon))$ space, $O(k)$ query time, false positive rate (FPR) $\varepsilon$. Here $k$ is the number of hash functions used, and depends on $\varepsilon$.
Optimal Bloom filters (Pagh et al.): query time $O(1)$ irrespective of $\varepsilon$, and space usage $(1 + o(1))\, n \lg(1/\varepsilon)$.
Footnote 1: Currently 4757 citations!
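
To make the space gap on this slide concrete, the following sketch evaluates the three quantities for one illustrative parameter choice. The function names and the sample values of U, n and the error rate are assumptions made for this example, not taken from the slides. The classic Bloom filter sizing used here, m = n ln(1/eps) / (ln 2)^2 bits, is the standard textbook formula.

```python
import math


def exact_membership_bits(universe_size: int, n: int) -> float:
    """Information-theoretic minimum B = lg C(U, n) for an n-subset of [U]."""
    return math.log2(math.comb(universe_size, n))


def optimal_approx_bits(n: int, eps: float) -> float:
    """Leading term n * lg(1/eps) of the optimal approximate-membership bound."""
    return n * math.log2(1.0 / eps)


def classic_bloom_bits(n: int, eps: float) -> float:
    """Standard Bloom filter sizing: m = n * ln(1/eps) / (ln 2)^2 bits."""
    return n * math.log(1.0 / eps) / (math.log(2) ** 2)


# Illustrative parameters (assumed for this example only).
U = 2**32
n = 1000
eps = 0.01

exact_bits = exact_membership_bits(U, n)
approx_bits = optimal_approx_bits(n, eps)
bloom_bits = classic_bloom_bits(n, eps)

print(f"exact membership:      {exact_bits:10.1f} bits")
print(f"optimal approximate:   {approx_bits:10.1f} bits")
print(f"classic Bloom filter:  {bloom_bits:10.1f} bits")
```

For these parameters the approximate bounds no longer depend on U at all, which is the saving the slide refers to.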

Approximate Range Emptiness
Range queries are more frequent in real life than membership queries.
Range emptiness: minimum space required is $B = \lg \binom{U}{n}$ bits. This follows from membership, since a membership query for $q$ is the range query $[q, q]$.
Alstrup et al.: $O(n)$ words $= O(n \lg U)$ bits, $O(k)$ reporting time, where $k$ is the number of reported points. The same structure answers emptiness (does there exist a point inside $[a, b]$?) in $O(1)$ time by stopping at the first reported point.
Approximate range emptiness (ARE): false negatives are not allowed, and a fraction $\varepsilon$ of false positives is allowed. Of all the $U^2/2$ range queries, only an $\varepsilon$ fraction may be false positives.
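
As a point of reference for the exact problem, here is a minimal baseline that answers exact range emptiness with a sorted array and binary search. It is not the structure of Alstrup et al.: it takes O(lg n) time per query rather than O(1), and the class name `SortedRangeEmptiness` is hypothetical. It is only meant to pin down what an exact answer is, so that an approximate structure can be checked against it.

```python
from bisect import bisect_left


class SortedRangeEmptiness:
    """Exact range emptiness over a static set, via binary search.

    Baseline sketch only: O(n) words of space, O(lg n) query time.
    """

    def __init__(self, points):
        self.points = sorted(set(points))

    def is_empty(self, a: int, b: int) -> bool:
        """Return True iff no stored point lies in the closed range [a, b]."""
        i = bisect_left(self.points, a)  # index of the first point >= a
        return i == len(self.points) or self.points[i] > b


exact = SortedRangeEmptiness([5, 17, 42])
print(exact.is_empty(6, 16))   # no stored point between 6 and 16
print(exact.is_empty(17, 17))  # the range [17, 17] is a membership query
```

An ARE structure may answer "non-empty" on some ranges where this baseline says "empty", but never the other way around.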

Main Question
Can we reduce the space usage for range queries to something lower than $n \lg U$ bits by settling for approximate answers, as with membership versus approximate membership queries?

One way to do ARE
Suppose we want a data structure that only answers ranges of size at most $L < U$. One way to answer an approximate range emptiness query on $[a, b]$:
Build a Bloom filter on $S$ with FPR $\varepsilon / L$.
For every $x \in [a, b]$, run a membership query on the Bloom filter.
By a union bound over the at most $L$ membership queries, the false positive rate is at most $\varepsilon$.
This uses $n \lg(L/\varepsilon)$ bits of space and achieves a query time of $O(r)$, where $r$ is the size of the range.
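
The construction on this slide can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the class names are hypothetical, the filter is a classic Bloom filter with standard sizing (so it uses roughly 1.44 times the optimal space and k hash evaluations per probe), and the hash functions are derived from `hashlib.blake2b` purely for convenience.

```python
import hashlib
import math


class BloomFilter:
    """Classic Bloom filter sized for n keys and a target false positive rate."""

    def __init__(self, n: int, fpr: float):
        # Standard sizing: m = n ln(1/fpr) / (ln 2)^2 bits, k = (m/n) ln 2 hashes.
        self.m = max(1, math.ceil(n * math.log(1.0 / fpr) / math.log(2) ** 2))
        self.k = max(1, round(self.m / n * math.log(2)))
        self.bits = bytearray((self.m + 7) // 8)

    def _positions(self, x: int):
        # k bit positions for x, one per salted hash (illustrative choice).
        for i in range(self.k):
            digest = hashlib.blake2b(f"{i}:{x}".encode(), digest_size=8).digest()
            yield int.from_bytes(digest, "big") % self.m

    def add(self, x: int) -> None:
        for p in self._positions(x):
            self.bits[p >> 3] |= 1 << (p & 7)

    def __contains__(self, x: int) -> bool:
        return all(self.bits[p >> 3] & (1 << (p & 7)) for p in self._positions(x))


class BloomRangeEmptiness:
    """ARE for ranges of size at most max_len: probe every point of the range."""

    def __init__(self, points, max_len: int, eps: float):
        points = list(points)
        self.max_len = max_len
        # Per-point FPR eps / L, so a union bound over <= L probes gives eps.
        self.bloom = BloomFilter(max(1, len(points)), eps / max_len)
        for x in points:
            self.bloom.add(x)

    def maybe_nonempty(self, a: int, b: int) -> bool:
        """False means [a, b] is certainly empty; True may be a false positive."""
        if b - a + 1 > self.max_len:
            raise ValueError("range longer than the supported maximum")
        return any(x in self.bloom for x in range(a, b + 1))


are = BloomRangeEmptiness([5, 17, 42], max_len=8, eps=0.01)
print(are.maybe_nonempty(40, 47))  # contains 42, so this is always True
```

The query loop makes the O(r) cost visible: every point of the range is probed, which is the cost a constant-time structure has to avoid.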

Results: Lower Bounds

Lower Bounds
We first show that the space-error tradeoff cannot be improved significantly.
Theorem. Any data structure for the ARE problem answering all query intervals of a fixed length $L \le U/(5n)$ with false positive rate $\varepsilon > 0$ must use at least
$$s \ge n \lg\!\left(\frac{L^{1 - O(\varepsilon)}}{\varepsilon}\right) - O(n)$$
bits of space.

Extension to Two-Sided Errors
Theorem. Any data structure for ARE with two-sided error rate $\varepsilon$ must use
$$s \ge n \lg(L/\varepsilon) - O(n) \text{ bits when } 0 < \varepsilon < \frac{1}{\lg U},$$
$$s = \Omega\!\left(\frac{n \lg(L \lg U)}{\lg_{1/\varepsilon} \lg U}\right) \text{ bits when } \frac{1}{\lg U} \le \varepsilon \le \frac{1}{2} - \Omega(1).$$

Results: Upper Bounds

Upper Bounds
There is a data structure $D_a$ for the ARE problem that answers range emptiness for all ranges of length at most $L$, uses $n \lg(L/\varepsilon) + O(n \lg^{\delta}(L/\varepsilon))$ bits of space for any desired constant $\delta > 0$, and has false positive probability at most $\varepsilon$.
There is also a data structure $D_e$ that uses $n \lg(U/n) + o(n \lg^{\delta}(U/n))$ bits (footnote 2) and answers exact range reporting in $O(k)$ time and exact emptiness in $O(1)$ time.
Footnote 2: The previous best used $O(n \lg U)$ bits.

Upper Bounds: Reduction of Universe
Map the universe with $f : [U] \to [R]$, where $R = nL/\varepsilon$.
On $[R]$ we use the exact range emptiness/reporting data structure.
Since $R/n = L/\varepsilon$, this would give constant query time in $n \lg(R/n) + n \lg^{\delta}(R/n) = n \lg(L/\varepsilon) + n \lg^{\delta}(L/\varepsilon)$ bits, which would be optimal.
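
To see why this space would be optimal, it helps to put it next to the one-sided lower bound stated earlier. The following is a short worked comparison, assuming the lower bound's condition $L \le U/(5n)$ holds and writing the hidden constant in $L^{1 - O(\varepsilon)}$ as $c$ (a symbol introduced here for illustration only).

```latex
\begin{align*}
\text{Upper bound:}\quad
  s_{\text{upper}} &= n \lg\frac{R}{n} + n \lg^{\delta}\frac{R}{n}
      = n \lg\frac{L}{\varepsilon} + n \lg^{\delta}\frac{L}{\varepsilon},
      \qquad \text{using } \frac{R}{n} = \frac{L}{\varepsilon}. \\[4pt]
\text{Lower bound:}\quad
  s_{\text{lower}} &= n \lg\frac{L^{1 - c\varepsilon}}{\varepsilon} - O(n)
      = n \lg\frac{L}{\varepsilon} - c\,\varepsilon\, n \lg L - O(n). \\[4pt]
\text{Gap:}\quad
  s_{\text{upper}} - s_{\text{lower}} &= O(\varepsilon\, n \lg L) + O(n)
      + n \lg^{\delta}\frac{L}{\varepsilon}.
\end{align*}
```

Both bounds share the term $n \lg(L/\varepsilon)$, and the gap consists only of the three correction terms in the last line.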
