
Partial match queries: a limit process
Nicolas Broutin, Ralph Neininger, Henning Sulzbach

Background/Introduction

Data structures/Algorithms
◮ Analysis of costs/running times in natural conditions
◮ expected cost
◮ performance guarantee provided by concentration

Methodology
◮ complex “objects” that decompose recursively (tree-like, or related)
◮ general approach for convergence using contractions

Searching geometric data and quadtrees
[Figure: quadtree partition of the unit square, with the four child cells labelled 1 to 4.]

Model and previous results

Point set $\{(U_i, V_i),\ i \ge 1\}$ iid uniform in $[0,1]^2$.
$C_n(s)$: the number of lines intersecting $\{x = s\}$ in a quadtree of size $n$.

Theorem (Flajolet, Gonnet, Puech and Robson (1993)). For $\xi$ uniform, independent of $\{(U_i, V_i),\ i \ge 1\}$,
$$E[C_n(\xi)] \sim \kappa\, n^{\beta}, \quad\text{where}\quad \kappa = \frac{\Gamma(2\beta+2)}{2\,\Gamma(\beta+1)^3}, \quad \beta = \frac{\sqrt{17}-3}{2}.$$

Theorem (Chern and Hwang (2003)). Let $\phi(z) = (z+1)(z+2) - 4$ and $\beta > \beta'$ the roots of $\phi$. For $\xi$ uniform, independent of $\{(U_i, V_i),\ i \ge 1\}$, one has the exact expression
$$E[C_n(\xi)] = \sum_{1 \le k \le n} \binom{n}{k} (-1)^{k+1}\, \frac{2\,(1-\beta)_{k-1}\,(1-\beta')_{k-1}}{k!\,(k+1)!}.$$

Corollary (Chern and Hwang (2003)). For $\xi$ uniform, independent of $\{(U_i, V_i),\ i \ge 1\}$,
$$E[C_n(\xi)] = \kappa\, n^{\beta} - 1 + O(n^{\beta-1}).$$
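As a sanity check on the exact expression, the alternating sum can be evaluated in exact rational arithmetic. This is only a sketch: it assumes that $(x)_m$ denotes the rising factorial $x(x+1)\cdots(x+m-1)$, so that $(1-\beta)_{k-1}(1-\beta')_{k-1} = \prod_{j=1}^{k-1}\phi(j)$, and it takes $\kappa = \Gamma(2\beta+2)/(2\Gamma(\beta+1)^3)$. The function names are illustrative.

```python
from fractions import Fraction
from math import comb, factorial, gamma, sqrt


def phi(z):
    """phi(z) = (z + 1)(z + 2) - 4 = (z - beta)(z - beta')."""
    return (z + 1) * (z + 2) - 4


def expected_cost(n):
    """Exact E[C_n(xi)] from the alternating sum, returned as a Fraction.

    Assumption: (x)_m is the rising factorial, hence
    (1 - beta)_{k-1} (1 - beta')_{k-1} = phi(1) * ... * phi(k - 1).
    """
    total = Fraction(0)
    prod = 1  # empty product, used for k = 1
    for k in range(1, n + 1):
        term = Fraction(2 * comb(n, k) * prod, factorial(k) * factorial(k + 1))
        total += term if k % 2 == 1 else -term  # sign (-1)^(k+1)
        prod *= phi(k)  # prod is now phi(1) * ... * phi(k), ready for k + 1
    return total


beta = (sqrt(17) - 3) / 2
kappa = gamma(2 * beta + 2) / (2 * gamma(beta + 1) ** 3)

for n in (1, 2, 3, 50, 200):
    # exact value next to the first two terms of the corollary
    print(n, float(expected_cost(n)), kappa * n ** beta - 1)
```

For small sizes the values can be checked by hand: a single point is always visited, and for two points the second one is visited with probability 2/3, which gives 5/3.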

Idea of the method / heuristic for the constants

Recursive decomposition
[Figure: the unit square split at the first point $(U, V)$, with the query line at $\xi$ and the length $Y$ marked.]

We have $Y \stackrel{d}{=} \max\{U_1, U_2\}$ and $(I, J) = \mathrm{Mult}(\mathrm{Bin}(n-1, Y);\ V, (1-V))$, then
$$C_n(\xi) \stackrel{d}{=} 1 + C_I(\xi') + C_J(\xi') \quad\Rightarrow\quad E[C_n(\xi)] \approx 2\,E[C_{nYV}(\xi')].$$
Plugging $E[C_n(\xi)] = \kappa n^{\beta}$ yields
$$1 = 2\,E[Y^{\beta} V^{\beta}] = 2\,E[Y^{\beta}] \cdot E[V^{\beta}] = \frac{4}{(\beta+2)(\beta+1)} \quad\Longrightarrow\quad \beta = \frac{\sqrt{17}-3}{2}.$$

About the variance $\mathrm{Var}(C_n(\xi))$
Even when conditioning on the first point, the two terms are still dependent on the query line.
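The heuristic for $\beta$ can be checked numerically. The sketch below assumes $Y$ has density $2y$ on $[0,1]$ (the law of $\max\{U_1, U_2\}$) and $V$ is uniform, and approximates both moments with a midpoint rule; the helper name is illustrative.

```python
from math import sqrt


def two_e_y_v(b, steps=100_000):
    """Approximate 2 * E[Y^b] * E[V^b] with a midpoint rule.

    Y has density 2y on [0, 1] (law of max{U1, U2}); V is uniform on [0, 1].
    """
    h = 1.0 / steps
    e_y = sum(((i + 0.5) * h) ** b * 2 * (i + 0.5) * h for i in range(steps)) * h
    e_v = sum(((i + 0.5) * h) ** b for i in range(steps)) * h
    return 2 * e_y * e_v


beta = (sqrt(17) - 3) / 2

# beta is the positive root of (b + 2)(b + 1) = 4
print("beta =", beta)
print("(beta + 2)(beta + 1) =", (beta + 2) * (beta + 1))
print("2 E[Y^beta] E[V^beta] ~", two_e_y_v(beta))
```

Both printed quantities should be close to 4 and 1 respectively, in line with $E[Y^{\beta}] = 2/(\beta+2)$ and $E[V^{\beta}] = 1/(\beta+1)$.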

The cost at a fixed query line

Idea:
◮ if the query line is fixed at $s \in (0,1)$, then we do have independence
◮ however, its relative position changes in the subproblems
◮ ⇒ consider the entire process $(C_n(s),\ s \in (0,1))$

Theorem (Flajolet, Labelle, Laforest and Salvy (1995)).
$$E[C_n(0)] = \Theta(n^{\sqrt{2}-1}) = o(n^{\beta}).$$
Note: in particular, $E[C_n(U_1)] = o(n^{\beta})$, and $C_n(s)$ is not concentrated.

Theorem (Curien and Joseph (2011)). For every fixed $s \in (0,1)$, one has
$$E[C_n(s)] \sim K_1\, (s(1-s))^{\beta/2}\, n^{\beta}, \qquad K_1 = \frac{\Gamma(2\beta+2)\,\Gamma(\beta+2)}{2\,\Gamma(\beta+1)^3\,\Gamma(\beta/2+1)^2}.$$
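The fixed-line constant and the uniform-query constant should be compatible: averaging $K_1 (s(1-s))^{\beta/2}$ over a uniform $s$ should give back $\kappa$. The sketch below checks this numerically, assuming $\kappa = \Gamma(2\beta+2)/(2\Gamma(\beta+1)^3)$; the function name is illustrative.

```python
from math import gamma, sqrt

beta = (sqrt(17) - 3) / 2
kappa = gamma(2 * beta + 2) / (2 * gamma(beta + 1) ** 3)
K1 = (gamma(2 * beta + 2) * gamma(beta + 2)
      / (2 * gamma(beta + 1) ** 3 * gamma(beta / 2 + 1) ** 2))


def profile_integral(steps=100_000):
    """Midpoint-rule approximation of the integral of (s (1 - s))^(beta/2) over [0, 1]."""
    h = 1.0 / steps
    return sum((((i + 0.5) * h) * (1 - (i + 0.5) * h)) ** (beta / 2)
               for i in range(steps)) * h


print("K1 =", K1)
print("K1 * integral =", K1 * profile_integral())
print("kappa =", kappa)
```

The integral equals the beta integral $B(\beta/2+1, \beta/2+1) = \Gamma(\beta/2+1)^2/\Gamma(\beta+2)$, which is why the two Gamma factors cancel.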

Main result

Theorem. There exists a random continuous function $Z$ such that, as $n \to \infty$,
$$\left(\frac{C_n(s)}{K_1 n^{\beta}},\ s \in [0,1]\right) \stackrel{d}{\to} (Z(s),\ s \in [0,1]). \qquad (1)$$
This convergence in distribution holds in the Banach space $(D[0,1], \|\cdot\|)$ of right-continuous functions with left limits (càdlàg) equipped with the supremum norm.

Proposition. The distribution of the random function $Z$ in (1) is a fixed point of the following equation
$$Z(s) \stackrel{d}{=} \mathbf{1}_{\{s<U\}} \left[ (UV)^{\beta} Z^{(1)}\!\left(\frac{s}{U}\right) + (U(1-V))^{\beta} Z^{(2)}\!\left(\frac{s}{U}\right) \right] + \mathbf{1}_{\{s \ge U\}} \left[ ((1-U)V)^{\beta} Z^{(3)}\!\left(\frac{s-U}{1-U}\right) + ((1-U)(1-V))^{\beta} Z^{(4)}\!\left(\frac{s-U}{1-U}\right) \right],$$
where $U$ and $V$ are independent $[0,1]$-uniform random variables and $Z^{(i)}$, $i = 1, \dots, 4$, are independent copies of the process $Z$, which are also independent of $U$ and $V$. Furthermore, $Z$ in (1) is the only solution such that $E[Z(s)] = (s(1-s))^{\beta/2}$ for all $s \in [0,1]$ and $E[\|Z\|^2] < \infty$.
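One consequence of the proposition can be checked numerically: taking expectations on both sides of the fixed-point equation, the mean function $m(s) = (s(1-s))^{\beta/2}$ should reproduce itself. The sketch below integrates out $V$ in closed form, using $E[V^{\beta} + (1-V)^{\beta}] = 2/(\beta+1)$, and integrates over $U$ with a midpoint rule. The function names are illustrative, and this verifies only the first moment, not the distributional identity.

```python
from math import sqrt

beta = (sqrt(17) - 3) / 2


def m(s):
    """Mean function E[Z(s)] = (s (1 - s))^(beta / 2) stated in the proposition."""
    return (s * (1 - s)) ** (beta / 2)


def rhs_mean(s, steps=20_000):
    """Expectation of the right-hand side of the fixed-point equation at s.

    Uses independence of the copies Z^(i) from (U, V), so each Z^(i)(t)
    contributes m(t); V is integrated out as 2 / (beta + 1).
    """
    total = 0.0
    h = (1 - s) / steps
    for i in range(steps):  # case s < U: relative position s / U
        u = s + (i + 0.5) * h
        total += u ** beta * m(s / u) * h
    h = s / steps
    for i in range(steps):  # case s >= U: relative position (s - U) / (1 - U)
        u = (i + 0.5) * h
        total += (1 - u) ** beta * m((s - u) / (1 - u)) * h
    return 2 / (beta + 1) * total


for s in (0.1, 0.5, 0.9):
    print(s, m(s), rhs_mean(s))
```

The two printed columns should agree up to quadrature error; analytically the right-hand side equals $m(s) \cdot 4/((\beta+1)(\beta+2))$, and the last factor is 1 by the definition of $\beta$.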

What does it look like I
[Figure: plot for $n = 1000$; horizontal axis from 0.0 to 1.0, vertical axis from 0.0 to 2.0.]

What does it look like II
[Figure: plot; horizontal axis from 0.0 to 1.0, vertical axis from 0.0 to 1.2.]

Moments and supremum

Theorem. We have for all $s \in (0,1)$, as $n \to \infty$,
$$\mathrm{Var}(C_n(s)) \sim \left[ 2B(\beta+1, \beta+1)\, \frac{2\beta+1}{3(1-\beta)} - 1 \right] (s(1-s))^{\beta}\, n^{2\beta}.$$
Here, $B(a,b) := \int_0^1 x^{a-1}(1-x)^{b-1}\,dx$ denotes the Eulerian beta integral ($a, b > 0$).

Theorem. Let $S_n = \sup_{s \in [0,1]} C_n(s)$. Then, as $n \to \infty$,
$$n^{-\beta} S_n \stackrel{d}{\to} S = \sup_{s \in [0,1]} Z(s) \quad\text{and}\quad E[S_n] \sim n^{\beta} E[S], \quad \mathrm{Var}(S_n) \sim n^{2\beta}\, \mathrm{Var}(S).$$
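To get a feel for the size of the variance constant, it can be evaluated numerically. The sketch below reads the bracket as $2B(\beta+1,\beta+1)\,\frac{2\beta+1}{3(1-\beta)} - 1$ (an assumption about how the formula groups), uses the standard identity $B(a,b) = \Gamma(a)\Gamma(b)/\Gamma(a+b)$, and cross-checks it against the integral definition. The function names are illustrative.

```python
from math import gamma, sqrt

beta = (sqrt(17) - 3) / 2


def beta_fn(a, b):
    """Eulerian beta integral via B(a, b) = Gamma(a) Gamma(b) / Gamma(a + b)."""
    return gamma(a) * gamma(b) / gamma(a + b)


def beta_fn_quadrature(a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of x^(a-1) (1-x)^(b-1) over [0, 1]."""
    h = 1.0 / steps
    return sum(((i + 0.5) * h) ** (a - 1) * (1 - (i + 0.5) * h) ** (b - 1)
               for i in range(steps)) * h


# Assumed grouping of the constant: 2 B(b+1, b+1) (2b+1) / (3 (1-b)) - 1
var_const = 2 * beta_fn(beta + 1, beta + 1) * (2 * beta + 1) / (3 * (1 - beta)) - 1


def variance_leading_term(n, s):
    """Leading-order term of Var(C_n(s)) as stated in the theorem."""
    return var_const * (s * (1 - s)) ** beta * n ** (2 * beta)


print("variance constant =", var_const)
print("leading term at n = 1000, s = 0.5:", variance_leading_term(1000, 0.5))
```

Under this reading the constant is positive, as a variance constant must be.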

Convergence in distribution by contraction I.

Cost of the construction of the quadtree / path length
$$P_n = \sum_{i=1}^{n} D_i \quad\text{with } D_i \text{ the depth of the } i\text{-th inserted point}$$
◮ $I_n^r$ the number of points inside the $r$-th child cell
◮ $Q_r$ the volume of the $r$-th child cell
[Figure: the four child cells, numbered 1 to 4.]

We have
$$P_n \stackrel{d}{=} \sum_{r=1}^{4} P_{I_n^r} + n - 1 \quad\text{and write}\quad X_n = \frac{P_n - \alpha n \log n}{n},$$
$$(I_n^1, \dots, I_n^4) \stackrel{d}{=} \mathrm{Mult}(n-1;\ UV,\ U(1-V),\ (1-U)(1-V),\ (1-U)V).$$
Shifting and rescaling we obtain:
$$\underbrace{\frac{P_n - \alpha n \log n}{n}}_{X_n} = \sum_{r=1}^{4} \underbrace{\frac{I_n^r}{n}}_{A_n^r} \cdot \frac{P_{I_n^r} - \alpha I_n^r \log I_n^r}{I_n^r} + \underbrace{\frac{n-1}{n} - \alpha \frac{\log n}{n} + \alpha \sum_{r=1}^{4} \frac{I_n^r}{n} \log \frac{I_n^r}{n}}_{b_n}$$
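The quantities on this slide are easy to compute on a simulated tree. The following is a minimal sketch of a point quadtree, not the authors' code: it returns the path length $P_n$ (sum of insertion depths) and, for a query line $x = s$, the number of visited nodes, which is taken here as $C_n(s)$. The class and function names are illustrative.

```python
import random


class Node:
    """A quadtree node; children are indexed by 2 * (x >= node.x) + (y >= node.y)."""
    __slots__ = ("x", "y", "children")

    def __init__(self, x, y):
        self.x, self.y = x, y
        self.children = [None] * 4


def build(points):
    """Insert points in order; return (root, path length = sum of node depths)."""
    root, path_length = None, 0
    for x, y in points:
        if root is None:
            root = Node(x, y)  # the root has depth 0
            continue
        node, depth = root, 0
        while True:
            q = 2 * (x >= node.x) + (y >= node.y)
            depth += 1
            if node.children[q] is None:
                node.children[q] = Node(x, y)
                break
            node = node.children[q]
        path_length += depth
    return root, path_length


def partial_match_cost(root, s):
    """Number of nodes visited when searching for all points with x-coordinate s."""
    count, stack = 0, [root]
    while stack:
        node = stack.pop()
        if node is None:
            continue
        count += 1
        side = 2 * (s >= node.x)  # only the two cells on the side containing s
        stack.append(node.children[side])
        stack.append(node.children[side + 1])
    return count


random.seed(0)
n = 2000
pts = [(random.random(), random.random()) for _ in range(n)]
root, P_n = build(pts)
print("P_n / n =", P_n / n)
print("C_n(0.5) =", partial_match_cost(root, 0.5))
```

On the four points `(0.5, 0.5), (0.25, 0.25), (0.75, 0.75), (0.3, 0.1)` the depths are 0, 1, 1, 2, so the path length is 4, and the query line `s = 0.3` visits three nodes.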

Convergence in distribution by contraction II.

General problem: a recursive family of equations
$$X_n \stackrel{d}{=} \sum_{r=1}^{4} A_n^r \cdot X^r_{I_n^r} + b_n$$
with
◮ $(A_n^1, \dots, A_n^4, I_n^1, \dots, I_n^4, b_n)$ independent of $((X_n^1), \dots, (X_n^4))$
◮ $(X_n^r,\ n \ge 1)$ iid copies of $(X_n)$

The equation "converges" to a limit equation:
$$A_n^r = \frac{I_n^r}{n} \to \mathrm{Leb}(Q_r)$$
$$b_n = \frac{n-1}{n} - \alpha \frac{\log n}{n} + \alpha \sum_{r=1}^{4} \frac{I_n^r}{n} \log \frac{I_n^r}{n} \to 1 + \alpha \sum_{r=1}^{4} \mathrm{Leb}(Q_r) \log \mathrm{Leb}(Q_r)$$
$$X \stackrel{d}{=} \sum_{r=1}^{4} \mathrm{Leb}(Q_r) \cdot X^r + 1 + \alpha \sum_{r=1}^{4} \mathrm{Leb}(Q_r) \log \mathrm{Leb}(Q_r) \qquad (2)$$

Formalization: (2) is a transfer map on a space of probability measures on $\mathbb{R}$.
$$d_2(\phi, \varphi) = \inf\{\|X - Y\|_2 : \mathcal{L}(X) = \phi,\ \mathcal{L}(Y) = \varphi\}$$
◮ on $M_2 = \{\text{probability measures } \mu : \int x^2\, d\mu < \infty\}$: no contraction (can shift!)
◮ on $M_2^0 = \{\mu \in M_2 : \int x\, d\mu = 0\}$: contraction
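Two constants behind equation (2) can be estimated numerically. The sketch below makes two assumptions that are standard for the contraction method but not spelled out on the slide: the centering condition $1 + \alpha\, E[\sum_r \mathrm{Leb}(Q_r) \log \mathrm{Leb}(Q_r)] = 0$, which is needed for a mean-zero solution and determines $\alpha$, and the $d_2$ Lipschitz bound $\sqrt{E[\sum_r \mathrm{Leb}(Q_r)^2]}$ on centered measures. Expectations over $(U, V)$ are approximated on a midpoint grid; the function names are illustrative.

```python
from math import log, sqrt


def cell_volumes(u, v):
    """Volumes of the four child cells when the first point is (u, v)."""
    return (u * v, u * (1 - v), (1 - u) * (1 - v), (1 - u) * v)


def grid_expectation(f, steps=400):
    """Approximate E[f(U, V)] for independent uniforms with a 2-d midpoint rule."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * h
        for j in range(steps):
            v = (j + 0.5) * h
            total += f(u, v)
    return total * h * h


# E[sum_r Leb(Q_r) log Leb(Q_r)], which enters the limit of b_n
entropy_term = grid_expectation(
    lambda u, v: sum(q * log(q) for q in cell_volumes(u, v)))
alpha = -1.0 / entropy_term  # assumed centering: 1 + alpha * entropy_term = 0

# E[sum_r Leb(Q_r)^2]; its square root bounds the Lipschitz constant in d_2
second_moment = grid_expectation(
    lambda u, v: sum(q * q for q in cell_volumes(u, v)))

print("alpha ~", alpha)
print("contraction factor ~", sqrt(second_moment))
```

Under these assumptions the exact values are $\alpha = 1$ and $\sqrt{4/9} = 2/3 < 1$, which is why the map contracts once the mean is fixed at zero.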

Convergence for partial match processes
[Figure: the unit square split at the first point $(U, V)$, with the query line at $s$ and the child cells numbered 1 to 4.]

$$(I_n^{(1)}, \dots, I_n^{(4)}) \stackrel{d}{=} \mathrm{Mult}(n-1;\ UV,\ U(1-V),\ (1-U)(1-V),\ (1-U)V)$$
$$C_n(s) \stackrel{d}{=} 1 + \mathbf{1}_{\{s<U\}} \left[ C^{(1)}_{I_n^{(1)}}\!\left(\frac{s}{U}\right) + C^{(2)}_{I_n^{(2)}}\!\left(\frac{s}{U}\right) \right] + \mathbf{1}_{\{s \ge U\}} \left[ C^{(3)}_{I_n^{(3)}}\!\left(\frac{1-s}{1-U}\right) + C^{(4)}_{I_n^{(4)}}\!\left(\frac{1-s}{1-U}\right) \right]$$

Heuristic: If $n^{-\beta} C_n(\cdot)$ converges, we should have $n^{-\beta} C_n(\cdot) \to Z(\cdot)$ satisfying
$$Z(s) \stackrel{d}{=} \mathbf{1}_{\{s<U\}} \left[ (UV)^{\beta} Z^{(1)}\!\left(\frac{s}{U}\right) + (U(1-V))^{\beta} Z^{(2)}\!\left(\frac{s}{U}\right) \right] + \mathbf{1}_{\{s \ge U\}} \left[ ((1-U)V)^{\beta} Z^{(3)}\!\left(\frac{s-U}{1-U}\right) + ((1-U)(1-V))^{\beta} Z^{(4)}\!\left(\frac{s-U}{1-U}\right) \right]$$
