Complexity Theory of Polynomial-Time Problems, Lecture 7: 3SUM II


SLIDE 1

Complexity Theory of Polynomial-Time Problems

Sebastian Krinninger

Lecture 7: 3SUM II

SLIDE 2

Reminder: 3SUM

June 16, 2016

Given sets $A$, $B$, $C$ of $n$ integers: are there $a \in A$, $b \in B$, $c \in C$ such that $a + b + c = 0$?

Well-known: $O(n^2)$

Conjecture: no $O(n^{2-\varepsilon})$ algorithm → 3SUM-Hardness

Alternative algorithm: $O(|A| \cdot |B| + |C|)$ (store negated pairwise sums in hashmap)
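The alternative algorithm can be sketched as follows. This is a minimal illustration, not code from the lecture; the function name `three_sum_hash` is made up here, and Python's built-in `dict` stands in for the hashmap.

```python
def three_sum_hash(A, B, C):
    """Return a triple (a, b, c) with a + b + c == 0, or None if none exists."""
    # Store each negated pairwise sum -(a + b) together with one pair producing it.
    negated_sums = {}
    for a in A:
        for b in B:
            negated_sums.setdefault(-(a + b), (a, b))
    # A witness exists exactly when some c equals a stored negated sum.
    for c in C:
        if c in negated_sums:
            a, b = negated_sums[c]
            return (a, b, c)
    return None
```

The table holds up to $|A| \cdot |B|$ entries and is probed once per element of $C$, which matches the stated bound when dictionary operations are counted as expected constant time.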

SLIDE 3

Reminder: Hashing

Hash function $h \colon U \to [R]$

[Figure: an element $x$ is mapped to bucket $h(x)$ among the buckets $1, 2, \dots, R$.]

Goal: Distribute uniformly, avoid collisions, etc.

SLIDE 4

Magical hash functions

Desired properties for family of hash functions from $[U] \to [R]$ (i.e., for every $h$ chosen from family):

Linear: $h(x) + h(y) = h(x + y) \pmod{R}$ (for any $x, y \in [U]$)

Uniform difference: $\Pr[h(x) - h(y) = i] = 1/R$ (for any $x, y \in [U]$ s.t. $x \neq y$ and $i \in [R]$)

Balanced: $|\{x \in S : h(x) = i\}| \le 3n/R$ (for any set $S = \{x_1, \dots, x_n\} \subseteq [U]$ and any $i \in [R]$)

But: We do not know such a family…

SLIDE 5

SLIDE 6

Almost magical hash functions

Desired properties for family of hash functions from $[U] \to [R]$ (i.e., for every $h$ chosen from family):

Almost linear: $h(x) + h(y) \in h(x + y) + c_h + \{0, 1\} \pmod{R}$ (for any $x, y \in [U]$ and some integer $c_h$ depending only on $h$)

Uniform difference: $\Pr[h(x) - h(y) = i] = 1/R$ (for any $x, y \in [U]$ s.t. $x \neq y$ and $i \in [R]$)

Almost balanced: Expected number of elements from $S$ hashed to heavy values is $O(R)$, where value $i \in [R]$ is heavy if $|\{x \in S : h(x) = i\}| > 3n/R$ (for any set $S = \{x_1, \dots, x_n\} \subseteq [U]$ and any $i \in [R]$)

SLIDE 7

Definition of hash function

Rest of this lecture: $h$ picked randomly from this family:

$\mathcal{H}_{U,R,r} = \{ h_{a,b} \colon [U] \to [R] \mid a \in [r] \text{ odd integer and } b \in [r] \}$

$h_{a,b}(x) = ((ax + b) \bmod r) \operatorname{div} (r/R)$

Set $r = km$ for some $k \ge U/2$ and $U$, $R$, $r$ powers of 2.

Thm: Family $\mathcal{H}_{U,R,r}$ has the uniform difference property, is almost balanced, and is almost linear with $c_{h_{a,b}} = ((b - 1) \bmod r) \operatorname{div} (r/R)$.

(Pairwise independence [Dietzfelbinger ’96] implies uniform difference (easy to check) and almost balanced [Baran et al. ’08]. Almost linear: easy to check.)
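To make the definition concrete, here is a minimal sketch that evaluates $h_{a,b}$ and the constant $c_{h_{a,b}}$ from the formulas above. The function names are hypothetical, and two conventions are assumptions of this sketch rather than statements from the slides: $[r]$ is read as $\{0, \dots, r-1\}$, and hash values are taken to lie in $\{0, \dots, R-1\}$.

```python
import random

def draw_hash(r, rng=random):
    """Draw (a, b) for h_{a,b}: a odd in [0, r), b in [0, r). Assumes r >= 2."""
    a = rng.randrange(1, r, 2)  # odd multiplier
    b = rng.randrange(r)        # additive shift
    return a, b

def h(a, b, x, R, r):
    """h_{a,b}(x) = ((a*x + b) mod r) div (r/R); assumes R divides r."""
    return ((a * x + b) % r) // (r // R)

def c_h(b, R, r):
    """Constant from the theorem: ((b - 1) mod r) div (r/R)."""
    return ((b - 1) % r) // (r // R)
```

For example, with $r = 16$, $R = 4$, $a = 3$, $b = 5$, the value of `h(3, 5, 2, 4, 16)` is $(11 \bmod 16) \operatorname{div} 4 = 2$.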

SLIDE 8

Hashing down the universe

Lem: If 3SUM on a universe of size $O(n^3)$ is solvable in expected time $O(n^{2-\varepsilon})$, then 3SUM on an arbitrary universe is solvable in expected time $O(n^{2-\varepsilon})$.

Follows from [Baran et al. ’08]

Algorithm: Repeat until output:

  • Pick hash function $h \colon \{1, \dots, U\} \to \{1, \dots, 6n^3\}$ at random
  • $A' = \{h(a) \mid a \in A\}$, $B' = \{h(b) \mid b \in B\}$, $C' = \{h(c) + c_h \mid c \in C\}$
  • $A'' = \{h(a) \mid a \in A\}$, $B'' = \{h(b) \mid b \in B\}$, $C'' = \{h(c) + c_h + 1 \mid c \in C\}$
  • Solve two 3SUM instances $(A', B', C')$ and $(A'', B'', C'')$
  • If algorithm reports no 3SUM witness: output ‘no 3SUM’
  • Consider first reported 3SUM witness $(x', y', z')$ for $(A', B', C')$:
    If $h^{-1}(x')$, $h^{-1}(y')$, $h^{-1}(z' - c_h)$ contains witness $(x, y, z)$: output $(x, y, z)$
  • Consider first reported 3SUM witness $(x'', y'', z'')$ for $(A'', B'', C'')$:
    If $h^{-1}(x'')$, $h^{-1}(y'')$, $h^{-1}(z'' - c_h - 1)$ contains witness $(x, y, z)$: output $(x, y, z)$

No false negatives: If $x + y = z$, then $h(x) + h(y) \in h(z) + c_h + \{0, 1\}$

SLIDE 9

Running Time

We need to bound:

  • Number of iterations: $O(1)$
  • Number of candidate witnesses: $O(1)$

Then: number of calls to 3SUM algorithm: $O(1)$

Number of iterations: Triple $(x, y, z)$ gives false positive if $x + y \neq z$ and one of $h(x) + h(y) = h(z) + c_h$ or $h(x) + h(y) = h(z) + c_h + 1$

Linearity: $h(x) + h(y) = h(x + y) + c_h$ or $h(x) + h(y) = h(x + y) + c_h + 1$

Thus, probability that fixed $(x, y, z)$ (with $x + y \neq z$) gives false positive is:

$\Pr[h(x + y) - h(z) \in \{-1, 0, 1\}] \le \frac{3}{6n^3} = \frac{1}{2n^3}$ (uniform difference)

Overall probability of false positive: $\le n^3 \cdot \frac{1}{2n^3} = \frac{1}{2}$

In expectation: 2 iterations until no false positive (waiting time bound)

(If no false positive, then algorithm certainly stops)
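The waiting time bound can be spelled out as a short worked step. It assumes that each iteration draws a fresh hash function, so that iterations are independent, and writes $p$ for the probability that an iteration has no false positive and $T$ for the number of iterations until the first such iteration; both symbols are introduced here for illustration.

```latex
p \ge 1 - \tfrac{1}{2} = \tfrac{1}{2},
\qquad
\mathbb{E}[T] = \sum_{t \ge 1} t \,(1 - p)^{t-1}\, p = \frac{1}{p} \le 2 .
```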

SLIDE 10

Running Time

We need to bound:

  • Number of iterations: $O(1)$
  • Number of candidate witnesses: $O(1)$

Then: number of calls to 3SUM algorithm: $O(1)$

Number of candidate witnesses: Fix 3SUM witness $(x', y', z')$ of instance $(A', B', C')$

Let $x^* \in h^{-1}(x')$

For every $x \neq x^*$: $\Pr[h(x) = h(x^*)] = \frac{1}{6n^3}$ (uniform difference)

$\mathbb{E}[|h^{-1}(x')|] \le 1 + \frac{n}{6n^3} \le 2$

Similarly: $\mathbb{E}[|h^{-1}(y')|] \le 2$, $\mathbb{E}[|h^{-1}(z')|] \le 2$

$\mathbb{E}[|h^{-1}(x') \cup h^{-1}(y') \cup h^{-1}(z')|] = O(1)$ (linearity of expectation)

In expectation, algorithm manually checks constant number of candidate witnesses per iteration

SLIDE 11

Convolution 3SUM

Given array $A[1 \dots n]$ of integers: are there $i, j$ such that $A[i] + A[j] = A[i + j]$?

[Figure: array with entry $x$ at index $i$, entry $y$ at index $j$, and entry $x + y$ at index $i + j$.]

Trivial algorithm: $O(n^2)$

Thm: There is no $O(n^{2-\varepsilon})$ algorithm for Convolution 3SUM unless the 3SUM Conjecture fails. [Pătrașcu 2010]

Stepping stone towards hardness of other “structured” problems
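For reference, the trivial quadratic algorithm for Convolution 3SUM can be sketched like this. The function name is hypothetical; indices are 1-based as on the slide, stored in a 0-based Python list.

```python
def convolution_3sum(A):
    """Return 1-based (i, j) with A[i] + A[j] == A[i + j], or None if none exists."""
    n = len(A)
    for i in range(1, n + 1):
        # By symmetry it suffices to try j >= i; i + j must stay within 1..n.
        for j in range(i, n - i + 1):
            if A[i - 1] + A[j - 1] == A[i + j - 1]:
                return (i, j)
    return None
```

Because the target index $i + j$ is determined by the pair, only $O(n^2)$ pairs need checking and no search over a third element is required.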

SLIDE 12

Reduction from 3SUM

Given set $X \subseteq [U]$ of integers: are there $x, y, z \in X$ such that $x + y = z$?

For this proof: assume $h$ is almost balanced and linear (magically…)

Preprocessing: Check if there is a solution $2x = z$: $O(n \log n)$

Pick random hash function $h \colon [U] \to [R]$ (almost linear, etc.)

[Figure: buckets $1, 2, \dots, R$, each holding up to $3n/R$ elements.]

In expectation: $O(R)$ elements in buckets with load $> 3n/R$ (almost bal.)

For each such $x$: check for 3SUM triple involving $x$: $O(Rn)$ (in exp.)

SLIDE 13

Convolution 3SUM instance

[Figure: buckets $1, \dots, t, \dots, R$; within each bucket the elements numbered $i$, $j$, $k$ are selected.]

Number elements in each bucket from $0$ to $\frac{3n}{R} - 1$

Iterate over all triples $i, j, k \in [3n/R]$. For every bucket $t$:

  • Put $i$-th element to $A[8t + 1]$
  • Put $j$-th element to $A[8t + 3]$
  • Put $k$-th element to $A[8t + 4]$

Set all other array entries to $\infty$ (sufficiently large number)

$\left(\frac{3n}{R}\right)^3$ instances of Convolution 3SUM

SLIDE 14

Correctness

Assume $x + y = z$

If $x = y$, triple found in preprocessing

If $x$, $y$, or $z$ hashed to heavy bucket: triple found in second step

Then $h(x) + h(y) = h(z) \pmod{R}$ (linearity)

Either $h(x) + h(y) = h(z)$ or $h(x) + h(y) = h(z) + R$

Duplicate array for Convolution 3SUM instance

$A[8h(x) + 1] + A[8h(y) + 3] = A[8h(z) + 4]$ or $A[8h(x) + 1] + A[8h(y) + 3] = A[8(h(z) + R) + 4]$

Thus, no false negatives.

Also no false positives:

Observation: $A[i] + A[j] = A[i + j]$ only if $i = 8t_1 + 1$ and $j = 8t_2 + 3$

($x + y = z \bmod 8$ has unique solution over $\{1, 3, 4\}$ and $A[i] \neq A[j]$)
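The uniqueness claim about the offsets 1, 3, 4 modulo 8 can be checked by brute force. This is a small illustrative sketch; the function name and parameters are made up here.

```python
from itertools import product

def residue_solutions(offsets=(1, 3, 4), modulus=8):
    """List all (p, q, s) drawn from offsets with p + q congruent to s mod modulus."""
    return [
        (p, q, s)
        for p, q, s in product(offsets, repeat=3)
        if (p + q) % modulus == s
    ]
```

The only solutions are `(1, 3, 4)` and its mirror `(3, 1, 4)`, so any pair of finite entries that sums to a third finite entry must combine an $i$-th element slot with a $j$-th element slot and land on a $k$-th element slot.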

SLIDE 15

Running Time

Assumption: Convolution 3SUM in time $O(n^{2-\varepsilon})$

Total expected running time: $O\left(n \log n + nR + \left(\frac{n}{R}\right)^3 n^{2-\varepsilon}\right)$

Set $R = n^{1-\varepsilon/4}$

Total time: $O(n^{2-\varepsilon/4})$

Contradicts 3SUM Conjecture
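The choice of $R$ can be verified term by term. The following worked step only substitutes $R = n^{1-\varepsilon/4}$ into the running time stated above:

```latex
nR = n \cdot n^{1-\varepsilon/4} = n^{2-\varepsilon/4},
\qquad
\left(\frac{n}{R}\right)^{3} n^{2-\varepsilon}
  = n^{3\varepsilon/4} \cdot n^{2-\varepsilon}
  = n^{2-\varepsilon/4},
\qquad
n \log n = O\!\left(n^{2-\varepsilon/4}\right).
```

Both dominant terms balance at $n^{2-\varepsilon/4}$, which is why this value of $R$ is chosen.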

SLIDE 16

Set Disjointness Problem

  • 1. Preprocess families $\mathcal{A}, \mathcal{B}$ of subsets of a universe $U$
  • 2. Answer queries: Given $A \in \mathcal{A}$, $B \in \mathcal{B}$, is $A \cap B \neq \emptyset$?

Repeated queries, not known in advance: (static) data structure

Goal: Lower bound on preprocessing and query time

Offline Set Disjointness: $q$ queries known in advance (part of input)
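One extreme point of the preprocessing/query tradeoff is to precompute the answer for every pair of sets. The sketch below illustrates that approach; the class name `DisjointnessTable` and its interface are hypothetical and not taken from the lecture.

```python
class DisjointnessTable:
    """Precompute, for every pair (A_i, B_j), whether the two sets intersect."""

    def __init__(self, family_a, family_b):
        sets_b = [frozenset(s) for s in family_b]
        # table[i][j] is True exactly when family_a[i] and family_b[j] share an element.
        self.table = [
            [not frozenset(a).isdisjoint(b) for b in sets_b]
            for a in family_a
        ]

    def intersects(self, i, j):
        """Answer a query by table lookup."""
        return self.table[i][j]
```

Usage: `DisjointnessTable([{1, 2}, {3}], [{2, 5}, {4}]).intersects(0, 0)` looks up whether $\{1, 2\}$ and $\{2, 5\}$ intersect. All the work happens in the constructor, and each query is a single lookup.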

SLIDE 17

Reduction to 3SUM [Kopelowitz et al.]

Thm: Let $f(n)$ be such that 3SUM requires expected time $\Omega(n^2 / f(n))$. For any constant $0 \le \gamma < 1$, let ALG be an algorithm for offline Set Disjointness where $|\mathcal{A}| = |\mathcal{B}| = \Theta(n \log n)$, $|U| = \Theta(n^{2-2\gamma})$, each set in $\mathcal{A} \cup \mathcal{B}$ has at most $O(n^{1-\gamma})$ elements from $U$, and $q = \Theta(n^{1+\gamma} \log n)$. Then ALG requires expected time $\Omega(n^2 / f(n))$.

Cor: Assuming the 3SUM conjecture, for any $0 < \gamma < 1$, any data structure for Set Disjointness has

$t_p + N^{\frac{1+\gamma}{2-\gamma}} \, t_q = \Omega\left(N^{\frac{2}{2-\gamma} - o(1)}\right)$

where $N$ is the sum of the set sizes, $t_p$ is the preprocessing time, and $t_q$ is the time per query.

(From Thm: $N = \Theta(n^{2-\gamma} \log n)$)

Example: Data structures with constant query time

Make $\gamma$ tend to 1, need $t_p = \Omega(N^{2-o(1)})$

Evidence that trivial preprocessing algorithm is optimal (for constant query)

SLIDE 18

3SUM version

Given set $X \subseteq [U]$ of integers: are there $x, y, z \in X$ such that $x - y = z$?

In the following proof we use a balanced, linear hash function with uniform difference property (magically…).

This can be modified for almost balanced, almost linear hash function with uniform difference property.

SLIDE 19

Crucial insight

$a + b = c \iff a + b^{\uparrow} = c - b^{\downarrow}$

$b^{\uparrow}$: higher order bits of $b$

$b^{\downarrow}$: lower order bits of $b$

SLIDE 20

Algorithm Overview

Set $R = n^{\gamma}$, $Q = \left(\frac{5n}{R}\right)^2$

Pick random hash functions $h \colon [U] \to [R]$ and $g_k \colon [U] \to [Q]$ for $k = 1$ to $10 \log n$

Initialize buckets $B[1], \dots, B[R]$ s.t. $B[i] = \{x : h(x) = i\}$

For all $i \in [R]$, $j \in [\sqrt{Q}]$, initialize buckets $B_k^{\uparrow}[i, j]$ and $B_k^{\downarrow}[i, j]$ s.t.

$B_k^{\uparrow}[i, j] = \{ g_k(x) + j \cdot \sqrt{Q} \pmod{Q} \mid x \in B[i] \}$

$B_k^{\downarrow}[i, j] = \{ g_k(x) - j \pmod{Q} \mid x \in B[i] \}$

Initialize $k$ set intersection problems with the $B_k^{\uparrow}[i, j]$’s and $B_k^{\downarrow}[i, j]$’s

For every $z \in X$ and every $i = 1$ to $R$:

  • Check if $B_k^{\uparrow}[i, g_k^{\uparrow}(z)]$ and $B_k^{\downarrow}[i - h(z) \pmod{R}, g_k^{\downarrow}(z)]$ intersect
  • If intersection for all $k$: Search for $x \in B[i]$ and $y \in B[i - h(z) \pmod{R}]$ s.t. $x - y = z$ and output it if found

If nothing found: output ‘no 3SUM’

$g_k^{\uparrow}(z)$: higher order bits of $g_k(z)$

$g_k^{\downarrow}(z)$: lower order bits of $g_k(z)$

SLIDE 21

Correctness I

Algorithm verifies every triple before stopping

Need to show: if $x - y = z$, then algorithm finds it

Claim 1: If $x - y = z$, then $B[i] \cap (B[j] + z) \neq \emptyset$ where $i = h(x)$, $j = i - h(z) \bmod R$

Linear hash function: $h(x) - h(y) = h(x - y) = h(z) \pmod{R}$

Thus: $h(y) = h(x) - h(z) = i - h(z) = j \pmod{R}$

$y \in B[j] \Rightarrow x = y + z \in B[j] + z$

[Figure: $x$ lies in $B[i]$ and $y$ lies in $B[j]$; shifting $B[j]$ by $+z$ gives $B[j] + z$, which contains $y + z = x$.]

SLIDE 22

Correctness II

Claim 1: If $x - y = z$, then $B[i] \cap (B[j] + z) \neq \emptyset$ where $i = h(x)$, $j = i - h(z) \bmod R$

Claim 2: If $B[i] \cap (B[j] + z) \neq \emptyset$, then $B^{\uparrow}[i, g_k^{\uparrow}(z)] \cap B^{\downarrow}[j, g_k^{\downarrow}(z)] \neq \emptyset$ $\forall k$.

$B[i] \cap (B[j] + z) \neq \emptyset$

$\Downarrow$ $g_k(B[i]) \cap g_k(B[j] + z) \neq \emptyset$

$\Updownarrow$ $g_k(B[i]) \cap (g_k(B[j]) + g_k(z)) \neq \emptyset$

$\Updownarrow$ $g_k(B[i]) \cap (g_k(B[j]) + g_k^{\uparrow}(z) + g_k^{\downarrow}(z)) \neq \emptyset$

$\Updownarrow$ $(g_k(B[i]) - g_k^{\uparrow}(z)) \cap (g_k(B[j]) + g_k^{\downarrow}(z)) \neq \emptyset$

$\Updownarrow$ $B^{\uparrow}[i, g_k^{\uparrow}(z)] \cap B^{\downarrow}[j, g_k^{\downarrow}(z)] \neq \emptyset$

Conclusion: If $x - y = z$, then $B^{\uparrow}[i, g_k^{\uparrow}(z)] \cap B^{\downarrow}[j, g_k^{\downarrow}(z)] \neq \emptyset$ $\forall k$.

SLIDE 23

Running time

Set intersection instance:

  • Number of sets: $O(R \sqrt{Q} \, k) = O(n \log n)$
  • Number of elements in each set: $O(\sqrt{Q}) = O(n^{1-\gamma})$
  • Size of universe: $O(Q) = O(n^{2-2\gamma})$
  • Number of set intersection queries: $O(nRk) = O(n^{1+\gamma} \log n)$

Finding witnesses:

  • If $B_k^{\uparrow}[i, g_k^{\uparrow}(z)]$ and $B_k^{\downarrow}[j, g_k^{\downarrow}(z)]$ intersect, try to find $x \in B[i]$, $y \in B[j]$ s.t. $x - y = z$
  • Time $O\left(\frac{n}{R}\right)$ per witness check
  • But: pair $i, j$ could be false positive with no such $x \in B[i]$, $y \in B[j]$
  • Probability of false positive is small
  • In expectation: $O(1)$ false positives (next slide)
  • Total time: $O\left(\frac{n}{R} + \#\text{false positives} \cdot \frac{n}{R}\right) = O\left(\frac{n}{R}\right)$

SLIDE 24

Bounding the number of false positives

For a fixed $z$ and any pair $x, y \in U$ s.t. $x - y \neq z$:

$\Pr[g_k(x) = g_k(y) + g_k(z)] = \Pr[g_k(x - y) = g_k(z)] = \frac{1}{Q}$ (linear and uniform difference)

Remember: Every bucket has size $\le \frac{3n}{R}$ (balanced)

  • Prob. of false positive in buckets $B[i]$ and $B[j]$ for hash function $g_k$:

$\Pr[\exists\, x \in B[i],\, y \in B[j] : g_k(x) = g_k(y) + g_k(z)] \le \left(\frac{3n}{R}\right)^2 \cdot \frac{1}{Q} = \frac{9}{25}$

  • Prob. of false positive in buckets $B[i]$ and $B[j]$ for all hash functions $g_k$:

$\le \frac{1}{n^c}$

In expectation: total number of false positives is a constant.
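The amplification over all hash functions can be illustrated numerically. The sketch below assumes the $10 \log n$ hash functions fail independently and that the logarithm is base 2; neither is stated on the slide, and the function names are hypothetical.

```python
import math

def amplified_false_positive_bound(n, p=9 / 25, reps_per_log=10):
    """Bound p ** (reps_per_log * log2(n)) on all repetitions failing at once."""
    reps = reps_per_log * math.log2(n)
    return p ** reps

def polynomial_exponent(p=9 / 25, reps_per_log=10):
    """Exponent c such that p ** (reps_per_log * log2(n)) equals n ** (-c)."""
    return -reps_per_log * math.log2(p)
```

Under these assumptions the exponent is $c = -10 \log_2(9/25)$, which is larger than the exponent of the roughly $nR \le n^2$ (element, bucket) combinations being tested, so a union bound leaves a vanishing expected number of false positives.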