  1. alice and bob show distribution testing lower bounds
     They don’t talk to each other anymore.
     Clément Canonne (Columbia University), July 9, 2017
     Joint work with Eric Blais (UWaterloo) and Tom Gur (Weizmann Institute → UC Berkeley)

  2. “distribution testing?”

  3. why? Property testing of probability distributions: sublinear, approximate, randomized algorithms that take random samples.
     ∙ Big Dataset: too big
     ∙ Expensive access: pricey data
     ∙ “Model selection”: many options
     Need to infer information – one bit – from the data: fast, or with very few samples.


  4. how? (Property) Distribution Testing: in an (egg)shell.

  5. how?
     ∙ Known domain (here [n] = {1, …, n})
     ∙ Property P ⊆ ∆([n])
     ∙ Independent samples from unknown p ∈ ∆([n])
     ∙ Distance parameter ε ∈ (0, 1]
     Must decide: p ∈ P, or ℓ₁(p, P) > ε? (and be correct on any p with probability at least 2/3)
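To make the definition concrete, here is a minimal sketch (not from the talk) of the classic collision-based uniformity tester in the spirit of [GR00]: it estimates the collision probability ‖p‖₂², which equals 1/n exactly when p is uniform and is at least (1 + ε²)/n when p is ε-far from uniform in ℓ₁ (by Cauchy-Schwarz). The threshold and the sample-size constant below are illustrative assumptions, not tuned.

```python
import random
from collections import Counter
from math import ceil, sqrt

def collision_uniformity_tester(sample, n, eps):
    """Accept iff the sample looks uniform over {0, ..., n-1}.

    The fraction of colliding pairs is an unbiased estimate of
    ||p||_2^2, which is 1/n iff p is uniform and at least
    (1 + eps^2)/n when p is eps-far from uniform in l_1,
    so we threshold halfway between the two.
    """
    m = len(sample)
    pairs = m * (m - 1) / 2
    collisions = sum(c * (c - 1) // 2 for c in Counter(sample).values())
    return collisions / pairs <= (1 + eps**2 / 2) / n

# Usage sketch: m = O(sqrt(n)/eps^2) samples suffice (constant untuned).
n, eps = 10_000, 0.25
m = ceil(10 * sqrt(n) / eps**2)
print(collision_uniformity_tester([random.randrange(n) for _ in range(m)], n, eps))
```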

  6. and? Many results on many properties:
     ∙ Uniformity [GR00, BFR+00, Pan08]
     ∙ Identity* [BFF+01, VV14]
     ∙ Equivalence [BFR+00, Val11, CDVV14]
     ∙ Independence [BFF+01, LRR13]
     ∙ Monotonicity [BKR04]
     ∙ Poisson Binomial Distributions [AD14]
     ∙ Generic approaches for classes [CDGR15, ADK15]
     ∙ and more…

  7. but? Lower bounds… are quite tricky. We want more methods: generic if possible, applying to many problems at once.

  8. “communication complexity?”

  9. what now? f(x, y)

  10. what now? But communicating is hard.

  11. was that a toilet?
     ∙ f known by all parties
     ∙ Alice gets x, Bob gets y
     ∙ Private randomness
     Goal: minimize communication (worst case over x, y, and the randomness) to compute f(x, y).

  12. also… in our setting, Alice and Bob do not get to communicate: the SMP (Simultaneous Message Passing) model.
     ∙ f known by all parties
     ∙ Alice gets x, Bob gets y
     ∙ Both send one-way messages to a referee
     ∙ Private randomness
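To fix ideas, a minimal sketch of the SMP information flow, with a deliberately trivial protocol for EQ in which each player just forwards their whole n-bit input (the function names are illustrative; the point is only that Alice and Bob share nothing with each other, not even randomness):

```python
import random

def alice(x, rand_a):
    # Alice sees only her own input and her private coins.
    # The trivial protocol ignores the coins and sends all n bits.
    return x

def bob(y, rand_b):
    return y

def referee(msg_a, msg_b):
    # The referee sees only the two messages, never x or y directly.
    return msg_a == msg_b

# One run of the SMP protocol for EQ on n-bit inputs.
n = 16
x = y = tuple(random.randint(0, 1) for _ in range(n))
rand_a, rand_b = random.Random(), random.Random()  # private, independent
print(referee(alice(x, rand_a), bob(y, rand_b)))  # True
```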

  13. referee model (smp).

  14. referee model (smp). Upshot: SMP(EQ_n) = Ω(√n) (only O(log n) with one-way communication!)
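For the parenthetical, here is a hedged sketch of the standard one-way O(log n) protocol for EQ_n via polynomial fingerprinting over a prime field (the specific prime and the names are assumptions for illustration). It is one-way because Alice can send her random evaluation point along with the fingerprint; in SMP the players cannot correlate their messages this way, which is exactly where the Ω(√n) bound bites.

```python
import random

Q = 2**61 - 1  # a fixed prime; any prime much larger than n works

def fingerprint(bits, t):
    # Evaluate P(t) = sum_i bits[i] * t^i over F_Q (Horner's rule).
    acc = 0
    for b in reversed(bits):
        acc = (acc * t + b) % Q
    return acc

def alice_message(x):
    # One-way: Alice picks the evaluation point herself and sends it along.
    t = random.randrange(Q)
    return (t, fingerprint(x, t))  # two field elements: O(log n) bits

def bob_decide(y, message):
    t, fp = message
    # If x != y, P_x - P_y is a nonzero polynomial of degree < n,
    # hence has fewer than n roots: Pr[false "equal"] < n / Q.
    return fingerprint(y, t) == fp

n = 1000
x = [random.randint(0, 1) for _ in range(n)]
y = list(x)
print(bob_decide(y, alice_message(x)))  # True
y[0] ^= 1
print(bob_decide(y, alice_message(x)))  # False (with prob. > 1 - n/Q)
```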
