
Composable Core-sets for Diversity and Coverage Maximization



  1. Composable Core-sets for Diversity and Coverage Maximization
     Piotr Indyk (MIT), Sepideh Mahabadi (MIT), Mohammad Mahdian (Google), Vahab S. Mirrokni (Google)

  2-5. Core-Set Definition
     • Setup
        – Set of $n$ points $P$ in $d$-dimensional space
        – Optimize a function $f$
     • $c$-Core-set: a small subset of points $S \subset P$ that suffices to $c$-approximate the optimal solution
     • Maximization: $\frac{1}{c} f_{\mathrm{opt}}(P) \le f_{\mathrm{opt}}(S) \le f_{\mathrm{opt}}(P)$
     • Example
        – Optimization function: distance between the two farthest points
        – 1-Core-set: the points on the convex hull
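The convex-hull example can be checked numerically. The following is a minimal sketch, assuming points in the plane with integer coordinates; the helper names `convex_hull` and `diameter` are illustrative and do not come from the slides. It computes the hull with Andrew's monotone chain and compares the farthest-pair distance on the hull with the one on the full set.

```python
from itertools import combinations
from math import dist
import random


def convex_hull(points):
    """Andrew's monotone chain; returns the hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # z-component of (a - o) x (b - o); positive means a left turn
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def diameter(points):
    """Distance between the two farthest points (brute force over all pairs)."""
    return max(dist(p, q) for p, q in combinations(points, 2))


rng = random.Random(0)
P = [(rng.randint(0, 100), rng.randint(0, 100)) for _ in range(200)]  # made-up data
S = convex_hull(P)  # candidate 1-core-set
print(len(P), len(S), diameter(P) == diameter(S))
```

Both endpoints of a farthest pair are extreme points of the set, so restricting the search to the hull loses nothing, which is what a 1-core-set requires.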

  6-10. Composable Core-sets
     • Setup
        – $P_1, P_2, \dots, P_m$ are sets of points in $d$-dimensional space
        – Optimize a function $f$ over their union $P$
     • $c$-Composable Core-sets: subsets of points $S_1 \subset P_1, S_2 \subset P_2, \dots, S_m \subset P_m$ such that the solution on the union of the core-sets approximates the solution on the union of the point sets
     • Maximization: $\frac{1}{c} f_{\mathrm{opt}}(P_1 \cup \dots \cup P_m) \le f_{\mathrm{opt}}(S_1 \cup \dots \cup S_m) \le f_{\mathrm{opt}}(P_1 \cup \dots \cup P_m)$
     • Example: two farthest points
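Composability means each core-set is computed from its own part only, without seeing the others. As a deliberately simplified sketch, assume one-dimensional points, where the convex hull of a part is just its minimum and maximum; the names `coreset_1d` and `diameter_1d` are hypothetical. The union of the per-part core-sets then preserves the farthest-pair distance of the full union.

```python
import random


def coreset_1d(part):
    """Keep only the two extreme values of one part (its 1-D convex hull)."""
    return [min(part), max(part)]


def diameter_1d(values):
    """Farthest-pair distance on the real line."""
    return max(values) - min(values)


rng = random.Random(1)
# Five made-up parts P_1..P_5 of 30 points each
parts = [[rng.uniform(-50, 50) for _ in range(30)] for _ in range(5)]

union_P = [x for part in parts for x in part]
# Each core-set is built independently from its own part
union_S = [x for part in parts for x in coreset_1d(part)]

print(len(union_P), len(union_S), diameter_1d(union_P) == diameter_1d(union_S))
```

The global minimum and maximum each belong to some part and are therefore kept by that part's core-set, so here the approximation factor is 1.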

  11-13. Applications – Streaming Computation
     • Streaming computation:
        – Processing a sequence of $n$ data elements "on the fly"
        – Limited storage
     • $c$-Composable core-set of size $k$
        – Chunks of size $\sqrt{nk}$, thus number of chunks $= \sqrt{n/k}$
        – Core-set for each chunk
        – Total space: $k\sqrt{n/k} + \sqrt{nk} = O(\sqrt{nk})$
        – Approximation factor: $c$
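One way to see where a chunk size of $\sqrt{nk}$ comes from is the short derivation below. It assumes the chunk size is a free parameter $s$ (a symbol not used in the slides) and that memory holds the current chunk plus one size-$k$ core-set per chunk.

```latex
\begin{align*}
\mathrm{space}(s) &= \underbrace{k \cdot \frac{n}{s}}_{\text{stored core-sets}}
                   + \underbrace{s}_{\text{current chunk}}, \\
\frac{d}{ds}\,\mathrm{space}(s) &= -\frac{kn}{s^{2}} + 1 = 0
  \;\Longrightarrow\; s = \sqrt{nk}, \\
\mathrm{space}\bigl(\sqrt{nk}\bigr) &= k\sqrt{n/k} + \sqrt{nk}
  = 2\sqrt{nk} = O\bigl(\sqrt{nk}\bigr).
\end{align*}
```

For instance, with $n = 10^6$ and $k = 100$, this gives chunks of $10^4$ points, $100$ chunks, and about $2 \cdot 10^4$ stored points in total.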

  14-15. Applications – Distributed Systems
     • Streaming Computation
     • Distributed System:
        – Each machine holds a block of data.
        – A composable core-set is computed and sent to the server.
     • Map-Reduce Model:
        – One round of Map-Reduce
        – $\sqrt{n/k}$ mappers, each getting $\sqrt{nk}$ points
        – Each mapper computes a composable core-set of size $k$
        – The core-sets are passed to a single reducer
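The one-round scheme can be simulated on a single machine. This is a sketch under stated assumptions: the slides in this excerpt do not say how a mapper builds its core-set, so `greedy_coreset` (greedy farthest-point selection) is only a stand-in, and the reducer reuses the same routine as a stand-in solver. The names `greedy_coreset` and `one_round` are hypothetical.

```python
from math import dist, isqrt
import random


def greedy_coreset(points, k):
    """Stand-in core-set routine: greedily pick k points, each farthest from those chosen so far."""
    chosen = [points[0]]
    while len(chosen) < min(k, len(points)):
        nxt = max(points, key=lambda p: min(dist(p, c) for c in chosen))
        chosen.append(nxt)
    return chosen


def one_round(points, k):
    """Simulate one Map-Reduce round: blocks -> per-block core-sets -> single reducer."""
    n = len(points)
    block = max(k, isqrt(n * k))  # about sqrt(nk) points per mapper
    blocks = [points[i:i + block] for i in range(0, n, block)]
    coresets = [greedy_coreset(b, k) for b in blocks]  # map phase
    union = [p for cs in coresets for p in cs]         # everything sent to one reducer
    return blocks, union, greedy_coreset(union, k)     # reduce phase (stand-in solver)


rng = random.Random(2)
points = [(rng.random(), rng.random()) for _ in range(400)]  # made-up data
blocks, union, result = one_round(points, k=4)
print(len(blocks), len(union), len(result))
```

With $n = 400$ and $k = 4$, each of the 10 mappers receives 40 points and the reducer receives 40 points in total, so no machine handles more than about $\sqrt{nk}$ points.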

  16-19. Applications – Similarity Search
     • Streaming Computation
     • Distributed System
     • Similarity Search: small output size
        – Good to have a result from each cluster: relevant and diverse
     • Diverse Near Neighbor Problem [Abbar, Amer-Yahia, Indyk, Mahabadi, WWW'13] [Abbar, Amer-Yahia, Indyk, Mahabadi, Varadarajan, SoCG'13]
        – Uses Locality Sensitive Hashing (LSH) and composable core-set techniques

  20-23. Diversity Maximization Problem
     • A set of $n$ points $P$ in a metric space $(\Delta, \mathrm{dist})$
     • Optimization problem:
        – Find a subset of $k$ points $S$ which maximizes diversity
     • Diversity:
        – Minimum pairwise distance (Remote Edge)
        – Sum of pairwise distances (Remote Clique)
     • Long list of variants [Chandra and Halldorsson '01]
     • Figure: example with $k = 4$, $n = 6$
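For an instance as small as the one in the figure ($k = 4$, $n = 6$), the problem can be solved by exhaustive search. The sketch below assumes six made-up points in the plane, since the figure's actual coordinates are not available; `max_diversity` is a hypothetical helper, and enumerating all $k$-subsets is only viable for tiny inputs.

```python
from itertools import combinations
from math import dist


def remote_edge(S):
    """Minimum pairwise distance."""
    return min(dist(p, q) for p, q in combinations(S, 2))


def remote_clique(S):
    """Sum of pairwise distances (each unordered pair counted once)."""
    return sum(dist(p, q) for p, q in combinations(S, 2))


def max_diversity(P, k, diversity):
    """Try every k-subset of P and return one that maximizes the given diversity function."""
    return max(combinations(P, k), key=diversity)


# Hypothetical instance: four rectangle corners plus two interior points
P = [(0, 0), (4, 0), (4, 3), (0, 3), (2, 1), (2, 2)]
best_edge = max_diversity(P, 4, remote_edge)
best_clique = max_diversity(P, 4, remote_clique)
print(best_edge, remote_edge(best_edge))
print(best_clique, remote_clique(best_clique))
```

On this instance both objectives select the four rectangle corners, but in general different diversity functions can prefer different subsets.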

  24. Diversity Functions
     Diversity function over a set $S$ of $k$ points, with its description:
     • Remote-edge: minimum pairwise distance, $\min_{p, q \in S,\, p \ne q} \mathrm{dist}(p, q)$
     • Remote-clique: sum of pairwise distances, $\sum_{p, q \in S} \mathrm{dist}(p, q)$
     • Remote-tree: weight of the minimum spanning tree (MST) of the set $S$
     • Remote-cycle: weight of the minimum traveling salesman tour (TSP) of the set $S$
     • Remote-star: weight of the minimum star, $\min_{p \in S} \sum_{q \in S} \mathrm{dist}(p, q)$
     • Remote-pseudoforest: sum of the distance of each point to its nearest neighbor, $\sum_{p \in S} \min_{q \in S,\, q \ne p} \mathrm{dist}(p, q)$
     • Remote-matching: weight of the minimum perfect matching of the set $S$
     • Max-coverage: how well the points cover each coordinate, $\sum_{i=1}^{d} \max_{p \in S} p_i$
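Several of these functions are short to state in code. The following is a minimal sketch for four of them, assuming points are coordinate tuples under the Euclidean distance; the function names mirror the table but are otherwise illustrative. Remote-cycle and remote-matching are left out because they need a TSP or matching solver.

```python
from math import dist


def remote_tree(S):
    """Weight of a minimum spanning tree of S (Prim's algorithm, quadratic in |S|)."""
    S = list(S)
    # best[i] = cheapest edge connecting point i to the tree built so far
    best = {i: dist(S[0], S[i]) for i in range(1, len(S))}
    total = 0.0
    while best:
        i = min(best, key=best.get)
        total += best.pop(i)
        for j in best:
            best[j] = min(best[j], dist(S[i], S[j]))
    return total


def remote_star(S):
    """Weight of the cheapest star: best center's total distance to all other points."""
    return min(sum(dist(p, q) for q in S) for p in S)


def remote_pseudoforest(S):
    """Sum over points of the distance to the nearest other point."""
    S = list(S)
    return sum(
        min(dist(p, q) for j, q in enumerate(S) if j != i)
        for i, p in enumerate(S)
    )


def max_coverage(S):
    """Sum over coordinates of the largest value any point attains in that coordinate."""
    return sum(max(coords) for coords in zip(*S))


S = [(0, 0), (4, 0), (4, 3), (0, 3)]  # made-up example: corners of a 4-by-3 rectangle
print(remote_tree(S), remote_star(S), remote_pseudoforest(S), max_coverage(S))
```

For the rectangle above, the spanning tree uses the two short sides and one long side, every point is at distance 3 from its nearest neighbor, and the coordinate maxima are 4 and 3.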
