Lecture 3



  1. Sublinear Algorithms, Lecture 3
     Last time
     • Properties of lists and functions.
     • Testing if a list is sorted/Lipschitz and if a function is monotone.
     Today
     • Testing if a graph is connected.
     • Estimating the number of connected components.
     • Estimating the weight of an MST.
     9/10/2020. Sofya Raskhodnikova; Boston University

  2. Graph Properties

  3. Testing if a Graph is Connected [Goldreich Ron]
     Input: a graph $G = (V, E)$ on $n$ vertices
     • in adjacency lists representation (a list of neighbors for each vertex)
     • maximum degree $d$, i.e., adjacency lists of length $d$ with some empty entries
     Query $(v, i)$, where $v \in V$ and $i \in [d]$: entry $i$ of the adjacency list of vertex $v$
     Exact answer: $\Omega(dn)$ time
     • Approximate version: Is the graph connected or $\varepsilon$-far from connected?
       $\mathrm{dist}(G_1, G_2) = \dfrac{\#\text{ of entries in adjacency lists on which } G_1 \text{ and } G_2 \text{ differ}}{dn}$
     Time: $O\left(\frac{1}{\varepsilon^2 d}\right)$ today, plus an improvement on HW 3. No dependence on $n$!
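The distance measure on this slide can be sketched in a few lines. This is a minimal illustration, assuming each graph is given as a list of neighbor lists and that entries are compared position by position after padding every list to length d with empty slots. The function name `dist` and the positional comparison are assumptions made for the example, not something the slides specify.

```python
def dist(adj1, adj2, d):
    """Fraction of the d*n adjacency-list entries on which two graphs differ.

    Assumption: lists are compared positionally, with None as an empty entry.
    """
    n = len(adj1)
    assert len(adj2) == n, "both graphs must have the same vertex set"
    differing = 0
    for row1, row2 in zip(adj1, adj2):
        # Pad each adjacency list to exactly d entries.
        padded1 = list(row1) + [None] * (d - len(row1))
        padded2 = list(row2) + [None] * (d - len(row2))
        differing += sum(a != b for a, b in zip(padded1, padded2))
    return differing / (d * n)


# Path 0-1-2 versus the triangle obtained by adding edge {0, 2}, with d = 2.
path = [[1], [0, 2], [1]]
triangle = [[1, 2], [0, 2], [1, 0]]
print(dist(path, triangle, d=2))  # 2 changed entries out of d*n = 6
```

Adding one edge changes two entries (one in each endpoint's list), which is why the proofs later in the lecture count two modified entries per added or removed edge.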

  4. Testing Connectedness: Algorithm
     Connectedness Tester(n, d, ε, query access to G)
     1. Repeat $s = 8/(\varepsilon d)$ times:
     2.   pick a random vertex $u$
     3.   determine if the connected component of $u$ is small: perform BFS from $u$, stopping after at most $4/(\varepsilon d)$ new nodes
     4. Reject if a small connected component was found, otherwise accept.
     Run time: $O\left(\frac{d}{\varepsilon^2 d^2}\right) = O\left(\frac{1}{\varepsilon^2 d}\right)$
     Analysis:
     • Connected graphs are always accepted.
     • Remains to show: If a graph is $\varepsilon$-far from connected, it is rejected with probability $\ge 2/3$.
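The tester above can be sketched in Python as follows. This is an illustrative sketch, not the lecture's code: it assumes the graph is an in-memory list of neighbor lists rather than a query oracle, it rounds $8/(\varepsilon d)$ and $4/(\varepsilon d)$ up to integers, and it treats a component as a witness only if it is small and does not contain all n vertices. The names `bounded_component_size` and `connectedness_tester` are made up for the example.

```python
import math
import random
from collections import deque


def bounded_component_size(adj, start, limit):
    """BFS from start; return the component size, or None if it exceeds limit."""
    seen = {start}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in seen:
                seen.add(w)
                if len(seen) > limit:
                    return None  # component is not small; stop the search early
                queue.append(w)
    return len(seen)


def connectedness_tester(adj, d, eps, rng=random):
    """Return True (accept) or False (reject)."""
    n = len(adj)
    s = math.ceil(8 / (eps * d))      # number of sampled vertices
    limit = math.ceil(4 / (eps * d))  # size threshold for a "small" component
    for _ in range(s):
        u = rng.randrange(n)
        size = bounded_component_size(adj, u, limit)
        # A small component that is not the whole graph witnesses disconnectedness.
        if size is not None and size < n:
            return False
    return True


cycle = [[(i - 1) % 20, (i + 1) % 20] for i in range(20)]
print(connectedness_tester(cycle, d=2, eps=0.5))  # connected graphs are always accepted
```

Each BFS touches at most `limit` vertices and scans at most d entries per vertex, which matches the $O(d/(\varepsilon^2 d^2))$ run time stated on the slide.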

  5. Testing Connectedness: Analysis
     Claim 1: If G is $\varepsilon$-far from connected, it has $\ge \frac{\varepsilon dn}{2}$ connected components.
     Claim 2: If G is $\varepsilon$-far from connected, it has $\ge \frac{\varepsilon dn}{4}$ connected components of size at most $4/(\varepsilon d)$.
     • By Claim 2, at least $\frac{\varepsilon dn}{4}$ nodes are in small connected components.
     • By the Witness lemma, it suffices to sample $\frac{2 \cdot 4}{\varepsilon dn/n} = \frac{8}{\varepsilon d}$ nodes to detect one from a small connected component.

  6. Testing Connectedness: Proof of Claim 1
     Claim 1: If G is $\varepsilon$-far from connected, it has $\ge \frac{\varepsilon dn}{2}$ connected components.
     We prove the contrapositive: If G has $< \frac{\varepsilon dn}{2}$ connected components, one can make G connected by modifying $< \varepsilon$ fraction of its representation, i.e., $< \varepsilon dn$ entries.
     • If there are no degree restrictions, $k$ components can be connected by adding $k - 1$ edges, each affecting 2 nodes. Here, $k < \frac{\varepsilon dn}{2}$, so $2k - 2 < \varepsilon dn$.
     • What if adjacency lists of all vertices in a component are full, i.e., all vertex degrees are $d$?

  7. Freeing up an Adjacency List Entry
     Claim 1: If G is $\varepsilon$-far from connected, it has $\ge \frac{\varepsilon dn}{2}$ connected components.
     What if adjacency lists of all vertices in a component are full, i.e., all vertex degrees are $d$?
     • Consider an MST of this component. Let $v$ be a leaf of the MST.
     • Disconnect $v$ from a node other than its parent in the MST.
     • Two entries are changed while keeping the same number of components.
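The freeing step can be illustrated with a short sketch. It makes several assumptions beyond the slide: the graph is simple and stored as mutable neighbor lists, a BFS tree stands in for the spanning tree, and the leaf is taken to be the last vertex BFS discovers (which has no children in the BFS tree). The name `free_adjacency_entry` is hypothetical.

```python
from collections import deque


def free_adjacency_entry(adj, root):
    """Remove one non-tree edge at a leaf of a BFS tree of root's component.

    Returns the removed edge (v, w), or None if the leaf has only its tree edge.
    Modifies adj in place; exactly two adjacency-list entries are changed.
    """
    parent = {root: None}
    order = [root]
    queue = deque([root])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in parent:
                parent[y] = x
                order.append(y)
                queue.append(y)
    v = order[-1]  # last discovered vertex: a leaf of the BFS tree
    for w in adj[v]:
        if w != parent[v]:
            # (v, w) is not a tree edge, so the component stays connected.
            adj[v].remove(w)
            adj[w].remove(v)
            return (v, w)
    return None


k4 = [[1, 2, 3], [0, 2, 3], [0, 1, 3], [0, 1, 2]]  # all degrees equal d = 3
print(free_adjacency_entry(k4, root=0), k4)
```

After the call, the two endpoints of the removed edge each have one free slot, so the component can accept new edges without exceeding degree d.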

  8. Freeing up an Adjacency List Entry
     Claim 1: If G is $\varepsilon$-far from connected, it has $\ge \frac{\varepsilon dn}{2}$ connected components.
     What if adjacency lists of all vertices in a component are full, i.e., all vertex degrees are $d$?
     • Apply this to each component with fewer than 2 free spots in its adjacency lists.
     • Now we can connect all the components using the freed-up spots while ensuring that we never change more than 2 spots per component.
     • Thus, $k$ components can be connected by changing $2k$ spots. Here, $k < \frac{\varepsilon dn}{2}$, so $2k < \varepsilon dn$.

  9. Testing Connectedness: Proof of Claim 2
     Claim 1: If G is $\varepsilon$-far from connected, it has $\ge \frac{\varepsilon dn}{2}$ connected components.
     Claim 2: If G is $\varepsilon$-far from connected, it has $\ge \frac{\varepsilon dn}{4}$ connected components of size at most $4/(\varepsilon d)$.
     • By Claim 1, there are at least $\frac{\varepsilon dn}{2}$ connected components.
     • Their average size is at most $\frac{n}{\varepsilon dn/2} = \frac{2}{\varepsilon d}$.
     • By an averaging argument (or Markov inequality), at least half of the components are of size at most twice the average.
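The averaging step can also be spelled out as a direct count. The derivation below is one way to fill in the argument, using only the quantities on the slide: components are vertex-disjoint, so large ones cannot be too numerous.

```latex
% Components of size greater than 4/(\varepsilon d) are disjoint and fit into n vertices:
\#\{\text{components of size} > \tfrac{4}{\varepsilon d}\}
  \;<\; \frac{n}{4/(\varepsilon d)} \;=\; \frac{\varepsilon d n}{4}.
% By Claim 1 there are at least \varepsilon d n / 2 components in total, hence
\#\{\text{components of size} \le \tfrac{4}{\varepsilon d}\}
  \;>\; \frac{\varepsilon d n}{2} - \frac{\varepsilon d n}{4}
  \;=\; \frac{\varepsilon d n}{4}.
```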

  10. Testing if a Graph is Connected [Goldreich Ron]
      Input: a graph $G = (V, E)$ on $n$ vertices
      • in adjacency lists representation (a list of neighbors for each vertex)
      • maximum degree $d$
      Connected or $\varepsilon$-far from connected?
      $O\left(\frac{1}{\varepsilon^2 d}\right)$ time (no dependence on $n$)

  11. Randomized Approximation in Sublinear Time: A Simple Example

  12. Randomized Approximation: a Toy Example
      Input: a string $w \in \{0,1\}^n$, e.g., 0 0 0 1 … 0 1 0 0
      Goal: Estimate the fraction of 1's in $w$ (like in polls)
      It suffices to sample $s = 1/\varepsilon^2$ positions and output the average to get the fraction of 1's $\pm\varepsilon$ (i.e., additive error $\varepsilon$) with probability $\ge 2/3$.
      Hoeffding Bound: Let $Y_1, \dots, Y_s$ be independently distributed random variables in [0,1]. Let $Y = \frac{1}{s} \sum_{i=1}^{s} Y_i$ (called the sample mean). Then $\Pr\left[|Y - E[Y]| \ge \varepsilon\right] \le 2e^{-2s\varepsilon^2}$.
      Let $Y_i$ = value of sample $i$. Then $E[Y] = \frac{1}{s} \sum_{i=1}^{s} E[Y_i]$ = (fraction of 1's in $w$).
      Apply the Hoeffding Bound and substitute $s = 1/\varepsilon^2$:
      $\Pr\left[|(\text{sample mean}) - (\text{fraction of 1's in } w)| \ge \varepsilon\right] \le 2e^{-2s\varepsilon^2} = 2e^{-2} < 1/3$
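A minimal sketch of the toy estimator, assuming the string is a Python list of 0/1 values and that positions are sampled uniformly with replacement. The function name `estimate_fraction_of_ones` and the rounding of $1/\varepsilon^2$ up to an integer are choices made for this example.

```python
import math
import random


def estimate_fraction_of_ones(w, eps, rng=random):
    """Estimate the fraction of 1's in w from s = ceil(1/eps^2) random positions."""
    s = math.ceil(1 / eps**2)
    # Sample positions independently and uniformly (with replacement).
    total = sum(w[rng.randrange(len(w))] for _ in range(s))
    return total / s


w = [1] * 300 + [0] * 700
estimate = estimate_fraction_of_ones(w, eps=0.05, rng=random.Random(0))
print(estimate)  # within 0.05 of 0.3 with probability at least 2/3
```

The number of samples depends only on $\varepsilon$, not on the length of the string.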

  13. Approximating # of Connected Components [Chazelle Rubinfeld Trevisan]
      Input: a graph $G = (V, E)$ on $n$ vertices
      • in adjacency lists representation (a list of neighbors for each vertex)
      • maximum degree $d$
      Exact answer: $\Omega(dn)$ time
      Additive approximation: # of CC $\pm \varepsilon n$ with probability $\ge 2/3$
      Time:
      • Known: $O\left(\frac{d}{\varepsilon^2} \log \frac{1}{\varepsilon}\right)$, $\Omega\left(\frac{d}{\varepsilon^2}\right)$
      • Today: $O\left(\frac{d}{\varepsilon^3}\right)$. No dependence on $n$!
      Partially based on slides by Ronitt Rubinfeld: http://stellar.mit.edu/S/course/6/fa10/6.896/courseMaterial/topics/topic3/lectureNotes/lecst11/lecst11.pdf

  14. Approximating # of CCs: Main Idea
      • Let $C$ = number of components.
      • For every vertex $u$, define $n_u$ = number of nodes in $u$'s component.
        – For each component $A$: $\sum_{u \in A} \frac{1}{n_u} = 1$
        – $\sum_{u \in V} \frac{1}{n_u} = C$ (this breaks $C$ up into contributions of different nodes)
      • Estimate this sum by estimating $n_u$'s for a few random nodes.
        – If $u$'s component is small, its size can be computed by BFS.
        – If $u$'s component is big, then $1/n_u$ is small, so it does not contribute much to the sum.
        – Can stop BFS after a few steps.
      Similar to property tester for connectedness [Goldreich Ron]
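The identity $\sum_{u \in V} 1/n_u = C$ is easy to check on a small graph. The sketch below computes every $n_u$ exactly with a full BFS (so it runs in linear time, unlike the sublinear algorithm) and sums the reciprocals using exact fractions. The helper names are made up for the illustration.

```python
from collections import deque
from fractions import Fraction


def component_sizes(adj):
    """Return a list whose entry u is n_u, the size of u's component."""
    n = len(adj)
    size = [0] * n
    seen = [False] * n
    for start in range(n):
        if seen[start]:
            continue
        seen[start] = True
        component = [start]
        queue = deque([start])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if not seen[w]:
                    seen[w] = True
                    component.append(w)
                    queue.append(w)
        for v in component:
            size[v] = len(component)
    return size


def count_components_via_sum(adj):
    """Sum 1/n_u over all vertices; each component contributes exactly 1."""
    return sum(Fraction(1, n_u) for n_u in component_sizes(adj))


# Components: {0, 1}, {2, 3, 4}, {5}
graph = [[1], [0], [3, 4], [2], [2], []]
print(count_components_via_sum(graph))  # 1/2 + 1/2 + 1/3 + 1/3 + 1/3 + 1 = 3
```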

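  15. Approximating # of CCs: Algorithm
      Estimating $n_u$ = the number of nodes in $u$'s component:
      • Let estimate $\hat{n}_u = \min\left\{n_u, \frac{2}{\varepsilon}\right\}$.
        – When $u$'s component has $\le 2/\varepsilon$ nodes, $\hat{n}_u = n_u$.
        – Else $\hat{n}_u = 2/\varepsilon$, and so $0 < \frac{1}{\hat{n}_u} - \frac{1}{n_u} < \frac{1}{\hat{n}_u} = \frac{\varepsilon}{2}$.
        In both cases, $\left|\frac{1}{\hat{n}_u} - \frac{1}{n_u}\right| \le \frac{\varepsilon}{2}$.
      • Corresponding estimate for $C$ is $\hat{C} = \sum_{u \in V} \frac{1}{\hat{n}_u}$. It is a good estimate:
        $|\hat{C} - C| = \left|\sum_{u \in V} \frac{1}{\hat{n}_u} - \sum_{u \in V} \frac{1}{n_u}\right| \le \sum_{u \in V} \left|\frac{1}{\hat{n}_u} - \frac{1}{n_u}\right| \le \frac{\varepsilon n}{2}$
      APPROX_#_CCs(n, d, ε, query access to G)
      1. Repeat $s = \Theta(1/\varepsilon^2)$ times:
      2.   pick a random vertex $u$
      3.   compute $\hat{n}_u$ via BFS from $u$, stopping after at most $2/\varepsilon$ new nodes
      4. Return $\tilde{C}$ = (average of the values $1/\hat{n}_u$) $\cdot\, n$
      Run time: $O(d/\varepsilon^3)$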
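A sketch of APPROX_#_CCs in Python, under stated assumptions: the graph is an in-memory list of neighbor lists rather than a query oracle, $0 < \varepsilon \le 1$, and the hidden constant in $s = \Theta(1/\varepsilon^2)$ is taken to be 4, a hypothetical choice that satisfies the Hoeffding calculation in the analysis. The function names are invented for the example.

```python
import math
import random
from collections import deque


def truncated_component_size(adj, start, cap):
    """Return min(n_u, cap) using a BFS that stops once cap nodes are reached."""
    seen = {start}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in seen:
                seen.add(w)
                if len(seen) >= cap:
                    return cap  # component has at least cap nodes
                queue.append(w)
    return len(seen)


def approx_num_ccs(adj, eps, rng=random, c=4):
    """Estimate the number of connected components within +/- eps*n (w.p. >= 2/3)."""
    n = len(adj)
    s = math.ceil(c / eps**2)  # c = 4 is an assumed constant, not from the slides
    cap = 2 / eps              # assumes 0 < eps <= 1, so cap >= 2
    total = 0.0
    for _ in range(s):
        u = rng.randrange(n)
        total += 1 / truncated_component_size(adj, u, cap)
    return n * total / s


# Ten disjoint edges on 20 vertices: exactly 10 components.
matching = [[i ^ 1] for i in range(20)]
print(approx_num_ccs(matching, eps=0.5))
```

Each truncated BFS visits at most about $2/\varepsilon$ vertices and scans at most d entries per vertex, giving the $O(d/\varepsilon^3)$ total over $\Theta(1/\varepsilon^2)$ samples.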

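  16. Approximating # of CCs: Analysis
      Want to show: $\Pr\left[|\tilde{C} - \hat{C}| > \frac{\varepsilon n}{2}\right] \le \frac{1}{3}$
      Hoeffding Bound: Let $Y_1, \dots, Y_s$ be independently distributed random variables in [0,1]. Let $Y = \frac{1}{s} \sum_{i=1}^{s} Y_i$ (called the sample mean). Then $\Pr\left[|Y - E[Y]| \ge \varepsilon\right] \le 2e^{-2s\varepsilon^2}$.
      Let $Y_i = 1/\hat{n}_u$ for the $i$th vertex $u$ in the sample.
      • $Y = \frac{1}{s} \sum_{i=1}^{s} Y_i = \frac{\tilde{C}}{n}$
      • $E[Y] = \frac{1}{s} \sum_{i=1}^{s} E[Y_i] = E[Y_1] = \frac{1}{n} \sum_{u \in V} \frac{1}{\hat{n}_u} = \frac{\hat{C}}{n}$
      $\Pr\left[|\tilde{C} - \hat{C}| > \frac{\varepsilon n}{2}\right] = \Pr\left[|nY - nE[Y]| > \frac{\varepsilon n}{2}\right] = \Pr\left[|Y - E[Y]| > \frac{\varepsilon}{2}\right] \le 2e^{-\varepsilon^2 s/2}$
      • Need $s = \Theta\left(\frac{1}{\varepsilon^2}\right)$ samples to get probability $\le \frac{1}{3}$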
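To make the $\Theta(1/\varepsilon^2)$ concrete, one can solve $2e^{-\varepsilon^2 s/2} \le \delta$ for $s$, which gives $s \ge 2\ln(2/\delta)/\varepsilon^2$. The helper below is a small illustration of that arithmetic; the name `samples_needed` and the parameter `delta` (the allowed failure probability, 1/3 on the slide) are introduced here for the example.

```python
import math


def samples_needed(eps, delta=1 / 3):
    """Smallest integer s with 2*exp(-eps^2 * s / 2) <= delta."""
    return math.ceil(2 * math.log(2 / delta) / eps**2)


s = samples_needed(0.1)
print(s, 2 * math.exp(-(0.1**2) * s / 2))  # the bound evaluates to at most 1/3
```

For $\delta = 1/3$ this is roughly $3.6/\varepsilon^2$ samples.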

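  17. Approximating # of CCs: Analysis
      So far:
      • $|\hat{C} - C| \le \frac{\varepsilon n}{2}$
      • $\Pr\left[|\tilde{C} - \hat{C}| > \frac{\varepsilon n}{2}\right] \le \frac{1}{3}$
      With probability $\ge \frac{2}{3}$,
      $|\tilde{C} - C| \le |\tilde{C} - \hat{C}| + |\hat{C} - C| \le \frac{\varepsilon n}{2} + \frac{\varepsilon n}{2} \le \varepsilon n$
      Summary: The number of connected components in $n$-vertex graphs of degree at most $d$ can be estimated within $\pm\varepsilon n$ in time $O\left(\frac{d}{\varepsilon^3}\right)$.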

  18. Minimum spanning tree (MST)
      • What is the cheapest way to connect all the dots?
      Input: a weighted graph with $n$ vertices and $m$ edges
      [Figure: example weighted graph with edge weights 1, 2, 3, 4, 5, 7]
      • Exact computation:
        – Deterministic $O(m \cdot \text{inverse-Ackermann}(m))$ time [Chazelle]
        – Randomized $O(m)$ time [Karger Klein Tarjan]
      Partially based on slides by Ronitt Rubinfeld: http://stellar.mit.edu/S/course/6/fa10/6.896/courseMaterial/topics/topic3/lectureNotes/lecst11/lecst11.pdf
