
Differentially Private k-Means with Constant Multiplicative Error. Uri Stemmer, Ben-Gurion University. Joint work with Haim Kaplan.


  1. Differentially Private k-Means with Constant Multiplicative Error. Uri Stemmer, Ben-Gurion University. Joint work with Haim Kaplan.

  2. What is $k$-Means Clustering? Given: data points $S = (x_1, \dots, x_n) \in (\mathbb{R}^d)^n$ and a parameter $k$. Identify $k$ centers $C = (u_1, \dots, u_k)$ minimizing $\mathrm{cost}(C) = \sum_{i} \min_{\ell} \|x_i - u_\ell\|^2$.
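
The objective on this slide can be evaluated directly. The following is a minimal sketch, not code from the talk; the function name `kmeans_cost` and the sample points are assumptions chosen for illustration.

```python
def kmeans_cost(points, centers):
    """Sum, over all points, of the squared distance to the nearest center."""
    total = 0.0
    for x in points:
        # Squared Euclidean distance from x to each center; keep the smallest.
        total += min(
            sum((xi - ui) ** 2 for xi, ui in zip(x, u)) for u in centers
        )
    return total


# Hypothetical input: n = 3 points in the plane, k = 2 centers.
S = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0)]
C = [(0.0, 0.5), (1.0, 0.0)]
print(kmeans_cost(S, C))  # 0.25 + 0.25 + 0.0 = 0.5
```

With as many centers as points, placing one center on each point gives cost 0, which is the situation the later additive-error slides use.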

  3. What is $k$-Means Clustering? Given: data points $S = (x_1, \dots, x_n) \in (\mathbb{R}^d)^n$ and a parameter $k$. Identify $k$ centers $C = (u_1, \dots, u_k)$ minimizing $\mathrm{cost}(C) = \sum_{i} \min_{\ell} \|x_i - u_\ell\|^2$. ✓ Probably the most well-studied clustering problem ✓ Tons of applications ✓ Super popular

  4. What is $k$-Means Clustering? Given: data points $S = (x_1, \dots, x_n) \in (\mathbb{R}^d)^n$ and a parameter $k$. Identify $k$ centers $C = (u_1, \dots, u_k)$ minimizing $\mathrm{cost}(C) = \sum_{i} \min_{\ell} \|x_i - u_\ell\|^2$. What is Differentially Private $k$-Means? [Dwork, McSherry, Nissim, Smith 06] (informal) ✓ Every data point $x_i$ represents the (private) information of one individual ✓ Goal: the output (the set of centers) does not reveal information that is specific to any single individual

  5. What is $k$-Means Clustering? Given: data points $S = (x_1, \dots, x_n) \in (\mathbb{R}^d)^n$ and a parameter $k$. Identify $k$ centers $C = (u_1, \dots, u_k)$ minimizing $\mathrm{cost}(C) = \sum_{i} \min_{\ell} \|x_i - u_\ell\|^2$. What is Differentially Private $k$-Means? [Dwork, McSherry, Nissim, Smith 06] (informal) ✓ Every data point $x_i$ represents the (private) information of one individual ✓ Goal: the output (the set of centers) does not reveal information that is specific to any single individual ✓ Requirement: the output distribution is insensitive to an arbitrary change of a single input point (an algorithm satisfying this requirement is differentially private)
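
The slide states the requirement informally. For reference, the standard formal definition reads as follows; the symbols $\mathcal{A}$, $\varepsilon$, $\delta$, $S'$ and $T$ do not appear on the slide and are introduced here only for this statement.

```latex
% A randomized algorithm \mathcal{A} is (\varepsilon, \delta)-differentially private if,
% for every two inputs S, S' that differ in a single point and every set of outputs T,
\Pr[\mathcal{A}(S) \in T] \;\le\; e^{\varepsilon} \cdot \Pr[\mathcal{A}(S') \in T] + \delta .
```

Small $\varepsilon$ and $\delta$ mean the two output distributions are nearly indistinguishable, which is the sense in which the output is "insensitive" to one point.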

  6. What is $k$-Means Clustering? Given: data points $S = (x_1, \dots, x_n) \in (\mathbb{R}^d)^n$ and a parameter $k$. Identify $k$ centers $C = (u_1, \dots, u_k)$ minimizing $\mathrm{cost}(C) = \sum_{i} \min_{\ell} \|x_i - u_\ell\|^2$. What is Differentially Private $k$-Means? [Dwork, McSherry, Nissim, Smith 06] (informal) ✓ Every data point $x_i$ represents the (private) information of one individual ✓ Goal: the output (the set of centers) does not reveal information that is specific to any single individual ✓ Requirement: the output distribution is insensitive to an arbitrary change of a single input point (an algorithm satisfying this requirement is differentially private) Why is that a good privacy definition? Even if an observer knows every data point but mine and then sees the outcome of the computation, she still cannot learn "anything" about my data point.

  7. Differentially Private $k$-Means Clustering. Given: data points $S = (x_1, \dots, x_n) \in (\mathbb{R}^d)^n$ and a parameter $k$. Identify $k$ centers $C = (u_1, \dots, u_k)$ minimizing $\mathrm{cost}(C) = \sum_{i} \min_{\ell} \|x_i - u_\ell\|^2$. Requirement: the output distribution is insensitive to an arbitrary change of a single input point.
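
To make the requirement concrete, here is a minimal sketch for the simplest case, $k = 1$, using the textbook Laplace mechanism. This is not the algorithm from the talk; the function name `private_mean`, the clipping step, and the treatment of $n$ as public are all assumptions made for illustration.

```python
import math
import random


def private_mean(points, epsilon, rng):
    """Noisy mean of points clipped to the unit L2 ball (n is treated as public)."""
    n, d = len(points), len(points[0])
    # Clip each point into the unit ball so the sensitivity bound below holds.
    clipped = []
    for x in points:
        norm = math.sqrt(sum(v * v for v in x))
        factor = 1.0 if norm <= 1.0 else 1.0 / norm
        clipped.append([v * factor for v in x])
    # Replacing one point moves the coordinate-wise sum by at most
    # 2 in L2 norm, hence at most 2 * sqrt(d) in L1 norm.
    scale = 2.0 * math.sqrt(d) / epsilon
    noisy_mean = []
    for j in range(d):
        total = sum(x[j] for x in clipped)
        # A Laplace(scale) sample is the difference of two exponential samples.
        noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
        noisy_mean.append((total + noise) / n)
    return noisy_mean


rng = random.Random(0)
data = [(0.1, 0.2), (0.3, -0.4), (-0.5, 0.5)]
print(private_mean(data, epsilon=1.0, rng=rng))
```

The noise does not shrink when the data are perfectly clustered, so this sketch already shows an error term that is additive rather than proportional to the optimal cost.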

  8. Differentially Private $k$-Means Clustering. Given: data points $S = (x_1, \dots, x_n) \in (\mathbb{R}^d)^n$ and a parameter $k$. Identify $k$ centers $C = (u_1, \dots, u_k)$ minimizing $\mathrm{cost}(C) = \sum_{i} \min_{\ell} \|x_i - u_\ell\|^2$. Requirement: the output distribution is insensitive to an arbitrary change of a single input point. Observe: with privacy we must have additive error • Assume $k = n = 3$ • OPT's cost = 0

  9. Differentially Private $k$-Means Clustering. Given: data points $S = (x_1, \dots, x_n) \in (\mathbb{R}^d)^n$ and a parameter $k$. Identify $k$ centers $C = (u_1, \dots, u_k)$ minimizing $\mathrm{cost}(C) = \sum_{i} \min_{\ell} \|x_i - u_\ell\|^2$. Requirement: the output distribution is insensitive to an arbitrary change of a single input point. Observe: with privacy we must have additive error • Assume $k = n = 3$ • OPT's cost = 0 • Move one point (the figure marks a distance $\Lambda$) • OPT's cost = 0

  10. Differentially Private $k$-Means Clustering. Given: data points $S = (x_1, \dots, x_n) \in (\mathbb{R}^d)^n$ and a parameter $k$. Identify $k$ centers $C = (u_1, \dots, u_k)$ minimizing $\mathrm{cost}(C) = \sum_{i} \min_{\ell} \|x_i - u_\ell\|^2$. Requirement: the output distribution is insensitive to an arbitrary change of a single input point. Observe: with privacy we must have additive error • Assume $k = n = 3$ • OPT's cost = 0 • Move one point (the figure marks a distance $\Lambda$) • OPT's cost = 0 • Each solution must remain approx. equally likely • On at least one of these inputs our cost is $\approx \Lambda^2$

  11. Differentially Private $k$-Means Clustering. Given: data points $S = (x_1, \dots, x_n) \in (\mathbb{R}^d)^n$ and a parameter $k$. Identify $k$ centers $C = (u_1, \dots, u_k)$ minimizing $\mathrm{cost}(C) = \sum_{i} \min_{\ell} \|x_i - u_\ell\|^2$. Requirement: the output distribution is insensitive to an arbitrary change of a single input point. Observe: with privacy we must have additive error • Assume $k = n = 3$ • OPT's cost = 0 • Move one point (the figure marks a distance $\Lambda$) • OPT's cost = 0 • Each solution must remain approx. equally likely • On at least one of these inputs our cost is $\approx \Lambda^2$ $\Longrightarrow$ We assume that input points come from the unit ball.
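
The counting step behind this observation can be checked numerically. The sketch below uses a hypothetical configuration (four corners of a square with side `LAM`); it is an illustration of the argument, not a reproduction of the slide's figure. Two inputs share two points and differ in the third, both have optimal cost 0, yet no single set of three centers is good for both: the four distinct points are pairwise at least `LAM` apart, so one of them is at least `LAM / 2` from every center.

```python
import random


def kmeans_cost(points, centers):
    """Sum, over all points, of the squared distance to the nearest center."""
    return sum(
        min(sum((xi - ui) ** 2 for xi, ui in zip(x, u)) for u in centers)
        for x in points
    )


LAM = 0.5  # hypothetical distance by which the third point is moved
a, b = (0.0, 0.0), (LAM, LAM)
p, q = (0.0, LAM), (LAM, LAM * 0.0)  # p and q are LAM * sqrt(2) apart; see note below
q = (LAM, 0.0)
S1 = [a, b, p]  # original input
S2 = [a, b, q]  # neighboring input: p replaced by q

# k = n = 3: a center on every point gives cost 0 on either input.
print(kmeans_cost(S1, S1), kmeans_cost(S2, S2))

# For any fixed set of 3 centers, the worse of the two costs is at least (LAM / 2)^2.
rng = random.Random(0)
worst = min(
    max(kmeans_cost(S1, C), kmeans_cost(S2, C))
    for C in (
        [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(3)]
        for _ in range(5000)
    )
)
print(worst >= LAM ** 2 / 4)
```

In this configuration the moved point travels a distance of `LAM * sqrt(2)`, and all four distinct points are pairwise at least `LAM` apart, which is all the pigeonhole step needs. Because a private algorithm must output roughly the same distribution over center sets on both inputs, its expected cost on at least one of them is on the order of `LAM ** 2`, while the optimum is 0. No purely multiplicative guarantee can hold, and bounding the domain (the unit ball) is what keeps the additive term finite.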

  12. Previous and New Bounds

      | Ref | Model | Runtime | Bounds |
      |---|---|---|---|
      | GLMRT'10 | differential privacy | $n^d$ | $O(1) \cdot \mathrm{OPT} + \tilde{O}(k^2 \cdot d)$ |
      | NCBN'16 | differential privacy | poly | $O(\log k) \cdot \mathrm{OPT} + \tilde{O}(n)$ |
      | FXZR'17 | differential privacy | poly | $O(k \log n) \cdot \mathrm{OPT} + \tilde{O}(k^{3/2} \cdot \sqrt{d})$ |
      | BDLMZ'17 | differential privacy | poly | $O(\log^3 n) \cdot \mathrm{OPT} + \tilde{O}(k^2 + d)$ |
      | NS'18 | differential privacy | poly | $O(k) \cdot \mathrm{OPT} + \tilde{O}(k^{1.51} \cdot d^{0.51})$ |
      | New | differential privacy | poly | $O(1) \cdot \mathrm{OPT} + \tilde{O}(k^{1.01} \cdot d^{0.51} + k^{3/2})$ |
