Privately Detecting Changes in Unknown Distributions


  1. Privately Detecting Changes in Unknown Distributions. Wanrong Zhang, Georgia Tech. Joint work with Rachel Cummings, Sara Krehbiel, and Yuliia Lut.

  2. Motivation I: Smart-home IoT devices

  3. Motivation II: Disease outbreaks

  4. Change-point problem: identify distributional changes in a stream of highly sensitive data.
     Model: data points $x_1, \dots, x_{k^*} \sim P_0$ (pre-change) and $x_{k^*}, \dots, x_n \sim P_1$ (post-change).
     Question: estimate the unknown change time $k^*$.
     Need formal privacy guarantees for change-point detection algorithms.
     Previous work: parametric model [CKM+18] ($P_0$ and $P_1$ known).
     Our work: nonparametric model ($P_0$ and $P_1$ unknown).

  5. Differential privacy [DMNS '06]: bound the maximum amount that one person's data can change the distribution of an algorithm's output.
     An algorithm $M: T^n \to R$ is $\epsilon$-differentially private if, for all neighboring $x, x' \in T^n$ and all $S \subseteq R$,
     $\Pr[M(x) \in S] \le e^{\epsilon} \Pr[M(x') \in S]$.
     • $S$ as set of "bad outcomes"
     • Worst-case guarantee

  6. Privately Detecting Changes in Unknown Distributions
     1. Offline setting: dataset known in advance
     2. Online setting: data points arrive one at a time
     3. Drift change detection (in paper)
     4. Empirical results (in paper)

  8. Mann-Whitney test [MW '47]
     Datasets: $x_1, x_2, \dots, x_k \sim P_0$ and $x_{k+1}, x_{k+2}, \dots, x_n \sim P_1$.
     Hypotheses: $H_0: P_0 = P_1$ versus $H_1: P_0 \ne P_1$.
     Test statistic: $V(k) = \frac{1}{k(n-k)} \sum_{i=1}^{k} \sum_{j=k+1}^{n} I(x_i > x_j)$, where the double sum is the number of pairs $(x_i, x_j)$ such that $x_i > x_j$.
     Under $H_1$, require $a := \Pr_{x \sim P_0,\, y \sim P_1}[x > y] \ne \frac{1}{2}$.
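
The test statistic on this slide can be computed directly from its definition. The following is a minimal sketch, assuming the data is a plain Python list and `k` is the number of points placed before the split; the function name `mann_whitney_stat` is hypothetical and the quadratic-time loop is chosen for clarity, not speed.

```python
def mann_whitney_stat(x, k):
    """Return V(k): the fraction of pairs (x_i, x_j), i <= k < j, with x_i > x_j."""
    n = len(x)
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < n")
    left, right = x[:k], x[k:]
    # Count pairs where a point before the split exceeds a point after it.
    wins = sum(1 for xi in left for xj in right if xi > xj)
    # Normalize by the total number of (left, right) pairs, k * (n - k).
    return wins / (k * (n - k))


# Toy data (made up for illustration): every early point exceeds every late point.
data = [5, 6, 7, 1, 2, 3]
print(mann_whitney_stat(data, 3))
```

When both halves come from the same continuous distribution, V(k) should be close to 1/2, which is why the slide requires a ≠ 1/2 under the alternative.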

  9. Non-private nonparametric change-point detection [Darkhovsky '76]
     1. For every $k \in \{\gamma n, \dots, (1-\gamma) n\}$:
     2. Compute $V(k)$
     3. Output $\hat{k} = \arg\max_k V(k)$
     Can we compute $V(k)$ or $\arg\max_k V(k)$ privately?
     [Figure: $V(k)$ plotted against $k$, with $1$, $\gamma n$, $k^*$, $(1-\gamma) n$, and $n$ marked on the horizontal axis.]
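
The three steps above can be sketched as a search over candidate split points. This is an illustration under stated assumptions, not the authors' code: the names `mann_whitney_stat` and `detect_change` are hypothetical, the endpoints of the candidate range are rounded with ceiling and floor (the slide does not show the rounding), and maximizing V(k) itself presumes a > 1/2, that is, pre-change values tend to exceed post-change values.

```python
import math


def mann_whitney_stat(x, k):
    """V(k): fraction of pairs (x_i, x_j), i <= k < j, with x_i > x_j."""
    n = len(x)
    wins = sum(1 for xi in x[:k] for xj in x[k:] if xi > xj)
    return wins / (k * (n - k))


def detect_change(x, gamma):
    """Non-private estimate: the k in [gamma*n, (1-gamma)*n] maximizing V(k)."""
    n = len(x)
    # Rounding of the endpoints is an assumption; clamp so both halves are nonempty.
    lo = max(1, math.ceil(gamma * n))
    hi = min(n - 1, math.floor((1 - gamma) * n))
    return max(range(lo, hi + 1), key=lambda k: mann_whitney_stat(x, k))


# Toy data (made up): five large values followed by five small values.
data = [9, 8, 9, 7, 8, 1, 2, 1, 3, 2]
print(detect_change(data, gamma=0.2))
```

The constraint parameter γ keeps the search away from the ends of the dataset, where one side of the split has too few points for V(k) to be stable.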

  10. Adding differential privacy
      Differentially private algorithms add noise that scales with the sensitivity of a query.
      Query sensitivity: the sensitivity of a real-valued query $f$ is $\Delta f = \max_{X, X' \text{ neighbors}} |f(X) - f(X')|$.
      Laplace Mechanism: the mechanism $M(f, X, \epsilon) = f(X) + \mathrm{Lap}(\Delta f / \epsilon)$ is $\epsilon$-differentially private.
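
A small sketch of the Laplace mechanism may make the definition concrete. It uses only the Python standard library and samples Laplace noise as the difference of two independent exponential draws; the names `laplace_noise` and `laplace_mechanism` are hypothetical, and the caller is assumed to supply a correct sensitivity for the query.

```python
import random


def laplace_noise(scale, rng):
    """Sample Lap(scale) as the difference of two Exp(rate=1/scale) draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def laplace_mechanism(query, data, sensitivity, epsilon, rng=None):
    """Return query(data) + Lap(sensitivity / epsilon)."""
    if rng is None:
        rng = random.Random()
    return query(data) + laplace_noise(sensitivity / epsilon, rng)


# Example query: number of ones in a 0/1 dataset. Replacing one entry
# changes the sum by at most 1, so the sensitivity is 1.
bits = [1, 0, 1, 1, 0, 1]
print(laplace_mechanism(sum, bits, sensitivity=1.0, epsilon=0.5))
```

Smaller ε means a larger noise scale and therefore stronger privacy at the cost of accuracy.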

  11. Offline PNCPD = Mann-Whitney + ReportNoisyMax
      Private Nonparametric Change-Point Detector: PNCPD$(X, \epsilon, \gamma)$
      1. Input: database $X$, privacy parameter $\epsilon$, constraint parameter $\gamma$
      2. For $k \in \{\gamma n, \dots, (1-\gamma) n\}$:
      3. Compute statistic $V(k)$
      4. Sample $Z_k \sim \mathrm{Lap}\left(\frac{2}{\epsilon \gamma n}\right)$
      5. Output $\tilde{k} = \arg\max_k \{V(k) + Z_k\}$
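
The offline algorithm can be sketched by adding Laplace noise to each V(k) and reporting the noisy argmax. The noise scale 2/(εγn) used here is an assumption: it follows from the observation that replacing one data point changes V(k) by at most 1/min(k, n−k) ≤ 1/(γn), combined with the usual ReportNoisyMax scale of twice the sensitivity over ε. It should be checked against the paper before any real use. All function names are hypothetical.

```python
import math
import random


def laplace_noise(scale, rng):
    """Sample Lap(scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def mann_whitney_stat(x, k):
    """V(k): fraction of pairs (x_i, x_j), i <= k < j, with x_i > x_j."""
    n = len(x)
    wins = sum(1 for xi in x[:k] for xj in x[k:] if xi > xj)
    return wins / (k * (n - k))


def offline_pncpd(x, epsilon, gamma, rng=None):
    """Sketch of OfflinePNCPD: noisy argmax of V(k) over the allowed range."""
    if rng is None:
        rng = random.Random()
    n = len(x)
    lo = max(1, math.ceil(gamma * n))
    hi = min(n - 1, math.floor((1 - gamma) * n))
    scale = 2.0 / (epsilon * gamma * n)  # assumed noise scale, see lead-in
    best_k, best_score = None, -math.inf
    for k in range(lo, hi + 1):
        # Fresh, independent noise for every candidate k (ReportNoisyMax).
        score = mann_whitney_stat(x, k) + laplace_noise(scale, rng)
        if score > best_score:
            best_k, best_score = k, score
    return best_k


# Toy data (made up): the change happens after the fifth point.
data = [9, 8, 9, 7, 8, 1, 2, 1, 3, 2]
print(offline_pncpd(data, epsilon=1.0, gamma=0.2, rng=random.Random(0)))
```

Only the index of the noisy maximum is released, never the noisy scores themselves. With ε = 1 and only ten points the noise dominates, so the printed estimate can be far from 5; accuracy improves as n or ε grows.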

  12. Main results: OfflinePNCPD
      Theorem: OfflinePNCPD$(X, \epsilon, \gamma)$ is $\epsilon$-differentially private and, with probability $1 - \beta$, it outputs a private change-point estimator $\tilde{k}$ with error at most
      $|\tilde{k} - k^*| < O\left(\left(\frac{1}{\epsilon \gamma^4 (a - 1/2)^2} \log \frac{1}{\beta}\right)^{1.01}\right)$.
      • Previous non-private analysis [Darkhovsky '76]: $|\hat{k} - k^*| < O(n^{2/3})$
      • Our improved non-private analysis: $|\hat{k} - k^*| < O\left(\frac{1}{\gamma^4 (a - 1/2)^2} \log \frac{1}{\beta}\right) = O(1)$

  14. Online setting
      More challenging: must detect change quickly without much post-change data.
      High-level approach:
      1. Privately detect online when $V(k) > T$ in the center of a sliding window of the last $n$ data points. (Have DP algorithm, AboveNoisyThreshold, for this.)
      2. Run OfflinePNCPD on the identified window.

  15. Online setting
      More challenging: must detect change quickly without much post-change data.
      Our approach:
      1. Run AboveNoisyThreshold on Mann-Whitney queries in the center of a sliding window of the last $n$ data points.
      2. Run OfflinePNCPD on the identified window.
      [Figure annotation: $V(k) + Z_k < T$]

  17. Online setting
      More challenging: must detect change quickly without much post-change data.
      Our approach:
      1. Run AboveNoisyThreshold on Mann-Whitney queries in the center of a sliding window of the last $n$ data points.
      2. Run OfflinePNCPD on the identified window.
      [Figure annotation: $V(k) + Z_k \ge T$]

  18. OnlinePNCPD
      1. Input: database $X = \{x_1, \dots\}$, privacy parameter $\epsilon$, threshold $T$
      2. Let $\hat{T} = T + \mathrm{Lap}\left(\frac{8}{\epsilon n}\right)$
      3. For each new data point $x_k$:
      4. Compute Mann-Whitney statistic $V(k)$ in center of last $n$ data points
      5. Sample $Z_k \sim \mathrm{Lap}\left(\frac{16}{\epsilon n}\right)$
      6. If $V(k) + Z_k > \hat{T}$, then
      7. Run OfflinePNCPD on last $n$ data points with $\epsilon/2$
      8. Else, output $\bot$
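
The online procedure can be sketched as AboveNoisyThreshold over a sliding window, followed by one offline call on the window that triggered. This is a sketch under assumptions rather than the authors' implementation: the noise scales 8/(εn), 16/(εn), and 2/(εγn) are reconstructions, the window size `n` and constraint `gamma` are passed explicitly, and every function name is hypothetical. The function returns the estimated stream position of the last pre-change point, or `None` (standing in for ⊥) if no window ever crosses the noisy threshold.

```python
import math
import random
from collections import deque


def laplace_noise(scale, rng):
    """Sample Lap(scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def mann_whitney_stat(x, k):
    """V(k): fraction of pairs (x_i, x_j), i <= k < j, with x_i > x_j."""
    n = len(x)
    wins = sum(1 for xi in x[:k] for xj in x[k:] if xi > xj)
    return wins / (k * (n - k))


def offline_pncpd(x, epsilon, gamma, rng):
    """Noisy argmax of V(k); the noise scale 2/(eps*gamma*n) is an assumption."""
    n = len(x)
    lo = max(1, math.ceil(gamma * n))
    hi = min(n - 1, math.floor((1 - gamma) * n))
    scale = 2.0 / (epsilon * gamma * n)
    return max(range(lo, hi + 1),
               key=lambda k: mann_whitney_stat(x, k) + laplace_noise(scale, rng))


def online_pncpd(stream, n, threshold, epsilon, gamma, rng=None):
    """Sketch of OnlinePNCPD over an iterable stream with window size n."""
    if rng is None:
        rng = random.Random()
    # The threshold is perturbed once, before any data is examined.
    noisy_threshold = threshold + laplace_noise(8.0 / (epsilon * n), rng)
    window = deque(maxlen=n)
    for t, point in enumerate(stream, start=1):
        window.append(point)
        if len(window) < n:
            continue  # not enough data yet; implicitly output "no change"
        w = list(window)
        # Query the statistic at the center of the current window.
        noisy_stat = mann_whitney_stat(w, n // 2) + laplace_noise(16.0 / (epsilon * n), rng)
        if noisy_stat > noisy_threshold:
            # Spend the remaining epsilon/2 on the offline detector.
            k_local = offline_pncpd(w, epsilon / 2.0, gamma, rng)
            return t - n + k_local  # map window index back to stream position
    return None


# Toy stream (made up): ten large values, then ten small values.
stream = [9, 8, 9, 7, 8, 9, 8, 9, 7, 8] + [1, 2, 1, 3, 2, 1, 2, 1, 3, 2]
print(online_pncpd(stream, n=10, threshold=0.9, epsilon=1.0, gamma=0.2,
                   rng=random.Random(0)))
```

Splitting the budget as ε/2 for the threshold test and ε/2 for the offline call is what the slide's step 7 indicates; the overall ε guarantee then follows from composition.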

  19. Main result: OnlinePNCPD
      Theorem: OnlinePNCPD$(X, T, \epsilon, \gamma)$ is $\epsilon$-differentially private. For an appropriate threshold $T$, with probability $1 - \beta$, it outputs a private change-point estimator $\tilde{k}$ with error at most
      $|\tilde{k} - k^*| < O\left(\frac{1}{\epsilon} \log \frac{n}{\beta}\right)$,
      where $n$ is the window size.
      Choice of $T$:
      • Can't raise alarm too early (False positive: $T > T_L$)
      • Can't fail to raise alarm at true change (False negative: $T < T_U$)

  21. References
      • Cummings, R., Krehbiel, S., Mei, Y., Tuo, R., & Zhang, W. Differentially private change-point detection. In Advances in Neural Information Processing Systems (NeurIPS '18), pp. 10848-10857, 2018.
      • Dwork, C., McSherry, F., Nissim, K., & Smith, A. Calibrating noise to sensitivity in private data analysis. In Theory of Cryptography Conference, pp. 265-284, 2006.
      • Dwork, C., & Roth, A. The algorithmic foundations of differential privacy. Foundations and Trends in Theoretical Computer Science, 9(3-4):211-407, 2014.
      • Darkhovsky, B. A nonparametric method for the a posteriori detection of the "disorder" time of a sequence of independent random variables. Theory of Probability & Its Applications, 21(1):178-183, 1976.
      • Mann, H. B., & Whitney, D. R. On a test of whether one of two random variables is stochastically larger than the other. The Annals of Mathematical Statistics, pp. 50-60, 1947.
