Amplification by Shuffling: From Local to Central Differential Privacy via Anonymity

slide-1
SLIDE 1

Amplification by Shuffling:

From Local to Central Differential Privacy via Anonymity

Vitaly Feldman

Ulfar Erlingsson Ilya Mironov Ananth Raghunathan Kunal Talwar Abhradeep Thakurta

slide-2
SLIDE 2

Local Differential Privacy (LDP)

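For all $i$, $A_i$ is a local $\varepsilon$-DP randomizer: for all $L, L' \in X$, the output distributions of $A_i(x_i = L)$ and $A_i(x_i = L')$ are within a factor of $e^{\varepsilon}$ of each other [Warner ’65; EGS ’03; KLNRS ’08]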

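[Diagram: users holding $x_1, x_2, x_3, \ldots, x_n$ apply their randomizers $A_1, A_2, A_3, \ldots, A_n$ and send the outputs to the server.]

Server: compute (approximately) $f(x_1, x_2, \ldots, x_n)$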

slide-3
SLIDE 3

Outline

Online monitoring with LDP

Benefits of anonymity: privacy amplification by shuffling

slide-4
SLIDE 4

Online monitoring

[Diagram: a timeline of days $1, 2, 3, \ldots, d$. On day $j$ the users hold statuses $x_{1,j}, x_{2,j}, x_{3,j}, \ldots, x_{n,j}$, which give the daily counts $S_1, S_2, S_3, \ldots, S_d$.]

Estimate the daily counts $S_j = \sum_{i=1}^{n} x_{i,j}$ for all $j \in [d]$

$x_{i,j} \in \{0,1\}$: status of user $i$ on day $j$

Assume that each user's status changes at most $k$ times

  • the assumption is needed only for utility

slide-5
SLIDE 5

Monitoring with LDP

  • Report the status changes (only the first $k$)
  • Maintain a tree of counters, each over an interval of time
  • Based on [DNPR ’10; CSS ’11]

There exists an $\varepsilon$-LDP algorithm that constructs estimates $\hat{S}_1, \hat{S}_2, \ldots, \hat{S}_d$ such that, with high probability, for all $j \in [d]$,

$$|S_j - \hat{S}_j| = O\left(\frac{\sqrt{n}\, k\, (\log d)^2}{\varepsilon}\right)$$
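The role of the tree of counters can be sketched without any privacy noise. The code below is a simplified, non-private illustration and not the algorithm from the talk: it omits the randomization and the truncation to the first few changes, and all names are hypothetical. It keeps one counter per aligned dyadic interval of days, holding the total signed status change in that interval, so that each daily count is recovered from at most about $\log_2 d$ counters.

```python
def dyadic_prefix(p):
    """Split the days [0, p) into disjoint aligned dyadic intervals (start, length)."""
    intervals, start = [], 0
    for bit in reversed(range(p.bit_length())):
        length = 1 << bit
        if p & length:
            intervals.append((start, length))
            start += length
    return intervals

def build_tree(statuses, d):
    """One counter per aligned dyadic interval: total signed status change inside it."""
    change = [0] * d
    for row in statuses:
        prev = 0  # assumption: every user starts from status 0 before day 0
        for t, x in enumerate(row):
            change[t] += x - prev
            prev = x
    tree, length = {}, 1
    while length <= d:
        for start in range(0, d - length + 1, length):
            tree[(start, length)] = sum(change[start:start + length])
        length *= 2
    return tree

def count_on_day(tree, j):
    """Daily count on day j (0-based) = sum of changes over days 0..j."""
    return sum(tree[node] for node in dyadic_prefix(j + 1))

# Hypothetical example: 3 users observed over 4 days.
statuses = [[0, 0, 1, 1], [1, 1, 1, 0], [0, 1, 0, 1]]
tree = build_tree(statuses, 4)
print([count_on_day(tree, j) for j in range(4)])
```

Because a user with few status changes touches few counters, and every estimate reads only logarithmically many counters, noise added to the counters accumulates slowly over time. This is the intuition behind the polylogarithmic dependence on the number of days.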

slide-6
SLIDE 6

Encode-Shuffle-Analyze (ESA) [Bittau et al. β€˜17]

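[Diagram: users holding $x_1, x_2, x_3, \ldots, x_n$ apply their randomizers $A_1, A_2, A_3, \ldots, A_n$; the reports are shuffled and anonymized before they reach the server.]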

slide-7
SLIDE 7

Privacy amplification by shuffling

For any $\varepsilon = O(1)$ and any sequence of $\varepsilon$-LDP algorithms $(A_1, \ldots, A_n)$, let

$$A_{\mathrm{shuffle}}(x_1, \ldots, x_n) = \left(A_1(x_{\pi(1)}), A_2(x_{\pi(2)}), \ldots, A_n(x_{\pi(n)})\right)$$

for a uniformly random permutation $\pi : [n] \to [n]$. Then $A_{\mathrm{shuffle}}$ is $(\varepsilon', \delta)$-DP in the central model for

$$\varepsilon' = O\left(\varepsilon \sqrt{\frac{\log(1/\delta)}{n}}\right)$$

Holds in the adaptive case: $A_i$ may depend on the outputs of $A_1, \ldots, A_{i-1}$
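A minimal sketch of the shuffled mechanism in the non-adaptive case, together with a helper that evaluates the shape of the amplified bound. The constant `c` stands in for the unspecified constant hidden by the $O(\cdot)$ and is purely an assumption of this example, so the printed number shows only how the bound scales, not an actual privacy guarantee.

```python
import math
import random

def shuffle_mechanism(data, randomizers, rng):
    """Apply the i-th randomizer to the input at position pi(i), for a uniform permutation pi."""
    permuted = list(data)
    rng.shuffle(permuted)  # permuted[i] plays the role of x_{pi(i)}
    return [A(x) for A, x in zip(randomizers, permuted)]

def amplified_epsilon(epsilon, delta, n, c=1.0):
    """c * eps * sqrt(log(1/delta) / n); c is a placeholder for the hidden constant."""
    return c * epsilon * math.sqrt(math.log(1.0 / delta) / n)

# Hypothetical usage: every user runs randomized response with flip probability 1/3.
rng = random.Random(1)
rr = lambda x: x if rng.random() < 2.0 / 3.0 else 1 - x
data = [1, 0, 0, 1, 1, 0, 1, 0]
print(shuffle_mechanism(data, [rr] * len(data), rng))
print(amplified_epsilon(epsilon=1.0, delta=1e-6, n=10**6))
```

Quadrupling the number of users halves the bound, which is the $1/\sqrt{n}$ behavior stated in the theorem.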

slide-8
SLIDE 8

Comparison with subsampling

Running an $\varepsilon$-DP algorithm on a random $q$-fraction of the elements is $\approx q\varepsilon$-DP (for $\varepsilon \le 1$) [KLNRS ’08]

The output $A_1(x_{i_1}), A_2(x_{i_2}), \ldots, A_n(x_{i_n})$, where $i_1, i_2, \ldots, i_n \sim [n]$ are drawn independently, is $(\varepsilon', \delta)$-DP for

$$\varepsilon' = O\left(\varepsilon \sqrt{\frac{\log(1/\delta)}{n}}\right)$$

e.g. [BST ’14]

Shuffling includes all elements, so $q = 1$

Advantages of shuffling:

  • does not affect the statistics of the dataset
  • does not increase LDP cost
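The first advantage can be checked with a short sketch contrasting independent sampling with replacement and shuffling. The function names are hypothetical; the closed form $1 - (1 - 1/n)^n$ is the standard expected fraction of distinct elements hit by $n$ uniform draws.

```python
import random

def resample(data, rng):
    """n independent uniform draws with replacement: some elements repeat, others are missed."""
    n = len(data)
    return [data[rng.randrange(n)] for _ in range(n)]

def shuffle_all(data, rng):
    """Uniformly random permutation: every element appears exactly once."""
    out = list(data)
    rng.shuffle(out)
    return out

def expected_distinct_fraction(n):
    """Expected fraction of the dataset that appears among n draws with replacement."""
    return 1.0 - (1.0 - 1.0 / n) ** n

rng = random.Random(0)
data = list(range(1000))
print(len(set(resample(data, rng))), len(set(shuffle_all(data, rng))))
print(expected_distinct_fraction(1000))
```

Sampling with replacement drops roughly a $1/e$ fraction of the users, so aggregate statistics are computed on a distorted dataset, whereas a shuffle preserves the multiset of inputs exactly.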

slide-9
SLIDE 9

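Implications for ESA

[Diagram: users holding $x_1, x_2, x_3, \ldots, x_n$ apply their randomizers $A_1, A_2, A_3, \ldots, A_n$; the reports are shuffled and anonymized before they reach the server.]

Let $S \subseteq [n]$ be a set of users with the same randomizer

For every $i \in S$, the output is $\left(O\left(\varepsilon \sqrt{\frac{\log(1/\delta)}{|S|}}\right), \delta\right)$-DP for the element at position $i$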

slide-10
SLIDE 10

Special case: binary randomized response

RR: for $x \in \{0,1\}$, return $x$ flipped with probability $1/3$. Satisfies $(\log 2)$-LDP

The output distribution is determined by $m = \#_1(\mathrm{RR}(x_1), \ldots, \mathrm{RR}(x_n))$:

$$m \sim \mathrm{Bin}(k, 2/3) + \mathrm{Bin}(n - k, 1/3), \quad \text{where } k = \#_1(x_1, \ldots, x_n)$$

For a neighboring dataset, $k' = k \pm 1$:

$$\mathrm{Bin}(k, 2/3) + \mathrm{Bin}(n - k, 1/3) \;\approx_{\left(O\left(\sqrt{\log(1/\delta)/n}\right),\, \delta\right)}\; \mathrm{Bin}(k + 1, 2/3) + \mathrm{Bin}(n - k - 1, 1/3)$$

[DKMMN ’06]

Also given in [Cheu, Smith, Ullman, Zeber, Zhilyaev ’18] (independently)
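The binary case is small enough to check numerically. The sketch below is a verification aid and not the argument used in the talk: it computes the exact distribution of the number of ones among the shuffled reports by convolving two binomials, then evaluates the smallest $\delta$ for a given $\varepsilon'$ between two neighboring datasets. All function names are assumptions of this example.

```python
import math

def binom_pmf(n, p):
    """Probability mass function of Bin(n, p) as a list indexed by the outcome."""
    return [math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(n + 1)]

def convolve(a, b):
    """Distribution of the sum of two independent integer-valued variables."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def shuffled_rr_pmf(n, k):
    """Distribution of the number of ones among n RR reports when k inputs equal 1."""
    return convolve(binom_pmf(k, 2.0 / 3.0), binom_pmf(n - k, 1.0 / 3.0))

def delta_for(eps, p, q):
    """Smallest delta such that p and q are (eps, delta)-indistinguishable in both directions."""
    e = math.exp(eps)
    d_pq = sum(max(0.0, pi - e * qi) for pi, qi in zip(p, q))
    d_qp = sum(max(0.0, qi - e * pi) for pi, qi in zip(p, q))
    return max(d_pq, d_qp)

# Neighboring datasets differ in one input: k ones versus k + 1 ones.
for n in (50, 400):
    k = n // 2
    p, q = shuffled_rr_pmf(n, k), shuffled_rr_pmf(n, k + 1)
    print(n, delta_for(0.2, p, q), delta_for(math.log(2), p, q))
```

At $\varepsilon' = \log 2$ the required $\delta$ is zero up to floating-point error, matching the local guarantee of RR, while for a smaller $\varepsilon'$ the required $\delta$ shrinks as $n$ grows, which is the amplification effect.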

slide-11
SLIDE 11

Conclusions

  • Monitoring with LDP and log dependence on time
  • General privacy amplification technique
  • Match state of the art in the central model
  • Can be used to derive lower bounds for LDP
  • Provable benefits of anonymity for ESA-like architectures
  • To appear in SODA 2019
  • arxiv.org/abs/1811.12469
