SLIDE 1

Privacy Amplification by Mixing and Diffusion Mechanisms

Borja Balle, Gilles Barthe, Marco Gaboardi, Joseph Geumlek

SLIDE 2

Amplification by Postprocessing

[Diagram: x1, …, xn → M (mechanism) → y1 → K (post-processing Markov operator) → y2]
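The picture above can be made concrete with a toy example (not from the slides): a randomized-response mechanism M followed by a random-flip Markov operator K. All names and parameters here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def randomized_response(bit, eps):
    """eps-DP mechanism M: report the true bit with prob. e^eps / (1 + e^eps)."""
    p_true = np.exp(eps) / (1.0 + np.exp(eps))
    return bit if rng.random() < p_true else 1 - bit

def flip_channel(y, gamma):
    """Markov operator K: replace the output with a uniform bit w.p. gamma."""
    return int(rng.integers(0, 2)) if rng.random() < gamma else y

# K acts only on M's output, never on the data, so K applied after M is
# pure post-processing: K◦M is at least as private as M.
x = 1
y1 = randomized_response(x, eps=1.0)   # output of M
y2 = flip_channel(y1, gamma=0.3)       # output of K◦M
```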

SLIDE 3

Amplification by Postprocessing

  • When is K◦M more private than M?

[Diagram: x1, …, xn → M (mechanism) → y1 → K (post-processing Markov operator) → y2]

SLIDE 4

Amplification by Postprocessing

  • When is K◦M more private than M?

[Diagram: x1, …, xn → M (mechanism) → y1 → K (post-processing Markov operator) → y2 → K → y3 → K → y4]

SLIDE 5

Amplification by Postprocessing

  • When is K◦M more private than M?
  • How does privacy relate to mixing in the Markov chain?

[Diagram: x1, …, xn → M (mechanism) → y1 → K (post-processing Markov operator) → y2 → K → y3 → K → y4]


SLIDE 7

Amplification by Postprocessing

  • When is K◦M more private than M?
  • How does privacy relate to mixing in the Markov chain?
  • Starting point for “Hierarchical DP”

[Diagram: x1, …, xn → M (mechanism) → y1 → K (post-processing Markov operator) → y2 → K → y3 → K → y4]

SLIDE 8

Our Results

  • Amplification under uniform mixing
  • Relates to classical mixing conditions (e.g. Dobrushin, Doeblin) and to local DP properties of K
  • E.g. if M is ε-DP and K is log(1/(1 − γ))-LDP, then K◦M is log(1 + γ(e^ε − 1))-DP
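Plugging numbers into the amplification bound ε ↦ log(1 + γ(e^ε − 1)) shows the two limiting behaviors: at γ = 1 (no mixing) nothing is gained, and as γ → 0 the amplified ε vanishes. A minimal check:

```python
import math

def amplified_eps(eps, gamma):
    """The slide's bound: a gamma-mixing post-processing step turns an
    eps-DP mechanism into a log(1 + gamma * (e^eps - 1))-DP one."""
    return math.log(1.0 + gamma * (math.exp(eps) - 1.0))

eps, gamma = 1.0, 0.5
eps_amp = amplified_eps(eps, gamma)
# The bound never exceeds the original eps, and shrinks as gamma -> 0.
assert eps_amp <= eps
```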

SLIDE 9

Our Results

  • Amplification under uniform mixing
  • Relates to classical mixing conditions (e.g. Dobrushin, Doeblin) and to local DP properties of K
  • E.g. if M is ε-DP and K is log(1/(1 − γ))-LDP, then K◦M is log(1 + γ(e^ε − 1))-DP
  • Amplification from couplings
  • Generalizes amplification by iteration [Feldman et al. 2018]
  • Applied to SGD: exponential amplification in the strongly convex case

SLIDE 10

Our Results

  • Amplification under uniform mixing
  • Relates to classical mixing conditions (e.g. Dobrushin, Doeblin) and to local DP properties of K
  • E.g. if M is ε-DP and K is log(1/(1 − γ))-LDP, then K◦M is log(1 + γ(e^ε − 1))-DP
  • Amplification from couplings
  • Generalizes amplification by iteration [Feldman et al. 2018]
  • Applied to SGD: exponential amplification in the strongly convex case
  • The continuous-time limit: diffusion mechanisms
  • General RDP analysis via a heat-flow argument
  • New Ornstein–Uhlenbeck mechanism with better MSE than the Gaussian mechanism

SLIDE 11

Amplification by Iteration in NoisySGD

  • If D and D’ differ in position j, then the last n-j iterations are postprocessing
  • Can also use public data for the last r iterations
  • Start from a coupling between xj and xj’ and propagate it through
  • Keep all the mass as close to the diagonal as possible

Algorithm 1: Noisy Projected Stochastic Gradient Descent — NoisyProjSGD(D, ℓ, η, σ, ξ0)
Input: dataset D = (z1, …, zn), loss function ℓ : K × D → ℝ, learning rate η, noise parameter σ, initial distribution ξ0 ∈ 𝒫(K)
Sample x0 ∼ ξ0
for i ∈ [n] do
    xi ← ΠK(xi−1 − η(∇xℓ(xi−1, zi) + Z)) with Z ∼ 𝒩(0, σ²I)
return xn

[FMTT’18]
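Algorithm 1 admits a short runnable sketch, with illustrative parameter choices and the constraint set K taken to be a Euclidean ball (an assumption for concreteness):

```python
import numpy as np

def noisy_proj_sgd(data, grad_loss, lr, sigma, radius, x0, seed=None):
    """Sketch of NoisyProjSGD: one pass over the data, perturbing each
    gradient with Gaussian noise and projecting back onto the Euclidean
    ball of the given radius (standing in for the set K)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    for z in data:
        g = grad_loss(x, z) + rng.normal(0.0, sigma, size=x.shape)
        x = x - lr * g
        n = np.linalg.norm(x)
        if n > radius:              # projection Π_K onto the ball
            x = x * (radius / n)
    return x

# Noise-free sanity run with the quadratic loss ℓ(x, z) = ½(x − z)²
x_out = noisy_proj_sgd(data=[0.0, 1.0, 2.0],
                       grad_loss=lambda x, z: x - z,
                       lr=0.5, sigma=0.0, radius=1.0,
                       x0=np.array([5.0]))
```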

SLIDE 12

Projected Generalized Gaussian Mechanism

K(x) = Π𝕃(𝒩(ψ(x), σ²I)),  ψ : ℝd → ℝd

[Diagram: x ↦ ψ(x) ↦ ψ(x) + Z ↦ Π𝕃(ψ(x) + Z) ∈ 𝕃]
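A minimal sketch of this mechanism, reading the noise as Gaussian and taking ψ and the projection onto 𝕃 as caller-supplied callables; the function names and the unit-ball choice of 𝕃 are illustrative assumptions:

```python
import numpy as np

def proj_gaussian_mechanism(x, psi, sigma, project, seed=None):
    """K(x) = Π_L(N(psi(x), sigma^2 I)): apply the map psi, add isotropic
    Gaussian noise, then project onto the constraint set L."""
    rng = np.random.default_rng(seed)
    y = psi(np.asarray(x, dtype=float))
    y = y + rng.normal(0.0, sigma, size=y.shape)
    return project(y)

# Example: psi is a contraction, L is the unit ball.
unit_ball = lambda v: v / max(1.0, np.linalg.norm(v))
out = proj_gaussian_mechanism(np.array([2.0, 0.0]),
                              psi=lambda v: 0.9 * v,
                              sigma=0.1, project=unit_ball, seed=0)
```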

SLIDE 13

Amplification by Coupling

Suppose ψ1, …, ψr are L-Lipschitz and Ki(x) = Π𝕃(𝒩(ψi(x), σ²I)). Then

Rα(μK1⋯Kr ‖ νK1⋯Kr) ≤ (α / 2σ²) ∑_{i=1}^{r} L^{2(r−i)} W∞(μi−1, μi)²

where Rα is the Rényi divergence and W∞ the ∞-Wasserstein distance, for any “interpolating path” μ = μ0, μ1, …, μr = ν in 𝒫(𝕃), each step witnessed by a coupling π on 𝕃 × 𝕃 with |y − y′| ≤ w.

Applications:

  • Bound L
  • Optimize path
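The coupling bound can be evaluated numerically for a chosen path. The sketch below simply computes a bound of the shape (α / 2σ²) · Σᵢ L^{2(r−i)} W∞(μi−1, μi)²; the prefactor and exponents are an assumption here, since the slide's formula did not survive extraction cleanly:

```python
def coupling_rdp_bound(alpha, sigma, L, w_dists):
    """(alpha / 2 sigma^2) * sum_{i=1}^{r} L^{2(r-i)} * w_i^2, where w_i is
    the W_inf distance between consecutive measures on the chosen path."""
    r = len(w_dists)
    return (alpha / (2.0 * sigma ** 2)) * sum(
        L ** (2 * (r - i)) * w ** 2 for i, w in enumerate(w_dists, start=1))

# With contractive maps (L < 1), spreading the total shift over a longer
# path lets the L^{2(r-i)} factors damp the early steps, so "optimize path"
# can strictly tighten the bound.
b_one_step  = coupling_rdp_bound(alpha=2.0, sigma=1.0, L=0.9, w_dists=[1.0])
b_four_step = coupling_rdp_bound(alpha=2.0, sigma=1.0, L=0.9, w_dists=[0.25] * 4)
```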
SLIDE 14

Per-index RDP in NoisySGD

Suppose the loss is Lipschitz and smooth.

  • If the loss is convex, can take L = 1; then the i-th person receives ϵi(α)-RDP with ϵi(α) = O(α / ((n − i)σ²))
  • If the loss is strongly convex, can take L < 1; then the i-th person receives ϵi(α)-RDP with ϵi(α) = O(α L^((n−i)/2) / ((n − i)σ²))

[FMTT’18]
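The two per-index rates can be compared directly; a sketch with the big-O constants dropped (illustrative values, not the paper's exact constants):

```python
def eps_convex(alpha, n, i, sigma):
    """Convex case (L = 1): eps_i(alpha) ~ alpha / ((n - i) sigma^2)."""
    return alpha / ((n - i) * sigma ** 2)

def eps_strongly_convex(alpha, n, i, sigma, L):
    """Strongly convex case (L < 1): the extra L^((n-i)/2) factor decays
    exponentially in the number of iterations after index i."""
    return alpha * L ** ((n - i) / 2) / ((n - i) * sigma ** 2)

# Early participants (many noisy iterations after their record is used)
# gain far more privacy in the strongly convex case.
e_sc = eps_strongly_convex(alpha=2.0, n=100, i=1, sigma=1.0, L=0.9)
e_cv = eps_convex(alpha=2.0, n=100, i=1, sigma=1.0)
```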

SLIDE 15

Summary

  • Couplings (including overlapping mixtures) provide a powerful methodology for studying privacy amplification in many settings
  • Including: subsampling, postprocessing, shuffling and iteration
  • Properties of divergences related to (R)DP (e.g. advanced joint convexity) are “necessary” to get tight amplification bounds
  • Different types of couplings are useful (e.g. maximal and small-distance)