 
Smooth Sensitivity and Sampling
Sofya Raskhodnikova, Penn State University
Joint work with Kobbi Nissim (Ben Gurion University) and Adam Smith (Penn State University)

Our main contributions
• Starting point: global sensitivity framework [DMNS06] (Cynthia's talk)
• Two new frameworks for private data analysis
• Greatly expand the types of information that can be released

Road map
I. Introduction
  • Review of global sensitivity framework [DMNS06]
  • Motivation
II. Smooth sensitivity framework
III. Sample-and-aggregate framework

Model
[Figure: a trusted agency holds the database $x = (x_1, x_2, \ldots, x_n)$. Users send the query "Compute $f(x)$" and receive the answer $A(x) = f(x) + \text{noise}$.]
Each row is arbitrarily complex data supplied by one person.
For which functions $f$ can we have:
• utility: little noise
• privacy: indistinguishability definition of [DMNS06]

Privacy as indistinguishability [DMNS06]
Two databases are neighbors if they differ in one row:
$$x = (x_1, x_2, \ldots, x_n), \qquad x' = (x_1, x'_2, \ldots, x_n)$$
Privacy definition
Algorithm $A$ is $\varepsilon$-indistinguishable if
• for all neighbor databases $x, x'$
• for all sets of answers $S$
$$\Pr[A(x) \in S] \le (1 + \varepsilon) \cdot \Pr[A(x') \in S]$$

Privacy definition: composition
If $A$ is $\varepsilon$-indistinguishable on each query, it is $\varepsilon q$-indistinguishable on $q$ queries.
[Figure: the same model diagram, with each answer $A(x) = f(x) + \text{noise}$ from the trusted agency labeled $\varepsilon$-indistinguishable.]

Global sensitivity framework [DMNS06]
Intuition: $f$ can be released accurately when it is insensitive to individual entries $x_1, \ldots, x_n$.
Global sensitivity:
$$GS_f = \max_{\text{neighbors } x, x'} \| f(x) - f(x') \|$$
Example: $GS_{\text{average}} = \frac{1}{n}$ if $x \in [0,1]^n$.
Theorem
If $A(x) = f(x) + \mathrm{Lap}\left(\frac{GS_f}{\varepsilon}\right)$ then $A$ is $\varepsilon$-indistinguishable.

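The theorem above can be sketched for the average example. This is a minimal illustration, not the authors' implementation: the function names, the sample data, and the fixed seed are assumptions made here, and Laplace noise is drawn as the difference of two exponential samples.

```python
import random


def laplace_noise(scale, rng):
    """Sample Lap(scale) as the difference of two exponentials with mean `scale`."""
    if scale == 0:
        return 0.0
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def global_sensitivity_average(n):
    """GS of the average of n values in [0, 1]: one row moves the mean by at most 1/n."""
    return 1.0 / n


def release_average(x, epsilon, rng):
    """Return average(x) + Lap(GS_average / epsilon)."""
    n = len(x)
    scale = global_sensitivity_average(n) / epsilon
    return sum(x) / n + laplace_noise(scale, rng)


rng = random.Random(0)  # fixed seed only so the sketch is reproducible
x = [0.2, 0.9, 0.4, 0.7, 0.5] * 200  # hypothetical database: n = 1000 rows in [0, 1]
noisy = release_average(x, epsilon=0.1, rng=rng)
print(noisy)
```

With n = 1000 and ε = 0.1 the noise scale is 1/(nε) = 0.01, so the released value stays close to the true average.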
Instance-Based Noise
Big picture for global sensitivity framework:
• add enough noise to cover the worst case for $f$
• noise distribution depends only on $f$, not on database $x$
Problem: for some functions that's too much noise
Example: median of $x_1, \ldots, x_n \in [0,1]$
$$x = (\underbrace{0, \ldots, 0}_{\frac{n-1}{2}},\ 0,\ \underbrace{1, \ldots, 1}_{\frac{n-1}{2}}), \qquad x' = (\underbrace{0, \ldots, 0}_{\frac{n-1}{2}},\ 1,\ \underbrace{1, \ldots, 1}_{\frac{n-1}{2}})$$
$$\mathrm{median}(x) = 0, \qquad \mathrm{median}(x') = 1, \qquad GS_{\text{median}} = 1$$
• Noise magnitude: $\frac{1}{\varepsilon}$.
Our goal: noise tuned to database $x$

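The worst-case pair for the median can be checked directly. A small sketch, assuming odd n; the helper name `worst_case_pair` and the values n = 101 and ε = 0.1 are chosen here for illustration.

```python
from statistics import median


def worst_case_pair(n):
    """Build the two neighboring databases from the median example (odd n)."""
    half = (n - 1) // 2
    x = [0.0] * half + [0.0] + [1.0] * half
    x_prime = [0.0] * half + [1.0] + [1.0] * half
    return x, x_prime


n = 101
epsilon = 0.1
x, x_prime = worst_case_pair(n)

# The databases differ in exactly one row, yet the median jumps across [0, 1].
differing_rows = sum(a != b for a, b in zip(x, x_prime))
gap = abs(median(x) - median(x_prime))

# Laplace scale needed under the global sensitivity framework.
scale_median = gap / epsilon          # GS_median / epsilon
scale_average = (1.0 / n) / epsilon   # GS_average / epsilon, for comparison
print(differing_rows, gap, scale_median, scale_average)
```

The comparison with the average shows the problem: the median needs noise of magnitude 1/ε on data in [0, 1], while the average needs only 1/(nε).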
Local sensitivity
Local sensitivity:
$$LS_f(x) = \max_{x' :\ \text{neighbor of } x} \| f(x) - f(x') \|$$
Reminder: $GS_f = \max_x LS_f(x)$
Example: median for $0 \le x_1 \le \cdots \le x_n \le 1$, odd $n$
[Figure: number line from 0 to 1 with points $x_1, \ldots, x_{m-1}, x_m, x_{m+1}, \ldots, x_n$. The median is $x_m$; the new median is $x_{m+1}$ when $x'_1 = 1$ and $x_{m-1}$ when $x'_n = 0$.]
$$LS_{\text{median}}(x) = \max(x_m - x_{m-1},\ x_{m+1} - x_m)$$
Goal: Release $f(x)$ with less noise when $LS_f(x)$ is lower.

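The formula for the local sensitivity of the median translates into a few lines. A minimal sketch, assuming odd n ≥ 3 and entries in [0, 1]; the function name and the sample databases are illustrative.

```python
def local_sensitivity_median(x):
    """LS_median(x) = max(x_m - x_{m-1}, x_{m+1} - x_m) for odd n >= 3."""
    xs = sorted(x)
    n = len(xs)
    if n < 3 or n % 2 == 0:
        raise ValueError("this sketch assumes an odd number of rows, n >= 3")
    m = n // 2  # 0-based index of the median in the sorted list
    return max(xs[m] - xs[m - 1], xs[m + 1] - xs[m])


# Values packed around the median: low local sensitivity.
print(local_sensitivity_median([0.1, 0.48, 0.5, 0.52, 0.9]))

# The worst-case instance from the median example: local sensitivity 1.
print(local_sensitivity_median([0.0, 0.0, 0.0, 1.0, 1.0]))
```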
Instance-based noise: first attempt
Noise magnitude proportional to $LS_f(x)$ instead of $GS_f$?
No! Noise magnitude reveals information.
Lesson: Noise magnitude must be an insensitive function.

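One way to see the leak is a worked instance for the median. The two databases below are chosen here for illustration (they are not from the slides), and the noise is assumed to be continuous, for example Laplace, with scale proportional to the local sensitivity.

```latex
% Odd n >= 3, m = (n+1)/2, entries in [0,1]; x and x' differ only in row m+1.
\begin{align*}
x  &= (\underbrace{0,\ldots,0}_{m+1},\ \underbrace{1,\ldots,1}_{n-m-1}), &
LS_{\mathrm{median}}(x)  &= \max(x_m - x_{m-1},\ x_{m+1} - x_m) = \max(0,\ 0) = 0,\\
x' &= (\underbrace{0,\ldots,0}_{m},\ \underbrace{1,\ldots,1}_{n-m}), &
LS_{\mathrm{median}}(x') &= \max(x'_m - x'_{m-1},\ x'_{m+1} - x'_m) = \max(0,\ 1) = 1.
\end{align*}
% Both medians equal 0. With noise scaled by LS, A(x) adds no noise at all,
% while A(x') adds continuous noise of scale 1/epsilon.
\[
\Pr[A(x) \in \{0\}] = 1
\qquad\text{but}\qquad
(1+\varepsilon)\cdot\Pr[A(x') \in \{0\}] = 0 .
\]
```

The set $S = \{0\}$ violates the indistinguishability inequality, so an observer who sees the exact answer 0 learns that the database is $x$ rather than $x'$.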
Smooth bounds on local sensitivity
Design sensitivity function $S(x)$
• $S(x)$ is an $\varepsilon$-smooth upper bound on $LS_f(x)$ if:
  – for all $x$: $S(x) \ge LS_f(x)$
  – for all neighbors $x, x'$: $S(x) \le e^{\varepsilon} S(x')$
[Figure: plot over databases $x$ with the curve $S(x)$ lying above the curve $LS_f(x)$.]
Theorem
If $A(x) = f(x) + \mathrm{noise}\left(\frac{S(x)}{\varepsilon}\right)$ then $A$ is $\varepsilon'$-indistinguishable.
Example: $GS_f$ is always a smooth bound on $LS_f(x)$

Smooth Sensitivity
Smooth sensitivity:
$$S^*_f(x) = \max_y \left( LS_f(y)\, e^{-\varepsilon \cdot \mathrm{dist}(x,y)} \right)$$
Lemma
For every $\varepsilon$-smooth bound $S$: $S^*_f(x) \le S(x)$ for all $x$.
Intuition: little noise when far from sensitive instances
[Figure: the database space, with regions labeled "low local sensitivity", "low smooth sensitivity", and "high local sensitivity".]

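For the median, the maximum over all databases $y$ can be organized by distance $k$ from $x$. The sketch below rests on assumptions not stated on the slide: dist is taken to be the number of rows in which two databases differ, entries lie in [0, 1], n is odd, and the largest local sensitivity reachable by changing $k$ rows is taken to be the widest gap $x_{m+t} - x_{m+t-k-1}$ for $t = 0, \ldots, k+1$, with out-of-range entries padded by 0 and 1. It only computes $S^*$; it does not sample the noise.

```python
import math


def smooth_sensitivity_median(x, epsilon):
    """Smooth sensitivity of the median for odd n and entries in [0, 1].

    Computes max over k of exp(-epsilon * k) * A_k, where A_k is the widest
    gap x_{m+t} - x_{m+t-k-1} over t = 0..k+1 (1-based order statistics).
    """
    xs = sorted(x)
    n = len(xs)
    if n % 2 == 0:
        raise ValueError("this sketch assumes an odd number of rows")
    m = (n + 1) // 2  # 1-based rank of the median

    def value(i):
        # 1-based access with padding: 0 below the data, 1 above it.
        if i < 1:
            return 0.0
        if i > n:
            return 1.0
        return xs[i - 1]

    best = 0.0
    for k in range(n + 1):
        widest = max(value(m + t) - value(m + t - k - 1) for t in range(k + 2))
        best = max(best, math.exp(-epsilon * k) * widest)
    return best


# Hypothetical database: 101 values evenly spaced in [0.45, 0.55].
clustered = [0.45 + 0.001 * i for i in range(101)]
s_clustered = smooth_sensitivity_median(clustered, epsilon=0.5)

# The worst-case instance for the median, where local sensitivity is already 1.
s_worst = smooth_sensitivity_median([0.0] * 51 + [1.0] * 50, epsilon=0.5)
print(s_clustered, s_worst)
```

The quadratic loop is kept for readability. On the clustered database the result stays far below the global sensitivity of 1, which matches the intuition of little noise when far from sensitive instances.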
Computing smooth sensitivity
Example functions with computable smooth sensitivity
• Median & minimum of numbers in a bounded interval
• MST cost when weights are bounded
• Number of triangles in a graph
Approximating smooth sensitivity
• only smooth upper bounds on LS are meaningful
• simple generic methods for smooth approximations
  – work for median and 1-median in $L_1^d$

New goal
• Smooth sensitivity framework requires understanding combinatorial structure of $f$
  – hard in general
• Goal: an automatable transformation from an arbitrary $f$ into an $\varepsilon$-indistinguishable $A$
  – $A(x) \approx f(x)$ for "good" instances $x$

Example: cluster centers
Database entries: points in a metric space.
[Figure: the point sets of two databases $x$ and $x'$, with circles marking the clusters.]
• Comparing sets of centers: Earthmover-like metric
• Global sensitivity of cluster centers is roughly the diameter of the space. But intuitively, if clustering is "good", cluster centers should be insensitive.
• No efficient approximation for smooth sensitivity