CSC2412: Definition of Differential Privacy (Sasho Nikolov)


  1. CSC2412: Definition of Differential Privacy. Sasho Nikolov.

  2. An Ideal Goal. The study reveals nothing new about any particular individual to an adversary. Example:
  • The adversary believes humans have four fingers on each hand.
  • In particular, the adversary believes Sasho has four fingers on each hand.

  3. An Ideal Goal. The study reveals nothing new about any particular individual to an adversary. Example:
  • The adversary believes humans have four fingers on each hand.
  • In particular, the adversary believes Sasho has four fingers on each hand.
  • The study reveals the distribution of the number of fingers per person's hand.
  • The adversary has now learned that Sasho probably has five fingers per hand.

  4. An Ideal Goal. The study reveals nothing new about any particular individual to an adversary. Example:
  • The adversary believes humans have four fingers on each hand.
  • In particular, the adversary believes Sasho has four fingers on each hand.
  • The study reveals the distribution of the number of fingers per person's hand.
  • The adversary has now learned that Sasho probably has five fingers per hand.
  Note: learning about the world is also learning about individuals.
  Another example:
  • The adversary believes there is no link between smoking and cancer.
  • The adversary also knows that Sasho smokes.
  • The study reveals a link between smoking and cancer.

  5. Statistical vs Personal Information. In the examples, the adversary learns statistical information that pertains to Sasho.
  • If science works, it had better reveal something about me.
  What information is statistical and what information is personal? Test: could the adversary have learned this information if my data were not analyzed?
  • "People have five fingers, not four" and "smoking causes cancer": yes, so this information is statistical.
  • Information that could only have been learned from my record: no, so it is personal.

  6. Towards a Definition. The algorithm doing the analysis should do almost the same thing in all of the following cases:
  • my data is included in the data set;
  • my data is not included in the data set;
  • my data is changed in the data set.
  I.e., what the algorithm publishes does not depend too strongly on my data.

  7. Data Model. A data set is a (multi-)set X of n data points, X = {x_1, ..., x_n}.
  • Each data point (or row) x_i is the data of one person.
  • Each data point comes from a universe X, e.g. X = {0, 1}^d for d binary attributes.
  A data analysis algorithm (a mechanism) is a randomized algorithm M that takes a data set X and produces the results of the data analysis as output. The output M(X) is random for any X.

  8. Almost a Definition. We call two data sets X and X' neighbouring if they differ in the data of a single individual:
  1. (variable n) we can get X' from X by adding or removing an element, e.g. X = {x_1, ..., x_n} and X' = {x_1, ..., x_{n-1}} (the data set size changes);
  2. (fixed n) we can get X' from X by replacing an element with another, e.g. X = {x_1, ..., x_n} and X' = {x_1, ..., x_{i-1}, x_i', x_{i+1}, ..., x_n}.
  Definition. A mechanism M is differentially private if, for any two neighbouring data sets X, X', M(X) ≈ M(X'), i.e. the random variables M(X) and M(X') are "similar".
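
  The two notions of neighbouring can be checked mechanically on small examples. A minimal Python sketch (the function names and the multiset representation via `collections.Counter` are my own, not from the slides):

  ```python
  from collections import Counter

  def neighbouring_fixed_n(X, Xp):
      """True iff the multisets X and Xp differ by replacing exactly one element."""
      a, b = Counter(X), Counter(Xp)
      extra, missing = a - b, b - a  # multiset differences
      return sum(extra.values()) == 1 and sum(missing.values()) == 1

  def neighbouring_variable_n(X, Xp):
      """True iff one multiset is the other with a single element added or removed."""
      a, b = Counter(X), Counter(Xp)
      return (sum((a - b).values()), sum((b - a).values())) in {(1, 0), (0, 1)}
  ```

  Using multisets rather than lists matches the slide's data model: the order of rows carries no information, only which values appear and how often.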

  9. Total Variation Distance Differential Privacy.
  Definition. A mechanism M is δ-TV differentially private if, for any two neighbouring data sets X, X' and any set of outputs S,
  |P(M(X) ∈ S) − P(M(X') ∈ S)| ≤ δ.
  What should δ be?
  • δ < 1/(2n)? Too strong: any two size-n data sets X, X' are connected by a chain of at most 2n neighbouring data sets, so for every S, |P(M(X) ∈ S) − P(M(X') ∈ S)| ≤ 2nδ < 1. The mechanism does almost the same thing on all data sets, so it cannot release accurate statistics.
  • δ ≥ 1/(2n)? Too weak: the "name and shame" mechanism that, for each i independently, publishes x_i with probability 1/(2n) is δ-TV differentially private, yet it is intuitively not private: some person's data point gets published in the clear.
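
  Under the per-row reading of "name and shame" above (each row published independently with probability γ, otherwise nothing is released for that row), neighbouring data sets differ in a single row, and since the per-row releases are independent, the total variation distance of the whole output equals that of the one differing row. A sketch that computes this single-row distance exactly, assuming that reading (the function names are hypothetical):

  ```python
  from fractions import Fraction

  def row_release_dist(x, gamma):
      """One row of "name and shame": publish the value x with probability gamma,
      otherwise release nothing (None)."""
      return {x: gamma, None: 1 - gamma}

  def tv_distance(p, q):
      """Total variation distance between two finite distributions given as dicts."""
      support = set(p) | set(q)
      return sum(abs(p.get(v, 0) - q.get(v, 0)) for v in support) / 2
  ```

  For differing rows x ≠ x' the distance comes out to exactly γ, so setting γ = 1/(2n) meets the δ ≥ 1/(2n) budget while still exposing raw data.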

  10. Finally, Differential Privacy.
  Definition (Dwork, McSherry, Nissim, Smith 2006). A mechanism M is ε-differentially private if, for any two neighbouring data sets X, X' and any set of outputs S,
  P(M(X) ∈ S) ≤ e^ε P(M(X') ∈ S).
  Here ε is a small positive constant; for small ε, e^ε ≈ 1 + ε.
  In Vadhan's notes: any conclusion an adversary draws from M(X) could also have been drawn from M(X').
  Interpretation: take S to be the event "something bad happens to me". ε-DP says my risks are almost the same whether or not my data is used.
  "Name and shame" fails this definition: take S to be the event that my data point is published. Then P(M(X) ∈ S) > 0 when X contains my data, but P(M(X') ∈ S) = 0 when X' does not, so the inequality fails for every finite ε.
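
  For a mechanism with finitely many outputs, the condition over all sets S reduces to a pointwise condition on individual outputs, so one can compute the smallest ε that works for a given pair of neighbouring output distributions. A sketch (the helper name is mine, not from the lecture):

  ```python
  import math

  def eps_for_pair(p, q):
      """Smallest eps with p(y) <= e^eps * q(y) and q(y) <= e^eps * p(y) for all
      outputs y; math.inf if one distribution can produce an output the other cannot."""
      worst = 0.0
      for y in set(p) | set(q):
          py, qy = p.get(y, 0.0), q.get(y, 0.0)
          if py == 0.0 and qy == 0.0:
              continue
          if py == 0.0 or qy == 0.0:
              return math.inf
          worst = max(worst, abs(math.log(py / qy)))
      return worst
  ```

  On a name-and-shame-style pair, where one distribution puts mass on an output the other cannot produce, the result is infinite: no finite ε works, matching the argument above.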

  11. A Hypothesis Testing Viewpoint (Wasserman, Zhou; not essential).
  Suppose X = {X_1, ..., X_n} is drawn IID from some distribution. The adversary A wants to use M(X) to test which hypothesis holds:
  • H_0: X_i = y_0 (e.g., "Sasho does not smoke");
  • H_1: X_i = y_1 (e.g., "Sasho smokes").
  If M is ε-differentially private, then for any A that sees M(X) and outputs "H_0" or "H_1",
  P(A(M(X)) = "H_1" | H_1) ≤ e^ε P(A(M(X)) = "H_1" | H_0).
  The left side is the true positive rate (1 minus the Type II error); the right side is e^ε times the false positive rate (the Type I error).
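
  To see that this bound can be tight, consider a single-bit randomized-response release (slide 12) with e^ε = 3, and an adversary who answers "H1" exactly when the released bit is 1. The specific numbers are an illustrative assumption; `fractions.Fraction` keeps the arithmetic exact:

  ```python
  from fractions import Fraction

  # Single-bit randomized response with e^eps = 3 (eps = ln 3): the true bit is
  # reported with probability 3/4 and flipped with probability 1/4.
  e_eps = Fraction(3)
  p_keep = e_eps / (1 + e_eps)  # 3/4

  # Adversary's test: answer "H1" exactly when the released bit is 1.
  tpr = p_keep       # P(A says "H1" | H1: x_i = 1)
  fpr = 1 - p_keep   # P(A says "H1" | H0: x_i = 0)
  # Here tpr equals e_eps * fpr, so the DP bound holds with equality.
  ```

  So ε directly limits how much better than random guessing any test can do: a high true positive rate forces a correspondingly high false positive rate.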

  12. Randomized Response (Warner). Given
  • a data set X = {x_1, ..., x_n} ⊆ X, and
  • a query q: X → {0, 1} (e.g., q(x) = 1 if x is a smoker and q(x) = 0 otherwise),
  output M(X) = (Y_1(x_1), ..., Y_n(x_n)), where, independently for each i,
  Y_i(x_i) = q(x_i) with probability e^ε / (1 + e^ε), and Y_i(x_i) = 1 − q(x_i) with probability 1 / (1 + e^ε).
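
  A minimal sketch of the mechanism, representing bits as 0/1 integers and using Python's `random` for the coin flips (the function name is mine):

  ```python
  import math
  import random

  def randomized_response(X, q, eps, rng=random):
      """Release q(x_i) for each row, independently flipped with prob. 1/(1+e^eps)."""
      p_keep = math.exp(eps) / (1 + math.exp(eps))
      return [q(x) if rng.random() < p_keep else 1 - q(x) for x in X]
  ```

  Each reported bit is the truth only with probability e^ε / (1 + e^ε), so every respondent retains plausible deniability about their true answer.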

  13. Privacy Analysis. Claim: for any y ∈ {0, 1}^n and any neighbouring X, X',
  P(M(X) = y) ≤ e^ε P(M(X') = y).
  Proof sketch: suppose X and X' differ in row i (fixed-n neighbouring). Since the Y_j are independent,
  P(M(X) = y) = P(Y_1(x_1) = y_1) · ... · P(Y_n(x_n) = y_n),
  and likewise for X'. All factors except the i-th are identical, and the i-th factors each equal e^ε / (1 + e^ε) or 1 / (1 + e^ε), so their ratio is at most e^ε.
  Summing over y ∈ S gives P(M(X) ∈ S) ≤ e^ε P(M(X') ∈ S) for every set of outputs S.
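
  Because the output probability factorizes, the claim can be checked exhaustively on tiny data sets: compute P(M(X) = y) exactly for every y and take the worst-case ratio over a neighbouring pair. A sketch (the function names are mine, not from the lecture):

  ```python
  import itertools
  import math

  def rr_output_prob(X, q, eps, y):
      """Exact probability that randomized response on X outputs the bit vector y."""
      p_keep = math.exp(eps) / (1 + math.exp(eps))
      prob = 1.0
      for x, b in zip(X, y):
          prob *= p_keep if b == q(x) else 1 - p_keep
      return prob

  def max_ratio(X, Xp, q, eps):
      """Worst case of P(M(X) = y) / P(M(X') = y) over all outputs y."""
      return max(
          rr_output_prob(X, q, eps, y) / rr_output_prob(Xp, q, eps, y)
          for y in itertools.product([0, 1], repeat=len(X))
      )
  ```

  For data sets differing in one row, the worst-case ratio comes out to e^ε (up to floating point), so the privacy analysis is tight.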

  14. Accuracy Analysis. We want to approximate q(X) = (1/n) Σ_{i=1}^n q(x_i), e.g. the fraction of smokers.
  Claim: (1/n) Σ_{i=1}^n Z_i ≈ q(X), where Z_i = ((1 + e^ε) Y_i − 1) / (e^ε − 1).
  • The Z_i are independent, and E[Y_i] = (1 + (e^ε − 1) q(x_i)) / (1 + e^ε), so E[Z_i] = q(x_i) and E[(1/n) Σ_{i=1}^n Z_i] = q(X).
  • Each Z_i lies in an interval of length (e^ε + 1) / (e^ε − 1).
  • By Hoeffding's inequality, P(|(1/n) Σ_{i=1}^n Z_i − q(X)| ≥ α) ≤ 2 exp(−2α²n (e^ε − 1)² / (e^ε + 1)²), so n ≥ ((e^ε + 1) / (e^ε − 1))² log(2/β) / (2α²) samples suffice for error at most α with probability at least 1 − β. For small ε this is on the order of log(1/β) / (ε² α²).
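
  A sketch of the debiased estimator, followed by a hypothetical end-to-end simulation (the data, seed, and parameters are invented for illustration):

  ```python
  import math
  import random

  def rr_estimate(Y, eps):
      """Debias randomized-response bits: Z_i = ((1 + e^eps) * Y_i - 1) / (e^eps - 1)
      has expectation q(x_i), so the average of the Z_i estimates q(X)."""
      e = math.exp(eps)
      return sum(((1 + e) * y - 1) / (e - 1) for y in Y) / len(Y)

  # End-to-end check on made-up data: n = 20000 bits, 30% ones, eps = 1.
  rng = random.Random(0)
  eps = 1.0
  p_keep = math.exp(eps) / (1 + math.exp(eps))
  bits = [1] * 6000 + [0] * 14000
  Y = [b if rng.random() < p_keep else 1 - b for b in bits]
  estimate = rr_estimate(Y, eps)  # close to 0.3 with high probability
  ```

  The raw average of the Y_i would be biased toward 1/2 by the random flips; dividing by e^ε − 1 rescales the shrunken signal back, at the cost of variance that grows like 1/ε² for small ε, exactly as the Hoeffding bound above predicts.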
