LightDP: Towards Automating Differential Privacy Proofs

  1. LightDP: Towards Automating Differential Privacy Proofs. Danfeng Zhang, Daniel Kifer, Penn State University

  2. A database w/o Alice's data yields output distribution $\mu_1$; a database w/ Alice's data yields output distribution $\mu_2$. Alice's data remain private if $\mu_1$, $\mu_2$ are close.

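  3. (Pure) Differential Privacy. Let $\mu_1$, $\mu_2$ be the output distributions on two adjacent databases. If for any adjacent databases and value $M$, $\mu_1(M)/\mu_2(M) \le e^{\epsilon}$ for some constant $\epsilon$, then a computation is $\epsilon$-private. The constant $\epsilon$ is the privacy cost.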
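
As an illustration of the ratio bound above, here is a minimal sketch that checks it numerically for one standard mechanism. It assumes a counting query whose answer differs by 1 between adjacent databases and Laplace noise with scale 1/ε; the function names and the sample answers 10 and 11 are hypothetical and do not come from the slides.

```python
import math

def laplace_pdf(x, mean, scale):
    """Density of the Laplace distribution with the given mean and scale."""
    return math.exp(-abs(x - mean) / scale) / (2.0 * scale)

def max_ratio(answer1, answer2, scale, outputs):
    """Largest density ratio, in either direction, over the listed output values."""
    ratios = []
    for v in outputs:
        p1 = laplace_pdf(v, answer1, scale)
        p2 = laplace_pdf(v, answer2, scale)
        ratios.append(max(p1 / p2, p2 / p1))
    return max(ratios)

epsilon = 0.5
scale = 1.0 / epsilon                             # assumed noise scale for sensitivity 1
outputs = [k / 10.0 for k in range(-200, 201)]    # grid of candidate output values
worst = max_ratio(10.0, 11.0, scale, outputs)     # true answers on adjacent databases
bound_holds = worst <= math.exp(epsilon) * (1 + 1e-9)
```

A grid check like this only samples the output space; it is a sanity check, not a proof.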

  4. Motivation. DP has seen explosive growth since 2006:
     • U.S. Census Bureau LEHD OnTheMap tool [Machanavajjhala et al. 2008]
     • Google Chrome Browser [Erlingsson et al. 2014]
     • Apple's new data collection efforts [Greenberg 2016]
     But this growth has also been accompanied by flawed (paper-and-pencil) proofs, e.g., ones categorized in [Chen&Machanavajjhala'15, Lyu et al.'16]. Rigorous methods are needed for differential privacy proofs.

  5. Related Work.
     DP programming platforms (e.g., PINQ, Airavat):
     • Use (instead of verify) basic DP mechanisms
     • Cannot offer tight bounds for sophisticated algorithms
     Methods based on customized logics:
     • Steep learning curve
     • Heavy annotation burden
     LightDP offers a better balance between expressiveness and usability.

  6. LightDP: Overview. A relational, dependent type system transforms the source program into a target program with a distinguished variable $v_\epsilon$. Main Theorem: if the source program type checks and $v_\epsilon$ is bounded by a constant $\epsilon$ in the target program, then the source program is $\epsilon$-private.

  7. Source Language: Syntax. [Grammar figure not preserved; its callouts mark the random expression (e.g., Laplace dist.) and the random variable.]

  8. Source Language: Semantics. Memory: mapping from variables to values. An initial memory and an adjacent memory each produce a final memory dist.; the two executions are connected by relational reasoning via the type system.

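  9. Relational Types. A type consists of a base type (e.g., int, real) and a distance. Examples and the memories they relate:
     • $\Gamma(x) : \mathsf{num}_0$ relates $x : u$ and $x : u$
     • $\Gamma(y) : \mathsf{num}_1$ relates $y : v$ and $y : v + 1$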

  10. Dependent Types. The distance can be a program variable. Examples and the memories they relate:
     • $\Gamma(x) : \mathsf{num}_0$ relates $x : u$ and $x : u$
     • $\Gamma(y) : \mathsf{num}_x$ relates $y : v$ and $y : v + u$

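  11. Dependent Types. The distance can be a non-prob. expression. Examples and the memories they relate:
     • $\Gamma(x) : \mathsf{num}_0$ relates $x : u$ and $x : u$
     • $\Gamma(y) : \mathsf{num}_{x \ge 1\,?\,2\,:\,0}$ relates $y : v$ and $y : v + 2$ if $u \ge 1$, or $y : v$ if $u < 1$
     Notation: $m_1 \,\Gamma\, m_2$ if $m_1$ and $m_2$ are related by $\Gamma$.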

  12. (For the non-probabilistic subset.) Types form an invariant on two related program executions: if the initial memories satisfy $m_1 \,\Gamma\, m_2$, then after executing a well-typed program, the final memories satisfy $m_1' \,\Gamma'\, m_2'$. Enforced by a type system.
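
The invariant reading of types can be made concrete with a small sketch. Everything here is a hypothetical illustration, not LightDP's implementation: a typing environment is modeled as a dictionary from variable names to distance functions, `shift` computes the related memory, and the one-line `program` (`z := y + x`) is an invented example whose output type was worked out by hand.

```python
from typing import Callable, Dict

Memory = Dict[str, float]
Distance = Callable[[Memory], float]

def shift(m: Memory, gamma: Dict[str, Distance]) -> Memory:
    """Return the memory related to m: each variable moved by its distance, evaluated in m."""
    return {name: value + gamma[name](m) for name, value in m.items()}

def program(m: Memory) -> Memory:
    """Hypothetical non-probabilistic program: z := y + x."""
    out = dict(m)
    out["z"] = out["y"] + out["x"]
    return out

# x has distance 0; y has the dependent distance (x >= 1 ? 2 : 0).
gamma_in: Dict[str, Distance] = {
    "x": lambda m: 0.0,
    "y": lambda m: 2.0 if m["x"] >= 1 else 0.0,
}
# After z := y + x, z inherits distance(y) + distance(x).
gamma_out: Dict[str, Distance] = {
    **gamma_in,
    "z": lambda m: 2.0 if m["x"] >= 1 else 0.0,
}

def invariant_holds(m1: Memory) -> bool:
    """Run the program on m1 and on its related memory; compare the final memories."""
    m2 = shift(m1, gamma_in)
    return program(m2) == shift(program(m1), gamma_out)
```

For example, `invariant_holds({"x": 3.0, "y": 5.0})` exercises the branch where the distance of `y` is 2, and `invariant_holds({"x": 0.5, "y": 5.0})` the branch where it is 0.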

  13. Type System. Expression: e.g., rules for the operators + | − and < | > | = | ≤ | ≥. [Typing rules not preserved.]

  14. Type System. Command: e.g., [typing rules not preserved]. Callouts on the rules: distance must be identical; related executions take same branch.

  15. Relating Two Distributions. $\mu_1 \,\Gamma\, \mu_2$ w.r.t. privacy cost $\epsilon$ if $\forall m.\ \mu_1(m)/\mu_2(\Gamma(m)) \le e^{\epsilon}$. Program: $\eta := \mathsf{Lap}\ r$ (Laplace dist. w/ mean 0 and a scale factor $r$). With $\Gamma(\eta) = \mathsf{num}_0$: no cost.

  16. Relating Two Distributions. $\mu_1 \,\Gamma\, \mu_2$ w.r.t. privacy cost $\epsilon$ if $\forall m.\ \mu_1(m)/\mu_2(\Gamma(m)) \le e^{\epsilon}$. Program: $\eta := \mathsf{Lap}\ r$ (Laplace dist. w/ mean 0 and a scale factor $r$). With $\Gamma(\eta) = \mathsf{num}_1$: cost $1/r$, due to dist. property. Observation: $\eta$ may have an arbitrary distance, which affects the added cost.
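
The "dist. property" behind these costs can be spelled out as a short derivation. This is a sketch under the assumption that $\mathsf{Lap}\ r$ has density $p_r$ as written below and that the sample is given distance $d$; the symbols $p_r$ and $d$ are introduced here for illustration.

```latex
\begin{align*}
p_r(x) &= \frac{1}{2r}\, e^{-|x|/r} \\
\frac{p_r(x)}{p_r(x + d)}
  &= \exp\!\left(\frac{|x + d| - |x|}{r}\right)
  \le \exp\!\left(\frac{|d|}{r}\right)
  && \text{(triangle inequality)}
\end{align*}
```

Taking $d = 0$ gives cost $0$, and $d = 1$ gives cost $1/r$, matching the two typings of $\eta$ above.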

  17. Observation: $\eta$ may have an arbitrary distance, which affects the added cost, so $\eta$ has a polymorphic type. The sampling command in the source program becomes a non-deterministic operation in the target program, which explicitly tracks the added privacy cost. Intuitively, the target program computes the added cost for one sample from distribution $\mu$.

  18. In General: the type system transforms a source program into a target program with a distinguished variable.

  19. Target Language. Includes a command that sets x to an arbitrary value. Verification task in the target language: proving $v_\epsilon$ is bounded by some constant $\epsilon$ in any execution (in a non-probabilistic program). This is a safety property and can be verified using off-the-shelf tools (e.g., Hoare logic, model checking).
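
To make the verification task concrete, here is a minimal sketch of what a target program might look like for a hypothetical source program `eta := Lap(scale); out := q + eta`. The function names, the choice of typing (the sample gets distance `-q_distance` so that `out` has distance 0), and the bound `|q_distance| <= 1` are all assumptions for illustration. The grid enumeration merely stands in for a real proof by Hoare logic or model checking.

```python
import itertools

def target_noisy_answer(q, q_distance, eta, scale):
    """Target-program sketch: eta is an arbitrary input, v_eps tracks the privacy cost."""
    v_eps = 0.0                           # distinguished cost variable
    # The sample is typed with distance -q_distance so that `out` has distance 0.
    eta_distance = -q_distance
    v_eps += abs(eta_distance) / scale    # cost charged for one Laplace sample
    out = q + eta                         # no randomness left in the target program
    return out, v_eps

def check_bound(epsilon):
    """Check v_eps <= epsilon on a finite grid of arbitrary values (not a proof)."""
    scale = 1.0 / epsilon
    eta_values = [k / 4.0 for k in range(-8, 9)]
    distances = [k / 4.0 for k in range(-4, 5)]   # assumed |q_distance| <= 1
    return all(
        target_noisy_answer(3.0, d, eta, scale)[1] <= epsilon + 1e-12
        for d, eta in itertools.product(distances, eta_values)
    )
```

Because the cost update does not depend on the value of `eta`, the bound holds for every execution, which is what a verifier would establish symbolically.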

  20. Putting Together. Source program: the Sparse Vector Method [Dwork and Roth'14].
     • Correctness proof is subtle: incorrect variants are categorized in [Chen&Machanavajjhala'15, Lyu et al.'16]
     • Formally verified very recently [Barthe et al. 2016] with heavy annotation burden

  21. Required Types. Distance depends on the value of the $i$th query answer ($q[i]$). Type Inference: types can be inferred by the inference algorithm of LightDP.

  22. Target Program

  23. Completing the Proof. A loop invariant and a postcondition establish that $v_\epsilon$ is bounded by constant $\epsilon$. Main Theorem: source program type checks + $v_\epsilon$ bounded by constant $\epsilon$ = source program is $\epsilon$-private.
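
The Sparse Vector program and its target program are not preserved in this transcript, so the following is only a sketch of how such a proof might be completed, under loudly stated assumptions: threshold noise `Lap(2/epsilon)` typed with distance 1, per-query noise `Lap(4N/epsilon)` typed with distance 2 when the noisy answer is above the noisy threshold and 0 otherwise, and at most `n_max` above-threshold answers. The function name, the `havoc` callback, and the loop invariant are hypothetical.

```python
import random

def sparse_vector_target(queries, threshold, n_max, epsilon, havoc):
    """Hypothetical target program for an assumed Sparse Vector variant."""
    v_eps = 0.0
    eta1 = havoc()                        # was: eta1 := Lap(2/epsilon), distance 1
    v_eps += 1.0 * epsilon / 2.0          # cost = distance / scale
    noisy_threshold = threshold + eta1
    count, out = 0, []
    for q in queries:
        if count >= n_max:
            break
        # Assumed loop invariant: v_eps <= epsilon/2 + count * epsilon/(2 * n_max).
        assert v_eps <= epsilon / 2.0 + count * epsilon / (2.0 * n_max) + 1e-12
        eta2 = havoc()                    # was: eta2 := Lap(4 * n_max / epsilon)
        above = q + eta2 >= noisy_threshold
        distance = 2.0 if above else 0.0  # dependent distance of eta2
        v_eps += distance * epsilon / (4.0 * n_max)
        out.append(above)
        if above:
            count += 1
    return out, v_eps

# Usage: feed arbitrary values in place of samples and inspect the tracked cost.
rng = random.Random(0)
answers, cost = sparse_vector_target(
    [rng.uniform(-5, 5) for _ in range(20)], 0.0, 3, 1.0,
    lambda: rng.uniform(-10, 10),
)
```

Under these assumptions the invariant together with `count <= n_max` at loop exit gives the postcondition `v_eps <= epsilon`. Only above-threshold answers are charged, which is why the dependent distance matters.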

  24. More in the Paper:
     • Type inference algorithm
     • Searching for proof with minimum cost w/ MaxSMT
     • Formal proof for the main theorem
     • More verified examples (with little manual effort)

  25. Summary. The relational, dependent type system transforms the source program into a target program with a distinguished variable (automated by inference engine). Bounding that variable is a safety property (verified by existing tools). Decomposing differential privacy into subtasks substantially simplifies language-based proof.
