
Universally Adaptive Data Analysis Cynthia Dwork, Microsoft Research - PowerPoint PPT Presentation



  1. Universally Adaptive Data Analysis. Cynthia Dwork, Microsoft Research.

  2. π‘Ÿ 2 : muffin tops? Adaptive Data Analysis π‘Ÿ 1 : > 6 ft? q 1 a 1 q 2 M π‘Ÿ 2 : muffin bottoms? a 2 𝑇 q 3 a 3 Database 𝑇 ∼ 𝐸 data analyst  π‘Ÿ 𝑗 depends on 𝑏 1 , 𝑏 2 , … , 𝑏 π‘—βˆ’1  Worry: analyst finds a query for which the dataset is not representative of population; reports surprising discovery

  3. π‘Ÿ 2 : muffin tops? Differential Privacy for Adaptive Validity π‘Ÿ 1 : > 6 ft? q 1 a 1 q 2 DP DP π‘Ÿ 2 : muffin bottoms? a 2 𝑇 q 3 a 3 Database data analyst  π‘Ÿ 𝑗 depends on 𝑏 1 , 𝑏 2 , … , 𝑏 π‘—βˆ’1  Differential privacy neutralizes risks incurred by adaptivity  Definition of privacy tailored to statistical analysis of large data sets [D., Feldman, Hardt, Pitassi , Reingold, Roth ’14]

  4. π‘Ÿ 2 : muffin tops? Differential Privacy for Adaptive Validity π‘Ÿ 1 : > 6 ft? q 1 a 1 q 2 DP DP π‘Ÿ 2 : muffin bottoms? a 2 𝑇 q 3 a 3 Database data analyst  π‘Ÿ 𝑗 depends on 𝑏 1 , 𝑏 2 , … , 𝑏 π‘—βˆ’1  Differential privacy neutralizes risks incurred by adaptivity  βˆƒ LARGE literature on DP algorithms for data analysis [D., Feldman, Hardt, Pitassi , Reingold, Roth ’14]

  5. Some Intuition
  - Fix a query, e.g., "What fraction of the population is over 6 feet tall?"
  - Almost all large datasets will give an approximately correct reply
  - Most datasets are representative with respect to this query
  - If, in the process of adaptive exploration, the analyst finds a query for which the dataset is not representative, then she must have "learned something significant" about the dataset
  - Preserving the "privacy" of the data may prevent overfitting
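A quick numerical check of the first two bullets (my sketch; the population fraction p, dataset size n, and accuracy target τ are arbitrary choices): for one fixed statistical query, the fraction of drawn datasets that answer badly is tiny, and Hoeffding's inequality already bounds it.

```python
import math
import random

random.seed(1)
p = 0.2        # hypothetical population fraction over 6 feet
n = 1000       # dataset size
tau = 0.05     # accuracy target
trials = 2000  # number of independently drawn datasets

def empirical_fraction():
    """Fraction of a fresh n-person dataset that is over 6 feet."""
    return sum(random.random() < p for _ in range(n)) / n

not_representative = sum(abs(empirical_fraction() - p) > tau
                         for _ in range(trials)) / trials

# Hoeffding's inequality bounds the failure probability for this fixed query:
#   Pr[|q(S) - p| > tau] <= 2 exp(-2 n tau^2)
hoeffding = 2 * math.exp(-2 * n * tau ** 2)
```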

  6. Intuition After Nati's Talk
  - Differential privacy: the outcome of any analysis is essentially equally likely, independent of whether any individual joins, or refrains from joining, the dataset
  - This is a stability requirement
  - It gave rise to the folklore that differential privacy yields generalizability
  - But we will be able to say something stronger

  7. π‘Ÿ 2 : muffin tops? π‘Ÿ 1 : > 6 ft? q 1 a 1 q 2 DP M π‘Ÿ 2 : muffin bottoms? a 2 𝑇 q 3 a 3 Database data analyst  π‘Ÿ 𝑗 depends on 𝑏 1 , 𝑏 2 , … , 𝑏 π‘—βˆ’1  Differential privacy neutralizes risks incurred by adaptivity  E.g., for statistical queries: whp 𝐹 𝑇 𝐡 𝑇 βˆ’ 𝐹 𝑄 𝐡 𝑇 < 𝜐  High probability is important for handling many queries [D., Feldman, Hardt, Pitassi , Reingold, Roth ’14]

  8. Formalization
  - Data sets S ∈ X^n; S ∼ D
  - Queries q : X^n → Y
  - Algorithms that choose queries (based on the history of observations) and output the chosen query and its response on S:
    - A_1 = q_1 (trivial choice); outputs (q_1, q_1(S))
    - A_i : X^n × Y_1 × ... × Y_{i-1} → Y_i, where
      - q_i = C_i(y_1, ..., y_{i-1})
      - A_i(S, y_1, ..., y_{i-1}) = (q_i, q_i(S)) = (q_i, a_i)
  - H ≝ {(S, q) : q(S) not representative w.r.t. D}, i.e., q(S) fails to generalize
  - For every fixed history y_1, ..., y_{i-1}: Pr[(S, q_i) ∈ H] ≤ β_i
  - We want Pr[(S, C_i(S)) ∈ H] to be similarly small: q_i(S) should generalize even when q_i is chosen as a function of S

  9. Differential Privacy [D., McSherry, Nissim, Smith '06]
  M gives ϵ-differential privacy if for all pairs of adjacent data sets S, S′ and all events T,
    Pr[M(S) ∈ T] ≤ e^ϵ Pr[M(S′) ∈ T]
  (The probability space is the randomness introduced by M.)
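One standard mechanism meeting this definition (not spelled out on the slide) is the Laplace mechanism of the same paper. A minimal sketch for a single statistical query, on synthetic data of my own choosing:

```python
import math
import random

random.seed(2)

def laplace_noise(scale):
    """Sample Laplace(0, scale) by the inverse-CDF transform."""
    u = random.random() - 0.5
    return -scale * (1 if u >= 0 else -1) * math.log(1 - 2 * abs(u))

def dp_fraction(S, predicate, eps):
    """eps-DP estimate of the fraction of rows of S satisfying predicate.
    Changing one row moves the true fraction by at most 1/len(S) (its
    sensitivity), so Laplace noise of scale 1/(len(S) * eps) suffices."""
    true_frac = sum(1 for row in S if predicate(row)) / len(S)
    return true_frac + laplace_noise(1 / (len(S) * eps))

heights = [random.gauss(67, 4) for _ in range(1000)]  # inches, synthetic
answer = dp_fraction(heights, lambda h: h > 72, eps=0.5)
exact = sum(1 for h in heights if h > 72) / len(heights)
```

With n = 1000 and ϵ = 0.5 the noise scale is only 0.002, so the private answer is essentially as accurate as the exact one.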

  10. Differential Privacy [D., McSherry, Nissim, Smith '06] (continued)
  For random variables X, Y over a common domain, the max-divergence of X from Y is
    D_∞(X‖Y) = log max_x (Pr[X = x] / Pr[Y = x])
  Then ϵ-DP is equivalent to: D_∞(M(S)‖M(S′)) ≤ ϵ for all adjacent S, S′.
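A tiny worked instance of this equivalence (my example, not the talk's): randomized response on a single bit, which reports the true bit with probability e^ϵ/(1 + e^ϵ). The max-divergence between the output distributions on adjacent inputs comes out to exactly ϵ.

```python
import math

eps = math.log(3)                           # target privacy parameter
keep = math.exp(eps) / (1 + math.exp(eps))  # report the true bit w.p. 3/4

# Output distributions of randomized response on the adjacent inputs 0 and 1.
dist0 = {0: keep, 1: 1 - keep}
dist1 = {0: 1 - keep, 1: keep}

def max_divergence(P, Q):
    """D_inf(P || Q) = log max_y Pr_P[y] / Pr_Q[y] (natural log)."""
    return max(math.log(P[y] / Q[y]) for y in P)

d01 = max_divergence(dist0, dist1)  # equals eps: the mechanism is eps-DP
d10 = max_divergence(dist1, dist0)
```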

  11. Differential Privacy (continued)
  ϵ-DP, i.e., D_∞(M(S)‖M(S′)) ≤ ϵ for all adjacent S, S′, is closed under post-processing: for any algorithm A,
    D_∞(A(M(S))‖A(M(S′))) ≤ ϵ

  12. Differential Privacy (continued)
  Group privacy: for all S, S″ (not necessarily adjacent),
    D_∞(M(S)‖M(S″)) ≤ Δ(S, S″) · ϵ
  where Δ(S, S″) is the number of entries on which S and S″ differ.

  13. Properties
  - Closed under post-processing
    - The max-divergence remains bounded
  - Automatically yields group privacy
    - kϵ for groups of size k
  - Well-understood behavior under adaptive composition
    - Can bound the cumulative privacy loss over multiple analyses: "the epsilons add up"
  - Programmable
    - Complicated private analyses from simple private building blocks

  14. The Power of Composition
  - Lemma: the choice of q_i is differentially private
    - Closure under post-processing
  - Inductive step (key): if q is chosen in a differentially private fashion with respect to S, then Pr[(S, C(S)) ∈ H] is small
  - Sufficiency: union bound over the rounds
  [Same diagram: data analyst, DP mechanism M, database S.]

  15. Description Length [D., Feldman, Hardt, Pitassi, Reingold, Roth '15]
  - Let A : X^n → Y. The description length of A is the cardinality of its range.
  - If for all y, Pr_S[(S, y) ∈ H] ≤ β, then Pr[(S, A(S)) ∈ H] ≤ |Y| · β
  - Description length composes, too
    - The bound for the composition is the product: β · Π_i |Y_i|
  - And, morally, it is closed under post-processing
    - Once you fix the randomness of the post-processing algorithm
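The |Y| · β bound is just a union bound over the range. An exact toy calculation (my own; n, k, τ are arbitrary): an algorithm whose range is k fair-coin candidate queries, selecting whichever deviates most on the sample, fails whenever any candidate fails.

```python
import math

n, k, tau = 200, 20, 0.1

# beta: probability that one fixed fair-coin query is non-representative,
# i.e. |empirical mean - 1/2| > tau for a Binomial(n, 1/2) count.
beta = sum(math.comb(n, s) for s in range(n + 1)
           if abs(s / n - 0.5) > tau) / 2 ** n

# The candidates' empirical means are independent, so the selection step
# fails with probability exactly 1 - (1 - beta)^k.
p_selected = 1 - (1 - beta) ** k

# Post-selection hurts (p_selected > beta), but by at most the slide's
# |Y| * beta union bound.
```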

  16. Approximate Max-Divergence
  The β-approximate max-divergence of X from Y is
    D_∞^β(X‖Y) = log max_{T : Pr[X ∈ T] > β} (Pr[X ∈ T] − β) / Pr[Y ∈ T]

  17. Approximate Max-Divergence (continued)
  We are interested in (also with β, but that is messier):
    D_∞((S, A(S)) ‖ S × A(S)) = log max_T (Pr[(S, A(S)) ∈ T] / Pr[S × A(S) ∈ T])
  where S × A(S) denotes the product of the marginals, i.e., S paired with A's output on an independent copy of S.

  18. Approximate Max-Divergence (continued)
  - How much more likely is A(S) to relate to S than to a fresh S′?
  - D_∞((S, A(S)) ‖ S × A(S)) captures the maximum amount of information that an output of the algorithm might reveal about its input.

  19. Unifying Concept: Max-Information [D., Feldman, Hardt, Pitassi, Reingold, Roth '15]
  - I_∞^β(X; Y) = D_∞^β((X, Y) ‖ X × Y)
  - We are interested in I_∞^β(S; A(S))
  - Theorem: if I_∞^β(S; A(S)) ≤ k, then for any T ⊆ X^n × Y,
      Pr[(S, A(S)) ∈ T] ≤ 2^k Pr[S × A(S) ∈ T] + β
  - So Pr[(S, A(S)) ∈ H] ≤ 2^k max_{y ∈ Y} Pr[(S, y) ∈ H] + β
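The β = 0 case of the theorem can be verified exhaustively on a toy joint distribution (my example, not from the talk): X a fair bit, Y equal to X with probability 3/4.

```python
import math
from itertools import product

# Toy joint distribution: X is a fair bit, Y = X with probability 3/4.
joint = {(x, y): 0.5 * (0.75 if x == y else 0.25)
         for x, y in product([0, 1], repeat=2)}
px = {x: joint[x, 0] + joint[x, 1] for x in [0, 1]}
py = {y: joint[0, y] + joint[1, y] for y in [0, 1]}

# Max-information with beta = 0, in bits:
#   I_inf(X; Y) = log2 max_{(x,y)} Pr[(X,Y)=(x,y)] / (Pr[X=x] Pr[Y=y])
k = max(math.log2(joint[x, y] / (px[x] * py[y])) for x, y in joint)

# Theorem check, exhaustively over all events T in the product space:
#   Pr[(X, Y) in T] <= 2^k * Pr[X x Y in T]
outcomes = list(joint)
holds = all(
    sum(joint[o] for o in T)
    <= 2 ** k * sum(px[x] * py[y] for x, y in T) + 1e-12
    for m in range(2 ** len(outcomes))
    for T in [[o for i, o in enumerate(outcomes) if m >> i & 1]]
)
```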

  20. Unifying Concept: Max-Information (continued) [D., Feldman, Hardt, Pitassi, Reingold, Roth '15]
  - Max-information composes and is closed under post-processing
  - I_∞^β(A, n) denotes a bound on the worst-case approximate max-information of A over all distributions on n-element databases
  - For ϵ-DP A: I_∞(A, n) ≤ ϵ n log₂ e; there are better bounds for I_∞^β(A, n)
  - For A of description length |Y|: I_∞^β(A, n) ≤ log |Y|
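The ϵ n log₂ e bound can be checked directly for bitwise randomized response on a small database with a uniform prior (my sketch; the slide's bound holds for any prior, this just exhibits one instance):

```python
import math
from itertools import product

eps = math.log(3)
keep = math.exp(eps) / (1 + math.exp(eps))  # keep each bit w.p. 3/4
n = 4                                       # database of n bits

def out_prob(out, db):
    """Pr[M(db) = out] for bitwise randomized response M."""
    return math.prod(keep if o == b else 1 - keep for o, b in zip(out, db))

dbs = list(product([0, 1], repeat=n))
# Marginal output distribution under a uniform prior on databases.
marginal = {out: sum(out_prob(out, db) for db in dbs) / len(dbs)
            for out in dbs}

# Max-information (beta = 0), in bits:
#   I_inf(S; M(S)) = log2 max_{db, out} Pr[out | db] / Pr[out]
i_inf = max(math.log2(out_prob(out, db) / marginal[out])
            for db in dbs for out in dbs)

bound = eps * n * math.log2(math.e)  # the slide's eps * n * log2(e) bound
```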

  21. Abstract is Good
  - Focusing on properties is powerful
  - A completely universal approach to the validity of adaptive analysis
    - DP, small description length, low max-information
  - Handles large numbers of arbitrary, adaptively chosen computations
  - Closure under post-processing and composition

