Adversarial Classification Under Differential Privacy


  1. Adversarial Classification Under Differential Privacy. Jairo Giraldo (University of Utah), Alvaro A. Cardenas (UC Santa Cruz), Murat Kantarcioglu (UT Dallas), Jonathan Katz (GMU)

  2. • 20th century: computers were brains without senses; they only knew what we told them.
   • There is more information in the world than what people can type on a keyboard.
   • 21st century: computers sense things, e.g., the GPS we take for granted in our phones.
   • Kevin Ashton (British entrepreneur) coined the term IoT in 1999.

  3. New Privacy Concerns

  4. In Addition to Privacy, There Is Another Problem: Data Trustworthiness

  5. We Need to Provide 3 Properties
   1. Classical Utility
      • Usable statistics
      • The reason for data collection
   2. Privacy
      • Protect consumer data
   3. Security (this work)
      • Trustworthy data
      • Detect data poisoning
      • Different from classical utility because this is an adversarial setting
   [Diagram: utility-privacy-security triangle; the classical tension is privacy vs. utility, this work adds the security corner.]

  6. New Adversary Model
   • Consumer data d_1, d_2, ..., d_n; query responses are protected by Differential Privacy (DP).
   • The classical adversary in DP is curious.
   • Our adversary is different: data poisoning, hiding its attacks in the DP noise.
   • Applies to both global and local DP (a sketch of the DP response step follows below).
   [Diagram: a database answering a query through DP (global model), and sensors 1..n each applying DP before aggregation (local model).]
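
Below is a minimal sketch (not from the talk) of the DP response step in this model, using the standard Laplace mechanism; the data values, sensitivity, and epsilon are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def laplace_mechanism(true_answer, sensitivity, epsilon):
    """Standard Laplace mechanism: add noise with scale b = sensitivity / epsilon."""
    b = sensitivity / epsilon
    return true_answer + rng.laplace(loc=0.0, scale=b)

# Hypothetical consumer database d_1, ..., d_n with each record in [0, 1],
# so a sum query has sensitivity 1.
data = np.array([0.5, 0.3, 0.7, 1.0])
dp_response = laplace_mechanism(data.sum(), sensitivity=1.0, epsilon=0.1)
print(dp_response)
```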

  7. Adversary Goals
   • Intelligently poison the data in a way that is hard to detect (hide the attack in the DP noise).
   • Achieve maximum damage to the utility of the system (deviate the estimate as much as possible).
   • Classical DP: the response is Ȳ ← M(D), with Ȳ ∼ f_0. Attack: report Y_a ∼ f_a instead of Ȳ.
   • Attack goals as a multi-criteria optimization (its two ingredients are evaluated in the sketch below):

       max_{f_a}  E[Y_a]
       s.t.  D_KL(f_a ‖ f_0) ≤ γ,   f_a ∈ F
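
As a rough numerical illustration of this formulation (not from the talk), the snippet below evaluates the attacker's objective E[Y_a] and the detectability constraint D_KL(f_a ‖ f_0) for one candidate attack density; the Laplace baseline, the mean-shifted candidate, and the grid are illustrative choices.

```python
import numpy as np

# Grid over the response space Omega (truncated for numerics).
y = np.linspace(-200, 200, 400001)
dy = y[1] - y[0]

theta, b = 0.0, 10.0                                     # illustrative true answer and DP noise scale
f0 = np.exp(-np.abs(y - theta) / b) / (2 * b)            # baseline density of the DP response
fa = np.exp(-np.abs(y - (theta + 5.0)) / b) / (2 * b)    # candidate attack density: shifted mean

expected_gain = np.sum(y * fa) * dy      # attacker objective E[Y_a]
kl = np.sum(fa * np.log(fa / f0)) * dy   # detectability D_KL(f_a || f_0)
print(expected_gain, kl)                 # the candidate is feasible only if kl <= gamma
```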

  8. Functional Optimization Problem
   • We have to find a probability distribution: a probability density function f_a.
   • Among all possible continuous functions, as long as ∫_{r∈Ω} f_a(r) dr = 1.
   • What is the shape of f_a?

  9. Solution: Variational Methods
   • Variational methods are a useful tool to find the shape of functions or the structure of matrices.
   • They replace the function or matrix optimization problem with a parameterized perturbation of the function or matrix.
   • We can then optimize with respect to the parameter to find the "shape" of the function/matrix.
   • The Lagrange multipliers give us the final parameters of the function.

  10. Solution
   Maximize    ∫_{r∈Ω} r f_a(r) dr
   Subject to:  ∫_{r∈Ω} f_a(r) ln( f_a(r) / f_0(r) ) dr ≤ γ,   ∫_{r∈Ω} f_a(r) dr = 1.
   Auxiliary function:  q(r, α) = f_a*(r) + α p(r).
   Lagrangian:
     L(α) = ∫_{r∈Ω} r q(r, α) dr + κ_1 ( ∫_{r∈Ω} q(r, α) ln( q(r, α) / f_0(r) ) dr − γ ) + κ_2 ( ∫_{r∈Ω} q(r, α) dr − 1 )
   Solution (computed numerically in the sketch below):
     f_a*(y) = f_0(y) e^{y/κ_1} / ∫_{r∈Ω} f_0(r) e^{r/κ_1} dr,   where κ_1 is the solution to D_KL(f_a* ‖ f_0) = γ.
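
One way to compute this solution numerically (not the authors' code): tilt the baseline density by e^{y/κ_1}, renormalize, and bisect on κ_1 until the tilted density meets the KL budget γ. The Laplace baseline, the value of γ, and the grid are illustrative assumptions.

```python
import numpy as np

y = np.linspace(-200, 200, 400001)
dy = y[1] - y[0]
theta, b, gamma = 0.0, 10.0, 0.1                 # illustrative baseline parameters and KL budget
f0 = np.exp(-np.abs(y - theta) / b) / (2 * b)    # baseline DP response density (Laplace)

def tilted(kappa1):
    """f_a*(y) = f0(y) * exp(y / kappa1) / normalizer, the form given on the slide."""
    w = f0 * np.exp(y / kappa1)
    return w / (np.sum(w) * dy)

def kl(fa):
    return np.sum(fa * np.log(fa / f0)) * dy

# Bisection on kappa1 > b: a larger kappa1 means a milder tilt and hence a smaller KL.
lo, hi = b * 1.001, 1e6
for _ in range(80):
    mid = 0.5 * (lo + hi)
    if kl(tilted(mid)) > gamma:
        lo = mid      # tilt too aggressive for the budget, increase kappa1
    else:
        hi = mid
kappa1 = 0.5 * (lo + hi)
fa_star = tilted(kappa1)
print(kappa1, kl(fa_star), np.sum(y * fa_star) * dy)   # multiplier, KL ~= gamma, attacker's E[Y_a]
```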

  11. Least-Favorable Laplace Attack
   [Figure: a database of user records (values 0.5, 0.3, 0.7, 1) answering a query through DP aggregation; distributions of the possible private response vs. the possible compromised response.]
   Baseline DP response:   f_0(y) = (1/(2b)) e^{−|y−θ|/b}
   Least-favorable attack:  f_a*(y) = ((κ_1² − b²) / (2 b κ_1²)) e^{−|y−θ|/b + (y−θ)/κ_1}
   where κ_1 is the solution to  2b² / (κ_1² − b²) + ln(1 − b²/κ_1²) = γ  (solved in the sketch below).
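
For the Laplace case, the condition on κ_1 above can be solved directly; below is a small sketch with an illustrative scale b and a few illustrative budgets γ, using SciPy's brentq root finder as an assumed dependency.

```python
import numpy as np
from scipy.optimize import brentq

b = 10.0   # illustrative scale of the Laplace DP noise

# kappa_1 solves  2 b^2 / (kappa_1^2 - b^2) + ln(1 - b^2 / kappa_1^2) = gamma.
def solve_kappa1(gamma):
    gap = lambda k: 2 * b**2 / (k**2 - b**2) + np.log(1 - b**2 / k**2) - gamma
    return brentq(gap, b * (1 + 1e-9), 1e9)

for gamma in (0.01, 0.1, 1.0):
    print(gamma, solve_kappa1(gamma))   # a smaller KL budget gives a larger kappa_1, i.e. a milder tilt
```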

  12. Example: Traffic Flow Estimation
   We use loop detector data from California:
   - Vehicle count
   - Occupancy

  13. Classical Bad Data Detection (BDD) in Traffic Flow Estimation
   [Figure: freeway cells i−1, i, i+1 with loop detectors; sensor readings in the roadside cabinet pass through DP and a prediction-based BDD at the Traffic Management Center (TMC).]
   Prediction used by the BDD (sketched in code below):
     ŷ_i(k+1) = ŷ_i(k) + (T / l_i) ( F_i^in(k) − F_i^out(k) ) + Q_i ( y_i(k) − ŷ_i(k) )
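
The following is a rough sketch (not the authors' implementation) of the prediction-based bad data detection this slide describes: one observer step in the form above, with a residual threshold as the alarm rule. All constants and readings are illustrative assumptions.

```python
T, l_i, Q_i = 30.0, 0.5, 0.3   # illustrative sample period, cell length, observer gain
tau = 10.0                      # illustrative residual threshold for the BDD alarm

def observer_step(y_hat, y_meas, f_in, f_out):
    """One step of the slide's predictor:
    y_hat(k+1) = y_hat(k) + (T / l_i) * (F_in(k) - F_out(k)) + Q_i * (y(k) - y_hat(k))."""
    return y_hat + (T / l_i) * (f_in - f_out) + Q_i * (y_meas - y_hat)

def bad_data_alarm(y_meas, y_hat):
    """Flag a measurement whose residual exceeds the threshold."""
    return abs(y_meas - y_hat) > tau

# Hypothetical stream of (possibly DP-noised, possibly poisoned) readings for one cell.
# A real BDD would typically discard flagged readings instead of feeding them back.
y_hat = 20.0
for y_meas, f_in, f_out in [(21.0, 0.4, 0.4), (55.0, 0.4, 0.4), (22.0, 0.4, 0.4)]:
    if bad_data_alarm(y_meas, y_hat):
        print("alarm: suspicious reading", y_meas)
    y_hat = observer_step(y_hat, y_meas, f_in, f_out)
```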

  14. The Attack Can Hide in DP Noise and Cause a Larger Impact
   • Without DP, the attack is limited.
   • With DP, the attacker can lie more without detection.
   • Can we do better?

  15. Defense Against Adversarial (Adaptive) Distributions
   • Player 1 (defender) designs a classifier D ∈ S that minimizes Φ(D, A) (e.g., Pr[missed detection] subject to a fixed false-alarm rate).
     - Player 1 makes the first move.
   • Player 2 (attacker) has multiple strategies A ∈ F.
     - It makes its move after observing the move of the classifier.
   • Player 1 wants provable performance guarantees:
     - Once it selects D_o by minimizing Φ, it wants proof that no matter what the attacker does, Φ(D_o, A) < m.

  16. Defense in the Traffic Case
   We propose the new defense as a game between attacker and defender (comparison plots):
   • With the classical defense
   • With our defense

  17. Another Example: Sharing Electricity Consumption
   [Figure: attack impact S (MW) vs. level of privacy (ε), comparing the classical BDD with the DP-aware BDD at three detector settings (0.01, 0.02, 0.03).]

  18. Conclusions
   • Growing number of applications where we need to provide utility, privacy, and security; in particular, adversarial classification under differential privacy.
   • Various possible extensions:
     • Different quantifications of privacy loss (e.g., Rényi DP)
     • Other adversary models (e.g., noiseless privacy), etc.
   • Related work on DP and adversarial ML: certified robustness.

  19. Strategic Adversary + Defender
   • Player 1 designs a classifier D ∈ S minimizing Φ(D, A) (e.g., Pr[error]).
     - The defender makes the first move.
   • Player 2 (attacker) has multiple strategies A ∈ F.
     - The attacker makes its move after observing the move of the classifier.
   • Player 1 wants provable performance guarantees:
     - Once it selects D_o by minimizing Φ, it wants proof that no matter what the attacker does, Φ(D_o, A) < m.

  20. Strategy: Solve the Maximin Problem and Show Its Solution Equals the Minimax
   - For any finite, zero-sum game:
   - Minimax = Maximin = Nash equilibrium (saddle point), as in the linear-programming sketch below.
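
A brief sketch of why this works for a finite zero-sum game (illustrative payoff matrix, with SciPy's linprog as an assumed dependency): the defender's minimax mixed strategy is the solution of a linear program, and its value coincides with the attacker's maximin value at the saddle point.

```python
import numpy as np
from scipy.optimize import linprog

# Illustrative payoff matrix Phi[d, a]: defender's loss (e.g., miss probability)
# when the defender plays detector d and the attacker plays strategy a.
Phi = np.array([[0.20, 0.60, 0.35],
                [0.50, 0.25, 0.40],
                [0.45, 0.30, 0.30]])
m, n = Phi.shape

# Defender's minimax LP: choose a mixed strategy x over detectors and a value v
# such that the loss against every attacker strategy is at most v; minimize v.
c = np.concatenate([np.zeros(m), [1.0]])
A_ub = np.hstack([Phi.T, -np.ones((n, 1))])            # Phi^T x - v <= 0 for every column
b_ub = np.zeros(n)
A_eq = np.concatenate([np.ones(m), [0.0]]).reshape(1, -1)
b_eq = [1.0]                                           # x is a probability vector
bounds = [(0, None)] * m + [(None, None)]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
x_star, game_value = res.x[:m], res.x[m]
print(x_star, game_value)   # saddle-point defender strategy and game value (minimax = maximin)
```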

  21. Sequential Hypothesis Testing
   • Sequence of random variables X_1, X_2, ...
     - Honest sensors have X_1, X_2, ..., X_i distributed as f_0(X_1, X_2, ..., X_i) (defined by DP).
     - A tampered sensor has X_1, X_2, ..., X_i distributed as f_1(X_1, X_2, ..., X_i) (note that f_1 is unknown).
   • Collect enough samples i until we have enough information to make a decision!
     - D = (N, d_N), where N is the stopping time and d_N the decision.

  22. Sequential Probability Ratio Test (SPRT)
   The solution of this problem is the SPRT (sketched in code below):
     S_n = ln( f_1(x_1, ..., x_n) / f_0(x_1, ..., x_n) )
   Decide H_1 when S_n reaches the upper threshold U, decide H_0 when it reaches the lower threshold L, and remain undecided (keep sampling) while it stays in between.
   [Figure: a sample path of S_n over time between the thresholds L and U.]
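
Below is a small sketch of the SPRT described above (not the authors' implementation): the hypothesis densities, the Wald-style thresholds derived from target error rates alpha and beta, and the sample stream are all illustrative assumptions.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Illustrative hypotheses over the DP-noised readings:
# H0: honest sensor, Laplace(theta, b);  H1: tampered sensor, Laplace(theta + delta, b).
theta, b, delta = 0.0, 10.0, 5.0
f0 = stats.laplace(loc=theta, scale=b)
f1 = stats.laplace(loc=theta + delta, scale=b)

# Wald-style thresholds from a target false-alarm rate alpha and miss rate beta
# (a standard choice; the talk does not fix these values).
alpha, beta = 0.01, 0.01
U, L = np.log((1 - beta) / alpha), np.log(beta / (1 - alpha))

def sprt(samples):
    """Return (decision, stopping time): 'H1' once S_n >= U, 'H0' once S_n <= L."""
    S = 0.0
    for n, x in enumerate(samples, start=1):
        S += f1.logpdf(x) - f0.logpdf(x)   # S_n = ln f1(x_1..x_n) / f0(x_1..x_n)
        if S >= U:
            return "H1", n
        if S <= L:
            return "H0", n
    return "undecided", len(samples)

print(sprt(f1.rvs(size=1000, random_state=rng)))   # stream from a tampered sensor
```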
