FedSel: Federated SGD under Local Differential Privacy with Top-k Dimension Selection
Ruixuan Liu1, Yang Cao2, Masatoshi Yoshikawa2, Hong Chen1
1Renmin University of China, 2Kyoto University
DASFAA, 2020
Federated Learning Overview
Sensitive information: age, job, location, etc.
Federated Learning Privacy Vulnerabilities
age, job, location, etc.
Possible privacy attacks on the shared updates:
Membership inference: “Has the data of a target victim been used to train the model?”
Model inversion: given a gender classifier, “What does a male look like?”
Property inference: given a gender classifier, “What is the race of the people in Bob’s photos?”
Differential Privacy for Federated Learning
+noise
The server adds noise to the aggregated updates.
But this central-DP approach requires a trusted server.
Local Differential Privacy for Federated Learning
+noise +noise +noise
Each user perturbs its update locally before uploading, so no trusted server is needed.
LDP is a natural privacy definition for FL.
Local Differential Privacy for Federated Learning
For a d-dimensional vector, the utility metric is the worst-case mean estimation error max_{1≤j≤d} |v̂_j − v_j|.
If the local privacy budget ε is split across the d dimensions [1], each dimension receives only ε/d, and the resulting error O(d·√(log d) / (ε·√m)) for m users is large.
Challenges of LDP in Federated Learning
[1] Wang N, Xiao X, Yang Y, et al. Collecting and analyzing multidimensional data with local differential privacy[C]//2019 IEEE 35th International Conference on Data Engineering (ICDE). IEEE, 2019: 638-649.
An asymptotically optimal strategy [1]: each user samples only k = max(1, min(d, ⌊ε/2.5⌋)) dimensions and reports each with budget ε/k, which reduces the worst-case error to O(√(d·log d) / (ε·√m)).
Typical orders of magnitude:
d: 100 to 1,000,000s of dimensions
m: 100 to 1,000s of users per round
ε: a smaller privacy budget means stronger privacy
The dimension curse!
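The split-versus-sample comparison can be checked numerically. A minimal sketch, assuming the k = max(1, min(d, ⌊ε/2.5⌋)) rule from [1]; the function names are illustrative:

```python
import math

def sampled_dimensions(d: int, eps: float) -> int:
    # Number of dimensions each user reports under the sampling
    # strategy of Wang et al. [1]: k = max(1, min(d, floor(eps / 2.5))).
    return max(1, min(d, math.floor(eps / 2.5)))

def per_report_budget(d: int, eps: float, split: bool = False) -> float:
    # Privacy budget spent on each reported dimension.
    if split:
        return eps / d                       # naive: eps split over all d dimensions
    return eps / sampled_dimensions(d, eps)  # sampling: eps split over only k

# With d = 100,000 and eps = 1, splitting leaves a uselessly small budget
# per dimension, while sampling reports k = 1 dimension with the full budget.
print(per_report_budget(100_000, 1.0, split=True))   # 1e-05
print(per_report_budget(100_000, 1.0, split=False))  # 1.0
```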
Our Intuition
A common bottleneck: the dimension curse also limits communication-efficient training.
Data are partitioned and distributed to accelerate training, and gradient vectors are transmitted between separate workers; the communication cost is d × (bits to represent one real value).
Communication costs are reduced by transmitting only the important dimensions.
Dimensions with larger absolute magnitudes are more important ⇒ efficient dimension reduction for LDP.
Our Intuition
A common focus: selecting the Top-k dimensions.
Sparsification trades communication resources for utility / learning performance; LDP trades privacy budget for utility / learning performance.
Two-stage Framework: FedSel
Local vector = Top-k information + value information
Private selection + Value Perturbation
[Workflow] In each round:
1. Pull: each user pulls the current parameters from the server.
2. The user calculates gradients with its local data and updates the local accumulated vector v.
3. The user selects a Top-k dimension privately, perturbs the selected value, and pushes the noisy vector to the server.
4. Update: the server averages the received gradients and updates the global parameters.
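One client round of the two-stage workflow can be sketched as follows. This is a minimal illustration, not the authors' implementation: the budget split alpha, the clipping bound C, and the use of Laplace noise as a stand-in for the 1-D value-perturbation mechanism are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def fedsel_local_round(theta, grad_fn, accum, eps, alpha=0.5, C=1.0):
    """One FedSel client round (illustrative sketch).

    theta   : global parameters pulled from the server
    grad_fn : computes the local gradient at theta
    accum   : locally accumulated gradient vector (residual of past rounds)
    eps     : local budget, split as eps1 = alpha*eps for selection
              and eps2 = (1 - alpha)*eps for value perturbation (assumed split)
    """
    accum = accum + grad_fn(theta)          # update the accumulated vector
    eps1, eps2 = alpha * eps, (1 - alpha) * eps

    # Stage 1: private Top-k selection (exponential mechanism over ranks)
    ranks = np.argsort(np.argsort(np.abs(accum))) + 1   # 1 = smallest magnitude
    probs = np.exp(eps1 * ranks / (2 * (len(accum) - 1)))
    probs /= probs.sum()
    j = rng.choice(len(accum), p=probs)

    # Stage 2: perturb the selected value (clip, then add Laplace noise;
    # Laplace stands in for the paper's 1-D LDP mechanism)
    v = np.clip(accum[j], -C, C)
    noisy = v + rng.laplace(scale=2 * C / eps2)

    accum[j] = 0.0                          # reset the reported dimension
    return j, noisy, accum                  # push (dimension, noisy value)
```

The server side then averages the sparse (dimension, value) reports from the m users of the round into the global update.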
Methods-Exponential Mechanism (EXP)
1. Sort the dimensions by magnitude; the rank of dimension j is denoted r_j ∈ {1, …, d} (largest magnitude → rank d).
2. Sample a dimension unevenly, with probability proportional to exp(ε₁ · r_j / (2Δ)), where Δ is the sensitivity of the rank score.
[Figure: 6-dimensional example; values ranked 1, 3, 6, 2, 4, 5]
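A minimal sketch of EXP, assuming the score of a dimension is its magnitude rank and the score sensitivity is Δ = d − 1; the function name and the demo vector are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)

def exp_select(v, eps1):
    # Exponential-mechanism Top-1 selection over magnitude ranks.
    # Dimension j gets score r_j = its rank when sorting |v| ascending
    # (largest magnitude -> rank d); sampling probability is proportional
    # to exp(eps1 * r_j / (2 * Delta)) with Delta = d - 1 (assumed).
    d = len(v)
    ranks = np.argsort(np.argsort(np.abs(v))) + 1       # ranks 1..d
    weights = np.exp(eps1 * ranks / (2 * (d - 1)))
    return rng.choice(d, p=weights / weights.sum())

# The largest-magnitude dimension is the most likely pick, but every
# dimension keeps nonzero probability: that residual randomness is what
# provides deniability about which coordinate was really the Top-1.
v = np.array([0.1, -0.05, 2.0, 0.3, -0.2, 0.15])
picks = [exp_select(v, eps1=4.0) for _ in range(2000)]
print(picks.count(2) / 2000)    # dimension 2 is chosen most often
```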
Methods-Perturbed Encoding Mechanism (PE)
1. Sort by magnitude and encode the Top-k status as a bit vector s ∈ {0,1}^d: s_j = 1 if dimension j is in the Top-k.
2. For each dimension, retain the status bit with a larger probability and flip it with a smaller probability (randomized response).
3. Sample one dimension from the set of dimensions whose perturbed bit is 1.
[Figure: 6-dimensional example; values ranked 1, 3, 6, 2, 4, 5]
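A sketch of PE. The keep probability e^{ε₁}/(e^{ε₁}+1) below is the standard binary randomized-response choice; the paper calibrates this probability in its own analysis, so treat the exact constant, like the function name, as an assumption:

```python
import numpy as np

rng = np.random.default_rng(2)

def pe_select(v, k, eps1):
    # Perturbed-Encoding Top-k selection (illustrative sketch).
    d = len(v)
    s = np.zeros(d, dtype=bool)
    s[np.argsort(np.abs(v))[-k:]] = True          # true Top-k status bits
    p = np.exp(eps1) / (np.exp(eps1) + 1)         # keep probability (assumed)
    keep = rng.random(d) < p
    s_noisy = np.where(keep, s, ~s)               # flip each bit w.p. 1 - p
    ones = np.flatnonzero(s_noisy)
    if len(ones) == 0:                            # rare: all bits flipped to 0
        return int(rng.integers(d))
    return int(rng.choice(ones))                  # sample from perturbed Top set
```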
Methods-Perturbed Sampling Mechanism (PS)
1. Sort by magnitude and determine the Top-k status of each dimension, as in PE.
2. Sample a dimension directly: from the Top-k dimension set with a larger probability, or from the non-top dimension set with a smaller probability.
[Figure: 6-dimensional example; values ranked 1, 3, 6, 2, 4, 5]
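A sketch of PS, again assuming a binary randomized-response keep probability e^{ε₁}/(e^{ε₁}+1); the exact calibration in the paper may differ:

```python
import numpy as np

rng = np.random.default_rng(3)

def ps_select(v, k, eps1):
    # Perturbed-Sampling Top-k selection (illustrative sketch).
    # Instead of perturbing all d bits, sample the reported dimension
    # directly: from the true Top-k set with probability p, otherwise
    # from the non-top set.
    order = np.argsort(np.abs(v))
    topk, nontop = order[-k:], order[:-k]
    p = np.exp(eps1) / (np.exp(eps1) + 1)         # keep probability (assumed)
    if rng.random() < p:
        return int(rng.choice(topk))
    return int(rng.choice(nontop))
```

A single coin flip replaces d per-bit perturbations, which is why PS is cheaper than PE.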
Empirical results
[Figures: accuracy comparison against the baseline mechanism for perturbing one dimension]
What we gain from dimension reduction is much larger than what we lose to private and efficient Top-k selection.
Summary
Conclusion
Takeaway
Future work