Privacy-Preserving Inference in Crowdsourcing Systems (Liyao Xiang)




SLIDE 1

Privacy-Preserving Inference in Crowdsourcing Systems

Liyao Xiang
Supervisor: Baochun Li

Oct. 9, 2017

University of Toronto

SLIDE 2

Localization via Crowdsourcing

  • In a crowd, some users know their locations while others do not. Given distance observations between users, how can each user be localized?

[Figure: users A, B, and C linked by pairwise distances dAB, dAC, and dBC; unknown locations are marked "?".]

SLIDE 3

Localization via Crowdsourcing

  • Each user sends their prior estimates and distance observations to a central server, which returns the most likely position for each user.

[Figure: at time t, user i holds prior estimate $Z_{i,t}$ and user j holds prior estimate $Z_{j,t}$. User i uploads $Z_{i,t}$ and $D_{ij}$; user j uploads $Z_{j,t}$ and $D_{ji}$. The server runs the inference algorithm and returns $Z^*_{i,t}$ to user i and $Z^*_{j,t}$ to user j.]

  • What if users would like to keep their locations private?
SLIDE 4

Privacy-Preserving Localization

  • In a crowd, some users know their locations while others do not. Given distance observations between users, how can each user be localized without breaching privacy?

[Figure: users A, B, and C linked by pairwise distances dAB, dAC, and dBC; unknown locations are marked "?".]

SLIDE 6

Particle Representation

  • User's Location
  • A user's location is represented by a set of particles $Z_{i,t} = \{z_1, \dots, z_R\}$; the sets of all N users form $Z_t = \{Z_{1,t}, \dots, Z_{N,t}\}$.
  • At time t, the server finds the most likely distribution of $Z_t$ given $Z_{t-1}$ and D:

$$Z_t^* = \arg\max_{Z_t} P(Z_t \mid Z_{t-1}, D).$$

SLIDE 7

First Attempt

  • Encrypt all particles and run the inference in the encrypted domain.

However, the operations that can be performed on encrypted data are constrained.

SLIDE 8

Particle Representation

  • User's Location
  • A user's location is represented by a set of particles $Z_{i,t} = \{z_1, \dots, z_R\}$. Each particle is associated with a weight, $\{w_1, \dots, w_R\}$.
  • For example, if the location estimate is $\{z_1, z_2, z_3\}$ with probabilities $\{0.6, 0.2, 0.2\}$, then the location is more likely to be $z_1$ than $z_3$.
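The weighted-particle example can be made concrete with a small Python sketch. The 2-D coordinates and the helper names `most_likely` and `weighted_mean` are hypothetical; the slides do not say how a single point estimate is read off the particle set, so both options below are illustrations only.

```python
# Hypothetical 2-D particles z1, z2, z3 with the weights from the example.
particles = [(1.0, 2.0), (4.0, 6.0), (5.0, 1.0)]
weights = [0.6, 0.2, 0.2]


def most_likely(particles, weights):
    """Return the particle that carries the largest weight."""
    best = max(range(len(particles)), key=lambda r: weights[r])
    return particles[best]


def weighted_mean(particles, weights):
    """Return the weight-averaged position (weights are normalized first)."""
    total = sum(weights)
    x = sum(w * p[0] for p, w in zip(particles, weights)) / total
    y = sum(w * p[1] for p, w in zip(particles, weights)) / total
    return (x, y)


top = most_likely(particles, weights)
mean = weighted_mean(particles, weights)
```

Here `top` is the particle playing the role of z1, since it holds the weight 0.6.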

SLIDE 9

Particle Representation

  • Users upload each particle's weight, $\{E(w_1), \dots, E(w_R)\}$, and their distance observations to other users, E(D), in encrypted form.
  • The server updates each particle's weight.

SLIDE 10

Privacy-Preserving Inference

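  • The server computes partial information $c_{i,r}$ for each particle r of each user i (user j is observed by user i):

$$
\begin{aligned}
c_{i,r} &= \prod_{j \in N(i)} \prod_{s \in \{1,\dots,R\}} E_{pk}(\ln w_{j,s}) \cdot E_{pk}\big(d(z_{i,r}, z_{j,s})^2\big)^{-\frac{1}{2\sigma^2}} \cdot E_{pk}(D_{ij})^{\frac{d(z_{i,r}, z_{j,s})}{\sigma^2}} \cdot E_{pk}(D_{ij}^2)^{-\frac{1}{2\sigma^2}} \\
&= E_{pk}\Big[ \sum_{j \in N(i)} \sum_{s \in \{1,\dots,R\}} \big( \ln w_{j,s} - (d(z_{i,r}, z_{j,s}) - D_{ij})^2 / 2\sigma^2 \big) \Big].
\end{aligned}
$$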
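The product-and-exponent form of $c_{i,r}$ is what an additively homomorphic cryptosystem provides: multiplying ciphertexts adds the plaintexts, and raising a ciphertext to a power scales the plaintext. The slides do not name the scheme, so the sketch below assumes Paillier with a toy key, plus a fixed-point factor `SCALE` to handle real-valued exponents. The key size, `SCALE`, and every helper name are hypothetical choices for illustration, not the author's implementation.

```python
import math
import random

# Toy Paillier key built from two Mersenne primes (assumption: the slides do
# not name the cryptosystem; a key this small is for illustration only).
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(P - 1, Q - 1)
MU = pow(LAM, -1, N)                               # modular inverse, Python 3.8+
SCALE = 10**6                                      # fixed-point factor (assumption)


def encrypt(m: int) -> int:
    """Paillier encryption with generator g = N + 1."""
    r = random.randrange(1, N)
    return (1 + (m % N) * N) * pow(r, N, N2) % N2


def decrypt(c: int) -> int:
    """Paillier decryption; results above N/2 are read as negative numbers."""
    m = (pow(c, LAM, N2) - 1) // N * MU % N
    return m - N if m > N // 2 else m


def fx(x: float) -> int:
    """Encode a real number as a scaled integer."""
    return round(x * SCALE)


def partial_info(z_ir, neighbors, sigma):
    """Server side: build c_{i,r} from ciphertexts without decrypting anything.

    neighbors holds one tuple (enc_D, enc_D2, particles_j, enc_ln_w_j) per j in N(i).
    """
    c = encrypt(0)
    a = fx(-1.0 / (2 * sigma**2))
    for enc_D, enc_D2, particles_j, enc_ln_w_j in neighbors:
        for z_js, enc_ln_w in zip(particles_j, enc_ln_w_j):
            d = math.dist(z_ir, z_js)  # particle positions are plaintext here
            c = c * pow(enc_ln_w, SCALE, N2) % N2                # ln w_{j,s}
            c = c * pow(encrypt(fx(d * d)), a % N, N2) % N2      # -d^2 / (2 sigma^2)
            c = c * pow(enc_D, fx(d / sigma**2) % N, N2) % N2    # +d * D / sigma^2
            c = c * pow(enc_D2, a % N, N2) % N2                  # -D^2 / (2 sigma^2)
    return c


# Hypothetical data: one neighbor j with two particles.
sigma = 1.0
w_j, Z_j, D_ij = [0.7, 0.3], [(3.0, 4.0), (3.5, 4.5)], 5.2
neighbors = [(encrypt(fx(D_ij)), encrypt(fx(D_ij**2)), Z_j,
              [encrypt(fx(math.log(w))) for w in w_j])]
c = partial_info((0.0, 0.0), neighbors, sigma)

recovered = decrypt(c) / SCALE**2  # every term carries two fixed-point factors
expected = sum(math.log(w) - (math.dist((0.0, 0.0), z) - D_ij) ** 2 / (2 * sigma**2)
               for w, z in zip(w_j, Z_j))
```

Up to fixed-point rounding, `recovered` matches `expected`, which is the bracketed sum in the second line of the equation. The three exponentiated factors are the expansion of $-(d - D_{ij})^2 / 2\sigma^2$ into $-d^2/2\sigma^2 + d D_{ij}/\sigma^2 - D_{ij}^2/2\sigma^2$.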

SLIDE 11

Privacy-Preserving Inference

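  • With the secret key sk, user i decrypts $c_{i,r}$ and updates the weight $w_{i,r}$ of its particle r ($d_{js}$ is the calculated distance between particle s of user j and particle r of user i):

$$
\begin{aligned}
w_{i,r}^{k} &= w_{i,r}^{k-1} \exp[E_{sk}(c_{i,r})] \\
&= w_{i,r}^{k-1} \exp\Big[ \sum_{j \in N(i)} \sum_{s \in \{1,\dots,R\}} \big( \ln w_{j,s} - (d_{js} - D_{ij})^2 / 2\sigma^2 \big) \Big] \\
&= w_{i,r}^{k-1} \prod_{j \in N(i)} \prod_{s \in \{1,\dots,R\}} \exp\big( \ln w_{j,s} - (d_{js} - D_{ij})^2 / 2\sigma^2 \big) \\
&= w_{i,r}^{k-1} \prod_{j \in N(i)} \prod_{s \in \{1,\dots,R\}} w_{j,s} \cdot \exp\Big( -\frac{(d_{js} - D_{ij})^2}{2\sigma^2} \Big) \\
&\simeq w_{i,r}^{k-1} \prod_{j \in N(i)} \prod_{s \in \{1,\dots,R\}} \Pr(z_{i,r}, z_{j,s} \mid D_{ij,t}).
\end{aligned}
$$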
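The chain of equalities can be checked numerically in plaintext. The sketch below assumes the decrypted value is already available as an ordinary float; the numbers and the function names `update_weight` and `product_form` are hypothetical.

```python
import math


def update_weight(w_prev, c_plain):
    """w^k_{i,r} = w^{k-1}_{i,r} * exp(c), with c the decrypted partial information."""
    return w_prev * math.exp(c_plain)


def product_form(w_prev, w_j, d_js, D_ij, sigma):
    """The same update written as a product over one neighbor's particles."""
    out = w_prev
    for w, d in zip(w_j, d_js):
        out *= w * math.exp(-((d - D_ij) ** 2) / (2 * sigma**2))
    return out


# Hypothetical numbers for a single neighbor j with two particles.
w_j, d_js, D_ij, sigma = [0.7, 0.3], [5.0, 5.7], 5.2, 1.0
c_plain = sum(math.log(w) - (d - D_ij) ** 2 / (2 * sigma**2)
              for w, d in zip(w_j, d_js))

from_sum = update_weight(0.5, c_plain)
from_product = product_form(0.5, w_j, d_js, D_ij, sigma)
```

Both routes give the same new weight up to floating-point error, which is the step from the exponential of a sum to the product of Gaussian-shaped terms.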

SLIDE 12

Privacy-Preserving Localization with Crowdsourcing

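[Figure: protocol timeline. At time t, the user holds prior $Z_{i,t}$ and uploads $Z_{i,t}$, E(w), and E(D). The server runs inference. The user downloads $C_{i,t}$, decrypts it, and updates the prior with $Z^*_{i,t}$. At time t+1, the user holds prior $Z_{i,t+1}$, uploads $Z_{i,t+1}$, E(w), and E(D), and downloads $C_{i,t+1}$.]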

SLIDE 13

However, with R particles, an adversary can still guess the correct location with probability 1/R.

SLIDE 14

Data Perturbation

  • Idea: perturb $Z_{i,t} = \{z_1, \dots, z_R\}$ as $Y_{i,t} = \{y_1, \dots, y_R\}$.
  • Perturbation: add Gaussian noise $N(0, \sigma^2)$ to $Z_{i,t}$ such that the result satisfies location differential privacy.
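A minimal sketch of the perturbation step, assuming 2-D particles and independent $N(0, \sigma^2)$ noise on each coordinate (the slides only state that Gaussian noise is added). The function name `perturb` and the sample coordinates are hypothetical.

```python
import random


def perturb(particles, sigma, rng=random):
    """Add independent N(0, sigma^2) noise to each coordinate of each particle."""
    return [(x + rng.gauss(0.0, sigma), y + rng.gauss(0.0, sigma))
            for x, y in particles]


rng = random.Random(0)                      # seeded only to make the run repeatable
Z = [(1.0, 2.0), (4.0, 6.0), (5.0, 1.0)]    # hypothetical particle set Z_{i,t}
Y = perturb(Z, sigma=0.5, rng=rng)          # perturbed set Y_{i,t} that is reported
```

A larger `sigma` hides the particles better but distorts the distances the server later computes from `Y`.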

SLIDE 15

Privacy Definition

  • Location Differential Privacy: a mechanism M satisfies $(\epsilon, \delta)$-differential privacy iff for all $z, z'$ that are $d(z, z')$ apart,

$$\Pr[M(z) \in Y] \le e^{\epsilon} \Pr[M(z') \in Y] + \delta, \qquad \epsilon = \rho\, d^2(z, z') + 2\sqrt{\rho \log(1/\delta)}\, d(z, z'),$$

where $\rho$ is a constant specific to the perturbation mechanism we adopt.
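To see how the privacy parameter depends on distance, the formula for $\epsilon$ can be evaluated directly. The values of `rho` and `delta` below are hypothetical placeholders, since the slides do not give them, and `location_epsilon` is an illustrative name.

```python
import math


def location_epsilon(rho, delta, d):
    """epsilon = rho * d^2 + 2 * sqrt(rho * log(1/delta)) * d."""
    return rho * d**2 + 2.0 * math.sqrt(rho * math.log(1.0 / delta)) * d


rho, delta = 0.5, 1e-5                               # hypothetical constants
eps_near = location_epsilon(rho, delta, d=0.1)       # two nearby locations
eps_far = location_epsilon(rho, delta, d=1.0)        # two locations farther apart
```

`eps_near` comes out smaller than `eps_far`: the closer two locations are, the harder they are to tell apart from the perturbed output.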

SLIDE 16

Interpretation of Privacy Definition

  • Location Differential Privacy: the projected distributions of all the points within the same dotted circle are at most $\epsilon$ apart from each other.
  • As the distance between two locations gets smaller, $\epsilon$ gets smaller, indicating that it is harder to distinguish the two locations, i.e., a higher privacy level.

[Figure: locations z, z', and z'' with distances d(z, z') and d(z, z''); their projected distributions M(z)(Y), M(z')(Y), and M(z'')(Y) are shown with dotted circles labeled ε1 and ε2, where ε1 < ε2.]

SLIDE 17

Privacy Definition

  • User Differential Privacy: if we report $Z = (z_1, \dots, z_R)$ as $Y = (y_1, \dots, y_R)$, then the probability of reporting Y given Z is

$$\Pr[M(Z) \in Y] = \prod_{i} \Pr[M(z_i) \in Y].$$

The user enjoys $(\epsilon', \delta)$-differential privacy with

$$\epsilon' = \rho R\, d^2(Z, Z') + 2\sqrt{\rho \log(1/\delta)\, R\, d^2(Z, Z')}.$$
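The user-level bound can be evaluated the same way as the per-location one. This sketch assumes the square root covers the whole product $\rho \log(1/\delta) R d^2(Z, Z')$, which is how the garbled slide text is read here; `rho`, `delta`, and the name `user_epsilon` are hypothetical.

```python
import math


def user_epsilon(rho, delta, R, d2):
    """epsilon' = rho*R*d2 + 2*sqrt(rho*log(1/delta)*R*d2), with d2 = d^2(Z, Z')."""
    rho_total = rho * R * d2
    return rho_total + 2.0 * math.sqrt(rho_total * math.log(1.0 / delta))


rho, delta, d2 = 0.5, 1e-5, 0.01      # hypothetical constants
eps_50 = user_epsilon(rho, delta, R=50, d2=d2)
eps_100 = user_epsilon(rho, delta, R=100, d2=d2)
```

Reporting more particles raises $\epsilon'$, so `eps_100` exceeds `eps_50`: each additional perturbed particle reveals a little more about the user.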

SLIDE 18

Perturbed Private Inference

  • Collecting Y, the server computes the distance between each pair of perturbed particles as:

$$\tilde{d}(y, y') = \sqrt{\|y - y'\|_2^2 - 4\sigma^2}.$$
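The $4\sigma^2$ correction has a simple origin if the particles are 2-D and each coordinate receives independent $N(0, \sigma^2)$ noise (an assumption; the slides do not state the dimension). Then $y - y'$ has variance $2\sigma^2$ per coordinate, so $\|y - y'\|_2^2$ exceeds $\|z - z'\|_2^2$ by $4\sigma^2$ on average. The Monte Carlo sketch below checks this for the squared distance; the function names are hypothetical, and clamping at zero before the square root is an added safeguard not mentioned in the slides.

```python
import math
import random


def debiased_sq_distance(y, y2, sigma):
    """||y - y'||^2 - 4*sigma^2, the quantity under the square root."""
    return (y[0] - y2[0]) ** 2 + (y[1] - y2[1]) ** 2 - 4.0 * sigma**2


def debiased_distance(y, y2, sigma):
    """The estimate d~(y, y'); clamping at zero is an added assumption."""
    return math.sqrt(max(0.0, debiased_sq_distance(y, y2, sigma)))


rng = random.Random(0)
z, z2, sigma = (0.0, 0.0), (3.0, 4.0), 1.0   # true distance 5, squared distance 25
samples = []
for _ in range(50_000):
    y = (z[0] + rng.gauss(0, sigma), z[1] + rng.gauss(0, sigma))
    y2 = (z2[0] + rng.gauss(0, sigma), z2[1] + rng.gauss(0, sigma))
    samples.append(debiased_sq_distance(y, y2, sigma))

mean_sq = sum(samples) / len(samples)        # should land near 25
```

With `sigma = 0` the estimate reduces to the ordinary Euclidean distance between the two particles.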

SLIDE 19

How can we guarantee that the inference result is the same as in the unperturbed case?

SLIDE 20

Privacy and Utility Analysis

  • Utility results: We proved that $\tilde{d}(y, y')$ is an unbiased estimator of $d(z, z')$.
  • Privacy guarantee: We proved that our perturbation scheme satisfies location differential privacy and user differential privacy. Compared to previous work, we improve the privacy level by $\sqrt{R}$ at the same utility level.

SLIDE 21

Performance Evaluation

  • Overhead

[Figure: "Convergence of the Particle Distribution". x-axis: No. of Iterations (5 Iterations × 15 Timeslots); y-axis: Highest Particle Weight.]

[Figure: "Running Time of the MAP Inference". x-axis: Number of Users; y-axis: Average Running Time (ms); curves for R = 50, R = 75, and R = 100.]

SLIDE 22

Performance Evaluation

  • Simulation results using the random waypoint (RWP) model.

[Figure: "Position Error of 20 Users (R = 100)". CDF of position error (m) for the unperturbed case and σ = 0.2, 0.7, 1.0, 1.5, 2.3.]

[Figure: "Position Error of 20 Users (σ = 0.5)". CDF of position error (m) for R = 50, 75, 100, 125, 150.]

SLIDE 23

Performance Evaluation

  • Comparison experiment and real-world experimental results.

[Figure: "Comparison with Hilbert Curves on RWP Model". CDF of position error (m) for Hilbert Curves (n = 64), Hilbert Curves (n = 512), Private Inference (σ = 5.0), and Private Inference (σ = 10.0).]

[Figure: "Average Position Error of 7 Users in Different Settings". Average position error (m) versus sequence number for the unperturbed case and for (σ = 0.2, ε = 23.23), (σ = 0.7, ε = 4.09), (σ = 1.0, ε = 2.65), (σ = 2.3, ε = 1.03).]

SLIDE 24

Thank you!
