Releasing Search Queries and Clicks Privately. Arne Bayer, July 24, 2017.



slide-1
SLIDE 1

A new Approach Releasing Data Select-Queries Noisy Counts Results

Releasing Search Queries and Clicks Privately

Arne Bayer July 24, 2017

Arne Bayer Releasing Search Queries and Clicks Privately

slide-2
SLIDE 2

Table of Contents

1. A new Approach
2. Releasing Data
   - Releasing Algorithm
3. Select-Queries
   - $q^* \in D_1$
   - $q^* \notin D_1$
   - Arbitrary d
4. Noisy Counts
5. Results

slide-3
SLIDE 3

Releasing Lists

- Releasing anonymized lists
- Tracing entries back to individual users remains possible

slide-4
SLIDE 4

Graphical Approach

Let G(E, V) be a graph where:

- Vertices represent visited sites or search queries
- Edges represent links between sites

slide-5
SLIDE 5

[Figure: example graph with vertices labelled P, B, M, D, C, L and numeric labels]


slide-13
SLIDE 13

Releasing Algorithm: Algorithm Parameters

- $D$: search log
- $b$, $b_c$, $b_q$: noise parameters, each used as the scale parameter of a Laplace distribution
  - $b$: general noise
  - $b_c$: noise on clicked URLs
  - $b_q$: noise on queries
- $d$: maximum number of queries kept per user
- $d_c$: maximum number of URL clicks kept per user
- $M(q, D)$: number of times query $q$ appears in a given search log $D$
- $K$: minimum threshold of occurrences; a query must satisfy $M(q, D) + \mathrm{Lap}(b) > K$
- $K > d$: the per-user limit is smaller than the threshold


slide-19
SLIDE 19

Releasing Algorithm: Algorithm A

1. Input: $D, d, d_c, b, b_q, b_c, K$
2. Limit User Activity: keep only $d$ entries per user in $D$
3. Count Queries: calculate the absolute frequency $M(q, D)$ of every query $q$
4. Select-Queries: add to $Q$ all queries whose noisy count exceeds $K$: $Q \leftarrow \{q : M(q, D) + \mathrm{Lap}(b) > K\}$
5. Get-Query-Counts: add noise to the counts of the selected queries: $\langle q,\ M(q, D) + \mathrm{Lap}(b_q) \rangle$
6. Get-Click-Counts: calculate the top ten clicks and add noise: $\langle q,\ u,\ \#u_q + \mathrm{Lap}(b_c) \rangle$
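The steps of Algorithm A can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the author's implementation: the log format (a list of `(user, query, url)` tuples in time order), the function names `laplace` and `release`, and the reading of "top ten clicks" as the ten most-clicked URLs per selected query are all hypothetical choices made for the example. The slides also do not say how the click limit $d_c$ is applied; here surplus clicks are simply dropped.

```python
import random
from collections import Counter


def laplace(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace distribution with location 0 and that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def release(log, d, d_c, b, b_q, b_c, K, rng=None):
    """Sketch of steps 2-6. `log` is an iterable of (user, query, url)
    tuples in time order; url is None when nothing was clicked."""
    rng = rng or random.Random()

    # Step 2: keep at most d queries and d_c clicks per user (assumed rule).
    kept, n_queries, n_clicks = [], Counter(), Counter()
    for user, query, url in log:
        if n_queries[user] >= d:
            continue
        n_queries[user] += 1
        if url is not None:
            if n_clicks[user] < d_c:
                n_clicks[user] += 1
            else:
                url = None  # the query is kept, the surplus click is dropped
        kept.append((query, url))

    # Step 3: M(q, D) for every query q.
    M = Counter(query for query, _ in kept)

    # Step 4: select queries whose noisy count exceeds K.
    Q = {q for q, m in M.items() if m + laplace(b, rng) > K}

    # Step 5: noisy query counts, drawn with fresh noise of scale b_q.
    query_counts = {q: M[q] + laplace(b_q, rng) for q in Q}

    # Step 6: noisy click counts for the ten most-clicked URLs per query
    # (one possible reading of "top ten clicks").
    clicks = {q: Counter() for q in Q}
    for query, url in kept:
        if query in Q and url is not None:
            clicks[query][url] += 1
    click_counts = {
        (q, u): c + laplace(b_c, rng)
        for q in Q
        for u, c in clicks[q].most_common(10)
    }
    return query_counts, click_counts
```

Note that selection (step 4) and the released count (step 5) use independent noise draws, which is why the algorithm takes separate scales $b$ and $b_q$.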


slide-22
SLIDE 22

Releasing Algorithm: Privacy Guarantee

$$\Pr[A(D_1) \in \hat{D}] \le \alpha \Pr[A(D_2) \in \hat{D}] + \delta_1 \qquad (1)$$

$$\Pr[A(D_2) \in \hat{D}] \le \alpha \Pr[A(D_1) \in \hat{D}] + \delta_1 \qquad (2)$$

$$\epsilon_{alg} = d \cdot \ln(\alpha) + d/b_q + d_c/b_c$$

$$\delta_{alg} = \frac{d}{2} \exp\left(\frac{d-K}{b}\right)$$

with

$$\alpha = \max\left(e^{1/b},\ 1 + \frac{1}{2e^{(K-1)/b} - 1}\right)$$
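To get a feel for these expressions, the following Python sketch evaluates $\alpha$, $\epsilon_{alg}$ and $\delta_{alg}$ for given parameters. The function names are hypothetical and the example values in the usage note are illustrative only.

```python
import math


def alpha(b, K):
    # alpha = max(e^(1/b), 1 + 1 / (2 e^((K-1)/b) - 1))
    return max(math.exp(1.0 / b),
               1.0 + 1.0 / (2.0 * math.exp((K - 1.0) / b) - 1.0))


def epsilon_alg(d, d_c, b, b_q, b_c, K):
    # Selection cost d*ln(alpha) plus the two noisy-count costs.
    return d * math.log(alpha(b, K)) + d / b_q + d_c / b_c


def delta_alg(d, b, K):
    # delta_alg = (d / 2) * exp((d - K) / b)
    return 0.5 * d * math.exp((d - K) / b)
```

For example, with `d = 20`, `b = 8.69` and `K = 140`, the first term of the maximum dominates, so the selection step costs `d * ln(alpha) = d / b`, roughly 2.30, and `delta_alg` comes out close to `1e-5`.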


slide-26
SLIDE 26

Releasing Algorithm: Steps to prove that privacy is guaranteed

- Step 4, Select-Queries
  - Limit User: d = 1
  - Arbitrary d
- Step 5, Get-Query-Counts
- Step 6, Get-Click-Counts


slide-31
SLIDE 31

Part 1: d = 1

- Limit of 1 query per user
- Let $D_1$ and $D_2$ differ in exactly one query $q^*$. Two cases:
  - $q^* \in D_1$ and $q^* \in D_2$
  - $q^* \notin D_1$, but $q^* \in D_2$
- $\hat{D} \subseteq \mathrm{Range}(A)$


slide-36
SLIDE 36

Select-Queries: case $q^* \in D_1$

$$R_{12} = \frac{\Pr[A(D_1) \in \hat{D}]}{\Pr[A(D_2) \in \hat{D}]}$$

Split $\hat{D}$:

- $\hat{D}^+$ contains the outputs with $q^*$
- $\hat{D}^-$ contains the outputs without $q^*$

$$R_{12} = \frac{\Pr[A(D_1) \in \hat{D}^+] + \Pr[A(D_1) \in \hat{D}^-]}{\Pr[A(D_2) \in \hat{D}^+] + \Pr[A(D_2) \in \hat{D}^-]}$$

Using the properties of ratios (for positive terms, $\frac{a+b}{c+d} \le \max\left(\frac{a}{c}, \frac{b}{d}\right)$):

$$R_{12} \le \max\left(\frac{\Pr[A(D_1) \in \hat{D}^+]}{\Pr[A(D_2) \in \hat{D}^+]},\ \frac{\Pr[A(D_1) \in \hat{D}^-]}{\Pr[A(D_2) \in \hat{D}^-]}\right)$$


slide-41
SLIDE 41

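Select-Queries: case $q^* \in D_1$

For the ratio $\frac{\Pr[A(D_1) \in \hat{D}^+]}{\Pr[A(D_2) \in \hat{D}^+]}$:

$$\Pr[A(D_1) \in \hat{D}^+] = \Pr[M(q^*, D_1) + \mathrm{Lap}(b) > K]$$

$$\frac{\Pr[A(D_1) \in \hat{D}^+]}{\Pr[A(D_2) \in \hat{D}^+]} = \frac{\Pr[M(q^*, D_1) + \mathrm{Lap}(b) > K]}{\Pr[M(q^*, D_2) + \mathrm{Lap}(b) > K]}$$

Analogously for $\hat{D}^-$:

$$\frac{\Pr[A(D_1) \in \hat{D}^-]}{\Pr[A(D_2) \in \hat{D}^-]} = \frac{\Pr[M(q^*, D_1) + \mathrm{Lap}(b) < K]}{\Pr[M(q^*, D_1) + 1 + \mathrm{Lap}(b) < K]}$$

With $M(q^*, D_2) = M(q^*, D_1) + 1$:

$$R_{12} \le \max\left(\frac{\Pr[M(q^*, D_1) + \mathrm{Lap}(b) > K]}{\Pr[M(q^*, D_1) + 1 + \mathrm{Lap}(b) > K]},\ \frac{\Pr[M(q^*, D_1) + \mathrm{Lap}(b) < K]}{\Pr[M(q^*, D_1) + 1 + \mathrm{Lap}(b) < K]}\right)$$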


slide-46
SLIDE 46

Select-Queries: case $q^* \in D_1$

Properties of Laplace ratios, with $r = \mathrm{Lap}(b)$:

$$1 \le \frac{\Pr[r < c+1]}{\Pr[r < c]} \le e^{1/b} \qquad \text{and} \qquad 1 \ge \frac{\Pr[r > c+1]}{\Pr[r > c]} \ge e^{-1/b}$$

$$R_{12} \le \max\left(\frac{\Pr[M(q^*, D_1) + \mathrm{Lap}(b) > K]}{\Pr[M(q^*, D_1) + 1 + \mathrm{Lap}(b) > K]},\ \frac{\Pr[M(q^*, D_1) + \mathrm{Lap}(b) < K]}{\Pr[M(q^*, D_1) + 1 + \mathrm{Lap}(b) < K]}\right)$$

$$R_{12} \le \max\left(1,\ e^{1/b}\right) = e^{1/b}$$

Similarly:

$$R_{21} = \frac{\Pr[A(D_2) \in \hat{D}]}{\Pr[A(D_1) \in \hat{D}]} \le e^{1/b}$$

⇒ Inequalities (1) and (2) hold in this case with $\alpha = e^{1/b}$ and $\delta_1 = 0$.
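The two Laplace ratio properties used above can be checked numerically from the closed-form distribution function of $\mathrm{Lap}(b)$. The sketch below is only a sanity check over a grid of values, not a proof; the helper names `cdf`, `sf` and `check` are made up for this example.

```python
import math


def cdf(x, b):
    """Pr[r < x] for r ~ Lap(b) centred at 0."""
    return 0.5 * math.exp(x / b) if x < 0 else 1.0 - 0.5 * math.exp(-x / b)


def sf(x, b):
    """Pr[r > x] for r ~ Lap(b) centred at 0."""
    return 1.0 - 0.5 * math.exp(x / b) if x < 0 else 0.5 * math.exp(-x / b)


def check(b, cs, tol=1e-12):
    """Assert both ratio bounds for every shift c in cs."""
    for c in cs:
        below = cdf(c + 1, b) / cdf(c, b)   # Pr[r < c+1] / Pr[r < c]
        above = sf(c + 1, b) / sf(c, b)     # Pr[r > c+1] / Pr[r > c]
        assert 1.0 - tol <= below <= math.exp(1.0 / b) * (1.0 + tol)
        assert math.exp(-1.0 / b) * (1.0 - tol) <= above <= 1.0 + tol
    return True
```

Calling `check(b, cs)` for a few scales `b` and a grid of thresholds `cs` raises an `AssertionError` if either bound is violated. The small tolerance only absorbs floating-point rounding where a bound is attained exactly.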


slide-51
SLIDE 51

Select-Queries: case $q^* \notin D_1$

$$\frac{\Pr[A(D_1) \in \hat{D}^-]}{\Pr[A(D_2) \in \hat{D}^-]} = \frac{1}{\Pr[q^* \notin A(D_2)]} = \frac{1}{\Pr[1 + \mathrm{Lap}(b) < K]} = \frac{1}{1 - 0.5\exp\left(\frac{1-K}{b}\right)}$$

Since $q^*$ does not occur in $D_1$, it cannot be released from $D_1$:

$$\frac{\Pr[A(D_1) \in \hat{D}^+]}{\Pr[A(D_2) \in \hat{D}^+]} = 0$$

$$\frac{\Pr[A(D_1) \in \hat{D}]}{\Pr[A(D_2) \in \hat{D}]} = \frac{0 + \Pr[A(D_1) \in \hat{D}^-]}{\Pr[A(D_2) \in \hat{D}^+] + \Pr[A(D_2) \in \hat{D}^-]} \le \frac{\Pr[A(D_1) \in \hat{D}^-]}{\Pr[A(D_2) \in \hat{D}^-]} = \frac{1}{1 - 0.5\exp\left(\frac{1-K}{b}\right)}$$

This proves inequality (1).

slide-52
SLIDE 52

Select-Queries: case $q^* \notin D_1$

$$\Pr[A(D_2) \in \hat{D}^+] \le \Pr[q^* \text{ was released}] = \Pr[M(q^*, D_2) + \mathrm{Lap}(b) > K] = 0.5\exp\left(\frac{1-K}{b}\right) = \delta_1$$

$$\frac{\Pr[A(D_2) \in \hat{D}]}{\Pr[A(D_1) \in \hat{D}]} = \frac{\Pr[A(D_2) \in \hat{D}^+] + \Pr[A(D_2) \in \hat{D}^-]}{0 + \Pr[A(D_1) \in \hat{D}^-]}$$

$$= \frac{\Pr[A(D_2) \in \hat{D}^-]}{\Pr[A(D_1) \in \hat{D}^-]} + \frac{\Pr[A(D_2) \in \hat{D}^+]}{\Pr[A(D_1) \in \hat{D}^-]} \le 1 - 0.5\exp\left(\frac{1-K}{b}\right) + \frac{0.5\exp\left(\frac{1-K}{b}\right)}{\Pr[A(D_1) \in \hat{D}]}$$

This proves inequality (2).


slide-54
SLIDE 54

Select-Queries: case $q^* \notin D_1$

$$\alpha = \max\left(e^{1/b},\ 1 - 0.5\exp\left(\frac{1-K}{b}\right),\ \frac{1}{1 - 0.5\exp\left(\frac{1-K}{b}\right)}\right) = \max\left(e^{1/b},\ \frac{1}{1 - 0.5\exp\left(\frac{1-K}{b}\right)}\right)$$

$$\delta_1 = \frac{1}{2} e^{\frac{1-K}{b}}$$

⇒ For d = 1, Select-Queries is differentially private.
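The value of $\delta_1$ is the probability that a query occurring exactly once still passes the threshold, $\Pr[1 + \mathrm{Lap}(b) > K]$. The Monte Carlo sketch below compares a simulated release rate with the closed form $0.5\exp((1-K)/b)$, which applies for $K \ge 1$. Function names, the seed and the sample size are arbitrary choices for illustration.

```python
import math
import random


def delta_1(b, K):
    # Closed form of Pr[1 + Lap(b) > K], valid for K >= 1.
    return 0.5 * math.exp((1.0 - K) / b)


def simulate_release_rate(b, K, trials, seed=0):
    """Fraction of trials in which a query with true count 1 is selected."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Laplace noise as the difference of two exponentials with mean b.
        noise = rng.expovariate(1.0 / b) - rng.expovariate(1.0 / b)
        if 1.0 + noise > K:
            hits += 1
    return hits / trials
```

With `b = 1` and `K = 2` the closed form gives about 0.184, and `simulate_release_rate(1.0, 2.0, 200_000)` should land close to that value. Raising `K` or lowering `b` shrinks $\delta_1$ exponentially.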


slide-58
SLIDE 58

Select-Queries: arbitrary d

Generalizing the previous approach by adding the d queries one at a time:

$$\Pr[A(D_1) \in \hat{D}] \le \alpha \Pr[A(D_1 + q_1) \in \hat{D}] + \delta_1$$

$$\le \alpha\left(\alpha \Pr[A(D_1 + q_1 + q_2) \in \hat{D}] + \delta_1\right) + \delta_1 \le \dots$$

$$\le \alpha^d \Pr[A(D_2) \in \hat{D}] + \delta_1 \frac{\alpha^d - 1}{\alpha - 1}$$

Problem: with this bound, $\delta_{alg} = \delta_1 \frac{\alpha^d - 1}{\alpha - 1}$ will exceed 1, so a tighter argument is needed.
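The additive term of the naive bound is a geometric series. A short Python sketch, with hypothetical function names, unrolls the recursion "multiply by $\alpha$, add $\delta_1$" and compares it with the closed form $\delta_1 (\alpha^d - 1)/(\alpha - 1)$; the numbers in the usage note are illustrative, not taken from the slides.

```python
def naive_delta(alpha, delta_1, d):
    """Additive term after applying the d = 1 bound d times in a row."""
    total = 0.0
    for _ in range(d):
        total = alpha * total + delta_1
    return total


def naive_delta_closed_form(alpha, delta_1, d):
    # delta_1 * (1 + alpha + ... + alpha^(d-1)); requires alpha != 1.
    return delta_1 * (alpha ** d - 1.0) / (alpha - 1.0)
```

For instance, `naive_delta(1.5, 0.01, 20)` is already far above 1, which makes the resulting guarantee meaningless as a probability bound.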


slide-61
SLIDE 61

Select-Queries: arbitrary d

Let $x_1, \dots, x_{n_x}$ be queries in $D_1$ and $y_1, \dots, y_{n_y}$ new ones in $D_2$. Then

$$n_x + n_y \le d$$

$$\sum_{i=1}^{n_x} \left(M(x_i, D_2) - M(x_i, D_1)\right) + \sum_{i=1}^{n_y} M(y_i, D_2) \le d$$


slide-68
SLIDE 68

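Select-Queries: arbitrary d

$$\frac{\Pr[A(D_1) \in \hat{D}]}{\Pr[A(D_2) \in \hat{D}]} = \frac{0 + \Pr[A(D_1) \in \hat{D}^-]}{\Pr[A(D_2) \in \hat{D}^+] + \Pr[A(D_2) \in \hat{D}^-]} \le \frac{\Pr[A(D_1) \in \hat{D}^-]}{\Pr[A(D_2) \in \hat{D}^-]}$$

$$= \frac{\Pr[A(D_1) \in \hat{D}^-]}{\Pr[A(D_1 + x_1) \in \hat{D}^-]} \cdot \frac{\Pr[A(D_1 + x_1) \in \hat{D}^-]}{\Pr[A(D_1 + x_1 + x_2) \in \hat{D}^-]} \cdot \ldots \cdot \frac{\Pr[A(D_1 + x_1 + \dots + x_{n_x} + y_1 + \dots + y_{n_y - 1}) \in \hat{D}^-]}{\Pr[A(D_1 + x_1 + \dots + x_{n_x} + y_1 + \dots + y_{n_y}) \in \hat{D}^-]}$$

$$\le \prod_{i=1}^{n_x} e^{1/b} \cdot \prod_{i=1}^{n_y} \max\left(e^{1/b},\ \Pr[M(y_i, D_2) + \mathrm{Lap}(b) > K]\right)$$

$$\le e^{n_x/b} \cdot \left(\max\left(e^{1/b},\ 1 - 0.5\exp\left(\frac{1-K}{b}\right)\right)\right)^{n_y} \le \alpha^d$$

This proves inequality (1).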


slide-72
SLIDE 72

Select-Queries: arbitrary d

$$\Pr[A(D_2) \in \hat{D}^+] \le \sum_{i=1}^{n_y} \Pr[y_i \text{ was chosen for release}]$$

$$= \sum_{i=1}^{n_y} \Pr[M(y_i, D_2) + \mathrm{Lap}(b) > K]$$

$$\le \frac{1}{2} \sum_{i=1}^{n_y} \exp\left(\frac{M(y_i, D_2) - K}{b}\right)$$

$$\le \frac{d}{2} \exp\left(\frac{d-K}{b}\right) = \delta_{alg}$$
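The last step uses that there are at most $d$ new queries and that each has a count of at most $d$. The Python sketch below checks this for a small $d$ by enumerating every way of splitting at most $d$ occurrences among new queries and confirming the sum never exceeds $\delta_{alg}$. The helper names and the example parameters are assumptions made for the illustration.

```python
import math


def partitions(total, largest=None):
    """Yield all multisets of positive integers summing to `total`."""
    largest = total if largest is None else largest
    if total == 0:
        yield []
        return
    for first in range(min(total, largest), 0, -1):
        for rest in partitions(total - first, first):
            yield [first] + rest


def release_bound(counts, b, K):
    # (1/2) * sum_i exp((M(y_i, D2) - K) / b) for the given counts.
    return 0.5 * sum(math.exp((m - K) / b) for m in counts)


def delta_alg(d, b, K):
    return 0.5 * d * math.exp((d - K) / b)


def worst_case(d, b, K):
    """Largest value of the sum over all splits of at most d occurrences."""
    return max(release_bound(p, b, K)
               for total in range(1, d + 1)
               for p in partitions(total))
```

Comparing `worst_case(d, b, K)` with `delta_alg(d, b, K)` for a small `d` shows that the closed form is a deliberately loose but simple upper bound.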


slide-76
SLIDE 76

Select-Queries: arbitrary d

$$\frac{\Pr[A(D_2) \in \hat{D}^-]}{\Pr[A(D_1) \in \hat{D}^-]} = \frac{\Pr[A(D_1 + x_1) \in \hat{D}^-]}{\Pr[A(D_1) \in \hat{D}^-]} \cdot \frac{\Pr[A(D_1 + x_1 + x_2) \in \hat{D}^-]}{\Pr[A(D_1 + x_1) \in \hat{D}^-]} \cdot \ldots \cdot \frac{\Pr[A(D_1 + x_1 + \dots + x_{n_x} + y_1 + \dots + y_{n_y}) \in \hat{D}^-]}{\Pr[A(D_1 + x_1 + \dots + x_{n_x} + y_1 + \dots + y_{n_y - 1}) \in \hat{D}^-]}$$

$$\le \prod_{i=1}^{n_x} e^{1/b} \cdot \prod_{i=1}^{n_y} \max\left(e^{1/b},\ \Pr[M(y_i, D_2) + \mathrm{Lap}(b) < K]\right) \le \alpha^d$$


slide-81
SLIDE 81

Select-Queries: arbitrary d

$$\frac{\Pr[A(D_2) \in \hat{D}]}{\Pr[A(D_1) \in \hat{D}]} = \frac{\Pr[A(D_2) \in \hat{D}^+] + \Pr[A(D_2) \in \hat{D}^-]}{\Pr[A(D_1) \in \hat{D}^-]}$$

$$= \frac{\Pr[A(D_2) \in \hat{D}^+]}{\Pr[A(D_1) \in \hat{D}^-]} + \frac{\Pr[A(D_2) \in \hat{D}^-]}{\Pr[A(D_1) \in \hat{D}^-]}$$

$$\le \alpha^d + \frac{0.5\, d \exp\left(\frac{d-K}{b}\right)}{\Pr[A(D_1) \in \hat{D}]}$$

This proves inequality (2).


slide-84
SLIDE 84

Noisy Counts

- Releasing noisy histograms is differentially private
- Get-Query-Counts is $d/b_q$-differentially private
- Get-Click-Counts is $d_c/b_c$-differentially private
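The $d/b_q$ figure reflects how far one user can move a count: at most $d$. For a single count released with Laplace noise of scale $b_q$, the log-ratio of the output densities under two counts that differ by $d$ is at most $d/b_q$. The sketch below checks this for one count only, as an illustration; the function names and the example numbers are assumptions.

```python
import math


def laplace_log_pdf(x, scale):
    # Log-density of Lap(scale) centred at 0.
    return -abs(x) / scale - math.log(2.0 * scale)


def epsilon_for_count(max_contribution, scale):
    # One user changes the count by at most max_contribution.
    return max_contribution / scale


def max_log_ratio(count_1, count_2, scale, outputs):
    """Largest absolute log density ratio over the candidate outputs."""
    return max(
        abs(laplace_log_pdf(y - count_1, scale)
            - laplace_log_pdf(y - count_2, scale))
        for y in outputs
    )
```

For two counts 100 and 120 with scale 8.69, `max_log_ratio(100, 120, 8.69, range(0, 300))` stays at or below `epsilon_for_count(20, 8.69)`, and the bound is attained for outputs outside the interval between the two counts.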

slide-85
SLIDE 85

Results

Optimal values can be calculated:

$$K = d\left(1 - \frac{\ln\left(\frac{2\delta}{d}\right)}{\epsilon}\right) \qquad b = \frac{d}{\epsilon}$$

d     1      5      10     20     40      80      160
K     5.70   31.99  66.99  140    292.04  608.16  1264.49
b     0.43   2.17   4.34   8.69   17.37   34.74   69.49

Table: Optimal values for K and b as a function of d, with $e^{\epsilon} = 10$ and $\delta = 10^{-5}$
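The table values can be reproduced with a few lines of Python. The sketch assumes $K = d\,(1 - \ln(2\delta/d)/\epsilon)$ and $b = d/\epsilon$; the expression for $b$ is inferred from the tabulated numbers. The function name `optimal_parameters` is hypothetical.

```python
import math


def optimal_parameters(d, epsilon, delta):
    """Return (K, b) for a per-user limit d and target (epsilon, delta)."""
    b = d / epsilon
    K = d * (1.0 - math.log(2.0 * delta / d) / epsilon)
    return K, b


epsilon, delta = math.log(10.0), 1e-5  # e^epsilon = 10, delta = 10^-5
for d in (1, 5, 10, 20, 40, 80, 160):
    K, b = optimal_parameters(d, epsilon, delta)
    print(f"d={d:<4} K={K:8.2f} b={b:6.2f}")
```

With these choices, $(d/2)\exp((d-K)/b)$ evaluates to exactly the target $\delta$, and $d/b$ equals $\epsilon$.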