Random Projections, Graph Sparsification, and Differential Privacy

slide-1
SLIDE 1

Random Projections, Graph Sparsification, and Differential Privacy

Jalaj Upadhyay

Center for Applied Cryptographic Research, University of Waterloo

December 02, 2013

1/25

slide-2
SLIDE 2

The Peril of the Last Talk on Monday

2/25

slide-4
SLIDE 4

This Paper in One Slide

Graph Sparsification + Random Projections (JL transform)
⇓
Differential privacy with improved sanitization time, and comparable utility and privacy guarantees

3/25

slide-5
SLIDE 5

Hope We are All Not There Anymore

4/25

slide-8
SLIDE 8

Differential Privacy: The Mathematical Formulation

  • The idea is that the absence or presence of an individual entry should not change the output “by much”
  • A sanitization algorithm, K, gives $\epsilon$-differential privacy if, for all “neighboring data,” $D_1$ and $D_2$, and for all ranges S,
    $\frac{\Pr[K(D_1) \in S]}{\Pr[K(D_2) \in S]} \le \exp(\epsilon)$
  • A sanitization algorithm, K, gives $(\epsilon, \delta)$-differential privacy if, for all “neighboring data,” $D_1$ and $D_2$, and for all ranges S,
    $\Pr[K(D_1) \in S] \le \exp(\epsilon)\,\Pr[K(D_2) \in S] + \delta$
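As a small worked check of the pure differential privacy inequality, the sketch below uses randomized response on a single bit. This mechanism is chosen only for illustration and is not the mechanism of this talk; the function names are hypothetical. The two neighboring inputs are the bit values 0 and 1.

```python
import math

def randomized_response_probs(bit, eps):
    """Return (Pr[output = 0], Pr[output = 1]) when the true bit is reported
    with probability e^eps / (1 + e^eps) and flipped otherwise."""
    p_truth = math.exp(eps) / (1.0 + math.exp(eps))
    probs = [1.0 - p_truth, 1.0 - p_truth]
    probs[bit] = p_truth
    return tuple(probs)

def worst_case_ratio(eps):
    """Largest ratio Pr[K(D1) = o] / Pr[K(D2) = o] over single outputs o,
    taken in both directions for the neighboring inputs 0 and 1."""
    p0 = randomized_response_probs(0, eps)
    p1 = randomized_response_probs(1, eps)
    return max(max(a / b, b / a) for a, b in zip(p0, p1))

eps = 0.5
ratio = worst_case_ratio(eps)
# The inequality holds with equality for this mechanism, up to rounding.
print(ratio <= math.exp(eps) + 1e-12)
```

For this mechanism the ratio is exactly exp(eps), so the bound in the definition is tight.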

5/25

slide-9
SLIDE 9

Differential Privacy: The Pretty (Common) Picture

6/25

slide-13
SLIDE 13

Why Should We Care About Cut Queries?

  • A natural question in social networking
  • How many people have friends outside their circle?
  • The answer is the number of edges crossing the border of the set of vertices corresponding to those people
  • This number is called the cut corresponding to the set of vertices

Question: Why would you really care about the privacy?
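To make the cut definition concrete, here is a minimal sketch that counts the edges crossing the border of a vertex set. The graph and the name `cut_size` are hypothetical, chosen only for illustration.

```python
def cut_size(edges, S):
    """Count edges with exactly one endpoint in the vertex set S."""
    S = set(S)
    return sum(1 for u, v in edges if (u in S) != (v in S))

# Hypothetical friendship graph on vertices 0..4; each pair is a friendship.
edges = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 4)]

# Only the edge (2, 3) leaves the circle {0, 1, 2}, so this prints 1.
print(cut_size(edges, {0, 1, 2}))
```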

7/25

slide-15
SLIDE 15

Friendships or "What You May Call" Between People

Suppose Facebook decides to reveal the friendship graph.

Some people might end up in trouble.

8/25

slide-16
SLIDE 16

But Spare a Thought for a Few Celebrities

9/25

slide-20
SLIDE 20

Disclaimer

The speaker does not support any of the above infidelity.

None of this work should be used in any of the above-cited or related scenarios.

  • Neither Mr. Kennedy, Mr. Clinton, nor the NSA funded this research

10/25

slide-21
SLIDE 21

Scenarios Where You Can Use This Work...

11/25

slide-24
SLIDE 24

The Starting Point of This Work

  • Blocki et al. (BBDS) showed that the Johnson-Lindenstrauss (JL) transform preserves DP
  • The JL lemma says that, with a special choice of projection matrix, projecting a set of vectors to a lower-dimensional space approximately preserves their pairwise distances
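The distance-preservation property can be observed numerically. The sketch below assumes a Gaussian projection matrix with i.i.d. N(0, 1/r) entries, one common choice; the sizes and helper names are illustrative and not taken from the talk.

```python
import numpy as np

def gaussian_jl_matrix(r, d, rng):
    """r x d matrix with i.i.d. N(0, 1/r) entries, so squared norms are
    preserved in expectation."""
    return rng.normal(0.0, 1.0 / np.sqrt(r), size=(r, d))

def pairwise_distances(X):
    """Euclidean distances between all distinct pairs of rows of X."""
    diffs = X[:, None, :] - X[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    i, j = np.triu_indices(X.shape[0], k=1)
    return dists[i, j]

rng = np.random.default_rng(0)
d, r, n_points = 1000, 500, 20            # illustrative sizes only
X = rng.normal(size=(n_points, d))         # points in the original space
M = gaussian_jl_matrix(r, d, rng)
Y = X @ M.T                                # projected points in r dimensions

ratios = pairwise_distances(Y) / pairwise_distances(X)
print(ratios.min(), ratios.max())          # both should be close to 1
```

Increasing r tightens the spread of the ratios around 1, at the cost of a larger projected dimension.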

12/25

slide-27
SLIDE 27

The Starting Point of This Work

  • Blocki et al. (BBDS) showed that the Johnson-Lindenstrauss (JL) transform preserves DP
  • The idea of BBDS is to use a random projection of the column entries of the representative matrix
  • For a graph G, a reasonable choice is the Laplacian, $L_G := D_G - A_G$ (degree matrix minus adjacency matrix)
  • For a set of vertices S with indicator vector $\chi_S$, $\Phi(S, \bar{S}) = \chi_S^T L_G \chi_S = \|\sqrt{L_G}\,\chi_S\|^2$
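The identity between the cut, the Laplacian quadratic form, and the squared norm under the matrix square root can be checked on a toy graph. This is a minimal sketch for an unweighted graph; the example graph and helper names are hypothetical.

```python
import numpy as np

def laplacian(n, edges):
    """Unweighted Laplacian L = D - A of an undirected graph on 0..n-1."""
    A = np.zeros((n, n))
    for u, v in edges:
        A[u, v] = A[v, u] = 1.0
    return np.diag(A.sum(axis=1)) - A

def psd_sqrt(L):
    """Symmetric square root of a positive semidefinite matrix."""
    vals, vecs = np.linalg.eigh(L)
    vals = np.clip(vals, 0.0, None)        # clear tiny negative round-off
    return (vecs * np.sqrt(vals)) @ vecs.T

n = 5
edges = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 4)]   # hypothetical graph
L = laplacian(n, edges)

chi = np.zeros(n)
chi[[0, 1, 2]] = 1.0                       # indicator vector of S = {0, 1, 2}

quad_form = chi @ L @ chi                  # chi_S^T L_G chi_S
norm_form = np.linalg.norm(psd_sqrt(L) @ chi) ** 2
print(quad_form, norm_form)
```

Both quantities equal the number of edges leaving S, which is 1 for this graph (only the edge between 2 and 3 crosses), up to floating-point rounding.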

12/25

slide-30
SLIDE 30

BBDS Mechanism Step by Step

The utility guarantee comes from the JL lemma.

If we apply the JL transform M to $\sqrt{L_G}$, then $\|M \sqrt{L_G}\,\chi_S\|^2 = (1 \pm \epsilon)\,\|\sqrt{L_G}\,\chi_S\|^2 = (1 \pm \epsilon)\,\Phi(S, \bar{S})$

BBDS showed that it also preserves differential privacy when M is Gaussian
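The utility claim can be illustrated by projecting the square root of a Laplacian once and then answering a cut query from the projected matrix. The sketch below covers utility only and makes no privacy claim; the random graph, the sizes, and the N(0, 1/r) scaling of M are assumptions chosen for a small demonstration.

```python
import numpy as np

rng = np.random.default_rng(1)
n, r = 40, 400                              # illustrative sizes only

# Hypothetical random graph: each pair is an edge with probability 0.3.
upper = np.triu(rng.random((n, n)) < 0.3, k=1).astype(float)
A = upper + upper.T
L = np.diag(A.sum(axis=1)) - A

# Symmetric square root of the positive semidefinite Laplacian.
vals, vecs = np.linalg.eigh(L)
sqrt_L = (vecs * np.sqrt(np.clip(vals, 0.0, None))) @ vecs.T

M = rng.normal(0.0, 1.0 / np.sqrt(r), size=(r, n))   # Gaussian JL matrix
sketch = M @ sqrt_L                         # r x n matrix, computed once

chi = np.zeros(n)
chi[:10] = 1.0                              # query set S = {0, ..., 9}
true_cut = chi @ L @ chi                    # exact cut size
estimate = np.linalg.norm(sketch @ chi) ** 2
print(true_cut, estimate)
```

Any further cut query reuses the same `sketch` matrix, so the projection cost is paid once rather than per query.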

13/25

slide-32
SLIDE 32

What about DP?

Just multiplying $\sqrt{L_G}$ by M does not give a DP guarantee.

[Figure: two graphs] In the first graph, S = {3, 6, 10} gives answer 0. In the second graph, S = {3, 6, 10} gives a non-zero answer.

14/25

slide-34
SLIDE 34

The Elegant Idea Used in BBDS

[Figure: input graph] is reweighted and transformed to [Figure: transformed graph]

This makes the graph connected and increases its second smallest eigenvalue

15/25

slide-35
SLIDE 35

The Two Faces of the Complete Graph

16/25

slide-38
SLIDE 38

Algorithmic Disadvantage of a Complete Graph

On the negative side, overlaying a complete graph destroys any structural property of the graph.

Why do we care about this?

  • Most graphs are sparse or have some structure
  • Sparsity and structure help a lot in algorithm design

Question: Can we instead use a sparse graph?

17/25

slide-40
SLIDE 40

Differential Privacy on Sparse Graphs

Crucial observations

  • The second smallest eigenvalue gives an estimate of connectivity (Cheeger’s theorem and Fiedler’s result)
  • The eigenvalue of a graph is at least the eigenvalue of any of its subgraphs (Fiedler’s result)

An expander graph is a sparse graph with a high second smallest eigenvalue
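These observations can be checked on small graphs by computing the second smallest Laplacian eigenvalue directly. The three example graphs and the helper names below are hypothetical, chosen so that each graph is a subgraph of the next on the same vertex set.

```python
import numpy as np

def laplacian(n, edges):
    """Unweighted Laplacian L = D - A of an undirected graph on 0..n-1."""
    A = np.zeros((n, n))
    for u, v in edges:
        A[u, v] = A[v, u] = 1.0
    return np.diag(A.sum(axis=1)) - A

def lambda_2(L):
    """Second smallest Laplacian eigenvalue (algebraic connectivity)."""
    return np.linalg.eigvalsh(L)[1]

n = 6
two_triangles = [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3)]  # disconnected
bridged = two_triangles + [(2, 3)]                                 # one extra edge
complete = [(u, v) for u in range(n) for v in range(u + 1, n)]

lam_disc = lambda_2(laplacian(n, two_triangles))
lam_bridge = lambda_2(laplacian(n, bridged))
lam_complete = lambda_2(laplacian(n, complete))
print(lam_disc, lam_bridge, lam_complete)
```

The disconnected graph has second smallest eigenvalue 0, adding the bridge makes it positive, and the complete graph on n vertices reaches n, so the value never decreases as edges are added.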

18/25

slide-44
SLIDE 44

Basic Construction

Input: An n-vertex sparse graph G

  • Pick a sparse expander graph, E
  • Set $L_{\tilde{G}} = \frac{w}{d} L_E + \left(1 - \frac{w}{d}\right) L_G$
  • Pick a random projection matrix M with Gaussian entries, and multiply it with $L_{\tilde{G}}$

Utility follows by comparing the spectral properties of the expander with those of the complete graph
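The three steps can be sketched as follows, under loud assumptions. The slide does not define w and d here, so the sketch treats w/d as a single scalar in (0, 1). The expander is replaced by a hypothetical stand-in (a cycle plus random permutation graphs) with no proven spectral gap, and the parameter settings needed for the privacy guarantee are not modeled. Following the slide's wording, M multiplies the combined Laplacian itself.

```python
import numpy as np

def laplacian_from_adjacency(A):
    """L = D - A; self-loops in A cancel out and do not affect L."""
    return np.diag(A.sum(axis=1)) - A

def stand_in_expander(n, k, rng):
    """Adjacency of a sparse multigraph: one Hamiltonian cycle plus k random
    permutation graphs. A stand-in only; a real implementation would use a
    construction with a proven spectral gap."""
    A = np.zeros((n, n))
    idx = np.arange(n)
    A[idx, (idx + 1) % n] += 1.0            # cycle edges keep it connected
    A[(idx + 1) % n, idx] += 1.0
    for _ in range(k):
        perm = rng.permutation(n)
        A[idx, perm] += 1.0
        A[perm, idx] += 1.0
    return A

rng = np.random.default_rng(2)
n, r = 30, 12                                # illustrative sizes only
w_over_d = 0.5                               # assumed value of w/d

# Input graph G: two disjoint paths, so G alone is disconnected.
A_G = np.zeros((n, n))
for u in list(range(0, 14)) + list(range(15, 29)):
    A_G[u, u + 1] = A_G[u + 1, u] = 1.0

L_G = laplacian_from_adjacency(A_G)
L_E = laplacian_from_adjacency(stand_in_expander(n, 2, rng))
L_tilde = w_over_d * L_E + (1.0 - w_over_d) * L_G

M = rng.normal(size=(r, n))                  # Gaussian projection matrix
sanitized = M @ L_tilde                      # r x n result of the last step

# Second smallest eigenvalue before and after the overlay.
print(np.linalg.eigvalsh(L_G)[1], np.linalg.eigvalsh(L_tilde)[1])
```

The overlay lifts the second smallest eigenvalue from 0 to a positive value while adding only a constant number of edges per vertex, which is the point of using a sparse graph in place of a complete one.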

20/25

slide-45
SLIDE 45

Pictorial View of the Difference in Approaches

[Figure panels: Original Graph; BBDS; This Work (Not complete picture)]

21/25

slide-46
SLIDE 46

What About Dense Graphs?

When the graph has high conductance, apply sparsification techniques followed by random projection.

When the graph has low conductance, overlay a high-conductance graph (complete or sparse), and then apply sparsification techniques followed by random projection.

Either local or global sparsification techniques can be used.

Main Lemma: The above sparsification techniques, followed by a JL transform that uses a Gaussian matrix, also preserve differential privacy

22/25

slide-49
SLIDE 49

Run Time of Sanitization Algorithms

  • Sparsification techniques use time $\tilde{O}(m)$, where m is the number of edges
  • For dense weighted graphs, $m = O(n^2)$, so sparsification requires time $\tilde{O}(n^2)$
  • The number of entries in the Laplacian of a sparse graph is $\tilde{O}(n)$
  • Multiplying the Laplacian of the graph by a Gaussian matrix takes $\tilde{O}(n^2)$
  • The total run time of sanitization is $\tilde{O}(n^2)$

23/25

slide-51
SLIDE 51

A Comparative Study

Abbreviations: k: total number of queries, $\epsilon$: privacy parameter, n: number of vertices, $\delta$: spectral approximation parameter, s: set of vertices in a query

Method                   Noise for any k                            Run Time
Randomized Response      $O(\sqrt{sn \log k}/\epsilon)$             $O(n^2)$
Exponential Sanitizer    $O(n \log n/\epsilon)$                     Intractable
Multiplicative Weight    $\tilde{O}(\sqrt{|E|} \log k/\epsilon)$    $O(n^2)$
JL transform             $O(s\sqrt{\log k}/\epsilon)$               $O(r n^{2.38})$
Basic Scheme             $O(s\sqrt{\log k}/\epsilon)$               $O(n^{2+o(1)})$
Using δ-Sparsifier       $O(s\delta\sqrt{\log k}/\epsilon)$         $O(n^{2+o(1)})$

24/25

slide-53
SLIDE 53

Conclusion

  • In this talk, we showed an algorithmic improvement over existing sanitization techniques
  • We achieve the best of both worlds: efficient sanitization and almost the same privacy and utility guarantees

We also do the following in the paper:

  • A combinatorial analysis to answer (S, T)-cut queries
  • Further optimization: the Fast-JL transform of Ailon-Chazelle preserves differential privacy

25/25