Random Projections, Graph Sparsification, and Differential Privacy

Random Projections, Graph Sparsification, and Differential Privacy
Jalaj Upadhyay
Center for Applied Cryptographic Research, University of Waterloo
December 02, 2013
1/25

The Peril of Last Talk on Monday 2/25

This Paper in One Slide Random Projections (JL transform) ⇓ Differential privacy 3/25

This Paper in One Slide Graph Sparsification + Random Projections (JL transform) ⇓ Differential privacy with improved sanitization time, and comparable utility and privacy guarantee 3/25

Hope We are All Not There Anymore 4/25

Differential Privacy: The Mathematical Formulation
• The idea is that the absence or presence of an individual entry should not change the output "by much"
• A sanitization algorithm, $K$, gives $\epsilon$-differential privacy if, for all "neighboring data" $D_1$ and $D_2$, and for all ranges $S$, $\frac{\Pr[K(D_1) \in S]}{\Pr[K(D_2) \in S]} \le \exp(\epsilon)$
• A sanitization algorithm, $K$, gives $(\epsilon, \delta)$-differential privacy if, for all "neighboring data" $D_1$ and $D_2$, and for all ranges $S$, $\Pr[K(D_1) \in S] \le \exp(\epsilon) \Pr[K(D_2) \in S] + \delta$
5/25
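To make the $\epsilon$-DP inequality concrete, here is a minimal sketch using randomized response on a single bit. This mechanism is not from the talk; it is a standard textbook example chosen here as an assumption, and the function names are hypothetical. The two "neighboring data" are the one-bit datasets 0 and 1. For pure $\epsilon$-DP it is enough to check the ratio on single outputs, since the bound on any set $S$ follows by summing.

```python
import math

def randomized_response_probs(true_bit: int, eps: float) -> dict:
    """Output distribution of randomized response: report the true bit
    with probability e^eps / (1 + e^eps), otherwise report the flipped bit."""
    p_truth = math.exp(eps) / (1.0 + math.exp(eps))
    return {true_bit: p_truth, 1 - true_bit: 1.0 - p_truth}

def worst_case_ratio(eps: float) -> float:
    """Largest Pr[K(D1) = s] / Pr[K(D2) = s] over both outputs s
    and both orderings of the neighboring inputs 0 and 1."""
    p0 = randomized_response_probs(0, eps)
    p1 = randomized_response_probs(1, eps)
    return max(max(p0[s] / p1[s], p1[s] / p0[s]) for s in (0, 1))

eps = math.log(3.0)
ratio = worst_case_ratio(eps)
# The worst-case ratio meets the bound exp(eps) with equality.
print(ratio, math.exp(eps))
```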

Differential Privacy: The Pretty (Common) Picture 6/25

Why Should We Care About Cut Queries?
• A natural question in social networking
• How many people have friends outside their circle?
• The answer is the number of edges crossing the border of the set of vertices corresponding to those people
• This number is called the cut corresponding to the set of vertices
Question: Why would you really care about privacy?
7/25
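As a small illustration of a cut query, the sketch below counts the edges with exactly one endpoint in a vertex set. The example graph (two triangles joined by one edge) and the function name are made up for illustration.

```python
def cut_size(edges, S):
    """Number of edges with exactly one endpoint in the vertex set S."""
    S = set(S)
    return sum(1 for u, v in edges if (u in S) != (v in S))

# Two triangles {0, 1, 2} and {3, 4, 5} joined by the single edge (2, 3).
edges = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 4), (4, 5), (5, 3)]

print(cut_size(edges, {0, 1, 2}))  # one "friendship" leaves the first circle
print(cut_size(edges, {2, 3}))     # four edges cross the border of {2, 3}
```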

Friendships or "What You May Call" Between People
Suppose Facebook decides to reveal the friendship graph
Some people might end up in trouble
8/25

But Spare a Thought for a Few Celebrities 9/25

Disclaimer
The speaker does not support any of the above infidelity.
None of this work should be used in any of the above-cited or related scenarios.
Mr. Kennedy, Mr. Clinton, or the NSA did not fund this research.
10/25

Scenarios Where You Can Use This Work... 11/25

The Starting Point of This Work
• Blocki et al. (BBDS) showed that the Johnson-Lindenstrauss (JL) transform preserves differential privacy (DP)
• The JL transform says that, with a special choice of projection matrix, projecting a set of vectors to a lower-dimensional space approximately preserves their pairwise distances
• The idea of BBDS is to use a random projection of the column entries of the representative matrix
• For a graph $G$, a reasonable choice is the Laplacian, $L_G := D_G - A_G$ (degree matrix minus adjacency matrix)
• For a set of vertices $S$ with indicator vector $\chi_S$, $\Phi(S, \bar{S}) = \chi_S^T L_G \chi_S = \|\sqrt{L_G}\,\chi_S\|^2$
12/25
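The identity between the cut, the Laplacian quadratic form, and the squared norm of $\sqrt{L_G}\,\chi_S$ can be checked numerically. This is a minimal sketch, assuming NumPy is available; the example graph and helper names are hypothetical, and the matrix square root is computed through an eigendecomposition.

```python
import numpy as np

def laplacian(n, edges):
    """Unweighted graph Laplacian L = D - A."""
    L = np.zeros((n, n))
    for u, v in edges:
        L[u, u] += 1.0
        L[v, v] += 1.0
        L[u, v] -= 1.0
        L[v, u] -= 1.0
    return L

def sqrt_psd(L):
    """Symmetric square root of a positive semidefinite matrix."""
    vals, vecs = np.linalg.eigh(L)
    vals = np.clip(vals, 0.0, None)  # remove tiny negative round-off
    return (vecs * np.sqrt(vals)) @ vecs.T

n = 6
edges = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 4), (4, 5), (5, 3)]
L = laplacian(n, edges)

chi = np.zeros(n)
chi[[0, 1, 2]] = 1.0  # indicator vector of S = {0, 1, 2}

quad = float(chi @ L @ chi)                               # chi^T L chi
norm_sq = float(np.linalg.norm(sqrt_psd(L) @ chi) ** 2)   # ||sqrt(L) chi||^2
print(quad, norm_sq)  # both equal the cut of S, which is 1 for this graph
```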

BBDS Mechanism Step by Step
The utility guarantee comes from the JL lemma
If we apply the JL transform $M$ to $\sqrt{L_G}$, then $\|M \sqrt{L_G}\,\chi_S\|^2 = (1 \pm \epsilon)\,\|\sqrt{L_G}\,\chi_S\|^2 = (1 \pm \epsilon)\,\Phi(S, \bar{S})$
BBDS showed that it also preserves differential privacy when $M$ is Gaussian
13/25
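The utility claim can be sketched numerically: project $\sqrt{L_G}$ with a Gaussian matrix and compare the squared norm with the exact cut. This is only an illustration of the JL estimate, not the full BBDS mechanism (no reweighting is done here). The projection dimension `r`, the scaling of `M` by $1/\sqrt{r}$, the seed, and the example graph are all assumptions.

```python
import numpy as np

def laplacian(n, edges):
    """Unweighted graph Laplacian L = D - A."""
    L = np.zeros((n, n))
    for u, v in edges:
        L[u, u] += 1.0
        L[v, v] += 1.0
        L[u, v] -= 1.0
        L[v, u] -= 1.0
    return L

def sqrt_psd(L):
    """Symmetric square root of a positive semidefinite matrix."""
    vals, vecs = np.linalg.eigh(L)
    return (vecs * np.sqrt(np.clip(vals, 0.0, None))) @ vecs.T

rng = np.random.default_rng(0)
n = 6
edges = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 4), (4, 5), (5, 3)]
L = laplacian(n, edges)

chi = np.zeros(n)
chi[[2, 3]] = 1.0  # indicator vector of S = {2, 3}
exact = float(chi @ L @ chi)  # exact cut value

# Gaussian projection with i.i.d. N(0, 1/r) entries, so that
# E[||M x||^2] = ||x||^2 for every fixed vector x.
r = 2000
M = rng.normal(0.0, 1.0 / np.sqrt(r), size=(r, n))
sketch = M @ sqrt_psd(L)  # the r x n matrix that would be released

estimate = float(np.linalg.norm(sketch @ chi) ** 2)
print(exact, estimate)  # the estimate is close to the exact cut
```

A larger `r` tightens the estimate; the relative error shrinks roughly like $1/\sqrt{r}$.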

What about DP?
Just multiplying $\sqrt{L_G}$ by $M$ does not give a DP guarantee
[Figure: two graphs. On one, $S = \{3, 6, 10\}$ gives answer 0; on the other, $S = \{3, 6, 10\}$ gives a non-zero answer.]
14/25
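One way to read this slide, offered here as an interpretation: if a set $S$ has cut 0, then $\sqrt{L_G}\,\chi_S = 0$, so the projected answer is 0 no matter which $M$ is drawn, while a graph with one extra crossing edge gives a non-zero answer. An observer can then tell the two graphs apart with certainty. The sketch below uses a hypothetical pair of graphs (not the ones in the figure) and assumed parameters.

```python
import numpy as np

def laplacian(n, edges):
    """Unweighted graph Laplacian L = D - A."""
    L = np.zeros((n, n))
    for u, v in edges:
        L[u, u] += 1.0
        L[v, v] += 1.0
        L[u, v] -= 1.0
        L[v, u] -= 1.0
    return L

def sqrt_psd(L):
    """Symmetric square root of a positive semidefinite matrix."""
    vals, vecs = np.linalg.eigh(L)
    return (vecs * np.sqrt(np.clip(vals, 0.0, None))) @ vecs.T

def projected_cut(edges, S, n, r, rng):
    """||M sqrt(L) chi_S||^2 for a fresh Gaussian M with N(0, 1/r) entries."""
    chi = np.zeros(n)
    chi[list(S)] = 1.0
    M = rng.normal(0.0, 1.0 / np.sqrt(r), size=(r, n))
    return float(np.linalg.norm(M @ sqrt_psd(laplacian(n, edges)) @ chi) ** 2)

rng = np.random.default_rng(1)
n, r, S = 6, 50, [0, 1, 2]

g1 = [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3)]  # two disjoint triangles
g2 = g1 + [(2, 3)]                                     # neighbor: one extra edge

answer_g1 = projected_cut(g1, S, n, r, rng)  # zero up to round-off, for any M
answer_g2 = projected_cut(g2, S, n, r, rng)  # strictly positive
print(answer_g1, answer_g2)
```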

The Elegant Idea Used in BBDS
[Figure: the input graph] is reweighted and transformed to [Figure: the transformed graph]
This makes the graph connected and increases its second smallest eigenvalue
15/25
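The effect on the second smallest eigenvalue can be sketched as follows. The slide does not spell out the reweighting, so the convex combination $\frac{w}{n} L_{K_n} + (1 - \frac{w}{n}) L_G$ with a complete graph $K_n$ is an assumption made here for illustration, as are the parameter `w` and the example graph.

```python
import numpy as np

def laplacian(n, edges):
    """Unweighted graph Laplacian L = D - A."""
    L = np.zeros((n, n))
    for u, v in edges:
        L[u, u] += 1.0
        L[v, v] += 1.0
        L[u, v] -= 1.0
        L[v, u] -= 1.0
    return L

def lambda2(L):
    """Second smallest eigenvalue of a symmetric matrix."""
    return float(np.linalg.eigvalsh(L)[1])

n = 6
edges = [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3)]  # disconnected graph
L_G = laplacian(n, edges)
L_K = n * np.eye(n) - np.ones((n, n))  # Laplacian of the complete graph K_n

w = 1.0  # hypothetical weight parameter
L_overlay = (w / n) * L_K + (1.0 - w / n) * L_G

lam2_before = lambda2(L_G)        # 0 because the graph is disconnected
lam2_after = lambda2(L_overlay)   # at least w after the overlay
print(lam2_before, lam2_after)
```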

The Two Faces of Complete Graph 16/25

Algorithmic Disadvantage of a Complete Graph
On the negative side, overlaying a complete graph destroys any structural property of the graph
Why do we care about this?
• Most graphs are sparse or have some structure
• Sparsity and structure help a lot in algorithm design
Question: Can we instead use a sparse graph?
17/25

Differential Privacy on Sparse Graphs
Crucial observations
• The second smallest eigenvalue gives an estimate of connectivity (Cheeger's theorem and Fiedler's result)
• The second smallest eigenvalue of a graph is at least that of any of its subgraphs on the same vertex set (Fiedler's result)
An expander graph is a sparse graph with a high second smallest eigenvalue
18/25
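The monotonicity observation can be checked on a small chain of graphs on the same vertex set: a path is a subgraph of a cycle, which is a subgraph of the complete graph. The sketch below, with hypothetical helper names, compares their second smallest Laplacian eigenvalues against the known closed forms.

```python
import math
import numpy as np

def laplacian(n, edges):
    """Unweighted graph Laplacian L = D - A."""
    L = np.zeros((n, n))
    for u, v in edges:
        L[u, u] += 1.0
        L[v, v] += 1.0
        L[u, v] -= 1.0
        L[v, u] -= 1.0
    return L

def lambda2(L):
    """Second smallest eigenvalue of a symmetric matrix."""
    return float(np.linalg.eigvalsh(L)[1])

n = 8
path = [(i, i + 1) for i in range(n - 1)]
cycle = path + [(n - 1, 0)]
complete = [(i, j) for i in range(n) for j in range(i + 1, n)]

lam_path = lambda2(laplacian(n, path))          # 2 (1 - cos(pi / n))
lam_cycle = lambda2(laplacian(n, cycle))        # 2 (1 - cos(2 pi / n))
lam_complete = lambda2(laplacian(n, complete))  # n

# Adding edges never decreases the second smallest eigenvalue.
print(lam_path, lam_cycle, lam_complete)
```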

Basic Construction
Input: An $n$-vertex sparse graph $G$
• Pick a sparse expander graph, $E$
• Set $L_{\tilde{G}} = \frac{w}{d} L_E + \left(1 - \frac{w}{d}\right) L_G$
• Pick a random projection matrix $M$ with Gaussian entries, and multiply it with $L_{\tilde{G}}$
Utility follows by comparing the spectral properties of the expander with those of the complete graph
20/25
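The three steps above can be sketched as follows. Several things are assumptions: the union of random Hamiltonian cycles is only a convenient stand-in for a sparse expander (its expansion is not verified here), the values of `w`, `d`, and `r` are arbitrary, and the helper names are hypothetical. The last step multiplies $M$ with $L_{\tilde{G}}$ exactly as the slide states.

```python
import numpy as np

def laplacian(n, edges):
    """Graph Laplacian L = D - A; repeated edges count with multiplicity."""
    L = np.zeros((n, n))
    for u, v in edges:
        L[u, u] += 1.0
        L[v, v] += 1.0
        L[u, v] -= 1.0
        L[v, u] -= 1.0
    return L

def random_cycle_union(n, k, rng):
    """Union of k random Hamiltonian cycles: a 2k-regular multigraph,
    used here as a stand-in for a sparse expander."""
    edges = []
    for _ in range(k):
        perm = rng.permutation(n)
        edges += [(int(perm[i]), int(perm[(i + 1) % n])) for i in range(n)]
    return edges

def lambda2(L):
    """Second smallest eigenvalue of a symmetric matrix."""
    return float(np.linalg.eigvalsh(L)[1])

rng = np.random.default_rng(2)
n, k = 20, 2
d = 2 * k  # degree of the stand-in expander
w = 2.0    # hypothetical weight parameter

L_G = laplacian(n, [(i, i + 1) for i in range(n - 1)])  # sparse input: a path
L_E = laplacian(n, random_cycle_union(n, k, rng))

# Overlay step: L_tilde = (w/d) L_E + (1 - w/d) L_G
L_tilde = (w / d) * L_E + (1.0 - w / d) * L_G

# Projection step: multiply by a Gaussian matrix with N(0, 1/r) entries.
r = 10
M = rng.normal(0.0, 1.0 / np.sqrt(r), size=(r, n))
sanitized = M @ L_tilde

print(lambda2(L_G), lambda2(L_tilde), sanitized.shape)
```

Because the stand-in expander has only $O(n)$ edges, $L_{\tilde{G}}$ stays sparse, which is the point of replacing the complete graph.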

Pictorial View of the Difference in Approaches
[Figure panels: Original Graph; BBDS; This Work (not the complete picture)]
21/25

What About Dense Graphs?
• When the graph has high conductance, apply sparsification techniques followed by random projection
• When the graph has low conductance, overlay a high-conductance graph (complete or sparse), and then apply sparsification techniques followed by random projection
• Either local or global sparsification techniques can be used
Main Lemma: The above sparsification techniques, followed by a JL transform that uses a Gaussian matrix, also preserve differential privacy
22/25

Run Time of Sanitization Algorithms
• Sparsification techniques use time $\tilde{O}(m)$, where $m$ is the number of edges
• For dense weighted graphs, $m = O(n^2)$, so sparsification requires time $\tilde{O}(n^2)$
• The number of entries in the Laplacian of a sparse graph is $\tilde{O}(n)$
• Multiplying the Laplacian of the graph by a Gaussian matrix takes $\tilde{O}(n^2)$
• The total run time of sanitization is $\tilde{O}(n^2)$
23/25
