SLIDE 1

Online Social Networks and Media

Graph Partitioning

SLIDE 2

Introduction

modules, clusters, communities, groups, partitions (more on this today)

SLIDE 3

Outline

PART I

  • 1. Introduction: what, why, types?
  • 2. Cliques and vertex similarity
  • 3. Background: Cluster analysis
  • 4. Hierarchical clustering (betweenness)
  • 5. Modularity
  • 6. How to evaluate (if time allows)

SLIDE 4

Outline

PART II

  • 1. Cuts
  • 2. Spectral Clustering
  • 3. Dense Subgraphs
  • 4. Community Evolution
  • 5. How to evaluate (from Part I)

SLIDE 5

Graph partitioning

The general problem

– Input: a graph G = (V, E)

  • edge (u, v) denotes similarity between u and v
  • weighted graphs: the weight of an edge captures the degree of similarity

Partitioning as an optimization problem:

  • Partition the nodes of the graph such that nodes within clusters are well interconnected (high edge weights), and nodes across clusters are sparsely interconnected (low edge weights)
  • most graph partitioning problems are NP-hard
SLIDE 6

Graph Partitioning

SLIDE 7

Graph Partitioning

Undirected graph G(V, E). Bi-partitioning task:

Divide the vertices into two disjoint groups A, B. How can we define a "good" partition of G? How can we efficiently identify such a partition?

[Figure: example 6-node graph split into groups A and B]

SLIDE 8

Graph Partitioning

What makes a good partition?

  • Maximize the number of within-group connections
  • Minimize the number of between-group connections

[Figure: the 6-node example graph with groups A and B]

SLIDE 9

Graph Cuts

Express partitioning objectives as a function of the "edge cut" of the partition. Cut: the set of edges with only one endpoint in a group:

cut(A, B) = Σ_{i∈A, j∈B} w_ij

[Figure: the 6-node example graph, for which cut(A, B) = 2]

SLIDE 10

An example

SLIDE 11

Min Cut

min-cut: the minimum number of edges such that, when removed, the graph becomes disconnected. It minimizes the number of connections between the partitions:

arg min_{U} cut(U, V−U) = arg min_U Σ_{i∈U} Σ_{j∈V−U} A[i, j]

This problem can be solved in polynomial time by the min-cut/max-flow algorithm.
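As a concrete illustration, here is a minimal sketch of computing a global minimum cut with the Stoer-Wagner algorithm in networkx; the 6-node edge list is an assumed stand-in for the example graph in these slides.

```python
import networkx as nx

# Assumed edge list for a small 6-node example graph (illustration only).
G = nx.Graph([(1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (4, 6), (5, 6)])

# Stoer-Wagner finds a global minimum cut of an undirected graph in
# polynomial time, without fixing a source and a sink.
cut_value, (U, V_minus_U) = nx.stoer_wagner(G)
print(cut_value)      # size of the minimum cut
print(U, V_minus_U)   # the two sides of the partition
```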

SLIDE 12

Min Cut

Problem:

– Only considers external cluster connections
– Does not consider internal cluster connectivity

[Figure: a graph where the minimum cut differs from the "optimal cut"]

SLIDE 13

Graph Bisection

  • Since the minimum cut does not always yield good results, we need extra constraints to make the problem meaningful.
  • Graph Bisection refers to the problem of partitioning the nodes of the graph into two equal-sized sets.
  • Kernighan-Lin algorithm: start with a random equal partition and then swap nodes to improve some quality metric (e.g., cut, modularity, etc.); a usage sketch follows below.
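A quick sketch of the Kernighan-Lin heuristic using networkx's built-in implementation; the example graph is assumed for illustration.

```python
import networkx as nx
from networkx.algorithms.community import kernighan_lin_bisection

# Assumed example graph (same 6-node sketch as before).
G = nx.Graph([(1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (4, 6), (5, 6)])

# Start from a random balanced partition and greedily swap node pairs
# while the cut size improves.
A, B = kernighan_lin_bisection(G, seed=42)
print(sorted(A), sorted(B), nx.cut_size(G, A, B))
```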

SLIDE 14

Ratio Cut

Normalize the cut by the size of the groups:

Ratio-cut(U, V−U) = cut(U, V−U)/|U| + cut(U, V−U)/|V−U|

SLIDE 15

Normalized Cut

Normalized-cut: connectivity between groups relative to the density of each group.

vol(U): total weight of the edges with at least one endpoint in U: vol(U) = Σ_{i∈U} d_i

Normalized-cut(U, V−U) = cut(U, V−U)/vol(U) + cut(U, V−U)/vol(V−U)

Why use these criteria? They produce more balanced partitions.
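Both criteria are easy to compute for a given bipartition. The following sketch mirrors the formulas above using networkx's cut_size and volume helpers; the example graph is assumed.

```python
import networkx as nx

def cut_scores(G, U):
    """Cut, ratio-cut and normalized-cut of the bipartition (U, V - U)."""
    U = set(U)
    W = set(G) - U                      # the complement V - U
    cut = nx.cut_size(G, U, W)          # edges crossing the partition
    ratio_cut = cut / len(U) + cut / len(W)
    # nx.volume(G, S) is the sum of degrees of the nodes in S.
    ncut = cut / nx.volume(G, U) + cut / nx.volume(G, W)
    return cut, ratio_cut, ncut

# Assumed example graph.
G = nx.Graph([(1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (4, 6), (5, 6)])
print(cut_scores(G, {1, 2, 3}))         # cut = 1 for this split
```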

SLIDE 16

Normalized-Cut(Red) = 1/1 + 1/27 = 28/27
Normalized-Cut(Green) = 2/12 + 2/16 = 14/48
Ratio-Cut(Red) = 1/1 + 1/8 = 9/8
Ratio-Cut(Green) = 2/5 + 2/4 = 18/20

Red is the min-cut. The ratio cut already favors Green, and the normalized cut favors Green even more strongly, due to density.

SLIDE 17

An example

Which of the three cuts has the best (min, normalized, ratio) cut?

SLIDE 18

Graph expansion

Graph expansion:

α(G) = min_{U⊆V} cut(U, V−U) / min(|U|, |V−U|)

SLIDE 19

Graph Cuts

Ratio and normalized cuts can be reformulated in matrix format and solved using spectral clustering

SLIDE 20

SPECTRAL CLUSTERING

SLIDE 21

Matrix Representation

Adjacency matrix (A):

– n × n matrix
– A = [a_ij], a_ij = 1 if there is an edge between nodes i and j

Important properties:
– Symmetric matrix
– Eigenvectors are real and orthogonal

[Figure: the 6-node example graph and its 6 × 6 adjacency matrix]

If the graph is weighted, a_ij = w_ij

SLIDE 22

Spectral Graph Partitioning

x is a vector in ℝⁿ with components (x₁, …, x_n)

– Think of it as a label/value on each node of G

  • What is the meaning of y = A·x?

Entry y_i is the sum of the labels x_j of the neighbors j of i

SLIDE 23

Spectral Analysis

The i-th coordinate of A·x:

– the sum of the x-values of the neighbors of i
– make this the new value at node i

Spectral Graph Theory:

– Analyze the "spectrum" of a matrix representing G
– Spectrum: the eigenvectors x_i of the graph, ordered by the magnitude (strength) of their corresponding eigenvalues λ_i

Spectral clustering: use the eigenvectors of A, or of matrices derived from it; most methods are based on the graph Laplacian.

A · x = λ · x

SLIDE 24

Matrix Representation

Degree matrix (D):

– n × n diagonal matrix
– D = [d_ii], d_ii = degree of node i

[Figure: for the 6-node example graph, D = diag(3, 2, 3, 3, 3, 2)]

SLIDE 25

Matrix Representation

Laplacian matrix (L):

– n × n symmetric matrix

L = D − A

[Figure: the 6-node example graph and its Laplacian; the diagonal entries are the degrees (3, 2, 3, 3, 3, 2), and the off-diagonal entries are −1 for each edge]
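A small numpy sketch of the construction L = D − A, with sanity checks matching the properties discussed next; the edge list is an assumed stand-in for the 6-node example.

```python
import numpy as np

# Adjacency matrix of an assumed 6-node example graph (0-based indices).
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]:
    A[i, j] = A[j, i] = 1

D = np.diag(A.sum(axis=1))   # degree matrix
L = D - A                    # graph Laplacian

# Sanity checks: L is symmetric, its rows sum to 0, and the all-ones
# vector is an eigenvector with eigenvalue 0.
assert np.allclose(L, L.T)
assert np.allclose(L @ np.ones(6), 0)
```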

SLIDE 26

Laplacian Matrix properties

  • The matrix L is symmetric and positive semi-definite

– all eigenvalues of L are non-negative

  • The matrix L has 0 as an eigenvalue, with corresponding eigenvector w₁ = (1, 1, …, 1)

– λ₁ = 0 is the smallest eigenvalue

Proof: let w₁ be the column vector with all 1s and show that L·w₁ = 0·w₁

positive semi-definite: zᵀMz is non-negative for every non-zero column vector z

SLIDE 27

The second smallest eigenvalue

The second smallest eigenvalue (also known as the Fiedler value) λ₂ satisfies

λ₂ = min_{x ⊥ w₁, ||x|| = 1} xᵀLx

SLIDE 28

The second smallest eigenvalue

  • For the Laplacian, with x ⊥ w₁ (that is, Σ_i x_i = 0), the expression xᵀLx is:

xᵀLx = Σ_{(i,j)∈E} (x_i − x_j)²

SLIDE 29

The second smallest eigenvalue

λ₂ = min_{x ≠ 0, Σ_i x_i = 0} Σ_{(i,j)∈E} (x_i − x_j)² / Σ_i x_i²

Thus, the eigenvector for eigenvalue λ₂ (called the Fiedler vector) minimizes this expression.

  • Intuitively, the minimum is attained when x_i and x_j are close whenever there is an edge between nodes i and j in the graph.
  • Since Σ_i x_i = 0, x must have some positive and some negative components.
SLIDE 30

Cuts + eigenvalues: intuition

  • Partition the graph by taking one set to be the nodes i whose vector component x_i is positive, and the other set to be the nodes whose component is negative.
  • The cut between the two sets will have a small number of edges, because (x_i − x_j)² is likely to be smaller if x_i and x_j have the same sign than if they have different signs.
  • Thus, minimizing xᵀLx under the required constraints tends to give x_i and x_j the same sign whenever there is an edge (i, j).

SLIDE 31

Example

[Figure: the 6-node example graph]

SLIDE 32

Other properties of L

Let G be an undirected graph with non-negative weights. Then:

  • the multiplicity k of the eigenvalue 0 of L equals the number of connected components A₁, …, A_k in the graph
  • the eigenspace of eigenvalue 0 is spanned by the indicator vectors 1_{A₁}, …, 1_{A_k} of those components

SLIDE 33

Proof (sketch)

If connected (k = 1): 0 = xᵀLx = Σ_{(i,j)∈E} (x_i − x_j)², so x must be constant across the connected graph.

Assume k connected components. Both A and L are block diagonal if we order the vertices by the connected component they belong to (recall the "tile" matrix). Let L_i be the Laplacian of the i-th component. For all block diagonal matrices, the spectrum is the union of the spectra of the blocks, and the corresponding eigenvectors are the eigenvectors of the blocks, filled with 0 at the positions of the other blocks.

SLIDE 34

Cuts + eigenvalues: summary

  • What do we know about x?

– x is a unit vector: Σ_i x_i² = 1
– x is orthogonal to the 1st eigenvector (1, …, 1), thus: Σ_i x_i · 1 = Σ_i x_i = 0

λ₂ = min over all labelings x of the nodes with Σ_i x_i = 0 of Σ_{(i,j)∈E} (x_i − x_j)²

We want to assign values x_i to the nodes so that few edges cross 0 (for an edge (i, j) we want x_i and x_j to cancel each other), balancing the positive and negative values to minimize the objective.

[Figure: x_i and x_j placed on opposite sides of 0 on the real line]

SLIDE 35

Spectral Clustering Algorithms

Three basic stages:

Pre-processing

  • Construct a matrix representation of the graph

Decomposition

  • Compute the eigenvalues and eigenvectors of the matrix
  • Map each point to a lower-dimensional representation based on one or more eigenvectors

Grouping

  • Assign points to two or more clusters, based on the new representation

SLIDE 36

Spectral Partitioning Algorithm

Pre-processing:

– Build the Laplacian matrix L of the graph

Decomposition:

– Find the eigenvalues λ and eigenvectors x of the matrix L
– Map the vertices to the corresponding components of the second eigenvector x₂

How do we now find the clusters?

[Figure: for the 6-node example, the eigenvalues are λ = (0.0, 1.0, 3.0, 3.0, 4.0, 5.0), and the second eigenvector assigns nodes 1–6 the (rounded) values (0.3, 0.6, 0.3, −0.3, −0.3, −0.6)]

SLIDE 37

Spectral Partitioning Algorithm

Grouping:

– Sort the components of the reduced 1-dimensional vector
– Identify clusters by splitting the sorted vector in two

  • How to choose a splitting point?

– Naïve approaches:

  • Split at 0 or at the median value

– More expensive approaches:

  • Attempt to minimize the normalized cut in 1 dimension (sweep over the ordering of nodes induced by the eigenvector)

[Figure: splitting the example at 0 gives cluster A = {1, 2, 3} (positive components) and cluster B = {4, 5, 6} (negative components)]
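Putting the pre-processing, decomposition, and grouping stages together, here is a minimal numpy sketch of spectral bisection with the naive split at 0; the edge list is assumed for illustration.

```python
import numpy as np

def spectral_bisection(A):
    """Bisect a graph by the sign of the Fiedler vector of L = D - A."""
    D = np.diag(A.sum(axis=1))
    L = D - A
    # For symmetric matrices, eigh returns eigenvalues in ascending order,
    # so column 1 of the eigenvector matrix is the Fiedler vector.
    eigvals, eigvecs = np.linalg.eigh(L)
    fiedler = eigvecs[:, 1]
    # Naive split at 0: non-negative components vs. negative components.
    cluster_a = np.where(fiedler >= 0)[0]
    cluster_b = np.where(fiedler < 0)[0]
    return cluster_a, cluster_b, fiedler

# Assumed 6-node example graph (0-based node indices).
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]:
    A[i, j] = A[j, i] = 1
print(spectral_bisection(A))
```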

SLIDE 38

Example: Spectral Partitioning

[Figure: the value of x₂ plotted against the rank in x₂]

SLIDE 39

k-Way Spectral Clustering

How do we partition a graph into k clusters?

  • Recursively apply a bi-partitioning algorithm in a hierarchical divisive manner
  • Disadvantages: inefficient, unstable

SLIDE 40

k-Way Spectral Clustering

Use several of the eigenvectors to partition the graph.

If we use m eigenvectors and set a threshold for each, we can get a partition into 2^m groups, each group consisting of the nodes that are above or below threshold for each of the eigenvectors, in a particular pattern.

SLIDE 41

Example

[Figure: the 6-node example graph] If we use both the 2nd and 3rd eigenvectors: nodes 2 and 3 are grouped together (negative in both), nodes 5 and 6 are grouped together (negative in the 2nd, positive in the 3rd), and nodes 1 and 4 each stand alone.

  • Note that each eigenvector except the first is the vector x that minimizes xᵀLx, subject to the constraint that it is orthogonal to all previous eigenvectors.
  • Thus, while each eigenvector tries to produce a minimum-sized cut, successive eigenvectors have to satisfy more and more constraints, so the cuts get progressively worse.

SLIDE 42

Spectral Clustering

  • Use the eigenvectors of the lowest k eigenvalues of L to construct the n × k matrix that has these eigenvectors as columns
  • The n rows represent the graph vertices as points in a k-dimensional Euclidean space
  • Group these vertices into k clusters using k-means clustering or similar techniques
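A sketch of this k-way recipe: embed with the k lowest eigenvectors of L and cluster the rows with scikit-learn's KMeans (the example graph is assumed).

```python
import numpy as np
from sklearn.cluster import KMeans

def spectral_clustering(A, k):
    """k-way spectral clustering: embed the vertices with the k lowest
    eigenvectors of the Laplacian, then run k-means on the rows."""
    D = np.diag(A.sum(axis=1))
    L = D - A
    eigvals, eigvecs = np.linalg.eigh(L)   # ascending eigenvalues
    X = eigvecs[:, :k]                     # n x k spectral embedding
    return KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)

# Assumed 6-node example graph.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]:
    A[i, j] = A[j, i] = 1
print(spectral_clustering(A, 2))
```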

SLIDE 43

Spectral clustering (besides graphs)

Can be used to cluster any points (not just graph vertices), as long as we have an appropriate similarity matrix, which needs to be symmetric and non-negative. How to construct a graph:

  • ε-neighborhood graph: connect all points whose pairwise distances are smaller than ε
  • k-nearest-neighbor graph: connect each point with its k nearest neighbors
  • fully connected graph: connect all pairs of points, with the weight of edge (i, j) equal to the similarity of i and j
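Minimal numpy sketches of the first two constructions (ε-neighborhood and symmetrized k-nearest-neighbor graphs) on an arbitrary point set:

```python
import numpy as np

def epsilon_graph(X, eps):
    """Adjacency matrix of the eps-neighborhood graph of points X (n x d)."""
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    A = (dist < eps).astype(float)
    np.fill_diagonal(A, 0)                # no self-loops
    return A

def knn_graph(X, k):
    """Adjacency matrix of the (symmetrized) k-nearest-neighbor graph."""
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)        # exclude self as a neighbor
    A = np.zeros_like(dist)
    nn = np.argsort(dist, axis=1)[:, :k]  # indices of the k closest points
    rows = np.repeat(np.arange(len(X)), k)
    A[rows, nn.ravel()] = 1
    return np.maximum(A, A.T)             # symmetrize: keep it undirected

X = np.random.default_rng(0).normal(size=(20, 2))
print(epsilon_graph(X, 1.0).sum(), knn_graph(X, 3).sum())
```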

SLIDE 44

Summary

  • The values of x minimize

min_{x ≠ 0, Σ_i x_i = 0} Σ_{(i,j)∈E} (x_i − x_j)² / Σ_i x_i²

  • For weighted matrices:

min_{x ≠ 0, Σ_i x_i = 0} Σ_{(i,j)∈E} A[i, j] (x_i − x_j)² / Σ_i x_i²

  • The ordering according to the x_i values will group similar (connected) nodes together
  • Physical interpretation: the stable state of springs placed on the edges of the graph

SLIDE 45

Normalized Graph Laplacians

L_sym = D^(−1/2) L D^(−1/2) = I − D^(−1/2) W D^(−1/2)

L_rw = D^(−1) L = I − D^(−1) W

xᵀ L_sym x = Σ_{(i,j)∈E} (x_i/√d_i − x_j/√d_j)²

L_rw is closely connected to random walks (to be discussed in future lectures)
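Both normalized Laplacians follow directly from W in a few lines of numpy (assuming no isolated nodes, so that D is invertible):

```python
import numpy as np

def normalized_laplacians(W):
    """Return (L_sym, L_rw) for a weighted adjacency matrix W."""
    d = W.sum(axis=1)                    # (weighted) degrees
    L = np.diag(d) - W                   # unnormalized Laplacian
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    D_inv = np.diag(1.0 / d)
    L_sym = D_inv_sqrt @ L @ D_inv_sqrt  # symmetric normalization
    L_rw = D_inv @ L                     # random-walk normalization
    return L_sym, L_rw

W = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
L_sym, L_rw = normalized_laplacians(W)
print(np.round(L_sym, 2))
print(np.round(L_rw, 2))
```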

SLIDE 46

Cuts and spectral clustering

Relaxing Ncut leads to normalized spectral clustering, while relaxing RatioCut leads to unnormalized spectral clustering

SLIDE 47

Finding an Optimal Cut (sketch)

  • Express a partition (A, B) as a vector:

y_i = +1 if i ∈ A, −1 if i ∈ B

  • We can minimize the cut of the partition by finding a non-trivial vector y that minimizes f(y) = Σ_{(i,j)∈E} (y_i − y_j)²

[Figure: node values y_i = −1 and y_j = +1 on either side of 0]

We cannot solve this exactly. Let us relax y and allow it to take any real value (instead of just the two values ±1).

SLIDE 48

Finding an Optimal Cut (sketch)

Rayleigh Theorem:

  • λ₂ = min_y f(y): the minimum value of f(y) is given by the 2nd smallest eigenvalue λ₂ of the Laplacian matrix L
  • x = arg min_y f(y): the optimal solution for y is given by the corresponding eigenvector x, referred to as the Fiedler vector

SLIDE 49

Finding an Optimal Cut (sketch)

We need to re-transform the real-valued solution vector f of the relaxed problem into a discrete indicator vector. The simplest way is to use the sign of f_i. Alternatively, consider the coordinates f_i as points in ℝ and cluster them into two groups using the k-means clustering algorithm.

SLIDE 50

Spectral partition

  • Partition the nodes according to the ordering induced by the Fiedler vector
  • If u = (u₁, u₂, …, u_n) is the Fiedler vector, then split the nodes according to a threshold value s:

– bisection: s is the median value in u
– ratio cut: s is the value that minimizes the expansion α
– sign: separate positive and negative values (s = 0)
– gap: separate according to the largest gap in the values of u

  • This works well (provably, for special cases)
SLIDE 51

Fiedler Value

  • The value λ₂ is a good approximation of the graph expansion α
  • For the minimum ratio cut of the Fiedler vector we have:

λ₂ / 2 ≤ α ≤ √(λ₂ (2·d_max − λ₂)) ≤ √(2·d_max·λ₂)    (d_max = maximum degree)

  • If the maximum degree d_max is bounded, we obtain a good approximation of the minimum expansion cut

SLIDE 52

Approx. Guarantee of Spectral (proof)

  • Suppose there is a partition of G into A and B, where |A| ≤ |B|, s.t. α = (# edges from A to B) / |A|. Then 2α ≥ λ₂.

– This is the approximation guarantee of spectral clustering: the cut spectral finds is at most a factor 2 away from the optimal one of score α.

  • Proof:

– Let a = |A|, b = |B|, and e = # edges from A to B
– It is enough to choose some x_i based on A and B such that:

λ₂ ≤ Σ_{(i,j)∈E} (x_i − x_j)² / Σ_i x_i² ≤ 2α   (while also Σ_i x_i = 0)

(λ₂ is only smaller than the value attained by any such x)

SLIDE 53

Approx. Guarantee of Spectral

  • Proof (continued):

(1) Set: x_i = −1/a if i ∈ A, and x_i = +1/b if i ∈ B

  • Let's quickly verify that Σ_i x_i = 0: a·(−1/a) + b·(1/b) = 0

(2) Then:

Σ_{(i,j)∈E} (x_i − x_j)² / Σ_i x_i²
= Σ_{i∈A, j∈B, (i,j)∈E} (1/a + 1/b)² / (a·(1/a)² + b·(1/b)²)
= e·(1/a + 1/b)² / (1/a + 1/b)
= e·(1/a + 1/b)
≤ e·(1/a + 1/a)    (since a ≤ b)
= e·(2/a) = 2α

where e is the number of edges between A and B. This proves that the cost achieved by spectral is at most twice the OPT cost.

SLIDE 54

Approx. Guarantee of Spectral

  • Putting it all together:

2α ≥ λ₂ ≥ α² / (2·d_max)

– where d_max is the maximum node degree in the graph

  • Note we only proved the 1st part: 2α ≥ λ₂
  • We did not prove λ₂ ≥ α² / (2·d_max)

– Overall this certifies that λ₂ always gives a useful bound

SLIDE 55

MAXIMUM DENSEST SUBGRAPH

Thanks to Aris Gionis

SLIDE 56

Finding dense subgraphs

  • Dense subgraph: a collection of vertices such that there are a lot of edges between them

– E.g., find the subset of email users that talk the most among themselves
– Or, find the subset of genes that are most commonly expressed together

  • Similar to community identification, but we do not require that the dense subgraph is sparsely connected with the rest of the graph.

SLIDE 57

Definitions

  • Input: undirected graph G = (V, E)
  • Degree of node u: deg(u)
  • For two sets S ⊆ V and T ⊆ V:

E(S, T) = {(u, v) ∈ E : u ∈ S, v ∈ T}

  • E(S) = E(S, S): edges within the nodes in S
  • Graph cut defined by the nodes in S ⊆ V:

E(S, S̄): edges between S and the rest of the graph

  • Induced subgraph of set S: G_S = (S, E(S))
SLIDE 58

Definitions

  • How do we define the density of a subgraph?
  • Average degree:

d(S) = 2|E(S)| / |S|

  • Problem: given a graph G, find the subset S that maximizes the density d(S)

– Surprisingly, there is a polynomial-time algorithm for this problem.

SLIDE 59

Min-Cut Problem

Given a graph* G = (V, E), a source vertex s ∈ V, and a destination vertex t ∈ V, find a set S ⊆ V such that s ∈ S and t ∈ S̄, that minimizes E(S, S̄).

* The graph may be weighted

Min-Cut = Max-Flow: the minimum cut maximizes the flow that can be sent from s to t. There is a polynomial-time solution.

SLIDE 60

Decision problem

  • Consider the decision problem:

– Is there a set S with d(S) ≥ d?

d(S) ≥ d
⇔ 2|E(S)| ≥ d|S|
⇔ Σ_{u∈S} deg(u) − E(S, S̄) ≥ d|S|
⇔ 2|E| − Σ_{u∈S̄} deg(u) − E(S, S̄) ≥ d|S|
⇔ Σ_{u∈S̄} deg(u) + E(S, S̄) + d|S| ≤ 2|E|

SLIDE 61

Transform to min-cut

  • For a value d we do the following transformation: add a source s connected to every node u with capacity deg(u), connect every node u to a sink t with capacity d, and give each original edge capacity 1
  • We ask for a min s-t cut in the new graph
SLIDE 62

Transformation to min-cut

  • There is a cut that has value 2|E|: the cut that separates s from everything else has capacity Σ_{u∈V} deg(u) = 2|E|
SLIDE 63

Transformation to min-cut

  • Every other cut ({s} ∪ S, S̄ ∪ {t}) has value:

Σ_{u∈S̄} deg(u) + E(S, S̄) + d|S|

SLIDE 64

Transformation to min-cut

  • If

Σ_{u∈S̄} deg(u) + E(S, S̄) + d|S| ≤ 2|E|

then S ≠ ∅ and d(S) ≥ d

SLIDE 65

Algorithm (Goldberg)

Given the input graph G, and a value d:

  • 1. Create the min-cut instance graph
  • 2. Compute the min-cut
  • 3. If the set S is not empty, return YES
  • 4. Else return NO

How do we find the set with maximum density? (e.g., by searching over values of d; see the sketch below)
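A minimal sketch of the decision step using networkx max-flow, following the construction of slides 61-64; the floating-point tolerance and the closing note about searching over d are illustrative assumptions, not part of the original statement.

```python
import networkx as nx

def densest_at_least(G, d):
    """Decision sketch: return a set S with density d(S) >= d if the
    min-cut certificate finds one, else None (construction of slides 61-64)."""
    H = nx.DiGraph()
    m = G.number_of_edges()
    for u, v in G.edges():
        H.add_edge(u, v, capacity=1.0)   # each original edge: capacity 1
        H.add_edge(v, u, capacity=1.0)   # (both directions)
    for u in G.nodes():
        H.add_edge('s', u, capacity=float(G.degree(u)))  # s -> u: deg(u)
        H.add_edge(u, 't', capacity=float(d))            # u -> t: d
    cut_value, (source_side, _) = nx.minimum_cut(H, 's', 't')
    S = set(source_side) - {'s'}
    # A non-trivial cut of value <= 2|E| certifies density >= d
    # (tolerance added for float capacities; an assumption of this sketch).
    return S if S and cut_value <= 2 * m + 1e-9 else None

G = nx.Graph([(1, 2), (1, 3), (2, 3), (3, 4)])
print(densest_at_least(G, 1.5))   # the triangle {1, 2, 3} has density 2
```

Since the answer to the decision problem is monotone in d, the maximum density can then be located by (binary) searching over candidate values of d, invoking this check at each step.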

SLIDE 66

Min-cut algorithm

  • The min-cut algorithm finds the optimal solution in polynomial time O(nm), but this is too expensive for real networks.
  • We will now describe a simpler approximation algorithm that is very fast.

– Approximation algorithm: the ratio of the density of the set produced by our algorithm to that of the optimal is bounded.

  • We will show that the ratio is at most 1/2.
  • The optimal set is at most twice as dense as that of the approximation algorithm.
  • Any ideas for the algorithm?
SLIDE 67

Greedy Algorithm

Given the graph G = (V, E):

  • 1. S₀ = V
  • 2. For i = 1 … |V|:
  •   a. Find the node v ∈ S_{i−1} with minimum degree in the induced subgraph
  •   b. S_i = S_{i−1} ∖ {v}
  • 3. Output the densest of the sets S_i (a runnable sketch follows below)
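A direct sketch of this greedy peeling procedure (a simple, unoptimized version using networkx; the example graph is assumed):

```python
import networkx as nx

def greedy_densest(G):
    """Greedy peeling: repeatedly remove a minimum-degree node and
    return the densest intermediate subgraph (a 2-approximation)."""
    H = G.copy()
    best_density, best_nodes = 0.0, set()
    while H.number_of_nodes() > 0:
        density = 2.0 * H.number_of_edges() / H.number_of_nodes()
        if density > best_density:
            best_density, best_nodes = density, set(H.nodes())
        v = min(H.nodes(), key=H.degree)   # minimum-degree node
        H.remove_node(v)
    return best_nodes, best_density

# Assumed example: a 4-clique {1, 2, 3, 4} with a pendant node 5.
G = nx.Graph([(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4), (4, 5)])
print(greedy_densest(G))   # expect {1, 2, 3, 4} with density 3.0
```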
SLIDE 68

Example

SLIDE 69

Analysis

  • We will prove that the optimal set has density at most 2 times that of the set produced by the Greedy algorithm.
  • Density of the optimal set: d_opt = max_{S⊆V} d(S)
  • Density of the greedy algorithm's set: d_g
  • We want to show that d_opt ≤ 2·d_g
SLIDE 70

Upper bound

  • We will first upper-bound the optimal solution.
  • Assume an arbitrary assignment of each edge (u, v) to either u or v.
  • Define:

– IN(u) = # edges assigned to u
– Δ = max_{u∈V} IN(u)

  • We can prove that:

– d_opt ≤ 2·Δ

This is true for any assignment of the edges!
SLIDE 71

Lower bound

  • We will now prove a lower bound on the density of the set produced by the greedy algorithm.
  • For the lower bound we consider a specific assignment of the edges, created as the greedy algorithm progresses:

– When removing node u from S, assign all of u's remaining edges to u

  • So: IN(u) = degree of u in S ≤ d(S) ≤ d_g (the minimum degree is at most the average degree)
  • This is true for all u, so Δ ≤ d_g
  • It follows that d_opt ≤ 2·d_g
SLIDE 72

The k-densest subgraph

  • The k-densest subgraph problem: find a set S of k nodes such that the density d(S) is maximized.

– The k-densest subgraph problem is NP-hard!
SLIDE 73

QUANTIFYING SOCIAL GROUP EVOLUTION

G. Palla, A.-L. Barabási, T. Vicsek, Nature 446 (7136), 664-667

SLIDE 74

Datasets

  • monthly list of articles in the Cornell University Library e-print condensed matter (cond-mat) archive, spanning 142 months, with over 30,000 authors
  • phone calls between the customers of a mobile phone company, spanning 52 weeks (accumulated over two-week-long periods), containing the communication patterns of over 4 million users

SLIDE 75

Datasets

[Figure: black nodes/edges do not belong to any community; nodes that belong to two or more communities are shown in red]

SLIDE 76

Datasets

Different local structure:

  • Co-authorship: dense network with significant overlap among communities (the co-authors of an article form cliques)
  • Phone-call: communities are less interconnected, often separated by one or more inter-community nodes/edges
  • Phone-call: the links correspond to instant communication events, whereas in co-authorship they represent long-term collaborations. These fundamental differences suggest that any common features represent potentially generic characteristics.

SLIDE 77

Approach

  • Communities at each time step are extracted using the clique percolation method (CPM)
  • Why CPM? Community members can be reached through well-connected subsets of nodes, and communities may overlap
  • Parameters:

– k = 4
– weighted graph: use a weight threshold w* (links weaker than w* are ignored)

SLIDE 78

Basic Events

SLIDE 79

Identifying Events

For each pair of consecutive time steps t and t+1, construct a joint graph consisting of the union of links from the corresponding two networks, and extract the CPM community structure of this joint network.

  • Any community from either the t or the t+1 snapshot is contained in exactly one community of the joint graph
  • If a community of the joint graph contains a single community from t and a single community from t+1, then they are matched
  • If the joint community contains more than one community from either time step, the communities are matched in descending order of their relative node overlap (a matching sketch follows below)
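A small illustrative sketch of the overlap-based matching step; the greedy one-to-one matching and the Jaccard form of "relative node overlap" are assumptions made for illustration.

```python
def match_communities(old, new):
    """Match communities between snapshots t and t+1 in descending order
    of relative node overlap. `old` and `new` are lists of node sets,
    assumed to lie inside the same joint-graph community."""
    pairs = []
    for i, a in enumerate(old):
        for j, b in enumerate(new):
            overlap = len(a & b) / len(a | b)   # relative node overlap
            if overlap > 0:
                pairs.append((overlap, i, j))
    matched_old, matched_new, matches = set(), set(), []
    # Greedily take the strongest remaining overlap first.
    for overlap, i, j in sorted(pairs, reverse=True):
        if i not in matched_old and j not in matched_new:
            matches.append((i, j, overlap))
            matched_old.add(i)
            matched_new.add(j)
    return matches

print(match_communities([{1, 2, 3}, {4, 5}], [{1, 2, 3, 6}, {4, 7}]))
```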

SLIDE 80

Results

s: size; t: age. s and t are positively correlated: larger communities are on average older.

[Figure: community size s versus age t]

SLIDE 81

Results

Auto-correlation function (where A(t) denotes the members of community A at time t):

  • the collaboration network is more "dynamic" (its auto-correlation decays faster)
  • in both networks, the auto-correlation function decays faster for the larger communities, showing that the membership of the larger communities changes at a higher rate

SLIDE 82

Results

1−ζ: the average ratio of members changed in one step; ζ: stationarity; τ*: lifetime

The average life-span ⟨τ*⟩ (colour coded) as a function of ζ and s:

  • for small communities the optimal ζ is near 1: it is better to have a static, time-independent membership
  • for large communities, the peak is shifted towards low ζ values: it is better to have a continually changing membership

[Figure: panels for the phone-call and co-authorship networks]

SLIDE 83

Results

SLIDE 84

Results

SLIDE 85

Can we predict the evolution?

w_out: an individual's commitment outside the community; w_in: an individual's commitment inside the community; p: probability that the individual abandons the community

SLIDE 86

Can we predict the evolution?

W_out: total weight of links to nodes outside the community; W_in: total weight of links inside the community; p: probability that the community disintegrates in the next step. For co-authorship, the maximum lifetime occurs at intermediate values.

SLIDE 87

Conclusions

Significant difference between smaller collaborative or friendship circles and institutions:

  • At the heart of small communities are a few strong relationships, and as long as these persist, the community around them is stable.
  • The condition for the stability of large communities is continuous change, so that after some time practically all members have been exchanged.
  • Loose, rapidly changing communities are reminiscent of institutions, which can continue to exist even after all of their members have been replaced by new members (e.g., the members of a school).

SLIDE 88

Basic References

  • Jure Leskovec, Anand Rajaraman, Jeff Ullman, Mining of Massive Datasets, Chapter 10, http://www.mmds.org/
  • Reza Zafarani, Mohammad Ali Abbasi, Huan Liu, Social Media Mining: An Introduction, Chapter 6, http://dmml.asu.edu/smm/
  • Santo Fortunato, Community detection in graphs, CoRR abs/0906.0612v2 (2010)
  • Ulrike von Luxburg, A Tutorial on Spectral Clustering, CoRR abs/0711.0189 (2007)
  • G. Palla, A.-L. Barabási, T. Vicsek, Quantifying Social Group Evolution, Nature 446 (7136), 664-667

SLIDE 89

Questions?