


Prediction for Processes on Network Graphs

Gonzalo Mateos

Dept. of ECE and Goergen Institute for Data Science
University of Rochester
gmateosb@ece.rochester.edu
http://www.ece.rochester.edu/~gmateosb/

April 18, 2019

Network Science Analytics Prediction for Processes on Network Graphs 1


Nearest neighbors

Nearest-neighbor prediction
Markov random fields
Kernel regression on graphs
Case study: Predicting protein function


Processes on network graphs

◮ Motivation: study complex systems of elements and their interactions

◮ So far studied network graphs as representations of these systems

◮ Often some quantity associated with each of the elements is of interest
◮ Quantities may be influenced by the interactions among elements

1) Behaviors and beliefs influenced by social interactions
2) Functional roles of proteins influenced by their sequence similarity
3) Spread of epidemics influenced by proximity of individuals

◮ Can think of these quantities as random processes defined on graphs

◮ Static processes $\{X_i\}_{i \in V}$ and dynamic processes $\{X_i(t)\}_{i \in V}$, for $t \in \mathbb{N}$ or $\mathbb{R}_+$


Nearest-neighbor prediction

◮ Consider prediction of a static process X := {Xi}i∈V on a graph

◮ Process may be truly static, or a snapshot of a dynamic process

Static network process prediction: predict $X_i$, given observations of the adjacency matrix $\mathbf{Y} = \mathbf{y}$ and of all attributes $\mathbf{X}^{(-i)} = \mathbf{x}^{(-i)}$ but $X_i$.

◮ Idea: exploit the network graph structure in $\mathbf{y}$ for prediction
◮ For binary $X_i \in \{0, 1\}$, say, the simple nearest-neighbor method predicts
$$\hat{X}_i = \mathbb{I}\left\{ \frac{\sum_{j \in N_i} x_j}{|N_i|} > \tau \right\}$$
⇒ Average of the observed process in the neighborhood of $i$, thresholded at $\tau$
⇒ Called ‘guilt-by-association’ or graph-smoothing method
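The thresholding rule above fits in a few lines; a minimal NumPy sketch (the function name `nn_predict` and the toy graph are illustrative, not from the slides):

```python
import numpy as np

def nn_predict(A, x, i, tau=0.5):
    """Nearest-neighbor ('guilt-by-association') prediction: threshold
    the average of the observed attribute over i's neighborhood at tau."""
    neighbors = np.flatnonzero(A[i])       # N_i read off the adjacency matrix
    if neighbors.size == 0:
        return 0                           # isolated vertex: default prediction
    return int(x[neighbors].mean() > tau)

# Toy graph: vertex 0 is linked to vertices 1, 2, 3 with attributes 1, 1, 0
A = np.array([[0, 1, 1, 1],
              [1, 0, 0, 0],
              [1, 0, 0, 0],
              [1, 0, 0, 0]])
x = np.array([0, 1, 1, 0])
print(nn_predict(A, x, 0))  # neighborhood average 2/3 > 0.5, so predicts 1
```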


Example: predicting law practice

◮ Network $G^{obs}$ of working relationships among lawyers [Lazega ’01]

◮ Nodes are Nv = 36 partners, edges indicate partners worked together

[Figure: lawyer collaboration network; nodes are the 36 partners, labeled 1–36]

◮ Data includes various node-level attributes {Xi}i∈V including

⇒ Type of practice, i.e., litigation (red) and corporate (cyan)

◮ Suspect lawyers collaborate more with peers in same legal practice

⇒ Knowledge of collaboration useful in predicting type of practice


Example: predicting law practice (cont.)

◮ Q: In predicting practice Xi, how useful is the value of one neighbor?

⇒ Breakdown of the 115 edges based on the practice of the incident lawyers:

             | Litigation | Corporate
  Litigation |     29     |    43
  Corporate  |     43     |    43

◮ Looking at the rows in this table:
◮ Litigation lawyers’ collaborators are 40% litigation, 60% corporate
◮ Collaborations of corporate lawyers are evenly split

⇒ Suggests that a single neighbor has little predictive power

◮ But roughly 63% of edges (29 + 43 = 72 of 115) join lawyers with a common practice

⇒ Suggests that, in aggregate, knowledge of collaboration is informative


Example: predicting law practice (cont.)

◮ Incorporate information of all collaborators as in nearest-neighbors

◮ Let Xi = 0 if lawyer i practices litigation, and Xi = 1 for corporate

[Figure: histograms of the fraction of corporate neighbors, among litigation lawyers (left) and among corporate lawyers (right)]

◮ Nearest-neighbor prediction rule
$$\hat{X}_i = \mathbb{I}\left\{ \frac{\sum_{j \in N_i} x_j}{|N_i|} > 0.5 \right\}$$
⇒ Infers correctly 13 of the 16 corporate lawyers (i.e., 81%)
⇒ Infers correctly 16 of the 18 litigation lawyers (i.e., 89%)
⇒ Overall error rate is just under 15%


Modeling static network processes

◮ Nearest-neighbor methods may seem rather informal and simple

⇒ But competitive with more formal, model-based approaches

◮ Still, model-based methods have certain potential advantages:

a) Probabilistically rigorous predictive statements;
b) Formal inference for model parameters; and
c) Natural mechanisms for handling missing data

◮ Model the process X := {Xi}i∈V given an observed graph Y = y

⇒ Markov random field (MRF) models
⇒ Kernel-regression models using graph kernels


Markov random fields

Nearest-neighbor prediction
Markov random fields
Kernel regression on graphs
Case study: Predicting protein function


Markov random field models

◮ Consider a graph G(V , E) with given adjacency matrix A

⇒ Collection of discrete RVs X = [X1, . . . , XNv ]⊤ defined on V

◮ Def: process $\mathbf{X}$ is a Markov random field (MRF) on $G$ if
$$P\left(X_i = x_i \,\middle|\, \mathbf{X}^{(-i)} = \mathbf{x}^{(-i)}\right) = P\left(X_i = x_i \,\middle|\, \mathbf{X}_{N_i} = \mathbf{x}_{N_i}\right), \quad i \in V$$

◮ $X_i$ conditionally independent of all other $X_k$, given the neighbors’ values
◮ ‘Spatial’ Markov property, generalizing Markov chains in time
◮ $G$ defines the neighborhoods $N_i$, hence the dependencies

◮ Roots in statistical mechanics, Ising model of ferromagnetism [Ising ’25]

⇒ MRFs used extensively in spatial statistics and image analysis

◮ Definition requires a technical condition P (X = x) > 0, for all x


MRFs and Gibbs random fields

◮ MRFs equivalent to Gibbs random fields $\mathbf{X}$, having joint distribution
$$P(\mathbf{X} = \mathbf{x}) = \frac{1}{\kappa}\exp\{U(\mathbf{x})\}$$
⇒ Energy function $U(\cdot)$, partition function $\kappa = \sum_{\mathbf{x}} \exp\{U(\mathbf{x})\}$
⇒ Equivalence follows from the Hammersley-Clifford theorem

◮ Energy function decomposable over the maximal cliques in $G$
$$U(\mathbf{x}) = \sum_{c \in \mathcal{C}} U_c(\mathbf{x})$$
⇒ Clique potentials $U_c(\cdot)$ defined over the set $\mathcal{C}$ of maximal cliques in $G$

◮ Can show $P\left(X_i \,\middle|\, \mathbf{X}^{(-i)}\right)$ depends only on the cliques involving vertex $i$


Example: auto-logistic MRFs

◮ May specify MRFs through the choice of clique potentials $U_c(\cdot)$
◮ Ex: the class of auto models is defined through the constraints:
(i) Only cliques $c \in \mathcal{C}$ of size one and two have $U_c \not\equiv 0$
(ii) Probabilities $P\left(X_i \,\middle|\, \mathbf{X}_{N_i}\right)$ have an exponential family form

◮ For binary RVs $X_i \in \{0, 1\}$, the energy function takes the form
$$U(\mathbf{x}) = \sum_{i \in V} \alpha_i x_i + \sum_{(i,j) \in E} \beta_{ij} x_i x_j$$

◮ The resulting MRF is known as the auto-logistic model, because
$$P\left(X_i = 1 \,\middle|\, \mathbf{X}_{N_i} = \mathbf{x}_{N_i}\right) = \frac{\exp\{\alpha_i + \sum_{j \in N_i} \beta_{ij} x_j\}}{1 + \exp\{\alpha_i + \sum_{j \in N_i} \beta_{ij} x_j\}}$$
⇒ Logistic regression of $x_i$ on its neighboring $x_j$’s
⇒ Ising model a special case, when $G$ is a regular lattice
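The conditional above is just a logistic function of the neighborhood sum; a small sketch for the homogeneous case $\alpha_i = \alpha$, $\beta_{ij} = \beta$ (function name illustrative):

```python
import math

def autologistic_conditional(alpha, beta, neighbor_values):
    """P(X_i = 1 | X_{N_i} = x_{N_i}) for a homogeneous auto-logistic MRF:
    logistic function of alpha + beta * (sum of neighboring attributes)."""
    eta = alpha + beta * sum(neighbor_values)
    return math.exp(eta) / (1.0 + math.exp(eta))

print(autologistic_conditional(0.0, 1.0, [1, 1, 0]))  # sigmoid(2) ≈ 0.881
```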


Homogeneity assumptions

◮ Typical to assume that the parameters $\alpha_i$ and $\beta_{ij}$ are homogeneous
◮ Ex: specifying $\alpha_i = \alpha$ and $\beta_{ij} = \beta$ yields the conditional log-odds
$$\log \frac{P\left(X_i = 1 \,\middle|\, \mathbf{X}_{N_i} = \mathbf{x}_{N_i}\right)}{P\left(X_i = 0 \,\middle|\, \mathbf{X}_{N_i} = \mathbf{x}_{N_i}\right)} = \alpha + \beta \sum_{j \in N_i} x_j$$
⇒ Linear in the number of neighbors $j$ of $i$ with $X_j = 1$

◮ Ex: specifying $\alpha_i = \alpha + |N_i|\beta_2$ and $\beta_{ij} = \beta_1 - \beta_2$ yields
$$\log \frac{P\left(X_i = 1 \,\middle|\, \mathbf{X}_{N_i} = \mathbf{x}_{N_i}\right)}{P\left(X_i = 0 \,\middle|\, \mathbf{X}_{N_i} = \mathbf{x}_{N_i}\right)} = \alpha + \beta_1 \sum_{j \in N_i} x_j + \beta_2 \sum_{j \in N_i} (1 - x_j)$$
⇒ Linear also in the number of neighbors $j$ of $i$ with $X_j = 0$


MRFs for continuous random variables

◮ MRFs with continuous RVs: replace PMFs/sums with pdfs/integrals

⇒ Gaussian distribution common for analytical tractability

◮ Ex: the auto-Gaussian model specifies Gaussian $X_i \mid \mathbf{X}_{N_i} = \mathbf{x}_{N_i}$, with
$$\mathbb{E}\left[X_i \,\middle|\, \mathbf{X}_{N_i} = \mathbf{x}_{N_i}\right] = \alpha_i + \sum_{j \in N_i} \beta_{ij}(x_j - \alpha_j), \qquad \text{var}\left[X_i \,\middle|\, \mathbf{X}_{N_i} = \mathbf{x}_{N_i}\right] = \sigma^2$$
⇒ Values $X_i$ modeled as weighted combinations of $i$’s neighbors

◮ Let $\boldsymbol{\mu} = [\alpha_1, \ldots, \alpha_{N_v}]^\top$ and $\boldsymbol{\Sigma} = \sigma^2(\mathbf{I} - \mathbf{B})^{-1}$, where $\mathbf{B} = [\beta_{ij}]$
⇒ Under $\beta_{ii} = 0$ and $\beta_{ij} = \beta_{ji}$ → $\mathbf{X} \sim \mathcal{N}(\boldsymbol{\mu}, \boldsymbol{\Sigma})$

◮ Homogeneity assumptions can be imposed, simplifying expressions
⇒ Further set $\alpha_i = \alpha$ and $\beta_{ij} = \beta$ → $\mathbf{X} \sim \mathcal{N}(\alpha\mathbf{1}, \sigma^2(\mathbf{I} - \beta\mathbf{A})^{-1})$
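A quick NumPy sketch of the homogeneous joint covariance $\sigma^2(\mathbf{I} - \beta\mathbf{A})^{-1}$, with an explicit check that $\mathbf{I} - \beta\mathbf{A}$ is positive definite (which holds whenever $|\beta|$ is below the reciprocal of the spectral radius of $\mathbf{A}$); the function name is illustrative:

```python
import numpy as np

def auto_gaussian_cov(A, beta, sigma2=1.0):
    """Joint covariance sigma^2 (I - beta A)^{-1} of the homogeneous
    auto-Gaussian MRF; requires I - beta A positive definite."""
    P = np.eye(A.shape[0]) - beta * A
    if np.min(np.linalg.eigvalsh(P)) <= 0:
        raise ValueError("I - beta*A is not positive definite")
    return sigma2 * np.linalg.inv(P)

# 4-cycle: spectral radius of A is 2, so any |beta| < 0.5 is valid
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
Sigma = auto_gaussian_cov(A, beta=0.25)
print(np.allclose(Sigma, Sigma.T))  # a valid covariance: symmetric (and PD)
```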


Inference and prediction for MRFs

◮ In studying the process $\mathbf{X} = \{X_i\}_{i \in V}$, of interest to predict some or all of $\mathbf{X}$
◮ The MRF models we have seen for this purpose are of the form
$$P_\theta(\mathbf{X} = \mathbf{x}) = \frac{1}{\kappa(\theta)}\exp\{U(\mathbf{x}; \theta)\}$$
⇒ Parameter $\theta$ low-dimensional, e.g., $\theta = [\alpha, \beta]$ in auto models

◮ Predictions can be generated based on the distribution $P_\theta(\cdot)$
⇒ Knowledge of $\theta$ is necessary, and typically $\theta$ is unknown

◮ Unlike nearest-neighbor prediction, MRFs require inference of $\theta$ first


Inference for MRFs

◮ Estimation of $\theta$ most naturally approached via maximum likelihood
◮ Even though the log-likelihood function takes a simple form
$$\ell(\theta) = \log P_\theta(\mathbf{X} = \mathbf{x}) = U(\mathbf{x}; \theta) - \log \kappa(\theta)$$
⇒ Computing $\kappa(\theta) = \sum_{\mathbf{x}} \exp\{U(\mathbf{x}; \theta)\}$ is often intractable

◮ A popular alternative is maximum pseudo-likelihood, i.e., maximize
$$\sum_{i \in V} \log P_\theta\left(X_i = x_i \,\middle|\, \mathbf{X}^{(-i)} = \mathbf{x}^{(-i)}\right)$$
⇒ Ignores dependencies beyond the neighborhood of each $X_i$
⇒ Probabilities depend on the clique potentials $U_c$, not on $\kappa(\theta)$


Gibbs sampler

◮ Given a value of $\theta$, consider predicting some or all of $\mathbf{X}$ from $P_\theta(\cdot)$
⇒ Computing $P_\theta(\cdot)$ is hard, but can draw from it using a Gibbs sampler

◮ The Gibbs sampler exploits that $P_\theta\left(X_i \,\middle|\, \mathbf{X}^{(-i)} = \mathbf{x}^{(-i)}\right)$ has a simple closed form
◮ New value $\mathbf{X}_{(k)}$ obtained from $\mathbf{X}_{(k-1)} = \mathbf{x}_{(k-1)}$ by drawing
$$X_{1,(k)} \sim P_\theta\left(X_1 \,\middle|\, \mathbf{X}^{(-1)} = \mathbf{x}^{(-1)}_{(k-1)}\right)$$
$$\vdots$$
$$X_{N_v,(k)} \sim P_\theta\left(X_{N_v} \,\middle|\, \mathbf{X}^{(-N_v)} = \mathbf{x}^{(-N_v)}_{(k-1)}\right)$$
⇒ The generated sequence $\mathbf{X}_{(1)}, \mathbf{X}_{(2)}, \ldots$ forms a Markov chain

◮ Under appropriate conditions, stationary distribution equals Pθ(·)
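For the binary auto-logistic model, the closed-form conditionals make the vertex sweep explicit; a sketch of one such sampler under homogeneous $\alpha$, $\beta$ (function name and toy setup are illustrative):

```python
import numpy as np

def gibbs_autologistic(A, alpha, beta, n_iter=1000, seed=None):
    """Gibbs sampler for the binary auto-logistic MRF: sweep the vertices,
    redrawing each X_i from P(X_i = 1 | x_{N_i}), a logistic function of
    the current neighborhood sum."""
    rng = np.random.default_rng(seed)
    Nv = A.shape[0]
    x = rng.integers(0, 2, size=Nv)
    chain = []
    for _ in range(n_iter):
        for i in range(Nv):
            eta = alpha + beta * (A[i] @ x)   # alpha + beta * sum_{j in N_i} x_j
            p1 = 1.0 / (1.0 + np.exp(-eta))   # closed-form conditional
            x[i] = int(rng.random() < p1)
        chain.append(x.copy())
    return np.array(chain)                    # the Markov chain of states

# Empirical marginal frequencies over the tail of the chain approximate P_theta
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
chain = gibbs_autologistic(A, alpha=0.0, beta=0.5, n_iter=500, seed=0)
print(chain[250:].mean(axis=0))  # per-vertex estimates of P(X_i = 1)
```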


Prediction with MRFs

◮ Given a large sample from $P_\theta(\cdot)$, predict $\mathbf{X}$ using empirical distributions
◮ Ex: for binary $\mathbf{X}$, use empirical marginal frequencies to predict $X_i$, i.e.,
$$\hat{X}_i = \mathbb{I}\left\{ \frac{1}{n}\sum_{k=m+1}^{m+n} X_{i,(k)} > 0.5 \right\} \quad \text{for large } m, n$$

◮ Suppose we observe some elements $\mathbf{X}^{obs} = \mathbf{x}^{obs}$, and wish to predict $\mathbf{X}^{miss}$
⇒ Draw from the relevant $P_\theta\left(\mathbf{X}^{miss} \,\middle|\, \mathbf{X}^{obs} = \mathbf{x}^{obs}\right)$ as
$$X_{i,(k)} \sim P_\theta\left(X_i \,\middle|\, \mathbf{X}^{obs} = \mathbf{x}^{obs}, \mathbf{X}^{(-i),miss} = \mathbf{x}^{(-i),miss}_{(k-1)}\right)$$
⇒ Prediction from empirical distributions analogous

◮ Prior inference of θ based on limited data Xobs = xobs non-trivial


Kernel-based regression

Nearest-neighbor prediction
Markov random fields
Kernel regression on graphs
Case study: Predicting protein function


Kernel methods

◮ MRFs specify precise dependency structures in $\mathbf{X}$, given the graph $G$
◮ Q1: Can we just learn a function relating the vertices to their attributes?

A1: Yes! A regression-based approach on $G$ is in order

◮ Methods such as LS regression relate data in Euclidean space
◮ Q2: Can these methods be tuned to accommodate graph-indexed data?

A2: Yes! Kernel methods consisting of:

1) Generalized predictor variables (i.e., encoded using a kernel)
2) Regression of a response to these predictors using ridge regression

◮ Key innovation here is the construction of graph kernels


Kernel regression on graphs

◮ Let $G(V, E)$ be a graph and $\mathbf{X} = \{X_i\}_{i \in V}$ a vertex attribute process
⇒ Suppose we observe $X_i = x_i$ for $i \in V^{obs} \subset V$, with $n = |V^{obs}|$

Regression on graphs: learn $\hat{h} : V \to \mathbb{R}$ describing how attributes vary across vertices.

◮ Graph-indexed data are not Euclidean ⇒ kernel regression methods
◮ Def: a function $K : V \times V \to \mathbb{R}$ is called a kernel if, for each $m = 1, \ldots, N_v$ and subset of vertices $\{i_1, \ldots, i_m\} \subseteq V$, the matrix $\mathbf{K}^{(m)} = [K(i_j, i_{j'})] \in \mathbb{R}^{m \times m}$ is symmetric and positive semi-definite

◮ Think of kernels as functions that produce similarity matrices
⇒ Kernel regression builds predictors from such similarities
⇒ Need to also decide on the space $\mathcal{H}$ in which to search for $\hat{h}$


Reproducing-kernel Hilbert spaces

◮ Since $V$ is finite, represent functions $h$ on $V$ as vectors $\mathbf{h} \in \mathbb{R}^{N_v}$
⇒ Form $\mathbf{K}^{(N_v)} \in \mathbb{R}^{N_v \times N_v}$ by evaluating $K$ on all pairs $(i,j) \in V^{(2)}$
⇒ Suppose $\mathbf{K}^{(N_v)}$ admits an eigendecomposition $\mathbf{K}^{(N_v)} = \boldsymbol{\Phi}\boldsymbol{\Delta}\boldsymbol{\Phi}^\top$

Kernel regression: given kernel $K$ and data $\mathbf{x}^{obs}$, kernel regression seeks $\hat{\mathbf{h}}$ from the class
$$\mathcal{H}_K = \left\{\mathbf{h} \in \mathbb{R}^{N_v} : \mathbf{h} = \boldsymbol{\Phi}\boldsymbol{\beta} \text{ and } \boldsymbol{\beta}^\top\boldsymbol{\Delta}^{-1}\boldsymbol{\beta} < \infty\right\}$$

◮ $\mathcal{H}_K$ is the reproducing-kernel Hilbert space induced by $K$
⇒ Members $\mathbf{h} \in \mathcal{H}_K$ are linear combinations of eigenvectors of $\mathbf{K}^{(N_v)}$
⇒ Constrained to finite norm $\|\mathbf{h}\|_{\mathcal{H}} = \|\boldsymbol{\Phi}\boldsymbol{\beta}\|_{\mathcal{H}} := \boldsymbol{\beta}^\top\boldsymbol{\Delta}^{-1}\boldsymbol{\beta} < \infty$


Penalized regression in RKHS

◮ Choose an appropriate $\hat{\mathbf{h}} \in \mathcal{H}_K$ using penalized kernel regression
◮ Q: Appropriate? Data fidelity and small norm (i.e., low complexity)
$$\hat{\mathbf{h}} = \boldsymbol{\Phi}\hat{\boldsymbol{\beta}}, \quad \text{where } \hat{\boldsymbol{\beta}} = \arg\min_{\boldsymbol{\beta}} \left\{ \sum_{i \in V^{obs}} C(x_i, [\boldsymbol{\Phi}\boldsymbol{\beta}]_i) + \lambda\boldsymbol{\beta}^\top\boldsymbol{\Delta}^{-1}\boldsymbol{\beta} \right\}$$

◮ Convex loss $C(\cdot, \cdot)$ encourages goodness of fit to $\mathbf{x}^{obs}$
◮ The term $\|\mathbf{h}\|_{\mathcal{H}} = \boldsymbol{\beta}^\top\boldsymbol{\Delta}^{-1}\boldsymbol{\beta}$ penalizes excessive complexity
◮ Tuning parameter $\lambda$ trades off data fidelity and complexity

◮ Generalized ridge regression with the columns of $\boldsymbol{\Phi}$ as predictors
⇒ Eigenvectors with small eigenvalues penalized more harshly


Representer theorem

◮ Need to compute the entire $\boldsymbol{\Phi}$ to find the regression function $\hat{\mathbf{h}}$
⇒ Complex to evaluate $K$ on all vertex pairs $V^{(2)}$ and find $\boldsymbol{\Phi}$

◮ Consider instead evaluating $K$ on $V \times V^{obs}$, yielding $\mathbf{K}^{(N_v,n)} \in \mathbb{R}^{N_v \times n}$
⇒ The Representer theorem asserts that $\hat{\mathbf{h}}$ is equivalently given by
$$\hat{\mathbf{h}} = \mathbf{K}^{(N_v,n)}\hat{\boldsymbol{\alpha}}, \quad \text{where } \hat{\boldsymbol{\alpha}} = \arg\min_{\boldsymbol{\alpha}} \left\{ \sum_{i \in V^{obs}} C(x_i, [\mathbf{K}^{(n)}\boldsymbol{\alpha}]_i) + \lambda\boldsymbol{\alpha}^\top\mathbf{K}^{(n)}\boldsymbol{\alpha} \right\}$$

◮ Just need to evaluate $K$ on $V^{obs} \times V^{obs}$ to form $\mathbf{K}^{(n)}$
⇒ Complexity scales with the number of observations $n$, not $N_v$

◮ Because $\hat{\mathbf{h}} = \mathbf{K}^{(N_v,n)}\hat{\boldsymbol{\alpha}}$, can predict the value at $i \in V^{miss}$ via
$$\hat{h}_i = \sum_{j \in V^{obs}} \hat{\alpha}_j K(i,j)$$


Example: Kernel ridge regression

◮ Let the $X_i$ be continuous and the loss quadratic, i.e., $C(x, a) = (x - a)^2$
◮ The optimization problem defining $\hat{\boldsymbol{\alpha}}$ thus specializes to
$$\min_{\boldsymbol{\alpha}} \; \|\mathbf{x}^{obs} - \mathbf{K}^{(n)}\boldsymbol{\alpha}\|_2^2 + \lambda\boldsymbol{\alpha}^\top\mathbf{K}^{(n)}\boldsymbol{\alpha}$$
⇒ This particular method is known as kernel ridge regression. Intuition?

◮ Define $\boldsymbol{\theta} := (\mathbf{K}^{(n)})^{1/2}\boldsymbol{\alpha}$ and $\mathbf{M} := (\mathbf{K}^{(n)})^{1/2}$. An equivalent problem is
$$\min_{\boldsymbol{\theta}} \; \|\mathbf{x}^{obs} - \mathbf{M}\boldsymbol{\theta}\|_2^2 + \lambda\boldsymbol{\theta}^\top\boldsymbol{\theta}$$

◮ Standard ridge regression with solution $\hat{\boldsymbol{\theta}} = (\mathbf{M}^\top\mathbf{M} + \lambda\mathbf{I})^{-1}\mathbf{M}^\top\mathbf{x}^{obs}$
⇒ The kernel regression function is $\hat{\mathbf{h}} = \mathbf{K}^{(N_v,n)}(\mathbf{K}^{(n)})^{-1/2}\hat{\boldsymbol{\theta}}$
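For the quadratic loss, the first-order condition also gives $\hat{\boldsymbol{\alpha}} = (\mathbf{K}^{(n)} + \lambda\mathbf{I})^{-1}\mathbf{x}^{obs}$ directly, equivalent to the whitened ridge solution when $\mathbf{K}^{(n)}$ is invertible. A sketch using the Laplacian kernel $\mathbf{K} = \mathbf{L}^\dagger$ on a small path graph; the function name and toy setup are illustrative:

```python
import numpy as np

def kernel_ridge_graph(K, obs_idx, x_obs, lam=0.01):
    """Kernel ridge regression on a graph: solve (K_n + lam I) alpha = x_obs
    on the observed vertices, then predict h = K[:, obs] @ alpha everywhere."""
    Kn = K[np.ix_(obs_idx, obs_idx)]                       # K^{(n)}
    alpha = np.linalg.solve(Kn + lam * np.eye(len(obs_idx)), x_obs)
    return K[:, obs_idx] @ alpha                           # h-hat over all vertices

# Laplacian kernel K = L^dagger on a path graph with 5 vertices
A = np.diag(np.ones(4), 1)
A = A + A.T
L = np.diag(A.sum(axis=1)) - A
K = np.linalg.pinv(L)

# Observe the two endpoints; L^dagger is centered, so use centered data
h = kernel_ridge_graph(K, obs_idx=[0, 4], x_obs=np.array([-1.0, 1.0]))
print(h.round(2))  # values increase smoothly along the path from vertex 0 to 4
```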


Example: Kernel logistic regression

◮ Let binary $X_i \in \{-1, 1\}$ indicate class membership, for two classes
◮ A natural choice in this context is the logistic loss, given by
$$C(x, a) = \ln\left(1 + e^{-xa}\right)$$
⇒ Corresponds to the negative log-likelihood of a Bernoulli RV

◮ Kernel logistic regression selects $\hat{\boldsymbol{\alpha}}$ via the optimization problem
$$\min_{\boldsymbol{\alpha}} \; \left\{ \sum_{i \in V^{obs}} \ln\left(1 + e^{-x_i[\mathbf{K}^{(n)}\boldsymbol{\alpha}]_i}\right) + \lambda\boldsymbol{\alpha}^\top\mathbf{K}^{(n)}\boldsymbol{\alpha} \right\}$$
⇒ No closed-form solution for $\hat{\boldsymbol{\alpha}}$, need iterative algorithms

◮ Given $\hat{\mathbf{h}} = \mathbf{K}^{(N_v,n)}\hat{\boldsymbol{\alpha}}$, prediction of $X_i$ for $i \in V^{miss}$ based on
$$\hat{P}\left(X_i = 1 \,\middle|\, \mathbf{X}^{obs} = \mathbf{x}^{obs}\right) = \frac{e^{\hat{h}_i}}{1 + e^{\hat{h}_i}}$$
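Absent a closed form, $\hat{\boldsymbol{\alpha}}$ can be found with any convex solver; a bare-bones gradient-descent sketch of the objective above (the function name, step size, and iteration count are illustrative choices):

```python
import numpy as np

def kernel_logistic_fit(Kn, x_obs, lam=0.01, lr=0.1, n_iter=500):
    """Kernel logistic regression on the observed vertices: minimize
    sum_i ln(1 + exp(-x_i [Kn a]_i)) + lam * a' Kn a by gradient descent.
    Labels in x_obs are coded as -1 / +1."""
    alpha = np.zeros(len(x_obs))
    for _ in range(n_iter):
        f = Kn @ alpha
        s = 1.0 / (1.0 + np.exp(x_obs * f))          # sigma(-x_i f_i)
        grad = -Kn @ (x_obs * s) + 2.0 * lam * (Kn @ alpha)
        alpha -= lr * grad
    return alpha

# Two observed vertices with opposite labels and a mildly similar kernel
Kn = np.array([[1.0, 0.2],
               [0.2, 1.0]])
alpha = kernel_logistic_fit(Kn, np.array([1.0, -1.0]))
f = Kn @ alpha
print(1.0 / (1.0 + np.exp(-f)))  # fitted P(X_i = 1): high for +1, low for -1
```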


Designing kernels on graphs

◮ In designing a kernel K on a graph G, desired properties are:

P1) $\mathbf{K}^{(N_v)}$ is symmetric and positive semi-definite
P2) $K$ captures suspected similarity among vertices in $V$

◮ Presumption: proximity of vertices in G already indicative of similarity

⇒ Most kernels proposed are related to the topology of G

◮ Ex: the Laplacian kernel is $\mathbf{K}^{(N_v)} := \mathbf{L}^\dagger$, where $\dagger$ denotes pseudo-inverse
⇒ The penalty term $\|\mathbf{h}\|_{\mathcal{H}} = \boldsymbol{\beta}^\top\boldsymbol{\Delta}^{-1}\boldsymbol{\beta}$ takes the form
$$\boldsymbol{\beta}^\top\boldsymbol{\Delta}^{-1}\boldsymbol{\beta} = \boldsymbol{\beta}^\top\boldsymbol{\Phi}^\top\boldsymbol{\Phi}\boldsymbol{\Delta}^{-1}\boldsymbol{\Phi}^\top\boldsymbol{\Phi}\boldsymbol{\beta} = \mathbf{h}^\top\mathbf{K}^\dagger\mathbf{h} = \mathbf{h}^\top\mathbf{L}\mathbf{h} = \sum_{(i,j) \in E} (h_i - h_j)^2$$

◮ Kernel regression thus seeks a smooth $\hat{\mathbf{h}}$ with respect to the topology of $G$
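The identity $\mathbf{h}^\top\mathbf{L}\mathbf{h} = \sum_{(i,j) \in E}(h_i - h_j)^2$ can be checked numerically on any small graph; a quick sketch:

```python
import numpy as np

# Small graph on 4 vertices with edges (0,1), (0,2), (1,2), (2,3)
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A          # combinatorial Laplacian
h = np.array([0.3, -1.2, 0.5, 2.0])     # arbitrary vertex function

quad = h @ L @ h                         # quadratic form h' L h
edge_sum = sum((h[i] - h[j]) ** 2
               for i in range(4) for j in range(i + 1, 4) if A[i, j])
print(np.isclose(quad, edge_sum))        # True: the penalty sums squared edge differences
```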


Diffusion kernels

◮ Laplacian kernel K = L† encodes similarity among vertices through A

⇒ Can encode similarity through paths, powers of A and L

◮ A popular choice incorporating all powers of $\mathbf{L}$ is the diffusion kernel
$$\mathbf{K} = e^{-\zeta\mathbf{L}} := \sum_{m=0}^{\infty} \frac{(-\zeta)^m}{m!}\mathbf{L}^m$$

◮ Decay factor $0 < \zeta < 1$ controls the similarity assigned to longer paths
◮ Defined in terms of the matrix exponential $e^{-\zeta\mathbf{L}}$

◮ Treating $\mathbf{K}$ as a function of $\zeta$ yields the differential equation
$$\frac{\partial\mathbf{K}}{\partial\zeta} = -\mathbf{L}\mathbf{K}$$
⇒ Parallels the heat equation in physics, motivating the name
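Since $\mathbf{L}$ is symmetric, $e^{-\zeta\mathbf{L}}$ can be computed from its eigendecomposition as $\boldsymbol{\Phi}e^{-\zeta\boldsymbol{\Gamma}}\boldsymbol{\Phi}^\top$; a short sketch (function name illustrative):

```python
import numpy as np

def diffusion_kernel(A, zeta=0.5):
    """Diffusion kernel K = exp(-zeta L), via the eigendecomposition of the
    symmetric Laplacian: K = Phi diag(exp(-zeta gamma_i)) Phi'."""
    L = np.diag(A.sum(axis=1)) - A
    gamma, phi = np.linalg.eigh(L)
    return phi @ np.diag(np.exp(-zeta * gamma)) @ phi.T

A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)   # path on 3 vertices
K = diffusion_kernel(A)
# Eigenvalues exp(-zeta gamma_i) are strictly positive: K is symmetric PD
print(np.allclose(K, K.T), np.all(np.linalg.eigvalsh(K) > 0))
```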


Regularized Laplacian kernels

◮ Let $\mathbf{L} = \boldsymbol{\Phi}\boldsymbol{\Gamma}\boldsymbol{\Phi}^\top$, with $\boldsymbol{\Gamma} = \text{diag}(\gamma_1, \ldots, \gamma_{N_v})$ and $\boldsymbol{\Phi} = [\boldsymbol{\phi}_1, \ldots, \boldsymbol{\phi}_{N_v}]$
◮ Laplacian and diffusion kernels are within the class of regularization kernels
$$\mathbf{K} = \sum_{i=1}^{N_v} r^{-1}(\gamma_i)\,\boldsymbol{\phi}_i\boldsymbol{\phi}_i^\top$$
⇒ $\mathbf{K}$ is the inverse of the regularized Laplacian $r(\mathbf{L}) := \boldsymbol{\Phi}r(\boldsymbol{\Gamma})\boldsymbol{\Phi}^\top$

◮ The regularization function $r(\cdot) \geq 0$ is increasing, including:
Ex: identity function $r(\gamma) = \gamma$
Ex: exponential function $r(\gamma) = \exp(\zeta\gamma)$
Ex: linear inverse function $r(\gamma) = (1 - \gamma/\gamma_{max})^{-1}$

◮ All such $\mathbf{K}$ have identical eigenvectors; they only vary the eigenvalues $r^{-1}(\gamma_i)$
⇒ Same predictors in the kernel regression, different penalty
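The whole family can be generated from one spectral routine, swapping in different $r(\cdot)$; a sketch where the identity choice recovers $\mathbf{L}^\dagger$ (eigenvalues with $r(\gamma_i) = 0$ are handled pseudo-inverse style; names illustrative):

```python
import numpy as np

def regularization_kernel(A, r):
    """K = sum_i r^{-1}(gamma_i) phi_i phi_i' on the Laplacian spectrum,
    with eigenvalues where r(gamma_i) = 0 mapped to 0 (pseudo-inverse style)."""
    L = np.diag(A.sum(axis=1)) - A
    gamma, phi = np.linalg.eigh(L)
    rg = r(gamma)
    inv = np.where(rg > 1e-12, 1.0 / np.where(rg > 1e-12, rg, 1.0), 0.0)
    return phi @ np.diag(inv) @ phi.T

A = np.array([[0, 1, 1],
              [1, 0, 1],
              [1, 1, 0]], dtype=float)                         # triangle graph
L = np.diag(A.sum(axis=1)) - A

K_lap = regularization_kernel(A, r=lambda g: g)                # identity r -> L^dagger
K_dif = regularization_kernel(A, r=lambda g: np.exp(0.5 * g))  # exponential r -> diffusion
print(np.allclose(K_lap, np.linalg.pinv(L)))                   # True
```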


Example: kernels in the lawyer collaboration graph

◮ Network of lawyer collaboration, connected component with Nv = 34

[Figure: eigenvalues γ_i of L (left) and the corresponding values r^{-1}(γ_i) under the three regularizers (right)]

◮ Left figure shows the eigenvalues $\gamma_1, \ldots, \gamma_{34}$ of $\mathbf{L}$; recall $\gamma_1 = 0$
◮ Right figure shows the values of $r^{-1}(\gamma_i)$, for $i = 2, \ldots, 34$

◮ Regularizers: identity, exponential, and linear inverse functions
⇒ The first two damp most eigenvalues, so only a few $\boldsymbol{\phi}_i$ affect $\mathbf{K}$
⇒ Small decay in the last: all $\boldsymbol{\phi}_i$ play a substantial role in $\mathbf{K}$


Visual representation of eigenvectors

◮ Visual representation of 8 ‘smallest’ eigenvectors φi, i = 2, . . . , 9

◮ Vertex size proportional to the component in φi, color indicates sign


◮ Early eigenvectors have entries relatively more uniform in size and color

⇒ Eigenvectors become less ‘smooth’ with increasing eigenvalue


Case study

Nearest-neighbor prediction
Markov random fields
Kernel regression on graphs
Case study: Predicting protein function


Predicting protein function

◮ Proteins integral to complex biochemical processes within organisms

⇒ Understanding their function is critical in biology and medicine

◮ But ∼ 70% of genes code for proteins with unknown function

⇒ Prediction of protein function a task of great importance

◮ Methodologies explored so far:

(i) Traditional experiment-intensive approaches
(ii) Methods based on sequence similarity and protein structure
(iii) Network-based methods

◮ Networks of protein-protein interactions natural in the latter


Protein-protein interaction network

◮ Baker’s yeast data, formally known as Saccharomyces cerevisiae

◮ Graph: 134 vertices (proteins) and 241 edges (protein interactions)

◮ Predict functional annotation intracellular signaling cascade (ICSC)

⇒ Signal transduction, how cells react to the environment

◮ Let X = {Xi}i∈V denote the vertex process of the annotation ICSC

◮ $X_i = 1$ if protein $i$ is annotated ICSC (yellow), $X_i = 0$ otherwise (blue)


Methods to predict protein function

Method 1: nearest-neighbor (NN) prediction with varying threshold $\tau$

Method 2: MRF with predictors counting the neighbors with and without ICSC
◮ Parameters $(\alpha, \beta_1, \beta_2)$ estimated via maximum pseudo-likelihood
◮ Drew 1,000 samples of vertex annotations using a Gibbs sampler
◮ Predictions based on empirical estimates of $P\left(X_i = 1 \,\middle|\, \mathbf{X}^{obs} = \mathbf{x}^{obs}\right)$
Method 3: kernel logistic regression (KLR) with K = L† and λ = 0.01

◮ In all cases predictions generated using 10-fold cross validation

⇒ 90% of the labels used to train the prediction methods
⇒ Remaining 10% used to test the obtained predictors


Nearest-neighbor prediction

◮ Empirical proportions of neighbors with and without ICSC
[Figure: histograms of the proportion of neighbors with ICSC (left) and without ICSC (right)]
⇒ Classes less well separated than for the lawyer data

◮ Recall the nearest-neighbor prediction rule for $\tau = 0.5$
$$\hat{X}_i = \mathbb{I}\left\{ \frac{\sum_{j \in N_i} x_j}{|N_i|} > 0.5 \right\}$$
⇒ Yields a decent misclassification rate of roughly 23%


Receiver operating characteristic

◮ ROC curves depict predictive performance

[Figure: ROC curves (True Positive Rate vs. False Positive Rate) for random guessing, NN, KLR, MRF, and KLR with motifs]

◮ All methods performed comparably. Area under the curve (AUC) values:
NN: 0.80, MRF: 0.82, KLR: 0.83, KLR w/ motifs: 0.85


Closing remarks

◮ Not surprising that all three methods performed similarly

⇒ NN and MRF use the same statistics $\sum_{j \in N_i} x_j$ and $\sum_{j \in N_i}(1 - x_j)$
⇒ NN equivalent to a form of graph partitioning [Blum-Chawla ’01]
⇒ $\mathbf{L}$ is key to many graph partitioning algorithms

◮ Simple NN prediction comparable to sophisticated classification methods

⇒ MRF and kernels flexible to incorporate information beyond G

◮ Ex: certain DNA sequence motifs useful for function prediction

◮ 114 out of 134 proteins are associated with one or more of 154 motifs
◮ Encode the associations in $\mathbf{M} \in \{0,1\}^{134 \times 154}$, construct the kernel $\bar{\mathbf{K}} = \mathbf{M}\mathbf{M}^\top$

⇒ Improvement in performance with the combined kernel $\mathbf{K} = 0.5\,\mathbf{L}^\dagger + 0.5\,\mathbf{M}\mathbf{M}^\top$


Glossary

◮ Graph-indexed process
◮ Static process
◮ Dynamic process
◮ Nearest-neighbor prediction
◮ Model-based prediction
◮ Markov random fields
◮ Ising model
◮ Gibbs random fields
◮ Partition function
◮ Clique potentials
◮ Auto models
◮ Pseudo-likelihood
◮ Gibbs sampler
◮ Kernel function
◮ Kernel regression
◮ Representer theorem
◮ Kernel logistic regression
◮ Graph kernels
◮ Diffusion kernel
◮ Regularized Laplacian
◮ Protein function
◮ ROC curve
◮ Area under the curve
◮ Combined kernels
