SLIDE 1

Random graph methods

October 16, 2018

SLIDE 2

Graphs and Trees – a poetic point of view

"A dead tree, cut into planks and read from one end to the other, is a kind of line graph, with dates down one side and height along the other, as if trees, like mathematicians, had found a way of turning time into form."

Alice Oswald, British Poet

SLIDE 3

Introduction

Undirected graph

A graph consists of a set of vertices (nodes), along with a set of edges joining some pairs of the vertices.

SLIDE 4

Introduction

Graph – a map of random dependencies

Let each vertex correspond to (represent) a random variable. The graph then gives a visual way of understanding the joint distribution of the entire set of random variables. In this approach, the absence of an edge between two vertices has a special meaning: the corresponding random variables are conditionally independent, given all the other variables (represented by the other vertices). Such a graph does not tell the full story about the model, but it helps to understand dependencies and to search for them. If one specifies the model, then the graph plus some parameters for the distributions completely defines the model.

SLIDE 5

Introduction

Simple examples

Example: Let X1, X2, X3 be independent random variables.

What is the graph for X = X1 + X2, Y = X2, and Z = X2 + X3?

[Plot: the resulting graph on vertices X, Y, Z]

What is the graph for X = X1 + X2, Y = X1 + X3, and Z = X2 + X3?

[Plot: the resulting graph on vertices X, Y, Z]

What is the graph for X = X1, Y = X2, and Z = X1 + X2 + X3?

?
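
To probe such questions numerically, here is a minimal base-R simulation sketch (not part of the slides; the variable names are illustrative). It simulates the third example and estimates the partial correlations through the inverse of the sample covariance matrix, anticipating the precision-matrix formula introduced later.

set.seed(1)
n  <- 1e5
X1 <- rnorm(n); X2 <- rnorm(n); X3 <- rnorm(n)
D  <- cbind(X = X1, Y = X2, Z = X1 + X2 + X3)    # the third example
Theta <- solve(cov(D))                           # estimated precision matrix
P <- -Theta / sqrt(diag(Theta) %o% diag(Theta))  # partial correlations (diagonal is -1)
round(P, 2)   # all off-diagonal entries come out clearly nonzero here

Off-diagonal entries near zero would indicate a missing edge; replacing the definitions of X, Y, Z reproduces the first two examples.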

SLIDE 6

Introduction

How to plot graphs in R

install.packages("igraph") #only if not installed before!!! library(igraph) edges = matrix(c("Y","Z","X","Z","X","Y"), nrow=3, ncol=2, byrow=T) g = graph.edgelist(edges, directed=FALSE) plot(g, edge.width=2, vertex.size=30, edge.color=’black’)

[Plot: triangle graph on vertices X, Y, Z]

SLIDE 7

Introduction

Specific models for distributions

Without further specification of the model, it is difficult to say what kind of dependence one has. Interpretation of graphs is difficult unless some distributional structure is imposed. One needs to specify models for the distributions to give complete answers. Two models are popular:

For continuous variables: Gaussian models
For discrete variables: Ising models (Boltzmann machines)

SLIDE 8

Gaussian model

Fundamentals

We assume that the observations have a multivariate Gaussian distribution with mean µ and covariance matrix Σ. There are several important properties of Gaussian distributions:

The distribution is specified by the pairwise covariances plus the means.
Conditional distributions are always Gaussian.
The covariances of conditional distributions do not depend on the values of the variables on which conditioning is taken, but only on Σ.
The conditional independence of two variables (the lack of an edge between the corresponding nodes) means that the conditional covariance (given all other variables) is zero – these conditional covariances are called partial covariances.
The inverse of the covariance Σ, often called the precision matrix Θ = Σ⁻¹, tells where the partial covariances are zero (lack of an edge): a zero in the precision matrix is equivalent to a zero of the corresponding partial correlation.
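
As an illustration (the numbers below are an invented example, not from the slides): a chain X1 – X2 – X3 has a tridiagonal precision matrix, and inverting the covariance recovers the missing edge even though no covariance entry is zero.

Theta <- matrix(c( 2, -1,  0,
                  -1,  2, -1,
                   0, -1,  2), nrow = 3, byrow = TRUE)  # chain: zero for the pair (1,3)
Sigma <- solve(Theta)   # the implied covariance matrix
round(Sigma, 3)         # no zero entries: X1 and X3 are marginally correlated
round(solve(Sigma), 3)  # inverting back exposes the zero, i.e. the missing edge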

SLIDE 9

Gaussian model

Partial covariances vs. the precision matrix

Partial correlation:

Partial correlation can be formulated in terms of the projections of the observations onto subspaces. Let Xi and Xj be two coordinates in X = (X1, . . . , Xn), and let X∖ij be the vector of all remaining coordinates.

Yi – the residual from the orthogonal (least squares) projection of Xi onto X∖ij.
Yj – the residual from the orthogonal projection of Xj onto X∖ij.

The n × n matrices of partial covariances and partial correlations are

PC = [Cov(Yi, Yj)],   R = [ρij] = [Cov(Yi, Yj) / (σYi σYj)].

Precision matrix:

The inverse Θ of the covariance Σ of the Xi's – the precision matrix – gives the partial correlations directly:

ρij = −θij / √(θii θjj).
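
A hedged numerical check of this identity on simulated data (base R only): compute the partial correlation of X1 and X2 from regression residuals, then from the precision matrix.

set.seed(2)
n <- 1e4
X <- matrix(rnorm(n * 4), n, 4) %*% chol(toeplitz(0.5^(0:3)))  # correlated toy data
# Route 1: residuals from projecting X1 and X2 onto the remaining coordinates
Y1 <- resid(lm(X[, 1] ~ X[, 3] + X[, 4]))
Y2 <- resid(lm(X[, 2] ~ X[, 3] + X[, 4]))
cor(Y1, Y2)
# Route 2: the precision-matrix formula
Theta <- solve(cov(X))
-Theta[1, 2] / sqrt(Theta[1, 1] * Theta[2, 2])  # agrees with Route 1 up to sampling noise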

SLIDE 10

Gaussian model

Formulation of the problem

We can view our model as a graph with the edges marked with the values of the partial covariances and the vertices marked with the mean values. By conceptually splitting the model into

1. a graph that represents the dependencies,
2. the means,
3. the partial covariances associated with each edge,

we can divide the main problem of fitting a Gaussian density to the data into three parts:

1. estimate the mean at each vertex;
2. estimate the structure of the graph;
3. given an estimated structure of the graph, estimate the partial covariances.

The means can simply be estimated by the sample mean of the variable corresponding to each vertex. Estimating the rest is difficult.

SLIDE 11

Gaussian model

Given the structure, estimate the covariances

Given N observed values of the X's, we would like to estimate the correlations (partial correlations) corresponding to an undirected graph representing the non-zero partial correlations. Suppose first that the graph is complete (fully connected).

It is well known that the maximum likelihood estimator of Σ is the sample covariance matrix

S = (1/N) ∑_{i=1}^{N} (xi − x̄)(xi − x̄)ᵀ,

so in this case the estimate is straightforward.

Suppose now that some edges are missing in the actual graph of the partial covariances. The problem of finding an estimate under these constraints is non-trivial.
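
For the complete graph, the estimate is indeed one line of R (a sketch on toy data; note that cov() uses the 1/(N − 1) convention, so the MLE needs rescaling):

X <- matrix(rnorm(200 * 3), 200, 3)  # toy data: N = 200, p = 3
N <- nrow(X)
S_mle <- cov(X) * (N - 1) / N        # maximum likelihood estimate with the 1/N factor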

SLIDE 12

Gaussian model

Multivariate normal (Gaussian) distribution

Everyone believes in Gauss distribution: experimentalists believing that it is a mathematical theorem, mathematicians believing that it is an empirical fact.

Quote attributed to Henri Poincaré by de Finetti. However, Cramér attributes the remark to Lippmann, as quoted by Poincaré. Gabriel Lippmann – a Nobel prize winner in physics; Henri Poincaré – a mathematician, theoretical physicist, engineer, and philosopher of science.

The multivariate normal or Gaussian random vector X = (X1, . . . , Xp) is given by the density

f(x) = (2π)^(−p/2) det(Σ)^(−1/2) exp( −(1/2) (x − µ)ᵀ Σ⁻¹ (x − µ) ),

which is characterized by a vector parameter µ and a matrix parameter Σ.

The notation X ∼ Np(µ, Σ) should be read as "the random vector X has a multivariate normal (Gaussian) distribution with the vector parameter µ and the matrix parameter Σ."
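
The density can be evaluated with the mvtnorm package (a sketch, assuming the package is installed; dmvnorm() is its density function):

library(mvtnorm)
mu    <- c(0, 0)
Sigma <- matrix(c(1, 0.5, 0.5, 1), 2, 2)
dmvnorm(c(0.3, -0.2), mean = mu, sigma = Sigma)  # f(x) at a single point x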

SLIDE 13

Gaussian model

Multivariate normal (Gaussian) distribution – properties

We often drop the dimension p from the notation, writing X ∼ N(µ, Σ). The vector parameter µ is equal to the mean of X, and the matrix parameter Σ is equal to the covariance matrix of X. Any coordinate Xi of X is also normally distributed, i.e. Xi ∼ N(µi, σi²).

If X ∼ Np(µ, Σ) and A is a q × p (non-random) matrix, q ≤ p, of rank q, then AX ∼ Nq(Aµ, AΣAᵀ).

SLIDE 14

Gaussian model

Subsetting from coordinates of MND

Any vector made of a subset of different coordinates of X is also multivariate normal, with the corresponding mean vector and covariance matrix. More precisely, if X ∼ Np(µ, Σ) is partitioned into sub-vectors X1 : q × 1 and X2 : (p − q) × 1,

X = ( X1 ),   µ = ( µ1 ),   Σ = ( Σ11  Σ12 )
    ( X2 )        ( µ2 )        ( Σ21  Σ22 )

then X1 ∼ Nq(µ1, Σ11) and X2 ∼ Np−q(µ2, Σ22).

SLIDE 15

Gaussian model

Conditional distributions

If X ∼ Np(µ, Σ) is partitioned into sub-vectors X1 : q × 1 and X2 : (p − q) × 1, with

µ = ( µ1 ),   Σ = ( Σ11  Σ12 )
    ( µ2 )        ( Σ21  Σ22 )

then the conditional distribution of X1 given X2 is

X1 | X2 = x2  ∼  Nq( µ1 + Σ12 Σ22⁻¹ (x2 − µ2),  Σ11 − Σ12 Σ22⁻¹ Σ21 ).
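
A sketch of this formula as a small R helper (the function name and the 3 × 3 example are invented for illustration):

cond_mvn <- function(mu, Sigma, i1, i2, x2) {
  # parameters of X[i1] | X[i2] = x2
  B <- Sigma[i1, i2, drop = FALSE] %*% solve(Sigma[i2, i2, drop = FALSE])
  list(mean = mu[i1] + B %*% (x2 - mu[i2]),
       cov  = Sigma[i1, i1, drop = FALSE] - B %*% Sigma[i2, i1, drop = FALSE])
}
# Example with an invented 3 x 3 covariance matrix
Sigma <- matrix(c(2, 1, 0.5,  1, 2, 1,  0.5, 1, 2), 3, 3)
cond_mvn(mu = c(0, 0, 0), Sigma = Sigma, i1 = 1:2, i2 = 3, x2 = 1)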

SLIDE 16

Gaussian model

Regression reinterpretation of conditional distributions

The vector X1 given X2 forms a regression model

X1 = a + D X2 + ε,

where
the constant term a = µ1 − Σ12 Σ22⁻¹ µ2,
the design matrix D = Σ12 Σ22⁻¹,
the error term ε ∼ Nq(0, Σ11 − D Σ21).

Special case X1 = (Xi, Xj) – calculating partial covariances.

SLIDE 17

Gaussian model

Partial covariance matrix

Recall that the partial covariance 2 × 2 matrix Σij of (Xi, Xj) is given as the covariance of their distribution conditionally on all other variables:

(Xi, Xj) = (ai, aj) + D X2 + ε,

where:
the constant term (ai, aj) = (µi, µj) − Σ12 Σ22⁻¹ µ2, where Σ12 is made of the ith and jth rows of Σ without the ith and jth coordinates in these rows, so it is a 2 × (p − 2) matrix; Σ22 is the covariance matrix without the ith and jth columns and rows, so it is a (p − 2) × (p − 2) matrix; and µ2 is the vector of mean values with µi and µj dropped;
the 2 × (p − 2) design matrix D = Σ12 Σ22⁻¹;
the error term ε ∼ N2(0, Σij), where Σij = Σ11 − D Σ21 and Σ21 is the transpose of Σ12.

The (i, j)th partial correlation ρij is the correlation computed from the covariance matrix Σij, i.e. its off-diagonal term divided by the square roots of the diagonal terms.
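
The bookkeeping above translates directly into R; a hedged sketch (the function name and the covariance matrix are invented for illustration):

partial_cov_pair <- function(Sigma, i, j) {
  # 2 x 2 partial covariance matrix of (Xi, Xj) given all remaining variables
  rest <- setdiff(seq_len(nrow(Sigma)), c(i, j))
  S11  <- Sigma[c(i, j), c(i, j)]
  S12  <- Sigma[c(i, j), rest, drop = FALSE]
  S22  <- Sigma[rest, rest, drop = FALSE]
  S11 - S12 %*% solve(S22) %*% t(S12)
}
Sigma <- matrix(c(2, 1, 0.5,  1, 2, 1,  0.5, 1, 2), 3, 3)
PC <- partial_cov_pair(Sigma, 1, 2)
PC[1, 2] / sqrt(PC[1, 1] * PC[2, 2])  # the partial correlation of X1 and X2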

SLIDE 18

Gaussian model

Estimation of the partial correlations

We divided the problem of estimating the given model into parts:

1. estimate the mean at each vertex;
2. estimate the structure of the graph;
3. given an estimated structure of the graph, estimate the partial covariances.

We briefly discuss the third part.

SLIDE 19

Gaussian model

Organisation of observations

The observations in a sample are arranged in an n × p matrix X, where n is the number of experimental units (the size of the sample) and p is the number of variables:

X = ( x11  x12  . . .  x1k  . . .  x1p )
    ( x21  x22  . . .  x2k  . . .  x2p )
    ( . . .                           )
    ( xj1  xj2  . . .  xjk  . . .  xjp )
    ( . . .                           )
    ( xn1  xn2  . . .  xnk  . . .  xnp )

SLIDE 20

Gaussian model

Vector notation

In vector notation,

X = ( x1ᵀ )
    ( x2ᵀ )
    ( . . )
    ( xjᵀ )
    ( . . )
    ( xnᵀ )

Row j of this matrix,

xjᵀ = [ xj1  xj2  . . .  xjk  . . .  xjp ],

is a p-dimensional observation.

SLIDE 21

Gaussian model

Sample mean vector

Given the sample mean for variable i,

x̄i = (1/n) ∑_{k=1}^{n} xki,

we define the sample mean vector as

x̄ = ( x̄1, x̄2, . . . , x̄p )ᵀ.

SLIDE 22

Gaussian model

Sample covariance matrix

Given the sample covariances

sij = (1/(n − 1)) ∑_{k=1}^{n} (xki − x̄i)(xkj − x̄j)

between variables i and j, we define the (sample) covariance matrix

S = ( s11  s12  . . .  s1p )
    ( s21  s22  . . .  s2p )
    ( . . .               )
    ( sp1  sp2  . . .  spp ).

SLIDE 23

Gaussian model

Sample correlation matrix

Given the sample correlations

rij = sij / √(sii sjj)

between variables i and j, we define the (sample) correlation matrix

R = ( 1    r12  . . .  r1p )
    ( r21  1    . . .  r2p )
    ( . . .               )
    ( rp1  rp2  . . .  1  ).
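
All three summaries from the last slides are built into base R (a sketch on toy data):

X <- matrix(rnorm(100 * 4), 100, 4)  # toy sample: n = 100, p = 4
colMeans(X)  # sample mean vector
cov(X)       # sample covariance matrix S (the 1/(n-1) convention)
cor(X)       # sample correlation matrix R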

SLIDE 24

Gaussian model

Estimation of µ and Σ

Let X1, X2, . . . , Xn be n independent observations of X, and let

X̄ = (1/n) ∑_{k=1}^{n} Xk.

Then

E(X̄) = (1/n) ∑_{k=1}^{n} E(Xk) = µ.

Estimation of µ: the sample mean vector X̄ is an unbiased estimator of µ.
Estimation of Σ: it holds that E(S) = Σ.
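
A quick simulation sanity check of the unbiasedness claim for S (a sketch; averaging S over many replications should approach Σ):

set.seed(3)
Sigma <- matrix(c(1, 0.3, 0.3, 1), 2, 2)
U <- chol(Sigma)
draws <- replicate(2000, cov(matrix(rnorm(20 * 2), 20, 2) %*% U), simplify = FALSE)
round(Reduce(`+`, draws) / length(draws), 2)  # close to Sigma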

SLIDE 25

Gaussian model

Estimation of Σ−1 given zeros of partial covariances

Estimating Σ⁻¹ is equivalent to estimating the matrix of partial correlations; let θij denote the (i, j)th term of Σ⁻¹. Maximizing the likelihood under the constraint that some of the θij's are zero cannot, in general, be solved analytically. Numerical algorithms have to be used.

SLIDE 26

Gaussian model

Algorithm – estimating partial correlations given a graph structure

SLIDE 27

Gaussian model

Example

SLIDE 28

Gaussian model

Estimating the graph structure – Lasso method

In most cases we do not know which edges to omit from the graph. One would like to discover this from the data itself. The lasso approach applies an ℓ1 penalty, maximizing the penalized log-likelihood

log det Θ − trace(SΘ) − λ‖Θ‖1,

where ‖Θ‖1 is the sum of the absolute values of the entries of Θ.
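
This is implemented by the graphical lasso; a sketch using the glasso package (assuming it is installed; its glasso() function takes the sample covariance matrix and the penalty λ as rho):

library(glasso)
X <- matrix(rnorm(200 * 5), 200, 5)  # toy data
S <- cov(X)
fit <- glasso(S, rho = 0.1)  # rho plays the role of lambda
round(fit$wi, 2)             # estimated precision matrix; exact zeros = missing edges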

SLIDE 29

Gaussian model

Example – the flow-cytometry data

SLIDE 30

Gaussian model

Algorithm – estimating the graph structure

SLIDE 31

Ising model

General comments

Undirected Markov networks with all-discrete variables are popular.
Pairwise Markov networks with binary variables are the most common – Ising models.
The values at each node can be observed ('visible') or unobserved ('hidden') – the so-called restricted Boltzmann machines assume no interactions between hidden nodes.
The nodes are often organized in layers, similar to a neural network.
These models are useful for both unsupervised and supervised learning, especially for structured input data such as images, but they have been hampered by computational difficulties.

SLIDE 32

Ising model

Some details

Denoting the binary-valued variable at node j by Xj, the Ising model for their joint probabilities is given by

p(X, Θ) = exp( ∑_{(j,k)∈E} θjk Xj Xk − Φ(Θ) ),   X ∈ {0, 1}^p,

where E is the set of edges of the graph. Only pairwise interactions are modeled. The Ising model was developed in statistical mechanics and is now used more generally to model the joint effects of pairwise interactions.

Φ(Θ) is the log of the partition function, defined by

Φ(Θ) = log ∑_{x ∈ {0,1}^p} exp( ∑_{(j,k)∈E} θjk xj xk ).

The partition function ensures that the probabilities add to one over the sample space.
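
For a graph small enough to enumerate all 2^p configurations, Φ(Θ) can be computed by brute force; a sketch with an invented 3-node chain:

theta <- matrix(0, 3, 3)               # invented parameters, edges 1-2 and 2-3
theta[1, 2] <- theta[2, 1] <- 0.8
theta[2, 3] <- theta[3, 2] <- -0.5
configs <- as.matrix(expand.grid(rep(list(0:1), 3)))  # all 2^3 binary states
score <- apply(configs, 1, function(x) {
  sum(theta[upper.tri(theta)] * outer(x, x)[upper.tri(theta)])
})
Phi <- log(sum(exp(score)))  # the log partition function
p <- exp(score - Phi)        # probabilities of the 8 configurations
sum(p)                       # equals 1, as the partition function guarantees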

SLIDE 33

Ising model

Example – handwritten digits

SLIDE 34

Ising model

Software

Packages in R: igraph, lars, glasso
Other environments: Julia, NetLogo
