

SLIDE 1

GP regression on random graphs: Covariance functions and Bayes errors

P. Sollich¹ and Camille Coti¹,²

¹King's College London   ²Laboratoire de Recherche en Informatique, Université Paris-Sud


SLIDE 2

Outline

1. Motivation
2. Covariance functions on graphs: definition from graph Laplacian; analysis on regular graphs (tree approximation); effect of loops
3. Bayes errors and learning curves: approximations; effect of loops; effect of kernel parameters
4. Summary and outlook


SLIDE 3

Motivation

- GP regression over continuous spaces is relatively well understood [e.g. Opper & Malzahn]
- Discrete spaces occur in many applications: sequences, strings, etc.
- What can we say about GP learning on these?
- Focus on random graphs with finite connectivity as a paradigmatic case



SLIDE 5

Graph Laplacian

- Easiest to define from the graph Laplacian [Smola & Kondor 2003]
- Adjacency matrix: $A_{ij} = 0$ or $1$ depending on whether nodes $i$ and $j$ are connected
- For a graph with $V$ nodes, $A$ is a $V \times V$ matrix
- Consider undirected links ($A_{ij} = A_{ji}$) and no self-loops ($A_{ii} = 0$)
- Degree of node $i$: $d_i = \sum_{j=1}^{V} A_{ij}$
- Set $D = \mathrm{diag}(d_1, \ldots, d_V)$; the graph Laplacian is then defined as $L = \mathbf{1} - D^{-1/2} A D^{-1/2}$
- Spectral graph theory: $L$ has eigenvalues in $[0, 2]$ (a small code sketch follows)
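As a minimal sketch of this definition (numpy only; the triangle-graph check at the end is my own illustration, not from the talk):

```python
import numpy as np

def normalized_laplacian(A):
    """L = 1 - D^{-1/2} A D^{-1/2} for a symmetric 0/1 adjacency matrix
    with no self-loops (assumes no isolated nodes)."""
    d = A.sum(axis=1)                      # degrees d_i = sum_j A_ij
    Dinv_sqrt = np.diag(1.0 / np.sqrt(d))  # D^{-1/2}
    return np.eye(len(A)) - Dinv_sqrt @ A @ Dinv_sqrt

# Tiny check on a triangle graph: the spectrum must lie in [0, 2].
A = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]])
print(np.linalg.eigvalsh(normalized_laplacian(A)))  # -> [0.  1.5  1.5]
```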


SLIDE 6

Graph covariance functions

Definition

- From the graph Laplacian, one can define covariance "functions" (really $V \times V$ matrices)
- Random walk kernel, $a \geq 2$: $C \propto (a\mathbf{1} - L)^p \propto \left[(a-1)\,\mathbf{1} + D^{-1/2} A D^{-1/2}\right]^p$
- Diffusion kernel: $C \propto \exp\!\left(-\tfrac{\sigma^2}{2} L\right) \propto \exp\!\left(\tfrac{\sigma^2}{2}\, D^{-1/2} A D^{-1/2}\right)$
- Useful to normalize so that $(1/V) \sum_i C_{ii} = 1$ (both kernels are sketched in code below)
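A sketch of both kernels with the normalization applied; `graph_kernels` and its default parameters are my own naming, not from the talk:

```python
import numpy as np
from scipy.linalg import expm

def graph_kernels(A, a=2.0, p=10, sigma2=1.0):
    """Random-walk kernel C ~ (a*1 - L)^p and diffusion kernel
    C ~ exp(-(sigma2/2) L), each normalized so (1/V) sum_i C_ii = 1."""
    V = len(A)
    d = A.sum(axis=1)
    M = A / np.sqrt(np.outer(d, d))        # D^{-1/2} A D^{-1/2}
    L = np.eye(V) - M
    C_rw = np.linalg.matrix_power(a * np.eye(V) - L, p)
    C_diff = expm(-0.5 * sigma2 * L)
    return C_rw / (np.trace(C_rw) / V), C_diff / (np.trace(C_diff) / V)
```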


SLIDE 7

Graph covariance functions

Interpretation

- A random walk on the graph has transition probability matrix $A_{ij} d_j^{-1}$ for the transition $j \to i$
- After $s$ steps: $(A D^{-1})^s = D^{1/2} \left(D^{-1/2} A D^{-1/2}\right)^s D^{-1/2}$
- Compare this with $C \propto \sum_{s=0}^{p} \binom{p}{s} (1/a)^s (1 - 1/a)^{p-s} \left(D^{-1/2} A D^{-1/2}\right)^s$
- So $D^{1/2} C D^{-1/2}$ is a random walk transition matrix, averaged over the distribution of the number of steps:
  - random walk kernel: $s \sim \mathrm{Binomial}(p, 1/a)$
  - diffusion kernel: $s \sim \mathrm{Poisson}(\sigma^2/2)$; the diffusion kernel is the limit $p, a \to \infty$ at constant $p/a = \sigma^2/2$
- (The binomial identity is checked numerically below.)
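The binomial-average identity can be checked numerically on any small graph; a sketch using a path graph (chosen by me so the degrees are not all equal and the $D^{\pm 1/2}$ factors actually matter):

```python
import numpy as np
from scipy.stats import binom

V, a, p = 6, 2.0, 5
A = np.zeros((V, V))
for i in range(V - 1):                      # path graph: degrees 1 and 2
    A[i, i + 1] = A[i + 1, i] = 1

d = A.sum(axis=1)
M = A / np.sqrt(np.outer(d, d))             # D^{-1/2} A D^{-1/2}
C = np.linalg.matrix_power((a - 1) * np.eye(V) + M, p) / a**p
T = A / d                                   # A D^{-1}: transition matrix j -> i

# average (A D^{-1})^s over s ~ Binomial(p, 1/a)
avg = sum(binom.pmf(s, p, 1 / a) * np.linalg.matrix_power(T, s)
          for s in range(p + 1))

lhs = np.diag(np.sqrt(d)) @ C @ np.diag(1 / np.sqrt(d))  # D^{1/2} C D^{-1/2}
print(np.allclose(lhs, avg))                # -> True
```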



SLIDE 9

Random regular graphs

- Regular graphs: every node has the same degree $d$
- Random graph ensemble: all graphs with given $V$ and $d$ are assigned the same probability
- Typical loops are then long ($\propto \ln V$) if $V$ is large, so locally these graphs are tree-like
- How do graph covariance functions then behave?
- Expect that after many random walk steps ($p \to \infty$) the kernel becomes uniform: $C_{ij} = 1$, all nodes fully correlated


SLIDE 10

Covariance functions on regular trees

- On regular trees, all nodes are equivalent (except for boundary effects)
- So the kernel $C_{ij}$ is a function only of the distance $\ell$ measured along the graph (the number of links between $i$ and $j$)
- Can calculate recursively over $p$: $C_{\ell,\,p=0} = \delta_{\ell,0}$ and
  $C_{0,p+1} = \left(1 - \tfrac{1}{a}\right) C_{0,p} + \tfrac{d}{ad}\, C_{1,p}$
  $C_{\ell,p+1} = \tfrac{1}{ad}\, C_{\ell-1,p} + \left(1 - \tfrac{1}{a}\right) C_{\ell,p} + \tfrac{d-1}{ad}\, C_{\ell+1,p} \quad (\ell \geq 1)$
- Normalize afterwards for each $p$ so that $C_{0,p} = 1$
- Let's see what happens for $d = 3$, $a = 2$ and increasing $p$ (a code sketch of the recursion follows)
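As a minimal numerical sketch of this recursion (not the authors' code; the parameter values are just the ones quoted on the slide):

```python
import numpy as np

def tree_kernel(d=3, a=2.0, p_max=500):
    """Iterate the recursion for C_{l,p} on a regular tree of degree d,
    renormalizing so C_{0,p} = 1 at every step. Returns array K[p, l]."""
    ell_max = p_max + 2          # kernel support grows by one link per step
    C = np.zeros(ell_max)
    C[0] = 1.0                   # C_{l,0} = delta_{l,0}
    K = [C.copy()]
    for _ in range(p_max):
        Cn = np.empty_like(C)
        Cn[0] = (1 - 1/a) * C[0] + (d / (a * d)) * C[1]
        Cn[1:-1] = C[:-2] / (a*d) + (1 - 1/a) * C[1:-1] + (d-1)/(a*d) * C[2:]
        Cn[-1] = C[-2] / (a*d) + (1 - 1/a) * C[-1]   # truncation boundary
        C = Cn / Cn[0]           # normalize C_{0,p} = 1
        K.append(C.copy())
    return np.array(K)

K = tree_kernel()
print(K[500, :6])  # short-distance values settle to a nonuniform limit
```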


SLIDE 11

Effect of increasing p

[Figure: normalized kernel $K_\ell$ versus distance $\ell$ for $p = 1, 2, 3, 4, 5, 10, 20, 50, 100, 200, 500$ and $p = \infty$; $a = 2$, $d = 3$]

The kernel does not become uniform even for $p \to \infty$.


SLIDE 12

What is going on?

Mapping to biased random walk

- Gather all the (equal) random walk probabilities over the shell of nodes at distance $\ell$:
  $S_{0,p} = C_{0,p}, \qquad S_{\ell,p} = d(d-1)^{\ell-1}\, C_{\ell,p}$
- The recursion $S_{\ell,p} \to S_{\ell,p+1}$ then represents a biased random walk in one dimension with a reflecting barrier at the origin: from any site $\ell \geq 1$ the walker steps right ($\ell \to \ell+1$) with probability $(d-1)/(ad)$, left ($\ell \to \ell-1$) with probability $1/(ad)$, and stays put with probability $1 - 1/a$; at the origin it stays with probability $1 - 1/a$ and steps to $\ell = 1$ with probability $1/a$


SLIDE 13

Random walk propagation

[Figure: $\ln S_{\ell,p}$ versus $\ell$ for $d = 3$, $a = 2$ and $p = 500, 1000, 2000, 5000$]

The walk moves $\ell \to \ell + 1$ with probability $(d-1)/(ad)$ and $\ell \to \ell - 1$ with probability $1/(ad)$, giving a mean drift of $(d-2)/(ad)$ per step; after $p$ steps $S_{\ell,p}$ therefore has its peak at $\ell = (p/a)(d-2)/d$ (verified numerically below).
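A quick check of the peak position by evolving the walk's distribution directly; this is a sketch using the boundary rules as reconstructed on the previous slide:

```python
import numpy as np

d, a, p = 3, 2.0, 5000
right, left, stay = (d - 1) / (a * d), 1 / (a * d), 1 - 1 / a
S = np.zeros(p + 2)
S[0] = 1.0
for _ in range(p):
    Sn = np.zeros_like(S)
    Sn[0] = stay * S[0] + left * S[1]                     # reflecting origin
    Sn[1] = (1 / a) * S[0] + stay * S[1] + left * S[2]
    Sn[2:-1] = right * S[1:-2] + stay * S[2:-1] + left * S[3:]
    Sn[-1] = right * S[-2] + stay * S[-1]
    S = Sn
print(np.argmax(S), (p / a) * (d - 2) / d)  # both come out near 833
```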


SLIDE 14

Converting back to $C_{\ell,p} \propto S_{\ell,p}/(d-1)^{\ell-1}$

[Figure: $\ln S_{\ell,p}$ (left) and the resulting kernel $K_\ell$ on a log scale (right) versus $\ell$, for $p = 100, 500, 2000, 5000$; $d = 3$, $a = 2$]

- The covariance function is determined by the tail of $S_{\ell,p}$ near the origin
- This can be used to calculate $C_{\ell,\,p\to\infty} = \left[1 + \ell(d-1)/d\right](d-1)^{-\ell/2}$



SLIDE 16

Effect of loops

- Eventually the approximation of ignoring loops must fail
- Estimate when this happens: a tree of depth $\ell$ has $V \approx d(d-1)^{\ell-1}$ nodes, so a regular graph can be tree-like at most out to $\ell \approx \ln V / \ln(d-1)$
- A random walk on the graph typically takes $p/a$ steps, so expect loop effects to appear in the covariance function around $p/a \approx \ln V / \ln(d-1)$
- Check by measuring the average of $K_1 = C_{ij}/\sqrt{C_{ii} C_{jj}}$ ($i, j$ nearest neighbours) on randomly generated graphs (a sketch of such a measurement follows)
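One way to run this check, as a sketch: networkx's `random_regular_graph` sampler stands in for whatever graph generator the authors used, and `V`, `p`, and the seed are arbitrary choices of mine.

```python
import numpy as np
import networkx as nx

def mean_K1(V=500, d=3, a=2.0, p=100, seed=0):
    """Average normalized covariance between nearest neighbours on one
    sampled random regular graph."""
    A = nx.to_numpy_array(nx.random_regular_graph(d, V, seed=seed))
    M = A / d                      # D^{-1/2} A D^{-1/2} for a regular graph
    C = np.linalg.matrix_power((a - 1) * np.eye(V) + M, p)
    K = C / np.sqrt(np.outer(np.diag(C), np.diag(C)))
    i, j = np.nonzero(A)           # all linked pairs
    return K[i, j].mean()

print(mean_K1())
```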


SLIDE 17

Covariance function for neighbouring nodes

[Figure: average $K_1$ versus $p/a$ for $a = 2, 4$ with $V = 500$ and $V = \infty$ (tree approximation); $d = 3$; vertical line at $\ln V / \ln(d-1)$]

- Around this point, $K_1$ starts to get larger than in the tree approximation ($V \to \infty$)
- Results depend only on $p/a$ for large $p$, as expected



SLIDE 19

Bayes errors and learning curves

- The generalization error $\epsilon$ of GP regression can be expressed in terms of the covariance function for any given dataset
- Assume we have the correct prior (matched case); then $\epsilon$ is the Bayes error (loss = squared difference)
- Averaging over datasets of given size $n$ gives the learning curve $\epsilon(n)$
- Take the distribution of inputs to be uniform across the graph
- How does this depend on $n$, $V$, $d$ ($= 3$ here), $a$, $p$, and the noise variance $\sigma^2$? (A sketch of the dataset average follows.)
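In the matched case the Bayes error is the posterior variance averaged over test nodes, so it can be estimated by sampling datasets. A minimal sketch; `bayes_error`, `n_samples`, and the seed are my own choices, not from the talk:

```python
import numpy as np

def bayes_error(C, n, sigma2, n_samples=20, seed=0):
    """Matched-case Bayes error: posterior variance averaged over all nodes,
    then over random datasets of n inputs drawn uniformly (with replacement)."""
    rng = np.random.default_rng(seed)
    V = len(C)
    errs = []
    for _ in range(n_samples):
        idx = rng.integers(0, V, size=n)
        Kdd = C[np.ix_(idx, idx)] + sigma2 * np.eye(n)
        Kxd = C[:, idx]
        alpha = np.linalg.solve(Kdd, Kxd.T)
        errs.append((np.diag(C) - np.einsum('ij,ji->i', Kxd, alpha)).mean())
    return np.mean(errs)
```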


SLIDE 20

Some simulation results for orientation

[Figure: simulated learning curves $\epsilon$ versus $\nu = n/V$ for $\sigma^2 = 0.1, 0.01, 0.001, 0.0001$; $V = 500$, $d = 3$, $a = 2$, $p = 10$]

Two different regimes: $\epsilon > \sigma^2$ and $\epsilon < \sigma^2$.


SLIDE 21

Theory: Learning curve approximation

- Approximations for the learning curve are based on the kernel eigenvalues: $\langle C_{ij}\,\phi_j \rangle = \lambda \phi_i$, where $\langle \cdots \rangle$ is over the input distribution across nodes
- Try a simple but often accurate approximation (a solver sketch follows):
  $\epsilon = g\!\left(\frac{n}{\epsilon + \sigma^2}\right), \qquad g(h) = \sum_{\mu=1}^{V} \left(\lambda_\mu^{-1} + h\right)^{-1}$
- This has to be solved self-consistently; note that $g(0) = \sum_\mu \lambda_\mu = \langle C_{jj} \rangle = 1$
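The self-consistent equation is easy to solve by damped fixed-point iteration; a minimal sketch (the function name and damping factor are mine):

```python
import numpy as np

def learning_curve(eigvals, n, sigma2, tol=1e-12, max_iter=10000):
    """Solve eps = g(n/(eps + sigma2)), g(h) = sum_mu (1/lambda_mu + h)^{-1},
    by damped fixed-point iteration (eigvals assumed positive, summing to 1)."""
    lam = np.asarray(eigvals)
    eps = lam.sum()                     # start from g(0) = 1
    for _ in range(max_iter):
        new = np.sum(1.0 / (1.0 / lam + n / (eps + sigma2)))
        if abs(new - eps) < tol:
            break
        eps = 0.5 * (eps + new)         # damping for stability
    return eps
```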


SLIDE 22

Theory: Limit of large V

- For large $V$, the tree approximation should be accurate
- The tree Laplacian eigenvalue density is known:
  $\rho_L(\lambda) = \frac{d\,\sqrt{4(d-1)/d^2 - (\lambda-1)^2}}{2\pi\,\lambda(2-\lambda)}$
- Eigenvalues of the covariance function are then $\propto V^{-1}(a - \lambda)^p$
- Use this to evaluate approximate learning curves; they depend on $n$ and $V$ only through $\nu = n/V$ (see the integral version sketched below)
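Replacing the eigenvalue sum in $g(h)$ by an integral over $\rho_L$ makes $V$ drop out, leaving an equation in $\nu$ alone. A sketch under the assumptions above (kernel eigenvalues $(a-\lambda)^p/(VZ)$, normalized to sum to 1; function names are mine):

```python
import numpy as np
from scipy.integrate import quad

def rho_L(lam, d=3):
    """Tree (Kesten-McKay-type) eigenvalue density of the normalized Laplacian."""
    disc = 4 * (d - 1) / d**2 - (lam - 1)**2
    return d * np.sqrt(np.maximum(disc, 0.0)) / (2 * np.pi * lam * (2 - lam))

def learning_curve_infV(nu, sigma2, d=3, a=2.0, p=10, iters=200):
    """Self-consistent eps(nu) in the V -> infinity limit."""
    lo = 1 - 2 * np.sqrt(d - 1) / d           # support of rho_L
    hi = 1 + 2 * np.sqrt(d - 1) / d
    Z = quad(lambda l: rho_L(l, d) * (a - l)**p, lo, hi)[0]
    eps = 1.0                                 # g(0) = 1
    for _ in range(iters):
        g = quad(lambda l: rho_L(l, d) /
                 (Z * (a - l)**(-p) + nu / (eps + sigma2)), lo, hi)[0]
        eps = 0.5 * (eps + g)                 # damped fixed-point step
    return eps

print(learning_curve_infV(nu=1.0, sigma2=0.01))
```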


SLIDE 23

Eigenvalue spectra

[Figure: eigenvalue spectrum of $2 - L$ (i.e. $d = 3$, $a = 2$, $p = 1$) for $V = 2000$ versus $V = \infty$]

The tree approximation is quite accurate.


SLIDE 24

Comparison with simulations

[Figure: theory versus simulation, $\epsilon$ versus $\nu = n/V$ for $\sigma^2 = 0.1, 0.01, 0.001, 0.0001, 0$; $V = 500$, $d = 3$, $a = 2$, $p = 10$]

The approximation is accurate initially and for $\epsilon < \sigma^2$, less so in the crossover region.


SLIDE 25

Scaling with n/V

[Figure: $\epsilon$ versus $\nu = n/V$ for $V = 500$ (filled symbols) and $V = 1000$ (empty symbols), $\sigma^2 = 0.1, 0.01, 0.001, 0.0001, 0$; $d = 3$, $a = 2$, $p = 10$]

The scaling with $\nu$ works well throughout.



SLIDE 27

Effect of loops for large p

- The tree approximation must break down as $p$ increases, when loops become important
- Eventually, when the covariance function is uniform, only one function value has to be learned, so expect $\epsilon = \frac{1}{1 + n/\sigma^2}$
- Consider a case with $V = 500$, $p = 200$, $a = 2$, $d = 3$
- Compare to the naive estimate and to the approximation based on the true kernel eigenvalues (a comparison sketch follows)
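A self-contained sketch comparing the naive fully-correlated estimate with the Bayes error measured on one sampled dataset per $n$ (graph generator and seeds are arbitrary choices of mine):

```python
import numpy as np
import networkx as nx

rng = np.random.default_rng(0)
V, d, a, p, sigma2 = 500, 3, 2.0, 200, 0.1
A = nx.to_numpy_array(nx.random_regular_graph(d, V, seed=1))
C = np.linalg.matrix_power((a - 1) * np.eye(V) + A / d, p)
C /= np.trace(C) / V                       # normalize (1/V) sum_i C_ii = 1
for n in (10, 100, 1000):
    idx = rng.integers(0, V, size=n)       # uniform inputs, with replacement
    Kdd = C[np.ix_(idx, idx)] + sigma2 * np.eye(n)
    Kxd = C[:, idx]
    eps = (np.diag(C) - np.einsum('ij,ji->i', Kxd,
                                  np.linalg.solve(Kdd, Kxd.T))).mean()
    print(n, eps, 1 / (1 + n / sigma2))    # measured vs naive estimate
```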


SLIDE 28

Simulations vs theory for large p

[Figure: $\epsilon$ versus $n$ for simulation, theory, and the naive estimate $1/(1 + n/\sigma^2)$; $V = 500$, $d = 3$, $a = 2$, $p = 200$, $\sigma^2 = 0.1$]

- The naive estimate is poor even though $\lambda_1 \approx 0.994$
- The theory works well; the tail is $\propto (\sigma^2/n)\ln n$



SLIDE 30

Effect of increasing p

[Figure: $\epsilon$ versus $\nu = n/V$ for $\sigma^2 = 0.1, 0.01, 0.001, 0.0001, 0$; $V = 500$, $d = 3$, $a = 2$, $p = 20$]

The theory becomes more accurate as $p$ increases.


SLIDE 31

(Approximate) predictions for large p

- Comparison with simulation shows that the theory becomes more accurate
- For large $p$, the learning curve tail ($\epsilon \ll \sigma^2$) is found to decay as
  $\epsilon \sim \frac{c\,\sigma^2}{\nu}\, \ln^{3/2}\!\left(\frac{\nu}{c\,\sigma^2}\right), \qquad c \sim (p/a)^{-3/2}$
- So the density $\nu$ needed to reach a given $\epsilon$ decays as $c \sim p^{-3/2}$
- Even though the kernel $C_{\ell,p}$ at fixed graph distance becomes $p$-independent for large $p$, learning still gets faster
- Presumably an effect of the kernel values at large distances $\ell \sim p$?


SLIDE 32

Effect of increasing a

[Figure: $\epsilon$ versus $\nu = n/V$ for $\sigma^2 = 0.1, 0.01, 0.001, 0.0001, 0$; $V = 500$, $d = 3$, $a = 4$, $p = 10$]

The theory becomes less accurate as $a$ increases.


SLIDE 33

Effect of increasing $a$: Limit $a \to \infty$

- Increasing $a$ means the typical number of random walk steps, $p/a$, decreases
- The extreme limit $a \to \infty$ gives $C_{ij} = \delta_{ij}$: all nodes uncorrelated
- The approximation then predicts
  $\epsilon = \frac{1}{2}\left(1 - \nu - \sigma^2\right) + \sqrt{\frac{1}{4}\left(1 - \nu - \sigma^2\right)^2 + \sigma^2}$
- Compare the exact result: $\epsilon = \left\langle \left(1 + n_i/\sigma^2\right)^{-1} \right\rangle$, with $n_i \sim \mathrm{Binomial}(n, 1/V)$
- In the low-noise limit $\sigma^2 \to 0$ these become (for large $V$) $\epsilon = 1 - \nu$ vs. $\epsilon = \exp(-\nu)$, so the approximation gives an underestimate (both formulas are evaluated in the sketch below)
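Both closed forms are easy to evaluate side by side; a sketch (function names and the $\nu$ grid are mine):

```python
import numpy as np
from scipy.stats import binom

def eps_approx(nu, sigma2):
    """Self-consistency prediction for C = identity (a -> infinity)."""
    t = 1 - nu - sigma2
    return 0.5 * t + np.sqrt(0.25 * t**2 + sigma2)

def eps_exact(n, V, sigma2):
    """Exact average error <(1 + n_i/sigma2)^{-1}>, n_i ~ Binomial(n, 1/V)."""
    k = np.arange(n + 1)
    return np.sum(binom.pmf(k, n, 1 / V) / (1 + k / sigma2))

V, sigma2 = 500, 1e-4
for nu in (0.1, 0.5, 1.0, 2.0):
    print(nu, eps_approx(nu, sigma2), eps_exact(int(nu * V), V, sigma2))
```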


SLIDE 34

Limit $a \to \infty$

[Figure: exact versus approximate $\epsilon$ as a function of $\nu = n/V$ for $a = \infty$, $\sigma^2 = 0.0001$]

Same "shape" of deviation as before for larger finite $a$.


SLIDE 35

Summary and outlook

- Kernels on graphs have some counter-intuitive properties
- Function values on different nodes only become fully correlated due to loop effects
- Nontrivial limiting kernel shape ($p \to \infty$) on regular trees, which can be obtained from a biased random walk
- For not-too-large $p$, learning curves scale with $\nu = n/V$
- For large $p$, loops give the fully-correlated limit, but with significant corrections
- The simple approximation works well except for small $p/a$, in the crossover region ($\epsilon \approx \sigma^2$)
- Future work: prior mismatch? Other graph structures (Poisson, small-world, etc.)?
