 
Targeted end-to-end knowledge graph decomposition
Blaž Škrlj, Jan Kralj and Nada Lavrač
Jožef Stefan Institute, Ljubljana, Slovenia
blaz.skrlj@ijs.si
September 3, 2018
Introduction
Complex networks
Curated knowledge (e.g., ontologies)
Can we use the curated (background) knowledge to learn better from networks?
Knowledge graphs
Complex networks + semantic relations (e.g., BioMine [1])
[1] Lauri Eronen and Hannu Toivonen. “Biomine: predicting links between biological entities using network models of heterogeneous databases”. In: BMC Bioinformatics 13.1 (2012), p. 119.
Problem statement
Inputs. Given:
A knowledge graph (with relation-labeled edges)
A set of class-labeled target nodes
Outputs:
An optimal decomposition of the knowledge graph with respect to target nodes and a given task (e.g., node classification)
Open problem: How to automatically exploit background knowledge (relation-labeled edges) during learning?
Network decomposition: the HINMINE [2] key idea
Identify directed paths of length two between the target nodes of interest.
Construct weighted edges between target nodes.
[Figure: edge construction.]
[2] Jan Kralj, Marko Robnik-Šikonja, and Nada Lavrač. “HINMINE: Heterogeneous information network mining with information retrieval heuristics”. In: Journal of Intelligent Information Systems (2017), pp. 1–33.
Edge weight computation
More formally, given a heuristic function $f$, the weight of an edge between two nodes $u$ and $v$ is computed as
$$w(u, v) = \sum_{\substack{m \in M \\ (u, m) \in E \\ (m, v) \in E}} f(m),$$
where $f(m)$ represents the weight function and $m$ an intermediary node. Here, $M$ represents the set of intermediary nodes and $E$ the set of the knowledge graph's edges.
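The edge weight sum can be illustrated with a minimal Python sketch. It is not the HINMINE implementation: the function name `decompose`, the toy edge list, the constant heuristic, and the choices to treat every non-target node as a possible intermediary and to skip self-loops are all assumptions made for illustration.

```python
from collections import defaultdict

def decompose(edges, targets, f):
    """Return weights w(u, v) for target nodes u, v joined by a directed path u -> m -> v."""
    targets = set(targets)
    successors = defaultdict(set)
    for src, dst in edges:
        successors[src].add(dst)
    weights = defaultdict(float)
    for u in targets:
        for m in successors[u]:
            if m in targets:
                continue  # assumption: intermediaries M are the non-target nodes
            for v in successors[m]:
                if v in targets and v != u:  # assumption: self-loops are skipped
                    weights[(u, v)] += f(m)  # add f(m) for each intermediary m
    return dict(weights)

# Hypothetical toy graph: p1, p2, p3 are target nodes; a, b are intermediaries.
edges = [("p1", "a"), ("a", "p2"), ("p1", "b"), ("b", "p2"), ("b", "p3")]
weights = decompose(edges, targets=["p1", "p2", "p3"], f=lambda m: 1.0)
print(weights)
```

With the constant heuristic $f(m) = 1$, each weight simply counts the intermediaries shared by a pair of target nodes; other heuristics only change the value added per intermediary.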
HINMINE and current state-of-the-art
Table 1: HINMINE term weighting schemes, tested for decomposition of knowledge graphs, and their corresponding formulas in text mining.
Scheme      Formula
tf          $f(t, d)$
tf-idf      $f(t, d) \cdot \log\left(\frac{|D|}{|\{d' \in D : t \in d'\}|}\right)$
chi^2       $f(t, d) \cdot \sum_{c \in C} \frac{\left(P(t \wedge c) P(\neg t \wedge \neg c) - P(t \wedge \neg c) P(\neg t \wedge c)\right)^2}{P(t) P(\neg t) P(c) P(\neg c)}$
ig          $f(t, d) \cdot \sum_{c \in C,\, c' \in \{c, \neg c\},\, t' \in \{t, \neg t\}} P(t', c') \cdot \log\left(\frac{P(t' \wedge c')}{P(t') P(c')}\right)$
gr          $f(t, d) \cdot \sum_{c \in C} \frac{\sum_{c' \in \{c, \neg c\},\, t' \in \{t, \neg t\}} P(t', c') \cdot \log\left(\frac{P(t' \wedge c')}{P(t') P(c')}\right)}{-\sum_{c' \in \{c, \neg c\}} P(c') \cdot \log P(c')}$
delta-idf   $f(t, d) \cdot \sum_{c \in C} \left(\log\frac{|c|}{|\{d' \in D : d' \in c \wedge t \in d'\}|} - \log\frac{|\neg c|}{|\{d' \in D : d' \notin c \wedge t \notin d'\}|}\right)$
rf          $f(t, d) \cdot \sum_{c \in C} \log\left(2 + \frac{|\{d' \in D : d' \in c \wedge t \in d'\}|}{|\{d' \in D : d' \notin c \wedge t \notin d'\}|}\right)$
bm25        $f(t, d) \cdot \log\left(\frac{|D|}{|\{d' \in D : t \in d'\}|}\right) \cdot \frac{k + 1}{f(t, d) + k \cdot \left(1 - b + b \cdot \frac{|d|}{\mathrm{avgdl}}\right)}$
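As a rough illustration of how two of the schemes in Table 1 behave in their original text mining setting, the sketch below evaluates tf-idf and bm25 on a toy term-document count matrix. The matrix, the parameter defaults `k=1.5` and `b=0.75`, and the choice to measure $|d|$ as the sum of term counts are assumptions for this example, not values taken from the slides.

```python
import numpy as np

# Toy term-document counts f(t, d): rows are terms, columns are documents.
# Assumption: every term occurs in at least one document.
counts = np.array([[2.0, 0.0, 1.0],
                   [1.0, 1.0, 1.0],
                   [0.0, 3.0, 0.0]])

def tf_idf(counts):
    """f(t, d) * log(|D| / |{d' in D : t in d'}|)."""
    n_docs = counts.shape[1]
    doc_freq = (counts > 0).sum(axis=1, keepdims=True)  # documents containing t
    return counts * np.log(n_docs / doc_freq)

def bm25(counts, k=1.5, b=0.75):
    """tf-idf multiplied by the saturating, length-normalised bm25 factor."""
    n_docs = counts.shape[1]
    doc_freq = (counts > 0).sum(axis=1, keepdims=True)
    doc_len = counts.sum(axis=0, keepdims=True)  # |d|, here the sum of counts
    avgdl = doc_len.mean()                       # average document length
    idf = np.log(n_docs / doc_freq)
    return counts * idf * (k + 1) / (counts + k * (1 - b + b * doc_len / avgdl))

print(tf_idf(counts))
print(bm25(counts))
```

The second term occurs in every document, so both schemes assign it zero weight, which is the usual motivation for the idf factor.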
Towards end-to-end decomposition
HINMINE's heuristics are comparable to state-of-the-art methods, BUT:
A heuristic's performance is dataset-dependent.
The paths used for decomposition are manually selected (many possibilities).
In this paper we address the following questions:
Can we automate the heuristic selection?
Can decompositions be combined?
Is domain expert knowledge really needed for path selection?
Decomposition as stochastic optimization
$$X_{opt} = \underset{(d, o, t) \in P(D) \times S \times P(T)}{\arg\min} \left[ \rho(\tau(d, o, t)) \right],$$
where:
$(d, o, t)$ corresponds to the paths, operators and heuristics used
$\tau$ corresponds to the decomposition computation
$\rho$ represents a decomposition scoring function
$X_{opt}$ is the optimal decomposition
Combining decompositions
Set of heuristic combination operators. Let $\{h_1, h_2, \dots, h_k\}$ be a set of matrices, obtained using different decomposition heuristics. We propose four different heuristic combination operators.
1 Element-wise sum. Let $\oplus$ denote element-wise matrix summation. The combined aggregated matrix is thus defined as $M = h_1 \oplus \dots \oplus h_k$, a well-defined expression as $\oplus$ represents a commutative and associative operation.
2 Element-wise product. Let $\otimes$ denote the element-wise product. The combined aggregated matrix is thus defined as $M = h_1 \otimes \dots \otimes h_k$.
3 Normalized element-wise sum. Let $\oplus$ denote element-wise summation, and $\max(A)$ denote the largest element of the matrix $A$. The combined aggregated matrix is thus defined as $M = \frac{1}{\max(h_1 \oplus \dots \oplus h_k)} (h_1 \oplus \dots \oplus h_k)$. As $\oplus$ represents a commutative operation, this operator can be generalized to arbitrary sets of heuristics without loss of generality.
4 Normalized element-wise product. Let $\otimes$ denote the element-wise product, and $\max(A)$ denote the largest element of the matrix $A$. The combined aggregated matrix is thus defined as $M = \frac{1}{\max(h_1 \otimes \dots \otimes h_k)} (h_1 \otimes \dots \otimes h_k)$. This operator can also be generalized to arbitrary sets of heuristics.
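The four operators can be sketched with NumPy as follows. The function name `combine`, the operator labels, the dense-array representation and the handling of an all-zero result are assumptions made for this illustration; the slides do not prescribe an implementation.

```python
from functools import reduce

import numpy as np

def combine(matrices, operator="sum"):
    """Combine equally shaped decomposition matrices h_1, ..., h_k element-wise."""
    if operator in ("sum", "normalized_sum"):
        combined = reduce(np.add, matrices)
    elif operator in ("product", "normalized_product"):
        combined = reduce(np.multiply, matrices)
    else:
        raise ValueError(f"unknown operator: {operator}")
    if operator.startswith("normalized"):
        largest = combined.max()
        if largest != 0:  # assumption: an all-zero matrix is returned unchanged
            combined = combined / largest
    return combined

# Two hypothetical decompositions of the same pair of target nodes.
h1 = np.array([[0.0, 2.0], [2.0, 0.0]])
h2 = np.array([[0.0, 1.0], [3.0, 0.0]])

for name in ("sum", "product", "normalized_sum", "normalized_product"):
    print(name)
    print(combine([h1, h2], name))
```

Because both element-wise operations are commutative and associative, the order of the matrices in the list does not affect the result.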
Decomposition as stochastic optimization
Considering all possible paths + all possible heuristics + combinations of different decompositions results in combinatorial explosion.
Obtaining the optimal decomposition can also be formulated as differential evolution:
A binary vector of size |heuristics| + |triplets| + |combinationOP| is propagated through the parametric space.
The final solution represents a unique decomposition.
Pseudocode of the approach
1 Select unique paths, heuristics and operators.
2 Evolve a binary vector of solutions with respect to the target task (e.g., classification).
3 After the final number of iterations, or upon convergence, use the vector to obtain a dataset-specific decomposition.
BUT, how are the node labels predicted (decompositions scored)?
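The three steps can be mimicked with a small, self-contained Python sketch. Note the substitution: for brevity it uses a simple bit-flip evolutionary search, not differential evolution, and the score is a toy stand-in (distance to a hidden bit pattern) rather than a real classification error. The segment sizes, `HIDDEN_BEST`, and all function names are hypothetical.

```python
import random

# Hypothetical search-space sizes; a real run would derive them from the data set.
N_HEURISTICS, N_TRIPLETS, N_OPERATORS = 4, 5, 3
N_BITS = N_HEURISTICS + N_TRIPLETS + N_OPERATORS

def decode(vector):
    """Split the binary vector into heuristic, triplet and operator masks."""
    heuristics = vector[:N_HEURISTICS]
    triplets = vector[N_HEURISTICS:N_HEURISTICS + N_TRIPLETS]
    operators = vector[N_HEURISTICS + N_TRIPLETS:]
    return heuristics, triplets, operators

# Toy stand-in for the task score: number of bits differing from a hidden pattern.
HIDDEN_BEST = [1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0]

def toy_score(vector):
    return sum(a != b for a, b in zip(vector, HIDDEN_BEST))

def evolve(score, n_bits, evaluations=2000, seed=0):
    """Minimise `score` over binary vectors by repeated random bit flips."""
    rng = random.Random(seed)
    best = [rng.randint(0, 1) for _ in range(n_bits)]
    best_score = score(best)
    for _ in range(evaluations):
        # Flip each bit independently with probability 1 / n_bits.
        child = [bit ^ int(rng.random() < 1.0 / n_bits) for bit in best]
        child_score = score(child)
        if child_score <= best_score:  # keep the child if it is no worse
            best, best_score = child, child_score
    return best, best_score

best, best_score = evolve(toy_score, N_BITS)
print(decode(best), best_score)
```

In an actual pipeline, `toy_score` would be replaced by a function that builds the decomposition selected by the vector and returns the error of a classifier trained on it.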
P-PR and node prediction
Modern way: Prediction via subnetwork embeddings. We compute P-PR (Personalized PageRank) vectors for individual target nodes, hence obtaining feature matrices of size $|k|^2$, where $|k| \ll |N|$.
These matrices are used to learn the labels.
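A minimal sketch of the embedding step, assuming a dense adjacency matrix over the target nodes and plain power iteration with restart; the damping factor `0.85`, the iteration count, the handling of nodes without outgoing edges and the function names are illustrative assumptions rather than the authors' settings.

```python
import numpy as np

def personalized_pagerank(adjacency, start, damping=0.85, n_iter=100):
    """Approximate the P-PR vector of node `start` by power iteration with restart."""
    n = adjacency.shape[0]
    out_degree = adjacency.sum(axis=1, keepdims=True)
    out_degree[out_degree == 0] = 1.0       # assumption: rows without edges stay zero
    transition = adjacency / out_degree     # row-normalised transition matrix
    restart = np.zeros(n)
    restart[start] = 1.0                    # the walker always restarts at `start`
    rank = restart.copy()
    for _ in range(n_iter):
        rank = damping * (transition.T @ rank) + (1.0 - damping) * restart
    return rank

def ppr_embedding(adjacency, **kwargs):
    """Stack one P-PR vector per node into a square feature matrix."""
    n = adjacency.shape[0]
    return np.vstack([personalized_pagerank(adjacency, i, **kwargs) for i in range(n)])

# Hypothetical decomposed network: four target nodes connected in a cycle.
adjacency = np.array([[0.0, 1.0, 0.0, 1.0],
                      [1.0, 0.0, 1.0, 0.0],
                      [0.0, 1.0, 0.0, 1.0],
                      [1.0, 0.0, 1.0, 0.0]])
embedding = ppr_embedding(adjacency)
print(embedding.round(3))
```

Each row of `embedding` is the feature vector of one target node and can be passed, together with the node labels, to any standard classifier.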
P-PR embeddings
Figure 1: Personalized PageRank-based embedding. Repeated for each node, this iteration yields a $|k|^2$ matrix, directly usable for learning tasks.
P-PR general use
Node classification
We try to classify individual nodes into target class(es). Relevant for, e.g.:
Protein function prediction
Genre classification
Recommendation, etc.
[Figure panels: Function prediction; Recommendation]