
Targeted end-to-end knowledge graph decomposition
Blaž Škrlj, Jan Kralj and Nada Lavrač
Jožef Stefan Institute, Ljubljana, Slovenia
blaz.skrlj@ijs.si
September 3, 2018

Introduction
Complex networks
Curated knowledge (e.g., ontologies)
Can we use the curated (background) knowledge to learn better from networks?

Knowledge graphs
Complex networks + semantic relations (e.g., BioMine [1])

[1] Lauri Eronen and Hannu Toivonen. "Biomine: predicting links between biological entities using network models of heterogeneous databases". In: BMC Bioinformatics 13.1 (2012), p. 119.

Problem statement
Inputs. Given:
- A knowledge graph (with relation-labeled edges)
- A set of class-labeled target nodes
Outputs:
- An optimal decomposition of the knowledge graph with respect to target nodes and a given task (e.g., node classification)
Open problem: How to automatically exploit background knowledge (relation-labeled edges) during learning?

Network decomposition: HINMINE [2] key idea
- Identify directed paths of length two between the target nodes of interest.
- Construct weighted edges between target nodes.
Edge construction.

[2] Jan Kralj, Marko Robnik-Šikonja, and Nada Lavrač. "HINMINE: Heterogeneous information network mining with information retrieval heuristics". In: Journal of Intelligent Information Systems (2017), pp. 1–33.

Edge weight computation
More formally, given a heuristic function $f$, the weight of an edge between two nodes $u$ and $v$ is computed as

$$w(u, v) = \sum_{\substack{m \in M \\ (u, m) \in E \\ (m, v) \in E}} f(m),$$

where $f(m)$ is the weight function and $m$ an intermediary node. Here, $M$ represents the set of intermediary nodes and $E$ the set of the knowledge graph's edges.
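The edge weight formula can be sketched in a few lines of Python. This is a minimal illustration, not the HINMINE implementation: the function name `edge_weights`, the toy node names, and the constant heuristic `f(m) = 1` are assumptions made for the example, as is the choice to skip self-loops (u = v).

```python
from collections import defaultdict

def edge_weights(edges, targets, intermediaries, f):
    """Compute w(u, v) = sum of f(m) over intermediary nodes m
    such that (u, m) and (m, v) are both directed edges."""
    successors = defaultdict(set)
    for a, b in edges:
        successors[a].add(b)

    weights = defaultdict(float)
    for u in targets:
        for m in successors.get(u, ()):
            if m not in intermediaries:
                continue
            for v in successors.get(m, ()):
                # Assumption: only pairs of distinct target nodes get an edge.
                if v in targets and v != u:
                    weights[(u, v)] += f(m)
    return dict(weights)

# Toy graph (hypothetical names): two target nodes linked through two intermediaries.
edges = [("p1", "g1"), ("g1", "p2"), ("p1", "g2"), ("g2", "p2"), ("p2", "g1")]
weights = edge_weights(edges, {"p1", "p2"}, {"g1", "g2"}, f=lambda m: 1.0)
print(weights)
```

With the constant heuristic, the weight simply counts the length-two paths between two target nodes; swapping in another `f` changes how much each intermediary node contributes.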

HINMINE and current state-of-the-art
Table 1: HINMINE term weighting schemes, tested for decomposition of knowledge graphs, and their corresponding formulas in text mining.

Scheme      Formula
tf          $f(t,d)$
tf-idf      $f(t,d) \cdot \log\left(\frac{|D|}{|\{d' \in D : t \in d'\}|}\right)$
chi^2       $f(t,d) \cdot \sum_{c \in C} \frac{\left(P(t \wedge c) P(\neg t \wedge \neg c) - P(t \wedge \neg c) P(\neg t \wedge c)\right)^2}{P(t) P(\neg t) P(c) P(\neg c)}$
ig          $f(t,d) \cdot \sum_{c \in C} \sum_{c' \in \{c, \neg c\}} \sum_{t' \in \{t, \neg t\}} P(t', c') \cdot \log\left(\frac{P(t' \wedge c')}{P(t') P(c')}\right)$
gr          $f(t,d) \cdot \sum_{c \in C} \frac{\sum_{c' \in \{c, \neg c\}} \sum_{t' \in \{t, \neg t\}} P(t', c') \cdot \log\left(\frac{P(t' \wedge c')}{P(t') P(c')}\right)}{-\sum_{c' \in \{c, \neg c\}} P(c') \cdot \log P(c')}$
delta-idf   $f(t,d) \cdot \sum_{c \in C} \left(\log\frac{|c|}{|\{d' \in D : d' \in c \wedge t \in d'\}|} - \log\frac{|\neg c|}{|\{d' \in D : d' \notin c \wedge t \notin d'\}|}\right)$
rf          $f(t,d) \cdot \sum_{c \in C} \log\left(2 + \frac{|\{d' \in D : d' \in c \wedge t \in d'\}|}{|\{d' \in D : d' \notin c \wedge t \notin d'\}|}\right)$
bm25        $f(t,d) \cdot \log\left(\frac{|D|}{|\{d' \in D : t \in d'\}|}\right) \cdot \frac{k + 1}{f(t,d) + k \cdot \left(1 - b + b \cdot \frac{|d|}{\mathrm{avgdl}}\right)}$
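As a rough illustration of two of the schemes in Table 1, the tf-idf and bm25 formulas can be written directly in their text-mining form. The function names and the default values `k=1.5` and `b=0.75` are assumptions for this sketch (the slides do not state parameter values), and the sketch does not show how HINMINE maps terms and documents onto network nodes.

```python
import math

def tf_idf(tf, n_docs, n_docs_with_term):
    """f(t, d) * log(|D| / |{d' in D : t in d'}|)."""
    return tf * math.log(n_docs / n_docs_with_term)

def bm25(tf, n_docs, n_docs_with_term, doc_len, avgdl, k=1.5, b=0.75):
    """tf-idf scaled by the BM25 saturation term.
    k and b are assumed defaults, not values taken from the slides."""
    idf = math.log(n_docs / n_docs_with_term)
    return tf * idf * (k + 1) / (tf + k * (1 - b + b * doc_len / avgdl))

# A term occurring twice in a document, and present in 2 of 8 documents.
print(tf_idf(2, 8, 2))
print(bm25(2, 8, 2, doc_len=12, avgdl=10))
```

Both weights drop to zero for a term that occurs in every document, since the logarithm of 1 is 0.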

Towards end-to-end decomposition
HINMINE's heuristics are comparable to state-of-the-art methods, BUT
- A heuristic's performance is dataset-dependent
- Paths used for decomposition are manually selected (many possibilities)
In this paper we address the following questions:
- Can we automate the heuristic selection?
- Can decompositions be combined?
- Is domain expert knowledge really needed for path selection?

Decomposition as stochastic optimization

$$X_{opt} = \operatorname*{arg\,min}_{(d, o, t) \in P(D) \times S \times P(T)} \left[ \rho(\tau(d, o, t)) \right].$$

Where:
- $(d, o, t)$ corresponds to the paths, operators and heuristics used
- $\tau$ corresponds to the decomposition computation
- $\rho$ represents a decomposition scoring function
- $X_{opt}$ is the optimal decomposition

Combining decompositions
Set of heuristic combination operators. Let $\{h_1, h_2, \ldots, h_k\}$ be a set of matrices, obtained using different decomposition heuristics. We propose four different heuristic combination operators.
1. Element-wise sum. Let $\oplus$ denote element-wise matrix summation. The combined aggregated matrix is thus defined as $M = h_1 \oplus \cdots \oplus h_k$, a well-defined expression as $\oplus$ represents a commutative and associative operation.
2. Element-wise product. Let $\otimes$ denote the element-wise product. The combined aggregated matrix is thus defined as $M = h_1 \otimes \cdots \otimes h_k$.
3. Normalized element-wise sum. Let $\oplus$ denote element-wise summation, and $\max(A)$ denote the largest element of the matrix $A$. The combined aggregated matrix is thus defined as $M = \frac{1}{\max(h_1 \oplus \cdots \oplus h_k)} (h_1 \oplus \cdots \oplus h_k)$. As $\oplus$ represents a commutative operation, this operator can be generalized to arbitrary sets of heuristics without loss of generality.
4. Normalized element-wise product. Let $\otimes$ denote the element-wise product, and $\max(A)$ denote the largest element of the matrix $A$. The combined aggregated matrix is thus defined as $M = \frac{1}{\max(h_1 \otimes \cdots \otimes h_k)} (h_1 \otimes \cdots \otimes h_k)$. This operator can also be generalized to arbitrary sets of heuristics.
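The four combination operators map directly onto element-wise NumPy operations. The sketch below is illustrative only: the function name `combine`, its arguments, and the guard against dividing by a zero maximum are assumptions of this example, not part of the original method description.

```python
from functools import reduce

import numpy as np

def combine(matrices, operator="sum", normalize=False):
    """Combine decomposition matrices h_1..h_k element-wise.
    operator: "sum" or "product"; normalize divides by the largest element."""
    if operator == "sum":
        combined = reduce(np.add, matrices)
    elif operator == "product":
        combined = reduce(np.multiply, matrices)
    else:
        raise ValueError(f"unknown operator: {operator}")
    if normalize:
        largest = combined.max()
        # Assumption: leave an all-zero matrix unchanged rather than divide by 0.
        if largest != 0:
            combined = combined / largest
    return combined

h1 = np.array([[0.0, 1.0], [2.0, 0.0]])
h2 = np.array([[0.0, 3.0], [1.0, 0.0]])
print(combine([h1, h2], "sum"))                      # element-wise sum
print(combine([h1, h2], "product", normalize=True))  # normalized product
```

Because both operations are commutative and associative, the order of the matrices in the list does not affect the result.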

Decomposition as stochastic optimization
Considering all possible paths + all possible heuristics + combinations of different decompositions results in combinatorial explosion.
Obtaining the optimal decomposition can also be formulated as differential evolution:
- A binary vector of size |heuristics| + |triplets| + |combinationOP| is propagated through the parametric space
- The final solution represents a unique decomposition

Pseudocode of the approach
1. Select unique paths, heuristics and operators
2. Evolve a binary vector of solutions with respect to the target task (e.g., classification)
3. Upon reaching the final number of iterations, convergence, etc., use the vector to obtain a dataset-specific decomposition
BUT, how are the node labels predicted (decompositions scored)?
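The three steps can be sketched as a small search loop over binary vectors. Note the assumptions: this sketch swaps in a simple bit-flip evolutionary search with elitist selection, not the differential evolution used in the talk; the path names in `TRIPLETS`, the operator labels, and the `toy_loss` function are hypothetical placeholders. In practice the loss would come from scoring the decomposition on the target task (for example, 1 minus classification accuracy).

```python
import random

HEURISTICS = ["tf", "tf-idf", "chi^2", "ig", "gr", "delta-idf", "rf", "bm25"]
TRIPLETS = ["path_a", "path_b", "path_c"]  # hypothetical path identifiers
OPERATORS = ["sum", "product", "norm_sum", "norm_product"]
SIZE = len(HEURISTICS) + len(TRIPLETS) + len(OPERATORS)

def decode(vector):
    """Split a binary vector into the selected heuristics, paths and operators."""
    h, t = len(HEURISTICS), len(TRIPLETS)
    pick = lambda names, bits: [n for n, bit in zip(names, bits) if bit]
    return (pick(HEURISTICS, vector[:h]),
            pick(TRIPLETS, vector[h:h + t]),
            pick(OPERATORS, vector[h + t:]))

def evolve(loss, size, iterations=200, flip_prob=0.1, seed=0):
    """Bit-flip search: keep a mutated vector whenever its loss is not worse."""
    rng = random.Random(seed)
    best = [rng.randint(0, 1) for _ in range(size)]
    best_loss = loss(best)
    for _ in range(iterations):
        candidate = [bit ^ (rng.random() < flip_prob) for bit in best]
        candidate_loss = loss(candidate)
        if candidate_loss <= best_loss:
            best, best_loss = candidate, candidate_loss
    return best, best_loss

# Placeholder loss: Hamming distance to an arbitrary, made-up configuration.
HIDDEN = [1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0]

def toy_loss(vector):
    return sum(a != b for a, b in zip(vector, HIDDEN))

best, best_loss = evolve(toy_loss, SIZE)
print(decode(best), best_loss)
```

Minimizing a loss here mirrors the arg min formulation of the optimization problem; a real scoring function would be far more expensive, since every evaluation requires computing and scoring a decomposition.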

P-PR and node prediction
Modern way: Prediction via subnetwork embeddings. We compute Personalized PageRank (P-PR) vectors for individual target nodes, hence obtaining feature matrices of size $|k|^2$, where $|k| \ll |N|$.
These matrices are used to learn the labels.

P-PR embeddings
Figure 1: Personalized PageRank-based embedding. Repeated for each node, this iteration yields a $|k|^2$ matrix, directly usable for learning tasks.
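A Personalized PageRank embedding of this kind can be sketched with plain power iteration. This is a minimal illustration under stated assumptions: the function names, the restart probability of 0.15, the convergence tolerance, and the handling of nodes without outgoing edges (their walk mass is returned to the start node) are choices made for the example, not details taken from the slides.

```python
import numpy as np

def ppr_vector(adj, start, restart=0.15, iters=200, tol=1e-10):
    """Personalized PageRank vector for one start node, by power iteration."""
    adj = np.asarray(adj, dtype=float)
    degree = adj.sum(axis=1, keepdims=True)
    # Row-normalize; rows of nodes without outgoing edges stay all-zero.
    trans = np.divide(adj, degree, out=np.zeros_like(adj), where=degree > 0)

    e = np.zeros(adj.shape[0])
    e[start] = 1.0
    rank = e.copy()
    for _ in range(iters):
        walk = trans.T @ rank
        # Assumption: mass lost at dangling nodes is sent back to the start node.
        walk = walk + (1.0 - walk.sum()) * e
        updated = restart * e + (1.0 - restart) * walk
        converged = np.abs(updated - rank).sum() < tol
        rank = updated
        if converged:
            break
    return rank

def ppr_embedding(adj):
    """Stack one P-PR vector per node into a k x k feature matrix."""
    n = np.asarray(adj).shape[0]
    return np.vstack([ppr_vector(adj, i) for i in range(n)])

# Toy weighted network over three target nodes.
adj = np.array([[0.0, 2.0, 0.0],
                [1.0, 0.0, 1.0],
                [0.0, 3.0, 0.0]])
embedding = ppr_embedding(adj)
print(embedding.shape)
```

Each row is a probability distribution over the nodes of the decomposed network, so the rows can be fed directly to a standard classifier as feature vectors.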

P-PR general use
Node classification
We try to classify individual nodes into target class(es). Relevant for e.g.,
- Protein function prediction
- Genre classification
- Recommendation etc.
[Figure panels: Function prediction; Recommendation]
