N-gram Graph: Representation for Graphs



SLIDE 1

N-gram Graph: Representation for Graphs

Shengchao Liu, Mehmet Furkan Demirel, Yingyu Liang
University of Wisconsin-Madison, Madison
Presenter: Hanjun Dai

SLIDE 2

Machine Learning Progress

  • Significant progress in Machine Learning

Computer vision, machine translation, game playing, medical imaging

SLIDE 4

ML for Graph-structured Data like Molecules?

[Figure labels: Prediction, Classifier, Representation Learning]

Key Challenge

SLIDE 5

Our Method: N-gram Graphs

  • Unsupervised, so can be used by various learning methods
  • Simple, relatively fast to compute
  • Strong empirical performance
  • Outperforms traditional fingerprint/kernel methods and recent popular GNNs on molecule datasets

  • Preliminary results on other types of data are also strong
  • Strong theoretical power for representation/prediction
SLIDE 8

N-gram Graphs: Bag of Walks

  • Key idea: view a graph as Bag of Walks
  • Walks of length $n$ are called $n$-grams

[Figure: a molecular graph and its 2-grams]
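
To make the bag-of-walks view concrete, here is a minimal sketch that lists the n-grams of a small graph. It assumes an n-gram is any sequence of n vertices in which consecutive vertices are adjacent, so each undirected walk shows up once per direction and backtracking is allowed; the slides do not say whether the authors filter or deduplicate any of these. The function name `enumerate_n_grams` and the toy C-C-O chain are hypothetical, chosen only for illustration.

```python
def enumerate_n_grams(adj_list, n):
    """Return every walk with n vertices as a tuple of vertex ids."""
    # Start from the 1-grams (single vertices) and extend one step at a time.
    walks = [(v,) for v in adj_list]
    for _ in range(n - 1):
        walks = [walk + (nxt,) for walk in walks for nxt in adj_list[walk[-1]]]
    return walks


# Hypothetical toy molecule: a C-C-O chain with vertex ids 0, 1, 2.
atoms = {0: "C", 1: "C", 2: "O"}
bonds = {0: [1], 1: [0, 2], 2: [1]}

two_grams = enumerate_n_grams(bonds, 2)
labels = ["-".join(atoms[v] for v in walk) for walk in two_grams]
print(two_grams)  # [(0, 1), (1, 0), (1, 2), (2, 1)]
print(labels)     # ['C-C', 'C-C', 'C-O', 'O-C']
```

Under this reading, the 2-grams are simply the bonds of the molecule, each listed in both directions.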

N-gram Graph (suppose the embeddings for vertices are given):
  1. Embed each $n$-gram: entrywise product of its vertex embeddings
  2. Sum up the embeddings of all $n$-grams: denote the sum as $f_{(n)}$
  3. Repeat for $n = 1, 2, \dots, T$, and concatenate $f_{(1)}, \dots, f_{(T)}$

Equivalent to a simple Graph Neural Network!
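
The three steps can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the authors' implementation: it treats an n-gram as any sequence of n vertices with consecutive vertices adjacent (both directions, backtracking included), and the function name, the argument layout, and the toy path graph are all hypothetical. Instead of enumerating walks, it keeps a running sum per end vertex and extends it one step at a time; each step aggregates over neighbours and then multiplies by the vertex's own embedding, which is one way to read the "simple Graph Neural Network" remark.

```python
def n_gram_graph_embedding(vertex_emb, adj, T):
    """Concatenate f_(1), ..., f_(T) for one graph.

    vertex_emb: list of r-dimensional embeddings, one per vertex (assumed given).
    adj: symmetric 0/1 adjacency matrix as a list of lists.
    """
    num_vertices = len(vertex_emb)
    dim = len(vertex_emb[0])
    # ending_at[i] = sum, over all n-grams that end at vertex i, of the
    # entrywise product of their vertex embeddings. For n = 1 this is just
    # the embedding of vertex i.
    ending_at = [list(e) for e in vertex_emb]
    embedding = []
    for n in range(1, T + 1):
        if n > 1:
            # Extend every (n-1)-gram by one more vertex i: aggregate the
            # sums of i's neighbours, then multiply by i's own embedding.
            ending_at = [
                [
                    vertex_emb[i][k]
                    * sum(ending_at[j][k] for j in range(num_vertices) if adj[j][i])
                    for k in range(dim)
                ]
                for i in range(num_vertices)
            ]
        # f_(n): sum over all end vertices, i.e. over all n-grams.
        f_n = [sum(ending_at[i][k] for i in range(num_vertices)) for k in range(dim)]
        embedding.extend(f_n)
    return embedding


# Hypothetical path graph 0 - 1 - 2 with made-up 2-dimensional embeddings.
adj = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
vertex_emb = [[1, 2], [3, 1], [2, 2]]

print(n_gram_graph_embedding(vertex_emb, adj, T=3))
# [6, 5, 18, 8, 54, 20]  ->  f_(1) = [6, 5], f_(2) = [18, 8], f_(3) = [54, 20]
```

The output length is T times the vertex-embedding dimension, independent of the graph size, so graphs of different sizes map to vectors of the same length.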

SLIDE 10

Experimental Results

  • 60 tasks on 10 datasets (predict molecular properties)
  • Compared to classic fingerprint/kernel and recent GNNs
  • N-gram+XGBoost: top-1 for 21 tasks, and top-3 for 48 tasks
  • Overall better than the other methods
SLIDE 11

Theoretical Analysis

  • N-gram graph ≈ compressive sensing of the count statistics (i.e., histogram of different types of $n$-grams)

  • Thus has strong representation and prediction power
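
The link to count statistics can be illustrated for 2-grams with a small sketch. It assumes that vertices of the same type (for example, the same atom) share one embedding, which the slides do not state explicitly, and it uses made-up embedding values and hypothetical function names. Under that assumption, summing the 2-gram embeddings directly gives the same vector as taking the histogram of 2-gram types and applying a fixed linear map to it. Only this linear-measurement half is shown; recovering the histogram from the embedding, the part that compressive sensing theory addresses, is not attempted here.

```python
from collections import Counter


def entrywise_product(u, v):
    return [a * b for a, b in zip(u, v)]


def two_gram_embedding_direct(vertex_types, edges, type_emb):
    """Sum the entrywise products over every 2-gram (each edge, both directions)."""
    dim = len(next(iter(type_emb.values())))
    total = [0] * dim
    for u, v in edges:
        for a, b in ((u, v), (v, u)):
            prod = entrywise_product(type_emb[vertex_types[a]], type_emb[vertex_types[b]])
            total = [t + p for t, p in zip(total, prod)]
    return total


def two_gram_histogram(vertex_types, edges):
    """Count how often each ordered pair of vertex types occurs as a 2-gram."""
    counts = Counter()
    for u, v in edges:
        counts[(vertex_types[u], vertex_types[v])] += 1
        counts[(vertex_types[v], vertex_types[u])] += 1
    return counts


def two_gram_embedding_from_counts(counts, type_emb):
    """Apply the fixed linear map: one column (an entrywise product) per 2-gram type."""
    dim = len(next(iter(type_emb.values())))
    total = [0] * dim
    for (a, b), c in counts.items():
        column = entrywise_product(type_emb[a], type_emb[b])
        total = [t + c * x for t, x in zip(total, column)]
    return total


# Hypothetical C-C-O chain and made-up +/-1 type embeddings.
vertex_types = ["C", "C", "O"]
edges = [(0, 1), (1, 2)]
type_emb = {"C": [1, -1, 1], "O": [-1, -1, 1]}

counts = two_gram_histogram(vertex_types, edges)
direct = two_gram_embedding_direct(vertex_types, edges, type_emb)
via_counts = two_gram_embedding_from_counts(counts, type_emb)
print(dict(counts))        # {('C', 'C'): 2, ('C', 'O'): 1, ('O', 'C'): 1}
print(direct, via_counts)  # [0, 4, 4] [0, 4, 4]
```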
SLIDE 12
  • Code published: https://github.com/chao1224/n_gram_graph

Come to Poster # 70 for details!