  1. N-gram Graph: Representation for Graphs
     Shengchao Liu, Mehmet Furkan Demirel, Yingyu Liang
     University of Wisconsin-Madison
     Presenter: Hanjun Dai

  2. Machine Learning Progress
     • Significant progress in Machine Learning:
       machine translation, computer vision, game playing, medical imaging

  3. ML for Graph-structured Data like Molecules?
     [Diagram: Representation Learning → Classifier → Prediction]

  4. ML for Graph-structured Data like Molecules?
     [Diagram: Representation Learning → Classifier → Prediction]
     Key Challenge: Representation Learning

  5. Our Method: N-gram Graphs
     • Unsupervised, so can be used by various learning methods
     • Simple, relatively fast to compute
     • Strong empirical performance
       • Outperforms traditional fingerprint/kernel methods and recent popular GNNs on molecule datasets
       • Preliminary results on other types of data are also strong
     • Strong theoretical power for representation/prediction

  6. N-gram Graphs: Bag of Walks
     • Key idea: view a graph as a Bag of Walks
     • Walks of length n are called n-grams
     [Figure: a molecular graph and its 2-grams]

  7. N-gram Graphs: Bag of Walks
     • Key idea: view a graph as a Bag of Walks
     • Walks of length n are called n-grams
     [Figure: a molecular graph and its 2-grams]
     N-gram Graph (suppose the embeddings for vertices are given):
     1. Embed each n-gram: entrywise product of its vertex embeddings
     2. Sum up the embeddings of all n-grams: denote the sum as f^(n)
     3. Repeat for n = 1, 2, …, T, and concatenate f^(1), …, f^(T)
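The three steps on this slide can be sketched by direct walk enumeration. This is an illustration, not the authors' released implementation (that lives in the linked repository); the function and argument names are mine, and I assume walks are ordered (each undirected walk is counted once per direction) and that vertex embeddings are given, as the slide states.

```python
import numpy as np

def ngram_graph_embedding(adj, vertex_emb, T):
    """Sketch of the n-gram graph embedding by direct walk enumeration.

    adj        : dict mapping vertex -> list of neighbour vertices
    vertex_emb : dict mapping vertex -> 1-D numpy embedding vector
    T          : maximum n-gram length (number of vertices per walk)

    Returns the concatenation of f^(1), ..., f^(T), where f^(n) is the
    sum over all n-grams (walks on n vertices) of the entrywise product
    of their vertex embeddings.
    """
    dim = len(next(iter(vertex_emb.values())))
    parts = []
    for n in range(1, T + 1):
        # Enumerate all ordered walks with n vertices.
        walks = [[v] for v in adj]
        for _ in range(n - 1):
            walks = [w + [u] for w in walks for u in adj[w[-1]]]
        # Step 1: embed each n-gram as the entrywise product of its vertices.
        # Step 2: sum all n-gram embeddings into f^(n).
        f_n = np.zeros(dim)
        for w in walks:
            prod = np.ones(dim)
            for v in w:
                prod *= vertex_emb[v]
            f_n += prod
        parts.append(f_n)
    # Step 3: concatenate f^(1), ..., f^(T).
    return np.concatenate(parts)
```

Enumeration is exponential in T and only meant to make the definition concrete; the next slide's GNN view gives the efficient way to compute the same quantity.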

  8. N-gram Graphs: Bag of Walks
     • Key idea: view a graph as a Bag of Walks
     • Walks of length n are called n-grams
     [Figure: a molecular graph and its 2-grams]
     N-gram Graph (suppose the embeddings for vertices are given):
     1. Embed each n-gram: entrywise product of its vertex embeddings
     2. Sum up the embeddings of all n-grams: denote the sum as f^(n)
     3. Repeat for n = 1, 2, …, T, and concatenate f^(1), …, f^(T)
     Equivalent to a simple Graph Neural Network!
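The GNN equivalence claimed on this slide can be sketched with one matrix recurrence: keeping a matrix whose row i accumulates the entrywise products of all n-vertex walks ending at vertex i, one adjacency multiplication plus one entrywise product per step reproduces the walk sums without enumeration. Names and the exact matrix form below are my reading of the slide, not the released code.

```python
import numpy as np

def ngram_graph_embedding_dp(A, F, T):
    """Sketch of the simple-GNN (dynamic programming) view of n-gram graphs.

    A : (V, V) adjacency matrix of the graph
    F : (V, d) matrix whose rows are the vertex embeddings
    T : maximum n-gram length

    With F_(1) = F and F_(n) = F * (A @ F_(n-1)) entrywise, row i of F_(n)
    is the sum of entrywise products over all n-vertex walks ending at i.
    f^(n) is the column sum of F_(n); the output concatenates f^(1..T).
    """
    F_n = F.copy()
    parts = [F_n.sum(axis=0)]           # f^(1): sum of vertex embeddings
    for _ in range(2, T + 1):
        F_n = F * (A @ F_n)             # one message-passing step
        parts.append(F_n.sum(axis=0))   # f^(n)
    return np.concatenate(parts)
```

Each iteration is a single unparameterized message-passing layer (neighbour aggregation followed by an entrywise product with the node features), which is the sense in which the slide calls this "a simple Graph Neural Network"; the cost is T sparse matrix products rather than exponential walk enumeration.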

  9. Experimental Results
     • 60 tasks on 10 datasets (predicting molecular properties)
     • Compared to classic fingerprint/kernel methods and recent GNNs

  10. Experimental Results
     • 60 tasks on 10 datasets (predicting molecular properties)
     • Compared to classic fingerprint/kernel methods and recent GNNs
     • N-gram+XGBoost: top-1 for 21 tasks, and top-3 for 48 tasks
     • Overall better than the other methods

  11. Theoretical Analysis
     • N-gram graph ≈ compressive sensing of the count statistics (i.e., the histogram of different types of n-grams)
     • Thus it has strong representation and prediction power
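The compressive-sensing claim on this slide can be written as one linear measurement. The symbols c^{(n)}, M, and K below are assumed notation for illustration, not taken from the slide:

```latex
% c^{(n)} \in \mathbb{R}^{K}: histogram over the K possible n-gram types
% (a type = a sequence of n vertex labels), and
% M \in \mathbb{R}^{d \times K}: column k is the entrywise product of the
% vertex embeddings along n-gram type k. Then the slide's embedding is
f^{(n)} = M \, c^{(n)}
```

If M acts like a compressive-sensing measurement matrix (e.g., with random vertex embeddings), the counts c^{(n)} are approximately recoverable from the much shorter f^{(n)}, which is one way to read the slide's "strong representation and prediction power".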

  12. Come to Poster #70 for details!
     • Code published: https://github.com/chao1224/n_gram_graph
