
Modeling Topic Diffusion in Multi-Relational Bibliographic Information Networks (Huan Gui, Yizhou Sun, Jiawei Han, George Brova) - PowerPoint PPT Presentation



  1. Modeling Topic Diffusion in Multi-Relational Bibliographic Information Networks. Huan Gui, Yizhou Sun, Jiawei Han, George Brova (UIUC)

  2. Multi-relational Information Networks
     • In the real world, objects are connected via different types of relationships, forming multi-relational heterogeneous information networks
     • E.g.
       – In a bibliographic information network, researchers can be linked via different types of relationships: collaboration, citation, shared co-authors, co-attended conferences, etc.
       – In a social network, people are connected via friendships, colleague relationships, family relationships, etc.

  3. Multi-relational Information Networks

  4. Goal of this paper
     • They address the problem of modeling information diffusion in multi-relational information networks
       – Propose a multi-relational diffusion model
         • Two models are proposed, both extending the Linear Threshold model
       – Learn the parameters of the diffusion model
         • Learned from an action log (a sequence of object sets recording when each object is activated)
         • Using maximum likelihood estimation (MLE)

  5. Dataset
     • They extracted topics from papers’ titles and abstracts
       – 79 topics in the DBLP dataset and 30 topics in the APS dataset
       – They study the diffusion of these topics during selected periods in which the topics show increasing popularity

  6. Distributed Graph Summarization

  7. Graph Summarization
     • Gives a compressed representation of the graph
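The slide does not say which summarization scheme is used. As one common reading of "compressed representation", the sketch below collapses nodes into supernodes and keeps only the number of edges between each pair of supernodes. The grouping `group_of` is assumed to be given, and the function name `summarize` is hypothetical.

```python
from collections import Counter

def summarize(edges, group_of):
    """Collapse an undirected graph into supernodes and superedge counts.

    edges is a list of (u, v) pairs; group_of maps each node to the label
    of the supernode it belongs to (assumed to be chosen elsewhere).
    """
    supernode_size = Counter(group_of.values())
    superedge_count = Counter()
    for u, v in edges:
        gu, gv = group_of[u], group_of[v]
        # Store each unordered supernode pair under one canonical key.
        key = (gu, gv) if gu <= gv else (gv, gu)
        superedge_count[key] += 1
    return supernode_size, superedge_count

# Hypothetical 4-node cycle grouped into two supernodes.
edges = [(1, 2), (2, 3), (3, 4), (1, 4)]
group_of = {1: "A", 2: "A", 3: "B", 4: "B"}
sizes, superedges = summarize(edges, group_of)
print(dict(sizes))       # {'A': 2, 'B': 2}
print(dict(superedges))  # {('A', 'A'): 1, ('A', 'B'): 2, ('B', 'B'): 1}
```

The summary stores two sizes and three counts instead of four nodes and four edges. On larger graphs the saving depends on how well the grouping matches the graph's structure.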

  8. Distributed graph processing systems
     • Giraph: an open-source implementation of Pregel [8], which was proposed by Google
       – Used in this paper
     • Others
       – GraphLab: proposed by Carlos Guestrin et al.
       – Trinity: A Distributed Graph Engine on a Memory Cloud [SIGMOD 2013], by Microsoft Research Asia
     • Other distributed systems from the database community
       – Hadoop: an open-source implementation of Google's MapReduce
       – Hyracks: by Michael Carey et al. (ICDE 2011)

  9. Algorithm

  10. MapReduce Triangle Enumeration With Guarantees

  11. Idea
      • Divide the graph into multiple overlapping partitions and distribute each partition to a mapper
      • Based on the TTP (Triangle Type Partition) algorithm [CIKM 2013]
      • Use multiple rounds to reduce the memory cost
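The overlapping-partition idea can be illustrated with a single-machine sketch. This is a simplified vertex-coloring scheme, not TTP or CTTP themselves: vertices are colored, each triple of colors defines one partition containing the edges whose endpoints both use those colors, and triangles are enumerated per partition. The coloring rule (`v % num_colors`, so integer vertex ids are assumed) and the function names are assumptions for illustration.

```python
from itertools import combinations

def triangles_in(edges):
    """Enumerate the triangles of a small undirected edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    found = set()
    for u, v in edges:
        # Every common neighbour of u and v closes a triangle.
        for w in adj[u] & adj[v]:
            found.add(tuple(sorted((u, v, w))))
    return found

def enumerate_triangles_partitioned(edges, num_colors=4):
    """Enumerate triangles partition by partition (requires num_colors >= 3)."""
    color = {v: v % num_colors for edge in edges for v in edge}
    result = set()
    for triple in combinations(range(num_colors), 3):
        allowed = set(triple)
        # One partition: edges whose endpoints both use the allowed colors.
        # An edge can land in several partitions, so partitions overlap.
        part = [(u, v) for u, v in edges
                if color[u] in allowed and color[v] in allowed]
        # In MapReduce each partition would be processed independently;
        # the set union removes triangles found in more than one partition.
        result |= triangles_in(part)
    return result

edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (2, 4),
         (4, 8), (0, 4), (0, 8)]
found = enumerate_triangles_partitioned(edges)
print(sorted(found))  # [(0, 1, 2), (0, 2, 4), (0, 4, 8), (2, 3, 4)]
```

Every triangle uses at most three colors, so it is fully contained in at least one partition. Triangles whose vertices share colors appear in several partitions, which is the redundancy that type-based schemes such as TTP are designed to limit.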

  12. Contributions
      • They propose Colored Triangle Type Partition (CTTP), a multi-round randomized MapReduce algorithm for triangle enumeration
        – The worst-case number of rounds is bounded in terms of E, m and M
          • E is the total number of edges
          • m is the expected memory size of a reducer
          • M is the total available space
        – It uses M/E space per mapper, m space per reducer, and M words of total aggregate space

  13. Results
      • They are the first to report results for this graph

  14. Component Detection in Directed Networks

  15. Directional community
      • They propose a novel community concept, the directional community
        – In a directed network, nodes play two different roles: source and terminal

  16. Proposed Methods
      • They adapt Markov Clustering (MCL) and its variant, R-MCL
      • Both are based on a simulation of stochastic flows on the network
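For readers unfamiliar with the flow simulation behind MCL, the sketch below implements plain, undirected MCL: alternate expansion (squaring the column-stochastic flow matrix) and inflation (raising entries to a power and renormalizing). It does not include the paper's changes for directional communities or the R-MCL variant. The cluster read-out in `clusters_from_flow` is a simplification that groups nodes still linked by non-negligible flow, and all function names are hypothetical.

```python
def normalize_columns(m):
    """Rescale every column of a square matrix so that it sums to 1."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] for j in range(n)] for i in range(n)]

def mcl_flow(adjacency, inflation=2.0, iterations=30):
    """Run plain MCL on a 0/1 adjacency matrix and return the final flow matrix."""
    n = len(adjacency)
    # Self-loops are a standard MCL preprocessing step.
    m = [[adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = normalize_columns(m)
    for _ in range(iterations):
        # Expansion: square the matrix so flow spreads along two-step walks.
        m = [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
             for i in range(n)]
        # Inflation: boost strong flows, damp weak ones, then renormalize.
        m = normalize_columns([[x ** inflation for x in row] for row in m])
    return m

def clusters_from_flow(m, eps=1e-6):
    """Group nodes that remain connected by flow larger than eps."""
    n = len(m)
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i in range(n):
        for j in range(n):
            if m[i][j] > eps:
                parent[find(i)] = find(j)
    groups = {}
    for v in range(n):
        groups.setdefault(find(v), []).append(v)
    return sorted(groups.values())

# Two triangles (0,1,2) and (3,4,5) joined by the single edge 2-3.
links = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adjacency = [[0.0] * 6 for _ in range(6)]
for u, v in links:
    adjacency[u][v] = adjacency[v][u] = 1.0
clusters = clusters_from_flow(mcl_flow(adjacency))
print(clusters)  # [[0, 1, 2], [3, 4, 5]]
```

Inflation is what separates the two triangles: the flow crossing the bridge is weaker than the flow inside each triangle, so repeated squaring and renormalizing drives it toward zero.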

  17. Case Study: Twitter
      • Detecting communities in the Twitter interaction network
        – A directed edge from a source node to a terminal node is created if any of the following interactions happens:
          • retweeting (forwarding) a tweet
          • replying to a tweet
          • mentioning someone

  18. Case Study: Twitter
      • Source: posts tweets
      • Terminal: spreads the tweets
      • This hashtag represents the “No vull pagar” (“I don’t want to pay”) campaign, a protest in Catalonia in early April 2012
