Final Report: Interest-aware Information Diffusion in Dynamic Social Network

Final Report: Interest-aware Information Diffusion in Dynamic Social Network. Zhenhao Cao, Ru Wang. Mobile Internet, 2018.6. Outline: Introduction, Related Work, Challenge & Motivation, Proposed Model, Experiments, References.


slide-1
SLIDE 1

Final Report

Interest-aware Information Diffusion in Dynamic Social Network

Zhenhao Cao, Ru Wang
Mobile Internet

2018.6
slide-2
SLIDE 2
  • Introduction
  • Related Work
  • Challenge & Motivation
  • Proposed Model
  • Experiments
  • References

Outline

EE447 2018.6 Final Report – Zhenhao Cao, Ru Wang 1/42

slide-3
SLIDE 3

  • Social Network

Introduction

slide-4
SLIDE 4
  • An earlier survey: a taxonomy for information cascade prediction
  • Collaborative Filtering methods
  • Leverage homophily: insightful
  • Get rid of troublesome feature engineering

Introduction – A Taxonomy

slide-5
SLIDE 5
  • Key idea behind CF: Homophily
  • Transplantable to information diffusion modeling

Introduction – Why CF?

[Figure: commodity adoption (adopt / not adopt) side by side with information entity adoption (retweet a post: adopt / not adopt)]

slide-6
SLIDE 6
  • CRPM & IRPM [1] (CIKM2015)

Related Work – Extant CF-based Studies

slide-7
SLIDE 7
  • GPOP [2] (WWW2017)

Related Work – Extant CF-based Studies

slide-8
SLIDE 8
  • A Collaborative Filtering Model for Personalized Retweeting Prediction [3] (DASFAA2015)

Related Work – Extant CF-based Studies

slide-9
SLIDE 9
  • Make fuller use of social network information
  • Adapt better to information diffusion modeling
  • Novel insights into user retweet behavior

Challenge & Motivation

slide-10
SLIDE 10
  • Make fuller use of social network information
  • A flat “snapshot” of users’ historical behaviors
  • Information loss: Permutation? Sequence? Diffusion topologies?

Challenge & Motivation

[Figure: a diffusion topology is compressed into a flat Retweet Matrix R with binary entries]

slide-11
SLIDE 11
  • Make fuller use of social network information
  • Adapt better to information diffusion modeling
  • Leverage diffusion topologies
  • The essence of information diffusion
  • A main difference from recommender system problems

Challenge & Motivation

slide-12
SLIDE 12
  • Make fuller use of social network information
  • Adapt better to information diffusion modeling
  • Novel insights into user retweet behavior

Challenge & Motivation

[Figure: Retweet or not? Contributing factors: Interest, Resistance, Post Attraction, Others' Influence]

slide-13
SLIDE 13

Our Work

slide-14
SLIDE 14
  • A novel framework for information diffusion

Our Work - ReTrend

[Figure: the ReTrend framework. An Interest-extraction Component, a Resistance-extraction Component, and a Prediction Component connect the observed matrices S, C, T, R with the latent factor matrices X, Y, Z, A]

slide-15
SLIDE 15
  • Four matrices carrying observable data
  • Subscription Matrix (S)
  • Contagion Matrix (C)
  • Resistance Matrix (T)
  • Retweet Matrix (R)

ReTrend – Observable Data

slide-16
SLIDE 16
  • Four factor matrices carrying latent feature vectors
  • User Interest Matrix (X)
  • User Influence Matrix (Y)
  • User Resistance Matrix (Z)
  • Item Attraction Matrix (A)

ReTrend – Learning Latent Feature

slide-17
SLIDE 17
  • Four factor matrices carrying latent feature vectors
  • User Interest Matrix (X)
  • User Influence Matrix (Y)
  • User Resistance Matrix (Z)
  • Item Attraction Matrix (A)
  • We treat 'resistance' as an inherent attribute: it varies across the latent space but remains fixed for a given user

ReTrend – Learning Latent Feature

slide-18
SLIDE 18
  • Take the Contagion Matrix as an example
  • Contagion Matrix C: |user| × |post|
  • Entry $C_{ui}$: the number of retweets of post $i$ triggered by user $u$
  • $C_{ui}$ reflects two facts:
  • to what degree a user can trigger his friends to retweet the post
  • how attractive the post is

ReTrend – Logic Explanation

slide-19
SLIDE 19
  • Take the Contagion Matrix as an example
  • Assume Gaussian observation noise

ReTrend – Logic Explanation

[Figure: Contagion Matrix C (|user| × |post|) ≈ User Influence Matrix Y × Item Attraction Matrix A, sharing latent dimension k]
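
The slide's equation is not preserved in this transcript. As a sketch only, assuming ReTrend follows the standard probabilistic matrix factorization form, the Gaussian observation model for the Contagion Matrix might be written as below. Here $Y_u$ and $A_i$ are the $k$-dimensional latent vectors of user $u$ and post $i$; the noise variance $\sigma_C^2$ and the indicator $I^C_{ui}$ (1 when $C_{ui}$ is observed, 0 otherwise) are assumed names, not taken from the slides.

```latex
p(C \mid Y, A, \sigma_C^2)
  = \prod_{u} \prod_{i}
    \left[ \mathcal{N}\!\left( C_{ui} \mid Y_u^{\top} A_i,\; \sigma_C^2 \right) \right]^{I^C_{ui}}
```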

slide-20
SLIDE 20
  • For the Retweet Matrix
  • Retweet behavior is determined by user interest, resistance, parent influence, and post attraction

ReTrend – Logic Explanation

slide-21
SLIDE 21
  • Define the conditional distribution over all observed data
  • Place zero-mean spherical Gaussian priors on latent feature vectors

ReTrend – Entire Model
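
A zero-mean spherical Gaussian prior has a standard form. As a sketch for the User Interest Matrix X (and analogously for Y, Z, and A), with an assumed prior variance $\sigma_X^2$ and identity matrix $I$:

```latex
p(X \mid \sigma_X^2) = \prod_{u} \mathcal{N}\!\left( X_u \mid \mathbf{0},\; \sigma_X^2 I \right)
```

Under priors of this form, maximizing the log-posterior turns each prior into an L2 penalty on the corresponding latent vectors.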

slide-22
SLIDE 22
  • Modifying the log-likelihood yields the loss function
  • Optimize with SGD

ReTrend – Entire Model
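
The loss function itself is not preserved in this transcript. The sketch below shows how SGD could minimize a regularized squared-error loss for a single factorization, C ≈ Y Aᵀ, over observed entries only. It is a simplified stand-in, not the authors' joint loss over R, S, C, and T; the function names, the toy data, and the hyperparameters other than the 0.03 learning rate are assumptions.

```python
import random


def regularized_loss(observed, Y, A, reg):
    """Squared error on observed entries plus an L2 penalty on all factors."""
    sq_err = sum((c - sum(y * a for y, a in zip(Y[u], A[i]))) ** 2
                 for u, i, c in observed)
    l2 = sum(v * v for row in Y for v in row) + sum(v * v for row in A for v in row)
    return 0.5 * sq_err + 0.5 * reg * l2


def sgd_factorize(observed, n_users, n_posts, k=4, lr=0.03, reg=0.01,
                  epochs=300, seed=0):
    """Fit C[u][i] ~ dot(Y[u], A[i]) with per-sample SGD.

    observed: list of (user, post, value) triples, i.e. entries whose
    indicator is 1. Returns (Y, A, losses), with one loss value per epoch.
    """
    rng = random.Random(seed)
    Y = [[rng.gauss(0.0, 0.1) for _ in range(k)] for _ in range(n_users)]
    A = [[rng.gauss(0.0, 0.1) for _ in range(k)] for _ in range(n_posts)]
    losses = []
    for _ in range(epochs):
        samples = observed[:]
        rng.shuffle(samples)
        for u, i, c in samples:
            err = c - sum(y * a for y, a in zip(Y[u], A[i]))
            for d in range(k):
                y_ud, a_id = Y[u][d], A[i][d]
                # Gradient step on the squared error, with L2 shrinkage
                # applied per sample (a common SGD simplification).
                Y[u][d] += lr * (err * a_id - reg * y_ud)
                A[i][d] += lr * (err * y_ud - reg * a_id)
        losses.append(regularized_loss(observed, Y, A, reg))
    return Y, A, losses


# Hypothetical toy data: 3 users, 2 posts, 4 observed contagion counts.
observed = [(0, 0, 2.0), (1, 0, 1.0), (0, 1, 1.0), (2, 1, 1.0)]
Y, A, losses = sgd_factorize(observed, n_users=3, n_posts=2)
```

The slides report mini-batches of 1000; the per-sample update here keeps the sketch short.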

slide-23
SLIDE 23
  • How does ReTrend leverage information better?
  • Tree-structured essence of information cascade – Retweet-tree

ReTrend – Retweet-tree Encoding

slide-24
SLIDE 24
  • Subscription Matrix

ReTrend – Retweet-tree Encoding

[Figure: the retweet-tree encoded as the Subscription Matrix S, with binary entries]

slide-25
SLIDE 25
  • Retweet Matrix

ReTrend – Retweet-tree Encoding

[Figure: the retweet-tree encoded as the Retweet Matrix R, with binary entries]

slide-26
SLIDE 26
  • Contagion Matrix

ReTrend – Retweet-tree Encoding

[Figure: the retweet-tree encoded as the Contagion Matrix C, with count entries such as 2 and 1]
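
The encoding of a retweet-tree into Retweet and Contagion entries can be sketched as follows. The sketch assumes each retweeting user has exactly one parent, that a Retweet entry is 1 when the user retweeted the post, and that the original author gets no Retweet entry. The function name, the sparse dictionary layout, and the example cascade are hypothetical.

```python
from collections import defaultdict


def encode_retweet_tree(post, parent_of):
    """Encode one retweet-tree as sparse Retweet and Contagion entries.

    parent_of maps each retweeting user to the user they retweeted from.
    Returns (retweet, contagion), both dicts keyed by (user, post).
    """
    retweet = {}
    contagion = defaultdict(int)
    for child, parent in parent_of.items():
        retweet[(child, post)] = 1       # the child retweeted this post
        contagion[(parent, post)] += 1   # the parent triggered one more retweet
    return retweet, dict(contagion)


# Hypothetical cascade: "a" writes post "p1"; "b" and "c" retweet it from
# "a"; "d" retweets it from "b".
R, C = encode_retweet_tree("p1", {"b": "a", "c": "a", "d": "b"})
# R == {("b", "p1"): 1, ("c", "p1"): 1, ("d", "p1"): 1}
# C == {("a", "p1"): 2, ("b", "p1"): 1}
```

A flat Retweet Matrix alone would record only that "b", "c", and "d" retweeted; the Contagion entries keep who triggered whom.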

slide-27
SLIDE 27
  • Dynamic inference on the most likely retweet-tree structure

ReTrend – Training

slide-28
SLIDE 28
  • Moreover, it is post-transcending

ReTrend – Training

slide-29
SLIDE 29

Modification

slide-30
SLIDE 30

Matrix Factorization – Drawbacks

  • Simple, fixed inner product: low non-linearity [4]
  • Complex inference in a low-dimensional latent space
  • Too many constraints

Pure linear operation: empirically low performance

slide-31
SLIDE 31

MLP Module – Optimization for MF

  • Replace multiplication with a simple MLP module.
  • Increase non-linearity

[Figure: Matrix A × Matrix B → Result, compared with Matrix A and Matrix B → MLP Module → Result]
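
The details of the MLP module are not preserved in this transcript. The sketch below only illustrates the general idea of swapping a fixed inner product for a small learned network, in the spirit of Neural Collaborative Filtering [4]. The layer sizes, the ReLU and sigmoid activations, the input concatenation, and all names are assumptions, and the weights are random rather than trained.

```python
import math
import random


def inner_product(u_vec, i_vec):
    """Matrix-factorization score: a fixed, purely linear interaction."""
    return sum(u * i for u, i in zip(u_vec, i_vec))


def make_mlp(in_dim, hidden_dim, seed=0):
    """Randomly initialize a one-hidden-layer MLP (untrained)."""
    rng = random.Random(seed)
    w1 = [[rng.gauss(0.0, 0.5) for _ in range(in_dim)] for _ in range(hidden_dim)]
    b1 = [0.0] * hidden_dim
    w2 = [rng.gauss(0.0, 0.5) for _ in range(hidden_dim)]
    b2 = 0.0
    return w1, b1, w2, b2


def mlp_score(u_vec, i_vec, params):
    """Score a (user, item) pair with a non-linear function of both vectors."""
    w1, b1, w2, b2 = params
    x = list(u_vec) + list(i_vec)  # concatenate the two latent vectors
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)  # ReLU layer
              for row, b in zip(w1, b1)]
    logit = sum(w * h for w, h in zip(w2, hidden)) + b2
    return 1.0 / (1.0 + math.exp(-logit))  # sigmoid maps the score into (0, 1)


# Hypothetical 3-dimensional latent vectors for one user and one post.
user_vec = [0.2, -0.1, 0.4]
post_vec = [0.3, 0.5, -0.2]
params = make_mlp(in_dim=6, hidden_dim=8)
linear_score = inner_product(user_vec, post_vec)
mlp_output = mlp_score(user_vec, post_vec, params)
```

In practice the weights would be learned jointly with, or on top of, the latent vectors.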

slide-32
SLIDE 32

MLP Module – Detail

slide-33
SLIDE 33

Experiments – Dataset

  • Real-world dataset from Twitter
  • More than 90,000 users and 99,696,204 related tweets [1][2].
  • 440,000+ subscriptions.
  • 2,370,000+ retweet behaviors.
  • 18,210,000+ un-retweet behaviors.
  • 18,210,000+ resistance tuples.
  • 2,170,000+ contagion tuples.

[1] https://www.aminer.cn/data-sna#Twitter-Dynamic-Net
[2] https://www.aminer.cn/data-sna#Twitter-Dynamic-Action

slide-34
SLIDE 34

Experiments – Implementation detail

  • For ReTrend:
  • Indicator matrices
  • Normalization for R, S, C, T
  • Latent feature dimension: 30
  • SGD:
  • Batch size: 1000
  • Training epochs: 100
  • Learning rate: 0.03, scaled to 0.03 × loss / 500 once the loss function falls below 500
  • For MLP Module:
  • Implemented in Keras and trained for 20 epochs
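
The learning-rate rule listed above can be written as a small function. This is a direct reading of the slide; the function and constant names are hypothetical.

```python
BASE_LR = 0.03          # base learning rate from the slide
LOSS_THRESHOLD = 500.0  # loss value below which the rate is scaled down


def learning_rate(loss):
    """Return 0.03 while the loss is at least 500, else 0.03 * loss / 500."""
    if loss >= LOSS_THRESHOLD:
        return BASE_LR
    return BASE_LR * loss / LOSS_THRESHOLD


rate_early = learning_rate(1200.0)  # loss still above the threshold
rate_late = learning_rate(250.0)    # loss at half the threshold
```

The two branches agree at a loss of exactly 500, so the step size shrinks continuously as training converges.
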
slide-35
SLIDE 35

Experiments – Performance

  • Plausibility Validation for MLP:
  • Plain features: only the identity of user and item
  • Embedding: latent vector for user and item

slide-36
SLIDE 36

Experiments – Performance

  • Baselines:
  • Random
  • Word-vector-based SVM [5]
  • Neural Collaborative Filtering [4]
slide-37
SLIDE 37

Experiments – Performance

  • ReTrend+MLP:
slide-38
SLIDE 38
  • [1] Jiang, Bo, Jiguang Liang, Ying Sha, and Lihong Wang. "Message clustering based matrix factorization model for retweeting behavior prediction." In Proceedings of the 24th ACM International on Conference on Information and Knowledge Management (CIKM), pp. 1843-1846. ACM, 2015.
  • [2] Cui, Peng, Fei Wang, Shaowei Liu, Mingdong Ou, Shiqiang Yang, and Lifeng Sun. "Who should share what?: item-level social influence prediction for users and posts ranking." In Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval, pp. 185-194. ACM, 2011.
  • [3] Li, Jun, Jiamin Qin, Tao Wang, Yi Cai, and Huaqing Min. "A Collaborative Filtering Model for Personalized Retweeting Prediction." In International Conference on Database Systems for Advanced Applications, pp. 122-134. Springer, Cham, 2015.
  • [4] He, Xiangnan, Lizi Liao, and Hanwang Zhang. "Neural Collaborative Filtering." arXiv preprint arXiv:1708.05031, 2017.
  • [5] Zhang, Q., Y. Gong, J. Wu, H. Huang, and X. Huang. "Retweet prediction with attention-based deep neural network." pp. 75-84. 2016.

References

slide-39
SLIDE 39

Thanks