Inferring stuff from observed networks, 16.5.2012, David Stolz



slide-1
SLIDE 1

16.5.2012 David Stolz

Inferring “stuff” from observed networks
slide-2
SLIDE 2

2

Agenda

  1. Structure of Approaches
  2. Recommendation Network
  3. Blogs
  4. “Meta-Conclusion”

slide-3
SLIDE 3

3

Structure of Approaches

Understand Data Define Goals / Categorize Method Infer Compare Add Knowledge

slide-4
SLIDE 4

4

Recommendation Network

slide-5
SLIDE 5

5

Recommendation Network

  • 4 million users
  • 16 million recommendations
  • Only ~3% of purchases associated with a recommendation
  • 2 years
  • Monetary benefit for recommender and recommendee

slide-6
SLIDE 6

6

Recommendation Network

  • Analyze cascades
  • Categorize by different product categories
  • Books, DVD, Music, Video
slide-7
SLIDE 7

7

Recommendation Network

  • Remove:
      • No-purchase nodes
      • Late recommendations
  • Find all local subgraphs

Isomorphism test
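
The preprocessing and counting steps on this slide can be sketched roughly as follows. This is a minimal illustration, not the paper's procedure: it assumes a "late" recommendation is one that arrives after the recipient's purchase, it uses weakly connected components as a stand-in for the "local subgraphs", and it tests isomorphism by brute force, which only works for small cascades. All function names and the `(sender, receiver, time)` record layout are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import permutations

def clean_edges(recs, purchase_time):
    """Keep a recommendation (src, dst, t) only if both endpoints purchased
    and it reached dst no later than dst's purchase (assumed meaning of 'late')."""
    kept = set()
    for src, dst, t in recs:
        if src in purchase_time and dst in purchase_time and t <= purchase_time[dst]:
            kept.add((src, dst))
    return kept

def components(edges):
    """Weakly connected components, each returned as (nodes, edges)."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, comps = set(), []
    for start in adj:
        if start in seen:
            continue
        stack, nodes = [start], set()
        while stack:
            n = stack.pop()
            if n in nodes:
                continue
            nodes.add(n)
            stack.extend(adj[n] - nodes)
        seen |= nodes
        comps.append((nodes, [(u, v) for u, v in edges if u in nodes]))
    return comps

def canonical_form(nodes, edges):
    """Brute-force isomorphism key: the smallest relabelled edge list over
    all node orderings. Factorial cost, so only usable for small cascades."""
    nodes = sorted(nodes)
    best = None
    for perm in permutations(range(len(nodes))):
        label = dict(zip(nodes, perm))
        key = tuple(sorted((label[u], label[v]) for u, v in edges))
        if best is None or key < best:
            best = key
    return (len(nodes), best)

def cascade_counts(recs, purchase_time):
    """Count how often each cascade shape (isomorphism class) occurs."""
    counts = Counter()
    for nodes, edges in components(clean_edges(recs, purchase_time)):
        counts[canonical_form(nodes, edges)] += 1
    return counts
```

Calling `cascade_counts(recs, purchase_time).most_common(1)` would then give the most frequently observed shape in this simplified setting.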

slide-8
SLIDE 8

8

Recommendation Network

  • Most frequently observed cascade?
slide-9
SLIDE 9

9

Recommendation Network

  • Most frequently observed cascade?
  • Differences: Books, DVD, Music, Video?
slide-10
SLIDE 10

10

Recommendation Network

  • Most frequently observed cascade?
  • Differences:
  • Books: 70%
  • DVD: 12%
  • Music: 86.4%
  • Video: 74%

slide-11
SLIDE 11

11

Recommendation Network

  • Overall: splits = 5 * collisions
  • Simple graphs are sometimes rarer than complex graphs
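
As a rough illustration of the split/collision ratio, the sketch below counts both on a directed edge list. The definitions are assumptions, since the slide does not spell them out: a "split" is taken to be a node that recommends to at least two others, and a "collision" a node that receives recommendations from at least two others. The function name is hypothetical.

```python
from collections import Counter

def splits_and_collisions(edges):
    """Count split nodes (out-degree >= 2) and collision nodes (in-degree >= 2).

    Both definitions are assumptions made for this sketch.
    """
    out_deg = Counter(u for u, _ in edges)
    in_deg = Counter(v for _, v in edges)
    splits = sum(1 for d in out_deg.values() if d >= 2)
    collisions = sum(1 for d in in_deg.values() if d >= 2)
    return splits, collisions
```

Dividing the first value by the second gives the ratio the slide refers to, under these assumed definitions.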

slide-12
SLIDE 12

12

Recommendation Network

Paper Conclusions

  • Most cascades are small
  • Underlying social networks lead to (measurably) more complex cascades

slide-13
SLIDE 13

13

Recommendation Network

slide-14
SLIDE 14

14

Recommendation Network

slide-15
SLIDE 15

15

Blogs

[ http://cluculzwriter.blogspot.com/ ]

slide-16
SLIDE 16

16

Blogs

  • 4 Years (1999 – 2002)
  • 25'000 Blogs
  • 750'000 Links (between blogs)
slide-17
SLIDE 17

17

Blogs

  • Exact notion of time
  • Only actual entries
  • Filter out “Side-bars”
slide-18
SLIDE 18

18

Blogs

  • Time characteristics
  • Community structure
  • Bursts
slide-19
SLIDE 19

19

Blogs

Time Graph:

  • Label Edges with time
  • Label Nodes with time interval
  • Prefix graph G_t: subgraph of G up to time t
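
The prefix graph can be sketched in a few lines, assuming the time graph is stored as a list of `(source, target, timestamp)` tuples. That storage layout and the function name are assumptions made for illustration.

```python
def prefix_graph(timed_edges, t):
    """Return G_t as (nodes, edges): every edge with timestamp <= t,
    together with the endpoints of those edges."""
    edges = [(u, v, ts) for u, v, ts in timed_edges if ts <= t]
    nodes = {n for u, v, _ in edges for n in (u, v)}
    return nodes, edges
```

Evaluating a statistic on `prefix_graph(timed_edges, t)` for increasing `t` gives its evolution over time.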
slide-20
SLIDE 20

20

Blogs

Community Extraction

  • Two step algorithm:
  • Find new community
  • Expand it
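
One possible reading of the two-step algorithm is sketched below. The slide does not give the seed or growth rule, so both are assumptions here: a new community is seeded by a triangle of mutually linked, not yet assigned blogs, and it grows by absorbing any blog with at least `min_links` links into it. Links are treated as undirected, and all names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def build_adj(edges):
    """Undirected adjacency sets, ignoring self-links."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return adj

def find_seed(adj, used):
    """Step 1: first triangle whose nodes belong to no community yet."""
    for u in sorted(adj):
        if u in used:
            continue
        free = sorted(n for n in adj[u] if n not in used)
        for v, w in combinations(free, 2):
            if w in adj[v]:
                return {u, v, w}
    return None

def expand(adj, community, used, min_links=2):
    """Step 2: keep adding nodes with >= min_links links into the community."""
    grown = True
    while grown:
        grown = False
        frontier = {n for c in community for n in adj[c]} - community - used
        for n in sorted(frontier):
            if len(adj[n] & community) >= min_links:
                community.add(n)
                grown = True
    return community

def extract_communities(edges, min_links=2):
    """Alternate seed search and expansion until no seed is left."""
    adj = build_adj(edges)
    used, communities = set(), []
    while True:
        seed = find_seed(adj, used)
        if seed is None:
            return communities
        community = expand(adj, seed, used, min_links)
        used |= community
        communities.append(community)
```

Running `extract_communities` on the edges of successive prefix graphs would give community counts over time in this simplified model.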
slide-21
SLIDE 21

21

Blogs

Communities (based on Prefix Graphs)

Dec 2001

slide-22
SLIDE 22

22

Blogs

Communities (based on Prefix Graphs)

Fraction ∈ [0,16] ?

Dec 2001

slide-23
SLIDE 23

23

Blogs

SCC Comparison against “Random” Graph

Dec 2001 Dec 2001

Observed “Random”
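
A comparison of this kind could be set up as in the sketch below, which measures the fraction of nodes in the largest strongly connected component (SCC) and builds a randomized counterpart of the graph. The null model is an assumption, since the slide does not say how the "random" graph was generated: here each edge keeps its source and gets a uniformly random target. Function names are hypothetical.

```python
import random
from collections import defaultdict

def largest_scc_fraction(nodes, edges):
    """Fraction of nodes in the largest SCC, via Kosaraju's two passes."""
    fwd, rev = defaultdict(list), defaultdict(list)
    for u, v in edges:
        fwd[u].append(v)
        rev[v].append(u)

    # Pass 1: record nodes in order of DFS completion on the forward graph.
    order, seen = [], set()
    for root in nodes:
        if root in seen:
            continue
        seen.add(root)
        stack = [(root, iter(fwd[root]))]
        while stack:
            node, it = stack[-1]
            nxt = next((n for n in it if n not in seen), None)
            if nxt is None:
                order.append(node)
                stack.pop()
            else:
                seen.add(nxt)
                stack.append((nxt, iter(fwd[nxt])))

    # Pass 2: sweep the reversed graph in decreasing completion order;
    # each sweep collects exactly one SCC.
    seen, best = set(), 0
    for root in reversed(order):
        if root in seen:
            continue
        seen.add(root)
        stack, size = [root], 0
        while stack:
            node = stack.pop()
            size += 1
            for n in rev[node]:
                if n not in seen:
                    seen.add(n)
                    stack.append(n)
        best = max(best, size)
    return best / len(nodes) if nodes else 0.0

def random_baseline(nodes, edges, seed=0):
    """Assumed null model: keep each edge's source, redraw its target."""
    rng = random.Random(seed)
    nodes = list(nodes)
    return [(u, rng.choice(nodes)) for u, _ in edges]
```

Comparing `largest_scc_fraction(nodes, edges)` with `largest_scc_fraction(nodes, random_baseline(nodes, edges))` at each time step mirrors the observed-versus-random plot pair on this slide.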

slide-24
SLIDE 24

24

Blogs

Bursts

Dec 2001

slide-25
SLIDE 25

25

Blogs

Paper Conclusions

  • End of 2001:
  • #Communities: increased
  • Connectedness: increased
  • Burstiness: increased

User behavior has changed

slide-26
SLIDE 26

26

Blogs

In another community, a blogger Dawn hosts a poll to determine the funniest and sexiest blogger. She conducts interviews with other bloggers in the community, of course listing their sites. She then becomes obsessed with one of the other bloggers Jim, which spurs comments by many others in the community.

slide-27
SLIDE 27

27

Blogs

In another community, a blogger Dawn hosts a poll to determine the funniest and sexiest blogger. She conducts interviews with other bloggers in the community, of course listing their sites. She then becomes obsessed with one of the other bloggers Jim, which spurs comments by many others in the community.

slide-28
SLIDE 28

28

“Meta-Conclusion”

  • Empirical results matter, even if they don't astonish
  • Every step of the 4-step approach influences the result!
  • Talk is silver, silence is golden. (= don't publish papers just for the sake of publishing them)

slide-29
SLIDE 29

29


slide-30
SLIDE 30

30

Discussion