
On the Dynamics of Topic-Based Communities in Online Knowledge-Sharing Networks

On the Dynamics of Topic-Based Communities in Online Knowledge-Sharing Networks. Anna Guimarães, Ana Paula Couto da Silva, Jussara Almeida. Department of Computer Science, UFMG (Brazil). September 21, 2015.


  1. On the Dynamics of Topic-Based Communities in Online Knowledge-Sharing Networks. Anna Guimarães, Ana Paula Couto da Silva, Jussara Almeida. Department of Computer Science, UFMG (Brazil). September 21, 2015

  2. Introduction • Online Knowledge-Sharing Networks – Wikis, Q&A sites, discussion forums – User-created and maintained discussions – Wealth of knowledge

  3. Introduction • Online Knowledge-Sharing Networks – Wikis, Q&A sites, discussion forums – User-created and maintained discussions – Wealth of knowledge • Prior research focuses on knowledge extraction by: – Detecting quality content [Agichtein et al., 2008] – Ranking questions and answers [Dalip et al., 2013] – Identifying expert users [Ravi et al., 2014, Wang et al., 2013]

  4. Introduction • More than repositories of knowledge! – Community structure surrounding discussions – Topics and communities subject to temporal changes – Multiple topics, multiple communities • This study: – Community approach to knowledge-sharing networks – Characterization and modeling of community evolution

  5. Case Study: Stack Overflow

  6. Case Study: Stack Overflow [Screenshot: Stack Overflow tags]

  7. Topic-Based Communities in Stack Overflow • Communities centered around topics – Topics are explicitly defined – Independent of the social interaction graph • Non-exclusive membership in multiple communities

  8. Stack Overflow Dataset • User activity – User ID, Tag ID, Timestamp • Data covering a six-year period – 2008–2014 • Dataset size – Tags: 400 – Posts: 19.8 million – Users: 1.7 million

  9. Topic-Based Communities in Stack Overflow • Temporal analyses of community activity in terms of: – How user behavior affects community sustainability – How users relate to communities in the long run – How users divide their attention across different communities – How communities affect one another

  10. Communities in Stack Overflow: Findings • Significant revisiting behavior – Users continue to contribute to the same community – Revisitors to a community grow more significant over time • Mean fraction of revisits (1st month / 6th month / 12th month) – Revisitors: 0.20 / 0.44 / 0.50 – Revisits: 0.27 / 0.46 / 0.50
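The revisit measurements above could be computed from the (User ID, Tag ID, Timestamp) records roughly as follows. This is a minimal sketch, not the authors' code. It assumes that "revisitors" means the fraction of a month's active users who had already posted in that community in an earlier month, and that "revisits" means the fraction of that month's posts written by such users. The function name and the month-index input are hypothetical.

```python
from collections import defaultdict


def revisit_fractions(events):
    """Compute per-community, per-month revisit fractions.

    events: iterable of (user_id, tag_id, month_index) tuples, one per post.
    Returns {(tag_id, month_index): (revisitor_fraction, revisit_fraction)}.
    The two definitions are assumptions made for this sketch.
    """
    # Group the posting users by community and month (one entry per post).
    posts = defaultdict(list)
    for user, tag, month in events:
        posts[(tag, month)].append(user)

    seen = defaultdict(set)  # tag -> users who posted in an earlier month
    result = {}
    # Sorting the keys visits each community's months in ascending order.
    for tag, month in sorted(posts):
        users = posts[(tag, month)]
        active = set(users)
        returning_users = active & seen[tag]
        returning_posts = sum(1 for u in users if u in seen[tag])
        result[(tag, month)] = (
            len(returning_users) / len(active),
            returning_posts / len(users),
        )
        seen[tag] |= active
    return result


sample = [
    ("u1", "python", 1), ("u2", "python", 1),
    ("u1", "python", 2), ("u1", "python", 2), ("u3", "python", 2),
]
fractions = revisit_fractions(sample)
print(fractions[("python", 2)])
```

In the sample, one of the two users active in month 2 is a returning user, and that user wrote two of the three posts.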

  11. Communities in Stack Overflow: Findings • Participation in multiple communities – 32% of users participate in up to 3 communities – Average user participates in 17 communities – Decaying pattern of activity over time [Figure: two panels, Communities vs. Months and Posts vs. Months, months 2 to 12]

  12. Communities in Stack Overflow: Findings • Migrating behavior – Users traverse different communities over time – Shared member base across communities • Example: Ruby on Rails 3 → Ruby on Rails 4 [Figure: # Members vs. Months, Feb 2013 to Aug 2014; series: Rails 3 Members, New Members]

  13. Communities in Stack Overflow: Findings • Migrating behavior – Users traverse different communities over time – Shared member base across communities • Example: MySQL → PHP [Figure: # Members vs. Months, Feb 2013 to Aug 2014; legend text: MySQL, New Members]
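One way to quantify a shared member base and migration between two communities is sketched below. The slides do not give the exact measures, so both definitions are assumptions: overlap is measured as Jaccard similarity of the member sets, and a migrant is a user whose first post in the target community comes after their first post in the source community. The function names and the sample tag labels are illustrative.

```python
def shared_member_fraction(members_a, members_b):
    """Jaccard similarity between two communities' member sets."""
    a, b = set(members_a), set(members_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def migrants(events, source, target):
    """Users whose first post in `target` follows their first post in `source`.

    events: iterable of (user_id, tag_id, timestamp) tuples, one per post.
    """
    first_post = {}  # (user, tag) -> earliest timestamp seen
    for user, tag, ts in events:
        key = (user, tag)
        if key not in first_post or ts < first_post[key]:
            first_post[key] = ts
    return {
        user
        for (user, tag), ts in first_post.items()
        if tag == source
        and (user, target) in first_post
        and first_post[(user, target)] > ts
    }


sample = [
    ("u1", "ruby-on-rails-3", 1), ("u1", "ruby-on-rails-4", 5),
    ("u2", "ruby-on-rails-4", 2), ("u2", "ruby-on-rails-3", 3),
    ("u3", "ruby-on-rails-3", 1),
]
print(migrants(sample, "ruby-on-rails-3", "ruby-on-rails-4"))
print(shared_member_fraction({"u1", "u2", "u3"}, {"u1", "u2"}))
```

Only `u1` counts as a migrant here, because `u2` posted in the target community first.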

  14. Communities in Stack Overflow: Findings • Key aspects dictating community evolution – Intra-community aspects: user revisits, continued activity – Inter-community aspects: shared member base, user migration

  15. How can we then describe community evolution?

  16. CERIS Model • CERIS – Community Evolution model with Revisits and Inter-community effectS • Goal: describe community activity (number of posts) over time • Incorporates revisits and community relationships

  17. CERIS Model • CERIS extends state-of-the-art models – Phoenix-R evolution model with revisits [Figueiredo et al., 2014] – Competition model [Beutel et al., 2012] • Epidemiological approach to network dynamics – Objects in the network are modeled as infections

  18. CERIS Model • Users are initially exposed to different communities [Diagram: state S]

  19. CERIS Model • Users become infected by participating in a community [Diagram: states S, I_1, I_2; transitions labeled β_1, β_2]

  20. CERIS Model • Users can recover by ceasing activity in a community [Diagram: states S, I_1, I_2; transitions labeled β_1, β_2, γ_1, γ_2]

  21. CERIS Model • Or they can be infected by additional communities [Diagram: states S, I_1, I_2, I_{1,2}; transitions labeled β_1, β_2, γ_1, γ_2, εβ_1, εβ_2]

  22. CERIS Model • Revisits to the same community are captured by hidden states [Diagram: states S, I_1, I_2, I_{1,2}, V_1, V_2, V_{1,2}; transitions labeled β_1, β_2, γ_1, γ_2, εβ_1, εβ_2, ω_1, ω_2, ω_{1,2}]

  23. CERIS Model [Diagram: state diagram with states S, I_1, I_2, I_{1,2}, V_1, V_2, V_{1,2} and transitions labeled β_1, β_2, γ_1, γ_2, εβ_1, εβ_2, ω_1, ω_2, ω_{1,2}, plus additional labels s_1, ..., s_n and v̂_1]
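To make the state diagram more concrete, here is a simplified mean-field simulation of a two-community compartment model using the rate names from the slides (β, γ, ω, ε). It is a sketch under stated assumptions, not the CERIS model itself: the direction and form of every transition are guesses from the diagram labels, the V_{1,2} state and the per-shock replication are omitted, and the function name, population size, and step size are hypothetical.

```python
def simulate(beta1, beta2, gamma1, gamma2, omega1, omega2, eps,
             n_users=1000.0, seed_users=1.0, steps=200, dt=0.1):
    """Forward-Euler simulation of an assumed two-community model.

    States: S (exposed), I1, I2 (active in one community),
    I12 (active in both), V1, V2 (inactive, may revisit).
    Returns a list of (S, I1, I2, I12, V1, V2) tuples, one per step.
    """
    s = n_users - 2 * seed_users
    i1 = i2 = seed_users
    i12 = v1 = v2 = 0.0
    history = []
    for _ in range(steps):
        # Fraction of the population currently active in each community.
        active1 = (i1 + i12) / n_users
        active2 = (i2 + i12) / n_users
        # Assumed flows between states (all illustrative).
        s_to_i1 = beta1 * s * active1          # first participation
        s_to_i2 = beta2 * s * active2
        i1_to_i12 = eps * beta2 * i1 * active2  # joining a second community
        i2_to_i12 = eps * beta1 * i2 * active1
        i1_to_v1 = gamma1 * i1                  # ceasing activity
        i2_to_v2 = gamma2 * i2
        i12_to_i2 = gamma1 * i12                # leaving community 1 only
        i12_to_i1 = gamma2 * i12                # leaving community 2 only
        v1_to_i1 = omega1 * v1                  # revisits
        v2_to_i2 = omega2 * v2
        s += dt * (-s_to_i1 - s_to_i2)
        i1 += dt * (s_to_i1 + i12_to_i1 + v1_to_i1 - i1_to_i12 - i1_to_v1)
        i2 += dt * (s_to_i2 + i12_to_i2 + v2_to_i2 - i2_to_i12 - i2_to_v2)
        i12 += dt * (i1_to_i12 + i2_to_i12 - i12_to_i1 - i12_to_i2)
        v1 += dt * (i1_to_v1 - v1_to_i1)
        v2 += dt * (i2_to_v2 - v2_to_i2)
        history.append((s, i1, i2, i12, v1, v2))
    return history


trajectory = simulate(0.8, 0.6, 0.1, 0.1, 0.05, 0.05, 0.5)
print(trajectory[-1])
```

Every flow leaves one state and enters another, so the total number of users stays constant, which is a quick sanity check on the bookkeeping.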

  24. CERIS Model • Analyzes the time series of the number of posts in all communities simultaneously • The contagion process starts following "shocks" – Wavelet method to identify activity peaks as shock candidates – e.g., when a new related community becomes active • Model fitting with the Levenberg-Marquardt algorithm and Minimum Description Length

  25. CERIS Model Results [Figure: two panels of observed vs. fitted activity over time. "HTML and CSS", 2009 to 2014; series: css, html, model. "iOS versions"; series: ios5, ios6, ios7, model]

  26. CERIS Model Results • Model results: – Reasonably accurate fittings – Captures different patterns of activity – Captures concurrent evolution of related communities • RMSE – HTML and CSS: 3046.895 – iOS versions: 13.612 – All (mean, daily): 21.131
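For reference, the RMSE reported above is the root-mean-square error between the observed and the fitted post counts. A minimal sketch follows; the function name and the sample numbers are illustrative and are not taken from the dataset.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error between two equally long series."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    squared_error = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared_error / len(observed))


# Hypothetical daily post counts and model output.
print(rmse([120, 135, 150], [118, 140, 149]))
```

RMSE is expressed in the same unit as the series (posts), so values are only comparable between communities with a similar activity volume.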

  27. CERIS Model Results • Model outputs used to quantify the relationship between communities • Flow of users between communities: $\mathrm{flow}_{C_1,C_2}(t) = \epsilon\beta_2(t)$ and $\mathrm{flow}_{C_2,C_1}(t) = \epsilon\beta_1(t)$

  28. CERIS Model Results [Figure: pairwise values between communities, two panels titled "Top 100" and "Top 15"; axes: Communities. Top 15 communities shown: .net, objective-c, asp.net, css, mysql, ios, c++, html, python, jquery, android, php, c#, javascript, java]

  29. Conclusions • Knowledge-sharing networks as a community environment – Topic-based communities defined by users interacting with topics of their interest • Investigation of topic-based communities in Stack Overflow – User activity in terms of communities they belong to – Impact of related communities • New model to describe community evolution – Incorporates key factors behind community activity – Good portrayal of the co-evolution of multiple communities

  30. Thank you! Anna Guimarães, anna@dcc.ufmg.br

  31. References I • Agichtein, E., Castillo, C., Donato, D., Gionis, A., and Mishne, G. (2008). Finding High-Quality Content in Social Media. In Proc. WSDM. • Beutel, A., Prakash, B. A., Rosenfeld, R., and Faloutsos, C. (2012). Interacting Viruses in Networks: Can Both Survive? In Proc. ACM SIGKDD.

  32. References II • Dalip, D. H., Gonçalves, M. A., Cristo, M., and Calado, P. (2013). Exploiting User Feedback to Learn to Rank Answers in Q&A Forums: A Case Study with Stack Overflow. In Proc. ACM SIGIR. • Figueiredo, F., Almeida, J. M., Matsubara, Y., Ribeiro, B., and Faloutsos, C. (2014). Revisit Behavior in Social Media: The Phoenix-R Model and Discoveries. In Proc. PKDD.

  33. References III • Hansen, M. H. and Yu, B. (2001). Model Selection and the Principle of Minimum Description Length. Journal of the American Statistical Association, 96(454). • Moré, J. J. (1978). The Levenberg-Marquardt Algorithm: Implementation and Theory. In Numerical Analysis, pages 105–116. Springer. • Ravi, S., Pang, B., Rastogi, V., and Kumar, R. (2014). Great Question! Question Quality in Community Q&A. In Proc. ICWSM.

  34. References IV • Wang, X., Butler, B. S., and Ren, Y. (2013). The Impact of Membership Overlap on Growth: An Ecological Competition View of Online Groups. Organization Science, 24(2):414–431.
