
Inter-event Time Distributions in Online Human Behavior - PowerPoint PPT Presentation

Persistence and Distinctiveness of Inter-event Time Distributions in Online Human Behavior. Jiwan Jeong and Sue Moon, School of Computing, KAIST. In TempWeb ’17 (WWW ’17 Companion), April 3, 2017.


  1. Persistence and Distinctiveness of Inter-event Time Distributions in Online Human Behavior • Jiwan Jeong and Sue Moon, School of Computing, KAIST • In TempWeb ’17 (WWW ’17 Companion), April 3, 2017

  2. What is inter-event time? • Time gap between two consecutive events • E.g., earthquake waves, packet arrivals, …

  3. Our definition of inter-event time • Time gap between two consecutive actions in a service by one person • E.g., tweeting, blog posting, email sending, … • Simply put • Inter-event time = interval • Inter-event time distribution = interval pattern
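The definition on slide 3 can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the function name `inter_event_times` and the sample timestamps are hypothetical.

```python
from datetime import datetime


def inter_event_times(timestamps):
    """Return the gaps, in seconds, between consecutive actions of one person."""
    ordered = sorted(timestamps)
    return [(later - earlier).total_seconds()
            for earlier, later in zip(ordered, ordered[1:])]


# Hypothetical posting times for a single user on a single service.
posts = [
    datetime(2017, 4, 3, 9, 0, 0),
    datetime(2017, 4, 3, 9, 0, 30),
    datetime(2017, 4, 3, 10, 0, 30),
]
intervals = inter_event_times(posts)
```

Three actions yield two intervals; the collection of such intervals for one user is what the slides call an interval pattern.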

  4. Previous studies focused on • Characterizing aggregate interval patterns • Web re-visit pattern [Adar CHI 2007][Adar CHI 2008] • Web browsing pattern [Kumar WWW 2010] • Service usage pattern [Halfaker WWW 2015] • Finding universal laws among interval patterns • Power-law by priority queuing process [Barabasi Nature 2005] • Log-normal by non-homogeneous Poisson process [Malmgren PNAS 2008]

  5. We focus on individual-level • How does an individual’s interval pattern change over time? • Does it remain consistent or fluctuate from time to time? • How distinctive is it from those of others?

  6. Individuals have interval patterns that are persistent over time, but distinctive from others.

  7. Tweets by Ellen DeGeneres • [Figure: Twitter timeline with cut marks (✂)]

  8. Tweets by Jimmy Fallon

  9. Tweets by Sue Moon

  10. Tweets by Albert-László Barabási

  11. Tweets by Eytan Adar

  12. Tweets by Aaron Clauset

  13. Tweets by Nicholas Christakis

  14. Tweets by Alex Vespignani

  15. Tweets by Andrew Ng

  16. Tweets by Ed Chi

  17. Tweets by Bruno Gonçalves

  18. Tweets by Haewoon Kwak

  19. Tweets by Carlos Castillo

  20. Tweets by Peter Dodds

  21. In this work • Design a computation framework to quantify interval patterns • Show their persistence and distinctiveness • Use interval patterns to distinguish one user from others

  22. Datasets for this study • Wikipedia: 15 years of entire history • me2day: 7 years of entire history • Twitter: 3000 recent tweets per user • Enron: 3 years of email history

  23. Estimate interval patterns → Compare interval patterns → Design computation framework

  24. Estimate interval patterns → Compare interval patterns → Design computation framework

  25. Convert discrete intervals to continuous PDF?

  26. Gaussian kernel density estimation • For multi-modal distributions, we use Sheather and Jones’ bandwidth [Sheather J R Stat Soc B 1991]
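A rough sketch of the estimation step on slide 26, in plain Python. Two points here are assumptions and not taken from the slides: the intervals are log10-transformed before smoothing, and the bandwidth uses the normal reference rule of thumb as a simple stand-in for the Sheather and Jones bandwidth, which is considerably more involved. The name `gaussian_kde` and the sample data are hypothetical.

```python
import math
import statistics


def gaussian_kde(samples, grid, bandwidth=None):
    """Evaluate a Gaussian kernel density estimate of `samples` at each grid point."""
    n = len(samples)
    if bandwidth is None:
        # ASSUMPTION: normal reference rule (1.06 * sigma * n^(-1/5)),
        # a stand-in for the Sheather-Jones bandwidth named on the slide.
        bandwidth = 1.06 * statistics.stdev(samples) * n ** (-1 / 5)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)
        for x in grid
    ]


# Hypothetical intervals in seconds.
# ASSUMPTION: smoothing is done on a log10 scale because intervals span
# several orders of magnitude.
intervals = [30, 45, 60, 3600, 4000, 86400]
log_intervals = [math.log10(t) for t in intervals]

step = 0.05
grid = [-5.0 + step * i for i in range(341)]  # log10(seconds) from -5 to 12
density = gaussian_kde(log_intervals, grid)
```

The resulting `density` is a continuous curve sampled on a grid, so two users (or two time windows of one user) can be compared point by point even when their raw interval counts differ.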

  27. Now, we can estimate interval patterns!

  28. Estimate interval patterns → Compare interval patterns → Design computation framework

  29. Calculate distance between interval patterns?

  30. Jensen-Shannon distance • A metric of the difference between probability density functions • Non-negative: $d(x, y) \ge 0$ • Identity of indiscernibles: $d(x, y) = 0$ iff $x = y$ • Symmetry: $d(x, y) = d(y, x)$ • Subadditivity: $d(x, z) \le d(x, y) + d(y, z)$
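As an illustration of slide 30, the Jensen-Shannon distance between two densities sampled on a shared grid can be sketched as follows. It is the square root of the Jensen-Shannon divergence; the base-2 logarithm (which bounds the distance to [0, 1]) and the function names are choices made for this sketch, not details stated on the slides.

```python
import math


def _kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_distance(p, q):
    """Jensen-Shannon distance between two non-negative vectors on the same grid."""
    total_p, total_q = sum(p), sum(q)
    p = [x / total_p for x in p]  # normalise so each vector sums to 1
    q = [x / total_q for x in q]
    m = [(a + b) / 2.0 for a, b in zip(p, q)]  # mixture distribution
    divergence = 0.5 * _kl_divergence(p, m) + 0.5 * _kl_divergence(q, m)
    return math.sqrt(max(0.0, divergence))  # guard against tiny negative rounding


# Hypothetical discretised interval patterns.
p = [0.5, 0.5, 0.0]
q = [0.0, 0.5, 0.5]
r = [0.25, 0.5, 0.25]
d_pq = js_distance(p, q)
```

Because the mixture `m` is positive wherever `p` or `q` is, the distance stays finite even when the two patterns have disjoint support, which plain Kullback-Leibler divergence does not guarantee.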

  31. Now, we can compare interval patterns!

  32. Estimate interval patterns → Compare interval patterns → Design computation framework

  33. Define self-distance and reference distance • $d_{self}$ • $d_{ref}$

  34. Experimental settings for longitudinal analysis • Select users with 500+ actions on each service • Divide each user’s timeline into 10 windows $W_1, W_2, \ldots, W_9, W_{10}$ • $\binom{10}{2} = 45$ self-distances for each user • $10 \times 10 = 100$ reference distances for each pair of users
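The counts on slide 34 follow directly from pairing windows, as this small sketch shows. Splitting into windows of equal interval count is an assumption (the slides do not say whether windows are equal in count or in duration), and the helper names are hypothetical.

```python
from itertools import combinations, product


def split_windows(intervals, k=10):
    """Split a user's intervals into k consecutive windows of equal count.

    ASSUMPTION: equal-count windows; any remainder at the end is dropped.
    """
    size = len(intervals) // k
    return [intervals[i * size:(i + 1) * size] for i in range(k)]


def self_pairs(windows):
    """Index pairs (i, j), i < j, of two windows belonging to the same user."""
    return list(combinations(range(len(windows)), 2))


def reference_pairs(windows_a, windows_b):
    """Index pairs (i, j) of one window from user A and one from user B."""
    return list(product(range(len(windows_a)), range(len(windows_b))))


# Hypothetical users with 500 intervals each.
user_a = split_windows(list(range(500)))
user_b = split_windows(list(range(500, 1000)))
n_self = len(self_pairs(user_a))
n_ref = len(reference_pairs(user_a, user_b))
```

Each self pair yields one $d_{self}$ value and each reference pair one $d_{ref}$ value once a distance between the two windows' interval patterns is computed.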

  35. Persistence & Distinctiveness

  36. Persistence and distinctiveness are relative • If $d_{self}$ is small, the pattern is persistent • How small should it be? • If $d_{self} < d_{ref}$, the pattern is persistent [Saramäki PNAS 2014] • Furthermore, if $d_{self} \ll d_{ref}$, the patterns are distinctive

  37. $d_{self}$ vs $d_{ref}$

  38. How long do interval patterns persist? • Binning $d_{self}$ by the time gap between two windows $W_i$ and $W_j$ • Compare binned $d_{self}$ with overall $d_{ref}$

  39. Persistence over time • Binned into 6 groups

  40. Persistence over time

  41. Persistence over time

  42. Do interval patterns persist after long inactivity? • Binning $d_{self}$ by the longest interval between two windows $W_i$ and $W_j$ • Compare binned $d_{self}$ with overall $d_{ref}$

  43. Persistence after inactivity

  44. Persistence after inactivity

  45. Do interval patterns persist through changing daily routine? • Binning $d_{self}$ by the circadian distance between two windows $W_i$ and $W_j$ • [Figure: circadian distance, hour-of-day axes from 0 to 24]

  46. Persistence through changing daily routine

  47. In summary, • Individuals have interval signatures that persist over years • The signatures persist even after coming back from long inactivity • The signatures persist through changing daily routine

  48. Application: User Identification Using Interval Signatures

  49. User identification: Problem definition • Given two windows $W_A$ and $W_B$, each containing 100 intervals • Can we determine whether they are from the same user or not?

  50. A very simple identifier • Calculate the distance $d$ between $W_A$ and $W_B$ • If $d <$ threshold, same user • Else, different users

  51. Identification performance (1 − Equal Error Rate)

                      Wikipedia   me2day   Twitter   Enron
      Consecutive     80%         87%      83%       76%
      > 1 year gap    71%         78%      76%       71%

      • Performance of other behavioral biometrics • Keystroke dynamics: ~90% [Peacock IEEE S&P 2004] • Mouse dynamics: ~80% [Jorgensen AsiaCCS 2011] • Gaits: ~80% [Gafurov University of Oslo 2008]
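The threshold identifier and the 1 − Equal Error Rate score can be sketched as below. This is a simplified illustration under stated assumptions: the equal error rate is approximated by sweeping every observed distance as a candidate threshold and taking the point where false accepts and false rejects are closest, and the distances are made-up toy values, not results from the study. The function names are hypothetical.

```python
def is_same_user(distance, threshold):
    """Declare two windows to be from the same user when they are close enough."""
    return distance < threshold


def equal_error_rate(d_self, d_ref):
    """Approximate the equal error rate and the threshold that achieves it.

    False reject: a same-user pair with distance >= threshold.
    False accept: a different-user pair with distance < threshold.
    """
    best = None
    for t in sorted(set(d_self) | set(d_ref)):
        frr = sum(d >= t for d in d_self) / len(d_self)
        far = sum(d < t for d in d_ref) / len(d_ref)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2.0, t)
    return best[1], best[2]


# Toy distances for illustration only (ASSUMPTION: not data from the study).
toy_self = [0.1, 0.2, 0.3, 0.6]   # same-user window pairs
toy_ref = [0.25, 0.5, 0.7, 0.8]   # different-user window pairs
eer, threshold = equal_error_rate(toy_self, toy_ref)
performance = 1.0 - eer
```

On the toy data the two error rates meet at a threshold of 0.5, where one of four same-user pairs is rejected and one of four different-user pairs is accepted.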

  52. Follow-up questions • What do people with similar interval signatures have in common? • What can be inferred about users by analyzing interval signatures? • How are interval signatures related to other personal characteristics?

  53. Interval Signature: Persistence and Distinctiveness of Inter-event Time Distributions in Online Human Behavior • Q&A

  54. Dataset statistics

      # of users            Wikipedia   me2day   Twitter   Enron
      With >25 actions      521K        587K     921K      937K
      With >100 actions     165K        203K     768K      542K
      With >500 actions     47K         43K      334K      65K

  55. $d_{self}$ vs $d_{ref}$ at different window sizes

  56. K-means clustering of interval patterns

  57. Joint probability matrix for transition $W_i \to W_{i+1}$
