Querying Term Associations and their Temporal Evolution in Social Data


  1. Querying Term Associations and their Temporal Evolution in Social Data
     Vassilis Plachouras, Yannis Stavrakas
     IMIS / "ATHENA" R.C., Greece
     August 31, 2012
     Outline: Motivation & Objective · Temporal term association model · Query operators · Running example · Conclusions

  2. Motivation
     • Many applications use data from online social networks (OSNs) or microblogging services
     • Data is collected by searching for terms related to the application domain
     • The selection of terms can have a significant impact on the results
     • It is important to be able to explore the context and associations of terms

  3. Objective
     • We aim to develop a platform that enables the definition of data analysis campaigns over data from OSNs
     • Example: a journalist exploring Twitter data can issue the following query concerning the financial crisis:
       "For the period during which there is a strong association between hashtags #crisis and #protest, which other hashtags are associated with both #crisis and #protest? Which are the relevant tweets?"

  4. Preliminaries
     • The model applies to any temporally evolving collection of documents
     • We focus on tweets
     • Downloaded tweets are processed at regular time instances t = 1, 2, ..., i
     • At time instance t = i, we process the tweets downloaded between i − 1 and i:
       • load the tweets into relation TT with attributes tweet id, publication time, and term
       • build the model for the tweets published between i − 1 and i
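The processing step above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the row layout of TT follows the three attributes named on the slide, while the function name `tweets_for_instance`, the numeric publication times, and the half-open interval (i − 1, i] are assumptions made here.

```python
from collections import defaultdict

# Relation TT: one row per (tweet_id, publication_time, term).
# Publication times are expressed in units of the processing interval (assumed).
TT = [
    (1, 0.2, "a"),
    (2, 0.5, "a"), (2, 0.5, "b"),
    (3, 1.4, "a"), (3, 1.4, "c"),
]


def tweets_for_instance(tt_rows, i):
    """Return the term sets of the tweets published in the interval (i - 1, i]."""
    terms = defaultdict(set)
    for tweet_id, published, term in tt_rows:
        if i - 1 < published <= i:
            terms[tweet_id].add(term)
    return list(terms.values())


batch_1 = tweets_for_instance(TT, 1)  # tweets 1 and 2
batch_2 = tweets_for_instance(TT, 2)  # tweet 3
```

Each batch is then the input from which the model for that time instance is built.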

  5. Model definition
     Model M is a set of quintuples M = {⟨n, c, w, T, g⟩} where
     • n and c are the target and context nodes, respectively, corresponding to terms
     • T is the set of time instances for which the tuple is valid
     • g is the time granularity
     • w is the association weight, with the sums taken over the tweets tw published in T:
       w = P_T(n \to c) = \frac{\sum_{n, c \in tw} \frac{1}{|tw| - 1}}{\sum_{n \in tw} 1}
       or, when the context is the target term itself,
       w = P_T(n \to n) = \frac{\sum_{n \in tw,\ |tw| = 1} 1}{\sum_{n \in tw} 1}

  8. Example of Model
     Build model M for the tweets tw_i in two time instances:
     t = 1: tw_1 = {a}, tw_2 = {a}, tw_3 = {a, b}, tw_4 = {c}, tw_5 = {a, c}
     t = 2: tw_6 = {a}, tw_7 = {a, c}
     • For tuple ⟨a, b, w, {1}, 1⟩ ∈ M, w = 1/4 = 0.25
     The model M is
     M = {⟨a, b, 0.25, {1}, 1⟩, ⟨a, c, 0.25, {1}, 1⟩, ⟨b, a, 1.00, {1}, 1⟩,
          ⟨c, a, 0.50, {1}, 1⟩, ⟨a, a, 0.50, {1}, 1⟩, ⟨c, c, 0.50, {1}, 1⟩,
          ⟨a, c, 0.50, {2}, 1⟩, ⟨c, a, 1.00, {2}, 1⟩, ⟨a, a, 0.50, {2}, 1⟩}
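The weights in this example can be checked with a short sketch. It assumes the reading of the model definition in which a tweet containing n contributes 1/(|tw| − 1) to every other term it contains, and a single-term tweet contributes 1 to the self-association of n. The names `Quintuple` and `build_model` are illustrative and do not come from the slides.

```python
from collections import defaultdict
from typing import NamedTuple


class Quintuple(NamedTuple):
    n: str          # target node (term)
    c: str          # context node (term)
    w: float        # association weight P_T(n -> c)
    T: frozenset    # time instances for which the tuple is valid
    g: int          # time granularity


def build_model(tweets_by_time, g=1):
    """Build quintuples from a mapping {time instance: [set of terms, ...]}."""
    model = set()
    for t, tweets in tweets_by_time.items():
        numer = defaultdict(float)  # (n, c) -> sum of 1 / (|tw| - 1)
        denom = defaultdict(int)    # n -> number of tweets containing n
        for tw in tweets:
            for n in tw:
                denom[n] += 1
                if len(tw) == 1:
                    numer[(n, n)] += 1.0  # tweet containing only n
                else:
                    for c in tw - {n}:
                        numer[(n, c)] += 1.0 / (len(tw) - 1)
        for (n, c), value in numer.items():
            model.add(Quintuple(n, c, value / denom[n], frozenset({t}), g))
    return model


tweets_by_time = {
    1: [{"a"}, {"a"}, {"a", "b"}, {"c"}, {"a", "c"}],
    2: [{"a"}, {"a", "c"}],
}
M = build_model(tweets_by_time)
weights = {(q.n, q.c, min(q.T)): q.w for q in M}
```

With the seven example tweets, `weights` holds the nine tuples listed on the slide, for instance 0.25 for (a, b) at t = 1 and 1.00 for (c, a) at t = 2.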

  9. Model as a graph
     [Figure: directed graph over the nodes a, b, and c; each edge from a target node to a context node is labeled with the w, T, g of its quintuple, e.g. "0.25,{1},1".]

  10. Query operators
      The quintuples of models are manipulated with the following operators:
      • filter
      • fold
      • jump
      • merge
      • join

  11. Filter operator
      Notation: filter(M, cond)
      Input:
      • Model M
      • Condition cond
      Returns: the set of quintuples in M that satisfy cond
      Example: M2 = filter(M1, T inside {5 ... 12} ∧ w ∈ top(10))
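A minimal sketch of filter treats the condition as a predicate over quintuples. The slides do not define `inside` or `top` precisely; here `inside` is read as "T is a subset of the given period" and `top(k)` as "w is among the k highest distinct weights of the input model", both of which are assumptions. The toy model uses top(2) rather than top(10) to keep it small.

```python
from typing import NamedTuple


class Quintuple(NamedTuple):
    n: str
    c: str
    w: float
    T: frozenset
    g: int


def filter_model(M, cond):
    """Return the quintuples of M that satisfy the predicate cond."""
    return {q for q in M if cond(q)}


def top_weights(M, k):
    """The k highest distinct weights in M (assumed reading of top(k))."""
    return set(sorted({q.w for q in M}, reverse=True)[:k])


# Toy model; the terms and weights are made up for illustration.
M1 = {
    Quintuple("a", "b", 0.25, frozenset({5}), 1),
    Quintuple("a", "c", 0.75, frozenset({6}), 1),
    Quintuple("b", "a", 1.00, frozenset({3}), 1),
    Quintuple("c", "a", 0.50, frozenset({12}), 1),
}
period = frozenset(range(5, 13))  # the time instances 5 ... 12
top2 = top_weights(M1, 2)         # {1.00, 0.75}
M2 = filter_model(M1, lambda q: q.T <= period and q.w in top2)
```

Only the tuple for (a, c) survives: (b, a) has a top weight but lies outside the period, and the other two tuples are inside the period but not among the top weights.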

  12. Fold operator
      Notation: fold(M, g)
      Input:
      • Model M
      • Integer g = g_o / g_i, where g_o and g_i are the time granularities of the output and input models, respectively
      Returns: the set of folded quintuples with time granularity g × g_i

  13. Fold operator
      Example: for the input model M1
      M1 = {⟨n1, c1, w1, {1}, 1⟩, ⟨n1, c1, w2, {2}, 1⟩, ⟨n1, c1, w3, {3}, 1⟩,
            ⟨n2, c1, w4, {1}, 1⟩, ⟨n2, c1, w5, {4}, 1⟩}
      the operation M2 = fold(M1, 3) returns
      M2 = {⟨n1, c1, w6, {1, 2, 3}, 3⟩, ⟨n2, c1, w4, {1, 2, 3}, 3⟩, ⟨n2, c1, w5, {4, 5, 6}, 3⟩}
      where w6 = P_{1,2,3}(n1 → c1)
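The fold example can be sketched in code under two assumptions that the slides do not spell out: the input model has granularity 1 with single-instance T sets, and the number of tweets containing each term at each time instance is available, so that the folded weight P over the longer period can be recomputed as a count-weighted average. The `counts` mapping and the concrete weights are hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class Quintuple(NamedTuple):
    n: str
    c: str
    w: float
    T: frozenset
    g: int


def fold(M, g, counts):
    """Fold a granularity-1 model into periods of g consecutive time instances.

    counts[(n, t)] is the number of tweets containing term n at instance t
    (assumed to be known for every term and instance that appears in M).
    """
    numer = defaultdict(float)  # (n, c, period) -> sum of w * count
    for q in M:
        (t,) = q.T              # each input tuple is valid for one instance
        start = ((t - 1) // g) * g + 1
        period = frozenset(range(start, start + g))
        numer[(q.n, q.c, period)] += q.w * counts.get((q.n, t), 0)
    folded = set()
    for (n, c, period), value in numer.items():
        denom = sum(counts.get((n, t), 0) for t in period)
        folded.add(Quintuple(n, c, value / denom, period, g))
    return folded


# Hypothetical values for w1 ... w5 and for the per-instance tweet counts.
M1 = {
    Quintuple("n1", "c1", 0.50, frozenset({1}), 1),
    Quintuple("n1", "c1", 0.25, frozenset({2}), 1),
    Quintuple("n1", "c1", 1.00, frozenset({3}), 1),
    Quintuple("n2", "c1", 0.50, frozenset({1}), 1),
    Quintuple("n2", "c1", 1.00, frozenset({4}), 1),
}
counts = {("n1", 1): 4, ("n1", 2): 4, ("n1", 3): 2, ("n2", 1): 2, ("n2", 4): 1}
M2 = fold(M1, 3, counts)
```

As on the slide, the three tuples for (n1, c1) collapse into one tuple over {1, 2, 3}, while the two tuples for (n2, c1) keep their weights because each is the only contribution in its period.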

  14. Jump operator
      Notation: jump(M, k)
      Input:
      • Model M
      • Integer k
      Output: a model with expanded contexts and weights equal to the probability of a path of length k between two nodes

  16. Jump operator
      Example: for t = 1, with rows and columns ordered a, b, c, the transition matrix is
      P_{\{1\}} = \begin{pmatrix} 0.50 & 0.25 & 0.25 \\ 1.00 & 0.00 & 0.00 \\ 0.50 & 0.00 & 0.50 \end{pmatrix}
      For M′ = jump(M, 2), the weight w of tuple ⟨a, a, w, {1}, 1⟩ ∈ M′ is
      w = p^{2}_{\{1\}}(1, 1)
      [Figure: the model graph over the nodes a, b, and c, with edges labeled w, T, g.]
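The weight in this example is the (1, 1) entry of the squared transition matrix, which can be checked with a small sketch. It uses plain nested lists and assumes the node ordering a, b, c; the helper names `mat_mul` and `mat_pow` are illustrative.

```python
def mat_mul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


def mat_pow(P, k):
    """Raise a square matrix to the integer power k >= 1."""
    result = P
    for _ in range(k - 1):
        result = mat_mul(result, P)
    return result


nodes = ["a", "b", "c"]
# Transition matrix for t = 1, rows and columns ordered a, b, c.
P1 = [
    [0.50, 0.25, 0.25],
    [1.00, 0.00, 0.00],
    [0.50, 0.00, 0.50],
]
P1_squared = mat_pow(P1, 2)
w_aa = P1_squared[0][0]  # probability of a path of length 2 from a back to a
```

Here `w_aa` is 0.5 × 0.5 + 0.25 × 1.0 + 0.25 × 0.5 = 0.625, and each row of the squared matrix still sums to 1.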

  17. Merge operator
      Notation: merge(M)
      Input:
      • Model M
      Output: a model where all tuples with the same n and c are aggregated

  18. Merge operator
      Example: if the input model is
      M1 = {⟨n1, c1, w1, T1, g⟩, ⟨n2, c1, w2, T1, g⟩, ⟨n1, c1, w3, T2, g⟩}
      then the output model M2 = merge(M1) is
      M2 = {⟨n1, c1, w4, T1 ∪ T2, g⟩, ⟨n2, c1, w2, T1, g⟩}
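A sketch of merge follows. The slides do not state how the aggregated weight w4 is computed, so the aggregation function is left as a parameter; the arithmetic mean used as the default is only a placeholder assumption, and the concrete weights and time sets are made up.

```python
from collections import defaultdict
from statistics import mean
from typing import NamedTuple


class Quintuple(NamedTuple):
    n: str
    c: str
    w: float
    T: frozenset
    g: int


def merge(M, aggregate=mean):
    """Aggregate all tuples that share the same n and c (and granularity g).

    The merged tuple is valid for the union of the time sets; its weight is
    aggregate(list of weights), where the choice of aggregate is an assumption.
    """
    groups = defaultdict(list)
    for q in M:
        groups[(q.n, q.c, q.g)].append(q)
    merged = set()
    for (n, c, g), tuples in groups.items():
        T = frozenset().union(*(q.T for q in tuples))
        w = aggregate([q.w for q in tuples])
        merged.add(Quintuple(n, c, w, T, g))
    return merged


T1, T2 = frozenset({1, 2}), frozenset({3, 4})
M1 = {
    Quintuple("n1", "c1", 0.25, T1, 1),
    Quintuple("n2", "c1", 0.50, T1, 1),
    Quintuple("n1", "c1", 0.75, T2, 1),
}
M2 = merge(M1)
```

The two tuples for (n1, c1) become one tuple valid for T1 ∪ T2, and the tuple for (n2, c1) passes through unchanged, matching the shape of the slide's output.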

  19. Join operator
      Notation: join(M1, M2, cond)
      Input:
      • Models M1 and M2
      • Condition cond
      Output: a subset of M1 which satisfies condition cond on variables of M1 and M2

  20. Join operator
      Example: given M1
      M1 = {⟨n1, c1, 0.5, {1, 2}, 1⟩, ⟨n1, c2, 0.5, {1, 2}, 1⟩,
            ⟨n1, c1, 0.7, {3, 4}, 1⟩, ⟨n1, c2, 0.3, {3, 4}, 1⟩}
      a query that asks for the tuples with increasing weight over time
      join(M1 as m, M1 as m′, m.n = m′.n ∧ m.c = m′.c ∧ min(m.T) > max(m′.T) ∧ m.w > m′.w)
      returns
      M2 = {⟨n1, c1, 0.7, {3, 4}, 1⟩}
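The join example can be reproduced with a short sketch that reads the operator as a semi-join: a tuple m of the first model is kept when at least one tuple m′ of the second model satisfies the condition. That reading, and the representation of the condition as a two-argument predicate, are assumptions made for illustration.

```python
from typing import NamedTuple


class Quintuple(NamedTuple):
    n: str
    c: str
    w: float
    T: frozenset
    g: int


def join(M1, M2, cond):
    """Keep the tuples m of M1 for which some m2 in M2 satisfies cond(m, m2)."""
    return {m for m in M1 if any(cond(m, m2) for m2 in M2)}


def increasing(m, m2):
    """Same target and context, m strictly later than m2, and a higher weight."""
    return (m.n == m2.n and m.c == m2.c
            and min(m.T) > max(m2.T) and m.w > m2.w)


M1 = {
    Quintuple("n1", "c1", 0.5, frozenset({1, 2}), 1),
    Quintuple("n1", "c2", 0.5, frozenset({1, 2}), 1),
    Quintuple("n1", "c1", 0.7, frozenset({3, 4}), 1),
    Quintuple("n1", "c2", 0.3, frozenset({3, 4}), 1),
}
M2 = join(M1, M1, increasing)
```

Only the (n1, c1) tuple for {3, 4} is returned, since its weight rises from 0.5 to 0.7, whereas the weight for (n1, c2) drops from 0.5 to 0.3.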

  21. Dataset
      • Set of 16.5 million tweets
      • tracking a set of 74 Greek stop-words
      • collected between March 20 and June 20, 2012
      • processed every 4 hours
      • The two most frequent hashtags are #ff and #elections12
      [Figures: "Volume of tweets per day" and "Volume of tweets with hashtags per day"; x-axis: Date (10/03 to 30/06), y-axis: # of tweets.]

  22. Example query
      Query: find the hashtags that are associated with #ekloges12 and for which the association weight increases for two consecutive weeks.

  23. Example query
      Query expressed with operators:
      M2 = filter(M1, n = #ekloges12)
      M3 = fold(M2, 42)
      M4 = join(M3 as m, M3 as m′, cond)
      M5 = join(M4 as m, M4 as m′, cond)
      where
      cond = m.n <> m.c ∧ m.n = m′.n ∧ m.c = m′.c ∧ m.w > m′.w ∧ min(m.T) = max(m′.T) + 1
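The two joins can be traced on a toy weekly model. With tweets processed every 4 hours, one week spans 7 × 6 = 42 time instances, which matches the fold factor of 42. The sketch below starts from a hypothetical M3, that is, the model after filter and fold; the context hashtags #vote and #debate and all weights are made up, and join is read as a semi-join, which is an assumption.

```python
from typing import NamedTuple


class Quintuple(NamedTuple):
    n: str
    c: str
    w: float
    T: frozenset
    g: int


def join(M1, M2, cond):
    """Keep the tuples m of M1 for which some m2 in M2 satisfies cond(m, m2)."""
    return {m for m in M1 if any(cond(m, m2) for m2 in M2)}


def cond(m, m2):
    """Weight increased with respect to the immediately preceding period."""
    return (m.n != m.c and m.n == m2.n and m.c == m2.c
            and m.w > m2.w and min(m.T) == max(m2.T) + 1)


def week(k, g=42):
    """Time instances covered by week k (1-based), 42 instances per week."""
    return frozenset(range((k - 1) * g + 1, k * g + 1))


target = "#ekloges12"
# Hypothetical weekly model M3 (result of the filter and fold steps).
M3 = {
    Quintuple(target, "#vote", 0.10, week(1), 42),
    Quintuple(target, "#vote", 0.20, week(2), 42),
    Quintuple(target, "#vote", 0.30, week(3), 42),
    Quintuple(target, "#debate", 0.30, week(1), 42),
    Quintuple(target, "#debate", 0.20, week(2), 42),
    Quintuple(target, "#debate", 0.40, week(3), 42),
}
M4 = join(M3, M3, cond)  # weight higher than in the previous week
M5 = join(M4, M4, cond)  # ... and the previous week was itself an increase
```

In this toy data M4 keeps three tuples (#vote in weeks 2 and 3, #debate in week 3), and M5 keeps only #vote in week 3, the one association whose weight rose in two consecutive weeks.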
