On the Navigability of Social Tagging Systems

SLIDE 1

Graz University of Technology

On the Navigability of Social Tagging Systems

Christoph Trattner

Knowledge Management Institute and Institute for Information Systems and Computer Media
Graz University of Technology, Austria
e-mail: ctrattner@iicm.edu
web: http://www.austria-lexikon.at/af/User/Trattner%20Christoph

In collaboration with: D. Helic, M. Strohmaier, K. Andrews

Christoph Trattner 2011

SLIDE 2

What is a tagging system and what are tags?

What is a tagging system? A system that allows users to apply tags to resources.

What are tags?

  • lightweight keywords (free form vocabulary)
  • generated by users
  • for users

SLIDE 3

Popular Examples

SLIDE 4

Why do system designers like tags?

  • Tags add metadata to resources for which typically only sparse metadata exists (such as pictures, movies, etc.)
  • Through tags, system designers can provide the user with simple navigational tools that improve the system's information retrieval properties
  • Tags are cheap!

SLIDE 5

Why do users like tags?

  • Through tags, users are able to categorize or describe resources
  • Users can find information faster through personal tags
  • Users can find related content faster through related tags

SLIDE 6

Navigation with Tags

Tagging systems typically provide the user with the following information retrieval interfaces for navigating their content:

  • 1. Tag clouds: widely used
  • 2. Tag hierarchies: new, hardly any implementations yet

Gupta et al. 2010

SLIDE 7

What does tag (cloud) based navigation look like?

SLIDE 8

Questions???

Are Tag Clouds useful for navigation?

SLIDE 9

Modelling a tag dataset as a graph (1/2)

  • A tagging dataset is typically modeled as a tripartite hypergraph
  • V = R ∪ U ∪ T (resources R, users U, tags T)
  • An annotation is a hyperedge (r, t, u)
  • A tripartite hypergraph can be mapped onto three bipartite graphs connecting users and resources, users and tags, and tags and resources.
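The mapping from hyperedges to the three bipartite graphs can be sketched as follows. This is a minimal illustration only: the sample annotations and the function name `project` are made up here and are not taken from the talk.

```python
# Hypothetical annotations: each hyperedge is (resource, tag, user).
annotations = [
    ("photo1", "vienna", "alice"),
    ("photo1", "travel", "bob"),
    ("photo2", "vienna", "bob"),
]

def project(annotations):
    """Map (r, t, u) hyperedges onto three bipartite edge sets."""
    user_resource, user_tag, tag_resource = set(), set(), set()
    for r, t, u in annotations:
        user_resource.add((u, r))   # who annotated which resource
        user_tag.add((u, t))        # who used which tag
        tag_resource.add((t, r))    # which tag is attached to which resource
    return user_resource, user_tag, tag_resource

user_resource, user_tag, tag_resource = project(annotations)
```

The tag-resource graph is the one that matters for tag cloud navigation, since a user moves from a resource to one of its tags and from there to another resource.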

SLIDE 10

Defining Navigability

A network is navigable iff there is a short path between all or almost all pairs of nodes in the network.

Formally:

  • 1. There exists a giant component
  • 2. The effective diameter is low (bounded by log n)
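The two conditions can be checked on a small undirected graph roughly as shown below. This is a sketch under stated assumptions: the giant component threshold (more than half of the nodes), the effective diameter as the 90th percentile of pairwise distances, and the base-2 logarithm are choices made for this illustration, and all function names are hypothetical.

```python
import math
from collections import deque

def bfs_distances(adj, source):
    """Hop distances from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist

def navigability(adj, giant_fraction=0.5, quantile=0.9):
    """Return (has_giant_component, effective_diameter, is_navigable)."""
    n = len(adj)
    largest, all_dists = 0, []
    for node in adj:
        dist = bfs_distances(adj, node)
        largest = max(largest, len(dist))  # size of this node's component
        all_dists.extend(d for d in dist.values() if d > 0)
    all_dists.sort()
    # Smallest d such that at least `quantile` of connected pairs are within d.
    idx = max(0, math.ceil(quantile * len(all_dists)) - 1)
    eff_diam = all_dists[idx] if all_dists else 0
    has_giant = largest / n > giant_fraction
    return has_giant, eff_diam, has_giant and eff_diam <= math.log2(n)

# A path of 8 nodes and a star with 10 nodes (one hub, nine leaves).
path = {i: [j for j in (i - 1, i + 1) if 0 <= j < 8] for i in range(8)}
star = {0: list(range(1, 10)), **{i: [0] for i in range(1, 10)}}
```

Under these assumptions the path fails the diameter condition while the star passes it, because every pair of leaves is two hops apart via the hub.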

  • J. Kleinberg. The small-world phenomenon: An algorithmic perspective. Proc. 32nd ACM Symposium on Theory of Computing, 2000. Also appears as Cornell Computer Science Technical Report 99-1776 (October 1999)

SLIDE 11

Navigability: Examples

Example 1: Not navigable: no giant component

Example 2: Not navigable: giant component, BUT eff. diam.: 7 > log2(8)

SLIDE 12

Navigability: Examples

Example 3: Navigable: giant component AND eff. diam.: 2 < log2(10)

Is this efficiently navigable? There are short paths between all nodes, but can an agent or algorithm find them with local knowledge only?

SLIDE 13

Efficiently navigable

A network is efficiently navigable iff there is an algorithm that can find a short path with only local knowledge, and the delivery time of the algorithm is bounded by a polynomial in log n, i.e. by log^k(n).

Example 4:

[Figure: example network with nodes labeled A, B and C]

Efficiently navigable, if the algorithm knows it needs to go through A

  • J. Kleinberg. The small-world phenomenon: An algorithmic perspective. Proc. 32nd ACM Symposium on Theory of Computing, 2000. Also appears as Cornell Computer Science Technical Report 99-1776 (October 1999)

SLIDE 14

Navigability of Social Tagging Systems (1/2)

SLIDE 15

Navigability of Social Tagging Systems (2/2)

“Hub” tags

Tagging networks are navigable power-law networks. For power-law networks, efficient sub-linear decentralised navigation algorithms exist.

SLIDE 16

But what about user interface constraints?

  • Tag cloud size n: topN resources (topN most common algorithm)
  • Pagination of resources per tag: k resources shown per page (reverse chronological ordering)

SLIDE 17

How UI constraints affect navigability

Tag Cloud Size

Limiting the tag cloud size n to practically feasible sizes (e.g. 5, 10, or more) does not influence navigability (this is not very surprising).

Pagination

BUT: Limiting the out-degree of high-frequency tags k (e.g. through pagination with resources sorted in reverse-chronological order) leaves the network vulnerable to fragmentation. This destroys the navigability of prevalent approaches to tag clouds.
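The fragmentation effect of pagination can be illustrated with a toy example. The sketch below assumes a single hub tag, ignores the tag cloud size n, and treats only the first page of k resources as reachable; the data and function names are hypothetical.

```python
def resource_links(tagged, k):
    """tagged: {tag: [(timestamp, resource), ...]}.
    Returns directed links from each resource to the resources shown on the
    first page (k newest first) of each of its tags."""
    first_page = {
        tag: [r for _, r in sorted(items, reverse=True)[:k]]
        for tag, items in tagged.items()
    }
    links = {}
    for tag, items in tagged.items():
        for _, resource in items:
            links.setdefault(resource, set()).update(first_page[tag])
    return links

def reachable(links, start):
    """All resources a user can reach from start by following links."""
    seen, stack = {start}, [start]
    while stack:
        for nxt in links.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

# One hub tag attached to six resources r1..r6, with r6 being the newest.
tagged = {"austria": [(ts, f"r{ts}") for ts in range(1, 7)]}
full = reachable(resource_links(tagged, k=6), "r1")   # no pagination limit
paged = reachable(resource_links(tagged, k=2), "r1")  # only 2 shown per page
```

With k=2 every resource links only to the two newest ones, so the older resources can no longer be reached from anywhere, which is the kind of fragmentation the slide describes.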

SLIDE 18

Questions???

How can we recover the navigability of social tagging systems?

Answer: Through resource-specific resource list construction!

SLIDE 19

What is a resource-specific resource list?

  • A resource-specific resource list is a resource list that is specific not only to a particular tag but also to a particular resource in the tagging system
  • Typically, resource lists are calculated as follows: Res(t) = {r_i(t), …, r_n(t)}
  • Resource-specific resource lists are calculated as: Res(t, r) = {r_i(t, r), …, r_n(t, r)}
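The difference between Res(t) and Res(t, r) can be sketched as two functions. This is an illustration only: the pluggable `score` function, the sample data and all names are assumptions, since the slide does not say how the resource-specific ranking is computed.

```python
def res(tag, tagged, k):
    """Res(t): the same k newest resources for every click on `tag`."""
    return [r for _, r in sorted(tagged[tag], reverse=True)[:k]]

def res_for(tag, current, tagged, k, score):
    """Res(t, r): the k resources of `tag` ranked by score(current, candidate),
    lowest score first, so the list depends on where the user currently is."""
    candidates = [r for _, r in tagged[tag] if r != current]
    return sorted(candidates, key=lambda c: score(current, c))[:k]

# Hypothetical data: (timestamp, resource) pairs for a single tag.
tagged = {"austria": [(ts, f"r{ts}") for ts in range(1, 7)]}

# Hypothetical score: distance between the numeric ids of two resources.
def id_distance(a, b):
    return abs(int(a[1:]) - int(b[1:]))

static_list = res("austria", tagged, k=2)
specific_list = res_for("austria", "r2", tagged, k=2, score=id_distance)
```

The static list is identical for all users, whereas the resource-specific list changes with the resource from which the tag was clicked.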

SLIDE 20

Approach: Random Ordering

  • Instead of reverse-chronological ordering of resources, we apply a random ordering.
  • On each click on a particular tag, a different resource list is generated
  • Problem: the network is not efficiently navigable

Better algorithms can easily be envisioned.
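Random ordering can be sketched in a few lines. The example assumes one tag with 100 resources and a page size of 10; the function name is hypothetical and the seed is fixed only to make the sketch reproducible.

```python
import random

def random_resource_list(resources, k, rng=random):
    """Return up to k resources of a tag in random order; differs per click."""
    return rng.sample(resources, min(k, len(resources)))

resources = [f"r{i}" for i in range(1, 101)]
rng = random.Random(42)  # fixed seed, an assumption for reproducibility

# Simulate 50 clicks on the same tag and collect every resource ever shown.
seen = set()
for _ in range(50):
    seen.update(random_resource_list(resources, k=10, rng=rng))
```

Over repeated clicks far more than k resources are shown, so the links are no longer limited to a fixed first page. The selection carries no information about the target, however, which is consistent with the slide's point that the network is still not efficiently navigable.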

SLIDE 21

Approach: Hierarchical Ordering

  • Instead of random ordering, we use hierarchical background knowledge for ranking paginated resources [Kleinberg 2001].
  • Kleinberg showed that if the nodes of a network can be organized into a hierarchy, then such a hierarchy provides a probability distribution for connecting the nodes in the network.
  • For such a network, a hierarchical decentralized searcher exists that is able to navigate the network in log(n) steps => the network is efficiently navigable
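One way to turn a hierarchy into a link distribution is to let the probability of linking two resources decay with the height h of their lowest common ancestor, proportional to b^(-h) for branching factor b. The sketch below follows that idea under stated assumptions: the decay exponent of 1, the representation of resources as equal-depth root paths, and all names are illustrative and are not the implementation used in the talk.

```python
import random

def lca_height(a, b):
    """Height of the lowest common ancestor of two equal-depth leaves,
    each given as a path of names from the root."""
    common = 0
    for x, y in zip(a, b):
        if x != y:
            break
        common += 1
    return len(a) - common

def link_distribution(current, candidates, branching):
    """Probability of linking current -> candidate, proportional to
    branching ** -h, where h is the LCA height."""
    weights = [branching ** -lca_height(current, c) for c in candidates]
    total = sum(weights)
    return [w / total for w in weights]

def hierarchical_resource_list(current, candidates, branching, k, rng):
    """Draw k distinct candidates, favouring those close in the hierarchy."""
    pool = list(zip(candidates, link_distribution(current, candidates, branching)))
    picked = []
    while pool and len(picked) < k:
        items, weights = zip(*pool)
        choice = rng.choices(items, weights=weights, k=1)[0]
        picked.append(choice)
        pool = [(c, w) for c, w in pool if c != choice]
    return picked

# Hypothetical resources addressed by category paths.
current = ("science", "physics", "r1")
candidates = [
    ("science", "physics", "r2"),  # same category, h = 1
    ("science", "biology", "r3"),  # same top-level category, h = 2
    ("arts", "music", "r4"),       # different branch, h = 3
]
probs = link_distribution(current, candidates, branching=2)
page = hierarchical_resource_list(current, candidates, 2, k=2, rng=random.Random(0))
```

Nearby resources dominate the list, but distant ones still appear occasionally, which gives the mix of short-range and long-range links that a decentralized searcher relies on.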

  • J. M. Kleinberg, “Small-world phenomena and the dynamics of information,” in Advances in Neural Information Processing Systems (NIPS), 14. MIT Press, 2001, p. 2001.

SLIDE 22

Approach: Hierarchical Ordering

  • J. M. Kleinberg, “Small-world phenomena and the dynamics of information,” in Advances in Neural Information Processing Systems (NIPS), 14. MIT Press, 2001, p. 2001.

SLIDE 23

Problem: Semantic Penalty

  • Hierarchy was more or less randomly constructed
  • Does not take semantic similarity between resources into account
  • Hence, two new approaches were developed
  • First idea: constructing efficiently navigable tag clouds from structured web content [Trattner 2011]
  • Second idea: develop an algorithm that is able to construct semantically correct resource hierarchies from tagging data [Trattner 2011a]

  • C. Trattner, D. Helic, M. Strohmaier, “On the Construction of Efficiently Navigable Tag Clouds Using Knowledge from Structured Web Content,” in JUCS, Volume 17, Issue 4, 565-582, 2011.
  • C. Trattner, “Improving the Navigability of Tagging Systems with Hierarchically Constructed Resource Lists and Tag Trails,” submitted to a journal, 2011.

SLIDE 24

On the construction of efficiently navigable tag clouds from structured web content

  • Content on the Web is not always flat
  • There are websites that provide a hierarchical structure
  • Example: Austria-Forum

SLIDE 25

Austria-Forum

  • Wiki-based online encyclopedia system
  • Provides over 200,000 information items about Austria
  • Unlike Wikipedia, articles in Austria-Forum are published, edited, checked and certified by people who are accepted as experts in a particular field
  • Articles are organized hierarchically into categories (shown in the figure: Community, AEIOU, Wissenssammlungen)
  • Categories are addressable via structured URLs (cf. Open Directory DMOZ)

SLIDE 26

Austria-Forum: Tagging system

  • Contrary to Wikipedia, Austria-Forum integrates a tagging system to link related documents with each other

SLIDE 27

Approach (1/2)

  • 1. Hierarchical Tag Cloud Construction

SLIDE 28

Approach (2/2)

  • 2. Hierarchical Resource List Construction

SLIDE 29

Evaluation

To evaluate the presented algorithm, a network-theoretic framework [Trattner 2011b] based on the Stanford SNAP Library (http://snap.stanford.edu/) was developed:

  • Network-theoretic module: Calculates network properties such as the size of the Largest Strongly Connected Component (LSCC) or the Effective Diameter (ED) of the tag cloud network
  • Searcher module: Implements a hierarchical decentralized searcher to simulate “efficient” tag cloud driven navigation

  • C. Trattner, “NAVTAG - A Network-Theoretic Framework to Assess and Improve the Navigability of Tagging Systems,” in 11th International Conference on Web Engineering (ICWE 2011), Springer, 2011 (to be published).

SLIDE 30

Hierarchical Decentralized Search

Goal: Navigate from START to TARGET using local background knowledge only

[Figure: background knowledge (e.g. a folksonomy); a tag network with start and target nodes; shortest path to target]
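A greedy version of such a searcher can be sketched as follows. It is an illustration under stated assumptions, not the searcher from the talk's framework: each node's position in the background hierarchy is given as a root path, the searcher sees only the neighbours of its current node, and the names, the hop limit and the toy data are hypothetical.

```python
def tree_distance(a, b):
    """Hops between two nodes of the hierarchy, each given as a root path."""
    common = 0
    for x, y in zip(a, b):
        if x != y:
            break
        common += 1
    return (len(a) - common) + (len(b) - common)

def decentralized_search(network, hierarchy, start, target, max_hops=20):
    """Greedy walk: from each node, step to the unvisited neighbour whose
    hierarchy position is closest to the target's. Returns the path or None."""
    path, visited = [start], {start}
    while path[-1] != target and len(path) <= max_hops:
        options = [n for n in network[path[-1]] if n not in visited]
        if not options:
            return None  # dead end: the searcher gives up
        nxt = min(options,
                  key=lambda n: tree_distance(hierarchy[n], hierarchy[target]))
        path.append(nxt)
        visited.add(nxt)
    return path if path[-1] == target else None

# Hypothetical background knowledge: a and b share a parent, as do c and d.
hierarchy = {"a": ("x", "a"), "b": ("x", "b"), "c": ("y", "c"), "d": ("y", "d")}
# Hypothetical tag network as adjacency lists.
network = {"a": ["b", "c"], "b": ["a"], "c": ["a", "d"], "d": ["c"]}

found = decentralized_search(network, hierarchy, "a", "d")
```

From a, the searcher prefers c over b because c sits under the same parent as the target d. The success rate of such a searcher over many start and target pairs is one way to quantify "efficient" navigation in a simulation.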

  • J. Kleinberg. The small-world phenomenon: An algorithmic perspective. Proc. 32nd ACM Symposium on Theory of Computing, 2000. Also appears as Cornell Computer Science Technical Report 99-1776 (October 1999)

SLIDE 31

Results: Navigability

SLIDE 32

Results: Searcher

SLIDE 33

Empirical Analysis

  • All in all, 24 participants were invited
  • 16 male and 8 female
  • Median age = 33 years, ranging from 22 to 56
  • All participants were experienced computer users (on average 46 hours per week)
  • 12 of them were experienced with the test system

SLIDE 34

Preliminaries

  • Download of the latest Austria-Forum tag dataset
  • Generation of two different tag networks
  • NETWORK H: Hierarchically constructed tag network
  • NETWORK C: Tag network using the reverse-chronological resource list sorting algorithm
  • Random selection of 10 resource pairs with path lengths (1-1-2-2-3-3-4-4-5-5)
  • Simulation with hierarchical decentralized searcher (Network H: 50% success rate, Network C: 20%)
  • Development of an online test

SLIDE 35

User Test: Austria-Forum

SLIDE 36

Results: User Study

The experiment showed that the hierarchically constructed tag network is significantly more navigable than the one constructed by the chronological resource list sorting algorithm.

SLIDE 37

Problem: Predefined Resource Hierarchy

  • A predefined resource hierarchy is not always given
  • Hence, the presented approach is not completely generic
  • Other problem: The success rate drops drastically if the provided resource hierarchy is neither balanced nor complete

SLIDE 38

Question?

How can we automatically construct balanced resource hierarchies with a fixed branching factor from tagging data?

SLIDE 39

Algorithm: Resource Hierarchy Generation Algorithm

SLIDE 40

Algorithm: Resource Hierarchy Labeling Algorithm

SLIDE 41

Results: Semantic Evaluation

  • Taxonomic F-Measure and Taxonomic Overlap measure the quality of a given taxonomy against a gold standard via common concepts.
  • Comparison to four popular tag hierarchy induction algorithms
  • As the gold standard for the experiment, the GermaNet ontology was used (the Austria-Forum tag dataset contains only German tags)

SLIDE 42

Results: Empirical Analysis

  • 9 test participants (all of them experienced in the evaluation of concept hierarchies)
  • Resource taxonomy with b=10
  • Evaluation via online test
  • Users had to classify tag trails

SLIDE 43

Results: Empirical Analysis

Compared to a tag taxonomy comprising only tags, we can see that the concept relations of a tag-resource taxonomy with branching factor b = 10 are only 5% less hierarchically arranged than the tag concepts of what is in theory the best semantically correct tag taxonomy approach, the so-called Deg/Cooc tag taxonomy induction algorithm.

SLIDE 44

Results: Tag Cloud Navigability

In order to determine the navigability of the approach, several tag networks with different resource list lengths were generated. Branching factors used in the experiment: b = 2, 5 and 10. Resource list length was varied from k = 10 to 50.

  • To determine navigability, the size of the LSCC and the ED were measured.
  • To determine efficiency, a hierarchical decentralized searcher was implemented, utilizing the resource hierarchy as background knowledge to search the tag networks.

SLIDE 45

Results: Network Properties

Simulations show the navigability of the hierarchically constructed tag networks.

SLIDE 46

Results: Searcher

Simulations show very high success rates (> 90%) even for “short” resource lists (k=10).

SLIDE 47

Conclusions

  • From a network-theoretic perspective, tagging systems are per se not navigable
  • Problem: Current tag cloud algorithms calculate resource lists in a static manner
  • Hence, pagination clusters the tag network
  • However, with hierarchically constructed resource lists navigability can be recovered
  • Such tag networks are also efficiently navigable, if the resources of the tagging system can be arranged into a resource taxonomy with a fixed branching factor

SLIDE 48

End of Presentation

Thank you!

Christoph Trattner
ctrattner@iicm.edu
Graz University of Technology, Austria

SLIDE 49

References (1/2)

  • Trattner, C.: NAVTAG - A Network-Theoretic Framework to Assess and Improve the Navigability of Tagging Systems, 11th International Conference on Web Engineering (ICWE 2011), Springer, 2011 (to be published).
  • Trattner, C.: Improving the Navigability of Tagging Systems with Hierarchically Constructed Resource Lists: A Comparative Study, 33rd International Conference on Information Technology Interfaces, IEEE, Cavtat / Dubrovnik, Croatia, 2011 (to be published).
  • Trattner, C., Helic, D. and Strohmaier, M.: On the Construction of Efficiently Navigable Tag Clouds Using Knowledge From Structured Web Content, Journal of Universal Computer Science, Volume 17, Issue 4, 565-582, 2011.
  • Helic, D., Trattner, C., Strohmaier, M. and Andrews, K.: Are Tag Clouds Useful for Navigation? A Network-Theoretic Analysis, Journal of Social Computing and Cyber-Physical Systems, 2011 (to be published).
  • Helic, D., Strohmaier, M., Trattner, C., Muhr, M. and Lerman, K.: Pragmatic Evaluation of Folksonomies, In Proceedings of the 20th International Conference on World Wide Web (WWW '11), ACM, New York, NY, USA, 417-426, 2011.

SLIDE 50

References (2/2)

  • Trattner, C.: QueryCloud: Automatically Linking Related Documents via Search Query (Tag) Clouds, In Proceedings of IADIS International Conference WWW/Internet 2010, Romania, 2010.
  • Helic, D., Trattner, C., Strohmaier, M., Andrews, K.: On the Navigability of Social Tagging Systems, The Second IEEE International Conference on Social Computing (SocialCom2010), Minneapolis, Minnesota, USA, 2010. (Best Paper Nomination)
  • Trattner, C., Helic, D., Strohmaier, M.: Improving Navigability of Hierarchically-Structured Encyclopedias through Effective Tag Cloud Construction, IKNOW 2010 - 10th International Conference on Knowledge Management and Knowledge Technologies, Graz, Austria, 2010.
  • Trattner, C., Hasani, I., Helic, D., Leitner, H.: The Austrian way of Wiki(pedia)! - Development of a Structured Wiki-based Encyclopedia within a Local Austrian Context, WikiSym 2010 - The 6th International Symposium on Wikis and Open Collaboration, ACM, Gdansk, Poland, 1-10, 2010.
  • Trattner, C., Helic, D.: Linking Related Documents: Combining Tag Clouds and Search Queries, In 10th International Conference on Web Engineering - ICWE 2010, LNCS 6186, Springer, Vienna, Austria, 486 - 489, 2010.
  • Trattner, C., Helic, D., Maglajlic, S.: Enriching Tagging Systems with Google Query Tags, In Proceedings of 32nd International Conference on Information Technology Interfaces, IEEE, Cavtat / Dubrovnik, Croatia, 205 - 210, 2010.