On the Navigability of Social Tagging Systems, Christoph Trattner



SLIDE 1

Graz University of Technology 1

Christoph Trattner 2012

On the Navigability of Social Tagging Systems

Christoph Trattner

Knowledge Management Institute and Institute for Information Systems and Computer Media
Graz University of Technology, Austria
e-mail: ctrattner@iicm.edu
web: http://www.austria-lexikon.at/af/User/Trattner%20Christoph

In collaboration with:

D. Helic, M. Strohmaier, K. Andrews, Ch. Körner

SLIDE 2

What is a tagging system and what are tags?

What is a tagging system? A system that lets users apply tags to resources.

What are tags?

  • lightweight keywords (free form vocabulary)
  • generated by users
  • for users
SLIDE 3

Popular examples of tagging systems are…

SLIDE 4

Tags

SLIDE 5

Tags

SLIDE 6

Tags

SLIDE 7

Why do system designers like tags?

  • Tags add meta data to resources for which typically only sparse meta data exists (such as pictures, movies, etc.)
  • Through tags, system designers can provide the user with simple navigational tools that improve the system's information retrieval properties
  • Tags are cheap!
SLIDE 8

Why do users like tags?

  • Through tags, users are able to categorize or describe resources
  • Users can find information faster through personal tags
  • Users can find related content faster through related tags
SLIDE 9

Navigation with Tags

Tagging systems typically offer the user the following information retrieval interfaces for navigating their content:

  • 1. Tag clouds: widely used
  • 2. Tag hierarchies: new, hardly any implementations yet

Gupta et al. 2010

SLIDE 10

What does tag (cloud) based navigation look like?

SLIDE 11

Are Tag Clouds useful for navigation?

Questions???

SLIDE 12

Modelling a tag dataset as a graph (1/2)

  • A tagging dataset is typically modeled as a tripartite hypergraph
  • $V = R \cup U \cup T$, where R, U and T are the sets of resources, users and tags
  • An annotation is a hyperedge (r, t, u)
  • A tripartite hypergraph can be mapped onto three bipartite graphs connecting users and resources, users and tags, and tags and resources.
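
The mapping from hyperedges to the three bipartite graphs can be sketched in a few lines of Python. This is only an illustration: the function name `project_hypergraph`, the dict-of-sets representation and the toy annotations are assumptions, not part of the talk.

```python
from collections import defaultdict


def project_hypergraph(annotations):
    """Project (resource, tag, user) hyperedges onto three bipartite graphs.

    Each bipartite graph is a dict mapping a node on one side to the set
    of nodes it is connected to on the other side.
    """
    user_resource = defaultdict(set)
    user_tag = defaultdict(set)
    tag_resource = defaultdict(set)
    for r, t, u in annotations:
        user_resource[u].add(r)
        user_tag[u].add(t)
        tag_resource[t].add(r)
    return dict(user_resource), dict(user_tag), dict(tag_resource)


# Hypothetical toy annotations (r, t, u); not data from the talk.
annotations = [
    ("r1", "music", "alice"),
    ("r1", "vienna", "bob"),
    ("r2", "music", "bob"),
]
user_resource, user_tag, tag_resource = project_hypergraph(annotations)
print(sorted(tag_resource["music"]))
```

The tag-resource graph is the one that matters for tag cloud navigation, since a click on a tag leads to the resources carrying that tag.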

SLIDE 13

Defining Navigability

A network is navigable iff there is a short path between all or almost all pairs of nodes in the network. Formally:

  • 1. There exists a giant component
  • 2. The effective diameter is low (bounded by log n)

  • J. Kleinberg. The small-world phenomenon: An algorithmic perspective. Proc. 32nd ACM Symposium on Theory of Computing, 2000. Also appears as Cornell Computer Science Technical Report 99-1776 (October 1999)
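
A minimal sketch of how the two conditions could be checked on an undirected graph. Several choices here are assumptions rather than definitions from the talk: the giant component is required to cover at least 90% of the nodes, the effective diameter is taken as the 90th percentile of shortest-path lengths between connected pairs, and the bound is log2(n) as in the examples that follow.

```python
import math
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def largest_component(adj):
    """Node set of the largest connected component."""
    seen, best = set(), set()
    for v in adj:
        if v not in seen:
            comp = set(bfs_distances(adj, v))
            seen |= comp
            if len(comp) > len(best):
                best = comp
    return best


def effective_diameter(adj, nodes, q=0.9):
    """Smallest d such that a fraction q of connected ordered pairs is within d hops."""
    lengths = sorted(
        d for s in nodes for t, d in bfs_distances(adj, s).items() if t != s
    )
    if not lengths:
        return 0
    return lengths[math.ceil(q * len(lengths)) - 1]


def is_navigable(adj, giant_fraction=0.9):
    """Giant component exists AND effective diameter <= log2(n)."""
    n = len(adj)
    giant = largest_component(adj)
    if len(giant) < giant_fraction * n:
        return False
    return effective_diameter(adj, giant) <= math.log2(n)


def undirected(edges):
    """Build an adjacency dict (node -> set of neighbours) from an edge list."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


# Hypothetical test graphs: a 10-node star and an 8-node path.
star = undirected([(0, i) for i in range(1, 10)])
path = undirected([(i, i + 1) for i in range(7)])
print(is_navigable(star), is_navigable(path))
```

The all-pairs BFS is quadratic in the number of nodes, so this is only suitable for small illustrative graphs; large tag networks would need sampling or a dedicated graph library.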

SLIDE 14

Navigability: Examples

Example 1: Not navigable: no giant component

Example 2: Not navigable: giant component, BUT effective diameter $7 > \log_2 8 = 3$

SLIDE 15

Navigability: Examples

Example 3: Navigable: giant component AND effective diameter $2 < \log_2 10 \approx 3.32$

Is this efficiently navigable? There are short paths between all nodes, but can an agent or algorithm find them with local knowledge only?
SLIDE 16

Efficiently navigable

A network is efficiently navigable iff there is an algorithm that can find a short path with only local knowledge, and the delivery time of the algorithm is bounded by a polynomial in $\log n$, i.e. by $\log^k n$.

Example 4: Efficiently navigable, if the algorithm knows it needs to go through A → B → C

  • J. Kleinberg. The small-world phenomenon: An algorithmic perspective. Proc. 32nd ACM Symposium on Theory of Computing, 2000. Also appears as Cornell Computer Science Technical Report 99-1776 (October 1999)

SLIDE 17

Navigability of Social Tagging Systems (1/2)

In general, tags form networks that are navigable from a network-theoretic perspective.

SLIDE 18

Navigability of Social Tagging Systems (2/2)


Tagging networks are navigable power-law networks. For power law networks, efficient sub-linear decentralised navigation algorithms exist.

„Hub“ tags

SLIDE 19

But what about user interface constraints?

  • Tag cloud size: n (topN, the most common algorithm)
  • Pagination of resources per tag: k resources shown per page (reverse-chronological ordering)

SLIDE 20

How UI constraints affect Navigability

Limiting the tag cloud size n to practically feasible sizes (e.g. 5, 10, or more) does not influence navigability (this is not very surprising).

BUT: Limiting the out-degree k of high-frequency tags (e.g. through pagination with resources sorted in reverse-chronological order) leaves the network vulnerable to fragmentation. This destroys the navigability of prevalent approaches to tag clouds.
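
The fragmentation effect can be reproduced on toy data. The sketch below assumes a simplified model in which a user only ever sees the first page of a tag's resource list (the k newest resources); the function names and the data are hypothetical.

```python
from collections import deque


def first_page_links(tagged, k):
    """Map each resource to the resources reachable with one tag click.

    tagged: list of (timestamp, resource, tags) tuples.
    Assumption: only the k newest resources per tag are shown.
    """
    by_tag = {}
    for ts, r, tags in tagged:
        for t in tags:
            by_tag.setdefault(t, []).append((ts, r))
    first_page = {
        t: [r for _, r in sorted(items, reverse=True)[:k]]
        for t, items in by_tag.items()
    }
    return {
        r: {r2 for t in tags for r2 in first_page[t] if r2 != r}
        for _, r, tags in tagged
    }


def reachable(links, start):
    """Resources reachable from start by following tag clicks (BFS)."""
    seen = {start}
    queue = deque([start])
    while queue:
        r = queue.popleft()
        for r2 in links[r]:
            if r2 not in seen:
                seen.add(r2)
                queue.append(r2)
    return seen


# Hypothetical data: six resources share one high-frequency tag.
tagged = [(i, f"r{i}", ["austria"]) for i in range(1, 7)]
print(len(reachable(first_page_links(tagged, k=2), "r1")))  # 3
print(len(reachable(first_page_links(tagged, k=6), "r1")))  # 6
```

With k = 2, the four oldest resources never appear on any first page, so they cannot be reached from anywhere else through the tag cloud.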

SLIDE 21

How can we recover the navigability of social tagging systems?

Answer: Through resource-specific resource list construction!

Questions???

SLIDE 22

What is a resource-specific resource list?

  • A resource-specific resource list is a resource list that is specific not only to a particular tag but also to a particular resource in the tagging system
  • Typically, resource lists are calculated as $Res(t) = \{r_1(t), \dots, r_n(t)\}$
  • Resource-specific resource lists are calculated as $Res(t, r) = \{r_1(t, r), \dots, r_n(t, r)\}$

SLIDE 23

Approach: Random Ordering

  • Instead of reverse-chronological ordering of resources, we apply a random ordering.
  • On each click on a particular tag, a different resource list is generated
  • Problem: the network is not efficiently navigable

Better algorithms can easily be envisioned.
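
As a rough illustration of the random approach, the page shown after a tag click could be drawn as a fresh uniform sample. The function name and the sample data are assumptions for this sketch.

```python
import random


def random_resource_list(resources_for_tag, k, rng=random):
    """Return up to k resources sampled uniformly at random.

    Calling this on every click yields a different page each time,
    instead of always the k newest resources.
    """
    pool = list(resources_for_tag)
    return rng.sample(pool, min(k, len(pool)))


# Hypothetical resources carrying one tag; a seeded generator keeps runs repeatable.
resources = [f"r{i}" for i in range(1, 11)]
rng = random.Random(0)
page = random_resource_list(resources, 3, rng)
print(page)
```

Every resource has the same chance of appearing, which keeps the network connected in expectation, but the sample carries no information a searcher could exploit to move towards a target.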

SLIDE 24

Approach: Hierarchical Ordering

  • Instead of random ordering, we use hierarchical background knowledge for ranking paginated resources [Kleinberg 2001].
  • Kleinberg showed that if the nodes of a network can be organized into a hierarchy, then such a hierarchy provides a probability distribution for connecting the nodes in the network.
  • For such a network, a hierarchical decentralized searcher exists that is able to navigate the network in log(n) time, so the network is efficiently navigable

  • J. M. Kleinberg, “Small-world phenomena and the dynamics of information,” in Advances in Neural Information Processing Systems (NIPS), 14. MIT Press, 2001, p. 2001.
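
One way to read the hierarchical idea is sketched below: each resource is a leaf in a category tree, and a candidate is drawn for the resource list with a weight that decays with the height of its lowest common ancestor with the current resource. The weight `b ** -h`, the path-tuple representation and all names are assumptions for illustration; the slides do not spell out the exact ranking rule.

```python
import random


def lca_height(path_a, path_b):
    """Height of the lowest common ancestor of two leaves.

    Paths are root-to-leaf tuples such as ("arts", "music", "brahms").
    """
    common = 0
    for a, b in zip(path_a, path_b):
        if a != b:
            break
        common += 1
    return max(len(path_a), len(path_b)) - common


def hierarchical_resource_list(current, candidates, paths, k, b=2, rng=random):
    """Sample up to k candidates without replacement, weighted by b ** -h."""
    pool = [c for c in candidates if c != current]
    weights = [b ** -lca_height(paths[current], paths[c]) for c in pool]
    chosen = []
    while pool and len(chosen) < k:
        i = rng.choices(range(len(pool)), weights=weights)[0]
        chosen.append(pool.pop(i))
        weights.pop(i)
    return chosen


# Hypothetical hierarchy positions for four resources.
paths = {
    "brahms": ("arts", "music", "brahms"),
    "beethoven": ("arts", "music", "beethoven"),
    "klimt": ("arts", "painting", "klimt"),
    "einstein": ("science", "physics", "einstein"),
}
page = hierarchical_resource_list("brahms", list(paths), paths, k=2, rng=random.Random(0))
print(page)
```

Nearby resources are favoured, but distant ones still get a non-zero chance, which gives the mix of short-range and long-range links that decentralized search relies on.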

SLIDE 25

Approach: Hierarchical Ordering

  • J. M. Kleinberg, “Small-world phenomena and the dynamics of information,” in Advances in Neural Information Processing Systems (NIPS), 14. MIT Press, 2001, p. 2001.

SLIDE 26

Problem: Semantic Penalty

  • The hierarchy was more or less randomly constructed
  • It does not take semantic similarity between resources into account
  • Hence, two new approaches were developed
  • First idea: construct efficiently navigable tag clouds from structured web content [Trattner 2011]
  • Second idea: develop an algorithm that is able to construct semantically sound resource hierarchies from tagging data [Trattner 2011a]

  • C. Trattner, D. Helic, M. Strohmaier, “On the Construction of Efficiently Navigable Tag Clouds Using Knowledge from Structured Web Content,” in JUCS, Volume 17, Issue 4, 565-582, 2011.
  • C. Trattner, “Improving the Navigability of Tagging Systems with Hierarchically Constructed Resource Lists and Tag Trails”, in CIT, 2011.
SLIDE 27

On the construction of efficiently navigable tag clouds from structured web content

  • Content on the Web is not always flat
  • There are websites that provide a hierarchical structure
  • Example: Austria-Forum
SLIDE 28

Austria-Forum

Community AEIOU

Wissenssammlungen

  • Wiki-based online encyclopedia system
  • Provides over 200,000 information items about Austria.
  • Unlike in Wikipedia, articles in Austria-Forum are published, edited, checked and certified by people who are accepted as experts in a particular field
  • Articles are organized hierarchically into categories
  • Categories are addressable via structured URLs (cf. Open Directory DMOZ)

SLIDE 29

Austria-Forum

Tags Resource

SLIDE 30

Approach (1/2)

  • 1. Hierarchical Tag Cloud Construction
SLIDE 31

Approach (2/2)

  • 2. Hierarchical Resource List Construction
SLIDE 32

Evaluation

To evaluate the presented algorithm, a network-theoretic framework [Trattner 2011b] based on the Stanford SNAP Library (http://snap.stanford.edu/) was developed:

  • Network-theoretic module: calculates network properties such as the size of the Largest Strongly Connected Component (LSCC) or the Effective Diameter (ED) of the tag cloud network
  • Searcher module: implements a hierarchical decentralized searcher to simulate “efficient” tag cloud driven navigation

  • C. Trattner, “NAVTAG - A Network-Theoretic Framework to Assess and Improve the Navigability of Tagging Systems,” in 11th International Conference on Web Engineering (ICWE 2011), Springer, 2011.
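
To make the LSCC measure concrete, here is a small pure-Python sketch that does not use SNAP. It finds each strongly connected component as the intersection of the nodes reachable forwards and backwards from a seed node; the function names and the toy edge list are assumptions.

```python
from collections import deque


def _reach(adj, start):
    """Nodes reachable from start in the directed graph adj (BFS)."""
    seen = {start}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        for w in adj.get(v, ()):
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return seen


def largest_scc(edges):
    """Largest strongly connected component of a directed edge list."""
    fwd, bwd, nodes = {}, {}, set()
    for a, b in edges:
        fwd.setdefault(a, set()).add(b)
        bwd.setdefault(b, set()).add(a)
        nodes.update((a, b))
    best, assigned = set(), set()
    for v in nodes:
        if v in assigned:
            continue
        # v's component: nodes that can be reached from v and can reach v.
        scc = _reach(fwd, v) & _reach(bwd, v)
        assigned |= scc
        if len(scc) > len(best):
            best = scc
    return best


# Hypothetical tag cloud network: a 3-cycle with one dead-end node.
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")]
print(sorted(largest_scc(edges)))
```

This approach is simple but slow on large graphs; a linear-time algorithm such as Tarjan's, or a graph library, would be the practical choice for real tag networks.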

SLIDE 33

Hierarchical Decentralized Search

[Figure: a tag network with START and TARGET nodes, next to the background knowledge (e.g. a folksonomy)]

Goal: Navigate from START to TARGET using local background knowledge only

  • J. Kleinberg. The small-world phenomenon: An algorithmic perspective. Proc. 32nd ACM Symposium on Theory of Computing, 2000. Also appears as Cornell Computer Science Technical Report 99-1776 (October 1999)
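
A greedy variant of such a searcher might look like the sketch below. At each step it only inspects the neighbours of the current node and moves to the one that lies closest to the target in the background hierarchy. The tree-distance measure, the `max_hops` cut-off and the toy network are assumptions, not the exact searcher used in the evaluation.

```python
def tree_distance(path_a, path_b):
    """Number of hierarchy edges between two leaves given root-to-leaf paths."""
    common = 0
    for a, b in zip(path_a, path_b):
        if a != b:
            break
        common += 1
    return (len(path_a) - common) + (len(path_b) - common)


def decentralized_search(adj, paths, start, target, max_hops=20):
    """Greedy search with local knowledge only; returns the route or None."""
    current, route, visited = start, [start], {start}
    while current != target and len(route) - 1 < max_hops:
        options = [n for n in adj.get(current, ()) if n not in visited]
        if not options:
            return None  # dead end: no unvisited neighbour left
        current = min(options, key=lambda n: tree_distance(paths[n], paths[target]))
        visited.add(current)
        route.append(current)
    return route if current == target else None


# Hypothetical network and hierarchy positions.
paths = {
    "a": ("arts", "music", "a"),
    "b": ("arts", "music", "b"),
    "c": ("arts", "painting", "c"),
    "d": ("science", "physics", "d"),
    "e": ("science", "physics", "e"),
}
adj = {"a": ["b", "d"], "b": ["c"], "c": ["e"], "d": ["e"], "e": []}
print(decentralized_search(adj, paths, "a", "e"))  # ['a', 'd', 'e']
```

The success rate of a searcher like this over many start and target pairs is one way to quantify how efficiently navigable a network is.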
SLIDE 34

Results: Navigability

Approaches that calculate resource lists in a random manner form navigable tag cloud networks.

SLIDE 35

Results: Searcher

  • Best results are obtained with hierarchically constructed tag clouds/resource lists (=HH)
  • The naive approach (=N, TopN + chron. sorted resource list) performs worst
  • However, HR performs better than a pure random approach (=R)

SLIDE 36

User Study

  • To measure the performance of the approach, a between-group test design was used
  • For that purpose, we randomly split our test users into two groups

Group A: assigned to navigate Austria-Forum with hierarchically constructed resource lists
Group B (baseline): assigned to navigate Austria-Forum with reverse-chronologically sorted resource lists

SLIDE 37

User Study

  • During the study, the users were asked to complete 10 tasks
  • In particular, the users were asked to navigate from 10 given start resources to 10 given target resources as fast as possible.
  • To get valid results, the start and target resources were selected uniformly at random (same for all users)
  • As a navigation tool, users were allowed to use only tag clouds
SLIDE 38

User Study

  • To ensure that the users would have to navigate, we selected the paths in such a way that the users had to visit 0-4 intermediate resources to find the target resources
  • Each user was given a maximum of 3 minutes for each task

SLIDE 39

Example: Tag cloud based navigation

[Figure: tag cloud based navigation from a start resource to a target resource via a resource list; the screenshot labels include Brahms and Beethoven]

SLIDE 40

User Study

  • Since we observed during our pilot test that users had problems finding resources that they did not know, the tags of the target resource were also presented to the users
  • The variable measured in the experiment was the success rate, i.e. we measured whether the user could find the target resources or not

SLIDE 41

Results: User Study

  • All in all, 24 test users participated in the experiment
  • 16 male and 8 female
  • Median age = 33 years, ranging from 22 to 56
  • All participants were experienced computer users (on average 46 hours per week)
  • 12 of them were experienced with the Austria-Forum test system
  • To get rid of this bias, we assigned those users randomly to groups A and B

SLIDE 42

Results: User Study

  • Regarding the mean success rate, we observed that users of group A found on average 55% of their designated target resources
  • In comparison, the users of group B were only able to find 23% of their designated target resources
  • In other words, on average we observed an improvement of 32 percentage points (55% vs. 23%) in the navigability of the Austria-Forum tagging system when using hierarchically constructed resource lists.
  • These results confirm the theoretical assumptions made in previous work in this area [Helic et al. 2011]

Helic, D., Trattner, C., Strohmaier, M. and Andrews, K.: Are Tag Clouds Useful for Navigation? A Network-Theoretic Analysis, Journal of Social Computing and Cyber-Physical Systems, 2011.

SLIDE 43

Results: User Study

The experiment showed that the hierarchically constructed tag network is significantly more navigable than the one produced by the naïve approach.

SLIDE 44

Problem: Predefined Resource Hierarchy

  • A predefined resource hierarchy is not always given
  • Hence, the presented approach is not completely generic
  • Other problem: the success rate drops drastically if the provided resource hierarchy is neither balanced nor complete

SLIDE 45

Question?

How can we automatically construct balanced resource hierarchies with a fixed branching factor from tagging data?

SLIDE 46

Algorithm: Resource Hierarchy Generation

SLIDE 47

Algorithm: Resource Hierarchy Labeling

SLIDE 48

Results: Semantic Evaluation

  • Taxonomic F-Measure and Taxonomic Overlap measure the quality of a given taxonomy against a gold standard via common concepts.
  • Comparison to four popular tag hierarchy induction algorithms
  • As the gold standard for the experiment, the GermaNet ontology was used (the Austria-Forum tag dataset contains only German tags)
SLIDE 49

Results: Empirical Analysis

  • 9 test participants (all of them experienced in the evaluation of concept hierarchies)
  • Resource taxonomy with b=10
  • Evaluation via online test
  • Users had to classify tag trails
SLIDE 50

Results: Empirical Evaluation

Compared to a tag taxonomy comprising only tags, the concept relations of a tag-resource taxonomy with branching factor b = 10 are only 5% less hierarchically arranged than the tag concepts produced by the so-called Deg/Cooc tag taxonomy induction algorithm, which is in theory the semantically best tag taxonomy approach.

SLIDE 51

Results: Tag Cloud Navigability

To determine the navigability of the approach, several tag networks with different resource list lengths were generated. Branching factors used in the experiment: b = 2, 5 and 10. The resource list length was varied from k = 10 to 50.

  • To determine navigability, the size of the LSCC and the ED were measured.
  • To determine efficiency, a hierarchical decentralized searcher was implemented that uses the resource hierarchy as background knowledge to search the tag networks.

SLIDE 52

Results: Network Properties

Simulations show the navigability of the hierarchically constructed tag networks.

SLIDE 53

Results: Searcher

Simulations show very high success rates (> 90%) even for “short” resource lists (k=10).

SLIDE 54

Conclusions

  • From a network-theoretic perspective (and only looking at tags), tagging systems are navigable
  • However, if we consider simple user-interface constraints, they are NOT!
  • Problem: Current tag cloud algorithms calculate resource lists statically
  • Pagination fragments the tag network into isolated clusters
  • However, with hierarchically constructed resource lists, navigability can be recovered
  • Such tag networks are also efficiently navigable if the resources of the tagging system can be arranged into a resource taxonomy with a fixed branching factor

SLIDE 55

End of Presentation Thank you!

Christoph Trattner

ctrattner@iicm.edu

Graz University of Technology, Austria

SLIDE 56

References and Further Readings

Trattner, C., Lin, Y., Parra, D., Yue, Z., Brusilovsky, P.: Evaluating Tag-Based Information Access in Image Collections, In Proceedings of the 23rd ACM Conference on Hypertext and Social Media, ACM, New York, NY, USA, 2012.

Helic, D., Körner, C., Granitzer, M., Strohmaier, M., Trattner, C.: Navigational efficiency of broad vs. narrow folksonomies, In Proceedings of the 23rd ACM Conference on Hypertext and Social Media, ACM, New York, NY, USA, 2012.

Trattner, C., Singer, P., Helic, D. and Strohmaier, M.: Exploring the Differences and Similarities of Hierarchical Decentralized Search and Human Navigation in Information-networks, In Proceedings of the 11th International Conference on Knowledge Management and Knowledge Technologies, ACM, New York, NY, USA, 2012.

Trattner, C.: Linking Related Content in Web Encyclopedias with search query tag clouds, IADIS International Journal on WWW/Internet, Volume 9(2), 2011.

Trattner, C.: Improving the Navigability of Tagging Systems with Hierarchically Constructed Resource Lists and Tag Trails, Journal of Computing and Information Technology, Volume 19(3), 155-167, 2011.

Trattner, C., Helic, D. and Strohmaier, M.: On the Construction of Efficiently Navigable Tag Clouds Using Knowledge From Structured Web Content, Journal of Universal Computer Science, Volume 17(4), 565-582, 2011.

SLIDE 57

References and Further Readings

Helic, D., Strohmaier, M., Trattner, C., Muhr, M. and Lerman, K.: Pragmatic Evaluation of Folksonomies, In Proceedings of the 20th International Conference on World Wide Web, ACM, New York, NY, USA, 417-426, 2011.

Trattner, C., Körner, C., Helic, D.: Enhancing the Navigability of Social Tagging Systems with Tag Taxonomies, In Proceedings of the 11th International Conference on Knowledge Management and Knowledge Technologies, ACM, 7–9 September 2011, Messe Congress Graz, Austria, 2011.

Trattner, C.: Improving the Navigability of Tagging Systems with Hierarchically Constructed Resource Lists: A Comparative Study, In Proceedings of the 33rd International Conference on Information Technology Interfaces, IEEE, Cavtat / Dubrovnik, Croatia, 2011.

Helic, D., Trattner, C., Strohmaier, M., Andrews, K.: On the Navigability of Social Tagging Systems, In Proceedings of the Second IEEE International Conference on Social Computing, Minnesota, USA, 2010.