Exploring Social Tagging Graph for Web Object Classification Zhijun - - PowerPoint PPT Presentation


Data e Web Mining, AA 2009/2010: Exploring Social Tagging Graph for Web Object Classification (Zhijun Yin, Rui Li, Qiaozhu Mei, Jiawei Han). Giulia Mialich 825102, Luca Rossi 825038.


slide-1
SLIDE 1

Exploring Social Tagging Graph for Web Object Classification

Zhijun Yin, Rui Li, Qiaozhu Mei, Jiawei Han

Giulia Mialich – 825102 Luca Rossi – 825038 Data e Web Mining – AA 2009/2010

slide-2
SLIDE 2

Giulia Mialich - Luca Rossi

Stats

  • recently announced 300 million users (Sept. 2009)
  • Social media has overtaken porn as the #1 activity on the Web
  • % of companies using as a primary tool to find employees: 80%
  • The #2 largest search engine in the world is
  • There are over 200,000,000 blogs, and 54% post content or tweet daily
  • 25% of Americans said they watched a short video on their phone in the past month
  • More than 1.5 million pieces of content (web links, news stories, blog posts, notes, photos, etc.) are shared on Facebook daily

slide-3
SLIDE 3

Web Objects

Products, Video, Photo, Research Papers: 10^3 - 10^9

slide-4
SLIDE 4

Web Objects

We need to classify web objects into semantic categories in order to:

  • Index and organize web objects efficiently
  • Browse and search web objects conveniently
  • Discover interesting patterns from web objects

slide-5
SLIDE 5

Web Objects Classification


slide-6
SLIDE 6

Web Objects Classification

A challenging task because of the specific characteristics of the data.

  • LACK OF FEATURES

Limited text description, and difficulty of extracting content features from videos and images

slide-7
SLIDE 7

Web Objects Classification


  • LACK OF INTERCONNECTIONS

“Michael Jordan”

slide-8
SLIDE 8

Web Objects Classification

  • LACK OF LABELS

Difficulty of creating a large training set

slide-9
SLIDE 9

Web Objects Classification

Social tags: heterogeneous objects on the Web are tagged by users, with keywords freely chosen from their own vocabulary.

Social tags overcome the difficulties of web object classification. Labels from the diagram:

  • Limited text description
  • Rich semantic feature space
  • Isolated settings of web objects
  • Labeled examples in some domains

slide-10
SLIDE 10

Social Tags

LACK OF FEATURES

Users provide enriched semantic features for web object classification

slide-11
SLIDE 11

Social Tags


LACK OF INTERCONNECTIONS

New link structure of web objects

slide-12
SLIDE 12

Social Tags

LACK OF LABELS

Heterogeneous types of web objects are connected through common tags

slide-13
SLIDE 13

Related work

This is the first work to explore social tag data for web object classification.

Related problems that have been investigated for a long time:

  • Web page classification: textual features, hyperlinks, HTML & metadata, query logs
  • Multimedia object classification: text features, contextual information

slide-14
SLIDE 14

Related work

Authors propose a general theoretic framework for explicitly modeling tagging behaviors and the web object classification problem.

Social tags can benefit:

  • web search
  • information retrieval
  • semantic web
  • web page clustering
  • user interest mining

slide-15
SLIDE 15
S. Bao, G.-R. Xue, X. Wu, Y. Yu, B. Fei, and Z. Su. Optimizing web search using social annotations. In WWW, 2007.

Bao et al. observe that social annotations can benefit web search in two aspects:

1. Annotations are good summaries of the corresponding web pages [Amazon's homepage: shopping, books, amazon, music, store]. Similar or closely related annotations are usually given to the same web pages ⇒ SocialSimRank (SSR)
2. The count of annotations indicates the popularity of web pages from the users' point of view ⇒ SocialPageRank (SPR)

slide-16
SLIDE 16

Yin et al. work

An innovative exploration of social tags for web object classification. They propose an iterative algorithm which solves the problem efficiently, significantly outperforming state-of-the-art methods that do not use tags as bridges.

slide-17
SLIDE 17

Social Tagging Graph

Simplifying assumption: 2 types of objects, S and T. S objects are already labeled; T objects are either labeled or unlabeled.

slide-18
SLIDE 18

Social Tagging Graph

$G = (V, E)$ is the social tagging graph, where $V$ is the set of all the objects (plus tags) and $E$ is the set of edges between an object and its tags.

$C = \{c_1, c_2, \ldots, c_k\}$ is the set of all the categories.

  • $V_T^l$: the labeled objects of type T

slide-19
SLIDE 19

Social Tagging Graph

  • $V_T^u$: the unlabeled objects of type T

slide-20
SLIDE 20

Social Tagging Graph

  • $V_{tag}$: the tags
slide-21
SLIDE 21

Social Tagging Graph

  • $V_S$: the objects of type S
slide-22
SLIDE 22

Intuitions

Web users are likely to select similar tags for objects belonging to the same semantic category, independent of the object type ⇒ tags can be used as a “bridge” to semantically connect objects

slide-23
SLIDE 23

Intuitions

The labels assigned by the classifier should be consistent. This consistency can be captured by 3 properties.

The category assignment of a vertex in $V_S$ or $V_T^l$ should not deviate much from its original label, as long as we trust the initial labeling.

slide-24
SLIDE 24

Intuitions

The category assignment of a vertex in $V_T^u$ should take into account any prior knowledge.

slide-25
SLIDE 25

Intuitions

The category assignment of any vertex of the graph should be as consistent as possible with its neighbors' labels.

slide-26
SLIDE 26

The Optimization Framework

  • $f_u$: a $k$-dimensional vector that represents the class distribution of vertex $u \in V$, where $k$ is the number of categories. $f_u[i]$ represents the probability that $u$ belongs to category $i$, s.t. $\sum_{i=1}^{k} f_u[i] = 1$. We denote $\{f_u\}_{u \in V}$ as $f$.
  • $f_u^*$: the optimal solution of $f_u$.
  • $\hat{f}_u$: for $u \in V_S \cup V_T^l$, $\hat{f}_u$ is the class distribution estimated from the original category labels of vertex $u$. For $u \in V_T^u$, $\hat{f}_u$ is the class distribution estimated from some prior knowledge of the unlabeled object $u$ (e.g., the label assignments by a domain classifier).
  • $w_{uv}$: a weight of the importance of edge $(u, v)$. Given an object $u$ and its associated tag $v$, $w_{uv}$ is the frequency with which $v$ is used to tag $u$.
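As a rough illustration of these definitions, the sketch below builds the edge weights $w_{uv}$ from a list of tagging events and a one-hot $\hat{f}_u$ from a known label. The helper names (`edge_weights`, `one_hot`) and the toy data are assumptions made for this example, not part of the paper.

```python
from collections import Counter

def edge_weights(tagging_events):
    """w[(object, tag)] = number of times the tag was applied to the object."""
    return Counter(tagging_events)

def one_hot(label, categories):
    """Class distribution estimated from a single known category label."""
    return [1.0 if c == label else 0.0 for c in categories]

# Hypothetical tagging events: (object, tag) pairs, one per user action.
events = [("book42", "novel"), ("book42", "novel"),
          ("book42", "fiction"), ("cd7", "jazz")]
w = edge_weights(events)
f_hat_book42 = one_hot("Books", ["Books", "Music"])
```

Here `w[("book42", "novel")]` is 2 because the tag was applied twice, and `f_hat_book42` sums to 1 as required of a class distribution.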

slide-27
SLIDE 27

The Optimization Framework

$$O(f) = \alpha \sum_{u \in V_S} \|f_u - \hat{f}_u\|^2 + \beta \sum_{u \in V_T^l} \|f_u - \hat{f}_u\|^2 + \gamma \sum_{u \in V_T^u} \|f_u - \hat{f}_u\|^2 + \sum_{(u,v) \in E} w_{uv} \|f_u - f_v\|^2$$

slide-28
SLIDE 28

The Optimization Framework

1. $\sum_{u \in V_S} \|f_u - \hat{f}_u\|^2$ means that the category of a vertex in $V_S$ should not deviate much from its original label(s).
2. $\sum_{u \in V_T^l} \|f_u - \hat{f}_u\|^2$ means that the category of a vertex in $V_T^l$ should keep close to its initial label(s).
3. $\sum_{u \in V_T^u} \|f_u - \hat{f}_u\|^2$ means that the category of a vertex in $V_T^u$ should keep close to the prior knowledge, if any.
4. $\sum_{(u,v) \in E} w_{uv} \|f_u - f_v\|^2$ makes sure that the class distribution of the vertices is smooth over the whole graph, i.e., the class distribution of a vertex is consistent with its neighbors.
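The four terms can be evaluated directly on a toy graph. The following is a minimal sketch, assuming class distributions are stored as Python lists in dictionaries keyed by vertex name and edge weights in a dictionary keyed by (object, tag); these data structures and the function names are assumptions for illustration only.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two class distributions."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def objective(f, f_hat, w, V_S, V_Tl, V_Tu, alpha, beta, gamma):
    """Value of O(f): three fitting terms plus the smoothness term."""
    fit = 0.0
    for weight, group in ((alpha, V_S), (beta, V_Tl), (gamma, V_Tu)):
        fit += weight * sum(sq_dist(f[u], f_hat[u]) for u in group)
    smooth = sum(w_uv * sq_dist(f[u], f[v]) for (u, v), w_uv in w.items())
    return fit + smooth

# Toy graph: one S object and one unlabeled T object sharing a single tag.
f = {"s": [1.0, 0.0], "t": [0.5, 0.5], "tag": [0.5, 0.5]}
f_hat = {"s": [1.0, 0.0], "t": [0.5, 0.5]}
w = {("s", "tag"): 2, ("t", "tag"): 1}
value = objective(f, f_hat, w, ["s"], [], ["t"], alpha=1.0, beta=1.0, gamma=1.0)
```

In this toy case the fitting terms vanish and only the edge between `s` and `tag` contributes, giving 2 × (0.25 + 0.25) = 1.0. The sketch assumes finite α, β, γ.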

slide-29
SLIDE 29

The Optimization Framework

Our target is to find

$$f^* = \arg\min_f O(f)$$

Based on this class distribution we can state that, given an object $o$, its class $c$ is

$$c = \arg\max_c \frac{P(o \mid c)}{P(o)} = \arg\max_c \frac{P(c \mid o)}{P(c)}$$

slide-30
SLIDE 30

The Optimization Framework

In terms of the optimal class distribution $f^*$, the class $c$ of an object $u$ is

$$c = \arg\max_{1 \le i \le k} \frac{f_u^*[i]}{\sum_{u' \in V_T^l \cup V_T^u} f_{u'}^*[i]}$$
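The decision rule divides each class score by the total mass that class receives over all type-T objects, which plays the role of $P(c)$. The sketch below is one possible reading of that rule; the function name `assign_labels` and the toy distributions are assumptions for illustration.

```python
def assign_labels(f_star, unlabeled, type_t_objects, k):
    """Pick, for each unlabeled object, the class maximizing f*[i] / sum_u f*_u[i]."""
    # mass[i] sums the belief in category i over all type-T objects.
    mass = [sum(f_star[u][i] for u in type_t_objects) for i in range(k)]
    labels = {}
    for x in unlabeled:
        scores = [f_star[x][i] / mass[i] if mass[i] > 0 else 0.0
                  for i in range(k)]
        labels[x] = max(range(k), key=scores.__getitem__)
    return labels

# Hypothetical converged distributions over two categories.
f_star = {"a": [0.6, 0.4], "b": [0.9, 0.1], "c": [0.7, 0.3]}
labels = assign_labels(f_star, ["a", "c"], ["a", "b", "c"], k=2)
```

In this example category 0 dominates every object, so a plain argmax would label everything 0. After dividing by the class mass (2.2 versus 0.8), objects `a` and `c` are both assigned category 1.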

slide-31
SLIDE 31

Any problem so far?

Oh yes... finding a closed-form solution to this problem requires inverting a huge matrix whose size is the total number of objects and tags

slide-32
SLIDE 32

Any problem so far?

Why not use a smart iterative algorithm instead? I've got an idea: let's differentiate O(f) with respect to the 4 types of vertices and update f by setting each derivative to zero!

slide-33
SLIDE 33

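Toward an Iterative Algorithm

$$\frac{\partial O}{\partial f_s} = 2\alpha (f_s - \hat{f}_s) + 2 \sum_{v \in V_{tag}} w_{sv} (f_s - f_v) = 0 \quad\Rightarrow\quad f_s = \frac{\alpha}{\alpha + \sum_{v \in V_{tag}} w_{sv}} \hat{f}_s + \frac{\sum_{v \in V_{tag}} w_{sv} f_v}{\alpha + \sum_{v \in V_{tag}} w_{sv}} \qquad (3)$$

$$\frac{\partial O}{\partial f_l} = 2\beta (f_l - \hat{f}_l) + 2 \sum_{v \in V_{tag}} w_{lv} (f_l - f_v) = 0 \quad\Rightarrow\quad f_l = \frac{\beta}{\beta + \sum_{v \in V_{tag}} w_{lv}} \hat{f}_l + \frac{\sum_{v \in V_{tag}} w_{lv} f_v}{\beta + \sum_{v \in V_{tag}} w_{lv}} \qquad (4)$$

$$\frac{\partial O}{\partial f_u} = 2\gamma (f_u - \hat{f}_u) + 2 \sum_{v \in V_{tag}} w_{uv} (f_u - f_v) = 0 \quad\Rightarrow\quad f_u = \frac{\gamma}{\gamma + \sum_{v \in V_{tag}} w_{uv}} \hat{f}_u + \frac{\sum_{v \in V_{tag}} w_{uv} f_v}{\gamma + \sum_{v \in V_{tag}} w_{uv}} \qquad (5)$$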
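Equations (3), (4) and (5) share one pattern: an object's new distribution is a weighted average of its prior $\hat{f}$ and the distributions of its tags. A minimal sketch of a single such update follows; the function name `update_object`, the dictionary layout, and the numbers are assumptions for illustration, and the code assumes the denominator is nonzero.

```python
def update_object(f_hat_x, tag_weights, f, trust):
    """One update of an object's class distribution, following (3)-(5).

    tag_weights maps each tag v attached to the object to w_xv;
    trust is alpha, beta or gamma depending on the object's type.
    """
    denom = trust + sum(tag_weights.values())
    k = len(f_hat_x)
    return [
        (trust * f_hat_x[i] + sum(w * f[v][i] for v, w in tag_weights.items())) / denom
        for i in range(k)
    ]

# Hypothetical object labeled as category 0, carrying two tags.
f_tags = {"novel": [0.8, 0.2], "gift": [0.5, 0.5]}
new_f = update_object([1.0, 0.0], {"novel": 3, "gift": 1}, f_tags, trust=4.0)
```

With these numbers the denominator is 4 + 3 + 1 = 8, so the result is [6.9/8, 1.1/8] = [0.8625, 0.1375]. A larger `trust` pulls the result toward the prior, a smaller one toward the tags.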

slide-34
SLIDE 34
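Toward an Iterative Algorithm

$$\frac{\partial O}{\partial f_v} = -2 \sum_{s \in V_S} w_{sv} (f_s - f_v) - 2 \sum_{l \in V_T^l} w_{lv} (f_l - f_v) - 2 \sum_{u \in V_T^u} w_{uv} (f_u - f_v) = 0$$

$$f_v = \frac{\sum_{s \in V_S} w_{sv} f_s + \sum_{l \in V_T^l} w_{lv} f_l + \sum_{u \in V_T^u} w_{uv} f_u}{\sum_{s \in V_S} w_{sv} + \sum_{l \in V_T^l} w_{lv} + \sum_{u \in V_T^u} w_{uv}} \qquad (6)$$

It is easy to show that after each iteration, we still have $\sum_{i=1}^{k} f_u[i] = 1$ for every vertex $u \in V$.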
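Equation (6) says that a tag's distribution is the weighted average of the distributions of the objects it is attached to, regardless of their type. The sketch below illustrates this with hypothetical names (`update_tag`) and numbers, and checks that the result still sums to 1 when every input distribution does.

```python
def update_tag(object_weights, f):
    """One update of a tag's class distribution, following equation (6).

    object_weights maps each object x carrying the tag to w_xv
    (objects of all three groups are treated alike).
    """
    total = sum(object_weights.values())
    k = len(f[next(iter(object_weights))])
    return [sum(w * f[x][i] for x, w in object_weights.items()) / total
            for i in range(k)]

# Hypothetical tag shared by a web page, a book and a CD.
f_objects = {"book": [1.0, 0.0], "page": [0.5, 0.5], "cd": [0.0, 1.0]}
f_tag = update_tag({"book": 2, "page": 1, "cd": 1}, f_objects)
```

Here the tag ends up at [0.625, 0.375]: the book counts twice because it was tagged twice, and the entries still sum to 1 because the update is a convex combination.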

slide-35
SLIDE 35

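Algorithm 1: Iterative Algorithm

Input: category size $k$, class labels $C(x)$ for $x \in V_S \cup V_T^l \cup V_T^u$

Output: class labels $\tilde{C}(x)$ for $x \in V_T^u$

// Initialization

1  foreach $x \in V_S \cup V_T^l \cup V_T^u$ do

2      $\hat{f}_x[C(x)] \leftarrow 1$

3  foreach $x \in V$ do

4      foreach $i \leftarrow 1$ to $k$ do $f_x[i] \leftarrow 1/k$

// Iteration

5  repeat

6      foreach $x \in V_S$ do

7          $f'_x \leftarrow \frac{\alpha}{\alpha + \sum_{v \in V_{tag}} w_{xv}} \hat{f}_x + \frac{\sum_{v \in V_{tag}} w_{xv} f_v}{\alpha + \sum_{v \in V_{tag}} w_{xv}}$

8      foreach $x \in V_T^l$ do $f'_x \leftarrow \frac{\beta}{\beta + \sum_{v \in V_{tag}} w_{xv}} \hat{f}_x + \frac{\sum_{v \in V_{tag}} w_{xv} f_v}{\beta + \sum_{v \in V_{tag}} w_{xv}}$

9      foreach $x \in V_T^u$ do $f'_x \leftarrow \frac{\gamma}{\gamma + \sum_{v \in V_{tag}} w_{xv}} \hat{f}_x + \frac{\sum_{v \in V_{tag}} w_{xv} f_v}{\gamma + \sum_{v \in V_{tag}} w_{xv}}$

10     foreach $x \in V_{tag}$ do $f'_x \leftarrow \frac{\sum_{s \in V_S} w_{sx} f_s + \sum_{l \in V_T^l} w_{lx} f_l + \sum_{u \in V_T^u} w_{ux} f_u}{\sum_{s \in V_S} w_{sx} + \sum_{l \in V_T^l} w_{lx} + \sum_{u \in V_T^u} w_{ux}}$

11     foreach $x \in V$ do $f_x \leftarrow f'_x$

12 until converged

// Get Class Label

13 foreach $x \in V_T^u$ do

14     $\tilde{C}(x) = \arg\max_{1 \le i \le k} \frac{f_x[i]}{\sum_{u \in V_T^l \cup V_T^u} f_u[i]}$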
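To make the control flow concrete, here is a compact sketch of Algorithm 1 in plain Python. It is not the authors' implementation: the function name `classify`, the dictionary-based graph, the convergence test (largest change below `tol`), the treatment of an infinite weight as "keep the prior fixed", and the toy data are all assumptions made for this example. It also assumes every edge weight is positive and every edge refers to a listed object.

```python
import math

def classify(edges, labels, V_S, V_Tl, V_Tu, k,
             alpha, beta, gamma, max_iter=100, tol=1e-6):
    """edges: {(object, tag): weight}; labels: {object: category index}."""
    objects = list(V_S) + list(V_Tl) + list(V_Tu)
    tags = sorted({t for (_, t) in edges})
    trust = {x: alpha for x in V_S}
    trust.update({x: beta for x in V_Tl})
    trust.update({x: gamma for x in V_Tu})

    obj_nbrs = {x: {} for x in objects}
    tag_nbrs = {t: {} for t in tags}
    for (x, t), w in edges.items():
        obj_nbrs[x][t] = w
        tag_nbrs[t][x] = w

    # Lines 1-4: one-hot priors (all zeros if no label), uniform start.
    f_hat = {x: [1.0 if labels.get(x) == i else 0.0 for i in range(k)]
             for x in objects}
    f = {v: [1.0 / k] * k for v in objects + tags}

    for _ in range(max_iter):  # Lines 5-12
        new_f = {}
        for x in objects:  # equations (3)-(5)
            nbrs = obj_nbrs[x]
            denom = trust[x] + sum(nbrs.values())
            if math.isinf(trust[x]):
                new_f[x] = list(f_hat[x])  # limit of the update: prior is kept
            elif denom == 0:
                new_f[x] = list(f[x])      # isolated object with zero weight
            else:
                new_f[x] = [(trust[x] * f_hat[x][i]
                             + sum(w * f[t][i] for t, w in nbrs.items())) / denom
                            for i in range(k)]
        for t in tags:  # equation (6)
            nbrs = tag_nbrs[t]
            total = sum(nbrs.values())
            new_f[t] = [sum(w * f[x][i] for x, w in nbrs.items()) / total
                        for i in range(k)]
        change = max(abs(a - b) for v in f for a, b in zip(new_f[v], f[v]))
        f = new_f
        if change < tol:
            break

    # Lines 13-14: normalize each class by its mass over the type-T objects.
    t_objects = list(V_Tl) + list(V_Tu)
    mass = [sum(f[u][i] for u in t_objects) for i in range(k)]

    def score(x, i):
        return f[x][i] / mass[i] if mass[i] > 0 else 0.0

    return {x: max(range(k), key=lambda i: score(x, i)) for x in V_Tu}

# Toy example: two labeled pages (type S), two unlabeled products (type T).
edges = {("page1", "novel"): 2, ("page2", "jazz"): 2,
         ("prodA", "novel"): 1, ("prodB", "jazz"): 1}
result = classify(edges, {"page1": 0, "page2": 1},
                  V_S=["page1", "page2"], V_Tl=[], V_Tu=["prodA", "prodB"],
                  k=2, alpha=1000.0, beta=float("inf"), gamma=0.0)
```

In the toy example the label of `page1` reaches `prodA` only through the shared tag `novel`, which is the "bridge" effect the slides describe.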

slide-36
SLIDE 36

What’s up

Lines 1 to 4: the vector encoding the prior knowledge and the class distribution of each vertex u ∈ V are initialized

slide-37
SLIDE 37

What’s up

Lines 5 to 12: the class distribution of each vertex u is repeatedly updated according to its neighboring vertices

slide-38
SLIDE 38

What’s up

Lines 5 to 12: if u is not a tag, the class distribution of object u is updated from the class distributions of the associated tags according to (3), (4) and (5)

slide-39
SLIDE 39

What’s up

Lines 5 to 12: if u is a tag, its class distribution is updated according to the class distributions of the connected objects (based on equation (6))

Note how the tags act as a bridge for belief propagation

slide-40
SLIDE 40

Complexity Analysis

  • Lines 1-4 take $O(k|V|)$
  • Lines 6-11 take $O(2k|E|)$ per iteration

⇒ $O(k|V| + iter \cdot k|E|)$ overall, where $k$ is the number of categories and $iter$ is the number of iterations

slide-41
SLIDE 41

Tuning the parameters


α=0,β≠0 and γ=0

slide-42
SLIDE 42

Tuning the parameters


α≠0,β=0 and γ=0

slide-43
SLIDE 43

Tuning the parameters


α≠0,β≠0 and γ≠0

prior knowledge

slide-44
SLIDE 44

Experiments setup

Web product classification (products from Amazon). Web pages collected from ODP are used as an external resource to help classification. Tags for the web pages are collected from delicious.

| ODP:Shopping category | Count | Amazon category | Count |
|---|---|---|---|
| Publications/Books | 558 | Books | 937 |
| Consumer Electronics | 494 | Electronics | 945 |
| Health | 1009 | HealthPersonCare | 747 |
| Home and Garden | 1976 | HomeGarden | 841 |
| Jewelry | 452 | Jewelry | 386 |
| Music | 527 | Music | 944 |
| Office | 77 | OfficeProducts | 695 |
| Pet | 443 | PetSupplies | 628 |

slide-45
SLIDE 45

Experiments setup: the measure

$$\pi_i = \frac{TP_i}{TP_i + FP_i}, \qquad \rho_i = \frac{TP_i}{TP_i + FN_i}$$

$$F_i = \frac{2 \pi_i \rho_i}{\pi_i + \rho_i}, \qquad F(\text{macro-averaged}) = \frac{\sum_{i=1}^{M} F_i}{M}$$

Macro-averaged scores (MacroF1) are influenced by the performance on rare categories

slide-46
SLIDE 46

Experiments setup: the measure

$$\pi = \frac{TP}{TP + FP} = \frac{\sum_{i=1}^{M} TP_i}{\sum_{i=1}^{M} (TP_i + FP_i)}, \qquad \rho = \frac{TP}{TP + FN} = \frac{\sum_{i=1}^{M} TP_i}{\sum_{i=1}^{M} (TP_i + FN_i)}$$

$$F(\text{micro-averaged}) = \frac{2 \pi \rho}{\pi + \rho}$$

Micro-averaged scores (MicroF1) tend to be dominated by the performance on common categories
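The difference between the two averages is easy to see on a small example. The sketch below computes both from per-category (TP, FP, FN) counts; the function names and the counts are made up for illustration.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall (0 when both are 0)."""
    total = precision + recall
    return 2 * precision * recall / total if total > 0 else 0.0

def macro_micro_f1(counts):
    """counts: one (TP_i, FP_i, FN_i) triple per category."""
    per_category = []
    for tp, fp, fn in counts:
        p = tp / (tp + fp) if tp + fp else 0.0
        r = tp / (tp + fn) if tp + fn else 0.0
        per_category.append(f1(p, r))
    macro = sum(per_category) / len(per_category)

    # Micro-averaging pools the counts before computing precision and recall.
    tp_all = sum(c[0] for c in counts)
    fp_all = sum(c[1] for c in counts)
    fn_all = sum(c[2] for c in counts)
    p = tp_all / (tp_all + fp_all) if tp_all + fp_all else 0.0
    r = tp_all / (tp_all + fn_all) if tp_all + fn_all else 0.0
    return macro, f1(p, r)

# One common category classified well, one rare category classified badly.
macro, micro = macro_micro_f1([(90, 10, 10), (1, 4, 4)])
```

The per-category F1 values are 0.9 and 0.2, so MacroF1 is 0.55, while the pooled counts give MicroF1 = 91/105 ≈ 0.867: the rare category drags the macro average down but barely affects the micro average.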

slide-47
SLIDE 47

Experiments setup

The Tag-based classification Model (TM) presented in the paper is compared with SVM (Support Vector Machine) and HG (Harmonic Gaussian field method). Both baselines use either the titles or the tags of the products as features.

slide-48
SLIDE 48

Experiments: overall performance

| Method | MicroF1 (label ratio 1%) | MacroF1 (label ratio 1%) | MicroF1 (label ratio 5%) | MacroF1 (label ratio 5%) |
|---|---|---|---|---|
| SVM+TITLE | 0.4233 | 0.3812 | 0.5967 | 0.6091 |
| SVM+TAG | 0.4045 | 0.4059 | 0.6397 | 0.6435 |
| HG+TITLE | 0.6251 | 0.6038 | 0.6778 | 0.6689 |
| HG+TAG | 0.7174 | 0.7127 | 0.7856 | 0.7859 |
| TM | 0.7870 | 0.7872 | 0.8027 | 0.8030 |

TM parameters: α = 1000, β = ∞, γ = 0.1

slide-49
SLIDE 49

Experiments: Tag vs Title

| p% | SVM+TITLE MicroF1 | SVM+TITLE MacroF1 | SVM+TAG MicroF1 | SVM+TAG MacroF1 | HG+TITLE MicroF1 | HG+TITLE MacroF1 | HG+TAG MicroF1 | HG+TAG MacroF1 | TM MicroF1 | TM MacroF1 |
|---|---|---|---|---|---|---|---|---|---|---|
| 5% | 0.5967 | 0.6091 | 0.6397 | 0.6435 | 0.6778 | 0.6689 | 0.7856 | 0.7859 | 0.7918 | 0.7919 |
| 10% | 0.6700 | 0.6789 | 0.7168 | 0.7334 | 0.6937 | 0.6802 | 0.7915 | 0.7864 | 0.8005 | 0.7996 |
| 15% | 0.7181 | 0.7218 | 0.7417 | 0.7366 | 0.7139 | 0.7049 | 0.7921 | 0.7908 | 0.8187 | 0.8199 |
| 20% | 0.7343 | 0.7399 | 0.7674 | 0.7722 | 0.7152 | 0.7059 | 0.8025 | 0.8004 | 0.8217 | 0.8231 |
| 25% | 0.7545 | 0.7597 | 0.7763 | 0.7780 | 0.7131 | 0.7038 | 0.8109 | 0.8079 | 0.8259 | 0.8273 |

TM parameters: α = 0, β = ∞, γ = 0

slide-50
SLIDE 50

Experiments: 2 domains

| p% | HG+TITLE MicroF1 | HG+TITLE MacroF1 | HG+TAG MicroF1 | HG+TAG MacroF1 | α = 1000 MicroF1 | α = 1000 MacroF1 |
|---|---|---|---|---|---|---|
| | NA | NA | NA | NA | 0.7594 | 0.7606 |
| 1% | 0.6251 | 0.6038 | 0.7174 | 0.7127 | 0.7708 | 0.7719 |
| 2% | 0.6499 | 0.6334 | 0.7510 | 0.7434 | 0.7771 | 0.7766 |
| 3% | 0.6368 | 0.6368 | 0.7695 | 0.7666 | 0.7774 | 0.7769 |
| 4% | 0.6503 | 0.6360 | 0.7566 | 0.7513 | 0.7885 | 0.7891 |
| 5% | 0.6778 | 0.6689 | 0.7856 | 0.7859 | 0.7872 | 0.7866 |

slide-51
SLIDE 51

Experiments: sensitivity of alpha

[Figure: Micro F1 and Macro F1 plotted against alpha, on a logarithmic axis in powers of ten up to 10^5 plus infinity; the F1 axis ranges from 0.753 to 0.762]

slide-52
SLIDE 52

Experiments: prior knowledge

| Method | 5% MicroF1 | 5% MacroF1 | 10% MicroF1 | 10% MacroF1 | 15% MicroF1 | 15% MacroF1 | 20% MicroF1 | 20% MacroF1 | 25% MicroF1 | 25% MacroF1 |
|---|---|---|---|---|---|---|---|---|---|---|
| γ=0 | 0.7918 | 0.7919 | 0.8005 | 0.7996 | 0.8187 | 0.8199 | 0.8217 | 0.8231 | 0.8259 | 0.8273 |
| SVM+TAG | 0.6397 | 0.6435 | 0.7168 | 0.7334 | 0.7417 | 0.7366 | 0.7674 | 0.7722 | 0.7763 | 0.7780 |
| (γ=0.001)+(SVM+TAG) | 0.7938 | 0.7914 | 0.8000 | 0.7987 | 0.8214 | 0.8198 | 0.8229 | 0.8238 | 0.8281 | 0.8295 |
| (γ=0.01)+(SVM+TAG) | 0.7964 | 0.7932 | 0.8013 | 0.8005 | 0.8199 | 0.8184 | 0.8223 | 0.8231 | 0.8292 | 0.8306 |
| (γ=0.1)+(SVM+TAG) | 0.7796 | 0.7673 | 0.8096 | 0.8109 | 0.8251 | 0.8201 | 0.8272 | 0.8277 | 0.8355 | 0.8364 |
| (γ=1)+(SVM+TAG) | 0.6878 | 0.6846 | 0.7704 | 0.7803 | 0.7913 | 0.7843 | 0.8033 | 0.8051 | 0.8165 | 0.8163 |
| HG+TAG | 0.7856 | 0.7859 | 0.7915 | 0.7864 | 0.7921 | 0.7908 | 0.8025 | 0.8004 | 0.8109 | 0.8079 |
| (γ=0.001)+(HG+TAG) | 0.7968 | 0.7973 | 0.8038 | 0.8026 | 0.8214 | 0.8228 | 0.8251 | 0.8263 | 0.8300 | 0.8316 |
| (γ=0.01)+(HG+TAG) | 0.8012 | 0.8028 | 0.8056 | 0.8040 | 0.8222 | 0.8233 | 0.8249 | 0.8261 | 0.8313 | 0.8329 |
| (γ=0.1)+(HG+TAG) | 0.8038 | 0.8043 | 0.8174 | 0.8151 | 0.8233 | 0.8238 | 0.8296 | 0.8301 | 0.8381 | 0.8387 |
| (γ=1)+(HG+TAG) | 0.7950 | 0.7951 | 0.8036 | 0.7982 | 0.8082 | 0.8065 | 0.8206 | 0.8192 | 0.8339 | 0.8308 |

slide-53
SLIDE 53

Experiments: convergence

[Figure: accuracy change (logarithmic axis) plotted against iteration (1 to 11) for Scenario 1, Scenario 2 and Scenario 3]

slide-54
SLIDE 54

Conclusions

Web object classification is an emerging and increasingly important task.

Web object classification can benefit from social tags in three aspects:

  • representing web objects in a meaningful feature space
  • interconnecting objects to indicate implicit relationships
  • bridging heterogeneous objects so that category information can be propagated from one domain to another

The proposed method significantly outperforms state-of-the-art general classification methods.

The model only considers the setting with two types of web objects.

It would be interesting to generalize the model to handle multiple types of objects.