Exploring Social Tagging Graph for Web Object Classification
Zhijun Yin, Rui Li, Qiaozhu Mei, Jiawei Han
Giulia Mialich – 825102 Luca Rossi – 825038 Data e Web Mining – AA 2009/2010
Giulia Mialich - Luca Rossi
More than 1.5 million pieces of content (web links, news stories, blog posts, notes, photos, etc.) are shared on Facebook…daily.
Products Video Photo Research Papers 10^3 - 10^9
Need to classify web objects into semantic categories
Index and organize web objects efficiently
Browse and search the web
Discover interesting patterns from web objects
A challenging task because of the specific characteristics of the data.
Limited text description & difficulty of extracting content features from video/images
“Michael Jordan”
Difficulty of creating a large training set
Limited text description
Rich semantic feature space
Isolated settings of web objects
Labeled examples in some domains
Social tags
Overcome the difficulties
Heterogeneous objects on the Web are tagged by users, with keywords freely chosen from their own vocabulary
LACK OF FEATURES
Users provide enriched semantic features for web object classification
LACK OF INTERCONNECTIONS
New link structure of web objects
LACK OF LABELS
Heterogeneous types of web objects are connected through common tags
This is the first work to explore social tag data for web object classification.
Related problems investigated for a long time:
WEB PAGE CLASSIFICATION: textual features, hyperlinks, HTML & metadata, query logs
MULTIMEDIA OBJECT CLASSIFICATION: text features, contextual information
The authors propose a general theoretical framework for explicitly modeling tagging behavior and the web object classification problem.
Social tags can benefit: web search, information retrieval, semantic web, web page clustering, user interest mining
Optimizing web search using social annotations. In WWW [2007]
Bao et al. observe that social annotations can benefit web search in two aspects:
1. Annotations are good summaries of the corresponding web pages [Amazon’s homepage: shopping, books, amazon, music, store]. Similar or closely related annotations are usually given to the same web pages: SocialSimRank (SSR)
2. The count of annotations indicates the popularity of web pages from the users’ point of view: SocialPageRank (SPR)
Innovative exploration of social tags for web object classification. The authors propose an iterative algorithm which solves the problem efficiently and significantly outperforms state-of-the-art methods that do not use tags as bridges.
Simplifying assumption: 2 types of objects, S and T. S objects are already labeled; T contains both labeled objects and unlabeled objects
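$G = (V, E)$ is the social tagging graph, where $V$ is the set of all the objects (plus tags) and $E$ is the set of edges between an object and its tags
$V = V_S \cup V_T^l \cup V_T^u \cup V_{tag}$: objects of type S, labeled objects of type T, unlabeled objects of type T, and tags
$C = \{c_1, c_2, \ldots, c_k\}$ is the set of all the categories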
Web users are likely to select similar tags for objects belonging to the same semantic category, independent of the object type ⇒ tags can be used as a “bridge” to semantically connect objects
The labels assigned by the classifier should be consistent. This consistency can be captured by the following 3 properties:
1. The category assignment of a vertex in $V_S$ or $V_T^l$ should not deviate much from its original label, as long as we trust the initial labeling
2. The category assignment of a vertex in $V_T^u$ should take into account any prior knowledge
3. The category assignment of any vertex of the graph should be as consistent as possible with its neighbors’ labels
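$f_u$: the class distribution of vertex $u \in V$, a vector of length $k$, where $k$ is the number of categories and $f_u[i]$ is the probability that $u$ belongs to category $i$, s.t. $\sum_{i=1}^{k} f_u[i] = 1$. We denote $\{f_u\}_{u \in V}$ as $f$.
$f_u^*$: the optimal solution of $f_u$
$\hat{f}_u$: for $u \in V_S \cup V_T^l$, $\hat{f}_u$ is the class distribution estimated from the original category labels of vertex $u$. For $u \in V_T^u$, $\hat{f}_u$ is the class distribution estimated from some prior knowledge of the unlabeled object $u$ (e.g., the label assignments by a domain classifier).
$w_{uv}$: the weight of edge $(u, v)$, e.g. the number of times that $v$ is used to tag $u$.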
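$$O(f) = \alpha \sum_{u \in V_S} \|f_u - \hat{f}_u\|^2 + \beta \sum_{u \in V_T^l} \|f_u - \hat{f}_u\|^2 + \gamma \sum_{u \in V_T^u} \|f_u - \hat{f}_u\|^2 + \sum_{(u,v) \in E} w_{uv} \|f_u - f_v\|^2$$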
1. $\sum_{u \in V_S} \|f_u - \hat{f}_u\|^2$ means that the category of a vertex in $V_S$ should not deviate much from its original label(s).
2. $\sum_{u \in V_T^l} \|f_u - \hat{f}_u\|^2$ means that the category of a vertex in $V_T^l$ should keep close to its initial label(s).
3. $\sum_{u \in V_T^u} \|f_u - \hat{f}_u\|^2$ means that the category of a vertex in $V_T^u$ should keep close to the prior knowledge, if any.
4. $\sum_{(u,v) \in E} w_{uv} \|f_u - f_v\|^2$ makes sure that the class distributions of the vertices are smooth over the whole graph, i.e., the class distribution of a vertex is consistent with its neighbors.
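The four terms of O(f) can be evaluated directly on a small graph. The sketch below is only an illustration: the function names `sq_dist` and `objective`, the dictionary layout, and the toy graph are assumptions, not part of the paper.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two class distributions."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def objective(f, f_hat, groups, edges, alpha, beta, gamma):
    """Evaluate O(f).

    f, f_hat: dict vertex id -> list of k class probabilities
    groups:   dict with keys "S", "T_l", "T_u" -> lists of object ids
    edges:    dict (object id, tag id) -> weight w_uv
    """
    weights = {"S": alpha, "T_l": beta, "T_u": gamma}
    # Terms 1-3: stay close to the original labels / prior knowledge.
    fit = sum(
        weights[name] * sq_dist(f[u], f_hat[u])
        for name, members in groups.items()
        for u in members
    )
    # Term 4: neighboring vertices should have similar distributions.
    smooth = sum(w * sq_dist(f[u], f[v]) for (u, v), w in edges.items())
    return fit + smooth


# Hypothetical toy graph: one labeled S object and one unlabeled T object
# share the tag "novel"; there are k = 2 categories.
groups = {"S": ["s1"], "T_l": [], "T_u": ["u1"]}
edges = {("s1", "novel"): 1.0, ("u1", "novel"): 1.0}
f_hat = {"s1": [1.0, 0.0], "u1": [0.0, 0.0]}  # no prior for u1, so gamma = 0

uniform = {x: [0.5, 0.5] for x in ("s1", "u1", "novel")}
agree = {x: [1.0, 0.0] for x in ("s1", "u1", "novel")}

cost_uniform = objective(uniform, f_hat, groups, edges, alpha=2.0, beta=1.0, gamma=0.0)
cost_agree = objective(agree, f_hat, groups, edges, alpha=2.0, beta=1.0, gamma=0.0)
```

In this toy case the uniform assignment pays only the first term (the S object ignores its label), while the assignment in which every vertex agrees with the label of `s1` has zero cost.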
Our target is to find
$$f^* = \arg\min_f O(f)$$
Based on this class distribution we can state that, given an object $o$, its class $c$ is
$$c = \arg\max_c \frac{P(o \mid c)}{P(o)} = \arg\max_c \frac{P(c \mid o)}{P(c)}$$
which, for a vertex $u$, becomes
$$c = \arg\max_{1 \le i \le k} \frac{f_u^*[i]}{\sum_{u' \in V_T^l \cup V_T^u} f_{u'}^*[i]}$$
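The decision rule divides each class probability by the total mass that class receives over all T objects, so a frequent class does not win by default. A minimal sketch, assuming the distributions are stored in a dictionary; the function name `assign_label` and the numbers are hypothetical:

```python
def assign_label(f, u, t_objects):
    """Pick the class i maximizing f[u][i] / sum of f[x][i] over all T objects."""
    k = len(f[u])
    totals = [sum(f[x][i] for x in t_objects) for i in range(k)]
    scores = [f[u][i] / totals[i] if totals[i] > 0 else 0.0 for i in range(k)]
    return max(range(k), key=lambda i: scores[i])


# Hypothetical distributions over k = 2 classes; class 0 dominates overall.
f = {"a": [0.6, 0.4], "b": [0.9, 0.1], "c": [0.8, 0.2]}
t_objects = ["a", "b", "c"]

label_a = assign_label(f, "a", t_objects)
label_b = assign_label(f, "b", t_objects)
```

Object `a` has a larger raw probability for class 0, but relative to how much mass class 0 collects overall it is a better match for class 1; object `b` remains in class 0.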
Oh yes... finding a closed-form solution to this problem requires inverting a huge matrix with the size of all the objects and tags
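Why not use a smart iterative algorithm instead? I’ve got an idea: let’s differentiate $O(f)$ with respect to the 4 types of vertices and update $f$ by setting the differentiated result to zero!
For $s \in V_S$:
$$\frac{\partial O}{\partial f_s} = 2\alpha (f_s - \hat{f}_s) + 2 \sum_{v \in V_{tag}} w_{sv} (f_s - f_v) = 0$$
$$f_s = \frac{\alpha}{\alpha + \sum_{v \in V_{tag}} w_{sv}} \hat{f}_s + \frac{\sum_{v \in V_{tag}} w_{sv} f_v}{\alpha + \sum_{v \in V_{tag}} w_{sv}} \quad (3)$$
For $l \in V_T^l$:
$$\frac{\partial O}{\partial f_l} = 2\beta (f_l - \hat{f}_l) + 2 \sum_{v \in V_{tag}} w_{lv} (f_l - f_v) = 0$$
$$f_l = \frac{\beta}{\beta + \sum_{v \in V_{tag}} w_{lv}} \hat{f}_l + \frac{\sum_{v \in V_{tag}} w_{lv} f_v}{\beta + \sum_{v \in V_{tag}} w_{lv}} \quad (4)$$
For $u \in V_T^u$:
$$\frac{\partial O}{\partial f_u} = 2\gamma (f_u - \hat{f}_u) + 2 \sum_{v \in V_{tag}} w_{uv} (f_u - f_v) = 0$$
$$f_u = \frac{\gamma}{\gamma + \sum_{v \in V_{tag}} w_{uv}} \hat{f}_u + \frac{\sum_{v \in V_{tag}} w_{uv} f_v}{\gamma + \sum_{v \in V_{tag}} w_{uv}} \quad (5)$$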
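The step from the zero-gradient condition to update (3) is a short rearrangement; the same algebra with β or γ in place of α gives (4) and (5). A worked version, assuming the sums run over the tags $v \in V_{tag}$ attached to $s$:

```latex
\begin{align*}
2\alpha (f_s - \hat{f}_s) + 2 \sum_{v \in V_{tag}} w_{sv} (f_s - f_v) &= 0 \\
\Big(\alpha + \sum_{v \in V_{tag}} w_{sv}\Big) f_s &= \alpha \hat{f}_s + \sum_{v \in V_{tag}} w_{sv} f_v \\
f_s &= \frac{\alpha}{\alpha + \sum_{v \in V_{tag}} w_{sv}} \hat{f}_s
     + \frac{\sum_{v \in V_{tag}} w_{sv} f_v}{\alpha + \sum_{v \in V_{tag}} w_{sv}}
\end{align*}
```

The coefficients of $\hat{f}_s$ and of each $f_v$ are non-negative and add up to 1, so the updated $f_s$ is a weighted average of the original label vector and the distributions of the neighboring tags.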
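For $v \in V_{tag}$:
$$\frac{\partial O}{\partial f_v} = -2 \sum_{s \in V_S} w_{sv} (f_s - f_v) - 2 \sum_{l \in V_T^l} w_{lv} (f_l - f_v) - 2 \sum_{u \in V_T^u} w_{uv} (f_u - f_v) = 0$$
$$f_v = \frac{\sum_{s \in V_S} w_{sv} f_s + \sum_{l \in V_T^l} w_{lv} f_l + \sum_{u \in V_T^u} w_{uv} f_u}{\sum_{s \in V_S} w_{sv} + \sum_{l \in V_T^l} w_{lv} + \sum_{u \in V_T^u} w_{uv}} \quad (6)$$
It is easy to show that after each iteration, we still have $\sum_{i=1}^{k} f_u[i] = 1$ for every vertex $u$.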
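Algorithm 1: Iterative Algorithm
Input: category size $k$, class labels $C(x)$ for $x \in V_S \cup V_T^l \cup V_T^u$
Output: class labels $\tilde{C}(x)$ for $x \in V_T^u$
// Initialization
1   foreach $x \in V_S \cup V_T^l \cup V_T^u$ do
2       $\hat{f}_x[C(x)] \leftarrow 1$
3   foreach $x \in V$ do
4       foreach $i \leftarrow 1$ to $k$ do $f_x[i] \leftarrow 1/k$
// Iteration
5   repeat
6       foreach $x \in V_S$ do
7           $f'_x \leftarrow \frac{\alpha}{\alpha + \sum_{v \in V_{tag}} w_{xv}} \hat{f}_x + \frac{\sum_{v \in V_{tag}} w_{xv} f_v}{\alpha + \sum_{v \in V_{tag}} w_{xv}}$
8       foreach $x \in V_T^l$ do $f'_x \leftarrow \frac{\beta}{\beta + \sum_{v \in V_{tag}} w_{xv}} \hat{f}_x + \frac{\sum_{v \in V_{tag}} w_{xv} f_v}{\beta + \sum_{v \in V_{tag}} w_{xv}}$
9       foreach $x \in V_T^u$ do $f'_x \leftarrow \frac{\gamma}{\gamma + \sum_{v \in V_{tag}} w_{xv}} \hat{f}_x + \frac{\sum_{v \in V_{tag}} w_{xv} f_v}{\gamma + \sum_{v \in V_{tag}} w_{xv}}$
10      foreach $x \in V_{tag}$ do $f'_x \leftarrow \frac{\sum_{s \in V_S} w_{sx} f_s + \sum_{l \in V_T^l} w_{lx} f_l + \sum_{u \in V_T^u} w_{ux} f_u}{\sum_{s \in V_S} w_{sx} + \sum_{l \in V_T^l} w_{lx} + \sum_{u \in V_T^u} w_{ux}}$
11      foreach $x \in V$ do $f_x \leftarrow f'_x$
12  until converged
// Get Class Label
13  foreach $x \in V_T^u$ do
14      $\tilde{C}(x) = \arg\max_{1 \le i \le k} \frac{f_x[i]}{\sum_{u \in V_T^l \cup V_T^u} f_u[i]}$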
Lines 1 to 4: The vector encoding the prior knowledge and the class distribution of each vertex u ∈ V are initialized.
Lines 5 to 12: The class distribution of each vertex u is repeatedly updated according to its neighboring vertices.
If u is not a tag, the class distribution of object u is updated from the class distributions of the associated tags according to (3), (4) and (5).
If u is a tag, its class distribution is updated according to the class distributions of the connected objects (based on equation (6)).
Note how the tags act as a bridge for belief propagation.
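The whole procedure fits in a few lines of Python. This is a sketch under stated assumptions rather than the authors' implementation: the function name `classify_with_tags`, the dictionary-based graph, the stopping rule (largest change below `tol`), clamping the label when the weight is infinite, and dropping the fitting term for objects without a label or prior are all choices made for this illustration.

```python
import math


def classify_with_tags(groups, tags, edges, labels, k,
                       alpha, beta, gamma, max_iter=200, tol=1e-9):
    """groups: object id -> "S", "T_l" or "T_u"; tags: list of tag ids;
    edges: (object id, tag id) -> weight; labels: object id -> class index."""
    reg = {"S": alpha, "T_l": beta, "T_u": gamma}

    # Lines 1-2: one-hot vectors for the original labels / prior knowledge.
    f_hat = {x: [0.0] * k for x in groups}
    for x, c in labels.items():
        f_hat[x][c] = 1.0

    # Lines 3-4: every vertex (objects and tags) starts uniform.
    f = {x: [1.0 / k] * k for x in list(groups) + list(tags)}

    # Neighbor weights in both directions (object -> tags, tag -> objects).
    nbrs = {x: {} for x in f}
    for (x, v), w in edges.items():
        nbrs[x][v] = w
        nbrs[v][x] = w

    # Lines 5-12: update every vertex from the previous iteration's values.
    for _ in range(max_iter):
        new_f = {}
        for x in f:
            # Tags and objects without label/prior have no fitting term.
            r = reg[groups[x]] if x in labels else 0.0
            total_w = sum(nbrs[x].values())
            if math.isinf(r):
                new_f[x] = list(f_hat[x])   # infinite weight clamps the label
            elif r + total_w == 0:
                new_f[x] = list(f[x])       # isolated vertex: nothing to use
            else:
                prior = f_hat.get(x, [0.0] * k)
                new_f[x] = [
                    (r * prior[i] + sum(w * f[y][i] for y, w in nbrs[x].items()))
                    / (r + total_w)
                    for i in range(k)
                ]
        change = max(abs(a - b) for x in f for a, b in zip(new_f[x], f[x]))
        f = new_f
        if change < tol:
            break

    # Lines 13-14: normalize by the class mass over all T objects.
    t_objs = [x for x, g in groups.items() if g in ("T_l", "T_u")]
    totals = [sum(f[x][i] for x in t_objs) for i in range(k)]
    predicted = {}
    for x, g in groups.items():
        if g == "T_u":
            predicted[x] = max(
                range(k),
                key=lambda i: f[x][i] / totals[i] if totals[i] > 0 else 0.0,
            )
    return f, predicted


# Hypothetical toy data: two labeled web pages (type S) and two unlabeled
# products (type T) that share tags with them.
groups = {"book_page": "S", "camera_page": "S",
          "product_a": "T_u", "product_b": "T_u"}
tags = ["novel", "lens"]
edges = {("book_page", "novel"): 1.0, ("camera_page", "lens"): 1.0,
         ("product_a", "novel"): 1.0, ("product_b", "lens"): 1.0}
labels = {"book_page": 0, "camera_page": 1}

f, predicted = classify_with_tags(groups, tags, edges, labels, k=2,
                                  alpha=1.0, beta=float("inf"), gamma=0.0)
```

The products carry no labels of their own; the class of each labeled page reaches them only through the shared tag, which is the bridging effect described above.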
Lines 1-4 take O(k |V|)
Lines 6-11 take O(2k |E|) per iteration
⇒ O(k |V| + iter · k |E|) overall, where k is the number of categories and iter is the number of iterations
α = 0, β ≠ 0 and γ = 0
α ≠ 0, β = 0 and γ = 0
α ≠ 0, β ≠ 0 and γ ≠ 0
prior knowledge
ODP:Shopping                     Amazon
Name                   Count     Name               Count
Publications/Books     558       Books              937
Consumer Electronics   494       Electronics        945
Health                 1009      HealthPersonCare   747
Home and Garden        1976      HomeGarden         841
Jewelry                452       Jewelry            386
Music                  527       Music              944
Office                 77        OfficeProducts     695
Pet                    443       PetSupplies        628
Web product classification (products from Amazon). Web pages collected from ODP are used as an external resource to help classification. Tags for the web pages are collected from Delicious.
$$F_i = \frac{2 \pi_i \rho_i}{\pi_i + \rho_i}, \quad F(\text{macro-averaged}) = \frac{\sum_{i=1}^{M} F_i}{M}$$
$$\pi_i = \frac{TP_i}{TP_i + FP_i}, \quad \rho_i = \frac{TP_i}{TP_i + FN_i}$$
Macro-averaged scores (MacroF1) are influenced by the performance in rare categories
$$F(\text{micro-averaged}) = \frac{2 \pi \rho}{\pi + \rho}$$
$$\pi = \frac{TP}{TP + FP} = \frac{\sum_{i=1}^{M} TP_i}{\sum_{i=1}^{M} (TP_i + FP_i)}, \quad \rho = \frac{TP}{TP + FN} = \frac{\sum_{i=1}^{M} TP_i}{\sum_{i=1}^{M} (TP_i + FN_i)}$$
Micro-averaged scores (MicroF1) tend to be dominated by the performance on common categories
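The difference between the two averages shows up clearly on a small example. The sketch below computes both from per-category counts; the function name `macro_micro_f1` and the counts are hypothetical and are not taken from the experiments.

```python
def macro_micro_f1(counts):
    """counts: list of (TP, FP, FN) tuples, one per category."""
    def f1(tp, fp, fn):
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall == 0:
            return 0.0
        return 2 * precision * recall / (precision + recall)

    # Macro: average the per-category F scores.
    macro = sum(f1(*c) for c in counts) / len(counts)
    # Micro: pool the counts over all categories, then compute one F score.
    tp = sum(c[0] for c in counts)
    fp = sum(c[1] for c in counts)
    fn = sum(c[2] for c in counts)
    micro = f1(tp, fp, fn)
    return macro, micro


# One common category classified well, one rare category classified badly.
counts = [(90, 10, 10), (1, 4, 4)]
macro, micro = macro_micro_f1(counts)
```

With these counts the per-category F scores are 0.9 and 0.2, so MacroF1 is 0.55, while MicroF1 is 91/105 (about 0.87) because the pooled counts are dominated by the common category.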
The Tag-based classification Model (TM) presented in the paper is compared with SVM (Support Vector Machine) and HG (Harmonic Gaussian field method). Both are used with the title or the tag of the products as features.
Label Ratio   1%                  5%
Measure       MicroF1   MacroF1   MicroF1   MacroF1
SVM+TITLE     0.4233    0.3812    0.5967    0.6091
SVM+TAG       0.4045    0.4059    0.6397    0.6435
HG+TITLE      0.6251    0.6038    0.6778    0.6689
HG+TAG        0.7174    0.7127    0.7856    0.7859
TM            0.7870    0.7872    0.8027    0.8030
TM parameters: α = 1000, β = ∞, γ = 0.1
p%    SVM+TITLE           SVM+TAG             HG+TITLE            HG+TAG              TM
      MicroF1   MacroF1   MicroF1   MacroF1   MicroF1   MacroF1   MicroF1   MacroF1   MicroF1   MacroF1
5%    0.5967    0.6091    0.6397    0.6435    0.6778    0.6689    0.7856    0.7859    0.7918    0.7919
10%   0.6700    0.6789    0.7168    0.7334    0.6937    0.6802    0.7915    0.7864    0.8005    0.7996
15%   0.7181    0.7218    0.7417    0.7366    0.7139    0.7049    0.7921    0.7908    0.8187    0.8199
20%   0.7343    0.7399    0.7674    0.7722    0.7152    0.7059    0.8025    0.8004    0.8217    0.8231
25%   0.7545    0.7597    0.7763    0.7780    0.7131    0.7038    0.8109    0.8079    0.8259    0.8273
TM parameters: α = 0, β = ∞, γ = 0
p%    HG+TITLE            HG+TAG              α = 1000
      MicroF1   MacroF1   MicroF1   MacroF1   MicroF1   MacroF1
      NA        NA        NA        NA        0.7594    0.7606
1%    0.6251    0.6038    0.7174    0.7127    0.7708    0.7719
2%    0.6499    0.6334    0.7510    0.7434    0.7771    0.7766
3%    0.6368    0.6368    0.7695    0.7666    0.7774    0.7769
4%    0.6503    0.6360    0.7566    0.7513    0.7885    0.7891
5%    0.6778    0.6689    0.7856    0.7859    0.7872    0.7866
[Figure: Micro F1 and Macro F1 (between 0.753 and 0.762) as a function of alpha, on a logarithmic axis from 10^0 to 10^5, plus infinity]
p%                    5%                  10%                 15%                 20%                 25%
Measure               MicroF1   MacroF1   MicroF1   MacroF1   MicroF1   MacroF1   MicroF1   MacroF1   MicroF1   MacroF1
γ=0                   0.7918    0.7919    0.8005    0.7996    0.8187    0.8199    0.8217    0.8231    0.8259    0.8273
SVM+TAG               0.6397    0.6435    0.7168    0.7334    0.7417    0.7366    0.7674    0.7722    0.7763    0.7780
(γ=0.001)+(SVM+TAG)   0.7938    0.7914    0.8000    0.7987    0.8214    0.8198    0.8229    0.8238    0.8281    0.8295
(γ=0.01)+(SVM+TAG)    0.7964    0.7932    0.8013    0.8005    0.8199    0.8184    0.8223    0.8231    0.8292    0.8306
(γ=0.1)+(SVM+TAG)     0.7796    0.7673    0.8096    0.8109    0.8251    0.8201    0.8272    0.8277    0.8355    0.8364
(γ=1)+(SVM+TAG)       0.6878    0.6846    0.7704    0.7803    0.7913    0.7843    0.8033    0.8051    0.8165    0.8163
HG+TAG                0.7856    0.7859    0.7915    0.7864    0.7921    0.7908    0.8025    0.8004    0.8109    0.8079
(γ=0.001)+(HG+TAG)    0.7968    0.7973    0.8038    0.8026    0.8214    0.8228    0.8251    0.8263    0.8300    0.8316
(γ=0.01)+(HG+TAG)     0.8012    0.8028    0.8056    0.8040    0.8222    0.8233    0.8249    0.8261    0.8313    0.8329
(γ=0.1)+(HG+TAG)      0.8038    0.8043    0.8174    0.8151    0.8233    0.8238    0.8296    0.8301    0.8381    0.8387
(γ=1)+(HG+TAG)        0.7950    0.7951    0.8036    0.7982    0.8082    0.8065    0.8206    0.8192    0.8339    0.8308
[Figure: Accuracy Change (logarithmic axis) per Iteration, iterations 1 to 11, for Scenario 1, Scenario 2 and Scenario 3]
Web object classification: an emerging and increasingly important task
Web object classification can take advantage of social tags in three ways:
represent web objects in a meaningful feature space
interconnect objects to indicate implicit relationships
bridge heterogeneous objects so that category information can be propagated from one domain to another
The proposed method significantly outperforms state-of-the-art general classification methods
The model considers only the setting of two types of web objects
It would be interesting to generalize the model to handle multiple types of objects