ACM MM 2010
Dong Liu, Shuicheng Yan, Yong Rui and Hong-Jiang Zhang
Harbin Institute of Technology, National University of Singapore, Microsoft Corporation
Proliferation of images and videos on the Internet
- Sep. 2010: 5 billion (2000 images/minute)
- Sep. 2010: 120 million (20 hours uploaded/minute)
Internet Image Search (timeline by year: 1990, 2000, 2001, 2002)
- 1st Paradigm: Pure Content Based (Query by Example)
- 2nd Paradigm: Direct Text Based (Query by Surrounding Text)
- 3rd Paradigm: Semantic Based (Query by Tag)
Example tags attached to photos:
- medici chapel, Firenze, Italy...
- Loggia dei lanzi, sword, honeymoon, ...
- Statue, building, sky, Italy, ...
- Cathedral, tower, Italy...
- D. Liu, X.-S. Hua and H.-J. Zhang. Content-Based Tag Processing for Internet Social Images: A Survey. Multimedia Tools and Applications.
Tag analysis tasks: Tag Refinement, Tag-to-Region, Auto-Tagging, Tag Ranking
[Figure: example images with tag lists; tags shown include top, 101, tour, tiger, sweet, big, cloud, dog, house, tree, sky, ground, alex, speed, leave]
To discover the relationship between the tags and the underlying semantic regions in the images.
Our Strategy
- Perform tag analysis at the granularity of image regions.
- Propose a new concept of multi-edge graph to
  - model the parallel semantic relationships between the images.
  - propagate the tags from images to regions.
How to solve various tag analysis tasks in a unified framework?
Content-Based Tag Analysis: Auto-Tagging, Tag Refinement, Tag-to-Region
Multi-Edge Graph: A Core Equation
[Figure: multi-edge graph with vertices 1, 2, 3, k, t, ..., n carrying label vectors y1, y2, y3, yk, yt, ..., yn; each edge carries a pair (v, f)]
Annotated terms of the core equation:
- probability that edge e_t is labeled as positive with tag c
- one edge
- all edges between vertex i and j
- labeling information of vertex i with respect to tag c
Step 1: Bag-of-Regions Representation
[Figure: Input Image; Segmentation 1 and Segmentation 2; resulting Bag-of-Regions]
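The bag-of-regions step can be illustrated with a minimal sketch. It assumes that each segmentation is given as a 2D label map and that the bag is simply the pooled set of segments from all segmentations; the slides do not state how the bag is built, and the function names below are hypothetical.

```python
def regions_from_label_map(label_map):
    """Group pixel coordinates by segment label for one segmentation."""
    regions = {}
    for r, row in enumerate(label_map):
        for c, label in enumerate(row):
            regions.setdefault(label, set()).add((r, c))
    return list(regions.values())


def bag_of_regions(label_maps):
    """Pool the segments of several segmentations into one bag (assumed scheme)."""
    bag = []
    for seg_id, label_map in enumerate(label_maps):
        for pixels in regions_from_label_map(label_map):
            bag.append({"segmentation": seg_id, "pixels": pixels})
    return bag


# Two toy segmentations of the same 2x3 image.
segmentation_1 = [[0, 0, 1],
                  [0, 1, 1]]
segmentation_2 = [[0, 0, 0],
                  [1, 1, 1]]
bag = bag_of_regions([segmentation_1, segmentation_2])
```

In practice each region would also carry a feature vector; that part is left out here because the slides do not name the features.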
Step 2: Multi-Edge Graph Construction
Given two images with bag-of-regions representation:
- Edge construction: mutual k-Nearest Neighbor (reliable edge connection)
- Edge affinity calculation (reliable similarity measure)
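A minimal sketch of mutual k-nearest-neighbor edge construction between the regions of two images follows. It assumes regions are described by plain feature vectors, that "mutual" means each region is among the other's k nearest neighbors in the opposite image, and that the affinity is a Gaussian kernel of the squared distance; the kernel, `sigma` and all function names are assumptions, not taken from the slides.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_nearest(query, candidates, k):
    """Indices of the k candidates closest to the query."""
    order = sorted(range(len(candidates)), key=lambda j: sq_dist(query, candidates[j]))
    return set(order[:k])


def mutual_knn_edges(regions_a, regions_b, k=1, sigma=1.0):
    """Connect region i of image A and region j of image B only if each
    is among the other's k nearest neighbors; return (i, j, affinity)."""
    nn_ab = [k_nearest(r, regions_b, k) for r in regions_a]
    nn_ba = [k_nearest(r, regions_a, k) for r in regions_b]
    edges = []
    for i, neighbors in enumerate(nn_ab):
        for j in sorted(neighbors):
            if i in nn_ba[j]:
                # Gaussian affinity is an illustrative choice.
                affinity = math.exp(-sq_dist(regions_a[i], regions_b[j]) / (2 * sigma ** 2))
                edges.append((i, j, affinity))
    return edges


# Toy features: two regions in image A, three in image B.
image_a = [[0.0, 0.0], [5.0, 5.0]]
image_b = [[0.1, 0.0], [5.0, 5.2], [9.0, 9.0]]
edges = mutual_knn_edges(image_a, image_b, k=1)
```

Because several region pairs can pass the mutual test, two images may end up connected by more than one edge, which is what makes the graph multi-edge.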
Two images with the same tag (e.g. one tagged "dog, flower", the other "dog, bird"): at least one edge connecting the two regions corresponding to the tag.
Notations
Model the cross-level tag propagation
Objective Function: loss function and regularization
Solving F directly is computationally challenging, so we turn to an alternating optimization strategy.
Optimize sub-problems with cutting plane
At each iteration, solve for only a subset of the tag confidence vectors, namely those between vertex i and vertex j. This yields a sub-optimization problem. Since the max function is non-smooth, we solve it with the cutting plane method.
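To give a feel for how a cutting plane method handles a non-smooth max function, here is a minimal sketch of the generic Kelley cutting-plane scheme on a one-dimensional toy problem. It is not the sub-problem from the talk: the toy objective, the interval and all names are assumptions chosen only for illustration.

```python
def kelley_1d(f, subgrad, lo, hi, tol=1e-8, max_iter=50):
    """Minimize a convex function on [lo, hi] with Kelley's cutting planes."""
    cuts = []  # each cut is (slope, intercept) of a supporting line of f
    x, best_x, best_f = lo, lo, float("inf")
    for _ in range(max_iter):
        fx, gx = f(x), subgrad(x)
        if fx < best_f:
            best_x, best_f = x, fx
        cuts.append((gx, fx - gx * x))

        def model(z):
            # Piecewise-linear lower bound on f built from all cuts so far.
            return max(a * z + b for a, b in cuts)

        # The model's minimum lies at an endpoint or where two cuts intersect.
        candidates = [lo, hi]
        for i in range(len(cuts)):
            for j in range(i + 1, len(cuts)):
                (a1, b1), (a2, b2) = cuts[i], cuts[j]
                if a1 != a2:
                    z = (b2 - b1) / (a1 - a2)
                    if lo <= z <= hi:
                        candidates.append(z)
        x = min(candidates, key=model)
        if best_f - model(x) <= tol:  # upper bound meets lower bound
            break
    return best_x, best_f


def toy_f(x):
    """Non-smooth convex toy objective: a max of two linear functions."""
    return max(1.0 - x, 0.5 * x)


def toy_subgrad(x):
    """A subgradient of toy_f: slope of the active linear piece."""
    return -1.0 if 1.0 - x >= 0.5 * x else 0.5


x_star, f_star = kelley_1d(toy_f, toy_subgrad, -2.0, 4.0)
```

Each iteration adds one linear cut, so the non-smooth objective is only ever touched through function values and subgradients, which is the property that makes the approach suitable for max-type terms.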
Tag confidence vectors f1, f2, f3:

Tag      f1     f2      f3
dog      0.1    0.2
cat      0.1    0.1     0.2
apple    0.4    0.2     0.4
flower   0.3    0.3     0.2
tree     0.1    0.2     0.1
Top tag  apple  flower  apple
Majority Voting: apple (2 times) > flower (1 time)
apple
By doing so, a series of tag analysis tasks can be performed in a coherent way.
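The majority-voting step can be sketched as follows, using the confidence values from the table above. The `dog` row is left out because one of its values is missing in the source; this does not change the outcome, since `dog` never has the highest confidence among the values that are present. The function name and data layout are assumptions for illustration.

```python
from collections import Counter


def majority_vote(tags, confidence_vectors):
    """Pick the top tag of each confidence vector, then the most frequent top tag."""
    top_tags = []
    for f in confidence_vectors:
        best_index = max(range(len(tags)), key=lambda i: f[i])
        top_tags.append(tags[best_index])
    winner, _ = Counter(top_tags).most_common(1)[0]
    return top_tags, winner


tags = ["cat", "apple", "flower", "tree"]
f1 = [0.1, 0.4, 0.3, 0.1]
f2 = [0.1, 0.2, 0.3, 0.2]
f3 = [0.2, 0.4, 0.2, 0.1]
top_tags, winner = majority_vote(tags, [f1, f2, f3])
# top_tags == ["apple", "flower", "apple"], winner == "apple"
```

Ties are resolved in favor of the tag that appears first among the per-vector winners, which is a property of `Counter.most_common` rather than anything stated in the slides.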
The cutting plane iteration will terminate in a constant number of steps.
The optimization objective is convex, resulting in a globally optimal solution.
In terms of pixel-level accuracy.
MSRC-350 and COREL-100 datasets (benchmarks for the tag-to-region assignment task). Comparison with kNN-1 (k=49), kNN-2 (k=99) and bi-layer sparse coding [1].

Dataset    kNN-1  kNN-2  Bi-layer [1]  M-E Graph
MSRC-350   0.45   0.37   0.63          0.73
COREL-100  0.52   0.44   0.61          0.67

[1] Liu, Cheng, Yan and Chua. Label to region by bi-layer sparsity priors. MM 2009.
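Pixel-level accuracy can be computed as the fraction of pixels whose predicted tag matches the ground-truth tag. The sketch below assumes both labelings are 2D grids of tag names with the same shape; the exact evaluation protocol used in the talk is not given in the slides.

```python
def pixel_accuracy(predicted, ground_truth):
    """Fraction of pixels whose predicted tag equals the ground-truth tag."""
    total = 0
    correct = 0
    for pred_row, true_row in zip(predicted, ground_truth):
        for pred, true in zip(pred_row, true_row):
            total += 1
            correct += int(pred == true)
    return correct / total if total else 0.0


predicted = [["dog", "dog"],
             ["sky", "tree"]]
ground_truth = [["dog", "sky"],
                ["sky", "tree"]]
accuracy = pixel_accuracy(predicted, ground_truth)  # 3 of 4 pixels match: 0.75
```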
In terms of Average F-Score.
On the NUS-WIDE-SUB dataset with 18,325 Flickr images. Comparison with Baseline (initial user-provided tags), CBAR [1] and TRVSC [2].

Method     Baseline  CBAR [1]  TRVSC [2]  M-E Graph
Precision  0.47      0.50      0.52       0.54
Recall     0.49      0.52      0.53       0.57
F-Score    0.44      0.47      0.49       0.53

[1] C. Wang, L. Zhang and H.-J. Zhang. Content-based Image Annotation Refinement. CVPR 2007.
[2] D. Liu, X.-S. Hua and H.-J. Zhang. Retagging Social Images based on Visual and Semantic Consistency. WWW 2010.
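A minimal sketch of averaged precision, recall and F-score follows. It assumes the scores are computed per evaluation unit (per image or per tag; the slides do not say which) and then averaged. Under that assumption, the average F-score need not equal the harmonic mean of the average precision and average recall, which is one possible reason an F-Score entry can sit below both the Precision and Recall entries in the same column.

```python
def precision_recall_f(predicted, relevant):
    """Precision, recall and F-score for one set of predicted tags."""
    predicted, relevant = set(predicted), set(relevant)
    true_positives = len(predicted & relevant)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_score


def average_scores(pairs):
    """Average the three scores over (predicted, relevant) pairs."""
    scores = [precision_recall_f(p, r) for p, r in pairs]
    n = len(scores)
    return tuple(sum(s[i] for s in scores) / n for i in range(3))


# Two toy evaluation units with hypothetical tags.
pairs = [
    (["dog", "tree"], ["dog"]),
    (["sky"], ["sky", "cloud", "sea", "sun"]),
]
avg_precision, avg_recall, avg_f = average_scores(pairs)
harmonic_of_averages = 2 * avg_precision * avg_recall / (avg_precision + avg_recall)
```

On this toy data the averaged F-score is smaller than the harmonic mean of the averaged precision and recall.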
In terms of Average Per-tag Precision and Recall.
MSRC, COREL and NUS-WIDE-SUB datasets. Comparison with the state-of-the-art multi-label auto-tagging methods.
Unified Tag Analysis with Multi-Edge Graph
- Perform tag analysis at the granularity of image regions.
- Model the parallel semantic relationship between the images.
- Realize cross-level tag propagation.