


EE 6882 Visual Search Engine

March 5th, 2012 Lecture #7

  • Relevance Feedback
  • Graph-Based Semi-Supervised Learning
  • Application of Image Matching: Manipulation Detection

What We Have Learned

  • Image Representation and Retrieval Using Global Features
  • Local Features and Image Matching
  • Image Classification


What more can be done?

  • Image Representation and Retrieval
    • User in the Loop: Relevance Feedback
  • Local Features and Image Matching
    • New Features
    • Different Ways of Quantization, Codebook Learning
    • Application: Duplicate Detection
  • Image Classification
    • Machine Learning Techniques
    • Semi-Supervised Learning
    • Multi-Modal Fusion
  • Others:
    • User interfaces

When the User Is in the Loop: Interactive Query Refinement

The refinement loop runs in three stages:

1. Query Formulation
  • Query examples (shot, track, track interval)
  • Classifiers
  • Keywords
2. Query Processing
  • Feature selection
  • Distance metric
  • Ranking model
3. Results

After inspecting the results, the system performs an online update/rerank:
  • Relevance feedback → new classifiers
  • Feature/attribute selection
  • Interaction log
The updated models handle novel data and produce updated results.


Example: Columbia TAG Interactive Image Search System

  • Demo: Rapid Image Annotation with User Interaction

(S.-F. Chang, Columbia U.)

A Very Simple Case: Query Update

  • Automatically update the query point based on user feedback

[Figure: a query point and its retrieval results in a 2-D feature space (f1, f2)]

slide-4
SLIDE 4

3/5/2012 4

User Provides Feedback

[Figure: positive and negative feedback points marked in the (f1, f2) feature space]

Query Update

[Figure: the query point moves toward the positive feedback]

New query vector = mean(positive feedbacks)
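The query-update rule can be sketched in NumPy. The generalized form below is the classic Rocchio update (the function name and the weights alpha/beta/gamma are illustrative assumptions, not from the slides); with alpha = 0, beta = 1, gamma = 0 it reduces to the slide's rule, new query = mean(positive feedbacks).

```python
import numpy as np

def update_query(query, positives, negatives, alpha=1.0, beta=0.75, gamma=0.15):
    """Rocchio-style query update: move the query point toward the mean of
    the positive feedback vectors and away from the mean of the negatives."""
    q = alpha * np.asarray(query, dtype=float)
    if len(positives):
        q += beta * np.mean(np.asarray(positives, dtype=float), axis=0)
    if len(negatives):
        q -= gamma * np.mean(np.asarray(negatives, dtype=float), axis=0)
    return q
```

With only positives and alpha = 0, beta = 1, the result is exactly the mean of the positive feedback vectors.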


Query Expansion

[Figure: multiple query points spawned from the positive feedback in the (f1, f2) space]

Build Multiple Classifiers

[Figure: classifier boundaries separating positive from negative feedback in the (f1, f2) space]
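One simple way to realize query expansion, sketched here under assumed conventions (the function name and the max-fusion choice are illustrative): treat each positive feedback vector as its own sub-query and fuse the per-sub-query similarities.

```python
import numpy as np

def expanded_scores(database, positives):
    """Query expansion: score the database against every positive feedback
    vector as a separate sub-query, then fuse by taking the maximum
    similarity (negative squared Euclidean distance here)."""
    db = np.asarray(database, dtype=float)
    scores = np.full(len(db), -np.inf)
    for p in np.asarray(positives, dtype=float):
        scores = np.maximum(scores, -np.sum((db - p) ** 2, axis=1))
    return scores
```

An item close to any positive example then ranks high, which mimics the multi-classifier picture above more closely than a single averaged query point.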


Graph-based Semi-Supervised Learning

  • Given a small set of labeled data and a large number of unlabeled data in a high-dimensional feature space:
    – Build sparse graphs with local connectivity
    – Propagate information over graphs of large data sets
    – Hopefully robust to noise and scalable to gigantic sets

[Figure: input samples with sparse labels → label propagation on the graph → label inference results (unlabeled / positive / negative)]

Intuition

  • Capture local structures via a sparse graph

[Figure: a linear classifier vs. a nonlinear classifier vs. graph semi-supervised learning through sparse graph construction (e.g., kNN)]
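A minimal kNN sparse-graph construction might look as follows (a dense-distance sketch for small n; the name `knn_graph` is illustrative):

```python
import numpy as np

def knn_graph(X, k=4):
    """Build a symmetric k-nearest-neighbor adjacency matrix (binary).
    Each sample is linked to its k closest samples in Euclidean distance.
    Symmetrizing by the union A or A.T means degrees may exceed k, unlike
    b-matching, which enforces exactly b edges per node."""
    X = np.asarray(X, dtype=float)
    n = len(X)
    d = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    np.fill_diagonal(d, np.inf)          # forbid self-edges
    A = np.zeros((n, n))
    idx = np.argsort(d, axis=1)[:, :k]   # k nearest per row
    A[np.arange(n)[:, None], idx] = 1
    return np.maximum(A, A.T)            # symmetrize
```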


System Pipeline

Image/video data → processing (denoising, cropping, …) → feature extraction → compute similarity → graph construction → label propagation → applications (search, browsing, user interface)

Possible applications: propagating labels in interactive search and automatic re-ranking.

  • Interactive mode: the user interactively browses and labels a few top-rank results, which are then propagated over the graph.
  • Automatic mode: re-ranking over a large set on top of an existing ranking/filtering system, with no predefined category.

Background Review

  • Given a dataset of labeled samples {(x_i, y_i)} and unlabeled samples {x_j}
  • Build an undirected graph with the samples as vertices and edges weighted by sample similarity
  • Define the weight matrix W = [w_ij] and the vertex degree d_i = sum_j w_ij


Example

[Figure: a 5-node graph-based SSL example, with weight matrix W, node degrees D = diag(1, 3, 3, 4, 3), a label matrix Y (classes × samples) containing known labels (1) and unknowns (?), and the predicted soft label matrix F (e.g., scores 0.1, 0.2, 0.9)]

Some Options for Constructing a Sparse Graph

  • Distance threshold
  • k-Nearest Neighbor (kNN) graph
  • b-Matched graph (Huang and Jebara, AISTATS 2007; Jebara, Wang, and Chang, ICML 2009)

Both kNN and b-matching can be posed as maximizing sum_ij B_ij S_ij over a binary edge-selection matrix B (with B_ii = 0), where S is the similarity matrix: kNN constrains each row to sum to k (and symmetrizes afterwards), while b-matching additionally enforces B_ij = B_ji, so every node ends up with exactly b edges.

Several Ways of Constructing Sparse Graphs

[Figure: the same point set connected by a distance threshold, a rank threshold (kNN), and b-matching, for k, b = 4 and k, b = 6]

Examples of Graph Construction

[Figure: a kNN graph (k = 4) vs. a b-matched graph (b = 4)]


Graph Construction – Edge Weighting

  • Binary weighting
  • Gaussian kernel weighting
  • Locally linear reconstruction (LLR) weighting
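The Gaussian (heat) kernel weighting can be sketched as below; sigma is a bandwidth parameter the user must choose, and the function name is illustrative:

```python
import numpy as np

def gaussian_weights(X, A, sigma=1.0):
    """Weight the existing graph edges with a Gaussian (heat) kernel:
    w_ij = exp(-||x_i - x_j||^2 / (2 sigma^2)) where A_ij = 1, else 0."""
    X = np.asarray(X, dtype=float)
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return np.asarray(A, dtype=float) * np.exp(-d2 / (2.0 * sigma ** 2))
```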

Measuring Smoothness: the Graph Laplacian

  • Graph Laplacian: L = D − W; normalized Laplacian: D^{−1/2} (D − W) D^{−1/2}
  • Smoothness of a function f over the graph: f^T L f = (1/2) sum_{i,j} w_ij (f_i − f_j)^2
  • Multi-class: for a prediction matrix F (samples × classes), the smoothness is tr(F^T L F)
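The Laplacian and the smoothness measure translate directly into code; this sketch assumes a dense weight matrix and illustrative function names:

```python
import numpy as np

def laplacian(W):
    """Unnormalized graph Laplacian L = D - W."""
    W = np.asarray(W, dtype=float)
    return np.diag(W.sum(axis=1)) - W

def smoothness(F, W):
    """Graph smoothness tr(F^T L F) = 1/2 * sum_ij w_ij ||F_i - F_j||^2.
    Small values mean F varies little across strongly connected nodes.
    Accepts a 1-D function f or a 2-D multi-class matrix F."""
    F = np.atleast_2d(np.asarray(F, dtype=float).T).T
    return float(np.trace(F.T @ laplacian(W) @ F))
```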


Classical Methods

  • Predict a graph function F via cost optimization
  • Local and Global Consistency, LGC (Zhou et al., NIPS 2004)
  • Gaussian Random Fields, GRF (Zhu et al., ICML 2003)

The common cost combines function smoothness with an empirical loss on the given labels, e.g.

F* = argmin_F tr(F^T L F) + mu * ||F − Y||^2

(Zhu et al., ICML 2003; Zhou et al., NIPS 2004; Joachims, ICML 2003)
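A minimal GRF-style propagation, following the harmonic-function solution of Zhu et al. (ICML 2003): clamp the labeled nodes and solve a linear system for the unlabeled ones. This is a dense sketch for small graphs, and the function name is illustrative.

```python
import numpy as np

def grf_propagate(W, y_l, labeled):
    """Harmonic-function label propagation: fix f on the labeled nodes and
    solve f_u = -L_uu^{-1} L_ul f_l for the unlabeled ones.
    W: (n, n) symmetric weights; y_l: labels (+1/-1) of the labeled nodes;
    labeled: indices of the labeled nodes."""
    W = np.asarray(W, dtype=float)
    n = len(W)
    L = np.diag(W.sum(axis=1)) - W
    u = np.setdiff1d(np.arange(n), labeled)
    f = np.zeros(n)
    f[labeled] = y_l
    f[u] = np.linalg.solve(L[np.ix_(u, u)], -L[np.ix_(u, labeled)] @ f[labeled])
    return f
```

Each unlabeled node's score ends up at the weighted average of its neighbors, which is exactly the minimizer of the smoothness term under the clamping constraint.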

Empirical Observations (Jebara, Wang, and Chang, ICML 2009)

  • Comparing methods, graphs, and weightings: b-matching tends to outperform kNN
  • b-Matching is particularly good for GTAM with locally linear reconstruction (LLR) weights

[Figure: accuracy of GTAM across graph-construction and weighting choices]


Noisy Labels and Other Challenges

  • Unbalanced labels
  • Ill-placed labels
  • Noisy data and labels

[Figure: LGC vs. GRF propagation under these conditions]

Label Unbalance – A Quick Fix

  • Normalize the labels within each class based on node degrees

[Figure: example node-degree matrix and label matrix (classes × samples)]
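The degree-based quick fix can be sketched as follows (illustrative name; the exact normalization used in the papers may differ in detail):

```python
import numpy as np

def degree_normalized_labels(Y, degrees):
    """Quick fix for unbalanced labels: within each class (a column of the
    binary label matrix Y), reweight the labeled nodes by their degrees and
    normalize so that every class injects the same total label mass."""
    Y = np.asarray(Y, dtype=float)
    d = np.asarray(degrees, dtype=float)[:, None]
    V = Y * d
    return V / V.sum(axis=0, keepdims=True)
```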


Dealing with Noisy Labels: Graph Transduction via Alternate Minimization

  • Change the univariate optimization into a bivariate formulation over both the prediction F and the labels Y

(GTAM: Wang, Jebara, & Chang, ICML 2008; LDST: Wang, Jiang, & Chang, CVPR 2009)

Alternate Optimization

  • First, given Y, solve for the continuous-valued F* (gradient descent search)
  • Then, search for the optimal integer Y given F*


Alternate Minimization for Label Tuning

  • Iteratively repeat the procedure, adding or deleting one label per step

[Figure: an example gradient matrix Q over (sample, class) entries; its extreme entries determine which label to add and which to delete]

[Figure: the cost function Q declines over iterations, with vs. without label tuning; snapshots at the initial labels, iteration 2, and iteration 6]

Label Diagnosis and Self-Tuning (LDST: Wang, Jiang, & Chang, CVPR 2009)

  • Add and delete labels by the same greedy criterion


Application: Web Search Reranking

  • Keyword search over web images; take the top-ranked images as pseudo-positives and the bottom-ranked images as pseudo-negatives; run label diagnosis and diffusion; rerank.
  • Example: Google search for "Tiger"
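The reranking pipeline can be sketched end-to-end under simplifying assumptions (binary pseudo-labels from the rank extremes and an LGC-style score diffusion; the names and parameters are illustrative, not the paper's exact algorithm):

```python
import numpy as np

def rerank(W, base_scores, k=3, alpha=0.5, iters=20):
    """Search reranking sketch: take the top-k of the initial ranking as
    pseudo-positives (+1) and the bottom-k as pseudo-negatives (-1), then
    diffuse the scores over a row-normalized graph:
        s <- alpha * P @ s + (1 - alpha) * s0
    Returns the indices of the items in the new (descending) order."""
    W = np.asarray(W, dtype=float)
    order = np.argsort(base_scores)[::-1]
    s0 = np.zeros(len(W))
    s0[order[:k]] = 1.0
    s0[order[-k:]] = -1.0
    P = W / np.maximum(W.sum(axis=1, keepdims=True), 1e-12)
    s = s0.copy()
    for _ in range(iters):
        s = alpha * (P @ s) + (1 - alpha) * s0
    return np.argsort(s)[::-1]
```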


Application: Brain-Machine Interface for Image Retrieval

(joint work with Sajda et al., ACM Multimedia 2009; J. of Neural Engineering, May 2011)

  • Use EEG brain signals to detect targets of interest
  • Use the image graph to tune and propagate information, denoising the unreliable labels obtained from brain-signal decoding
  • No predefined category

The Paradigm

  • Database (any target that may interest users) → neural (EEG) decoder → EEG scores and noisy exemplar labels → graph-based semi-supervised learning over image features → prediction scores

[Figure: pre-triage vs. post-triage result sets]


  • The human inspects only a small sample set via BCI; the machine filters out noise and retrieves targets from a very large DB
  • General: no predefined target models, no keywords
  • High throughput: neuro-vision as a bootstrap for fast computer vision

The Neural Signatures of "Recognition"

  • D. Linden, The Neuroscientist, 2005: the oddball effect

[Figure: ERP traces over time for standard, target (P3b), and novel (P3a) stimuli]


NSF HNCV10

Single-Trial EEG Analysis

  • Typically, EEG is averaged over trials to raise the amplitude of the signal correlated with cortical processes relative to artifacts (the SNR is very low).
  • High-density EEG systems were designed without a principled approach to handling the volume of information provided by simultaneously sampling from large electrode arrays.
  • Our solution: identify neural correlates of individual stimuli via single-trial EEG analysis.
  • We apply principled methods to find optimal ways of combining the information over electrodes and moments in time contained in individual trials.

[Figure: single-trial EEG event-related potentials]


Identifying Discriminative Components in the EEG Using Single-Trial Analysis

  • LDA or logistic regression is used to learn the contributions of EEG signal components at different spatial-temporal locations: optimal spatial filtering across the electrodes within each short window (e.g., 100 ms), followed by optimal temporal filtering over the time windows after stimulus onset.

(Parra, Sajda et al., 2002, 2003)
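The two-level idea can be illustrated with a toy NumPy sketch. This is not the authors' estimator (they use LDA or logistic regression); least-squares fits stand in here purely to show the structure: per-window spatial filtering followed by a temporal combination of window scores.

```python
import numpy as np

def hierarchical_scores(trials, labels, n_windows):
    """Two-level discriminant sketch:
    1) split each trial (electrodes x time) into short windows and fit one
       spatial weight vector per window against the labels;
    2) fit a second-level weighting over the per-window scores.
    trials: (n_trials, n_electrodes, n_samples); labels: +/-1 per trial."""
    X = np.asarray(trials, dtype=float)
    y = np.asarray(labels, dtype=float)
    n, e, t = X.shape
    windows = np.array_split(np.arange(t), n_windows)
    S = np.empty((n, n_windows))
    for k, idx in enumerate(windows):
        Xk = X[:, :, idx].mean(axis=2)                # (n, e) window average
        wk, *_ = np.linalg.lstsq(Xk, y, rcond=None)   # spatial filter
        S[:, k] = Xk @ wk                             # per-window score
    v, *_ = np.linalg.lstsq(S, y, rcond=None)         # temporal combination
    return S @ v
```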


Hierarchical Discriminant Components

  • Use factorization to greatly reduce the number of parameters (100K -> 100)
  • Discover the distribution of discriminative components

Effect of Graph-Based Reranking (BCI Test)

  • Comparing the top (noisy) results of brain EEG signal detection with the top results after graph-based label denoising and propagation, the precision-recall curve is significantly improved.


More Example Results

[Figure: top-20 results of EEG detection vs. top-20 results of the hybrid system (BCI-VPM), for two example targets]

Dependency of the Neuro & CV Components

  • Not every case improves: among 12 cases (4 subjects, 3 targets), 8 cases are clearly improved.
  • When the EEG decoder fails, the hybrid system also fails.

Questions:

  • What EEG accuracy is required for the hybrid system to work?
  • Are some categories more difficult?

[Figure: hybrid-system improvement as a function of EEG accuracy]


Part II: Application of Image Matching

Social Media Tracking

  • Frequent reuse and dissemination of media objects on Web 2.0

Application of Image Matching: Tracking Media Objects on the Web

[Figure: a channel-vs-time view of media outlets (ABC, CNN, MSNBC, FOX, CCTV, TVBS, LBC, Al-Jazeera, Google.cn, Google.com, Yahoo News) reusing the same image over time]


DVMM Lab, Columbia University

Manipulation Correlated with Perspective

  • Example: "Raising the Flag on Iwo Jima" (Joe Rosenthal, 1945) and its anti-Vietnam War derivative (Ronald and Karen Bowen, 1969)

Web Scenario: Search, Cluster, Insights

  • Issue a text topic query (e.g., take the top 1000 results from a web search engine)
  • Find similar images and merge them into clusters
  • Rank the clusters (by size? by original rank?)
  • Explore history and trends

Duplicate Clusters Reveal Image Provenance

  • The biggest clusters contain iconic images; the smallest clusters contain marginal images.

Deep Analysis of Visual Data: Visual Migration Map (VMM)

[Figure: a visual duplicate set and its visual migration map]


Visual Migration Map (VMM)

  • "Most original" images sit at the root; "most divergent" images at the leaves, derived through a series of manipulations
  • Uncover manipulation history
  • Assess originality
  • Correlate with sentiment change

How to Automate VMM Construction? Start with Basic Manipulations

  • Given an image pair, detect possible manipulations
  • Each detected manipulation implies a direction (one image was derived from the other)
  • Other possible manipulations: color correction, multiple compression, sharpening, blurring

[Figure: an original image and its scaled, cropped, grayscale, overlay, and insertion variants]


Image Matching by Local Features

  • Find local interest points in images
  • Match interest points to determine image copies of the same scenes or objects
  • Estimate geometric transforms from the matched points

[Lowe, 1999]
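Descriptor matching with Lowe's ratio test can be sketched as follows (a brute-force NumPy toy; in practice one would use actual SIFT descriptors and an approximate nearest-neighbor index, and the function name is illustrative):

```python
import numpy as np

def match_descriptors(desc_a, desc_b, ratio=0.8):
    """Lowe-style matching: for each descriptor in image A, find its two
    nearest neighbors in image B and accept the match only if the nearest
    distance is clearly smaller than the second (the ratio test)."""
    A = np.asarray(desc_a, dtype=float)
    B = np.asarray(desc_b, dtype=float)
    d = np.sqrt(((A[:, None, :] - B[None, :, :]) ** 2).sum(-1))
    matches = []
    for i, row in enumerate(d):
        j1, j2 = np.argsort(row)[:2]
        if row[j1] < ratio * row[j2]:
            matches.append((i, int(j1)))
    return matches
```

The surviving matches are then fed to a geometric verification step (e.g., RANSAC over a similarity or affine transform) to confirm a duplicate.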

More Challenging: Overlay Detection?

  • Given two images, we can observe that a region differs between the two
  • But how do we know which is the original?

Cropping or Insertion?

  • We can find differences in image area
  • But is the smaller area due to a crop, or is the larger area due to an insertion?

[Figure: a cropped copy, the original, and a copy with an insertion]

Use Context from Many Duplicates

  • Normalize scales and positions across the duplicate set, then compute a denoised value for each pixel to form a "composite" image
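A per-pixel median is one simple choice of "denoised value"; this sketch assumes the copies have already been aligned to a common size and position:

```python
import numpy as np

def composite_image(aligned_copies):
    """Given duplicate copies already normalized to a common scale and
    position, take the per-pixel median across the stack: a robust estimate
    of the 'typical' content that individual overlays or crops deviate from."""
    stack = np.asarray(aligned_copies, dtype=float)
    return np.median(stack, axis=0)
```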


Cropping Detection with Context

  • In cropping, we expect the content outside the crop area to be consistent with the composite image

[Figure: images A and B, their composites, and the residues]

Insertion Detection with Context

  • In insertion, we expect the area outside the shared region to differ from the typical content

[Figure: images A and B, their composites, and the residues]
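The residue test can be sketched as follows; the threshold and the decision rule are illustrative assumptions, not the paper's exact criteria:

```python
import numpy as np

def residue(image, composite, mask):
    """Mean absolute difference from the composite over the region of
    interest (mask = True). For a crop, content just outside the crop
    boundary should match the composite (low residue); an inserted region
    diverges from the typical content (high residue)."""
    diff = np.abs(np.asarray(image, dtype=float) - np.asarray(composite, dtype=float))
    return float(diff[np.asarray(mask, dtype=bool)].mean())

def looks_like_insertion(image, composite, region_mask, thresh=10.0):
    """Flag the differing region as an insertion if it deviates strongly
    from the composite; otherwise the smaller image is plausibly a crop."""
    return residue(image, composite, region_mask) > thresh
```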


Overlay Detection with Context

  • Comparing images against the composite image reveals the portions that differ from typical content
  • An image with divergent content may contain an overlay

[Figure: images A and B, their composites, and the residues]

Evaluation: Manipulation Detection

  • Context-free detectors have near-perfect performance
  • Context-dependent detectors still make errors
  • Consistency checking can further improve the accuracy
  • Are these error-prone results sufficient to build manipulation histories?

[Figure: detection accuracy, context-dependent vs. context-free]


Inferring Direction from Consistency

[Figure: a hypothesized derivation direction ruled out as not plausible]

Manipulation Direction from Consistency

[Figure: a hypothesized derivation direction accepted as plausible]


Deriving Manipulations among Multiple Images

Emerging Migration Map

  • Individual parent-child relationships give rise to a manipulation history
  • Relationships are only plausible (we don't know them for sure)
  • Absences of relationships are more concrete (we can be more certain)
  • Redundancy: plausible derivations from both parents and ancestors of parents


Experiments

  • Select 22 iconic images, mostly of political figures, culled from Google Zeitgeist and TRECVID queries
  • Generate manipulation histories:
    • fully automatic mechanisms
    • pseudo ground truth through manual annotation

Automatic Visual Migration Map

  • "Originals" appear at the source nodes; "manipulated" images at the sink nodes


Evaluation: Automatic Histories

  • High agreement with manually constructed histories
  • Detects edits with a precision of 91% and a recall of 71%

[Figure: automatically constructed vs. manually constructed histories, with deleted and inserted edges marked]


Application: Summarizing Changes

  • Analyze the manipulation-history graph structure to extract the most-original and the most highly manipulated images

Application: Finding Perspective

  • Survey the image type and the corresponding perspective across many examples
  • Find a correlation between heavy manipulation and negative/critical opinion


Application: Finding Perspective (Examples)

  • Myspace profile: "Osama Bin Laden - My Idol of All Time!" (http://www.myspace.com/mamu_potnoi)


  • Daily Excelsior newspaper: "Further Details of Bin Laden Plot Unearthed: ABC Report." (http://www.dailyexcelsior.com/00jan31/inter.htm)
  • Democratic National Committee site: "Capture Osama Bin Laden!" (http://www.democrats.org/page/petition/osama)


  • Joke website: "Every time I get stoned, I go and do something stupid!" / "Osama Bashed Laden" (http://www.almostaproverb.com/captions2.html)

VMM Applications: Geographic/Cultural Dispersion


VMM Applications: Reverse Profiling

References

Graph-Based Relevance Feedback

  • J. Wang, Y.-G. Jiang, and S.-F. Chang. Label Diagnosis through Self Tuning for Web Image Search. CVPR, 2009.
  • T. Jebara, J. Wang, and S.-F. Chang. Graph Construction and b-Matching for Semi-Supervised Learning. ICML, 2009.
  • X. Zhu, Z. Ghahramani, and J. D. Lafferty. Semi-Supervised Learning Using Gaussian Fields and Harmonic Functions. ICML, 2003.
  • D. Zhou, O. Bousquet, T. N. Lal, J. Weston, and B. Scholkopf. Learning with Local and Global Consistency. NIPS, 2004.

Brain-Machine Interface

  • J. Wang, E. Pohlmeyer, B. Hanna, Y.-G. Jiang, P. Sajda, and S.-F. Chang. "Brain State Decoding for Rapid Image Retrieval." ACM Multimedia Conference, 2009.

Web Image Tracking

  • L. Kennedy and S.-F. Chang. "Internet Image Archaeology: Automatically Tracing the Manipulation Histories of Images on the Web." ACM Multimedia 2008, Vancouver, Canada, October 2008.
  • L. Xie, et al. "Tracking Visual Memes in Rich-Media Social Communities." International AAAI Conference on Weblogs and Social Media, 2011.