FXPAL Interactive Search for TRECVID 2004
John Adcock, Matthew Cooper, Andreas Girgensohn, Lynn Wilcox
FX Palo alto Laboratory Inc @ trecvid 2004
Overview
- First time doing search
– 2nd year of participation overall
- Emphasis on interface elements
– Rich visualization of search results
– Quick and easy exploration of results
- Straightforward search engine
– Text search over ASR transcripts
- Literal search with Lucene
- Fuzzy search with LSS
– Keyframe search by image similarity
- Color correlograms
Preprocessing
The unit of retrieval is a “story”, but we don’t have a reference story segmentation for the test set
- Group reference shots into “stories”
– Bootstrap an LSS with common shot boundaries and ASR
– Use similarity-matrix method to find “story” boundaries
- Given new story boundaries
– Generate text indices for stories and shots
– Generate story-based LSS for search
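The slide names a "similarity-matrix method" for finding story boundaries without giving details. The sketch below shows one common variant, assumed here for illustration: compare every pair of shot vectors, then score each candidate boundary with a checkerboard-style contrast between within-segment and cross-segment similarity. The function names, the `half_width` window, and the `threshold` value are all hypothetical, and the shot vectors stand in for whatever LSS representation the system actually used.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def novelty_scores(vectors, half_width=2):
    """Score each candidate boundary i (between shot i-1 and shot i).

    Assumed scoring: similarity inside the windows before and after i,
    minus similarity across the two windows (a checkerboard kernel).
    """
    n = len(vectors)
    sim = [[cosine(u, v) for v in vectors] for u in vectors]
    scores = [0.0] * n
    for i in range(half_width, n - half_width + 1):
        before = range(i - half_width, i)
        after = range(i, i + half_width)
        within = (sum(sim[a][b] for a in before for b in before)
                  + sum(sim[a][b] for a in after for b in after))
        across = 2 * sum(sim[a][b] for a in before for b in after)
        scores[i] = within - across
    return scores

def story_boundaries(vectors, half_width=2, threshold=1.0):
    """Return shot indices where a new story is assumed to start (local novelty peaks)."""
    scores = novelty_scores(vectors, half_width)
    return [i for i in range(1, len(scores) - 1)
            if scores[i] > threshold
            and scores[i] >= scores[i - 1]
            and scores[i] > scores[i + 1]]

# Toy input: three shots about one subject, then three about another.
shots = [[1.0, 0.0]] * 3 + [[0.0, 1.0]] * 3
print(story_boundaries(shots))
```

In practice the window size and threshold would need tuning against real transcripts; the toy vectors only show the mechanics.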
Preprocessing
[Figure: preprocessing pipeline. Labels: ASR; Common Shot Ref; Bootstrap LSS (shots); Similarity Segmentation; Story Segments; LS Index (stories); LS Index (shots); Lucene Index (stories); Lucene Index (shots)]
Search Engine
- User specifies combination of:
– Text query
- Literal query using Lucene or fuzzy query using LSS
– Image examples
- Any keyframe in the interface can be dragged onto the image example area
– Text/image weighting is static and equal
– Max image similarity of shot propagated to story
– Text similarity of story propagated to shot
- Averaged with shot-based text similarity
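The propagation and weighting rules above can be sketched as follows. This is a minimal illustration, assuming text and image similarities are already normalized to the range 0 to 1 and that "averaged" and "equal weighting" both mean a plain arithmetic mean. The function names are hypothetical and not taken from the FXPAL system.

```python
def story_score(story_text, shot_image):
    """Combined score for a story.

    story_text: text similarity of the story to the query.
    shot_image: image similarities of the story's shots; the maximum
                is propagated up to the story.
    """
    image = max(shot_image) if shot_image else 0.0
    return 0.5 * story_text + 0.5 * image  # static, equal text/image weights

def shot_scores(story_text, shot_text, shot_image):
    """Combined score for each shot in a story.

    The story's text similarity is propagated down to every shot and
    averaged with that shot's own text similarity before the equal
    text/image combination.
    """
    combined = []
    for text, image in zip(shot_text, shot_image):
        text_part = 0.5 * (story_text + text)
        combined.append(0.5 * text_part + 0.5 * image)
    return combined

# One story with two shots (made-up similarity values).
print(story_score(0.8, [0.2, 0.6]))
print(shot_scores(0.8, [0.4, 0.0], [0.2, 0.6]))
```

A shot with no matching transcript text of its own can still rank well here, because it inherits half of its text score from the enclosing story.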
Search Engine
[Figure: search engine data flow. Labels: Query text; Searcher option; Lucene Search; LSS Search; Query Images; Image Color Correlogram Search; Combine; Ranked Stories]
Interface Elements
- Stories summarized in keyframe “quads”
- Navigate through stories to video timeline/shots
- Transparent icon overlays
– Visited: grayed
– Relevant: green
– Irrelevant: red
- Query-relevance shown with size and color
- Hotkeys for most actions
- Multi-select and drag and drop
[Figure: annotated interface screenshot. Labeled regions: Text query box; Image query box; TRECVID topic text; Text search type; TRECVID topic images; Query results area; Gray visited overlay; Relevant shots area; Media player and zoom area; Video timeline; Expanded shots area; Excluded overlay; Included overlay; Selected story]
Story Summary Quads
- Query-dependent story summary
– Use the 4 highest-scoring shots in the story
– Allocate space proportional to score
[Figure: example quad. Labels: Story thumbnail; Shot thumbnails]
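The quad construction can be sketched as a top-4 selection followed by a proportional split of the thumbnail area. This is an illustrative sketch only: the slide does not say how the space is divided geometrically, so the code stops at area fractions, and `quad_layout` is a hypothetical name.

```python
def quad_layout(shot_scores, max_shots=4):
    """Pick the highest-scoring shots and give each a share of the quad area.

    shot_scores: dict mapping shot id -> query relevance score.
    Returns a list of (shot_id, area_fraction), best shot first.
    """
    top = sorted(shot_scores.items(), key=lambda kv: kv[1], reverse=True)[:max_shots]
    total = sum(score for _, score in top)
    if total <= 0:
        # No usable scores: fall back to equal shares (an assumption).
        return [(sid, 1.0 / len(top)) for sid, _ in top]
    return [(sid, score / total) for sid, score in top]

# Five shots in a story; only the best four appear in the quad.
scores = {"shot_a": 4.0, "shot_b": 3.0, "shot_c": 2.0, "shot_d": 1.0, "shot_e": 0.5}
for shot_id, fraction in quad_layout(scores):
    print(shot_id, round(fraction, 2))
```

Because the selection depends on query scores, the same story can show different keyframes for different queries, which matches the "query-dependent" description above.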
Building on searches
- Find similar
– Use shot/story text for search
- Add related
– Auto re-query with existing results
Expanded Story / Timeline Browsing
- Selecting a story expands the video at that point
– Clickable video timeline with relevancy shading
– Clickable story quad timeline
– Shot thumbs marked with relevancy
– Overlay on shots marked (non)relevant
– Mouse-overs zoom in the media player and tool-tip shows relevancy context
– Double clicks play video in the media player
Experiments
- 6 searchers answering 12 topics each in a Latin square
– Pairs of orthogonal users grouped together
- Each topic answered 3 times
– Searchers include 2 primary developers
- 1 ended up in best and 1 in worst performing group
- Each of the 3 complete searcher runs goes through 3 “systems” or methods for filling out the shot list, yielding 9 total submissions
System Types
- Type 1: re-issue user queries and weight the results of each query by its precision against the user-labeled shots
- Type 2: take the text from all relevant shots and issue a single new LSS-based text query
- Type 3: take the text from each relevant shot in turn for an LSS-based query and apply query ranking as in system type 1
- Shots marked as not relevant are excluded from system results
- Every system type is preceded by bracketing the user-retrieved shots
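The Type 1 idea, weighting each re-issued query by how well it agreed with the user's judgments, can be sketched as below. Several details are assumptions made for illustration: precision is measured only over results the user actually labeled, weighted scores are summed across queries, and shots already marked relevant are skipped because they are presumed to sit at the top of the list already. All names are hypothetical.

```python
def query_precision(result_ids, relevant, irrelevant):
    """Fraction of a query's user-labeled results that were labeled relevant."""
    labeled = [s for s in result_ids if s in relevant or s in irrelevant]
    if not labeled:
        return 0.0
    return sum(1 for s in labeled if s in relevant) / len(labeled)

def weighted_fill(query_results, relevant, irrelevant):
    """Rank unlabeled shots to fill out the remainder of the shot list.

    query_results: one list of (shot_id, score) per re-issued user query.
    relevant / irrelevant: sets of shot ids labeled by the user.
    """
    totals = {}
    for results in query_results:
        weight = query_precision([sid for sid, _ in results], relevant, irrelevant)
        for sid, score in results:
            if sid in relevant or sid in irrelevant:
                continue  # not-relevant shots are excluded; relevant ones are already listed
            totals[sid] = totals.get(sid, 0.0) + weight * score
    # Highest weighted score first; ties broken by shot id for determinism.
    return sorted(totals, key=lambda s: (-totals[s], s))

queries = [
    [("s1", 1.0), ("s2", 0.9), ("s3", 0.8)],  # found a relevant shot
    [("s4", 1.0), ("s5", 0.9), ("s6", 0.5)],  # found only a rejected shot
]
print(weighted_fill(queries, relevant={"s1"}, irrelevant={"s4"}))
```

Results from the second query sink to the bottom because its only labeled result was rejected, which is the intended effect of precision weighting.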
Submissions
[Figure: how each submission is assembled. Labels: User IDed Shots; Bracketed Shots; System1 (Weighted); System2 (LSA1); System3 (LSA2); joined by “+”]
Results
- Ranks 3-6, 9-13 in overall MAP
– Strongly user dependent (user groups clump together)
– Post-processing methods perform nearly the same
[Figure: bar chart of MAP (axis 0.05 to 0.4) by submission. Labeled FXPal submissions: I_A_1_AL_2_5, I_A_1_AL_1_4, I_A_1_AL_3_6, I_A_1_AL_1_7, I_A_1_AL_2_8, I_A_1_AL_3_9, I_A_1_AL_1_1, I_A_1_AL_2_2, I_A_1_AL_3_3. Legend: FXPal submissions; Other contributors. Annotations: 2, 3, 1]
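For readers unfamiliar with the metric, MAP (mean average precision) can be computed as in the sketch below. It uses the standard non-interpolated definition of average precision; the official TRECVID scoring tool and its result-list cutoff are not reproduced here, and the shot ids are made up.

```python
def average_precision(ranked, relevant):
    """Non-interpolated average precision for one topic.

    ranked: shot ids in submitted order.
    relevant: set of shot ids judged relevant for the topic.
    """
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, shot_id in enumerate(ranked, start=1):
        if shot_id in relevant:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / len(relevant)

def mean_average_precision(topics):
    """Mean of per-topic average precision; topics is a list of (ranked, relevant)."""
    if not topics:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in topics) / len(topics)

# Two toy topics with made-up shot ids.
topics = [
    (["a", "x", "b"], {"a", "b"}),  # AP = (1/1 + 2/3) / 2
    (["y", "c"], {"c"}),            # AP = (1/2) / 1
]
print(mean_average_precision(topics))
```

Because every relevant shot missing from the list contributes zero, filling out the list beyond the user-selected shots can only help the score, which is the motivation for the system types described earlier.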
User vs. System
[Figure: "System Summary" chart of MAP (axis 0.05 to 0.35) by user group (Group 1, Group 2, Group 3). Legend: WEIGHTED; LSA1; LSA2; Bracketed; None]
User vs. System in Overall
[Figure: bar chart of MAP (axis 0.05 to 0.4) by submission. Labeled bars: I_A_1_AL_2_5, I_A_1_AL_1_4, I_A_1_AL_3_6, fxpal_2_bracketed, I_A_1_AL_1_7, I_A_1_AL_2_8, I_A_1_AL_3_9, I_A_1_AL_1_1, I_A_1_AL_2_2, I_A_1_AL_3_3, fxpal_2_users, fxpal_1_bracketed, fxpal_3_bracketed, fxpal_3_users, fxpal_1_users. Legend: With bracketing; User selected only; Complete submission; Other contributors]
Performance by Question
[Figure: MAP (axis 0.1 to 1) per topic. Legend: Overall median; FXPal average; Overall max. Topics: people on steps or stairs; pedestrians and vehicles; bicycles rolling; people moving a stretcher; umbrellas; fingers striking keyboard; buildings on fire; handheld weapon firing; golf ball into the hole; Bill Clinton; tennis player contacting ball; horses in motion; people and dogs; wheelchairs; signs at a protest; zooming in US Capitol dome; Benjamin Netanyahu; buildings with flood waters; Henry Hyde; Saddam Hussein; Sam Donaldson; hockey rink; Boris Yeltsin]
Directions
- More sophisticated:
– Story segmentation
– Image similarity / video features
- Simplify the user interface for non-power-users and for more typical search and re-use tasks
- Handle multiple simultaneous media streams