M4 – WP3 Multimodal integration
Progress report
Viper group, Computer Vision and Multimedia Lab, University of Geneva, 30-01-03
Progress report UniGE: Information retrieval setup / extension, Video data processing, Information
[Diagram: information retrieval setup. Labels: A/V/text input; Segmentation; Event definition; Time codes; URLisation; URLs; SQL DB; Characterisation; Feature definition; Feature files; Keyframes; GIFT indexing; Index file; GIFT; Text; QBE query; Text query; Interface; MRML; Query client]
[Diagram: information retrieval setup, extended. Labels: A/V/text input; Segmentation; Event definition; Time codes; URLisation; URLs; SQL DB; Characterisation; Feature definition; Feature files; Keyframes; GIFT indexing; Index file; GIFT; Text; QBE query; Text query; Audio query; Interface; MRML; Query client]
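The Segmentation, Time codes, URLisation and SQL DB stages named in the diagram could be sketched as follows. This is only an illustration: the table layout, the `urlise` helper, the `?start=…&end=…` URL form and the example.org address are all assumptions, not the schema actually used by the Viper group.

```python
import sqlite3

def create_segment_db(path=":memory:"):
    """Create a hypothetical segment table and return the connection."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE segment ("
        " id INTEGER PRIMARY KEY,"
        " event TEXT NOT NULL,"      # event label produced by segmentation
        " start_s REAL NOT NULL,"    # start time code, in seconds
        " end_s REAL NOT NULL,"      # end time code, in seconds
        " url TEXT NOT NULL)"        # URL addressing this segment
    )
    return conn

def urlise(media_url, start_s, end_s):
    """Turn a media URL plus time codes into one segment URL (assumed form)."""
    return f"{media_url}?start={start_s:.2f}&end={end_s:.2f}"

def add_segment(conn, media_url, event, start_s, end_s):
    """Store one segment together with its URL."""
    conn.execute(
        "INSERT INTO segment (event, start_s, end_s, url) VALUES (?, ?, ?, ?)",
        (event, start_s, end_s, urlise(media_url, start_s, end_s)),
    )

def segments_at(conn, t):
    """Return (event, url) for every segment covering time t (seconds)."""
    rows = conn.execute(
        "SELECT event, url FROM segment"
        " WHERE start_s <= ? AND end_s > ? ORDER BY start_s",
        (t, t),
    )
    return rows.fetchall()

conn = create_segment_db()
add_segment(conn, "http://example.org/meeting1.mpg", "speaker_change", 0.0, 12.5)
add_segment(conn, "http://example.org/meeting1.mpg", "slide_change", 12.5, 40.0)
hits = segments_at(conn, 20.0)
```

Storing the URL next to the time codes means a retrieval result can point straight at the matching stretch of the recording.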
Classical techniques, based on spatio-temporal features (ongoing)
Mixed colour/motion information
Need to be extended to event-based segmentation
Integration of M4 features
Estimation on feature pattern model (motion): Support Vector Regression
Non-linear Prediction of Chaotic Time Series using SVM, NNSP’97 (Mukherjee, Osuna, Girosi)
Predicting Time Series with SVM, ICANN’97 (Müller, Smola, Schölkopf, Vapnik)
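One possible reading of the points above is that a regression model predicts the next motion feature value, and an event boundary is declared where the prediction fails. The sketch below is a deliberately reduced stand-in, not the group's method: a one-weight linear predictor trained with the epsilon-insensitive loss used in Support Vector Regression, by plain batch subgradient descent, without kernels or a bias term. The synthetic "motion activity" signal, the threshold and all parameter values are assumptions.

```python
import math

def fit_svr_1d(prev, nxt, epsilon=0.05, lam=0.01, lr=0.05, iters=300):
    """Fit nxt ~ w * prev by batch subgradient descent on
    0.5*lam*w^2 + mean(max(0, |w*prev - nxt| - epsilon))."""
    w = 0.0
    n = len(prev)
    for _ in range(iters):
        grad = lam * w                    # gradient of the regulariser
        for p, y in zip(prev, nxt):
            err = w * p - y
            if err > epsilon:             # above the epsilon tube
                grad += p / n
            elif err < -epsilon:          # below the epsilon tube
                grad -= p / n
        w -= lr * grad
    return w

def event_boundaries(series, w, threshold):
    """Indices t where the one-step prediction w*series[t-1] misses series[t]."""
    return [t for t in range(1, len(series))
            if abs(w * series[t - 1] - series[t]) > threshold]

# Synthetic signal (assumption): a calm stretch, then a jump in activity level.
calm = [math.sin(2 * math.pi * t / 50) for t in range(200)]
busy = [5.0 + math.sin(2 * math.pi * t / 50) for t in range(200, 300)]
signal = calm + busy

w = fit_svr_1d(calm[:-1], calm[1:])       # train on the calm stretch only
boundaries = event_boundaries(signal, w, threshold=1.0)
```

A kernel SVR over a longer window of past values, as in the two cited papers, would replace `fit_svr_1d`; the boundary test on the prediction residual would stay the same.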
More multimedia
More like an XML protocol (as defined by W3C - XMLP)
Truly multimedia / multimodal
⇒ Spec proposal release mid-Feb
⇒ Expected validation software: this summer
Based on RDF and Dublin Core (XML)
DAML+OIL (OWL) compatible
Makes existing software available (Xerces, Jena, …)
Allows multiple extensions (WordNet, …)
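To make "based on RDF and Dublin Core (XML)" concrete, the sketch below builds a small RDF/XML description carrying Dublin Core elements. The two namespace URIs are the standard RDF and Dublin Core 1.1 ones; the resource URL, the field values and the `describe_resource` helper are hypothetical and do not come from the project.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

def describe_resource(url, **dc_fields):
    """Return an RDF/XML string describing `url` with Dublin Core elements."""
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("dc", DC_NS)
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    desc = ET.SubElement(
        root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": url}
    )
    # Each keyword argument becomes one dc:<name> child element.
    for name, value in dc_fields.items():
        ET.SubElement(desc, f"{{{DC_NS}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")

# Hypothetical meeting recording and metadata values.
rdf_xml = describe_resource(
    "http://example.org/meeting1.mpg",
    title="Project meeting",
    date="2003-01-30",
    format="video/mpeg",
)
```

Because the result is ordinary XML, a generic parser (Xerces) or an RDF toolkit (Jena) can read it without format-specific code.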
m12 (Feb 2003): Technical doc of the working system in place
m24 (Feb 2004): Define an intuitive way to query and retrieve meeting data
m36 (Feb 2005): Technical doc of the meeting manager
Emphasis on multimedia information processing and retrieval
  Image, Video: Visual + Motion
  Audio (speech), Text
  Framework: Architecture, integration
Emphasis on multimodal interaction (query processing)
  Information from text, speech (text?), gesture, ...
  Natural language processing
Emphasis on data summarisation
  Video, dialogs, documents
[Diagram: GIFT / MRML client-server architecture]
Client (talks to the CBIR server over TCP/IP sockets, each time through an MRML layer):
  QBE query formulator (e.g. PHP interface)
  Existing tool with tool plugin (e.g. GIMP plugin)
  Assessor (e.g. Viper evaluation script)
  …
CBIR server (open socket, MRML):
  GIFT with plugins (PluginX, PluginY, …)
  Multimedia feature storage
  MRML logging
  URL abstraction (temporary local copy)
Multimedia data (http server, …)
Online: Queries, Response, Relevance feedback
Offline: Feature extraction (multimedia data to features)
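The online exchange in the diagram (queries and relevance feedback sent as MRML over a socket) could be sketched as below. The element and attribute names (`query-step`, `user-relevance-element-list`, `user-relevance-element`, `image-location`, `user-relevance`, `result-size`) follow my recollection of the MRML specification and should be checked against it. The helper names, the image URLs, and the assumption that the server closes the connection after answering are mine.

```python
import socket
import xml.etree.ElementTree as ET

def build_query_step(session_id, examples, result_size=10):
    """Build an MRML query-by-example message.

    `examples` is a list of (image_url, relevance) pairs, with relevance
    +1 for a positive example and -1 for a negative one.
    """
    mrml = ET.Element("mrml", {"session-id": session_id})
    step = ET.SubElement(
        mrml, "query-step",
        {"session-id": session_id, "result-size": str(result_size)},
    )
    rel_list = ET.SubElement(step, "user-relevance-element-list")
    for url, relevance in examples:
        ET.SubElement(
            rel_list, "user-relevance-element",
            {"image-location": url, "user-relevance": str(relevance)},
        )
    return ET.tostring(mrml, encoding="unicode")

def send_mrml(host, port, message, timeout=10.0):
    """Send one MRML message over TCP and return the raw reply.

    Assumes the server answers and then closes the connection.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(message.encode("utf-8"))
        sock.shutdown(socket.SHUT_WR)     # signal end of request
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")

# Hypothetical feedback round: one relevant and one non-relevant keyframe.
message = build_query_step(
    "s-1",
    [("http://example.org/kf/0001.jpg", 1),
     ("http://example.org/kf/0042.jpg", -1)],
    result_size=20,
)
```

Relevance feedback is then just another `query-step` in the same session, with the user's marked results as the new example list.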