M4 WP3 Multimodal integration Progress report Viper group


  1. M4 – WP3 Multimodal integration: Progress report. Viper group, Computer Vision and Multimedia Lab, University of Geneva, 30-01-03

  2. Progress report
     - UniGE
       - Information retrieval setup / extension
       - Video data processing
       - Information management framework
     - WP3:
       - Issues
       - Status – deliverable

     S. Marchand-Maillet, http://viper.unige.ch/, M4 meeting #3, Sheffield, UK, January 2003

  3. Information retrieval setup (initial)
     [Diagram. Recoverable labels: Event definition; Feature definition; A/V/text input; Feature files; Characterisation; Index file; Segmentation; GIFT indexing; GIFT URLisation; MRML; Keyframes; Text query; QBE query; URLs; Interface; Text; SQL DB; Query client; Time codes]

  4. Information retrieval setup (planned)
     [Diagram. Recoverable labels: Event definition; Feature definition; A/V/text input; Feature files; Characterisation; Index file; Segmentation; GIFT indexing; Keyframes; GIFT URLisation; MRML; Text; Text query; QBE query; Audio query; URLs; Interface; Text; SQL DB; Query client; Time codes]

  5. Video processing (1)
     OVAL: Video Access Library
     - C++ Video Object Model
     - Accepts plugins for specific formats
       - MPEG-1: Dali from Cornell
       - LibDV, « XML » video plugin
     - Provides a generic API
       - Open, Close, GetProp stream
       - GetFrame(s)
       - Specific (MPEG: getMV, getDCT)
     - Does not accommodate image processing functionality
       - Use of Matlab Mex with persistent memory
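
The plugin idea on this slide can be illustrated with a minimal Python sketch. This is not OVAL's C++ API: the class `VideoStream`, the registry `PLUGINS`, the helper `open_video` and the toy `DummyStream` plugin are all hypothetical names chosen for illustration. They only mirror the slide's generic operations (open, close, get a property, get frames) behind a per-format plugin.

```python
from abc import ABC, abstractmethod


class VideoStream(ABC):
    """Format-independent view of a video stream (hypothetical interface)."""

    @abstractmethod
    def open(self, path): ...

    @abstractmethod
    def close(self): ...

    @abstractmethod
    def get_prop(self, name): ...

    @abstractmethod
    def get_frames(self, start, count=1): ...


# Registry mapping a format name to the plugin class that handles it.
PLUGINS = {}


def register(fmt):
    """Class decorator that records a plugin under the given format name."""
    def decorator(cls):
        PLUGINS[fmt] = cls
        return cls
    return decorator


@register("dummy")
class DummyStream(VideoStream):
    """Toy plugin: a synthetic stream of 10 frames; each frame is its index."""

    FRAME_COUNT = 10

    def __init__(self):
        self._path = None
        self._is_open = False

    def open(self, path):
        self._path = path
        self._is_open = True

    def close(self):
        self._is_open = False

    def get_prop(self, name):
        return {"frame_count": self.FRAME_COUNT, "path": self._path}[name]

    def get_frames(self, start, count=1):
        if not self._is_open:
            raise RuntimeError("stream is not open")
        end = min(start + count, self.FRAME_COUNT)
        return list(range(start, end))


def open_video(path, fmt):
    """Instantiate the plugin registered for `fmt` and open `path` with it."""
    stream = PLUGINS[fmt]()
    stream.open(path)
    return stream
```

Calling code only sees `VideoStream`; format-specific extras (the slide mentions motion vectors and DCT coefficients for MPEG) would be additional methods on the corresponding plugin class.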

  6. Video Processing (2)
     - Video segmentation
       - Classical techniques
       - Based on spatio-temporal features (ongoing)
         - Mixed colour/motion information
       - Needs to be extended to event-based segmentation
         - Integration of M4 features
     - Video characterisation
       - Estimation on feature pattern model (motion)
       - Support Vector Regression
         - Non-linear Prediction of Chaotic Time Series using SVM, NNSP'97 (Mukherjee, Osuna, Girosi)
         - Predicting Time Series with SVM, ICANN'97 (Müller, Smola, Schölkopf, Vapnik)

  7. Video Similarity Measure
     $S(V_1, V_2) = 1 - E_{V_1}(V_2)$
     - Problems: $S(V_1, V_1) \neq 0$ and $S(V_1, V_2) \neq S(V_2, V_1)$
     - Artificial symmetrization: $D(V_1, V_2) = 0.5 \, [S(V_1, V_2) + S(V_2, V_1)]$
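
A small Python sketch of the measure and its symmetrization follows. It rests on two assumptions not stated on the slide: that E is the error of a model fitted on the first video when predicting the second, and that each video is reduced to a 1-D feature series. The slides use Support Vector Regression as the predictor; the sketch swaps in a least-squares one-step linear predictor to stay dependency-free. The names `fit_ar1`, `prediction_error`, `score` and `symmetrized` are illustrative.

```python
def fit_ar1(series):
    """Least-squares fit of x[t+1] ~ a * x[t] + b (stand-in for SVR)."""
    xs, ys = series[:-1], series[1:]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        return 0.0, mean_y  # constant series: predict the mean
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = cov / var_x
    return a, mean_y - a * mean_x


def prediction_error(model_series, test_series):
    """Assumed meaning of E: mean squared one-step error of the model
    fitted on model_series when it predicts test_series."""
    a, b = fit_ar1(model_series)
    pairs = list(zip(test_series[:-1], test_series[1:]))
    return sum((a * x + b - y) ** 2 for x, y in pairs) / len(pairs)


def score(v1, v2):
    """S(V1, V2) = 1 - E_{V1}(V2); asymmetric in general."""
    return 1.0 - prediction_error(v1, v2)


def symmetrized(v1, v2):
    """D(V1, V2) = 0.5 * [S(V1, V2) + S(V2, V1)]."""
    return 0.5 * (score(v1, v2) + score(v2, v1))


ramp = [0.0, 0.1, 0.2, 0.3, 0.4]
decay = [0.5, 0.25, 0.125, 0.0625]
print(score(ramp, decay), score(decay, ramp), symmetrized(ramp, decay))
```

On these toy series the two orderings of `score` give different values while `symmetrized` is the same in both orders, and `score(ramp, ramp)` is not zero. These are the two problems the slide lists.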

  8. Video Classification
     - Distance matrix computed with the prediction error $D(V_i, V_j)$
     - For every pair of videos $\langle i, j \rangle$ in the given database, $D_{i,j} = D(V_i, V_j)$
     - Curvilinear Component Analysis is applied to $D$ ⇒ gives a 2-dimensional mapping of the feature space
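
The pairwise matrix described here can be sketched in a few lines of Python. The function `distance_matrix` takes any asymmetric score and averages the two orderings for each pair. `unseen_share` is a made-up toy score standing in for the prediction-error measure, and the Curvilinear Component Analysis step is not shown.

```python
def distance_matrix(items, score):
    """Build the symmetric matrix D[i][j] = 0.5 * (score(i, j) + score(j, i))."""
    n = len(items)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            value = 0.5 * (score(items[i], items[j]) + score(items[j], items[i]))
            d[i][j] = d[j][i] = value
    return d


def unseen_share(a, b):
    """Toy asymmetric score (hypothetical): share of b's distinct values absent from a."""
    b_values = set(b)
    return len(b_values - set(a)) / len(b_values)


shots = [[1, 2, 3], [2, 3], [7]]
for row in distance_matrix(shots, unseen_share):
    print(row)
```

Only the upper triangle is computed and then mirrored, so the matrix is symmetric by construction. A projection method such as CCA would take this matrix as its input.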

  9. Preliminary experiment
     - 29 video shots containing mainly TV news and sports activities

  10. [Figure slide; no text recovered]

  11. Ongoing…
      - Text retrieval
        - Inclusion within GIFT
        - Multimodal embedding (visual + text query)
        - Query expansion (e.g. using WordNet)
      - Event characterisation
        - High-level model
        - Feature-based inference
        - ⇒ Characterisation of well-known events
        - ⇒ Suitable for restricted contexts (M4)
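
Query expansion, mentioned above, can be illustrated with a minimal Python sketch. The slide suggests WordNet as the source of related terms. Here a small hand-written table `SYNONYMS` stands in for it, and both the table and the function `expand_query` are hypothetical.

```python
# Hand-made synonym table; a real system would look these up in a thesaurus.
SYNONYMS = {
    "meeting": ["session", "gathering"],
    "talk": ["presentation", "speech"],
}


def expand_query(query, synonyms=SYNONYMS):
    """Return the query terms followed by their synonyms, without duplicates."""
    terms = []
    for word in query.lower().split():
        for candidate in [word] + synonyms.get(word, []):
            if candidate not in terms:
                terms.append(candidate)
    return terms


print(expand_query("Meeting talk"))
```

The expanded term list would then be passed to the text retrieval engine in place of the original query. Terms without an entry in the table pass through unchanged.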

  12. Information management
      - MRML: going toward version 2.0
        - More multimedia
        - More like an XML protocol (as defined by W3C - XMLP)
        - Truly multimedia / multimodal
        - ⇒ Spec proposal release mid-Feb
        - ⇒ Expected validation software: this summer
      - DEVA (annotation model)
        - Based on RDF and Dublin Core (XML)
        - DAML+OIL (OWL) compatible
        - Makes existing software available (Xerces, Jena, …)
        - Allows multiple extensions (WordNet, …)
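
To give a feel for an annotation "based on RDF and Dublin Core (XML)", the Python sketch below builds a generic RDF/XML description with three Dublin Core elements using the standard library. It is not the DEVA schema, which the slides do not show. The function `annotate`, the example URL and the field values are assumptions made for illustration. Only the two namespace URIs are the standard RDF and Dublin Core ones.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"


def annotate(resource_url, title, creator, description):
    """Return an RDF/XML string describing one resource with Dublin Core fields."""
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("{%s}RDF" % RDF_NS)
    desc = ET.SubElement(
        root,
        "{%s}Description" % RDF_NS,
        {"{%s}about" % RDF_NS: resource_url},
    )
    fields = (("title", title), ("creator", creator), ("description", description))
    for name, value in fields:
        ET.SubElement(desc, "{%s}%s" % (DC_NS, name)).text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical resource and values.
print(annotate("http://example.org/meeting1.mpg", "Meeting 1",
               "Example annotator", "Toy annotation of a recorded meeting"))
```

Because the output is plain RDF/XML, generic XML and RDF tools can parse it, which matches the slide's point about reusing existing software.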

  13. WP3: Initial work plan

  14. WP3: Deliverables
      - D3.1: Report on baseline information access methods
        - m12 (Feb 2003)
        - Technical doc of the working system in place
      - D3.2: Report on methods for multimodal integration and NLP
        - m24 (Feb 2004)
        - Define an intuitive way of querying and retrieving meeting data
      - D3.3: Final report on multimodal information access
        - m36 (Feb 2005)
        - Technical doc of the meeting manager

  15. D3.1
      - Gathered basic information
      - Group-based
      - Template sent by next week
      - Activity-based
      - Description of what you can contribute in one field
      - Response by Feb 20th
      - Fill in where you feel is relevant
      - Edited by end of Feb
      - Smoothed out gaps…
      - Sent to Steve by mid-March

  16. WP3: Issues
      - Visual data is not usable alone
        - Need for text transcripts
        - Use of « external » data
      - Need for a common format for data exchange
        - Annotation (explicit)
        - Processing results
      - Increase collaboration
        - Integration

  17. WP3 breakdown
      - Year 1 (-> 03/2003)
        - Emphasis on multimedia information processing and retrieval
        - Image, Video: Visual + Motion
        - Audio (speech), Text
        - Framework: Architecture, integration
      - Year 2 (-> 03/2004)
        - Emphasis on multimodal interaction (query processing)
        - Information from text, speech (text?), gesture, ...
        - Natural language processing
      - Year 3 (-> 03/2005)
        - Emphasis on data summarisation
        - Video, dialogs, documents

  18. ????

  19. The framework
      [Architecture diagram. Recoverable labels: Client side: QBE query formulator (e.g. PHP interface), existing tool with tool plugin (e.g. GIMP plugin), assessor (e.g. Viper evaluation script), each with an MRML layer and a socket; MRML over TCP/IP, open socket, queries and responses; CBIR server: GIFT, feature extraction, relevance feedback, PluginX, PluginY, … plugins, features; multimedia feature storage; MRML logging; multimedia data URL abstraction; http server; multimedia data, offline / online (temporary local copy)]
