6. CONCLUSION AND FUTURE WORK
Presentation videos play an important role in information sharing and exchange. In this paper, we investigated hierarchical segmentation of presentation videos through visual and text analysis. Specifically, we studied two-level video segmentation: topic-level and slide-level. We introduced Topic Words Introduction (TWI) for text-based segmentation. Experimental results show that TWI can effectively segment a presentation into topically coherent slide blocks, with an average F-score of 0.97. Slide-level segmentation is based on local color histogram difference analysis. To map the text-based segmentation back to the presentation video, image matching between converted slide images and extracted key frames is performed based on image edge analysis. On our data set, the F-scores for slide-level segmentation and image matching are 0.91 and 0.94, respectively.

In this paper, we focused on presentations with a typical structure, as described in Section 2. In the future, we will work on presentations that do not have this structure. We envision that intelligent text analysis integrating advanced techniques in machine learning and artificial intelligence will provide a viable solution to this problem.
No.   slides   Key Frames   w/o combining   w/ combining   P      R      F
1     11       15           14              10             0.93   0.90   0.91
2     47       50           47              44             0.94   0.93   0.93
3     19       19           19              18             1.00   0.94   0.97
Ave.                                                                     0.94
Table 3. Experimental results for image matching
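The F values in Table 3 can be checked against the P and R columns. The sketch below assumes that F is the usual harmonic mean of precision and recall; this section does not state the formula, so that is an assumption, although it agrees with the rounded values in the table. The names `f_score` and `TABLE_3` are illustrative only.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed definition of F)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (P, R, F) as reported in Table 3, one tuple per presentation.
TABLE_3 = [
    (0.93, 0.90, 0.91),
    (0.94, 0.93, 0.93),
    (1.00, 0.94, 0.97),
]

# Recompute F from the reported P and R, rounded to two decimals like the table.
recomputed = [round(f_score(p, r), 2) for p, r, _ in TABLE_3]

# Average of the reported F column, matching the "Ave." row.
average_f = round(sum(f for _, _, f in TABLE_3) / len(TABLE_3), 2)

print(recomputed)
print(average_f)
```

Under this assumption, the recomputed values agree with the reported F column, and the mean of that column rounds to the 0.94 quoted for image matching.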