MMDB-1 J. Teuhola 2012 1
Multimedia Databases
Storage and Retrieval of Media Data
Jukka Teuhola
Dept. of Information Technology, University of Turku
Fall 2012
General course info
http://staff.cs.utu.fi/kurssit/multimedia_databases/autumn_2012/
Lectures: 28 h, Tue 14-16, Fri 10-12, in classroom B2033
Homework: 6 times; every solution gives a bonus of 1 point to the final score in the examination.
Examination:
5 tasks, max 8 points each, so the total score is 0-40 points
Minimum accepted = 20 points, giving grade 1
Linear interpolation from 20-40 points to grades 1-5, formula: grade = 1 + (points + bonus - 20) / 5
Preliminary knowledge:
Databases; data structures and algorithms
PowerPoint slides: <course homepage>/slides
Optional reading:
Retrieval, Springer 2007.
Approach, Addison-Wesley, 2003.
Application to GIS, Morgan-Kaufmann, 2002.
Springer 2008.
Miscellaneous articles.
Storage principles
Data representation
Queries, searching, content-based retrieval
Indexing
Usage of software products
MM authoring and content production
MM presentation
What is multimedia? A dataset or document containing at least two different media types.
Multimedia and imaging are continuously growing trends.
Enhanced quality and quantity of information, compared to plain text
Brings dramatic improvements to human-computer interaction
Rich and expressive way of representing, browsing and interacting with information
“Second information revolution”
Revolutionizes business, science, engineering, manufacturing, art, entertainment…
Crucial issues:
Size can be huge
Speed required to satisfy audio/video transmission rates
Semantics: both type- and instance-level metadata
(1) MM files and archives:
Simple browsing and retrieval
No queries
Supporting software: e.g. web/media server, browser, player
(2) Annotated & indexed archives:
Search by keyword; see e.g.
Image archive: Gimp-Savvy
Audio archive: Spotify
Spatial db: Google Maps
Video archive: YouTube
(3) MM archive as part of a wider application:
Browsing/search by keyword, plus related actions, e.g.
Web shop: Amazon
(4) ’True’ MM databases:
General queries by media content, e.g.
Painting search: Hermitage
Melody search: Musipedia
(a) Text
Integrated into most multimedia applications; complements (as metadata) non-textual forms of data.
Nowadays text is usually structured/formatted by markup (e.g. XML)
Visual variability through fonts and layout
The most space-efficient data type to store
(b) Audio
Increasingly popular data type
Different formats (WAV, CD, MP3, AU, AIFF, QT, RA, WMA, Vorbis)
Digitized audio is rather space-consuming (tens of Kbytes per second)
Compression is needed (e.g. MP3 compression ratio 12:1)
More compact: synthetic music in MIDI format (Musical Instrument Digital Interface); MPEG-4 SA (Structured Audio)
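The audio figures above can be checked with a little arithmetic. The sketch below computes the raw data rate of uncompressed PCM audio and applies the 12:1 MP3 ratio quoted on the slide. The sampling parameters (22.05 kHz mono and 44.1 kHz stereo, both 16-bit) and the function names are assumptions chosen for illustration; they do not come from the slides.

```python
def pcm_bytes_per_second(sample_rate_hz: int, bits_per_sample: int, channels: int) -> int:
    """Raw (uncompressed) PCM data rate in bytes per second."""
    return sample_rate_hz * (bits_per_sample // 8) * channels


def compressed_rate(raw_bytes_per_second: float, ratio: float) -> float:
    """Data rate after compression with the given ratio (e.g. 12.0 for 12:1)."""
    return raw_bytes_per_second / ratio


# Assumed example streams (illustrative parameters, not from the slides).
mono_rate = pcm_bytes_per_second(22_050, 16, 1)   # 44,100 bytes/s
cd_rate = pcm_bytes_per_second(44_100, 16, 2)     # 176,400 bytes/s

# Apply the 12:1 MP3 ratio mentioned on the slide to the stereo stream.
cd_mp3_rate = compressed_rate(cd_rate, 12.0)      # 14,700.0 bytes/s

print(mono_rate, cd_rate, cd_mp3_rate)
```

Under these assumptions a mono stream falls in the "tens of Kbytes per second" range, while CD-quality stereo needs roughly 176 Kbytes per second before compression.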
(c) Still raster images
Black-and-white / grey-scale / color
One high-resolution image may take several megabytes
Large number of image formats (GIF, TIFF, JPEG, JP2, PNG, …)
Lossy compression ratio (e.g. for JPEG) normally about 10:1
(d) Vector graphics
2D or 3D drawings, models, maps
Rather space-efficient; consists of larger objects than pixels
Parameters of (meta) objects: scaling, orientation, rotation, etc.
Applications: CAD (computer-aided design), GIS (geographic information systems), animations, computer games
(e) Integrated documents (text & images)
Can be generated by today’s text processing programs
(f) Digital video
resemble each other).
20-30 frames per second.
FLV (Flash video), WebM (Google)
(g) General integrated multimedia/hypermedia presentations
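To get a feeling for why raster images and video dominate storage needs, the sketch below estimates raw sizes from pixel dimensions and frame rate. The resolutions, the 3 bytes per pixel, and the 25 frames per second (inside the 20-30 range given above) are assumed values for illustration, and the function names are hypothetical.

```python
def raw_frame_bytes(width: int, height: int, bytes_per_pixel: int = 3) -> int:
    """Size of one uncompressed raster frame (24-bit color by default)."""
    return width * height * bytes_per_pixel


def raw_video_bytes_per_second(width: int, height: int, fps: int,
                               bytes_per_pixel: int = 3) -> int:
    """Uncompressed video data rate: one full frame per displayed frame."""
    return raw_frame_bytes(width, height, bytes_per_pixel) * fps


# Assumed still image: 2000 x 1500 pixels, 24-bit color.
still_raw = raw_frame_bytes(2000, 1500)        # 9,000,000 bytes
still_lossy = still_raw / 10                   # about 10:1 lossy compression

# Assumed video: 640 x 480 pixels at 25 frames per second.
video_raw = raw_video_bytes_per_second(640, 480, 25)   # 23,040,000 bytes/s

print(still_raw, still_lossy, video_raw)
```

Even this modest assumed video format needs about 23 Mbytes per second uncompressed, which is why video is rarely stored or transmitted in raw form.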
(a) Educational multimedia services:
Distance learning
Teaching material
Educational audio/video document archives
Preview possibility
(b) Video-on-demand:
Selection of movie, possibly using queries
Preview possibility; wind/rewind
Requires high bandwidth
Method of payment must be simple
(c) Audio-on-demand:
Less bandwidth-consuming than video
Recorded programs, music, and live net radio stations
(d) Electronic commerce:
Online info about products: pictures, explanations, availability, etc.
Possibility to make queries
Online ordering systems with credit card / net bank payment.
Examples: bookstore, travel agency
(e) Intelligent systems (‘expert systems’):
Machine repair: Automatic assistants for different repair jobs. Manuals may be hard to read; demonstrative videos tell it better
Medical care: Standard surgical operations
Crime investigations: combination of surveillance & other info
(f) Digital libraries
Organized collections of digital information
Both documents and their metadata in digital form
Versatile metadata- and content-based retrieval opportunities
Usually accessible through the web
The web itself & search engines may be considered some kind of (poorly organized) digital library
(g) Medical information systems
Patient data, including X-rays, EKG curves, MRI images, ...
Strict confidentiality
Used for diagnosis, monitoring and research
Automated tools: image/signal processing, pattern recognition, ...
All multimedia applications share some common aspects and functions.
The goal of this course is to find the domain-independent set of “core algorithms” which can be used in many applications by varying a few parameters.
A generalized multimedia DBMS (MMDBMS) would be useful; probably as an extension to a standard DBMS.
Hardware components: High-speed processors (CPU, GPU), high-performance multimedia workstations, scanners, digital cameras, video cameras, high-resolution monitors, touch-screen monitors, high-precision printers and plotters.
High-bandwidth networks (WAN, LAN, mobile), fiber optics, network standards
High-capacity storage devices: hard disks, optical disks and jukeboxes, solid-state & non-volatile memories.
Image/video processing software: Compression (JPEG, MPEG), analysis, filtering, segmenting, feature extraction.
CAD and animation software: 2D and 3D graphics, applications in science, engineering, medicine, computer games, etc.
Pattern recognition (characters, shapes, etc.): e.g. neural networks
Advanced software systems: OO languages, OO databases,
(1) Supports the main types of MM data
(2) Can handle a very large number of MM objects
(3) Supports high-performance, high-capacity storage management: hierarchical storage (on-line, near-line, off-line)
(4) Offers DB capabilities: persistence, transactions, concurrency control, recovery from failures, querying with high-level declarative constructs, versioning, integrity constraints, security.
(5) Information-retrieval capabilities: exact-match retrieval, probabilistic (best-match) retrieval, content-based retrieval, ranking of results
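Item (5) contrasts exact-match retrieval with best-match retrieval and ranking. A minimal sketch of the best-match idea follows, assuming each media object has already been reduced to a numeric feature vector. It ranks objects by cosine similarity, which is a simple vector-space stand-in and not the probabilistic model the slide mentions; the object ids, vectors, and function names are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


def best_match(query, collection, k=3):
    """Return the k most similar objects as (object_id, score), best first."""
    scored = [(cosine_similarity(query, vec), obj_id)
              for obj_id, vec in collection.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(obj_id, score) for score, obj_id in scored[:k]]


# Hypothetical 3-dimensional feature vectors (e.g. coarse color features).
collection = {
    "img_sunset": [0.9, 0.4, 0.1],
    "img_forest": [0.1, 0.8, 0.2],
    "img_ocean": [0.1, 0.3, 0.9],
}

ranking = best_match([1.0, 0.5, 0.0], collection, k=2)
print(ranking)
```

Unlike an exact-match query, every object receives a score, so the result is an ordered list that can be cut off at any length.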
Interactive querying
Relevance feedback
Query refinement
Automatic feature extraction and indexing
Content- and context-based indexing of different media
Single- and multidimensional indexing
Clustering of media data
Storage organization for large media objects
Optimization of multimedia queries
Replication, parallelism, distribution, scalability
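Relevance feedback and query refinement from the list above can be illustrated with a Rocchio-style update, a classical information-retrieval technique. The slides do not name a specific method, so this choice, the weights alpha, beta and gamma, and the function name are assumptions made for the sketch.

```python
def rocchio_refine(query, relevant, non_relevant,
                   alpha=1.0, beta=0.75, gamma=0.15):
    """Move the query vector toward relevant results and away from others.

    query: list of floats; relevant / non_relevant: lists of such vectors
    that the user marked after inspecting a first result list.
    """
    dims = len(query)

    def centroid(vectors):
        # Mean vector; an empty feedback set contributes nothing.
        if not vectors:
            return [0.0] * dims
        return [sum(v[i] for v in vectors) / len(vectors) for i in range(dims)]

    rel_c = centroid(relevant)
    non_c = centroid(non_relevant)
    refined = [alpha * query[i] + beta * rel_c[i] - gamma * non_c[i]
               for i in range(dims)]
    # Negative weights are clipped to zero, a common practical choice.
    return [max(0.0, w) for w in refined]


# The user marked two results as relevant and one as non-relevant.
refined = rocchio_refine(
    query=[1.0, 0.0],
    relevant=[[0.0, 1.0], [0.0, 1.0]],
    non_relevant=[[1.0, 0.0]],
    alpha=1.0, beta=0.5, gamma=0.5,
)
print(refined)
```

The refined vector would then be used to rerun the search, which is one way to realize the interactive query loop listed above.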
Relational or extended relational DBMS, with support for large
Information retrieval module (content-based access of objects)
’NoSQL’ databases (’Not only SQL’)
Improved retrieval speed for very large quantities of data.
Restricted update types (mainly append), restricted transaction support (relaxed consistency requirements)
Extensible database system with OO capabilities
Support for queries and transactions involving MM objects
Support for complex objects with MM subobjects
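As a rough illustration of the first alternative, the sketch below stores media objects in a relational table, with the binary content in a BLOB column and textual metadata alongside it for keyword search. It uses Python's standard sqlite3 module with an in-memory database; the table layout, column names, and helper functions are assumptions made for this example, not a design taken from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE media_object (
        id         INTEGER PRIMARY KEY,
        media_type TEXT NOT NULL,   -- e.g. 'image', 'audio', 'video'
        title      TEXT NOT NULL,
        keywords   TEXT NOT NULL,   -- simple annotation for keyword search
        content    BLOB NOT NULL    -- the media data itself
    )
    """
)


def store_object(conn, media_type, title, keywords, content):
    """Insert one media object inside a transaction and return its id."""
    with conn:  # commits on success, rolls back on an exception
        cur = conn.execute(
            "INSERT INTO media_object (media_type, title, keywords, content) "
            "VALUES (?, ?, ?, ?)",
            (media_type, title, keywords, content),
        )
    return cur.lastrowid


def find_by_keyword(conn, keyword):
    """Return (id, title, size in bytes) of objects whose keywords match."""
    rows = conn.execute(
        "SELECT id, title, length(content) FROM media_object "
        "WHERE keywords LIKE ? ORDER BY id",
        (f"%{keyword}%",),
    )
    return rows.fetchall()


# Placeholder bytes stand in for real image and audio files.
first_id = store_object(conn, "image", "Sunset", "sunset sea evening", b"\x00" * 1000)
store_object(conn, "audio", "Waves", "sea waves ambient", b"\x01" * 500)

sea_hits = find_by_keyword(conn, "sea")
print(sea_hits)
```

This covers only metadata-based search; content-based access would need the separate information retrieval module listed above.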