
Music IR – Music? Music - Sound



  1. Music Information Retrieval
Andreas Rauber
Department of Software Technology and Interactive Systems
Vienna University of Technology
http://www.ifs.tuwien.ac.at/~andi
http://www.ifs.tuwien.ac.at/mir

Lead-in: Who am I?
• Vienna University of Technology http://www.tuwien.ac.at
  – Faculty of Computer Science http://www.cs.tuwien.ac.at
  – Department of Software Technology and Interactive Systems http://www.isis.tuwien.ac.at
  – Software and Information Engineering Group http://www.ifs.tuwien.ac.at
  – Andreas Rauber http://www.ifs.tuwien.ac.at/~andi
• Machine Learning, Neural Networks
• Text Mining, Digital Libraries
• Music Retrieval
• Digital Preservation

Lead-in: Activities
• Audio Feature Extraction
• Music Classification
• PlaySOM: Organisation of Music Archives
• PocketSOM: Browsing Music on Mobile Devices
• 3D Worlds for Music
• Audio Segmentation
• Chord Detection
• Blind Source Separation
• Text and Music (Lyrics, Bio, ...)

Lead-in: Who else is MIR@ifs?
• Thomas Lidy
• Robert Neumayer
• Rudolf Mayer
• Jakob Frank
• Other members: Veronika Zenz, Peter Hlavac, Ewald Peiszer, Andreas Scharf, Andrei Grecu & others
• Former members: Markus Frühwirth, Elias Pampalk, Stefan Leitich, David Laister, Doris Baum & others

Chorus
• Lead-in
• Chorus
• Verse 1: Music-IR
• Verse 2: Audio Features
• Verse 3: Classification and Benchmarking
• Verse 4: Clustering & Browsing
• Verse 5: Some other applications
• Fade-out

Music IR – Music?: What is „Music“?
• Music, of course!
  – Audio: wav, au, mp3, ...
  – Symbolic: MIDI, mod, ...
  – Scores: Scan, MusicXML
• Text
  – Song lyrics
  – Artist Biographies
  – Websites: Fanpages, Album Reviews, Genre descriptions
• Community data
  – Playlists
  – Market basket
  – Band evolution
• Video/Images
  – Album covers
  – Music videos
(Images: www.samplesmith.com, www.westminster.gov.uk)

  2. Music IR – Music?: Music - Sound
• Sound as acoustic wave
• Characterized by the properties of waves (frequency/wavelength, amplitude)
• Frequency: pitch
  – Humans can hear approx. 20 Hz-20 kHz
  – speech: 200 Hz-8 kHz
• Amplitude: Loudness
  – measured as pressure in micropascal (µPa)
  – hearing threshold: approx. 20 µPa
  – logarithmic decibel scale

Music IR – Music?: Music - Sound - Loudness
(from: http://www.phys.unsw.edu.au/jw/hearing.html)

Source of sound                            Sound pressure (pascal)   Sound pressure level (dB re 20 µPa)
immediate soft tissue damage               50000                     approx. 185
threshold of pain                          100                       134
hearing damage during short-term effect    20                        approx. 120
jet engine, 100 m distant                  6–200                     110–140
jack hammer, 1 m distant / discotheque     2                         approx. 100
hearing damage during long-term effect     0.6                       approx. 85
major road, 10 m distant                   0.2–0.6                   80–90
passenger car, 10 m distant                0.02–0.2                  60–80
TV set at home level, 1 m distant          0.02                      ca. 60
normal talking, 1 m distant                0.002–0.02                40–60
very calm room                             0.0002–0.0006             20–30
leaves noise, calm breathing               0.00006                   10
auditory threshold at 2 kHz                0.00002                   0

Music IR – Music?: Music - Sound
• Different file formats for storing sound:
  – lossless formats
    • WAV (may hold compressed audio, but usually lossless PCM)
    • FLAC, Shorten, Monkey's Audio, ATRAC Advanced Lossless, Apple Lossless, WMA Lossless, TTA
  – lossy formats
    • MP3
    • ATRAC
    • AAC
    • Ogg Vorbis
    • WMA
    • ...

Music IR – Music?: Music - Sound
• Nyquist sampling theorem: Exact reconstruction of a continuous-time baseband signal from its samples is possible if the signal is bandlimited and the sampling frequency is greater than twice the signal bandwidth.
• Half the sampling frequency is the Nyquist frequency, i.e. a signal with a given maximum frequency must be sampled at more than twice that frequency for reconstruction.
• More on sound, sound pressure, hearing thresholds, etc. later when we talk about feature extraction from sound.

Music IR – Music?: Music - Sound - PCM
• PCM: Pulse Code Modulation
• Digital representation of an analog signal where the magnitude of the signal is sampled regularly at uniform intervals, then quantized to a series of symbols
• Used in WAV, CD recordings, ...
• Quantization error: choosing a discrete value near the analog signal for each sample
• Any frequency above or equal to 1/2 the sampling frequency is lost

Music IR – Music?: Music - Sound - MP3
• Actually: MPEG-1 Audio Layer 3
• Developed by groups around Fraunhofer, Thomson, AT&T Bell Labs; several patent issues pending
• Lossy compression, based on psychoacoustic models
  – differential encoding of stereo signal (lossless)
  – focus on audible frequencies
  – masking effects
  – adaptive bit-depth encoding
  – quantization and Huffman encoding
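The PCM and Nyquist slides above can be illustrated with a minimal sketch in plain Python. It samples a sine wave at uniform intervals, quantizes each sample to a signed integer, and measures the quantization error. It then shows that a tone above half the sampling frequency is lost: at an 8000 Hz sampling rate, a 7000 Hz sine produces the same samples as an inverted 1000 Hz sine. The function names (`pcm_encode`, `sine`), the 16-bit depth, and the chosen frequencies are assumptions made for illustration and do not come from the slides.

```python
import math

def sine(freq_hz):
    """Return a continuous-time sine signal s(t) with amplitude 1."""
    return lambda t: math.sin(2 * math.pi * freq_hz * t)

def pcm_encode(signal, sample_rate, n_samples, bits=16):
    """Sample signal(t) at uniform intervals and quantize to signed integers."""
    max_level = 2 ** (bits - 1) - 1                 # 32767 for 16 bits
    samples = []
    for n in range(n_samples):
        t = n / sample_rate                         # uniform sampling instants
        value = max(-1.0, min(1.0, signal(t)))      # clip to the range [-1, 1]
        samples.append(round(value * max_level))    # quantization step
    return samples

sample_rate = 8000          # Hz, so the Nyquist frequency is 4000 Hz
n_samples = 80
max_level = 2 ** 15 - 1

# A 1000 Hz tone is below the Nyquist frequency and is represented faithfully.
low = pcm_encode(sine(1000), sample_rate, n_samples)

# Quantization error: distance between each stored value and the analog value.
max_error = max(
    abs(s / max_level - sine(1000)(n / sample_rate)) for n, s in enumerate(low)
)
print("largest quantization error:", max_error)

# A 7000 Hz tone is above the Nyquist frequency. Its samples coincide with
# those of an inverted 1000 Hz tone, so the original frequency is lost.
high = pcm_encode(sine(7000), sample_rate, n_samples)
print("aliased:", all(abs(h + l) <= 1 for h, l in zip(high, low)))
```

The tolerance of one quantization level in the last comparison only absorbs floating-point rounding. With more bits per sample the quantization error shrinks, but no bit depth recovers a frequency above half the sampling rate.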

  3. Music IR – Music?: Music - Sound - MP3
• ID3-Tags
• Added later on to allow embedding of metadata
• ID3v1: 30 characters per entry, few standard fields
• ID3v2.4: UTF-8 support, tags at beginning of file
• Used by search engines

Music IR – Music?: What is „Music“?
• Music, of course!
  – Audio: wav, au, mp3, ...
  – Symbolic: MIDI, mod, ...
  – Scores: Scan, MusicXML
• Text
  – Song lyrics
  – Artist Biographies
  – Websites: Fanpages, Album Reviews, Genre descriptions
• Community data
  – Playlists
  – Market basket
  – Band evolution
• Video/Images
  – Album covers
  – Music videos
(Images: www.samplesmith.com, www.westminster.gov.uk)

Music IR – Music?: Musical Instrument Digital Interface - MIDI
• Symbolic Music File Format
• Dave Smith, proposed in 1981
• MIDI specification 1.0 in 1983
• Interacting with keyboard produces messages
  – Note-On, Aftertouch, and Note-Off
  – 128 note pitches (note numbers 0-127)
• Sequence of control commands

Music IR – Music?: Musical Instrument Digital Interface - MIDI
• Some MIDI examples (from: http://www.borg.com/~jglatt/files/midifile.htm)
  – Orchestral: Bach: Brandenburg Concerto 4
  – Orchestral: Star Trek Theme: Next Generation
  – Classic: Beethoven: Für Elise
  – 1950's Rock&Roll: Bill Haley: Rock Around the Clock
  – 1950's Rock&Roll: Jerry Lee Lewis: Great Balls of Fire
  – Pop: Elton John: Don't Let the Sun Go Down
  – Pop: Phil Collins: Another Day in Paradise
  – Heavy Metal: Queen: Another One Bites the Dust
  – Heavy Metal: Van Halen: Jump

Music IR – Music?: MOD
• Similar to MIDI, but
• stores audio samples together with control instructions
• should sound the same on every player
• a.k.a. tracker modules (the first ever module-creating program was Soundtracker, created by Karsten Obarski in 1987)

Music IR – Music?: MOD
• Some examples (from http://modarchive.org)
  – Classical: Dark Castle (Part 1)
  – Classical: Canon in D
  – Classical: Beethoven: Für Elise
  – Guitar: Sweet Lorraine
  – Latin: Heart and Soul
  – Techno: 10KBlur
  – Disco: Rob Hubbard
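To make the MIDI messages mentioned earlier (Note-On, Aftertouch, Note-Off) more concrete, here is a minimal Python sketch of how such channel messages are laid out as bytes. It relies on the standard MIDI 1.0 layout: a status byte whose upper four bits give the message type and whose lower four bits give the channel, followed by two 7-bit data bytes. The helper names (`channel_message`, `decode`) and the example key press are hypothetical and are not taken from the slides.

```python
# Status nibbles of three MIDI 1.0 channel voice messages.
NOTE_OFF, NOTE_ON, AFTERTOUCH = 0x80, 0x90, 0xA0
NAMES = {NOTE_OFF: "note_off", NOTE_ON: "note_on", AFTERTOUCH: "aftertouch"}

def channel_message(status, channel, note, value):
    """Build a 3-byte message: status|channel, note number, velocity/pressure."""
    if not 0 <= channel <= 15:
        raise ValueError("channel must be 0-15")
    if not (0 <= note <= 127 and 0 <= value <= 127):
        raise ValueError("data bytes are 7-bit, i.e. 0-127")
    return bytes([status | channel, note, value])

def decode(message):
    """Split a 3-byte message back into (name, channel, note, value)."""
    status, note, value = message
    return NAMES[status & 0xF0], status & 0x0F, note, value

# Pressing, leaning on, and releasing middle C (note number 60) on channel 0
# yields a short sequence of control commands rather than any audio data.
press = channel_message(NOTE_ON, 0, 60, 100)
lean = channel_message(AFTERTOUCH, 0, 60, 40)
release = channel_message(NOTE_OFF, 0, 60, 0)

for message in (press, lean, release):
    print(message.hex(), decode(message))
# first line printed: 903c64 ('note_on', 0, 60, 100)
```

Because each data byte carries 7 bits, note numbers run from 0 to 127. How a note actually sounds is left to the synthesizer that receives the messages, which is the main contrast with MOD files that ship their own audio samples.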
