Music Tagging

Ryan Curtin

LUG@GT

Outline: The Problem, Common Music Formats, Common Tag Formats, Tag Format != Music Format, Music Tags, Music Taggers, Music Transcoding, Audio Fingerprinting, Questions and Comments?

The Problem

You have a music collection. It is:

• Large and unwieldy (well, maybe not)
• Poorly organized, so finding things can be difficult
• Incorrectly or inconsistently tagged
• Full of entirely untagged music
• Impossible to identify from its filenames
• Just plain ugly

If this does not apply to you, then go home!

Common Music Formats

Your music collection likely contains some (or all) of the following:

• MP3 (MPEG-1 Audio Layer 3) [.mp3]
• Ogg Vorbis [.ogg]
• FLAC (Free Lossless Audio Codec) [.flac]
• AAC (Advanced Audio Coding) [.mp4, .m4a]
• WMA (Windows Media Audio) [.wma]
• Speex [.spx]
• Monkey’s Audio [.ape]
• WAV (are you stupid?)

Common Tag Formats

If your music is tagged, it is likely to be tagged with some (or all) of the following tag formats:

• ID3v1
• ID3v2.4
• APE
• APEv2
• WMA
• Vorbis comments

Tag Format != Music Format

A file in one music format can carry a tag format that was designed for another. For example:

• MP3 with ID3v2.4
• Monkey’s Audio with APEv2
• MP3 with APEv2

Some combinations are impossible (WMA with anything not WMA).

The ID3 Tag

http://www.id3.org/

• Designed with the MP3 format in mind
• Originally a 128-byte fixed-size tag with title, artist, album, year, genre, and a comment
• 30-character limit on text fields
• Fixed size does not allow more tag fields
• Informal standard: not approved by any standardization body

ID3v1

128-byte fixed-size tag at the end of the MP3 file:

Field     Size
"TAG"     3 bytes
Title     30 bytes
Artist    30 bytes
Album     30 bytes
Year      4 bytes
Comment   30 bytes
Genre     1 byte

80 different genres (created by Eric Kemp).
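The fixed layout makes ID3v1 easy to read by hand. Below is a minimal Python sketch, not a full tagger: the function name `parse_id3v1` is made up for illustration, and it assumes text fields are NUL- or space-padded Latin-1.

```python
import struct

# Field sizes from the slide: 3 + 30 + 30 + 30 + 4 + 30 + 1 = 128 bytes.
ID3V1_LAYOUT = struct.Struct("<3s30s30s30s4s30sB")


def parse_id3v1(data):
    """Return the ID3v1 fields found in the last 128 bytes, or None."""
    if len(data) < ID3V1_LAYOUT.size:
        return None
    marker, title, artist, album, year, comment, genre = ID3V1_LAYOUT.unpack(
        data[-ID3V1_LAYOUT.size:]
    )
    if marker != b"TAG":
        return None

    def text(raw):
        # Assumption: fields are NUL- or space-padded Latin-1 text.
        return raw.split(b"\x00", 1)[0].decode("latin-1").rstrip()

    return {
        "title": text(title),
        "artist": text(artist),
        "album": text(album),
        "year": text(year),
        "comment": text(comment),
        "genre": genre,  # index into the genre list
    }
```

To use it on a real file, read the whole file (or just its last 128 bytes) in binary mode and pass the bytes in.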

ID3v1.1

Slightly more clever 128-byte implementation, by Michael Mutschler. Uses the last 2 bytes of the comment field for the track number (a zero byte followed by a 1-byte track number), reducing the comment field to 28 bytes. Still horrendously unusable for complicated purposes!
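A reader can tell ID3v1.1 from ID3v1 by looking at the last two bytes of the 30-byte comment field. The sketch below shows that check; `split_comment` is a hypothetical helper name, and the Latin-1 decoding is an assumption.

```python
def split_comment(comment_field):
    """Split a 30-byte ID3v1 comment field into (comment, track or None)."""
    if len(comment_field) != 30:
        raise ValueError("ID3v1 comment field must be exactly 30 bytes")
    # ID3v1.1: byte 28 is zero and byte 29 holds a non-zero track number.
    if comment_field[28] == 0 and comment_field[29] != 0:
        raw, track = comment_field[:28], comment_field[29]
    else:
        raw, track = comment_field, None
    # Assumption: NUL- or space-padded Latin-1 text.
    comment = raw.split(b"\x00", 1)[0].decode("latin-1").rstrip()
    return comment, track
```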

ID3v2

New, variable-size tag. Likely at the beginning of the file (or end, or middle, but nobody puts them there).

Section          Size             Contents
Tag header       10 bytes         "ID3", flags
Extended header  Variable (>6B)   Restriction data
ID3 frame        Variable         A single tag
Padding          Optional         Must be 0x00
Footer           10 bytes         "3DI"
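The 10-byte tag header can be decoded as sketched below. The function names are illustrative. The byte layout used here ("ID3", two version bytes, one flags byte, then a 4-byte "synchsafe" size with 7 useful bits per byte) follows the ID3v2 specification rather than anything spelled out on the slide.

```python
def synchsafe_to_int(raw):
    """Decode a synchsafe integer: 7 useful bits per byte, big-endian."""
    value = 0
    for byte in raw:
        value = (value << 7) | (byte & 0x7F)
    return value


def parse_id3v2_header(data):
    """Return the ID3v2 tag header fields, or None if there is no tag."""
    if len(data) < 10 or data[:3] != b"ID3":
        return None
    flags = data[5]
    return {
        "version": (data[3], data[4]),        # e.g. (4, 0) for ID3v2.4.0
        "extended_header": bool(flags & 0x40),
        "footer": bool(flags & 0x10),         # footer flag exists in v2.4
        "size": synchsafe_to_int(data[6:10]),  # size excluding the header
    }
```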

ID3v2 Frame

Each tag is made up of several frames, of this format:

Field       Size             Contents
Frame ID    4 bytes          4-character ID
Size        4 bytes          32-bit integer
Flags       2 bytes          Status, format
Frame info  Variable (>1B)   Actual tag data
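Walking the frames is a loop over that 10-byte frame header. This is a rough sketch with a made-up name (`iter_frames`). It assumes the tag body has already been extracted, and that frame sizes are synchsafe as in ID3v2.4 (pass `synchsafe=False` for the plain 32-bit sizes of ID3v2.3).

```python
def iter_frames(body, synchsafe=True):
    """Yield (frame_id, flags, data) for each frame in an ID3v2 tag body."""
    pos = 0
    while pos + 10 <= len(body):
        frame_id = body[pos:pos + 4]
        if frame_id[0] == 0:
            break  # reached the zero padding after the last frame
        raw_size = body[pos + 4:pos + 8]
        if synchsafe:
            size = 0
            for byte in raw_size:
                size = (size << 7) | (byte & 0x7F)
        else:
            size = int.from_bytes(raw_size, "big")
        flags = body[pos + 8:pos + 10]
        data = body[pos + 10:pos + 10 + size]
        yield frame_id.decode("ascii"), flags, data
        pos += 10 + size
```

For text frames such as TIT2, the first data byte names the text encoding and the rest is the text itself.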

ID3v2 Summary

• Huge improvement over ID3v1
• Allows new tags to be created (at author’s discretion)
• Variable-width tags
• Not limited to text-only tags

List of known violations: http://www.id3.org/Compliance_Issues

APE

The APE format was originally designed for the Monkey’s Audio format [.ape]. APEv1 and APEv2 are identical, except APEv2 has a header. Closer in style to Vorbis comments than ID3.

APEv2

Usable in MP3 as well as Monkey’s Audio.

Section          Size       Contents
APE tags header  32 bytes   Version, size, count
APE item size    4 bytes    Length of APE item
APE item flags   4 bytes    Item flags
APE item key     Variable   ASCII characters
APE item null    1 byte     0x00 (end of key)
APE item value   Variable   Item value (UTF-8/bin)
APE tags footer  32 bytes   Same as header (nearly)
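The item layout above can be read with a short loop. This is a sketch under assumptions: the function names are invented, integers are taken to be little-endian, the 32-byte header is taken to be the "APETAGEX" preamble followed by version, tag size, item count, flags, and 8 reserved bytes, and the size field is read as the length of the item value.

```python
import struct

APE_HEADER = struct.Struct("<8sIIII8s")  # 32 bytes in total


def parse_ape_header(data):
    """Return (version, tag_size, item_count), or None if not an APE tag."""
    if len(data) < APE_HEADER.size:
        return None
    preamble, version, size, count, _flags, _reserved = APE_HEADER.unpack(
        data[:APE_HEADER.size]
    )
    if preamble != b"APETAGEX":
        return None
    return version, size, count


def parse_ape_items(data, count):
    """Parse `count` items from the bytes that follow the header."""
    items = {}
    pos = 0
    for _ in range(count):
        value_size, flags = struct.unpack_from("<II", data, pos)
        pos += 8
        key_end = data.index(b"\x00", pos)  # the key is NUL-terminated
        key = data[pos:key_end].decode("ascii")
        pos = key_end + 1
        items[key] = (flags, data[pos:pos + value_size])
        pos += value_size
    return items
```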

Vorbis Comments

Specifically designed for Ogg Vorbis and FLAC (by Xiph.org).

Field             Size       Contents
Vendor length     4 bytes    Length of vendor string
Vendor string     Variable   Info about library
Comment list len  4 bytes    Number of comments
Comment length    4 bytes    Length of this comment
Comment           Variable   "ARTIST=Autechre"
Framing bit       1 bit

Field names are ASCII (and may not contain ’=’); field values are UTF-8. Field names are not necessarily unique.
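Decoding that layout takes only a few lines. In this sketch the function name is hypothetical, the length fields are read as little-endian 32-bit integers, and the input is assumed to be the raw comment block with any container framing (Ogg packet header, FLAC metadata block header) already stripped. Comments are returned as a list of pairs because field names may repeat.

```python
import struct


def parse_vorbis_comments(block):
    """Return (vendor, [(NAME, value), ...]) from a raw comment block."""
    pos = 0
    (vendor_len,) = struct.unpack_from("<I", block, pos)
    pos += 4
    vendor = block[pos:pos + vendor_len].decode("utf-8")
    pos += vendor_len
    (count,) = struct.unpack_from("<I", block, pos)
    pos += 4
    comments = []
    for _ in range(count):
        (length,) = struct.unpack_from("<I", block, pos)
        pos += 4
        entry = block[pos:pos + length].decode("utf-8")
        pos += length
        name, _, value = entry.partition("=")
        # Field names are case-insensitive, so normalise them to upper case.
        comments.append((name.upper(), value))
    return vendor, comments
```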

CLI interfaces

• id3
• id3v2
• apetag
• metaflac
• ogginfo
• lltag
• taginfo
• omptagger
• eyeD3

EasyTAG

(example not on slideshow) Good for batch processing - has DB lookups.

Amarok

(example not on slideshow) Like Windows Media Player: file-by-file tagging (ugly and slow).

Quod Libet / Ex Falso

(example not on slideshow) Excellent for batch processing of tags!

Picard

• Official MusicBrainz tagger
• Very powerful and batchable

Bash scripts

Use command-line utilities like flac, lame, and others. (example not on slideshow)
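The talk's example is not in the slides, so here is a stand-in sketch (in Python rather than Bash) of the usual FLAC-to-MP3 pipe: `flac` decodes to standard output and `lame` encodes from standard input. The function names are made up, both tools are assumed to be installed, and tags are not copied across.

```python
import subprocess
from pathlib import Path


def transcode_commands(src):
    """Build the decoder and encoder command lines for one FLAC file."""
    dst = str(Path(src).with_suffix(".mp3"))
    decode = ["flac", "--decode", "--stdout", src]  # WAV data on stdout
    encode = ["lame", "-", dst]                     # "-" reads from stdin
    return decode, encode


def transcode(src):
    """Run `flac ... | lame ...`; requires both tools on the PATH."""
    decode, encode = transcode_commands(src)
    with subprocess.Popen(decode, stdout=subprocess.PIPE) as decoder:
        subprocess.run(encode, stdin=decoder.stdout, check=True)
```

Calling `transcode` on every `.flac` file in a directory gives the batch behaviour a shell loop would.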

TransKode

KDE-based graphical batch audio transcoder

http://sourceforge.net/projects/transkode/

Audio Fingerprinting: What Is It?

Convert a full audio track into a short fingerprint for comparison and identification purposes.

Audio Fingerprinting Algorithms

• AMG LASSO - All Media Guide’s implementation; patented
• AudioID - Fraunhofer Institute (now licensed by Mufin)
• Gracenote’s MusicID - used by Winamp ("Nullsoft Playlist Generator")
• Last.fm Fingerprinter - available in the Last.fm client
• MusicIP’s Open Fingerprint Architecture - see libofa for an open-source implementation
• Shazam - music identification via cell phone
• Mirage - Dominik Schnitzer’s master’s thesis (from Vienna University of Technology), a Banshee plugin

Mirage Music Similarity Measure

• Decode the MP3 and take 100 ms frames (implementation is not windowed?): $x_n(t)$
• Take the STFT (windowed FFT) of the input frames to obtain spectrograms: $X_n(f) = \mathcal{F}(x_n(t))$
• Map onto the Mel spectrum: $M_n(f) = 2595 \log_{10}\left(\frac{X_n(f)}{700} + 1\right)$
• Take logarithms of the powers at each frequency: $L_n(f) = \log(M_n(f))$
• Take the FFT of the logarithms to produce the ‘cepstrum’: $C_n(t) = \mathcal{F}(L_n(f))$
• Filter the cepstrum into individual bins to produce MFCCs (20 coefficients)
• Take the mean and covariance matrix of the MFCC frames to get the music fingerprint
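The steps above can be sketched in plain Python. This is a toy under stated assumptions, not Mirage's code. It reads the Mel step in the usual way, warping the frequency axis with $2595 \log_{10}(f/700 + 1)$ and pooling power into equal-width rectangular Mel bands. It also swaps a DCT in for the FFT of the logarithms, as is common for MFCCs, and uses small frames and few coefficients to stay fast. All function names are made up.

```python
import cmath
import math


def hz_to_mel(hz):
    """Mel mapping with the constants from the slide (2595 and 700)."""
    return 2595.0 * math.log10(hz / 700.0 + 1.0)


def power_spectrum(frame):
    """Naive DFT power for bins 0..N/2; fine for short demo frames."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def mel_bands(power, sample_rate, n_bands):
    """Pool DFT power into n_bands equal-width (rectangular) Mel bands."""
    nyquist = sample_rate / 2.0
    top = hz_to_mel(nyquist)
    bands = [0.0] * n_bands
    for k, p in enumerate(power):
        hz = k * nyquist / (len(power) - 1)
        index = min(int(hz_to_mel(hz) / top * n_bands), n_bands - 1)
        bands[index] += p
    return bands


def dct(values, n_coeffs):
    """DCT-II of the log band energies (swapped in for the slide's FFT)."""
    m = len(values)
    return [sum(v * math.cos(math.pi * c * (j + 0.5) / m)
                for j, v in enumerate(values))
            for c in range(n_coeffs)]


def mfcc_frames(samples, sample_rate, frame_len, n_bands=8, n_coeffs=4):
    """One MFCC vector per non-overlapping, unwindowed frame."""
    frames = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        power = power_spectrum(samples[start:start + frame_len])
        bands = mel_bands(power, sample_rate, n_bands)
        logs = [math.log(b + 1e-12) for b in bands]  # avoid log(0)
        frames.append(dct(logs, n_coeffs))
    return frames


def fingerprint(frames):
    """Mean vector and sample covariance matrix of the MFCC frames."""
    n, d = len(frames), len(frames[0])
    mean = [sum(f[i] for f in frames) / n for i in range(d)]
    cov = [[sum((f[i] - mean[i]) * (f[j] - mean[j]) for f in frames) / (n - 1)
            for j in range(d)] for i in range(d)]
    return mean, cov
```

With real audio, the slide's numbers would apply instead: 100 ms frames and 20 coefficients.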

Visual Representation (1)

Visual Representation (2)

Fingerprint Distance

Use the symmetric Kullback-Leibler divergence.

K-L divergence (for a vector): $D_{KL}(P \| Q) = \sum_i P_i \log\left(\frac{P_i}{Q_i}\right)$

K-L divergence (for a matrix): $D_{KL}(P \| Q) = \sum_i \sum_j P_{ij} \log\left(\frac{P_{ij}}{Q_{ij}}\right)$

Symmetric K-L divergence: $D_{sKL}(P \| Q) = D_{KL}(P \| Q) + D_{KL}(Q \| P)$

Sum across mean, covariance matrix, and inverse covariance matrix to get full distance.
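As a quick check of the formulas, here is a small Python sketch of the element-wise divergences. The function names are illustrative, and it assumes every entry of both inputs is strictly positive, since the logarithm is undefined otherwise.

```python
import math


def kl_divergence(p, q):
    """Element-wise K-L divergence of two equal-length positive vectors."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def kl_divergence_matrix(p, q):
    """Same sum taken over every entry of two equal-shape matrices."""
    return sum(kl_divergence(p_row, q_row) for p_row, q_row in zip(p, q))


def symmetric_kl(p, q):
    """Symmetric K-L divergence: D(P||Q) + D(Q||P)."""
    return kl_divergence(p, q) + kl_divergence(q, p)
```

The divergence of a vector from itself is zero, and `symmetric_kl(p, q)` equals `symmetric_kl(q, p)`, which is what makes it usable as a distance between two fingerprints.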

Questions and Comments?