
Pixelwise classification for music document analysis
Jorge Calvo-Zaragoza
Center for Interdisciplinary Research in Music Media and Technology
Schulich School of Music
McGill University, Montréal (Canada)
SIMSSA Workshop XII (Aug 2017)

Introduction

Introduction
◮ Music archives and libraries preserve music over the centuries
◮ Computational tools for music analysis are of great interest
◮ Large amounts of content in symbolic format are required
◮ Manual transcription from the source is costly
◮ Automatic transcription systems become valuable tools

Introduction
Optical Music Recognition (OMR)
◮ From score image to symbolic encoding

Introduction
Optical Music Recognition (OMR)
◮ Several interdisciplinary steps
  Score image → Document processing → Symbol classification → Music reconstruction → Music encoding → Symbolic score

Introduction
◮ Most document-processing stages focus on content separation

Introduction
◮ Poor generalization of the existing strategies
◮ Music documents have a high level of heterogeneity

Introduction
Framework
◮ Machine learning framework for music document processing
◮ Regardless of the specific characteristics of the source
◮ Detection of the different layers at the same time

Framework

Framework
Pixelwise classification approach
◮ Categorization of each pixel within the input image
◮ Allows detecting small and thin elements present in music notation

Framework
◮ Machine learning for avoiding hand-crafted procedures
◮ We make use of Convolutional Neural Networks (CNNs)
  ◮ Great performance in image-related tasks
  ◮ Good generalization

Framework
Convolutional Neural Networks
◮ Series of hierarchical transformations (convolutions)
◮ Transformations not fixed but learned through training
◮ Less dependent on human intervention
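To make the "transformation" in the slide above concrete, here is a minimal sketch of a single 2D convolution in plain Python. The kernel is hand-picked for illustration (an assumption, not from the talk); in a CNN these weights would be learned during training rather than fixed.

```python
def convolve2d(image, kernel):
    """Valid-mode 2D cross-correlation, the operation CNN layers call 'convolution'."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for y in range(out_h):
        row = []
        for x in range(out_w):
            acc = 0.0
            # Weighted sum of the kh x kw neighbourhood whose top-left corner is (x, y)
            for j in range(kh):
                for i in range(kw):
                    acc += image[y + j][x + i] * kernel[j][i]
            row.append(acc)
        out.append(row)
    return out


# Hypothetical kernel that responds strongly to thin horizontal lines
line_kernel = [[-1, -1, -1],
               [ 2,  2,  2],
               [-1, -1, -1]]

# Toy 5x5 image with a one-pixel-thick horizontal line on row 2
toy_image = [[0] * 5, [0] * 5, [1] * 5, [0] * 5, [0] * 5]

response = convolve2d(toy_image, line_kernel)
```

The response is highest where the kernel is centred on the line, which is the kind of feature a trained network might pick up for thin notation elements.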

Framework
Pixelwise classification
◮ Straightforward approach: classify every single pixel of the input image
  I(x, y) → {background, staff line, symbol, text, ...}
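The mapping above can be sketched as a loop that assigns one label to every pixel. The `toy_classifier` below is a hypothetical stand-in for the trained CNN (a simple intensity threshold, chosen only so the sketch runs), and the function names are illustrative.

```python
def classify_image(image, classify_pixel):
    """Return a label map with one category per pixel of a grayscale image."""
    return [
        [classify_pixel(image, x, y) for x in range(len(row))]
        for y, row in enumerate(image)
    ]


def toy_classifier(image, x, y):
    # Hypothetical stand-in for the CNN: dark pixels are "symbol", the rest "background"
    return "symbol" if image[y][x] < 128 else "background"


label_map = classify_image([[255, 0], [30, 200]], toy_classifier)
```

The label map has the same height and width as the input, so each layer can later be extracted as a mask.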

Framework
Pixelwise classification
◮ To train the CNN we need ground truth
◮ Documents whose categories have been correctly separated

Framework
Pixelwise classification
◮ Ground-truth example¹
◮ One page ∼ 30 million pixels
¹ Salzinnes Antiphonal manuscript (CDM-Hsmu M2149.14)

Framework
Pixelwise classification
◮ The CNN is provided with the surrounding region of the pixel to be classified
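A minimal sketch of how the surrounding region of a pixel could be cut out before being handed to the network. The window half-size and the padding value used at the page borders are assumptions for illustration; the slides do not specify them.

```python
def extract_window(image, x, y, half, pad_value=0):
    """Return the (2*half+1) x (2*half+1) window centred on pixel (x, y).

    Positions that fall outside the image are filled with pad_value
    (a hypothetical choice; other border strategies are possible).
    """
    h, w = len(image), len(image[0])
    window = []
    for yy in range(y - half, y + half + 1):
        row = []
        for xx in range(x - half, x + half + 1):
            if 0 <= yy < h and 0 <= xx < w:
                row.append(image[yy][xx])
            else:
                row.append(pad_value)
        window.append(row)
    return window


# 4x4 toy image with values 1..16, window around the top-left pixel
toy_image = [[r * 4 + c + 1 for c in range(4)] for r in range(4)]
window = extract_window(toy_image, x=0, y=0, half=1)
```

The pixel to be classified always sits at the centre of the window, here `window[1][1]`.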

Framework
Pixelwise classification
◮ Estimation of a probability for each possible category
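One common way to turn raw network outputs into per-category probabilities is a softmax; the slides do not say which function is used, so treat this as an assumption. The category list uses only the four names given earlier in the deck, and the scores are made up for the example.

```python
import math

CATEGORIES = ["background", "staff line", "symbol", "text"]


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(scores):
    """Return the most probable category and the full probability vector."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CATEGORIES[best], probs


# Hypothetical raw outputs of the network for one pixel
label, probs = classify([0.1, 2.3, 0.4, -1.0])
```

Keeping the full probability vector, rather than only the winning label, leaves room for thresholding or post-processing per layer.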

Framework
Pixelwise classification
◮ Relevant issues
  ◮ Ground truth creation
    ◮ Pixel.js

Framework
Pixel.js
◮ Web-based tool for ground truth creation

Framework
Pixelwise classification
◮ Relevant issues
  ◮ Ground truth creation
    ◮ Pixel.js
  ◮ Computational cost
    ◮ Image-to-image approach

Framework
Image-to-image classification
◮ Image-to-image pixelwise classification
  ◮ Classify a whole region at the same time
  ◮ We need to split the document into patches of equal size
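Splitting a page into equal-size patches can be sketched as follows. The patch dimensions and the padding of incomplete border patches are assumptions for illustration; the slides only state that the patches must be of equal size.

```python
def split_into_patches(image, patch_h, patch_w, pad_value=0):
    """Tile an image into patches of identical size.

    Returns a list of ((top, left), patch) pairs. Patches that overhang the
    right or bottom border are padded with pad_value (a hypothetical choice)
    so that every patch has exactly patch_h rows and patch_w columns.
    """
    h, w = len(image), len(image[0])
    patches = []
    for top in range(0, h, patch_h):
        for left in range(0, w, patch_w):
            patch = [
                [image[y][x] if y < h and x < w else pad_value
                 for x in range(left, left + patch_w)]
                for y in range(top, top + patch_h)
            ]
            patches.append(((top, left), patch))
    return patches


# 3x5 toy image split into 2x2 patches
toy_image = [[1] * 5 for _ in range(3)]
patches = split_into_patches(toy_image, 2, 2)
```

Storing the `(top, left)` offset with each patch makes it straightforward to paste the per-patch predictions back into a full-page label map.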

Framework
Image-to-image classification
◮ Similar accuracy
◮ Much more efficient (from several hours to a few minutes)
◮ Usually needs a bigger training set
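A rough count of network forward passes per page suggests where the efficiency gain comes from. The page size is the figure of about 30 million pixels quoted on the ground-truth slide; the 256 x 256 patch size is a hypothetical value, not taken from the talk, and the count ignores padding at the page borders.

```python
import math

PIXELS_PER_PAGE = 30_000_000   # approximate figure quoted on the ground-truth slide
PATCH_SIDE = 256               # hypothetical patch size, not from the slides

# Per-pixel approach: one forward pass for every pixel
per_pixel_calls = PIXELS_PER_PAGE

# Image-to-image approach: one forward pass per patch
per_patch_calls = math.ceil(PIXELS_PER_PAGE / (PATCH_SIDE * PATCH_SIDE))

print(per_pixel_calls, per_patch_calls)
```

Each patch-level pass does more work than a single per-pixel pass, so the ratio of call counts is not the speed-up itself, only an indication of its order of magnitude.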

Deployment

Deployment
General use
◮ Full workflow for a new type of document
  ◮ Ground-truth creation with Pixel.js
  ◮ Model training and document processing as Rodan jobs

Deployment
Resources
◮ Training models: very slow, requires high-performance computing
◮ Classification: fast with the image-to-image approach

Deployment
DEMO

Conclusions

Conclusions
Summary
◮ Generalizable music document analysis with machine learning
◮ Research on effective and efficient strategies
◮ Usability through the Rodan framework

Conclusions
Future work
◮ Integrate with the rest of the OMR workflow
◮ Work towards faster adaptation to new document types
  ◮ Efficient ground truth creation with Pixel.js
  ◮ Study of model adaptation techniques

Thank you!
