pixelwise classification for music document analysis

  1. Pixelwise classification for music document analysis. Jorge Calvo-Zaragoza, Center for Interdisciplinary Research in Music Media and Technology, Schulich School of Music, McGill University, Montréal (Canada). SIMSSA Workshop XII (Aug 2017)

  2. Introduction

  5. Introduction ◮ Music archives and libraries preserve music over the centuries ◮ Computational tools for music analysis are of great interest ◮ Large amounts of content in symbolic format are required ◮ Manual transcription from source implies a high cost ◮ Automatic transcription systems become valuable tools

  6. Introduction: Optical Music Recognition (OMR) ◮ From score image to symbolic encoding

  8. Introduction: Optical Music Recognition (OMR) ◮ Several interdisciplinary steps: Score image → Document processing → Symbol classification → Music reconstruction → Music encoding → Symbolic score

  9. Introduction ◮ Most document-processing stages focus on content separation

  13. Introduction ◮ Poor generalization of the existing strategies ◮ Music documents have a high level of heterogeneity

  14. Introduction: Framework ◮ Machine learning framework for music document processing ◮ Works regardless of the specific characteristics of the source ◮ Detects the different layers at the same time

  15. Framework

  16. Framework: Pixelwise classification approach ◮ Categorization of each pixel within the input image ◮ Allows detecting small and thin elements present in music notation

  18. Framework ◮ Machine learning for avoiding hand-crafted procedures ◮ We make use of Convolutional Neural Networks (CNN) ◮ Great performance in image-related tasks ◮ Good generalization

  19. Framework: Convolutional Neural Networks ◮ Series of hierarchical transformations (convolutions) ◮ Transformations not fixed but learned through training ◮ Less dependent on human intervention
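
To make the idea of a convolution concrete, here is a minimal pure-Python sketch of a single 2D convolution (strictly, a cross-correlation, as is usual in CNN libraries). The kernel below is hand-picked for illustration only; in a CNN its values would be learned during training, and the function and variable names are assumptions, not part of the presented system.

```python
def conv2d(image, kernel):
    """Valid (no padding) 2D cross-correlation of a grayscale image with a kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for y in range(out_h):
        row = []
        for x in range(out_w):
            acc = 0.0
            # Weighted sum of the kh x kw window whose top-left corner is (x, y).
            for j in range(kh):
                for i in range(kw):
                    acc += image[y + j][x + i] * kernel[j][i]
            row.append(acc)
        out.append(row)
    return out


# Hand-picked kernel that responds strongly to thin horizontal lines.
# In a CNN these weights are not fixed by hand but learned from data.
HORIZONTAL_LINE_KERNEL = [
    [-1.0, -1.0, -1.0],
    [2.0, 2.0, 2.0],
    [-1.0, -1.0, -1.0],
]

# Toy 5x5 image: one dark (1) horizontal line on a light (0) background.
toy_image = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]

response = conv2d(toy_image, HORIZONTAL_LINE_KERNEL)
for row in response:
    print(row)
```

The middle row of the response is the only positive one, because that is where the kernel is centred on the line. A CNN stacks many such filters in layers, which is what "hierarchical transformations" refers to.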

  20. Framework: Pixelwise classification ◮ Straightforward approach: classify every single pixel of the input image: I(x, y) → {background, staff line, symbol, text, ...}

  21. Framework: Pixelwise classification ◮ To train the CNN we need ground truth ◮ Documents whose categories have been correctly separated

  22. Framework: Pixelwise classification ◮ Ground-truth example (Salzinnes Antiphonal manuscript, CDM-Hsmu M2149.14) ◮ One page ≈ 30 million pixels
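
One way to picture ground truth of this kind is as a set of per-category binary masks that are merged into a single label per pixel. The sketch below assumes that representation; the storage format, the function name `layers_to_label_map`, and the rule that later layers override earlier ones are all illustrative assumptions, not details given in the slides.

```python
def layers_to_label_map(layers, background_label="background"):
    """Merge per-category binary masks into one label per pixel.

    layers: dict mapping a category name to a mask (list of rows of 0/1),
            all masks having the same size. Where masks overlap, the layer
            listed later in the dict wins (an assumption of this sketch).
    """
    first_mask = next(iter(layers.values()))
    height, width = len(first_mask), len(first_mask[0])
    # Every pixel starts as background until some layer claims it.
    label_map = [[background_label] * width for _ in range(height)]
    for name, mask in layers.items():
        for y in range(height):
            for x in range(width):
                if mask[y][x]:
                    label_map[y][x] = name
    return label_map


# Toy 2x3 "page": a staff line on the bottom row, crossed by a symbol.
toy_layers = {
    "staff line": [[0, 0, 0],
                   [1, 1, 1]],
    "symbol": [[0, 1, 0],
               [0, 1, 0]],
}

label_map = layers_to_label_map(toy_layers)
for row in label_map:
    print(row)
```

Each entry of `label_map` is then the training target for the pixel at that position.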

  23. Framework: Pixelwise classification ◮ The CNN is provided with the region surrounding the pixel to be classified
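
A minimal sketch of how such a surrounding region could be cut out of a page is shown below. The window size, the padding of positions that fall outside the image, and the name `extract_window` are assumptions made for illustration; the slides do not specify them.

```python
def extract_window(image, x, y, half_size, pad_value=0):
    """Return the square neighbourhood of side 2*half_size+1 centred on (x, y).

    Positions that fall outside the image are filled with pad_value, so
    pixels near the border still get a full-size window.
    """
    height, width = len(image), len(image[0])
    window = []
    for yy in range(y - half_size, y + half_size + 1):
        row = []
        for xx in range(x - half_size, x + half_size + 1):
            if 0 <= yy < height and 0 <= xx < width:
                row.append(image[yy][xx])
            else:
                row.append(pad_value)
        window.append(row)
    return window


toy_page = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]

centre_window = extract_window(toy_page, x=1, y=1, half_size=1)
corner_window = extract_window(toy_page, x=0, y=0, half_size=1)
print(centre_window)
print(corner_window)
```

In the per-pixel approach, one such window is built and classified for every pixel of the page, which is why the cost grows with the number of pixels.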

  24. Framework: Pixelwise classification ◮ Estimation of a probability for each possible category
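
The slides do not say how the probabilities are computed. A common choice for a classification network is a softmax over the raw output scores, and the sketch below assumes that. The category list uses only the four names shown earlier in the deck (the original list is open-ended), and the scores are made-up numbers.

```python
import math

# Only the categories named in the slides; the original list ends with "...".
CATEGORIES = ["background", "staff line", "symbol", "text"]


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(scores)  # subtracted for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(scores):
    """Return the most probable category and the full probability list."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CATEGORIES[best], probs


# Hypothetical raw network outputs for one pixel.
label, probs = classify([0.5, 3.0, 1.0, -1.0])
print(label)
print([round(p, 3) for p in probs])
```

The pixel is assigned the category with the highest probability; keeping the full distribution also allows thresholding or later post-processing.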

  27. Framework: Pixelwise classification ◮ Relevant issues ◮ Ground truth creation: Pixel.js

  28. Framework: Pixel.js ◮ Web-based tool for ground truth creation

  31. Framework: Pixelwise classification ◮ Relevant issues ◮ Ground truth creation: Pixel.js ◮ Computational cost: image-to-image approach

  32. Framework: Image-to-image classification ◮ Image-to-image pixelwise classification ◮ Classify a whole region at the same time ◮ We need to split the document into patches of equal size
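
The splitting step can be sketched as follows. The patch size, the zero padding of patches that overhang the page edge, and the name `split_into_patches` are assumptions for illustration; the slides only state that patches must have equal size.

```python
def split_into_patches(image, patch_size, pad_value=0):
    """Cover the image with non-overlapping square patches of equal size.

    Returns a list of (top, left, patch) tuples. Patches that overhang the
    right or bottom edge are filled with pad_value so all have the same shape.
    """
    height, width = len(image), len(image[0])
    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            patch = []
            for y in range(top, top + patch_size):
                row = []
                for x in range(left, left + patch_size):
                    if y < height and x < width:
                        row.append(image[y][x])
                    else:
                        row.append(pad_value)
                patch.append(row)
            patches.append((top, left, patch))
    return patches


# Toy page of 3 rows x 5 columns; each value encodes its (row, column).
toy_page = [[10 * r + c for c in range(5)] for r in range(3)]
patches = split_into_patches(toy_page, patch_size=2)
for top, left, patch in patches:
    print(top, left, patch)
```

Keeping the `(top, left)` offsets makes it possible to paste each predicted patch back into a full-page result.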

  33. Framework: Image-to-image classification ◮ Similar accuracy ◮ Much more efficient (from several hours to a few minutes) ◮ Usually needs a bigger training set
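
A rough count of network evaluations suggests where the efficiency gain comes from. The page dimensions (chosen to give the 30 million pixels mentioned earlier) and the patch size below are hypothetical, and the function names are illustrative.

```python
import math


def evaluations_per_pixel(width, height):
    """Per-pixel approach: one network evaluation for every pixel."""
    return width * height


def evaluations_per_patch(width, height, patch_size):
    """Image-to-image approach: one evaluation per equal-size patch."""
    return math.ceil(width / patch_size) * math.ceil(height / patch_size)


WIDTH, HEIGHT = 6000, 5000  # hypothetical page size: 30 million pixels
PATCH_SIZE = 256            # hypothetical patch side, in pixels

per_pixel = evaluations_per_pixel(WIDTH, HEIGHT)
per_patch = evaluations_per_patch(WIDTH, HEIGHT, PATCH_SIZE)
print(per_pixel, per_patch)
```

Under these assumptions the count drops from 30,000,000 evaluations to 480. Each patch evaluation is more expensive than a single-pixel one, so this ratio is not the wall-clock speed-up, but it is consistent with the reported drop from hours to minutes.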

  34. Deployment

  35. Deployment: General use ◮ Full workflow for a new type of document ◮ Ground-truth creation with Pixel.js ◮ Model training and document processing as Rodan jobs

  36. Deployment: Resources ◮ Model training: very slow; requires high-performance computing ◮ Classification: fast with the image-to-image approach

  37. Deployment: Demo

  38. Conclusions

  39. Conclusions: Summary ◮ Generalizable music document analysis with machine learning ◮ Research on effective and efficient strategies ◮ Usability through the Rodan framework

  40. Conclusions: Future work ◮ Integrate with the rest of the OMR workflow ◮ Make efforts towards faster adaptation to new document types ◮ Efficient ground truth creation with Pixel.js ◮ Study of model adaptation techniques

  41. Thank you!
