
The journey to open-sourcing the code and models (FOSDEM 2020)



  1. The journey to open-sourcing the code and models. FOSDEM 2020. Anis Khlif, Félix Voituret.

  2. Who's been involved: Romain Hennequin (Lead research scientist), Laure Prétet (Former intern), Anis Khlif (Research Engineer), Félix Voituret (Research Engineer), Manuel Moussallam (Head of Deezer Research). Spleeter by Deezer.

  3. What is it all about?

  4. Large impact on tech audience: 9500+ stars on GitHub, 200k+ views, 100k+ reads on deezer.io.

  5. Myth busting: "Deezer solved source separation" and "Spleeter performs better than all other solutions" are both myths.

  6. What did we bring? State of the art, fast, MIT licensed.

  7. Primer on source separation

  8. Waveform (axis: time).

  9. Time-frequency representation.

  10. Magnitude spectrogram (axes: frequency, time).
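
The step from waveform to magnitude spectrogram can be sketched in a few lines. This is only an illustration in plain Python, not the code used by Spleeter: the function name `magnitude_spectrogram`, the frame size, the hop and the Hann window are all assumptions chosen for readability, and the transform is a naive DFT rather than an optimized FFT.

```python
import cmath
import math


def magnitude_spectrogram(signal, frame_size=64, hop=32):
    """Return a list of frames, each a list of magnitudes per frequency bin."""
    # Hann window: tapers each frame to limit spectral leakage.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_size)
              for n in range(frame_size)]
    frames = []
    for start in range(0, len(signal) - frame_size + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_size)]
        # Naive DFT, keeping only the non-negative frequencies.
        bins = []
        for k in range(frame_size // 2 + 1):
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                      for n in range(frame_size))
            bins.append(abs(acc))  # magnitude only, the phase is discarded
        frames.append(bins)
    return frames


# Toy input (assumption): a pure tone with 8 cycles per 64-sample frame.
tone = [math.sin(2 * math.pi * 8 * n / 64) for n in range(256)]
spec = magnitude_spectrogram(tone)
# Index of the frequency bin carrying the most energy in the first frame.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
```

Each row of `spec` is one time frame and each column one frequency bin, which is the layout shown on the spectrogram slides.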

  11. Magnitude spectrogram: harmonic content.

  12. Magnitude spectrogram: inharmonic, percussive content.

  13. Magnitude spectrogram: vocal content.

  14. Magnitude spectrogram: learn a mask for each instrument! A mask gives the fraction of the energy at each time and each frequency bin that should be assigned to this instrument.

  15. Magnitude spectrogram.

  16. Magnitude spectrogram.

  17. Magnitude spectrogram.
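
To make the mask idea concrete, here is a small sketch under simplifying assumptions. The masks are computed from known source magnitudes (an "oracle" ratio mask), whereas in the talk a deep learning model predicts them. The sketch also assumes that source magnitudes add up to the mixture magnitude, which is only an approximation for real audio. The function names and the toy values are hypothetical.

```python
def ratio_masks(source_mags, eps=1e-10):
    """Map {name: 2D list [time][freq]} to per-source masks of the same shape."""
    names = list(source_mags)
    n_time = len(source_mags[names[0]])
    n_freq = len(source_mags[names[0]][0])
    masks = {name: [[0.0] * n_freq for _ in range(n_time)] for name in names}
    for t in range(n_time):
        for f in range(n_freq):
            # Total energy in this time-frequency bin (eps avoids division by zero).
            total = sum(source_mags[name][t][f] for name in names) + eps
            for name in names:
                # Fraction of the bin's energy assigned to this source.
                masks[name][t][f] = source_mags[name][t][f] / total
    return masks


def apply_mask(mask, mixture_mag):
    """Element-wise product of a mask and the mixture spectrogram."""
    return [[m * x for m, x in zip(mask_row, mix_row)]
            for mask_row, mix_row in zip(mask, mixture_mag)]


# Toy 2x2 spectrograms (time x frequency), illustrative values only.
vocals = [[3.0, 0.0], [1.0, 2.0]]
instruments = [[1.0, 4.0], [1.0, 0.0]]
mixture = [[v + i for v, i in zip(v_row, i_row)]
           for v_row, i_row in zip(vocals, instruments)]

masks = ratio_masks({"vocals": vocals, "instruments": instruments})
vocals_estimate = apply_mask(masks["vocals"], mixture)
```

In this toy setting the masks of all sources sum to about 1 in every bin, and multiplying a mask with the mixture gives back the corresponding source.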

  18. Spleeter models: 2, 4 and 5 stems.

  19. A deep learning approach to mask prediction: a deep learning model outputs a vocal mask and an instruments mask.

  20. 4-stems: a deep learning model outputs vocal, drums, bass and others masks.

  21. 5-stems: a deep learning model outputs vocal, drums, bass, piano and others masks.

  22. Quick introduction to TensorFlow (diagram: inputs, operation, outputs). Build a computation graph that represents a parametrized function. Parameters (or weights) can be modified (trained) to optimize an objective function. A model can be run in any TensorFlow environment. Some graph operations can be run very efficiently on GPU.

  23. Quick introduction to TensorFlow (diagram: inputs, operation, outputs): model = computation graph (network architecture) + weights (parameters).
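
The equation "model = computation graph + weights" can be illustrated without any framework. The sketch below is plain Python, not the TensorFlow API: `graph` is a hypothetical fixed computation (one dense layer followed by a ReLU), and the two weight dictionaries are made-up values. The point is only that the same graph yields different functions when different weights are plugged in.

```python
def graph(inputs, weights):
    """Fixed computation: one dense layer followed by a ReLU."""
    outputs = []
    for row, bias in zip(weights["kernel"], weights["bias"]):
        pre_activation = sum(w * x for w, x in zip(row, inputs)) + bias
        outputs.append(max(0.0, pre_activation))  # ReLU
    return outputs


# Two sets of parameters for the same architecture (illustrative values).
weights_a = {"kernel": [[1.0, -1.0], [0.5, 0.5]], "bias": [0.0, 1.0]}
weights_b = {"kernel": [[2.0, 0.0], [0.0, 2.0]], "bias": [0.0, 0.0]}

out_a = graph([1.0, 2.0], weights_a)
out_b = graph([1.0, 2.0], weights_b)
```

This separation is what makes it possible to publish trained weights on their own, as long as the architecture is available too.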

  24. Overview (training diagram): a U-Net outputs masks; multiplication (*) and difference (-) nodes feed an L1 loss on the voice and instruments branches.

  25. Overview, example 1.

  26. Overview, example 1: parameter update.

  27. Overview, example 2.

  28. Overview, example 2: parameter update.

  29. Overview, example N.

  30. Overview, example N: parameter update.
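
The loop on these slides (process one example, compute an L1 loss, update the parameters, move to the next example) can be sketched with a deliberately tiny stand-in model. Instead of a U-Net, the "model" below is one trainable logit per frequency bin, turned into a mask by a sigmoid. That substitution, the learning rate and the toy data are all assumptions made to keep the example short.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def l1_loss(logits, mixture, target):
    """Mean absolute error between the masked mixture and the target source."""
    return sum(abs(sigmoid(w) * x - y)
               for w, x, y in zip(logits, mixture, target)) / len(logits)


def train_step(logits, mixture, target, lr=0.5):
    """One (sub)gradient descent update of the logits on a single example."""
    updated = []
    for w, x, y in zip(logits, mixture, target):
        mask = sigmoid(w)
        error = mask * x - y
        sign = (error > 0) - (error < 0)  # subgradient of |error|
        grad = sign * x * mask * (1.0 - mask) / len(logits)
        updated.append(w - lr * grad)
    return updated


# Toy examples: (mixture magnitudes, voice magnitudes) for 3 frequency bins.
examples = [
    ([4.0, 2.0, 1.0], [3.0, 0.5, 0.9]),
    ([2.0, 4.0, 2.0], [1.5, 1.0, 1.8]),
]

logits = [0.0, 0.0, 0.0]  # every mask value starts at 0.5
initial_loss = l1_loss(logits, *examples[0])
for _ in range(200):
    for mixture, target in examples:  # example 1, example 2, ..., example N
        logits = train_step(logits, mixture, target)
final_loss = l1_loss(logits, *examples[0])
```

After training, `final_loss` is lower than `initial_loss` and the learned mask values approach the voice-to-mixture ratios of the toy data.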

  31. Dataset: in-house dataset of ~24k tracks with stems, ~80 hours of recording … that we are not allowed to release!

  32. BUT...

  33. Training: we can release the learned weights (2, 4 and 5 stems). One mask per channel per source. 1 branch predicts masks for 1 source, with 2 channels. ~10M parameters per branch.
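
The figures on this slide imply rough sizes for each released model. The arithmetic below simply multiplies them out, assuming exactly one branch per stem and taking "~10M" as 10,000,000. The results are therefore order-of-magnitude estimates, not numbers stated in the talk.

```python
PARAMS_PER_BRANCH = 10_000_000  # "~10M parameters per branch" (approximate)
CHANNELS = 2                    # one mask per channel, 2 channels per source


def model_size(n_stems):
    """Rough size of an n-stems model: one branch per source."""
    return {
        "branches": n_stems,
        "masks": n_stems * CHANNELS,
        "approx_parameters": n_stems * PARAMS_PER_BRANCH,
    }


sizes = {n: model_size(n) for n in (2, 4, 5)}
```

Under these assumptions the 2, 4 and 5 stems models come out at roughly 20M, 40M and 50M parameters.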

  34. Open-sourcing Spleeter: packaging & distribution.

  35. Packaging constraints: predefined configurations, on-demand model downloading, one-liner command.

  36. Predefined configurations: JSON formatted file. Mostly model-related parameters. Provided either as a file path or as a configuration name (embedded configuration files ...).
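
One way to support both a file path and a configuration name is a small resolver like the sketch below. It is not Spleeter's actual implementation: the `load_configuration` function, the `EMBEDDED_CONFIGS` table and its `stems` key are hypothetical. Only the `spleeter:` prefix and the `2stems` / `4stems` names come from the command-line slide of the talk.

```python
import json
import os
import tempfile

EMBEDDED_PREFIX = "spleeter:"

# Hypothetical embedded configurations; keys and values are illustrative only.
EMBEDDED_CONFIGS = {
    "2stems": {"stems": ["vocal", "instruments"]},
    "4stems": {"stems": ["vocal", "drums", "bass", "others"]},
}


def load_configuration(descriptor):
    """Resolve a descriptor that is either 'spleeter:<name>' or a JSON file path."""
    if descriptor.startswith(EMBEDDED_PREFIX):
        name = descriptor[len(EMBEDDED_PREFIX):]
        if name not in EMBEDDED_CONFIGS:
            raise ValueError(f"no embedded configuration named {name!r}")
        return EMBEDDED_CONFIGS[name]
    with open(descriptor, encoding="utf-8") as stream:
        return json.load(stream)


# Usage: an embedded name, then a custom JSON file written to a temp directory.
embedded = load_configuration("spleeter:4stems")
with tempfile.TemporaryDirectory() as workdir:
    custom_path = os.path.join(workdir, "custom.json")
    with open(custom_path, "w", encoding="utf-8") as stream:
        json.dump({"stems": ["vocal", "piano"]}, stream)
    custom = load_configuration(custom_path)
```

Keeping model parameters in a JSON file means a new model variant can be described without touching the code.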

  37. Using GitHub releases as model hub (deezer/spleeter).
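
On-demand model downloading usually comes down to "download on first use, then reuse the local copy". The sketch below shows that pattern only. The `ensure_model` function is hypothetical, and the download itself is passed in as a `fetch` callable so that no release URL has to be assumed here. A fake fetcher stands in for the network.

```python
import os
import tempfile


def ensure_model(name, cache_dir, fetch):
    """Return the local path for `name`, calling `fetch(name)` only on a cache miss."""
    path = os.path.join(cache_dir, name)
    if not os.path.exists(path):
        os.makedirs(cache_dir, exist_ok=True)
        data = fetch(name)  # in a real setup, this would download the archive
        with open(path, "wb") as out:
            out.write(data)
    return path


calls = []


def fake_fetch(name):
    """Stand-in for a network download; records each call."""
    calls.append(name)
    return b"weights for " + name.encode()


with tempfile.TemporaryDirectory() as cache:
    first = ensure_model("2stems", cache, fake_fetch)
    second = ensure_model("2stems", cache, fake_fetch)  # served from the cache
    with open(first, "rb") as stream:
        cached_bytes = stream.read()
```

The second call does not trigger a download, which is the behaviour a user expects when running the same separation twice.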

  38. Separate sources from the command line.
      Separate with the default 2stems configuration:
          $ spleeter separate -i input_file.mp3 -o output_dir
      Separate with a specific embedded configuration:
          $ spleeter separate -i input_file.mp3 -o output_dir -p spleeter:4stems
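
When the command has to be launched from a script, building the argument list programmatically avoids quoting mistakes. The helper below is a hypothetical wrapper, not part of Spleeter. It only uses the `separate` subcommand and the `-i`, `-o` and `-p` flags shown on the slide, and it builds the command without running it.

```python
import shlex


def build_separate_command(input_file, output_dir, configuration=None):
    """Build the argument list for `spleeter separate` (flags from the slide)."""
    command = ["spleeter", "separate", "-i", input_file, "-o", output_dir]
    if configuration is not None:
        # Omitting -p leaves the default configuration in place.
        command += ["-p", configuration]
    return command


default_cmd = build_separate_command("input_file.mp3", "output_dir")
four_stems_cmd = build_separate_command(
    "input_file.mp3", "output_dir", "spleeter:4stems")
printable = shlex.join(four_stems_cmd)  # shell-safe string, for logging
```

The resulting list can be handed to `subprocess.run` on a machine where the `spleeter` command is installed.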

  39. Distribution constraints. Critical dependencies to manage: FFmpeg, TensorFlow (CPU version, GPU version). Cross-platform, cross-hardware, user-friendly.

  40. Distribution channels.

  41. Continuous integration and delivery.

  42. Legal considerations: no Intellectual Property consensus on weights. We decided to open-source the model.

  43. Bibliography and references. Industrial integrations: AconDigital plugins; various public web applications; ~30 projects referenced as using Spleeter on GitHub. Research publications: https://ieeexplore.ieee.org/document/8683555 and http://archives.ismir.net/ismir2019/latebreaking/000036.pdf

  44. Demo: Spleeter live.

  45. Thank you! research@deezer.com
