The journey to open-sourcing the code and models (FOSDEM 2020)



SLIDE 1

The journey to open-sourcing the code and models

FOSDEM 2020

Anis Khlif, Félix Voituret

SLIDE 2

Spleeter by Deezer

Who’s been involved

  • Romain Hennequin - Lead research scientist
  • Laure Prétet - Former intern
  • Anis Khlif - Research Engineer
  • Félix Voituret - Research Engineer
  • Manuel Moussallam - Head of Deezer Research

SLIDE 3

Spleeter by Deezer

What is it all about?

SLIDE 4

Spleeter by Deezer

Large impact on tech audience

  • 200k+ views
  • 9500+ stars on GitHub
  • 100k+ reads on deezer.io

SLIDE 5

Spleeter by Deezer

Myth busting

  • "Spleeter performs better than all other solutions"
  • "Deezer solved source separation"

SLIDE 6

Spleeter by Deezer

What did we bring?

  • State of the art
  • Fast
  • MIT Licensed

SLIDE 7

Primer on source separation

SLIDE 8

Primer on source separation

Waveform

[Figure: waveform; axis: Time]

SLIDE 9

Primer on source separation

Time-frequency representation

SLIDE 10

Primer on source separation

Magnitude spectrogram

[Figure: magnitude spectrogram; axes: Time, Frequencies]
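
The step from a waveform to a magnitude spectrogram can be sketched with a short-time Fourier transform. This is a minimal NumPy sketch and not Spleeter's own implementation: the function name, the frame length, the hop length and the Hann window are all illustrative assumptions.

```python
import numpy as np


def magnitude_spectrogram(waveform, frame_length=1024, hop_length=256):
    """Return a (frames, frequency bins) magnitude spectrogram of a mono signal.

    frame_length and hop_length are illustrative values, not Spleeter's settings.
    """
    window = np.hanning(frame_length)
    n_frames = 1 + (len(waveform) - frame_length) // hop_length
    frames = np.stack([
        waveform[i * hop_length:i * hop_length + frame_length] * window
        for i in range(n_frames)
    ])
    # rfft keeps the non-negative frequencies: frame_length // 2 + 1 bins.
    return np.abs(np.fft.rfft(frames, axis=1))


# Usage: one second of a 440 Hz sine sampled at 8 kHz.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
spectrogram = magnitude_spectrogram(np.sin(2 * np.pi * 440 * t))
# Convert the strongest bin of the first frame back to a frequency in Hz.
peak_hz = spectrogram[0].argmax() * sample_rate / 1024
```

Each row of the result is one time frame and each column one frequency bin, which is the time-frequency grid the following slides annotate.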

SLIDE 11

Primer on source separation

Magnitude spectrogram

Harmonic content

[Figure: magnitude spectrogram; axes: Time, Frequencies]

SLIDE 12

Primer on source separation

Magnitude spectrogram

Inharmonic, percussive content

[Figure: magnitude spectrogram; axes: Time, Frequencies]

SLIDE 13

Primer on source separation

Magnitude spectrogram

Vocal content

[Figure: magnitude spectrogram; axes: Time, Frequencies]

SLIDE 14

Primer on source separation

Magnitude spectrogram

Learn a mask for each instrument: what fraction of the energy at each time frame and each frequency bin should be assigned to this instrument?
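
To make the idea of a mask concrete, here is a toy NumPy sketch. It computes ratio masks from known sources, whereas in practice the masks are predicted by a model from the mixture alone. The spectrogram values are made up, `ratio_masks` is a hypothetical helper, and the sketch assumes for simplicity that source magnitudes add up to the mixture magnitude.

```python
import numpy as np


def ratio_masks(source_magnitudes, eps=1e-10):
    """Return one mask per source; at every time-frequency bin the masks sum to ~1."""
    stacked = np.stack(source_magnitudes)          # (sources, frames, bins)
    return stacked / (stacked.sum(axis=0) + eps)


# Toy magnitude spectrograms (2 frames x 3 frequency bins), made-up values.
vocals = np.array([[4.0, 0.0, 1.0], [2.0, 0.0, 0.0]])
instruments = np.array([[0.0, 3.0, 1.0], [2.0, 5.0, 0.0]])
mixture = vocals + instruments   # simplifying assumption: magnitudes add up

vocal_mask, instrument_mask = ratio_masks([vocals, instruments])
vocal_estimate = vocal_mask * mixture              # element-wise product
```

A mask value of 1 assigns a bin entirely to the instrument, 0 assigns none of it, and 0.5 splits it evenly, as in the bin where both toy sources have magnitude 1.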

SLIDE 15

Primer on source separation

Magnitude spectrogram

SLIDE 16

Primer on source separation

Magnitude spectrogram

SLIDE 17

Primer on source separation

Magnitude spectrogram

SLIDE 18

Spleeter models

2, 4 & 5 stems

SLIDE 19

Spleeter models

A deep learning approach to mask prediction

[Diagram: deep learning model → Vocal mask, Instruments mask]

SLIDE 20

Spleeter models

4-stems

[Diagram: deep learning model → Vocal mask, Drums mask, Bass mask, Others mask]

SLIDE 21

Spleeter models

5-stems

[Diagram: deep learning model → Vocal mask, Drums mask, Bass mask, Piano mask, Others mask]

SLIDE 22
Spleeter models

Quick introduction to TensorFlow

  • Build a computation graph that represents a parametrized function
  • Parameters (or weights) can be modified (trained) to optimize an objective function
  • A model can be run in any TensorFlow environment
  • Some graph operations can be run very efficiently on GPU

[Diagram: Inputs → Operation → Outputs]

SLIDE 23

Spleeter models

Quick introduction to TensorFlow

[Diagram: Inputs → Operation → Outputs]

model = computation graph (network architecture) + weights (parameters)
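
The split between graph and weights can be illustrated without TensorFlow. The sketch below uses plain NumPy as a stand-in: `graph` is a hypothetical fixed architecture (one linear operation followed by a sigmoid), and the weight values are made up.

```python
import numpy as np


def graph(inputs, weights):
    """Fixed architecture: one linear operation followed by a sigmoid."""
    return 1.0 / (1.0 + np.exp(-(inputs @ weights["kernel"] + weights["bias"])))


inputs = np.array([[1.0, 2.0]])
untrained = {"kernel": np.zeros((2, 1)), "bias": np.zeros(1)}
trained = {"kernel": np.array([[2.0], [-1.0]]), "bias": np.array([0.5])}  # made-up values

# Same graph, different weights: two different models.
output_untrained = graph(inputs, untrained)   # sigmoid(0) = 0.5
output_trained = graph(inputs, trained)       # sigmoid(1*2 - 2*1 + 0.5) = sigmoid(0.5)
```

Releasing a model therefore means shipping both parts: the code that builds the graph and the file that holds the trained weights.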

SLIDE 24

Spleeter models

Overview

[Diagram: U-Net → masks → * → Voice, Instruments → L1 loss]

SLIDE 25

Spleeter models

Overview

Example 1

[Diagram: U-Net → masks → * → Voice, Instruments → L1 loss]

SLIDE 26

Spleeter models

Overview

Example 1: parameter update

[Diagram: U-Net → masks → * → Voice, Instruments → L1 loss]

SLIDE 27

Spleeter models

Overview

Example 2

[Diagram: U-Net → masks → * → Voice, Instruments → L1 loss]

SLIDE 28

Spleeter models

Overview

Example 2: parameter update

[Diagram: U-Net → masks → * → Voice, Instruments → L1 loss]

SLIDE 29

Spleeter models

Overview

Example N

[Diagram: U-Net → masks → * → Voice, Instruments → L1 loss]

SLIDE 30

Spleeter models

Overview

Example N: parameter update

[Diagram: U-Net → masks → * → Voice, Instruments → L1 loss]
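
The loop these slides animate (predict a mask, multiply it with the mixture, measure an L1 loss, update the parameters, move to the next example) can be sketched in NumPy. This is a toy under loud assumptions: the U-Net is replaced by one trainable logit per frequency bin, the data is synthetic, and the update is plain gradient descent with a made-up learning rate. The slides do not say which optimizer is used.

```python
import numpy as np

rng = np.random.default_rng(0)
n_bins = 8
# Stand-in for the U-Net: one trainable logit per frequency bin (hypothetical toy model).
logits = np.zeros(n_bins)
learning_rate = 1.0


def predict_mask(logits):
    return 1.0 / (1.0 + np.exp(-logits))   # sigmoid keeps the mask in (0, 1)


def l1_loss(estimate, target):
    return np.abs(estimate - target).mean()


# Made-up data: the voice holds 80% of the mixture in the low bins, 10% elsewhere.
true_mask = np.where(np.arange(n_bins) < 4, 0.8, 0.1)
losses = []
for example in range(500):
    mixture = rng.uniform(0.5, 1.5, size=n_bins)
    voice = true_mask * mixture
    mask = predict_mask(logits)
    estimate = mask * mixture                      # the "*" of the diagram
    losses.append(l1_loss(estimate, voice))
    # Parameter update: subgradient of the L1 loss with respect to the logits.
    grad = np.sign(estimate - voice) * mixture * mask * (1.0 - mask) / n_bins
    logits -= learning_rate * grad
```

After enough examples the predicted mask settles close to `true_mask`, which is all this toy is meant to show. A real model has to generalize across tracks instead of memorizing one fixed mask.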

SLIDE 31

Spleeter models

Dataset

In-house dataset of tracks:

  • ~24k tracks with stems
  • ~80 hours of recording

… that we are not allowed to release!

SLIDE 32

BUT...

SLIDE 33

Spleeter models

Training

  • One mask per channel per source
  • 1 branch predicts masks for 1 source, with 2 channels
  • ~10M parameters per branch

[Diagram labels: 2, 4, 5]

We can release learned weights
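
The figures on this slide give a rough size for each released model. The sketch below does that arithmetic, assuming one branch per stem. The stem names follow the mask labels on the earlier slides, and the `STEMS` dictionary and `model_size` helper are illustrative, not part of Spleeter.

```python
PARAMETERS_PER_BRANCH = 10_000_000   # "~10M parameters per branch" from the slide
CHANNELS = 2                         # one mask per channel

# Stem names follow the mask labels on the slides; the dictionary itself is illustrative.
STEMS = {
    "2stems": ["vocal", "instruments"],
    "4stems": ["vocal", "drums", "bass", "others"],
    "5stems": ["vocal", "drums", "bass", "piano", "others"],
}


def model_size(configuration):
    """Return (number of masks, approximate parameter count), assuming one branch per stem."""
    branches = len(STEMS[configuration])
    return branches * CHANNELS, branches * PARAMETERS_PER_BRANCH


for name in STEMS:
    masks, parameters = model_size(name)
    print(f"{name}: {masks} masks, ~{parameters // 1_000_000}M parameters")
```

Under these assumptions the three models weigh in at roughly 20M, 40M and 50M parameters.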

SLIDE 34

Open-sourcing Spleeter

Packaging & distribution

SLIDE 35

Open-sourcing Spleeter

Packaging constraints

  • Predefined configurations
  • On-demand model downloading
  • One-liner command

SLIDE 36

Open-sourcing Spleeter

Predefined configurations

[Diagram: Spleeter with embedded configuration files]

  • JSON formatted file
  • Mostly model related parameters
  • Provided as
    ○ File path
    ○ Configuration name
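
Accepting either a file path or a configuration name could work along the lines below. This is a sketch, not Spleeter's code: the `spleeter:` prefix is taken from the command line shown later in the talk, while `load_configuration`, `EMBEDDED_CONFIGURATIONS` and the configuration contents are hypothetical placeholders.

```python
import json
from pathlib import Path

# Hypothetical embedded configurations; the keys and values are placeholders,
# not the contents of Spleeter's real configuration files.
EMBEDDED_CONFIGURATIONS = {
    "2stems": {"instruments": ["vocal", "instruments"]},
    "4stems": {"instruments": ["vocal", "drums", "bass", "others"]},
}
EMBEDDED_PREFIX = "spleeter:"


def load_configuration(descriptor):
    """Resolve a descriptor that is either an embedded name or a path to a JSON file."""
    if descriptor.startswith(EMBEDDED_PREFIX):
        name = descriptor[len(EMBEDDED_PREFIX):]
        if name not in EMBEDDED_CONFIGURATIONS:
            raise ValueError(f"No embedded configuration named {name!r}")
        return EMBEDDED_CONFIGURATIONS[name]
    with Path(descriptor).open(encoding="utf-8") as stream:
        return json.load(stream)
```

With this shape, `load_configuration("spleeter:4stems")` returns a bundled configuration, and any other string is treated as a path to a user-supplied JSON file.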

SLIDE 37

Open-sourcing Spleeter

Using GitHub releases as model hub

[Diagram: Spleeter and the deezer/spleeter GitHub repository]
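
On-demand downloading from a release page usually comes down to "fetch once, then reuse the local copy". The sketch below shows that caching pattern only. `ensure_model` and the injected `fetch` callable are hypothetical, and no release URL is assumed.

```python
from pathlib import Path


def ensure_model(name, cache_directory, fetch):
    """Return the local directory of a model, downloading it only on first use.

    `fetch(name, destination)` is a hypothetical callable that downloads and
    extracts the release archive for `name` into `destination`.
    """
    model_directory = Path(cache_directory) / name
    marker = model_directory / ".complete"
    if not marker.exists():
        model_directory.mkdir(parents=True, exist_ok=True)
        fetch(name, model_directory)
        marker.touch()   # written last, so an interrupted download is retried
    return model_directory
```

Passing the downloader in as a parameter keeps the caching logic testable without network access. The package itself can then stay small, since weights are only pulled for the configurations a user actually runs.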

SLIDE 38

Open-sourcing Spleeter

Separate source from command line

Separate with specific embedded configuration:

    $ spleeter separate -i input_file.mp3 -o output_dir -p spleeter:4stems

Separate with default 2stems configuration:

    $ spleeter separate -i input_file.mp3 -o output_dir
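
For scripting, the same command line can be assembled and launched from Python. This is a thin wrapper sketch that only uses the flags shown on the slide (`-i`, `-o`, `-p`), which may differ in other Spleeter versions. The two function names are hypothetical.

```python
import subprocess


def build_separate_command(input_file, output_directory, configuration=None):
    """Build the argument list for `spleeter separate` using the flags shown on the slide."""
    command = ["spleeter", "separate", "-i", input_file, "-o", output_directory]
    if configuration is not None:
        command += ["-p", configuration]   # e.g. "spleeter:4stems"
    return command


def separate(input_file, output_directory, configuration=None):
    """Run the separation; requires the `spleeter` executable to be installed."""
    subprocess.run(
        build_separate_command(input_file, output_directory, configuration),
        check=True,
    )
```

Leaving `configuration` unset omits `-p`, so the command falls back to the default 2stems configuration mentioned on the slide.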

SLIDE 39

Open-sourcing Spleeter

Distribution constraints

  • Cross platform
  • Cross hardware
  • User friendly

Critical dependencies to manage:

  • FFmpeg
  • TensorFlow (CPU version, GPU version)

SLIDE 40

Open-sourcing Spleeter

Distribution channels

SLIDE 41

Open-sourcing Spleeter

Continuous integration and delivery

SLIDE 42

Open-sourcing Spleeter

Legal considerations

  • No Intellectual Property consensus on weights
  • We decided to open-source the model
SLIDE 43

Open-sourcing Spleeter

Bibliography and references

~30 projects referenced as using Spleeter on GitHub

Industrial integrations:

  • Acon Digital plugins
  • Various public web applications

Research publications:

  • https://ieeexplore.ieee.org/document/8683555
  • http://archives.ismir.net/ismir2019/latebreaking/000036.pdf
SLIDE 44

Demo

Spleeter Live

SLIDE 45

Thank you

research@deezer.com