[PPT] - Deep Learning Techniques for Music Generation Recurrent (5) PowerPoint Presentation

SLIDE 1

Deep Learning – Music Generation – 2018

Jean-Pierre Briot

Jean-Pierre.Briot@lip6.fr

Laboratoire d’Informatique de Paris 6 (LIP6), Sorbonne Université – CNRS
Programa de Pós-Graduação em Informática (PPGI), UNIRIO

Deep Learning Techniques for Music Generation – Recurrent (5)

SLIDE 2

Recurrent

SLIDE 3

#1 Limitation – Generation and #2 Limitation – Fixed Length

Works OK

But:

Fixed input (and output) length

SLIDE 4

#1 Limitation – Generation and #2 Limitation – Fixed Length – Solution: Recurrent Network (RNN)

Works OK

But:

Fixed input (and output) length

Solution:

Recurrent Network (RNN)
Variable length
Memorizes previous steps
Predicts next step
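
The three properties above can be illustrated with a minimal sketch, assuming a plain recurrent cell with a tanh activation. The layer sizes, the random untrained weights, and the names `rnn_forward`, `W_xh`, `W_hh`, `W_hy` are illustrative assumptions, not taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
n_notes, n_hidden = 12, 8          # hypothetical sizes: 12 pitch classes, 8 hidden units

# Untrained, randomly initialised weights (illustration only).
W_xh = rng.normal(scale=0.1, size=(n_hidden, n_notes))   # input  -> hidden
W_hh = rng.normal(scale=0.1, size=(n_hidden, n_hidden))  # hidden -> hidden (recurrent)
W_hy = rng.normal(scale=0.1, size=(n_notes, n_hidden))   # hidden -> output

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def rnn_forward(note_indices):
    """Run the RNN over a sequence of any length; return one next-note distribution per step."""
    h = np.zeros(n_hidden)                     # memory of the previous steps
    predictions = []
    for idx in note_indices:
        x = np.zeros(n_notes)
        x[idx] = 1.0                           # one-hot encoding of the current note
        h = np.tanh(W_xh @ x + W_hh @ h)       # new state depends on input and previous state
        predictions.append(softmax(W_hy @ h))  # distribution over the next note
    return np.array(predictions)

short = rnn_forward([0, 4, 7])                 # 3 steps
long = rnn_forward([0, 4, 7, 9, 7, 4, 0])      # 7 steps, same weights
```

The same weights handle both sequences, which is what removes the fixed-length constraint; the hidden state `h` is the only thing carried from one step to the next.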

SLIDE 5

Recurrent Network (RNN)

Memorizes previous steps
Can learn from previous step
Predicts next step
Can learn sequences

Recurrent connections

SLIDE 6

Recurrent Connections

SLIDE 7

Alternative (More Common) Notation

SLIDE 8

RNN Prediction

SLIDE 9

Training an RNN

y = Expected next note(x)

Training with:

One example (x, y) = one note

Or

One example (x, y) = one melody, with x = y translated 1 step back
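
Both ways of building training examples can be sketched as follows. The melody (MIDI pitch numbers) and the variable names are made up for illustration.

```python
# Hypothetical melody as MIDI pitch numbers (a C major fragment).
melody = [60, 62, 64, 65, 67, 65, 64, 62]

# Option 1: one training example per note: (current note, expected next note).
note_examples = list(zip(melody[:-1], melody[1:]))

# Option 2: one training example per melody; x is y shifted one step back in time.
x = melody[:-1]   # all notes except the last
y = melody[1:]    # all notes except the first: y[t] is the note that follows x[t]
```

In both cases the target at step t is simply the input at step t + 1, so no separate labelling of the corpus is needed.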

SLIDE 10

RNN Generation

Iterated generation
Note by Note
Reinject Next Note to Produce Next Next Note
Arbitrary Length
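
The generation loop described above can be sketched as follows. `step` is a hypothetical stand-in for one forward pass of a trained RNN, and `toy_step` is a made-up rule used only to make the example runnable; neither comes from the slides.

```python
import numpy as np

def generate(step, first_note, length, state=None):
    """Iterated generation: feed each predicted note back in as the next input.

    `step(note, state)` is a hypothetical stand-in for one forward pass of a
    trained RNN; it must return (next_note_probabilities, new_state).
    """
    melody = [first_note]
    note = first_note
    for _ in range(length - 1):
        probs, state = step(note, state)
        note = int(np.argmax(probs))    # deterministic choice: most probable next note
        melody.append(note)             # reinject: this note becomes the next input
    return melody

# Toy stand-in for a trained model: always favours the note one step higher (mod 12).
def toy_step(note, state):
    probs = np.full(12, 0.01)
    probs[(note + 1) % 12] = 0.89
    return probs, state

print(generate(toy_step, first_note=0, length=5))   # [0, 1, 2, 3, 4]
```

Because `length` is just a loop bound, the generated melody can be arbitrarily long regardless of the lengths seen during training.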

[Figure: RNN with input layer, hidden layer, and output layer]

SLIDE 11

RNNMelody

[Figure: RNN with input layer (Current Note Slice), hidden layer, and output layer (Next Note Slice)]

SLIDE 12

RNN – Iterative Feedforward – #1 Example

[Figure: network with input layer, hidden layer (LSTM blocks), and output layer]

Corpus: Soprano parts of Bach Chorales

SLIDE 13

Gradient Vanishing/Explosion

SLIDE 14

LSTM (Long Short-Term Memory) [Hochreiter and Schmidhuber, 1997]

Protection of Memory by Gates
Gates are controlled by differentiable functions
Thus subject to Training
Training of the Meta-Level (Control)
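
A minimal sketch of one LSTM step, assuming the commonly used formulation with input, forget, and output gates, which may differ in detail from the diagram on the slide. The sizes, weight layout, and names (`lstm_step`, `W`, `b`) are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM step. W has shape (4*n_hidden, n_input + n_hidden), b has shape (4*n_hidden,)."""
    n = h_prev.size
    z = W @ np.concatenate([x, h_prev]) + b
    i = sigmoid(z[0 * n:1 * n])      # input gate: how much new content to write
    f = sigmoid(z[1 * n:2 * n])      # forget gate: how much old memory to keep
    o = sigmoid(z[2 * n:3 * n])      # output gate: how much memory to expose
    g = np.tanh(z[3 * n:4 * n])      # candidate content
    c = f * c_prev + i * g           # memory cell, protected by the gates
    h = o * np.tanh(c)               # output of the block
    return h, c

rng = np.random.default_rng(0)
n_input, n_hidden = 5, 3             # hypothetical sizes
W = rng.normal(scale=0.1, size=(4 * n_hidden, n_input + n_hidden))
b = np.zeros(4 * n_hidden)
h, c = lstm_step(rng.normal(size=n_input), np.zeros(n_hidden), np.zeros(n_hidden), W, b)
```

Each gate is a sigmoid of a weighted sum, hence differentiable, so the weights that control reading, writing, and forgetting are learned by the same gradient-based training as the rest of the network.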

SLIDE 15

X: 1
T: A Cup Of Tea
R: reel
M: 4/4
L: 1/8
K: Amix
|:eA (3AAA g2 fg|eA (3AAA BGGf|eA (3AAA g2 fg|1afge d2 gf:|2afge d2 cd||
|:eaag efgf|eaag edBd|eaag efge|afge dgfg:|

RNN – Iterative Feedforward – #2 Example

Ex: Celtic melody generation [Sturm et al., 2016]
Celtic Folk Music Corpus (Melodies)
Text Encoding (ABC Notation)
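
A text encoding turns melody generation into next-symbol prediction over a string. The sketch below assumes a character-level vocabulary for simplicity; the slide does not state which tokenization [Sturm et al., 2016] actually use, and all variable names are illustrative.

```python
# Fragment of the ABC tune shown on the slide.
abc_tune = "|:eA (3AAA g2 fg|eA (3AAA BGGf|"

# Vocabulary: every distinct character in the corpus gets an integer index.
vocab = sorted(set(abc_tune))
char_to_idx = {ch: i for i, ch in enumerate(vocab)}
idx_to_char = {i: ch for ch, i in char_to_idx.items()}

encoded = [char_to_idx[ch] for ch in abc_tune]        # text -> indices (network input)
decoded = "".join(idx_to_char[i] for i in encoded)    # indices -> text (network output)
```

The encoding is lossless, so whatever index sequence the network generates can be mapped back to ABC text and rendered as a score or played.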

SLIDE 16

RNN Celtic Melody Generation

Played by a human accordionist

Iterated generation

– Note by Note
– Arbitrary Length

Ex: Celtic melody generation [Sturm et al., 2016]
Celtic Folk Music Corpus (Melodies)
Text Encoding (ABC Notation)
Ex. of Melody Generated

SLIDE 17

#3 Limitation – Variability

No Variability in the Generation
Because Neural Networks are Deterministic

– Same Input -> Same Output
– Same First Note -> Same Whole Melody Generated

Solution:

– Sampling

SLIDE 18

#3 Limitation – Variability – Solution: Sampling

Input Representation: One-Hot Encoding

– Corresponds to a Piano Roll Representation

Softmax Output Layer
Classification Task (between possible Notes)
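
The input and output representations above can be sketched as follows. The vocabulary size, the melody, and the logits are made-up values for illustration.

```python
import numpy as np

n_pitches = 12                       # hypothetical vocabulary: one class per pitch class
melody = [0, 4, 7, 4]                # hypothetical note indices, one per time step

# One-hot encoding: one row per time step, a single 1 in the column of the active note.
# Stacked over time, the rows form a (monophonic) piano roll.
piano_roll = np.eye(n_pitches)[melody]

def softmax(logits):
    e = np.exp(logits - np.max(logits))   # subtract the max for numerical stability
    return e / e.sum()

# Made-up output-layer scores, one per possible note.
logits = np.array([2.0, 0.1, 0.1, 0.1, 1.5, 0.1, 0.1, 1.0, 0.1, 0.1, 0.1, 0.1])
probs = softmax(logits)              # probability of each possible next note
```

The softmax output has the same shape as a one-hot input vector, but holds a probability for every note instead of a single 1.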

SLIDE 19

Sampling

Deterministic Strategy:

– Choose the Class (Note/Pitch) with the Highest Probability

Sampling (Variability)

– Sample within Possible Notes (Classes) (following the Probability Distribution)

np.random.multinomial(1, note_one_hot_encoding)
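
The two strategies can be compared in a short sketch. It assumes that the second argument of the multinomial call is the softmax output, that is, a probability vector with the same shape as a one-hot note encoding; the three-note distribution is made up.

```python
import numpy as np

rng = np.random.default_rng(42)
probs = np.array([0.6, 0.3, 0.1])          # made-up softmax output over three notes

# Deterministic strategy: always the most probable class.
greedy_note = int(np.argmax(probs))

# Sampling: draw one note according to the distribution.
# multinomial(1, p) returns a one-hot vector; argmax recovers the sampled index.
one_hot_sample = rng.multinomial(1, probs)
sampled_note = int(np.argmax(one_hot_sample))

# Over many draws, the empirical frequencies approach `probs`.
counts = rng.multinomial(10_000, probs)
```

With sampling, two generations started from the same first note can diverge as soon as a less probable note is drawn, which is the source of variability.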

SLIDE 20

RNN Can Have Output Different Than Input – Bach Chorale Counterpoint Generation – RNN Version

[Figure: RNN with input layer, hidden layer, and output layer; voice labels: Soprano Voice, Alto Voice, Tenor Voice, Bass Voice]

SLIDE 21

RNN Encoder-Decoder

SLIDE 22

RNN Autoencoder: RNN Encoder-Decoder

SLIDE 23

From Speech to Text [Chung et al., 2016]

SLIDE 24

Translation Sequence to Sequence

SLIDE 25

Translation

SLIDE 26

Variational RNN Encoder-Decoder VRAE [Fabius and van Amersfoort, 2015]

SLIDE 27

MusicVAE [Roberts et al., 2018]

SLIDE 28

MusicVAE [Roberts et al., 2018]

Hierarchical

– Conductor RNN
– Bottom RNN

Longer term generation
Structure
Translation
Interpolation (morphing)
Averaging of some points
Addition or subtraction of an attribute vector capturing a given characteristic

– This attribute vector is computed as the average latent vector for a collection of examples sharing that attribute (characteristic)
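
The attribute-vector arithmetic can be sketched as follows. The latent vectors are random placeholders standing in for encoder outputs, and the latent size and variable names are illustrative assumptions; a trained encoder and decoder are not shown.

```python
import numpy as np

rng = np.random.default_rng(0)
latent_dim = 4                                   # hypothetical latent size

# Made-up latent vectors standing in for encoder outputs.
z_with_attribute = rng.normal(loc=1.0, size=(50, latent_dim))   # examples sharing the attribute
z_all = rng.normal(loc=0.0, size=(200, latent_dim))             # arbitrary other examples

# Attribute vector: average latent vector of the examples sharing the attribute.
attribute_vector = z_with_attribute.mean(axis=0)

z = z_all[0]
z_more = z + attribute_vector      # add the characteristic
z_less = z - attribute_vector      # subtract it
# A trained decoder (not shown) would map z_more and z_less back to music.
```

All the manipulation happens in the latent space; only the final vectors are decoded into melodies.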

SLIDE 29

MusicVAE [Roberts et al., 2018]

Averaging the latent space

SLIDE 30

MusicVAE [Roberts et al., 2018]

Comparing Interpolation

– In the data space (melodies)
– In the latent space

[Figure: interpolation in the latent space vs. in the data space]
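
A minimal sketch of interpolation between two points, assuming plain linear interpolation; the slide does not state which interpolation scheme [Roberts et al., 2018] use. The latent codes and the `interpolate` helper are made up for illustration.

```python
import numpy as np

def interpolate(a, b, n_steps):
    """Linear interpolation from a to b (both included), returning n_steps points."""
    alphas = np.linspace(0.0, 1.0, n_steps)
    return np.array([(1 - t) * a + t * b for t in alphas])

z_a = np.array([0.0, 1.0, -1.0])     # made-up latent code of melody A
z_b = np.array([2.0, -1.0, 1.0])     # made-up latent code of melody B
latent_path = interpolate(z_a, z_b, n_steps=5)
# Each row of latent_path would be passed to a trained decoder (not shown)
# to obtain an intermediate melody.
```

Applying the same function directly to two piano-roll matrices corresponds to interpolation in the data space, where the intermediate points are blends of the two note grids rather than decoded melodies.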

SLIDE 31

MusicVAE [Roberts et al., 2018]

Comparing Interpolation

– In the data space (melodies)
– In the latent space

SLIDE 32

MusicVAE [Roberts et al., 2018]

https://www.youtube.com/watch?v=G5JT16flZwM

SLIDE 33

MusicVAE [Roberts et al., 2018]

Adding a high note density attribute vector

SLIDE 34

BeatBlender in TensorFlow.js – MusicVAE [Roberts et al., 2018]

https://experiments.withgoogle.com/ai/beat-blender/view/