Deep Learning Techniques for Music Generation Recurrent (5)


slide-1
SLIDE 1

Deep Learning – Music Generation – 2018

Jean-Pierre Briot

Jean-Pierre.Briot@lip6.fr

Laboratoire d’Informatique de Paris 6 (LIP6) Sorbonne Université – CNRS Programa de Pós-Graduação em Informática (PPGI) UNIRIO

Deep Learning Techniques for Music Generation Recurrent (5)

slide-2
SLIDE 2

Recurrent

slide-3
SLIDE 3

#1 Limitation – Generation and #2 Limitation – Fixed Length

  • Works OK

But:

  • Fixed input (and output) length

slide-4
SLIDE 4

#1 Limitation – Generation and #2 Limitation – Fixed Length – Solution: Recurrent Network (RNN)

  • Works OK

But:

  • Fixed input (and output) length

Solution:

  • Recurrent Network (RNN)
  • Variable length
  • Memorizes previous steps
  • Predicts next step

slide-5
SLIDE 5

Recurrent Network (RNN)

  • Memorizes previous steps
  • Can learn from previous step
  • Predicts next step
  • Can learn sequences

[Figure: recurrent connections]
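The points above can be sketched as a single recurrent step. This is a minimal NumPy illustration of a simple (Elman-style) RNN, not the network used in the lecture; the sizes, random weights and note indices are all hypothetical.

```python
import numpy as np

def rnn_step(x, h_prev, W_xh, W_hh, W_hy, b_h, b_y):
    """One time step of a simple recurrent network."""
    # The hidden state mixes the current input with the previous hidden state:
    # this is how the network "memorizes" earlier steps.
    h = np.tanh(W_xh @ x + W_hh @ h_prev + b_h)
    # The output holds one score per possible next note.
    y = W_hy @ h + b_y
    return y, h

# Hypothetical sizes and untrained random weights, for illustration only.
rng = np.random.default_rng(0)
n_notes, n_hidden = 4, 8
W_xh = rng.normal(scale=0.1, size=(n_hidden, n_notes))
W_hh = rng.normal(scale=0.1, size=(n_hidden, n_hidden))
W_hy = rng.normal(scale=0.1, size=(n_notes, n_hidden))
b_h = np.zeros(n_hidden)
b_y = np.zeros(n_notes)

h = np.zeros(n_hidden)
sequence = [0, 2, 1, 3, 3]  # note indices; any length works
for note in sequence:
    x = np.eye(n_notes)[note]  # one-hot input for the current note
    y, h = rnn_step(x, h, W_xh, W_hh, W_hy, b_h, b_y)
```

The same weights are reused at every step, which is why the sequence length does not have to be fixed in advance.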

slide-6
SLIDE 6

Recurrent Connections

slide-7
SLIDE 7

Alternative (More Common) Notation

slide-8
SLIDE 8

RNN Prediction

slide-9
SLIDE 9

Training an RNN

y = expected next note after x

Training with:

  • One example (x, y) = one note

Or

  • One example (x, y) = one melody, with x = y shifted one time step back (y is the melody one step ahead of x)
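The shifted-by-one relation between x and y can be sketched as follows. The melody is a hypothetical list of MIDI pitch numbers, and `make_training_pair` is an illustrative helper, not a function from the lecture.

```python
def make_training_pair(melody):
    """Return (x, y) where y is the melody one step ahead of x."""
    x = melody[:-1]  # every note except the last
    y = melody[1:]   # every note except the first: the expected next notes
    return x, y

melody = [60, 62, 64, 65, 67]  # hypothetical MIDI pitches
x, y = make_training_pair(melody)

# Melody-level example: the pair (x, y) as a whole.
# Note-level examples: one (current note, next note) pair per time step.
note_examples = list(zip(x, y))
```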

slide-10
SLIDE 10

RNN Generation

  • Iterated generation
  • Note by Note
  • Reinject Next Note to Produce Next Next Note
  • Arbitrary Length

[Figure: input layer, hidden layer, output layer]
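The reinjection loop can be sketched as below. `toy_step` is a hypothetical stand-in for a trained RNN (it simply walks up a five-note scale and counts steps as its "state"); only the structure of the loop is the point.

```python
def generate(step, first_note, length, state=None):
    """Generate `length` notes by feeding each predicted note back as input."""
    melody = [first_note]
    note = first_note
    for _ in range(length - 1):
        note, state = step(note, state)  # predict next note, update memory
        melody.append(note)              # the prediction becomes the next input
    return melody

def toy_step(note, state):
    """Hypothetical stand-in for a trained RNN step."""
    scale = [60, 62, 64, 65, 67]
    steps_taken = 0 if state is None else state
    next_note = scale[(scale.index(note) + 1) % len(scale)]
    return next_note, steps_taken + 1

melody = generate(toy_step, first_note=60, length=7)
```

Because the loop only needs the previous note and the carried state, `length` can be chosen freely.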

slide-11
SLIDE 11

RNNMelody

[Figure: input layer (current note slice), hidden layer, output layer (next note slice)]

slide-12
SLIDE 12

RNN – Iterative Feedforward – #1 Example

[Figure: input layer, hidden layer (LSTM blocks), output layer]

Corpus: Soprano parts of Bach Chorales

slide-13
SLIDE 13

Gradient Vanishing/Explosion

slide-14
SLIDE 14

LSTM (Long Short-Term Memory) [Hochreiter and Schmidhuber, 1997]

  • Protection of Memory by Gates
  • Gates are controlled by differentiable functions
  • Thus subject to Training
  • Training of the Meta-Level (Control)
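A minimal sketch of one LSTM step shows how the gates protect the memory cell. It assumes the now-common variant with a forget gate; the weight layout, sizes and random values are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM step. W has shape (4 * n_hidden, n_input + n_hidden)."""
    n = h_prev.shape[0]
    z = W @ np.concatenate([x, h_prev]) + b
    i = sigmoid(z[0 * n:1 * n])  # input gate: how much new content to write
    f = sigmoid(z[1 * n:2 * n])  # forget gate: how much old memory to keep
    o = sigmoid(z[2 * n:3 * n])  # output gate: how much memory to expose
    g = np.tanh(z[3 * n:4 * n])  # candidate memory content
    c = f * c_prev + i * g       # protected memory cell
    h = o * np.tanh(c)           # hidden state passed to the next step
    return h, c

# Hypothetical sizes and untrained random weights.
rng = np.random.default_rng(0)
n_input, n_hidden = 3, 5
W = rng.normal(scale=0.1, size=(4 * n_hidden, n_input + n_hidden))
b = np.zeros(4 * n_hidden)
h, c = lstm_step(rng.normal(size=n_input), np.zeros(n_hidden), np.zeros(n_hidden), W, b)
```

The gates are sigmoid functions of learned weights, so they are differentiable and are trained together with the rest of the network.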
slide-15
SLIDE 15

RNN – Iterative Feedforward – #2 Example

  • Ex: Celtic melody generation [Sturm et al., 2016]
  • Celtic Folk Music Corpus (Melodies)
  • Text Encoding (ABC Notation)

X: 1
T: A Cup Of Tea
R: reel
M: 4/4
L: 1/8
K: Amix
|:eA (3AAA g2 fg|eA (3AAA BGGf|eA (3AAA g2 fg|1afge d2 gf:|2afge d2 cd||
|:eaag efgf|eaag edBd|eaag efge|afge dgfg:|
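A text encoding turns the ABC transcription into a sequence of integer indices that a network can consume. The sketch below assumes a character-level vocabulary, which may differ from the tokenization used in the cited work; the tune fragment is shortened from the example.

```python
# Shortened fragment of the ABC example (header fields and the first bars).
abc_tune = "M:4/4\nK:Amix\n|:eA (3AAA g2 fg|eA (3AAA BGGf|"

# Build a vocabulary of the distinct characters, then map text to indices.
vocab = sorted(set(abc_tune))
char_to_index = {ch: i for i, ch in enumerate(vocab)}
index_to_char = {i: ch for ch, i in char_to_index.items()}

encoded = [char_to_index[ch] for ch in abc_tune]        # input to the network
decoded = "".join(index_to_char[i] for i in encoded)    # back to ABC text
```

Generated index sequences can be decoded the same way, giving ABC text that standard tools can render or play.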
slide-16
SLIDE 16

RNN Celtic Melody Generation

  • Iterated generation
      – Note by Note
      – Arbitrary Length
  • Ex: Celtic melody generation [Sturm et al., 2016]
  • Celtic Folk Music Corpus (Melodies)
  • Text Encoding (ABC Notation)
  • Ex. of Melody Generated (played by a human accordionist)
slide-17
SLIDE 17

#3 Limitation – Variability

  • No Variability in the Generation
  • Because Neural Networks are Deterministic
      – Same Input -> Same Output
      – Same First Note -> Same Whole Melody Generated
  • Solution:
      – Sampling

slide-18
SLIDE 18

#3 Limitation – Variability – Solution: Sampling

  • Input Representation: One-Hot Encoding
      – Corresponds to a Piano Roll Representation
  • Softmax Output Layer
  • Classification Task (between possible Notes)
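As an illustration of the two representations, the sketch below one-hot encodes an input note and turns output scores into a probability distribution with softmax. The pitch list and the scores are hypothetical.

```python
import numpy as np

def one_hot(index, size):
    """Vector of zeros with a single 1 at the position of the note."""
    v = np.zeros(size)
    v[index] = 1.0
    return v

def softmax(logits):
    """Turn arbitrary scores into probabilities that sum to 1."""
    shifted = logits - np.max(logits)  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

pitches = ["C4", "D4", "E4", "F4", "G4"]          # hypothetical note classes
x = one_hot(pitches.index("E4"), len(pitches))    # input: current note

logits = np.array([0.5, 2.0, 0.1, -1.0, 1.0])     # hypothetical output scores
probs = softmax(logits)                           # one probability per note class
```

Predicting the next note is then a classification over the note classes.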

slide-19
SLIDE 19

Sampling

  • Deterministic Strategy:
      – Choose the Class (Note/Pitch) with the Highest Probability
  • Sampling (Variability)
      – Sample within Possible Notes (Classes) (following the Probability Distribution)

np.random.multinomial(1, note_one_hot_encoding)
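The two strategies can be compared side by side. This sketch assumes the second argument of the multinomial call is the softmax output, that is, a probability per note class; the probabilities and the seed are hypothetical.

```python
import numpy as np

# Hypothetical softmax output over 5 note classes.
probs = np.array([0.1, 0.6, 0.05, 0.05, 0.2])

# Deterministic strategy: always pick the most probable class.
greedy = int(np.argmax(probs))

# Sampling: draw one class according to the distribution.
rng = np.random.default_rng(42)
draw = rng.multinomial(1, probs)   # a one-hot vector marking the drawn class
sampled = int(np.argmax(draw))     # index of the drawn note

# Repeated draws give different notes, hence variability in generation.
samples = [int(np.argmax(rng.multinomial(1, probs))) for _ in range(1000)]
```

With sampling, the same first note no longer forces the same melody, because each step may pick a different class.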

slide-20
SLIDE 20

RNN Can Have Output Different Than Input

Bach Chorale Counterpoint Generation – RNN Version

[Figure: soprano, alto, tenor and bass voices; input layer, hidden layer, output layer]

slide-21
SLIDE 21

RNN Encoder-Decoder

slide-22
SLIDE 22

RNN Autoencoder: RNN Encoder-Decoder

slide-23
SLIDE 23

From Speech to Text [Chung et al., 2016]

slide-24
SLIDE 24

Translation Sequence to Sequence

slide-25
SLIDE 25

Translation

slide-26
SLIDE 26

Variational RNN Encoder-Decoder VRAE [Fabius and van Amersfoort, 2015]

slide-27
SLIDE 27

MusicVAE [Roberts et al., 2018]

slide-28
SLIDE 28

MusicVAE [Roberts et al., 2018]

  • Hierarchical
      – Conductor RNN
      – Bottom RNN
  • Longer term generation
  • Structure
  • Translation
  • Interpolation (morphing)
  • Averaging of some points
  • Addition or subtraction of an attribute vector capturing a given characteristic
      – This attribute vector is computed as the average latent vector for a collection of examples sharing that attribute (characteristic)
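The attribute-vector arithmetic described above can be sketched with plain arrays. The latent codes here are random placeholders, not outputs of a real MusicVAE encoder, and the computation follows the slide's description (a plain average), which may differ in detail from the cited work.

```python
import numpy as np

rng = np.random.default_rng(0)
latent_dim = 4

# Hypothetical latent codes of 10 examples sharing an attribute
# (for instance, high note density).
z_with_attribute = rng.normal(loc=1.0, size=(10, latent_dim))

# Attribute vector: the average latent vector of those examples.
attribute_vector = z_with_attribute.mean(axis=0)

# Latent code of some other melody (hypothetical).
z = rng.normal(size=latent_dim)

z_more = z + attribute_vector  # push the melody toward the attribute
z_less = z - attribute_vector  # push it away from the attribute
```

Decoding `z_more` or `z_less` with the trained decoder would then yield the modified melody; the decoder is not part of this sketch.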

slide-29
SLIDE 29

MusicVAE [Roberts et al., 2018]

  • Averaging the latent space
slide-30
SLIDE 30

MusicVAE [Roberts et al., 2018]

  • Comparing Interpolation
      – In the data space (melodies)
      – In the latent space

[Figure panels: latent space, data space]
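Interpolation in the latent space can be sketched as a walk between two latent codes. Linear interpolation is assumed here for simplicity; the cited work may use a different scheme, and the two codes are hypothetical.

```python
import numpy as np

def interpolate(z_a, z_b, n_steps):
    """Return n_steps latent codes going linearly from z_a to z_b."""
    alphas = np.linspace(0.0, 1.0, n_steps)
    return np.array([(1.0 - a) * z_a + a * z_b for a in alphas])

# Hypothetical latent codes of two melodies.
z_a = np.array([0.0, 1.0, -1.0])
z_b = np.array([2.0, -1.0, 1.0])

path = interpolate(z_a, z_b, 5)  # each row would be decoded into a melody
```

Interpolating in the data space would instead blend the note sequences themselves, which is the comparison the slide draws.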

slide-31
SLIDE 31

MusicVAE [Roberts et al., 2018]

  • Comparing Interpolation
      – In the data space (melodies)
      – In the latent space

slide-32
SLIDE 32

MusicVAE [Roberts et al., 2018]

https://www.youtube.com/watch?v=G5JT16flZwM

slide-33
SLIDE 33

MusicVAE [Roberts et al., 2018]

  • Adding a high note density attribute vector
slide-34
SLIDE 34

BeatBlender in TensorFlow.js MusicVAE [Roberts et al., 2018]

https://experiments.withgoogle.com/ai/beat-blender/view/

slide-35
SLIDE 35

LatentLoops in TensorFlow.js MusicVAE [Roberts et al., 2018]

https://teampieshop.github.io/latent-loops/