

SLIDE 1

Deep Learning – Music Generation – 2018

Jean-Pierre Briot


Jean-Pierre.Briot@lip6.fr

Laboratoire d’Informatique de Paris 6 (LIP6), Sorbonne Université – CNRS
Programa de Pós-Graduação em Informática (PPGI), UNIRIO

Deep Learning Techniques for Music Generation Compound and GAN (6)

SLIDE 2

Architectures

SLIDE 3

Architectures


  • Feedforward

– mini-bach.py

  • Autoencoder

– auto-bach.py; Variational Autoencoder (VAE), VRAE

  • Recurrent (RNN)

– LSTM: lstm.py, Celtic

  • Generative Adversarial Networks (GAN)
  • Restricted Boltzmann Machine (RBM)
  • Reinforcement Learning (RL)
SLIDE 4

Compound Architectures

  • Autoencoder Stack = Autoencoderⁿ

– DeepHear, auto-bach.py

  • Autoencoder(RNN, RNN) = RNN Encoder-Decoder

– VRAE

  • RNN Variational Encoder-Decoder

– Music-VAE

SLIDE 5

Generative Adversarial Networks (GAN) [Goodfellow et al., 2014]


[Nam Hyuk Ahn, 2017]

  • Training simultaneously 2 neural networks

– Generator (G)

» Transforms random noise vectors into fake samples

– Discriminator (D)

» Estimates the probability that a sample came from the training data rather than from G

– Minimax 2-player game

D(x): probability, estimated by D, that x comes from the real data (correct prediction, target P = 1)
D(G(z)): probability, estimated by D, that G(z) comes from the real data (incorrect prediction, target P = 0)
1 − D(G(z)): probability, estimated by D, that G(z) comes from the Generator (correct prediction)

SLIDE 6

GAN Equation

  • Binary cross-entropy:

H_B(y, ŷ) = −(y log ŷ + (1 − y) log(1 − ŷ))

  • For a real sample x, the target is D(x) = 1:

H_B(1, D(x)) = −(1 log D(x) + (1 − 1) log(1 − D(x)))
H_B(1, D(x)) = −log D(x)

  • For a generated sample G(z), the target is D(G(z)) = 0:

H_B(0, D(G(z))) = −(0 log D(G(z)) + (1 − 0) log(1 − D(G(z))))
H_B(0, D(G(z))) = −log(1 − D(G(z)))

  • Summing both terms gives the Discriminator loss:

H_B(1, D(x)) + H_B(0, D(G(z))) = −(log D(x) + log(1 − D(G(z))))
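As a numerical sanity check of the derivation above, the reduction of the binary cross-entropy to −log D(x) and −log(1 − D(G(z))) can be verified directly. A small sketch; the function name `bce` and the example values 0.9 and 0.2 are ours:

```python
import math

def bce(y, y_hat):
    """Binary cross-entropy H_B(y, y_hat) = -(y log y_hat + (1-y) log(1-y_hat)),
    with the usual convention 0 * log(0) = 0 for the pure targets y = 0 and y = 1."""
    real_term = y * math.log(y_hat) if y != 0 else 0.0
    fake_term = (1 - y) * math.log(1 - y_hat) if y != 1 else 0.0
    return -(real_term + fake_term)

# Example discriminator outputs: D(x) on a real sample, D(G(z)) on a fake one.
d_x, d_gz = 0.9, 0.2

loss_real = bce(1, d_x)   # target 1  ->  reduces to -log D(x)
loss_fake = bce(0, d_gz)  # target 0  ->  reduces to -log(1 - D(G(z)))
d_loss = loss_real + loss_fake

assert math.isclose(loss_real, -math.log(d_x))
assert math.isclose(loss_fake, -math.log(1 - d_gz))
print(round(d_loss, 4))  # -(log 0.9 + log 0.8)
```

Minimizing this sum over θ_D is exactly the Discriminator's side of the minimax game.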

SLIDE 7

GAN and Turing Test

[Figure: the Generator G(z, θ_G): ℝᴹ → ℝᴺ transforms a random vector z into a sample; the Discriminator D(x, θ_D): ℝᴺ → [0, 1] receives either a real sample x or a generated sample G(z) and predicts whether its input is real (artist’s rendition of a Turing test)]

[Goodfellow, 2016]

SLIDE 8

GAN Basic Training Algorithm

  • Initialize θ_G, θ_D
  • For t = 1 : m : T

– Initialize Δθ_D = 0
– For j = t : t + m − 1

» Sample z_j ~ p(z)
» Compute D(G(z_j)) and D(x_j)
» Δθ_D(j) ← gradient of the Discriminator loss L_D(θ_G, θ_D)
» Δθ_D ← Δθ_D + Δθ_D(j)

– Update θ_D
– Initialize Δθ_G = 0
– For k = t : t + m − 1

» Sample z_k ~ p(z)
» Compute D(G(z_k)) and D(x_k)
» Δθ_G(k) ← gradient of the Generator loss L_G(θ_G, θ_D)
» Δθ_G ← Δθ_G + Δθ_G(k)

– Update θ_G
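To make the alternating updates concrete, here is a self-contained toy run of the loop: a 1-D linear generator learning to match Gaussian data with mean 4, against a logistic discriminator, with minibatch gradients derived by hand. Everything here (the 1-D setting, learning rate, batch size, and the use of the non-saturating generator loss −log D(G(z))) is our illustrative choice, not code from the slides:

```python
import math
import random

random.seed(0)

def sigmoid(u):
    return 1.0 / (1.0 + math.exp(-u))

real = lambda: random.gauss(4.0, 0.5)   # real data: 1-D Gaussian, mean 4
noise = lambda: random.gauss(0.0, 1.0)  # noise prior p(z)

w, b = 1.0, 0.0   # generator G(z) = w*z + b                (theta_G)
a, c = 0.0, 0.0   # discriminator D(x) = sigmoid(a*x + c)   (theta_D)
lr, m = 0.05, 32  # learning rate, minibatch size

for _ in range(2000):
    # Discriminator step: minimize -(log D(x) + log(1 - D(G(z)))),
    # using d/du[-log sigmoid(u)] = sigmoid(u) - 1
    # and   d/du[-log(1 - sigmoid(u))] = sigmoid(u).
    ga = gc = 0.0
    for _ in range(m):
        x, g = real(), w * noise() + b
        dx, dg = sigmoid(a * x + c), sigmoid(a * g + c)
        ga += (dx - 1.0) * x + dg * g
        gc += (dx - 1.0) + dg
    a -= lr * ga / m
    c -= lr * gc / m

    # Generator step: minimize the non-saturating loss -log D(G(z)).
    gw = gb = 0.0
    for _ in range(m):
        z = noise()
        dg = sigmoid(a * (w * z + b) + c)
        gw += (dg - 1.0) * a * z
        gb += (dg - 1.0) * a
    w -= lr * gw / m
    b -= lr * gb / m

gen_mean = sum(w * noise() + b for _ in range(1000)) / 1000
print(round(gen_mean, 2))  # started at 0, should drift toward the data mean 4
```

Even in this tiny setting the characteristic dynamics appear: the discriminator first learns to separate fake from real, then its gradient drags the generator's output distribution onto the data distribution.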

SLIDE 9

Examples of GAN Generated Images

CelebFaces Attributes Dataset (CelebA), > 200K celebrity images; synthetic (generated) celebrity images

[Karras et al., 2018] [Brundage et al., 2018]

SLIDE 10

C-RNN-GAN [Mogren, 2016]

GAN(Bidirectional-LSTM², LSTM²)

  • The Discriminator considers the hidden layer (forward and backward) values as being representative (or not) of the real data

– Analogous to the RNN Encoder-Decoder, which considers the hidden layer as the summary of a sequence

  • Classical music training dataset

SLIDE 11

MidiNet [Yang et al., 2017]

GAN(Conditioning(Convolutional(Feedforward), Convolutional(Feedforward(History, Chord sequence))), Conditioning(Convolutional(Feedforward), History))

  • Convolutional
  • Conditioning

– Previous measure
– Chord sequence

  • Pop music training dataset

https://soundcloud.com/vgtsv6jf5fwq/model3
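The conditioning inputs listed above (previous measure and chord sequence) are, in the simplest scheme, concatenated with the generator's random input. A framework-free sketch; the dimensions (100, 16, 13) and names are illustrative assumptions, not MidiNet's exact implementation:

```python
import random

# Illustrative dimensions (our assumptions): noise, previous measure, chord
NOISE_DIM, MEASURE_DIM, CHORD_DIM = 100, 16, 13

def conditioned_input(z, prev_measure, chord):
    """Simplest form of conditioning: concatenate the condition
    vectors to the noise vector before feeding the generator."""
    assert len(z) == NOISE_DIM
    assert len(prev_measure) == MEASURE_DIM
    assert len(chord) == CHORD_DIM
    return z + prev_measure + chord  # list concatenation

z = [random.random() for _ in range(NOISE_DIM)]
prev_measure = [0.0] * MEASURE_DIM  # e.g., a piano-roll slice of the previous measure
chord = [0.0] * CHORD_DIM           # e.g., 12 pitch classes + a major/minor flag
x = conditioned_input(z, prev_measure, chord)
print(len(x))  # 129
```

In a convolutional generator the condition can also be injected at intermediate layers rather than only at the input, but the concatenation above conveys the basic idea.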

SLIDE 12

VAE vs GAN

  • VAE (Variational Autoencoder) and GAN (Generative Adversarial Networks)

Some similarities:

  • Both are generative architectures
  • Both generate from random latent variables

Differences:

  • A VAE is representative of the whole training dataset; a GAN is not
  • Both offer a control interface for exploring the latent space (e.g., interpolation), but it is smoother for a VAE than for a GAN
  • A GAN produces better quality content (e.g., higher-resolution images)

[Dykeman, 2016]
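The interpolation point above can be made concrete: exploring the latent space means decoding (VAE) or generating from (GAN) a sequence of latent vectors interpolated between two endpoints. A minimal framework-free sketch (the name `lerp` and the example vectors are ours):

```python
def lerp(z1, z2, t):
    """Linear interpolation between two latent vectors, t in [0, 1]."""
    return [(1 - t) * a + t * b for a, b in zip(z1, z2)]

z_start = [0.0, 0.0, 0.0, 0.0]
z_end = [1.0, -1.0, 2.0, 0.5]

# Decoding each interpolated z with a VAE decoder (or feeding it to a GAN
# generator) yields a gradual morphing between the two generated pieces.
path = [lerp(z_start, z_end, k / 4) for k in range(5)]
print(path[2])  # midpoint: [0.5, -0.5, 1.0, 0.25]
```

In practice, spherical interpolation (slerp) is often preferred over linear interpolation for Gaussian latent spaces, but the idea is the same.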

SLIDE 13

Compound Architectures

  • Composition

– Bidirectional RNN, combining two RNNs, forward and backward in time
– RNN-RBM [Boulanger-Lewandowski et al., 2012], combining an RNN (horizontal/sequence) and an RBM (vertical/chords)

  • Refinement

– Sparse autoencoder
– Variational autoencoder (VAE) = Variational(Autoencoder)

  • Nested

– Stacked autoencoder = Autoencoderⁿ
– RNN Encoder-Decoder = Autoencoder(RNN, RNN)

  • Pattern instantiation

– C-RBM [Lattner et al., 2016] = Convolutional(RBM)
– C-RNN-GAN [Mogren, 2016] = GAN(Bidirectional-LSTM², LSTM²)
– Anticipation-RNN [Hadjeres & Nielsen, 2017] = Conditioning(RNN, RNN)
