  1. Deep Learning Techniques for Music Generation 3. Generation by Feedforward Architectures Jean-Pierre Briot Jean-Pierre.Briot@lip6.fr Laboratoire d’Informatique de Paris 6 (LIP6) Sorbonne Université – CNRS Programa de Pós-Graduação em Informática (PPGI) UNIRIO Deep Learning – Music Generation – 2019 Jean-Pierre Briot

  2. Direct Use – Feedforward – Ex 1 • Feedforward Architecture • Prediction Task • Ex 1: predicting the chord associated with a melody segment (scale/mode -> tonality) [Figure: the pitches of a 4-note melody (1st to 4th) as input; the pitch of a chord as output] – Training on a corpus/dataset of <melody, chord> pairs – Production (prediction)

  3. Direct Use – Feedforward – Ex 1 • Feedforward Architecture • Classification Task • Ex 1: predicting the chord associated with a melody segment (scale/mode -> tonality) [Figure: the pitches of a 4-note melody (1st to 4th) as input; the chord's pitch class (A, A#, B, …, G#) as output] – Training on a corpus/dataset of <melody, chord> pairs – Production (classification)
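The chord-classification setup of these two slides can be sketched as a single forward pass ending in a softmax over the 12 pitch classes. This is a minimal illustration, not the author's actual network: the 4-pitch melody input, the normalization, and the random weights standing in for trained parameters are all assumptions.

```python
import numpy as np

PITCH_CLASSES = ["A", "A#", "B", "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#"]

def softmax(z):
    # Subtract the max for numerical stability before exponentiating.
    e = np.exp(z - z.max())
    return e / e.sum()

def predict_chord(melody_pitches, W, b):
    """One forward pass of a minimal feedforward classifier:
    melody pitches -> logits -> softmax -> predicted pitch class."""
    logits = W @ melody_pitches + b
    probs = softmax(logits)
    return PITCH_CLASSES[int(np.argmax(probs))], probs

# Toy input: 4 melody pitches (MIDI numbers, crudely normalized); random
# weights stand in for a trained network.
rng = np.random.default_rng(0)
melody = np.array([60, 62, 64, 65]) / 127.0   # C, D, E, F
W, b = rng.normal(size=(12, 4)), np.zeros(12)
chord, probs = predict_chord(melody, W, b)
print(chord)          # one of the 12 pitch classes; probs sums to 1
```

Training on a `<melody, chord>` corpus would then fit `W` and `b` by minimizing categorical cross-entropy, as later slides explain.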

  4. Softmax [Figure: logits transformed into probabilities]

  5. Softmax and Sigmoid • Softmax is the generalization of Sigmoid • From binary classification to categorical (multiclass) classification • Sigmoid: probability ∈ [0, 1] • Softmax: probabilities summing to 1 [Figure: logits -> probabilities]

  6. Softmax and Sigmoid • Step function (Perceptron) and Argmax are NOT differentiable • No gradient -> no possibility of backpropagation • Step function: probability ∈ {0, 1} • Argmax: probability(argmax) = 1 [Figure: logits -> probabilities]
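The sigmoid/softmax relation on these slides, and the contrast with the non-differentiable argmax, can be checked numerically. A self-contained NumPy sketch (not code from the slides):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()

# Sigmoid on one logit equals softmax over the two logits [z, 0]:
# e^z / (e^z + e^0) = 1 / (1 + e^-z).
z = 1.3
two_class = softmax(np.array([z, 0.0]))
print(sigmoid(z), two_class[0])    # identical values

# Softmax outputs a smooth, differentiable distribution; argmax collapses
# it to a hard one-hot choice, so no gradient can flow through it.
probs = softmax(np.array([2.0, 1.0, 0.1]))
hard = np.eye(3)[np.argmax(probs)]   # [1., 0., 0.]
```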

  7. Representation • Audio – Waveform – Spectrogram (Fourier transform) – Other (ex: MFCC) • Symbolic – Note – Rest – Note hold – Duration – Chord – Rhythm – Piano roll – MIDI – ABC, XML…

  8. Representation [Figure: a score and its piano-roll representation (pitches G to C), with one-hot encoding of each time slice]

  9. Encoding of Features (ex: Note Pitch) • Value – Analog • One-hot – Digital • Embedding – Constructed

  10. Encoding • Rest – Zero-hot » But ambiguity with low-probability notes – One more one-hot element – … • Hold – One more one-hot element » But only for monophonic melodies – Replay matrix – …

  11. Representation [Figure: one-hot melody encoding matrix with extra 'hold' and 'rest' rows; an all-zero time slice is ambiguous (marked '?')]

  12. Representation [Figure: the same encoding matrix with 'hold' and 'rest' rows, assuming time slice = sixteenth note]
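A sketch of the extended encoding from these slides, with explicit 'rest' and 'hold' symbols added to the one-hot vocabulary to avoid the zero-hot ambiguity (the small vocabulary and the four-slice melody are invented for illustration; monophonic melodies only, one time slice per sixteenth note):

```python
import numpy as np

# One-hot vocabulary extended with explicit 'rest' and 'hold' symbols.
VOCAB = ["A", "A#", "B", "C", "rest", "hold"]

def encode(symbols):
    """Encode a monophonic melody (one symbol per sixteenth-note slice)
    as a matrix with exactly one 1 per time slice."""
    m = np.zeros((len(VOCAB), len(symbols)))
    for t, s in enumerate(symbols):
        m[VOCAB.index(s), t] = 1.0
    return m

# An eighth-note A (A followed by hold), a sixteenth rest, a sixteenth C:
m = encode(["A", "hold", "rest", "C"])
```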

  13. Music / Representation / Network [Figure: feedforward network with the soprano voice as input layer, hidden layers, and the alto, tenor, and bass voices as output layer]

  14. Code • Python (3) • Keras • Theano or TensorFlow • Music21

  15. Direct Use – Feedforward – Ex 2: ForwardBach • Feedforward Architecture • Prediction Task • Ex 2: counterpoint (chorale) generation • Training on the set of (389) J.S. Bach chorales (Choral Gesang) [Figure: a melody as input; 3 melodies as output]

  16. ForwardBach • Bach BWV 344 chorale (training example) [Figure: original vs. regenerated scores]

  17. ForwardBach • Bach BWV 423 chorale (test example) [Figure: original vs. regenerated scores]

  18. Music / Representation / Network • Alternative: 3-models architecture [Cotrim & Briot, 2019] [Figure: the soprano/alto/tenor/bass voices as before, now handled by three separate feedforward models, one per output voice]

  19. Forward3Bach [Cotrim & Briot, 2019] • Bach BWV 423 chorale (test example) [Figure: original, single-architecture regeneration, and triple-architecture regeneration]

  20. Comparison? What happened?

  21. Overfitting Limitations • Musical accuracy is not that good (yet) • Regeneration of a training example is better than regeneration of a test/validation example • A case of overfitting

  22. Techniques • Limit (training) accuracy and control overfitting • More examples (augment the corpus) – Keeping a good style representation, coverage, and consistency – More consistency and coverage » Transpose (align) all chorales to only one key (ex: C) • More synthetic examples – More coverage » Transpose all chorales into all keys (12) • Regularization – Weight-based » L1, L2 – Connection-based » Dropout – Epoch-based » Early stopping – Analysis of learning curves
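Two of the regularization techniques listed above, an L2 weight penalty and early stopping on a validation loss, can be sketched on a synthetic toy problem. This is only an illustration of the principle with plain gradient descent on a linear model; the data, hyperparameters, and model are all assumptions, not the slides' Keras networks.

```python
import numpy as np

# Synthetic regression data split into training and validation sets.
rng = np.random.default_rng(1)
X_train, X_val = rng.normal(size=(80, 5)), rng.normal(size=(20, 5))
true_w = np.array([1.0, -2.0, 0.5, 0.0, 3.0])
y_train = X_train @ true_w + rng.normal(scale=0.1, size=80)
y_val = X_val @ true_w + rng.normal(scale=0.1, size=20)

def mse(w, X, y):
    return np.mean((X @ w - y) ** 2)

w = np.zeros(5)
lam, lr = 1e-3, 0.01                     # L2 strength and learning rate
best_val, best_w, patience, bad = np.inf, w.copy(), 10, 0
for epoch in range(1000):
    # Gradient of MSE plus the L2 penalty lam * ||w||^2 (weight decay).
    grad = 2 * X_train.T @ (X_train @ w - y_train) / len(y_train) + 2 * lam * w
    w = w - lr * grad
    val = mse(w, X_val, y_val)
    if val < best_val - 1e-6:            # validation loss still improving
        best_val, best_w, bad = val, w.copy(), 0
    else:                                # early stop after `patience` bad epochs
        bad += 1
        if bad >= patience:
            break
print(best_val)                          # far below the untrained loss
```

Monitoring the training vs. validation curves, as the slide's last bullet suggests, is what tells you where the early stop should land.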

  23. Softmax [Figure: logits transformed into probabilities (repeated from slide 4)]

  24. Softmax + Cross-Entropy • Cross-entropy measures the dissimilarity between two probability distributions (the prediction and the target/true value) [Ng 2019]
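The cross-entropy of the slide, H(target, prediction) = -Σ_i target_i · log(prediction_i), is easy to compute directly; a minimal NumPy sketch with invented toy distributions:

```python
import numpy as np

def categorical_cross_entropy(target, predicted, eps=1e-12):
    """Dissimilarity between the target distribution and the prediction:
    H(target, predicted) = -sum_i target_i * log(predicted_i).
    eps guards against log(0)."""
    return -np.sum(target * np.log(predicted + eps))

# One-hot target (the true note) vs two softmax outputs:
target = np.array([0.0, 1.0, 0.0])
good = np.array([0.1, 0.8, 0.1])    # confident and correct -> low loss
bad = np.array([0.7, 0.2, 0.1])     # confident and wrong -> high loss
print(categorical_cross_entropy(target, good))   # ~0.22 (= -log 0.8)
print(categorical_cross_entropy(target, bad))    # ~1.61 (= -log 0.2)
```

With a one-hot target the sum collapses to -log of the probability assigned to the true class, which is why training pushes that probability toward 1.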

  25. Output Activation and Cost/Loss Functions

  Task                    | Type of the output (ŷ)        | Encoding of the target (y) | Output activation function | Cost (loss) function      | Application
  Regression              | Real (ℝ)                      | Real value                 | Identity (linear)          | Mean squared error        | –
  Classification          | Binary                        | {0, 1}                     | Sigmoid                    | Binary cross-entropy      | –
  Classification          | Multiclass single label       | One-hot                    | Softmax                    | Categorical cross-entropy | Monophony
  Classification          | Multiclass multilabel         | Many-hot                   | Sigmoid                    | Binary cross-entropy      | Polyphony
  Multiple classification | Multi multiclass multilabel   | Multi many-hot             | Multi sigmoid              | Binary cross-entropy      | Multivoice
  Multiple classification | Multi multiclass single label | Multi one-hot              | Multi softmax              | Categorical cross-entropy | Multivoice

  Ex. multiclass single label: classification among a set of possible notes for a monophonic melody, with only one single possible note choice (single label).
  Ex. multiclass multilabel: classification among a set of possible notes for a single-voice polyphonic melody, therefore with several possible note choices (several labels).
  Ex. multi multiclass single label: multiple classification among a set of possible notes for multivoice monophonic melodies, therefore with only one single possible note choice for each voice; or multiple classification among a set of possible notes for a set of time slices of a monophonic melody, therefore with only one single possible note choice for each time slice.

  26. Output Activation and Cost/Loss Functions

  Task                    | Type of the output (ŷ)        | Encoding of the target (y) | Output activation function | Cost (loss) function      | Interpretation
  Regression              | Real (ℝ)                      | Real value                 | Identity (linear)          | Mean squared error        | None
  Classification          | Binary                        | {0, 1}                     | Sigmoid                    | Binary cross-entropy      | None
  Classification          | Multiclass single label       | One-hot                    | Softmax                    | Categorical cross-entropy | Argmax or sampling
  Classification          | Multiclass multilabel         | Many-hot                   | Sigmoid                    | Binary cross-entropy      | Argsort and > threshold & max-notes
  Multiple classification | Multi multiclass multilabel   | Multi many-hot             | Multi sigmoid              | Binary cross-entropy      | Argsort and > threshold & max-notes
  Multiple classification | Multi multiclass single label | Multi one-hot              | Multi softmax              | Categorical cross-entropy | Argmax or sampling

  Other cost functions: mean absolute error, Kullback-Leibler (KL) divergence…
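The 'Interpretation' column above (argmax or sampling for single-label softmax outputs; argsort with a probability threshold and a max-notes cap for multilabel sigmoid outputs) can be sketched concretely; all numbers below are invented toy outputs:

```python
import numpy as np

rng = np.random.default_rng(0)
probs = np.array([0.05, 0.6, 0.1, 0.25])   # softmax output over 4 notes

# Multiclass single label: take the most probable note, or sample from the
# distribution (sampling introduces variability into the generation).
note_argmax = int(np.argmax(probs))                 # -> 1
note_sampled = int(rng.choice(len(probs), p=probs))

# Multiclass multilabel (many-hot, sigmoid outputs): keep every note whose
# probability exceeds a threshold, capped at `max_notes` notes.
sigmoid_out = np.array([0.9, 0.2, 0.7, 0.4])
threshold, max_notes = 0.5, 2
candidates = np.argsort(sigmoid_out)[::-1]          # argsort, best first
chord = sorted(int(i) for i in candidates[:max_notes]
               if sigmoid_out[i] > threshold)
print(note_argmax, chord)                           # 1 [0, 2]
```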

  27. Output Activation and Cost/Loss Functions [Figure only]

  28. Output Activation and Cost/Loss Functions [Figure only]

  29. Output Activation and Cost/Loss Functions [Figure only]

  30. Output Activation and Cost/Loss Functions [Figure only]

  31. (Summary of) Principles of Loss Functions • Probability theory + information theory (see also the maximum likelihood principle) • Intuition: the information content of a likely event is low; the information content of an unlikely event is high • Self-information: I(x) = log(1/P(x)) = -log P(x) • Ex: I(note=B) = -log P(note=B) • Entropy of a probability distribution: the sum of the I(note=Note_i), weighted by P(note=Note_i): H(note) = Σ_i P(note=Note_i) · I(note=Note_i) • Expectation-based alternative definition, where the expectation (mean value of f(x) when x ~ P) is E_{x~P}[f(x)] = Σ_x P(x) f(x): H(note) = E_{note~P}[I(note)] = E_{note~P}[-log P(note)] = -E_{note~P}[log P(note)]
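These definitions compute directly; a small NumPy sketch over an invented four-note distribution, checking that self-information is low for likely events and that the uniform distribution has maximal entropy:

```python
import numpy as np

def self_information(p):
    """I(x) = -log P(x): low for likely events, high for unlikely ones."""
    return -np.log(p)

# An invented probability distribution over four notes (A is very likely):
P = np.array([0.7, 0.1, 0.1, 0.1])

# Entropy as the expectation of self-information under P:
# H = sum_i P_i * I(P_i) = -sum_i P_i * log(P_i)
H = np.sum(P * self_information(P))

# The uniform distribution maximizes entropy (here H_uniform = log 4):
H_uniform = np.sum(0.25 * self_information(np.full(4, 0.25)))
print(H, H_uniform)    # H < H_uniform
```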
