

  1. Deep Learning Techniques for Music Generation – 3. Generation by Feedforward Architectures. Jean-Pierre Briot, Jean-Pierre.Briot@lip6.fr, Laboratoire d'Informatique de Paris 6 (LIP6), Sorbonne Université – CNRS; Programa de Pós-Graduação em Informática (PPGI), UNIRIO

  2. Direct Use – Feedforward – Ex 1 • Feedforward Architecture • Prediction Task • Ex 1: Predicting the chord associated with a melody segment (scale/mode -> tonality) [Figure: feedforward network; input: the pitches of the melody's 1st–4th notes, output: the pitch of a chord] – Training on a corpus/dataset of <melody, chord> pairs – Production (Prediction)

  3. Direct Use – Feedforward – Ex 1 • Feedforward Architecture • Classification Task • Ex 1: Predicting the chord associated with a melody segment (scale/mode -> tonality) [Figure: feedforward network; input: the melody's 1st–4th note pitches, output: the chord as a one-hot pitch class (A, A#, B, …, G#)] – Training on a corpus/dataset of <melody, chord> pairs – Production (Classification)
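
A minimal Keras sketch of such a feedforward chord classifier (a hypothetical illustration, not code from the slides; the melody length, pitch-class vocabulary, and hidden-layer size are assumptions):

```python
# Hypothetical sketch of the chord-classification example above (Keras).
# Assumptions: 4 melody notes, each one-hot over 12 pitch classes; output = chord pitch class.
from tensorflow import keras
from tensorflow.keras import layers

num_pitch_classes = 12
melody_length = 4

model = keras.Sequential([
    keras.Input(shape=(melody_length * num_pitch_classes,)),   # flattened one-hot melody
    layers.Dense(64, activation='relu'),                       # hidden layer (size is arbitrary)
    layers.Dense(num_pitch_classes, activation='softmax'),     # one-hot chord pitch class
])
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Training uses a corpus of <melody, chord> pairs, e.g.:
# model.fit(x_train, y_train, epochs=50, validation_split=0.1)
```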

  4. Softmax [Figure: softmax layer mapping logits to probabilities]

  5. Softmax and Sigmoid • Softmax is the generalization of Sigmoid • From Binary classification to Categorical (Multiclass – Single Label) classification • Sigmoid: σ(z) = 1 / (1 + e^(−z)), a probability ∈ [0, 1] • Softmax: σ(z)_i = e^(z_i) / Σ_j e^(z_j), probabilities summing to 1 [Figure: sigmoid and softmax mapping logits to probabilities]
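
As a small illustration (example values assumed, not from the slides), sigmoid and softmax can be computed directly on a vector of logits:

```python
# Sigmoid vs. softmax on example logits (illustrative values).
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - np.max(z))   # subtract the max for numerical stability
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.1])
print(sigmoid(logits))   # independent probabilities, each in [0, 1]
print(softmax(logits))   # probabilities summing to 1, approx. [0.659, 0.242, 0.099]
```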

  6. Softmax and Sigmoid • Step function and Argmax are NOT differentiable • No gradient -> no possibility of backpropagation • Step function (Perceptron): output ∈ {0, 1} [Figure: step function vs. softmax mapping logits to probabilities]

  7. Direct Use – Feedforward – Ex: ForwardBach • Feedforward Architecture • Classification Task • Ex: Counterpoint (Chorale) generation • Training on the set of (389) J.S. Bach Chorales (Choral Gesang) [Figure: feedforward network; input: a melody, output: 3 accompanying melodies]

  8. Representation • Audio – Waveform – Spectrogram (Fourier Transform) – Other (ex: MFCC) • Symbolic – Note – Rest – Note hold – Duration – Chord – Rhythm – Piano Roll – MIDI – ABC, XML…

  9. Representation [Figure: the same melody shown as a score, as a piano roll (pitch axis G, G#, A, A#, B, C), and as a one-hot encoding (columns of 0s with a single 1)]
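
A small sketch (the pitch vocabulary and melody are made up for illustration) of how a monophonic melody maps to a one-hot / piano-roll matrix:

```python
# One-hot (piano-roll) encoding of a short monophonic melody (illustrative).
import numpy as np

pitches = ['G', 'G#', 'A', 'A#', 'B', 'C']            # assumed pitch vocabulary
index = {p: i for i, p in enumerate(pitches)}

melody = ['A', 'C', 'B', 'A']                          # made-up melody, one note per time step
piano_roll = np.zeros((len(melody), len(pitches)), dtype=int)
for t, pitch in enumerate(melody):
    piano_roll[t, index[pitch]] = 1                    # exactly one 1 per time step

print(piano_roll)
# [[0 0 1 0 0 0]
#  [0 0 0 0 0 1]
#  [0 0 0 0 1 0]
#  [0 0 1 0 0 0]]
```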

  10. Encoding of Features (ex: Note Pitch) • Value – Analog • One-Hot – Digital • Embedding – Constructed

  11. Encoding • Rest – Zero-hot » But ambiguity with low probability notes – One more one-hot element – … • Hold – One more one-hot element » But only for monophonic melodies – Replay matrix – …

  12. Representation [Figure: piano-roll-like matrix with one column per time slice and rows for 'hold', 'rest', and the pitches (…, A, …, C); each column is one-hot]

  13. Representation [Figure: the same matrix, with the time slice set to a sixteenth note]
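
An illustrative sketch (vocabulary and melody assumed) of encoding a melody over sixteenth-note time slices with extra 'rest' and 'hold' symbols:

```python
# One-hot encoding with 'hold' and 'rest' as extra symbols, one column per sixteenth-note slice.
import numpy as np

vocab = ['rest', 'hold', 'A', 'B', 'C']               # assumed symbol vocabulary
index = {s: i for i, s in enumerate(vocab)}

# A quarter-note A (1 onset + 3 holds), an eighth-note C (1 onset + 1 hold), then rests.
steps = ['A', 'hold', 'hold', 'hold', 'C', 'hold', 'rest', 'rest']
encoded = np.zeros((len(steps), len(vocab)), dtype=int)
for t, symbol in enumerate(steps):
    encoded[t, index[symbol]] = 1                      # one-hot per time slice

print(encoded.shape)   # (8, 5): 8 sixteenth-note slices, 5 symbols
```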

  14. Representation Choices • More Alternative Possibilities for Note: – Pitch Class One-hot + Octave number – Binary encoding of the MIDI number – … • The Most Compact Representation is not always the Best for Training/Generation • Hold as a Special Note Symbol – Risk of a skewed class, because it appears too often

  15. Music / Representation / Network [Figure: feedforward network with an input layer (Soprano voice), hidden layers, and an output layer split into Alto, Tenor, and Bass voices]

  16. Code • Python (3) • Keras • Theano or TensorFlow • Music21
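
A minimal usage sketch of Music21 (illustrative; BWV 66.6 is simply a chorale known to be in music21's built-in corpus):

```python
# Loading a Bach chorale with music21 and listing each voice's pitches as MIDI numbers.
from music21 import corpus

chorale = corpus.parse('bach/bwv66.6')       # any chorale from the built-in corpus
for part in chorale.parts:
    midi_pitches = [n.pitch.midi for n in part.recurse().notes]
    print(part.partName, midi_pitches[:8])   # part name and its first 8 MIDI pitches
```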

  17. Direct Use – Feedforward – Ex 2: ForwardBach • Feedforward Architecture • Prediction Task • Ex 2: Counterpoint (Chorale) generation • Training on the set of (389) J.S. Bach Chorales (Choral Gesang) [Figure: feedforward network; input: a melody, output: 3 accompanying melodies]
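
A hypothetical Keras sketch in the spirit of ForwardBach (not the actual implementation; all dimensions and layer sizes are assumptions): the soprano melody, encoded as a flattened piano roll, is mapped by a feedforward network to the three other voices.

```python
# Hypothetical feedforward soprano -> (alto, tenor, bass) model (not the actual ForwardBach code).
from tensorflow import keras
from tensorflow.keras import layers

time_steps = 64     # e.g. 4 measures of sixteenth-note slices (assumption)
vocab_size = 50     # pitch range plus rest/hold symbols (assumption)

soprano_in = keras.Input(shape=(time_steps * vocab_size,), name='soprano')
h = layers.Dense(256, activation='relu')(soprano_in)
h = layers.Dense(256, activation='relu')(h)

# For brevity each voice is a single sigmoid vector; per-time-step softmaxes would be
# closer to the multiclass single-label setting discussed later.
outputs = [layers.Dense(time_steps * vocab_size, activation='sigmoid', name=voice)(h)
           for voice in ('alto', 'tenor', 'bass')]

model = keras.Model(soprano_in, outputs)
model.compile(optimizer='adam', loss='binary_crossentropy')
model.summary()
```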

  18. ForwardBach – Bach BWV 344 Chorale (Training Example) [Figure: original vs. regenerated scores]

  19. ForwardBach – Bach BWV 423 Chorale (Test Example) [Figure: original vs. regenerated scores]

  20. Music / Representation / Network – Alternative 3-Models Architecture [Cotrim & Briot, 2019] [Figure: three separate feedforward networks, each with the Soprano voice as input layer, its own hidden layers, and one output voice (Alto, Tenor, or Bass)]
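
A hedged sketch of the 3-models alternative (structure and dimensions assumed, not the [Cotrim & Briot, 2019] code): one independent network per output voice, each taking the soprano as input and trained separately.

```python
# Hypothetical 3-models variant: one independent feedforward network per voice.
from tensorflow import keras
from tensorflow.keras import layers

input_dim = 64 * 50    # same flattened soprano encoding as in the previous sketch (assumption)

def make_voice_model(voice_name):
    inp = keras.Input(shape=(input_dim,), name='soprano')
    h = layers.Dense(256, activation='relu')(inp)
    out = layers.Dense(input_dim, activation='sigmoid', name=voice_name)(h)
    model = keras.Model(inp, out)
    model.compile(optimizer='adam', loss='binary_crossentropy')
    return model

voice_models = {voice: make_voice_model(voice) for voice in ('alto', 'tenor', 'bass')}
# Each model is trained on its own voice: voice_models['alto'].fit(x_soprano, y_alto, ...)
```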

  21. Forward3Bach [Cotrim & Briot, 2019] – Bach BWV 423 Chorale (Test Example) [Figure: original score vs. single-architecture regeneration vs. triple-architecture regeneration]

  22. Comparison? What happened?

  23. Overfitting Limitations • Musical accuracy is not that good (yet) • Regeneration of a training example is better than regeneration of a test/validation example • A case of Overfitting

  24. Techniques • Limit Accuracy and Control Overfitting • More Examples (Augment the Corpus) – Keeping a Good Style Representation, Coverage and Consistency – More Consistency and Coverage – Transpose (Align) All Chorales to a Single Key (ex: C) • More Synthetic Examples – More Coverage – Transpose All Chorales into All (12) Keys • Regularization (see the sketch below) – Weight-based » L1, L2 – Connection-based » Dropout – Epochs-based » Early Stop – Analysis of Learning Curves
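
An illustrative sketch (not the course's code; sizes and hyperparameters are arbitrary) combining the regularization techniques listed above in Keras:

```python
# Weight-based (L2), connection-based (dropout) and epochs-based (early stop) regularization.
from tensorflow import keras
from tensorflow.keras import layers, regularizers

model = keras.Sequential([
    keras.Input(shape=(128,)),                                    # input size is arbitrary
    layers.Dense(256, activation='relu',
                 kernel_regularizer=regularizers.l2(1e-4)),       # L2 weight penalty
    layers.Dropout(0.3),                                          # dropout
    layers.Dense(12, activation='softmax'),
])
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

early_stop = keras.callbacks.EarlyStopping(                       # early stop on validation loss
    monitor='val_loss', patience=10, restore_best_weights=True)

# history = model.fit(x_train, y_train, validation_split=0.1, epochs=200, callbacks=[early_stop])
# Plotting history.history['loss'] against history.history['val_loss'] gives the learning curves.
```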

  25. Softmax [Figure: softmax layer mapping logits to probabilities]

  26. Softmax + Cross-Entropy • Cross-Entropy measures the dissimilarity between two probability distributions (the prediction and the target/true value) [Ng 2019]
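
A small worked example (values assumed) of categorical cross-entropy between a softmax prediction and a one-hot target:

```python
# Categorical cross-entropy between a softmax prediction and a one-hot target (illustrative).
import numpy as np

target = np.array([1.0, 0.0, 0.0])          # one-hot true class
good_prediction = np.array([0.7, 0.2, 0.1])
bad_prediction = np.array([0.1, 0.2, 0.7])

def cross_entropy(y_true, y_pred):
    return -np.sum(y_true * np.log(y_pred))

print(cross_entropy(target, good_prediction))   # -log(0.7) ≈ 0.36 (low loss)
print(cross_entropy(target, bad_prediction))    # -log(0.1) ≈ 2.30 (high loss)
```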

  27. Output Activation and Cost/Loss Functions

  Task | Type of the output (ŷ) | Encoding of the target (y) | Output activation function | Cost (loss) function | Application
  Regression | Real (ℝ) | – | Identity (Linear) | Mean squared error | –
  Classification | Binary | {0, 1} | Sigmoid | Binary cross-entropy | –
  Classification | Multiclass single label | One-hot | Softmax | Categorical cross-entropy | Monophony
  Classification | Multiclass multilabel | Many-hot | Sigmoid | Binary cross-entropy | Polyphony
  Multiple Classification | Multi multiclass single label | Multi one-hot | Multi Sigmoid or Multi Softmax | Binary or Categorical cross-entropy | Multivoice

  Ex. multiclass single label: classification among a set of possible notes for a monophonic melody, with only one single possible note choice (single label)
  Ex. multiclass multilabel: classification among a set of possible notes for a single-voice polyphonic melody, therefore with several possible note choices (several labels)
  Ex. multi multiclass single label: multiple classification among a set of possible notes for multivoice monophonic melodies, therefore with only one single possible note choice for each voice; or multiple classification among a set of possible notes for a set of time slices of a monophonic melody, therefore with only one single possible note choice for each time slice
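
As an illustration (assumed Keras idioms, not code from the slides), the table's rows correspond to standard output-layer / loss pairings:

```python
# Standard Keras output-layer / loss pairings for the tasks in the table (illustrative).
from tensorflow.keras import layers

num_notes = 12   # arbitrary size of the note vocabulary

output_configs = {
    'regression':              (layers.Dense(1, activation='linear'),          'mse'),
    'binary classification':   (layers.Dense(1, activation='sigmoid'),         'binary_crossentropy'),
    'multiclass single label': (layers.Dense(num_notes, activation='softmax'), 'categorical_crossentropy'),
    'multiclass multilabel':   (layers.Dense(num_notes, activation='sigmoid'), 'binary_crossentropy'),
}
```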

  28. Output Activation and Cost/Loss Functions

  Task | Type of the output (ŷ) | Encoding of the target (y) | Output activation function | Cost (loss) function | Interpretation
  Regression | Real (ℝ) | – | Identity (Linear) | Mean squared error | none
  Classification | Binary | {0, 1} | Sigmoid | Binary cross-entropy | none
  Classification | Multiclass single label | One-hot | Softmax | Categorical cross-entropy | argmax or sampling
  Classification | Multiclass multilabel | Many-hot | Sigmoid | Binary cross-entropy | argsort and > threshold & max-notes
  Multiple Classification | Multi multiclass single label | Multi one-hot | Multi Sigmoid or Multi Softmax | Binary or Categorical cross-entropy | argmax or sampling (per voice)

  Other cost functions: Mean absolute error, Kullback-Leibler (KL) divergence…
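
An illustrative sketch (example output values assumed) of the interpretation strategies in the last column:

```python
# Interpreting network outputs: argmax, sampling, and thresholded argsort (illustrative values).
import numpy as np

rng = np.random.default_rng(0)

# Multiclass single label: softmax output -> argmax or sampling
softmax_out = np.array([0.1, 0.6, 0.3])
note_argmax = int(np.argmax(softmax_out))                        # deterministic choice
note_sampled = int(rng.choice(len(softmax_out), p=softmax_out))  # stochastic choice

# Multiclass multilabel: sigmoid outputs -> argsort, keep those above a threshold, up to max-notes
sigmoid_out = np.array([0.9, 0.2, 0.7, 0.55, 0.1])
threshold, max_notes = 0.5, 3
ranked = np.argsort(sigmoid_out)[::-1]                           # highest probability first
chord = [int(i) for i in ranked[:max_notes] if sigmoid_out[i] > threshold]

print(note_argmax, note_sampled, chord)                          # e.g. 1, 1, [0, 2, 3]
```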

  29–32. Output Activation and Cost/Loss Functions [Slides 29–32: code/illustration slides on output activations and loss functions; images not included in this transcript]
