  1. Deep Learning Techniques for Music Generation 3. Generation by Feedforward Architectures Jean-Pierre Briot Jean-Pierre.Briot@lip6.fr Laboratoire d’Informatique de Paris 6 (LIP6) Sorbonne Université – CNRS Programa de Pós-Graduação em Informática (PPGI) UNIRIO Deep Learning – Music Generation – 2019 Jean-Pierre Briot

  2. Direct Use – Feedforward – Ex 1 • Feedforward Architecture • Prediction Task • Ex 1: predicting the chord associated with a melody segment (scale/mode -> tonality) [Figure: the pitches of a 4-note melody (1st to 4th) as input; the pitch of a chord as output] – Training on a corpus/dataset of <melody, chord> pairs – Production (prediction)

  3. Direct Use – Feedforward – Ex 1 • Feedforward Architecture • Classification Task • Ex 1: predicting the chord associated with a melody segment (scale/mode -> tonality) [Figure: the pitches of a 4-note melody (1st to 4th) as input; the chord's pitch class (A, A#, B, …, G#) as output] – Training on a corpus/dataset of <melody, chord> pairs – Production (classification)
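The chord-classification setup of these two slides can be sketched as a single forward pass ending in a softmax over the 12 pitch classes. This is a minimal illustration, not the author's actual network: the 4-pitch melody input, the normalization, and the random weights standing in for trained parameters are all assumptions.

```python
import numpy as np

PITCH_CLASSES = ["A", "A#", "B", "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#"]

def softmax(z):
    # Subtract the max for numerical stability before exponentiating.
    e = np.exp(z - z.max())
    return e / e.sum()

def predict_chord(melody_pitches, W, b):
    """One forward pass of a minimal feedforward classifier:
    melody pitches -> logits -> softmax -> predicted pitch class."""
    logits = W @ melody_pitches + b
    probs = softmax(logits)
    return PITCH_CLASSES[int(np.argmax(probs))], probs

# Toy input: 4 melody pitches (MIDI numbers, crudely normalized); random
# weights stand in for a trained network.
rng = np.random.default_rng(0)
melody = np.array([60, 62, 64, 65]) / 127.0   # C, D, E, F
W, b = rng.normal(size=(12, 4)), np.zeros(12)
chord, probs = predict_chord(melody, W, b)
print(chord)          # one of the 12 pitch classes; probs sums to 1
```

Training on a `<melody, chord>` corpus would then fit `W` and `b` by minimizing categorical cross-entropy, as later slides explain.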

  4. Softmax [Figure: logits transformed into probabilities]

  5. Softmax and Sigmoid • Softmax is the generalization of Sigmoid • From binary classification to categorical (multiclass) classification • Sigmoid: probability ∈ [0, 1] • Softmax: probabilities summing to 1 [Figure: logits -> probabilities]

  6. Softmax and Sigmoid • Step function (Perceptron) and Argmax are NOT differentiable • No gradient -> no possibility of backpropagation • Step function: probability ∈ {0, 1} • Argmax: probability(argmax) = 1 [Figure: logits -> probabilities]
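The sigmoid/softmax relation on these slides, and the contrast with the non-differentiable argmax, can be checked numerically. A self-contained NumPy sketch (not code from the slides):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()

# Sigmoid on one logit equals softmax over the two logits [z, 0]:
# e^z / (e^z + e^0) = 1 / (1 + e^-z).
z = 1.3
two_class = softmax(np.array([z, 0.0]))
print(sigmoid(z), two_class[0])    # identical values

# Softmax outputs a smooth, differentiable distribution; argmax collapses
# it to a hard one-hot choice, so no gradient can flow through it.
probs = softmax(np.array([2.0, 1.0, 0.1]))
hard = np.eye(3)[np.argmax(probs)]   # [1., 0., 0.]
```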

  7. Representation • Audio – Waveform – Spectrogram (Fourier transform) – Other (ex: MFCC) • Symbolic – Note – Rest – Note hold – Duration – Chord – Rhythm – Piano roll – MIDI – ABC, XML…

  8. Representation [Figure: a score and its piano-roll representation (pitches G to C), with one-hot encoding of each time slice]

  9. Encoding of Features (ex: Note Pitch) • Value – Analog • One-hot – Digital • Embedding – Constructed

  10. Encoding • Rest – Zero-hot » But ambiguity with low-probability notes – One more one-hot element – … • Hold – One more one-hot element » But only for monophonic melodies – Replay matrix – …

  11. Representation [Figure: one-hot melody encoding matrix with extra 'hold' and 'rest' rows; an all-zero time slice is ambiguous (marked '?')]

  12. Representation [Figure: the same encoding matrix with 'hold' and 'rest' rows, assuming time slice = sixteenth note]
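A sketch of the extended encoding from these slides, with explicit 'rest' and 'hold' symbols added to the one-hot vocabulary to avoid the zero-hot ambiguity (the small vocabulary and the four-slice melody are invented for illustration; monophonic melodies only, one time slice per sixteenth note):

```python
import numpy as np

# One-hot vocabulary extended with explicit 'rest' and 'hold' symbols.
VOCAB = ["A", "A#", "B", "C", "rest", "hold"]

def encode(symbols):
    """Encode a monophonic melody (one symbol per sixteenth-note slice)
    as a matrix with exactly one 1 per time slice."""
    m = np.zeros((len(VOCAB), len(symbols)))
    for t, s in enumerate(symbols):
        m[VOCAB.index(s), t] = 1.0
    return m

# An eighth-note A (A followed by hold), a sixteenth rest, a sixteenth C:
m = encode(["A", "hold", "rest", "C"])
```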

  13. Music / Representation / Network [Figure: feedforward network with the soprano voice as input layer, hidden layers, and the alto, tenor, and bass voices as output layer]

  14. Code • Python (3) • Keras • Theano or TensorFlow • Music21

  15. Direct Use – Feedforward – Ex 2: ForwardBach • Feedforward Architecture • Prediction Task • Ex 2: counterpoint (chorale) generation • Training on the set of (389) J.S. Bach chorales (Choral Gesang) [Figure: a melody as input; 3 melodies as output]

  16. ForwardBach • Bach BWV 344 chorale (training example) [Figure: original vs. regenerated scores]

  17. ForwardBach • Bach BWV 423 chorale (test example) [Figure: original vs. regenerated scores]

  18. Music / Representation / Network • Alternative: 3-models architecture [Cotrim & Briot, 2019] [Figure: the soprano/alto/tenor/bass voices as before, now handled by three separate feedforward models, one per output voice]

  19. Forward3Bach [Cotrim & Briot, 2019] • Bach BWV 423 chorale (test example) [Figure: original, single-architecture regeneration, and triple-architecture regeneration]

  20. Comparison? What happened?

  21. Overfitting Limitations • Musical accuracy is not that good (yet) • Regeneration of a training example is better than regeneration of a test/validation example • A case of overfitting

  22. Techniques • Limit (training) accuracy and control overfitting • More examples (augment the corpus) – Keeping a good style representation, coverage, and consistency – More consistency and coverage » Transpose (align) all chorales to only one key (ex: C) • More synthetic examples – More coverage » Transpose all chorales into all keys (12) • Regularization – Weight-based » L1, L2 – Connection-based » Dropout – Epoch-based » Early stopping – Analysis of learning curves
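Two of the regularization techniques listed above, an L2 weight penalty and early stopping on a validation loss, can be sketched on a synthetic toy problem. This is only an illustration of the principle with plain gradient descent on a linear model; the data, hyperparameters, and model are all assumptions, not the slides' Keras networks.

```python
import numpy as np

# Synthetic regression data split into training and validation sets.
rng = np.random.default_rng(1)
X_train, X_val = rng.normal(size=(80, 5)), rng.normal(size=(20, 5))
true_w = np.array([1.0, -2.0, 0.5, 0.0, 3.0])
y_train = X_train @ true_w + rng.normal(scale=0.1, size=80)
y_val = X_val @ true_w + rng.normal(scale=0.1, size=20)

def mse(w, X, y):
    return np.mean((X @ w - y) ** 2)

w = np.zeros(5)
lam, lr = 1e-3, 0.01                     # L2 strength and learning rate
best_val, best_w, patience, bad = np.inf, w.copy(), 10, 0
for epoch in range(1000):
    # Gradient of MSE plus the L2 penalty lam * ||w||^2 (weight decay).
    grad = 2 * X_train.T @ (X_train @ w - y_train) / len(y_train) + 2 * lam * w
    w = w - lr * grad
    val = mse(w, X_val, y_val)
    if val < best_val - 1e-6:            # validation loss still improving
        best_val, best_w, bad = val, w.copy(), 0
    else:                                # early stop after `patience` bad epochs
        bad += 1
        if bad >= patience:
            break
print(best_val)                          # far below the untrained loss
```

Monitoring the training vs. validation curves, as the slide's last bullet suggests, is what tells you where the early stop should land.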

  23. Softmax [Figure: logits transformed into probabilities (repeated from slide 4)]

  24. Softmax + Cross-Entropy • Cross-entropy measures the dissimilarity between two probability distributions (the prediction and the target/true value) [Ng 2019]
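The cross-entropy of the slide, H(target, prediction) = -Σ_i target_i · log(prediction_i), is easy to compute directly; a minimal NumPy sketch with invented toy distributions:

```python
import numpy as np

def categorical_cross_entropy(target, predicted, eps=1e-12):
    """Dissimilarity between the target distribution and the prediction:
    H(target, predicted) = -sum_i target_i * log(predicted_i).
    eps guards against log(0)."""
    return -np.sum(target * np.log(predicted + eps))

# One-hot target (the true note) vs two softmax outputs:
target = np.array([0.0, 1.0, 0.0])
good = np.array([0.1, 0.8, 0.1])    # confident and correct -> low loss
bad = np.array([0.7, 0.2, 0.1])     # confident and wrong -> high loss
print(categorical_cross_entropy(target, good))   # ~0.22 (= -log 0.8)
print(categorical_cross_entropy(target, bad))    # ~1.61 (= -log 0.2)
```

With a one-hot target the sum collapses to -log of the probability assigned to the true class, which is why training pushes that probability toward 1.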

  25. Output Activation and Cost/Loss Functions

  Task                    | Type of the output (ŷ)        | Encoding of the target (y) | Output activation function | Cost (loss) function      | Application
  Regression              | Real (ℝ)                      | Real value                 | Identity (linear)          | Mean squared error        | –
  Classification          | Binary                        | {0, 1}                     | Sigmoid                    | Binary cross-entropy      | –
  Classification          | Multiclass single label       | One-hot                    | Softmax                    | Categorical cross-entropy | Monophony
  Classification          | Multiclass multilabel         | Many-hot                   | Sigmoid                    | Binary cross-entropy      | Polyphony
  Multiple classification | Multi multiclass multilabel   | Multi many-hot             | Multi sigmoid              | Binary cross-entropy      | Multivoice
  Multiple classification | Multi multiclass single label | Multi one-hot              | Multi softmax              | Categorical cross-entropy | Multivoice

  Ex. multiclass single label: classification among a set of possible notes for a monophonic melody, with only one single possible note choice (single label).
  Ex. multiclass multilabel: classification among a set of possible notes for a single-voice polyphonic melody, therefore with several possible note choices (several labels).
  Ex. multi multiclass single label: multiple classification among a set of possible notes for multivoice monophonic melodies, therefore with only one single possible note choice for each voice; or multiple classification among a set of possible notes for a set of time slices of a monophonic melody, therefore with only one single possible note choice for each time slice.

  26. Output Activation and Cost/Loss Functions

  Task                    | Type of the output (ŷ)        | Encoding of the target (y) | Output activation function | Cost (loss) function      | Interpretation
  Regression              | Real (ℝ)                      | Real value                 | Identity (linear)          | Mean squared error        | None
  Classification          | Binary                        | {0, 1}                     | Sigmoid                    | Binary cross-entropy      | None
  Classification          | Multiclass single label       | One-hot                    | Softmax                    | Categorical cross-entropy | Argmax or sampling
  Classification          | Multiclass multilabel         | Many-hot                   | Sigmoid                    | Binary cross-entropy      | Argsort and > threshold & max-notes
  Multiple classification | Multi multiclass multilabel   | Multi many-hot             | Multi sigmoid              | Binary cross-entropy      | Argsort and > threshold & max-notes
  Multiple classification | Multi multiclass single label | Multi one-hot              | Multi softmax              | Categorical cross-entropy | Argmax or sampling

  Other cost functions: mean absolute error, Kullback-Leibler (KL) divergence…
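The 'Interpretation' column above (argmax or sampling for single-label softmax outputs; argsort with a probability threshold and a max-notes cap for multilabel sigmoid outputs) can be sketched concretely; all numbers below are invented toy outputs:

```python
import numpy as np

rng = np.random.default_rng(0)
probs = np.array([0.05, 0.6, 0.1, 0.25])   # softmax output over 4 notes

# Multiclass single label: take the most probable note, or sample from the
# distribution (sampling introduces variability into the generation).
note_argmax = int(np.argmax(probs))                 # -> 1
note_sampled = int(rng.choice(len(probs), p=probs))

# Multiclass multilabel (many-hot, sigmoid outputs): keep every note whose
# probability exceeds a threshold, capped at `max_notes` notes.
sigmoid_out = np.array([0.9, 0.2, 0.7, 0.4])
threshold, max_notes = 0.5, 2
candidates = np.argsort(sigmoid_out)[::-1]          # argsort, best first
chord = sorted(int(i) for i in candidates[:max_notes]
               if sigmoid_out[i] > threshold)
print(note_argmax, chord)                           # 1 [0, 2]
```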

  27. Output Activation and Cost/Loss Functions [Figure only]

  28. Output Activation and Cost/Loss Functions [Figure only]

  29. Output Activation and Cost/Loss Functions [Figure only]

  30. Output Activation and Cost/Loss Functions [Figure only]

  31. (Summary of) Principles of Loss Functions • Probability theory + information theory (see also the maximum likelihood principle) • Intuition: the information content of a likely event is low; the information content of an unlikely event is high • Self-information: I(x) = log(1/P(x)) = -log P(x) • Ex: I(note=B) = -log P(note=B) • Entropy of a probability distribution: the sum of the I(note=Note_i), weighted by P(note=Note_i): H(note) = Σ_i P(note=Note_i) · I(note=Note_i) • Expectation-based alternative definition, where the expectation (mean value of f(x) when x ~ P) is E_{x~P}[f(x)] = Σ_x P(x) f(x): H(note) = E_{note~P}[I(note)] = E_{note~P}[-log P(note)] = -E_{note~P}[log P(note)]
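These definitions compute directly; a small NumPy sketch over an invented four-note distribution, checking that self-information is low for likely events and that the uniform distribution has maximal entropy:

```python
import numpy as np

def self_information(p):
    """I(x) = -log P(x): low for likely events, high for unlikely ones."""
    return -np.log(p)

# An invented probability distribution over four notes (A is very likely):
P = np.array([0.7, 0.1, 0.1, 0.1])

# Entropy as the expectation of self-information under P:
# H = sum_i P_i * I(P_i) = -sum_i P_i * log(P_i)
H = np.sum(P * self_information(P))

# The uniform distribution maximizes entropy (here H_uniform = log 4):
H_uniform = np.sum(0.25 * self_information(np.full(4, 0.25)))
print(H, H_uniform)    # H < H_uniform
```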
