SLIDE 1

2018 Fall CTP431: Music and Audio Computing

Automatic Music Generation

Graduate School of Culture Technology, KAIST Juhan Nam

SLIDE 2

Outline

  • Early Approaches
    • Markov Models
    • Recombinant Models
    • Cellular Automata
    • Genetic Algorithm
  • Recent Advances
    • Neural Networks
    • Interactive music generation
SLIDE 3

Symbolic Music

  • Symbolic music is represented as a sequence of notes
SLIDE 4

Symbolic Music

  • Music is structured sequential data

Li et al., "The Clustering of Expressive Timing Within a Phrase in Classical Piano Performances by Gaussian Mixture Models", CMMR, 2015

[Figure: structural levels of music: scale, rhythm, form, harmony; timing hierarchy: measure, beat, tick]

SLIDE 5

Symbolic Music

  • Musical notes are temporally dependent
  • Note-level
  • Beat-level
  • Measure-level
SLIDE 6

Markov Model

  • A random variable X has N states (s1, s2, …, sN) and, at each time step, one of the states is randomly chosen: X_t ∈ {s1, s2, …, sN}
  • The probability distribution for the current state is determined by the previous state(s)
  • The first-order Markov model: P(X_t | X_1, X_2, …, X_{t−1}) = P(X_t | X_{t−1})
  • The second-order Markov model: P(X_t | X_1, X_2, …, X_{t−1}) = P(X_t | X_{t−1}, X_{t−2})
SLIDE 7

Markov Model

  • Example: simple melody generation
  • X_t ∈ {C, D, E}
  • The transition probability matrix (3 by 3):

P(X_t = C | X_{t−1} = C) = 0.7   P(X_t = D | X_{t−1} = C) = 0.1   P(X_t = E | X_{t−1} = C) = 0.2
P(X_t = C | X_{t−1} = D) = 0.2   P(X_t = D | X_{t−1} = D) = 0.6   P(X_t = E | X_{t−1} = D) = 0.2
P(X_t = C | X_{t−1} = E) = 0.3   P(X_t = D | X_{t−1} = E) = 0.1   P(X_t = E | X_{t−1} = E) = 0.6

[State diagram: notes C, D, E with Start and End states]
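Sampling from such a transition matrix is a few lines of code. A minimal sketch, assuming the transition probabilities above (the helper names `next_note` and `generate_melody` are made up for illustration):

```python
import random

# Transition probabilities from the slide: P(next note | current note)
TRANSITIONS = {
    "C": {"C": 0.7, "D": 0.1, "E": 0.2},
    "D": {"C": 0.2, "D": 0.6, "E": 0.2},
    "E": {"C": 0.3, "D": 0.1, "E": 0.6},
}

def next_note(current, rng):
    """Sample the next note from the first-order transition distribution."""
    notes = list(TRANSITIONS[current])
    weights = [TRANSITIONS[current][n] for n in notes]
    return rng.choices(notes, weights=weights, k=1)[0]

def generate_melody(start="C", length=16, seed=0):
    rng = random.Random(seed)
    melody = [start]
    while len(melody) < length:
        melody.append(next_note(melody[-1], rng))
    return melody

print(generate_melody())
```

Because each note depends only on the previous one, the output tends to dwell on the notes with high self-transition probability (0.7 for C, 0.6 for D and E).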

SLIDE 8

Markov Model

  • The transition matrix can be learned from data
  • Dancing Markov Gymnopédies: https://codepen.io/teropa/pen/bRqYVj/
  • Generated music
  • Learned with Satie's "Gymnopédies" and "Trois Gnossiennes"
  • https://www.youtube.com/watch?v=H3xgdDTvvlc
  • Learned with Bach's "Toccata and Fugue in D minor" (BWV 565)
  • https://www.youtube.com/watch?v=lIOiAK0x4vA
SLIDE 9

Example: Illiac Suite

  • The first computer-generated composition (1956)
  • Lejaren Hiller and Leonard Isaacson
  • They used Markov models of variable order to select notes with different lengths
  • Music
  • https://www.youtube.com/watch?v=n0njBFLQSk8&list=PLIVblwUBdcStsNpl0v4OCbC5k-mIDcyaR

SLIDE 10

Recombinant Music

  • Musical Dice Game
  • Generate from pre-composed small pieces by random draws
  • The table of measures preserves musical "style"

https://imslp.org/wiki/Musikalisches_W%C3%BCrfelspiel,_K.516f_(Mozart,_Wolfgang_Amadeus)

Mozart K. 516F

11^16 = 45,949,729,863,572,161 variations
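The dice game is easy to simulate. A sketch, assuming the usual reading of K. 516f: 16 positions in the piece, each decided by the sum of two dice (11 possible outcomes, 2 through 12) that indexes a table of pre-composed measures. The measure table itself is omitted here, so the code only records which table entry would be looked up:

```python
import random

NUM_POSITIONS = 16          # measures in the assembled waltz
OUTCOMES_PER_POSITION = 11  # two-dice sums: 2..12

def roll_waltz(seed=None):
    """Draw one variation: for each position, throw two dice and note
    which row of the measure table the sum would select."""
    rng = random.Random(seed)
    waltz = []
    for position in range(NUM_POSITIONS):
        dice_sum = rng.randint(1, 6) + rng.randint(1, 6)  # 2..12
        waltz.append((position, dice_sum))  # look up the measure table here
    return waltz

# Each position has 11 possible measures, so the number of distinct
# variations is 11^16 (the dice sums are not uniformly distributed,
# but every sequence is possible):
print(OUTCOMES_PER_POSITION ** NUM_POSITIONS)  # 45949729863572161
```

This reproduces the count on the slide: 11^16 = 45,949,729,863,572,161.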

SLIDE 11

Recombinant Music

  • David Cope’s Experiments in Musical Intelligence (EMI)
  • Segment and reassemble existing pieces of music by pattern matching
  • Create a new piece of music that preserves the style of the original

Augmented Transition Networks (David Cope)

SLIDE 12

Infinite Jukebox

  • Music mash-up using beat-level self-similarity within a song

http://infinitejukebox.playlistmachinery.com/

SLIDE 13

“In C”

  • Terry Riley's ensemble piece
  • Also called "minimal music"

Source: https://nmbx.newmusicusa.org/terry-rileys-in-c/

"In C" by Terry Riley: Instructions for beginners

1. Any number of people can play this piece on any instrument or instruments (including voice).
2. The piece consists of 53 melodic patterns to be repeated any amount of times. You can choose to start a new pattern at any point. The choice is up to the individual performer! We suggest beginners are very familiar with patterns 1-12.
3. Performers move through the melodic patterns in order and cannot go back to an earlier pattern. Players should try to stay within 2-3 patterns of each other.
4. If any pattern is too technically difficult, feel free to move to the next one.
5. The eighth note pulse is constant. Always listen for this pulse. The pulse for our experience will be piano and Orff instruments being played on the stage.
6. The piece works best when all the players are listening very carefully. Sometimes it is better to just listen and not play. It is important to fit into the group sound and understand how what you decide to play affects everybody around you. If you play softly, other players might follow you and play soft. If you play loud, you might influence other players to play loud.
7. The piece ends when the group decides it ends. When you reach the final pattern, repeat it until the entire group arrives on this figure. Once everyone has arrived, let the music slowly die away.

Source: https://www.musicinst.org/sites/default/files/attachments/In%20C%20Instructions%20for%20Beginners.pdf

https://www.youtube.com/results?search_query=Terry+Riley+In+C

SLIDE 14

Cellular Automata

  • A cell-based state evolution model
  • Determines the state of each cell using neighbors and a rule set
  • A Wolfram model example: "Rule 90"
  • Related to self-replicating patterns in biology

Source: https://natureofcode.com/book/chapter-7-cellular-automata/
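Rule 90 is particularly compact: each new cell is simply the XOR of its left and right neighbors. A minimal sketch, assuming dead cells beyond the boundary (the function names are illustrative):

```python
def rule90_step(cells):
    """One step of Wolfram's Rule 90: each new cell is the XOR of its
    left and right neighbors (cells outside the row are treated as dead)."""
    padded = [0] + list(cells) + [0]
    return [padded[i - 1] ^ padded[i + 1] for i in range(1, len(padded) - 1)]

def run(width=9, steps=4):
    row = [0] * width
    row[width // 2] = 1  # single live cell in the middle
    history = [row]
    for _ in range(steps):
        row = rule90_step(row)
        history.append(row)
    return history

# Starting from a single live cell, Rule 90 traces out the
# Sierpinski-triangle pattern shown on the slide:
for row in run():
    print("".join("#" if c else "." for c in row))
```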

SLIDE 15

Conway’s Game of Life

  • A 2D cellular automaton
  • Rules of life:
  • Death (1→0): overpopulation (4 or more live neighbors) or loneliness (1 or fewer)
  • Birth (0→1): exactly 3 live neighbors
  • Otherwise, the cell stays in the same state
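The rules above can be sketched as a grid-update function (the non-wrapping boundary is an assumption; the demos below may use wrap-around grids):

```python
def life_step(grid):
    """One step of Conway's Game of Life on a non-wrapping grid.
    Death: >= 4 or <= 1 live neighbors; birth: exactly 3."""
    rows, cols = len(grid), len(grid[0])

    def live_neighbors(r, c):
        return sum(
            grid[rr][cc]
            for rr in range(max(0, r - 1), min(rows, r + 2))
            for cc in range(max(0, c - 1), min(cols, c + 2))
            if (rr, cc) != (r, c)
        )

    new = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            n = live_neighbors(r, c)
            if grid[r][c] == 1:
                new[r][c] = 1 if n in (2, 3) else 0  # else loneliness/overpopulation
            else:
                new[r][c] = 1 if n == 3 else 0       # birth
    return new

# A vertical "blinker" flips to horizontal, then back:
blinker = [[0, 1, 0],
           [0, 1, 0],
           [0, 1, 0]]
print(life_step(blinker))  # [[0, 0, 0], [1, 1, 1], [0, 0, 0]]
```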

  • Demos:
  • http://www.cappel-nord.de/webaudio/conways-melodies/
  • http://nexusosc.com/gameofreich/
  • http://blipsoflife.herokuapp.com/

Source: https://natureofcode.com/book/chapter-7-cellular-automata/

SLIDE 16

WolframTones

  • Automatic music generation system based on cellular automata
  • Demo: http://tones.wolfram.com/generate

Mapping to musical notes by rules

SLIDE 17

Statistical Models

  • As mentioned earlier, music is highly structured sequential data. Thus, we can model the sequence using an auto-regressive model:

P(x_t | x_1, …, x_{t−1})

  • In the first-order Markov model, this was simplified to P(x_t | x_{t−1})
  • However, it explains only short-term relations among notes
  • Can we model the long-term relations using a more complicated model?

x_t: note features for the sequence x_1, x_2, x_3, x_4, …

SLIDE 18

Toy Example

3 + 5 = 18
4 + 4 = 20
6 + 7 = 48
8 + 9 = 80
9 + 10 = ?

Note that “+” is not addition here

SLIDE 19

Toy Example

3 + 5 = 18
4 + 4 = 20
6 + 7 = 48
8 + 9 = 80
9 + 10 = ?

Note that "+" is not addition here

y = f(x1, x2),  y = x1 × (x2 + 1)
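With the rule revealed, the hidden "operation" can be checked directly against every example on the slide (the function name `mystery_plus` is made up for illustration):

```python
# The hidden rule from the slide: y = x1 * (x2 + 1)
def mystery_plus(x1, x2):
    return x1 * (x2 + 1)

# Verify against the worked examples:
for x1, x2, y in [(3, 5, 18), (4, 4, 20), (6, 7, 48), (8, 9, 80)]:
    assert mystery_plus(x1, x2) == y

print(mystery_plus(9, 10))  # the "?" case: 99
```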

SLIDE 20

Toy Example

2 + 2 = 6
3 + 6 = 12
4 + 5 = 19
6 + 10 = 40
7 + 18 = ?

Note that “+” is not addition here

SLIDE 21

Toy Example

y = f(x1, x2),  y = x1 + x2 + x1·x2

2 + 2 = 6
3 + 6 = 12
4 + 5 = 19
6 + 10 = 40
7 + 18 = ?

Note that “+” is not addition here

SLIDE 22

Neural Network

  • A learning model based on multi-layered networks
  • The basic model (MLP) is composed of linear transforms and element-wise nonlinear functions

h(1) = f(1)(W(1) x + b(1))
h(2) = f(2)(W(2) h(1) + b(2))
h(3) = f(3)(W(3) h(2) + b(3))
y = g(h(3))

[Diagram: x → h(1) → h(2) → h(3) → y, with weights W(1), W(2), W(3), W(4)]

Multi-Layer Perceptron (MLP); f(k) are the non-linear functions
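The forward equations h(k) = f(k)(W(k) h(k−1) + b(k)) can be sketched in a few lines. This is a minimal illustration, not a training-ready implementation: the weights and layer sizes are made up, and tanh stands in for every generic nonlinearity f(k):

```python
import math

def matvec(W, x):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def mlp_forward(x, layers):
    """Forward pass of an MLP: h(k) = tanh(W(k) h(k-1) + b(k)).
    `layers` is a list of (W, b) pairs."""
    h = x
    for W, b in layers:
        z = [zi + bi for zi, bi in zip(matvec(W, h), b)]
        h = [math.tanh(v) for v in z]  # element-wise nonlinearity
    return h

# A tiny 2-3-1 network with fixed, made-up weights:
layers = [
    ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1]),  # W(1), b(1)
    ([[0.7, -0.5, 0.2]], [0.05]),                                # W(2), b(2)
]
print(mlp_forward([1.0, 2.0], layers))
```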
SLIDE 23

Neural Network

  • The neural network is trained via error back-propagation

Forward computation: x → h(1) → h(2) → h(3) → y, giving the loss L(W)
Backward computation: the gradients ∂L/∂W(1), …, ∂L/∂W(4)

Gradient descent:

W_ij(new) = W_ij(old) − η · ∂L(W(old)) / ∂W_ij(old)
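The update rule can be illustrated on a one-parameter toy problem (the quadratic loss here is an assumption chosen so the gradient is easy to write by hand):

```python
# Gradient descent on the toy loss L(w) = (w - 3)^2, whose gradient is
# dL/dw = 2 * (w - 3). Each step applies the rule from the slide:
# w_new = w_old - eta * dL/dw.
def gradient_descent(w=0.0, eta=0.1, steps=100):
    for _ in range(steps):
        grad = 2.0 * (w - 3.0)
        w = w - eta * grad
    return w

print(gradient_descent())  # approaches the minimum at w = 3
```

Back-propagation is just the efficient way of computing these gradients layer by layer in a multi-layer network; the update itself is the same.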

SLIDE 24

MLP Demo and visualization

  • https://playground.tensorflow.org
SLIDE 25

The Toy Example

  • The neural network can learn highly complicated relations between input and output

[Diagram: the input pair (x1, x2) is fed through the network to predict y; training data: x1 = 2, 3, 4, 6, …; x2 = 2, 6, 5, 10, …; y = 6, 12, 19, 40, …]

W(new) ← W(old) − η · ∂L/∂W

SLIDE 26

The Toy Example

  • The neural network can learn highly complicated relations between input and output

[Diagram: the trained network answers the toy query with 53.9999…, i.e. approximately the correct value]

SLIDE 27

Deep Neural Network

  • Use “deep” layers
  • Many parameters to explain the data distribution
  • Need more data and fast computation (e.g. GPU)
  • Many efficient training techniques

[Diagram: a deep network x → h(1) → … → h(L) → y with many hidden layers]

SLIDE 28

Deep Neural Network

  • Universal model regardless of the domain (image, audio, text, …)

[Diagram: the same deep architecture maps an image to "motor-bike", speech to "I love coffee", and Korean text to "오늘 남북정상이 만나…" ("Today, the leaders of the two Koreas meet…")]

SLIDE 29

Deep Neural Network

  • Thus, we can apply the model to music!
  • However, we need to handle long sequences and variable lengths

[Diagram: a deep network predicting the next note x_t from the previous notes x_1, …, x_{t−1} of the sequence x_1, x_2, x_3, x_4, …]

SLIDE 30

Recurrent Neural Networks (RNN)

  • Sequence-to-sequence modeling

[Diagram: an RNN unrolled over time: at each step the network receives the current note x_t and outputs a prediction P(x_{t+1}); inputs x_1, x_2, x_3, …, x_{T−1} produce outputs P(x_2), P(x_3), P(x_4), …, P(x_T)]
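The key idea is that the same weights are reused at every time step, with a hidden state carrying context forward. A minimal sketch of the recurrence h_t = tanh(Wxh·x_t + Whh·h_{t−1} + b) with made-up scalar inputs and a 2-dimensional hidden state (a real model would add an output layer predicting P(x_{t+1})):

```python
import math

def rnn_forward(xs, Wxh, Whh, bh):
    """Unroll a simple RNN over the input sequence `xs`, returning the
    hidden state at every time step. The same parameters (Wxh, Whh, bh)
    are applied at each step."""
    h = [0.0, 0.0]  # initial hidden state
    states = []
    for x in xs:
        h = [
            math.tanh(Wxh[i] * x + sum(Whh[i][j] * h[j] for j in range(2)) + bh[i])
            for i in range(2)
        ]
        states.append(h)
    return states

# Made-up weights, just to show the shapes involved:
states = rnn_forward(
    [0.1, 0.5, -0.2],
    Wxh=[0.8, -0.3],
    Whh=[[0.1, 0.4], [-0.2, 0.3]],
    bh=[0.0, 0.1],
)
print(states[-1])  # final hidden state summarizes the whole sequence
```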

SLIDE 31

Examples

  • FolkRNN
  • https://folkrnn.org/
  • DeepBach
  • http://www.flow-machines.com/archives/deepbach-polyphonic-music-generation-bach-chorales/

  • DeepJazz
  • https://deepjazz.io/
  • PerformanceRNN
  • https://magenta.tensorflow.org/performance-rnn
SLIDE 32
Auto-Encoder

  • Neural networks configured to reconstruct the input
  • The latent vector contains compressed information of the input
  • The decoder can be used to generate data: the Variational Auto-Encoder (VAE) is more often used

[Diagram: Encoder → Latent Vector → Decoder; input x, reconstruction x̂]

Train to minimize the reconstruction error: L(θ; x) = ‖x − x̂‖²
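The reconstruction objective can be sketched with purely linear encoder/decoder maps. All weights below are made up: the encoder keeps only the first two coordinates of a 4-dimensional input, so the dropped coordinates are exactly what the loss ‖x − x̂‖² measures:

```python
def matvec(W, x):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def reconstruction_error(x, W_enc, W_dec):
    """Encode x to a smaller latent vector, decode it back, and return
    the squared reconstruction error ||x - x_hat||^2 that training minimizes."""
    z = matvec(W_enc, x)      # encoder: 4 -> 2 latent dimensions
    x_hat = matvec(W_dec, z)  # decoder: 2 -> 4
    return sum((xi - xh) ** 2 for xi, xh in zip(x, x_hat))

# Toy weights: keep the first two coordinates, discard the rest.
W_enc = [[1, 0, 0, 0],
         [0, 1, 0, 0]]
W_dec = [[1, 0], [0, 1], [0, 0], [0, 0]]

x = [1.0, 2.0, 0.5, -0.5]
print(reconstruction_error(x, W_enc, W_dec))  # 0.5^2 + (-0.5)^2 = 0.5
```

A trained auto-encoder learns encoder/decoder weights (and nonlinearities) that minimize this error over the whole dataset, forcing the latent vector to keep the most informative directions.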

SLIDE 33
Generation Examples

  • Interpolation from the latent space

Variational Autoencoders: Generating Data! (slides by Fei-Fei Li, Justin Johnson & Serena Yeung, Lecture 12, May 15, 2018)

  • Use the decoder network: sample z from the prior, then sample x|z from the decoder
  • Varying z1 and z2 changes interpretable factors such as degree of smile and head pose
  • A diagonal prior on z ⇒ independent latent variables; different dimensions of z encode interpretable factors of variation

Kingma and Welling, "Auto-Encoding Variational Bayes", ICLR 2014

SLIDE 34

Google Magenta Project

  • https://magenta.tensorflow.org/
SLIDE 35

Interactive Music Generation

  • Interactive composition/performance
  • http://eclipticalis.com/
  • https://junshern.github.io/algorithmic-music-tutorial/
  • http://teropa.info/blog/2017/01/23/terry-rileys-in-c.html
  • https://incredible-spinners.glitch.me/
  • Games
  • https://techbelly.github.io/game-soundtrack/webaudio/
  • http://musiccanbefun.edankwan.com/
  • Educational
  • https://learningmusic.ableton.com/