MuseGAN: Multi-track Sequential Generative Adversarial Networks for Symbolic Music Generation and Accompaniment



SLIDE 1

MuseGAN: Multi-track Sequential Generative Adversarial Networks for Symbolic Music Generation and Accompaniment

Hao-Wen Dong*, Wen-Yi Hsiao*, Li-Chia Yang, Yi-Hsuan Yang

Research Center of IT Innovation, Academia Sinica
Demo Page: https://salu133445.github.io/musegan/

*These authors contributed equally to this work.

SLIDE 2

Outline

• Goals & Challenges
• Data
• Proposed Model
• Results & Evaluation
• Future Work

Source Code: https://github.com/salu133445/musegan
Demo Page: https://salu133445.github.io/musegan/

SLIDE 3

Goals

Generate pop music
• of multiple tracks
• in piano-roll format
• using GAN with CNNs

[Source Code] https://github.com/salu133445/musegan
[Demo Page] https://salu133445.github.io/musegan/

SLIDE 4

Challenge I

Multitrack Interdependency

[Figure: example clip with vocal, piano, bass, drums and strings tracks; music & clip by phycause]

Approach: Multi-track GAN

SLIDE 5

Challenge II

Music Texture

[Figure labels: melody, chord (harmony)]

Approach: Convolutional Neural Networks

SLIDE 6

Challenge III

Temporal Structure

Hierarchy (4/4 time):
song
  paragraph 1, paragraph 2, paragraph 3
    phrase 1, phrase 2, phrase 3, phrase 4
      bar 1, bar 2, bar 3, bar 4 (shown for phrase 2)
        beat 1, beat 2, beat 3, beat 4
          step 1, step 2, ···, step 24

SLIDE 7

Challenge III

Temporal Structure

Fixed Structure (4/4 time):
phrase 2
  bar 1, bar 2, bar 3, bar 4
    beat 1, beat 2, beat 3, beat 4
      step 1, step 2, ···, step 24

Approach: Convolutional Neural Networks
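
The fixed structure above implies a simple index arithmetic: 24 steps per beat and 4 beats per bar give 96 steps per bar, and 4 bars give 384 steps per phrase. The sketch below only illustrates that arithmetic; the constant names and the `locate` helper are hypothetical and not taken from the MuseGAN code.

```python
# Temporal resolution as read off the slide (4/4 time).
STEPS_PER_BEAT = 24   # "step 1 ... step 24" under each beat
BEATS_PER_BAR = 4     # 4/4 time
BARS_PER_PHRASE = 4   # "bar 1 ... bar 4" under each phrase

STEPS_PER_BAR = STEPS_PER_BEAT * BEATS_PER_BAR        # 24 * 4 = 96
STEPS_PER_PHRASE = STEPS_PER_BAR * BARS_PER_PHRASE    # 96 * 4 = 384


def locate(step):
    """Map a 0-based step index within a phrase to 0-based (bar, beat, sub_step).

    Hypothetical helper, for illustration only.
    """
    if not 0 <= step < STEPS_PER_PHRASE:
        raise ValueError(f"step must be in [0, {STEPS_PER_PHRASE})")
    bar, rest = divmod(step, STEPS_PER_BAR)
    beat, sub_step = divmod(rest, STEPS_PER_BEAT)
    return bar, beat, sub_step


print(STEPS_PER_BAR, STEPS_PER_PHRASE)
print(locate(100))  # a step shortly after the start of the second bar
```

Because every bar has the same length, a phrase can be stored as a fixed-size array, which is what makes a convolutional model a natural fit here.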

SLIDE 8

Data Representation

Piano-roll
polyphonic, multi-track (with symbolic timing)

[Figure: piano-roll covering Bar 1 to Bar 4; axes: time (in time steps) and pitch]

SLIDE 9

Data Representation

Piano-roll
polyphonic, multi-track (with symbolic timing)

[Figure: piano-roll covering Bar 1 to Bar 4; axes: time and pitch; example note A3 held from t0 to t1]
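
As a minimal sketch of the piano-roll idea, a single track can be stored as a boolean time-by-pitch matrix in which a note is a run of `True` cells in one pitch column. The 128-pitch range, the MIDI number 57 for A3 (assuming C4 = 60), the chosen values of t0 and t1, and the `add_note` helper are all illustrative assumptions, not details from the slides.

```python
import numpy as np

N_STEPS = 96      # one bar at 24 steps per beat in 4/4 time
N_PITCHES = 128   # assumed: the full MIDI pitch range
A3 = 57           # assumed: MIDI note number of A3 with C4 = 60


def add_note(roll, pitch, t0, t1):
    """Mark `pitch` as sounding on steps t0 (inclusive) to t1 (exclusive)."""
    roll[t0:t1, pitch] = True


roll = np.zeros((N_STEPS, N_PITCHES), dtype=bool)
add_note(roll, A3, t0=24, t1=48)        # A3 held for the second beat
add_note(roll, A3 + 4, t0=24, t1=48)    # C#4 at the same time: polyphony

print(roll.shape)                    # (time steps, pitches)
print(np.flatnonzero(roll[30]))      # pitches sounding at step 30
```

Several notes can be active in the same row, which is why the representation handles polyphonic music without any extra machinery.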

SLIDE 10

Data Representation

Multi-track Piano-roll
polyphonic, multi-track (with symbolic timing)

[Figure: stacked piano-rolls; axes: time, pitch and tracks]

SLIDE 11

Data Representation

A 4×96×84×5 tensor:
• 4 bars
• 96 time steps (per bar)
• 84 pitches
• 5 tracks: Drums, Guitar, Piano, Strings, Bass
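
To make the tensor layout concrete, the following sketch allocates an empty 4×96×84×5 array and indexes it by bar, time step, pitch and track. The track order simply follows the listing on the slide and may not match the channel order used in the actual dataset; the pitch index used for the example note is arbitrary.

```python
import numpy as np

N_BARS, N_STEPS, N_PITCHES, N_TRACKS = 4, 96, 84, 5

# Order copied from the slide; the real channel order is an assumption.
TRACKS = ("Drums", "Guitar", "Piano", "Strings", "Bass")

phrase = np.zeros((N_BARS, N_STEPS, N_PITCHES, N_TRACKS), dtype=bool)

# Put one bass note on the first beat of the first bar (pitch index 12 is arbitrary).
bass = TRACKS.index("Bass")
phrase[0, 0:24, 12, bass] = True

# Slice out one track: a (4, 96, 84) stack of per-bar piano-rolls.
bass_bars = phrase[..., bass]

# Join the four bars along time to get one (384, 84) piano-roll for the phrase.
bass_roll = bass_bars.reshape(N_BARS * N_STEPS, N_PITCHES)

print(phrase.shape, bass_bars.shape, bass_roll.shape)
```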

SLIDE 12

Data

LPD (Lakh Pianoroll Dataset)
• >170,000 multi-track piano-rolls
• Derived from Lakh MIDI Dataset
• Mainly pop songs

Pypianoroll (Python package)
• Manipulation & Visualization
• Efficient Save/Load
• Parse/Write MIDI files
• On PyPI (pip installable)

[Dataset] https://salu133445.github.io/musegan/dataset
[Pypianoroll] https://salu133445.github.io/pypianoroll/

SLIDE 13

Generative Adversarial Networks

• Generator G: maps random noise z ~ p(z) to fake data G(z)
• Discriminator D: a critic (WGAN-GP) that judges real data x against fake data G(z) (real/fake)
• Data: 4-bar phrases of 5 tracks
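
For reference, the slide only names WGAN-GP; the objective below is the standard WGAN-GP critic loss as it is usually written, not a formula taken from the slides. The penalty weight λ and the interpolation scheme are the usual ones for that method, and the value of λ used for MuseGAN is not stated here.

```latex
% Critic (discriminator) loss, minimized over D:
L_D = \mathbb{E}_{z \sim p(z)}\big[ D(G(z)) \big]
    - \mathbb{E}_{x \sim p_{\mathrm{data}}}\big[ D(x) \big]
    + \lambda \, \mathbb{E}_{\hat{x}}\Big[ \big( \lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1 \big)^2 \Big],
\qquad
\hat{x} = \epsilon x + (1 - \epsilon)\, G(z), \quad \epsilon \sim U[0, 1]

% Generator loss, minimized over G:
L_G = - \mathbb{E}_{z \sim p(z)}\big[ D(G(z)) \big]
```

The first two terms estimate the gap between critic scores on fake and real data, and the last term pushes the critic's gradient norm toward 1 on points interpolated between real and generated samples.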

SLIDE 14

MuseGAN: An Overview

1 random noise → temporal generator (G_temp) → 4 latent variables → bar generator (G_bar) → 4 piano-roll matrices

SLIDE 15

MuseGAN

Bar Generator

[Figure: random vectors z feeding five bar generators G]

SLIDE 16

MuseGAN

Bar Generator

[Figure: random vectors z feeding five bar generators G]

No Coordination (track-dependent z) vs. Coordination (track-independent z)

SLIDE 17

MuseGAN

Bar Generator

[Figure: random vectors z passing through generators G into five bar generators G]

SLIDE 18

MuseGAN

Bar Generator

[Figure: random vectors z passing through generators G into five bar generators G]

SLIDE 19

MuseGAN

Bar Generator

What each random vector z controls:

                    Time Dependent    Time Independent
Track Dependent     Melody            Groove
Track Independent   Chords            Style

[Figure: random vectors z passing through generators G into five bar generators G]
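
One way to picture the four kinds of random vectors is to build the input that each bar generator would see for every bar and track. This is only a sketch under stated assumptions: the vector size `Z_DIM` is arbitrary, the four vectors are assumed to be combined by plain concatenation, and the time-dependent vectors are sampled directly here rather than produced by temporal generators. The names `z_style`, `z_groove`, `z_chords`, `z_melody` and `bar_generator_inputs` are hypothetical.

```python
import numpy as np

N_BARS, N_TRACKS = 4, 5
Z_DIM = 32  # assumed size, for illustration only

rng = np.random.default_rng(0)

# Time-independent vectors: sampled once per phrase.
z_style = rng.standard_normal(Z_DIM)                         # shared by all tracks
z_groove = rng.standard_normal((N_TRACKS, Z_DIM))            # one per track

# Time-dependent vectors: one per bar. Sampled directly here as a stand-in
# for the outputs of temporal generators.
z_chords = rng.standard_normal((N_BARS, Z_DIM))              # shared by all tracks
z_melody = rng.standard_normal((N_BARS, N_TRACKS, Z_DIM))    # one per bar and track


def bar_generator_inputs(z_style, z_groove, z_chords, z_melody):
    """Concatenate the four vectors for every (bar, track) pair.

    Returns an array of shape (bars, tracks, 4 * z_dim).
    """
    shape = z_melody.shape
    parts = [
        np.broadcast_to(z_style, shape),                 # same everywhere
        np.broadcast_to(z_groove[None, :, :], shape),    # varies by track only
        np.broadcast_to(z_chords[:, None, :], shape),    # varies by bar only
        z_melody,                                        # varies by bar and track
    ]
    return np.concatenate(parts, axis=-1)


inputs = bar_generator_inputs(z_style, z_groove, z_chords, z_melody)
print(inputs.shape)
```

Broadcasting makes the sharing pattern explicit: the style slice is identical for every bar and track, the groove slice changes only across tracks, and the chords slice changes only across bars.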

SLIDE 20

Results

Sample 1, Sample 2
More Samples on Demo Page: https://salu133445.github.io/musegan/

[Figure: generated piano-rolls for the Bass, Drums, Guitar, Strings and Piano tracks at training steps 0, 700, 2500, 6000 and 7900; annotations: drum pattern, chords, bass line]

SLIDE 21

Objective Metrics

• UPC: number of used pitch classes per bar
• QN: ratio of qualified notes

Monitor the Training

[Figure: UPC, QN and negative critic loss plotted against training step (step axis ticks at 2000, 4000, 6000, 8000; loss axis ticks at 10^4, 10^6, 10^8, 10^10, 10^12)]
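
The two metrics can be sketched directly on a boolean piano-roll. The slide does not say what makes a note "qualified"; the threshold of three time steps used below is an assumption, exposed as the `min_steps` parameter, and a note is taken to be a run of consecutive active steps at one pitch. The function names are hypothetical. Counting distinct values of the pitch index modulo 12 gives the same UPC whichever pitch the lowest row represents.

```python
import numpy as np


def used_pitch_classes(bar):
    """UPC: number of distinct pitch classes sounding in one bar.

    `bar` is a boolean array of shape (time_steps, pitches).
    """
    active_pitches = np.flatnonzero(bar.any(axis=0))
    return len(np.unique(active_pitches % 12))


def qualified_note_ratio(roll, min_steps=3):
    """QN: fraction of notes lasting at least `min_steps` steps (assumed threshold)."""
    # Work pitch-major and pad the time axis with silence so that every
    # run of active steps has both an onset and an offset.
    padded = np.pad(roll.T.astype(np.int8), ((0, 0), (1, 1)))
    change = np.diff(padded, axis=1)
    # argwhere scans pitch by pitch, so the k-th onset pairs with the k-th offset.
    onsets = np.argwhere(change == 1)[:, 1]
    offsets = np.argwhere(change == -1)[:, 1]
    if onsets.size == 0:
        return 0.0
    lengths = offsets - onsets
    return float(np.mean(lengths >= min_steps))


bar = np.zeros((96, 84), dtype=bool)
bar[0:24, 0] = True     # long note
bar[24:26, 4] = True    # very short note, below the threshold
bar[30:40, 12] = True   # same pitch class as the first note, one octave up

print(used_pitch_classes(bar), qualified_note_ratio(bar))
```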

SLIDE 22

User Study

Models compared: composer, jamming, hybrid

Rating criteria:
• H: harmonious
• R: rhythmic
• MS: musically structured
• C: coherent
• OR: overall rating

SLIDE 23

Summary

• MuseGAN
  • a novel GAN for multi-track sequence generation
  • multi-track, polyphonic music
  • human-AI cooperative scenario (see the paper)
• Lakh Pianoroll Dataset (LPD) (new dataset!!)
• Pypianoroll (new package!!)

SLIDE 24

Future Work

Full Song Generation

Hierarchical Temporal Structure:
song
  paragraph 1, paragraph 2, paragraph 3
    phrase 1, phrase 2, phrase 3, phrase 4
      bar 1, bar 2, bar 3, bar 4 (shown for phrase 2)
        beat 1, beat 2, beat 3, beat 4
          step 1, step 2, ···, step 24

SLIDE 25

Future Work

Cross-modal Generation
• Music + Video
• Music + Lyrics
• Video + Text

SLIDE 26

Q&A

MuseGAN: Multi-track Sequential Generative Adversarial Networks for Symbolic Music Generation and Accompaniment
Source Code: https://github.com/salu133445/musegan
Demo Page: https://salu133445.github.io/musegan/