FloWaveNet: A Generative Flow for Raw Audio

Jul 17, 2023

ICML 2019
FloWaveNet: A Generative Flow for Raw Audio
Sungwon Kim 1, Sang-gil Lee 1, Jongyoon Song 1, Jaehyeon Kim 2, Sungroh Yoon 1,3
1 Seoul National University, 2 Kakao Corporation, 3 ASRI, INMC, Institute of Engineering Research, Seoul National University
Poster 6/12 6:30 PM @Pacific Ballroom #2
WaveNet
• Sequential sampling
$\log P_X(x_{1:T}) = \sum_{t=1}^{T} \log P_X(x_t \mid x_{<t})$
https://deepmind.com/blog/wavenet-generative-model-raw-audio/
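The autoregressive factorization on this slide can be illustrated with a minimal sketch. The conditional model below (`conditional_params`, a Gaussian whose mean depends only on the previous sample) is a hypothetical stand-in for the WaveNet network and is not taken from the slides. It only shows why the log-likelihood is a sum of per-step terms and why sampling has to proceed one step at a time.

```python
import math
import random


def gaussian_logpdf(x, mean, std):
    """Log density of a 1-D Gaussian N(mean, std^2) at x."""
    return -0.5 * math.log(2 * math.pi) - math.log(std) - 0.5 * ((x - mean) / std) ** 2


def conditional_params(history):
    """Hypothetical toy conditional p(x_t | x_<t): mean follows the previous sample.

    A real WaveNet would compute these parameters with a neural network.
    """
    prev = history[-1] if history else 0.0
    return 0.9 * prev, 0.1


def log_likelihood(x):
    """log p(x_1:T) = sum over t of log p(x_t | x_<t)."""
    total = 0.0
    for t in range(len(x)):
        mean, std = conditional_params(x[:t])
        total += gaussian_logpdf(x[t], mean, std)
    return total


def sample(length, rng):
    """Sequential sampling: step t cannot start before step t-1 has been drawn."""
    x = []
    for _ in range(length):
        mean, std = conditional_params(x)
        x.append(rng.gauss(mean, std))
    return x


if __name__ == "__main__":
    audio = sample(8, random.Random(0))
    print(len(audio), log_likelihood(audio))
```

Evaluating the likelihood of a known sequence needs no sampling, so all conditionals can be computed at once during training. The loop in `sample` is the part that cannot be parallelized, which is the bottleneck the following slides address.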
Previous parallel speech synthesis models
• Pre-trained WaveNet
• Inverse Autoregressive Flows (IAFs): parallel sampling
• Probability Density Distillation: $D_{KL}(P_S(x) \,\|\, P_T(x))$
• + Power Loss, Perceptual Loss, Contrastive Loss, Frame Loss
Oord, Aaron, et al. "Parallel WaveNet: Fast High-Fidelity Speech Synthesis." International Conference on Machine Learning. 2018.
Our Objectives
• Simplify the training procedure for parallel sampling
• Maintain the quality of speech samples
Flow-based generative models for raw audio!
FloWaveNet
Raw audio ($P_X$) is mapped to Gaussian noise ($P_Z$) by $f_\theta$, and back by $f_\theta^{-1}$.
Training phase: $\log P_X(x_{1:T}) = \log P_Z(f_\theta(x_{1:T})) + \log\det\frac{\partial f_\theta(x)}{\partial x}$
Sampling phase: $z = z_{1:T} \sim P_Z$, $P_Z = \mathcal{N}(0, I)$, $x = f_\theta^{-1}(z)$
Both the transformation $f_\theta$ and its inverse $f_\theta^{-1}$ are designed to be computed efficiently → Efficient training & Parallel sampling
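The training and sampling phases above follow the change-of-variables rule, which can be sketched with the simplest possible invertible map. The element-wise affine transform below, with hypothetical parameters `shift` and `log_scale`, is an assumption chosen for illustration only and is far simpler than the FloWaveNet architecture.

```python
import math
import random

LOG_2PI = math.log(2 * math.pi)


def standard_normal_logpdf(z):
    """log N(z; 0, I) for a list of independent components."""
    return sum(-0.5 * LOG_2PI - 0.5 * v * v for v in z)


def forward(x, shift, log_scale):
    """z = f(x) = (x - shift) / exp(log_scale), plus log|det df/dx|."""
    z = [(v - shift) * math.exp(-log_scale) for v in x]
    log_det = -log_scale * len(x)  # Jacobian is diagonal with entries exp(-log_scale)
    return z, log_det


def inverse(z, shift, log_scale):
    """x = f^{-1}(z); every time step is computed independently."""
    return [v * math.exp(log_scale) + shift for v in z]


def log_likelihood(x, shift, log_scale):
    """Training objective: log p_X(x) = log p_Z(f(x)) + log|det df/dx|."""
    z, log_det = forward(x, shift, log_scale)
    return standard_normal_logpdf(z) + log_det


def sample(length, shift, log_scale, rng):
    """Sampling: draw all of z at once, then apply the inverse map."""
    z = [rng.gauss(0.0, 1.0) for _ in range(length)]
    return inverse(z, shift, log_scale)


if __name__ == "__main__":
    x = sample(4, shift=0.5, log_scale=-1.0, rng=random.Random(0))
    print(x, log_likelihood(x, 0.5, -1.0))
```

Because `inverse` has no dependency between time steps, the whole waveform can be produced in one pass. That is the property the slide refers to as parallel sampling.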
FloWaveNet
$\log P_X(x_{1:T}) = \log P_Z(f(x_{1:T})) + \sum_{n} \log\det\frac{\partial (f_{AC}^{n} \cdot f_{CO}^{n})(x)}{\partial x}$
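A stacked flow accumulates one log-determinant per layer, as in the sum on this slide. The sketch below assumes a toy affine coupling layer followed by a half-swap that reorders the signal. The names `conditioner`, `coupling_forward`, and `swap_halves` are hypothetical, and the conditioner is a hand-written function rather than a network. It illustrates the general coupling-flow technique, not the authors' implementation.

```python
import math


def conditioner(x_a, weight):
    """Hypothetical stand-in for a network: (log_scale, shift) from the untouched half."""
    m = sum(x_a) / len(x_a)
    return math.tanh(weight * m), weight * m


def coupling_forward(x, weight):
    """Affine coupling: keep the first half, transform the second half."""
    half = len(x) // 2
    x_a, x_b = x[:half], x[half:]
    log_s, t = conditioner(x_a, weight)
    z_b = [(v - t) * math.exp(-log_s) for v in x_b]
    return x_a + z_b, -log_s * len(x_b)  # log|det| of this layer


def coupling_inverse(z, weight):
    """The first half is unchanged, so the same (log_scale, shift) can be recomputed."""
    half = len(z) // 2
    z_a, z_b = z[:half], z[half:]
    log_s, t = conditioner(z_a, weight)
    return z_a + [v * math.exp(log_s) + t for v in z_b]


def swap_halves(x):
    """Reordering step; a permutation, so its log|det| is 0. Assumes even length."""
    half = len(x) // 2
    return x[half:] + x[:half]


def flow_forward(x, weights):
    """Apply every layer and sum the per-layer log-determinants."""
    total_log_det = 0.0
    for w in weights:
        x, log_det = coupling_forward(x, w)
        total_log_det += log_det
        x = swap_halves(x)
    return x, total_log_det


def flow_inverse(z, weights):
    """Undo the layers in reverse order."""
    for w in reversed(weights):
        z = swap_halves(z)
        z = coupling_inverse(z, w)
    return z


if __name__ == "__main__":
    x = [0.5, -0.2, 1.0, 0.3]
    z, log_det = flow_forward(x, [0.7, -0.4, 1.1])
    print(z, log_det, flow_inverse(z, [0.7, -0.4, 1.1]))
```

Swapping the halves between layers means every sample is eventually transformed, while each individual layer keeps a triangular Jacobian whose log-determinant is cheap to compute.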
Mean Opinion Scores: FloWaveNet ≥ Gaussian IAF
Sampling speed: FloWaveNet ≅ Gaussian IAF ≅ Parallel WaveNet >> Autoregressive WaveNet (1000s of times faster)
Conclusion
• FloWaveNet produces high-quality audio samples, on par with previous distilled models.
• FloWaveNet synthesizes audio samples in parallel
  – w/o a well pre-trained WaveNet (No distillation!)
  – w/o auxiliary loss terms
Demo page · Code