Global convergence of neuron birth-death dynamics, ICML 2019

Global convergence of neuron birth-death dynamics ICML 2019 Long Beach, CA 13 June 2019 Grant M. Rotskoff, Samy Jelassi, Joan Bruna, and Eric Vanden-Eijnden Center for Data Science, New York University Courant Institute of Mathematical

SLIDE 1

Global convergence of neuron birth-death dynamics

Grant M. Rotskoff, Samy Jelassi, Joan Bruna, and Eric Vanden-Eijnden

Center for Data Science, New York University
Courant Institute of Mathematical Sciences, New York University

ICML 2019 Long Beach, CA 13 June 2019

SLIDE 2

ICML 13 Jun 2019

Synopsis of main results

→ Modification of gradient descent (GD) with unbalanced optimal transport

Parameter birth-death process
Proof of global convergence
Rate of convergence scales as

→ Based on mean-field perspective on neural networks

Analysis of deterministic partial differential equation (PDE)
PDE leads to practical, efficient algorithm

→ Experiments show faster convergence relative to GD

Illustrative examples show effect of transport / exploration
Easy to implement with no additional gradient computations
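One plausible reading of "no additional gradient computations" is that the quantity driving birth and death can be computed from forward evaluations alone. The sketch below assumes a squared loss and a network f_n(x) = (1/n) Σ_i φ(x, θ_i); under those assumptions the per-particle potential is V_i = E_x[(f_n(x) − f(x)) φ(x, θ_i)]. The unit `unit`, the parameter layout, and the function names are hypothetical choices for illustration, not taken from the slides.

```python
import numpy as np

def unit(x, theta):
    """Hypothetical unit phi(x, theta) = c * tanh(w . x), with theta = (c, w).

    x: (batch, d) inputs; theta: (n, 1 + d) parameters.
    Returns a (batch, n) array of unit outputs.
    """
    c, w = theta[:, 0], theta[:, 1:]
    return c * np.tanh(x @ w.T)

def particle_potentials(x, y, theta):
    """Estimate V_i = E_x[(f_n(x) - f(x)) * phi(x, theta_i)] on a batch.

    Only forward passes are used; no gradients are taken.
    y: (batch,) target values f(x).
    """
    phi = unit(x, theta)                 # (batch, n)
    residual = phi.mean(axis=1) - y      # f_n(x) - f(x), shape (batch,)
    return (residual[:, None] * phi).mean(axis=0)  # shape (n,)
```

The returned vector has one entry per neuron, so it can be compared against its own mean to decide which neurons sit above or below the population average.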

SLIDE 3

Single hidden layer neural network

d-dim input
PDE for parameter distribution
Distinct from kernel learning / NTK: the dynamics leads to feature selection
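The equations on this slide did not survive extraction. For orientation only, the standard mean-field formulation of a single-hidden-layer network trained by gradient descent on a squared loss can be written as follows. The notation (φ, θ, μ, F, K, V) is an assumption and may differ from the authors' slide.

```latex
% Network with n units and its empirical parameter distribution
f_n(x) = \frac{1}{n}\sum_{i=1}^{n} \varphi(x,\theta_i), \qquad x \in \mathbb{R}^d,
\qquad \mu_n = \frac{1}{n}\sum_{i=1}^{n} \delta_{\theta_i}

% Squared loss as a functional of the parameter distribution
\mathcal{E}[\mu] = \tfrac12\, \mathbb{E}_x \Big| f(x) - \int \varphi(x,\theta)\,\mu(d\theta) \Big|^2

% Potential felt by a unit with parameters \theta
V(\theta,[\mu]) = -F(\theta) + \int K(\theta,\theta')\,\mu(d\theta'),
\quad F(\theta) = \mathbb{E}_x\big[f(x)\,\varphi(x,\theta)\big],
\quad K(\theta,\theta') = \mathbb{E}_x\big[\varphi(x,\theta)\,\varphi(x,\theta')\big]

% Gradient descent on the parameters becomes a transport PDE for \mu_t
\partial_t \mu_t = \nabla_\theta \cdot \big( \mu_t \, \nabla_\theta V(\theta,[\mu_t]) \big)
```

In this picture each neuron follows the flow θ̇ = −∇V, and the PDE is the continuity equation for the resulting distribution of parameters.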

SLIDE 4

Non-local mass transport (particle birth-death)

Parameters are killed / cloned
Total population is fixed
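A minimal sketch of one kill/clone sweep that keeps the population size fixed is given below. It assumes each particle i carries a scalar potential V_i, that particles above the population mean are killed and those below are cloned, and that an event fires with probability 1 − exp(−α τ |V_i − V̄|). The rate `alpha`, the time step `tau`, and the function name are illustrative assumptions rather than the authors' reference implementation.

```python
import numpy as np

def birth_death_step(theta, potential, alpha, tau, rng):
    """One kill/clone sweep over n particles; the output has the same n.

    theta: (n, p) array of neuron parameters.
    potential: (n,) array with the potential V_i of each particle.
    alpha, tau: birth-death rate and time step (assumed hyperparameters).
    rng: a numpy.random.Generator.
    """
    theta = theta.copy()
    n = theta.shape[0]
    excess = potential - potential.mean()            # V_i minus population mean
    prob = 1.0 - np.exp(-alpha * tau * np.abs(excess))
    fires = rng.random(n) < prob
    for i in np.flatnonzero(fires):
        j = rng.integers(n)  # uniformly chosen partner keeps the count fixed
        if excess[i] > 0:
            theta[i] = theta[j]   # kill i and refill its slot with a clone of j
        else:
            theta[j] = theta[i]   # clone i over a randomly chosen particle
    return theta
```

Every kill is paired with a clone and every clone with a kill, so the array length never changes. In a training loop a sweep like this would be interleaved with ordinary gradient steps, with the potentials recomputed before each sweep.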

SLIDE 5

Theorem [R,J,B,V-E]: global convergence

Compare with [Chizat & Bach, 2018] without any restriction on homogeneity for the units

SLIDE 6

Dramatic effect of birth/death dynamics

SLIDE 7

Come see our poster!

Grant Rotskoff, Samy Jelassi, Joan Bruna, Eric Vanden-Eijnden
arXiv:1902.01843
Thu Jun 13th, 06:30 -- 09:00 PM @ Pacific Ballroom #93