Neural Tangent Kernel: Convergence and Generalization in Neural Networks




SLIDE 1

Neural Tangent Kernel

Convergence and Generalization in Neural Networks

Arthur Jacot

arthur.jacot@epfl.ch

Franck Gabriel

franck.gabriel@epfl.ch

Clément Hongler

clement.hongler@epfl.ch

SLIDE 2

What happens during training?

One step of gradient descent on one datapoint x0

Neural Tangent Kernel: describes the effect of gradient descent on the network function
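
The claim that the kernel describes the effect of a gradient step can be checked numerically. The sketch below is an illustration under assumptions that are not in the slides: a one-hidden-layer tanh network with scalar input, the 1/sqrt(width) scaling in the forward pass, a squared loss on the single datapoint x0, and hypothetical helper names (init_params, empirical_ntk, one_step_check). It compares the actual change of f(x) after one step with the first-order prediction -lr * Theta(x, x0) * (f(x0) - y0).

```python
import math
import random


def init_params(width, seed=0):
    """Draw a, w, b ~ N(0, 1) for f(x) = (1/sqrt(n)) * sum_i a_i * tanh(w_i * x + b_i)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(width)] for _ in range(3)]


def forward(params, x):
    a, w, b = params
    total = sum(ai * math.tanh(wi * x + bi) for ai, wi, bi in zip(a, w, b))
    return total / math.sqrt(len(a))


def grad(params, x):
    """Gradient of f(x) with respect to every parameter, grouped as [da, dw, db]."""
    a, w, b = params
    scale = 1.0 / math.sqrt(len(a))
    t = [math.tanh(wi * x + bi) for wi, bi in zip(w, b)]
    da = [scale * ti for ti in t]
    db = [scale * ai * (1.0 - ti * ti) for ai, ti in zip(a, t)]
    dw = [d * x for d in db]
    return [da, dw, db]


def empirical_ntk(params, x1, x2):
    """Theta(x1, x2) = sum over parameters p of df(x1)/dp * df(x2)/dp."""
    g1, g2 = grad(params, x1), grad(params, x2)
    return sum(u * v for p, q in zip(g1, g2) for u, v in zip(p, q))


def gd_step(params, x0, y0, lr):
    """One gradient descent step on the loss 0.5 * (f(x0) - y0)**2."""
    residual = forward(params, x0) - y0
    g = grad(params, x0)
    return [[p - lr * residual * gi for p, gi in zip(ps, gs)]
            for ps, gs in zip(params, g)]


def one_step_check(width=512, seed=0, x0=0.5, y0=1.0, x=-0.3, lr=1e-3):
    """Return (first-order NTK prediction, actual change) of f(x) after one step on x0."""
    params = init_params(width, seed)
    residual = forward(params, x0) - y0
    predicted = -lr * empirical_ntk(params, x, x0) * residual
    actual = forward(gd_step(params, x0, y0, lr), x) - forward(params, x)
    return predicted, actual


predicted, actual = one_step_check()
print(f"NTK prediction: {predicted:.6e}   actual change: {actual:.6e}")
```

The two printed numbers should agree up to a term of second order in the learning rate, which is why the sketch uses a small lr.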

SLIDE 3

In the infinite-width limit, the Neural Tangent Kernel is:

  • Deterministic
  • Fixed in time
  • Explicit formula

Determines the trajectory of the network function during training
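
Two of these properties can be probed at finite width. The sketch below is a rough experiment, not a proof, and every concrete choice in it is an assumption made for illustration: the one-hidden-layer tanh network, the scalar inputs, the widths, the learning rate, the number of steps, and the helper name ntk_stats. For each width it measures how much the kernel value at initialization varies across random seeds ("deterministic") and how far it moves after a few gradient steps on one datapoint ("fixed in time").

```python
import math
import random
import statistics


def init_params(width, rng):
    # a, w, b ~ N(0, 1); the 1/sqrt(width) factor lives in the forward pass
    return [[rng.gauss(0.0, 1.0) for _ in range(width)] for _ in range(3)]


def forward(params, x):
    a, w, b = params
    total = sum(ai * math.tanh(wi * x + bi) for ai, wi, bi in zip(a, w, b))
    return total / math.sqrt(len(a))


def grad(params, x):
    """Gradient of f(x) with respect to every parameter, grouped as [da, dw, db]."""
    a, w, b = params
    scale = 1.0 / math.sqrt(len(a))
    t = [math.tanh(wi * x + bi) for wi, bi in zip(w, b)]
    da = [scale * ti for ti in t]
    db = [scale * ai * (1.0 - ti * ti) for ai, ti in zip(a, t)]
    dw = [d * x for d in db]
    return [da, dw, db]


def ntk(params, x1, x2):
    g1, g2 = grad(params, x1), grad(params, x2)
    return sum(u * v for p, q in zip(g1, g2) for u, v in zip(p, q))


def train(params, x0, y0, lr, steps):
    """Plain gradient descent on 0.5 * (f(x0) - y0)**2."""
    for _ in range(steps):
        residual = forward(params, x0) - y0
        g = grad(params, x0)
        params = [[p - lr * residual * gi for p, gi in zip(ps, gs)]
                  for ps, gs in zip(params, g)]
    return params


def ntk_stats(width, seeds=8, x1=-0.3, x2=0.5):
    """Return (std of the initial kernel over seeds, mean relative drift after training)."""
    initial, drift = [], []
    for seed in range(seeds):
        params = init_params(width, random.Random(seed))
        k0 = ntk(params, x1, x2)
        k1 = ntk(train(params, x0=0.5, y0=1.0, lr=0.1, steps=20), x1, x2)
        initial.append(k0)
        drift.append(abs(k1 - k0) / abs(k0))
    return statistics.stdev(initial), statistics.mean(drift)


for width in (16, 128, 1024):
    spread, moved = ntk_stats(width)
    print(f"width={width:5d}  std over seeds={spread:.4f}  relative drift={moved:.4f}")
```

Both columns should shrink as the width grows, which is consistent with the limit being deterministic and constant during training. The explicit formula for the limiting kernel is not reproduced in this sketch.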

SLIDE 4

Kernel methods

  • Kernel gradient descent
  • Positive definite NTK
  • Kernel ridge regression

Neural Networks

  • Gradient descent
  • Convergence to a global minimum
  • Least-squares loss
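
For the least-squares loss the correspondence can be written out. The following is a sketch in notation assumed here rather than taken from the slides: training inputs X = (x_1, ..., x_N) with targets y, limiting kernel \Theta_\infty, Gram matrix K with entries K_{ij} = \Theta_\infty(x_i, x_j), and continuous-time gradient descent on C(f) = \frac{1}{2N}\sum_i (f(x_i) - y_i)^2.

```latex
\begin{align*}
% Kernel gradient descent: the network function follows the kernel gradient of the loss
\partial_t f_t(x) &= -\frac{1}{N}\sum_{j=1}^{N} \Theta_\infty(x, x_j)\,\bigl(f_t(x_j) - y_j\bigr) \\
% Restricted to the training inputs this is a linear ODE with an explicit solution
f_t(X) - y &= e^{-tK/N}\,\bigl(f_0(X) - y\bigr) \\
% If K is positive definite, e^{-tK/N} \to 0 and the training loss reaches its global minimum
\lim_{t\to\infty} f_t(x) &= f_0(x) + \Theta_\infty(x, X)\,K^{-1}\bigl(y - f_0(X)\bigr)
\end{align*}
```

The last line is a kernel regression predictor with kernel \Theta_\infty, in its ridgeless form and shifted by the initial function f_0. Positive definiteness of K is what guarantees that the exponential term vanishes.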

SLIDE 5

What happens inside a very wide network?

The sum of all microscopic changes yields a macroscopic effect

  • The activations of the hidden neurons become independent
  • The parameters and activations evolve less and less
  • However all layers learn: