Mean-field theory of two-layers neural networks: dimension-free - PowerPoint PPT Presentation


  1. Mean-field theory of two-layers neural networks: dimension-free bounds and kernel limit
     Song Mei, Theodor Misiakiewicz, and Andrea Montanari
     Stanford University
     June 26, 2019, COLT 2019
     Song Mei (Stanford University) Mean Field Dynamics for Neural Network June 26, 2019 1 / 12

  2. Gradient dynamics of two-layers neural network
     ◮ Two-layers neural network: $\theta_i = (a_i, w_i) \in \mathbb{R}^D$, $\Theta = (\theta_1, \ldots, \theta_N)$,
       $$\hat{y}(x; \Theta) = \frac{1}{N} \sum_{i=1}^{N} a_i \sigma(\langle w_i, x \rangle).$$
     ◮ Risk function:
       $$R_N(\Theta) = \mathbb{E}_{x,y}\Big[\Big(y - \frac{1}{N} \sum_{i=1}^{N} a_i \sigma(\langle w_i, x \rangle)\Big)^2\Big].$$
     ◮ SGD/gradient flow:
       $$\Theta^{k+1} = \Theta^k - \eta_k \nabla \ell_N(\Theta^k; x_k, y_k), \qquad \frac{d}{dt} \Theta^t = -\nabla R_N(\Theta^t).$$
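The network, loss, and SGD update on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code. It assumes $\sigma = \tanh$ and the per-sample loss $\ell_N = (y - \hat{y})^2$ (the slide fixes neither), and the names `predict` and `sgd_step` are hypothetical.

```python
import math
import random


def predict(theta, x):
    """Mean-field prediction y_hat(x; Theta) = (1/N) * sum_i a_i * tanh(<w_i, x>).

    theta is a list of N pairs (a_i, w_i); tanh is an assumed activation.
    """
    n = len(theta)
    total = 0.0
    for a, w in theta:
        total += a * math.tanh(sum(wj * xj for wj, xj in zip(w, x)))
    return total / n


def sgd_step(theta, x, y, lr):
    """One SGD step on the assumed sample loss l_N = (y - y_hat)^2."""
    n = len(theta)
    err = y - predict(theta, x)
    new_theta = []
    for a, w in theta:
        act = math.tanh(sum(wj * xj for wj, xj in zip(w, x)))
        # d l_N / d a_i = -2 * err * act / N
        grad_a = -2.0 * err * act / n
        # d l_N / d w_i = -2 * err * a_i * (1 - act^2) * x / N
        grad_w = [-2.0 * err * a * (1.0 - act * act) * xj / n for xj in x]
        new_theta.append((a - lr * grad_a,
                          [wj - lr * g for wj, g in zip(w, grad_w)]))
    return new_theta


# Usage: N = 4 neurons, inputs in R^3, one sample (x, y).
rng = random.Random(0)
theta = [(rng.gauss(0, 1), [rng.gauss(0, 1) for _ in range(3)]) for _ in range(4)]
x, y = [1.0, -0.5, 0.25], 0.7
theta_next = sgd_step(theta, x, y, lr=0.01)
```

Note the $1/N$ factor in every gradient: each neuron moves by $O(1/N)$ per step, which is why the mean-field analysis later rescales the step size by $N$.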

  5. Two-layers neural networks
     [Diagram: input layer, hidden layer, output layer, with edges labeled $w_1, \ldots, w_4$ and $a_1, \ldots, a_4$.]
     Figure: Architecture for $N = 4$. $\theta_i = (a_i, w_i)$.

  6. Related literature
     ◮ Mean field distributional dynamics:
       $$\partial_t \rho_t(\theta) = \nabla \cdot \big(\nabla \Psi(\theta; \rho_t)\, \rho_t\big).$$
       ◮ Non-linear dynamics. Converges in some cases.
       ◮ [Mei, Montanari, Nguyen, 2018], [Rotskoff and Vanden-Eijnden, 2018], [Chizat and Bach, 2018a], [Sirignano and Spiliopoulos, 2018].
     ◮ Neural tangent kernel (NTK) dynamics:
       $$\partial_t \|u_t\|_2^2 = -\langle u_t, H u_t \rangle.$$
       ◮ Linear dynamics. Always converges to $0$ empirical risk.
       ◮ [Jacot, Gabriel, and Hongler, 2018], [Li and Liang, 2018], [Du, Zhai, Poczos, Singh, 2018].
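The statement that the linear NTK dynamics always reaches zero empirical risk follows from a short Grönwall-type argument. The sketch below reads $u_t$ as the vector of residuals on the training set and assumes that $H$ is a fixed positive definite matrix with smallest eigenvalue $\lambda_{\min} > 0$; neither assumption is stated on the slide.

```latex
\partial_t \|u_t\|_2^2
  = -\langle u_t, H u_t \rangle
  \le -\lambda_{\min} \|u_t\|_2^2
\quad\Longrightarrow\quad
\|u_t\|_2^2 \le e^{-\lambda_{\min} t}\, \|u_0\|_2^2
  \xrightarrow[t \to \infty]{} 0 .
```

No comparable argument is available for the mean-field equation, because $\Psi(\theta; \rho_t)$ depends on $\rho_t$ itself, which is why convergence there is only known in some cases.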

  12. This work
      (a) Improved bound for SGD - PDE interpolation.
      (b) Relationship of the mean field limit and the kernel limit.

  13. SGD and distributional dynamics (DD)
      ◮ SGD for $\Theta^k$, with $(x_k, y_k) \sim \mathbb{P}_{x,y}$, $i \in [N]$,
        $$\theta_i^{k+1} = \theta_i^k - 2 s_k N \nabla_{\theta_i} \ell_N(\Theta^k; x_k, y_k). \tag{SGD}$$
      ◮ [MMN18]: $s_k = \varepsilon \xi(k\varepsilon)$, $k = t/\varepsilon$, $N \to \infty$, $\varepsilon \to 0$:
        $$\hat{\rho}^{(N)}_k \equiv \frac{1}{N} \sum_{i=1}^{N} \delta_{\theta_i^k} \Rightarrow \rho_t \in \mathscr{P}(\mathbb{R}^D) \times [0, \infty).$$
      ◮ Distributional dynamics (DD) for $\rho_t$,
        $$\partial_t \rho_t(\theta) = 2 \xi(t) \nabla_\theta \cdot \big(\rho_t(\theta) \nabla_\theta \Psi(\theta; \rho_t)\big), \tag{DD}$$
        where
        $$\Psi(\theta; \rho) = \frac{\delta R(\rho)}{\delta \rho(\theta)} = V(\theta) + \int U(\theta, \theta') \rho(d\theta').$$
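To make the link between (SGD) and (DD) concrete, the sketch below evaluates $\Psi(\theta; \hat{\rho}^{(N)})$ for an empirical measure of particles. It rests on several assumptions that are not on the slide: $V(\theta) = -\mathbb{E}[y\, \sigma_*(x; \theta)]$ and $U(\theta, \theta') = \mathbb{E}[\sigma_*(x; \theta)\, \sigma_*(x; \theta')]$ with $\sigma_*(x; \theta) = a \tanh(\langle w, x \rangle)$, with expectations replaced by averages over a finite sample. All function names are hypothetical.

```python
import math


def unit(theta, x):
    """sigma_*(x; theta) = a * tanh(<w, x>) for theta = (a, w) (assumed form)."""
    a, w = theta
    return a * math.tanh(sum(wj * xj for wj, xj in zip(w, x)))


def potential_v(theta, data):
    """V(theta) = -E[y * sigma_*(x; theta)], averaged over the sample `data`."""
    return -sum(y * unit(theta, x) for x, y in data) / len(data)


def interaction_u(theta1, theta2, data):
    """U(theta1, theta2) = E[sigma_*(x; theta1) * sigma_*(x; theta2)]."""
    return sum(unit(theta1, x) * unit(theta2, x) for x, _ in data) / len(data)


def psi(theta, particles, data):
    """Psi(theta; rho_hat) = V(theta) + (1/N) * sum_j U(theta, theta_j)."""
    n = len(particles)
    return potential_v(theta, data) + sum(
        interaction_u(theta, p, data) for p in particles) / n


def step_size(k, eps, xi=lambda t: 1.0):
    """Step size s_k = eps * xi(k * eps); xi defaults to the constant 1."""
    return eps * xi(k * eps)


# Usage with a small deterministic sample and N = 5 particles.
data = [([math.sin(i), math.cos(2 * i)], math.sin(3 * i)) for i in range(20)]
particles = [(0.5 * j - 1.0, [0.3 * j, 0.1 - 0.2 * j]) for j in range(5)]
psi_first = psi(particles[0], particles, data)
```

Under these assumptions, $\Psi(\theta; \hat{\rho}^{(N)})$ equals $-\mathbb{E}[(y - \hat{y}(x))\, \sigma_*(x; \theta)]$, so each particle in (SGD) moves, on average, along $-\nabla_\theta \Psi$, which is the velocity field that transports $\rho_t$ in (DD).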
