SWALP: Stochastic Weight Averaging in Low-Precision Training


  1. SWALP: Stochastic Weight Averaging in Low-Precision Training
 Guandao Yang, Tianyi Zhang, Polina Kirichenko, Junwen Bai, Andrew Gordon Wilson, Christopher De Sa

  2. Low-precision Computation

  4. Problem Statement
 We study how to leverage low-precision training to obtain a high-accuracy model. The output model can be higher-precision.

  8. SWALP
 [Diagram: two models, the SGD-LP model and the SWALP model.]
 • SGD-LP model: updating.
 • SWALP model: averaging, done infrequently (every c iterations).
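
The two-model scheme on this slide can be sketched on a toy problem. This is a minimal illustration, not the authors' implementation: the 1-D quadratic objective, the stochastic-rounding quantizer without clipping, the warm-up period, and every name below (`stochastic_round`, `swalp`, `noisy_grad`, `W_STAR`) are assumptions made for the example. The low-precision iterate is updated with SGD-LP on a grid of spacing `delta`, and every `cycle` iterations it is folded into a running average kept in higher precision.

```python
import math
import random


def stochastic_round(x, delta, rng):
    """Round x to a multiple of delta, rounding up with probability equal
    to the fractional part, so the result is unbiased: E[Q(x)] = x.
    (Assumption: no clipping, i.e. the value never overflows.)"""
    scaled = x / delta
    lower = math.floor(scaled)
    if rng.random() < scaled - lower:
        lower += 1
    return lower * delta


def swalp(grad_fn, w0, lr, delta, steps, warmup, cycle, seed=0):
    """Run SGD-LP on a scalar weight and average the iterates every
    `cycle` steps after `warmup` steps. Returns (last iterate, average)."""
    rng = random.Random(seed)
    w = stochastic_round(w0, delta, rng)   # low-precision model
    avg, n_avg = 0.0, 0                    # averaged model, kept in float
    for t in range(1, steps + 1):
        g = grad_fn(w, rng)
        w = stochastic_round(w - lr * g, delta, rng)   # SGD-LP update
        if t > warmup and (t - warmup) % cycle == 0:   # infrequent averaging
            n_avg += 1
            avg += (w - avg) / n_avg                   # running mean
    return w, avg


W_STAR = 0.3137  # hypothetical optimum, deliberately off the grid


def noisy_grad(w, rng):
    # Stochastic gradient of f(w) = 0.5 * (w - W_STAR)^2 with Gaussian noise.
    return (w - W_STAR) + rng.gauss(0.0, 0.1)


DELTA = 2 ** -4
w_lp, w_avg = swalp(noisy_grad, w0=1.0, lr=0.1, delta=DELTA,
                    steps=20000, warmup=1000, cycle=10)
print("SGD-LP iterate:", w_lp, "error:", abs(w_lp - W_STAR))
print("SWALP average: ", w_avg, "error:", abs(w_avg - W_STAR))
```

The last SGD-LP iterate always sits on the quantization grid, so it cannot get arbitrarily close to an optimum that lies between grid points. The average is stored in ordinary floating point and is not restricted to the grid.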

  10. Convergence Analysis
 Let T be the number of iterations.
 Theorem 1 (quadratic): SWALP converges to the optimal solution at an O(1/T) rate.
 SWALP has the same convergence rate as full-precision SGD.

  12. Convergence Analysis
 Let δ be the quantization gap.
 Theorem 2 (strongly convex): The expected distance between the SWALP solution and the optimal one is bounded by O(δ^2).
 • The best bound for SGD-LP is O(δ) (Li et al., NeurIPS 2017).
 • SWALP requires half the number of bits to reduce the noise ball by the same factor.
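
The "half the number of bits" remark follows from the two bounds by a short calculation. The sketch below assumes a fixed-point grid whose gap is δ = 2^(-b) for b fractional bits; the symbols b and k are introduced here for illustration and do not appear on the slide.

```latex
\begin{align*}
\delta &= 2^{-b}
  && \text{(assumption: $b$ fractional bits)} \\
\text{SGD-LP:}\quad O(\delta) &= O\!\left(2^{-b}\right)
  && \Rightarrow\ \text{a $k$-fold smaller bound needs } \log_2 k \text{ extra bits} \\
\text{SWALP:}\quad O(\delta^2) &= O\!\left(2^{-2b}\right)
  && \Rightarrow\ \text{a $k$-fold smaller bound needs } \tfrac{1}{2}\log_2 k \text{ extra bits}
\end{align*}
```

For example, under this assumption a 16-fold reduction of the bound costs 4 extra bits for SGD-LP and 2 extra bits for SWALP.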

  13-15. Experiments
 [Result figures not preserved in this transcript. Slide 14 shows the values 1.3, 2.9, 0.8, and 2.3; their labels did not survive.]

  16. Poster @ Pacific Ballroom #58
 • SWALP code
 • QPyTorch: A Low-Precision Framework
