Fine-Grained Analysis of Stability and Generalization for SGD
Yunwen Lei¹ and Yiming Ying²
¹University of Kaiserslautern, ²University at Albany, State University of New York (SUNY)
yunwen.lei@hotmail.com, yying@albany.edu
June, 2020
Overview
$\min_{w \in \Omega} F(w)$
$F_S(w) = \frac{1}{n} \sum_{i=1}^{n} f(w; z_i).$
$\bar{w}_T = \sum_{t=1}^{T} \eta_t w_t \big/ \sum_{t=1}^{T} \eta_t.$
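The step-size-weighted average $\sum_{t=1}^{T} \eta_t w_t / \sum_{t=1}^{T} \eta_t$ can be illustrated with a minimal sketch. It assumes an unconstrained domain (no projection onto $\Omega$), uniform sampling of one example per step, and a toy absolute-value loss. The names `sgd_weighted_average` and `abs_loss_subgrad`, the loss, and the toy data are illustrative choices, not taken from the slides.

```python
import random


def sgd_weighted_average(grad, data, w0, step_sizes, seed=0):
    """Run SGD and return sum_t eta_t * w_t / sum_t eta_t over t = 1..T.

    grad(w, z) returns a (sub)gradient of f(w; z) as a list of floats.
    Assumption: no projection step, i.e. the domain is all of R^d.
    """
    rng = random.Random(seed)
    w = list(w0)  # w_1
    weighted_sum = [0.0] * len(w)
    total_step = 0.0
    for eta in step_sizes:
        # Accumulate eta_t * w_t before moving on to w_{t+1}.
        weighted_sum = [s + eta * wi for s, wi in zip(weighted_sum, w)]
        total_step += eta
        z = rng.choice(data)  # draw one training example uniformly
        g = grad(w, z)
        w = [wi - eta * gi for wi, gi in zip(w, g)]
    return [s / total_step for s in weighted_sum]


def abs_loss_subgrad(w, z):
    """A subgradient of the nonsmooth toy loss f(w; z) = |<w, x> - y|."""
    x, y = z
    residual = sum(wi * xi for wi, xi in zip(w, x)) - y
    sign = (residual > 0) - (residual < 0)
    return [sign * xi for xi in x]


if __name__ == "__main__":
    # Hypothetical toy data set of (x, y) pairs.
    data = [((1.0, 0.0), 1.0), ((0.0, 1.0), -1.0), ((1.0, 1.0), 0.0)]
    T = 2000
    steps = [T ** -0.75] * T  # constant step size depending on T
    print(sgd_weighted_average(abs_loss_subgrad, data, [0.0, 0.0], steps))
```

With a constant step size the weighted average reduces to the plain average of the iterates; the weights only matter when $\eta_t$ varies with $t$.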
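$\|w - \eta \partial f(w; z) - w' + \eta \partial f(w'; z)\|_2^2 \le \|w - w'\|_2^2 + O(\eta^2).$
We let $\eta_t = T^{-\frac{3}{4}}$ and $T \asymp n^2$ and get $\mathbb{E}[F(\bar{w}_T)] - F(w^*) = O(n^{-\frac{1}{2}})$.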
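One way to see where an $O(\eta^2)$ term of this kind comes from is sketched here, assuming $f(\cdot; z)$ is convex and $G$-Lipschitz. The constant $G$ and the subgradients $g \in \partial f(w; z)$, $g' \in \partial f(w'; z)$ are notation introduced for this sketch, not taken from the slides.

```latex
\begin{align*}
\|w - \eta g - w' + \eta g'\|_2^2
  &= \|w - w'\|_2^2 - 2\eta \langle w - w',\, g - g' \rangle + \eta^2 \|g - g'\|_2^2 \\
  &\le \|w - w'\|_2^2 + \eta^2 \bigl(\|g\|_2 + \|g'\|_2\bigr)^2 \\
  &\le \|w - w'\|_2^2 + 4 G^2 \eta^2 .
\end{align*}
```

The first inequality uses monotonicity of the subdifferential of a convex function, $\langle w - w', g - g' \rangle \ge 0$, together with the triangle inequality; the second uses $\|g\|_2, \|g'\|_2 \le G$.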
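$\|w - \eta \partial f(w; z) - w' + \eta \partial f(w'; z)\|_2^2 \le \|w - w'\|_2^2 + O(\eta^{\frac{2}{1-\alpha}}).$
We let $\eta_t = T^{\frac{3\alpha-3}{2(2-\alpha)}}$, $T \asymp n^{\frac{2-\alpha}{1+\alpha}}$ and get $\mathbb{E}[F(\bar{w}_T)] - F(w^*) = O(n^{-\frac{1}{2}})$.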
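If $F(w^*) = O(\frac{1}{n})$, we let $\eta_t = T^{\frac{\alpha^2+2\alpha-3}{4}}$ and $T \asymp n^{\frac{2}{1+\alpha}}$ and get $\mathbb{E}[F(\bar{w}_T)] = O(n^{-\frac{1+\alpha}{2}})$.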
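The exponents in the two step-size schedules can be evaluated exactly for a given $\alpha$, which makes it easy to check how they behave at the endpoints. The sketch below is illustrative only: the function names are hypothetical, the range check $\alpha \in [0, 1]$ is an assumption (the precise conditions on $\alpha$ are not visible in this section), and the constants hidden by $\asymp$ are set to 1.

```python
from fractions import Fraction


def _check_alpha(alpha):
    alpha = Fraction(alpha)
    if not 0 <= alpha <= 1:
        # Assumed range; the source's exact condition on alpha is not shown.
        raise ValueError("alpha is assumed to lie in [0, 1]")
    return alpha


def first_schedule(alpha):
    """Exponents (a, b) with eta_t = T**a and T ~ n**b:
    a = (3*alpha - 3) / (2*(2 - alpha)), b = (2 - alpha) / (1 + alpha)."""
    alpha = _check_alpha(alpha)
    step_exp = (3 * alpha - 3) / (2 * (2 - alpha))
    iter_exp = (2 - alpha) / (1 + alpha)
    return step_exp, iter_exp


def second_schedule(alpha):
    """Exponents (a, b) with eta_t = T**a and T ~ n**b:
    a = (alpha**2 + 2*alpha - 3) / 4, b = 2 / (1 + alpha)."""
    alpha = _check_alpha(alpha)
    step_exp = (alpha ** 2 + 2 * alpha - 3) / 4
    iter_exp = Fraction(2) / (1 + alpha)
    return step_exp, iter_exp


def concrete_schedule(n, alpha, schedule=first_schedule):
    """Turn the exponents into a concrete (T, eta) pair for sample size n,
    taking all hidden constants to be 1 (an assumption)."""
    step_exp, iter_exp = schedule(alpha)
    T = max(1, round(n ** float(iter_exp)))
    eta = T ** float(step_exp)
    return T, eta


if __name__ == "__main__":
    for a in (Fraction(0), Fraction(1, 4), Fraction(1, 2)):
        print(a, first_schedule(a), second_schedule(a))
```

At $\alpha = 0$ both schedules reduce to $\eta_t = T^{-3/4}$ and $T \asymp n^2$, matching the nonsmooth case, and at $\alpha = 1$ the second schedule gives a constant step size with $T \asymp n$.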
$\frac{1}{n^2}$, $\sum_{t=1}^{T} \eta_t$, $\sum_{t=1}^{T} \eta_t^2$.
$\frac{1}{nT} + \frac{1}{n^2}$.