Landscape Connectivity and Dropout Stability
of SGD Solutions for Over-parameterized
Neural Networks
Marco Mondelli Alexander Shevchenko
Neural Network Training
From a theoretical perspective, training neural networks is difficult.
Over-parameterization
(Stochastic) gradient descent
Mean-field view:
Two layers [Mei et al., 2019]
Multiple layers [Araujo et al., 2019]
Quantitative bounds:
log(width)
1/width
$\Big(\dots\ \frac{1}{N}\sum_{i=1}^{N} a_i\,\sigma(x; w_i)\Big)^2$
$\Big(\dots\ \frac{1}{N}\sum_{i=1}^{N} a_i^k\,\sigma(x^k; w_i^k)\Big)^2$
$\hat{y}_N(x, \theta) = \frac{1}{N}\sum_{i=1}^{N} a_i\,\sigma(x; w_i)$
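As an illustration of the two-layer mean-field model, here is a minimal Python sketch that evaluates the network output as (1/N) times the sum of a_i σ(x; w_i), then compares it with the output after dropping half of the neurons. The choice of σ as a ReLU of the inner product, the Gaussian parameters, and the helper names `y_hat`, `relu_unit`, and `dropout_half` are assumptions made for illustration; they are not taken from the slides. Dropout is modeled here as keeping the first N/2 neurons and averaging over the survivors, which is one plausible reading of "dropout stability".

```python
import random

def y_hat(x, a, w, sigma):
    """Mean-field two-layer output: (1/N) * sum_i a_i * sigma(x; w_i)."""
    n = len(a)
    return sum(a_i * sigma(x, w_i) for a_i, w_i in zip(a, w)) / n

def relu_unit(x, w_i):
    """Assumed form of sigma(x; w): ReLU of the inner product <w, x>."""
    return max(0.0, sum(wj * xj for wj, xj in zip(w_i, x)))

def dropout_half(a, w):
    """Keep the first N/2 neurons. Calling y_hat on the result averages
    over the survivors, so the 1/N prefactor becomes 1/(N/2)."""
    m = len(a) // 2
    return a[:m], w[:m]

# Illustrative random parameters (not trained by SGD).
rng = random.Random(0)
D, N = 3, 2000
x = [rng.gauss(0, 1) for _ in range(D)]
a = [rng.gauss(0, 1) for _ in range(N)]
w = [[rng.gauss(0, 1) for _ in range(D)] for _ in range(N)]

full = y_hat(x, a, w, relu_unit)
a_half, w_half = dropout_half(a, w)
half = y_hat(x, a_half, w_half, relu_unit)
print("full network:", full)
print("half network:", half)
```

Because both outputs are averages over neurons, the pruned network needs no extra rescaling beyond dividing by the number of surviving neurons. With trained parameters in place of the random ones, the gap between the two printed values is the kind of quantity a dropout-stability bound would control.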
$\frac{\log N}{N} + \alpha\,(D + \log N)$
$\frac{\log N}{N} + \alpha\,(D + \log N)$
i.i.d. particles that evolve with gradient flow
concentrate to the same limit
connectivity
Online SGD: $\theta_{k+1} = \theta_k + \alpha N^2 \nabla_{\theta_k} \big(\dots\big)^2$
bounds the average distance)