
Surfing: Iterative Optimization Over Incrementally Trained Deep Networks - PowerPoint PPT Presentation

  1. Surfing: Iterative Optimization Over Incrementally Trained Deep Networks. Ganlin Song, Zhou Fan, John Lafferty. Department of Statistics and Data Science, Yale University.

  2. Background. We consider inverting a trained generative network $G$ by solving $\min_x f(x) = \min_x \| G(x) - y \|^2$. [Figures: "Generative Model" and "Invert a Generator"]
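To make the objective concrete, here is a minimal PyTorch sketch of inverting a generator by gradient descent. The generator `G`, the target `y`, the latent dimension, the step count, and the learning rate are all illustrative assumptions, not details from the presentation.

```python
import torch

def invert(G, y, latent_dim, steps=1000, lr=0.05):
    """Gradient descent on f(x) = ||G(x) - y||^2 over the latent vector x."""
    x = torch.randn(latent_dim, requires_grad=True)  # random initialization
    opt = torch.optim.SGD([x], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = ((G(x) - y) ** 2).sum()  # squared reconstruction error
        loss.backward()
        opt.step()
    return x.detach()
```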

  3. Background
     • Compressed sensing framework: observe $z = Ay + \epsilon$; recover $y$ by solving $\min_x f(x) = \min_x \| A G(x) - z \|^2$ (Bora, Jalal, Price & Dimakis 2017)
     • $f(x)$ is non-convex; gradient descent is not guaranteed to reach the global optimum
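The same gradient-descent sketch adapts to the compressed-sensing objective by comparing in measurement space. The matrix `A` and observation `z` follow the slide's setup ($z = Ay + \epsilon$); the function name and hyperparameters remain illustrative.

```python
import torch

def invert_compressed(G, A, z, latent_dim, steps=1000, lr=0.05):
    """Gradient descent on f(x) = ||A G(x) - z||^2 over the latent vector x."""
    x = torch.randn(latent_dim, requires_grad=True)
    opt = torch.optim.SGD([x], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        residual = A @ G(x).reshape(-1) - z  # compare in measurement space
        loss = (residual ** 2).sum()
        loss.backward()
        opt.step()
    return x.detach()
```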

  4. Motivation. Landscape of $x \mapsto -f_\theta(x) = -\| G_\theta(x) - y \|^2$ as the weights $\theta$ are trained.

  5. Algorithm
     Intuition
     • The landscape for the initial random network is "nice"
     • Initialize with the random network and track the optimum through the intermediate networks
     Surfing Algorithm (see the sketch below)
     • Obtain a sequence of parameters $\theta_0, \theta_1, \ldots, \theta_T$ during training
     • Optimize the empirical risk functions $f_{\theta_0}, f_{\theta_1}, \ldots, f_{\theta_T}$ iteratively using gradient descent
     • For each $t \in \{1, \ldots, T\}$, initialize gradient descent at the solution from time $t - 1$
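A minimal sketch of the surfing procedure as described on this slide: gradient descent is run on each intermediate objective $f_{\theta_t}$, warm-started at the solution from time $t - 1$. The checkpoint format, the `make_G` helper, and the inner-loop settings are assumptions for illustration.

```python
import torch

def surfing(checkpoints, make_G, y, latent_dim, inner_steps=200, lr=0.05):
    """Track the minimizer of f_theta(x) = ||G_theta(x) - y||^2 across training."""
    x = torch.randn(latent_dim)              # start from the random initial network
    for theta in checkpoints:                # theta_0, theta_1, ..., theta_T
        G = make_G(theta)
        x = x.clone().requires_grad_(True)   # warm start at the previous solution
        opt = torch.optim.SGD([x], lr=lr)
        for _ in range(inner_steps):
            opt.zero_grad()
            loss = ((G(x) - y) ** 2).sum()
            loss.backward()
            opt.step()
        x = x.detach()
    return x
```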

  6. Theory and Experiments
     Theoretical Results
     1. If $G_\theta$ has random parameters, all critical points of $f_\theta(x)$ belong to a small neighborhood around 0 with high probability (builds on Hand & Voroninski 2017)
     2. Under certain conditions, modified surfing can track the minimizer
     Experiments
     For a DCGAN trained on Fashion-MNIST: $\min_x \| G_\theta(x) - G_\theta(x_0) \|^2$ and $\min_x \| A G_\theta(x) - A y \|^2$
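A hedged usage example of the surfing sketch above on the slide's first objective, $\min_x \| G_\theta(x) - G_\theta(x_0) \|^2$. A small stand-in generator replaces the DCGAN, and the checkpoint sequence is faked with freshly initialized networks; in the actual experiments these would be parameter snapshots saved while training a DCGAN on Fashion-MNIST.

```python
import torch
import torch.nn as nn

latent_dim = 100

def new_G():
    # Stand-in generator; the paper's experiments use a DCGAN on Fashion-MNIST.
    return nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, 784))

# theta_0, ..., theta_T would normally be snapshots saved during GAN training;
# here we fake a short sequence with freshly initialized networks.
checkpoints = [new_G().state_dict() for _ in range(5)]

def make_G(state_dict):
    G = new_G()
    G.load_state_dict(state_dict)
    return G.eval()

x0 = torch.randn(latent_dim)
y = make_G(checkpoints[-1])(x0).detach()             # target G_theta(x_0)
x_hat = surfing(checkpoints, make_G, y, latent_dim)  # sketch from slide 5
```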
