Fractional Underdamped Langevin Dynamics



  1. FRACTIONAL UNDERDAMPED LANGEVIN DYNAMICS: RETARGETING SGD WITH MOMENTUM UNDER HEAVY-TAILED GRADIENT NOISE
     Umut Şimşekli* (LTCI, Télécom Paris, Institut Polytechnique de Paris), Lingjiong Zhu* (Florida State University), Yee Whye Teh (University of Oxford), Mert Gürbüzbalaban (Rutgers University)
     *equal contribution
     International Conference on Machine Learning, 2020

  2. DEEP LEARNING & SGD-MOMENTUM
     § Deep learning (in general): minimize a non-convex cost function $f$ over the network weights $x$, averaged over $n$ data points
       $$x^\star = \arg\min_{x \in \mathbb{R}^d} \Big\{ f(x) \triangleq \frac{1}{n} \sum_{i=1}^{n} f^{(i)}(x) \Big\}$$
     § Optimization algorithm: Stochastic Gradient Descent (SGD) with momentum
       $$\tilde{v}_{k+1} = \tilde{\gamma}\, \tilde{v}_k - \tilde{\eta}\, \nabla \tilde{f}_{k+1}(x_k)$$
       $$x_{k+1} = x_k + \tilde{v}_{k+1}$$
       $$\nabla \tilde{f}_k(x) \triangleq \frac{1}{b} \sum_{i \in \Omega_k} \nabla f^{(i)}(x)$$
       where $\tilde{v}_k$ is the velocity (momentum), $\tilde{\eta}$ the step-size (learning rate), $\tilde{\gamma}$ the momentum decay, $\nabla \tilde{f}_k$ the stochastic gradient, $\Omega_k$ the minibatch, and $b$ the minibatch size.
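The SGD-with-momentum recursion on this slide can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the scalar toy cost $f^{(i)}(x) = \tfrac{1}{2}(x - a_i)^2$, the synthetic data, and the names `minibatch_grad` and `sgd_momentum` are hypothetical and do not come from the presentation.

```python
import random

def minibatch_grad(x, data, batch):
    # Stochastic gradient (1/b) * sum_{i in batch} grad f^(i)(x)
    # for the assumed toy cost f^(i)(x) = 0.5 * (x - a_i)**2.
    return sum(x - data[i] for i in batch) / len(batch)

def sgd_momentum(data, eta=0.01, gamma=0.9, b=8, steps=2000, seed=0):
    # eta: step-size, gamma: momentum decay, b: minibatch size.
    rng = random.Random(seed)
    x, v = 0.0, 0.0
    for _ in range(steps):
        batch = rng.sample(range(len(data)), b)               # minibatch Omega_k
        v = gamma * v - eta * minibatch_grad(x, data, batch)  # velocity update
        x = x + v                                             # position update
    return x

# Synthetic data points a_i (an assumption for this sketch).
data_rng = random.Random(1)
data = [data_rng.gauss(3.0, 1.0) for _ in range(256)]

x_star = sum(data) / len(data)  # exact minimizer of the toy cost
x_hat = sgd_momentum(data)
print(abs(x_hat - x_star))
```

With `b=len(data)` the gradient is exact and the iterate should settle on `x_star`; with a small minibatch it should hover around `x_star`, which is the gradient noise that the rest of the talk analyzes.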

  3. UNDERSTANDING SGD-M
     § Theory: better established for convex problems, still in an early phase for non-convex problems
     § Useful approach for analysis → Stochastic Differential Equations (SDE)
       gradient noise: $U_k(x) \triangleq \nabla \tilde{f}_k(x) - \nabla f(x)$
     § If we assume $\mathbb{E}\|U_k(x)\|^2 < \infty$ and invoke the CLT: $U_k \sim$ Gaussian
     § SGD-m → Euler-Maruyama discretization of the SDE (underdamped, a.k.a. kinetic, Langevin dynamics; Dalalyan & Riou-Durand '18, Gao et al. '18):
       $$\mathrm{d}v_t = -\big(\gamma v_t + \nabla f(x_t)\big)\,\mathrm{d}t + \sqrt{\frac{2\gamma}{\beta}}\,\mathrm{d}B_t$$
       $$\mathrm{d}x_t = v_t\,\mathrm{d}t$$
       where $\gamma$ is the friction, $\beta$ the inverse temperature, and $B_t$ a Brownian motion.
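A plain Euler-Maruyama discretization of the underdamped Langevin SDE on this slide might look as follows. It is a minimal sketch, not the authors' implementation: the scalar quadratic potential $f(x) = \tfrac{1}{2}x^2$, the step size `h`, and the names `grad_f` and `uld_euler_maruyama` are assumptions made for illustration.

```python
import math
import random

def grad_f(x):
    # Gradient of the assumed toy potential f(x) = 0.5 * x**2.
    return x

def uld_euler_maruyama(grad, x0=0.0, v0=0.0, gamma=1.0, beta=4.0,
                       h=0.01, steps=200_000, seed=0):
    # gamma: friction, beta: inverse temperature, h: discretization step.
    rng = random.Random(seed)
    # Brownian increment over a step of length h, scaled by sqrt(2*gamma/beta).
    noise_scale = math.sqrt(2.0 * gamma * h / beta)
    x, v = x0, v0
    xs = []
    for _ in range(steps):
        xi = rng.gauss(0.0, 1.0)
        # Both right-hand sides use the old (x, v): explicit Euler-Maruyama.
        x, v = x + h * v, v - h * (gamma * v + grad(x)) + noise_scale * xi
        xs.append(x)
    return xs

xs = uld_euler_maruyama(grad_f)
tail = xs[len(xs) // 10:]  # discard the first 10% as burn-in
mean_x = sum(tail) / len(tail)
var_x = sum((s - mean_x) ** 2 for s in tail) / len(tail)
print(var_x, 1.0 / 4.0)    # empirical variance vs. 1/beta
```

For this quadratic potential the stationary law of $x_t$ is Gaussian with variance $1/\beta$, so the two printed numbers should be close, up to discretization and sampling error. If the position step uses the freshly updated velocity instead, and the velocity is rescaled as $\tilde{v} = h v$, the recursion takes the same shape as SGD with momentum, with $\tilde{\gamma} = 1 - \gamma h$ and $\tilde{\eta} = h^2$.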
