Mirrored Langevin Dynamics


  1. Mirrored Langevin Dynamics. Ya-Ping Hsieh, https://lions.epfl.ch. Laboratory for Information and Inference Systems (LIONS), École Polytechnique Fédérale de Lausanne (EPFL), Switzerland. NeurIPS Spotlight [Dec 6th, 2018]. Joint work with Ali Kavis, Paul Rolland, and Volkan Cevher @ LIONS.

  2–4. Introduction
◦ Task: given a target distribution dµ = e^{−V(x)} dx, generate samples from µ.
⊲ Fundamental in machine learning, statistics, computer science, etc.
◦ A scalable framework: first-order sampling (assuming access to ∇V).
⊲ Step 1. Langevin dynamics: dX_t = −∇V(X_t) dt + √2 dB_t  ⇒  X_∞ ∼ e^{−V}.
⊲ Step 2. Discretize: x_{k+1} = x_k − β_k ∇V(x_k) + √(2β_k) ξ_k, where β_k is the step size and ξ_k is standard normal; this is a strong analogy to the gradient descent method.
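To make Step 2 concrete, here is a minimal sketch of the discretized (unadjusted) Langevin update; the target (a standard Gaussian, so ∇V(x) = x), the step size, and the iteration count are illustrative assumptions, not values from the slides.

```python
import numpy as np

def grad_V(x):
    # Illustrative target (an assumption): standard Gaussian, V(x) = ||x||^2 / 2, so grad V(x) = x.
    return x

def langevin_sampler(grad_V, x0, n_steps=5000, beta=1e-2, seed=0):
    """Unadjusted Langevin: x_{k+1} = x_k - beta_k * grad_V(x_k) + sqrt(2 * beta_k) * xi_k."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    samples = []
    for _ in range(n_steps):
        xi = rng.standard_normal(x.shape)            # xi_k ~ N(0, I)
        x = x - beta * grad_V(x) + np.sqrt(2 * beta) * xi
        samples.append(x.copy())
    return np.array(samples)

# Usage: the chain approximately targets a 2-d standard Gaussian.
samples = langevin_sampler(grad_V, x0=np.zeros(2))
print(samples.mean(axis=0), samples.var(axis=0))     # roughly zero mean, roughly unit variance
```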

  5–6. Recent progress: Unconstrained distributions are easy
◦ State of the art when dom(V) = R^d:
⊲ mI ⪯ ∇²V ⪯ LI:  W₂ = Õ(ε⁻² d),  d_TV = Õ(ε⁻² d),  KL = Õ(ε⁻¹ d)  [Cheng and Bartlett, 2017; Dalalyan and Karagulyan, 2017; Durmus et al., 2018].
⊲ 0 ⪯ ∇²V ⪯ LI:  W₂ = —,  d_TV = Õ(ε⁻⁴ d),  KL = Õ(ε⁻² d)  [Durmus et al., 2018].
⊲ Note: W₂(µ₁, µ₂) ≔ inf_{X∼µ₁, Y∼µ₂} √(E‖X − Y‖²) and d_TV(µ₁, µ₂) ≔ sup_{A Borel} |µ₁(A) − µ₂(A)|.
◦ What about constrained distributions?
⊲ They include many important applications, such as Latent Dirichlet Allocation (LDA).

  7. A challenge: Constrained distributions are hard
◦ When dom(V) is compact, convergence rates deteriorate significantly.
⊲ mI ⪯ ∇²V ⪯ LI:  W₂ or KL = ?,  d_TV = Õ(ε⁻⁶ d⁵)  [Brosse et al., 2017].
⊲ 0 ⪯ ∇²V ⪯ LI:  W₂ or KL = ?,  d_TV = Õ(ε⁻⁶ d⁵)  [Brosse et al., 2017].
⊲ Cf. when V is unconstrained: Õ(ε⁻⁴ d) convergence in d_TV.
⊲ Projection is not a solution: slow rates [Bubeck et al., 2015] and boundary issues.

  8–10. Unconstrained optimization of constrained problems
◦ Entropic mirror descent: unconstrained optimization within the simplex, min_{x ∈ Δ_d} V(x).
⊲ Choose h to be the entropic mirror map and h⋆ its Fenchel dual.
⊲ Mirror vs. primal image: y = ∇h(x) ⇔ x = ∇h⋆(y).
⊲ Update y_{k+1} = y_k − β_k ∇V(x_k)  ⇒  no projection is needed, since dom(h⋆) = R^d.
◦ Is there a "mirror descent theory" for Langevin dynamics?
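For concreteness, here is a minimal sketch of the entropic update: for the entropic mirror map h(x) = Σᵢ xᵢ log xᵢ on the simplex, ∇h⋆(y) is the softmax map, so the dual iterate takes plain gradient steps in R^d while the primal iterate always stays inside Δ_d. The quadratic objective, step size, and iteration count below are illustrative assumptions, not from the slides.

```python
import numpy as np

def softmax(y):
    # grad h*(y) for the entropic mirror map h(x) = sum_i x_i log x_i on the simplex.
    z = np.exp(y - y.max())
    return z / z.sum()

def entropic_mirror_descent(grad_V, d, n_steps=2000, beta=0.1):
    """Dual update y_{k+1} = y_k - beta * grad_V(x_k), with primal image x_k = softmax(y_k)."""
    y = np.zeros(d)                      # dom(h*) = R^d: y is unconstrained
    for _ in range(n_steps):
        x = softmax(y)                   # x_k = grad h*(y_k), always inside the simplex
        y = y - beta * grad_V(x)         # unconstrained gradient step in the mirror space
    return softmax(y)

# Illustrative objective (an assumption): V(x) = 0.5 * ||x - c||^2 with c already in the simplex,
# so the minimizer over the simplex is c itself.
c = np.array([0.7, 0.2, 0.1])
x_star = entropic_mirror_descent(lambda x: x - c, d=3)
print(x_star)                            # approximately [0.7, 0.2, 0.1], with no projection step
```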

  11–13. Mirrored Langevin Dynamics (MLD)
◦ Given e^{−V} and h, compute the push-forward e^{−W} ≔ ∇h_# e^{−V}.
◦ MLD:  dY_t = −∇W ∘ ∇h(X_t) dt + √2 dB_t,  X_t = ∇h⋆(Y_t)  ⇒  X_∞ ∼ e^{−V}.
◦ Discretize:  y_{k+1} = y_k − β_k ∇W(y_k) + √(2β_k) ξ_k,  x_{k+1} = ∇h⋆(y_{k+1}).
◦ The dual distribution e^{−W} can be unconstrained even if e^{−V} is constrained.
⊲ Convergence rates for e^{−W} are easy.
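A minimal sketch of the discretized MLD, worked out for a one-dimensional example chosen here for illustration (the slides discuss Dirichlet/LDA targets instead): a Beta(a, b) target on (0, 1) with the mirror map h(x) = x log x + (1 − x) log(1 − x), so ∇h(x) = logit(x) and ∇h⋆(y) = σ(y), the sigmoid. Under these assumptions, the change of variables behind e^{−W} = ∇h_# e^{−V} gives W(y) = −a log σ(y) − b log(1 − σ(y)), hence ∇W(y) = (a + b)σ(y) − a; this closed form and all constants below are a worked assumption, not taken from the slides.

```python
import numpy as np

def sigmoid(y):
    # grad h*(y) for h(x) = x log x + (1 - x) log(1 - x) on (0, 1); grad h(x) = logit(x).
    return 1.0 / (1.0 + np.exp(-y))

def mld_beta(a, b, n_steps=20000, beta=1e-2, seed=0):
    """Mirrored Langevin for a Beta(a, b) target on (0, 1) (illustrative example).

    Dual step:   y_{k+1} = y_k - beta * grad_W(y_k) + sqrt(2 * beta) * xi_k
    Primal map:  x_{k+1} = grad h*(y_{k+1}) = sigmoid(y_{k+1})
    where e^{-W} is the push-forward of the Beta density under logit,
    giving grad_W(y) = (a + b) * sigmoid(y) - a (derived for this example).
    """
    rng = np.random.default_rng(seed)
    y = 0.0
    xs = []
    for _ in range(n_steps):
        grad_W = (a + b) * sigmoid(y) - a
        y = y - beta * grad_W + np.sqrt(2 * beta) * rng.standard_normal()
        xs.append(sigmoid(y))            # primal samples always stay inside (0, 1)
    return np.array(xs)

# Usage: the empirical mean should be close to a / (a + b).
xs = mld_beta(a=2.0, b=5.0)
print(xs.mean())                         # roughly 2 / 7 ≈ 0.286
```

In this example the dual potential W is convex and smooth on all of R, which is exactly the point of the construction: the constrained Beta target becomes an unconstrained, well-behaved sampling problem in the mirror space.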

  14. Benefits of MLD
◦ Improved rates for constrained sampling.
◦ Can turn non-convex problems into convex ones!
⊲ We provide the first Õ(1/√T) rate for Latent Dirichlet Allocation.
◦ Works well in practice.

  15. For more details, welcome to our poster #43!

  16. References
[1] Brosse, N., Durmus, A., Moulines, É., and Pereyra, M. (2017). Sampling from a log-concave distribution with compact support with proximal Langevin Monte Carlo. arXiv preprint arXiv:1705.08964.
[2] Bubeck, S., Eldan, R., and Lehec, J. (2015). Sampling from a log-concave distribution with projected Langevin Monte Carlo. arXiv preprint arXiv:1507.02564.
[3] Cheng, X. and Bartlett, P. (2017). Convergence of Langevin MCMC in KL-divergence. arXiv preprint arXiv:1705.09048.
[4] Dalalyan, A. S. and Karagulyan, A. G. (2017). User-friendly guarantees for the Langevin Monte Carlo with inaccurate gradient. arXiv preprint arXiv:1710.00095.
[5] Durmus, A., Majewski, S., and Miasojedow, B. (2018). Analysis of Langevin Monte Carlo via convex optimization. arXiv preprint arXiv:1802.09188.
