Stochastic Hamiltonian Gradient Methods for Smooth Games
Nicolas Loizou, joint work with Hugo Berard, Alexia Jolicoeur-Martineau, Pascal Vincent†, Simon Lacoste-Julien†, and Ioannis Mitliagkas†. ICML 2020. († Canada CIFAR AI Chair)

Overview
1. Min-max Optimization Problem
   - Motivation
   - Related Work
   - Main Contributions
2. Classes of Stochastic Games and Hamiltonian Viewpoint
3. Stochastic Hamiltonian Gradient Methods
   - Stochastic Hamiltonian Gradient Descent
   - Stochastic Variance Reduced Hamiltonian Gradient Method
   - Convergence Guarantees
4. Numerical Experiments
5. Conclusion & Future Directions of Research

The Min-Max Optimization Problem

Problem: stochastic smooth game,

    \min_{x_1 \in \mathbb{R}^{d_1}} \max_{x_2 \in \mathbb{R}^{d_2}} g(x_1, x_2) = \frac{1}{n} \sum_{i=1}^{n} g_i(x_1, x_2),    (1)

where g : \mathbb{R}^{d_1} \times \mathbb{R}^{d_2} \to \mathbb{R} is a smooth objective.

Goal: find a min-max solution / Nash equilibrium, i.e. a point x^* = (x_1^*, x_2^*) \in \mathbb{R}^d such that, for every x_1 \in \mathbb{R}^{d_1} and x_2 \in \mathbb{R}^{d_2},

    g(x_1^*, x_2) \le g(x_1^*, x_2^*) \le g(x_1, x_2^*).

(A small numerical check of these inequalities follows this slide.)

The problem appears in many applications:
- Domain generalization (Albuquerque et al., 2019)
- Generative adversarial networks (GANs) (Goodfellow et al., 2014)
- Formulations in reinforcement learning (Pfau & Vinyals, 2016)
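A quick numerical sanity check of the saddle-point inequalities above (my own toy example, not from the slides): the scalar convex-concave game g(x_1, x_2) = x_1^2 - x_2^2 has the unique min-max solution x^* = (0, 0).

    import numpy as np

    def g(x1, x2):
        # Toy convex-concave objective with Nash equilibrium at (0, 0).
        return x1 ** 2 - x2 ** 2

    x1_star, x2_star = 0.0, 0.0
    rng = np.random.default_rng(0)
    for _ in range(5):
        x1, x2 = rng.normal(size=2)
        # g(x1*, x2) <= g(x1*, x2*) <= g(x1, x2*)
        assert g(x1_star, x2) <= g(x1_star, x2_star) <= g(x1, x2_star)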


Related Work

Deterministic games: last-iterate convergence guarantees. Classic results (Korpelevich, 1976; Nemirovski, 2004) and recent results (Mescheder et al., 2017; Daskalakis et al., 2017; Gidel et al., 2018b; Azizian et al., 2019).

Stochastic games: convergent methods rely on iterate averaging over compact domains (Nemirovski, 2004). Palaniappan & Bach (2016) and Chavdarova et al. (2019) proposed methods with last-iterate convergence guarantees over a non-compact domain, but only under a strong-monotonicity assumption.

Second-order methods: the consensus optimization method (Mescheder et al., 2017) and Hamiltonian gradient descent (Balduzzi et al., 2018; Abernethy et al., 2019). No analysis is available for the stochastic problem.


Main Contributions

1. First global, non-asymptotic, last-iterate convergence guarantees in the stochastic setting (without assuming strong monotonicity or a bounded domain), including a class of non-convex non-concave games.
2. First convergence analysis of stochastic Hamiltonian methods for solving min-max problems. Existing papers on these methods are empirical (Mescheder et al., 2017; Balduzzi et al., 2018).
3. A novel unbiased estimator of the Hamiltonian gradient, the crucial ingredient for proving convergence of the proposed methods (existing methods use biased estimators).
4. First stochastic Hamiltonian variance-reduced method (with linear convergence guarantees).

Hamiltonian perspective: popular stochastic optimization algorithms can be used as methods for solving stochastic min-max problems.

Smooth Games and Hamiltonian Gradient Descent

    \min_{x_1 \in \mathbb{R}^{d_1}} \max_{x_2 \in \mathbb{R}^{d_2}} g(x_1, x_2)    (2)

For x = (x_1, x_2)^\top \in \mathbb{R}^d, define the vector field \xi and its Jacobian J:

    \xi(x) = \begin{pmatrix} \nabla_{x_1} g \\ -\nabla_{x_2} g \end{pmatrix}, \qquad J = \nabla \xi = \begin{pmatrix} \nabla^2_{x_1,x_1} g & \nabla^2_{x_1,x_2} g \\ -\nabla^2_{x_2,x_1} g & -\nabla^2_{x_2,x_2} g \end{pmatrix}.

A vector x^* \in \mathbb{R}^d is a stationary point when \xi(x^*) = 0.

Key assumption: all stationary points of the objective g are global min-max solutions.

Hamiltonian gradient descent (HGD) (Balduzzi et al., 2018) minimizes the Hamiltonian:

    \min_x H(x) = \frac{1}{2} \| \xi(x) \|^2.    (3)

The HGD update can be expressed using a Jacobian-vector product:

    x^{k+1} = x^k - \eta_k \nabla H(x^k) = x^k - \eta_k \left[ J^\top \xi \right].
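As a minimal illustration of the HGD update (my own sketch, not the authors' code), the snippet below uses JAX reverse-mode automatic differentiation to form the product J(x)^\top \xi(x) without ever building the Jacobian J. The toy objective, dimensions, and step size are illustrative assumptions.

    import jax
    import jax.numpy as jnp

    def g(x1, x2):
        # Toy convex-concave objective (an assumption for this sketch).
        return x1 @ x2 + 0.1 * (x1 @ x1) - 0.1 * (x2 @ x2)

    def xi(x):
        # Vector field (grad_{x1} g, -grad_{x2} g) for x = (x1, x2).
        x1, x2 = jnp.split(x, 2)
        g1, g2 = jax.grad(g, argnums=(0, 1))(x1, x2)
        return jnp.concatenate([g1, -g2])

    def hgd_step(x, eta):
        # vjp_fn(v) returns v^T J, which as an array equals J^T v.
        v, vjp_fn = jax.vjp(xi, x)          # v = xi(x)
        (grad_H,) = vjp_fn(v)               # grad H(x) = J(x)^T xi(x)
        return x - eta * grad_H

    x = jnp.ones(4)                          # d1 = d2 = 2
    for _ in range(200):
        x = hgd_step(x, eta=0.5)
    print(x)                                 # approaches the stationary point (0, 0)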

Stochastic Hamiltonian Function

    \min_{x_1 \in \mathbb{R}^{d_1}} \max_{x_2 \in \mathbb{R}^{d_2}} g(x_1, x_2) = \frac{1}{n} \sum_{i=1}^{n} g_i(x_1, x_2)    (4)

    \xi_i(x) = \begin{pmatrix} \nabla_{x_1} g_i \\ -\nabla_{x_2} g_i \end{pmatrix}, \qquad J_i = \begin{pmatrix} \nabla^2_{x_1,x_1} g_i & \nabla^2_{x_1,x_2} g_i \\ -\nabla^2_{x_2,x_1} g_i & -\nabla^2_{x_2,x_2} g_i \end{pmatrix}, \qquad J = \frac{1}{n} \sum_{i=1}^{n} J_i.

Finite-sum structure of the Hamiltonian function:

    H(x) = \frac{1}{n^2} \sum_{i,j=1}^{n} H_{i,j}(x), \quad \text{where} \quad H_{i,j}(x) = \frac{1}{2} \langle \xi_i(x), \xi_j(x) \rangle.    (5)

The algorithms use the gradient of only one component function H_{i,j}(x):

    \nabla H_{i,j}(x) = \frac{1}{2} \left[ J_i^\top \xi_j + J_j^\top \xi_i \right].    (6)

This is an unbiased estimator of \nabla H(x), that is, \mathbb{E}_{i,j}[\nabla H_{i,j}(x)] = \nabla H(x).
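The sketch below (my own illustration, not the paper's released code) makes the estimator in (6) concrete: it builds a small random stochastic bilinear game, forms \nabla H_{i,j}(x) with JAX vector-Jacobian products, and checks numerically that averaging over all pairs (i, j) recovers the full Hamiltonian gradient \nabla H(x). The data, dimensions, and uniform sampling over pairs are illustrative assumptions.

    import jax
    import jax.numpy as jnp

    n, d1, d2 = 4, 3, 3
    key = jax.random.PRNGKey(0)
    kA, kb, kc, kx = jax.random.split(key, 4)
    A = jax.random.normal(kA, (n, d1, d2))     # toy per-component payoff data
    b = jax.random.normal(kb, (n, d1))
    c = jax.random.normal(kc, (n, d2))

    def g_i(x1, x2, i):
        # i-th component of a stochastic bilinear game.
        return x1 @ b[i] + x1 @ (A[i] @ x2) + c[i] @ x2

    def xi_comp(x, i):
        # Component vector field xi_i(x) = (grad_{x1} g_i, -grad_{x2} g_i).
        x1, x2 = x[:d1], x[d1:]
        g1, g2 = jax.grad(g_i, argnums=(0, 1))(x1, x2, i)
        return jnp.concatenate([g1, -g2])

    def grad_H_ij(x, i, j):
        # Unbiased estimator: (1/2) (J_i^T xi_j + J_j^T xi_i), via two VJPs.
        xi_i_val, vjp_i = jax.vjp(lambda y: xi_comp(y, i), x)   # vjp_i(v) = J_i^T v
        xi_j_val, vjp_j = jax.vjp(lambda y: xi_comp(y, j), x)
        return 0.5 * (vjp_i(xi_j_val)[0] + vjp_j(xi_i_val)[0])

    def grad_H(x):
        # Full Hamiltonian gradient: grad of (1/2) || (1/n) sum_i xi_i(x) ||^2.
        xi_mean = lambda y: jnp.mean(jax.vmap(lambda i: xi_comp(y, i))(jnp.arange(n)), axis=0)
        return jax.grad(lambda y: 0.5 * jnp.sum(xi_mean(y) ** 2))(x)

    x = jax.random.normal(kx, (d1 + d2,))
    estimates = [grad_H_ij(x, i, j) for i in range(n) for j in range(n)]
    avg = jnp.mean(jnp.stack(estimates), axis=0)
    assert jnp.allclose(avg, grad_H(x), atol=1e-5)   # E_{i,j}[grad H_{i,j}] = grad H

    # A single stochastic HGD step would then be x <- x - gamma * grad_H_ij(x, i, j)
    # with i, j sampled independently and uniformly from {1, ..., n}.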


Classes of Stochastic Smooth Games

Stochastic bilinear games:

    g(x_1, x_2) = \frac{1}{n} \sum_{i=1}^{n} \left[ x_1^\top b_i + x_1^\top A_i x_2 + c_i^\top x_2 \right]    (7)

Proposition: For a stochastic bilinear game (7), the stochastic Hamiltonian function (5) is a smooth, quadratic, quasi-strongly convex function. (A numerical sanity check follows this slide.)

Stochastic sufficiently bilinear games (Abernethy et al., 2019): games for which the condition

    (\delta^2 + \rho^2)(\delta^2 + \beta^2) - 4 L^2 \Delta^2 > 0    (8)

holds, where 0 < \delta \le \sigma_i \left[ \nabla^2_{x_1,x_2} g \right] \le \Delta, \quad \rho^2 = \min_{x_1,x_2} \lambda_{\min} \left[ \nabla^2_{x_1,x_1} g(x_1, x_2) \right]^2, \quad \text{and} \quad \beta^2 = \min_{x_1,x_2} \lambda_{\min} \left[ \nabla^2_{x_2,x_2} g(x_1, x_2) \right]^2.

Proposition: For a stochastic sufficiently bilinear game, the stochastic Hamiltonian function (5) is smooth and satisfies the Polyak-Lojasiewicz (PL) condition.
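As a sanity check of the first proposition (my own sketch, not from the paper): for the bilinear game (7), \xi(x) = (\bar{A} x_2 + \bar{b}, -(\bar{A}^\top x_1 + \bar{c})) with \bar{A} = \frac{1}{n} \sum_i A_i (and similarly \bar{b}, \bar{c}), so H(x) = \frac{1}{2} \|\bar{A}^\top x_1 + \bar{c}\|^2 + \frac{1}{2} \|\bar{A} x_2 + \bar{b}\|^2 is a quadratic with block-diagonal Hessian \mathrm{diag}(\bar{A}\bar{A}^\top, \bar{A}^\top\bar{A}). The snippet below verifies positive semi-definiteness (convexity) on random data; quasi-strong convexity, the sharper property in the proposition, also follows from this quadratic structure.

    import numpy as np

    rng = np.random.default_rng(0)
    n, d1, d2 = 5, 4, 3
    A_bar = rng.normal(size=(n, d1, d2)).mean(axis=0)   # averaged payoff matrix (toy data)

    # Hessian of H(x) = 1/2 ||A_bar^T x1 + c_bar||^2 + 1/2 ||A_bar x2 + b_bar||^2
    hess_H = np.block([
        [A_bar @ A_bar.T,     np.zeros((d1, d2))],
        [np.zeros((d2, d1)),  A_bar.T @ A_bar   ],
    ])
    eig_min = np.linalg.eigvalsh(hess_H).min()
    assert eig_min >= -1e-10   # positive semi-definite: H is a convex quadratic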
