Escaping Saddle Points in Constant Dimensional Spaces: an Agent-based Modeling Perspective (PowerPoint presentation)



SLIDE 1

Escaping Saddle Points in Constant Dimensional Spaces: an Agent-based Modeling Perspective

Grant Schoenebeck, University of Michigan Fang-Yi Yu, Harvard University

SLIDE 2

Results

  • Analyze the convergence rate of a family of stochastic processes
  • Three related applications
– Evolutionary game theory
– Dynamics on social networks
– Stochastic gradient descent

[Venn diagram: Evolutionary Game Theory, Dynamics on Social Networks, Stochastic Gradient Descent]

SLIDE 3

Target Audience

[Venn diagram (not to scale): Evolutionary Game Theory, Dynamics on Social Networks, Stochastic Gradient Descent]

SLIDE 4

Target Audience

[Venn diagram (not to scale): Evolutionary Game Theory, Dynamics on Social Networks, Stochastic Gradient Descent]

SLIDE 5

Target Audience (still not to scale)

[Venn diagram: Evolutionary Game Theory, Stochastic Gradient Descent, Dynamics on Social Networks]

SLIDE 6

Outline

  • Escaping saddle points

[Venn diagram: Stochastic Gradient Descent, Dynamics on Social Networks, Evolutionary Game Theory]

SLIDE 7

Outline

  • Escaping saddle points
  • Case study: dynamics on social networks

[Venn diagram: Stochastic Gradient Descent, Dynamics on Social Networks, Evolutionary Game Theory]

SLIDE 8

ESCAPING SADDLE POINTS

Upper bounds and lower bounds

SLIDE 9

Reinforced random walk with F

A discrete-time stochastic process {X_t : t = 0, 1, …} in ℝ^d that admits the representation
X_{t+1} − X_t = (1/n)(F(X_t) + U_t)

[Diagram: one step from X_t to X_{t+1}, decomposed into drift (1/n)F(X_t) and noise (1/n)U_t]

SLIDE 10

Reinforced random walk with F

A discrete-time stochastic process {X_t : t = 0, 1, …} in ℝ^d that admits the representation
X_{t+1} − X_t = (1/n)(F(X_t) + U_t)

  • Expected difference (drift): F(X_t)

[Diagram: step from X_t to X_{t+1} with drift (1/n)F(X_t) and noise (1/n)U_t]

SLIDE 11

Reinforced random walk with F

A discrete-time stochastic process {X_t : t = 0, 1, …} in ℝ^d that admits the representation
X_{t+1} − X_t = (1/n)(F(X_t) + U_t)

  • Expected difference (drift): F(X_t)
  • Unbiased noise: U_t

[Diagram: step from X_t to X_{t+1} with drift (1/n)F(X_t) and noise (1/n)U_t]

SLIDE 12

Reinforced random walk with F

A discrete-time stochastic process {X_t : t = 0, 1, …} in ℝ^d that admits the representation
X_{t+1} − X_t = (1/n)(F(X_t) + U_t)

  • Expected difference (drift): F(X_t)
  • Unbiased noise: U_t
  • Step size: 1/n

[Diagram: step from X_t to X_{t+1} with drift (1/n)F(X_t) and noise (1/n)U_t]

SLIDE 13

Examples

A discrete-time Markov process {X_t : t = 0, 1, …} in ℝ^d that admits the representation
X_{t+1} − X_t = (1/n)(F(X_t) + U_t)

  • Agent-based models with n agents
– Evolutionary games
– Dynamics on social networks
  • Heuristic local search algorithms with uniform step size 1/n
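The recurrence on this slide is easy to simulate directly. Below is a minimal one-dimensional sketch; the names `reinforced_random_walk`, `F`, and `noise` are illustrative choices, not from the talk.

```python
import random

def reinforced_random_walk(F, noise, x0, n, steps):
    """Simulate X_{t+1} - X_t = (1/n) * (F(X_t) + U_t) in one dimension.

    F is the drift, noise(x) draws a zero-mean U_t, and 1/n is the
    uniform step size, matching the slide's representation.
    """
    x = x0
    trajectory = [x]
    for _ in range(steps):
        x = x + (F(x) + noise(x)) / n
        trajectory.append(x)
    return trajectory

# Example: contraction toward 0 with bounded, zero-mean uniform noise.
traj = reinforced_random_walk(
    F=lambda x: -x,
    noise=lambda x: random.uniform(-1, 1),
    x0=1.0, n=100, steps=2000)
```

With drift −x the walk settles near 0 and fluctuates on the scale 1/√n, which is the picture the later slides build on: the drift dominates away from fixed points, the noise matters near them.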
SLIDE 14

Node dynamic on complete graphs [SY18]

  • Let f_ND : [0,1] → [0,1]. n agents interact on a complete graph.
  • Each agent v has an initial binary state C_0(v) ∈ {0,1}.
  • At round t:
  • Pick a node v uniformly at random
  • Compute the fraction of opinion 1: X_t = |C_t⁻¹(1)| / n
  • Update C_{t+1}(v) to 1 w.p. f_ND(X_t); to 0 otherwise

[Figure: complete graph]

SLIDE 15

Node dynamic

Includes several existing dynamics:

  • Voter model
  • Iterative majority [Mossel et al. 14]
  • Iterative 3-majority [Doerr et al. 11]

[Plot: update functions f_ND(y) on [0,1] for Voter, Majority, 3-Majority]
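These three dynamics correspond to concrete update functions f_ND; the following are sketches of their standard forms as I read them (the voter update is the identity, and 3-majority samples three opinions i.i.d. and adopts their majority), not code from the talk.

```python
def voter(y):
    # Voter model: adopt opinion 1 with probability equal to its fraction.
    return y

def majority(y):
    # Iterative majority: adopt the current majority opinion (tie at 1/2).
    return 1.0 if y > 0.5 else (0.5 if y == 0.5 else 0.0)

def three_majority(y):
    # Sample 3 opinions i.i.d. and take their majority:
    # P(at least two 1s) = 3y^2(1-y) + y^3 = 3y^2 - 2y^3.
    return 3 * y ** 2 - 2 * y ** 3
```

All three fix y = 0 and y = 1, and 3-majority pushes away from y = 1/2; that repulsion is what makes 1/2 a non-attracting fixed point of the mean-field dynamics discussed later.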

SLIDE 16

Node dynamic

Node dynamic on complete graphs (definition as on slide 14).

Reinforced random walk on ℝ:

  • Let X_t be the fraction of nodes in state 1 at time t.

SLIDE 17

Node dynamic

Node dynamic on complete graphs (definition as on slide 14).

Reinforced random walk on ℝ:

  • Let X_t be the fraction of nodes in state 1 at time t.
  • Given X_t, the expected number of nodes in state 1 after round t is
E[n X_{t+1} ∣ X_t] = n X_t + (f_ND(X_t) − X_t).

SLIDE 18

Node dynamic

Node dynamic on complete graphs (definition as on slide 14).

Reinforced random walk on ℝ:

  • Let X_t be the fraction of nodes in state 1 at time t.
  • Given X_t, the expected number of nodes in state 1 after round t is
E[n X_{t+1} ∣ X_t] = n X_t + (f_ND(X_t) − X_t).

[Figure annotation: which terms count nodes updated to 1 vs. nodes already at 1]

SLIDE 19

Node dynamic

Node dynamic on complete graphs (definition as on slide 14).

Reinforced random walk on ℝ:

  • Let X_t be the fraction of nodes in state 1 at time t.
  • E[X_{t+1} ∣ X_t] − X_t = (1/n)(f_ND(X_t) − X_t).

Drift: F(X_t) = f_ND(X_t) − X_t

SLIDE 20

Node dynamic

Node dynamic on complete graphs (definition as on slide 14).

Reinforced random walk on ℝ:

  • Let X_t be the fraction of nodes in state 1 at time t.
  • X_{t+1} − X_t = (1/n)(f_ND(X_t) − X_t + U_t).

Drift: f_ND(X_t) − X_t; noise: U_t
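The drift identity above can be checked numerically: averaging the one-step change of the fraction over many trials should recover (1/n)(f_ND(X_t) − X_t). A small sketch, with illustrative function names of my choosing:

```python
import random

def node_dynamic_round(states, f_nd):
    """One round of the node dynamic on a complete graph: pick a node
    uniformly and resample its opinion from f_ND of the current fraction."""
    n = len(states)
    y = sum(states) / n
    v = random.randrange(n)
    states[v] = 1 if random.random() < f_nd(y) else 0
    return states

def empirical_drift(y0, n, f_nd, trials=50000):
    """Average one-step change of the fraction, starting from fraction y0
    (y0 * n is assumed to be an integer count)."""
    k = round(y0 * n)
    total = 0.0
    for _ in range(trials):
        states = [1] * k + [0] * (n - k)
        node_dynamic_round(states, f_nd)
        total += sum(states) / n - k / n
    return total / trials
```

For 3-majority (f_ND(y) = 3y² − 2y³) at y = 0.6 with n = 10, the predicted drift is (f_ND(0.6) − 0.6)/10 = 0.0048, and the empirical average concentrates around that value.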

SLIDE 21

Question

Given F and U, what is the limit of X_t for sufficiently large n?
X_{t+1} − X_t = (1/n)(F(X_t) + U_t)

SLIDE 22

Mean field approximation

X_{t+1} − X_t = (1/n)(F(X_t) + U(X_t))

y′ = F(y)

SLIDE 23

Mean field approximation

If n is large enough, then for t = O(n), X_t ≈ y(t/n) [Wormald et al. 95].
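The mean-field ODE y′ = F(y) can be integrated with a simple Euler scheme to see where the deterministic flow goes; a sketch (the step size and function names are my choices, not from the talk):

```python
def mean_field_trajectory(F, y0, dt=0.1, steps=500):
    """Euler integration of the mean-field ODE y' = F(y)."""
    y = y0
    traj = [y]
    for _ in range(steps):
        y = y + dt * F(y)
        traj.append(y)
    return traj

# Example: 3-majority drift F(y) = f_ND(y) - y = 3y^2 - 2y^3 - y,
# whose fixed points are y = 0, 1/2, 1. Starting above the repelling
# point 1/2, the flow converges to the attracting point 1.
traj = mean_field_trajectory(lambda y: 3 * y ** 2 - 2 * y ** 3 - y, 0.6)
```

This is the trajectory y(t/n) that X_t tracks for t = O(n) when n is large; the interesting question, taken up next, is what happens near the fixed points where the ODE alone stalls.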

SLIDE 24

Regular point

If n is large enough, then for t = O(n), X_t ≈ y(t/n).

SLIDE 25

Fixed point, F(y*) = 0

If n is large enough, then for t = O(n), X_t ≈ y(t/n).

SLIDE 26

Escaping a non-attracting fixed point

How long does the process take to escape a non-attracting fixed point?

SLIDE 27

Escaping a non-attracting fixed point

How long does the process take to escape a non-attracting fixed point?

  • 1. Θ(n)
  • 2. Θ(n log n)
  • 3. Θ(n log⁴ n)
  • 4. Θ(n²)
SLIDE 28

Escaping a non-attracting fixed point

How long does the process take to escape a non-attracting fixed point?

  • 1. Θ(n)
  • 2. Θ(n log n)   (the answer, highlighted in the original)
  • 3. Θ(n log⁴ n)
  • 4. Θ(n²)
SLIDE 29

Lower bound

Escaping the saddle-point region takes at least Ω(n log n) steps.

[Figure: process started at X_0 = y*]

SLIDE 30

Upper bound

Escaping the saddle-point region takes at most O(n log n) steps, if:

[Figure: X_0 = y*; X_T with T = O(n log n) has left the saddle region]

SLIDE 31

Upper bound

Escaping the saddle-point region takes at most O(n log n) steps, if:

  • Noise U_t is
– a martingale difference
– bounded
– noisy (covariance matrix is large)
  • Expected difference F ∈ 𝒞²
– y* is hyperbolic

[Figure: X_0 = y*; X_T with T = O(n log n) has left the saddle region]

SLIDE 32

Gradient-like dynamics

Converges to an attracting fixed-point region in O(n log n) steps, if:

  • Noise U_t is
– a martingale difference
– bounded
– noisy
  • Expected difference F ∈ 𝒞²
– Fixed points are hyperbolic
– A potential function exists

SLIDE 33

Outline

  • Escaping saddle points

[Venn diagram: Stochastic Gradient Descent, Dynamics on Social Networks, Evolutionary Game Theory]

SLIDE 34

Outline

  • Escaping saddle points
  • Case study: dynamics on social networks

[Venn diagram: Stochastic Gradient Descent, Dynamics on Social Networks, Evolutionary Game Theory]

SLIDE 35

(DIS)AGREEMENT IN PLANTED COMMUNITY NETWORKS

Dynamics on social networks

SLIDE 36

Echo chamber

Beliefs are amplified through interactions in segregated systems

SLIDE 37

Echo chamber

Beliefs are amplified through interactions in segregated systems

SLIDE 38

Echo chamber

Beliefs are amplified through interactions in segregated systems

  • Rich-get-richer
  • Community structure

SLIDE 39

Question

What is the consensus time given a rich-get-richer opinion formation and the level of intercommunity connectivity?

SLIDE 40

Node dynamic [Schoenebeck, Yu 18]

  • Fix a graph G = (V, E) and opinion set {0,1}
  • Given an initial configuration X_0 : V → {0,1}
  • At round t:
  • A node v is picked uniformly at random
  • The update of v's opinion depends only on the fraction of opinion 1 among its neighbors, r_{X_{t−1}}(v)

[Figure: example neighborhood and its fraction of opinion-1 neighbors]

SLIDE 41

Node dynamic ND(G, f_ND, X_0)

  • Fix a (weighted) graph G = (V, E), opinion set {0,1}, and an update function f_ND
  • Given an initial configuration X_0 : V → {0,1}
  • At round t:
  • A node v is picked uniformly at random
  • X_t(v) = 1 w.p. f_ND(r_{X_{t−1}}(v)); = 0 otherwise

[Figure: example neighborhood and its opinion-1 fraction r_{X_{t−1}}(v)]
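On a general weighted graph the update reads a node's weighted neighborhood fraction. A minimal sketch of one round of ND(G, f_ND, X_0); the adjacency-matrix representation and function names are my choices, not from the talk.

```python
import random

def neighbor_fraction(W, states, v):
    """r_X(v): weighted fraction of v's neighbors holding opinion 1."""
    total = sum(W[v][u] for u in range(len(W)) if u != v)
    ones = sum(W[v][u] for u in range(len(W)) if u != v and states[u] == 1)
    return ones / total

def node_dynamic_step(W, states, f_nd, rng=random):
    """One round of ND: pick v uniformly at random, then set its opinion
    to 1 w.p. f_ND(r_X(v)) and to 0 otherwise."""
    v = rng.randrange(len(W))
    r = neighbor_fraction(W, states, v)
    states[v] = 1 if rng.random() < f_nd(r) else 0
    return states
```

With all weights equal this reduces to the complete-graph dynamic from earlier slides (up to excluding v itself from its own neighborhood).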

SLIDE 42

Planted community

  • A weighted complete graph with n nodes, K(n, p)
– Two communities of equal size
– An edge has weight p within a community and 1 − p across communities

[Figure: planted community graph; community-structure parameter 2p − 1]

SLIDE 43

Planted community

  • A weighted complete graph with n nodes, K(n, p)
– Two communities of equal size
– An edge has weight p within a community and 1 − p across communities

[Axis: ε = 2p − 1, from 0 (complete graph) to 1 (two isolated complete graphs)]
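The planted community weights are easy to build explicitly; a sketch, assuming n even so the two communities have equal size:

```python
def planted_community_weights(n, p):
    """Weight matrix of K(n, p): two equal-size communities, edge weight p
    within a community and 1 - p across. The community-structure
    parameter is eps = 2p - 1: eps = 0 gives the complete graph,
    eps = 1 gives two isolated complete graphs."""
    half = n // 2
    community = [0] * half + [1] * half
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                W[i][j] = p if community[i] == community[j] else 1 - p
    return W
```

This matrix plugs directly into a weighted node-dynamic simulator to explore how ε controls the strength of the echo chamber.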

SLIDE 44

Question

  • What is the interaction between rich-get-richer opinion formation and the level of inter-community connectivity?

SLIDE 45

Question

  • What is the interaction between rich-get-richer opinion formation and the level of inter-community connectivity?

[Plot: update functions for Voter, Majority, 3-Majority; axis ε = 2p − 1]

SLIDE 46

Strong community structure

  • There exists an initial state from which the process cannot reach consensus quickly.

[Plot: 3-Majority update function; large ε]

SLIDE 47

Weak community structure

  • From every initial state, the process reaches consensus quickly.

[Plot: 3-Majority update function; small ε]

SLIDE 48

Our dichotomy theorem

  • Given a smooth rich-get-richer function f_ND ∈ 𝒞² and a planted community graph G = K(n, p), the maximum expected consensus time of ND(G, f_ND, X_0) has two cases:

[Diagram: axis ε = 2p − 1 from 0 (complete graph) to 1 (two isolated complete graphs); consensus time O(n log n) below the threshold ε*, exp(Ω(n)) above it]
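The dichotomy can be probed experimentally: run the node dynamic on K(n, p) from a fully polarized start and count rounds until consensus. A small self-contained sketch; the polarized initial state, step budget, and function names are my choices for illustration, not the theorem's setup.

```python
import random

def consensus_time(n, p, f_nd, max_steps=200000, seed=None):
    """Run the node dynamic on the planted community graph K(n, p) from a
    fully polarized start (community 0 all ones, community 1 all zeros)
    and return the number of rounds until consensus, or None on timeout."""
    rng = random.Random(seed)
    half = n // 2
    states = [1] * half + [0] * half
    comm = [0] * half + [1] * half
    for t in range(1, max_steps + 1):
        v = rng.randrange(n)
        ones = total = 0.0
        for u in range(n):
            if u == v:
                continue
            w = p if comm[u] == comm[v] else 1 - p  # planted weights
            total += w
            ones += w * states[u]
        # resample v's opinion from f_ND of its weighted neighbor fraction
        states[v] = 1 if rng.random() < f_nd(ones / total) else 0
        s = sum(states)
        if s == 0 or s == n:
            return t
    return None
```

Sweeping p from 1/2 upward (ε = 2p − 1 from 0 toward 1) with the 3-majority update should show the qualitative transition the theorem describes: fast consensus for small ε, and runs that exhaust any reasonable budget once ε is large.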

SLIDE 49

Node dynamic

  • A Markov chain on a 2-d grid with spacing 2/n
  • (0,0) and (1,1) are the consensus states

[Diagram: state (fraction of opinion 1 in community 1, fraction of opinion 1 in community 2) on [0,1]²]

SLIDE 50

Our dichotomy theorem

[Diagram: axis ε = 2p − 1 marking ε′, ε*, ε′′; 0 = complete graph, 1 = two isolated complete graphs]

SLIDE 51

Our dichotomy theorem

[Diagram: axis ε = 2p − 1 marking ε′, ε*, ε′′; phase portraits with attracting, saddle, and repelling points]

SLIDE 52

Our dichotomy theorem

[Diagram: axis ε = 2p − 1 marking ε′, ε*, ε′′; 0 = complete graph, 1 = two isolated complete graphs]

SLIDE 53

Our dichotomy theorem

[Diagram: phase portraits at ε′, ε*, ε′′ with attracting, saddle, and repelling points]

SLIDE 54

Fast consensus

X_{t+1} − X_t = (1/n)(F_ND(X_t) + U(X_t)) reaches an attracting fixed point in O(n log n) steps.

[Diagram: axis marking ε′, ε*, ε′′]

SLIDE 55

Questions?