Introduction to Stochastic Optimization
P. Carpentier — Master MMMEF, Cours MNOS 2014-2015
January 13, 2015


SLIDE 1

General Introduction to Stochastic Optimization Stochastic Gradient Method Overview

Introduction to Stochastic Optimization

January 13, 2015

P. Carpentier

Master MMMEF — Cours MNOS 2014-2015 3 / 265

SLIDE 2

Lecture Outline

1. General Introduction to Stochastic Optimization
   - Motivation and Goals
   - Reminders in the Deterministic Framework
   - Switching to the Stochastic Case

2. Stochastic Gradient Method Overview
   - Stochastic Gradient Algorithm
   - Connexion with Stochastic Approximation
   - Asymptotic Efficiency and Averaging
   - Practical Considerations

SLIDE 5

Goals of this Course

General objective: present numerical methods (convergence results, discretization schemes, algorithms...) in order to solve optimization problems in a stochastic framework.

Specific objective: be able to deal with large-scale problems for which standard methods (dynamic programming) are no longer effective because of the curse of dimensionality.

Problems under Consideration

- Open-loop problems: decisions do not depend on specific observations of the uncertainties.
- Closed-loop problems: available observations reveal some information, and decisions depend on these observations. New concept in stochastic optimization: the information structure, that is, the amount of information available to the decision maker.

SLIDE 6

Expected Difficulties

Solving stochastic optimization problems is not only a matter of optimizing a criterion under conventional constraints. Issues are:
- how to compute (conditional) expectations?
- how to deal with (probability) constraints?
- how to properly handle informational constraints?

Examples

- Open-loop problem: take a decision facing an uncertain future (investment problem).
- Recourse problem: take a first decision, then optimize its consequences (investment-operating problem).
- Multistage problems: a decision has to be taken at each time step (management, planning).

SLIDE 8

Deterministic Optimization Problems

General Problem:

  min_{u ∈ Uad ⊂ U}  J(u)                                  (1a)
  subject to  Θ(u) ∈ −C ⊂ V .                              (1b)

Dynamic Problem:

  min_{(u0,...,uT−1, x0,...,xT)}  Σ_{t=0}^{T−1} Lt(xt, ut) + K(xT)   (2a)
  subject to  x0 = xini given ,
              xt+1 = ft(xt, ut) ,  t = 0, ..., T−1 .       (2b)

SLIDE 10

Extension of the General Problem – Open-Loop Case (1)

Consider Problem (1) without the explicit constraint Θ, and suppose that J is in fact the expectation of a function j depending on a random variable W defined on a probability space (Ω, A, P) and valued in (W, W):

  J(u) = E[ j(u, W) ] .

Then the optimization problem writes

  min_{u ∈ Uad}  E[ j(u, W) ] .                            (3)

The decision u is a deterministic variable, which depends only on the probability law of W (and not on on-line observations of W).

Main difficulty: computation of the expectation.

SLIDE 11

Extension of the General Problem – Open-Loop Case (2)

Solution using Exact Quadrature

  J(u) = E[ j(u, W) ] ,   ∇J(u) = E[ ∇u j(u, W) ] .

Projected gradient algorithm:

  u(k+1) = projUad( u(k) − ǫ ∇J(u(k)) ) .

Sample Average Approximation (SAA)

Obtain a realization (w(1), ..., w(k)) of a k-sample of W and minimize the Monte Carlo approximation of J:

  u(k) ∈ arg min_{u ∈ Uad}  (1/k) Σ_{l=1}^{k} j(u, w(l)) .

Note that u(k) depends on the realization (w(1), ..., w(k))!
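The SAA recipe can be sketched in a few lines. Here the integrand j(u, w) = (u − w)²/2 and the Gaussian law of W are assumptions made for this illustration; with this j, the SAA problem has the sample mean as closed-form minimizer, which makes the sample-dependence of u(k) easy to see.

```python
import numpy as np

rng = np.random.default_rng(0)

def saa_minimizer(sample):
    # For j(u, w) = (u - w)^2 / 2, the SAA problem
    #   min_u (1/k) * sum_l j(u, w_l)
    # is solved in closed form by the arithmetic mean of the sample.
    return float(np.mean(sample))

w = rng.normal(loc=3.0, scale=1.0, size=1000)   # k-sample of W ~ N(3, 1) (assumed law)
u_k = saa_minimizer(w)

# u_k is random: it depends on the realization (w(1), ..., w(k)).
print(u_k)  # close to E[W] = 3 for large k
```

Drawing a second, independent sample gives a different u_k: the SAA minimizer is itself a random variable.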

SLIDE 12

Extension of the General Problem – Open-Loop Case (3)

Stochastic Gradient Method

Underlying ideas:
- use an "easily computable" approximation of ∇J based on realizations (w(1), ..., w(k), ...) of samples of W,
- incorporate the realizations one by one into the algorithm.

These considerations lead to the following algorithm:

  u(k+1) = projUad( u(k) − ǫ(k) ∇u j(u(k), w(k+1)) ) .

Iterations of the gradient algorithm are used a) to move towards the solution and b) to refine the Monte Carlo sampling process. This is the topic of the first part of the course.

SLIDE 13

Extension of the General Problem – Closed-Loop Case (1)

Consider again Problem (3), and assume now that the control is in fact a random variable U defined on the probability space (Ω, A, P) and valued in (U, U):²

  J(U) = E[ j(U, W) ] .

Denote by F the σ-field generated by W. The (interesting part of the) information available to the decision maker is a piece of the information revealed by the noise W, and thus is represented by a σ-field G included in F. Then the optimization problem writes

  min_{U ⪯ G}  E[ j(U, W) ] ,

where U ⪯ G means that U is measurable w.r.t. the σ-field G.

² There is a tricky point in the notations here...

SLIDE 14

Extension of the General Problem – Closed-Loop Case (2)

  min_{U ⪯ G}  E[ j(U, W) ] .

Examples.

- G = {∅, Ω}: this corresponds to the open-loop case:
    min_{u ∈ U}  E[ j(u, W) ] .
- G = σ(W): we have, by the interchange theorem:
    min_{u ∈ U}  j(u, w) ,  ∀w ∈ W .
- G ⊂ σ(W): the problem is equivalent to
    E[ min_{u ∈ U} E[ j(u, W) | G ] ] .
SLIDE 15

Extension of the General Problem – Closed-Loop Case (3)

Generally, the σ-field G is generated by an observation, that is, a random variable Y: G = σ(Y). Then the information constraint writes U ⪯ Y, and the optimization problem is:

  min_{U ⪯ Y}  E[ j(U, W) ] .                              (4)

The information constraint may be viewed
- from the algebraic point of view: the constraint is expressed in terms of σ-fields, that is, σ(U) ⊂ σ(Y);
- from the functional point of view: using Doob's theorem, U is expressed as a function of Y, that is, U = ϕ(Y).

Main difficulty: taking the information constraint into account.

SLIDE 16

Extension of the Dynamic Problem (1)

The natural stochastic extension of Problem (2) consists in adding a perturbation Wt at each time step t:

  min_{(U0,...,UT−1, X0,...,XT)}  E[ Σ_{t=0}^{T−1} Lt(Xt, Ut, Wt+1) + K(XT) ]   (5a)

  subject to  X0 = f−1(W0) ,
              Xt+1 = ft(Xt, Ut, Wt+1) ,  t = 0, ..., T−1 .   (5b)

We denote by Ft the σ-field generated by the noises prior to time t:

  Ft = σ(W0, ..., Wt) ,  t = 0, ..., T .

Nonanticipativity: Ft is the maximal information available at time t.

SLIDE 17

Extension of the Dynamic Problem (2)

Information Structure. A new observation becomes available at time t:

  Zt = ht(Xt, Wt) ,  t = 0, ..., T−1 .

- Zt = Wt: observation of the noise.
- Zt = Xt: observation of the state.

The information at time t is a function of past observations:

  Yt = Ct(Z0, ..., Zt) ,  t = 0, ..., T−1 .

- Yt = Zt: memoryless information.
- Yt = (Z0, ..., Zt): perfect memory.

Information Constraints: Ut ⪯ Yt, ∀t.

- Functional approach: Ut = ϕt(Yt).
- Algebraic approach: σ(Ut) ⊂ σ(Yt).
SLIDE 18

Extension of the Dynamic Problem (3)

Functional Approach: Stochastic Optimal Control

Assumptions are made on the noise process {Wt}t=0,...,T.

- Markovian case: Zt = Xt / Yt = Zt. The solution may be computed by the Dynamic Programming approach, developed on the state Xt.
- Classical case: Zt = ht(Xt, Wt) / Yt = (Z0, ..., Zt). The Dynamic Programming approach is still available, the state being the probability law of Xt rather than Xt itself.
- General case: Zt = ht(Xt, Wt) / Yt = Ct(Z0, ..., Zt). We are usually not able to solve the optimality conditions (dual effect, Witsenhausen counterexample). Curse of dimensionality. Difficulties in using decomposition/coordination methods.

SLIDE 19

Extension of the Dynamic Problem (4)

Algebraic Approach: Stochastic Programming

Rather than looking for the solution of the problem as functions depending on the information (the Dynamic Programming point of view), we try to obtain the solution as random variables satisfying the information constraints:

  σ(Ut) ⊂ σ(Yt) ,  t = 0, ..., T−1 .

- First issue: characterize the classes of problems that can be solved by this approach. Intuitively, the problem is much more intricate if the information Y depends on the decision U...
- Second issue: obtain a finite approximation of the problem, and more specifically discretize the information constraints.

This is the topic of the second part of the course.

SLIDE 22

Stochastic Gradient (SG) Algorithm

Standard Stochastic Gradient Algorithm

  min_{u ∈ Uad ⊂ U}  E[ j(u, W) ] .                        (6)

1. Let u(0) ∈ Uad and choose a positive real sequence {ǫ(k)}k∈N.
2. At iteration k+1, draw a realization w(k+1) of the r.v. W.
3. Compute the gradient of j and update u(k+1) by the formula:
     u(k+1) = projUad( u(k) − ǫ(k) ∇u j(u(k), w(k+1)) ) .
4. Set k = k + 1 and go to step 2.

Note that (w(0), ..., w(k), ...) is a realization of an ∞-sample of W, which is what makes the numerical implementation of the stochastic gradient method possible.
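The four steps above can be sketched as follows, on an assumed toy problem: the cost j(u, w) = (u − w)²/2, the law W ~ N(4, 1), and the admissible set Uad = [0, 10] are all choices made for this illustration, not taken from the course.

```python
import numpy as np

rng = np.random.default_rng(1)

def grad_j(u, w):
    # gradient of j(u, w) = (u - w)^2 / 2 with respect to u
    return u - w

def proj(u, lo=0.0, hi=10.0):
    # projection onto the admissible set Uad = [lo, hi]
    return min(max(u, lo), hi)

u = 0.0                                   # step 1: u(0) in Uad
for k in range(5000):
    eps = 1.0 / (k + 1)                   # sigma-sequence of step sizes
    w = rng.normal(4.0, 1.0)              # step 2: draw w(k+1)
    u = proj(u - eps * grad_j(u, w))      # step 3: gradient step + projection

print(u)  # close to the minimizer u# = E[W] = 4
```

With this particular j, each iteration moves u towards the latest sample, and the decreasing steps make the iterate settle on the expectation.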

SLIDE 23

Probabilistic Considerations (1)

In order to study the convergence of the algorithm, it is necessary to cast it in an adequate probabilistic framework:

  U(k+1) = projUad( U(k) − ǫ(k) ∇u j(U(k), W(k+1)) ) ,

where {W(k)}k∈N is an infinite-dimensional sample of W.³ This is an iterative relation involving random variables; several notions of convergence are available:
- convergence in law,
- convergence in probability,
- convergence in Lp norm,
- almost sure convergence (the "intuitive" one).

³ Note that (Ω, A, P) has to be "big enough" to support such a sample.

SLIDE 24

Probabilistic Considerations (2)

An iteration of the algorithm is represented by the general relation:

  U(k+1) = R(k)( U(k), W(k+1) ) .

Let F(k) be the σ-field generated by (W(1), ..., W(k)). The random variable U(k) is F(k)-measurable for all k:

  E[ U(k) | F(k) ] = U(k) .

The conditional expectation of U(k+1) w.r.t. F(k) merely consists of a standard expectation:

  E[ U(k+1) | F(k) ](ω) = ∫ R(k)( U(k)(ω), W(ω′) ) dP(ω′) .
SLIDE 25

Example: Estimation of an Expectation (1)

Let W be a real-valued random variable defined on (Ω, A, P), and suppose we want to compute an estimate of its expectation

  E(W) = ∫ W(ω) dP(ω) .

Monte Carlo method: obtain a k-sample (W(1), ..., W(k)) of W and compute the associated arithmetic mean:

  U(k) = (1/k) Σ_{l=1}^{k} W(l) .

By the Strong Law of Large Numbers (SLLN), the sequence of random variables {U(k)}k∈N almost surely converges to E(W).

SLIDE 26

Example: Estimation of an Expectation (2)

It is easy to show that

  U(k+1) = U(k) − (1/(k+1)) ( U(k) − W(k+1) ) .

Using the notations ǫ(k) = 1/(k+1) and j(u, w) = (u − w)²/2, the last expression of U(k+1) writes

  U(k+1) = U(k) − ǫ(k) ∇u j(U(k), W(k+1)) ,

which corresponds to the stochastic gradient algorithm applied to:⁴

  min_{u ∈ R}  (1/2) E[ (u − W)² ] .

⁴ Recall that E(W) is the value which minimizes the dispersion of W.
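The identity above is easy to check numerically: iterating the update with ǫ(k) = 1/(k+1) reproduces the arithmetic mean exactly, up to round-off. The law of W used below is an arbitrary choice made for the check.

```python
import numpy as np

rng = np.random.default_rng(2)
w = rng.normal(1.5, 2.0, size=2000)   # realizations of W (assumed law)

u = 0.0
for k, wk in enumerate(w):
    # stochastic gradient step with eps(k) = 1/(k+1):
    # U(k+1) = U(k) - (U(k) - W(k+1)) / (k+1)
    u = u - (u - wk) / (k + 1)

print(abs(u - float(np.mean(w))))     # essentially zero: the iterate IS the running mean
```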
SLIDE 27

Example: Estimation of an Expectation (3)

This example highlights several features of the stochastic gradient method.

- The step size ǫ(k) = 1/(k+1) goes to zero as k goes to +∞. Note however that ǫ(k) goes to zero "not too fast", that is,
    Σ_{k∈N} ǫ(k) = +∞ .
- It is reasonable to expect an almost sure convergence result for the stochastic gradient algorithm (rather than a weaker notion such as convergence in distribution or convergence in probability).
- As the Central Limit Theorem (CLT) applies to this case, we may expect a similar result for the rate of convergence of the stochastic gradient algorithm.

SLIDE 29

Stochastic Approximation (SA) Framework

A classical problem in Stochastic Approximation is to determine a zero of a function h : U → U, with U = Rⁿ, when the observation of h(u) is perturbed by an additive random variable ξ. The standard SA algorithm consists of the following formula:

  U(k+1) = U(k) + ǫ(k) ( h(U(k)) + ξ(k+1) ) ,

a random process {ξ(k)}k∈N and a filtration {F(k)}k∈N being given.

Link with the stochastic gradient algorithm:

  h(u) = −∇J(u) ,   ξ(k+1) = ∇J(U(k)) − ∇u j(U(k), W(k+1)) .

Finding u♯ s.t. h(u♯) = 0 is equivalent to solving ∇J(u♯) = 0.

SLIDE 30

Convergence Theorem (SA) (1)

Assumptions

1. The random variable U(0) is F(0)-measurable.
2. The mapping h : U → U is continuous and such that
     ∃ u♯ ∈ Rⁿ, h(u♯) = 0  and  ⟨h(u), u − u♯⟩ < 0, ∀u ≠ u♯ ;
     ∃ a > 0, ∀u ∈ Rⁿ, ‖h(u)‖² ≤ a (1 + ‖u‖²) .
3. The random variable ξ(k) is F(k)-measurable for all k, and
     E[ ξ(k+1) | F(k) ] = 0 ,
     ∃ d > 0, E[ ‖ξ(k+1)‖² | F(k) ] ≤ d (1 + ‖U(k)‖²) .
4. The sequence {ǫ(k)}k∈N is a σ-sequence, that is,
     Σ_{k∈N} ǫ(k) = +∞ ,  Σ_{k∈N} (ǫ(k))² < +∞ .
SLIDE 31

Convergence Theorem (SA) (2)

Robbins-Monro Theorem. Under the previous assumptions, the sequence {U(k)}k∈N of random variables generated by the SA algorithm almost surely converges to u♯.

For a proof, see [Duflo, 1997, §1.4].

This theorem can be extended to more general situations.
- A projection operator can be added:
    U(k+1) = projUad( U(k) + ǫ(k) ( h(U(k)) + ξ(k+1) ) ) .
- A "small" additional term R(k+1) can be added:⁵
    U(k+1) = U(k) + ǫ(k) ( h(U(k)) + ξ(k+1) + R(k+1) ) .

⁵ For example a bias on h(u), as considered in the Kiefer-Wolfowitz algorithm.
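A minimal sketch of the Robbins-Monro setting, under assumptions chosen for this demo only: h(u) = −(u − 2) satisfies the sign condition ⟨h(u), u − u♯⟩ < 0 and the linear growth bound, the noise ξ is centered Gaussian with bounded variance, and ǫ(k) = 1/(k+1) is a σ-sequence.

```python
import numpy as np

rng = np.random.default_rng(3)

def h(u):
    # assumed mapping with unique zero u# = 2
    return -(u - 2.0)

u = 10.0
for k in range(20000):
    eps = 1.0 / (k + 1)            # sigma-sequence: sum eps = inf, sum eps^2 < inf
    xi = rng.normal(0.0, 1.0)      # centered noise, E[xi | F(k)] = 0
    u = u + eps * (h(u) + xi)      # standard SA update

print(u)  # close to u# = 2
```

Only noisy evaluations h(U(k)) + ξ(k+1) are used; the algorithm never sees h itself, which is the point of the SA framework.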

SLIDE 32

Rate of Convergence (SA) (1)

We recall a result about the asymptotic normality of the sequence {U(k)} generated by the SA algorithm, together with an estimate of its rate of convergence. We first need to be more specific about the notion of σ-sequence.

Definition. A positive real sequence {ǫ(k)}k∈N is a σ(α, β, γ)-sequence if it is of the form

  ǫ(k) = α / (kᵞ + β) ,

with α > 0, β ≥ 0 and 1/2 < γ ≤ 1.

A consequence of this definition is that every σ(α, β, γ)-sequence is also a σ-sequence.

SLIDE 33

Rate of Convergence (SA) (2)

Assumptions

1. h is continuously differentiable and, in a neighborhood of u♯,
     h(u) = −H (u − u♯) + O(‖u − u♯‖²) ,
   where H is a symmetric positive-definite matrix.
2. The sequence { E[ ξ(k+1) (ξ(k+1))⊤ | F(k) ] }k∈N almost surely converges to a symmetric positive-definite matrix Γ.
3. ∃ δ > 0 such that sup_{k∈N} E[ ‖ξ(k+1)‖^{2+δ} | F(k) ] < +∞.
4. The sequence {ǫ(k)}k∈N is a σ(α, β, γ)-sequence.
5. The square matrix (H − λI) is positive-definite, with
     λ = 0 if γ < 1 ,   λ = 1/(2α) if γ = 1 .

SLIDE 34

Rate of Convergence (SA) (3)

We retain the assumptions ensuring almost sure convergence.

Central Limit Theorem. Under all previous assumptions, the sequence of random variables { (1/√ǫ(k)) (U(k) − u♯) }k∈N converges in law towards a centered Gaussian distribution with covariance matrix Σ, that is,

  (1/√ǫ(k)) (U(k) − u♯)  →D  N(0, Σ) ,

in which Σ is the solution of the so-called Lyapunov equation

  (H − λI) Σ + Σ (H − λI) = Γ .

For a proof, see [Duflo, 1996, Chapter 4].
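In the scalar case (n = 1) the Lyapunov equation can be solved in closed form, which makes the dependence of the asymptotic covariance on H, Γ and λ explicit. This worked special case is an addition for illustration, not a statement from the course:

```latex
(H - \lambda)\,\Sigma + \Sigma\,(H - \lambda) = \Gamma
\quad\Longrightarrow\quad
\Sigma = \frac{\Gamma}{2(H - \lambda)} , \qquad H > \lambda .
```

In particular, for γ = 1 (where λ = 1/(2α)) one sees directly why assumption 5 is needed: if H ≤ λ, the formula yields no positive solution Σ.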

SLIDE 35

Rate of Convergence (SA) (4)

- The result is valid only for unconstrained problems: Uad = U.
- The result can be rephrased as
    k^{γ/2} (U(k) − u♯)  →D  N(0, αΣ) ,
  so that β has in fact no influence on the convergence rate.
- The choice γ = 1 achieves the greatest convergence rate. We recover the rate 1/√k of a standard Monte Carlo estimator.
- If we refer back to Problem (6), where h = −∇J, we notice that H is the Hessian matrix of J at u♯:
    H = ∇²J(u♯) ,
  and that Γ is the covariance matrix of ∇u j evaluated at u♯:
    Γ = E[ ∇u j(u♯, W) (∇u j(u♯, W))⊤ ] .

SLIDE 37

Stochastic Newton Algorithm (1)

Here, the step sizes ǫ(k) are built using the optimal choice γ = 1, and the scalar gain α is replaced by a matrix gain A, where A is a symmetric positive-definite matrix. The algorithm becomes

  U(k+1) = U(k) − (1/(k + β)) A ∇u j(U(k), W(k+1)) ,

which in the Stochastic Approximation setting writes

  U(k+1) = U(k) + (1/(k + β)) ( A h(U(k)) + A ξ(k+1) ) .

The Central Limit Theorem is thus available, and we have

  √k (U(k) − u♯)  →D  N(0, ΣA) ,

where ΣA is the unique solution of

  (AH − I/2) ΣA + ΣA (HA − I/2) = AΓA .
SLIDE 38

Stochastic Newton Algorithm (2)

Let C_H be the set of symmetric positive-definite matrices A such that AH − I/2 is positive-definite.

Theorem. The choice A♯ = H⁻¹ minimizes the asymptotic covariance matrix ΣA over the set C_H. The minimal asymptotic covariance matrix is

  ΣA♯ = H⁻¹ Γ H⁻¹ .

Definition. A stochastic gradient algorithm is Newton-efficient if the sequence {U(k)}k∈N it generates has the same asymptotic convergence rate as the optimal Newton algorithm, namely

  √k (U(k) − u♯)  →D  N(0, H⁻¹ΓH⁻¹) .

SLIDE 39

Stochastic Gradient Algorithm with Averaging (1)

[Polyak, 1992] proposed to implement a Newton-efficient algorithm by incorporating an averaging stage into the standard algorithm. Assuming that the admissible set Uad is equal to the whole space U, the standard stochastic iteration

  U(k+1) = U(k) − ǫ(k) ∇u j(U(k), W(k+1))

is replaced by

  U(k+1) = U(k) − ǫ(k) ∇u j(U(k), W(k+1)) ,
  U_M(k+1) = (1/(k+1)) Σ_{l=1}^{k+1} U(l) .

An equivalent recursive form of the averaging stage is

  U_M(k+1) = U_M(k) + (1/(k+1)) ( U(k+1) − U_M(k) ) .
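The two-stage scheme above can be sketched on an assumed scalar problem min_u E[(u − W)²/2] with W ~ N(4, 1), so u♯ = E(W) = 4 (problem and parameter values are choices made for this illustration). The step sizes form a σ(α, β, γ)-sequence with γ = 2/3 < 1, and the averaging stage uses its recursive form.

```python
import numpy as np

rng = np.random.default_rng(4)
alpha, beta, gamma = 1.0, 10.0, 2.0 / 3.0   # assumed tuning

u, u_avg = 0.0, 0.0
for k in range(20000):
    eps = alpha / (k**gamma + beta)           # sigma(alpha, beta, gamma)-sequence
    w = rng.normal(4.0, 1.0)
    u = u - eps * (u - w)                     # standard stochastic iteration
    u_avg = u_avg + (u - u_avg) / (k + 1)     # recursive averaging stage

print(u_avg)  # averaged iterate, close to u# = 4
```

The raw iterate u keeps fluctuating at the scale of √ǫ(k); the averaged iterate u_avg is much smoother.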
SLIDE 40

Stochastic Gradient Algorithm with Averaging (2)

Theorem. Under the additional assumption that the σ(α, β, γ)-sequence {ǫ(k)}k∈N is such that γ < 1, the averaged stochastic gradient algorithm is Newton-efficient:

  √k (U_M(k) − u♯)  →D  N(0, H⁻¹ΓH⁻¹) .

For a proof, see [Duflo, 1996, Chapter 4].

According to the standard theorem, the convergence rate achieved by the sequence {U(k)}k∈N with γ < 1 is smaller than 1/√k and hence not optimal. The "nice" convergence properties are obtained for the averaged sequence {U_M(k)}k∈N.

SLIDE 42

A Toy Problem (Stochastic Gradient Toolbox)

Let us consider the following optimization problem:

  min_{u ∈ R¹⁰}  E[ (1/2) u⊤Au + W⊤u ] ,

A being a symmetric positive-definite matrix, and W being an R¹⁰-valued Gaussian random variable N(m, Γ). Its optimal solution u♯ = −A⁻¹m is estimated by Monte Carlo:

  Ū(k+1) = −(1/(k+1)) Σ_{l=1}^{k+1} A⁻¹ W(l) ,

and the standard stochastic gradient algorithm writes in that case:

  U(k+1) = U(k) − ǫ(k) ( A U(k) + W(k+1) ) .
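A sketch of this toy problem in Python. The particular A, m and Γ (taken as the identity here) are assumptions made for the demo, as is the σ(1, 100, 1)-sequence; the stochastic gradient iterate is compared with the exact solution u♯ = −A⁻¹m.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 10
A = np.diag(np.linspace(1.0, 2.0, n))      # assumed symmetric positive-definite matrix
m = rng.normal(size=n)                     # assumed mean of W
u_star = -np.linalg.solve(A, m)            # exact solution u# = -A^{-1} m

# Standard stochastic gradient: U(k+1) = U(k) - eps(k) (A U(k) + W(k+1)).
u = np.zeros(n)
for k in range(20000):
    eps = 1.0 / (k + 100.0)                # sigma(1, 100, 1)-sequence
    w = m + rng.normal(size=n)             # W(k+1) ~ N(m, I), Gamma = I assumed
    u = u - eps * (A @ u + w)

print(np.linalg.norm(u - u_star))          # small for large k
```

Note that the gradient sample A u + w is unbiased: its conditional expectation is A u + m = ∇J(u).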

SLIDE 43

Tuning the Standard Algorithm (1)

Let ǫ(k) be a σ(α, β, γ)-sequence, that is, ǫ(k) = α/(kᵞ + β).

- The best convergence rate is reached for γ = 1.
- The coefficient α influences the asymptotic behavior. The covariance matrix αΣ grows as α goes to +∞, but too small a value of α may generate very small gradient steps. The choice of α corresponds to a trade-off between stability and precision.
- Finally, the coefficient β makes it possible to regulate the transient behavior of the algorithm. During the first iterations, ǫ(k) is approximately equal to α/β. If α/β is too small, the transient phase may be slow; on the contrary, too large a ratio may lead to a numerical burst.

SLIDE 44

Tuning the Standard Algorithm (α/β = 0.1) (2)

[Figure: iterates of the standard stochastic gradient algorithm over 5000 iterations, for α = 0.3, α = 1, α = 5 and α = 10 (with α/β = 0.1).]

SLIDE 45

Tuning the Averaged Algorithm (1)

Take ǫ(k) shaped as a σ(α, β, γ)-sequence with 1/2 < γ < 1. The averaged stochastic gradient algorithm writes, on our example,

  U(k+1) = U(k) − (α/(kᵞ + β)) ( A U(k) + W(k+1) ) ,
  U_M(k+1) = (1/(k+1)) Σ_{l=1}^{k+1} U(l) .

- The value γ = 2/3 is considered a good choice.
- The tuning of the parameters α and β is much easier than for the standard algorithm. Indeed, the problem of "too small" step sizes arising from a bad choice of α is less critical because the term k^{−γ} goes to zero more slowly.
- Of course, the ratio α/β must be chosen so that numerical bursts do not occur during the first iterations of the algorithm.
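The σ(α, β, γ)-sequences used by the two variants can be compared directly; the values of α and β below are arbitrary choices for the comparison.

```python
# Step-size schedules eps(k) = alpha / (k**gamma + beta): gamma = 1 for the
# standard algorithm, gamma = 2/3 for the averaged one.
def sigma_sequence(alpha, beta, gamma):
    def eps(k):
        return alpha / (k**gamma + beta)
    return eps

eps_standard = sigma_sequence(1.0, 10.0, 1.0)        # standard algorithm
eps_averaged = sigma_sequence(1.0, 10.0, 2.0 / 3.0)  # averaged algorithm

for k in (10, 1000, 100000):
    # for large k the gamma = 2/3 steps dominate the gamma = 1 steps,
    # which is why a slightly mis-tuned alpha is less harmful when averaging
    print(k, eps_standard(k), eps_averaged(k))
```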

SLIDE 46

Tuning the Averaged Algorithm (α/β = 0.1) (2)

[Figure: iterates of the averaged stochastic gradient algorithm over 5000 iterations, for α = 0.3, α = 1, α = 5 and α = 10 (with α/β = 0.1).]
