

  1. CSCI 1951-G – Optimization Methods in Finance. Part 09: Interior Point Methods. March 23, 2018

  2. This material is covered in S. Boyd and L. Vandenberghe's book Convex Optimization, https://web.stanford.edu/~boyd/cvxbook/ . Some of the material and the figures are taken from it.

  3. Context
  • Two weeks ago: unconstrained problems, solved with descent methods
  • Last week: linearly constrained problems, solved with Newton's method
  • This week: inequality constrained problems, solved with interior point methods

  4. Inequality constrained minimization problems
  $$\min f_0(x) \quad \text{s.t.} \quad f_i(x) \le 0, \; i = 1, \dots, m, \quad Ax = b$$
  where $f_0, \dots, f_m$ are convex and twice continuously differentiable, $A \in \mathbb{R}^{p \times n}$, $\mathrm{rank}(A) = p < n$.
  Assume:
  • an optimal solution $x^*$ exists, with objective value $p^*$;
  • the problem is strictly feasible (i.e., the feasible region has interior points), so Slater's condition holds: there exist $\lambda^*$ and $\nu^*$ that, together with $x^*$, satisfy the KKT conditions.

  5. Hierarchy of algorithms
  Transforming a constrained problem into an unconstrained one: always possible, but has drawbacks.
  Solving the constrained problem directly: leverages problem structure.
  What is the easiest class of constrained problems to solve? Quadratic problems with linear equality constraints (LCQP): they only require solving a system of linear equations.
  How did we solve generic problems with linear equality constraints? With Newton's method, which solves a sequence of LCQPs!
  We will solve inequality constrained problems with interior point methods, which solve a sequence of linearly constrained problems!

  6. Problem transformation
  Goal: approximate the inequality constrained problem (ICP) with an equality constrained problem (ECP) solvable with Newton's method. We start by transforming the ICP into an equivalent ECP:
  From: $\min f_0(x)$ s.t. $f_i(x) \le 0, \; i = 1, \dots, m$, $Ax = b$
  To: $\min g(x)$ s.t. $Ax = b$
  where
  $$g(x) = f_0(x) + \sum_{i=1}^m I_-(f_i(x)), \qquad I_-(u) = \begin{cases} 0 & u \le 0 \\ \infty & u > 0 \end{cases}$$
  So we just use Newton's method and we are done. The End. Nope.

  7. Logarithmic barrier
  $$\min f_0(x) + \sum_{i=1}^m I_-(f_i(x)) \quad \text{s.t.} \quad Ax = b$$
  The objective function is in general not differentiable: we can't use Newton's method. We want to approximate $I_-(u)$ with a differentiable function:
  $$\hat{I}_-(u) = -\frac{1}{t}\log(-u)$$
  with domain $-\mathbb{R}_{++}$, where $t > 0$ is a parameter.
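  As a quick numerical illustration of this approximation (a minimal sketch, assuming NumPy; the function names `indicator` and `barrier_hat` are mine, not from the slides):

```python
import numpy as np

def indicator(u):
    """The exact indicator I_-(u): 0 if u <= 0, +inf otherwise (not differentiable)."""
    return 0.0 if u <= 0 else float("inf")

def barrier_hat(u, t):
    """The smooth approximation -(1/t) * log(-u), defined for u < 0."""
    return -np.log(-u) / t

u = -0.5  # a strictly feasible value (u < 0), where I_-(u) = 0
for t in (0.5, 1.0, 2.0, 10.0, 100.0):
    print(f"t = {t:6.1f}   I_hat(-0.5) = {barrier_hat(u, t):+.4f}")
# The values approach I_-(u) = 0 as t grows, while -(1/t)log(-u) still
# blows up to +inf as u -> 0 from below, mimicking the barrier at the boundary.
```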

  8. Logarithmic barrier
  $\hat{I}_-(u)$ is convex and differentiable.
  [Figure 11.1: The dashed lines show the function $I_-(u)$, and the solid curves show $\hat{I}_-(u) = -(1/t)\log(-u)$ for $t = 0.5, 1, 2$. The curve for $t = 2$ gives the best approximation.]

  9. Logarithmic barrier
  $$\min f_0(x) - \frac{1}{t}\sum_{i=1}^m \log(-f_i(x)) \quad \text{s.t.} \quad Ax = b$$
  The objective function is convex and differentiable: we can use Newton's method.
  $\phi(x) = -\sum_{i=1}^m \log(-f_i(x))$ is called the logarithmic barrier for the problem.

  10. Example: inequality form linear programming
  $$\min c^T x \quad \text{s.t.} \quad Ax \le b$$
  The logarithmic barrier for this problem is
  $$\phi(x) = -\sum_{i=1}^m \log(b_i - a_i^T x)$$
  where the $a_i^T$ are the rows of $A$.
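  A small sketch of this barrier on a toy LP (assuming NumPy; the box instance below is an invented example):

```python
import numpy as np

def lp_log_barrier(A, b, x):
    """phi(x) = -sum_i log(b_i - a_i^T x); returns +inf outside {x : Ax < b}."""
    s = b - A @ x                    # slacks b_i - a_i^T x
    if np.any(s <= 0):
        return np.inf                # x is not strictly feasible
    return -np.sum(np.log(s))

# Toy feasible region: the box 0 <= x_1, x_2 <= 1, written as Ax <= b.
A = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
b = np.array([1.0, 1.0, 0.0, 0.0])
print(lp_log_barrier(A, b, np.array([0.5, 0.5])))   # finite at an interior point
print(lp_log_barrier(A, b, np.array([1.0, 0.5])))   # inf on the boundary
```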

  11. How to choose $t$?
  $$\min f_0(x) + \frac{1}{t}\phi(x) \quad \text{s.t.} \quad Ax = b$$
  is an approximation of the original problem. How does the quality of the approximation change with $t$? As $t$ grows, $\frac{1}{t}\phi(x)$ tends to $\sum_i I_-(f_i(x))$, so the approximation quality increases.
  So let's just use a large $t$? Nope.

  12. Why not use a large $t$ immediately?
  What's the intuition behind Newton's method? Replace the objective function with its second-order Taylor approximation at $x$:
  $$f(x + v) \approx f(x) + \nabla f(x)^T v + \frac{1}{2} v^T \nabla^2 f(x)\, v$$
  When does this approximation (and Newton's method) work well? When the Hessian changes slowly. Is that the case for the barrier function?
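  As a 1-D illustration of this quadratic model (a sketch with made-up numbers, assuming NumPy), using the barrier-like term $f(x) = -\log(1 - x)$:

```python
import numpy as np

# f(x) = -log(1 - x): f'(x) = 1/(1-x), f''(x) = 1/(1-x)^2.
f = lambda x: -np.log(1.0 - x)
g = lambda x: 1.0 / (1.0 - x)
h = lambda x: 1.0 / (1.0 - x) ** 2

v = 0.02  # a small step
for x0 in (0.0, 0.95):  # far from vs. close to the boundary x = 1
    model = f(x0) + g(x0) * v + 0.5 * h(x0) * v**2  # 2nd-order Taylor model
    print(f"x0 = {x0}: f(x0+v) = {f(x0 + v):.5f}, quadratic model = {model:.5f}")
# Near x = 1 the Hessian changes quickly and the quadratic model degrades,
# foreshadowing the answer on the next slide.
```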

  13. Back to the example
  $$\min c^T x \quad \text{s.t.} \quad Ax \le b, \qquad \phi(x) = -\sum_{i=1}^m \log(b_i - a_i^T x)$$
  $$\nabla^2 \phi(x) = \sum_{i=1}^m \frac{1}{(b_i - a_i^T x)^2}\, a_i a_i^T$$
  The Hessian changes fast as $x$ gets close to the boundary of the feasible region.
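  One way to see this numerically (a sketch, assuming NumPy; the box instance is the same invented example as above):

```python
import numpy as np

def barrier_hessian(A, b, x):
    """Hessian of phi: sum_i a_i a_i^T / (b_i - a_i^T x)^2."""
    s = b - A @ x
    return (A / s[:, None] ** 2).T @ A

# Toy box 0 <= x <= 1; walk x toward the facet x_1 = 1.
A = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
b = np.array([1.0, 1.0, 0.0, 0.0])
for eps in (0.5, 1e-2, 1e-4, 1e-6):
    x = np.array([1.0 - eps, 0.5])
    print(f"distance to boundary {eps:.0e}: cond(Hessian) = "
          f"{np.linalg.cond(barrier_hessian(A, b, x)):.2e}")
# The condition number grows like 1/eps^2: the quadratic model, and hence a
# single Newton solve with a large t, becomes unreliable near the boundary.
```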

  14. Why not use a large $t$ immediately?
  The Hessian of the function $f_0 + \frac{1}{t}\phi$ varies rapidly near the boundary of the feasible set. This makes directly using a large $t$ inefficient. Instead, we will solve a sequence of problems of the form
  $$\min f_0(x) + \frac{1}{t}\phi(x) \quad \text{s.t.} \quad Ax = b$$
  for increasing values of $t$. We start each Newton minimization at the solution of the problem for the previous value of $t$.

  15. The central path
  Slight rewrite:
  $$\min t f_0(x) + \phi(x) \quad \text{s.t.} \quad Ax = b$$
  Assume it has a unique solution $x^*(t)$ for each $t > 0$.
  Central path: $\{x^*(t) : t > 0\}$ (made of central points).

  16. The central path
  Necessary and sufficient conditions for $x^*(t)$:
  • Strict feasibility: $Ax^*(t) = b$ and $f_i(x^*(t)) < 0$, $i = 1, \dots, m$.
  • Zero gradient of the Lagrangian (centrality condition): there exists $\hat{\nu}$ such that
  $$0 = t \nabla f_0(x^*(t)) + \nabla \phi(x^*(t)) + A^T \hat{\nu} = t \nabla f_0(x^*(t)) + \sum_{i=1}^m \frac{1}{-f_i(x^*(t))} \nabla f_i(x^*(t)) + A^T \hat{\nu}$$

  17. Back to the example
  $$\min c^T x \quad \text{s.t.} \quad Ax \le b, \qquad \phi(x) = -\sum_{i=1}^m \log(b_i - a_i^T x)$$
  Centrality condition (there are no equality constraints here, so the $A^T \hat{\nu}$ term drops):
  $$0 = t \nabla f_0(x^*(t)) + \nabla \phi(x^*(t)) = t c + \sum_{i=1}^m \frac{1}{b_i - a_i^T x^*(t)}\, a_i$$

  18. Back to the example
  $$0 = t c + \sum_{i=1}^m \frac{1}{b_i - a_i^T x}\, a_i$$
  [Figure 11.2: Central path for an LP with $n = 2$ and $m = 6$. The dashed curves show three contour lines of the logarithmic barrier function $\phi$. The central path converges to the optimal point $x^\star$ as $t \to \infty$. Also shown is the point on the central path with $t = 10$. The optimality condition (11.9) at this point can be verified geometrically: the line $c^T x = c^T x^\star(10)$ is tangent to the contour line of $\phi$ through $x^\star(10)$.]

  19. Dual point from the central path
  Every central point $x^*(t)$ yields a dual feasible point $(\lambda^*(t), \nu^*(t))$, and thus a lower bound on the optimal objective value $p^*$:
  $$\lambda_i^*(t) = -\frac{1}{t f_i(x^*(t))}, \; i = 1, \dots, m, \qquad \nu^*(t) = \frac{\hat{\nu}}{t}$$
  The proof gives us a lot of information.

  20. Proof
  • $\lambda_i^*(t) > 0$ because $f_i(x^*(t)) < 0$.
  • Rewrite the centrality condition:
  $$0 = t \nabla f_0(x^*(t)) + \sum_{i=1}^m \frac{1}{-f_i(x^*(t))} \nabla f_i(x^*(t)) + A^T \hat{\nu}$$
  Dividing by $t$:
  $$0 = \nabla f_0(x^*(t)) + \sum_{i=1}^m \lambda_i^*(t) \nabla f_i(x^*(t)) + A^T \nu^*(t)$$
  • The above says $\frac{\partial L}{\partial x}(x^*(t), \lambda^*(t), \nu^*(t)) = 0$, i.e., $x^*(t)$ minimizes the Lagrangian at $(\lambda^*(t), \nu^*(t))$.

  21. Proof
  Let's look at the dual function:
  $$g(\lambda^*(t), \nu^*(t)) = f_0(x^*(t)) + \sum_{i=1}^m \lambda_i^*(t)\, f_i(x^*(t)) + \nu^*(t)^T (A x^*(t) - b)$$
  It holds that $g(\lambda^*(t), \nu^*(t)) = f_0(x^*(t)) - m/t$. So
  $$f_0(x^*(t)) - p^* \le m/t$$
  i.e., $x^*(t)$ is no more than $m/t$-suboptimal! $x^*(t)$ converges to $x^*$ as $t \to \infty$.
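  To make the $m/t$ gap concrete, here is a tiny 1-D example I constructed (not from the slides): minimize $x$ subject to $1 - x \le 0$, so $m = 1$ and $p^* = 1$. The central point can be found by hand from the centrality condition $t - 1/(x - 1) = 0$:

```python
# Problem: minimize x  s.t.  1 - x <= 0  (so x >= 1, m = 1, p* = 1).
# Barrier problem: minimize t*x - log(x - 1); the optimality condition
# t - 1/(x - 1) = 0 gives x*(t) = 1 + 1/t.
for t in (1.0, 10.0, 100.0, 1000.0):
    x_t = 1.0 + 1.0 / t
    lam_t = -1.0 / (t * (1.0 - x_t))   # lambda*(t) = -1/(t f_1(x*(t))) = 1
    gap = x_t - 1.0                    # f0(x*(t)) - p*
    print(f"t = {t:6.0f}   x*(t) = {x_t:.4f}   lambda*(t) = {lam_t:.1f}   "
          f"gap = {gap:.1e} = m/t = {1.0 / t:.1e}")
```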

  22. The barrier method
  To get an $\varepsilon$-approximation we could just set $t = m/\varepsilon$ and solve
  $$\min \frac{m}{\varepsilon} f_0(x) + \phi(x) \quad \text{s.t.} \quad Ax = b$$
  This method does not scale well with the size of the problem and with $\varepsilon$.
  Barrier method: compute $x^*(t)$ for an increasing sequence of values of $t$, until $t \ge m/\varepsilon$.

  23. The barrier method
  Input: strictly feasible $x = x^{(0)}$, $t = t^{(0)} > 0$, $\mu > 1$, $\varepsilon > 0$.
  Repeat:
  1. Centering step: compute $x^*(t)$ by minimizing $t f_0 + \phi$ subject to $Ax = b$, starting at $x$.
  2. Update: $x \leftarrow x^*(t)$.
  3. Stopping criterion: quit if $m/t < \varepsilon$.
  4. Increase $t$: $t \leftarrow \mu t$.
  What can we ask about this algorithm? (A code sketch of the loop follows below.)
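  A minimal sketch of this loop for the inequality-form LP example (no equality constraints, so the centering step is plain unconstrained Newton with backtracking; the function name and the parameter defaults are my own illustrative choices, not from the slides):

```python
import numpy as np

def lp_barrier_method(c, A, b, x0, t0=1.0, mu=10.0, eps=1e-8, max_newton=50):
    """Barrier method for  min c^T x  s.t.  Ax <= b.  x0 must satisfy A x0 < b."""
    x, t, m = np.array(x0, dtype=float), t0, len(b)

    def val(y, t):  # t*c^T y + phi(y), +inf outside the interior
        s = b - A @ y
        return np.inf if np.any(s <= 0) else t * (c @ y) - np.sum(np.log(s))

    while True:
        for _ in range(max_newton):             # centering step: Newton's method
            s = b - A @ x
            grad = t * c + A.T @ (1.0 / s)      # gradient of t*f0 + phi
            hess = (A / s[:, None] ** 2).T @ A  # Hessian of phi (f0 is linear)
            dx = np.linalg.solve(hess, -grad)   # Newton step
            if -grad @ dx / 2.0 < 1e-10:        # Newton decrement stopping rule
                break
            step = 1.0                          # backtracking (Armijo) line search;
            while val(x + step * dx, t) > val(x, t) + 0.01 * step * (grad @ dx):
                step *= 0.5                     # infeasible points give +inf, so
            x = x + step * dx                   # strict feasibility is preserved
        if m / t < eps:                         # stopping criterion
            return x
        t *= mu                                 # increase t

# Toy usage: minimize -x1 - x2 over the box 0 <= x <= 1 (optimum at (1, 1)).
A = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
b = np.array([1.0, 1.0, 0.0, 0.0])
print(lp_barrier_method(np.array([-1.0, -1.0]), A, b, x0=np.array([0.5, 0.5])))
```

  Note how each centering step warm-starts from the previous $x$, as step 1 of the algorithm prescribes; with $\mu = 10$ the duality gap $m/t$ drops by a factor of 10 per outer iteration.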

  24. The barrier method
  What can we ask about this algorithm?
  1. How many iterations does it take to converge?
  2. Do we need to optimally solve the centering step?
  3. What is a good value for $\mu$?
  4. How to choose $t^{(0)}$?

  25. Convergence
  • The algorithm stops when $m/t < \varepsilon$.
  • $t$ starts at $t^{(0)}$.
  • $t$ increases to $\mu t$ at each iteration.
  How do we compute the number of iterations needed? We must find the smallest $i$ such that $m/\varepsilon < t^{(0)} \mu^i$. It holds:
  $$i = \left\lceil \frac{\log\!\left(m / (\varepsilon\, t^{(0)})\right)}{\log \mu} \right\rceil$$
  Is there anything important that this analysis does not tell us? It does not tell us whether, as $t$ grows, the centering step becomes more difficult. (It does not.)
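  Plugging in numbers (a sketch; the values below are arbitrary):

```python
from math import ceil, log

def outer_iterations(m, eps, t0, mu):
    """Smallest i with t0 * mu**i > m/eps, per the formula above."""
    return max(0, ceil(log(m / (eps * t0)) / log(mu)))

# e.g. m = 50 constraints, eps = 1e-6, t0 = 1, mu = 10  ->  8 outer iterations
print(outer_iterations(m=50, eps=1e-6, t0=1.0, mu=10.0))
```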

  26. [Figure 11.8: Average number of Newton steps required to solve 100 randomly generated LPs of different dimensions, with $n = 2m$. Error bars show standard deviation around the average value for each value of $m$. The growth in the number of Newton steps required, as the problem dimensions range over a 100:1 ratio, is very small.]

  27. The barrier method
  What can we ask about this algorithm?
  1. How many iterations does it take to converge?
  2. Do we need to optimally solve the centering step?
  3. What is a good value for $\mu$?
  4. How to choose $t^{(0)}$?
