Convex Optimization
(EE227A: UC Berkeley)
Lecture 25
(Newton, quasi-Newton) 23 Apr, 2013
- Suvrit Sra
Admin
♠ Project poster presentations: Soda 306 HP Auditorium, Fri May 10, 2013, 4pm–8pm
♠ HW5 due on May 02, 2013. Will be released today.
◮ Recall numerical analysis: the Newton method for solving an equation g(x) = 0, x ∈ R.
◮ Key idea: linear approximation.
◮ Suppose we are at some x close to x* (the root):
    g(x + Δx) = g(x) + g′(x)Δx + o(|Δx|).
◮ The equation g(x + Δx) = 0 is approximated by
    g(x) + g′(x)Δx = 0  ⟹  Δx = −g(x)/g′(x).
◮ If x is close to x*, we can expect Δx ≈ Δx* = x* − x.
◮ Thus, we may write x* ≈ x − g(x)/g′(x).
◮ This suggests the iterative process
    x_{k+1} ← x_k − g(x_k)/g′(x_k).
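The one-dimensional iteration above takes only a few lines of code (a minimal sketch; the choice of g(x) = x² − 2, starting point, and tolerance are illustrative, not from the lecture):

```python
def newton_1d(g, dg, x0, tol=1e-12, max_iter=50):
    """Newton iteration x_{k+1} = x_k - g(x_k)/g'(x_k) for solving g(x) = 0."""
    x = x0
    for _ in range(max_iter):
        step = g(x) / dg(x)
        x = x - step
        if abs(step) < tol:
            break
    return x

# Example: root of g(x) = x^2 - 2, i.e. sqrt(2), starting near the root.
root = newton_1d(lambda x: x * x - 2, lambda x: 2 * x, x0=1.5)
```

Started close to the root, the iterates reach machine precision in a handful of steps.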
◮ Suppose we have a system of nonlinear equations G(x) = 0, G : Rⁿ → Rⁿ.
◮ Again, arguing as above we arrive at the Newton system
    G(x) + G′(x)Δx = 0,
  where G′(x) is the Jacobian.
◮ Assuming G′(x) is non-degenerate (invertible), we obtain
    x_{k+1} = x_k − [G′(x_k)]⁻¹ G(x_k).
◮ This is Newton's method for solving nonlinear equations.
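In code, one solves the Newton system rather than forming the inverse Jacobian (a minimal sketch; the 2×2 test system below is an illustrative choice):

```python
import numpy as np

def newton_system(G, J, x0, tol=1e-12, max_iter=50):
    """Solve G(x) = 0 via x_{k+1} = x_k - [G'(x_k)]^{-1} G(x_k).

    We solve the Newton system G'(x) dx = -G(x) with a linear solver
    instead of explicitly inverting the Jacobian."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        dx = np.linalg.solve(J(x), -G(x))
        x = x + dx
        if np.linalg.norm(dx) < tol:
            break
    return x

# Illustrative system: x0^2 + x1^2 = 4 and x0 = x1, so x* = (sqrt(2), sqrt(2)).
G = lambda x: np.array([x[0] ** 2 + x[1] ** 2 - 4, x[0] - x[1]])
J = lambda x: np.array([[2 * x[0], 2 * x[1]], [1.0, -1.0]])
sol = newton_system(G, J, [1.0, 2.0])
```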
min f(x) such that x ∈ Rⁿ

∇f(x) = 0 is necessary for optimality. The Newton system
    ∇f(x) + ∇²f(x)Δx = 0
leads to
    x_{k+1} = x_k − [∇²f(x_k)]⁻¹ ∇f(x_k),
the Newton method for optimization.
◮ The Newton method for equations is more general than minimizing f(x) by finding roots of ∇f(x) = 0.
◮ Reason: not every function G : Rⁿ → Rⁿ is a derivative!
  Example: Consider the linear system Ax − b = 0. Unless A is symmetric, G(x) = Ax − b does not correspond to a derivative. (Why?)
◮ If it were a derivative, its own derivative ∇G(x) = A would be a Hessian, and Hessians must be symmetric. QED.
◮ In general, the Newton method is highly nontrivial to analyze.
  Example: Consider the iteration
      x_{k+1} = x_k − 1/x_k,   x₀ = 2.
  This may be viewed as the Newton iteration for e^{x²/2} = 0 (which has no real solution).
  It is unknown whether this iteration generates a bounded sequence!
  Newton fractals (complex dynamics): z³ − 2z + 2,  x⁸ + 15x⁴ − 16.
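A few iterates of this map already show its erratic behavior (a quick numerical sketch; it illustrates the wandering but of course settles nothing about boundedness):

```python
def newton_map_iterates(x0, n):
    """Iterate x_{k+1} = x_k - 1/x_k, the Newton map for exp(x^2/2) = 0."""
    xs = [x0]
    for _ in range(n):
        xs.append(xs[-1] - 1.0 / xs[-1])
    return xs

iterates = newton_map_iterates(2.0, 10)
# The sequence wanders: 2.0, 1.5, 0.8333..., -0.3666..., 2.3606..., ...
```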
Quadratic approximation
    φ(x) := f(x_k) + ⟨∇f(x_k), x − x_k⟩ + ½⟨∇²f(x_k)(x − x_k), x − x_k⟩.
Assuming ∇²f(x_k) ≻ 0, choose x_{k+1} as the argmin of φ(x):
    φ′(x_{k+1}) = ∇f(x_k) + ∇²f(x_k)(x_{k+1} − x_k) = 0.
◮ The method breaks down if ∇²f(x_k) ⊁ 0 (not positive definite).
◮ Only locally convergent.
  Example: Find the root of g(x) = x/√(1 + x²). Clearly, x* = 0.
  Exercise: Analyze the behavior of the Newton method for this problem.
  Hint: Consider the cases |x₀| < 1, x₀ = ±1, and |x₀| > 1.

Damped Newton method
    x_{k+1} = x_k − α_k [∇²f(x_k)]⁻¹ ∇f(x_k)
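A damped Newton step with a simple backtracking rule can be sketched as follows (my illustration, not the lecture's algorithm: the merit criterion on |f′| and the Armijo-style constants are assumptions, and the test function f(x) = √(1 + x²), whose derivative is the g(x) above, is chosen because the undamped method fails on it far from the root):

```python
import math

def damped_newton_1d(df, d2f, x0, tol=1e-10, max_iter=100):
    """Damped Newton: x_{k+1} = x_k - alpha_k * f'(x_k)/f''(x_k), with
    alpha_k chosen by backtracking until |f'| sufficiently decreases."""
    x = x0
    for _ in range(max_iter):
        d = df(x) / d2f(x)                 # full Newton step
        alpha = 1.0
        while abs(df(x - alpha * d)) > (1 - 0.25 * alpha) * abs(df(x)) and alpha > 1e-8:
            alpha *= 0.5                   # halve the step until acceptable
        x -= alpha * d
        if abs(df(x)) < tol:
            break
    return x

# f(x) = sqrt(1 + x^2): pure Newton diverges from |x0| > 1,
# but damping restores convergence to the minimizer x* = 0.
df = lambda x: x / math.sqrt(1 + x * x)
d2f = lambda x: (1 + x * x) ** -1.5
xstar = damped_newton_1d(df, d2f, x0=5.0)
```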
◮ Suppose the method generates a sequence {x_k} → x*,
◮ where x* is a local min, i.e., ∇f(x*) = 0 and ∇²f(x*) ≻ 0.
◮ Let g(x_k) ≡ ∇f(x_k); Taylor's theorem:
    0 = g(x*) = g(x_k) + ∇g(x_k)(x* − x_k) + o(‖x_k − x*‖)
◮ Multiply by [∇g(x_k)]⁻¹ to obtain
    x_k − x* − [∇g(x_k)]⁻¹ g(x_k) = o(‖x_k − x*‖)
◮ The Newton iteration is x_{k+1} = x_k − [∇g(x_k)]⁻¹ g(x_k), so
    x_{k+1} − x* = o(‖x_k − x*‖).
◮ So for x_k ≠ x* we get
    lim_{k→∞} ‖x_{k+1} − x*‖/‖x_k − x*‖ = lim_{k→∞} o(‖x_k − x*‖)/‖x_k − x*‖ = 0.
Local superlinear convergence rate.
Assumptions
    ‖∇²f(x) − ∇²f(y)‖ ≤ M‖x − y‖ (Lipschitz Hessian), and ∇²f(x*) ⪰ μI, μ > 0.

Theorem. Suppose x₀ satisfies ‖x₀ − x*‖ < r := 2μ/(3M). Then ‖x_k − x*‖ < r for all k, and the NM converges quadratically:
    ‖x_{k+1} − x*‖ ≤ M‖x_k − x*‖² / (2(μ − M‖x_k − x*‖)).

Reading assignment: Read §9.5.3 of Boyd–Vandenberghe.
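The quadratic rate is easy to observe numerically. For the illustrative choice f(x) = x − log x (mine, not from the slides), f′(x) = 1 − 1/x and f″(x) = 1/x², so the Newton update simplifies to x_{k+1} = 2x_k − x_k² and the error obeys e_{k+1} = e_k² exactly:

```python
def newton_errors(x0, n):
    """Newton on f(x) = x - log(x): the update x - f'(x)/f''(x)
    simplifies to 2*x - x**2. Minimizer is x* = 1; returns the
    errors |x_k - 1| for k = 0..n."""
    x, errs = x0, [abs(x0 - 1.0)]
    for _ in range(n):
        x = 2 * x - x * x
        errs.append(abs(x - 1.0))
    return errs

errs = newton_errors(1.5, 5)
# errors: 2^-1, 2^-2, 2^-4, 2^-8, 2^-16, 2^-32 -- squaring at every step,
# i.e., the number of correct digits roughly doubles per iteration.
```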
(Grad)    x_{k+1} = x_k − α_k ∇f(x_k),  α_k > 0
(Newton)  x_{k+1} = x_k − [∇²f(x_k)]⁻¹ ∇f(x_k).

Viewpoint for the gradient method. Consider the approximation
    φ₁(x) := f(x_k) + ⟨∇f(x_k), x − x_k⟩ + (1/2α)‖x − x_k‖².
The optimality condition yields
    φ₁′(x*) = ∇f(x_k) + (1/α)(x* − x_k) = 0
    x* = x_k − α∇f(x_k).
If α ∈ (0, 1/L], φ₁(x) is a global overestimator:
    f(x) ≤ φ₁(x), ∀x ∈ Rⁿ.
Viewpoint for the Newton method. Consider the quadratic approximation
    φ₂(x) := f(x_k) + ⟨∇f(x_k), x − x_k⟩ + ½⟨∇²f(x_k)(x − x_k), x − x_k⟩.
The minimum of this function is
    x* = x_k − [∇²f(x_k)]⁻¹ ∇f(x_k).

Something better than φ₁, less expensive than φ₂?
Generic Quadratic Model
    φ_D(x) := f(x_k) + ⟨∇f(x_k), x − x_k⟩ + ½⟨H_k(x − x_k), x − x_k⟩.

◮ Matrix H_k ≻ 0, some positive definite matrix.
◮ Leads to the optimum
    x* = x_k − H_k⁻¹ ∇f(x_k)
    x* = x_k − S_k ∇f(x_k).
◮ First-order methods that form a sequence of matrices {H_k} with H_k → ∇²f(x*), where H_k is constructed using only gradient information, are called variable metric or quasi-Newton methods:
    x_{k+1} = x_k − H_k⁻¹ ∇f(x_k),  k = 0, 1, . . .
    x_{k+1} = x_k − S_k ∇f(x_k),   k = 0, 1, . . .
Compute f(x₀) and ∇f(x₀)
1. descent direction: d_k ← S_k ∇f(x_k)
2. stepsize: search for good α_k > 0
3. update: x_{k+1} = x_k − α_k d_k
4. compute f(x_{k+1}) and ∇f(x_{k+1})
5. QN update: S_k → S_{k+1}

QN schemes differ in how S_k ≡ H_k⁻¹ is updated!
Secant equation / QN rule
    S_{k+1}(∇f(x_{k+1}) − ∇f(x_k)) = x_{k+1} − x_k.

◮ Quadratic models from iteration k → k + 1:
    φ_k(x) = a_k + ⟨g_k, x − x_k⟩ + ½⟨H(x − x_k), x − x_k⟩
    φ_{k+1}(x) = a_{k+1} + ⟨g_{k+1}, x − x_{k+1}⟩ + ½⟨H(x − x_{k+1}), x − x_{k+1}⟩
◮ φ_k′(x) − φ_{k+1}′(x) = g_k − g_{k+1} + H(x_{k+1} − x_k)
◮ Setting this to zero, we get
    g_{k+1} − g_k = H(x_{k+1} − x_k)
    S(g_{k+1} − g_k) = x_{k+1} − x_k.
◮ So we construct H_k → H_{k+1} or S_k → S_{k+1} to respect this.
◮ Barzilai–Borwein stepsize. Let y_k = g_{k+1} − g_k, s_k = x_{k+1} − x_k:
    min_H ‖H s_k − y_k‖,  H = αI.
◮ Davidon–Fletcher–Powell (DFP): with β := 1/⟨y_k, s_k⟩,
    H_{k+1} = (I − β y_k s_kᵀ) H_k (I − β s_k y_kᵀ) + β y_k y_kᵀ
    S_{k+1} = S_k − (S_k y_k y_kᵀ S_k)/⟨S_k y_k, y_k⟩ + β s_k s_kᵀ.
◮ Broyden–Fletcher–Goldfarb–Shanno (BFGS):
    S_{k+1} = (I − β s_k y_kᵀ) S_k (I − β y_k s_kᵀ) + β s_k s_kᵀ
    H_{k+1} = H_k − (H_k s_k s_kᵀ H_k)/⟨H_k s_k, s_k⟩ + β y_k y_kᵀ.
  BFGS is believed to be the most stable, best scheme.
◮ Notice, the updates are computationally "cheap".
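A compact sketch of the BFGS inverse update and a quasi-Newton loop (my illustration, not the lecture's code; the quadratic test problem and the exact-line-search formula, which is valid only for quadratics, are assumptions made so the demo is self-contained):

```python
import numpy as np

def bfgs_update(S, s, y):
    """BFGS update of the inverse-Hessian approximation S_k -> S_{k+1}:
    S+ = (I - beta s y^T) S (I - beta y s^T) + beta s s^T, beta = 1/<y, s>."""
    beta = 1.0 / (y @ s)
    V = np.eye(len(s)) - beta * np.outer(y, s)
    return V.T @ S @ V + beta * np.outer(s, s)

# Demo on an illustrative quadratic f(x) = 1/2 x^T A x - b^T x:
# the gradient is A x - b, and the exact stepsize along d is <d, g>/<d, A d>.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
x, S = np.zeros(2), np.eye(2)
g = A @ x - b
for _ in range(20):
    if np.linalg.norm(g) < 1e-10:
        break
    d = S @ g                          # quasi-Newton direction
    alpha = (d @ g) / (d @ (A @ d))    # exact line search (quadratic only)
    x_new = x - alpha * d
    g_new = A @ x_new - b
    S = bfgs_update(S, x_new - x, g_new - g)
    x, g = x_new, g_new
# x is now (numerically) the minimizer A^{-1} b
```

Note that the update enforces the secant equation by construction: S_{k+1} y_k = s_k.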
Hessian storage and update has O(n²) cost.
Limited-memory BFGS (L-BFGS): estimate H_k or S_k using only the previous few iterations; so essentially, use only O(mn) storage, where m ≈ 5–17.
◮ Each step of BFGS is: x_{k+1} = x_k − α_k S_k ∇f(x_k)
◮ S_k is updated at every iteration using
    S_{k+1} = V_kᵀ S_k V_k + β_k s_k s_kᵀ
  where, with s_k := x_{k+1} − x_k and y_k := ∇f(x_{k+1}) − ∇f(x_k),
    β_k = 1/(y_kᵀ s_k),  V_k = I − β_k y_k s_kᵀ.
◮ We use m vector pairs (s_i, y_i), for i = k − m, . . . , k − 1.
Unroll the S_k update loop for m iterations to obtain
    S_k = (V_{k−1}ᵀ · · · V_{k−m}ᵀ) S_k⁰ (V_{k−m} · · · V_{k−1})
        + β_{k−m} (V_{k−1}ᵀ · · · V_{k−m+1}ᵀ) s_{k−m} s_{k−m}ᵀ (V_{k−m+1} · · · V_{k−1})
        + β_{k−m+1} (V_{k−1}ᵀ · · · V_{k−m+2}ᵀ) s_{k−m+1} s_{k−m+1}ᵀ (V_{k−m+2} · · · V_{k−1})
        + · · · + β_{k−1} s_{k−1} s_{k−1}ᵀ.

The ultimate aim is to efficiently compute: S_k ∇f(x_k)
Exercise: Implement a procedure to compute S_k ∇f(x_k) efficiently.
◮ Typical choice: S_k⁰ = ((s_{k−1}ᵀ y_{k−1})/(y_{k−1}ᵀ y_{k−1})) I
◮ This is related to the BB stepsize!
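One standard way to approach this exercise is the L-BFGS two-loop recursion, which applies the unrolled product to ∇f(x_k) in O(mn) work without ever forming S_k (a sketch under the conventions above; treat it as one possible answer rather than the official solution):

```python
import numpy as np

def lbfgs_direction(grad_k, s_list, y_list):
    """Compute S_k @ grad_k via the two-loop recursion, using the m most
    recent pairs (s_i, y_i) (oldest first) and the initial scaling
    S_k^0 = (s^T y / y^T y) I built from the newest pair."""
    q = grad_k.copy()
    alphas = []
    # first loop: newest pair to oldest
    for s, y in zip(reversed(s_list), reversed(y_list)):
        a = (s @ q) / (y @ s)
        alphas.append(a)
        q -= a * y
    # apply the initial matrix S_k^0
    s, y = s_list[-1], y_list[-1]
    r = (s @ y) / (y @ y) * q
    # second loop: oldest pair to newest
    for (s, y), a in zip(zip(s_list, y_list), reversed(alphas)):
        b = (y @ r) / (y @ s)
        r += (a - b) * s
    return r

# Illustrative data: two (s, y) pairs with positive curvature y^T s > 0.
s_list = [np.array([1.0, 0.0, 0.5]), np.array([0.2, 1.0, -0.1])]
y_list = [np.array([0.8, 0.1, 0.4]), np.array([0.1, 0.9, 0.0])]
direction = lbfgs_direction(np.array([1.0, 2.0, 3.0]), s_list, y_list)
```

The recursion stores only the 2m vectors (s_i, y_i), which is exactly the O(mn) memory footprint claimed above.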
Two-metric projection method
    x_{k+1} = P_X(x_k − α_k S_k ∇f(x_k))

◮ Fundamental problem: not a descent iteration!
◮ We may have f(x_{k+1}) > f(x_k) for all α_k > 0.
◮ The method might not even recognize a stationary point!
◮ Projected gradient works! BUT
◮ Projected Newton or quasi-Newton do not work!
◮ More careful selection of S_k (or H_k) is needed.
◮ See e.g., Bertsekas and Gafni (Projected QN) (1984).
◮ With simple bound constraints: LBFGS-B.
We did not cover many interesting ideas:
♠ Proximal Newton methods
♠ f(x) + r(x) problems (see book chapter)
♠ Nonsmooth BFGS – Lewis, Overton
♠ Nonsmooth LBFGS
References
♥ Y. Nesterov. Introductory Lectures on Convex Optimization (2004).
♥ J. Nocedal, S. J. Wright. Numerical Optimization (1999).
♥ M. Schmidt, D. Kim, S. Sra. Newton-type methods in machine learning. Chapter 13 in Optimization for Machine Learning (2011).