SLIDE 1
Duality correspondences
Geoff Gordon & Ryan Tibshirani Optimization 10-725 / 36-725
SLIDE 2 Remember KKT conditions
Recall that for the problem

  min_{x∈R^n} f(x)
  subject to h_i(x) ≤ 0, i = 1, . . . m
             ℓ_j(x) = 0, j = 1, . . . r

the KKT conditions are

  0 ∈ ∂f(x) + Σ_{i=1}^m u_i ∂h_i(x) + Σ_{j=1}^r v_j ∂ℓ_j(x)  (stationarity)
  u_i · h_i(x) = 0 for all i  (complementary slackness)
  h_i(x) ≤ 0, ℓ_j(x) = 0 for all i, j  (primal feasibility)
  u_i ≥ 0 for all i  (dual feasibility)

These are necessary for optimality (of a primal-dual pair x⋆ and u⋆, v⋆) under strong duality, and sufficient for convex problems
SLIDE 3 Remember solving the primal via the dual
An important consequence of stationarity: under strong duality, given a dual solution u⋆, v⋆, any primal solution x⋆ solves

  min_{x∈R^n} f(x) + Σ_{i=1}^m u⋆_i h_i(x) + Σ_{j=1}^r v⋆_j ℓ_j(x)

Often, solutions of this unconstrained problem can be expressed explicitly, giving an explicit characterization of primal solutions (from dual solutions). Furthermore, suppose the solution of this problem is unique; then it must be the primal solution x⋆. This can be very helpful when the dual is easier to solve than the primal
SLIDE 4 Consider as an example (from B & V page 249):

  min_{x∈R^n} Σ_{i=1}^n f_i(x_i) subject to a^T x = b

where each f_i : R → R is a strictly convex function. Dual function:

  g(v) = min_{x∈R^n} Σ_{i=1}^n f_i(x_i) + v(b − a^T x)
       = bv + Σ_{i=1}^n min_{x_i∈R} (f_i(x_i) − a_i v x_i)
       = bv − Σ_{i=1}^n f_i*(a_i v)

where f_i* is the conjugate of f_i, to be defined shortly
SLIDE 5 Therefore the dual problem is

  max_{v∈R} bv − Σ_{i=1}^n f_i*(a_i v)

or equivalently

  min_{v∈R} Σ_{i=1}^n f_i*(a_i v) − bv

This is a convex minimization problem with a scalar variable, much easier to solve than the primal. Given v⋆, the primal solution x⋆ solves

  min_{x∈R^n} Σ_{i=1}^n (f_i(x_i) − a_i v⋆ x_i)

Strict convexity of each f_i implies that this has a unique solution, namely x⋆, which we compute by solving ∂f_i(x_i) ∋ a_i v⋆ for each i
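The recipe on this slide can be checked numerically. A minimal sketch in Python, assuming the hypothetical choice f_i(x_i) = x_i^2/2, so that f_i*(z) = z^2/2 and the scalar dual reduces to min_v ‖a‖^2 v^2/2 − bv, solved by v⋆ = b/‖a‖^2:

```python
# Hypothetical instance: f_i(x_i) = x_i^2 / 2, so f_i*(z) = z^2 / 2
a = [1.0, 2.0, 3.0]
b = 4.0

# Scalar dual: min_v sum_i (a_i v)^2 / 2 - b v, minimized at v* = b / ||a||^2
v_star = b / sum(ai ** 2 for ai in a)

# Recover the primal solution from stationarity: f_i'(x_i) = x_i = a_i v*
x_star = [ai * v_star for ai in a]

# The recovered x* is primal feasible: a^T x* = b
residual = sum(ai * xi for ai, xi in zip(a, x_star)) - b
assert abs(residual) < 1e-12
```

The instance and numbers here are illustrative assumptions, not from the slides; the point is that the scalar dual plus per-coordinate stationarity recovers a primal-feasible solution.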
SLIDE 6 Dual subtleties
- Often, we will transform the dual into an equivalent problem and still call this the dual. Under strong duality, we can use solutions of the (transformed) dual problem to characterize or compute primal solutions. Warning: the optimal value of this transformed dual problem is not necessarily the optimal primal value
- A common trick in deriving duals for unconstrained problems is to first transform the primal by adding a dummy variable and an equality constraint. Usually there is ambiguity in how to do this, and different choices lead to different dual problems!
SLIDE 7
Lasso dual
Recall the lasso problem:

  min_{x∈R^p} (1/2)‖y − Ax‖_2^2 + λ‖x‖_1

Its dual function is just a constant (equal to f⋆). Therefore we redefine the primal as

  min_{x∈R^p, z∈R^n} (1/2)‖y − z‖_2^2 + λ‖x‖_1 subject to z = Ax

so the dual function is now

  g(u) = min_{x∈R^p, z∈R^n} (1/2)‖y − z‖_2^2 + λ‖x‖_1 + u^T (z − Ax)
       = (1/2)‖y‖_2^2 − (1/2)‖y − u‖_2^2 − I_{v : ‖v‖_∞ ≤ 1}(A^T u/λ)

This calculation will make sense once we learn conjugates, shortly
SLIDE 8 Therefore the lasso dual problem is

  max_{u∈R^n} (1/2)‖y‖_2^2 − (1/2)‖y − u‖_2^2 subject to ‖A^T u‖_∞ ≤ λ

or equivalently

  min_{u∈R^n} ‖y − u‖_2^2 subject to ‖A^T u‖_∞ ≤ λ

Note that strong duality holds here (Slater’s condition), but the optimal value of the last problem is not necessarily the optimal lasso objective value. Further, note that given u⋆, any lasso solution x⋆ satisfies (from the z block of the stationarity condition) z⋆ − y + u⋆ = 0, i.e., Ax⋆ = y − u⋆. So the lasso fit is just the dual residual
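The relation u⋆ = y − Ax⋆ and the dual constraint can be spot-checked numerically. A sketch, assuming numpy is available; the lasso is solved here by proximal gradient descent (ISTA), a method not covered on these slides, on a small randomly generated instance:

```python
import numpy as np

def soft_threshold(w, t):
    """Prox of t * ||.||_1, applied componentwise."""
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))
y = rng.standard_normal(20)
lam = 1.0

# Solve the lasso by proximal gradient descent (ISTA) on a tiny instance
step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1 / Lipschitz constant of the gradient
x = np.zeros(5)
for _ in range(5000):
    x = soft_threshold(x - step * (A.T @ (A @ x - y)), step * lam)

# Candidate dual solution from the slide: u* = y - A x* (the lasso residual)
u = y - A @ x

# u* must be feasible for the lasso dual: ||A^T u*||_inf <= lambda
assert np.max(np.abs(A.T @ u)) <= lam + 1e-6
```

The final assertion is exactly the dual constraint from this slide, evaluated at the residual of the computed primal solution (with a small tolerance for the iterative solve).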
SLIDE 9 Outline
Today:
- Conjugate function
- Dual cones
- Dual polytopes
- Polar sets
(And there are lots more duals, e.g., dual graphs, the algebraic dual, the analytic dual, all related in some way...)
SLIDE 10 Conjugate function
Given a function f : R^n → R, define its conjugate f* : R^n → R,

  f*(y) = max_{x∈R^n} y^T x − f(x)

Note that f* is always convex, since it is the pointwise maximum of convex (affine) functions in y (f need not be convex)
f*(y) is the maximum gap between the linear function y^T x and f(x) (From B & V page 91). For differentiable f, conjugation is called the Legendre transform
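The definition can be explored numerically by replacing the maximum over R^n with a maximum over a finite grid. A minimal sketch, assuming the hypothetical choice f(x) = x^2/2, whose conjugate is known in closed form to be f*(y) = y^2/2:

```python
# Approximate f*(y) = max_x (y x - f(x)) over a grid of x values,
# for the hypothetical choice f(x) = x^2 / 2 (conjugate: f*(y) = y^2 / 2)
def f(x):
    return 0.5 * x * x

grid = [i / 1000.0 for i in range(-5000, 5001)]  # x in [-5, 5], step 0.001

def conjugate(y):
    return max(y * x - f(x) for x in grid)

# The grid maximum matches the closed form (for |y| well inside the grid range)
for y in (-2.0, 0.0, 1.5):
    assert abs(conjugate(y) - 0.5 * y * y) < 1e-3
```

The grid is an assumption of the sketch: it only gives the true supremum when the maximizer x = y lies inside the grid range.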
SLIDE 11 Properties:
- Fenchel’s inequality: for any x, y,
f(x) + f∗(y) ≥ xT y
- Hence conjugate of conjugate f∗∗ satisfies f∗∗ ≤ f
- If f is closed and convex, then f∗∗ = f
- If f is closed and convex, then for any x, y,
x ∈ ∂f∗(y) ⇔ y ∈ ∂f(x) ⇔ f(x) + f∗(y) = xT y
- If f(u, v) = f_1(u) + f_2(v) (here u ∈ R^n, v ∈ R^m), then

  f*(w, z) = f_1*(w) + f_2*(z)
SLIDE 12 Examples:
- Simple quadratic: let f(x) = (1/2)x^T Qx, where Q ≻ 0. Then y^T x − (1/2)x^T Qx is strictly concave in x and is maximized at x = Q^{-1}y, so

  f*(y) = (1/2)y^T Q^{-1}y

  Note that Fenchel’s inequality gives: (1/2)x^T Qx + (1/2)y^T Q^{-1}y ≥ x^T y
- Indicator function: if f(x) = I_C(x), then its conjugate is

  f*(y) = I_C*(y) = max_{x∈C} y^T x

  called the support function of C; we’ll revisit this later
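The quadratic case can be verified numerically. A sketch, assuming numpy and a randomly generated positive definite Q (an assumption of the example, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((3, 3))
Q = B @ B.T + 3.0 * np.eye(3)   # assumed positive definite Q
Qinv = np.linalg.inv(Q)

x = rng.standard_normal(3)
y = rng.standard_normal(3)

# Fenchel's inequality for f(x) = x'Qx/2 and f*(y) = y'Q^{-1}y/2
lhs = 0.5 * x @ Q @ x + 0.5 * y @ Qinv @ y
assert lhs >= x @ y

# Equality holds when y = Qx, i.e. where the max defining f*(y) is attained
y_eq = Q @ x
gap = 0.5 * x @ Q @ x + 0.5 * y_eq @ Qinv @ y_eq - x @ y_eq
assert abs(gap) < 1e-8
```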
SLIDE 13
- Norm: if f(x) = ‖x‖, then its conjugate is

  f*(y) = 0 if ‖y‖_* ≤ 1
          ∞ else

where ‖·‖_* is the dual norm of ‖·‖ (recall that we defined ‖y‖_* = max_{‖z‖≤1} z^T y). Why? Note that if ‖y‖_* > 1, then there exists z with ‖z‖ ≤ 1 and z^T y = ‖y‖_* > 1, so

  (tz)^T y − ‖tz‖ = t(z^T y − ‖z‖) → ∞, as t → ∞

i.e., f*(y) = ∞. On the other hand, if ‖y‖_* ≤ 1, then for any z,

  z^T y − ‖z‖ ≤ ‖z‖‖y‖_* − ‖z‖ ≤ 0

with equality when z = 0, so f*(y) = 0
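A numerical illustration in one dimension, where f(x) = |x| and the dual norm is again the absolute value, so f*(y) = 0 for |y| ≤ 1 and f*(y) = ∞ otherwise. On a finite grid the "unbounded" case shows up as growth with the grid radius; the grid and radius below are assumptions of the sketch:

```python
# Conjugate of f(x) = |x| over a finite grid of x values
grid = [i / 100.0 for i in range(-10000, 10001)]  # x in [-100, 100]

def conjugate(y):
    return max(y * x - abs(x) for x in grid)

assert conjugate(0.7) == 0.0    # |y| <= 1: supremum is 0, attained at x = 0
assert conjugate(-1.0) == 0.0   # boundary of the dual-norm ball
assert conjugate(1.5) >= 49.0   # |y| > 1: value t(|y| - 1) at x = t grows with the grid
```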
SLIDE 14
Conjugates and dual problems
Conjugates appear frequently in the derivation of dual problems, via

  −f*(u) = min_{x∈R^n} f(x) − u^T x

in the minimization of the Lagrangian. E.g., consider

  min_{x∈R^n} f(x) + g(x)  ⇔  min_{x∈R^n, z∈R^n} f(x) + g(z) subject to x = z

Lagrange dual function:

  g(u) = min_{x∈R^n, z∈R^n} f(x) + g(z) + u^T (z − x) = −f*(u) − g*(−u)

Hence the dual problem is

  max_{u∈R^n} −f*(u) − g*(−u)
SLIDE 15 Examples of this last calculation:
- Indicator function: the dual of

  min_{x∈R^n} f(x) + I_C(x)

  is

  max_{u∈R^n} −f*(u) − I_C*(−u)

  where I_C* is the support function of C
- Norm: the dual of

  min_{x∈R^n} f(x) + ‖x‖

  is

  max_{u∈R^n} −f*(u) subject to ‖u‖_* ≤ 1

  where ‖·‖_* is the dual norm of ‖·‖
SLIDE 16
Double dual
Consider a general minimization problem with linear constraints:

  min_{x∈R^n} f(x)
  subject to Ax ≤ b, Cx = d

The Lagrangian is

  L(x, u, v) = f(x) + (A^T u + C^T v)^T x − b^T u − d^T v

and hence the dual problem is

  max_{u∈R^m, v∈R^r} −f*(−A^T u − C^T v) − b^T u − d^T v
  subject to u ≥ 0

Recall the property: f** = f if f is closed and convex. Hence in this case, we can show that the dual of the dual is the primal
SLIDE 17 Actually, the connection (between duals of duals and conjugates) runs much deeper than this, beyond linear constraints. Consider

  min_{x∈R^n} f(x)
  subject to h_i(x) ≤ 0, i = 1, . . . m
             ℓ_j(x) = 0, j = 1, . . . r

If f and h_1, . . . h_m are closed and convex, and ℓ_1, . . . ℓ_r are affine, then the dual of the dual is the primal
This is proved by viewing the minimization problem in terms of a bifunction. In this framework, the dual function corresponds to the conjugate of this bifunction (for more, read Chapters 29 and 30 of Rockafellar)
SLIDE 18
Cones
A set K ⊆ R^n is called a cone if

  x ∈ K ⇒ θx ∈ K for all θ ≥ 0

It is called a convex cone if

  x_1, x_2 ∈ K ⇒ θ_1 x_1 + θ_2 x_2 ∈ K for all θ_1, θ_2 ≥ 0

i.e., K is convex and a cone (From B & V page 26)
SLIDE 19 Examples:
- Linear subspace: any linear subspace is a convex cone
- Norm cone: if ‖·‖ is a norm, then

  K = {(x, t) ∈ R^{n+1} : ‖x‖ ≤ t}

  is a convex cone, called a norm cone (the epigraph of the norm function). Under the 2-norm, it is called the second-order cone (From B & V page 31)
SLIDE 20
- Normal cone: given a set C, recall we defined its normal cone at a point x ∈ C as

  N_C(x) = {g ∈ R^n : g^T x ≥ g^T y for any y ∈ C}

  This is always a convex cone, regardless of C
- Positive semidefinite cone: consider the set of (symmetric) positive semidefinite matrices

  S^n_+ = {X ∈ R^{n×n} : X = X^T, X ⪰ 0}

  This is a convex cone, because for A, B ⪰ 0 and θ_1, θ_2 ≥ 0,

  x^T (θ_1 A + θ_2 B)x = θ_1 x^T Ax + θ_2 x^T Bx ≥ 0
SLIDE 21
Dual cones
For a cone K ⊆ R^n,

  K* = {y ∈ R^n : y^T x ≥ 0 for all x ∈ K}

is called its dual cone. This is always a convex cone (even if K is not convex). Note that

  y ∈ K* ⇔ the halfspace {x ∈ R^n : y^T x ≥ 0} contains K

(From B & V page 52) Important property: if K is a closed convex cone, then K** = K
SLIDE 22 Examples:
- Linear subspace: the dual cone of a linear subspace V is V⊥, its orthogonal complement. E.g., (row(A))* = null(A)
- Norm cone: the dual cone of the norm cone

  K = {(x, t) ∈ R^{n+1} : ‖x‖ ≤ t}

  is the norm cone of its dual norm,

  K* = {(y, s) ∈ R^{n+1} : ‖y‖_* ≤ s}
- Positive semidefinite cone: the convex cone S^n_+ is self-dual, meaning (S^n_+)* = S^n_+. Why? Check that

  Y ⪰ 0 ⇔ tr(Y X) ≥ 0 for all X ⪰ 0

  by looking at the eigenvalue decomposition of X
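One direction of the self-duality check can be illustrated numerically. A sketch, assuming numpy; the two positive semidefinite matrices are built as Gram matrices BB^T and CC^T, which is just one convenient way to sample the cone:

```python
import numpy as np

rng = np.random.default_rng(2)
B = rng.standard_normal((4, 4))
C = rng.standard_normal((4, 4))
X = B @ B.T     # a positive semidefinite matrix
Y = C @ C.T     # another positive semidefinite matrix

# tr(YX) = tr(C C' B B') = ||B' C||_F^2 >= 0, the inner product of two
# points of S^n_+ is nonnegative, consistent with self-duality
assert np.trace(Y @ X) >= -1e-10
assert abs(np.trace(Y @ X) - np.linalg.norm(B.T @ C, 'fro') ** 2) < 1e-8
```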
SLIDE 23
Dual cones and dual problems
Consider the constrained problem

  min_{x∈K} f(x)

Recall that its dual problem is

  max_{u∈R^n} −f*(u) − I_K*(−u)

where recall I_K*(y) = max_{z∈K} z^T y, the support function of K. If K is a cone, then this is simply

  max_{u∈K*} −f*(u)

where K* is the dual cone of K, because I_K*(−u) = I_{K*}(u)
This is quite a useful observation, because many different types of constraints can be posed as cone constraints
SLIDE 24 Generalized inequalities
If K ⊆ R^n is a proper cone (convex cone, closed, solid, pointed), then it induces a generalized inequality ≤_K over R^n via

  x ≤_K y if y − x ∈ K

Examples:
- Componentwise inequality: the nonnegative orthant R^n_+ = {x ∈ R^n : x_i ≥ 0 for all i} is a proper cone, and it induces the generalized inequality: x ≤_{R^n_+} y if and only if x_i ≤ y_i for all i (we have been writing this as x ≤ y)
- Matrix inequality: S^n_+ is a proper cone, and it induces the generalized inequality: X ≤_{S^n_+} Y if and only if Y − X is positive semidefinite (we have been writing this as X ⪯ Y)

Hence any set of generalized inequalities can be posed in terms of cone constraints
SLIDE 25 Conic solvers
Two general suites of solvers that rely on transforming a convex problem into conic form (i.e., one with cone constraints) are CVX¹ and TFOCS²
- Transformation to conic form is not necessarily unique, and different transformations yield different problems, possibly of varying difficulty
- CVX is more general; TFOCS is less general but can be a lot faster (apparently close to the state of the art)
- Both are freely available (implemented in MATLAB)

¹ M. Grant and S. Boyd (2008), Graph implementations for nonsmooth convex problems, http://cvxr.com/cvx
² S. Becker, E. Candes and M. Grant (2010), Templates for convex cone problems with applications to sparse signal recovery, http://cvxr.com/tfocs
SLIDE 26 Given a problem in conic form, TFOCS (Templates for First-Order Conic Solvers) derives and solves the dual problem³, and then computes a primal solution relying on strong duality. Consider:

  min_{x∈R^n} f(x)
  subject to Ax + b ∈ K

for a convex cone K. The dual problem is

  max_{u∈R^n} −f*(A^T u) − b^T u
  subject to u ∈ K*

Important point: projection onto K* is quite often a lot easier than projection onto {x ∈ R^n : Ax + b ∈ K}, so we can employ a first-order method on the dual

³ Actually, in TFOCS the dual problem is often smoothed before being solved, but we haven’t covered smoothing yet
SLIDE 27 E.g., consider the problem

  min_{x∈R^p} f(x) subject to ‖y − Ax‖_2 ≤ σ

where the parameter σ > 0 is a known fixed quantity. This can be transformed into the desired conic form by writing the constraint as

  (y − Ax, σ) ∈ {(z, t) ∈ R^{n+1} : ‖z‖_2 ≤ t}

i.e., K is the second-order cone. Note that K* = K (self-dual), and projection onto K is easy:

  P_K(z, t) = (z, t)  if ‖z‖_2 ≤ t
              ((‖z‖_2 + t) / (2‖z‖_2)) · (z, ‖z‖_2)  if −‖z‖_2 ≤ t ≤ ‖z‖_2
              (0, 0)  if t ≤ −‖z‖_2
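The three-case projection translates directly into code. A minimal sketch in pure Python (an illustration of the formula, with no claim about how TFOCS implements it):

```python
import math

def proj_soc(z, t):
    """Euclidean projection of the point (z, t) onto the second-order cone
    {(z, t) : ||z||_2 <= t}, following the three cases above."""
    nz = math.sqrt(sum(zi * zi for zi in z))
    if nz <= t:
        return list(z), t                 # already inside the cone
    if t <= -nz:
        return [0.0] * len(z), 0.0        # deep in the polar cone: project to 0
    alpha = (nz + t) / (2.0 * nz)         # shrink factor (||z||_2 + t) / (2||z||_2)
    return [alpha * zi for zi in z], alpha * nz

# Example: ||(3, 4)||_2 = 5 > t = 0, so the middle case applies
z_proj, t_proj = proj_soc([3.0, 4.0], 0.0)
assert z_proj == [1.5, 2.0] and t_proj == 2.5   # lands on the cone boundary
```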
SLIDE 28 Polytopes
A polytope P ⊆ R^n is the convex hull of a finite number of points in R^n:

  P = conv{x_1, . . . x_k}

This is called the V-representation of P. Fundamental result: P is a polytope ⇔ P is a bounded polyhedron, i.e., P is bounded and

  P = ∩_{i=1}^m {x ∈ R^n : a_i^T x ≤ b_i}

This is called the H-representation of P. These representations are also called the primal and dual representations; we’ll see why shortly (From B & V page 32)
SLIDE 29 Faces of polytopes
A face of a polytope P is a set F such that

  x, y ∈ P and (x + y)/2 ∈ F ⇒ x, y ∈ F

The set of faces of P is written F(P). Properties and definitions:
- Each face F of P satisfies F = ∅, F = P, or F = P ∩ H for a supporting hyperplane H to P
- Faces F ≠ ∅, P are called proper
- A face F is said to have dimension d (or, is called a d-face) if aff(F) is d-dimensional
- If F = {x} is a 0-face, then x is called a vertex. Moreover, P = conv{x_1, . . . x_k} for the vertices x_1, . . . x_k of P. Conversely, if P = conv(A), then A contains the vertices of P
SLIDE 30
- If F is an (n − 1)-face, then it is called a facet.4 If F1, . . . Fm
are the facets of P, then P =
m
Hi for halfspaces Hi such that bd(Hi) = aff(Fi). Conversely, if P =
m
Hi for halfspaces Hi, then {bd(Hi) ∩ P : i = 1, . . . m} contains the facets of P
- The set of faces F(P) can be partially ordered by inclusion.
Note that, with respect to this ordering, vertices are minimal proper faces, and facets are maximal proper faces
4This is assuming, without a loss of generality, that aff(P) = Rn. Otherwise
we just reparametrize to Rd, where d = dim(aff(P))
30
SLIDE 31
Dual polytopes
Given a polytope P ⊆ R^n, a polytope P* ⊆ R^n is called its dual polytope if there exists a one-to-one mapping Ψ : F(P) → F(P*) that is inclusion-reversing:

  F_1 ⊆ F_2 ⇔ Ψ(F_1) ⊇ Ψ(F_2), for all F_1, F_2 ∈ F(P)

This implies that

  dim(F) + dim(Ψ(F)) = n − 1, for all F ∈ F(P)

E.g., the cross-polytope (1-norm ball) and the hypercube (∞-norm ball) are dual (From http://en.wikipedia.org/wiki/Dual_polyhedron)
Does every polytope have a dual? As we’ll see shortly, the answer is yes
SLIDE 32 One use of polytope duality (among many) is that it allows us to compute (in theory) one type of representation from the other:
- Suppose we had an H-representation for P*. From this we can enumerate the facets F*_1, . . . F*_k of P*, and hence the vertices x_1 = Ψ⁻¹(F*_1), . . . x_k = Ψ⁻¹(F*_k) of P. Therefore conv{x_1, . . . x_k} is a V-representation for P
- Suppose we had a V-representation for P*. Then we can enumerate the vertices x*_1, . . . x*_m of P*, which yield the facets F_1 = Ψ⁻¹(x*_1), . . . F_m = Ψ⁻¹(x*_m) of P. Therefore ∩_{i=1}^m H_i is an H-representation for P, where the H_i are halfspaces with bd(H_i) = aff(F_i)
SLIDE 33 Polar sets
Given a set C ⊆ R^n,

  C° = {y ∈ R^n : y^T x ≤ 1 for all x ∈ C}

is called its polar set, and is always convex (even when C is not). Polarity is the most general form of geometric duality. Properties and examples:
- If C is a closed, convex set containing 0, then C°° = C
- If C is a cone, then

  C° = {y ∈ R^n : y^T x ≤ 0 for all x ∈ C} = −C*

  where C* is the dual cone. Here C° is called the polar cone
- If C is a polytope, then C° is its dual polytope, and Ψ can be defined by

  Ψ(F) = {y ∈ C° : y^T x = 1 for all x ∈ F}
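The cross-polytope/hypercube pairing from the dual-polytope slide is also a polarity. A sketch in pure Python, checking at random points that the polar of the ℓ1 ball in R^3 is exactly the ℓ∞ ball (the dimension and sample count are arbitrary choices):

```python
import random

random.seed(0)
n = 3
# Vertices of the cross-polytope C = {x : ||x||_1 <= 1}: the points +/- e_i
vertices = [[0.0] * i + [s] + [0.0] * (n - 1 - i)
            for i in range(n) for s in (1.0, -1.0)]

def in_polar(y):
    # C = conv(vertices), so sup_{x in C} y'x is attained at a vertex
    return max(sum(yi * xi for yi, xi in zip(y, v)) for v in vertices) <= 1.0

# Membership in C° coincides with membership in the l-infinity unit ball
for _ in range(200):
    y = [random.uniform(-2.0, 2.0) for _ in range(n)]
    assert in_polar(y) == (max(abs(yi) for yi in y) <= 1.0)
```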
SLIDE 34
- If C is the sublevel set of a norm ‖·‖,

  C = {x ∈ R^n : ‖x‖ ≤ t} for some t > 0,

  then its polar is also a sublevel set,

  C° = {y ∈ R^n : ‖y‖_* ≤ 1/t}

  where ‖·‖_* is the dual norm
- The support function of C satisfies

  I_C*(y) ≤ 1 ⇔ y ∈ C°

  and if C is a cone, then I_C*(y) = I_{C°}(y)
- I_C* and I_{C°}* are called dual seminorms, and satisfy

  x^T y ≤ I_C*(x) · I_{C°}*(y)

  for all x, y ∈ R^n
SLIDE 35 References
- S. Boyd and L. Vandenberghe (2004), Convex Optimization, Cambridge University Press, Chapters 2, 3, 5
- B. Grunbaum (2003), Convex Polytopes, Springer, Chapters 2, 3
- R. T. Rockafellar (1970), Convex Analysis, Princeton University Press, Chapters 12, 13, 14, 16