Exponential cone in MOSEK
ISMP 2018, Relative Entropy Optimization, 6 July 2018
Michał Adamaszek, MOSEK ApS
www.mosek.com
MOSEK
- linear conic solver: SOCP, SDP, EXP, POW,
- primal/dual simplex for LPs,
- convex QPs,
- + mixed-integer,
- APIs: MATLAB, C, Python, Java, .NET, R, Julia,
- conic modeling language Fusion, C++, Java, .NET, Python,
- third party: AMPL, GAMS, CVX, CVXPY, YALMIP, JuMP
- version 9 (soon).
1 / 21
Conic problems
A conic problem in canonical primal form:

    minimize    c^T x
    s.t.        Ax = b
                x ∈ K

with dual

    maximize    b^T y
    s.t.        c − A^T y ∈ K*

where K = K_1 × · · · × K_s is a product of cones. Extremely disciplined convex programming: a problem in conic form is convex by construction.
2 / 21
Conic problems
Nonlinear symmetric cones supported in MOSEK:
- quadratic (SOC): x_1 ≥ (x_2^2 + · · · + x_n^2)^{1/2},
- rotated quadratic: 2x_1x_2 ≥ x_3^2 + · · · + x_n^2, x_1, x_2 ≥ 0,
- semidefinite:
S^n_+ = {X ∈ R^{n×n} : X = FF^T}
3 / 21
Exponential cone
Kexp = cl {x ∈ R^3 : x_1 ≥ x_2 exp(x_3/x_2), x_1, x_2 > 0}

Equivalently −x_3 ≥ x_2 log(x_2/x_1) = rel_entr(x_2, x_1).

- Kexp is the perspective cone (epigraph of the perspective function (x, y) → x f(y/x)) for either f(u) = exp(u) or f(u) = u log(u).
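As a sanity check, the two descriptions of Kexp above can be compared numerically. A small sketch in plain Python (not the MOSEK API); the tolerance `tol` is an illustrative choice:

```python
import math

def in_expcone(x1, x2, x3, tol=1e-9):
    # membership via x1 >= x2*exp(x3/x2), on the x2 > 0 branch
    return x2 > 0 and x1 >= x2 * math.exp(x3 / x2) - tol

def in_relentr(x1, x2, x3, tol=1e-9):
    # equivalent description: -x3 >= x2*log(x2/x1) = rel_entr(x2, x1)
    return x1 > 0 and x2 > 0 and -x3 >= x2 * math.log(x2 / x1) - tol

# the two descriptions agree on points with x1, x2 > 0
pts = [(1.0, 1.0, 0.0), (math.e, 1.0, 1.0),
       (2.0, 1.0, 0.5), (0.5, 2.0, -3.0), (1.0, 1.0, 1.0)]
for p in pts:
    assert in_expcone(*p) == in_relentr(*p)
```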
4 / 21
Modeling with the exponential cone
- t ≥ exp(x) ⟺ (t, 1, x) ∈ Kexp
- t ≤ log(x) ⟺ (x, 1, t) ∈ Kexp
- t ≥ a_1^{x_1} · · · a_k^{x_k} ⟺ (t, 1, Σ_i x_i log a_i) ∈ Kexp, a_i ∈ R_+
- t ≥ x exp(x):

    t ≥ x exp(y/x), (t, x, y) ∈ Kexp
    y ≥ x^2, (0.5, y, x) ∈ Q_r
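The first and third identities above can be verified numerically. A plain-Python sketch (the values of a and x are illustrative):

```python
import math

def in_expcone(x1, x2, x3, tol=1e-9):
    # x1 >= x2*exp(x3/x2), x2 > 0 branch of Kexp
    return x2 > 0 and x1 >= x2 * math.exp(x3 / x2) - tol

# t >= exp(x)  <=>  (t, 1, x) in Kexp
assert in_expcone(3.0, 1.0, 1.0)          # 3 >= e
assert not in_expcone(2.0, 1.0, 1.0)      # 2 <  e

# t >= a_1^x_1 * ... * a_k^x_k  <=>  (t, 1, sum_i x_i*log(a_i)) in Kexp
a, x = [2.0, 3.0], [1.5, 0.5]
prod = math.prod(ai**xi for ai, xi in zip(a, x))
u = sum(xi * math.log(ai) for ai, xi in zip(a, x))
assert in_expcone(prod + 0.01, 1.0, u)
assert not in_expcone(prod - 0.01, 1.0, u)
```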
5 / 21
Modeling with the exponential cone
What is (SOC, EXP, POW, SDP)-representable? Probably a lot. The following examples come from ask.cvxr.com.
6 / 21
Modeling with the exponential cone
- Product of variables in the objective
max(x_1 x_2 · · · x_n) ⟺ max(Σ_i log x_i)
Appears in maximum likelihood optimization.
- Log-sum-exp
t ≥ log(ex1 + · · · + exn) is equivalent to ex1−t + · · · + exn−t ≤ 1.
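The log-sum-exp equivalence is easy to check numerically. A plain-Python sketch with arbitrary sample values:

```python
import math

def logsumexp(xs):
    # numerically stable log(e^x1 + ... + e^xn)
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

xs = [1.0, 2.0, -0.5]
t_feas = logsumexp(xs) + 0.1      # t above the log-sum-exp value
t_infeas = logsumexp(xs) - 0.1    # t below it

# t >= log(e^x1 + ... + e^xn)  <=>  e^(x1-t) + ... + e^(xn-t) <= 1
assert sum(math.exp(x - t_feas) for x in xs) <= 1.0
assert sum(math.exp(x - t_infeas) for x in xs) > 1.0
```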
7 / 21
Power cone
K^p_pow = {x ∈ R^3 : x_1^{p−1} x_2 ≥ |x_3|^p, x_1, x_2 ≥ 0}, p > 1

- generalizes the Lorentz cone (p = 2),
- is also a perspective cone (of f(u) = |u|^p),
- allows modeling of |x|^p, x^{1/p}, etc.
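A quick plain-Python membership check for the power cone, illustrating the p = 2 case and the modeling of t ≥ |x|^p; the sample values are arbitrary:

```python
def in_powcone(x1, x2, x3, p):
    # x1^(p-1) * x2 >= |x3|^p, x1, x2 >= 0
    return x1 >= 0 and x2 >= 0 and x1**(p - 1) * x2 >= abs(x3)**p

# p = 2 recovers a quadratic (Lorentz-type) condition: x1*x2 >= x3^2
assert in_powcone(2.0, 2.0, 1.9, 2)       # 4 >= 3.61
assert not in_powcone(2.0, 2.0, 2.1, 2)   # 4 <  4.41

# t >= |x|^p via (1, t, x) in K^p_pow: 1^(p-1) * t >= |x|^p
p, x = 3, 1.7
assert in_powcone(1.0, abs(x)**p + 1e-6, x, p)
assert not in_powcone(1.0, abs(x)**p - 1e-6, x, p)
```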
8 / 21
Geometric programming
A geometric program (GP) has the form

    minimize    f_0(x)
    s.t.        f_j(x) ≤ 1,  j = 1, . . . , m,
                x_i > 0,  i = 1, . . . , n,

where each f is a posynomial:

    f(x) = Σ_k c_k x^{α_k},  c_k > 0, α_k ∈ R^n,

e.g. 2√x + 0.1 x^{−1} z^3 ≤ 1. For x_i = exp(y_i) the constraints take a convex (conic) form

    Σ_k c_k exp(α_k^T y) ≤ 1.
Applications: circuit design, chemical engineering, mechanical engineering, wireless networks, ...
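The change of variables x_i = exp(y_i) can be checked on the slide's example constraint 2√x + 0.1 x^{−1} z^3 ≤ 1; a plain-Python sketch:

```python
import math

# posynomial terms of 2*sqrt(x) + 0.1 * x^-1 * z^3
c = [2.0, 0.1]
alpha = [(0.5, 0.0), (-1.0, 3.0)]   # exponents of (x, z) per term

def posy(x, z):
    # posynomial in the original variables x, z > 0
    return sum(ck * x**a1 * z**a2 for ck, (a1, a2) in zip(c, alpha))

def posy_exp(y):
    # after x = exp(y1), z = exp(y2): sum_k c_k * exp(alpha_k^T y), convex in y
    return sum(ck * math.exp(a1 * y[0] + a2 * y[1])
               for ck, (a1, a2) in zip(c, alpha))

# the two forms agree at corresponding points
x, z = 0.2, 0.8
y = (math.log(x), math.log(z))
assert abs(posy(x, z) - posy_exp(y)) < 1e-12
```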
9 / 21
Logistic regression
Training data: (x_1, y_1), . . . , (x_n, y_n) ∈ R^d × {0, 1}. Classify new data using

    h_θ(x) = 1 / (1 + exp(−θ^T x)) ∼ P[y = 1].

Cost function

    J(θ) = Σ_i −y_i log(h_θ(x_i)) − (1 − y_i) log(1 − h_θ(x_i)).

Regularized optimization problem

    minimize_{θ∈R^d} J(θ) + λ‖θ‖_2.
10 / 21
Logistic regression — conic model
    minimize_{θ∈R^d} Σ_i −y_i log(h_θ(x_i)) − (1 − y_i) log(1 − h_θ(x_i)) + λ‖θ‖_2.

Formulate as:

    minimize    1^T t + λr
    s.t.        t_i ≥ −log(h_θ(x_i)) = log(1 + exp(−θ^T x_i))  if y_i = 1,
                t_i ≥ −log(1 − h_θ(x_i)) = log(1 + exp(θ^T x_i))  if y_i = 0,
                r ≥ ‖θ‖_2.

Each constraint is conic-representable:

- r ≥ ‖θ‖_2 ⟺ (r, θ) ∈ Q
- t ≥ log(1 + exp(u)) ⟺ exp(−t) + exp(u − t) ≤ 1 ⟺ y_1 + y_2 ≤ 1, (y_1, 1, u − t) ∈ Kexp, (y_2, 1, −t) ∈ Kexp.
11 / 21
Logistic regression in Fusion
from mosek.fusion import *

# t >= log( 1 + exp(u) )
def softplus(M, t, u):
    y = M.variable(2)
    # y_1 + y_2 <= 1
    M.constraint(Expr.sum(y), Domain.lessThan(1.0))
    # [ y_1  1  u-t ]
    # [ y_2  1   -t ] in ExpCone
    M.constraint(Expr.hstack(y,
                             Expr.constTerm(2, 1.0),
                             Expr.vstack(Expr.sub(u, t), Expr.neg(t))),
                 Domain.inPExpCone())

def logisticRegression(X, y, lamb=1.0):
    n, d = X.shape           # num samples, dimension
    M = Model()
    theta = M.variable(d)
    t = M.variable(n)
    reg = M.variable()
    M.objective(ObjectiveSense.Minimize,
                Expr.add(Expr.sum(t), Expr.mul(lamb, reg)))
    M.constraint(Var.vstack(reg, theta), Domain.inQCone())
    for i in range(n):
        dot = Expr.dot(X[i], theta)
        if y[i] == 1:
            softplus(M, t.index(i), Expr.neg(dot))
        else:
            softplus(M, t.index(i), dot)
    M.solve()

12 / 21
Logistic regression — example
Logistic regression with increasing regularization. Every point lifted through the 28 monomials of degree ≤ 6.

Remark: logistic regression is a (log-)likelihood maximization problem:

    J(θ) = −log Π_i h_θ(x_i)^{y_i} (1 − h_θ(x_i))^{1−y_i}.
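The reduction of the per-sample cost to softplus terms, used in the conic model above, can be verified directly. A plain-Python sketch with arbitrary sample values:

```python
import math

def softplus(u):
    # numerically stable log(1 + exp(u))
    return max(u, 0.0) + math.log1p(math.exp(-abs(u)))

def h(theta, x):
    # logistic hypothesis h_theta(x) = 1 / (1 + exp(-theta^T x))
    s = sum(t * xi for t, xi in zip(theta, x))
    return 1.0 / (1.0 + math.exp(-s))

theta, x = [0.3, -1.2], [1.0, 0.5]
s = sum(t * xi for t, xi in zip(theta, x))

# -log h_theta(x)     = log(1 + exp(-theta^T x))
# -log(1 - h_theta(x)) = log(1 + exp( theta^T x))
assert abs(-math.log(h(theta, x)) - softplus(-s)) < 1e-12
assert abs(-math.log(1 - h(theta, x)) - softplus(s)) < 1e-12
```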
13 / 21
Luxemburg norms
Dirk Lorenz, https://regularize.wordpress.com/2018/05/24/building-norms-from-increasing-and-convex-functions-the-luxemburg-norm/

ϕ : R_+ → R_+ increasing, convex with ϕ(0) = 0. Then the following is a norm on R^n:

    ‖x‖_ϕ = inf { λ > 0 : Σ_i ϕ(|x_i|/λ) ≤ 1 }.

Example: ϕ(x) = x^p:

    Σ_i (|x_i|/λ)^p ≤ 1 ⟺ λ ≥ (Σ_i |x_i|^p)^{1/p},

so ‖x‖_ϕ = ‖x‖_p.
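The infimum defining ‖x‖_ϕ can be computed by bisection on λ, since Σ_i ϕ(|x_i|/λ) is decreasing in λ. A plain-Python sketch checking the ϕ(u) = u^p example against the p-norm (the bracket [1e-9, 1e9] and iteration count are illustrative choices):

```python
import math

def luxemburg(x, phi, lo=1e-9, hi=1e9, iters=200):
    # ||x||_phi = inf{ lam > 0 : sum_i phi(|x_i|/lam) <= 1 } via bisection
    f = lambda lam: sum(phi(abs(xi) / lam) for xi in x)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        # f decreasing in lam: f(mid) > 1 means mid is below the norm
        lo, hi = (mid, hi) if f(mid) > 1 else (lo, mid)
    return hi

x, p = [3.0, -4.0, 1.0], 3

# phi(u) = u^p recovers the ordinary p-norm
approx = luxemburg(x, lambda u: u**p)
exact = sum(abs(xi)**p for xi in x)**(1 / p)
assert abs(approx - exact) < 1e-6
```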
14 / 21
Luxemburg norms — conic representability
- Observation. The epigraph of the ϕ-Luxemburg norm, t ≥ ‖x‖_ϕ, is conic representable if the perspective function of ϕ is.

  Proof. The constraints

      w_i ≥ |x_i|,  s_i ≥ t ϕ(w_i/t),  Σ_i s_i = t

  add up to 1 ≥ Σ_i ϕ(|x_i|/t) ⟺ t ≥ ‖x‖_ϕ.

- Corollary. We can compute with balls in Luxemburg norms for x^p, x·log(1 + x), exp(x) − 1.
15 / 21
Maximal inscribed cuboid
Find the maximal-volume axis-parallel cuboid inscribed in a given convex (conic-representable) set K ⊆ R^n:

    maximize    Σ_i log d_i
    s.t.        x + ε ∘ d ∈ K  for all ε ∈ {0, 1}^n,
                x, d ∈ R^n.
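In this formulation x is one corner of the cuboid and d its vector of edge lengths, so x + ε ∘ d ranges over all 2^n corners. A plain-Python sketch taking K to be the Euclidean unit ball for concreteness (the deck allows any conic-representable K), where the optimum is the cube with edge length 2/√n:

```python
import itertools, math

n = 3
d = [2 / math.sqrt(n)] * n      # edge lengths of the optimal cube in the unit ball
x = [-1 / math.sqrt(n)] * n     # "lower" corner of the cube

# all 2^n corners x + eps∘d must lie in K (here: the unit ball)
for eps in itertools.product([0, 1], repeat=n):
    corner = [xi + ei * di for xi, ei, di in zip(x, eps, d)]
    assert sum(c * c for c in corner) <= 1 + 1e-12

# maximizing sum(log d_i) maximizes the volume prod(d_i)
vol = math.prod(d)
assert abs(vol - (2 / math.sqrt(n))**n) < 1e-12
```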
16 / 21
GP — performance
[Plot: interior-point iterations per problem instance for three formulations: conic (3), GP primal (14), GP dual (2).]
17 / 21
LogExpCR — performance
Log-exponential convex risk measure (Vinel, Krokhmal, 2017):

    minimize    η + (1 − α)^{−1} f^{−1}( Σ_{j=1}^m p_j f(−r_j^T x − η) )
    s.t.        1^T x ≤ 1,
                Σ_j p_j r_j^T x ≥ r̄,
                x ∈ R^n, η ∈ R

- generalization of CVaR (Rockafellar, Uryasev, 2002),
- f — vanishing on R_−, f(0) = 0, convex on R_+. Here: f(u) = exp([u]_+) − 1.
- n — number of assets.
- m — number of historical scenarios r_1, . . . , r_m ∈ R^n with probabilities p_1, . . . , p_m.
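The objective above can be evaluated directly for a fixed portfolio. A plain-Python sketch with f(u) = exp([u]_+) − 1 as on the slide; the scenario returns and α are illustrative numbers, not data from the talk:

```python
import math

def f(u):
    # f(u) = exp([u]_+) - 1: vanishes on R_-, f(0) = 0, convex on R_+
    return math.exp(max(u, 0.0)) - 1.0

def finv(v):
    # inverse of f on R_+
    return math.log1p(v)

def logexp_cr(x, eta, scenarios, probs, alpha):
    # eta + (1 - alpha)^-1 * f^-1( sum_j p_j f(-r_j^T x - eta) )
    inner = sum(p * f(-sum(ri * xi for ri, xi in zip(r, x)) - eta)
                for r, p in zip(scenarios, probs))
    return eta + finv(inner) / (1 - alpha)

# two assets, three equally likely return scenarios (illustrative)
scenarios = [[0.05, 0.02], [-0.03, 0.01], [0.01, -0.04]]
probs = [1/3, 1/3, 1/3]
risk = logexp_cr([0.5, 0.5], eta=0.0,
                 scenarios=scenarios, probs=probs, alpha=0.9)
assert risk >= 0.0
assert f(-1.0) == 0.0 and f(0.0) == 0.0   # f vanishes on R_-
```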
18 / 21
LogExpCR — performance
Time in sec. (interior-point iterations), MOSEK version 8 vs. 9; "—" marks an entry missing in the source.

Easy instances:

    n     m     ver. 8       ver. 9
    200   100   0.08 (20)    0.05 (22)
    200   200   0.17 (21)    0.19 (25)
    200   500   0.91 (31)    0.35 (27)
    200   1000  4.08 (28)    0.57 (27)
    200   2000  3.32 (39)    0.99 (28)
    500   100   0.13 (20)    0.11 (23)
    500   200   0.28 (20)    0.36 (27)
    500   500   1.61 (34)    1.41 (31)
    500   1000  5.92 (29)    1.56 (30)
    500   2000  25.25 (34)   2.44 (30)
    1000  100   0.21 (22)    0.21 (29)
    1000  200   0.42 (20)    0.59 (30)
    1000  500   3.03 (34)    2.53 (31)
    1000  1000  9.43 (31)    6.87 (35)
    1000  2000  35.26 (32)   8.66 (32)
    1500  100   0.24 (18)    0.20 (23)
    1500  200   0.62 (20)    0.82 (31)
    1500  500   4.11 (35)    3.99 (33)
    1500  1000  16.39 (33)   10.42 (37)
    1500  2000  45.67 (31)   12.15 (34)

Numerically harder instances:

    n     m     ver. 8       ver. 9
    200   100   0.12 (23)    0.06 (29)
    200   200   0.42 (67)    0.29 (37)
    200   500   1.12 (43)    0.77 (59)
    200   1000  6.01 (51)    1.83 (71)
    200   2000  —            3.44 (87)
    500   100   —            0.09 (24)
    500   200   0.35 (27)    0.37 (31)
    500   500   —            2.08 (44)
    500   1000  8.12 (46)    4.45 (80)
    500   2000  —            5.84 (64)
    1000  100   0.31 (38)    0.13 (22)
    1000  200   0.51 (27)    0.58 (28)
    1000  500   3.66 (43)    3.23 (40)
    1000  1000  12.32 (44)   12.83 (66)
    1000  2000  —            16.78 (70)
    1500  100   0.31 (24)    0.18 (22)
    1500  200   2.08 (83)    0.70 (28)
    1500  500   —            6.04 (51)
    1500  1000  —            11.65 (42)
    1500  2000  73.21 (52)   24.77 (67)
19 / 21
Closing remarks
Software:
- CVXPY has a Kexp–capable MOSEK interface (Riley Murray).
- Also YALMIP.
- MOSEK Version 9 release this year.
Links:
- WWW www.mosek.com
- Demos github.com/MOSEK/Tutorials
- Blog themosekblog.blogspot.com/
- I found a bug! / MOSEK is too slow! support@mosek.com
- Twitter @mosektw
- Modeling Cookbook www.mosek.com/documentation
- Slides: www.mosek.com/resources/presentations
Reading:
- V. Chandrasekaran, P. Shah, Relative entropy optimization and its applications, Math. Program., Ser. A (2017) 161:1-32.
20 / 21
Thank you!
Smallest enclosing ball of a random point set in R2 in the (exp(x) − 1)–Luxemburg norm.
21 / 21