Shape-constrained regression and sum of squares polynomials
Georgina Hall INSEAD, Decision Sciences Joint work with Mihaela Curmei (Berkeley, EECS)
Shape-constrained regression (1/2)

Data: (x_i, y_i), i = 1, …, m, where x_i ∈ B ⊆ ℝ^n (B is a box) and y_i ∈ ℝ.

Goal: fit a polynomial ĝ_{m,d} of degree d to the data that minimizes Σ_{i=1}^m (y_i − g(x_i))² and that has certain constraints on its shape.
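Without the shape constraints, this is ordinary least squares in a monomial basis. A minimal univariate sketch (the data, degree, and all names are illustrative, not from the talk):

```python
import numpy as np

rng = np.random.default_rng(0)
m, d = 50, 3
x = rng.uniform(-1.0, 1.0, size=m)         # x_i in the box B = [-1, 1]
y = x**2 + 0.1 * rng.standard_normal(m)    # noisy samples of a convex function

# Design matrix of monomials 1, x, ..., x^d; least squares gives the coefficients.
V = np.vander(x, d + 1, increasing=True)
coeffs, *_ = np.linalg.lstsq(V, y, rcond=None)

residual = np.linalg.norm(V @ coeffs - y)
```

Nothing here enforces any shape of the fit; adding that is exactly what the rest of the talk is about.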
Shape-constrained regression (2/2)

Convexity over B: for a full-dimensional box B and a twice continuously differentiable function f,
f is convex over B ⟺ ∇²f(y) ⪰ 0 for all y ∈ B.

Monotonicity over B: for a continuously differentiable function f,
f is increasing (resp. decreasing) in component y_j ⟺ ∂f(y)/∂y_j ≥ 0 (resp. ≤ 0) for all y ∈ B.

Lipschitz with constant L: for any function f and a fixed scalar L > 0,
f is Lipschitz with constant L ⟺ |f(y) − f(z)| ≤ L·‖y − z‖ for all y, z ∈ B.
Used as a regularizer: it stops f from growing too steeply.

We focus on convex regression here.
A candidate for our regressor:

ĝ_{m,d}(y) ∈ arg min_p Σ_{i=1}^m (y_i − p(x_i))²
s.t. p is a polynomial of degree d
     ∇²p(y) ⪰ 0 for all y ∈ B

But… Theorem [Ahmadi, H.]: it is (strongly) NP-hard to test whether a polynomial p of degree ≥ 3 is convex over a box B. (Reduction from the problem of testing whether a matrix whose entries are affine polynomials in y is positive semidefinite for all y in B.)
p(y) convex ⟺ ∇²p(y) ⪰ 0 for all y ∈ B ⟺ z^T ∇²p(y) z ≥ 0 for all y ∈ B and all z ∈ ℝ^n,

i.e., nonnegativity of a polynomial in y and z. If we can impose this nonnegativity condition tractably, then we are in business! (Testing nonnegativity of a polynomial is itself NP-hard for degree of p ≥ 4.)
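The reformulation above can be checked symbolically: z^T ∇²p(y) z is itself a polynomial in the joint variables (y, z), quadratic in z. A small sympy sketch (the polynomial p is an illustrative choice, not one from the talk):

```python
import sympy as sp

y1, y2, z1, z2 = sp.symbols('y1 y2 z1 z2')
p = y1**4 + y1**2 * y2**2 + y2**4      # an illustrative polynomial in y

H = sp.hessian(p, (y1, y2))            # 2x2 Hessian matrix in y
z = sp.Matrix([z1, z2])
q = sp.expand((z.T * H * z)[0])        # z^T H(y) z: a polynomial in (y, z)

# q is quadratic in z and of degree deg(p) - 2 in y
print(sp.Poly(q, z1, z2).total_degree())   # prints 2
```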
Idea: find a property that implies nonnegativity but that is easy to test: being a sum of squares (sos).

Definition: a polynomial p is sos if it can be written as p(y) = Σ_i q_i(y)².

Does sos imply nonnegative? Yes! The two sets are even equal sometimes: n = 1, d = 2, (n, d) = (2, 4) [Hilbert]. Is it easy to test whether a polynomial is sos? Also yes! Let's see why.
A polynomial p(y) of degree 2d is sos if and only if there exists Q ⪰ 0 such that p(y) = z(y)^T Q z(y), where z(y) = (1, y₁, …, y_n, y₁y₂, …, y_n^d)^T is the vector of monomials of degree up to d.

Example:

p(y) = y₁⁴ − 6y₁³y₂ + 2y₁³y₃ + 6y₁²y₃² + 9y₁²y₂² − 6y₁²y₂y₃ − 14y₁y₂y₃² + 4y₁y₃³ + 5y₃⁴ − 7y₂²y₃² + 16y₂⁴
     = (y₁² − 3y₁y₂ + y₁y₃ + 2y₃²)² + (y₁y₃ − y₂y₃)² + (4y₂² − y₃²)²

Equivalently, with z = (y₁², y₁y₂, y₁y₃, y₂², y₂y₃, y₃²)^T,

p(y) = z^T Q z, where

Q = [  1  −3   1   0   0   2
      −3   9  −3   0   0  −6
       1  −3   2   0  −1   2
       0   0   0  16   0  −4
       0   0  −1   0   1   0
       2  −6   2  −4   0   5 ]  ⪰ 0.
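The decomposition in this example can be verified by expanding the three squares and comparing against p; a sympy sketch (sympy assumed available):

```python
import sympy as sp

y1, y2, y3 = sp.symbols('y1 y2 y3')

# The polynomial p from the slide
p = (y1**4 - 6*y1**3*y2 + 2*y1**3*y3 + 6*y1**2*y3**2 + 9*y1**2*y2**2
     - 6*y1**2*y2*y3 - 14*y1*y2*y3**2 + 4*y1*y3**3 + 5*y3**4
     - 7*y2**2*y3**2 + 16*y2**4)

# Its claimed sum-of-squares decomposition
sos = ((y1**2 - 3*y1*y2 + y1*y3 + 2*y3**2)**2
       + (y1*y3 - y2*y3)**2
       + (4*y2**2 - y3**2)**2)

print(sp.expand(p - sos) == 0)   # True: the two expressions agree
```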
Optimizing over sos polynomials (of a fixed degree) is an SDP.

min_p 0
s.t. p(y) = z(y)^T Q z(y), Q ⪰ 0

Matching coefficients in p(y) = z(y)^T Q z(y) gives linear equations involving the coefficients of p and the entries of Q.

Example:

min_{c₁,c₂} c₁ + c₂
s.t. c₁ − 3c₂ = 4
     c₁y₁² − 2c₂y₁y₂ + 5y₂⁴ sos

⟺

min_{c₁,c₂,Q} c₁ + c₂
s.t. c₁ − 3c₂ = 4
     c₁y₁² − 2c₂y₁y₂ + 5y₂⁴ = z(y)^T Q z(y)
     Q ⪰ 0
p(y) convex ⟺ z^T ∇²p(y) z ≥ 0 for all y ∈ B and all z ∈ ℝ^n ⟺ ∇²p(y) ⪰ 0 for all y ∈ B.

Theorem [Putinar '93]: for a box B = {(y₁, …, y_n) | l₁ ≤ y₁ ≤ u₁, …, l_n ≤ y_n ≤ u_n}, we write instead

z^T ∇²p(y) z = σ₀(y, z) + σ₁(y, z)(u₁ − y₁)(y₁ − l₁) + ⋯ + σ_n(y, z)(u_n − y_n)(y_n − l_n),

where σ₀(y, z), σ₁(y, z), …, σ_n(y, z) are sos polynomials in y and z.
A new candidate for the regressor:

ḡ_{m,d,r}(y) ∈ arg min_{p, σ₀, …, σ_n} Σ_{i=1}^m (y_i − p(x_i))²
s.t. p is a polynomial of degree d
     z^T ∇²p(y) z = σ₀(y, z) + σ₁(y, z)(u₁ − y₁)(y₁ − l₁) + ⋯ + σ_n(y, z)(u_n − y_n)(y_n − l_n)
     σ₀(y, z), σ₁(y, z), …, σ_n(y, z) are sos of degree r in y (and 2 in z)

Unlike ĝ_{m,d}, this regressor can be computed via semidefinite programming.
Our method vs. the existing method [Lim & Glynn, Seijo & Sen]:

Our method: a polynomial estimator; the size of the program grows polynomially in the number of features; a semidefinite program; can handle monotonicity constraints and Lipschitz constraints.

Existing method: a piecewise-affine estimator; the size of the program grows linearly (resp. quadratically) with the number of datapoints; a quadratic program; can handle monotonicity constraints (see [Lim & Glynn]) and Lipschitz constraints (see [Mazumder et al.]); known to be consistent (see [Mazumder et al.]).

What about ours?
Theorem [Curmei, H.]: the regressor ḡ_{m,d,r} is a consistent estimator of f over any compact D ⊆ B, i.e.,

sup_{y ∈ D} |ḡ_{m,d,r}(y) − f(y)| → 0 a.s. when m, d, r → ∞.

Assumptions on the data:
- The x_i are iid, with support B, and E‖x_i‖² < ∞.
- y_i = f(x_i) + ν_i for i = 1, …, m, with E[ν_i | x_i] = 0 a.s. and E[ν_i²] < ∞.
- f is twice continuously differentiable and convex over B.
Proof ideas: inspired by [Lim and Glynn, OR '12].

|f(y) − ḡ_{m,d,r}(y)| ≤ |f(y) − ĝ_{m,d}(y)| + |ĝ_{m,d}(y) − ḡ_{m,d,r}(y)|

Fix ε > 0 and take a convex polynomial p_ε of degree d such that sup_{y ∈ D} |f(y) − p_ε(y)| < ε.

One can show that sup_{y ∈ D} |ĝ_{m,d}(y) − ḡ_{m,d,r}(y)| → 0 when r → ∞.

For the first term,

|f(y) − ĝ_{m,d}(y)| ≤ |f(y) − p_ε(y)| + |p_ε(y) − p_ε(x_i)| + |p_ε(x_i) − ĝ_{m,d}(x_i)| + |ĝ_{m,d}(x_i) − ĝ_{m,d}(y)|,

so it remains to show that

(1/m) Σ_{i=1}^m (p_ε(x_i) − ĝ_{m,d}(x_i))² → 0 a.s. when m → ∞.
To show (1/m) Σ_{i=1}^m (p_ε(x_i) − ĝ_{m,d}(x_i))² → 0 a.s. when m → ∞:

- Show that p_ε is Lipschitz (use convexity of p_ε over B).
- Show that ĝ_{m,d} is Lipschitz (uniformly in m): bound |ĝ_{m,d}| over D uniformly in m and use convexity.
- Upper bound (algebra): use that ĝ_{m,d} is a minimizer of Σ_i (y_i − p(x_i))² to obtain

  (1/m) Σ_{i=1}^m (p_ε(x_i) − ĝ_{m,d}(x_i))² ≤ (2/m) Σ_i (y_i − p_ε(x_i)) · (ĝ_{m,d}(x_i) − p_ε(x_i)).

- Difficulty: ĝ_{m,d} is a polynomial that depends on the x_i and y_i, so the law of large numbers does not apply directly. Replace ĝ_{m,d} by a deterministic function which is bounded over D: for large enough m, ĝ_{m,d} belongs to a compact set whose elements are bounded over D; approximate ĝ_{m,d} by an element of this set which is ε-close and bounded over D.
Synthetic experiments: data generated as y_i = f(x_i) + c · σ̄(f) · ν_i, where ν_i is random noise, c controls the noise level, and σ̄(f) is the (empirical) standard deviation of (f(x₁), …, f(x_m)).
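A sketch of this data-generating process (the function f, the noise level c, and the noise distribution are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)
m, c = 200, 0.2

def f(x):
    return np.sum(x**2, axis=1)        # an illustrative convex function

x = rng.uniform(-1.0, 1.0, size=(m, 2))
fx = f(x)
sigma_bar = fx.std()                   # empirical std of (f(x_1), ..., f(x_m))
nu = rng.standard_normal(m)            # illustrative noise distribution
y = fx + c * sigma_bar * nu            # noise scaled to the signal's spread
```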
Application: predicting production output from labor (L), capital (K) and intermediate goods (I) across industries.
Benchmark: Cobb-Douglas functions, out = a · L^b · K^c · I^d with a > 0, b, c, d > 0, and b + c + d ≤ 1 ⇒ concave + monotone. Fit in log-space: linear regression.

Our method (with shape constraints) outperforms Cobb-Douglas on 50/65 industries.
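The Cobb-Douglas benchmark is linear in log-space: log(out) = log a + b·log L + c·log K + d·log I. A sketch on synthetic data (the "true" parameters are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
m = 500
L, K, I = rng.uniform(1.0, 10.0, size=(3, m))   # labor, capital, intermediates
a, b, c, d = 2.0, 0.3, 0.4, 0.2                 # illustrative true parameters
out = a * L**b * K**c * I**d * np.exp(0.01 * rng.standard_normal(m))

# Linear regression in log-space recovers (log a, b, c, d).
X = np.column_stack([np.ones(m), np.log(L), np.log(K), np.log(I)])
coef, *_ = np.linalg.lstsq(X, np.log(out), rcond=None)
```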