CS6501: Topics in Learning and Game Theory
(Fall 2019)
Linear Programming
Instructor: Haifeng Xu
Slides for this lecture are adapted from those of Shaddin Dughmi at https://www-bcf.usc.edu/~shaddin/cs675sp18/index.html
Outline
- Linear Programming Basics
- Dual Program of LP and Its Properties
- The task of selecting the best configuration from a “feasible” set, so as to
  minimize (or maximize) $g(y)$ subject to $y \in Y$
- Example 1: minimize $y^2$ subject to $y \in [-1, 1]$
- Example 2: pick a road to school
- A problem can be solved in polynomial time if there exists an algorithm that solves the problem in time polynomial in its input size
- Why care about polynomial time? Why not quadratic or linear?
  - Polynomial time marks the boundary between easy and difficult problems
  - Once a problem is known to be polynomial-time solvable, the polynomial degree can usually be reduced quickly to a small one (e.g., solving LPs)
- In algorithm analysis, a significant chunk of research is devoted to studying the complexity of a problem, either by showing that it is polynomial-time solvable or by giving evidence that it is not (e.g., NP-hardness)
$$\text{minimize (or maximize)} \quad g(y) \quad \text{subject to} \quad y \in Y$$
- Difficult to solve without any assumptions on $g(y)$ and $Y$
- A ubiquitous and well-understood case is the linear program
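$$
\begin{aligned}
\text{minimize (or maximize)} \quad & d^\top y \\
\text{subject to} \quad & b_j \cdot y \le c_j, && \forall j \in D_1 \\
& b_j \cdot y \ge c_j, && \forall j \in D_2 \\
& b_j \cdot y = c_j, && \forall j \in D_3
\end{aligned}
$$
- Decision variable: $y \in \mathbb{R}^o$
- Parameters: objective vector $d \in \mathbb{R}^o$; constraint data $b_j \in \mathbb{R}^o$ and $c_j \in \mathbb{R}$ for each constraint $j$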
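Standard form:
$$
\begin{aligned}
\text{maximize} \quad & d^\top y \\
\text{subject to} \quad & b_j \cdot y \le c_j, && \forall j = 1, \dots, n \\
& y_k \ge 0, && \forall k = 1, \dots, o
\end{aligned}
$$
Every LP can be converted to this form:
- minimize $d^\top y$ $\iff$ maximize $-d^\top y$
- $b_j \cdot y \ge c_j \iff -b_j \cdot y \le -c_j$
- $b_j \cdot y = c_j \iff b_j \cdot y \le c_j$ and $-b_j \cdot y \le -c_j$
- Any unconstrained $y_k$ can be replaced by $y_k^+ - y_k^-$ with $y_k^+, y_k^- \ge 0$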
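The four rewriting rules above can be applied mechanically. The following is a minimal sketch, assuming a hypothetical input format in which each constraint is a triple `(b, op, c)`; the function name `to_standard_form` and the example LP are illustrative and not part of the lecture.

```python
from fractions import Fraction


def to_standard_form(sense, d, constraints, free_vars=()):
    """Rewrite an LP as: maximize d_std . y  s.t.  B y <= c,  y >= 0.

    sense       -- "max" or "min"
    d           -- objective coefficients, one per variable
    constraints -- list of (b, op, c) with op in {"<=", ">=", "=="}
    free_vars   -- indices of variables with no sign constraint
    Returns (d_std, B, c, flipped); `flipped` is True when the objective
    was negated, so the original optimum is minus the standard-form optimum.
    """
    d = [Fraction(v) for v in d]
    flipped = sense == "min"
    if flipped:                      # minimize d.y  <=>  maximize -d.y
        d = [-v for v in d]

    rows, rhs = [], []
    for b, op, c in constraints:
        b = [Fraction(v) for v in b]
        c = Fraction(c)
        if op in ("<=", "=="):       # keep b.y <= c
            rows.append(b)
            rhs.append(c)
        if op in (">=", "=="):       # b.y >= c  <=>  -b.y <= -c
            rows.append([-v for v in b])
            rhs.append(-c)

    # Split each free variable y_k into y_k^+ - y_k^-: append a negated
    # copy of column k, which plays the role of y_k^-.
    for k in free_vars:
        d.append(-d[k])
        for row in rows:
            row.append(-row[k])
    return d, rows, rhs, flipped


# Hypothetical LP: minimize y0 + 2*y1  s.t.  y0 + y1 >= 1,  y0 - y1 == 0,
# with y0 >= 0 and y1 free.
d_std, B, c, flipped = to_standard_form(
    "min", [1, 2], [([1, 1], ">=", 1), ([1, -1], "==", 0)], free_vars=[1]
)
```

In this sketch an equality contributes two rows and a free variable contributes one extra column, so the standard-form LP can be larger than the original, but only by a constant factor.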
[Figure: geometric view of an LP, showing a constraint hyperplane $b_j \cdot y = c_j$, the objective direction $d$, and a level set $d \cdot y = w$]
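- $o$ products, $n$ raw materials
- Every unit of product $k$ uses $b_{jk}$ units of raw material $j$
- There are $c_j$ units of material $j$ available
- Product $k$ yields profit $d_k$ per unit
- Factory wants to maximize profit subject to available raw materials
$$
\begin{aligned}
\text{maximize} \quad & d^\top y \\
\text{subject to} \quad & b_j \cdot y \le c_j, && \forall j = 1, \dots, n \\
& y_k \ge 0, && \forall k = 1, \dots, o
\end{aligned}
$$
where variable $y_k$ = number of units of product $k$, $b_j = (b_{j1}, \dots, b_{jo})$, $k$ is the product index, and $j$ is the material index.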
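To make the production model concrete, here is a small sketch that checks whether a given plan is feasible and computes its profit. The function name `evaluate_plan` and all numbers are hypothetical, chosen only for illustration.

```python
def evaluate_plan(profit, usage, stock, plan):
    """Return (feasible, total_profit) for a production plan.

    profit[k]   -- profit d_k per unit of product k
    usage[j][k] -- units b_jk of raw material j used per unit of product k
    stock[j]    -- available units c_j of raw material j
    plan[k]     -- units y_k of product k to produce
    """
    nonnegative = all(y >= 0 for y in plan)
    within_stock = all(
        sum(b_jk * y_k for b_jk, y_k in zip(row, plan)) <= c_j
        for row, c_j in zip(usage, stock)
    )
    total_profit = sum(d_k * y_k for d_k, y_k in zip(profit, plan))
    return nonnegative and within_stock, total_profit


# Hypothetical factory: 3 products, 2 raw materials.
profit = [3, 2, 4]
usage = [[1, 1, 2],   # material 0
         [2, 0, 1]]   # material 1
stock = [10, 8]

print(evaluate_plan(profit, usage, stock, [2, 4, 2]))  # uses 10 and 6 units
print(evaluate_plan(profit, usage, stock, [4, 0, 4]))  # material 0 needs 12 > 10
```

The LP asks for the feasible plan with the largest profit; this sketch only evaluates one candidate plan at a time.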
- Hyperplane: The region defined by a linear equality $b_j \cdot y = c_j$
- Halfspace: The region defined by a linear inequality $b_j \cdot y \le c_j$
- Polyhedron: The region defined by finitely many linear inequalities, i.e., an intersection of halfspaces
- Polytope: Bounded polyhedron
- Vertex: A point $y \in Q$ is a vertex of polyhedron $Q$ if $\nexists\, z \neq 0$ with $y + z \in Q$ and $y - z \in Q$
[Figure: a polytope with a red point marking a vertex and a blue point marking a non-vertex]
Convex set: A set $T$ is convex if $\forall y, z \in T$ and $\forall q \in [0,1]$, we have $q \cdot y + (1 - q) \cdot z \in T$
[Figure: a convex set and a non-convex set]
- Inherently related to convex functions
Convex hull: the convex hull of points $y_1, \dots, y_o$ is
$$\mathrm{convhull}(y_1, \dots, y_o) = \Big\{ \sum_{j=1}^{o} q_j y_j \;:\; q \in \mathbb{R}^o_{+} \ \text{s.t.}\ \sum_{j=1}^{o} q_j = 1 \Big\}$$
That is, $\mathrm{convhull}(y_1, \dots, y_o)$ includes all points that can be written as the expectation of $y_1, \dots, y_o$ under some distribution $q$.
- Any polytope (i.e., a bounded polyhedron) is the convex hull of a finite set of points
[Figure: geometric visualization of a convex hull]
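The "expectation under a distribution" reading of the convex hull can be sketched in a few lines. The helper name `convex_combination` and the unit-square example are assumptions made for illustration.

```python
from fractions import Fraction


def convex_combination(points, weights):
    """Return sum_j q_j * y_j, after checking that q is a distribution."""
    if len(points) != len(weights):
        raise ValueError("need exactly one weight per point")
    if any(q < 0 for q in weights) or sum(weights) != 1:
        raise ValueError("weights must be non-negative and sum to 1")
    dim = len(points[0])
    return tuple(
        sum(q * p[i] for q, p in zip(weights, points)) for i in range(dim)
    )


# Hypothetical example: corners of the unit square, uniform distribution.
corners = [(0, 0), (1, 0), (1, 1), (0, 1)]
uniform = [Fraction(1, 4)] * 4
center = convex_combination(corners, uniform)   # the centre of the square
```

Exact fractions are used so that the check `sum(weights) != 1` is not affected by floating-point rounding.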
Fact: The feasible region of any LP (a polyhedron) is a convex set. All possible objective values form an interval (possibly unbounded).
Note: intervals are the only convex sets in $\mathbb{R}$
[Figure: two level sets of the objective, $d \cdot y = w_1$ and $d \cdot y = w_2$, crossing the feasible region]
Any $w \in [w_1, w_2]$ must also be a possible objective value
Fact: The set of optimal solutions of any LP is a convex set.
- It is the intersection of the feasible region and the hyperplane $d^\top y = \mathrm{OPT}$, where $\mathrm{OPT}$ is the optimal objective value
Fact: At a vertex, $o$ linearly independent constraints are satisfied with equality (a.k.a. tight).
Formal proofs: homework exercise
Fact: An LP either has an optimal solution, or is unbounded or infeasible
[Figures: 2-D illustrations of this fact, each drawn with the objective direction $d$]
Theorem: If an LP in standard form has an optimal solution, then it has a vertex optimal solution.
Proof
- Assume not, and take a non-vertex optimal solution $\bar{y}$ with the maximum number of tight constraints
- There is $z \neq 0$ s.t. $\bar{y} \pm z$ are feasible
- $z$ is orthogonal to the objective function and to all tight constraints at $\bar{y}$, i.e., $d^\top z = 0$, and $b_j^\top z = 0$ whenever the $j$'th constraint is tight for $\bar{y}$
  a) Argument for $b_j^\top z = 0$:
     - $\bar{y} \pm z$ feasible $\Rightarrow b_j^\top (\bar{y} \pm z) \le c_j$
     - $\bar{y}$ is tight at constraint $j$ $\Rightarrow b_j^\top \bar{y} = c_j$
     - Together: $b_j^\top (\pm z) \le 0 \Rightarrow b_j^\top z = 0$
  b) Similarly, $\bar{y}$ optimal implies $d^\top (\bar{y} \pm z) \le d^\top \bar{y} \Rightarrow d^\top z = 0$
Proof (continued)
- Can choose $z$ s.t. $z_k < 0$ for some $k$ (replace $z$ by $-z$ if necessary)
- Let $\beta$ be the largest constant such that $\bar{y} + \beta z$ is feasible
  - Such a $\beta$ exists: $\bar{y}_k + \beta z_k \ge 0$ with $z_k < 0$ bounds $\beta$ from above
  - $\bar{y} + \beta z$ is still optimal because $d^\top z = 0$, and every constraint tight at $\bar{y}$ stays tight because $b_j^\top z = 0$
- An additional constraint becomes tight at $\bar{y} + \beta z$, contradicting the choice of $\bar{y}$
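The theorem suggests a brute-force method for tiny LPs: enumerate the points where two constraints are tight and keep the best feasible one. The sketch below does this for two variables only, and assumes the LP has an optimal solution. The function name `best_vertex_2d` and the example LP are hypothetical. This is an illustration of the theorem, not a practical algorithm.

```python
from fractions import Fraction
from itertools import combinations


def best_vertex_2d(d, B, c):
    """Maximize d.y s.t. B y <= c, y >= 0 over y in R^2 by vertex enumeration.

    Assumes the LP has an optimal solution, so that some vertex is optimal.
    Returns (value, vertex), or None if no feasible vertex exists.
    """
    # Write the sign constraints y_k >= 0 as -y_k <= 0 so all rows look alike.
    rows = [tuple(map(Fraction, b)) for b in B]
    rows += [(Fraction(-1), Fraction(0)), (Fraction(0), Fraction(-1))]
    rhs = [Fraction(v) for v in c] + [Fraction(0), Fraction(0)]

    best = None
    for i, j in combinations(range(len(rows)), 2):
        (a1, a2), (b1, b2) = rows[i], rows[j]
        det = a1 * b2 - a2 * b1
        if det == 0:                 # parallel constraints: no unique point
            continue
        # Cramer's rule: the point where constraints i and j are both tight.
        y = ((rhs[i] * b2 - a2 * rhs[j]) / det,
             (a1 * rhs[j] - rhs[i] * b1) / det)
        if all(r[0] * y[0] + r[1] * y[1] <= t for r, t in zip(rows, rhs)):
            value = d[0] * y[0] + d[1] * y[1]
            if best is None or value > best[0]:
                best = (value, y)
    return best


# Hypothetical LP: max y1 + y2  s.t.  y1 + 2*y2 <= 2,  2*y1 + y2 <= 2,  y >= 0.
result = best_vertex_2d((1, 1), [[1, 2], [2, 1]], [2, 2])
```

With $o$ variables the same idea would need every choice of $o$ tight constraints, which grows exponentially; this is why the enumeration is only a teaching device.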
Theorem: If an LP in standard form has an optimal solution, then it has a vertex optimal solution.
Corollary [counting non-zero variables]: If an LP in standard form has an optimal solution, then there is an optimal solution with at most $n$ non-zero variables.
$$
\begin{aligned}
\text{maximize} \quad & d^\top y \\
\text{subject to} \quad & b_j \cdot y \le c_j, && \forall j = 1, \dots, n \\
& y_k \ge 0, && \forall k = 1, \dots, o
\end{aligned}
$$
- Meaningful when $n < o$
- E.g., for optimal production with $o = 10$ products and $n = 3$ raw materials, there is an optimal plan using at most 3 products.
Theorem: Any linear program with $o$ variables and $n$ constraints can be solved in $\mathrm{poly}(n, o)$ time.
- The original proof gives an algorithm with a very high polynomial degree
- Now, the fastest algorithm with a guarantee takes $\min(o, n) \cdot U$ time, where $U$ is the time to solve a system of linear equations of the same size
- In practice, the simplex algorithm runs extremely fast, though in the (extremely rare) worst case it still takes exponential time
- We will not cover these algorithms; instead, we use them as building blocks to solve other problems
- Linear programming is the forefather of convex optimization problems, and the most ubiquitous.
- Developed by Kantorovich during World War II (1939) for planning the Soviet army's expenditures and returns. Kept secret.
- Discovered independently a few years later by George Dantzig, who in 1947 developed the simplex method for solving linear programs
- John von Neumann developed LP duality in 1947, and applied it to game theory
- Polynomial-time algorithms: ellipsoid method (Khachiyan 1979), interior point methods (Karmarkar 1984).
Outline
- Linear Programming Basics
- Dual Program of LP and Its Properties
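- $z_j$ is the dual variable corresponding to primal constraint $b_j^\top y \le$ (or $=$) $c_j$
- $\bar{b}_k^\top z \ge$ (or $=$) $d_k$ is the dual constraint corresponding to primal variable $y_k$

Primal LP
$$
\begin{aligned}
\max \quad & d^\top y \\
\text{s.t.} \quad & b_j^\top y \le c_j, && \forall j \in D_1 \\
& b_j^\top y = c_j, && \forall j \in D_2 \\
& y_k \ge 0, && \forall k \in E_1 \\
& y_k \in \mathbb{R}, && \forall k \in E_2
\end{aligned}
$$
Dual LP
$$
\begin{aligned}
\min \quad & c^\top z \\
\text{s.t.} \quad & \bar{b}_k^\top z \ge d_k, && \forall k \in E_1 \\
& \bar{b}_k^\top z = d_k, && \forall k \in E_2 \\
& z_j \ge 0, && \forall j \in D_1 \\
& z_j \in \mathbb{R}, && \forall j \in D_2
\end{aligned}
$$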
In the primal-dual pair above, $b_j^\top$ is the $j$-th row of the constraint matrix and $\bar{b}_k$ is its $k$-th column.
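- $d \in \mathbb{R}^o$, $B \in \mathbb{R}^{n \times o}$, $c \in \mathbb{R}^n$
- $z_j$ is the dual variable corresponding to primal constraint $B_j y \le c_j$, where $B_j$ is the $j$-th row of $B$
- $B_k^\top z \ge d_k$ is the dual constraint corresponding to primal variable $y_k$, where $B_k$ is the $k$-th column of $B$

Primal LP
$$
\begin{aligned}
\max \quad & d^\top y \\
\text{s.t.} \quad & B y \le c \\
& y \ge 0
\end{aligned}
$$
Dual LP
$$
\begin{aligned}
\min \quad & c^\top z \\
\text{s.t.} \quad & B^\top z \ge d \\
& z \ge 0
\end{aligned}
$$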
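In matrix form, writing down the dual amounts to transposing the constraint matrix and swapping the roles of the objective and the right-hand side. A minimal sketch follows. The function name `dual_of_standard_form` and the numbers are hypothetical.

```python
def dual_of_standard_form(d, B, c):
    """Given  max d.y  s.t.  B y <= c, y >= 0,
    return (c, B_T, d) describing  min c.z  s.t.  B_T z >= d, z >= 0."""
    B_T = [list(column) for column in zip(*B)]   # transpose: columns become rows
    return list(c), B_T, list(d)


# Hypothetical primal with 3 variables (o = 3) and 2 constraints (n = 2).
d = [3, 2, 4]
B = [[1, 1, 2],
     [2, 0, 1]]
c = [10, 8]

dual_objective, dual_rows, dual_rhs = dual_of_standard_form(d, B, c)
for row, rhs in zip(dual_rows, dual_rhs):
    print(" + ".join(f"{b}*z{j}" for j, b in enumerate(row)), ">=", rhs)
```

The dual has one variable per primal constraint (2 here) and one constraint per primal variable (3 here).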
Interpretation 1: Economic Interpretation
Recall the optimal production problem
- $o$ products, $n$ raw materials
- Every unit of product $k$ uses $b_{jk}$ units of raw material $j$
- There are $c_j$ units of material $j$ available
- Product $k$ yields profit $d_k$ per unit
- Factory wants to maximize profit subject to available raw materials
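Dual LP corresponds to the buyer's optimization problem, as follows:
- Buyer wants to directly buy the raw material
- Dual variable $z_j$ is the buyer's proposed price per unit of raw material $j$
- Dual price vector is feasible if the factory is incentivized to sell materials: for every product $k$, the materials needed for one unit are worth at least its profit, $\sum_{j} b_{jk} z_j \ge d_k$
- Buyer wants to spend as little as possible to buy raw materials

Primal LP
$$
\begin{aligned}
\max \quad & d^\top y \\
\text{s.t.} \quad & \sum_{k=1}^{o} b_{jk}\, y_k \le c_j, && \forall j \in [n] \\
& y_k \ge 0, && \forall k \in [o]
\end{aligned}
$$
Dual LP
$$
\begin{aligned}
\min \quad & c^\top z \\
\text{s.t.} \quad & \sum_{j=1}^{n} b_{jk}\, z_j \ge d_k, && \forall k \in [o] \\
& z_j \ge 0, && \forall j \in [n]
\end{aligned}
$$
($k$: product index, $j$: material index)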
In this primal-dual pair, the primal variables $y_k$ are units of products and the dual variables $z_j$ are prices of materials.
Interpretation 2: Finding the Best Upper Bound
- Consider the simple LP from the previous 2-D example
- We found that the optimal solution was at $(\tfrac{2}{3}, \tfrac{2}{3})$ with an optimal value of $\tfrac{4}{3}$.
- What if, instead of finding the optimal solution, we sought to find an upper bound on its value by combining inequalities?
- Multiplying each row $j$ by $z_j$ and summing gives the inequality $z^\top B y \le z^\top c$ (now we see why $z_j \ge 0$ when $b_j y \le c_j$ but $z_j \in \mathbb{R}$ when $b_j y = c_j$)
- When $d^\top \le z^\top B$, the right-hand side of the inequality is an upper bound on $d^\top y$ for every feasible $y$, because $d^\top y \le z^\top B y \le z^\top c$ (the first step uses $y \ge 0$)
- The dual LP can be interpreted as finding the best upper bound on the primal that can be achieved this way.

Primal LP
$$
\begin{aligned}
\max \quad & d^\top y \\
\text{s.t.} \quad & B y \le c \\
& y \ge 0
\end{aligned}
$$
Dual LP
$$
\begin{aligned}
\min \quad & c^\top z \\
\text{s.t.} \quad & B^\top z \ge d \\
& z \ge 0
\end{aligned}
$$
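The upper-bound argument can be checked numerically: any multipliers $z \ge 0$ with $B^\top z \ge d$ certify the bound $z^\top c$. The sketch below assumes a hypothetical 2-D LP chosen for illustration (not necessarily the lecture's example) and a hypothetical helper `upper_bound_from_multipliers`.

```python
from fractions import Fraction


def upper_bound_from_multipliers(d, B, c, z):
    """Return z.c if z certifies an upper bound on max d.y, else None.

    z certifies a bound when z >= 0 and, for every variable k,
    sum_j z_j * B[j][k] >= d[k]  (i.e. B^T z >= d).
    """
    if any(z_j < 0 for z_j in z):
        return None
    combined = [sum(z_j * row[k] for z_j, row in zip(z, B))
                for k in range(len(d))]          # the row vector z^T B
    if any(lhs < d_k for lhs, d_k in zip(combined, d)):
        return None
    return sum(z_j * c_j for z_j, c_j in zip(z, c))


# Hypothetical 2-D LP: max y1 + y2  s.t.  y1 + 2*y2 <= 2,  2*y1 + y2 <= 2,  y >= 0.
d, B, c = [1, 1], [[1, 2], [2, 1]], [2, 2]

third = Fraction(1, 3)
loose = upper_bound_from_multipliers(d, B, c, [1, 0])          # valid but loose
tight = upper_bound_from_multipliers(d, B, c, [third, third])  # equals the value at y = (2/3, 2/3)
invalid = upper_bound_from_multipliers(d, B, c, [0, 0])        # B^T z >= d fails
print(loose, tight, invalid)
```

For this LP the point $y = (\tfrac{2}{3}, \tfrac{2}{3})$ is feasible with value $\tfrac{4}{3}$, so a certificate that yields the bound $\tfrac{4}{3}$ cannot be improved.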
- Duality is an inversion
Fact: Given any primal LP, the dual of its dual is itself.
Proof: homework exercise
Haifeng Xu
University of Virginia hx4ad@virginia.edu