SLIDE 1

Optimization Problems

LING 572 Advanced Statistical Methods for NLP January 28, 2020

SLIDE 2

Announcements

  • HW2 grades posted: 85.7 avg
  • Historically the hardest / most time-consuming assignment of 572
    ▪ [NB: we accept +/- 5% differences from our results on test data]
  • Format moving forward (no points this time; will update the specs):
    ▪ Each line needs to have the specified format
    ▪ Blocks for the classes need to be in the same order as the example file (e.g. talk.politics.guns, then talk.politics.mideast, then talk.politics.misc)
    ▪ Include the same commented lines as the example files

SLIDE 3

Performance

  • A fair number of people struggled to get reasonable performance
  • Depth + recursion = explosion
  • General lesson: think about what repeated operations you will be doing a lot, and choose data structures that do those efficiently
    ▪ e.g. dicts/sets are hash tables, so lookup and insertion are very efficient (O(1) avg)
    ▪ Time complexity of the built-in datatypes: https://wiki.python.org/moin/TimeComplexity
    ▪ Common issue: pandas DataFrames can be slow for repeated element-wise access
  • To the quick live demo!
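The hash-table point can be illustrated with a small micro-benchmark (a hypothetical sketch, not from the slides): membership tests against a set are far faster than against a list once the collection is large.

```python
import timeit

# Build a large collection once; look up its worst-case element (the last one).
data_list = list(range(100_000))
data_set = set(data_list)          # same elements, hash-table backed

# list membership scans elements one by one: O(n) per lookup.
list_time = timeit.timeit(lambda: 99_999 in data_list, number=100)

# set membership hashes the key: O(1) average per lookup.
set_time = timeit.timeit(lambda: 99_999 in data_set, number=100)

print(f"list: {list_time:.5f}s  set: {set_time:.5f}s")
```

The gap widens as the collection grows, which is exactly the "repeated operation" cost that dominates in a deep recursive loop.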

SLIDE 4

Linguistics Twitter Field Day

SLIDE 5

Optimization

SLIDE 6

What is an optimization problem?

  • The problem of finding the best solution from all feasible solutions.
  • Given a function f : X → ℝ, find x0 ∈ X that optimizes f.
  • f is called
    ▪ an objective function,
    ▪ a loss function or cost function (minimization), or
    ▪ a utility function or fitness function (maximization), etc.
  • X is an n-dimensional vector space:
    ▪ discrete (possible values are countable): combinatorial optimization problem
    ▪ continuous: e.g., constrained problems

SLIDE 7

Components of each optimization problem

  • Decision variables X: describe the choices that are under our control.
  • We normally use n to represent the number of decision variables, and x_i to represent the i-th decision variable.
  • Objective function f: the function we wish to optimize.
  • Constraints: describe the limitations that restrict our choices for the decision variables.

SLIDE 8

Standard form of a continuous optimization problem
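The formula image on this slide did not survive extraction; the standard form is conventionally written as follows (using f for the objective, g_i for inequality constraints, and h_j for equality constraints; maximization is handled by negating f):

```latex
\begin{aligned}
\min_{x \in \mathbb{R}^n} \quad & f(x) \\
\text{subject to} \quad & g_i(x) \le 0, \quad i = 1, \dots, m, \\
& h_j(x) = 0, \quad j = 1, \dots, p.
\end{aligned}
```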

SLIDE 9

Common types of optimization problem

  • Linear programming (LP) problems:
    ▪ Definition: both the objective function and the constraints are linear
    ▪ The problems can be solved in polynomial time.
    ▪ https://en.wikipedia.org/wiki/Linear_programming
  • Integer linear programming (ILP) problems:
    ▪ Definition: an LP problem in which some or all of the variables are restricted to be integers
    ▪ Often, solving an ILP problem is NP-hard.
    ▪ https://en.wikipedia.org/wiki/Integer_programming

SLIDE 10

Common types of optimization problem (cont’d)

  • Quadratic programming (QP):
    ▪ Definition: the objective function is quadratic, and the constraints are linear
    ▪ Solving QP problems is simple under certain conditions
    ▪ https://en.wikipedia.org/wiki/Quadratic_programming
  • Convex optimization:
    ▪ Definition: f(x) is a convex function, and X is a convex set.
    ▪ Property: if a local minimum exists, then it is a global minimum.
    ▪ https://en.wikipedia.org/wiki/Convex_optimization

SLIDE 11

Convex set


A set C is said to be convex if, for all x and y in C and all t in the interval (0, 1), the point (1 − t)x + ty also belongs to C

SLIDE 12

Convex function


▪ Let X be a convex set in a real vector space and f : X → ℝ a function.
▪ f is convex just in case:
  ∀x1, x2 ∈ X, ∀t ∈ [0,1], f(tx1 + (1 − t)x2) ≤ tf(x1) + (1 − t)f(x2)
▪ (strictly convex: strict inequality for x1 ≠ x2, with t ranging over (0, 1), excluding the endpoints.)
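The inequality can be spot-checked numerically; a minimal sketch (hypothetical helper name, and sampling points is an illustration, not a proof):

```python
# Numerical spot-check of the convexity inequality
# f(t*x1 + (1-t)*x2) <= t*f(x1) + (1-t)*f(x2) on a grid of sample points.
def is_convex_on_samples(f, points, ts):
    """Return False if any sampled (x1, x2, t) violates the inequality."""
    for x1 in points:
        for x2 in points:
            for t in ts:
                lhs = f(t * x1 + (1 - t) * x2)
                rhs = t * f(x1) + (1 - t) * f(x2)
                if lhs > rhs + 1e-12:   # tolerance for floating-point error
                    return False
    return True

points = [i / 10 for i in range(-20, 21)]   # grid on [-2, 2]
ts = [i / 10 for i in range(11)]            # t in [0, 1]

print(is_convex_on_samples(lambda x: x * x, points, ts))   # x^2 is convex
print(is_convex_on_samples(lambda x: -x * x, points, ts))  # -x^2 is not
```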

SLIDE 13

Terms

  • A solution is an assignment of values to all the decision variables.
  • A solution is called feasible if it satisfies all the constraints.
  • The set of all the feasible solutions forms the feasible region.
  • A feasible solution is called optimal if f(x) attains the optimal value at that solution.

SLIDE 14

Terms

  • If a problem has no feasible solution, the problem itself is called infeasible.
  • If the value of the objective function can be made arbitrarily good (arbitrarily large for maximization, arbitrarily small for minimization), the problem is called unbounded.

SLIDE 15

Linear programming

SLIDE 16

Linear Programming

  • The linear programming method was first developed by Leonid Kantorovich in the late 1930s.
  • Main applications: diet problems, supply problems
  • A primary method for solving LP is the simplex method.
  • LP problems can be solved in polynomial time.

SLIDE 17

An example

SLIDE 18

Feasible region


2x + 4y ≤ 220
3x + 2y ≤ 150
x ≥ 0
y ≥ 0

SLIDE 19

Property of LP

  • The feasible region is convex.
  • If the feasible region is non-empty and bounded, then
    ▪ optimal solutions exist, and
    ▪ there is an optimal solution that is a corner point
  ➔ We only need to check the corner points
  • The most well-known method is called the simplex method.
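The corner-point property can be demonstrated directly on the feasible region from the earlier example (2x + 4y ≤ 220, 3x + 2y ≤ 150, x, y ≥ 0). A sketch, assuming a hypothetical objective 3x + 5y since the slides do not state one: enumerate intersections of constraint boundaries, keep the feasible ones, and take the best corner.

```python
from itertools import combinations

# Each constraint as (a, b, c) meaning a*x + b*y <= c; the nonnegativity
# bounds are written the same way (-x <= 0, -y <= 0).
constraints = [(2, 4, 220), (3, 2, 150), (-1, 0, 0), (0, -1, 0)]

def intersect(c1, c2):
    """Intersection of the boundary lines a*x + b*y = c, or None if parallel."""
    (a1, b1, d1), (a2, b2, d2) = c1, c2
    det = a1 * b2 - a2 * b1
    if abs(det) < 1e-12:
        return None
    return ((d1 * b2 - d2 * b1) / det, (a1 * d2 - a2 * d1) / det)

def feasible(p):
    return all(a * p[0] + b * p[1] <= c + 1e-9 for a, b, c in constraints)

# Corner points = feasible intersections of pairs of constraint boundaries.
corners = [p for c1, c2 in combinations(constraints, 2)
           if (p := intersect(c1, c2)) is not None and feasible(p)]

# Hypothetical objective: maximize 3x + 5y over the corners only.
best = max(corners, key=lambda p: 3 * p[0] + 5 * p[1])
print(best)  # (20.0, 45.0): the constraint intersection wins for this objective
```

The simplex method exploits the same fact, but walks between adjacent corners instead of enumerating all of them.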

SLIDE 20

Simplex method


Simplex method:
▪ Start with a feasible corner-point solution; repeatedly move to an adjacent corner point that increases f(x)

SLIDE 21

Integer linear programming

SLIDE 22

Integer programming

  • IP is an active research area and there are still many unsolved problems.
  • Many applications: scheduling, “diet” problems, NLP, …
  • IP is more difficult to solve than LP.
  • Methods:
    ▪ Branch and bound
    ▪ Use LP relaxation
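The two methods combine naturally; a minimal branch-and-bound sketch on a toy 0/1 knapsack ILP (a hypothetical instance, not from the slides): branch on each integer variable, and use the LP relaxation (fractional knapsack, assuming positive weights) as an upper bound to prune branches.

```python
def lp_relaxation_bound(values, weights, capacity, fixed):
    """Upper bound from the LP relaxation: take fixed-in items, then greedily
    fill remaining capacity by value/weight ratio, allowing fractional items."""
    bound, cap, free = 0.0, capacity, []
    for i, (v, w) in enumerate(zip(values, weights)):
        if fixed.get(i) == 1:
            bound += v
            cap -= w
        elif i not in fixed:                 # undecided variable
            free.append((v / w, v, w))
    if cap < 0:
        return float("-inf")                 # partial assignment infeasible
    for _, v, w in sorted(free, reverse=True):
        take = min(1.0, cap / w)             # fractional item allowed in the LP
        bound += v * take
        cap -= w * take
        if cap <= 0:
            break
    return bound

def branch_and_bound(values, weights, capacity):
    best = [0.0]                             # incumbent objective value
    n = len(values)

    def recurse(i, fixed):
        if lp_relaxation_bound(values, weights, capacity, fixed) <= best[0]:
            return                           # prune: relaxation can't beat incumbent
        if i == n:                           # all variables fixed: feasible leaf
            value = sum(v for j, v in enumerate(values) if fixed.get(j) == 1)
            best[0] = max(best[0], value)
            return
        recurse(i + 1, {**fixed, i: 1})      # branch: x_i = 1
        recurse(i + 1, {**fixed, i: 0})      # branch: x_i = 0

    recurse(0, {})
    return best[0]

print(branch_and_bound([60, 100, 120], [10, 20, 30], 50))  # optimal value 220
```

Real ILP solvers refine this with better bounds, cutting planes, and smarter branching order, but the prune-against-the-relaxation structure is the same.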

SLIDE 23

Example: Investment Decisions


Four investment options. Over 3 months, we want to invest up to 14k, 12k, and 15k, respectively.

SLIDE 24

Example: Maximum Spanning Tree


max s(G) = ∑_{(w1, w2, l) ∈ G} s(w1, w2, l)

Constraint: G is a tree (no cycles)

An approach to dependency parsing (see, e.g., 571 slides). More constraints possible: heads cannot have more than one outgoing label of each type.

SLIDE 25

LP vs. ILP

SLIDE 26

Summary

  • Optimization problems have many real-life applications.
  • Common types: LP, IP, ILP, QP, convex optimization problems
  • LP is easy to solve; the most well-known method is the simplex method.
  • IP is hard to solve.
  • QP and convex optimization are used the most in our field.
