

SLIDE 1

The Work of Mike Shub in Complexity

Felipe Cucker

City University of Hong Kong

Shubfest, Toronto 2012

SLIDES 2–5

Complexity Theory

Goal: Determine the amount of resources (most commonly, computer time) necessary to solve problems with a computer. This broad goal alternates its focus between two extremes:

(G) To develop a general theory of computational cost (which includes formal models of computation, diverse cost notions, complexity classes built upon them, complete problems in these classes, and —the ultimate desideratum— separations between these complexity classes).

(P) To analyze (in terms of cost) the behavior of specific algorithms (meant to solve specific problems).

SLIDES 6–9

Mike has worked on both ends of this spectrum, with contributions that can be grouped in three main themes:

(1) Zeros of Polynomial Systems.
(2) Structural Complexity for Numerical Problems.
(3) Conditioning of Numerical Problems.

SLIDES 10–11

Zeros of Polynomial Systems

• M.S., S. Smale. “Computational complexity. On the geometry of polynomials and a theory of cost.” I. Ann. Sci. École Norm. Sup., 1985; II. SIAM J. Comput., 1986.
One polynomial in one variable.

• M.S., S. Smale. “Complexity of Bézout’s Theorem.” I, II, III, IV, and V, 1993–1996.
n polynomials in n + 1 homogeneous variables.

SLIDES 12–16

Smale’s 17th problem: Can one find an approximate zero of a system (n polynomials in n + 1 homogeneous variables) in time polynomial on the average?

approximate zero: a point from which Newton’s method converges to a zero, immediately, quadratically fast.

polynomial time: number of arithmetic operations bounded by N^O(1), where N is the size of the input system f.

on the average: w.r.t. a Gaussian distribution on the input f.

Here D := max{d_1, . . . , d_n} and N ≈ n (D+n choose n).
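The behavior packed into “approximate zero” is easy to see in the simplest setting, one polynomial in one variable. A minimal sketch (the polynomial x² − 2 and the starting point are our own illustrative choices, not from the talk):

```python
# Newton's method on f(x) = x^2 - 2: from a good starting point the error
# roughly squares at every step ("immediate quadratic convergence").
f = lambda x: x * x - 2.0
df = lambda x: 2.0 * x

x = 1.5   # close enough to sqrt(2) to behave as an approximate zero
errors = []
for _ in range(5):
    x = x - f(x) / df(x)            # one Newton step
    errors.append(abs(x - 2.0 ** 0.5))
print(errors)  # each entry is roughly the square of the previous one
```

From a poor starting point no such guarantee holds, which is why the definition of approximate zero asks for quadratic convergence from the very first iterate.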

SLIDES 17–19

Adaptive linear homotopy

◮ Given an initial pair (g, ζ) with g(ζ) = 0 and an input f:
◮ Consider the line segment [g, f] connecting g and f. It consists of the systems q_t := (1 − t)g + tf for t ∈ [0, 1].
◮ If no q_t has a multiple zero, then there exists a unique lifting of this segment to a curve t ∈ [0, 1] → (q_t, ζ_t) such that ζ_0 = ζ. Since q_1 = f, ζ_1 is a zero of f.

SLIDES 20–21

The idea is to follow this curve numerically: partition [0, 1] into t_0 = 0, . . . , t_k = 1. Writing q_i := q_{t_i}, successively compute approximations z_i of ζ_{t_i} by Newton’s method, starting with z_0 := ζ. More specifically, compute z_{i+1} := N_{q_{i+1}}(z_i).
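The scheme can be sketched in the univariate case. The instance below is a toy of our own making (the real algorithm works with homogeneous systems and, as discussed later, an adaptive partition rather than a uniform one):

```python
# Follow the segment q_t = (1 - t) g + t f with one Newton step per node.
# Toy instance: g(x) = x^2 - 1 with known zero zeta = 1, target f(x) = x^2 - 3.
g = lambda x: x * x - 1.0
f = lambda x: x * x - 3.0

k = 20          # uniform partition t_0 = 0, ..., t_k = 1
z = 1.0         # z_0 := zeta, the known zero of g
for i in range(1, k + 1):
    t = i / k
    q = lambda x, t=t: (1 - t) * g(x) + t * f(x)
    dq = lambda x: 2.0 * x            # d/dx of q_t(x) = x^2 - (1 + 2t)
    z = z - q(z) / dq(z)              # z_{i+1} := Newton step for q_{i+1} at z_i
print(z)        # tracks the lifted zero; ends near sqrt(3), a zero of f
```

One Newton step per node suffices because the zero of q_{t_{i+1}} stays within the basin of quadratic convergence around z_i when the steps are small enough; this is exactly what the adaptive step-size rule guarantees in the general case.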

SLIDES 22–25

The Bézout series set up the main properties of this algorithmic scheme and put in place the theoretical tools used today in its study. I won’t give details of what these tools are or how they are used in recent work. I will instead limit my exposition to a description of the state of the art in the subject.

Two issues neglected in my exposition above:
(1) How to choose the initial pair (g, ζ)?
(2) How large should d(q_{i+1}, q_i) be?

SLIDES 26–28

How large should d(q_{i+1}, q_i) be?

◮ We compute t_{i+1} adaptively from t_i such that

  d(q_{i+1}, q_i) = 0.0085 / (D^{3/2} µ_norm²(q_i, z_i)).

◮ Denote by K(f, g, ζ) the number K of iterations performed to follow the curve.

“Bézout VI” (M.S., Found. Comput. Math. 2009)
For all i, z_i is an approximate zero of q_i. In particular, z_K is an approximate zero of f. Moreover,

  K(f, g, ζ) ≤ 217 D^{3/2} d(f, g) ∫_0^1 µ_norm²(q_τ, ζ_τ) dτ.

Here τ ∈ [0, 1] is a ratio of angles and not of Euclidean distances.
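The adaptive partition can be simulated in a few lines. The condition profile `mu` below is entirely made up (a stand-in for µ_norm(q_t, z_t), which requires the actual system to compute); the point is only that the step size shrinks where the condition is high, so the total count K behaves like the integral of µ_norm²:

```python
# Sketch of the adaptive partition: the step from t_i to t_{i+1} is inversely
# proportional to D^{3/2} mu^2, so the path is sampled most densely where the
# condition is worst.  `mu` is a hypothetical condition profile.
C = 0.0085
D = 2                       # max degree of a toy system

def mu(t):
    # made-up profile: well conditioned at the ends, a peak in the middle
    return 2.0 + 8.0 * t * (1 - t)

t, K, nodes = 0.0, 0, [0.0]
while t < 1.0:
    t = min(1.0, t + C / (D ** 1.5 * mu(t) ** 2))
    nodes.append(t)
    K += 1
print(K)   # grows like (D^{3/2}/C) * integral of mu^2 over [0, 1]
```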

SLIDES 29–32

This result relates to cost in a clear manner. Each Newton step takes O(N) arithmetic operations. Therefore, the total number of such operations performed along the homotopy is O(N K(f, g, ζ)). It has been used in the following:

(1) a randomized algorithm computing approximate zeros in average randomized polynomial time: O(D^{3/2} n N²) [C. Beltrán – L.M. Pardo].

(2) a deterministic algorithm working in near-polynomial time (average polynomial time for all but a few pairs (n, D), and average time N^O(log log N) on those pairs) [P. Bürgisser – F.C.].

SLIDES 33–35

Additional remarks:

• Projective Newton method introduced by Mike.
• Several extensions of Newton’s method to more general systems (overdetermined, underdetermined, multihomogeneous, . . . ) studied by Mike, mostly in joint work with Jean-Pierre Dedieu.
• Back to the roots? [D. Armentano, M.S.]
SLIDES 36–39

Structural Complexity for Numerical Problems

An algorithm solving a problem provides —through its analysis— an upper bound on the resources necessary to solve this problem. To obtain lower bounds one needs instead to consider all algorithms solving the problem. Thus, the study of lower bounds demands having a formal notion of algorithm at hand.

Classical complexity theory (as studied in Theoretical Computer Science) has the Turing machine for this notion. This is very useful for discrete computations, but not so for numerical computations. A “continuous” complexity theory is needed in this context.

• L. Blum, M.S., S. Smale. “On a theory of computation over the real numbers: NP-completeness, recursive functions and universal machines”, Bull. AMS, 1989.

SLIDES 40–44

• Introduced the BSS machine.
• Natural notions of deterministic cost and nondeterministic cost. Cost is, essentially, the number of arithmetic operations and comparisons performed. Nondeterminism is a theoretical mode of computation that, instead of “finding” or “computing” the solution to a problem, simply “verifies” that a candidate solution is indeed a solution.
• Classes P_R and NP_R (and P_C and NP_C).

A problem in NP_R:
4FEAS Given a polynomial f in R[X_1, . . . , X_n] of degree 4, does there exist ξ ∈ R^n such that f(ξ) = 0?

A problem in NP_C:
QUAD Given f_1, . . . , f_m in C[X_1, . . . , X_n] of degree 2, is there a ξ ∈ C^n such that f_1(ξ) = . . . = f_m(ξ) = 0?
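The nondeterministic mode is easy to illustrate: a verifier does not search for ξ, it only evaluates the system at a proposed ξ. The degree-2 instance below is our own toy, and the floating-point tolerance is a concession of the sketch (a BSS machine computes exactly over C):

```python
# Verifying a candidate zero costs one evaluation per polynomial plus
# comparisons; finding the zero is the expensive part that nondeterminism
# "guesses" away.
def verify(system, xi, tol=1e-9):
    return all(abs(fj(*xi)) <= tol for fj in system)

# toy QUAD-style system: f1 = x^2 + y^2 - 2, f2 = x*y - 1
f1 = lambda x, y: x * x + y * y - 2.0
f2 = lambda x, y: x * y - 1.0

print(verify([f1, f2], (1.0, 1.0)))   # True: (1, 1) is a common zero
print(verify([f1, f2], (1.0, 0.5)))   # False
```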

SLIDES 45–51

• Existence of natural NP_R-complete problems.

A complete problem P in NP_R is one such that, if P ∈ P_R, then P_R = NP_R.
Explanation: All problems in NP_R “reduce” to P (negligible overhead cost).

4FEAS is NP_R-complete. QUAD is NP_C-complete.

These results put the focus on the problems 4FEAS and QUAD.

Relations of QUAD to Smale’s 17th problem:
• decision vs function problem
• average-case vs worst-case

SLIDES 52–53

The BSS paper has had a tremendous impact on the work of a group of people who made its complexity theory the center of their research.

SLIDES 54–55

• F.C., M.S. “Generalized knapsack problems and fixed degree separations”, Theoret. Comput. Sci., 1996.

For every d ≥ 1, DTIME(O(n^d)) ≠ NDTIME(O(n^d)).

SLIDES 56–60

Conditioning of Numerical Problems

ϕ : R^n → R^m, a ∈ R^n. The condition number of a is the worst-case magnification in ϕ(a) of small relative errors in a:

  cond^ϕ(a) := lim_{δ→0} sup_{RelError(a) ≤ δ} RelError(ϕ(a)) / RelError(a).

◮ The condition number plays a key role in finite-precision analyses of algorithms.
◮ For many problems ϕ the quantity cond^ϕ(a) can be characterized (or approximated) in a more friendly manner.
◮ These characterizations have made it possible, in many cases, to obtain estimates of the expectation E(cond^ϕ) with respect to a measure on R^n.
◮ Condition numbers have also been used in estimates for the speed of convergence of iterative algorithms (complexity!).
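The limit definition can be checked numerically for a scalar map, where for differentiable ϕ it reduces to |a ϕ′(a) / ϕ(a)|. The map ϕ(a) = a³ below is a hypothetical example of our own (its condition number is the constant 3 for a ≠ 0):

```python
# Numerical check of the limit definition of the condition number for a
# scalar map phi.  For phi(a) = a^3, |a * phi'(a) / phi(a)| = 3 for a != 0.
def cond_estimate(phi, a, delta=1e-7):
    # worst case over the two relative perturbations of size delta
    ratios = []
    for s in (delta, -delta):
        a_pert = a * (1 + s)
        rel_out = abs(phi(a_pert) - phi(a)) / abs(phi(a))
        ratios.append(rel_out / delta)
    return max(ratios)

phi = lambda x: x ** 3
est = cond_estimate(phi, 2.0)
print(est)   # approaches 3 = |a * phi'(a) / phi(a)| as delta -> 0
```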

SLIDES 61–62

Mike’s first work in conditioning studies a notion of condition number obtained by replacing “worst-case perturbation” by “average perturbation.” This is relevant for finite-precision analyses.

• N. Weiss, G.W. Wasilkowski, H. Woźniakowski, M.S. “Average condition number for solving linear equations.” Linear Algebra Appl., 1986.

Then attention turned to the relationship between condition and complexity. This relationship pervades the Bézout series.

SLIDES 63–66

For each of the zeros ζ_1, . . . , ζ_D of a system f we have that µ_norm(f, ζ_i) is a condition number in the sense above!

The problem is, the map system → zero is multivalued. What should we define as the condition of the input f?

In the Bézout series the answer to this problem is

  µ_max(f) := max_{i≤D} µ_norm(f, ζ_i).

The main result in Bézout VI allows one to use instead

  µ_av(f) := (1/D) ∑_{i≤D} µ_norm²(f, ζ_i).

This fact is, as we already pointed out, at the core of the recent advances towards a final solution to Smale’s 17th problem.

SLIDE 67

A Unifying Theory?