How to not prove $P \neq NP$: A Brief Introduction to Natural Proofs & Data Complexity


slide-1
SLIDE 1

How to not prove $P \neq NP$

A Brief Introduction to Natural Proofs & Data Complexity Shubhang Kulkarni and Ryan Davis

slide-2
SLIDE 2

Part I : Introduction to Natural Proofs

Shubhang Kulkarni

“The obstacle is, roughly, that a large class of approaches to circuit lower bounds must prove more”

  • R. Lipton
slide-3
SLIDE 3

Why This Talk is Important

Lower bounds ≡ computational intractability of a problem.
Modern-day e-commerce relies heavily on certain lower bounds being true.
It's natural to ask what makes lower bound questions so difficult:

  • $P \neq NP$?
  • $P \neq PSPACE$?
  • $P \neq NC$?
slide-4
SLIDE 4

Algorithms vs Complexity Theory

Algorithms Theory

  • When can we solve problems quickly?
  • What's an efficient way to solve the problem?

Complexity Theory

  • When can problems not be solved efficiently?
  • How can we prove that a problem is not easy?

The algorithm designers and the complexity theorists have opposing goals: $\exists$ (some algorithm solves the problem efficiently) versus $\forall$ (every algorithm fails to).

slide-5
SLIDE 5

Complexity Barriers

It turns out we have some formal understanding of why lower bounds are so tough to prove (at the moment).
Any lower bound proof must overcome complexity barriers.
These barriers are “meta-theorems” about proofs:

  • Relativization
  • Natural Proofs
  • Algebrization

Baker, Gill, Solovay: $\exists$ oracles $A, B$ such that $P^A = NP^A$ but $P^B \neq NP^B$

slide-6
SLIDE 6

Simple Notation

  • $\{0,1\}^n$: $n$-bit binary strings (input $x$)
  • $f_n$: function $f_n : \{0,1\}^n \to \{0,1\}$
  • $F_n$: set of all functions $f_n$
  • $C_n$: combinatorial property of $f_n$

Also denoted by $B_n$

slide-7
SLIDE 7

Combinatorial Properties I

$C_n$ can be thought of as the subset of $F_n$ consisting of the functions possessing the property

  • $f_n$ “satisfies” $C_n$ $\leftrightarrow$ $f_n \in C_n$
  • Also denoted as: $C_n(f_n) = 1$ if $f_n$ “satisfies” $C_n$, and $C_n(f_n) = 0$ otherwise

slide-8
SLIDE 8

Combinatorial Properties II

$C_n$ is natural if it satisfies:

(1) Constructivity

There is a polynomial-time algorithm to determine whether $f_n \in C_n$ (polynomial in the size $2^n$ of the truth table of $f_n$)

(2) Largeness

A random $g_n$ has a “non-negligible” chance of satisfying $C_n$. Formally, $|C_n| \geq 2^{-O(n)} \cdot |F_n|$

Terminology

A combinatorial property is useful against $P/\mathrm{poly}$ if the circuit sizes of all functions satisfying $C_n$ are super-polynomial.

Actually for any subset of $C_n$
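The two conditions can be checked by brute force for very small $n$. The sketch below is only an illustration: it represents each $f_n$ as a truth table of $2^n$ bits, and `odd_weight` is a made-up toy property (not one from the talk). It is constructive and large, but it is not useful against $P/\mathrm{poly}$, since easy functions such as AND satisfy it.

```python
from itertools import product

def all_functions(n):
    """Yield every Boolean function on n bits as a truth table (tuple of 2**n bits)."""
    return product((0, 1), repeat=2 ** n)

def odd_weight(table):
    """Toy property C_n (hypothetical): the truth table has an odd number of ones.

    Runs in time linear in len(table) = 2**n, so it is constructive in the
    sense of being polynomial in the truth-table size.
    """
    return sum(table) % 2 == 1

def density(prop, n):
    """Fraction |C_n| / |F_n| of n-bit functions satisfying prop (brute force)."""
    tables = list(all_functions(n))
    return sum(1 for t in tables if prop(t)) / len(tables)

n = 2
print(len(list(all_functions(n))))   # |F_n| = 2**(2**n)
print(density(odd_weight, n))        # largeness: compare against 2**(-O(n))
```

A density that stays at a constant such as one half easily meets the $2^{-O(n)}$ largeness threshold; the hard part of a natural proof is finding a property that is also useful.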
slide-9
SLIDE 9

Natural Proof

“A proof that some function does not have a polynomial-size circuit is natural against $P/\mathrm{poly}$ if the proof contains the definition of natural combinatorial properties useful against $P/\mathrm{poly}$.”

Want to prove $\{g_n\}$ has no polynomial-size circuit:

  • Identify property $C_n$ such that
  • The proof shows that $\forall f_n \in C_n$, $f_n$ is “hard” for circuits to compute
  • $g_n \in C_n$
slide-10
SLIDE 10

The Naïve Approach to $P \neq NP$

  • Define mathematically the notion of “discrepancy” or “scatter” of Boolean function values

i.e. define a $C_n$ s.t. $C_n$ is true for functions of “high discrepancy”

  • Show (inductively) that poly-size circuits can only compute low-discrepancy functions

i.e. $C_n$ is useful, as no $f_n \in C_n$ can be computed by a poly-size circuit

  • Show that SAT has high discrepancy

i.e. SAT $\in C_n$

  • $P \subset P/\mathrm{poly}$, so SAT $\notin P/\mathrm{poly}$ implies $P \neq NP$
slide-11
SLIDE 11

The Breakthrough Results

  • “Our main theorem … gives evidence that no proof strategy along these lines can ever succeed” (Razborov, Rudich ’96)
  • “Any large and constructive $C_n$ useful against $P/\mathrm{poly}$ provides a statistical test that can be used to break any polytime pseudo-random generator.”
  • Violates a widely held belief that pseudo-random generators of hardness $2^{n^{\varepsilon}}$ ($\varepsilon > 0$) exist.
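To make “statistical test that breaks a generator” concrete, here is a minimal sketch with a deliberately weak toy generator. Everything in it is an assumption for illustration: `phi` just repeats its seed, and `halves_equal` is a hand-picked test. The quantity computed is the gap between the test's acceptance probability on uniform strings and on generator outputs.

```python
from fractions import Fraction
from itertools import product

def phi(seed):
    """Toy generator (hypothetical): maps a k-bit seed to 2k bits by repeating it."""
    return seed + seed

def halves_equal(x):
    """Statistical test f: accept iff the two halves of x are identical."""
    half = len(x) // 2
    return int(x[:half] == x[half:])

def acceptance_on_uniform(f, n):
    """Probability that f accepts a uniformly random n-bit string."""
    inputs = list(product((0, 1), repeat=n))
    return Fraction(sum(f(x) for x in inputs), len(inputs))

def acceptance_on_generator(f, gen, k):
    """Probability that f accepts gen(seed) for a uniformly random k-bit seed."""
    seeds = list(product((0, 1), repeat=k))
    return Fraction(sum(f(gen(s)) for s in seeds), len(seeds))

k = 2
p_uniform = acceptance_on_uniform(halves_equal, 2 * k)
p_generated = acceptance_on_generator(halves_equal, phi, k)
advantage = abs(p_uniform - p_generated)
print(p_uniform, p_generated, advantage)
```

A large gap means the test distinguishes the generator from true randomness. The Razborov-Rudich argument says a natural property would play the role of `halves_equal` against generators that are believed to be hard.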

slide-12
SLIDE 12

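Pseudo-Random Generators

[Diagram: $\{0,1\}^k \xrightarrow{\Phi} \{0,1\}^n \xrightarrow{f} \{0,1\}$; a seed $x_k \in X_k$ is mapped to $\Phi(x_k)$, a string $x_n \in X_n$ is mapped to $f(x_n)$, and $\Phi(X_k) \subseteq X_n$ is the image of the generator]

  • $P(f = 1)$: probability that $f(x_n) = 1$
  • $P(f_\Phi = 1)$: probability that $f(\Phi(x_k)) = 1$, i.e. $P(f = 1 \mid x_n \in \Phi(X_k))$

$\Phi$ is an $M$-hard pseudo-random generator if $P(f) \approx P(f_\Phi)$, i.e. $|P(f) - P(f_\Phi)| \leq M^{-1}$, for every test $f$ computable by a circuit of size at most $M$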

slide-13
SLIDE 13

The Entire Picture (so far…)

Diagram relating the following statements:

  • Empirical Evidence
  • Integer Factorization Hardness
  • Discrete Log Hardness
  • One-Way Functions Exist
  • No Natural Proofs
  • Pseudo-Random Generators Exist
  • $P \neq NP$

slide-14
SLIDE 14

Implications of no Natural Proofs

Recall the definition of Natural Proofs.
Any property used by a non-natural lower bound proof must fall into one of:

  • Violates Largeness:
  • Probability that a random function has the property is small
  • Property is shared by very few functions
  • Violates Constructivity:
  • Very complicated property
slide-15
SLIDE 15

Takeaways

Decoding the literature: A random efficiently computable function is very hard to distinguish from a random function

  • Main Proof Idea [RR97]
  • The Rubik's cube
  • Composition of 3-bit scramblers

Notation      Description                Size of Distribution    Type
$\{0,1\}^n$   Input set                  $2^n$                   Set
$F_n$         Functions on $\{0,1\}^n$   $2^{2^n}$               Set of sets
$C_n$         Properties of $F_n$        $2^{2^{2^n}}$           Set of sets of sets

slide-16
SLIDE 16

End of Part I

Shubhang Kulkarni

“The general problem of mathematically proving computational lower bounds is a mystery”

  • Ryan Williams, Thinking Algorithmically About Impossibility
slide-17
SLIDE 17

Part II : Introduction to Data Complexity

Ryan Davis

“[We] will have to develop new methods to make a serious dent in major lower bound problems.”

  • Ryan Williams, Thinking Algorithmically About Impossibility
slide-18
SLIDE 18

Why use Data Complexity?

Closely related to program verification (testing).
Carefully chosen input/output pairs are used to determine correctness.
When does it suffice to use a small number of test cases?
What if we know something about the program, such as its size?

slide-19
SLIDE 19

Defining Data Complexity

Assume a known function $f : \{0,1\}^* \to \{0,1\}$.
Decision problem: given a circuit $C$ of size $s$, we wish to determine if $C$ computes $f$.
Potential solution: data complexity (w.r.t. $s$), the minimum number of input/output examples needed to determine if $C$ computes $f$.
“Gray-box” testing, where $s$ is side information.

slide-20
SLIDE 20

Overarching Question

The data complexity for size-$s$ circuits is trivially $2^{O(s)}$ (include all input/output examples up to length $s$).
We are interested to know: for what functions $f$ can the data complexity be much smaller?

slide-21
SLIDE 21

Data Complexity and Circuit Complexity

“The theory of circuits becomes interesting when we restrict the complexities of the circuits; The theory of test suites becomes similarly interesting when restricting the amount of necessary data.”

  • Chapman, B., Williams, R.

The data complexity of testing $f$ is “small” if and only if the circuit complexity of $f$ is “large”

slide-22
SLIDE 22

The Circuit-Input Game

  • Circuit player has a set $C$ of all circuits of size $s$
  • Size of $C$ is $|C| = 2^{O(s \log s)}$
  • Input player has a set $I$ of all inputs of length $n$
  • Size of $I$ is $|I| = 2^n$
  • Payoff matrix $M$ with $2^n$ rows and $2^{O(s \log s)}$ columns
  • $M(c, x) = 0$ if $c(x) = f(x)$
  • $M(c, x) = 1$ if $c(x) \neq f(x)$

Payoff goes to the input player
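The payoff matrix can be written out explicitly for a tiny instance. In the sketch below, plain Python functions stand in for circuits and the four entries of `circuits` are hypothetical examples rather than a real enumeration of all size-$s$ circuits. It builds $M$ and then checks that a two-input test set already exposes every incorrect candidate.

```python
from itertools import product

def payoff_matrix(circuits, f, n):
    """Return M as a dict with M[(name, x)] = 1 if the circuit errs on x, else 0."""
    inputs = list(product((0, 1), repeat=n))
    return {
        (name, x): int(c(x) != f(x))
        for name, c in circuits.items()
        for x in inputs
    }

def f(x):
    """Target function: AND of two bits."""
    return x[0] & x[1]

# Hypothetical stand-ins for the circuit player's set C.
circuits = {
    "and": lambda x: x[0] & x[1],
    "or": lambda x: x[0] | x[1],
    "first": lambda x: x[0],
    "zero": lambda x: 0,
}

M = payoff_matrix(circuits, f, 2)

def errors(name):
    """Inputs on which the named circuit disagrees with f."""
    return [x for (c_name, x), v in M.items() if c_name == name and v == 1]

# A candidate test set: does each circuit fail on at least one of these inputs?
test_set = [(1, 0), (1, 1)]
caught = {name: any(M[(name, x)] for x in test_set) for name in circuits}
print(caught)
```

Here two of the four possible inputs are enough to reject all three wrong candidates while accepting the correct one, which is the kind of saving data complexity measures.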

slide-23
SLIDE 23

Approximate Optimal Strategies

Theorem 2.1 (roughly)

  • 1. There exists a $k$-size distribution $p$ (strategy) on $C$ such that for all $x \in I$, the circuit player has a good chance that a circuit $c$ drawn from $p$ will satisfy $c(x) = f(x)$ (circuit player has a good strategy!)
  • 2. There exists an $\ell$-size distribution $p$ (strategy) on $I$ such that for all $c \in C$, the input player has a good chance that an input $x$ drawn from $p$ will satisfy $c(x) \neq f(x)$ (input player has a good strategy!)

$k \geq c \cdot n$, $\ell \geq c \cdot s \log s$ (in these two bounds $c$ is a constant, not a circuit)

slide-24
SLIDE 24

Data Complexity Consequence

Theorem 2.2 (roughly)

Let $p + q \leq 1 - \varepsilon$. At least one of the following two cases holds:

  • 1. There exists a $k$-size set of circuits $Y \subseteq C$ such that for all $x \in I$, $c(x) = f(x)$ for more than a $p$-fraction of circuits $c \in Y$.
  • 2. There exists an $\ell$-size set of inputs $X \subseteq I$ such that for all $c \in C$, $c(x) \neq f(x)$ for more than a $q$-fraction of inputs $x \in X$.

$k \geq c \cdot n$, $\ell \geq c \cdot s \log s$ (in these two bounds $c$ is a constant, not a circuit)

slide-25
SLIDE 25

Data Complexity and Circuit Complexity

The data complexity of testing $f$ is “small” if and only if the circuit complexity of $f$ is “large”

slide-26
SLIDE 26

Data Complexity and Circuit Complexity

Theorem 1.2: Let $f : \{0,1\}^* \to \{0,1\}$ and $S(n) \geq 2n$ for all $n$

  • 1. If $f$ is in $\mathrm{SIZE}(S(n))$, the data complexity of testing size-$s$ circuits for $f$ is at least $2^{\Omega(S^{-1}(s))}$ (hard to test!)
  • 2. If $f$ is not in $\mathrm{SIZE}(n \cdot S(n))$, the data complexity of testing size-$s$ circuits for $f$ is at most $O(2^{S^{-1}(s)} + S^{-1}(s) \cdot s^2 \log s)$ (easy to test!)
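To see how the two bounds move with the size function, here is a worked instance under two assumed choices of $S(n)$, picked only for illustration (both satisfy $S(n) \geq 2n$ for all $n \geq 1$). Constants hidden by $\Omega$ and $O$ are left as they are.

```latex
\begin{align*}
\text{Easy function: } S(n) = 2n
  &\;\Longrightarrow\; S^{-1}(s) = s/2, \\
f \in \mathrm{SIZE}(2n)
  &\;\Longrightarrow\; \text{data complexity} \;\geq\; 2^{\Omega(s/2)} = 2^{\Omega(s)}. \\[4pt]
\text{Hard function: } S(n) = 2n^2
  &\;\Longrightarrow\; S^{-1}(s) = \sqrt{s/2}, \qquad n \cdot S(n) = 2n^3, \\
f \notin \mathrm{SIZE}(2n^3)
  &\;\Longrightarrow\; \text{data complexity} \;\leq\; O\!\left(2^{\sqrt{s/2}} + \sqrt{s/2}\; s^2 \log s\right).
\end{align*}
```

In the first case the exponent grows linearly in $s$. In the second it grows only like $\sqrt{s}$, because a faster-growing $S$ has a slower-growing inverse.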

slide-27
SLIDE 27

Proof Intuition

Replace data complexity with time complexity (measured with respect to circuit size $s$).
When $f$ has large circuit complexity, we can quickly test circuits for $f$.
Suppose $S(n)$ is a lower bound on the circuit complexity of $f_n$.
Given a size-$s$ circuit, we may have to try all $2^n < 2^{S^{-1}(s)}$ inputs.
As circuit complexity $S(n)$ increases, time complexity $2^{S^{-1}(s)}$ decreases.

slide-28
SLIDE 28

Proof Idea Pt. 1

  • 1. Suppose a circuit $c$ of size $S(n)$ computes $f$
  • 2. For any input $x$, construct a circuit $c'$ of size $S'(n) = S(n) + n$ that agrees with $c$ on all inputs except $x$
  • 3. Thus, to rule out every such $n$-input circuit of size $S'(n)$, a test set for $f$ must include all inputs of length $n$; this applies to every $n$ with $S'(n) \leq s$, i.e. roughly $n \leq S^{-1}(s)$

If $f$ is in $\mathrm{SIZE}(S(n))$, the data complexity of testing size-$s$ circuits for $f$ is at least $2^{\Omega(S^{-1}(s))}$
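The construction in step 2 can be mimicked in a few lines. The sketch below uses Python functions in place of circuits, so it does not model the extra $n$ gates; the target `f` (majority of three bits) and the flipped input `x_star` are arbitrary choices made for illustration. It shows that a test set missing even one input cannot tell the altered function from $f$.

```python
from itertools import product

def flip_at(c, x_star):
    """Return c' that agrees with c everywhere except on the single input x_star."""
    def c_prime(x):
        return c(x) ^ int(x == x_star)
    return c_prime

def passes(candidate, f, test_set):
    """True if candidate matches f on every input in test_set."""
    return all(candidate(x) == f(x) for x in test_set)

def f(x):
    """Target function (illustrative choice): majority of three bits."""
    return int(sum(x) >= 2)

n = 3
all_inputs = list(product((0, 1), repeat=n))
x_star = (1, 0, 1)
c_prime = flip_at(f, x_star)   # plays the role of the slightly larger circuit c'

# A test set that omits x_star wrongly accepts c'; the full set rejects it.
incomplete = [x for x in all_inputs if x != x_star]
print(passes(c_prime, f, incomplete), passes(c_prime, f, all_inputs))
```

Since `x_star` was arbitrary, the same argument applies to every input, which is why all $2^n$ inputs of that length end up in the test set.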

slide-29
SLIDE 29

Proof Idea Pt. 2

  • Assume the circuit complexity of $f$ is greater than $n \cdot S(n)$
  • There cannot exist a set $D$ of size-$S(n)$ circuits that the circuit player can use to have a better than ½ chance of answering any given input correctly.
  • Otherwise, construct a MAJORITY circuit of the outputs from $D$ (this will compute $f$!)
  • This will be of size at most $n \cdot S(n)$, a contradiction.
  • Thus, we cannot be in case 1 of Theorem 2.2

If $f$ is not in $\mathrm{SIZE}(n \cdot S(n))$, the data complexity of testing size-$s$ circuits for $f$ is at most $O(2^{S^{-1}(s)} + S^{-1}(s) \cdot s^2 \log s)$
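The MAJORITY step can be illustrated with a small sketch. Python functions again stand in for circuits, and the family below is a hypothetical example: three candidates for XOR, each wrong on a different single input. On every input more than half of them are correct, so their majority vote computes $f$ exactly.

```python
from itertools import product

def majority_of(circuits):
    """Combine circuits into one function that outputs their majority vote."""
    def voted(x):
        ones = sum(c(x) for c in circuits)
        return int(2 * ones > len(circuits))
    return voted

def f(x):
    """Target function (illustrative choice): XOR of two bits."""
    return x[0] ^ x[1]

def wrong_on(bad):
    """A candidate that computes f everywhere except on the input `bad`."""
    return lambda x: f(x) ^ int(x == bad)

# Each member errs on a different input, so at most one is wrong per input.
family = [wrong_on((0, 0)), wrong_on((0, 1)), wrong_on((1, 1))]
inputs = list(product((0, 1), repeat=2))
combined = majority_of(family)

print(all(combined(x) == f(x) for x in inputs))
```

In the proof the same idea is applied to the set from case 1 of Theorem 2.2, and the size of the resulting majority circuit is what contradicts the assumed lower bound.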

slide-30
SLIDE 30

Data Complexity Consequence

Theorem 2.2 (roughly)

Let $p + q \leq 1 - \varepsilon$. At least one of the following two cases holds:

  • 1. There exists a $k$-size set of circuits $Y \subseteq C$ such that for all $x \in I$, $c(x) = f(x)$ for more than a $p$-fraction of circuits $c \in Y$.
  • 2. There exists an $\ell$-size set of inputs $X \subseteq I$ such that for all $c \in C$, $c(x) \neq f(x)$ for more than a $q$-fraction of inputs $x \in X$.

$k \geq c \cdot n$, $\ell \geq c \cdot s \log s$ (in these two bounds $c$ is a constant, not a circuit)

slide-31
SLIDE 31

Proof Idea Pt. 2.2

We must be in case 2 of Theorem 2.2:

  • There exists an $S(n) \log S(n)$-size set of inputs $X \subseteq I$ such that for all $c \in C$, $c(x) \neq f(x)$ for more than a $q$-fraction of inputs $x \in X$.
  • Thus, for all size-$S(n)$ circuits on $n$ bits, we must add $X$ to the test set.
  • This adds $n \cdot S(n)^2 \log S(n)$ tests to our test set.
  • For size-$s$ circuits with input length $n < S^{-1}(s)$, we may need to check all input values, thus adding $2^{S^{-1}(s)}$ tests.
  • Final total: $2^{S^{-1}(s)} + n \cdot S(n)^2 \log S(n)$
  • Set $s = S(n)$ and we get $O(2^{S^{-1}(s)} + S^{-1}(s) \cdot s^2 \log s)$ total tests.

If $f$ is not in $\mathrm{SIZE}(n \cdot S(n))$, the data complexity of testing size-$s$ circuits for $f$ is at most $O(2^{S^{-1}(s)} + S^{-1}(s) \cdot s^2 \log s)$ ∎

slide-32
SLIDE 32

Data Complexity and Circuit Complexity

Theorem 1.2: Let $f : \{0,1\}^* \to \{0,1\}$ and $S(n) \geq 2n$ for all $n$

  • 1. If $f$ is in $\mathrm{SIZE}(S(n))$, the data complexity of testing size-$s$ circuits for $f$ is at least $2^{\Omega(S^{-1}(s))}$ (hard to test!)
  • 2. If $f$ is not in $\mathrm{SIZE}(n \cdot S(n))$, the data complexity of testing size-$s$ circuits for $f$ is at most $O(2^{S^{-1}(s)} + S^{-1}(s) \cdot s^2 \log s)$ (easy to test!)

slide-33
SLIDE 33

Data Complexity and Circuit Complexity

“The theory of circuits becomes interesting when we restrict the complexities of the circuits; The theory of test suites becomes similarly interesting when restricting the amount of necessary data.”

  • Chapman, B., Williams, R.

The data complexity of testing $f$ is “small” if and only if the circuit complexity of $f$ is “large”

slide-34
SLIDE 34

Data Complexity Corollary

Another way to separate $NP$ from $P/\mathrm{poly}$!

Corollary 1.1: $NP \not\subseteq P/\mathrm{poly}$ if and only if the data complexity of testing size-$s$ circuits for SAT is at most $O(2^{s^{\varepsilon}})$

slide-35
SLIDE 35

Open Questions

  • Can new circuit lower bounds be proven based on the guidance of data complexity?

  • e.g. Does $\mathrm{CLIQUE}_{n^2}^{n/2}$ require circuits of size $\geq 4n^2$?

  • How does the complexity class of $f$ relate to the complexity of testing circuits for $f$?
  • If $f$ is computable in some complexity class, what can we say about the complexity class that supports testing for $f$?

  • Can the equivalence of Theorem 1.2 be tightened further?
slide-36
SLIDE 36

Data Complexity and Circuit Complexity

Theorem 1.2: Let $f : \{0,1\}^* \to \{0,1\}$ and $S(n) \geq 2n$ for all $n$

  • 1. If $f$ is in $\mathrm{SIZE}(S(n))$, the data complexity of testing size-$s$ circuits for $f$ is at least $2^{\Omega(S^{-1}(s))}$ (hard to test!)
  • 2. If $f$ is not in $\mathrm{SIZE}(n \cdot S(n))$, the data complexity of testing size-$s$ circuits for $f$ is at most $O(2^{S^{-1}(s)} + S^{-1}(s) \cdot s^2 \log s)$ (easy to test!)