ICLA 2019 Average Complexity of SAT
On Average Case Complexity of SAT
Johann A. Makowsky
Faculty of Computer Science Technion–Israel Institute of Technology Haifa, Israel
janos@cs.technion.ac.il www.cs.technion.ac.il/∼janos
The slides were essentially prepared by Yoni Mircae
it runs in expected polynomial time.
$E[T_A] = \frac{2^n - 1}{2^n} \cdot n^2 + \frac{1}{2^n} \cdot 2^n = O(n^2)$
A, the expected running time of B is: $E[T_B] = \frac{2^n - 1}{2^n} \cdot n^4 + \frac{1}{2^n} \cdot 2^{2n} = O(2^n)$
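The two expectations can be checked exactly. This is a minimal sketch, assuming (as the formulas suggest) that A takes $n^2$ steps on all but one of the $2^n$ equally likely inputs of length n and $2^n$ steps on the remaining one, and that B's running time is the square of A's on every input. The function names are hypothetical.

```python
from fractions import Fraction


def expected_time_a(n: int) -> Fraction:
    """E[T_A]: n^2 steps on 2^n - 1 inputs, 2^n steps on one input (assumed model)."""
    total = 2 ** n
    return Fraction(total - 1, total) * n ** 2 + Fraction(1, total) * 2 ** n


def expected_time_b(n: int) -> Fraction:
    """E[T_B] for T_B = T_A^2 on every input (assumed model)."""
    total = 2 ** n
    return Fraction(total - 1, total) * n ** 4 + Fraction(1, total) * 2 ** (2 * n)


# E[T_A] stays below n^2 + 1, while E[T_B] is at least 2^n.
for n in (5, 10, 20, 40):
    print(n, float(expected_time_a(n)), float(expected_time_b(n)))
```

Squaring the running time of an expected-polynomial algorithm can therefore leave the class, which is the usual objection to expected polynomial time as a notion of average-case efficiency.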
countable or finite set S if $\sum_{x \in S} \mu(x) = 1$.
$S_n = \{x \in S : |x| = n\}$ is finite.
global randomization of S.
$\Pr_\mu\{x \mid x \in S_n\}$. The sequence $\langle S_n, \mu_n \rangle$ is called a local randomization of S. Note that each $\mu_n$ is a pdf on $S_n$.
is polynomial time computable and $D \subseteq S$.
– Let $\mu^*$ be defined by $\mu^*(x) = \sum_{y \le x} \mu(y)$. $\mu$ is effectively computable if $\mu^*$ is polynomial time computable.
– A pair $\langle D, \mu \rangle$ with $\mu$ effectively computable is called a distributional problem. We think of D as the set of positive instances of some problem.
$w(n) = \Pr_\mu\{S_n\}$. Note that w is a pdf on $\mathbb{N}^+$.
that w is a regular weight function and that the global randomization is regular.
$\sum_n w(n) n^{\varepsilon} = \infty$ then we say that w is a strongly regular weight function and that the global randomization is strongly regular.
Example: $w(n) = n^{-1}(\log n)^{-2}$
Note that a local randomization with a weight function defines a unique global randomization.
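The passage between global randomizations, weight functions and local randomizations can be sketched on a finite toy example. The sketch assumes that the local randomization is obtained by conditioning, $\mu_n(x) = \mu(x) / w(n)$, and that the global one is recovered as $\mu(x) = w(|x|) \cdot \mu_{|x|}(x)$. The helper names and the toy distribution are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction


def weight_function(mu):
    """w(n) = Pr_mu{S_n}: total mass of the strings of length n."""
    w = defaultdict(Fraction)
    for x, p in mu.items():
        w[len(x)] += p
    return dict(w)


def local_from_global(mu):
    """mu_n(x) = mu(x) / w(n) for every length n of positive weight (assumed definition)."""
    w = weight_function(mu)
    local = defaultdict(dict)
    for x, p in mu.items():
        local[len(x)][x] = p / w[len(x)]
    return dict(local)


def global_from_local(local, w):
    """mu(x) = w(|x|) * mu_{|x|}(x): the global randomization fixed by local data and w."""
    return {x: w[n] * p for n, mu_n in local.items() for x, p in mu_n.items()}


# Toy global randomization on a finite set of bit strings (illustrative data only).
mu = {"0": Fraction(1, 4), "1": Fraction(1, 4),
      "00": Fraction(1, 8), "01": Fraction(1, 8), "11": Fraction(1, 4)}
w = weight_function(mu)
local = local_from_global(mu)
print(w, global_from_local(local, w) == mu)
```

The round trip returns the original $\mu$, which is the uniqueness statement above in miniature.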
$T : S \to \mathbb{R}^+$, $E_{\mu_n}(T) = \sum_{x \in S_n} T(x)\mu_n(x)$ is the expectation of T on inputs
– f is an upper bound on the expectation of T if $E_{\mu_n}(T(x)) \le f(n)$.
– f is a (local) upper bound in probability on T if $\lim_{n \to \infty} \Pr_{\mu_n}\{T(x) \le f(n)\} = 1$.
– f is a (local) lower bound in probability on T if $\lim_{n \to \infty} \Pr_{\mu_n}\{T(x) > f(n)\} = 1$.
Results in probabilistic analysis of algorithms are usually expressed with these types of local bounds on T.
at most f on the average that was developed in ∗: Let $\langle S, \mu \rangle$ be a global randomization on S and $T : S \to \mathbb{R}^+$. For a strictly increasing function $f : \mathbb{R}^+ \to \mathbb{R}^+$ we say that T is at most f on the average w.r.t. the global randomization $\langle S, \mu \rangle$ if $E_\mu\left(\frac{f^{-1}(T(x))}{|x|}\right) < \infty$ and denote this by $T \in AVB(\langle S, \mu \rangle, f)$ or simply $T \in AVB(f)$ if the randomization is evident from context.
∗Shai Ben-David, Benny Chor, Oded Goldreich, and Michael Luby. On the theory of average case complexity. Journal of Computer and System Sciences, 44(2):193-219, April 1992.
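A numerical sketch of this definition, applied to a running-time model with one slow input per length. The assumptions are: inputs of length n are uniform over $2^n$ strings, A takes $n^2$ steps except on one input per length where it takes $2^n$, B takes the square of A's time, and the weight function is $w(n) = c/n^2$. All names are hypothetical.

```python
import math


def avb_partial_sum(t_typical, t_rare, f_inverse, n_max):
    """Partial sum of sum_n (w(n)/n) * E_{mu_n}[f^{-1}(T)] with w(n) = c / n^2."""
    c = 6 / math.pi ** 2  # normalizes sum_n 1/n^2 to 1
    total = 0.0
    for n in range(1, n_max + 1):
        rare = 2.0 ** (-n)  # probability of the single slow input of length n
        local_exp = ((1 - rare) * f_inverse(t_typical(n))
                     + rare * f_inverse(t_rare(n)))
        total += c / n ** 2 * local_exp / n
    return total


def avb_sum_a(n_max):
    """Algorithm A measured against f(n) = n^2, so f^{-1}(t) = t^(1/2)."""
    return avb_partial_sum(lambda n: n ** 2, lambda n: 2.0 ** n,
                           lambda t: t ** 0.5, n_max)


def avb_sum_b(n_max):
    """Algorithm B = 'A squared' measured against f(n) = n^4, so f^{-1}(t) = t^(1/4)."""
    return avb_partial_sum(lambda n: n ** 4, lambda n: 2.0 ** (2 * n),
                           lambda t: t ** 0.25, n_max)


print(avb_sum_a(400), avb_sum_b(400))
```

Under these assumptions both partial sums settle to the same finite value, so both running times are at most polynomial on the average in this sense, even though the expected time of B is exponential.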
We can now define an (average) complexity class: Let $\langle D, \mu \rangle$ be a distributional problem. We say that $\langle D, \mu \rangle$ is polynomial on the average and write $\langle D, \mu \rangle \in AverP$ if there is a deterministic algorithm A for D with run-time $T_A$ and there is a polynomial p such that $T_A \in AVB(p)$.
Theorem 1 (Transfer Theorem for Upper Bounds): Let $\langle S, \mu \rangle$ be a global randomization, $\langle S_n, \mu_n \rangle$ the implied local randomization and $T : S \to \mathbb{R}^+$. For any function $f : \mathbb{R}^+ \to \mathbb{R}^+$:
Proof: Follows from Jensen's inequality: $\varphi\left(\frac{\sum a_i x_i}{\sum a_i}\right) \le \frac{\sum a_i \varphi(x_i)}{\sum a_i}$ for a real convex function $\varphi$ and positive weights $a_i$. The inequality is reversed if $\varphi$ is concave.
Theorem 2 (Transfer Theorem for Lower Bounds ∗): Let $\langle S, \mu \rangle$ be a global randomization with weight function w and let $f, g : \mathbb{R}^+ \to \mathbb{R}^+$ be two strictly increasing functions. If f is a lower bound in probability on T w.r.t. $\langle S_n, \mu_n \rangle$ and g is sufficiently small for $\sum_{n=1}^{\infty} \frac{w(n)}{n} g^{-1}(f(n)) = \infty$ to hold then $T \notin AVB(g)$.
∗Abraham Sharell. On the average case complexity of SAT for flat distributions. Master's thesis, Technion-Israel Institute of Technology, 1992.
Proof: Since g is strictly increasing it suffices to show that $E_\mu\left(\frac{g^{-1}(T(x))}{|x|}\right) = \infty$.
Reminder: Markov's inequality: If X is a nonnegative random variable and $a > 0$, then $P(X \ge a) \le \frac{E(X)}{a}$.
Applying Markov's inequality to the (strictly positive) random variable $g^{-1}(T(x))$ we derive for every $n \in \mathbb{N}^+$
$E_{\mu_n}(g^{-1}(T(x))) > g^{-1}(f(n)) \Pr_{\mu_n}\{g^{-1}(T(x)) > g^{-1}(f(n))\} = g^{-1}(f(n)) \Pr_{\mu_n}\{T(x) > f(n)\}$
Observing that
$E_\mu\left(\frac{g^{-1}(T(x))}{|x|}\right) = \sum_{n=1}^{\infty} \frac{w(n)}{n} E_{\mu_n}(g^{-1}(T(x))) > \sum_{n=1}^{\infty} \frac{w(n)}{n} g^{-1}(f(n)) \Pr_{\mu_n}\{T(x) > f(n)\}$
together with the assumptions in the hypothesis gives the desired result.
Corollary 3 Let $\langle S, \mu \rangle$ be a regular global randomization with weight function w. Let $f : \mathbb{R}^+ \to \mathbb{R}^+$ be a strictly increasing function, and assume that f(n) is a lower bound in probability on T w.r.t. $\langle S_n, \mu_n \rangle$.
(i) If w is regular then there exists $0 < \varepsilon < 1$ so that $T \notin AVB(f(n^{\varepsilon}))$.
(ii) If w is strongly regular then for every $0 < \varepsilon < 1$ we have $T \notin AVB(f(n^{\varepsilon}))$.
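The divergence condition of the Transfer Theorem for Lower Bounds can be explored numerically. This is a sketch under the assumption $w(n) = c/n^2$, chosen here only for illustration. For $g(n) = f(n^{\varepsilon})$ one has $g^{-1}(f(n)) = n^{1/\varepsilon}$ for any strictly increasing f, so f never has to be evaluated. The function name is hypothetical.

```python
import math


def transfer_series_partial_sum(eps: float, n_max: int) -> float:
    """Partial sum of sum_n (w(n)/n) * g^{-1}(f(n)) for g(n) = f(n^eps), w(n) = c/n^2."""
    c = 6 / math.pi ** 2  # normalizes sum_n 1/n^2 to 1
    return sum(c / n ** 2 * n ** (1 / eps) / n for n in range(1, n_max + 1))


# eps = 1/2: the general term is c/n, so the partial sums grow like c * ln(n_max).
print(transfer_series_partial_sum(0.5, 10 ** 3), transfer_series_partial_sum(0.5, 10 ** 5))
# eps = 1 (g = f): the general term is w(n), so the partial sums stay below 1.
print(transfer_series_partial_sum(1.0, 10 ** 5))
```

Only the divergent case lets the theorem conclude $T \notin AVB(g)$; a convergent series says nothing either way.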
Proof: For regular w let $c > 1$ be a constant so that for all $n \in \mathbb{N}^+$: $w(n) \ge n^{-c}$. Set $\varepsilon = \frac{1}{c}$ and $g(n) = f(n^{\varepsilon})$. Then $g^{-1}(f(n)) = n^c$ and
$\sum_{n=1}^{\infty} \frac{w(n)}{n} g^{-1}(f(n)) \ge \sum_{n=1}^{\infty} \frac{1}{n} = \infty$.
By the Transfer Theorem for Lower Bounds we conclude that $T \notin AVB(g)$.
For strongly regular weight functions let $0 < \varepsilon < 1$ and $g(n) = f(n^{\varepsilon})$. Then the general term in the above sum evaluates to $w(n) n^{(\frac{1}{\varepsilon} - 1)}$ and by the definition
Proposition 4 Let $f, g : \mathbb{R}^+ \to \mathbb{R}^+$ be two strictly increasing functions so that: $\lim_{n \to \infty} \frac{g^{-1}(f(n))}{n} = \infty$
If f is a lower bound with probability 1 on T w.r.t. $\langle S_n, \mu_n \rangle$ then there exists a global randomization $\langle S, \mu \rangle$ which is compatible with $\langle S_n, \mu_n \rangle$ such that $T \notin AVB(\langle S, \mu \rangle, g)$.
Proof: Define a new function $\delta$ on $a \in \mathbb{R}^+$ by: $\delta(a) = \frac{g^{-1}(f(a))}{a}$.
By the previous theorem it is sufficient to construct a weight function w so that $\sum_{n=1}^{\infty} w(n)\delta(n) = \infty$
We assume without loss of generality that $\delta$ is strictly positive and differentiable in $\mathbb{R}^+$. Denote the derivative by $\delta'$ and define w by:
$w(n) = \delta'(n)\delta(n)^{-2}$
To verify that w is indeed a weight function we have to show that its sum
and using the assumptions as follows:
$\sum_{n=1}^{\infty} w(n) \le \int_1^{\infty} \delta'(a)\delta(a)^{-2}\,da = \frac{1}{\delta(1)}$
On the other hand the sum over $w(n)\delta(n)$ diverges:
$\sum_{n=1}^{\infty} w(n)\delta(n) \ge \int_1^{\infty} \delta'(a)\delta(a)^{-1}\,da = [\ln \delta(a)]_1^{\infty} = \infty$.
Definition 5 (Gurevich, 1991): Let $\langle S, \mu \rangle$ be a global randomization and let $\langle S_n, \mu_n \rangle$ be a local randomization. Then:
sufficiently large input $x \in S$, $\mu(x) \le 2^{-|x|^{1/k}}$.
sufficiently large n and input $x \in S_n$, $\mu_n(x) \le 2^{-n^{1/k}}$.
Theorem 6 (Equivalence of global and local flatness): Let $\langle S, \mu \rangle$ be a global randomization with weight function w and let $\langle S_n, \mu_n \rangle$ be the implied local randomization:
Let V be any (finite or countably infinite) set of Boolean variables.
v and ¬v occur in C. We denote the set of all clauses over V by CL(V).
We define z(¬v) = 1 − z(v).
ℓ ∈ C.
in Σ.
Definition 7 (Negation-symmetry): Let $\Pi = \{\pi : V \to \{v, \neg v : v \in V\} \mid \forall v \in V : \pi(v) \in \{v, \neg v\}\}$. We extend $\pi \in \Pi$ to literals, clauses and $\Sigma = \langle C_1, ..., C_n \rangle \in \mathrm{CNFT}$ in the natural way. $\pi(\Sigma)$ is structurally the same as $\Sigma$ but for some variables $v \in V$ the literals v and ¬v are exchanged.
that $\Sigma_1$ and $\Sigma_2$ are negation-symmetric.
$[\Sigma]_\Pi = \{\pi(\Sigma) : \pi \in \Pi\}$.
Definition 8 (Negation-symmetric invariant randomization): Let $S \subseteq \mathrm{CNFT}$ be the union of some symmetry-classes and let $S_n$ be the CNFT-instances in S with n clauses. We say that a (local) randomization $\langle S_n, \mu_n \rangle$ is negation-symmetry invariant if for all $\Sigma_1, \Sigma_2 \in S$ that are negation-symmetric we have $\mu_n(\Sigma_1) = \mu_n(\Sigma_2)$ where $n = |\Sigma_1|$.
Proposition 9 For $\Sigma \in \mathrm{CNFT}$ let $\mathrm{var}(\Sigma)$ denote the number of distinct variables appearing in $\Sigma$. Then $|[\Sigma]_\Pi| = 2^{\mathrm{var}(\Sigma)}$.
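Proposition 9 can be checked by brute force on small instances. The sketch below uses a hypothetical encoding: a literal is a nonzero integer (negative for a negated variable), a clause is a frozenset of literals, and an instance is a tuple of clauses. It assumes that no clause contains both a variable and its negation.

```python
from itertools import combinations


def variables(sigma):
    """Set of variables occurring in the instance."""
    return {abs(lit) for clause in sigma for lit in clause}


def apply_flip(sigma, flipped):
    """Image of sigma under the map that exchanges v and not-v for every v in `flipped`."""
    return tuple(frozenset(-lit if abs(lit) in flipped else lit for lit in clause)
                 for clause in sigma)


def symmetry_class(sigma):
    """All distinct images of sigma; flipping variables outside sigma changes nothing."""
    vs = sorted(variables(sigma))
    return {apply_flip(sigma, set(subset))
            for r in range(len(vs) + 1)
            for subset in combinations(vs, r)}


sigma = (frozenset({1, -2}), frozenset({2, 3}), frozenset({-1, 3}))
print(len(symmetry_class(sigma)), 2 ** len(variables(sigma)))
```

Each subset of the occurring variables yields a different instance, which is why the class has exactly $2^{\mathrm{var}(\Sigma)}$ elements.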
Theorem 10 (Flatness Theorem, JAM and Sharell, 1992): Let $S \subseteq \mathrm{CNFT}$ and let $\langle S_n, \mu_n \rangle$ be a local randomization of S which is negation-symmetric invariant. If there is a constant $c \in \mathbb{N}^+$ such that for all $\Sigma \in S$ the number of clauses in $\Sigma$ is bounded by $\mathrm{var}(\Sigma)^c$ then $\langle S_n, \mu_n \rangle$ is locally flat.
Proof: Let $n \in \mathbb{N}^+$ and $\Sigma \in S_n$. Since S is the union of some symmetry-classes $[\Sigma]_\Pi \subset S$ and since all instances in a symmetry class have the same number
$\mathrm{var}(\Sigma) \ge n^{1/c}$ we can get a bound on $\mu_n(\Sigma)$ as follows:
$1 \ge \sum_{\Sigma' \in [\Sigma]_\Pi} \mu_n(\Sigma') = |[\Sigma]_\Pi| \mu_n(\Sigma) = 2^{\mathrm{var}(\Sigma)} \mu_n(\Sigma) \ge 2^{n^{1/c}} \mu_n(\Sigma)$.
Therefore: $\mu_n(\Sigma) \le 2^{-n^{1/c}}$.
Remark: An interesting special case is k-CNFT: if all clauses have exactly k literals for some constant k then the number of possible clauses is bounded by a polynomial in the number of variables. So any negation-symmetric-invariant randomization on a subset of k-CNFT (where the same clause is not allowed to appear multiple times in an instance) is flat.
We consider matrices M of 0’s and 1’s with 2v columns and t rows.
M(v + i, j) = 1 iff it occurs negatively.
2
and t is non-decreasing.
variables v.
$n = 2v\,t(v)$).
$\mu_n(t, p)(M) = p(v)^{l(M)} \cdot (1 - p(v))^{n - l(M)}$.
∗Cynthia Brown, Allen Goldberg, and Paul Purdom. Average time analysis of simplified Davis-Putnam procedures. Information Processing Letters, 15(2), September 1982.
†Cynthia Brown and Paul Purdom. The pure literal rule and polynomial average time. SIAM Journal on Computing, 14(4), November 1985.
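A small sketch of the FD-distribution formula. It assumes that $l(M)$ is the number of 1-entries of M and that $S(t)_n$ consists of all 0/1 matrices with t rows and 2v columns, so that every entry is set independently with probability p. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def mu_fd(matrix, p):
    """mu(M) = p^l(M) * (1 - p)^(n - l(M)), with l(M) the number of 1-entries (assumed)."""
    entries = [e for row in matrix for e in row]
    ones = sum(entries)
    return p ** ones * (1 - p) ** (len(entries) - ones)


def all_matrices(v, t):
    """All 0/1 matrices with t rows and 2v columns."""
    for bits in product((0, 1), repeat=2 * v * t):
        yield [bits[r * 2 * v:(r + 1) * 2 * v] for r in range(t)]


p = Fraction(1, 3)
total = sum(mu_fd(m, p) for m in all_matrices(2, 2))
print(total)
```

Under these assumptions the probabilities of all matrices of a fixed size sum to 1, so the formula does define a pdf on $S(t)_n$.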
FD-distributions are clearly negation-symmetric. However, the number of clauses is not bounded by a polynomial in the number of variables, so this is not sufficient to make them flat.
Theorem 11 (JAM and Sharell, 1992): An FD-distribution $\langle S(t)_n, \mu_n(t, p) \rangle$ is locally flat iff there is a $k \in \mathbb{N}^+$ such that for all sufficiently large $n \in \mathbb{N}^+$
$-\log_2(1 - p_n) \ge n^{(\frac{1}{k} - 1)}$
where $p_n = p(v)$ and v is the unique solution to $2v\,t(v) = n$.
Proof: $-\log_2(1 - p_n) \ge n^{(\frac{1}{k} - 1)}$ iff $(1 - p_n)^n \le 2^{-n^{1/k}}$ iff for every input $M \in S(t)_n$
$p_n^{l(M)} (1 - p_n)^{n - l(M)} \le 2^{-n^{1/k}}$ (using that $p_n \le \frac{1}{2}$) iff (by definition of $\mu_n(t, p)$) for every input $M \in S(t)_n$
$\mu_n(t, p)(M) \le 2^{-n^{1/k}}$
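The equivalence used in the proof can be sanity-checked numerically for individual values of n, $p_n$ and k. The sketch assumes $p_n \le 1/2$, so that the most likely matrix is the all-zero one with probability $(1 - p_n)^n$. The function names are hypothetical.

```python
import math


def max_probability_log2(n, p_n):
    """log2 of the largest mu_n(M), attained by the all-zero matrix when p_n <= 1/2."""
    return n * math.log2(1 - p_n)


def flat_by_definition(n, p_n, k):
    """Checks (1 - p_n)^n <= 2^(-n^(1/k)) on the log2 scale."""
    return max_probability_log2(n, p_n) <= -(n ** (1 / k))


def flat_by_criterion(n, p_n, k):
    """Checks -log2(1 - p_n) >= n^(1/k - 1), the condition of Theorem 11."""
    return -math.log2(1 - p_n) >= n ** (1 / k - 1)


for n, p_n, k in [(100, 0.5, 2), (100, 0.01, 2), (10 ** 4, 0.001, 3), (10 ** 4, 0.01, 3)]:
    print(n, p_n, k, flat_by_definition(n, p_n, k), flat_by_criterion(n, p_n, k))
```

The two tests agree on each sample, as they must, since one is the other divided by n.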
no variable x both x and ¬x occur in the same clause. For $v, k \in \mathbb{N}^+$ let $C(v, k) \subseteq C$ be the set of clauses with exactly k literals over the variables $x_1, ..., x_v$. Note that $C(v, k)$ has exactly $\binom{v}{k}$
αn variables of length k. For $\Sigma \in S_n(\alpha, k)$ define $\mu_n(\Sigma) = |C(\alpha n, k)|^{-n}$.
Theorem 12 For every choice of weight functions, the global version of these FS-distributions is flat.
Proof: An immediate consequence of the theorem about equivalence of global and local flatness.
the clause (A − {x}) ∪ (B − {¬x}) is called a resolvent of A and B.
vents.
such that
– each $C_k$ belongs to $\Sigma$ or is a resolvent of some $C_i, C_j$ such that $i, j < k$.
– $C_N$ is the empty clause.
satisfy each $C_k$. Since no truth assignment satisfies the empty clause $C_N$, it follows that $\Sigma$ is unsatisfiable.
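The two conditions on a resolution proof translate directly into a checker. This is a minimal sketch with a hypothetical encoding: a literal is a nonzero integer (negative for a negated variable) and a clause is a frozenset of literals.

```python
def resolvents(a, b):
    """All clauses (A - {x}) | (B - {not x}) for literals x in a whose complement is in b."""
    return {frozenset((a - {x}) | (b - {-x})) for x in a if -x in b}


def is_refutation(sigma, proof):
    """True iff every C_k is in sigma or a resolvent of two earlier clauses,
    and the last clause is empty."""
    sigma = set(sigma)
    for k, clause in enumerate(proof):
        earlier = proof[:k]
        derived = any(clause in resolvents(a, b) for a in earlier for b in earlier)
        if clause not in sigma and not derived:
            return False
    return bool(proof) and proof[-1] == frozenset()


# Sigma = {x1}, {not x1, x2}, {not x2} is unsatisfiable.
sigma = [frozenset({1}), frozenset({-1, 2}), frozenset({-2})]
proof = [frozenset({1}), frozenset({-1, 2}), frozenset({2}),
         frozenset({-2}), frozenset()]
print(is_refutation(sigma, proof))
```

The checker runs in time polynomial in the length of the proof, so the cost of resolution lies entirely in how long the shortest proof is.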
est N such that there is a resolution proof $C_1, C_2, ..., C_N$ of unsatisfiability
in expected polynomial time.
form of resolution takes more than $2^{\sqrt[4]{n}}$ steps for almost all instances.
∗A. Goldberg. Average case complexity of the satisfiability problem. In 4th Workshop on Automated Deduction, pages 1-6, 1979. Austin, TX.
†John Franco and Marvin Paull. Probabilistic analysis of the Davis-Putnam procedure for solving the satisfiability problem. Discrete Applied Mathematics, 5:77-87, 1983.
‡Vasek Chvatal and Endre Szemeredi. Many hard examples for resolution. Journal of the ACM, 35(4), October 1988.
The results here are based on a modified Davis-Putnam-Procedure to test satisfiability of clauses, which we call DPP∗, which is a special case of resolution.
Theorem 13 (Brown-Purdom) Let $\langle S(t)_n, \mu_n(t, p) \rangle$ be an FD-distribution and assume either
$\frac{-\ln\left(1 - O\left(\sqrt{\frac{\ln v}{v}}\right)\right)}{t(v)}$
Then there is a polynomial P(v) such that DPP∗, and hence resolution, has expected run-time O(P(v)).
Definition 14 (global FD-distribution) Let $\langle S(t), \mu \rangle$ be a global randomization which is compatible with a local FD-distribution. Then we call $\langle S(t), \mu \rangle$ a global FD-distribution.
Theorem 15 Let $\langle S, \mu_i \rangle$ be global FD-distributions, each compatible with a local FD-distribution subject to one of the conditions in the Brown-Purdom theorem. Let $\langle S, \mu \rangle$ be a global randomization that is a finite linear combination of the $\langle S, \mu_i \rangle$. That is, for some constants $a_1, a_2, ..., a_m \in \mathbb{R}^+$: $\mu(x) = \sum_{i=1}^{m} a_i \mu_i(x)$.
Then the distributional problem that consists of SAT and the randomization $\langle S, \mu \rangle$ is in AverP.
Proof: From the Transfer Theorem for Upper Bounds we derive that for each $1 \le i \le m$ SAT with $\langle S, \mu_i \rangle$ is in AverP. It is easy to show that this is preserved under finite linear combinations.
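The closure step in the proof rests on linearity of expectation: $E_\mu(h) = \sum_i a_i E_{\mu_i}(h)$, so a finite combination of finite expectations is finite. A small sketch on toy distributions follows; the data and helper names are hypothetical, and the coefficients are chosen to sum to 1 so that the mixture is again a pdf.

```python
from fractions import Fraction


def mixture(coeffs, dists):
    """mu(x) = sum_i a_i * mu_i(x) for distributions given as dicts."""
    support = set().union(*dists)
    return {x: sum(a * d.get(x, Fraction(0)) for a, d in zip(coeffs, dists))
            for x in support}


def expectation(dist, h):
    """E_dist(h) = sum_x dist(x) * h(x)."""
    return sum(p * h(x) for x, p in dist.items())


mu1 = {"0": Fraction(1, 2), "1": Fraction(1, 2)}
mu2 = {"1": Fraction(1, 4), "11": Fraction(3, 4)}
coeffs = [Fraction(1, 3), Fraction(2, 3)]
mu = mixture(coeffs, [mu1, mu2])

lhs = expectation(mu, len)
rhs = sum(a * expectation(d, len) for a, d in zip(coeffs, [mu1, mu2]))
print(lhs, rhs)
```

Taking $h(x) = p^{-1}(T(x)) / |x|$ for a polynomial p that works for every $\mu_i$ gives the membership in AverP claimed in the theorem.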
Theorem 16 (Chvatal and Szemeredi) Let $\langle S_n(\alpha, k), \mu_n \rangle$ be an FS-randomization with $k \ge 3$ and $\alpha \le \frac{2^{-k}}{0.7}$.
Then there is a constant $c \in \mathbb{R}^+$ such that $\lim_{n \to \infty} \Pr_{\mu_n}\{T_{res}(\Sigma) \ge 2^{cn}\} = 1$, where $T_{res}(\Sigma)$ is the resolution complexity of $\Sigma$.
The following result shows that the condition on $\alpha$ cannot be relaxed too much in the above theorem.
Theorem 17 (Franco) Let $\langle S_n(\alpha, k), \mu_n \rangle$ be an FS-randomization with $\alpha > 1$. Then there is an algorithm with runtime T so that for some $c > 0$: $\lim_{n \to \infty} \Pr_{\mu_n}\{T(x) > n^c\} = 0$.
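Theorem 18 (JAM and Sharell, 1992): Let $\langle S_n(\alpha, k), \mu_n \rangle$ be a local randomization with $k \ge 3$ and $\alpha \le \frac{2^{-k}}{0.7}$.
for every $0 < \varepsilon < 1$, $T_{res} \notin AVB(\langle S, \mu \rangle, 2^{n^{\varepsilon}})$.
global randomization $\langle S, \mu \rangle$ compatible with $\langle S_n(\alpha, k), \mu_n \rangle$ so that: $T_{res} \notin AVB(\langle S, \mu \rangle, 2^{g(n)})$
Proof: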
(1992) ∗ tested empirically the hardness of randomly generated 3-SAT formulas: m clauses are constructed uniformly and independently at random, each clause is obtained by sampling uniformly and independently 3 of n variables and negating each of them with probability 1/2.
Using the Davis-Putnam (DP) procedure, they found an easy-hard-easy pattern, where the hardest formulas in terms of number of DP calls have a density (i.e. the clauses/variables ratio, m/n) of ≈ 4.3, near the point where 50% of the formulas are satisfiable.
This suggests guidelines for constructing distributions of formulas for testing the average complexity
∗Mitchell, David, Bart Selman, and Hector Levesque. "Hard and easy distributions of SAT problems." AAAI. Vol. 92. 1992.
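The generator described above is easy to sketch. The version below assumes that the three variables of a clause are distinct, and it decides satisfiability by brute force, so it is only meant for very small n; it does not reproduce the DP procedure used in the experiments. All names are hypothetical.

```python
import random
from itertools import product


def random_3sat(n_vars, n_clauses, rng):
    """m clauses, each over 3 distinct uniformly chosen variables (assumed),
    each variable negated with probability 1/2."""
    formula = []
    for _ in range(n_clauses):
        chosen = rng.sample(range(1, n_vars + 1), 3)
        formula.append(tuple(v if rng.random() < 0.5 else -v for v in chosen))
    return formula


def satisfiable(formula, n_vars):
    """Brute force over all 2^n assignments; exponential, for tiny n only."""
    for bits in product((False, True), repeat=n_vars):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in clause) for clause in formula):
            return True
    return False


def satisfiable_fraction(n_vars, density, trials, seed=0):
    """Fraction of satisfiable samples at clause/variable ratio `density`."""
    rng = random.Random(seed)
    m = round(density * n_vars)
    hits = sum(satisfiable(random_3sat(n_vars, m, rng), n_vars) for _ in range(trials))
    return hits / trials


for density in (2.0, 4.3, 8.0):
    print(density, satisfiable_fraction(10, density, 30))
```

At such small n the transition is blurred, so the printed fractions should be read as a qualitative illustration rather than as a reproduction of the reported threshold.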
ratio and approximability of SAT and other NP-complete problems. He showed that for a particular distribution the clause/variable ratio affects the approximability not only of SAT problems but also for many
∗ U. Feige, Relations between Average Case Complexity and Approximation Complexity
(2000) ∗ investigated experimentally the average-case complexity of random 3-SAT formulas for fixed density and varying number of variables. They found a phase transition in which the complexity shifts from polynomial to exponential, where the value of density at which the phase transition occurs appears to be solver-dependent: the GRASP algorithm shifts from polynomial to exponential complexity near the density of 3.8, the CPLEX algorithm shifts near density 3, while the transition of the CUDD algorithm is observed between densities of 0.1 and 0.5.
∗Coarfa, C., Demopoulos, D. D., Aguirre, A. S. M., Subramanian, D., and Vardi, M. Y.
”Random 3-SAT: The plot thickens.” International Conference on Principles and Practice
Let Φ be a uniformly distributed random k-SAT formula with n variables and m clauses, then the Walksat algorithm finds a satisfying assignment
constant ρ > 0.
† proved that the Walksat algorithm is ineffective with high probability if $m/n > c \, 2^k \ln^2 k / k$ where $c > 0$ is an absolute constant.
∗Coja-Oghlan, Amin, and Alan Frieze. "Analyzing Walksat on random formulas." SIAM Journal on Computing 43.4: 1456-1485. 2014
†Coja-Oghlan, Amin, Amir Haqshenas, and Samuel Hetterich. "Walksat Stalls Well Below Satisfiability." SIAM Journal on Discrete Mathematics 31.2: 1160-1173. 2017
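For orientation, a random-walk local search in the spirit of Walksat can be sketched as follows. This is a simplified variant (flip a uniformly random variable of a uniformly random unsatisfied clause) and is not claimed to be the exact procedure analysed in the cited papers. Literals are nonzero integers, and the function name and parameters are hypothetical.

```python
import random


def walksat(formula, n_vars, max_flips, rng):
    """Simplified random walk: returns a satisfying assignment as a dict, or None."""
    assign = {v: rng.random() < 0.5 for v in range(1, n_vars + 1)}

    def unsatisfied():
        # Clauses in which no literal agrees with the current assignment.
        return [c for c in formula
                if not any(assign[abs(l)] == (l > 0) for l in c)]

    for _ in range(max_flips):
        unsat = unsatisfied()
        if not unsat:
            return assign
        var = abs(rng.choice(rng.choice(unsat)))  # random literal of a random bad clause
        assign[var] = not assign[var]
    return assign if not unsatisfied() else None


# (x1 or x2) and (not x1 or x2) and (x1 or not x2): only x1 = x2 = True satisfies it.
formula = [(1, 2), (-1, 2), (1, -2)]
print(walksat(formula, 2, 1000, random.Random(1)))
```

The search is incomplete: returning None after `max_flips` steps does not show that the formula is unsatisfiable.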