
SLIDE 1

Lecture Slides for MAT-73006 Theoretical computer science PART IIIa: Space Complexity

Henri Hansen March 20, 2014

SLIDE 2

Space complexity

  • Let M be a TM that halts on all inputs. The space complexity of M is the function f : N → N where f(n) is the maximum number of tape cells that M scans on any input of length n (on any branch, if M is nondeterministic)
  • If L is the language decided by M, and f is the space complexity of M, then we say that M decides L in f(n) space, and that L is decided in f(n) space.

  • If f : N → R+, the space complexity class SPACE(f(n)) is the set of languages that are decided in O(f(n)) space

SLIDE 3

by a deterministic Turing machine, and NSPACE(f(n)) is the set of languages that are decided in O(f(n)) space by a nondeterministic Turing machine.

  • Space is more powerful than time, because space can be reused.

  • Example: Let M1 be defined as a TM that receives a Boolean formula φ as input, and

  • 1. For each truth assignment x1, . . . , xm of the variables of φ:

2. Evaluate φ
SLIDE 4
  • 3. If φ was ever evaluated as 1, accept, otherwise reject.
  • Clearly M1 decides SAT, and it uses only O(m) space, where m is the number of variables in φ, which means that the space consumption is O(n) (because n > m).

  • Example: Recall that ALLNFA is the language of NFAs that accept all strings. (This language is not known to be in NP or coNP.)

  • This can be decided in nondeterministic O(n) space as follows:

  • 1. Mark the start state

SLIDE 5

  • 2. Repeat 2^q times, where q is the number of states:

3. Select an input symbol nondeterministically and change the position of the markers according to where the marked states might end up if that input symbol was read

  • 4. Accept if the markers are ever distributed in such a way that no accepting state is marked, otherwise reject.

  • Space requirement is clearly O(n), and this kind of nondeterministic machine accepts the complement of ALLNFA.
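A rough deterministic illustration of the marker algorithm (the NFA encoding is an assumed toy one, with no epsilon transitions): the markers form a set of states, at most 2^q distinct marker sets exist, and we accept the complement question as soon as some reachable marker set contains no accepting state.

```python
def rejects_some_string(alphabet, delta, start, accepting):
    """Check whether an NFA rejects some string, i.e. is NOT in ALLNFA.

    `delta` maps (state, symbol) to a tuple of successor states.  The
    markers on states form a frozenset; from each marker set we try
    every input symbol, as the nondeterministic machine would guess
    one.  With q states there are at most 2^q marker sets, so the
    search terminates.
    """
    def step(marked, symbol):
        return frozenset(r for q in marked for r in delta.get((q, symbol), ()))

    seen = set()
    frontier = [frozenset([start])]
    while frontier:
        marked = frontier.pop()
        if marked in seen:
            continue
        seen.add(marked)
        if not (marked & set(accepting)):
            return True  # no accepting state marked: some string is rejected
        for a in alphabet:
            frontier.append(step(marked, a))
    return False
```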

SLIDE 6

Savitch’s theorem

  • We know from the previous example that the complement of ALLNFA is in NSPACE(n).

  • For deterministic space this would immediately give the space complexity of ALLNFA, but if one considers the asymmetry of NP and coNP, it is not immediately obvious what the result means for ALLNFA itself.

  • We can, however, prove a theorem that connects nondeterministic and deterministic space classes in a way that does not seem to be the case for time classes.

SLIDE 7
  • Savitch’s Theorem: For every function f : N → R+ such that f(n) ≥ n,

NSPACE(f(n)) ⊆ SPACE(f(n)^2)

  • The proof is not particularly hard to understand, but somewhat technical. The main idea is that we simulate the nondeterministic Turing machine N with a deterministic Turing machine M, using a recursion that stores at most O(f(n)) configurations of N at a time; because N works in f(n) space, the space requirement of M will be O(f(n)^2), as claimed in the theorem.

  • Let N decide A in f(n) space nondeterministically. M is constructed so that it implements a procedure CANYIELD(c1, c2, t),

SLIDE 8

that tests if N can get from configuration c1 to c2 in at most t steps.

  • In a nondeterministic machine CANYIELD(c1, c2, t) would be easy to implement, because all it would have to do would be to nondeterministically guess the correct steps. But we show that we can do this deterministically as follows:

  • CANYIELD(c1, c2, t) takes the three inputs and

  • 1. If t = 1, it checks if c1 = c2, or if one of the transitions of N yields c2 from c1; accept if either of these is possible, reject otherwise.
SLIDE 9
  • 2. If t > 1, one by one generate every configuration cm of N that fits in f(n) space, and: (a) Check CANYIELD(c1, cm, t/2) (b) Check CANYIELD(cm, c2, t/2) If both accept, then accept.

  • 3. If you tried them all without accepting, reject
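The recursive procedure can be sketched as follows (the interface is hypothetical: configurations are abstract values, `configs` enumerates every configuration that fits in f(n) space, and `yields(a, b)` tests a single transition of N):

```python
def can_yield(c1, c2, t, configs, yields):
    """Savitch-style deterministic test: can N go from c1 to c2 in <= t steps?

    The recursion depth is O(log t) and each level stores only a
    constant number of configurations, which is where the f(n)^2
    deterministic space bound comes from.
    """
    if t <= 1:
        return c1 == c2 or yields(c1, c2)
    for cm in configs:                      # try every midpoint configuration
        if (can_yield(c1, cm, t // 2, configs, yields)
                and can_yield(cm, c2, (t + 1) // 2, configs, yields)):
            return True
    return False
```

As a usage example, on a "line graph" of configurations 0 → 1 → 2 → 3 → 4 with single-step transitions, configuration 4 is reachable from 0 in 4 steps but not in 3.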
  • Since N decides a language in NSPACE(f(n)), there is a constant c such that the length of a configuration of N never exceeds c·f(n), and we can define another constant d such that N cannot have more than 2^(d·f(n)) different configurations.
SLIDE 10
  • M works so that, given an input w of length n, it calculates CANYIELD(c0, ca, 2^(d·f(n))), i.e., checks if N can yield an accepting configuration from the initial configuration.

  • We still need to show that M works in O(f(n)^2) space!
  • You need f(n) space for each active recursive copy of CANYIELD.
  • How many recursive copies of CANYIELD are there at any time?

  • Let us look at the values of t: Initially there is only one call, with t = 2^(d·f(n)).
SLIDE 11
  • When there is one recursion running with t, then there is at most one recursive call running with t/2 at a time.

  • The ”active” values of t are then 2^(d·f(n)), 2^(d·f(n)−1), . . . , 2^0, i.e., the number of active calls is then at most d·f(n)

  • Each requires f(n) space, so that the total space needed is d·f(n)^2, meaning that M decides the language in O(f(n)^2) space.

  • We have just proven the theorem!
SLIDE 12

PSPACE

  • We define the set PSPACE as the class of languages that are decided in polynomial space by deterministic Turing machines. I.e.,

PSPACE = ∪k SPACE(n^k)

  • We can define NPSPACE in a similar manner, by allowing the Turing machine to be nondeterministic, but Savitch’s theorem has the following corollary: PSPACE = NPSPACE!

  • As we have shown that SAT is in PSPACE, we have established that NP ⊆ PSPACE, and we have: P ⊆ NP ⊆ PSPACE = NPSPACE.

SLIDE 13
  • We actually do not know if these containments are proper or if in fact the classes are all equal!

  • It is, however, plausible that at least not all of them are equal.
SLIDE 14

PSPACE-completeness

  • We define a language B as being PSPACE-complete if it satisfies the following conditions:

  • 1. B ∈ PSPACE, and

  • 2. Every A ∈ PSPACE is polynomial time reducible to B

  • If only the second condition is satisfied, we use the term PSPACE-hard.

  • Please note that we use the same polynomial time reducibility as we did for NP-completeness.

SLIDE 15
  • The reason for this is that ”complete” problems are considered important because they are in some sense ”the most difficult” or most powerful problems of the class.

  • For a problem to be that powerful, it must be possible to use it to solve all the problems in the same class through a simple or easy transformation.
  • So, the reduction that is used in completeness definitions should be deterministic and simpler than the class itself.

  • Example: the TQBF-problem.
SLIDE 16
  • Recall that Boolean formulas are expressions over Boolean variables, using the operators ∨, ∧, ¬.

  • A quantifier is either ∀ or ∃ followed by a variable. A quantified Boolean formula, or QBF, is defined as follows:

  • 1. Every Boolean formula φ is a QBF

  • 2. If φ is a QBF and x is a variable, then ∀x(φ) and ∃x(φ) are QBFs.

  • This kind of formula is actually a restricted QBF, in that it is in prenex normal form, meaning that all the quantifiers appear only at the front of the formula (please check that this is so!)
SLIDE 17
  • If Q is some quantifier and Qx appears in the formula φ, then we say x is bound in φ.

  • A formula is fully quantified if every variable in it is bound.
  • Let φx=1 mean the formula that results when x is given the value 1 in φ. The meaning of the quantifiers is defined as

– ∀xφ is true iff φx=1 ∧ φx=0 is true.

– ∃xφ is true iff φx=1 ∨ φx=0 is true.

  • Then we define the language

TQBF = {(φ) | φ is a true fully quantified Boolean formula}
SLIDE 18
  • Theorem: TQBF is PSPACE-complete
  • Proof: First we show that TQBF is in PSPACE. Consider the following algorithm, with (φ) as input:

  • 1. If φ has no quantifiers, then it contains only constants, and we can evaluate it, and accept if it is true

  • 2. If φ = ∃xψ, we recursively decide the truth values of ψx=0 and ψx=1 and accept if one of them is true, otherwise reject

  • 3. If φ = ∀xψ, we recursively decide the truth values of ψx=0 and ψx=1 and accept if both are true, otherwise reject
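The three steps above can be sketched as a recursive evaluator (the nested-tuple encoding of formulas is an assumed one, not from the slides); the recursion depth is at most the number of variables, so only the current partial assignment is stored.

```python
def true_qbf(formula, env=None):
    """Evaluate a fully quantified Boolean formula in polynomial space.

    Formulas are nested tuples (a toy encoding):
      ('var', name), ('not', f), ('and', f, g), ('or', f, g),
      ('exists', name, f), ('forall', name, f).
    """
    env = env or {}
    op = formula[0]
    if op == 'var':
        return env[formula[1]]
    if op == 'not':
        return not true_qbf(formula[1], env)
    if op == 'and':
        return true_qbf(formula[1], env) and true_qbf(formula[2], env)
    if op == 'or':
        return true_qbf(formula[1], env) or true_qbf(formula[2], env)
    # quantifiers: try both truth values for the bound variable
    name, body = formula[1], formula[2]
    results = [true_qbf(body, {**env, name: v}) for v in (False, True)]
    return all(results) if op == 'forall' else any(results)
```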

SLIDE 19
  • Again, the number of recursive calls needed is less than n, and even if we store all the intermediate formulas, we only use O(n^2) space. (In fact, we can make it run in O(n) space.)

  • To show PSPACE-hardness, assume A is a language that can be decided in n^k space by a Turing machine M

  • A string w is then mapped into a QBF φ that is true iff M accepts w

  • Let c1 and c2 represent configurations of M, and t be a number. We define a QBF φc1,c2,t that is true iff c1 yields c2 in at most t steps.
SLIDE 20
  • We build the same kind of matrix of cells as in the proof that SAT is NP-complete. The matrix has n^(2k) cells: there are n^k rows in the matrix, each row containing a configuration of length n^k. We have one variable for each cell

  • For t = 1, φc1,c2,t is constructed from triplets of cells: either the configurations are equal, or there is some transition of M that yields c2 from c1. The number of these formulas is the square of the number of the variables.

  • For t > 1, we should define φc1,c2,t as a formula that is equivalent to ∃m(φc1,m,t/2 ∧ φm,c2,t/2), but this has one problem: we would need to start with t = 2^(d·n^k) (as this is the maximum number of steps), which would result in

SLIDE 21

d·n^k levels of quantification; the problem is that each quantification would double the length of the formula!

  • However, we can introduce new variables to ”fold” the recursive growth of the formula: ∃m ∀c3 ∀c4 (((c3 = c1 ∧ c4 = m) ∨ (c3 = m ∧ c4 = c2)) → φc3,c4,t/2)

  • We use the shorthand x = y for the formula (x ∧ y) ∨ (¬x ∧ ¬y). c3 and c4 are completely new variables that are not cell variables

  • In this construction, the length of the formula is kept in check: t starts from 2^(d·n^k) and halves on each quantification, so the total number of recursive quantifications is d·n^k.
SLIDE 22
  • Notice that the recursive quantification is needed for every cell variable of a configuration, of which there are n^k, so the total length of the formula is on the order of d·n^k · n^k, but this is still polynomial in n
SLIDE 23

Games and PSPACE

  • We define a game as a competition in which two players choose between ”moves” in turns, and attempt to achieve some outcome (i.e., ”win”).

  • We can actually express quite a large class of games using quantifiers, i.e., they have a natural connection with the TQBF-problem.

  • We discuss this by means of the so-called formula game

  • Let φ = ∃x1∀x2∃x3 · · · Qxk(ψ) be a QBF.

6

slide-24
SLIDE 24
  • Two players, A and E, play the game by picking values for the variables x1, . . . , xk. Player E selects the values for variables quantified by ∃ and Player A selects the values for the variables quantified by ∀. These choices are made in the order in which the variables are numbered.
  • The game ends after all variables have gotten their values. If the formula is then true, player E wins, and if it is false, player A wins.

  • We define the language

FG = {(φ) | player E has a sure-win strategy in the game of φ}

  • Theorem: FG is PSPACE-complete
SLIDE 25
  • Proof: FG is really just an instance of TQBF, if you consider it carefully: Player E wins if the formula is true in the end, and chooses the truth values for variables quantified with ∃.

  • It means that no matter what player A chooses, player E can always counter the choice with a value that makes the formula true.

  • Another game is Geography. In it, the players take turns in naming cities of the world. The player whose turn it is must always name a previously unnamed city that begins with the last letter of the city named by the previous player. If this is not possible, then that player loses.
SLIDE 26
  • We can model a slightly more general version of this game by a directed graph: we start with an arbitrary node in the graph, and the player in turn chooses an edge that leads to a previously unvisited node. If all the neighbouring nodes are visited (or there are none), then the player loses.

  • We define

GG = {(G, b) | Player 1 has a winning strategy starting from b}

  • Theorem: GG is PSPACE-complete

  • Firstly, to see that GG is in PSPACE, consider the algorithm that receives the input (G, b):
SLIDE 27
  • 1. If b has no outgoing edges, then reject, as this is a losing node

  • 2. Remove all edges that come into b, to get a new graph G′

  • 3. For each neighbouring node b′ of b in G′, recursively solve (G′, b′)

  • 4. If they all accept, then reject, otherwise, accept
  • This is a brute-force algorithm, but if the number of nodes in G is m, we need to store at most m − 1 graphs that are all smaller than G, so that the space requirement is less than n^2
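A minimal sketch of this brute-force algorithm (instead of materializing the graphs G′, it threads the set of visited nodes through the recursion, which has the same effect as removing the edges into b; the encoding is my own):

```python
def player1_wins(edges, b, visited=None):
    """Decide generalized geography by the recursive algorithm above.

    `edges` maps each node to a list of its successors.  The current
    player at node b wins iff some move leads to a position that is
    losing for the opponent.
    """
    visited = (visited or set()) | {b}
    moves = [v for v in edges.get(b, ()) if v not in visited]
    if not moves:
        return False          # current player is stuck and loses
    return any(not player1_wins(edges, v, visited) for v in moves)
```

For example, on the path 1 → 2 the first player wins (the opponent gets stuck at 2), while on the path 1 → 2 → 3 the first player loses.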

SLIDE 28
  • We show a reduction from FG (or TQBF), and for simplicity, we assume the formula part is in 3CNF and that the quantifiers alternate between universal and existential, the first being existential.

  • For every variable xi in the QBF, we define a gadget that has four nodes, {xi^up, xi^left, xi^right, xi^down}, and four edges so that you can get from up to down either by going left or right. From the last node in the last gadget, i.e., xk^down, you have an edge to a special node c.

  • For each clause (ai ∨ bi ∨ ci) we create a clause node ci, and there is an edge from c to each ci.
SLIDE 29
  • For each literal ai we create a literal node, and create an edge from ci to ai.

  • If ai is a variable xj, then we draw an edge from ai to the node xj^left, and if it is a negation, ¬xj, we draw the edge from ai to xj^right.

  • Then, we let b = x1^up.

  • Why does the reduction work? Suppose the first player wins. He has selected, for odd numbered i, whether to go to xi^left or xi^right. These correspond to true and false values for xi, respectively.
SLIDE 30
  • The ”end game” stage follows when we get to the special node c. If one of the clauses is not satisfied (i.e., the formula is false), then the second player can choose a clause node that is not satisfied. If all clauses are satisfied, no matter what node the second player chooses, the first player can then choose a literal node that corresponds to a true literal, and the game ends, because the corresponding node in the variable gadget has been visited, and the second player loses.
SLIDE 31

The classes L and NL

  • For the sort of Turing machine that we have been discussing, a space complexity function f(n) makes very little sense unless f(n) ≥ n, because the input alone requires n cells on the tape.
  • However, if we consider the input to be read-only, it makes sense to talk about space complexity in terms of the extra memory that is used by the TM.
  • The distinction is irrelevant for almost all other purposes, but for space complexity that is sub-linear, we need it.

SLIDE 32
  • We assume Turing machines have two tapes: an input tape and a regular tape. The input tape contains the input, and the machine cannot write anything on the input tape. The regular tape is initially empty, and the Turing machine can write anything on the regular tape.

  • For space complexities f(n) with f(n) ≥ n the two models are equivalent.

  • We define L as the class of languages that can be decided by a deterministic Turing machine in logarithmic space, i.e.,

L = SPACE(log n)
SLIDE 33
  • Likewise NL = NSPACE(log n) for nondeterministic Turing machines.

  • We have a theorem: L ⊆ P.
  • Proof: The number of different configurations is limited: the second tape may contain at most O(log n) symbols. If we assume the alphabet size is k, the second tape can contain at most O(k^(log n)) = O(n^a) different strings for some constant a. The heads of the Turing machine can be in one of O(n log n) combined positions, and this leads to O(n^(a+1) log n) different configurations.

  • Similarly, NL ⊆ P, because the number of different configurations in the nondeterministic computation is likewise at most polynomial.
SLIDE 34

Examples

  • The language {0^k 1^k | k ≥ 0} is in L: we need one counter that counts up when reading the zeroes and down when reading the ones.
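The counter idea can be sketched as follows (note that, besides the counter, the decider must also check that no 0 follows a 1; the function name is my own):

```python
def in_0k1k(w):
    """Decide {0^k 1^k | k >= 0} with a single counter (log-space sketch).

    The input is read once, left to right; the only writable storage is
    the integer `count` and one flag, i.e. O(log n) bits.
    """
    count = 0
    seen_one = False
    for ch in w:
        if ch == '0':
            if seen_one:        # a 0 after a 1: wrong shape
                return False
            count += 1
        elif ch == '1':
            seen_one = True
            count -= 1
            if count < 0:       # more 1s than 0s so far
                return False
        else:
            return False
    return count == 0
```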

  • Theorem: PATH is in NL.
  • Consider a nondeterministic Turing machine that has a graph description on its input. To decide if there is a path between two nodes s and t, it has to guess the next node on the path, starting from s, and when it finds t, it accepts. Clearly, no more space is needed than

SLIDE 35

the space it takes to identify the node where the search currently is; this requires log n bits.
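The guessing process can be illustrated by a sketch that deterministically explores every possible guess (the adjacency-dict encoding is an assumed one); the point is that each individual branch stores only the current node and a step counter, which is O(log n) bits.

```python
def nd_path(adj, s, t):
    """Simulate the nondeterministic PATH algorithm.

    `adj` maps each node to a list of its successors.  Each branch
    keeps only the current node and the remaining step budget; the
    budget (number of nodes) guarantees termination.
    """
    m = len(adj)

    def branch(u, steps_left):
        if u == t:
            return True           # some branch reached t: accept
        if steps_left == 0:
            return False          # this branch ran out of steps
        return any(branch(v, steps_left - 1) for v in adj.get(u, ()))

    return branch(s, m)
```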

  • There are several others which we cover during the exercises

SLIDE 36

NL-completeness

  • PATH is in NL, but we don’t know if it is in L.
  • With polynomial space, we were able to show that PSPACE = NPSPACE, but there we relied on polynomial space being closed under squaring. No such technique is possible here, because the square of a logarithm is no longer O(log n), so it would take us out of L.

  • The question of NL vs L is an open problem similar to P vs NP.

SLIDE 37
  • This gives rise to the question: Are there problems that are ”representative” of NL in a manner that is similar to NP: Can we have NL-completeness?

  • We cannot define NL-completeness by polynomial time reductions, because within P we can easily produce problems that are artificially much larger than the original, thereby overriding the O(log n) space bound.
  • Therefore, we define a new reduction that uses only logarithmic space. We define this using the concept of a transducer.

  • A log space transducer is a Turing machine with a read-only input tape, a write-only output tape, and a work tape. The

SLIDE 38

head of the output tape cannot move left, and the work tape may contain at most O(log n) symbols.

  • A log space transducer M computes a function f : Σ∗ → Σ∗ where f(w) is the content of the output tape when M halts with w on the input tape.

  • We call f a log space computable function. A language A is log space reducible to B, written A ≤L B, iff A is mapping reducible to B by means of a log space computable function.

  • A language B is NL-complete iff
SLIDE 39
  • 1. B ∈ NL, and
  • 2. Every A ∈ NL is log space reducible to B
  • Theorem: If A ≤L B and B ∈ L, then A ∈ L.
  • Proof: Note that we cannot use the reduction function f to first map to B, because writing down f(w) may require more than logarithmic space. Instead, we design a single machine by putting the transducer and the decider for B together.

  • Let MA be the machine that computes the reduction f, but it computes individual symbols of f(w) only when MB needs them. Between such requests, MA is in its initial state.
SLIDE 40
  • When MB would read a symbol of its input, MA restarts the transducer and only outputs the required symbol of f(w). In order to do this, it needs one number, which is used for counting which symbol of f(w) is needed. Because f(w) can be calculated in logarithmic space, the length of f(w) cannot be more than a polynomial of the length of w, and an index into it can be expressed in O(log(n^k)) = O(log n) space.
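The restart trick can be sketched as follows (the generator interface is an assumed stand-in for the transducer, and the example transducer and language are hypothetical): f(w) is never stored; to read symbol i we rerun the transducer from scratch and keep only the index.

```python
def compose(transducer, decider):
    """Compose a log space reduction with a decider for B (a sketch).

    `transducer(w)` is a generator yielding f(w) one symbol at a time.
    `decider` receives a function i -> i-th symbol of f(w) (or None
    past the end), mimicking MB reading its input on demand.
    """
    def symbol_at(w, i):
        gen = transducer(w)             # restart MA from its initial state
        for _ in range(i):
            next(gen)                   # discard symbols before position i
        return next(gen, None)          # None plays the role of end-of-input

    def decide_a(w):
        return decider(lambda i: symbol_at(w, i))
    return decide_a

# hypothetical example: f(w) prefixes w with 'x' iff |w| has even length,
# and B = strings starting with 'x', so the composition decides
# A = even-length strings
def parity_transducer(w):
    yield 'x' if len(w) % 2 == 0 else 'y'
    for ch in w:
        yield ch

decide_even = compose(parity_transducer, lambda read: read(0) == 'x')
```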

  • MB itself needs only O(log n) space, and adding these two together, we get O(log n) space.

  • Corollary: If any NL-complete language is in L, then NL = L.
SLIDE 41
  • Theorem: PATH is NL-complete
  • Proof: We know that PATH is in NL, so we still need to show that an arbitrary problem in NL is log space reducible to PATH.

  • Assume A ∈ NL and M nondeterministically decides A in logarithmic space. Furthermore, we assume M has a unique accepting configuration for each input; this can be achieved by emptying the work tape and rewinding both tapes before accepting.

  • Given an input w of M, we construct (G, s, t) in logarithmic space, such that there is a path from s to t in G iff M accepts w.
SLIDE 42
  • Nodes of G will correspond to configurations of M on w; s is the initial configuration and t is the (unique) accepting configuration. Given configurations c1 and c2 of M, we add an edge (c1, c2) if c1 yields c2.

  • All we need to show now is that this can be done in logarithmic space, i.e., we need to give a log space transducer.

  • The nodes of G can be listed first: we simply list all possible strings that could be configurations of the machine in lexicographic order, and this can be done in O(log n) space, because the states of M contribute only a multiplicative constant, and each configuration is represented by combining the content of the work tape, which takes O(log n), and the positions of the heads, which also take O(log n)
SLIDE 43
  • For each such string, the transducer checks if it is a possible configuration of M, and if so, outputs it.

  • The transducer then generates the pairs of this list’s elements (still O(log n)) and checks that M has some transition that yields one from the other; it needs to scan the work tape and possibly use another O(log n) space to do the check. If such a transition is found, the transducer prints out the current pair.

  • Corollary: NL ⊆ P, because this construction is also polynomial time.
SLIDE 44

NL and CoNL

  • We do not know if NP is the same as coNP, because there is an inherent asymmetry in the definitions.

  • It seems such asymmetry is in NL as well, but it turns out that NL is closed under complement.

  • We can prove this by showing that the complement of PATH is in NL.

  • The proof is a bit hard, so we first take on a simpler problem which we can later use. We have the input (G, s, t, c),

SLIDE 45

where c is the number of nodes in G that are reachable from s.

  • We have a nondeterministic Turing machine M that one by one goes through all the m nodes of G and nondeterministically tries to guess if the nodes are reachable from s.

  • Guessing if u is reachable from s is done by first guessing a number l ≤ m and trying, nondeterministically, if there is a path of length at most l from s to u.

  • The branches that fail this check reject. Also, if a branch guesses that t is reachable, it also rejects. M counts the

SLIDE 46

number of nodes that have been verified to be reachable. If, after checking all nodes, the number of verified nodes is c, then M accepts, otherwise it rejects.

  • In other words: M accepts iff it can find c nodes that are reachable from s, such that t is not among them. If c is the number of all reachable nodes, then M decides the complement of PATH.

  • We still need to find c in O(log n) space, to solve the complement of PATH using M.

  • Let Ai be the set of nodes reachable from s by a path of length i or shorter. I.e., A0 = {s} and Ai+1 = Ai ∪ {v | (u, v) is an edge and u ∈ Ai}
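The recurrence can be checked with a small deterministic sketch (it stores Ai explicitly for clarity, which the log-space machine of the proof cannot afford; the machine keeps only the count ci and re-verifies membership by guessing paths):

```python
def reachable_count(edges, s):
    """Compute c_m = |A_m| via the recurrence A_{i+1} = A_i U N(A_i).

    `edges` maps each node to a list of its successors; after m
    iterations (m = number of nodes), A_m contains every node
    reachable from s.
    """
    m = len(edges)
    a = {s}                                   # A_0 = {s}
    for _ in range(m):
        a = a | {v for u in a for v in edges.get(u, ())}
    return len(a)
```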

SLIDE 47
  • Am then contains all the reachable nodes of G. Let ci = |Ai|; we need to calculate c = cm.

  • ci+1 is calculated from ci as follows: we iterate over all the nodes of G, decide for each whether it is in Ai+1, and count accordingly. But how do we do this in logarithmic space?
  • We go through a loop, where we, one by one, guess if a node is in Ai. We can verify this in logarithmic space by guessing a length at most i and guessing a path (step by step) of that length.

  • When we have verified that a node u is in Ai, then for each edge (u, v), v is in Ai+1. We count these v through the whole loop.
SLIDE 48
  • We also count the nodes u that we have been able to verify are in Ai. Now, we assumed we had already calculated ci, so if the latter count differs from ci, we reject. The computation can proceed only if we guessed correctly in each step; if we did, then the v counter is the correct value for ci+1, and we can proceed to calculate the next value.

  • Once we have cm, we simply run M on (G, s, t, cm), which accepts iff (G, s, t) is not in PATH