SLIDE 1
What, if anything, can be done in linear time?
Yuri Gurevich Tallinn, April 29, 2014
SLIDE 2 Agenda
- 1. What linear time? Why linear time?
- 2. Propositional primal infon logic
- 3. A linear time decision algorithm
- 4. Extensions with
  - 1. Disjunction
  - 2. Conjunctions as sets
  - 3. Transitivity
SLIDE 3
WHAT LINEAR TIME? WHY LINEAR TIME?
SLIDE 4 Why
- Big data.
- Remark. In many cases, big-data algorithms
are approximate and randomized, necessarily so.
SLIDE 5 What linear time?
We use the standard computation model of the analysis of algorithms.
- A longer answer, with examples and all,
follows.
SLIDE 6 Example 1: Sorting.
- A well-known lower bound is this:
Sorting n items requires Ω(n · log n) comparisons and thus Ω(n · log n) time.
- There is no way around the lower bound.
Or maybe there is?
SLIDE 7 An array A of length n
- Indices: 0, 1, …, n-1
- Values A[0], A[1], …, A[n-1]
SLIDE 8 Distinct natural numbers < n can be sorted in time O(n).
We illustrate this with n = 7 and A = ⟨A[0], A[1], A[2]⟩ = ⟨3,6,0⟩.
- 1. Create an auxiliary array B and zero it:
B = ⟨0,0,0,0,0,0,0⟩.
- 2. Traverse A; for each value a, set B[a] = 1.
B becomes ⟨1,0,0,1,0,0,1⟩.
- 3. Traverse B, outputting the indices with positive
values: ⟨0,3,6⟩.
We forgo interesting generalizations.
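The three steps above can be sketched in Python. This is only a minimal illustration, assuming the input really consists of distinct natural numbers below n; the function name `sort_distinct` is ours, not the talk's.

```python
def sort_distinct(values, n):
    """Sort distinct natural numbers < n in time O(n + len(values))."""
    marks = [0] * n              # step 1: auxiliary array, zeroed
    for v in values:             # step 2: mark every value that occurs
        marks[v] = 1
    # step 3: output the indices with positive marks, in increasing order
    return [i for i, m in enumerate(marks) if m > 0]


print(sort_distinct([3, 6, 0], 7))  # [0, 3, 6]
```

No comparisons between input items are made, which is why the comparison lower bound does not apply.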
SLIDE 9 The computation model
with registers of length O(log n).
– Only the initial polynomially many registers are used, with addresses of length O(log n).
– Relations =, ≥, ≤, and operations +, − are constant time.
- The model reflects the standard computer
architecture and the regular intuition of programmers.
SLIDE 10
Example 2: Tries
One application: lexical analyzers.
[Figure: a trie for the keys to, tea, ted, ten, A, inn]
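A trie over the keys shown on this slide can be sketched as nested dictionaries. This is an illustrative sketch only; the helper names `build_trie` and `contains` and the end-of-word marker `"$"` are our assumptions, not part of the talk.

```python
def build_trie(words):
    """Build a trie as nested dicts; the key "$" marks the end of a word."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})   # follow or create the edge ch
        node["$"] = True
    return root


def contains(trie, word):
    """Look a word up in time proportional to its length."""
    node = trie
    for ch in word:
        if ch not in node:
            return False
        node = node[ch]
    return "$" in node


trie = build_trie(["to", "tea", "ted", "ten", "A", "inn"])
print(contains(trie, "ten"), contains(trie, "te"))  # True False
```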
SLIDE 11 Example 3: Suffix arrays.
- Let s = a_0 … a_{n-1}. Each i < n is the key for
the suffix a_i … a_{n-1}.
- The suffix array for s is an array A of length n
where each A[i] is (the key of) the i-th
suffix of s in the lexicographical order.
- An amazing algorithm constructs the suffix array
in linear time.
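To make the definition concrete, here is a deliberately naive construction. It is not the linear-time algorithm mentioned above: it sorts the suffixes directly, which can take quadratic time or worse. The function name `suffix_array_naive` is our own.

```python
def suffix_array_naive(s):
    """Return the keys (start positions) of the suffixes of s in
    lexicographical order. Illustrates the definition only; this is
    NOT a linear-time construction."""
    return sorted(range(len(s)), key=lambda i: s[i:])


print(suffix_array_naive("banana"))  # [5, 3, 1, 0, 4, 2]
```

For "banana" the suffixes in order are a, ana, anana, banana, na, nana, whose keys are 5, 3, 1, 0, 4, 2.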
SLIDE 12 Parsing logic formulas
- Using the tools above + a deterministic
pushdown automaton, produce, in linear time, the parse tree of a given logic formula.
- The nodes and edges are decorated with useful
labels and pointers.
- Two nodes may represent different occurrences
of the same subformula; call them homonyms. All
pointers H(u) from any node u to its homonymy
original can be constructed in time O(n).
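The idea of pointing every node at a single original for its subformula can be sketched with a hash table. This is a stand-in technique (hash-consing, expected linear time), not the deterministic construction of the talk, which relies on the tools named earlier. The tuple encoding of formulas and the name `assign_originals` are our assumptions.

```python
def assign_originals(formula):
    """Walk a formula in post-order. Formulas are string atoms or tuples
    (op, left, right). Returns (nodes, original): nodes[i] describes
    node i, and original[i] is the index of the first node that carries
    the same subformula as node i."""
    nodes, original, seen = [], [], {}

    def visit(f):
        if isinstance(f, str):
            key = ("atom", f)
        else:
            op, left, right = f
            # children are keyed by their originals, so homonyms get equal keys
            key = (op, original[visit(left)], original[visit(right)])
        idx = len(nodes)
        nodes.append(key)
        original.append(seen.setdefault(key, idx))
        return idx

    visit(formula)
    return nodes, original


_, orig = assign_originals(("and", ("imp", "p", "q"), ("imp", "p", "q")))
print(orig)  # [0, 1, 2, 0, 1, 2, 6]
```

The recursion is fine for small examples; very deep formulas would need an explicit stack.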
SLIDE 13
PROPOSITIONAL PRIMAL INFON LOGIC
SLIDE 14 Motivation for primal logic
SLIDE 15 Why propositional?
π€1: π
1, π€2: π2, β¦
upon p(w_1, …) if α(…) actions
Meaning: If an arriving message fits the pattern p and if the condition α follows from your knowledge assertions, perform the actions.
- Often, by the time you arrive to check α, it is ground. The assertions
are typically not ground, but only a few particular ground instances are relevant.
SLIDE 16 Expository simplifications
- For expository reasons, we restrict attention
to the "topless" (without ⊤) fragment that is quote-free.
SLIDE 17
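The derivation rules
- ∧-elimination: from x ∧ y infer x; from x ∧ y infer y
- ∧-introduction: from x and y infer x ∧ y
- →-elimination: from x and x → y infer y
- →-introduction: from y infer x → y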
SLIDE 18 The subformula property
If α_1, …, α_ℓ is a shortest derivation of φ from H then every α_i is a subformula of H, φ.
- In the "quoteful" case, instead of subformulas
of a formula α, we have formulas local to α.
There are < |α| such local formulas.
SLIDE 19 An interpolation lemma of sorts
- Lemma. If H ⊢ φ then there is a set I of
subformulas of H that are also subformulas of φ, such that
- 1. the formulas of I are derivable from H, and
- 2. φ is derivable from I using only introduction
rules.
- We will not use the interpolation lemma but it
gives a useful optimization in the case where the hypotheses change rarely.
SLIDE 20 The multi-derivation problem
- Definition. Given sets H (hypotheses) and Q
(queries) of formulas, decide which queries follow from the hypotheses.
- Theorem. The multi-derivation problem for
propositional primal infon logic is solvable in linear time.
- We explain the main ideas.
- n is always the input size,
essentially |H| + |Q|.
SLIDE 21
A LINEAR TIME DECISION ALGORITHM FOR THE MULTI-DERIVATION PROBLEM
SLIDE 22
Approach: derive them all
Compute all subformulas of H, Q
derivable from the hypotheses H.
SLIDE 23 High-level algorithm
- Initially all subformulas of H, Q are raw,
except that the hypotheses are pending, and
there are no processed formulas.
- Pick the first pending formula α,
apply all possible inference rules to α, then mark α processed.
– In the process some raw formulas may become pending.
- Repeat until no formula is pending.
SLIDE 24 One easy case
- Apply the ∧-elimination rule: from x ∧ y infer x.
- In this case, α is a conjunction. If the first
conjunct of α is raw, mark it pending.
SLIDE 25 One harder case
- Apply the ∧-introduction rule: from x and y infer x ∧ y,
with α playing the role of x.
- All raw formulas of the form α ∧ y, where y is
pending or processed, should be marked pending.
- How do we find them? We don't have the
time to walk through the raw formulas.
SLIDE 26 Local search
- Every homonymy original node u is endowed
with four so-called use sets denoted (∧, l), (∧, r), (→, l), (→, r), computed as follows.
- Traverse the parse tree, in the depth-first way.
- If a homonymy original u is the left child of a
conjunction node w, put H(w) into the use set (∧, l) of u. If u is the right child of w, put H(w) into the use set (∧, r) instead.
SLIDE 27 Back to applying ∧-introduction
- Recall: we are looking for raw formulas of the
form α ∧ y where α is the first pending formula.
- Just walk through the use set (∧, l) of α.
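The derive-them-all loop with use sets can be sketched as follows for the topless, quote-free fragment with ∧ and →. This is a sketch under our own assumptions: formulas are string atoms or tuples ("and", x, y) / ("imp", x, y), subformulas are identified through a hash table (so the bound is expected linear time rather than the worst-case bound of the talk), and all names such as `derive_all` are hypothetical.

```python
from collections import deque

RAW, PENDING, PROCESSED = 0, 1, 2


def derive_all(hypotheses, queries):
    """Return the queries derivable from the hypotheses."""
    ids, kind, left, right = {}, [], [], []
    # use[op, side][u] lists the formulas whose `side` child under `op` is u
    use = {("and", "l"): [], ("and", "r"): [], ("imp", "l"): [], ("imp", "r"): []}

    def intern(f):
        """Give every distinct subformula one integer id (its original)."""
        if isinstance(f, str):
            key = ("atom", f, None)
        else:
            key = (f[0], intern(f[1]), intern(f[2]))
        if key not in ids:
            u = ids[key] = len(kind)
            kind.append(key[0]); left.append(key[1]); right.append(key[2])
            for sets in use.values():
                sets.append([])              # four empty use sets for u
            if key[0] != "atom":
                use[key[0], "l"][key[1]].append(u)
                use[key[0], "r"][key[2]].append(u)
        return ids[key]

    hyp_ids = [intern(h) for h in hypotheses]
    query_ids = [intern(q) for q in queries]
    status = [RAW] * len(kind)
    pending = deque()

    def derive(u):
        if status[u] == RAW:                 # formulas only move forward
            status[u] = PENDING
            pending.append(u)

    for h in hyp_ids:
        derive(h)
    while pending:
        a = pending.popleft()
        if kind[a] == "and":                 # and-elimination
            derive(left[a]); derive(right[a])
        if kind[a] == "imp" and status[left[a]] != RAW:
            derive(right[a])                 # imp-elimination, a is x -> y
        for w in use["and", "l"][a]:         # and-introduction, a on the left
            if status[right[w]] != RAW:
                derive(w)
        for w in use["and", "r"][a]:         # and-introduction, a on the right
            if status[left[w]] != RAW:
                derive(w)
        for w in use["imp", "l"][a]:         # imp-elimination, a is the premise x
            if status[w] != RAW:
                derive(right[w])
        for w in use["imp", "r"][a]:         # imp-introduction: from y infer x -> y
            derive(w)
        status[a] = PROCESSED
    return [q for q, u in zip(queries, query_ids) if status[u] != RAW]
```

Each formula is processed once and each use set is walked once, so the loop does work proportional to the number of distinct subformulas.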
SLIDE 28
EXTENSION 1: DISJUNCTIONS
SLIDE 29 Motivations
Recall the DKAL rule π€1: π
1, β¦ , π€π: π π
upon p(w_1, …) if α(…) actions
and suppose that α = β ∨ γ, e.g. passport(traveller,UK) ∨ passport(traveller,EU). There may be many such disjunctions. They may be eliminated, but they make rules much more succinct.
SLIDE 30
Add only introduction rules
- from x infer x ∨ y
- from y infer x ∨ y
The linear-time decision algorithm generalizes in a rather obvious way.
SLIDE 31
EXTENSION 2: CONJUNCTIONS (AND DISJUNCTIONS) AS SETS
SLIDE 32 Motivation
While x ∧ y entails y ∧ x,
- (x ∧ y) → z doesn't entail (y ∧ x) → z,
- z → (x ∧ y) doesn't entail z → (y ∧ x),
- ((x ∧ y) ∧ z) → w doesn't entail
(x ∧ (y ∧ z)) → w, etc.
SLIDE 33 The idea, a problem, and a solution
- View conjunctions as sets of conjuncts.
This repairs the missing entailments.
- But sets are not constructive objects.
- Represent sets as sequences by ordering the
conjuncts lexicographically.
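One way to picture "sets as ordered sequences" is a normal form that flattens nested conjunctions, drops repeated conjuncts, and sorts the rest. The sketch below is ours and uses a comparison sort, so it does not reproduce the time bound of the talk; the tuple encoding and the name `canonical` are assumptions, and ordering by `repr` is just one convenient total order.

```python
def canonical(f):
    """Normal form where every conjunction is ("and", c1, ..., ck) with
    its conjuncts flattened, deduplicated, and sorted. Input formulas
    are string atoms or binary tuples (op, x, y)."""
    if isinstance(f, str):
        return f
    op, x, y = f
    x, y = canonical(x), canonical(y)
    if op != "and":
        return (op, x, y)
    conjuncts = set()
    for part in (x, y):
        if isinstance(part, tuple) and part[0] == "and":
            conjuncts.update(part[1:])       # absorb a nested conjunction
        else:
            conjuncts.add(part)
    # a one-element set stays a unary conjunction here; whether to
    # collapse it is a design choice this sketch leaves open
    return ("and",) + tuple(sorted(conjuncts, key=repr))


a = canonical(("imp", ("and", ("and", "x", "y"), "z"), "w"))
b = canonical(("imp", ("and", "z", ("and", "y", "x")), "w"))
print(a == b, a)  # True ('imp', ('and', 'x', 'y', 'z'), 'w')
```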
SLIDE 34 The decision algorithm
- The resulting multi-derivability problem is
solvable in expected linear time.
- It is the algorithm that introduces
randomization. No probability distribution on
inputs is assumed.
SLIDE 35
EXTENSION 3: TRANSITIVE PRIMAL INFON LOGIC
SLIDE 36 Motivation
(x → y), (y → z) don't entail (x → z).
SLIDE 37 New axiom and rule
- In the quoteless case, transitive primal infon
logic is the extension of primal infon logic with an axiom x → x and the rule: from x → y and y → z infer x → z.
SLIDE 38
An alternative presentation of transitivity
Rule: from x_1 → x_2, x_2 → x_3, …, x_{k-1} → x_k infer x_1 → x_k.
Logically the alternative presentation is equivalent to the original one, but algorithmically it makes a lot of difference.
SLIDE 39 Multi-derivability
- The multi-derivability problem for transitive
primal infon logic is solvable in quadratic time.
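The chain form of the transitivity rule has a natural graph reading: treat every available implication x → y as an edge from x to y; then x_1 → x_k follows by the axiom and the chain rule alone exactly when x_k is reachable from x_1. The sketch below illustrates only this reading in isolation. It is not the quadratic decision algorithm, which must also interleave the other rules, and the name `chain_derivable` is ours.

```python
from collections import deque


def chain_derivable(implications, source, target):
    """Is source -> target obtainable from the given implications
    (pairs (x, y) standing for x -> y) using only the axiom x -> x
    and the chain rule? Breadth-first search over the edges."""
    succ = {}
    for x, y in implications:
        succ.setdefault(x, []).append(y)
    seen, queue = {source}, deque([source])
    while queue:
        x = queue.popleft()
        if x == target:
            return True
        for y in succ.get(x, []):
            if y not in seen:
                seen.add(y)
                queue.append(y)
    return False


print(chain_derivable([("a", "b"), ("b", "c")], "a", "c"))  # True
```

One search costs time linear in the number of implications; repeating it for many sources is one way a quadratic total can arise.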
SLIDE 40
THANK YOU
SLIDE 41
VAULT
SLIDE 42 High-level algorithm
Initially all local formulas are raw, except that hypotheses are pending. No formulas are processed.
- 1. Pick the first pending formula α,
- 2. apply all (applicable) inference rules R to α; if any of the conclusions are raw, make them pending.
- 3. mark α processed.
- 4. Repeat until no formula is pending.
- Pending and processed formulas have been derived.
- Formulas move only from raw to pending to processed.
SLIDE 43 One easy case
- α = β ∧ γ, and R is the rule: from x ∧ y infer x.
– If β is raw, mark it pending.
SLIDE 44 One harder case
- Apply the rule "from x and y infer x ∧ y" to α, with α being the left
premise.
– It will be convenient to abbreviate this sentence thus: apply R_l to α.
- All raw formulas α ∧ y, with y pending or
processed, should be marked pending. But how do we find them?
SLIDE 45 Succinct representation, 1
- Local formulas are objects too big to manipulate in
linear time. So we work with the parse tree of H, Q.
The subtree rooted at a node u of ParseTree(H, Q)
is the parse tree of some formula φ, the formula of u.
- Draft definition. If φ = Formula(u) then u represents
φ.
- But then φ may have many representations.
- Call nodes u, v homonyms if their formulas are
isomorphic.
SLIDE 46 Succinct representation, 2
- Lemma. There is a linear-time algorithm that
– chooses a homonymy leader in every homonymy class, and
– sets pointers H(u) from any node u to its homonymy leader.
- The algorithm uses suffix arrays.
- Def. If φ = Formula(u) then H(u) represents φ.
Further, H(u) = Node(φ).
SLIDE 47 The use sets US(R_l, u)
- Traverse the parse tree in the depth-first
manner. For every homonymy leader w with
Formula(w) = x ∧ y, put w into the use set US(R_l, H(w_l)).
– Here w_l is the left child of w.
– Notice that H(w_l) = Node(x).
– Notice that every Node(α ∧ y) occurs in US(R_l, Node(α)).
SLIDE 48 Applying R_l to α
- Walk through US(R_l, Node(α)) and mark
every raw w there pending.
- How do you find Node(α)?
That is how α is given in the first place.