Truly Subcubic Algorithms for Language Edit Distance and RNA Folding via Fast Bounded-Difference Min-Plus Product
Karl Bringmann, Fabrizio Grandoni, Barna Saha, Virginia Vassilevska Williams
June 11, 2017
Bounded Differences (BD) Matrices
An integer matrix $M$ has bounded differences (BD) if for all $i, j$:
$|M[i,j] - M[i,j+1]| \le 1$ and $|M[i,j] - M[i+1,j]| \le 1$
More generally: $M$ is $W$-BD when these differences are at most $W$
[Figure: example of an integer matrix with bounded differences]
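The definition can be checked directly. The following is a minimal sketch, assuming the matrix is given as a list of rows; the function name `is_bd` and the parameter `w` are illustrative and not taken from the slides.

```python
def is_bd(matrix, w=1):
    """Return True if horizontally and vertically adjacent entries differ by at most w."""
    rows = len(matrix)
    cols = len(matrix[0]) if rows else 0
    for i in range(rows):
        for j in range(cols):
            # compare with the right neighbour
            if j + 1 < cols and abs(matrix[i][j] - matrix[i][j + 1]) > w:
                return False
            # compare with the neighbour below
            if i + 1 < rows and abs(matrix[i][j] - matrix[i + 1][j]) > w:
                return False
    return True
```

With `w=1` this tests the plain BD property; larger `w` tests the $W$-BD generalization.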
(min,+) Product
For $n \times n$ matrices $A, B$, their (min,+) product $C = A \star B$ is defined by
$C[i,j] = \min_k (A[i,k] + B[k,j])$
(min,+) product is equivalent to All Pairs Shortest Paths [Fischer, Meyer '71]
trivial algorithm: $O(n^3)$
best known algorithm: $n^3 / 2^{\Omega(\sqrt{\log n})}$ [Williams '14]
Standard matrix multiplication:
$C[i,j] = \sum_k A[i,k] \cdot B[k,j]$
time $O(n^{\omega})$ where $\omega \le 2.373$
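For reference, the trivial cubic algorithm follows the definition literally. This is a minimal sketch for square matrices given as lists of rows; the name `min_plus_product` is illustrative.

```python
def min_plus_product(a, b):
    """Trivial O(n^3) (min,+) product of two n x n matrices."""
    n = len(a)
    return [[min(a[i][k] + b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]
```

Replacing `min` by a sum and `+` by `*` gives ordinary matrix multiplication, which is the operation that admits the faster $O(n^{\omega})$ algorithms.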
Big Open Problem: Is (min,+) product in time $O(n^{3-\varepsilon})$ for some $\varepsilon > 0$? Study special cases!
(min,+) Product for Structured Matrices
Matrices with small entries: If $A, B$ have entries in $\{-W, \dots, W\} \cup \{\infty\}$ then $A \star B$ can be computed in time $\tilde{O}(W n^{\omega})$ [Alon, Galil, Margalit '97]
Sketch:
$A'[i,k] = x^{A[i,k]}$, $B'[k,j] = x^{B[k,j]}$
$C'[i,j] = \sum_k A'[i,k] \cdot B'[k,j]$
$C[i,j] = \min_k (A[i,k] + B[k,j])$ = degree of the lowest-degree monomial in $C'[i,j]$
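The small-entries reduction can be illustrated with big integers standing in for polynomials. This is a sketch under stated assumptions, not the construction from the cited paper: the polynomial variable is evaluated at `base = n + 1` so that coefficients never carry, entries are shifted by `w` to make exponents nonnegative, and the ordinary product is computed naively (the speedup would come from plugging in fast matrix multiplication). The name `min_plus_small_entries` is hypothetical.

```python
INF = float("inf")

def min_plus_small_entries(a, b, w):
    """(min,+) product of n x n matrices with entries in {-w..w} or INF,
    obtained from an ordinary (+,*) product of encoded entries."""
    n = len(a)
    base = n + 1  # a coefficient counts at most n terms, so digits never carry

    def encode(v):
        # monomial x^(v + w) evaluated at x = base; INF becomes the zero polynomial
        return 0 if v == INF else base ** (v + w)

    ea = [[encode(v) for v in row] for row in a]
    eb = [[encode(v) for v in row] for row in b]
    c = [[INF] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            total = sum(ea[i][k] * eb[k][j] for k in range(n))  # ordinary product
            if total:
                degree = 0
                while total % base == 0:  # position of the lowest nonzero digit
                    total //= base
                    degree += 1
                c[i][j] = degree - 2 * w  # undo the shift applied to both factors
    return c
```

The encoded numbers have $O(W \log n)$ bits, which is where the factor $W$ in the running time comes from.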
Matrices with few distinct entries: If each row of $A$ has a small number of distinct entries, then for arbitrary $B$ we can compute $A \star B$ in truly subcubic time [Yuster '09]
Question: Is (min,+) product in time $O(n^{3-\varepsilon})$ for BD matrices?
Why care about BD matrices?
1st Application: Language Edit Distance (LED)
CFG Parsing: Given a context-free grammar $G$ and a string $s$ of length $n$, is $s$ in $L(G)$?
... is in time $\tilde{O}(n^{\omega})$ [L. Valiant '75]
for simplicity: $|G| = O(1)$
Language Edit Distance: "error-correcting CFG parsing". Given a CFG $G$ and a string $s$, compute the minimum edit distance (insertions, deletions, substitutions) of $s$ to any string in $L(G)$
... is in time $O(n^3)$ [Aho, Peterson '72]
We show, using Valiant's approach: If (min,+) product on BD matrices is in time $O(n^c)$, then LED is in time $\tilde{O}(n^c)$ (~8 page proof)
Intuitive reason for BD: LED($s$) and LED($sx$) differ by $\le 1$ for any symbol $x$
2nd Application: RNA Folding
RNA can be seen as a sequence of symbols from {A,C,G,U}
Biologists want to predict the secondary structure of RNA: A can pair with U, and C can pair with G
Given an RNA sequence, find the largest set of matching pairs such that no two pairs intersect
[Figure: two pairings of AUUGCAG; the one with crossing pairs is not allowed, the one without crossings is okay]
Disclaimer: No author of this paper is a biologist.
... is in time $O(n^3)$ [Nussinov, Jacobson '80]
... can be cast as a LED problem (without substitutions)
If (min,+) product on BD matrices is in time $O(n^c)$, then RNA Folding is in time $\tilde{O}(n^c)$
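The cubic baseline is a standard interval dynamic program. The sketch below follows the simplified model stated on this slide (A pairs with U, C pairs with G, no two pairs cross) and ignores further biological constraints; the name `rna_folding` is illustrative.

```python
def rna_folding(seq):
    """Maximum number of non-crossing A-U / C-G pairs, by cubic dynamic programming."""
    pairs = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}
    n = len(seq)
    # best[i][j] = optimum for the substring seq[i:j] (half-open interval)
    best = [[0] * (n + 1) for _ in range(n + 1)]
    for length in range(2, n + 1):
        for i in range(0, n - length + 1):
            j = i + length
            value = best[i + 1][j]  # seq[i] stays unpaired
            for k in range(i + 1, j):  # seq[i] pairs with seq[k]
                if (seq[i], seq[k]) in pairs:
                    value = max(value, 1 + best[i + 1][k] + best[k + 1][j])
            best[i][j] = value
    return best[0][n]
```

For the example string on the slide, `rna_folding("AUUGCAG")` returns 3, using the pairs at positions (0,1), (2,5) and (3,4).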
3rd Application: Optimal Stack Generation
Optimal Stack Generation: Given a string $s$ over alphabet $\Sigma$, determine the shortest sequence of stack operations push(.), emit, pop such that performing these operations starting from an empty stack will emit $s$ and end with an empty stack
for simplicity: $|\Sigma| = O(1)$
... is in time $O(n^3)$ (dynamic programming) [Tarjan '05]
Example: $s$ = bab: push(b) emit push(a) emit pop emit pop
[Figure: stack contents after each operation of the example]
We show: If (min,+) product on O(1)-BD matrices is in time $O(n^c)$, then Optimal Stack Generation is in time $\tilde{O}(n^c)$
Intuitive reason for BD: OSG($s$) and OSG($sx$) differ by $\le 3$ for any $x \in \Sigma$
Main Result
... so we have seen that (min,+) product of BD matrices is well motivated
Main Result: We can compute the (min,+) product of BD matrices in randomized time $O(n^{2.83})$ and deterministic time $O(n^{2.87})$
Generalization: For a $W$-BD matrix $A$ with $W \ll n^{3-\omega} \approx n^{0.626}$ and arbitrary $B$ we can compute their (min,+) product in randomized truly subcubic time
here: $O(n^{2.9})$
Algorithm Sketch
Input: BD matrices $A, B$. Want: $C[i,j] = \min_k (A[i,k] + B[k,j])$
1) Compute approximation $D[i,j] = C[i,j] \pm O(n^{0.2})$
If $A, B$ are BD, then their (min,+) product is also BD
compute $C[i,j]$ exactly for all $i, j$ that are multiples of $n^{0.2}$
set $D[i,j]$ to some $C[i',j']$ by rounding $i, j$ to such multiples $(i',j')$
time $O(n^{2.6})$
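Step 1 can be sketched as follows. This is a minimal illustration, assuming indices are rounded down to the nearest multiple of a grid width `step` (the slides only say "by rounding"); the names `min_plus_entry` and `approximate_product` are hypothetical.

```python
def min_plus_entry(a, b, i, j):
    """One exact entry of the (min,+) product, in O(n) time."""
    return min(a[i][k] + b[k][j] for k in range(len(b)))

def approximate_product(a, b, step):
    """Approximate the (min,+) product of BD matrices from exact values on a grid."""
    n = len(a)
    grid = range(0, n, step)
    exact = {(i, j): min_plus_entry(a, b, i, j) for i in grid for j in grid}
    # every (i, j) inherits the value of the grid point obtained by rounding down
    return [[exact[(i - i % step, j - j % step)] for j in range(n)]
            for i in range(n)]
```

Because the product of BD matrices is again BD, each entry of the result is off by at most `2 * (step - 1)`. With `step` about $n^{0.2}$ there are $n^{1.6}$ grid points at $O(n)$ time each, which matches the $O(n^{2.6})$ bound.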
$A[i,k] + B[k,j] = C[i,j]$ implies $|A[i,k] + B[k,j] - D[i,j]| \le O(n^{0.2})$
call these triples $(i,k,j)$ relevant
then $C[i,j] = \min_{k : (i,k,j) \text{ relevant}} (A[i,k] + B[k,j])$
2) Cover most relevant triples: fix $i^r, j^r$, and define matrices $A^r, B^r$
$A^r[i,k] := (A[i,k] + B[k,j^r] - D[i,j^r]) - (A[i^r,k] + B[k,j^r] - D[i^r,j^r])$
$B^r[k,j] := A[i^r,k] + B[k,j] - D[i^r,j]$
(min,+) product $C^r$ of $A^r, B^r$:
$C^r[i,j] = \min_k (A^r[i,k] + B^r[k,j]) = C[i,j] - D[i,j^r] + D[i^r,j^r] - D[i^r,j]$
(the $D$ terms can be cancelled afterwards)
$(i,k,j)$ relevant: $|A[i,k] + B[k,j] - D[i,j]| \le O(n^{0.2})$
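The cancellation identity of step 2 holds for any choice of the pivot indices and for any approximation matrix, which makes it easy to check numerically. The sketch below builds the two shifted matrices without the thresholding that the algorithm applies afterwards; the names `min_plus`, `shifted_matrices`, `ir`, `jr` and `d` are illustrative.

```python
def min_plus(a, b):
    """Trivial (min,+) product of two n x n matrices."""
    n = len(a)
    return [[min(a[i][k] + b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def shifted_matrices(a, b, d, ir, jr):
    """Shifted matrices of one round for pivot row ir and pivot column jr."""
    n = len(a)
    a_r = [[(a[i][k] + b[k][jr] - d[i][jr]) - (a[ir][k] + b[k][jr] - d[ir][jr])
            for k in range(n)] for i in range(n)]
    b_r = [[a[ir][k] + b[k][j] - d[ir][j] for j in range(n)] for k in range(n)]
    return a_r, b_r
```

Adding `d[i][jr] - d[ir][jr] + d[ir][j]` to entry `(i, j)` of `min_plus(a_r, b_r)` recovers the true product entry, because the shift does not depend on `k`.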
if $(i,k,j^r)$, $(i^r,k,j^r)$, $(i^r,k,j)$ are all relevant, then $A^r[i,k], B^r[k,j] = O(n^{0.2})$
set all entries of $A^r, B^r$ whose absolute value exceeds this $O(n^{0.2})$ bound to $\infty$
then (min,+) product of $A^r$ and $B^r$ can be computed in time $\tilde{O}(n^{\omega+0.2})$
$(i,k,j)$ is "covered" if $A^r[i,k]$ and $B^r[k,j]$ are $O(n^{0.2})$, i.e., not set to $\infty$
initialize $\hat{C}[i,j] := \infty$
repeat for $O(n^{0.3} \log n)$ rounds:
- pick $i^r, j^r$ randomly
- build $A^r, B^r$ from $i^r, j^r$ and set all entries whose absolute value exceeds the $O(n^{0.2})$ bound to $\infty$
- compute (min,+) product $C^r = A^r \star B^r$
- $\hat{C}[i,j] := \min\{\hat{C}[i,j],\; C^r[i,j] + D[i,j^r] - D[i^r,j^r] + D[i^r,j]\}$
$(i,k,j)$ is "covered" if $A^r[i,k]$ and $B^r[k,j]$ are $O(n^{0.2})$ in some round
Lem: After $O(n^{\rho} \log n)$ rounds there are $O(n^{3-\rho/3} + n^{2.5})$ uncovered relevant triples w.h.p.; for $\rho = 0.3$ this is $O(n^{2.9})$
time per round: $\tilde{O}(n^{\omega+0.2}) = O(n^{2.6})$
$\tilde{O}(n^{0.3})$ iterations, total time $\tilde{O}(n^{2.9})$
3) Enumerate uncovered relevant triples:
"for each uncovered relevant $(i,k,j)$:" $\hat{C}[i,j] := \min\{\hat{C}[i,j],\; A[i,k] + B[k,j]\}$
Enumeration in blocks:
for all $i', k', j'$ divisible by $n^{0.2}$:
  if $(i',k',j')$ is relevant and uncovered:
    for all $i' - n^{0.2} < i \le i'$, $k' - n^{0.2} < k \le k'$, $j' - n^{0.2} < j \le j'$:
      $\hat{C}[i,j] := \min\{\hat{C}[i,j],\; A[i,k] + B[k,j]\}$
within a block either all or none of the triples are relevant and uncovered
outer loop: $O(n^{2.4})$ iterations, each taking time $O(n^{0.3} \log n)$
inner loops: total time $O(n^{2.9})$ = number of relevant uncovered triples
total time of algorithm $\tilde{O}(n^{2.9})$
now $\hat{C}$ is correct output
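The total running time can be tallied phase by phase. The following worked calculation uses only the bounds stated on the slides, with grid width $n^{0.2}$ (so $n^{0.8}$ grid points per dimension) and $\omega \le 2.373$; the labels $T_1, T_2, T_3$ are introduced here for illustration.

```latex
\begin{align*}
T_1 &= (n^{0.8})^2 \cdot O(n) = O(n^{2.6}) \\
T_2 &= O(n^{0.3} \log n) \cdot \tilde{O}(n^{\omega + 0.2})
     = \tilde{O}(n^{\omega + 0.5}) \le \tilde{O}(n^{2.9}) \\
T_3 &= (n^{0.8})^3 \cdot O(n^{0.3} \log n) + O(n^{3 - 0.3/3} + n^{2.5})
     = \tilde{O}(n^{2.7}) + O(n^{2.9}) \\
T_1 + T_2 + T_3 &= \tilde{O}(n^{2.9})
\end{align*}
```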
Correctness
$(i,k,j)$ relevant: $|A[i,k] + B[k,j] - D[i,j]| \le O(n^{0.2})$
$(i,k,j)$ is "covered" if $A^r[i,k]$ and $B^r[k,j]$ are $O(n^{0.2})$ in some round
Lem: After $O(n^{\rho} \log n)$ rounds there are $O(n^{3-\rho/3} + n^{2.5})$ uncovered relevant triples w.h.p.
If $(i,k,j^r)$, $(i^r,k,j^r)$, $(i^r,k,j)$ are all relevant, then $(i,k,j)$ is covered
When are $(i,k,j^r)$, $(i^r,k,j^r)$, $(i^r,k,j)$ all relevant?
bipartite graph $G_k$: $\{i, j\}$ is an edge if $(i,k,j)$ is relevant
we "cover" a relevant triple $(i,k,j)$ if $i, j, i^r, j^r$ form a 4-cycle in $G_k$
have: many relevant triples
need: $G_k$ contains many 4-cycles
Lem: Any bipartite graph with $m \ge 4n^{1.5}$ edges contains $\Omega(m^4/n^4)$ 4-cycles.
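The quantity in the 4-cycle lemma can be computed exactly on small examples. This sketch counts 4-cycles in a bipartite graph by summing, over pairs of left vertices, the number of pairs of common neighbors; the adjacency representation and the name `count_four_cycles` are assumptions made for illustration.

```python
from itertools import combinations

def count_four_cycles(adj):
    """Count 4-cycles in a bipartite graph.

    adj[u] is the set of right-side neighbors of left vertex u.
    """
    total = 0
    for u, v in combinations(range(len(adj)), 2):
        common = len(adj[u] & adj[v])
        # every pair of common neighbors closes one 4-cycle through u and v
        total += common * (common - 1) // 2
    return total
```

In the graph $G_k$, a left vertex is a row index $i$ and its neighbors are the column indices $j$ for which $(i,k,j)$ is relevant.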
Algorithm Recap
Input: BD matrices $A, B$. Want: $C[i,j] = \min_k (A[i,k] + B[k,j])$
$(i,k,j)$ relevant: $|A[i,k] + B[k,j] - D[i,j]| \le O(n^{0.2})$
$(i,k,j)$ is "covered" if $A^r[i,k]$ and $B^r[k,j]$ are $O(n^{0.2})$ in some round
1) Compute approximation $D[i,j] = C[i,j] \pm O(n^{0.2})$:
compute $C[i,j]$ exactly for all $i, j$ that are multiples of $n^{0.2}$
set $D[i,j]$ to some $C[i',j']$ by rounding $i, j$
2) Cover most relevant triples:
initialize $\hat{C}[i,j] := \infty$
repeat for $O(n^{0.3} \log n)$ rounds:
  pick $i^r, j^r$ randomly
  $A^r[i,k] := (A[i,k] + B[k,j^r] - D[i,j^r]) - (A[i^r,k] + B[k,j^r] - D[i^r,j^r])$
  $B^r[k,j] := A[i^r,k] + B[k,j] - D[i^r,j]$
  set all entries of $A^r, B^r$ whose absolute value exceeds the $O(n^{0.2})$ bound to $\infty$
  compute (min,+) product $C^r = A^r \star B^r$
  $\hat{C}[i,j] := \min\{\hat{C}[i,j],\; C^r[i,j] + D[i,j^r] - D[i^r,j^r] + D[i^r,j]\}$
3) Enumerate uncovered relevant triples:
for all $i', k', j'$ divisible by $n^{0.2}$:
  if $(i',k',j')$ is relevant and uncovered:
    for all $i' - n^{0.2} < i \le i'$, $k' - n^{0.2} < k \le k'$, $j' - n^{0.2} < j \le j'$:
      $\hat{C}[i,j] := \min\{\hat{C}[i,j],\; A[i,k] + B[k,j]\}$
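The three phases of the recap can be put together in a small program that checks the logic, not the running time. This is a sketch under explicit assumptions: the products inside each round are computed naively where the algorithm would use fast matrix multiplication, phase 3 tests triples one by one where the slides enumerate blocks, and the concrete thresholds (`bound = 2 * step` and its double) are choices made here to stand in for the $O(n^{0.2})$ bounds. All names are illustrative.

```python
import random

INF = float("inf")

def bd_min_plus(a, b, step, rounds, seed=0):
    """Exact (min,+) product of BD matrices a, b following the three phases."""
    n = len(a)
    rng = random.Random(seed)
    bound = 2 * step  # assumed relevance threshold; |C - D| <= 2 * (step - 1)

    # Phase 1: exact values on a grid, copied to every (i, j) by rounding down.
    def exact(i, j):
        return min(a[i][k] + b[k][j] for k in range(n))
    grid = {(i, j): exact(i, j) for i in range(0, n, step) for j in range(0, n, step)}
    d = [[grid[(i - i % step, j - j % step)] for j in range(n)] for i in range(n)]

    def slack(i, k, j):
        return a[i][k] + b[k][j] - d[i][j]

    # Phase 2: random rounds with thresholded shifted matrices.
    c_hat = [[INF] * n for _ in range(n)]
    masks = []  # per round: which entries of a_r and b_r were kept
    for _ in range(rounds):
        ir, jr = rng.randrange(n), rng.randrange(n)
        a_r = [[slack(i, k, jr) - slack(ir, k, jr) for k in range(n)] for i in range(n)]
        b_r = [[slack(ir, k, j) for j in range(n)] for k in range(n)]
        a_ok = [[abs(v) <= 2 * bound for v in row] for row in a_r]
        b_ok = [[abs(v) <= bound for v in row] for row in b_r]
        masks.append((a_ok, b_ok))
        for i in range(n):
            for j in range(n):
                best = min((a_r[i][k] + b_r[k][j] for k in range(n)
                            if a_ok[i][k] and b_ok[k][j]), default=INF)
                if best < INF:
                    c_hat[i][j] = min(c_hat[i][j],
                                      best + d[i][jr] - d[ir][jr] + d[ir][j])

    # Phase 3: relevant triples that no round covered are handled directly.
    for i in range(n):
        for k in range(n):
            for j in range(n):
                if abs(slack(i, k, j)) > bound:
                    continue  # not relevant, so it cannot attain the minimum
                if any(a_ok[i][k] and b_ok[k][j] for a_ok, b_ok in masks):
                    continue  # covered in some round
                c_hat[i][j] = min(c_hat[i][j], a[i][k] + b[k][j])
    return c_hat
```

The output is exact for any number of rounds, including zero: every triple attaining the minimum is relevant, and a relevant triple is either covered in phase 2 or picked up in phase 3. The number of rounds only shifts work between the two phases.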
Conclusion
(min,+) product of BD matrices can be solved in randomized time $O(n^{2.83})$
we generalize the subcubic special cases of (min,+) matrix multiplication
this yields subcubic $O(n^{2.83})$ algorithms for:
- Language Edit Distance, a classic parsing problem from '72
- RNA Folding, a classic bioinformatics problem from '80
- Optimal Stack Generation, an open problem by Tarjan
Open Problems:
Conditional lower bounds imply that LED and RNA Folding are in $\Omega(n^{\omega})$ [Abboud, Backurs, V-Williams '15]
1) What is the right exponent?
2) Find more applications of BD (min,+) product