On NP-Hardness of the Paired de Bruijn Sound Cycle Problem


SLIDE 1

On NP-Hardness of the Paired de Bruijn Sound Cycle Problem

Evgeny Kapun, Fedor Tsarev
Genome Assembly Algorithms Laboratory
University ITMO, St. Petersburg, Russia
September 2, 2013

SLIDE 2

Genome assembly models

◮ Shortest common superstring: NP-hard.
◮ Shortest common superwalk in a de Bruijn graph: NP-hard.
◮ Superwalk in a de Bruijn graph with known edge multiplicities: NP-hard.
◮ Path in a paired de Bruijn graph?

SLIDE 3

Paired de Bruijn graph

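[Figure: example paired de Bruijn graph with vertex labels of length 2.]

Vertices (pairs of labels): (TT, CT), (TA, TT), (AG, TC), (GC, CT), (CA, TT), (AT, TC)

Edges (pairs of labels): (TTA, CTT), (TAG, TTC), (AGC, TCT), (GCA, CTT), (CAT, TTC), (ATT, TCT), (TAT, TTC), (CAG, TTC)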
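The labels on this slide can be read as a small graph. The sketch below is illustrative only: it assumes the labels group into pairs (first label, second label), that each edge is a pair of 3-character strings, and that an edge runs from the pair of its 2-character prefixes to the pair of its 2-character suffixes. The function name `build_paired_graph` is hypothetical.

```python
def build_paired_graph(edge_labels):
    """Map each vertex (pair of k-mers) to the list of its successor vertices.

    Assumption: an edge labelled (first, second) goes from the vertex
    (first[:-1], second[:-1]) to the vertex (first[1:], second[1:]).
    """
    graph = {}
    for first, second in edge_labels:
        source = (first[:-1], second[:-1])
        target = (first[1:], second[1:])
        graph.setdefault(source, []).append(target)
        graph.setdefault(target, [])  # make sure sinks also appear as keys
    return graph


# Edge labels as read off the slide, grouped into pairs.
SLIDE_EDGES = [
    ("TTA", "CTT"), ("TAG", "TTC"), ("AGC", "TCT"), ("GCA", "CTT"),
    ("CAT", "TTC"), ("ATT", "TCT"), ("TAT", "TTC"), ("CAG", "TTC"),
]

slide_graph = build_paired_graph(SLIDE_EDGES)
```

Under this reading, the eight edges induce exactly the six vertex pairs listed on the slide, which supports the pairing assumption.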

SLIDE 4

Sound path

Sound path: strings match with shift d = 6.

TAGCTCACCCGTTGGT
      ACCCGTTGGTAATTGC

Sound cycle: cyclic strings match with shift d = 6.

TGATAAGTAGGCTAAG
GTAGGCTAAGTGATAA
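The two matching conditions can be checked mechanically. The sketch below assumes the convention suggested by the slide's examples: character i of the second string equals character i + d of the first string (with indices taken cyclically for the cycle case). The function names are hypothetical.

```python
def is_sound_path(first: str, second: str, d: int) -> bool:
    """Linear case: second[i] must equal first[i + d] wherever both exist."""
    if len(first) != len(second) or not 0 <= d <= len(first):
        return False
    return first[d:] == second[:len(second) - d]


def is_sound_cycle(first: str, second: str, d: int) -> bool:
    """Cyclic case: second must equal first rotated left by d positions."""
    n = len(first)
    if n == 0 or n != len(second):
        return False
    shift = d % n
    return first[shift:] + first[:shift] == second


# The two examples from the slide, both with shift d = 6.
path_ok = is_sound_path("TAGCTCACCCGTTGGT", "ACCCGTTGGTAATTGC", 6)
cycle_ok = is_sound_cycle("TGATAAGTAGGCTAAG", "GTAGGCTAAGTGATAA", 6)
```

Both slide examples satisfy these checks with d = 6, and the cyclic check only depends on d modulo the cycle length.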

SLIDE 5

Paired de Bruijn Sound Cycle Problem

Given a paired de Bruijn graph G and an integer d (represented in unary coding), decide whether G has a sound cycle with respect to shift d.

SLIDE 6

Paired de Bruijn Covering Sound Cycle Problem

Given a paired de Bruijn graph G and an integer d (represented in unary coding), decide whether G has a covering sound cycle with respect to shift d.

SLIDE 7

Parameters

◮ |Σ|: size of the alphabet.
◮ k: length of vertex labels.
◮ |V|, |E|: size of the graph (bounded in terms of |Σ| and k).
◮ d: shift distance.
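One way to make the bound on the graph size explicit, assuming that each vertex is a pair of strings of length k, each edge is a pair of strings of length k + 1, and no two vertices or edges carry the same pair of labels:

```latex
|V| \le |\Sigma|^{2k}, \qquad |E| \le |\Sigma|^{2(k+1)}
```

For k = 0 this gives at most one vertex and at most |Σ|^2 edges, and for |Σ| = 1 it gives at most one vertex and at most one edge.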

SLIDE 8

Simple cases

◮ |Σ| = 1: at most one vertex, at most one edge.
◮ k = 0: at most one vertex, at most |Σ|^2 edges; reduces to the problem of computing strongly connected components.
◮ d is fixed: find a cycle in a graph of |V||Σ|^d states.
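The fixed-d case can be sketched as a search in a product graph whose states pair a vertex with the last d characters read from the second labels, giving |V||Σ|^d states. This is a minimal sketch under stated assumptions, not the authors' algorithm: each edge contributes the last character of each of its two labels, character i of the second string must equal character i + d of the first, and a cycle may repeat vertices and edges. The name `has_sound_cycle` and the toy input are hypothetical.

```python
from itertools import product


def has_sound_cycle(edge_labels, alphabet, d):
    """Return True if the product graph on (vertex, last d second-characters) has a cycle."""
    out_edges = {}
    vertices = set()
    for first, second in edge_labels:
        source = (first[:-1], second[:-1])
        target = (first[1:], second[1:])
        vertices.update((source, target))
        out_edges.setdefault(source, []).append((target, first[-1], second[-1]))

    def successors(state):
        vertex, pending = state
        for target, a, b in out_edges.get(vertex, []):
            window = pending + (b,)
            # The first-string character must equal the second-string
            # character emitted d steps earlier (for d = 0, the current one).
            if window[0] == a:
                yield (target, window[1:])

    # Repeatedly delete states with no surviving successor; a finite
    # directed graph has a cycle exactly when some state survives.
    alive = {(v, buf) for v in vertices for buf in product(alphabet, repeat=d)}
    changed = True
    while changed:
        changed = False
        for state in list(alive):
            if not any(nxt in alive for nxt in successors(state)):
                alive.discard(state)
                changed = True
    return bool(alive)


# Toy instance with k = 1: a single 4-cycle spelling AABB / ABBA cyclically.
TOY_EDGES = [("AA", "AB"), ("AB", "BB"), ("BB", "BA"), ("BA", "AA")]
toy_result = has_sound_cycle(TOY_EDGES, "AB", 1)
```

The pruning loop is chosen for brevity; a depth-first search or a strongly connected components computation over the same state space would serve equally well and run faster.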

SLIDE 9

Interesting case: k = 1

With fixed k = 1, the problem is NP-hard. Proof outline:

1. Reduce the Hamiltonian Cycle Problem, which is NP-hard, to an intermediate problem.
2. Reduce that problem to the Paired de Bruijn (Covering) Sound Cycle Problem.

SLIDE 10

The intermediate problem

Given an undirected graph,

◮ if it contains a Hamiltonian cycle, output 1.
◮ if it doesn't contain Hamiltonian paths, output 0.
◮ otherwise, invoke undefined behavior.
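The three cases can be illustrated with a brute-force reference that tries every vertex ordering. It is exponential and meant only to pin down the specification on tiny graphs; the function name `promise_answer` and the use of `None` to stand for the undefined case are assumptions made for this sketch.

```python
from itertools import permutations


def promise_answer(n, edges):
    """Return 1, 0, or None (undefined) for an undirected graph on vertices 0..n-1."""
    adjacent = {frozenset(edge) for edge in edges}
    has_path = False
    for order in permutations(range(n)):
        if all(frozenset((order[i], order[i + 1])) in adjacent for i in range(n - 1)):
            has_path = True
            # A Hamiltonian path closes into a cycle if its ends are adjacent.
            if n >= 3 and frozenset((order[-1], order[0])) in adjacent:
                return 1
    # A Hamiltonian path without any Hamiltonian cycle: the answer is unconstrained.
    return None if has_path else 0


triangle = promise_answer(3, [(0, 1), (1, 2), (2, 0)])
star = promise_answer(4, [(0, 1), (0, 2), (0, 3)])
path = promise_answer(3, [(0, 1), (1, 2)])
```

The triangle has a Hamiltonian cycle, the star has no Hamiltonian path, and the three-vertex path falls into the undefined case.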

SLIDE 11

Solution, step 1

[Figure: construction involving graphs G1 and G2, with labels a1, a2, a3 and b1, b2, b3.]

SLIDE 12

Properties of a graph with a hamiltonian cycle

Such a graph contains Hamiltonian paths

◮ ending at any vertex.
◮ passing through any edge.
◮ for any edge {i, j} and vertex k ≠ i, j, passing through {i, j} such that j is between i and k on the path.

SLIDE 13

Solution, step 2

[Figure: blocks V_{2n+2}, V_1, V_2 labelled with s1, s2, s3, above two strings of the form "... s1 ... s2 ..." and "... s2 ... s3 ...".]

SLIDE 14

Solution, step 2

In V_1, V_{2n+2}: [figure: i j → i j]

SLIDE 15

Solution, step 2

In V_3, ..., V_{2n+1}: [figure: i]

SLIDE 16

How to make a covering cycle

Pass along the loop multiple times, covering more and more edges with each iteration.

SLIDE 17

Interesting case: |Σ| = 2

With fixed |Σ| = 2, the problem is NP-hard. The proof is done by reduction from the case k = 1. The characters are replaced with binary sequences, and a transformation is done to avoid undesired overlaps.

SLIDE 18

Fix both k and |Σ|

If both k and |Σ| are fixed, the number of different paired de Bruijn graphs is finite. The only argument that can take infinitely many values is d, an integer represented in unary coding. As a result, the number of valid problem instances of any fixed length is bounded, so the language defined by the problem is sparse. By Mahaney's theorem, a sparse language cannot be NP-hard under polynomial-time many-one reductions unless P=NP. Therefore, the problem is not NP-hard unless P=NP.
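The sparseness claim can be spelled out as a counting bound. The estimate below is a rough sketch that assumes each graph is determined by its set of edges, that each graph has one fixed encoding, and that d written in unary takes at most n symbols in an instance of length n:

```latex
\#\{\text{instances of length} \le n\} \;\le\; 2^{|\Sigma|^{2(k+1)}} \cdot (n + 1)
```

The first factor is a constant once k and |Σ| are fixed, so the count grows at most linearly in n, which is polynomial as sparseness requires.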

SLIDE 19

Results

Paired de Bruijn (Covering) Sound Cycle Problem is

◮ NP-hard for any fixed k ≥ 1 (can be reduced from k = 1).
◮ NP-hard for any fixed |Σ| ≥ 2 (trivially reduced from |Σ| = 2).
◮ NP-hard in the general case.

SLIDE 20

Results

Paired de Bruijn (Covering) Sound Cycle Problem is

◮ Not NP-hard if both k and |Σ| are fixed, unless P=NP.
◮ Solvable in polynomial time if k = 0, |Σ| = 1, or d is fixed.

SLIDE 21

Thank you!