
slide-1
SLIDE 1

Searching for Connected/Functional Motifs in Biological Networks

Stéphane Vialette

LIGM, Université Paris-Est Marne-la-Vallée, France

ENS - 7 September 2010

slide-2
SLIDE 2

Networks in Biology

Our environment is a combination of tightly interlinked complex systems at various levels of magnitude.

◮ Gene expression in cells: Gene regulation networks.
◮ Large-scale approach: Protein interaction networks.
◮ Metabolites and enzymes: Metabolic networks.
◮ Evolutionary relationships between organisms: Phylogenetic networks.
◮ Collecting high-throughput data: Correlation networks.
◮ . . .

slide-3
SLIDE 3

Protein-Protein Interaction (PPI)

PPI networks

◮ Proteins are vertices.
◮ Interactions are (weighted) edges.


slide-5
SLIDE 5

Gene or PPI databases

BioGRID - A Database of Genetic and Physical Interactions

DIP - Database of Interacting Proteins

MINT - A Molecular Interactions Database

IntAct - EMBL-EBI Protein Interaction

MIPS - Comprehensive Yeast Protein-Protein interactions

Yeast Protein Interactions - Yeast two-hybrid results from Fields’ group

PathCalling - A yeast protein interaction database by Curagen

SPiD - Bacillus subtilis Protein Interaction Database

AllFuse - Functional Associations of Proteins in Complete Genomes

BRITE - Biomolecular Relations in Information Transmission and Expression

ProMesh - A Protein-Protein Interaction Database

The PIM Database - by Hybrigenics

Mouse Protein-Protein interactions

Human herpesvirus 1 Protein-Protein interactions

Human Protein Reference Database

BOND - The Biomolecular Object Network Databank. Former BIND

MDSP - Systematic identification of protein complexes in Saccharomyces cerevisiae by mass spectrometry

Protcom - Database of protein-protein complexes enriched with the domain-domain structures

Proteins that interact with GroEL and factors that affect their release

YPDTM - Yeast Proteome Database by Incyte

. . .

slide-6
SLIDE 6

Network querying

Definition
Given a small network (corresponding to a known pathway or a complex of interest), the network querying problem is to identify similar instances in a large target network.

Remarks

◮ Similarity is usually measured in terms of sequence and interaction patterns.
◮ Approximate occurrences: insertions and deletions.
◮ Topology-based approach.

slide-7
SLIDE 7

Topology-based approach

PathBlast (http://www.pathblast.org)

A server for querying linear pathways within PPI networks (UC San Diego, UC Berkeley, Tel Aviv University, Whitehead Institute).

slide-8
SLIDE 8

Topology-based approach

NetMatch (http://baderlab.org/Software/NetMatch)

A Cytoscape plugin to query networks for patterns [FERRO et al., 08].

slide-9
SLIDE 9

Topology-based approach

Netgrep (http://genomics.princeton.edu/netgrep)

Fast network schema searches in interactomes [BANKS, NABIEVA, PETERSON, AND SINGH, 08].


slide-11
SLIDE 11

From topology-based to topology-free motifs

Views

Roughly speaking, there are now two views of graph (or network) motifs:

◮ The older is the topological view, where one basically ends up with certain subgraph isomorphism problems.
◮ The more recent view on graph motifs takes a functional approach. Here topology is of lesser importance, and the functionalities of network vertices form the governing principle [LACROIX, FERNANDES, AND SAGOT, 05].

Remarks

The functional approach
◮ does not require information on the interconnections,
◮ is applicable in broader scenarios: complexes or pathways whose topologies are not completely known, querying from species for which no PPI information is available, . . .

slide-12
SLIDE 12

GRAPH MOTIF

Definition (GRAPH MOTIF)
Input: A set of colors C, a motif M over C (a multiset M with underlying set C), a graph G = (V, E), and a mapping λ : V → C.
Task: Find an occurrence of M in G, i.e., a subset V′ ⊆ V such that
◮ λ(V′) = M, and
◮ G[V′] is connected.

Remarks

◮ Introduced in [LACROIX, FERNANDES, AND SAGOT, 05].
◮ The motif M is said to be colorful if it is a set.
◮ The multiplicity of a color c ∈ C in G is the number of vertices u ∈ V such that λ(u) = c.
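The definition can be made concrete with a small checker. The following is a minimal sketch, assuming the graph is stored as an adjacency dictionary `adj` and the mapping λ as a dictionary `color`; the function names and the toy instance are illustrative and do not come from the talk.

```python
from collections import Counter

def is_connected(adj, vertices):
    """Depth-first search restricted to `vertices` (the induced subgraph)."""
    vertices = set(vertices)
    if not vertices:
        return False
    start = next(iter(vertices))
    seen, stack = {start}, [start]
    while stack:
        v = stack.pop()
        for u in adj[v]:
            if u in vertices and u not in seen:
                seen.add(u)
                stack.append(u)
    return seen == vertices

def is_occurrence(adj, color, motif, subset):
    """True if the colors of `subset` form the multiset `motif`
    and the subgraph induced by `subset` is connected."""
    same_colors = Counter(color[v] for v in subset) == Counter(motif)
    return same_colors and is_connected(adj, subset)

# Hypothetical toy instance: the path a - b - c - d.
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
color = {"a": "red", "b": "blue", "c": "red", "d": "green"}
```

With this instance, `{"a", "b", "c"}` is an occurrence of the motif `["red", "red", "blue"]`, whereas `{"a", "c"}` has the right colors for `["red", "red"]` but is not connected.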

slide-13
SLIDE 13

GRAPH MOTIF

Example

[Figure: an example motif M.]

slide-15
SLIDE 15

GRAPH MOTIF: Preliminary results

Theorem (LACROIX, FERNANDES, AND SAGOT, 06)
GRAPH MOTIF is NP-complete even if G is a tree.

Remarks

◮ The proof does not hold for colorful motifs.
◮ Exponential-time exact algorithm for the general case.

slide-16
SLIDE 16

GRAPH MOTIF: A sudden jump in complexity

Theorem (FELLOWS, FERTIN, HERMELIN, AND V., 07)
GRAPH MOTIF is NP-complete even if
◮ G is a tree with maximum degree 4 and color multiplicity 3 and M is colorful, or
◮ G is a bipartite graph and M is built over 2 colors.

Theorem (FELLOWS, FERTIN, HERMELIN, AND V., 07)
GRAPH MOTIF is polynomial-time solvable if G is a tree with color multiplicity 2.

slide-17
SLIDE 17

GRAPH MOTIF: Coping with hardness

Some lines of thought

◮ One may reasonably assume that motifs tend to be small in practice (compared to the target graph).
◮ It would be nice to design an algorithm whose running time is polynomial in the size of the target graph and exponential only in the size of the motif.
◮ It would be even nicer to design an algorithm whose running time is polynomial in the size of the target graph and exponential only in the number of distinct colors that occur in the motif.
◮ Parameterized complexity is a branch of computational complexity theory that classifies computational problems according to their inherent difficulty with respect to multiple parameters of the input.

slide-18
SLIDE 18

Parameterized complexity

Definition (Parameterized problem)
A parameterized problem is a language L ⊆ Σ∗ × Σ∗, where Σ is a finite alphabet. The second component is called the parameter of the problem.

Definition (Fixed-parameter tractability)
A parameterized problem L is fixed-parameter tractable if it can be determined in $f(k) \cdot n^{O(1)}$ time whether (x, k) ∈ L, where n = |x| and f is a computable function depending only on k. The corresponding complexity class is called FPT.

Definition (Parameterized hierarchy)
FPT ⊆ W[1] ⊆ W[2] ⊆ . . . ⊆ W[sat] ⊆ W[P] ⊆ XP.

slide-19
SLIDE 19

Parameterized complexity

In a nutshell . . .

◮ Problems that admit a fixed-parameter tractable algorithm can be solved efficiently for small values of the parameter.
◮ W[1] is the class of parameterized decision problems (x, k), with parameter k, that are fixed-parameter reducible to WEIGHTED 3SAT: given a 3SAT formula, does it have a satisfying assignment of Hamming weight k?
◮ W[1] is the lowest class of the hierarchy that contains problems not believed to be in FPT.
◮ If FPT = W[1], then 3SAT is in DTIME($2^{o(n)}$), where n is the number of variables.

slide-20
SLIDE 20

GRAPH MOTIF: Small enough motifs

Theorem (LACROIX, FERNANDES, AND SAGOT, 06)
GRAPH MOTIF for trees is fixed-parameter tractable w.r.t. |M|.

Remarks

◮ The fixed-parameter tractability proof does not hold for general graphs.
◮ Purely combinatorial enumeration algorithm.

slide-21
SLIDE 21

GRAPH MOTIF: Small enough motifs

Theorem (FELLOWS, FERTIN, HERMELIN, AND V., 07)
GRAPH MOTIF is solvable in $2^{O(k)} n^2 \log n$ time, where k = |M| and n = |V|.

Theorem (BETZLER, FELLOWS, KOMUSIEWICZ, AND NIEDERMEIER, 08)
GRAPH MOTIF is solvable with error probability ε in $O(4.32^k k^2 |\log \varepsilon| m)$ time, where k = |M| and m = |E|.

slide-22
SLIDE 22

GRAPH MOTIF: Small enough motifs

Theorem (BETZLER, FELLOWS, KOMUSIEWICZ, AND NIEDERMEIER, 08)
GRAPH MOTIF is solvable with error probability ε in $O(4.32^k k^2 |\log \varepsilon| m)$ time, where k = |M| and m = |E|.

Key elements

◮ GRAPH MOTIF for colorful motifs.
◮ Color coding and recoloring procedure.
◮ Fast subset convolution (BJÖRKLUND, HUSFELDT, AND KASKI, 07).
◮ Algorithm engineering for color-coding (HÜFFNER, WERNICKE, ZICHNER, 07).

slide-23
SLIDE 23

GRAPH MOTIF: colorful motifs

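Theorem
GRAPH MOTIF for colorful motifs is solvable in $O(3^k m)$ time, where k = |M| and m = |E|.

Key elements

Dynamic programming approach: $D_{v,M'}$ is the minimum score of a color set M′ ⊆ M for a vertex v ∈ V.

Initialization:

$$D_{v,M'} = \begin{cases} 0 & \text{if } M' = \{\mathrm{col}(v)\} \\ \infty & \text{otherwise} \end{cases}$$

Recurrence:

$$D_{v,M'} = \min_{u \in N(v);\ M'' \subseteq M'} \left\{ D_{u,\,M' \setminus \mathrm{col}(v)},\ \ D_{v,\,M'' \cup \mathrm{col}(v)} + D_{v,\,(M' \setminus M'') \cup \mathrm{col}(v)} \right\}$$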
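To illustrate the flavor of this dynamic program, here is a minimal sketch of a simplified decision variant: instead of a minimum score, it only records which color subsets are reachable from each vertex. This is a swapped-in simplification, not the scored recurrence of the slide, and it makes no attempt to reach the stated $O(3^k m)$ bound. The adjacency dictionary `adj`, the dictionary `color`, and the function name are assumptions made for the example.

```python
def colorful_motif_occurs(adj, color, motif):
    """Decide whether some connected vertex set uses every color of the
    set `motif` exactly once (decision version, no scores)."""
    bit = {c: 1 << i for i, c in enumerate(sorted(motif))}
    k = len(bit)
    full = (1 << k) - 1
    # reach[v]: bitmasks S such that some connected vertex set containing v
    # has pairwise distinct colors whose set is exactly S.
    reach = {v: ({bit[color[v]]} if color[v] in bit else set()) for v in adj}
    for size in range(2, k + 1):
        found = {v: set() for v in adj}
        for v in adj:
            for s1 in reach[v]:
                for u in adj[v]:
                    for s2 in reach[u]:
                        # Disjoint color sets imply disjoint vertex sets, so
                        # the edge (v, u) glues two partial solutions together.
                        if s1 & s2 == 0 and bin(s1 | s2).count("1") == size:
                            found[v].add(s1 | s2)
        for v in adj:
            reach[v] |= found[v]
    return any(full in masks for masks in reach.values())
```

The table is filled by increasing subset size, so every color set of size s is assembled from two smaller disjoint sets attached to adjacent vertices, which mirrors the splitting of M′ into two parts in the recurrence.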

slide-24
SLIDE 24

GRAPH MOTIF: Color coding

Color-coding

◮ ALON, YUSTER, AND ZWICK, 95.
◮ Method to derive (randomized) fixed-parameter algorithms for several subgraph isomorphism problems.
◮ Best explained by example . . .

LONGEST PATH
Input: A graph G = (V, E) and a non-negative integer k.
Task: Find a simple path in G that contains k vertices.

slide-25
SLIDE 25

Color coding: k-path

Key idea

  1. Randomly color the vertices of the graph with k colors.
  2. Find a colorful path of k vertices in G (dynamic programming step).

[Figure: example graph with vertices s, v, and u.]

slide-26
SLIDE 26

Color coding: k-path

Theorem
Let G = (V, E) be a graph and f : V → {1, 2, . . . , k} be a coloring of G. Then a colorful path of k vertices can be found (if it exists) in $2^{O(k)} m$ time, where m = |E|.

Theorem
LONGEST PATH is solvable in $2^{O(k)} m$ expected time, where m = |E|.
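The two steps of color coding for k-path can be sketched as follows. This is a minimal illustration, assuming an adjacency dictionary `adj`; the function names and the `trials` parameter are choices made for the example, and the set-based table is written for readability rather than for the stated running time.

```python
import random

def has_colorful_path(adj, coloring, k):
    """Dynamic programming step: is there a path on k vertices whose
    colors (integers in range(k)) are pairwise distinct?"""
    # level[v]: color bitmasks of colorful paths ending at v.
    level = {v: {1 << coloring[v]} for v in adj}
    for _ in range(k - 1):
        nxt = {v: set() for v in adj}
        for u in adj:
            for mask in level[u]:
                for v in adj[u]:
                    b = 1 << coloring[v]
                    if not mask & b:  # the color of v is still unused
                        nxt[v].add(mask | b)
        level = nxt
    return any(level[v] for v in adj)

def has_k_path(adj, k, trials, rng):
    """Randomized step: retry with fresh random colorings. A True answer is
    always correct; a False answer may be wrong with small probability."""
    for _ in range(trials):
        coloring = {v: rng.randrange(k) for v in adj}
        if has_colorful_path(adj, coloring, k):
            return True
    return False
```

A colorful path is automatically simple, since no vertex can repeat without repeating its color. A fixed path on k vertices becomes colorful with probability k!/k^k > e^{-k}, so on the order of e^k trials suffice for a constant success probability.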

slide-27
SLIDE 27

Color coding: toward deterministic algorithms

Definition (k-perfect family of hash functions)
A k-perfect family of hash functions is a family H of functions from {1, 2, . . . , n} to {1, 2, . . . , k} such that for each S ⊆ {1, 2, . . . , n} with |S| = k there exists an h ∈ H such that h is one-to-one when restricted to S.

Theorem
One can construct a k-perfect family of hash functions from {1, 2, . . . , n} to {1, 2, . . . , k} which consists of $2^{O(k)} \log n$ functions. For such a hash function h, the value h(i), 1 ≤ i ≤ n, can be computed in linear time.

Theorem
LONGEST PATH is solvable in $2^{O(k)} m \log n$ time, where m = |E| and n = |V|.
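The definition can be checked by brute force on tiny examples. The sketch below only tests the defining property and does not construct a small family; it assumes 0-based indices (elements in `range(n)`) and represents each function h as a tuple with `h[i]` the value of element i, both of which are conventions chosen for the example.

```python
from itertools import combinations

def is_k_perfect(family, n, k):
    """True if every k-subset of range(n) is mapped one-to-one
    by at least one function of `family`."""
    return all(
        any(len({h[i] for i in subset}) == k for h in family)
        for subset in combinations(range(n), k)
    )
```

For n = 3 and k = 2, the family consisting of (0, 1, 0) and (0, 0, 1) is 2-perfect, whereas (0, 1, 0) alone is not, because it maps elements 0 and 2 to the same value.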

slide-28
SLIDE 28

GRAPH MOTIF: Recoloring procedure

[Figure: the motif M and the colored graph (G, λ).]

slide-29
SLIDE 29

GRAPH MOTIF: Recoloring procedure

[Figure sequence: the recolored motif M′ alongside the graph, first with its original coloring (G, λ) and then with the new coloring (G, λ′); vertices carry the labels 1, 2, 3.]

slide-33
SLIDE 33

GRAPH MOTIF: Recoloring procedure

Let V′ be an occurrence of M in G.

◮ ∀c ∈ C, let $P_c$ be the probability that the m(c) vertices in V′ that have color c receive a colorful recoloring:

$$P_c = \frac{m(c)!}{m(c)^{m(c)}} > e^{-m(c)} \sqrt{2\pi\, m(c)}$$

◮ ∀c, c′ ∈ C, $P_c$ and $P_{c'}$ are independent, and hence

$$P_{c \wedge c'} > e^{-(m(c)+m(c'))} \sqrt{2\pi\, (m(c)+m(c'))}$$

◮ Let $P_M$ be the probability that the occurrence V′ is (recoloring) colorful:

$$P_M = \prod_{c \in C} P_c > e^{-k}$$
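The bound can be checked numerically on a small motif. The sketch below assumes, as the formula for $P_c$ suggests, that each vertex of color c independently draws one of m(c) fresh colors uniformly at random; the function name and the sample motif are illustrative only.

```python
import math
from collections import Counter

def colorful_recoloring_probability(motif):
    """Probability that a fixed occurrence of the multiset `motif`
    becomes colorful after the random recoloring."""
    p = 1.0
    for m in Counter(motif).values():
        p *= math.factorial(m) / m**m  # P_c = m(c)! / m(c)^m(c)
    return p

motif = ["a", "a", "a", "b", "b", "c"]  # multiplicities 3, 2, 1, so k = 6
p = colorful_recoloring_probability(motif)
lower_bound = math.exp(-len(motif))
```

Here p = (6/27) · (2/4) · 1 = 1/9, comfortably above the lower bound e^{-6}.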

slide-34
SLIDE 34

GRAPH MOTIF is not always in FPT

Theorem (FELLOWS, FERTIN, HERMELIN, AND V., 07)
GRAPH MOTIF is in XP when parameterized by both the number of colors in the motif and the treewidth of the input graph.

Theorem (FELLOWS, FERTIN, HERMELIN, AND V., 07)
GRAPH MOTIF is W[1]-hard when parameterized by the number of colors in M, even when the target graph is a tree.

slide-35
SLIDE 35

GRAPH MOTIF: Going further

◮ Allow multiple colors per vertex to model multiple functionalities of one element.
◮ Ask for somewhat more robust motifs:
  ◮ Biconnected motifs.
  ◮ Bridge-connectivity.
◮ The GRAPH MOTIF problem is too stringent: measurement errors might result in no occurrence of M in G even though “good” solutions do exist. This suggests turning GRAPH MOTIF into an optimization problem:
  ◮ Match the whole motif at the price of losing connectivity: the occurrence is no longer required to be connected.
  ◮ Maintain connectivity at the price of losing some elements: the occurrence has to be connected but may miss some elements of the motif.

slide-36
SLIDE 36

GRAPH MOTIF: Robust motifs

Definition
A vertex u is called a cut vertex if there are two distinct vertices v and w, u ≠ v and u ≠ w, such that every path from v to w contains u.

Definition
A graph is biconnected if it is connected and has no cut vertex.

Definition
A graph is 2-edge-connected (or bridge-connected) if it cannot be disconnected by deleting a single edge.
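These definitions translate directly into a brute-force test: in a connected graph, u is a cut vertex exactly when removing u disconnects the remaining vertices. The sketch below follows the definitions literally rather than using a linear-time depth-first search algorithm; the adjacency dictionary `adj` and the function names are assumptions made for the example.

```python
def connected(adj, vertices):
    """True if the subgraph induced by `vertices` is connected
    (sets with fewer than two vertices count as connected)."""
    vertices = set(vertices)
    if len(vertices) < 2:
        return True
    start = next(iter(vertices))
    seen, stack = {start}, [start]
    while stack:
        v = stack.pop()
        for u in adj[v]:
            if u in vertices and u not in seen:
                seen.add(u)
                stack.append(u)
    return seen == vertices

def cut_vertices(adj):
    """Cut vertices of a connected graph, found by deleting each vertex in turn."""
    everything = set(adj)
    return {u for u in adj if not connected(adj, everything - {u})}

def is_biconnected(adj):
    """Connected and without cut vertex, as in the definition above."""
    return connected(adj, adj) and not cut_vertices(adj)
```

A triangle is biconnected, while a path on three vertices is not: its middle vertex is a cut vertex.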

slide-37
SLIDE 37

GRAPH MOTIF: Robust motifs

Theorem (BETZLER, FELLOWS, KOMUSIEWICZ, AND NIEDERMEIER, 08)
BICONNECTED GRAPH MOTIF is W[1]-hard when parameterized by the number of elements in M.

Remarks

◮ A stronger result actually holds: Finding a biconnected subgraph of size k is W[1]-hard.
◮ Recall that finding a biconnected subgraph of size at least k is solvable in linear time [TARJAN, 72].
◮ The above theorem still holds if we replace biconnectivity by bridge-connectivity.

slide-38
SLIDE 38

GRAPH MOTIF as an optimization problem

Definition (MIN CC)
Input: A set of colors C, a motif M over C (a multiset M with underlying set C), a graph G = (V, E), and a mapping λ : V → C.
Solution: A subset V′ ⊆ V such that λ(V′) = M.
Measure: The number of connected components in the induced subgraph G[V′].

Remarks

◮ Contains GRAPH MOTIF as a special case, i.e., the case where the occurrence consists of one connected component.
◮ Minimization problem.
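For very small instances, MIN CC can be solved by exhaustive search, which makes the measure concrete. The following is a sketch under that assumption (exponential time, illustration only); `adj`, `color`, and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

def count_components(adj, vertices):
    """Number of connected components of the subgraph induced by `vertices`."""
    vertices, seen, components = set(vertices), set(), 0
    for s in vertices:
        if s in seen:
            continue
        components += 1
        seen.add(s)
        stack = [s]
        while stack:
            v = stack.pop()
            for u in adj[v]:
                if u in vertices and u not in seen:
                    seen.add(u)
                    stack.append(u)
    return components

def min_cc(adj, color, motif):
    """Smallest number of components over all vertex sets whose colors form
    the multiset `motif`; None if no such vertex set exists."""
    target, best = Counter(motif), None
    for subset in combinations(adj, len(motif)):
        if Counter(color[v] for v in subset) == target:
            c = count_components(adj, subset)
            if best is None or c < best:
                best = c
    return best
```

An optimum of 1 means that the motif occurs in the sense of GRAPH MOTIF.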

slide-39
SLIDE 39

GRAPH MOTIF as an optimization problem

Definition (MAX GRAPH MOTIF)
Input: A set of colors C, a motif M over C (a multiset M with underlying set C), a graph G = (V, E), and a mapping λ : V → C.
Solution: A subset V′ ⊆ V such that
◮ λ(V′) ⊆ M, and
◮ G[V′] is connected.
Measure: The size of the occurrence, i.e., |V′|.

Remarks

◮ Contains GRAPH MOTIF as a special case, i.e., the case where the occurrence uses all the colors of the motif.
◮ Maximization problem.
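The same exhaustive approach illustrates MAX GRAPH MOTIF on toy instances. In this sketch, λ(V′) ⊆ M is read as multiset inclusion, which is an interpretation of the definition above; the data layout and function names are again assumptions, and the search is exponential.

```python
from collections import Counter
from itertools import combinations

def is_connected(adj, vertices):
    """True if the non-empty vertex set induces a connected subgraph."""
    vertices = set(vertices)
    start = next(iter(vertices))
    seen, stack = {start}, [start]
    while stack:
        v = stack.pop()
        for u in adj[v]:
            if u in vertices and u not in seen:
                seen.add(u)
                stack.append(u)
    return seen == vertices

def max_graph_motif(adj, color, motif):
    """Size of a largest connected vertex set whose color multiset is
    included in `motif` (0 if even a single vertex does not fit)."""
    allowed = Counter(motif)
    for size in range(min(len(adj), len(motif)), 0, -1):
        for subset in combinations(adj, size):
            used = Counter(color[v] for v in subset)
            # `used - allowed` keeps only the colors used too many times.
            if not (used - allowed) and is_connected(adj, subset):
                return size
    return 0
```

On the path a - b - c colored red, green, blue, the motif {red, blue} only admits single vertices, because the green vertex in the middle cannot be used.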

slide-40
SLIDE 40

MIN CC: Some bad news

Theorem (DONDI, FERTIN, AND V., 07)
MIN CC is APX-hard, even when the target graph is a path and M is colorful.

Theorem (DONDI, FERTIN, V.)
MIN CC is not approximable within ratio c log n, even when the target graph is a tree and M is colorful.

slide-41
SLIDE 41

From MIN CC to SET COVER

[Figure: the tree built by the reduction, with root r, vertices S′_1, . . . , S′_m and S_1, . . . , S_m, and, below each S_i, the element vertices e_1(S_i), e_2(S_i), . . . , e_{t_i}(S_i).]

slide-42
SLIDE 42

MIN CC: Bad and good news

Theorem (DONDI, FERTIN, AND V., 07)
MIN CC is W[2]-hard when parameterized by the size of the solution, even when the target graph is a tree and M is colorful.

Theorem (BETZLER, FELLOWS, KOMUSIEWICZ, AND NIEDERMEIER, 08)
MIN CC is W[1]-hard when parameterized by the size of the solution, even when the target graph is a path.

Theorem (DONDI, FERTIN, AND V., 07)
MIN CC is in FPT when parameterized by the size of M. More precisely, it is solvable in $2^{O(k)} n^3 \log n$ time. The complexity reduces to O(n2k2(q+2)) time if G is a tree.

slide-43
SLIDE 43

MIN CC: Going further . . .

◮ The fixed-parameter algorithms are still not practical.
◮ What about approximating MIN CC for paths?
◮ No efficient exponential-time algorithm is known so far.
◮ Designing algorithms that focus on the number of distinct colors that occur in the motif.

slide-44
SLIDE 44

MAX GRAPH MOTIF: Some bad news . . . again

Theorem (DONDI, FERTIN, AND V., 09)
MAX GRAPH MOTIF is APX-hard, even when the target graph is a tree T of degree 3, M is colorful, and each color occurs at most twice in T.

Theorem (DONDI, FERTIN, AND V., 09)
MAX GRAPH MOTIF is not in APX, even when the target graph is a tree T and M is colorful.


slide-46
SLIDE 46

MAX GRAPH MOTIF: Some bad news . . . again

Theorem (DONDI, FERTIN, AND V., 09)
MAX GRAPH MOTIF is not in APX, even when the target graph is a tree T and M is colorful.

Proof (key elements)
The proof is by the self-improvement technique.

◮ Given two instances $I_1 = (T_1, M_1)$ and $I_2 = (T_2, M_2)$ of MAX COLORS, we need to define the product $I_{1,2} = I_1 \times I_2 = (T_{1,2}, M_{1,2})$. Informally, $T_{1,2}$ is obtained by replacing each vertex $v_i \in V_1$ by a copy of $T_2$, connecting these copies through their roots. If $v_i \in V_1$ is colored $c_i$ and $v_j \in V_2$ is colored $c_j$, then vertex $v_i(v_j) \in T_{1,2}$ is colored $c_i(c_j)$.
◮ Self-product: $I^k = I^{k-1} \times I$.

slide-47
SLIDE 47

MAX GRAPH MOTIF: Some bad news . . . again

Theorem (DONDI, FERTIN, AND V., 09)
MAX GRAPH MOTIF is not in APX, even when the target graph is a tree T and M is colorful.

Proof (key elements)

◮ If $T^k$ is a solution for MAX COLORS over instance $I^k$, then there exists a solution T for MAX COLORS over instance I such that $|T|^k \ge |T^k|$.
◮ If T is a solution for MAX COLORS over instance I, then there exists a solution $T^k$ for MAX COLORS over instance $I^k$ such that $|T^k| \ge |T|^k$.
◮ For any constant δ < 1, MAXIMUM LEVEL MOTIF cannot be approximated within ratio $2^{\log^{\delta} n}$ in polynomial time unless NP ⊆ DTIME[$2^{\mathrm{poly} \log n}$].


slide-49
SLIDE 49

MAX GRAPH MOTIF: Some (not so) good news

Theorem (DONDI, FERTIN, AND V., 09)
MAX GRAPH MOTIF is in FPT when parameterized by the size of the solution.

Remarks

◮ The idea is to combine two perfect families of hash functions with dynamic programming.
◮ The time complexity is, however, still not practical: $4^{O(k)} k n^2 \log^2 n$ time for graphs and $2^{O(k)} n^3 \log n$ time for trees.

Theorem (DONDI, FERTIN, AND V., 09)
MAX GRAPH MOTIF for trees of size n can be solved in $O^*(1.62^n)$ time. In case the motif is colorful, the time complexity reduces to $O^*(1.33^n)$.

slide-50
SLIDE 50

GRAPH MOTIF and variants: practical issues

Algorithmic solutions

◮ Torque [BRUCKNER, HÜFFNER, KARP, SHAMIR, AND SHARAN, 09].
  ◮ Web server.
  ◮ Currently supports queries of up to 20–25 proteins.
◮ GraMoFoNe [BLIN, SIKORA, AND V., 10].
  ◮ Cytoscape plugin.
  ◮ Currently supports queries of up to 20–25 proteins.

slide-51
SLIDE 51

Torque

http://www.cs.tau.ac.il/~bnet/torque.html

slide-52
SLIDE 52

GraMoFoNe is a Cytoscape plugin

◮ Open-source Java platform:
  ◮ Import/export in numerous formats
  ◮ Visualization
  ◮ Network annotations
◮ Popular in bioinformatics
◮ Active community

slide-53
SLIDE 53

GraMoFoNe

Main features

◮ Uses a pseudo-Boolean programming engine.
◮ Databases (native support).
◮ Deals with both colorful and multiset motifs.
◮ Can report all solutions.
◮ Deals with approximate solutions:
  ◮ insertions,
  ◮ deletions,
  ◮ list coloring.

Cons

GraMoFoNe is still not able to deal with motifs of size 30.

slide-54
SLIDE 54

GraMoFoNe

http://igm.univ-mlv.fr/AlgoB/gramofone/


slide-64
SLIDE 64

Querying biological networks

Example

◮ Query: Mouse DNA synthesome complex (13 proteins).
◮ Target: Yeast network (5,340 proteins, 39,936 interactions).
◮ Output: The match consists of 12 proteins with 2 insertions and 3 deletions.

slide-65
SLIDE 65

Some perspectives: Pushing FPT algorithms further

Questions/Observations

◮ Fixed-parameter tractable algorithms are still not able to deal with moderate-size motifs (about 30 proteins).
◮ Current approaches are still limited to motifs of size about 15, whereas practical applications ask for motifs of size about 25–30.
◮ Do we need new FPT techniques?
◮ Is it enough to craft and fine-tune our algorithms?

slide-66
SLIDE 66

Some perspectives: Modules

Definition (Graph module)
A module in a graph G = (V, E) is a subset V′ ⊆ V such that the vertices within the module all have the same neighborhood outside the module.

Example

[Figure: example graph on vertices 1 to 11.]

slide-67
SLIDE 67

Some perspectives: Modules

Definition (Graph module)
A module in a graph G = (V, E) is a subset V′ ⊆ V such that the vertices within the module all have the same neighborhood outside the module.

Questions/Observations

◮ What about replacing the connectedness demand by modularity?
◮ Modular decomposition trees should certainly help for algorithm design.
◮ Sure enough, replacing connectedness by the notion of modules is not a strong enough relaxation (is it a relaxation?) to push GRAPH MOTIF towards polynomial-time tractability (actually, definitely not!).
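The module property is easy to test for a given vertex subset, as the following minimal sketch shows. It assumes an adjacency dictionary `adj`; the function name and the small example graph are illustrative and are not the graph from the slides.

```python
def is_module(adj, subset):
    """True if all vertices of `subset` have the same neighbors outside `subset`."""
    subset = set(subset)
    outside = {frozenset(set(adj[v]) - subset) for v in subset}
    return len(outside) <= 1

# Hypothetical example: a star with center 3 and leaves 1, 2, 4.
adj = {1: [3], 2: [3], 3: [1, 2, 4], 4: [3]}
```

In this star, {1, 2} is a module (both leaves see only vertex 3 outside the set), while {1, 3} is not, since vertex 3 also sees 2 and 4. Note that {1, 2} is a module without being connected, so the two requirements are genuinely different.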