SLIDE 1

Pattern Matching in Protein-Protein Interaction Graphs

Gaëlle Brevier (Université de Grenoble, France)

Romeo Rizzi (Università di Udine, Italy)

Stéphane Vialette (Université Paris-Est, France)

Lisbon, September 19, 2008

Brevier, Rizzi and Vialette () Pattern Matching in Protein Graphs Lisbon, September 18, 2008 1 / 38

SLIDE 2

Introduction

Outline

1. Introduction
2. Exact colorful instances
3. Hardness results
4. Approximation algorithms
  • Bounded degree graphs
  • A randomized algorithm
  • Linear forests
5. Future works

SLIDE 3

Introduction

Introduction

Protein interactions identified on a genome-wide scale are commonly visualized as protein interaction graphs, where proteins are vertices and interactions are edges.

SLIDE 4

Introduction

Gene or Protein Interactions Databases

  • BioGRID - A Database of Genetic and Physical Interactions
  • DIP - Database of Interacting Proteins
  • MINT - A Molecular Interactions Database
  • IntAct - EMBL-EBI Protein Interaction
  • MIPS - Comprehensive Yeast Protein-Protein interactions
  • Yeast Protein Interactions - Yeast two-hybrid results from Fields’ group
  • PathCalling - A yeast protein interaction database by Curagen
  • SPiD - Bacillus subtilis Protein Interaction Database
  • AllFuse - Functional Associations of Proteins in Complete Genomes
  • BRITE - Biomolecular Relations in Information Transmission and Expression
  • ProMesh - A Protein-Protein Interaction Database
  • The PIM Database - by Hybrigenics
  • Mouse Protein-Protein interactions
  • Human herpesvirus 1 Protein-Protein interactions
  • Human Protein Reference Database
  • BOND - The Biomolecular Object Network Databank. Former BIND
  • MDSP - Systematic identification of protein complexes in Saccharomyces cerevisiae by mass spectrometry
  • Protcom - Database of protein-protein complexes enriched with the domain-domain structures
  • Proteins that interact with GroEL and factors that affect their release
  • YPD™ - Yeast Proteome Database by Incyte
  • . . .

SLIDE 5

Introduction

Introduction

Comparative analysis of protein-protein interaction graphs aims at finding complexes that are common to different species. Mounting evidence suggests that proteins that function together in a pathway or a structural complex are likely to evolve in a correlated fashion.

SLIDE 6

Introduction

Introduction

  • Pattern matching in protein-protein interaction graphs: finding a protein complex in another protein network.
  • Graph matching: focus on mappings that preserve adjacencies (to deal with interaction datasets that are missing many true protein interactions).
  • Injective list homomorphisms and optimization: state-of-the-art approaches to identifying orthologs (genes in different species that originate from a single gene in the last common ancestor of these species). Putative orthologs are represented by colors.

SLIDE 7

Introduction

Introduction: Searching for an exact occurrence

Pattern graph (G, λG), mult(G, λG) = 2.
Target graph (H, λH), mult(H, λH) = 5.

$\theta : V(G) \xrightarrow{\lambda_G, \lambda_H} V(H)$

SLIDE 13

Introduction

Introduction: Searching for the best occurrence

Pattern graph (G, λG), mult(G, λG) = 2.
Target graph (H, λH), mult(H, λH) = 4.

$\theta : V(G) \xrightarrow{\lambda_G, \lambda_H} V(H)$

SLIDE 19

Introduction

Problem

Max–(ρ, σ)–Matching–Colors

  • Input : Two graphs G and H and the coloring mappings λG : V(G) → C, mult(G, λG) = ρ, and λH : V(H) → C, mult(H, λH) = σ.
  • Solution : An injective mapping $\theta : V(G) \xrightarrow{\lambda_G, \lambda_H} V(H)$.
  • Measure : The number of edges of G matched by the injective mapping θ.

EXACT–(ρ, σ)–MATCHING–COLORS is the extremal problem of finding an injective mapping $\theta : V(G) \xrightarrow{\lambda_G, \lambda_H} V(H)$ that matches all the edges of G.
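The measure can be illustrated with a minimal Python sketch. It assumes that a mapping "w.r.t. λG and λH" means a color-preserving mapping, and that an edge {u, v} of G is matched when {θ(u), θ(v)} is an edge of H. The function names and the toy instance are hypothetical and only serve the illustration.

```python
def respects_colors(theta, color_g, color_h):
    """Check that theta is a total, injective, color-preserving map V(G) -> V(H)."""
    if set(theta) != set(color_g):
        return False
    images = list(theta.values())
    if len(set(images)) != len(images):
        return False
    return all(color_g[u] == color_h[theta[u]] for u in theta)


def matched_edges(theta, g_edges, h_edges):
    """Return the edges {u, v} of G whose image {theta(u), theta(v)} is an edge of H."""
    h_set = {frozenset(e) for e in h_edges}
    return [(u, v) for u, v in g_edges
            if frozenset((theta[u], theta[v])) in h_set]


# Hypothetical toy instance: two red vertices in G, two vertices per color in H.
color_g = {"a": "red", "b": "blue", "c": "red"}
g_edges = [("a", "b"), ("b", "c")]
color_h = {1: "red", 2: "blue", 3: "red", 4: "blue"}
h_edges = [(1, 2), (2, 3), (3, 4)]

theta_exact = {"a": 1, "b": 2, "c": 3}    # matches every edge of G
theta_partial = {"a": 1, "b": 4, "c": 3}  # matches only the edge {b, c}
```

Here `theta_exact` is a solution of the EXACT variant, while `theta_partial` is a feasible solution of the MAX variant with measure 1.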

SLIDE 20

Introduction

Introduction

Trim instance

An instance of the MAX–(ρ, σ)–MATCHING–COLORS or the EXACT–(ρ, σ)–MATCHING–COLORS problem is said to be trim if the following conditions hold true:

1. for each color ci ∈ C, #CG(ci) ≤ #CH(ci), and
2. for each edge {ui, uj} ∈ E(G), there exists an edge {vi, vj} ∈ E(H) such that λG(ui) = λH(vi) and λG(uj) = λH(vj).
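The two conditions can be checked directly. The sketch below assumes that CG(ci) and CH(ci) denote the sets of vertices of G and H colored ci; the function name `is_trim` and the toy data are hypothetical.

```python
from collections import Counter


def is_trim(color_g, g_edges, color_h, h_edges):
    """Check the two trim conditions on an instance (G, lambda_G), (H, lambda_H)."""
    count_g = Counter(color_g.values())
    count_h = Counter(color_h.values())
    # Condition 1: every color occurs at least as often in H as in G.
    if any(count_g[c] > count_h[c] for c in count_g):
        return False
    # Condition 2: every edge of G has an edge of H with the same pair of colors.
    h_color_pairs = {frozenset((color_h[x], color_h[y])) for x, y in h_edges}
    return all(frozenset((color_g[u], color_g[v])) in h_color_pairs
               for u, v in g_edges)


# Hypothetical toy instance.
color_g = {"a": "red", "b": "blue", "c": "red"}
g_edges = [("a", "b"), ("b", "c")]
color_h = {1: "red", 2: "blue", 3: "red", 4: "blue"}
h_edges = [(1, 2), (2, 3), (3, 4)]

trim = is_trim(color_g, g_edges, color_h, h_edges)
# Adding a red-red edge to G breaks condition 2: H has no red-red edge.
not_trim = is_trim(color_g, g_edges + [("a", "c")], color_h, h_edges)
```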

SLIDE 21

Introduction

Related works in the context

  • List injective homomorphisms for protein graphs [Fagnot, Lelandais and V., 2007; Fertin, Rizzi and V., 2005].
  • Reaction motifs in metabolic networks [Lacroix, Fernandes and Sagot, 2006; Hermelin, Fellows, Fertin and V., 2007].
  • QPath [Shlomi, Segal, Ruppin and Sharan, 2006].
  • Path Matching and Graph Matching in Biological Networks [Yang and Sze, 2007].

SLIDE 23

Exact colorful instances

Exact colorful instances

Theorem (Fagnot, Lelandais and V., 2007)
Both the EXACT–(1, σ)–MATCHING–COLORS problem for ∆(G) ≤ 2 and the EXACT–(ρ, 2)–MATCHING–COLORS problem are solvable in polynomial time for any constant ρ and σ.

Theorem (Fertin, Rizzi and V., 2005)
The EXACT–(1, 3)–MATCHING–COLORS problem for ∆(G) = 3 and ∆(H) = 4 is NP-complete.

We focus here on the EXACT–(1, σ)–MATCHING–COLORS problem.

SLIDE 24

Exact colorful instances

Exact colorful instances

Algorithm 1: Rand-Exact-Matching-Colors

begin
  Repeat until an occurrence of G in H w.r.t. λG and λH is found:
  • Let $\theta : V(G) \xrightarrow{\lambda_G, \lambda_H} V(H)$ be a random injective mapping.
  • Repeat up to 3nG times, terminating if an occurrence of G in H w.r.t. λG and λH is found:
    (1) Choose at random an edge e ∈ E(G) that is not matched by θ.
    (2) Choose at random one vertex u ∈ e.
    (3) Change at random the value of θ(u) w.r.t. λG and λH.
end
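A minimal Python sketch of this random walk, under stated assumptions: ρ = 1 (every color occurs once in G, so any color-preserving mapping is automatically injective), "w.r.t. λG and λH" means color-preserving, and the outer loop is capped by a hypothetical `max_restarts` parameter so that the function terminates when no occurrence exists. All names are illustrative and not taken from the slides.

```python
import random


def rand_exact_matching_colors(g_edges, color_g, h_edges, color_h,
                               max_restarts=1000, seed=None):
    """Search for a color-preserving mapping matching all edges of G, or return None."""
    if len(set(color_g.values())) != len(color_g):
        raise ValueError("this sketch assumes rho = 1 (G is colorful)")
    rng = random.Random(seed)
    h_set = {frozenset(e) for e in h_edges}
    # candidates[c] lists the vertices of H colored c.
    candidates = {}
    for x, c in color_h.items():
        candidates.setdefault(c, []).append(x)
    if any(c not in candidates for c in color_g.values()):
        return None
    n_g = len(color_g)

    def unmatched(theta):
        return [(u, v) for u, v in g_edges
                if frozenset((theta[u], theta[v])) not in h_set]

    for _ in range(max_restarts):
        # Random color-preserving mapping (injective because rho = 1).
        theta = {u: rng.choice(candidates[c]) for u, c in color_g.items()}
        for _ in range(3 * n_g):
            bad = unmatched(theta)
            if not bad:
                return theta
            u = rng.choice(rng.choice(bad))  # steps (1) and (2)
            others = [x for x in candidates[color_g[u]] if x != theta[u]]
            if others:
                theta[u] = rng.choice(others)  # step (3)
        if not unmatched(theta):
            return theta
    return None


# Hypothetical toy instance with sigma = 2 and a unique occurrence.
color_g = {"a": 1, "b": 2, "c": 3}
g_edges = [("a", "b"), ("b", "c")]
color_h = {"x1": 1, "x2": 1, "y1": 2, "y2": 2, "z1": 3, "z2": 3}
h_edges = [("x1", "y1"), ("y1", "z1"), ("x2", "y2")]
found = rand_exact_matching_colors(g_edges, color_g, h_edges, color_h, seed=0)
```

The cap on restarts is a design choice of the sketch: the algorithm on the slide repeats until an occurrence is found, which does not terminate on negative instances.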

SLIDE 25

Exact colorful instances

Exact colorful instances: Random walk

Particle moving along the integer line

  • Fix an optimal solution θopt.
  • θi and θopt agree on exactly j vertices.
  • e = {u, v} ∈ E(G): random edge that is not matched by θi.

Case 1: θi and θopt disagree on exactly one of u and v.

$\Pr[j \to j-1] = \frac{\sigma-1}{2\sigma-2}, \quad \Pr[j \to j+1] = \frac{1}{2\sigma-2}, \quad \Pr[j \to j] = \frac{\sigma-2}{2\sigma-2}$

Case 2: θi and θopt disagree on both u and v.

$\Pr[j \to j+1] = \frac{2}{2\sigma-2}, \quad \Pr[j \to j] = \frac{2\sigma-4}{2\sigma-2}$

Pessimistic stochastic process (Y1, Y2, . . .)

$\Pr[j \to j-1] = \frac{2\sigma-3}{2\sigma-2}, \quad \Pr[j \to j+1] = \frac{1}{2\sigma-2}$
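As a sanity check, the fractions on these slides can be tabulated with exact arithmetic. The assignment of each fraction to a move (down, up, stay) is an assumption of this sketch: it is derived by picking one endpoint of e uniformly and then one of the σ − 1 other candidate images uniformly. The function name is hypothetical.

```python
from fractions import Fraction


def transition_probabilities(sigma):
    """Move probabilities of the particle for color classes of size sigma in H."""
    d = Fraction(1, 2 * sigma - 2)
    return {
        # Exactly one endpoint disagrees with theta_opt.
        "one_wrong": {"down": (sigma - 1) * d, "up": d, "stay": (sigma - 2) * d},
        # Both endpoints disagree with theta_opt.
        "both_wrong": {"down": Fraction(0), "up": 2 * d, "stay": (2 * sigma - 4) * d},
        # Pessimistic process: smallest "up" probability, everything else moves down.
        "pessimistic": {"down": (2 * sigma - 3) * d, "up": d, "stay": Fraction(0)},
    }


probs_3 = transition_probabilities(3)
```

Each case sums to 1, and the pessimistic process never moves up more often than either real case.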

SLIDE 28

Exact colorful instances

Exact colorful instances: Random walk

Useful bounds

Let rj be the probability of exactly k “moves down” and j + k “moves up” in a sequence of 2k + j moves:

$r_j \ge \left(\frac{2\sigma-3}{2\sigma-2}\right)^{k} \left(\frac{1}{2\sigma-2}\right)^{j+k}$

Let qj be the probability that the algorithm finds an injective homomorphism within j + 2k ≤ 3nG steps, starting from a random injective mapping $\theta : V(G) \xrightarrow{\lambda_G, \lambda_H} V(H)$:

$q_j \ge \frac{\sqrt{3}}{8\sqrt{\pi j}} \left(\frac{27(2\sigma-3)}{4(2\sigma-2)^3}\right)^{j}$

SLIDE 29

Exact colorful instances

Exact colorful instances: Random walk

Theorem
Algorithm Rand-Exact-Matching-Colors returns an injective homomorphism $\theta : V(G) \xrightarrow{\lambda_G, \lambda_H} V(H)$ (if such a mapping exists) in $\tilde{O}(f(\sigma)^{n_G})$ expected time, where

$f(\sigma) = \frac{4\sigma(2\sigma-2)^3}{4(2\sigma-2)^3 + 27(2\sigma-3)}$

Notice
  • f(σ) < σ for σ > 2.
  • f(3) < 2.279, f(4) < 3.460 and f(5) < 4.578.
  • The result holds for ρ = 1.
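The numerical values quoted on this slide can be reproduced with a few lines of Python, reading the flattened formula as f(σ) = 4σ(2σ − 2)³ / (4(2σ − 2)³ + 27(2σ − 3)). Exact rational arithmetic is used so that the comparisons are not affected by rounding.

```python
from fractions import Fraction


def f(sigma):
    """Base f(sigma) of the expected running time, as an exact fraction."""
    cube = (2 * sigma - 2) ** 3
    return Fraction(4 * sigma * cube, 4 * cube + 27 * (2 * sigma - 3))


values = {sigma: float(f(sigma)) for sigma in (3, 4, 5)}
```

For instance f(3) = 768/337, which is just below 2.279.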

SLIDE 31

Hardness results

Hardness results

Theorem
The MAX–(3, 3)–MATCHING–COLORS problem is APX-hard even if both G and H are linear forests, and the MAX–(2, 2)–MATCHING–COLORS problem is APX-hard even if both G and H are trees.

Notice
It remains open, however, whether the MAX–(ρ, σ)–MATCHING–COLORS problem for linear forests G and H is polynomial-time solvable in case ρ < 3.

SLIDE 34

Approximation algorithms Bounded degree graphs

Bounded degree graphs: Intermediate problem

Max–Matching–With–Color–Constraints

  • Input : A graph G together with a coloring mapping λG : V(G) → {c1, c2, . . . , cm}, and a symmetric matrix A = [ai,j] of order m whose entries are natural numbers.
  • Solution : A matching M ⊆ E(G) such that, for 1 ≤ i ≤ j ≤ m, the number of edges in M having one end-vertex colored ci and one end-vertex colored cj is at most ai,j.
  • Measure : The size of the matching, i.e., #M.
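Feasibility of a candidate solution can be checked as in the following sketch. The symmetric matrix A is represented, as an assumption of the sketch, by a dictionary keyed by unordered color pairs, with missing pairs read as bound 0; the function name and the toy data are hypothetical.

```python
from collections import Counter


def is_feasible(matching, g_edges, color, bound):
    """Check that `matching` is a matching of G obeying the color-pair bounds."""
    edge_set = {frozenset(e) for e in g_edges}
    covered = set()
    per_pair = Counter()
    for u, v in matching:
        if frozenset((u, v)) not in edge_set:
            return False              # not an edge of G
        if u in covered or v in covered:
            return False              # two chosen edges share a vertex
        covered.update((u, v))
        per_pair[frozenset((color[u], color[v]))] += 1
    return all(count <= bound.get(pair, 0) for pair, count in per_pair.items())


# Hypothetical toy instance: a path 1 - 2 - 3 - 4 with alternating colors.
color = {1: "c1", 2: "c2", 3: "c1", 4: "c2"}
g_edges = [(1, 2), (2, 3), (3, 4)]
tight = {frozenset(("c1", "c2")): 1}
loose = {frozenset(("c1", "c2")): 2}
```

With the bound `tight`, the matching {(1, 2), (3, 4)} is rejected because it uses two c1-c2 edges; with `loose` it is accepted and has measure 2.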

SLIDE 35

Approximation algorithms Bounded degree graphs

Bounded degree graphs: Intermediate problem

Theorem
The MAX–MATCHING–WITH–COLOR–CONSTRAINTS problem is NP-complete but is approximable within ratio 3/2 + ε, for any ε > 0.

Proof.
Approximation-preserving reduction to MAXIMUM B-SET PACKING.
  • MAXIMUM SET PACKING is defined as follows: given a collection S of finite subsets of a ground set X, find a maximum cardinality collection of pairwise disjoint sets S′ ⊆ S.
  • MAXIMUM B-SET PACKING is the variation of MAXIMUM SET PACKING in which the cardinality of every set in S is bounded from above by a constant B ≥ 3.

SLIDE 36

Approximation algorithms Bounded degree graphs

Bounded degree graphs

Chromatic index

  • An edge coloring of a graph G is proper if no two adjacent edges are assigned the same color.
  • The smallest number of colors needed in a proper edge coloring of a graph G is the chromatic index χ′(G).
  • Vizing’s theorem states that χ′(G) ≤ ∆(G) + 1 and that such an edge coloring can be found in polynomial time.

Petersen graph: χ′(G) = ∆(G) + 1 = 4

SLIDE 37

Approximation algorithms Bounded degree graphs

Bounded degree graphs

Theorem
For any ρ and σ, the MAX–(ρ, σ)–MATCHING–COLORS problem is approximable within ratio 3/2(∆min + 1) + ε for any ε > 0, where ∆min = min{∆(G), ∆(H)}.

Key elements
  • Chromatic index.
  • Vizing’s theorem.
  • Iteratively using the (3/2 + ε)-approximation algorithm for instances of the MAX–MATCHING–WITH–COLOR–CONSTRAINTS problem.

SLIDE 38

Approximation algorithms Bounded degree graphs

Bounded degree graphs

Proof.

1. ∆min = ∆(H):
  1. H admits a proper edge coloring with at most ∆(H) + 1 colors, say $\{c'_1, c'_2, \ldots, c'_{\Delta(H)+1}\}$.
  2. For 1 ≤ i ≤ ∆(H) + 1,
    1. let Hi be the graph obtained from H by deleting all edges but those colored with color $c'_i$; note that Hi is a matching.
    2. Using the (3/2 + ε)-approximation algorithm for the MAX–MATCHING–WITH–COLOR–CONSTRAINTS problem, we obtain a 2-approximation algorithm for the new instance of the MAX–(ρ, σ)–MATCHING–COLORS problem obtained by replacing H by Hi.
  3. Returning the best of these ∆(H) + 1 mappings yields an approximation algorithm with performance ratio 3/2(∆(H) + 1) + ε.

SLIDE 39

Approximation algorithms Bounded degree graphs

Bounded degree graphs

Proof.

2. ∆min = ∆(G):
  1. G admits a proper edge coloring with at most ∆(G) + 1 colors, say $\{c'_1, c'_2, \ldots, c'_{\Delta(G)+1}\}$.
  2. For 1 ≤ i ≤ ∆(G) + 1,
    1. let Gi be the graph obtained from G by deleting all edges but those colored with color $c'_i$; note that Gi is a matching.
    2. Using the (3/2 + ε)-approximation algorithm for the MAX–MATCHING–WITH–COLOR–CONSTRAINTS problem, we obtain a 2-approximation algorithm for the new instance of the MAX–(ρ, σ)–MATCHING–COLORS problem obtained by replacing G by Gi.
  3. Returning the best of these ∆(G) + 1 mappings yields an approximation algorithm with performance ratio 3/2(∆(G) + 1) + ε.

SLIDE 40

Approximation algorithms Bounded degree graphs

Bounded degree graphs

Proof.

3. Combining
  • ∆min = ∆(H): (3/2(∆(H) + 1) + ε)-approximation algorithm,
  • ∆min = ∆(G): (3/2(∆(G) + 1) + ε)-approximation algorithm,
yields a (3/2(∆min + 1) + ε)-approximation algorithm for any ε > 0, where ∆min = min{∆(G), ∆(H)}.

SLIDE 42

Approximation algorithms A randomized algorithm

A randomized algorithm

Definitions

Let G be a graph and λG : V(G) → C be a coloring mapping of G.

  • A legal (ℓ1, ℓ2)-labeling of G is an assignment of labels {ℓ1, ℓ2} to the vertices of G such that, for each color ci ∈ C, either $\lfloor \#C_G(c_i)/2 \rfloor$ or $\lceil \#C_G(c_i)/2 \rceil$ vertices in CG(ci) are labeled ℓ1.
  • The cut induced by a legal (ℓ1, ℓ2)-labeling is defined to be the set of edges that have one end-vertex with label ℓ1 and one end-vertex with label ℓ2.
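These definitions translate into a short Python sketch. It assumes the two admissible counts are the floor and the ceiling of #CG(ci)/2, and it draws a random legal labeling by shuffling each color class, which is only one possible sampling scheme and is not claimed to be the one used by the authors. Labels are written "l1" and "l2"; all names are hypothetical.

```python
import random
from collections import Counter, defaultdict


def random_legal_labeling(color, seed=None):
    """Label about half of each color class "l1" and the rest "l2"."""
    rng = random.Random(seed)
    classes = defaultdict(list)
    for v, c in color.items():
        classes[c].append(v)
    labeling = {}
    for vertices in classes.values():
        rng.shuffle(vertices)
        half = len(vertices) // 2
        k = rng.choice([half, len(vertices) - half])  # floor or ceiling
        for i, v in enumerate(vertices):
            labeling[v] = "l1" if i < k else "l2"
    return labeling


def is_legal(labeling, color):
    """Check the floor/ceiling condition for every color class."""
    total = Counter(color.values())
    l1 = Counter(color[v] for v in color if labeling[v] == "l1")
    return all(l1[c] in (total[c] // 2, total[c] - total[c] // 2) for c in total)


def cut_edges(edges, labeling):
    """Edges with one end-vertex labeled l1 and the other labeled l2."""
    return [(u, v) for u, v in edges if labeling[u] != labeling[v]]


# Hypothetical toy instance.
color = {"a": "red", "b": "red", "c": "red", "d": "blue", "e": "blue"}
edges = [("a", "d"), ("b", "c")]
fixed = {"a": "l1", "b": "l1", "c": "l2", "d": "l1", "e": "l2"}
cut = cut_edges(edges, fixed)
```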

SLIDE 43

Approximation algorithms A randomized algorithm

A randomized algorithm

[Figure: a graph with a legal (ℓ1, ℓ2)-labeling, showing the ℓ1-subset, the ℓ2-subset and the (ℓ1, ℓ2)-cut edges.]

SLIDE 47

Approximation algorithms A randomized algorithm

A randomized algorithm

Theorem
There exists a randomized algorithm for the MAX–(ρ, σ)–MATCHING–COLORS problem with expected performance ratio 4σ.

Key elements
  • Random (ℓ1, ℓ2)-labeling.
  • Random mapping $\theta : V(G) \xrightarrow{\lambda_G, \lambda_H} V(H)$.
  • Maximum weighted matching in bipartite graphs.

SLIDE 48

Approximation algorithms A randomized algorithm

A randomized algorithm

[Figure: pattern graph (G, λG), with its ℓ1-labeling and ℓ2-labeling, and target graph (H, λH).]

  • Optimal solution θopt.
  • Random mapping θrand.
  • Vertices u and v: θopt(u) = θrand(u), ∀w ∈ V(G) | ℓ1 θrand(w) = θopt(v).
  • Weighted matching M.
  • Solution θsol = θrand + M.

SLIDE 56

Approximation algorithms Linear forests

Linear Forests

Theorem
The MAX–(3, 3)–MATCHING–COLORS problem is APX-hard even if both G and H are linear forests.

Theorem
For any ρ and σ, the MAX–(ρ, σ)–MATCHING–COLORS problem is approximable within ratio 4 in case both G and H are linear forests.

Key elements
  • Balanced 2-intervals.
  • Weighted independent set.

SLIDE 57

Approximation algorithms Linear forests

Linear Forests

Definitions

  • A 2-interval D = (I, J) is the union of two disjoint intervals defined over a single line.
  • A 2-interval D = (I, J) is said to be balanced if |I| = |J|.
  • Two 2-intervals D1 = (I1, J1) and D2 = (I2, J2) are disjoint if they share no common point.
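The three definitions can be sketched in Python as follows. The representation is an assumption of the sketch: an interval is a closed pair (start, end) with start ≤ end, |I| is taken to be end − start, and a 2-interval is a pair of such intervals. Function names are hypothetical.

```python
def intervals_overlap(a, b):
    """True if the closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


def is_2_interval(d):
    """A 2-interval is a pair of well-formed, disjoint intervals."""
    i, j = d
    return i[0] <= i[1] and j[0] <= j[1] and not intervals_overlap(i, j)


def is_balanced(d):
    """Balanced means both intervals have the same length."""
    (i_start, i_end), (j_start, j_end) = d
    return i_end - i_start == j_end - j_start


def are_disjoint(d1, d2):
    """Two 2-intervals are disjoint if none of their intervals share a point."""
    return not any(intervals_overlap(x, y) for x in d1 for y in d2)


# Hypothetical examples on the integer line.
d1 = ((0, 2), (5, 7))    # balanced
d2 = ((3, 4), (8, 9))    # balanced, interleaved with d1 but disjoint from it
d3 = ((1, 3), (10, 12))  # balanced, intersects d1 on [1, 2]
d4 = ((0, 2), (5, 9))    # not balanced
```

Note that d1 and d2 are disjoint even though their intervals interleave along the line; only shared points matter.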

SLIDE 58

Approximation algorithms Linear forests

Linear Forests

[Figure: pattern graph (G, λG) with paths $P^G_1$ and $P^G_2$, and target graph (H, λH) with paths $P^H_1$, $P^H_2$ and $P^H_3$.]

SLIDE 59

Approximation algorithms Linear forests

Linear Forests

Theorem (Crochemore, Hermelin, Landau, Rawitz and V., 2006)
There exists a polynomial-time algorithm with performance ratio 4 for finding a maximum weight subset of disjoint 2-intervals in a set of weighted balanced 2-intervals.

Key elements
  • Local ratio technique.
  • r-effective weight function.

Theorem
For any ρ and σ, the MAX–(ρ, σ)–MATCHING–COLORS problem is approximable within ratio 4 in case both G and H are linear forests.

SLIDE 61

Future works

Future works

  • Improve the random walk algorithm for the EXACT–(ρ, σ)–MATCHING–COLORS problem. What about ρ ≥ 2 . . . ?
  • Improve the approximation ratio for bounded degree graphs.
  • Design a better (randomized?) approximation algorithm for the MAX–(ρ, σ)–MATCHING–COLORS problem. Is the MAX–(ρ, σ)–MATCHING–COLORS problem approximable within ratio σ?
