GWAMAR: Genome-wide assessment of mutations associated with drug resistance in bacteria



SLIDE 1

Introduction Methods Results Summary

GWAMAR: Genome-wide assessment of mutations associated with drug resistance in bacteria

Michal Wozniak (1,2), Limsoon Wong (2) and Jerzy Tiuryn (1)

(1) University of Warsaw, (2) National University of Singapore

27 March, 2015

Michal Wozniak GWAMAR: drug resistance-associated mutations

SLIDE 2

Introduction
◮ Mechanisms of drug action against bacteria
◮ Mechanisms of drug resistance in bacteria

Methods
◮ Schema of the approach
◮ Input data
◮ Association scores

Results
◮ Input datasets
◮ Comparison of different association scores
◮ Top-scoring mutations
◮ Compensatory mutations

Summary

SLIDE 3

Drug action mechanisms

Figure: Adapted from: Platforms for antibiotic discovery; Kim Lewis; Nature Reviews; 2013

SLIDE 4

Timeline of antibiotics

Figure: Timeline of the discovery and introduction of antibiotics (based on Platforms for antibiotic discovery; Kim Lewis; Nature Reviews; 2013).

SLIDE 5

Drug resistance mechanisms I

There are several known drug resistance mechanisms, which can be categorized as follows (adapted from: Wright GD, Chem. Comm., 2011):

◮ drug target modification;
◮ drug molecule modification by specialized enzymes;
◮ reduced accumulation of the drug inside the bacterial cell, by decreased cell wall permeability or by pumping out the drug;
◮ alternative metabolic pathways.

These drug resistance mechanisms can be acquired either by chromosomal mutations or by horizontal gene transfer.

SLIDE 6

Drug resistance mechanisms II

Figure: Adapted from: Platforms for antibiotic discovery; Kim Lewis; Nature Reviews; 2013

SLIDE 7

GWAMAR: drug resistance-associated mutations

Goal: identify drug resistance-associated mutations (primary and secondary).

General approach implemented in GWAMAR:

◮ we use a whole-genome comparative approach to identify genetic variations among multiple bacterial strains,
◮ we retrieve from literature and databases information on the drug resistance phenotypes of the strains,
◮ we associate the identified mutations with drug resistance phenotypes based on association scores,
◮ we propose a new association score, called TGH, which incorporates phylogenetic information.

SLIDE 8

Genotype and phenotype data

Genotype data

We consider two kinds of genetic variation (determined by eCAMBer based on gene families and their multiple alignments):

◮ gene gain/loss,
◮ amino acid point mutation.

These genetic variations are represented as '0'-'1' vectors (called mutation profiles), where '0' denotes the reference state and '1' denotes some change.

Phenotype data (drug susceptibility)

Phenotype data are represented as vectors, called drug resistance profiles, with possible states: 'S' (susceptible), 'R' (resistant), 'I' (intermediate), '?' (unknown).
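As a small illustration of the two representations on this slide, the following Python sketch stores a mutation profile and a drug resistance profile as equal-length strings and tabulates their joint counts. The strain names, the example profiles and the helper `joint_counts` are hypothetical, and skipping strains with unknown phenotype is an assumption made here, not a documented GWAMAR behaviour.

```python
from collections import Counter

# Hypothetical strain names and profiles, for illustration only.
strains = ["s1", "s2", "s3", "s4", "s5", "s6"]
mutation_profile = "110100"    # '0' = reference state, '1' = some change
resistance_profile = "RRSR?S"  # 'S', 'R', 'I', or '?' for unknown


def joint_counts(b, r):
    """Count strains per (mutation state, phenotype) pair, skipping '?'."""
    if len(b) != len(r):
        raise ValueError("both profiles must cover the same strains")
    return Counter((x, y) for x, y in zip(b, r) if y != "?")


counts = joint_counts(mutation_profile, resistance_profile)
print(counts[("1", "R")], counts[("0", "S")])
```

Counts of this kind are the raw material for the association scores introduced on the following slides.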

SLIDE 9

Schema of the framework

The pipeline of GWAMAR

Preprocessing steps done by eCAMBer (this step may potentially be replaced by other tools):

◮ download of genome sequences and annotations for a set of bacterial strains,
◮ consolidation of genome annotations for multiple bacterial strains and identification of gene families,
◮ multiple alignments of identified gene families computed using MUSCLE,
◮ identification of point mutations.

Inputs:

◮ genotype data (a set of mutations),
◮ phenotype data collected from literature or databases (a set of drug resistance profiles),
◮ phylogenetic tree for the set of bacterial strains (reconstruction of the phylogenetic tree employing PHYLIP or PhyML),
◮ the reference strain.

Processing:

◮ binarization of mutation profiles into binary mutation profiles,
◮ scoring of the mutation profiles (multiprocessing).

Output: scored list of putative associations of drug resistance with mutations.

SLIDE 10

Tree-aware scores

We observe that subtrees of the phylogenetic tree very often correspond to geographic locations. Since drug resistance mutations are subject to evolutionary pressure caused by the drug treatment, they should be independent of geographic location and therefore be more widely distributed over the tree, as opposed to mutations driven by other environmental factors, which tend to concentrate in small subtrees.

[Figure: phylogenetic tree of the analyzed M. tuberculosis strains; leaf labels (e.g. H37Rv, CDC1551, Erdman, HN878 and the KZN, PanR and SUMu strains) omitted.]

SLIDE 11

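Classical (tree-ignorant) association scores

The classical scores used in genotype-phenotype association studies and co-evolution studies are tree-ignorant. Here $n^y_x$ denotes the number of strains with mutation state $x$ and phenotype $y$, $n_x$ and $n^y$ are the corresponding marginal counts, and $n$ is the total number of strains.

◮ odds ratio:

$$\mathrm{OR}(b, r) = \frac{n^R_1 \cdot n^S_0}{\max(1, n^R_0) \cdot \max(1, n^S_1)}$$

◮ mutual information:

$$\mathrm{MI}(b, r) = \sum_{x \in \{'0','1'\}} \sum_{y \in \{'S','I','R'\}} \frac{n^y_x}{n} \cdot \log\left(\frac{n^y_x \cdot n}{n_x \cdot n^y}\right)$$

◮ hypergeometric score:

$$H(b, r) = -\log\left(\sum_{i=n^R_1}^{n} H(n, n^R, n_1, i)\right)$$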
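The three classical scores on this slide can be sketched in a few lines of Python. This is a minimal illustration, not GWAMAR's implementation: the function name `classical_scores` is hypothetical, natural logarithms are assumed, strains with unknown phenotype ('?') are dropped, and the hypergeometric tail is assumed to start at the observed number of resistant strains carrying the mutation.

```python
from math import comb, log


def classical_scores(b, r):
    """Return (odds ratio, mutual information, hypergeometric score)."""
    # Keep only strains with a known phenotype (assumption of this sketch).
    pairs = [(x, y) for x, y in zip(b, r) if y in ("S", "I", "R")]
    n = len(pairs)

    def cnt(x=None, y=None):
        # Number of strains matching mutation state x and/or phenotype y.
        return sum(1 for px, py in pairs
                   if x in (None, px) and y in (None, py))

    n1r, n0s, n0r, n1s = cnt("1", "R"), cnt("0", "S"), cnt("0", "R"), cnt("1", "S")
    # Odds ratio; max(1, .) guards against division by zero.
    odds = (n1r * n0s) / (max(1, n0r) * max(1, n1s))

    # Mutual information; empty cells contribute nothing.
    mi = 0.0
    for x in "01":
        for y in "SIR":
            nxy = cnt(x, y)
            if nxy > 0:
                mi += nxy / n * log(nxy * n / (cnt(x=x) * cnt(y=y)))

    # Hypergeometric tail: probability of at least n1r resistant strains
    # among the n_1 mutated ones; terms beyond min(n_r, n_1) are zero.
    n_r, n_1 = cnt(y="R"), cnt(x="1")
    tail = sum(comb(n_r, i) * comb(n - n_r, n_1 - i)
               for i in range(n1r, min(n_r, n_1) + 1)) / comb(n, n_1)
    return odds, mi, -log(tail)


odds, mi, h = classical_scores("110100", "RRSR?S")
print(odds, round(mi, 4), round(h, 4))
```

None of these scores looks at the phylogenetic tree, which is the limitation the tree-aware scores on the next slides address.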

SLIDE 12

Weighted support

Weighted support rewards drug-resistant strains that carry the mutation and penalizes drug-susceptible strains that carry it. The weight $w_T(b, r, i)$ of a drug-resistant strain is $1/k$, where $k$ denotes the size of the largest subtree containing that strain and only drug-resistant strains.

Example (weights as shown in the figure):

Drug resistance profile   S     R     S     R     ?     ?     S     R     R     ?     R
Weights                  -1.7   1.0  -1.7   1.0   0.0   0.0  -1.7   0.3   0.3   0.0   0.3

The figure also shows example binary mutation profiles (mutation1, mutation2, mutation3, ...) together with resulting weighted supports (2.3, 1.0, -1.0, ...).

Weighted support for a mutation with binary profile $b$ is defined as follows:

$$\mathrm{WS}_T(b, r) = \sum_{i \in S} w_T(b, r, i) \, [b(i) = \text{'1'}]$$

where $S$ denotes the set of strains and $[\cdot]$ equals 1 when the condition holds and 0 otherwise.
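The weighted support sum can be illustrated with a short Python sketch. The weights are the ones shown in the example on this slide (with minus signs restored for the susceptible strains); the binary mutation profile `b` and the function name `weighted_support` are hypothetical, since the positions of the mutations in the original figure are not available.

```python
def weighted_support(weights, b):
    """Sum the weights of all strains whose mutation state is '1'."""
    return sum(w for w, state in zip(weights, b) if state == "1")


# Weights for the profile S R S R ? ? S R R ? R, taken from the slide.
weights = [-1.7, 1.0, -1.7, 1.0, 0.0, 0.0, -1.7, 0.3, 0.3, 0.0, 0.3]
# Hypothetical mutation present in the two isolated resistant strains.
b = "01010000000"
ws = weighted_support(weights, b)
print(ws)
```

A mutation confined to the cluster of three resistant strains would collect only 0.3 per strain, which is how the score discounts resistance that can be explained by a single subtree.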

SLIDE 13

TGH score I

For a given tree T, we call a subset c of its nodes a coloring if it satisfies the following two conditions:

◮ each path from a leaf to the root contains at most one node from c,
◮ each internal node in c has a sibling node which does not belong to c.
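The two conditions can be checked directly on a small tree. The sketch below is only an illustration of the definition: the tree encoding (a dictionary mapping each node to its parent, with the root mapped to `None`), the node names and the function `is_coloring` are assumptions made here. It uses the fact that a leaf-to-root path contains two nodes of c exactly when one of them is an ancestor of the other.

```python
def is_coloring(parent, c):
    """Check whether the node set c is a coloring of the tree `parent`."""
    children = {}
    for node, par in parent.items():
        children.setdefault(par, []).append(node)

    # Condition 1: no node of c has a proper ancestor in c.
    for node in c:
        anc = parent[node]
        while anc is not None:
            if anc in c:
                return False
            anc = parent[anc]

    # Condition 2: every internal node of c has a sibling outside c.
    for node in c:
        if children.get(node):  # internal node: it has children
            siblings = [s for s in children[parent[node]] if s != node]
            if all(s in c for s in siblings):
                return False
    return True


# Hypothetical tree: root -> A, B; A -> a1, a2; B -> b1.
parent = {"root": None, "A": "root", "B": "root",
          "a1": "A", "a2": "A", "b1": "B"}
print(is_coloring(parent, {"A"}), is_coloring(parent, {"A", "B"}))
```

The second condition rules out sets such as {A, B}, where all children of a node are selected instead of the node itself.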

SLIDE 14

TGH score II

Panels A) and B) use the same pair of profiles:

Drug resistance profile: R R R S S R R

Binary mutation profile: 1 1 1 (three strains carry the mutation)

In panel B) the internal nodes are labelled S,0 R,1 S,0 R,0 S,0.

Figure: (A) An example of coloring ĉ induced by a given drug resistance profile (large red nodes) and coloring c induced by a given binary mutation profile (small orange nodes) for a flat tree. In this example |ĉ| = 5, |c| = 3 and |L(ĉ) ∩ c| = 3. (B) Another example of colorings ĉ and c induced by the same pair of profiles but for a different tree. In this example |ĉ| = 3, |c| = 2 and |L(ĉ) ∩ c| = 2.

SLIDE 15

TGH score III

We define the TGH score as follows:

$$\mathrm{TGH}_T(r, b) = -\log\left(\sum_{i=k}^{n} \frac{B_{T,\hat{c}}(i, n)}{V_T(n)}\right)$$

where:

$$V_T(n) = \#\{c \in C_T : |c| = n\}$$

and:

$$B_{T,\hat{c}}(k, n) = \#\{c \in C_T : |L(\hat{c}) \cap c| = k \text{ and } |c| = n\}$$

GWAMAR implements a dynamic programming approach to calculate the score. The time complexity is $O(D \cdot N^{K-1} \cdot N^2 + D \cdot N \cdot M)$.
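For intuition, the score can be evaluated by brute force in the special case of a flat tree. The sketch below rests on several assumptions that are not stated on the slides: in a flat tree every set of leaves is a coloring, L(ĉ) coincides with ĉ, and the logarithm is natural. The function name `tgh_flat_tree` and the positions of the mutated strains are hypothetical. It does not reproduce GWAMAR's dynamic programming for general trees.

```python
from itertools import combinations
from math import comb, log


def tgh_flat_tree(leaves, resistant, mutated):
    """Brute-force TGH score for a flat tree (all leaves under the root)."""
    c_hat = set(resistant)                # coloring induced by the phenotype
    n = len(mutated)                      # |c|
    k = len(c_hat & set(mutated))         # |L(c_hat) ∩ c|
    # V_T(n): all colorings with n nodes, here all n-subsets of the leaves.
    colorings = [set(c) for c in combinations(leaves, n)]
    # Sum of B(i, n) for i >= k: colorings overlapping c_hat at least k times.
    tail = sum(1 for c in colorings if len(c & c_hat) >= k)
    return -log(tail / len(colorings))


leaves = list(range(7))
resistant = [0, 1, 2, 5, 6]   # profile R R R S S R R
mutated = [0, 1, 2]           # hypothetical: three mutated resistant strains
score = tgh_flat_tree(leaves, resistant, mutated)
print(round(score, 4))
```

Under these assumptions the counts reduce to binomial coefficients, 10 of 35 colorings in this example, so on a flat tree the score coincides with a hypergeometric tail score.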

SLIDE 16

Input datasets

We have two datasets for M. tuberculosis:

◮ 1398 strains with 28 genes sequenced, from the Broad Institute (mtu_broad),
◮ 173 fully sequenced strains available in the NCBI and PATRIC databases (mtu173).

Genotype data

Point mutation profiles were determined based on gene families identified with eCAMBer and their multiple alignments computed with MUSCLE.

Phenotype data (drug susceptibility)

◮ publications issued together with the fully sequenced genomes;
◮ other publications found by searching related literature;
◮ drug resistance profiles for separate drugs are combined into: Rifampicin, Isoniazid, Fluoroquinolones, Ethambutol, Pyrazinamide, Streptomycin.

SLIDE 17

Collected phenotype data for the mtu173 dataset

Table 1. Complete collected data on drug resistance.

Columns (in order): Streptomycin, Rifampicin, Ethambutol, Isoniazid, Ofloxacin, Kanamycin, Capreomycin, Amikacin, Ethionamide, Pyrazinamide, Cycloserine, Ciprofloxacin, PAS, Rifabutin. Each entry gives the phenotype (S or R) followed by the reference number in brackets; rows with fewer than 14 entries have empty cells in the original table.

02 1987: S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11]
210: S [13] S [13] S [13]
7199-99: S [12] S [12] S [12] S [12] S [12] S [12] S [12] S [12] S [12] S [12] S [12]
94 M4241A: S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11]
98-R604 INH-RIF-EM: S [11] R [11] R [11] R [11] S [11] S [11]
BT1: R [17] R [17] R [17] R [17] R [17] R [17] R [17] R [17] R [17] R [17]
BT2: R [17] R [17] R [17] R [17] R [17] R [17] R [17] R [17] R [17] R [17]
BTB05-552: S [14] S [14] S [14] R [14] S [14] S [14] S [14]
BTB05-559: S [14] S [14] S [14] R [14] S [14] S [14] S [14]
C: S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11]
CCDC5079: S [11]S [23] S [11]S [23] S [11]S [23] S [11]S [23] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11]
CCDC5080: R [23] R [23] R [23] R [23]
CDC1551: S [8] S [8] S [8] S [8] S [8] S [8] S [8] S [8] S [8] S [8]
CPHL A: S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11]
CTRI-2: S [11]S [1] S [11]S [1] S [11]S [1] S [11]S [1] S [11]S [1] S [11] S [11]S [1] S [11]S [1] S [11]S [1] S [11]S [1] S [11] S [11] S [11] S [11]
CTRI-4: R [1] R [1] R [1] R [1] R [1] R [1] R [1] R [1] R [1]
EAI OSDD271: R [18] R [18] R [18] R [18] R [18] R [18]
EAS054: S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11]
Erdman: S [11]S [10] S [11]S [10] S [11]S [10] S [11]S [10] S [11] S [11]S [10] S [11]S [10] S [11]S [10] S [11]S [10] S [11]S [10] S [11]S [10] S [11] S [11] S [11]S [10]
FJ05194: R [7] R [7] R [7] R [7] R [7] R [7] R [7] R [7]
GM 1503: S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11]
GuangZ0019: R [7] R [7] R [7] R [7] R [7] R [7] R [7] R [7]
H37Ra: S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11]
H37Ra WGS: S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11]
H37Rv: S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11]
HKBS1: S [17] S [17] S [17] S [17]
HN878: S [3] S [3] S [3] S [3] S [3] S [3] S [3] S [3] S [3] S [3] S [3] S [3] S [3] S [3]
K85: S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11]
KZN 1435: S [11] R [11] S [11] R [11] S [11] S [11]
KZN 4207: S [11]S [4] S [11]S [4] S [11] S [11]S [4] S [11]S [4] S [11]S [4] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11]
KZN 4207 Broad: S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11]
KZN 605: R [11] R [11] R [11] R [11] R [11] R [11]
KZN R506: R [4] R [4] R [4] R [4] R [4]
KZN V2475: R [4] R [4] R [4]
MTB-476: S [5] S [5] S [5] S [5] S [5] S [5] S [5] S [5] S [5] S [5] S [5] S [5] S [5] S [5]
MTB-489: S [5] S [5] S [5] S [5] S [5] S [5] S [5] S [5] S [5] S [5] S [5] S [5] S [5] S [5]
OM-V02 005: R [16] R [16]
OSDD105: R [20] R [20] R [20] R [20] S [20] S [20] S [20] S [20] S [20] R [20] R [20] S [20]
OSDD493: R [19] R [19] R [19] R [19] R [19] R [19] S [19] S [19] R [19] S [19] S [19] S [19]
OSDD515: R [21] R [21] R [21] R [21] S [21] S [21] R [21] S [21] R [21] R [21] R [21] S [21]
PanR0201: R [6] R [6] S [6] R [6]
PanR0202: R [6] R [6] R [6] R [6]
PanR0203: R [6] R [6] R [6] R [6]
PanR0205: R [6] R [6] R [6] R [6]
PanR0206: R [6] R [6] S [6] R [6]
PanR0207: R [6] R [6] S [6] R [6]
PanR0208: R [6] R [6] S [6] R [6]
PanR0209: R [6] R [6] R [6] R [6]
PanR0301: R [6] R [6] S [6] R [6]
PanR0304: R [6] R [6] R [6] R [6]
PanR0305: S [6] R [6] S [6] R [6]
PanR0306: R [6] R [6] R [6] R [6]
PanR0307: S [6] R [6] S [6] R [6]
PanR0308: R [6] R [6] R [6] R [6]
PanR0309: R [6] R [6] S [6] R [6]
PanR0311: R [6] R [6] S [6] R [6]
PanR0313: S [6] R [6] S [6] R [6]
PanR0314: R [6] R [6] S [6] R [6]
PanR0315: S [6] R [6] S [6] R [6]
PanR0316: R [6] R [6] S [6] R [6]
PanR0317: R [6] R [6] S [6] R [6]
PanR0401: R [6] R [6] R [6] R [6]
PanR0402: R [6] R [6] S [6] R [6]
PanR0403: R [6] R [6] S [6] R [6]
PanR0404: S [6] R [6] S [6] R [6]
PanR0405: R [6] R [6] R [6] R [6]
PanR0407: R [6] R [6] S [6] R [6]
PanR0408: S [6] R [6] S [6] R [6]
PanR0409: S [6] R [6] S [6] R [6]
PanR0410: R [6] R [6] S [6] R [6]
PanR0411: R [6] R [6] S [6] R [6]
PanR0412: R [6] R [6] S [6] R [6]
PanR0501: R [6] R [6] S [6] R [6]
PanR0503: S [6] R [6] S [6] R [6]
PanR0505: R [6] R [6] R [6] R [6]
PanR0601: R [6] R [6] S [6] R [6]
PanR0602: S [6] R [6] S [6] R [6]
PanR0603: R [6] R [6] R [6] R [6]
PanR0604: R [6] R [6] R [6] R [6]
PanR0605: R [6] R [6] S [6] R [6]
PanR0606: S [6] R [6] S [6] R [6]
PanR0607: R [6] R [6] S [6] R [6]
PanR0609: R [6] R [6] S [6] R [6]
PanR0610: S [6] R [6] S [6] R [6]
PanR0611: S [6] R [6] S [6] R [6]
PanR0702: R [6] R [6] R [6] R [6]
PanR0703: R [6] R [6] R [6] R [6]
PanR0704: S [6] R [6] S [6] R [6]
PanR0707: R [6] R [6] R [6] R [6]
PanR0708: S [6] R [6] S [6] R [6]
PanR0801: R [6] R [6] S [6] R [6]
PanR0802: R [6] R [6] R [6] R [6]
PanR0803: R [6] R [6] S [6] R [6]
PanR0804: R [6] R [6] S [6] R [6]
PanR0805: S [6] R [6] S [6] R [6]
PanR0902: R [6] R [6] R [6] R [6]
PanR0903: S [6] R [6] S [6] R [6]
PanR0904: R [6] R [6] R [6] R [6]
PanR0906: S [6] R [6] S [6] R [6]
PanR0907: R [6] R [6] R [6] R [6]
PanR0908: R [6] R [6] R [6] R [6]
PanR0909: R [6] R [6]
PanR1005: R [6] R [6] R [6] R [6]
PanR1006: R [6] R [6] S [6] R [6]
PanR1007: S [6] R [6] S [6] R [6]
PanR1101: R [6] R [6]
R1207: R [3] S [3] R [3]
R1390: S [3] S [3] R [3]
R1441: S [3] S [3] R [3]
R1505: R [3] S [3] R [3]
R1746: R [3] S [3] R [3]
R1842: S [3] S [3] R [3]
R1909: R [3] R [3] R [3]
RGTB327: R [11]
RGTB423: R [11] R [11] R [11]
S96-129: R : 4 [14] S [14] S [14] R [14] S [14] S [14] S [14]
SP21: R [15] R [15] R [15] R [15] R [15] R [15] R [15] S [15] S [15]
SUMu001: S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11]
SUMu002: S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11]
SUMu003: S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11]
SUMu004: S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11]
SUMu005: S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11]
SUMu006: S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11]
SUMu007: S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11]
SUMu008: S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11]
SUMu009: S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11]
SUMu010: S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11]
SUMu011: S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11]
SUMu012: S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11]
T17: S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11]
T46: S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11]
T92: S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11] S [11]
UM 1072388579: R [9] R [9] S [9] R [9] R [9] R [9] R [9] R [9] R [9] R [9] S [9]
W-148: R [2] R [2]
WX1: R [22] R [22] R [22] R [22] R [22] S [22] S [22] S [22]
WX3: R [22] R [22] R [22] R [22] R [22] S [22] S [22] S [22]
X122: R [3] S [3] R [3] R [3] S [3] S [3]
X132: R [3] R [3] S [3] R [3] S [3] R [3] R [3] R [3] S [3]
X156: R [3] R [3] S [3] R [3] S [3] R [3] R [3] R [3] S [3]
X189: S [3] R [3] R [3] R [3] R [3] R [3] R [3] S [3]
X28: R [3] R [3] S [3] R [3] R [3] R [3] R [3] R [3] S [3]
X29: R [3] R [3] S [3] R [3] S [3] R [3] S [3] R [3] S [3]
X85: R [3] S [3] R [3] R [3] R [3] S [3] R [3] R [3]
XDR1219: R [22] R [22] R [22] R [22] R [22] R [22] R [22] R [22]
XDR1221: R [22] R [22] R [22] R [22] R [22] R [22] R [22] R [22]

References

1. Elena N. Ilina, Egor A. Shitikov, Larisa N. Ikryannikova, Dmitry G. Alekseev, Dmitri E. Kamashev, Maja V. Malakhova, Tatjana V. Parfenova, Maxim V. Afanasev, Dmitry S. Ischenko, Nikolai A. Bazaleev, Tatjana G. Smirnova, Elena E. Larionova, Larisa N. Chernousova, Alexey V. Beletsky, Andrei V. Mardanov, Nikolai V. Ravin, Konstantin G. Skryabin, and Vadim M. Govorun. Comparative genomic analysis of Mycobacterium tuberculosis drug resistant strains from Russia. PLoS ONE, 8(2):e56577, February 2013.
2. Broad Institute. Broad Institute tuberculosis.
3. Thomas R. Ioerger, Yicheng Feng, Xiaohua Chen, Karen M. Dobos, Thomas C. Victor, Elizabeth M. Streicher, Robin M. Warren, Nicolaas C. Gey van Pittius, Paul D. Van Helden, and James C. Sacchettini. The non-clonality of drug resistance in Beijing-genotype isolates of Mycobacterium tuberculosis from the Western Cape of South Africa. BMC Genomics, 11(1):670, November 2010. PMID: 21110864.

SLIDE 18

Phenotype data for the mtu_broad dataset

[Figure: bar chart of the number of isolates per drug (axis: # isolates, ticks from 200 to 1400), split into susceptible, resistant and intermediate-resistant strains. Drugs shown: Streptomycin, Ethambutol, Pyrazinamide, Ethionamide, Capreomycin, Kanamycin, Amikacin, Rifampicin, Ciprofloxacin, Isoniazid, Levofloxacin, para-aminosalicylic acid, Ofloxacin, Rifabutin, Clarithromycin, Gatifloxacin, Moxifloxacin, Thioacetazone, Prothionamide, Cycloserine, Amoxiclav, Linezolid, Clofloxacin.]

SLIDE 19

Gold standard associations

We retrieve the gold standard associations from the TBDreamDB database for: Rifampicin, Isoniazid, Fluoroquinolones, Ethambutol, Pyrazinamide, Streptomycin.

drug name          gene name    positions
Fluoroquinolones   gyrA         90, 91, 94, 102, 126
                   gyrB         538
Ethambutol         embB         306, 406, 497
Isoniazid          ahpC         -46, -39, 21
                   fabG1-inhA   -15, -8
                   kasA         269
                   katG         315
Rifampicin         rpoB         432, 435, 441, 445, 450, 452
Streptomycin       rpsL         43, 88
                   rrs          492, 513, 514, 517, 907
Pyrazinamide       pncA         -11, 7, 10, ... (60 in total)

SLIDE 20

Comparison on mtu173 and mtu_broad datasets

[Figure: four bar plots of AUC for the scores MI, OR, H, RBM(MI, OR, H), WS, TGH, RBM(WS, TGH) and RBM(ALL). Panel titles:
◮ AUC; mtu173; positives: 94; negatives: 860
◮ AUC; mtu_broad; positives: 212; negatives: 667
◮ AUC; mtu173; positives: 39; negatives: 452
◮ AUC; mtu_broad; positives: 74; negatives: 475]

Figure: High-confidence mutations from TBDreamDB are used as positives.

SLIDE 21

Top-scoring mutations on the mtu173 dataset

drug name          gene id   gene name   mutation           all   h.c.   TGH
Fluoroquinolones   Rv0006    gyrA        D94H1A5N2Y2G12     Y     Y      14.184
Isoniazid          Rv1908c   katG        S315N1G2T75        Y     Y      9.045
Rifampicin         Rv0667    rpoB        S450L71            Y     Y      8.602
Streptomycin       Rv0682    rpsL        K43R15             Y     Y      8.323
Ethambutol         Rv3795    embB        M306L1I32V18       Y     Y      8.250
Isoniazid          Rv1483    fabG1       C-15T30            Y     Y      5.845
Rifampicin         Rv0667    rpoB        D435Y2F5V11G3A1    Y     Y      5.040
Streptomycin       Rv0682    rpsL        K88R5M1            Y     Y      4.164
Ethambutol         Rv3795    embB        E504G1D1           N     N      3.331
Pyrazinamide       Rv2043c   pncA        H51P1              Y     Y      2.708
Pyrazinamide       Rv2043c   pncA        W68L1              Y     Y      2.708
Rifampicin         Rv0667    rpoB        H445D8Y2R1         Y     Y      2.530
Streptomycin       Rvnr01    rrs         G1108C2            N     N      1.717
Ethambutol         Rv3795    embB        D869G1             N     N      1.688
Ethambutol         Rv3795    embB        A505T1             N     N      1.688
Ethambutol         Rv3795    embB        D1024N1            Y     N      1.688
Fluoroquinolones   Rv0005    gyrB        N538T1             Y     Y      1.685
Fluoroquinolones   Rv0006    gyrA        S91P1              Y     Y      1.685
Fluoroquinolones   Rv0005    gyrB        T539I1             N     N      1.685
Streptomycin       Rvnr01    rrs         A1401G17           Y     N      1.288
Ethambutol         Rv3795    embB        Y334H2             Y     N      1.054
Ethambutol         Rv3795    embB        Q497R2             Y     Y      1.054
Rifampicin         Rv0667    rpoB        E250G3             N     N      1.047
Fluoroquinolones   Rv0006    gyrA        A90V6G3            Y     Y      1.035
Streptomycin       Rvnr01    rrs         C517T33            Y     Y      0.915
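The entries in the mutation column pack several values into one string. The Python sketch below splits such a label under an assumed reading: a reference residue, a position (negative for upstream positions), and then alternating variant residues and counts of strains carrying each variant, so that `D94H1A5N2Y2G12` would mean D94H in 1 strain, D94A in 5, and so on. This interpretation and the helper `parse_mutation` are assumptions made for illustration and are not confirmed by the slides.

```python
import re

# Reference residue, signed position, then one or more residue/count pairs.
MUTATION_RE = re.compile(r"^([A-Z])(-?\d+)((?:[A-Z]\d+)+)$")


def parse_mutation(label):
    """Split a label such as 'D94H1A5N2Y2G12' into its assumed parts."""
    m = MUTATION_RE.match(label)
    if m is None:
        raise ValueError(f"unrecognized mutation label: {label!r}")
    ref, pos, rest = m.group(1), int(m.group(2)), m.group(3)
    variants = {alt: int(count)
                for alt, count in re.findall(r"([A-Z])(\d+)", rest)}
    return ref, pos, variants


print(parse_mutation("D94H1A5N2Y2G12"))
print(parse_mutation("C-15T30"))
```

Read this way, the first row of the table would cover 22 strains with some substitution at gyrA position 94.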

SLIDE 22

Top-scoring mutations on the mtu_broad dataset

drug name          gene id   gene name   mutation                  all   h.c.   TGH
Fluoroquinolones   Rv0006    gyrA        D94Y6H2A26G78N14          Y     Y      128.323
Rifampicin         Rv0667    rpoB        S450L743W22               Y     Y      72.284
Ethambutol         Rv3795    embB        M306T1L16V290I313         Y     Y      70.217
Fluoroquinolones   Rv0006    gyrA        A90G2V46                  Y     Y      41.699
Streptomycin       Rv0682    rpsL        K43R228                   Y     Y      30.012
Isoniazid          Rv1908c   katG        S315T895G2I3R3N27         Y     Y      27.966
Ethambutol         Rv3795    embB        Q497H5K18P10R43           Y     Y      17.081
Streptomycin       Rv0682    rpsL        K88Q1R28T32M7             Y     Y      16.327
Fluoroquinolones   Rv0005    gyrB        N538K1S1T9D2              Y     Y      12.605
Rifampicin         Rv0667    rpoB        H445P2Q2L27Y53R42D25N7    Y     Y      12.252
Streptomycin       Rvnr01    rrs         A1401G254                 Y     N      9.509
Streptomycin       Rvnr01    rrs         A514C90                   Y     Y      8.940
Pyrazinamide       Rv2043c   pncA        T135A1P22                 Y     N      8.814
Fluoroquinolones   Rv0006    gyrA        S91P9                     Y     Y      7.557
Rifampicin         Rv0667    rpoB        D435H1N2A2Y27G3V140       Y     Y      7.480
Ethambutol         Rv3795    embB        G406C3A68D52S43           Y     Y      7.057
Pyrazinamide       Rv2043c   pncA        T-11G3C24                 Y     Y      6.766
Fluoroquinolones   Rv0006    gyrA        D89G2N4                   Y     N      6.253
Pyrazinamide       Rv2043c   pncA        L120P20R5                 Y     N      6.146
Streptomycin       Rvnr01    rrs         C517T26                   Y     Y      5.169
Pyrazinamide       Rv2043c   pncA        Q10H3R10P12               Y     Y      5.053
Pyrazinamide       Rv2043c   pncA        V139M3G2A7L1              Y     Y      5.053
Ethambutol         Rv3795    embB        D328G5H1Y9                Y     N      5.032
Streptomycin       Rvnr01    rrs         A908C7G1                  Y     N      4.779
Pyrazinamide       Rv2043c   pncA        D12E1G5N1A12              Y     Y      4.725

SLIDE 23

Putative compensatory mutations

Recent publications reporting putative compensatory mutations in M. tuberculosis:

◮ Whole-genome sequencing of rifampicin-resistant Mycobacterium tuberculosis strains identifies compensatory mutations in RNA polymerase genes; Nature Genetics; 2012
◮ Putative Compensatory Mutations in the rpoC Gene of Rifampin-Resistant Mycobacterium tuberculosis Are Associated with Ongoing Transmission; Antimicrobial Agents and Chemotherapy; 2013
◮ Evolution and transmission of drug-resistant tuberculosis in a Russian population; Nature Genetics; 2014

SLIDE 24

Putative compensatory mutations

[Figure: positions of putative compensatory mutations along rpoA (axis 1 to 347), rpoB (axis 1 to 1172) and rpoC (axis 1 to 1314), with the RRDR marked. Legend: casali evolution 2014, mtu_broad, de vos putative 2013, comas whole-genome 2012, mtu173.]

Interestingly, several mutations identified by GWAMAR have also been reported in at least one of the papers:

rpoA: G31S/A

rpoB: P45S/L, L731P, E761D, R827C, H835P/R

rpoC: G332R/S, V431M, G433C/S, V483G/A, W484G, I491T/V, L527V, N698K, A734V

SLIDE 25

Summary

◮ The fast-growing number of fully sequenced bacterial strains enables us to develop and test new methods for identifying drug resistance-associated genes and mutations.
◮ We developed and implemented GWAMAR, a new framework for detection of drug resistance-associated mutations. This software is available at the project website: http://bioputer.mimuw.edu.pl/figures/gwamar/.
◮ We proposed a new association score, called TGH, which employs phylogenetic information. It outperforms the standard tree-ignorant scores, but is more computationally expensive.
◮ Applying our approach, we identified some novel putative drug resistance-associated mutations.
◮ Possible future directions of research include: classification of mutations into primary and secondary, grouping of mutations which are close together, and incorporation of PPI networks.

SLIDE 26

Thank you!

You are welcome to give comments or ask questions.
