Prediction of HIV viral tropism based on NGS data (Nico Pfeifer, Max Planck Institute for Informatics)



SLIDE 1

Prediction of HIV viral tropism based on NGS data

Nico Pfeifer

Max Planck Institute for Informatics

SLIDE 2
SLIDE 3

Cell entry

Wu et al. Structures of the CXCR4 Chemokine GPCR with Small-Molecule and Cyclic Peptide Antagonists Science 19 November 2010: 330 (6007), 1066-1071.

SLIDE 4

V3 loop binds to coreceptor

Wu et al. Structures of the CXCR4 Chemokine GPCR with Small-Molecule and Cyclic Peptide Antagonists Science 19 November 2010: 330 (6007), 1066-1071.

SLIDE 5

HIV tropism

  • Relevant coreceptors: CCR5 and CXCR4
  • Viruses that can only use the CCR5 coreceptor: R5
  • Viruses that can use the CXCR4 coreceptor: X4-capable

SLIDE 6

Entry inhibitors

  • Maraviroc
    – CCR5 antagonist
    – Approved for patient treatment
  • AMD-3100
    – CXCR4 antagonist
    – Never approved for HIV treatment

SLIDE 7
SLIDE 8

Want to know which patients benefit from taking maraviroc

  • Assays for tropism determination
    – Trofile
    – ESTA (enhanced sensitivity Trofile assay)
    – Disadvantages:
      • Long turnaround
      • Require large sample volume
  • Genetic tests (V3 loop of gp120)
    – Sanger data
    – Next Generation Sequencing (NGS) data

SLIDE 9

Tools to predict tropism from genetic data

  • Sanger data
    – geno2pheno[coreceptor] [1]
    – WetCat [2]
    – WebPSSM [3]
  • NGS data
    – Variants of geno2pheno[coreceptor] and WebPSSM [4]

1. Lengauer T, Sander O, Sierra S, Thielen A, Kaiser R. Nat Biotechnol., 2007
2. Pillai, S. et al. AIDS Res. Hum. Retroviruses 19, 145–149
3. Jensen, M.A. et al. J. Virol. 77, 13376–13388
4. Swenson, L. C. et al. J Infect Dis. (2011) 203 (2): 237-245

SLIDE 10

How do we represent the virus population inside a patient (V3 loop sequences)?

CIRLNNNTREGVHMGPGGAIYATGQIIGNIRQAHC
CT---N--REGVHMGPG-AIYATGQIIGNIRQAHC
CTR-NN-TREGVHMGPG-AIYATGQIIGNIRQAHC
CTR-NN-TREGVHMGPGGAIYATGQIIGNIRQAHC
CTRANNNTREGVHMGPGGAIYATGQIIGNIRQAHC
CTRLNDNTSEHISIGPGRAWVAARNIIGDIRKAHC
CTRLNN-TSEHISIGPGRAWVAARNIIGD-RKAHC
CTRLNN-TSEHISIGPGRAWVAARNIIGDIRKAHC
CTRLNNNT-EHISIGPGRAWVAARNIIGDIRKAHC
CTRLNNNTGEHISIGPGRAWVAARNIIGDIRKAHC
CTRLNNNTNKHISIEPGRAWVAAREIKGDIRKAHC
CTRLNNNTNKHISIGLGRAWVAAREIKGDIRKAHC
CTRLNNNTNKHISIGPGKAWVAAREIKGDIRKAHC
CTRLNNNTNKHISIGPGRAWVAAR-IKRSIRKAHC
CTRLNNNTNKHISIGPGRAWVAARDIKGDIRKAHC
CTRLNNNTNKHISIGPGRAWVAAREI-GDIRKAHC
CTRLNNNTNKHISIGPGRAWVAAREIKGDIRKAHC
CTRLNNNTNKHISIGPGRAWVAAREIKGDIRKAHR
CTRLNNNTNKHISIGPGRAWVAAREIKGDMRKAHC
CTRLNNNTNKHISIGPGRAWVAAREIKGGIRKAHC
CTRLNNNTNKHISIGPGRAWVAARNIIGDIRKAHC
CTRLNNNTNKHISIGPGRAWVAARNIIGGIRKAHC
CTRLNNNTNKHISIGPGRAWVAARNVIGDIRKAHC
CTRLNNNTNKHISIGPGRAWVAARQIIGDIRKAHC
CTRLNNNTNKHISIGPGRTWVAARQIIGDIRKAHC
CTRLNNNTNKHISLGPGRAWVAARNIIGDIRKAHC
CTRLNNNTREGV-MGPG-AIYATGQIIGNIRQAHC
CTRLNNNTREGVHMGPG-AIYATGQIIGNIRQAHC
CTRLNNNTREGVHMGPG-AIYATGRIIGNIRQAHC
CTRLNNNTREGVHMGPGGAIHATGQIIGNIRQAHC
CTRLNNNTREGVHMGPGGAIYATGQIIGNIRQAHC
CTRLNNNTREGVHMGPGGAIYATGQIIGNIRQARC
CTRLNNNTREGVHMGPGGAIYATRQIIGNIRQAHC
CTRLNNNTREGVHMVPGGAIYATGQIIGNIRQAHC
CTRLNNNTRVGVHMGPGGAIYATGQIIGNIRQAHC
CTRLNNNTSE-ISIGPGRAWVAARNIIGDIRKAHC
CTRLNNNTSEHISIGPGRAWVAAREIKGDIRKAHC
CTRLNNNTSEHISIGPGRAWVAARN-IGDIRKAHC
CTRLNNNTSEHISIGPGRAWVAARNIIGDIRKAHC
CTRLNNNTSEHISIGPGRAWVAARNVIGDIRKAHC
CTRLNNNTSEHISIGPGRAWVVARNIIGDIRKAHC
CTRLNNNTSERISIGPGRAWVAARNVIGDIRKAHC
CTRLNNNTSKHISIGPGRAWVAARNIIGDIRKAHC
CTRLNSNTSEHISIGPGRAWVAARNIIGDIRKAHC
CTRLSNNTSEHISIGPGRAWVAARNIIGDIRKAHC
CTRPNNNTREGVHMGPGGAIYATGQIIGNIRQAHC
CTRPNNNTRRSIHIGPGRAFYAG---IGDIRQAHC
CTRPNNNTSEHISIGPGRAWVAARNIIGDIRKAHC
CTRPYANRKKSIHIGTG--FYTIKEIKGNVKQAYC
CTRPYANRKKSIHIGTGR-FYTIKEIKGNVKQAYC
CTRPYANRRKSIHIGTG--FYTIKEIKGNVKQAYC
CTRPYANRRKSIHIGTGR-FYTIKEIKGNVKQAYC
CTRPYANSRKSIHIGTG--FYTIKEIKGNVKQAYC
CTRVNNNTREGVHMGPG-AICATGQIIGNIRQAHC
CTRVNNNTREGVHMGPG-AIYATGQIIGNIRQAHC
CTRVNNNTREGVHMGPGGAIYATGQIIGNIRQAHC
YTRLNNNTSEHISIGPGRAWVAARNIIGDIRKAHC
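
One simple way to summarize such a read set is to collapse identical aligned V3 sequences and record how often each variant occurs. The sketch below is only an illustration under that assumption; the function name `variant_frequencies` and the four example reads are hypothetical and not taken from the talk.

```python
from collections import Counter

def variant_frequencies(reads):
    """Map each distinct aligned V3 sequence to its relative frequency."""
    counts = Counter(reads)
    total = sum(counts.values())
    # most_common() orders variants from most to least frequent
    return {seq: n / total for seq, n in counts.most_common()}

# Hypothetical toy read set for one patient (sequences copied from the alignment)
reads = [
    "CTRLNNNTREGVHMGPGGAIYATGQIIGNIRQAHC",
    "CTRLNNNTREGVHMGPGGAIYATGQIIGNIRQAHC",
    "CTRLNNNTNKHISIGPGRAWVAAREIKGDIRKAHC",
    "CTRPYANRKKSIHIGTG--FYTIKEIKGNVKQAYC",
]
freqs = variant_frequencies(reads)
for seq, freq in freqs.items():
    print(seq, round(freq, 2))
```

A frequency table like this keeps minority variants visible, which is the information a single consensus sequence would discard.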

SLIDE 11

Principal Component Analysis (PCA)

  • Represent axes of maximal variance (principal components)

Principal component 1 (PC1)

SLIDE 13

PCA

CTRLNNNTREGVHMGPGGAIYATGQIIGNIRQAHC
CTRLNNNTNKHISIGPGRAWVAAREIKGDIRKAHC
CTRLNNNTREGVHMGPGGAIYATGQIIGNIRQAHC
CTRLNNNTNKHISIGPGRAWVAAREIKGDIRKAHC
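
To make the PCA step concrete, the following sketch projects the four sequences above onto their first principal component. It rests on assumptions that the slides do not state: sequences are one-hot encoded per alignment position (20 amino acids plus the gap character), and PC1 is found by power iteration rather than a full eigendecomposition. All function names are hypothetical.

```python
import math

ALPHABET = "ACDEFGHIKLMNPQRSTVWY-"  # 20 amino acids plus the alignment gap

def one_hot(seq):
    """Encode an aligned sequence as a flat 0/1 vector (assumed encoding)."""
    return [1.0 if ch == a else 0.0 for ch in seq for a in ALPHABET]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def first_principal_component(rows, iterations=50):
    """Return (PC1 direction, per-row scores) via power iteration."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Start from the centered row with the largest norm so the start vector
    # lies in the row space of the data.
    v = list(max(centered, key=lambda r: dot(r, r)))
    for _ in range(iterations):
        scores = [dot(r, v) for r in centered]
        # w = X^T (X v), i.e. the (unscaled) covariance matrix applied to v
        w = [sum(s * r[j] for s, r in zip(scores, centered)) for j in range(d)]
        norm = math.sqrt(dot(w, w))
        if norm == 0.0:  # all rows identical: no variance to explain
            break
        v = [x / norm for x in w]
    return v, [dot(r, v) for r in centered]

seqs = [
    "CTRLNNNTREGVHMGPGGAIYATGQIIGNIRQAHC",
    "CTRLNNNTNKHISIGPGRAWVAAREIKGDIRKAHC",
    "CTRLNNNTREGVHMGPGGAIYATGQIIGNIRQAHC",
    "CTRLNNNTNKHISIGPGRAWVAAREIKGDIRKAHC",
]
pc1, scores = first_principal_component([one_hot(s) for s in seqs])
print([round(s, 3) for s in scores])
```

With only two distinct variants, all variance lies along the direction separating them, so identical sequences receive identical PC1 scores and the two variants land on opposite sides of zero.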

SLIDE 14

Next Generation Multi-Instance Learning

Support Vector Machine with normalized set kernel:

CTRLNNNTREGVHMGPGGAIYATGQIIGNIRQAHC
CTRLNNNTNKHISIGPGRAWVAAREIKGDIRKAHC
CTRLNNNTNKHISIGPGRAWVAARNIIGDIRKAHC
CTRLNNNTSEHISIGPGRAWVAARNIIGDIRKAHC
CTRPYANRRKSIHIGTGRAFYTIKEIKGNVKQAYC
CTRLNNNTREGVHMGPGRAIYATGQIIGNIRQAHC
CTRLNNNTSEHISIGPGRAWVAARNIIGDIRKAHC
CTRLNNNTNKHISIGPGRAWVAAREIKGDIRKAHC
CTRLNNNTREHISIGPGGAWVAAREIKGDIRKAHC
CTRLNNNTNKHISIGPGRAWVAARQIIGDIRKAHC
CTRLNNNTNKHISMGPGRAWVATGQIIGDIRQAHC
CTRLNNNTNKHISIGPGRAWVAARNIIGDIRKAHC

Patient 1    Patient 2

\[ k_{\mathrm{NMMK}}(X_i, X_j) = \sum_{x_i \in X_i,\; x_j \in X_j} \frac{k_s(x_i, x_j)}{\sqrt{k_s(x_i, x_i)\, k_s(x_j, x_j)}} \]

Gärtner, T., Flach, P. A., Kowalczyk, A., Smola, A. J., Multi-Instance Kernels. International Conference on Machine Learning
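
The normalized set kernel on this slide sums a normalized instance kernel over all pairs of sequences drawn from two patients. A minimal sketch of that computation follows. The instance kernel `k_s` used here (number of agreeing alignment positions, i.e. a linear kernel on one-hot encodings) is an assumption chosen for illustration; the talk does not say which instance kernel was used, and the two patient sets are hypothetical.

```python
import math

def k_s(x, y):
    """Illustrative instance kernel (an assumption, not the talk's kernel):
    number of aligned positions at which the two sequences agree."""
    return float(sum(a == b for a, b in zip(x, y)))

def normalized_set_kernel(X_i, X_j, kernel=k_s):
    """Sum of normalized instance-kernel values over all cross-set pairs."""
    total = 0.0
    for x_i in X_i:
        for x_j in X_j:
            total += kernel(x_i, x_j) / math.sqrt(
                kernel(x_i, x_i) * kernel(x_j, x_j)
            )
    return total

# Hypothetical read sets for two patients
patient_1 = [
    "CTRLNNNTREGVHMGPGGAIYATGQIIGNIRQAHC",
    "CTRLNNNTNKHISIGPGRAWVAAREIKGDIRKAHC",
]
patient_2 = ["CTRLNNNTREGVHMGPGGAIYATGQIIGNIRQAHC"]

print(normalized_set_kernel(patient_1, patient_2))
```

Because each pair is normalized individually, every term lies between 0 and 1 for this instance kernel, so no single sequence can dominate the sum through its own self-similarity. The resulting patient-by-patient kernel matrix is what a kernel method such as an SVM would consume.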

SLIDE 17

Support Vector Machine with normalized set kernel:

Improve predictions for last generation sequencing

CTRLNNNTREGVHMGPGGAIYATGQIIGNIRQAHC
CTRLNNNTSEHISIGPGRAWVAARNIIGDIRKAHC
CTRLNNNTNKHISIGPGRAWVAAREIKGDIRKAHC
CTRLNNNTREHISIGPGGAWVAAREIKGDIRKAHC
CTRLNNNTNKHISIGPGRAWVAARQIIGDIRKAHC
CTRLNNNTNKHISMGPGRAWVATGQIIGDIRQAHC
CTRLNNNTNKHISIGPGRAWVAARNIIGDIRKAHC

Patient 1    Patient 2

\[ k_{\mathrm{NMMK}}(X_i, X_j) = \sum_{x_i \in X_i,\; x_j \in X_j} \frac{k_s(x_i, x_j)}{\sqrt{k_s(x_i, x_i)\, k_s(x_j, x_j)}} \]

Gärtner, T., Flach, P. A., Kowalczyk, A., Smola, A. J., Multi-Instance Kernels. International Conference on Machine Learning

SLIDE 20

Data

  • Maraviroc versus Optimized Therapy in Viremic Antiretroviral Treatment-Experienced Patients (MOTIVATE) + 1029
    – 876 patients with NGS data of V3 loop
      • Also patients with X4-capable viruses (according to Trofile)
    – Treatment: maraviroc once-daily/twice-daily
    – Viral loads measured at various time points

Swenson, L. C. et al. J Infect Dis. (2011) 203 (2): 237-245

SLIDE 21

Performance comparison

  • Predict class label: treatment success
  • Compare measures in patient classes
    – Median log10 reduction in pVL after eight weeks
  • 5-fold nested cross-validation
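
The structure of nested cross-validation can be sketched as follows: an inner cross-validation on each outer training set selects the hyperparameter, and the outer test fold is used only once to score the refitted model. The toy threshold "model", the hyperparameter `shift`, and the synthetic data are all hypothetical stand-ins; the talk's actual classifier and parameter grid are not given on the slide.

```python
def k_folds(indices, k):
    """Split a list of indices into k interleaved folds."""
    return [indices[f::k] for f in range(k)]

def fit_threshold(xs, ys, idx, shift):
    """Toy model: midpoint between the class means, moved by a hyperparameter."""
    pos = [xs[i] for i in idx if ys[i]]
    neg = [xs[i] for i in idx if not ys[i]]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2 + shift

def accuracy(threshold, xs, ys, idx):
    """Fraction of samples in idx whose label equals (x >= threshold)."""
    return sum((xs[i] >= threshold) == ys[i] for i in idx) / len(idx)

def cv_score(xs, ys, idx, shift, k):
    """Mean validation accuracy of one hyperparameter value over k folds of idx."""
    scores = []
    for val in k_folds(idx, k):
        held_out = set(val)
        train = [i for i in idx if i not in held_out]
        scores.append(accuracy(fit_threshold(xs, ys, train, shift), xs, ys, val))
    return sum(scores) / k

def nested_cv(xs, ys, shifts, k=5):
    """Outer loop estimates performance; inner loop picks the hyperparameter."""
    all_idx = list(range(len(xs)))
    outer_scores = []
    for test in k_folds(all_idx, k):
        held_out = set(test)
        train = [i for i in all_idx if i not in held_out]
        # Hyperparameter selection sees the outer training data only
        best = max(shifts, key=lambda s: cv_score(xs, ys, train, s, k))
        model = fit_threshold(xs, ys, train, best)
        outer_scores.append(accuracy(model, xs, ys, test))
    return outer_scores

# Synthetic one-dimensional data, for illustration only
xs = [i / 20 for i in range(20)]
ys = [x >= 0.5 for x in xs]
outer_scores = nested_cv(xs, ys, shifts=[-0.1, 0.0, 0.1])
print(outer_scores)
```

Keeping hyperparameter selection inside the outer training folds is what prevents the reported performance from being tuned on the data it is measured on.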
SLIDE 22

Median log10 reduction in pVL after eight weeks

                                      Classified as
Method                                Success   No success   Delta
geno2pheno-C [Sanger+]                2.4       2.1          0.3
geno2pheno-C [Sanger+] (NMMK)         2.4       1.6          0.8
geno2pheno-C [NGS−Sanger+]            2.4       1.7          0.7
geno2pheno-C [NGS−Sanger+] (NMMK)     2.4       1.2          1.2
geno2pheno-C [NGS] (both)             2.4       1.0          1.4
Swenson et al.                        (2.4)     (1.4)        1.0

Swenson, L. C. et al. J Infect Dis. (2011) 203 (2): 237-245

new
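
The delta column is the difference between the median log10 pVL reduction of patients classified as "success" and that of patients classified as "no success"; a larger delta means the prediction separates the two groups better. A minimal sketch of that calculation, using made-up per-patient values rather than trial data:

```python
from statistics import median

def delta_median_reduction(reductions, predicted_success):
    """Return (median in 'success' class, median in 'no success' class, delta)."""
    success = [r for r, s in zip(reductions, predicted_success) if s]
    no_success = [r for r, s in zip(reductions, predicted_success) if not s]
    m_success, m_no_success = median(success), median(no_success)
    return m_success, m_no_success, m_success - m_no_success

# Hypothetical log10 pVL reductions after eight weeks and predicted classes
reductions = [2.5, 2.4, 2.1, 1.2, 0.9, 1.5]
predicted_success = [True, True, True, False, False, False]

m_success, m_no_success, delta = delta_median_reduction(reductions, predicted_success)
print(m_success, m_no_success, round(delta, 1))
```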

SLIDE 25

Visualization

SLIDE 28

Interpretable prediction result

towards no success towards success

SLIDE 29

Summary and outlook

  • Significantly improved prediction performance for geno2pheno-C [NGS−Sanger+] with NMMK
  • Have to validate cut-offs on other data (also other subtypes)

SLIDE 30

Geno2pheno[structure]

  • Previous structure-based approaches only took 3D distances into account
  • Computations were very expensive
  • New method accounts for the 3D structure
  • Efficient method ⇒ Webserver available

Bozek K, Lengauer T, Sierra S, Kaiser R, Domingues FS (2013), PLoS Computational Biology 9:e1002977

SLIDE 31

Geno2pheno[structure]

Bozek K, Lengauer T, Sierra S, Kaiser R, Domingues FS (2013), PLoS Computational Biology 9:e1002977

SLIDE 32

Thanks to …

Thomas Lengauer Kasia Bozek Rolf Kaiser Saleta Sierra Alexander Thielen Hernan Valdez Charles Craig
