Effects of Sequencing Errors on geno2pheno[coreceptor]

Effects of Sequencing Errors on geno2pheno[coreceptor]. Alejandro Pironti, Saleta Sierra, Rolf Kaiser, Thomas Lengauer and Nico Pfeifer. Computational Biology and Applied Algorithmics, Max Planck Institute for Informatics. April 18, 2013.


SLIDE 1

Effects of Sequencing Errors on geno2pheno[coreceptor]

Alejandro Pironti, Saleta Sierra, Rolf Kaiser, Thomas Lengauer and Nico Pfeifer

Computational Biology and Applied Algorithmics
Max Planck Institute for Informatics
April 18, 2013

SLIDE 2

April 18, 2013 Alejandro Pironti

Motivation

  • How safe is a geno2pheno[coreceptor] prediction?
  • What happens if the submitted sequence contains (editing) errors?
  • Do sequence errors have the same influence on X4 and R5 predictions?
  • What is the influence of cut-offs in this context?
SLIDE 3

Materials and Methods

  • 70,644 HIV-1 nucleotide sequences:
      – Non-duplicated V3 regions of the ENV gene
      – Los Alamos National Laboratory Sequence Database
  • Dataset for in-silico experiment 1:
      – All sequences in a single dataset
  • Datasets for in-silico experiment 2:
      – Build 6 datasets containing 1000 sequences each
      – Choose sequences at random

In-silico Experiment 1:

Exchange each position in each sequence of the dataset in silico.

      – Replace the original nucleotide with another nucleotide or an IUPAC ambiguity code
      – Evaluate with geno2pheno[coreceptor]

In-silico Experiment 2:

Introduce one, two or three random changes into each sequence.

      – Position(s) chosen at random
      – Differentiate between nucleotides and ambiguity codes

In both experiments, sequences with alignment errors are discarded.
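The exhaustive exchange in experiment 1 can be sketched as follows. This is only an illustration under stated assumptions: the function name `single_position_variants` is hypothetical, and the slides do not say which IUPAC ambiguity codes were used, so the sketch assumes all eleven (R, Y, S, W, K, M, B, D, H, V, N). Submitting each altered sequence to geno2pheno[coreceptor] is not modelled here.

```python
from typing import Iterator, Tuple

NUCLEOTIDES = "ACGT"
# Assumption: all eleven IUPAC ambiguity codes are candidate replacements.
AMBIGUITY_CODES = "RYSWKMBDHVN"
ALPHABET = NUCLEOTIDES + AMBIGUITY_CODES


def single_position_variants(seq: str) -> Iterator[Tuple[int, str, str]]:
    """Yield (position, replacement, altered_sequence) for every possible
    exchange of a single character in seq."""
    seq = seq.upper()
    for pos, original in enumerate(seq):
        for replacement in ALPHABET:
            if replacement == original:
                continue  # an exchange must change the character
            yield pos, replacement, seq[:pos] + replacement + seq[pos + 1:]


# Toy input, not a real V3 sequence.
variants = list(single_position_variants("ACGT"))
```

Under these assumptions, a sequence made up only of A, C, G and T yields 14 altered sequences per position.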

SLIDE 4

Evaluation of Unchanged Sequences

[Figure: sequence logo for the 5 most frequent amino acids. Height of letter is proportional to frequency. Color: see key to the right.]

[Figure: histogram of the original FPRs, with the average FPR marked.]

  • 65,309 sequences aligned correctly.
  • Average FPR: 44.65 (SD=33)
SLIDE 5

In-silico Experiment 1: Altered Sequence FPRs

  • Original average FPR: 44.65 (SD=33)
  • Altered average FPR: 42.18 (SD=34)

[Figure: comparison of the FPR histograms for the unchanged and the altered sequences.]

SLIDE 6

In-silico Experiment 1: Mean FPR Shifts by Position

[Figure: mean FPR shift by position, with amino acid positions 11 and 25 marked.]

On average:

  • Changes at 64 positions lower the FPR
  • Changes at 41 positions increase the FPR
SLIDE 7

In-silico Experiment 1: Mean FPR Shifts by Nucleotide

SLIDE 8

In-silico Experiment 1: Effect of Cut-Offs on Predicted Tropism

Cut-offs: FPR ≤ 5: X4, 5 < FPR < 15: Intermediate, FPR ≥ 15: R5

Data                 X4                 Intermediate       R5
Original Sequences   10,484 (16%)       6,157 (9%)         48,668 (75%)
Altered Sequences    16,538,450 (17%)   12,109,891 (13%)   66,396,041 (70%)

Cut-offs: FPR < 10: X4, FPR ≥ 10: R5

Data                 X4                 R5
Original Sequences   13,181 (20%)       52,128 (80%)
Altered Sequences    23,625,020 (25%)   71,419,362 (75%)

Cut-offs: FPR < 20: X4, FPR ≥ 20: R5

Data                 X4                 R5
Original Sequences   20,385 (31%)       44,924 (69%)
Altered Sequences    34,047,123 (36%)   60,997,259 (64%)
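The three cut-off schemes on this slide map an FPR value to a predicted tropism. A minimal sketch of that mapping is given below; the function names are hypothetical and only the thresholds come from the slide.

```python
def classify_three_class(fpr: float) -> str:
    """FPR <= 5: X4, 5 < FPR < 15: Intermediate, FPR >= 15: R5."""
    if fpr <= 5:
        return "X4"
    if fpr < 15:
        return "Intermediate"
    return "R5"


def classify_two_class(fpr: float, cutoff: float) -> str:
    """FPR < cutoff: X4, FPR >= cutoff: R5 (cutoff is 10 or 20 on the slide)."""
    return "X4" if fpr < cutoff else "R5"


# The same FPR can receive different labels depending on the scheme.
example_fpr = 12.0
labels = (
    classify_three_class(example_fpr),
    classify_two_class(example_fpr, 10),
    classify_two_class(example_fpr, 20),
)
```

The example value of 12 is arbitrary and only illustrates that a sequence near the thresholds can be called Intermediate, R5 or X4 depending on the scheme chosen.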

SLIDE 9

In-silico Experiment 1: Effect of Cut-Offs on Predicted Tropism

[Figure: probabilities of a switch in predicted tropism, one panel per cut-off scheme. Panel labels: X4, R5, Int; X4, R5; X4, R5.]

Cut-off schemes:

  • FPR ≤ 5: X4, 5 < FPR < 15: Intermediate, FPR ≥ 15: R5
  • FPR < 10: X4, FPR ≥ 10: R5
  • FPR < 20: X4, FPR ≥ 20: R5

Values shown in the figure: 0.00 0.04 0.96 0.07 0.92 0.01 0.16 0.14 0.70 0.02 0.98 0.06 0.94 0.93 0.07 0.90 0.10

$$P(T_{\text{switch}} = s \mid T_{\text{original}} = o) = \frac{P(T_{\text{switch}} = s \,\cap\, T_{\text{original}} = o)}{P(T_{\text{original}} = o)}$$
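The switching probabilities on this slide are conditional on the originally predicted tropism. One way such values could be estimated from pairs of (original, altered) predictions is sketched below. The function name `switch_probabilities` is hypothetical, and the label pairs are toy data, not values from the study.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def switch_probabilities(
    pairs: Iterable[Tuple[str, str]]
) -> Dict[Tuple[str, str], float]:
    """Estimate P(altered = s | original = o) as the joint count of (o, s)
    divided by the count of o."""
    pairs = list(pairs)
    joint = Counter(pairs)                    # counts of (original, altered)
    marginal = Counter(o for o, _ in pairs)   # counts of original alone
    return {(o, s): n / marginal[o] for (o, s), n in joint.items()}


# Toy data: three R5 sequences stay R5, one switches to X4, two X4 stay X4.
toy_pairs = [("R5", "R5")] * 3 + [("R5", "X4")] + [("X4", "X4")] * 2
probs = switch_probabilities(toy_pairs)
```

For each original tropism, the estimated probabilities sum to 1.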

SLIDE 10

In-silico Experiment 2: Results

[Figure: bar chart of % switches (axis from 1 to 10) for six experiments: 1, 2 and 3 nucleotide changes; 1, 2 and 3 ambiguity changes. Bars: change in predicted tropism to X4, change in predicted tropism to R5.]

Cut-off: FPR < 20: X4, FPR ≥ 20: R5

Each pair of bars is one experiment with 1000 sequences. One, two or three nucleotide changes were introduced to each sequence at random. Changes were either nucleotides or ambiguity codes.
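The random changes in experiment 2 might be introduced along the following lines. This is a sketch only: the function name `introduce_random_changes` is hypothetical, and it assumes distinct positions drawn uniformly, uniform choice of the replacement character, and all eleven IUPAC ambiguity codes as candidates, none of which the slides specify.

```python
import random

NUCLEOTIDES = "ACGT"
# Assumption: all eleven IUPAC ambiguity codes are candidate replacements.
AMBIGUITY_CODES = "RYSWKMBDHVN"


def introduce_random_changes(
    seq: str, n_changes: int, alphabet: str, rng: random.Random
) -> str:
    """Return a copy of seq with n_changes distinct positions replaced by a
    different character drawn from alphabet."""
    chars = list(seq.upper())
    # Sample positions without replacement so each change hits a new site.
    for pos in rng.sample(range(len(chars)), n_changes):
        candidates = [c for c in alphabet if c != chars[pos]]
        chars[pos] = rng.choice(candidates)
    return "".join(chars)


rng = random.Random(0)   # fixed seed for reproducibility
toy_seq = "CTGACAAGAC"   # toy input, not a real V3 sequence
with_nucleotides = introduce_random_changes(toy_seq, 3, NUCLEOTIDES, rng)
with_ambiguities = introduce_random_changes(toy_seq, 3, AMBIGUITY_CODES, rng)
```

Passing the alphabet as a parameter keeps the nucleotide and ambiguity-code experiments separate, matching the six experiments shown in the chart.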

SLIDE 11

Conclusions

  • MVC prescription with genotypic geno2pheno[coreceptor] tropism determination is safe (European coreceptor proficiency panel)
  • Changes in predicted tropism from R5 to X4 are more frequent than changes from X4 to R5
  • FPR shifts can vary depending on nucleotide position
  • Changes to unique nucleotides cause larger shifts than changes to ambiguity codes
  • The importance of accurate base calling is underlined
