SLIDE 1

Novel method for estimating isotope incorporation using the ‘half-decimal place rule’

Ingo Fetzer

Department of Environmental Microbiology

useR! Conference 2009, Rennes

SLIDE 2

Problem

[Figure: mass spectrum, intensity (0.0 to 1.2e+7) versus mass (835 to 880), with peak groups labelled 0%, 50% and 100%]

Substrate fluxes in prokaryotes

  • Function
  • Activity
  • Identity
  • Interactions: competition, mutualism

SLIDE 3

Goal

Develop an algorithm for estimating 13C incorporation using the ‘half-decimal place rule’

SLIDE 4

‘Half-decimal place rule’ (HDPR)

Mann (1995)

[Figure: decimal places versus m/z [Da] of tryptic peptides of Helicobacter pylori, with series labelled 0% and 100%; Schmidt et al. 2003]

SLIDE 5

Outline

  • 1. Peptide mass calculation for 12C and 13C
  • 2. Estimation of 12C and 13C slopes (HDPR)
  • 3. Estimation of relative 13C incorporation rates (of user data)

Implemented in ‘R’ (R-project.org)

SLIDE 6

Flowchart Script 1

Peptide database → (selection criteria) → Peptide sequences → (+ AA molecular formula, atomic weights) → Peptide sequence masses

SLIDE 7

Peptide mass calculation for 12C and 13C

Dataset: M. tuberculosis H37Rv, Sanger Institute (ftp://ftp.sanger.ac.uk/pub/tb/sequences/TB.pep)

Virtual digestion with MS-Digest: 315,579 peptide fragments (amino acid sequences, length 2 – 40)

Selection criteria:

  • ChemScore ≥ 10
  • Missed cleavages = 0
  • Modifications = Null
  • Mol. weight ≤ 5000 Da

Result: 90,637 peptide sequences

SLIDE 8

Peptide mass calculation for 12C and 13C

SLIDE 9

Peptide mass calculation for 12C and 13C

Script 1:

  • 1. Reduction of dataset (ChemScore, modifications etc.): 315,579 → 90,637 sequences
  • 2a. Peptide mass calculation: sequence in DB + molecular formula of each AA → sum of C, H, N, O for each sequence

Why? Calculation of percentage 13C incorporation

Example: GAG, with G = C2H6NO2 and A = C3H8NO2, gives C7 H20 N3 O6
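The summation in step 2a can be sketched in a few lines. This is an illustrative Python sketch, not the author's R script: the table `AA_FORMULA` holds only the two formulas given on the slide (G and A), and the function name `sequence_composition` is hypothetical.

```python
from collections import Counter

# Elemental composition per amino acid, as written on the slide.
# Only G and A are given there; other residues would have to be added.
AA_FORMULA = {
    "G": {"C": 2, "H": 6, "N": 1, "O": 2},
    "A": {"C": 3, "H": 8, "N": 1, "O": 2},
}


def sequence_composition(sequence):
    """Sum the element counts of all residues in a peptide sequence."""
    total = Counter()
    for residue in sequence:
        # A residue missing from AA_FORMULA raises KeyError on purpose.
        total.update(AA_FORMULA[residue])
    return dict(total)


gag = sequence_composition("GAG")
print(gag)  # element counts for the slide's GAG example
```

For GAG this reproduces the C7 H20 N3 O6 sum shown on the slide.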

SLIDE 10

Peptide mass calculation for 12C and 13C

  • 2b. Peptide mass calculation

Sum of C, H, N, O of each sequence + atomic weights → molecular weights of sequences (with decimal residuals)

Atomic weights

12C = 12.000000 Da

13C = 13.003355 Da

N = 14.003074 Da

O = 15.994915 Da

H = 1.007825 Da

Example: C7 H20 N3 O6

12C = 242.135212 Da

13C = 249.158697 Da
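The worked example can be checked by multiplying each element count by its atomic weight. The following Python sketch is only an illustration of that arithmetic (the original scripts were written in R); the names `peptide_mass`, `ATOMIC_WEIGHT` and `CARBON_WEIGHT` are hypothetical.

```python
# Monoisotopic atomic weights in Da, as listed on the slide.
ATOMIC_WEIGHT = {"H": 1.007825, "N": 14.003074, "O": 15.994915}
CARBON_WEIGHT = {"12C": 12.000000, "13C": 13.003355}


def peptide_mass(composition, carbon="12C"):
    """Mass of an element-count dict with all carbon taken as 12C or 13C."""
    mass = composition.get("C", 0) * CARBON_WEIGHT[carbon]
    for element, weight in ATOMIC_WEIGHT.items():
        mass += composition.get(element, 0) * weight
    return mass


gag = {"C": 7, "H": 20, "N": 3, "O": 6}
m12 = peptide_mass(gag, "12C")
m13 = peptide_mass(gag, "13C")

print(round(m12, 6), round(m13, 6))          # masses in Da
print(round(m12 % 1, 6), round(m13 % 1, 6))  # decimal residuals
```

The two masses agree with the values on the slide; the part after the decimal point is the decimal residual used in the following steps.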

SLIDE 11

Peptide mass calculation for 12C and 13C

  • 2a. Peptide mass calculation

[Figure: reaction scheme of peptide bond formation between two protonated amino acids (side chains 1R and 2R), with H3O+ shown as product]

Count sum of C, H, N, O, S for each sequence + atomic weights → molecular weights of sequences (with decimal residuals)

MW(H3O+) = 19.01839 Da

Atomic weights

12C = 12.000000 Da

13C = 13.003355 Da

N = 14.003074 Da

O = 15.994915 Da

H = 1.007825 Da

S = 31.972071 Da

Result: molecular masses of 12C and 13C peptides
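The H3O+ value follows directly from the atomic weights. The Python sketch below checks it and then applies one possible reading of the reaction scheme, namely that one H3O+ is released per peptide bond. That reading is an assumption, not something the slide states, and the function names are hypothetical.

```python
# Monoisotopic atomic weights in Da, as listed on the slide (carbon as 12C).
ATOMIC_WEIGHT = {"C": 12.000000, "H": 1.007825, "N": 14.003074,
                 "O": 15.994915, "S": 31.972071}


def formula_mass(composition):
    """Mass of an element-count dict such as {"H": 3, "O": 1}."""
    return sum(ATOMIC_WEIGHT[el] * n for el, n in composition.items())


H3O_MASS = formula_mass({"H": 3, "O": 1})


def linked_peptide_mass(summed_composition, n_residues):
    """Mass of the linked peptide from the summed amino acid formulas.

    ASSUMPTION: each of the (n_residues - 1) peptide bonds releases one H3O+.
    """
    return formula_mass(summed_composition) - (n_residues - 1) * H3O_MASS


gag_sum = {"C": 7, "H": 20, "N": 3, "O": 6}  # summed formula of G + A + G
print(round(H3O_MASS, 5))
print(round(linked_peptide_mass(gag_sum, 3), 6))
```

Under this assumption the tripeptide GAG comes out as C7 H14 N3 O4, the composition of the singly protonated peptide.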

SLIDE 12

Flowchart Script 2

Peptide sequence masses → Transformation → Transf. peptide sequence masses → k-means clustering → Groupings → Transf. decimal residuals → m/z – DR plot → Linear fitting → Slope 0%/100% 13C

SLIDE 13

Estimation of 12C and 13C slopes

[Figure: decimal residuals (0.0 to 1.0) versus m/z (1000 to 6000)]

Script 2: m/zTemp = m/z - 1800 * DR
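The transformation is a one-line calculation once the decimal residual (DR) is taken as the fractional part of m/z. The Python sketch below illustrates it; the function names are hypothetical, and the comment on why the factor 1800 helps is an interpretation rather than something stated on the slide.

```python
import math


def decimal_residual(mz):
    """Fractional part of a mass, e.g. 242.135212 -> 0.135212."""
    return mz - math.floor(mz)


def transform(mz, factor=1800.0):
    """m/zTemp = m/z - factor * DR, with the factor 1800 from the slide.

    Interpretation (assumption): because DR grows by roughly 0.0005 per Da
    and wraps at 1, this shear pulls the points of one wrap close together
    and pushes different wraps apart, so that they can be clustered.
    """
    return mz - factor * decimal_residual(mz)


for mz in (242.135212, 1500.771, 2500.285):  # illustrative values only
    print(mz, round(decimal_residual(mz), 6), round(transform(mz), 3))
```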

SLIDE 14

Estimation of 12C and 13C slopes

[Figure: decimal residuals (0.0 to 1.0) versus m/zTemp (1000 to 6000)]

k-means clustering using kmeans() (Hartigan & Wong 1979)

SLIDE 15

Estimation of 12C and 13C slopes

[Figure: decimal residuals (0.0 to 1.0) versus m/zTemp (1000 to 6000), with the points grouped into clusters 1, 2 and 3]

k-means clustering using kmeans() (Hartigan & Wong 1979)
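The grouping step can be illustrated on synthetic data. The Python sketch below uses a plain Lloyd-style k-means on the one-dimensional m/zTemp values, not the Hartigan & Wong algorithm behind R's kmeans(). The synthetic masses, the helper `kmeans_1d` and the final unwrapping of the residuals (DR plus cluster index) are assumptions made for illustration.

```python
import math


def kmeans_1d(values, k, n_iter=100):
    """Lloyd-style k-means on 1-D data, started from evenly spaced centers."""
    lo, hi = min(values), max(values)
    centers = [lo + (hi - lo) * j / (k - 1) for j in range(k)]
    labels = []
    for _ in range(n_iter):
        # Assign every value to its nearest center.
        labels = [min(range(k), key=lambda j: abs(v - centers[j]))
                  for v in values]
        # Move every center to the mean of its members.
        updated = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            updated.append(sum(members) / len(members) if members else centers[j])
        if updated == centers:
            break
        centers = updated
    return labels, centers


# Synthetic peptides (not real data): nominal masses whose decimal part
# follows the 0% slope quoted in the slides.
SLOPE_12C = 0.000514
nominal = list(range(500, 5501, 50))
mz = [n + (SLOPE_12C * n) % 1 for n in nominal]

dr = [m - math.floor(m) for m in mz]               # decimal residuals
mz_temp = [m - 1800 * d for m, d in zip(mz, dr)]   # transformation from Script 2

labels, centers = kmeans_1d(mz_temp, k=3)
unwrapped = [d + lab for d, lab in zip(dr, labels)]  # DR continued beyond 1

print([round(c) for c in centers])
print(round(max(unwrapped), 3))
```

With the centers sorted in ascending order, the cluster index counts how often the decimal residual has wrapped past 1, so adding it to DR gives a continuous series suitable for a straight-line fit.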

SLIDE 16

Estimation of 12C and 13C slopes

[Figure: decimal residuals (0.0 to 3.5) versus m/z (1000 to 5000), with fitted lines for 0% and 100%]

100% slope = 0.000633

0% slope = 0.000514

Fitted with the lm() function (stats package)

SLIDE 17

Flowchart Script 3

User data → k-means clustering → Robust linear fitting → Slope user data

Slope user data + reference (Slope 0% + 100% 13C) → calculation → Estimation of relative 13C incorporation rates

SLIDE 18

Estimation of relative 13C incorporation rates

Script 3:

[Figure: decimal residuals (0.0 to 3.5) versus m/z (1000 to 5000), with the 0% and 100% reference lines and user data including ‘outliers’]

100% slope = 0.000633

0% slope = 0.000514

$DR = b \cdot m/z$ (with $a = 0$)

Influence of ‘outliers’ on slope estimation

SLIDE 19

Estimation of relative 13C incorporation rates

[Figure: decimal residuals (0.0 to 3.5) versus m/z (1000 to 5000) with least-squares fit]

Linear fitting: minimizing the error sum of squares

$$\min SSE = \min \sum_i \left( DR_i - b \cdot m/z_i \right)^2$$

Breakdown point value = 0
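For a line through the origin, setting the derivative of the SSE with respect to b to zero gives the closed form b = Σ(m/z · DR) / Σ(m/z)². The Python sketch below applies it to made-up points and shows what a breakdown point of 0 means in practice: a single bad point is enough to move the slope. The data and the function name are illustrative assumptions.

```python
def slope_through_origin(x, y):
    """Least-squares slope b for y = b * x (intercept a = 0)."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)


# Made-up points lying exactly on the 0% line from the slides.
mz = [1000.0, 2000.0, 3000.0, 4000.0, 5000.0]
dr = [0.000514 * m for m in mz]
clean = slope_through_origin(mz, dr)

# Same points, but the last decimal residual is off by one whole unit.
dr_outlier = dr[:]
dr_outlier[-1] += 1.0
shifted = slope_through_origin(mz, dr_outlier)

print(clean)    # slope recovered from the clean points
print(shifted)  # slope pulled upwards by the single outlier
```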

SLIDE 20

Estimation of relative 13C incorporation rates

[Figure: decimal residuals (0.0 to 3.5) versus m/z (1000 to 5000) with robust linear fit]

Robust linear fitting

  • Robust linear model (Huber 1973), M-estimator type
  • Iterative re-weighted least squares (IWLS; Huber 1981, Hampel et al. 1986)
  • rlm() function, MASS package (Venables & Ripley 2002)
  • Breakdown point value set = 0.5: > 50% of data points must change
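The idea behind iteratively re-weighted least squares can be sketched by hand. The Python code below is a stand-in written for illustration, not the rlm() implementation from MASS: it fits a slope through the origin with Huber weights (tuning constant 1.345) and a MAD-based scale, and the data are made up.

```python
import statistics


def weighted_slope(x, y, w):
    """Weighted least-squares slope for y = b * x."""
    num = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    den = sum(wi * xi * xi for wi, xi in zip(w, x))
    return num / den


def huber_slope(x, y, k=1.345, n_iter=100, tol=1e-12):
    """Huber M-estimate of the slope via iteratively re-weighted least squares."""
    b = weighted_slope(x, y, [1.0] * len(x))  # ordinary least squares start
    for _ in range(n_iter):
        residuals = [yi - b * xi for xi, yi in zip(x, y)]
        scale = statistics.median([abs(r) for r in residuals]) / 0.6745
        if scale == 0.0:
            break  # at least half of the points are fitted exactly
        # Weight 1 for small residuals, down-weighting for large ones.
        weights = [1.0 if abs(r) <= k * scale else k * scale / abs(r)
                   for r in residuals]
        b_new = weighted_slope(x, y, weights)
        converged = abs(b_new - b) < tol
        b = b_new
        if converged:
            break
    return b


# Made-up points on the 0% line, with one decimal residual off by a full unit.
mz = [1000.0 + 500.0 * i for i in range(9)]
dr = [0.000514 * m for m in mz]
dr[-1] += 1.0

ols = weighted_slope(mz, dr, [1.0] * len(mz))
robust = huber_slope(mz, dr)
print(ols, robust)
```

On this example the ordinary fit is pulled towards the outlier, while the re-weighted fit returns to the slope of the remaining points.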

SLIDE 21

Estimation of relative 13C incorporation rates

[Figure: decimal residuals (0.0 to 3.5) versus m/z (1000 to 5000), with the 0% and 100% reference lines and the fitted user data]

100% slope = 0.000633

0% slope = 0.000514

$DR = b \cdot m/z$ (with $a = 0$)

$$C_{User} = \frac{b_{user} - b_{12C}}{b_{13C} - b_{12C}} \cdot 100\,\%$$

Your incorporation rate is 98.3%
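The incorporation formula is a linear interpolation between the two reference slopes. A minimal Python sketch, using the slopes quoted on the slides as defaults; the function name and the example user slopes are hypothetical.

```python
B_12C = 0.000514  # 0% reference slope from the slides
B_13C = 0.000633  # 100% reference slope from the slides


def incorporation_percent(b_user, b_12c=B_12C, b_13c=B_13C):
    """Relative 13C incorporation in percent for a fitted user slope."""
    return (b_user - b_12c) / (b_13c - b_12c) * 100.0


for b_user in (0.000514, 0.0005735, 0.000631, 0.000633):
    print(b_user, round(incorporation_percent(b_user), 1))
```

A user slope of about 0.000631 would correspond to roughly 98.3% under these reference slopes. The slide does not state which slope produced its 98.3% result, so that value is only an illustration.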

SLIDE 22

User data input

+ R Script 3

SLIDE 23

User data output

+ R Script 3

SLIDE 24

Sensitivity of Method

Dataset: Pseudomonas putida

  • 1. Calculated 50% and 100% 13C incorporation
  • 2. Randomly sampled 100 times each: 10-100 (in steps of 10), 150, 200, 300, 500, 1000 sequences (0%, 50% and 100%)
  • 3. Statistics on estimated incorporation rate for 0%, 50% and 100%
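The resampling scheme in the three steps above can be sketched as follows. The Pseudomonas putida data are not reproduced here, so the Python code uses a synthetic pool of points around the 50% line with an arbitrary noise level. The pool, the noise, the random seed and all function names are assumptions, and the printed numbers say nothing about the real method's precision.

```python
import random
import statistics

B_12C, B_13C = 0.000514, 0.000633  # reference slopes from the slides


def slope_through_origin(x, y):
    """Least-squares slope b for y = b * x."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)


def incorporation_percent(b_user):
    """Relative 13C incorporation in percent for a fitted slope."""
    return (b_user - B_12C) / (B_13C - B_12C) * 100.0


def resample_estimates(pool, n, repeats, rng):
    """Draw n peptides `repeats` times and estimate incorporation each time."""
    estimates = []
    for _ in range(repeats):
        sample = rng.sample(pool, n)
        b = slope_through_origin([p[0] for p in sample], [p[1] for p in sample])
        estimates.append(incorporation_percent(b))
    return estimates


rng = random.Random(1)
true_slope = B_12C + 0.5 * (B_13C - B_12C)  # synthetic 50% incorporation

# Synthetic pool: (m/z, unwrapped decimal residual) with Gaussian noise.
pool = []
for _ in range(5000):
    mz = rng.uniform(800.0, 5000.0)
    pool.append((mz, true_slope * mz + rng.gauss(0.0, 0.02)))

summary = {}
for n in (10, 100, 1000):
    est = resample_estimates(pool, n, repeats=100, rng=rng)
    summary[n] = (statistics.mean(est), statistics.stdev(est))
    print(n, round(summary[n][0], 1), round(summary[n][1], 2))
```

The spread of the 100 estimates per sample size is the quantity the sensitivity analysis reports.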

SLIDE 25

Sensitivity of Method

[Figure: estimated incorporation (in %) versus number of peptides (10 to 1000), one panel each for 0%, 50% and 100% 13C incorporation]

~100 samples

SLIDE 26

Conclusion

  • 1. The ‘half-decimal place rule’ is useful for the estimation of 13C incorporation rates
  • 2. Robust linear models are better suited for fitting highly variable user data than MinSSE fitting
  • 3. > 100 measurements are needed to estimate incorporation with a precision of < 5%

http://s3.amazonaws.com/readers/2008/08/14/draddalybig_1.jpg

SLIDE 27

Outlook

  • 1. Application of HDPR to DNA
  • 2. Back-calculation to 12C peaks (function identification)
  • 3. Include N-isotope incorporation

http://www.sharp.co.jp/plasmacluster-tech/en/release/images/041117_3.gif

SLIDE 28

Acknowledgement

  • Nico Jehmlich (UFZ)
  • Frank Schmidt (Uni Greifswald)
  • Hauke Harms (UFZ)
  • Martin von Bergen (UFZ)
  • Jens Mattow (MPI Berlin)
  • Carsten Vogt (UFZ)
  • Bernd Thiede (Uni Oslo)
  • Hans-Hermann Richnow (UFZ)
  • R development team

http://cache.gawker.com/assets/images/gizmodo/2009/01/bactsunsuet_01.jpg