quantification of cross hybridization on oligonucleotide

Quantification of cross hybridization on oligonucleotide - PowerPoint PPT Presentation

Quantification of cross hybridization on oligonucleotide microarrays Li Zhang Dept. of Biostatistics, UT MDACC, Houston, TX 77030 DNA/RNA duplex on oligonucleotide microarrays The probe is a 25-mer DNA oligo: ATCAGCATACGA C AGAATGATGGAT


  1. Quantification of cross hybridization on oligonucleotide microarrays. Li Zhang, Dept. of Biostatistics, UT MDACC, Houston, TX 77030

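  2. DNA/RNA duplex on oligonucleotide microarrays. The probe is a 25-mer DNA oligo:

         ATCAGCATACGA C AGAATGATGGAT
         ATCAGCATACGAGAGAATGATGGAT
         |||||||||||||||||||||||||
       AAUAGUCGUAUGCUCUCUUACUACCUAGC

     The bottom strand is a cRNA fragment in solution, expressed from a targeted gene. The first probe sequence differs from the second only at the middle (13th) base, which is set apart on the slide.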
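The pairing drawn on this slide can be checked with a short sketch. It assumes that the second probe sequence is the perfect-match (PM) probe and that the first one, with the separated middle C, is the mismatch (MM) probe obtained by complementing the 13th base, which is the usual Affymetrix design but is not stated on the slide. The function names are illustrative only.

```python
# Base pairing between a DNA probe and an RNA target, position by position,
# in the same left-to-right layout used on the slide.
DNA_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}
DNA_COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def rna_complement(probe):
    """Return the RNA strand that pairs with every base of the DNA probe."""
    return "".join(DNA_TO_RNA[base] for base in probe)


def mismatch_probe(pm_probe):
    """Assumed MM design: complement the middle base of the PM probe."""
    mid = len(pm_probe) // 2  # index 12 for a 25-mer, i.e. the 13th base
    return pm_probe[:mid] + DNA_COMPLEMENT[pm_probe[mid]] + pm_probe[mid + 1:]


PM = "ATCAGCATACGAGAGAATGATGGAT"
CRNA = "AAUAGUCGUAUGCUCUCUUACUACCUAGC"

# The PM probe pairs with a 25-base window inside the cRNA fragment.
pm_target = rna_complement(PM)
pm_is_bound = pm_target in CRNA
mm = mismatch_probe(PM)
```

Under these assumptions, `pm_is_bound` is true and `mm` reproduces the first sequence on the slide with its spaces removed.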

  3. Modes of binding on probes: 1. Gene-specific binding (mismatches = 0). 2. Cross hybridization: (I) non-specific binding (mismatches > 5); (II) binding of related sequences (0 < mismatches < 5).
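As a rough illustration of these categories, the sketch below counts mismatches between a DNA probe and an equally long RNA window and maps the count onto the modes listed on the slide. The function names are hypothetical, and the case of exactly 5 mismatches is left unclassified because the slide's ranges do not cover it.

```python
DNA_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}


def count_mismatches(probe, rna_window):
    """Count positions where the RNA base does not pair with the probe base."""
    if len(probe) != len(rna_window):
        raise ValueError("probe and RNA window must have the same length")
    return sum(DNA_TO_RNA[p] != r for p, r in zip(probe, rna_window))


def binding_mode(mismatches):
    """Map a mismatch count onto the binding modes named on the slide."""
    if mismatches < 0:
        raise ValueError("mismatch count cannot be negative")
    if mismatches == 0:
        return "gene-specific binding"
    if mismatches < 5:
        return "cross hybridization: binding of related sequences"
    if mismatches > 5:
        return "cross hybridization: non-specific binding"
    # Exactly 5 mismatches falls between the two ranges given on the slide.
    return "unclassified"
```

For example, `binding_mode(count_mismatches("ATCG", "UAGC"))` gives the gene-specific mode, since every base pairs.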

  4. Binding energies. Binding energy = f(distance, interacting partners). Gene-specific binding energy: $E = \sum_i \omega_i \, \varepsilon(b_i, b_{i+1})$. Non-specific binding energy: $E^* = \sum_i \omega^*_i \, \varepsilon^*(b_i, b_{i+1})$.
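The weighted sum over adjacent base pairs can be sketched as follows. The weights and stacking energies in the usage line are made-up placeholder numbers chosen only to show the arithmetic; they are not fitted parameters from the presentation, and `binding_energy` is a hypothetical helper name.

```python
def binding_energy(probe, weights, stacking):
    """Sum w_i * eps(b_i, b_{i+1}) over all adjacent base pairs of the probe.

    weights  -- one positional weight per adjacent pair (len(probe) - 1 values)
    stacking -- dict mapping a dinucleotide such as "AT" to its stacking energy
    """
    pairs = [probe[i:i + 2] for i in range(len(probe) - 1)]
    if len(weights) != len(pairs):
        raise ValueError("need exactly one weight per adjacent base pair")
    return sum(w * stacking[pair] for w, pair in zip(weights, pairs))


# Placeholder values for illustration only (not from the presentation).
toy_stacking = {"AT": 0.5, "TC": 1.0, "CG": -1.0}
toy_energy = binding_energy("ATCG", [1.0, 2.0, 3.0], toy_stacking)
```

The same function would serve for both $E$ and $E^*$; only the weight and stacking tables differ between the gene-specific and non-specific cases.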

  5. Thermodynamic model of binding on a probe. Probe signal: $I_{ij} = \frac{N_j}{1 + e^{E_{ij}}} + \frac{N^*}{1 + e^{E^*_{ij}}} + B$. Fitness: $T = \sum (\ln I_{ij} - \ln I_{ij,\mathrm{obs}})^2$. Constraints: • $N^*$ and $B$ are the same on a microarray; • $N_j$ is the same in a probe set. Minimization of $T$ determines: • the energy parameters; • $B$, $N^*$, $N_j$.
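A minimal sketch of the signal model and the fitness function, assuming energies are expressed in units where the exponent is simply $E$ (as the formula on the slide suggests). The function names are illustrative, and the actual minimization procedure used by the author is not described on the slide, so none is shown here.

```python
import math


def probe_signal(n_j, e_ij, n_star, e_star_ij, b):
    """I_ij = N_j / (1 + exp(E_ij)) + N* / (1 + exp(E*_ij)) + B."""
    gene_specific = n_j / (1.0 + math.exp(e_ij))
    non_specific = n_star / (1.0 + math.exp(e_star_ij))
    return gene_specific + non_specific + b


def fitness(predicted, observed):
    """T = sum of squared differences between log-predicted and log-observed signals."""
    return sum((math.log(p) - math.log(o)) ** 2
               for p, o in zip(predicted, observed))


# With both energies at zero each binding term contributes half of its N.
example_signal = probe_signal(n_j=2.0, e_ij=0.0, n_star=4.0, e_star_ij=0.0, b=1.0)
```

Because the fit is done on logarithms, a probe that is off by a factor of two contributes the same penalty whether its signal is weak or strong.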

  6. Weight factors reflect dynamic properties of binding on the probes. Curves shown: non-specific binding (PM & MM); gene-specific binding (PM); gene-specific binding (MM).

  7. Stacking energy of base-pairs

  8. Fitting the model. Plot: ln(signal) vs. probe index. Model: $I_{ij} = \frac{N_j}{1 + e^{E_{ij}}} + \frac{N^*}{1 + e^{E^*_{ij}}} + B$

  9. The baseline of non-specific binding. Plot axis: non-specific binding energy. Model: $I_{ij} = \frac{N_j}{1 + e^{E_{ij}}} + \frac{N^*}{1 + e^{E^*_{ij}}} + B$

  10. The effect of mismatch depends on the nearest neighbors. Plot: $\langle \ln(\mathrm{PM}/\mathrm{MM}) \rangle$ and $E^*(\mathrm{PM}) - E^*(\mathrm{MM})$ vs. the middle 3 bases of the PM probe. Model: $I_{ij} = \frac{N_j}{1 + e^{E_{ij}}} + \frac{N^*}{1 + e^{E^*_{ij}}} + B$

  11. Effect of cross hybridization on model fitting

  12. Latin-square tests with ‘spiked-in’ genes. Rows are samples 1 to 14, columns are genes 1 to 14; each entry is the spiked-in level of that gene in that sample.

      Sample     1     2     3     4     5     6     7     8     9    10    11    12    13    14
           1     0  0.25   0.5     1     2     4     8    16    32    64   128   256   512  1024
           2  0.25   0.5     1     2     4     8    16    32    64   128   256   512  1024     0
           3   0.5     1     2     4     8    16    32    64   128   256   512  1024     0  0.25
           4     1     2     4     8    16    32    64   128   256   512  1024     0  0.25   0.5
           5     2     4     8    16    32    64   128   256   512  1024     0  0.25   0.5     1
           6     4     8    16    32    64   128   256   512  1024     0  0.25   0.5     1     2
           7     8    16    32    64   128   256   512  1024     0  0.25   0.5     1     2     4
           8    16    32    64   128   256   512  1024     0  0.25   0.5     1     2     4     8
           9    32    64   128   256   512  1024     0  0.25   0.5     1     2     4     8    16
          10    64   128   256   512  1024     0  0.25   0.5     1     2     4     8    16    32
          11   128   256   512  1024     0  0.25   0.5     1     2     4     8    16    32    64
          12   256   512  1024     0  0.25   0.5     1     2     4     8    16    32    64   128
          13   512  1024     0  0.25   0.5     1     2     4     8    16    32    64   128   256
          14  1024     0  0.25   0.5     1     2     4     8    16    32    64   128   256   512

      Data source: Affymetrix Inc.
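The design on this slide is a cyclic Latin square: each sample shifts the list of 14 levels by one position, so every gene sees every level exactly once across the samples. A small sketch that regenerates it (the names `LEVELS` and `latin_square` are illustrative):

```python
# The 14 spike-in levels, in the order they appear for sample 1.
LEVELS = [0, 0.25, 0.5, 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024]


def latin_square(levels):
    """Row s, column g holds levels[(s + g) mod n], a cyclic shift per sample."""
    n = len(levels)
    return [[levels[(s + g) % n] for g in range(n)] for s in range(n)]


design = latin_square(LEVELS)

# Column g lists the levels that gene g + 1 takes across the 14 samples.
gene_columns = [[row[g] for row in design] for g in range(len(LEVELS))]
```

Here `design[0]` is the first sample and `design[13]` the last; indices are zero-based, unlike the sample and gene numbers on the slide.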

  13. Accuracy of estimated gene expression levels

  14. Reproducibility of estimated gene expression levels

  15. Correlation of expression levels between batches of experiments

  16. Spotting cross hybridizing probes. Cross hyb probes: unknown EST; interleukin-8 receptor type B. Cross hyb sources: salivary alpha-amylase; angiotensinogen serine (or cysteine) proteinase inhibitor.

  17. The missing 12th gene?

  18. Is the missing gene an alternative splicing variant? Gene name: rTS beta protein

  19. Conclusions: • Probe signals can be decomposed into two modes: gene-specific binding and non-specific binding. • Sequence dependence of probe signals can be determined by a thermodynamic model. • For the given data set, the amount of cross hybridization and its sources on the probes can be determined.

  20. Acknowledgements: Ken D Aldape, Ken Hess, Keith A. Baggerly, James Mitchell, Norris Clift, Lianchun Xiao, Kevin R. Coombes

  21. Website for downloading the program Perfect Match: http://bioinformatics.mdanderson.org

  22. Clustering crosshyb effects

  23. The misfits happen in the same probe pairs. Plot of residuals: ln(PM fitted) - ln(PM) vs. ln(MM fitted) - ln(MM).

  24. Basic questions of microarrays: • How can we determine gene expression levels from probe signals? • How does probe binding affinity depend on the probe’s sequence? • How much binding on a probe is due to non-specific binding? • Why is a mismatch probe signal sometimes stronger than the corresponding perfect match probe signal?

  25. Why a good physical model for microarrays is important: • Recognize erratic probe signals • Eliminate inefficient probes • Extract accurate and reliable gene expression levels

  26. Protocol of a microarray experiment

  27. Oligonucleotide microarray technology. Affymetrix array ~ Affinity matrix. Basic features: • Probe set: multiple probes for a gene target • Probe pair: perfect match vs. mismatch

  28. Effects of alternative splicing. Fitting the model with sub-divided probe sets:

      Even probes:     2 4 6 8 10 12 14 16
      Odd probes:      1 3 5 7 9 11 13 15
      1st half probes: 1 2 3 4 5 6 7 8
      2nd half probes: 9 10 11 12 13 14 15 16

      Alternative splicing:

      DNA:   ------------------------
      mRNA1: ------------------------
      mRNA2: -----------
      mRNA3: ------------ ---------
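The four subdivisions listed on this slide can be generated as in the sketch below. How each subset is then fitted and compared is not spelled out on the slide, so only the splitting step is shown; `subdivide` is a hypothetical helper name.

```python
def subdivide(probe_numbers):
    """Split a probe set into even/odd and first-half/second-half subsets."""
    half = len(probe_numbers) // 2
    return {
        "even": [p for p in probe_numbers if p % 2 == 0],
        "odd": [p for p in probe_numbers if p % 2 == 1],
        "first_half": probe_numbers[:half],
        "second_half": probe_numbers[half:],
    }


# A 16-probe set numbered 1..16, as on the slide.
subsets = subdivide(list(range(1, 17)))
```

The even and odd subsets each span the whole transcript, while the two halves each cover one end of it.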

  29. Mechanism of non-specific binding on the probes: A. Non-specific binding energy is much higher than gene-specific binding energy ($E^* - E = 5\,k_B T$). B. The source of non-specific binding is much larger than the source of gene-specific binding. C. Non-specific binding is very loose and flexible, and contains many mismatches. D. Non-specific binding depends on stacking energy, which in turn depends on the probe sequence.

  30. Energetic aspect of probe design. Plot: ln(observed signal) vs. gene-specific binding energy, with curves for total signal, gene-specific signal, and background.

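  31. Boltzmann distribution: $\frac{N_1}{N_2} = e^{-(E_1 - E_2)/(k_B T)}$ for two states with energies $E_1$, $E_2$ and populations $N_1$, $N_2$. Binding affinity: $\frac{N_1}{N_1 + N_2} = \frac{1}{1 + e^{\Delta E}}$, with $\Delta E = E_1 - E_2$ in units of $k_B T$.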
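The step from the population ratio to the binding affinity is short, and it is what gives the $1/(1 + e^{E})$ terms in the probe signal model their form. A sketch of the algebra, assuming $\Delta E$ stands for $(E_1 - E_2)/(k_B T)$:

```latex
\frac{N_1}{N_1 + N_2}
  = \frac{1}{1 + N_2 / N_1}
  = \frac{1}{1 + e^{(E_1 - E_2)/(k_B T)}}
  = \frac{1}{1 + e^{\Delta E}}
```

When $\Delta E$ is large and negative the bound fraction approaches 1; when it is large and positive the fraction falls off as $e^{-\Delta E}$.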

  32. Binding on microarrays. Probe’s response to gene expression (plot: probe signal vs. level of expression): • Some probes always give strong signals, even when the targeted gene is absent. • Some probes always give weak signals, even when the targeted gene is abundant.
