Quantification of cross hybridization on oligonucleotide microarrays



slide-1
SLIDE 1

Quantification of cross hybridization on oligonucleotide microarrays

Li Zhang

  • Dept. of Biostatistics, UT MDACC, Houston, TX 77030
slide-2
SLIDE 2

DNA/RNA duplex on oligonucleotide microarrays

The probe is a 25-mer DNA oligo:

  ATCAGCATACGAGAGAATGATGGAT
  |||||||||||||||||||||||||
AAUAGUCGUAUGCUCUCUUACUACCUAGC

cRNA fragment in solution expressed from a targeted gene

ATCAGCATACGACAGAATGATGGAT
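
The pairing on this slide can be checked mechanically. The sketch below is an illustration only; the helper names `rna_partner` and `differences` are not from the slides. It builds the RNA bases that pair position by position with the probe as drawn, locates that stretch in the cRNA fragment, and compares the two 25-mers shown on the slide.

```python
# DNA base on the probe -> RNA base that pairs with it (Watson-Crick).
PAIR = {"A": "U", "T": "A", "C": "G", "G": "C"}

def rna_partner(probe):
    """RNA bases pairing position by position with the probe as drawn."""
    return "".join(PAIR[base] for base in probe)

def differences(seq_a, seq_b):
    """0-based positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length")
    return [i for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]

probe_in_duplex = "ATCAGCATACGAGAGAATGATGGAT"  # 25-mer drawn in the duplex
second_probe = "ATCAGCATACGACAGAATGATGGAT"     # second 25-mer on the slide
crna = "AAUAGUCGUAUGCUCUCUUACUACCUAGC"         # cRNA fragment in solution

# Where the fully paired stretch starts inside the cRNA fragment.
offset = crna.find(rna_partner(probe_in_duplex))
print(offset)

# Positions at which the two 25-mers differ.
print(differences(probe_in_duplex, second_probe))
```

The two 25-mers differ only at the middle (13th) base, G versus C, which is consistent with a perfect match / mismatch probe pair.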

slide-3
SLIDE 3

Modes of binding on probes

  • 1. Gene-specific binding (Mismatches = 0)
  • 2. Cross hybridization:
      (I) Non-specific binding (Mismatches > 5)
      (II) Binding of related sequences (0 < Mismatches < 5)
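
As a small illustration, the mismatch thresholds on this slide can be written as a lookup. The function name `binding_mode` and the returned labels are assumptions made for this sketch. The slide does not assign exactly 5 mismatches to either class, so the sketch leaves that case unclassified rather than guessing.

```python
def binding_mode(mismatches):
    """Classify a probe/target duplex by its mismatch count (slide thresholds)."""
    if mismatches < 0:
        raise ValueError("mismatch count cannot be negative")
    if mismatches == 0:
        return "gene-specific binding"
    if mismatches < 5:
        return "cross hybridization: binding of related sequences"
    if mismatches > 5:
        return "cross hybridization: non-specific binding"
    # Exactly 5 mismatches is not covered by either range on the slide.
    return "unclassified"

for m in (0, 2, 5, 9):
    print(m, binding_mode(m))
```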

slide-4
SLIDE 4

Binding energies

Binding energy = f(distance, interacting partners)

Gene-specific binding energy:

$$E = \sum_i \omega_i \, \varepsilon(b_i, b_{i+1})$$

Non-specific binding energy:

$$E^* = \sum_i \omega^*_i \, \varepsilon^*(b_i, b_{i+1})$$
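
A minimal sketch of how such an energy could be evaluated: a positional weight multiplies a stacking energy for each pair of adjacent bases, and the products are summed along the probe. The function name `binding_energy` and all parameter values below are placeholders chosen for illustration, not fitted values from the talk.

```python
from itertools import product

BASES = "ACGT"

def binding_energy(probe, weights, stacking):
    """Sum of weights[i] * stacking[(b_i, b_{i+1})] over adjacent base pairs."""
    pairs = list(zip(probe, probe[1:]))
    if len(weights) != len(pairs):
        raise ValueError("need one weight per nearest-neighbor pair")
    return sum(w * stacking[pair] for w, pair in zip(weights, pairs))

# Placeholder parameters, NOT fitted values: every stacking energy and every
# positional weight is 1.0, so the energy just counts nearest-neighbor pairs.
stacking = {pair: 1.0 for pair in product(BASES, repeat=2)}
probe = "ATCAGCATACGAGAGAATGATGGAT"
weights = [1.0] * (len(probe) - 1)

energy = binding_energy(probe, weights, stacking)
print(energy)  # a 25-mer has 24 adjacent pairs
```

Keeping the weights and the stacking table as separate arguments mirrors the split on the slide between position ("distance") and base identity ("interacting partners"); the same function can be reused for the non-specific energy by passing a second set of parameters.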

slide-5
SLIDE 5

Thermodynamic model of binding on a probe

Probe signal:

$$I_{ij} = \frac{N_j}{1 + e^{E_{ij}}} + \frac{N^*}{1 + e^{E^*_{ij}}} + B$$

Fitness:

$$T = \sum_{i,j} \left( \ln I_{ij} - \ln I_{ij}^{obs} \right)^2$$

Minimization of T over:

  • Energy parameters
  • B, N*, Nj

Constraints:

  • N*, B are the same on a microarray;
  • Nj is the same in a probe set.
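
The probe signal and the fitness on this slide translate directly into code. The sketch below is illustrative: the function names and the numeric inputs are assumptions, and energies are taken to be dimensionless so that they can be passed straight to the exponential.

```python
import math

def probe_signal(n_j, e_ij, n_star, e_star_ij, background):
    """I_ij = N_j / (1 + exp(E_ij)) + N* / (1 + exp(E*_ij)) + B."""
    specific = n_j / (1.0 + math.exp(e_ij))
    nonspecific = n_star / (1.0 + math.exp(e_star_ij))
    return specific + nonspecific + background

def fitness(predicted, observed):
    """T = sum over probes of (ln I - ln I_obs)^2."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    return sum((math.log(p) - math.log(o)) ** 2
               for p, o in zip(predicted, observed))

# Made-up inputs: both energies are 0, so each binding term is halved.
signal = probe_signal(n_j=2.0, e_ij=0.0, n_star=4.0, e_star_ij=0.0, background=1.0)
print(signal)

# Two probes whose predicted signal is e times the observed one.
print(fitness([math.e, math.e], [1.0, 1.0]))
```

Because the fitness compares logarithms, a probe that is off by a given factor contributes the same amount to T whether its signal is bright or dim.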

slide-6
SLIDE 6

Weight factors reflect dynamic properties of binding on the probes

Figure legend: non-specific binding (PM & MM); gene-specific binding (PM); gene-specific binding (MM)

slide-7
SLIDE 7

Stacking energy of base-pairs

slide-8
SLIDE 8

Fitting the model

Figure axes: ln(signal); probe index

$$I_{ij} = \frac{N_j}{1 + e^{E_{ij}}} + \frac{N^*}{1 + e^{E^*_{ij}}} + B$$
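
The slides do not say which optimizer is used to fit the model. As a hedged illustration of one piece of the fit, the sketch below holds the energies, N* and B fixed and estimates N_j for a single probe set by minimizing the sum of squared log-residuals with a golden-section search over ln(N_j). The function names, search bounds and the synthetic data are all assumptions; a full fit would also adjust the energy parameters, N* and B.

```python
import math

def model_signal(n_j, e, n_star, e_star, b):
    """N_j / (1 + exp(E)) + N* / (1 + exp(E*)) + B for one probe."""
    return n_j / (1 + math.exp(e)) + n_star / (1 + math.exp(e_star)) + b

def fit_n_j(observed, energies, star_energies, n_star, b,
            log_lo=-10.0, log_hi=20.0, tol=1e-9):
    """Golden-section search for the ln(N_j) that minimizes T in one probe set."""
    def t(log_n):
        n_j = math.exp(log_n)
        return sum(
            (math.log(model_signal(n_j, e, n_star, es, b)) - math.log(obs)) ** 2
            for obs, e, es in zip(observed, energies, star_energies)
        )

    ratio = (math.sqrt(5) - 1) / 2
    lo, hi = log_lo, log_hi
    while hi - lo > tol:
        x1 = hi - ratio * (hi - lo)
        x2 = lo + ratio * (hi - lo)
        if t(x1) < t(x2):
            hi = x2   # minimum lies in [lo, x2]
        else:
            lo = x1   # minimum lies in [x1, hi]
    return math.exp((lo + hi) / 2)

# Synthetic check: generate signals from a known N_j, then recover it.
energies = [-2.0, -1.0, 0.0, 1.0]
star_energies = [0.5, 1.0, 0.0, -0.5]
N_STAR, B, TRUE_N_J = 50.0, 20.0, 1000.0
observed = [model_signal(TRUE_N_J, e, N_STAR, es, B)
            for e, es in zip(energies, star_energies)]
estimate = fit_n_j(observed, energies, star_energies, N_STAR, B)
print(round(estimate, 3))
```

Golden-section search assumes a single minimum inside the bracket. That holds for this noise-free synthetic example; with real data it is an assumption that would need checking.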

slide-9
SLIDE 9

The baseline of non-specific binding

Non-specific binding energy

$$I_{ij} = \frac{N_j}{1 + e^{E_{ij}}} + \frac{N^*}{1 + e^{E^*_{ij}}} + B$$

slide-10
SLIDE 10
The effect of mismatch depends on the nearest-neighbors

Figure labels: middle 3 bases of PM probe (A, C, G, T); < ln(PM/MM) >; E*(PM) - E*(MM)

$$I_{ij} = \frac{N_j}{1 + e^{E_{ij}}} + \frac{N^*}{1 + e^{E^*_{ij}}} + B$$
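
A short worked step shows why the neighbors matter under a nearest-neighbor energy. Assume, for this sketch only, that the MM probe differs from the PM probe solely at the 13th base (written b'_13 for the MM probe) and that both probes share the same weights. Every term of the non-specific energy sum that does not involve position 13 then cancels, leaving

$$E^*(PM) - E^*(MM) = \omega^*_{12}\left[\varepsilon^*(b_{12}, b_{13}) - \varepsilon^*(b_{12}, b'_{13})\right] + \omega^*_{13}\left[\varepsilon^*(b_{13}, b_{14}) - \varepsilon^*(b'_{13}, b_{14})\right]$$

Only bases 12, 13 and 14 appear, which matches the grouping of the plot by the middle three bases of the PM probe.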

slide-11
SLIDE 11

Effect of cross hybridization on model fitting

slide-12
SLIDE 12

Latin-square tests with ‘spiked-in’ genes

1 2 3 4 5 6 7 8 9 10 11 12 13 14
1 0.25 0.5 1 2 4 8 16 32 64 128 256 512 1024
2 0.25 0.5 1 2 4 8 16 32 64 128 256 512 1024
3 0.5 1 2 4 8 16 32 64 128 256 512 1024 0.25
4 1 2 4 8 16 32 64 128 256 512 1024 0.25 0.5
5 2 4 8 16 32 64 128 256 512 1024 0.25 0.5 1
6 4 8 16 32 64 128 256 512 1024 0.25 0.5 1 2
7 8 16 32 64 128 256 512 1024 0.25 0.5 1 2 4
8 16 32 64 128 256 512 1024 0.25 0.5 1 2 4 8
9 32 64 128 256 512 1024 0.25 0.5 1 2 4 8 16
10 64 128 256 512 1024 0.25 0.5 1 2 4 8 16 32
11 128 256 512 1024 0.25 0.5 1 2 4 8 16 32 64
12 256 512 1024 0.25 0.5 1 2 4 8 16 32 64 128
13 512 1024 0.25 0.5 1 2 4 8 16 32 64 128 256
14 1024 0.25 0.5 1 2 4 8 16 32 64 128 256 512

Gene Sample

Data source: Affymetrix Inc.
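
The rotation pattern in the table can be generated programmatically. The sketch below reproduces the rows as transcribed on this slide (13 doubling values per row, rows 1 and 2 identical, each later row shifted one step further); it is not an authoritative description of the original experimental design, and the names `rotate` and `rows` are made up for the example.

```python
def rotate(values, shift):
    """Return values rotated left by `shift` positions."""
    shift %= len(values)
    return values[shift:] + values[:shift]

# Doubling series from 0.25 to 1024 (13 values), as listed in each row.
concentrations = [0.25 * 2 ** k for k in range(13)]

# Row label -> values; rows 1 and 2 share the unshifted series.
rows = {row: rotate(concentrations, max(0, row - 2)) for row in range(1, 15)}

for row, values in rows.items():
    print(row, " ".join(f"{v:g}" for v in values))
```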

slide-13
SLIDE 13

Accuracy of estimated gene expression levels

slide-14
SLIDE 14

Reproducibility of estimated gene expression levels

slide-15
SLIDE 15

Correlation of expression levels between batches of experiments

slide-16
SLIDE 16

Spotting cross hybridizing probes

Cross hyb probes: unknown EST
Cross hyb source: salivary alpha-amylase

Cross hyb probes: interleukin-8 receptor type B
Cross hyb source: angiotensinogen serine (or cysteine) proteinase inhibitor

slide-17
SLIDE 17

The missing 12th gene?

slide-18
SLIDE 18

Is the missing gene an alternative splicing variant?

Gene name: rTS beta protein

slide-19
SLIDE 19

Conclusions

  • Probe signals can be decomposed into two modes: gene-specific binding and non-specific binding.
  • Sequence dependence of probe signals can be determined by a thermodynamic model.
  • For the given data set, the amount of cross hybridization and its sources on the probes can be determined.

slide-20
SLIDE 20

Acknowledgements

Ken D Aldape, Ken Hess, Keith A. Baggerly, James Mitchell, Norris Clift, Lianchun Xiao, Kevin R. Coombes

slide-21
SLIDE 21

Website for downloading the program

http://bioinformatics.mdanderson.org

Perfect Match

slide-22
SLIDE 22

Clustering crosshyb effects

slide-23
SLIDE 23

Plot of residuals

Figure axes: ln(MMfitted) - ln(MM); ln(PMfitted) - ln(PM)

The misfits happen in the same probe pairs

slide-24
SLIDE 24

Basic questions of microarrays

  • How can we determine gene expression levels from probe signals?
  • How does probe binding affinity depend on the probe’s sequence?
  • How much binding on a probe is due to non-specific binding?
  • Why is a mismatch probe signal sometimes stronger than the corresponding perfect match probe signal?

slide-25
SLIDE 25

Why a good physical model for microarrays is important

  • Recognize erratic probe signals
  • Eliminate inefficient probes
  • Extract accurate and reliable gene expression levels

slide-26
SLIDE 26

Protocol of a microarray experiment

slide-27
SLIDE 27

Oligonucleotide microarray technology

Affymetrix array ~ Affinity matrix

Basic features:

  • Probe set -- Multiple probes for a gene target
  • Probe pair -- Perfect match vs. mismatch

slide-28
SLIDE 28

Effects of alternative splicing

Fitting the model with subdivided probe sets

Even probes: 2 4 6 8 10 12 14 16
Odd probes: 1 3 5 7 9 11 13 15
1st half probes: 1 2 3 4 5 6 7 8
2nd half probes: 9 10 11 12 13 14 15 16

Alternative splicing:

DNA:   ------------------------
mRNA1: ------------------------
mRNA2: -----------
mRNA3: ------------
slide-29
SLIDE 29

Mechanism of non-specific binding on the probes

  • A. Non-specific binding energy is much higher than gene-specific binding energy (E* - E = 5 kBT).
  • B. The source of non-specific binding is much more abundant than the source of gene-specific binding.
  • C. Non-specific binding is very loose and flexible, and contains many mismatches.
  • D. Non-specific binding depends on stacking energy, which in turn depends on the probe sequence.

slide-30
SLIDE 30

Energetic aspect of probe design

Figure labels: gene-specific binding energy; ln(observed signal); background; total signal; gene-specific signal

slide-31
SLIDE 31

Boltzmann Distribution

Diagram labels: E1, E2, N1, N2

$$\frac{N_1}{N_2} = e^{-\frac{E_1 - E_2}{k_B T}}$$

Binding affinity:

$$\frac{N_1}{N_1 + N_2} = \frac{1}{1 + e^{E}}$$
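
The step from the Boltzmann ratio to the binding-affinity form can be spelled out. This sketch assumes that E on the slide denotes the energy difference between the two states measured in units of k_B T; that definition is an inference, not something stated on the slide.

$$\frac{N_1}{N_1 + N_2} = \frac{1}{1 + N_2 / N_1} = \frac{1}{1 + e^{(E_1 - E_2)/(k_B T)}} = \frac{1}{1 + e^{E}}, \qquad E \equiv \frac{E_1 - E_2}{k_B T}$$

This is the same factor, 1/(1 + e^E), that multiplies N_j and N* in the probe-signal model, so a lower energy gives a larger bound fraction.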

slide-32
SLIDE 32

Binding on microarrays

  • Some probes always give strong signals, even when the targeted gene is absent.
  • Some probes always give weak signals, even when the targeted gene is abundant.

Probe’s response to gene expression:

Figure axes: level of expression; probe signal
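
Both observations fall out of a signal model with a gene-specific term, a non-specific term and a background. The sketch below uses the probe-signal form from earlier in the deck with two hypothetical probes; every number here (N*, B and the energies) is made up to illustrate the behavior, not taken from the talk.

```python
import math

def probe_signal(n_j, e, n_star, e_star, b):
    """N_j / (1 + exp(E)) + N* / (1 + exp(E*)) + B."""
    return n_j / (1 + math.exp(e)) + n_star / (1 + math.exp(e_star)) + b

N_STAR, B = 1000.0, 10.0               # hypothetical array-wide constants

# Hypothetical probes: lower energy means stronger binding in this model.
sticky = {"e": 0.0, "e_star": -3.0}    # strong non-specific binding
weak = {"e": 6.0, "e_star": 4.0}       # weak binding of either kind

for n_j in (0.0, 10.0, 100.0, 1000.0):
    s = probe_signal(n_j, n_star=N_STAR, b=B, **sticky)
    w = probe_signal(n_j, n_star=N_STAR, b=B, **weak)
    print(f"N_j={n_j:7.1f}  sticky={s:8.1f}  weak={w:8.1f}")

sticky_absent = probe_signal(0.0, n_star=N_STAR, b=B, **sticky)
weak_abundant = probe_signal(1000.0, n_star=N_STAR, b=B, **weak)
```

With these made-up parameters the "sticky" probe is brighter with no target at all than the "weak" probe is at the highest expression level, which is the pattern the two bullets describe.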