Characterization, Modeling, and Simulation of Mouse Microarray Data




slide-1
SLIDE 1

Characterization, Modeling, and Simulation of Mouse Microarray Data

David S. Lalush Bioinformatics Research Center

North Carolina State University

slide-2
SLIDE 2

Acknowledgments

  • Assistance from:
    – Jeff Tucker (NIEHS)
    – Pierre Bushel (NIEHS)
    – Bruce Weir (NCSU)
  • Funded by K01 HG02428, National Human Genome Research Institute

slide-3
SLIDE 3

Outline

  • Microarray Simulation Project
  • Characterization of Microarray Images
  • Results of Characterization
  • Simulations
  • Conclusion
slide-4
SLIDE 4

Outline

  • Microarray Simulation Project
  • Characterization of Microarray Images
  • Results of Characterization
  • Simulations
  • Conclusion
slide-5
SLIDE 5

Microarray in Diagnosis

[Diagram: Type I tumors, microarray, gene expression pattern; Type II tumors, microarray, gene expression pattern]

slide-6
SLIDE 6

Microarray in Diagnosis

[Diagram: unknown tumor, microarray, gene expression pattern]

Type I or type II? Probability of misclassification?

slide-7
SLIDE 7

Research Focus

  • Evaluating classification methods
  • Studying variability in microarray data

Problems:

  • Many replications are required to evaluate error rates.
  • Microarray experiments are expensive.
  • True patterns are unknown in real data.
slide-8
SLIDE 8

Microarray Simulation

  • Creating a realistic simulation of microarray data
  • Accounting for various sources of variability in the system

Advantages:

  • Generates many replications cheaply.
  • True patterns are known.
  • Can control sources of variability.
slide-9
SLIDE 9

Microarray System

[Diagram stages: Slide Printing, Sample Preparation, Hybridization, Scanning, Image Processing, Data Analysis]

slide-10
SLIDE 10

Simulation Model

[Diagram components: Sample, Slide, Pin, Array Printing and Hybridization, Scanning]

slide-11
SLIDE 11

Simulation Model

[Diagram components: Sample, Slide, Pin, Array Printing and Hybridization, Scanning]

  • Gene expression variation modeled as multivariate normal
  • Global expression variations modeled as normal

slide-12
SLIDE 12

Simulation Model

[Diagram components: Sample, Slide, Pin, Array Printing and Hybridization, Scanning]

  • Background level modeled as normal (dye-dependent)
  • Defects modeled as 2D causal Markov random field

slide-13
SLIDE 13

Simulation Model

[Diagram components: Sample, Slide, Pin, Array Printing and Hybridization, Scanning]

  • Spot size, shape, and orientation modeled as normal
  • Spot defects modeled with 2D causal Markov random field

slide-14
SLIDE 14

Simulation Model

[Diagram components: Sample, Slide, Pin, Array Printing and Hybridization, Scanning]

  • Instantiates spots based on properties from sample, slide, and pin

slide-15
SLIDE 15

Simulation Model

[Diagram components: Sample, Slide, Pin, Array Printing and Hybridization, Scanning]

  • Creates discretized image based on spots, SNR, gain, resolution, and blur parameters
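The scanning stage can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the presenter's implementation: the function name `render_scan`, the disc-shaped spots, the box blur standing in for the scanner blur, the use of `size` as the resolution, and the way SNR sets the noise level are all hypothetical choices.

```python
import random

def render_scan(size, spots, background, snr, gain, blur_radius, seed=0):
    """Render a square grayscale image of circular spots.

    spots: list of (row, col, radius, intensity) tuples (hypothetical format).
    snr:   brightest spot intensity divided by the noise standard deviation
           (an assumed definition).
    Returns a list of rows of integer pixel values in the 16-bit range.
    """
    rng = random.Random(seed)
    # 1. Ideal image: flat background plus filled discs.
    img = [[float(background)] * size for _ in range(size)]
    for r0, c0, radius, intensity in spots:
        for r in range(size):
            for c in range(size):
                if (r - r0) ** 2 + (c - c0) ** 2 <= radius ** 2:
                    img[r][c] = background + intensity
    # 2. Blur: mean over a (2k+1) x (2k+1) window, clipped at the edges.
    k = blur_radius
    blurred = [[0.0] * size for _ in range(size)]
    for r in range(size):
        for c in range(size):
            window = [img[i][j]
                      for i in range(max(0, r - k), min(size, r + k + 1))
                      for j in range(max(0, c - k), min(size, c + k + 1))]
            blurred[r][c] = sum(window) / len(window)
    # 3. Additive Gaussian noise, 4. gain, 5. quantization to 0..65535.
    noise_sd = max(s[3] for s in spots) / snr
    out = []
    for row in blurred:
        out.append([min(65535, max(0, round(gain * (v + rng.gauss(0.0, noise_sd)))))
                    for v in row])
    return out

image = render_scan(size=32, spots=[(16, 16, 6, 1000.0)], background=100.0,
                    snr=20.0, gain=2.0, blur_radius=1)
```

Keeping each stage as a separate step makes it possible to vary one parameter (for example blur) while holding the others fixed, which is the kind of control over sources of variability that the simulation is meant to provide.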

slide-16
SLIDE 16

Characterization

  • Characterization of existing microarray images
    – Spot properties (size, shape, uniformity)
    – Pin properties (spot uniformity)
    – Slide properties (background, signal-to-noise)
    – Gene properties (mean, variance, covariance)

slide-17
SLIDE 17

Outline

  • Microarray Simulation Project
  • Characterization of Microarray Images
  • Results of Characterization
  • Simulations
  • Conclusion
slide-18
SLIDE 18

Characterization

  • Characterization of mouse kidney dataset
    – Six mice
    – Four slides each (2x2 fluor flip)
    – 24 slides in all
    – 5520 spots in 16 blocks, 4x4 block pattern

slide-19
SLIDE 19

Characterization of Spots

  • Step 1: Spot Detection
slide-20
SLIDE 20

Characterization of Spots

  • Step 2: Spot Morphology Measures

Cast rays from centroid. Measures: radius, area, eccentricity.
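A ray-casting measurement of this kind might look like the sketch below. The slide does not define the measures, so several things here are assumptions: the function name `spot_morphology`, the half-pixel step, taking area as the pixel count, and defining eccentricity as sqrt(1 - (r_min / r_max)^2) from the shortest and longest rays.

```python
import math

def spot_morphology(mask, n_rays=64, step=0.5):
    """Measure a binary spot mask (list of rows of 0/1).

    Returns (mean_radius, area, eccentricity).
    """
    rows, cols = len(mask), len(mask[0])
    pixels = [(r, c) for r in range(rows) for c in range(cols) if mask[r][c]]
    area = len(pixels)
    # Centroid of the spot pixels.
    cr = sum(r for r, _ in pixels) / area
    cc = sum(c for _, c in pixels) / area

    def inside(y, x):
        # Nearest-pixel lookup; anything off the grid counts as outside.
        r, c = math.floor(y + 0.5), math.floor(x + 0.5)
        return 0 <= r < rows and 0 <= c < cols and bool(mask[r][c])

    radii = []
    for k in range(n_rays):
        theta = 2.0 * math.pi * k / n_rays
        dy, dx = math.sin(theta), math.cos(theta)
        # Walk outward from the centroid until the ray leaves the spot.
        t = 0.0
        while inside(cr + (t + step) * dy, cc + (t + step) * dx):
            t += step
        radii.append(t)

    mean_radius = sum(radii) / len(radii)
    r_min, r_max = min(radii), max(radii)
    ecc = math.sqrt(1.0 - (r_min / r_max) ** 2) if r_max > 0 else 0.0
    return mean_radius, area, ecc

# Synthetic disc of radius 8 pixels centered in a 25 x 25 mask.
disc = [[1 if (r - 12) ** 2 + (c - 12) ** 2 <= 64 else 0 for c in range(25)]
        for r in range(25)]
radius, area, ecc = spot_morphology(disc)
```

Because the min/max ratio is sensitive to pixel discretization, even a round spot yields a nonzero eccentricity under this definition; it is more useful for comparing spots than as an absolute value.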

slide-21
SLIDE 21

Characterization of Spots

  • Step 3: Spot Intensity Measures
    – Mean and standard deviation of spot pixels
    – Mean and standard deviation of background pixels

slide-22
SLIDE 22

Characterization of Spots

  • Step 4: Secondary Intensity Measures

Separability = $\dfrac{\mu_{\text{signal}} - \mu_{\text{background}}}{\sqrt{\sigma_{\text{signal}}^{2} + \sigma_{\text{background}}^{2}}}$
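As a small worked example, separability can be computed from the Step 3 statistics. The sketch assumes one reading of the slide's formula: the difference between the spot and background means divided by the square root of the sum of their variances. The function name and the pixel values are hypothetical.

```python
import math
from statistics import mean, pstdev

def separability(spot_pixels, background_pixels):
    """(mean_signal - mean_background) / sqrt(sd_signal^2 + sd_background^2).

    Raises ZeroDivisionError if both pixel sets are perfectly constant.
    """
    mu_s, mu_b = mean(spot_pixels), mean(background_pixels)
    sd_s, sd_b = pstdev(spot_pixels), pstdev(background_pixels)
    return (mu_s - mu_b) / math.sqrt(sd_s ** 2 + sd_b ** 2)

# Hypothetical pixel intensities for one spot and its local background.
spot = [900, 1000, 1100, 1000]
background = [90, 100, 110, 100]
s = separability(spot, background)
```

Under this reading the measure is dimensionless, so a fixed cutoff on it can be applied across slides with different overall brightness.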

slide-23
SLIDE 23

Characterization of Spots

  • Step 4: Secondary Intensity Measures

Spot Uniformity (defined from the signal mean and $\sigma_{\text{signal}}$)

slide-24
SLIDE 24

Characterization of Spot Defects

  • Spots often exhibit characteristic nonuniformities
    – Low center
    – Spot breaks

slide-25
SLIDE 25

Characterization of Spot Defects

Consider each spot to have two regions: a normal region and a defect region.

slide-26
SLIDE 26

Characterization of Spot Defects

State 0: N State 1: D

Each region acts as a hidden state. Each state has its own distribution of emitted intensities.

slide-27
SLIDE 27

Characterization of Spot Defects

The probability of a pixel being in a given state depends on its neighbors.

[Diagram: pixel X with four neighbors in states N, N, D, D]

P(X | N, N, D, D)

slide-28
SLIDE 28

Characterization of Spot Defects

State 0: N State 1: D

Region Model (2D causal MRF):

  • 16 parameters for state transition
  • 2 parameters for intensity of D region pixels relative to N region (mean, s.d.)
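The count of 16 transition parameters matches one probability per configuration of four binary neighbors (2^4 = 16). The sketch below samples such a field in raster order. It assumes the four causal neighbors are west, north-west, north, and north-east, and that neighbors off the grid count as N; the slides do not specify either, and the parameter values are made up for illustration.

```python
import random

def sample_causal_mrf(rows, cols, p_defect, seed=0):
    """Sample a binary field (0 = N, 1 = D) in raster order.

    p_defect maps (west, north_west, north, north_east) neighbor states
    to P(pixel = D); there are 2**4 = 16 entries.
    """
    rng = random.Random(seed)
    field = [[0] * cols for _ in range(rows)]

    def state(r, c):
        # Off-grid neighbors are treated as N (an assumption).
        return field[r][c] if 0 <= r < rows and 0 <= c < cols else 0

    for r in range(rows):
        for c in range(cols):
            key = (state(r, c - 1), state(r - 1, c - 1),
                   state(r - 1, c), state(r - 1, c + 1))
            field[r][c] = 1 if rng.random() < p_defect[key] else 0
    return field

# Hypothetical parameters: defects start rarely and tend to persist.
p_defect = {}
for w in (0, 1):
    for nw in (0, 1):
        for n in (0, 1):
            for ne in (0, 1):
                p_defect[(w, nw, n, ne)] = 0.02 + 0.22 * (w + nw + n + ne)

field = sample_causal_mrf(20, 20, p_defect)
```

Because every neighbor has already been visited when a pixel is sampled, a single pass over the image is enough; that is the practical appeal of a causal neighborhood.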

slide-29
SLIDE 29

Characterization of Spot Defects

State 0: N State 1: D

Applying the Region Model

Pixel is in D region if:

  • It is in the spot
  • It is below the spot average intensity in BOTH channels
slide-30
SLIDE 30

Characterization of Spot Defects

State 0: N State 1: D

Applying the Region Model

  • Smooth region boundary
  • Compute the 18 parameters for each spot
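Applying the region model to one spot could be sketched as below. The code covers only the labeling rule and the transition probabilities; boundary smoothing and the two intensity parameters are omitted. The neighbor set (west, north-west, north, north-east), the treatment of off-grid neighbors as N, and all names are assumptions.

```python
def label_defects(ch1, ch2, spot_mask):
    """Label a pixel D (1) if it lies in the spot and is below the spot's
    average intensity in both channels; otherwise N (0)."""
    inside = [(r, c) for r, row in enumerate(spot_mask)
              for c, v in enumerate(row) if v]
    avg1 = sum(ch1[r][c] for r, c in inside) / len(inside)
    avg2 = sum(ch2[r][c] for r, c in inside) / len(inside)
    labels = [[0] * len(row) for row in spot_mask]
    for r, c in inside:
        if ch1[r][c] < avg1 and ch2[r][c] < avg2:
            labels[r][c] = 1
    return labels

def estimate_transitions(labels, spot_mask):
    """Estimate P(D | west, north_west, north, north_east) by counting
    over the spot pixels. Configurations never observed are left out."""
    rows, cols = len(labels), len(labels[0])

    def state(r, c):
        return labels[r][c] if 0 <= r < rows and 0 <= c < cols else 0

    counts = {}
    for r in range(rows):
        for c in range(cols):
            if not spot_mask[r][c]:
                continue
            key = (state(r, c - 1), state(r - 1, c - 1),
                   state(r - 1, c), state(r - 1, c + 1))
            seen, defects = counts.get(key, (0, 0))
            counts[key] = (seen + 1, defects + labels[r][c])
    return {key: d / n for key, (n, d) in counts.items()}

# Toy 4 x 4 spot whose left half is dim in both channels.
channel = [[10, 10, 30, 30] for _ in range(4)]
mask = [[1] * 4 for _ in range(4)]
labels = label_defects(channel, channel, mask)
probs = estimate_transitions(labels, mask)
```

A single spot rarely shows all 16 neighbor configurations, so in practice the counts would likely need to be pooled over many spots before the probabilities are stable.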
slide-31
SLIDE 31

Characterization of Background

  • Base level and variation
    – Modeled as stationary across slide
  • Background defects
    – Marks, scratches, bright spots, other features
    – Modeled with 2D Markov random field

slide-32
SLIDE 32

Characterization of Background

  • Classify all background pixels as normal or defect
    – Defect is 2σ above background mean
  • Compute statistics on normal background
  • Apply 2D MRF to model defect state
    – Similar to region model
    – Intensities are modeled as beta distribution
  • Measures taken only by slide
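The first two steps can be sketched as follows. The sketch assumes the mean and σ used for the threshold are computed over all background pixels before classification, which the slide does not state; the function name and pixel values are hypothetical, and the MRF and beta-distribution fits are not shown.

```python
from statistics import mean, pstdev

def classify_background(pixels, n_sigma=2.0):
    """Split background pixels into normal and defect sets.

    A pixel is a defect if it exceeds mean + n_sigma * sd, where both
    statistics come from all background pixels (an assumption).
    """
    threshold = mean(pixels) + n_sigma * pstdev(pixels)
    normal = [p for p in pixels if p <= threshold]
    defect = [p for p in pixels if p > threshold]
    stats = {
        "normal_mean": mean(normal),
        "normal_sd": pstdev(normal),
        "defect_fraction": len(defect) / len(pixels),
    }
    return normal, defect, stats

# Hypothetical background: 19 ordinary pixels and one bright speck.
pixels = [98, 100, 102] * 6 + [100, 500]
normal, defect, stats = classify_background(pixels)
```

Recomputing the statistics on the normal set matters because a few bright defects inflate both the mean and the standard deviation of the raw background.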

[Figure: probability distribution of relative defect intensity. x-axis: Relative Defect Intensity; y-axis: Probability]

slide-33
SLIDE 33

Characterization of Gene Expression

  • Multivariate normal distribution for each sample (test or reference)
    – Mean vector
    – Covariance matrix
  • Linear model to account for global effects from slide to slide and dye effects

Sample = (mean gene expression) + slope * (slide perturbation) + (variable expression)
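One way to draw a sample from this linear model is sketched below, using a hand-written Cholesky factor to generate the correlated "variable expression" term. The per-gene slope vector, the single normally distributed perturbation per slide, and all numeric values are assumptions made for illustration.

```python
import math
import random

def cholesky(a):
    """Lower-triangular L with L * L^T = a, for a positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low

def simulate_sample(mean_expr, slope, cov, perturbation_sd, rng):
    """mean + slope * (slide perturbation) + (variable expression)."""
    n = len(mean_expr)
    low = cholesky(cov)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Correlated multivariate normal draw: L * z has covariance cov.
    variable = [sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(n)]
    # One global perturbation shared by every gene on the slide.
    perturbation = rng.gauss(0.0, perturbation_sd)
    return [mean_expr[i] + slope[i] * perturbation + variable[i]
            for i in range(n)]

rng = random.Random(1)
mean_expr = [8.0, 10.0]            # hypothetical log-scale expression levels
slope = [1.0, 0.5]                 # hypothetical per-gene response to the slide effect
cov = [[4.0, 2.0], [2.0, 3.0]]     # hypothetical gene-gene covariance
sample = simulate_sample(mean_expr, slope, cov, perturbation_sd=0.5, rng=rng)
```

For thousands of genes a dense factorization like this becomes expensive, which is one motivation for modeling only the strongest correlations.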

slide-34
SLIDE 34

Characterization of Gene Expression

  • Problem: Covariance matrix is BIG (5200x5200)
    – In simulation, we will have to diagonalize it.
  • Model the most significant correlations
    – Compute correlations between each pair of genes on each slide
    – Cluster genes by correlation distance
    – Each gene in a cluster has greater than 0.48 absolute correlation with every other gene in the cluster
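The cluster property stated here (every pair within a cluster exceeds the threshold) can be enforced with a simple greedy pass, sketched below. This is not necessarily the clustering method used in the talk: the greedy assignment, the use of Pearson correlation across a gene's profile, and the toy profiles are all assumptions.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences.

    Raises ZeroDivisionError if either sequence is constant.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def cluster_by_correlation(profiles, threshold=0.48):
    """Greedy clustering: a gene joins the first cluster in which its
    absolute correlation with EVERY member exceeds the threshold;
    otherwise it starts a new cluster.

    profiles: dict mapping gene name to a list of expression values.
    """
    clusters = []
    for gene, values in profiles.items():
        for cluster in clusters:
            if all(abs(pearson(values, profiles[g])) > threshold
                   for g in cluster):
                cluster.append(gene)
                break
        else:
            clusters.append([gene])
    return clusters

# Hypothetical expression profiles for four genes.
profiles = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.5],
    "g3": [4.0, 3.0, 2.0, 1.0],    # anti-correlated with g1, so it still joins
    "g4": [1.0, -1.0, 1.0, -1.0],
}
clusters = cluster_by_correlation(profiles)
```

The result of a greedy pass depends on the order in which genes are visited, so a hierarchical complete-linkage method would be a more stable way to obtain clusters with the same guarantee.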

slide-35
SLIDE 35

Analyzing Characterization Data

  • Two-way ANOVA
    – By slide (fixed)
    – By pin (random)
  • Which properties varied more?
    – By slide
    – By pin
    – By spot
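To make the "which varied more" question concrete, the sketch below splits the total sum of squares of a balanced slide-by-pin layout into slide, pin, interaction, and within-cell (spot) parts and reports each as a share of the total. This is a simplified stand-in, not the mixed-model analysis itself: it ignores the fixed/random distinction, assumes every slide-pin cell has the same number of spots, and uses made-up data.

```python
def variance_shares(data):
    """data[i][j] is the list of spot measurements for slide i and pin j.

    Returns each source's share of the total sum of squares.
    Raises ZeroDivisionError if all measurements are identical.
    """
    n_s, n_p, n_r = len(data), len(data[0]), len(data[0][0])
    cell = [[sum(spots) / n_r for spots in row] for row in data]
    slide_mean = [sum(row) / n_p for row in cell]
    pin_mean = [sum(cell[i][j] for i in range(n_s)) / n_s for j in range(n_p)]
    grand = sum(slide_mean) / n_s
    ss = {
        "slide": n_p * n_r * sum((m - grand) ** 2 for m in slide_mean),
        "pin": n_s * n_r * sum((m - grand) ** 2 for m in pin_mean),
        "slide x pin": n_r * sum(
            (cell[i][j] - slide_mean[i] - pin_mean[j] + grand) ** 2
            for i in range(n_s) for j in range(n_p)),
        "spot": sum((x - cell[i][j]) ** 2
                    for i in range(n_s) for j in range(n_p)
                    for x in data[i][j]),
    }
    total = sum(ss.values())
    return {source: value / total for source, value in ss.items()}

# Hypothetical spot radii: 2 slides x 2 pins x 2 spots per cell.
data = [
    [[10.0, 12.0], [11.0, 13.0]],
    [[10.0, 14.0], [9.0, 13.0]],
]
shares = variance_shares(data)
```

A small "slide x pin" share in a decomposition like this is the kind of evidence that would support modeling slide and pin effects separately.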

slide-36
SLIDE 36

Analyzing Characterization Data

  • Spot morphology measures
  • Spot secondary intensity measures
  • Spot defect model parameters
  • Background defect model parameters (by slide measurement only - no ANOVA)

Only spots with separability > 1 used in ANOVA

slide-37
SLIDE 37

Outline

  • Microarray Simulation Project
  • Characterization of Microarray Images
  • Results of Characterization
  • Simulations
  • Conclusion
slide-38
SLIDE 38

Results

Sometimes the images have their own story to tell.

slide-39
SLIDE 39

Results: Spot Morphology

  • Most variation (75% for size measures) was attributed to variation by spot
  • Pins behaved similarly (mostly)
  • Slides showed some differences in last eight slides (mice five and six)

slide-40
SLIDE 40

Results: Spot Morphology

[Figure: Spot size vs. Pin Number. x-axis: Pin Number; y-axis: Radius (pixels)]

slide-41
SLIDE 41

Results: Spot Morphology

[Figure: Spot size vs. Slide Number. x-axis: Slide Number; y-axis: Radius (pixels); labels: Mouse 1 through Mouse 6]

slide-42
SLIDE 42

Results: Spot Intensities

  • Most variation in separability (83-90%) was attributed to variation by spot
  • Spot uniformity varied considerably by slide, mostly due to last eight slides

slide-43
SLIDE 43

Results: Spot Intensities

[Figure: Spot uniformity (532 nm) vs. Slide Number. x-axis: Slide Number; y-axis: Uniformity (532 nm); labels: Mouse 1 through Mouse 6]

slide-44
SLIDE 44

Results: Spot Defect MRF

  • The 16 region transition probability parameters varied by pin
    – Model the MRF as a property of a pin, not a slide
  • The mean intensity of defect region was strongly dependent on the pin.
  • Mean intensity of defect region varied considerably by slide.

slide-45
SLIDE 45

Results: Spot Defect MRF

[Figure: Defect region intensity vs. Slide Number. x-axis: Slide Number; y-axis: Low region mean intensity relative to normal region mean; labels: Mouse 1 through Mouse 6]

slide-46
SLIDE 46

Results: Background MRF

  • Last eight slides had more intense background defects
  • Last eight also had higher probabilities of generating a defect

slide-47
SLIDE 47

Results: Background MRF

[Figure: Background defect intensity vs. Slide Number. x-axis: Slide Number; y-axis: Intensity of Background Defects Relative to Background Mean; labels: Mouse 1 through Mouse 6]

slide-48
SLIDE 48

Results: General

  • Slide-pin interactions were small (<5% of variance in all cases)
  • Therefore, modeling of slide and pin effects separately is justified.

slide-49
SLIDE 49

Results: Summary

  • Characterization shows differences in the properties of slides for mice five and six:
    – Spots were more likely to be broken.
    – Spot breaks were more severe.
    – Background defects were more numerous.
    – Background defects were more intense.

Did this impact the estimated mouse-to-mouse variation?

slide-50
SLIDE 50

Results

[Images: Slide 2 (Mouse 1); Slide 19 (Mouse 5)]

slide-51
SLIDE 51

Outline

  • Microarray Simulation Project
  • Characterization of Microarray Images
  • Results of Characterization
  • Simulations
  • Conclusion
slide-52
SLIDE 52

Simulations

[Images: Slide 2 (Mouse 1); Simulation from mouse 1-4 properties]

slide-53
SLIDE 53

Simulations

[Images: Slide 2 (Mouse 1); Simulation from mouse 1-4 properties]

slide-54
SLIDE 54

Simulations

[Images: Slide 19 (Mouse 5); Simulation from mouse 5 and 6 properties]

slide-55
SLIDE 55

Simulations

[Images: Slide 19 (Mouse 5); Simulation from mouse 5 and 6 properties]

slide-56
SLIDE 56

Outline

  • Microarray Simulation Project
  • Characterization of Microarray Images
  • Results of Characterization
  • Simulations
  • Conclusion
slide-57
SLIDE 57

Conclusions

  • Characterization of microarray images can reveal important effects
    – In the mouse kidney set, the slides from two mice may have been handled differently.
  • Realistic simulation of microarray images may allow us to estimate the effects of variations due to parts of the microarray system.

slide-58
SLIDE 58

To Do List

  • Noncausal MRF for spot and background defects
  • Multiscale modeling of large defects
  • Simulation study to estimate effects of spot uniformity and background defects