MGA Molecular Genome Analysis - Biostatistics and Modelling

SLIDE 1

MGA

Molecular Genome Analysis - Biostatistics and Modelling

Sunday Sep 26th 2010

Dynamic Deterministic Effects Propagation Networks - Learning signalling pathways from longitudinal data

Christian Bender

SLIDE 2

Christian Bender Biostatistics - Molecular Genome Analysis

Contents

1) Introduction and motivation:
   a) Model system: EGF-Receptor (ERBB) signalling network
   b) Goals and experimental setup
   c) Technology: Reverse Phase Protein Arrays
2) Network reconstruction framework: DDEPN
   a) Overview
   b) System state generation and state sequence optimisation
   c) Likelihood calculation
   d) Network structure search by genetic algorithm
3) Testing and application to longitudinal ERBB data set
4) Conclusions

SLIDE 3

1) Model system: EGF-Receptor signalling

  • Important cancer related signalling pathway
  • ERBB2 overexpressed in 25-30% of human breast tumours
  • Stimulation by Epidermal Growth Factor (EGF) and Heregulin (HRG)

SLIDE 4

1) Goals and experimental setup

  • Goal: analyse response of various cell lines to different stimuli

Experimental setup:

  • HCC1954 cell line, ERBB2 positive
  • 2 ligands, EGF and HRG, plus the combination EGF+HRG
  • 5 biological replicates, 3 technical replicates
  • 10 time points between 0 and 60 minutes after stimulation
  • Measurements of phosphoprotein abundance generated on Reverse Phase Protein Arrays

  • 16 antibodies targeting phosphoproteins

SLIDE 5

1) Technology: Reverse phase protein arrays (RPPA)

  • Cell line lysate (single lysate spot)
  • Primary antibody detects target protein
  • IR-dye labeled secondary antibody binds to primary antibody
  • Visualisation/Quantification

SLIDE 6

2) Example plots of the measured data after EGF stimulation

SLIDE 7

2) Dynamic Deterministic Effects Propagation Networks (DDEPN)

  • Aims for modelling approach:
  • Include time dependencies and perturbations (stimulatory/inhibitory)
  • Model activating and inhibiting edges
  • Model the signal flow from stimulations down the signalling cascade:
  • Define protein status (active/passive) during this propagation
  • Estimate Gaussian distributions for the active/passive state of each protein
  • Calculate model likelihood depending on activation states
  • Optimise to find best network

SLIDE 8

2) DDEPN framework

[Figure: DDEPN workflow]

  • Data X: proteins S, A, B measured at time points t1 ... t4, with replicate measurements (rows xS1 .. xS4, xA1 .. xA4, xB1 .. xB4)
  • Network hypothesis on the nodes S, A, B
  • Signal propagation: matrix of reachable system states Γ (rows S, A, B; columns 1, 2, 3; S active in 3 states, A in 2, B in 1)
  • HMM: optimal state sequence $\hat{\Gamma}^*$ (rows S, A, B; columns t1 ... t4; S active at 4 time points, A at 2, B at 1)
  • Parameter estimation: $\hat{\Theta}$
  • Likelihood calculation: $p(X \mid \hat{\Gamma}^*, \hat{\Theta})$
  • Modify network hypothesis -> genetic algorithm

SLIDE 9

2) Generation of reachable system states

  • N nodes give rise to 2^N system states
  • Depending on the network structure, some state vectors can never be reached => reduce to the states that are implied by the network

Network hypothesis on the nodes S, A, B

State of protein v_i in step k:

  • Start at step 1: stimulus S is active
  • If at least one activating parent is 1 and no inhibiting parent is 1 in step k-1, set to 1
  • Example: state of protein B in step 2
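The propagation rule above can be sketched in a few lines of Python. This is only an illustration, not the ddepn implementation: the function name, the edge encoding (+1 for activation, -1 for inhibition) and the stopping criterion (stop once a state repeats) are assumptions, as is setting a node to 0 whenever the activation rule does not fire.

```python
def reachable_states(nodes, edges, stimuli):
    """Propagate activity from the stimuli through a signed network.

    nodes   : list of node names (fixes the order of each state vector)
    edges   : dict mapping (parent, child) -> +1 (activation) or -1 (inhibition)
    stimuli : set of nodes that are active from step 1 on
    """
    index = {v: i for i, v in enumerate(nodes)}
    # Step 1: only the stimuli are active.
    state = tuple(1 if v in stimuli else 0 for v in nodes)
    states = [state]
    while True:
        nxt = []
        for v in nodes:
            if v in stimuli:
                nxt.append(1)  # stimuli stay switched on
                continue
            parents = [(p, sign) for (p, c), sign in edges.items() if c == v]
            activated = any(sign > 0 and state[index[p]] == 1 for p, sign in parents)
            inhibited = any(sign < 0 and state[index[p]] == 1 for p, sign in parents)
            # Assumption: a node is 0 unless the activation rule fires.
            nxt.append(1 if activated and not inhibited else 0)
        state = tuple(nxt)
        if state in states:  # assumed stopping rule: no new state appears
            return states
        states.append(state)


# Hypothetical chain S -> A -> B, both edges activating.
chain = reachable_states(["S", "A", "B"], {("S", "A"): 1, ("A", "B"): 1}, {"S"})
print(chain)
```

For the three-node chain only 3 of the 2^3 = 8 possible state vectors are reached, which is the reduction the slide refers to.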

SLIDE 10

2) Most likely system state series using an HMM

  • We do not know which state is reached at which time point

=> find series of system states using an HMM

[Figure: Viterbi training, with the transition matrix and the model parameters marked as unknown (?)]
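The decoding step of such an HMM can be sketched as follows. This is a generic Viterbi decoder, not the ddepn code: it assumes that the hidden states are the reachable system states, that each protein's measurement is Gaussian with separate parameters for passive (0) and active (1), and that transition and initial probabilities are given. All names and the toy numbers are hypothetical. In Viterbi training this decoding would alternate with re-estimating the parameters from the decoded path.

```python
import math


def log_gauss(x, mu, sd):
    """Log density of a normal distribution."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mu) ** 2 / (2 * sd * sd)


def viterbi(obs, states, params, log_trans, log_init):
    """Most likely sequence of system states.

    obs[t][i]       : measurement of protein i at time point t
    states[k][i]    : 0/1 activity of protein i in system state k
    params[i][a]    : (mean, sd) of protein i when passive (a=0) or active (a=1)
    log_trans[j][k] : log transition probability from state j to state k
    log_init[k]     : log probability of starting in state k
    """
    def emit(t, k):
        return sum(log_gauss(x, *params[i][states[k][i]]) for i, x in enumerate(obs[t]))

    n_states = len(states)
    score = [log_init[k] + emit(0, k) for k in range(n_states)]
    back = []
    for t in range(1, len(obs)):
        prev, score, ptr = score, [], []
        for k in range(n_states):
            best = max(range(n_states), key=lambda j: prev[j] + log_trans[j][k])
            ptr.append(best)
            score.append(prev[best] + log_trans[best][k] + emit(t, k))
        back.append(ptr)
    # Trace back from the best final state.
    k = max(range(n_states), key=lambda j: score[j])
    path = [k]
    for ptr in reversed(back):
        k = ptr[k]
        path.append(k)
    return path[::-1]


# Toy example: three system states of a chain S -> A -> B, four time points.
states = [(1, 0, 0), (1, 1, 0), (1, 1, 1)]
params = [[(0.0, 1.0), (5.0, 1.0)]] * 3  # assumed passive/active Gaussians
uniform = math.log(1 / 3)
obs = [[5, 0, 0], [5, 0, 0], [5, 5, 0], [5, 5, 5]]
path = viterbi(obs, states, params, [[uniform] * 3] * 3, [uniform] * 3)
print(path)
```

The returned indices refer to columns of the reachable-state matrix, so the path assigns one system state to each time point.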

SLIDE 11

2) Likelihood of network hypothesis given system state matrix

  • Given a state sequence matrix $\hat{\Gamma}^*$, each data point follows one of two Gaussian distributions:

  • The full likelihood for a network hypothesis Φ is:
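A minimal sketch of how such a likelihood could be evaluated, assuming each protein gets one Gaussian for its passive and one for its active time points, with mean and standard deviation estimated from the replicates assigned to that state. The function name, the data layout and the handling of states with fewer than two measurements are illustrative assumptions, not the ddepn implementation.

```python
import math
from statistics import mean, pstdev


def log_likelihood(X, gamma):
    """Log-likelihood of the data given a state sequence matrix.

    X[i][t]     : list of replicate measurements of protein i at time point t
    gamma[i][t] : 0 (passive) or 1 (active) for protein i at time point t
    """
    total = 0.0
    for x_i, g_i in zip(X, gamma):
        for activity in (0, 1):
            # Pool all replicates of this protein that fall into this state.
            vals = [x for reps, g in zip(x_i, g_i) if g == activity for x in reps]
            if len(vals) < 2:
                continue  # assumption: too few points to fit a Gaussian, skip
            mu = mean(vals)
            sd = max(pstdev(vals), 1e-6)  # floor avoids a zero variance
            total += sum(
                -0.5 * math.log(2 * math.pi * sd * sd) - (x - mu) ** 2 / (2 * sd * sd)
                for x in vals
            )
    return total


# One protein, four time points, two replicates each (made-up numbers).
X = [[[0.0, 0.2], [0.1, -0.1], [5.0, 5.2], [4.9, 5.1]]]
print(log_likelihood(X, [[0, 0, 1, 1]]))  # states match the jump in the data
print(log_likelihood(X, [[0, 1, 0, 1]]))  # states disagree with the data
```

A state matrix that separates low from high measurements yields two narrow Gaussians and therefore a higher likelihood than one that mixes them.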

SLIDE 12

2) Genetic algorithm for optimising a population of networks

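[Figure: one generation of the genetic algorithm on a population P of networks]

  • Population P: Φ1, Φ2, Φ3, Φ4, ordered by likelihood p(Φ1) ≥ p(Φ2) ≥ p(Φ3) ≥ p(Φ4)
  • Selection: Φ1 and Φ2 enter the new population P', with p(Φ1) ≥ p(Φ2)
  • Crossing over: yields Φ3' and Φ4' with likelihoods p(Φ3') and p(Φ4')
  • Mutation: P' becomes Φ1, Φ2', Φ3'', Φ4' with likelihoods p(Φ1), p(Φ2'), p(Φ3''), p(Φ4'); their ranking is marked as open (?)

Repeat:

1) Selection/Crossover proportional to network likelihoods ⇒ keep 'good' networks
2) Mutation introduces randomness in network evolution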
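The select, cross over, mutate loop can be sketched as below. This is a simplified illustration with several assumptions: networks are flat tuples of edge values (-1 inhibition, 0 no edge, +1 activation), selection keeps the top-ranked networks (truncation selection) rather than sampling proportionally to likelihood, the best network is carried over unchanged, and the fitness function is a toy stand-in for the network likelihood. All names are hypothetical.

```python
import random


def evolve(population, fitness, rng, n_keep=2, p_mut=0.1):
    """One generation: selection, crossing over, mutation."""
    # Selection: rank by fitness and keep the best n_keep networks.
    ranked = sorted(population, key=fitness, reverse=True)
    parents = ranked[:n_keep]

    # Crossing over: refill the population with one-point recombinations.
    children = []
    while len(parents) + len(children) < len(population):
        a, b = rng.sample(parents, 2)
        cut = rng.randrange(1, len(a))
        children.append(a[:cut] + b[cut:])

    # Mutation: each edge is redrawn with probability p_mut.
    def mutate(net):
        return tuple(rng.choice((-1, 0, 1)) if rng.random() < p_mut else e for e in net)

    # Assumption: the best network survives unmutated.
    return parents[:1] + [mutate(n) for n in parents[1:] + children]


# Toy fitness: negative number of edges that differ from a made-up target.
target = (1, 0, -1, 0, 1, 0)
fitness = lambda net: -sum(x != y for x, y in zip(net, target))

rng = random.Random(0)
population = [tuple(rng.choice((-1, 0, 1)) for _ in target) for _ in range(4)]
for _ in range(30):
    population = evolve(population, fitness, rng)
print(max(population, key=fitness))
```

Because the top network is never mutated, the best fitness in the population cannot decrease from one generation to the next.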

SLIDE 13

3) Testing: Increasing the number of perturbations

  • nstim: single treatments, e.g. EGF
  • cstim: combined treatments, e.g. EGF+HRG
  • Substantial increase of AUC by inclusion of multiple stimuli
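
The slide does not say how the AUC was obtained. One common way to score a reconstructed network, shown here only as a hedged sketch, is to rank all candidate edges by a support score and compute the area under the ROC curve against a known reference network. The function name and the example scores are made up.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores : support value for each candidate edge
    labels : 1 if the edge is in the reference network, else 0
    """
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    # Fraction of (true edge, false edge) pairs ranked correctly; ties count half.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


print(auc([0.9, 0.2, 0.8, 0.1], [1, 1, 0, 0]))
```

An AUC of 0.5 corresponds to random edge ranking and 1.0 to a perfect separation of true and false edges.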

SLIDE 14

3) Testing: Comparison to related methods G1DBN and ebdbNet

SLIDE 15

3) Resulting network from ERBB data

  • Identified 13 of 22 edges in agreement with current literature
  • Edges show high support from the data (see edge numbers)

SLIDE 16

4) Summary and conclusions

  • Reconstruction of signalling networks under external perturbations
  • Model effect of both external stimulation and inhibition
  • Model activating as well as inhibiting edges
  • Good theoretical performance of the algorithm
  • Successful reconstruction of ERBB signalling interactions from RPPA data

  • R-package ddepn available on CRAN: http://cran.r-project.org/

SLIDE 17

Acknowledgements

MGA – Lab work

  • Frauke Henjes
  • Ulrike Korf
  • Vivian Szabo

MGA – Biostatistics & Modelling

  • Anika Jöcker

Cancer Genome Research - Division of Molecular Genetics

  • Maria Fälth
  • Marc Johannes
  • Stephan Gade

Bonn-Aachen international center for IT

  • Holger Fröhlich

University Medicine Göttingen

  • Tim Beißbarth

... for supervision of my PhD thesis

ECCB10 Travel Fellowship
