Computational Modeling and Analysis for Complex Systems Edmund M. - - PowerPoint PPT Presentation
Computational Modeling and Analysis for Complex Systems Edmund M. - - PowerPoint PPT Presentation
Computational Modeling and Analysis for Complex Systems Edmund M. Clarke, Paolo Zuliani School of Computer Science Carnegie Mellon University http://cmacs.cs.cmu.edu Apologies Lecture Structure I. Introduction to Model Checking II. Theory:
Apologies
Lecture Structure
I. Introduction to Model Checking
- II. Theory: Statistical Model Checking
- III. Applications:
- Hybrid systems - Verification of Stateflow/Simulink models
- Biological systems - Signaling pathways
Intel Pentium FDIV Bug
- Try 4195835 – 4195835 / 3145727 * 3145727.
In 94’ Pentium, it doesn’t return 0, but 256.
- Intel uses the SRT algorithm for floating point division.
Five entries in the lookup table are missing.
- Cost: $400 - $500 million
- Xudong Zhao’s Thesis on Word Level Model Checking
Temporal Logic Model Checking
- Model checking is an automatic verification technique
for finite state concurrent systems.
- Developed independently by Clarke and Emerson and
by Queille and Sifakis in early 1980’s.
- Specifications are written in propositional temporal
- logic. (Pnueli 77)
- Verification procedure is an intelligent exhaustive
search of the state space of the design.
- No proofs!!! (Algorithmic rather than Deductive)
- Fast (compared to other rigorous methods such as
theorem proving)
- Diagnostic counterexamples
- No problem with partial specifications
- Logics can easily express many concurrency properties
Advantages of Model Checking
Main Disadvantage State Explosion Problem:
2-bit counter
0,0 0,1 1,1 1,0
n-bit counter has 2n states
1 2 3 a b c
||
n states, m processes
1,a 2,a 1,b 2,b 3,a 1,c 3,b 2,c 3,c
nm states
Main Disadvantage (Cont.)
State Explosion Problem:
Main Disadvantage (Cont.)
Unavoidable in worst case, but steady progress over the past 28 years using clever algorithms, data structures, and engineering
Determines Patterns on Infinite Traces
Atomic Propositions Boolean Operations Temporal operators
a “a is true now”
X a “a is true in the neXt state” F a “a will be true in the Future” G a “a will be Globally true in the future” a U b “a will hold true Until b becomes true”
LTL - Linear Time Logic (Pn 77)
a
Determines Patterns on Infinite Traces
Atomic Propositions Boolean Operations Temporal operators a “a is true now”
X a “a is true in the neXt state”
F a “a will be true in the Future” G a “a will be Globally true in the future” a U b “a will hold true Until b becomes true”
LTL - Linear Time Logic (Pn 77)
a
Determines Patterns on Infinite Traces
Atomic Propositions Boolean Operations Temporal operators a “a is true now” X a “a is true in the neXt state”
F a “a will be true in the Future”
G a “a will be Globally true in the future” a U b “a will hold true Until b becomes true”
LTL - Linear Time Logic (Pn 77)
a
Determines Patterns on Infinite Traces
Atomic Propositions Boolean Operations Temporal operators a “a is true now” X a “a is true in the neXt state” F a “a will be true in the Future”
G a “a will be Globally true in the future”
a U b “a will hold true Until b becomes true”
LTL - Linear Time Logic (Pn 77)
a a a a a
Determines Patterns on Infinite Traces
Atomic Propositions Boolean Operations Temporal operators a “a is true now” X a “a is true in the neXt state” F a “a will be true in the Future” G a “a will be Globally true in the future”
a U b “a will hold true Until b becomes true”
LTL - Linear Time Logic (Pn 77)
a a a a b
Branching Time (EC 80, BMP 81)
CTL: Computation Tree Logic
EF g “g will possibly become true”
CTL: Computation Tree Logic
AF g “g will necessarily become true”
CTL: Computation Tree Logic
AG g “g is an invariant”
CTL: Computation Tree Logic
EG g “g is a potential invariant”
CTL: Computation Tree Logic
CTL uses the temporal operators
AX, AG, AF, AU EX, EG, EF, EU
CTL* allows complex nestings such as AXX, AGX, EXF, ...
- A CTL formula not expressible in LTL
AG (EF p)
“from any state p will necessarily become true”
- A LTL formula not expressible in CTL
A (FG p)
“p will necessarily become an invariant”
- A CTL* formula not expressible in either CTL or LTL
AG (EF p) v A (FG p)
07/16/09 07/16/09 07/16/09 07/16/09 07/16/09 07/16/09
CTL, LTL, CTL*
Model Checking Problem
- Let M be a state-transition graph.
- Let ƒ be the specification in temporal logic.
- Find all states s of M such that M, s |= ƒ.
- CTL Model Checking: CE 81; CES 83/86; QS 81/82.
- LTL Model Checking: LP 85.
- Automata Theoretic LTL Model Checking: VW 86.
- CTL* Model Checking: EL 85.
State-transition graph describes system evolving
- ver time.
Trivial Example
~ Start ~ Close ~ Heat ~ Error Start ~ Close ~ Heat Error ~ Start Close ~ Heat ~ Error ~ Start Close Heat ~ Error Start Close Heat ~ Error Start Close ~ Heat ~ Error Start Close ~ Heat Error
Microwave Oven
Temporal Logic and Model Checking
- The oven doesn’t heat up until the door is closed
- Not heat_up holds until door_closed
- (~ heat_up) U door_closed
Transition System
(Automaton, Kripke structure)
Hardware Description
(VERILOG, VHDL, SMV)
Informal Specification Temporal Logic Formula
(CTL, LTL, etc.)
Model Checking
Transition System Informal Specification Temporal Logic Formula
(CTL, LTL, etc.)
Safety Property:
bad state unreachable:
satisfied
Initial State
Counterexamples
Program or circuit
Transition System Program or circuit Informal Specification Temporal Logic Formula
(CTL, LTL, etc.)
Initial State
Safety Property:
bad state unreachable
Counterexample
Counterexamples
Transition System Program or circuit Informal Specification Temporal Logic Formula
(CTL, LTL, etc.)
Initial State
Safety Property:
bad state unreachable
Counterexamples
Counterexample
- Bounded Model Checking
- Biere, Cimatti, Clarke, Zhu 99
- Using Fast SAT solvers
- Can handle thousands
- f state elements
Can the given property fail in k steps?
I(V0) Λ T(V0,V1) Λ … Λ T(Vk-1,Vk) Λ (¬ P(V0) V … V ¬ P(Vk))
k-steps Property fails in some step Initial state BMC in practice: Circuit with 9510 latches, 9499 inputs BMC formula has 4 x 106 variables, 1.2 x 107 clauses Shortest bug of length 37 found in 69 seconds
Two Big Breakthroughs on State Space Explosion Problem
- Counterexample Guided Abstraction Refinement
(CEGAR)
- Clarke, Grumberg, Jha, Lu, Veith 2000
- Used in most software model checkers
Two Big Breakthroughs on State Space Explosion Problem (Cont’d)
Existential Abstraction
M M
Given an abstraction function : S S , the concrete states are grouped and mapped into abstract states:
Preservation Theorem ?
Preservation Theorem
- Theorem (Clarke, Grumberg, Long) If property holds on
abstract model, it holds on concrete model.
- Technical conditions
- Property is universal i.e., no existential quantifiers
- Atomic formulas respect abstraction mapping
- Converse implication is not true !
Spurious Behavior
AG AF red
“Every path necessarily leads back to red.”
Spurious Counterexample: <go><go><go><go> ...
Artifact of the abstraction !
“red” “go”
Automatic Abstraction
M
Original Model Refinement Refinement
M
Initial Abstraction
Spurious Spurious counterexample
Validation or Counterexample
Correct !
CEGAR
CounterExample-Guided Abstraction Refinement Circuit or Program
Initial Abstraction
Simulator
No error
- r bug found
Property holds Simulation successful Bug found Abstraction refinement
Refinement Model Checker
Verification Spurious counterexample Counterexample
Abstract Model
End of Part I
Problem
Verification of Stochastic Systems
- Uncertainties in the system environment, modeling a fault,
stochastic processors, biological signaling pathways ...
- Modeling uncertainty with a distribution → Stochastic systems
- Models:
- for example, Discrete, Continuous Time Markov Chains
- Property specification:
- “does the system fulfill a request within 1.2 ms with probability at least
.99”?
- If Ф = “system fulfills request within 1.2 ms”, decide between:
P≥.99 (Ф) or P<.99 (Ф)
Equivalently
- A biased coin (Bernoulli random variable):
- Prob (Head) = p
Prob (Tail) = 1-p
- p is unknown
- Question: Is p ≥ θ ? (for a fixed 0<θ<1)
- A solution: flip the coin a number of times, collect the
- utcomes, and use:
- Statistical hypothesis testing: returns yes/no
- Statistical estimation: returns “p in (a,b)” (and compare a with θ)
07/16/09 07/16/09 07/16/09 07/16/09 07/16/09 07/16/09
Towards verification
Stochastic system M Property Ф
+ =
Biased coin!
Key: define a probability measure on the set of traces (simulations) of M. The set of traces satisfying Ф is measurable.
Key idea
- Suppose system behavior w.r.t. a (fixed) property Ф can be
modeled by a Bernoulli random variable of parameter p:
- System satisfies Ф with (unknown) probability p
- Question: P≥θ (Ф)? (for a fixed 0<θ<1)
- Draw a sample of system simulations and use:
- Statistical hypothesis testing: Null vs. Alternative hypothesis
- Statistical estimation: returns “p in (a,b)” (and compare a with θ)
Statistical Model Checking
Motivation
- State Space Exploration infeasible for large systems
- Symbolic MC with OBDDs scales to 10300 states
- Scalability depends on the structure of the system
- Pros: Simulation is feasible for many more systems
- Often easier to simulate a complex system than to build the
transition relation for it
- Easier to parallelize
- Cons: answers may be wrong
- But error probability can be bounded
Existing Work
- [Younes and Simmons 06] use Wald’s SPRT
– SPRT: Sequential Probability Ratio Test
- The SPRT decides between
– the simple null hypothesis vs – the simple alternative hypothesis
- SPRT is asymptotically optimal (when or is true)
– Minimizes the expected number of samples – Among all tests with equal or smaller error probability.
- But MC chooses between two mutually exclusive
composite hypotheses
Null Hypothesis vs Alternate Hypothesis
- We have developed a new statistical MC algorithm
– Sequential sampling – Performs Composite Hypothesis Testing and Estimation – Based on Bayes Theorem and the Bayes Factor.
Bayesian Statistical Model Checking
S. K. Jha, E. M. Clarke, C. J. Langmead, A. Legay, A. Platzer, P. Zuliani. CMSB 2009. P. Zuliani, A. Platzer, E. M. Clarke. HSCC 2010.
Bayesian Statistics
Three ingredients:
- 1. Prior probability
- Models our initial (a priori) uncertainty/belief about
parameters (what is Prob(p ≥ θ) ?)
- 2. Likelihood function
- Describes the distribution of data (e.g., a sequence of
heads/tails), given a specific parameter value
- 3. Bayes Theorem
- Revises uncertainty upon experimental data - compute
Prob(p ≥ θ | data)
Bounded Linear Temporal Logic
- Bounded Linear Temporal Logic (BLTL): Extension of LTL
with time bounds on temporal operators.
- Let σ = (s0, t0), (s1, t1), . . . be an execution of the model
- along states s0, s1, . . .
- the system stays in state si for time ti
- divergence of time: Σi ti diverges (i.e., non-zeno)
- σi: Execution trace starting at state i.
- A model for simulation traces (e.g. Simulink, BioNetGen)
Semantics of BLTL
The semantics of BLTL for a trace σk:
- σk
ap iff atomic proposition ap true in state sk
- σk
Φ1 v Φ2 iff σk Φ1 or σk Φ2
- σk
¬Φ iff σk Φ does not hold
- σk
Φ1 Ut Φ2 iff there exists natural i such that
1)
σk+i Φ2 2) Σj<i tk+j ≤ t 3) for each 0 ≤ j < i, σk+j Φ1
“within time t, Φ2 will be true and Φ1 will hold until then”
- In particular, Ft Φ = true Ut Φ,
Gt Φ = ¬Ft ¬Φ
- Simulation traces are finite: is σ╞═ Φ well defined?
- Definition: The time bound of Φ:
- #(ap) = 0
- #(¬Φ) = #(Φ)
- #(Φ1 v Φ2) = max (#(Φ1), #(Φ2))
- #(Φ1 Ut Φ2) = t + max (#(Φ1), #(Φ2))
- Lemma: “Bounded simulations suffice”
Let Ф be a BLTL property, and k≥0. For any two infinite traces ρ, σ such that ρk and σk “equal up to time #(Ф)” we have
ρk ╞═ Φ iff σk ╞═ Φ
Semantics of BLTL (cont’d)
Sequential Bayesian Statistical MC - I
- Model Checking
- Suppose satisfies with (unknown) probability p
- p is given by a random variable (defined on [0,1]) with density g
- g represents the prior belief that satisfies
- Generate independent and identically distributed (iid)
sample traces.
- xi: the ith sample trace satisfies
- xi = 1 iff
- xi = 0 iff
- Then, xi will be a Bernoulli trial with conditional density
(likelihood function) f(xi|u) = uxi(1 − u)1-xi
- a sample of Bernoulli random variables
- Prior probabilities P(H0), P(H1) strictly positive, sum to 1
- Posterior probability (Bayes Theorem [1763])
for P(X) > 0
- Ratio of Posterior Probabilities:
Bayes Factor
Sequential Bayesian Statistical MC - II
Sequential Bayesian Statistical MC - III
- Recall the Bayes factor
- Jeffreys’ [1960s] suggested the Bayes factor as a statistic:
- For fixed sample sizes
- For example, a Bayes factor greater than 100 “strongly supports” H0
- We introduce a sequential version of Jeffrey’s test
- Fix threshold T ≥ 1 and prior probability.
Continue sampling until
- Bayes Factor > T: Accept H0
- Bayes Factor < 1/T: Reject H0
Require: Property P≥θ(Φ), Threshold T ≥ 1, Prior density g
n := 0 {number of traces drawn so far} x := 0 {number of traces satisfying Φ so far} repeat σ := draw a sample trace of the system (iid) n := n + 1 if σ Φ then x := x + 1 endif B := BayesFactor(n, x, θ, g) until (B > T v B < 1/T ) if (B > T ) then return “H0 accepted” else return “H0 rejected” endif
Sequential Bayesian Statistical MC - IV
- The Bayes Factor uses posterior (and prior) probability
- Posterior density (Bayes Theorem) (iid Bernoulli samples)
07/16/09 07/16/09 07/16/09 07/16/09 07/16/09 07/16/09
Computing the Bayes Factor - I
Likelihood function
Definition: Bayes Factor of sample X and hypotheses H0, H1 is
- prior g is Beta of parameters α>0, β>0
joint (conditional) density of independent samples
Computing the Bayes Factor - II
07/16/09 07/16/09 07/16/09 07/16/09 07/16/09 07/16/09
Why Beta priors?
- Defined over [0,1]
- Beta distributions are conjugate to Binomial distributions:
- If prior g is Beta and likelihood function is Binomial
then posterior is Beta
- Suppose likelihood Binomial(n,x), prior Beta(α,β): posterior
f(u | x1,…,xn) ≈ f(x1|u) ∙ ∙ ∙ f(xn|u) ∙ g(u) = ux(1 − u)n-x ∙ uα-1(1 − u)β-1 = ux+α -1(1 − u)n-x+β-1 where x = Σi xi
- Posterior is Beta of parameters x+α and n-x+β
0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0.5 1 1.5 2 2.5 3 3.5
α=.5 β=.5
0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 1 2 3 4 5 6 7 8
α=2 β=.5
0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0.5 1 1.5 2 2.5 3 3.5
α=4 β=10
Beta Density Shapes
Unimodal, but can form convex combinations …
Proposition The Bayes factor of H0:M╞═ P≥θ (Φ) vs H1:M╞═ P<θ (Φ) for n Bernoulli samples (with x≤n successes) and prior Beta(α,β) where F(∙,∙)(∙) is the Beta distribution function.
- No need of integration when computing the Bayes factor
Computing the Bayes Factor - III
Theorem (Termination). The Sequential Bayesian Statistical MC algorithm terminates with probability one. Theorem (Error bounds). When the Bayesian algorithm – using threshold T – stops, the following holds: Prob (“accept H0” | H1) ≤ 1/T Prob (“reject H0” | H0) ≤ 1/T
Note: bounds independent from the prior distribution.
Correctness
Bayesian Interval Estimation - I
- Estimating the (unknown) probability p that “system╞═ Ф”
- Recall: system is modeled as a Bernoulli of parameter p
- Bayes’ Theorem (for iid Bernoulli samples)
- We thus have the posterior distribution
- So we can use the mean of the posterior to estimate p
- mean is a posterior Bayes estimator for p (it minimizes the
integrated risk over the parameter space, under a quadratic loss)
Bayes Estimators - I
- Quadratic loss function:
u (unknown) parameter, d(x) estimator for u
- Risk of estimator d: average loss over all possible data
Bayes Estimators - II
- Integrated risk of estimator d with respect to prior g
- U is the parameter space ([0,1] for us).
- Using the posterior mean as estimator minimizes r(g,d)
- In our case the posterior mean is
(x+α)/(n+α+β) where x≤n number of successes, α,β Beta prior parameters.
- By integrating the posterior we get Bayesian intervals for p
- Fix a coverage ½ < c < 1. Any interval (t0, t1) such that
is called a 100c percent Bayesian Interval Estimate of p
- An optimal interval minimizes t1- t0: difficult in general
- Our approach:
- fix a half-interval width δ
- Continue sampling until the posterior probability of an interval of
width 2δ containing the posterior mean exceeds coverage c
Bayesian Interval Estimation - II
- Computing the posterior probability of an interval is easy
- Suppose n Bernoulli samples (with x≤n successes) and
prior Beta(α,β)
- Again, no numerical integration
Bayesian Interval Estimation - III
0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 5 10 15 20 25 30 35 40
Bayesian Interval Estimation - IV
0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0.5 1 1.5 2 2.5
prior is beta(α=4,β=5) posterior density after 1000 samples and 900 “successes” is beta(α=904,β=105) posterior mean = 0.8959 width 2δ
Require: BLTL property Φ, interval-width δ, coverage c,
prior beta parameters α,β n := 0 {number of traces drawn so far} x := 0 {number of traces satisfying so far} repeat σ := draw a sample trace of the system (iid) n := n + 1 if σ Φ then x := x + 1 endif mean = (x+α)/(n+α+β) (t0,t1) = (mean-δ, mean+δ) I := PosteriorProbability (t0,t1,n,x,α,β) until (I > c) return (t0,t1), mean
Bayesian Interval Estimation - V
- Recall the algorithm outputs the interval (t0,t1)
- Define the null hypothesis
H0: t0 < p < t1
- We can use the previous results for hypothesis testing
Theorem (Error bound). When the Bayesian estimation algorithm (using coverage ½< c < 1) stops – we have Prob (“accept H0” | H1) ≤ (1/c -1)π0/(1-π0) Prob (“reject H0” | H0) ≤ (1/c -1)π0/(1-π0) π0 is the prior probability of H0
Bayesian Interval Estimation - VI
End of Part II
Fuel Control System - I
The Simulink model:
Fuel Control System - II
- Ratio between air mass flow rate and fuel mass flow rate
- Stoichometric ratio is 14.6
- Senses amount of oxygen in exhaust gas, pressure,
engine speed and throttle to compute correct fuel rate.
- Single sensor faults are compensated by switching to a higher
- xygen content mixture.
- Multiple sensor faults force engine shutdown.
07/16/09
Fuel Control System - III
- Stateflow part of the model has 24 locations
- grouped in 6 simultaneously active states
- Simulink part of the model is rich
- Several nonlinear equations
- Linear ODE
- Probabilistic behavior because of random faults
- in the EGO (oxygen), pressure and speed sensors.
- Faults modeled by three independent Poisson processes
- We did not change the speed or throttle inputs.
07/16/09
Fuel Control System - IV
- We Model Check the formula (Null hypothesis)
M, FaultRate ╞═ P≥θ (¬F100 G1(FuelFlowRate = 0)) for θ = .5, .7, .8, .9, .99
- “It is not the case that within 100 seconds, FuelFlowRate
is zero for 1 second”
- We use various values of FaultRate for each of the three
sensors in the model
- We choose Bayes threshold T = 1000, i.e., stop when
probability of error is < .001
- Uniform, equally likely priors and “informative” priors
Fuel Control System results:
Hypothesis testing
Recall the Null hypothesis:
M, FaultRate ╞═ P≥θ (¬F100 G1(FuelFlowRate = 0))
Priors: uniform, equally likely. Number of samples and test decision:
- red / blue number: reject / accept null hypothesis
Probability threshold θ
.5 .7 .8 .9 .99
Fault rates
[3 7 8] 58 17 10 8 2 [10 8 9] 32 95 394 710 8 [20 10 20] 9 16 24 44 1,626 [30 30 30] 9 16 24 44 239
Longest run: 1h 5’ on a 2.4GHz Pentium 4 computer
07/16/09 07/16/09 07/16/09 07/16/09 07/16/09 07/16/09
Fuel Control System results:
Hypothesis testing
Probability threshold θ
.5 .7 .8 .9 .99
Fault rates
[3 7 8] 55 12 10 8 2 [10 8 9] 28 64 347 255 8 [20 10 20] 8 13 20 39 1,463 [30 30 30] 7 13 18 33 201
0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 5 10 15 20 25 30 35 40
Informative priors: convex combinations of Betas
Example: for fault rates [10 8 9] we used 0.01 x beta(1,1) + 0.99 x beta(1000,172.6)
07/16/09 07/16/09 07/16/09 07/16/09 07/16/09 07/16/09
Fuel Control System results:
Hypothesis testing
Probability threshold θ
.5 .7 .8 .9 .99
Fault rates
[3 7 8] 55 (3) 12 (5) 10 8 2 [10 8 9] 28 (4) 64 (31) 347 (47) 255 (455) 8 [20 10 20] 8 (1) 13 (3) 20 (4) 39 (5) 1,463 (163) [30 30 30] 7 (2) 13 (3) 18 (6) 33 (11) 201 (38)
Informative priors: convex combinations of Betas Savings with respect to uniform prior:
07/16/09 07/16/09 07/16/09 07/16/09 07/16/09 07/16/09
Fuel Control System results:
Interval estimation
Interval coverage c
.9 .95 .99 .999
Fault rates
[3 7 8] .3603 .3559 .3558 .3563 [10 8 9] .8534 .8518 .8528 .8534 [20 10 20] .9764 .9784 .9840 .9779 [30 30 30] .9913 .9933 .9956 .9971
- Bayesian estimation algorithm, uniform prior.
- Want to estimate the probability that
M, FaultRate ╞═ (¬F100 G1(FuelFlowRate = 0))
- For half-width δ=.01 and several values of coverage c
- Posterior mean: add/subtract δ to get the Bayesian interval
07/16/09 07/16/09 07/16/09 07/16/09 07/16/09 07/16/09
Interval coverage c
.9 .95 .99 .999
Fault rates
[3 7 8] 6,234 8,802 15,205 24,830 [10 8 9] 3,381 4,844 8,331 13,569 [20 10 20] 592 786 1,121 2,583 [30 30 30] 113 148 227 341 Chernoff bound 119,829 147,555 211,933 304,036
- Number of samples
- Comparison with Chernoff-Hoeffding bound (Bernoulli r.v.’s)
Pr (| X – p | ≥ δ) ≤ exp(-2nδ2) where X = 1/n Σi Xi , E[Xi]=p
Fuel Control System results:
Interval estimation
Performance of Bayesian Estimation
07/16/09
P53-Mdm2 and DNA Repair Circuit
Kurt W. Kohn, Molecular Biology of the Cell 1999
R.A. Weinberg, The Hallmarks of Cancer
The Hallmarks of Cancer
Jim Faeder, UPMC begin molecule types A(b,Y~U~P) B(a) end molecule types begin reaction rules A(b)+ B(a)<-> A(b!1).B(a!1) A(Y~U) -> A(Y~P) end reaction rules
Faeder JR, Blinov ML, Hlavacek WS Rule-Based Modeling of Biochemical Systems with BioNetGen. In Methods in Molecular Biology: Systems Biology, (2009).
A
b Y
U P
B
a
A
b
B
a
+
A
b
B
a
A
Y
U
A
Y
P
The BioNetGen Language
Existing Approach: Manual Analysis
Many y simulati tion
- n traces
ces need d to be ca carefu efull lly analyz lyzed ed !
Model Checking Approach
BioLab 2.0
Auto tomated mated Analys lysis is !
BioLab 2.0
Model Checking Biochemical Stochastic models: M╞═ P≥θ(Φ) ?
Model M BioNetGen Bayesian Model Checker BLTL formula Φ BLTL to Monitor compiler Formula monitor M╞═ P≥θ (Φ) Bayes Test M╞═ P≥θ (Φ)
# PI3K phosphorylates PIP2 PI3K + PIP2 PI3K + PIP3 p1 # PTEN dephosphorylates PIP3 PTEN + PIP3 PTEN + PIP2 d1 # P53-dependent production of PTEN P53(c~p) P53(c~p) + PTEN Hill(d2,K,3) #PIP3 phosphorylates AKT PIP3 + AKT(a~U) PIP3 + AKT(a~p) AKT(a~p) + MDM2(b~U) AKT(a~p) + MDM2(b~p) p2 # MDM2p drives P53 degradation MDM2(b~p) + P53(c~U) MDM2(b~p) d5 # P53 synthesis I() I() + P53(c~U) s0
PTEN PI3K MDM2 AKT P53p PIP3 MDM2 p AKTp P53 PIP2
deg
BioNetGen Model
- Set BayesFactor threshold T = 10,000, probability of error
<0.0001
- P53 concentration increase when DNA is damaged.
Property (Null hypothesis): Pr≥0.9 [ F10,000 (P53 >180,000)] Result: Accepted to be True (127 samples, 122 successes)
- P53 protein concentration remains at a low level in the
normal cell. Property (Null hypothesis): Pr≥0.9 [F60,000( G20,000 (P53 < 50,000) )] Result: Accepted to be True (43 samples, 43 successes)
Statistical Model Checking
Boolean Network Model of HMGB1
Some update rules:
PI3k(t+1) = RAC1(t) | RAS(t) IKK(t+1) = (TAB1(t) | AKT(t) | ERK(t) ) & ~A20(t) E2F(t+1) = Myc(t) & ~RB(t) RB(t+1) = ~CyclinD(t) | ~CyclinE(t) CyclinD(t+1) = (AP1(t) | Myc(t)) & ~INK4a & ~P21 PIP3(t+1) = PI3K(t) & ~PTEN(t) P53(t+1) = ~MDM2(t) MDM2(t+1) = AKT(t) & ~ARF(t)
INK4a HMGB1
PIP3
IRAKs RAC1 MYD88 RAGE TLR2/4 RAF ERK MEK RAS A20 E2F AP1
NFkB IKK
Myc AKT PI3K TAB1 Cyclin E Cyclin D
IkB RB
ARF PTEN
P53
Bcl- XL
MDM2 BAX Apoptosis Proliferate
P21
Assume INK4A = 0 ( INK4A loss in pancreatic cancer )
HMGB1 will activate the cell proliferation in the future: AF (Proliferate) : True If RAS is overexpressed, CyclinE will be activated in the future A(RAS AF(CyclinE)): True HMGB1 can activate E2F while passing by AKT EF( AKT & EF(E2F) ): True ERK is not activated before E2F is activated: E( (~ERK) U E2F ): False HMGB1 can inhibit Apoptosis in the future EF(~Apoptosis): True
Symbolic Model Checking (SMV)
Assume INK4A =1 ( NO INK4A mutation )
- 1. CyclinD = ( Myc | AP1 ) | ~INK4A
HMGB1 will activate E2F in the future: AF(E2F): True HMGB1 and its effectors have a stronger effect than INK4A
- 2. CyclinD = ( Myc | AP1 ) & ~INK4A
AF(E2F): False INK4A has a stronger effect than HMGB1 and its effectors: HMGB1 can not activate E2F.
Model checking can help to rule out or modify models which do not satisfy properties verified by experiment.
Model development
Coming soon
http://symbolaris.com/lahs/