Statistical Model Checking for Markov Decision Processes
David Henriques
Joint work with João Martins, Paolo Zuliani, André Platzer and Edmund M. Clarke
QEST, September 18th, 2012
David Henriques (CMU) SMC for MDPs QEST’12 1 / 37
1 Markov Decision Processes
2 Probabilistic MC and Statistical MC
3 SMC for MDPs
4 Why does it work?
5 Experimental Validation
Markov Decision Processes
[Figure: example transition graph with edge probabilities 2/3, 1/6, 1/6, 1/4, 1/2, 1, 1, 1, 1/4, 3/4, 1/2, 1/2, 1/4; a later build adds edges labeled 4/5, 1/5, 1, 1/2, 1/2, 1, 1, 1, 1, 1/2, 1/2, 1, 1.]
[Figure: example with labels send, wait, wait, reset, ack and fail; edge probabilities 1, 1, 1, 0.001, 0.001, 0.999, 0.999.]
$\sum_{a \in A} p_{s,a}\,\tau(s,a)$ with $\sum_{a \in A} p_{s,a} = 1$.
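The weighted sum above can be illustrated with a minimal sketch. It assumes that τ(s, a) is the distribution over successor states for action a in state s and that the p_{s,a} are the weights a scheduler assigns to the actions of s. The function name `induced_distribution`, the dictionary layout and all numbers are hypothetical and chosen only for illustration.

```python
from fractions import Fraction
from typing import Dict

Dist = Dict[str, Fraction]  # successor state -> probability


def induced_distribution(weights: Dict[str, Fraction],
                         tau: Dict[str, Dist]) -> Dist:
    """Return sum_a p_{s,a} * tau(s, a) for one fixed state s."""
    if sum(weights.values()) != 1:
        raise ValueError("scheduler weights must sum to 1")
    out: Dist = {}
    for action, p in weights.items():
        for succ, q in tau[action].items():
            out[succ] = out.get(succ, Fraction(0)) + p * q
    return out


# Hypothetical state s with two actions (illustrative numbers only).
tau_s = {
    "a": {"s1": Fraction(1, 2), "s2": Fraction(1, 2)},
    "b": {"s2": Fraction(1, 4), "s3": Fraction(3, 4)},
}
weights_s = {"a": Fraction(1, 3), "b": Fraction(2, 3)}
mixed = induced_distribution(weights_s, tau_s)
print(mixed)
```

Because the weights sum to 1 and each τ(s, a) is a distribution, the result is again a distribution over successor states.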
$\prod_{0 \le i < n} \sigma(\pi_i)(\pi_{i+1})$
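A small sketch of the product above, under the assumption that σ(π_i) denotes the distribution over next states once the scheduler has resolved the choice at π_i, so that σ(π_i)(π_{i+1}) is a one-step probability. The name `path_probability` and the three-state example are hypothetical.

```python
from fractions import Fraction
from typing import Dict, Sequence


def path_probability(path: Sequence[str],
                     step: Dict[str, Dict[str, Fraction]]) -> Fraction:
    """Multiply step[pi_i][pi_{i+1}] over 0 <= i < n for a path pi_0 .. pi_n."""
    prob = Fraction(1)
    for cur, nxt in zip(path, path[1:]):
        # A transition that is not listed has probability 0.
        prob *= step[cur].get(nxt, Fraction(0))
    return prob


# Hypothetical one-step probabilities after fixing a scheduler.
step = {
    "s0": {"s1": Fraction(2, 3), "s2": Fraction(1, 3)},
    "s1": {"s0": Fraction(1, 4), "s2": Fraction(3, 4)},
    "s2": {"s2": Fraction(1)},
}
p = path_probability(["s0", "s1", "s2", "s2"], step)
print(p)  # 2/3 * 3/4 * 1 = 1/2
```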
Probabilistic MC and Statistical MC
SMC for MDPs
Fully Probabilistic System + σ
φ ≡ p1 U<12 (G<10 (¬ p3))   (BLTL formula)
[Diagram labels: Probability Threshold, Sample Traces, Evaluate Traces, Hypothesis Testing, Sufficient Statistical Evidence, Answer]
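To make the "Evaluate Traces" step concrete, here is a minimal sketch that checks the example formula φ ≡ p1 U<12 (G<10 (¬ p3)) on one sampled trace. It assumes discrete unit time steps, a trace given as a list of sets of atomic propositions, and bounds read as strict step counts; the talk does not fix these details, and the function names are hypothetical.

```python
from typing import List, Set

Trace = List[Set[str]]  # one set of true atomic propositions per time step


def globally_not_p3(trace: Trace, start: int, bound: int = 10) -> bool:
    """G<bound (not p3): p3 is absent at every offset 0 <= k < bound."""
    return all("p3" not in trace[start + k] for k in range(bound))


def check_phi(trace: Trace, until_bound: int = 12, glob_bound: int = 10) -> bool:
    """Evaluate p1 U<until_bound (G<glob_bound (not p3)) at position 0."""
    needed = (until_bound - 1) + glob_bound  # highest index read is needed - 1
    if len(trace) < needed:
        raise ValueError(f"trace needs at least {needed} states")
    for i in range(until_bound):
        if globally_not_p3(trace, i, glob_bound):
            return True   # right-hand side holds at step i, p1 held before
        if "p1" not in trace[i]:
            return False  # p1 failed before the right-hand side held
    return False


# p3 is present for three steps while p1 holds, then p3 disappears.
good = [{"p1", "p3"}] * 3 + [{"p1"}] * 18
print(check_phi(good))
```

Each sampled trace yields one Boolean outcome of this kind; those outcomes are what the hypothesis test consumes.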
# times (s,a) was seen
#samples → ∞
1000 tries, 0 successes
500 tries, 500 successes
700 tries, 525 successes
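One plausible reading of these counts is an empirical success ratio per state-action pair: the number of successful samples that passed through (s, a) divided by the number of times (s, a) was seen. The slide text does not state the formula, so the sketch below is an assumption, as are the state and action names and the choice to attach the three counts to three actions of one state.

```python
from fractions import Fraction
from typing import Dict, Tuple

Pair = Tuple[str, str]                    # (state, action)
Counts = Dict[Pair, Tuple[int, int]]      # pair -> (tries, successes)


def success_ratios(counts: Counts) -> Dict[Pair, Fraction]:
    """Return successes / tries for every pair that was seen at least once."""
    return {pair: Fraction(successes, tries)
            for pair, (tries, successes) in counts.items()
            if tries > 0}


# Counts from the slide, attached to hypothetical actions a1, a2, a3 of state s.
counts = {
    ("s", "a1"): (1000, 0),
    ("s", "a2"): (500, 500),
    ("s", "a3"): (700, 525),
}
ratios = success_ratios(counts)
for pair, ratio in ratios.items():
    print(pair, float(ratio))
```

Under this reading the three pairs score 0, 1 and 0.75.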
Theorem
Why does it work?
Proof
Experimental Validation
Model      Row      1      2      3      4      5      PRISM
CSMA 3 4   θ        0.5    0.8    0.85   0.9    0.95
           answer   F      F      F      T      T      0.86
           t        1.7    11.5   35.9   115.7  111.9  136
CSMA 3 6   θ        0.3    0.4    0.45   0.5    0.8
           answer   F      F      F      T      T      0.48
           t        2.5    9.4    18.8   133.9  119.3  2995
CSMA 4 4   θ        0.5    0.7    0.8    0.9    0.95
           answer   F      F      F      F      T      0.93
           t        3.5    3.7    17.5   69.0   232.8  16244
CSMA 4 6   θ        0.5    0.7    0.8    0.9    0.95
           answer   F      F      F      F      F      timeout
           t        3.7    4.1    4.2    26.2   258.9  timeout
WLAN 5     θ        0.1    0.15   0.2    0.25   0.5
           answer   F      F      T      T      T      0.18
           t        4.9    11.1   124.7  104.7  103.2  1.6
WLAN 6     θ        0.1    0.15   0.2    0.25   0.5
           answer   F      F      T      T      T      0.18
           t        5.0    11.3   127.0  104.9  102.9  1.6
1U≤30RendezVous
2U≤30RendezVous
log η T
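\begin{align*}
V^{\sigma[\sigma(s)\to\sigma'(s)]}(s)
&= \sum_{a\in A(s)} p_\epsilon(s,a)\,Q^\sigma(s,a) + (1-\epsilon)\max_{a'\in A(s)} Q^\sigma(s,a') \\
&= \sum_{a\in A(s)} p_\epsilon(s,a)\,Q^\sigma(s,a) + \Big[\sum_{a\in A(s)} \sigma(s,a) - \sum_{a\in A(s)} p_\epsilon(s,a)\Big]\max_{a'\in A(s)} Q^\sigma(s,a') \\
&= \sum_{a\in A(s)} p_\epsilon(s,a)\,Q^\sigma(s,a) + \Big(\sum_{a\in A(s)} \big[\sigma(s,a) - p_\epsilon(s,a)\big]\Big)\max_{a'\in A(s)} Q^\sigma(s,a') \\
&= \sum_{a\in A(s)} p_\epsilon(s,a)\,Q^\sigma(s,a) + \sum_{a\in A(s)} \Big[\big(\sigma(s,a) - p_\epsilon(s,a)\big)\max_{a'\in A(s)} Q^\sigma(s,a')\Big] \\
&\ge \sum_{a\in A(s)} p_\epsilon(s,a)\,Q^\sigma(s,a) + \sum_{a\in A(s)} \Big[\big(\sigma(s,a) - p_\epsilon(s,a)\big)\,Q^\sigma(s,a)\Big] \\
&= \sum_{a\in A(s)} p_\epsilon(s,a)\,Q^\sigma(s,a) + \sum_{a\in A(s)} \sigma(s,a)\,Q^\sigma(s,a) - \sum_{a\in A(s)} p_\epsilon(s,a)\,Q^\sigma(s,a) \\
&= \sum_{a\in A(s)} \sigma(s,a)\,Q^\sigma(s,a) = V^\sigma(s)
\end{align*}
The second line uses $\sum_{a\in A(s)} \sigma(s,a) = 1$ and $\sum_{a\in A(s)} p_\epsilon(s,a) = \epsilon$; the inequality uses $\sigma(s,a) - p_\epsilon(s,a) \ge 0$ for every action.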
σ′(s, a) = (1 − h)
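The chain of (in)equalities in the derivation can be checked numerically on a small instance. This sketch assumes, as the derivation's steps require, that 0 ≤ p_ε(s, a) ≤ σ(s, a) for every action and that the p_ε(s, a) sum to ε. The function names, the action names and all numbers are hypothetical.

```python
from fractions import Fraction
from typing import Dict

Weights = Dict[str, Fraction]  # action -> weight or Q-value at a fixed state s


def value(sigma: Weights, q: Weights) -> Fraction:
    """V^sigma(s) = sum_a sigma(s,a) * Q^sigma(s,a)."""
    return sum(sigma[a] * q[a] for a in q)


def improved_value(p_eps: Weights, q: Weights) -> Fraction:
    """sum_a p_eps(s,a) Q(s,a) + (1 - eps) max_a Q(s,a), eps = sum_a p_eps(s,a)."""
    eps = sum(p_eps.values())
    return sum(p_eps[a] * q[a] for a in q) + (1 - eps) * max(q.values())


# Illustrative numbers: two actions, eps = 1/5, and p_eps <= sigma pointwise.
sigma = {"a1": Fraction(1, 2), "a2": Fraction(1, 2)}
q = {"a1": Fraction(1, 4), "a2": Fraction(3, 4)}
p_eps = {"a1": Fraction(1, 10), "a2": Fraction(1, 10)}

base = value(sigma, q)               # 1/8 + 3/8 = 1/2
improved = improved_value(p_eps, q)  # 1/40 + 3/40 + 24/40 = 7/10
print(base, improved, improved >= base)
```

Moving the weight 1 − ε onto a maximizing action never lowers the value at s, which is what the derivation states.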