SLIDE 1

Hybrid System Falsification and Reinforcement Learning

Formal Method for Cyber-Physical Systems

Clovis Eberhart David Sprunger

National Institute of Informatics, Japan

SOKENDAI lesson, July 1, 8, and 22

1 / 31

SLIDE 2

Quick reminder

Falsification:

• a method to find counterexamples to a property,
• useful in the world of formal methods,
• a black-box method,
• relies on optimisation algorithms.

Hybrid system:

• continuous and discrete parameters,
• non-linear behaviour,
• very expressive.

Formulas:

• expressed in a temporal logic,
• boolean and robustness semantics.

2 / 31

SLIDE 3

1. Refining robustness
2. Time staging
3. Coverage-based falsification

3 / 31

SLIDE 4

Table of Contents

1. Refining robustness
2. Time staging
3. Coverage-based falsification

4 / 31


SLIDE 6

Refining robustness

Why?

• more expressivity (i.e., finer modelling)
• more techniques (e.g., optimisation techniques work better)

Attention

more expressivity ⇝ more complex algorithms (here, however, only sliding-window algorithms)

5 / 31


SLIDE 10

Space-time robustness

Donzé, A. and Maler, O. Robust satisfaction of temporal logic over real-valued signals. FORMATS 2010.

Until now, robustness is spatial. Problems:
• all these signals satisfy ✸_[a,b] x > 0 with the same robustness,
• the similarity between these two signals is lost when computing ρ(σ, ✸_[a,b] x > 0)
⇝ missing a temporal component.

6 / 31

SLIDE 11

Adding time

Assumption: set P = {p1, . . . , pn} of atomic propositions. Standard boolean semantics: χ(σ, ϕ, t).

Time robustness

θ−(σ, p, t) = χ(σ, p, t) · max{d ≥ 0 | ∀t′ ∈ [t − d, t]. χ(σ, p, t′) = χ(σ, p, t)}
θ+(σ, p, t) = χ(σ, p, t) · max{d ≥ 0 | ∀t′ ∈ [t, t + d]. χ(σ, p, t′) = χ(σ, p, t)}
θs(σ, ¬ϕ, t) = −θs(σ, ϕ, t)
. . .
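As a discrete-time sketch (assuming the boolean signal is sampled at a fixed step dt and encoded as ±1 values, like the characteristic function χ; the function names are illustrative, not from the slides), θ+ and θ− can be computed by scanning how long the current truth value persists:

```python
def theta_plus(chi, t, dt=1.0):
    """Right time robustness: signed duration for which the sampled
    boolean signal chi (a list of +1/-1 values) keeps its value at index t."""
    d = 0
    while t + d + 1 < len(chi) and chi[t + d + 1] == chi[t]:
        d += 1
    return chi[t] * d * dt

def theta_minus(chi, t, dt=1.0):
    """Left time robustness: signed duration since chi last changed value."""
    d = 0
    while t - d - 1 >= 0 and chi[t - d - 1] == chi[t]:
        d += 1
    return chi[t] * d * dt
```

For chi = [1, 1, 1, -1], theta_plus at index 0 is 2·dt: the signal stays true for two more samples before flipping, and the +1 sign records that it is currently true.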

7 / 31

SLIDE 12

Interpreting θ+ and θ−

θ+(σ, ϕ, t) = s > 0: σ ⊨ ϕ for at least time s
θ+(σ, ϕ, t) = s < 0: σ ⊭ ϕ for at least time s
θ−(σ, ϕ, t) = s > 0: σ ⊨ ϕ since at least time s
θ−(σ, ϕ, t) = s < 0: σ ⊭ ϕ since at least time s

8 / 31


SLIDE 14

Space-time Robustness

Assumption: atomic propositions are functions (e.g., x² + y²). Standard robustness semantics: ρ(σ, ϕ, t).

Space-time robustness

For any c ∈ R:
θ+_c(σ, f, t) = θ+(χ_c(σ, f, t)),
θ−_c(σ, f, t) = θ−(χ_c(σ, f, t)),
θs_c(σ, ¬ϕ, t) = −θs_c(σ, ϕ, t).
. . .
Interpretation: θ+_c(σ, ϕ, t) = s > 0: ρ(σ, ϕ, t) > c for at least time s, . . .
Remarks:
• hopefully more efficient
• how to choose c?
• not more expressive
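A minimal sketch of θ+_c on sampled data: threshold the real-valued signal at level c into a ±1 signal, then measure how long it stays on the same side (the names and the fixed sampling step dt are assumptions, not from the slides):

```python
def chi_c(samples, c):
    """Threshold a sampled real-valued signal at level c into a +1/-1 signal."""
    return [1 if v > c else -1 for v in samples]

def theta_plus_c(samples, c, t, dt=1.0):
    """Space-time robustness theta^+_c: right time robustness of the
    c-thresholded signal at sample index t."""
    chi = chi_c(samples, c)
    d = 0
    while t + d + 1 < len(chi) and chi[t + d + 1] == chi[t]:
        d += 1
    return chi[t] * d * dt
```

This also makes the "how to choose c?" remark concrete: each c yields a different thresholded signal, hence a different time robustness.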

9 / 31

SLIDE 15

More flexibility

Akazaki, T. and Hasuo, I. Time robustness in MTL and expressivity in hybrid system falsification. CAV 2015.

Spatial robustness:
Temporal robustness:

10 / 31

SLIDE 16

AvSTL

Syntax

AP ::= x < r | x ≤ r | x > r | x ≥ r
ϕ ::= ⊤ | ⊥ | AP | ¬ϕ | ϕ ∨ ϕ | ϕ ∧ ϕ | ϕ U_I ϕ | ϕ R_I ϕ | ϕ Ū_I ϕ | ϕ R̄_I ϕ
(the overlined Ū_I, R̄_I are the averaged versions of until and release)

Semantics

ρ+(σ, x < r, t) = max{0, r − σ(x)(t)}
ρ−(σ, x < r, t) = min{0, r − σ(x)(t)}
. . .
ρ+(σ, ¬ϕ, t) = ρ−(σ, ϕ, t)
ρ+(σ, ϕ Ū_[a,b] ψ, t) = (1 / (b − a)) ∫_a^b ρ+(σ, ϕ U_{[a,b]∩[0,τ]} ψ, t) dτ
. . .
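The averaged modalities replace the single max over a window by an average of best-so-far values. A discrete sketch of an averaged "eventually" over sample indices a..b (a Riemann-sum stand-in for the integral above; the function name is illustrative):

```python
def rho_avg_eventually(values, a, b):
    """Discrete averaged-eventually: mean over tau in [a, b] of
    max(values[a..tau]), approximating (1/(b-a)) * integral."""
    best = float("-inf")
    acc = 0.0
    for tau in range(a, b + 1):
        best = max(best, values[tau])  # robustness of F over the prefix [a, tau]
        acc += best
    return acc / (b - a + 1)
```

Unlike a plain max, this rewards signals that cross the threshold early, which is exactly the temporal information the averaged semantics is meant to keep.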

11 / 31

SLIDE 17

Example

Robustnesses: ρ+, ρ−
ϕ = x ≥ 0:
ϕ = F_I (x ≥ 0):
Consequences:
• temporal aspects
• spatial aspects

12 / 31

SLIDE 18

Expressivity

expeditiousness: F_[0,a] ϕ
deadline: F_[0,a] ϕ ∨ F_[a,b] ϕ
persistence: G_[0,a] ϕ ∧ G_[a,b] ϕ

13 / 31

SLIDE 19

Experimental results

14 / 31

SLIDE 20

Table of Contents

1. Refining robustness
2. Time staging
3. Coverage-based falsification

15 / 31

SLIDE 21

Time staging

Zhang, Z., Ernst, G., Sedwards, S., Arcaini, P., and Hasuo, I. Two-Layered Falsification of Hybrid Systems Guided by Monte Carlo Tree Search. EMSOFT 2018.
Ernst, G., Sedwards, S., Zhang, Z., and Hasuo, I. Fast Falsification of Hybrid Systems using Probabilistically Adaptive Input. QEST 2019.

Idea

σout causally dependent on σin
optimisation methods blind to this dependence
⇝ modify the algorithm to take it into account

16 / 31

SLIDE 22

A picture is worth a thousand words

17 / 31

SLIDE 23

High-Level Algorithm

Alternate between:
• Monte-Carlo Tree Search to find a good zone,
• hill-climbing to find a good point in the zone.
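The hill-climbing half can be sketched as a simple coordinate search that greedily lowers the robustness value (the step size, iteration budget, and names are assumptions, not the paper's implementation):

```python
def hill_climb(rho, u0, step=0.1, iters=100):
    """Coordinate hill-climbing: nudge each coordinate of u by +/-step
    whenever that lowers the robustness value rho(u)."""
    u = list(u0)
    best = rho(u)
    for _ in range(iters):
        improved = False
        for i in range(len(u)):
            for delta in (step, -step):
                cand = list(u)
                cand[i] += delta
                val = rho(cand)
                if val < best:
                    u, best = cand, val
                    improved = True
        if not improved:
            break  # local minimum at this step size
    return u, best
```

A negative returned `best` would mean the candidate input falsifies the property within the explored zone.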

18 / 31

SLIDE 24

Monte-Carlo Tree Search

Each node is equipped with:
• a robustness estimate,
• a number of visits.
To choose a node, balance between:
• an exploitation score (bigger with smaller robustness estimates),
• an exploration score (bigger with fewer visits to the node).
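This balance is the classic UCB1 trade-off from bandit algorithms. A sketch of such a node score, where the sign flip makes smaller robustness estimates more attractive (the normalisation and names are hypothetical, not from the paper):

```python
import math

def node_score(rob_estimate, visits, parent_visits, c=math.sqrt(2)):
    """UCB1-style score: the exploitation term grows as the robustness
    estimate shrinks; the exploration term grows when visits are few."""
    exploit = -rob_estimate
    explore = c * math.sqrt(math.log(parent_visits) / visits)
    return exploit + explore
```

Tree search would descend by repeatedly picking the child maximising this score.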

19 / 31

SLIDE 25

Robustness estimates

To get robustness estimates: complete the signal by pure hill-climbing. For example, for a newly-expanded node:

20 / 31

SLIDE 26

Experimental results

Interpretation: MCTS explores more, so:
• better results on hard problems,
• slower on simple problems.

21 / 31

SLIDE 27

Adaptive Las Vegas Tree Search

To build signal σ incrementally:
• randomly choose a level l of "granularity" (initially, low granularity is favoured),
• choose σ′ = D_l(σ), where D_l chooses "finer" signals for large l (shorter time, more precise value),
• adapt D_l according to ρ(σσ′, ϕ, t).
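The adaptive level choice can be sketched as follows: each granularity level carries a weight, levels are drawn proportionally to their weights, and a level's weight is increased when it led to low robustness (the weights and update rule are illustrative assumptions, not the QEST 2019 scheme):

```python
import random

def choose_level(weights):
    """Draw a level index with probability proportional to its weight."""
    total = sum(weights)
    r = random.uniform(0.0, total)
    acc = 0.0
    for i, w in enumerate(weights):
        acc += w
        if r <= acc:
            return i
    return len(weights) - 1

def reward_level(weights, level, bonus=0.1):
    """Increase a level's weight after it helped lower the robustness."""
    weights[level] += bonus
```

Starting with most weight on coarse levels matches the slide's remark that low granularity is initially favoured.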

22 / 31

SLIDE 28

Experimental results

Interpretation:
• falsifying signals are often coarse, or slight variations of such, so they are explored very fast by this algorithm,
• robustness scores that concern discrete variables are hard for optimisation algorithms to handle (not continuous).

23 / 31

SLIDE 29

Table of Contents

1. Refining robustness
2. Time staging
3. Coverage-based falsification

24 / 31

SLIDE 30

Idea

Adimoolam, A., Dang, T., Donzé, A., Kapinski, J., and Jin, X. Classification and coverage-based falsification for embedded control systems. CAV 2017.

Trade-off between exploration and exploitation:
• define a coverage metric of the input space,
• alternate between:
  • a global search to classify the search space into zones,
  • local searches on the promising zones to converge to a minimum.

25 / 31

SLIDE 31

High-level algorithm

Input: tmax
Output: a u such that M(u) ⊭ ϕ

S = sample N points at random;
R = zones(S);
while t < tmax do
    subdivide(R);
    S += biased-sampling(R);
    S += singularity-sampling(R);
    S += local-search(R);
end
for u in S do
    if ρ(u) < 0 then return u end
end
return None
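A runnable skeleton of this loop, with the subdivision and the three sampling strategies collapsed into plain random sampling as a stand-in (`model_rho` and `sample` are assumed callables, not names from the paper):

```python
def falsify(model_rho, sample, t_max, n_init=20):
    """Coverage-based falsification skeleton: gather candidate inputs,
    then return the first one whose robustness is negative."""
    S = [sample() for _ in range(n_init)]
    for _ in range(t_max):
        # Full algorithm: subdivide(R), then biased, singularity and
        # local-search sampling on promising zones. Stand-in: one more sample.
        S.append(sample())
    for u in S:
        if model_rho(u) < 0:
            return u  # counterexample: M(u) does not satisfy phi
    return None
```

The three sampling strategies only change how `S` is grown; the falsification check at the end is the same.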

26 / 31

SLIDE 32

Subdivision

Goal: divide the search space into rectangles with different average robustnesses.
Input: R a list of rectangles, S a list of sampled points, K a threshold
Output: a list of subdivided rectangles

for r in R do
    pop(R, r);
    if |S ∩ r| > K then
        H = argmin_H(Γ_H(R, S), H a hyperplane);
        push(R, r ∩ H−, r ∩ H+);
    end
end

Γ_(d,r,p)(R, S) = Σ_{x ∈ S∩R} e_(d,r,p)(x)
e_(d,r,p)(x) = max{p(ρ(x) − µ)(x_d − r), 0}
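The partition score above can be computed directly from samples. A sketch where `points` pairs each sample x with its robustness ρ(x), µ is the mean robustness in the rectangle, and (d, r, p) describe a candidate axis-aligned cut x_d = r with polarity p ∈ {−1, +1} (the parameter names follow the slide; the data layout is an assumption):

```python
def gamma_score(points, mu, d, r, p):
    """Sum of e_(d,r,p)(x) = max(p * (rho(x) - mu) * (x_d - r), 0)
    over sampled points (x, rho); the subdivision step selects the
    hyperplane minimising this score, per the argmin in the slide."""
    return sum(max(p * (rho - mu) * (x[d] - r), 0.0) for x, rho in points)
```

A term contributes only when a point's robustness deviation and its side of the cut agree in the direction p penalises, so the minimising cut best separates low- from high-robustness samples.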

27 / 31

SLIDE 33

Samplings

Biased sampling

Goal: increase coverage and decrease robustness.
Idea: sample according to a weighted sum of two distributions:
• P^i_c: proportional to the number of unoccupied cells in rectangle R_i,
• P^i_r: takes into consideration how the robustness of sampled points varies from the average.

Singularity sampling

Goal: sample more in rectangles with “singular” samples (robustness much lower than average in rectangle).

28 / 31

SLIDE 34

Local search

Goal: converge to a minimum faster by using local search with a good seed.

29 / 31

SLIDE 35

Experimental results

Interpretation: other methods got caught in local minima.

30 / 31

SLIDE 36

Conclusion

different notions of robustness:

• can be more expressive,
• can make algorithms more efficient.

time staging:

• explores more, hence can solve harder problems.

coverage-based falsification:

• theoretical result (if there exists an ε-robust counterexample, there is a grid size such that the algorithm will find it),
• coverage helps falsification by exploring more, thus avoiding local minima.

31 / 31