SLIDE 1

Learning Register Automata Models

Falk Howar

IPSSE, TU Clausthal, Goslar, Germany

Dagstuhl Seminar 16172

Falk Howar (TU Clausthal) Learning RAs Dagstuhl Seminar 16172 1 / 22

SLIDE 2

Scenario: Verification of Component-based Systems

[Figure: Environment, Component, Requirement]

SLIDE 3

Scenario: Verification of Component-based Systems

[Figure: Environment, Component, and Requirement, each with its model]

Env. Model ∥ Comp. Model ⊨ Req. Model

SLIDE 5

APIs with Data Parameters

Annotated in the code: internal state, data parameters, assignments, guards.

public class Stack {
    // Internal state
    private final int capacity = 3;
    private int size = 0;
    private Object[] elements = new Object[capacity];

    public boolean push(Object o) {      // o is a data parameter
        if (size == capacity)            // guard
            return false;
        elements[size++] = o;            // assignment
        return true;
    }

    public Object pop() {
        if (size == 0)                   // guard
            return null;
        return elements[--size];
    }
}

SLIDE 6

Challenge: From Finite State to Infinite State

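Mealy-machine model (only concrete values, uninterpreted labels), locations l0 to l3:

  • l0 → l1, l1 → l2, l2 → l3: push(1) / true
  • l3 → l2, l2 → l1, l1 → l0: pop() / 1
  • l3 → l3: push(1) / false
  • l0 → l0: pop() / null

What is really needed (symbolic data flow), locations l0 to l3:

  • l0 → l1: push(p) | true, x1 := p, output true
  • l1 → l2: push(p) | true, x2 := p, output true
  • l2 → l3: push(p) | true, x3 := p, output true
  • l3 → l3: push(p) | true, no assignment, output false
  • l3 → l2: pop() | true, no assignment, output o(x3)
  • l2 → l1: pop() | true, no assignment, output o(x2)
  • l1 → l0: pop() | true, no assignment, output o(x1)
  • l0 → l0: pop() | true, no assignment, output null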
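
The symbolic model can be read as a small program: the location counts the stored values and the registers x1 to x3 hold them. The following is a minimal sketch under that reading; the class name StackRegisterAutomaton and its methods are illustrative and not taken from any tool.

```java
/** Hypothetical register-automaton view of the capacity-3 stack (illustrative names). */
final class StackRegisterAutomaton {
    private int location = 0;                          // l0..l3
    private final Object[] registers = new Object[3];  // x1, x2, x3

    /** push(p): in l0..l2 store p in the next register and advance; in l3 reject. */
    boolean push(Object p) {
        if (location == 3) {
            return false;          // self-loop in l3, no assignment
        }
        registers[location] = p;   // x_{location+1} := p
        location++;
        return true;
    }

    /** pop(): output the most recently assigned register, or null in l0. */
    Object pop() {
        if (location == 0) {
            return null;           // self-loop in l0
        }
        location--;
        return registers[location];
    }

    int location() {
        return location;
    }
}
```

Unlike the Mealy machine, this description works for any data value, because outputs refer to registers rather than to the concrete value 1.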

SLIDE 7

Outline

(1) Learning Basics
(2) Learning Register Automata Models
(3) Quo Vadis / Future Research Directions

SLIDE 10

Minimally Adequate Teachers

SUL (system under learning)

Membership query: push(1) pop() o(1) ∈ L_SUL ?

Hypothesis: H

MAT due to [Angluin, 1987]. Other learning models have fewer assumptions ...

SLIDE 13

Minimally Adequate Teachers

SUL (system under learning) and hypothesis H

Equivalence query: is H equivalent to the SUL?

  • Yes: done
  • No: counterexample w ∈ (L_H ∪ L_SUL) \ (L_H ∩ L_SUL)

MAT due to [Angluin, 1987]. Other learning models have fewer assumptions ...
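
The two query types can be sketched as an interface. This is an illustration only: Teacher and SampleTeacher are hypothetical names, and the equivalence query is approximated by a finite list of test words, which a real teacher would not be limited to.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

/** Hypothetical MAT interface; the names are illustrative, not a library API. */
interface Teacher {
    boolean member(List<String> word);                               // membership query
    Optional<List<String>> equivalent(Predicate<List<String>> hyp);  // equivalence query
}

/** Toy teacher whose equivalence query only checks a finite list of test words. */
final class SampleTeacher implements Teacher {
    private final Predicate<List<String>> sul;
    private final List<List<String>> tests;

    SampleTeacher(Predicate<List<String>> sul, List<List<String>> tests) {
        this.sul = sul;
        this.tests = tests;
    }

    @Override
    public boolean member(List<String> word) {
        return sul.test(word);
    }

    @Override
    public Optional<List<String>> equivalent(Predicate<List<String>> hyp) {
        // A counterexample is a word on which hypothesis and SUL disagree,
        // i.e. a word in the symmetric difference of L_H and L_SUL.
        for (List<String> w : tests) {
            if (hyp.test(w) != sul.test(w)) {
                return Optional.of(w);
            }
        }
        return Optional.empty();   // no disagreement found on the test words
    }
}
```

An empty result here only means that no test word separates H from the SUL; it is weaker than the "Yes" answer of an ideal teacher.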

SLIDE 16

Model Generation using State Classifiers

Nerode relation: u ≡_L u′ iff ∀v ∈ Σ∗. uv ∈ L ⇔ u′v ∈ L

[Figure: partial model with states q0 (reached by ε), q2, q3 and a not yet identified state q?; edges push(1), push(2), pop()]

  • Access sequences: “Spanning tree” of model. (thick edges)
  • Suffixes: Partial residuals of words. (triangles)
  • Remaining prefixes: Remaining transitions. (blue edges)

SLIDE 17

Model Generation using State Classifiers

Nerode relation: u ≡_L u′ iff ∀v ∈ Σ∗. uv ∈ L ⇔ u′v ∈ L

[Figure: model with states q0 (reached by ε), q2, q3; edges push(1), push(2), pop(), with the target of pop() now identified]

  • Access sequences: “Spanning tree” of model. (thick edges)
  • Suffixes: Partial residuals of words. (triangles)
  • Remaining prefixes: Remaining transitions. (blue edges)
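
A finite set of suffixes approximates the Nerode relation: two prefixes are treated as the same state as long as no suffix tells them apart. The sketch below assumes a toy language (traces of a capacity-2 stack without failed calls); StateClassifier, row, and validTrace are illustrative names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical state classifier based on a finite set of suffixes. */
final class StateClassifier {
    /** Row of a prefix: membership of prefix + suffix for every suffix in the set. */
    static List<Boolean> row(Predicate<List<String>> lang, List<String> prefix,
                             List<List<String>> suffixes) {
        List<Boolean> row = new ArrayList<>();
        for (List<String> v : suffixes) {
            List<String> word = new ArrayList<>(prefix);
            word.addAll(v);
            row.add(lang.test(word));
        }
        return row;
    }

    /** Toy language (assumption): traces of a capacity-2 stack without failed calls. */
    static boolean validTrace(List<String> word) {
        int size = 0;
        for (String a : word) {
            size += a.equals("push") ? 1 : -1;
            if (size < 0 || size > 2) {
                return false;
            }
        }
        return true;
    }
}
```

With suffixes ε, pop(), and pop() pop(), the prefixes ε and push pop get equal rows and are merged, while push gets a different row and becomes its own state.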

SLIDE 18

Inferring Models with Data and Memory

  • State classifier: Nerode relation
  • Multiple ideas: CEGAR, symbolic decision trees, ...
  • Memorable data values [Benedikt et al.]

(1) Identify locations
(2) Identify transition guards
(3) Identify registers (and assignments)

Important but skipped: analyzing counterexamples

SLIDE 19

A short history: Learning Models with Data

Automata without memory:

  • Sets of concrete values as guards [Shahbaz et al., 2007]
  • Conjunctions of Boolean parameters as guards [Berg et al., 2006]
  • Inferred alphabet abstractions [Howar et al., 2011]
  • Arbitrary formulas through symbolic execution [Giannakopoulou et al., 2012]
  • Integer intervals [Maler and Mens, 2014]

Fixed set of registers:

  • First and most recent value of a parameter [Aarts et al., 2012]
  • First and most recent value of a parameter [Bollig et al., 2013]
  • Last k values [Botinčan and Babić, 2013]
  • White-box access to class variables [Xiao et al., 2013]

SLIDE 20

Learning Models with Data (contd.)

‘Symbolic Execution vs. Predicate Abstraction’

Extending L∗ to RAs:

  • Equality [Howar et al., 2012b]
  • Output [Howar et al., 2012a]
  • More relations [Cassel et al., 2016]
  • Fresh output [Cassel et al., 2015]

Mapper/CEGAR for dealing with RAs:

  • Equality and output [Aarts, 2014]
  • Fresh output [Aarts et al., 2015]

Multi-step inference:

  • State merging + DAIKON [Lorenzoli et al., 2008]
  • EDSM + WEKA [Walkinshaw et al., 2013]

SLIDE 22

Data Languages

Assume:

  • an infinite domain D of data values
  • a finite set of actions

Terminology:

  • Data symbol: push(1)
  • Data word: push(1) pop() o(1)
  • Data language: set of data words, closed under permutations on D

push(1) pop() o(1) ∈ L ⇒ push(2) pop() o(2) ∈ L, push(3) pop() o(3) ∈ L, ...

Example: L_store = {push(d1) pop() o(d2) | d1 = d2}
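
A small sketch of the example language and of closure under permutations, assuming integers as the data domain. DataSymbol and StoreLanguage are illustrative names introduced only for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;

/** A data symbol: an action name plus an optional data value (null for pop()). */
final class DataSymbol {
    final String action;
    final Integer value;

    DataSymbol(String action, Integer value) {
        this.action = action;
        this.value = value;
    }
}

/** Hypothetical helper for the example language L_store. */
final class StoreLanguage {
    /** L_store = { push(d1) pop() o(d2) | d1 = d2 }. */
    static boolean contains(List<DataSymbol> w) {
        return w.size() == 3
                && w.get(0).action.equals("push") && w.get(0).value != null
                && w.get(1).action.equals("pop")
                && w.get(2).action.equals("o")
                && w.get(0).value.equals(w.get(2).value);
    }

    /** Applies a permutation on the data domain to every data value of a word. */
    static List<DataSymbol> permute(List<DataSymbol> w, UnaryOperator<Integer> pi) {
        List<DataSymbol> out = new ArrayList<>();
        for (DataSymbol s : w) {
            out.add(new DataSymbol(s.action, s.value == null ? null : pi.apply(s.value)));
        }
        return out;
    }

    /** Builds the data word push(pushed) pop() o(output). */
    static List<DataSymbol> word(int pushed, int output) {
        return Arrays.asList(new DataSymbol("push", pushed), new DataSymbol("pop", null),
                new DataSymbol("o", output));
    }
}
```

Permuting a word of L_store with a bijection on the domain yields another word of L_store, which is the closure property stated above.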

SLIDE 23

Register Automata

  • Locations
  • Registers (e.g., x1)
  • Transitions with:
      • Actions with formal parameters
      • Guards
      • Assignments to registers

[Figure: register automaton with locations l0, l1, ... and transitions labelled "action | guard, assignment"]

  • push(p) | true, x1 := p
  • … | true, −
  • pop() | true, −
  • o(p) | x1 = p, −

SLIDE 24

Register Automata

  • Locations
  • Registers (e.g., x1)
  • Transitions with:
      • Actions with formal parameters
      • Guards
      • Assignments to registers

[Figure: register automaton with outputs, locations l0 and l1]

  • l0 → l1: push(p) | true, x1 := p
  • l1 → l0: pop() | true, output o(x1)
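
The ingredients listed above (locations, one register, guarded transitions with assignments) can be encoded directly. The following is a minimal sketch of an acceptor for words push(d) pop() o(d); the class RegisterAutomaton, its methods, and the location numbering are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiPredicate;

/** Hypothetical encoding of a register automaton with one register x1. */
final class RegisterAutomaton {
    private static final class Transition {
        final int source;
        final String action;
        final BiPredicate<Integer, Integer> guard; // guard over (x1, p)
        final boolean assign;                      // true means x1 := p
        final int target;

        Transition(int source, String action, BiPredicate<Integer, Integer> guard,
                   boolean assign, int target) {
            this.source = source;
            this.action = action;
            this.guard = guard;
            this.assign = assign;
            this.target = target;
        }
    }

    private final List<Transition> transitions = new ArrayList<>();
    private final int accepting;

    RegisterAutomaton(int accepting) {
        this.accepting = accepting;
    }

    RegisterAutomaton add(int source, String action, BiPredicate<Integer, Integer> guard,
                          boolean assign, int target) {
        transitions.add(new Transition(source, action, guard, assign, target));
        return this;
    }

    /** Runs a word given as parallel lists of actions and parameters (null if none). */
    boolean accepts(List<String> actions, List<Integer> params) {
        int location = 0;
        Integer x1 = null;
        for (int i = 0; i < actions.size(); i++) {
            Transition taken = null;
            for (Transition t : transitions) {
                if (t.source == location && t.action.equals(actions.get(i))
                        && t.guard.test(x1, params.get(i))) {
                    taken = t;
                    break;
                }
            }
            if (taken == null) {
                return false;          // no enabled transition: reject
            }
            if (taken.assign) {
                x1 = params.get(i);    // assignment x1 := p
            }
            location = taken.target;
        }
        return location == accepting;
    }

    /** Acceptor for push(d) pop() o(d); the location numbering is an assumption. */
    static RegisterAutomaton store() {
        return new RegisterAutomaton(3)
                .add(0, "push", (x, p) -> true, true, 1)
                .add(1, "pop", (x, p) -> true, false, 2)
                .add(2, "o", (x, p) -> x != null && x.equals(p), false, 3);
    }
}
```

The guard x1 = p on the last transition is the only place where data values are compared, so the automaton accepts the same words for every choice of d.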

SLIDE 25

Nerode-like Equivalence Relation

[Figure: after the prefix push(1) pop() the residual is o(1); after the prefix push(2) pop() the residual is o(2)]

Let W be the set of all data words.

Equivalence wrt. L: two words u, u′ ∈ W are equivalent wrt. ≡_L iff there exists a permutation π on D s.t. for all v ∈ W: uv ∈ L ⇔ u′π(v) ∈ L

Characterization Theorem: [Cassel et al., 2011]

SLIDE 26

Nerode-like Equivalence Relation

[Figure: the residual o(1) equals π applied to the residual o(2)]

π(1) = 2, π(2) = 1, ..., π(i) = i, ...

Let W be the set of all data words.

Equivalence wrt. L: two words u, u′ ∈ W are equivalent wrt. ≡_L iff there exists a permutation π on D s.t. for all v ∈ W: uv ∈ L ⇔ u′π(v) ∈ L

Characterization Theorem: [Cassel et al., 2011]
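
For the store example the definition can be checked on a finite sample of suffixes. This sketch is restricted to prefixes of the form push(d) pop() and suffixes o(e), with integers as data values; PermutationEquivalence and its methods are illustrative names, and a finite sample can only refute equivalence, never prove it.

```java
import java.util.function.IntUnaryOperator;

/** Hypothetical check of the permutation-based equivalence on the store example. */
final class PermutationEquivalence {
    /** L_store restricted to words push(d1) pop() o(d2): membership iff d1 = d2. */
    static boolean inStore(int pushed, int output) {
        return pushed == output;
    }

    /**
     * Checks uv ∈ L ⇔ u′π(v) ∈ L for u = push(d) pop(), u′ = push(dPrime) pop(),
     * and all suffixes v = o(e) with e taken from a finite sample of the domain.
     */
    static boolean equivalentOnSample(int d, int dPrime, IntUnaryOperator pi, int[] sample) {
        for (int e : sample) {
            if (inStore(d, e) != inStore(dPrime, pi.applyAsInt(e))) {
                return false;
            }
        }
        return true;
    }
}
```

With π swapping 1 and 2, the prefixes push(1) pop() and push(2) pop() agree on every sampled suffix; with the identity in place of π they already disagree on o(1).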

SLIDE 28

Theories and Symbolic Decision Trees

Theory: data domain + set of relations

Examples:

  • N with =, ≠
  • R with <, >
  • Z with =, succ

Symbolic decision tree (SDT): symbolic classifier

Example: SDT for suffix o(p) after push(1) pop(), with register x1:

  • o(p) | x1 = p
  • o(p) | x1 ≠ p

Tree oracle: a tree oracle computes an SDT for a prefix u and a set of symbolic suffixes v.

  • A tree oracle for certain classes of theories can be realized using only membership queries.
  • Idea: unique test cases for all minterms, merge canonically [Cassel et al., 2016]
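
For the theory of equality, the "one test case per minterm" idea can be illustrated for a suffix with a single parameter p. This is a simplified sketch, not the construction from [Cassel et al., 2016]: EqualityTreeOracle and sdt are hypothetical names, the tree has only one level, guards are plain strings, and the data values in the prefix are assumed to be distinct integers.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiPredicate;

/** Hypothetical one-level tree oracle for the theory of equality. */
final class EqualityTreeOracle {
    /**
     * memberQuery(prefixValues, p) answers whether the prefix, followed by the
     * suffix instantiated with p, is in the language. The result maps guards to verdicts.
     */
    static Map<String, Boolean> sdt(List<Integer> prefixValues,
                                    BiPredicate<List<Integer>, Integer> memberQuery) {
        // Representative for the minterm "p differs from all prefix values".
        int fresh = 0;
        for (int d : prefixValues) {
            fresh = Math.max(fresh, d + 1);
        }
        boolean freshVerdict = memberQuery.test(prefixValues, fresh);

        Map<String, Boolean> tree = new LinkedHashMap<>();
        StringBuilder elseGuard = new StringBuilder();
        for (int i = 0; i < prefixValues.size(); i++) {
            // Representative for the minterm "p equals the i-th prefix value".
            boolean verdict = memberQuery.test(prefixValues, prefixValues.get(i));
            if (verdict != freshVerdict) {
                // The equality matters, so the value is kept in register x_{i+1}.
                tree.put("x" + (i + 1) + " = p", verdict);
                if (elseGuard.length() > 0) {
                    elseGuard.append(" and ");
                }
                elseGuard.append("x").append(i + 1).append(" != p");
            }
        }
        // Merge step: all remaining cases share the verdict of the fresh value.
        tree.put(tree.isEmpty() ? "true" : elseGuard.toString(), freshVerdict);
        return tree;
    }
}
```

For the prefix push(1) pop() and the store language this yields the two branches from the example, x1 = p (accepted) and x1 != p (rejected); if membership does not depend on p, the branches merge into the single guard true and no register is needed.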

SLIDE 29

The SL∗ Algorithm

Modular learning algorithm for RAs with theories:

  • Inspired by L∗
  • Tree oracle produces SDTs
  • SDTs replace suffixes

[Figure: observation table. Rows are prefixes from U ∪ U⁺ (e.g., push(1), push(1) pop(), pop(), push(1) push(2), ...); columns are symbolic suffixes from V (including ε, null, o(p)); cells are SDTs, e.g., register x1 with branches o(p) | x1 = p and o(p) | x1 ≠ p]

  • Prefixes lead to locations
  • Equivalent SDTs identify locations
  • SDTs provide registers and guards

Some requirements on tree oracle: canonicity and coherence of SDTs [Cassel et al., 2016]

SLIDE 30

Summary

Learning RA Models:

  • Identify locations, registers, and transitions
  • Multiple lines of work
  • Current expressivity: fresh outputs, simple arithmetic operations

Tools:

  • LearnLib / RaLib
  • Tomte
  • PSYCO
  • EFSMTool

SLIDE 32

Quo Vadis / Future Research Directions

[Figure: Environment, Component, and Requirement, each with its model]

Env. Model ∥ Comp. Model ⊨ Req. Model

Learning Models:

  • Multi-step learning for environment models?
  • Quality of models (PAC-style)?
  • Life-long learning

Expressivity:

  • Limits of theories (and abstraction)?

Efficiency:

  • Transfer results from regular languages to RAs?
  • Integrate CEGAR-based and SE-based approaches?

Applications:

  • Enabler for industrial-scale compositional verification?

SLIDE 33

References I

F. D. Aarts. Tomte: bridging the gap between active learning and real-world systems, 2014.

Fides Aarts, Faranak Heidarian, Harco Kuppens, Petur Olsen, and Frits W. Vaandrager. Automata learning through counterexample guided abstraction refinement. In FM 2012: Formal Methods - 18th International Symposium, Paris, France, August 27-31, 2012. Proceedings, pages 10–27, 2012. doi: 10.1007/978-3-642-32759-9_4. URL http://dx.doi.org/10.1007/978-3-642-32759-9_4.

Fides Aarts, Paul Fiterau-Brostean, Harco Kuppens, and Frits W. Vaandrager. Learning register automata with fresh value generation. In Theoretical Aspects of Computing - ICTAC 2015 - 12th International Colloquium Cali, Colombia, October 29-31, 2015, Proceedings, pages 165–183, 2015. doi: 10.1007/978-3-319-25150-9_11. URL http://dx.doi.org/10.1007/978-3-319-25150-9_11.

Dana Angluin. Learning Regular Sets from Queries and Counterexamples. Information and Computation, 75(2):87–106, 1987.

M. Benedikt, C. Ley, and G. Puppis. What You Must Remember When Processing Data Words. In Proceedings of the 4th Alberto Mendelzon Int. Workshop on Foundations of Data Management, volume 619 of CEUR Workshop Proceedings.

Therese Berg, Bengt Jonsson, and Harald Raffelt. Regular Inference for State Machines with Parameters. In Proceedings of the 9th Int. Conf. on Fundamental Approaches to Software Engineering, FASE ’06, volume 3922 of Lecture Notes in Computer Science, pages 107–121. Springer Verlag, 2006. ISBN 3-540-33093-3.

Benedikt Bollig, Peter Habermehl, Martin Leucker, and Benjamin Monmege. A fresh approach to learning register automata. In Developments in Language Theory, pages 118–130. Springer, 2013.

Matko Botinčan and Domagoj Babić. Sigma*: symbolic learning of input-output specifications. In ACM SIGPLAN Notices, volume 48, pages 443–456. ACM, 2013.

Sofia Cassel, Falk Howar, Bengt Jonsson, Maik Merten, and Bernhard Steffen. A Succinct Canonical Register Automaton Model. In Proceedings of the 9th Int. Symposium on Automated Technology for Verification and Analysis, ATVA’11, volume 6996 of Lecture Notes in Computer Science, pages 366–380. Springer Verlag, 2011.

Sofia Cassel, Falk Howar, and Bengt Jonsson. RALib: A LearnLib extension for inferring EFSMs. In DIFTS 2015, 2015.

SLIDE 34

References II

Sofia Cassel, Falk Howar, Bengt Jonsson, and Bernhard Steffen. Active learning for extended finite state machines. Formal Asp. Comput., 28(2):233–263, 2016. doi: 10.1007/s00165-016-0355-5. URL http://dx.doi.org/10.1007/s00165-016-0355-5.

Dimitra Giannakopoulou, Zvonimir Rakamarić, and Vishwanath Raman. Symbolic learning of component interfaces. In International Static Analysis Symposium (SAS), pages 248–264, 2012.

Falk Howar, Bernhard Steffen, and Maik Merten. Automata Learning with Automated Alphabet Abstraction Refinement. In Proceedings of the 12th Int. Conf. on Verification, Model Checking, and Abstract Interpretation, VMCAI’11, volume 6538 of Lecture Notes in Computer Science, pages 263–277. Springer Verlag, 2011.

Falk Howar, Malte Isberner, Bernhard Steffen, Oliver Bauer, and Bengt Jonsson. Inferring Semantic Interfaces of Data Structures. ISoLA 2012, 2012a.

Falk Howar, Bernhard Steffen, Bengt Jonsson, and Sofia Cassel. Inferring Canonical Register Automata. In Proceedings of the 13th Int. Conf. on Verification, Model Checking, and Abstract Interpretation, VMCAI’12, volume 7148 of Lecture Notes in Computer Science, pages 251–266. Springer Verlag, 2012b.

Davide Lorenzoli, Leonardo Mariani, and Mauro Pezzè. Automatic generation of software behavioral models. In Proceedings of the 30th Int. Conf. on Software Engineering, ICSE’08, pages 501–510. ACM, 2008. ISBN 978-1-60558-079-1. doi: 10.1145/1368088.1368157. URL http://doi.acm.org/10.1145/1368088.1368157.

Oded Maler and Irini-Eleftheria Mens. Learning regular languages over large alphabets. In Tools and Algorithms for the Construction and Analysis of Systems - 20th International Conference, TACAS 2014, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2014, Grenoble, France, April 5-13, 2014. Proceedings, pages 485–499, 2014. doi: 10.1007/978-3-642-54862-8_41. URL http://dx.doi.org/10.1007/978-3-642-54862-8_41.

Muzammil Shahbaz, Keqin Li, and Roland Groz. Learning Parameterized State Machine Model for Integration Testing. In Proceedings of the 31st Annual Int. Computer Software and Applications Conference, COMPSAC’07, pages 755–760. IEEE Computer Society, 2007. ISBN 0-7695-2870-8. doi: 10.1109/COMPSAC.2007.134.

SLIDE 35

References III

Neil Walkinshaw, Ramsay Taylor, and John Derrick. Inferring extended finite state machine models from software executions. In 20th Working Conference on Reverse Engineering, WCRE 2013, Koblenz, Germany, October 14-17, 2013, pages 301–310, 2013. doi: 10.1109/WCRE.2013.6671305. URL http://dx.doi.org/10.1109/WCRE.2013.6671305.

Hao Xiao, Jun Sun, Yang Liu, Shang-Wei Lin, and Chengnian Sun. Tzuyu: Learning stateful typestates. In Automated Software Engineering (ASE), 2013 IEEE/ACM 28th International Conference on, pages 432–442. IEEE, 2013.