Reverse engineering - traces to state machines

SLIDE 1

Inference Competition

Reverse engineering - traces to state machines

Neil Walkinshaw and Kirill Bogdanov

Department of Computer Science, The University of Sheffield

TAIC PART, September 4, 2010

  • U. of Sheffield

Reverse engineering - traces to state machines

SLIDE 2

Outline

1. Inference
  • Motivation
  • The idea of a passive learner
  • k-tails
  • A more clever learner

2. Competition

SLIDE 3

State-based models are useful

  • For understanding software,
  • Model-checking,
  • Test generation.

SLIDE 4

Maintenance can be difficult

  • Legacy software tends to have no models associated with it.
  • A failing test could indicate a fault in a model.
  • Requirements-level defects have to be corrected in both the model and the code.

SLIDE 5

Grammar inference

Assuming that

  • we know how to interpret traces from a program as sequences of events, and
  • we know the overall pattern a model should obey (such as recognising a regular language),

the task is to learn models from event traces.

SLIDE 6

k-tails learner

  • Take traces and hypothesise what other traces should or should not be possible, assuming that some states in traces correspond to the same state in the model.
  • k-tails assumes that if the suffixes of length k from two states are the same, so are the states.
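
The k-tails idea can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. It assumes traces are lists of event names, treats every trace prefix as a candidate state, and uses the common variant that compares continuations of length up to k; the function name `k_tails` is hypothetical.

```python
from collections import defaultdict


def k_tails(traces, k):
    """Group trace prefixes whose observed continuations of length <= k coincide.

    Returns (state_of, transitions): state_of maps each prefix (a tuple of
    events) to a merged state number, and transitions is a set of
    (source, event, target) triples. The result may be non-deterministic.
    """
    # Each distinct prefix of a trace is one state of the prefix tree.
    tails = defaultdict(set)
    for trace in traces:
        trace = tuple(trace)
        for i in range(len(trace) + 1):
            prefix = trace[:i]
            # Record every continuation of length 0..k observed after this prefix.
            for j in range(i, min(i + k, len(trace)) + 1):
                tails[prefix].add(trace[i:j])

    # Prefixes with identical tail sets are assumed to be the same state.
    class_of_tails = {}
    state_of = {}
    for prefix in sorted(tails, key=lambda p: (len(p), p)):
        key = frozenset(tails[prefix])
        state_of[prefix] = class_of_tails.setdefault(key, len(class_of_tails))

    # Every tree edge (prefix without last event -> prefix) becomes a transition.
    transitions = set()
    for prefix in tails:
        if prefix:
            transitions.add((state_of[prefix[:-1]], prefix[-1], state_of[prefix]))
    return state_of, transitions
```

With the hypothetical traces open-edit-save, open-edit-edit-save and open-edit-edit-edit-save and k = 1, the prefixes ending after one and after two edits fall into the same state, which produces an edit self-loop that generalises beyond the observed traces.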

SLIDE 8

k-tails learner

SLIDE 9

A lot of work was done by the Grammar Inference community

  • On passive learners: no feedback from a user.
  • If the initial prefix tree acceptor (PTA) has "enough" positive and negative sequences, the correct FSM will be learnt.
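
A PTA can be built from positive and negative sequences as sketched below. This is an illustration under stated assumptions rather than the learner described in the talk: the name `build_pta` is hypothetical, every prefix of a positive sequence is assumed to be accepted (prefix-closed behaviour, as is usual for software traces), and only the last state of a negative sequence is marked as rejecting.

```python
def build_pta(positive, negative):
    """Build a prefix tree acceptor from positive and negative event sequences.

    Returns (transitions, labels, n_states). transitions maps
    (state, event) -> state; labels maps a state to True (accept) or
    False (reject); states that are not labelled have an unknown status.
    State 0 is the initial state.
    """
    transitions = {}
    labels = {}
    n_states = 1

    def add(trace):
        # Walk the tree along the trace, creating missing nodes on the way.
        nonlocal n_states
        state, path = 0, [0]
        for event in trace:
            if (state, event) not in transitions:
                transitions[(state, event)] = n_states
                n_states += 1
            state = transitions[(state, event)]
            path.append(state)
        return path

    for trace in positive:
        for state in add(trace):
            labels[state] = True      # assumption: prefix-closed language
    for trace in negative:
        last = add(trace)[-1]
        if labels.get(last) is True:
            raise ValueError(f"sequence {trace!r} is both positive and negative")
        labels[last] = False          # only the final state is a reject state
    return transitions, labels, n_states
```

For example, the positive sequences open-edit-save and open-edit-edit together with the negative sequence open-save give a tree of six states, one of which is a reject state.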

SLIDE 10

[Figure: prefix tree built from traces, with transitions labelled open, edit and save]

  • Starting from the initial node, pairs of states are considered and merged in the order of their compatibility score.
  • The outcome of a merge has to be validated: merging opens a new path, open, edit, save, edit, edit, which is not in the original tree.
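
One way to compute a compatibility score for a pair of states is sketched below. It is an assumption for illustration, not necessarily the score used in the talk: the score counts the transitions that the two states share when followed in lock-step, and is -1 when an accept state would be merged with a reject state. The name `merge_score` and the data layout (a deterministic `transitions` dictionary and a `labels` dictionary) are hypothetical.

```python
def merge_score(transitions, labels, a, b):
    """Score the merge of states a and b in a deterministic machine.

    transitions maps (state, event) -> state; labels maps a state to
    True (accept) or False (reject). Returns the number of shared
    transitions found by following a and b in lock-step, or -1 if an
    accept state would have to be merged with a reject state.
    """
    outgoing = {}
    for (src, event), dst in transitions.items():
        outgoing.setdefault(src, {})[event] = dst

    score, seen, todo = 0, set(), [(a, b)]
    while todo:
        x, y = todo.pop()
        if (x, y) in seen:
            continue                  # guards against loops in the machine
        seen.add((x, y))
        lx, ly = labels.get(x), labels.get(y)
        if lx is not None and ly is not None and lx != ly:
            return -1                 # accept/reject conflict: incompatible
        for event, next_x in outgoing.get(x, {}).items():
            next_y = outgoing.get(y, {}).get(event)
            if next_y is not None:
                score += 1            # both states can take this event
                todo.append((next_x, next_y))
    return score
```

A learner would compute this score for candidate pairs and try the highest-scoring pair first; pairs with a score of -1 are never merged.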

SLIDE 11

[Figure: partly merged machine with transitions labelled open, edit and save; two states are marked "?"]

  • Since dynamic analysis does not give "enough" traces, feedback is used to validate merges.
  • The two marked states cannot be merged: if a learner attempts to merge them, a user will say that open, save cannot be performed, hence a reject-node is added.
  • Experimental results: if states with a high score (such as 3) are always merged, we can get a 10x reduction in the number of questions and around a 10% reduction in the quality of the learnt machine.
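
The trade-off between questions and quality can be pictured as a simple decision rule. The sketch below is hypothetical: `decide_merge`, its parameters and the `oracle` callback (a user, a test run or an analysis that answers whether a path is possible) are assumptions, and only the threshold of 3 comes from the slide.

```python
def decide_merge(score, new_paths, oracle, auto_merge_score=3):
    """Decide whether to merge a pair of states.

    score            -- compatibility score of the pair (negative: incompatible)
    new_paths        -- paths that the merge would add to the machine
    oracle           -- callable answering True if a path is possible
    auto_merge_score -- pairs scoring at least this are merged without questions

    Returns (merge, rejected), where rejected lists the paths the oracle
    refused; a learner would add them to the tree as reject-nodes.
    """
    if score < 0:
        return False, []              # known to be incompatible, nothing to ask
    if score >= auto_merge_score:
        return True, []               # trusted merge: no questions asked
    rejected = [path for path in new_paths if not oracle(path)]
    return not rejected, rejected
```

Raising `auto_merge_score` makes the learner ask more questions; lowering it saves questions but accepts more merges that nobody has checked.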

SLIDE 12

  • Questions can be executed on a system, checked using static analysis or presented as questions to a developer.
  • State merging performs no systematic exploration.
  • To make the analysis more complete, static analysis can be used to compute an underapproximation of the infeasible paths, giving a better-quality tree without extra queries.

SLIDE 13

IF-THEN properties

[Figure: an IF-THEN property drawn as state machine fragments with transitions labelled open, edit and save]

SLIDE 14

Graph comparison

[Figure: graph comparison of state machines with transitions labelled connect, login, logout, disconnect, initialise, listfiles, listnames, changedirectory, makedir, rename, delete, setfiletype, storefile, appendfile and retrievefile]

SLIDE 15

Competition

Existing techniques tend to be evaluated

  • on an alphabet of 2,
  • on automata that are not necessarily sparse,
  • with an uncertain transition structure, whereas software models
      • tend to have a hub-based structure,
      • tend to get deeper as the number of states grows.

The idea is to start a competition where one would aim to learn state machines typical of software.

SLIDE 16

Participate

[Figure: grid of competition problems; axes: alphabet (2, 5, 10, 20, 50) and sample size (100, 50, 25, 12.5)]

  • http://stamina.chefbe.net/
  • Download sequences, upload labelling of tests,
  • USD 1053 prize money,
  • Special issue of Journal of Empirical Software Engineering.

SLIDE 17

PostDoc position open

PostDoc position open at the University of Sheffield, UK, for up to 2 years from now.
