Learning Software Models
Alessandra Russo, Imperial College London

In collaboration with:
Dalal Alrajeh, Jeff Kramer, Jeff Magee, Daniel Sykes, Domenico Corapi (Imperial College London)
Sebastián Uchitel (Universidad de Buenos Aires)
Axel van Lamsweerde
Katsumi Inoue
Engineering Software Models
System goals
refinement
G[PumpOnWhenHighWaterAndNoMethane] = (tick → ((HighWater ∧ ¬CriticalMethane) → ◯(¬tick W (tick ∧ PumpOn))))
G[PumpOffWhenLowWater] = (tick → (LowWater → ◯(¬tick W (tick ∧ ¬PumpOn))))
G[PumpOffWhenMethane] = (tick → (CriticalMethane → ◯(¬tick W (tick ∧ ¬PumpOn))))
G[AlarmWhenMethane] = (tick → (CriticalMethane → ◯(¬tick W (tick ∧ Alarm))))
obstacle analysis   elaboration
Informal requirements
Execution traces
adaptation
holdsAt(at(loc1), 0) do(pickup, 0) holdsAt(at(loc1), 1) holdsAt(holdingObject, 1) do(move(loc1, loc3), 1) holdsAt(at(loc3), 2) holdsAt(holdingObject, 2) do(move(loc3, loc5), 2) holdsAt(at(loc5), 3) holdsAt(holdingObject, 3) do(putdown, 3)
Learning Software Models
Logic-based learning can provide automated support to software model elaboration, refinement, adaptation and analysis
Software models: formal description of
pre-conditions trigger-conditions post-conditions
Op "TurnPumpOff"
(criticalMethane → ◯ turnPumpOff)
Operationalisation patterns
Req ⊨ Goals
… In practice
undesirable behaviours desirable behaviours
Automatically generate software models that:
cover scenarios of desired behaviours
reject scenarios of undesirable behaviours
satisfy given domain properties and partial requirements

Desirable behaviours, Undesirable behaviours, Domain properties, Partial requirements → logic-based learning → Operational Requirements
Logic Programming Knowledge Representation Machine Learning
Logic-based Learning
with experience
Learning declarative knowledge from
K: Domain knowledge
E+: Positive examples
E-: Negative examples
IC: Integrity constraints
Given Find
H: New knowledge, such that
K ∪ H ⊨ E+
K ∪ H ⊭ E-
K ∪ H ∪ IC ⊭ false
K: nat(0)   nat(s(X)) ← nat(X)   even(0)
IC: false ← odd(X), even(X)
E+: odd(5)   E-: not odd(2), not odd(4)

g: odd(5), not odd(4), not odd(2)
H: odd(X) ←
H: odd(X) ← X = s(Y)
H: odd(X) ← X = s(Y), even(Y)
“artificial example” g: even(4)
even(X) ← X = s(Y), odd(Y)
modeh(odd(+nat)) modeh(even(+nat)) modeb(even(+nat)) modeb(odd(+nat)) modeb(=(+nat, s(-nat)))
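The odd/even task above can be checked with a brute-force sketch (hypothetical Python, not an actual ILP system): ground the naturals up to a bound, compute the least fixpoint of the candidate hypothesis together with the background knowledge, and test the coverage conditions.

```python
# Toy grounding of the odd/even learning task: naturals 0..N as ints,
# s(Y) read as Y+1. Not an ILP search, only a check of one hypothesis.
N = 6

def closure(rules):
    """Least fixpoint of {even, odd} facts under the given rules."""
    facts = {("even", 0)}            # background: even(0)
    changed = True
    while changed:
        changed = False
        for head, body in rules:     # rule fires when body(x, facts) is true
            for x in range(N + 1):
                if (head, x) not in facts and body(x, facts):
                    facts.add((head, x))
                    changed = True
    return facts

# candidate hypothesis clauses, expressible under the mode declarations
H = [
    ("odd",  lambda x, f: x >= 1 and ("even", x - 1) in f),  # odd(X) <- X=s(Y), even(Y)
    ("even", lambda x, f: x >= 1 and ("odd",  x - 1) in f),  # even(X) <- X=s(Y), odd(Y)
]
facts = closure(H)
# K ∪ H ⊨ E+  and  K ∪ H ⊭ E-
assert ("even", 4) in facts and ("odd", 5) in facts
assert ("odd", 2) not in facts and ("odd", 4) not in facts
# IC: false <- odd(X), even(X)
assert all(not (("odd", x) in facts and ("even", x) in facts) for x in range(N + 1))
```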
Key Features
Able to learn concepts that are not observed, recursive, or inter-dependent
Use of heuristics to drive the search
Handle integrity constraints
Amenable to distributed computation
Domain knowledge: domain properties (D), partial requirements (R)
E+: desirable behaviours
E-: undesirable behaviours
Given Find
New operational requirements Req, such that
D ∪ R ∪ Req ⊨ E+
D ∪ R ∪ Req ⊭ E-
D ∪ R ∪ Req ⊭ false
logic-based learning
undesired behaviours (E-):
◯ signalCriticalMethane, ◯² signalHighWater, ◯³ turnPumpOn
◯ signalLowWater, ◯² turnPumpOn

desired behaviours (E+):
◯ signalNoCriticalMethane, ◯² signalHighWater, ◯³ turnPumpOn

Encode FLTL into LP → Execute learning → Translate back into FLTL
(pumpOn → pumpOn W turnPumpOff)

FLTL specifications are encoded as Event Calculus logic programs with respect to the given desired and undesired behaviours. Traces can be treated as positive examples because of the single stable model of the EC representation.
Can we automatically generate a complete set of operational requirements (i.e. pre-conditions and trigger-conditions) that, together with the domain properties, satisfies a given goal model?
logic-based learning
Domain knowledge: domain properties (D) set of goals (G)
Given Find
Set of operational requirements Req, such that
D ∪ Req ⊨ G
D ∪ Req ⊭ false
model checking Learn
While D ∪ R ⊭ G:
  Compute undesirable scenarios E-: D ∪ R ∪ E- ⊭ G
  Compute desirable scenarios E+: D ∪ R ∪ E+ ∪ G ⊭ false
  Learn Reqi such that:
    D ∪ R ∪ Reqi ⊭ E-
    D ∪ R ∪ Reqi ∪ E+ ⊭ false
    D ∪ R ∪ Reqi ∪ G ⊭ false
  R = R ∪ Reqi
Negative examples Positive examples Domain properties Operational Requirements (Pre-conditions Trigger-conditions)
logic-based learning model checking
Counterexamples Witnesses Goal Model
LTS M is synthesised from the FLTL software model (D ∪ R). M is model checked against the goals G, and a counterexample C is generated. Witness traces W of the violated goal are computed from M ∪ G.
Positive and negative examples are generated from the witnesses and the counterexample. The FLTL software model (D ∪ R) is encoded into an EC logic program (K), and the goals are expressed as integrity constraints (IC). The learning task computes the missing requirements (Reqi).
FLTL representation of selected hypothesis is added to software model.
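The cycle can be schematised as follows (a structural sketch with stub functions; the real components are FLTL model checking and ILP learning, and the toy instantiation below is purely illustrative):

```python
def refine(D, R, G, model_check, compute_witnesses, learn):
    """Iterate: model-check D ∪ R against G, learn missing requirements Reqi."""
    while True:
        C = model_check(D, R, G)            # counterexample, or None if D ∪ R ⊨ G
        if C is None:
            return R
        E_neg = [C]                          # undesirable scenario from counterexample
        E_pos = compute_witnesses(D, R, G)   # desirable scenarios (witness traces)
        R = R | learn(D, R, G, E_pos, E_neg)  # R := R ∪ Reqi

# toy instantiation: goals are a set of required facts, learning returns one missing fact
G = {"pumpOffWhenMethane", "alarmWhenMethane"}
mc = lambda D, R, G: next(iter(G - R), None)
wit = lambda D, R, G: []
learn = lambda D, R, G, Ep, En: {En[0]}
final = refine(set(), frozenset(), G, mc, wit, learn)
assert final == G
```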
Elaborating examples from counterexample and witnesses
Identify undesirable event in the counterexample
software controlled event missing pre-condition
learning the missing pre-condition
impossible(switchPumpOn, P, S) ← holdsAtPrev(criticalMethane, P, S). (tick → (criticalMethane → ◯(¬switchPumpOn W tick)))
⃞(tick → (CriticalMethane → ◯(¬tick W (tick ∧ ¬PumpOn))))
Goal
Elaborating examples from counterexample and witnesses
tick event missing trigger-condition
impossible(tick, P, S) ← holdsAtPrev(criticalMethane, P, S), holdsAtPrev(pumpOn, P, S), not occursSinceLastTick(switchPumpOff, P, S).
learning the missing trigger-condition
(tick →(criticalMethane ∧ pumpOn → ◯(¬tick W switchPumpOff)))
most specific software model - each iteration removes traces, consistently with the goals.
more than one goal violation could be resolved in an iteration.
ambulance, etc.).
persistency (i.e. frame problem).
stable models, which helps prove the correctness of the refinement.
Requirements incompleteness → system failure
Often caused by poor risk analysis and unexpected events
Hard to anticipate what could go wrong, and why
Preconditions for a goal to be violated in a given domain: D ∪ O ⊨ ¬G
Feasible in the domain: D ∪ O ⊭ ⊥
Domain-complete set of obstacles: D ∪ ¬O1 ∪ … ∪ ¬On ⊨ G
assess likelihood and severity
Heuristics [Anton & Potts 98, van Lamsweerde & Letier 00, Sutcliffe 99]
Formal calculus [van Lamsweerde & Letier 00]
Formal obstruction patterns [van Lamsweerde & Letier 00]
Can we automatically generate obstacle conditions for a given goal model and domain properties?
Domain knowledge (LTL):
D ⊭ G
Given Find
Set of obstacles {O1, O2, …, On}, such that
D ∪ Oi ⊨ ¬G
D ∪ Oi ⊭ false
D ∪ {¬O1, …, ¬On} ⊨ G
Positive examples  Negative examples  Domain properties  Obstacles that explain the examples
logic-based learning model checking
Counterexamples Witnesses Goal Model
{D, G}: D ⊨ G? D ⊨ GAT? A counterexample yields an obstacle {O}
{¬O, D, G}: D ∪ ¬O ⊨ G? D ∪ ¬O ⊨ GAT? Yields {O, O', D'}
{¬O, ¬O', D', G}: D ∪ ¬O ∪ ¬O' ⊨ G? D ∪ ¬O ∪ ¬O' ⊨ GAT? …
Domain-Complete Set of Obstacles: {¬O1, …, ¬On, D} ⊨ G
model checking
Counterexamples Witnesses C ⇒ ⊝T
G = Goal "Train Stops At Signal if Stop Signal is On": StopSignalOn ⇒ ◦ TrainStopped
D =
Fluent definitions Necessary conditions for goal target
Assumptions
Fluent definition
StopSignalOn = < set_to_stop, set_to_go, false > TrainStopped = < stop_train, start_train, false > SignalVisible = < clear_signal, obstruct_signal, true >
[LTS diagram: states 1-4 with transitions set_to_stop, set_to_go, clear_signal, obstruct_signal, stop_train, start_train, driver_responds, driver_ignores]
Goal obstruction: D ⊨ C ⇒ ⊝T    tr- = <set_to_stop, obstruct_signal>
Goal satisfiability: D ⊨ C ⇒ ¬⊝T    tr+ = <set_to_stop, stop_train>
Automatically translate domain properties, goals, counterexample and witness(es) into the logic formalism understood by learning tool
G: StopSignalOn ⇒ ◦ TrainStopped
D: TrainStopped ⇒ SignalVisible; TrainStopped ⇒ DriverResponsive
TrainStopped = <stop_train, start_train, false>
StopSignalOn = <set_to_stop, set_to_go, false>
SignalVisible = <clear_signal, obstruct_signal, true>
tr-: set_to_stop, obstruct_signal
tr+: set_to_stop, stop_train
Positive examples Negative examples Obstacles that explain the examples
logic-based learning
Fluent definitions Necessary conditions for goal target C ⇒ ⊝T
G = D =
% IC
:- holdsAt(trainStopped,T,S), not holdsAt(signalVisible,T,S).
...
% D
initiates(stop_train, trainStopped).
terminates(start_train, trainStopped).
initiates(clear_signal, signalVisible).
terminates(obstruct_signal, signalVisible).
initially(signalVisible).
...
% G
holdsAt(trainStopped,T2,S) :- holdsAt(stopSignalOn,T1,S), next(T2,T1),
    not obstructed_next(trainStopped,T1,S).
% Traces
happens(set_to_stop,0,tr_neg).
happens(obstruct_signal,1,tr_neg).
happens(set_to_stop,0,tr_pos).
happens(stop_train,1,tr_pos).
e+ = not holdsAt(trainStopped, 2, tr_neg)
Hypothesis must be computed to explain why the goal’s target is obstructed in this example
e+ = holdsAt(trainStopped, 2, tr_pos)
The computed hypothesis should be consistent with the goal's target not being obstructed in the witness trace
Automatically translate learned hypothesis into LTL obstacle expression
obstructed_next(trainStopped,T,S) :- holdsAt(stopSignalOn,T,S), not holdsAt(signalVisible,T,S).
O1 = ◇(StopSignalOn ∧ ¬SignalVisible)    ¬O1 = ⃞(¬StopSignalOn ∨ SignalVisible)
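The goal and the learned obstacle can be checked against the two traces with a small finite-trace evaluator (a sketch: fluent updates follow the <initiating, terminating, initial> definitions, and the goal is read as "StopSignalOn implies TrainStopped at the next position"):

```python
FLUENTS = {  # fluent: (initiating events, terminating events, initial value)
    "StopSignalOn":  ({"set_to_stop"},  {"set_to_go"},       False),
    "TrainStopped":  ({"stop_train"},   {"start_train"},     False),
    "SignalVisible": ({"clear_signal"}, {"obstruct_signal"}, True),
}

def states(trace):
    """Fluent valuations at positions 0..len(trace) of an event trace."""
    s = {f: init for f, (_, _, init) in FLUENTS.items()}
    out = [dict(s)]
    for e in trace:
        for f, (ini, term, _) in FLUENTS.items():
            if e in ini:  s[f] = True
            if e in term: s[f] = False
        out.append(dict(s))
    return out

def goal_holds(trace):  # G: StopSignalOn => TrainStopped at the next position
    st = states(trace)
    return all(st[i + 1]["TrainStopped"]
               for i in range(len(st) - 1) if st[i]["StopSignalOn"])

tr_neg = ["set_to_stop", "obstruct_signal"]
tr_pos = ["set_to_stop", "stop_train"]
assert not goal_holds(tr_neg)   # the counterexample violates G
assert goal_holds(tr_pos)       # the witness satisfies G
# O1 = eventually(StopSignalOn and not SignalVisible) holds on the counterexample
assert any(s["StopSignalOn"] and not s["SignalVisible"] for s in states(tr_neg))
```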
Tool-supported approach for incremental generation of domain- complete set of obstacles No user intervention is required for example provision. Increased goal expressiveness with respect to earlier work Domain-feasibility of generated obstacles guaranteed by the soundness of the learner Supports elicitation of relevant domain properties. Applied successfully to other problem domains (e.g. London Ambulance System)
specifying the rules)
Context, Constraints User Agent
E+ E+ E+ E+ Co-location Location (GPS, name)
Call properties (Contact list)
Call from Alice at 9:00 Call from Bob at 10:30 Call from Charles at 11:30
"User answers the call if {conditions}"
Challenge A system that learns what the user wants from past actions and uses them to reduce intervention in privacy management.
New / Revised Rules
Reality Mining dataset, single users
Learn rules able to predict when the user answers phone calls
modeh(accept(+date, +time, +contact, +volume, +vibrator, +battery_level, +screen_brightness, +headset, +screen_status, +light_level, +battery_charging)). 1
modeb(=(+contact, #contact), [no_ground_constants, name(c)]). 200
modeb(=(+volume, #volume), [no_ground_constants, name(vol)]). 20
modeb(=(+vibrator, #vibrator), [no_ground_constants, name(vib)]). 20
modeb(=(+battery_level, #battery_level), [no_ground_constants, name(bl)]). 200
modeb(=(+screen_brightness, #screen_brightness), [no_ground_constants, name(scb)]). 20
modeb(=(+headset, #headset), [no_ground_constants, name(hs)]). 20
modeb(=(+screen_status, #screen_status), [no_ground_constants, name(ss)]). 20
modeb(=(+light_level, #light_level), [no_ground_constants, name(ll)]).
modeb(=(+battery_charging, #battery_charging), [no_ground_constants, name(bc)]). 200
modeb(weekday(+date)). 2(Positive, Negative)
modeb(weekend(+date)). 2
modeb(evening(+time)). 2
modeb(morning(+time)). 2
modeb(afternoon(+time)). 2
modeb(in_call(+date, +time)). 2
modeb(at(+date, +time, #cell)). 200
modeb(nearDevice(+date, +time, #device)). 2000
modeb(neighbourhood(+cell, #cell)). 200
modeb(user_been_in(+date, +time, +cell)). 2
modeb(user_is_active(+date, +time)). 2
modeb(phone_charging(+date, +time)). 2
modeb(phone_on(+date, +time)). 2
modeb(user_is_using_app(+date, +time, #app)). 20
modeb(time_before_h(+time, #hour), [no_ground_constants,name(before)]). 100
modeb(time_after_h(+time, #hour), [no_ground_constants,name(after)]). 100
Battery_level ~ 10²  Contacts ~ 10¹  Devices ~ 10³  Cells ~ 10²  Date × Time ~ 10³  Other options ~ 10
Cell tower Bluetooth devices Activity Coverage Abstractions + Domain Knowledge
Calls: answer_call(…) IF condition_{1,1}, …, condition_{max_c,1}
Do we have to explore the whole space? Only a significant portion of the search space is explored.
Monotonicity assumption: if an example is entailed by the current solution, there is no point in considering it in the rest of the computation.
Accuracy depends on the particular user (60%-90%). Why not higher?
Contextual data not rich enough
Data not very accurate
Simple knowledge abstraction (e.g. moving/stationary, alone/crowded place)
TAL loop: Find solution → Test coverage → Update training data
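The loop can be sketched as a greedy cover under the monotonicity assumption: pick a rule that covers remaining positive examples without covering a negative, then drop the examples it entails. All rules, contacts and contexts below are hypothetical.

```python
examples = [  # (call context, answered?) toy records
    ({"contact": "alice", "slot": "morning"}, True),
    ({"contact": "alice", "slot": "evening"}, True),
    ({"contact": "bob",   "slot": "morning"}, True),
    ({"contact": "bob",   "slot": "evening"}, False),
]
candidate_rules = [  # name, condition under which the call is predicted answered
    ("slot = morning",  lambda c: c["slot"] == "morning"),
    ("contact = alice", lambda c: c["contact"] == "alice"),
    ("contact = bob",   lambda c: c["contact"] == "bob"),
]
solution = []
remaining = [e for e in examples if e[1]]          # uncovered positive examples
while remaining:
    # consider only rules that cover no negative example ...
    safe = [r for r in candidate_rules
            if not any(r[1](c) for c, y in examples if not y)]
    # ... and greedily take the one covering most remaining positives
    name, rule = max(safe, key=lambda r: sum(r[1](c) for c, _ in remaining))
    solution.append(name)
    # monotonicity: entailed examples are dropped from later iterations
    remaining = [(c, y) for c, y in remaining if not rule(c)]
```

The rule "contact = bob" never enters the solution because it would cover the rejected evening call.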
Ubiquitous Computing brings new challenges to Software Engineering
Systems and software are mobile, adaptive, context aware, self-healing…..
How can we design and engineer adaptive systems?
Quantitative, as well as qualitative requirements:
✴ how reliable is a network? ✴ how efficient is a phone’s power management policy? ✴ how secure is my bank’s web-service?
Quantitative approaches for the assurance of safety, security, trust dependability, performance, ... Probabilistic logic-based learning can help to
Integration of probabilistic model checking with probabilistic logic-based learning.
Static elaboration and refinement of probabilistic models of system behaviours
Environment System
Domain model
action reaction
Domain model
model revision
Planning Plan execution
Domain model
learning
estimation NoMProL Plan Traces
Goal management layer Change & component layers
The domain model includes aspects of both the system and the environment; more generally, any design-time information used as a basis for adaptation: architecture, behaviour, strategies, NFPs, ... NoMProL helps improve the domain model by computing the maximum-likelihood hypothesis that explains most observations. Feedback from observations of its reactive plan executions is used to improve the domain model.
[Plan diagram: locations 1-6; actions pickup, move(3), move(5), putdown, with P(success) = r, p, q on the edges; object smashed on failure]
Probability of overall success is rpq. Success of the putdown action depends on the path that was taken.
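As a worked check of the rpq claim (the numeric values are assumed for illustration only):

```python
# overall plan success is the product of its independent action successes
r, p, q = 0.7, 0.9, 0.9   # assumed pickup / move / putdown success probabilities
overall = r * p * q
assert abs(overall - 0.567) < 1e-9
```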
possible(pickup, T) :- not holdsAt(holdingObject, T), holdsAt(at(loc1), T).
possible(putdown, T) :- holdsAt(holdingObject, T), holdsAt(at(loc5), T).
possible(move(L1, L2), T) :- holdsAt(at(L1), T), connected(L1, L2).
...
initiates(pickup, holdingObject, T).
terminates(putdown, holdingObject, T).
initiates(move(L1, L2), at(L2), T).
terminates(move(L1, L2), at(L1), T).
Domain model Execution traces Hypothesis
holdsAt(at(loc1), 0). do(pickup, 0).
holdsAt(at(loc1), 1). holdsAt(holdingObject, 1). do(move(loc1, loc3), 1).
holdsAt(at(loc3), 2). holdsAt(holdingObject, 2). do(move(loc3, loc5), 2).
holdsAt(at(loc5), 3). holdsAt(holdingObject, 3). do(putdown, 3).

succeeds(move(loc3, loc5), T) :- holdsAt(at(loc3), T), holdsAt(holdingObject, T).
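The execution trace can be replayed against the domain model with a minimal sketch (hypothetical Python; not the actual NoMProL machinery):

```python
# actions as tuples: ("pickup",), ("move", from, to), ("putdown",)
def initiates(a):
    if a[0] == "pickup": return {"holdingObject"}
    if a[0] == "move":   return {("at", a[2])}
    return set()

def terminates(a):
    if a[0] == "putdown": return {"holdingObject"}
    if a[0] == "move":    return {("at", a[1])}
    return set()

def replay(initial, actions):
    """holdsAt: the fluents true at each time point while replaying do(...) events."""
    state, history = set(initial), [set(initial)]
    for a in actions:
        state = (state - terminates(a)) | initiates(a)
        history.append(set(state))
    return history

plan = [("pickup",), ("move", "loc1", "loc3"),
        ("move", "loc3", "loc5"), ("putdown",)]
h = replay({("at", "loc1")}, plan)
assert ("at", "loc5") in h[3] and "holdingObject" in h[3]  # holdsAt(..., 3)
assert "holdingObject" not in h[4]                         # after putdown
```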
Many traces, many hypotheses. Find the maximum-likelihood hypothesis, i.e. the one that explains the observed traces with maximum probability.
Compute estimations of the rule probabilities using gradient descent, minimising the mean square error (MSE) with respect to the given traces.
r1: 0.7 : succeeds(pickup, T).
r2: 0.9 : succeeds(move(L1, L2), T) :- holdsAt(at(L1), T), connected(L1, L2), L2 != loc3.
r3: 0.9 : succeeds(putdown, T) :- not happened(move(loc2, loc3), T-2).
r4: 0.1 : succeeds(putdown, T) :- happened(move(loc2, loc3), T-2).
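A toy version of the estimation step (simulated Bernoulli outcomes; the rule set and probabilities are illustrative, not the real NoMProL estimator):

```python
import random
random.seed(0)

# assumed "true" success probabilities, used only to simulate trace outcomes
true_p = {"pickup": 0.7, "move": 0.9, "putdown": 0.9}
data = {r: [1 if random.random() < p else 0 for _ in range(500)]
        for r, p in true_p.items()}

probs = {r: 0.5 for r in true_p}        # initial estimates
lr = 0.1
for _ in range(200):                    # gradient descent on the MSE
    for r, outcomes in data.items():
        grad = sum(2 * (probs[r] - y) for y in outcomes) / len(outcomes)
        probs[r] -= lr * grad
# the MSE minimiser is the empirical success frequency of each rule
```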
Learned rules result in new states and transitions in the domain model – with probabilities
[Revised plan diagram: locations 1-6 with loc5 split into states 5a and 5b; actions pickup, move(4), move(5), putdown]
New factory floor plans
Robot navigation: global failure rate reduced from 30% to 10%
[Updated domain model: locations 1-6 with states 5a/5b; actions pickup, move(4), move(5), putdown; learned probabilities 0.7, 0.9 and 0.1 annotated on the transitions]
Updated Factory Floor Model
Our state-of-the-art logic-based learning systems are capable of learning declarative software models expressed in FLTL.
Integration of model checking and logic-based learning can provide automated support to various software engineering tasks.
An important feature is the choice of programming formalisms that facilitate equivalence properties of the translation function (e.g. equivalence of entailment).
Modelling languages should allow finite counterexample characterisation in order to guarantee convergence of the process.
Promising results in probabilistic rule-based learning.
Automated obstacle resolution using logic-based learning for theory revision, to learn new goal refinements, weakenings and mitigations.
Goals and obstacles with probabilities, using probabilistic goals and obstacles, probabilistic model checking (e.g. PRISM) and probabilistic logic-based learning.
Use of probabilistic logic-based learning for run-time adaptation.
[ILP06] Dalal Alrajeh, Oliver Ray, Alessandra Russo, Sebastián Uchitel, Extracting Requirements from Scenarios with ILP. ILP 2006: 64-78.
[FASE08] Dalal Alrajeh, Alessandra Russo, Sebastián Uchitel, Deriving Non-Zeno Behaviour Models from Goal Models using ILP. FASE 2008: 1-15.
[JAL09] Dalal Alrajeh, Oliver Ray, Alessandra Russo, Sebastián Uchitel, Using Abduction and Induction for Operational Requirements Elaboration. J. Applied Logic 7(3) (2009).
[ICSE09] Dalal Alrajeh, Jeff Kramer, Alessandra Russo, Sebastián Uchitel, Learning Operational Requirements from Goal Models. ICSE 2009: 265-275.
[FormAspComp10] Dalal Alrajeh, Jeff Kramer, Alessandra Russo, Sebastián Uchitel, Deriving Non-Zeno Behaviour Models from Goal Models using ILP. Formal Asp. Comput. 22(3-4): 217-241 (2010).
[ILP11] Dalal Alrajeh, Alessandra Russo, Sebastián Uchitel, Jeff Kramer, Integrating Model Checking and Inductive Logic Programming. ILP 2011: 45-60.
[ICLP11] Dalal Alrajeh, Jeff Kramer, Alessandra Russo, Sebastián Uchitel, An Inductive Approach for Modal Transition System Refinement. ICLP (Technical Communications) 2011: 106-116.
[ICSE12] Dalal Alrajeh, Jeff Kramer, Axel van Lamsweerde, Alessandra Russo, Sebastián Uchitel, Generating Obstacle Conditions for Requirements Completeness. ICSE 2012.
[FASE12] Dalal Alrajeh, Jeff Kramer, Alessandra Russo, Sebastián Uchitel, Learning from Vacuously Satisfiable Scenario-Based Specifications. FASE 2012: 377-393.
[ICSE13] Daniel Sykes, Domenico Corapi, Jeff Magee, Jeff Kramer, Alessandra Russo, Katsumi Inoue, Learning Revised Models for Planning in Adaptive Systems. ICSE 2013: 63-71.
[TPLP13] Dalal Alrajeh, Rob Miller, Alessandra Russo, Sebastián Uchitel, Reasoning about Triggered Scenarios in Logic Programming. TPLP 13(4-5), Online Supplement (2013).
[TSE13] Dalal Alrajeh, Jeff Kramer, Alessandra Russo, Sebastián Uchitel, Elaborating Requirements Using Model Checking and Inductive Learning. IEEE Trans. Software Eng. 39(3): 361-383 (2013).
[CACM15] Dalal Alrajeh, Jeff Kramer, Alessandra Russo, Sebastián Uchitel, Automated Support for Diagnosis and Repair. Commun. ACM (2015).