Submodel Pattern Extraction for Simulink Models James R. Cordy - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Submodel Pattern Extraction for Simulink Models

James R. Cordy Queen’s University NECSIS Automotive Partnership Canada

slide-2
SLIDE 2

MPE

near-miss clones

near-miss clone detection

Simone

analysis of GM models

pattern extraction

pattern evolution

slide-4
SLIDE 4

Model Pattern Engineering

discover, catalogue and formalize submodel patterns

emergent

domain-specific, client-specific

slide-5
SLIDE 5

discovery

analysis, identification methodology, techniques

classification

characterization, formalization notation, tooling, catalogues

application

deployment, analysis

organization, documentation, use cases

slide-6
SLIDE 6

why?

reuse in model development

standards/consistency analysis/enforcement

failure/change propagation in model maintenance

verification/test optimization

deployment variation/optimization

model product lines

slide-7
SLIDE 7

MPE

near-miss clones

near-miss clone detection

Simone

analysis of GM models

pattern extraction

pattern evolution

slide-8
SLIDE 8

code clones

copy-paste programming

efficient, widely used

problematic

slide-9
SLIDE 9

code clones

type 1 - exact

bool ConfNextToken (char **p)
{
    // skip white space
    while (1)
        switch (**p) {
            case '\t' :    // ignore
            case ' ' :
                (*p)++;
                break;
            case '\0' :
                return FALSE;
            default :
                return TRUE;
        };
}

bool ConfNextToken (char **p)
{
    // skip white space
    while (1)
        switch (**p) {
            case '\t' :    // ignore
            case ' ' :
                (*p)++;
                break;
            case '\0' :
                return FALSE;
            default :
                return TRUE;
        };
}

[Roy, Cordy, Koschke SCP 2009]

slide-10
SLIDE 10

code clones

type 1 - exact

bool ConfNextToken (char **p)
{
    // skip white space
    while (1)
        switch (**p) {
            case '\t' :    // ignore
            case ' ' :
                (*p)++;
                break;
            case '\0' :
                return FALSE;
            default :
                return TRUE;
        };
}

bool ConfNextToken (char **p)
{
    while (1)
        switch (**p) {
            case '\t':
            case ' ':     // just skip
                (*p)++;
                break;
            case '\0':    // eof
                return FALSE;
            default:      // something we want
                return TRUE;
        };
}

[Roy, Cordy, Koschke SCP 2009]

slide-11
SLIDE 11

code clones

type 2 - renamed

bool ConfNextToken (char **p)
{
    // skip white space
    while (1)
        switch (**p) {
            case '\t' :    // ignore
            case ' ' :
                (*p)++;
                break;
            case '\0' :
                return FALSE;
            default :
                return TRUE;
        };
}

bool NextToken (char **bp)
{
    while (1)    // not really
        switch (**bp) {
            case '\t':
            case ' ':    // next
                (*bp)++;
                break;
            case '\0':
                return 0;
            default:
                return 1;
        };
}

[Roy, Cordy, Koschke SCP 2009]

slide-12
SLIDE 12

bool ConfNextToken (char **p)
{
    // skip white space
    while (1)
        switch (**p) {
            case '\t' :    // ignore
            case ' ' :
                (*p)++;
                break;
            case '\0' :
                return FALSE;
            default :
                return TRUE;
        };
}

code clones

type 3 - near miss

bool NextToken (char **bp)
{
    while (1)    // not really
        switch (**bp) {
            case '\t':
                (*bp)++;
            case ' ':
                break;
            case '\0':
                return 0;
            default:
                return 1;
        }
}

[Roy, Cordy, Koschke SCP 2009]

slide-13
SLIDE 13

model clones

type 1 - exact

[Figure: two identical copies of a Simulink subsystem with input Fn (1), blocks "Torque Conversion" (2/3*R*muk) and "Ratio of static to kinetic" (mus/muk), and outputs Tfmaxk (1) and Tfmaxs (2)]

[Alalfi, Cordy, Dean, Stephan, Stevenson ICSM 2012] [Störrle SSM 2013]

slide-14
SLIDE 14

model clones

type 2 - renamed

[Alalfi, Cordy, Dean, Stephan, Stevenson ICSM 2012] [Störrle SSM 2013]

slide-15
SLIDE 15

model clones

type 3 - near miss

[Alalfi, Cordy, Dean, Stephan, Stevenson ICSM 2012] [Störrle SSM 2013]

slide-16
SLIDE 16

MPE

near-miss clones

near-miss clone detection

Simone

analysis of GM models

pattern extraction

pattern evolution

slide-17
SLIDE 17

ConQAT

graph-based model clone detection

[Deissenboeck et al. IWSC 2010]

slide-18
SLIDE 18

graph flattening ignores hierarchical structure

problems with near-miss

slide-20
SLIDE 20

code-based near-miss works well

NiCad, iClones, others

mature, accurate, efficient

handles unexpected differences

threshold-based, tunable

scalable

slide-21
SLIDE 21

NiCad

parse - extract - normalize - diff threshold

  • 1. Parse / Extract: parsing and potential clone extraction turn the original code base into pretty-printed potential clones
  • 2. Rename / Filter / Normalize: renaming, filtering and normalization turn these into normalized potential clones
  • 3. Clone Analysis: cluster potential clones (PCs) of comparable size, choose the next potential clone as exemplar, compare it pairwise with the rest of its cluster (repeat), and collect the results as clone classes (5.pc 23.pc 67.pc ..., 12.pc 17.pc 22.pc ..., 15.pc 18.pc 78.pc ..., 21.pc 63.pc 97.pc ..., 37.pc 39.pc 44.pc ...)

[Roy, Cordy ICPC 2008]
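
The "diff threshold" step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not NiCad's implementation: each potential clone is assumed to be already pretty-printed one element per line, Python's `difflib` stands in for the line comparison, and the names `difference`, `is_near_miss` and `threshold` are hypothetical.

```python
from difflib import SequenceMatcher

def difference(a_lines, b_lines):
    """Fraction of lines not shared by two pretty-printed fragments (0.0 = identical)."""
    matcher = SequenceMatcher(None, a_lines, b_lines, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    longest = max(len(a_lines), len(b_lines))
    return 0.0 if longest == 0 else 1.0 - matched / longest

def is_near_miss(a_lines, b_lines, threshold=0.3):
    """True when the fragments differ in no more than `threshold` of their lines."""
    return difference(a_lines, b_lines) <= threshold

# Two normalized fragments that differ in a single line.
a = ["while (1)", "switch (**p) {", "case ' ':", "(*p)++;",
     "break;", "default:", "return TRUE;", "}"]
b = ["while (1)", "switch (**p) {", "case ' ':", "(*p)++;",
     "break;", "default:", "return 1;", "}"]

print(difference(a, b), is_near_miss(a, b))
```

A threshold of 0.0 accepts only exact matches after normalization; raising it admits progressively looser near-miss clones.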

slide-22
SLIDE 22

crazy idea:

can we use near-miss text code methods on graphical models?

“Models are source code too”

Mark Harman, keynote at SCAM 2010

[Harman SCAM 2010]

slide-23
SLIDE 23

MPE

near-miss clones

near-miss clone detection

Simone

analysis of GM models

pattern extraction

pattern evolution

slide-24
SLIDE 24

Simone

Simulink near-miss clone detection

experiment: adapt the NiCad near-miss code clone detector to graphical models

validate vs. ConQAT for types 1 & 2

hand validate type 3 (near-miss)

[Alalfi, Cordy, Dean, Stephan, Stevenson ICSM 2012]

slide-25
SLIDE 25

Simulink

hybrid hardware/software models

widespread in industry

  • automotive, aerospace, embedded systems

mature and interesting at GM

slide-26
SLIDE 26

Simulink

hierarchical models

[Alalfi, Cordy, Dean, Stephan, Stevenson ICSM 2012]

slide-27
SLIDE 27

Challenge #1

code methods require text

NiCad requires a parser

Solution: grammar inference on Simulink’s internal form

...
System {
    Name "onoff"
    Location [168, 385, 668, 686]
    Open on
    ModelBrowserVisibility off
    ModelBrowserWidth 200
    ScreenColor "automatic"
    PaperOrientation "landscape"
    PaperPositionMode "auto"
    PaperType "usletter"
    PaperUnits "inches"
    ZoomFactor "100"
    AutoZoom on
    ReportName "simulink-default.rpt"
    Block {
        BlockType DiscretePulseGenerator
        Name "Discrete Pulse\nGenerator"
        Position [45, 25, 75, 55]
        Amplitude "1"
        Period "2"
        PulseWidth "1"
        PhaseDelay "0"
        SampleTime "1"
    }
    Block {
        BlockType Product
        Name "Product"
        Ports [2, 1, 0, 0, 0]
        Position [145, 67, 175, 98]
        Inputs "2"
        SaturateOnIntegerOverflow on
    }
    ...
}
...

slide-28
SLIDE 28

Challenge #2

what granularity?

NiCad requires candidates for comparison

Simulink: model (too big), block (too small), system (just right!)

...
System {
    Name "onoff"
    Location [168, 385, 668, 686]
    Open on
    ModelBrowserVisibility off
    ModelBrowserWidth 200
    ScreenColor "automatic"
    PaperOrientation "landscape"
    PaperPositionMode "auto"
    PaperType "usletter"
    PaperUnits "inches"
    ZoomFactor "100"
    AutoZoom on
    ReportName "simulink-default.rpt"
    Block {
        BlockType DiscretePulseGenerator
        Name "Discrete Pulse\nGenerator"
        Position [45, 25, 75, 55]
        Amplitude "1"
        Period "2"
        PulseWidth "1"
        PhaseDelay "0"
        SampleTime "1"
    }
    Block {
        BlockType Product
        Name "Product"
        Ports [2, 1, 0, 0, 0]
        Position [145, 67, 175, 98]
        Inputs "2"
        SaturateOnIntegerOverflow on
    }
    ...
}
...
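
Choosing the system as the unit of comparison means pulling every `System { ... }` block out of the model text. A rough Python sketch of that extraction follows; it is an assumption-laden illustration (the tool itself works from an inferred grammar, not from this tokenizer), and `TOKEN` and `extract_systems` are hypothetical names.

```python
import re

# Quoted strings first, so that braces inside names are not counted as structure.
TOKEN = re.compile(r'"(?:[^"\\]|\\.)*"|[{}]|[^\s{}"]+')

def extract_systems(mdl_text):
    """Return the text of every `System { ... }` block, nested ones included."""
    systems = []
    stack = []  # per open brace: start offset if it opens a System, else None
    prev, prev_start = None, 0
    for m in TOKEN.finditer(mdl_text):
        tok = m.group()
        if tok == "{":
            stack.append(prev_start if prev == "System" else None)
        elif tok == "}" and stack:
            start = stack.pop()
            if start is not None:
                systems.append(mdl_text[start:m.end()])
        prev, prev_start = tok, m.start()
    return systems

mdl = ('Model { Name "m" System { Name "top" Block { BlockType SubSystem '
       'Name "inner" System { Name "inner" Block { BlockType Gain Name "{g}" } } } } }')

for system in extract_systems(mdl):
    print(system[:30])
```

Inner systems are reported before the systems that contain them, because a block is emitted when its closing brace is reached.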

slide-29
SLIDE 29

even with raw text, find some subsystem clones

but:

90% irrelevant Simulink internal “formatting” systems

some identical systems only 70% same

entirely missed exact copies displayed differently

[Figure: two copies of the same subsystem drawn with different layouts. Blocks: mutually_exclusive (inputs neutral, up, down; outputs validated_neutral, validated_up, validated_down), check_up and check_down (inputs action, reset; output checked_action), Goto1 [reset], From1 [reset], From2 [reset]; inports neutral 1, up 2, down 3, reset 4; outport neutral_up_down 1]

slide-30
SLIDE 30

Challenge #3

problems with “noise”

solution: “agile parsing” to filter out irrelevant elements

...
System {
    Name "onoff"
    Location [168, 385, 668, 686]
    Open on
    ModelBrowserVisibility off
    ModelBrowserWidth 200
    ScreenColor "automatic"
    PaperOrientation "landscape"
    PaperPositionMode "auto"
    PaperType "usletter"
    PaperUnits "inches"
    ZoomFactor "100"
    AutoZoom on
    ReportName "simulink-default.rpt"
    Block {
        BlockType DiscretePulseGenerator
        Name "Discrete Pulse\nGenerator"
        Position [45, 25, 75, 55]
        Amplitude "1"
        Period "2"
        PulseWidth "1"
        PhaseDelay "0"
        SampleTime "1"
    }
    Block {
        BlockType Product
        Name "Product"
        Ports [2, 1, 0, 0, 0]
        Position [145, 67, 175, 98]
        Inputs "2"
        SaturateOnIntegerOverflow on
    }
    ...
}
...

[Dean, Cordy, Malton, Schneider JASE 2003]

slide-31
SLIDE 31

filtering

removes more than 300 kinds of irrelevant elements and blocks

increases signal-to-noise ratio in text

...
System {
    Name "onoff"
    Block {
        BlockType DiscretePulseGenerator
        Name "Discrete Pulse\nGenerator"
        Amplitude "1"
        Period "2"
        PulseWidth "1"
        PhaseDelay "0"
        SampleTime "1"
    }
    Block {
        BlockType Product
        Name "Product"
        Ports [2, 1, 0, 0, 0]
        Inputs "2"
    }
    ...
}
...
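
The effect of filtering can be imitated with a simple line filter in Python. This is a sketch only: the slides describe grammar-based "agile parsing" covering more than 300 kinds of elements, whereas this stand-in drops attribute lines by name. The `IRRELEVANT` set is an assumption, limited to attributes that are present in the unfiltered listing and absent from the filtered one.

```python
# Assumed noise attributes: those missing from the filtered listing above.
IRRELEVANT = {
    "Location", "Open", "ModelBrowserVisibility", "ModelBrowserWidth",
    "ScreenColor", "PaperOrientation", "PaperPositionMode", "PaperType",
    "PaperUnits", "ZoomFactor", "AutoZoom", "ReportName", "Position",
    "SaturateOnIntegerOverflow",
}

def filter_lines(lines):
    """Drop one-attribute-per-line entries whose attribute name is irrelevant."""
    kept = []
    for line in lines:
        words = line.split(None, 1)
        if words and words[0] in IRRELEVANT:
            continue  # layout or display detail, not model content
        kept.append(line)
    return kept

raw = [
    "System {",
    'Name "onoff"',
    "Location [168, 385, 668, 686]",
    "Open on",
    "Block {",
    "BlockType Product",
    "Position [145, 67, 175, 98]",
    'Inputs "2"',
    "}",
    "}",
]

print("\n".join(filter_lines(raw)))
```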

slide-32
SLIDE 32

filtering significantly improved performance

precision - 10x fewer false positives (hand validation of results)

recall - many fewer false negatives: fewer missed clones, much larger clones

but:

some clones we could clearly see by hand still not detected - why?

slide-33
SLIDE 33

Challenge #4

no linear order of model elements

slide-34
SLIDE 34

Challenge #4

solution: topological sort by block, line, port, branch
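
Why sorting helps can be shown with a small Python sketch. It is a simplification under stated assumptions: blocks are represented as hypothetical attribute dictionaries, and a plain canonical sort by block type and attributes is substituted for the topological sort by block, line, port and branch named on the slide.

```python
def canonical_order(blocks):
    """Sort block dicts by type, then by their remaining attributes,
    so the order in which the model file stores them no longer matters."""
    def key(block):
        attrs = sorted((k, str(v)) for k, v in block.items() if k != "BlockType")
        return (block.get("BlockType", ""), attrs)
    return sorted(blocks, key=key)

# The same two blocks, stored in different orders in two systems.
first = [
    {"BlockType": "Product", "Inputs": "2"},
    {"BlockType": "DiscretePulseGenerator", "Period": "2"},
]
second = list(reversed(first))

print(canonical_order(first) == canonical_order(second))
```

Once both systems are written out in canonical order, a line-based comparison sees them as identical text.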

slide-35
SLIDE 35

sorting

increases recall, to find many more clones

[Figure: two copies of the neutral_up_down subsystem (mutually_exclusive, check_up, check_down, Goto1, From1, From2) detected as clones after sorting]
slide-36
SLIDE 36

sorting

increases recall, to find many more clones

[Figure: many copies of the neutral_up_down subsystem (mutually_exclusive, check_up, check_down, Goto1, From1, From2) detected as clones after sorting]
slide-37
SLIDE 37

...
System {
    Name "onoff"
    Block {
        BlockType DiscretePulseGenerator
        Name "Discrete Pulse\nGenerator"
        Amplitude "1"
        Period "2"
        PulseWidth "1"
        PhaseDelay "0"
        SampleTime "1"
    }
    Block {
        BlockType Product
        Name "Product"
        Ports [2, 1, 0, 0, 0]
        Inputs "2"
    }
    ...
}
...

Challenge #5

finding type 2 (renamed) requires anonymization

names in Simulink not like other languages

solution: context-dependent anonymizer
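
The slide does not spell out what "context-dependent" covers, so the following Python sketch shows only the general idea of anonymization: quoted `Name` values are replaced by placeholders assigned in order of first appearance, so two systems that differ only in naming produce the same text. The `NAME_ATTR` pattern and the `x1`, `x2` placeholders are assumptions for illustration.

```python
import re

# Matches a whole line of the form: Name "some value"
NAME_ATTR = re.compile(r'^(\s*Name\s+)"((?:[^"\\]|\\.)*)"\s*$')

def anonymize(lines):
    """Replace each distinct Name value with x1, x2, ... in order of first use."""
    mapping = {}
    out = []
    for line in lines:
        m = NAME_ATTR.match(line)
        if m:
            placeholder = mapping.setdefault(m.group(2), f"x{len(mapping) + 1}")
            out.append(f'{m.group(1)}"{placeholder}"')
        else:
            out.append(line)  # non-name attributes are left untouched
    return out

one = ['Name "check_up"', "BlockType Product", 'Name "Product"']
two = ['Name "check_down"', "BlockType Product", 'Name "Prod2"']

print(anonymize(one) == anonymize(two))
```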

slide-38
SLIDE 38

validation - Simone vs. ConQAT on Matlab Central public model systems

finds all type 1 (exact) and type 2 (renamed) clones found by ConQAT

finds many new type 3 (near-miss) clones not found by ConQAT

finds larger clones and larger clone classes

[Alalfi, Cordy, Dean, Stephan, Stevenson ICSM 2012]

slide-39
SLIDE 39

Simone vs. ConQAT

slide-40
SLIDE 40

Total nontrivial subsystems: 357

                Extractor only                  Filtered                        Filtered & Sorted               Filtered, Sorted & Renamed
Clone Type      Type 1          Type 3-1 @30%   Type 1          Type 3-1 @30%   Type 1          Type 3-1 @30%   Type 2          Type 3-2 @30%
Clone Pairs     116 / 10*       364 / 164*      204             204             303             181             279             1938
Clone Classes   8 / 4*          57 / 56*        44              55              45              52              48              24
Clone Coverage  8% / 3%         52% / 46%       37%             48%             42%             45%             49%             75%

Simone near-miss clones in Simulink public automotive model variants

[Alalfi, Cordy, Dean, Stephan, Stevenson ICSM 2012]

slide-41
SLIDE 41

[Table: values not recoverable from the extracted text. Columns: Systems Extracted, then Clones and Classes under each of Filtered & Sorted, Consistent Renaming and Blind Renaming. Rows: Matlab Central, Simulink Demo, Conqat Examples, and four further systems whose labels are garbled]

Simone near-miss pattern mining in Simulink models

slide-42
SLIDE 42

MPE

near-miss clones

near-miss clone detection

Simone

analysis of GM models

pattern extraction

pattern evolution

slide-43
SLIDE 43

Case study

GM fuel system models

SimGraph visualization

understanding Simulink model subsystem similarity

slide-44
SLIDE 44

GM Fuel System Models

Subsystem similarity overview

Large subsystems, midsize subsystems, small subsystems

Z models (red)

Y models (green)

X models (blue)

slide-45
SLIDE 45

GM Fuel System Models

Subsystem similarity overview

Many subsystems unique - not similar to any others in these models

slide-46
SLIDE 46

GM Fuel System Models

Remove unique subsystems

Connecting lines represent subsystem similarity: thick lines, 90-100% similar; thin lines, 70-80% similar

Similar subsystems (“near-miss clones”) both within and between models

slide-47
SLIDE 47

GM Fuel System Models

Rearrange to cluster similar subsystems

Clusters reveal groups of similar subsystems - “clone classes”
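
Grouping similar subsystems into clone classes can be sketched in Python as connected components over the similarity links. This is an illustration under stated assumptions, not the tool's clustering (the NiCad slide describes exemplar-based comparison); the `clone_classes` name, the pair format and the 0.7 cut-off are hypothetical.

```python
def clone_classes(pairs, threshold=0.7):
    """Group subsystems linked by similarity >= threshold (connected components).
    Subsystems with no such link are unique and are left out."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b, similarity in pairs:
        if similarity >= threshold:
            parent[find(a)] = find(b)

    classes = {}
    for node in list(parent):
        classes.setdefault(find(node), set()).add(node)
    return sorted((sorted(members) for members in classes.values()),
                  key=lambda members: members[0])

# Hypothetical subsystem ids as model/subsystem, with pairwise similarity.
pairs = [
    ("X/a", "Y/a", 0.95),
    ("Y/a", "Z/a", 0.75),
    ("X/b", "Z/b", 0.80),
    ("X/c", "Y/c", 0.40),  # below the cut-off: both stay unique
]

print(clone_classes(pairs))
```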

slide-48
SLIDE 48

GM Fuel System Models

Infer common subsystem patterns

Patterns characterize common repeated similar subsystem paradigms

Small groups of relatively large similar subsystems both within and across models

Large groups of small to mid-sized similar subsystems across models

slide-49
SLIDE 49

MPE

near-miss clones

near-miss clone detection

Simone

analysis of GM models

pattern extraction

pattern evolution

slide-50
SLIDE 50

using patterns

SimNav

exemplar sets as patterns

SimPat

modeling variance

slide-51
SLIDE 51

SimNav

presenting and integrating results in Simulink

exemplar sets as patterns

slide-52
SLIDE 52

SimNav

Simulink

NiCad

SIMONE

MATLAB

Text-based clone detector (source code)

Simulink clone detector (models)

SimNav

slide-53
SLIDE 53
slide-54
SLIDE 54

SimPat

characterizing and representing subsystem patterns

modeling variance

slide-55
SLIDE 55

SimPat

slide-56
SLIDE 56

Model Patterns

Model Clone Class: 3 subsystem clones

Model Pattern 1:

Merge Throttle and Pressure Estimation

slide-57
SLIDE 57

Model Pattern 2:

Merge all three subsystems

Model Patterns

Model Clone Class: 3 subsystem clones

slide-58
SLIDE 58

Model Pattern 3:

Common elements only

Model Patterns

Model Clone Class: 3 subsystem clones

slide-59
SLIDE 59

MPE

near-miss clones

near-miss clone detection

Simone

analysis of GM models

pattern extraction

pattern evolution

slide-60
SLIDE 60

model pattern evolution

SimCCT

evolution of patterns across versions

pattern variance in two dimensions: instance, time

[Stephan, Alalfi, Cordy, Stevenson ME 2013]

slide-61
SLIDE 61

model pattern: MCC - model clone class

elements of a model pattern: MCI - model clone instance

evolution of patterns: migration of MCIs between MCCs across versions of the system

slide-62
SLIDE 62

evolution of patterns

1-1: pattern is stable across versions

1-1*: pattern exists, but loses or gains MCIs

1-many: pattern splits into multiple patterns

1-many*: pattern splits, losing or gaining MCIs

1-0: pattern unifies or disappears
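
The five categories can be expressed as a small Python classifier. This is one possible reading, not SimCCT's algorithm: it assumes MCIs keep stable identifiers across versions, treats any next-version MCC that shares an MCI with the old one as a successor, and reports 1-0 only when no successor exists, so unification into another pattern is not modeled separately. The name `classify_evolution` is hypothetical.

```python
def classify_evolution(old_mcc, new_mccs):
    """Classify how one MCC (a set of MCI ids) maps onto the next version's MCCs."""
    old = set(old_mcc)
    successors = [set(m) for m in new_mccs if old & set(m)]
    if not successors:
        return "1-0"                      # no MCI survives in any pattern
    covered = set().union(*successors)
    changed = covered != old              # some MCI was lost or gained
    if len(successors) == 1:
        return "1-1*" if changed else "1-1"
    return "1-many*" if changed else "1-many"

print(classify_evolution({1, 2, 3}, [{1, 2, 3}]))        # stable
print(classify_evolution({1, 2, 3, 4}, [{1, 2}, {3, 5}]))  # split with change
```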

slide-63
SLIDE 63

SimCCT

slide-64
SLIDE 64

SimCCT - Power Window MCC 3

[Figure: SimCCT evolution diagram for this model clone class across versions; legend entry "Does not belong to any MCC"; remaining labels not recoverable]
slide-65
SLIDE 65

[Figure: SimCCT evolution diagram for this model clone class across versions; legend entry "Does not belong to any MCC"; remaining labels not recoverable]

SimCCT - Power Window MCC 2

slide-66
SLIDE 66

[Figure: SimCCT evolution diagram for this model clone class across versions; labels not recoverable]

SimCCT - AVS MCC 7

slide-67
SLIDE 67

MPE

near-miss clones

near-miss clone detection

Simone

analysis of GM models

pattern extraction

pattern evolution

slide-68
SLIDE 68

current work

Stateflow models

deployment at GM

analysis of more systems

slide-69
SLIDE 69

Thank you!

Manar H. Alalfi Thomas R. Dean Matthew Stephan Andrew Stevenson Joseph d’Ambrosio Cheryl Williams

slide-70
SLIDE 70

James R. Cordy Queen’s University NECSIS Automotive Partnership Canada

Submodel Pattern Extraction for Simulink Models