From Compilers to Grammarware, Dr. Vadim Zaytsev



slide-1
SLIDE 1

From Compilers to Grammarware

  • Dr. Vadim Zaytsev

slide-2
SLIDE 2

Introduction Compilers Grammarware Transformation Consistency Maturity Understanding Testing Conclusion

slide-3
SLIDE 3

Introduction

  • Vadim Zaytsev
  • MSc in appl.math (2003) & telematics (2004)
  • PhD in softw.lang.eng. (2010)
  • Postdoc at CWI (2010–2013)
  • Lecturer at UvA (2013–…)
slide-4
SLIDE 4

What is a compiler?

slide-5
SLIDE 5
slide-6
SLIDE 6
slide-7
SLIDE 7
slide-8
SLIDE 8
slide-9
SLIDE 9
slide-10
SLIDE 10
slide-11
SLIDE 11
slide-12
SLIDE 12

Language processing

  • Internal structures
  • databases, configurations, tables, …
  • External structures
  • protocols, interfaces, bytecode, …
  • Software language
  • programming, modelling, markup, …
slide-13
SLIDE 13

Compiler

[Diagram: Front End → Middle End → Back End]
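The three stages can be sketched on a toy scale. This is only an illustration under stated assumptions: the front end borrows Python's own `ast` parser, the middle end does constant folding for `+` and `*` only, and the back end targets an imaginary stack machine whose instruction names (`PUSH`, `LOAD`, `ADD`, `MUL`) are made up for this example.

```python
import ast
import operator

# Hypothetical instruction names for an imaginary stack machine.
OPS = {ast.Add: ("ADD", operator.add), ast.Mult: ("MUL", operator.mul)}

def front_end(source):
    """Front end: text to tree (here simply Python's own expression parser)."""
    return ast.parse(source, mode="eval").body

def middle_end(node):
    """Middle end: constant folding, independent of source and target language."""
    if isinstance(node, ast.BinOp) and type(node.op) in OPS:
        left, right = middle_end(node.left), middle_end(node.right)
        if isinstance(left, ast.Constant) and isinstance(right, ast.Constant):
            return ast.Constant(OPS[type(node.op)][1](left.value, right.value))
        return ast.BinOp(left, node.op, right)
    return node

def back_end(node):
    """Back end: tree to a list of stack-machine instructions."""
    if isinstance(node, ast.Constant):
        return [f"PUSH {node.value}"]
    if isinstance(node, ast.Name):
        return [f"LOAD {node.id}"]
    if isinstance(node, ast.BinOp) and type(node.op) in OPS:
        return back_end(node.left) + back_end(node.right) + [OPS[type(node.op)][0]]
    raise ValueError(f"unsupported construct: {ast.dump(node)}")

def compile_expr(source):
    return back_end(middle_end(front_end(source)))

print(compile_expr("x + 2 * 3"))
```

Swapping `front_end` or `back_end` for another implementation while keeping `middle_end` is the idea behind the multi-language and multi-target variants on the next slides.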

slide-14
SLIDE 14

Multi-language compiler

[Diagram: three Front Ends → Middle End → Back End]

slide-15
SLIDE 15

Multi-target compiler

[Diagram: three Front Ends → Middle End → three Back Ends]

slide-16
SLIDE 16

Grammarware

[Diagram: three Front Ends → Middle End → three Back Ends]

slide-17
SLIDE 17

Compilers transform between languages

Grammarware commits to grammatical structure

slide-18
SLIDE 18

Kinds of grammarware

  • Parser
  • Compiler
  • Interpreter
  • Prettyprinter
  • Scanner
  • Browser
  • Static checker
  • Struct.editor
  • IDE
  • DSL
  • Preprocessor
  • Postprocessor
  • Validator
  • Model checker
  • Refactorer
  • Code slicer
  • API
  • XMLware
  • Modelware
  • Lang.
  • RE
  • Benchmark
  • Recommender
  • Renovation tool

Klint, Lämmel, Verhoef, Toward an Engineering Discipline for Grammarware
slide-19
SLIDE 19

Languages vs. grammars

[Figure: Declarative Multi-Purpose Language Definition: Syntax Definition, Name Binding, Type Constraints, Dynamic Semantics, Transform]

Visser,

slide-20
SLIDE 20

Introduction Compilers Grammarware Transformation Consistency Maturity Understanding Testing Conclusion

slide-21
SLIDE 21

What is good grammarware?

slide-22
SLIDE 22

Case study: JLS

?

Lämmel, Zaytsev, Recovering Grammar Relationships for the

slide-23
SLIDE 23

What is good grammarware?

What is good software?

slide-24
SLIDE 24

What is good software?

  • functional
  • reliable
  • usable
  • efficient
  • maintainable
  • portable

ISO/IEC 9126.

slide-25
SLIDE 25

What is good grammarware?

  • functional: commits to the language
  • reliable: tolerant to errors
  • usable: the language is learnable
  • efficient: fast (live?) and responsive
  • maintainable: can be tested and evolved
  • portable
slide-26
SLIDE 26

Certified Language Processor

slide-27
SLIDE 27

Certified Language Engineer

slide-28
SLIDE 28

Capability Maturity Model

  • Level 1: Chaotic
  • Level 2: Repeatable
  • Level 3: Defined
  • Level 4: Managed
  • Level 5: Optimising

Paulk, Weber, Curtis, Chrissis, Capability Maturity Model for Software

slide-29
SLIDE 29

Grammar Zoo

  • 974 fetched grammars
  • 588 extracted
  • 79 connected
  • 9 adapted


 +metadata

http://slebok.github.io/zoo

Zaytsev, Grammar Maturity Model
Zaytsev, Grammar Zoo: A Corpus of Experimental Grammarware

slide-30
SLIDE 30

Improving quality

  • Manual inline editing
  • Refactorings
  • Programmed transformations
  • +Differs
  • Grammar mutations
  • Inference of transformation/mutation steps
slide-31
SLIDE 31

How to transform

expr : …; atom : ID | INT | '(' expr ')';
  ↓ abstractize
expr : …; atom : ID | INT | expr;
  ↓ vertical
expr : …; atom : ID; atom : INT; atom : expr;
  ↓ unite
expr : …; expr : ID; expr : INT; expr : expr;
  ↓ abridge
expr : …; expr : ID; expr : INT;

Lämmel, Zaytsev, An Introduction to Grammar Convergence, IFM’
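The four steps named on this slide can be replayed on a small data structure. The sketch below is a simplified reading of the operator names, not their official definitions: a grammar is assumed to be a list of `(nonterminal, alternatives)` pairs, quoted symbols are terminals, and the elided body of `expr` is kept as the placeholder symbol `…`.

```python
# Assumed representation: list of (nonterminal, [alternative, ...]),
# each alternative a tuple of symbols; quoted symbols are terminals.

def abstractize(grammar):
    """Drop terminal symbols from every alternative."""
    return [(n, [tuple(s for s in alt if not s.startswith("'")) for alt in alts])
            for n, alts in grammar]

def vertical(grammar, target):
    """Split the alternatives of `target` into separate productions."""
    out = []
    for n, alts in grammar:
        if n == target:
            out.extend((n, [alt]) for alt in alts)
        else:
            out.append((n, alts))
    return out

def unite(grammar, old, new):
    """Merge nonterminal `old` into `new` on both sides of productions."""
    swap = lambda s: new if s == old else s
    return [(swap(n), [tuple(swap(s) for s in alt) for alt in alts])
            for n, alts in grammar]

def abridge(grammar):
    """Remove reflexive chain productions such as expr : expr."""
    return [(n, alts) for n, alts in grammar if alts != [(n,)]]

g0 = [("expr", [("…",)]),
      ("atom", [("ID",), ("INT",), ("'('", "expr", "')'")])]
g4 = abridge(unite(vertical(abstractize(g0), "atom"), "atom", "expr"))
for nonterminal, alts in g4:
    print(nonterminal, ":", " | ".join(" ".join(alt) for alt in alts))
```

Each function returns a new grammar, so intermediate results can be compared against the target grammar after every step.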

slide-32
SLIDE 32

How to mutate

  • Grammar has no starting symbol?
  • Reroot2top
  • Need abstract syntax from concrete syntax?
  • RetireTs
  • Grammar productions written in an
  • DeyaccifyAll
  • Change naming convention?
  • RenameAllNLower2Camel

Zaytsev, Software Language Engineering by Intentional Rewriting, SQM’14
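Unlike a transformation, a mutation takes no arguments and applies itself wherever it can. A rough sketch of two of the mutations named above follows. Both function bodies are guesses at the intent behind the names: "top" is assumed to mean "defined but never used by another nonterminal", and "Lower2Camel" is assumed to mean `lower_case` to `CamelCase`. The toy grammar is hypothetical.

```python
# Assumed representation: list of (nonterminal, [alternative, ...]).

def reroot2top(productions):
    """Infer roots: nonterminals defined but never used by another nonterminal."""
    defined = [n for n, _ in productions]
    used = {s for n, alts in productions for alt in alts for s in alt if s != n}
    return sorted({n for n in defined if n not in used})

def rename_all_lower2camel(productions):
    """Rename every defined nonterminal from lower_case to CamelCase."""
    def camel(name):
        return "".join(part.capitalize() for part in name.split("_"))
    defined = {n for n, _ in productions}
    fix = lambda s: camel(s) if s in defined else s
    return [(fix(n), [tuple(fix(s) for s in alt) for alt in alts])
            for n, alts in productions]

toy = [("program", [("stmt_list",)]),
       ("stmt_list", [("stmt",), ("stmt", "stmt_list")]),
       ("stmt", [("ID", "'='", "ID")])]
print(reroot2top(toy))
print(rename_all_lower2camel(toy))
```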
slide-33
SLIDE 33

How to be guided

  • Equality & algebraic equivalence
  • Prodsig-equivalence
  • signatures based on nonterminal patterns
  • tolerant to permutations
  • weak equivalence tolerant to iteration kinds
  • Abstract Normal Form
  • no terminals, labels, markers
  • consistent disjunctive style

Zaytsev, Guided Grammar Convergence,

slide-34
SLIDE 34

How to be guided

p_master = p(ε, expr, expr · operator · expr)
p_antlr = p(ε, binary, s(l, atom) · ∗(s(o, ops) · s(r, atom)))
p_dcg = p(binary, expr, atom · ∗(ops · atom))
p_emf = p(ε, Binary, s(ops, Ops) · s(left, Expr) · s(right, Expr))
p_jaxb = p(ε, Binary, s(Ops, Ops) · s(Left, Expr) · s(Right, Expr))
p_om = p(ε, Binary, s(ops, Ops) · s(left, Expr) · s(right, Expr))
p_python = p(ε, binary, atom · ∗(operators · atom))
p_adt = p(ε, FLExpr, s(binary, s(e1, FLExpr) · s(op, FLOp) · s(e2, FLExpr)))
p_rascal = p(binary, Expr, s(lexpr, Expr) · s(op, Ops) · s(rexpr, Expr))
p_sdf = p(binary, Expr, Expr · Ops · Expr)
p_txl = p(ε, expression, expression · op · expression)
p_xsd = p(ε, Binary, s(ops, Ops) · s(left, Expr) · s(right, Expr))

Zaytsev, Guided Grammar Convergence,
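One way to read "signatures based on nonterminal patterns" is sketched below. It is an assumption-laden simplification, not the definition from the cited paper: a right-hand side is flattened by hand into `(nonterminal, occurrence kind)` pairs, with kind `"1"` for a plain occurrence and `"*"` for an occurrence under iteration, and two productions count as equivalent when their per-nonterminal footprints match regardless of names and order. The weak variant that tolerates iteration kinds is not covered.

```python
from collections import defaultdict

def prodsig(rhs):
    """Map each right-hand-side nonterminal to its sorted occurrence kinds."""
    footprint = defaultdict(list)
    for nonterminal, kind in rhs:
        footprint[nonterminal].append(kind)
    return {n: "".join(sorted(kinds)) for n, kinds in footprint.items()}

def prodsig_equivalent(rhs_a, rhs_b):
    """Same footprints up to nonterminal names and order of symbols."""
    return sorted(prodsig(rhs_a).values()) == sorted(prodsig(rhs_b).values())

# Hand-flattened right-hand sides of three of the productions listed above.
master = [("expr", "1"), ("operator", "1"), ("expr", "1")]
emf = [("Ops", "1"), ("Expr", "1"), ("Expr", "1")]
python = [("atom", "1"), ("operators", "*"), ("atom", "*")]

print(prodsig(master))
print(prodsig_equivalent(master, emf), prodsig_equivalent(master, python))
```

The `emf` production matches `master` despite different names and symbol order, while `python` does not, because its iteration changes the footprints.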

slide-35
SLIDE 35

What we want in general

  • Maintenance assistants
  • infer whatever possible
  • provide advice on the rest
  • Not necessarily “request => result or fail”
  • pending
  • negotiated

Zaytsev, Pending Evolution of Grammars
Zaytsev, Negotiated Grammar Evolution

slide-36
SLIDE 36

Negotiating the result

rename(expr,Expr)

no expr! rename(exp,Exp)

ok

Zaytsev, Negotiated Grammar Evolution
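The exchange on this slide can be imitated with a transformation that returns a counter-proposal instead of just failing. The sketch below is an assumption about how such a negotiation might look, not the mechanism from the cited paper: the grammar is a plain dict, the nearest existing name is found with `difflib.get_close_matches` from the standard library, and the result tags (`"ok"`, `"proposal"`, `"fail"`) are invented for this example.

```python
import difflib

def rename(grammar, old, new):
    """Rename nonterminal `old` to `new`, or propose a repaired request."""
    if old in grammar:
        swap = lambda s: new if s == old else s
        renamed = {swap(n): [[swap(s) for s in alt] for alt in alts]
                   for n, alts in grammar.items()}
        return "ok", renamed
    close = difflib.get_close_matches(old, list(grammar), n=1)
    if close:
        # Heuristic (assumption): if the request was a capitalisation,
        # capitalise the nearest name as well; otherwise keep `new`.
        proposed = close[0].capitalize() if new == old.capitalize() else new
        return "proposal", ("rename", close[0], proposed)
    return "fail", None

grammar = {"exp": [["exp", "op", "exp"], ["ID"]], "op": [["'+'"]]}
status, answer = rename(grammar, "expr", "Expr")
print(status, answer)
if status == "proposal":
    _, old, new = answer
    print(rename(grammar, old, new))
```

The caller stays in control: a proposal is only data, and it is applied only if the caller accepts it.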

slide-37
SLIDE 37

Key points

  • For grammarware, we need
  • consistency
  • a clear quality model
  • improvement processes
  • automation
  • Also,
  • understanding user scenarios
slide-38
SLIDE 38

Parsing in a broad sense

[Diagram nodes: grouped tokens, typed tokens, slices/tokens, raw string, visual diagram, graph model, vector drawing, raster picture, abstract model, concrete model, parse graph, parse forest]

Zaytsev, Bagge, Parsing in a Broad Sense, MoDELS'

slide-39
SLIDE 39

Introduction Compilers Grammarware Transformation Consistency Maturity Understanding Testing Conclusion

slide-40
SLIDE 40

So, grammarware is based on grammars…
…can we test/validate it based on grammars?

slide-41
SLIDE 41

Grammar-based testing

  • Purdom’s generator
  • builds the shortest conforming term
  • Maurer’s generator
  • randomly selects alternatives
  • Coverage criteria
  • TC, NC, PC, BC, UC, CDBC
  • Negative cases?

Fischer, Lämmel, Zaytsev, Comparison of CFGs Based on … Test Data

G G’ P P’
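The two generator styles can be contrasted on a toy grammar. This is a sketch in their spirit only: `shortest_term` uses a simple fixpoint over derivation sizes rather than Purdom's actual algorithm, and `random_term` picks alternatives at random with a depth bound added here to guarantee termination. The grammar and all function names are hypothetical.

```python
import random

GRAMMAR = {  # hypothetical toy grammar; dict keys are nonterminals
    "expr": [["atom"], ["expr", "+", "atom"]],
    "atom": [["ID"], ["INT"], ["(", "expr", ")"]],
}

def derivation_sizes(grammar):
    """Fixpoint: size of the smallest derivation tree per nonterminal."""
    best = {n: float("inf") for n in grammar}
    changed = True
    while changed:
        changed = False
        for n, alts in grammar.items():
            for alt in alts:
                cost = 1 + sum(best.get(s, 1) for s in alt)  # terminals cost 1
                if cost < best[n]:
                    best[n], changed = cost, True
    return best

def shortest_term(grammar, symbol, best=None):
    """Always follow the alternative with the smallest derivation."""
    best = best or derivation_sizes(grammar)
    if symbol not in grammar:
        return [symbol]
    alt = min(grammar[symbol], key=lambda a: sum(best.get(s, 1) for s in a))
    return [tok for s in alt for tok in shortest_term(grammar, s, best)]

def random_term(grammar, symbol, rng, depth=8):
    """Pick alternatives at random; fall back to the shortest when too deep."""
    if symbol not in grammar:
        return [symbol]
    if depth == 0:
        return shortest_term(grammar, symbol)
    alt = rng.choice(grammar[symbol])
    return [tok for s in alt for tok in random_term(grammar, s, rng, depth - 1)]

print(" ".join(shortest_term(GRAMMAR, "expr")))
print(" ".join(random_term(GRAMMAR, "expr", random.Random(1))))
```

Neither generator produces negative cases, since every output is derivable from the grammar by construction.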

slide-42
SLIDE 42

Combinatorial explosion

[Chart: coverage criteria TC, PC, NC, BC and CDBC compared for Java (Habelitz), Java (Parr), Java (Stahl), Java (Studman) and TESCOL (00001); axis ticks from 250 to 1,250]

Fischer, Lämmel, Zaytsev, Comparison of CFGs Based on … Test Data

slide-43
SLIDE 43

Combinatorial explosion

Butrus, Zaytsev, Grammar-based Testing Made Easy with Mutations

slide-44
SLIDE 44

Nonterminal matching

Fischer, Lämmel, Zaytsev, Comparison of CFGs Based on … Test Data

slide-45
SLIDE 45

Badly matched:

Fischer, Lämmel, Zaytsev, Comparison of CFGs Based on … Test Data

slide-46
SLIDE 46

Differential methods

  • Oracles are unnecessary
  • Comparing grammars
  • of varying structure, style, etc
  • across TSs
  • Investigate disagreements

G G’ P P’

McKeeman, Differential Testing for Software

Spinellis, Differential Debugging, IEEE Software, 2013
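A differential setup needs no oracle: the same inputs go to several parsers, and only the disagreements are inspected. In the sketch below, the two recognisers stand in for P and P' and are invented for illustration (comma-separated integers, with and without a tolerated trailing comma).

```python
import re

def parser_a(text):
    """Hypothetical P: comma-separated integers, at least one."""
    return re.fullmatch(r"\d+(,\d+)*", text) is not None

def parser_b(text):
    """Hypothetical P': same language, but tolerates a trailing comma."""
    return re.fullmatch(r"\d+(,\d+)*,?", text) is not None

def disagreements(parsers, inputs):
    """Return the inputs on which the parsers give different verdicts."""
    report = {}
    for text in inputs:
        verdicts = {name: parse(text) for name, parse in parsers.items()}
        if len(set(verdicts.values())) > 1:
            report[text] = verdicts
    return report

samples = ["1", "1,2", "1,", ",1", ""]
print(disagreements({"P": parser_a, "P'": parser_b}, samples))
```

A disagreement does not say which parser is wrong, only where to look.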
slide-47
SLIDE 47

What is a bug?

The First Computer Bug Photo #

  • Grammarware processes languages
  • A bug is a program
  • Inspect programs to deal with bugs

slide-48
SLIDE 48

Reality vs. specification

  • Obtain a grammar
  • Construct as an oracle
  • Extract from the tool
  • Infer from the codebase
  • Converge/diff.test

Stevenson, Cordy, A Survey of Grammatical Inference in Software Engineering
Roș

[Figure: APTA state 0 (before any merge)]
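The figure refers to an APTA (augmented prefix tree acceptor), the usual starting point of state-merging grammatical inference: a prefix tree of the samples with states labelled accepting, rejecting or unknown. A minimal construction is sketched below; the sample words and the data layout (two dicts keyed by state number) are assumptions for illustration, and the merging phase itself is not shown.

```python
def build_apta(positive, negative=()):
    """Build a prefix tree with states labelled True, False or None (unknown)."""
    transitions = {0: {}}   # state -> {symbol: next state}
    labels = {0: None}      # state 0 is the root, before any merge
    for samples, label in ((positive, True), (negative, False)):
        for word in samples:
            state = 0
            for symbol in word:
                if symbol not in transitions[state]:
                    new_state = len(transitions)
                    transitions[state][symbol] = new_state
                    transitions[new_state] = {}
                    labels[new_state] = None
                state = transitions[state][symbol]
            labels[state] = label
    return transitions, labels

transitions, labels = build_apta(positive=["ab", "abb"], negative=["a"])
print(transitions)
print(labels)
```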

slide-49
SLIDE 49

Antipatterns

  • Some ways lead to bugs faster
  • Detect them => predict defects
  • Smells
  • left/right recursion
  • ambiguous x*?

Taba, Khomh, Zou, Hassan, Nagappan, Predicting Bugs Using Antipatterns, ICSM 2013
Sajnani, Saini, Lopes, A Comparative Study of Bug Patterns in Java, SCAM 2014
Trubiani, Di Marco, Cortellessa, Mani, Petriu, Exploring Synergies…
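Smell detection on grammars can be as simple as a syntactic scan. The sketch below flags only direct left and right recursion, the first smell listed on this slide; indirect recursion and the ambiguity smell are out of scope. The grammar layout and the report format are assumptions.

```python
def recursion_smells(grammar):
    """Report alternatives that start or end with their own nonterminal."""
    smells = []
    for nonterminal, alts in grammar.items():
        for alt in alts:
            if len(alt) > 1 and alt[0] == nonterminal:
                smells.append((nonterminal, "left recursion", alt))
            if len(alt) > 1 and alt[-1] == nonterminal:
                smells.append((nonterminal, "right recursion", alt))
    return smells

toy = {  # hypothetical grammar
    "expr": [["expr", "+", "atom"], ["atom"]],
    "list": [["item", "list"], ["item"]],
    "atom": [["ID"]],
}
for smell in recursion_smells(toy):
    print(smell)
```

Whether a reported recursion is an actual problem depends on the parsing technology in use, so the output is a hint for inspection rather than a verdict.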

slide-50
SLIDE 50

Process improvement

  • find defects
  • fix defects
  • learn how to fix defects
  • learn to tolerate defects
  • learn to avoid defects
slide-51
SLIDE 51

Semiparsing

  • ad hoc lexical analysis
  • hierarchical lexical analysis
  • lexical conceptual structure
  • iterative lexical analysis
  • fuzzy parsing
  • parsing incomplete sentences
  • island grammars
  • lake grammars
  • robust multilingual parsing
  • gap parsing
  • noise skipping
  • bridge grammars
  • skeleton grammars
  • breadth-first parsing
  • iterative syntactic analysis
  • grammar relaxation
  • agile parsing
  • permissive grammars
  • hierarchical error repair
  • panic mode
  • noncorrecting error recovery
  • practical precise parsing

Zaytsev, Formal Foundations for Semi-parsing, CSMR-WCRE’
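As one concrete instance from this list, the island-grammar idea can be approximated with a regular expression: describe only the constructs of interest ("islands") and skip everything else ("water"). The sketch below is a lexical approximation rather than a real island grammar, and the island chosen here (function headers of the form `def name(params)`) is an arbitrary example.

```python
import re

# Island: a function header; everything else in the input is water.
ISLAND = re.compile(r"\bdef\s+(?P<name>[A-Za-z_]\w*)\s*\((?P<params>[^)]*)\)")

def islands(source):
    """Extract (name, parameters) pairs and ignore the rest of the input."""
    found = []
    for match in ISLAND.finditer(source):
        params = [p.strip() for p in match.group("params").split(",") if p.strip()]
        found.append((match.group("name"), params))
    return found

source = "x = 1 +\ndef f(a, b): return a ++ b\n@@@ garbage\ndef g(): pass"
print(islands(source))
```

The input is not valid in any single language, yet the islands are still recovered, which is the tolerance this family of methods aims for.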

slide-52
SLIDE 52

Conclusion

  • Grammarware is more than just compilers
  • Borrow methods from other domains
  • Automate whenever possible
  • Compare & combine
  • Advance taxonomies & formalisms
  • Bet on robust/tolerant methods

slide-53
SLIDE 53

Thank you!

  • Sources:
  • Figures used from own papers & talks
  • + Eelco Visser’s keynote @ MODULARITY
  • + Tobias Baanders
  • + JLS book covers (Fair Use)
  • Self-made screenshots
  • All photos from public domain
  • Comfortaa: font

Questions?

slide-54
SLIDE 54

Introduction Compilers Grammarware Transformation Consistency Maturity Understanding Testing Conclusion