
Grammars and meta-models

  • Assignments are used to assign the parsed information to a feature of the current object.
  • The type of the current object, its EClass, is specified by the return type of the parser rule.
  • Example:

State :
  'state' name=ID
  ('actions' '{' (actions+=[Command])+ '}')?
  (transitions+=Transition)*
  'end' ;

Attributes of State: name, actions, transitions

/ Faculteit Wiskunde en Informatica

PAGE 0 29-11-2011


  • There are three different assignment operators, each with different semantics:
      • The simple equal sign '=' is the straightforward assignment, used for features which take only one element
      • The '+=' sign (the add operator) expects a multi-valued feature and adds the value on the right-hand side to that feature, which is a list feature
      • The '?=' sign (boolean assignment operator) expects a feature of type EBoolean and sets it to true if the right-hand side was consumed, independently of the concrete value of the right-hand side
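
The three operators can be mimicked in a few lines of Python. This is only a sketch of the semantics described above, not Xtext's implementation: ModelObject and assign are hypothetical names, and a plain dictionary stands in for the features of an EMF object.

```python
class ModelObject:
    """Hypothetical stand-in for the current object: a bag of named features."""

    def __init__(self):
        self.features = {}


def assign(obj, feature, operator, value):
    """Apply one assignment after its right-hand side has been consumed."""
    if operator == "=":
        # single-valued feature: store the one element
        obj.features[feature] = value
    elif operator == "+=":
        # multi-valued (list) feature: append the element
        obj.features.setdefault(feature, []).append(value)
    elif operator == "?=":
        # boolean feature: true because something was consumed,
        # whatever the concrete value was
        obj.features[feature] = True
    else:
        raise ValueError("unknown assignment operator: " + operator)


state = ModelObject()
assign(state, "name", "=", "idle")
assign(state, "transitions", "+=", "t1")
assign(state, "transitions", "+=", "t2")
assign(state, "static", "?=", "static")
```

After these calls, name holds a single value, transitions holds a list of two elements, and static is true.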


  • Extended Backus-Naur Form Expressions
  • Token rules are described using “Extended Backus-Naur Form”-like (EBNF) expressions
  • There are four different possible cardinalities:
      1. exactly one (the default, no operator)
      2. one or none (operator ?)
      3. any (zero or more, operator *)
      4. one or more (operator +)
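
The four cardinalities behave like the corresponding regular-expression quantifiers. The sketch below uses Python's re module and a hypothetical element 'ab' purely for illustration; it is not Xtext syntax.

```python
import re

# One pattern per cardinality operator, applied to the element 'ab'.
CARDINALITIES = {
    "": r"(?:ab)",    # exactly one (no operator)
    "?": r"(?:ab)?",  # one or none
    "*": r"(?:ab)*",  # any (zero or more)
    "+": r"(?:ab)+",  # one or more
}


def accepts(operator, text):
    """Return True if text is a valid repetition of 'ab' under the operator."""
    return re.fullmatch(CARDINALITIES[operator], text) is not None
```

For example, accepts("?", "") and accepts("+", "abab") hold, while accepts("", "") and accepts("?", "abab") do not.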


  • Unordered Groups
  • The elements of an unordered group can occur in any order, but each element can occur at most once.
  • Unordered groups are separated with '&', e.g.

Modifier:
  static?='static' & final?='final' & visibility=Visibility;

enum Visibility:
  PUBLIC='public' | PRIVATE='private' | PROTECTED='protected';

  • allows

public static final
static protected
final private static
public
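
A minimal Python sketch of the "any order, at most once" rule for the Modifier example. It checks only the rule as stated on the slide; whether an element such as visibility is mandatory is not modelled, and the function names are hypothetical.

```python
VISIBILITIES = {"public", "private", "protected"}


def classify(token):
    """Map a token to the group element it belongs to, or None if unknown."""
    if token in VISIBILITIES:
        return "visibility"
    if token in ("static", "final"):
        return token
    return None


def accepts_modifier(tokens):
    """Accept tokens in any order as long as no group element repeats."""
    seen = set()
    for token in tokens:
        element = classify(token)
        if element is None or element in seen:
            return False
        seen.add(element)
    return True
```

The four sequences listed above are accepted, whereas "static final static protected" (static twice) and "private public" (two visibilities) are rejected.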


  • Context-free grammars are mapped to signatures
  • A signature describes the structure of abstract syntax trees
  • A meta-model can also describe the structure of abstract syntax trees, plus
      • Relations between identifiers
      • Attributes to store scope information
      • Attributes to store type information


  • The Xtext specification for Booleans:

Model : OrBool ;
OrBool : lhs=AndBool ('|' rhs=OrBool)? ;
AndBool : lhs=NotBool ('&' rhs=AndBool)? ;
NotBool : (not?='~')? arg=BracketBool ;
BracketBool : '(' orArg=OrBool ')' | conArg=BoolCon ;
BoolCon : {TrueNode} 'true' | {FalseNode} 'false' ;


  • Resulting meta-model (of syntax tree) for Booleans: [diagram not reproduced in this transcript]


  • First alternative for Boolean meta-model: [diagram not reproduced in this transcript]


  • SDF definition of toy language Pico

context-free syntax
  "begin" DECLS {STATEMENT ";"}* "end" -> PROGRAM
  "declare" {ID-TYPE ","}* ";" -> DECLS
  PICO-ID ":" TYPE -> ID-TYPE
  PICO-ID ":=" EXP -> STATEMENT
  "if" EXP "then" {STATEMENT ";"}* "else" {STATEMENT ";"}* "fi" -> STATEMENT
  "while" EXP "do" {STATEMENT ";"}* "od" -> STATEMENT
  PICO-ID -> EXP
  NatCon -> EXP
  StrCon -> EXP
  EXP "+" EXP -> EXP {left}
  EXP "-" EXP -> EXP {left}
  EXP "||" EXP -> EXP {left}
  "(" EXP ")" -> EXP {bracket}


  • Xtext specification for Pico:

Model : Program ;
Program : 'begin' decls=Decls? stats=Statements? 'end' ;
Decls : 'declare' idtypes=IdTypes ';' ;
IdTypes : pairs+=IdType (',' pairs+=IdType)* ;
IdType : name=ID ':' type=Type ;
Type : {naturalType} 'natural' | {stringType} 'string' ;
Statements : statements+=Statement (';' statements+=Statement)* ;


  • Xtext specification for Pico (continued):

Statement : AssignStatement | IfStatement | WhileStatement ;
AssignStatement : lhs=ID ':=' rhs=Exp ;
IfStatement : 'if' Exp 'then' thenpart=Statements? 'else' elsepart=Statements? 'fi' ;
WhileStatement : 'while' Exp 'do' dopart=Statements? 'od' ;


  • Xtext specification for Pico (continued):

Exp : lhs=Term (bop=BinOp rhs=Exp)? ;
Term : id=ID | literal=STRING | number=INT | '(' Exp ')' ;
BinOp : {plusOp} '+' | {minOp} '-' | {concOp} '||' ;

(ID, STRING and INT are predefined)


  • Resulting meta-model (of syntax tree) for Pico: [diagrams not reproduced in this transcript]


  • Xtext specification for Pico (with cross references):

…
AssignStatement : lhs=[IdType|ID] ':=' rhs=Exp ;
…
Term : id=[IdType|ID] | literal=STRING | number=INT | '(' Exp ')' ;
…

[IdType|ID] is a cross reference
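
Conceptually, a cross reference such as [IdType|ID] parses an ID token and links it to an already declared IdType with that name. The Python sketch below illustrates only this lookup. It ignores scoping entirely, and the class and function names are hypothetical rather than part of Xtext.

```python
from dataclasses import dataclass


@dataclass
class IdType:
    """Hypothetical counterpart of the IdType rule: a declared name and its type."""
    name: str
    type: str


def resolve(declarations, identifier):
    """Return the declaration named by identifier, or None if it is undeclared."""
    by_name = {decl.name: decl for decl in declarations}
    return by_name.get(identifier)


decls = [IdType("x", "natural"), IdType("s", "string")]
target = resolve(decls, "x")    # the IdType object declared for x
missing = resolve(decls, "y")   # None: y was never declared
```

An unresolved name like y is the kind of error a linker can report, which a purely context-free parser cannot detect.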



  • Xtext offers
      • “built-in” cross reference mechanism
      • scoping mechanism via writing “simple” Java methods, see http://www.eclipse.org/Xtext/documentation/1_0_0/xtext.html#scoping
  • Xtext in fact mixes context-free parsing with some form of semantic evaluation


  • Conclusions on Xtext
      • popular
      • well integrated in Eclipse
      • suited for defining concrete syntax of new languages
      • less suited for existing languages, because it is restricted to the LL class of grammars


  • EMFText
      • is tightly integrated with Eclipse Modeling Framework (EMF)
      • enables the definition of textual syntax for Ecore-based meta-models
      • offers a Concrete Syntax Specification Language (CS) that is EBNF based
      • ANTLR based
      • Documentation: http://www.emftext.org/EMFTextGuide.php


  • EMFText offers
      • modular specification:
          − import mechanism for various meta-models
          − modularization and extension of CS specifications
      • default reference resolving mechanisms
          − default name resolution mechanism for models with globally unique names is available for any syntax
      • comprehensive syntax analysis
          − analyses of CS specifications inform the developer about potential errors


  • Developing a language with EMFText is an iterative process and consists of the following basic tasks:
      1. specifying a language meta-model
      2. specifying the Concrete Syntax of the language
      3. generating language tooling
      4. optionally customizing the language tooling


  • The meta-model (abstract syntax) of a language is specified using the Ecore Meta-modelling Language
  • A CS specification consists of 4 sections:
      1. mandatory configuration:
          − the language file extension is defined
          − the syntax specification is bound to the meta-model
          − the syntax start symbol is defined
          − optionally, other syntaxes and meta-models can be imported, and
          − various EMFText code generation options can be configured
      2. basic token types used by the language lexer to tokenize language expressions are defined
      3. token styles are defined that customize syntax highlighting in the generated editor
      4. the syntax rules for the language are specified


  • The syntax specification rules used in the CS language are derived from EBNF to support arbitrary context-free languages
      • they are meant to define syntax for EMF-based meta-models and relate to Ecore meta-modeling concepts
      • the CS language provides Ecore-specific specializations of classical EBNF constructs like terminals and non-terminals
          − this enables EMFText to provide advanced support during syntax specification, e.g., errors and warnings if the syntax specification is inconsistent with the meta-model
          − it enables the EMFText parser generator to derive a parser that directly instantiates EMF models from language expressions


  • Configuration section of CS:
  • First, the file extension used for the files containing the models must be defined via:

SYNTAXDEF yourFileExtension

  • Second, the EMF generator model (.genmodel) containing the meta classes for which the syntax is specified must be named. The genmodel can be referred to by its namespace URI:

FOR <yourGenModelNamespaceURI> <yourGenmodelLocation>?

      − EMFText uses the generator model instead of the Ecore model, because it requires information about the code generated from the Ecore model


  • Third, the start symbol must be defined, which must be a meta class from the meta-model:

START YourRootMetaClassName

  • A CS specification can also have multiple start symbols (separated by a comma)
  • Typical candidates for start symbols are meta classes without incoming containment relations


  • It is possible to import additional meta-models
      • if they are only referenced in the current one, and
      • a syntax for some or all of their concepts needs to be specified or reused
  • Meta-models and syntax specifications can be imported in a dedicated import section

IMPORTS {
  // imports go here
}

  • The import section must contain at least one import entry


  • If a syntax is imported, all its rules are reused
  • Importing syntax rules is optional
  • One can also just import the meta-model contained in the generator model

prefix : <genModelURI> <locationOfTheGenmodel>
  // next line is optional
  WITH SYNTAX syntaxURI <locationOfTheSyntax>


  • EMFText allows specifying custom tokens.
  • Each token consists of a name and a regular expression
  • By default, EMFText implicitly uses a set of predefined standard tokens:

TEXT : ('A'..'Z'|'a'..'z'|'0'..'9'|'_'|'')+
LINEBREAK : ('\r\n'|'\r'|'\n')
WHITESPACE : (' '|'\t'|'\f')

  • The predefined tokens can be explicitly excluded by setting the usePredefinedTokens option to false:

OPTIONS {
  usePredefinedTokens = "false";
}


  • A TOKENS section must be added to define custom tokens:

TOKENS {
  // token definitions go here in the form:
  DEFINE YOUR_TOKEN_NAME $yourRegularExpression$;
}

  • Every token name starts with a capital letter
  • A regular expression must conform to the ANTLRv3 syntax for regular expressions (without semantic annotations)
  • Example of composed tokens:

TOKENS {
  DEFINE CHAR $('a'..'z'|'A'..'Z')$; // simple token
  DEFINE DIGIT $('0'..'9')$; // simple token
  // composed token
  DEFINE IDENTIFIER CHAR + $($ + CHAR + $|$ + DIGIT + $)*$;
}
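
Composing IDENTIFIER from CHAR and DIGIT amounts to string concatenation of regular-expression fragments. The sketch below mirrors this with Python's re syntax, which differs from the ANTLRv3 notation used in CS specifications; is_identifier is a hypothetical helper.

```python
import re

CHAR = r"[a-zA-Z]"   # counterpart of $('a'..'z'|'A'..'Z')$
DIGIT = r"[0-9]"     # counterpart of $('0'..'9')$

# Composed token: one CHAR followed by any number of CHARs or DIGITs.
IDENTIFIER = CHAR + "(" + CHAR + "|" + DIGIT + ")*"


def is_identifier(text):
    """Return True if the whole text matches the composed IDENTIFIER pattern."""
    return re.fullmatch(IDENTIFIER, text) is not None
```

Under this definition "x1" and "abc" are identifiers, while "1x" and the empty string are not.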


  • Syntax Rules
  • For each concrete meta class you can define a syntax rule
  • The rule specifies what the text that represents instances of the class looks like
  • Rules have two sides: a left-hand and a right-hand side.
      − the left-hand side denotes the name of the meta class
      − the right-hand side defines the syntax elements


  • The most basic form of a syntax rule is:

YourMetaClass ::= "someKeyword" ;

  • This rule states that whenever the text someKeyword is found, an instance of YourMetaClass must be created
  • Besides text elements that are expected “as is”, parts of the syntax can be optional or repeating:

YourMetaClassWithOptionalSyntax ::= ("#")? "someKeyword" ;
YourMetaClassWithRepeatingSyntax ::= ("#")* "someKeyword" ;
YourMetaClassWithRepeatingSyntax ::= ("#")+ "someKeyword" ;


  • If meta classes have attributes, we can also specify syntax for their values, by adding square brackets:

YourMetaClassWithAttribute ::= yourAttribute[] ;

  • One can specify the name of a token inside the brackets:

YourMetaClassWithAttribute ::= yourAttribute[MY_TOKEN] ;

  • If the token name is omitted, EMFText uses the predefined token TEXT, which includes alphanumeric characters


  • For boolean attributes, EMFText provides a special feature to ease syntax specification

YourMetaClassWithAttribute ::= yourAttribute["yes" : "no"] ;

  • This rule states that yes represents the true value and no represents false.
  • You can also use the empty string for one of the values:

YourMetaClassWithAttribute ::= yourAttribute["set" : ""] ;

      − The attribute is set to false by default and set to true if the text set is found


  • For enumeration attributes, EMFText also provides a special feature to ease syntax specification.
  • For each literal of the enumeration, the corresponding string representation must be given:

YourMetaClassWithAttribute ::= yourAttribute[red : "r", green : "g", blue : "b"];


  • Meta classes can have references, and consequently there is a way to specify syntax for these
  • EMF distinguishes between containment and non-containment references.
      • In EMF, the elements that are referenced with the former type are contained in the parent elements
          − EMFText expects the text for the contained elements (children) to be also contained in the parent's text
      • Elements referenced with the latter (non-containment) type are only referenced; they are contained in another (parent) element
          − EMFText does not expect text that represents the referenced element, but a symbolic identifier that refers to the element


  • A basic example for defining a rule for a meta class that has a containment reference looks like this:

YourContainerMetaClass ::= "CONTAINER" yourContainmentReference ;

  • It allows representing instances of YourContainerMetaClass using the keyword CONTAINER followed by one instance of the type that yourContainmentReference points to
  • If multiple children need to be contained:

YourContainerMetaClass ::= "CONTAINER" yourContainmentReference* ;


  • Each containment reference can be restricted to allow only certain types:

YourContainerMetaClass ::= "CONTAINER" yourContainmentReference : SubClass ;

  • It allows only instances of SubClass after the keyword CONTAINER even though the reference yourContainmentReference may have a more general type
  • Multiple subclass restrictions are also possible, separated by a comma:

YourContainerMetaClass ::= "CONTAINER" yourContainmentReference : SubClassA, SubClassB ;


  • An example for defining a rule for a meta class that has a non-containment reference looks like:

YourPointerMetaClass ::= "POINTER" yourNonContainmentReference[] ;

  • The rule is very similar to the one for containment references, but uses the additional brackets
  • Within the brackets, the token that the symbolic name must match can be defined


  • Defining syntax for an expression language (e.g., arithmetic expressions) via EMFText can lead to:
      • structures that are not optimal
      • left recursive rules


  • EMFText provides a special feature called operator precedence annotations (@Operator)
  • These annotations can be added to rules which refer to expression meta classes with a common superclass:

@Operator(type="binary_left_associative", weight="1", superclass="Expression")
Additive ::= left "+" right;

      • This defines syntax for a metaclass Additive
      • The references left and right must be containment references and have the type Expression, which is the abstract supertype for all metaclasses of the expression metamodel


  • The type attribute specifies the kind of expression at hand, which can be binary (either left_associative or right_associative), unary_prefix, unary_postfix or primitive
  • The weight attribute specifies the priority of one expression type over another:

@Operator(type="binary_left_associative", weight="2", superclass="Expression")
Multiplicative ::= left "*" right;

  • EMFText will create an expression tree, where Multiplicative nodes are created last (i.e., multiplicative expressions take precedence over additive expressions)


  • Unary expressions can be defined as follows:

@Operator(type="unary_prefix", weight="4", superclass="Expression")
Negation ::= "-" body;

  • Primitive expressions can be defined as follows:

@Operator(type="primitive", weight="5", superclass="Expression")
IntegerLiteralExp ::= intValue[INTEGER_LITERAL];

  • They should be used for literals (e.g., numbers, constants or variables)
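
How weights translate into a tree can be illustrated with a small precedence-climbing parser in Python. This is a sketch of the general technique, using the weights from the annotations above (1 for "+", 2 for "*", a prefix "-" that binds tighter than both). It is not the parser EMFText generates, and the tuple-based tree is a hypothetical stand-in for the Additive, Multiplicative, Negation and IntegerLiteralExp metaclasses.

```python
import re

# Binary operators and their weights; both are treated as left-associative.
BINARY_WEIGHTS = {"+": 1, "*": 2}


def tokenize(text):
    """Split the input into integer literals and the operators + * -."""
    return re.findall(r"\d+|[-+*]", text)


def parse_operand(tokens):
    """Parse a primitive integer literal, optionally under prefix negations."""
    token = tokens.pop(0)
    if token == "-":
        # unary prefix: binds to the operand that follows it
        return ("neg", parse_operand(tokens))
    return int(token)


def parse(tokens, min_weight=1):
    """Precedence climbing: only consume operators of at least min_weight."""
    left = parse_operand(tokens)
    while tokens and BINARY_WEIGHTS.get(tokens[0], 0) >= min_weight:
        operator = tokens.pop(0)
        # weight + 1 on the right-hand side makes the operator left-associative
        right = parse(tokens, BINARY_WEIGHTS[operator] + 1)
        left = (operator, left, right)
    return left
```

With these weights, parse(tokenize("1+2*3")) groups the multiplication below the addition, and parse(tokenize("1+2+3")) nests to the left.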


  • EMFText offers:
      • concise definition of lexical and concrete syntax rules
      • modularity
  • EMFText allows:
      • definition of editor features (syntax highlighting)
      • pretty printing
      • etc.
