
Syntax and ANTLR

CS152 Programming Language Paradigms, Prof. Tom Austin


  1. CS152 – Programming Language Paradigms Prof. Tom Austin Syntax and ANTLR

  2. Syntax vs. Semantics
     • Semantics:
       – What does a program mean?
       – Defined by an interpreter or compiler
     • Syntax:
       – How is a program structured?
       – Defined by a lexer and parser

  3. Review: Overview of Compilation
     [Diagram: source code → Lexer/Tokenizer → tokens → Parser → Abstract Syntax Tree (AST) → Compiler → machine code, or → Interpreter → commands]

  4. Tokenization
     [Diagram: the compilation pipeline from slide 3, repeated]

  5. Tokenization
     • The process of converting characters into the words (tokens) of the language.
     • Generally handled through regular expressions.
     • A variety of lexers exist:
       – Lex/Flex are old and well-established
       – ANTLR and JavaCC both handle lexing and parsing
     • Sample lexing rule for integers (in ANTLR):
       INT : [0-9]+ ;
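
To make the regular-expression idea concrete, here is a minimal sketch of a hand-written lexer in Python. Only the INT pattern comes from the slide; the PLUS and WS rules and the function name `tokenize` are assumptions added so the example has something to separate and skip. ANTLR generates a far more capable lexer than this.

```python
import re

# Token rules as (name, regex) pairs. INT mirrors the ANTLR rule  INT : [0-9]+ ;
# PLUS and WS are illustrative assumptions, not part of the slide.
TOKEN_RULES = [
    ("INT", r"[0-9]+"),
    ("PLUS", r"\+"),
    ("WS", r"[ \t\r\n]+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{rx})" for name, rx in TOKEN_RULES))


def tokenize(text):
    """Convert a string of characters into a list of (token_type, lexeme) pairs."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at position {pos}")
        if match.lastgroup != "WS":  # whitespace is recognized but discarded
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(tokenize("12 + 345"))  # [('INT', '12'), ('PLUS', '+'), ('INT', '345')]
```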

  6. Categories of Tokens
     • Reserved words or keywords
       – e.g. if, while
     • Literals or constants
       – e.g. 123, "hello"
     • Special symbols
       – e.g. ";", "<=", "+"
     • Identifiers
       – e.g. balance, tyrionLannister
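
The four categories can be illustrated with a small Python sketch that labels each lexeme. The keyword set and symbol list are taken from the slide's examples only; the function name `categorize` and the exact patterns (for instance, string literals without escape sequences) are simplifying assumptions.

```python
import re

KEYWORDS = {"if", "while"}  # reserved words from the slide; a real language has more

# Earlier rules win when several could match at the same position,
# so "<=" is listed before the single-character symbols.
RULES = [
    ("LITERAL", r'[0-9]+|"[^"\n]*"'),      # integers and simple strings (no escapes)
    ("WORD", r"[A-Za-z_][A-Za-z0-9_]*"),   # keyword or identifier, decided below
    ("SYMBOL", r"<=|;|\+"),
    ("WS", r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{rx})" for name, rx in RULES))


def categorize(text):
    """Return (category, lexeme) pairs for the given source text."""
    result = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at position {pos}")
        kind, lexeme = match.lastgroup, match.group()
        pos = match.end()
        if kind == "WS":
            continue
        if kind == "WORD":
            # Keywords look like identifiers; a table lookup tells them apart.
            kind = "KEYWORD" if lexeme in KEYWORDS else "IDENTIFIER"
        result.append((kind, lexeme))
    return result


for token in categorize('while balance <= 123 ;'):
    print(token)
```

Matching keywords with the identifier pattern and then checking a reserved-word table is one common design; lexer generators often express the same thing through rule ordering instead.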

  7. Lexing in ANTLR (v. 4) (in class)

  8. Parsing
     [Diagram: the compilation pipeline from slide 3, repeated]

  9. Parsing
     • Parsers take the tokens of the language and combine them into abstract syntax trees (ASTs).
     • The rules for parsers are defined by context-free grammars (CFGs).
     • Parsers can be divided into
       – bottom-up/shift-reduce parsers
       – top-down parsers

  10. Context-Free Grammars
     • Grammars specify a language
     • Backus-Naur form is a common format
       Expr -> Number | Number + Expr
     • Terminals cannot be broken down further.
     • Non-terminals can be broken down into further phrases.
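
One way to see the terminal/non-terminal distinction is to apply productions by hand. The Python sketch below stores the slide's Expr rule in a dictionary and performs a leftmost derivation. The slide leaves Number undefined, so the two alternatives given for it here (1 and 2) are an assumption made purely to get concrete strings; `leftmost_derivation` is a hypothetical helper name.

```python
# Non-terminals are the dictionary keys; any other symbol is a terminal.
GRAMMAR = {
    "Expr": [["Number"], ["Number", "+", "Expr"]],  # Expr -> Number | Number + Expr
    "Number": [["1"], ["2"]],                       # assumption: not defined on the slide
}


def leftmost_derivation(start, choices):
    """Expand the leftmost non-terminal once per entry in `choices`.

    Each choice is the index of the production to apply.
    Returns the list of sentential forms, one per step.
    """
    form = [start]
    steps = [" ".join(form)]
    for choice in choices:
        index = next(i for i, symbol in enumerate(form) if symbol in GRAMMAR)
        form[index:index + 1] = GRAMMAR[form[index]][choice]
        steps.append(" ".join(form))
    return steps


for step in leftmost_derivation("Expr", [1, 0, 0, 1]):
    print(step)
# Expr
# Number + Expr
# 1 + Expr
# 1 + Number
# 1 + 2
```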

  11. Sample grammar
       expr   -> expr + expr | expr - expr | ( expr ) | number
       number -> number digit | digit
       digit  -> 0 | 1 | 2 | 3 | … | 9
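
Note that this grammar does not say how repeated operators group, so an input such as 1 - 2 - 3 has more than one parse tree. The Python sketch below writes the two candidate trees by hand as nested tuples (an assumed representation, not something the slides prescribe) and evaluates each to show that the choice of tree changes the meaning.

```python
def evaluate(tree):
    """Evaluate an AST given as an int leaf or an (operator, left, right) tuple."""
    if isinstance(tree, int):
        return tree
    op, left, right = tree
    left_value, right_value = evaluate(left), evaluate(right)
    return left_value + right_value if op == "+" else left_value - right_value


# Two parse trees the grammar allows for the same token sequence: 1 - 2 - 3
left_grouped = ("-", ("-", 1, 2), 3)    # (1 - 2) - 3
right_grouped = ("-", 1, ("-", 2, 3))   # 1 - (2 - 3)

print(evaluate(left_grouped))   # -4
print(evaluate(right_grouped))  # 2
```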

  12. Bottom-up Parsers
     • Also known as shift-reduce parsers
       – shift tokens onto a stack, then reduce to a non-terminal.
     • LR: left-to-right scan, rightmost derivation (constructed in reverse)
     • The most common type of bottom-up parsers are Look-Ahead LR (LALR) parsers
       – YACC/Bison are examples
     • Generally considered more powerful than top-down (LL) parsers, though they seem to be fading from popularity.
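
The shift and reduce steps can be traced with a tiny hand-coded recognizer. This Python sketch is only an illustration for the assumed toy grammar E -> E + n | n, where n stands for any number token; it hard-codes the reduction checks instead of using the parse tables that YACC/Bison generate, so it is not an LALR parser.

```python
def shift_reduce(tokens):
    """Recognize the toy grammar  E -> E + n | n  and record each action taken."""
    stack = []
    remaining = list(tokens)
    trace = []
    while True:
        if stack[-3:] == ["E", "+", "n"]:
            stack[-3:] = ["E"]              # reduce the handle on top of the stack
            trace.append("reduce E -> E + n")
        elif stack == ["n"]:
            stack[:] = ["E"]                # the first number becomes an E
            trace.append("reduce E -> n")
        elif remaining:
            token = remaining.pop(0)
            stack.append(token)             # shift the next token onto the stack
            trace.append(f"shift {token}")
        else:
            break
    accepted = stack == ["E"]
    return accepted, trace


accepted, trace = shift_reduce(["n", "+", "n"])
print(accepted)  # True
for action in trace:
    print(action)
# shift n
# reduce E -> n
# shift +
# shift n
# reduce E -> E + n
```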

  13. Top-down parsers
     • Non-terminals are expanded to match incoming tokens.
     • LL: left-to-right scan, leftmost derivation
     • LL(k) parsers can look ahead k tokens to decide which rule to use.
       – example LL(k) parser generator: JavaCC
     • LL(1) parsers, commonly written by hand as recursive descent parsers, are of special interest:
       – Easy to write/fast execution time
       – Some languages are designed to be LL(1)
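
A recursive descent parser for the rule Expr -> Number | Number + Expr from slide 10 might look like the following Python sketch. Both alternatives start with Number, so the sketch first parses the number and then uses a single token of lookahead to decide whether a "+" continues the expression. The token format (a list of strings) and the tuple-based AST are assumptions for illustration.

```python
def parse_expr(tokens, pos=0):
    """Expr -> Number ('+' Expr)?   Returns (ast, position after the expression)."""
    if pos >= len(tokens) or not tokens[pos].isdecimal():
        raise SyntaxError(f"expected a number at token {pos}")
    left = int(tokens[pos])
    pos += 1
    # One token of lookahead is enough to choose between the two alternatives.
    if pos < len(tokens) and tokens[pos] == "+":
        right, pos = parse_expr(tokens, pos + 1)
        return ("+", left, right), pos
    return left, pos


def parse(tokens):
    """Parse a complete token list, rejecting trailing input."""
    ast, pos = parse_expr(tokens)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r} at position {pos}")
    return ast


print(parse(["1", "+", "2", "+", "3"]))  # ('+', 1, ('+', 2, 3))
```

Because the rule recurses on its right-hand side, the resulting tree groups to the right.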

  14. ANTLR
     • ANTLR v. 3 was LL(*) (earlier versions were LL(k))
       – Similar to LL(k), but can look ahead as far as needed.
     • ANTLR v. 4 is Adaptive LL(*), or ALL(*)
       – Lets us write directly left-recursive rules, which LL parsers previously could not handle.
         http://www.antlr.org/papers/allstar-techreport.pdf
       – Sample left-recursive grammar:
         expr -> expr + expr | num
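
To see why left recursion is a problem for a hand-written top-down parser, consider the simplified rule expr -> expr + num | num (an assumed variant of the slide's sample). Transcribed literally, the parsing function calls itself before consuming any token. The Python sketch below shows that failure and one standard manual workaround, rewriting the rule as a loop. This is not ANTLR's algorithm; ANTLR 4 performs its own rewriting of directly left-recursive rules automatically.

```python
def naive_expr(tokens, pos=0):
    """Literal transcription of  expr -> expr + num | num.

    The first step is to parse an expr at the same position,
    so no token is ever consumed and the recursion never ends.
    """
    return naive_expr(tokens, pos)


def loop_expr(tokens):
    """Equivalent rule without left recursion:  expr -> num ('+' num)*"""
    pos = 0

    def num():
        nonlocal pos
        if pos >= len(tokens) or not tokens[pos].isdecimal():
            raise SyntaxError(f"expected a number at token {pos}")
        value = int(tokens[pos])
        pos += 1
        return value

    tree = num()
    while pos < len(tokens) and tokens[pos] == "+":
        pos += 1
        tree = ("+", tree, num())  # fold to the left, matching the original rule
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r} at position {pos}")
    return tree


try:
    naive_expr(["1", "+", "2"])
except RecursionError:
    print("naive left recursion does not terminate")

print(loop_expr(["1", "+", "2", "+", "3"]))  # ('+', ('+', 1, 2), 3)
```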

  15. Parsing with ANTLR (in-class)

  16. Lab: Getting to know ANTLR
     • Write a calculator using ANTLR.
     • Details in Canvas; starter code on the course website.
