Syntax and Parsing, Part 1



slide-1
SLIDE 1

Syntax and Parsing

Part 1

slide-2
SLIDE 2

At this point in the course, we’re going to start to learn how PLs work under the hood

slide-3
SLIDE 3

Programming languages take us from raw text on the screen to bits flipping on the processor

slide-4
SLIDE 4

Languages are implemented in phases

The raw text on the screen is gradually converted to a language the computer speaks

slide-5
SLIDE 5

http://durofy.com/phases-of-compiler-design/

slide-6
SLIDE 6

http://durofy.com/phases-of-compiler-design/

Typically called the front end

slide-7
SLIDE 7

The job of the compiler / interpreter’s front end is to break down the raw text into a structure that is easier to work with programmatically.

This results in an intermediate representation.

slide-8
SLIDE 8

The job of the compiler / interpreter’s front end is to break down the raw text into a structure that is easier to work with programmatically.

This results in an intermediate representation.

Why?

slide-9
SLIDE 9

The job of the compiler / interpreter’s front end is to break down the raw text into a structure that is easier to work with programmatically.

This results in an intermediate representation.

Why? Working on raw text is way too kludgy!

slide-10
SLIDE 10

Don’t get too hung up on specifics right now; we’ll be implementing one programming language (Forth) soon!
slide-11
SLIDE 11

Today we’re going to focus on lexical analysis

I.e., how do we break up raw text into a stream of tokens? Or, how do I define a token?

slide-12
SLIDE 12

Next lecture we’ll talk about combining these raw tokens to build up a grammar.

This will help us define the syntax of a PL compositionally.

slide-13
SLIDE 13
slide-14
SLIDE 14

Lexical Analysis

Lexical analysis breaks apart a (potentially huge) file into a sequence of tokens

slide-15
SLIDE 15

Token: atomic piece of syntax of a language

slide-16
SLIDE 16

(define (hello-world) (display "Hello, world!\n"))

LPAREN ID("define") LPAREN ID("hello-world") RPAREN LPAREN ID("display") STRING("Hello, world!\n") RPAREN RPAREN

One example of a token stream

slide-17
SLIDE 17

(define (hello-world) (display "Hello, world!\n"))

LPAREN ID("define") LPAREN ID("hello-world") RPAREN LPAREN ID("display") STRING("Hello, world!\n") RPAREN RPAREN

Lexical analysis
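
To make the step from text to tokens concrete, here is a minimal sketch of a lexer in Python. It is not the course's implementation: the token names (LPAREN, RPAREN, STRING, ID) mirror the slide, while `TOKEN_SPEC`, `MASTER`, and `tokenize` are hypothetical names chosen for this illustration, and the rule that an identifier is any run of non-space, non-paren, non-quote characters is an assumption.

```python
import re

# Hypothetical token rules, chosen to mirror the token names on the slide.
TOKEN_SPEC = [
    ("LPAREN", r"\("),
    ("RPAREN", r"\)"),
    ("STRING", r'"(?:[^"\\]|\\.)*"'),   # double-quoted, allowing \-escapes
    ("ID",     r'[^\s()"]+'),           # assumption: any other run of characters
    ("SKIP",   r"\s+"),                 # whitespace separates tokens
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(text):
    """Return a list of (kind, lexeme) pairs; whitespace is dropped."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        if m.lastgroup != "SKIP":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens

source = '(define (hello-world) (display "Hello, world!\\n"))'
for kind, lexeme in tokenize(source):
    print(kind, lexeme)
```

Each token kind is described by one regular expression, and the lexer repeatedly asks which rule matches at the current position.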

slide-18
SLIDE 18

Enter: Regular Expressions

slide-19
SLIDE 19

Regular expressions are basically string matchers

slide-20
SLIDE 20

A regular expression classifies strings into two categories: accept or reject
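
As a small illustration of accept versus reject, the sketch below uses Python's standard `re` module. The helper name `accepts` is made up for this example; `re.fullmatch` is used so that the whole string, not just a prefix, has to match.

```python
import re

def accepts(pattern, s):
    """True if the whole string s matches pattern, False otherwise."""
    return re.fullmatch(pattern, s) is not None

print(accepts(r"ab*", "abbb"))  # an 'a' followed by any number of 'b's: accepted
print(accepts(r"ab*", "ba"))    # does not start with 'a': rejected
```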

slide-21
SLIDE 21
slide-22
SLIDE 22

Regular expressions are a general device in computing, but there are many implementations.

They each vary a bit, so read the docs on whatever language you’re using.

slide-23
SLIDE 23

(Kris now talks about the basic building blocks of regexes: constants, concatenation, Kleene star, union, and using () for grouping)

(Then the derived forms: [a-z], {a,b,c}, a+)
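
The building blocks can be tried out directly. The sketch below uses Python's `re` syntax, which is only one of the many regex dialects; `accepts` and `examples` are names invented for this illustration, and the accepted and rejected strings are sample choices rather than anything from the lecture.

```python
import re

def accepts(pattern, s):
    """True if the whole string s matches pattern."""
    return re.fullmatch(pattern, s) is not None

examples = [
    # (building block, pattern, an accepted string, a rejected string)
    ("constant",      r"a",     "a",    "b"),
    ("concatenation", r"ab",    "ab",   "a"),
    ("union",         r"a|b",   "b",    "ab"),
    ("Kleene star",   r"a*",    "",     "b"),
    ("grouping",      r"(ab)*", "abab", "aba"),
    ("range [a-z]",   r"[a-z]", "q",    "Q"),
    ("plus a+",       r"a+",    "aaa",  ""),
]

for name, pattern, good, bad in examples:
    assert accepts(pattern, good) and not accepts(pattern, bad)
    print(f"{name:14} {pattern:8} accepts {good!r}, rejects {bad!r}")
```

The derived forms add convenience, not power: `a+` can be written `aa*`, and `[a-c]` can be written `(a|b|c)`.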

slide-24
SLIDE 24

The “language” of a regex is the set of strings it accepts

slide-25
SLIDE 25

(0|1)* What is this language?

slide-26
SLIDE 26

1(0)* What about this one?

slide-27
SLIDE 27

((0|1)(0|1)(0|1))* How about this one?
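
One way to explore questions like these is to list the short strings a regex accepts. The sketch below is a brute-force helper (the name `language` and the length cutoff are choices made for this illustration); it can only show a finite slice of a language, since the languages themselves are infinite.

```python
import re
from itertools import product

def language(pattern, alphabet="01", max_len=3):
    """All strings over alphabet of length <= max_len that pattern accepts."""
    compiled = re.compile(pattern)
    return [
        "".join(chars)
        for n in range(max_len + 1)
        for chars in product(alphabet, repeat=n)
        if compiled.fullmatch("".join(chars))
    ]

print(language(r"(0|1)*"))              # every binary string, including the empty one
print(language(r"1(0)*"))               # a 1 followed by any number of 0s
print(language(r"((0|1)(0|1)(0|1))*"))  # binary strings whose length is a multiple of 3
```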

slide-28
SLIDE 28

Write “the set of odd binary strings” as a regex
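
One possible answer, assuming "odd" means the binary numeral denotes an odd number (so the string ends in 1), is `(0|1)*1`. If "odd" were meant as odd length instead, a different regex would be needed. The sketch below checks the first reading against ordinary integer arithmetic; `ODD` and `is_odd_numeral` are names made up for this example.

```python
import re
from itertools import product

# Assumption: "odd binary string" = a binary numeral whose value is odd.
ODD = re.compile(r"(0|1)*1")

def is_odd_numeral(s):
    """True if the regex accepts s."""
    return ODD.fullmatch(s) is not None

# Compare the regex with int(s, 2) % 2 on every binary string of length 1..6.
for n in range(1, 7):
    for chars in product("01", repeat=n):
        s = "".join(chars)
        assert is_odd_numeral(s) == (int(s, 2) % 2 == 1)
print("regex agrees with arithmetic on all binary strings up to length 6")
```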

slide-29
SLIDE 29

Write “an odd number of bs followed by an even number of as”
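
A candidate answer is `b(bb)*(aa)*`: one b plus any number of pairs of bs gives an odd count, and pairs of as give an even count. This treats zero as an even number, so "b" alone is accepted. The sketch below compares the regex against a direct counting check; `PATTERN` and `reference` are hypothetical names for this illustration.

```python
import re
from itertools import product

PATTERN = re.compile(r"b(bb)*(aa)*")

def reference(s):
    """Direct check: s is some bs then some as, with an odd number of bs
    and an even number of as (zero counts as even)."""
    num_b = len(s) - len(s.lstrip("b"))   # count of leading bs
    rest = s[num_b:]
    return set(rest) <= {"a"} and num_b % 2 == 1 and len(rest) % 2 == 0

# Brute-force comparison over every string of as and bs up to length 6.
for n in range(7):
    for chars in product("ab", repeat=n):
        s = "".join(chars)
        assert (PATTERN.fullmatch(s) is not None) == reference(s)
print("regex agrees with the counting check up to length 6")
```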

slide-30
SLIDE 30

“Any number of 1s, followed by an even number of 0s, followed by a single 1”
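
Reading the description left to right suggests `1*(00)*1`, again treating zero 0s as an even number. The sketch below checks that candidate against a hand-written test of the same description; `PATTERN` and `reference` are names invented for this example.

```python
import re
from itertools import product

PATTERN = re.compile(r"1*(00)*1")

def reference(s):
    """Direct check: some 1s, then an even number of 0s, then a single final 1."""
    if not s.endswith("1"):
        return False
    body = s[:-1]                              # everything before the final 1
    num_ones = len(body) - len(body.lstrip("1"))
    zeros = body[num_ones:]
    return set(zeros) <= {"0"} and len(zeros) % 2 == 0

# Brute-force comparison over every binary string up to length 7.
for n in range(8):
    for chars in product("01", repeat=n):
        s = "".join(chars)
        assert (PATTERN.fullmatch(s) is not None) == reference(s)
print("regex agrees with the direct check up to length 7")
```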

slide-31
SLIDE 31

Regular expressions describe exactly the so-called regular languages