CSE 3341: Principles of Programming Languages
Recursive Descent Parsing
Jeremy Morris
Parsing
A grammar is a generator for a language
The rules tell us how to create strings in the language
A parser is a recognizer for a language
Confirms or rejects a string as being in the language
For an arbitrary CFG we can prove an upper bound of O(n^3) on the running time of a parser
Earley's algorithm and CYK algorithm
Fortunately, if the CFG is carefully constructed, we can do much better than that
LL or LR grammars
Top-down vs. Bottom-up parsing
"Top-down" or predictive parsing (or LL parsing)
Starts from the root node of the language and the left-most token.
Builds the parse tree "top down" by using tokens to drive which rule will be expanded next.
Predictive parsers are most often written by hand.
"Bottom-up" parsing (or LR parsing)
Builds the parse tree from the leaves upward, matching a
collection of nodes to rule expansions.
Also starts with the left-most token, but no fixed first rule to
expand.
Bottom-up parsers are most often developed using a parser
generator such as Bison or YACC.
CORE parsing practice
program
    int Y, Z;
begin
    Y = 20;
    Z = 5;
    Y = Y - Z;
    write Y;
end
Recursive Descent
An algorithm for walking an already constructed AST
Top-down rather than bottom-up
Useful for interpreting parsed code, printing parsed code, generating new code from parsed code
Basic Idea:
Create one method/procedure for each non-terminal
The body of that method decides how to walk through its children based on the rules of the language
Start by calling the procedure for the starting non-terminal
Algorithm ends when you have walked the entire tree
Never ends? Infinite loop.
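The basic idea above can be sketched in Java. This is a minimal illustration, not the course's implementation: the `TreeNode` class, the `WalkDemo` class, and the tiny grammar (a statement sequence made only of `id = const;` assignments) are all assumptions made for the example. Each non-terminal gets its own method, and `printStmtSeq` calls itself for the nested statement sequence.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical tree node for illustration: a non-terminal name, optional text, children.
class TreeNode {
    final String kind;
    final String text;
    final List<TreeNode> children;

    TreeNode(String kind, String text, TreeNode... children) {
        this.kind = kind;
        this.text = text;
        this.children = Arrays.asList(children);
    }
}

class WalkDemo {
    // <stmt-seq> ::= <assign> | <assign> <stmt-seq>
    // Print the first statement, then recurse into the rest if there is one.
    static String printStmtSeq(TreeNode n) {
        String first = printAssign(n.children.get(0)) + "\n";
        if (n.children.size() == 2) {
            return first + printStmtSeq(n.children.get(1));
        }
        return first;
    }

    // <assign>: child 0 is the identifier, child 1 is the constant.
    static String printAssign(TreeNode n) {
        return printLeaf(n.children.get(0)) + " = " + printLeaf(n.children.get(1)) + ";";
    }

    // Leaves (<id>, <const>) just return their text.
    static String printLeaf(TreeNode n) {
        return n.text;
    }

    // Tree for: Y = 20; Z = 5;
    static TreeNode sample() {
        TreeNode second = new TreeNode("stmt-seq", null,
            new TreeNode("assign", null, new TreeNode("id", "Y2"), new TreeNode("const", "5")));
        return new TreeNode("stmt-seq", null,
            new TreeNode("assign", null, new TreeNode("id", "Y"), new TreeNode("const", "20")),
            second);
    }

    public static void main(String[] args) {
        // Start by calling the procedure for the starting non-terminal.
        System.out.print(printStmtSeq(sample()));
    }
}
```

The walk ends because every recursive call moves to a strictly smaller subtree; a method that called itself on the same node would be the infinite loop mentioned above.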
Recursive Descent Example
void executeIf(??)
    bool b = evaluateCond(??)
    if (b) then
        executeSS(??)
    else
        executeSS(??)
[Figure: parse tree with root <if> and children <cond>, <stmt-seq>, <stmt-seq>]
Arrays to represent parse trees?
Each node in the tree → one row in the array. Each row has n columns:
1. Integer corresponding to the non-terminal for the node.
2. Integer corresponding to which alternative is used to expand that non-terminal.
3. Row numbers of the children used (columns 3 through n).
This is how we determine n above: the maximum number of children needed for our language + 2.
Disclaimer: Your instructor does not advocate the use of arrays for hand built parsers in the year 2016! But you should understand how this algorithm works.
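To make the table layout concrete, here is a small Java sketch of an if-then-else node stored as rows. It is an illustration under assumptions: the class name `ParseTableDemo`, the codes for `<cond>` and `<stmt-seq>`, and the use of -1 for "no child" are made up for the example (the slides only use 8 for `<if>`). Java arrays are 0-based, so column 0 here plays the role of column 1 in the pseudocode.

```java
class ParseTableDemo {
    // 8 for <if> matches the later parseIf pseudocode; the other codes are hypothetical.
    static final int IF = 8;
    static final int COND = 9;
    static final int STMT_SEQ = 10;

    static final int MAX_CHILDREN = 3;
    static final int COLS = MAX_CHILDREN + 2; // n = max children + 2
    static final int NONE = -1;               // assumed marker for "no child"

    // Row layout (0-based): [non-terminal, alternative, child1 row, child2 row, child3 row]
    static int[][] buildIfElse() {
        int[][] pt = new int[4][COLS];
        pt[0] = new int[] {IF, 2, 1, 2, 3};                // alternative 2: has an else part
        pt[1] = new int[] {COND, 1, NONE, NONE, NONE};     // children omitted in this sketch
        pt[2] = new int[] {STMT_SEQ, 1, NONE, NONE, NONE}; // "then" branch
        pt[3] = new int[] {STMT_SEQ, 1, NONE, NONE, NONE}; // "else" branch
        return pt;
    }

    // Row number of the k-th child (k is 1-based, as in the slides).
    static int childRow(int[][] pt, int row, int k) {
        return pt[row][2 + (k - 1)];
    }

    public static void main(String[] args) {
        int[][] pt = buildIfElse();
        System.out.println("else branch is stored in row " + childRow(pt, 0, 3));
    }
}
```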
Recursive Descent Example (revisited)
void executeIf(int n, int[][] pt)
    bool b = evaluateCond(pt[n,3], pt)
    if (b) then
        executeSS(pt[n,4], pt)
    else if (pt[n,2] == 2) then
        executeSS(pt[n,5], pt)
Note that this pseudocode may not be complete. Specifically, it lacks the error checking it would need in order to report errors.
Recursive Descent Example (revisited)
void printIf(int n, int[][] pt)
    print("if")
    printCond(pt[n,3], pt)
    print("then")
    printSS(pt[n,4], pt)
    if (pt[n,2] == 2) then
        print("else")
        printSS(pt[n,5], pt)
    print("end;")
Recursive Descent Example (again)
void printAssign(int n, int[][] pt)
    printId(pt[n,3], pt)
    print(" = ")
    printExp(pt[n,4], pt)
Recursive Descent Example (again)
void execAssign(int n, int[][] pt)
    int result = evalExp(pt[n,4], pt)
    assignIdVal(pt[n,3], result)
Recursive Descent Parsing
Parsing is harder
Instead of walking the tree, we are building it as we go
Same idea, one method for each non-terminal…
…except that now each method will write values to the table instead of reading from it
Calling a parse method will create an empty "node" in the tree by using the next free row in the table
Requires us to keep track of the rows being used
(Also requires us to have a big table or grow it dynamically)
Ignore this for now; there's a better approach we'll focus on once we have the idea down
Recursive Descent Parsing Example
void parseIf(int n, int[][] pt)
    pt[n,1] = 8
    String s = t.currentToken() // should be "if"
    t.nextToken() // consume the token
    pt[n,3] = currentRow++
    parseCond(pt[n,3], pt)
    pt[n,4] = currentRow++
    t.nextToken() // consume the "then" token
    parseSS(pt[n,4], pt)
    s = t.currentToken()
    if (s is "else") then
        t.nextToken() // consume the token
        pt[n,2] = 2 // indicate we're using the second expansion
        pt[n,5] = currentRow++
        parseSS(pt[n,5], pt)
    else
        pt[n,2] = 1 // indicate we're using the first expansion
    t.nextToken() // why do this?
    t.nextToken() // and this?
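The pseudocode consumes keyword tokens without checking them. One way the missing error checking might look is sketched below in Java. Everything here is an assumption made for illustration: `SimpleTokenizer` stands in for the tokenizer `t`, `expect` is a hypothetical helper, `<cond>` and `<stmt-seq>` are each reduced to a single placeholder token, and the statement is assumed to close with `end` and `;` as the printIf pseudocode suggests.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the tokenizer used in the slides.
class SimpleTokenizer {
    private final List<String> tokens;
    private int pos = 0;

    SimpleTokenizer(String... tokens) {
        this.tokens = Arrays.asList(tokens);
    }

    String currentToken() {
        return pos < tokens.size() ? tokens.get(pos) : "EOF";
    }

    void nextToken() {
        if (pos < tokens.size()) {
            pos++;
        }
    }
}

class IfParseDemo {
    // Check that the current token is the expected keyword, then consume it.
    static void expect(SimpleTokenizer t, String expected) {
        if (!t.currentToken().equals(expected)) {
            throw new IllegalStateException(
                "expected '" + expected + "' but found '" + t.currentToken() + "'");
        }
        t.nextToken();
    }

    // Placeholder for parseCond / parseSS: consumes exactly one token.
    static void parsePlaceholder(SimpleTokenizer t) {
        t.nextToken();
    }

    // Returns the alternative used: 1 = if-then, 2 = if-then-else.
    static int parseIf(SimpleTokenizer t) {
        expect(t, "if");
        parsePlaceholder(t);          // <cond>
        expect(t, "then");
        parsePlaceholder(t);          // <stmt-seq>
        int alternative = 1;
        if (t.currentToken().equals("else")) {
            t.nextToken();            // consume "else"
            parsePlaceholder(t);      // <stmt-seq>
            alternative = 2;
        }
        expect(t, "end");             // the two trailing tokens of the statement
        expect(t, ";");
        return alternative;
    }

    public static void main(String[] args) {
        SimpleTokenizer t = new SimpleTokenizer("if", "c", "then", "s", "else", "s", "end", ";");
        System.out.println("alternative " + parseIf(t));
    }
}
```

With a helper like `expect`, a missing keyword is reported at the point where it was expected instead of silently skipping whatever token happens to be there.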
Recursive Descent Parsing
Are you feeling good about this code?
As an algorithm it's fine, but as code it leaves a bit to be desired
The code suffers from a severe lack of abstraction
We're talking about trees but operating on a table
Why aren't we operating on a tree?
Let's talk about an approach that uses a bit more abstraction
Encapsulate the data into a parse tree class
Hide our operations a bit: let the parse tree class take care of the details while we focus on the bigger picture
A dip into object-oriented design
Parse Tree Class Design
Let's think about the interface
We're going to have a tree with a cursor – a means of moving
from node to node in the tree
For each node we need to store:
The non-terminal identity
The rule alternative used in expansion of this non-terminal
For the cursor we need to be able to:
Move it to each child (child 1, 2 and 3)
Move it back up to the parent node
We need to be able to check:
Is there a child?
Is there a parent (i.e. are we at the root node?)
Parse Tree Class Design – Interface1
interface ParseTree
    // To get the contents of the node
    int getIdentity()
    int getAlternative()
    // To get the number of children
    int getChildCount()
    // To find out if it is the root
    boolean hasParent()
    // To move the cursor
    void moveToChild(int index)
    void moveToParent()
Recursive Descent Example (ParseTree)
void printIf(ParseTree pt)
    print("if")
    pt.moveToChild(1)
    printCond(pt)
    print("then")
    pt.moveToParent()
    pt.moveToChild(2)
    printSS(pt)
    pt.moveToParent()
    if (pt.getAlternative() == 2) then
        print("else")
        pt.moveToChild(3)
        printSS(pt)
        pt.moveToParent() // set it back at the if node
    print("end;")
ParseTree Interface Design
For dealing with variable assignment we need some more operations
If we're at an <id> node we need the id name and value
Add a few more methods to the interface:
    // get the Id string if we are at an id node
    String getIdString()
    // set the id numeric value if we are at an id node
    void setIdValue(int value)
    // get the numeric value for an id at an id node
    int getIdValue()
Recursive Descent Example (ParseTree)
void execAssign(ParseTree pt)
    pt.moveToChild(2) // move to the expression to evaluate
    int value = execExpr(pt)
    pt.moveToParent()
    pt.moveToChild(1) // move to the ID node to store the value
    pt.setIdValue(value)
    pt.moveToParent() // restore our cursor to the top of the assign
ParseTree Interface Parsing
What about parsing?
For parsing we need to be able to:
Add nodes
Set the content of nodes
Need more operations to be able to do that:
    // add another child to the current node
    void addChild()
    // To set the contents of the node
    void setIdentity(int ident)
    void setAlternative(int alternative)
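One possible way to back the cursor interface with linked node objects is sketched below in Java. The class name `CursorParseTree`, the private `Node` helper, and the choice to throw exceptions on illegal moves are assumptions for illustration; the id-related methods are left out to keep the sketch short. Child indexes are 1-based to match the pseudocode.

```java
import java.util.ArrayList;
import java.util.List;

class CursorParseTree {
    // Hypothetical internal node: contents plus links to parent and children.
    private static class Node {
        int identity;
        int alternative;
        Node parent;
        final List<Node> children = new ArrayList<>();
    }

    private final Node root = new Node();
    private Node cursor = root; // the cursor starts at the root

    int getIdentity() { return cursor.identity; }
    int getAlternative() { return cursor.alternative; }
    int getChildCount() { return cursor.children.size(); }
    boolean hasParent() { return cursor.parent != null; }

    void setIdentity(int ident) { cursor.identity = ident; }
    void setAlternative(int alternative) { cursor.alternative = alternative; }

    // Add an empty child below the current node; the cursor does not move.
    void addChild() {
        Node child = new Node();
        child.parent = cursor;
        cursor.children.add(child);
    }

    // index is 1-based, as in the slides.
    void moveToChild(int index) {
        if (index < 1 || index > cursor.children.size()) {
            throw new IndexOutOfBoundsException("no child " + index);
        }
        cursor = cursor.children.get(index - 1);
    }

    void moveToParent() {
        if (cursor.parent == null) {
            throw new IllegalStateException("already at the root");
        }
        cursor = cursor.parent;
    }

    public static void main(String[] args) {
        CursorParseTree pt = new CursorParseTree();
        pt.setIdentity(7);     // 7 = assignment, as in the parseAssign pseudocode
        pt.setAlternative(1);
        pt.addChild();
        pt.addChild();
        pt.moveToChild(2);
        System.out.println("at child, hasParent = " + pt.hasParent());
        pt.moveToParent();
        System.out.println("back at node " + pt.getIdentity()
            + " with " + pt.getChildCount() + " children");
    }
}
```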
Recursive Descent Parsing Example (ParseTree)
void parseAssign(ParseTree pt)
    pt.setIdentity(7) // set it to an assignment
    pt.setAlternative(1) // use expansion 1
    pt.addChild()
    pt.addChild() // add two children for the assignment node
    pt.moveToChild(1)
    parseID(pt)
    t.nextToken() // why are we doing this?
    pt.moveToParent()
    pt.moveToChild(2)
    parseExpr(pt)
    t.nextToken() // why?
    pt.moveToParent() // why?
Recursive Descent Parsing
Okay, so we have a bit more abstraction
Still not great – we can do better
Let's make this all more object-oriented
Right now we're treating the whole tree like an object
Let's make each node an object instead
Make a separate class for each non-terminal
Build printing, parsing and executing logic into each non-terminal class
Build the available children into each class
Node Class Design – Interfaces
interface programNode
    void parseProgram(Tokenizer t)
    void printProgram()
    void execProgram()
interface ifNode
    void parseIf(Tokenizer t)
    void printIf()
    void execIf()
interface stmtNode
    void parseStmt(Tokenizer t)
    void printStmt()
    void execStmt()
Recursive Descent Example (ProgramNode)
public class ProgramNode:
    private:
        DeclSeqNode ds
        StmtSeqNode ss
    public:
        ProgramNode()
            this.ds = new DeclSeqNode()
            this.ss = new StmtSeqNode()
        void parseProgram(Tokenizer t)
            t.nextToken() // why?
            ds.parseDeclSeq(t)
            t.nextToken() // why?
            ss.parseStmtSeq(t)
            t.nextToken() // why?
        void printProgram()
            print("program")
            ds.printDeclSeq()
            print("begin")
            ss.printStmtSeq()
            print("end")
        void execProgram()
            ds.execDeclSeq()
            ss.execStmtSeq()
Recursive Descent Example (IfNode)
public class IfNode:
    private:
        CondNode condition
        StmtSeqNode thenSeq
        StmtSeqNode elseSeq
        int altNo
    public:
        IfNode()
            this.condition = new CondNode()
            this.thenSeq = new StmtSeqNode()
            this.elseSeq = null
            this.altNo = 1
        void parseIf(Tokenizer t)
            t.nextToken() // why?
            condition.parseCondition(t)
            t.nextToken() // why?
            thenSeq.parseStmtSeq(t)
            String token = t.currentToken()
            if (token is "else") then
                t.nextToken() // why?
                this.altNo = 2
                elseSeq = new StmtSeqNode()
                elseSeq.parseStmtSeq(t)
            t.nextToken()
            t.nextToken() // why?
Recursive Descent Example (IfNode continued)
        void printIf()
            print("if")
            condition.printCondition()
            print("then")
            thenSeq.printStmtSeq()
            if (altNo == 2) then
                print("else")
                elseSeq.printStmtSeq()
            print("end;")
        void execIf()
            bool c = condition.evalCondition()
            if (c) then
                thenSeq.execStmtSeq()
            else if (altNo == 2) then
                elseSeq.execStmtSeq()
Recursive Descent Example (StmtNode)
public class StmtNode:
    private:
        AssignNode assign
        IfNode ifNode
        LoopNode loop
        InputNode input
        OutputNode output
        int altNo
    public:
        StmtNode()
            this.assign = null
            this.ifNode = null
            this.loop = null
            this.input = null
            this.output = null
            this.altNo = 1
        void parseStmt(Tokenizer t)
            String tok = t.currentToken()
            if (tok is an id)
                assign = new AssignNode()
                altNo = 1
                assign.parseAssign(t)
            else if (tok is "if")
                ifNode = new IfNode()
                altNo = 2
                ifNode.parseIf(t)
            else if (tok is "loop")
                loop = new LoopNode()
                altNo = 3
                loop.parseLoop(t)
            …
Note: parseStmt is incomplete.
Recursive Descent Example (StmtNode continued)
        void printStmt()
            if (altNo == 1)
                assign.printAssign()
            …
        void execStmt()
            if (altNo == 1)
                assign.execAssign()
            …
Note: neither function here is complete.
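The way parseStmt picks an alternative from a single token of lookahead can be isolated into a small Java sketch. This is only an illustration: `StmtDispatchDemo` and `isId` are hypothetical names, and the identifier pattern (an uppercase letter followed by uppercase letters or digits) is an assumption, not something the slides define. Only the three alternatives shown in the pseudocode are handled; the elided cases are reported as errors here rather than guessed.

```java
class StmtDispatchDemo {
    // Alternative numbers follow the parseStmt pseudocode: 1 = assign, 2 = if, 3 = loop.
    static int chooseAlternative(String tok) {
        if (tok.equals("if")) {
            return 2;
        }
        if (tok.equals("loop")) {
            return 3;
        }
        if (isId(tok)) {
            return 1;
        }
        // The remaining alternatives are elided in the slides, so this sketch
        // treats any other starting token as an error.
        throw new IllegalArgumentException("unexpected token at start of statement: " + tok);
    }

    // Assumed identifier shape: uppercase letter, then uppercase letters or digits.
    static boolean isId(String tok) {
        return tok.matches("[A-Z][A-Z0-9]*");
    }

    public static void main(String[] args) {
        System.out.println(chooseAlternative("Y"));    // assignment
        System.out.println(chooseAlternative("if"));   // if statement
        System.out.println(chooseAlternative("loop")); // loop statement
    }
}
```

Because the first token alone decides which rule to expand, no backtracking is needed; this is what makes the parser predictive.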
Identifier and Assign Nodes
In this approach, we need to consider Identifier and Assign nodes a bit differently
Need to make sure that each Identifier is only created once
Later uses of the same Id should refer to the same Identifier object
Need to make sure that assignment works properly
Recall the symbol table from our earlier discussion? Need something to replace that
Recursive Descent Example (IdNode)
public class IdNode:
    private:
        String name
        int value
        bool initialized
        static Map<String, IdNode> symTab
        IdNode(String n)
            this.name = n
            this.initialized = false
    public:
        static IdNode parseId(Tokenizer t)
            String tok = t.currentToken()
            t.nextToken()
            if (tok not in symTab)
                IdNode node = new IdNode(tok)
                symTab[tok] = node
            return symTab[tok]
        void setValue(int v)
            this.value = v
            this.initialized = true
        int getValue()
            return this.value
        String getName()
            return this.name
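The "create each identifier only once" idea can be sketched in Java with a static map and a private constructor. This is an illustrative variant, not the course's code: the `lookup` method takes the name directly instead of a tokenizer, and the check in `getValue` is an added assumption showing one use for the `initialized` flag.

```java
import java.util.HashMap;
import java.util.Map;

class IdNode {
    // Shared symbol table: one IdNode per distinct name.
    private static final Map<String, IdNode> symTab = new HashMap<>();

    private final String name;
    private int value;
    private boolean initialized = false;

    // Private so that every IdNode must come from lookup().
    private IdNode(String name) {
        this.name = name;
    }

    // Return the existing node for this name, creating it on first use.
    static IdNode lookup(String name) {
        return symTab.computeIfAbsent(name, IdNode::new);
    }

    void setValue(int v) {
        this.value = v;
        this.initialized = true;
    }

    // Assumed behavior: reading an identifier before assigning it is an error.
    int getValue() {
        if (!initialized) {
            throw new IllegalStateException("identifier " + name + " has no value yet");
        }
        return value;
    }

    String getName() {
        return name;
    }

    public static void main(String[] args) {
        IdNode.lookup("Y").setValue(20);
        // A later use of "Y" refers to the same object, so it sees the value.
        System.out.println(IdNode.lookup("Y").getValue());
    }
}
```

Because every use of a name resolves to the same object, an assignment made through one node is visible to every later read, which is what lets the map replace a separate symbol table.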
Recursive Descent Example (AssignNode)
public class AssignNode:
    private:
        IdNode id
        ExprNode expr
    public:
        AssignNode()
            this.id = null
            this.expr = new ExprNode()
        void parseAssign(Tokenizer t)
            id = IdNode.parseId(t)
            t.nextToken() // why?
            this.expr.parseExpr(t)
            t.nextToken()
        void printAssign()
            print(id.getName())
            print("=")
            this.expr.printExpr()
            print(";")
        void execAssign()
            int value = this.expr.evalExpr()
            id.setValue(value)