Download, print and study this document offline |
Page 1 Top-Down Parsing – 1 Compiler Design – © Muhammed Mudawwar Top-Down Parsing v A parser is top-down if it discovers a parse tree top to bottom A top-down parse corresponds to a preorder traversal of the parse tree A leftmost derivation is applied at each derivation step v Top-down parsers come in two forms Predictive Parsers ² Predict the production rule to be applied using lookahead tokens Backtracking Parsers ² Will try different productions, backing up when a parse fails v Predictive parsers are much faster than backtracking ones Predictive parsers operate in linear time – will be our focus Backtracking parsers operate in exponential time – will not be considered v Two kinds of top-down parsing techniques will be studied Recursive-descent parsing LL parsing Page 2 Top-Down Parsing – 1 Compiler Design – © Muhammed Mudawwar Top-Down Parsing v A parser is top-down if it discovers a parse tree top to bottom A top-down parse corresponds to a preorder traversal of the parse tree A leftmost derivation is applied at each derivation step v Top-down parsers come in two forms Predictive Parsers ² Predict the production rule to be applied using lookahead tokens Backtracking Parsers ² Will try different productions, backing up when a parse fails v Predictive parsers are much faster than backtracking ones Predictive parsers operate in linear time – will be our focus Backtracking parsers operate in exponential time – will not be considered v Two kinds of top-down parsing techniques will be studied Recursive-descent parsing LL parsing Top-Down Parsing – 2 Compiler Design – © Muhammed Mudawwar Top-Down Parsing by Recursive-Descent v We view a nonterminal A as a definition of a procedure A Procedure A will match the token sequence generated by nonterminal A v The RHS of a production of A specifies the code for procedure A Terminals are matched against input tokens Nonterminals produce calls to corresponding procedures v If multiple production rules exist for a nonterminal A One of them is predicted based on a lookahead token ² The lookahead token is the next input token that should be matched The predicted rule is the only one that applies ² Other rules will NOT be tried ² This is a predictive parsing technique, not a backtracking one v A syntax error is detected when Next token in the input sequence does NOT match the expected token Page 3 Top-Down Parsing – 1 Compiler Design – © Muhammed Mudawwar Top-Down Parsing v A parser is top-down if it discovers a parse tree top to bottom A top-down parse corresponds to a preorder traversal of the parse tree A leftmost derivation is applied at each derivation step v Top-down parsers come in two forms Predictive Parsers ² Predict the production rule to be applied using lookahead tokens Backtracking Parsers ² Will try different productions, backing up when a parse fails v Predictive parsers are much faster than backtracking ones Predictive parsers operate in linear time – will be our focus Backtracking parsers operate in exponential time – will not be considered v Two kinds of top-down parsing techniques will be studied Recursive-descent parsing LL parsing Top-Down Parsing – 2 Compiler Design – © Muhammed Mudawwar Top-Down Parsing by Recursive-Descent v We view a nonterminal A as a definition of a procedure A Procedure A will match the token sequence generated by nonterminal A v The RHS of a production of A specifies the code for procedure A Terminals are matched against input tokens Nonterminals produce calls to corresponding procedures v If multiple production rules exist for a nonterminal A One of them is predicted based on a lookahead token ² The lookahead token is the next input token that should be matched The predicted rule is the only one that applies ² Other rules will NOT be tried ² This is a predictive parsing technique, not a backtracking one v A syntax error is detected when Next token in the input sequence does NOT match the expected token Top-Down Parsing – 3 Compiler Design – © Muhammed Mudawwar Example on Recursive-Descent Parsing v Consider the following grammar for expressions in EBNF notation expr ? term { addop term } term ? factor { mulop factor } factor ? ‘(’ expr ‘)’ | id | num v Since three nonterminals exist, we need three parsing procedures The curly brackets expressing repetition is translated into a while loop The vertical bar expressing alternation is translated into a case statement procedure expr( ) begin term( ); while token = ADDOP do match(ADDOP); term( ); end while; end expr; procedure term( ) begin factor( ); while token = MULOP do match(MULOP); factor( ); end while; end term; procedure factor( ) begin case token of ‘(’: match(‘(’); expr( ); match(‘)’); ID: match(ID); NUM: match(NUM); else syntax_error(token); end case; end factor; Page 4 Top-Down Parsing – 1 Compiler Design – © Muhammed Mudawwar Top-Down Parsing v A parser is top-down if it discovers a parse tree top to bottom A top-down parse corresponds to a preorder traversal of the parse tree A leftmost derivation is applied at each derivation step v Top-down parsers come in two forms Predictive Parsers ² Predict the production rule to be applied using lookahead tokens Backtracking Parsers ² Will try different productions, backing up when a parse fails v Predictive parsers are much faster than backtracking ones Predictive parsers operate in linear time – will be our focus Backtracking parsers operate in exponential time – will not be considered v Two kinds of top-down parsing techniques will be studied Recursive-descent parsing LL parsing Top-Down Parsing – 2 Compiler Design – © Muhammed Mudawwar Top-Down Parsing by Recursive-Descent v We view a nonterminal A as a definition of a procedure A Procedure A will match the token sequence generated by nonterminal A v The RHS of a production of A specifies the code for procedure A Terminals are matched against input tokens Nonterminals produce calls to corresponding procedures v If multiple production rules exist for a nonterminal A One of them is predicted based on a lookahead token ² The lookahead token is the next input token that should be matched The predicted rule is the only one that applies ² Other rules will NOT be tried ² This is a predictive parsing technique, not a backtracking one v A syntax error is detected when Next token in the input sequence does NOT match the expected token Top-Down Parsing – 3 Compiler Design – © Muhammed Mudawwar Example on Recursive-Descent Parsing v Consider the following grammar for expressions in EBNF notation expr ? term { addop term } term ? factor { mulop factor } factor ? ‘(’ expr ‘)’ | id | num v Since three nonterminals exist, we need three parsing procedures The curly brackets expressing repetition is translated into a while loop The vertical bar expressing alternation is translated into a case statement procedure expr( ) begin term( ); while token = ADDOP do match(ADDOP); term( ); end while; end expr; procedure term( ) begin factor( ); while token = MULOP do match(MULOP); factor( ); end while; end term; procedure factor( ) begin case token of ‘(’: match(‘(’); expr( ); match(‘)’); ID: match(ID); NUM: match(NUM); else syntax_error(token); end case; end factor; Top-Down Parsing – 4 Compiler Design – © Muhammed Mudawwar Lookahead Token and Match Procedure v The recursive-descent procedures use a token variable v The token variable is the lookahead token Keeps track of the next token in the input sequence Is initialized to the first token before parsing begins Is updated after every call to the match procedure v The match procedure matches lookahead token with its parameter Is called to match an expected token on the RHS of a production Match succeeds if expected token = lookahead token and fails otherwise Match calls scanner function to update the lookahead token procedure match ( ExpectedToken ) begin if token = ExpectedToken then token := scan( ) ; else syntax_error( token , ExpectedToken ) ; end if; end match ; Page 5 Top-Down Parsing – 1 Compiler Design – © Muhammed Mudawwar Top-Down Parsing v A parser is top-down if it discovers a parse tree top to bottom A top-down parse corresponds to a preorder traversal of the parse tree A leftmost derivation is applied at each derivation step v Top-down parsers come in two forms Predictive Parsers ² Predict the production rule to be applied using lookahead tokens Backtracking Parsers ² Will try different productions, backing up when a parse fails v Predictive parsers are much faster than backtracking ones Predictive parsers operate in linear time – will be our focus Backtracking parsers operate in exponential time – will not be considered v Two kinds of top-down parsing techniques will be studied Recursive-descent parsing LL parsing Top-Down Parsing – 2 Compiler Design – © Muhammed Mudawwar Top-Down Parsing by Recursive-Descent v We view a nonterminal A as a definition of a procedure A Procedure A will match the token sequence generated by nonterminal A v The RHS of a production of A specifies the code for procedure A Terminals are matched against input tokens Nonterminals produce calls to corresponding procedures v If multiple production rules exist for a nonterminal A One of them is predicted based on a lookahead token ² The lookahead token is the next input token that should be matched The predicted rule is the only one that applies ² Other rules will NOT be tried ² This is a predictive parsing technique, not a backtracking one v A syntax error is detected when Next token in the input sequence does NOT match the expected token Top-Down Parsing – 3 Compiler Design – © Muhammed Mudawwar Example on Recursive-Descent Parsing v Consider the following grammar for expressions in EBNF notation expr ? term { addop term } term ? factor { mulop factor } factor ? ‘(’ expr ‘)’ | id | num v Since three nonterminals exist, we need three parsing procedures The curly brackets expressing repetition is translated into a while loop The vertical bar expressing alternation is translated into a case statement procedure expr( ) begin term( ); while token = ADDOP do match(ADDOP); term( ); end while; end expr; procedure term( ) begin factor( ); while token = MULOP do match(MULOP); factor( ); end while; end term; procedure factor( ) begin case token of ‘(’: match(‘(’); expr( ); match(‘)’); ID: match(ID); NUM: match(NUM); else syntax_error(token); end case; end factor; Top-Down Parsing – 4 Compiler Design – © Muhammed Mudawwar Lookahead Token and Match Procedure v The recursive-descent procedures use a token variable v The token variable is the lookahead token Keeps track of the next token in the input sequence Is initialized to the first token before parsing begins Is updated after every call to the match procedure v The match procedure matches lookahead token with its parameter Is called to match an expected token on the RHS of a production Match succeeds if expected token = lookahead token and fails otherwise Match calls scanner function to update the lookahead token procedure match ( ExpectedToken ) begin if token = ExpectedToken then token := scan( ) ; else syntax_error( token , ExpectedToken ) ; end if; end match ; Top-Down Parsing – 5 Compiler Design – © Muhammed Mudawwar Syntax Tree Construction for Expressions v A recursive-descent parser can be used to construct a syntax tree SyntaxTree := expr ( ) ; Calling parser function for start symbol v Parsing functions allocate and return pointers to syntax tree nodes v Construction of a syntax tree for simple expressions is given below New node allocates a tree node and returns a pointer to it function expr( ) : TreePtr begin left := term( ); while token = ADDOP do op := ADDOP.op ; match(ADDOP); right := term( ); left := new node(op, left, right); end while; return left; end expr; function term( ) : TreePtr begin left := factor( ); while token = MULOP do op := MULOP.op; match(MULOP); right := factor( ); left := new node(op, left, right); end while; return left; end term;Read More
26 videos|66 docs|30 tests
|
1. What is top-down parsing in computer science engineering? |
2. How does top-down parsing work in computer science engineering? |
3. What are the advantages of top-down parsing in computer science engineering? |
4. What are the limitations of top-down parsing in computer science engineering? |
5. What is the difference between top-down parsing and bottom-up parsing in computer science engineering? |
|
Explore Courses for Computer Science Engineering (CSE) exam
|