Page 1
Bottom-up Parsing
Page 2
Bottom-up Parsing
Recap of Top-down Parsing
Top-down parsers build syntax tree from root to leaves
Left-recursion causes non-termination in top-down parsers
— Transformation to eliminate left recursion
— Transformation to eliminate common prefixes in right recursion
FIRST, FIRST
+
, & FOLLOW sets + LL(1) condition
— LL(1) uses left-to-right scan of the input, leftmost derivation
of the sentence, and 1 word lookahead
— LL(1) condition means grammar works for predictive parsing
Given an LL(1) grammar, we can
— Build a recursive descent parser
— Build a table-driven LL(1) parser
LL(1) parser doesn’t build the parse tree
— Keeps lower fringe of partially complete tree on the stack
Comp 412, Fall 2010 1
Page 3
Bottom-up Parsing
Recap of Top-down Parsing
Top-down parsers build syntax tree from root to leaves
Left-recursion causes non-termination in top-down parsers
— Transformation to eliminate left recursion
— Transformation to eliminate common prefixes in right recursion
FIRST, FIRST
+
, & FOLLOW sets + LL(1) condition
— LL(1) uses left-to-right scan of the input, leftmost derivation
of the sentence, and 1 word lookahead
— LL(1) condition means grammar works for predictive parsing
Given an LL(1) grammar, we can
— Build a recursive descent parser
— Build a table-driven LL(1) parser
LL(1) parser doesn’t build the parse tree
— Keeps lower fringe of partially complete tree on the stack
Comp 412, Fall 2010 1 Comp 412, Fall 2010 2
Parsing Techniques
Top-down parsers (LL(1), recursive descent)
Start at the root of the parse tree and grow toward leaves
Pick a production & try to match the input
Bad “pick” Þ may need to backtrack
Some grammars are backtrack-free (predictive parsing)
Bottom-up parsers (LR(1), operator precedence)
Start at the leaves and grow toward root
As input is consumed, encode possibilities in an internal state
Start in a state valid for legal first tokens
Bottom-up parsers handle a large class of grammars
Page 4
Bottom-up Parsing
Recap of Top-down Parsing
Top-down parsers build syntax tree from root to leaves
Left-recursion causes non-termination in top-down parsers
— Transformation to eliminate left recursion
— Transformation to eliminate common prefixes in right recursion
FIRST, FIRST
+
, & FOLLOW sets + LL(1) condition
— LL(1) uses left-to-right scan of the input, leftmost derivation
of the sentence, and 1 word lookahead
— LL(1) condition means grammar works for predictive parsing
Given an LL(1) grammar, we can
— Build a recursive descent parser
— Build a table-driven LL(1) parser
LL(1) parser doesn’t build the parse tree
— Keeps lower fringe of partially complete tree on the stack
Comp 412, Fall 2010 1 Comp 412, Fall 2010 2
Parsing Techniques
Top-down parsers (LL(1), recursive descent)
Start at the root of the parse tree and grow toward leaves
Pick a production & try to match the input
Bad “pick” Þ may need to backtrack
Some grammars are backtrack-free (predictive parsing)
Bottom-up parsers (LR(1), operator precedence)
Start at the leaves and grow toward root
As input is consumed, encode possibilities in an internal state
Start in a state valid for legal first tokens
Bottom-up parsers handle a large class of grammars
Comp 412, Fall 2010 3
Bottom-up Parsing (definitions)
The point of parsing is to construct a derivation
A derivation consists of a series of rewrite steps
S Þ g
0
Þ g
1
Þ g
2
Þ … Þ g
n–1
Þ g
n
Þ sentence
Each g
i
is a sentential form
— If g contains only terminal symbols, g is a sentence in L(G)
— If g contains 1 or more non-terminals, g is a sentential form
To get g
i
from g
i–1
, expand some NT A Î g
i–1
by using A ®b
— Replace the occurrence of A Î g
i–1
with b to get g
i
— In a leftmost derivation, it would be the first NT A Î g
i–1
A left-sentential form occurs in a leftmost derivation
A right-sentential form occurs in a rightmost derivation
Bottom-up parsers build a rightmost derivation in reverse
We saw this definition earlier
Page 5
Bottom-up Parsing
Recap of Top-down Parsing
Top-down parsers build syntax tree from root to leaves
Left-recursion causes non-termination in top-down parsers
— Transformation to eliminate left recursion
— Transformation to eliminate common prefixes in right recursion
FIRST, FIRST
+
, & FOLLOW sets + LL(1) condition
— LL(1) uses left-to-right scan of the input, leftmost derivation
of the sentence, and 1 word lookahead
— LL(1) condition means grammar works for predictive parsing
Given an LL(1) grammar, we can
— Build a recursive descent parser
— Build a table-driven LL(1) parser
LL(1) parser doesn’t build the parse tree
— Keeps lower fringe of partially complete tree on the stack
Comp 412, Fall 2010 1 Comp 412, Fall 2010 2
Parsing Techniques
Top-down parsers (LL(1), recursive descent)
Start at the root of the parse tree and grow toward leaves
Pick a production & try to match the input
Bad “pick” Þ may need to backtrack
Some grammars are backtrack-free (predictive parsing)
Bottom-up parsers (LR(1), operator precedence)
Start at the leaves and grow toward root
As input is consumed, encode possibilities in an internal state
Start in a state valid for legal first tokens
Bottom-up parsers handle a large class of grammars
Comp 412, Fall 2010 3
Bottom-up Parsing (definitions)
The point of parsing is to construct a derivation
A derivation consists of a series of rewrite steps
S Þ g
0
Þ g
1
Þ g
2
Þ … Þ g
n–1
Þ g
n
Þ sentence
Each g
i
is a sentential form
— If g contains only terminal symbols, g is a sentence in L(G)
— If g contains 1 or more non-terminals, g is a sentential form
To get g
i
from g
i–1
, expand some NT A Î g
i–1
by using A ®b
— Replace the occurrence of A Î g
i–1
with b to get g
i
— In a leftmost derivation, it would be the first NT A Î g
i–1
A left-sentential form occurs in a leftmost derivation
A right-sentential form occurs in a rightmost derivation
Bottom-up parsers build a rightmost derivation in reverse
We saw this definition earlier
Comp 412, Fall 2010 4
Bottom-up Parsing (definitions)
A bottom-up parser builds a derivation by working from
the input sentence back toward the start symbol S
S Þ g
0
Þ g
1
Þ g
2
Þ … Þ g
n–1
Þ g
n
Þ sentence
To reduce g
i
to g
i–1
match some rhs b against g
i
then replace b
with its corresponding lhs, A. (assuming the production A ® b)
In terms of the parse tree, it works from leaves to root
Nodes with no parent in a partial tree form its upper fringe
Since each replacement of b with A shrinks the upper fringe,
we call it a reduction.
“Rightmost derivation in reverse” processes words left to right
The parse tree need not be built, it can be simulated
|parse tree nodes | = |terminal symbols | + |reductions |
bottom-up
Read More