Compiler Design 4. Language Grammars Notes | EduRev

: Compiler Design 4. Language Grammars Notes | EduRev

 Page 1


Compiler Design 
 
4. Language Grammars 
 
Kanat Bolazar 
January 28, 2010 
Page 2


Compiler Design 
 
4. Language Grammars 
 
Kanat Bolazar 
January 28, 2010 
Introduction to Parsing: Language Grammars 
• Programming language grammars are usually written as 
some variation of Context Free Grammars (CFG)s 
• Notation used is often BNF (Backus-Naur form): 
 
<block>         -> { <statementlist> } 
<statementlist> -> <statement> ; <statementlist> 
<statement>     -> <assignment> ; 
| if ( <expr> ) <block> else <block> 
| while ( <expr> ) <block> 
... 
Page 3


Compiler Design 
 
4. Language Grammars 
 
Kanat Bolazar 
January 28, 2010 
Introduction to Parsing: Language Grammars 
• Programming language grammars are usually written as 
some variation of Context Free Grammars (CFG)s 
• Notation used is often BNF (Backus-Naur form): 
 
<block>         -> { <statementlist> } 
<statementlist> -> <statement> ; <statementlist> 
<statement>     -> <assignment> ; 
| if ( <expr> ) <block> else <block> 
| while ( <expr> ) <block> 
... 
Example Grammar: Language 0+0 
• A language that we'll call "Language 0+0": 
E -> E + E | 0 
• Equivalently: 
E -> E + E 
E -> 0 
• Note that if there are multiple rules for the same left hand 
side, they are alternatives. 
• This language only contains sentences of the form: 
0     0+0     0+0+0     0+0+0+0   ... 
• Derivation for 0+0+0: 
E  ->  E + E  ->  E + E + E  -> 0 + 0 + 0 
• Note: This language is ambiguous: In the second step, did 
we expand the first or the second E to E + E? Both paths 
k 
 
Page 4


Compiler Design 
 
4. Language Grammars 
 
Kanat Bolazar 
January 28, 2010 
Introduction to Parsing: Language Grammars 
• Programming language grammars are usually written as 
some variation of Context Free Grammars (CFG)s 
• Notation used is often BNF (Backus-Naur form): 
 
<block>         -> { <statementlist> } 
<statementlist> -> <statement> ; <statementlist> 
<statement>     -> <assignment> ; 
| if ( <expr> ) <block> else <block> 
| while ( <expr> ) <block> 
... 
Example Grammar: Language 0+0 
• A language that we'll call "Language 0+0": 
E -> E + E | 0 
• Equivalently: 
E -> E + E 
E -> 0 
• Note that if there are multiple rules for the same left hand 
side, they are alternatives. 
• This language only contains sentences of the form: 
0     0+0     0+0+0     0+0+0+0   ... 
• Derivation for 0+0+0: 
E  ->  E + E  ->  E + E + E  -> 0 + 0 + 0 
• Note: This language is ambiguous: In the second step, did 
we expand the first or the second E to E + E? Both paths 
k 
 
Example Grammar: Arithmetic, Ambiguous 
• Arithmetic expressions: 
Exp -> num | Exp Operator Exp 
Op -> + | - | * | / | % 
• The "num" here represents a token. What it corresponds to 
is defined in the lexical analyzer with a regular expression: 
num        [0-9]+ 
• This langugage allows: 
45          35 + 257 * 5 -  2      ... 
• This language as defined here is ambiguous: 
2 + 5 * 7     Exp * 7    or    2 + Exp   ? 
• Depending on the tools you use, you may be able to just 
define precedence of operators, or may have to change the 
grammar. 
 
Page 5


Compiler Design 
 
4. Language Grammars 
 
Kanat Bolazar 
January 28, 2010 
Introduction to Parsing: Language Grammars 
• Programming language grammars are usually written as 
some variation of Context Free Grammars (CFG)s 
• Notation used is often BNF (Backus-Naur form): 
 
<block>         -> { <statementlist> } 
<statementlist> -> <statement> ; <statementlist> 
<statement>     -> <assignment> ; 
| if ( <expr> ) <block> else <block> 
| while ( <expr> ) <block> 
... 
Example Grammar: Language 0+0 
• A language that we'll call "Language 0+0": 
E -> E + E | 0 
• Equivalently: 
E -> E + E 
E -> 0 
• Note that if there are multiple rules for the same left hand 
side, they are alternatives. 
• This language only contains sentences of the form: 
0     0+0     0+0+0     0+0+0+0   ... 
• Derivation for 0+0+0: 
E  ->  E + E  ->  E + E + E  -> 0 + 0 + 0 
• Note: This language is ambiguous: In the second step, did 
we expand the first or the second E to E + E? Both paths 
k 
 
Example Grammar: Arithmetic, Ambiguous 
• Arithmetic expressions: 
Exp -> num | Exp Operator Exp 
Op -> + | - | * | / | % 
• The "num" here represents a token. What it corresponds to 
is defined in the lexical analyzer with a regular expression: 
num        [0-9]+ 
• This langugage allows: 
45          35 + 257 * 5 -  2      ... 
• This language as defined here is ambiguous: 
2 + 5 * 7     Exp * 7    or    2 + Exp   ? 
• Depending on the tools you use, you may be able to just 
define precedence of operators, or may have to change the 
grammar. 
 
Example Language: Arithmetic, Factored 
• Arithmetic expressions grammar, factored for operator 
precedence: 
Exp    -> Factor | Factor Addop Exp 
Factor -> num | num Multop Factor 
Addop  -> + | - 
Multop -> * | / | % 
• This langugage also allows the same sentences: 
45          35 + 257 * 5 -  2      ... 
• This language is not ambiguous; it first groups factors: 
2 + 5 * 7 
Factor Addop Exp 
num + Exp 
num + Factor 
num + num Multop Factor 
num + num * num 
 
Read More
Offer running on EduRev: Apply code STAYHOME200 to get INR 200 off on our premium plan EduRev Infinity!