Table of contents
Introduction to Lexical Analysis
Token Definition
Lexical Analyzer in Action
Representation of Tokens
A lexical token is a sequence of characters treated as a unit in the grammar of programming languages.
Tokens: These include keywords, identifiers, numeric constants, operators, and separators. Token types are often given names such as id, number, and real, and reserved words such as if, void, and return are tokens built from alphabetic characters.
Keywords: for, while, if, etc.
Identifiers: variable names, function names, etc.
Operators: '+', '++', '-', etc.
Separators: ',', ';', etc.
Non-tokens: These include comments, preprocessor directives, macros, blanks, tabs, newlines, etc.
Lexeme: A lexeme is the actual sequence of input characters that matches a token (e.g., "float", "abs_zero_Kelvin", "=", "-", "273", ";").
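The relationship between a lexeme and its token class can be sketched in C. This is only an illustrative sketch: the enum TokenClass, the function classify_lexeme, and the small keyword, operator, and separator tables are hypothetical names chosen here, and the tables are deliberately incomplete.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Token classes named in the notes; the enum names are illustrative. */
typedef enum {
    TOK_KEYWORD, TOK_IDENTIFIER, TOK_NUMBER,
    TOK_OPERATOR, TOK_SEPARATOR, TOK_UNKNOWN
} TokenClass;

/* Return 1 if s equals one of the n strings in table, else 0. */
static int in_table(const char *s, const char *const table[], size_t n) {
    for (size_t i = 0; i < n; i++)
        if (strcmp(s, table[i]) == 0) return 1;
    return 0;
}

/* Classify one lexeme that has already been isolated from the input. */
TokenClass classify_lexeme(const char *lexeme) {
    /* Assumed, incomplete tables; a real language defines many more entries. */
    static const char *const keywords[] = {"for", "while", "if", "int", "float", "void", "return"};
    static const char *const operators[] = {"+", "++", "-", "="};
    static const char *const separators[] = {",", ";"};
    const char *p = lexeme;

    assert(lexeme != NULL);
    if (in_table(lexeme, keywords, sizeof keywords / sizeof keywords[0])) return TOK_KEYWORD;
    if (in_table(lexeme, operators, sizeof operators / sizeof operators[0])) return TOK_OPERATOR;
    if (in_table(lexeme, separators, sizeof separators / sizeof separators[0])) return TOK_SEPARATOR;

    if (isalpha((unsigned char)*p) || *p == '_') {
        /* Identifier: a letter or underscore, then letters, digits, underscores. */
        while (isalnum((unsigned char)*p) || *p == '_') p++;
        return *p == '\0' ? TOK_IDENTIFIER : TOK_UNKNOWN;
    }
    if (isdigit((unsigned char)*p)) {
        /* Number: digits only in this sketch (no real constants). */
        while (isdigit((unsigned char)*p)) p++;
        return *p == '\0' ? TOK_NUMBER : TOK_UNKNOWN;
    }
    return TOK_UNKNOWN;
}
```

With these assumed tables, the lexeme "float" is classified as a keyword, "abs_zero_Kelvin" as an identifier, "273" as a number, "=" and "-" as operators, and ";" as a separator.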
For example, consider the program
int main()
{
// 2 variables
int a, b;
a = 10;
return 0;
}
All the valid tokens are:
'int' 'main' '(' ')' '{' 'int' 'a' ',' 'b' ';'
'a' '=' '10' ';' 'return' '0' ';' '}'
The comment // 2 variables is a non-token, so it does not appear in this list.
As another example, a statement such as printf("hello"); contains 5 valid tokens: printf, (, "hello", ), and ;.
Exercise 1:
Count the number of tokens:
int main()
{
int a = 10, b = 20;
printf("sum is :%d",a+b);
return 0;
}
Answer: Total number of tokens: 27.
Exercise 2:
Count the number of tokens: int max(int i);
Answer: The lexical analyzer first reads int, finds it valid, and accepts it as a token. It then reads max and finds it to be a valid function name. Next it reads (, then int as another token, then i, then ), and finally ;.
Hence, the total number of tokens is 7:
int, max, (, int, i, ), ;