Input Buffer & Lexical Analyzer Generator Computer Science Engineering (CSE) Notes | EduRev

The document Input Buffer & Lexical Analyzer Generator Computer Science Engineering (CSE) Notes | EduRev is a part of the Computer Science Engineering (CSE) Course Compiler Design.

Lexical Analyzer Generator:

Creating a lexical analyzer with Lex:

  • First, a specification of a lexical analyzer is prepared by writing a program lex.l in the Lex language. Then lex.l is run through the Lex compiler to produce a C program, lex.yy.c.
  • Finally, lex.yy.c is run through the C compiler to produce an object program a.out, which is the lexical analyzer that transforms an input stream into a sequence of tokens.


Lex Specification

A Lex program consists of three parts:

{ definitions }
%%
{ rules }
%%
{ user subroutines }

o Definitions include declarations of variables, constants, and regular definitions.
o Rules are statements of the form p1 {action1} p2 {action2} … pn {actionn}, where each pi is a regular expression and each actioni describes what the lexical analyzer should do when pattern pi matches a lexeme. Actions are written in C code.
o User subroutines are auxiliary procedures needed by the actions. These can be compiled separately and loaded with the lexical analyzer.
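
To make the pattern/action idea concrete, here is a minimal hand-written sketch in C of what a few rules amount to. It is not the code Lex generates (Lex builds a table-driven automaton from the regular expressions); the token codes and the function name next_token are hypothetical and chosen only for illustration. Each branch is annotated with the rule it stands in for.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token codes; these names are illustrative, not part of Lex. */
enum token { TOK_EOF, TOK_ID, TOK_NUMBER, TOK_OTHER };

/* Scan one token starting at *src, copy its lexeme into buf (capacity cap,
   truncating if needed), advance *src past it, and return the token code. */
enum token next_token(const char **src, char *buf, size_t cap)
{
    const char *p = *src;
    size_t n = 0;
    enum token t;

    /* rule: [ \t\n]+   { skip white space, no token } */
    while (isspace((unsigned char)*p))
        p++;

    if (*p == '\0') {
        t = TOK_EOF;
    } else if (isalpha((unsigned char)*p)) {
        /* rule: [A-Za-z][A-Za-z0-9]*   { return ID; } */
        while (isalnum((unsigned char)*p)) {
            if (n + 1 < cap) buf[n++] = *p;
            p++;
        }
        t = TOK_ID;
    } else if (isdigit((unsigned char)*p)) {
        /* rule: [0-9]+   { return NUMBER; } */
        while (isdigit((unsigned char)*p)) {
            if (n + 1 < cap) buf[n++] = *p;
            p++;
        }
        t = TOK_NUMBER;
    } else {
        /* rule: .   { return OTHER; } (any single remaining character) */
        if (n + 1 < cap) buf[n++] = *p;
        p++;
        t = TOK_OTHER;
    }

    if (cap > 0)
        buf[n] = '\0';   /* the matched lexeme, as a C string */
    *src = p;
    return t;
}
```

Calling next_token repeatedly on a string such as "count 42 +" would yield an identifier, a number, and a single-character token before reporting end of input. Each inner loop consumes as many characters as its pattern allows, which mirrors the longest-match behaviour of Lex rules.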

Input Buffering:

The lexical analyzer (LA) scans the characters of the source program one at a time to discover tokens. Because scanning characters can consume a large amount of time, specialized buffering techniques have been developed to reduce the overhead of processing each input character.
 

Buffering techniques:
1. Buffer pairs
2. Sentinels
 

The lexical analyzer scans the characters of the source program one at a time to discover tokens. Often, however, many characters beyond the next token may have to be examined before the next token itself can be determined. For this and other reasons, it is desirable for the lexical analyzer to read its input from an input buffer.

The figure shows a buffer divided into two halves of, say, 100 characters each. One pointer marks the beginning of the token being discovered. A lookahead pointer scans ahead of the beginning point until the token is discovered. We view the position of each pointer as being between the character last read and the character next to be read. In practice, each buffering scheme adopts one convention: a pointer is either at the symbol last read or at the symbol it is ready to read.
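
The two techniques listed above can be combined in one small sketch. The C code below is an illustration under stated assumptions, not a definitive implementation: the names scanner, load_half, scanner_init and next_char are hypothetical, the half size is shrunk to 8 characters so that reloads happen quickly, the input comes from memory rather than a file, and the NUL character is assumed never to occur in the source text so it can serve as the sentinel. Only the lookahead (forward) pointer is modelled; a full scanner would also keep the token-beginning pointer.

```c
#include <stddef.h>
#include <string.h>

#define HALF 8          /* tiny half size for illustration; the text suggests e.g. 100 */
#define SENTINEL '\0'   /* assumption: NUL never appears in the source text */

/* Buffer layout: [half 0][sentinel][half 1][sentinel] */
struct scanner {
    const char *src;             /* unread input (stand-in for a file) */
    size_t left;                 /* number of unread input characters */
    char buf[2 * HALF + 2];
    char *forward;               /* lookahead pointer: next character to read */
};

/* Fill one half with up to HALF characters and place a sentinel after them.
   A full half puts the sentinel in the end-of-half slot; a short half puts
   it inside the half, which marks the true end of input. */
static void load_half(struct scanner *s, int half)
{
    char *start = half ? s->buf + HALF + 1 : s->buf;
    size_t n = s->left < HALF ? s->left : HALF;

    memcpy(start, s->src, n);
    start[n] = SENTINEL;
    s->src += n;
    s->left -= n;
}

void scanner_init(struct scanner *s, const char *text, size_t len)
{
    s->src = text;
    s->left = len;
    load_half(s, 0);
    s->forward = s->buf;
}

/* Return the next input character, or -1 at end of input. */
int next_char(struct scanner *s)
{
    for (;;) {
        char c = *s->forward;
        if (c != SENTINEL) {          /* common case: a single test per character */
            s->forward++;
            return (unsigned char)c;
        }
        if (s->forward == s->buf + HALF) {                 /* end of half 0 */
            load_half(s, 1);
            s->forward = s->buf + HALF + 1;
        } else if (s->forward == s->buf + 2 * HALF + 1) {  /* end of half 1 */
            load_half(s, 0);
            s->forward = s->buf;
        } else {
            return -1;                /* sentinel inside a half: end of input */
        }
    }
}
```

The point of the sentinel is visible in next_char: the usual path performs one comparison per character, and the checks for "end of a half" run only when a sentinel is actually seen. Without sentinels, every advance of the lookahead pointer would need separate tests for the end of each half.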

[Figure: input buffer showing the token-beginning pointer and the lookahead pointer]

The distance that the lookahead pointer may have to travel past the actual token may be large.
For example, in a PL/I program we may see: DECLARE (ARG1, ARG2, …, ARGn) without knowing whether DECLARE is a keyword or an array name until we see the character that follows the
