Q1: Consider the following statements. (2020)
I. Symbol table is accessed only during lexical analysis and syntax analysis.
II. Compilers for programming languages that support recursion necessarily need heap storage for memory allocation in the run-time environment.
III. Errors violating the condition 'any variable must be declared before its use' are detected during syntax analysis.
Which of the above statements is/are TRUE?
(a) I only
(b) I and III only
(c) II only
(d) None of I, II and III
Ans: (d)
Sol:
I. False. The symbol table is accessed by most phases of a compiler, beginning with lexical analysis and continuing through optimization and code generation, not only during lexical and syntax analysis.
II. False. Heap storage is not essential; what recursion requires is dynamic allocation of activation records, and a run-time stack is sufficient for that. Remember that the stack can also grow dynamically (see the C memory layout).
III. False. Syntax analysis is based on context-free grammars, which cannot express the "declared before use" constraint; checking it needs the extra (context-sensitive) power available in the semantic analysis phase, so this error is detected only during semantic analysis. A sketch of such a check follows.
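As an illustration of statement III, here is a minimal Python sketch (all names hypothetical) of the declare-before-use check that a semantic analyzer can perform with the help of a symbol table:

```python
# Hypothetical sketch: report any use of a name that precedes its declaration.
def check_declared_before_use(events):
    """events: ('decl', name) or ('use', name) pairs in program order."""
    symbol_table = set()          # names declared so far
    errors = []
    for kind, name in events:
        if kind == 'decl':
            symbol_table.add(name)
        elif kind == 'use' and name not in symbol_table:
            errors.append(f"'{name}' used before declaration")
    return errors

print(check_declared_before_use([('use', 'x'), ('decl', 'x'), ('use', 'x')]))
# ["'x' used before declaration"]
```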
Q2: A lexical analyzer uses the following patterns to recognize three tokens T1, T2, and T3 over the alphabet {a,b,c}. (2018)
T1: a?(b|c)*a
T2: b?(a|c)*b
T3: c?(b|a)*c
Note that 'x?' means 0 or 1 occurrence of the symbol x. Note also that the analyzer outputs the token that matches the longest possible prefix.
If the string bbaacabc is processed by the analyzer, which one of the following is the sequence of tokens it outputs?
(a) T1 T2 T3
(b) T1 T1 T3
(c) T2 T1 T3
(d) T3 T3
Ans: (d)
Sol: You can think of T3 as (ε + c)(b + a)*c.
The given string is bbaacabc.
The longest prefix that matches any of the patterns is bbaac, which is derived from the regex for T3.
The remaining input, abc, is also derived from T3.
Hence T3 T3 is the answer.
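A small Python sketch of the longest-prefix (maximal-munch) rule, writing the three patterns directly in `re` syntax; for these particular patterns the greedy match returned by `re.match` happens to be the longest possible prefix match:

```python
import re

# The three token patterns over {a, b, c}; 'x?' has the same meaning in re syntax.
PATTERNS = [('T1', r'a?(b|c)*a'), ('T2', r'b?(a|c)*b'), ('T3', r'c?(b|a)*c')]

def tokenize(s):
    out = []
    while s:
        # Pick the token whose pattern matches the longest prefix of s.
        best_len, best_tok = 0, None
        for tok, pat in PATTERNS:
            m = re.match(pat, s)
            if m and len(m.group(0)) > best_len:
                best_len, best_tok = len(m.group(0)), tok
        if best_tok is None:
            raise ValueError('no token matches: ' + s)
        out.append(best_tok)
        s = s[best_len:]
    return out

print(tokenize('bbaacabc'))   # ['T3', 'T3']
```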
Q3: The lexical analysis for a modern computer language such as Java needs the power of which one of the following machine models in a necessary and sufficient sense? (2011)
(a) Finite state automata
(b) Deterministic pushdown automata
(c) Non-Deterministic pushdown automata
(d) Turing machine
Ans: (a)
Sol: In a compiler, the lexical analyzer groups the character stream into lexemes and produces tokens as output for the parser. Token patterns are specified by regular expressions, and regular languages are recognized exactly by finite automata, so a finite state automaton is both necessary and sufficient for lexical analysis. A small DFA sketch follows.
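For example, here is a minimal Python sketch of a DFA (two live states plus a reject state, no stack or tape) that recognizes an identifier token of the form [A-Za-z_][A-Za-z0-9_]*:

```python
# DFA sketch for the identifier pattern [A-Za-z_][A-Za-z0-9_]*.
def is_identifier(lexeme):
    state = 'START'
    for ch in lexeme:
        if state == 'START':
            state = 'IDENT' if (ch.isalpha() or ch == '_') else 'REJECT'
        elif state == 'IDENT':
            state = 'IDENT' if (ch.isalnum() or ch == '_') else 'REJECT'
        else:                        # REJECT is a dead state
            return False
    return state == 'IDENT'

print(is_identifier('count1'), is_identifier('1count'))   # True False
```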
Q4: In a compiler, keywords of a language are recognized during (2011)
(a) parsing of the program
(b) the code generation
(c) the lexical analysis of the program
(d) dataflow analysis
Ans: (c)
Sol: Typically, the lexical analysis phase of compilation breaks the input text into lexemes, each belonging to some token type that is useful in later analysis. Consequently, keywords are first recognized during lexical analysis in order to make parsing easier: parsers are written as context-free grammars over tokens (the category of a lexeme) rather than over the lexeme contents, so it is significantly easier to build a parser when keywords are already marked by the lexer. Identifiers are likewise tokens and are also recognized during lexical analysis.
Hence, option (c) is correct. A small sketch of the keyword check is given below.
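A minimal Python sketch (token names hypothetical) of how a lexer typically distinguishes keywords: after scanning an identifier-shaped lexeme, it consults a keyword table to decide the token type:

```python
# A few keywords for illustration; a real language defines the full set.
KEYWORDS = {'if', 'else', 'while', 'return', 'int'}

def classify(lexeme):
    # Identifier-shaped lexemes become KEYWORD tokens if they are reserved.
    return ('KEYWORD', lexeme) if lexeme in KEYWORDS else ('ID', lexeme)

print(classify('while'))    # ('KEYWORD', 'while')
print(classify('whilst'))   # ('ID', 'whilst')
```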
Q5: Which data structure in a compiler is used for managing information about variables and their attributes? (2010)
(a) Abstract syntax tree
(b) Symbol table
(c) Semantic stack
(d) Parse table
Ans: (b)
Sol: The symbol table is an important data structure created and maintained by compilers to store information about the occurrence of various entities such as variable names, function names, objects, classes, interfaces, etc. It is used by both the analysis and the synthesis parts of a compiler.
If a compiler has to handle only a small amount of data, the symbol table can be implemented as an unordered list, which is easy to code but suitable only for small tables. A symbol table can be implemented in one of the following ways:
- Linear (sorted or unsorted) list
- Binary search tree
- Hash table
Most commonly, symbol tables are implemented as hash tables, where the source-code symbol itself is the key for the hash function and the returned value is the information about the symbol, as in the sketch below.
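A minimal Python sketch of a hash-table symbol table (the attribute names are illustrative); Python's dict is itself a hash table, keyed here on the symbol name:

```python
symbol_table = {}   # name -> attributes

def declare(name, sym_type, scope):
    symbol_table[name] = {'type': sym_type, 'scope': scope}

def lookup(name):
    return symbol_table.get(name)    # None if the name was never declared

declare('count', 'int', 'global')
print(lookup('count'))     # {'type': 'int', 'scope': 'global'}
print(lookup('missing'))   # None
```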
Q6: The number of tokens in the following C statement is (2000)
printf("i=%d, &i=%x", i, &i);
(a) 3
(b) 26
(c) 10
(d) 21
Ans: (c)
Sol: Tokens are:
1. printf
2. (
3. "i=%d, &i=%x"
4. ,
5. i
6. ,
7. &
8. i
9. )
10. ;
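A rough Python sketch that confirms the count, with one regex alternative per token class this statement needs (string literal, identifier, punctuation); a real C lexer of course has many more rules:

```python
import re

stmt = 'printf("i=%d, &i=%x", i, &i);'
token_re = r'"[^"]*"|[A-Za-z_]\w*|[(),;&]'   # string | identifier | punctuation
tokens = re.findall(token_re, stmt)
print(len(tokens), tokens)                   # 10 tokens
```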
Q7: The number of tokens in the FORTRAN statement DO 10 I = 1.25 is (1999)
(a) 3
(b) 4
(c) 5
(d) None of the above
Ans: (a)
Sol: token1 = DO10I
token2 = '='
token3 = 1.25
In FORTRAN, blanks within the statement field were generally ignored, allowing the programmer to omit spaces between tokens for brevity or to include spaces within identifiers for clarity (for example, AVG OF X was a valid identifier, equivalent to AVGOFX). So DO 10 I is read as the single identifier DO10I, and the statement is the assignment DO10I = 1.25, giving 3 tokens, as the sketch below illustrates.
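A tiny Python sketch of the blank-stripping behaviour described above:

```python
# Blanks in the FORTRAN statement field are insignificant, so strip them first.
stmt = 'DO 10 I = 1.25'
compact = stmt.replace(' ', '')      # 'DO10I=1.25'
lhs, rhs = compact.split('=')
print([lhs, '=', rhs])               # ['DO10I', '=', '1.25']  -> 3 tokens
```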
Q8: The pass numbers for each of the following activities (1996)
i. object code generation
ii. literals added to literal table
iii. listing printed
iv. address resolution of local symbols that occur in a two pass assembler
respectively are
(a) 1, 2, 1, 2
(b) 2, 1, 2, 1
(c) 2, 1, 1, 2
(d) 1, 2, 2, 2
Ans: (b)
Sol: The functions performed in pass 1 and pass 2 of a two-pass assembler are:
Pass 1
1. Assign addresses to all statements in the program.
2. Save the values assigned to all labels (and enter literals into the literal table) for use in pass 2.
3. Perform some processing of assembler directives.
Pass 2
1. Assemble instructions.
2. Generate data values defined by BYTE, WORD, etc.
3. Perform processing of assembler directives not done during pass 1.
4. Write the object program and the assembly listing.
Thus literals are added to the literal table and local symbol addresses are resolved in pass 1, while object code generation and the printed listing happen in pass 2, giving 2, 1, 2, 1. A toy two-pass sketch follows.
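A toy Python sketch (the instruction set and one-word-per-instruction layout are invented for illustration) showing why label addresses must be fixed in pass 1 before code can be generated in pass 2:

```python
def pass1(lines):
    """Pass 1: assign addresses and record label values in the symbol table."""
    symtab, addr = {}, 0
    for line in lines:
        if line.endswith(':'):       # label definition
            symtab[line[:-1]] = addr
        else:
            addr += 1                # assume each instruction takes one word
    return symtab

def pass2(lines, symtab):
    """Pass 2: generate code, resolving label operands via the symbol table."""
    code = []
    for line in lines:
        if line.endswith(':'):
            continue
        op, _, operand = line.partition(' ')
        code.append((op, symtab.get(operand, operand)))
    return code

prog = ['LOOP:', 'LOAD X', 'JMP LOOP']
symtab = pass1(prog)
print(symtab, pass2(prog, symtab))   # {'LOOP': 0} [('LOAD', 'X'), ('JMP', 0)]
```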
Q9: A simple two-pass assembler does the following in the first pass: (1993)
(a) It allocates space for the literals.
(b) It computes the total length of the program.
(c) It builds the symbol table for the symbols and their values.
(d) It generates code for all the load and store register instructions.
(e) None of the above.
Ans: (a, b, c)
Sol: A two-pass assembler scans the code twice. In the first pass it computes how long each machine-code instruction will be, just to find out the addresses of all the labels, and it builds a table listing each symbol and the address it will occupy in the program; this table is the symbol table (space for literals is also allocated here). In the second pass it generates the machine code, using the symbol table to determine how far away jump targets are and to generate the appropriate instructions.
Q10: In a compiler the module that checks every character of the source text is called: (1988)
(a) The code generator.
(b) The code optimizer.
(c) The lexical analyzer
(d) The syntax analyzer
Ans: (c)
Sol: The first phase of a compiler is the lexical analyzer, also known as the scanner because it scans the whole program and therefore examines every character of the source text. Hence the answer is (c).
Q11: In a compiler the module that checks every character of the source text is called: (1987)
(a) The code generator.
(b) The code optimizer.
(c) The lexical analyzer.
(d) The syntax analyzer.
Ans: (c)
Sol: The first phase of a compiler is the lexical analyzer, also known as the scanner because it scans the whole program and therefore examines every character of the source text. Hence the answer is (c).