
Specification of Tokens - Lexical Analysis, Computer Science and IT Engineering

SPECIFICATION OF TOKENS

Tokens are specified using three notions:

1) Strings

2) Languages

3) Regular expressions

Strings and Languages

An alphabet or character class is a finite set of symbols.
A string over an alphabet is a finite sequence of symbols drawn from that alphabet.
A language is any countable set of strings over some fixed alphabet.

In language theory, the terms "sentence" and "word" are often used as synonyms for "string." The length of a string s, usually written |s|, is the number of occurrences of symbols in s. For example, banana is a string of length six. The empty string, denoted ε, is the string of length zero.

Operations on strings

The following string-related terms are commonly used:

1. A prefix of string s is any string obtained by removing zero or more symbols from the end of string s. For example, ban is a prefix of banana.

2. A suffix of string s is any string obtained by removing zero or more symbols from the beginning of s. For example, nana is a suffix of banana.

3. A substring of s is obtained by deleting any prefix and any suffix from s. For example, nan is a substring of banana.

4. The proper prefixes, suffixes, and substrings of a string s are those prefixes, suffixes, and substrings, respectively, of s that are not ε and not equal to s itself.

5. A subsequence of s is any string formed by deleting zero or more not necessarily consecutive positions of s. For example, baan is a subsequence of banana.
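The string terms above can be tried out directly. The following is a minimal Python sketch; the function names are illustrative and not part of the original notes.

```python
def prefixes(s):
    """All prefixes of s, from the empty string up to s itself."""
    return [s[:i] for i in range(len(s) + 1)]

def suffixes(s):
    """All suffixes of s, from s itself down to the empty string."""
    return [s[i:] for i in range(len(s) + 1)]

def substrings(s):
    """All distinct substrings of s (delete any prefix and any suffix)."""
    return {s[i:j] for i in range(len(s) + 1) for j in range(i, len(s) + 1)}

def is_subsequence(t, s):
    """True if t can be formed by deleting zero or more symbols of s."""
    remaining = iter(s)
    # 'ch in remaining' consumes the iterator, so order is respected.
    return all(ch in remaining for ch in t)

def proper(items, s):
    """Keep only the items that are neither the empty string nor s itself."""
    return [x for x in items if x != "" and x != s]

print("ban" in prefixes("banana"))         # prefix check
print("nana" in suffixes("banana"))        # suffix check
print("nan" in substrings("banana"))       # substring check
print(is_subsequence("baan", "banana"))    # subsequence check
print(proper(prefixes("ban"), "ban"))      # proper prefixes of "ban"
```

Every prefix and every suffix is also a substring, and every substring is also a subsequence, but not the other way round: baan is a subsequence of banana without being a substring.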

Operations on languages:

The following are the operations that can be applied to languages:

1. Union

2. Concatenation

3. Kleene closure

4. Positive closure

The following example shows the operations on languages. Let L = {0, 1} and S = {a, b, c}.

1. Union: L ∪ S = {0, 1, a, b, c}

2. Concatenation: LS = {0a, 0b, 0c, 1a, 1b, 1c}

3. Kleene closure: L* = {ε, 0, 1, 00, 01, 10, 11, …}

4. Positive closure: L⁺ = {0, 1, 00, 01, 10, 11, …}
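The four operations named above can be sketched in Python, with finite languages represented as sets of strings. Because the Kleene closure and the positive closure are infinite in general, this sketch only enumerates strings up to an assumed length bound max_len; the function names and the bound are illustrative choices.

```python
def union(L, S):
    """Strings that are in L or in S."""
    return L | S

def concat(L, S):
    """Every string of L followed by every string of S."""
    return {x + y for x in L for y in S}

def kleene_closure(L, max_len):
    """Strings of L* (zero or more concatenations of L) of length <= max_len."""
    result = {""}
    frontier = {""}
    while frontier:
        # Extend the newest strings by one more string of L, within the bound.
        frontier = {x + y for x in frontier for y in L
                    if len(x + y) <= max_len} - result
        result |= frontier
    return result

def positive_closure(L, max_len):
    """Strings of L+ (one or more concatenations of L) of length <= max_len."""
    return {x + y for x in L for y in kleene_closure(L, max_len)
            if len(x + y) <= max_len}

L = {"0", "1"}
S = {"a", "b", "c"}
print(sorted(union(L, S)))
print(sorted(concat(L, S)))
print(sorted(kleene_closure(L, 2)))
print(sorted(positive_closure(L, 2)))
```

With L = {0, 1}, the only difference between the two closures within the bound is the empty string, which belongs to L* but not to L⁺.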

Regular Expressions

Each regular expression r denotes a language L(r).
Here are the rules that define the regular expressions over some alphabet Σ and the languages that those expressions denote:

1. ε is a regular expression, and L(ε) is {ε}, that is, the language whose sole member is the empty string.

2. If ‘a’ is a symbol in Σ, then ‘a’ is a regular expression, and L(a) = {a}, that is, the language with one string, of length one, with ‘a’ in its one position.

3. Suppose r and s are regular expressions denoting the languages L(r) and L(s). Then,

a) (r)|(s) is a regular expression denoting the language L(r) ∪ L(s).

b) (r)(s) is a regular expression denoting the language L(r)L(s).

c) (r)* is a regular expression denoting (L(r))*.

d) (r) is a regular expression denoting L(r).

4. The unary operator * has highest precedence and is left associative.

5. Concatenation has second highest precedence and is left associative.

6. | has lowest precedence and is left associative.
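The precedence rules can be observed with Python's re module, whose pattern syntax follows the same conventions for *, concatenation and |. The sketch below compares a pattern written without parentheses against its fully parenthesized reading and against a different grouping; the sample strings are an arbitrary choice made for illustration.

```python
import re

# Under the precedence rules, a|bc* is read as (a)|((b)((c)*)).
unparenthesized = re.compile(r"a|bc*")
fully_parenthesized = re.compile(r"(a)|((b)((c)*))")
# A different grouping denotes a different language.
other_grouping = re.compile(r"((a|b)c)*")

samples = ["", "a", "b", "bc", "bccc", "ac", "acbc", "abc"]

def language(pattern, strings):
    """The subset of strings that the pattern matches in full."""
    return {s for s in strings if pattern.fullmatch(s)}

print(sorted(language(unparenthesized, samples)))
print(sorted(language(fully_parenthesized, samples)))
print(sorted(language(other_grouping, samples)))
```

The first two patterns accept the same samples, while the third accepts a different set, which is why the conventions matter when parentheses are dropped.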

Regular set

A language that can be defined by a regular expression is called a regular set. If two regular expressions r and s denote the same regular set, we say they are equivalent and write r = s.

There are a number of algebraic laws for regular expressions that can be used to manipulate regular expressions into equivalent forms.

For instance, | is commutative: r|s = s|r; and | is associative: r|(s|t) = (r|s)|t.
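Laws like these can be spot-checked by comparing two expressions on every string up to some length. The following Python sketch does this with the re module; the helper names, the alphabet and the length bound are assumptions made for illustration. Agreement on a bounded set of strings is evidence of equivalence, not a proof.

```python
import re
from itertools import product

def strings_up_to(alphabet, max_len):
    """Yield every string over the alphabet with length 0..max_len."""
    for n in range(max_len + 1):
        for symbols in product(alphabet, repeat=n):
            yield "".join(symbols)

def agree_up_to(r, s, alphabet="abc", max_len=4):
    """True if r and s accept exactly the same strings up to max_len."""
    pr, ps = re.compile(r), re.compile(s)
    return all(bool(pr.fullmatch(w)) == bool(ps.fullmatch(w))
               for w in strings_up_to(alphabet, max_len))

print(agree_up_to("a|b", "b|a"))            # | is commutative
print(agree_up_to("a|(b|c)", "(a|b)|c"))    # | is associative
print(agree_up_to("(a|b)c", "ac|bc"))       # concatenation distributes over |
print(agree_up_to("ab", "ba"))              # concatenation is not commutative
```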

Regular Definitions

Giving names to regular expressions is referred to as a regular definition. If Σ is an alphabet of basic symbols, then a regular definition is a sequence of definitions of the form

d₁ → r₁

d₂ → r₂

………

dₙ → rₙ

1. Each dᵢ is a distinct name.

2. Each rᵢ is a regular expression over the alphabet Σ ∪ {d₁, d₂, . . . , dᵢ₋₁}.

Example: Identifiers are the set of strings of letters and digits beginning with a letter. A regular definition for this set:

letter → A | B | … | Z | a | b | … | z

digit → 0 | 1 | … | 9

id → letter ( letter | digit )*
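A regular definition can be turned into a single pattern by substituting each name with its already expanded body, in order. The Python sketch below assumes a Lex-like convention in which a name is written in braces, such as {letter}; that convention and the helper name expand are choices made for this illustration, not something the notes prescribe.

```python
import re

# An ordered regular definition: each body may use only earlier names.
definitions = [
    ("letter", "[A-Za-z]"),
    ("digit", "[0-9]"),
    ("id", "{letter}({letter}|{digit})*"),
]

def expand(definitions):
    """Replace every {name} with the expanded body of an earlier definition."""
    expanded = {}
    for name, body in definitions:
        # A KeyError here means a body used a name that was not defined earlier.
        expanded[name] = re.sub(
            r"\{(\w+)\}",
            lambda m: "(?:" + expanded[m.group(1)] + ")",
            body,
        )
    return expanded

patterns = expand(definitions)
identifier = re.compile(patterns["id"])

for lexeme in ["count", "x1", "1x", ""]:
    print(repr(lexeme), bool(identifier.fullmatch(lexeme)))
```

Processing the definitions in order mirrors the rule that each rᵢ may refer only to names defined before it, so a definition can never refer to itself.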

Shorthands

Certain constructs occur so frequently in regular expressions that it is convenient to introduce notational shorthands for them.

1. One or more instances (+):

- The unary postfix operator + means “one or more instances of”.

- If r is a regular expression that denotes the language L(r), then (r)⁺ is a regular expression that denotes the language (L(r))⁺.

- Thus the regular expression a⁺ denotes the set of all strings of one or more a’s.

- The operator ⁺ has the same precedence and associativity as the operator *.

2. Zero or one instance (?):

- The unary postfix operator ? means “zero or one instance of”.

- The notation r? is a shorthand for r | ε.

- If r is a regular expression, then (r)? is a regular expression that denotes the language L(r) ∪ {ε}.

3. Character classes:

- The notation [abc], where a, b and c are alphabet symbols, denotes the regular expression a | b | c.

- A character class such as [a-z] denotes the regular expression a | b | c | d | … | z.

- We can describe identifiers as being strings generated by the regular expression [A-Za-z][A-Za-z0-9]*.
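Each shorthand can be checked against the longer expression it abbreviates. The Python sketch below uses the re module, which supports +, ? and character classes with the same meaning; the helper name and the sample strings are arbitrary choices for illustration, and the empty alternative in (?:a|) stands in for ε.

```python
import re

samples = ["", "a", "aa", "aaa", "b", "ab", "c", "d"]

def same_on_samples(p, q, strings):
    """True if patterns p and q accept exactly the same sample strings."""
    return all(bool(re.fullmatch(p, s)) == bool(re.fullmatch(q, s))
               for s in strings)

print(same_on_samples(r"a+", r"aa*", samples))        # r+ is r followed by r*
print(same_on_samples(r"a?", r"(?:a|)", samples))     # r? is r | ε
print(same_on_samples(r"[abc]", r"a|b|c", samples))   # [abc] is a | b | c
print(same_on_samples(r"[a-c]", r"a|b|c", samples))   # a range lists its members
```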

Non-regular Set

A language which cannot be described by any regular expression is a non-regular set. Examples: the set of all strings of balanced parentheses, and the set of repeating strings such as {wcw | w is a string of a’s and b’s}, cannot be described by a regular expression. The set of balanced parentheses can, however, be specified by a context-free grammar.
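Recognizing balanced parentheses requires counting to an arbitrary depth, which is what a regular expression cannot do. A minimal Python sketch of such a recognizer is given below; the function name is illustrative, and the sketch treats any symbol other than a parenthesis as outside the language.

```python
def is_balanced(s):
    """True if s consists only of parentheses and they are balanced."""
    depth = 0  # number of currently open parentheses; has no fixed upper bound
    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:      # a ")" appeared with no matching "("
                return False
        else:
            return False       # symbol outside the alphabet { (, ) }
    return depth == 0

for text in ["", "(()())", "(()", ")("]:
    print(repr(text), is_balanced(text))
```

The counter depth can grow without limit as the input nests deeper, whereas a finite automaton has only a fixed number of states to remember the nesting level.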


FAQs on Specification of Tokens - Lexical Analysis

1. What is the role of lexical analysis in computer programming?

Ans. Lexical analysis is the first phase of a compiler. It breaks the input source code into a sequence of tokens, the smallest meaningful units of the program and the basic building blocks of the programming language. The role of lexical analysis is to identify these tokens and generate a token stream that can be used by the subsequent phases of the compiler. Along the way, the lexical analyzer typically discards whitespace, comments, and other elements of the source code that later phases do not need.

2. What are tokens in computer programming?

Ans. Tokens are the smallest meaningful units of a programming language. They represent the basic building blocks of the language, and include keywords, identifiers, operators, literals, and special symbols. The role of the lexical analyzer is to identify these tokens in the source code and generate a token stream that can be used by the subsequent phases of the compiler.
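Turning source text into a token stream can be sketched with one regular expression per token kind, combined into a single pattern with named groups. The Python example below is a minimal sketch for a toy language; the token names, the patterns and the function name tokenize are assumptions made for illustration, not a definition taken from the notes.

```python
import re

# One (name, pattern) pair per token kind; order decides ties.
TOKEN_SPEC = [
    ("NUMBER", r"[0-9]+"),
    ("ID", r"[A-Za-z][A-Za-z0-9]*"),
    ("OP", r"[+\-*/=]"),
    ("SKIP", r"[ \t\n]+"),   # whitespace is recognized but not emitted
    ("MISMATCH", r"."),      # any other single character is an error
]
MASTER = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)

def tokenize(source):
    """Return a list of (kind, lexeme) pairs for the given source text."""
    tokens = []
    for match in MASTER.finditer(source):
        kind, lexeme = match.lastgroup, match.group()
        if kind == "SKIP":
            continue
        if kind == "MISMATCH":
            raise ValueError(
                f"illegal character {lexeme!r} at position {match.start()}"
            )
        tokens.append((kind, lexeme))
    return tokens

print(tokenize("count = count + 42"))
```

Each pair in the result holds the token kind and the lexeme, which is the actual text that matched the pattern for that kind.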

3. What is the difference between a keyword and an identifier in programming?

Ans. Keywords and identifiers are both types of tokens in programming languages, but they serve different purposes. Keywords are predefined reserved words that have a specific meaning in the language. They cannot be used as identifiers or variable names. Identifiers, on the other hand, are user-defined names that are used to represent variables, functions, classes, and other program elements. They must follow certain naming conventions and cannot be the same as a keyword.
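One common way to separate the two is to match every word with the identifier pattern first and then look the lexeme up in a table of reserved words. The Python sketch below illustrates this; the keyword set is a hypothetical example and the function name classify is illustrative.

```python
import re

# Hypothetical reserved words for a toy language.
KEYWORDS = {"if", "else", "while", "return"}
IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9]*")

def classify(lexeme):
    """Classify a lexeme as KEYWORD, ID, or NOT_A_WORD."""
    if not IDENTIFIER.fullmatch(lexeme):
        return "NOT_A_WORD"
    return "KEYWORD" if lexeme in KEYWORDS else "ID"

for lexeme in ["while", "whlie", "total", "9lives"]:
    print(lexeme, classify(lexeme))
```

Under this scheme a misspelled keyword such as whlie is still a well-formed identifier, so it is classified as ID rather than rejected.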

4. How does lexical analysis help in identifying syntax errors in a program?

Ans. Strictly speaking, lexical analysis detects lexical errors rather than syntax errors. While breaking the source code into a sequence of tokens, the lexical analyzer can report input that does not match the pattern of any token, for example an illegal character. A misspelled keyword, however, usually still matches the pattern for an identifier, so it is passed on as an identifier and the mistake is detected by a later phase such as the parser. The token stream is then passed on to the subsequent phases of the compiler, which check it against the syntax rules of the language and can produce more helpful error messages for the programmer.

5. What are some common errors that can occur during lexical analysis?

Ans. Several kinds of errors are commonly associated with lexical analysis, including:

1. Misspelled keywords or identifiers (a misspelling that still forms a valid identifier is usually only caught by a later phase)

2. Illegal characters or symbols

3. Unmatched delimiters, such as an unclosed quote (mismatched parentheses are usually reported by the parser)

4. Improperly formatted numbers or strings

5. Inconsistent use of whitespace or comments, such as an unterminated comment

These errors can cause problems in the subsequent phases of the compiler, and can be difficult to diagnose if not caught early in the process. Therefore, it is important to pay close attention to the output of the lexical analyzer and correct any errors as soon as possible.


About this Document

	11.4K Views
	4.60/5 Rating
	Nov 18, 2024 Last updated
