Computer Science Engineering (CSE) Exam  >  Computer Science Engineering (CSE) Notes  >  Theory of Computation  >  Formal Representation of Languages & Chomsky Hierarchy

Formal Representation of Languages & Chomsky Hierarchy | Theory of Computation - Computer Science Engineering (CSE) PDF Download

Part V. Formal Representation of Languages
Here we introduce the concept of formal languages and grammars.We know that a grammar is specified for a natural language such as English. To define valid sentences and to give structural descriptions of sentences a grammar is used. Linguistics is a branch to study the theory of languages. Scientists in this branch of study have defined a formal grammar to describe English that would make language translation using computers very easy.

English Grammar

First we will consider how grammars can be specified for English language. Then we will extend this model to our programming language grammar specification.

An example for grammar is,
ARTICLE → a | an | the
NOUN_PHRASE → ARTICLE NOUN
NOUN → boy | apple | ε

Productions
The above are called productions or rules of the grammar. Two productions are given above.
First production is,
ARTICLE → a | an | the
It can also be written as,
ARTICLE → a
ARTICLE → an
ARTICLE → the

Second production is,
NOUN_PHRASE → ARTICLE NOUN

Third production is,
NOUN → boy | apple | ε
It can also be written as,
NOUN → boy
NOUN → apple
NOUN → ε

Non-terminals
In the above, ARTICLE, NOUN_PHRASE, NOUN are called non-terminal symbols. A non-terminal is specified using upper case letters. Note that the left hand side of a production always contains a single non-terminal.

Terminals
In the above, ’a’, ’an’, ’the’, ’boy’, ’apple’ are called terminal symbols. A terminal is written using lowercase letters. ε means an empty string.

Start Symbol
In a grammar specification, one non-terminal symbol is called the start symbol. Here, NOUN_PHRASE is the start symbol. Thus a grammar consists of a set of productions, terminals, non-terminals and start symbol. If the matching left hand side of a certain rule can be found to occur in a string, it may be replaced by the corresponding right hand side.

8.1 Formal Definition of a Grammar
A grammar is Formal Representation of Languages & Chomsky Hierarchy | Theory of Computation - Computer Science Engineering (CSE) where
VN is a finite non empty set whose elements are called non-terminals.
∑ is a finite non empty set whose elements are called terminals.
V∩ ∑ = φ
S is a special non terminal called start symbol.
P is a finite set whose elements are α → β, whose α and β are terminals or non terminals. Elements of P are called productions or production rules.
For example,
Consider the productions,
S → aB|bA
A → aS|bAA|a
B → bS|aBB|b
where S is the start symbol.
For the above productions the grammar is defined as,
VN = {S, A, B}
∑ = {a, b}
Sis the start symbol S
P = {S → aB|bA, A → aS|bAA|a, B → bS|aBB|b}

Derivations
Productions are used to derive various sentences.
Example 1:
Consider the grammar given below:
SENTENCE → NOUN_PHRASE VERB_PHRASE
NOUN_PHRASE → ARTICLE NOUN
VERB_PHRASE → VERB NOUN_PHRASE
ARTICLE → a | an | the
NOUN → boy | apple
VERB → apple
where SENTENCE is the start symbol.
Using this grammar, the sentence
’the boy ate an apple’
is derived as shown below:
Formal Representation of Languages & Chomsky Hierarchy | Theory of Computation - Computer Science Engineering (CSE)

Example 2:
Consider the grammar given below:
S → baaS | baa
In the above, S is the start symbol, S is a non-terminal and a, b are terminals.
To derive a sentence, we begin with the start symbol, S.
From the above, we can rewrite, S as baa.
Thus the sentence, baa is a valid sentence according to the grammar.
Formal Representation of Languages & Chomsky Hierarchy | Theory of Computation - Computer Science Engineering (CSE)

From the above rule, rewrite S as baaS. Then rewrite S in baaS as baaS. Then we get baabaaS. Again rewrite S as
baa. We get baabaabaa.
Formal Representation of Languages & Chomsky Hierarchy | Theory of Computation - Computer Science Engineering (CSE)
Thus baabaabaa is a valid sentence.

Example 3:
Following is a grammar for expressions:
Expr → Expr Op num | num
Op → + | - | * | /
In the above grammar, Expr, Op are non-terminals.
num, +, -, *, / are terminals.
Using the above grammar, a large set of expressions can be derived.

Example 4:
The following is a grammer for if..else construct in C.
Stmt → if ( Expr ) Stmt else Stmt | if ( Expr ) Stmt

Example 5:
The following is a grammar for a set of statements in C.
Stmt → id = Expr;
| if (Expr) Stmt
| if (Expr) Stmt else Stmt
| while (Expr) stmt
| do Stmt while (Expr);
| { Stmts }
Stmts → Stmts Stmt
| ε

8.2 Formal Definition of a Language
The language generated by a grammar G is L(G) is defined as
L(G) = { w∈ ∑∗ | S∗ ⇒G w }
The elements of L(G) are called sentences.
In a simple way, L(G) is the set of all sentences derived from the start symbol S.
For example, for the grammar
S → baaS | baa
L(G) = {baa, baabaa, baabaabaa, . . . . . . . . . . }

Example 1:
Consider the grammar given below:
ARTICLE → a | an | the
NOUN_PHRASE → ARTICLE NOUN
NOUN → boy | apple
where NOUN_PHRASE is the start symbol.
Find the language L(G) generated by the grammar.
i)
          NOUN_PHRASE → ARTICLE NOUN
                    ⇒ a boy

ii)
         NOUN_PHRASE → ARTICLE NOUN
                   ⇒ an apple

iii)
        NOUN_PHRASE → ARTICLE NOUN
                  ⇒ the boy

iv)
        NOUN_PHRASE → ARTICLE NOUN
                  ⇒ the apple

v)
       NOUN_PHRASE → ARTICLE NOUN
                 ⇒ a apple
vi)     

       NOUN_PHRASE → ARTICLE NOUN
                 ⇒ an boy

Hence L(G) = { a boy, an apple, the apple, the boy, a apple, an boy}
Note that ’a apple’ and ’an boy’ are in the set even if they are not valid English phrases.

Exercises:
1. Let grammar G = [ {S,C}, {a,b}, P, S ]
where P consists of
S → aCa
C → aCa|b. Find L(G).
2. Let G = [{S, A1, A2}, {a, b}, P, S], where P consists of
S → aA1A2a
A1 → baA1A2b
A2 →A1ab
aA1 → baa
bA2b → abab
Check whether w = baabbabaaabbaba is in L(G).

2. Let grammar G = [ {S}, {0,1}, {S →0S1, S → ∈}, S ]. Find L(G).

9 Chomsky Classification of Languages
[Kavitha2004]
A number of language families are there. Noam Chomsky, a founder of formal language theory, provided an initial classification into four language types, type 0 to type 3. A grammar represents the structure of a language as we saw earlier. Four types of languages and their associated grammars are defined in Chomsky Hierarchy (defined by Noam Chomsky).The languages are
Type- 0 languages or Unrestricted languages,
Type- 1 languages or Context sensitive languages,
Type- 2 languages or Context free languages,
Type- 3 languages or regular languages.
Each language family is a proper subset of another one. The following figure shows the relationship.
Formal Representation of Languages & Chomsky Hierarchy | Theory of Computation - Computer Science Engineering (CSE)
Type-0 languages are generated by Type-0 grammars, Type-1 languages by Type-1 grammars, Type-2 languages by
Type-2 grammars and Type-3 languages by Type-3 grammars.

Thus the corresponding grammars are
Type- 0 grammars or Unrestricted grammars,
Type- 1 grammars or Context sensitive grammars,
Type- 2 grammars or Context free grammars,
Type- 3 grammars or regular grammars.

Type- 0 or Unrestricted languages vs Type- 0 grammars

Type- 0 languages are defined by arbitrary grammars (Type-0).
Type- 0 grammars have productions of the form, α → β, such that α ∉ ε.
Type- 0 languages are recognized using Turing Machines (TM).
The following is an example for an Unrestricted grammar:
S → ACaB
Ca → aaC
CB → DB
aD → Da
aE → Ea
AE → ε

Type- 1 or Context- sensitive languages vs Type- 1 grammars

Type- 1 languages are described by Type- 1 grammars.
LHS of a Type- 1 grammar is not longer than the RHS.
Type- 1 grammars are represented by the productions α → β, such that |α| ≤ |β|.

Type- 1 languages are recognized using Linear Bounded Automata (LBA).
The following is an example for a Context Sensitive grammar:
S → SBC
S → aC
B → a
CB → BC
Ba → aa
C → b

Type- 2 or Context- free languages vs Type- 2 grammars

Type- 2 languages are described by Type- 2 grammars or context free grammars.
LHS of a Context free grammar (Type-2) is a single non-terminal. The RHS consists of an arbitrary string of terminals and non-terminals. Context free grammars are represented by the productions A → β.
Type 2 languages are recognized using Pushdown Automata (PDA).
The following is an example for a Context free grammar:
S → aSb
S → ε
The English grammar given below is a context free grammar.
ARTICLE → a | an | the
NOUN_PHRASE → ARTICLE NOUN
NOUN → boy | apple

Type- 3 or Regular languages vs Type- 3 grammars

Type- 3 languages are described by Type- 3 grammars.
The LHS of a Type- 3 grammar is a single non-terminal. The RHS is empty, or consists of a single terminal or a single
terminal followed by a non- terminal.
Type- 3 grammars are represented by the productions A → aB or A → a.
Type- 3 languages are recognized using Finite State Automata (FA).
The following is an example for a regular grammar:
A → aA
A →ε
A → bB
B → bB
B → ε
A → c

Following figure shows the relation between the four types of languages and machines.
Formal Representation of Languages & Chomsky Hierarchy | Theory of Computation - Computer Science Engineering (CSE)

The document Formal Representation of Languages & Chomsky Hierarchy | Theory of Computation - Computer Science Engineering (CSE) is a part of the Computer Science Engineering (CSE) Course Theory of Computation.
All you need of Computer Science Engineering (CSE) at this link: Computer Science Engineering (CSE)
18 videos|69 docs|44 tests

Top Courses for Computer Science Engineering (CSE)

FAQs on Formal Representation of Languages & Chomsky Hierarchy - Theory of Computation - Computer Science Engineering (CSE)

1. What is the Chomsky hierarchy in formal language theory?
Ans. The Chomsky hierarchy is a classification of formal languages proposed by Noam Chomsky in the field of formal language theory. It categorizes languages into four levels based on the type of grammar that can generate them: Type 3 (regular), Type 2 (context-free), Type 1 (context-sensitive), and Type 0 (unrestricted).
2. What is the significance of formal representation of languages in computer science engineering?
Ans. Formal representation of languages plays a crucial role in computer science engineering. It allows us to define the syntax and structure of programming languages, develop compilers and interpreters, analyze language complexity, and design algorithms for language processing. It provides a formal framework for studying languages and their properties.
3. How are regular languages represented in the Chomsky hierarchy?
Ans. Regular languages are represented by Type 3 grammars in the Chomsky hierarchy. These grammars use regular expressions or finite automata (such as deterministic or non-deterministic finite automata) to describe the language. Regular languages can be recognized and generated by regular expressions and finite automata.
4. What are context-free languages in the context of the Chomsky hierarchy?
Ans. Context-free languages are represented by Type 2 grammars in the Chomsky hierarchy. These grammars consist of production rules where a single non-terminal can be replaced by a sequence of terminals and non-terminals. Context-free languages can be recognized and generated by pushdown automata or context-free parsers.
5. How does the Chomsky hierarchy relate to the complexity of languages?
Ans. The Chomsky hierarchy provides a way to categorize languages based on their generative power and complexity. As we move up the hierarchy from regular to unrestricted languages, the generative power increases, allowing more complex languages to be defined. Each level of the hierarchy has associated automata or grammars that can recognize or generate the languages within that level. The complexity of language recognition or generation algorithms also increases as we move up the hierarchy.
18 videos|69 docs|44 tests
Download as PDF
Explore Courses for Computer Science Engineering (CSE) exam

Top Courses for Computer Science Engineering (CSE)

Signup for Free!
Signup to see your scores go up within 7 days! Learn & Practice with 1000+ FREE Notes, Videos & Tests.
10M+ students study on EduRev
Related Searches

study material

,

pdf

,

Viva Questions

,

Previous Year Questions with Solutions

,

Formal Representation of Languages & Chomsky Hierarchy | Theory of Computation - Computer Science Engineering (CSE)

,

Exam

,

Sample Paper

,

video lectures

,

Formal Representation of Languages & Chomsky Hierarchy | Theory of Computation - Computer Science Engineering (CSE)

,

MCQs

,

shortcuts and tricks

,

Semester Notes

,

practice quizzes

,

Extra Questions

,

Summary

,

ppt

,

mock tests for examination

,

Formal Representation of Languages & Chomsky Hierarchy | Theory of Computation - Computer Science Engineering (CSE)

,

Objective type Questions

,

past year papers

,

Free

,

Important questions

;