Regular languages are those languages that can be accepted by deterministic finite automata (DFA) or nondeterministic finite automata (NFA). They can also be described by regular expressions. Regular languages form an important class of languages in the theory of computation.
Not every language is regular. Common examples of non-regular languages are:
A standard technique to show that a language is not regular is the Pumping Lemma for Regular Languages. The lemma gives a necessary condition that every regular language must satisfy; it is most often used in proofs by contradiction to show non-regularity.
Formal statement. Let L be a regular language. Then there exists an integer p ≥ 1 (called the pumping length) such that for every string w in L with |w| ≥ p, we can write w = xyz satisfying the following three conditions:
Interpretation. For any sufficiently long string w in a regular language L, there is a substring y that occurs within the first p symbols of w and that can be repeated or removed any number of times while keeping the resulting string inside L. We say that y can be "pumped".
Remark. The pumping lemma is a necessary condition for regularity: every regular language satisfies it. It is not a sufficient condition: a language satisfying the pumping property for a particular p is not guaranteed to be regular in general.
To prove that a language L is not regular, the typical strategy is a proof by contradiction using the pumping lemma. The steps are:
Two remarks about the choice of w and the argument in the final step:
Proof (sketch using DFA and the pigeonhole principle).
Since L is regular, there exists a DFA M that recognises L. Let n be the number of states of M.
Consider any string w ∈ L with |w| ≥ n. Let the symbols of w be a1, a2, ..., am, where m ≥ n.
When M processes w from the start state, it visits a sequence of states P0, P1, P2, ..., Pm, where P0 is the start state and Pk is the state after reading a1 ... ak.
There are m + 1 ≥ n + 1 visited states but only n distinct states in the DFA. By the pigeonhole principle, two of these visited states must be the same: there exist indices i and j with 0 ≤ i < j="" ≤="" n="" such="" that="" />i = Pj.
Now split the string w into three parts as follows.
x = a1a2 ... ai.
y = ai+1 ai+2 ... aj.
z = aj+1 aj+2 ... am.
The following observations follow directly from construction:
Because Pi = Pj, reading y from state Pi returns to the same state Pi. Therefore, for any i ≥ 0, reading y repeated i times still returns to Pi, and then reading z leads to an accepting state (because the original x y z = w was accepted). Thus x yi z is accepted for every i ≥ 0 and hence belongs to L.
The language consists of strings of one or more a's: {a, aa, aaa, aaaa, ...}.
A simple DFA with one loop on state q1 accepts any number of a's and demonstrates regularity. The pumping lemma is satisfied for this language: choose a pumping length p; take any w = am with m ≥ p, and let y be a substring of some a's within the first p symbols. Pumping y just changes the number of a's and every result remains of the form ak, hence stays in the language.
This is the set of strings that consist of one or more a's followed by one or more b's. A DFA that loops on a's and then switches to a loop on b's recognises this language.
The pumping lemma is satisfied: for any string w in the language with |w| ≥ p, the substring y that lies within the first p positions consists only of a's. Pumping y changes the number of a's but the overall shape "some a's then some b's" remains, so every pumped string is still in the language.
The set contains strings with the same number of a's followed by the same number of b's, for example {ab, aabb, aaabbb, ...}.
Proof (by pumping lemma).
Assume for contradiction that L is regular. Let p be the pumping length.
Choose w = ap bp. Then |w| = 2p ≥ p, so the lemma applies and we can write w = x y z with |y| > 0 and |xy| ≤ p.
Because |xy| ≤ p, the substring y consists only of a's. There are three possible forms for y:
Thus in all actual possibilities for this choice of w, pumping produces a string not in L, contradicting the pumping lemma. Hence L is not regular.
Proof (by pumping lemma).
Assume L is regular and let p be the pumping length.
Choose w = 0p 1p. Then |w| = 2p ≥ p, so w = x y z with |y| > 0 and |xy| ≤ p.
Because |xy| ≤ p, the substring y consists only of 0's. Take i = 0. Then x z has fewer 0's than 1's, so x z ∉ L. This contradicts the pumping lemma, hence L is not regular.
Correcting the notation: L = { ai² | i ≥ 1 }.
Proof (sketch using the pumping lemma).
Assume L is regular and let p be the pumping length.
Choose w = ap². Then |w| = p² ≥ p, so w = x y z with |y| > 0 and |xy| ≤ p.
Because |xy| ≤ p, the substring y lies among the first p a's. Thus y = ak for some k ≥ 1 and k ≤ p.
If we pump with i = 0 then the length becomes p² - k, which is not necessarily a perfect square; but to make a sure contradiction choose i = p+1. The resulting string has length p² + p·k = p(p + k), which is divisible by p and hence composite (not a perfect square of an integer unless p + k = p, impossible). Therefore x yp+1 z ∉ L. This contradicts the pumping lemma, so L is not regular.
Proof (by pumping lemma with a simple number-theoretic choice).
Assume the language of prime-length a-strings is regular. Let p be the pumping length.
Choose a prime number q > p and consider w = aq ∈ L. By the pumping lemma write w = x y z with |y| > 0 and |xy| ≤ p.
Since |xy| ≤ p < q,="" y="" />k for some k ≥ 1.
Now pump with i = q + 1. The new length is q + q·k = q(1 + k), which is composite because it has q as a factor and 1 + k > 1. Therefore x yq+1 z ∉ L. This contradicts the pumping lemma. Hence the language of prime-length a-strings is not regular.
This language contains all strings that are the concatenation of a string with itself, for example aa, abab, aab aab, etc.
Proof (by pumping lemma).
Assume L = {ww | w ∈ {a,b}*} is regular and let p be the pumping length.
Consider the particular string s = ap b ap b. This string is of the form ww with w = ap b.
We have |s| = 2(p + 1) ≥ p, so by the lemma s = x y z with |y| > 0 and |xy| ≤ p. Because |xy| ≤ p, the substring y lies entirely inside the first block of a's.
If we pump with i = 0 we remove some a's from the first half but not from the second half. The resulting string will no longer be of the form uu (the two halves will be different), hence x z ∉ L. This contradicts the pumping lemma. Therefore L is not regular.
The pumping lemma for regular languages is a fundamental tool to prove that a given language is not regular. It formalises the intuitive idea that a finite-state automaton has only finitely many states, so when processing sufficiently long strings some part of the computation must repeat. Using this repetition one shows that pumping (repeating or removing) a substring leads to strings that must still be accepted by any DFA for a regular language; if this contradicts the structure of the candidate language, the language is not regular.
| 1. What is the pumping lemma for regular languages? | ![]() |
| 2. How is the pumping lemma used to prove that a language is not regular? | ![]() |
| 3. Can the pumping lemma be used to prove that a language is regular? | ![]() |
| 4. Is the pumping lemma applicable to all types of languages? | ![]() |
| 5. Are there any alternative methods to prove that a language is not regular? | ![]() |
![]() | Explore Courses for Computer Science Engineering (CSE) exam |
![]() | Get EduRev Notes directly in your Google search |