Pumping Lemma for Regular Languages

Regular languages

Regular languages are those languages that can be accepted by deterministic finite automata (DFA) or nondeterministic finite automata (NFA). They can also be described by regular expressions. Regular languages form an important class of languages in the theory of computation.

Not every language is regular. Common examples of non-regular languages are:

  • L = {anbn | n = 0, 1, 2, ...}.
  • L = {an b an | n = 0, 1, 2, ...}.
  • L = {an bn a a bn+1 | n = 1, 2, 3, ...}.

A standard technique to show that a language is not regular is the Pumping Lemma for Regular Languages. The lemma gives a necessary condition that every regular language must satisfy; it is most often used in proofs by contradiction to show non-regularity.

The Pumping Lemma for Regular Languages

Formal statement. Let L be a regular language. Then there exists an integer p ≥ 1 (called the pumping length) such that for every string w in L with |w| ≥ p, we can write w = xyz satisfying the following three conditions:

  • |y| > 0.
  • |xy| ≤ p.
  • For all i ≥ 0, the string x yi z belongs to L.

Interpretation. For any sufficiently long string w in a regular language L, there is a substring y that occurs within the first p symbols of w and that can be repeated or removed any number of times while keeping the resulting string inside L. We say that y can be "pumped".

Remark. The pumping lemma is a necessary condition for regularity: every regular language satisfies it. It is not a sufficient condition: a language satisfying the pumping property for a particular p is not guaranteed to be regular in general.

How the lemma is used

To prove that a language L is not regular, the typical strategy is a proof by contradiction using the pumping lemma. The steps are:

  1. Assume L is regular. Let p be the pumping length given by the lemma.
  2. Choose a specific string w ∈ L with |w| ≥ p, chosen carefully so that information about the beginning of the string helps derive a contradiction.
  3. By the pumping lemma, write w = x y z with |y| > 0 and |xy| ≤ p.
  4. Show there exists an integer i ≥ 0 such that x yi z ∉ L, contradicting the lemma. Hence L is not regular.

Two remarks about the choice of w and the argument in the final step:

  • Choose w so that the constraint |xy| ≤ p forces y to lie in a particular, controlled region of w (for example, entirely in the block of a's in a string of the form anbn).
  • In the final step you typically choose i = 0 (remove y) or some other i (repeat y), and then argue using length or the structural properties of strings in L to show x yi z cannot belong to L.

Proof of the Pumping Lemma

Proof (sketch using DFA and the pigeonhole principle).

Since L is regular, there exists a DFA M that recognises L. Let n be the number of states of M.

Consider any string w ∈ L with |w| ≥ n. Let the symbols of w be a1, a2, ..., am, where m ≥ n.

When M processes w from the start state, it visits a sequence of states P0, P1, P2, ..., Pm, where P0 is the start state and Pk is the state after reading a1 ... ak.

There are m + 1 ≥ n + 1 visited states but only n distinct states in the DFA. By the pigeonhole principle, two of these visited states must be the same: there exist indices i and j with 0 ≤ i < j="" ≤="" n="" such="" that="" />i = Pj.

Now split the string w into three parts as follows.

x = a1a2 ... ai.

y = ai+1 ai+2 ... aj.

z = aj+1 aj+2 ... am.

The following observations follow directly from construction:

  • x may be empty if i = 0.
  • y cannot be empty because j > i, so |y| > 0.
  • |xy| = j ≤ n, so |xy| ≤ n (that is, y lies within the first n symbols of w).

Because Pi = Pj, reading y from state Pi returns to the same state Pi. Therefore, for any i ≥ 0, reading y repeated i times still returns to Pi, and then reading z leads to an accepting state (because the original x y z = w was accepted). Thus x yi z is accepted for every i ≥ 0 and hence belongs to L.

Proof of the Pumping Lemma
Proof of the Pumping Lemma
Proof of the Pumping Lemma

Examples and applications

Example 1: The language {an | n ≥ 1} is regular

The language consists of strings of one or more a's: {a, aa, aaa, aaaa, ...}.

A simple DFA with one loop on state q1 accepts any number of a's and demonstrates regularity. The pumping lemma is satisfied for this language: choose a pumping length p; take any w = am with m ≥ p, and let y be a substring of some a's within the first p symbols. Pumping y just changes the number of a's and every result remains of the form ak, hence stays in the language.

Example 1: The language {an | n ≥ 1} is regular
Example 1: The language {an | n ≥ 1} is regular
Example 1: The language {an | n ≥ 1} is regular
Example 1: The language {an | n ≥ 1} is regular

Example 2: The language {an bm | n, m ≥ 1} is regular

This is the set of strings that consist of one or more a's followed by one or more b's. A DFA that loops on a's and then switches to a loop on b's recognises this language.

The pumping lemma is satisfied: for any string w in the language with |w| ≥ p, the substring y that lies within the first p positions consists only of a's. Pumping y changes the number of a's but the overall shape "some a's then some b's" remains, so every pumped string is still in the language.

Example 2: The language {an bm | n, m ≥ 1} is regular
Example 2: The language {an bm | n, m ≥ 1} is regular
Example 2: The language {an bm | n, m ≥ 1} is regular

Example 3: The language {an bn | n ≥ 1} is not regular

The set contains strings with the same number of a's followed by the same number of b's, for example {ab, aabb, aaabbb, ...}.

Proof (by pumping lemma).

Assume for contradiction that L is regular. Let p be the pumping length.

Choose w = ap bp. Then |w| = 2p ≥ p, so the lemma applies and we can write w = x y z with |y| > 0 and |xy| ≤ p.

Because |xy| ≤ p, the substring y consists only of a's. There are three possible forms for y:

  • y contains only a's. Then choose i = 0. The pumped string x z has fewer a's than b's, so x z ∉ L.
  • y contains only b's. This cannot happen here because |xy| ≤ p ensures y lies within the a block.
  • y contains both a's and b's. This cannot occur either because |xy| ≤ p forces y to lie entirely in the a block.

Thus in all actual possibilities for this choice of w, pumping produces a string not in L, contradicting the pumping lemma. Hence L is not regular.

Example 3: The language {an bn | n ≥ 1} is not regular
Example 3: The language {an bn | n ≥ 1} is not regular
Example 3: The language {an bn | n ≥ 1} is not regular

Example 4: The language {0i 1i | i ≥ 1} is not regular

Proof (by pumping lemma).

Assume L is regular and let p be the pumping length.

Choose w = 0p 1p. Then |w| = 2p ≥ p, so w = x y z with |y| > 0 and |xy| ≤ p.

Because |xy| ≤ p, the substring y consists only of 0's. Take i = 0. Then x z has fewer 0's than 1's, so x z ∉ L. This contradicts the pumping lemma, hence L is not regular.

Example 5: The language {a | i ≥ 1} (perfect squares) is not regular

Correcting the notation: L = { a | i ≥ 1 }.

Proof (sketch using the pumping lemma).

Assume L is regular and let p be the pumping length.

Choose w = a. Then |w| = p² ≥ p, so w = x y z with |y| > 0 and |xy| ≤ p.

Because |xy| ≤ p, the substring y lies among the first p a's. Thus y = ak for some k ≥ 1 and k ≤ p.

If we pump with i = 0 then the length becomes p² - k, which is not necessarily a perfect square; but to make a sure contradiction choose i = p+1. The resulting string has length p² + p·k = p(p + k), which is divisible by p and hence composite (not a perfect square of an integer unless p + k = p, impossible). Therefore x yp+1 z ∉ L. This contradicts the pumping lemma, so L is not regular.

Example 5: The language {ai² | i ≥ 1} (perfect squares) is not regular

Example 6: The language {ap | p is prime} is not regular

Proof (by pumping lemma with a simple number-theoretic choice).

Assume the language of prime-length a-strings is regular. Let p be the pumping length.

Choose a prime number q > p and consider w = aq ∈ L. By the pumping lemma write w = x y z with |y| > 0 and |xy| ≤ p.

Since |xy| ≤ p < q,="" y="" />k for some k ≥ 1.

Now pump with i = q + 1. The new length is q + q·k = q(1 + k), which is composite because it has q as a factor and 1 + k > 1. Therefore x yq+1 z ∉ L. This contradicts the pumping lemma. Hence the language of prime-length a-strings is not regular.

Example 6: The language {ap | p is prime} is not regular

Example 7: The language {ww | w ∈ {a,b}*} is not regular

This language contains all strings that are the concatenation of a string with itself, for example aa, abab, aab aab, etc.

Proof (by pumping lemma).

Assume L = {ww | w ∈ {a,b}*} is regular and let p be the pumping length.

Consider the particular string s = ap b ap b. This string is of the form ww with w = ap b.

We have |s| = 2(p + 1) ≥ p, so by the lemma s = x y z with |y| > 0 and |xy| ≤ p. Because |xy| ≤ p, the substring y lies entirely inside the first block of a's.

If we pump with i = 0 we remove some a's from the first half but not from the second half. The resulting string will no longer be of the form uu (the two halves will be different), hence x z ∉ L. This contradicts the pumping lemma. Therefore L is not regular.

Comments and common pitfalls

  • When applying the pumping lemma for a contradiction, the opponent chooses the decomposition x y z subject to the constraints; you must show that for every possible decomposition satisfying |y| > 0 and |xy| ≤ p there is some i with x yi z ∉ L.
  • Choose the witness string w carefully so that the restriction |xy| ≤ p forces y to lie in a region where pumping changes a property that characterises L (for example, the number of symbols in a block).
  • Remember that the lemma gives existence of p and then for every w with |w| ≥ p there exists a decomposition. You cannot choose the decomposition; you must argue about all possible decompositions that meet the constraints.
  • The pumping lemma cannot be used directly to prove a language is regular. It can only be used to show non-regularity. To prove regularity, construct a DFA/NFA or a regular expression.

Summary

The pumping lemma for regular languages is a fundamental tool to prove that a given language is not regular. It formalises the intuitive idea that a finite-state automaton has only finitely many states, so when processing sufficiently long strings some part of the computation must repeat. Using this repetition one shows that pumping (repeating or removing) a substring leads to strings that must still be accepted by any DFA for a regular language; if this contradicts the structure of the candidate language, the language is not regular.

The document Pumping Lemma for Regular Languages is a part of the Computer Science Engineering (CSE) Course Theory of Computation.
All you need of Computer Science Engineering (CSE) at this link: Computer Science Engineering (CSE)

FAQs on Pumping Lemma for Regular Languages

1. What is the pumping lemma for regular languages?
Ans. The pumping lemma for regular languages is a theorem that allows us to prove that a language is not regular. It states that for any regular language L, there exists a constant p (the pumping length) such that any string s in L with length greater than or equal to p can be split into three parts: s = xyz, where y is a non-empty substring. Furthermore, for any natural number i, the string xyiz is also in L.
2. How is the pumping lemma used to prove that a language is not regular?
Ans. To prove that a language is not regular using the pumping lemma, we need to show that there is no valid partition of the string that satisfies the pumping condition. This can be done by assuming that the language is regular, choosing a suitable string, and then demonstrating that no matter how we split it, there will be at least one case where pumping the string results in a string that is not in the language. This contradiction proves that the language is not regular.
3. Can the pumping lemma be used to prove that a language is regular?
Ans. No, the pumping lemma cannot be used to prove that a language is regular. The pumping lemma only provides a way to prove that a language is not regular. It states that if a language is regular, then there exists a constant pumping length p such that any string in the language can be split and pumped to generate more strings in the language. However, the pumping lemma does not guarantee that a language is regular if it satisfies the pumping condition.
4. Is the pumping lemma applicable to all types of languages?
Ans. No, the pumping lemma is only applicable to regular languages. It does not hold for context-free languages, context-sensitive languages, or recursively enumerable languages. The pumping lemma specifically provides a property that characterizes regular languages and helps in proving that a language is not regular.
5. Are there any alternative methods to prove that a language is not regular?
Ans. Yes, there are alternative methods to prove that a language is not regular. One such method is to use closure properties of regular languages. If we can show that a language is not closed under certain operations, such as union, intersection, or complementation, then we can conclude that the language is not regular. Another method is to use the Myhill-Nerode theorem, which provides a criterion for determining whether a language is regular based on the number of distinguishable strings it contains.
Explore Courses for Computer Science Engineering (CSE) exam
Get EduRev Notes directly in your Google search
Related Searches
past year papers, Exam, shortcuts and tricks, mock tests for examination, Pumping Lemma for Regular Languages, pdf , Viva Questions, Sample Paper, Objective type Questions, ppt, Free, Semester Notes, Pumping Lemma for Regular Languages, Pumping Lemma for Regular Languages, study material, Extra Questions, Important questions, practice quizzes, Summary, video lectures, MCQs, Previous Year Questions with Solutions;