Competitive programming often requires efficient string matching algorithms to solve a variety of problems. One such algorithm is the Z-function, which provides an efficient way to find occurrences of a pattern in a string. In this article, we will delve into the details of the Z-function, explain its implementation, discuss its applications, and provide practice problems with solutions.
Before we dive into the Z-function, let's briefly discuss the trivial algorithm for string matching. The trivial algorithm compares each character of the pattern with the corresponding character in the string, sliding the pattern by one character each time until a match is found or the end of the string is reached. Although simple, this algorithm has a time complexity of O(N*M), where N is the length of the string and M is the length of the pattern.
The Z-function is an efficient algorithm that preprocesses the pattern and string to calculate an array called the Z-array. The Z-array contains values that represent the longest substring starting at each position, which is also a prefix of the entire string. It helps us find all occurrences of the pattern in the string with a time complexity of O(N+M), where N is the length of the string and M is the length of the pattern.
To compute the Z-array, we follow these steps:
Algorithm:
function computeZArray(string pattern, string str):
string concat = pattern + "$" + str
integer n = length(concat)
integer L = 0, R = 0
array Z of size n initialized with zeros
for i from 1 to n-1:
if i > R:
L = R = i
while R < n and concat[R-L] == concat[R]:
R += 1
Z[i] = R - L
R -= 1
else:
integer k = i - L
if Z[k] < R - i + 1:
Z[i] = Z[k]
else:
L = i
while R < n and concat[R-L] == concat[R]:
R += 1
Z[i] = R - L
R -= 1
return Z
The Z-function algorithm can be implemented in various programming languages. Here's an example implementation in Python:
def computeZArray(pattern, string):
concat = pattern + "$" + string
n = len(concat)
L = R = 0
Z = [0] * n
for i in range(1, n):
if i > R:
L = R = i
while R < n and concat[R - L] == concat[R]:
R += 1
Z[i] = R - L
R -= 1
else:
k = i - L
if Z[k] < R - i + 1:
Z[i] = Z[k]
else:
L = i
while R < n and concat[R - L] == concat[R]:
R += 1
Z[i] = R - L
R -= 1
return Z
The time complexity of the Z-function algorithm is O(N+M), where N is the length of the string and M is the length of the pattern. The algorithm performs a linear scan over the concatenated string, comparing characters and updating the Z-array as necessary.
The Z-function algorithm has various applications in competitive programming, including:
Problem 1: Given a string S, find the number of distinct substrings it contains.
def countDistinctSubstrings(string):
n = len(string)
Z = computeZArray(string, string)
total = n
for i in range(n):
total += Z[i]
return total
Problem 2: Given a pattern P and a string S, find all occurrences of the pattern in the string.
def findPatternOccurrences(pattern, string):
concat = pattern + "$" + string
n = len(concat)
m = len(pattern)
Z = computeZArray(pattern, concat)
occurrences = []
for i in range(m + 1, n):
if Z[i] == m:
occurrences.append(i - m - 1)
return occurrences
The Z-function algorithm provides an efficient approach to solve string matching problems in competitive programming. By preprocessing the pattern and string, we can compute the Z-array and utilize it to find occurrences of the pattern, count distinct substrings, or even compress the string. With its linear time complexity, the Z-function algorithm is a powerful tool in a competitive programmer's arsenal.