`strStr()`

is a function that takes two strings as input, `haystack`

and `needle`

, and returns the index of the first occurrence of `needle`

in `haystack`

, or `-1`

if `needle`

is not present in `haystack`

.

For example, if `haystack = "hello world"`

and `needle = "world"`

, then the function should return `6`

, because `"world"`

first appears at index 6 in `"hello world"`

. If `needle = "goodbye"`

, then the function should return `-1`

, because `"goodbye"`

does not appear in `"hello world"`

.

This problem is often used in programming interviews because it requires the use of string manipulation techniques and searching algorithms. There are many ways to solve this problem, but some common approaches include using a sliding window or using built-in string functions in the programming language being used.

## Implement strStr() Solutions

### Solution 1: Brute Force

One straightforward solution to the `strStr()`

problem is to use a brute force approach, which involves checking each possible substring of `haystack`

that has the same length as `needle`

. If a substring is found that matches `needle`

, then its starting index in `haystack`

is returned. If no matching substring is found, then `-1`

is returned. Here’s some pseudocode to illustrate this approach:

```
function strStr(haystack, needle):
n = length(haystack)
m = length(needle)
for i = 0 to n - m:
if haystack[i:i+m] == needle:
return i
return -1
```

The time complexity of this solution is O((n – m + 1) * m), where n is the length of `haystack`

and m is the length of `needle`

. This is because we are checking each substring of length `m`

in `haystack`

, and there are `n - m + 1`

such substrings. Comparing two strings of length `m`

takes O(m) time. Therefore, the overall time complexity of this solution is O(nm).

### Solution 2: Knuth-Morris-Pratt (KMP) Algorithm

The KMP algorithm is a more advanced solution to the `strStr()`

problem that uses a pre-processing step to build a pattern-matching table based on the `needle`

string. This table allows us to skip certain characters in the `haystack`

string during the matching process, which improves performance. Here’s some pseudocode to illustrate this approach:

```
function strStr(haystack, needle):
n = length(haystack)
m = length(needle)
if m == 0:
return 0
prefix = compute_prefix(needle)
j = 0
for i = 0 to n - 1:
while j > 0 and haystack[i] != needle[j]:
j = prefix[j-1]
if haystack[i] == needle[j]:
j += 1
if j == m:
return i - m + 1
return -1
function compute_prefix(needle):
m = length(needle)
prefix = [0] * m
j = 0
for i = 1 to m - 1:
while j > 0 and needle[i] != needle[j]:
j = prefix[j-1]
if needle[i] == needle[j]:
j += 1
prefix[i] = j
return prefix
```

The time complexity of this solution is O(n + m), where n is the length of `haystack`

and m is the length of `needle`

. The pre-processing step takes O(m) time to build the `prefix`

table, and the main loop of the algorithm takes O(n) time to scan through the `haystack`

string. The inner while loop in the main loop executes at most m times, so it does not contribute significantly to the overall time complexity.

### Solution 3: Rabin-Karp Algorithm

The Rabin-Karp algorithm is another advanced solution to the `strStr()`

problem that uses a rolling hash function to compute the hash values of substrings of `haystack`

and `needle`

. This allows us to compare substrings more efficiently, because we can first compare their hash values before comparing the actual strings. Here’s some pseudocode to illustrate this approach:

```
function strStr(haystack, needle):
n = length(haystack)
m = length(needle)
if m == 0:
return 0
h = pow(26, m-1) % modulus
needle_hash = compute_hash(needle, h)
haystack_hash = compute_hash(haystack[:m], h)
for i = 0 to n - m:
if needle_hash == haystack_hash and needle == haystack[i:i+m]:
return i
if i < n - m:
haystack_hash = recalculate_hash(haystack[i], haystack[i+m], haystack_hash, h)
return -1
function compute_hash(string, h):
hash_value = 0
for c in string:
hash_value = (hash_value * 26 + ord(c) - ord('a')) % modulus
return hash_value
function recalculate_hash(old_char, new_char, old_hash, h):
new_hash = (old_hash - (ord(old_char) - ord('a')) * h) % modulus
new_hash = (new_hash * 26 + ord(new_char) - ord('a')) % modulus
return new_hash
```

The time complexity of this solution is O(n + m), where n is the length of `haystack`

and m is the length of `needle`

. The algorithm computes the hash values of `needle`

and the first substring of length `m`

in `haystack`

in O(m) time, and then scans through `haystack`

one character at a time, updating the hash value of each substring in O(1) time using the rolling hash function. The comparison of substrings takes O(m) time, but is only performed when their hash values match.

The Rabin-Karp algorithm has the same time complexity as the KMP algorithm, but maybe faster in practice for certain inputs because it can avoid some comparisons by using hash values. However, it requires more memory than the KMP algorithm to store the hash values and modulus value.

## Which approach is better and why?

It depends on the specific use case and input data. The brute force solution has the worst time complexity and may be too slow for large inputs. The KMP algorithm has a better worst-case time complexity than the brute force solution, but requires extra memory to store the `prefix`

table.

The Rabin-Karp algorithm has the same time complexity as the KMP algorithm, but may be faster in practice for certain inputs. Therefore, it’s important to consider the specific constraints and requirements of the problem when choosing a solution.

The KMP algorithm is a good solution to the `strStr()`

problem because it has a better worst-case time complexity than the brute force solution, and it is relatively easy to understand and implement.

## Implement strStr() in Python using KMP Algorithm:

```
def strStr(haystack: str, needle: str) -> int:
n = len(haystack)
m = len(needle)
if m == 0:
return 0
prefix = compute_prefix(needle)
j = 0
for i in range(n):
while j > 0 and haystack[i] != needle[j]:
j = prefix[j-1]
if haystack[i] == needle[j]:
j += 1
if j == m:
return i - m + 1
return -1
def compute_prefix(needle: str) -> List[int]:
m = len(needle)
prefix = [0] * m
j = 0
for i in range(1, m):
while j > 0 and needle[i] != needle[j]:
j = prefix[j-1]
if needle[i] == needle[j]:
j += 1
prefix[i] = j
return prefix
```

## Implement strStr() in Ruby using KMP Algorithm

```
def str_str(haystack, needle)
n = haystack.length
m = needle.length
return 0 if m == 0
prefix = compute_prefix(needle)
j = 0
n.times do |i|
while j > 0 && haystack[i] != needle[j]
j = prefix[j-1]
end
if haystack[i] == needle[j]
j += 1
end
if j == m
return i - m + 1
end
end
return -1
end
def compute_prefix(needle)
m = needle.length
prefix = Array.new(m, 0)
j = 0
(1...m).each do |i|
while j > 0 && needle[i] != needle[j]
j = prefix[j-1]
end
if needle[i] == needle[j]
j += 1
end
prefix[i] = j
end
return prefix
end
```

## Implement strStr() in JavaScript using KMP Algorithm

```
function strStr(haystack, needle) {
const n = haystack.length;
const m = needle.length;
if (m == 0) {
return 0;
}
const prefix = computePrefix(needle);
let j = 0;
for (let i = 0; i < n; i++) {
while (j > 0 && haystack[i] != needle[j]) {
j = prefix[j-1];
}
if (haystack[i] == needle[j]) {
j++;
}
if (j == m) {
return i - m + 1;
}
}
return -1;
}
function computePrefix(needle) {
const m = needle.length;
const prefix = Array(m).fill(0);
let j = 0;
for (let i = 1; i < m; i++) {
while (j > 0 && needle[i] != needle[j]) {
j = prefix[j-1];
}
if (needle[i] == needle[j]) {
j++;
}
prefix[i] = j;
}
return prefix;
}
```

## Implement strStr() in Java using KMP Algorithm

```
public int strStr(String haystack, String needle) {
int n = haystack.length();
int m = needle.length();
if (m == 0) {
return 0;
}
int[] prefix = computePrefix(needle);
int j = 0;
for (int i = 0; i < n; i++) {
while (j > 0 && haystack.charAt(i) != needle.charAt(j)) {
j = prefix[j-1];
}
if (haystack.charAt(i) == needle.charAt(j)) {
j++;
}
if (j == m) {
return i - m + 1;
}
}
return -1;
}
public int[] computePrefix(String needle) {
int m = needle.length();
int[] prefix = new int[m];
int j = 0;
for (int i = 1; i < m; i++) {
while (j > 0 && needle.charAt(i) != needle.charAt(j)) {
j = prefix[j-1];
}
if (needle.charAt(i) == needle.charAt(j)) {
j++;
}
prefix[i] = j;
}
return prefix;
}
```

In each of these implementations, we first compute the prefix table using the `compute_prefix()`

function, which takes O(m) time. Then, we scan through the `haystack`

string and try to match `needle`

using the prefix table, which takes O(n) time in the worst case. Therefore, the overall time complexity of the algorithm is O(m + n).

## Leave a Reply